How To Secure Your GraphQL API
GraphQL is becoming a staple in standardizing APIs for many enterprise-level applications. It gives frontend developers a single entry point and the ability to create filters on only the data they need. The flexibility of GraphQL allows developers to shorten the development cycle, become more flexible with the content they require, and adapt to the needs of a requested feature.
But how secure is GraphQL? Is GraphQL better than REST? And what techniques prevent vulnerabilities in GraphQL from being exploited?
Quick introduction to GraphQL
GraphQL works by providing a single interface for clients to query data, without a backend developer having to customize every API required. The requesting interface describes the data it needs, and GraphQL gives back only the relevant data.
For example, the backend may describe the data as follows:
type Product {
  name: String
  price: Float
  description: String
  stock: Int
}
The caller may only need the name and the price. With a traditional API, the description field would also be sent in the payload. This unnecessarily increases the payload size, and related data may take additional requests to fetch. With GraphQL, however, you can specify and request only what you need.
{
  product(name: "banana") {
    price
  }
}
Your expected returned result may look something like this:
{
  "data": {
    "product": {
      "price": 2.99
    }
  }
}
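To make the idea of returning only the requested fields concrete, here is a minimal sketch in plain JavaScript. It is not how a GraphQL server is implemented internally; the `banana` record and the `selectFields` helper are hypothetical names used only for illustration.

```javascript
// Hypothetical record, standing in for a row from the backend's data store.
const banana = {
  name: "banana",
  price: 2.99,
  description: "A ripe yellow banana",
  stock: 120,
};

// Return a copy of `record` containing only the requested fields.
// Unknown field names are rejected, much as a GraphQL server rejects
// fields that are not part of the schema.
function selectFields(record, fields) {
  const result = {};
  for (const field of fields) {
    if (!Object.prototype.hasOwnProperty.call(record, field)) {
      throw new Error(`Unknown field: ${field}`);
    }
    result[field] = record[field];
  }
  return result;
}

console.log(selectFields(banana, ["price"])); // { price: 2.99 }
```

The description and stock values never leave the server, which is the payload saving described above.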
In a nutshell, GraphQL is a more efficient way of querying data. It is also a more flexible way for developers to build APIs, especially when all the requirements are not clearly documented, or the business requires constant pivoting and the backend needs to be able to cope with the required changes.
How Secure Is GraphQL?
Everything has a certain degree of insecurity, based on how it is implemented. A barebones and basic implementation of GraphQL can lead to exploitation through Denial of Service (DoS) attacks, SQL injections, and Langsec issues. These exploitations can lead to larger than necessary costs of infrastructure and data compromises.
So how do you mitigate these potential issues?
Query Timeouts
One way is to set query timeouts. A timeout sets the maximum time allowed for each query, which prevents overly large queries from running indefinitely and causing potential lock-ups in the database.
Timeouts work best when combined with a limit on GraphQL query depth. This is because when an attack occurs, the attacker is not looking for information. They are looking to submit a request for the most expensive data package available.
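One possible way to enforce a time limit is a per-request deadline that resolvers check between units of work. The sketch below assumes that approach; `createDeadline` and `resolveAll` are hypothetical helpers, not part of any GraphQL library, and the clock is injectable only so the behaviour can be exercised without waiting.

```javascript
// Create a per-request deadline. `now` defaults to Date.now but can be
// swapped for a fake clock in tests.
function createDeadline(limitMs, now = Date.now) {
  const startedAt = now();
  return {
    // Throw once the request has used up its time budget.
    check() {
      const elapsed = now() - startedAt;
      if (elapsed > limitMs) {
        throw new Error(`Query timed out after ${elapsed} ms (limit ${limitMs} ms)`);
      }
    },
  };
}

// Hypothetical resolver loop: check the deadline before each unit of work.
function resolveAll(items, deadline) {
  const results = [];
  for (const item of items) {
    deadline.check();
    results.push(item.toUpperCase());
  }
  return results;
}
```

A deadline like this only stops work that cooperates with it, so it is best treated as a safety net rather than the only defense.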
GraphQL Query Nesting
In GraphQL, the more nesting there is in a query, the higher the data load and cost associated with generating the result. This can overload your server, databases, and network, and the failure can cascade to other parts of your application. A query with no limit on nesting is called an unbounded query. With enough nesting depth, a DoS attack can lock up your GraphQL server.
So how is deep nesting achieved in GraphQL? One method is through cyclical queries. For example, you have a Musician type that has a list of Songs, and each of those Songs in turn allows its Musician to be retrieved. This is because Musician and Songs have a many-to-many relationship. In GraphQL, it's easy to create a continuous cyclical query that, in a way, queries itself.
# cyclical query
# depth: 8+
query cyclical {
  musician(id: "xyz") {
    songs {
      musician {
        songs {
          musician {
            songs {
              musician {
                ... {
                  ... # more deep nesting!
                }
              }
            }
          }
        }
      }
    }
  }
}
This is different from how we would usually use GraphQL. Typical queries are shallow and are often written with inline and named fragments, which do not add any depth of their own. The two queries below each have a depth of one.
# inline fragment
# depth: 1
query inlineShallow {
  musicians
  ... on Query {
    songs
  }
}

# named fragment
# depth: 1
query namedShallow {
  ... namedFragment
}
fragment namedFragment on Query {
  songs
}
But a malicious user does not care about effectiveness and efficiency. What they care about is bringing your GraphQL server down by making the most expensive query. This is where Maximum Query Depth comes in.
With Maximum Query Depth, a GraphQL server is set to reject anything that goes beyond a set depth.
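As a rough illustration of the check, the sketch below estimates depth by counting brace nesting in the query text and rejects anything over a limit. This is a deliberate simplification, not how a real validation rule works: it assumes no fragments and no input-object arguments, and quotes inside comments would confuse it. A production server should inspect the parsed query instead. `estimateDepth` and `rejectIfTooDeep` are hypothetical names.

```javascript
// Estimate nesting depth by counting braces in the query text.
function estimateDepth(query) {
  const cleaned = query
    .replace(/"(?:\\.|[^"\\])*"/g, '""') // blank out string literals
    .replace(/#.*$/gm, "");              // drop comments
  let depth = 0;
  let deepest = 0;
  for (const ch of cleaned) {
    if (ch === "{") {
      depth += 1;
      deepest = Math.max(deepest, depth);
    } else if (ch === "}") {
      depth -= 1;
    }
  }
  // The outermost braces belong to the operation itself, so they do not count.
  return Math.max(0, deepest - 1);
}

// Throw if the query is nested deeper than `maxDepth`, otherwise return its depth.
function rejectIfTooDeep(query, maxDepth) {
  const depth = estimateDepth(query);
  if (depth > maxDepth) {
    throw new Error(`Query depth ${depth} exceeds maximum of ${maxDepth}`);
  }
  return depth;
}
```

Under this counting, a query for a product's price has a depth of one, while each extra musician or songs level adds one more.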
Setting GraphQL Depth Limits
To set GraphQL depth limits, you can use graphql-depth-limit, an additional module. This is because Apollo GraphQL currently does not have a built-in query depth limit. To do this, you will need to install graphql-depth-limit via npm and then import it into your project.
Here is how to install it and add it to your package.json.
npm i graphql-depth-limit
Here is how you can use it in your Node.js backend application.
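import depthLimit from 'graphql-depth-limit'
import express from 'express'
// express-graphql 0.10 and later use a named export; older versions used a default export.
import { graphqlHTTP } from 'express-graphql'
import schema from './schema'

const app = express()

// Reject any operation nested more than 3 levels deep.
const DepthLimitRule = depthLimit(
  3,
  { ignore: [ 'whatever', 'trusted' ] }, // fields left out of the depth count
  depths => console.log(depths) // called on each validation with the depth of every operation
)

const graphqlMiddleware = graphqlHTTP({
  schema,
  validationRules: [
    DepthLimitRule,
  ],
})

// Mount the configured middleware directly on the route.
app.use('/graphql', graphqlMiddleware)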
In the code above, anything beyond a depth of 3 will be rejected. This is set in the first argument of depthLimit(), under const DepthLimitRule. The second argument is an options object whose ignore list names the fields to leave out of the depth count. The third argument is a callback that receives the depth of each operation whenever validation runs.
When the depth limit is exceeded, the server will respond with an error that looks something like this:
{
  "errors": [
    {
      "message": "'cyclical' exceeds max... depth of 3",
      "locations": [
        {
          "line": ...,
          "column": ...
        }
      ]
    }
  ]
}
Dealing with Query Complexity
Cyclical queries make for potentially endless nesting. However, they are not the only technique that can lock up a GraphQL server. Some fields in the schema are more expensive to compute than others, and this is due to complexity.
Complexity is based on the number of items expected to be returned and the computing power required to query them. Take a look at the example below:
query simple {
  musician(id: "xyz") {  # complexity: 1
    songs(first: 10) {   # complexity: 10
      title              # complexity: 1
    }
  }
}
The total complexity score for the above query is 12. While musician and title are simple returns, there are 10 instances of songs to be returned. The more items required and available to be returned, the higher the complexity.
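The arithmetic behind that score can be sketched in a few lines. The example assumes the simple additive scoring used in the query comments above: a field costs 1 point unless it requests a number of items via `first`, in which case it costs that many points. The selection tree and the `scoreField` helper are hypothetical, written by hand rather than produced by a GraphQL parser.

```javascript
// Hand-written selection tree for the `simple` query above.
const simpleQuery = {
  name: "musician",
  children: [
    { name: "songs", first: 10, children: [{ name: "title" }] },
  ],
};

// Score a field: `first` items if it is a paginated list, otherwise 1,
// plus the scores of all of its child fields.
function scoreField(field) {
  const own = field.first !== undefined ? field.first : 1;
  const children = field.children || [];
  return children.reduce((sum, child) => sum + scoreField(child), own);
}

console.log(scoreField(simpleQuery)); // 12
```

Raising `first` to 1000 would push the same query to a score of 1002, which is why list arguments deserve the most attention.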
This can be applied to other data types where there is a one-to-many relationship, such as catalogs of things. For example, an author may have hundreds of posts, or a single micro-investment portfolio may hold more than 500 investments across multiple countries, ETFs, and indexes. While GraphQL can be efficient at getting the data, it can be costly if the complexity is not managed.
Limiting GraphQL Query Complexity
To limit GraphQL query complexity, the graphql-validation-complexity module can be used. You can install it via npm and import it into your Node.js project as per the code sample below.
To install graphql-validation-complexity:
npm i graphql-validation-complexity
How to implement the GraphQL complexity validation package:
import { createComplexityLimitRule } from 'graphql-validation-complexity'
import express from 'express'
// express-graphql 0.10 and later use a named export; older versions used a default export.
import { graphqlHTTP } from 'express-graphql'
import schema from './schema'

const app = express()

// Reject any query whose total cost exceeds 1000.
const ComplexityLimitRule = createComplexityLimitRule(1000, {
  scalarCost: 1,
  objectCost: 10, // Default is 0.
  listFactor: 20, // Default is 10.
})

const graphqlMiddleware = graphqlHTTP({
  schema,
  validationRules: [
    ComplexityLimitRule,
  ],
})

// Mount the configured middleware directly on the route.
app.use('/graphql', graphqlMiddleware)
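To get a feel for how the three settings interact, here is a simplified cost model in plain JavaScript. It is only loosely modeled on the package's documented behavior (scalars and objects have a base cost, and everything inside a list is multiplied by the list factor) and is not its actual implementation. The field tree and the `fieldCost` helper are hypothetical.

```javascript
// The same three knobs as the configuration above.
const options = { scalarCost: 1, objectCost: 10, listFactor: 20 };

// Hand-written field tree: kind is "scalar" or "object"; isList marks list fields.
const query = {
  kind: "object", // musician
  children: [
    {
      kind: "object", // songs, a list of objects
      isList: true,
      children: [{ kind: "scalar" }], // title
    },
  ],
};

function fieldCost(field, opts, factor = 1) {
  // A list field, and everything nested inside it, is multiplied by listFactor.
  const currentFactor = field.isList ? factor * opts.listFactor : factor;
  const own = field.kind === "scalar" ? opts.scalarCost : opts.objectCost;
  const children = field.children || [];
  return children.reduce(
    (sum, child) => sum + fieldCost(child, opts, currentFactor),
    currentFactor * own
  );
}

const total = fieldCost(query, options);
console.log(total, total <= 1000 ? "accepted" : "rejected"); // 230 accepted
```

Under this model the musician object costs 10, the songs list costs 20 × 10, and each title costs 20 × 1, so nested lists multiply quickly and reach the limit of 1000 after only a couple of levels.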
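Here is one way to set a custom cost on individual fields:
import { GraphQLList } from 'graphql'

// ExpensiveItem and MyItem are object types defined elsewhere in the schema.
const expensiveField = {
  type: ExpensiveItem,
  getCost: () => 60, // fixed cost for this field
};
const expensiveList = {
  type: new GraphQLList(MyItem),
  getCostFactor: () => 100, // multiplier applied to each item in the list
};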
Alternatively, you can set the costs directly in your GraphQL schema definition:
type CustomCostItem {
  expensiveField: ExpensiveItem @cost(value: 50)
  expensiveList: [MyItem] @costFactor(value: 100)
}
Wrap up: Is GraphQL better than REST?
In a way, the immutability of a REST API mitigates the two issues expanded on above. This is because the client is limited to the requirements of the REST API. In contrast, GraphQL offers up flexibility by giving the client the ability to set their own query parameters.
This leads to the question: is GraphQL better than REST? The quick answer is that it depends on the situation. For development speed and agility, GraphQL is advocated by many big names including AWS, Google, and Microsoft, in addition to other big companies like Shopify, Netflix, Twitter, Facebook, Instagram, and even the New York Times.
When it comes to security, GraphQL is still a new area, with low vulnerability awareness within the community. There are other security techniques that you can explore further, such as whitelisting queries, persistence, throttling, and GraphQL endpoints, that can help reduce your application's vulnerability to attacks.
Nevertheless, the ideas behind GraphQL, its implementation, and its general benefits have the potential to supercharge your backend through faster development and a query flexibility that is not available with standard REST API development.