GraphQL's ability to disclose information is both a strength and a weakness, and it should be tested accordingly.
GraphQL is a query language for APIs. As the name suggests, it is well suited to querying data that can be represented as a graph. Compared to the REST APIs that have been the most widely used so far, GraphQL offers some advantages, which are described in the following section. To make the comparison easier, however, the functionality of both technologies is briefly described first.
A REST API consists of a set of endpoints, whose number depends on what data the client can query. Each endpoint is mapped to an entity and returns information about that entity to the client. If a client wants information about multiple entities, it has to send a separate request to each entity's endpoint. Compared to GraphQL, this costs additional round trips and leads to increased administrative overhead.
In GraphQL there is only one endpoint with which the client communicates. The client sends a query describing the data it wants and receives a response containing that data. This informativeness of GraphQL is both a strength and a weakness. A REST API limits the information that can be retrieved with a single request by limiting what each endpoint returns. GraphQL supports arbitrary queries over the data, so a single query can return a large amount of data.
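To make the single-endpoint idea more concrete, here is a minimal Python sketch of the JSON body a client might POST to a GraphQL endpoint. The schema is purely an assumption for illustration: the `user` and `posts` fields and the `UserWithPosts` operation are hypothetical and not taken from any real API.

```python
import json
from typing import Any, Dict, Optional

# Hypothetical query: one request asks for a user AND that user's posts,
# which would typically be two separate calls against a REST API.
GRAPHQL_QUERY = """
query UserWithPosts($id: ID!) {
  user(id: $id) {
    name
    posts {
      title
    }
  }
}
"""


def build_request_body(query: str, variables: Optional[Dict[str, Any]] = None) -> str:
    """Serialize a GraphQL operation into the JSON body commonly sent via HTTP POST."""
    return json.dumps({"query": query, "variables": variables or {}})


# The same body shape is used for every query, whatever data is requested.
body = build_request_body(GRAPHQL_QUERY, {"id": "42"})
```

The point of the sketch is that the shape of the request never changes. Only the query text decides which entities come back, which is exactly why a single query can return so much data.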
To describe the data, GraphQL uses what is called a schema. The schema specifies exactly which queries and mutations are available to clients.
Additionally, there is the so-called introspection schema, which can be used to ask a GraphQL application which queries it supports. The introspection schema can be retrieved by sending an introspection query. Various GraphQL IDEs or, for example, GraphQL Voyager can also be used to query the introspection schema.
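As a rough illustration, an introspection request could look like the following Python sketch. The `__schema` field and its `queryType`, `mutationType` and `types` sub-fields are part of GraphQL's introspection system. The helper names and the sample response are made up for this example, and this is not necessarily the query the tools mentioned above send.

```python
import json
from typing import Any, Dict, List

# A small introspection query; real tools usually request far more detail.
INTROSPECTION_QUERY = """
query {
  __schema {
    queryType { name }
    mutationType { name }
    types { name kind }
  }
}
"""


def introspection_body() -> str:
    """Build the JSON body for the introspection request."""
    return json.dumps({"query": INTROSPECTION_QUERY})


def application_type_names(response: Dict[str, Any]) -> List[str]:
    """Return type names from the response, skipping built-in '__' introspection types."""
    types = response["data"]["__schema"]["types"]
    return sorted(t["name"] for t in types if not t["name"].startswith("__"))


# Hypothetical response, shaped like what a server with introspection enabled returns.
sample_response = {
    "data": {
        "__schema": {
            "queryType": {"name": "Query"},
            "mutationType": None,
            "types": [
                {"name": "Query", "kind": "OBJECT"},
                {"name": "User", "kind": "OBJECT"},
                {"name": "String", "kind": "SCALAR"},
                {"name": "__Schema", "kind": "OBJECT"},
            ],
        }
    }
}
```

Calling `application_type_names(sample_response)` lists the types an attacker or tester would look at first.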
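Based on data from the OWASP Foundation, the most common attacks against GraphQL are injection, denial of service (DoS), exploitation of weak authorization, batching attacks, and abuse of insecure default configurations.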
GraphQL itself does not provide any protection against attacks. If no protection has been built in, for example no parameterized queries, the application may be vulnerable to any kind of injection attack. All incoming data should therefore be validated to ensure that only valid values are accepted. Invalid input should always be rejected without revealing excessive information about how the API works. Furthermore, parameterized queries should be used in the backend to process user-supplied data. In addition, the client that renders data from the GraphQL response should sanitize that data before rendering it. And implementing an appropriate Content Security Policy never hurts anyway.
Denial of service (DoS) is an attack on the availability and stability of the API, which can make it slow, unresponsive, or even unavailable. It is therefore advisable to use pagination to limit the amount of data that can be returned in a single response. A rate limiter per user or per IP address is also useful.
Exploiting weak authorization logic is by far the most common problem in GraphQL-based applications. It is essential to implement authentication and authorization. Each request from a client should be checked to see whether the client is authorized to access the requested data.
GraphQL supports merging queries, which is also known as query batching. Query batching allows the client to bundle multiple queries for multiple object instances into one call. This enables what is known as a batching attack, a form of brute-force attack in which many guesses are packed into a single request. To prevent this type of attack, batching of sensitive objects such as usernames, emails, passwords, OTPs, session tokens, etc. should be prevented. This way, an attacker is forced to attack the API like a REST API and make a separate network call for each object instance. At the same time, the number of requests that can be executed simultaneously should also be limited.
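The two backend measures, input validation and parameterized queries, can be sketched in Python with the standard `sqlite3` module. The `users` table, the `find_user` helper, and the allowed username pattern are assumptions made for this example. A real resolver would sit behind whatever GraphQL library the application uses.

```python
import re
import sqlite3
from typing import Optional, Tuple

# Assumed validation rule: 3 to 32 letters, digits or underscores.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,32}")


def find_user(conn: sqlite3.Connection, username: object) -> Optional[Tuple[int, str]]:
    """Look up a user by name using validated input and a parameterized query."""
    if not isinstance(username, str) or not USERNAME_PATTERN.fullmatch(username):
        # Generic message: do not reveal which rule the input violated.
        raise ValueError("Invalid input")
    # The '?' placeholder keeps user input out of the SQL text itself.
    cursor = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cursor.fetchone()


# Tiny in-memory database so the sketch can be run as-is.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO users (username) VALUES (?)", ("alice",))
```

With this in place, `find_user(conn, "alice")` returns the stored row, while an input such as `"' OR '1'='1"` is rejected before it ever reaches the database.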
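Both measures can be approximated in a few lines of Python. The following is only a sketch under stated assumptions: the maximum page size of 100, the default of 20, and the fixed-window strategy are arbitrary choices for illustration, and a production setup would more likely rely on a gateway or an existing rate-limiting component.

```python
import time
from typing import Callable, Dict, Optional, Tuple

MAX_PAGE_SIZE = 100  # assumed upper bound for one response


def clamp_page_size(requested: Optional[int], default: int = 20,
                    maximum: int = MAX_PAGE_SIZE) -> int:
    """Cap the number of items a client may request in one page."""
    if requested is None:
        return default
    if requested < 1:
        raise ValueError("Page size must be a positive integer")
    return min(requested, maximum)


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per key (user or IP) per time window."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._windows: Dict[str, Tuple[float, int]] = {}  # key -> (start, count)

    def allow(self, key: str) -> bool:
        now = self.clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # the previous window expired, start a new one
        if count >= self.limit:
            self._windows[key] = (start, count)
            return False
        self._windows[key] = (start, count + 1)
        return True
```

A server would call `limiter.allow(client_ip)` before executing a query and pass any `first` or `limit` argument through `clamp_page_size` inside the resolver.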
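What such a per-request check might look like inside a resolver is sketched below in Python. Everything here is hypothetical: the `Viewer` and `Order` types, the `admin` role, and the ownership rule are only one possible policy, chosen to show that the check happens for every object that is returned.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class Viewer:
    """The authenticated client making the request (hypothetical type)."""
    user_id: str
    roles: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Order:
    """An example object that belongs to one user (hypothetical type)."""
    order_id: str
    owner_id: str


class NotAuthorized(Exception):
    """Raised when the viewer may not access the requested object."""


def can_read_order(viewer: Optional[Viewer], order: Order) -> bool:
    # Assumed policy: only the owner or an admin may read an order.
    if viewer is None:
        return False  # unauthenticated requests are never authorized
    return viewer.user_id == order.owner_id or "admin" in viewer.roles


def resolve_order(viewer: Optional[Viewer], order: Order) -> Dict[str, str]:
    """Resolver sketch: authorize first, then return the data."""
    if not can_read_order(viewer, order):
        raise NotAuthorized("Not authorized")
    return {"id": order.order_id, "ownerId": order.owner_id}
```

Keeping the decision in one function such as `can_read_order` means that every path through the graph that leads to an order applies the same rule.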
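One way to block batching of sensitive operations is to inspect the request body before it is executed. The Python sketch below is a deliberately naive heuristic: the field names in `SENSITIVE_FIELDS` and the batch limit are assumptions, and it counts field uses with a regular expression, whereas a real implementation would parse the query and walk its AST.

```python
import json
import re

SENSITIVE_FIELDS = {"login", "verifyOtp", "resetPassword"}  # hypothetical field names
MAX_BATCH_SIZE = 5  # assumed limit for harmless batches


def count_field_uses(query: str, field: str) -> int:
    """Naively count calls such as `login(` in the query text, aliases included."""
    return len(re.findall(r"\b" + re.escape(field) + r"\s*\(", query))


def check_batch(raw_body: str) -> int:
    """Validate a request body and return the number of operations it contains.

    Raises ValueError if the batch is too large or if a sensitive field
    is used more than once across the whole request.
    """
    payload = json.loads(raw_body)
    # Array-based batching sends a JSON list of operations in one call.
    operations = payload if isinstance(payload, list) else [payload]
    if len(operations) > MAX_BATCH_SIZE:
        raise ValueError("Batch too large")
    queries = []
    for operation in operations:
        if not isinstance(operation, dict):
            raise ValueError("Invalid input")
        queries.append(str(operation.get("query") or ""))
    for field in SENSITIVE_FIELDS:
        if sum(count_field_uses(query, field) for query in queries) > 1:
            raise ValueError("Sensitive operations cannot be batched")
    return len(operations)
```

The check covers both batching styles: several operations in a JSON array and several aliased calls to the same field inside one query.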
An example of this would be the approach of so-called "hidden" API endpoints. These are often provided for functions that should not be accessible to the general public. Such endpoints are not a good idea in principle, and it is even worse if they are accessible without proper authorization, especially in a GraphQL environment.
To be more developer-friendly, GraphQL allows API clients to dynamically query information about the schema, including documentation and types for each query and mutation defined in it. This feature is called introspection and is enabled by default in most GraphQL implementations. Development tools such as the GraphiQL IDE also use it to retrieve the schema dynamically. If introspection is possible, an attacker can obtain the GraphQL schema to better understand the entire attack surface of the API.
When encountering a GraphQL API, the first step is usually to run an introspection query to obtain a copy of the schema. The schema helps to understand the attack surface of the exposed GraphQL API.
The safest and simplest approach is to disable introspection and GraphiQL system-wide. Additionally, the built-in function that returns a hint when a field name provided by the requester is similar to (but not the same as) an existing field should be disabled. Furthermore, detailed error output such as stack traces should not be returned to the client, which prevents an attacker from gaining valuable information about the system.
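How introspection and field suggestions are switched off depends on the GraphQL implementation, so the following Python sketch only shows a generic fallback: cleaning error objects before they leave the server. It assumes that suggestions appear as a trailing "Did you mean ..." sentence in the error message and that stack traces travel in extra keys of the error object. Both are assumptions about the server in use, not guarantees.

```python
import re
from typing import Any, Dict, List

# Assumed suggestion format: a trailing 'Did you mean "...?"' sentence.
_SUGGESTION = re.compile(r"\s*Did you mean .*$", re.DOTALL)


def sanitize_error(error: Dict[str, Any]) -> Dict[str, str]:
    """Keep only the message and strip any field-name suggestion from it."""
    message = _SUGGESTION.sub("", str(error.get("message", "")))
    # Every other key (extensions, stack traces, paths) is dropped on purpose.
    return {"message": message}


def sanitize_errors(errors: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Apply sanitize_error to every entry of a response's error list."""
    return [sanitize_error(error) for error in errors]
```

Full details can still be written to the server log. The sketch only limits what is sent back to the client.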
In this post, we have looked at the basic functionality and security requirements of GraphQL. Authentication and authorization are the first challenges that need to be addressed. In addition, we showed how common GraphQL-related attack surfaces can be reduced.
To practice the attacks described and see them in action, here are some resources.
TryHackMe - GraphQL Room