Modeling Errors in GraphQL

GraphQL excels at modeling data requirements. For certain kinds of errors, however, it becomes necessary to model them as schema types too. In this post, let's analyze some cases where errors contain structured data apart from the message and the location information.

Boopathi Rajaa Nedunchezhiyan

Senior Software Engineer

Posted on Apr 13, 2021

[Image: Use case to distinguish different errors]

GraphQL Errors

GraphQL is an excellent language for writing data requirements in a declarative fashion. It gives us a clear and well-defined concept of nullability constraints and error propagation. In this post, let's discuss where GraphQL falls short regarding errors and how we can model those errors to fit some of our use-cases.

Before we dive into the topic, let's understand how GraphQL currently treats and handles errors. The response of a GraphQL query is of the following structure -

{
  "data": {
    "foo": null
  },
  "errors": [
    {
      "message": "Something happened",
      "path": ["foo", "bar"]
    }
  ]
}

Error extensions

The Schema we define for GraphQL applies only to the data field of the response. The errors field has a fixed, well-defined structure - Array<{ message: string, path: Array<string | number> }> in its simplest form, where each path entry is either a field name or a list index. The Schema we define does not affect this structure.
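The response shape described above can be written down as types. This is only a minimal sketch: the names GraphQLError, GraphQLResponse, failedPaths and exampleResponse are made up for illustration, and locations and extensions are included as optional fields because a response may carry them.

```typescript
// Illustrative types for a GraphQL response; the names are assumptions of this sketch.
interface GraphQLError {
  message: string;
  // Field names, or list indices when the error happened inside a list.
  path?: Array<string | number>;
  locations?: Array<{ line: number; column: number }>;
  extensions?: Record<string, unknown>;
}

interface GraphQLResponse<TData> {
  data?: TData | null;
  errors?: GraphQLError[];
}

// Returns a dotted path per error so a client can see which fields failed.
function failedPaths<TData>(response: GraphQLResponse<TData>): string[] {
  return (response.errors ?? []).map((error) => (error.path ?? []).join("."));
}

// The example response from above, typed with the sketch types.
const exampleResponse: GraphQLResponse<{ foo: null }> = {
  data: { foo: null },
  errors: [{ message: "Something happened", path: ["foo", "bar"] }],
};
```

Note that nothing in these types depends on the Schema of the API. That is exactly the limitation discussed here.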

Let's say the client queries a field using an ID. How can the client know from the above error object whether the Error is due to an Internal Server Error or because the ID was not found? Parsing the message is a no-go because it is not reliable.

Luckily, GraphQL provides a way to attach additional data to the error structure - the extensions field. The error.extensions object can convey other information related to the Error - properties, metadata, or other clues from which the client can benefit. For the above example, we can model the response to be -

const err = {
  data: {},
  errors: [
    {
      message: "Not Found",
      extensions: {
        code: "NOT_FOUND",
      },
    },
  ],
};
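On the client, the code can then drive the decision instead of the message. A minimal sketch, assuming only the NOT_FOUND code shown above; the names ErrorWithCode, ErrorKind and errorKind are hypothetical and not part of any library -

```typescript
// Hypothetical minimal error shape; only the fields this sketch needs.
interface ErrorWithCode {
  message: string;
  extensions?: { code?: string };
}

type ErrorKind = "NOT_FOUND" | "UNKNOWN";

// Decide on the machine-readable code, never on the human-readable message.
function errorKind(error: ErrorWithCode): ErrorKind {
  return error.extensions?.code === "NOT_FOUND" ? "NOT_FOUND" : "UNKNOWN";
}

const notFoundError: ErrorWithCode = {
  message: "Not Found",
  extensions: { code: "NOT_FOUND" },
};

// An error without extensions, like the first example in this post.
const plainError: ErrorWithCode = { message: "Something happened" };
```

Rewording the message later does not break this client, because only the code is inspected.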

Errors for Customers

When a GraphQL API delivers content to end-users (the customers), we have two levels of users -

The Developer or user of the API - UI/UX/front-end developer.
The Customer or end-user - the one who does not see any technical layers but gets the product's experience in its most presentable format. The front-end developer builds this experience using data from the GraphQL API.

Since using the word user might be confusing, from now on, Developer will refer to the front-end developer, and Customer will refer to the end-user.

[Image: Customer vs Developer]

When the data of an API is directly consumed by these two levels of users - Developer and Customer - each may have different error data requirements. For example, let's take mutations. When the Customer enters an invalid email address -

The Developer who uses the GraphQL API needs to know, in a parseable format, that the Customer has entered an invalid email address. A boolean, an enum, or whatever data structure you choose will work; anything except parsing the error message.
The Customer needs to see the error message in a nicely styled format close to the text box. Also, for different languages or locales, the error message needs to be the corresponding translated text.

Let's try to model this using the error extensions discussed above -

{
  "data": {},
  "errors": [
    {
      "message": "Die E-Mail-Adresse ist ungültig",
      "extensions": {
        "code": "INVALID_EMAIL"
      }
    }
  ]
}

While this would work, we soon end up in a case where multiple input fields in a mutation can be invalid. What can we do here? Do we model them as different errors or fit everything into the same Error?

The Customer errors still need to be usable by the Developers so that they can pass them on to the Customer. The front-end developers are the ones ultimately transforming our data structures into UI elements, so they need to understand the Error well enough to highlight the right input text-box with a red border. To make it easy, let's try modeling these as a single error with multiple validation messages -

{
  "data": {},
  "errors": [
    {
      "message": "Multiple inputs are invalid",
      "extensions": {
        "invalidInputs": [
          {
            "code": "INVALID_EMAIL",
            "message": "Die E-Mail-Adresse ist ungültig"
          },
          {
            "code": "INVALID_PASSWORD",
            "message": "Das Passwort erfüllt nicht die Sicherheitsstandards"
          }
        ]
      }
    }
  ]
}

The codes INVALID_EMAIL and INVALID_PASSWORD help the Developer highlight the right field in the UI, and the message is displayed to the Customer right under that text-box.
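To see what this means for the front-end, here is a rough sketch of the client code such a structure requires. The type names, the fieldByCode mapping, and the form field names "email" and "password" are assumptions made for this example -

```typescript
// Shapes mirror the JSON above; the type names are assumptions of this sketch.
interface InvalidInputEntry {
  code: string;
  message: string;
}

interface ValidationError {
  message: string;
  extensions?: { invalidInputs?: InvalidInputEntry[] };
}

// Hypothetical mapping from an error code to the form field it belongs to.
const fieldByCode: Record<string, string> = {
  INVALID_EMAIL: "email",
  INVALID_PASSWORD: "password",
};

// Collects the translated message to render under each form field.
function messagesByField(error: ValidationError): Record<string, string> {
  const messages: Record<string, string> = {};
  for (const input of error.extensions?.invalidInputs ?? []) {
    const field = fieldByCode[input.code];
    // Codes this client does not know about are skipped.
    if (field !== undefined) {
      messages[field] = input.message;
    }
  }
  return messages;
}

const exampleValidationError: ValidationError = {
  message: "Multiple inputs are invalid",
  extensions: {
    invalidInputs: [
      {
        code: "INVALID_PASSWORD",
        message: "Das Passwort erfüllt nicht die Sicherheitsstandards",
      },
      { code: "SOME_FUTURE_CODE", message: "..." },
    ],
  },
};
```

The mapping table and the types live only in the client. Nothing tells the Developer which codes exist or when a new one is added.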

All this leads to a complicated structure very soon and is not as friendly as the data modeled with a GraphQL schema.

Why you no Schema?

[Image: Errors don't have type definitions]

The biggest problem we face in modeling these in the extensions object is that it's not discoverable. We use a powerful language like GraphQL to define each field in our data structure using Schemas, but when designing the errors, we went back to a loose mode of not using any of the ideas GraphQL brought us.

Maybe, in future extensions of the language, we can write schemas for Errors as we write for Queries and Mutations. The developers using the Schema would then get all the benefits of GraphQL even when handling errors. For now, let's concentrate on modeling this using the existing language specification.

Errors in Schema

We want to enjoy the power of GraphQL - the discoverability of fields of data, the tooling, and other aspects for errors. Why don't we put some of these errors in the Schema instead of capturing them in extensions?

For example, the mutation discussed previously can be modeled like this -

The mutation returns a Result type.
The Result type is a union of Success and Error.
The Error type contains the necessary error info - like translated messages, etc.

type Mutation {
  register(email: String!, password: String!): RegisterResult
}

union RegisterResult = RegisterSuccess | RegisterError

type RegisterSuccess {
  id: ID!
  email: String!
}

type RegisterError {
  invalidInputs: [RegisterInvalidInput]
}

type RegisterInvalidInput {
  field: RegisterInvalidInputField!
  message: String!
}

enum RegisterInvalidInputField {
  EMAIL
  PASSWORD
}

This structure looks exactly like the one we designed above inside error extensions. The advantage of modeling it like this would be that we are using the benefits of GraphQL for errors.
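One of those benefits shows up directly in client code. The following is a sketch under a few assumptions: the query selects __typename on the result, the client treats invalidInputs and its items as non-null for brevity (the schema above allows nulls), and describeResult is a hypothetical helper -

```typescript
// Client-side mirror of the union above. __typename is the standard GraphQL
// meta field; the other names follow the schema in this post.
type RegisterInvalidInputField = "EMAIL" | "PASSWORD";

interface RegisterSuccess {
  __typename: "RegisterSuccess";
  id: string;
  email: string;
}

interface RegisterError {
  __typename: "RegisterError";
  // Simplification: treated as non-null here.
  invalidInputs: Array<{ field: RegisterInvalidInputField; message: string }>;
}

type RegisterResult = RegisterSuccess | RegisterError;

// Hypothetical helper that turns the result into lines of text for the UI.
function describeResult(result: RegisterResult): string[] {
  switch (result.__typename) {
    case "RegisterSuccess":
      return [`Registered ${result.email}`];
    case "RegisterError":
      return result.invalidInputs.map(
        (input) => `${input.field}: ${input.message}`
      );
  }
}
```

Because the field is an enum from the Schema, the compiler knows every possible value. With the extensions approach, the codes were plain strings that the client had to guess.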

When you have a hammer,

Now, with the idea of modeling errors as Schema types, we are left with more questions than answers -

Should I model all errors as GraphQL types?
How should I decide when to use error extensions and when to use GraphQL types for modeling errors?
etc.

[Image: The Problem hammer]

When we have multiple teams maintaining the platform, many people contribute and think about modeling different parts of the Schema. There should be clear definitions for the different aspects of the existing data structures and of the ideas that led us to such solutions. The design and the Schema are changed far less often than they are read and used.

GraphQL gave us the mindset of "Thinking in Graphs". If we suggest a new way of modeling errors, we need to talk about this mindset and its ideas. Not all errors fit into this modeling (error types in Schema), and it will make the GraphQL API less usable if we approach it by looking at all the errors as nails.

Classification

To model errors, let's try to find some analogies. I want to think about modeling these errors in terms of programming language errors. For example,

Go: Error vs. panic
Java: Exception vs. Error
Rust: Result vs. panic

These programming languages also model errors as two variants. In one variant (an error value in Go), we inform the Developer who uses the function. The Developer decides either to handle it or to pass it through. In the other variant (a panic in Go), we skip everything and bring the program to a halt. We inform the end-user of the program that something has happened. This small variation, captured as two different things, helps us understand the intention of data in errors.
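The same two variants can be sketched in TypeScript, where a returned value plays the role of Go's error and a thrown exception stands in for a panic. The function names and the quantity example are invented for illustration -

```typescript
// Variant 1: an error returned as a value. The caller (the Developer)
// decides whether to handle it or to pass it through.
type ParseResult =
  | { ok: true; value: number }
  | { ok: false; error: string };

function parseQuantity(input: string): ParseResult {
  const value = Number(input);
  const isWholeNumber = isFinite(value) && Math.floor(value) === value;
  if (input.trim() === "" || !isWholeNumber || value < 0) {
    return { ok: false, error: `"${input}" is not a valid quantity` };
  }
  return { ok: true, value };
}

// Variant 2: a thrown exception, the closest analogue to a panic in this
// sketch. It signals a bug; nobody on the call path is expected to handle it.
function invariantViolated(detail: string): never {
  throw new Error(`Invariant violated: ${detail}`);
}
```

The first variant is part of the function's signature and therefore discoverable, much like a type in a Schema. The second is invisible in the signature, much like an entry in response.errors.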

Part 1. Action-ables

What is an error? It tells us that something is wrong and gives us some information on what action can be taken. We can think of errors as containers of action-ables. When modeling them, we classify them into different groups depending on who can take that action.

In the GraphQL context, the front-end takes care of some errors - either by a fallback or a retry. For other errors, like invalid inputs, the front-end cannot take action; only the Customer who entered the invalid input can fix it.

Instead of modeling the errors loosely, we now have a concrete use-case - model it for whoever can take action.

Part 2. Bugs in the system

Errors convey information - either to the Developer or to the Customer. If the Error conveys a bug in the system, it should not be modeled as a schema error type. Here, the system means all the services and software involved in our entire product and not just the GraphQL service. This distinction is essential because it separates the Customer from the Developer who uses the API - the Customer looks at our product as one thing, not as many individual services.

In the 404 Not Found case, if we had modeled the errors as schema types, it would make the Schema less usable. Let's take a product look-up use-case -

{
  product(id: "foo") {
    ... on ProductSuccess {
      success
    }
    ... on ProductError {
      error
    }
  }
  collection(id: "bar") {
    ... on CollectionSuccess {
      products {
        ... on ProductSuccess {
          success
        }
      }
    }
    ... on CollectionError {
      error
    }
  }
}

This way of handling errors at every level is not friendly for front-end developers. It's too much to type in a query and too many branches to handle in the code.

Part 3. Error propagation

We also have to remember not to disrupt the GraphQL semantics of error propagation. If an error occurs in one place in the query, it propagates upwards in the tree until it reaches the first nullable field. This propagation does not happen with error types in the Schema. It is therefore essential to use schema error types only for specific use-cases. We go back to Part 1: Action-ables - we design these types for actions that the end-user or Customer can take.
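The propagation rule can be illustrated with a small model. This is a simplified sketch and not how a GraphQL server is implemented: it ignores lists, and PathStep, nulledPath and the field names product, price and amount are hypothetical -

```typescript
// One step on the path from the query root down to the failing field.
interface PathStep {
  field: string;
  nullable: boolean;
}

// Returns the path of the field that ends up as null in the response, or
// null when the error bubbles past the root and the whole data becomes null.
function nulledPath(steps: PathStep[]): string[] | null {
  // Walk upwards from the failing field to the first nullable field.
  for (let i = steps.length - 1; i >= 0; i--) {
    const step = steps[i];
    if (step !== undefined && step.nullable) {
      return steps.slice(0, i + 1).map((s) => s.field);
    }
  }
  return null;
}

// An error in the non-null field amount, below a nullable product.
const failingPath: PathStep[] = [
  { field: "product", nullable: true },
  { field: "price", nullable: false },
  { field: "amount", nullable: false },
];
```

Here the error in amount turns the whole product into null. If product returned a union with an error type instead, the failure would stay inside that one field and never reach its parents.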

The Problem type

Naming is half the battle in GraphQL. Since the name error is already taken by the GraphQL language (response.errors), it would be confusing to name our error types in the Schema Error. Looking for inspiration as we did before, we find a well-defined concept in RFC 7807 - Problem Details for HTTP APIs. So, we will call all our errors in the Schema Problems and, as it has always been, all other errors errors.

The above register schema with the Problem type would look like this -

type Mutation {
  register(email: String!, password: String!): RegisterResult
}

union RegisterResult = RegisterSuccess | RegisterProblem

type RegisterSuccess {
  id: ID!
  email: String!
}

type RegisterProblem {
  "translated message encompassing all invalid inputs."
  title: String!
  invalidInputs: [RegisterInvalidInput]
}

type RegisterInvalidInput {
  field: RegisterInvalidInputField!
  "translated message."
  message: String!
}

enum RegisterInvalidInputField {
  EMAIL
  PASSWORD
}
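On the server side, a resolver for register could build such a Problem from its input validation. The sketch below is illustrative only: the validation rules, the English placeholder messages (a real service would return translated text), and the function name validateRegistration are all assumptions -

```typescript
type RegisterInvalidInputField = "EMAIL" | "PASSWORD";

interface RegisterInvalidInput {
  field: RegisterInvalidInputField;
  message: string;
}

interface RegisterProblem {
  __typename: "RegisterProblem";
  title: string;
  invalidInputs: RegisterInvalidInput[];
}

// Returns a Problem when the Customer has to fix something, null otherwise.
// Both rules below are hypothetical and deliberately simple.
function validateRegistration(
  email: string,
  password: string
): RegisterProblem | null {
  const invalidInputs: RegisterInvalidInput[] = [];
  if (email.indexOf("@") < 1) {
    invalidInputs.push({ field: "EMAIL", message: "The email address is invalid" });
  }
  if (password.length < 8) {
    invalidInputs.push({ field: "PASSWORD", message: "The password is too short" });
  }
  if (invalidInputs.length === 0) {
    return null;
  }
  return {
    __typename: "RegisterProblem",
    title: "Some inputs are invalid",
    invalidInputs,
  };
}
```

Only failures that the Customer can fix end up in the Problem. An exception thrown while saving the registration would still surface in response.errors.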

Problem or Error

[Image: Errors vs Problems]

Problem refers to the Error as a Schema type. Error refers to the Error that appears in the response.errors array with an error code at error.extensions.code.

Case 1: Resource Not Found

404s are bugs in the system in the case of navigation. If the user navigates from the home page to a product page and ends up on a 404 page, some service selected an ID that leads to a 404 when resolved, and this was most likely already the case at the time of selection. It did not happen because the user entered some input. Also, these errors need to be propagated. So, this becomes an Error with the error code NOT_FOUND and not a Problem.

Case 2: Authorization

Authorization errors are of the Error type and do not fit a Problem type. Here, the action taker looks like the Customer, who needs to log in. But the UI can take action here and show a login dialog box to the Customer. In apps, the app decides to take the Customer to the login view. The action belongs first to the front-end and only then to the Customer. So, we model it for the Developer as an Error with the error code NOT_AUTHORIZED and not a Problem.

Case 3: Mutation Inputs

Mutation inputs are the only case where it is crucial to construct Problem types. They come directly from the Customer, and only the Customer can take action on them. So, we model these errors as Problems and not Errors.

Case 4: All other bugs / errors

Any runtime exception in the code or Internal Server Error from any backend that the GraphQL layer connects to should be modeled as an Error and need not contain an error code. This way, it is easy for the front-end to treat all errors without an error code as Internal Server Errors and take action accordingly - to retry or show the Customer an error page.
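Cases 1, 2 and 4 can be summed up as one small decision function in the front-end. This is a sketch: the codes NOT_FOUND and NOT_AUTHORIZED come from the cases above, while the type names and the action names are hypothetical -

```typescript
// Minimal error shape for this sketch; only the code matters here.
interface ResponseError {
  message: string;
  extensions?: { code?: string };
}

// Hypothetical actions a front-end might take.
type FrontendAction =
  | "SHOW_NOT_FOUND_PAGE"
  | "SHOW_LOGIN"
  | "RETRY_OR_ERROR_PAGE";

function actionFor(error: ResponseError): FrontendAction {
  switch (error.extensions?.code) {
    case "NOT_FOUND":
      return "SHOW_NOT_FOUND_PAGE"; // Case 1
    case "NOT_AUTHORIZED":
      return "SHOW_LOGIN"; // Case 2
    default:
      // Case 4: no code, or one this client does not know about.
      return "RETRY_OR_ERROR_PAGE";
  }
}
```

Case 3 is deliberately missing. Problems arrive in the data field and are handled where the mutation result is rendered.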

Conclusion

We have discussed the Problem type as a possible solution for use-cases where the error object in the GraphQL response does not suffice. But we have to be careful not to overuse it where the error extensions already provide enough value.

We have to understand that the Problem type in unnecessary places does make the query and front-end code complicated. Our GraphQL Schema should try to simplify and provide a friendly interface.

In case you are interested, here are further posts in the GraphQL series -

Understanding GraphQL Directives: Practical Use-Cases at Zalando (Boopathi Rajaa Nedunchezhiyan, Oct 19, 2023)
GraphQL persisted queries and Schema stability (Boopathi Rajaa Nedunchezhiyan, Feb 17, 2022)
Optimize GraphQL Server with Lookaheads (Boopathi Rajaa Nedunchezhiyan, Mar 18, 2021)
