Stopping OpenAPI from logging all your secrets

Java classes generated by the OpenAPI Generator include every field in their toString() implementation (yes, even fields called "password"), which raises the risk that secrets will be accidentally included in application logs.

Unfortunately there’s no easy way to exclude fields from the toString() method. This post provides a workaround but is mostly just a whinge about how much this problem sucks.

tl;dr: to exclude a value from the generated toString(), map it to a custom ‘wrapper’ class. That way you can control how it’s logged. See here for the code and here for an explanation.

The Problem

Logging is one of the more annoying problems in software. More logs are generally good: we want the ability to observe our code in action. But if you log the wrong object, configure the wrong log level or misconfigure your analytics library, suddenly you’re storing database passwords, cryptographic keys and customer information in plain text.

Keeping Secrets Out of Logs provides a great explanation of why this issue sucks and what you can do about it. But in summary, there’s no one way to keep logs clean.

This is mostly a case of hard problems being hard. But part of the issue seems to be that, despite this affecting almost every production deployment, most tooling completely neglects it.

For example, libraries that should expect to encounter secrets, like validators and serialisers, rarely document what information will be included in their errors. It’s up to you to feed them bad data to see how they behave. Will logging an exception result in `Failed to parse field 'password'` or `Failed to parse { "password": "s3cr3tz" }`?
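
One way to answer that question is to probe the library with a canary value and check whether it comes back in the exception message. Here’s a minimal sketch using only the JDK. ExceptionLeakProbe, leaksCanary and the canary string are made-up names for illustration, and the check would need adapting to whichever validator or serialiser you actually use.

```java
import java.util.Objects;

// Hypothetical helper for illustration: not part of any library.
final class ExceptionLeakProbe {

    // Runs the action and reports whether the thrown exception's message contains the canary value.
    static boolean leaksCanary(Runnable action, String canary) {
        try {
            action.run();
            return false; // nothing was thrown, so there is no message to inspect
        } catch (RuntimeException e) {
            return String.valueOf(e.getMessage()).contains(canary);
        }
    }

    public static void main(String[] args) {
        String canary = "s3cr3tz-canary";

        // Integer.parseInt echoes its input in the NumberFormatException message.
        System.out.println(leaksCanary(() -> Integer.parseInt(canary), canary)); // prints true

        // Objects.requireNonNull only reports the message we hand it.
        System.out.println(leaksCanary(
                () -> Objects.requireNonNull(null, "password must not be null"), canary)); // prints false
    }
}
```

Even the JDK’s own number parser echoes its input, so it’s worth running a check like this against anything that touches sensitive fields.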

Cloud monitoring services like Datadog provide some tools to prevent logging secrets, but handling them once they’ve already been logged is a problem left up to you.

And code generators make it difficult to control how the classes they generate will be logged, which is the topic of this post.

`secret.toString()`

It’s common knowledge that secrets should be excluded from a class’s toString() method. In the past this was easy. The default Java toString() implementation only includes the class name and the object’s hash code. This is safe to log by default (although not exactly useful). A developer would need to implement their own toString() method and manually include a sensitive field for it to end up in your logs.
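
To see that default in action, here’s a small sketch. DbCredentials is a hypothetical class that doesn’t override toString(), so it falls back to Object.toString().

```java
// Hypothetical class for illustration; it deliberately does not override toString().
final class DbCredentials {
    private final String username;
    private final String password;

    DbCredentials(String username, String password) {
        this.username = username;
        this.password = password;
    }
}

final class DefaultToStringDemo {
    public static void main(String[] args) {
        DbCredentials creds = new DbCredentials("app", "s3cr3tz");

        // Object.toString() returns the class name, an '@' and the hash code in hex,
        // so the field values never appear. The hex part varies between runs.
        System.out.println(creds);
    }
}
```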

But now, with the ubiquitous use of Lombok and Java records, the default toString() implementation automatically includes every field. This is usually very useful, but it means developers now need to remember to override the default implementation on any class that contains a secret. And reviewers need to spot the absence of a custom toString() to notice that it’s including something sensitive.
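
As a rough illustration, assuming Java 16 or later for record support: LoginRequest and SafeLoginRequest are hypothetical names. The first relies on the generated toString(), and the second overrides it by hand.

```java
// Hypothetical record: the compiler-generated toString() includes every component.
record LoginRequest(String username, String password) {}

// Same shape, but toString() is overridden to leave the password out.
record SafeLoginRequest(String username, String password) {
    @Override
    public String toString() {
        return "SafeLoginRequest[username=" + username + ", password=[redacted]]";
    }
}

final class RecordToStringDemo {
    public static void main(String[] args) {
        // The generated toString() output includes password=s3cr3tz.
        System.out.println(new LoginRequest("alice", "s3cr3tz"));

        // The overridden toString() output includes password=[redacted] instead.
        System.out.println(new SafeLoginRequest("alice", "s3cr3tz"));
    }
}
```

Nothing in the first record’s declaration hints that it’s unsafe to log, which is exactly what makes the missing override hard to spot in review.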

Code generation makes this trickier. Lombok lets you exclude fields from its generated toString() but it has edge cases.

If you’re trying to generate models from an OpenAPI specification, the official generator currently provides no way to exclude fields from toString(). You can use the format keyword to mark a field as a password. That’ll exclude it correctly in Java, but it doesn’t seem to be implemented for other languages. And it feels odd marking a name or email field as a password.

Thankfully we can hook into the format mechanism to handle things ourselves.

The Solution

We can configure the OpenAPI generator to map any value with format: secret to our own class with a safe toString() implementation.

First, create a wrapper class that’ll hold the secret string. This is based on the domain primitives section of the article mentioned earlier.

@EqualsAndHashCode
public class Secret {

  @Nonnull
  private final String secret; // private so callers have to go through unwrap()

  public Secret(@Nonnull String secret) {
    this.secret = secret;
  }

  @Nonnull
  public String unwrap() {
    return secret;
  }

  @Override
  public String toString() {
    return "[redacted]";
  }
}
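
The wrapper works because any toString() that includes the field, generated or not, ends up calling Secret.toString() rather than printing the raw string. Here’s a rough sketch of that, assuming Java 16 or later for the record. The Secret below is a cut-down copy of the wrapper without the Lombok and @Nonnull annotations so that it compiles on its own, and Credentials is a hypothetical stand-in for a generated model.

```java
// Cut-down copy of the wrapper class, minus Lombok and @Nonnull.
final class Secret {
    private final String secret;

    Secret(String secret) {
        this.secret = secret;
    }

    String unwrap() {
        return secret;
    }

    @Override
    public String toString() {
        return "[redacted]";
    }
}

// Hypothetical stand-in for a generated model that holds a Secret.
record Credentials(String username, Secret password) {}

final class SecretToStringDemo {
    public static void main(String[] args) {
        Credentials creds = new Credentials("app", new Secret("s3cr3tz"));

        // The record's generated toString() calls Secret.toString() for the password component.
        System.out.println(creds);

        // String concatenation goes through toString() as well.
        System.out.println("Connecting with password " + creds.password());
    }
}
```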

Then, configure OpenAPI to use our wrapper class for any field in the spec with format: secret.

<!-- pom.xml -->

<configuration>
  <typeMappings>secret=com.example.model.Secret</typeMappings>
</configuration>

Lastly, we need to tell the serialisation library how to wrap and unwrap our Secret class. This example uses Gson, but other serialisation libraries will be similar.

Secret -> JSON:

public class SecretSerializer implements JsonSerializer<Secret> {

  @Override
  public JsonElement serialize(Secret secret, Type type, JsonSerializationContext jsonSerializationContext) {
    return new JsonPrimitive(secret.unwrap());
  }
}

JSON -> Secret:

public class SecretDeserializer implements JsonDeserializer<Secret> {

  @Override
  public Secret deserialize(JsonElement jsonElement, Type type, JsonDeserializationContext jsonDeserializationContext) throws JsonParseException {
    return new Secret(jsonElement.getAsString());
  }
}

If you’re using the generated API client, you’ll also need to register the custom serialiser and deserialiser:

// Register our custom serializers with OpenAPI's generated API clients. JSON is a static object used by all clients
// for serializing requests and deserializing responses. So we should only need to configure this once.
JSON.setGson(new GsonBuilder()
    .registerTypeAdapter(Secret.class, new SecretDeserializer())
    .registerTypeAdapter(Secret.class, new SecretSerializer())
    .create());

And that’s it. The downside of this solution is that you need to be able to add the format keyword to your OpenAPI spec, which isn’t always possible if you’re the consumer of a specification produced by some third party.

You also need to remember to add the format keyword to every sensitive field in your spec. And we still have the issue where a reviewer needs to spot its absence to notice a leak, which isn’t easy.

But having the specification list which fields are sensitive is great documentation. Especially for data like identifiers, where it isn’t always obvious if it’s personally identifiable information or not.

What this solution doesn’t do

A lot.

Again, there’s no one way to keep logs clean. This post is purely about keeping sensitive data out of generated toString() implementations. It doesn’t protect the secret once it’s serialised back to JSON or once it’s pulled out of our wrapper class. So you’ll also want to look at things like runtime log scanning and automatic redaction. You may want a better name for the wrapper class too. Secret could imply some in-memory protections that this implementation doesn’t have.

But that’s why this is such an annoying problem. A safe toString() is just the first step in preventing a value from hitting your logs. And it’s something most Java devs learn about in their first year. So why is this so tedious?