Stopping OpenAPI from logging all your secrets
Java classes generated by the OpenAPI Generator include all
fields in their toString() implementation (yes, even if they’re called "password"), which raises the risk that
they’ll be accidentally included in application logs.
Unfortunately there’s no easy way to exclude fields
from the toString() method. This post provides a workaround but is mostly just a whinge about how
much this problem sucks.
tl;dr: to exclude a value from the generated toString(),
map it to a custom ‘wrapper’ class.
That way you can control how it’s logged. See
here
for the code and here for an explanation.
The Problem
Logging is one of the more annoying problems in software. More logs are generally good: we want the ability to observe our code in action. But if you log the wrong object, configure the wrong log level or misconfigure your analytics library, suddenly you’re storing database passwords, cryptographic keys and customer information in plain text.
Keeping Secrets Out of Logs provides a great explanation of why this issue sucks and what you can do about it. But in summary, there’s no one way to keep logs clean.
This is mostly a case of hard problems being hard. But part of the issue seems to be that, despite this affecting almost every production deployment, most tooling completely neglects it.
For example, libraries that should expect to encounter secrets, like validators and serialisers, rarely document what
information will be included in their errors. It’s up to you to feed them with bad data to see how they
behave. Will logging an exception result in Failed to parse field 'password' or
Failed to parse { "password": "s3cr3tz" }?
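One way to find out is to feed the library a recognisable fake secret and check whether it comes back in the exception message. The sketch below is illustrative only: it uses the JDK's Integer.parseInt as a stand-in for a validator or deserialiser, and the names LeakProbe, leaksInput and CANARY are hypothetical. It only inspects the top-level message, not causes or stack traces.

```java
import java.util.function.Consumer;

// Hypothetical probe: feeds a recognisable canary value to a parser and
// reports whether the canary shows up in the resulting exception message.
class LeakProbe {
    static final String CANARY = "s3cr3tz-canary";

    // Returns true if the parser threw and its message contains the input.
    static boolean leaksInput(Consumer<String> parser) {
        try {
            parser.accept(CANARY);
            return false; // parsed without error, so there is no message to log
        } catch (RuntimeException e) {
            String message = String.valueOf(e.getMessage());
            return message.contains(CANARY);
        }
    }

    public static void main(String[] args) {
        // Integer.parseInt stands in for a validator here. Its
        // NumberFormatException message quotes the offending input.
        System.out.println(leaksInput(Integer::parseInt)); // prints "true"
    }
}
```

A check like this can live in a unit test, so a library upgrade that changes its error messages doesn't go unnoticed.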
Cloud monitoring services like Datadog provide some tools to prevent logging secrets, but handling them once they’ve already been logged is a problem left up to you.
And code generators make it difficult to control how the classes they generate will be logged. Which is the topic of this post.
secret.toString()
It’s common knowledge that secrets should be excluded from a class’s toString() method. In the past this was
easy. The default Java toString() implementation only includes the class name and the object’s hash code. This is safe
to log by default (although not exactly useful). A developer would need to implement their own toString() method and
manually include a sensitive field for it to end up in your logs.
But now, with the ubiquitous use of Lombok and Java Records, the generated toString()
implementation automatically includes every field. This is usually very useful, but it means developers now
need to remember to override the default implementation on any class that contains a secret. And reviewers need
to spot the absence of a custom toString() to notice if it’s including something sensitive.
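The difference is easy to see side by side. This is a minimal sketch with hypothetical class names (PlainCredentials, RecordCredentials); it needs Java 16 or later for records, and the record output shown is what OpenJDK produces.

```java
// Hypothetical classes, for illustration only.

// A plain class with no toString() override inherits Object.toString(),
// which prints only the class name and a hash code.
class PlainCredentials {
    private final String username;
    private final String password;

    PlainCredentials(String username, String password) {
        this.username = username;
        this.password = password;
    }
}

// A record gets a generated toString() that includes every component.
record RecordCredentials(String username, String password) {}

class ToStringDemo {
    public static void main(String[] args) {
        // Prints something like "PlainCredentials@1b6d3586" (the hash varies).
        System.out.println(new PlainCredentials("alice", "s3cr3tz"));
        // On OpenJDK this prints the password in plain text:
        // RecordCredentials[username=alice, password=s3cr3tz]
        System.out.println(new RecordCredentials("alice", "s3cr3tz"));
    }
}
```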
Code generation makes this trickier. Lombok lets you
exclude fields from its generated toString() but
it has edge cases.
If you’re trying to generate models from an OpenAPI specification, the official
generator currently provides
no way to exclude fields from toString(). You
can use the format keyword to mark a field as
a password. That’ll exclude it correctly in Java,
but it
doesn’t seem to be implemented for other languages.
And it feels odd marking a name or email field as a password.
Thankfully we can hook into the format mechanism to handle things ourselves.
The Solution
We can configure the OpenAPI generator to map any value with format: secret to our own class with a safe toString() implementation.
First, create a wrapper class that’ll hold the secret string. This is based on the domain primitives section of the article mentioned earlier.
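import javax.annotation.Nonnull; // assumption: JSR-305; use whichever @Nonnull your project has
import lombok.EqualsAndHashCode;

// Wraps a sensitive string so it can't be logged by accident.
@EqualsAndHashCode
public class Secret {
    // Private so callers have to go through unwrap() to read the value.
    @Nonnull
    private final String secret;

    public Secret(@Nonnull String secret) {
        this.secret = secret;
    }

    // The only way to get the raw value back out.
    @Nonnull
    public String unwrap() {
        return secret;
    }

    // Never include the wrapped value in logs.
    @Override
    public String toString() {
        return "[redacted]";
    }
}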
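To see the effect, here is a dependency-free sketch of the same idea. The Secret below is a stand-in for the wrapper (hand-written equals and hashCode instead of Lombok), and LoginRequest is a hypothetical model whose toString() prints every field, much as a generated class would. Its output format is made up for the example and is not the generator's.

```java
import java.util.Objects;

// Dependency-free stand-in for the wrapper class (no Lombok, no @Nonnull).
final class Secret {
    private final String secret;

    Secret(String secret) {
        this.secret = Objects.requireNonNull(secret);
    }

    String unwrap() {
        return secret;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Secret && ((Secret) o).secret.equals(secret);
    }

    @Override
    public int hashCode() {
        return secret.hashCode();
    }

    @Override
    public String toString() {
        return "[redacted]";
    }
}

// Hypothetical model standing in for a generated class: its toString()
// blindly prints every field, including the wrapped one.
class LoginRequest {
    private final String username;
    private final Secret password;

    LoginRequest(String username, Secret password) {
        this.username = username;
        this.password = password;
    }

    @Override
    public String toString() {
        return "LoginRequest{username=" + username + ", password=" + password + "}";
    }
}

class SecretDemo {
    public static void main(String[] args) {
        LoginRequest request = new LoginRequest("alice", new Secret("s3cr3tz"));
        // Prints: LoginRequest{username=alice, password=[redacted]}
        System.out.println(request);
    }
}
```

The model doesn't need to know the field is sensitive. The wrapper's own toString() does the redaction wherever the value ends up being printed.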
Then,
configure OpenAPI to use our wrapper class for any field in the spec with
format: secret.
<!-- pom.xml-->
<configuration>
<typeMappings>secret=com.example.model.Secret</typeMappings>
</configuration>
Lastly, we need to tell the serialisation library how to wrap and unwrap our Secret class. This example uses
Gson, but other serialisation libraries will be similar.
Secret -> JSON:
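import com.google.gson.JsonElement;
import com.google.gson.JsonPrimitive;
import com.google.gson.JsonSerializationContext;
import com.google.gson.JsonSerializer;
import java.lang.reflect.Type;

public class SecretSerializer implements JsonSerializer<Secret> {
    @Override
    public JsonElement serialize(Secret secret, Type type, JsonSerializationContext jsonSerializationContext) {
        return new JsonPrimitive(secret.unwrap()); // written out as a plain JSON string
    }
}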
JSON -> Secret:
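import com.google.gson.JsonDeserializationContext;
import com.google.gson.JsonDeserializer;
import com.google.gson.JsonElement;
import com.google.gson.JsonParseException;
import java.lang.reflect.Type;

public class SecretDeserializer implements JsonDeserializer<Secret> {
    @Override
    public Secret deserialize(JsonElement jsonElement, Type type, JsonDeserializationContext jsonDeserializationContext) throws JsonParseException {
        return new Secret(jsonElement.getAsString()); // wrap the raw string as soon as it's parsed
    }
}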
If you’re using the generated API client you’ll also need to register the custom serialisers:
// Register our custom serializers with OpenAPI's generated API clients. JSON is a static object used by all clients
// for serializing requests and deserializing responses. So we should only need to configure this once.
JSON.setGson(new GsonBuilder()
        .registerTypeAdapter(Secret.class, new SecretDeserializer())
        .registerTypeAdapter(Secret.class, new SecretSerializer())
        .create());
And that’s it. The downside of this solution is you need to be able to add the format keyword to your OpenAPI
spec, which isn’t always possible if you’re the consumer of a specification produced by a third party.
You also need to remember to add the format keyword to every sensitive field in your spec. And we still have the
issue where a reviewer needs to spot its absence to notice a leak, which isn’t easy.
But having the specification list which fields are sensitive is great documentation. Especially for data like identifiers, where it isn’t always obvious if it’s personally identifiable information or not.
What this solution doesn’t do
A lot.
Again, there’s no one way to keep logs clean. This post is purely about keeping sensitive data out of generated
toString() implementations. It doesn’t protect the secret once it’s serialised back to JSON or once it’s pulled out
of our wrapper class. So you’ll also want to look at things like runtime log scanning and automatic redaction. You
may want a better name for the wrapper class too. Secret could imply some in-memory protections that this
implementation doesn’t have.
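The unwrapping limitation takes only a couple of lines to demonstrate. This sketch uses a compact stand-in for the wrapper under the hypothetical name Wrapped, and plain string concatenation in place of a real logger.

```java
// Compact, dependency-free stand-in for the wrapper class (hypothetical name).
final class Wrapped {
    private final String value;

    Wrapped(String value) {
        this.value = value;
    }

    String unwrap() {
        return value;
    }

    @Override
    public String toString() {
        return "[redacted]";
    }
}

class UnwrapLeakDemo {
    // Concatenating the wrapper calls its toString(), so this stays safe.
    static String safeLine(Wrapped token) {
        return "token=" + token;
    }

    // Once the value is unwrapped it is an ordinary String again.
    static String leakyLine(Wrapped token) {
        return "token=" + token.unwrap();
    }

    public static void main(String[] args) {
        Wrapped token = new Wrapped("s3cr3tz");
        System.out.println(safeLine(token));  // prints "token=[redacted]"
        System.out.println(leakyLine(token)); // prints "token=s3cr3tz"
    }
}
```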
But that’s why this is such an annoying problem. A safe toString() is just the first step in preventing a value
from hitting your logs. And it’s something most Java devs learn about in their first year. So why is this so tedious?