Adron's Composite Code

Speaking of Spring Boot Java Logging, Some Misadventures

Java Programming

Preface: I set out to add masking, redaction, or filtering of PII (Personally Identifiable Information) to the log files and log data that an application service produces. The application already exists, so this isn’t entirely a greenfield situation, but it generally fits the Spring Boot service style. The one key thing I wasn’t sure about when starting was what kind of logging was already incorporated into the operational service.

Log Masking for Different Logging Libraries

The first solution I came up with was to incorporate a converter or appender that would mask PII with a fixed string, like “****” or “-----”. Upon checking, this looked like it would work with a number of the top logging libraries for Java, especially those that commonly run in a Spring Boot service, like Logback or Log4j.

Solution: Extend ClassicConverter & Write Up a Classic Converter

I’ll dub this implementation the log-mask-example, and the repo is available here. There are several key pieces of collateral that make this work, the first being the logback-spring.xml configuration, added to the resources directory. The file looks like this, including the pattern template for the log message and the fully qualified class name of the converter (converterClass).

<configuration>
    <conversionRule conversionWord="piiMask" converterClass="org.skidrow.logmaskexample.PiiMaskingConverter"/>
    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>%d{yyyy-MM-dd HH:mm:ss} [%thread] %-5level %logger{36}- %piiMask %n</pattern>
        </encoder>
    </appender>
    <root level="info">
        <appender-ref ref="STDOUT"/>
    </root>
</configuration>

Then there is the PiiMaskingConverter class itself. It shapes up like this; note that I’ve removed the imports for brevity. Check out the file here for the full imports and all.

public class PiiMaskingConverter extends ClassicConverter {

    private List<PiiRegexPattern> patterns;

    public PiiMaskingConverter() throws IOException {
        // Load the regex definitions from regexMasks.json on the classpath.
        try (InputStream is = getClass().getResourceAsStream("/regexMasks.json")) {
            ObjectMapper mapper = new ObjectMapper();
            patterns = mapper.readValue(is, new TypeReference<List<PiiRegexPattern>>() {
            });
        } catch (IOException e) {
            throw new IOException("Error reading regexMasks.json", e);
        }
    }

    @Override
    public String convert(ILoggingEvent event) {
        return maskMessage(event.getFormattedMessage());
    }

    private String maskMessage(String message) {
        // Skip masking entirely when masking.enabled=false.
        if (!getMaskingEnabled()) {
            return message;
        }
        // Apply every configured regex in turn, replacing each match with the mask.
        for (PiiRegexPattern pattern : patterns) {
            message = message.replaceAll(pattern.getRegex(), "****");
        }
        return message;
    }

    private boolean getMaskingEnabled() {
        // Looked up on each log event; defaults to true when the property is absent.
        Environment env = ApplicationContextProvider.getApplicationContext().getBean(Environment.class);
        return env.getProperty("masking.enabled", Boolean.class, true);
    }
}

The PiiRegexPattern class I set up, which is populated from the values in the regexMasks.json file, looks like this.

public class PiiRegexPattern {

    private String title;
    private String regex;
    private String mask;

    public String getTitle() {
        return title;
    }

    public String getRegex() {
        return regex;
    }

    public String getMask() {
        return mask;
    }
}

Finally, the regexMasks.json file shapes up with a few examples like this.

[
    {
        "title": "emailMask",
        "regex": "\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}\\b"
    },
    {
        "title": "ipAddress",
        "regex": "\\b(?:[0-9]{1,3}\\.){3}[0-9]{1,3}\\b"
    },
    {
        "title": "phoneMask",
        "regex": "\\b(1[-+]?\\s?)?(\\(\\d{3}\\)\\s?\\d{3}[-.]?\\d{4}|\\d{3}[-.]?\\d{3}[-.]?\\d{4})\\b"
    },
    {
        "title": "internationalPhoneMask",
        "regex": "\\b\\+?\\d{1,3}?[-.\\s]?\\d{1,4}?[-.\\s]?\\d{1,4}?[-.\\s]?\\d{1,4}(?:\\s*x\\d{1,5})?\\b"
    }
]

This way it’s easy to just add additional regular expressions to the mix without adding them in code. Just add them to the JSON file and it’ll pull them in and run each of them.
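If you want to see what the masking loop does without pulling in Logback or Jackson, here is a minimal sketch using only the JDK. The class name MaskSketch and the in-code map are hypothetical stand-ins for the converter and the regexMasks.json file, and the patterns are precompiled once instead of being recompiled on every message, which is a design choice of this sketch and not how the converter above does it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch; not part of the log-mask-example repo.
class MaskSketch {

    // Stand-in for regexMasks.json: title -> precompiled pattern, kept in insertion order.
    private static final Map<String, Pattern> PATTERNS = new LinkedHashMap<>();

    static {
        PATTERNS.put("emailMask",
                Pattern.compile("\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}\\b"));
        PATTERNS.put("ipAddress",
                Pattern.compile("\\b(?:[0-9]{1,3}\\.){3}[0-9]{1,3}\\b"));
    }

    // Runs every pattern over the message, replacing each match with the mask string.
    static String mask(String message) {
        for (Pattern pattern : PATTERNS.values()) {
            message = pattern.matcher(message).replaceAll("****");
        }
        return message;
    }

    public static void main(String[] args) {
        // prints "login from **** by ****"
        System.out.println(mask("login from 10.0.0.12 by jane.doe@example.com"));
    }
}
```

Every pattern runs against every message, so the cost grows with the number of entries in the JSON file. That is worth keeping in mind for chatty services.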

The example repository I’ve linked also generates some kind of arbitrary log messages and examples just to have something to work with, if you want to clone it and give it a try. If you’ve got any questions, scroll on down the page and pop a comment in there. Happy to answer questions and elaborate on the approach.

Solution: Log4j2 Multi-Regular Expression Configuration Option

After further work I became aware of another option that eliminates the programmatic part of this implementation and puts everything in configuration. It requires a shift to Log4j2, which is kind of ideal anyway considering the performance and other characteristics of Log4j2.

This option involves setting up some regular expressions in the properties section of a default log4j2.xml configuration. You could use a custom configuration file too; it doesn’t need to be the default one. The properties shape up like this.

<Property name="phoneRegex">\b(1[-+]?\s)?(\(\d{3}\)\s?\d{3}[-.]?\d{4}|\d{3}[-.]?\d{3}[-.]?\d{4})\b</Property>
<Property name="emailRegex">\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b</Property>
<Property name="fullPasswordEntryRegex">\bpassword\b\s+\[([^]]+)]</Property>
<Property name="redactedText">RDCTD</Property>

My pattern layout, in the same configuration file, almost matching the default Spring Boot convention based configuration, shapes up like this.

<PatternLayout pattern="%d{yyyy-MM-dd'T'HH:mm:ss.SSSXXX} %5p ${sys:PID:-1} --- [%-15.15t] %-41.41logger : %replace{%replace{%replace{%msg}{${phoneRegex}}{${redactedText}}}{${emailRegex}}{${redactedText}}}{${fullPasswordEntryRegex}}{${redactedText}}%n"/>

Let’s break down the components of this pattern:

  1. %d{yyyy-MM-dd'T'HH:mm:ss.SSSXXX}: This is the date and time format. %d specifies that the timestamp of the log event should be included. The {} contains the format of the timestamp. In this case, it’s a standard ISO 8601 format, including the year, month, day, hour, minute, second, millisecond, and timezone.
  2. %5p: This represents the log level (like INFO, DEBUG, ERROR) in a 5-character wide field, right-aligned.
  3. ${sys:PID:-1}: This is a system property lookup. It attempts to include the Process ID (PID) of the running application. The :- separates the key from its default value, so if the PID property is not set, it falls back to 1.
  4. ---: This is just a separator for readability.
  5. [%-15.15t]: This includes the name of the thread that generated the log message. The -15.15 formatting means the thread name will be truncated or padded to exactly 15 characters.
  6. %-41.41logger: This is the logger’s name, usually the class name where the log statement is written. Similar to the thread name, it’s truncated or padded to 41 characters.
  7. %replace{...}: This is three nested replace operations wrapped around the log message (%msg), designed to redact sensitive information. Each %replace takes three arguments: the input, a regex pattern to search for, and the replacement text. The innermost replace runs first, so the sequence redacts phone numbers, then email addresses, then full password entries, each with ${redactedText}. The ${phoneRegex}, ${emailRegex}, and ${fullPasswordEntryRegex} placeholders must be defined in the properties section of the configuration to match their respective patterns.
  8. %n: This is a newline character, ensuring each log message is printed on a new line.
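To make the nesting in item 7 concrete, here is a rough JDK-only sketch of the same three-step replacement. It assumes %replace behaves like a regex replace-all over its input, which is my understanding of it; the class name NestedReplaceSketch is hypothetical, and the password pattern is written with the closing bracket escaped, which should match the same text as the property above.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch of what the nested %replace chain does to a message.
class NestedReplaceSketch {

    static final String REDACTED_TEXT = "RDCTD";

    static final Pattern PHONE = Pattern.compile(
            "\\b(1[-+]?\\s)?(\\(\\d{3}\\)\\s?\\d{3}[-.]?\\d{4}|\\d{3}[-.]?\\d{3}[-.]?\\d{4})\\b");
    static final Pattern EMAIL = Pattern.compile(
            "\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}\\b");
    static final Pattern FULL_PASSWORD_ENTRY = Pattern.compile(
            "\\bpassword\\b\\s+\\[([^\\]]+)\\]");

    // Innermost replace first (phone), then email, then the password entry.
    static String redact(String msg) {
        String afterPhone = PHONE.matcher(msg).replaceAll(REDACTED_TEXT);
        String afterEmail = EMAIL.matcher(afterPhone).replaceAll(REDACTED_TEXT);
        return FULL_PASSWORD_ENTRY.matcher(afterEmail).replaceAll(REDACTED_TEXT);
    }

    public static void main(String[] args) {
        // prints "user RDCTD RDCTD phone RDCTD"
        System.out.println(redact("user a@b.io password [hunter2] phone 555-123-4567"));
    }
}
```

Note that the password pattern swallows the word "password" along with the bracketed value, so the field name disappears from the log line too. If you’d rather keep the field name, the replacement text would need to put it back.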

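You also need to make sure the Log4j2 dependency is present in your build file (your Maven pom.xml or whatever you use). In a Spring Boot project that typically means adding spring-boot-starter-log4j2 and excluding the default spring-boot-starter-logging, which pulls in Logback.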

Solutions Discussion

Below are additional ways to set up custom logging in a Spring Boot application, aside from the two already mentioned. I offer these to extend the brainstorming options when I, or you, are trying to sort out what we need for logging in our applications. I hope it’s useful, and if you, dear reader, know of any I’ve missed, please add a comment with the additional option and I’ll add it to this list for completeness!

Aspect-Oriented Programming (AOP) for Logging

Custom Annotation for Logging

SLF4J MDC (Mapped Diagnostic Context) for Advanced Logging

Custom Logback Filters

Integration with External Logging Systems

Using JSON Log Format

Custom Log Levels

Summary i.e. the TLDR

In this post, I’ve discussed custom logging solutions in Java Spring Boot applications, with a focus on redacting Personally Identifiable Information (PII) from logs. The initial approach involved creating a custom Logback converter to mask PII using regular expressions, based on the default dependencies included in Spring Boot and configured via logback-spring.xml and a JSON file of patterns. I also explored a Log4j2 configuration that uses properties for regular expressions to redact sensitive information directly in the pattern layout. Additional custom logging strategies were suggested, including Aspect-Oriented Programming (AOP), custom annotations, SLF4J’s Mapped Diagnostic Context (MDC), custom Logback filters, integration with external logging systems like ELK, outputting logs in JSON format, and defining custom log levels. Together these give a range of options for managing and securing log data in Spring Boot applications and for keeping sensitive information out of the logs.
