Three Ways To Reduce Your Java Logging Costs By Sampling

Reading time: 4 minutes

TL;DR: Log sampling helps manage log volumes by only recording a portion of events. Use these examples to configure sampling:

  • Logback (used by Spring Boot v2.x and later): Use SamplingAppender and FixedRateEvaluator to log 1 out of every 10 events.
  • Log4j 2: Apply BurstFilter to allow bursts of up to 10 events, then throttle to an average of 5 events per second.
  • JUL: Implement a Filter in custom code for 1 in 5 sampling.

Why Log Sampling?

Log sampling records only a subset of events, trading some completeness for lower cost and overhead:

  • Reduce storage costs: Recording only a portion of events cuts storage use.
  • Improve performance: Writing fewer events lowers I/O load on the application.
  • Maintain insights: A representative sample keeps much of the diagnostic value while reducing noise.

Configuring Log Sampling in Java

1. Logback (used by Spring Boot v2.x and later)

Dependencies:

  • Maven:
  <dependency>
      <groupId>ch.qos.logback</groupId>
      <artifactId>logback-classic</artifactId>
      <version>1.2.11</version>
  </dependency>
  • Gradle:
  implementation 'ch.qos.logback:logback-classic:1.2.11'

Config:

<configuration>
    <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>%d{yyyy-MM-dd HH:mm:ss} [%thread] %-5level %logger{36} - %msg%n</pattern>
        </encoder>
    </appender>

    <appender name="SAMPLED_CONSOLE" class="ch.qos.logback.core.sampling.SamplingAppender">
        <appender-ref ref="CONSOLE"/>
        <evaluator class="ch.qos.logback.core.sampling.FixedRateEvaluator">
            <rate>10</rate>
        </evaluator>
    </appender>

    <root level="INFO">
        <appender-ref ref="SAMPLED_CONSOLE"/>
    </root>
</configuration>

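Note: With a rate of 10, this configuration is meant to log 1 out of every 10 messages, regardless of message content. Check that SamplingAppender and FixedRateEvaluator actually resolve on your classpath before relying on it: the logback-classic 1.2.11 artifact listed above is not known to ship a ch.qos.logback.core.sampling package, so these classes may have to come from custom code or another library.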
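The 1-in-10 rule itself can be sketched without any logging framework. FixedRateSampler below is a hypothetical helper, not a Logback class; it only illustrates the counting logic. In Logback, one plausible place to call something like it is the decide method of a custom ch.qos.logback.core.filter.Filter, but that wiring is an assumption and is not shown here.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: keeps 1 event out of every `rate`, ignoring content. */
class FixedRateSampler {
    private final int rate;                            // keep 1 out of every `rate` events
    private final AtomicLong seen = new AtomicLong();  // events observed so far

    FixedRateSampler(int rate) {
        if (rate < 1) {
            throw new IllegalArgumentException("rate must be >= 1");
        }
        this.rate = rate;
    }

    /** Returns true on every rate-th call and false otherwise. */
    boolean shouldLog() {
        return seen.incrementAndGet() % rate == 0;
    }

    public static void main(String[] args) {
        FixedRateSampler sampler = new FixedRateSampler(10);
        int kept = 0;
        for (int i = 0; i < 100; i++) {
            if (sampler.shouldLog()) {
                kept++;
            }
        }
        // 100 events at rate 10 leave 10 events.
        System.out.println("kept " + kept + " of 100 events");
    }
}
```

Because the decision depends only on the counter, a rare but important message is as likely to be dropped as a routine one.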

2. Log4j 2

Dependencies:

  • Maven:
  <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-core</artifactId>
      <version>2.17.1</version>
  </dependency>
  • Gradle:
  implementation 'org.apache.logging.log4j:log4j-core:2.17.1'

Config:

<Configuration status="WARN">
    <Appenders>
        <Console name="Console" target="SYSTEM_OUT">
            <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss} %-5level %logger{36} - %msg%n"/>
            <BurstFilter level="INFO" rate="5" maxBurst="10"/>
        </Console>
    </Appenders>

    <Loggers>
        <Root level="INFO">
            <AppenderRef ref="Console"/>
        </Root>
    </Loggers>
</Configuration>

Note: BurstFilter throttles rather than samples. rate is the average number of events allowed per second, and maxBurst is the number of events that may pass in a burst before throttling begins. This configuration therefore allows bursts of up to 10 events and then an average of 5 events per second. Only events at or below the configured level (INFO here) are throttled; WARN and above always pass.
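To see how burst throttling differs from 1-in-N sampling, it can help to model it as a token bucket. The sketch below is a simplified model, not Log4j's implementation: BurstThrottle and tryAcquire are hypothetical names, and the clock is injected so the behaviour can be checked without waiting. It assumes, as in the Log4j 2 documentation, that rate means events per second and maxBurst is the burst capacity.

```java
import java.util.function.LongSupplier;

/** Simplified token-bucket model of burst throttling (not Log4j's own code). */
class BurstThrottle {
    private final double ratePerSecond;    // average events allowed per second
    private final double maxBurst;         // bucket capacity
    private final LongSupplier nanoClock;  // injectable clock, e.g. System::nanoTime
    private double tokens;
    private long lastRefillNanos;

    BurstThrottle(double ratePerSecond, int maxBurst, LongSupplier nanoClock) {
        this.ratePerSecond = ratePerSecond;
        this.maxBurst = maxBurst;
        this.nanoClock = nanoClock;
        this.tokens = maxBurst;            // start with a full bucket
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Returns true if the event may be logged, false if it should be dropped. */
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
        // Refill in proportion to elapsed time, never beyond the burst capacity.
        tokens = Math.min(maxBurst, tokens + elapsedSeconds * ratePerSecond);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        long[] fakeNow = {0L};  // fake clock in nanoseconds
        BurstThrottle throttle = new BurstThrottle(5, 10, () -> fakeNow[0]);
        int passed = 0;
        for (int i = 0; i < 25; i++) {
            if (throttle.tryAcquire()) {
                passed++;
            }
        }
        System.out.println("passed in initial burst: " + passed);
        fakeNow[0] += 1_000_000_000L;  // advance the fake clock by one second
        passed = 0;
        for (int i = 0; i < 25; i++) {
            if (throttle.tryAcquire()) {
                passed++;
            }
        }
        System.out.println("passed one second later: " + passed);
    }
}
```

With rate 5 and maxBurst 10, the first 10 of 25 back-to-back events pass and the rest are dropped; one second later, 5 more are allowed. Unlike a fixed 1-in-N sampler, quiet periods lose nothing and only sustained floods are cut.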

3. Java Util Logging (JUL)

Custom Code Example:

import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.ConsoleHandler;
import java.util.logging.Filter;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class SamplingFilter implements Filter {
    // Static, so the count is shared by every SamplingFilter instance.
    private static final AtomicInteger counter = new AtomicInteger(0);
    private static final int SAMPLE_RATE = 5;

    @Override
    public boolean isLoggable(LogRecord record) {
        // Let every fifth record through and drop the rest.
        return counter.incrementAndGet() % SAMPLE_RATE == 0;
    }

    public static void main(String[] args) {
        Logger logger = Logger.getLogger(SamplingFilter.class.getName());
        // Keep records from also reaching the root logger's unfiltered console handler.
        logger.setUseParentHandlers(false);
        ConsoleHandler consoleHandler = new ConsoleHandler();
        consoleHandler.setFilter(new SamplingFilter());
        logger.addHandler(consoleHandler);
        // Emit ten messages so the sampling is visible: only #5 and #10 are printed.
        for (int i = 1; i <= 10; i++) {
            logger.info("This is a sampled log message #" + i);
        }
    }
}

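Note: The custom filter above lets through every fifth record that reaches the handler, so 1 in 5 messages is logged. AtomicInteger keeps the counter updates atomic across threads, and because the counter is static it is shared by all SamplingFilter instances.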
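A common refinement is to sample only low-severity records so that warnings and errors are never dropped. The sketch below is one way to do that with JUL; LevelAwareSamplingFilter is a hypothetical class name, and the choice of WARNING as the cut-off is an assumption you would adjust to your own needs.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Filter;
import java.util.logging.Level;
import java.util.logging.LogRecord;

/** Hypothetical variant: samples records below WARNING, always keeps WARNING and SEVERE. */
class LevelAwareSamplingFilter implements Filter {
    private final int sampleRate;                         // keep 1 in sampleRate low-severity records
    private final AtomicLong counter = new AtomicLong();  // counts low-severity records only

    LevelAwareSamplingFilter(int sampleRate) {
        if (sampleRate < 1) {
            throw new IllegalArgumentException("sampleRate must be >= 1");
        }
        this.sampleRate = sampleRate;
    }

    @Override
    public boolean isLoggable(LogRecord record) {
        // Records at WARNING or above bypass sampling entirely.
        if (record.getLevel().intValue() >= Level.WARNING.intValue()) {
            return true;
        }
        // Everything else is sampled at a fixed 1-in-sampleRate rate.
        return counter.incrementAndGet() % sampleRate == 0;
    }
}
```

It attaches the same way as the filter above, for example consoleHandler.setFilter(new LevelAwareSamplingFilter(5)). The counter here is per instance rather than static, so each handler samples independently.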

Summary

  • Logback (Spring Boot default): Simple XML configuration with SamplingAppender.
  • Log4j 2: Flexible with BurstFilter for controlled bursts.
  • JUL: Custom Filter needed for sampling.
  • Logstash Logback Encoder: JSON format and flexible configurations.

Log sampling helps balance monitoring needs and system performance. Adjust configurations to suit your project for optimal results.