
Date and Size rolling in log4j

I am currently working on a project where we ran into the problem of huge daily logs. We have several environments that live on the same shared file system, so if the logs of one environment consume all the free disk space, the other environments crash. Our log policy uses daily rolling: every day a new log file is created, and logs are kept only for the last N days; any log older than N days is deleted. On a usual day this logic is fine, but when errors occur it can be dangerous.

For example, some module loses its connection to the database and starts logging an error with details, attempting to restore the connection every 10 seconds. This works for a minor network issue, when the connection is restored in minutes, maybe hours... But with a serious issue, the connection cannot be restored without a developer stepping in. Imagine that it happens at night, when everybody is offline and nobody is looking at the issue in real time (I forgot to mention: these are dev environments, not prod). As a result, the module produces a huge log full of identical errors. Going further... What if the issue is on the database side and all modules in the environment lose their connection? Correct, every module produces logs at a high rate. In the morning, a developer finds log files of several GB, no free space on the shared disk, and all the other dev environments dead for lack of space.

There are several ways to resolve this issue and improve the stability of the development environments:

  • increase disk space;
  • review logging policy on the application level: produce fewer messages;
  • review the recovery policy of the module: for example, increase time between attempts to reestablish connection;
  • review logging policy on the logger level: introduce a log size limit, enable ZIP compression for old logs.

Increase disk space? In enterprise development? Huh... you must be kidding. It could take ages. No, of course it's possible, but we would still have limits: exhausting the disk would take more than one day, yet it could still happen over a long weekend.

What about changing the logging policy? First of all, it can require more complicated message output logic and can reduce the chances of finding the root cause of a problem due to lack of details. Secondly, it requires code changes and, by the way, the application code is really huge, so carefully reviewing all of it to find messages that could be logged less often is a challenge.

Changing the module recovery policy also requires code changes. Besides, it is not clear what the new interval between recovery attempts should be or how it would affect other modules. For example, with a minor issue, a module that recovers in 10 seconds may preserve the working state of a dependent module, while a recovery time of 15 seconds may crash other modules, so a minor issue becomes a serious problem.

So the most painless way is to introduce a log size limit.

The project currently uses log4j version 1.2.17. A short investigation showed that daily log rolling that also takes log size into account is not possible out of the box (there are DailyRollingFileAppender and RollingFileAppender, but they act independently). It can be done in log4j starting from version 2, but migrating to log4j 2 promised to be very painful due to API changes and the size of our project (also, at that time log4j 2 was in beta). So we looked at third-party log4j appenders. As a result, we found and started to use TimeAndSizeRollingAppender.

It comes as a Maven dependency and can be configured like any other log4j appender:

<appender name="FILE" class="uk.org.simonsite.log4j.appender.TimeAndSizeRollingAppender">
  <param name="File" value="${dir}/logs/${log4j.id}.log"/>
  <param name="Append" value="true"/>
  <param name="DatePattern" value="'.'yyyy-MM-dd"/>
  <param name="MaxFileSize" value="100MB"/>
  <param name="MaxRollFileCount" value="10"/>
  <param name="DateRollEnforced" value="true"/>
  <param name="CompressionAlgorithm" value="ZIP"/>
  <param name="CompressionMinQueueSize" value="5"/>
  <layout class="org.apache.log4j.PatternLayout">
    <param name="ConversionPattern" value="%d %-5p [%c Thread:%t] %m%n"/>
  </layout>
</appender>

All the params here have clear names, I suppose. To put it simply, the above configuration keeps at most 10 log files on disk at the same time, irrespective of why a new file was created: because of the daily roll or because the file size exceeded the specified limit. Every day a new log file is created and the previous day's log is renamed according to the DatePattern param. In addition, MaxFileSize sets a file size limit; once the file exceeds it, a new log file is also created. In both cases the oldest log is deleted if the number of files becomes greater than MaxRollFileCount.
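To make the retention rule more concrete, here is a minimal sketch in plain Java that simulates it. This is not the appender's code: the class RollSimulation, its methods and the backup file names are hypothetical, made up for illustration. It assumes that every roll (by date or by size) adds one backup file and that the oldest backup is dropped once the count exceeds MaxRollFileCount. The disk estimate is a rough upper bound that assumes the active file is kept in addition to the backups and ignores compression.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of the retention rule described above; not the appender's real code. */
public class RollSimulation {
    private final int maxRollFileCount;                       // mirrors MaxRollFileCount
    private final Deque<String> backups = new ArrayDeque<>(); // oldest backup first

    public RollSimulation(int maxRollFileCount) {
        this.maxRollFileCount = maxRollFileCount;
    }

    /** Record one roll, whatever triggered it, and drop the oldest backups over the limit. */
    public void roll(String backupName) {
        backups.addLast(backupName);
        while (backups.size() > maxRollFileCount) {
            backups.removeFirst();
        }
    }

    public int backupCount() {
        return backups.size();
    }

    public String oldestBackup() {
        return backups.peekFirst();
    }

    /** Rough upper bound on disk usage: every backup at full size plus the active file. */
    public static long worstCaseBytes(long maxFileSizeBytes, int maxRollFileCount) {
        return maxFileSizeBytes * (maxRollFileCount + 1L);
    }

    public static void main(String[] args) {
        RollSimulation sim = new RollSimulation(10);
        // A noisy night: 25 size-triggered rolls (file names are hypothetical).
        for (int i = 1; i <= 25; i++) {
            sim.roll("app.log.2015-01-01." + i);
        }
        System.out.println(sim.backupCount());  // 10
        System.out.println(sim.oldestBackup()); // app.log.2015-01-01.16
        long bytes = worstCaseBytes(100L * 1024 * 1024, 10);
        System.out.println(bytes / (1024 * 1024) + " MB"); // 1100 MB
    }
}
```

Under these assumptions, one module can occupy roughly 1.1 GB at most with the configuration above, no matter how long the error storm lasts. Multiply that by the number of modules and environments on the shared disk to choose MaxFileSize and MaxRollFileCount.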

Pay attention to the compression settings. CompressionAlgorithm enables compression of old logs, and CompressionMinQueueSize defines the maximum number of uncompressed files kept. So, in this case, the latest 5 logs are stored uncompressed and all other log files are compressed. This feature is handy because you can read the latest logs without decompressing them.
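The split between compressed and uncompressed files can be sketched the same way. Again, this is only a toy model that follows the description above, not the appender's implementation; the class CompressionSplit and its methods are hypothetical names.

```java
/** Toy model of the compression split described above; not the appender's real code. */
public class CompressionSplit {

    /** Backups left uncompressed: the newest ones, at most minQueueSize of them. */
    public static int uncompressed(int backupCount, int minQueueSize) {
        return Math.min(backupCount, minQueueSize);
    }

    /** Backups that end up compressed: everything beyond the uncompressed queue. */
    public static int compressed(int backupCount, int minQueueSize) {
        return Math.max(0, backupCount - minQueueSize);
    }

    public static void main(String[] args) {
        // Values from the configuration: MaxRollFileCount=10, CompressionMinQueueSize=5.
        System.out.println(uncompressed(10, 5)); // 5
        System.out.println(compressed(10, 5));   // 5
        // Early on, with only 3 backups, nothing is compressed yet.
        System.out.println(compressed(3, 5));    // 0
    }
}
```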

This appender protects the file system from unexpected log growth. Of course, you can lose the original error message during log rolling, but that depends on the way you output errors. At least you will surely see the error message that spams your log, and overall system stability will not be affected by a 'no free space' error.