We have a long-running pipeline that worked without issue for years on CentOS 7. After a recent upgrade to Rocky 9, however, we noticed that it was starting to fail. The application being tested would start without issue at the beginning of the test run, but as time passed it would fail to start because it could not find a valid JAVA_HOME environment variable.
This was odd for a few reasons:
- The test nodes are created from a validated image template with dependencies installed
- JAVA_HOME is set and confirmed valid at the start of the run
- The application was able to start at the beginning of the run
- Neither the tests nor the pipeline modify the Java install
Well, it turns out that the base image used to create the test image had dnf-automatic installed and enabled.
dnf-automatic itself is an alternative way to invoke dnf upgrade, but it is often configured to run from cron or systemd timers. It was this systemd timer that was waking up and updating Java during the test run, which in turn caused the JAVA_HOME environment variable to become invalid.
This incident highlighted a few problems with the test pipeline.
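To check whether a node is affected, the timers can be inspected and, on short-lived test nodes, switched off. The following is a minimal sketch, not the exact fix applied here. The function names are made up for illustration, and the list of timer units is an assumption based on what the dnf-automatic package ships on EL9, so confirm it against your own image.

```shell
# Timer units assumed to ship with the dnf-automatic package on EL9.
# Confirm on your image with: rpm -ql dnf-automatic | grep timer
DNF_AUTOMATIC_TIMERS="dnf-automatic.timer dnf-automatic-notifyonly.timer dnf-automatic-download.timer dnf-automatic-install.timer"

# Print each assumed timer unit followed by its enablement state.
# Falls back to "unknown" when systemctl reports nothing for the unit.
# The function body is a subshell so its variables do not leak.
report_dnf_automatic_timers() (
  for timer in $DNF_AUTOMATIC_TIMERS; do
    state=$(systemctl is-enabled "$timer" 2>/dev/null) || state=${state:-unknown}
    printf '%s %s\n' "$timer" "$state"
  done
)

# Stop and disable every assumed timer unit that exists on this system.
# Needs root; intended for the image build, not for long-running servers.
disable_dnf_automatic_timers() (
  for timer in $DNF_AUTOMATIC_TIMERS; do
    # Skip units that are not installed on this image.
    if systemctl cat "$timer" >/dev/null 2>&1; then
      systemctl disable --now "$timer"
    fi
  done
)
```

Running report_dnf_automatic_timers on a test node shows at a glance whether any of these timers is enabled. Calling disable_dnf_automatic_timers as root while building the test image keeps them from firing mid-run.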
- dnf-automatic is problematic for short-lived test nodes. The whole point of the test image is to ensure that your tests run on a well-known and consistent test environment. dnf-automatic invalidates this by modifying the system.
- The JAVA_HOME environment variable was too specific. The path used to discover the Java root folder was based on `dirname $(readlink -f $(which java))`. On my Fedora system `readlink -f $(which java)` resolves to `/usr/lib/jvm/java-17-openjdk-17.0.6.0.10-1.fc38.x86_64/bin/java`. If I use this path to determine JAVA_HOME, it becomes invalid as soon as Java is updated, because the versioned directory name changes. RHEL-like distros provide a number of symlinks to allow multiple versions of Java to be installed. Depending on the scenario, it may be more advisable to use one of the more generic links in /usr/lib/jvm.
- The application itself did not log the value of JAVA_HOME in the monitored log files, which hid the fact that the value had become invalid over time.
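The last two points can be addressed together with a small pre-flight check in whatever script launches the application. This is a minimal sketch under a few assumptions: the function names are hypothetical, and the generic link /usr/lib/jvm/jre-17 in the usage comment is only an example, so list /usr/lib/jvm on your own system to see which links it provides.

```shell
# Succeed only if the given directory contains an executable bin/java.
java_home_is_valid() {
  [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Log the current JAVA_HOME (and the directory it resolves to) so the
# value shows up in the logs, and fail fast if it is no longer valid.
check_java_home() {
  if java_home_is_valid "${JAVA_HOME:-}"; then
    echo "JAVA_HOME=$JAVA_HOME (resolves to $(readlink -f "$JAVA_HOME"))"
  else
    echo "JAVA_HOME=${JAVA_HOME:-<unset>} is not a valid Java home" >&2
    return 1
  fi
}

# Example usage in a launcher script (the link name is an assumption):
#   export JAVA_HOME=/usr/lib/jvm/jre-17
#   check_java_home || exit 1
```

Because the generic link is repointed when the package is updated, a JAVA_HOME set this way survives an update, whereas a fully resolved versioned path does not. Logging the resolved target on every start also makes any change visible in the logs.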
In general dnf-automatic is a very useful tool and something that I would definitely install and configure to install security updates on long-running servers. It should not have been installed on the short-lived test nodes which see frequent manual updates to put them in a known-good state.