As you may have heard, a critical vulnerability (CVE-2021-44228) was recently discovered in the Apache Log4j software library, which is widely used for logging program output in open-source projects around the world. Under certain circumstances, this vulnerability can allow an attacker to run arbitrary code on computers that run software containing the vulnerable library.
Our information security team was made aware of this vulnerability shortly after its discovery and immediately moved to assess and remediate any effect this might have on Terra’s security. The good news is that they found no evidence of any successful attacks in the Terra logs, and our engineering teams were able to quickly patch all Terra services that could be affected by the Log4j vulnerability according to the latest guidance from Apache.
This leaves two points on which you may need to take action to further secure your work in Terra.
First, if you have an existing Dataproc cluster that was created before 12:26 PM ET on December 15, 2021, you will need to delete and recreate the cluster in order to benefit from the security updates that were made by our team.
Second, you may need to update any Java-based analysis software packages you use in your work. This may not be trivial, because Log4j is used in many Java-based projects, and it’s up to their developers to provide updated versions. For example, the Genome Analysis Toolkit (GATK) was updated this week to use the latest version of Log4j, and the development team recommends that everyone using GATK upgrade to the new version to be safe. Accordingly, we have already updated the GATK Docker image in all environments and workflows available in Terra that our team controls. If you have been using one of our preconfigured cloud environments or workflows that includes GATK, please upgrade to the newest version. We also recommend that you check any other Java tools you use that may be affected by this vulnerability, and switch to updated versions if available.
What if you can’t update a third-party tool to a safe version?
We understand that some tools you use may not get updated by their developers right away, or you may not be able to update mid-project, so here is a summary of what you need to know to understand and mitigate the risk of using tools that carry the Log4j vulnerability.
In a nutshell, this vulnerability is a software flaw that allows an attacker to force the Log4j logging component bundled in a Java tool to download and run malicious code. First, the attacker crafts a string of text that tells Log4j to retrieve code from an external server using a function called a “JNDI lookup”, and to run that code on the targeted computer. The attack is triggered when the attack string passes through the logging system of a tool that has the vulnerability. This can happen, for example, if the attack string is embedded in a data input file in a way that causes the analysis tool to emit an error message that includes the attack string itself. Given the right set of circumstances, this can allow the attacker to take control of the machine and run arbitrary code on it. (For more technical details, see the official vulnerability report; you may also appreciate the helpful diagram on the Swiss Government Computer Emergency Response Team’s blog.) This kind of remote code execution attack, or “exploit”, is often used to deploy mass malware for launching DDoS attacks and mining cryptocurrencies, and can also be used for espionage.
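To make the idea of an “attack string” more concrete, here is a minimal Python sketch that looks for the plain form of a JNDI lookup expression (text starting with `${jndi:`) in a text input. The function names and the regular expression are illustrative assumptions of ours, not part of Log4j or Terra. The pattern only catches the simplest form; attackers can obfuscate the string with nested lookups, so a clean result does not prove that a file is safe.

```python
import re
from typing import List, Tuple

# Matches only the plain form of a Log4j JNDI lookup, e.g. "${jndi:ldap://host/a}".
# Assumption: obfuscated variants (nested lookups and similar tricks) are NOT covered.
JNDI_PATTERN = re.compile(r"\$\{jndi:[^}]*\}", re.IGNORECASE)


def find_jndi_lookups(text: str) -> List[str]:
    """Return every plain JNDI lookup expression found in the given text."""
    return JNDI_PATTERN.findall(text)


def scan_file(path: str) -> List[Tuple[int, str]]:
    """Return (line_number, expression) pairs for each match in a text file."""
    hits = []
    # errors="replace" keeps the scan going if the file has undecodable bytes.
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line_number, line in enumerate(handle, start=1):
            for expression in find_jndi_lookups(line):
                hits.append((line_number, expression))
    return hits
```

Calling `scan_file` on a text-based input, such as a sample sheet or a file header, returns an empty list when no plain lookup expression is present. Treat any hit as a reason to stop and investigate the source of the file.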
In any computing context, the best way to protect yourself against this vulnerability is to update any Java-based analysis tools you use to a safe version; but if that’s not possible, another option is to disable the JNDI lookup function of each tool by deleting the relevant code from the tool’s JAR file, as noted in the Apache Log4j mitigation documentation:
(…) in any release other than 2.16.0, you may remove the JndiLookup class from the classpath:
    zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class
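Before or after applying this mitigation, it can be useful to check whether a given JAR file still contains the class in question. The following is a minimal Python sketch using only the standard library; the function name is our own, and the check rests on a simplifying assumption: it only looks at the top-level entries of the archive. It will not see a Log4j JAR nested inside another JAR, or a copy of the class that was relocated to a different package path.

```python
import zipfile

# Path of the class named in the Apache mitigation guidance quoted above.
JNDI_LOOKUP_CLASS = "org/apache/logging/log4j/core/lookup/JndiLookup.class"


def jar_contains_jndi_lookup(jar_path: str) -> bool:
    """Return True if the JAR has the JndiLookup class among its top-level entries.

    Assumption: nested JARs and relocated (shaded) packages are not inspected.
    """
    # A JAR file is a ZIP archive, so the standard zipfile module can list it.
    with zipfile.ZipFile(jar_path) as jar:
        return JNDI_LOOKUP_CLASS in jar.namelist()
```

If the function returns `False` after you have run the `zip -q -d` command on that JAR, the class was removed from that archive. A `True` result means the class is still there.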
In Terra, the virtual machines provided for running analyses in cloud environments and via workflows are specifically configured to be isolated from the rest of the system, so it is difficult for an attacker to “move sideways” and affect anything outside of that one machine. Nevertheless, you should adopt safe practices to keep your work (and any data you have access to) as secure as possible. If you cannot apply the mitigation steps outlined above, another way to protect yourself is to verify the integrity of your data inputs, which is something we recommend doing in general anyway. For example, you should always check whether the data input files you’re using come from a trusted source (be especially careful with files provided by pseudonymous users on discussion forums), and in many cases you can verify the integrity of data files using checksum tools such as md5sum, or format validation tools. (Make sure the validation tool you use is not itself vulnerable to this exploit!)
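As an illustration of the checksum approach, here is a minimal Python sketch that computes the MD5 digest of a file and compares it with the value published by the data provider. The function names are our own, and we assume the provider lists the checksum as a hexadecimal string. Keep in mind that a matching checksum only shows that the file is the one the provider published; it says nothing about whether the provider itself is trustworthy.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large data files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected_md5(path: str, expected_hex: str) -> bool:
    """Return True if the file's MD5 digest equals the published value."""
    # Published checksums may differ in letter case or carry stray whitespace.
    return md5_of_file(path) == expected_hex.strip().lower()
```

This is the same comparison that the `md5sum` command-line tool performs, written out so it can be dropped into a notebook or a workflow pre-check.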
You can find more details about Terra’s security posture on the Security page of the Terra website, and we will continue to document recommendations for this specific issue in the corresponding “Known Issues” article as the situation evolves.
If you have any concerns or questions about any of this, please reach out to the Terra support team for help.