What is stale code? What’s the cost? What can you do about it?
If you aren’t aware, the Instance Metadata Service (IMDS) is a way to get information about your EC2 instance, including temporary credentials for the IAM role attached to the instance.
Now, IMDSv1 isn’t officially deprecated, but there is some nice golden warning text in the EC2 console if you use it, reminding you that AWS “recommends” the use of IMDSv2.
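For readers who haven’t seen the difference between the two versions, here is a minimal sketch of the IMDSv2 flow in Python. IMDSv1 is a single unauthenticated GET, while IMDSv2 first requests a session token with a PUT and then presents that token on every read. The endpoint address and header names are the ones AWS documents; the function names are my own, and fetch_metadata only works when run on an EC2 instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"
TOKEN_TTL_HEADER = "X-aws-ec2-metadata-token-ttl-seconds"
TOKEN_HEADER = "X-aws-ec2-metadata-token"


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """IMDSv2 step 1: a PUT that asks for a short-lived session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={TOKEN_TTL_HEADER: str(ttl_seconds)},
    )


def build_metadata_request(path: str, token: str) -> urllib.request.Request:
    """IMDSv2 step 2: a GET that presents the session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/meta-data/{path.lstrip('/')}",
        headers={TOKEN_HEADER: token},
    )


def fetch_metadata(path: str, timeout: float = 2.0) -> str:
    """Run both steps. Assumes we are on an EC2 instance with IMDS enabled."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    request = build_metadata_request(path, token)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode()
```

On an instance, fetch_metadata("instance-id") would return the instance ID. The extra PUT is the point: a request that can only be coerced into a simple GET never obtains a token, so it gets nothing back from IMDSv2.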
HackerOne follows the principle of least privilege, and the instance role that this vuln was found on is narrowly scoped. However, a malicious actor could still do some damage with these types of credentials. This isn’t unique to HackerOne either. Applications in the cloud end up needing a fair amount of permissions to do what they need to do. Perhaps it’s invoking a Lambda, fetching an object from S3, or sending metrics to CloudWatch.
If IMDSv2 launched in 2019, why is IMDSv1 still being used in 2023? Well, the answer is an excellent example of what I wrote about in my story about bucket encryption and the $36k/month KMS bill: “We live in a cycle of continuous improvement.”
The answer is legacy code, or as I like to call it, stale code.
What Is Legacy and Stale Code?
Legacy code is discussed at length in software engineering communities, but it also exists in infrastructure and cloud engineering.
There’s a good thread on Stack Overflow about What Makes Code Legacy. In my opinion, once code stops being actively developed and maintained, it is legacy.
There is another category in my eyes. It’s not quite legacy code. This other category is still supported, but it just runs. The code requires no maintenance time, and teams often forget about it. It’s what I call stale code.
The use of IMDSv1 here is an example of stale code. HackerOne is a collection of Ruby apps interacting with various AWS services like EC2, S3, RDS, SQS, and more. The libraries used by the app handle much of the authentication behind the scenes. The libraries mostly use the AWS SDK, which utilizes the credential provider chain and includes IMDS as a default fallback. Because everything keeps working, nobody has a reason to look at how those credentials are fetched, which creates a false sense of security about the state of the cloud.
Without notifications or alerts like the ones shown in the EC2 console, it can be challenging to devote time to thinking about a solution that works and requires little or no maintenance.
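If you don’t want to rely on spotting the console warning, one option is a small audit script. The sketch below flags instances whose metadata options still accept IMDSv1. It assumes boto3 is installed and that your credentials allow ec2:DescribeInstances; the function names are hypothetical, and treating a missing MetadataOptions block as “IMDSv1 allowed” is my own conservative assumption.

```python
from typing import Any, Dict, List


def instances_allowing_imdsv1(response: Dict[str, Any]) -> List[str]:
    """Return IDs of instances that still accept IMDSv1 requests.

    `response` is shaped like one page of EC2 DescribeInstances output.
    """
    flagged = []
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            options = instance.get("MetadataOptions", {})
            # Assumption: if a field is missing, treat it as the permissive value.
            endpoint_on = options.get("HttpEndpoint", "enabled") == "enabled"
            tokens_optional = options.get("HttpTokens", "optional") != "required"
            if endpoint_on and tokens_optional:
                flagged.append(instance["InstanceId"])
    return flagged


def audit_region(region_name: str) -> List[str]:
    """Scan every instance in one region. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helper above has no dependencies

    ec2 = boto3.client("ec2", region_name=region_name)
    flagged = []
    for page in ec2.get_paginator("describe_instances").paginate():
        flagged.extend(instances_allowing_imdsv1(page))
    return flagged
```

Running audit_region on a schedule and posting the result to wherever your team already looks is one way to get the nudge the console gives you, without anyone having to open the console.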
Can Infrastructure as Code Really Be Legacy or Stale Code?
I’m sure some may disagree that Ansible, Terraform, and the like can age. But when your infrastructure repo has hundreds of thousands of lines to maintain, it’s inevitable that some code will go longer without being touched.
This is compounded when you consider that most of the time the IaC part of your codebase just works. When was the last time you needed to worry about your scheduled Packer builds not working? Rarely, I bet.
With cloud providers shipping new security improvements, and operating systems and software marching towards end of life, it’s easy to fall behind on the latest secure practices.
Dealing With These Types of Code
Every single cloud or infrastructure team I’ve been a part of or talked to has had enough work to keep them busy for months if not years. Returning and giving the environment a once-over or a review, even annually, can be challenging to justify to management.
The senior or staff engineers of the team need to be the ones who advocate for this work. They’ve been around long enough to convey the risks to the stakeholders (executive leadership and their own team) and to make the case that the work is valuable and required.
Keeping up with trends is essential here: subscribe to CVE feeds and product updates, and follow what experts in the field are doing. Consider checking in and asking yourself or your team whether there is a particular part of the stack or repo that nobody has touched for a while. Odds are, if the entire team can’t remember the last meaningful change in that area, it might be due for a look.
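If memory isn’t reliable enough, git can answer the “when did we last touch this?” question. Here is a minimal sketch, assuming the infrastructure code lives in a git repository; the function names and the one-year threshold are my own choices, not a standard.

```python
import subprocess
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def last_commit_time(repo: str, path: str) -> Optional[datetime]:
    """Ask git for the timestamp of the newest commit touching `path`."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "-1", "--format=%ct", "--", path],
        capture_output=True,
        text=True,
        check=True,
    ).stdout.strip()
    # Empty output means git has no commit for this path.
    return datetime.fromtimestamp(int(out), tz=timezone.utc) if out else None


def stale_paths(
    last_changed: Dict[str, datetime],
    now: datetime,
    max_age_days: int = 365,
) -> List[str]:
    """Return the paths whose last change is older than the cutoff."""
    cutoff = now - timedelta(days=max_age_days)
    return sorted(path for path, changed in last_changed.items() if changed < cutoff)
```

You could feed it a handful of top-level directories (say, hypothetical packer/ and terraform/ folders), collect last_commit_time for each, and bring whatever stale_paths returns to the next planning session as candidates for review.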
Obviously, the payoff for the time spent is there, whether financially, as seen in the case of the KMS/bucket encryption issue I shared earlier, or in the security perimeter of the app. This needs to be part of the story told to have the agreement of the
Ultimately, legacy and stale code become security, financial, or even reliability risks. Devoting time to maintaining the codebase is critical to the success of the team and the organization.