Integrating Inspector with GitLab CI/CD

How to integrate Inspector with GitLab, my experience with it, and why you might want to.

At re:Invent 2023, AWS announced a few updates for Inspector. One of them was the ability to scan container images for vulnerabilities from within a CI/CD pipeline using the Inspector scanner.

Previously, if an organization used Inspector for its vulnerability scanning, there was a drift between what developers observed in CI/CD and what surfaced after they deployed code or a container to AWS.

Even using Clair for container scanning in CI/CD, which is what ECR uses for basic scanning, resulted in differences that were challenging to reconcile.

What Does “Integrating Inspector” Mean Exactly?

With the new capability, you can run the same scan in your CI/CD pipeline that would otherwise run after the container image is pushed to ECR. This means that the vulnerability feedback cycle is shortened significantly. Consider what this looked like before the capability was announced:

The Pain of How Container Scanning Used to Work

Now engineers can receive fast feedback on vulnerabilities in container images and fix them before Inspector fires off an alert to the security team. Overall, this is a better experience for everyone involved: the developers don’t need to wait for someone to tell them to roll back vulnerable code or immediately fix a critical vulnerability; the security team has less noise to deal with; and the application has a smaller security gap.

How Does It Work?

AWS provides official Inspector plugins for Jenkins and TeamCity, but there are also instructions for creating a custom integration. The instructions are straightforward:

Create an IAM role or user that can perform the inspector-scan:ScanSbom action.
Download the SBOM Generator.
Generate an SBOM based on the container image.
Call the inspector-scan API while referencing the SBOM.

I managed to install the SBOM Generator and run the whole thing in an Alpine Linux container in a GitLab CI/CD job, and it wasn’t challenging at all.
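For the first step, a policy along these lines should be enough. This is a minimal sketch rather than a vetted policy: the Sid is a label I made up, and the helper name `scan_sbom_policy` exists only for this example. The script prints a JSON document you could paste into the IAM console or feed to your infrastructure tooling.

```ruby
require 'json'

# Minimal identity-based policy that allows only the ScanSbom action.
# The Sid is an arbitrary label chosen for this example.
def scan_sbom_policy
  {
    'Version' => '2012-10-17',
    'Statement' => [
      {
        'Sid' => 'AllowInspectorSbomScan',
        'Effect' => 'Allow',
        'Action' => 'inspector-scan:ScanSbom',
        'Resource' => '*'
      }
    ]
  }
end

puts JSON.pretty_generate(scan_sbom_policy)
```

Attach the resulting policy to whatever role or user the CI/CD job authenticates as, and nothing broader.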

An Example GitLab CI/CD Job

The CI/CD job can be distilled into just a handful of commands. The snippet below isn’t a full example, but it contains the core components needed to get started.

build_image:
  stage: build
  script:
    - docker build --file Dockerfile --tag $CI_REGISTRY_IMAGE/$NAME:$CI_COMMIT_SHORT_SHA .
    - docker push $CI_REGISTRY_IMAGE/$NAME:$CI_COMMIT_SHORT_SHA

inspector_container_scanning:
  stage: test
  script:
    - echo "Installing Inspector SBOM"
    - apk upgrade --no-cache && apk add python3 py3-pip tar unzip
    - wget https://amazon-inspector-sbomgen.s3.amazonaws.com/latest/linux/amd64/inspector-sbomgen.zip
      -O inspector-sbomgen.zip
    - unzip inspector-sbomgen.zip
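    # The directory name inside the zip includes the sbomgen version; adjust it if the release changes.
    - mv inspector-sbomgen-1.0.0/linux/amd64/inspector-sbomgen ./inspector-sbomgen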
    - chmod +x ./inspector-sbomgen
    - "./inspector-sbomgen --version"
    - "./inspector-sbomgen container --image $CI_REGISTRY_IMAGE/$NAME:$CI_COMMIT_SHORT_SHA -o sbom.json"
    - pip3 install awscli
    - aws inspector-scan scan-sbom --sbom file://sbom.json --output-format "CYCLONE_DX_1_5" --endpoint "https://inspector-scan.us-west-2.amazonaws.com" --region us-west-2 > output.json
  artifacts:
    paths:
      - output.json

parse_results:
  stage: report
  image: ruby:3.2
  script:
    - ruby scripts/parse.rb

$CI_REGISTRY_IMAGE and $CI_COMMIT_SHORT_SHA are both predefined GitLab CI/CD variables. $NAME would be the name of your choosing.

The inspector_container_scanning job handles most of the work: getting the inspector-sbomgen binary, ensuring it’s executable, generating the SBOM, and calling the inspector-scan API.

scripts/parse.rb is a simple script that iterates over output.json to count the vulnerabilities at each severity:

#!/usr/bin/env ruby
# frozen_string_literal: true

# json ships with Ruby's standard library, so no inline Gemfile is needed.
require 'json'

file = File.read('output.json')
data = JSON.parse(file)

vulnerabilities = data['sbom']['metadata']['properties']
exit_code = 0

summary = vulnerabilities.each_with_object(Hash.new(0)) do |vuln, counts|
  severity = vuln['name'].split(':').last.split('_').first
  count = vuln['value'].to_i
  counts[severity] += count
end

puts 'Vulnerability Summary:'
puts '-' * 40
summary.each do |severity, count|
  puts "#{severity.capitalize} Vulnerabilities: #{count}"
  # Fail only when a critical or high severity actually has findings.
  exit_code = 1 if %w[critical high].include?(severity) && count.positive?
end

puts ''
puts '-' * 40
puts ''

if exit_code == 1
  puts 'Critical or High Vulns Found'
  exit 1
end
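To make the parsing logic easier to follow, here is a small sketch of the data shape the script relies on. The sample document is hypothetical: the counts are made up, and the property names are my assumption of what the scanner emits, inferred from how the script splits each name on ':' and '_'. The `severity_counts` helper is just the same loop wrapped in a method.

```ruby
require 'json'

# Hypothetical, trimmed-down response; names and counts are for illustration only.
SAMPLE_OUTPUT = <<~SAMPLE
  {
    "sbom": {
      "metadata": {
        "properties": [
          { "name": "amazon:inspector:sbom_scanner:critical_vulnerabilities", "value": "1" },
          { "name": "amazon:inspector:sbom_scanner:high_vulnerabilities", "value": "0" },
          { "name": "amazon:inspector:sbom_scanner:medium_vulnerabilities", "value": "4" }
        ]
      }
    }
  }
SAMPLE

# Map each property to its severity keyword and sum the counts per severity.
def severity_counts(data)
  properties = data.dig('sbom', 'metadata', 'properties') || []
  properties.each_with_object(Hash.new(0)) do |prop, counts|
    severity = prop['name'].split(':').last.split('_').first
    counts[severity] += prop['value'].to_i
  end
end

counts = severity_counts(JSON.parse(SAMPLE_OUTPUT))

# Only severities that are both blocking and non-zero should fail the job.
blocking = counts.select { |severity, count| %w[critical high].include?(severity) && count.positive? }

puts "Counts: #{counts}"
puts "Blocking severities: #{blocking.keys.join(', ')}"
```

Note that a severity can be present with a count of zero, which is why the failure check needs to look at the count and not just the key.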

What Are The Results?

Overall, it’s a serviceable solution. It can certainly use some refinement. For example, the above doesn’t post the vulnerability data, or even the counts, back to the MR. It just outputs the counts and fails the job if any HIGH or CRITICAL vulns are found.
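Posting the counts back to the MR could look roughly like the sketch below, which uses GitLab’s merge request notes endpoint. Several things here are assumptions on my part: `GITLAB_API_TOKEN` is a hypothetical CI/CD variable holding a token with API access, the helper names are invented for the example, and the hard-coded counts stand in for the summary produced by scripts/parse.rb. CI_API_V4_URL, CI_PROJECT_ID, and CI_MERGE_REQUEST_IID are predefined GitLab variables, the last one only in merge request pipelines.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Render severity counts as a small Markdown table for the MR comment.
def summary_markdown(counts)
  rows = counts.map { |severity, count| "| #{severity.capitalize} | #{count} |" }
  (['| Severity | Count |', '| --- | --- |'] + rows).join("\n")
end

# Build (but do not send) the request that creates a note on a merge request.
def build_note_request(api_url:, project_id:, mr_iid:, token:, body:)
  uri = URI("#{api_url}/projects/#{project_id}/merge_requests/#{mr_iid}/notes")
  request = Net::HTTP::Post.new(uri)
  request['PRIVATE-TOKEN'] = token
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate({ 'body' => body })
  [uri, request]
end

# Only post when running in a merge request pipeline with a token available.
if ENV['CI_MERGE_REQUEST_IID'] && ENV['GITLAB_API_TOKEN']
  counts = { 'critical' => 1, 'high' => 0 } # placeholder; use the real summary here
  uri, request = build_note_request(
    api_url: ENV.fetch('CI_API_V4_URL'),
    project_id: ENV.fetch('CI_PROJECT_ID'),
    mr_iid: ENV.fetch('CI_MERGE_REQUEST_IID'),
    token: ENV.fetch('GITLAB_API_TOKEN'),
    body: summary_markdown(counts)
  )
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
  puts "GitLab responded with HTTP #{response.code}"
end
```

Keeping the request construction separate from the sending makes it easy to check the URL and payload locally before pointing it at a real project.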

It also doesn’t plug directly into the native GitLab Vulnerability Report. I spent a bit of time trying to do some conversion, but the schema for GitLab’s vulnerability report, while public, is customized to some degree. I didn’t feel it was worth the effort to integrate that deeply when Inspector’s interface is quite fine for real vulnerabilities.

Theoretically this should be possible, since you can output the data in CycloneDX format (albeit a different version of the schema).
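A first step toward any such conversion would be pulling the individual findings out of the CycloneDX document. The sketch below flattens each entry into an id, its severities, and the component refs it affects. The `id`, `ratings`, and `affects` fields come from the CycloneDX vulnerability schema; that the list sits under `sbom.vulnerabilities` in the scan response is my assumption, and the sample values are placeholders, not real findings.

```ruby
# Flatten CycloneDX vulnerability entries into simple rows.
def findings(data)
  vulnerabilities = data.dig('sbom', 'vulnerabilities') || []
  vulnerabilities.map do |vuln|
    {
      id: vuln['id'],
      severities: (vuln['ratings'] || []).map { |rating| rating['severity'] }.compact.uniq,
      affects: (vuln['affects'] || []).map { |affected| affected['ref'] }
    }
  end
end

# Hypothetical sample; the id and ref values are placeholders.
sample = {
  'sbom' => {
    'vulnerabilities' => [
      {
        'id' => 'CVE-0000-00000',
        'ratings' => [{ 'severity' => 'high' }, { 'severity' => 'high' }],
        'affects' => [{ 'ref' => 'comp-1' }]
      }
    ]
  }
}

findings(sample).each do |finding|
  puts "#{finding[:id]} (#{finding[:severities].join('/')}) affects #{finding[:affects].join(', ')}"
end
```

In a real job you would pass in the parsed output.json instead of the sample, then map each row onto whatever report format you are targeting.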

Why Do This?

Perhaps, at this point, it’s worth asking “Why?”

Well, I like GitLab, but in an AWS-focused organization you are likely to use ECR and Inspector since they are FedRAMP approved, easy to integrate, and easy for the team to consume. I wish the vulnerability data from the inspector-scan API call was a little easier to parse through, but with time I think a solution would be easy enough to create.

At the same time, the GitLab solution works, but there are quirks with it. I’ve found that having a mono-repo style of container images means you need to build all the images in the default branch, or the vulnerabilities overwrite each other. The interface for working with the vulnerabilities is clunky and still in flux too. For example, I can leave comments on a particular vulnerability and open an issue for it, but my comments on the vulnerability are not synced to the issue (and vice versa).

Perhaps my opinion will change by version 17 or 18 of GitLab, but currently I don’t feel like it can compare to the ECR and Inspector combination.

I don’t claim this is a perfect solution, but my goal with it was to bring the vulnerability data closer to the engineers working on the code without requiring them to navigate to AWS every time they want to see the scan results of a container that was deployed, or worse, be tagged by the Security team. I think the Inspector CI/CD integration accomplishes this.