Terrier: An Open-Source Tool for Identifying and Analyzing Container and Image Components

As part of our Blackhat Europe talk “Reverse Engineering and Exploiting Builds in the Cloud” we publicly released a new tool called Terrier.

Announcing Terrier: An open-source tool for identifying and analysing container and image components.

In this blog post, I am going to show you how Terrier can help you identify and verify container and image components for a wide variety of use-cases, be it from a supply-chain perspective or forensics perspective. Terrier can be found on Github.

Containers and images

In this blog post, I am not going to go into too much detail about containers and images (you can learn more here) however it is important to highlight a few characteristics of containers and images that make them interesting in terms of Terrier. Containers are run from images and currently the Open Containers Initiative (OCI) is the most popular format for images. The remainder of this blog post refers to OCI images as images.

Essentially images are tar archives that container multiple tar archives and meta-information that represent the “layers” of an image. The OCI format of images makes images relatively simple to work with which makes analysis relatively simple. If you only had access to a terminal and the tar command, you could pretty much get what you need from the image’s tar archive.

When images are utilised at runtime for a container, their contents become the contents of the running container and the layers are essentially extracted to a location on the container’s runtime host. The container runtime host is the host that is running and maintaining the containers. This location is typically /var/lib/docker/overlay2/<containerID>/. This location contains a few folders of interest, particularly the "merged" folder. The "merged" folder contains the contents of the image and any changes that have occurred in the container since its creation. For example, if the image contained a location such as /usr/chris/stuff and after creating a container from this image I created a file called helloworld.txt at the location /usr/chris/stuff. This would result in the following valid path on the container runtime host /var/lib/docker/overlay2/<containerID>/merged/usr/chris/stuff/helloworld.txt.

What does Terrier do?

Now that we have a brief understanding of images and containers, we can look at what Terrier does. Often it is the case that you would like to determine if an image or container contains a specific file. This requirement may be due to a forensic analysis need or to identify and prevent a certain supply-chain attack vector. Regardless of the requirement, having the ability to determine the presence of a specific file in an image or container is useful.

Identifying files in OCI images

Terrier can be used to determine if a specific image contains a specific file. In order to do this, you need the following:

An OCI Image i.e TAR archive
A SHA256 hash of a specific file/s

The first point can be easily achieved with Docker by using the following command:

$ docker save imageid -o myImage.tar

The command above uses a Docker image ID which can be obtained using the following command:

$ docker images

Once you have your image exported as a tar archive, you will then need to establish the SHA256 hash of the particular file you would like to identify in the image. There are multiple ways to achieve this but in this example, we are going to use the hash of the Golang binary go1.13.4 linux/amd64 which can be achieved with following command on a Linux host:

$ cat /usr/local/go/bin/go | sha256sum

The command above should result in the following SHA256 hash: 82bce4b98d7aaeb4f841a36f7141d540bb049f89219f9e377245a91dd3ff92dd

Now that we have a hash, we can use this hash to determine if the Golang binary is in the image myImage.tar. To achieve this, we need to populate a configuration file for Terrier. Terrier makes use of YAML configuration files and below is our config file that we save as cfg.yml:

mode: image
image: myImage.tar

hashes:
    - hash: '82bce4b98d7aaeb4f841a36f7141d540bb049f89219f9e377245a91dd3ff92dd'

The config file above has multiple entries which allow us to specify the mode that Terrier will operate in and in this case, we are working with an image file (tar archive) so the mode is image. The image file we are working with is myImage.tar and the hash we are looking to identify is in the hashes list.

We are now ready to run Terrier and this can be done with the following command:

$ ./terrier

The command above should result in output similar to the following:

$ ./terrier 
[+] Loading config:  cfg.yml
[+] Analysing Image
[+] Docker Image Source:  myImage.tar
[*] Inspecting Layer:  34a9e0f17132202a82565578a3c2dae1486bb198cde76928c8c2c5c461e11ccf
[*] Inspecting Layer:  6539a80dd09da08132a525494ff97e92f4148d413e7c48b3583883fda8a40560
[*] Inspecting Layer:  6d2d61c78a65b6e6c82b751a38727da355d59194167b28b3f8def198cd116759
[!] Found file '6d2d61c78a65b6e6c82b751a38727da355d59194167b28b3f8def198cd116759/usr/local/go/bin/go' with hash: 82bce4b98d7aaeb4f841a36f7141d540bb049f89219f9e377245a91dd3ff92dd
[*] Inspecting Layer:  a6e646c34d2d2c2f4ab7db95e4c9f128721f63c905f107887839d3256f1288e1
[*] Inspecting Layer:  aefc8f0c87a14230e30e510915cbbe13ebcabd611e68db02b050b6ceccf9c545
[*] Inspecting Layer:  d4468fff8d0f28d87d48f51fc0a6afd4b38946bbbe91480919ebfdd55e43ce8c
[*] Inspecting Layer:  dbf9da5e4e5e1ecf9c71452f6b67b2b0225cec310a20891cc5dedbfd4ead667c

We have identified a file /usr/local/go/bin/go located at layer 6d2d61c78a65b6e6c82b751a38727da355d59194167b28b3f8def198cd116759 that has the same SHA256 hash as the one we provided. We now have verification that the image “myImage.tar” contains a file with the SHA256 hash we provided.

This example can be extended upon and you can instruct Terrier to search for multiple hashes. In this case, we are going to search for a malicious file. Recently a malicious Python library was identified in the wild and went by the name “Jeilyfish”. Terrier could be used to check if a Docker image of yours contained this malicious package. To do this, we can determine the SHA256 of one of the malicious Python files that contains the backdoor:

$ cat jeIlyfish-0.7.1/jeIlyfish/_jellyfish.py | sha256sum
cf734865dd344cd9b0b349cdcecd83f79a751150b5fd4926f976adddb93d902c

We then update our Terrier config to include the hash calculated above.

mode: image
image: myImage.tar

hashes:
    - hash: '82bce4b98d7aaeb4f841a36f7141d540bb049f89219f9e377245a91dd3ff92dd'
    - hash: 'cf734865dd344cd9b0b349cdcecd83f79a751150b5fd4926f976adddb93d902c'

We then run Terrier against and analyse the results:

$ ./terrier 
[+] Loading config:  cfg.yml
[+] Analysing Image
[+] Docker Image Source:  myImage.tar
[*] Inspecting Layer:  34a9e0f17132202a82565578a3c2dae1486bb198cde76928c8c2c5c461e11ccf
[*] Inspecting Layer:  6539a80dd09da08132a525494ff97e92f4148d413e7c48b3583883fda8a40560
[*] Inspecting Layer:  6d2d61c78a65b6e6c82b751a38727da355d59194167b28b3f8def198cd116759
[!] Found file '6d2d61c78a65b6e6c82b751a38727da355d59194167b28b3f8def198cd116759/usr/local/go/bin/go' with hash: 82bce4b98d7aaeb4f841a36f7141d540bb049f89219f9e377245a91dd3ff92dd
[*] Inspecting Layer:  a6e646c34d2d2c2f4ab7db95e4c9f128721f63c905f107887839d3256f1288e1
[*] Inspecting Layer:  aefc8f0c87a14230e30e510915cbbe13ebcabd611e68db02b050b6ceccf9c545
[*] Inspecting Layer:  d4468fff8d0f28d87d48f51fc0a6afd4b38946bbbe91480919ebfdd55e43ce8c
[*] Inspecting Layer:  dbf9da5e4e5e1ecf9c71452f6b67b2b0225cec310a20891cc5dedbfd4ead667c

The results above indicate that our image did not contain the malicious Python package.

There is no limit as to how many hashes you can search for however it should be noted that Terrier performs all its actions in-memory for performance reasons so you might hit certain limits if you do not have enough accessible memory.

Identifying and verifying specific files in OCI images

Terrier can also be used to determine if a specific image contains a specific file at a specific location. This can be useful to ensure that an image is using a specific component i.e binary, shared object or dependency. This can also be seen as “pinning” components by ensuring that you are images are using specific components i.e a specific version of cURL.

In order to do this, you need the following:

An OCI Image i.e TAR archive
A SHA256 hash of a specific file/s
The path and name of the specific file/s

The first point can be easily achieved with Docker by using the following command:

$ docker save imageid -o myImage.tar

The command above utilises a Docker image id which can be obtained using the following command:

$ docker images

Once you have your image exported as a tar archive, you will need to determine the path of the file you would like to identify and verify in the image. For example, if we would like to ensure that our images are making use of a specific version of cURL, we can run the following commands in a container or some other environment that resembles the image.

$ which curl
/usr/bin/curl

We now have the path to cURL and can now generate the SHA256 of this instance of cURL because in this case, we trust this instance of cURL. We could determine the hash by other means for example many binaries are released with a corresponding hash from the developer which can be acquired from the developer’s website.

$ cat /usr/bin/curl | sha256sum 
9a43cb726fef31f272333b236ff1fde4beab363af54d0bc99c304450065d9c96

With this information, we can now populate our config file for Terrier:

mode: image
image: myImage.tar
files:
  - name: '/usr/bin/curl'
    hashes:
      - hash: '9a43cb726fef31f272333b236ff1fde4beab363af54d0bc99c304450065d9c96'

We’ve saved the above config as cfg.yml and when we run Terrier with this config, we get the following output:

$ ./terrier
[+] Loading config:  cfg.yml
[+] Analysing Image
[+] Docker Image Source:  myImage.tar
[*] Inspecting Layer:  34a9e0f17132202a82565578a3c2dae1486bb198cde76928c8c2c5c461e11ccf
[*] Inspecting Layer:  34a9e0f17132202a82565578a3c2dae1486bb198cde76928c8c2c5c461e11ccf
[*] Inspecting Layer:  6539a80dd09da08132a525494ff97e92f4148d413e7c48b3583883fda8a40560
[*] Inspecting Layer:  6539a80dd09da08132a525494ff97e92f4148d413e7c48b3583883fda8a40560
[*] Inspecting Layer:  6d2d61c78a65b6e6c82b751a38727da355d59194167b28b3f8def198cd116759
[*] Inspecting Layer:  6d2d61c78a65b6e6c82b751a38727da355d59194167b28b3f8def198cd116759
[*] Inspecting Layer:  a6e646c34d2d2c2f4ab7db95e4c9f128721f63c905f107887839d3256f1288e1
[*] Inspecting Layer:  a6e646c34d2d2c2f4ab7db95e4c9f128721f63c905f107887839d3256f1288e1
[*] Inspecting Layer:  aefc8f0c87a14230e30e510915cbbe13ebcabd611e68db02b050b6ceccf9c545
[*] Inspecting Layer:  aefc8f0c87a14230e30e510915cbbe13ebcabd611e68db02b050b6ceccf9c545
[*] Inspecting Layer:  d4468fff8d0f28d87d48f51fc0a6afd4b38946bbbe91480919ebfdd55e43ce8c
[*] Inspecting Layer:  d4468fff8d0f28d87d48f51fc0a6afd4b38946bbbe91480919ebfdd55e43ce8c
[*] Inspecting Layer:  dbf9da5e4e5e1ecf9c71452f6b67b2b0225cec310a20891cc5dedbfd4ead667c
[*] Inspecting Layer:  dbf9da5e4e5e1ecf9c71452f6b67b2b0225cec310a20891cc5dedbfd4ead667c
[!] All components were identified: (1/1)
[!] All components were identified and verified: (1/1)
$ echo $?
0

The output above indicates that the file /usr/bin/curl was successfully identified and verified, meaning that the image contained a file at the location /usr/bin/curl and that the SHA256 of that file matched the hash we provided in the config. Terrier also makes use of return codes and if we analyse the return code from the output above, we can see that the value is 0 which indicates a success. If Terrier cannot identify or verify all the provided files, a return code of 1 is returned which indicates a failure. The setting of return codes is particularly useful in testing environments or CI/CD environments.

We can also run Terrier with verbose mode enable to get more information:

$ ./terrier 
[+] Loading config:  cfg.yml
[+] Analysing Image
[+] Docker Image Source:  myImage.tar
[*] Inspecting Layer:  34a9e0f17132202a82565578a3c2dae1486bb198cde76928c8c2c5c461e11ccf
[*] Inspecting Layer:  6539a80dd09da08132a525494ff97e92f4148d413e7c48b3583883fda8a40560
        [!] Identified  instance of '/usr/bin/curl' at: 6539a80dd09da08132a525494ff97e92f4148d413e7c48b3583883fda8a40560/usr/bin/curl 
        [!] Verified matching instance of '/usr/bin/curl' at: 6539a80dd09da08132a525494ff97e92f4148d413e7c48b3583883fda8a40560/usr/bin/curl with hash: 9a43cb726fef31f272333b236ff1fde4beab363af54d0bc99c304450065d9c96
[*] Inspecting Layer:  6d2d61c78a65b6e6c82b751a38727da355d59194167b28b3f8def198cd116759
[*] Inspecting Layer:  a6e646c34d2d2c2f4ab7db95e4c9f128721f63c905f107887839d3256f1288e1
[*] Inspecting Layer:  aefc8f0c87a14230e30e510915cbbe13ebcabd611e68db02b050b6ceccf9c545
[*] Inspecting Layer:  d4468fff8d0f28d87d48f51fc0a6afd4b38946bbbe91480919ebfdd55e43ce8c
[*] Inspecting Layer:  dbf9da5e4e5e1ecf9c71452f6b67b2b0225cec310a20891cc5dedbfd4ead667c
[!] All components were identified: (1/1)
[!] All components were identified and verified: (1/1)

The output above provides some more detailed information such as which layer the cURL files was located at. If you wanted more information, you could enable the veryveryverbose option in the config file but beware, this is a lot of output and grep will be your friend.

There is no limit for how many hashes you can specify for a file. This can be useful for when you want to allow more than one version of a specific file i.e multiple versions of cURL. An example config of multiple hashes for a file might look like:

mode: image
image: myImage.tar
files:
  - name: '/usr/bin/curl'
    hashes:
      - hash: '9a43cb726fef31f272333b236ff1fde4beab363af54d0bc99c304450065d9c96'
      - hash: 'aefc8f0c87a14230e30e510915cbbe13ebcabd611e68db02b050b6ceccf9c545'
      - hash: '6d2d61c78a65b6e6c82b751a38727da355d59194167b28b3f8def198cd116759'
      - hash: 'd4468fff8d0f28d87d48f51fc0a6afd4b38946bbbe91480919ebfdd55e43ce8c'

The config above allows Terrier to verify if the identified cURL instance is one of the provided hashes. There is also no limit for the amount of files Terrier can attempt to identify and verify.

Terrier’s Github repo also contains a useful script called convertSHA.sh which can be used to convert a list of SHA256 hashes and filenames into a Terrier config file. This is useful when converting the output from other tools into a Terrier friendly format. For example, we could have the following contents of a file:

8946690bfe12308e253054ea658b1552c02b67445763439d1165c512c4bc240d ./bin/uname
6de8254cfd49543097ae946c303602ffd5899b2c88ec27cfcd86d786f95a1e92 ./bin/gzexe
74ff9700d623415bc866c013a1d8e898c2096ec4750adcb7cd0c853b4ce11c04 ./bin/wdctl
61c779de6f1b9220cdedd7dfee1fa4fb44a4777fff7bd48d12c21efb87009877 ./bin/dmesg
7bdde142dc5cb004ab82f55adba0c56fc78430a6f6b23afd33be491d4c7c238b ./bin/which
3ed46bd8b4d137cad2830974a78df8d6b1d28de491d7a23d305ad58742a07120 ./bin/mknod
e8ca998df296413624b2bcf92a31ee3b9852f7590f759cc4a8814d3e9046f1eb ./bin/mv
a91d40b349e2bccd3c5fe79664e70649ef0354b9f8bd4658f8c164f194b53d0f ./bin/chown
091abe52520c96a75cf7d4ff38796fc878cd62c3a75a3fd8161aa3df1e26bebd ./bin/uncompress
c5ebd611260a9057144fd1d7de48dbefc14e16240895cb896034ae05a94b5750 ./bin/echo
d4ba9ffb5f396a2584fec1ca878930b677196be21aee16ee6093eb9f0a93bf8f ./bin/df
5fb515ff832650b2a25aeb9c21f881ca2fa486900e736dfa727a5442a6de83e5 ./bin/tar
6936c9aa8e17781410f286bb1cbc35b5548ea4e7604c1379dc8e159d91a0193d ./bin/zforce
8d641329ea7f93b1caf031b70e2a0a3288c49a55c18d8ba86cc534eaa166ec2e ./bin/gzip
0c1a1f53763ab668fb085327cdd298b4a0c1bf2f0b51b912aa7bc15392cd09e7 ./bin/su
20c358f7ee877a3fd2138ecce98fada08354810b3e9a0e849631851f92d09cc4 ./bin/bzexe
01764d96697b060b2a449769073b7cf2df61b5cb604937e39dd7a47017e92ee0 ./bin/znew
0d1a106dc28c3c41b181d3ba2fc52086ede4e706153e22879e60e7663d2f6aad ./bin/login
fb130bda68f6a56e2c2edc3f7d5b805fd9dcfbcc26fb123a693b516a83cfb141 ./bin/dir
0e7ca63849eebc9ea476ea1fefab05e60b0ac8066f73c7d58e8ff607c941f212 ./bin/bzmore
14dc8106ec64c9e2a7c9430e1d0bef170aaad0f5f7f683c1c1810b466cdf5079 ./bin/zless
9cf4cda0f73875032436f7d5c457271f235e59c968c1c101d19fc7bf137e6e37 ./bin/chmod
c5f12f157b605b1141e6f97796732247a26150a0a019328d69095e9760b42e38 ./bin/sleep
b9711301d3ab42575597d8a1c015f49fddba9a7ea9934e11d38b9ff5248503a8 ./bin/zfgrep
0b2840eaf05bb6802400cc5fa793e8c7e58d6198334171c694a67417c687ffc7 ./bin/stty
d9393d0eca1de788628ad0961b74ec7a648709b24423371b208ae525f60bbdad ./bin/bunzip2
d2a56d64199e674454d2132679c0883779d43568cd4c04c14d0ea0e1307334cf ./bin/mkdir
1c48ade64b96409e6773d2c5c771f3b3c5acec65a15980d8dca6b1efd3f95969 ./bin/cat
09198e56abd1037352418279eb51898ab71cc733642b50bcf69d8a723602841e ./bin/true
97f3993ead63a1ce0f6a48cda92d6655ffe210242fe057b8803506b57c99b7bc ./bin/zdiff
0d06f9724af41b13cdacea133530b9129a48450230feef9632d53d5bbb837c8c ./bin/ls
da2da96324108bbe297a75e8ebfcb2400959bffcdaa4c88b797c4d0ce0c94c50 ./bin/zegrep

The file contents above are trusted SHA256 hashes for specific files. If we would like to use this list for ensuring that a particular image is making use of the files listed above, we can do the following:

$ ./convertSHA.sh trustedhashes.txt terrier.yml

The script above takes the input file trustedhashes.txt which contains our trusted hashes listed above and converts them into a Terrier friendly config file called terrier.yml which looks like the following:

mode: image
image: myImage.tar
files:
  - name: '/bin/uname'
    hashes:
       - hash: '8946690bfe12308e253054ea658b1552c02b67445763439d1165c512c4bc240d'
  - name: '/bin/gzexe'
    hashes:
       - hash: '6de8254cfd49543097ae946c303602ffd5899b2c88ec27cfcd86d786f95a1e92'
  - name: '/bin/wdctl'
    hashes:
       - hash: '74ff9700d623415bc866c013a1d8e898c2096ec4750adcb7cd0c853b4ce11c04'
  - name: '/bin/dmesg'
    hashes:
       - hash: '61c779de6f1b9220cdedd7dfee1fa4fb44a4777fff7bd48d12c21efb87009877'
  - name: '/bin/which'
    hashes:
       - hash: '7bdde142dc5cb004ab82f55adba0c56fc78430a6f6b23afd33be491d4c7c238b'
  - name: '/bin/mknod'

The config file terrier.yml is ready to be used:

$ ./terrier -cfg=terrier.yml
[+] Loading config:  terrier.yml
[+] Analysing Image
[+] Docker Image Source:  myImage.tar
[*] Inspecting Layer:  34a9e0f17132202a82565578a3c2dae1486bb198cde76928c8c2c5c461e11ccf
[*] Inspecting Layer:  6539a80dd09da08132a525494ff97e92f4148d413e7c48b3583883fda8a40560
[*] Inspecting Layer:  6d2d61c78a65b6e6c82b751a38727da355d59194167b28b3f8def198cd116759
[*] Inspecting Layer:  a6e646c34d2d2c2f4ab7db95e4c9f128721f63c905f107887839d3256f1288e1
[*] Inspecting Layer:  aefc8f0c87a14230e30e510915cbbe13ebcabd611e68db02b050b6ceccf9c545
[*] Inspecting Layer:  d4468fff8d0f28d87d48f51fc0a6afd4b38946bbbe91480919ebfdd55e43ce8c
[*] Inspecting Layer:  dbf9da5e4e5e1ecf9c71452f6b67b2b0225cec310a20891cc5dedbfd4ead667c
[!] Not all components were identifed: (4/31)
[!] Component not identified:  /bin/uncompress
[!] Component not identified:  /bin/bzexe
[!] Component not identified:  /bin/bzmore
[!] Component not identified:  /bin/bunzip2
$ echo $?
1

As we can see from the output above, Terrier was unable to identify 4/31 of the components provided in the config. The return code is also 1 which indicates a failure. If we were to remove the components that are not in the provided image, the output from the previous command would look like the following:

$ ./terrier -cfg=terrier.yml
[+] Loading config: terrier.yml
[+] Analysing Image
[+] Docker Image Source: myImage.tar
[*] Inspecting Layer: 34a9e0f17132202a82565578a3c2dae1486bb198cde76928c8c2c5c461e11ccf
[*] Inspecting Layer: 6539a80dd09da08132a525494ff97e92f4148d413e7c48b3583883fda8a40560
[*] Inspecting Layer: 6d2d61c78a65b6e6c82b751a38727da355d59194167b28b3f8def198cd116759
[*] Inspecting Layer: a6e646c34d2d2c2f4ab7db95e4c9f128721f63c905f107887839d3256f1288e1
[*] Inspecting Layer: aefc8f0c87a14230e30e510915cbbe13ebcabd611e68db02b050b6ceccf9c545
[*] Inspecting Layer: d4468fff8d0f28d87d48f51fc0a6afd4b38946bbbe91480919ebfdd55e43ce8c
[*] Inspecting Layer: dbf9da5e4e5e1ecf9c71452f6b67b2b0225cec310a20891cc5dedbfd4ead667c
[!] All components were identified: (27/27)
[!] Not all components were verified: (26/27)
[!] Component not verified: /bin/cat
[!] Component not verified: /bin/chmod
[!] Component not verified: /bin/chown
[!] Component not verified: /bin/df
[!] Component not verified: /bin/dir
[!] Component not verified: /bin/dmesg
[!] Component not verified: /bin/echo
[!] Component not verified: /bin/gzexe
[!] Component not verified: /bin/gzip
[!] Component not verified: /bin/login
[!] Component not verified: /bin/ls
[!] Component not verified: /bin/mkdir
[!] Component not verified: /bin/mknod
[!] Component not verified: /bin/mv
[!] Component not verified: /bin/sleep
[!] Component not verified: /bin/stty
[!] Component not verified: /bin/su
[!] Component not verified: /bin/tar
[!] Component not verified: /bin/true
[!] Component not verified: /bin/uname
[!] Component not verified: /bin/wdctl
[!] Component not verified: /bin/zdiff
[!] Component not verified: /bin/zfgrep
[!] Component not verified: /bin/zforce
[!] Component not verified: /bin/zless
[!] Component not verified: /bin/znew
$ echo $?
1

The output above indicates that Terrier was able to identify all the components provided but many were not verifiable, the hashes did not match and once again, the return code is 1 to indicate this failure.

Identifying files in containers

The previous sections focused on identifying files in images, which can be referred to as a form of “static analysis,” however it is also possible to perform this analysis to running containers. In order to do this, you need the following:

Location of the container’s merged folder
A SHA256 hash of a specific file/s

The merged folder is Docker specific, in this case, we are using it because this is where the contents of the Docker container reside, this might be another location if it were LXC.

The location of the container’s merged folder can be determined by running the following commands. First obtain the container’s ID:

$ docker ps
CONTAINER ID        IMAGE                    COMMAND               CREATED             STATUS              PORTS               NAMES
b9e676fd7b09        golang                   "bash"                20 hours ago        Up 20 hours                             cocky_robinson

Once you have the container’s ID, you can run the following command which will help you identify the location of the container’s merged folder on the underlying host.

$ docker exec b9e676fd7b09 mount | grep diff
overlay on / type overlay (rw,relatime,lowerdir=/var/lib/docker/overlay2/l/7ZDEFE6PX4C3I3LGIGGI5MWQD4:
/var/lib/docker/overlay2/l/EZNIFFIXOVO2GIT5PTBI754HC4:/var/lib/docker/overlay2/l/UWKXP76FVZULHGRKZMVYJHY5IK:
/var/lib/docker/overlay2/l/DTQQUTRXU4ZLLQTMACWMJYNRTH:/var/lib/docker/overlay2/l/R6DE2RY63EJABTON6HVSFRFICC:
/var/lib/docker/overlay2/l/U4JNTFLQEKMFHVEQJ5BQDLL7NO:/var/lib/docker/overlay2/l/FEBURQY25XGHJNPSFY5EEPCFKA:
/var/lib/docker/overlay2/l/ICNMAZ44JY5WZQTFMYY4VV6OOZ,
upperdir=/var/lib/docker/overlay2/04f84ddd30a7df7cd3f8b1edeb4fb89d476ed84cf3f76d367e4ebf22cd1978a4/diff,
workdir=/var/lib/docker/overlay2/04f84ddd30a7df7cd3f8b1edeb4fb89d476ed84cf3f76d367e4ebf22cd1978a4/work)

From the results above, we are interested in two entries, upperdir and workdir because these two entries will provide us with the path to the container’s merged folder. From the results above, we can determine that the container’s merged directory is located at /var/lib/docker/overlay2/04f84ddd30a7df7cd3f8b1edeb4fb89d476ed84cf3f76d367e4ebf22cd1978a4/ on the underlying host.

Now that we have the location, we need some files to identify and in this case, we are going to reuse the SHA256 hashes from the previous section. Let’s now go ahead and populate our Terrier configuration with this new information.

mode: container
path: merged
#image: myImage.tar

hashes:
    - hash: '82bce4b98d7aaeb4f841a36f7141d540bb049f89219f9e377245a91dd3ff92dd'
    - hash: 'cf734865dd344cd9b0b349cdcecd83f79a751150b5fd4926f976adddb93d902c'

The configuration above shows that we have changed the mode from image to container and we have added the path to our merged folder. We have kept the two hashes from the previous section.

If we run Terrier with this configuration from the location /var/lib/docker/overlay2/04f84ddd30a7df7cd3f8b1edeb4fb89d476ed84cf3f76d367e4ebf22cd1978a4/, we get the following output:

$ ./terrier
[+] Loading config: cfg.yml
[+] Analysing Container
[!] Found matching instance of '82bce4b98d7aaeb4f841a36f7141d540bb049f89219f9e377245a91dd3ff92dd' at: merged/usr/local/go/bin/go with hash:82bce4b98d7aaeb4f841a36f7141d540bb049f89219f9e377245a91dd3ff92dd

From the output above, we know that the container (b9e676fd7b09) does not contain the malicious Python package but it does contain an instance of the Golang binary which is located at merged/usr/local/go/bin/go.

Identifying and verifying specific files in containers

And as you might have guessed, Terrier can also be used to verify and identify files at specific paths in containers. To do this, we need the following:

Location of the container’s merged folder
A SHA256 hash of a specific file/s
The path and name of the specific file/s

The points above can be determined using the same procedures described in the previous sections. Below is an example Terrier config file that we could use to identify and verify components in a running container:

mode: container
path: merged
verbose: true
files:
  - name: '/usr/bin/curl'
    hashes:
      - hash: '9a43cb726fef31f272333b236ff1fde4beab363af54d0bc99c304450065d9c96'
  - name: '/usr/local/go/bin/go'
    hashes:
      - hash: '82bce4b98d7aaeb4f841a36f7141d540bb049f89219f9e377245a91dd3ff92dd'

If we run Terrier with the above config, we get the following output:

$ ./terrier
[+] Loading config: cfg.yml
[+] Analysing Container
[!] Found matching instance of '/usr/bin/curl' at: merged/usr/bin/curl with hash:9a43cb726fef31f272333b236ff1fde4beab363af54d0bc99c304450065d9c96
[!] Found matching instance of '/usr/local/go/bin/go' at: merged/usr/local/go/bin/go with hash:82bce4b98d7aaeb4f841a36f7141d540bb049f89219f9e377245a91
dd3ff92dd
[!] All components were identified: (2/2)
[!] All components were identified and verified: (2/2)
$ echo $?
0

From the output above, we can see that Terrier was able to successfully identify and verify all the files in the running container. The return code is also 0 which indicates a successful execution of Terrier.

Using Terrier with CI/CD

In addition to Terrier being used as a standalone CLI tool, Terrier can also be integrated easily with existing CI/CD technologies such as GitHub Actions and CircleCI. Below are two example configurations that show how Terrier can be used to identify and verify certain components of Docker files in a pipeline and prevent the pipeline from continuing if all verifications do not pass. This can be seen as an extra mitigation for supply-chain attacks.

Below is a CircleCI example configuration using Terrier to verify the contents of an image.

version: 2
jobs:
build:
  machine: true
  steps:
    - checkout
    - run:
       name: Build Docker Image
       command: |
             docker build -t builditall .
    - run:
       name: Save Docker Image Locally
       command: |
             docker save builditall -o builditall.tar
    - run:
       name: Verify Docker Image Binaries
       command: |
             ./terrier

Below is a Github Actions example configuration using Terrier to verify the contents of an image.

name: Go
on: [push]
jobs:
build:
  name: Build
  runs-on: ubuntu-latest
  steps:

  - name: Get Code
    uses: actions/checkout@master
  - name: Build Docker Image
    run: |
      docker build -t builditall .
  - name: Save Docker Image Locally
    run: |
      docker save builditall -o builditall.tar
  - name: Verify Docker Image Binaries
    run: |
      ./terrier

Conclusion

In this blog post, we have looked at how to perform multiple actions on Docker (and OCI) containers and images via Terrier. The actions performed allowed us to identify specific files according to their hashes in images and containers. The actions performed have also allowed us to identify and verify multiple components in images and containers. These actions performed by Terrier are useful when attempting to prevent certain supply-chain attacks.

We have also seen how Terrier can be used in a DevOps pipeline via GitHub Actions and CircleCI.

Learn more about Terrier on GitHub at https://github.com/heroku/terrier.

Video Transcript