Pinned Actions

While researching GitHub Actions for a talk, I asked myself: "How many repositories use GitHub Actions via pin-by-hash?". As I was unable to find a tool that could answer this question, I decided to build one myself.

The results are published at: http://pin-gh-actions.kammel.dev/

Usage

$ go run . --help
Usage of GH Pinned Actions:
  -download-dir string
        path to folder where repositories will be downloaded (default "/tmp/pinned")
  -max-pages int
        maximum number of pages to download (default 1)
  -per-page int
        number of repositories to download per page (default 100)

Example

To replicate the results for 10,000 repositories (100 pages of 100 repositories each, the default page size), run:

go run . -max-pages 100

Note

The default download directory is /tmp/pinned. You can change it with the -download-dir flag.

Warning

Downloading 10,000 repositories takes a long time (depending on your internet connection) and consumes about 1.5 TB of disk space.

Architecture

Notes about the chosen libraries and APIs.

GitHub Search API

We use the public GitHub repository search API to request the most popular repositories by stars. Although the search API supports pagination, it is limited to 100 results per page and to 1,000 results per search.

To get around this limitation, we modify the search query after each request, and only use the first page returned.
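One way to read "modify the search query after each request" is to use the star count as a cursor: sort by stars, take the first page, then cap the next query at the star count of the last repository returned. The sketch below illustrates that idea only. The stars:<=N qualifier is an assumption about how the query is narrowed, and repo, searchQuery, searchURL, nextBound, and merge are hypothetical names, not taken from this repository.

```go
package main

import (
	"fmt"
	"net/url"
)

// repo is a hypothetical, trimmed-down view of one search result.
type repo struct {
	FullName string
	Stars    int
}

// searchQuery returns the "q" parameter for the next request.
// A negative maxStars means no upper bound has been established yet.
func searchQuery(maxStars int) string {
	if maxStars < 0 {
		return "stars:>0"
	}
	return fmt.Sprintf("stars:<=%d", maxStars)
}

// searchURL builds a request for the first page of a star-sorted search.
func searchURL(maxStars, perPage int) string {
	v := url.Values{}
	v.Set("q", searchQuery(maxStars))
	v.Set("sort", "stars")
	v.Set("order", "desc")
	v.Set("per_page", fmt.Sprint(perPage))
	return "https://api.github.com/search/repositories?" + v.Encode()
}

// nextBound returns the star count of the last repository on a page,
// which becomes the upper bound of the following query.
func nextBound(page []repo) int {
	return page[len(page)-1].Stars
}

// merge appends repositories not seen before. Because the bound is
// inclusive, consecutive pages can overlap, so results are de-duplicated.
func merge(seen map[string]bool, all, page []repo) []repo {
	for _, r := range page {
		if !seen[r.FullName] {
			seen[r.FullName] = true
			all = append(all, r)
		}
	}
	return all
}

func main() {
	// Two simulated pages stand in for real API responses.
	pages := [][]repo{
		{{"a/one", 900}, {"b/two", 500}},
		{{"b/two", 500}, {"c/three", 400}},
	}
	seen := map[string]bool{}
	var all []repo
	bound := -1
	for _, page := range pages {
		fmt.Println("GET", searchURL(bound, 100))
		all = merge(seen, all, page)
		bound = nextBound(page)
	}
	fmt.Println("unique repositories:", len(all))
}
```

A cursor like this stalls if more repositories share one star count than fit on a single page, so a real implementation would need a tie-breaker for that case.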

go-git

Although go-git was the initial choice to clone the repositories, it was later replaced by os/exec and git due to performance limitations of the library. See linux-fetcher.

Parsing Actions

stacklok/frizbee already provides all the necessary tools to parse GitHub Actions. We use this library to parse the actions from the repositories.
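For readers unfamiliar with the term, "pinned by hash" means that the reference after the @ in a uses: entry is a full commit SHA rather than a tag or branch. The sketch below shows that check with a plain regular expression. It deliberately does not use frizbee, and isPinnedByHash is a hypothetical helper written for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// fullSHA matches a full 40-character hexadecimal commit SHA.
var fullSHA = regexp.MustCompile(`^[0-9a-fA-F]{40}$`)

// isPinnedByHash reports whether a workflow "uses:" value references
// an action by full commit SHA rather than by tag or branch.
func isPinnedByHash(uses string) bool {
	i := strings.LastIndex(uses, "@")
	if i < 0 {
		return false // local action ("./path") or no ref at all
	}
	return fullSHA.MatchString(uses[i+1:])
}

func main() {
	examples := []string{
		"actions/checkout@v4",
		"actions/checkout@0123456789abcdef0123456789abcdef01234567", // placeholder SHA
		"./.github/actions/local",
	}
	for _, u := range examples {
		fmt.Printf("%-62s pinned=%v\n", u, isPinnedByHash(u))
	}
}
```

This simplified check treats Docker references (docker://...) as unpinned, since an image digest is not a 40-character commit SHA. A real analysis may want to count those separately.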