SSDF PO.3.2 Documentation on maintenance and security of toolchains #2553
There are two aspects to our toolchains: the server systems that we use to run and maintain the processes and automation, and the tools used during the individual build/test processes to produce the output on those server systems. In this comment I will start with the former.

## Systems we use

### GitHub

We use the repositories under https://github.com/adoptium to host our source code.

### Jenkins

We are currently running a server hosted at Hetzner running the latest LTS version of Jenkins. This is checked (at a minimum) on a weekly basis to ensure there are no security issues being flagged in the Jenkins UI, or in Jenkins itself. At present the access control for Jenkins is performed using various groups in the AdoptOpenJDK GitHub organisation; some of this was revamped as part of #1084. Administrative access to the Jenkins server is available to a subset of the Adoptium PMC and a small number of other users.

### Homebrew/DockerHub/JFrog

I'm putting these all together for now, but they can be split out later. In addition to publishing tar/zip files in GitHub, we also manage pushing the binaries to Homebrew (for macOS) and to DockerHub (Ubuntu, CentOS7, Alpine and some Windows distributions) as "official" images, and we publish RPM and DEB installers for Linux to an Artifactory instance hosted by JFrog. In all cases we do not have control over the servers, but we can control the keys used to publish to Homebrew and JFrog. At present we also still ship to the old DockerHub repository for AdoptOpenJDK, which has automation keys since we push directly, but the newer Adoptium repository updates are done via pull requests to the official repositories, so no automation keys are used (see also the docker build images later in this doc).

### Ansible/AWX

Most of our build and test machines are set up using the ansible playbooks from the infrastructure repository.
We also have an AWX server that is access-controlled by members of the infrastructure team from the AdoptOpenJDK repository (the same ACL used for jenkins job access). In general, if a user has been allowed administrative access to our build machines they will be granted access to AWX, since that is simply an easier way to deploy playbook changes. The intention is to run the playbooks on a regular basis through AWX in order to ensure the machines are in sync and up to date. Playbook changes are typically tested using the VPC and QPC jobs in jenkins to ensure they do not cause any problems when run from scratch on a 'clean' OS install; this runs via jenkins on some machines specifically set up for this purpose. While the AWX server itself does not have an adoptium-specific playbook to set it up, there is a guide at https://github.com/adoptium/infrastructure/wiki/Ansible-AWX and the process does make use of the ansible playbooks supplied with AWX.

### Static Docker containers

In addition to the machines that are configured using the ansible playbooks, on some of our larger machines we run multiple container images to better utilise the capacity and provide isolation when running multiple tests. These are set up using the DockerStatic playbook role and are created from dockerfiles with the minimum requirements for running tests. This also lets us test on a wider variety of Linux distributions than we would otherwise be able to with 'real' VMs. Currently this capability is limited to x64 and aarch64, but there is no reason other than capacity why it could not be rolled out more widely. The patching strategy for these is as implemented in #2070.

### DockerHub (for build images)

For some platforms (Alpine, Linux/x64, Linux/aarch64) we use docker images created from the dockerfiles in https://github.com/adoptium/infrastructure/tree/master/ansible/docker, which are built and pushed up to DockerHub under the adoptopenjdk (NOTE: not currently adoptium!) project.
These are used for building on those platforms and are created using the ansible playbooks. Other platforms use statically created machines. Particularly for Linux/x64 this provides us with additional security, since the OS we build on is currently CentOS6, which is out of formal support. Automation keys are used to push these images automatically when changes are made to the playbooks, using the processes linked to in the FAQ.

### Bastillion

We use a Bastillion server for distributing ssh keys to our build and test systems. This server is generally not logged into by members of the infrastructure team (admin access to change the machine details is much more restricted) and contains each user's public keys and the appropriate groups that people are in to give them login access. The setup of the Bastillion server is described at https://github.com/adoptium/infrastructure/wiki/Bastillion

### TRSS

The Test Results Summary Service is a database and web front-end maintained for the purposes of archiving historic test results. This service currently runs as root on the machine, but there is a plan to change that. The server is set up using the playbook at https://github.com/adoptium/infrastructure/blob/master/ansible/playbooks/AdoptOpenJDK_Unix_Playbook/trss.yml - for most purposes no access control is required to view the data on there, and the TRSS server gets its data from the jenkins jobs, which are not retained beyond a few days. Root access to this server is controlled by a custom

### Nagios

We have a Nagios server that is configured to monitor all of our build/test machines and also publishes alerts into the #infrastructure-bot channel on slack. At the moment this server is not well maintained; the intention is to migrate it to a newer one, set it up again with a useful set of rules, and make the #infrastructure-bot channel useful for 'real' alerts that need to be dealt with.
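The playbook-driven machine setup described above can be sketched as below. This is a hedged illustration only: the playbook path matches the layout of the infrastructure repository, but the inventory file name and target host are invented, and the command is printed rather than executed so the sketch runs anywhere without ansible installed.

```shell
#!/bin/sh
# Hypothetical sketch of (re)configuring a build machine from the playbooks.
# The inventory file and host name are invented for illustration.
PLAYBOOK="ansible/playbooks/AdoptOpenJDK_Unix_Playbook/main.yml"
TARGET="${1:-build-host-example}"
# Printed, not executed, so this does not require ansible to be present.
echo "ansible-playbook -i inventory.yml --limit $TARGET $PLAYBOOK"
```

In practice this kind of run is what AWX automates: the same playbooks, applied on a schedule, with access gated by the infrastructure team ACL rather than direct shell access.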
The overview of Nagios can be found at https://github.com/adoptium/infrastructure/wiki#nagios-monitoring

### Summary

The above lists the servers and external services that are used for the purposes of maintaining the systems and pipelines used to produce the Eclipse Temurin binaries. As can be seen, some are fully under our control and some are not. We have backups of most of these, which are stored on one of our servers on a regular basis, with older backups being kept on a less frequent basis. This is done for Bastillion, Jenkins, Nagios and TRSS. Jenkins thin backups are also maintained on a remote mounted drive on the jenkins server itself.

```mermaid
flowchart
  A[Playbooks] --> B[Ansible/AWX]
  B -->|playbook deploy| E
  C[Bastillion] -->|ssh keys| E
  D[Nagios] -->|Monitoring| E
  Z[GitHub source repos jdkXXu] -->|Source| E
  Y[DockerHub] -->|Build Containers| E
  E[Build and test systems] --> F[JENKINS]
  F --> G[TRSS]
  F --> Y
  F -->|Publish| H[GitHub temurinXX-binaries]
  F -->|Publish| I[JFrog apt/yum repos]
  F -->|Publish| J[Homebrew]
  F --> K[DockerHub]
```
The toolchains used for the build and test process are mostly defined by those upstream tools, so are subject to change.

### OpenJDK Builds

While we have support for building Temurin, OpenJ9, Bisheng, Dragonwell and Corretto in our build processes, we will focus on Temurin for the purposes of this document. OpenJ9 has a number of extra requirements which are captured in the ansible playbooks, but typically these are additions to those required for Temurin and do not override them. This also does not go into details of the cross-compilation case which we use for some of the RISC-V builds, which we do not currently release.

### Machine setup (ansible)

### Jenkins pipelines

The jenkins pipelines in the ci-jenkins-pipelines repository are used to run the build and test processes. These select a machine to use and run the processes in the temurin-build and aqa-tests repositories in order to build and then test the product. For the current purposes of this document we will focus on the build side, as that is the most relevant part from the perspective of securing the supply chain.

### Build process

The build process is started from the make-adopt-build-farm.sh script and uses other scripts in the temurin-build repository. It clones the source code onto the machine and then builds it using the environment that exists on the build machine, plus whatever settings are defined by the pipeline configurations - which is generally set up using ansible (or by running a docker container containing the build image, which is pulled from DockerHub). The platform-specific-configurations scripts define, for each platform, some of the tool locations on our machines, such as the compiler versions (as installed by our ansible scripts) and some version- or variant-specific options. The process, broadly speaking, is as follows:
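As a rough illustration of the hand-off described above, the pipeline exports its configuration into the environment before invoking the top-level build script from temurin-build. The variable names below follow the general pattern of the build-farm scripts but should be treated as assumptions rather than the exact interface, and the final command is echoed rather than run so the sketch needs no repository checkout.

```shell
#!/bin/sh
# Illustrative only: variable names approximate those consumed by the
# temurin-build build-farm scripts; treat them as assumptions, not API.
export JAVA_TO_BUILD=jdk17u   # which OpenJDK source stream to build
export VARIANT=temurin        # build variant (temurin, openj9, ...)
export TARGET_OS=linux
export ARCHITECTURE=x64
# A real pipeline would now run build-farm/make-adopt-build-farm.sh from a
# clone of temurin-build; echoed here so the sketch is self-contained.
echo "build $VARIANT $JAVA_TO_BUILD for $TARGET_OS/$ARCHITECTURE"
```

The key supply-chain point is that everything the build consumes beyond the cloned source - compilers, tool paths, version flags - arrives via this environment, which is itself produced by the ansible-managed machine setup or the docker build image.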
### Tools used
A more comprehensive list of tools which are installed from repositories can be seen in the Common playbook role.
Of the list at the end of the previous comment, these are the ones that can be obtained from other locations (if not pulled from the OS repos). With some of these tools there is, of course, a risk that if they are put at the start of the PATH - as they are on AIX and maybe others - anything installed by them could override system-provided tools of the same name. (This is deliberate on AIX, to pick up make and others, but it could - and likely should - be worked around.)
[1] - Signed and being verified

Other tools like Mercurial, OpenSSL, nasm, cmake, freemarker and the NVidia CUDA toolkit (links here are to the corresponding UNIX playbook roles) are installed on the machines used for things other than Temurin builds.
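The PATH-ordering risk mentioned above can be demonstrated with a small, self-contained experiment (the directory and stub tool here are invented for the demo): a directory placed at the front of PATH shadows the system tool of the same name.

```shell
#!/bin/sh
# Demonstration of PATH-order shadowing: a stub 'make' placed at the front
# of PATH is resolved in preference to the system one.
TOOLDIR="$(mktemp -d)"
printf '#!/bin/sh\necho stub-make\n' > "$TOOLDIR/make"
chmod +x "$TOOLDIR/make"
# 'command -v' reports the first match found when walking PATH in order.
RESOLVED="$(PATH="$TOOLDIR:$PATH" command -v make)"
echo "$RESOLVED"
```

This is exactly why tool directories prepended to PATH (as on AIX) deserve scrutiny: anything an installer drops into such a directory silently wins over the system-provided binary.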
Part of adoptium/adoptium#122:
PO.3.2 Follow recommended security practices to deploy, operate, and maintain tools and toolchains