CI Python Framework #420

Open
AnastaZIuk opened this issue Sep 20, 2022 · 1 comment

@AnastaZIuk
Member

AnastaZIuk commented Sep 20, 2022

New CI's job handling diagram

graph TD
    Yes1[Yes]
    No1[No]
    TODO
    A[Devsh Jenkins]
    B[Proxmox OS Host]
    C[Jenkins Agent]
    D[Windows 11 Virtual Machine with PCI passed-through GPU and required inputs]
    E[Temporary directory created by Proxmox OS Host on demand]
    F[Artifactory]
    id1{Is it a job with CI checks for examples?}

    A-- 1. Run job ---C
    B --> C
    C --> B
    B-- "2. Create VM on demand from predefined system image (having everything installed on the system image to handle Nabla)" ---D
    D-- 3. Process a job in Virtual Machine ---id1
    id1 --> Yes1
    id1 --> No1
    No1 --> TODO
    Yes1-- 3a. Clone Nabla<br/>3b. Build Nabla solution<br/>3c. Let Proxmox OS Host validate an example's 'JSON input'<br/>3d. Run an example which takes 'JSON input', parse validated json data that will be used to produce example's results<br/>3e. Let Proxmox OS Host SCP examples' results from the VM to temporary directory created by it ---E
    E-- 4. Let Proxmox OS Host validate copied results from the temporary directory using Python Framework<br/>4a. Produce an HTML file using Python Framework containing output resources from JSON input file<br/>4b. Let Proxmox OS Host SCP the output resources and the HTML file to artifactory ---F
    F-- 5. Bring everything to start point<br/>5a. Let Proxmox OS Host remove its temporary directory<br/>5b. Let Proxmox OS Host remove the entire Virtual Machine image with all data ---B

CI security job procedure description

The CI job handling diagram above shows the new way the CI processes a job. The previous pipeline had many disadvantages and was susceptible to various kinds of attacks, such as arbitrary code execution that could damage our nodes or even company infrastructure. The key is to limit room for maneuver: a job should be executed in an encapsulated environment that cannot communicate with the outside world. To meet these requirements we use the Proxmox virtualization system, which is responsible for creating virtual machines on demand whenever a job is triggered by our Jenkins. The job executes in the VM, and once it completes the VM is removed. Proxmox OS Host also validates some data (such as Input JSON files) that needs to be handled by the job, talks to the VM, scans the job's results, and acts as the gate to the artifactory. Any job handling a Nabla example that provides Input JSON data files is expected to produce artifacts as the result of the executed example, plus an HTML file generated via Python Framework that lists the example's results and any related data.

Detailed steps

A job request is triggered by a GitHub hook, a Pull Request comment, or an external URL with proper credentials. Our Jenkins handles the request and runs the job on the appropriate Jenkins Agent, which is Proxmox OS Host. The Jenkins Agent creates a Virtual Machine for the requested job from a predefined system image: Windows 11 with NVIDIA drivers and the software required to compile Nabla for all available operating systems installed. Proxmox OS Host is set up to handle PCI passthrough, so the created VMs have a GPU and some PCI inputs passed through. Once the VM is created, Proxmox OS Host opens communication with the VM to perform the job's steps. The communication is one-way only: Proxmox OS Host connects to the VM over SSH, and the VM cannot talk to Proxmox OS Host.

Further steps for a job that does anything Nabla related

Connected to the VM by SSH, Proxmox OS Host executes the job's instructions in the following order (note: "it does something", for example "it runs a command", means Proxmox OS Host doing that thing on the VM over SSH):

It shallow-clones Nabla and builds it. If the job needs to handle any examples, it validates their Input JSON files before they are executed.

Once the validation completes (successfully or not), it runs the Python example, which takes the Input JSON as its input. The example uses Python Framework's API to parse the JSON inputs, execute Nabla's example executable (which outputs results according to the Input JSON file), and perform any CI checks on the provided and output result data.

When the Python job completes, it creates a temporary directory with a hashed name on its own operating system and copies the Python example's results from the VM to the newly created directory. Once they are copied, it validates and scans the results to detect potential malware. When that is done, it generates an HTML file with the artifact resources in the temporary directory and begins to validate it.

Once the HTML is validated, it uses SCP to transfer all of the Python example's results and the generated HTML from the temporary directory to the company's artifactory. When all data has been transferred, it deletes the temporary directory and removes the VM.
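The temporary-directory and copy steps could be sketched roughly as follows. This is only an illustration under assumptions: the function names, the salted SHA-256 naming scheme, and the VM user/host/path arguments are hypothetical and not part of the framework. The `scp` command is only built here, not executed.

```python
import hashlib
import secrets
import tempfile
from pathlib import Path
from typing import List, Optional


def make_job_tmpdir(job_id: str, root: Optional[str] = None) -> Path:
    """Create a temporary directory whose name is a hash, not the raw job id."""
    # Assumption: a random salt keeps the name unpredictable for a known job id.
    salt = secrets.token_hex(16)
    name = hashlib.sha256(f"{job_id}:{salt}".encode("utf-8")).hexdigest()
    path = Path(root or tempfile.gettempdir()) / name
    # exist_ok=False makes a name collision an error instead of reusing a directory.
    path.mkdir(mode=0o700, parents=True, exist_ok=False)
    return path


def scp_pull_command(vm_user: str, vm_host: str, remote_dir: str, local_dir: Path) -> List[str]:
    """Build (but do not run) the scp argv that pulls results from the VM."""
    # The host initiates the copy, which matches the one-way communication rule.
    return ["scp", "-r", f"{vm_user}@{vm_host}:{remote_dir}", str(local_dir)]
```

Returning an argv list rather than a shell string means the command can be handed to `subprocess.run` without shell quoting.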

Further steps for a job that is not Nabla related

Procedure files

JSON Input file description

The following is a sample JSON input file for a given example. The layout should be common to all examples.

{
  "enableParallelBuild": true,
  "threadsPerBuildProcess" : 2,
  "isExecuted": false,
  "testScript": "relative_path_to_python_script_for_bulk_testing arg1 arg2 arg3",
  "cmake": {
    "configurations": [ "Release", "Debug", "RelWithDebInfo" ],
    "buildModes": [ 
      "build_option_1",
      "build_option_2"
    ],
    "requiredOptions": [ "NBL_BUILD_OPTION_1", "NBL_BUILD_OPTION_2" ]
  }, 
  "profiles": [
    {
      "backend": "vulkan",
      "platform": "windows",
      "buildModes": [ "build_option_1" ],
      "runConfiguration": "Release",
      "gpuArchitectures": [ "Turing", "Pascal" ]
    }
  ],
  "dependencies": [
    "relative_path_to_a_file_1",
    "relative_path_to_a_file_2"
  ],
  "data": [
    {
      "dependencies": [
        "relative_path_to_a_file_3",
        "relative_path_to_a_file_4"
      ],
      "command": "whole bash command",
      "outputs": [
        "absolute_path_to_an_output_created_by_command_1",
        "absolute_path_to_an_output_created_by_command_2"
      ]
    },
    {
      "dependencies": [
        "relative_path_to_a_file_5",
        "relative_path_to_a_file_6"
      ],
     "command": "whole bash command",
      "outputs": [
        "absolute_path_to_an_output_created_by_command_1",
        "absolute_path_to_an_output_created_by_command_2"
      ]
    }
  ]
}

enableParallelBuild is a required boolean variable that determines whether the build process runs for different configurations in parallel.

threadsPerBuildProcess is an optional integer variable, used when enableParallelBuild is enabled.

isExecuted is a required boolean variable. If true, the example will be built and tests will run to validate that it works properly. If false, the example will only be checked for whether it compiles successfully.

requiredFiles is an array of common relative file paths to any kind of resources, such as executables, shaders and similar, that are required to be present on the Virtual Machine before CI launches an example that will read from the JSON file.

data is an array of objects, each a data batch needed for a single run invocation, with the fields described below:

  • dependencies is an array of relative paths to files that must be present for that invocation. These files are unique to the batch and may be referenced by the command field; for instance, they may be scenes
  • command is an array with arguments listed in order, forming the command the example will execute
  • outputs is an array of absolute paths to the outputs the example should produce for that batch invocation
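The field descriptions above could translate into a structural check along these lines. It is a minimal sketch, not the Python Framework's API: `check_input_json` is a hypothetical name, and treating a missing `data`, `dependencies` or `outputs` field as an empty array is an assumption. The `command` field is deliberately not type-checked here.

```python
from typing import Any, Dict, List


def check_input_json(doc: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []

    # Required booleans, per the field descriptions.
    for key in ("enableParallelBuild", "isExecuted"):
        if not isinstance(doc.get(key), bool):
            problems.append(f"{key} must be present and boolean")

    # Optional integer; bool is a subclass of int in Python, so exclude it.
    threads = doc.get("threadsPerBuildProcess")
    if threads is not None and (isinstance(threads, bool) or not isinstance(threads, int)):
        problems.append("threadsPerBuildProcess must be an integer")

    # Assumption: "data" may be omitted; when present it is a list of batches.
    data = doc.get("data", [])
    if not isinstance(data, list):
        return problems + ["data must be an array"]
    for i, batch in enumerate(data):
        if not isinstance(batch, dict):
            problems.append(f"data[{i}] must be an object")
            continue
        for key in ("dependencies", "outputs"):
            value = batch.get(key, [])
            if not isinstance(value, list) or not all(isinstance(p, str) for p in value):
                problems.append(f"data[{i}].{key} must be an array of strings")
    return problems
```

Collecting problems instead of raising on the first one fits a job that continues after a failed validation and reports everything in a log.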

HTML output file description

TODO!

HTML result features

TODO!

Python Framework

TODO!

Description

TODO!

API

TODO!

Validation

The validation is performed via the API provided by Python Framework. If validation fails at any step below, the job will continue anyway, but with additional restrictions:

  • generated HTML file will have a special header at top with linked log file describing the validation issue
  • if the handled job is supposed to execute updateReferenceData script for a given example then it won't do it
  • CI status will be set to FAIL at the end of job's execution
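One way to carry these restrictions through a job is a small report object that records failures without aborting. This is a sketch only: the class, its method names, the "PASS" status string, and the header markup are assumptions made for illustration.

```python
import html
from dataclasses import dataclass, field
from typing import List


@dataclass
class ValidationReport:
    """Collects validation failures without aborting the job."""
    failures: List[str] = field(default_factory=list)

    def record(self, step: str, message: str) -> None:
        self.failures.append(f"{step}: {message}")

    @property
    def ok(self) -> bool:
        return not self.failures

    @property
    def ci_status(self) -> str:
        # "PASS" is a hypothetical label; only FAIL is named in the description.
        return "PASS" if self.ok else "FAIL"

    @property
    def may_update_reference_data(self) -> bool:
        # updateReferenceData must be skipped after any validation failure.
        return self.ok

    def html_header(self, log_href: str) -> str:
        """Warning banner for the top of the HTML file; empty when all passed."""
        if self.ok:
            return ""
        href = html.escape(log_href, quote=True)
        return f'<header class="validation-failed"><a href="{href}">Validation failed, see log</a></header>'
```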

Validation of JSON input file

Includes:

  • checking if all requiredFiles and dependencies are present
  • validation of characters in paths of the outputs array: space, double quote ("), apostrophe (') and backslash (\) are banned
  • validation of the path type in the outputs array: only absolute paths are valid
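The three checks listed above might look like this in Python. It is a hedged sketch: the function names are hypothetical, and accepting both POSIX-style (`/out/x`) and drive-letter (`C:/out/x`) absolute paths is an assumption, since the platform rule is not spelled out.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath
from typing import Iterable, List

# Space, double quote, apostrophe and backslash are banned in output paths.
BANNED_CHARS = frozenset(" \"'\\")


def check_output_path(path: str) -> List[str]:
    """Return the problems found in one entry of an outputs array."""
    problems: List[str] = []
    bad = sorted(set(path) & BANNED_CHARS)
    if bad:
        problems.append(f"banned characters {bad!r} in {path!r}")
    # Assumption: either a POSIX or a Windows drive-letter absolute path is fine.
    if not (PurePosixPath(path).is_absolute() or PureWindowsPath(path).is_absolute()):
        problems.append(f"not an absolute path: {path!r}")
    return problems


def missing_files(relative_paths: Iterable[str], base_dir: Path) -> List[str]:
    """Return the relative paths that do not exist as files under base_dir."""
    return [p for p in relative_paths if not (base_dir / p).is_file()]
```

The pure path classes only inspect the string, so the check does not touch the filesystem and can run on the host for paths that exist only inside the VM.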

Validation of produced results from a given example

Includes:

  • .bin files validation (we may not validate these because it is hard to do so, they are raw data, TODO: decide!)
  • .csv files validation, those files should be processed by any available validator
  • .md5 hash files validation, those files should be ASCII only
  • any example's image results or RenderDoc's saved images like .png, .jpeg, .ktx, etc.
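For the text-based result types, a first pass could be as simple as the sketch below. The function names are hypothetical. The ASCII-only rule for .md5 files comes from the list above; the extra "32 hex digits at the start of each line" check and the "every CSV row has the same number of fields" rule are assumptions about what a validator might enforce.

```python
import csv
import io
import re

# Assumed line format: 32 hex digits, optionally followed by whitespace and a name.
MD5_LINE = re.compile(r"^[0-9a-fA-F]{32}(\s+.*)?$")


def check_md5_text(raw: bytes) -> bool:
    """True if the data is ASCII-only and every non-empty line starts with an MD5 hash."""
    try:
        text = raw.decode("ascii")
    except UnicodeDecodeError:
        return False
    lines = [ln for ln in text.splitlines() if ln.strip()]
    return bool(lines) and all(MD5_LINE.match(ln) for ln in lines)


def check_csv_text(raw: bytes) -> bool:
    """True if the data decodes as UTF-8 and every row has the same field count."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    try:
        # strict=True makes the csv module raise on malformed quoting.
        rows = [r for r in csv.reader(io.StringIO(text, newline=""), strict=True) if r]
    except csv.Error:
        return False
    return bool(rows) and len({len(r) for r in rows}) == 1
```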

Validation of generated HTML file

Includes:

  • all resource hyperlinks: we need to make sure that they point only to whitelisted domains such as ours, GitHub's, etc.
  • HTTPS validation: we cannot allow insecure redirection
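Both HTML checks can be prototyped with the standard library alone. The sketch below is illustrative: the whitelist contents and function names are hypothetical, and letting purely relative links through (for artifacts that sit next to the HTML file) is an assumption.

```python
from html.parser import HTMLParser
from typing import List, Set
from urllib.parse import urlsplit

# Hypothetical whitelist; the real list of allowed domains is not defined yet.
ALLOWED_HOSTS = {"github.com", "raw.githubusercontent.com"}


class _LinkCollector(HTMLParser):
    """Collects every href/src attribute value found in the document."""

    def __init__(self) -> None:
        super().__init__()
        self.urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.urls.append(value)


def rejected_links(html_text: str, allowed_hosts: Set[str] = ALLOWED_HOSTS) -> List[str]:
    """Return links that are not HTTPS or whose host is not whitelisted."""
    collector = _LinkCollector()
    collector.feed(html_text)
    rejected: List[str] = []
    for url in collector.urls:
        parts = urlsplit(url)
        # Assumption: links with neither scheme nor host are relative and allowed.
        if not parts.scheme and not parts.netloc:
            continue
        if parts.scheme != "https" or parts.hostname not in allowed_hosts:
            rejected.append(url)
    return rejected
```

An exact host match is used rather than a suffix match, so a lookalike host such as `github.com.evil.example` is rejected.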
@AnastaZIuk AnastaZIuk added enhancement New feature or request CI labels Sep 20, 2022
@AnastaZIuk AnastaZIuk self-assigned this Sep 26, 2022
@AnastaZIuk
Member Author

once we have time to finish the issue and begin the work, we may consider golang and proxmox-api-go
