
Provide support for capability in Azure Pipelines scaler #2328

Closed
tomkerkhove opened this issue on Nov 24, 2021 · Discussed in #2308 · 6 comments

@tomkerkhove
Member

Discussed in #2308

Originally posted by SebastianSchuetze November 20, 2021
Hi,

I have an idea I need to share, to see whether I should give implementing it a go.

Background / Motivation

I recently was able to run the new Azure Container Apps with Azure Pipelines and the KEDA scaler, which is already a huge improvement because you get scalable, container-based agent pools. That has been on the Azure DevOps roadmap for a long time but has not been implemented yet.

In my company we use two DevOps tool platforms: Azure DevOps Services and GitLab. Our IT company (DB Systel), which provides IT for the whole of Deutsche Bahn, is pretty much container based. They have teams managing Kubernetes clusters that can be used by all DevOps teams or partners who want to use container-based solutions.

Also, teams (often inner sourced) provide build containers for specific use cases (build Java, run WhiteSource Bolt code checks, build C#, etc.), which is actually fine because you separate concerns instead of making one big 10 GB image and hoping it works for everybody.

Azure Pipelines Agents vs GitLab Runners

Because of this, our company makes heavy use of the GitLab Kubernetes executor, which can be used within GitLab pipelines. It works as follows:

  • A pipeline is started with a GitLab runner installed on a pretty basic environment (no other tools installed)
  • In the YAML, build containers are specified (pulled from a private registry like Artifactory) for the Kubernetes executor
  • The GitLab runner does not spin up the container on the same host, but actually calls the Kubernetes API to spin up the container and waits for the results (logs, exit code etc.)
  • It finishes the pipeline and fails or succeeds accordingly

Compared to this, Azure Pipelines / Azure DevOps looks pretty old school. You always have to run the container on the same host where the agent is running. This works fine with VMs, but with Docker in Docker you get problems (especially on PaaS offerings that prohibit opening up the Docker socket). That means you can't use container-based pipelines and still use Docker images on the host.

So, long story short: I think this could be achieved by extending the Azure Pipelines scaler to also take the demands of jobs into account!

How does this help?

The current scaler calls an API that is not publicly documented (which is fine, because the API documentation of AzDO is really incomplete anyway!) but stable. It calls

https://dev.azure.com/{organization}/_apis/distributedtask/pools/{poolId}/jobrequests

and it returns, besides the job itself, also the matching agents and the "demands" property. Demands describe what software must be installed on an agent in order for it to be selected. If no agent has the demanded capability, the pipeline fails, saying no agent is available. The capabilities can be customized and controlled (via environment variables) when connecting an agent.
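
(To make that call concrete, here is a minimal, standalone sketch, not the actual scaler code, of hitting the endpoint and printing the demands of every job that has no result yet. It assumes the usual Azure DevOps {"value": [...]} response envelope and PAT authentication; since the API is not publicly documented, the exact payload may differ.)

package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	org := "myorg" // illustrative organization name
	poolID := "1"  // illustrative agent pool ID
	pat := "<personal access token>"

	url := fmt.Sprintf("https://dev.azure.com/%s/_apis/distributedtask/pools/%s/jobrequests", org, poolID)

	req, _ := http.NewRequest("GET", url, nil)
	req.SetBasicAuth("", pat) // the scaler authenticates with a PAT

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var body map[string]interface{}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		panic(err)
	}

	for _, j := range body["value"].([]interface{}) {
		job := j.(map[string]interface{})
		if job["result"] == nil { // no result yet -> the job is still waiting or running
			fmt.Println("pending job, demands:", job["demands"])
		}
	}
}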

The current scaler scales the container regardless of the demands. So if the container running the agent does not include the demanded capability, it would scale up but never take a job.

So the idea would be to also give the scaler a list of capabilities that it should look for. If the capability is not part of a waiting job's demands, then that job is not counted.
If I could do this, then I could have multiple different container agents within one pool that only scale up when a job with certain demands is requested, and I could have one pool indirectly controlling which type of Docker image should be used.
For me this would be the closest thing to implementing GitLab's Kubernetes executor approach.

Example

Think of this example queue in an agent pool:

- job: job1
  capabilities:
    - cap1
    - cap3
- job: job2
  capabilities:
    - cap3
    - cap4
- job: job3
  capabilities:
    - cap2
    - cap5

Now imagine that a new container spins up for each new waiting job; then we would have three containers running. But if I additionally told the scaler to only count jobs with the capability cap3, we would have only two containers running.
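
(As a purely illustrative sketch of that counting rule, the following standalone Go snippet, with made-up names, counts only the jobs whose demands include cap3 and prints 2:)

package main

import "fmt"

func main() {
	// Demands of the three example jobs waiting in the pool.
	jobDemands := map[string][]string{
		"job1": {"cap1", "cap3"},
		"job2": {"cap3", "cap4"},
		"job3": {"cap2", "cap5"},
	}

	// The capability the scaler is told to look for.
	wanted := "cap3"

	count := 0
	for _, demands := range jobDemands {
		for _, d := range demands {
			if d == wanted {
				count++
				break
			}
		}
	}

	fmt.Println(count) // 2 -> only two agent containers would be spun up
}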

Final question

Do you see any problem with this approach? And since I know @tomkerkhove knows Azure DevOps, he might be able to judge whether this could be beneficial.

@tomkerkhove added the azure, feature, help wanted and scaler-azure-pipelines labels on Nov 24, 2021
@Eldarrin
Contributor

Eldarrin commented Dec 15, 2021

Interestingly I hit the same thought. Started looking at doing something like:
url := fmt.Sprintf("%s/_apis/distributedtask/pools/%s/agents?&demands=%s&api-version=6.1-preview.1", s.metadata.organizationURL, s.metadata.poolID, demands)
just trying to figure out how I would instruct the correct scaler so I only scale up the correct ScaledJob profile

edit: implementing a multi-variant, demand-based farm

edit 2: If anyone knows an API to filter queue length by demand, this would be easy :)

@Eldarrin
Contributor

Eldarrin commented Jan 14, 2022

Unfortunately I don't know Go, but I had a play around and put some stuff together that may work (it will need a proper dev to write it well :) )

Please note, this only supports exists/does-not-exist checks, not versions etc.

First, add demands to the metadata:

type azurePipelinesMetadata struct {
	organizationURL            string
	organizationName           string
	personalAccessToken        string
	poolID                     string
	demands                    string
	targetPipelinesQueueLength int
	scalerIndex                int
}
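
(A hypothetical, standalone sketch of how that comma-separated field could be filled from the trigger metadata; the function and parameter names here are illustrative, not code from the existing scaler:)

package main

import (
	"fmt"
	"strings"
)

// parseDemands normalizes an optional comma-separated "demands" value from the
// trigger metadata. An empty result means: count every pending job, as today.
func parseDemands(triggerMetadata map[string]string) string {
	parts := strings.Split(triggerMetadata["demands"], ",")
	for i := range parts {
		parts[i] = strings.TrimSpace(parts[i])
	}
	return strings.Join(parts, ",")
}

func main() {
	fmt.Println(parseDemands(map[string]string{"demands": "cap3, maven"})) // cap3,maven
}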

Then in GetAzurePipelinesQueueLength:

replace:

for _, value := range jobs {
		v := value.(map[string]interface{})
		if v["result"] == nil {
			count++
		}
	}

with:

for _, value := range jobs {
	v := value.(map[string]interface{})
	if v["result"] == nil {
		if s.metadata.demands == "" {
			// No demands configured: count every pending job, as before.
			count++
		} else {
			// Count the job only when all of its demands (minus the implicit
			// Agent.Version demand) are found in the configured list.
			var demandsReq = v["demands"].([]interface{})
			var demandsAvail = strings.Split(s.metadata.demands, ",")
			var countDemands = 0
			for _, dr := range demandsReq {
				for _, da := range demandsAvail {
					if dr == strings.Trim(da, " ") {
						countDemands++
					}
				}
			}
			if countDemands == len(demandsReq)-1 {
				count++
			}
		}
	}
}

The len()-1 is to ignore the implicit Agent.Version demand.

HTH, Andy

@MartinGolding515

I am considering the exact same use case, so I would also like to have this feature. Note that you could create different agent pools for the different types of image, but this has two drawbacks:

  1. An instance must be present in each agent pool to be registered.
  2. The Azure CLI has a significant footprint, so this would waste a lot of resources considering the above.

@SandipGorde-TomTom

We are also waiting for this feature for our use cases.

@tomkerkhove
Member Author

Anybody willing to contribute this?

@Eldarrin
Contributor

I'm working on it at the moment; I created an initial version as outlined above and it worked. However, it spins up duplicate jobs on any machine that can satisfy the demands, so one job requesting Java would spin up every ScaledJob agent that can satisfy it. I'm trying to refine this at the moment.
