
Support customizable cpu, memory, network task complexities #47

Merged
merged 19 commits on May 3, 2022

Conversation

alekodu (Member) commented Apr 26, 2022

closes #14, closes #9, closes #4, closes #38, closes #20

New endpoint format:

"endpoints": [
  {
    "name": "end1",
    "protocol": "http",
    "execution_mode": "parallel",
    "cpu_complexity": {
      "execution_time": "5s",
      "method": "fibonacci",
      "workers": 2,
      "cpu_affinity": [0, 1],
      "cpu_load": "100%"
    },
    "memory_complexity": {
      "execution_time": "5s",
      "method": "swap",
      "workers": 24,
      "bytes_load": "100%"
    },
    "network_complexity": {
      "forward_requests": "asynchronous",
      "response_payload_size": 512,
      "called_services": [
        {
          "service": "service2",
          "port": "80",
          "endpoint": "end2",
          "protocol": "http",
          "traffic_forward_ratio": 1,
          "request_payload_size": 256
        }
      ]
    }
  }
]
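The `cpu_complexity` block maps fairly directly onto a stress-ng invocation. The sketch below shows one plausible translation; the flag names are standard stress-ng options, but the helper function itself is hypothetical and not taken from this PR:

```python
# Hypothetical helper: translate a cpu_complexity config block into a
# stress-ng argument vector. The stress-ng flags used (--cpu, --cpu-method,
# --cpu-load, --timeout, --taskset) are standard stress-ng options.
def cpu_stress_args(cfg):
    args = [
        "stress-ng",
        "--cpu", str(cfg["workers"]),            # number of CPU stressor workers
        "--cpu-method", cfg["method"],           # e.g. "fibonacci"
        "--cpu-load", cfg["cpu_load"].rstrip("%"),  # target load percentage
        "--timeout", cfg["execution_time"],      # e.g. "5s"
    ]
    if cfg.get("cpu_affinity"):
        # Pin the stressor workers to the listed cores.
        args += ["--taskset", ",".join(map(str, cfg["cpu_affinity"]))]
    return args

cfg = {"execution_time": "5s", "method": "fibonacci", "workers": 2,
       "cpu_affinity": [0, 1], "cpu_load": "100%"}
print(cpu_stress_args(cfg))
```

The `memory_complexity` block would translate analogously to stress-ng's `--vm`, `--vm-method`, and `--vm-bytes` options.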

New response format:

{
  "cpu_task": {
    "services": ["service1/end1", "service2/end2"],
    "statuses": [
      "stress-ng: info: [76] dispatching hogs: 2 cpu\nstress-ng: info: [76] successful run completed in 5.00s\nstress-ng: info: [76] stressor bogo ops real time usr time sys time bogo ops/s bogo ops/s\nstress-ng: info: [76] (secs) (secs) (secs) (real time) (usr+sys time)\nstress-ng: info: [76] cpu 64633075 5.00 9.61 0.00 12926973.13 6725606.14\n",
      "stress-ng: info: [33] dispatching hogs: 2 cpu\nstress-ng: info: [33] successful run completed in 5.32s\nstress-ng: info: [33] stressor bogo ops real time usr time sys time bogo ops/s bogo ops/s\nstress-ng: info: [33] (secs) (secs) (secs) (real time) (usr+sys time)\nstress-ng: info: [33] cpu 4349696 5.17 0.80 0.00 841800.59 5437120.00\n"
    ]
  },
  "memory_task": {
    "services": ["service1/end1", "service2/end2"],
    "statuses": [
      "stress-ng: info: [80] dispatching hogs: 24 vm\nstress-ng: info: [80] successful run completed in 5.28s\nstress-ng: info: [80] stressor bogo ops real time usr time sys time bogo ops/s bogo ops/s\nstress-ng: info: [80] (secs) (secs) (secs) (real time) (usr+sys time)\nstress-ng: info: [80] vm 0 5.23 4.82 5.43 0.00 0.00\n",
      "stress-ng: info: [35] dispatching hogs: 24 vm\nstress-ng: info: [35] successful run completed in 5.33s\nstress-ng: info: [35] stressor bogo ops real time usr time sys time bogo ops/s bogo ops/s\nstress-ng: info: [35] (secs) (secs) (secs) (real time) (usr+sys time)\nstress-ng: info: [35] vm 0 5.31 4.03 5.62 0.00 0.00\n"
    ]
  },
  "network_task": {
    "services": ["(service1/end1, service2/end2)"],
    "statuses": [200],
    "payload": "JDYg8VVuaptwsGdS1rxq7Rwr04axGBdIaRBPaN55iuvcogCJIhpDCPnrcLpKI671sGHkBIylJ8DrCzW9QgI16DnYXKm8D0Of0wL55Tar0EHjwP563hPdAd3xhaoZDM3BIP0ZcNHjLWC5k1v2Y5OEZSTodecrkG5JvPAcl93G1rOU0KUR2ZMQ3aljh8d4uKaXD6j4RMmuNvd2VuuHicUnIkxcUA32WBnEgXkmYJmlF6nUggaU4TR93mZxWhQdMOTYydFsKrtlZQ39zOA66F2kxzV7eYtOobpjoz3XmjCiceEU2PfmnOJiKtsBDzevMRm0lWll2Ua4FZQETnORPWhpwnKnPdxlJcsqGAoAhzmR8yF8JXXACkFP9yfcaW6rVuQIShCHPbMCAi0uEhnbr1tiXoJHhWLUqLSVlh57PDGXC74fmZ1gpuQQiYyKoN5EmPSrOyqUe4iVgMDBg2sOHzWrvpSOg6QDB3rQg3jHP7srh87YpYnjaZBqM6ns6GlnlVKS"
  }
}
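Each task section pairs one entry in `services` with one raw stress-ng log (or HTTP status) in `statuses`. A minimal sketch of how such a response could be assembled; the function and variable names are hypothetical, not from the PR's code:

```python
# Hypothetical assembly of the aggregated response. Each *_results argument
# is a list of (service_path, status) pairs collected from the local task
# and from downstream services.
def build_response(cpu_results, mem_results, net_results, payload):
    def section(results):
        # Keep services and statuses index-aligned, as in the response format.
        return {"services": [svc for svc, _ in results],
                "statuses": [status for _, status in results]}

    resp = {
        "cpu_task": section(cpu_results),
        "memory_task": section(mem_results),
        "network_task": section(net_results),
    }
    # Only the network task carries the generated response payload.
    resp["network_task"]["payload"] = payload
    return resp

resp = build_response(
    [("service1/end1", "ok"), ("service2/end2", "ok")],
    [("service1/end1", "ok"), ("service2/end2", "ok")],
    [("(service1/end1, service2/end2)", 200)],
    "x" * 512,  # stand-in for the 512-byte response_payload_size
)
```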

@alekodu alekodu requested a review from salehsedghpour April 26, 2022 07:27
@alekodu alekodu changed the title Dev Support customizable cpu, memory, network task complexities Apr 26, 2022
"namespace": "default",
"node": "node-1"
"cluster": "cluster1",
"namespace": "edge-namespace",
Collaborator:

It is suggested to use the existing default namespace in all cases.

version: cluster1
namespace: edge-namespace
data:
conf.json: '{"processes":2,"threads":2,"endpoints":[{"name":"endpoint1","protocol":"http","execution_mode":"sequential","cpu_complexity":{"execution_time":"","method":"","workers":0,"cpu_affinity":null,"cpu_load":""},"memory_complexity":{"execution_time":"","method":"","workers":0,"bytes_load":""},"network_complexity":{"forward_requests":"asynchronous","response_payload_size":512,"called_services":[{"service":"service2","port":"80","endpoint":"endpoint1","protocol":"http","traffic_forward_ratio":1,"request_payload_size":256},{"service":"service2","port":"80","endpoint":"endpoint2","protocol":"http","traffic_forward_ratio":1,"request_payload_size":256}]}},{"name":"endpoint2","protocol":"http","execution_mode":"parallel","cpu_complexity":{"execution_time":"","method":"","workers":0,"cpu_affinity":null,"cpu_load":""},"memory_complexity":{"execution_time":"","method":"","workers":0,"bytes_load":""},"network_complexity":{"forward_requests":"asynchronous","response_payload_size":512,"called_services":[]}}]}'
Collaborator:

Is the cpu_affinity setting correct in this example?

Member Author:

Right, I didn't update the example YAML files.

version: cluster1
namespace: edge-namespace
data:
conf.json: '{"processes":2,"threads":2,"endpoints":[{"name":"endpoint1","protocol":"http","execution_mode":"parallel","cpu_complexity":{"execution_time":"5s","method":"fibonacci","workers":2,"cpu_affinity":[22,23],"cpu_load":"100%"},"memory_complexity":{"execution_time":"5s","method":"swap","workers":24,"bytes_load":"100%"},"network_complexity":{"forward_requests":"asynchronous","response_payload_size":512,"called_services":[]}},{"name":"endpoint2","protocol":"http","execution_mode":"parallel","cpu_complexity":{"execution_time":"","method":"","workers":0,"cpu_affinity":null,"cpu_load":""},"memory_complexity":{"execution_time":"","method":"","workers":0,"bytes_load":""},"network_complexity":{"forward_requests":"asynchronous","response_payload_size":512,"called_services":[]}}]}'
Collaborator:

Not everyone has access to servers with 24 CPU cores. I would suggest changing the cpu_affinity values.
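One way to guard against this class of misconfiguration is to validate `cpu_affinity` against the cores actually visible to the process. This is a hypothetical check, not part of the PR:

```python
import os

def check_cpu_affinity(affinity):
    # Reject core IDs that don't exist on this machine. os.cpu_count()
    # reports the number of CPUs visible to the Python process.
    ncores = os.cpu_count() or 1
    bad = [core for core in (affinity or []) if core < 0 or core >= ncores]
    if bad:
        raise ValueError(f"cpu_affinity lists unavailable cores: {bad}")

check_cpu_affinity([0])      # core 0 always exists
check_cpu_affinity(None)     # unset affinity is fine
```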

]
"execution_mode": "sequential",
"cpu_complexity": {
"execution_time": "10s",
Collaborator:

As discussed before, we should reduce the execution_time values to the millisecond scale per request.

Member Author:

It seems the minimum timeout supported by stress-ng is 1 second...

Collaborator:

Is it possible to kill the stress-ng subprocess in Python? Then we could just pass a huge timeout to stress-ng and kill it with Python.
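A sketch of that idea, assuming stress-ng is on the PATH: launch it with an oversized `--timeout` and enforce the real (possibly sub-second) deadline from Python. The function name is hypothetical:

```python
import subprocess

def run_stress_bounded(argv, deadline_s):
    # Start the stressor with a generous stress-ng timeout, then enforce
    # the actual deadline from Python by killing the subprocess.
    proc = subprocess.Popen(argv + ["--timeout", "60s"],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    try:
        proc.wait(timeout=deadline_s)
    except subprocess.TimeoutExpired:
        proc.terminate()              # SIGTERM first
        try:
            proc.wait(timeout=1)
        except subprocess.TimeoutExpired:
            proc.kill()               # escalate if SIGTERM is ignored
            proc.wait()
    return proc.returncode
```

Note this only bounds the wall-clock duration; it does not address the stress-ng ramp-up delay discussed in this thread.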

Member Author:

It could be, but the load stress-ng generates takes some time to be applied and reflected in resource consumption. If we want a minimum processing delay on the order of milliseconds, we could instead simulate a sleep task, or simply not configure any CPU or memory task.

Collaborator:

A sleep task does not create any load. I'll check it out.

Member Author:

Yes, it doesn't create any load. I think it is hard to simulate tasks that are both CPU-bound and I/O-bound at the same time: we can simulate tasks with a heavy load on CPU/memory or on the network, but not both.

Member Author:

After discussion, this will be solved in issue #48.

EpExecModeDefault = "sequential"
EpNwResponseSizeDefault = 512

EpExecTimeDefault = "1s"
Collaborator:

The default values should also be reduced to a reasonable scale for a single request.

model/Dockerfile Outdated
@@ -20,6 +20,9 @@ RUN mkdir -p /usr/src/app
RUN apt update
RUN apt install -y jq \
wget \
curl \
vim \
Collaborator:

Are we really using vim and curl in the container? I think we should remove these two from the Docker image.

@@ -43,105 +43,233 @@ def getForwardHeaders(request):
return headers


def run_task(service_endpoint):
def run_task(service_name, service_endpoint):
headers = getForwardHeaders(request)
Collaborator:

We already have the same value for headers as defined here, and we also import headers from wsgiref; don't they conflict with each other? And do we really need wsgiref?

Member Author:

I didn't make changes to the header part. If I remember correctly, this was done to forward Jaeger tracing headers through the set of microservices.

RequestPayloadSize int `json:"request_payload_size"`
}

type CpuComplexity struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to define default values for the required fields here, so that if the user skips these values, we can still generate reasonable YAML files.
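One way to realize this suggestion, sketched in Python for brevity (the field names mirror the CpuComplexity struct; the default values themselves are illustrative assumptions, not the project's):

```python
# Illustrative defaults for a cpu_complexity block; the project may choose
# different values. Field names mirror the config format in this PR.
CPU_COMPLEXITY_DEFAULTS = {
    "execution_time": "1s",
    "method": "fibonacci",
    "workers": 1,
    "cpu_affinity": None,
    "cpu_load": "100%",
}

def with_defaults(user_cfg, defaults=CPU_COMPLEXITY_DEFAULTS):
    # Fill in any field the user omitted or left at a zero value
    # ("", 0, None), so generation never sees an incomplete block.
    merged = dict(defaults)
    merged.update({k: v for k, v in (user_cfg or {}).items()
                   if v not in ("", 0, None)})
    return merged
```

This mirrors Go's treatment of zero values in JSON-decoded structs: an omitted field decodes to its zero value, which the merge then replaces with the default.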

@salehsedghpour salehsedghpour merged commit 43e77ac into EricssonResearch:main May 3, 2022