Data taring before downloads will be really slow #64

Open
nikromen opened this issue Jan 7, 2024 · 1 comment

Comments


nikromen commented Jan 7, 2024

We can't create a cron job that periodically tars the result directory, so the /download endpoint does this job itself. This is horribly slow; look at this example with only 300 really small feedbacks (just a name, solution, and reason for only 1 log):

[root@backend persistent]# time tar -zcf results.tar.gz results/
real m9.473s
user m9.328s
sys m0.483s

and we want thousands of feedbacks...

This could be solved by handing the tarring job off to something else, such as a CronJob in OpenShift, but we need an RWX PVC so that the containers can share the volume. We don't have permission to create this kind of volume in our OpenShift console, so we have to ask someone. Reach out to someone who can permit us to create an RWX PVC (Fedora infra folks?).
The cron job solution itself has already been written, so you can copy-paste it from nikromen@4381ec2
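
For illustration only, here is a minimal sketch of what a separate tarring job could do once a shared volume is available. It is not the code from nikromen@4381ec2; the function name `build_archive` and the idea of writing to a temporary file first are assumptions of this sketch. The point is that the archive is built next to its final location and then renamed, so /download would only ever see a finished `results.tar.gz` and would not have to tar anything itself.

```python
import os
import tarfile
import tempfile
from pathlib import Path


def build_archive(results_dir: Path, archive_path: Path) -> Path:
    """Tar and gzip results_dir into archive_path (hypothetical helper).

    Assumes archive_path lives outside results_dir, e.g. next to it.
    """
    archive_path.parent.mkdir(parents=True, exist_ok=True)
    # Temporary file in the same directory, so the final rename stays on
    # one filesystem and readers never observe a half-written archive.
    fd, tmp_name = tempfile.mkstemp(dir=archive_path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            with tarfile.open(fileobj=tmp_file, mode="w:gz") as tar:
                tar.add(results_dir, arcname=results_dir.name)
        # Swap the finished archive into place in one step.
        os.replace(tmp_name, archive_path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        Path(tmp_name).unlink(missing_ok=True)
        raise
    return archive_path
```

A periodic job would call something like `build_archive(Path("results"), Path("results.tar.gz"))`, and the endpoint would just serve the existing file.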

@github-project-automation github-project-automation bot moved this to Needs triage in CPT Kanban Jan 7, 2024
nikromen added a commit to nikromen/log-detective-website that referenced this issue Jan 7, 2024
This solution raises two concerns for the future, once we have a
lot of data collected:
- it will take a while until the data is tarred, creating a delay before
  the actual download (after approx. 200 feedbacks the delay is really noticeable :/)
  look at this example with only 300 (tiny!!) feedbacks:
  [root@backend persistent]# time tar -zcf results.tar.gz results/
  real m9.473s
  user m9.328s
  sys m0.483s
- downloading also takes some time

-> thus blocking the whole worker during this. IIRC we have 8 workers, so
8 concurrent downloads make the API unresponsive.

Solution:
- how to solve the delay before download:
  fedora-copr#64
- the issue above is the slowest part; once it is resolved and people
  still complain, do this:
  fedora-copr#65
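
As a rough illustration of the "blocking the whole worker" concern: if the API worker runs an asyncio event loop (an assumption, the thread does not say how the workers are implemented), the blocking tar call could be pushed to a thread so the loop can keep answering other requests. The names `tar_results` and `tar_results_in_thread` are hypothetical. This does not make tarring any faster, it only keeps the worker responsive while it runs.

```python
import asyncio
import tarfile
from pathlib import Path


def tar_results(results_dir: Path, archive_path: Path) -> Path:
    """Blocking call: tar and gzip results_dir into archive_path."""
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(results_dir, arcname=results_dir.name)
    return archive_path


async def tar_results_in_thread(results_dir: Path, archive_path: Path) -> Path:
    """Run the blocking tar job in a worker thread (Python 3.9+)."""
    # The event loop stays free to serve other requests while this runs.
    return await asyncio.to_thread(tar_results, results_dir, archive_path)
```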
@nikromen nikromen changed the title Data downloads will be really slow Data taring before downloads will be really slow Jan 7, 2024
nikromen added a commit that referenced this issue Jan 8, 2024
@praiskup praiskup moved this from Needs triage to In 2 years in CPT Kanban Jan 10, 2024
@praiskup

We should prioritize this once this really causes issues.

nikromen added a commit to nikromen/log-detective-website that referenced this issue Jul 29, 2024