Release 0.50.3 #2426

Merged · 75 commits · Jun 6, 2023

Commits
6562e4a
Merge pull request #2399 from chaoss/docs-patch-ccd
sgoggins May 4, 2023
798a893
documentation tweak
sgoggins May 4, 2023
6d47b3f
mkdir
IsaacMilarky May 9, 2023
1f0866f
Merge pull request #2400 from chaoss/migration-patch
IsaacMilarky May 9, 2023
5d7c885
set repo back to error when an unexpected error happened
IsaacMilarky May 9, 2023
b08cb3c
remove raise of error
IsaacMilarky May 10, 2023
75ef404
error log
IsaacMilarky May 10, 2023
2b745b2
Merge pull request #2402 from chaoss/clone-err-handle
IsaacMilarky May 10, 2023
abf7513
Fix success check for org and repo adding
ABrain7710 May 10, 2023
4604eff
Raise exception if no valid github api keys exist
ABrain7710 May 10, 2023
5cca96d
Catch exception when no valid keys exists and return False
ABrain7710 May 10, 2023
bab1ce4
Handle github 204 api status
ABrain7710 May 10, 2023
337268c
default materialized view refresh for every 7 days
IsaacMilarky May 11, 2023
aba50ad
simplify worker start logic
IsaacMilarky May 11, 2023
c8355de
write method to scale celery processes based on memory
IsaacMilarky May 11, 2023
5949f13
Cap algorithm at a maximum amount of processes to schedule
IsaacMilarky May 11, 2023
96d8712
Merge pull request #2404 from chaoss/invalid-keys-fix
IsaacMilarky May 11, 2023
1fea7a4
Merge branch 'dev' into celery-process-scaling
IsaacMilarky May 11, 2023
4e9b511
Merge pull request #2406 from chaoss/materialized-views-interval-update
sgoggins May 11, 2023
f97b648
fix indent issue due to uncaught merge error
IsaacMilarky May 11, 2023
97aeae6
Merge pull request #2405 from chaoss/handle-github-204
sgoggins May 11, 2023
d40fecf
Merge pull request #2408 from chaoss/fix-merge-issue
sgoggins May 11, 2023
0fddebf
Merge branch 'dev' into celery-process-scaling
IsaacMilarky May 11, 2023
6b7324a
add hard cap on memory usage with celery worker children
IsaacMilarky May 11, 2023
b0ea69a
subtract 2 from max_process_estimate to ensure not to exceed 30% of t…
IsaacMilarky May 11, 2023
de0d9b0
Make sure that each worker has a minimum process of 1
IsaacMilarky May 11, 2023
fc72e8e
default 25%
IsaacMilarky May 11, 2023
49645a2
Merge pull request #2407 from chaoss/celery-process-scaling
IsaacMilarky May 11, 2023
10a16cb
Add alembic revision for new version of the config
IsaacMilarky May 15, 2023
e96bb41
missing import
IsaacMilarky May 15, 2023
3b085b3
syntax
IsaacMilarky May 15, 2023
d2bb1ce
Implement use of config values
IsaacMilarky May 15, 2023
1491172
remove celery concurrency option
IsaacMilarky May 15, 2023
f268c7c
Add helper functions for db fixtures
ABrain7710 May 17, 2023
d761854
Add fresh db fixture with each scope level
ABrain7710 May 17, 2023
5d16066
Add read only database fixture
ABrain7710 May 17, 2023
2c305cb
Increase security of password hashing and define standard method to h…
ABrain7710 May 17, 2023
ba94e5a
Implement functionality to add orgs and repos asynchronously
ABrain7710 May 17, 2023
0a5b68f
Merge pull request #2409 from chaoss/add-config-options
sgoggins May 17, 2023
59cd69d
Merge pull request #2410 from chaoss/password-security
ABrain7710 May 17, 2023
c11f299
sendgrid update
sgoggins May 17, 2023
ffe8f67
updating sendgrid .gitignore
sgoggins May 17, 2023
650b13e
sendgrid key removed
sgoggins May 17, 2023
45bad97
fixing send
sgoggins May 17, 2023
a2325f0
Fix backend.py to run frontend worker when collection is disabled
ABrain7710 May 17, 2023
09b9f95
Merge dev into branch
ABrain7710 May 17, 2023
1ca81dd
Merge pull request #2411 from chaoss/add-repo-processing-worker
ABrain7710 May 18, 2023
1ba63be
Add missing file
ABrain7710 May 18, 2023
59df122
Merge pull request #2412 from chaoss/add-repo-processing-worker
sgoggins May 18, 2023
9b81444
Db fixture updates
ABrain7710 May 19, 2023
c7d2cb3
Quickly add repos to group if they already exist
ABrain7710 May 22, 2023
f85e1a3
Remove populated db fixtures until functionalg
ABrain7710 May 22, 2023
85aa238
Merge branch 'dev' into dev-tests
ABrain7710 May 22, 2023
d40f576
Repo group casing fix
ABrain7710 May 23, 2023
c5f8b28
add password reset command
Ulincsys May 23, 2023
963e5f5
Protect queries against sql injection
ABrain7710 May 23, 2023
54dc4db
Merge pull request #2423 from chaoss/password_reset_cli
sgoggins May 23, 2023
9d7efc1
Merge pull request #2425 from chaoss/casing-fix
IsaacMilarky May 23, 2023
97a9ac4
Merge pull request #2424 from chaoss/sql-injection-protection
IsaacMilarky May 23, 2023
b0f159b
Merge pull request #2427 from chaoss/main
sgoggins May 24, 2023
165b6d9
Update collecting-data docs page
IsaacMilarky May 24, 2023
71b5696
Update new-install.md
Seltyk May 24, 2023
52a9595
Update new-install.rst
Seltyk May 24, 2023
85100ce
Update a bunch of the worker docs to reflect present tasks
IsaacMilarky May 24, 2023
314cff3
Merge pull request #2413 from chaoss/dev-tests
IsaacMilarky May 24, 2023
ac05fb7
Merge pull request #2428 from Seltyk/seltyk-install
IsaacMilarky May 24, 2023
7600b52
Add linux badge worker functionality
IsaacMilarky May 26, 2023
8359977
add linux badge functionality to primary jobs
IsaacMilarky May 26, 2023
3dcd132
deal with insertion issue
IsaacMilarky May 26, 2023
0be9473
Merge pull request #2429 from chaoss/isaac-docs-update
sgoggins May 27, 2023
7142a66
Merge pull request #2431 from chaoss/linux-badge-tasks
sgoggins May 27, 2023
08e754f
merge with dev. preserve lowercasing the urls array
IsaacMilarky May 28, 2023
170570d
[docs] Clarify prompts during `make install`
Seltyk May 30, 2023
05db18c
Merge pull request #2414 from chaoss/skip-existing-repos
ABrain7710 Jun 2, 2023
037d00e
Merge pull request #2432 from Seltyk/seltyk-install
sgoggins Jun 5, 2023
6 changes: 5 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -192,4 +192,8 @@ pgdata/
postgres-data/

# Generated files from github
.history/
.history/sendgrid.env
sendgrid.env
*sendgrid*.env
./sendgrid.env
sendgrid.env
2 changes: 1 addition & 1 deletion augur/api/routes/application.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@
import pandas as pd
from flask import request, Response, jsonify, session
from flask_login import login_user, logout_user, current_user, login_required
from werkzeug.security import generate_password_hash, check_password_hash
from werkzeug.security import check_password_hash
from sqlalchemy.sql import text
from sqlalchemy.orm import sessionmaker
from sqlalchemy.orm.exc import NoResultFound
Expand Down
4 changes: 2 additions & 2 deletions augur/api/routes/user.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@
import pandas as pd
from flask import request, Response, jsonify, session
from flask_login import login_user, logout_user, current_user, login_required
from werkzeug.security import generate_password_hash, check_password_hash
from werkzeug.security import check_password_hash
from sqlalchemy.sql import text
from sqlalchemy.orm import sessionmaker
from sqlalchemy.orm.exc import NoResultFound
Expand Down Expand Up @@ -212,7 +212,7 @@ def update_user():
return jsonify({"status": "Email Updated"})

if new_password is not None:
current_user.login_hashword = generate_password_hash(new_password)
current_user.login_hashword = User.compute_hashsed_password(new_password)
session.commit()
session = Session()
return jsonify({"status": "Password Updated"})
Expand Down
44 changes: 5 additions & 39 deletions augur/api/view/api.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
from flask import Flask, render_template, render_template_string, request, abort, jsonify, redirect, url_for, session, flash
from flask_login import current_user, login_required
from augur.application.db.models import Repo
from augur.tasks.frontend import add_org_repo_list
from .utils import *
from ..server import app

Expand Down Expand Up @@ -33,46 +34,11 @@ def av_add_user_repo():
if group == "None":
group = current_user.login_name + "_default"

urls = [url.lower() for url in urls]

added_orgs = 0
added_repos = 0
for url in urls:

# matches https://github.com/{org}/ or htts://github.com/{org}
if Repo.parse_github_org_url(url):
added = current_user.add_org(group, url)
if added:
added_orgs += 1

# matches https://github.com/{org}/{repo}/ or htts://github.com/{org}/{repo}
elif Repo.parse_github_repo_url(url)[0]:
print("Adding repo")
added = current_user.add_repo(group, url)
if added:
print("Repo added")
added_repos += 1

# matches /{org}/{repo}/ or /{org}/{repo} or {org}/{repo}/ or {org}/{repo}
elif (match := re.match(r'^\/?([a-zA-Z0-9_-]+)\/([a-zA-Z0-9_-]+)\/?$', url)):
org, repo = match.groups()
repo_url = f"https://github.com/{org}/{repo}/"
added = current_user.add_repo(group, repo_url)
if added:
added_repos += 1

# matches /{org}/ or /{org} or {org}/ or {org}
elif (match := re.match(r'^\/?([a-zA-Z0-9_-]+)\/?$', url)):
org = match.group(1)
org_url = f"https://github.com/{org}/"
added = current_user.add_org(group, org_url)
if added:
added_orgs += 1


if not added_orgs and not added_repos:
flash(f"Unable to add any repos or orgs")
else:
flash(f"Successfully added {added_repos} repos and {added_orgs} orgs")
add_org_repo_list.si(current_user.user_id, group, urls).apply_async()

flash("Adding repos and orgs in the background")

return redirect(url_for("user_settings") + "?section=tracker")

Expand Down
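The block removed above classified each submitted URL with a chain of regular expressions before handing it to `add_org` or `add_repo`; the new code delegates all of that to the `add_org_repo_list` Celery task. As a standalone sketch of the classification the old loop performed (hypothetical helper name; the real code uses `Repo.parse_github_org_url` and `Repo.parse_github_repo_url` for the first two cases):

```python
import re

def classify_github_url(url: str) -> tuple[str, str]:
    """Return ("org"|"repo"|"unknown", canonical_url) for a submitted URL."""
    url = url.lower()
    # full repo URL: https://github.com/{org}/{repo} (trailing slash optional)
    if m := re.match(r'^https?://github\.com/([a-z0-9_-]+)/([a-z0-9_-]+)/?$', url):
        return ("repo", f"https://github.com/{m[1]}/{m[2]}/")
    # full org URL: https://github.com/{org}
    if m := re.match(r'^https?://github\.com/([a-z0-9_-]+)/?$', url):
        return ("org", f"https://github.com/{m[1]}/")
    # bare {org}/{repo}, with optional leading/trailing slashes
    if m := re.match(r'^/?([a-z0-9_-]+)/([a-z0-9_-]+)/?$', url):
        return ("repo", f"https://github.com/{m[1]}/{m[2]}/")
    # bare {org}
    if m := re.match(r'^/?([a-z0-9_-]+)/?$', url):
        return ("org", f"https://github.com/{m[1]}/")
    return ("unknown", url)
```

Lowercasing first (as the removed `urls = [url.lower() ...]` line did) lets the character classes stay simple; anything that matches none of the four shapes is left for the task to reject.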
99 changes: 60 additions & 39 deletions augur/application/cli/backend.py
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,7 @@ def start(disable_collection, development, port):
if not port:
port = config.get_value("Server", "port")

worker_vmem_cap = config.get_value("Celery", 'worker_process_vmem_cap')

gunicorn_command = f"gunicorn -c {gunicorn_location} -b {host}:{port} augur.api.server:app"
server = subprocess.Popen(gunicorn_command.split(" "))
Expand All @@ -83,30 +84,18 @@ def start(disable_collection, development, port):
logger.info('Gunicorn webserver started...')
logger.info(f'Augur is running at: {"http" if development else "https"}://{host}:{port}')

scheduling_worker_process = None
core_worker_process = None
secondary_worker_process = None
celery_beat_process = None
facade_worker_process = None
if not disable_collection:

if os.path.exists("celerybeat-schedule.db"):
processes = start_celery_worker_processes(float(worker_vmem_cap), disable_collection)
time.sleep(5)
if os.path.exists("celerybeat-schedule.db"):
logger.info("Deleting old task schedule")
os.remove("celerybeat-schedule.db")

scheduling_worker = f"celery -A augur.tasks.init.celery_app.celery_app worker -l info --concurrency=2 -n scheduling:{uuid.uuid4().hex}@%h -Q scheduling"
core_worker = f"celery -A augur.tasks.init.celery_app.celery_app worker -l info --concurrency=45 -n core:{uuid.uuid4().hex}@%h"
secondary_worker = f"celery -A augur.tasks.init.celery_app.celery_app worker -l info --concurrency=10 -n secondary:{uuid.uuid4().hex}@%h -Q secondary"
facade_worker = f"celery -A augur.tasks.init.celery_app.celery_app worker -l info --concurrency=15 -n facade:{uuid.uuid4().hex}@%h -Q facade"

scheduling_worker_process = subprocess.Popen(scheduling_worker.split(" "))
core_worker_process = subprocess.Popen(core_worker.split(" "))
secondary_worker_process = subprocess.Popen(secondary_worker.split(" "))
facade_worker_process = subprocess.Popen(facade_worker.split(" "))
celery_beat_process = None
celery_command = "celery -A augur.tasks.init.celery_app.celery_app beat -l debug"
celery_beat_process = subprocess.Popen(celery_command.split(" "))

time.sleep(5)
if not disable_collection:


with DatabaseSession(logger) as session:

clean_collection_status(session)
Expand All @@ -120,10 +109,6 @@ def start(disable_collection, development, port):

augur_collection_monitor.si().apply_async()


celery_command = "celery -A augur.tasks.init.celery_app.celery_app beat -l debug"
celery_beat_process = subprocess.Popen(celery_command.split(" "))

else:
logger.info("Collection disabled")

Expand All @@ -135,21 +120,10 @@ def start(disable_collection, development, port):
logger.info("Shutting down server")
server.terminate()

if core_worker_process:
logger.info("Shutting down celery process: core")
core_worker_process.terminate()

if scheduling_worker_process:
logger.info("Shutting down celery process: scheduling")
scheduling_worker_process.terminate()

if secondary_worker_process:
logger.info("Shutting down celery process: secondary")
secondary_worker_process.terminate()

if facade_worker_process:
logger.info("Shutting down celery process: facade")
facade_worker_process.terminate()
logger.info("Shutting down all celery worker processes")
for p in processes:
if p:
p.terminate()

if celery_beat_process:
logger.info("Shutting down celery beat process")
Expand All @@ -162,6 +136,54 @@ def start(disable_collection, development, port):
except RedisConnectionError:
pass

def start_celery_worker_processes(vmem_cap_ratio, disable_collection=False):

#Calculate process scaling based on how much memory is available on the system in bytes.
#Each celery process takes ~500MB or 500 * 1024^2 bytes

process_list = []

#Cap memory usage to 30% of total virtual memory
available_memory_in_bytes = psutil.virtual_memory().total * vmem_cap_ratio
available_memory_in_megabytes = available_memory_in_bytes / (1024 ** 2)
max_process_estimate = available_memory_in_megabytes // 500

#Get a subset of the maximum procesess available using a ratio, not exceeding a maximum value
def determine_worker_processes(ratio,maximum):
return max(min(round(max_process_estimate * ratio),maximum),1)

frontend_worker = f"celery -A augur.tasks.init.celery_app.celery_app worker -l info --concurrency=1 -n frontend:{uuid.uuid4().hex}@%h -Q frontend"
max_process_estimate -= 1
process_list.append(subprocess.Popen(frontend_worker.split(" ")))

if not disable_collection:

#2 processes are always reserved as a baseline.
scheduling_worker = f"celery -A augur.tasks.init.celery_app.celery_app worker -l info --concurrency=2 -n scheduling:{uuid.uuid4().hex}@%h -Q scheduling"
max_process_estimate -= 2
process_list.append(subprocess.Popen(scheduling_worker.split(" ")))

#60% of estimate, Maximum value of 45
core_num_processes = determine_worker_processes(.6, 45)
logger.info(f"Starting core worker processes with concurrency={core_num_processes}")
core_worker = f"celery -A augur.tasks.init.celery_app.celery_app worker -l info --concurrency={core_num_processes} -n core:{uuid.uuid4().hex}@%h"
process_list.append(subprocess.Popen(core_worker.split(" ")))

#20% of estimate, Maximum value of 25
secondary_num_processes = determine_worker_processes(.2, 25)
logger.info(f"Starting secondary worker processes with concurrency={secondary_num_processes}")
secondary_worker = f"celery -A augur.tasks.init.celery_app.celery_app worker -l info --concurrency={secondary_num_processes} -n secondary:{uuid.uuid4().hex}@%h -Q secondary"
process_list.append(subprocess.Popen(secondary_worker.split(" ")))

#15% of estimate, Maximum value of 20
facade_num_processes = determine_worker_processes(.2, 20)
logger.info(f"Starting facade worker processes with concurrency={facade_num_processes}")
facade_worker = f"celery -A augur.tasks.init.celery_app.celery_app worker -l info --concurrency={facade_num_processes} -n facade:{uuid.uuid4().hex}@%h -Q facade"

process_list.append(subprocess.Popen(facade_worker.split(" ")))

return process_list


@cli.command('stop')
def stop():
Expand Down Expand Up @@ -378,7 +400,6 @@ def raise_open_file_limit(num_files):

return


# def initialize_components(augur_app, disable_housekeeper):
# master = None
# manager = None
Expand Down
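The arithmetic inside `start_celery_worker_processes` above can be sketched in isolation. The 16 GiB host and the default `worker_process_vmem_cap` of 0.25 are illustrative assumptions; the ~500 MB-per-process figure comes from the comment in the diff:

```python
def estimate_processes(total_bytes: int, vmem_cap_ratio: float) -> int:
    # Cap usable memory at vmem_cap_ratio of total, then assume each
    # celery process needs roughly 500 MB.
    available_mb = (total_bytes * vmem_cap_ratio) / (1024 ** 2)
    return int(available_mb // 500)

def determine_worker_processes(max_estimate: int, ratio: float, maximum: int) -> int:
    # A ratio-based share of the estimate, clamped to [1, maximum] so every
    # worker gets at least one process.
    return max(min(round(max_estimate * ratio), maximum), 1)

total = 16 * 1024 ** 3                       # hypothetical 16 GiB host
cap = estimate_processes(total, 0.25)        # 4096 MB // 500 -> 8
cap -= 1                                     # frontend worker reservation
cap -= 2                                     # scheduling worker baseline
core = determine_worker_processes(cap, 0.6, 45)       # -> 3
secondary = determine_worker_processes(cap, 0.2, 25)  # -> 1
```

The subtractions mirror the order in the real function: the frontend and scheduling workers are reserved off the top, and only the remainder is split by ratio among core, secondary, and facade workers.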
22 changes: 19 additions & 3 deletions augur/application/cli/user.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,6 @@
import os
import click
import logging
from werkzeug.security import generate_password_hash
from augur.application.db.models import User
from augur.application.db.engine import DatabaseEngine
from sqlalchemy.orm import sessionmaker
Expand Down Expand Up @@ -48,7 +47,7 @@ def add_user(username, email, firstname, lastname, admin, phone_number, password

user = session.query(User).filter(User.login_name == username).first()
if not user:
password = generate_password_hash(password)
password = User.compute_hashsed_password(password)
new_user = User(login_name=username, login_hashword=password, email=email, text_phone=phone_number, first_name=firstname, last_name=lastname, admin=admin, tool_source="User CLI", tool_version=None, data_source="CLI")
session.add(new_user)
session.commit()
Expand All @@ -59,4 +58,21 @@ def add_user(username, email, firstname, lastname, admin, phone_number, password
session.close()
engine.dispose()

return 0
return 0

@cli.command('password_reset', short_help="Reset a user's password")
@click.argument("username")
@click.password_option(help="New password")
def reset_password(username, password):
session = Session()

user = session.query(User).filter(User.login_name == username).first()

if not user:
return click.echo("invalid username")

password = User.compute_hashsed_password(password)
user.login_hashword = password
session.commit()

return click.echo("Password updated")
3 changes: 2 additions & 1 deletion augur/application/config.py
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,8 @@ def get_development_flag():
"log_level": "INFO",
},
"Celery": {
"concurrency": 12
"worker_process_vmem_cap": 0.25,
"refresh_materialized_views_interval_in_days": 7
},
"Redis": {
"cache_group": 0,
Expand Down
38 changes: 30 additions & 8 deletions augur/application/db/models/augur_operations.py
Original file line number Diff line number Diff line change
Expand Up @@ -317,6 +317,17 @@ def get_user(session, username: str):
return user
except NoResultFound:
return None

@staticmethod
def get_by_id(session, user_id: int):

if not isinstance(user_id, int):
return None
try:
user = session.query(User).filter(User.user_id == user_id).one()
return user
except NoResultFound:
return None

@staticmethod
def create_user(username: str, password: str, email: str, first_name:str, last_name:str, admin=False):
Expand All @@ -335,7 +346,7 @@ def create_user(username: str, password: str, email: str, first_name:str, last_n
return False, {"status": "A User already exists with that email"}

try:
user = User(login_name = username, login_hashword = generate_password_hash(password), email = email, first_name = first_name, last_name = last_name, tool_source="User API", tool_version=None, data_source="API", admin=admin)
user = User(login_name = username, login_hashword = User.compute_hashsed_password(password), email = email, first_name = first_name, last_name = last_name, tool_source="User API", tool_version=None, data_source="API", admin=admin)
session.add(user)
session.commit()

Expand Down Expand Up @@ -373,7 +384,7 @@ def update_password(self, session, old_password, new_password):
if not check_password_hash(self.login_hashword, old_password):
return False, {"status": "Password did not match users password"}

self.login_hashword = generate_password_hash(new_password)
self.login_hashword = User.compute_hashsed_password(new_password)
session.commit()

return True, {"status": "Password updated"}
Expand Down Expand Up @@ -429,9 +440,12 @@ def remove_group(self, group_name):
def add_repo(self, group_name, repo_url):

from augur.tasks.github.util.github_task_session import GithubTaskSession

with GithubTaskSession(logger) as session:
result = UserRepo.add(session, repo_url, self.user_id, group_name)
from augur.tasks.github.util.github_api_key_handler import NoValidKeysError
try:
with GithubTaskSession(logger) as session:
result = UserRepo.add(session, repo_url, self.user_id, group_name)
except NoValidKeysError:
return False, {"status": "No valid keys"}

return result

Expand All @@ -445,9 +459,13 @@ def remove_repo(self, group_name, repo_id):
def add_org(self, group_name, org_url):

from augur.tasks.github.util.github_task_session import GithubTaskSession
from augur.tasks.github.util.github_api_key_handler import NoValidKeysError

with GithubTaskSession(logger) as session:
result = UserRepo.add_org_repos(session, org_url, self.user_id, group_name)
try:
with GithubTaskSession(logger) as session:
result = UserRepo.add_org_repos(session, org_url, self.user_id, group_name)
except NoValidKeysError:
return False, {"status": "No valid keys"}

return result

Expand Down Expand Up @@ -578,6 +596,10 @@ def get_favorite_groups(self, session):
return None, {"status": "Error when trying to get favorite groups"}

return groups, {"status": "Success"}

@staticmethod
def compute_hashsed_password(password):
Review comment (Contributor): "Minor spelling issue, should probably be compute_hashed_password. Not a super serious thing though."

return generate_password_hash(password, method='pbkdf2:sha512', salt_length=32)



Expand Down Expand Up @@ -864,7 +886,7 @@ def add_org_repos(session, url: List[str], user_id: int, group_name: int):

# if it doesn't exist create one
if not repo_group:
repo_group = RepoGroup(rg_name=owner, rg_description="", rg_website="", rg_recache=0, rg_type="Unknown",
repo_group = RepoGroup(rg_name=owner.lower(), rg_description="", rg_website="", rg_recache=0, rg_type="Unknown",
tool_source="Loaded by user", tool_version="1.0", data_source="Git")
session.add(repo_group)
session.commit()
Expand Down
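The new `compute_hashsed_password` above calls werkzeug with `method='pbkdf2:sha512'` and `salt_length=32`, strengthening the previous default. As an illustrative sketch (not Augur's actual code) of what that scheme does under the hood, using only the standard library — iteration count and string format here are assumptions for the example:

```python
import hashlib
import hmac
import os

def hash_password(password: str, iterations: int = 600_000) -> str:
    # PBKDF2-HMAC-SHA512 with a random 32-byte salt, stored alongside
    # the digest so the hash can be verified later.
    salt = os.urandom(32)
    digest = hashlib.pbkdf2_hmac("sha512", password.encode(), salt, iterations)
    return f"pbkdf2:sha512:{iterations}${salt.hex()}${digest.hex()}"

def verify_password(stored: str, candidate: str) -> bool:
    method, salt_hex, digest_hex = stored.split("$")
    iterations = int(method.rsplit(":", 1)[1])
    digest = hashlib.pbkdf2_hmac("sha512", candidate.encode(),
                                 bytes.fromhex(salt_hex), iterations)
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(digest.hex(), digest_hex)
```

Because the salt is random per hash, two calls with the same password yield different strings; verification re-derives the digest from the stored salt and iteration count.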
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ def total_facade_reset():

shutil.rmtree(path)
#Create path
path.touch()
path.mkdir()
#Move credentials in
shutil.move("/tmp/.git-credentials",f"{facade_base_dir}.git-credentials")

Expand Down
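The one-line fix above matters because `Path.touch()` creates an empty *file* while `Path.mkdir()` creates a *directory*; recreating the facade base path with `touch()` would leave a file that repositories cannot be cloned into. A minimal demonstration (paths are illustrative):

```python
import tempfile
from pathlib import Path

base = Path(tempfile.mkdtemp())

wrong = base / "repos-as-file"
wrong.touch()                 # creates an empty file, not a directory
assert wrong.is_file() and not wrong.is_dir()

right = base / "repos-as-dir"
right.mkdir()                 # creates a directory, as the reset intends
assert right.is_dir()
```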