Codebase git mirroring #718

Open. Wants to merge 29 commits into base: main
29 commits
e4a1d39 feat: add model for tracking codebase git mirror (sgfost, May 7, 2024)
78133ea feat: add git repo file system api (sgfost, May 7, 2024)
9919e8c test: add coverage for git repo api (sgfost, Jul 30, 2024)
d5aed53 feat: add citation.cff and LICENSE to archive packages and git repos (sgfost, Aug 6, 2024)
578a01d fix: only format/access license text when needed (sgfost, Aug 7, 2024)
f02b46b feat(WIP): basic github mirroring (sgfost, Aug 16, 2024)
c364e67 fix: use proper default for integer env settings (sgfost, Sep 4, 2024)
1b28eb6 fix: idempotency for actions in git fs api and github api (sgfost, Sep 6, 2024)
baf66c1 feat(WIP): add huey for async task processing (sgfost, Sep 6, 2024)
dfb41c8 refactor: use huey task for github mirroring (sgfost, Sep 6, 2024)
eadb6f7 feat: update github mirror on publish (sgfost, Sep 6, 2024)
7b55d18 build: run huey consumer as a runit deamon in the server service (sgfost, Sep 11, 2024)
f4f4363 feat: better mirroring UI + repo name validation (sgfost, Sep 18, 2024)
5d89cc5 refactor: scrap metadata transformers, generate cff from codemeta (sgfost, Oct 3, 2024)
01e3616 feat: add feature page for github integration (sgfost, Oct 3, 2024)
230decc refactor: only make private repos when settings.DEBUG is True (sgfost, Oct 4, 2024)
97f7e52 build: dynamically add vue apps to vite config (sgfost, Oct 10, 2024)
a5ce3b3 refactor: simplify gh mirroring task (sgfost, Oct 15, 2024)
4bca99e refactor: rename github integration app configs for clarity (sgfost, Oct 18, 2024)
4446af3 refactor: minor adjustment to github integration UI (sgfost, Oct 25, 2024)
17b0532 fix: indent/format codemeta.json (sgfost, Oct 29, 2024)
e6fd0d1 chore: append git mirroring migrations (sgfost, Dec 5, 2024)
ad57b42 refactor(WIP): codemeta generation to use codemeticulous (sgfost, Dec 13, 2024)
f2cfcea feat: use release branches for codebase git repos (sgfost, Dec 14, 2024)
8687193 feat: cache codemeta json string for quicker page loads (sgfost, Dec 21, 2024)
40d5a47 feat: check for readme in docs to place in git repo root (sgfost, Jan 1, 2025)
c40198e refactor: improved api for metadata building (sgfost, Jan 3, 2025)
51328bd fix: prevent double save for codebase and release (sgfost, Jan 4, 2025)
0ecf5bd refactor(WIP): build codebase/release metadata on save (sgfost, Jan 4, 2025)

2 changes: 1 addition & 1 deletion Makefile
@@ -12,7 +12,7 @@ SECRETS_DIR=${BUILD_DIR}/secrets
 DB_PASSWORD_PATH=${SECRETS_DIR}/db_password
 PGPASS_PATH=${SECRETS_DIR}/.pgpass
 SECRET_KEY_PATH=${SECRETS_DIR}/django_secret_key
-EXT_SECRETS=hcaptcha_secret github_client_secret orcid_client_secret discourse_api_key discourse_sso_secret mail_api_key datacite_api_password
+EXT_SECRETS=hcaptcha_secret github_client_secret orcid_client_secret discourse_api_key discourse_sso_secret mail_api_key datacite_api_password github_integration_app_private_key github_integration_app_client_secret
 GENERATED_SECRETS=$(DB_PASSWORD_PATH) $(PGPASS_PATH) $(SECRET_KEY_PATH)
 
 ENVREPLACE := deploy/scripts/envreplace
6 changes: 6 additions & 0 deletions base.yml
@@ -67,6 +67,8 @@ services:
       - django_secret_key
       - github_client_secret
       - orcid_client_secret
+      - github_integration_app_private_key
+      - github_integration_app_client_secret
       - hcaptcha_secret
       - mail_api_key
     volumes:
@@ -98,6 +100,10 @@ secrets:
     file: ./build/secrets/django_secret_key
   github_client_secret:
     file: ./build/secrets/github_client_secret
+  github_integration_app_private_key:
+    file: ./build/secrets/github_integration_app_private_key
+  github_integration_app_client_secret:
+    file: ./build/secrets/github_integration_app_client_secret
   hcaptcha_secret:
     file: ./build/secrets/hcaptcha_secret
   mail_api_key:
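For readers following how these two new secrets reach Django: Docker Compose mounts each file-based secret inside the container at `/run/secrets/<name>`, and the settings diff in this PR reads them through a `read_secret` helper whose implementation is not part of the diff. The sketch below is a hypothetical stand-in for that helper, not the project's code; the `secrets_dir` and `default` parameters and the fall-back-on-missing behavior are assumptions made for illustration.

```python
from pathlib import Path


def read_secret(name, secrets_dir="/run/secrets", default=""):
    """Hypothetical stand-in for the project's read_secret helper."""
    secret_path = Path(secrets_dir) / name
    try:
        # strip the trailing newline that editors and `echo` usually append
        return secret_path.read_text().strip()
    except FileNotFoundError:
        # assumption: a missing secret yields the default instead of raising
        return default
```

With a helper shaped like this, a deployment that has not configured the GitHub integration app would simply see empty strings for the private key and client secret.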
6 changes: 6 additions & 0 deletions deploy/conf/.env.template
@@ -41,6 +41,12 @@ ORCID_CLIENT_ID=
 DATACITE_API_USERNAME=
 DATACITE_DRY_RUN="true" # allowed values: "true" or "false"
 
+# github integration app
+GITHUB_INTEGRATION_APP_ID=
+GITHUB_INTEGRATION_APP_INSTALLATION_ID=
+GITHUB_INTEGRATION_APP_CLIENT_ID=
+GITHUB_MODEL_LIBRARY_ORG_NAME=
+
 # test
 TEST_USER_ID=10000000
 TEST_USERNAME=__test_user__
14 changes: 11 additions & 3 deletions django/Dockerfile
@@ -40,9 +40,9 @@ RUN --mount=type=cache,target=/var/lib/apt,sharing=locked \
         unzip \
     && update-alternatives --install /usr/bin/python python /usr/bin/python3 1000 \
     && apt-get upgrade -q -y -o Dpkg::Options::="--force-confold" \
-    && mkdir -p /etc/service/django \
-    && touch /etc/service/django/run /etc/postgresql-backup-pre \
-    && chmod a+x /etc/service/django/run /etc/postgresql-backup-pre \
+    && mkdir -p /etc/service/django /etc/service/huey \
+    && touch /etc/service/django/run /etc/service/huey/run /etc/postgresql-backup-pre \
+    && chmod a+x /etc/service/django/run /etc/service/huey/run /etc/postgresql-backup-pre \
     && apt-get autoremove -y && apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
 
 WORKDIR /code
@@ -58,5 +58,13 @@ COPY ./deploy/cron.weekly/* /etc/cron.weekly/
 COPY ./deploy/db/autopostgresqlbackup.conf /etc/default/autopostgresqlbackup
 COPY ./deploy/db/postgresql-backup-pre /etc/
 COPY ${RUN_SCRIPT} /etc/service/django/run
+COPY ./deploy/huey.sh /etc/service/huey/run
 COPY . /code
+
+# FIXME: replace with install from pypi
+# upgrading pip because of some bug with the debian patched version
+RUN python3 -m pip install --upgrade pip
+RUN pip3 install git+https://github.com/sgfost/codemeticulous.git
+RUN pip3 install -r /tmp/requirements.txt
+
 CMD ["/sbin/my_init"]
13 changes: 13 additions & 0 deletions django/core/huey.py
@@ -0,0 +1,13 @@
+from django_redis import get_redis_connection
+from huey import RedisHuey
+
+
+class DjangoRedisHuey(RedisHuey):
+    """Huey subclass that uses the existing connection pool
+    from the django-redis cache backend
+    """
+
+    def __init__(self, *args, **kwargs):
+        connection = get_redis_connection("default")
+        kwargs["connection_pool"] = connection.connection_pool
+        super().__init__(*args, **kwargs)
8 changes: 8 additions & 0 deletions django/core/models.py
@@ -464,6 +464,14 @@ def github_url(self):
         """
         return self.get_social_account_profile_url("github")
 
+    @property
+    def github_username(self):
+        github_account = self.get_social_account("github")
+        if github_account:
+            return github_account.extra_data.get("login")
+        else:
+            return None
+
     def get_social_account_profile_url(self, provider_name):
         social_acct = self.get_social_account(provider_name)
         if social_acct:
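The new `github_username` property has three outcomes: a linked account with a `login` entry, a linked account without one, and no linked account at all. A minimal sketch of that lookup outside Django follows. `FakeSocialAccount` is a hypothetical stand-in for the social account object returned by `get_social_account`, and the free function takes the place of the property so the snippet runs on its own.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeSocialAccount:
    """Hypothetical stand-in for a linked social account record."""

    provider: str
    extra_data: dict = field(default_factory=dict)


def github_username(account: Optional[FakeSocialAccount]) -> Optional[str]:
    # same branching as the property: no account, or no "login" key, gives None
    if account:
        return account.extra_data.get("login")
    return None
```

Because `dict.get` already returns `None` for a missing key, callers only need a single `None` check to cover both failure cases.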
4 changes: 1 addition & 3 deletions django/core/serializers.py
@@ -69,10 +69,8 @@ def create(model_cls, validated_data, context):
 
 def update(serializer_update, instance, validated_data):
     tags = TagSerializer(many=True, data=validated_data.pop("tags"))
-    instance = serializer_update(instance, validated_data)
     set_tags(instance, tags)
-    instance.save()
-    return instance
+    return serializer_update(instance, validated_data)
 
 
 class EditableSerializerMixin(serializers.Serializer):
33 changes: 31 additions & 2 deletions django/core/settings/defaults.py
@@ -118,6 +118,7 @@ def is_test(self):
     "django_extensions",
     "django_vite",
     "guardian",
+    "huey.contrib.djhuey",
     "rest_framework",
     "rest_framework_swagger",
     "robots",
@@ -396,6 +397,11 @@ def is_test(self):
             "handlers": ["console", "comsesfile"],
             "propagate": False,
         },
+        "huey": {
+            "level": "INFO",
+            "handlers": ["console", "comsesfile"],
+            "propagate": False,
+        },
     },
 }
 
@@ -479,10 +485,18 @@ def is_test(self):
         "LOCATION": "unix:///shared/redis/redis.sock",
         "OPTIONS": {
             "CLIENT_CLASS": "django_redis.client.DefaultClient",
+            "CONNECTION_POOL_KWARGS": {"max_connections": 20},
         },
     }
 }
 
+HUEY = {
+    "name": "comses",
+    "huey_class": "core.huey.DjangoRedisHuey",
+    "immediate": False,  # always run tasks in the background, even in dev (for now)
+    # if removed here, it will default to DEBUG
+}
+
 # SSO, user registration, and django-allauth configuration, see
 # https://django-allauth.readthedocs.io/en/latest/configuration.html
 # ACCOUNT_ADAPTER = 'core.adapter.AccountAdapter'
@@ -501,12 +515,27 @@ def is_test(self):
 ACCOUNT_CHANGE_EMAIL = True
 
 ORCID_CLIENT_ID = os.getenv("ORCID_CLIENT_ID", "")
 
 ORCID_CLIENT_SECRET = read_secret("orcid_client_secret")
 
 GITHUB_CLIENT_ID = os.getenv("GITHUB_CLIENT_ID", "")
 GITHUB_CLIENT_SECRET = read_secret("github_client_secret")
 
+GITHUB_INTEGRATION_APP_ID = int(os.getenv("GITHUB_INTEGRATION_APP_ID") or 0)
+GITHUB_INTEGRATION_APP_PRIVATE_KEY = read_secret("github_integration_app_private_key")
+GITHUB_INTEGRATION_APP_INSTALLATION_ID = int(
+    os.getenv("GITHUB_INTEGRATION_APP_INSTALLATION_ID") or 0
+)
+# client id and secret are only used for getting user access tokens to be able to push
+# to the user's repositories. We are not re-using the regular oauth app in order to
+# keep minimal permissions
+GITHUB_INTEGRATION_APP_CLIENT_ID = os.getenv("GITHUB_INTEGRATION_APP_CLIENT_ID", "")
+GITHUB_INTEGRATION_APP_CLIENT_SECRET = read_secret(
+    "github_integration_app_client_secret"
+)
+GITHUB_MODEL_LIBRARY_ORG_NAME = os.getenv("GITHUB_MODEL_LIBRARY_ORG_NAME", "")
+GITHUB_INDIVIDUAL_FILE_SIZE_LIMIT = os.getenv(
+    "GITHUB_INDIVIDUAL_FILE_SIZE_LIMIT", 100 * 1024 * 1024
+)
 
 TEST_BASIC_AUTH_PASSWORD = os.getenv("TEST_BASIC_AUTH_PASSWORD", "test password")
 TEST_USER_ID = os.getenv("TEST_USER_ID", 1000000)
 TEST_USERNAME = os.getenv("TEST_USERNAME", "__test_user__")
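A note on the `int(os.getenv(...) or 0)` pattern used for the app and installation IDs: `.env.template` ships these variables blank, so the variable can exist but hold an empty string. In that case `os.getenv(name, 0)` returns `""` rather than the default, and `int("")` raises `ValueError`. The `or` form covers both the unset and the blank case. A minimal sketch, where `env_int` is a hypothetical helper name and not something this PR defines:

```python
import os


def env_int(name, default=0):
    """Read an integer setting; unset or blank values fall back to default."""
    # os.getenv returns None when unset and "" when set but blank;
    # both are falsy, so `or` substitutes the default before int() runs
    return int(os.getenv(name) or default)
```

The same consideration may apply to `GITHUB_INDIVIDUAL_FILE_SIZE_LIMIT`: `os.getenv` returns a string whenever the variable is set, so that setting would be an `int` by default but a `str` when overridden from the environment.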
5 changes: 5 additions & 0 deletions django/core/settings/staging.py
@@ -181,5 +181,10 @@
             "handlers": ["comsesfile"],
             "propagate": False,
         },
+        "huey": {
+            "level": "WARNING",
+            "handlers": ["comsesfile"],
+            "propagate": False,
+        },
     },
 }
2 changes: 1 addition & 1 deletion django/core/settings/test.py
@@ -18,7 +18,7 @@
 SHARE_DIR = path.realpath("library/tests/tmp")
 LIBRARY_ROOT = path.join(SHARE_DIR, "library")
 LIBRARY_PREVIOUS_ROOT = path.join(SHARE_DIR, ".latest")
-REPOSITORY_ROOT = path.join(BASE_DIR, "repository")
+REPOSITORY_ROOT = path.join(SHARE_DIR, "repository")
 BACKUP_ROOT = path.join(SHARE_DIR, "backups")
 BORG_ROOT = path.join(BACKUP_ROOT, "repo")
 EXTRACT_ROOT = path.join(SHARE_DIR, "extract")
8 changes: 8 additions & 0 deletions django/core/tests/base.py
@@ -173,5 +173,13 @@ def initialize_test_shared_folders():
     )
 
 
+def clear_test_shared_folder(dir=settings.REPOSITORY_ROOT):
+    for fs in os.scandir(dir):
Review comment from alee (Member), Dec 12, 2024:

is this cleaner if we move to pathlib like the other file where you used iterdir()? e.g.,

folder = Path(dir)
for item in folder.iterdir():
    if item.is_file():
        item.unlink()
    elif item.is_dir():
        shutil.rmtree(item, ignore_errors=True)

+        if fs.is_dir():
+            shutil.rmtree(os.path.join(dir, fs.name), ignore_errors=True)
+        elif fs.is_file():
+            os.remove(os.path.join(dir, fs.name))
+
+
 def destroy_test_shared_folders():
     shutil.rmtree(settings.SHARE_DIR, ignore_errors=True)
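To make the pathlib suggestion in the review comment easier to try out, here is a self-contained sketch of it. The function name `clear_folder` is hypothetical, and the folder is passed explicitly rather than defaulting to `settings.REPOSITORY_ROOT`, so the snippet runs without Django.

```python
import shutil
from pathlib import Path


def clear_folder(folder):
    """Remove everything inside folder but keep the folder itself."""
    for item in Path(folder).iterdir():
        if item.is_file():
            # plain files (and symlinks to files) are unlinked directly
            item.unlink()
        elif item.is_dir():
            # directories are removed recursively, ignoring errors as in the PR
            shutil.rmtree(item, ignore_errors=True)
```

One design point that may be worth checking in the PR version: a default argument such as `dir=settings.REPOSITORY_ROOT` is evaluated once, when the function is defined, so a later settings override in a test would not change which folder gets cleared unless the path is passed explicitly.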
4 changes: 2 additions & 2 deletions django/curator/tests/test_dump_restore.py
@@ -19,7 +19,7 @@
 from core.tests.base import EventFactory, JobFactory
 from library.fs import import_archive
 from library.models import Codebase
-from library.tests.base import CodebaseFactory
+from library.tests.base import CodebaseFactory, TEST_SAMPLES_DIR
 
 logger = logging.getLogger(__name__)
 
@@ -51,7 +51,7 @@ def setUp(self):
         fs_api = self.release.get_fs_api()
         import_archive(
             codebase_release=self.release,
-            nested_code_folder_name="library/tests/archives/nestedcode",
+            nested_code_folder_name=TEST_SAMPLES_DIR / "archives" / "nestedcode",
             fs_api=fs_api,
         )
 
2 changes: 2 additions & 0 deletions django/deploy/huey.sh
@@ -0,0 +1,2 @@
+#!/bin/bash
+exec /code/manage.py run_huey
2 changes: 1 addition & 1 deletion django/home/apps.py
@@ -17,4 +17,4 @@ def ready(self):
         """
         from . import signals
 
-        logger.debug("fully loaded signals: %s", signals)
+        logger.debug("fully loaded signals: %s for app: %s", signals, self.name)