Create script to pull updates from OpenAlex every day #426

Open
hubsmoke opened this issue Jul 12, 2024 · 0 comments

Create a standalone project containing scripts that pull from the OpenAlex API (filtered by date) and import all entities into a Postgres database. Ensure the job runs exactly once per day and doesn't miss any records updated since the last pull. Keep track of when records were pulled, for example by adding an import time to the DB, or by adding an ImportLog table and marking each entity with an ImportLogId.
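One way to avoid gaps between pulls is to record each run's time window in the ImportLog table and start the next window where the last successful one ended. The sketch below is only an illustration: the `import_log` table, its columns, and all function names are hypothetical, and `sqlite3` stands in for Postgres so the example runs without credentials.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical ImportLog table (assumption, not part of the OpenAlex schema).
# sqlite3 is used as a stand-in for Postgres.
SCHEMA = """
CREATE TABLE IF NOT EXISTS import_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    window_start TEXT NOT NULL,
    window_end TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'running',
    finished_at TEXT
);
"""


def init_db(conn):
    conn.executescript(SCHEMA)


def next_window(conn, now, default_start):
    """Return (start, end) for the next pull.

    The window starts where the last *successful* pull ended, so a failed
    or skipped run is automatically covered by the following one.
    """
    row = conn.execute(
        "SELECT MAX(window_end) FROM import_log WHERE status = 'success'"
    ).fetchone()
    start = datetime.fromisoformat(row[0]) if row[0] else default_start
    return start, now


def start_import(conn, start, end):
    """Insert a 'running' log row and return its id (the ImportLogId)."""
    cur = conn.execute(
        "INSERT INTO import_log (window_start, window_end) VALUES (?, ?)",
        (start.isoformat(), end.isoformat()),
    )
    conn.commit()
    return cur.lastrowid


def finish_import(conn, import_log_id, ok):
    """Mark a run as 'success' or 'failed'."""
    finished = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "UPDATE import_log SET status = ?, finished_at = ? WHERE id = ?",
        ("success" if ok else "failed", finished, import_log_id),
    )
    conn.commit()
```

In a Postgres version the timestamp columns would more naturally be `TIMESTAMPTZ`, and each entity table could carry an `import_log_id` column pointing at the row created by `start_import`.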

Use the OpenAlex API date filter and import each entity into the Postgres DB.

https://docs.openalex.org/how-to-use-the-api/get-lists-of-entities/filter-entity-lists (requires a premium subscription; we are getting a quote)
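A date-filtered pull could look roughly like the sketch below. It assumes the `from_updated_date` filter, the `per-page` and `cursor` query parameters, and the `meta.next_cursor` / `results` response fields as I understand them from the OpenAlex docs; please verify each against the linked page for every entity type. The helper names (`build_url`, `fetch_json`, `iter_entities`) are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.openalex.org"


def build_url(entity, from_date, cursor="*", api_key=None, per_page=200):
    """Build a list URL for one entity type, filtered by update date."""
    params = {
        "filter": f"from_updated_date:{from_date}",  # assumed filter name
        "per-page": per_page,
        "cursor": cursor,
    }
    if api_key:
        params["api_key"] = api_key  # premium key, once we have one
    # Keep ':' and '*' readable instead of percent-encoding them.
    return f"{BASE_URL}/{entity}?{urlencode(params, safe=':*')}"


def fetch_json(url):
    """Default fetcher: plain HTTP GET returning the decoded JSON body."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def iter_entities(entity, from_date, api_key=None, fetch=fetch_json):
    """Yield every record updated since from_date, following the cursor."""
    cursor = "*"
    while cursor:
        page = fetch(build_url(entity, from_date, cursor, api_key))
        yield from page["results"]
        # Assumed: next_cursor is null on the last page, which ends the loop.
        cursor = page["meta"].get("next_cursor")
```

The `fetch` parameter exists so the pagination loop can be tested with canned pages, without hitting the API or needing the premium key.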

The Postgres DB structure is as follows: https://github.com/ourresearch/openalex-documentation-scripts/blob/main/openalex-pg-schema.sql

Ask sina for the DB credentials.

Feel free to modify the schema by adding fields/tables, and integrate Prisma if helpful.

The OpenAlex API allows us to import on an hourly basis if we upgrade to premium. Keep in mind that we may want to enable hourly updates in the future, so avoid hard-coding the daily interval.
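To keep the door open for hourly updates, the pull interval can be a parameter rather than a constant. A minimal sketch, with a hypothetical `split_windows` helper that cuts any time range into consecutive windows of a given size:

```python
from datetime import datetime, timedelta, timezone


def split_windows(start, end, step):
    """Yield consecutive half-open [a, b) windows covering start..end.

    The last window is shortened so it never runs past `end`.
    """
    a = start
    while a < end:
        b = min(a + step, end)
        yield a, b
        a = b


# Daily today; switching to hourly later would only change `step`.
DAILY = timedelta(days=1)
HOURLY = timedelta(hours=1)
```

Because the windows are contiguous and half-open, a catch-up run after downtime covers the whole gap without overlapping or skipping any interval.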
