Add make sprint-data-import and issue-data-import to import github sprint and issue data to database #84
analytics/config.py
Outdated
Validator("SLACK_BOT_TOKEN", must_exist=False),  # disabled for testing
Validator("REPORTING_CHANNEL_ID", must_exist=False),  # disabled for testing
Billy wrote some end-to-end tests that require these tokens, so unfortunately these validators need to stay enabled if we want to run the full end-to-end test suite.
That said, I don't think we should ever have written end-to-end tests that require these tokens to be set, so I'm willing to accept this change.
analytics/Makefile
Outdated
sprint-db-data-import:
	@echo "=> Importing project data to the database"
	@echo "====================================================="
	$(POETRY) analytics export db_export \
⭐⭐⭐ (must change before approval) The Makefile target says db-import, whereas the Python command is db-export. You'll want to pick one or the other. I'm thinking import?
I'm also thinking import
Please update with import. Please also create a separate import app:
import_app = typer.Typer()
...
app.add_typer(import_app, name="import", help="Import data into the database")
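A minimal sketch of how the suggested import sub-app could be wired up and exercised. The command name `db_import`, its two options, and the placeholder body are assumptions for illustration, not the code that was merged; only `import_app` and the `add_typer` call come from the comment above.

```python
import typer
from typer.testing import CliRunner

app = typer.Typer()
import_app = typer.Typer()
app.add_typer(import_app, name="import", help="Import data into the database")


# name= is set explicitly because Typer would otherwise expose the
# function db_import under the hyphenated command name "db-import".
@import_app.command(name="db_import")
def db_import(
    sprint_file: str = typer.Option(..., help="Path to the exported sprint JSON"),
    issue_file: str = typer.Option(..., help="Path to the exported issue JSON"),
) -> None:
    """Hypothetical placeholder: a real command would load the files and write rows."""
    typer.echo(f"Importing {sprint_file} and {issue_file}")


# Invoke the CLI in-process, mirroring `analytics import db_import ...`.
runner = CliRunner()
result = runner.invoke(
    app,
    [
        "import",
        "db_import",
        "--sprint-file",
        "data/sprint-data.json",
        "--issue-file",
        "data/issue-data.json",
    ],
)
print(result.output)
```

Keeping the import commands in their own sub-app means the Makefile target and the Python command can both use the word "import", which is the mismatch flagged in this thread.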
Looks good, can you fix the lint test?
When I run this locally I get:
(analytics)(import-sprint-data)$ make sprint-db-data-import
=> Importing project data to the database
=====================================================
docker-compose run -e GH_TOKEN --rm grants-analytics poetry run analytics export db_export \
--owner HHS \
--project 13 \
--sprint-file data/sprint-data.json \
--issue-file data/issue-data.json
WARN[0000] /Users/partisan/workshop/grantsgov/nava-simpler/analytics/docker-compose.yml: `version` is obsolete
[+] Creating 1/0
✔ Container grants-analytics-db Running 0.0s
Warning: 'analytics' is an entry point defined in pyproject.toml, but it's not installed as a script. You may get improper `sys.argv[0]`.
The support to run uninstalled scripts will be removed in a future release.
Run `poetry install` to resolve and get rid of this message.
Usage: analytics export db_export [OPTIONS]
Try 'analytics export db_export --help' for help.
╭─ Error ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ No such option: --owner │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
make: *** [sprint-db-data-import] Error 2
The issue data export is missing, unless you plan to do another PR for that one.
analytics/src/analytics/cli.py
Outdated
)

BaseDataset.to_sql(
    output_table=task_data,
The `output_table` argument is a string for the table name. Can you recheck this locally? Did you have a working version here?
@acouch it worked in the sense that the command ran locally, but I haven't been able to find the data in the database
Try instantiating the class, then applying the `.to_sql` method:
deliverables = DeliverableTasks.load_from_json_files(
    sprint_file=sprint_file,
    issue_file=issue_file,
)
deliverables.to_sql(
    output_table="github_project_data",
    engine=connection,
    replace_table=True,
)
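The pattern in this suggestion (build an instance, then call `to_sql` with a table name string) can be sketched with the standard library alone. The `Dataset` class, its columns, and the in-memory SQLite database below are hypothetical stand-ins, not the project's `BaseDataset` or its Postgres connection. The final query is one way to confirm the rows actually landed in the table.

```python
import sqlite3


class Dataset:
    """Hypothetical stand-in for a dataset class with an instance-level to_sql."""

    def __init__(self, rows: list[tuple[str, int]]) -> None:
        self.rows = rows

    @classmethod
    def load_from_rows(cls, rows: list[tuple[str, int]]) -> "Dataset":
        # Plays the role of a loader such as load_from_json_files.
        return cls(list(rows))

    def to_sql(
        self,
        output_table: str,
        connection: sqlite3.Connection,
        replace_table: bool = False,
    ) -> int:
        # output_table is the table *name*, not the data to write.
        if not output_table.isidentifier():
            raise ValueError(f"invalid table name: {output_table!r}")
        with connection:  # commits on success, rolls back on error
            if replace_table:
                connection.execute(f'DROP TABLE IF EXISTS "{output_table}"')
            connection.execute(
                f'CREATE TABLE IF NOT EXISTS "{output_table}" '
                "(title TEXT, points INTEGER)"
            )
            connection.executemany(
                f'INSERT INTO "{output_table}" VALUES (?, ?)', self.rows
            )
        return len(self.rows)


connection = sqlite3.connect(":memory:")
tasks = Dataset.load_from_rows([("Add import command", 3), ("Fix lint", 1)])
tasks.to_sql(
    output_table="github_project_data",
    connection=connection,
    replace_table=True,
)

# Verify the write by reading the table back.
row_count = connection.execute(
    "SELECT COUNT(*) FROM github_project_data"
).fetchone()[0]
print(row_count)
```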
analytics/poetry.lock
Outdated
Make sure to add an update to the pyproject.toml if you are updating the poetry.lock file.
Expected Failures:
The tests expect Slack auth. Turning the validators off ended up interrupting the full suite.
👍🏼! But you probably want to follow up to disable those failing tests.
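One possible shape for that follow-up is to skip the Slack-dependent tests when the token is absent, rather than letting them fail. The sketch below uses the standard library's `unittest` so it is self-contained; the test class and test name are hypothetical, and the project's own suite may use a different runner with its own skip mechanism.

```python
import os
import unittest

# True only when a Slack token is present in the environment.
HAS_SLACK_TOKEN = bool(os.environ.get("SLACK_BOT_TOKEN"))


class TestSlackEndToEnd(unittest.TestCase):
    """Hypothetical end-to-end test that needs real Slack credentials."""

    @unittest.skipUnless(HAS_SLACK_TOKEN, "SLACK_BOT_TOKEN is not set")
    def test_post_report(self) -> None:
        # Placeholder assertion; a real test would call the Slack client here.
        self.assertTrue(HAS_SLACK_TOKEN)


# Run the suite in-process; a skipped test still counts as a successful run.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSlackEndToEnd)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```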
@@ -1,4 +1,4 @@
 POSTGRES_NAME = "app"
-POSTGRES_HOST = "0.0.0.0"
+POSTGRES_HOST = "grants-analytics-db"
 POSTGRES_USER = "app"
 POSTGRES_PORT = 5432
I believe you would need a password here as well, can you confirm?
There's a `secrets.toml` that's in the `.gitignore` that has the password.
👍🏼 !
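To illustrate how the committed settings and the uncommitted password might come together, here is a rough sketch that builds a connection URL from the four values in the diff plus a secret read at runtime. The `DbSettings` class, the `POSTGRES_PASSWORD` variable name, the fallback value, and the plain `postgresql://` scheme are all assumptions for illustration; the project itself keeps the password in `secrets.toml`.

```python
import os
from dataclasses import dataclass
from urllib.parse import quote_plus


@dataclass(frozen=True)
class DbSettings:
    """Non-secret settings, using the values shown in the diff."""

    name: str = "app"
    host: str = "grants-analytics-db"
    user: str = "app"
    port: int = 5432

    def url(self, password: str) -> str:
        # Percent-encode the password so characters like "@" or ":" cannot
        # be misread as URL delimiters.
        return (
            f"postgresql://{self.user}:{quote_plus(password)}"
            f"@{self.host}:{self.port}/{self.name}"
        )


# POSTGRES_PASSWORD is a hypothetical variable name; the fallback is a
# dummy value for local experimentation only.
password = os.environ.get("POSTGRES_PASSWORD", "example-p@ss")
db_url = DbSettings().url(password)
```

Keeping the password out of the settings object means the committed defaults stay safe to share, while the secret is supplied only where the code runs.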
Thanks for the updates. Can you add the
#113)

## Summary
Fixes #100

### Time to review: __1 mins__

## Changes proposed
* Added documentation about local database import

## Context for reviewers
> The current analytics documentation is focused on the slack integration. This task is to add the work from #84 to the documentation and include the local Metabase steps.

## Additional information
> Screenshots, GIF demos, code examples or output to help show the changes working as expected.
…Analytics (#136)

## Summary
Fixes #107
Fixes #115

### Time to review: __5 mins__

## Changes proposed
* removed the `*.toml` files related to dynaconf
* removed references to Dynaconf (e.g. in Docstrings, gitignore)
* use Pydantic for loading

## Context for reviewers
> After getting feedback for #107, the consensus was to reevaluate the way the database loader works for more uniformity. Changes will need to be made primarily in db.py and cli.py
>
> With the PR #84, the env settings for the db are stored in settings.toml. The config settings should be updated to use the existing local.env file

## Additional information
> Screenshots, GIF demos, code examples or output to help show the changes working as expected.
…rint and issue data to database (#84)

Fixes #46

* added `sprint-db-data-import` to Makefile
* added `export_json_to_database`

> One strategy would be to keep the make sprint-data-export and issue-data-export and create make sprint-db-data-import and issue-data-db-import so that the data is exported to JSON and then imported into the database.
>
> A single make command could then be created to run the export and then import files.

Sample data in database
<img width="1133" alt="Screen Shot 2024-06-26 at 3 38 47 PM" src="https://github.com/navapbc/simpler-grants-gov/assets/37313082/34c962d6-a78e-4963-be15-ef0f7de3bccf">
Summary
Fixes #46
Time to review: x mins
Changes proposed
* added `sprint-db-data-import` to Makefile
* added `export_json_to_database`
Context for reviewers
Additional information
Sample data in database