Skip to content
This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

Add make sprint-data-import and issue-data-import to import github sprint and issue data to database #84

Merged
merged 14 commits into from
Jun 26, 2024

Conversation

aplybeah
Copy link

@aplybeah aplybeah commented Jun 14, 2024

Summary

Fixes #46

Time to review: x mins

Changes proposed

  • added sprint-db-data-import to Makefile
  • added export_json_to_database

Context for reviewers

One strategy would be to keep the make sprint-data-export and issue-data-export and create make sprint-db-data-import and issue-data-db-import so that the data is exported to JSON and then imported into the database.

A single make command could then be created to run the the export and then import files.

Additional information

Sample data in database

Screen Shot 2024-06-26 at 3 38 47 PM

@aplybeah aplybeah requested review from acouch and chouinar June 20, 2024 17:44
@aplybeah aplybeah marked this pull request as ready for review June 20, 2024 17:45
Comment on lines 24 to 25
Validator("SLACK_BOT_TOKEN", must_exist=False), #disabled for testing
Validator("REPORTING_CHANNEL_ID", must_exist=False), #disabled for testing
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Billy wrote some end to end tests that require these tokens. So these need to stay enabled if we want to run the full end to end test suite, unfortunately.

That said, I don't think we should have ever written end to end tests that require these tokens to be set. So I'm willing to accept this change.

Comment on lines 151 to 154
sprint-db-data-import:
@echo "=> Importing project data to the database"
@echo "====================================================="
$(POETRY) analytics export db_export \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⭐⭐⭐   (must change before approval) The makefile command says db-import whereas the python command is db-export. You'll want to pick one or the other. I'm thinking import?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also thinking import

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update with import. Please also create a separate import app:

import_app = typer.Typer()
...
app.add_typer(import_app, name="import", help="Import data into the database")

@acouch
Copy link
Member

acouch commented Jun 20, 2024

Looks good, can you fix the lint test?

@acouch
Copy link
Member

acouch commented Jun 21, 2024

When I run this locally I get:

(analytics)(import-sprint-data)$make sprint-db-data-import
=> Importing project data to the database
=====================================================
docker-compose run -e GH_TOKEN --rm grants-analytics poetry run analytics export db_export \
	--owner HHS \
	--project 13 \
	--sprint-file data/sprint-data.json \
	--issue-file data/issue-data.json
WARN[0000] /Users/partisan/workshop/grantsgov/nava-simpler/analytics/docker-compose.yml: `version` is obsolete
[+] Creating 1/0
 ✔ Container grants-analytics-db  Running                                                                                                                                           0.0s
Warning: 'analytics' is an entry point defined in pyproject.toml, but it's not installed as a script. You may get improper `sys.argv[0]`.

The support to run uninstalled scripts will be removed in a future release.

Run `poetry install` to resolve and get rid of this message.

Usage: analytics export db_export [OPTIONS]
Try 'analytics export db_export --help' for help.
╭─ Error ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ No such option: --owner                                                                                                                                                               │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
make: *** [sprint-db-data-import] Error 2

@acouch
Copy link
Member

acouch commented Jun 21, 2024

Missing the issue data export, unless you plan to-do another PR for that one.

)

BaseDataset.to_sql(
output_table=task_data,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The output_table argument is a string for the table name. Can you recheck this locally? Did you have a working version here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@acouch it worked in the sense that the command ran locally, but I haven't been able to find the data in the database

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try instantiating the class, then applying the .to_sql method :

  deliverables = DeliverableTasks.load_from_json_files(
    sprint_file=sprint_file,
    issue_file=issue_file,
  )

  deliverables.to_sql(
    output_table="github_project_data",
    engine=connection,
    replace_table=True
  )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure to add an update to the pyproject.toml if you are updating the poetry.lock file.

@aplybeah
Copy link
Author

Expected Failures:

FAILED tests/integrations/test_slack.py::test_fetch_slack_channels - TypeError: 'NoneType' object is not subscriptable
FAILED tests/integrations/test_slack.py::test_upload_files_to_slack_channel - TypeError: 'NoneType' object is not subscriptable

The tests expect slack auth. Turning the validators off ended up interrupting the full suite

Copy link
Collaborator

@coilysiren coilysiren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏼! But you probably want to follow-up to disable those failing tests

@@ -1,4 +1,4 @@
POSTGRES_NAME = "app"
POSTGRES_HOST = "0.0.0.0"
POSTGRES_HOST = "grants-analytics-db"
POSTGRES_USER = "app"
POSTGRES_PORT = 5432
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you would need a password here as well, can you confirm?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theres a secrets.toml that's in the gitignore that has the password

Copy link
Collaborator

@coilysiren coilysiren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏼 !

@acouch
Copy link
Member

acouch commented Jun 26, 2024

Thanks for the updates. Can you add the SprintBoard data as well? https://github.com/navapbc/simpler-grants-gov/pull/84/files#diff-9ebe0c30e3ac3922cd1dd64b982265f1732a5d801148e6cdbf7a3ff5daa399a6R12

@aplybeah aplybeah merged commit 817a1d3 into main Jun 26, 2024
28 checks passed
@aplybeah aplybeah deleted the import-sprint-data branch June 26, 2024 20:15
aplybeah added a commit that referenced this pull request Jul 1, 2024
#113)

## Summary
Fixes #100 

### Time to review: __1 mins__

## Changes proposed
* Added documentation about local database import

## Context for reviewers

> The current analytics documentation is focused on the slack
integration. This task is to add the work from
#84 to the
documentation and include the local Metabase steps.

## Additional information
> Screenshots, GIF demos, code examples or output to help show the
changes working as expected.
aplybeah added a commit that referenced this pull request Jul 26, 2024
…Analytics (#136)

## Summary
Fixes #107 
Fixes #115 

### Time to review: __5 mins__

## Changes proposed
* removed the `*.toml` files related to dynaconf
* removed references to Dynaconf (e.g. in Docstrings, gitignore)
* use Pydantic for loading

## Context for reviewers

> After getting feedback for
#107, the consensus
was to reevaluate the way the database loader works for more uniformity.
Changes will need to be made primarily in db.py and cli.py
> 
> With the PR #84 ,
the env settings for the db are stored in settings.toml. The config
settings should be updated to use the existing local.env file

## Additional information
> Screenshots, GIF demos, code examples or output to help show the
changes working as expected.
acouch pushed a commit that referenced this pull request Sep 18, 2024
…rint and issue data to database (#84)

Fixes #46

* added `sprint-db-data-import` to Makefile
* added `export_json_to_database`

> One strategy would be to keep the make sprint-data-export and
issue-data-export and create make sprint-db-data-import and
issue-data-db-import so that the data is exported to JSON and then
imported into the database.
>
> A single make command could then be created to run the the export and
then import files.

Sample data in database

<img width="1133" alt="Screen Shot 2024-06-26 at 3 38 47 PM"
src="https://github.com/navapbc/simpler-grants-gov/assets/37313082/34c962d6-a78e-4963-be15-ef0f7de3bccf">
acouch pushed a commit that referenced this pull request Sep 18, 2024
#113)

Fixes #100

* Added documentation about local database import

> The current analytics documentation is focused on the slack
integration. This task is to add the work from
#84 to the
documentation and include the local Metabase steps.

> Screenshots, GIF demos, code examples or output to help show the
changes working as expected.
acouch pushed a commit that referenced this pull request Sep 18, 2024
…Analytics (#136)

Fixes #107
Fixes #115

* removed the `*.toml` files related to dynaconf
* removed references to Dynaconf (e.g. in Docstrings, gitignore)
* use Pydantic for loading

> After getting feedback for
#107, the consensus
was to reevaluate the way the database loader works for more uniformity.
Changes will need to be made primarily in db.py and cli.py
>
> With the PR #84 ,
the env settings for the db are stored in settings.toml. The config
settings should be updated to use the existing local.env file

> Screenshots, GIF demos, code examples or output to help show the
changes working as expected.
acouch pushed a commit that referenced this pull request Sep 18, 2024
…rint and issue data to database (#84)

Fixes #46

* added `sprint-db-data-import` to Makefile
* added `export_json_to_database`

> One strategy would be to keep the make sprint-data-export and
issue-data-export and create make sprint-db-data-import and
issue-data-db-import so that the data is exported to JSON and then
imported into the database.
>
> A single make command could then be created to run the the export and
then import files.

Sample data in database

<img width="1133" alt="Screen Shot 2024-06-26 at 3 38 47 PM"
src="https://github.com/navapbc/simpler-grants-gov/assets/37313082/34c962d6-a78e-4963-be15-ef0f7de3bccf">
acouch pushed a commit that referenced this pull request Sep 18, 2024
#113)

Fixes #100

* Added documentation about local database import

> The current analytics documentation is focused on the slack
integration. This task is to add the work from
#84 to the
documentation and include the local Metabase steps.

> Screenshots, GIF demos, code examples or output to help show the
changes working as expected.
acouch pushed a commit that referenced this pull request Sep 18, 2024
…Analytics (#136)

Fixes #107
Fixes #115

* removed the `*.toml` files related to dynaconf
* removed references to Dynaconf (e.g. in Docstrings, gitignore)
* use Pydantic for loading

> After getting feedback for
#107, the consensus
was to reevaluate the way the database loader works for more uniformity.
Changes will need to be made primarily in db.py and cli.py
>
> With the PR #84 ,
the env settings for the db are stored in settings.toml. The config
settings should be updated to use the existing local.env file

> Screenshots, GIF demos, code examples or output to help show the
changes working as expected.
acouch pushed a commit to HHS/simpler-grants-gov that referenced this pull request Sep 18, 2024
…rint and issue data to database (navapbc#84)

Fixes #46

* added `sprint-db-data-import` to Makefile
* added `export_json_to_database`

> One strategy would be to keep the make sprint-data-export and
issue-data-export and create make sprint-db-data-import and
issue-data-db-import so that the data is exported to JSON and then
imported into the database.
>
> A single make command could then be created to run the the export and
then import files.

Sample data in database

<img width="1133" alt="Screen Shot 2024-06-26 at 3 38 47 PM"
src="https://github.com/navapbc/simpler-grants-gov/assets/37313082/34c962d6-a78e-4963-be15-ef0f7de3bccf">
acouch pushed a commit to HHS/simpler-grants-gov that referenced this pull request Sep 18, 2024
navapbc#113)

Fixes #100

* Added documentation about local database import

> The current analytics documentation is focused on the slack
integration. This task is to add the work from
navapbc#84 to the
documentation and include the local Metabase steps.

> Screenshots, GIF demos, code examples or output to help show the
changes working as expected.
acouch pushed a commit to HHS/simpler-grants-gov that referenced this pull request Sep 18, 2024
…Analytics (navapbc#136)

Fixes #107
Fixes #115

* removed the `*.toml` files related to dynaconf
* removed references to Dynaconf (e.g. in Docstrings, gitignore)
* use Pydantic for loading

> After getting feedback for
navapbc#107, the consensus
was to reevaluate the way the database loader works for more uniformity.
Changes will need to be made primarily in db.py and cli.py
>
> With the PR navapbc#84 ,
the env settings for the db are stored in settings.toml. The config
settings should be updated to use the existing local.env file

> Screenshots, GIF demos, code examples or output to help show the
changes working as expected.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Task]: Add make sprint-data-import and issue-data-import to import github sprint and issue data to database
3 participants