Create events_log table #174

Open
lewismc opened this issue Jan 11, 2023 · 7 comments
Labels
enhancement New feature or request storage Anything tagbase-server storage/persistence related.

Comments

@lewismc
Member

lewismc commented Jan 11, 2023

So far we have identified two buckets of anomalies which can occur during ingestion:

  • metadata: when a key exists in the eTUFF global metadata but not in the metadata_types table, or
  • other: when some other anomaly exists within the file (in the past, for example, some time series were empty until they were padded with dummy values)

In both of the above cases, each individual offense generates a separate Slack alert. This can be noisy and overwhelming at times, so it needs to be improved.

@renato2099 suggested that we create an anomaly table in which each anomaly generates an entry detailing what the anomaly is. All anomalies for a given submission would be grouped and persisted for archival purposes. This allows for

  1. A single Slack notification giving the HTTP URL of a single aggregated report containing one or more anomalies, and
  2. The ability for the user to then execute a GET on the URL to access the JSON anomaly report for a given submission
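
As a rough illustration of the aggregation idea (not the tagbase-server implementation), the sketch below groups the anomalies of one submission into a single JSON-serializable report and composes one notification text that links to it. The function names, the report layout, and the example URL are all assumptions made for this sketch.

```python
import json
from collections import defaultdict


def build_anomaly_report(submission_id, anomalies):
    """Group (bucket, description) pairs for one submission into one report."""
    grouped = defaultdict(list)
    for bucket, description in anomalies:
        grouped[bucket].append(description)
    return {
        "submission_id": submission_id,
        "anomaly_count": len(anomalies),
        "anomalies": dict(grouped),
    }


def build_slack_text(report, report_url):
    """Compose the text of the single notification that links to the report."""
    return (
        f"Submission {report['submission_id']}: "
        f"{report['anomaly_count']} anomaly(ies) detected. Report: {report_url}"
    )


# Hypothetical anomalies covering the two buckets described above.
anomalies = [
    ("metadata", "key 'foo' is not present in metadata_types"),
    ("metadata", "key 'bar' is not present in metadata_types"),
    ("other", "empty time series"),
]
report = build_anomaly_report(42, anomalies)
# Hypothetical report location; the real path would come from the OpenAPI spec.
slack_text = build_slack_text(report, "https://example.org/anomaly_report/42")

print(json.dumps(report, indent=2))
print(slack_text)
```

However many anomalies a submission produces, only one message is sent, and the detail lives in the report that the GET returns.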

This task therefore requires that we

  1. Design the anomaly table
  2. Link the table to the submission.anomaly_report column, which will be available once "Augment submission table with ingestion task context and status" (#173) is done
  3. Augment the OpenAPI to facilitate anomaly report access via GET
  4. Implement the logic to generate anomaly reports which covers the metadata and other buckets described above.
  5. Integrate report alerts into Slack messaging
  6. Write tests which cover FAILED ingestion scenarios
@lewismc lewismc added enhancement New feature or request storage Anything tagbase-server storage/persistence related. labels Jan 11, 2023
@tagtuna
Contributor

tagtuna commented Jan 11, 2023

This captures the flow well. I would point out, though, that with our current design we aim to use two different Slack channels, metadata_ops and deploy_ops. I wonder whether we should flag the anomaly reporting in similar categories, e.g., the anomaly table could have a type field with possible values such as "metadata" and "missing entries". This value list will grow as we identify more buckets of anomalies.

@lewismc
Member Author

lewismc commented Jan 12, 2023

I really like the sound of that, yes. I was also thinking that we could avoid creating a new table and instead add a report column to the submission table. However, getting data out then becomes a bit more tricky because we would have to use non-standard/complex data types to represent key-value pairs, e.g.
{"metadata", "This is a description of the metadata anomaly"}
... rather than explicit rows which make it really easy to query for all anomalies of a particular type for a given submission.
I think we can implement the dedicated anomaly table with the foreign key and types as you suggested. We don't need to make the anomaly type an ENUM right now.
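
A minimal sketch of the explicit-rows approach follows, using Python's built-in sqlite3 as a stand-in for PostgreSQL so that it runs anywhere. The table layout and column names (anomaly_id, type, description) are assumptions for illustration, and the type column is plain text rather than an ENUM, as proposed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript(
    """
    CREATE TABLE submission (
        submission_id INTEGER PRIMARY KEY
    );
    CREATE TABLE anomaly (
        anomaly_id    INTEGER PRIMARY KEY,
        submission_id INTEGER NOT NULL REFERENCES submission(submission_id),
        type          TEXT NOT NULL,  -- free text for now, not an ENUM
        description   TEXT NOT NULL
    );
    """
)
conn.execute("INSERT INTO submission (submission_id) VALUES (1)")
conn.executemany(
    "INSERT INTO anomaly (submission_id, type, description) VALUES (?, ?, ?)",
    [
        (1, "metadata", "unknown metadata key"),
        (1, "metadata", "another unknown metadata key"),
        (1, "missing entries", "empty time series"),
    ],
)


def anomalies_of_type(conn, submission_id, anomaly_type):
    """Return descriptions of all anomalies of one type for one submission."""
    rows = conn.execute(
        "SELECT description FROM anomaly "
        "WHERE submission_id = ? AND type = ? ORDER BY anomaly_id",
        (submission_id, anomaly_type),
    )
    return [description for (description,) in rows]


metadata_anomalies = anomalies_of_type(conn, 1, "metadata")
print(metadata_anomalies)
```

With one row per anomaly, filtering by type is an ordinary WHERE clause; the same question asked of a key-value report column would need type-specific operators.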

@tagtuna
Contributor

tagtuna commented Jan 12, 2023

I think an anomaly table is a cleaner way to organize this, and it's easier to use as well. That way we don't have to bend over backwards to fit things into submission.

@lewismc
Member Author

lewismc commented Jan 12, 2023

Agreed. Thanks. I'll implement.

@vtsontos

Hi guys,
Thinking a bit more about this, I think it could be good to have an "Events_Log" table that would capture the status of all key database event operations, recording whether each succeeded or encountered anomalies, along with whatever descriptive information is available. A standardized event_status code table could be devised. See the attached table proposal with examples.
I think this approach allows us to break down and record outcomes for each step in the process in a consistent manner, and it should be extensible to allow for additions/changes in future.
Let me know what you think.

Tagbase_EventsTableProposal.xlsx
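
To make the shape of this proposal concrete, here is a hedged sketch of an events log backed by a standardized status code table. It does not reproduce the attached spreadsheet: every table name, column name, and status code below is hypothetical, and sqlite3 stands in for PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- Hypothetical lookup table of standardized status codes.
    CREATE TABLE event_status (
        status_code INTEGER PRIMARY KEY,
        label       TEXT NOT NULL UNIQUE
    );
    -- Hypothetical log with one row per key database operation.
    CREATE TABLE events_log (
        event_id      INTEGER PRIMARY KEY,
        submission_id INTEGER NOT NULL,
        event_name    TEXT NOT NULL,
        status_code   INTEGER NOT NULL REFERENCES event_status(status_code),
        notes         TEXT
    );
    INSERT INTO event_status (status_code, label)
    VALUES (0, 'success'), (1, 'anomaly');
    """
)


def log_event(conn, submission_id, event_name, status_code, notes=None):
    """Record the outcome of one step of the ingestion process."""
    conn.execute(
        "INSERT INTO events_log (submission_id, event_name, status_code, notes) "
        "VALUES (?, ?, ?, ?)",
        (submission_id, event_name, status_code, notes),
    )


log_event(conn, 7, "metadata ingestion", 1, "key not in metadata_types")
log_event(conn, 7, "time series ingestion", 0)

# Resolve codes to labels to get a per-step summary for the submission.
summary = list(
    conn.execute(
        "SELECT e.event_name, s.label FROM events_log AS e "
        "JOIN event_status AS s ON s.status_code = e.status_code "
        "WHERE e.submission_id = ? ORDER BY e.event_id",
        (7,),
    )
)
print(summary)
```

Keeping the codes in their own table means new outcomes can be added as rows rather than as schema changes.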

@lewismc
Member Author

lewismc commented Jan 14, 2023

I like it, @vtsontos. I'll implement that.

@lewismc lewismc changed the title Create anomaly table Create events_log table Jan 14, 2023
@lewismc lewismc self-assigned this Jan 15, 2023
@lewismc lewismc added this to the 0.8.0 milestone Jan 15, 2023
@lewismc
Member Author

lewismc commented Jan 15, 2023

This issue now supersedes #173.
Essentially, the parts which can be cherry-picked are

CREATE TYPE status_enum AS ENUM ('FAILED', 'FINISHED', 'KILLED', 'MIGRATION', 'POSTMIGRATION', 'PREMIGRATION');

ALTER TABLE ONLY event_log
    ADD CONSTRAINT event_log_submission_fkey FOREIGN KEY (submission_id, tag_id) REFERENCES submission(submission_id, tag_id);
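
The effect of the two statements above can be tried out with the sketch below. It is an approximation only: sqlite3 stands in for PostgreSQL, SQLite has no ENUM type so a CHECK constraint plays that role, and the event_id and event_status column names are assumptions. Note that a composite foreign key needs the referenced (submission_id, tag_id) pair to be unique in submission, which is modeled here with a composite primary key.

```python
import sqlite3

# Values copied from the status_enum definition above.
STATUSES = ("FAILED", "FINISHED", "KILLED", "MIGRATION", "POSTMIGRATION", "PREMIGRATION")

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
status_list = ", ".join(f"'{s}'" for s in STATUSES)
conn.executescript(
    f"""
    CREATE TABLE submission (
        submission_id INTEGER NOT NULL,
        tag_id        INTEGER NOT NULL,
        PRIMARY KEY (submission_id, tag_id)
    );
    CREATE TABLE event_log (
        event_id      INTEGER PRIMARY KEY,
        submission_id INTEGER NOT NULL,
        tag_id        INTEGER NOT NULL,
        -- CHECK emulates the ENUM; the column name is hypothetical.
        event_status  TEXT NOT NULL CHECK (event_status IN ({status_list})),
        FOREIGN KEY (submission_id, tag_id)
            REFERENCES submission (submission_id, tag_id)
    );
    """
)


def try_insert(conn, submission_id, tag_id, status):
    """Insert an event; return False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO event_log (submission_id, tag_id, event_status) "
            "VALUES (?, ?, ?)",
            (submission_id, tag_id, status),
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn.execute("INSERT INTO submission (submission_id, tag_id) VALUES (1, 10)")
ok = try_insert(conn, 1, 10, "FINISHED")      # matching submission, valid status
bad_fk = try_insert(conn, 1, 99, "FINISHED")  # no such (submission_id, tag_id)
bad_status = try_insert(conn, 1, 10, "UNKNOWN")  # not one of the enum values
print(ok, bad_fk, bad_status)
```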


3 participants