filemanager: s3_object to object modelling does not support overlapping attributes #376

Open
mmalenic opened this issue Jun 25, 2024 · 3 comments
Assignees
Labels
feature New feature filemanager an issue relating to the filemanager pilot Pilot Use Case

Comments

@mmalenic
Member

The current 1-to-many modelling between object and s3_object only allows each s3_object to belong to a single object group, so groups must be disjoint. Attributes attached to a group therefore apply only to the distinct s3_object records in that group. This doesn't support use cases that require overlapping groups. For example, one s3_object may belong to a group that specifies a portal_run_id and to another group that specifies the subject.

To fix this, the modelling should represent a many-to-many relationship by introducing a reference table that holds foreign keys to both the object group and the s3_object itself.
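To make the reference-table idea concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than the real PostgreSQL database. The table names `object_group` and `object_group_member`, the column names, and the `obj-*`/`grp-*` identifiers are all assumptions for illustration; the actual filemanager schema may differ.

```python
import sqlite3

# Hypothetical table and column names; not the real filemanager schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE s3_object (
        s3_object_id TEXT PRIMARY KEY,
        object_key TEXT NOT NULL
    );
    CREATE TABLE object_group (
        object_group_id TEXT PRIMARY KEY,
        description TEXT NOT NULL
    );
    -- Reference table: one row per (group, s3_object) membership.
    CREATE TABLE object_group_member (
        object_group_id TEXT NOT NULL REFERENCES object_group (object_group_id),
        s3_object_id TEXT NOT NULL REFERENCES s3_object (s3_object_id),
        PRIMARY KEY (object_group_id, s3_object_id)
    );
    """
)
conn.executemany(
    "INSERT INTO s3_object VALUES (?, ?)",
    [("obj-1", "a.bam"), ("obj-2", "b.bam")],
)
conn.executemany(
    "INSERT INTO object_group VALUES (?, ?)",
    [("grp-run", "grouped by portal_run_id"), ("grp-subject", "grouped by subject")],
)
# obj-1 is a member of both groups, which the 1-to-many model cannot express.
conn.executemany(
    "INSERT INTO object_group_member VALUES (?, ?)",
    [("grp-run", "obj-1"), ("grp-run", "obj-2"), ("grp-subject", "obj-1")],
)


def groups_of(s3_object_id):
    """Return the ids of every group that contains the given s3_object."""
    rows = conn.execute(
        "SELECT object_group_id FROM object_group_member "
        "WHERE s3_object_id = ? ORDER BY object_group_id",
        (s3_object_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling `groups_of("obj-1")` returns both group ids, showing the overlap that the reference table allows.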

@mmalenic mmalenic self-assigned this Jun 25, 2024
@mmalenic mmalenic added filemanager an issue relating to the filemanager bug Something isn't working labels Jun 25, 2024
@victorskl
Member

victorskl commented Jun 27, 2024

This will come with phase 1 0.2.0

@mmalenic
Member Author

Following up from the chat today, this can be paused for now.

The proposed new modelling could look like this:
[Image: filemanager_schema]

However, we decided that it would be more straightforward to use the attributes column on the s3_object table to represent groups. For example, to represent a group of objects with the same portal_run_id, each s3_object could be annotated with the same JSON tag: { "portal_run_id": <...> }.
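As a rough illustration of how attribute tags can stand in for groups, the sketch below models s3_object rows as plain Python dictionaries. The ids, attribute values, and the `group_members` helper are hypothetical; in the real system this filtering would happen in the database.

```python
# Hypothetical in-memory stand-in for rows of the s3_object table.
s3_objects = [
    {"s3_object_id": "obj-1", "attributes": {"portal_run_id": "run-A", "subject": "subj-1"}},
    {"s3_object_id": "obj-2", "attributes": {"portal_run_id": "run-A"}},
    {"s3_object_id": "obj-3", "attributes": {"subject": "subj-1"}},
]


def group_members(objects, key, value):
    """Return ids of objects whose attributes carry the given key/value tag."""
    return [
        obj["s3_object_id"]
        for obj in objects
        if obj["attributes"].get(key) == value
    ]


run_group = group_members(s3_objects, "portal_run_id", "run-A")
subject_group = group_members(s3_objects, "subject", "subj-1")
```

Here `obj-1` appears in both `run_group` and `subject_group`, so overlapping groups fall out of the tags without any extra table.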

@victorskl
Member

victorskl commented Jun 28, 2024

POC use case we discussed yesterday (27/06/24):

POST request to endpoint to annotate:
/s3_object/<id>

{
    "portalRunId": "20240621de4cac37",
    "libraryId": "L2400160",
    "portalRunIdv2": "1de4cac302024062"
}
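For reference, a client could build such a request roughly as follows. This is only a sketch: the host name in `BASE_URL` is a placeholder, the helper name is made up, and the request is constructed but never sent, since the endpoint's exact contract is not settled in this thread.

```python
import json
from urllib.request import Request

# Placeholder host; the real filemanager base URL is not given in this thread.
BASE_URL = "https://filemanager.example.invalid"


def build_annotation_request(s3_object_id, attributes):
    """Build (but do not send) a POST request that annotates one s3_object."""
    body = json.dumps(attributes).encode("utf-8")
    return Request(
        f"{BASE_URL}/s3_object/{s3_object_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_annotation_request(
    "01903a13-6e92-7884-8e73-29cd2f2080f9",
    {"portalRunId": "20240621de4cac37", "libraryId": "L2400160"},
)
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out here on purpose.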

From the query perspective, we tried simulating this in the FM database, like so:

-- Insert the annotation into the attributes jsonb column.
update s3_object
set attributes = '{ "portalRunId": "20240621de4cac37", "libraryId": "L2400160" }'
where s3_object_id = '01903a13-6e92-7884-8e73-29cd2f2080f9';

select * from s3_object where s3_object_id = '01903a13-6e92-7884-8e73-29cd2f2080f9';

-- Query the attributes column with jsonb expressions.
select * from s3_object where attributes->>'portalRunId' = '20240621de4cac37';
select * from s3_object where attributes->>'libraryId' = 'L2400160';

More on PostgreSQL capability with jsonb data type and its indexing:

We are going to try this as a first-cut MVP for the New UI <> FM interaction.
Once portalRunId is annotated, it becomes immutable through the application layer (such as an API call). Any later modification is only possible by updating it directly through the database console.
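One way the application layer could enforce that rule is sketched below. The function name, the exception type, and the choice to allow re-sending an identical value are all assumptions, not decisions recorded in this thread.

```python
# Keys that may be set once but not changed afterwards (assumed list).
IMMUTABLE_KEYS = frozenset({"portalRunId"})


class ImmutableAttributeError(ValueError):
    """Raised when an update tries to change an already-set immutable key."""


def apply_annotation(current, update):
    """Merge `update` into `current`, refusing changes to immutable keys."""
    for key in IMMUTABLE_KEYS:
        if key in current and key in update and update[key] != current[key]:
            raise ImmutableAttributeError(
                f"{key} is already set and cannot be changed via the API"
            )
    # Return a new dict so the caller's existing attributes are not mutated.
    return {**current, **update}


attributes = apply_annotation({}, {"portalRunId": "20240621de4cac37"})
attributes = apply_annotation(attributes, {"libraryId": "L2400160"})
```

A subsequent call that supplies a different `portalRunId` raises `ImmutableAttributeError`, while other keys such as `libraryId` remain editable.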

Attendees: @reisingerf @williamputraintan @raylrui @alexiswl @mmalenic
FYI @ohofmann @brainstorm

@victorskl victorskl added feature New feature pilot Pilot Use Case and removed bug Something isn't working labels Jun 28, 2024