refactor(filemanager): remove object table #465

Merged
merged 2 commits into from
Aug 12, 2024
@@ -0,0 +1,6 @@
+-- Remove object table because it is no longer used or necessary.
+alter table s3_object drop column object_id;
+drop table object;
+
+-- Also remove public_id because s3_object_id is okay for now.
+alter table s3_object drop column public_id;

@@ -20,9 +20,7 @@ with input as (
 )
 -- Select objects into a FlatS3EventMessage struct.
 select
-object_id,
 s3_object_id,
-public_id,
 s3_object.bucket,
 s3_object.key,
 date as event_time,

@@ -1,8 +1,6 @@
 -- Bulk insert of s3 objects.
 insert into s3_object (
 s3_object_id,
-object_id,
-public_id,
 bucket,
 key,
 date,
@@ -18,20 +16,18 @@ insert into s3_object (
 )
 values (
 unnest($1::uuid[]),
-unnest($2::uuid[]),
-unnest($3::uuid[]),
-unnest($4::text[]),
-unnest($5::text[]),
-unnest($6::timestamptz[]),
-unnest($7::bigint[]),
+unnest($2::text[]),
+unnest($3::text[]),
+unnest($4::timestamptz[]),
+unnest($5::bigint[]),
+unnest($6::text[]),
+unnest($7::timestamptz[]),
 unnest($8::text[]),
-unnest($9::timestamptz[]),
+unnest($9::storage_class[]),
 unnest($10::text[]),
-unnest($11::storage_class[]),
-unnest($12::text[]),
-unnest($13::text[]),
-unnest($14::boolean[]),
-unnest($15::event_type[])
+unnest($11::text[]),
+unnest($12::boolean[]),
+unnest($13::event_type[])
 ) on conflict on constraint sequencer_unique do update
 set number_duplicate_events = s3_object.number_duplicate_events + 1
-returning object_id, number_duplicate_events;
+returning s3_object_id, number_duplicate_events;

@@ -1,8 +1,6 @@
 -- Bulk insert of s3 objects.
 insert into s3_object (
 s3_object_id,
-object_id,
-public_id,
 bucket,
 key,
 deleted_date,
@@ -19,21 +17,19 @@ insert into s3_object (
 )
 values (
 unnest($1::uuid[]),
-unnest($2::uuid[]),
-unnest($3::uuid[]),
-unnest($4::text[]),
-unnest($5::text[]),
-unnest($6::timestamptz[]),
-unnest($7::bigint[]),
+unnest($2::text[]),
+unnest($3::text[]),
+unnest($4::timestamptz[]),
+unnest($5::bigint[]),
+unnest($6::text[]),
+unnest($7::timestamptz[]),
 unnest($8::text[]),
-unnest($9::timestamptz[]),
+unnest($9::storage_class[]),
 unnest($10::text[]),
-unnest($11::storage_class[]),
-unnest($12::text[]),
-unnest($13::text[]),
-unnest($14::bigint[]),
-unnest($15::boolean[]),
-unnest($16::event_type[])
+unnest($11::text[]),
+unnest($12::bigint[]),
+unnest($13::boolean[]),
+unnest($14::event_type[])
 ) on conflict on constraint deleted_sequencer_unique do update
 set number_duplicate_events = s3_object.number_duplicate_events + 1
-returning object_id, number_duplicate_events;
+returning s3_object_id, number_duplicate_events;

@@ -1,8 +1,6 @@
 -- Bulk insert of s3 objects.
 insert into s3_object (
-object_id,
 s3_object_id,
-public_id,
 bucket,
 key,
 date,
@@ -17,21 +15,19 @@ insert into s3_object (
 event_type
 )
 values (
-unnest($1::uuid[]),
-unnest($2::uuid[]),
-unnest($3::uuid[]),
-unnest($4::text[]),
-unnest($5::text[]),
-unnest($6::timestamptz[]),
-unnest($7::bigint[]),
-unnest($8::text[]),
-unnest($9::timestamptz[]),
-unnest($10::text[]),
-unnest($11::storage_class[]),
-unnest($12::text[]),
-unnest($13::text[]),
-unnest($14::boolean[]),
-unnest($15::event_type[])
-) on conflict on constraint sequencer_unique do update
+unnest($1::uuid[]),
+unnest($2::text[]),
+unnest($3::text[]),
+unnest($4::timestamptz[]),
+unnest($5::bigint[]),
+unnest($6::text[]),
+unnest($7::timestamptz[]),
+unnest($8::text[]),
+unnest($9::storage_class[]),
+unnest($10::text[]),
+unnest($11::text[]),
+unnest($12::boolean[]),
+unnest($13::event_type[])
+) on conflict on constraint sequencer_unique do update
 set number_duplicate_events = s3_object.number_duplicate_events + 1
-returning object_id, number_duplicate_events;
+returning s3_object_id, number_duplicate_events;

@@ -114,8 +114,6 @@ update as (
 select
 -- Note, this is the passed through value from the input in order to identify this event later.
 input_id as "s3_object_id!",
-object_id,
-public_id,
 bucket,
 key,
 date as event_time,

@@ -85,8 +85,6 @@ update as (
 select
 -- Note, this is the passed through value from the input in order to identify this event later.
 input_id as "s3_object_id!",
-object_id,
-public_id,
 bucket,
 key,
 deleted_date as event_time,

This file was deleted.

7 changes: 0 additions & 7 deletions lib/workload/stateless/stacks/filemanager/docs/API_GUIDE.md
@@ -142,13 +142,6 @@ For example, count the total records:
 curl -H "Authorization: Bearer $TOKEN" "https://file.dev.umccr.org/api/v1/s3_objects/count" | jq
 ```

-## The `objects` record
-
-There is a similar record kept in the filemanager database called `object`. A similar REST API
-is available for these records under `/api/v1/objects`, however the `object` currently don't server
-a purpose in filemanager. They were initially included to support attribute linking, however they will likely
-be removed because attribute linking can be accomplished using the `attributes` column on `s3_object`.
-
 ## Some missing features

 There are some missing features in the query API which are planned, namely:

@@ -3,7 +3,7 @@
 The aim of the filemanager is to maintain a database state that is as correct as possible at the time an event is received.
 Broadly, the architecture of filemanager reflects this, where cloud storage events that contain information about objects
 are processed and stored in the database. The database tables reflect the information from the events, and data is stored
-in the `object` and `s3_object` tables.
+in the `s3_object` tables.

 Some details about S3 event processing needs to be addressed in filemanager, specifically in relation to out of order
 and duplicate events.
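
Reviewer note: the duplicate-event handling that the insert queries in this PR rely on (`on conflict ... do update set number_duplicate_events = ... + 1`) can be illustrated with a small, self-contained sketch. This is not the filemanager implementation. It assumes a trimmed-down, hypothetical `s3_object` table, uses SQLite in place of PostgreSQL, inserts row by row instead of using `unnest`, and guesses that the uniqueness constraint covers `bucket`, `key` and `sequencer`.

```python
import sqlite3

# Hypothetical, trimmed-down stand-in for the s3_object table. Only the columns
# needed to show duplicate counting are kept, and the unique constraint on
# (bucket, key, sequencer) is an assumption, not the real sequencer_unique definition.
SCHEMA = """
create table s3_object (
    s3_object_id text primary key,
    bucket text not null,
    key text not null,
    sequencer text not null,
    number_duplicate_events integer not null default 0,
    unique (bucket, key, sequencer)
);
"""

# On a conflict the existing row is kept and only its duplicate counter is bumped.
UPSERT = """
insert into s3_object (s3_object_id, bucket, key, sequencer)
values (?, ?, ?, ?)
on conflict (bucket, key, sequencer) do update
    set number_duplicate_events = number_duplicate_events + 1;
"""


def ingest(conn, events):
    """Insert each event, incrementing the counter if the event was already stored."""
    for event in events:
        conn.execute(UPSERT, event)
    conn.commit()


def duplicate_counts(conn):
    """Return a mapping of s3_object_id to number_duplicate_events."""
    rows = conn.execute(
        "select s3_object_id, number_duplicate_events from s3_object"
    )
    return dict(rows.fetchall())


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
ingest(conn, [
    ("id-1", "bucket", "a.txt", "001"),
    ("id-2", "bucket", "a.txt", "001"),  # same sequencer: treated as a duplicate
    ("id-3", "bucket", "b.txt", "002"),
])
counts = duplicate_counts(conn)
```

In this sketch the second event does not create a new row, so the stored row keeps the first `s3_object_id` and records one duplicate. That mirrors why the queries can return `s3_object_id` alongside `number_duplicate_events` once `object_id` is gone.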