feat: enable loading videos from s3 #580

Merged
merged 14 commits into from
Feb 17, 2023


suryatejreddy
Collaborator

@suryatejreddy suryatejreddy commented Feb 1, 2023

This PR adds support for loading video datasets from S3.

Sample Queries

  • LOAD VIDEO 's3://bucket/dummy.avi' INTO MyVideo;
  • LOAD VIDEO 's3://bucket/eva_videos/*.mp4' INTO MyVideos;

Design Decisions

  • Video files are downloaded to the directory named by the s3_download_dir config option, which by default points inside the .eva folder.
  • This duplicates the data for now. A future PR will remove the duplication by using symlinks.
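As a rough illustration of the download step described above, the sketch below splits an S3 URI into bucket and key pattern, filters candidate keys against the wildcard, and maps each key to a path under the download directory. The helper names (parse_s3_uri, match_keys, local_path_for) are hypothetical and are not taken from this PR; the actual transfer is left as a comment and assumes a boto3 S3 client.

```python
from fnmatch import fnmatchcase
from pathlib import Path


def parse_s3_uri(uri):
    """Split 's3://bucket/key-or-pattern' into (bucket, key_pattern)."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an S3 URI: {uri!r}")
    bucket, _, key_pattern = uri[len(prefix):].partition("/")
    if not bucket or not key_pattern:
        raise ValueError(f"S3 URI needs a bucket and a key: {uri!r}")
    return bucket, key_pattern


def match_keys(keys, key_pattern):
    """Keep the keys that match a shell-style pattern.

    Note: fnmatch's '*' also matches '/', so 'dir/*.mp4' matches
    keys in nested "subdirectories" as well.
    """
    return [key for key in keys if fnmatchcase(key, key_pattern)]


def local_path_for(download_dir, bucket, key):
    """Map an S3 object to a file path under the download directory."""
    return Path(download_dir, bucket, *key.split("/"))


# Hypothetical usage; the key listing would normally come from S3.
bucket, pattern = parse_s3_uri("s3://bucket/eva_videos/*.mp4")
keys = ["eva_videos/a.mp4", "eva_videos/b.avi", "other/c.mp4"]
for key in match_keys(keys, pattern):
    target = local_path_for("s3_downloads", bucket, key)
    # Assumption: with a boto3 client, the transfer itself would be
    #   client.download_file(bucket, key, str(target))
    print(key, "->", target)
```

The directory name "s3_downloads" is a placeholder for whatever s3_download_dir is set to in the config.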

Testing

  • Added test cases for single and multi-video loading

@suryatejreddy suryatejreddy changed the title from "feat: enable download from s3" to "feat: enable loading videos from s3" Feb 1, 2023
@gaurav274
Member

If we don't download s3 files, will it result in a significant logic change? How do other projects handle this?

@gaurav274
Member

Please add these changes to the changelog: https://github.com/georgia-tech-db/eva/blob/master/CHANGELOG.md

@xzdandy
Collaborator

xzdandy commented Feb 2, 2023

> If we don't download s3 files, will it result in a significant logic change? How do other projects handle this?

We can store the "path" to s3 files with a scheme prefix (e.g., s3://..., local://...) and have a different GET executor for each prefix. The optimizer will then choose the correct executor based on that prefix.
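A minimal sketch of that idea, assuming the stored path always carries a scheme prefix. The class names (LocalGetExecutor, S3GetExecutor) and the choose_executor function are hypothetical stand-ins, not existing EVA executors or optimizer code.

```python
class LocalGetExecutor:
    """Hypothetical executor that would read frames from a local file."""

    def describe(self, location):
        return f"read local file {location}"


class S3GetExecutor:
    """Hypothetical executor that would read frames from an S3 object."""

    def describe(self, location):
        return f"read S3 object {location}"


# Map each path prefix (scheme) to the executor that handles it.
EXECUTORS = {
    "local": LocalGetExecutor,
    "s3": S3GetExecutor,
}


def choose_executor(stored_path):
    """Pick an executor from the scheme prefix of a stored path.

    Returns (executor, location), where location is the path without
    its 'scheme://' prefix.
    """
    scheme, sep, location = stored_path.partition("://")
    if not sep:
        raise ValueError(f"path has no scheme prefix: {stored_path!r}")
    if scheme not in EXECUTORS:
        raise ValueError(f"no executor registered for scheme {scheme!r}")
    return EXECUTORS[scheme](), location


executor, location = choose_executor("s3://bucket/dummy.avi")
print(executor.describe(location))
```

Keeping the mapping in one table means that supporting another storage backend only requires registering a new prefix and executor; the lookup itself does not change.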

@jarulraj jarulraj merged commit 8772c6a into georgia-tech-db:master Feb 17, 2023
@jarulraj jarulraj mentioned this pull request Apr 3, 2023
4 participants