Screenshots, page source, and other files collected in the browser manager process are currently written directly to disk. This worked when OpenWPM only saved data locally, but will not work for the S3Aggregator. Instead, BaseAggregator should include a save_file method. In LocalAggregator we can implement that to save to disk, and in S3Aggregator we can upload to S3.
Updating this comment as #753 removed everything mentioned in the original issue.
Observations:
UnstructuredStorageProviders already have an interface suitable for storing a bunch of bytes under a user-defined name
The base path for storing is specified at time of object instantiation
=> There is no longer any need for a data_directory in the manager params, similar to how database_name was removed in the Data Aggregator Rewrite (#753)
Paths forward:
1. Add a second UnstructuredStorageProvider to the StorageController that is responsible for saving unstructured platform data
2. Expand the UnstructuredStorageProvider interface with a second method that is responsible for saving unstructured platform data
I prefer option 1 as it is inherently more flexible, e.g. this way screenshots can get saved into the cloud while web content just gets saved to disk.
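A rough sketch of what option 1 could look like. This is not the actual OpenWPM code: the `store_blob` signature is simplified and synchronous, and `LocalDirectoryProvider`, `InMemoryProvider`, `save_content` and `save_platform_file` are hypothetical names used only for illustration.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict


class UnstructuredStorageProvider(ABC):
    """Simplified stand-in for the interface: store bytes under a user-defined name."""

    @abstractmethod
    def store_blob(self, filename: str, blob: bytes) -> None:
        ...


class LocalDirectoryProvider(UnstructuredStorageProvider):
    """Hypothetical provider; the base path is fixed at instantiation."""

    def __init__(self, base_path: Path) -> None:
        self.base_path = base_path
        self.base_path.mkdir(parents=True, exist_ok=True)

    def store_blob(self, filename: str, blob: bytes) -> None:
        (self.base_path / filename).write_bytes(blob)


class InMemoryProvider(UnstructuredStorageProvider):
    """Hypothetical stand-in for a cloud-backed provider such as S3."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}

    def store_blob(self, filename: str, blob: bytes) -> None:
        self.blobs[filename] = blob


class StorageController:
    """Option 1: one provider for web content, a second one for platform data."""

    def __init__(
        self,
        content_provider: UnstructuredStorageProvider,
        platform_provider: UnstructuredStorageProvider,
    ) -> None:
        self.content_provider = content_provider
        self.platform_provider = platform_provider

    def save_content(self, filename: str, blob: bytes) -> None:
        # Web content (e.g. response bodies) goes to the first provider.
        self.content_provider.store_blob(filename, blob)

    def save_platform_file(self, filename: str, blob: bytes) -> None:
        # Screenshots, page source, etc. go to the second provider.
        self.platform_provider.store_blob(filename, blob)
```

Since each provider carries its own base path or bucket, the controller would not need a data_directory, and the two kinds of data can be routed to different backends independently.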
vringar changed the title from "Add support for saving screenshots, page source, and other arbitrary files to data aggregators" to "Add support for saving screenshots, page source, and other arbitrary files to unstructured storage providers" on Dec 21, 2021