- Create a Google Cloud VM
- Install MongoDB on the VM to store Tiki product data
- Create a GCS bucket
- Create the BigQuery dataset and tables (a setup sketch follows this list)
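
A minimal sketch of the bucket and BigQuery setup, assuming the Python client libraries; the bucket, dataset, and table names come from the steps above, while the location and the `product` schema fields are assumptions for illustration.

```python
"""One-time setup: create the GCS bucket and the BigQuery dataset/table."""
from google.cloud import bigquery, storage

# GCS bucket that will receive the nightly MongoDB export.
storage.Client().create_bucket("mongodb-data-1", location="asia-southeast1")  # location assumed

# BigQuery dataset `tiki` and table `product`.
bq = bigquery.Client()
dataset = bq.create_dataset("tiki", exists_ok=True)

schema = [
    bigquery.SchemaField("product_id", "INTEGER"),    # fields assumed for illustration
    bigquery.SchemaField("product_name", "STRING"),
    bigquery.SchemaField("price", "FLOAT"),
    bigquery.SchemaField("seller_id", "INTEGER"),
]
bq.create_table(bigquery.Table(dataset.table("product"), schema=schema), exists_ok=True)
```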
Script: migrate_data
- Export the `product` collection from the `tiki` database to a JSON file, `product.json`
- Upload the JSON file to the `mongodb-data-1` bucket (see the sketch after this list)
- Use `parallel_composite_upload_threshold` to enable parallel composite uploads if the file size exceeds 150 MB
- After the upload is done, remove the JSON file
- Use crontab to run the script at 22:00 every day
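
A minimal sketch of `migrate_data`, assuming `mongoexport` and `gsutil` are installed and authenticated on the VM; the local export path is an assumption.

```python
#!/usr/bin/env python3
"""migrate_data sketch: export the `product` collection and push it to GCS."""
import os
import subprocess

EXPORT_PATH = "/tmp/product.json"   # local path is an assumption
BUCKET = "gs://mongodb-data-1"

# Export the `product` collection from the `tiki` MongoDB database to JSON.
subprocess.run(
    ["mongoexport", "--db", "tiki", "--collection", "product", "--out", EXPORT_PATH],
    check=True,
)

# Upload to the bucket, enabling parallel composite uploads above 150 MB.
subprocess.run(
    ["gsutil", "-o", "GSUtil:parallel_composite_upload_threshold=150M",
     "cp", EXPORT_PATH, BUCKET],
    check=True,
)

# Remove the local JSON file once the upload has finished.
os.remove(EXPORT_PATH)
```

The script can then be scheduled from the VM's crontab with an entry such as `0 22 * * * python3 /path/to/migrate_data.py` (path hypothetical).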
Script: load_data
- Create a Google Cloud Function that triggers when the file `product.json` is uploaded to the `mongodb-data-1` bucket and loads the data into the `product` table in the `tiki` dataset in BigQuery (see the sketch after this list)
- Write records that fail to load into the BigQuery table to `failed_records.json` for later handling
- Output: `tiki_product_sample`
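
A minimal sketch of the Cloud Function, assuming a GCS-triggered background function and a newline-delimited JSON export (mongoexport's default); the use of streaming inserts and the choice to write `failed_records.json` back to the bucket are assumptions.

```python
"""load_data sketch: GCS-triggered Cloud Function that loads product.json into BigQuery."""
import json

from google.cloud import bigquery, storage


def load_data(event, context):
    """Runs when an object is finalized in the mongodb-data-1 bucket."""
    if event["name"] != "product.json":
        return  # ignore other objects, including failed_records.json itself

    # Read the newline-delimited JSON export from the bucket.
    bucket = storage.Client().bucket(event["bucket"])
    text = bucket.blob(event["name"]).download_as_text()
    rows = [json.loads(line) for line in text.splitlines() if line.strip()]

    # Stream the rows into tiki.product and collect per-row failures.
    errors = bigquery.Client().insert_rows_json("tiki.product", rows)

    # Keep the records that failed to load for later handling.
    failed = [rows[e["index"]] for e in errors]
    if failed:
        bucket.blob("failed_records.json").upload_from_string(
            json.dumps(failed, default=str)
        )
```

Writing the failures back to the bucket, rather than to the function's ephemeral filesystem, keeps them available after the invocation ends; the early-return name check stops the function from re-triggering on its own output.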
Script: creat_datamart
- Create the `seller_product` dataset
- Create the `seller` and `product` tables from the `tiki.product` table (see the sketch after this list)
- Output: `seller_sample` / `product_sample`
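
A minimal sketch of `creat_datamart`, run as a regular script against BigQuery; the column lists in the SELECTs are assumptions, since the real schema lives in `tiki.product`.

```python
"""creat_datamart sketch: build the seller_product data mart from tiki.product."""
from google.cloud import bigquery

client = bigquery.Client()

# Create the data-mart dataset if it does not already exist.
client.create_dataset("seller_product", exists_ok=True)

# Materialize the two mart tables from the raw tiki.product table.
statements = [
    """
    CREATE OR REPLACE TABLE seller_product.seller AS
    SELECT DISTINCT seller_id, seller_name             -- columns assumed for illustration
    FROM tiki.product
    """,
    """
    CREATE OR REPLACE TABLE seller_product.product AS
    SELECT product_id, product_name, price, seller_id  -- columns assumed for illustration
    FROM tiki.product
    """,
]
for sql in statements:
    client.query(sql).result()  # wait for each statement to finish
```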
Script: analyze_data