Exercise 2.3: Denormalizing
ℹ️ For technical support, please contact us via email.
⬅️ Back Step 1 of 4 Next ➡️
Explore the KillrVideo dataset

After your success on the video sharing development team, you have been promoted and assigned to a high-priority project: incorporating movie content into the KillrVideo application. Your new team is normalizing its video and actor metadata into separate tables and is currently stuck trying to join those tables in Cassandra. Having been around the Cassandra block a few times, you know that Cassandra does not support JOINs and that emulating them in application code is expensive. It is up to you to show your team a better way to serve these queries: denormalizing the data into one table per query.

The video metadata is similar to what was in the video sharing domain:

| Column Name | Data Type |
| --- | --- |
| `video_id` | `timeuuid` |
| `added_date` | `timestamp` |
| `description` | `text` |
| `encoding` | `video_encoding` |
| `tags` | `set<text>` |
| `title` | `text` |
| `user_id` | `uuid` |

The movie content adds the following metadata:

| Column Name | Data Type |
| --- | --- |
| `actor` | `text` |
| `character` | `text` |
| `genre` | `text` |

With this metadata, the data model must support the following queries:

  • Q1: Retrieve videos an actor has appeared in (newest first).
  • Q2: Retrieve videos within a particular genre (newest first).
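To make the idea behind these two queries concrete, here is a minimal in-memory sketch in Python of what denormalizing means: instead of joining normalized tables at read time, each video is written once per query table, grouped by the value the query filters on (`actor` or `genre`) and kept sorted newest first. This is only an illustration under stated assumptions, not the exercise's solution. The names `videos_by_actor` and `videos_by_genre` mirror the CSV file names used in this step, while the sample rows, the `cast` field, and the helpers `build_tables`, `q1`, and `q2` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows; every value here is invented for illustration.
videos = [
    {"title": "Movie A", "added_date": datetime(2020, 1, 1), "genre": "Drama",
     "cast": [("Actor X", "Role 1"), ("Actor Y", "Role 2")]},
    {"title": "Movie B", "added_date": datetime(2021, 6, 1), "genre": "Drama",
     "cast": [("Actor X", "Role 3")]},
]


def build_tables(rows):
    """Duplicate each video into one 'table' per query (denormalization)."""
    by_actor = defaultdict(list)  # grouping key: actor
    by_genre = defaultdict(list)  # grouping key: genre
    for video in rows:
        for actor, character in video["cast"]:
            # One copy of the video per actor who appears in it.
            by_actor[actor].append({
                "added_date": video["added_date"],
                "title": video["title"],
                "character": character,
                "genre": video["genre"],
            })
        # One copy of the video under its genre.
        by_genre[video["genre"]].append({
            "added_date": video["added_date"],
            "title": video["title"],
        })
    # Keep every group ordered newest first, so reads need no sorting.
    for group in list(by_actor.values()) + list(by_genre.values()):
        group.sort(key=lambda r: r["added_date"], reverse=True)
    return by_actor, by_genre


videos_by_actor, videos_by_genre = build_tables(videos)


def q1(actor):
    """Q1: titles of videos an actor has appeared in, newest first."""
    return [r["title"] for r in videos_by_actor.get(actor, [])]


def q2(genre):
    """Q2: titles of videos within a genre, newest first."""
    return [r["title"] for r in videos_by_genre.get(genre, [])]


print(q1("Actor X"))
print(q2("Drama"))
```

The trade-off is deliberate: the same video is stored several times, so writes do more work, but each query becomes a single lookup on one table with no join.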

✅ Review the contents of the CSV files with video metadata:

head -n 10 assets/videos_by_genre.csv
head -n 10 assets/videos_by_actor.csv