Resources from weekly Zoom lunches revolving around Apache Cassandra and Apache Cassandra-related topics. Hosted by Anant Corporation.
Join Cassandra Lunch Weekly at 12 PM EST Every Wednesday
If you would like to be a guest speaker, you can reach us at solutions@anant.us. If you would like to sponsor Cassandra Lunch, please reach us at the email listed above.
Check out the Cassandra.Lunch playlist on YouTube
- We discuss and take an in-depth look at the improvements and new features that come with Cassandra 4.0.
- We discuss various Cassandra distributions in four groups: Cassandra / Cassandra Compliant Databases on JVM, Cassandra Compliant Databases on C++, Cassandra as a Service / Managed Cassandra Based on Open Source Cassandra, and Cassandra as a Service / Managed Cassandra Based on Proprietary Technology.
- We cover Kubernetes, discussing what it is and how it works with Docker and Cassandra. We also look at some of Kubernetes' competitors and a variety of open-source tools for Kubernetes, which give insight into why we consider Kubernetes a worthwhile investment when working with databases.
- We discuss a number of projects and platforms that you can use to jumpstart your Cassandra projects. They make useful educational resources as well as good starting codebases for new projects. We also discuss a recent article on the Yugabyte blog about Cassandra.
- We discuss methods for finding and diagnosing issues in Cassandra clusters with ELK/FEK/BEK.
- We discuss Cassandra Backup / Restoration. We also discuss disaster avoidance, disaster recovery, and different tools that can be used for backup and restoration of your Cassandra data. Also, we discuss an example scenario of how someone has set up multi-node clusters and how they go about data backup and restoration.
- We discuss Cassandra anti-entropy, the process of comparing the data of all replicas and updating each replica to the newest version. We also look at repair and synchronization in Cassandra and how you can prepare for the unexpected.
- We discuss deletion and tombstones in Cassandra.
- Guest speaker, Ryan Quey, a full stack data engineer, discusses a personal project he has been working on called java-podcast-processor, which is a tool to find podcast metadata over an external API, store them, get their RSS feeds, and run ETL using Airflow, Kafka, Spark, and Cassandra. The particular Cassandra distribution used is Elassandra, which allows seamless integration with Elasticsearch. The data is also displayed using a Gatsby app and served using Flask.
- We discuss the combined use of relational databases and Cassandra. We cover the advantages of using each separately, as well as the advantages of and methods for using both concurrently.
- We discuss Cassandra read and write paths, which is how Cassandra stores and retrieves data at high speeds. We do not cover how Cassandra replicates data because that is its own subject, but we take a look at these four sub-topics: Write Path, Update / Delete, Maintenance Path, and Read Path.
- We discuss Cassandra and Staged Event-Driven Architecture with an emphasis on Cassandra stages / thread pools. We also discuss a few different tools that we can use to monitor these stages and thread pools in order to keep your Cassandra running as smoothly as possible.
- We discuss deployment and administration tools for Cassandra. We also discuss a number of tools for the installation, configuration, monitoring, and administration of Cassandra clusters.
- We discuss packaged and DIY methods for Lucene-based indexes on Cassandra, as well as some pros and cons of using Lucene-based indexes on Cassandra.
- We discuss a number of use cases for Cassandra, focusing on Cassandra's place in running a digital business technology platform.
- We discuss how Cassandra is used for real-time data platforms and cover different reference architectures in which Cassandra is and can be used.
- We discuss how Cassandra is used for real-time data platforms and cover different reference architectures in which Cassandra is and can be used.
- We discuss different ways to deploy Cassandra, whether on bare metal, virtual machines, or containers, as well as their pros, cons, and deployment tools.
- We discuss specific scenarios for Cassandra backup and restore, some methods for restoring data to a Cassandra cluster, and how factors like the topology of a cluster or the need for constant uptime can affect the backup/restore process.
- We discuss updates regarding Cassandra and Kubernetes after the recent KubeCon event.
- We discuss the basics of using Spark and Cassandra together, the advantages of each, and the advantages of using them together. We also discuss the potential drawbacks and configuration methods for avoiding them.
- We discuss open-source tools that can be used for BI with Cassandra including a live demo using DSE, Presto, and Metabase.
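
The anti-entropy idea described in the list above (compare the data on all replicas, then update each replica to the newest version) can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `Cell` class, the `repair` function, and the in-memory dictionaries are hypothetical names invented for illustration, not Cassandra APIs. Real Cassandra repair compares Merkle trees over token ranges rather than walking every cell.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for a stored value and its write timestamp."""
    value: str
    timestamp: int  # higher means newer


def repair(replicas):
    """Toy anti-entropy pass: make every replica hold the newest cell per key.

    `replicas` is a list of dicts mapping key -> Cell (an assumption made
    for this sketch, not how Cassandra stores data).
    """
    newest = {}
    # Step 1: compare all replicas and keep the newest version of each key.
    for replica in replicas:
        for key, cell in replica.items():
            if key not in newest or cell.timestamp > newest[key].timestamp:
                newest[key] = cell
    # Step 2: overwrite stale or missing cells on every replica.
    for replica in replicas:
        replica.update(newest)
    return newest


replicas = [
    {"user:1": Cell("alice", 10)},
    {"user:1": Cell("alicia", 20), "user:2": Cell("bob", 5)},
    {},  # a replica that missed both writes
]
repair(replicas)
```

After the call, all three dictionaries hold the same cells, with the timestamp-20 write winning for `user:1`.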
Apache Cassandra Lunch Online Meetup #32: Cassandra Data Operations – Common Ways to Move Data in Cassandra
- We discuss the various ways of moving data into and out of Cassandra clusters.
Apache Cassandra Lunch Online Meetup #33: Cassandra Deployment – Ansible and Terraform with Cassandra
- We discuss using Terraform and Ansible to set up the infrastructure for and handle the provisioning of a new Cassandra cluster.
- We discuss how to use Liquibase with Cassandra and DataStax Astra.
- We discuss some basic data operations that you can do with Apache Spark and Cassandra.
- We discuss various databases that can run on top of Cassandra.
- We discuss CQL Copy and how we can use it for Cassandra data operations.
- We discuss Apache Spark projects that interact with Cassandra specifically through Cassandra’s SSTables.
- We discuss general updates to Apache Cassandra and relevant articles of interest.
- We discuss Scylla’s Spark Migrator and walk through how we can use the Scylla Migrator for Cassandra Data Operations.
- We discuss Cassandra on Kubernetes and give an introduction to Docker, Kubernetes, and Helm.
- We cover SSTable files and their relation to SSTableLoader, and we walk through an example using SSTableLoader to load data taken from one cluster into a new, empty cluster.
- We will introduce DSBulk or DataStax Bulk Loader, and show how we can use it with tools like sed and awk to do ETL on Cassandra data.
- We will introduce DSBulk or DataStax Bulk Loader, and show how we can use it with tools like sed and awk to do ETL on Cassandra data.
- In Apache Cassandra Lunch #45, we will discuss how you can stream tweets using Twitter4S (Scala Twitter client) and save them to Cassandra using Alpakka Cassandra.
- In Apache Cassandra Lunch #46, we will discuss how we can use Apache Spark jobs written in Scala to do Cassandra data operations, which will include a live walkthrough!
- In Cassandra Lunch #48, we will discuss using Airflow and Cassandra together. Airflow provides a Cassandra connection type and a Cassandra operator. We will explore what we can do to manage a Cassandra cluster via Airflow.
- We will discuss how to use Spark SQL to do Cassandra data operations such as moving data in Apache Cassandra tables.
- In Apache Cassandra Lunch #50, we will discuss how you can use Apache Spark and Apache Cassandra to perform basic Machine Learning tasks.
- In Cassandra Lunch #51, we will discuss an overview of Cassandra cluster architecture, not to be confused with the Cassandra database architecture. Specifically, we will look at using Cassandra datacenters to isolate workloads.