Cassandra is a decentralized, fault-tolerant, scalable, and low-cost NoSQL database, which makes it a core component of many cloud computing systems. Recent versions have greatly improved its security features, making it suitable for use in enterprise systems.
In this tutorial, you’ll see how Cassandra overcomes the challenges that relational databases face when they must scale to meet high demand. You will become familiar with Cassandra’s terminology, its components, and their roles. Then you will learn how to create a multi-node Cassandra cluster and follow the data flow during database operations that demand speed, accuracy, and durability.
You will then see how Cassandra stores data in files on disk, how to optimize those files to improve performance, and how to monitor database performance using logs and metrics.
We’ll demonstrate the factors that can affect the performance SLAs of a Cassandra database. Next, you will learn how to optimize the data model to provide performance guarantees and meet those SLAs consistently over time. You’ll also learn how to build a data model in Cassandra and integrate the database with your application.
In the later sections, you’ll connect to Cassandra from Spark to read and write data. You’ll also learn how to process live streaming data with Spark and persist it in Cassandra for consumption by downstream systems.
By the end of the tutorial, you’ll be able to build powerful, scalable Cassandra database layers for your applications. You’ll design rich schemas to capture the relationships between different data types and master the advanced features available in Cassandra.