Chapter 3. Organizing Related Data

In Chapter 2, The First Table, we created our first table, which stores user accounts. We discussed how to insert data into the table, and how to retrieve it. However, we also encountered several significant limitations in the tasks we can perform with the table we created.

In this chapter, we'll introduce the concept of compound primary keys, which are simply primary keys comprising more than one column. Although this might at first glance seem like a trivial addition to our understanding of Cassandra tables, a table with a compound primary key is in fact a considerably richer data structure that opens up substantial new data access patterns.

Our introduction to compound primary keys will help us to build a table that stores a timeline of users' status updates. In this chapter, we'll focus on defining the table and understanding how it works; Chapter 4, Beyond Key-Value Lookup, will introduce new patterns to query compound primary key tables.

We'll explore two different approaches to designing schemas with compound primary keys. In the first approach, the primary key encodes a parent-child relationship implicitly: a user's status updates are children of the user record, which is stored in a separate table. We'll also look at an alternative schema design using static columns; this design allows us to store information about users and their status updates in a single table, without duplication. Keeping both in one table makes the relationship between users and status updates explicit in the structure of the table.
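As a rough preview, the two designs might look something like the following CQL sketch. The table and column names here (user_status_updates, users_with_status_updates, username, id, body, email) are placeholders assumed for illustration and are not defined by this introduction; the chapter's own definitions may differ.

```cql
-- First approach (sketch): a compound primary key.
-- username is the partition key; id is a clustering column.
-- User details are assumed to live in a separate table.
CREATE TABLE user_status_updates (
  username text,
  id timeuuid,
  body text,
  PRIMARY KEY (username, id)
);

-- Second approach (sketch): the same key structure, plus a static column.
-- A static column holds one value per partition (here, per username),
-- shared by every status update row in that partition.
CREATE TABLE users_with_status_updates (
  username text,
  id timeuuid,
  email text STATIC,
  body text,
  PRIMARY KEY (username, id)
);
```

In both sketches, the first column listed in PRIMARY KEY is the partition key and the remaining column is a clustering column; the only structural difference in the second table is the STATIC column, whose value belongs to the partition rather than to any single row.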

By the end of this chapter, you'll know:

  • How to create a table with a compound primary key
  • The difference between a partition key and a clustering column
  • How Cassandra organizes data in a compound key table
  • When to use UUIDs for primary key components
  • How to use static columns to associate data with partition keys
  • When static columns are useful for schema design