AWS Database Blog

Announcing Amazon Keyspaces Multi-Region Replication

Amazon Keyspaces (for Apache Cassandra) is a scalable, highly available, and managed Apache Cassandra-compatible database service. With Amazon Keyspaces, you can run your Cassandra workloads on AWS using the same Cassandra application code and developer tools that you use today.

Today we are introducing Amazon Keyspaces Multi-Region Replication, a new capability that provides automated, fully managed, active-active replication across the Regions of your choice, with 99.999% availability. You can improve both availability and resiliency against Regional degradation while also benefiting from low-latency local reads and writes for global applications. With Multi-Region Replication, Amazon Keyspaces asynchronously replicates data between Regions, and data is typically propagated within 1 second. Multi-Region Replication also removes the difficult work of resolving update conflicts and correcting data divergence, so you can focus on your application.

In this post, we discuss the benefits and use cases of this new feature and demonstrate how to get started using Multi-Region Replication in Amazon Keyspaces.

Use cases for Multi-Region Replication

Amazon Keyspaces Multi-Region Replication has the following benefits:

  • Global reads and writes with single-digit millisecond latency – In Amazon Keyspaces, replication is active-active. You can now serve both reads and writes locally from the Regions closest to your customers with single-digit millisecond latency at any scale. With Multi-Region Replication, you can use Amazon Keyspaces for any global application that needs fast response time anywhere in the world.
  • Improved business continuity and protection from single-Region degradation – With Multi-Region Replication, you can recover from a single-Region degradation by redirecting your application to a different Region in your multi-Region keyspace. Amazon Keyspaces keeps track of any writes that have been performed on your multi-Region keyspace but haven’t yet been propagated to all replica Regions. When the Region comes back online, Amazon Keyspaces automatically syncs any missing changes, allowing you to recover without any application impact.
  • High-speed replication across Regions – Multi-Region Replication uses fast, storage-based physical replication of data across Regions, with replication lag typically less than 1 second. Replication in Amazon Keyspaces has little to no impact on your application’s read and write queries because it doesn’t share compute resources with them. This allows you to handle high write throughput or bursty use cases without any application impact.

How it works

A multi-Region keyspace consists of multiple replica keyspaces (one per Region) that are treated as a single unit. Every Region has the same table schema. When an application writes data in one Region, Amazon Keyspaces uses storage-based asynchronous replication to propagate the writes to the other Regions, with replication lag typically less than 1 second and no impact on your application’s performance. In the unlikely event of a single-Region impairment, you can point your application to one of the healthy Regions in your multi-Region keyspace. Because all Regions in a multi-Region keyspace support both reads and writes, there is no impact on your application’s performance or availability. When the degraded Region comes back online, Amazon Keyspaces automatically and asynchronously syncs any missing data.
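The redirect described above can be sketched as a small client-side helper. This is a minimal illustration, not a prescribed failover mechanism: `pick_region` and the `health` mapping are hypothetical names, and the way your application decides that a Region is healthy is left as an assumption. The endpoint format and port follow the public Amazon Keyspaces service endpoints.

```python
from typing import Iterable, Mapping

KEYSPACES_PORT = 9142  # TLS port used by Amazon Keyspaces service endpoints


def keyspaces_endpoint(region: str) -> str:
    """Return the public Amazon Keyspaces service endpoint for a Region."""
    return f"cassandra.{region}.amazonaws.com"


def pick_region(preferred: Iterable[str], healthy: Mapping[str, bool]) -> str:
    """Return the first Region, in preference order, that is marked healthy."""
    for region in preferred:
        if healthy.get(region, False):
            return region
    raise RuntimeError("no healthy Region available in the multi-Region keyspace")


# Hypothetical health snapshot: us-east-2 is impaired, us-west-2 is healthy.
health = {"us-east-2": False, "us-west-2": True}
region = pick_region(["us-east-2", "us-west-2"], health)
print(keyspaces_endpoint(region), KEYSPACES_PORT)
```

Because every Region accepts both reads and writes, the application only needs to change the contact point; no promotion step is required.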

Amazon Keyspaces also improves on the anti-entropy repair process that Cassandra uses. In Amazon Keyspaces, you don’t have to perform cumbersome tasks such as regularly running repair operations to clean up data synchronization issues. Amazon Keyspaces monitors data consistency between tables in different AWS Regions, detects and repairs any conflicts, and synchronizes replicas automatically. Amazon Keyspaces uses the last-writer-wins method of data reconciliation. With this conflict resolution mechanism, all of the Regions in a multi-Region keyspace agree on the latest update and converge toward a state in which they all have identical data. The reconciliation process has no impact on application performance, and any conflicts or divergences are handled automatically.
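To make the last-writer-wins idea concrete, the following sketch shows the general technique, not the internal Amazon Keyspaces implementation. The `Cell` type, its `write_ts` field, and the tie-breaking rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    value: str
    write_ts: int  # write timestamp in microseconds (hypothetical field)
    region: str    # Region that accepted the write


def last_writer_wins(a: Cell, b: Cell) -> Cell:
    """Return the version with the later write timestamp.

    Ties are broken on (value, region). This rule is illustrative only;
    it exists so every replica picks the same winner regardless of the
    order in which it sees the two versions.
    """
    return max(a, b, key=lambda c: (c.write_ts, c.value, c.region))


east = Cell(value="10", write_ts=1_000, region="us-east-2")
west = Cell(value="25", write_ts=1_005, region="us-west-2")

# Both Regions converge on the same version, whichever order they merge in.
print(last_writer_wins(east, west))
print(last_writer_wins(west, east))
```

The property that matters is that the merge is deterministic and order-independent, which is what lets all Regions converge to identical data.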

Solution overview

In the following sections, we demonstrate how to create a new keyspace that spans multiple Regions and then define your tables. You can create a new keyspace using the AWS Management Console or CQL APIs. You can choose up to five Regions, and for each Region, Amazon Keyspaces will replicate your data three times over multiple Availability Zones. Amazon Keyspaces also supports AWS native approaches to resource creation such as the AWS SDK and AWS CloudFormation. You can also set up replication latency monitoring with Amazon CloudWatch.

Create a new multi-Region keyspace using the console

To set up your multi-Region keyspace using the console, complete the following steps:

  1. On the Amazon Keyspaces console, choose Keyspaces in the navigation pane.
  2. Choose Create keyspace.
  3. For Keyspace name, enter an identifier for your multi-Region keyspace.
  4. Choose Multi-Region replication for your replication strategy and specify the Regions you want to replicate your data to.

By default, your current Region is automatically selected.

  5. Choose Create keyspace.

You should now see the recently created keyspace with the replication strategy of multi-Region in your list view.

Create a new multi-Region keyspace using CQL APIs

In addition to using the console, you can create a multi-Region keyspace using CQL APIs. If you have already created your multi-Region keyspace, you can skip this step. For this post, we use the built-in query editor on the Amazon Keyspaces console, but you can run the CQL statements in this section with any Cassandra-compatible tool or driver.

Navigate to the CQL editor and enter the following create keyspace statement. You can use NetworkTopologyStrategy to define your multi-Region keyspace. Specify the Regions you want to replicate data to. A replication factor of three is fixed because every table will be replicated three times over multiple Availability Zones.

CREATE KEYSPACE IF NOT EXISTS aws_global
WITH REPLICATION = {'class' : 'NetworkTopologyStrategy',
                    'us-east-2' : 3 , 'us-west-2' : 3 }
AND TAGS = {'blog':'keyspaces', 'launch':'multi-Region'};

You should see your newly created keyspace in the list of available keyspaces.
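If you create keyspaces from scripts rather than the editor, the statement above can be generated for any set of Regions. The helper below is a hedged sketch: `create_keyspace_cql` is a hypothetical function, the upper limit of five Regions comes from this post, and the lower limit of two is an assumption about what makes a keyspace multi-Region.

```python
import re
from typing import Sequence

REPLICATION_FACTOR = 3  # fixed: each Region keeps three copies across Availability Zones


def create_keyspace_cql(keyspace: str, regions: Sequence[str]) -> str:
    """Build a CREATE KEYSPACE statement for a multi-Region keyspace."""
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", keyspace):
        raise ValueError(f"invalid keyspace name: {keyspace!r}")
    unique = list(dict.fromkeys(regions))  # drop duplicates, keep order
    if not 2 <= len(unique) <= 5:
        raise ValueError("expected between two and five distinct Regions")
    for region in unique:
        if not re.fullmatch(r"[a-z0-9-]+", region):
            raise ValueError(f"invalid Region name: {region!r}")
    replicas = ", ".join(f"'{region}': {REPLICATION_FACTOR}" for region in unique)
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace}\n"
        f"WITH REPLICATION = {{'class': 'NetworkTopologyStrategy', {replicas}}};"
    )


print(create_keyspace_cql("aws_global", ["us-east-2", "us-west-2"]))
```

The generated text can then be passed to whichever Cassandra-compatible tool or driver you already use.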

Create a new multi-Region table

Now that you have created a multi-Region keyspace, you can create a table within it. When you create the table, it is automatically replicated to the Regions specified in the keyspace. Replication lag across Regions is typically less than 1 second.

Navigate to the CQL editor and enter the following create table statement. This table contains transaction data, including a unique identifier, transaction type, event timestamp, amount, and additional details stored in a map. The primary key consists of the ID and event fields, ensuring a unique row for each ID and event combination.

CREATE TABLE aws_global.transactions (
    id text,
    type text,
    event timeuuid,
    amount int,
    details map<text, text>,
    PRIMARY KEY (id, event)
);

You should be able to verify that your table was created by going to the list of all tables on the Amazon Keyspaces console and filtering for the table name you specified.

Insert data and validate replication

Now that you have created a multi-Region table, let’s insert some data. To insert data, complete the following steps:

  1. Navigate to the CQL editor and run the following INSERT statement:
    INSERT INTO aws_global.transactions(id, type, event, amount, details) 
    VALUES ('my-first-transaction', 'CREATE', now(), 10, {'keyspaces':'multi-Region', 'local':'us-east-2'});
  2. Validate this write was inserted successfully by running a SELECT statement from the same Region:
    SELECT * FROM aws_global.transactions where id = 'my-first-transaction';
  3. After confirming the write in your current Region, switch to the replica Region. In this post, us-west-2 is our replica Region. Navigate to the CQL editor in us-west-2 and run the same SELECT statement:
    SELECT * FROM aws_global.transactions where id = 'my-first-transaction';

In the editor, you will see a single record returned containing the values we entered earlier.

  4. Because replication is active-active, you can also insert data into the replica Region and observe it propagate to the other Regions in the keyspace. To validate this, insert a new record in the replica Region using the following INSERT statement:
    INSERT INTO aws_global.transactions(id, type, event, amount, details) 
    VALUES ('my-second-transaction', 'CREATE', now(), 10, {'keyspaces':'multi-Region', 'local':'us-west-2'});
  5. Validate this write was inserted successfully by running a SELECT statement from the same Region:
    SELECT * FROM aws_global.transactions where id = 'my-second-transaction';
  6. Verify that the row was replicated by going back to the CQL editor in us-east-2 and running the following SELECT statement:
    SELECT * FROM aws_global.transactions where id = 'my-second-transaction';

In the editor, you see that the second record that you inserted in us-west-2 was successfully replicated to us-east-2.
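Because replication is asynchronous, an automated check that reads from the replica Region immediately after a write may briefly see no row. One way to handle this, sketched below under assumptions, is to poll with a deadline. `wait_for_row` and `fake_fetch` are hypothetical names; in a real check, the `fetch` callable would run the SELECT statement against the replica Region with your driver of choice.

```python
import time
from typing import Callable, Optional


def wait_for_row(
    fetch: Callable[[], Optional[dict]],
    timeout_s: float = 5.0,
    interval_s: float = 0.2,
) -> Optional[dict]:
    """Call fetch() until it returns a row or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        row = fetch()
        if row is not None:
            return row
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval_s)


# Stand-in for a replica read: the row becomes visible on the third attempt.
attempts = {"count": 0}


def fake_fetch() -> Optional[dict]:
    attempts["count"] += 1
    if attempts["count"] < 3:
        return None
    return {"id": "my-first-transaction", "amount": 10}


row = wait_for_row(fake_fetch, timeout_s=2.0, interval_s=0.01)
print(row, "after", attempts["count"], "attempts")
```

A timeout of a few seconds leaves headroom over the typical sub-second replication lag without letting a failed check hang indefinitely.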

Set up replication latency monitoring with CloudWatch

Amazon Keyspaces integrates natively with CloudWatch to provide observability into relevant metrics. The metrics are collected as raw data and processed into readable, near-real-time metrics. With the launch of Amazon Keyspaces Multi-Region Replication, a new metric called ReplicationLatency lets you monitor the latency of replication from one Region to another. To use the ReplicationLatency metric, complete the following steps:

  1. On the CloudWatch console, under Metrics in the navigation pane, choose All metrics.
  2. Search for the keyspace aws_global.
  3. From the list of metrics, choose AWS/Cassandra, Keyspace, ReceivingRegion, TableName, and select ReplicationLatency for the table transactions.

You should see the replication lag on the chart. When using Multi-Region Replication, replication lag is typically less than 1 second.
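The same metric can be read programmatically. The sketch below only builds the request parameters, using the namespace, metric name, and dimensions listed in the steps above. `replication_latency_query` is a hypothetical helper, and the choice of period, statistics, and the Region you query from are assumptions to adjust for your setup.

```python
from datetime import datetime, timedelta, timezone


def replication_latency_query(keyspace, table, receiving_region, minutes=60):
    """Build parameters for a CloudWatch GetMetricStatistics request."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Cassandra",
        "MetricName": "ReplicationLatency",
        "Dimensions": [
            {"Name": "Keyspace", "Value": keyspace},
            {"Name": "ReceivingRegion", "Value": receiving_region},
            {"Name": "TableName", "Value": table},
        ],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,  # one datapoint per minute (assumed granularity)
        "Statistics": ["Average", "Maximum"],
    }


params = replication_latency_query("aws_global", "transactions", "us-west-2")
print(params["MetricName"], [d["Name"] for d in params["Dimensions"]])

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   cloudwatch = boto3.client("cloudwatch", region_name="us-east-2")  # Region is an assumption
#   datapoints = cloudwatch.get_metric_statistics(**params)["Datapoints"]
```

Keeping the parameter construction separate from the API call makes it easy to reuse the same dimensions in a CloudWatch alarm definition.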

Summary

In this post, we introduced Amazon Keyspaces Multi-Region Replication. We discussed how it works and showed how to create a multi-Region keyspace, insert sample data, and monitor replication lag across Regions. By replicating your data across multiple Regions, you can reduce latency for your users and keep your application available and resilient to Regional failures.

Do you have follow-up questions or feedback? Leave a comment. We’d love to hear your thoughts and suggestions. To get started, refer to our content on Multi-Region Replication.


About the authors

Michael Raney is a Senior Specialist Solutions Architect based in New York and leads the field for Amazon Keyspaces. He works with customers to modernize their legacy database workloads to a serverless architecture. Michael has spent over a decade building distributed systems for high scale and low latency.

Meet Bhagdev is a Principal Product Manager at Amazon Web Services. Meet is passionate about open-source, databases, and analytics and spends his time working with customers to understand their requirements and building delightful experiences. Meet has over a decade of experience as a product manager on database and analytics services. At AWS, Meet leads the product team for Amazon Keyspaces and previously was a lead product manager on Amazon DocumentDB. Prior to his time at AWS, Meet worked on Azure databases at Microsoft.