Choosing an AWS database service
Taking the first step
Amazon Web Services (AWS) offers a growing number of purpose-built database options (currently more than 15) to support diverse data models. These include relational, key-value, document, in-memory, graph, time series, wide column, and ledger databases.
Choosing the right database or multiple databases requires you to make a series of decisions based on your organizational needs. This decision guide will help you ask the right questions, provide a clear path for implementation, and help you migrate from your existing database.
This five-minute excerpt is from a 55-minute recording of a presentation by Jeff Carter, VP of Databases and Migrations at AWS, at re:Invent 2022. It provides an overview of available AWS database services.
Help determine which AWS database(s) are the best fit for your organization.
April 20, 2023
Databases are important backend systems used to store data for any type of app, whether it’s a small mobile app or an enterprise app with internet-scale and real-time requirements.
This decision guide is designed to help you understand the range of choices available to you, establish the criteria for making your database choice, learn the unique properties of each database, and then dive deeper into the capabilities that each offers.
What kinds of apps do people build using databases?
- Internet-scale apps: Globally distributed and internet-scale apps that handle millions of requests per second over hundreds of terabytes of data. These databases automatically scale up and down to accommodate your spiky workloads.
- Real-time apps: Real-time apps such as caching, session stores, gaming leaderboards, ride-hailing, ad-targeting, and real-time analytics need microsecond latency and high throughput to support millions of requests per second.
- Open-source apps: Some customers prefer open-source databases for their low cost, community-backed development and support, and large ecosystems of tools and extensions.
- Enterprise apps: Enterprise apps manage core business processes, such as sales, billing, customer service, human resources, and line-of-business processes, such as a reservation system at a hotel chain or a risk-management system at an insurance company. These apps need databases that are fast, scalable, secure, available, and reliable.
Note: This guide focuses on databases suitable for Online Transaction Processing (OLTP) applications. If you primarily need to store and analyze massive amounts of data quickly and efficiently (a need typically met by an online analytical processing (OLAP) application), AWS offers Amazon Redshift, a fully managed, cloud-based data warehousing service that is designed to handle large-scale analytics workloads.
There are two high-level categories of AWS OLTP databases - relational and non-relational.
- The AWS relational database family includes seven popular engines for Amazon RDS and Amazon Aurora — Amazon Aurora with MySQL compatibility, Amazon Aurora with PostgreSQL compatibility, MySQL, MariaDB, PostgreSQL, Oracle, and SQL Server — and an option to deploy on-premises with Amazon RDS on AWS Outposts.
- The non-relational database options are designed for those who have a specific need for key-value, document, caching, in-memory, graph, time series, wide column, and ledger databases.
We'll explore all of these in detail in the Choose section of this guide.
Before deciding which database service you want to use to work with your data, you may want to spend a little time thinking about how you're going to migrate your existing database(s).
The best database migration strategy helps you take full advantage of the AWS Cloud. This involves migrating your applications to use purpose-built, cloud-centered databases. It also doesn't tie you to the same database that you've been using on premises. Consider modernizing your applications and choose the databases that best suit your applications’ workflow requirements.
The following resources can help you with your migration strategy:
- Getting started with AWS Database Migration Service
- A high-level overview of AWS Database Migration Service
- Using the AWS Schema Conversion Tool
- Selecting the right database and database migration plan for your workloads
In addition to having a migration strategy at the front end of your planning, you want to have ways to gain insight from your data. You can use Amazon Redshift. It's a fast, fully managed, petabyte-scale data warehouse service that you can use to efficiently analyze all your data using your existing business intelligence tools. It's optimized for datasets that range from a few hundred gigabytes to a petabyte or more.
You're considering hosting a database on AWS. This might be to support a greenfield/pilot project as a first step in your cloud migration journey, or you might want to migrate an existing workload with as little disruption as possible.
Whatever your goal is in hosting a database on AWS, you need to understand the criteria for making your database decision. Here's a summary of the key criteria to consider.
Lift and shift
The first major consideration when choosing your database is your business objective. What is the strategic direction driving your organization to change? As suggested in the 7 Rs of AWS, consider whether you want to re-architect or refactor an existing workload, move to a new platform to shed commercial license commitments, rehost your existing databases and data as a first step toward modernization, or move now to a managed database strategy.
Your business objective will also drive the degrees of freedom you have in choosing a target database in AWS for your workload. If you have chosen a Rehost strategy, you might want to migrate the workload to AWS with as few disruptions as possible. So you might adopt a "lift and shift" strategy where you try to migrate your data to as similar a target database as possible.
Can you lift and shift your existing database(s)? If so, you might be able to deploy faster and with fewer data migration headaches. You can lift and shift your self-managed databases from databases such as Oracle, SQL Server, MySQL, PostgreSQL, and MariaDB to Amazon Relational Database Service (Amazon RDS). If you need MySQL and PostgreSQL compatibility, Amazon Aurora is optimized to deliver that.
You can use non-relational databases such as MongoDB and Amazon ElastiCache for Redis as document and in-memory databases. Example use cases include content management, personalization, mobile apps, catalogs, and real-time use cases such as caching, gaming leaderboards, and session stores. In most cases, you can migrate workloads and applications to a managed service without needing to re-architect your applications, using the same database skill sets.
Do you need a database built for a specific purpose? As you might have read, the days of the one-size-fits-all monolithic database are behind us. It's now much more common to choose a purpose-built database that is optimized for a particular task or use case.
AWS offers a broad and deep portfolio of purpose-built databases that support diverse data models. With these databases, you can build data-driven, highly scalable, distributed applications. Selecting the right purpose-built database—optimized for what you need to do—will speed development and deployment.
The core of any database choice includes the characteristics of the data that you need to store, retrieve, analyze, and work with. This includes your data model (is it relational, structured or semi-structured, using a highly connected dataset, or time-series?), data access (how do you need to access your data?), the extent to which you need real-time data, and whether there is a particular data record size you have in mind.
Your primary operational considerations are all about where your data is going to live and how it will be managed. The two key choices you need to make are:
- Whether it will be self-hosted or fully managed: The core question here is, where is your team going to provide the most value to the business? If the database is self-hosted, you will be responsible both for the real differentiated value that a database can deliver (through your work on schema design, query construction, and query optimization) and for the day-to-day maintenance, monitoring, and patching of the database. Choosing a fully managed AWS database simplifies your work and allows your team to focus on where it's likely to deliver unique value.
- Whether you need a serverless or provisioned database: Amazon Aurora provides a model for how to think about this choice. Amazon Aurora Serverless v2 is suitable for demanding, highly variable workloads. For example, your database usage might be heavy for a short period of time, followed by long periods of light activity or no activity at all. Some examples are retail, gaming, or sports websites with periodic promotional events, and databases that produce reports when needed. Aurora provisioned clusters are suitable for steady workloads. With provisioned clusters, you choose a DB instance class that has a predefined amount of memory, CPU power, and I/O bandwidth.
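As a rough illustration of the serverless-versus-provisioned choice, the sketch below builds the request parameters for each option as plain Python dictionaries. It assumes you would pass them as keyword arguments to the boto3 RDS client's `create_db_cluster` and `create_db_instance` calls. The cluster identifier, capacity range, engine, and instance class are hypothetical placeholders, credentials and networking settings are omitted, and nothing is sent to AWS.

```python
"""Sketch: Aurora Serverless v2 vs. provisioned parameters.

Assumption: these dictionaries are meant to be passed as keyword arguments
to boto3's RDS client (create_db_cluster / create_db_instance). Identifiers,
capacities, and the instance class are hypothetical placeholders; required
settings such as credentials and networking are left out for brevity.
"""

ENGINE = "aurora-postgresql"  # placeholder; aurora-mysql is the other option


def serverless_v2_params(cluster_id, min_acu=0.5, max_acu=16.0):
    """Return (cluster, instance) parameters for a Serverless v2 setup."""
    if min_acu > max_acu:
        raise ValueError("min_acu must not exceed max_acu")
    cluster = {
        "DBClusterIdentifier": cluster_id,
        "Engine": ENGINE,
        # Capacity scales between these bounds, in Aurora capacity units.
        "ServerlessV2ScalingConfiguration": {
            "MinCapacity": min_acu,
            "MaxCapacity": max_acu,
        },
    }
    instance = {
        "DBInstanceIdentifier": f"{cluster_id}-instance-1",
        "DBClusterIdentifier": cluster_id,
        "Engine": ENGINE,
        # Serverless v2 instances use this special instance class.
        "DBInstanceClass": "db.serverless",
    }
    return cluster, instance


def provisioned_params(cluster_id, instance_class="db.r6g.large"):
    """Return (cluster, instance) parameters for a provisioned setup."""
    cluster = {"DBClusterIdentifier": cluster_id, "Engine": ENGINE}
    instance = {
        "DBInstanceIdentifier": f"{cluster_id}-instance-1",
        "DBClusterIdentifier": cluster_id,
        "Engine": ENGINE,
        # A fixed instance class: predefined memory, CPU, and I/O bandwidth.
        "DBInstanceClass": instance_class,
    }
    return cluster, instance


if __name__ == "__main__":
    for label, builder in (("serverless v2", serverless_v2_params),
                           ("provisioned", provisioned_params)):
        cluster, instance = builder("demo-cluster")
        print(label, "->", instance["DBInstanceClass"], cluster)
```

The only structural difference between the two is the scaling configuration on the cluster and the instance class on the instance, which mirrors the choice described above: a capacity range for variable workloads, or a fixed size for steady ones.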
Database reliability is key for any business. Achieving and maintaining the reliability and resiliency of your database means paying attention to a number of key factors. These factors include capabilities for backup and restore, replication, failover, and point-in-time recovery (PITR).
In addition, support for a globally distributed application/dataset might be important for you, along with Recovery Time Objective (RTO) / Recovery Point Objective (RPO) requirements.
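One way to reason about RPO and RTO requirements is with simple arithmetic. The sketch below uses a deliberately simplified model, which is an assumption of this example and not an AWS formula: worst-case data loss equals the interval between recovery points, and recovery time equals detection time plus restore time. The function names and sample figures are hypothetical.

```python
"""Sketch: checking a backup plan against RPO/RTO targets.

Assumed simplified model (not an AWS formula):
  - worst-case data loss = interval between recovery points
  - recovery time        = time to detect failure + time to restore
"""
from datetime import timedelta


def meets_rpo(recovery_point_interval: timedelta, rpo: timedelta) -> bool:
    """True if the worst-case data loss stays within the RPO."""
    return recovery_point_interval <= rpo


def meets_rto(detection: timedelta, restore: timedelta, rto: timedelta) -> bool:
    """True if detection plus restore time stays within the RTO."""
    return detection + restore <= rto


if __name__ == "__main__":
    rpo = timedelta(minutes=15)
    rto = timedelta(hours=1)

    # Nightly snapshots alone leave up to 24 hours of data at risk.
    print("nightly snapshots meet RPO:", meets_rpo(timedelta(hours=24), rpo))

    # Recovery points every 5 minutes keep the exposure under the target.
    print("5-minute recovery points meet RPO:",
          meets_rpo(timedelta(minutes=5), rpo))

    # 10 minutes to detect plus 40 minutes to restore fits a 1-hour RTO.
    print("restore plan meets RTO:",
          meets_rto(timedelta(minutes=10), timedelta(minutes=40), rto))
```

Real recovery behavior depends on the service, the failure mode, and the data volume, so treat these checks as a starting point for the questions to ask rather than a guarantee.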
Consider whether your workload throughput might exceed the capacity of a single compute node. Then consider your potential need for the database to support a high concurrency of transactions (10,000 or more) and whether it needs to be deployed in multiple geographic regions.
Security is a shared responsibility between AWS and you. The AWS shared responsibility model describes this as security of the cloud and security in the cloud. Specific security considerations include data protection at all levels of your data, authentication, compliance, data security, storage of sensitive data and support for auditing requirements.
Now that you know the criteria by which you will be evaluating your database options, you are ready to choose which AWS database is right for your organizational needs.
This table highlights which databases are optimized for which circumstances and type of data. Use it to help determine the database that is the best fit for your use case.
Amazon RDS provides seven relational database engines to choose from, including Amazon Aurora MySQL-Compatible Edition, Amazon Aurora PostgreSQL-Compatible Edition, MySQL, MariaDB, PostgreSQL, Oracle, and Microsoft SQL Server.
With Amazon RDS on AWS Outposts, you can deploy fully managed database instances in your on-premises environments.
Amazon RDS is a collection of managed services designed to simplify setting up, operating, and scaling databases in the cloud.
Amazon RDS for SQL Server makes it easy to set up, operate, and scale SQL Server deployments in the cloud.
Amazon RDS for Oracle is a fully managed commercial database that makes it easy to set up, operate, and scale Oracle deployments in the cloud.
Amazon RDS for PostgreSQL gives you access to the capabilities of the familiar PostgreSQL database engine.
Amazon RDS makes it easier to set up, operate, and scale MariaDB server deployments in the cloud.
Amazon RDS makes it easier to set up, operate, and scale MySQL deployments in the cloud.
Amazon Aurora with MySQL compatibility
Run and manage databases created in MySQL, but with additional capabilities in the Aurora engine.
Amazon Aurora with PostgreSQL compatibility
Run and manage databases created in PostgreSQL, but with additional capabilities in the Aurora engine.
Amazon Aurora provides built-in security, continuous backups, serverless compute, up to 15 read replicas, automated multi-Region replication, and integrations with other AWS services.
A NoSQL database that stores data as a collection of key-value pairs in which a key serves as a unique identifier.
A performant, flexible, scalable, and serverless NoSQL database that is designed to support key-value and document workloads.
A database you can use for applications that require real-time access to data. By storing data directly in memory, these databases provide microsecond latency to applications for which millisecond latency is not enough.
Choose ElastiCache for Memcached when you need a simple caching solution to improve application performance. Choose ElastiCache for Redis when you need a caching solution to accelerate data access with your existing primary database, but also need richer features such as advanced data structures, replication, and transactions.
Choose MemoryDB when you require an ultra-fast primary database with microsecond read and single-digit millisecond write latency.
A database that you can use to store semi-structured data as JSON-like documents. These databases help developers build and update applications quickly.
Amazon DocumentDB (with MongoDB compatibility)
Use Amazon DocumentDB (with MongoDB compatibility) when you need a fully managed database service to simplify setting up, operating, and scaling MongoDB-compatible databases in the cloud.
Wide Column Database
A type of NoSQL database. It uses tables, rows, and columns. However, unlike a relational database, the names and format of the columns can vary from row to row in the same table.
Use Amazon Keyspaces (for Apache Cassandra) if you need a scalable, highly available, and managed Apache Cassandra–compatible database service that you can use without having to provision, patch, or manage servers—or install, maintain, or operate software.
A database that stores nodes and relationships instead of tables or documents. The connections between the data are considered as important as the data itself.
Choose Neptune if you need a fast, reliable, fully managed graph database service that makes it easy to build and run applications that work with highly connected datasets. The core of Neptune is a purpose-built, high-performance graph database engine.
Time Series Database
A database that is designed to store and retrieve data records that are part of a “time series”. A time series is a set of data points that are associated with timestamps.
Use Amazon Timestream if you need a fast, scalable, fully managed, purpose-built time series database to store and analyze trillions of time series data points per day. It manages the lifecycle of time series data by keeping recent data in memory and moving historical data to a cost-optimized storage tier based upon user defined policies.
A NoSQL database that provides an immutable, transparent, and cryptographically verifiable transaction log owned by a central authority.
Amazon Quantum Ledger Database (QLDB)
Choose Amazon QLDB if you need a fully managed ledger database that provides a transparent, immutable, and cryptographically verifiable transaction log owned by a central trusted authority.
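The mapping described above, from data model to AWS database service, can be summarized as a small lookup. This is a minimal sketch: the dictionary restates the services named in this guide, while the function name and the input normalization are assumptions made for illustration.

```python
"""Sketch: look up candidate AWS database services by data model.

The mapping restates this guide; candidate_services() and its input
normalization are hypothetical helpers for illustration only.
"""

SERVICES_BY_DATA_MODEL = {
    "relational": ["Amazon RDS", "Amazon Aurora"],
    "key-value": ["Amazon DynamoDB"],
    "in-memory": ["Amazon ElastiCache", "Amazon MemoryDB"],
    "document": ["Amazon DocumentDB (with MongoDB compatibility)"],
    "wide column": ["Amazon Keyspaces (for Apache Cassandra)"],
    "graph": ["Amazon Neptune"],
    "time series": ["Amazon Timestream"],
    "ledger": ["Amazon QLDB"],
}


def _normalize(name):
    """Lowercase and treat hyphens, underscores, and spaces alike."""
    return " ".join(name.lower().replace("-", " ").replace("_", " ").split())


# Index keyed by normalized names so "Time-Series" matches "time series".
_INDEX = {_normalize(k): v for k, v in SERVICES_BY_DATA_MODEL.items()}


def candidate_services(data_model):
    """Return the services this guide lists for the given data model."""
    try:
        return list(_INDEX[_normalize(data_model)])
    except KeyError:
        known = ", ".join(sorted(SERVICES_BY_DATA_MODEL))
        raise ValueError(
            f"unknown data model {data_model!r}; expected one of: {known}"
        ) from None


if __name__ == "__main__":
    for model in ("Relational", "key-value", "Time-Series"):
        print(model, "->", candidate_services(model))
```

A lookup like this only narrows the field. The operational, reliability, performance, and security criteria described earlier still decide between the candidates.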
Now that you have learned about the shape of your data, how it fits in your environment and supports your use case, and what each database service is optimized for, you should be able to select which AWS database service or services best fit your organizational needs.
To explore how to use and learn more about your choice, we have provided two sets of pathways to explore how each database works. The first set of pathways provides in-depth documentation, hands-on tutorials, and resources to get started with Amazon Aurora, Amazon DocumentDB, Amazon DynamoDB, Amazon ElastiCache, and Amazon Keyspaces.
Getting started with Amazon Aurora
We outline the basics of getting started with Aurora. This guide includes tutorials and covers more advanced Aurora concepts and procedures, such as the different kinds of endpoints and how to scale Aurora clusters up and down.
Create a high-availability database
Learn how to configure an Amazon Aurora cluster to create a high-availability database. This database consists of compute nodes that are replicated across multiple Availability Zones to provide increased read scalability and failover protection.
Use Amazon Aurora global databases
We help you get started using Aurora global databases. This guide outlines the supported engines and AWS Region availability for Aurora global databases with Aurora MySQL and Aurora PostgreSQL.
Migrate from Amazon RDS for MySQL to Amazon Aurora MySQL
We show you how to migrate any application's database from Amazon RDS for MySQL to Amazon Aurora MySQL with minimal downtime. This tutorial is not within the free tier and will cost you less than $1.
Create a serverless message processing application
We show you how to create a serverless message processing application with Amazon Aurora Serverless (PostgreSQL-compatible edition), Data API for Aurora Serverless, AWS Lambda, and Amazon SNS.
Getting started with Amazon DocumentDB
We help you get started using Amazon DocumentDB in just seven steps. This guide uses AWS Cloud9 to connect and query your cluster using the MongoDB shell directly from the AWS Management Console.
Setting up a document database with Amazon DocumentDB
This tutorial helps you get started connecting to your Amazon DocumentDB cluster from your AWS Cloud9 environment with a MongoDB shell and run a few queries.
Best practices for working with Amazon DocumentDB
Learn best practices for working with Amazon DocumentDB (with MongoDB compatibility), along with the basic operational guidelines when working with it.
Migrate from MongoDB to Amazon DocumentDB
Learn how to migrate an existing self-managed MongoDB database to a fully managed database on Amazon DocumentDB (with MongoDB compatibility).
Assessing MongoDB compatibility
Use the Amazon DocumentDB compatibility tool to help you assess the compatibility of a MongoDB application by using the application’s source code or MongoDB server profile logs.
Getting started with Amazon DynamoDB
We help you get started and learn more about Amazon DynamoDB. This guide includes hands-on tutorials and basic concepts.
Getting started with DynamoDB and the AWS SDKs
We help you get started with Amazon DynamoDB and the AWS SDKs. This guide includes hands-on tutorials that show you how to run code examples in DynamoDB.
Create and Query a NoSQL Table with Amazon DynamoDB
Learn how to create a simple table, add data, scan and query the data, delete data, and delete the table using the Amazon DynamoDB console.
Create an Amazon DynamoDB table
We show you how to create a DynamoDB table and use the table to store and retrieve data. This tutorial uses an online bookstore application as a guiding example.
Documentation for Amazon ElastiCache
Explore the full set of Amazon ElastiCache documentation, including user guides for ElastiCache for Redis and ElastiCache for Memcached, as well as specific AWS CLI and API references.
Getting started with Amazon ElastiCache for Redis
Learn how to create, grant access to, connect to, and delete a Redis (cluster mode disabled) cluster using the Amazon ElastiCache console.
Build a fast session store for an online application
Learn how to use Amazon ElastiCache for Redis as a distributed cache for session management. You will also learn the best practices for configuring your ElastiCache nodes and how to handle the sessions from your application.
Setting up a Redis Cluster for scalability and high availability
Learn how to create and configure a Redis Cluster with ElastiCache for Redis version 7.0 with TLS encryption enabled. With cluster mode enabled, your Redis Cluster gains enhanced scalability and high availability.
Getting started with Amazon Keyspaces (for Apache Cassandra)
This guide is for those who are new to Apache Cassandra and Amazon Keyspaces (for Apache Cassandra). It walks you through installing all the programs and drivers that you need to successfully use Amazon Keyspaces.
Run Apache Cassandra workloads with Amazon Keyspaces
Learn how to create your cluster and build graph models using Property Graph and W3C’s RDF. Learn how to write queries using Apache TinkerPop Gremlin, SPARQL, troubleshoot performance, and integrate with AWS Glue and Elasticsearch.
Beginner course on using Amazon Keyspaces
Learn the benefits, typical use cases, and technical concepts of Amazon Keyspaces. You can try the service through the sample code provided or the interactive tool in the AWS Management Console.
The second set of database service pathways provides in-depth documentation, hands-on tutorials, and resources to get started with Amazon MemoryDB, Amazon Neptune, Amazon QLDB, Amazon RDS, and Amazon Timestream.
Getting started with Amazon MemoryDB
We guide you through the steps to create, grant access to, connect to, and delete a MemoryDB cluster using the MemoryDB Management Console.
Getting started using Amazon MemoryDB
Learn how to simplify your architecture and use MemoryDB as a single, primary database instead of using a low-latency cache in front of a durable database.
Integrating Amazon MemoryDB for Redis with Java-based AWS Lambda
We discuss some of the common use cases for the data store, Amazon MemoryDB for Redis, which is built to provide durability and faster reads and writes.
Getting started with Amazon Neptune
We help you get started using Amazon Neptune, a fully managed graph database service. This guide shows you how to create a Neptune database.
Build a fraud detection service using Amazon Neptune
We walk you through the steps to create a Neptune database, design your data model, and use the database in your application.
Build a recommendation engine with Amazon Neptune
We show you how to build a friend recommendation engine for a multiplayer game application using Amazon Neptune.
Getting started with Amazon QLDB
In Amazon Quantum Ledger Database (Amazon QLDB), the journal is the core of the database. This guide provides a high-level overview of Amazon QLDB service components and how they interact.
Creating your first Amazon QLDB ledger
We guide you through the steps to create your first Amazon QLDB sample ledger and populate it with tables and sample data.
Using an Amazon QLDB driver with an AWS SDK
Learn how to use the Amazon QLDB driver with an AWS SDK to create a QLDB ledger and populate it with sample data. The driver lets your application interact with QLDB using the transactional data API.
Getting started with Amazon RDS
We explain how to create and connect to a DB instance using Amazon RDS. You learn to create a DB instance that uses MariaDB, MySQL, Microsoft SQL Server, Oracle, or PostgreSQL.
Getting started creating a MySQL DB instance
We show you how to create an Amazon RDS MySQL database instance using the AWS Management Console and use standard MySQL utilities such as MySQL Workbench to connect to a database on the DB instance.
Create a web server and an Amazon RDS DB instance
Learn how to install an Apache web server with PHP and create a MySQL database. The web server runs on an Amazon EC2 instance using Amazon Linux, and the MySQL database is a MySQL DB instance.
Create and Connect to a MySQL Database
Learn how to create an environment to run your MySQL database, connect to the database, and delete the DB instance. This tutorial uses Amazon RDS, and everything done in it is Free Tier eligible.
Getting started with Amazon Timestream
We help you get started with Amazon Timestream. This guide provides instructions for setting up a fully functional sample application.
Best practices with Amazon Timestream
We explore best practices, including those relating to data modeling, security, configuration, data ingestion, queries, client applications and supported integrations.
Accessing Amazon Timestream using AWS SDKs
Learn how to access Amazon Timestream using the AWS SDKs in the language of your choice: Java, Go, Python, Node.js, or .NET.
Explore reference architecture diagrams to help you develop, scale, and test your databases on AWS.
Explore whitepapers to help you get started, learn best practices, and migrate your databases.
Explore vetted solutions and architectural guidance for common use cases for databases.