Migrate from MongoDB to Amazon DocumentDB
TUTORIAL
In this lesson, you migrate a self-managed MongoDB database to a fully managed database on Amazon DocumentDB (with MongoDB compatibility). First, you learn the benefits of using Amazon DocumentDB as your nonrelational database. Then you work through the steps to migrate an existing MongoDB database to Amazon DocumentDB. At the end of this lesson, you should feel confident in your ability to migrate an existing database to Amazon DocumentDB.
Time to complete: 30–45 minutes
Amazon DocumentDB is a fully managed document database service. It is compatible with the open-source MongoDB 3.6 API and enables you to easily migrate from a self-managed database.
With Amazon DocumentDB, your database is managed by Amazon Web Services (AWS), leaving your team free to focus on innovation. Amazon DocumentDB handles cluster scaling, instance failover, data backups, and software updates. Rely on the efficiencies of the AWS Cloud to use a faster, cheaper, and more reliable database option.
In this lesson, you learn how to migrate a self-managed MongoDB database to a fully managed database on Amazon DocumentDB. The lesson is divided into five modules.
1. Create an Amazon DocumentDB database cluster
In this module, you create an Amazon DocumentDB database cluster. This cluster will serve as your primary database after you copy data to it using AWS Database Migration Service (AWS DMS).
To get started, navigate to the DocumentDB console. On the Clusters page, choose Create to create a new cluster.
This launches the cluster creation wizard. In the Configuration section, give your cluster a name. Then choose the instance class and number of instances to create for your cluster. The instance class defines how many virtual CPUs (vCPUs) and how much RAM your instances have available. The right choice for instance class depends on many factors, including the amount of working set memory your databases require. Choosing an instance class that is close to the resources of your existing MongoDB deployment is a good place to start.
An Amazon DocumentDB cluster requires at least one instance, which serves as the primary instance. The primary instance is responsible for writing data to your cluster and can also serve read requests. Additional instances act as replicas, which can handle read requests and serve as failover targets. It is a best practice to deploy production Amazon DocumentDB clusters with at least three instances (which the service automatically deploys across three Availability Zones), so you can leave the default selection of three instances for this exercise.
With your current self-managed MongoDB database, you may have used sharding to split your data across different shards. MongoDB sharding is often used to scale storage as your data size grows. Amazon DocumentDB uses a different mechanism for scaling your storage, so you don't need to worry about sharding your data. You can migrate an existing sharded MongoDB database cluster to an Amazon DocumentDB cluster by using AWS DMS.
In the Authentication section, enter the master username and password for your Amazon DocumentDB cluster. Make sure you save this information so that you can authenticate to your database cluster.
You need to configure a few more settings for your database cluster, so choose Show advanced settings near the bottom of the wizard.
In the advanced settings, there is a Network settings section. Choose the Amazon Virtual Private Cloud (Amazon VPC) subnet group and security group for your Amazon DocumentDB cluster.
If you are migrating from a self-managed MongoDB database on Amazon Elastic Compute Cloud (Amazon EC2), you can use the same Amazon VPC and security groups as your existing MongoDB deployment.
If you are migrating from a database that is not hosted on AWS, but your application is hosted on AWS, choose the same Amazon VPC that is used for your application. Then choose a security group for your database instance.
If you don't have a security group for your instance, you can create one. Navigate to the Security Groups section of the Amazon EC2 console. Choose Create security group to create a new security group.
Give your security group a name and description, and then choose the VPC to which you want your security group to belong.
You will configure the rules for this security group in the next module. For now, you can choose Create security group to create your security group.
After you have configured the network settings for your Amazon DocumentDB database cluster, choose Create cluster to create your Amazon DocumentDB database cluster.
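The console choices above map onto a small set of API parameters. The following is a minimal sketch, assuming you want to script the same cluster with the boto3 `docdb` client instead of the wizard. The function names, the `db.r5.large` instance class, and the instance naming scheme are illustrative assumptions, not values from this lesson.

```python
def build_cluster_requests(cluster_id, username, password, subnet_group,
                           security_group_ids, instance_class="db.r5.large",
                           instance_count=3):
    """Return (cluster_kwargs, [instance_kwargs, ...]) for the DocumentDB API.

    instance_class is a placeholder; pick one close to your MongoDB hosts.
    """
    if instance_count < 1:
        raise ValueError("a cluster needs at least one instance")
    cluster_kwargs = {
        "DBClusterIdentifier": cluster_id,
        "Engine": "docdb",
        "MasterUsername": username,
        "MasterUserPassword": password,
        "DBSubnetGroupName": subnet_group,
        "VpcSecurityGroupIds": list(security_group_ids),
    }
    # One request per instance; the "<cluster>-<n>" naming is hypothetical.
    instance_kwargs = [
        {
            "DBInstanceIdentifier": f"{cluster_id}-{number}",
            "DBInstanceClass": instance_class,
            "Engine": "docdb",
            "DBClusterIdentifier": cluster_id,
        }
        for number in range(1, instance_count + 1)
    ]
    return cluster_kwargs, instance_kwargs


def create_cluster(cluster_kwargs, instance_kwargs):
    """Send the requests to AWS (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("docdb")
    client.create_db_cluster(**cluster_kwargs)
    for kwargs in instance_kwargs:
        client.create_db_instance(**kwargs)
```

Keeping the request builder separate from the call that reaches AWS makes it possible to review the parameters before anything is provisioned.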
Amazon DocumentDB begins provisioning your database cluster. Your cluster should now show a Status of creating.
When your Amazon DocumentDB cluster’s storage volume is ready for use, its status is available.
Choose Instances in the navigation pane, and the status of your cluster’s instances is displayed.
When your cluster’s Writer instance shows a status of available, you are ready to move to the next module (the Reader instances may be available as well).
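If you would rather not refresh the console, the same check can be polled from a script. This is a rough sketch: `wait_for_status` and `instance_status` are hypothetical helper names, and the interval and timeout defaults are arbitrary. The boto3 call used is `describe_db_instances` on the `docdb` client.

```python
import time


def wait_for_status(get_status, wanted="available", interval=30.0,
                    timeout=1800.0, sleep=time.sleep):
    """Poll get_status() until it returns `wanted`.

    Returns True on success and False once `timeout` seconds of waiting
    have been used up.
    """
    waited = 0.0
    while True:
        if get_status() == wanted:
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval


def instance_status(instance_id):
    """Look up one instance's status (needs boto3 and credentials)."""
    import boto3  # imported lazily so wait_for_status works without boto3

    client = boto3.client("docdb")
    response = client.describe_db_instances(DBInstanceIdentifier=instance_id)
    return response["DBInstances"][0]["DBInstanceStatus"]
```

A call such as `wait_for_status(lambda: instance_status("demo-1"))` would block until the instance reports available or the timeout passes.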
In this module, you created a fully managed, production-ready Amazon DocumentDB database cluster with MongoDB compatibility. In the next module, you will create a replication instance in AWS Database Migration Service (AWS DMS).
2. Create a replication instance in AWS Database Migration Service (AWS DMS)
In this module, you create a replication instance in AWS DMS.
AWS DMS can migrate data from several different database sources to several AWS managed database services. AWS DMS uses an Amazon EC2 instance referred to as a replication instance to host replication tasks, which are the processes responsible for migrating data. You will set up a replication task on this replication instance in the next module.
To create a replication instance, go to the Replication Instances section of the AWS DMS console. Choose Create replication instance to begin the replication instance creation wizard.
In the Replication instance configuration boxes, enter the name and description of your replication instance. Then choose your instance class. The instance class you use depends on the size of your existing database and the amount of data flowing through it.
Then choose an engine version for AWS DMS. Finally, choose the amount of allocated storage for your replication instance.
As you continue in the Replication instance configuration section, choose a VPC for your replication instance. Choose the same VPC in which you provisioned your Amazon DocumentDB database.
You may choose to have a Multi-AZ setup for your replication instance for redundancy. If you will be using AWS DMS to keep two databases synchronized over a long period of time, a Multi-AZ configuration is preferred. If you are performing a one-time migration of your data from an existing database to Amazon DocumentDB, you may choose to forgo a Multi-AZ setup to reduce costs.
Finally, choose whether your replication instance should be publicly accessible. If your existing database is in the same VPC as your Amazon DocumentDB cluster and your replication instance, your replication instance should not be publicly accessible. If your source database resides outside the Amazon DocumentDB cluster’s VPC, you need your replication instance to be publicly accessible.
Next, open the Advanced security and network configuration section. For the VPC security group(s) configuration, choose the same security group that you attached to your Amazon DocumentDB database. This allows your replication instance to access your Amazon DocumentDB database.
You may also edit the maintenance and tags settings.
When you're ready, choose Create to create your replication instance in AWS DMS.
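The Multi-AZ and public-access decisions described above can be expressed as two booleans. The following sketch assumes you script the replication instance with the boto3 `dms` client. The helper names and the example values (`dms.t3.medium`, 50 GB) are assumptions for illustration.

```python
def build_replication_instance_request(identifier, instance_class, storage_gb,
                                       security_group_ids, long_running_sync,
                                       source_in_same_vpc):
    """Build kwargs for the DMS create_replication_instance call.

    long_running_sync: True when DMS will keep two databases synchronized
        for a long period, which is when Multi-AZ is preferred.
    source_in_same_vpc: True when the source database shares the VPC with
        the cluster and the replication instance, so no public access is needed.
    """
    return {
        "ReplicationInstanceIdentifier": identifier,
        "ReplicationInstanceClass": instance_class,
        "AllocatedStorage": storage_gb,
        "VpcSecurityGroupIds": list(security_group_ids),
        "MultiAZ": long_running_sync,
        "PubliclyAccessible": not source_in_same_vpc,
    }


def create_replication_instance(request):
    """Send the request to AWS (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    return boto3.client("dms").create_replication_instance(**request)
```

For a one-time migration from a database in the same VPC, both flags come out as False, which matches the lower-cost setup described above.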
After you choose Create, AWS provisions your replication instance. The replication instance shows a Status of Creating while AWS provisions and initializes it.
When your replication instance is ready to go, its Status is Available.
While you are waiting for your replication instance to be available, go to the Security Groups section in the Amazon EC2 console. You need to add a rule to your security group to allow your replication instance to access your database.
In the Security Groups section, find the security group you attached to your Amazon DocumentDB database instance and your replication instance, and choose it.
Choose Edit inbound rules for your security group.
Add an inbound rule that allows TCP traffic on port 27017, the default Amazon DocumentDB port, from this same security group.
Your screen should look as shown in the following screenshot.
Choose Save rules to save the updated rule for your security group.
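The inbound rule is self-referencing: the security group allows traffic from its own members, which covers both the cluster and the replication instance. A minimal sketch of the equivalent request for the boto3 `ec2` client follows. The helper names are hypothetical, and the group ID is whatever your own security group uses.

```python
def build_ingress_rule(security_group_id, port=27017):
    """Build kwargs for authorize_security_group_ingress.

    The rule allows TCP traffic on `port` from members of the same group.
    """
    return {
        "GroupId": security_group_id,
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": port,
                "ToPort": port,
                # Source is the group itself rather than a CIDR range.
                "UserIdGroupPairs": [{"GroupId": security_group_id}],
            }
        ],
    }


def add_ingress_rule(rule):
    """Apply the rule in AWS (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    return boto3.client("ec2").authorize_security_group_ingress(**rule)
```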
When your replication instance is available and you have updated the rules for your security group, you can move on to the next module.
In this module, you created a replication instance in AWS DMS. The replication instance is used to host the replication tasks that migrate data from an existing database to a fully managed database with Amazon DocumentDB. You also updated a security group to enable access from your replication instance to your Amazon DocumentDB database cluster.
In the next module, you will create endpoints for your source and target databases.
3. Create endpoints in AWS DMS
In this module, you create source and target endpoints for a replication task in AWS DMS.
A replication task is a job to migrate data from one database to another by using AWS DMS. Before creating a replication task, you must register endpoints for your source and target databases. An endpoint describes the connection address, credentials, and other information required to connect to a database.
First, create the endpoint for your target database. This is the database you created in Amazon DocumentDB. Navigate to the Endpoints section of the AWS DMS console. Choose Create endpoint to create a new endpoint.
In the endpoint creation wizard, choose Target endpoint.
In the Endpoint configuration section, enter the details from your Amazon DocumentDB database cluster. You can find the connectivity information for your Amazon DocumentDB cluster from the Clusters section of the Amazon DocumentDB console.
If Transport Layer Security (TLS) is enabled on your Amazon DocumentDB database cluster, which is the default setting, you need to set the Secure Sockets Layer (SSL) mode to verify-full. Then, download the rds-combined-ca-bundle file. Upload the file you downloaded by choosing Add new CA certificate. For more information about this file, see Encrypting Data in Transit.
The Endpoint configuration section should look similar to the following screenshot.
Before you save your endpoint, test the connection to ensure that it was configured correctly. To test it, open the Test endpoint connection section.
Choose the replication instance you want to use, and then select Run test. After a few seconds, you should see a Status of successful. This indicates that you configured your security group and endpoint correctly. To save your endpoint, select Create endpoint.
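For reference, the target endpoint settings above could be scripted roughly as follows with the boto3 `dms` client. This is a sketch under assumptions: the helper names are hypothetical, and `certificate_arn` stands for a CA certificate you have already uploaded to AWS DMS. When no certificate is given, the sketch falls back to an SSL mode of none, which is only appropriate if TLS is disabled on the cluster.

```python
def build_target_endpoint_request(identifier, server_name, username, password,
                                  database_name, port=27017,
                                  certificate_arn=None):
    """Build kwargs for the DMS create_endpoint call for a DocumentDB target."""
    request = {
        "EndpointIdentifier": identifier,
        "EndpointType": "target",
        "EngineName": "docdb",
        "ServerName": server_name,
        "Port": port,
        "Username": username,
        "Password": password,
        "DatabaseName": database_name,
        "SslMode": "verify-full" if certificate_arn else "none",
    }
    if certificate_arn:
        request["CertificateArn"] = certificate_arn
    return request


def create_and_test_endpoint(request, replication_instance_arn):
    """Create the endpoint and start a connection test (needs boto3)."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("dms")
    endpoint_arn = client.create_endpoint(**request)["Endpoint"]["EndpointArn"]
    # The test runs asynchronously; its result shows up on the endpoint later.
    client.test_connection(ReplicationInstanceArn=replication_instance_arn,
                           EndpointArn=endpoint_arn)
    return endpoint_arn
```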
Follow these same steps to create an endpoint for your source database. Unlike with the target endpoint, you need to enter the connection address, port, and credentials for the source database yourself.
You also need to ensure that your replication instance has network access to your source database. If your source database is hosted on Amazon EC2, allow traffic from your replication instance security group into the source database security group. If your source database is not hosted on Amazon EC2, you need to handle the network settings according to the location of your source database.
Before moving on to the next module, you should have two endpoints configured: one for your source database and one for your target database. Make sure that you have tested both endpoints and can successfully connect to both databases. Then move on to the next module.
In this module, you created your endpoints to connect to your databases. In the next module, you will use those endpoints to create a replication task that copies data from your source database to your target database.
4. Create a replication task in AWS DMS
In this module, you create a replication task in AWS DMS.
A replication task is responsible for migrating data from your source database to your target database. In your case, you are moving data from an existing database to a database in your newly created Amazon DocumentDB cluster.
To get started, navigate to the Database migration tasks section of the AWS DMS console. Choose Create task to create a new replication task.
In the Task configuration section, set up the parameters of your replication task. Give your task a name and choose the replication instance you created in an earlier module. Then choose the source endpoint for your existing MongoDB database and your target endpoint for your fully managed database in Amazon DocumentDB.
You need to choose a migration type. There are two migration types:
1. Migrate existing data, which performs a one-time process to copy data from your source database to your target database.
2. Replicate ongoing changes, which copies all ongoing operations from your source database to your target database.
If you are migrating your application from using a self-managed database to using a fully managed database, you want to use both types. The first type copies all data in your database, and the second type ensures that all additional updates are replicated to your new database until you switch your application to use the new database.
For the migration type, choose Migrate existing data and replicate ongoing changes.
In the Table mappings section, configure which database tables to copy. Enter the names of the schemas and tables you want to copy. When working with MongoDB, the schema property refers to the MongoDB databases to copy, and the table property refers to the MongoDB collections you want to copy. You can use % as a wildcard character to copy multiple schemas.
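Behind the console form, table mappings are a JSON document of selection rules. The sketch below builds that document from (database, collection) pairs. `selection_rules` is a hypothetical helper name, and the database and collection names in the usage example are made up.

```python
import json


def selection_rules(pairs):
    """Return a DMS table-mappings JSON string that includes each pair.

    Each pair is (schema, table), meaning (MongoDB database, collection).
    "%" works as a wildcard in either position.
    """
    rules = []
    for index, (schema, table) in enumerate(pairs, start=1):
        rules.append({
            "rule-type": "selection",
            "rule-id": str(index),
            "rule-name": str(index),
            "object-locator": {"schema-name": schema, "table-name": table},
            "rule-action": "include",
        })
    return json.dumps({"rules": rules}, indent=2)


# Example: every collection in "shop", plus one collection from "audit".
print(selection_rules([("shop", "%"), ("audit", "events")]))
```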
When you are ready, choose Create task to start your migration task.
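In the AWS DMS API, the migration types from this module correspond to the values `full-load`, `cdc`, and `full-load-and-cdc`. The following sketch shows how a task request might be assembled for the boto3 `dms` client. The helper names are hypothetical, and the ARNs and table mappings are supplied by you.

```python
# Console wording (as used in this lesson) mapped to API values.
MIGRATION_TYPES = {
    "Migrate existing data": "full-load",
    "Replicate ongoing changes": "cdc",
    "Migrate existing data and replicate ongoing changes": "full-load-and-cdc",
}


def build_task_request(identifier, source_endpoint_arn, target_endpoint_arn,
                       replication_instance_arn, table_mappings_json,
                       migration="Migrate existing data and replicate ongoing changes"):
    """Build kwargs for the DMS create_replication_task call."""
    return {
        "ReplicationTaskIdentifier": identifier,
        "SourceEndpointArn": source_endpoint_arn,
        "TargetEndpointArn": target_endpoint_arn,
        "ReplicationInstanceArn": replication_instance_arn,
        "MigrationType": MIGRATION_TYPES[migration],
        "TableMappings": table_mappings_json,  # JSON string, not a dict
    }


def create_task(request):
    """Create the task in AWS (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    return boto3.client("dms").create_replication_task(**request)
```

Through the API, creating a task does not run it. A separate `start_replication_task` call starts the task once it is ready.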
After you create your task, it appears in the Database migration tasks section with a Status of Creating.
After the task is initialized, its Status is Starting.
After the migration of existing data is complete, the task shows a Status of Load complete, replication ongoing. Any updates to your source database at this point are copied to your target database.
In this module, you created a replication task in AWS DMS to migrate your existing data and sync ongoing changes from your previous database to your new database in Amazon DocumentDB.
In the next module, you complete the migration and clean up the resources you created.
5. Complete the migration and clean up resources
If you followed all the steps in this lesson, you have created a new, fully managed Amazon DocumentDB cluster, and created a migration task to copy data from your source database to your new cluster. In this final module, you learn the steps to complete your migration as well as how to clean up your AWS DMS resources.
When your initial migration is complete and all data is synced to your new database, you are ready to use your new database as your primary database.
You can handle the switch in two ways:
- If you feel confident about your migration, you can change the database configuration in your application to use your new database. This ensures all reads and writes go to your new database.
- If you want to follow a more cautious approach, you can read from and write to both databases for a period of time. This allows you to compare the results from each database for accuracy while still maintaining the correct data in your existing database.
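For the first option, the configuration change is often just a new connection string. The sketch below assembles one for a TLS-enabled cluster. The `docdb_uri` helper and the `.pem` file name are assumptions, and the query options shown are those commonly used with Amazon DocumentDB, so check them against your driver before relying on them.

```python
from urllib.parse import quote_plus


def docdb_uri(host, username, password, port=27017,
              ca_file="rds-combined-ca-bundle.pem"):
    """Build a MongoDB connection string for a TLS-enabled cluster.

    ca_file is a hypothetical local path to the downloaded CA bundle.
    """
    # Credentials are percent-encoded so characters like "@" stay valid.
    user = quote_plus(username)
    secret = quote_plus(password)
    options = "&".join([
        "tls=true",
        f"tlsCAFile={ca_file}",
        "replicaSet=rs0",
        "readPreference=secondaryPreferred",
        "retryWrites=false",
    ])
    return f"mongodb://{user}:{secret}@{host}:{port}/?{options}"


print(docdb_uri("demo.cluster.example", "appuser", "p@ss"))
```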
Whichever method you choose, you should thoroughly test your new database before making it your primary database. After you have switched to using your new database as your primary database and are confident in the results, you may want to delete your AWS DMS infrastructure.
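One simple way to test the new database is to compare the documents returned by both databases for the same query. The following sketch works on plain lists of documents, so it does not depend on any particular driver. `compare_collections` is a hypothetical helper, and it assumes every document has an `_id` whose values can be sorted.

```python
def compare_collections(source_docs, target_docs):
    """Report documents that differ between two result sets, keyed by _id."""
    source = {doc["_id"]: doc for doc in source_docs}
    target = {doc["_id"]: doc for doc in target_docs}
    shared = source.keys() & target.keys()
    return {
        "missing_in_target": sorted(source.keys() - target.keys()),
        "extra_in_target": sorted(target.keys() - source.keys()),
        "different": sorted(key for key in shared if source[key] != target[key]),
    }


# Example with in-memory data standing in for query results.
old = [{"_id": 1, "name": "a"}, {"_id": 2, "name": "b"}, {"_id": 3, "name": "c"}]
new = [{"_id": 1, "name": "a"}, {"_id": 2, "name": "B"}, {"_id": 4, "name": "d"}]
print(compare_collections(old, new))
```

For large collections, comparing a sample or per-collection counts first is usually more practical than loading every document into memory.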
First, stop and delete the database migration task that replicates your data. Navigate to the Database migration tasks section of the AWS DMS console. Choose the task you want to remove, and then select Stop.
It takes a few moments to stop the task. When it is stopped, choose it again, and then select Delete.
Next, navigate to the Endpoints section of the AWS DMS console. Choose both your source endpoint and your target endpoint, and then choose Delete.
Then go to the Replication instances section of the AWS DMS console. If your replication instance is not being used for any other replication tasks, choose it and then choose Delete.
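The cleanup order matters: the task has to be stopped before it can be deleted, and it has to be gone before the endpoints and replication instance it uses are removed. A sketch of that sequence with the boto3 `dms` client follows. `clean_up_dms` is a hypothetical helper, the ARNs are yours to supply, and the waiter names are the ones I believe boto3 provides for AWS DMS, so verify them against your boto3 version.

```python
def clean_up_dms(client, task_arn, endpoint_arns, instance_arn):
    """Stop and delete a task, then its endpoints and replication instance.

    `client` is expected to behave like boto3.client("dms"). Only pass
    instance_arn if no other task still uses that replication instance.
    """
    task_filter = [{"Name": "replication-task-arn", "Values": [task_arn]}]

    # Stopping is asynchronous, so wait before asking for deletion.
    client.stop_replication_task(ReplicationTaskArn=task_arn)
    client.get_waiter("replication_task_stopped").wait(Filters=task_filter)

    client.delete_replication_task(ReplicationTaskArn=task_arn)
    client.get_waiter("replication_task_deleted").wait(Filters=task_filter)

    for endpoint_arn in endpoint_arns:
        client.delete_endpoint(EndpointArn=endpoint_arn)

    client.delete_replication_instance(ReplicationInstanceArn=instance_arn)
```

With real credentials, this would be called as `clean_up_dms(boto3.client("dms"), task_arn, [source_arn, target_arn], instance_arn)`.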
Finally, you may want to terminate your source database because it is no longer being used. If your source database is running on Amazon EC2, you may terminate the Amazon EC2 instance. If your source database is running elsewhere, follow the proper procedures to terminate it.
In this module, you learned how to migrate your application to use your new database. You also learned how to clean up AWS DMS resources when you are done using them.
In this lesson, you migrated an existing MongoDB database to a fully managed document database in Amazon DocumentDB using AWS DMS. By using Amazon DocumentDB, you can free your developers to focus on innovation that is core to your business. And by using AWS DMS, you can automate the delicate task of migrating data to a new database.