Skip to content

Using AWS DMS to migrate MariaDB to PostgreSQL

To mitigate this problem my team and I decided to use AWS Database Migration Service (AWS DMS) to migrate two tables of the main database (MariaDB) to another database (PostgreSQL). This is a summary of how we achieve that and some personal notes that can help you if you are to do this.

The problem

When it comes to migrating data, it is always possible to lose some of it especially if we need to continuously replicate it until we "switch" the flow of data to the new database.

To mitigate this problem my team and I decided to use AWS Database Migration Service (AWS DMS) to migrate two tables of the main database (MariaDB) to another database (PostgreSQL). This is a summary of how we achieve that and some personal notes that can help you if you are to do this.

Moving tables

We decided that we need to move Table A and Table B, the main reason for moving these two tables was the size, combined these tables occupied over 40 GB of data. This data is not vital for our operations and since there was a project to clean and reduce data from our main database, those tables were the first ones to go.

I will be referring to the databases from now on like this:

  • Source (MariaDB)
  • Target (PostgreSQL)

Important to know

Make sure you have a backup of your source database, and that you comply with necessary requirements for the AWS DMS service to work. Like for example making on the parameter group of the source database the value for the key: binlog_format is ROW. More info can be found here https://docs.aws.amazon.com/dms/latest/userguide/CHAP_GettingStarted.Prerequisites.html

Two different schemas

The first roadblock we faced was that we wanted a different schema on the target database. To solve it we just needed to do some configuring when using AWS DMS. Basically, the changes we did were the following:

  • Replace the name of the source database with the new one
  • Example:
    • MariaDB Schema name: example
    • PostgreSQL Schema name: public

AWS DMS by default ignores all tables of the source so you need to decide which tables are going to be included on each migrations task, so that made it really easy to just select the tables we wanted.

Because of our specific use case and to better control the migration we decided to do 2 migrations tasks, one per each table.

Selecting columns

As part of this migration we needed to only select specific columns of table A, luckily doing this is really simple in the setup process of the DMS migration task.

First test

AWS DMS allows you to set up the migration task in one of these 3 options:

  • Migration of existing data (Full load)
  • The application of cached changes
  • Ongoing replication (Change Data Capture)

For the first test, we decided to use the first option.

Proper permissions

After we launched the first test, the process was very smooth. AWS DMS migrated all the data of one table in around 20 min approx. But when we inspected the data on the target database we saw a few things that were not correct. For example, the owner of the table was not the user we expected to be and the autoincrement values of the table were not set.

The first one is our mistake, we set up the endpoints wrong. So be very careful when creating the endpoints on AWS DMS, use the proper credentials.

The second one was a problem with the schema itself, it looks like when you are migrating from MariaDB to PostgreSQL the incremental sequence of the table is not being properly translated, so you need to create an incremental sequence (starting on the greatest number of your PK)

Conclusions and considerations

AWS DMS was really simple and easy to use, once we fixed the endpoints and learned about the incremental sequence, we were able to create two migration tasks and move all the data from the source to the target database.

The entire process took less than 40 minutes, we used the Ongoing replication option as our code base will be updated as soon as we moved the old data first.

Once the migration task displayed that was only moving data that was on the buffer, we applied the changes on our code base to write directly into the target database.

After the buffer of the DMS was finished and it showed no replication activity we shut it down and deleted the replication instance.

So in conclusion, you just need to be careful of the following:

  • Set up endpoints with the proper credentials
  • One table per migration task
  • Be sure to validate the data on the target database.