Recently I migrated a small but non-trivial (25 tables, about 500 columns, no triggers or SPs) MySQL schema to Postgresql using the Open Source pgloader utility.
pgloader supports migration from several databases/formats (MySQL, Sqlite, MS SQL, dBase, CSV) to Postgresql.
pgloader is fast: it took about 20 seconds to migrate/load both the schema and data into Postgresql 9.2 on my Mac notebook running in Virtualbox. The total number of rows was about 3 million narrow rows.
The impressive speed of pgloader lets you iterate quickly when writing pgloader rules, and lets you embed the loading process into automated scripts without any concern about performance.
pgloader is a free scriptable ETL tool that lets you quickly migrate database schemas and data to Postgresql. It is helpful for setting up a PoC to estimate the work required for a full migration analysis. (It does not do SQL syntax or application code migration.) Although pgloader is excellent at what it does, any database migration is a major project.
Installation on CentOS 7
yum install freetds postgresql createdb mydb # unpack and run the build script from github or use the Docker image make pgloader pgloader mysql://root@localhost/mydb postgresql:///mydb
Installation on Debian or Ubuntu
apt-get install pgloader postgresql createdb mydb pgloader mysql://root@localhost/mydb postgresql:///mydb
– 100% scriptable and customizable with pgloader rules (called a “load file”)
– reasonable result for a first pass even without custom pgloader rules
– “pro DBA” feeling – accepts configuration file, and emits log files, similar to the behavior of Oracle’s BCP utility.
– not included with Postgres
– unusual in that it is written in Lisp, but that is not user-visible
– doesn’t create Postgresql users/roles (to be expected as the SQL standard doesn’t specify these, thus they vary greatly across databases)
– doesn’t convert MySQL partitions to Postgresql inheritance-syntax partitions in 8.x or 9.x, but should work with declarative-syntax partitions in 10.x+
– detects and double-quotes reserved object names, but doesn’t notify of schema name conflicts with the Postgresql public namespace ie. table name ‘user’
– will require time-consuming schema cleanup for most use cases. See below.
Schema/Data Cleanup Notes
- MySQL timestamps do not display the tz, but by default pg shows … “+00” unless you specify “timestamp without time zone”, or you use EXTRACT() or TO_CHAR(). Read about related pgloader rules here.
- MySQL text and varchar columns can be migrated with varying results to pg text and bytea types
- treat the table name ‘user’ as reserved since there is a public.user symbol that overrides the search path
- don’t underestimate how long schema cleanup will take. Although pgloader runs quickly, Postgresql does not do casting automatically, so is extremely sensitive to application SQL statements
- MySQL and Postgresql have different models for representing users/roles and timezones that need to be dealt with sooner than later. Here is some advice on timezone setting: Adding timezone to naive datetime fields from MySQL #331
- migrating applications from MySQL to Postgresql is easier with Postgresql 9.5 since it has INSERT … ON CONFLICT DO UPDATE (UPSERT.)
- index names are table-specific in MySQL, but schema-wide in Postgresql, so by default pgloader names them idx_NNNNN_name to uniqify them. If you need to use named indexes, like with ON CONFLICT ON CONSTRAINT in 9.5, then you either need to uniqify the index names in MySQL and do pgloader –with “preserve index names”, or add a pgloader rule to do CREATE INDEX your_index_name.
My Migration Results
After renaming the table ‘user’ to ‘users’ and altering the sessions table (see above), around half of the SELECT queries worked as-is and I could login to the application and click around. 🙂
However, virtually all of the INSERT and UPDATE statements had to be rewritten, taking 2 man days to get to an alpha version and 2 weeks for something worth doing formal QA.
Database Migration Alternatives
Amazon AWS provides 3 powerful data migration tools under the AWS Database Migration Service banner that are either free to use, or have a 6 month trial.
pgloader: Homepage, github, Manual, Licence
http://rhaas.blogspot.com: The State of VACUUM
severalnines.com: Upgrading Your Database to PostgreSQL Version 10 – What You Should Know