MySQL 4.x to Oracle 10g Migration Notes
About once a year I need to migrate a medium-sized web application using MySQL to Oracle.
Usually it’s to prepare a successful intranet application for a move to a formally supported production environment - and in Silicon Valley, that usually means Oracle.
I was apologizing to Monty at OSCON this year about my latest migration project away from MySQL.
His surprising response was, “It’s not a bad thing. I don’t mind hearing about conversions because successful migrations demonstrate that people don’t have to worry about database lock-in.”
Here’s some notes on doing migrations.
Converting MySQL 4.x Apps to Oracle 10g
Conceptual Issues
- Skills: Oracle has a steep learning curve for programmers unfamiliar with database transaction programming, so you should have at least one programmer who has worked with Oracle in the past and can write stored procedures. You will also need an experienced Oracle DBA for non-trivial projects.
- Portability: decide if you want to maintain dual database support or not. It’s not difficult to do since 90% of SQL queries work the same, and it’s unlikely you’ve done anything tricky like stored procedures or views in MySQL since those are new features in 5.0. It’s nice to have dual-database support if your developers use notebook computers (MySQL is fairly light-weight), or if you want to market the software later (for example, most ISP hosting plans support MySQL only.)
- Autocommit: decide if you want to use autocommit or not with Oracle. Usually not.
- Performance: MySQL has a very limited query optimizer, resulting often in slow table scans. Oracle however has a great query optimizer and can optimize queries even with multiple subselects and OR clauses.
If you do mass updates, consider checkpointing them (loop and commit) to avoid filling the Oracle redo log. A MyISAM UPDATE statement that touches all records can fill up the log in Oracle if not sized correctly. For the sames reason, Oracle TRUNCATE is much more efficient than doing a DELETE * FROM TABLE.
MySQL has table types that have specialized features and performance, such as MyISAM/FullText, Innodb/Transactions, Heap, Blackhole, etc.
Oracle has 3 storage engines (heap/parallel query, index-organized, external) with features on top of that. Heap is the default storage engine, index-organized is a B-tree optimized for compactness and quick access that can be several times faster for index and range queries than heap, often used for data warehousing.
- MySQL supports database names, but Oracle is quite different … it uses one database but optionally supports multiple schemas based on userid
- Schema: converting the database schema and data migration will likely be more difficult than doing the application source code changes. There are tools to help. Spectral Core sells Full Convert. Same with regression testing. MySQL can be accomodating about blank vs. null. vs 0000-00-00. Oracle is not, so loading data can be touchy.
- Timezone: MySQL-based applications often use whatever timezone the MySQL server uses. Oracle recommends for performance reasons setting the database to UTC (+0000). Beginning with MySQL 4.1.3 and Oracle 9i, both databases have similar tz features, as both are based on the Olson timezone database. To do timezone conversion, MySQL uses CONVERT_TZ and Oracle uses FROM_TZ. Note that in MySQL 5.0, NOW() is a session timestamp (computed once so replication-safe), while SYSDATE() is a real-time timestamp and would be a different value on the slave.
mysql> SELECT @@global.time_zone, @@session.time_zone; # show MySQL tz settings
SYSTEM SYSTEMmysql> select * from time_zone_name; # see if tz database is loaded yet
Name | Time_zone_id
Africa/Abidjan | 1
Africa/Accra | 2
Africa/Addis_Ababa | 3sqlplus> SELECT * FROM v_$timezone_names;
- Character Set: MySQL-based applications often use whatever character set that MySQL defaults to. When moving to Oracle, you likely want to add explicit support because production Oracle instances are usually set to AL32UTF8 these days (Oracle’s “UTF8″ character set is actually Unicode 2.0 from 8i days). You can see which character set a database or column is set to with:
select * from v$nls_parameters where parameter in ('NLS_LANGUAGE',
'NLS_TERRITORY', 'NLS_CHARACTERSET');
select dump(mycolumn,1017) from mytable where rownum=1;
- Case-sensitivity: MySQL generally does case-insensitive string comparisons if you don’t use the BINARY keyword, but Oracle is case-sensitive. MySQL database and table names are case-sensitive on Unix (but not Windows or Mac OS X HPFS+) because databases and tables are actually directories and files, but Oracle silently upper-cases them and appears case-insensitive.
- Sequence Numbers: Oracle sequence numbers are not guaranteed to be sequential. Values are “lost” in a rollback, and will most likely be “lost” if cached sequence numbers are specified and there is a shutdown or panic or library age-out. Non-cached sequence numbers can cause a noticeable performance impact (a disk access), which is why the default is to cache 20 values per allocation. 1000 is commonly used for bulk loading.
Some Oracle sequence factoids: sequences never rollback after being incremented, they can be non-numeric, and you can use multiple sequences per table in Oracle. (Oracle sequences are actually separate objects from tables.)
To use sequences in Oracle, you can either specify NAME_OF_SEQ.NEXTVAL followed on the same $dbh with NAME_OF_SEQ.CURRVAL in 2 statements, or combine both with an INSERT … RETURNING … INTO statement.
You can emulate MySQL’s autoincrement feature with an Oracle sequence and a trigger. This is documented in the blog posting How to Create Auto Increment Columns in Oracle.
When migrating from MySQL to Oracle, you may want to consider dropping useless surrogate keys altogether, reducing the need for application code changes or creating sequences. - Trailing spaces: MySQL and Oracle handle trailing spaces in columns differently when doing string comparisons. Oracle’s NCHAR preserves trailing spaces, and NVARCHAR2 does not.
- Performance: MySQL is a lightweight database that usually performs well with little planning. Updating a couple of rows and doing a select from a MyISAM table may take milliseconds in MySQL, but one second calendar time in Oracle without prior thought to using batch inserts in a single transaction, or batch sequence numbers.
- NULL: In MySQL, the empty string may be inserted into a column and is not a NULL value. Oracle converts the empty string to NULL.
- MySQL silently truncates input data when too wide for a column, but Oracle considers the column width to be a constraint and fails the insert or update.
- It is smart to quickly migrate a representative sample of your migrated code with the converted Oracle schema to notice any surprise problems.
SQL Syntax Issues
- the ANSI join syntax works in 10g, so that makes porting much easier than in the past. (The Oracle DBAs I have talked to said that the new features in 9i were not ready for prime-time.)
- MySQL allows GROUP BY on any column. Oracle only allows GROUP BY on all of the columns in your query result set.
- MySQL unix_timestamp() can be converted to an Oracle stored procedure
- if you were using a MySQL database for scratch tables, you can do a similar thing in Oracle by declaring tables to be in a scratch tablespace, but in the same schema
- should be able to use Oracle transactions and ACID, and remove MySQL LOCK TABLE and UNLOCK TABLE statements.
- in MySQL, a database is a combination of a hostname and database name. In Oracle it’s a SID, and is defined in the tnsnames.ora configuration file.
- MySQL LIMIT can be replaced with a subquery using ROWNUM in Oracle, for example SELECT * FROM (SELECT ROWNUM limit, … ORDER BY …) WHERE limit BETWEEN ? and ?. Note that MySQL LIMIT is 0-based but Oracle ROWNUM is 1-based.
- Oracle object names (column, table, sequence) are 32 characters and by default case-insensitive. In MySQL, database and table names are just files, so case-sensitive on case-sensitive file systems like Unix
- MySQL autoincrement columns will instead need a sequence in Oracle.
- the optional MySQL AS alias statement keyword is not recognized in Oracle after a FROM clause table name… just delete it. (AS is valid after a column name.)
- MySQL CONCAT can be rewritten as || in Oracle
- MySQL syntax INSERT INTO table SET is not supported in Oracle.
- MySQL syntax for batch INSERT (mutiple VALUE lists) is not supported in Oracle.
- MySQL NOW() can be replaced with Oracle SYSDATE, or CURRENT_TIMESTAMP which works in both MySQL and Oracle. CURRENT_DATE is also portable.
- MySQL SELECT on a datetime field for display without explicit formatting can be emulated with select to_char(SYSDATE,’YYYY-MM-DD HH24:MI:SS’) from dual;
- MySQL EXPLAIN can be done 2 ways in Oracle: EXPLAIN PLAN FOR …; @$ORACLE_HOME/rdbms/admin/utlxpls.sql, or the DBA can create a plan_table so that in SQLPLUS you can type SET AUTOTRACE ON
- MySQL CREATE TABLE … LIKE would need a stored procedure in Oracle to copy metadata. Oracle’s CREATE TABLE … AS SELECT does not copy indexes, triggers, constraints, tablespaces or sequences. Oh, and in Oracle, non-primary indexes must have unique names across tables because they are schema objects that do not actually belong to a table.
- SELECT COUNT(*) FROM table_name is very fast with MySQL MyISAM tables, because the total row count is stored in the index. In Oracle, expect a much slower result as an index scan is required. An estimate is stored in NUM_ROWS which is updated after most non-INSERT table changes, but it can be wildly inaccurate.
- To read blobs in Oracle, you will likely need to allocate memory for the result: $db->{LongReadLen}=500000; # Make sure buffer is big enough for BLOB
- MySQL always sorts NULLs last, but Oracle ORDER … DESC sorts NULLs first, unless you specify NULLS LAST
- queries returning the “top n” results are usually implemented in MySQL with ORDER … LIMIT, but in Oracle there are a number of ways of doing that with subselects, ROWNUM and RANK keywords.
- The MySQL client program (mysql) is a fairly usable text-mode application. Oracle’s version, sqlplus, is inadequate for programmers. Consider using Oracle SQL Developer (Java, so kind of slow sometimes but does work on MacIntel machines), Squirrel, or Qwest TOAD instead.
- MySQL’s LOAD DATA INFILE and SELECT INTO OUTFILE statements can be emulated in Oracle with SQL Loader and BCP external programs. Perl programmers can use the CPAN module Oracle::SQLLoader, although it is simplistic and needs more testing for customized control files.
- Perl DBI’s $sth->rows() returns the row count from a SELECT result set in MySQL, but in Oracle does not. At best it will indicate -1 for failure and 0E0 for success, so do a COUNT(*) or loop over the result set with while and fetch for a row count. In Oracle, $sth->rows() is incremented as you do the fetch, often too late for your program logic.
- Jeremy’s MyTop for MySQL has an analogous display in Oracle’s SQL Developer Reports .. DD Reports .. DB .. Top SQL.
- MySQL’s best-effort statement-based replication and Oracle’s replication are very different. Oracle replication is done in a transaction across master and slave, so they stay in sync.
A sequence hack, that I would seldom recommend, to mimic MySQL autoinc in Oracle is this. Define one sequence in Oracle called APP_ID, have the app call APP_ID.NEXTVAL everywhere an autoinc is needed, regardless of table, and create an Oracle stored procedure called LAST_INSERT_ID that calls APP_ID.CURRVAL. That way you can minimize source code changes to your MySQL app.
The downside is that a busy app will soon be using very large numbers as IDs, perhaps needing wider columns, and making it difficult for humans to write or verbalize them, or even predict what the next value will be for a given table.
Please post your migration tips!
Oracle Globalization Whitepaper
Oracle Documentation
MySQL Documentation
Oracle: Welcome to the “2 Day DBA”
Planet MySQL (High Quality Blog Aggregator)
Oracle-Base: EXPLAIN PLAN Usage
Oracle SQL*Loader FAQ
SOLUTIONS TO COMMON SQL*LOADER QUESTIONS
BULLETIN: CACHING ORACLE SEQUENCES
OraFAQ: BLOBs
MySQL Manual: MySQL Server Time Zone Support
MySQL Manual: MySQL Date and Time Functions
Oracle Date Functions
Oracle Resumable Transactions
Write Time Zone Aware Code in Oracle
Speeding Up Index Access with Index-Organized Tables
Contact James if you need database conversion or Perl consulting.
November 6th, 2006 at 7:24 am
I have read you doc about MySQL to Oracle migration and found it very interesting as it is exactly what I’m doing right now at job.
The only problem I have seen is the aplication to be migrated is written in PHP, and I cannot find a usable way of getting the last insert id when I’m doing an INSERT query. Do you have any idea of how to do this?
Thanx a lot for your time and knowledge.
November 6th, 2006 at 7:30 am
Hi Daniel.
In Oracle, sequences are separate objects which are not tightly bound to a particular table. So you will either have to do an extra query for CURRVAL, or use RETURNING INTO to get the sequence value from the INSERT statement directly.
James.
December 12th, 2006 at 8:41 am
Great article. In reference to “MySQL silently truncates input data when too wide for a column, but Oracle considers the column width to be a constraint and fails the insert or update”
MySQL can be set to TRADITIONAL or STRICT_ALL_TABLES to reject values that are out of range, have an incorrect data type or are missing for columns that have no default value. I.E.
Set sql_mode=’TRADITIONAL’;
October 11th, 2007 at 10:13 am
[...] read more [...]