Moving Chinese encoding (UTF-8) on WordPress from MySQL 4 to 5

Upgraded my WordPress and had my Chinese Characters all gone!

After some inspiration from http://alexking.org/blog/2008/03/06/mysql-latin1-utf8-conversion and http://codex.wordpress.org/Converting_Database_Character_Sets I realized that the reason the upgrade didn’t work was when I was upgrading from MySQL 4 to MySQL 5. I host with 1and1 and used their import script to import the SQL that was exported. Unfortunately, as the exported SQL was too simple (it didn’t explicitly say the COLLATION for each field, and table) the resulting set of wordpress tables were all created with latin1_general_ci as the collation. All the automatic scripts provided by wordpress has already completed and reported success, so it’s time to go manual.

This is what I did:

1. Fix relevant field’s collation

I switched the collation of the relevant fields (post content, post title, post name, etc. and also link and comments table) where I normally will sprinkle UTF-8 characters in from latin1_general_ci to utf8_unicode_ci. You don’t really need to know SQL much, just go to your phpMyAdmin, then the table, Structure, the little Edit pencil of the field and you can change the drop down’s value.

You don’t have to fix every field, although the links above advises you to do so. I’m going to let this mix mode run for a while. If you’re not sure which fields to switch, then fix every text field (all the best).

2. Reexport the table as UPDATE

Next, I went back to my old MySQL 4 DB, re-exported (careful: remember to keep everything UTF-8, save the file as UTF-8) the relevant tables using the Export function. Under Data, I used the UPDATE option, as this will create UPDATE statements instead of INSERT statements (since the entries were already inserted by the wordpress upgrade.php scripts). I omitted the Structure (i.e. the CREATE TABLE statement). What you want to get is a bunch of UPDATE sql statements that has your UTF-8 characters in it saved to a UTF-8 file.

Do take note that if you are also upgrading wordpress, some field names / types might change. I did a field to field comparison and found my situation to be identical. If yours isn’t, you will need to do some search and replace (or Perl script for advanced users) – just keep saving the file in UTF-8 format lest you lose them again.

3. Import the table

Next, I switched back to the new MySQL 5 DB, use the SQL function to upload the file and run it. That’s all. Sometimes uploaded files won’t work with 1and1, so I copied the file into the browser window. Mine was like 2 to 3 MB in size so it wasn’t much of a problem / wait.

*phew*

Did it 3 times for 3 tables: wp_links, wp_comments, wp_posts.

And the site is back up!

Print Friendly

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>