I’m trying to move my WordPress blog from Windows to Linux, but I’m seeing a weird character encoding problem in the Linux version that I’m not seeing on Windows. (I had thought it was an issue with the way I used mysqldump to export the data, but after upgrading MySQL, dumping, checking, etc., I’m pretty sure I did that part OK.) The data in the database on Windows is exactly the same as it is on Linux.
Here’s what apostrophes look like on my Windows blog:
And here’s how they appear on the Linux blog:
As you can see, there’s a weird ΓÇÖ series of characters instead of an apostrophe. Is there a setting in WordPress that I need to change to get it to render correctly?
On Windows, the response contains the following bytes for the single apostrophe:
E2 80 99
On Linux, I’m getting back these bytes instead:
CE 93 C3 87 C3 96
It looks to me like you’re not using an actual apostrophe, but one of the curly apostrophes, such as those that Microsoft Word might be configured to auto-correct to. There is a chance that your mysqldump was actually exporting it incorrectly. In my test database, both PowerShell and Command Prompt were dumping it as ΓÇÖ, whereas Cygwin was dumping it as ’. If your export does look fine in those areas, you may want to check the Content-Type response header to see that it specifies UTF-8. If both of those check out, it could be that the database’s encoding needs to be modified.
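One way to check the Content-Type header is a small script like the one below. This is only a sketch: the function names are made up for illustration, and the URL in the usage comment is a placeholder for your own blog address.

```python
from email.message import Message
from typing import Optional
from urllib.request import urlopen


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value (lowercased), or None."""
    if not header_value:
        return None
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


def fetch_charset(url: str) -> Optional[str]:
    """Request the page and report the charset the server declares, if any."""
    with urlopen(url) as response:
        return charset_from_content_type(response.headers.get("Content-Type", ""))


# Usage (placeholder URL, replace with the Linux blog's address):
# print(fetch_charset("http://example.com/"))
```

If this reports utf-8 and the page still shows ΓÇÖ, the stored data itself is the more likely culprit.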
As an alternative, you could replace any occurrence of
The problem ended up being (I think) PowerShell’s wrangling of the output from mysqldump. In PowerShell, I had been using:
mysqldump -u**** -p**** -h**** wordpress --default-character-set=utf8 | out-file out.sql -Encoding UTF8
I was even good about explicitly specifying UTF8 for both mysqldump and out-file! However, it seems (and this is really hard to prove, because it’s hard to tell what PowerShell does with multi-byte UTF-8 characters once you pass them to out-file) that out-file is having trouble handling multi-byte Unicode characters in UTF-8.
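One reading that fits the bytes quoted in the question: somewhere in the pipeline the UTF-8 bytes of the curly apostrophe were decoded as CP437 and the resulting three characters were then written out as UTF-8. CP437 is an assumption here (it is the OEM console code page on many US-English Windows installs), and this sketch does not show which step of the pipeline did the decoding, only that the numbers line up.

```python
# U+2019 is the right single quotation mark (the "curly apostrophe").
original = "\u2019"

# Correct UTF-8 encoding, as seen in the Windows response.
utf8_bytes = original.encode("utf-8")

# Assumption: those three bytes get interpreted as CP437 text,
# which yields the three characters GAMMA, C-cedilla, O-diaeresis.
misread = utf8_bytes.decode("cp437")

# Writing the misread text back out as UTF-8 produces six bytes.
reencoded = misread.encode("utf-8")

print(utf8_bytes.hex(" ").upper())  # E2 80 99
print(reencoded.hex(" ").upper())   # CE 93 C3 87 C3 96
```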
I switched to using the plain old Windows command prompt, which outputted data correctly:
mysqldump -u**** -p**** -h**** wordpress --default-character-set=utf8 > out.sql
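If you already have a dump that went through the PowerShell pipeline and can’t easily re-export, the damage may be reversible. The sketch below assumes the file was mangled by reading UTF-8 bytes as CP437 and re-saving as UTF-8 (which matches E2 80 99 turning into CE 93 C3 87 C3 96); the function names and file names are hypothetical. It reads with utf-8-sig so that a byte order mark, if one is present, is dropped.

```python
def repair_mojibake(text: str, wrong_codec: str = "cp437") -> str:
    """Undo one round of 'UTF-8 bytes read as wrong_codec'.

    Raises UnicodeEncodeError or UnicodeDecodeError if the text
    was not damaged in this particular way.
    """
    return text.encode(wrong_codec).decode("utf-8")


def repair_file(src: str, dst: str) -> None:
    """Read a damaged dump, reverse the misdecoding, write a clean copy."""
    # newline="" keeps the original line endings untouched.
    with open(src, encoding="utf-8-sig", newline="") as f:
        damaged = f.read()
    with open(dst, "w", encoding="utf-8", newline="") as f:
        f.write(repair_mojibake(damaged))


# Usage (hypothetical file names):
# repair_file("out.sql", "out-fixed.sql")
```

Work on a copy and spot-check the result before importing it; a fresh export from the command prompt is still the safer route.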