More UTF-8 head-thumping with Hibernate 3

I’ve just finished upgrading an application component so that it uses Hibernate 3 instead of Hibernate 2. The last time I tried to do this, I spent half a day on it, realised that none of my UTF-8 encoded data was working and simply abandoned the effort. But I was feeling brave, so I tried again.

First off, since I’m running on OS X, I had to learn that Firefox on OS X doesn’t handle a lot of Indic text properly (notice that this bug is just over three years old!). That makes it really hard to test, because you end up looking at question marks instead of Tamil! Solution: use Safari. It works fine. A good test page is the OpenOffice Tamil intro.

But it still wasn’t working for me. I fell back on the techniques that I used in an earlier blog post, specifically comparing md5 checksums of the text in question before and after it passed through the database. And, yes, there was definitely a problem.
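For anyone who wants to try the checksum approach, here is a minimal sketch in Java. It is not the code from the earlier post; the class name Utf8Checksum and the helper throughLatin1 are made up for illustration, and the Latin-1 round trip is only a stand-in for whatever layer is mangling the text. The idea is that a checksum of the UTF-8 bytes changes as soon as any character is lost, even when the browser can't render the text at all.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, for illustration only.
class Utf8Checksum {

    // Returns the hex-encoded MD5 of the string's UTF-8 bytes.
    static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Simulates a lossy layer: characters that ISO-8859-1 cannot
    // represent are replaced with '?' when the string is encoded.
    static String throughLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The word "Tamil" in Tamil script, written with Unicode escapes
        // so the source file's own encoding cannot interfere.
        String original = "\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD";
        String mangled = throughLatin1(original);

        System.out.println("original: " + md5Hex(original));
        System.out.println("mangled:  " + md5Hex(mangled));
        System.out.println("match:    " + md5Hex(original).equals(md5Hex(mangled)));
    }
}
```

In a real test you would compare the checksum of the string you saved with the checksum of the string you read back, rather than using the simulated round trip.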

The solution: you need extra parameters for your connection string when using Hibernate 3:

jdbc:mysql://localhost:3306/mydb?autoReconnect=true&useUnicode=true&characterEncoding=UTF-8

… and now things work again (well, in Safari, anyway).
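If you set the connection up in code rather than in a config file, the same string can be handed over as a property. This is only a sketch under a few assumptions: the class name HibernateUtf8Config and its methods are hypothetical, the host, port and database name are the ones from the example above, and the Properties object would still have to be passed to your own Hibernate configuration. The key hibernate.connection.url is the standard Hibernate name for the JDBC URL.

```java
import java.util.Properties;

// Hypothetical helper class, for illustration only.
class HibernateUtf8Config {

    // Builds the MySQL JDBC URL with the extra Unicode parameters appended.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?autoReconnect=true&useUnicode=true&characterEncoding=UTF-8";
    }

    // Collects the setting under the property name Hibernate reads the URL from.
    static Properties hibernateProperties() {
        Properties props = new Properties();
        props.setProperty("hibernate.connection.url",
                jdbcUrl("localhost", 3306, "mydb"));
        return props;
    }

    public static void main(String[] args) {
        System.out.println(
                hibernateProperties().getProperty("hibernate.connection.url"));
    }
}
```

If the URL lives in an XML file instead, each ampersand in it has to be written in its XML-escaped form.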

5 thoughts on “More UTF-8 head-thumping with Hibernate 3”

  1. This solution would seem to indicate that the problem is with mysql rather than Hibernate.

    Last year, I went through our app at work to enable full UTF-8 support across every layer and found that literally every layer was misconfigured. Even before my effort, UTF-8 would seem to work in some areas of the site, but it was purely by accident. We’re using Oracle and a legacy Java app server, so the data was getting into the JVM memory space intact from the database, but when it was rendered out to the pages it was being garbled in several exciting ways.

    While it was a very frustrating effort – the result of all the effort is that a web page shows accented text correctly – it was very rewarding as far as learning about a whole area of software that us stupid Americans usually don’t have to worry about.

  2. Pingback: Isocra blog
  3. Arrr thanks! Your approach worked for my charset problem too.

    I have latin1 encoded tables and use Hibernate 3:
    ?useUnicode=true&amp;characterEncoding=iso-8859-1

    – autoReconnect is not related to the problem, so I skipped that
    – I use &amp; because of the XML configuration 🙂

  4. In Hibernate 3.0 with MySQL 4 and mysql-connector 3.0.8, UTF-8 is not working…
    The core MySQL APIs fetch and insert UTF-8 characters correctly, but the Hibernate session does not select or insert special characters. I have tried Hibernate 3.0 + MySQL 5 + mysql-connector 5.0.8 and it works fine, but not with the config above.

    Please help me, how can I insert special characters (UTF-8) using this config : Hibernate 3.0 + Mysql 4.0.1 + mysql-connector 3.0.8

    My hibernate.cfg.xml file contains:

    true
    UTF-8

    In JSPs :

    in HTML :

    I am even using:
    request.setCharacterEncoding("UTF-8");
    in a Filter.

    I will highly appreciate…
