blog.lukhnos.org

Using OS X's Built-in ICU Library in Your Own Project

Unicode strings are usually not difficult to handle [1]. The tricky part comes when you need Unicode knowledge for things like:

Some languages, like Java, already have such Unicode knowledge baked in. For C and C++, the de facto standard library is IBM's ICU.

You can either build ICU from its source code, or use tools like MacPorts to install it for you. But you may also wonder: Since OS X has great internationalization support, perhaps OS X also uses ICU?

It turns out that OS X does include a pre-built version of the ICU library [2], placed in /usr/lib/libicucore.dylib. It does not, however, include the headers for you to use. There are various reasons for that, and I assume it has much to do with the fact that ICU is written in C++, and library versioning with C++ is a pain.

Here's how you can link against the built-in library:

As the name implies, it only contains the core ICU library, so it lacks quite a few things. For example, C++ methods such as UnicodeString::toUTF8 and classes such as StringPiece and StringByteSink are missing. A good way to check availability is to use the UNIX tool nm to dump the symbols in libicucore.dylib. Your best bet is to stick to the C API. On the other hand, using the built-in library saves you the trouble of including your own copy of ICU.
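If you would rather check availability from code than read nm output, the same question can be asked at run time. What follows is only a minimal sketch: the helper name icu_symbol_available is made up for illustration, and it assumes the library exports plain, unversioned C names such as u_strToUpper, which is worth confirming with nm on your own system. Note that nm prints C symbols with a leading underscore on OS X, while dlsym expects the name without it.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of ICU): returns 1 if `symbol` can be
 * resolved in the shared library at `lib_path`, and 0 otherwise,
 * including when the library itself cannot be loaded.
 *
 * This only suits the C API. C++ methods are exported under mangled
 * names, so looking up "UnicodeString::toUTF8" as written will not work.
 */
int icu_symbol_available(const char *lib_path, const char *symbol)
{
    /* RTLD_LOCAL keeps the library's symbols out of the global namespace. */
    void *handle = dlopen(lib_path, RTLD_LAZY | RTLD_LOCAL);
    if (handle == NULL) {
        return 0;
    }

    /* A NULL result from dlsym is treated here as "symbol not found". */
    int found = (dlsym(handle, symbol) != NULL);

    dlclose(handle);
    return found;
}
```

You would call it as icu_symbol_available("/usr/lib/libicucore.dylib", "u_strToUpper") and fall back to another code path when it returns 0. On some non-OS X systems you may need to link with -ldl for dlopen to resolve.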


  1. Read Joel Spolsky's excellent primer on Unicode if it still isn't clear to you.

  2. Many of NS/CFString and NSCharacterSet's internationalization features use ICU under the hood. I haven't tested it, but I believe the tips provided here can also be used on iOS.