Update: Replaced the video with a new version with English subtitles.
Earlier this year, a Taiwanese newspaper, Apple Daily (蘋果日報, not affiliated with Apple Inc.), made an Animation News (動新聞) clip on the Jay Leno-Conan O’Brien spat, which I also re-posted. But Apple Daily’s real fame started with its animated Tiger Woods tabloid series.
Anyway, this is the new one on Apple’s Antennagate. And once again you don’t need to know any Mandarin Chinese (the official language in Taiwan) in order to understand the gist of the story. Now I must warn you not to eat or drink in front of your laptop while watching this clip…
Why My Next Mac App Will Not Use Garbage Collection
Among the many things I have learned from the past few years of developing desktop applications, here’s one: implement your app conservatively. This often translates to using only proven technologies. Cocoa’s garbage collection (gc), alas, does not seem to be one of them. My next Mac app will not use garbage collection. In fact, we are even taking the pain of converting an existing garbage-collected app to non-gc.
Since Mac OS X 10.5, you have been able to choose automatic garbage collection for your Cocoa app. Apple’s take on gc is a brilliant engineering feat, which makes Objective-C a more modern language (and it’s even open source). Before gc, memory management was much like driving a manual transmission and required a lot of mental arithmetic: you had to retain an object (to increase its refcount) when you started using it, and release it (to decrease its refcount) when you relinquished ownership. With gc, that mental arithmetic is no longer required. It’s great to be spared the burden of memory management, particularly when you have nastier things to worry about, such as multithreading (which makes manual management harder) and binding (which complicates the object graph).
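To make that mental arithmetic concrete, here is a toy model written in Python rather than Objective-C. The class and its method names are made up for illustration; this is only a sketch of the counting rule, not how Cocoa actually implements retain and release.

```python
class RefCounted:
    """Toy stand-in for a manually managed object (hypothetical, not Cocoa)."""

    def __init__(self, name):
        self.name = name
        self.refcount = 1      # the creator owns the object
        self.alive = True

    def retain(self):
        """Claim ownership: bump the count."""
        if not self.alive:
            raise RuntimeError(f"{self.name}: retain after dealloc")
        self.refcount += 1
        return self

    def release(self):
        """Give up ownership: drop the count, deallocate at zero."""
        if not self.alive:
            raise RuntimeError(f"{self.name}: over-release")
        self.refcount -= 1
        if self.refcount == 0:
            self.alive = False


doc = RefCounted("document")   # count 1, owned by the creator
doc.retain()                   # a second owner appears, count 2
doc.release()                  # the creator is done, count 1
doc.release()                  # the second owner is done, count 0, deallocated
```

Forget one release and the object leaks; add one too many and you get an error (a crash, in a real app). That is the bookkeeping gc takes off your hands.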
Unfortunately, I have found a few cases where Cocoa’s gc doesn’t work that well. They mostly involve libraries that are beyond gc’s reach. I’ll name three: Secure Transport (which handles things like HTTPS), icucore (which handles localization and date/time formatting), and XML parsers (I have tried both NSXMLParser and expat). These three can already leak from time to time when called from the main thread alone, and are almost guaranteed to leak when used from different threads, even if proper locks and one-instance-per-thread-at-a-time policies are enforced.
You might be tempted to think: big deal, if those stacks have their baggage and work best in a non-gc app, I’ll just write a separate app and use distributed objects (DO), another fancy technology, to bridge between the gc and non-gc processes. Here’s the bad news: there is apparently a bug in Apple’s DO implementation under gc, and all proxy objects ever created to communicate with the remote process are not destroyed until the app terminates.
So for the time being my conclusion is: if your app is network-intensive, needs to do a lot of things in parallel, happens to parse a lot of XML, and has a fairly complex user interface that relies on binding, then garbage collection probably isn’t for you.
Now the only big question I have is: how did Xcode manage it? Xcode, as we know it, is a garbage-collected app [1]. It’s also said to be Apple’s most complicated application. Now, unlike Mail.app, Xcode doesn’t seem to do lots of date/time formatting. It reads plists (for which Apple has a faster parser implementation) but not really XML. The only network-bound parts seem to be the documentation fetcher and the recently added automatic iPhone provisioning. Much of the IDE seems to have been there already in the OS X 10.3 days. All told, Xcode seems to hold up well.
I’d like to know how to make those non-Cocoa parts work with gc. But until then, I’ll take the more conservative path.
[1] I know this because the previous generation of our Adobe Kuler color picker, Mondrianum, used (and still uses) a cover flow image view, which doesn’t work under gc. In the previous generation, we used some NSGarbageCollector hacks (don’t ask) to manage to make the plug-in work within Xcode. In the current version we have decided not to support gc apps like Xcode, because those hacks never worked perfectly.
My company’s FogBugz client application for Mac, LadyBugz, has a new version. We have given the case history view a facelift, along with many improvements and bug fixes. We weren’t sure if we were heading in the right direction when we started to make the major interface change, so we asked a few of our customers if they’d like to try out a beta version. The initial response was positive, so we ran the merge command, and the tentative branch became the main one. This frees us to take on other frequently requested features, and we believe this version is a substantial improvement.
D. Richard Hipp, creator of SQLite, in sqlite-users mailing list:
Some of the code in SQLite (such as the Lemon parser generator and the printf implementation) dates back to the late 1980s. But the core of SQLite was not started until 10 years ago. Ten years is not that long ago, though it has been long enough to amass 7114 check-ins - an average of 2.1 check-ins per day. If you are overseeing such a project, 10 years seems like forever. It is hard for me to remember a time when I wasn’t working on SQLite.
SQLite is my favorite software project and a role model. It is lightweight, efficient, self-contained, and vastly powerful. Not many software projects can claim all four, especially self-containedness. SQLite now states its status as public domain in a more official manner (out of institutional use considerations), but I believe all of us can learn a lot from the blessing in its source code:
May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
I also had the opportunity to work on a client project that used both of SQLite’s paid versions (SEE and CEROD), and SQLite never disappointed.
I wish the best for the project, and look forward to its 20th and 30th birthdays.
Formosana, a Collection of C++ Libraries for Processing Taiwanese Languages
Formosana is a C++ library collection that provides basic building blocks for processing Taiwanese languages. Currently three languages are supported: Mandarin, Holo and Hakka. It also provides a language-agnostic, bigram-based word segmentation library. It has no external dependencies and can be built on most platforms I know of. It is available on github under the MIT License.
My day job is commercial iPhone and Mac software development. In addition to that, I also develop open source software, mostly in the form of libraries. Designing libraries and frameworks is both a good exercise in itself and an important part of software development. It pushes you to think and plan ahead for future consumption, and it also gives you a good opportunity to think about the fundamentals of a given problem set.
Formosana currently has three major components:
Formosa::Mandarin: A library for processing Mandarin syllables and handling text input keyboard layouts. An abstract data type represents a Mandarin syllable. The syllable data type accepts both Pinyin and Bopomofo as input, and can convert to either form as output. Its internal representation guarantees that the syllable is always in the CVCT form, although it does not guarantee that the produced syllable is always phonetically grammatical (i.e. it can be used to produce syllables not found in actual Mandarin). It also supports four major keyboard layouts (expandable) that map a standard US keyboard to Bopomofo symbols.
Formosa::TaiwaneseRomanization: A library for processing Romanized Holo and Hakka. An abstract data type represents a Holo or Hakka syllable. Internally it uses POJ (pe̍h-oē-jī, also called Church Romanization by some). It accepts POJ for both input and output. Tâi-lô (TL, or Taiwanese Romanization System), technically a POJ variation that is the standard for Romanized Holo used by Taiwan’s Ministry of Education, can also be used as both input and output. This syllable library has a normalization member function that guarantees only that the composed tonal mark is placed on the correct vowel character according to the resonance in the Holo language. It is weaker than its Mandarin counterpart in that the syllable class does not guarantee the represented syllable is always in the Initial+Vowel+Final+Tone form. It accepts both the “composed” form (syllable with diacritics) and the “uncomposed” form (tone in numerals, also called database query form in the library) for input, and can produce output in either form. This library also supports keyboard layout mappings. Both numerical tone input and dead key combinations are supported.
Formosa::Gramambular (literally, “gram walking”): A language-agnostic, bigram-based word segmentation library. It accepts an input set of weighted unigram and bigram key-value pairs, and outputs a best-scored path. If the keys are input syllables and the values are the Chinese phrases that the syllables represent, the walk is an input method. If we reverse the keys and values, it becomes a word segmentation tool. As the library works without any grammatical knowledge, the quality of the dictionary (which provides the data source for weighted nodes) is the deciding factor of the output quality. I discussed the principle of the library’s design in a talk at Open Source Developer Conference, Taipei, Taiwan, in 2008. As a bonus, Gramambular has a debug helper that can produce output in the Graphviz DOT format, which you can then feed into the tool and get visualizations like this and this.
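The idea of walking a weighted lattice can be sketched in a few lines of Python. This is not Gramambular’s API or its actual algorithm: the function name, the data layout, and the scoring rule (a bigram weight, when present, replaces the unigram weight of the second word) are all assumptions made for this illustration, and the weights are invented numbers.

```python
def best_path(keys, unigrams, bigrams, max_span=4):
    """Return (score, values) for the best-scored walk over `keys`, or None.

    unigrams: {tuple_of_keys: [(value, log_weight), ...]}
    bigrams:  {(previous_value, value): log_weight}
    Scoring rule (an assumption of this sketch): a bigram weight, when
    present, replaces the unigram weight of the second value.
    """
    n = len(keys)
    # best[i] maps the last value of a walk ending at position i
    # to (score, list of values so far).
    best = [dict() for _ in range(n + 1)]
    best[0][None] = (0.0, [])
    for start in range(n):
        for prev, (score, path) in best[start].items():
            for end in range(start + 1, min(n, start + max_span) + 1):
                span = tuple(keys[start:end])
                for value, weight in unigrams.get(span, []):
                    w = bigrams.get((prev, value), weight)
                    cand = (score + w, path + [value])
                    if value not in best[end] or cand[0] > best[end][value][0]:
                        best[end][value] = cand
    if not best[n]:
        return None  # no sequence of dictionary entries covers the input
    return max(best[n].values(), key=lambda c: c[0])


# Made-up weights: the keys are letters, the values are words.
keys = list("nowhere")
unigrams = {
    tuple("no"): [("no", -1.5)],
    tuple("now"): [("now", -2.0)],
    tuple("here"): [("here", -2.5)],
    tuple("where"): [("where", -2.5)],
    tuple("nowhere"): [("nowhere", -6.0)],
}
bigrams = {("now", "here"): -1.0}

print(best_path(keys, unigrams, bigrams, max_span=7))  # (-3.0, ['now', 'here'])
```

With the bigram table emptied, the same walk picks “no where” instead, which shows how much the output depends on the dictionary.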
Each of the components comes with its own demo code. I have also supplied Makefiles (for Mac OS X and other UNIX platforms) and Microsoft Visual C++ solution files for those sample projects.
The library collection makes use of a few helper classes from The OpenVanilla Project. I have included those class files (also written by me) in the source to make the collection buildable with no external dependencies.
Formosana was first designed for developing input methods, and both the Mandarin and the Taiwanese Romanization modules have been used in actual products. Although Gramambular has not yet been used in production, I have previously worked on an implementation based on similar principles for an internal project at my company. The commit history of the project will tell you that Gramambular was written pretty fast (2 days) from the ground up. For me it was also an exercise in starting over from scratch to see if the design is solid.
The library collection has many other uses in processing Taiwanese languages. There is also room for improvement. For example, a syllable class that can validate against the phonetic grammar is highly desirable. Currently the Taiwanese Romanization class instances are mutable: normalization changes the internal representation instead of returning a new immutable object. In addition, for the libraries to be useful for building language-related web applications, bindings to major scripting languages are also desirable. These are the things that developers interested in the field can work on.
I’d be very interested in hearing from you if you use or plan to use Formosana in your own projects. My contact info for this project can be found on my github profile.
The connection of soothe to yes is strange but true; it takes a bit of relaxing to get it straight in the mind. I suppose that if something is, and is true, and leads to nodding of the head, and brings the archaic response sooth, or the modern answer yes, it is a soothing experience. The truth is not always soothing, but in a better world it ought to be.
I’m happy that my company just released version 1.0 of LadyBugz, a FogBugz client for Mac.
If you have just heard of FogBugz for the first time, it is a “bug and issue tracking, project management, help desk software” service. FogBugz is a product of Fog Creek, one of whose founders is Joel Spolsky, the software luminary behind Joel on Software and Stack Overflow.
What Zonble and I like about FogBugz is that it fits the needs of a software company well. The nature of our work requires a good issue tracking tool, but we also need to communicate with clients and customers. Many issue tracking tools are not designed with a support department in mind, and this is where FogBugz gets it just right.
As we used FogBugz more and more, it became natural that we wanted to create a native Mac client for it. It’s interesting that many good Mac and iPhone applications these days are client software for well-established services. Tweetie is a good example. We also happened to know a thing or two about working with successful web services. So we decided to create a Mac client for FogBugz that offers the best of both worlds: a good web service delivered with a fast native interface.
Version 1.0 of LadyBugz gives users an integrated case and event browser, a case editor, and a mail composer. Those three components correspond to the three areas in which FogBugz excels: project management, bug and issue tracking, and help desk / customer service integration. It also comes with snippet support, an important feature for people who do frontline support. Good help desk features are what make FogBugz stand out among similar services, and we wanted to design LadyBugz with both engineering and support departments in mind.
We had the first beta version up and running in mid-November last year, and since then we have used it every day for our project management and product support. Like many applications with ambitious feature sets, LadyBugz underwent architectural changes and total rewrites. We’ve also decided to target solely the latest Mac OS X (10.6) so that we can leverage great tools like the Grand Central Dispatch-backed NSOperationQueue, blocks, and many user interface improvements. The aim is to create a smooth user experience with good technical performance.
Since this is our first full-featured Mac application, and also our first commercial product developed as an independent software vendor, we really hope LadyBugz brings good value to FogBugz users. And just like any version 1.0 software, this is really just the beginning of many great plans ahead. We want to spread the word, and will continue bringing great things to our users.
Finally, I’d like to thank Mike Ash, Jeff Johnson and Lee Falin of Rogue Amoeba, Joe Goh of Phone Journal, Pierre Bernard of Houdah Software, Justin Williams of Second Gear, and Evadne Wu of Iridia Productions for having given us many suggestions that shaped the application. I’d also like to express my gratitude to the authors of the open source libraries that we use, including Sparkle and many others—they are a major force that makes the Mac developer community strong and vibrant.
LadyBugz is commercial software, with an individual license for US$55. You can download and try it with full features for free for 21 days. It also has a presence on Twitter, so follow us and let us know how we do.
For the past few months I’ve been using this as my wallpaper/desktop background. I’m seldom a fan of any personality or any project, but hey it’s LLVM.
I finally found out the source of the picture. It’s made by putting the LLVM logo over one of Cocoia’s wallpapers.
How did I find out the source? I must have seen the link in some friend’s Twitter feed a long time ago, but the friend’s tweets are locked, and Twitter’s search is not really helpful.
So how did I finally find out? Surprisingly, via Get Info in Finder! Essentially, it’s the xattr (extended attributes) on the file. Since 10.5, all files downloaded by Safari carry an xattr called com.apple.metadata:kMDItemWhereFroms. You can use the command line tool xattr to inspect those attributes. The attribute values are in the form of property lists.
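If you’d rather script it than click through Finder, here is a minimal Python sketch. It assumes the macOS xattr tool is on your PATH (its -p flag prints an attribute and -x forces hexadecimal output) and that the value is a property list holding a list of URL strings; the two function names are made up for this example.

```python
import plistlib
import subprocess

ATTR = "com.apple.metadata:kMDItemWhereFroms"


def decode_where_froms(raw):
    """Decode the attribute's raw bytes (a property list) into a list of URLs."""
    return list(plistlib.loads(raw))


def where_froms(path):
    """Read the attribute with the macOS `xattr` tool and decode it."""
    hex_dump = subprocess.run(
        ["xattr", "-p", "-x", ATTR, path],
        check=True, capture_output=True, text=True,
    ).stdout
    # bytes.fromhex skips the whitespace in the hex dump.
    return decode_where_froms(bytes.fromhex(hex_dump))
```

Calling where_froms on a downloaded file should give back the URLs the file came from. A file without the attribute makes xattr exit with an error, which surfaces here as subprocess.CalledProcessError.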
I find this a good example of the importance of file system advances and why metadata matters.