Over the past few months I’ve been researching how TYPO3, Joomla, Plone, and Drupal currently support multilingual content. Since this work was in part supported by Development Seed, I happily accepted the request to guest blog some of my findings.
One piece of my research that I was quite interested in was how each platform handles multilingual content on its frontend and backend. All of the systems I examined provide different solutions for interface and content translation. By investigating these existing systems, I’ve identified three distinct approaches for multilingual support. All have their advantages and disadvantages, both on the frontend and on the backend.
The underlying problem with content translation is that we need some fields (for example, the title and the body) to be translated but want other fields to be shared between translations (like an event date and location). What gets shared differs from content type to content type, and sometimes the choice has to be made field by field. For example, if you submit an event that’s organized on the same date but in different locations around the world, you’d want to share the date but not the location between different language versions of the content. This is how the difference between translation and localization makes our life more complex.
Let's look at the existing approaches I found in the systems I examined.
Separate Objects for Translations
This method works by storing translated versions of content as separate instances and relating them to each other. Plone uses this method, as does Drupal with its modules for node-based content storage. Plone implements this based on the underlying object database, which means access to the shared properties of translations can be managed. A canonical content object is defined for every such content group so a canonical version of shared properties can be managed. Drupal's i18n module is gaining support for shared properties, as explained in this previous blog post. The Drupal i18n module also uses this approach for other content objects to allow for unique site structuring and setup for specific languages.
The advantage of this approach is that you can edit the translated or localized version of content on the same interface as you do the "original" content. However, sharing values between the content instances does require software level coordination.
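To make the "software level coordination" concrete, here is a minimal Python sketch of the separate-objects idea. It is not Plone's or Drupal's actual API: the `Event` class, the `SHARED_FIELDS` set, and the `get` method are hypothetical names chosen for illustration. Each translation is a full object of its own, and shared fields are read through a link to the canonical object.

```python
from dataclasses import dataclass

# Hypothetical names throughout; not Plone's or Drupal's actual API.
SHARED_FIELDS = {"date"}  # fields every translation reads from the canonical object


@dataclass
class Event:
    language: str
    title: str
    location: str
    date: str = ""
    canonical: "Event | None" = None  # None means this object is the canonical one

    def get(self, name: str):
        # Shared fields are always read from the canonical object, if there is one.
        if name in SHARED_FIELDS and self.canonical is not None:
            return getattr(self.canonical, name)
        return getattr(self, name)


original = Event("en", "Drupal meetup", "Boston", date="2007-06-01")
german = Event("de", "Drupal-Treffen", "Berlin", canonical=original)

original.date = "2007-06-08"  # a single edit on the canonical object
print(german.get("date"), german.get("location"))
```

Because the German version is an ordinary `Event`, it can be edited with the same form as the original. The cost is that something like `get` has to sit between the editor and the stored values to keep shared fields in sync.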
Overlays on Content Objects
Some objects can have a defined set of their properties "overlaid", in effect replaced by translated values upon loading. Shared properties are implemented by not allowing some properties to be overlaid and by falling back to the original values when no overlay value is available. TYPO3 uses this approach, and some modules in the Drupal i18n module set also make use of such a solution.
In the cases I examined, the user interface adapts to the "overlaid content", so users have a separate interface for setting up language versions. This type of solution is often constrained on the system level, and it is not possible to choose arbitrary fields to share. As a result, the interface also gets constrained.
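The overlay-with-fallback behaviour can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `OVERLAYABLE` set and the `apply_overlay` function are hypothetical, and TYPO3's real mechanism differs in detail.

```python
# Hypothetical structures; not TYPO3's or Drupal's real data model.
OVERLAYABLE = {"title", "body"}  # fixed by the system, not chosen per site


def apply_overlay(record: dict, overlay: dict) -> dict:
    """Return a copy of record with translated values laid over it."""
    result = dict(record)
    for name, value in overlay.items():
        # Skip fields that may not be overlaid, and fall back to the
        # original value when the overlay has nothing to offer.
        if name in OVERLAYABLE and value:
            result[name] = value
    return result


record = {"title": "Summer event", "body": "Join us.", "date": "2007-06-01"}
overlay_de = {"title": "Sommerfest", "body": "", "date": "2007-07-01"}

page = apply_overlay(record, overlay_de)
print(page)
```

In this sketch the title is replaced, the empty body falls back to the original, and the attempted date overlay is ignored because `date` is not in the fixed list. That fixed list is the system-level constraint described above.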
Generic Database Level Value Overlays
A more generic implementation of content object overlays involves allowing property overlays at the database level. A database table name and key value identify a record, and a column name identifies the value to be overlaid for a given language. This approach is used by Joomla and in part by Drupal's localizer module, and it is generic enough to allow for any kind of translation at the relational database level.
Unfortunately the generic backend brings a generic frontend with it, which makes for an unpleasant experience for translators. The regular content editor interface knows about select boxes, WYSIWYG editors, validation of entered values, and other features, but the generic database level translation method can only provide a text box to edit database field contents. This means that translations need to be made without tailored editing widgets and validation functionality.
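A rough sketch of such a generic overlay table, using Python's built-in `sqlite3` module, might look like the following. The schema and the `load` function are my own assumptions for illustration and do not reproduce Joomla's or the localizer module's actual tables.

```python
import sqlite3

# Hypothetical schema; not the actual tables of Joomla or Drupal's localizer.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE content (id INTEGER PRIMARY KEY, title TEXT, body TEXT);
    CREATE TABLE translations (
        table_name TEXT, record_id INTEGER, column_name TEXT,
        language TEXT, value TEXT
    );
    INSERT INTO content VALUES (1, 'Summer event', 'Join us.');
    INSERT INTO translations VALUES ('content', 1, 'title', 'de', 'Sommerfest');
""")


def load(table, record_id, columns, language):
    # table and columns are trusted, hard-coded names in this sketch;
    # they must never come from user input.
    row = db.execute(
        f"SELECT {', '.join(columns)} FROM {table} WHERE id = ?", (record_id,)
    ).fetchone()
    values = dict(zip(columns, row))
    overlays = db.execute(
        "SELECT column_name, value FROM translations "
        "WHERE table_name = ? AND record_id = ? AND language = ?",
        (table, record_id, language),
    )
    for column, value in overlays:
        if column in values:
            values[column] = value  # translated value replaces the original
    return values


print(load("content", 1, ["title", "body"], "de"))
```

Note that the `translations` table stores every value as plain text and knows nothing about what kind of field it came from. That is why a translation interface built on it can offer little more than a text box.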
So which one is the best?
This is a tricky question. While generic database overlays are marketed as the most versatile solution, they are very unfriendly and limiting for end users. The other end of the spectrum (which is chosen by Drupal modules) has separate content instances that pave the way for a handful of appealing features like revision control and permission handling, as well as professional workflow possibilities. However, until sharing improvements recently reached Drupal's i18n module set, data sharing was tricky and inconvenient for users.
Drupal 6 will hopefully push forward a content-based solution, but there are still some open questions. If you are interested, a discussion is taking place on this topic on Drupal.org.
This post was originally published as a guest post on the Development Seed blog. It is no longer available there, so it is archived here for posterity.