Refining multilanguage: adding language fallback as an option
When language fallback is disabled, the item is not shown if the translation is not available.
In a previous post I described the first version of the multilanguage database used on this website. For every table that has fields that must translated, we add a 'translation' table with these fields. I also implemented language fallback: if an item, like a blog post, does not exist in the selected language then the item of the (system-wide) default language is shown. This works fine but now I want to add optional fallback allowing some posts only to appear in Spanish, others only in French, etc.
We could implement this easily by changing the fallback language to some unknown / non-existing language. Now if we add a Spanish language post, the post in other languages is not available, just what we want. The only problem is that we now do not have a fallback language anymore. Is this a problem?
Consider the following case. We have a post in Spanish, and want the non-existing post in French to fallback to the default language English. But we do not want the non-existing post in German to fallback to the default language. This is not possible with the current implementation. How often will such a condition occur? Not often, but I can imagine it may happen, and want to prepare for this. So how do we implement this?
Some fallback options
There are many fallback options possible for a multilanguage database. We always need somewhere a switch saying:
- fallback to the default language item if no translation present for the selected language
- do not fallback, meaning that this item is not available in the selected language
For fallback we can have options like:
- record level fallback e.g. title, description
- field level fallback e.g. title
We can also have:
- single fallback, e.g. there is only one fallback language
- recursive fallback, e.g. if not exists in Portuguese use Spanish, if not exists in Spanish use English
The fallback language can be:
- system wide
- at record level
- at field level
Without going into much detail you can imagine that by implementing maximum flexibility, all options, we are going to create a very complex system that will require complex queries. In fact, it may be necessary to build a complex preprocessor for the front-end to prevent slow page views because of all the language option processing.
Back to basics. I want for this website:
- Support for multiple languages
- There is a system wide default language
- There is a system wide fallback language
- The default action must be language fallback
- Per language item it must be possible to disable language fallback, making the item unavailable if not published
I defined the following languages:
- language_selected: language in use
- language_default: the default language, a static system wide setting
- language_fallback: the fallback language, a dynamic system wide setting based on selected language
Most of the time the fallback language will be the default language but in future we could change this to another language as well:
Example#1: fr-Be fallback to fr-FR
- default language: en-US
- selected language: fr-BE
- fallback language: fr-FR
Example#2: es-ES with fallback to en-GB
- default language: en-GB
- selected language: es-ES
- fallback language: en-GB
I want the queries not to slow down the system but if they do we can always add 'query result caching' and/or query preprocessing.
At the moment there two content item tables:
We always start by selecting one or more records from ContentItem. If the ContentItemTranslation record is not present (or published) for the selected language then we must fallback but only if fallback is enabled for this item. Time to look at the query used to retrieve the items. We distinguish two possibilities:
- ContentItemTranslation record does not exist
- ContentItemTranslation record exists
In both cases we can do a join of union, the first part gets the record for the selected language and the second part gets the language for the default language. The difficulty is how we disable fallback. When the ContentItemTranslation record exists we can put a flag 'do_not_fallback' here but what if the ContentItemTranslation record does not exist? Then we need another language specific flag somewhere and again this flag may or may not exist.
One way to do this is add another table ContentItemTranslationDoNotFallback containing just one flag = do_not_fallback. Like with ContentItemTranslation, ContentItemTranslationDoNotFallback records may or may not be present. Then we have three tables:
We want the default action to be fallback to the fallback language if the 'translation record does not exist'. This all looks pretty straightforward. But this also introduces an extra record of a new table that may or may not be present.
Another way to achieve what we want is to add the 'do_not_fallback' flag to the translation record. I mean, if we have to add a record to a new table ContentItemTranslationDoNotFallback to indicate 'do_not_fallback' then why not use the ContentItemTranslation table record for this purpose? If it does not exist, we add it, otherwise we simply use it.
In this case, the default action is still fallback to the fallback language if the 'translation' record does not exist. We insert a translation record now not only when we have translation value but also when we want fallback for this record. In the translation record we use the published / active flag to indicate if the translation exists. The table below shows the actions for the selected language.
|1||Does not exist||-||-||No translated value, fallback|
|2||Exists||True||True or False||Use translated value|
|3||Exists||False||False||No translated value, fallback|
|4||Exists||False||True||No translated value, no fallback, item not available|
Below is a possible SQL query implementation. Note that I am using SELECT's and UNION here which provide for easy selection of objects when using a SQLAlchemy query. In the current implementation I have two flags, deleted and status, that must both be False. content_item_parent_id = 0 means the main translation record.
SET @language_fallback_id = 1; SET @language_selected_id = 2; ###  = fallback: translation record does not exist ( SELECT ci.id ci_id, cit.id cit_id, cit.title cit_title FROM content_item ci, content_item_translation cit WHERE ci.content_item_type = 1 AND ci.content_item_parent_id = 0 AND ci.published = 1 AND cit.content_item_id = ci.id AND cit.content_item_parent_id = 0 AND cit.published = 1 AND cit.language_id = @language_fallback_id AND NOT EXISTS ( # return 1 when exists SELECT 1 FROM content_item_translation cit2 WHERE cit2.content_item_id = ci.id AND cit2.content_item_parent_id = 0 AND cit2.language_id = @language_selected_id ) ) UNION ALL ###  = use translation: translation record exists and published = True ( SELECT ci.id ci_id, cit.id cit_id, cit.title cit_title FROM content_item ci, content_item_translation cit WHERE ci.content_item_type = 1 AND ci.content_item_parent_id = 0 AND ci.published = 1 AND cit.content_item_id = ci.id AND cit.content_item_parent_id = 0 AND cit.published = 1 AND cit.language_id = @language_selected_id ) UNION ALL ###  = fallback: translation record exists and published = False & do_not_fallback = False ( SELECT ci.id ci_id, cit.id cit_id, cit.title cit_title FROM content_item ci, content_item_translation cit WHERE ci.content_item_type = 1 AND ci.content_item_parent_id = 0 AND ci.published = 1 AND cit.content_item_id = ci.id AND cit.content_item_parent_id = 0 AND cit.published = 1 AND cit.language_id = @language_fallback_id AND EXISTS ( # return 1 when published = False & do_not_fallback = False SELECT 1 FROM content_item ci2, content_item_translation cit2 WHERE ci2.id = ci.id AND ci2.content_item_type = 1 AND ci2.content_item_parent_id = 0 AND ci2.published = 1 AND cit2.content_item_id = ci2.id AND cit2.content_item_parent_id = 0 AND cit2.published = 0 AND cit2.do_not_fallback = 0 AND cit2.language_id = @language_selected_id ) ) ORDER BY ci_id DESC;
For SQLAlchemy I decided for the moment to use raw sql to get the count and ids for pagination and then use these ids to select the objects using a second query. Yes, I am a little tired of translating SQL to SQLAlchemy all the time, it takes a lot of time and for what? Still, I believe using ORM can help a lot in reducing code, but not all the time. So its about using it the right way.
Of course the administrator must be able to view the status of the content items. The status of the translation content item is per language indicated as follows:
Not published, fallback
Not published, no fallback, not available
We added the option 'do_not_fallback' to the content items of our database. This introduced more complexity but the extra overhead looks not really massive. Compared to the situation where we always had fallback to the default language, the changes are:
- The count of the records (items) is no longer the count of the number of records for the default language
- When we view an item and switch the language to a non-fallback item that does not have a translation, then we must show that the item is not available in the selected language
- The 'most viewed' (blog posts) function, the 'search' (blog posts) function, etc. must now also filter for non-available items.
Again a lot of work but I feel this now is the multilanguage functionality I can use for the remainder of this project.
Leave a comment
- Two Flask apps, frontend and admin, on one domain using DispatcherMiddleware
- Flask with multiple forms on a page posted using AJAX and returning a rendered form result
- Adding url_for() links to Jinja templates of a Flask multilanguage website
- SLQAlchemy dynamic query building and filtering including soft deletes
- Docker containers suddenly using 192.168.0.0/16 instead of 172.17.0.0/16: services lost
- OWASP Zed Attack Proxy (ZAP) security testing