We recently completed a project for an international site featuring a large amount of content in many languages. Not only did we have to support several base languages, we were also required to customize content for specific countries. This led us to use Sitecore’s built-in mechanism of language/country pairs, e.g. “en-US” or “en-GB”, in combination with the Language Fallback Shared Source module, which falls back to the base language (e.g. “en”) when no specific translation exists.
We set LanguageEmbedding to “always”, which gave us unique URLs for every content page in every language (e.g. mydomain.com/en-US/) without using cookies to keep track of the selected language and country.
So far so good. But when authors started entering content, some issues showed up. Soon we had over 140 Sitecore languages set up in the system, and the content tree quickly contained thousands of items.
Issue #1: Uploads to Media Library becoming extremely slow
By default, Media.UploadAsVersionableByDefault is set to “false”, so Sitecore uploads media items unversioned. This is usually a good thing, but unfortunately in this case Sitecore creates a version for each language defined in the database, even if we don’t need them. Versioning the media items wasn’t an option because authors would have had to upload the media in every language.
Luckily, Yogesh Patel posted a workaround for this problem that pointed us in the right direction, but we couldn’t really adapt it because it was not possible to override the MediaCreator.CreateItem() method, which contains a loop creating a version in every language.
A simple trick helped us out here instead:
We set the versionedTemplate to system/media/unversioned/… for each mediaType and set Media.UploadAsVersionableByDefault to “true” in the Sitecore configuration. This way, media items are still created with shared fields, but the CreateItem() method creates only one language version for each uploaded media item. Uploads were instantly fast again. Of course, this solution prevents using versioned media items with versioned fields, but we do not need them.
Issue #2: Large Cache sizes
Sitecore’s item and prefetch caches tended to use several gigabytes of memory. This was because fetching an item in a specific language, e.g. “en-US”, would add that version to the cache even if there was no specific content for that item. As a result, a lot of items were added to the cache even though they didn’t contain any content, leading to a general slowdown while editing (clearing items out of the cache on save took a lot of time).
What helped was a tweak to the LanguageFallback module’s GetItem() method. Whenever an item was requested in “en-US” and the corresponding fallback version in “en” was returned, a small cache of our own would remember that mapping and, the next time, return the “en” version immediately without even calling GetItem() on the “en-US” version. That way we could avoid a lot of cache entries. The item cache size decreased from several GB to just a few hundred MB. Of course, we had to clear our fallback cache whenever authors created an “en-US” version, which was not trivial.
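The idea behind this fallback cache can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python, not the module’s actual code: the FallbackCache class, the get_version loader and the fallbacks mapping are hypothetical names made up for this example and are not part of any Sitecore API.

```python
from typing import Callable, Dict, Optional, Tuple


class FallbackCache:
    """Remembers which (item, language) requests were served by a fallback language.

    Hypothetical sketch: `get_version` stands in for whatever loads a single
    language version of an item and returns None when that version is empty.
    """

    def __init__(
        self,
        get_version: Callable[[str, str], Optional[dict]],
        fallbacks: Dict[str, str],
    ) -> None:
        self._get_version = get_version
        self._fallbacks = fallbacks  # e.g. {"en-US": "en"}
        self._resolved: Dict[Tuple[str, str], str] = {}

    def get_item(self, item_id: str, language: str) -> Optional[dict]:
        # Known fallback: skip the lookup in the requested language entirely.
        target = self._resolved.get((item_id, language))
        if target is not None:
            return self._get_version(item_id, target)

        content = self._get_version(item_id, language)
        if content is not None:
            return content

        fallback = self._fallbacks.get(language)
        if fallback is None:
            return None

        content = self._get_version(item_id, fallback)
        if content is not None:
            # Remember only the small (item, language) -> fallback mapping.
            self._resolved[(item_id, language)] = fallback
        return content

    def invalidate(self, item_id: str, language: str) -> None:
        # Must be called when authors create a version in `language`.
        self._resolved.pop((item_id, language), None)
```

In this sketch, the second request for an “en-US” item that only exists in “en” goes straight to the “en” version. The invalidate() call corresponds to the non-trivial part mentioned above: it has to run whenever a specific version is created, otherwise the stale mapping would keep hiding the new content.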
Issue #3: Fix sorting in language dropdown
Having over 140 languages listed in seemingly random order was not very good for authors to work with. Fortunately, we were able to follow Igor Strikov’s solution for implementing custom sorting of languages, which let us display all base languages at the top. Thanks, Igor, for that post.
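The ordering we wanted can be illustrated with a simple sort key. This is only a sketch of the ordering rule in Python, assuming language names of the form “en” or “en-US”; the sort_languages function is a hypothetical helper and not taken from Igor’s post or from Sitecore.

```python
from typing import Iterable, List, Tuple


def sort_languages(names: Iterable[str]) -> List[str]:
    """Sort language names so that base languages ("en") come before
    language/country pairs ("en-US"), each group in alphabetical order."""

    def key(name: str) -> Tuple[bool, str, str]:
        base, _, region = name.partition("-")
        # False sorts before True, so names without a region come first.
        return (region != "", base.lower(), region.lower())

    return sorted(names, key=key)


print(sort_languages(["en-US", "de", "en", "de-AT", "en-GB"]))
# ['de', 'en', 'de-AT', 'en-GB', 'en-US']
```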
As you can see, building sites with a large number of languages can lead to issues, and currently there seems to be no out-of-the-box solution for dealing with requirements like this. We couldn’t find any other way of solving this while still having clean “RESTful” URLs without using cookies to store the currently selected country.
Another way to achieve the above would probably be to extend Sitecore’s LinkManager and manage language fallbacks while resolving languages instead of on a per-item basis. For example, resolving “en-DE” would check whether that language exists in the database and, if not, directly switch the context language to “en”. That way we would only need to create Sitecore languages when they were really needed to localize content. Unfortunately, extending LinkManager isn’t that trivial.
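The resolution step described here could look roughly like the following. Again, this is a language-neutral sketch in Python under our own assumptions, not an actual LinkManager extension: resolve_language and the available set are hypothetical stand-ins for “the languages defined in the database”.

```python
from typing import Optional, Set


def resolve_language(requested: str, available: Set[str]) -> Optional[str]:
    """Return the language to use as context language for a requested name.

    Uses the requested language if it is defined, otherwise falls back to
    its base language, otherwise returns None.
    """
    if requested in available:
        return requested
    base = requested.partition("-")[0]  # "en-DE" -> "en"
    if base in available:
        return base
    return None


available = {"en", "en-US", "de"}
print(resolve_language("en-DE", available))  # en
print(resolve_language("en-US", available))  # en-US
```

With this approach the fallback decision is made once per request rather than once per item, which is why only the languages that actually carry localized content would have to be defined.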
Any thoughts on this are greatly appreciated.