Sitecore multi language challenges

We recently completed a project for an international site featuring a large amount of content in many languages. Not only did we have to support several base languages; we also had to be able to customize content for specific countries. This led us to use Sitecore’s built-in mechanism of language/country pairs, e.g. “en-US” or “en-GB”, in combination with the Language Fallback Shared Source module to fall back to the base language, e.g. “en”, when there was no specific translation.

We set LanguageEmbedding to “always”. This way we had unique URLs for every content page in every language, e.g. mydomain.com/en-US/, without using cookies to keep track of the selected language/country.

So far, so good. But when authors started entering content, some issues showed up. Soon we had over 140 Sitecore languages set up in the system, and the content tree quickly contained thousands of items.

This led to…

Issue #1: Uploads to Media Library becoming extremely slow

By default, Media.UploadAsVersionableByDefault is set to “false”, so Sitecore uploads media items unversioned. This is usually a good thing, but unfortunately in this case Sitecore creates a version for each language defined in the database, even if we don’t need them. Versioning the media items wasn’t an option because authors would have had to upload the media in every language.
Luckily, Yogesh Patel posted a workaround for this problem which pointed us in the right direction, but we couldn’t really adapt it because it was not possible to override the MediaCreator.CreateItem() method, which contains a loop creating a version in every language.

A simple trick helped us out here instead:
We set the versionedTemplate to system/media/unversioned/… for each mediaType and set Media.UploadAsVersionableByDefault to true in the Sitecore config. This way, the media items are created with shared fields, but the CreateItem() method only creates one language version for each uploaded media item. Uploads were instantly very fast again. Of course, this solution prevents using versioned media items with versioned fields, but we did not need them.
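
To make the difference concrete, here is a toy Python model of the two upload paths described above. It is not Sitecore code: MediaItem, create_media_item and the language list are hypothetical names chosen for illustration, and the model only mirrors the behaviour as described in this post (one version per defined language for unversioned uploads, a single version otherwise).

```python
from dataclasses import dataclass, field


@dataclass
class MediaItem:
    name: str
    # Maps a language code to a version number.
    versions: dict = field(default_factory=dict)


def create_media_item(name, languages, context_language, versionable):
    """Toy model of the upload behaviour described in the post (assumption)."""
    item = MediaItem(name)
    if versionable:
        # "Versionable" upload: only the current language gets a version.
        item.versions[context_language] = 1
    else:
        # Unversioned upload: one version per language defined in the database.
        for language in languages:
            item.versions[language] = 1
    return item


languages = ["en", "en-US", "en-GB", "de", "de-DE"]
slow = create_media_item("logo.png", languages, "en", versionable=False)
fast = create_media_item("logo.png", languages, "en", versionable=True)
print(len(slow.versions), len(fast.versions))
```

With over 140 languages, the loop in the unversioned branch is what made every single upload expensive in our case.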

Manually removing all unneeded language versions from the database tables helped us speed things up even more, as millions of unused records could be removed.

Issue #2: Large Cache sizes

Sitecore item and prefetch caches tended to use several gigabytes of space. This was because fetching an item in a specific language, e.g. “en-US”, would add that version to the cache even if there was no specific content for that item. That way, a lot of items were added to the cache even though they didn’t contain any content, leading to a general slowdown while editing (clearing items out of the cache while saving took a lot of time).

What helped was a tweak to the LanguageFallback module’s GetItem() method. Whenever an item was requested in “en-US” and the corresponding fallback version in “en” was returned, our cache would save the connection and return the “en” version immediately the next time, without even calling GetItem() on the “en-US” version. That way we could avoid a lot of cache entries. The item cache size decreased from several GB to just a few hundred MB. Of course, we needed to clear our fallback cache whenever an “en-US” version was created by authors, which was not trivial.
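
The idea can be sketched in a few lines of Python. This is a minimal illustration of the technique, not the module’s actual code: FallbackCache, get_item and fallback_of are hypothetical names, and the real tweak lived inside the LanguageFallback module’s GetItem() method.

```python
class FallbackCache:
    """Remembers which (item, language) pairs resolved to a fallback language."""

    def __init__(self, get_item, fallback_of):
        self._get_item = get_item        # (item_id, language) -> item or None
        self._fallback_of = fallback_of  # e.g. {"en-US": "en"}
        self._redirects = {}             # (item_id, language) -> fallback language

    def get(self, item_id, language):
        redirect = self._redirects.get((item_id, language))
        if redirect is not None:
            # Known fallback: skip the lookup in the specific language entirely.
            return self._get_item(item_id, redirect)
        item = self._get_item(item_id, language)
        if item is not None:
            return item
        fallback = self._fallback_of.get(language)
        if fallback is None:
            return None
        item = self._get_item(item_id, fallback)
        if item is not None:
            self._redirects[(item_id, language)] = fallback
        return item

    def invalidate(self, item_id, language):
        # Must be called when authors create a version in the specific language.
        self._redirects.pop((item_id, language), None)


store = {("home", "en"): "Home (en)"}
calls = []


def get_item(item_id, language):
    calls.append((item_id, language))
    return store.get((item_id, language))


cache = FallbackCache(get_item, {"en-US": "en"})
first = cache.get("home", "en-US")   # looks up en-US, then falls back to en
second = cache.get("home", "en-US")  # goes straight to en
store[("home", "en-US")] = "Home (en-US)"
cache.invalidate("home", "en-US")
third = cache.get("home", "en-US")   # finds the new en-US version
```

The invalidate() call is the part that was not trivial in practice: every code path that creates a language version has to trigger it, otherwise authors keep seeing the fallback content.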

Issue #3: Fix sorting in language dropdown

Having over 140 languages in random order was hard for authors to work with. Fortunately, we could follow Igor Strikovs solution to implement custom sorting for languages, letting us display all base languages at the top. Thanks, Igor, for that post.
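
As a rough illustration of the ordering we were after (not the code from Igor’s post), the following Python sketch puts base languages first and regional variants after them. The function name sort_languages is hypothetical, and sorting alphabetically within each group is an assumption made for this example.

```python
def sort_languages(codes):
    """Base languages ("en") first, then language/country pairs ("en-US")."""
    # The key is (is_regional, lowercase code); False sorts before True.
    return sorted(codes, key=lambda code: ("-" in code, code.lower()))


print(sort_languages(["en-US", "de", "fr-FR", "en", "de-DE", "en-GB"]))
# ['de', 'en', 'de-DE', 'en-GB', 'en-US', 'fr-FR']
```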

Conclusion

As you can see, building sites with a large number of languages can lead to issues, and currently there seems to be no out-of-the-box solution for dealing with requirements like this. We couldn’t find any other way of solving this while still having clean “RESTful” URLs and not using cookies to store the currently selected country.

Alternatives

Another way to achieve the above would probably be to extend Sitecore’s LinkManager and manage language fallbacks while resolving languages instead of on a per-item basis. For example, resolving “en-DE” would check whether that language exists in the database and, if not, directly switch the context language to “en”. That way we would only need to create Sitecore languages when they were really needed to localize content. Unfortunately, extending LinkManager isn’t that trivial.
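
The resolution step itself is simple; the hard part is wiring it into link generation and request handling. A minimal Python sketch of the lookup, assuming a hypothetical resolve_context_language function and a plain set of defined language codes, could look like this:

```python
def resolve_context_language(requested, defined_languages):
    """Return the language to use for a requested code, or None if unknown.

    Falls back from a language/country pair ("en-DE") to its base
    language ("en") when the pair is not defined.
    """
    if requested in defined_languages:
        return requested
    base = requested.split("-", 1)[0]
    if base in defined_languages:
        return base
    return None


defined = {"en", "de", "en-US"}
print(resolve_context_language("en-DE", defined))  # en
print(resolve_context_language("en-US", defined))  # en-US
print(resolve_context_language("fr-FR", defined))  # None
```

With this approach, “en-DE” would never have to exist as a Sitecore language unless authors actually wanted to localize content for it.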

Any thoughts on this are greatly appreciated.


3 responses to “Sitecore multi language challenges”

  1. Hi Mark,
    I know this is a really old blog post but it’s very relevant to me at the moment 🙂

    I too am managing a site with lots of languages to support, and I’ve always been confused about when to use versioned vs. unversioned media files. We have settled on unversioned for now, and I really like your solution of overriding the versioned template. A few questions I have: if you do this, does that mean when you upload a new media item, you only end up with the default “en” version of that item? And if so, does that version still get served up to any requests for it, regardless of the culture code being used at the time?

    Another question I have is for all the older unversioned media items we have, that still have one version for every language…if we attach a new media file to an item, and then we need to publish it, do we only need to publish the default “en” version so that all requests will get the updated content? Or do we still need to publish every language version?

    I hope those questions make sense. Thanks in advance for your time and attention.

    • Hi Matt,

      This blogpost was based on Sitecore 7.1 so the current releases might be better at handling a large number of languages.

      If you use the trick described in this post you’ll end up with only one language and a shared “media” field. This will allow you to serve up the media item in any language without having to create a version for each. Also, publishing of the one language should be sufficient. Of course, please give this a try first in your environment. There might be some factors of your solution (SC Version, language fallback, custom media handlers,…) that have an impact on this.

      I would only advise following this workaround, though, if you are experiencing very slow media item uploads.

      • Hi Mark,
        Thanks for the reply. I’ll give it a shot and see how it goes. We are not having slow uploads per se but we are having some overall performance issues and I’m trying to track down any and all improvements we can make. We are on 8.2 and other than the language fallback functionality being built-in now, the language issues are much the same as 7.x. Thanks again for the info.

        Matt
