User:RAdimer-WMF/Marking pages for translation

From Meta, a Wikimedia project coordination wiki

This page seeks to provide a basic guide to preparing, marking, and managing translatable pages, assuming zero programming/coding background and a basic knowledge of how to edit MediaWiki pages. With concerns or suggestions, please let me know on my talk page!

Existing knowledge

For a brief overview of necessary MediaWiki context/knowledge...

MediaWiki pages are written in MediaWiki markup, also known as wikitext, used to format the page's content. Pressing the "Edit source" button will show you the page's markup.

  • Internal links (use these over external links wherever possible)
    • [[User:RAdimer-WMF|My userpage]] will show as: My userpage
    • Internal links link to a page's title, not the URL. The link target and shown text are separated by a pipe, |.
    • These can link both to pages on the same project, and any other Wikimedia wiki, with few exceptions.
      • A link to my Meta-Wiki userpage, for example, would be [[:m:User:RAdimer-WMF]]. "m:" is a shortcut for Meta; you can also use "metawiki:". A full list of shortcuts is available here.
      • Aside from Wikimedia Foundation wikis, there are other acceptable interwiki link targets. See the interwiki map. For example, the prefix for Diff posts is "wmfblog:".
  • External links
    • [https://wikimediafoundation.org Wikimedia Foundation site] is shown as: Wikimedia Foundation site
    • External links need the full URL (i.e., starting with https://). They are shown in a slightly different color than internal links, with a small icon next to them.
    • The link target and shown text are separated by a space.
  • Lists – bulleted and numbered
    • Bulleted lists use asterisks, *, at the start of each line. To indent, place an additional asterisk. For normal indents, rather than a bullet, use a colon, :.
    • Numbered lists use number signs, #, at the start of each line. To indent, place an additional number sign.
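
As a sketch of the list markup described above (the item text is made up for illustration), a bulleted list with one nested item, followed by a numbered list, might be written as:

* First item
** Nested item under the first item
* Second item

# Step one
# Step two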

For additional formatting options to the ones above, please see Help:Wikitext.

  • Transclusion
    • To transclude a page is to call its contents from another page. This is done with two curly brackets, e.g., {{Navigation header}}.
    • For example, I've created a subpage in my userspace, User:RAdimer-WMF/Hi, whose sole content is "Hello!".
    • When I add {{User:RAdimer-WMF/Hi}}, it shows: "Hello!"
    • To note: if the page is in the template namespace, you do not need a namespace prefix or a colon before the title. (In other words, the template namespace is the default.) If it is in mainspace (articles and main content, usually), which has no prefix, place a single colon, :, before the title. Otherwise, just use the full page title.
  • Templates
    • Templates are stored in the template namespace. These are pages that are intended to be transcluded onto pages in other namespaces; think of it as a holding place for frequently-transcluded pages.
    • For example, when we set up the 2023-2024 Annual Plan pages, it involved a header that would be transcluded at the top of every Annual Plan page. The header formatting was then placed in the template namespace. Now, instead of copying and pasting the formatting on top of every Annual Plan page, we can transclude the same content from one central place. This also means that, to change the header everywhere, you only need to change it in one place. Templates are very useful.
  • Subpages
    • A subpage is a way to connect related pages. Its format is generally: Page title/subpage title
    • For example, this page is a subpage. Its title is "User:RAdimer-WMF/Marking pages for translation". Its parent page is my userpage, User:RAdimer-WMF.
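
To illustrate the namespace rules for transclusion, here is a sketch of the three cases. "Navigation header" and "User:RAdimer-WMF/Hi" come from the examples above; "Main Page" is only a stand-in for any mainspace page.

{{Navigation header}}
{{:Main Page}}
{{User:RAdimer-WMF/Hi}}

The first line transcludes Template:Navigation header, the second transcludes the mainspace page Main Page, and the third transcludes a page by its full title.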

What is the translate extension, and how does it work?

The translate extension allows us to (relatively) easily translate pages into multiple languages, on wikis where the extension is installed.

After a page is edited with appropriate translation markup, which the rest of this page discusses, a translation admin can mark the page for translation (this is a technical term), turning it into a translatable page. The extension will break up translatable content into numbered units.

When translators translate the page, they will be shown each unit individually, and can translate them in a user-friendly interface. When a user translates a unit, instead of editing the source page, the extension edits (or creates, if it doesn't exist) a subpage with that content.

For example, let's say I'm translating the page Example page. If I translate block 15 into Russian (ru), the extension adds the translation to Translations:Example page/15/ru. It also updates the main /ru page, Example page/ru, with that content. When a reader then accesses the page and clicks "Russian" on the languages bar, they are sent to Example page/ru.

The power of the translate extension is in its handling of all of these subpages and language versions. In the past, this had to be done manually. Now, much of the technical side is abstracted away from the reader and translator experience: translators use the user-friendly interface, and readers simply click on their preferred language.

However, someone still needs to tell the extension how to segment a translatable page into translation units.

The role of translation administrators

Readers read, translators translate. What do translation administrators do?

Translation administrators ensure that translatable pages are properly set up and handled by the extension. In general, this includes preparing pages for translation (though the userright is not required for this), marking pages for translation, and handling translatable page administration. These three aspects are outlined on this page.

If you're interested, I wrote a Diff post about this topic titled "Understanding translation administration".

Preparing pages for translation

Beginning markup

We know that marking a page for translation means defining the translation blocks. This happens with the use of <translate> tags. Be sure to use the source editor (not visual editor).

Every use needs both an opening <translate> tag and a closing </translate> tag.

Single line

Let's start with preparing a single line of text:

Hello! My name is Rae.

I would mark this up as:

<translate>Hello! My name is Rae.</translate>

If I were to save this edit and mark the page for translation, the extension would add a marker for the translation block number right at the front of the block. It would look something like:

<translate><!--T:1--> Hello! My name is Rae.</translate>

When a translator goes to translate the page, the text will be shown to them as a translation block. When they translate it, their translation is saved as translation block 1.

Never add or edit translation block markers, i.e., <!--T:1-->, by hand. The only time you should remove one is when you are removing its translation block; in that case, remove the marker along with it.

Multiple lines

So...do I need to add <translate> tags around every line? Nope. In practice, you will not use very many <translate> tags. If you have an empty line between two lines of content within translation tags, it will mark each line as a different translation block. For example:

Hello! My name is Rae.

I am part of the Movement Communications team.

I would mark this up as:

<translate>
Hello! My name is Rae.

I am part of the Movement Communications team.
</translate>

This would create two translation blocks: one for the text on the first line, and another for the text on the third line. Remember that the two lines need a blank line between them; simply being on different lines is not enough.
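
For contrast, here is a sketch of the same text without the blank line:

<translate>
Hello! My name is Rae.
I am part of the Movement Communications team.
</translate>

Because there is no empty line between them, both sentences should end up in a single translation block.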

Obstacles

What about if I have non-text items or formatting in the way of my awesome and clean translation blocks?

It's not always useful to simply wrap an entire page with a single translation tag. Everything inside translation tags will vary between language versions. It's best to ensure that only content which needs to be translated is within translation tags. (The exceptions to this are link targets and some templates, which are covered in the translation variables section.)

Instead, we can close, and then reopen, translation tags around important parts of the page's content that we do not want to vary.

Let's take the content from earlier and add some styling to the second line of text.

<translate>Hello! My name is Rae.

<span style="color:blue">I am part of the Movement Communications team.</span></translate>

Generally, we'd want to avoid having translators retype formatting like this. So, we can close and then reopen the translation tags around the formatting markup.

<translate>Hello! My name is Rae.</translate>

<span style="color:blue"><translate>I am part of the Movement Communications team.</translate></span>

This would create two translation blocks: one for the first line of text, and one for only the "I am...team" of the third line.

This looks like a lot of translate tags, and it is. But the benefit is that it shows the translators only the text they need, and avoids the possibility of losing formatting in translation.

Importantly, never close a translation tag mid-sentence, and try to avoid breaking up closely related sentences, as it limits the context translators have. If you want to use span tags or complex formatting mid-sentence, you can use tvars, which are discussed later in this document.
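
As a rough sketch of the mid-sentence case, assuming I want only the team name in blue, the opening and closing span tags could each go in their own tvar (tvars are explained in the translation variables section):

<translate>I am part of the <tvar name="1"><span style="color:blue"></tvar>Movement Communications team<tvar name="2"></span></tvar>.</translate>

Translators would then see something like:

I am part of the $1Movement Communications team$2.

The sentence stays in one translation block, and translators place $1 and $2 around the translated team name instead of retyping the formatting.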

Let's say I have an image in between two blocks of text, which I don't want to change based on language. I might prepare it for translation like:

<translate>Hello! My name is Rae.</translate>

[[File:Eastern Grey Squirrel in St James's Park, London - Nov 2006 edit.jpg]]

<translate>I am part of the Movement Communications team.</translate>

This would create two translation blocks, one for each line of text. Because the image markup is not inside translation tags, it will not be in any translation block, i.e., it will not vary between page language versions.

Section titles

Section titles are handled differently, because of how linking to a page's sections works.

Let's say someone has their interface language set to Russian, and they click on this link: [[Special:MyLanguage/Translatable page#Section]]

Special:MyLanguage would open the page's Russian version, but #Section is still in English! Because no section title with that name exists in the Russian version, the link would not direct people to the intended section.

Thanks to updates to the extension, this is no longer a problem. Including the entire section header line in a translation unit allows the extension to generate a linkable anchor for the heading.

For example:

<translate>
== Section header ==

Section text.
</translate>

There needs to be a line break between the open translate tag, section header, and close translate tag.
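
After the page is marked, the marker for a section header is placed at the end of the header line rather than in front of it. Assuming these are the first two blocks on the page, the markup above would look something like:

<translate>
== Section header == <!--T:1-->

<!--T:2-->
Section text.
</translate>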

Translation variables (tvars)

Translation variables (tvars) are incredibly important to effectively marking pages for translation.

Conceptually, tvars allow us to turn content in a translation block into a variable. These are used when content will be the same between translated versions, but we cannot simply close and then reopen the translate tags around it.

Links are the most common usage of translation variables. Closing and reopening translate tags around a mid-sentence link would cut the sentence in half, into two translation blocks. We want to make sure that translation blocks are presented cleanly and easily to translators.

So, we use tvars.

Let's say I have the following line:

I am part of the [[Movement Communications|Movement Communications team]].

If I were to mark this for translation with a tvar, I would use:

<translate>I am part of the [[<tvar name="1">Movement Communications</tvar>|Movement Communications team]].</translate>

The translator, when they get to this block, will see:

I am part of the [[$1|Movement Communications team]].

Translation variables open with <tvar name="NAME"> and close with </tvar>. Anything inside of the tvar can be called in the block's translation with $NAME, where NAME is the name you set. The name can be anything, but there cannot be two tvars of the same name in the same translation block.

It is also recommended to use numerals as tvar names. A word used as a name could accidentally be translated along with the rest of the block, which would break the variable and likely cause a red link.
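
As an illustration of naming, here is a sketch with two links in one translation block. The second link and its target are made up for this example; each tvar gets its own numeral.

<translate>I am part of the [[<tvar name="1">Movement Communications</tvar>|Movement Communications team]] at the [[<tvar name="2">Wikimedia Foundation</tvar>|Wikimedia Foundation]].</translate>

Translators would see:

I am part of the [[$1|Movement Communications team]] at the [[$2|Wikimedia Foundation]].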

Because link targets (the page title you're linking to) generally do not change based on language, we almost always use tvars for them. We do not want translators to accidentally translate the name of a page, which would change the target of the link, likely to a page that does not exist.

If the link's target is itself a translatable page, add "Special:MyLanguage/" to the front of the page title. In an interwiki link, it goes after the interwiki prefix and before the namespace and page title, e.g. [[commons:Special:MyLanguage/Help:Contents]].
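
Combined with a tvar, a link to a translatable page might be marked up as follows. "Translatable page" is a placeholder title.

<translate>Please read the [[<tvar name="1">Special:MyLanguage/Translatable page</tvar>|translatable page]].</translate>

Translators would see [[$1|translatable page]], and readers who follow the link are sent to the version in their interface language when one exists.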

Templates

If a template is called on its own line and separated from text, it's generally best to wrap only the translatable portions in translate tags. For example:

{{Template|1=round|2=<translate>Hello there!</translate>}}

Ensure that the only content inside translate tags is content that needs to be translated. In the above example, parameter 1 is not translatable, while the text of parameter 2 is.

As a more practical example, we often use graph templates. It's not useful to have the data be translatable, as that would require translators to retype all of the data. Instead, we might wrap each of the data labels or legend items in translate tags.

So...what if templates are called in the middle of a line? In these cases, it's often best to use tvars, just like you would for a link. This allows easy use of the template without breaking the line into multiple translation blocks.
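
Here is a sketch of a mid-line template call wrapped in a tvar. "Example template" is a hypothetical template name, used only for illustration.

<translate>This page was last reviewed by <tvar name="1">{{Example template}}</tvar>.</translate>

Translators would see:

This page was last reviewed by $1.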

When transcluding translatable templates, ensure that the template has translation-aware transclusion enabled. When such a template is transcluded on a page, changing languages on that page will call the translated version of the template, if it exists in that language. If not, it will show the template in the source language.

Templates such as Template:Tunit use parser functions to call for content within the page. As such, when using them in templates that will be transcluded elsewhere, the "source page" variable must be specified. More information can be found at Template:Tunit/doc.

Images

If you're not using a caption, or other translatable element in the image markup, it's generally best to exclude it from translation blocks.

With images that have captions, there are generally three options, depending on what you want to do with it.

If the image is inside of a larger translation block, it is generally best to use a tvar. For example:

[[<tvar name="1">File:Image you're using.jpg</tvar>|Caption text]]

This would show to translators as:

[[$1|Caption text]]

If the image is on its own line, and is not otherwise part of a translation block, it's best to wrap the translatable content in translate tags to make it its own translation block.

[[File:Image you're using.jpg|upright=2|<translate>Caption text</translate>]]

If you want to show different images for each language, you can simply include the entire image link and markup in a translation block.
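
For that last option, a sketch using the same placeholder file name:

<translate>[[File:Image you're using.jpg|upright=2|Caption text]]</translate>

Translators are then shown the full image markup, so they can swap in a different file name for their language as well as translate the caption.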

Lists

We generally want to avoid creating very long translation blocks, as it can be difficult for translators, and will cause problems (in marking a lot more content than necessary as outdated) if the source page changes.

Remember that translation blocks, within one set of translate tags, separate based on empty lines. Lists also separate based on empty lines; a blank line between two list items designates them as different lists.

Thus, we cannot simply add lines between list items to designate them as different translation blocks; this can cause problems for people using screen readers, as each item would be treated as part of a separate list.

With lists that have a handful of short items, it can be okay to keep them in the same translation block. However, with lists that have more content, it is generally best to wrap each line in translation tags instead. As an example of a line:

#<translate>Hello!</translate>

When doing this, you should exclude the # from within the translate tags; then translators don't have to retype it, and also you can restructure the content later with ease (such as changing from # to *) without invalidating the block.

If the list is very likely to change (either list items added or removed, or existing items edited), then you should wrap each line individually. That way, translators won't need to work out what exactly changed within a multi-line block, and can focus on the new or updated line.
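
For example, a three-item numbered list with each line wrapped individually might look like this (the item text is reused from earlier examples):

#<translate>Hello!</translate>
#<translate>My name is Rae.</translate>
#<translate>I am part of the Movement Communications team.</translate>

This should give three translation blocks while keeping a single list, since no blank lines are added between the items.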

Marking the page for translation

After you have prepared the page with translation markup, it's time to mark it for translation. To do this, you will need translation-admin rights on that wiki.

There will be "mark this page for translation" text at the top center of the page. Click this, and verify the translation blocks that it will create. Scroll to the bottom of the page, and click the button to mark the page for translation.

How do readers access translated versions?

When you mark a page for translation, make sure it has the following markup at the top of the page:

<languages />

This shows the "languages bar", which lists all translated versions of the page and shows information on how complete each translation is.

Translation-aware transclusion

This is an option when you mark a page for translation. The default is yes, and I recommend keeping it there. In short, this means that transcluding the page onto a different translatable page will show it in the selected language.

As an example, the 2023-2024 annual plan has a translatable header template that shows on all plan pages. Translation-aware transclusion is enabled, and thus when you select another language in the top languages bar on a plan page, the transcluded template is shown in that language (if the translation exists).

Changing the source language

The default language for a new page is the project's default language; in general, on multilingual Wikimedia projects, this is English. If the page you are preparing for translation is not written in English, be sure to change it using Special:PageLanguage before you mark the page for translation.

You can still change the language after the page has been marked. However, doing so does not remove the English subpage (which presumably has non-English content); it simply marks the source page itself as a different language. You would then need to delete the English page or translate the source content into English, overwriting it.

Translatable page administration

Translator notes

The translate extension allows you to add notes for translators on individual blocks. After you have prepared the page and marked it for translation, click "Translate this page", and change the language to "qqq". You can then add comments for any translation block, which are shown to translators.

Changing the source text

After a page is marked for translation, changing the source page does not automatically change the text that translators are shown. You will need to mark the page for translation again to update the modified translation units.

When doing so, if the change affects any existing translation units, you will have the option to mark those as outdated in versions that have already been translated. This adds a red box around the content for readers.
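
As a sketch, assume a page with one marked block, where I edit the existing sentence and add a new paragraph (the added text is invented for this example):

<translate>
<!--T:1-->
Hello! My name is Rae, and this sentence has been edited.

This paragraph is new.
</translate>

The edited sentence keeps its existing marker. The new paragraph has no marker yet; when the page is marked for translation again, the extension should assign it one, such as <!--T:2-->.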

Removing units

Moving a translatable page

Message group statistics

TL;DR for techy people - The 'shorter' version

Anything within <translate></translate> is translatable (can vary between language versions). An empty line separates translatable units:
<translate>
Some text.

== Section header ==

Some other text.
</translate>

When a translate admin marks the above example page for translation, the extension would create and add markers for three units:

<translate>
<!--T:1-->
Some text.

== Section header == <!--T:2-->

<!--T:3-->
Some other text.
</translate>

The unit tags inform the translate extension of what the source content of each unit is, and where to find its translations. Unit tags can be moved within a page or removed, but do not add them manually.

Users can add translations for these units, which are stored in subpages. E.g., if a user translates unit 2 into Russian (ru), it will be stored in Translations:Example page/2/ru. When a reader selects Russian in the language bar, the content of Translations:Example page/2/ru is substituted for unit 2's source content.

(Add the language bar to the top of a page with <languages />.)

If there are non-translatable items, do not include them within <translate></translate> tags:

<translate>
Some text.

</translate>
{{A template that I don't want to change between language versions}}
<translate>
== Section header ==

Some other text.
</translate>

With link targets or other formatting that exist within a translation unit but do not vary between language versions, use translation variables (tvars):

This is a [[<tvar name="1">Link target</tvar>|link to a page]].

The above tvar is named 1, and has the content "Link target". Translators can call this variable in their translations with $1. The scope of translation variables is limited to the unit they are defined in.

If you have multiple translatable units with the same content, use tunits instead:

<translate><!--T:1--> Leadership</translate>
...
{{Tunit|id=1|content=Leadership}}

Tunits call the translations of a given unit, but not its source content (re-specified in the content parameter). By default it calls from the unit on the same page, though you can also call translations from units on other pages with the page parameter.
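
As a sketch of the cross-page case, assuming the unit lives on a page titled "Example page" and that the parameter is written as page= (check Template:Tunit/doc for the exact usage):

{{Tunit|page=Example page|id=1|content=Leadership}}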

Updating the page:

  • Edits to translatable pages need a translation admin to "mark" the page as "ready", before the source will update for all the translated versions.
  • When an admin marks a page for translation, and there are changes to source units, they have the choice to mark "prior translations" of those units as invalid - doing so would show a pink outline for the outdated translation unit on the rendered page, and would prioritize them for translators. Example screenshot.

Aim to minimize the work of translators:

  • Use tvars and tunits to reduce repeated work
  • Ensure that formatting and other content which doesn't need to vary between language versions is either in a tvar (if in-line) or outside of translate tags.
  • Add documentation for translators (by making a translation in the "qqq" language) if there are points of potential confusion.
  • Get an experienced translation admin to check your work if you're unsure. It's always easier to make a change before the page is marked for translation.

See also
