XLIFF Translate Plug-in

The DITA-OT Translate Plug-in creates, auto-translates and re-merges XLIFF files to generate translated documentation in a target language. It can create and consume files in either XLIFF 1.2 or XLIFF 2.1 format.

Example

This plug-in consists of three DITA-OT transforms:

  • The xliff-create transform creates XLIFF and skeleton files from the *.dita files.
  • The xliff-translate transform populates the <target> texts using an automatic translation service.
  • The xliff-dita transform recreates the DITA project using the translated texts.
<topic id="cicero" xml:lang="en-us">
  <title>Cicero</title>
  <body>
    <p>
      Loves or pursues or desires to obtain pain of itself, because it
      is pain, but occasionally circumstances occur in which toil and
      pain can procure him some great pleasure.
    </p>
    <p>
      To take a trivial example, which of us ever undertakes laborious
      physical exercise, except to obtain some advantage from it?
    </p>
    <p>
      But who has any right to find fault with a man who chooses to
      enjoy a pleasure that has no annoying consequences, or one who
      avoids a pain that produces no resultant pleasure?
    </p>
  </body>
</topic>
Figure 1. DITA File
<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <file datatype="xml" original="/cicero.dita" source-language="en" target-language="la">
    <header xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:dita="http://www.dita-ot.org">
      <skl>
        <external-file href="./skl/cicero.dita.skl" />
      </skl>
    </header>
    <body xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:dita="http://www.dita-ot.org">
        <trans-unit xmlns="" xmlns:dita="dita-ot.org" approved="no" id="42094" xml:space="preserve">
           <source xml:lang="en">
            Loves or pursues or desires to obtain pain of itself, because it
            is pain, but occasionally circumstances occur in which toil and
            pain can procure him some great pleasure.
          </source>
          <target xml:lang="la">
            Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
            eiusmod tempor incididunt ut labore et dolore magna aliqua.
          </target>
        </trans-unit>
        <trans-unit xmlns="" xmlns:dita="dita-ot.org" approved="no" id="5532" xml:space="preserve">
           <source xml:lang="en">
            To take a trivial example, which of us ever undertakes laborious
            physical exercise, except to obtain some advantage from it?
          </source>
          <target xml:lang="la">
            Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris
            nisi ut aliquip ex ea commodo consequat.
          </target>
        </trans-unit>
        <trans-unit xmlns="" xmlns:dita="dita-ot.org" approved="no" id="66134" xml:space="preserve">
           <source xml:lang="en">
            But who has any right to find fault with a man who chooses to
            enjoy a pleasure that has no annoying consequences, or one who
            avoids a pain that produces no resultant pleasure?
          </source>
          <target xml:lang="la">
            Duis aute irure dolor in reprehenderit in voluptate velit esse
            cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat
            cupidatat non proident, sunt in culpa qui officia deserunt mollit
            anim id est laborum.
          </target>
        </trans-unit>
        ... etc
      </body>
   </file>
...etc
Figure 2. Sample XLIFF File (with Latin Translation)
<topic id="cicero" xml:lang="la">
  <title>@@@90122@@@</title>
  <body>
    <p>
      @@@42094@@@
    </p>
    <p>
      @@@5532@@@
    </p>
    <p>
      @@@66134@@@
    </p>
  </body>
</topic>
Figure 3. DITA Skeleton File
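
The relationship between Figures 2 and 3 can be sketched in a few lines of Python. This is only an illustration of the idea, not the plug-in's own code: the function name fill_skeleton and the regular expression are assumptions based on the @@@id@@@ markers visible in the skeleton above, and inline markup and XML escaping are ignored.

```python
import re

# Placeholder pattern as it appears in the skeleton of Figure 3: @@@<id>@@@
PLACEHOLDER = re.compile(r"@@@(\d+)@@@")


def fill_skeleton(skeleton, translations):
    """Replace each @@@id@@@ marker with the translated text for that id.

    Markers without a matching translation are left untouched, so
    missing units stay visible in the result.
    """
    def substitute(match):
        unit_id = match.group(1)
        return translations.get(unit_id, match.group(0))

    return PLACEHOLDER.sub(substitute, skeleton)


# Hypothetical input: one unit translated, one still missing.
skeleton = "<p>@@@42094@@@</p><p>@@@5532@@@</p>"
translations = {"42094": "Lorem ipsum dolor sit amet."}
print(fill_skeleton(skeleton, translations))
```

Leaving unmatched markers in place is a design choice for this sketch: it makes an incomplete translation easy to spot in the output.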

Install

This is a standalone plug-in without dependencies which can be installed from the command line.

If the xliff-translate transform is to be used, the chosen translation service (usually an external cloud service) will need to be configured before use.

Run the plug-in installation command:

dita install https://github.com/jason-fox/fox.jason.translate.xliff/archive/master.zip

The dita command line tool requires no additional configuration.

Automatic Translation Services

Several automatic translation cloud services are publicly available. They typically offer a try-before-you-buy option with free sample access to the service. Upgrading to a paid version will be necessary when transforming larger documents.

IBM Cloud Services

The IBM Language Translator allows you to translate text programmatically from one language into another language.

Introduction: Getting Started

Create an instance of the service:

  1. Go to the Language Translator page in the IBM Cloud Catalog.
  2. Sign up for a free IBM Cloud account or log in.
  3. Click Create.

Copy the credentials to authenticate to your service instance:

  1. From the IBM Cloud dashboard, click on your Language Translator service instance to go to the Language Translator service dashboard page.
  2. On the Manage page, click Show to view your credentials.
  3. Copy the API Key and URL values.
  4. Within the plug-in, edit the file cfg/configuration.properties to hold your API Key and URL.

By default, the Frankfurt translation service URL is used: https://gateway-fra.watsonplatform.net/language-translator/api/v3/translate. Amend this when using a different regional instance.

Microsoft Azure

Microsoft Translator provides multi-language support for translation, transliteration, language detection, and dictionaries.

Introduction: Overview

Create an instance of the service:

  1. Go to Try Cognitive Services.
  2. Select the Translator Text APIs tab.
  3. Under Translator Text, select the Get API Key button.
  4. Agree to the terms and select your locale from the drop-down menu.
  5. Sign in by using your Microsoft, Facebook, LinkedIn, or GitHub account.

You can sign up for a free Microsoft account at the Microsoft account portal. To get started, click Sign in with Microsoft and then, when asked to sign in, click Create one. Follow the steps to create and verify your new Microsoft account.

After you sign in to Try Cognitive Services, your free trial begins. The displayed webpage lists all the Azure Cognitive Services for which you currently have trial subscriptions. Two subscription keys are listed beside Translator Text. You can use either key in your applications.

Copy the credentials to authenticate to your service instance:

  1. Copy the API Key and Endpoint values.
  2. Within the plug-in, edit the file cfg/configuration.properties to hold your API Key and URL.

By default, the global translation service URL is used: https://api.cognitive.microsofttranslator.com/translate. Amend this when using a regional instance.

Yandex Translate

The API provides access to the Yandex online machine translation service. It supports more than 90 languages and can translate separate words or complete texts.

Introduction: Overview

To sign up for the service:

  1. Review the user agreement and rules for formatting translation results.
  2. Get a free API key.
  3. Read the documentation, where you will find instructions on enabling the API and detailed descriptions of its features.

After you sign in to your account, select API Keys and create a new key as necessary. The latest endpoint can be found in the documentation:

https://translate.yandex.net/api/v1.5/tr/translate

Copy the credentials to authenticate to your service instance:

  1. Copy the API Key and Endpoint values.
  2. Within the plug-in, edit the file cfg/configuration.properties to hold your API Key and URL.

DeepL API

The DeepL API is accessible with a DeepL Pro subscription (DeepL API plan) only. The API is an interface that allows other computer programs to send texts to the DeepL servers and receive high-quality translations.

Introduction: Overview

To sign up for the service:

  1. Open a DeepL API developer account. Note that not all accounts offer access to the DeepL API. It is essential that the account type includes REST API access.
  2. Fill out the application details and add a credit card. No payments are required for the first 30 days. You can cancel the card and still maintain free access for the trial period.
  3. Read the documentation, where you will find instructions on enabling the API and detailed descriptions of its features.

After you sign in to your account, select API Keys and create a new key as necessary. The latest endpoint can be found in the documentation:

https://api.deepl.com/v2/translate

Copy the credentials to authenticate to your service instance:

  1. Copy the API Key and Endpoint values.
  2. Within the plug-in, edit the file cfg/configuration.properties to hold your API Key and URL.

Usage

XLIFF 1.2 Invocation from the command line

1. To create an XLIFF 1.2 file and associated skeletons, run:

dita --format xliff-create \
     --input document.ditamap \
     --output out \
     --xliff.version=1

A translate.xlf file will appear in the out directory along with a series of skeleton files.

<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <file datatype="xml" original="/document.ditamap" source-language="en" target-language="la">
    <header xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:dita="http://www.dita-ot.org">
      <skl>
        <external-file href="./skl/document.ditamap.skl" />
      </skl>
    </header>
    <body xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:dita="http://www.dita-ot.org">
        <trans-unit xmlns="" xmlns:dita="dita-ot.org" approved="no" id="42094" xml:space="preserve">
          <source xml:lang="en">
            Loves or pursues or desires to obtain pain of itself, because it
            is pain, but occasionally circumstances occur in which toil and
            pain can procure him some great pleasure. To take a trivial
            example,  <x ctype="x-dita-b" id="d3e14">which of us ever undertakes
            laborious physical exercise,</x> except to obtain some advantage from it?
            But who has any right to find fault with a man who chooses to enjoy a pleasure
            that has no annoying consequences, or one who avoids a pain that produces no
            resultant pleasure?
          </source>
          <target xml:lang="la"/>
        </trans-unit>
        ... etc
      </body>
   </file>
...etc
Figure 4. XLIFF 1.2 output
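
Before sending a file for translation it can be useful to see which units still have an empty target, as in Figure 4. The following is a minimal sketch using only the Python standard library. It is not part of the plug-in, the function names are hypothetical, and the sample is a trimmed-down stand-in for the real output. Elements are matched by local name because the trans-unit elements shown above carry a different namespace from their parent body element.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def untranslated_unit_ids(xliff_text):
    """Return the ids of trans-units whose <target> is missing or empty."""
    root = ET.fromstring(xliff_text)
    pending = []
    for element in root.iter():
        if local_name(element.tag) != "trans-unit":
            continue
        target = next(
            (child for child in element if local_name(child.tag) == "target"),
            None,
        )
        # itertext() also picks up text nested inside inline elements.
        text = "".join(target.itertext()).strip() if target is not None else ""
        if not text:
            pending.append(element.get("id"))
    return pending


# Trimmed-down sample, loosely modelled on Figure 4 (an assumption).
SAMPLE = """<xliff>
  <file original="/document.ditamap" source-language="en" target-language="la">
    <body>
      <trans-unit id="42094" approved="no">
        <source xml:lang="en">Loves or pursues pain.</source>
        <target xml:lang="la"/>
      </trans-unit>
      <trans-unit id="5532" approved="no">
        <source xml:lang="en">To take a trivial example.</source>
        <target xml:lang="la">Ut enim ad minim veniam.</target>
      </trans-unit>
    </body>
  </file>
</xliff>"""

print(untranslated_unit_ids(SAMPLE))
```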

2. To populate an existing XLIFF 1.2 file with auto-translated text, run:

dita --format xliff-translate \
     --input translate.xlf \
     --translate.service=[bing|deepl|watson|yandex] \
     --translate.apikey=<api-key> \
     --xliff.version=1

The XLIFF 1.2 File is auto-translated in place, with translated text as shown:

<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <file datatype="xml" original="/document.ditamap" source-language="en" target-language="la">
    <header xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:dita="http://www.dita-ot.org">
      <skl>
        <external-file href="./skl/document.ditamap.skl" />
      </skl>
    </header>
    <body xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:dita="http://www.dita-ot.org">
        <trans-unit xmlns="" xmlns:dita="dita-ot.org" approved="no" id="42094" xml:space="preserve">
          <source xml:lang="en">
            Loves or pursues or desires to obtain pain of itself, because it
            is pain, but occasionally circumstances occur in which toil and
            pain can procure him some great pleasure. To take a trivial
            example, <x ctype="x-dita-b" id="d3e14">which of us ever undertakes
            laborious physical exercise,</x> except to obtain some advantage from it?
            But who has any right to find fault with a man who chooses to enjoy a pleasure
            that has no annoying consequences, or one who avoids a pain that produces no
            resultant pleasure?
          </source>
          <target xml:lang="la">
            Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
            eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
            enim ad minim veniam, <x ctype="x-dita-b" id="d3e14">quis nostrud exercitation
            ullamco laboris,</x> nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor
            in reprehenderit in voluptate velit esse cillum dolore eu fugiat
            nulla pariatur. Excepteur sint occaecat cupidatat non proident,
            sunt in culpa qui officia deserunt mollit anim id est laborum.
          </target>
        </trans-unit>
        ...etc
      </body>
   </file>
...etc
Figure 5. XLIFF 1.2 output with translations

XLIFF 2.1 Invocation from the command line

3. To create an XLIFF 2.1 file and associated skeletons, run:

dita --format xliff-create \
     --input document.ditamap  \
     --output out \
     --xliff.version=2

A translate.xlf file will appear in the out directory along with a series of skeleton files.

<?xml version="1.0" encoding="UTF-8"?>
<xliff srcLang="en" trgLang="la" version="2.0" xmlns="urn:oasis:names:tc:xliff:document:2.0">
  <file id="2" original="/topic.dita">
    <skeleton href="./skl/topic.dita.skl"></skeleton>
    <unit fs:fs="p" id="9962" xmlns:fs="urn:oasis:names:tc:xliff:fs:2.0">
      <originalData>
        <data id="sd4e14">&amp;lt;b&amp;gt;</data>
        <data id="ed4e14">&amp;lt;/b&amp;gt;</data>
      </originalData>
      <segment state="initial">
        <source xml:lang="en" xml:space="preserve">Loves or pursues or desires to obtain pain of
            itself, because it is pain, but occasionally circumstances occur in which toil and pain
            can procure him some  great pleasure. To take a trivial example, <pc dataRefEnd="ed4e14"
            dataRefStart="sd4e14" fs:fs="b" id="d4e14">which of us ever undertakes laborious physical
            exercise,</pc>except to obtain some advantage from it? But who has any right to find fault
            with a man who chooses to enjoy a pleasure that has no annoying consequences, or one who avoids
            a pain that produces no resultant pleasure?
          </source>
          <target xml:lang="la"></target>
      </segment>
    </unit>
    ...etc
  </file>
  ...etc
Figure 6. XLIFF 2.1 output

4. To populate an existing XLIFF 2.1 file with auto-translated text, run:

dita --format xliff-translate \
     --input translate.xlf \
     --translate.service=[bing|deepl|watson|yandex] \
     --translate.apikey=<api-key> \
     --xliff.version=2

The XLIFF 2.1 File is auto-translated in place, with translated text as shown:

<?xml version="1.0" encoding="UTF-8"?>
<xliff srcLang="en" trgLang="la" version="2.0" xmlns="urn:oasis:names:tc:xliff:document:2.0">
  <file id="2" original="/topic.dita">
    <skeleton href="./skl/topic.dita.skl"></skeleton>
    <unit fs:fs="p" id="9962" xmlns:fs="urn:oasis:names:tc:xliff:fs:2.0">
      <originalData>
        <data id="sd4e14">&amp;lt;b&amp;gt;</data>
        <data id="ed4e14">&amp;lt;/b&amp;gt;</data>
      </originalData>
      <segment state="translated">
        <source xml:lang="en" xml:space="preserve">Loves or pursues or desires to obtain pain of
            itself, because it is pain, but occasionally circumstances occur in which toil and pain
            can procure him some  great pleasure. To take a trivial example, <pc dataRefEnd="ed4e14"
            dataRefStart="sd4e14" fs:fs="b" id="d4e14">which of us ever undertakes laborious physical
            exercise</pc>except to obtain some advantage from it? But who has any right to find fault with
            a man who chooses to enjoy a pleasure that has no annoying consequences, or one who avoids a pain
            that produces no resultant pleasure?
        </source>
        <target xml:lang="la">
            Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
            eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
            enim ad minim veniam, <pc dataRefEnd="ed4e14" dataRefStart="sd4e14" fs:fs="b" id="d4e14">
            quis nostrud exercitation ullamco laboris,</pc> nisi ut aliquip ex ea commodo consequat.
            Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat
            nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia
            deserunt mollit anim id est laborum.
        </target>
      </segment>
    </unit>
    ...etc
  </file>
  ...etc
Figure 7. XLIFF 2.1 output with translations
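
The change from state="initial" in Figure 6 to state="translated" in Figure 7 can be checked with a short script. This is a hedged sketch rather than anything shipped with the plug-in: the function name is hypothetical and the sample is a reduced stand-in for the real file. It relies on the namespace URI shown in the figures, and treats a segment without a state attribute as "initial", which is the default in the XLIFF 2.x specification.

```python
import xml.etree.ElementTree as ET

# Namespace URI as it appears on the <xliff> element in Figures 6 and 7.
XLIFF2 = "{urn:oasis:names:tc:xliff:document:2.0}"


def segment_states(xliff_text):
    """Map each unit id to the list of states of its segments."""
    root = ET.fromstring(xliff_text)
    states = {}
    for unit in root.iter(XLIFF2 + "unit"):
        states[unit.get("id")] = [
            segment.get("state", "initial")
            for segment in unit.iter(XLIFF2 + "segment")
        ]
    return states


# Reduced sample (an assumption): one translated unit, one untouched unit.
SAMPLE = """<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0"
       version="2.0" srcLang="en" trgLang="la">
  <file id="2" original="/topic.dita">
    <unit id="9962">
      <segment state="translated">
        <source>Loves or pursues pain.</source>
        <target>Lorem ipsum dolor sit amet.</target>
      </segment>
    </unit>
    <unit id="9963">
      <segment>
        <source>To take a trivial example.</source>
      </segment>
    </unit>
  </file>
</xliff>"""

for unit_id, states in segment_states(SAMPLE).items():
    print(unit_id, states)
```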

Populating Skeletons from the command line

5. To recreate the *.dita files from an XLIFF file and its associated skeletons, run:

dita --format xliff-dita \
     --input translate.xlf \
     --output out \
     --xliff.version=[1|2]

The translated *.dita files are generated into the out directory.
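
The three transforms can be chained from a script. The sketch below only builds and prints the command lines. The options --format, --input, --output and --xliff.version, as well as the dummy service, are taken from this document; the helper name dita_command and the path out/translate.xlf are assumptions, as is running xliff-translate without an API key when the dummy service is selected.

```python
import shlex


def dita_command(transtype, input_path, output_dir=None, params=None):
    """Build the argument list for one dita invocation.

    params maps parameter names such as 'xliff.version' to values and
    is rendered as --name=value, as in the commands shown above.
    """
    args = ["dita", "--format", transtype, "--input", input_path]
    if output_dir is not None:
        args += ["--output", output_dir]
    for name, value in (params or {}).items():
        args.append("--{}={}".format(name, value))
    return args


# Assumed layout: translate.xlf is written to the 'out' directory.
steps = [
    dita_command("xliff-create", "document.ditamap", "out",
                 {"xliff.version": 1}),
    dita_command("xliff-translate", "out/translate.xlf", None,
                 {"translate.service": "dummy", "xliff.version": 1}),
    dita_command("xliff-dita", "out/translate.xlf", "out",
                 {"xliff.version": 1}),
]

for step in steps:
    # Print a shell-safe rendering of each command instead of running it.
    print(shlex.join(step))
```

To execute the steps rather than print them, each list could be passed to subprocess.run(step, check=True) on a machine where the dita command is on the PATH.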

Parameter Reference

  • translate.from - Source language to use. Defaults to the value in configuration.properties.
  • translate.to - Target language. Defaults to the value in configuration.properties.
  • translate.cachefile - Specifies the location of a previously translated XLIFF file to be used as a cache. If the @id of a text snippet matches a previously translated snippet in the cache file, the cached text is copied over and the snippet is marked as approved.
  • translate.service - Decides which translation service to use:
    • bing - Connects to the Microsoft Azure Translation service
    • custom - Sends the translation request to an arbitrary URL using POST. Use this to connect to proxies for Google Cloud Translate.
    • deepl - Connects to the DeepL API Translation service
    • dummy - Avoids accessing a translation service and copies the source text to the target without amendment.
    • watson - Connects to the IBM Cloud Translation service
    • yandex - Connects to the Yandex Translation service
  • translate.authentication.url - URL for creating an OAuth token if needed for a service. Defaults to the value in configuration.properties.
  • translate.apikey - API Key for the Translation service. Defaults to the value in configuration.properties.
  • translate.url - URL for a Translation service. Defaults to the value in configuration.properties.
  • xliff.version - Decides which XLIFF format to use. Defaults to the value in configuration.properties:
    • 1 - XLIFF 1.2 format
    • 2 - XLIFF 2.1 format
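
The behaviour described for translate.cachefile can be pictured with the following sketch for XLIFF 1.2 files. It is an illustration under stated assumptions, not the plug-in's implementation: the function names are hypothetical, units are matched on their id attribute only, and inline markup inside targets is ignored.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def trans_units(root):
    return [el for el in root.iter() if local_name(el.tag) == "trans-unit"]


def child(element, name):
    return next((c for c in element if local_name(c.tag) == name), None)


def apply_cache(new_xliff, cached_xliff):
    """Copy cached targets into units with the same id and approve them."""
    cached = {}
    for unit in trans_units(ET.fromstring(cached_xliff)):
        target = child(unit, "target")
        text = (target.text or "").strip() if target is not None else ""
        if text:
            cached[unit.get("id")] = text

    new_root = ET.fromstring(new_xliff)
    reused = []
    for unit in trans_units(new_root):
        unit_id = unit.get("id")
        if unit_id in cached:
            target = child(unit, "target")
            if target is None:
                target = ET.SubElement(unit, "target")
            target.text = cached[unit_id]
            unit.set("approved", "yes")  # XLIFF 1.2 uses yes/no here
            reused.append(unit_id)
    return ET.tostring(new_root, encoding="unicode"), reused


# Hypothetical cache and freshly created file.
CACHE = """<xliff><file><body>
<trans-unit id="42094" approved="yes"><source>Pain.</source><target>Dolor.</target></trans-unit>
</body></file></xliff>"""
NEW = """<xliff><file><body>
<trans-unit id="42094" approved="no"><source>Pain.</source><target/></trans-unit>
<trans-unit id="5532" approved="no"><source>Toil.</source><target/></trans-unit>
</body></file></xliff>"""

merged, reused = apply_cache(NEW, CACHE)
print(reused)
```

Units that are not found in the cache keep approved="no" and an empty target, so they would still be sent to the translation service.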