Adding a Custom Domain Dictionary to Watson Language Translation Service


Machine translation is now nearly ubiquitous in applications that help users understand content in other languages. Although it does a good job of conveying the gist of a text, specialized technical terms are sometimes translated inappropriately. To handle these cases, you can apply your own custom dictionary during machine translation. The Watson Language Translation service on Bluemix recently introduced domain customization, which lets you tune your machine-translated results.

In this post I explain how you can quickly create your own TMX (Translation Memory eXchange) custom dictionary. If you are unfamiliar with translation memories and the TMX standard, you can find more information here. Before we get started, though, there are a few key points to remember:

To upload your translation memory TMX file, take the following steps:

  1. Call the Watson Language Translation API to list all of the Watson Language Translation models and see which ones support customization. Use the Watson Language Translation credentials that you obtain after you bind the Language Translation service to your Bluemix application, and replace the username and password in the API call with your values.

    curl -u "username":"password" \
    "https://gateway.watsonplatform.net/language-translation/api/v2/models"

  2. Push your custom dictionary to the Watson Language Translation service. Here is a sample TMX file that you can use to help you get started building your own. In this file I am customizing the English to French domain model. The response to this call includes a model id. Be certain to write down this model id, because you will use it later when translating your content. If you don't pass the model id when calling the translation API, your custom dictionary will not be applied. After you upload the TMX file, it may take a few minutes before your customized model is ready to use.

    curl -u "username":"password" \
    -X POST \
    -F base_model_id=en-fr \
    -F name="custom_glossary" \
    -F forced_glossary=@glossary.tmx \
    "https://gateway.watsonplatform.net/language-translation/api/v2/models"

  3. To check whether your custom dictionary is ready for use, call the API and check the returned status. If the returned status is "available", you are ready to start using your custom dictionary. Be certain to replace model-id with the model id value that you obtained in the prior API call.

    curl -u "username":"password" \
    "https://gateway.watsonplatform.net/language-translation/api/v2/models/model-id"

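  4. To have the Watson Language Translation service use your custom dictionary, be sure to specify the model id during API calls. Replace "model id" in the request body with your own model id value.

    curl -u "username":"password" \
    -X POST \
    -H "Content-Type: application/json" \
    -d '{
    "model_id": "model id",
    "text": [
    "When you make a post about Cloud Computing be sure to use the correct hashtag"
    ]
    }' \
    "https://gateway.watsonplatform.net/language-translation/api/v2/translate"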
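
If you want a starting point for the glossary file itself, here is a rough sketch of what a small one might look like. This is not the sample file mentioned in step 2, and the two term pairs are placeholders I made up for illustration (they simply keep the English terms unchanged in French). The snippet writes a minimal TMX 1.4 file named glossary.tmx, the same file name used in the upload call, with English as the source language and French as the target.

```shell
#!/bin/sh
# Write a minimal TMX 1.4 glossary for English -> French.
# ASSUMPTION: the term pairs below are illustrative placeholders,
# not the contents of the sample file referenced in the post.
cat > glossary.tmx <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header creationtool="manual" creationtoolversion="1.0"
          segtype="sentence" o-tmf="none" adminlang="en"
          srclang="en" datatype="plaintext"/>
  <body>
    <!-- One <tu> (translation unit) per glossary entry -->
    <tu>
      <tuv xml:lang="en"><seg>Cloud Computing</seg></tuv>
      <tuv xml:lang="fr"><seg>Cloud Computing</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>hashtag</seg></tuv>
      <tuv xml:lang="fr"><seg>hashtag</seg></tuv>
    </tu>
  </body>
</tmx>
EOF
```

Each tu element holds one source term and its preferred translation, so adding an entry to your dictionary means adding another tu block with your own en and fr segments.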

I hope that this posting helps you get started with building your own custom dictionaries so that you can supplement the great results you already get from using Watson Language Translation on Bluemix.
