Neural machine translation for patent documents

23/10/2017

There are many platforms that you can use to obtain a machine translation of all or part of a patent document. Several of them are free and integrated into search platforms. For example, Patent Translate in the EPO’s Espacenet platform is one I use frequently.

But what is the best machine translation tool out there for patents, and have these tools got any better over the last few years? If you put the same text into different systems you get different results, but I haven’t yet found a system that gives consistently good results, particularly for translating Chinese and Korean documents into English. At the moment, if I want a basic understanding of a document, I tend to use the machine translation tool within whatever search platform I am using at the time, rather than pasting text from one platform into another in the hope of getting a better quality of translation.

Machine translation technology is developing, however. Until recently, most machine translation tools used statistical machine translation (SMT), but the latest technology is neural machine translation (NMT). NMT is now used by Google and Microsoft, among others, and has been heralded as a step forward for machine translation.

Neural machine translation is being used in the patent world too. The WIPO Translate tool uses neural machine translation. WIPO Translate was launched in a beta version last year for translation from English to Chinese and from Chinese to English, and has recently been expanded to cover all PCT languages. Previously, WIPO had an SMT-based tool, but they clearly feel that the neural machine translation system is better for all language pairs.

WIPO Translate is trained exclusively on patent documents, which WIPO claims leads to better results for patent documents. WIPO Translate also allows you to choose a particular technical domain, based on patent classification codes, for the text you are translating. This is intended to remove the ambiguity that arises from words such as “beam” or “cell”, which have different meanings in different technical fields.

One significant limitation at present is that no more than 2000 characters can be translated at once.
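If you want to translate a longer passage, such as a full description, one workaround is to break the text into pieces that each fit within the limit and paste them in one at a time. The following is only a minimal sketch of how that could be done in Python. The function name `split_for_translation` and the choice to break on sentence-ending punctuation followed by a space are my own assumptions for illustration, not anything provided by WIPO, and the sketch does not call any translation service itself.

```python
import re

# Limit mentioned above; treat this as an assumption that may change over time.
MAX_CHARS = 2000


def split_for_translation(text, max_chars=MAX_CHARS):
    """Split text into chunks of at most max_chars, preferring sentence ends.

    Hypothetical helper for illustration only.
    """
    # Break after sentence-ending punctuation (or a semicolon, which is common
    # in claims) when it is followed by whitespace.
    sentences = re.split(r"(?<=[.!?;])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is cut at the limit,
        # even though that may fall in the middle of a word.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this sentence would exceed the limit, so start a new chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Splitting at sentence ends matters because a translation system only sees one piece at a time, so a sentence cut in half is likely to be translated badly. Note that this sketch relies on spaces after punctuation, so text in languages such as Chinese, which is usually written without spaces, would fall back to the cruder cut at the character limit.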

Of course, for patents, the precise meaning of the words is critical. So for machine translation to be useful for more than gaining a crude understanding of what a patent document discloses, it has to be very reliable. Based on my experience so far, WIPO Translate is certainly not at that level yet. But it is encouraging to see development, and it may well prove to be better than existing SMT-based tools.

There are, of course, others working on machine translation in the patent space, and they may take up NMT too if it proves to be a significant improvement. The EPO works in partnership with Google and so will presumably be introducing Google’s NMT technology into the Espacenet platform at some stage. Translation tools are available within other patent office databases too, such as on the JPO and KIPO websites, and they may well use NMT in the future. This is certainly an area of development that we will be closely monitoring.

For now, I will continue to use various platforms until a clear front runner emerges. And, of course, I will continue to use human translators when accuracy really matters.

This article is for general information only. Its content is not a statement of the law on any subject and does not constitute advice. Please contact Reddie & Grose LLP for advice before taking any action in reliance on it.