This week we will carry out final integration and deployment tests on our distributed pipeline for large-scale, continuous translation scenarios that rely heavily on machine translation.
The platform features several configurable services that can be switched on as required. These include:
- automated source pre-editing prior to passing to a choice of custom machine translation engines;
- integrated pre-MT translation memory leverage;
- automated post-edit of raw machine translation prior to human post-edit;
- in-process, low-friction capture of actionable feedback on MT output from humans;
- automated post-processing of human post-edit;
- automated capture of edit distance data for BI and reporting.
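Edit distance between raw MT output and the human post-edited version is a common proxy for post-editing effort. The pipeline's actual metric isn't specified here, so as a minimal illustrative sketch, a word-level Levenshtein distance (names and normalisation choice are my assumptions, not the platform's implementation) might look like this:

```python
def edit_distance(mt: str, post_edit: str) -> int:
    """Word-level Levenshtein distance between raw MT and its post-edit.

    Illustrative only: the platform's real metric may differ (e.g. TER,
    character-level distance, or weighted operations).
    """
    s, t = mt.split(), post_edit.split()
    prev = list(range(len(t) + 1))  # distances from empty prefix of s
    for i, src_tok in enumerate(s, 1):
        curr = [i]
        for j, tgt_tok in enumerate(t, 1):
            cost = 0 if src_tok == tgt_tok else 1
            curr.append(min(prev[j] + 1,        # deletion
                            curr[j - 1] + 1,    # insertion
                            prev[j - 1] + cost))  # substitution/match
        prev = curr
    return prev[-1]

# Normalising by post-edit length gives a per-segment effort rate
# suitable for BI dashboards.
raw_mt = "the cat sat on mat"
post_edited = "the cat sat on the mat"
dist = edit_distance(raw_mt, post_edited)          # one insertion -> 1
rate = dist / max(len(post_edited.split()), 1)
```

Aggregating these rates per engine, language pair, or content type is what turns raw edit data into the BI and reporting signal the bullet above refers to.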
The only components still missing are the text analysis and text classification algorithms, which will be integrated during May and will give us the ability to run automated quality assurance on every single segment. Yes, every segment: no spot-checking or limited-scope audits.
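Segment-level QA of this kind typically layers learned classifiers over cheap deterministic checks that run on every segment. The post doesn't describe the actual checks, so here is a hedged sketch of the heuristic layer only; the function name, flags, and thresholds are all hypothetical:

```python
def qa_flags(source: str, target: str) -> list[str]:
    """Hypothetical per-segment QA heuristics.

    Returns a list of flag names; an empty list means the segment
    passed these basic checks. Real pipelines would add classifier
    scores, terminology checks, tag validation, etc.
    """
    flags = []
    if not target.strip():
        flags.append("empty_target")
    elif target.strip() == source.strip():
        # Target identical to source often means untranslated text.
        flags.append("untranslated")
    # Translations wildly longer/shorter than the source are suspect.
    ratio = len(target) / max(len(source), 1)
    if ratio < 0.5 or ratio > 2.0:
        flags.append("length_ratio")
    return flags
```

Because checks like these are O(segment length), they can feasibly run on 100% of throughput, which is what makes "no spot-checking" plausible at scale.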
The platform is distributed and uses industry-standard formats, including XLIFF and ITS, which makes it scalable and extensible. Notably, it delivers on all six of the trends recently publicised by Kantan. Thanks to Olga O’Laoghaire, who made significant contributions to the post-editing components, and to Ferenc Dobi, lead architect and developer.
I’m very excited to see this project come to fruition. It doesn’t just give us the ability to generate millions of words of translated content; it delivers a controlled environment in which we can apply state-of-the-art techniques that are highly optimised at every stage, measurable, and designed to target fully automated usable translation (FAUT).