Have you ever wanted a ‘mega-TM’ or AutoSuggest dictionary filled with content that is available on the web? Are you fed up with the inferior quality of multilingual corpora such as the European Medicines Agency (EMEA) documents available as part of the OPUS project?
Join us on our journey to learn how we downloaded parts of the Internet (in particular, a large share of the available EMEA documents), how we batch-converted the PDF files into Word files, and how we cleaned up the Word files to improve the results of the alignment process. Learn about the tools we used for batch-aligning the files and what we did to clean up the aligned TMs.
We will also discuss how much the huge TM and AutoSuggest dictionary improved our productivity, and which problems still need to be solved to match the productivity claimed for MT and post-editing.
Date and time: May 28, 2013, 15:30 CET (Central European Time)
Duration: 120 minutes, including Q&A
Language: English
Tools that will be discussed include Xaldon WebSpider, Acrobat XI Pro, Lf_Aligner, Advance Word Find Replace, AlignFactory, Trados Studio 2011 and various others that allowed us to produce Mega-TMs (>4 million words) and big AutoSuggest databases (>30 MB) per language pair.