Wednesday, October 20, 2010

Interview - Linguee: Online Translation Tool

With the slogan "the web as a dictionary", the website http://www.linguee.com/ offers millions of bilingual texts that can be searched for words or expressions, free of charge. Its co-founder, Gereon Frahling, explains how it works and how you can benefit from it, whether you are a translator, a teacher or a student.

Tell us about your background and experience.
Gereon - I studied Mathematics in Cologne, Germany, and received my Ph.D. in Computer Science in 2006 from the University of Paderborn, Germany. From 2006 to 2007 I worked for Google Research in New York. In my Ph.D. thesis and at Google I did research in the area of Information Retrieval. In 2008 I founded Linguee together with my co-founder Leonard Fink. I am currently CEO of Linguee, based in Cologne, Germany.

What is Linguee? What's the difference between Linguee and other free online dictionaries?
Gereon - Using Linguee you can search millions of texts translated by humans. Compared with traditional online dictionaries, Linguee contains about 1,000 times as many translated texts, most of them complete translated sentences. This enables you to search for words in their context, like " " and " ". The results show you the contexts in which certain translations are used and how to use the translated words in whole sentences.

How did you come up with the idea for Linguee?
Gereon - During the year in New York I had to write many texts in English. I was very disappointed by the existing online dictionaries. In most cases I was not able to find the right translation for words in the context I wanted to express.
On the other hand, I knew that there were millions of professionally translated texts freely available on the internet - I just could not find them efficiently.
The Linguee idea also fitted my research experience pretty well - you need deep knowledge of information retrieval and machine learning to develop a high-quality search engine.
These things convinced me to start a new company based on this idea.

How does it work? I have the impression it aims to be like a parallel corpus. Is that right?
Gereon - Yes, Linguee can be seen as a huge searchable parallel corpus. But the Linguee corpus is many times larger than any other parallel corpus you can buy. And it is freely accessible through our website.
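
To make the idea of a searchable parallel corpus concrete, here is a minimal sketch in Python. It is only an illustration and not Linguee's implementation: the `SentencePair` type, the `search` function and the three sample sentence pairs are all invented for this example, and a plain substring match stands in for a real search engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SentencePair:
    source: str  # sentence in the source language
    target: str  # its human translation


# Tiny made-up corpus (hypothetical data, for illustration only).
CORPUS = [
    SentencePair("The bank approved the loan.", "O banco aprovou o empréstimo."),
    SentencePair("We sat on the river bank.", "Sentamos à margem do rio."),
    SentencePair("She opened a bank account.", "Ela abriu uma conta bancária."),
]


def search(corpus, phrase):
    """Return every pair whose source sentence contains the phrase (case-insensitive)."""
    needle = phrase.casefold()
    return [pair for pair in corpus if needle in pair.source.casefold()]


if __name__ == "__main__":
    # Print each matching sentence next to its translation.
    for pair in search(CORPUS, "bank"):
        print(f"{pair.source}  ->  {pair.target}")
```

Searching for "bank" returns all three pairs, and reading the translations side by side shows how the same word is rendered differently depending on its context.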

Who checks/updates the dictionaries and how are the texts selected?
Gereon - Our crawler, a computer program, looks at a huge number of internet pages automatically. It extracts about 1 trillion sentence pairs. Unfortunately most of them do not meet our quality requirements - they are poorly translated or translated by computers.
We developed a machine learning algorithm to extract the 0.001% of sentence pairs that are high-quality human translations. This system uses user feedback to improve the automatic filtering - it learns autonomously.
Additionally we have a dictionary of about 500,000 word translations - selected and verified by our editorial board. You can recognize these verified entries by a small green check mark in front of the entry.
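
The figures above can be checked with a little arithmetic, and the filtering step can be sketched in a few lines of Python. This is a hedged illustration only: the quality scores, the threshold and the `keep_high_quality` function are assumptions made up for the example, and a fixed threshold stands in for the machine learning algorithm described in the interview.

```python
TOTAL_PAIRS = 1_000_000_000_000  # "about 1 trillion" crawled sentence pairs
KEPT_PER_100K = 1                # 0.001% is 1 pair in every 100,000

# Integer arithmetic avoids floating-point rounding.
kept_pairs = TOTAL_PAIRS * KEPT_PER_100K // 100_000  # 10,000,000 pairs


def keep_high_quality(scored_pairs, threshold):
    """Keep the pairs whose (hypothetical) quality score reaches the threshold."""
    return [pair for pair, score in scored_pairs if score >= threshold]


# Made-up scores; a real system would learn them from data and user feedback.
SCORED = [
    (("Good morning.", "Bom dia."), 0.97),
    (("Terms of use", "Termos de uso de"), 0.40),
    (("See you tomorrow.", "Até amanhã."), 0.93),
]

if __name__ == "__main__":
    print(f"{kept_pairs:,} sentence pairs survive the filter")
    print(keep_high_quality(SCORED, threshold=0.9))
```

So 0.001% of roughly 1 trillion pairs is about 10 million sentence pairs, which matches the "millions of texts" mentioned earlier in the interview.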

Does the company intend to keep the dictionary free forever? How is it profitable?
Gereon - The Linguee web service will be free forever. We earn money from advertisers.
In the future we will add additional services for our most active users, such as integration into your operating system, an ad-free version, and better integration into computer-aided translation (CAT) tools. These services will be available as a paid subscription. But the online service will remain free, as it is now.

So far, the dictionary works in English, Portuguese, Spanish, French and German. What are the plans for expansion? What other languages and services are you planning to offer?
Gereon - In the future we will definitely add new language pairs for languages like Japanese, Chinese, Arabic, Russian, and more.