Are you excited about the vast new possibilities in applying AI to natural language processing?
Would you like to work for the European Union and help us tear down language barriers for millions of people?
Do you already have relevant expertise in AI / NLP / ML?
The European Commission's eTranslation NLP project (https://language-tools.ec.europa.eu/) is currently offering you an exciting opportunity with impact and purpose. More details can be found in the job description.
Please note: The position is open to candidates of any nationality and at any level of experience (with a minimum of five years of relevant university studies). Pay will be commensurate with experience. Employment will be through an IT company rather than directly with the European Commission. Remote work will be possible within the legal framework (details are to be agreed upon with the employer). The position will remain open until filled.
We are the sector working on machine translation and other forms of natural language processing (NLP), within the information technology unit at the European Commission’s Directorate-General for Translation (DGT), the world’s largest translation service. We build and run the eTranslation machine translation service, a flagship artificial intelligence project for the European Institutions that is also available to a broad range of external users everywhere in Europe. eTranslation plays a key role as an enabler of multilingualism in Europe. It is a public service that facilitates multilingual communication in many different contexts, including on online platforms such as the Conference on the Future of Europe website and in difficult situations such as the influx of refugees in the context of the Ukrainian crisis.
In addition to machine translation, we also provide services for other forms of natural language processing, including speech transcription, document classification, named entity recognition, and anonymization, and we continue to add new NLP services. We use AI techniques involving deep learning approaches and tools to train our own models, with large volumes of both internal and external data, or to deploy open-source pre-trained models. We work in a cloud environment, in Azure, and use infrastructure-as-a-service (IaaS) and platform-as-a-service (PaaS) cloud services to develop and deliver our services to our internal and external users. We also carry out special projects involving the use of supercomputing/HPC resources for research and development relating to our services.
- Exploration of ways to use large language models (LLMs) and other types of artificial intelligence (AI) technology to build natural language processing (NLP) applications;
- Design, implementation, and evaluation of AI-based NLP applications, including but not limited to machine translation (MT) engines;
- Definition of quality evaluation criteria for the datasets used to build NLP applications, covering both training and test datasets;
- Application of advanced data analysis and machine learning techniques, including "deep learning" based on neural models, to MT-related tasks, including, but not limited to, domain adaptation;
- Development and maintenance of methods and software to identify useful subsets of existing corpora and filter out the unwanted parts, combining machine learning approaches with explicit (symbolic) rules;
- Acquisition and management of data sources that help to improve MT performance and quality, such as parallel, comparable, and monolingual corpora (including data crawled from the Web or artificial parallel corpora created via back-translation);
- Acquisition and management of data sources for building improved pre- and post-processing tools (e.g. morphological and syntactic analysis, re-ordering, re-scoring, quality estimation);
- Acquisition and management of data sources for building or improving AI-based NLP applications other than MT engines;
- Advising the development team on integrating these additional data sources into a working solution at the software level;
- Advising on quality improvements; assessing the impact of changes to the NLP applications on output quality and other performance criteria;
- Participation in functional working groups and progress meetings;
- Participation in scientific conferences and workshops related to Artificial Intelligence, Natural Language Processing, Machine Translation and underlying technologies;
- Contribution to and analysis of implementations made to cover specific needs of customers, for example through the creation of domain-specific MT engines or specific algorithms;
- Analysis of the benefits and risks of such changes for the overall quality of the eTranslation service;
- Interaction with business analysts, customers, users, project leaders, and developers.
- An advanced university degree in data-driven computational linguistics, machine learning, artificial intelligence, data mining, or statistical data modelling, including familiarity with data-driven techniques for natural language processing, such as statistical / neural MT, or equivalent experience
- Very good knowledge and professional experience in the area of artificial intelligence or natural language processing
- In-depth knowledge of setting up and evaluating NLP software, including testing methodologies and tools, such as automatic quality metrics (e.g., BLEU scores and similar for MT) and human evaluation of output quality
- In-depth knowledge of and experience with programming languages used for text processing (e.g. Python)
- Ability to implement prototype solutions quickly and efficiently and to evaluate them on very large amounts of textual data
- Ability to give business and technical presentations
- Ability to apply high quality standards
- Ability to keep pace with the fast-changing technologies used in NLP, MT, and machine learning
- Very good communication skills with technical and non-technical audiences
- Analysis and problem-solving skills
- Ability to write clear and structured technical documents
- Ability to participate actively in technical meetings
Due to the particular nature of a large international organisation such as the European Commission, candidates should also have the following non-technical skills:
- Ability to integrate into an international/multicultural environment, to become productive quickly, and experience of working in a team;
- Ability to participate in multilingual meetings;
- Ability to work in a multicultural environment on multiple large projects;
- Excellent team player;
- Ability to understand, speak and write EU languages beyond English will be an advantage;
- A high degree of discretion and integrity is required, as the applications managed and maintained in DGT R.3 contain personal and confidential data.