Luís Gomes' page

"Hello World!" myself

I'm a PhD student at NOVA LINCS research center. My advisor is Gabriel Pereira Lopes.

I'm interested in:

My email address is luismsgomes@gmail.com. You may also find me on Google Scholar.

Software

multiwords (version 3) extracts multiword units (MWUs) from raw text.

spsim is a measure of spelling similarity that helps identify possible cognates.

unl-aligner aligns similarly spelled words (proper nouns, numbers, punctuation, cognates, etc.) in parallel texts.

hindi_stemmer is a Python implementation of the stemming algorithm described in "A Lightweight Stemmer for Hindi" by Ananthakrishnan Ramanathan and Durgesh D Rao.

czech_stemmer is a stemmer for Czech that I ported to Python from the Java implementation by Ljiljana Dolamic, University of Neuchatel.

croatian_stemmer is a stemmer for Croatian developed by Nikola Ljubešić and Ivan Pandžić. I modified it slightly so that it can be used as a Python module. The original implementation is available here.

snowball_stemmer is a basic command-line program (reads stdin, writes stdout) based on libstemmer. It supports Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Portuguese, Romanian, Russian, Spanish, Swedish, and Turkish.

mosestokenizer is a Python package that provides wrappers for some pre-processing Perl scripts from the Moses toolkit (tokenizer, sentence splitter and punctuation normalizer).

stringology is a Python package that implements several classical string algorithms.

dicionário terminológico (DT) is a Portuguese linguistic terminology dictionary.

Publications

2016

Using Bilingual Segments in Generating Word-to-word Translations
Kavitha Mahesh, Luís Gomes and Gabriel Pereira Lopes
in Proceedings of the Sixth Workshop on Hybrid Approaches to Translation (HyTra-6), Osaka, Japan, December 2016
(pdf, bibtex)

First Steps Towards Coverage-based Document Alignment
Luís Gomes and Gabriel Pereira Lopes
in Proceedings of the First Conference on Machine Translation (WMT16), 11-12 August 2016, Berlin (Germany)
(pdf, bibtex)

English-Portuguese Biomedical Translation Task Using a Genuine Phrase-Based Statistical Machine Translation Approach
José Aires, Gabriel Pereira Lopes and Luís Gomes
in Proceedings of the First Conference on Machine Translation (WMT16), 11-12 August 2016, Berlin (Germany)
(pdf, bibtex)

SMT and Hybrid systems of the QTLeap project in the WMT16 IT-task
Rosa Gaudio, Gorka Labaka, Eneko Agirre, Petya Osenova, Kiril Simov, Martin Popel, Dieke Oele, Gertjan van Noord, Luís Gomes, João António Rodrigues, Steven Neale, João Silva, Andreia Querido, Nuno Rendeiro and António Branco
in Proceedings of the First Conference on Machine Translation (WMT16), 11-12 August 2016, Berlin (Germany)
(pdf, bibtex)

First Steps Towards Coverage-based Sentence Alignment
Luís Gomes and Gabriel Pereira Lopes
in Proceedings of the 10th edition of the Language Resources and Evaluation Conference (LREC 2016), 25-27 May 2016, Portorož (Slovenia)
(pdf, bibtex, code)

Word Sense-Aware Machine Translation: Including Senses as Contextual Features for Improved Translation Models
Steven Neale, Luís Gomes, Eneko Agirre, Oier Lopez de Lacalle and António Branco
in Proceedings of the 10th edition of the Language Resources and Evaluation Conference (LREC 2016), 25-27 May 2016, Portorož (Slovenia)
(pdf, bibtex)

Seeking to Reproduce “Easy Domain Adaptation”
Luís Gomes, Gertjan van Noord, António Branco and Steven Neale
in 4REAL – Workshop on Research Results Reproducibility and Resources Citation in Science and Technology of Language, collocated with LREC 2016, 28 May 2016, Portorož (Slovenia)
(pdf, bibtex)

Domain-Specific Hybrid Machine Translation from English to Portuguese
João Rodrigues, Luís Gomes, Steven Neale, Andreia Querido, Nuno Rendeiro, Sanja Štajner, João Silva and António Branco
in PROPOR 2016 – International Conference on the Computational Processing of the Portuguese Language, 13-15 July 2016, Tomar (Portugal), Springer
(pdf, bibtex)


2015

Learning Clusters of Bilingual Suffixes using Bilingual Translation Lexicon
Kavitha Mahesh, Luís Gomes and Gabriel Pereira Lopes
in Mining Intelligence and Knowledge Exploration (MIKE 2015), Hyderabad, India, December 2015, Springer
(pdf, bibtex)

Bilingually motivated segmentation and generation of word translations using relatively small translation data sets
Kavitha Mahesh, Luís Gomes and Gabriel Pereira Lopes
in Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation (PACLIC 29), Shanghai, China, October 2015
(pdf, bibtex)

New Language Pairs in TectoMT
Ondřej Dušek, Luís Gomes, Michal Novák, Martin Popel and Rudolf Rosa
in Proceedings of the Tenth Workshop on Statistical Machine Translation (WMT15), Lisboa, Portugal, September 2015, ACL
(pdf, bibtex, poster)

Bootstrapping a Hybrid Deep MT System
João Silva, João Rodrigues, Luís Gomes and António Branco
in Proceedings of the Fourth Workshop on Hybrid Approaches to Translation (HyTra), Beijing, China, July 2015, ACL
(pdf, bibtex)

Machine Translation for Multilingual Troubleshooting in the IT Domain: A Comparison of Different Strategies
Sanja Štajner, João Rodrigues, Luís Gomes and António Branco
in 1st Deep MT Workshop (DMTW), Prague, Czech Republic, September 2015
(pdf, bibtex)

First Steps in Using Word Senses as Contextual Features in Maxent Models for Machine Translation
Steven Neale, Luís Gomes and António Branco
in 1st Deep MT Workshop (DMTW), Prague, Czech Republic, September 2015
(pdf, bibtex)

Improving bilingual search performance using compact full-text indices
Jorge Costa, Luís Gomes, Gabriel Pereira Lopes and Luís Russo
in Computational Linguistics and Intelligent Text Processing, 16th International Conference, CICLing 2015, Cairo, Egypt, April 2015, Springer
(pdf, bibtex)

Selecting Translation Candidates for Parallel Corpora Alignment
Kavitha Mahesh, Luís Gomes, José Aires and Gabriel Pereira Lopes
in Progress in Artificial Intelligence, 17th Portuguese Conference on Artificial Intelligence, EPIA 2015, Coimbra, Portugal, September 2015, Springer
(pdf, bibtex)


2014

Identification of Bilingual Segments for Translation Generation
Kavitha Mahesh, Luís Gomes and Gabriel Pereira Lopes
in Advances in Intelligent Data Analysis XIII, 13th International Symposium, IDA 2014, Leuven, Belgium, October 2014, Springer
(pdf, bibtex)

Identification of Bilingual Suffix Classes for Classification and Translation Generation
Kavitha Mahesh, Luís Gomes and Gabriel Pereira Lopes
in Advances in Artificial Intelligence, 14th Ibero-American Conference on AI, IBERAMIA 2014, Santiago de Chile, Chile, November 2014, Springer
(pdf, bibtex)


2013

Compact and Fast Indexes for Translation Related Tasks
Jorge Costa, Luís Gomes, Gabriel Pereira Lopes, Luís Russo and Nieves Brisaboa
in Progress in Artificial Intelligence, 16th Portuguese Conference in Artificial Intelligence, EPIA 2013, Açores, Portugal, September 2013, Springer
(pdf, bibtex)


2011

Measuring Spelling Similarity for Cognate Identification
Luís Gomes and Gabriel Pereira Lopes
in Progress in Artificial Intelligence, 15th Portuguese Conference in Artificial Intelligence, EPIA 2011, Lisboa, Portugal, October 2011, Springer
(pdf, bibtex, code)

Using SVMs for Filtering Translation Tables for Parallel Texts Alignment
Kavitha Mahesh, Luís Gomes and Gabriel Pereira Lopes
in Proceedings of 15th Portuguese Conference in Artificial Intelligence, EPIA 2011, Lisboa, Portugal, October 2011
(pdf, bibtex)

Managing and Querying a Bilingual Lexicon with Suffix Trees
Jorge Costa, Luís Gomes, Gabriel Pereira Lopes and Luís Russo
in Proceedings of 15th Portuguese Conference in Artificial Intelligence, EPIA 2011, Lisboa, Portugal, October 2011
(pdf, bibtex)

Representing a Bilingual Lexicon with Suffix Trees
Jorge Costa, Luís Gomes, Gabriel Pereira Lopes and Luís Russo
in Proceedings of 26th Symposium On Applied Computing (SAC 2011), Taichung, Taiwan, March 2011, ACM
(pdf, bibtex)


2009

Parallel Texts Alignment
Luís Gomes, José Aires, and Gabriel Pereira Lopes
in New Trends in Artificial Intelligence, 14th Portuguese Conference in Artificial Intelligence, EPIA 2009, Aveiro, Portugal, October 2009
(pdf, bibtex)

Phrase Translation Extraction from Aligned Parallel Corpora Using Suffix Arrays and Related Structures
José Aires, Gabriel Pereira Lopes, and Luís Gomes
in Progress in Artificial Intelligence, 14th Portuguese Conference in Artificial Intelligence, EPIA 2009, Aveiro, Portugal, October 2009, Springer
(pdf, bibtex)

Parallel Texts Alignment
Luís Gomes
Master's Thesis, February 2009, Universidade Nova de Lisboa
(pdf, bibtex)

Invited Talks

2016

Extracção supervisionada de léxicos bilingues a partir de corpora paralelos
at Faculdade de Ciências Sociais e Humanas, Universidade Nova de Lisboa: Jornadas dos Dicionários: Lexicografia e Dicionarística Portuguesas, Lisboa, Portugal, 11 and 12 July 2016.

2014

Translation Services: Perspectives of a Portuguese Technological Start-Up
at Universidade da Beira Interior: XXV Jornadas de informática do Núcleo de Informática da UBI, Covilhã, Portugal, 14 May 2014.

2012

Translation Alignment and Translation Extraction
at Université de Caen Basse Normandie, France, 13 September 2012.

Posters

2014

Identification of Bilingual Segments for Translation Generation
at the Thirteenth International Symposium on Intelligent Data Analysis (IDA 2014), Leuven, from 30 October to 1 November 2014.
(pdf)