Findings of the LoResMT 2021 Shared Task on COVID and Sign Language for Low-Resource Languages

2021

Abstract

We present the findings of the LoResMT 2021 shared task, which focuses on machine translation (MT) of COVID-19 data for both low-resource spoken and sign languages. The task was organized as part of the fourth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT). Parallel corpora are presented and made publicly available for the following directions: English↔Irish, English↔Marathi, and Taiwanese Sign Language↔Traditional Chinese. The training data consist of 8112, 20933 and 128608 segments, respectively. Additional monolingual data sets for Marathi and English consist of 21901 segments. The results presented here are based on entries from a total of eight teams. Three teams submitted systems for English↔Irish, while five teams submitted systems for English↔Marathi. Unfortunately, there were no system submissions for the Taiwanese Sign Language↔Traditional Chinese task. The best system performance, measured with BLEU, was 36.0 for English–Irish, 34.6 for Irish–English, 24.2 for English–Marathi, and 31.3 for Marathi–English.
Year: 2021
Language: English
Book title: Proceedings of the 4th Workshop on Technologies for MT of Low Resource Languages
Conference: 4th Workshop on Technologies for MT of Low Resource Languages
Location: USA (virtual)
Start date: 16 August 2021
End date: 20 August 2021
Publisher: Association for Machine Translation in the Americas
Ojha, A. K., Liu, C., Kann, K., Ortega, J., Satam, S., Fransen, T., Findings of the LoResMT 2021 Shared Task on COVID and Sign Language for Low-Resource Languages, in Proceedings of the 4th Workshop on Technologies for MT of Low Resource Languages, (USA (virtual), 16-20 August 2021), Association for Machine Translation in the Americas, 2021: 114-123 [https://hdl.handle.net/10807/270176]

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/270176