This paper presents an experimental study comparing two methods for crowdsourcing speech transcription. The methods incorporate different quality control mechanisms (explicit versus implicit) and are based on different processes (parallel versus iterative). In the Gold Standard method, the same speech segment is transcribed in parallel by multiple contributors, whose reliability is checked against reference transcriptions provided by experts. In the Dual Pathway method, by contrast, two independent groups of contributors work on the same set of transcriptions, refining them iteratively until they converge, which eliminates the need for reference transcriptions and for a separate quality-checking phase. The two methods were tested on about half an hour of broadcast news speech in two European languages, German and Italian. Both methods obtained good results in terms of Word Error Rate (WER), and these results compare well with the word disagreement rate of experts on the same data.
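Since both methods are evaluated by Word Error Rate, a minimal sketch of the standard WER computation may help readers unfamiliar with the metric. It uses the usual definition, (substitutions + deletions + insertions) divided by the number of reference words, computed as a word-level edit distance. The function name `word_error_rate` is hypothetical, and the whitespace tokenization is an assumption: this record does not state how the authors normalized case or punctuation before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: tokens are obtained by whitespace splitting, with no
    case or punctuation normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr

    return prev[-1] / len(ref)
```

The same function can serve for the word disagreement rate between two expert transcriptions by passing one expert's text as the reference, although the result then depends on which transcription is treated as the reference.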

Sprugnoli, R., Moretti, G., Fuoli, M., Giuliani, D., Bentivogli, L., Pianta, E., Gretter, R., Brugnara, F., Comparing two methods for crowdsourcing speech transcription, Paper, in Proceedings of ICASSP 2013, (Vancouver, CA, 26-31 May 2013), The Institute of Electrical and Electronics Engineers, Incorporated, N/A 2013: 8116-8120. 10.1109/ICASSP.2013.6639246 [http://hdl.handle.net/10807/132990]

Comparing two methods for crowdsourcing speech transcription

Sprugnoli, Rachele (first author)
2013

2013
English
Proceedings of ICASSP 2013
38th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Vancouver, CA
Paper
26 May 2013
31 May 2013
978-1-4799-0356-6
The Institute of Electrical and Electronics Engineers, Incorporated

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/132990