IRIS UniCatt

Experimentation of an automatic resolution method for protein abbreviations in full-text papers

Atzeni, P.; Polticelli, F.; Toti, D.

2011

Abstract

We report and comment on the experimental results of the PRAISED system, which implements an automatic method for discovering and resolving a wide range of protein name abbreviations in the full-text versions of scientific articles. The system was recently proposed as part of a framework for creating and maintaining a publicly accessible abbreviation repository. Testing was carried out against the widely used Medstract Gold Standard Corpus and a relevant subset of real scientific papers extracted from the PubMed database. On the Medstract corpus, we obtained high results in terms of recall, precision and overall correctness. On the full-text papers, results inevitably varied, owing to the complex and often chaotic nature of the domain; even so, we observed encouraging levels of recall and extremely fast execution times. The major strength of the system lies in coping with the unstructured nature of scientific publications and in saving the time and effort needed to extract protein-related information, since extraction is automatic, while keeping computational overhead to a minimum thanks to its lightweight approach. Copyright © 2011 ACM.

Publication year: 2011
Content language: English
Title of the proceedings volume: 2011 ACM Conference on Bioinformatics, Computational Biology and Biomedicine, BCB 2011
Event name: 2011 ACM Conference on Bioinformatics, Computational Biology and Biomedicine, ACM-BCB 2011
Event location: Chicago, IL, USA
Contribution type: Paper
Event start date: 1 August 2011
Event end date: 3 August 2011
ISBN of the publication: 9781450307963
Publisher: ACM Press
DOI of the contribution: https://dx.doi.org/10.1145/2147805.2147871
Citation: Atzeni, P., Polticelli, F., Toti, D., Experimentation of an automatic resolution method for protein abbreviations in full-text papers, Paper, in 2011 ACM Conference on Bioinformatics, Computational Biology and Biomedicine, BCB 2011, (Chicago, IL, USA, 01-03 August 2011), ACM Press, N/A 2011: 465-467. 10.1145/2147805.2147871 [http://hdl.handle.net/10807/163937]
Appears in types: Paper, Selected paper, Contributed paper, Working paper, Poster, Poster paper, Communication, Report (in volume)

Files in this item:

There are no files associated with this item.

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/163937
