IRIS UniCatt

A framework for semi-automatic identification, disambiguation and storage of protein-related abbreviations in scientific literature

Atzeni, P.; Polticelli, F.; Toti, Daniele (first author)

2011

Abstract

We propose a framework for identifying, disambiguating and storing protein-related abbreviations as found in the full texts of scientific papers, in order to build and maintain a publicly available abbreviation repository via a semi-automatic process. This process involves information extraction methods and techniques for acronym identification and resolution, based on lexical clues and syntactical, largely domain-independent criteria. A dictionary and an ontology for proteins provide the means for matching and disambiguating the biological entities. User feedback is gathered at the end of the process and the confirmed entries are then stored and made available to the scientific community for further reviewing. © 2011 IEEE.
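The identification and matching steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the simplest possible lexical clue (a short form in parentheses directly after its long form), a naive initial-letter alignment, and a tiny hypothetical `PROTEIN_DICTIONARY` standing in for a real protein dictionary. The function names `find_candidates` and `resolve` are likewise illustrative.

```python
import re

# Hypothetical stand-in for a protein dictionary; entries are illustrative only.
PROTEIN_DICTIONARY = {"green fluorescent protein", "heat shock protein"}

# Lexical clue (assumed): a short alphanumeric token enclosed in parentheses.
SHORT_FORM = re.compile(r"\(([A-Za-z][A-Za-z0-9]{1,9})\)")


def find_candidates(text):
    """Return (short_form, long_form) pairs whose initials line up."""
    pairs = []
    for match in SHORT_FORM.finditer(text):
        short = match.group(1)
        # Words that precede the opening parenthesis.
        words = re.findall(r"[A-Za-z0-9]+", text[:match.start()])
        # Naive alignment: one preceding word per character of the short form.
        candidate = words[-len(short):]
        if len(candidate) < len(short):
            continue
        initials = "".join(word[0] for word in candidate)
        if initials.lower() == short.lower():
            pairs.append((short, " ".join(candidate)))
    return pairs


def resolve(text, dictionary=PROTEIN_DICTIONARY):
    """Split candidates into dictionary-matched and unmatched entries."""
    matched, unmatched = [], []
    for short, long_form in find_candidates(text):
        target = matched if long_form.lower() in dictionary else unmatched
        target.append((short, long_form))
    return matched, unmatched


if __name__ == "__main__":
    sample = ("We tagged the green fluorescent protein (GFP) and used "
              "polymerase chain reaction (PCR).")
    matched, unmatched = resolve(sample)
    print("protein-related:", matched)
    print("left for review:", unmatched)
```

In a semi-automatic workflow of the kind described, the unmatched pairs would be the natural candidates to hand to a user for confirmation before anything is stored.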

Publication year: 2011
Content language: English
Proceedings volume title: Proceedings - International Conference on Data Engineering
Event name: 2011 IEEE 27th International Conference on Data Engineering Workshops, ICDE 2011
Event location: Hannover, Germany
Contribution type: Paper
Event start date: 11 April 2011
Event end date: 16 April 2011
ISBN: 978-1-4244-9195-7
Publisher: N/A
DOI: https://dx.doi.org/10.1109/ICDEW.2011.5767646
Citation: Atzeni, P., Polticelli, F., Toti, D., A framework for semi-automatic identification, disambiguation and storage of protein-related abbreviations in scientific literature, Paper, in Proceedings - International Conference on Data Engineering, (Hannover, Germany, 11-16 April 2011), N/A, Hanover 2011: 59-61. 10.1109/ICDEW.2011.5767646 [http://hdl.handle.net/10807/163320]
Appears in types: Paper, Selected paper, Contributed paper, Working paper, Poster, Poster paper, Communication, Report (in volume)

Files in this record:

There are no files associated with this record.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/163320
