
Tacchini, E., Schultz, A., Bizer, C., Experiments with Wikipedia cross-language data fusion, Paper, in Proceedings of the 5th International Workshop on Scripting and Development for the Semantic Web (SFSW 2009), (Heraklion, Greece, 31-May – 04-June 2009), CEUR-WS.org, Hannover 2009: 28-39 [https://hdl.handle.net/10807/316656]

Experiments with Wikipedia cross-language data fusion

Tacchini, Eugenio (first author)
2009

Abstract

There are currently Wikipedia editions in 264 different languages. Each of these editions contains infoboxes that provide structured data about the topic of the article in which an infobox is contained. The content of infoboxes about the same topic in different Wikipedia editions varies in completeness, coverage and quality. This paper examines the hypothesis that by extracting infobox data from multiple Wikipedia editions and by fusing the extracted data among editions it should be possible to complement data from one edition with previously missing values from other editions and to increase the overall quality of the extracted dataset by choosing property values that are most likely correct in case of inconsistencies among editions. We will present a software framework for fusing RDF datasets based on different conflict resolution strategies. We will apply the framework to fuse infobox data that has been extracted from the English, German, Italian and French editions of Wikipedia and will discuss the accuracy of the conflict resolution strategies that were used in this experiment.
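The fusion approach described in the abstract can be illustrated with a small sketch. The function below is a hypothetical, simplified illustration (not the authors' actual framework): it fuses the candidate values that different Wikipedia editions assert for one infobox property, either by majority vote among editions or by a fixed edition-preference order. All names and the strategy labels are assumptions made for this example.

```python
from collections import Counter

def fuse_property(values_by_edition, strategy="vote"):
    """Fuse candidate values for one property across language editions.

    values_by_edition: dict mapping a language code (e.g. "en", "de",
    "it", "fr") to that edition's value, or None if the edition lacks it.
    """
    # Editions with no value for this property contribute nothing.
    candidates = [v for v in values_by_edition.values() if v is not None]
    if not candidates:
        return None
    if strategy == "vote":
        # Resolve conflicts by picking the value asserted by the
        # largest number of editions.
        return Counter(candidates).most_common(1)[0][0]
    if strategy == "prefer":
        # Trust editions in a fixed order (here: an assumed ordering),
        # falling back to the next edition when a value is missing.
        for lang in ("en", "de", "fr", "it"):
            value = values_by_edition.get(lang)
            if value is not None:
                return value
        return None
    raise ValueError(f"unknown strategy: {strategy}")
```

For example, `fuse_property({"en": "Berlin", "de": "Berlin", "fr": "Paris", "it": None})` resolves the conflict in favour of `"Berlin"`, while the `"prefer"` strategy fills a gap in one edition with the value from the highest-ranked edition that has one.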
2009
English
Proceedings of the 5th International Workshop on Scripting and Development for the Semantic Web (SFSW 2009)
International Workshop on Scripting and Development for the Semantic Web (SFSW 2009)
Heraklion, Greece
Paper
31-May-2009
4-June-2009
ISSN 1613-0073
CEUR-WS.org
Files for this product:
There are no files associated with this product.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/316656