Design and implementation of an Algorithm for an Author Disambiguation problem

To access the full text documents, please follow this link: http://hdl.handle.net/10459.1/60237

Title:	Design and implementation of an Algorithm for an Author Disambiguation problem
Author:	Echeverria Rovira, Lluís
Other authors:	García González, Roberto; Universitat de Lleida. Escola Politècnica Superior
Notes:	Person name disambiguation is essential for distinguishing people who share the same name when no unique identifiers are defined. The problem is common in many domains, including digital libraries and publication databases, where the same name can refer to several distinct authors. To attribute work correctly, such databases must be disambiguated. This project proposes a possible solution to the problem by designing and implementing an algorithm for author name disambiguation. Distributed computing techniques and tools, such as Spark and Hadoop, are used in the development to improve the efficiency of the process. The base data set for the disambiguation process consists of the more than 8 million publications in the AGRIS (International System for Agricultural Science and Technology) repository.
Subject(s):	Big data (Macrodades); Scholarly data; Spark; Data mining; Distributed computing (Computació distribuïda)
Rights:	cc-by-nc-nd http://creativecommons.org/licenses/by-nc-nd/4.0/
Document type:	masterThesis

Files	Size	Format
llecheverriar.pdf	2.852 MB	application/pdf