To access the full text documents, please follow this link: http://hdl.handle.net/10459.1/60237

Design and implementation of an Algorithm for an Author Disambiguation problem
Echeverria Rovira, Lluís
García González, Roberto; Universitat de Lleida. Escola Politècnica Superior
Person name disambiguation is essential for distinguishing between people who share the same name when no unique identifiers are defined. The problem is common in many domains, including digital libraries and publication databases, where the same name can refer to several distinct authors. To attribute work correctly, these databases must be disambiguated. This project proposes a possible solution to the problem by designing and implementing an algorithm for name disambiguation. Distributed computing techniques and tools, such as Spark and Hadoop, are used in the development to improve the efficiency of the process. The base data set for the disambiguation process is the more than 8 million publications in the AGRIS (International System for Agricultural Science and Technology) repository.
-Big data
-Scholarly data
-Spark
-Data mining
-Distributed computing
cc-by-nc-nd
http://creativecommons.org/licenses/by-nc-nd/4.0/
masterThesis
         

Full text files in this document

File Size Format
llecheverriar.pdf 2.852 MB application/pdf
