Search Engine Customization and Data Set Builder
Arias Moreno, Fco Javier
Moens, Marie-Francine
This work has two core objectives: building a data set and customizing a search engine. The first objective is to design and implement a data set builder, which consists of two components: a crawler that collects Web links, and a cleaner that extracts the main content from the crawled files and removes noise. The goal of this application is to crawl Web news sites in order to find the different sources of the news and retrieve the original articles. The second objective is to customize a search engine, which also takes two steps: enhancing the functionality of the search engine, and integrating the enhanced search engine with a different knowledge management platform. To enhance the search engine, meta-information is added to its index, and the retrieval process is modified so that the selection of documents takes this new meta-information into account. Integration with a different knowledge platform is a requirement of this project, so that pre-existing repositories of knowledge can interact with the search engine efficiently and effectively. This integration also includes the development of a front-end through which users can use the search engine. The goal of this application is to give users a convenient environment for retrieving information based on meta-information.
UPC subject areas::Computer science::Information systems
Information retrieval
Web search engines
Attribution-NonCommercial-NoDerivs 3.0 Spain
Katholieke Universiteit te Leuven (1970-)
Department of Computer Science

