Universitat Politècnica de Catalunya. Departament de Llenguatges i Sistemes Informàtics
Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural
2009
The goal of Information Extraction is to automatically generate structured information from the relevant content of text documents. Machine Learning techniques have been applied to reduce the cost of adapting Information Extraction systems. However, elements of human supervision strongly bias the learning process. Unsupervised learning approaches can avoid these biases. In this paper, we propose an unsupervised learning approach to Relation Detection based on massive clustering ensembles. On the ACE Relation Mention Detection task, the approach outperforms the state of the art among unsupervised techniques for this evaluation framework by 5 points of F1 score, while also being simpler and more flexible.
Peer Reviewed
Postprint (published version)
Conference report
English
UPC subject areas::Telecommunications engineering::Signal processing::Speech and acoustic signal processing; Data mining -- Data processing; Information retrieval; Text analysis; Pattern clustering; Data mining
http://ieeexplore.ieee.org/search/srchabstract.jsp?tp=&arnumber=5360311&queryText%3Dgonz%C3%A0lez+icdm+2009%26openedRefinements%3D*%26searchField%3DSearch+All
Open Access