Title:
|
Efficient document filtering using vector space topic expansion and pattern-mining
|
Author(s):
|
Proskurnia, Julia; Mavlyutov, Ruslan; Castillo, Carlos; Aberer, Karl; Cudré-Mauroux, Philippe
|
Abstract:
|
Paper presented at: CIKM '17, the 2017 ACM on Conference on Information and Knowledge Management, held November 6 to 10, 2017, in Singapore. |
Abstract:
|
Automatically extracting information from social media is challenging
given that social content is often noisy, ambiguous, and
inconsistent. However, as many stories break on social channels
first before being picked up by mainstream media, developing methods
to better handle social content is of utmost importance. In this
paper, we propose a robust and effective approach to automatically
identify microposts related to a specific topic defined by a small
sample of reference documents. Our framework extracts clusters
of semantically similar microposts that overlap with the reference
documents, by extracting combinations of key features that define
those clusters through frequent pattern mining. This allows us to
construct compact and interpretable representations of the topic,
dramatically decreasing the computational burden compared to
classical clustering and k-NN-based machine learning techniques
and producing highly-competitive results even with small training
sets (fewer than 1,000 training objects). Our method is efficient and
scales gracefully with large sets of incoming microposts. We experimentally
validate our approach on a large corpus of over 60M
microposts, showing that it significantly outperforms state-of-the-art
techniques. |
Funding:
|
J.P. was supported by a Sinergia Grant by the Swiss National Science Foundation (SNF 147609). C.C. worked on this paper while at Eurecat, supported by the Catalonia Trade and Investment Agency (ACCIÓ). P.C.M. was supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement 683253/GraphInt). |
Subject(s):
|
-Event detection -Frequent patterns mining -Semantic attributes -Microposts |
Rights:
|
© ACM, 2017. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in CIKM '17 Proceedings of the 2017 ACM on Conference on Information and Knowledge Management http://doi.acm.org/10.1145/3132847.3133016 |
Document type:
|
Conference object Article - Accepted version |
Publisher:
|
ACM (Association for Computing Machinery)
|