Publication date

2026-03-03T16:27:21Z

2020


Abstract

Paper presented at the AAAI Conference on Artificial Intelligence (AAAI-20), held in New York (USA), February 7-12, 2020.


In this study we aim to explore automatic methods that can detect online documents of low credibility, especially fake news, based on the style they are written in. We show that general-purpose text classifiers, despite seemingly good performance when evaluated simplistically, in fact overfit to sources of documents in training data. In order to achieve a truly style-based prediction, we gather a corpus of 103,219 documents from 223 online sources labelled by media experts, devise realistic evaluation scenarios and design two new classifiers: a neural network and a model based on stylometric features. The evaluation shows that the proposed classifiers maintain high accuracy in case of documents on previously unseen topics (e.g. new events) and from previously unseen sources (e.g. emerging news websites). An analysis of the stylometric model indicates it indeed focuses on sensational and affective vocabulary, known to be typical for fake news.


This work was supported by the Polish National Agency for Academic Exchange through a Polish Returns grant number PPN/PPO/2018/1/00006 and by the Google Cloud Platform through research credits.

Document Type

Chapter or part of a book


Published version

Language

English

Subjects and keywords

Fake news

Publisher

AAAI Press

Related items

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20); 2020 Feb 7-12; New York, USA. Palo Alto: AAAI Press; 2020. ISBN: 978-1-57735-835-0

Rights

Copyright © 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
