Advancing fake news combating using machine learning: a hybrid model approach

Aslam, Zahid; Missen, Malik Muhammad Saad; Ghaffar, Arslan Abdul; Mehmood, Arif; Gracia Villar, Mónica (monica.gracia@uneatlantico.es); Silva Alvarado, Eduardo René (eduardo.silva@funiber.org); Ashraf, Imran (2025) Advancing fake news combating using machine learning: a hybrid model approach. Knowledge and Information Systems. ISSN 0219-1377

s10115-025-02588-y.pdf
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Abstract

The digital era, while offering unparalleled access to information, has also seen the rapid proliferation of fake news, a phenomenon with the potential to distort public perception and influence sociopolitical events. Identifying and mitigating the spread of such disinformation is crucial for maintaining the integrity of public discourse. This research introduces a multi-view learning framework that achieves high precision by systematically integrating diverse feature perspectives. Using a diverse dataset of news articles, the approach represents text in multiple ways by combining several feature extraction methods, including TF-IDF over individual words (unigrams) and word pairs (bigrams) and count vectorization. To capture additional linguistic and semantic information, advanced features, such as readability scores, sentiment scores, and topic distributions generated by latent Dirichlet allocation (LDA), are also extracted. The framework implements a multi-view learning strategy in which separate views focus on basic text, linguistic, and semantic features and feed into a final ensemble model. Models such as logistic regression, random forest, and LightGBM are employed to analyze each view, and a stacked ensemble integrates their outputs. Under rigorous tenfold cross-validation, the proposed multi-view ensemble achieves a state-of-the-art accuracy of 0.9994, outperforming strong baselines, including single-view models and a BERT-based classifier. Robustness testing confirms that the model maintains high accuracy even under data perturbations, establishing the value of structured feature separation and intelligent ensemble techniques.
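
As a rough illustration of the "basic text" view described in the abstract, the sketch below builds unigram TF-IDF, bigram TF-IDF, and count representations of the same documents using only the Python standard library. It is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the helper names (`ngrams`, `count_view`, `tfidf_view`), and the two sample sentences are all assumptions made for illustration, and the abstract does not specify the exact weighting the paper uses.

```python
import math
from collections import Counter


def ngrams(text, n):
    """Lowercase, split on whitespace, and return the list of word n-grams."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_view(docs, n):
    """Count-vectorization view: one Counter of n-gram counts per document."""
    return [Counter(ngrams(doc, n)) for doc in docs]


def tfidf_view(docs, n):
    """TF-IDF view: relative term frequency times a smoothed inverse
    document frequency (assumed formula, chosen for illustration)."""
    counts = count_view(docs, n)
    # Iterating a Counter yields each distinct term once per document,
    # so this tallies how many documents contain each term.
    doc_freq = Counter(term for c in counts for term in c)
    num_docs = len(docs)
    weighted = []
    for c in counts:
        total = sum(c.values()) or 1  # avoid division by zero on empty docs
        weighted.append({
            term: (freq / total)
            * (math.log((1 + num_docs) / (1 + doc_freq[term])) + 1)
            for term, freq in c.items()
        })
    return weighted


# Hypothetical sample documents, not taken from the paper's dataset.
docs = [
    "officials confirm the budget report",
    "shocking secret the officials hide",
]

# Each entry is a separate representation ("view") of the same documents.
views = {
    "unigram_tfidf": tfidf_view(docs, 1),
    "bigram_tfidf": tfidf_view(docs, 2),
    "unigram_counts": count_view(docs, 1),
}
```

In a multi-view setup of the kind the abstract describes, each such representation would be passed to its own classifier, and a stacked ensemble would then combine the per-view predictions; that downstream step is omitted here.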

Item Type: Article
Uncontrolled Keywords: Information processing; Fake news detection; Natural language processing; Machine learning; Ensemble model; Social media news
Subjects: Subjects > Engineering
Divisions: Europe University of Atlantic > Research > Scientific Production
Fundación Universitaria Internacional de Colombia > Research > Scientific Production
Ibero-american International University > Research > Scientific Production
Universidad Internacional do Cuanza > Research > Scientific Production
University of La Romana > Research > Scientific Production
Depositing User: Sr Bibliotecario
Date Deposited: 27 Oct 2025 09:57
Last Modified: 27 Oct 2025 09:57
URI: http://repositorio.funiber.org/id/eprint/17864