A Comparative Machine Learning and Deep Learning Algorithms for Fake News Detection

Abstract

This study investigates the extent to which gradient-boosted decision trees can rival deep learning architectures in text-based fake news detection. Using a unified experimental pipeline that evaluates eleven models on a dataset of approximately 45,000 news articles with simple n-gram representations, we find that XGBoost achieves the highest classification accuracy (99.85%), marginally outperforming a multi-layer perceptron baseline (99.64%). XGBoost is also more computationally efficient: it trains faster and is easier to deploy and interpret. To avoid overstating these results, we complement the headline performance metrics with statistical significance testing, efficiency benchmarking, and feature-attribution analyses. A dedicated bias analysis reveals substantial topic- and source-related confounding within the benchmark dataset (e.g., “real” news disproportionately originating from wire services), which means that near-perfect in-dataset performance does not necessarily translate to reliable veracity detection in real-world scenarios. Cross-dataset validation underscores this limitation: accuracy drops markedly on external corpora (from ~99.8% to ~68%), indicating that the models rely on dataset-specific shortcuts. Our findings support three recommendations for practitioners: (1) establish robust XGBoost baselines before deploying more computationally expensive deep learning models, particularly in resource-constrained or latency-sensitive settings; (2) conduct thorough cross-dataset evaluations to assess generalization capacity; and (3) perform feature audits to identify and mitigate spurious correlations. All code, dataset splits, and evaluation scripts are publicly released to promote reproducibility and enable rigorous future research.
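The abstract reports a small accuracy gap between the two best models (99.85% vs. 99.64%) and mentions statistical significance testing without naming the test. As an illustration only, the sketch below shows one common way to check such a gap on a shared test set: an exact McNemar test on the paired predictions. The choice of McNemar's test, the function name `mcnemar_exact`, and the toy labels are assumptions made for this example; they are not taken from the study's released code.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test for two classifiers on the same test set.

    Hypothetical helper for illustration. Returns (only_a, only_b, p_value),
    where only_a counts items that only model A classified correctly and
    only_b counts items that only model B classified correctly.
    """
    if not (len(y_true) == len(pred_a) == len(pred_b)):
        raise ValueError("all three sequences must have the same length")

    only_a = only_b = 0
    for truth, a, b in zip(y_true, pred_a, pred_b):
        a_ok, b_ok = (a == truth), (b == truth)
        if a_ok and not b_ok:
            only_a += 1
        elif b_ok and not a_ok:
            only_b += 1

    n = only_a + only_b  # discordant pairs; concordant pairs carry no information
    if n == 0:
        return only_a, only_b, 1.0

    # Under the null hypothesis, discordant pairs split as Binomial(n, 0.5).
    k = min(only_a, only_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return only_a, only_b, min(1.0, 2.0 * tail)


if __name__ == "__main__":
    # Toy labels, made up for the example (1 = fake, 0 = real).
    y_true = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
    model_a = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
    model_b = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
    print(mcnemar_exact(y_true, model_a, model_b))
```

The test looks only at articles on which the two models disagree, so it suits comparisons where both models are near-perfect and overall accuracy differs by a fraction of a percentage point. A significant result here would still say nothing about the cross-dataset drop, which needs a separate evaluation on external corpora.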

Keywords:

Fake News Detection; Text Classification; XGBoost; Deep Learning; Machine Learning; Bias Audits

Ben Ammar, B. (2025). A Comparative Machine Learning and Deep Learning Algorithms for Fake News Detection. JOURNAL OF ADMINISTRATIVE AND ECONOMIC SCIENCES, 18(2), 779–795. Retrieved from https://jaes.qu.edu.sa/index.php/jae/article/view/2733