Threatening URDU Language Detection from Tweets Using Machine Learning

Mehmood, Aneela; Farooq, Muhammad Shoaib; Naseem, Ansar; Rustam, Furqan; Gracia Villar, Mónica (monica.gracia@uneatlantico.es); Rodríguez Velasco, Carmen Lilí (carmen.rodriguez@uneatlantico.es); Ashraf, Imran (2022) Threatening URDU Language Detection from Tweets Using Machine Learning. Applied Sciences, 12 (20). p. 10342. ISSN 2076-3417

Full text: applsci-12-10342-v3.pdf (884kB). Available under License Creative Commons Attribution.

Official URL: http://doi.org/10.3390/app122010342

Abstract

The expansion of technology has contributed to the rising popularity of social media platforms. Twitter is one of the leading platforms that people use to share their opinions. Such opinions may sometimes contain threatening text, whether deliberate or not, which can be disturbing for other users. Consequently, detecting threatening content on social media is an important task. Unlike high-resource languages such as English and Dutch, for which several such approaches exist, the low-resource Urdu language has few. This study therefore presents an intelligent threatening language detection approach for the Urdu language. A stacking model is proposed that uses an extra tree (ET) classifier and the Bayes theorem-based Bernoulli Naive Bayes (BNB) as the base learners, while logistic regression (LR) is employed as the meta learner. A performance analysis compares the proposed model against a support vector classifier, ET, LR, BNB, a fully connected network, a convolutional neural network, long short-term memory, and a gated recurrent unit. Experimental results indicate that the stacked model performs better than both the machine learning and the deep learning models. With 74.01% accuracy, 70.84% precision, 75.65% recall, and a 73.99% F1 score, the model outperforms the existing benchmark study.
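
The stacking arrangement described in the abstract (ET and BNB as base learners, LR as meta learner) can be sketched with scikit-learn as shown here. This is only an illustrative sketch, not the authors' implementation: the binary bag-of-words features, the hyperparameters, the two-fold internal cross-validation, the helper name build_stacked_model, and the toy English placeholder texts are all assumptions, since the abstract does not specify them.

```python
from sklearn.ensemble import ExtraTreesClassifier, StackingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import make_pipeline


def build_stacked_model(random_state=0):
    """Hypothetical helper: ET + BNB base learners with an LR meta learner."""
    base_learners = [
        ("et", ExtraTreesClassifier(n_estimators=100, random_state=random_state)),
        ("bnb", BernoulliNB()),
    ]
    stack = StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=2,  # assumed: kept small only because the toy data set is tiny
    )
    # Assumed feature extraction: binary term-presence vectors.
    return make_pipeline(CountVectorizer(binary=True), stack)


# Toy placeholder data (not from the paper): 1 = threatening, 0 = non-threatening.
texts = [
    "i will hurt you tomorrow",
    "you will regret this i will find you",
    "i will destroy your house",
    "watch out i will hurt your family",
    "the weather is lovely today",
    "i enjoyed the cricket match",
    "this tea tastes wonderful",
    "see you at the market tomorrow",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

model = build_stacked_model()
model.fit(texts, labels)
predictions = model.predict(texts)
```

In this arrangement the meta learner is trained on out-of-fold probability estimates from the base learners, which is scikit-learn's default behaviour for StackingClassifier; whether the paper follows the same training scheme is not stated in the abstract.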

Item Type:	Article
Uncontrolled Keywords:	threatening language detection; Urdu text classification; machine learning; stacking
Subjects:	Subjects > Engineering
Divisions:	Europe University of Atlantic > Research > Scientific Production; Fundación Universitaria Internacional de Colombia > Research > Scientific Production; Ibero-american International University > Research > Scientific Production; Universidad Internacional do Cuanza > Research > Scientific Production
Depositing User:	Sr Bibliotecario
Date Deposited:	26 Oct 2022 09:00
Last Modified:	18 Jul 2023 07:27
URI:	http://repositorio.funiber.org/id/eprint/4194
