Department of Computer Science and Engineering, School of Electrical Engineering and Computing, Adama Science and Technology University, Adama, Ethiopia.
Department of Computer Science and Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences (SIMATS), Chennai, India.
With a growing number of Ethiopians actively engaging with the Internet and social media platforms, clickbait has become a significant concern. Clickbait, which uses enticing titles to tempt users into clicking, has become rampant for various reasons, including advertising and revenue generation. However, the Amharic language, spoken by a large population, lacks sufficient NLP resources for addressing this issue. In this study, the authors developed a machine learning model for detecting and classifying clickbait titles in the Amharic language. To facilitate this, the authors prepared the first Amharic clickbait dataset, which comprises 53,227 social media posts from well-known platforms including Facebook, Twitter, and YouTube. As a baseline, the authors evaluated conventional machine learning methods, namely Random Forest (RF), Logistic Regression (LR), and Support Vector Machines (SVM), with TF-IDF and N-gram feature extraction. Subsequently, the authors investigated the efficacy of two word embedding techniques, word2vec and fastText, with Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) deep learning algorithms. On the test data, the CNN model with fastText word embeddings performed best among the models, with 94.27% accuracy and a 94.24% F1 score. The study advances natural language processing for low-resource languages and offers insight into countering clickbait content in Amharic.
Keywords
Clickbait Detection, Artificial Neural Networks, Natural Language Processing, Machine Learning Techniques, Deep Learning Techniques, Amharic Language, Social Media.
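The baseline features and the reported metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the whitespace tokenisation, the smoothed IDF variant, and the binary (positive-class) F1 are all assumptions made for illustration, and the example titles in the usage note are English placeholders rather than items from the Amharic dataset.

```python
import math
from collections import Counter


def word_ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max from a whitespace-tokenised title."""
    tokens = text.split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_vectors(titles, n_max=2):
    """Compute one sparse TF-IDF vector (dict: n-gram -> weight) per title.

    Assumed variant: relative term frequency multiplied by the smoothed
    IDF log((1 + N) / (1 + df)) + 1. Other TF-IDF variants exist.
    """
    docs = [Counter(word_ngrams(t, n_max)) for t in titles]
    n_docs = len(docs)
    # Document frequency: in how many titles each n-gram appears.
    doc_freq = Counter()
    for counts in docs:
        doc_freq.update(counts.keys())
    vectors = []
    for counts in docs:
        total = sum(counts.values())
        vectors.append({
            gram: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[gram])) + 1)
            for gram, count in counts.items()
        })
    return vectors


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1 of the positive class) for two equal-length label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1
```

For example, `tfidf_vectors(["you will not believe this", "parliament approves budget"])` yields one weight dictionary per title, in which n-grams that occur in fewer titles receive a higher weight. Such vectors would then be passed to a classifier such as LR, RF, or SVM, and `accuracy_and_f1` compares the predicted labels with the gold labels.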
Acknowledgements
We would like to thank the reviewers for taking the time and effort necessary to review the manuscript. We sincerely appreciate all their valuable comments and suggestions, which helped us improve the quality of the manuscript.
Funding
This research study was funded by Adama Science and Technology University under grant number ASTU/SM-R/851/23. The authors would like to express their gratitude for the assistance received from the institute.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Availability of data and materials
Data sharing is not applicable to this article as no new data were created or analysed in this study.
Author information
Contributions
All authors contributed equally to the paper, and all authors have read and agreed to the published version of the manuscript.
Corresponding author
Mesfin Abebe Haile
Department of Computer Science and Engineering, School of Electrical Engineering and Computing, Adama Science and Technology University, Adama, Ethiopia.
Open Access: This article is licensed under a Creative Commons Attribution NoDerivs license, which is a more restrictive license. It allows you to redistribute the material commercially or non-commercially, but you cannot make any changes whatsoever to the original, i.e. no derivatives of the original work. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-nd/4.0/
Cite this article
Rajesh Sharma R, Akey Sungheetha, Mesfin Abebe Haile, Arefat Hyeredin Kedir, Rajasekaran A and Charles Babu G, “Clickbait Detection for Amharic Language using Deep Learning Techniques”, Journal of Machine and Computing, pp. 603-615, July 2024. doi: 10.53759/7669/jmc202404058.