Department of Computer Science and Engineering, School of Electrical Engineering and Computing, Adama Science and Technology University, Adama, Ethiopia.
Department of Computer Science and Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences (SIMATS), Chennai, India.
As a growing number of Ethiopians actively engage with the Internet and social media platforms, the incidence of clickbait has become a significant concern. Clickbait, which uses enticing titles to tempt users into clicking, has become rampant for various reasons, including advertising and revenue generation. However, the Amharic language, although spoken by a large population, lacks sufficient NLP resources for addressing this issue. In this study, the authors developed a machine learning model for detecting and classifying clickbait titles in the Amharic language. To facilitate this, the authors prepared the first Amharic clickbait dataset, which comprises 53,227 social media posts from well-known sites including Facebook, Twitter, and YouTube. As a baseline, the authors assessed conventional machine learning methods, namely Random Forest (RF), Logistic Regression (LR), and Support Vector Machines (SVM), with TF-IDF and N-gram feature extraction. Subsequently, the authors investigated the efficacy of two word embedding techniques, word2vec and fastText, with Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) deep learning algorithms. On the test data, the CNN model with the fastText word embedding performed best among the models, with 94.27% accuracy and a 94.24% F1 score. The study advances natural language processing for low-resource languages and offers insight into how to counter clickbait content in Amharic.
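To make the baseline's TF-IDF and N-gram feature extraction concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the whitespace tokenisation, the smoothed IDF formula, and the toy English titles are all assumptions chosen for illustration, and real Amharic text would need language-specific preprocessing.

```python
import math
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Return whitespace-token n-grams of length n_min..n_max (hypothetical helper)."""
    tokens = text.split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_vectors(titles, n_min=1, n_max=2):
    """Build one sparse TF-IDF dict per title.

    Assumed weighting: relative term frequency times a smoothed IDF,
    followed by L2 normalisation of each vector.
    """
    docs = [Counter(word_ngrams(t, n_min, n_max)) for t in titles]
    n_docs = len(docs)

    # Document frequency: in how many titles each n-gram occurs.
    doc_freq = Counter()
    for counts in docs:
        doc_freq.update(counts.keys())

    vectors = []
    for counts in docs:
        total = sum(counts.values())
        vec = {}
        for gram, count in counts.items():
            idf = math.log((1 + n_docs) / (1 + doc_freq[gram])) + 1.0
            vec[gram] = (count / total) * idf
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        vectors.append({g: v / norm for g, v in vec.items()})
    return vectors


# Toy placeholder titles (not taken from the authors' dataset).
titles = [
    "you will not believe this trick",
    "parliament approves the annual budget",
    "you will not believe what happened",
]
vectors = tfidf_vectors(titles)
```

Each resulting dictionary maps a unigram or bigram to its weight. N-grams that occur in fewer titles receive a higher weight than those shared across titles, and vectors of this kind are what a classifier such as LR, SVM, or RF would take as input.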
Keywords
Clickbait Detection, Artificial Neural Networks, Natural Language Processing, Machine Learning Techniques, Deep Learning Techniques, Amharic Language, Social Media.
Acknowledgements
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Funding
No funding was received to assist with the preparation of this manuscript.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Availability of data and materials
Data sharing is not applicable to this article as no new data were created or analysed in this study.
Author information
Contributions
All authors contributed equally to the paper, and all authors have read and agreed to the published version of the manuscript.
Corresponding author
Mesfin Abebe Haile
Department of Computer Science and Engineering, School of Electrical Engineering and Computing, Adama Science and Technology University, Adama, Ethiopia.
Open Access: This article is licensed under a Creative Commons Attribution NoDerivs license, a more restrictive license. It allows the material to be redistributed commercially or non-commercially, but the user cannot make any changes whatsoever to the original, i.e., no derivatives of the original work. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-nd/4.0/
Cite this article
Rajesh Sharma R, Akey Sungheetha, Mesfin Abebe Haile, Arefat Hyeredin Kedir, Rajasekaran A, Charles Babu G, “Clickbait Detection for Amharic Language using Deep Learning Techniques”, Journal of Machine and Computing, doi: 10.53759/7669/jmc202404058.