Effective software maintenance depends on the precise and prompt assignment of bug reports according to their category and importance, which calls for an automated method of classifying and ranking bug reports. Numerous researchers have recently investigated the automated classification and prioritization of bug reports, but progress in this area remains limited. Testing and maintenance are the most crucial stages of software development, and bug reports are essential in both. The software quality assurance team creates a bug report when software modules are tested. The main difficulty in analysing bug data is that it is written in natural-language text, which makes processing it and extracting information from it extremely challenging. These needs are the driving force for this research. The proposed research develops a hybrid model that combines the contextual awareness of machine learning language models (BERT) with more conventional feature extraction methods (such as TF-IDF). The two feature sets, one from TF-IDF and the other from BERT, can be concatenated and passed to a downstream classifier (such as an SVM or logistic regression). This enables the model to exploit both the rich contextual relationships that BERT captures and the statistical importance of terms that TF-IDF captures. Earlier research used these two approaches separately, which resulted in lower performance. The research used a confidential dataset, obtained on request from a private company for testing purposes, that included data from eight hundred employees. To aid model training, bug keywords were first extracted from the bug description field. The results show that the proposed model achieves 89% accuracy.
Keywords
Term Frequency, Bug Reports, Inverse Document Frequency, NLP, Machine Learning.
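The feature-fusion idea described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a toy set of four invented bug reports with binary priority labels, a hand-written TF-IDF, and a small logistic regression trained by stochastic gradient descent. The function `placeholder_embedding` is a hypothetical stand-in for a BERT sentence embedding (a hashed bag of words with no real contextual information), used only so that the example runs with the Python standard library. All function names, parameters, and data here are assumptions made for illustration.

```python
import math
import zlib
from collections import Counter

def tokenize(text):
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf_vector(doc, idf, vocab):
    """Sparse statistical features: term frequency times IDF, in vocab order."""
    counts = Counter(tokenize(doc))
    total = sum(counts.values()) or 1
    return [counts[t] / total * idf[t] for t in vocab]

def placeholder_embedding(doc, dim=4):
    """HYPOTHETICAL stand-in for a BERT embedding (assumption, not BERT):
    a deterministic hashed bag of words, normalized by token count."""
    vec = [0.0] * dim
    tokens = tokenize(doc)
    for tok in tokens:
        vec[zlib.crc32(tok.encode("utf-8")) % dim] += 1.0
    n = len(tokens) or 1
    return [v / n for v in vec]

def hybrid_features(doc, idf, vocab):
    """Concatenate the TF-IDF vector with the dense embedding."""
    return tfidf_vector(doc, idf, vocab) + placeholder_embedding(doc)

def train_logistic_regression(X, y, lr=0.5, epochs=500):
    """Plain SGD on the logistic loss; returns weights and bias."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(x, w, b):
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if z >= 0 else 0

# Invented toy data (assumption): 1 = high priority, 0 = low priority.
reports = [
    "app crashes on login with null pointer",
    "crash when saving file data loss",
    "typo in settings page label",
    "button color slightly off in dark theme",
]
labels = [1, 1, 0, 0]

idf = fit_idf(reports)
vocab = sorted(idf)
X = [hybrid_features(r, idf, vocab) for r in reports]
w, b = train_logistic_regression(X, labels)
train_predictions = [predict(x, w, b) for x in X]
```

In a realistic setting, the TF-IDF step and the classifier would more likely come from an established library, and the placeholder would be replaced by embeddings from an actual pretrained BERT model. The concatenation step in `hybrid_features` is the part that corresponds to the hybrid design described above.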
CRediT Author Statement
The authors confirm contribution to the paper as follows:
Conceptualization: Mamatha Racharla, Lalitha Surya Kumari P and Sharada Adepu;
Writing- Original Draft Preparation: Lalitha Surya Kumari P and Sharada Adepu;
Visualization: Mamatha Racharla, Lalitha Surya Kumari P and Sharada Adepu;
Investigation: Lalitha Surya Kumari P and Sharada Adepu;
Supervision: Mamatha Racharla;
Validation: Lalitha Surya Kumari P and Sharada Adepu;
Writing- Reviewing and Editing: Mamatha Racharla, Lalitha Surya Kumari P and Sharada Adepu;
All authors reviewed the results and approved the final version of the manuscript.
Acknowledgements
The authors would like to thank the reviewers for their kind comments on the manuscript.
Funding
No funding was received to assist with the preparation of this manuscript.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Availability of data and materials
Data sharing is not applicable to this article as no new data were created or analysed in this study.
Author information
Contributions
All authors contributed equally to the paper, and all authors have read and agreed to the published version of the manuscript.
Corresponding author
Mamatha Racharla
Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Hyderabad, Telangana, India.
Open Access. This article is licensed under a Creative Commons Attribution NoDerivs license, which is a more restrictive license. It allows you to redistribute the material commercially or non-commercially, but the user cannot make any changes whatsoever to the original, i.e. no derivatives of the original work. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-nd/4.0/
Cite this article
Mamatha Racharla, Lalitha Surya Kumari P and Sharada Adepu, “A Robust Evaluation of Bug Pre-Processing and Classification Logic using NLP Computation with Machine Learning Technique”, Journal of Machine and Computing, vol. 5, no. 4, pp. 2224-2229, October 2025, doi: 10.53759/7669/jmc202505172.