Economic globalization has accelerated the development of industries across many domains, and interest in big data technology has grown accordingly. Network data is being generated at an unprecedented pace, and these large volumes require intelligent processing. Machine learning is essential to realizing the value in this data: in a big data setting, its objective is to identify rules hidden within dynamic, variable, multi-source heterogeneous data, with the ultimate aim of maximizing the value of the data. Combining big data technology with machine learning algorithms makes it possible to identify relevant correlations in complex, changing datasets, after which computer-based data mining can be used to extract insights of research value. This study compares deep learning with conventional data mining and machine learning techniques and assesses the strengths and limitations of the traditional methods. It also introduces the requirements of enterprises, their systems and data, the IT challenges they face, and the role of Big Data in an extended service infrastructure. Finally, it analyzes the potential of, and the issues associated with, applying deep learning, machine learning, and traditional data mining techniques in the context of big data analytics.
Keywords
Machine Learning, Big Data, Data Mining, Big Data Analytics, Traditional Data Mining.
Acknowledgements
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Funding
No funding was received to assist with the preparation of this manuscript.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Availability of data and materials
No data are available for this study.
Author information
Contributions
All authors contributed equally to the paper, and all authors have read and agreed to the published version of the manuscript.
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-nd/4.0/
Cite this article
Francisco Pedro, “A Review of Data Mining, Big Data Analytics, and Machine Learning Approaches”, Journal of Computing and Natural Science, vol. 3, no. 4, pp. 169–181, October 2023. doi: 10.53759//181X/JCNS/202303016.