Journal of Machine and Computing


Optimising Clustering Accuracy Using Mahalanobis Distance-Based Ensemble Methods - A Novel Data Analysis Paradigm




Received On : 02 March 2025

Revised On : 16 May 2025

Accepted On : 21 July 2025

Published On : 05 October 2025

Volume 05, Issue 04

Pages : 2171-2182


Abstract


A prominent paradigm in Machine Learning is Ensemble Learning, which integrates multiple models to surpass the performance, accuracy, and generalization capabilities of individual models. This study investigates Ensemble Learning as an approach to improving the precision of Cluster Analysis. Conventional clustering algorithms often struggle with the complexity of datasets characterized by multifaceted attributes, many of which exhibit obscure or non-linear interdependencies. By employing advanced Ensemble Learning approaches, this research aims to improve clustering efficacy and reveal latent patterns that remain imperceptible to traditional approaches. The proposed investigation positions Ensemble Learning as a transformative approach to cluster analysis within data science: its ability to improve the accuracy of clustering techniques and to extract hidden structures that are not readily apparent through conventional algorithms underlines its value. Accordingly, this research endeavours to advance the precision and applicability of clustering methodologies in solving real-world data analysis challenges. To substantiate the efficacy of the proposed framework, a comprehensive comparative evaluation is performed, benchmarking its outcomes against those of established clustering algorithms on varied datasets.


Keywords


Machine Learning, Ensemble Techniques, Clustering Accuracy, Hidden Patterns, Mahalanobis Distance, Optimization.



CRediT Author Statement


The authors confirm contribution to the paper as follows:

Conceptualization: Kalimuthu Marimuthu, LNC Prakash K, Durgadevi P, Prashanthi M and Sunil P; Writing - Original Draft Preparation: Kalimuthu Marimuthu, LNC Prakash K, Durgadevi P, Prashanthi M and Sunil P; Visualization: Kalimuthu Marimuthu, LNC Prakash K; Validation: Durgadevi P, Prashanthi M and Sunil P; Writing - Reviewing and Editing: Kalimuthu Marimuthu, LNC Prakash K, Durgadevi P, Prashanthi M and Sunil P. All authors reviewed the results and approved the final version of the manuscript.


Acknowledgements


The authors thank the reviewers for taking the time and effort necessary to review the manuscript.


Funding


No funding was received to assist with the preparation of this manuscript.


Ethics declarations


Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.


Availability of data and materials


Data sharing is not applicable to this article as no new data were created or analysed in this study.


Author information


Contributions

All authors contributed equally to the paper, and all authors have read and agreed to the published version of the manuscript.


Corresponding author


Rights and permissions


Open Access: This article is licensed under a Creative Commons Attribution NoDerivs license, which is a more restrictive license. It allows the material to be redistributed commercially or non-commercially, but the user cannot make any changes whatsoever to the original, i.e. no derivatives of the original work. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-nd/4.0/


Cite this article


Kalimuthu Marimuthu, LNC Prakash K, Durgadevi P, Prashanthi M and Sunil P, “Optimising Clustering Accuracy Using Mahalanobis Distance-Based Ensemble Methods - A Novel Data Analysis Paradigm”, Journal of Machine and Computing, vol. 5, no. 4, pp. 2171-2182, October 2025, doi: 10.53759/7669/jmc202505168.


Copyright


© 2025 Kalimuthu Marimuthu, LNC Prakash K, Durgadevi P, Prashanthi M and Sunil P. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.