Journal of Machine and Computing


Machine Learning for Genomic Expression Classification-Based Phenotype Prediction in Topological Data Analysis




Received On : 30 March 2024

Revised On : 26 July 2024

Accepted On : 26 August 2024

Published On : 05 October 2024

Volume 04, Issue 04

Pages : 1152-1157


Abstract


Genomic data have become more prevalent owing to innovations in sequencing and Machine Learning (ML), which have expanded research in biological genomics. The high-dimensional nature of these data complicates phenotype prediction, which is required for individualized health care and for research into genetic disorders; nevertheless, the data hold tremendous potential for understanding the association between genes and physical traits. To address these challenges, this paper introduces a new technique for phenotype prediction from genomic data that combines Topological Data Analysis (TDA), Graph Convolutional Networks (GCN), and Support Vector Machines (SVM). By using TDA for multifaceted feature extraction, GCN to analyze gene interaction networks, and SVM for reliable classification in high-dimensional spaces, the technique overcomes drawbacks of conventional approaches. The TDA-GCN-SVM model is shown to outperform conventional methods on distinct tumor datasets in terms of accuracy and additional measures. The work offers a novel method for genomic studies and a deeper understanding of genomic data analysis, and it represents a significant advance for precision healthcare.
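The abstract names three components but does not say how they are wired together, so the following is only a toy sketch under stated assumptions, not the authors' implementation. It uses 0-dimensional persistent homology as the TDA feature, a single untrained graph-convolution propagation step over a gene network, and a linear SVM trained by subgradient descent. The function names, the way features are concatenated in `featurize`, and the toy data are all hypothetical.

```python
import math


def h0_persistence(points):
    """Death times of connected components (H0) in a Vietoris-Rips filtration.

    These equal the edge lengths of the minimum spanning tree, found here
    with Kruskal's algorithm and a union-find structure. All births are 0.
    """
    n = len(points)
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # two components merge at scale d
            parent[ri] = rj
            deaths.append(d)
    return deaths


def gcn_propagate(adj, feats):
    """One propagation step D^-1/2 (A + I) D^-1/2 H, with no learned weights."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [
        [sum(a_hat[i][j] / math.sqrt(deg[i] * deg[j]) * feats[j][k] for j in range(n))
         for k in range(len(feats[0]))]
        for i in range(n)
    ]


def featurize(expr, adj):
    """Assumed feature layout: smoothed expression per gene + two H0 summaries."""
    smoothed = [row[0] for row in gcn_propagate(adj, [[e] for e in expr])]
    # Hypothetical point cloud: one 2-D point (raw, smoothed) per gene.
    deaths = h0_persistence(list(zip(expr, smoothed)))
    return smoothed + [sum(deaths), max(deaths)]


def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=100):
    """Linear SVM (hinge loss + L2 penalty) via plain subgradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wk * xk for wk, xk in zip(w, xi)) + b)
            if margin < 1:  # hinge loss is active for this sample
                w = [wk - lr * (lam * wk - yi * xk) for wk, xk in zip(w, xi)]
                b += lr * yi
            else:  # only the regularizer contributes
                w = [wk - lr * lam * wk for wk in w]
    return w, b


def predict(w, b, x):
    return 1 if sum(wk * xk for wk, xk in zip(w, x)) + b >= 0 else -1


if __name__ == "__main__":
    # Hypothetical 3-gene chain network (gene0 - gene1 - gene2) and 4 toy samples.
    adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    samples = [[5.0, 4.0, 5.0], [6.0, 5.0, 4.0], [1.0, 0.5, 1.0], [0.0, 1.0, 0.5]]
    labels = [1, 1, -1, -1]
    X = [featurize(s, adj) for s in samples]
    w, b = train_linear_svm(X, labels)
    print([predict(w, b, x) for x in X])
```

In practice one would more likely rely on established libraries for each stage (a persistent homology package, a graph neural network framework with trained weights, and a kernel SVM); the pure-Python version above only makes the data flow between the three stages concrete.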


Keywords


Deep Learning, Genomic Expression, Topological Data Analysis, Graph Convolutional Networks.



Acknowledgements


The authors would like to thank the reviewers for their helpful comments on the manuscript.


Funding


No funding was received to assist with the preparation of this manuscript.


Ethics declarations


Conflict of interest

The authors would like to thank the reviewers for their helpful comments on the manuscript.


Availability of data and materials


Data sharing is not applicable to this article as no new data were created or analysed in this study.


Author information


Contributions

All authors contributed equally to the paper, and all authors have read and agreed to the published version of the manuscript.


Corresponding author


Rights and permissions


Open Access. This article is licensed under a Creative Commons Attribution NoDerivs license, which is a more restrictive license. It allows you to redistribute the material commercially or non-commercially, but the user cannot make any changes whatsoever to the original, i.e. no derivatives of the original work. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-nd/4.0/


Cite this article


Narender M, Karrar S. Mohsin, Ragunthar T, Anusha Papasani, Firas Tayseer Ayasrah and Anjaneyulu Naik R, “Machine Learning for Genomic Expression Classification-Based Phenotype Prediction in Topological Data Analysis”, Journal of Machine and Computing, pp. 1152-1157, October 2024. doi:10.53759/7669/jmc202404106.


Copyright


© 2024 Narender M, Karrar S. Mohsin, Ragunthar T, Anusha Papasani, Firas Tayseer Ayasrah and Anjaneyulu Naik R. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.