Analysis of the Effect of Dimensionality Reduction on Deep Learning Performance in Classifying Microarray RNA-Seq Gene Expression Data
- Duwi Lufita Marfiana
- 14230015
ABSTRACT
- Name : Duwi Lufita Marfiana
- Student ID (NIM) : 14230015
- Study Program : Computer Science
- Faculty : Information Technology
- Degree Level : Master's (S2)
- Concentration : Artificial Intelligence and Blockchain
- Title : Analysis of the Effect of Dimensionality Reduction on Deep Learning Performance in Classifying Microarray RNA-Seq Gene Expression Data
This study evaluates how effectively dimensionality reduction methods improve the performance of deep learning models on microarray- and RNA-Seq-based cancer gene expression data. The dataset, obtained from the UCI Machine Learning Repository, comprises 801 samples with 20,351 features and 5 cancer classes (BRCA, COAD, KIRC, LUAD, PRAD). The main challenges in the experiments are the high feature dimensionality, the limited number of samples, and class imbalance. Four dimensionality reduction techniques are evaluated: Principal Component Analysis (PCA), Truncated Singular Value Decomposition (TruncatedSVD), PaCMAP, and TorchDR. Their outputs serve as input to four deep learning models: Deep Neural Network (DNN), Multilayer Perceptron (MLP), TabNet, and Self-Attention Network (SAN). The results show that DNN combined with PCA and with TorchDR achieved the best performance, with prediction scores and evaluation metrics above 99%. TabNet showed a significant improvement on data reduced with the non-linear methods PaCMAP and TorchDR. The study shows that choosing an appropriate dimensionality reduction technique can significantly improve the performance of deep learning models, particularly in bioinformatics, where genomic data are high-dimensional.
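To make the reduction step concrete, here is a minimal sketch of the two linear techniques named in the abstract, PCA and TruncatedSVD. It is not the thesis code: the data are a small synthetic stand-in for the gene expression matrix, the function names `pca_reduce` and `truncated_svd_reduce` are hypothetical, and the component count of 10 is an arbitrary assumption. PaCMAP and TorchDR are not covered. The sketch only illustrates the key difference between the two methods: PCA centers the features before the decomposition, while truncated SVD works on the data as given.

```python
import numpy as np


def pca_reduce(X, k):
    """Project X onto its top-k principal components (SVD of the centered data)."""
    Xc = X - X.mean(axis=0)                       # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                          # shape: (n_samples, k)


def truncated_svd_reduce(X, k):
    """Project X onto its top-k right singular vectors, without centering."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:k].T                           # shape: (n_samples, k)


# Synthetic stand-in (assumption): 60 samples x 500 features with a constant
# offset, so that centering makes a visible difference.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 500)) + 5.0

Z_pca = pca_reduce(X, 10)
Z_svd = truncated_svd_reduce(X, 10)
print(Z_pca.shape, Z_svd.shape)
```

Either reduced matrix could then be passed to a classifier in place of the original high-dimensional features. Because the uncentered variant keeps the feature offset in its first component, the two methods can hand quite different inputs to the downstream model.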
KEYWORDS
Analysis of the Effect of Dimensionality Reduction
Detailed Information
This thesis was written by:
- Name : Duwi Lufita Marfiana
- Student ID (NIM) : 14230015
- Study Program : Computer Science
- Campus : Margonda
- Year : 2025
- Period : I
- Supervisor : Dr. Muhammad Haris, S.Kom, M.Eng
- Assistant :
- Code : 0010.S2.IK.TESIS.I.2025
- Entered by : SGM
- Last updated : 08 December 2025
ABOUT THE LIBRARY
The Universitas Nusa Mandiri E-Library is a digital platform that provides access to information within the Universitas Nusa Mandiri campus community, including book collections, journals, e-books, and more.