Comparison of Dimensionality Reduction Techniques to Improve Performance and Efficiency of Logistic Regression in Network Anomaly Detection

Mokhamad Isna Marzuki Ahfa; Lukman Hakim; Muhammad Imron Rosadi

doi:10.30996/jitcs.12212

Authors

Mokhamad Isna Marzuki Ahfa Universitas Yudharta Pasuruan
Lukman Hakim Universitas Yudharta Pasuruan https://orcid.org/0000-0001-6089-5879
Muhammad Imron Rosadi Universitas Yudharta Pasuruan https://orcid.org/0000-0002-8217-1641

DOI:

https://doi.org/10.30996/jitcs.12212

Keywords:

dimensionality reduction, Logistic Regression, network anomaly detection, performance evaluation, Truncated Singular Value Decomposition

Abstract

Network anomaly detection identifies abnormal network traffic that may pose a security threat. This research aims to improve the performance and efficiency of Logistic Regression (LR) in network anomaly detection by applying dimensionality reduction techniques, namely Principal Component Analysis (PCA), Truncated Singular Value Decomposition (TSVD), t-Distributed Stochastic Neighbor Embedding (t-SNE), and Independent Component Analysis (ICA). Each method is evaluated on accuracy, precision, recall, F1-score, and computation time. TSVD performs best, with 95.86% accuracy, 0.96 precision, 0.96 recall, 0.95 F1-score, and a computation time of 13.83 seconds. ICA performs worst, particularly in precision, recall, and F1-score, with values of 0.73, 0.83, and 0.78, respectively. t-SNE produces competitive accuracy but at a high computational cost, with an execution time of 1698.54 seconds. These findings show that choosing the right dimensionality reduction algorithm not only improves detection performance but also supports data processing efficiency, which makes the choice relevant for large-scale network security scenarios.
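
The evaluation protocol described in the abstract (reduce the features, train LR, then record accuracy, precision, recall, F1-score, and computation time) can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the authors' implementation: the synthetic data from make_classification stands in for the network traffic dataset, and the component count, train/test split, weighted averaging, and feature standardization are all assumptions. t-SNE is left out because scikit-learn's TSNE provides fit_transform but no transform method for held-out samples, and the abstract does not say how the study handled this.

```python
import time

from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, FastICA, TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the network traffic data (assumption: the real
# dataset and its preprocessing are not reproduced here).
X, y = make_classification(
    n_samples=2000, n_features=40, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

N_COMPONENTS = 10  # hypothetical value, not taken from the paper

reducers = {
    "PCA": PCA(n_components=N_COMPONENTS, random_state=0),
    "TSVD": TruncatedSVD(n_components=N_COMPONENTS, random_state=0),
    "ICA": FastICA(n_components=N_COMPONENTS, random_state=0, max_iter=1000),
}


def evaluate(reducer):
    """Fit scaler -> reducer -> LR, then return test metrics and wall-clock time."""
    model = make_pipeline(
        StandardScaler(), reducer, LogisticRegression(max_iter=1000)
    )
    start = time.perf_counter()
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    elapsed = time.perf_counter() - start
    # Weighted averaging is an assumption; the abstract does not name one.
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "seconds": elapsed,
    }


results = {name: evaluate(reducer) for name, reducer in reducers.items()}

for name, m in results.items():
    print(
        f"{name:5s} acc={m['accuracy']:.3f} prec={m['precision']:.3f} "
        f"rec={m['recall']:.3f} f1={m['f1']:.3f} time={m['seconds']:.2f}s"
    )
```

Because this sketch standardizes the features first, PCA and TSVD both operate on centered data and will behave almost identically here; differences of the kind reported in the abstract depend on the actual dataset and preprocessing, which this example does not attempt to reproduce.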

Author Biographies

Mokhamad Isna Marzuki Ahfa, Universitas Yudharta Pasuruan

Department of Informatics Engineering

Lukman Hakim, Universitas Yudharta Pasuruan

Department of Informatics Engineering

Muhammad Imron Rosadi, Universitas Yudharta Pasuruan

Department of Informatics Engineering

Published

2025-01-14

How to Cite

Ahfa, M. I. M., Hakim, L., & Rosadi, M. I. (2025). Comparison of Dimensionality Reduction Techniques to Improve Performance and Efficiency of Logistic Regression in Network Anomaly Detection. Journal of Information Technology and Cyber Security, 3(1), 1–13. https://doi.org/10.30996/jitcs.12212

Issue

Vol. 3 No. 1 (2025): January

Section

Research Article

License

Copyright Notice based on COPE (Committee on Publication Ethics) for JITCS: Journal of Information Technology and Cyber Security

Ownership and Copyright:
1. JITCS: Journal of Information Technology and Cyber Security respects the intellectual property rights of authors. The copyright for individual articles published in JITCS is retained by the respective authors, unless otherwise specified.
2. The articles published in JITCS are licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0), which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial, and no modifications or adaptations are made.
3. JITCS serves as the initial publisher of the articles, providing them with the first publication platform.
Permissions and Usage:
1. Distribution for Non-Commercial Purposes: Permitted: Users are allowed to distribute the article for non-commercial purposes, provided the original work is properly cited and no modifications or adaptations are made.
2. Distribution for Commercial Purposes: Not Permitted: The article may not be distributed for any commercial purposes without obtaining prior written permission from the author(s).
3. Inclusion in a Collective Work (e.g., Anthology) for Non-Commercial Purposes: Permitted: Users are allowed to include the article in a collective work, such as an anthology, as long as the use is non-commercial and the work remains unchanged.
4. Inclusion in a Collective Work for Commercial Purposes: Not Permitted: The article may not be included in any collective work or anthology intended for commercial purposes without prior permission from the author(s).
5. Creation and Distribution of Revised Versions, Adaptations, or Derivative Works (e.g., Translation) for Non-Commercial Purposes: Not Permitted: Users may not create or distribute revised versions, adaptations, or derivative works, including translations, for non-commercial purposes.
6. Creation and Distribution of Revised Versions, Adaptations, or Derivative Works for Commercial Purposes: Not Permitted: Users may not create or distribute revised versions, adaptations, or derivative works, including translations, for commercial purposes.
7. Text or Data Mining for Non-Commercial Purposes: Permitted: Users are permitted to engage in text or data mining of the article for non-commercial research purposes, provided the original work is properly attributed.
8. Text or Data Mining for Commercial Purposes: Not Permitted: Users may not engage in text or data mining of the article for commercial purposes without obtaining explicit permission from the author(s).
Attribution and Citation:
1. Proper attribution and citation of the published work should be provided when using or referring to content from JITCS. This includes clearly indicating the authors, the title of the article, the journal name (JITCS), the volume/issue number, the publication year, and the article's DOI (Digital Object Identifier) when available.
2. When adapting or modifying the published content, proper attribution to the original source should be given, and the adapted or modified content should be shared under the same CC BY-NC-ND 4.0 license.
Plagiarism and Copyright Infringement:
1. JITCS considers plagiarism and copyright infringement as serious ethical violations. Authors are responsible for ensuring that their submitted work is original and does not infringe upon the copyright or intellectual property rights of others.
2. Any allegations of plagiarism or copyright infringement will be investigated promptly and thoroughly. If proven, appropriate actions, including rejection of the manuscript, retraction of the published article, or other corrective measures, will be taken.
Open Access Licensing:
1. JITCS supports open access publishing and encourages authors to consider publishing their work under the CC BY-NC-ND 4.0 license to promote the dissemination and use of knowledge in the field of information technology and cyber security.
2. The specific terms and conditions of the CC BY-NC-ND 4.0 license will be clearly indicated on the published articles.
Policy Review: This Copyright Notice will be periodically reviewed and updated to ensure its continued relevance and compliance with copyright laws, ethical standards, and open access principles in scholarly publishing. Any updates or revisions to the notice will be communicated to the relevant stakeholders.

By adhering to this Copyright Notice, JITCS aims to protect the rights of authors, promote proper attribution and citation practices, and facilitate the responsible and legal use of the published content in accordance with the CC BY-NC-ND 4.0 license.

ISSN
ISSN (Print)	: 2987-3878
ISSN (Online)	: 2987-386X

