CLASSIFICATION OF THE INDONESIAN TRANSLATION OF THE AL-QUR’AN USING THE SUPPORT VECTOR MACHINE (SVM) ALGORITHM

  • Moch Fauzan Institut Sains dan Teknologi Terpadu Surabaya
  • Hartarto Junaedi Institut Sains dan Teknologi Terpadu Surabaya
  • Endang Setyati Institut Sains dan Teknologi Terpadu Surabaya
Keywords: Al-Qur’an, feature selection techniques, Support Vector Machine (SVM) algorithm, AUC, F1-score

Abstract

This study classifies verses of the Indonesian translation of the Qur'an by grouping verses that share the same meaning on a given topic. The translated verses are labeled with one of six categories: education, motivation, social, history, politics, and science (mathematics). The proposed method applies Chi-Square feature selection and Principal Component Analysis (PCA), then trains a classification model with the Support Vector Machine (SVM) algorithm to assign each translated verse to one of the six categories. The first stage is preprocessing, in which each document is weighted using TF-IDF. Once the weights are obtained, the best classification model for labeling the verses is sought by comparing models trained with and without feature selection. The best result came from the SVM model without feature selection, with an AUC of 83.3%, compared with an AUC of 73.3% with Chi-Square feature selection and 63.3% with PCA. The best model for classifying the Indonesian translation of the Qur'anic verses in this study is therefore the SVM without feature selection, which achieved the highest AUC of 83.3%.
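The weighting and feature-scoring steps described above can be illustrated with a minimal sketch. It assumes one common TF-IDF variant (term frequency normalized by document length, unsmoothed logarithmic IDF) and a binary term-presence Chi-Square score; the abstract does not state which variants the study used. The function names `tf_idf` and `chi_square` and the toy English tokens are hypothetical and are not taken from the study's data. The SVM training step is not shown.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


def chi_square(docs, labels, term, category):
    """Chi-Square statistic for term presence versus category membership."""
    a = b = c = d = 0  # cells of the 2x2 contingency table
    for doc, label in zip(docs, labels):
        has_term = term in doc
        in_category = label == category
        if has_term and in_category:
            a += 1
        elif has_term:
            b += 1
        elif in_category:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


# Hypothetical toy corpus: four tokenized "verses" with two of the six labels.
docs = [
    ["seek", "knowledge", "read"],
    ["read", "write", "pen"],
    ["help", "poor", "give"],
    ["give", "charity", "poor"],
]
labels = ["education", "education", "social", "social"]

weights = tf_idf(docs)
read_score = chi_square(docs, labels, "read", "education")
print(weights[0])
print(read_score)
```

In this toy corpus, "read" appears in every education verse and in no social verse, so it receives the highest possible Chi-Square score for four documents. A feature selection step would keep the top-scoring terms and drop the rest before training the classifier.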

Keywords: Feature Selection Techniques; Holy Qur’an; Support Vector Machine (SVM) Algorithm; AUC; F1-score.

Author Biographies

Moch Fauzan, Institut Sains dan Teknologi Terpadu Surabaya

Department of Informatics

Hartarto Junaedi, Institut Sains dan Teknologi Terpadu Surabaya

Department of Informatics

Endang Setyati, Institut Sains dan Teknologi Terpadu Surabaya

Department of Informatics

Published
2022-12-29
Section
Articles