Lung Cancer Classification using Gray-Level Co-Occurrence Matrix Feature Extraction and Forward Selection Feature Selection based on the K-Nearest Neighbor Algorithm

Soeparmi Soeparmi, Mohtar Yunianto, Lukmaniyah Rizky Amalia

Abstract

In diagnosing lung cancer, the medical imaging team manually inspects CT-scan images of the lungs. Manual inspection makes it difficult to distinguish lung cancer images from normal images because noise degrades image quality, so the noise must be reduced through image processing. This study used median and Gaussian filters, Otsu thresholding segmentation, Gray-Level Co-Occurrence Matrix (GLCM) feature extraction, forward selection, and k-Nearest Neighbor classification. The results show that of the 22 statistical features extracted, only 16 were selected as features for image classification. The dataset consists of 900 images for training and 100 images for testing. On the 100 test images, diagnostic accuracy without forward selection (22 GLCM features) was 81.67%, while diagnostic accuracy with forward selection (16 GLCM features) was 93.22%, with a sensitivity of 92.25% and a specificity of 94.46%.
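The feature-selection and evaluation steps named above can be illustrated with a minimal sketch. It assumes a greedy forward selection wrapped around a k-Nearest Neighbor classifier, scored by leave-one-out accuracy, with k = 3 and a tiny made-up three-feature dataset. The abstract does not state the scoring scheme, the value of k, or the stopping rule, and all function names here are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k, feats):
    """Majority vote among the k nearest rows (Euclidean distance on `feats`)."""
    dists = sorted(
        (math.sqrt(sum((row[f] - x[f]) ** 2 for f in feats)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(X, y, k, feats):
    """Leave-one-out accuracy of k-NN restricted to the feature subset `feats`."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], k, feats) == y[i]
    return hits / len(X)

def forward_selection(X, y, k=3):
    """Greedily add the feature that most improves accuracy; stop when none does."""
    remaining = list(range(len(X[0])))
    selected, best = [], 0.0
    while remaining:
        score, feat = max((loo_accuracy(X, y, k, selected + [f]), f) for f in remaining)
        if score <= best:  # assumed stopping rule: no strict improvement
            break
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected, best

def sensitivity_specificity(y_true, y_pred, positive=1):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Toy data (made up): feature 0 separates the classes, feature 1 is
# large-scale noise shared by both classes, feature 2 is constant.
X = [[0.0, 5.0, 0.0], [0.1, -5.0, 0.0], [0.2, 3.0, 0.0], [0.3, -3.0, 0.0],
     [1.0, 5.0, 0.0], [1.1, -5.0, 0.0], [1.2, 3.0, 0.0], [1.3, -3.0, 0.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

selected, best = forward_selection(X, y, k=3)
all_feats_acc = loo_accuracy(X, y, 3, [0, 1, 2])
```

On this toy data the search keeps only the informative feature, and the accuracy with all three features is lower than with the selected subset. This mirrors, in miniature, the reported effect of dropping uninformative GLCM features.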

Keywords

Forward Selection; GLCM; k-Nearest Neighbor; Lung Cancer


