Analyzing Microarray Data To Identify Patterns And Cluster In Medical Database Using Data Mining Techniques

  • B. Lavanya Department of Computer Science University of Madras Chennai – 600 025
  • T. Madhumitha Department of Computer Science University of Madras Chennai – 600 025
Keywords: Unsupervised learning; Microarray; DBSCAN; Itemset; Association rule


This paper analyzes biological data using data mining techniques, specifically unsupervised learning. Clustering, Apriori, and association rule mining are applied to an Autism Spectrum Disorder (ASD) microarray dataset from the Gene Expression Omnibus. The data contain 100 genes, from which the genes that most strongly influence ASD are extracted using two unsupervised learning approaches: Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Apriori association rule mining. DBSCAN groups the genes into clusters, which are then plotted. Apriori identifies the genes that occur frequently and forms them into itemsets, and association rules are derived from those itemsets. The results of the algorithms are compared and tabulated to show which genes influence ASD and to identify the genes that influence it most strongly.
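The itemset and rule-mining steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the gene labels G1 to G4, the five toy samples, and the thresholds (minimum support 0.5, minimum confidence 0.7) are all hypothetical, and it assumes the expression values have already been discretized so that each sample is the set of genes flagged as highly expressed. The DBSCAN clustering step is not shown.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    n = len(transactions)
    # Level-1 candidates: every individual item seen in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep candidates whose fraction of containing transactions is high enough.
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-item candidates and prune any
        # candidate that has an infrequent k-subset (the Apriori property).
        k += 1
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in level for s in combinations(union, k - 1)
            ):
                candidates.add(union)
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for subset in combinations(sorted(itemset), r):
                antecedent = frozenset(subset)
                # confidence(A -> B) = support(A and B) / support(A)
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append(
                        (antecedent, itemset - antecedent, support, confidence)
                    )
    return rules


# Hypothetical toy data: one set of highly expressed genes per sample.
samples = [
    {"G1", "G2", "G3"},
    {"G1", "G2"},
    {"G1", "G3"},
    {"G2", "G3"},
    {"G1", "G2", "G3", "G4"},
]

frequent = apriori(samples, min_support=0.5)
rules = association_rules(frequent, min_confidence=0.7)

for antecedent, consequent, support, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent),
          f"support={support:.2f} confidence={confidence:.2f}")
```

In this toy data G4 appears in only one sample and is discarded at the first level, so no candidate containing it is ever counted. That pruning is what keeps the candidate space manageable as the number of genes grows.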

