Volume 3, Special Issue 1, ICSTSD 2016
Heart Disease Prediction Using Soft Computing
Pallavi A Agrawal
Student of Master of Engineering (Computer Science & Engineering)
Raisoni College of Engineering and Management Amravati, India.
Prof. Nitin R Chopde
Assistant Professor, Computer Science & Engineering Department
Raisoni College of Engineering and Management Amravati, India.
Abstract: A large number of people are affected by heart disease, and it has become a major problem. One such threat is Cardiovascular Disease (CVD). If it is not detected and treated at an early and proper stage, it may lead to death. According to the World Health Organization (WHO), it is the leading cause of death in men as well as women. Various heart diseases are caused by a reduction in the supply of blood and oxygen. The health care industry now generates a large amount of complex clinical data about patients and other hospital resources, yet in the medical sector there is no adequate research focus on effective analysis tools to discover trends in this data. Many data mining techniques are used to analyze this rich collection of data from different perspectives and to derive useful information. The main objective of our paper is therefore to predict more accurately the percentage possibility of cardiovascular disease using a minimum number of attributes (such as blood pressure, cholesterol, type of chest pain, and blood sugar). In this paper, a system for the diagnosis and prediction of heart disease is designed and developed based on a soft computing technique, ANFIS.
Keywords: ANFIS; Data mining; Soft Computing; Fuzzy logic; Artificial Neural Network.
I. INTRODUCTION
As the heart is an important organ of our body, our life depends on its efficient working. Many parts of the body, such as the liver, kidneys, and brain, are also affected by improper working of the heart. If the normal working of the heart is changed by some disease, the whole body is disturbed. Various factors increase the risk of heart disease. Some of them are high blood pressure (hypertension), cholesterol, family history of heart disease, obesity, and smoking. In today's modern world, cardiovascular diseases are among the most prevalent diseases, and every year 12 million deaths occur worldwide due to heart problems. In India, too, cardiovascular diseases cause many casualties, and their diagnosis is a very difficult process. Normally, these diseases are analyzed using the intuition of the medical specialist, and it would be highly beneficial if the techniques used for analysis were improved with a medical information system. If a decision support or computer-based information system is developed at reduced cost, it will be helpful for accurate diagnosis. Integrating data mining techniques with an existing medical decision support system requires a comparison of different data mining techniques to find those that extract suitable data for the job.
Soft Computing is a multidisciplinary system formed by the fusion of the fields of Fuzzy Logic, Neuro-computing, Evolutionary Computing, and Probabilistic Computing. It is a combination of methodologies designed to model and enable solutions to real-world problems that are not modeled, or are too difficult to model, mathematically.
Several data mining and soft computing approaches are used in the prediction of heart disease:
1. Association
2. Clustering
3. Data Visualization
4. Decision Tree
5. Linear Regression
6. Fuzzy Logic
7. Artificial Neural Network
A. Association
Association is one of the basic and most widely used data mining terms. Association mining identifies the relationships between the attributes of a dataset as well as the relationships between its records. It is one of the major modeling approaches that help to identify the valid dataset at the initial stage and to remove the records or attributes having less association. Association mining is also used as a pre-processing stage to filter the dataset by removing impurities and keeping the most valuable data values. Its use is not limited to this, however. It is also appropriate in other mining operations such as classification [13].
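As an illustration, the strength of an association between attributes is commonly measured by support and confidence. The following is a minimal sketch; the patient records, the attribute names, and the helper functions `support` and `confidence` are hypothetical and are not taken from the dataset used in this work.

```python
from typing import FrozenSet, List

# Hypothetical patient records: each record is the set of attributes present.
records: List[FrozenSet[str]] = [
    frozenset({"high_bp", "high_cholesterol", "heart_disease"}),
    frozenset({"high_bp", "heart_disease"}),
    frozenset({"high_cholesterol"}),
    frozenset({"high_bp", "high_cholesterol", "heart_disease"}),
    frozenset({"smoker"}),
]


def support(itemset: FrozenSet[str], data: List[FrozenSet[str]]) -> float:
    """Fraction of records that contain every item in `itemset`."""
    return sum(1 for record in data if itemset <= record) / len(data)


def confidence(antecedent: FrozenSet[str], consequent: FrozenSet[str],
               data: List[FrozenSet[str]]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(antecedent, data)
    return support(antecedent | consequent, data) / base if base > 0 else 0.0


# Rule under test (illustrative only): high_bp -> heart_disease
rule_support = support(frozenset({"high_bp", "heart_disease"}), records)
rule_confidence = confidence(frozenset({"high_bp"}),
                             frozenset({"heart_disease"}), records)
```

Attributes whose rules have low support or confidence are candidates for removal in the pre-processing stage described above.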
B. Clustering
[...] scientific approach so that similar kinds of data are kept in one cluster. The similarity between data items is analyzed using distance-based measures such as Euclidean distance. There are a number of clustering approaches, such as C-Means and K-means clustering.
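The role of Euclidean distance in K-means clustering can be sketched as follows. This is a minimal illustration, not the implementation used in this work; the two-attribute patient vectors (blood pressure, cholesterol), the initial centroids, and all function names are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance between two equal-length attribute vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(points: List[Point], centroids: List[Point]) -> List[int]:
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda k: euclidean(p, centroids[k]))
            for p in points]


def update(points: List[Point], labels: List[int],
           centroids: List[Point]) -> List[List[float]]:
    """Move each centroid to the mean of its members (unchanged if empty)."""
    moved = []
    for k, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if members:
            moved.append([sum(col) / len(members) for col in zip(*members)])
        else:
            moved.append(list(old))
    return moved


def k_means(points: List[Point], centroids: List[Point],
            iterations: int = 10) -> Tuple[List[int], List[Point]]:
    """Alternate assignment and centroid update for a fixed number of rounds."""
    for _ in range(iterations):
        labels = assign(points, centroids)
        centroids = update(points, labels, centroids)
    return assign(points, centroids), centroids


# Hypothetical (blood pressure, cholesterol) readings for four patients.
patients = [(120.0, 180.0), (125.0, 190.0), (160.0, 260.0), (165.0, 270.0)]
labels, centres = k_means(patients, [patients[0], patients[3]])
```

In practice the attributes would be normalized first, since unscaled attributes with large ranges dominate the distance.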
C. Data Visualization
Visualization is the approach of presenting results in an effective way so that any stakeholder can easily derive a conclusion by viewing them. The data is transformed into pictorial form such as graphs, tables, and charts. This is a management-level presentation approach for representing conclusions drawn from data.
D. Decision Tree
A decision tree, as the name suggests, is a tree-based approach in which decisions are represented by parent nodes and the associated events are represented by child nodes. This kind of algorithm is used mainly to perform data classification. The decision to accept or reject the data is taken under a rule defined at the parent node.
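A small hand-built tree illustrates how a rule at each parent node routes a record to a child node until a class label is reached. The attribute names and thresholds below are hypothetical and are not clinical values; a real tree would be learned from training data.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Node:
    """Parent node holding the rule `record[attribute] <= threshold`."""
    attribute: str
    threshold: float
    left: Union["Node", str]   # followed when the rule holds
    right: Union["Node", str]  # followed when the rule fails


def classify(node: Union[Node, str], record: Dict[str, float]) -> str:
    """Walk from the root to a leaf; leaves are plain class labels."""
    while isinstance(node, Node):
        if record[node.attribute] <= node.threshold:
            node = node.left
        else:
            node = node.right
    return node


# Illustrative tree with assumed thresholds.
tree = Node("cholesterol", 240.0,
            left="normal",
            right=Node("blood_pressure", 140.0,
                       left="normal",
                       right="heart disease"))

result = classify(tree, {"cholesterol": 260.0, "blood_pressure": 150.0})
```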
E. Linear Regression
Linear regression is another statistical approach that works as a filtering as well as an analytical approach to predict data values. Regression is the analysis of one attribute with respect to one or more other attributes of the same dataset.
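For a single predictor, the regression line can be fitted by ordinary least squares. The sketch below assumes made-up (age, blood pressure) pairs and a hypothetical helper `fit_line`; it only illustrates how one attribute is analyzed with respect to another.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up data: age in years against blood pressure.
ages = [30.0, 40.0, 50.0, 60.0]
pressures = [115.0, 120.0, 125.0, 130.0]

slope, intercept = fit_line(ages, pressures)
predicted_at_45 = slope * 45.0 + intercept
```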
F. Fuzzy Logic
This methodology is effective for deriving predictive analysis on a single record as well as on a large dataset. In the present work, a fuzzy-based approach is used to predict the patient's disease. As it is an intelligent soft computing approach, it can represent the probabilistic relation based on analysis of the patient's symptoms. As the work is rule based, the interrelated variables can be estimated easily, which helps in understanding the approach followed by the fuzzy analysis. The validity of the algorithm implementation depends on the validity of the collected dataset. The first step is to define a valid dataset with a large amount of data. In the present work we define a patient dataset containing the patient's basic details as well as the patient's symptoms. The tuples of the database are divided into partitions, not necessarily of equal size. Once we get the normalized dataset, the next task is to apply the fuzzy rules to this dataset to perform the classification.
The fuzzy system performs the work in three main steps by acquiring the domain knowledge:
1. Identify the attributes or variables that are most associated with some event.
2. Identify the relationships between these attributes so that the flow can be defined.
3. Define a probabilistic decision over the variables defined in the input data set so that the links between the most related symptom attributes can be identified [5].
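A single fuzzy rule of the kind described above can be sketched with triangular membership functions, using the minimum operator for AND. The membership breakpoints, the rule, and the function names are illustrative assumptions and do not reproduce the rule base of this work.

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership: 0 at a and c, rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def high_blood_pressure(value: float) -> float:
    # Assumed breakpoints, not clinical thresholds.
    return triangular(value, 120.0, 160.0, 200.0)


def high_cholesterol(value: float) -> float:
    # Assumed breakpoints, not clinical thresholds.
    return triangular(value, 200.0, 280.0, 360.0)


def rule_high_risk(blood_pressure: float, cholesterol: float) -> float:
    """IF blood pressure is high AND cholesterol is high THEN risk is high.

    The firing strength uses min() as the fuzzy AND.
    """
    return min(high_blood_pressure(blood_pressure),
               high_cholesterol(cholesterol))


strength = rule_high_risk(blood_pressure=140.0, cholesterol=260.0)
```

Here the blood pressure membership (0.5) is lower than the cholesterol membership (0.75), so the rule fires with strength 0.5.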
G. Artificial Neural Network
In computer science and related fields, artificial neural networks are computational models inspired by animal central nervous systems (in particular the brain) that are capable of machine learning and pattern recognition. They are usually presented as systems of interconnected "neurons" that can compute values from inputs by feeding information through the network. For example, in a neural network for handwriting recognition, a set of input neurons may be activated by the pixels of an input image representing a letter or digit. The activations of these neurons are then passed on, weighted and transformed by some function determined by the network's designer, to other neurons, etc., until finally an output neuron is activated that determines which character was read. Like other machine learning methods, neural networks have been used to solve a wide variety of tasks that are hard to solve using ordinary rule-based programming, including computer vision and speech recognition.
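The feeding of information through weighted, interconnected neurons can be sketched as a forward pass through one hidden layer. The network size, the sigmoid activation, and all weights below are arbitrary assumptions chosen for illustration; in practice the weights are learned from data.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs: Sequence[float], weights: Sequence[Sequence[float]],
          biases: Sequence[float]) -> List[float]:
    """Each neuron outputs sigmoid(weighted sum of inputs + bias)."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(inputs: Sequence[float],
            hidden_w: Sequence[Sequence[float]], hidden_b: Sequence[float],
            out_w: Sequence[Sequence[float]], out_b: Sequence[float]) -> float:
    """Pass activations from the input layer to a single output neuron."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, out_w, out_b)[0]


# Two normalized inputs, two hidden neurons, one output (assumed weights).
risk = forward([0.7, 0.9],
               hidden_w=[[1.5, -0.5], [0.8, 1.2]], hidden_b=[0.0, -0.3],
               out_w=[[1.0, 2.0]], out_b=[-1.0])
```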
II. LITERATURE SURVEY
Work done on heart disease prediction using data mining techniques and fuzzy logic is discussed below.
Mai Shouman, Tim Turner and Rob Stocker [1] presented various single and hybrid data mining techniques in heart disease prediction. The use of single data mining techniques for heart disease has been thoroughly investigated and shows considerable levels of accuracy. Recent investigation shows that hybridizing more than one technique gives enhanced results in diagnosis. The authors apply various data mining techniques, such as naive Bayes and decision tree, to various heart disease datasets and measure their accuracy [8]. Applying hybrid data mining techniques to different heart disease datasets gives different accuracies [11], [16]. When the two approaches are compared on heart disease datasets, hybrid techniques show better accuracy than single data mining techniques.
Dhanashree S. Medhekar, Mayur P. Bote and Shruti D. Deshmukh [2] present a classifier approach for heart disease prediction and show how Naïve Bayes classification can be used for this purpose. The proposed system categorizes medical data into five distinct categories, namely no, low, average, high and very high. The analysis is done on the Cleveland dataset [18], applying classification techniques to predict group membership for data instances [15].
[...] three classifiers, namely naive Bayes, decision tree and bagging algorithm, are used to predict the diagnosis of heart disease patients with the same result as obtained before the reduction of all parameters.
Nikita, Madan lal Yadav [4] have proposed a system which uses a fuzzy rule-based approach for prediction of the patient's disease. This methodology can be used to derive predictive analysis on a single record as well as on a large dataset [13]. As it is an intelligent soft computing approach, it can represent the probabilistic relation based on analysis of the patient's symptoms. As the work is rule based, the interrelated variables can be estimated easily, which helps in understanding the approach followed by the fuzzy analysis.
Prachi Jambhulkar and Vaidehi Baporikar [7] present a real-time WSN system for prediction and monitoring of upcoming cardiovascular diseases [6]. The system is capable of monitoring multiple patients at a time and delivers remote diagnosis and prescription to the patients. It also provides fast and effective warnings to doctors, relatives and hospitals. This paper suggests the idea of using a wireless sensor network: the model can be enhanced and expanded by combining a WSN system with data mining techniques to obtain better accuracy and more real-time data sets in the prediction of various cardiac diseases.
Aqueel Ahmed and Shaikh Abdul Hannan [5] present a study of various cardiovascular diseases through data mining, genetic algorithm, support vector machine (SVM), rough set theory, association rules and neural networks. The study found that, among these techniques, decision tree and SVM are the most effective for cardiac disease [12]. Hence, data mining can be helpful in identifying high or low risk of cardiac disease. A comparison of three classification techniques in data mining, namely naïve Bayes, decision tree and classification via clustering [10], for predicting cardiac disease with a reduced number of attributes has been presented by Shamsher.
III. PROPOSED METHODOLOGY
The idea of soft computing was initiated in 1981 when Lotfi A. Zadeh published his first paper on soft data analysis [“What is Soft Computing”, Soft Computing, Springer-Verlag Germany/USA, 1997]. Zadeh defined Soft Computing as one multidisciplinary system formed by the fusion of the fields of Fuzzy Logic, Neuro-Computing, Evolutionary and Genetic Computing, and Probabilistic Computing. Soft Computing is the fusion of methodologies designed to model and enable solutions to real-world problems that are not modeled, or are too difficult to model, mathematically. The aim of soft computing is to exploit the tolerance for imprecision, uncertainty, approximate reasoning, and partial truth in order to achieve close resemblance with human-like decision making. Soft Computing is still growing and developing. The guiding principle of soft computing is to exploit this tolerance to achieve tractability, robustness and low solution cost. In effect, the role model for soft computing is the human mind [15].
Here we use ANFIS (Adaptive Neuro-Fuzzy Inference System) as our soft computing method, which is a combination of fuzzy logic and an artificial neural network. It is called adaptive because it is a closed-loop system; that is, back propagation is built into ANFIS. In a closed-loop system, the output of each stage is given as input to the next stage, and the output of the last stage is fed back to the first stage until the required output is obtained. This is what is called the back propagation algorithm. In ANFIS, the structure of an artificial neural network is used, with fuzzy rules defined in it. The basic block diagram of ANFIS is shown in Figure 1.
Figure 1 Basic diagram of ANFIS
A. ANFIS Algorithm in our system
1. Initialize the fuzzy system.
2. Use genfis1 command.
3. Give other parameters for learning.
4. Declare number of epochs (iterations) and tolerance (error).
5. Start learning process (use anfis command).
6. Stop when least tolerance is achieved.
7. Validate with independent data.
8. End
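The control flow of these steps (initialize, train for a bounded number of epochs, stop at the error tolerance, validate on independent data) can be sketched in pure Python. This is a deliberately simplified stand-in and not the genfis1/anfis commands named above: it assumes one input, two rules with fixed Gaussian membership functions, constant (zero-order Sugeno) rule outputs, and plain gradient descent on the rule outputs only, whereas ANFIS also adapts the membership parameters. All names, data, and settings are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def gaussian(x: float, center: float, sigma: float) -> float:
    """Gaussian membership grade of x for a fuzzy set centred at `center`."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def normalised_strengths(x: float, centers: Sequence[float],
                         sigma: float) -> List[float]:
    """Membership grades divided by their sum (assumes the sum is non-zero)."""
    grades = [gaussian(x, c, sigma) for c in centers]
    total = sum(grades)
    return [g / total for g in grades]


def predict(x: float, centers: Sequence[float], sigma: float,
            consequents: Sequence[float]) -> float:
    """Weighted sum of the constant rule outputs."""
    weights = normalised_strengths(x, centers, sigma)
    return sum(w * c for w, c in zip(weights, consequents))


def train(data: Sequence[Tuple[float, float]], centers: Sequence[float],
          sigma: float, epochs: int, tolerance: float,
          learning_rate: float = 0.5) -> Tuple[List[float], float]:
    """Gradient descent on the rule outputs; stops early at `tolerance`."""
    consequents = [0.0] * len(centers)        # step 1: initialise
    rmse = float("inf")
    for _ in range(epochs):                   # step 4: epoch limit
        squared_error = 0.0
        for x, target in data:                # step 5: learning pass
            weights = normalised_strengths(x, centers, sigma)
            error = target - sum(w * c for w, c in zip(weights, consequents))
            squared_error += error * error
            for i, w in enumerate(weights):
                consequents[i] += learning_rate * error * w
        # Error accumulated during this epoch, while parameters were updating.
        rmse = math.sqrt(squared_error / len(data))
        if rmse <= tolerance:                 # step 6: stop at tolerance
            break
    return consequents, rmse


# Hypothetical normalised risk scores: 0 = normal, 1 = heart disease.
training = [(0.1, 0.0), (0.2, 0.0), (0.8, 1.0), (0.9, 1.0)]
validation = [(0.15, 0.0), (0.85, 1.0)]      # step 7: independent data
centers, sigma = [0.0, 1.0], 0.3

consequents, final_rmse = train(training, centers, sigma,
                                epochs=200, tolerance=0.05)
predicted = [1.0 if predict(x, centers, sigma, consequents) >= 0.5 else 0.0
             for x, _ in validation]
```

The 0.5 cut-off used to turn the output into a normal or heart disease label is also an assumption of this sketch.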
B. The Block diagram of ANFIS
[...] minimum error is achieved. At a certain stage the difference in error between successive epochs becomes negligible, which means our target is achieved. Finally, validation is done with independent data, and in our case the data is classified as a normal person or a heart disease patient [17].
Figure 2 Framework of ANFIS used in proposed method
IV. RESULT ANALYSIS
The following section compares the accuracy of the various methodologies proposed for prediction of heart disease with that of the proposed system. The corresponding tables and graphs are given below. Table 1 and Figure 3 show the accuracy of heart disease prediction using the Trop I/T test and ECG images, while Table 2 and Figure 4 compare the accuracy of the methodologies in the present system with that of soft computing, which is used in the proposed system.
Table 1 Accuracy using Trop I/T and ECG Images
Parameters Used for Heart Disease Prediction | Accuracy of Proposed System
Using Trop I/T Test | 99 %
Using ECG Image | 96 %
Figure 3 Accuracy graph using Trop I/T test and ECG image.
Table 2 Comparison of accuracy between existing system and proposed system.
Parameters Used (for all methods): Age (years), Gender, Chest Pain Location, Peak Exercise Blood Pressure, Cholesterol, Fasting Blood Sugar, Resting Electrocardiographic, Maximum Heart Rate, Exercise Induced Angina, ST Depression, Slope of Peak Exercise ST Segment, Number of Major Vessels Colored by Fluoroscopy, Thal
Method | Accuracy
Naïve Bayes | 78.56 %
Decision tree | 75.73 %
Genetic with Naïve Bayes | 96.5 %
Proposed system | 95 %
Figure 4 Accuracy comparison graph between present system and proposed system
V. CONCLUSION
After studying various data mining and soft computing techniques, we conclude in this paper that ANFIS is a better prediction method for heart disease. The existing system is not sufficient for predicting heart disease without medical experts, and its accuracy level is also not very high. The proposed model for a heart disease prediction system using soft computing gives more accurate and precise results. It is helpful in several respects: it can distinguish a normal person from a heart disease patient without frequent visits to the hospital, and it can work with the large standardized databases that may be held by certain hospitals or universities. The system can be used on a personal basis as well as in big hospitals.
With the help of fuzzy rules and an artificial neural network under ANFIS, our system gives a nearly accurate prediction of heart disease in a patient using the various attributes responsible for it.
VI. FUTURE SCOPE
In this system for heart disease prediction using soft computing, we have implemented ANFIS, in which fuzzy rules are defined in a neural network structure for training and testing on the data and for classifying patients into a normal or abnormal class.
In future, an attempt can be made to use other recent techniques or tests for prediction of heart disease, and to use a large number of patient ECG images in the databases, in order to achieve higher accuracy than our proposed system.
References
[1] Mai Shouman, Tim Turner, and Rob Stocker “Using Data Mining Technique in Heart Disease Diagnosis and Treatment,” 2012 Japan-Egypt Conference on Electronics, Communications and Computers, 978-1-4673-0484-9/12 © 2012 IEEE, pp. 173-177.
[2] Dhanashree S. Medhekar, Mayur P. Bote, Shruti D. Deshmukh “Heart Disease Prediction System using Naïve Bayes,” International Journal of Enhanced Research In Science Technology & Engineering, ISSN NO: 2319-7423, VOL. 2 ISSUE 3, MARCH-2013, pp. 1-5.
[3] Vikas Chaurasia and Saurabh Pal “Data Mining Approach to Detect Heart Diseases,” International Journal of Advanced Computer Science and Information Technology (IJACSIT) Vol. 2, No. 4, 2013, Page: 56-66, ISSN: 2296-1739.
[4] Nikita, Madan lal Yadav “A Fuzzification Approach for Prediction of Heart Disease,” International Journal of Engineering Trends and Technology (IJETT), Volume 4 Issue 5, May 2013. ISSN: 2231-5381.
[5] Aqueel Ahmed, Shaikh Abdul Hannan “Data Mining Techniques to Find Out Heart Diseases: An overview,” International Journal of Innovative Technology and Exploring Engineering (IJITEE) ISSN: 2278-3075, Volume-1, Issue-4, September 2012, pp. 18-23.
[6] Deepali Chandna “Diagnosis of Heart Disease Using data Mining Algorithm,” International Journal of Computer Science and Information Technologies (IJCSIT), Vol. 5(2), 2014 pp. 1678-1680.
[7] Prachi Jambhulkar, Vaidehi Baporikar “Review on Prediction of Heart Disease Using Data Mining Technique with Wireless Sensor Network,” International Journal of Computer Science and Applications, Vol. 8, No. 1, Jan-Mar 2015, ISSN: 0974-1011
[8] Nidhi Bhatala, Kiran Jyothi, “An Analysis of Heart Disease Prediction using Different Data Mining Techniques,” International Journal of Engineering Research Technology (IJERT) ISSN: 2278-0181 Vol.1, Issue-8, October-2012.
[9] Vikas Chaurasia and Saurabh Pal “Early Prediction Heart Disease Using Data Mining Techniques,” Caribbean Journal of Science and Technology, 2013, ISSN 0799-3757, Vol.1, pp. 208-217.
[10] Shankar B. Patil, “Intelligence and Effective Heart Disease Prediction System Using Data Mining And Artificial Neural Network,” European Journal of Scientific Research ISSN: 0975-3397 Vol. 3 No. 6, June 2011.
[11] M. A. Jabbar, “Knowledge Discovery from Mining Association Rule for Heart Disease Prediction,” Journal of Theoretical and Applied Information Technology ISSN: 1992-8643, E- ISSN: 1817-3195, 2005.
[12] T. Srinivasan. “Knowledge Discovery in Clinical Databases with Neural Network Evidence Combination”.
[13] T. Revathi, S. Jeevitha, “Comparative Study of Heart Disease Prediction System Using Data Mining Techniques,” International Journal of Science and Research (IJSR) ISSN: 2319-7064.
[14] Abhishek Taneja, “Heart Disease Prediction System Using Data Mining Techniques,” Oriental Journal of Computer Science and Technology, ISSN: 0974-6471.
[15] Dr. Hari Ganesh S., Gajenthiram M, “Comparative Study of Data Mining Approaches for Prediction of Heart Disease,” IOSR Journal of Engineering ISSN (e): 2250-3021, ISSN: 0974-6471.
[16] K. Sundhakar, Dr. M. Manimekalai, “Study of Heart Disease Prediction using Data Mining,” International Journal of Advanced Research on Computer Science and Software Engineering, ISSN: 2277 128X.
[17] www.myreaders.info/html/soft_computing.html
[18] Ms. Shinde S. B., Prof. Amrit Priadarshi, “Decision Support System on Prediction Using Data Mining Techniques,” iPGCON-2015.
[19] Jyoti Soni, Ujma Ansari, Dipesh Sharma, Sunita Soni, “Prediction Data Mining for Medical Diagnosis: An Overview of Heart Disease Prediction,” International Journal of Computer Application (IJCA) ISSN: 0975-8887.