ORIGINAL RESEARCH article

Front. Comput. Neurosci., 07 October 2022
This article is part of the Research Topic Computational Intelligence for Signal and Image Processing

Heart disease detection based on internet of things data using linear quadratic discriminant analysis and a deep graph convolutional neural network

  • 1Department of ECE, Koneru Lakshmaiah Education Foundation, Green Fields, Vaddeswaram, Andhra Pradesh, India
  • 2Department of Mathematics and Computer Science, Brandon University, Brandon, MB, Canada
  • 3Research Centre for Interneural Computing, China Medical University, Taichung, Taiwan
  • 4Department of Mathematics and Computer Science, Lebanese American University, Beirut, Lebanon
  • 5Western Norway University of Applied Science, Bergen, Norway

Heart disease is a growing health concern; according to the WHO, around 10 billion people are affected by heart abnormalities every year. Arteries carry oxygenated blood from the heart to all parts of the body; however, blood vessels sometimes become clogged or restricted due to cardiac issues. Existing heart diagnosis applications are outdated and suffer from poor performance, so an intelligent heart disease diagnosis application is required. In this research work, a heart diagnosis application is designed that combines internet of things (IoT) sensor data with deep learning. The heart disease IoT sensor data is collected from the free, open-source University of California Irvine machine learning repository and is used to train the deep graph convolutional network (DG_ConvoNet). The testing data has been collected from the Cleveland Clinic Foundation; it is a collection of 350 real-time clinical instances from heart patients gathered through IoT sensors. The K-means technique is employed to remove noise from the sensor data and to cluster the unstructured data. Features are extracted using Linear Quadratic Discriminant Analysis. DG_ConvoNet is then used to classify and predict heart diseases. With the proposed model, the diagnostic application achieves an accuracy of 96%, sensitivity of 80%, specificity of 73%, precision of 90%, F-Score of 79%, and area under the ROC curve of 75%.

Introduction

According to the WHO, cardiovascular disease (CVD) is a significant cause of death worldwide, with 17.8 million deaths every decade (Rath et al., 2021). The American Cardiac Organization (Zhang and Xu, 2021) specifies detailed indications such as sleep disorders, a slight increase in pain, a drop in heart rate and rapid weight gain (up to 1.5-2.5 kg per 7 days) (Vincent Paul et al., 2021). More study data and patient records from hospitals become available as time goes on. Machine learning (ML) and artificial intelligence (AI) are now widely recognized as able to play a vital role in the medical industry. ML and deep learning (DL) methods are often used to diagnose conditions and to classify or anticipate results. ML algorithms can complete an examination of genetic data in a short amount of time. Medical records are processed and analyzed more thoroughly to improve predictions, and methods are trained for pandemic prediction (Liu et al., 2022). Heart disorders include congenital, coronary and rheumatic conditions, and 370,000 Americans died of coronary heart disease (HD) type heart attacks in 2015. Americans spend $250 billion USD annually on HD diagnosis and treatment. According to the American Heart Association, it will be possible to predict HD disorders by 2030.

Exercise stress tests, chest X-rays, CT scans, MRI, coronary angiograms, and electrocardiograms (ECG) are currently used to diagnose the severity of HD in patients. Patients need early and precise diagnoses of coronary HD to receive timely and effective treatment and boost their chances of long-term survival. Unfortunately, cardiovascular specialists may not be available to perform these diagnostic tests in many resource-limited places worldwide. Missed diagnoses, incorrect diagnoses, and inappropriate therapies put patients' health in danger in many circumstances. In addition, early detection of HD enables preventative interventions such as drugs, lifestyle changes, angioplasty, or surgery, which can help to slow disease development and minimize morbidity (Morris and Lopez, 2021). As a result, precise and timely heart disease diagnostics are critical for lowering mortality and enhancing long-term survival rates in patients. Because early detection of coronary HD is challenging, computer-assisted techniques for detecting and diagnosing heart disease have been developed. ML methods that analyze and evaluate clinical data and diagnose medical conditions are becoming increasingly common in medical institutions.

The research contributions of this paper are as follows:

1. Collect internet of things (IoT) sensor-based heart disease data for the detection of heart disease using a deep learning architecture.

2. Process input data for noise removal and cluster the data using K-means clustering.

3. Extract the features using Linear Quadratic Discriminant Analysis.

4. Classify the extracted data using a deep graph convolutional network (DG_ConvoNet).

Appendix

Internet of things (IoT), World Health Organization (WHO), cardiovascular disease (CVD), Receiver Operating Characteristic Curve (ROC), Machine learning (ML), Artificial Intelligence (AI), Deep learning (DL), Heart-Disease (HD), Support Vector Machine (SVM), Heart Rate Variability (HRV), Convolution Neural Network (CNN), Magnetic Resonance Imaging (MRI), Deep Graph Convolutional Network (DG_ConvoNet).

Related work

This section briefly reviews recent research on heart disease prediction using IoT sensor data. Feature extraction, classification and prediction are the major steps in intelligent algorithms. Manogaran et al. (2017) utilized a variety of big data methods to detect cardiac illness, together with hyperparameter tuning to improve the accuracy of the results. Kanksha Aman et al. (2021) employed generalized discriminant analysis to extract nonlinear HD features. A binary classifier with extreme ML was used to reduce overfitting and improve training time in heart disorder prediction. For detecting coronary HD, an accuracy of only 73% was attained. Divya et al. (2021) classified heart rate variability as an arrhythmia. Heart abnormalities were classified with a multilayer perceptron neural network, and 91% accuracy was reached by reducing features using Gaussian Discriminant Analysis; hidden features were not included in that work. Hasan and Bhattacharjee (2019) employed Gaussian discriminant analysis to reduce HRV signal characteristics to 15 and an SVM classifier to obtain 70% precision; this work cannot handle unstructured sensor data from IoT networks. Huang et al. (2019) proposed an enhanced CNN model that achieved 92.35% accuracy; the main limitation of this work is its STFT-based spectrogram analysis. The STFT model is old and faces clustering issues when applied to large datasets. Classification is a complex step in predicting heart diseases from IoT sensor data; Wang et al. (2020) addressed this challenge with a CNN-based technique developed for fruit classification. According to the researchers, earlier HD detection methods have lower classification accuracy, which their method improves upon. Zhang et al. (2019) presented a comprehensive description of multimodal data fusion of heart-related sensor data. A combination of CT, MRI, PET, optical imaging and radionuclide datasets yielded a complete radiological pathology of heart disorders. The image fusion-based approach has been found to improve clinical diagnosis in recent years but fails under emergency diagnosis conditions. Zhang et al. (2020a) implemented a CNN-based diagnosis algorithm that uses stochastic pooling together with optimization of the CNN hyperparameters. The major drawback of this study is that neuroimage orientation varies from patient to patient, so the HD abnormality detection rate changes drastically when a new image is applied to the designed application.

The methods listed in Table 1 have low operational sensitivity, specificity, and accuracy. Zhang et al. (2020b) introduced FGCNet, which extracts HD features from GCN and CNN models. This method is used to predict heart disorders from chest CT scans but fails when noisy CT radiology images are given as test input. FGCNet is said to aid quick COVID-19 detection from chest CT scans. Wang et al. (2021a) presented the CCSHNet method for heart disorder detection, which uses deep fusion. The CCSHNet models failed when large data samples were applied at the training stage. DCA and transfer learning-based models are critical for detecting HD in high-dimensional data. According to deep exhaustive analysis, CCSHNet is a viable option for detecting infectious heart illnesses, including COVID-19. The literature review shows that traditional ML-based arrhythmia detection methods using ECG signal analysis are outdated. However, few research works have been published on HD detection using ECG signals; DL techniques are trending, but IoT-related works are not yet efficient at predicting HD. Wang et al. (2021b) evaluated classification algorithms using an ML technique to predict cardiac disease. This work demonstrated that the bagging technique predicts HD with a good performance rate and accuracy level. HD prediction models superior to past techniques are necessary. Martins et al. (2021) offered a genetic approach for predicting human heart disease through echocardiography; the method is limited when applied to huge unstructured data. According to Ali et al. (2021) and Ladefoged et al. (2021), the implemented method might reduce the number of test cases required to detect HD issues. HD abnormality prediction based on radiology datasets is outdated, and newer IoT-based techniques are required. Saikumar et al. (2022) aimed to develop a precise categorization algorithm for accurately predicting cardiac disease, but it is unable to work on IoT sensor data. Saikumar and Rajesh (2020a,b) concluded that regression classification predicts HD more accurately than other techniques. Koppula et al. (2021) proposed R-C4.5, whose features are extracted with the given technique. The study by Garigipati et al. (2022) used its own equipment and found it very beneficial in the healthcare industry for ML-based prediction. The above discussion summarizes earlier HD prediction models and their limitations. It is clear that many cardiac diagnosis models face various low-level and high-level issues under dynamic conditions. This research work aims to solve some of the issues identified in the related works.

Table 1. Recent studies related to heart abnormality prediction.

System model

This section discusses the proposed DL technique, which is based on feature extraction and classification for heart disease diagnosis. The input data has been collected as IoT sensor data from a patient monitoring system.

The collected data is processed for noise removal using a cluster-based K-means algorithm. Gaussian noise present in the medical images is removed in this block. The clustered information is then used to extract features with Linear Quadratic Discriminant Analysis. Finally, the extracted features are classified using the DG_ConvoNet. The architecture of the proposed method is shown in Figure 1. The pre-processing unit performs image registration on the raw medical image data (University of California Irvine machine learning repository). The registration enhancement process aligns the image for de-noising. Speckle disturbances damage medical images and hinder the identification of the deep features needed for DL. Medical images are therefore de-speckled using a filtering technique to improve classification results.

Figure 1. Proposed IoT sensor data-based heart disease (HD) prediction.

K-means clustering

Since k represents the number of clusters, there are k centroids, one for every cluster. The Euclidean distance between each data point and each centroid is evaluated, and each data point is assigned to the centroid at the shortest Euclidean distance. An initial grouping is complete when no point is left unassigned. Then k new centroids are computed, and the iteration continues until the positions of the k centroids no longer change. In this stage, 256 clusters were created and processed for the centroid calculation.

Let $Y = \{x_1, x_2, x_3, \ldots, x_n\}$ be the set of data points and $Z = \{z_1, z_2, \ldots, z_c\}$ be the set of centers.

1. Arbitrarily choose 'c' cluster centers.

2. Evaluate the distance between each data point and the cluster centers.

3. Assign each data point to the cluster center at the shortest distance among all cluster centers.

4. Recompute each cluster center using Eq. (1):

$$z_i = \frac{1}{c_i} \sum_{j=1}^{c_i} x_j \tag{1}$$

where $c_i$ indicates the number of data points in the ith cluster and the sum runs over the data points assigned to that cluster.

5. Again, calculate the distance between every data point and the newly obtained cluster centers.

6. Stop if no data points were reassigned; otherwise, start over at step 3.
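The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `kmeans`, the seeded random initialisation from the data points, the iteration cap and the toy data are all assumptions made for the example.

```python
import numpy as np

def kmeans(points, c, n_iter=100, seed=0):
    """Plain k-means on an (n, d) array; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Step 1: arbitrarily pick c distinct data points as initial centers.
    centers = points[rng.choice(len(points), size=c, replace=False)].astype(float)
    labels = np.full(len(points), -1)
    for _ in range(n_iter):
        # Steps 2-3: Euclidean distance to every center, assign the nearest.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Stop when no data point was reassigned.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 4: each center becomes the mean of its assigned points (Eq. 1).
        for i in range(c):
            members = points[labels == i]
            if len(members):  # keep the old center if a cluster is empty
                centers[i] = members.mean(axis=0)
    return centers, labels

# Hypothetical toy data: two well-separated groups of three points.
data = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                 [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
centers, labels = kmeans(data, c=2)
```

On this toy input the two groups end up in different clusters regardless of which points are drawn as initial centers; real sensor data would normally need feature scaling first, which is outside the scope of this sketch.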

The flow chart of K-means clustering is shown in Figure 2, which explains the K-means flow with cluster extraction on the dataset. The centroid, Euclidean distance and particle estimation parameters provide detailed information about the dataset. The dataset consists of shape-based image features, which are processed by the K-means algorithm.

Figure 2. Flow chart K-means.

Linear quadratic discriminant analysis based feature extraction

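Let $S_b$ and $S_w$ be the between-class and within-class scatter matrices. The low-dimensional complement space of the null space of $S_b$, denoted $\mathcal{B}'$, is first extracted. Let $V_b = [v_{b1}, \ldots, v_{bM}]$ be the $M$ eigenvectors of $S_b$ corresponding to the $M$ non-zero eigenvalues $[\lambda_{b1}, \ldots, \lambda_{bM}]$, where $M = \min(C-1, J)$. The subspace $\mathcal{B}'$ is thus spanned by $V_b$, which is further scaled by $U = V_b \Lambda_b^{-1/2}$ so that $U^T S_b U = I$, where $\Lambda_b = \mathrm{diag}(\lambda_{b1}, \ldots, \lambda_{bM})$, $\mathrm{diag}(\cdot)$ indicates the diagonalization operator and $I$ is the $(M \times M)$ identity matrix. The regularized covariance estimate of the $i$th class in $\mathcal{B}'$ is given by Eq. (2):

$$\hat{\Sigma}_i(\alpha, \gamma) = (1-\gamma)\,\hat{\Sigma}_i(\alpha) + \frac{\gamma}{M}\,\mathrm{tr}\big[\hat{\Sigma}_i(\alpha)\big]\, I, \qquad \hat{\Sigma}_i(\alpha) = \frac{1}{C_i(\alpha)}\big[(1-\alpha)\, S_i + \alpha S\big] \tag{2}$$

where $M$ is the dimensionality of $\mathcal{B}'$, $C_i(\alpha) = (1-\alpha) C_i + \alpha N$, and $S_i$ is the covariance matrix of the $i$th class evaluated in $\mathcal{B}'$, i.e., $S_i = \sum_{j=1}^{C_i} (y_{ij} - \bar{y}_i)(y_{ij} - \bar{y}_i)^T$, $y_{ij} = U^T z_{ij}$, $\bar{y}_i = (1/C_i) \sum_{j=1}^{C_i} y_{ij}$ and $S = \sum_{i=1}^{C} S_i$. Here $C$ is the number of classes, $C_i$ the number of training samples in class $i$ and $N$ the total number of training samples.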
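The regularized covariance estimate of Eq. (2) can be illustrated with a short NumPy sketch. The function name `regularized_covariances`, the random sample matrix and the chosen values of alpha and gamma are assumptions for illustration only; the samples are taken to be already projected into the M-dimensional subspace.

```python
import numpy as np

def regularized_covariances(Y, labels, alpha, gamma):
    """Return {class id: Sigma_i(alpha, gamma)} following Eq. (2).

    Y: (N, M) array of projected samples y_ij; labels: (N,) class ids.
    """
    N, M = Y.shape
    classes = np.unique(labels)
    # Class scatter S_i about the class mean, and the pooled scatter S.
    S = {}
    for c in classes:
        D = Y[labels == c] - Y[labels == c].mean(axis=0)
        S[c] = D.T @ D
    S_pooled = sum(S.values())
    out = {}
    for c in classes:
        C_i = np.sum(labels == c)
        C_alpha = (1 - alpha) * C_i + alpha * N          # C_i(alpha)
        Sigma_a = ((1 - alpha) * S[c] + alpha * S_pooled) / C_alpha
        # Shrink towards a scaled identity with the same trace.
        out[c] = (1 - gamma) * Sigma_a + (gamma / M) * np.trace(Sigma_a) * np.eye(M)
    return out

# Hypothetical data: 12 samples, 3 dimensions, two classes.
rng = np.random.default_rng(0)
Y = rng.normal(size=(12, 3))
labels = np.array([0] * 5 + [1] * 7)
covs = regularized_covariances(Y, labels, alpha=0.3, gamma=0.2)
```

Two limiting cases make the role of the parameters visible: with alpha = 1 and gamma = 0 every class receives the same pooled covariance S/N, and with gamma = 1 each estimate collapses to a multiple of the identity.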

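Let $\Phi = [\phi(z_{11}), \ldots, \phi(z_{C C_C})]$ be the corresponding feature representations of the training samples in the kernel space $\mathbb{F}^F$. Let $K$ be the $N \times N$ Gram matrix, i.e., $K = (K_{lh})_{l=1,\ldots,C;\; h=1,\ldots,C}$, where $K_{lh}$ is a $C_l \times C_h$ sub-matrix of $K$ composed of samples from classes $\mathcal{Z}_l$ and $\mathcal{Z}_h$, i.e., $K_{lh} = (k_{ij})_{i=1,\ldots,C_l;\; j=1,\ldots,C_h}$, where $k_{ij} = k(z_{li}, z_{hj})$ and $k(\cdot)$ indicates the kernel function defined in $\mathbb{R}^J$. Let $\tilde{S}_b$ be the between-class scatter in $\mathbb{F}^F$, described as Eq. (3):

$$\tilde{S}_b = \frac{1}{N} \sum_{i=1}^{C} C_i\, (\bar{\phi}_i - \bar{\phi})(\bar{\phi}_i - \bar{\phi})^T \tag{3}$$

where $\bar{\phi}_i = (1/C_i) \sum_{j=1}^{C_i} \phi(z_{ij})$ is the mean of class $\mathcal{Z}_i$ in $\mathbb{F}^F$ and $\bar{\phi} = (1/N) \sum_{i=1}^{C} \sum_{j=1}^{C_i} \phi(z_{ij})$ is the mean of all training samples in $\mathbb{F}^F$.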

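Let $\tilde{V}_b = [\tilde{v}_{b1}, \ldots, \tilde{v}_{bM}]$ be the eigenvectors of $\tilde{S}_b$ corresponding to the $M$ largest eigenvalues. $\tilde{V}_b$ is obtained by solving the eigenvalue problem of $\tilde{S}_b$, which is represented as Eq. (4):

$$\tilde{S}_b = \sum_{i=1}^{C} \Big(\sqrt{\tfrac{C_i}{N}}\,(\bar{\phi}_i - \bar{\phi})\Big)\Big(\sqrt{\tfrac{C_i}{N}}\,(\bar{\phi}_i - \bar{\phi})\Big)^T = \sum_{i=1}^{C} \tilde{\phi}_i \tilde{\phi}_i^T = \Phi_b \Phi_b^T \tag{4}$$

where $\tilde{\phi}_i = \sqrt{C_i/N}\,(\bar{\phi}_i - \bar{\phi})$ and $\Phi_b = [\tilde{\phi}_1, \ldots, \tilde{\phi}_C]$. $\tilde{S}_b$ is a matrix of size $F \times F$, where $F$ indicates the kernel space dimensionality. Due to the high dimensionality of $\mathbb{F}^F$, a direct computation of the eigenvectors of $\tilde{S}_b$ is impossible. Instead, let $\tilde{e}_{bi}$ be the $i$th eigenvector of the $C \times C$ matrix $\Phi_b^T \Phi_b$ with eigenvalue $\lambda$. Multiplying $(\Phi_b^T \Phi_b)\,\tilde{e}_{bi} = \lambda\, \tilde{e}_{bi}$ from the left by $\Phi_b$ gives $(\Phi_b \Phi_b^T)(\Phi_b \tilde{e}_{bi}) = \lambda\,(\Phi_b \tilde{e}_{bi})$. Therefore, it is deduced that $(\Phi_b \tilde{e}_{bi})$ is the $i$th eigenvector of $\tilde{S}_b = \Phi_b \Phi_b^T$, and $\Phi_b^T \Phi_b$ is computed from the Gram matrix as Eq. (5):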

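$$\Phi_b^T \Phi_b = \frac{1}{N}\, B \Big( A_{NC}^T K A_{NC} - \frac{1}{N}\big(A_{NC}^T K 1_{NC}\big) - \frac{1}{N}\big(1_{NC}^T K A_{NC}\big) + \frac{1}{N^2}\big(1_{NC}^T K 1_{NC}\big) \Big) B \tag{5}$$

where $B = \mathrm{diag}[\sqrt{C_1}, \ldots, \sqrt{C_C}]$, $1_{NC}$ is an $N \times C$ matrix with all elements equal to 1, $A_{NC} = \mathrm{diag}[a_{C_1}, \ldots, a_{C_C}]$ is an $N \times C$ block diagonal matrix and $a_{C_i}$ is a $C_i \times 1$ vector with all elements equal to $1/C_i$. Let $\tilde{E}_{bM} = [\tilde{e}_{b1}, \ldots, \tilde{e}_{bM}]$ consist of the $M$ significant eigenvectors of $\Phi_b^T \Phi_b$ corresponding to the $M$ largest eigenvalues $\lambda_{b1} > \cdots > \lambda_{bM}$, and let $\tilde{V}_b = \Phi_b \tilde{E}_{bM}$. It is not difficult to derive that $\tilde{V}_b^T \tilde{S}_b \tilde{V}_b = \Lambda_b$, where $\Lambda_b = \mathrm{diag}[\lambda_{b1}^2, \ldots, \lambda_{bM}^2]$. Thus, the transformation matrix $\tilde{U}$ such that $\tilde{U}^T \tilde{S}_b \tilde{U} = I$ is evaluated as Eqs. (6), (7):

$$\tilde{U} = \tilde{V}_b \Lambda_b^{-1/2}, \qquad \tilde{V}_b = \Phi_b \tilde{E}_{bM} \tag{6}$$

$$\tilde{y}_{ij} = \tilde{U}^T \phi(z_{ij}) = \Lambda_b^{-1/2}\, \tilde{E}_{bM}^T\, \Phi_b^T \phi(z_{ij}) \tag{7}$$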

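where $\Phi_b^T \phi(z_{ij})$ can be expressed as Eq. (8):

$$\Phi_b^T \phi(z_{ij}) = \frac{1}{\sqrt{N}}\, B \Big( A_{NC}^T\, \nu(\phi(z_{ij})) - \frac{1}{N}\, 1_{NC}^T\, \nu(\phi(z_{ij})) \Big) \tag{8}$$

where $\nu(\phi(z_{ij})) = [\phi(z_{11})^T \phi(z_{ij}),\ \phi(z_{12})^T \phi(z_{ij}),\ \ldots,\ \phi(z_{C C_C})^T \phi(z_{ij})]^T$ is evaluated implicitly through the kernel function defined in $\mathbb{R}^J$, i.e., $\phi(z_{mn})^T \phi(z_{ij}) = k(z_{mn}, z_{ij})$.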

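The regularized covariance estimate of the $i$th class in the kernel feature space is then given by Eq. (9):

$$\begin{aligned}
\tilde{\Sigma}_i(\alpha, \gamma) &= (1-\gamma)\,\tilde{\Sigma}_i(\alpha) + \frac{\gamma}{M}\,\mathrm{tr}\big[\tilde{\Sigma}_i(\alpha)\big]\, I, \\
\tilde{\Sigma}_i(\alpha) &= \frac{1}{C_i(\alpha)}\big[(1-\alpha)\, \tilde{S}_i + \alpha \tilde{S}\big], \\
C_i(\alpha) &= (1-\alpha)\, C_i + \alpha N, \\
\tilde{S}_i &= \sum_{j=1}^{C_i} (\tilde{y}_{ij} - \bar{\tilde{y}}_i)(\tilde{y}_{ij} - \bar{\tilde{y}}_i)^T, \\
\tilde{S} &= \sum_{i=1}^{C} \tilde{S}_i, \\
\bar{\tilde{y}}_i &= (1/C_i) \sum_{j=1}^{C_i} \tilde{y}_{ij}
\end{aligned} \tag{9}$$

and $(\alpha, \gamma)$ is a pair of regularization parameters.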

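The key component in the evaluation of $\tilde{\Sigma}_i(\alpha, \gamma)$ is the covariance matrix of the $i$th class, i.e., $\tilde{S}_i$, which is given as Eq. (10):

$$\begin{aligned}
\tilde{S}_i &= \sum_{j=1}^{C_i} (\tilde{y}_{ij} - \bar{\tilde{y}}_i)(\tilde{y}_{ij} - \bar{\tilde{y}}_i)^T \\
&= \sum_{j=1}^{C_i} \tilde{y}_{ij} \tilde{y}_{ij}^T - \sum_{j=1}^{C_i} \bar{\tilde{y}}_i \tilde{y}_{ij}^T - \sum_{j=1}^{C_i} \tilde{y}_{ij} \bar{\tilde{y}}_i^T + \sum_{j=1}^{C_i} \bar{\tilde{y}}_i \bar{\tilde{y}}_i^T \\
&= \sum_{j=1}^{C_i} \tilde{y}_{ij} \tilde{y}_{ij}^T - C_i\, \bar{\tilde{y}}_i \bar{\tilde{y}}_i^T - C_i\, \bar{\tilde{y}}_i \bar{\tilde{y}}_i^T + C_i\, \bar{\tilde{y}}_i \bar{\tilde{y}}_i^T \\
&= \sum_{j=1}^{C_i} \tilde{y}_{ij} \tilde{y}_{ij}^T - C_i\, \bar{\tilde{y}}_i \bar{\tilde{y}}_i^T \\
&= J_1 - C_i \times J_2
\end{aligned} \tag{10}$$

where $J_1 = \sum_{j=1}^{C_i} \tilde{y}_{ij} \tilde{y}_{ij}^T$ and $J_2 = \bar{\tilde{y}}_i \bar{\tilde{y}}_i^T$. The detailed derivation of $J_1$ and $J_2$ is determined in Appendices A and B.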

The Mahalanobis distance between the feature representation $\tilde{q}$ of the test image and each class centre $\bar{\tilde{y}}_i$ is then used to identify the test image, i.e., $\mathrm{ID}(p) = \arg\min_i d_i(\tilde{q})$, which can be calculated in Eq. (11) as:

$$d_i(\tilde{q}) = (\tilde{q} - \bar{\tilde{y}}_i)^T\, \tilde{\Sigma}_i^{-1}(\alpha, \gamma)\, (\tilde{q} - \bar{\tilde{y}}_i) + \ln\big|\tilde{\Sigma}_i(\alpha, \gamma)\big| - 2 \ln \pi_i \tag{11}$$

where $\pi_i = C_i/N$.

$$\tilde{A} = \arg\max_{A} \frac{\big|A^T \tilde{S}_b A\big|}{\big|A^T \tilde{S}_b A\big| + \big|A^T \tilde{S}_w A\big|} \quad \text{when } \alpha = 1,\ \gamma = \frac{\mathrm{tr}(\tilde{S}_i/N) + M}{M}$$
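The decision rule of Eq. (11) can be sketched as follows. This is an illustrative example under stated assumptions: the function name `rdqda_predict`, the dictionary layout and the two toy classes with identity covariances are hypothetical, and the class means and covariances are assumed to have been estimated beforehand.

```python
import numpy as np

def rdqda_predict(q, means, covs, priors):
    """Return (class id minimising d_i(q) of Eq. (11), all scores)."""
    scores = {}
    for c in means:
        diff = q - means[c]
        # slogdet gives ln|Sigma| without overflowing the determinant.
        _, logdet = np.linalg.slogdet(covs[c])
        # Mahalanobis term, computed with a linear solve instead of an inverse.
        maha = diff @ np.linalg.solve(covs[c], diff)
        scores[c] = maha + logdet - 2.0 * np.log(priors[c])
    return min(scores, key=scores.get), scores

# Hypothetical two-class setup in a 2-D feature space.
means = {0: np.array([0.0, 0.0]), 1: np.array([3.0, 3.0])}
covs = {0: np.eye(2), 1: np.eye(2)}
priors = {0: 0.5, 1: 0.5}
label, scores = rdqda_predict(np.array([0.2, 0.1]), means, covs, priors)
```

With equal priors and identity covariances the rule reduces to nearest class centre, so the query point near the origin is assigned to class 0.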

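Classification using deep graph ConvoNet (convolutional network): DG_ConvoNet

$\mathcal{G} = (\mathcal{V}, \mathcal{E}, W)$ defines an undirected and connected graph, where $\mathcal{V}$ and $\mathcal{E}$ are finite sets of $|\mathcal{V}| = N$ vertices and of edges, respectively, and $W \in \mathbb{R}^{N \times N}$ is the weighted adjacency matrix. The variables at each vertex represent the graph signals. The graph Laplacian is $\mathcal{L} = D - W$, where $D = \mathrm{diag}(d_0, \ldots, d_{N-1})$ is the degree matrix built from the degrees $d_i = \sum_j W_{i,j}$ of vertex $i$. $\mathcal{L}$ has a set of orthonormal eigenvectors $\{\chi_l\}_{l=0}^{N-1}$ and nonnegative eigenvalues $0 \le \lambda_0 \le \cdots \le \lambda_{N-1}$. $\mathcal{L}$ is diagonalized by the matrix of eigenvectors $\mathcal{X} = [\chi_0, \cdots, \chi_{N-1}]$ such that $\mathcal{L} = \mathcal{X} \Lambda \mathcal{X}^T$, where $\Lambda$ is the diagonal matrix of eigenvalues.

Instead of complex exponentials, the eigenvectors $\{\chi_l\}_{l=0}^{N-1}$ of the Laplacian matrix $\mathcal{L}$, which satisfy the orthogonality condition, are used as decomposition bases for graph-structured data. The graph Fourier transform is defined as Eq. (12):

$$\hat{f}(\lambda_l) = \sum_{n=0}^{N-1} \chi_l(n)\, f(n), \qquad \hat{f} = \mathcal{X}^T f \tag{12}$$

The inverse Fourier transformation is shown in Eq. (13):

$$f(n) = \sum_{l=0}^{N-1} \hat{f}(\lambda_l)\, \chi_l(n), \qquad f = \mathcal{X} \hat{f} \tag{13}$$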
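As a small numerical check of Eqs. (12) and (13), the sketch below builds the Laplacian of a hypothetical 4-vertex cycle graph, takes its eigendecomposition and verifies that the forward and inverse graph Fourier transforms recover the signal. The graph and the signal values are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical graph: a 4-vertex cycle with unit edge weights.
W = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
D = np.diag(W.sum(axis=1))      # degree matrix
L = D - W                       # graph Laplacian

# eigh returns ascending eigenvalues and orthonormal eigenvectors (columns).
lam, X = np.linalg.eigh(L)

f = np.array([1.0, -2.0, 0.5, 3.0])   # a graph signal, one value per vertex
f_hat = X.T @ f                        # graph Fourier transform, Eq. (12)
f_rec = X @ f_hat                      # inverse transform, Eq. (13)
```

Because the eigenvector matrix is orthogonal, `f_rec` equals `f` up to floating-point error, and the smallest eigenvalue of a connected graph's Laplacian is zero.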

In the Fourier domain, convolution becomes a point-wise product, which can then be converted back to the vertex domain using the graph Fourier transform and the convolution theorem, as shown in Eq. (14):

$$(f * g)(n) = \sum_{l=0}^{N-1} \hat{f}(\lambda_l)\, \hat{g}(\lambda_l)\, \chi_l(n), \qquad f * g = \mathcal{X}\big((\mathcal{X}^T f) \odot (\mathcal{X}^T g)\big) = \mathcal{X}\, \mathrm{diag}\big(\hat{g}(\lambda_0), \ldots, \hat{g}(\lambda_{N-1})\big)\, \mathcal{X}^T f \tag{14}$$

The graph convolution of two graph signals f(n) and g(n) is shown in Figure 3; the transform $\hat{g}(\lambda_l)$ is called the convolution kernel. A set of free parameters $\theta_0, \ldots, \theta_{N-1}$ in the Fourier domain, i.e., the Laplacian eigenspace, is used to build this kernel. It can also be thought of as a function of the eigenvalues, written as $g(\Lambda)$. Convolution is then written as Eq. (15):

Figure 3. Graphical illustration of convolution f (n) and g (n).

$$f * g = \mathcal{X}\, \mathrm{diag}(\theta_0, \ldots, \theta_{N-1})\, \mathcal{X}^T f = \mathcal{X}\, g(\Lambda)\, \mathcal{X}^T f \tag{15}$$

The graph convolution described above has two drawbacks: (1) each operation involves an eigendecomposition, which incurs high computational costs; (2) after this operation, the value at a vertex depends on all vertices of the graph regardless of their spatial locality, which is inconsistent with the local connections of CNNs.

A low-order polynomial approximation has been suggested for fast localized convolution, which expresses $g(\Lambda)$ as a polynomial function of the eigenvalues, Eq. (16):

$$g(\Lambda) = \sum_{k=0}^{K} \theta_k \Lambda^k \tag{16}$$

$K$ is the polynomial order, and $\theta_k$ are the polynomial coefficients. With $K$ a small positive integer, the convolution is then rewritten as Eq. (17):

$$f * g = \mathcal{X}\Big(\sum_{k=0}^{K} \theta_k \Lambda^k\Big)\mathcal{X}^T f = \Big(\sum_{k=0}^{K} \theta_k\, (\mathcal{X} \Lambda^k \mathcal{X}^T)\Big) f = \sum_{k=0}^{K} \theta_k\, \mathcal{L}^k f \tag{17}$$

The convolution is performed by $K$ multiplications with the sparse matrix $\mathcal{L}$, which speeds up computation by avoiding the eigendecomposition procedure.
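The equivalence in Eq. (17) can be checked numerically. The sketch below compares the spectral form, which needs an eigendecomposition, with the polynomial form, which only needs repeated matrix-vector products. The function names, the 4-vertex path graph and the coefficient values are assumptions made for this example.

```python
import numpy as np

def poly_graph_conv(L, f, theta):
    """sum_k theta[k] * L^k f using K matrix-vector products (no eigendecomposition)."""
    out = np.zeros(len(f))
    Lk_f = np.asarray(f, dtype=float)   # L^0 f
    for k, t in enumerate(theta):
        if k > 0:
            Lk_f = L @ Lk_f             # L^k f obtained from L^(k-1) f
        out += t * Lk_f
    return out

def spectral_graph_conv(L, f, theta):
    """X g(Lambda) X^T f with g(lambda) = sum_k theta[k] * lambda^k."""
    lam, X = np.linalg.eigh(L)
    g = sum(t * lam ** k for k, t in enumerate(theta))
    return X @ (g * (X.T @ f))

# Hypothetical graph: a path 0-1-2-3 with unit weights.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(W.sum(axis=1)) - W
theta = [0.5, -0.2, 0.1]                # K = 2
f = np.array([1.0, 0.0, -1.0, 2.0])

y_poly = poly_graph_conv(L, f, theta)
y_spec = spectral_graph_conv(L, f, theta)
```

Both results agree to floating-point precision. The polynomial form is also K-hop localized: an impulse at vertex 0 filtered with K = 2 leaves vertex 3, which is three hops away, untouched.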

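The update equation for a layer $\ell$ is defined as Eq. (18):

$$\begin{aligned}
\hat{h}_i^{\ell+1} &= O_h^{\ell}\, \Big\Vert_{k=1}^{H} \Big( \sum_{j \in \mathcal{N}_i} w_{ij}^{k,\ell}\, V^{k,\ell} h_j^{\ell} \Big), \qquad
\hat{e}_{ij}^{\ell+1} = O_e^{\ell}\, \Big\Vert_{k=1}^{H} \big( \hat{w}_{ij}^{k,\ell} \big), \\
w_{ij}^{k,\ell} &= \mathrm{softmax}_j\big( \hat{w}_{ij}^{k,\ell} \big), \qquad
\hat{w}_{ij}^{k,\ell} = \Big( \frac{Q^{k,\ell} h_i^{\ell} \cdot K^{k,\ell} h_j^{\ell}}{\sqrt{d_k}} \Big) \cdot E^{k,\ell} e_{ij}^{\ell}
\end{aligned} \tag{18}$$

with $Q^{k,\ell}, K^{k,\ell}, V^{k,\ell}, E^{k,\ell} \in \mathbb{R}^{d_k \times d}$ and $O_h^{\ell}, O_e^{\ell} \in \mathbb{R}^{d \times d}$, where $k \in \{1, 2, \ldots, H\}$ indexes the attention heads, $\Vert$ denotes concatenation, $H$ indicates the number of heads, $L$ the number of layers, $d$ is the hidden dimension and $d_k$ is the dimension of a head ($d/H = d_k$). Note that $h_i^{\ell}$ is the $i$th node's feature at the $\ell$th layer. The cut between a vertex set and the remaining vertices is defined as Eq. (19):

$$\mathrm{cut}(S_k, \bar{S}_k) = \sum_{v_i \in S_k,\; v_j \in \bar{S}_k} e(v_i, v_j) \tag{19}$$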

where $S_k$ is the $k$th vertex set of the partition, $\bar{S}_k$ indicates the remaining sets excluding $S_k$ and $e(v_i, v_j)$ is an edge between vertices $v_i$ and $v_j$. When several sets are involved, the cut problem can be rewritten as Eq. (20):

$$\mathrm{cut}(S_1, S_2, S_3, \ldots, S_g) = \frac{1}{2} \sum_{k=1}^{g} \mathrm{cut}(S_k, \bar{S}_k) \tag{20}$$

The minimum cut problem is extensively researched in the literature, with the normalized cut reflecting a separate direction, Eq. (21):

$$\mathrm{Ncut}(S_1, S_2, \ldots, S_g) = \sum_{k=1}^{g} \frac{\mathrm{cut}(S_k, \bar{S}_k)}{\mathrm{vol}(S_k, V)} \tag{21}$$

where $\mathrm{vol}(S_k, V) = \sum_{v_i \in S_k,\; v_j \in V} e(v_i, v_j)$ is the total degree of the nodes from $S_k$ in graph $\mathcal{G}$.
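Eqs. (19) to (21) can be evaluated directly from an adjacency matrix. The following sketch uses a hypothetical graph of two triangles joined by a single edge; the helper names `cut` and `normalized_cut` and the graph itself are assumptions for illustration.

```python
import numpy as np

def cut(A, S, T):
    """Sum of edge weights between vertex index lists S and T (Eq. 19)."""
    return A[np.ix_(S, T)].sum()

def normalized_cut(A, parts):
    """Ncut of a partition given as a list of vertex index lists (Eq. 21)."""
    all_v = list(range(A.shape[0]))
    total = 0.0
    for S in parts:
        comp = [v for v in all_v if v not in S]
        vol = A[np.ix_(S, all_v)].sum()   # total degree of the vertices in S
        total += cut(A, S, comp) / vol
    return total

# Hypothetical graph: triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

natural = [[0, 1, 2], [3, 4, 5]]
skewed = [[0, 1], [2, 3, 4, 5]]
ncut_natural = normalized_cut(A, natural)   # 1/7 + 1/7
ncut_skewed = normalized_cut(A, skewed)     # 2/4 + 2/10
```

Splitting along the bridge edge cuts one edge with volume 7 on each side, giving 2/7, whereas the skewed split cuts two edges and scores 0.7, so the normalized cut prefers the balanced partition.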

DL optimization is used to turn the minimum cut problem into a DL loss, Eq. (22):

$$L_{cut} = \sum_{\text{reduce-sum}} (Y \oslash \Gamma)(1 - Y)^T \odot A + \sum_{\text{reduce-sum}} \Big(1^T Y - \frac{n}{g}\Big)^2 \tag{22}$$

The normalized cut is the first term, and $Y$ is defined as an $n \times g$ matrix that holds the neural network's output. Finally, $\Gamma$ is computed from $Y$, and $A$ is the adjacency matrix. The graph convolutional layer is given by Eq. (23):

$$H_j^{[l+1]} = \sigma\Big( \sum_{i=1}^{F_{in}} \Big( \sum_{k=0}^{K} \theta_{i,j}^{k}\, \mathcal{L}^k H_i^{[l]} \Big) + b_j^{[l]} \Big) \tag{23}$$
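A single layer of the form of Eq. (23) can be sketched in NumPy as below. The function name `graph_conv_layer`, the tensor layout of `theta`, the use of ReLU for the activation and the random inputs are all assumptions made for this illustration; the source does not specify them.

```python
import numpy as np

def graph_conv_layer(L, H, theta, b):
    """One polynomial graph-convolution layer in the form of Eq. (23).

    L: (N, N) Laplacian; H: (N, F_in) input features;
    theta: (K+1, F_in, F_out) coefficients; b: (F_out,) bias.
    ReLU is assumed for the activation sigma.
    """
    out = np.zeros((H.shape[0], theta.shape[2]))
    Lk_H = np.asarray(H, dtype=float)       # L^0 H
    for k in range(theta.shape[0]):
        if k > 0:
            Lk_H = L @ Lk_H                  # L^k H
        out += Lk_H @ theta[k]               # sums over the input features i
    return np.maximum(out + b, 0.0)

# Hypothetical inputs: path graph on 4 vertices, 2 input and 3 output features.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(W.sum(axis=1)) - W
rng = np.random.default_rng(0)
H = rng.normal(size=(4, 2))
theta = rng.normal(size=(3, 2, 3))          # K = 2
b = np.zeros(3)
out = graph_conv_layer(L, H, theta, b)
```

The matrix product `Lk_H @ theta[k]` performs the inner sum over input features for all output features at once, so the loop only runs over the polynomial order.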

Multiple convolutional and pooling layers, followed by one fully connected layer, make up the model. Figure 4 depicts the model's architecture with two convolutional layers.

Figure 4. Model’s architecture with two convolutional layers.

Convolutional network: Convolutional layers are the foundation of a convolutional neural network. Each layer has a set of filters (or kernels) whose parameters are learned as training progresses. The filter size is typically smaller than that of the image it is applied to. Each filter performs a convolution on the image, yielding an activation map. For convolution, the filter is moved across the height and width of the image, and at each spatial position the dot product between the filter and the input is computed. The implemented design with the Deep Graph CNN model can provide better heart disease prediction than earlier models. The main features of this design are lower ToC and more accurate diagnosis results than earlier models. Heart diseases are predicted at the classification stage using the GS-CNN process. The shape-based features help to extract the information in the medical image that differentiates it from the training data.

Performance analysis

A thorough experimental analysis was used to evaluate the performance of the suggested hybrid technique. The proposed hybrid technique was tested on a PC with the following configuration: Intel(R) Core (TM) i5-7500 CPU, 32-bit Windows 7 OS, 4 GB RAM, with the SciPy, NumPy, Pandas, Keras and Matplotlib frameworks and Python 2.7 as the programming language.

Dataset description

The Public Health Dataset, which dates from 1988 and consists of four databases (Cleveland, Hungary, Switzerland and Long Beach V), was used for this study. Although there are 76 attributes in total, including the predicted attribute, all published studies use only a subset of 14 of them.

Information on heart disease

The clinical HD data used in this study came from 303 patients at the Cleveland Clinic Foundation (CCF) in Cleveland, Ohio, in the US. The dataset was collected from the UCI_MLRepository (Hinton and Salakhutdinov, 2006) and is part of the Heart Disease Database. There were 75 attributes and a target attribute in each of the 303 clinical instances. The target attribute was an integer ranging from 0 to 4, indicating whether a patient did not have HD (0) or had HD (1, 2, 3, 4). For this study, the target attribute was mapped to binary values of 0 and 1 for the absence or presence of cardiac disease. There were 125 cases with heart disease (44.33%) and 157 cases without heart disease (55.67%) among the 282 total clinical instances. A total of 76 raw attributes were used to describe each clinical instance. Due to missing values among the other raw variables, only 29 of the raw attributes were used in building the DNN models (Djenouri et al., 2022; Mezair et al., 2022).

Table 2 and Figure 5 show a comparative analysis of diagnostic accuracy for the proposed K-means_LQDA_DG_ConvoNet. The diagnostic accuracy has been analyzed as a function of the number of epochs for which the neural network is trained: 100, 200, 300, 400 and 500. For every epoch setting, the proposed K-means_LQDA_DG_ConvoNet obtained better results than the existing techniques. For 500 epochs, the diagnostic accuracy obtained by the proposed K-means_LQDA_DG_ConvoNet is 96%, while the existing SVM achieved 59%, CNN obtained 65% and FGCNet attained 72%.

Table 2. Comparative analysis of diagnostic accuracy.

Figure 5. Comparative analysis of diagnostic accuracy.

Table 3 and Figure 6 show a comparative sensitivity analysis for the proposed K-means_LQDA_DG_ConvoNet. Sensitivity is the proportion of actual positive cases that the proposed technique correctly identifies as positive (the true positive rate) when diagnosing heart disease. For 500 epochs, the sensitivity obtained by the proposed K-means_LQDA_DG_ConvoNet is 80%, while the existing SVM achieved 66%, CNN obtained 70% and FGCNet attained 75%.

Table 3. Comparative analysis of sensitivity.

Figure 6. Comparative analysis of sensitivity.

Table 4 and Figure 7 show a comparative analysis in terms of specificity for the proposed K-means_LQDA_DG_ConvoNet. Specificity is the percentage of actual negatives that are predicted as negative. The actual negatives that are instead predicted as positive are the false positives of the suggested method for identifying HD. For 500 epochs, the specificity obtained by the proposed K-means_LQDA_DG_ConvoNet is 73%, while the existing SVM achieved 55%, CNN obtained 57% and FGCNet attained 67%.

Table 4. Comparative analysis of specificity.

Figure 7. Comparative analysis of specificity.

Table 5 and Figure 8 show a comparative analysis in terms of precision for the proposed K-means_LQDA_DG_ConvoNet. Precision is the number of true positives divided by the total number of positive predictions made by the suggested technique in diagnosing heart disease, and it reflects the quality of a positive prediction made by the proposed technique. For 500 epochs, the precision obtained by the proposed K-means_LQDA_DG_ConvoNet is 90%, while the existing SVM achieved 71%, CNN obtained 73% and FGCNet attained 79%.

Table 5. Comparative analysis of precision.

Figure 8. Precision analysis differentiation.

Table 6 and Figure 9 show a comparative analysis in terms of F-Score for the proposed K-means_LQDA_DG_ConvoNet. The F-Score is used to assess binary classification techniques, which categorize examples as "positive" or "negative." It is the harmonic mean of precision and recall. For 500 epochs, the F-Score obtained by the proposed K-means_LQDA_DG_ConvoNet is 79%, while the existing SVM achieved 65%, CNN obtained 71% and FGCNet attained 79%.
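The metrics discussed above all follow from the four confusion-matrix counts. The sketch below shows the standard definitions; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper's experiments.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # recall / true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)
    # F-Score: harmonic mean of precision and recall.
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
    }

# Hypothetical counts for 100 test cases.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

For these counts the accuracy is 0.85 and the precision is 0.8; the remaining values follow from the same four counts.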

Table 6. Comparative analysis of F-Score.

Figure 9. Comparative analysis of F-Score.

Table 7 and Figure 10 show an analysis of the area under the ROC curve (AUC) for the proposed K-means_LQDA_DG_ConvoNet. The area under the ROC curve measures a classifier's ability to distinguish between classes and is used as a summary of the ROC curve.

Table 7. ROC curve on various methods.

Figure 10. ROC curve analysis.

AUC indicates how well the method differentiates between positive and negative classes. For 500 epochs, the AUC obtained by the proposed K-means_LQDA_DG_ConvoNet is 75%, while the existing SVM achieved 45%, CNN obtained 53% and FGCNet attained 62%, as shown in Figure 11.
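One way to see what the AUC measures is to compute it as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below does this by pairwise comparison; the function name `roc_auc` and the four example scores are hypothetical.

```python
import numpy as np

def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]      # all positive-negative pairs
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores for two negative and two positive cases.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In this example three of the four positive-negative pairs are ordered correctly, so the AUC is 0.75. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.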

Figure 11. Classification of heart disease prediction.

Conclusion

The proposed work is a novel technique for detecting heart disease based on IoT sensor data with a monitoring application using deep learning architectures. The input IoT sensor data has been collected from the University of California Irvine machine learning repository. The collected data has been processed for noise removal and clustered using K-means clustering. The features of the clustered data have been extracted using Linear Quadratic Discriminant Analysis. The extracted features have been classified using the deep graph ConvoNet (convolutional network), DG_ConvoNet. According to the testing findings, the proposed classification and prediction model obtains a diagnostic accuracy of 96%, sensitivity of 80%, specificity of 73%, precision of 90%, F-Score of 79%, and area under the ROC curve of 75%. These results show the strength of our methodology and of DG_ConvoNet. In the future, we wish to test our system model on other datasets and also look at implementing the DG_ConvoNet for other diseases.

Data availability statement

Publicly available datasets were analyzed in this study. This data can be found here at doi: 10.1136/bmjopen-2020-044070.

Author contributions

KS and VR contributed to the conception and design of the study. GS performed the statistical analysis. KS and JL wrote the first draft of the manuscript. All authors contributed to the manuscript revision, read, and approved the submitted version.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Ali, F., Hasan, B., Ahmad, H., Hoodbhoy, Z., Bhuriwala, Z., Hanif, M., et al. (2021). Protocol: Detection of subclinical rheumatic heart disease in children using a deep learning algorithm on digital stethoscope: A study protocol. BMJ Open 11:e044070. doi: 10.1136/bmjopen-2020-044070

Divya, K., Sirohi, A., Pande, S., and Malik, R. (2021). “An IoMT assisted heart disease diagnostic system using machine learning techniques,” in Cognitive internet of medical things for smart healthcare, Vol. 311, eds A. E. Hassanien, A. Khamparia, D. Gupta, K. Shankar, and A. Slowik (Cham: Springer), 145–161. doi: 10.1007/978-3-030-55833-8_9

Djenouri, Y., Belhadi, A., Srivastava, G., and Lin, J. C. (2022). When explainable AI meets IoT applications for supervised learning. Cluster Comput. 17:1. doi: 10.1007/s10586-022-03659-3

Garigipati, R. K., Raghu, K., and Saikumar, K. (2022). “Detection and identification of employee attrition using a machine learning algorithm,” in Handbook of research on technologies and systems for E-collaboration during global crises, eds J. Zhao and V. Vinoth (Pennsylvania, PA: IGI Global), 120–131. doi: 10.4018/978-1-7998-9640-1.ch009

Golande, A., Sorte, P., Suryawanshi, V., Yermalkar, U., and Satpute, S. (2019). Smart hospital for heart disease prediction using IoT. Int. J. Inform. Vis. 3, 198–202.

Haq, A. U., Li, J. P., Memon, M. H., Nazir, S., and Sun, R. (2018). A hybrid intelligent system framework for the prediction of heart disease using machine learning algorithms. Mob. Inf. Syst. 2018:3860146. doi: 10.1155/2018/3860146

Hasan, N. I., and Bhattacharjee, A. (2019). Deep learning approach to cardiovascular disease classification employing modified ECG signal from empirical mode decomposition. Biomed. Signal Process. Control 52, 128–140. doi: 10.1016/j.bspc.2019.04.005

Hinton, G. E., and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science 313, 504–507. doi: 10.1126/science.1127647

Huang, J., Chen, B., Yao, B., and He, W. (2019). ECG arrhythmia classification using STFT-based spectrogram and convolutional neural network. IEEE Access 7, 92871–92880. doi: 10.1109/ACCESS.2019.2928017

Kanksha Aman, B., Sagar, P., Rahul, M., and Aditya, K. (2021). An intelligent unsupervised technique for fraud detection in health care systems. Intell. Decis. Technol. 15, 127–139. doi: 10.3233/IDT-200052

Koppula, N., Sarada, K., Patel, I., Aamani, R., and Saikumar, K. (2021). “Identification and recognition of speaker voice using a neural network-based algorithm: Deep learning,” in Handbook of research on innovations and applications of AI, IoT, and cognitive technologies, eds J. Zhao and V. Vinoth Kumar (Pennsylvania, PA: IGI Global), 278–289. doi: 10.4018/978-1-7998-6870-5.ch019

Ladefoged, C. N., Hasbak, P., Hornnes, C., Højgaard, L., and Andersen, F. L. (2021). Low-dose PET image noise reduction using deep learning: Application to cardiac viability FDG imaging in patients with ischemic heart disease. Phys. Med. Biol. 66:054003. doi: 10.1088/1361-6560/abe225

Liu, J., Wang, H., Yang, Z., Quan, J., Liu, L., and Tian, J. (2022). Deep learning-based computer-aided heart sound analysis in children with left-to-right shunt congenital heart disease. Int. J. Cardiol. 348, 58–64. doi: 10.1016/j.ijcard.2021.12.012

Majumder, A. K. M., ElSaadany, Y. A., Young, R., and Ucci, D. R. (2019). An energy efficient wearable smart IoT system to predict cardiac arrest. Adv. Hum. Comput. Interact. 2019:1507465. doi: 10.1155/2019/1507465

Manogaran, G., Lopez, D., Thota, C., Abbas, K. M., Pyne, S., and Sundarasekar, R. (2017). “Big data analytics in healthcare internet of things,” in Innovative healthcare systems for the 21st century, ed. H. Qudrat-Ullah (Cham: Springer), 263–284. doi: 10.1007/978-3-319-55774-8_10

Martins, J. F. B., Nascimento, E. R., Nascimento, B. R., Sable, C. A., Beaton, A. Z., Ribeiro, A. L., et al. (2021). Towards automatic diagnosis of rheumatic heart disease on echocardiographic exams through video-based deep learning. J. Am. Med. Inform. Assoc. 28, 1834–1842. doi: 10.1093/jamia/ocab061

Mezair, T., Djenouri, Y., Belhadi, A., Srivastava, G., and Lin, J. C. (2022). Towards an advanced deep learning for the internet of behaviors: Application to connected vehicle. ACM Trans. Sens. Netw. 1–18. doi: 10.1145/3526192

Morris, S. A., and Lopez, K. N. (2021). Deep learning for detecting congenital heart disease in the fetus. Nat. Med. 27, 764–765. doi: 10.1038/s41591-021-01354-1

Rahmani, A. M., Gia, T. N., Negash, B., Anzanpour, A., Azimi, I., Jiang, M., et al. (2018). Exploiting smart e-health gateways at the edge of healthcare internet-of-things: A fog computing approach. Future Gener. Comput. Syst. 78, 641–658. doi: 10.1016/j.future.2017.02.014

Rath, A., Mishra, D., Panda, G., and Satapathy, S. C. (2021). Heart disease detection using deep learning methods from imbalanced ECG samples. Biomed. Signal Process. Control 68:102820. doi: 10.1016/j.bspc.2021.102820

Saikumar, K., and Rajesh, V. (2020a). A novel implementation heart diagnosis system based on random forest machine learning technique. Int. J. Pharm. Res. 12, 3904–3916. doi: 10.31838/ijpr/2020.SP2.482

Saikumar, K., and Rajesh, V. (2020b). Coronary blockage of artery for heart diagnosis with DT Artificial Intelligence Algorithm. Int. J. Res. Pharma. Sci. 11, 471–479. doi: 10.26452/ijrps.v11i1.1844

Saikumar, K., Rajesh, V., and Babu, B. S. (2022). Heart disease detection based on feature fusion technique with augmented classification using deep learning technology. Trait. Signal 39, 31–42. doi: 10.18280/ts.390104

Vincent Paul, S. M., Balasubramaniam, S., Panchatcharam, P., Malarvizhi Kumar, P., and Mubarakali, A. (2021). Intelligent framework for prediction of heart disease using deep learning. Arab. J. Sci. Eng. 47, 2159–2169. doi: 10.1007/s13369-021-06058-9

Wang, H., Shi, H., Chen, X., Zhao, L., Huang, Y., and Liu, C. (2020). An improved convolutional neural network based approach for automated heartbeat classification. J. Med. Syst. 44:35. doi: 10.1007/s10916-019-1511-2

Wang, S. H., Govindaraj, V. V., Gorriz, J. M., Zhang, X., and Zhang, Y. D. (2021a). Covid-19 classification by FGCNet with deep feature fusion from graph convolutional network and convolutional neural network. Inf. Fusion 67, 208–229. doi: 10.1016/j.inffus.2020.10.004

Wang, S. H., Nayak, D. R., Guttery, D. S., Zhang, X., and Zhang, Y. D. (2021b). COVID-19 classification by CCSHNet with deep fusion using transfer learning and discriminant correlation analysis. Inf. Fusion 68, 131–148. doi: 10.1016/j.inffus.2020.11.005

Zhang, P., and Xu, F. (2021). Effect of AI deep learning techniques on possible complications and clinical nursing quality of patients with coronary heart disease. Food Sci. Technol. 42, 1–6. doi: 10.1590/fst.42020

Zhang, Y. D., Dong, Z., Chen, X., Jia, W., Du, S., Muhammad, K., et al. (2019). Image based fruit category classification by 13-layer deep convolutional neural network and data augmentation. Multimed. Tools Appl. 78, 3613–3632. doi: 10.1007/s11042-017-5243-3

Zhang, Y. D., Dong, Z., Wang, S. H., Yu, X., Yao, X., Zhou, Q., et al. (2020a). Advances in multimodal data fusion in neuroimaging: Overview, challenges, and novel orientation. Inf. Fusion 64, 149–187. doi: 10.1016/j.inffus.2020.07.006

Zhang, Y. D., Nayak, D. R., Zhang, X., and Wang, S. H. (2020b). Diagnosis of secondary pulmonary tuberculosis by an eight-layer improved convolutional neural network with stochastic pooling and hyperparameter optimization. J. Ambient Intell. Humaniz. Comput. 1, 1–18. doi: 10.1007/s12652-020-02612-9

Keywords: heart disease, detection, IoT - internet of things, sensor data, deep learning, artificial neural network

Citation: Saikumar K, Rajesh V, Srivastava G and Lin JC-W (2022) Heart disease detection based on internet of things data using linear quadratic discriminant analysis and a deep graph convolutional neural network. Front. Comput. Neurosci. 16:964686. doi: 10.3389/fncom.2022.964686

Received: 08 June 2022; Accepted: 09 September 2022;
Published: 07 October 2022.

Edited by:

Deepika Koundal, University of Petroleum and Energy Studies, India

Reviewed by:

Loknath Sai Ambati, Indiana University Kokomo, United States
Ahmed J. Obaid, University of Kufa, Iraq

Copyright © 2022 Saikumar, Rajesh, Srivastava and Lin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Jerry Chun-Wei Lin, jerrylin@ieee.org