
Predicting Mortality and Duration of Hospital Stay for Heart Failure Patients 


 

For the code, please visit: github.com/AhmedEAbdou/Predicting-and-Analyzing-Heart-Failure.Rmd

 

Research Question:


Can we predict the likelihood of death in heart failure patients based on demographic and clinical variables, and can we also predict the length of hospital stay until death for heart failure patients based on the same variables?

Methodology:

  1. Check for missing values and outliers in the dataset and deal with them accordingly.

  2. Conduct exploratory data analysis to explore patterns and relationships between the explanatory variables.

  3. Develop a logistic regression model based on the exploratory data analysis.

  4. Evaluate the logistic regression model.

  5. Develop a Cox proportional hazards model based on exploratory data analysis.

  6. Evaluate the Cox proportional hazards model.

Results:

The logistic regression model provides an odds ratio for each predictor variable, indicating the strength and direction of its relationship with the likelihood of death. The Cox proportional hazards model provides a hazard ratio for each predictor variable, indicating how it relates to the time until death in heart failure patients.

Checking for Missing values:

  • Converted the Death Event, Sex, Anemia, Diabetes, Smoking, and High Blood Pressure variables to factors for easier interpretation in the analysis.

  • No illogical values were found in the numerical variables (Age, Time, Serum Sodium, Serum Creatinine, Platelets, Ejection Fraction, and Creatine Phosphokinase) or in the binary Diabetes and Anemia indicators.

Visualizations

  • In general, the normal CPK range is higher for males than for females.

  • The normal range is 39-308 mcg/L for adult males and 26-192 mcg/L for adult females.

  • The normal range for platelets is typically considered to be 150,000 to 450,000 platelets per microliter (µL).

  • Note: There is no significant difference in the normal range for platelet counts between males and females.

  • Note: A higher EF is generally considered better.

  • In general, a normal ejection fraction range is between 50-70%.

  • No difference between genders.

  • In general, the normal range for serum creatinine in adults is 0.6 to 1.3 milligrams per deciliter (mg/dL) for males and 0.5 to 1.1 mg/dL for females.

  • In general, the normal range for serum sodium in adults is 135 to 145 milliequivalents per liter (mEq/L).

  • Neither higher nor lower is necessarily better.

  • Note: There is no significant difference in the normal range for serum sodium between males and females.
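
The reference ranges listed above can be collected into a small lookup for flagging individual measurements. The following is only a minimal sketch: the dictionary layout and the `classify` function are illustrative assumptions and are not taken from the original R analysis; the numeric limits are the ones quoted in the text.

```python
# Reference ranges as quoted in the text; the structure and the names used
# here are hypothetical and only serve as an illustration.
NORMAL_RANGES = {
    "cpk": {"male": (39, 308), "female": (26, 192)},                # mcg/L
    "platelets": {"male": (150_000, 450_000),
                  "female": (150_000, 450_000)},                     # per microliter
    "ejection_fraction": {"male": (50, 70), "female": (50, 70)},     # percent
    "serum_creatinine": {"male": (0.6, 1.3), "female": (0.5, 1.1)},  # mg/dL
    "serum_sodium": {"male": (135, 145), "female": (135, 145)},      # mEq/L
}


def classify(measure, sex, value):
    """Return 'below', 'normal', or 'above' relative to the quoted range."""
    low, high = NORMAL_RANGES[measure][sex]
    if value < low:
        return "below"
    if value > high:
        return "above"
    return "normal"


# The same creatinine value can fall inside the male range
# but above the female range.
print(classify("serum_creatinine", "male", 1.2))
print(classify("serum_creatinine", "female", 1.2))
print(classify("ejection_fraction", "male", 30))
```

Keeping the sex-specific limits in one place makes it easy to see which variables (CPK and serum creatinine) need a per-sex comparison and which do not.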

Logistic Model for Death
Predicting the likelihood of death in heart failure patients

 

Full model:

Death Event = Age + Anaemia + CPK + Diabetes + Ejection Fraction + High Blood Pressure + Platelets + Serum Creatinine + Serum Sodium + Sex + Smoking + Time

Coefficients:

                                             Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)                                -1.022e+01   5.632e+00   -1.815  0.069596 .
age                                        -4.742e-02   1.580e-02   -3.001  0.002690 **
anaemiaNo_Anaemia                          -7.470e-03   3.605e-01   -0.021  0.983467
creatinine_phosphokinase                   -2.222e-04   1.779e-04   -1.249  0.211684
diabetesNo_Diabetes                         1.451e-01   3.512e-01    0.413  0.679380
ejection_fraction                           7.666e-02   1.633e-02    4.695  2.67e-06 ***
high_blood_pressureNo_high_blood_pressure  -1.027e-01   3.587e-01   -0.286  0.774688
platelets                                   1.200e-06   1.889e-06    0.635  0.525404
serum_creatinine                           -6.661e-01   1.815e-01   -3.670  0.000242 ***
serum_sodium                                6.698e-02   3.974e-02    1.686  0.091855 .
sexMale                                     5.337e-01   4.139e-01    1.289  0.197299
smokingSmoking                              1.349e-02   4.126e-01    0.033  0.973915
time                                        2.104e-02   3.014e-03    6.981  2.92e-12 ***
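
As a quick check on how the z value and Pr(>|z|) columns relate to the estimates, the Wald statistic is the estimate divided by its standard error, and the two-sided p-value comes from the standard normal distribution. The sketch below reproduces the age row of the full model using only the Python standard library; the function name is a hypothetical helper, not part of the original analysis.

```python
import math


def wald_z_and_p(estimate, std_error):
    """Wald z statistic and two-sided normal p-value for one coefficient."""
    z = estimate / std_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Age row of the full logistic model: Estimate and Std. Error from the table.
z_age, p_age = wald_z_and_p(-4.742e-02, 1.580e-02)
print(f"z = {z_age:.3f}, p = {p_age:.6f}")
```

Small differences in the last digit are expected because the table values are rounded.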

Removed the non-significant variables using the step function:

Final Model: DEATH EVENT = Age + Ejection Fraction + Serum Creatinine + Time

Coefficients:

                    Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)        -0.604473    1.036111   -0.583   0.55962
age                -0.043326    0.014872   -2.913   0.00358 **
ejection_fraction   0.074804    0.015555    4.809  1.52e-06 ***
serum_creatinine   -0.719785    0.174597   -4.123  3.75e-05 ***
time                0.020611    0.002881    7.153  8.48e-13 ***

Exponentiated coefficients (odds ratios):

(Intercept)  age        ejection_fraction  serum_creatinine  time
0.5463623    0.9575993  1.0776726          0.4868568         1.0208249
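
The exponentiated values are obtained by applying exp() to each coefficient of the final model. A minimal sketch of that step, using the estimates reported above (the dictionary name is an illustrative assumption):

```python
import math

# Coefficients of the final logistic model, copied from the table above.
final_logit_coefs = {
    "(Intercept)": -0.604473,
    "age": -0.043326,
    "ejection_fraction": 0.074804,
    "serum_creatinine": -0.719785,
    "time": 0.020611,
}

# An odds ratio is the exponential of the log-odds coefficient.
odds_ratios = {name: math.exp(beta) for name, beta in final_logit_coefs.items()}

for name, ratio in odds_ratios.items():
    print(f"{name:18s} {ratio:.7f}")
```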

Discussion:

  • Age: The odds ratio of 0.958 means that each one-year increase in age multiplies the odds of death by 0.958 (a decrease of about 4.2%), holding all other variables constant. Taken at face value, this suggests that older patients are less likely to die than younger patients.

  • Ejection fraction: The odds ratio of 1.078 means that each one-unit increase in ejection fraction multiplies the odds of death by 1.078 (an increase of about 7.8%), holding all other variables constant. This suggests that a higher ejection fraction is associated with higher odds of death.

  • Serum creatinine: The odds ratio of 0.487 means that each one-unit increase in serum creatinine multiplies the odds of death by 0.487 (a decrease of about 51.3%), holding all other variables constant. This suggests that higher serum creatinine is associated with lower odds of death.

  • Time: The odds ratio of 1.021 means that each one-unit increase in time multiplies the odds of death by 1.021 (an increase of about 2.1%), holding all other variables constant. This suggests that a longer stay at the hospital is associated with higher odds of death.

  • Note: The directions for age, ejection fraction, and serum creatinine are the opposite of those found in the survival analysis and of the expectation that a higher ejection fraction is better. One possible explanation is that the Death Event factor is coded so that the model estimates the odds of survival rather than of death; the level ordering of that factor should be checked before relying on these interpretations.
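
Because the odds ratios above describe a one-unit change, a larger change multiplies the odds repeatedly: a change of k units corresponds to an odds ratio of exp(k × coefficient). The sketch below illustrates this with the final-model coefficients; the function name and the 10-unit step are assumptions chosen for illustration only.

```python
import math


def odds_ratio_for_change(coefficient, units):
    """Odds ratio for a change of `units` in the predictor: exp(units * coefficient)."""
    return math.exp(units * coefficient)


# Coefficients from the final logistic model; the 10-unit step is illustrative.
age_10y = odds_ratio_for_change(-0.043326, 10)
ef_10pt = odds_ratio_for_change(0.074804, 10)

print(f"10-year increase in age: odds multiplied by {age_10y:.3f}")
print(f"10-point increase in ejection fraction: odds multiplied by {ef_10pt:.3f}")
```

Reporting the effect over a clinically meaningful step (for example a decade of age) is often easier to read than the per-unit ratio, which sits close to 1.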

Survival Analysis
Predicting the length of hospital stay until death for heart failure patients

Full model: Surv(time, DEATH_EVENT) ~ Age + Ejection Fraction + Serum Sodium + Serum Creatinine + Anemia + Diabetes + High Blood Pressure + Smoking + Sex + Platelets

                           coef  exp(coef)   se(coef)       z  Pr(>|z|)
age                   4.636e-02  1.047e+00  9.463e-03   4.899  9.66e-07 ***
ejection_fraction    -5.154e-02  9.498e-01  1.017e-02  -5.069  3.99e-07 ***
serum_creatinine      3.466e-01  1.414e+00  6.713e-02   5.164  2.42e-07 ***
anaemia               3.393e-01  1.404e+00  2.088e-01   1.625    0.1042
diabetes              1.745e-01  1.191e+00  2.188e-01   0.797    0.4253
high_blood_pressure   4.330e-01  1.542e+00  2.148e-01   2.016    0.0438 *
smoking               1.075e-01  1.113e+00  2.511e-01   0.428    0.6686
sex                  -1.781e-01  8.369e-01  2.505e-01  -0.711    0.4771
platelets            -5.362e-07  1.000e+00  1.100e-06  -0.487    0.6260

Removed the non-significant variables using the step function.

Order of removal: Smoking, Platelets, Diabetes, Serum Sodium, Sex, and Anemia.

Final model: Age + Ejection Fraction + Serum Creatinine + High Blood Pressure.

                          coef  exp(coef)  se(coef)       z  Pr(>|z|)
age                   0.044173   1.045163  0.009027   4.894  9.91e-07 ***
ejection_fraction    -0.049587   0.951623  0.009968  -4.975  6.54e-07 ***
serum_creatinine      0.347001   1.414818  0.066705   5.202  1.97e-07 ***
high_blood_pressure   0.471236   1.601973  0.211410   2.229    0.0258 *
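
The coef and se(coef) columns are enough to attach an approximate 95% Wald confidence interval to each hazard ratio, computed as exp(coef ± 1.96 × se). The intervals below are not reported in the original output; this is a sketch derived only from the table values, and the names used are hypothetical.

```python
import math

# (coef, se(coef)) pairs from the final Cox model table above.
cox_final = {
    "age": (0.044173, 0.009027),
    "ejection_fraction": (-0.049587, 0.009968),
    "serum_creatinine": (0.347001, 0.066705),
    "high_blood_pressure": (0.471236, 0.211410),
}

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def hazard_ratio_ci(coef, se, z=Z_95):
    """Hazard ratio with an approximate Wald confidence interval."""
    return math.exp(coef), math.exp(coef - z * se), math.exp(coef + z * se)


for name, (coef, se) in cox_final.items():
    hr, lower, upper = hazard_ratio_ci(coef, se)
    print(f"{name:20s} HR = {hr:.3f}  95% CI = ({lower:.3f}, {upper:.3f})")
```

An interval that excludes 1 agrees with the corresponding p-value being below 0.05, which holds for all four predictors in the final model.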

Discussion:

  • Age (exp(coef) = 1.045163): For every one-year increase in age, the instantaneous risk of death at any point in time during the hospital stay increases by 4.5%, assuming the patient has survived up to that time and holding all other variables constant.

  • Ejection fraction (exp(coef) = 0.951623): For every one-unit increase in ejection fraction, the instantaneous risk of death at any point in time during the hospital stay decreases by 4.8%, assuming the patient has survived up to that time and holding all other variables constant.

  • Serum creatinine (exp(coef) = 1.414818): For every one-unit increase in serum creatinine, the instantaneous risk of death at any point in time during the hospital stay increases by 41.5%, assuming the patient has survived up to that time and holding all other variables constant.

  • High blood pressure (exp(coef) = 1.601973): For patients with high blood pressure, the instantaneous risk of death at any point in time during the hospital stay is 60.2% higher than for patients without high blood pressure, assuming the patient has survived up to that time and holding all other variables constant.
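
The percentages quoted in this discussion follow directly from the hazard ratios: the percent change in hazard per unit is (exp(coef) − 1) × 100. A short sketch of that arithmetic, using the exp(coef) values from the final model (variable names are illustrative):

```python
# exp(coef) values from the final Cox model.
hazard_ratios = {
    "age": 1.045163,
    "ejection_fraction": 0.951623,
    "serum_creatinine": 1.414818,
    "high_blood_pressure": 1.601973,
}

# Percent change in the hazard for a one-unit increase in each predictor.
percent_change = {name: (hr - 1.0) * 100.0 for name, hr in hazard_ratios.items()}

for name, pct in percent_change.items():
    direction = "increase" if pct > 0 else "decrease"
    print(f"{name:20s} {abs(pct):.1f}% {direction}")
```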

Creating a new data frame for testing, with groups defined by the following values:

Age = 40 and 80

Ejection Fraction = 30 and 60

Serum Creatinine = 1.2 and 1.8

High Blood Pressure = 0 and 1

  • The best-performing group is Group 5: Age = 40, Ejection Fraction = 60%, Serum Creatinine = 1.2, and no High Blood Pressure.

  • The worst-performing group is Group 12: Age = 80, Ejection Fraction = 30%, Serum Creatinine = 1.8, and High Blood Pressure.
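
The ranking of the test groups can be checked from the final Cox coefficients alone, since under proportional hazards each profile's relative hazard is exp of its linear predictor. The sketch below assumes the test data frame contains all 16 combinations of the values listed above; it does not reproduce the original group numbers, which depend on the row order of the author's data frame.

```python
import itertools
import math

# Final Cox model coefficients from the table above.
coefs = {
    "age": 0.044173,
    "ejection_fraction": -0.049587,
    "serum_creatinine": 0.347001,
    "high_blood_pressure": 0.471236,
}

# Test values listed in the text; taking every combination is an assumption.
levels = {
    "age": [40, 80],
    "ejection_fraction": [30, 60],
    "serum_creatinine": [1.2, 1.8],
    "high_blood_pressure": [0, 1],
}

names = list(levels)
grid = [dict(zip(names, combo))
        for combo in itertools.product(*(levels[n] for n in names))]


def linear_predictor(profile):
    """Sum of coefficient * value; a higher value means a higher hazard."""
    return sum(coefs[name] * profile[name] for name in coefs)


ranked = sorted(grid, key=linear_predictor)
best, worst = ranked[0], ranked[-1]

# Hazard of the worst profile relative to the best one.
relative_hazard = math.exp(linear_predictor(worst) - linear_predictor(best))

print("lowest hazard: ", best)
print("highest hazard:", worst)
print(f"hazard ratio, worst vs best: {relative_hazard:.1f}")
```

The profiles with the lowest and highest hazard match the best- and worst-performing groups described above.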
