Abstract
Objectives
We evaluated the added predictive value of combining clinical information and myocardial perfusion SPECT imaging (MPI) data using machine learning (ML) to predict major adverse cardiac events (MACE).
Background
Traditionally, prognostication by MPI has relied on visual or quantitative analysis of images, without objective consideration of the clinical data. ML permits a large number of variables to be considered in combination and at a level of complexity beyond the human clinical reader.
Methods
2,619 consecutive patients (48% male, 62 ± 13 years) undergoing exercise (38%) or pharmacologic stress (62%) with high-speed SPECT MPI were monitored for MACE. 28 clinical variables, 17 stress test variables, and 25 imaging variables (including total perfusion deficit [TPD]) were recorded. Areas under receiver-operating characteristic curve (AUC) for MACE prediction were compared between i) ML with all available data (ML-combined), ii) ML with only imaging data (ML-imaging), iii) 5-point scale visual diagnosis (MD-diagnosis), and iv) automated quantitative imaging analysis (stress TPD and ischemic TPD). ML involved automated variable selection by information gain ranking, model building with a boosted ensemble algorithm, and 10-fold stratified cross validation.
Results
During follow-up (3.2 ± 0.6 years), 239 patients (9.1%) had MACE. MACE prediction was significantly higher for ML-combined than ML-imaging (AUC: 0.81 vs 0.78; P < 0.01). ML-combined also had higher predictive accuracy compared to MD-diagnosis, automated stress TPD and automated ischemic TPD (AUC: 0.81 vs 0.65 vs 0.73 vs 0.71 respectively). Risk reclassification for ML-combined compared to visual MD-diagnosis was 26% (P < 0.001).
Conclusions
ML combining both clinical and imaging data variables was found to have high predictive accuracy for 3-year risk of MACE, and was superior to existing visual or automated perfusion assessments in isolation. ML could allow integration of clinical and imaging data for personalized MACE risk computations in patients undergoing SPECT MPI.
Keywords: Major adverse cardiac events, SPECT myocardial imaging, machine learning
INTRODUCTION
Traditionally, the prognostic value of myocardial perfusion SPECT imaging (MPI) has been studied with semi-quantitative visual and quantitative analysis of image data (1–3). A number of prior studies have shown that clinical demographics, functional parameters, and hemodynamic and stress results all affect the evaluation of MPI (4–7). This integration of clinical information and imaging data into a final impression is currently performed subjectively by physicians when assessing the MPI test, often in a non-standardized manner.
Machine learning (ML) is a field of computer science that uses computer algorithms to identify patterns in large multivariable datasets and can be used to predict outcomes. In recent years, ML has been used for prediction and decision-making in a multitude of disciplines, including internet search engines, customized advertising, natural language processing, finance trending, and robotics (8–10). For MPI, a large number of parameters, including clinical variables, stress test results, and imaging data variables, could be considered by ML for outcome prediction. We evaluated the benefits of combining all of these variables using an ML algorithm to predict major adverse cardiac events (MACE) (8). ML prediction using combined data was also compared to physician diagnosis (based on visual read with awareness of clinical data), and to automated perfusion quantification indices (stress and ischemic total perfusion deficit [TPD]).
METHODS
Study population
A total of 2,689 consecutive patients who were referred for clinically indicated exercise or pharmacologic stress MPI at Sacred Heart Medical Center between January 2010 and December 2011 were included. The study was approved by the institutional review board, including a waiver of informed consent. After excluding 70 patients with early revascularization within 90 days, 2,619 patients were included for further analysis.
Clinical data
Clinical data were derived from patients’ medical records and included age, gender, and risk factors. Risk factors recorded were hypertension, diabetes mellitus, dyslipidemia, smoking (defined as current smoking or cessation within 3 months of testing), and family history of premature clinical coronary artery disease (CAD). Chest pain presence and type and shortness of breath were assessed by the stress testing physician.
MPI and stress protocols
Rest/stress 1-day 99mTc-sestamibi imaging was performed using a high-efficiency solid-state SPECT scanner (D-SPECT, Spectrum-Dynamics, Haifa, Israel) (11). Weight-adjusted doses of 353 ± 151 MBq (9.5 ± 4.1 mCi) for rest and 1252 ± 196 MBq (34 ± 5.3 mCi) for stress, as recommended by the vendor, were employed (12), equivalent to a total average effective dose of 10.7 mSv based on the latest ICRP 103 estimates (13). Patients underwent symptom-limited Bruce protocol exercise testing (38%) or pharmacologic stress (62%, regadenoson 0.4 mg) with injection at peak stress. Rest image acquisition was performed supine with a 6–10 min acquisition time, based on patient BMI. Upright and supine stress imaging (4–6 min) began 15–30 minutes after stress.
Transaxial images were generated from list-mode data by maximum likelihood expectation maximization reconstruction (11). No attenuation or scatter correction was applied. Images were automatically reoriented into short-axis and vertical and horizontal long-axis slices with Quantitative Perfusion SPECT (QPS)/Quantitative Gated SPECT (QGS) software (Cedars-Sinai Medical Center, Los Angeles, CA).
Visual perfusion analysis
The visual analysis was done by multiple physicians who were aware of patient clinical information and quantitative assessment at the time of the study. Reader scan interpretation (MD-diagnosis) was scored as 0 = normal, 1 = equivocal, 2 = probably abnormal, 3 = abnormal, or 4 = definitely abnormal. A 3-step scale probability of CAD was also reported (0 = low, 1 = intermediate, 2 = high).
Automated quantification
All image datasets were de-identified, transferred to Cedars-Sinai and quality control checked by a single experienced core laboratory technologist without knowledge of clinical data. Automatically generated myocardial contours by QPS/QGS software were evaluated and when necessary, contours were adjusted to correspond to the myocardium. Upright and supine images were quantified as previously described (14). The quantitative perfusion variable employed was automatic total perfusion deficit (TPD), which reflects a combination of defect extent and severity, producing stress, rest, and ischemic (stress – rest) TPD values. Ejection fraction (EF), systolic and diastolic volumes at stress and rest were quantified separately for each acquisition using standard QGS software with 8 frames per cardiac cycle. Transient ischemic dilation (TID) was computed as previously described (15). Counts in the left ventricle were obtained by planar projections of the left ventricular region defined during the first step of data reconstruction (16).
Outcome and follow-up data collection
The endpoint was MACE, defined as all-cause mortality, non-fatal myocardial infarction (MI), unstable angina, or late coronary revascularization (percutaneous coronary intervention or coronary artery bypass grafting). All-cause mortality was determined from the Social Security Death Index and combined with the other MACE components obtained from the hospital electronic medical records, including all clinic, cardiology group and hospital visits. Non-fatal MI was defined based on the criteria of hospital admission for chest pain, elevated cardiac enzyme levels and typical changes on the ECG (17). The first event in each patient was used as the outcome. Patients with early revascularization ≤90 days after MPI were excluded.
Machine learning
Figure 1 illustrates the machine learning pathway, which involved automated variable selection by information gain ratio ranking and model building with a boosted ensemble algorithm, both performed within a stratified 10-fold cross validation procedure, as reported in our previous work (8). ML techniques were implemented in the open-source Waikato Environment for Knowledge Analysis (WEKA) platform (3.8.0) (18).
Figure 1. Machine learning pathway.
The overall population is divided into 10 equally sized groups (1, 2, …, 10) with approximately the same incidence of MACE events (stratified). Of the 10 groups, one (10%) is retained as the test set (hold-out set) and the others (90%) are used as the training set. To estimate the ML performance for all the data, the cross validation procedure loops 10 times over these groups, each time performing variable selection and model building with a different training set and then testing this model on the unseen test set. Therefore, each data point is used once for testing and 9 times for training, and the result is 10 experimental LogitBoost models, each trained on a 90% fraction of the data. Once finished, the estimates of MACE probability for each of the 10 hold-out sets derived by the corresponding 10 models are concatenated to provide an overall expected estimate of ML performance with unseen (hold-out) data. MACE: major adverse cardiac events, ML: machine learning.
Variable Selection
25 imaging data variables, 17 stress test variables, and 28 clinical variables were available for variable selection by information gain ratio (18). Information gain ratio offers a measure of the effectiveness of a variable in classifying the training data. Only variables with an information gain ratio > 0 were subsequently used in model building (Figure 2A).
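As an illustration of the selection criterion, the following is a minimal sketch of an information gain ratio for a discrete variable, assuming the usual definition (information gain about the outcome divided by the entropy of the variable itself). It is not the WEKA implementation used in the study; the function names and the toy data are hypothetical, and continuous variables would first need to be discretized, a step not shown here.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def gain_ratio(feature, outcome):
    """Information gain of `outcome` from splitting on `feature`,
    divided by the entropy of `feature` (the split information)."""
    n = len(outcome)
    groups = {}
    for x, y in zip(feature, outcome):
        groups.setdefault(x, []).append(y)
    # Outcome entropy remaining after the split, weighted by group size.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(feature)
    if split_info == 0:  # a constant variable carries no information
        return 0.0
    return (entropy(outcome) - conditional) / split_info

# Hypothetical toy data (1 = MACE, 0 = no MACE); not taken from the study.
mace          = [1, 0, 0, 0, 1, 0, 0, 0]
informative   = [1, 0, 0, 0, 1, 0, 0, 0]  # tracks the outcome exactly
uninformative = [0, 0, 0, 0, 1, 1, 1, 1]  # same event rate in both groups

ratios = {
    "informative": gain_ratio(informative, mace),
    "uninformative": gain_ratio(uninformative, mace),
}
# Keep only variables with a gain ratio above zero (small tolerance for rounding).
selected = [name for name, r in ratios.items() if r > 1e-12]
```

In this toy case only the variable that tracks the outcome survives the "> 0" filter, which mirrors the selection rule described above.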
Figure 2. Variable selection.
A. 25 imaging data (blue bar: 22 selected), 17 stress test (red bar: 8 selected) and 28 clinical (green bar: 17 selected) variables ranked by their mean [95% CI] information gain ratio within 10-fold cross validation. B. Same variables ranked by their individual AUC [95% CI] for MACE prediction. Variables selected by information gain ratio are shown as filled bars. Non-selected variables are shown by white bars.
AUC: area under receiver-operating characteristic curve, BP: blood pressure, CABG: coronary artery bypass graft, EDV: end-diastolic volume, EF: ejection fraction, LV: left ventricle, MACE: major adverse cardiac events, PCI: percutaneous coronary intervention, TAVR: transcatheter aortic valve replacement, TPD: total perfusion deficit.
Model building
Predictive classifiers for MACE scoring were developed with an ensemble (“boosting”) LogitBoost algorithm. The principle behind ML ensemble boosting is to combine the predictions of simple classifiers with weak individual performance (19) to create a single strong classifier. These weak predictions are combined in an ensemble (weighted majority voting) to derive an overall classifier, the ‘ML score’.
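To make the boosting idea concrete, here is a from-scratch sketch of a two-class LogitBoost loop in the standard additive logistic regression formulation: each round fits a weak learner to a weighted working response and adds half of its output to a running score. It is not the WEKA code used in the study. The choice of decision stumps as the weak learner, the number of rounds, the clipping constants, and the two-variable toy data (age, stress TPD) are all assumptions made for illustration.

```python
import math

def fit_stump(X, z, w):
    """Weighted least-squares regression stump.
    Returns (feature index, threshold, left value, right value)."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            sides = {True: [], False: []}
            for i, row in enumerate(X):
                sides[row[j] <= t].append(i)
            fitted, err = {}, 0.0
            for side, idx in sides.items():
                total_w = sum(w[i] for i in idx)
                fitted[side] = sum(w[i] * z[i] for i in idx) / total_w
                err += sum(w[i] * (z[i] - fitted[side]) ** 2 for i in idx)
            if best is None or err < best[0]:
                best = (err, j, t, fitted[True], fitted[False])
    return best[1:]

def stump_value(stump, row):
    j, t, left, right = stump
    return left if row[j] <= t else right

def logitboost(X, y, rounds=20):
    """Two-class LogitBoost: returns the list of fitted stumps."""
    F = [0.0] * len(X)  # additive score, starts at 0 (probability 0.5)
    stumps = []
    for _ in range(rounds):
        p = [1 / (1 + math.exp(-2 * f)) for f in F]
        w = [max(pi * (1 - pi), 1e-6) for pi in p]       # case weights
        z = [max(-4.0, min(4.0, (yi - pi) / wi))          # working response
             for yi, pi, wi in zip(y, p, w)]
        stump = fit_stump(X, z, w)
        stumps.append(stump)
        F = [f + 0.5 * stump_value(stump, row) for f, row in zip(F, X)]
    return stumps

def predict_proba(stumps, row):
    """Estimated event probability from the summed weak learners."""
    F = sum(0.5 * stump_value(s, row) for s in stumps)
    return 1 / (1 + math.exp(-2 * F))

# Hypothetical toy data: [age in years, stress TPD in %]; 1 = MACE.
X = [[50, 1], [72, 0], [60, 2], [58, 12], [70, 15], [75, 9], [48, 3], [80, 20]]
y = [0, 0, 0, 1, 1, 1, 0, 1]

model = logitboost(X, y)
high_risk = predict_proba(model, [65, 14])
low_risk = predict_proba(model, [52, 1])
```

The output of `predict_proba` plays the role of a continuous risk score in this sketch: no single stump is a good classifier, but their sum separates the toy cases.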
Cross validation
The performance and general error estimation of the entire ML process (variable selection and LogitBoost) were assessed using stratified 10-fold cross validation (Figure 1), which is currently the preferred validation technique in machine learning (18). The main advantages of this technique, compared to the conventional split-sample approach, are: i) it reduces the variance in prediction error; ii) it maximizes the use of data for both training and validation, without overfitting or overlap between training and validation data; and iii) it guards against testing hypotheses suggested by arbitrarily split data (20).
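The procedure in Figure 1 can be sketched in a few lines, assuming a generic `fit`/`predict` pair supplied by the caller. The helper names are hypothetical and the placeholder model simply returns the training event rate; in a fuller version, variable selection and boosting would both live inside `fit` so that neither ever sees the held-out fold. The event counts (239 of 2,619) are taken from the study; everything else is illustrative.

```python
import random

def stratified_folds(y, k=10, seed=0):
    """Assign every sample index to one of k folds so that each fold
    has approximately the same event rate."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(set(y)):
        idx = [i for i, yi in enumerate(y) if yi == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal each class out round-robin
    return folds

def cross_val_predict(X, y, fit, predict, k=10, seed=0):
    """One hold-out prediction per sample; each model is trained only on
    the other k-1 folds, and the hold-out predictions are concatenated."""
    out = [None] * len(y)
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            out[i] = predict(model, X[i])
    return out

# Placeholder model (an assumption): predicts the training-set event rate.
def fit_rate(X_train, y_train):
    return sum(y_train) / len(y_train)

def predict_rate(model, row):
    return model

y = [1] * 239 + [0] * 2380  # cohort size and event count from the study
X = [None] * len(y)         # predictors are irrelevant for the placeholder
holdout = cross_val_predict(X, y, fit_rate, predict_rate, k=10)
```

Because every sample receives exactly one prediction from a model that never saw it, performance metrics computed on `holdout` estimate performance on unseen data.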
Statistical analysis
Using receiver-operating characteristic (ROC) analysis and pairwise comparisons according to DeLong et al. (21), the predictive accuracy for MACE was compared between i) ML with all available data (ML-combined), ii) ML with only imaging data (ML-imaging), iii) 5-point scale visual diagnosis (MD-diagnosis), and iv) automated quantitative imaging analysis (stress TPD and ischemic TPD). Brier score and Pearson correlation were computed between predicted and observed MACE events (22). For all analyses, MACE-free patients were censored at their follow-up date. To define the low-risk limit for MACE prediction by ML-combined, we used MD-diagnosis = 0, which is considered a definitely normal scan, as a well-established low-risk limit. Low-risk cutoffs for ML-combined and TPD were then calculated at approximately the same population percentile as MD-diagnosis = 0 (87th percentile). Subsequently, improvement in risk classification using ML-combined compared to MD-diagnosis was assessed with a 5-category reclassification. Statistical calculations were performed using R software version 3.3.1, with the PredictABEL package for the reclassification analysis.
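For readers less familiar with the two summary metrics, the sketch below computes an AUC by pairwise comparison of event and non-event cases, and a Brier score as the mean squared difference between predicted probability and outcome. The study itself used R; this is only an illustration with hypothetical scores, and the DeLong comparison of two AUCs is not implemented here.

```python
def auc(scores, labels):
    """Probability that a randomly chosen event case receives a higher
    score than a randomly chosen non-event case (ties count as 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier(probs, labels):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - l) ** 2 for p, l in zip(probs, labels)) / len(labels)

# Hypothetical predictions for six patients (1 = MACE, 0 = no MACE).
scores = [0.05, 0.10, 0.40, 0.20, 0.70, 0.30]
labels = [0, 0, 0, 0, 1, 1]
toy_auc = auc(scores, labels)
toy_brier = brier(scores, labels)

# Reference point: the Brier score of a model that assigns every patient
# the cohort event rate (239 of 2,619, counts from the study).
cohort = [1] * 239 + [0] * 2380
baseline_brier = brier([239 / 2619] * len(cohort), cohort)
```

The constant-prediction baseline comes out at about 0.083 for this event rate, which gives some context for interpreting a Brier score in a cohort with roughly 9% events.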
RESULTS
Study population and outcome
Table 1 shows the baseline clinical characteristics of the studied population. When the first event per patient was considered, there were 239 (9.1%) 3-year MACE events, with 150 (5.7%) all-cause deaths, 11 (0.4%) non-fatal MIs, 24 (0.9%) unstable anginas, and 54 (2.1%) late target revascularizations. The observed annual MACE rate was 3%.
Table 1.
Patient characteristics.
Characteristic | All patients n = 2619 | MACE+ n = 239 | MACE− n = 2380 | P value |
---|---|---|---|---|
Age (years ± SD) | 62 ± 13 | 70 ± 12 | 62 ± 12 | <0.0001 |
Male, n (%) | 1247 (48%) | 128 (54%) | 1119 (47%) | 0.054 |
Body mass index (kg/m2) | 31 ± 8 | 30 ± 9 | 32 ± 8 | <0.01 |
CAD risk factors, n (%) | ||||
Diabetes | 691 (26%) | 100 (42%) | 591 (25%) | <0.001 |
Hypercholesterolemia | 1491 (57%) | 141 (59%) | 1350 (57%) | 0.5 |
Hypertension | 1692 (65%) | 181 (76%) | 1511 (63%) | <0.001 |
Family history of CAD | 1006 (38%) | 66 (28%) | 940 (40%) | <0.001 |
Smoker | 662 (25%) | 65 (27%) | 597 (25%) | 0.474 |
Typical angina | 301 (11%) | 38 (16%) | 263 (11%) | <0.05 |
History of CAD, n (%) | ||||
Prior MI | 130 (5%) | 31 (13%) | 99 (4%) | <0.001 |
Prior PCI | 231 (9%) | 52 (22%) | 179 (8%) | <0.001 |
Prior CABG | 172 (7%) | 36 (15%) | 136 (6%) | <0.001 |
CABG: coronary artery bypass graft, CAD: coronary artery disease, MACE: major adverse cardiac event, MI: myocardial infarction, PCI: percutaneous coronary intervention, SD: standard deviation.
Hemodynamic and MPI results
Table 2 shows hemodynamic and stress results separately for pharmacological stress and for exercise stress. The frequency of exercise stress was lower among patients with MACE compared to those without MACE (9% in MACE vs. 41% in no MACE; P < 0.0001). Table 3 shows quantitative and visual MPI results. For the quantitative evaluation of perfusion and function, 9.8% of myocardial contours were corrected by the core lab technologist.
Table 2.
Pharmacologic and exercise stress test results.
Pharmacologic stress n = 1614 | MACE+ n = 217 | MACE− n = 1397 | P value |
---|---|---|---|
Rest heart rate (bpm) | 75 ± 14 | 73 ± 13 | <0.05 |
Peak heart rate at stress (bpm) | 95 ± 19 | 103 ± 20 | <0.0001 |
Rest SBP (mmHg) | 132 ± 22 | 132 ± 20 | 0.577 |
Rest DBP (mmHg) | 73 ± 12 | 77 ± 12 | <0.001 |
Peak SBP (mmHg) | 131 ± 27 | 143 ± 27 | <0.0001 |
Peak DBP (mmHg) | 70 ± 12 | 76 ± 13 | <0.0001 |
Exercise stress n = 1005 | MACE+ n = 22 | MACE− n = 983 | P value |
Rest heart rate (bpm) | 81 ± 13 | 76 ± 13 | 0.072 |
Peak heart rate at stress (bpm) | 142 ± 13 | 148 ± 13 | <0.05 |
Rest SBP (mmHg) | 128 ± 19 | 126 ± 17 | 0.647 |
Rest DBP (mmHg) | 74 ± 9 | 79 ± 10 | <0.05 |
Peak SBP (mmHg) | 179 ± 27 | 181 ± 25 | 0.703 |
Peak DBP (mmHg) | 84 ± 10 | 83 ± 12 | 0.700 |
Ischemic ST change during exercise stress, n (%) | 7 (32%) | 175 (18%) | 0.091 |
DBP: diastolic blood pressure, MACE: major adverse cardiac event, SBP: systolic blood pressure.
Table 3.
Perfusion and functional results.
Parameters | MACE+ n = 239 | MACE− n = 2380 | P value |
---|---|---|---|
MD-diagnosis = normal, n (%) | 142 (59%) | 2138 (90%) | <0.001 |
MD-diagnosis = abnormal or definitely abnormal, n (%) | 89 (37%) | 217 (9%) | <0.001 |
Stress TPD (%) | 9 ± 11 | 3 ± 5 | <0.0001 |
Ischemic TPD (%) | 4 ± 4 | 2 ± 3 | <0.0001 |
Rest TPD (%) | 5 ± 9 | 1 ± 3 | <0.0001 |
Stress EDV (ml) | 112 ± 57 | 91 ± 36 | <0.0001 |
Stress ESV (ml) | 96 ± 57 | 73 ± 33 | <0.0001 |
Stress EF (%) | 46 ± 9 | 49 ± 3 | <0.0001 |
Rest EDV (ml) | 105 ± 52 | 89 ± 34 | <0.0001 |
Rest ESV (ml) | 89 ± 52 | 71 ± 31 | <0.0001 |
Rest EF (%) | 46 ± 8 | 49 ± 3 | <0.0001 |
Transient ischemic dilation | 1.09 ± 0.16 | 1.03 ± 0.14 | < 0.0001 |
EDV: end-diastolic volume, EF: ejection fraction, ESV: end-systolic volume, MACE: major adverse cardiac event, TPD: total perfusion deficit.
Variable selection
Figure 2A shows the average information gain ratio within 10-fold cross validation. On average, 22 imaging data, 8 stress test and 17 clinical variables were selected. All perfusion and functional variables from MPI had an information gain ratio > 0, including left ventricular counts and injected dose. The top 9 selected variables were all imaging data variables.
MACE prediction by individual variables
Figure 2B shows the area under ROC curve (AUC) for the prediction of MACE by each individual variable. Stress TPD, stress heart rate, ischemic TPD, stress systolic blood pressure, rest TPD, and age were the best individual predictors. When compared to the information gain ratio in Figure 2A, there are some variables for which individual AUCs are predictive, yet they do not offer incremental information gain for predicting MACE (white bars). Furthermore, the variables with highest AUCs do not always have the highest information gain.
MACE prediction by combined variables
3-year MACE prediction was significantly higher for ML-combined than ML-imaging (AUC: 0.81 [0.78–0.83] vs 0.78 [0.75–0.81]; P < 0.01). ML-combined also had higher AUC compared to the AUCs of automated stress TPD and automated ischemic TPD (Figure 3), and compared to the AUCs for probability of CAD (0.64 [0.61–0.66]) or MD-diagnosis (0.65 [0.62–0.68]), as reported by the physician (all P < 0.001). When stress test variables were added to image variables for ML integration, AUC did not change significantly (AUC: 0.79 [0.76–0.82] vs. 0.78 [0.75–0.81]; P = 0.4).
Figure 3. ROC curves for prediction of 3-year MACE.
ML combining all variables using variable selection and LogitBoost algorithm (ML-combined) had a significantly higher AUC for MACE prediction than ML combining imaging data variables only (ML-imaging), and standard image analysis. AUC: area under ROC curve, MACE: major adverse cardiac events, ROC: receiver-operating characteristic, TPD: total perfusion deficit.
*P < 0.01, **P < 0.001, in AUC comparison by DeLong test
The Brier score for ML-combined prediction of MACE was 0.07, indicating good calibration between ML scores (estimated predicted risk) and observed 3-year risk. The plot of observed vs. predicted MACE events over percentiles of ML-combined risk is shown in Figure 4. A high correlation between ML-combined predicted and observed MACE was found (r = 0.97, P < 0.0001).
Risk re-categorization
To allow categorical comparison, a low-risk ML-combined score (<0.15) was determined as the cutoff defining the same percentile as visual MD-diagnosis = 0 (87th percentile). This percentile also approximately corresponded to the stress TPD threshold of < 5% (14). For patients within the 95–100th percentile of ML-combined score, 19% (25/131) had a normal MD-diagnosis and 10% (13/131) had stress TPD < 5% (Figure 5). Finally, the 5-category risk reclassification of 26% was significant for ML-combined scores, with categories at 5% increments of risk, as compared to the 5-category MD-diagnosis (P < 0.001, Table 4), with a 30.5% improvement in the identification of patients with MACE and a 5% decrease in the identification of MACE-free patients (all P < 0.001).
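The percentile matching can be checked with simple arithmetic from the counts in Table 4: scans read as normal number 142 among patients with MACE and 2,138 among those without, out of 2,619 in total. The sketch below computes that fraction and then picks a score cutoff at the same percentile; the helper name and the evenly spaced example scores are hypothetical, not study data.

```python
def low_risk_cutoff(scores, fraction_low):
    """Score chosen so that approximately `fraction_low` of the
    population falls strictly below it."""
    ranked = sorted(scores)
    k = min(round(fraction_low * len(ranked)), len(ranked) - 1)
    return ranked[k]

# Counts from Table 4: normal reads in patients with and without MACE.
normal_reads = 142 + 2138
cohort_size = 2619
fraction_normal = normal_reads / cohort_size  # share of scans read as normal

# Hypothetical ML scores, evenly spaced between 0 and 1 for illustration.
ml_scores = [i / 1000 for i in range(1000)]
cutoff = low_risk_cutoff(ml_scores, fraction_normal)
share_below = sum(s < cutoff for s in ml_scores) / len(ml_scores)
```

Here `fraction_normal` rounds to 0.87, consistent with the 87th percentile quoted in the text, and `share_below` confirms that the chosen cutoff places about the same share of patients in the low-risk group.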
Figure 4. Observed vs. predicted 3-year risk of MACE.
Observed proportion of events (grey bars) and predicted ML score (red points) grouped by every 5th percentile of risk. MACE: major adverse cardiac events, ML: machine learning.
Table 4.
Risk reclassification by machine learning (ML) vs. physician diagnosis (MD-diagnosis).
MD-diagnosis | ML-boosting risk category: Low <0.15 | Equivocal 0.15–0.2 | Mild 0.2–0.25 | Moderate 0.25–0.3 | Severe ≥0.3 | Total |
---|---|---|---|---|---|---|
MACE, n = 239 | | | | | | |
Normal | 99 | 19* | 9* | 7* | 8* | 142 |
Equivocal | 1† | 0 | 1* | 0* | 2* | 4 |
Probably abnormal | 2† | 0† | 0 | 1* | 1* | 4 |
Abnormal | 11† | 5† | 8† | 7 | 55* | 86 |
Definitely abnormal | 1† | 1† | 0† | 1† | 0 | 3 |
Total | 114 | 25 | 18 | 16 | 66 | 239 |
No MACE, n = 2380 | | | | | | |
Normal | 1959 | 95* | 35* | 16* | 33* | 2138 |
Equivocal | 5† | 1 | 0* | 2* | 3* | 11 |
Probably abnormal | 8† | 0† | 0 | 3* | 3* | 14 |
Abnormal | 69† | 29† | 21† | 23 | 67* | 209 |
Definitely abnormal | 3† | 0† | 1† | 1† | 3 | 8 |
Total | 2044 | 125 | 57 | 45 | 109 | 2380 |
Reclassification | 26% | P < 0.001 | | | | |
MACE: major adverse cardiac event
*Up-risking by ML.
†De-risking by ML.
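The reclassification figures can be recomputed from the counts in Table 4 under the usual categorical net reclassification definition: for patients with MACE, the net proportion moved to a higher category; for patients without MACE, the net proportion moved to a lower category; and the sum of the two. This is a minimal sketch that treats the five MD-diagnosis levels and the five ML categories as matching ordered scales, as the table's up-risking and de-risking marks do. The study used the PredictABEL package in R, and the function name here is hypothetical.

```python
def up_down(table):
    """table[i][j] = number of patients in MD-diagnosis category i and
    ML category j, both ordered from lowest to highest risk."""
    up = sum(n for i, row in enumerate(table) for j, n in enumerate(row) if j > i)
    down = sum(n for i, row in enumerate(table) for j, n in enumerate(row) if j < i)
    total = sum(sum(row) for row in table)
    return up, down, total

# Counts transcribed from Table 4 (rows: MD-diagnosis, columns: ML category).
mace = [
    [99, 19, 9, 7, 8],
    [1, 0, 1, 0, 2],
    [2, 0, 0, 1, 1],
    [11, 5, 8, 7, 55],
    [1, 1, 0, 1, 0],
]
no_mace = [
    [1959, 95, 35, 16, 33],
    [5, 1, 0, 2, 3],
    [8, 0, 0, 3, 3],
    [69, 29, 21, 23, 67],
    [3, 0, 1, 1, 3],
]

up_e, down_e, n_e = up_down(mace)
up_n, down_n, n_n = up_down(no_mace)

event_component = (up_e - down_e) / n_e      # net correct up-risking
nonevent_component = (down_n - up_n) / n_n   # net correct de-risking
nri = event_component + nonevent_component
```

With these counts the event component is 73/239 (about 30.5%), the non-event component is −120/2380 (about −5%), and their sum is about 25.5%, in line with the rounded 26% reported above.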
DISCUSSION
We have developed and validated a highly accurate, personalized method for post-MPI risk computation that utilizes machine learning. This approach allows the combination of all available clinical, stress test and automatically derived imaging data variables without a priori assumptions about the influence or weighting of individual factors or how they may interact. The method was used to evaluate the added value of clinical and stress test information for the prediction of MACE after MPI. The observed 3% annual MACE rate was similar to that in previous studies assessing the prognostic value of SPECT MPI (4). The only human input required for the derivation of the ML-combined MACE risk score was the collation of clinical data from health records (conceivably a task fulfilled by advanced text mining in the future) and the adjustment of contours by the technologists in a minority (< 10%) of the cases. Figure 6 illustrates how the proposed ML model would allow prediction of the risk of MACE for an individual unknown case by automatically integrating the clinical data with the imaging data.
Figure 6. Illustration of prognostic risk computation in an individual patient by the proposed machine learning model.
QGS: quantitative gated SPECT, QPS: quantitative perfusion SPECT, MACE: major adverse cardiac event.
The performance of the ML-combined score was superior to the image risk metrics traditionally used to study prognostic outcomes after MPI (1–7). The AUC estimate, derived in a rigorous manner with test and training data separated within 10-fold cross validation (preventing overfitting), was substantially higher than that for ML-imaging, visual or automated MPI assessment. Furthermore, risk reclassification analysis demonstrated that the ML-combined risk allows better classification of high-risk patients than visual clinical diagnosis. Risk reclassification revealed that the ML-combined score could up-risk more than 30% of patients with MACE, but it also up-risked 5% of the MACE-free patients. At the same time, we found that 19% of the patients in the highest ML-combined risk category (top 5%), with a MACE incidence of 38%, were still read as definitely normal scans with MD-diagnosis = 0.
These findings highlight the difficulty in finding the appropriate thresholds for multi-category risk scores. The low-risk threshold in this study was derived for the same population percentile as “Normal” visual scans, and subsequent higher risk thresholds were defined at 5% increments of ML risk. Furthermore, we found that automatically derived stress/ischemic TPD has better predictive value for MACE than clinical diagnosis, which is in line with our previous reports (9,23) but has not previously been reported in a prognostic study.
To our knowledge, this is the first study applying ML to predict MACE in patients undergoing MPI. Recently, our group assessed the feasibility and accuracy of ML to predict 5-year all-cause mortality in 10,030 patients undergoing coronary CT angiography (8). In this analysis, ML exhibited a higher AUC compared with the Framingham risk score or visual CT severity scores alone (8). Automated processing of CT images was not used. In contrast, the current study capitalizes on established automated processing software tools that have long been validated in nuclear cardiology to provide the multiple imaging data variables with limited manual interaction. The intent is to demonstrate the feasibility of edging us closer to a completely automated computer-powered imaging analysis and risk assessment. A future direction and potential next step will be to develop tools capable of automatically extracting the clinical variables too – for example, by text mining electronic health records.
The ML approach provides a computational integration of all available information that is not feasible for subjective analysis by the reporting physician. As part of the clinical decision-making, physicians take into account clinical and stress testing data; however, this is done subjectively without a systematic way of integrating information. Further, although including these variables as part of the MPI report is recommended by guidelines, integration of these findings in the report is not yet part of standardized reporting guidelines (24,25). Intuitive patient-specific weighting of all individual clinical and imaging factors for assessing risk could not be expected to be precise, or consistent between different medical centers, whether performed by the interpreting physician or the physician managing the patient.
Although the average patient radiation dose (10.7 mSv) used in this study was higher than that recommended in current guidelines (26), the data were collected before the latest guidelines were adopted, using a same-day rest-first protocol optimized for acquisition speed rather than for radiation dose. Furthermore, a weight-based protocol was utilized and most of the patients were obese (BMI ≥ 30 kg/m2). It is likely that an effective radiation dose at least 50% lower could be achieved with longer acquisition times without any effect on image quality, as previously studied (16). Further dose reductions could be achieved with stress-first/stress-only protocols.
Implications
The ability to optimally assess risk in individual patients remains a major challenge in cardiology. With MPI, visual image analysis itself is subjective, and the overall risk assessment, incorporating clinical, stress test, and imaging results, is highly variable, based on physician knowledge and experience, and limited by the complexity of appropriately assigning weight to individual factors. The presented ML score provides an automated, precise, and objective risk estimate combining imaging, clinical, and stress testing variables. The same optimal method for risk computation would be readily available to all imaging centers, including less experienced centers. The practical implementation will depend on the ability to interface the MPI reporting workstation with electronic patient records in order to access the clinical variables. Such a tool could perhaps be interfaced with large registry data (e.g., the ImageGuide registry of the American Society of Nuclear Cardiology (25)), which will collect clinical variables similar to those used in this study.
Limitations
This was a single-center study, and further multi-center and external validation of the derived risk score will be required. Future work should define the optimal ML threshold in order to validate practical clinical implementation prospectively. The sample size was modest and follow-up was only 3 years; however, all results were significant. Although training data were always separated from test data within 10-fold cross validation, it is not yet known how well such an ML score can extrapolate between different centers, patient populations and follow-up times. Although we included key perfusion and function imaging variables in this study, the list was not exhaustive. The derived ML score is generic and can be applied to both pharmacologic and exercise stress protocols, since the ML technique uses the information about the type of test internally. However, further evaluation of ML risk stratification for MACE prediction in specific sub-populations, for example in patients with suspected disease, patients with early revascularization, or patients undergoing adenosine protocols, may be appropriate in multicenter studies. Risk reclassification metrics have limitations, such as dependence on the choice of cutoff values for the continuous probability risk score. It is likely that more appropriate threshold selection in future studies may optimize the reclassification patterns for specific clinical risks. Alternatively, the MACE risk score without any categories could also be used clinically to indicate the probability of events for a given patient. Finally, we selected a LogitBoost approach for automatic ML variable integration, as in our previous work (8), but the LogitBoost approach used here is one of many possible ML approaches to combine multiple variables for prediction. It is possible that different approaches, such as deep learning, may provide better risk score derivation. However, a larger multi-center dataset is required to evaluate possible advantages of other ML approaches.
CONCLUSIONS
ML combining both clinical and imaging data variables was found to have high predictive accuracy for 3-year risk of MACE, and was superior to existing visual or automated perfusion assessments in isolation. This computational method could allow integrating the clinical data with imaging results for the optimal evaluation of MACE risk in patients undergoing SPECT MPI.
Figure 5. Frequency of normal clinical diagnosis and low perfusion scores by predicted ML risk percentile.
The frequency of patients with normal clinical diagnosis and low automated perfusion score (TPD < 5%) across percentiles of the ML score. ML: machine learning, TPD: total perfusion deficit.
Clinical Perspectives.
Competency in Medical knowledge
Combining clinical and imaging information by a machine learning algorithm exhibited significantly better MACE prediction than using only imaging information or performing visual and automated perfusion assessment alone in SPECT MPI.
Translational Outlook
Adding clinical information to imaging data by machine learning will aid comprehensive MPI assessment and improve clinical patient management.
Acknowledgments
This research was supported in part by grant R01HL089765 from the National Heart, Lung, and Blood Institute/National Institutes of Health (NHLBI/NIH) (PI: Piotr Slomka). Daniel Berman, Guido Germano and Piotr Slomka participate in software royalties at Cedars-Sinai Medical Center.
Abbreviations
- SPECT: single-photon emission computed tomography.
- MPI: myocardial perfusion SPECT imaging.
- MACE: major adverse cardiac events.
- CAD: coronary artery disease.
- BMI: body mass index.
- TPD: total perfusion deficit.
- EF: ejection fraction.
- TID: transient ischemic dilation.
- MI: myocardial infarction.
- ECG: electrocardiogram.
- CT: computed tomography.
Footnotes
Financial disclosure: Daniel Berman, Guido Germano and Piotr Slomka participate in software royalties at Cedars-Sinai Medical Center.
References
- 1. Gimelli A, Rossi G, Landi P, et al. Stress/Rest Myocardial Perfusion Abnormalities by Gated SPECT: Still the Best Predictor of Cardiac Events in Stable Ischemic Heart Disease. J Nucl Med. 2009;50:546–53. doi: 10.2967/jnumed.108.055954.
- 2. Hachamovitch R, Kang X, Amanullah AM, et al. Prognostic Implications of Myocardial Perfusion Single-Photon Emission Computed Tomography in the Elderly. Circulation. 2009;120:2197–2206. doi: 10.1161/CIRCULATIONAHA.108.817387.
- 3. Shaw LJ, Berman DS, Maron DJ, et al. Optimal medical therapy with or without percutaneous coronary intervention to reduce ischemic burden: results from the Clinical Outcomes Utilizing Revascularization and Aggressive Drug Evaluation (COURAGE) trial nuclear substudy. Circulation. 2008;117:1283–91. doi: 10.1161/CIRCULATIONAHA.107.743963.
- 4. Shaw LJ, Iskandrian AE. Prognostic value of gated myocardial perfusion SPECT. J Nucl Cardiol. 2004;11:171–185. doi: 10.1016/j.nuclcard.2003.12.004.
- 5. Kang X, Berman DS, Lewin HC, et al. Incremental prognostic value of myocardial perfusion single photon emission computed tomography in patients with diabetes mellitus. Am Heart J. 1999;138:1025–1032. doi: 10.1016/s0002-8703(99)70066-9.
- 6. Hachamovitch R, Berman DS, Kiat H, et al. Exercise Myocardial Perfusion SPECT in Patients Without Known Coronary Artery Disease: Incremental Prognostic Value and Use in Risk Stratification. Circulation. 1996;93:905–914. doi: 10.1161/01.cir.93.5.905.
- 7. Sharir T, Germano G, Kang X, et al. Prediction of Myocardial Infarction Versus Cardiac Death by Gated Myocardial Perfusion SPECT: Risk Stratification by the Amount of Stress-Induced Ischemia and the Poststress Ejection Fraction. J Nucl Med. 2001;42:831–837.
- 8. Motwani M, Dey D, Berman DS, et al. Machine learning for prediction of all-cause mortality in patients with suspected coronary artery disease: a 5-year multicentre prospective registry analysis. Eur Heart J. 2017;38:500–507. doi: 10.1093/eurheartj/ehw188.
- 9. Arsanjani R, Dey D, Khachatryan T, et al. Prediction of revascularization after myocardial perfusion SPECT by machine learning in a large population. J Nucl Cardiol. 2015;22:877–884. doi: 10.1007/s12350-014-0027-x.
- 10. Betancur J, Rubeaux M, Fuchs T, et al. Automatic Valve Plane Localization in Myocardial Perfusion SPECT/CT by Machine Learning: Anatomical and Clinical Validation. J Nucl Med. 2016. doi: 10.2967/jnumed.116.179911.
- 11. Gambhir SS, Berman DS, Ziffer J, et al. A Novel High-Sensitivity Rapid-Acquisition Single-Photon Cardiac Imaging Camera. J Nucl Med. 2009;50:635–643. doi: 10.2967/jnumed.108.060020.
- 12. Sharir T, Slomka PJ, Hayes SW, et al. Multicenter Trial of High-Speed Versus Conventional Single-Photon Emission Computed Tomography Imaging: Quantitative Results of Myocardial Perfusion and Left Ventricular Function. J Am Coll Cardiol. 2010;55:1965–1974. doi: 10.1016/j.jacc.2010.01.028.
- 13. Andersson M, Johansson L, Minarik D, Leide-Svegborn S, Mattsson S. Effective dose to adult patients from 338 radiopharmaceuticals estimated using ICRP biokinetic data, ICRP/ICRU computational reference phantoms and ICRP 2007 tissue weighting factors. EJNMMI Physics. 2014;1:9. doi: 10.1186/2197-7364-1-9.
- 14. Nakazato R, Tamarappoo BK, Kang X, et al. Quantitative Upright–Supine High-Speed SPECT Myocardial Perfusion Imaging for Detection of Coronary Artery Disease: Correlation with Invasive Coronary Angiography. J Nucl Med. 2010;51:1724–1731. doi: 10.2967/jnumed.110.078782.
- 15. Xu Y, Arsanjani R, Clond M, et al. Transient ischemic dilation for coronary artery disease in quantitative analysis of same-day sestamibi myocardial perfusion SPECT. J Nucl Cardiol. 2012;19:465–473. doi: 10.1007/s12350-012-9527-8.
- 16. Nakazato R, Berman DS, Hayes SW, et al. Myocardial Perfusion Imaging with a Solid-State Camera: Simulation of a Very Low Dose Imaging Protocol. J Nucl Med. 2013;54:373–379. doi: 10.2967/jnumed.112.110601.
- 17. Thygesen K, Alpert JS, White HD. Universal Definition of Myocardial Infarction. Circulation. 2007;116:2634–2653. doi: 10.1161/CIRCULATIONAHA.107.187397.
- 18. Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH. The WEKA data mining software: an update. SIGKDD Explor Newsl. 2009;11:10–18.
- 19. Friedman J, Hastie T, Tibshirani R. Additive logistic regression: a statistical view of boosting (with discussion and a rejoinder by the authors). Ann Stat. 2000;28:337–407.
- 20. Kanamori T, Takenouchi T, Eguchi S, Murata N. Robust Loss Functions for Boosting. Neural Comput. 2007;19:2183–2244. doi: 10.1162/neco.2007.19.8.2183.
- 21. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the Areas under Two or More Correlated Receiver Operating Characteristic Curves: A Nonparametric Approach. Biometrics. 1988;44:837–845.
- 22. Brier GW. Verification of forecasts expressed in terms of probability. Monthly Weather Review. 1950;78:1–3.
- 23. Arsanjani R, Xu Y, Dey D, et al. Improved accuracy of myocardial perfusion SPECT for detection of coronary artery disease by machine learning in a large population. J Nucl Cardiol. 2013;20:553–562. doi: 10.1007/s12350-013-9706-2.
- 24. Tragardh E, Hesse B, Knuuti J, et al. Reporting nuclear cardiology: a joint position paper by the European Association of Nuclear Medicine (EANM) and the European Association of Cardiovascular Imaging (EACVI). Eur Heart J Cardiovasc Imaging. 2015;16:272–9. doi: 10.1093/ehjci/jeu304.
- 25. Tilkemeier PL, Mahmarian JJ, Wolinsky DG, Denton EA. ImageGuide™ Update. J Nucl Cardiol. 2015;22:994–997. doi: 10.1007/s12350-015-0217-1.
- 26. Henzlova MJ, Duvall WL, Einstein AJ, Travin MI, Verberne HJ. ASNC imaging guidelines for SPECT nuclear cardiology procedures: Stress, protocols, and tracers. J Nucl Cardiol. 2016;23:606–639. doi: 10.1007/s12350-015-0387-x.