Radiology. 2023 May 2;307(4):e223351. doi: 10.1148/radiol.223351

Toward AI-supported US Triage of Women with Palpable Breast Lumps in a Low-Resource Setting

Wendie A Berg 1, Ana-Lilia López Aldrete 1, Ajit Jairaj 1, Juan Carlos Ledesma Parea 1, Claudia Yolanda García 1, R Chad McClennan 1, Steven Yong Cen 1, Linda H Larsen 1, M Teresa Soler de Lara 1, Susan Love 1
PMCID: PMC10323289  PMID: 37129492

Abstract

Background

Most low- and middle-income countries lack access to organized breast cancer screening, and women with lumps may wait months for diagnostic assessment.

Purpose

To demonstrate that artificial intelligence (AI) software applied to breast US images obtained with low-cost portable equipment and by minimally trained observers could accurately classify palpable breast masses for triage in a low-resource setting.

Materials and Methods

This prospective multicenter study evaluated participants with at least one palpable mass who were enrolled in a hospital in Jalisco, Mexico, from December 2017 through May 2021. Orthogonal US images, with and without calipers, of any findings at the site of the lump and in adjacent tissue were obtained first with portable US. Women were then imaged with standard-of-care (SOC) US, with Breast Imaging Reporting and Data System assessments by a radiologist. After exclusions, 758 masses in 300 women were analyzable by AI, with outputs of benign, probably benign, suspicious, and malignant. Sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) were determined.

Results

The mean patient age ± SD was 50.0 years ± 12.5 (range, 18–92 years) and mean largest lesion diameter was 13 mm ± 8 (range, 2–54 mm). Of 758 masses, 360 (47.5%) were palpable and 56 (7.4%) malignant, including six ductal carcinoma in situ. AI correctly identified 47 or 48 of 49 women (96%–98%) with cancer with either portable US or SOC US images, with AUCs of 0.91 and 0.95, respectively. One circumscribed invasive ductal carcinoma was classified as probably benign with SOC US, ipsilateral to a spiculated invasive ductal carcinoma. Of 251 women with benign masses, 168 (67%) imaged with SOC US were classified as benign or probably benign by AI, as were 96 of 251 masses (38%, P < .001) with portable US. AI performance with images obtained by a radiologist was significantly better than with images obtained by a minimally trained observer.

Conclusion

AI applied to portable US images of breast masses can accurately identify malignancies. Moderate specificity, which could triage 38%–67% of women with benign masses without tertiary referral, should further improve with AI and observer training with portable US.

© RSNA, 2023

Supplemental material is available for this article.

See also the editorial by Slanetz in this issue.




Summary

Although specificity was less than with standard-of-care equipment, artificial intelligence applied to portable breast US can potentially reduce about half of tertiary referrals in resource-limited regions.

Key Results

  • In this prospective study, artificial intelligence (AI) using standard-of-care (SOC) US images of 758 masses accurately classified 53 of 56 malignancies (95%) and 554 of 702 benign masses (79%), with an area under the receiver operating characteristic curve of 0.95.

  • With use of low-cost portable US images obtained by a radiologist on 603 masses, AI correctly classified all 35 malignancies (100% sensitivity) but showed reduced specificity at 296 of 568 benign masses (52.1%) versus 457 of 568 benign masses (80.5%) with images from SOC US (P < .001).

  • AI performance was reduced with images obtained with low-cost portable US by a minimally trained observer.

Introduction

Standards of care (SOCs) must be appropriate for the environments where they are practiced (1). While screening has been the focus in Western countries, low- and middle-income countries (LMIC) often lack access to organized screening programs and technology. In LMIC, breast cancer most commonly presents as a palpable lump, and women under age 50 years are overrepresented (2), with the peak age at diagnosis more than 10 years earlier in some Asian and African countries than in the United States or Europe (3). “Early detection” is the recognition of symptomatic breast cancer at an early stage and is the priority of the Breast Health Global Initiative in LMIC (4). Enabling early detection requires educational efforts to increase awareness, as well as efforts to improve access to imaging of breast lumps and other symptoms, pathology, and treatment. US can play a critical role in early detection, with resulting social and economic benefits of more effective, less invasive treatment and improved outcomes.

The most common breast lumps (ie, cysts, fibroadenomas, and cancers) are usually distinct with US, with fewer than 10% representing malignancy in women under age 55 years (5). In an analysis of 3799 consecutively presenting breast cancers in women with symptoms, US was more sensitive than mammography for women under age 48 years (6). A study of 954 women between the ages of 30 and 39 years with 1208 symptomatic areas showed high sensitivity of US at 95.7% (22 of 23 malignancies) and negative predictive value of 99.9%, while adjunct mammography provided minimal additional value, helping to detect only one additional malignancy; the authors concluded that US should be the primary method of evaluation (7).

US of palpable breast masses, as currently performed, requires appropriate equipment and trained professional staff for performance and interpretation. A handheld portable US device that could provide a stand-alone, low-cost solution without requiring highly trained medical professionals would be of great utility in LMIC and in remote, rural locations with limited resources. Preliminary evidence indicates that, with minimal training, ancillary medical staff (eg, nurses, medical students, and physicians outside of radiology) can obtain adequate breast US images with use of a low-cost portable US device (8). Such images could be interpreted remotely by a radiologist, locally by artificial intelligence (AI), or both. Several studies have shown that AI can provide breast US mass classification on par with a specialist radiologist (9,10) or even better (11).

In LMIC, most women present with either an obvious locally advanced breast cancer or a palpable lump detected with breast self-examination or clinical breast examination. While breast cancer requires referral to a hospital for treatment, the vast majority (80%–90%) of palpable lumps are benign. A woman is often seen in a remote, under-resourced rural center for a breast examination and then may wait months for diagnostic imaging and biopsy at a referral center. If her lump can be shown to be typically benign with US, such as a cyst or normal tissue, then this could obviate hospital referral and increase availability of limited resources for other women who actually have cancer.

The purpose of this study was to demonstrate that AI software applied to breast US images obtained on low-cost portable equipment and also by minimally trained observers could accurately classify palpable breast masses for triage in a low-resource setting.

Materials and Methods

GE HealthCare provided two Vscan US units used in this study, and Koios Medical processed all US images without human intervention. Authors who are not affiliated with these companies retained full control of data and information submitted for publication.

In a prospective multicenter trial, we sought to enroll 500 women older than 18 years, each with at least one palpable breast lump. From December 11, 2017, through May 21, 2021, women were recruited in an institutional review board–approved Health Insurance Portability and Accountability Act–compliant protocol and provided written, informed consent at one of two centers in Mexico: Hospital Valentin Gomez Farias in Zapopan, Jalisco, Mexico, or Hospital General de Tijuana in Tijuana, Mexico. The Tijuana site was closed after enrolling nine patients, because they were not able to provide supporting data. Enrollment was suspended from May 14, 2019, through December 12, 2020—first, to amend the protocol to allow minimally trained personnel to obtain images, then in May 2020 due to COVID-19.

Targeted US was performed twice on each participant. First, orthogonal images with and without calipers were obtained of breast masses with use of the low-cost portable Vscan Extend laptop–based US unit equipped with a 2.9-cm, 8.0–3.3 MHz linear-array transducer. If the area of the palpable lump appeared to represent normal variant fatty lobulation or underlying dense fibroglandular tissue, no images were documented. If incidental adjacent nonpalpable breast masses were seen, they were also documented on both US systems. The first 376 women were scanned by a specialist breast imaging radiologist (J.C.L.P., with 5 years of experience). The subsequent 102 women were scanned by one of two nonphysician research coordinators who had been trained in use of the portable US device with use of a previously validated 30-minute PowerPoint (Microsoft) presentation detailed by Love et al (8). Second, all women were also scanned by the specialist radiologist (J.C.L.P.) with use of a high-end cart-based SOC device (Hi-Vision Avius, equipped with a 5-cm, 13–5 MHz linear-array transducer; Hitachi Medical), and orthogonal images with and without calipers were obtained.

For each breast mass, the specialist radiologist recorded patient age, race, and ethnicity; whether or not a given mass was the palpable “index” lesion; its maximal diameter; and a Breast Imaging Reporting and Data System (BI-RADS) final assessment (12), as follows: 1, negative; 2, benign; 3, probably benign; 4A, low suspicion; 4B, moderate suspicion; 4C, high suspicion; or 5, highly suggestive of malignancy.

US AI

Deidentified paired US images from both Vscan and Hitachi imaging were processed by Koios DS version 3.x (Koios Medical). If there were more than four images of a given lesion, all were processed. There were no cine loops. This AI software automatically segments lesions and can make use of calipers with images to confirm lesion boundaries. System-generated outputs are benign, probably benign, suspicious, and malignant (corresponding to BI-RADS 1 or 2, BI-RADS 3, BI-RADS 4A or 4B, and BI-RADS 4C or 5, respectively). A numeric quantitative score from 0 to 1 is also output by the software, which approximates relative risk of malignancy. An assessment of benign or probably benign for a malignant lesion was considered a false negative. The AI used in this study has been trained with more than 700 000 images from 40 clinical sites, representing all major manufacturers, including 17 different models and a wide range of transducer frequencies, and this software is currently embedded on all SOC breast US equipment from GE HealthCare and on many picture archiving and communication systems.
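The correspondence between AI outputs and BI-RADS categories, and the false-negative rule stated above, can be written out explicitly. The following is a minimal Python sketch for illustration only; it is not the Koios DS implementation, and the names `AIOutput`, `BIRADS_EQUIVALENT`, `TEST_POSITIVE`, and `classify_outcome` are hypothetical.

```python
from enum import Enum


class AIOutput(Enum):
    """The four categorical outputs described in the text."""
    BENIGN = "benign"
    PROBABLY_BENIGN = "probably benign"
    SUSPICIOUS = "suspicious"
    MALIGNANT = "malignant"


# BI-RADS categories that each AI output corresponds to, as stated in the text.
BIRADS_EQUIVALENT = {
    AIOutput.BENIGN: ("1", "2"),
    AIOutput.PROBABLY_BENIGN: ("3",),
    AIOutput.SUSPICIOUS: ("4A", "4B"),
    AIOutput.MALIGNANT: ("4C", "5"),
}

# Outputs counted as a positive test. Benign and probably benign are negative,
# so either of those outputs for a malignant lesion is a false negative.
TEST_POSITIVE = {AIOutput.SUSPICIOUS, AIOutput.MALIGNANT}


def classify_outcome(output: AIOutput, is_malignant: bool) -> str:
    """Return TP, FN, FP, or TN for one lesion given its reference standard."""
    positive = output in TEST_POSITIVE
    if is_malignant:
        return "TP" if positive else "FN"
    return "FP" if positive else "TN"
```

For example, `classify_outcome(AIOutput.PROBABLY_BENIGN, True)` yields `"FN"`, matching the rule that a probably benign output for a malignant lesion counts against sensitivity.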

Exclusions and Multiple Lesions per Participant

We excluded participants with nonbreast malignancies, incomplete data, data mismatch between the radiology database and pathology reports, or missing images (some Hitachi images were exported to discs, deleted from clinical systems, and the discs became unrecoverable). Because AI software has not yet been developed to assess lymph nodes, skin lesions, calcifications, scars, or normal variants, those lesion types were excluded. Lesions reported to have a horizontal diameter larger than the 29-mm horizontal field of view of the portable US transducer were reviewed and excluded if the entire lesion could not be included with a single image (ie, when multiple tiled images were required to include all margins), as the AI software was not trained for this circumstance. BI-RADS 3 masses without follow-up or with suspicious change noted at follow-up but no biopsy results available were also excluded.

Because multiple lesions per participant can influence results, given that other masses depicted at US in a woman with a synchronous malignancy have a higher likelihood of malignancy (13), we analyzed results in two ways. First, at the participant level, we considered only a single mass per participant, prioritizing palpable masses. If there was more than one palpable mass, we retained the malignant one. If there was more than one palpable malignancy, we arbitrarily retained the first one listed in the database by site investigators. Second, we evaluated lesion-level results. For all malignancies assessed as benign or probably benign by AI or the radiologist, a radiologist (W.A.B., with 30 years of experience in breast imaging) reviewed the images.
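The participant-level selection rule can be sketched in a few lines of Python. This is an illustrative reading of the rule, not the study's code: the `Lesion` type and `select_index_lesion` function are hypothetical, and two fallbacks are assumptions not stated in the text (when no lesion is palpable, the same malignant-first rule is applied to all lesions; when no candidate is malignant, the first listed lesion is kept).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Lesion:
    lesion_id: str
    palpable: bool
    malignant: bool


def select_index_lesion(lesions: Sequence[Lesion]) -> Lesion:
    """Pick one lesion per participant; `lesions` must be in database order."""
    if not lesions:
        raise ValueError("participant has no analyzable lesions")
    # Step 1: prioritize palpable masses.
    palpable = [lesion for lesion in lesions if lesion.palpable]
    # Assumption: fall back to all lesions when none is palpable.
    pool = palpable if palpable else list(lesions)
    # Step 2: among the candidates, retain a malignant mass if there is one.
    malignant = [lesion for lesion in pool if lesion.malignant]
    # Step 3: ties are broken by taking the first one listed.
    return (malignant if malignant else pool)[0]
```

Because a malignant palpable mass always wins the tie, the rule favors sensitivity-relevant lesions at the participant level, which is worth keeping in mind when comparing participant-level and lesion-level results.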

Statistical Analysis

The main purpose of this study was to assess the diagnostic accuracy of AI applied to portable US images; we expected to show at least 40% specificity. With an estimated sample size of 500 women and/or masses (10% malignant) and target sensitivity of at least 95%, we expected the lower limit of the 95% CI to be 0.88 for sensitivity and, if we observed 45% specificity, 0.40 for specificity. A sample size of 450 benign cases was estimated to have greater than 80% power to detect a difference in specificity of 0.55 versus 0.5, allowing no more than 13% discordance between portable US and SOC US with use of a two-sided McNemar test with α = .05. PASS 2021 (NCSS) was used for power calculations.

Sensitivity, specificity, negative predictive value, and area under the receiver operating characteristic curve (AUC) were determined. As per the guidance chapter of BI-RADS fifth edition for diagnostic breast imaging (14), a benign or probably benign assessment for a malignant lesion was considered a false negative when reporting sensitivity. Sensitivity, specificity, and negative predictive value were estimated with use of hierarchical Poisson regression with generalized estimating equations (15). The differences between modalities were compared with use of additive generalized estimating equations models for the absolute differences between rates. For AUC, we used the numeric output from the AI software or categorical ordinal BI-RADS assessments given by the radiologists. Evaluation of AUCs was based on the nonparametric method by DeLong et al (16). We analyzed the subsets of lesions imaged with portable US by the radiologist versus those imaged by minimally trained research coordinators. Statistical calculations were performed with use of SAS 9.4 or R (RStudio, version 1.4.1717 [2021]).
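For readers unfamiliar with the nonparametric AUC, the point estimate is the probability that a randomly chosen malignant lesion receives a higher score than a randomly chosen benign lesion, with ties counted as one half. The sketch below computes only that point estimate; it does not reproduce the DeLong variance calculation or the study's SAS or R code, and the function name `empirical_auc` is hypothetical.

```python
from typing import Sequence


def empirical_auc(malignant_scores: Sequence[float],
                  benign_scores: Sequence[float]) -> float:
    """Mann-Whitney estimate of the AUC from two groups of scores."""
    if not malignant_scores or not benign_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for m in malignant_scores:
        for b in benign_scores:
            if m > b:
                wins += 1.0   # malignant lesion ranked above benign lesion
            elif m == b:
                wins += 0.5   # ties count as half
    return wins / (len(malignant_scores) * len(benign_scores))
```

The same function works for the 0 to 1 numeric AI score and for ordinal BI-RADS assessments once the categories are coded as increasing numbers; ordinal input simply produces more ties.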

Results

From December 11, 2017, through May 21, 2021, US images were documented for 1216 breast masses, with 126 malignant (10.4%), in 478 Hispanic women. After exclusions detailed in Figure 1, the final lesion-level analysis set included 758 masses from 300 women, with an average age (±SD) of 50.0 years ± 12.5 (range, 18–92 years). Of the 758 masses, the average largest diameter was 13 mm ± 8 (range, 2–54 mm) and 360 (47.5%) were palpable; 56 of 758 (7.4%) were malignant, as were 41 of 360 (11.4%) of the subset of palpable masses. Among the 56 malignancies, 50 (89%) were invasive ductal carcinoma and six were ductal carcinoma in situ. Benign lesions are detailed in Appendix S1. For the 300 index lesions, the average largest diameter was 15 mm ± 9 (range, 2–54 mm), 167 (55.7%) were palpable, and 49 (16.3%) were malignant.

Figure 1:

Flowchart shows study population, exclusions, and final analysis set. Multiple lesions per participant were excluded. The palpable mass was retained. If there were multiple palpable masses, the malignant mass was retained. If there were multiple palpable malignancies, we arbitrarily retained the first lesion listed by the site radiologist. BI-RADS = Breast Imaging Reporting and Data System.

Sensitivity and Specificity

Table 1 and Figure S1 (Appendix S1) detail the performance of the AI system. At the participant level, of 49 women with cancer, AI accurately identified 47 (96%, portable US) or 48 (98%, SOC US) and could have triaged 96 (38%, portable US) or 168 (67%, SOC US) of 251 women with benign lesions to routine care. At the lesion level, 53 of 56 malignancies (95%) and up to 554 of 702 benign masses (79%) were correctly classified, with an AUC of 0.95, compared with radiologist AUC of 0.98 (P = .06). There were four unique malignant masses (two palpable and two nonpalpable) assessed as benign or probably benign by the AI software, two misclassified with both portable and SOC US, one with portable US only, and one with SOC US only. On review, each was a circumscribed, oval, hypoechoic mass, and three of the four were low nuclear grade ductal carcinoma in situ (Fig 2). The fourth malignancy misclassified as probably benign by the AI was a grade 3 invasive ductal carcinoma, which was a second, nonindex oval circumscribed mass with internal vascularity and posterior enhancement in a woman with a spiculated invasive ductal carcinoma elsewhere in the same breast (Fig 3); this false-negative mass was excluded from the participant-level analysis. Incidentally, a fifth malignancy due to ductal carcinoma in situ and resembling a cyst was lacking SOC US images for review and, therefore, excluded from the final analysis set; it was misclassified as BI-RADS 3, probably benign, by the radiologist and assessed as suspicious by AI.

Table 1:

Performance of AI on Breast Masses Imaged with Portable Low-Cost US and Minimally Trained Observers versus SOC Equipment and Performance by a Specialist Radiologist

Figure 2:

Images in a 37-year-old woman show a palpable mass due to low-grade ductal carcinoma in situ. (A) Orthogonal portable US images show a hypoechoic oval mass with subtly indistinct margins (arrows), assessed as probably benign by artificial intelligence (AI). (B) Orthogonal standard-of-care US images show focal microlobulation (arrow) and were assessed as suspicious with AI and as Breast Imaging Reporting and Data System 4A by the radiologist. US-guided core biopsy and excision showed estrogen and progesterone receptor positive low-grade ductal carcinoma in situ.

Figure 3:

Images in a 60-year-old woman show two palpable masses in the outer right breast. (A) Orthogonal portable US images of mass in right breast at 8 o’clock axis, 4 cm from the nipple, assessed as suspicious by artificial intelligence (AI). (B) Orthogonal standard-of-care (SOC) US images of the same 19-mm mass show circumscribed margins and posterior enhancement. SOC images of the mass were assessed as probably benign by AI and Breast Imaging Reporting and Data System (BI-RADS) 4A, low suspicion, by the radiologist. (C) Color Doppler US image shows strong internal vascularity. Doppler images are not currently evaluated by AI. (D) Orthogonal US images of second palpable mass in right breast at 8 o’clock axis, 6 cm from the nipple, show an irregular 17-mm hypoechoic spiculated mass with posterior shadowing, assessed as probably malignant by AI and BI-RADS 5 by the radiologist. Histopathologic examination of both masses showed grade 3 invasive ductal carcinoma that was estrogen and progesterone receptor positive and human epidermal growth factor receptor 2 negative.

Of 702 benign masses, 554 (79%) imaged with SOC US were correctly assessed as benign or probably benign by AI, including 43 masses that were considered suspicious (BI-RADS 4A by the radiologist) and biopsied clinically. Specificity was much lower for AI with use of the portable US images, at 340 of 702 benign masses (48%, P < .001).
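The percentages in this section follow directly from the reported counts. The short Python check below recomputes them as crude proportions; the study itself estimated these rates with regression models that account for multiple lesions per woman, so this is an arithmetic check rather than a reproduction of the analysis. The names `percent` and `REPORTED` are hypothetical.

```python
def percent(numerator: int, denominator: int) -> float:
    """Crude proportion expressed as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


# (numerator, denominator) pairs exactly as reported in the text.
REPORTED = {
    "participant-level sensitivity, portable US": (47, 49),
    "participant-level sensitivity, SOC US": (48, 49),
    "participant-level specificity, portable US": (96, 251),
    "participant-level specificity, SOC US": (168, 251),
    "lesion-level sensitivity, SOC US": (53, 56),
    "lesion-level specificity, SOC US": (554, 702),
    "lesion-level specificity, portable US": (340, 702),
}

for label, (k, n) in REPORTED.items():
    print(f"{label}: {k}/{n} = {percent(k, n):.1f}%")
```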

Operator Dependence

Considering the subset of 204 women with portable US images obtained by the radiologist, there were 603 analyzable lesions (35 [5.8%] were malignant). The AUC of AI was 0.98, sensitivity was 97%–100% (34 or 35 of 35 malignant lesions), and specificity was 52%–80% (296–457 of 568 benign lesions) (Table 2). These results were significantly better (all P < .001) than for portable US images obtained by minimally trained research coordinators in the subset of 155 analyzable lesions (21 [13.5%] were malignant) in 96 women. The AUC of AI for this subset was 0.78, sensitivity was 86% (18 of 21 malignant lesions), and specificity was 33% (44 of 134 benign lesions). Results from images obtained with portable US were generally not different from those with SOC equipment when distinguished by operator, except that specificity remained significantly lower with portable US images (Table 2).
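A 2 × 2 table can be reconstructed for each operator subset from the counts given above, which also yields the negative predictive value. The sketch below is a crude, unadjusted calculation under two assumptions: that the counts can be treated as independent lesions, and that the 35 of 35 and 296 of 568 values are the portable US results for the radiologist subset, as the Key Results indicate. The names `TwoByTwo` and `from_reported` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    tp: int  # malignant, AI suspicious or malignant
    fn: int  # malignant, AI benign or probably benign
    tn: int  # benign, AI benign or probably benign
    fp: int  # benign, AI suspicious or malignant

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def npv(self) -> float:
        """Share of AI-negative lesions that are truly benign."""
        return self.tn / (self.tn + self.fn)


def from_reported(detected: int, malignant: int,
                  cleared: int, benign: int) -> TwoByTwo:
    """Build the table from 'x of y malignant' and 'x of y benign' counts."""
    return TwoByTwo(tp=detected, fn=malignant - detected,
                    tn=cleared, fp=benign - cleared)


# Portable US images obtained by research coordinators: 18 of 21, 44 of 134.
coordinators = from_reported(18, 21, 44, 134)
# Portable US images obtained by the radiologist (assumed): 35 of 35, 296 of 568.
radiologist = from_reported(35, 35, 296, 568)

print(f"coordinators NPV: {100 * coordinators.npv:.1f}%")
print(f"radiologist NPV:  {100 * radiologist.npv:.1f}%")
```

With three false negatives among 47 AI-negative lesions, the coordinator subset gives 44 of 47, and with no false negatives the radiologist subset gives 296 of 296.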

Table 2:

Comparison of AI Performance with Breast US Images Obtained by Observers with Differing Experience

Discussion

In this analysis, 47 or 48 of 49 women (96%–98%) with cancer depicted by use of US and 96–168 of 251 women (38%–67%) without cancer would have been triaged appropriately by Koios DS artificial intelligence (AI) software, with better performance with standard-of-care (SOC) US equipment than with low-cost portable US. By analyzing 758 breast masses in women with findings seen with US, we found AI in isolation performed well, with the area under the receiver operating characteristic curve (AUC) exceeding 0.95 with images obtained with SOC US equipment and sensitivity of 95% or higher. Importantly, nearly equivalent AUC and sensitivity, at 95% or higher, but lower specificity, was observed for AI applied to images obtained by a radiologist with low-cost portable US. AI applied to images obtained by minimally trained research coordinators with low-cost portable US showed suboptimal performance, indicating the need for greater training of personnel. Training of the AI software with images from low-cost portable US should improve specificity.

The few malignancies erroneously assessed as benign or probably benign by AI in our study did have benign features, and all but one were ductal carcinoma in situ. Delay in diagnosis of low-grade ductal carcinoma in situ is unlikely to adversely affect patient outcomes. The AI software has not been trained on a sufficient volume of US masses due to ductal carcinoma in situ—this is an area for improvement. The one misclassified invasive ductal carcinoma was a circumscribed mass, ipsilateral to a second spiculated mass due to invasive ductal carcinoma and showed strong internal vascularity. Doppler results are not currently considered by AI. The otherwise outstanding performance of the software on invasive breast cancer, even with low-quality images, is highly reassuring.

The current inability to easily triage women with breast lumps appears to be a generalizable challenge in LMICs, with wait times as long as 6 months for evaluation in some settings. An accurate noninvasive portable US device that could be used locally by a minimally trained healthcare worker, concurrent with a clinical breast examination confirming the lump, would ensure that the 10%–20% of women with potentially malignant lesions could be referred urgently for biopsy while a high percentage of women with benign lesions could be safely reassured. Such an approach would ensure that limited trained healthcare personnel and financial resources are focused on those women who would benefit from earlier management and treatment. With use of the handheld portable US images in conjunction with AI, in this study, 33%–52% of women with benign breast lumps corresponding to masses with US could have been triaged remotely, with negative predictive values of 93.6%–100%, with better performance with images obtained by the radiologist and with images from SOC equipment. Use of AI is a much more easily implemented approach than extensive training with breast US, such as the 7-week training of nurses and general practitioners in Rwanda to include interpretation (17). We did find, however, that images obtained by minimally trained observers were not optimal for diagnostic assessment—greater emphasis on US technique is needed. It was reassuring that a specialist could obtain equally diagnostic images even with a low-cost portable US device as with a high-end cart-based system.

The AI used in this study, Koios DS, is intended as decision support to a radiologist and, as stated, has been trained and validated with more than 700 000 breast masses from over 40 centers, representing 17 current high-end breast US platforms from all major US manufacturers. The AI stand-alone performance has previously been shown to be similar to that of specialist radiologists (9,10), with an AUC of 0.77 on a relatively challenging case set of 319 lesions found with screening US (9). Mango et al (10) showed the greatest beneficial impact of AI on physicians outside of radiology on 900 cases, with a stand-alone performance AUC of 0.876 with low-frequency transducer US images not different from an AUC of 0.893 with images from a high-frequency transducer.

There are other AI tools developed for breast US. Shen et al (11) developed and validated AI with stand-alone AUC of 0.976 on a test set of more than 44 000 examinations. When radiologists retrospectively reviewed images with this software, false-positive recalls decreased by 37% and benign biopsies by nearly 28%. S-Detect software (Samsung Medison) performs BI-RADS feature extraction with outputs of possibly malignant or possibly benign. Inexperienced radiologists benefit most from this approach (18), with greatest improvements in specificity (19–23).

Current AI assessments of each lesion are independent and do not consider the influence of concurrent lesions elsewhere in the same woman. In the American College of Radiology Imaging Network 6666 protocol, multiple bilateral circumscribed oval masses (assessed as a "single" overall finding) could be safely assessed as benign, BI-RADS 2, with 0 of 153 such lesions malignant (95% CI: 0, 2.4) in 135 women (24). Among these 135 women, 82 also had a solitary suspicious mass, with two of those malignant. In a patient with concurrent malignancy, otherwise BI-RADS 3 masses overall had an 11% rate of malignancy in the series by Kim et al (13). When in the same quadrant as the known cancer, 21.2% (36 of 170 masses) were malignant, as were 9.8% of ipsilateral masses (12 of 122) in a different quadrant and 4.2% (eight of 190) in the contralateral breast. Augmenting current AI to consider concurrent breast lesions in the ipsilateral or contralateral breast may improve performance.
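The location-specific rates quoted from Kim et al can be checked, and pooled, from the counts given in the text. The Python snippet below is an arithmetic check only; the pooled value covers just the three groups itemized here, so it need not equal the overall rate reported in that series. The name `KIM_BIRADS3` is hypothetical.

```python
# (malignant, total) BI-RADS 3 masses by location relative to the known cancer,
# as quoted in the text from Kim et al (13).
KIM_BIRADS3 = {
    "same quadrant": (36, 170),
    "ipsilateral, different quadrant": (12, 122),
    "contralateral breast": (8, 190),
}

for location, (malignant, total) in KIM_BIRADS3.items():
    print(f"{location}: {malignant}/{total} = {100 * malignant / total:.1f}%")

total_malignant = sum(m for m, _ in KIM_BIRADS3.values())
total_masses = sum(t for _, t in KIM_BIRADS3.values())
pooled_percent = 100 * total_malignant / total_masses
print(f"pooled: {total_malignant}/{total_masses} = {pooled_percent:.1f}%")
```

The pooled figure from these three groups is 56 of 482 masses, in line with the roughly 11% overall rate cited, and the gradient by location is what motivates letting AI consider concurrent lesions.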

There were limitations to our study. The Vscan Extend US system was never approved by the U.S. Food and Drug Administration for clinical breast US work, and there was no training of the Koios DS algorithms with such images. The low spatial resolution of the transducer used, combined with the lack of system training of the software, likely explain the reduced specificity observed with images obtained on this portable system. GE HealthCare has since updated this portable handheld platform, and the new Vscan Air is equipped with a wireless higher frequency (L3–12 MHz), wider footprint (40 mm) transducer that is approved by the Food and Drug Administration for breast imaging. These system specifications are now similar to the SOC equipment used in American College of Radiology Imaging Network 6666 protocol (25) and should improve diagnostic performance of both the radiologist and the AI and allow better imaging of larger masses. There are other handheld low-cost wireless US systems with similar specifications currently available. The radiologist’s performance in this study appears artificially high, with 100% sensitivity and specificity as high as 87%, in part because of lack of follow-up and exclusion of many masses assessed as negative, benign, or probably benign. Portable US images were not interpreted by a radiologist, so we do not know how the AI performance with those images compares to that of a radiologist. We did not include risk factors, clinical features such as patient age, or findings such as skin retraction or nipple discharge. Doppler and elastography are not currently evaluated with the AI software, and we excluded lymph nodes, skin lesions, and normal tissue areas as the software has not yet been trained on those types of findings.

In conclusion, radiologists using low-cost portable handheld US can generate images of breast masses adequate for accurate artificial intelligence (AI) classification. Although specificity was less than with standard-of-care equipment, AI applied to portable breast US can potentially reduce about half of unnecessary referrals for benign lesions in resource-limited regions. These favorable results were observed despite lack of training of the AI software with images from the device used, and current portable US has improved specifications. We did not show that untrained observers could produce adequately diagnostic images. Additional training of affiliated healthcare workers with image acquisition, improved equipment, and further system training of AI software on such masses is expected to further improve overall performance and allow effective triage of women with palpable lumps in low- and middle-income countries.

Acknowledgments

The authors are grateful to GE HealthCare for providing the Vscan portable US systems used in acquisition of images and to Koios Medical for providing the AI software.

Supported by the National Cancer Institute (UH3CA189966) and the Dr Susan Love Research Foundation.

Disclosures of conflicts of interest: W.A.B. Received grant support to the Department of Radiology from Koios Medical for a separate study where she is the principal investigator; voluntary Chief Scientific Advisor to DenseBreast-info.org. A.L.L.A. No relevant relationships. A.J. Employee of Koios Medical. J.C.L.P. No relevant relationships. C.Y.G. No relevant relationships. R.C.M. Employee of Koios Medical. S.Y.C. No relevant relationships. L.H.L. No relevant relationships. M.T.S.d.L. No relevant relationships. S.L. No relevant relationships.

Abbreviations:

AI = artificial intelligence
AUC = area under the receiver operating characteristic curve
BI-RADS = Breast Imaging Reporting and Data System
LMIC = low- and middle-income countries
SOC = standard of care

References

1. Anderson BO, Distelhorst SR. Guidelines for International Breast Health and Cancer Control--Implementation. Introduction. Cancer 2008;113(8 Suppl):2215–2216.
2. Palacio-Mejía LS, Lazcano-Ponce E, Allen-Leigh B, Hernández-Ávila M. Regional differences in breast and cervical cancer mortality in Mexico between 1979-2006 [in Spanish]. Salud Publica Mex 2009;51(Suppl 2):s208–s219.
3. Lei S, Zheng R, Zhang S, et al. Global patterns of breast cancer incidence and mortality: A population-based cancer registry data analysis from 2000 to 2020. Cancer Commun (Lond) 2021;41(11):1183–1194.
4. Ginsburg O, Yip CH, Brooks A, et al. Breast cancer early detection: A phased approach to implementation. Cancer 2020;126(Suppl 10):2379–2393.
5. Sterns EE. Age-related breast diagnosis. Can J Surg 1992;35(1):41–45.
6. Houssami N, Ciatto S, Irwig L, Simpson JM, Macaskill P. The comparative sensitivity of mammography and ultrasound in women with breast symptoms: an age-specific analysis. Breast 2002;11(2):125–130.
7. Lehman CD, Lee CI, Loving VA, Portillo MS, Peacock S, DeMartini WB. Accuracy and value of breast ultrasound for primary imaging evaluation of symptomatic women 30-39 years of age. AJR Am J Roentgenol 2012;199(5):1169–1177.
8. Love SM, Berg WA, Podilchuk C, et al. Palpable Breast Lump Triage by Minimally Trained Operators in Mexico Using Computer-Assisted Diagnosis and Low-Cost Ultrasound. J Glob Oncol 2018;4(4):1–9.
9. Berg WA, Gur D, Bandos AI, et al. Impact of Original and Artificially Improved Artificial Intelligence–based Computer-aided Diagnosis on Breast US Interpretation. J Breast Imaging 2021;3(3):301–311.
10. Mango VL, Sun M, Wynn RT, Ha R. Should We Ignore, Follow, or Biopsy? Impact of Artificial Intelligence Decision Support on Breast Ultrasound Lesion Assessment. AJR Am J Roentgenol 2020;214(6):1445–1452.
11. Shen Y, Shamout FE, Oliver JR, et al. Artificial intelligence system reduces false-positive findings in the interpretation of breast ultrasound exams. Nat Commun 2021;12(1):5645.
12. Mendelson EB, Böhm-Vélez M, Berg WA, et al. ACR BI-RADS Ultrasound. ACR BI-RADS Atlas, Breast Imaging Reporting and Data System. Reston, Va: American College of Radiology, 2013.
13. Kim SJ, Ko EY, Shin JH, et al. Application of sonographic BI-RADS to synchronous breast nodules detected in patients with breast cancer. AJR Am J Roentgenol 2008;191(3):653–658.
14. Sickles EA, D’Orsi CJ. Follow-up and outcome monitoring. ACR BI-RADS Atlas, Breast Imaging Reporting and Data System. Reston, Va: American College of Radiology, 2013.
15. Sternberg MR, Hadgu A. A GEE approach to estimating sensitivity and specificity and coverage properties of the confidence intervals. Stat Med 2001;20(9-10):1529–1539.
16. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44(3):837–845.
17. Pace LE, Dusengimana JV, Hategekimana V, et al. Clinical Diagnoses and Outcomes After Diagnostic Breast Ultrasound by Nurses and General Practitioner Physicians in Rural Rwanda. J Am Coll Radiol 2022;19(8):983–989.
18. Park HJ, Kim SM, La Yun B, et al. A computer-aided diagnosis system using artificial intelligence for the diagnosis and characterization of breast masses on ultrasound: Added value for the inexperienced breast radiologist. Medicine (Baltimore) 2019;98(3):e14146.
19. Choi JS, Han BK, Ko ES, et al. Effect of a Deep Learning Framework-Based Computer-Aided Diagnosis System on the Diagnostic Performance of Radiologists in Differentiating between Malignant and Benign Masses on Breast Ultrasonography. Korean J Radiol 2019;20(5):749–758.
20. Choi JH, Kang BJ, Baek JE, Lee HS, Kim SH. Application of computer-aided diagnosis in breast ultrasound interpretation: improvements in diagnostic performance according to reader experience. Ultrasonography 2018;37(3):217–225.
21. Wu JY, Zhao ZZ, Zhang WY, et al. Computer-Aided Diagnosis of Solid Breast Lesions With Ultrasound: Factors Associated With False-negative and False-positive Results. J Ultrasound Med 2019;38(12):3193–3202.
22. Kim S, Choi Y, Kim E, et al. Deep learning-based computer-aided diagnosis in screening breast ultrasound to reduce false-positive diagnoses. Sci Rep 2021;11(1):395.
23. Nicosia L, Addante F, Bozzini AC, et al. Evaluation of computer-aided diagnosis in breast ultrasonography: Improvement in diagnostic performance of inexperienced radiologists. Clin Imaging 2022;82:150–155.
24. Berg WA, Zhang Z, Cormack JB, Mendelson EB. Multiple bilateral circumscribed masses at screening breast US: consider annual follow-up. Radiology 2013;268(3):673–683.
25. Berg WA, Blume JD, Cormack JB, et al. Combined screening with ultrasound and mammography vs mammography alone in women at elevated risk of breast cancer. JAMA 2008;299(18):2151–2163.

Articles from Radiology are provided here courtesy of Radiological Society of North America
