Deep Learning Assisted Sparse Array Ultrasound Imaging

Baiyan Qi, Xinyu Tian, Lei Fu, Yi Li, Kai San Chan, Chuxuan Ling, Wonjun Yim, Shiming Zhang, Jesse V. Jokerst

Abstract
This study aims to restore images degraded by grating lobe artifacts and to improve the image resolution of sparse array ultrasonography via a deep learning predictive model. A deep learning-assisted sparse array was developed using only 64 or 16 of the 128 channels, such that the pitch is two or eight times that of the original array. The deep learning-assisted sparse array imaging system was demonstrated on ex vivo porcine teeth. The 64- and 16-channel sparse array images were used as the input, and the corresponding 128-channel dense array images were used as the ground truth. The structural similarity index measure (SSIM), mean squared error (MSE), and peak signal-to-noise ratio (PSNR) of the predicted images improved significantly (p < 0.0001). The resolution of the predicted images was close to that of the ground truth images (0.18 mm and 0.15 mm versus 0.15 mm).

Introduction
Two-dimensional (2D) and three-dimensional (3D) ultrasonography has been widely applied to imaging tissues and organs for diagnosing diseases because it is a real-time, non-invasive, portable, and radiation-free tool. Ultrasound transducers are typically composed of multiple elements arranged in a linear or 2D array pattern for 2D and 3D imaging, respectively. Each element is controlled individually by an electrical channel and serves as a transmitter and receiver of ultrasound waves, which interfere with each other to form the ultrasound beam. At a given center frequency, image resolution is governed by the pitch, i.e., the spacing between individual elements in the transducer array, and by the number of transducer elements. The pitch is usually designed to be one half of the ultrasonic wavelength in phased array ultrasound transducers and close to one ultrasonic wavelength in linear array ultrasound transducers.
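As a quick numerical illustration of these design rules, the sketch below computes the wavelength and the corresponding phased- and linear-array pitch guidelines; the center frequency and speed of sound are illustrative assumptions, not parameters of this study.

```python
# Minimal sketch: wavelength and pitch rules of thumb for array transducers.
# The center frequency and speed of sound below are illustrative assumptions,
# not values taken from this study.

SPEED_OF_SOUND = 1540.0   # m/s, typical soft-tissue value
CENTER_FREQ = 6.0e6       # Hz, hypothetical center frequency

wavelength = SPEED_OF_SOUND / CENTER_FREQ   # ~0.257 mm at 6 MHz
phased_array_pitch = wavelength / 2          # ~lambda/2 for phased arrays
linear_array_pitch = wavelength              # ~lambda for linear arrays

print(f"wavelength         : {wavelength * 1e3:.3f} mm")
print(f"phased array pitch : {phased_array_pitch * 1e3:.3f} mm")
print(f"linear array pitch : {linear_array_pitch * 1e3:.3f} mm")
```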

Materials and methods
The CNN model was trained using 64- or 16-channel sparse array images as the input and 128-channel dense array images as the ground truth (i.e., reference). The performance of the CNN model was evaluated based on the improvement in image quality and the landmark localization accuracy of the predicted images. See the subsequent sections for details.
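For readers unfamiliar with this type of image-to-image training, the following is a minimal sketch, assuming PyTorch; the toy encoder-decoder merely stands in for the U-Net used in the study, and the tensor sizes, optimizer, and loss settings are illustrative assumptions rather than the study's actual configuration.

```python
# Minimal image-to-image training sketch (assumes PyTorch). The toy
# encoder-decoder below is a stand-in for the study's U-Net; shapes,
# optimizer, and loss are illustrative assumptions.
import torch
import torch.nn as nn

class ToyEncoderDecoder(nn.Module):
    """Maps a sparse-array B-mode image to a dense-array-like image."""
    def __init__(self):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 2, stride=2),
        )

    def forward(self, x):
        return self.decode(self.encode(x))

model = ToyEncoderDecoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.MSELoss()

# Hypothetical batch: sparse-array inputs (e.g., 64- or 16-channel images)
# paired with the matching 128-channel dense-array ground truth.
sparse_input = torch.rand(8, 1, 256, 256)
dense_target = torch.rand(8, 1, 256, 256)

for epoch in range(5):              # a real run would use many more epochs
    optimizer.zero_grad()
    prediction = model(sparse_input)
    loss = criterion(prediction, dense_target)
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: MSE loss = {loss.item():.4f}")
```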

Results
A representative image of a porcine tooth (4th premolar) reconstructed from the 128-channel transducer is shown in Fig 2B. Notably, while the anatomy remains visible in the 64-channel image (Fig 2C, left), grating lobe artifacts reduce the visibility of the anatomical features. For example, the alveolar bone crest (ABC) cannot be distinguished from the background, and the tooth surface below the cementoenamel junction (CEJ) merges with the artifacts. There are no detectable anatomical features in the 16-channel image (Fig 2D, left). The images predicted from the 64- and 16-channel inputs are presented in Fig 2C and 2D, right, respectively. Fig 2E shows the profiles along the yellow dashed line crossing the tooth surface for the 128-channel ground truth, 64-channel predicted, and 16-channel predicted images.
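The image-quality comparison between predicted and ground truth images (SSIM, MSE, and PSNR, as reported in the Abstract) can be computed as in the following sketch, assuming scikit-image and NumPy; the arrays below are placeholders rather than the study's data.

```python
# Minimal sketch of the reported image-quality metrics (SSIM, MSE, PSNR),
# assuming scikit-image and NumPy; the images are placeholders, not study data.
import numpy as np
from skimage.metrics import (structural_similarity,
                             mean_squared_error,
                             peak_signal_noise_ratio)

ground_truth = np.random.rand(256, 256)   # 128-channel dense-array image (placeholder)
predicted    = np.random.rand(256, 256)   # CNN-predicted image (placeholder)

ssim = structural_similarity(ground_truth, predicted, data_range=1.0)
mse  = mean_squared_error(ground_truth, predicted)
psnr = peak_signal_noise_ratio(ground_truth, predicted, data_range=1.0)

print(f"SSIM = {ssim:.3f}, MSE = {mse:.4f}, PSNR = {psnr:.2f} dB")
```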

Discussion
According to Huygens' principle, the pitch of a linear array ultrasound transducer must be close to one wavelength so that the wavelets emitted by the individual elements combine into a single wavefront. When the pitch is much larger than the wavelength, the interference of the ultrasound waves from the individual elements generates unwanted grating lobes and degrades image quality, which is the main limitation of sparse array imaging. In this study, a predictive CNN model was introduced to restore sparse array images and improve the image resolution. Owing to its strong capability for information analysis and optimization, the typical U-Net architecture has demonstrated effective image quality improvement when trained with limited data [41].
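For intuition about why under-sampling produces these artifacts, the sketch below evaluates the standard array relation sin(theta_m) = sin(theta_0) + m * lambda / d for wavelength-normalized pitches of 2 and 8; these normalized pitches are meant to illustrate the 64- and 16-channel cases under the roughly one-wavelength linear-array pitch noted in the Introduction, not measured values from this study.

```python
# Minimal sketch of where grating lobes appear for an under-sampled array,
# using sin(theta_m) = sin(theta_0) + m * lambda / pitch. The normalized
# pitches below (2 and 8 wavelengths) are illustrative assumptions.
import numpy as np

def grating_lobe_angles(pitch_in_wavelengths, steer_deg=0.0, max_order=8):
    """Return grating-lobe angles (degrees) that map to real propagation directions."""
    angles = []
    for m in range(-max_order, max_order + 1):
        if m == 0:
            continue                       # m = 0 is the main lobe
        s = np.sin(np.deg2rad(steer_deg)) + m / pitch_in_wavelengths
        if abs(s) <= 1.0:                  # only |sin| <= 1 is a physical angle
            angles.append(np.rad2deg(np.arcsin(s)))
    return sorted(angles)

for pitch in (2.0, 8.0):   # ~2x and ~8x undersampling (64- and 16-channel cases)
    lobes = ", ".join(f"{a:+.1f} deg" for a in grating_lobe_angles(pitch))
    print(f"pitch = {pitch:.0f} wavelengths -> grating lobes at {lobes}")
```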

Conclusions
This study reports a deep learning-assisted sparse array ultrasound imaging system that reduces fabrication cost, channel-control complexity, and electrical power consumption. The proposed sparse array system requires only 1/8 of the channels of a traditional ultrasound transducer to generate high-resolution images with quality comparable to that of the original dense array.

Citation: Qi B, Tian X, Fu L, Li Y, Chan KS, Ling C, et al. (2023) Deep learning assisted sparse array ultrasound imaging. PLoS ONE 18(10): e0293468. https://doi.org/10.1371/journal.pone.0293468

Editor: Xin Liu, Fudan University, CHINA

Received: July 12, 2023; Accepted: October 13, 2023; Published: October 30, 2023

Copyright: © 2023 Qi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Raw data are available from the Github repository (https://github.com/yrotisoper-cilbup/Raw-Data).

Funding: JVJ acknowledges NIH funding under R01 DE031307, R21 DE029025, and UL1 TR001442. (URL: https://www.nih.gov/) SZ acknowledges the Startup Fund and the Seed Funding for Strategic Interdisciplinary Research Scheme from the University of Hong Kong (HKU). (URL: https://www.hku.hk/) The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: Jesse V. Jokerst is co-founder of StyloSonic. This does not alter our adherence to PLOS ONE policies on sharing data and materials.