-
Nosey: Open-source hardware for acoustic nasalance
Authors:
Maya Dewhurst,
Jack Collins,
Justin J. H. Lo,
Roy Alderton,
Sam Kirkham
Abstract:
We introduce Nosey (Nasalance Open Source Estimation sYstem), a low-cost, customizable, 3D-printed system for recording acoustic nasalance data that we have made available as open-source hardware (http://github.com/phoneticslab/nosey). We first outline the motivations and design principles behind our hardware nasalance system, and then present a comparison between Nosey and a commercial nasalance device. Nosey shows consistently higher nasalance scores than the commercial device, but the magnitude of contrast between phonological environments is comparable between systems. We also review ways of customizing the hardware to facilitate testing, such as comparison of microphones and different construction materials. We conclude that Nosey is a flexible and cost-effective alternative to commercial nasometry devices and propose some methodological considerations for its use in data collection.
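The nasalance score that both Nosey and commercial nasometers report is conventionally the nasal channel's acoustic energy as a percentage of combined nasal-plus-oral energy. A minimal sketch of that computation (the function name is illustrative, and the band-pass filtering and frame-by-frame averaging a real nasometer applies are omitted):

```python
import numpy as np

def nasalance(nasal: np.ndarray, oral: np.ndarray) -> float:
    """Nasalance score: energy of the nasal channel as a percentage of
    total (nasal + oral) energy. Computed here over whole time-aligned
    signals for brevity; real nasometers band-limit each channel and
    average over short analysis frames."""
    nasal_energy = float(np.sum(nasal.astype(float) ** 2))
    oral_energy = float(np.sum(oral.astype(float) ** 2))
    return 100.0 * nasal_energy / (nasal_energy + oral_energy)

# Equal-energy channels give a score of exactly 50.
sig = np.sin(np.linspace(0.0, 2.0 * np.pi, 1000))
print(round(nasalance(sig, sig), 1))  # 50.0
```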
Submitted 29 May, 2025;
originally announced May 2025.
-
Joint Task Offloading and Channel Allocation in Spatial-Temporal Dynamic for MEC Networks
Authors:
Tianyi Shi,
Tiankui Zhang,
Jonathan Loo,
Rong Huang,
Yapeng Wang
Abstract:
Computation offloading and resource allocation are critical in mobile edge computing (MEC) systems to handle the massive and complex requirements of applications restricted by limited resources. In a multi-user multi-server MEC network, the mobility of terminals causes computing requests to be dynamically distributed in space. At the same time, the non-negligible dependencies among tasks in some specific applications impose temporal correlation constraints on the solution as well, leading time-adjacent tasks to experience varying resource availability and competition from parallel counterparts. To address the challenge that these dynamic spatial-temporal characteristics pose for the allocation of communication and computation resources, we formulate a long-term delay-energy trade-off cost minimization problem that jointly optimizes task offloading and resource allocation. We begin by designing a priority evaluation scheme to decouple task dependencies and then develop a grouped Knapsack problem for channel allocation considering the current data load and channel status. Afterward, in order to meet the rapid response needs of MEC systems, we exploit the dueling double deep Q-network (D3QN) to make offloading decisions and integrate channel allocation results into the reward as part of the dynamic environment feedback in D3QN, constituting the joint optimization of task offloading and channel allocation. Finally, comprehensive simulations demonstrate the performance of the proposed algorithm in terms of the delay-energy trade-off cost and its adaptability to various applications.
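The grouped Knapsack step for channel allocation can be illustrated with the standard multiple-choice knapsack dynamic program. The groups, weights, and values below are illustrative stand-ins (e.g., per-task candidate channel assignments), not the paper's exact model:

```python
def grouped_knapsack(groups, capacity):
    """Multiple-choice (grouped) knapsack DP: pick at most one
    (weight, value) item per group, total weight <= capacity,
    maximizing total value. Here a group could hold one task's
    candidate channel assignments, with weight standing in for
    bandwidth demand and value for cost reduction (illustrative)."""
    NEG = float("-inf")
    dp = [0.0] + [NEG] * capacity      # dp[w] = best value at total weight w
    for group in groups:
        new = dp[:]                    # carries over "skip this group"
        for w in range(capacity + 1):
            if dp[w] == NEG:
                continue
            for weight, value in group:
                if w + weight <= capacity and dp[w] + value > new[w + weight]:
                    new[w + weight] = dp[w] + value
        dp = new
    return max(dp)

tasks = [[(2, 3.0), (3, 4.0)],   # task 1: two candidate assignments
         [(2, 2.5), (4, 6.0)]]   # task 2
print(grouped_knapsack(tasks, 5))  # 6.5
```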
Submitted 7 May, 2025;
originally announced May 2025.
-
ST-HCSS: Deep Spatio-Temporal Hypergraph Convolutional Neural Network for Soft Sensing
Authors:
Hwa Hui Tew,
Fan Ding,
Gaoxuan Li,
Junn Yong Loo,
Chee-Ming Ting,
Ze Yang Ding,
Chee Pin Tan
Abstract:
Higher-order sensor networks are more accurate in characterizing the nonlinear dynamics of sensory time-series data in modern industrial settings by allowing multi-node connections beyond simple pairwise graph edges. In light of this, we propose a deep spatio-temporal hypergraph convolutional neural network for soft sensing (ST-HCSS). In particular, our proposed framework is able to construct and leverage a higher-order graph (hypergraph) to model the complex multi-interactions between sensor nodes in the absence of prior structural knowledge. To capture rich spatio-temporal relationships underlying sensor data, our proposed ST-HCSS incorporates stacked gated temporal and hypergraph convolution layers to effectively aggregate and update hypergraph information across time and nodes. Our results validate the superiority of ST-HCSS compared to existing state-of-the-art soft sensors, and demonstrate that the learned hypergraph feature representations align well with the sensor data correlations. The code is available at https://github.com/htew0001/ST-HCSS.git
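The hypergraph convolution layer type that such a framework stacks is commonly written as X' = ReLU(Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X Theta). A minimal NumPy sketch of that standard form, assuming uniform hyperedge weights (not the authors' implementation):

```python
import numpy as np

def hypergraph_conv(X, H, Theta):
    """One hypergraph convolution in the standard HGNN form,
        X' = ReLU(Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X Theta).
    X: (nodes, f_in) features, H: (nodes, edges) incidence matrix,
    Theta: (f_in, f_out) learnable weights. Hyperedge weights are
    assumed uniform; this is a sketch, not the authors' code."""
    Dv = np.diag(1.0 / np.sqrt(H.sum(axis=1)))  # node degree normalizer
    De = np.diag(1.0 / H.sum(axis=0))           # hyperedge degree normalizer
    A = Dv @ H @ De @ H.T @ Dv                  # normalized propagation
    return np.maximum(A @ X @ Theta, 0.0)

# 4 nodes, 2 hyperedges: {0, 1, 2} and {2, 3} (multi-node connections).
H = np.array([[1, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
out = hypergraph_conv(np.eye(4), H, np.ones((4, 2)))
print(out.shape)  # (4, 2)
```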
Submitted 2 January, 2025;
originally announced January 2025.
-
KANS: Knowledge Discovery Graph Attention Network for Soft Sensing in Multivariate Industrial Processes
Authors:
Hwa Hui Tew,
Gaoxuan Li,
Fan Ding,
Xuewen Luo,
Junn Yong Loo,
Chee-Ming Ting,
Ze Yang Ding,
Chee Pin Tan
Abstract:
Soft sensing of hard-to-measure variables is often crucial in industrial processes. Current practices rely heavily on conventional modeling techniques that show success in improving accuracy. However, they overlook the non-linear nature, dynamic characteristics, and non-Euclidean dependencies between complex process variables. To tackle these challenges, we present a framework known as a Knowledge discovery graph Attention Network for effective Soft sensing (KANS). Unlike existing deep learning soft sensor models, KANS can discover the intrinsic correlations and irregular relationships between multivariate industrial process variables without a predefined topology. First, an unsupervised graph structure learning method is introduced, incorporating the cosine similarity between different sensor embeddings to capture the correlations between sensors. Next, we present a graph attention-based representation learning method that can process the multivariate data in parallel to enhance the model in learning complex sensor nodes and edges. To fully explore KANS, knowledge discovery analysis has also been conducted to demonstrate the interpretability of the model. Experimental results demonstrate that KANS significantly outperforms all the baselines and state-of-the-art methods in soft sensing performance. Furthermore, the analysis shows that KANS can find sensors closely related to different process variables without domain knowledge, significantly improving soft sensing accuracy.
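The cosine-similarity graph structure learning step can be sketched as follows. The thresholding rule and its value are illustrative assumptions; the paper's exact construction may differ:

```python
import numpy as np

def learn_adjacency(E, threshold=0.5):
    """Unsupervised graph structure learning in the spirit described:
    connect sensors whose embedding cosine similarity exceeds a
    threshold. E: (n_sensors, d) learned sensor embeddings. The
    threshold value is illustrative, not taken from the paper."""
    U = E / np.linalg.norm(E, axis=1, keepdims=True)
    S = U @ U.T                        # pairwise cosine similarity
    A = (S > threshold).astype(float)
    np.fill_diagonal(A, 0.0)           # no self-loops
    return A

# Sensors 0 and 1 have near-identical embeddings; sensor 2 is unrelated.
E = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
A = learn_adjacency(E)
print(A[0, 1], A[0, 2])  # 1.0 0.0
```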
Submitted 2 January, 2025;
originally announced January 2025.
-
XCAT-3.0: A Comprehensive Library of Personalized Digital Twins Derived from CT Scans
Authors:
Lavsen Dahal,
Mobina Ghojoghnejad,
Dhrubajyoti Ghosh,
Yubraj Bhandari,
David Kim,
Fong Chi Ho,
Fakrul Islam Tushar,
Sheng Luo,
Kyle J. Lafata,
Ehsan Abadi,
Ehsan Samei,
Joseph Y. Lo,
W. Paul Segars
Abstract:
Virtual Imaging Trials (VIT) offer a cost-effective and scalable approach for evaluating medical imaging technologies. Computational phantoms, which mimic real patient anatomy and physiology, play a central role in VITs. However, the current libraries of computational phantoms face limitations, particularly in terms of sample size and diversity. Insufficient representation of the population hampers accurate assessment of imaging technologies across different patient groups. Traditionally, the more realistic computational phantoms were created by manual segmentation, which is a laborious and time-consuming task, impeding the expansion of phantom libraries. This study presents a framework for creating realistic computational phantoms using a suite of automatic segmentation models and performing three forms of automated quality control on the segmented organ masks. The result is the release of over 2500 new computational phantoms, so-named XCAT3.0 after the ubiquitous XCAT computational construct. This new formation embodies 140 structures and represents a comprehensive approach to detailed anatomical modeling. The developed computational phantoms are formatted in both voxelized and surface mesh formats. The framework is combined with an in-house CT scanner simulator to produce realistic CT images. The framework has the potential to advance virtual imaging trials, facilitating comprehensive and reliable evaluations of medical imaging technologies. Phantoms may be requested at https://cvit.duke.edu/resources/. Code, model weights, and sample CT images are available at https://xcat-3.github.io/.
Submitted 9 September, 2024; v1 submitted 17 May, 2024;
originally announced May 2024.
-
Virtual Lung Screening Trial (VLST): An In Silico Study Inspired by the National Lung Screening Trial for Lung Cancer Detection
Authors:
Fakrul Islam Tushar,
Liesbeth Vancoillie,
Cindy McCabe,
Amareswararao Kavuri,
Lavsen Dahal,
Brian Harrawood,
Milo Fryling,
Mojtaba Zarei,
Saman Sotoudeh-Paima,
Fong Chi Ho,
Dhrubajyoti Ghosh,
Michael R. Harowicz,
Tina D. Tailor,
Sheng Luo,
W. Paul Segars,
Ehsan Abadi,
Kyle J. Lafata,
Joseph Y. Lo,
Ehsan Samei
Abstract:
Clinical imaging trials play a crucial role in advancing medical innovation but are often costly, inefficient, and ethically constrained. Virtual Imaging Trials (VITs) present a solution by simulating clinical trial components in a controlled, risk-free environment. The Virtual Lung Screening Trial (VLST), an in silico study inspired by the National Lung Screening Trial (NLST), illustrates the potential of VITs to expedite clinical trials, minimize risks to participants, and promote optimal use of imaging technologies in healthcare. This study aimed to show that a virtual imaging trial platform could investigate some key elements of a major clinical trial, specifically the NLST, which compared computed tomography (CT) and chest radiography (CXR) for lung cancer screening. With simulated cancerous lung nodules, a virtual patient cohort of 294 subjects was created using XCAT human models. Each virtual patient underwent both CT and CXR imaging, with deep learning models, the AI CT-Reader and AI CXR-Reader, acting as virtual readers to recall patients with suspicion of lung cancer. The primary outcome was the difference in diagnostic performance between CT and CXR, measured by the Area Under the Curve (AUC). The AI CT-Reader showed superior diagnostic accuracy, achieving an AUC of 0.92 (95% CI: 0.90-0.95) compared to the AI CXR-Reader's AUC of 0.72 (95% CI: 0.67-0.77). Furthermore, at the same 94% CT sensitivity reported by the NLST, the VLST specificity of 73% was similar to the NLST specificity of 73.4%. This CT performance highlights the potential of VITs to replicate certain aspects of clinical trials effectively, paving the way toward a safe and efficient method for advancing imaging-based diagnostics.
Submitted 4 April, 2025; v1 submitted 17 April, 2024;
originally announced April 2024.
-
What limits performance of weakly supervised deep learning for chest CT classification?
Authors:
Fakrul Islam Tushar,
Vincent M. D'Anniballe,
Geoffrey D. Rubin,
Joseph Y. Lo
Abstract:
Weakly supervised learning with noisy data has drawn attention in the medical imaging community due to the sparsity of high-quality disease labels. However, little is known about the limitations of such weakly supervised learning and the effect of these constraints on disease classification performance. In this paper, we test the effects of such weak supervision by examining model tolerance for three conditions. First, we examined model tolerance for noisy data by incrementally increasing error in the labels within the training data. Second, we assessed the impact of dataset size by varying the amount of training data. Third, we compared performance differences between binary and multi-label classification. Results demonstrated that the model could endure up to 10% added label error before experiencing a decline in disease classification performance. Disease classification performance steadily rose as the amount of training data was increased for all disease classes, before experiencing a plateau in performance at 75% of training data. Last, the binary model outperformed the multi-label model in every disease category. However, such interpretations may be misleading, as the binary model was heavily influenced by co-occurring diseases and may not have learned the specific features of the disease in the image. In conclusion, this study may help the medical imaging community understand the benefits and risks of weak supervision with noisy labels. Such studies demonstrate the need to build diverse, large-scale datasets and to develop explainable and responsible AI.
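The first experimental condition, incrementally adding label error, can be sketched as a simple label-flipping routine (a generic sketch; the function name and interface are illustrative, not the authors' code):

```python
import numpy as np

def add_label_noise(labels, error_rate, seed=0):
    """Flip a fraction `error_rate` of binary labels chosen uniformly at
    random, simulating added label error in the training data. Returns a
    noisy copy; the original array is untouched."""
    rng = np.random.default_rng(seed)
    noisy = labels.copy()
    n_flip = int(round(error_rate * len(labels)))
    idx = rng.choice(len(labels), size=n_flip, replace=False)
    noisy[idx] = 1 - noisy[idx]
    return noisy

y = np.zeros(100, dtype=int)
y_noisy = add_label_noise(y, 0.10)   # 10% added label error
print(int(y_noisy.sum()))  # 10
```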
Submitted 6 February, 2024;
originally announced February 2024.
-
Domain-specific optimization and diverse evaluation of self-supervised models for histopathology
Authors:
Jeremy Lai,
Faruk Ahmed,
Supriya Vijay,
Tiam Jaroensri,
Jessica Loo,
Saurabh Vyawahare,
Saloni Agarwal,
Fayaz Jamil,
Yossi Matias,
Greg S. Corrado,
Dale R. Webster,
Jonathan Krause,
Yun Liu,
Po-Hsuan Cameron Chen,
Ellery Wulczyn,
David F. Steiner
Abstract:
Task-specific deep learning models in histopathology offer promising opportunities for improving diagnosis, clinical research, and precision medicine. However, development of such models is often limited by availability of high-quality data. Foundation models in histopathology that learn general representations across a wide range of tissue types, diagnoses, and magnifications offer the potential to reduce the data, compute, and technical expertise necessary to develop task-specific deep learning models with the required level of model performance. In this work, we describe the development and evaluation of foundation models for histopathology via self-supervised learning (SSL). We first establish a diverse set of benchmark tasks involving 17 unique tissue types and 12 unique cancer types and spanning different optimal magnifications and task types. Next, we use this benchmark to explore and evaluate histopathology-specific SSL methods followed by further evaluation on held out patch-level and weakly supervised tasks. We found that standard SSL methods thoughtfully applied to histopathology images are performant across our benchmark tasks and that domain-specific methodological improvements can further increase performance. Our findings reinforce the value of using domain-specific SSL methods in pathology, and establish a set of high quality foundation models to enable further research across diverse applications.
Submitted 19 October, 2023;
originally announced October 2023.
-
The Utility of the Virtual Imaging Trials Methodology for Objective Characterization of AI Systems and Training Data
Authors:
Fakrul Islam Tushar,
Lavsen Dahal,
Saman Sotoudeh-Paima,
Ehsan Abadi,
W. Paul Segars,
Ehsan Samei,
Joseph Y. Lo
Abstract:
Purpose: The credibility of Artificial Intelligence (AI) models for medical imaging continues to be a challenge, affected by the diversity of models, the data used to train the models, and the applicability of their combination to produce reproducible results for new data. Approach: In this work we aimed to explore whether the emerging Virtual Imaging Trials (VIT) methodologies can provide an objective resource to approach this challenge. The study was conducted for the case example of COVID-19 diagnosis using clinical and virtual computed tomography (CT) and chest radiography (CXR) processed with convolutional neural networks. Multiple AI models were developed and tested using 3D ResNet-like and 2D EfficientNetv2 architectures across diverse datasets. Results: The performance differences were evaluated in terms of the area under the curve (AUC) and the DeLong method for AUC confidence intervals. The models trained on the most diverse datasets showed the highest external testing performance, with AUC values ranging from 0.73-0.76 for CT and 0.70-0.73 for CXR. Internal testing yielded higher AUC values (0.77-0.85 for CT and 0.77-1.0 for CXR), highlighting a substantial drop in performance during external validation, which underscores the importance of diverse and comprehensive training and testing data. Most notably, the VIT approach provided an objective assessment of the utility of diverse models and datasets while further providing insight into the influence of dataset characteristics, patient factors, and imaging physics on AI efficacy. Conclusions: The VIT approach can be used to enhance model transparency and reliability, offering nuanced insights into the factors driving AI performance and bridging the gap between experimental and clinical settings.
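The AUC values reported above can be computed with the rank-based (Mann-Whitney) estimator, which is also the statistic the DeLong method builds its confidence intervals on. A minimal sketch:

```python
import numpy as np

def auc(scores_pos, scores_neg):
    """Area under the ROC curve via the rank (Mann-Whitney) statistic:
    the probability that a random positive case scores above a random
    negative case, with ties counted as half."""
    pos = np.asarray(scores_pos, dtype=float)[:, None]
    neg = np.asarray(scores_neg, dtype=float)[None, :]
    wins = (pos > neg).sum() + 0.5 * (pos == neg).sum()
    return wins / (pos.size * neg.size)

# 5 of the 6 positive/negative pairs are correctly ordered.
print(round(auc([0.9, 0.8, 0.4], [0.7, 0.3]), 3))  # 0.833
```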
Submitted 16 July, 2025; v1 submitted 17 August, 2023;
originally announced August 2023.
-
Joint Computing Offloading and Resource Allocation for Classification Intelligent Tasks in MEC Systems
Authors:
Yuanpeng Zheng,
Tiankui Zhang,
Jonathan Loo,
Yapeng Wang,
Arumugam Nallanathan
Abstract:
Mobile edge computing (MEC) enables low-latency and high-bandwidth applications by bringing computation and data storage closer to end-users. Intelligent computing is an important application of MEC, where computing resources are used to solve intelligent task-related problems based on task requirements. However, efficiently offloading computing and allocating resources for intelligent tasks in MEC systems is a challenging problem due to complex interactions between task requirements and MEC resources. To address this challenge, we investigate joint computing offloading and resource allocation for intelligent tasks in MEC systems. Our goal is to optimize system utility by jointly considering computing accuracy and task delay to achieve maximum system performance. We focus on classification intelligent tasks and formulate an optimization problem that considers both the accuracy requirements of tasks and the parallel computing capabilities of MEC systems. To solve the optimization problem, we decompose it into three subproblems: subcarrier allocation, computing capacity allocation, and compression offloading. We use convex optimization and successive convex approximation to derive closed-form expressions for the subcarrier allocation, offloading decisions, computing capacity, and compression ratio. Based on our solutions, we design an efficient computing offloading and resource allocation algorithm for intelligent tasks in MEC systems. Our simulation results demonstrate that, compared with the benchmarks, our proposed algorithm significantly improves the performance of intelligent tasks in MEC systems and achieves a flexible trade-off between system revenue and cost.
Submitted 5 July, 2023;
originally announced July 2023.
-
Dynamic Multi-time Scale User Admission and Resource Allocation for Semantic Extraction in MEC Systems
Authors:
Yuanpeng Zheng,
Tiankui Zhang,
Jonathan Loo
Abstract:
This paper investigates semantic extraction task-oriented dynamic multi-time scale user admission and resource allocation in mobile edge computing (MEC) systems. Amid the prevalence of artificial intelligence applications in various industries, the offloading of semantic extraction tasks, which are mainly composed of convolutional neural networks for computer vision, is a great challenge for communication bandwidth and computing capacity allocation in MEC systems. Considering the stochastic nature of the semantic extraction tasks, we formulate a stochastic optimization problem by modeling it as the dynamic arrival of tasks in the temporal domain. We jointly optimize the system revenue and cost, represented as user admission in the long term and resource allocation in the short term, respectively. To handle the proposed stochastic optimization problem, we decompose it into short-time-scale subproblems and a long-time-scale subproblem by using the Lyapunov optimization technique. After that, the short-time-scale optimization variables of resource allocation, including user association, bandwidth allocation, and computing capacity allocation, are obtained in closed form. The user admission optimization on the long time scale is solved by a heuristic iteration method. Then, the multi-time scale user admission and resource allocation algorithm is proposed for dynamic semantic extraction task computing in MEC systems. Simulation results demonstrate that, compared with the benchmarks, the proposed algorithm efficiently improves the performance of user admission and resource allocation and achieves a flexible trade-off between system revenue and cost at multiple time scales while accounting for semantic extraction tasks.
Submitted 5 July, 2023;
originally announced July 2023.
-
Sigma-point Kalman Filter with Nonlinear Unknown Input Estimation via Optimization and Data-driven Approach for Dynamic Systems
Authors:
Junn Yong Loo,
Ze Yang Ding,
Vishnu Monn Baskaran,
Surya Girinatha Nurzaman,
Chee Pin Tan
Abstract:
Most works on joint state and unknown input (UI) estimation require the assumption that the UIs are linear; this is potentially restrictive as it does not hold in many intelligent autonomous systems. To overcome this restriction and circumvent the need to linearize the system, we propose a derivative-free Unknown Input Sigma-point Kalman Filter (SPKF-nUI) where the SPKF is interconnected with a general nonlinear UI estimator that can be implemented via nonlinear optimization and data-driven approaches. The nonlinear UI estimator uses the posterior state estimate, which is less susceptible to state prediction error. In addition, we introduce a joint sigma-point transformation scheme to incorporate both the state and UI uncertainties in the estimation of SPKF-nUI. An in-depth stochastic stability analysis proves that the proposed SPKF-nUI yields exponentially converging estimation error bounds under reasonable assumptions. Finally, two case studies are carried out on a simulation-based rigid robot and a physical soft robot, i.e., a robot made of soft materials with complex dynamics, to validate the effectiveness of the proposed filter on nonlinear dynamic systems. Our results demonstrate that the proposed SPKF-nUI achieves the lowest state and UI estimation errors when compared to the existing nonlinear state-UI filters.
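The sigma-point machinery at the core of any SPKF is the unscented transform, which propagates a Gaussian through a nonlinearity without derivatives. A textbook sketch of that transform (not the SPKF-nUI filter itself, which adds the nonlinear UI estimator and the joint sigma-point scheme):

```python
import numpy as np

def unscented_transform(mean, cov, f, kappa=0.0):
    """Derivative-free sigma-point (unscented) transform: deterministically
    sample 2n+1 sigma points from N(mean, cov), push them through the
    nonlinearity f, and recover the transformed mean and covariance."""
    n = len(mean)
    L = np.linalg.cholesky((n + kappa) * cov)     # matrix square root
    sigma = [mean] + [mean + L[:, i] for i in range(n)] \
                   + [mean - L[:, i] for i in range(n)]
    w = np.full(2 * n + 1, 1.0 / (2 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    Y = np.array([f(s) for s in sigma])
    y_mean = w @ Y
    diff = Y - y_mean
    y_cov = (w[:, None] * diff).T @ diff
    return y_mean, y_cov

# A linear f recovers the exact transformed mean and covariance.
m, P = np.array([1.0, 2.0]), np.eye(2)
ym, yc = unscented_transform(m, P, lambda x: 2 * x, kappa=1.0)
print(ym)  # [2. 4.]
```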
Submitted 9 November, 2024; v1 submitted 21 June, 2023;
originally announced June 2023.
-
Development of a Hardware-in-the-loop Testbed for Laboratory Performance Verification of Flexible Building Equipment in Typical Commercial Buildings
Authors:
Zhelun Chen,
Jin Wen,
Steven T. Bushby,
L. James Lo,
Zheng O'Neill,
W. Vance Payne,
Amanda Pertzborn,
Caleb Calfa,
Yangyang Fu,
Gabriel Grajewski,
Yicheng Li,
Zhiyao Yang
Abstract:
The goals of reducing energy costs, shifting electricity peaks, increasing the use of renewable energy, and enhancing the stability of the electric grid can be met in part by fully exploiting the energy flexibility potential of buildings and building equipment. The development of strategies that exploit these flexibilities could be facilitated by publicly available high-resolution datasets illustrating how control of HVAC systems in commercial buildings can be used in different climate zones to shape the energy use profile of a building for grid needs. This article presents the development and integration of a Hardware-In-the-Loop Flexible load Testbed (HILFT) that integrates physical HVAC systems with a simulated building model and simulated occupants with the goal of generating datasets to verify load flexibility of typical commercial buildings. Compared to simulation-only experiments, the hardware-in-the-loop approach captures the dynamics of the physical systems while also allowing efficient testing of various boundary conditions. The HILFT integration in this article is achieved through the co-simulation among various software environments including LabVIEW, MATLAB, and EnergyPlus. Although theoretically viable, such integration has encountered many real-world challenges, such as: 1) how to design the overall data infrastructure to ensure effective, robust, and efficient integration; 2) how to avoid closed-loop hunting between simulated and emulated variables; 3) how to quantify system response times and minimize system delays; and 4) how to assess the overall integration quality. Lessons learned using the examples of an AHU-VAV system, an air-source heat pump system, and a water-source heat pump system are presented.
Submitted 5 February, 2023; v1 submitted 31 January, 2023;
originally announced January 2023.
-
An Iterative Method to Learn a Linear Control Barrier Function
Authors:
Zihao Liang,
Jason King Ching Lo
Abstract:
Control barrier functions (CBFs) have recently started to serve as a basis for approaches that enforce safety requirements in control systems. However, constructing such a function for a general system is a non-trivial task. This paper proposes an iterative, optimization-based framework to obtain a CBF from a given user-specified set for a general control-affine system. Without loss of generality, we parameterize the CBF as a set of linear functions of the states. By taking samples from the given user-specified set, we reformulate the problem of learning a CBF into an optimization problem that solves for the linear function coefficients. The resulting linear functions construct the CBF and yield a safe set that has the forward invariance property. In addition, the proposed framework explicitly addresses control input constraints during the construction of CBFs. The effectiveness of the proposed method is demonstrated by learning a CBF for a nonlinear Moore-Greitzer jet engine, where the system trajectory is prevented from entering the unsafe set.
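The idea of solving for linear-function coefficients from samples can be illustrated with a toy surrogate: a perceptron-style fit of a single linear h(x) that is positive on samples of the user-specified set and negative outside it, so that {x : h(x) >= 0} approximates a safe set. This is a hedged stand-in for the paper's optimization, which additionally enforces the CBF invariance condition and the control input constraints:

```python
import numpy as np

def fit_linear_h(safe, unsafe, steps=200, lr=0.1):
    """Fit h(x) = a.x + b so that h > 0 on safe samples and h < 0 on
    unsafe ones, via subgradient steps on a margin (hinge) loss.
    A toy surrogate: it learns linear coefficients from samples but
    does not enforce forward invariance or input constraints."""
    X = np.vstack([safe, unsafe])
    y = np.array([1.0] * len(safe) + [-1.0] * len(unsafe))
    Xb = np.hstack([X, np.ones((len(X), 1))])   # absorb offset b
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        viol = y * (Xb @ w) < 1.0               # margin violations
        if not viol.any():
            break
        w += lr * (y[viol] @ Xb[viol])
    return w[:-1], w[-1]                        # coefficients a, offset b

safe = np.array([[0.0], [0.2], [0.4]])          # samples of the safe set
unsafe = np.array([[1.0], [1.2]])
a, b = fit_linear_h(safe, unsafe)
print(a[0] < 0 and b > 0)  # True: h decreases toward the unsafe region
```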
Submitted 17 November, 2022;
originally announced November 2022.
-
Automated Assessment of Transthoracic Echocardiogram Image Quality Using Deep Neural Networks
Authors:
Robert B. Labs,
Apostolos Vrettos,
Jonathan Loo,
Massoud Zolgharni
Abstract:
Standard views in two-dimensional echocardiography are well established, but the quality of acquired images is highly dependent on operator skills and is assessed subjectively. This study is aimed at providing an objective assessment pipeline for echocardiogram image quality by defining a new set of domain-specific quality indicators. Consequently, image quality assessment can be automated to enhance clinical measurements, interpretation, and real-time optimization. We have developed deep neural networks for the automated assessment of echocardiographic frames, which were randomly sampled from 11,262 adult patients. The private echocardiography dataset consists of 33,784 frames, previously acquired between 2010 and 2020. Deep learning approaches were used to extract the spatiotemporal features, and the image quality indicators were evaluated against the mean absolute error. Our quality indicators encapsulate both anatomical and pathological elements to provide multivariate assessment scores for anatomical visibility, clarity, depth-gain, and foreshortening, respectively.
Submitted 2 September, 2022;
originally announced September 2022.
-
Echocardiographic Image Quality Assessment Using Deep Neural Networks
Authors:
Robert B. Labs,
Massoud Zolgharni,
Jonathan P. Loo
Abstract:
Echocardiographic image quality assessment is not a trivial issue in transthoracic examination. As in vivo examination of heart structures has gained prominence in cardiac diagnosis, it has become clear that accurate diagnosis of left-ventricular function depends heavily on the quality of echo images. To date, visual assessment of echo images has been highly subjective and requires specific definition under clinical pathologies. While poor-quality images impair quantification and diagnosis, the inherent variation in echocardiographic image quality standards indicates the complexity faced by different observers and provides apparent evidence of incoherent assessment in clinical trials, especially among less experienced cardiologists. In this research, our aim was to analyse and define the specific quality attributes most discussed by experts and to present a fully trained convolutional neural network model for assessing such quality features objectively.
Submitted 2 September, 2022;
originally announced September 2022.
-
Virtual vs. Reality: External Validation of COVID-19 Classifiers using XCAT Phantoms for Chest Computed Tomography
Authors:
Fakrul Islam Tushar,
Ehsan Abadi,
Saman Sotoudeh-Paima,
Rafael B. Fricks,
Maciej A. Mazurowski,
W. Paul Segars,
Ehsan Samei,
Joseph Y. Lo
Abstract:
Research studies of artificial intelligence models in medical imaging have been hampered by poor generalization. This problem has been especially concerning over the last year with numerous applications of deep learning for COVID-19 diagnosis. Virtual imaging trials (VITs) could provide a solution for objective evaluation of these models. In this work, we used VITs to create the CVIT-COVID dataset, comprising 180 virtually imaged computed tomography (CT) images from simulated COVID-19 and normal phantom models under different COVID-19 morphologies and imaging properties. We evaluated the performance of an open-source deep-learning model from the University of Waterloo trained with multi-institutional data and of an in-house model trained with the open clinical dataset MosMed. We further validated the models against open clinical data of 305 CT images to understand virtual vs. real clinical data performance. The open-source model was published with nearly perfect performance on the original Waterloo dataset but showed a consistent performance drop in external testing on another clinical dataset (AUC=0.77) and on our simulated CVIT-COVID dataset (AUC=0.55). The in-house model achieved an AUC of 0.87 when testing on the internal test set (the MosMed test set). However, performance dropped to AUCs of 0.65 and 0.69 when evaluated on the clinical and our simulated CVIT-COVID datasets, respectively. The VIT framework offered control over imaging conditions, allowing us to show that there was no change in performance as CT exposure was changed from 28.5 to 57 mAs. The VIT framework also provided voxel-level ground truth, revealing that the performance of the in-house model was much higher (AUC=0.87) for diffuse COVID-19 infection of size >2.65% of lung volume versus AUC=0.52 for focal disease with <2.65% volume. The virtual imaging framework enabled these uniquely rigorous analyses of model performance.
Submitted 6 March, 2022;
originally announced March 2022.
-
Quality or Quantity: Toward a Unified Approach for Multi-organ Segmentation in Body CT
Authors:
Fakrul Islam Tushar,
Husam Nujaim,
Wanyi Fu,
Ehsan Abadi,
Maciej A. Mazurowski,
Ehsan Samei,
William P. Segars,
Joseph Y. Lo
Abstract:
Organ segmentation of medical images is a key step in virtual imaging trials. However, organ segmentation datasets are limited in terms of quality (because labels cover only a few organs) and quantity (since case numbers are limited). In this study, we explored the tradeoffs between quality and quantity. Our goal is to create a unified approach for multi-organ segmentation of body CT, which will facilitate the creation of large numbers of accurate virtual phantoms. Initially, we compared two segmentation architectures, 3D-UNet and DenseVNet, which were trained using XCAT data fully labeled with 22 organs, and chose 3D-UNet as the better-performing model. We used the XCAT-trained model to generate pseudo-labels for the CT-ORG dataset, which has only 7 organs segmented. We performed two experiments: first, we trained a 3D-UNet model on the XCAT dataset, representing quality data, and tested it on both the XCAT and CT-ORG datasets; second, we trained a 3D-UNet after including the CT-ORG dataset in the training set to add quantity. Performance improved for segmentation of the organs with true labels in both datasets and degraded when relying on pseudo-labels. When organs were labeled in both datasets, the second experiment improved the average DSC on XCAT and CT-ORG by 1. This demonstrates that quality data is the key to improving the model's performance.
Submitted 2 March, 2022;
originally announced March 2022.
-
Co-occurring Diseases Heavily Influence the Performance of Weakly Supervised Learning Models for Classification of Chest CT
Authors:
Fakrul Islam Tushar,
Vincent M. D'Anniballe,
Geoffrey D. Rubin,
Ehsan Samei,
Joseph Y. Lo
Abstract:
Despite the potential of weakly supervised learning to automatically annotate massive amounts of data, little is known about its limitations for use in computer-aided diagnosis (CAD). For CT specifically, interpreting the performance of CAD algorithms can be challenging given the large number of co-occurring diseases. This paper examines the effect of co-occurring diseases when training classification models by weakly supervised learning, specifically by comparing multi-label and multiple binary classifiers using the same training data. Our results demonstrate that the binary model outperformed the multi-label classifier in every disease category in terms of AUC. However, this performance was heavily influenced by co-occurring diseases in the binary model, suggesting it did not always learn the correct appearance of the specific disease. For example, binary classification of lung nodules resulted in an AUC of <0.65 when there were no other co-occurring diseases, but when lung nodules co-occurred with emphysema, performance reached an AUC of >0.80. We hope this paper has revealed the complexity of interpreting disease classification performance in weakly supervised models and will encourage researchers to examine the effect of co-occurring diseases on classification performance in future work.
Submitted 23 February, 2022;
originally announced February 2022.
-
Detection of masses and architectural distortions in digital breast tomosynthesis: a publicly available dataset of 5,060 patients and a deep learning model
Authors:
Mateusz Buda,
Ashirbani Saha,
Ruth Walsh,
Sujata Ghate,
Nianyi Li,
Albert Święcicki,
Joseph Y. Lo,
Maciej A. Mazurowski
Abstract:
Breast cancer screening is one of the most common radiological tasks, with over 39 million exams performed each year. While breast cancer screening has been one of the most studied medical imaging applications of artificial intelligence, the development and evaluation of algorithms are hindered by the lack of well-annotated, large-scale, publicly available datasets. This is particularly an issue for digital breast tomosynthesis (DBT), which is a relatively new breast cancer screening modality. We have curated and made publicly available a large-scale dataset of digital breast tomosynthesis images. It contains 22,032 reconstructed DBT volumes belonging to 5,610 studies from 5,060 patients. These studies comprised four groups: (1) 5,129 normal studies, (2) 280 studies where additional imaging was needed but no biopsy was performed, (3) 112 benign biopsied studies, and (4) 89 studies with cancer. Our dataset includes masses and architectural distortions that were annotated by two experienced radiologists. Additionally, we developed a single-phase deep learning detection model and tested it using our dataset to serve as a baseline for future research. Our model reached a sensitivity of 65% at 2 false positives per breast. Our large, diverse, and highly curated dataset will facilitate the development and evaluation of AI algorithms for breast cancer screening by providing data for training as well as a common set of cases for model validation. The performance of the model developed in our study shows that the task remains challenging; it will serve as a baseline for future model development.
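The "sensitivity at 2 false positives per breast" operating point can be read off a score-sorted detection list. A minimal sketch with made-up detections follows; the dataset's actual matching and evaluation protocol may differ.

```python
import numpy as np

def sensitivity_at_fp_rate(det_scores, det_is_tp, n_breasts, n_lesions, fp_per_breast=2.0):
    """Lower the score threshold as far as the false-positive budget allows
    (fp_per_breast * n_breasts), then report lesion sensitivity at that cut."""
    order = np.argsort(det_scores)[::-1]       # highest-scoring detections first
    is_tp = np.asarray(det_is_tp)[order]
    tps = np.cumsum(is_tp)                     # true positives kept at each cut
    fps = np.cumsum(~is_tp)                    # false positives kept at each cut
    ok = fps <= fp_per_breast * n_breasts      # cuts within the FP budget
    if not ok.any():
        return 0.0
    return tps[ok][-1] / n_lesions             # sensitivity at the loosest valid cut
```

With two lesions in one breast and detections scored [0.9, 0.8, 0.7, 0.6, 0.5] where the first and third are true positives, the budget of 2 FPs/breast admits both lesions (sensitivity 1.0), while a budget of 0 admits only the top detection (sensitivity 0.5).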
Submitted 20 November, 2022; v1 submitted 13 November, 2020;
originally announced November 2020.
-
iPhantom: a framework for automated creation of individualized computational phantoms and its application to CT organ dosimetry
Authors:
Wanyi Fu,
Shobhit Sharma,
Ehsan Abadi,
Alexandros-Stavros Iliopoulos,
Qi Wang,
Joseph Y. Lo,
Xiaobai Sun,
William P. Segars,
Ehsan Samei
Abstract:
Objective: This study aims to develop and validate a novel framework, iPhantom, for the automated creation of patient-specific phantoms or digital twins (DTs) using patient medical images. The framework is applied to assess radiation dose to radiosensitive organs in CT imaging of individual patients. Method: From patient CT images, iPhantom segments selected anchor organs (e.g. liver, bones, pancreas) using a learning-based model developed for multi-organ CT segmentation. Organs challenging to segment (e.g. intestines) are incorporated from a matched phantom template, using a diffeomorphic registration model developed for multi-organ phantom voxels. The resulting full-patient phantoms are used to assess organ doses during routine CT exams. Result: iPhantom was validated on both the XCAT (n=50) and an independent clinical (n=10) dataset with similar accuracy. iPhantom predicted all organ locations precisely, with Dice similarity coefficients (DSC) >0.6 for anchor organs and DSC of 0.3-0.9 for all other organs. iPhantom showed less than 10% dose error for the majority of organs, notably superior to the state-of-the-art baseline method (20-35% dose errors). Conclusion: iPhantom enables automated and accurate creation of patient-specific phantoms and, for the first time, provides sufficient and automated patient-specific dose estimates for CT dosimetry. Significance: The new framework brings the creation and application of CHPs to the level of individualized CHPs through automation, achieving wider and more precise organ localization and paving the way for clinical monitoring, personalized optimization, and large-scale research.
Submitted 19 August, 2020;
originally announced August 2020.
-
Classification of Multiple Diseases on Body CT Scans using Weakly Supervised Deep Learning
Authors:
Fakrul Islam Tushar,
Vincent M. D'Anniballe,
Rui Hou,
Maciej A. Mazurowski,
Wanyi Fu,
Ehsan Samei,
Geoffrey D. Rubin,
Joseph Y. Lo
Abstract:
Purpose: To design multi-disease classifiers for body CT scans for three different organ systems using automatically extracted labels from radiology text reports. Materials & Methods: This retrospective study included a total of 12,092 patients (mean age 57 +- 18; 6,172 women) for model development and testing (from 2012-2017). Rule-based algorithms were used to extract 19,225 disease labels from 13,667 body CT scans from 12,092 patients. Using a three-dimensional DenseVNet, three organ systems were segmented: lungs and pleura; liver and gallbladder; and kidneys and ureters. For each organ, a three-dimensional convolutional neural network classified no apparent disease versus four common diseases, for a total of 15 different labels across all three models. Testing was performed on a subset of 2,158 CT volumes relative to 2,875 manually derived reference labels from 2,133 patients (mean age 58 +- 18; 1,079 women). Performance was reported as receiver operating characteristic area under the curve (AUC) with 95% confidence intervals by the DeLong method. Results: Manual validation of the extracted labels confirmed 91% to 99% accuracy across the 15 different labels. AUCs for lungs and pleura labels were: atelectasis 0.77 (95% CI: 0.74, 0.81), nodule 0.65 (0.61, 0.69), emphysema 0.89 (0.86, 0.92), effusion 0.97 (0.96, 0.98), and no apparent disease 0.89 (0.87, 0.91). AUCs for liver and gallbladder were: hepatobiliary calcification 0.62 (95% CI: 0.56, 0.67), lesion 0.73 (0.69, 0.77), dilation 0.87 (0.84, 0.90), fatty 0.89 (0.86, 0.92), and no apparent disease 0.82 (0.78, 0.85). AUCs for kidneys and ureters were: stone 0.83 (95% CI: 0.79, 0.87), atrophy 0.92 (0.89, 0.94), lesion 0.68 (0.64, 0.72), cyst 0.70 (0.66, 0.73), and no apparent disease 0.79 (0.75, 0.83). Conclusion: Weakly supervised deep learning models were able to classify diverse diseases in multiple organ systems.
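Rule-based label extraction of the kind described here can be approximated with keyword patterns plus a crude sentence-level negation check. The rules below are hypothetical stand-ins, not the study's actual rule set.

```python
import re

# Hypothetical keyword rules per disease label (illustrative only).
RULES = {
    "emphysema": r"\bemphysema\b",
    "effusion": r"\b(pleural )?effusions?\b",
    "atelectasis": r"\batelectasis\b",
}
NEGATION = r"\b(no|without|negative for)\b"

def extract_labels(report):
    """Return the set of labels whose keyword appears un-negated in the report,
    checking negation within the same sentence-like segment."""
    labels = set()
    for segment in re.split(r"[.;]", report.lower()):
        for label, pattern in RULES.items():
            if re.search(pattern, segment) and not re.search(NEGATION, segment):
                labels.add(label)
    return labels
```

Real report-labeling pipelines handle far more phrasing, uncertainty terms, and scope than this sketch, but the basic shape (per-label patterns plus negation filtering) is the same.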
Submitted 16 November, 2021; v1 submitted 3 August, 2020;
originally announced August 2020.
-
Microvasculature Segmentation and Inter-capillary Area Quantification of the Deep Vascular Complex using Transfer Learning
Authors:
Julian Lo,
Morgan Heisler,
Vinicius Vanzan,
Sonja Karst,
Ivana Zadro Matovinovic,
Sven Loncaric,
Eduardo V. Navajas,
Mirza Faisal Beg,
Marinko V. Sarunic
Abstract:
Purpose: Optical Coherence Tomography Angiography (OCT-A) permits visualization of the changes to the retinal circulation due to diabetic retinopathy (DR), a microvascular complication of diabetes. We demonstrate accurate segmentation of the vascular morphology for the superficial capillary plexus and deep vascular complex (SCP and DVC) using a convolutional neural network (CNN) for quantitative analysis.
Methods: Retinal OCT-A images with a 6x6 mm field of view (FOV) were acquired using a Zeiss PlexElite. Multiple-volume acquisition and averaging enhanced the vessel network contrast used for training the CNN. We used transfer learning from a CNN trained on 76 images from smaller FOVs of the SCP acquired using different OCT systems. Quantitative analysis of perfusion was performed on the automated vessel segmentations in representative patients with DR.
Results: The automated segmentations of the OCT-A images maintained the hierarchical branching and lobular morphologies of the SCP and DVC, respectively. The network segmented the SCP with an accuracy of 0.8599 and a Dice index of 0.8618. For the DVC, the accuracy was 0.7986 and the Dice index was 0.8139. The inter-rater comparisons for the SCP had an accuracy and Dice index of 0.8300 and 0.6700, respectively, and 0.6874 and 0.7416 for the DVC.
Conclusions: Transfer learning reduces the amount of manually-annotated images required, while producing high quality automatic segmentations of the SCP and DVC. Using high quality training data preserves the characteristic appearance of the capillary networks in each layer.
Translational Relevance: Accurate retinal microvasculature segmentation with the CNN results in improved perfusion analysis in diabetic retinopathy.
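The accuracy and Dice figures reported above are straightforward to compute from binary segmentation masks; a minimal numpy sketch:

```python
import numpy as np

def dice_index(pred, truth):
    """Dice similarity between two binary masks: 2|A intersect B| / (|A| + |B|)."""
    pred, truth = np.asarray(pred, bool), np.asarray(truth, bool)
    denom = pred.sum() + truth.sum()
    # Convention: two empty masks are a perfect match.
    return 2.0 * np.logical_and(pred, truth).sum() / denom if denom else 1.0

def pixel_accuracy(pred, truth):
    """Fraction of pixels where the two masks agree (both foreground or both background)."""
    pred, truth = np.asarray(pred, bool), np.asarray(truth, bool)
    return (pred == truth).mean()
```

For example, masks [[1,1],[0,0]] vs [[1,0],[0,0]] share one foreground pixel out of 2+1, giving a Dice index of 2/3 and a pixel accuracy of 0.75.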
Submitted 19 March, 2020;
originally announced March 2020.
-
Machine-Learning-Based Multiple Abnormality Prediction with Large-Scale Chest Computed Tomography Volumes
Authors:
Rachel Lea Draelos,
David Dov,
Maciej A. Mazurowski,
Joseph Y. Lo,
Ricardo Henao,
Geoffrey D. Rubin,
Lawrence Carin
Abstract:
Machine learning models for radiology benefit from large-scale data sets with high quality labels for abnormalities. We curated and analyzed a chest computed tomography (CT) data set of 36,316 volumes from 19,993 unique patients. This is the largest multiply-annotated volumetric medical imaging data set reported. To annotate this data set, we developed a rule-based method for automatically extracting abnormality labels from free-text radiology reports with an average F-score of 0.976 (min 0.941, max 1.0). We also developed a model for multi-organ, multi-disease classification of chest CT volumes that uses a deep convolutional neural network (CNN). This model reached a classification performance of AUROC greater than 0.90 for 18 abnormalities, with an average AUROC of 0.773 for all 83 abnormalities, demonstrating the feasibility of learning from unfiltered whole volume CT data. We show that training on more labels improves performance significantly: for a subset of 9 labels - nodule, opacity, atelectasis, pleural effusion, consolidation, mass, pericardial effusion, cardiomegaly, and pneumothorax - the model's average AUROC increased by 10% when the number of training labels was increased from 9 to all 83. All code for volume preprocessing, automated label extraction, and the volume abnormality prediction model will be made publicly available. The 36,316 CT volumes and labels will also be made publicly available pending institutional approval.
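Per-abnormality AUROC and its average across labels can be computed with the rank-based (Mann-Whitney) formulation; a small self-contained sketch (the study's actual evaluation code is not shown here):

```python
import numpy as np

def auroc(scores, labels):
    """Rank-based AUROC: fraction of positive/negative pairs where the
    positive outscores the negative (ties count half)."""
    scores, labels = np.asarray(scores, float), np.asarray(labels, bool)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def mean_auroc(score_matrix, label_matrix):
    """Macro-average AUROC over abnormality columns, as in per-label reporting."""
    return float(np.mean([auroc(s, l) for s, l in zip(score_matrix.T, label_matrix.T)]))
```

A perfectly ranked label scores 1.0, a perfectly inverted one 0.0, and the macro average simply weights every abnormality equally regardless of prevalence.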
Submitted 12 October, 2020; v1 submitted 11 February, 2020;
originally announced February 2020.
-
Joint Computation and Communication Design for UAV-Assisted Mobile Edge Computing in IoT
Authors:
Tiankui Zhang,
Yu Xu,
Jonathan Loo,
Dingcheng Yang,
Lin Xiao
Abstract:
Unmanned aerial vehicle (UAV)-assisted mobile edge computing (MEC) is a prominent concept in which a UAV equipped with an MEC server is deployed to serve a number of terminal devices (TDs) of the Internet of Things (IoT) over a finite period. In this paper, each TD has a latency-critical computation task to complete in each time slot, and three computation strategies are available to each TD. First, each TD can perform local computing by itself. Second, each TD can partially offload task bits to the UAV for computing. Third, each TD can choose to offload task bits to the access point (AP) via UAV relaying. We propose a new optimization problem formulation that aims to minimize the total energy consumption, including communication-related energy, computation-related energy, and the UAV's flight energy, by optimizing the bit allocation, time slot scheduling, and power allocation, as well as the UAV trajectory design. As the formulated problem is non-convex and the optimal solution is difficult to find, we solve the problem in two parts and obtain a near-optimal solution within a dozen iterations. Finally, numerical results validate the proposed algorithm, which is shown to be efficient and superior to the benchmark cases.
Submitted 18 October, 2019;
originally announced October 2019.
-
Comparing Energy Efficiency of CPU, GPU and FPGA Implementations for Vision Kernels
Authors:
Murad Qasaimeh,
Kristof Denolf,
Jack Lo,
Kees Vissers,
Joseph Zambreno,
Phillip H. Jones
Abstract:
Developing high-performance embedded vision applications requires balancing run-time performance with energy constraints. Given the mix of hardware accelerators that exist for embedded computer vision (e.g. multi-core CPUs, GPUs, and FPGAs), and their associated vendor-optimized vision libraries, it becomes a challenge for developers to navigate this fragmented solution space. To aid in determining which embedded platform is most suitable for their application, we conduct a comprehensive benchmark of the run-time performance and energy efficiency of a wide range of vision kernels. We discuss the rationale for why a given underlying hardware architecture innately performs well or poorly based on the characteristics of a range of vision kernel categories. Specifically, our study covers three commonly used hardware accelerators for embedded vision applications: the ARM57 CPU, the Jetson TX2 GPU, and the ZCU102 FPGA, using their vendor-optimized vision libraries: OpenCV, VisionWorks, and xfOpenCV. Our results show that the GPU achieves an energy/frame reduction ratio of 1.1-3.2x compared to the others for simple kernels, while for more complicated kernels and complete vision pipelines, the FPGA outperforms the others with energy/frame reduction ratios of 1.2-22.3x. It is also observed that the FPGA performs increasingly better as a vision application's pipeline complexity grows.
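The energy/frame metric behind these reduction ratios is just consumed energy divided by frames processed; a sketch with illustrative numbers (not the paper's measurements):

```python
def energy_per_frame(power_watts, runtime_s, n_frames):
    """Energy consumed per processed frame, in joules."""
    return power_watts * runtime_s / n_frames

def reduction_ratio(baseline_epf, accel_epf):
    """How many times less energy per frame the accelerator uses vs the baseline."""
    return baseline_epf / accel_epf

# Illustrative only: a 10 W baseline taking 2 s for 100 frames vs a
# 5 W accelerator taking 0.5 s for the same 100 frames.
baseline = energy_per_frame(10.0, 2.0, 100)   # 0.2 J/frame
accel = energy_per_frame(5.0, 0.5, 100)       # 0.025 J/frame
```

For these made-up numbers the accelerator's energy/frame reduction ratio is 8x, the same kind of figure the benchmark reports per kernel category.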
Submitted 31 May, 2019;
originally announced June 2019.