English summary: Improving the effectiveness of diagnosis support for some forms of cancer based on image processing techniques and convolutional neural networks



MINISTRY OF EDUCATION AND TRAINING

HUNG YEN UNIVERSITY OF TECHNOLOGY AND EDUCATION

HOANG QUOC TUAN

IMPROVE THE EFFECTIVENESS OF SUPPORTING DIAGNOSIS OF SOME FORMS OF CANCER BASED ON IMAGE PROCESSING TECHNIQUES AND CONVOLUTIONAL NEURAL NETWORKS


The dissertation was completed at:

Hung Yen University of Technology and Education

Supervisors: 1. Assoc. Prof. Dr. Bui Trung Thanh

2. Dr. Pham Xuan Hien

Reviewer 1: Prof. Dr. Tran Xuan Nam

Reviewer 2: Assoc. Prof. Dr. Truong Vu Bang Giang

Reviewer 3: Assoc. Prof. Dr. Bui Ngoc My

The dissertation can be found at:

- Library of Hung Yen University of Technology and Education

- National Library of Vietnam


INTRODUCTION

1 The Urgency of the Dissertation

The shortage of medical resources relative to demand in medical imaging is a current reality in Vietnam and many other countries. Recent statistics indicate that Japan has only 36 radiologists per million people, Liberia has just 2 radiologists, and 14 countries in Africa have none [1]. Even economically advanced countries lack personnel to handle diagnostic imaging tasks. In the UK, it is estimated that more than 300,000 X-ray images wait over 30 days for analysis at any given time of the year [1]. Artificial Intelligence (AI) is considered an inevitable trend and a basis for computer-aided diagnosis software that achieves higher accuracy and helps to address personnel shortages [2]. Many new AI tools have been developed for analyzing and diagnosing various cancers based on medical images, such as chest X-rays, mammograms, and CT/MRI scans of the brain [3, 4]. In the US, several AI applications have been approved by the Food and Drug Administration (FDA), including software by Viz.AI for diagnosing acute ischemic stroke based on CT brain scans. Clinical trial results show that Viz.AI's AI system saves up to 45% of the time needed for diagnosis and patient care [5], which is significant for cases where early diagnosis and medical intervention are critical for patient survival.

Today, the healthcare system in Vietnam has seen many positive changes, but investment and expenditure on healthcare remain limited. Recent statistics show that Vietnam has just over 8 doctors per 10,000 people, a low ratio even compared with other countries in Southeast Asia. Given this situation, it will take many years for Vietnam to catch up with Singapore, which has 23 doctors per 10,000 people [6]. Another issue in Vietnam is the significant gap in skill levels between local hospitals (district and provincial levels) and central hospitals or hospitals in major cities. Based on these realities, the use of digital technologies, primarily big data and artificial intelligence, to build an intelligent healthcare system is seen as a way to rapidly transform Vietnam's healthcare [7]. Digital technologies will be applied to develop early disease diagnosis solutions that are cost-effective and easily accessible to users on a large scale [8, 9].

Over the past five years, the rapid development of big data and computational capabilities has led to significant advancements in AI models. Breakthroughs in basic research and in the application of AI in healthcare have been continually published and implemented [10-12]. In summary, AI models can assist doctors throughout the clinical examination process based on medical images. AI helps medical imaging devices produce images faster, with better quality, and at lower cost. Tasks such as analysis, disease diagnosis, and automatic report generation can also be handled by AI applications [13, 14]. Among these tasks, AI has been applied most extensively to support disease diagnosis based on imaging [15-17], especially the early detection of cancer-related conditions [18, 19]. Computer-aided detection (CADe) and computer-aided diagnosis (CADx) systems have reduced the errors associated with traditional diagnostic methods, which rely primarily on physicians' experience [20, 21]. The performance of such systems plays a crucial role in enhancing the quality of diagnostic work.

The analysis of current practice in medical imaging diagnostics shows that applying AI to medical image processing in order to develop detection or diagnosis support systems is a new area of research with significant contributions to healthcare. The topic “Enhancing the Effectiveness of Cancer Diagnosis Support Based on Image Processing Techniques and Convolutional Neural Networks” has been chosen for the doctoral dissertation in Electronics Engineering. The research results are applied to publicly available medical image datasets that have been validated and labeled by reputable diagnostic imaging physicians.

The challenges for this dissertation include the need for high-speed image processing equipment within the researcher’s independent study at the university, the collection of validated standard datasets from hospitals or from permitted public sources, and ensuring the reliability of the data so that the experiments are accurate.


2 Objectives of the Dissertation

The objective of this dissertation is to explore various solutions to enhance the accuracy of computer-aided diagnosis (CADx) systems for medical imaging, utilizing image processing techniques and convolutional neural networks. The proposed solutions aim to support cancer diagnosis through imaging, making it easier for diagnostic imaging physicians to diagnose diseases and formulate treatment plans.

3 Research Subjects, Scope, and Methodology

Research Subjects of the Dissertation:

- Medical images of certain types of cancer that can be detected early through imaging, with datasets that have been publicly released in previous studies and are legally permitted for use in the experimental research of the dissertation.

- Image processing techniques and convolutional neural networks applied to the tasks of image segmentation and classification.

Scope of the Research: Medical images come in many types, taken from different parts of the human body, each with distinct characteristics and processing solutions. The study concentrates on medical images with publicly available datasets to reduce the time needed for data collection and to ensure data reliability. Thus, the research mainly focuses on two key issues:

- In-depth exploration of technical solutions to enhance the accuracy of models for detecting and segmenting areas of interest in medical images based on image processing techniques and convolutional neural networks;

- Development of solutions for classifying medical images based on CNN architecture with limited training data.

Research Methodology:

Theoretical Research: Analyzing and evaluating studies on the detection and segmentation of objects in medical images and on medical image classification published in the literature and in journals; synthesizing relevant information regarding the research subjects; selecting successful approaches based on published research results; proposing new solutions within the research scope.

Experimental Research: Writing programs in Python for the proposed solutions; conducting experiments with the newly proposed programs using publicly available medical image datasets previously utilized in other studies; comparing and evaluating experimental results against published research outcomes to validate the research findings.

4 Scientific and Practical Significance

Scientific Significance:

The application of scientific and technical advancements in healthcare to enhance the quality of diagnosis and treatment is essential. The use of advanced models and image processing techniques to assist in disease diagnosis through medical imaging is of great interest to many healthcare institutions and scientific organizations. The dissertation investigates and proposes solutions to enhance the accuracy of models for detecting and segmenting objects in medical images and for classifying medical images based on AI (deep learning networks), making it a highly significant topic in the field of medical imaging diagnosis.

Practical Significance:

Given the shortage of resources in medical imaging, the application of artificial intelligence in medical image processing to develop computer-aided detection or diagnosis systems has high practical significance and contributes greatly to the healthcare sector. Computer-aided diagnostic imaging systems have been developed to assist physicians in diagnosing certain medical conditions based on medical images. Enhancing the performance of such systems plays a crucial role in improving the quality of diagnostic work and in reducing the errors of traditional diagnostic methods, which rely primarily on physicians' experience.


5 Contributions of the Dissertation

First Contribution: Proposing a medical image segmentation solution using a single segmentation network based on an improved U-Net structure, featuring more layers than the traditional U-Net, increasing learning capacity and reducing gradient loss during training. This enhances the efficiency of detecting and segmenting objects in medical images through a solution utilizing multi-resolution images derived from the original.

Second Contribution: Proposing a solution to design a medical image classification system based on CNN architecture. This classification system is applicable in cases where little data is available for model training, while still maintaining high classification accuracy.

6 Structure of the Dissertation

The dissertation consists of three chapters:

Chapter 1: Overview of Medical Image Processing and the Application of Convolutional Neural Networks in Image-Based Diagnosis Support. This chapter presents an overview of medical image processing, AI techniques, and the structures of convolutional neural networks developed for object detection and segmentation in images. It also discusses image classification techniques and the previous studies relevant to the research objectives and subjects of the dissertation.

Chapter 2: Solutions to Enhance the Segmentation Efficiency of Tumors in Breast Ultrasound Images. This chapter discusses the newly proposed solution for developing AI network models that enhance the effectiveness of detecting and segmenting objects in medical images. It details the characteristics of the breast ultrasound image datasets (BUS and BUSI) used in the experiments, the experimental process, and the results of segmenting images in the BUS and BUSI datasets with the proposed solution.

Chapter 3: Solutions to Enhance the Classification Efficiency of Thyroid Tumors Using Limited Training Data. This chapter presents the newly proposed solution for developing an image classification model that overcomes the limitations of convolutional neural networks in medical image classification tasks when there is little training data. It outlines the characteristics of the thyroid ultrasound image dataset (TDID) used in the experiments and the results of classifying images in the TDID dataset with the proposed solution.


CHAPTER 1 OVERVIEW OF MEDICAL IMAGE PROCESSING AND THE APPLICATION OF CONVOLUTIONAL NEURAL NETWORKS IN IMAGE DIAGNOSTICS

Summary:

In this chapter, the author presents an overview of medical imaging techniques, the characteristics of various fundamental medical imaging methods, and the application of convolutional neural networks (CNNs) in analyzing medical images. The chapter further discusses and analyzes the structure of computer-aided diagnosis (CAD) systems. It reviews research findings relevant to this dissertation, drawing on recent publications both domestic and international, and highlights existing issues in prior studies. Based on these theoretical foundations and the challenges identified in previous research, the author proposes new solutions to enhance diagnostic support for various types of cancer using image processing techniques and convolutional neural networks.

1.1 Overview of Medical Imaging

Modern medicine diagnoses diseases based on clinical symptoms (clinical diagnosis) and paraclinical symptoms (paraclinical diagnosis). In paraclinical diagnosis, imaging techniques derived from medical devices play an increasingly significant role. Today, software advancements have further improved the clarity and accuracy of medical images.

Medical imaging methods are diverse, including X-ray imaging, ultrasound, color Doppler ultrasound, endoscopic imaging, computed tomography (CT), magnetic resonance imaging (MRI), and microscopy.

Medical imaging has significantly contributed to enhancing the accuracy, timeliness, and effectiveness of diagnoses. For instance, using ultrasound images, doctors can measure the sizes of solid organs in the abdomen (liver, spleen, kidneys, pancreas) and detect abnormalities. From echocardiograms, the structure and size of heart chambers, valves, and major blood vessels can be assessed. In obstetrics, ultrasound helps determine and monitor fetal development. CT images aid in identifying certain conditions in the brain, particularly intracranial hemorrhages and brain tumors. MRI provides more precise details regarding abnormalities in the body.

1.2 Medical Images and Basic Techniques in Medical Image Processing

1.3 Analyzing Medical Images with Convolutional Neural Networks

1.3.1 Convolutional Neural Networks (CNNs)

1.3.2 CNNs in Object Detection in Medical Images

1.3.3 CNNs in Medical Image Segmentation

1.3.4 CNNs in Medical Image Fusion

1.3.5 CNNs in Medical Image Classification

1.4 Computer-Aided Diagnosis Systems for Medical Images

1.4.1 Recent Research in CAD Systems for Medical Images

According to previous studies [62], computer-aided diagnosis (CAD) systems typically consist of four stages, as illustrated in Figure 1.11.

Image Preprocessing: This stage may involve tasks such as resizing and adjusting resolution without altering key features of the images prior to diagnosis.

Image Segmentation: Image segmentation divides the image into non-overlapping regions, isolating objects from the overall image. The role of segmentation is to reduce image complexity, making subsequent image processing or analysis simpler. This is one of the most challenging tasks in image processing and pattern recognition, and it significantly influences the quality of the final analysis in a comprehensive CAD system.


Figure 1.11: Block Diagram of CAD Systems for Medical Images [62] (blocks: Medical Image, Preprocessing, Segmentation, Feature Extraction and Selection, Classification, Evaluation)

Feature Extraction and Selection: This step aims to find the feature vector of the object of interest in the image. Based on this feature vector, the system can distinguish whether the object is a lesion or non-lesion, or benign or malignant in cancer cases. The feature space can be very large and complex, making it crucial to extract and select the most effective features.

Classification: Using the selected features, the suspected regions are classified as lesion/non-lesion or benign/malignant using various classification methods. Among the techniques applied in CAD systems, the two most critical ones that determine the system's performance are image segmentation and classification.

1.4.2 Evaluation of Previous Research and Author's Proposed Solutions

The surveys presented in Section 1.4.1 indicate that solutions for image segmentation and classification in medical imaging perform best when they apply CNNs. For medical image segmentation tasks, authors have utilized three main network structures: Faster-RCNN, FCN-AlexNet, and U-Net. For medical image classification tasks, ResNet and Inception (GoogLeNet) are the structures commonly applied in many studies. The solutions in the reviewed studies have achieved positive results in image segmentation and classification, as indicated by the evaluation criteria presented by the authors in their publications.

However, through analysis, the author identifies limitations in these studies that need improvement:

Limitations in Medical Image Segmentation: Given its medical context, the accuracy requirement for computer-aided diagnosis systems is very high and depends largely on the results of the segmentation stage. The first issue encountered in the surveyed studies is the detection of regions containing tumors, which is affected by noise and by tumor sizes that vary across disease stages. Common solutions such as active models, U-Net, Residual UNet, and UNet++ struggle to accurately identify small tumors or large, calcified tumors. For small tumors, applying common segmentation network structures leads to information loss after a few convolutional layers, because feature maps decrease in size after each pooling layer, which makes the tumor difficult to locate. The second issue is that, to enhance segmentation network performance, previous studies have combined different network models for object detection or used a large volume of training data. This approach complicates the segmentation model because multiple network structures must be processed.
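The loss of small tumors through pooling can be illustrated with simple arithmetic. The sketch below assumes 2x2 pooling with stride 2 and four pooling stages, as in the original U-Net encoder; the image and lesion sizes are hypothetical and are not taken from the dissertation.

```python
def size_after_pooling(size_px, n_pool_layers, factor=2):
    """Side length (in pixels) of a region after repeated pooling.

    Assumes each pooling layer divides the side length by `factor`
    (2x2 pooling with stride 2), rounding down.
    """
    for _ in range(n_pool_layers):
        size_px //= factor
    return size_px


# Hypothetical example: a 256-pixel-wide image with a 16-pixel-wide lesion.
image_side = size_after_pooling(256, 4)   # feature map side after 4 poolings
lesion_side = size_after_pooling(16, 4)   # lesion footprint on that feature map
tiny_lesion = size_after_pooling(8, 4)    # an even smaller lesion vanishes
```

Under these assumptions the 16-pixel lesion shrinks to a single feature-map cell and the 8-pixel lesion to none, which is why very small tumors are hard to localize in the deepest layers.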

Limitations in Medical Image Classification: To achieve high accuracy, classification models require a substantial amount of training data, and collecting these data takes considerable time and effort. Furthermore, expensive imaging equipment and the need for cooperation and agreements between doctors and patients complicate the process. Training can also be time-consuming. A significant issue is that conventional classification models may not be applicable when a large volume of medical images is not available for training, which leads to overfitting during training or to low classification accuracy.

Based on the identified limitations and gaps in previous research, the author's research direction focuses on developing solutions to enhance the performance of detection, segmentation, and classification models for medical images, thereby improving the effectiveness of computer-aided diagnosis systems, particularly for specific types of cancer. This aims to facilitate practical applications in hospitals, making it easier for physicians to diagnose diseases and plan treatments.

Proposal 1: New Approach for Detection and Segmentation Models in Medical Imaging


Create new images by resizing the original images, thus generating a new dataset containing objects of various sizes. This approach allows the segmentation network to learn more effectively and detect objects of different sizes, accommodating the characteristics of tumors, which often change in size over the course of the disease.

Use a single segmentation network that is newly constructed and improved based on traditional segmentation structures for object detection and segmentation. This method helps reduce the model size while allowing for a deeper network structure compared to previous studies, which utilized multiple different networks to detect objects in the input images and then combined the results from these networks to produce the final segmentation results.

Combine the detection and segmentation results of images containing large objects (original images) and small objects (resized images). This will enhance the detection and segmentation outcomes compared to using only the results from a single original input image.

Proposal 2: New Approach for Classifying Medical Images

Develop a medical image classification model that can be trained on a small amount of training data, making it applicable in challenging cases where data collection is difficult. Use a pre-trained CNN for feature extraction from the input images, serving as the backbone for the classification model. Train the classification model using pairs of images from the initial database; this solution significantly increases the training data and mitigates the issue of non-convergence during neural network training. Improve classification performance by comparing the input images with a set of reference images stored in the database.
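How training on image pairs enlarges the training set can be sketched as follows. This is a minimal illustration, assuming that every unordered pair of images is used and that a pair is labeled 1 when both images share a class and 0 otherwise; the summary does not state the actual pairing strategy or pair labels, and `make_pairs` is a hypothetical helper.

```python
from itertools import combinations


def make_pairs(labels):
    """Build all unordered image pairs from a list of class labels.

    Returns tuples (i, j, same_class), where i and j are image indices and
    same_class is 1 if both images have the same label, else 0.
    The pair label is an assumption made for illustration.
    """
    return [
        (i, j, int(labels[i] == labels[j]))
        for i, j in combinations(range(len(labels)), 2)
    ]


# Hypothetical toy database of four labeled images.
toy_labels = ["benign", "benign", "malignant", "malignant"]
toy_pairs = make_pairs(toy_labels)

# With n images there are n * (n - 1) / 2 pairs: 100 images give 4950 pairs.
n_pairs_100 = len(make_pairs(["benign"] * 50 + ["malignant"] * 50))
```

The quadratic growth in the number of pairs is what allows a small database to yield a much larger set of training examples.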

1.5 Conclusion of Chapter 1

Chapter 1 provided an overview of the application of image processing in healthcare, including the necessity of medical image processing in clinical diagnosis, the various imaging modalities used in diagnostic imaging, and the characteristics of these imaging methods.

This chapter also discussed core techniques in medical image processing, solutions involving the application of artificial intelligence in medical image processing, and the AI models used in analyzing medical images to develop computer-aided detection (CADe) systems and computer-aided diagnosis (CADx) systems.

Furthermore, it presented a comprehensive overview of prior research related to segmentation and classification solutions in medical imaging. To date, global studies have achieved significant successes in data collection, data augmentation solutions, and the development of detection, segmentation, and classification models based on convolutional neural networks with various architectures, such as ResNet, InceptionNet, and U-Net.


CHAPTER 2 SOLUTIONS TO ENHANCE TUMOR SEGMENTATION EFFICIENCY IN BREAST ULTRASOUND IMAGES

Summary:

This chapter presents the solution proposed in the thesis for segmenting breast ultrasound images. The author first introduces the new segmentation method. This is followed by a detailed description of the solution, which includes: a zero-padding-scaling data augmentation technique to create a generalized dataset with multi-resolution images; an improved segmentation network structure utilizing residual blocks based on the traditional U-Net architecture; and an algorithm to enhance the accuracy of multi-resolution image segmentation. The experiments validate the proposed solution on breast ultrasound image datasets, analyzing and evaluating the experimental results in comparison with other studies that utilized the same datasets.

2.1 Overview of the Proposed Solution for Medical Image Segmentation

Figure 2.2 illustrates the segmentation solution proposed in the thesis. First, a set of input images is generated from each original image using the zero-padding-scaling technique. These augmented images are then used to train and test a deep learning-based segmentation network that delineates the lesions in each image. Finally, the segmentation results of the scaled images are combined to form the final segmented image.

Figure 2.2 Overview of the proposed segmentation solution [84]

As shown in Figure 2.2, the proposed solution primarily employs a deep learning-based segmentation network for lesion segmentation. However, the author focuses on three key improvements to address the issues in the lesion segmentation system. First, to tackle the challenge of variable lesion regions due to differences in disease stages, the zero-padding-scaling solution is proposed to increase the volume of collected image data. Basic data augmentation is performed on the original dataset, resulting in a larger training dataset containing numerous images with varying lesion sizes. Detailed information on this step is provided in Section 2.2. The second improvement is a single segmentation network designed on the basis of the traditional U-Net structure [36], with additional residual connections that allow for a deeper segmentation network while ensuring effective training and enhanced segmentation accuracy compared to previous studies, as detailed in Section 2.3. Lastly, the lesion segmentations obtained from input images at different scales are combined to enhance segmentation accuracy. Detailed explanations for this step can be found in Section 2.4.


2.2 Data Augmentation Using Original Image Resizing and Border Padding

In this study, the zero-padding-scaling solution produces images of varying resolutions using five scale factors (0.9, 0.8, 0.7, 0.6, 0.5). Each original image therefore yields six images (the original and five scaled images), in which the lesion sizes are ordered from largest to smallest, as shown in Figure 2.9.
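The zero-padding-scaling step can be sketched as follows. This is a minimal illustration rather than the thesis implementation: it assumes a single-channel image, nearest-neighbour resizing, and that the shrunken image is centred on a zero canvas of the original size; none of these details is specified in this summary. The scale factors are the ones listed above.

```python
import numpy as np

SCALE_FACTORS = (0.9, 0.8, 0.7, 0.6, 0.5)  # scale factors given in the text


def zero_padding_scale(image, scale):
    """Shrink a 2-D image by `scale` and zero-pad it back to its original size."""
    h, w = image.shape
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    # Nearest-neighbour resize (assumption: interpolation method is not specified).
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    small = image[np.ix_(rows, cols)]
    # Place the shrunken image at the centre of a zero canvas (assumption).
    canvas = np.zeros_like(image)
    top, left = (h - new_h) // 2, (w - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = small
    return canvas


def augment(image):
    """Return the original image followed by its five zero-padded scaled copies."""
    return [image] + [zero_padding_scale(image, s) for s in SCALE_FACTORS]


# Hypothetical 100 x 120 image filled with ones, so sums equal covered areas.
demo = augment(np.ones((100, 120)))
```

Because every output keeps the original height and width, the same network input size can be used for all six images while the lesion occupies a progressively smaller area. The same transformation would have to be applied to the ground-truth masks during training.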

2.3 Proposed Segmentation Network

In this study, the segmentation network is built on an encoder-decoder architecture with additional residual connections. Unlike traditional deep learning classification networks, the segmentation network does not include fully connected (FC) layers. Instead, it predicts the class label for each pixel (pixel-wise classification) using a convolutional network.

Figure 2.6 Convolutional block diagram: (a) conventional convolutional block; (b) convolutional block with residual connection [88]

By employing the additional residual connections shown in Figure 2.6, a U-Net-based segmentation network [36] is constructed, as illustrated in Figure 2.7.


Figure 2.7 Improved segmentation network architecture in the thesis [84]

Figure 2.7 depicts the segmentation network architecture for lesion segmentation in ultrasound images, which is similar to the standard U-Net structure [36]. However, it differs by using residual connections to transmit image information rather than relying solely on convolutional layers. Additionally, the image matrices in the downsampling paths are further processed using the residual connection to incorporate additional image information.
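The difference between a conventional convolutional block and a block with a residual connection can be sketched in NumPy. This is a simplified single-channel illustration under stated assumptions: two 3x3 convolutions with ReLU activations and an identity shortcut added before the final activation. The exact layer arrangement of Figure 2.6 (for example, normalization layers or channel counts) is not recoverable from this summary, and the helper names are hypothetical.

```python
import numpy as np


def conv3x3_same(x, kernel):
    """3x3 cross-correlation of a 2-D array with zero padding ('same' size)."""
    padded = np.pad(x, 1)  # one-pixel zero border
    out = np.zeros_like(x, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    return out


def relu(x):
    return np.maximum(x, 0.0)


def plain_block(x, k1, k2):
    """Conventional block: conv -> ReLU -> conv -> ReLU."""
    return relu(conv3x3_same(relu(conv3x3_same(x, k1)), k2))


def residual_block(x, k1, k2):
    """Residual block: the input is added back before the final ReLU."""
    return relu(conv3x3_same(relu(conv3x3_same(x, k1)), k2) + x)


# With all-zero kernels the plain block erases the input,
# while the residual block still passes it through the shortcut.
x = np.arange(16.0).reshape(4, 4)
zero_kernel = np.zeros((3, 3))
plain_out = plain_block(x, zero_kernel, zero_kernel)
residual_out = residual_block(x, zero_kernel, zero_kernel)
```

The zero-kernel case shows why the shortcut helps deeper networks train: image information (and, during training, the gradient) has a direct path around the convolutional layers.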

2.4 Synthesizing Output Segmented Images

To combine the outputs from the corresponding images, three coordination principles are employed: AND, OR, and DOMINANT. The AND principle overlays the results using the logical AND operation. Consequently, the final output image contains the smallest overlapping region across all output images from the segmentation network. In other words, a pixel in the final image is considered a lesion only if all images in the output series label it as a lesion. The OR principle selects the broadest coverage from the output series using the logical OR operation. Hence, a pixel in the final image is labeled a lesion if at least one image in the output series identifies it as such. Lastly, the DOMINANT principle is based on the most prevalent result in the output series. Thus, a pixel in the final output image is regarded as a lesion if the majority of the output images, that is, a fraction above the DOMINANT threshold of 0.5, classify it as a lesion. These coordination principles are expressed in equations (2.6)-(2.8):

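$$\mathrm{AND\_RULE} = \mathrm{AND}(O_i) \qquad (2.6)$$

$$\mathrm{OR\_RULE} = \mathrm{OR}(O_i) \qquad (2.7)$$

$$\mathrm{DOMINANT} = \left(\frac{1}{N}\sum_{i=1}^{N} O_i\right) > 0.5 \qquad (2.8)$$

where $O_i$ is the binary segmentation output for the $i$-th image in the output series and $N$ is the number of output images.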
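The three coordination principles described above can be sketched in NumPy. The sketch assumes that the output masks have already been mapped back to a common size and are pixel-aligned, and it interprets DOMINANT as the fraction of masks voting "lesion" exceeding 0.5; the function name `fuse_masks` is hypothetical.

```python
import numpy as np


def fuse_masks(masks, rule="DOMINANT", threshold=0.5):
    """Fuse a series of aligned binary masks into one final mask.

    masks: sequence of equally sized arrays, nonzero = lesion pixel.
    rule:  "AND", "OR", or "DOMINANT".
    """
    stack = np.stack([np.asarray(m).astype(bool) for m in masks])
    if rule == "AND":
        # Lesion only where every output agrees.
        return stack.all(axis=0)
    if rule == "OR":
        # Lesion where at least one output says so.
        return stack.any(axis=0)
    if rule == "DOMINANT":
        # Lesion where the fraction of positive votes exceeds the threshold.
        return stack.mean(axis=0) > threshold
    raise ValueError(f"unknown rule: {rule}")


# Hypothetical outputs for three scaled versions of a 1 x 3 image.
outputs = [
    np.array([[1, 1, 0]]),
    np.array([[1, 0, 0]]),
    np.array([[1, 1, 0]]),
]
fused_and = fuse_masks(outputs, "AND")
fused_or = fuse_masks(outputs, "OR")
fused_dominant = fuse_masks(outputs, "DOMINANT")
```

In this toy case the middle pixel is rejected by AND (one output disagrees) but accepted by OR and by DOMINANT (two of three outputs agree), which shows how AND yields the smallest region and OR the largest.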

Figure 2.8 Example of object segmentation across scaled images: (a) left image is the input ultrasound image; right image shows the lesion region against a black background; (b) output segmentation results of the scaled images


2.5 Experimentation of the Proposed Segmentation Solution

Previous studies have shown that breast cancer is one of the leading causes of death among women worldwide [62, 64, 83]. According to the World Health Organization (WHO), approximately 2.3 million women were diagnosed with breast cancer in 2020, and 685,000 died from it [90]. However, breast cancer can be treated effectively, especially if detected early. Consequently, early detection and treatment are crucial in reducing mortality rates.

Diagnosing breast cancer using ultrasound images remains a challenge within breast diagnostic systems because it is time-consuming and requires in-depth knowledge from radiologists. This often leads to low diagnostic efficiency. To assist radiologists, computer-aided diagnostic systems are being developed. These systems support radiologists in reading breast ultrasound images and aim to improve the efficiency of the diagnostic process.

In these experiments, the author utilizes two public datasets: the BUS dataset [37] and the BUSI dataset [91]. These datasets have been employed in previous studies focusing on breast lesion segmentation. The statistical characteristics of both datasets are presented in Table 2.1. The BUS dataset [37] is the smaller of the two, containing 163 images collected in 2012 at the UDIAT diagnostic center of Parc Tauli, Spain, using the ACUSON Sequoia C512 system with a 17L5HD linear array transducer (8.5 MHz).

Table 2.1 Description of BUS and BUSI datasets used in the experiments

Dataset | Images with lesions | Images without lesions | Image resolution (pixels)

The newly released BUSI dataset by Al-Dhabyani et al. [91] contains 780 breast ultrasound images from women aged 25 to 75. Among these, 647 images contain lesions: images 1 to 437 show benign lesions and images 438 to 647 show malignant lesions. The remaining 133 images, from 648 to 780, contain no lesions. This study utilizes only the 647 images containing breast lesions for experimentation and excludes the 133 images without lesions.

To evaluate the segmentation performance of the proposed solution and compare it with previous studies, five-fold cross-validation is employed, a common testing method in image segmentation and classification experiments. This approach gives more reliable and generalizable experimental results. In this cross-validation, each dataset (163 images in the BUS dataset and 647 images in the BUSI dataset) is randomly divided into five parts of approximately equal size with no overlapping images.
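The random five-way split can be sketched with the Python standard library. This is a minimal illustration, assuming images are identified by index and that a fixed random seed is used for reproducibility; the seed and the function name `five_fold_splits` are hypothetical, and the thesis may have used a library routine instead.

```python
import random


def five_fold_splits(n_images, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) for each fold.

    Indices are shuffled once, then dealt into `n_folds` disjoint parts whose
    sizes differ by at most one image.
    """
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


# The BUS dataset has 163 images, so the folds hold 33, 33, 33, 32 and 32 images.
bus_splits = list(five_fold_splits(163))
bus_fold_sizes = [len(test) for _, test in bus_splits]
```

Each image appears in exactly one test fold, so every image is used for testing once and for training four times.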

Title	Improve the Effectiveness of Supporting Diagnosis of Some Forms of Cancer Based on Image Processing Techniques and Convolutional Neural Networks
Author	Hoang Quoc Tuan
Supervisors	Assoc. Prof. Dr. Bui Trung Thanh, Dr. Pham Xuan Hien
Institution	Hung Yen University of Technology and Education
Major	Electronic Engineering
Type	Dissertation Summary
Year of publication	2024
City	Hung Yen
