Source: Ulsan Univ. Repository, General Graduate School, Medical Engineering, Theses (Master)

Effective Representation Learning Methods for Inductive Transfer Learning in the Medical Domain

Abstract: Deep learning has been used in a wide range of fields with impressive results. However, applying it in the medical domain remains difficult because of scarce training data, performance degradation caused by modality or domain differences, high-definition images such as radiographs and computed tomography (CT) scans, and the need for robustness across medical centers. To address these issues, inductive transfer learning, a form of representation learning that makes effective use of the features derived from a network, has been intensively researched, in particular sequential transfer learning and multi-task learning. In this study, three experiments were performed to determine how representation learning based on inductive transfer learning affects medical domains: ‘Application of deep representation on pediatric diagnosis’, ‘Application of deep representation on brain hemorrhage diagnosis’, and ‘Application of deep representation on low-dose CT denoising task’. In the first study, sequential transfer learning was applied to improve performance. We constructed class-balanced pediatric radiograph datasets, PedXnets, using labels based on radiographic views, and developed supervised representations from them. We validated the effect of this representation learning on pediatric downstream tasks, including fracture classification and bone age assessment. Transfer learning from Model-PedXnets showed better quantitative performance than Model-Baseline. Model-PedXnets performed equivalently to, and in some cases better than, Model-ImageNet. In particular, Model-PedXnets focused on the most meaningful regions. In the second study, multi-task learning was applied to improve robustness. We proposed a supervised multi-task aiding representation transfer learning network (SMART-Net) for the diagnosis of intracranial hemorrhage (ICH). The proposed framework consists of upstream and downstream components.
In the upstream, a weight-shared encoder is trained as a robust feature extractor that captures global features by performing slice-level multi-pretext tasks. In the downstream, transfer learning is conducted with the pre-trained encoder and a 3D operator for volume-level tasks. Experimental results on four test sets indicate that SMART-Net is more robust and performs better than previous methods in volume-level ICH classification and segmentation. In the third study, multi-task learning was applied to stabilize discriminator learning. We propose a multi-task discriminator based generative adversarial network (MTD-GAN) whose discriminator simultaneously performs three vision tasks (classification, segmentation, and reconstruction). To stabilize GAN training, we introduce two novel loss functions, termed non-difference suppression (NDS) loss and reconstruction consistency (RC) loss. Furthermore, we use a fast Fourier transform with convolution block (FFT-Conv Block) in the generator to make use of both high- and low-frequency features. Our model was evaluated with pixel-space and feature-space metrics on a head and neck low-dose CT (LDCT) denoising task, and the results show that it outperforms state-of-the-art denoising methods both quantitatively and qualitatively. All three studies confirmed that representation learning, which includes sequential transfer learning and multi-task learning, can enhance performance, extract semantic information, and make models robust to external data in medical domains. Rather than simply evaluating models trained from scratch, future applications of artificial intelligence to medical domains should include representation learning.

Abstract (translated from the Korean version): Deep learning has been used in a variety of fields and has produced impressive results. However, difficulties remain in applying deep learning in the medical domain, including a lack of training data, performance degradation caused by differences in imaging equipment or domain, high-definition images such as radiographs and CT scans, and robustness to other medical centers.
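To address these problems, inductive transfer learning, a form of representation learning that makes skillful use of the features derived from a network, has been actively studied, namely sequential transfer learning and multi-task learning. In this study, three experiments were performed to determine how representation learning, and in particular inductive transfer learning including sequential transfer learning and multi-task learning, affects the medical domain: ‘A study of deep representation for pediatric diagnosis’, ‘A study of deep representation for brain hemorrhage diagnosis’, and ‘A study of deep representation for the low-dose CT denoising task’. In the first study, sequential transfer learning was applied to improve performance. We constructed class-balanced pediatric radiograph datasets, PedXnet, using labels based on radiographic views, and developed supervised representations. We validated the effect of representation learning on pediatric downstream tasks, including fracture classification and bone age assessment. Transfer learning from Model-PedXnets showed better quantitative performance than Model-Baseline. Model-PedXnets performed equivalently to Model-ImageNet and in some cases better. In particular, Model-PedXnets focused on the most meaningful ROIs. In the second study, multi-task learning was applied for robustness. We proposed a supervised multi-task aiding representation transfer learning network (SMART-Net) for the diagnosis of intracranial hemorrhage (ICH). The proposed framework consists of upstream and downstream components. In the upstream, the weight-shared encoder of the model is trained as a robust feature extractor that captures global features by performing slice-level multi-pretext tasks. In the downstream, transfer learning was performed with the pre-trained encoder and a 3D operator for patient-level tasks. Experimental results on four test sets show that SMART-Net is more robust and performs better than previous methods in volume-level ICH classification and segmentation. In the third study, multi-task learning was applied to stabilize discriminator learning. We propose a multi-task discriminator GAN (MTD-GAN) to better regularize low-dose computed tomography (LDCT) denoising models. The model uses three tasks in the discriminator of the GAN framework to learn both global and local differences between denoised images and normal-dose images. We also propose an FFT-Generator that can learn fine structural details by using both the image and Fourier domains, which improves the CT denoising task. As a result, MTD-GAN achieves more radiologist-friendly performance than previous methods in both quantitative and qualitative results. All three studies confirmed that representation learning, including sequential transfer learning and multi-task learning, can improve performance, extract semantic features, and make models robust to external data in the medical domain. Future applications of artificial intelligence to the medical domain should consider representation learning rather than simply evaluating the performance of models trained from scratch.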
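
As a rough illustration of the multi-task learning idea used in the second and third studies, the sketch below combines classification, segmentation, and reconstruction losses into one weighted objective. It is not the thesis's implementation. The loss choices (binary cross-entropy, soft Dice, mean absolute error), the equal default weights, and all function and dictionary key names are assumptions made for illustration. The NDS and RC losses are not reproduced, because the abstract does not define them.

```python
import math


def bce_loss(prob, label, eps=1e-7):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def dice_loss(pred_mask, true_mask, smooth=1.0):
    """Soft Dice loss over flattened masks with values in [0, 1]."""
    inter = sum(p * t for p, t in zip(pred_mask, true_mask))
    total = sum(pred_mask) + sum(true_mask)
    return 1.0 - (2.0 * inter + smooth) / (total + smooth)


def l1_loss(recon, image):
    """Mean absolute error between reconstructed and original pixels."""
    return sum(abs(r - x) for r, x in zip(recon, image)) / len(image)


def multi_task_loss(outputs, targets, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of three task losses (hypothetical equal weights by default).

    In a shared-encoder setup, all three outputs would come from heads that sit
    on the same encoder, so minimizing this sum updates one common representation.
    """
    w_cls, w_seg, w_rec = weights
    parts = {
        "cls": bce_loss(outputs["prob"], targets["label"]),
        "seg": dice_loss(outputs["mask"], targets["mask"]),
        "rec": l1_loss(outputs["recon"], targets["image"]),
    }
    total = w_cls * parts["cls"] + w_seg * parts["seg"] + w_rec * parts["rec"]
    return total, parts


# Toy 4-pixel "slice": the values are made up purely for illustration.
targets = {"label": 1, "mask": [0, 1, 1, 0], "image": [0.2, 0.8, 0.9, 0.1]}
good = {"prob": 0.95, "mask": [0.05, 0.9, 0.9, 0.05], "recon": [0.2, 0.8, 0.9, 0.1]}
bad = {"prob": 0.30, "mask": [0.9, 0.1, 0.1, 0.9], "recon": [0.9, 0.1, 0.2, 0.8]}

good_total, good_parts = multi_task_loss(good, targets)
bad_total, bad_parts = multi_task_loss(bad, targets)
print(good_total < bad_total)  # the better prediction yields the lower combined loss
```

Setting one weight to zero drops that task from the objective, which is a simple way to check how much each auxiliary task contributes in an ablation.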