Abstract:
With the rapid development of deep learning, image quality enhancement, a critical part of the post-processing pipeline, has been revolutionized by neural networks. This surge of innovation is particularly prominent in computer vision. Progress in medical imaging, however, has been more cautious because of the unique challenges posed by the nature of the data. In particular, the use of 3D data makes post-processing significantly more difficult than for 2D data. Moreover, in medical imaging the fidelity of the data matters more than the perceptual quality of the images, as it directly impacts the scientific and diagnostic value of the results. To better understand the underlying confounding factors and explore the potential advantages of machine learning in medical imaging post-processing, this thesis focuses on image quality enhancement tasks for structural Magnetic Resonance Imaging (MRI).
This thesis investigates three types of image quality enhancement tasks in anatomical MRI: super-resolution, retrospective motion correction, and noise removal. All of these tasks can be conceptualized as inverse problems, which align well with the unsupervised learning paradigm. Traditionally, addressing them has required either specialized hardware support or engineering-intensive modifications to scanning protocols, demanding non-trivial technical expertise and effort. In contrast, advances in deep learning have enabled these problems to be approached by approximating surjective mappings between input and target data distributions. However, this approach assumes a compact representation of features in the dataset, an assumption that often does not hold in medical imaging, where variability such as differences in vendors or scanning protocols can affect the data representation. This further motivates the development of data-efficient neural networks for post-processing structural MRI.
Strategically, all of these tasks are approached through a three-step framework: (1) simulating corrupted data from the ground truth, (2) training a neural network to approximate the mapping between the corrupted and the ground truth data, and (3) validating the trained model on real-world corrupted data. Despite its power and accessibility, machine learning is known to be subject to the trade-off between bias and variance, which limits its generalizability to unseen data. This issue is particularly critical in medical imaging, where data scarcity is a common challenge. A successful model should therefore generalize effectively to unseen data by learning only the most essential and representative features of the dataset.
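As an illustration of step (1) only, the following is a minimal sketch of how a corrupted volume might be simulated from a ground truth volume. It is not the thesis's actual pipeline: the function names, the k-space truncation used to mimic low resolution, the Rician noise model, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def simulate_low_res(volume, keep_fraction=0.5):
    """Hypothetical degradation: keep only the central block of k-space.

    Assumes even-sized dimensions so that the kept block is centred on DC.
    """
    kspace = np.fft.fftshift(np.fft.fftn(volume))
    mask = np.zeros(volume.shape, dtype=bool)
    slices = []
    for n in volume.shape:
        half = max(1, int(n * keep_fraction / 2))
        centre = n // 2
        slices.append(slice(centre - half, centre + half))
    mask[tuple(slices)] = True
    blurred = np.fft.ifftn(np.fft.ifftshift(kspace * mask))
    return np.abs(blurred)


def add_rician_noise(volume, sigma, rng):
    """Hypothetical noise model: magnitude of a signal with complex Gaussian noise."""
    real = volume + rng.normal(0.0, sigma, volume.shape)
    imag = rng.normal(0.0, sigma, volume.shape)
    return np.sqrt(real**2 + imag**2)


rng = np.random.default_rng(0)
ground_truth = rng.random((16, 16, 16))  # random stand-in for a clean 3D volume
corrupted = add_rician_noise(simulate_low_res(ground_truth), sigma=0.05, rng=rng)
```

Under these assumptions, each `(corrupted, ground_truth)` pair would serve as one input and target example for the network trained in step (2).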
As demonstrated by the experiments, generative models, particularly Generative Adversarial Networks (GANs) and Denoising Diffusion Probabilistic Models (DDPMs), both perform exceptionally well on such inverse problems. Specifically, GANs are efficient and exhibit strong generalizability when applied to 3D data, owing to their adversarial objective function. In contrast, DDPMs, as autoregressive estimators, are better suited for 2D data due to their stable training process. Despite their demanding memory and computational overhead, improved variants scale effectively with increasing data size, making them a promising model class for integration with multi-modality data. Additionally, the wavelet transform was found to be a powerful tool for extracting features in the frequency domain of the data. It serves as a plug-and-play module for generative models, further enhancing the quality of the restored images by preserving fine anatomical details, even without specialized loss design.
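To make the wavelet idea concrete, the following is a minimal sketch of a single-level 2D Haar decomposition and its inverse. The abstract does not state which wavelet family is used or how the module is connected to the generative models, so the Haar choice, the function names, and the sub-band names are all assumptions made for illustration.

```python
import numpy as np


def haar_dwt2(image):
    """One level of an orthonormal 2D Haar transform (even height and width assumed)."""
    a = image[0::2, 0::2]
    b = image[0::2, 1::2]
    c = image[1::2, 0::2]
    d = image[1::2, 1::2]
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a - b + c - d) / 2.0  # differences across columns
    hl = (a + b - c - d) / 2.0  # differences across rows
    hh = (a - b - c + d) / 2.0  # diagonal differences
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Invert haar_dwt2 by recombining the four sub-bands."""
    h, w = ll.shape
    image = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    image[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    image[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    image[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    image[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return image


rng = np.random.default_rng(0)
image = rng.random((8, 8))  # random stand-in for a 2D MRI slice
ll, lh, hl, hh = haar_dwt2(image)
restored = haar_idwt2(ll, lh, hl, hh)
```

Because the transform is invertible, a model could in principle operate on the four sub-bands instead of raw pixels and map its output back to image space without losing information. This is one plausible reading of "plug-and-play", not a description of the thesis's implementation.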