Medical image computing

from Wikipedia

Medical image computing (MIC) is the use of computational and mathematical methods for solving problems pertaining to medical images and their use for biomedical research and clinical care. It is an interdisciplinary field at the intersection of computer science, information engineering, electrical engineering, physics, mathematics and medicine.

The main goal of MIC is to extract clinically relevant information or knowledge from medical images. While closely related to the field of medical imaging, MIC focuses on the computational analysis of the images, not their acquisition. The methods can be grouped into several broad categories: image segmentation, image registration, image-based physiological modeling, and others.[1]

Data forms


Medical image computing typically operates on uniformly sampled data with regular x-y-z spatial spacing (images in 2D and volumes in 3D, generically referred to as images). At each sample point, data is commonly represented in integer form, such as signed or unsigned short (16-bit), although formats from unsigned char (8-bit) to 32-bit float are not uncommon. The particular meaning of the data at a sample point depends on modality: for example, a CT acquisition collects radiodensity values, while an MRI acquisition may collect T1- or T2-weighted images. Longitudinal, time-varying acquisitions may or may not acquire images with regular time steps. Fan-like images produced by modalities such as curved-array ultrasound are also common and require different representational and algorithmic techniques to process. Other data forms include sheared images due to gantry tilt during acquisition, and unstructured meshes, such as hexahedral and tetrahedral forms, which are used in advanced biomechanical analysis (e.g., tissue deformation, vascular transport, bone implants).
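
To make this data model concrete, the following minimal sketch (illustrative, not a standard library type) pairs a NumPy voxel array with its physical spacing, from which physical extents can be derived:

```python
# A minimal sketch of the uniform-grid data model described above:
# a 3D volume stored as a 16-bit integer array plus its physical
# x-y-z voxel spacing in millimetres. All names are illustrative.
from dataclasses import dataclass
import numpy as np

@dataclass
class Volume:
    data: np.ndarray   # shape (nz, ny, nx), e.g. int16 for CT radiodensity
    spacing: tuple     # (dz, dy, dx) in mm; regular spacing by assumption

    def world_extent_mm(self):
        # Physical size of the volume along each axis.
        return tuple(n * s for n, s in zip(self.data.shape, self.spacing))

# Example: a synthetic CT-like volume with anisotropic slice spacing.
ct = Volume(data=np.zeros((120, 512, 512), dtype=np.int16),
            spacing=(2.5, 0.7, 0.7))
print(ct.world_extent_mm())   # ≈ (300.0, 358.4, 358.4) mm
```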

Segmentation

A T1-weighted MR image of the brain of a patient with a meningioma after injection of an MRI contrast agent (top left), and the same image with the result of an interactive segmentation overlaid in green (3D model of the segmentation at top right, axial and coronal views at the bottom).

Segmentation is the process of partitioning an image into different meaningful segments. In medical imaging, these segments often correspond to different tissue classes, organs, pathologies, or other biologically relevant structures.[2] Medical image segmentation is made difficult by low contrast, noise, and other imaging ambiguities. Although there are many computer vision techniques for image segmentation, some have been adapted specifically for medical image computing. Below is a sampling of techniques within this field; their implementation often relies on expertise that clinicians can provide.

  • Atlas-based segmentation: For many applications, a clinical expert can manually label several images; segmenting unseen images is a matter of extrapolating from these manually labeled training images. Methods of this style are typically referred to as atlas-based segmentation methods. Parametric atlas methods typically combine these training images into a single atlas image,[3] while nonparametric atlas methods typically use all of the training images separately.[4] Atlas-based methods usually require the use of image registration in order to align the atlas image or images to a new, unseen image.
  • Shape-based segmentation: Many methods parametrize a template shape for a given structure, often relying on control points along the boundary. The entire shape is then deformed to match a new image. Two of the most common shape-based techniques are active shape models[5] and active appearance models.[6] These methods have been very influential, and have given rise to similar models.[7]
  • Image-based segmentation: Some methods initialize a template and refine its shape according to the image data while minimizing integral error measures, like the active contour model and its variations.[8]
  • Interactive segmentation: Interactive methods are useful when clinicians can provide some information, such as a seed region or rough outline of the region to segment. An algorithm can then iteratively refine such a segmentation, with or without guidance from the clinician. Manual segmentation, using tools such as a paint brush to explicitly define the tissue class of each pixel, remains the gold standard for many imaging applications. Recently, principles from feedback control theory have been incorporated into segmentation, giving the user much greater flexibility and allowing for the automatic correction of errors.[9]
  • Subjective surface segmentation: This method is based on the evolution of a segmentation function governed by an advection-diffusion model.[10] To segment an object, a segmentation seed is needed (a starting point that determines the approximate position of the object in the image), from which an initial segmentation function is constructed. The idea behind the subjective surface method[11][12][13] is that the position of the seed is the main factor determining the form of this segmentation function.
  • Convolutional neural networks (CNNs): Fully automated, computer-assisted segmentation performance has improved with advances in machine learning models. Architectures such as SegNet,[14] UNet,[15] ResNet,[16] AATSN,[17] Transformers[18] and GANs[19] have accelerated the segmentation process. In the future, such models may replace manual segmentation due to their superior performance and speed.

There are other classifications of image segmentation methods that are similar to the categories above. Another group of methods, based on a combination of approaches, can be classified as "hybrid".[20]
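
As an illustration of the seeded, interactive style described above, the following is a minimal region-growing sketch: starting from a clinician-supplied seed, it adds connected pixels whose intensity stays within a tolerance of the seed value. It is a toy 2D example, not a clinical tool:

```python
# Minimal seeded region growing on a 2D image: grow the segment to
# connected neighbours whose intensity stays within a tolerance of the
# seed intensity. Real tools add smoothing, editing, and 3D support.
from collections import deque
import numpy as np

def region_grow(image, seed, tol=50):
    seg = np.zeros(image.shape, dtype=bool)
    seed_val = float(image[seed])
    queue = deque([seed])
    seg[seed] = True
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):  # 4-connectivity
            ny, nx = y + dy, x + dx
            if (0 <= ny < image.shape[0] and 0 <= nx < image.shape[1]
                    and not seg[ny, nx]
                    and abs(float(image[ny, nx]) - seed_val) <= tol):
                seg[ny, nx] = True
                queue.append((ny, nx))
    return seg

img = np.zeros((64, 64)); img[20:40, 20:40] = 200   # bright square "lesion"
mask = region_grow(img, seed=(30, 30), tol=50)
print(mask.sum())   # 400 pixels: exactly the 20x20 bright square
```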

Registration

CT image (left), PET image (center) and overlay of both (right) after correct registration

Image registration is a process that searches for the correct alignment of images.[21][22][23][24] In the simplest case, two images are aligned. Typically, one image is treated as the target image and the other is treated as a source image; the source image is transformed to match the target image. The optimization procedure updates the transformation of the source image based on a similarity value that evaluates the current quality of the alignment. This iterative procedure is repeated until a (local) optimum is found. An example is the registration of CT and PET images to combine structural and metabolic information (see figure).
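
The iterative loop described above can be sketched in a few lines. The toy example below, which assumes a translation-only transform and the sum of squared differences (SSD) as the similarity measure, recovers a known shift between two images; production registration uses richer transforms and multi-resolution strategies:

```python
# Minimal iterative registration: optimize a translation of the source
# image so its SSD to the target is minimal.
import numpy as np
from scipy import ndimage, optimize

def ssd_cost(params, source, target):
    ty, tx = params
    moved = ndimage.shift(source, (ty, tx), order=1, mode="nearest")
    return float(np.sum((moved - target) ** 2))

rng = np.random.default_rng(0)
target = ndimage.gaussian_filter(rng.random((64, 64)), 3)
source = ndimage.shift(target, (2.0, -3.0), order=1, mode="nearest")

res = optimize.minimize(ssd_cost, x0=[0.0, 0.0], args=(source, target),
                        method="Powell")
print(res.x)   # recovered translation, approximately (-2.0, 3.0)
```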

Image registration is used in a variety of medical applications:

  • Studying temporal changes. Longitudinal studies acquire images over several months or years to study long-term processes, such as disease progression. Time series correspond to images acquired within the same session (seconds or minutes). They can be used to study cognitive processes, heart deformations and respiration.
  • Combining complementary information from different imaging modalities. An example is the fusion of anatomical and functional information. Since the size and shape of structures vary across modalities, it is more challenging to evaluate the alignment quality. This has led to the use of similarity measures such as mutual information.[25]
  • Characterizing a population of subjects. In contrast to intra-subject registration, a one-to-one mapping may not exist between subjects, depending on the structural variability of the organ of interest. Inter-subject registration is required for atlas construction in computational anatomy.[26] Here, the objective is to statistically model the anatomy of organs across subjects.
  • Computer-assisted surgery. In computer-assisted surgery pre-operative images such as CT or MRI are registered to intra-operative images or tracking systems to facilitate image guidance or navigation.

There are several important considerations when performing image registration:

  • The transformation model. Common choices are rigid, affine, and deformable transformation models. B-spline and thin plate spline models are commonly used for parameterized transformation fields. Non-parametric or dense deformation fields carry a displacement vector at every grid location; this necessitates additional regularization constraints. A specific class of deformation fields are diffeomorphisms, which are invertible transformations with a smooth inverse.
  • The similarity metric. A distance or similarity function is used to quantify the registration quality. This similarity can be calculated either on the original images or on features extracted from the images. Common similarity measures are sum of squared distances (SSD), correlation coefficient, and mutual information. The choice of similarity measure depends on whether the images are from the same modality; the acquisition noise can also play a role in this decision. For example, SSD is the optimal similarity measure for images of the same modality with Gaussian noise.[27] However, the image statistics in ultrasound are significantly different from Gaussian noise, leading to the introduction of ultrasound specific similarity measures.[28] Multi-modal registration requires a more sophisticated similarity measure; alternatively, a different image representation can be used, such as structural representations[29] or registering adjacent anatomy.[30][31] A 2020 study[32] employed contrastive coding to learn shared, dense image representations, referred to as contrastive multi-modal image representations (CoMIRs), which enabled the registration of multi-modal images where existing registration methods often fail due to a lack of sufficiently similar image structures. It reduced the multi-modal registration problem to a mono-modal one, in which general intensity based, as well as feature-based, registration algorithms can be applied.
  • The optimization procedure. Either continuous or discrete optimization is performed. For continuous optimization, gradient-based optimization techniques are applied to improve the convergence speed.
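
Two of the similarity measures named above are easy to sketch: SSD for mono-modal pairs, and mutual information, estimated here from a joint intensity histogram, for multi-modal pairs. This is a minimal illustration, not an optimized implementation:

```python
# SSD and a histogram-based mutual information estimate between images.
import numpy as np

def ssd(a, b):
    return float(np.sum((a.astype(float) - b.astype(float)) ** 2))

def mutual_information(a, b, bins=32):
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0                      # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(1)
a = rng.random((64, 64))
print(mutual_information(a, a) > mutual_information(a, rng.random((64, 64))))
# True: an image shares more information with itself than with noise.
```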

Visualization

Volume rendering (left), axial cross-section (right top), and sagittal cross-section (right bottom) of a CT image of a subject with multiple nodular lesions (white line) in the lung

Visualization plays several key roles in medical image computing. Methods from scientific visualization are used to understand and communicate about medical images, which are inherently spatial-temporal. Data visualization and data analysis are used on unstructured data forms, for example when evaluating statistical measures derived during algorithmic processing. Direct interaction with data, a key feature of the visualization process, is used to perform visual queries about data, annotate images, guide segmentation and registration processes, and control the visual representation of data (by controlling lighting, rendering properties, and viewing parameters). Visualization is used both for initial exploration and for conveying intermediate and final results of analyses.

The figure "Visualization of Medical Imaging" illustrates several types of visualization: 1. the display of cross-sections as gray scale images; 2. reformatted views of gray scale images (the sagittal view in this example has a different orientation than the original direction of the image acquisition; and 3. A 3D volume rendering of the same data. The nodular lesion is clearly visible in the different presentations and has been annotated with a white line.

Atlases


Medical images can vary significantly across individuals due to people having organs of different shapes and sizes. Therefore, representing medical images to account for this variability is crucial. A popular approach to represent medical images is through the use of one or more atlases. Here, an atlas refers to a specific model for a population of images with parameters that are learned from a training dataset.[33][34]

The simplest example of an atlas is a mean intensity image, commonly referred to as a template. However, an atlas can also include richer information, such as local image statistics and the probability that a particular spatial location has a certain label. New medical images, not used during training, can be mapped to an atlas tailored to the specific application, such as segmentation or group analysis. Mapping an image to an atlas usually involves registering the image and the atlas; the resulting deformation can be used to address variability in medical images.
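
A minimal sketch of the simplest atlas construction follows: assuming the training images and their label maps are already registered to a common frame, the template is the voxel-wise mean intensity and the probabilistic atlas is the voxel-wise label frequency. All data here are synthetic stand-ins:

```python
# Mean-intensity template and per-voxel label prior from registered
# training images and matching binary segmentations.
import numpy as np

rng = np.random.default_rng(2)
images = rng.random((10, 64, 64))            # 10 registered training images
labels = rng.random((10, 64, 64)) > 0.7      # matching binary segmentations

template = images.mean(axis=0)               # mean-intensity template
prior = labels.mean(axis=0)                  # P(label | voxel), in [0, 1]

print(template.shape, prior.min(), prior.max())
```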

Single template


The simplest approach is to model medical images as deformed versions of a single template image. For example, anatomical MRI brain scans are often mapped to the MNI template[35] so as to represent all brain scans in common coordinates. The main drawback of a single-template approach is that if there are significant differences between the template and a given test image, then there may not be a good way to map one onto the other. For example, an anatomical MRI brain scan of a patient with severe brain abnormalities (e.g., a tumor or prior surgical intervention) may not easily map to the MNI template.

Multiple templates


Rather than relying on a single template, multiple templates can be used. The idea is to represent an image as a deformed version of one of the templates. For example, there could be one template for a healthy population and one template for a diseased population. However, in many applications, it is not clear how many templates are needed. A simple, albeit computationally expensive, approach is to treat every image in the training dataset as a template, so that every new image is compared against every training image. A more recent approach automatically finds the number of templates needed.[36]

Statistical analysis


Statistical methods combine the medical imaging field with modern computer vision, machine learning and pattern recognition. Over the last decade, several large datasets have been made publicly available (see for example ADNI and the 1000 Functional Connectomes Project), in part due to collaboration between various institutes and research centers. This increase in data size calls for new algorithms that can mine and detect subtle changes in the images to address clinical questions. Such clinical questions are very diverse and include group analysis, imaging biomarkers, disease phenotyping and longitudinal studies.

Group analysis


In group analysis, the objective is to detect and quantify abnormalities induced by a disease by comparing the images of two or more cohorts. Usually one of these cohorts consists of normal (control) subjects, and the other consists of abnormal patients. Variation caused by the disease can manifest itself as abnormal deformation of anatomy (see voxel-based morphometry). For example, shrinkage of sub-cortical tissues such as the hippocampus may be linked to Alzheimer's disease. Additionally, changes in biochemical (functional) activity can be observed using imaging modalities such as positron emission tomography.

The comparison between groups is usually conducted at the voxel level. Hence, the most popular pre-processing pipeline, particularly in neuroimaging, transforms all of the images in a dataset to a common coordinate frame via medical image registration in order to maintain correspondence between voxels. Given this voxel-wise correspondence, the most common frequentist method is to extract a statistic for each voxel (for example, the mean voxel intensity for each group) and perform statistical hypothesis testing to evaluate whether a null hypothesis is or is not supported. The null hypothesis typically assumes that the two cohorts are drawn from the same distribution and hence should have the same statistical properties (for example, the mean values of the two groups are equal for a particular voxel). Since medical images contain large numbers of voxels, the issue of multiple comparisons needs to be addressed.[37][38] There are also Bayesian approaches to the group analysis problem.[39]
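
A minimal sketch of this voxel-wise frequentist pipeline, on synthetic volumes assumed to be already registered, runs a two-sample t-test at every voxel and applies a Bonferroni correction; real studies often prefer false discovery rate or permutation-based cluster statistics:

```python
# Voxel-wise two-sample t-tests across two cohorts, with a Bonferroni
# correction for the large number of comparisons.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
controls = rng.normal(0.0, 1.0, size=(20, 32, 32, 32))   # 20 subjects
patients = rng.normal(0.0, 1.0, size=(18, 32, 32, 32))
patients[:, 10:14, 10:14, 10:14] += 2.0                  # injected "effect"

t, p = stats.ttest_ind(controls, patients, axis=0)       # per-voxel tests
alpha = 0.05 / p.size                                    # Bonferroni
print("significant voxels:", int((p < alpha).sum()))     # survivors
```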

Classification


Although group analysis can quantify the general effects of a pathology on anatomy and function, it does not provide subject-level measures and hence cannot be used as a biomarker for diagnosis (see Imaging biomarkers). Clinicians, on the other hand, are often interested in early diagnosis of the pathology (i.e., classification[40][41]) and in learning the progression of a disease (i.e., regression[42]). From a methodological point of view, current techniques vary from applying standard machine learning algorithms to medical imaging datasets (e.g., the support vector machine[43]) to developing new approaches adapted to the needs of the field.[44] The main difficulties are as follows:

  • Small sample size (curse of dimensionality): a large medical imaging dataset contains hundreds to thousands of images, whereas the number of voxels in a typical volumetric image can easily go beyond millions. A remedy to this problem is to reduce the number of features in an informative sense (see dimensionality reduction). Several unsupervised and semi-supervised/supervised approaches[44][45][46][47] have been proposed to address this issue.
  • Interpretability: Good generalization accuracy is not always the primary objective, as clinicians would like to understand which parts of the anatomy are affected by the disease. Therefore, interpretability of the results is very important; methods that ignore the image structure are not favored. Alternative methods based on feature selection have been proposed.[45][46][47][48]
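
A minimal sketch of the standard remedy for the first difficulty above, assuming vectorized, registered images, reduces the voxel features with PCA and then trains a linear support vector machine; the data and dimensions are synthetic:

```python
# Dimensionality reduction (PCA) followed by a linear SVM classifier,
# evaluated with cross-validation on synthetic "voxel" features.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
X = rng.normal(size=(60, 5000))        # 60 subjects, 5000 voxel features
y = np.repeat([0, 1], 30)              # diagnosis labels
X[y == 1, :50] += 0.5                  # weak disease signal in 50 voxels

clf = make_pipeline(PCA(n_components=20), SVC(kernel="linear"))
print(cross_val_score(clf, X, y, cv=5).mean())   # above-chance accuracy
```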

Clustering


Image-based pattern classification methods typically assume that the neurological effects of a disease are distinct and well defined. This may not always be the case. For a number of medical conditions, the patient populations are highly heterogeneous, and further categorization into sub-conditions has not been established. Additionally, some diseases (e.g., autism spectrum disorder, schizophrenia, mild cognitive impairment) can be characterized by a continuous or nearly continuous spectrum from mild to very pronounced pathological changes. To facilitate image-based analysis of heterogeneous disorders, methodological alternatives to pattern classification have been developed. These techniques borrow ideas from high-dimensional clustering[49] and high-dimensional pattern regression to cluster a given population into homogeneous sub-populations. The goal is to provide a better quantitative understanding of the disease within each sub-population.
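
A minimal sketch of such clustering, on synthetic image-derived features (e.g., regional volumes), follows; note that choosing the number of clusters is itself an open question in this setting:

```python
# Cluster a heterogeneous patient group into candidate sub-populations
# from image-derived feature vectors.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
features = np.vstack([rng.normal(0, 1, (40, 10)),     # sub-type A
                      rng.normal(2, 1, (40, 10))])    # sub-type B

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)
print(np.bincount(km.labels_))   # sizes of the recovered sub-populations
```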

Shape analysis


Shape analysis is the field of medical image computing that studies geometrical properties of structures obtained from different imaging modalities. Shape analysis has recently become of increasing interest to the medical community due to its potential to precisely locate morphological changes between different populations of structures, e.g., healthy vs. pathological, female vs. male, young vs. elderly. Shape analysis includes two main steps: shape correspondence and statistical analysis.

  • Shape correspondence is the methodology that computes corresponding locations between geometric shapes represented by triangle meshes, contours, point sets or volumetric images. The definition of correspondence directly influences the analysis. Options for correspondence frameworks include anatomical correspondence, manual landmarks, functional correspondence (i.e., in brain morphometry, loci responsible for the same neuronal functionality), geometric correspondence, and (for image volumes) intensity similarity. Some approaches, e.g. spectral shape analysis, do not require correspondence but compare shape descriptors directly.
  • Statistical analysis will provide measurements of structural change at correspondent locations.

Longitudinal studies


In longitudinal studies the same person is imaged repeatedly. This information can be incorporated both into the image analysis, as well as into the statistical modeling.

  • In longitudinal image processing, segmentation and analysis methods for individual time points are informed and regularized with common information, usually from a within-subject template. This regularization is designed to reduce measurement noise and thus helps increase sensitivity and statistical power. At the same time, over-regularization needs to be avoided so that effect sizes remain stable. Intense regularization, for example, can lead to excellent test-retest reliability but limits the ability to detect true changes and differences across groups. Often a trade-off must be sought that optimizes noise reduction with only limited loss of effect size. Another common challenge in longitudinal image processing is the, often unintentional, introduction of processing bias. When, for example, follow-up images are registered and resampled to the baseline image, interpolation artifacts are introduced into only the follow-up images and not the baseline. These artifacts can cause spurious effects (usually a bias towards overestimating longitudinal change and thus underestimating required sample size). It is therefore essential that all time points are treated exactly the same to avoid processing bias.
  • Post-processing and statistical analysis of longitudinal data usually requires dedicated statistical tools such as repeated-measures ANOVA or the more powerful linear mixed-effects models. Additionally, it is advantageous to consider the spatial distribution of the signal. For example, cortical thickness measurements will show a correlation within subject across time and also within a neighborhood on the cortical surface, a fact that can be used to increase statistical power. Furthermore, time-to-event (survival) analysis is frequently employed to analyze longitudinal data and determine significant predictors.
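
A minimal sketch of a linear mixed-effects model for such data, with a fixed effect of time on a thickness measure and a random intercept per subject, might look as follows (column names and effect sizes are hypothetical):

```python
# Linear mixed-effects model for longitudinal data: fixed effect of
# time, random intercept per subject, fitted with statsmodels.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n_subj, n_visits = 30, 4
subj = np.repeat(np.arange(n_subj), n_visits)
time = np.tile(np.arange(n_visits), n_subj)
thickness = (2.5 - 0.02 * time                      # true thinning
             + rng.normal(0, 0.05, n_subj)[subj]    # per-subject offset
             + rng.normal(0, 0.02, n_subj * n_visits))

df = pd.DataFrame({"subject": subj, "time": time, "thickness": thickness})
fit = smf.mixedlm("thickness ~ time", df, groups=df["subject"]).fit()
print(fit.params["time"])   # estimated thinning per visit, near -0.02
```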

Image-based physiological modelling


Traditionally, medical image computing has sought to address the quantification and fusion of structural or functional information available at the point and time of image acquisition. In this regard, it can be seen as quantitative sensing of the underlying anatomical, physical or physiological processes. However, over the last few years, there has been a growing interest in the predictive assessment of disease or therapy course. Image-based modelling, be it of a biomechanical or physiological nature, can therefore extend the possibilities of image computing from a descriptive to a predictive angle.

According to the STEP research roadmap,[50][51] the Virtual Physiological Human (VPH) is a methodological and technological framework that, once established, will enable the investigation of the human body as a single complex system. Underlying the VPH concept, the International Union for Physiological Sciences (IUPS) has been sponsoring the IUPS Physiome Project for more than a decade.[52][53] This is a worldwide public domain effort to provide a computational framework for understanding human physiology. It aims at developing integrative models at all levels of biological organization, from genes to whole organisms via gene regulatory networks, protein pathways, integrative cell functions, and tissue and whole-organ structure/function relations. Such an approach aims at transforming current practice in medicine and underpins a new era of computational medicine.[54]

In this context, medical imaging and image computing play an increasingly important role as they provide systems and methods to image, quantify and fuse both structural and functional information about the human being in vivo. These two broad research areas include the transformation of generic computational models to represent specific subjects, thus paving the way for personalized computational models.[55] Individualization of generic computational models through imaging can be realized in three complementary directions:

  • definition of the subject-specific computational domain (anatomy) and related subdomains (tissue types);
  • definition of boundary and initial conditions from (dynamic and/or functional) imaging; and
  • characterization of structural and functional tissue properties.

In addition, imaging also plays a pivotal role in the evaluation and validation of such models both in humans and in animal models, and in the translation of models to the clinical setting with both diagnostic and therapeutic applications. In this specific context, molecular, biological, and pre-clinical imaging render additional data and understanding of basic structure and function in molecules, cells, tissues and animal models that may be transferred to human physiology where appropriate.

The applications of image-based VPH/physiome models in basic and clinical domains are vast. Broadly speaking, they promise to become new virtual imaging techniques. In effect, more parameters, often non-observable ones, will be imaged in silico based on the integration of observable but sometimes sparse and inconsistent multimodal images and physiological measurements. Computational models will serve to interpret the measurements in a way compliant with the underlying biophysical, biochemical or biological laws of the physiological or pathophysiological processes under investigation. Ultimately, such investigative tools and systems will help our understanding of disease processes, the natural history of disease evolution, and the influence of pharmacological and/or interventional therapeutic procedures on the course of a disease.

Cross-fertilization between imaging and modelling goes beyond interpretation of measurements in a way consistent with physiology. Image-based patient-specific modelling, combined with models of medical devices and pharmacological therapies, opens the way to predictive imaging whereby one will be able to understand, plan and optimize such interventions in silico.

Mathematical methods in medical imaging


A number of sophisticated mathematical methods have entered medical imaging and have already been implemented in various software packages. These include approaches based on partial differential equations (PDEs) and curvature-driven flows for enhancement, segmentation, and registration. Because they employ PDEs, the methods are amenable to parallelization and implementation on GPGPUs. A number of these techniques have been inspired by ideas in optimal control; accordingly, ideas from control theory have recently made their way into interactive methods, especially segmentation. Moreover, because of noise and the need for statistical estimation techniques for more dynamically changing imagery, the Kalman filter[56] and particle filter have come into use. A survey of these methods with an extensive list of references may be found in [57].
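
As a minimal illustration of the PDE-based enhancement mentioned above, the sketch below runs a few explicit steps of the linear heat equation, the simplest diffusion flow; curvature-driven and edge-aware flows replace the Laplacian with more selective terms:

```python
# Explicit Euler steps of the heat equation du/dt = laplacian(u),
# which progressively smooths noise. This linear flow is the simplest
# special case of the PDE-based methods discussed above.
import numpy as np

def diffuse(u, steps=20, dt=0.2):
    u = u.astype(float).copy()
    for _ in range(steps):
        lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
               np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4 * u)
        u += dt * lap          # dt <= 0.25 for stability in 2D
    return u

noisy = np.random.default_rng(7).normal(0, 1, (64, 64))
print(noisy.std(), diffuse(noisy).std())   # variation shrinks with smoothing
```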

Modality-specific computing


Some imaging modalities provide very specialized information. The resulting images cannot be treated as regular scalar images and give rise to new sub-areas of medical image computing. Examples include diffusion MRI and functional MRI.

Diffusion MRI

A mid-axial slice of the ICBM diffusion tensor image template. Each voxel's value is a tensor represented here by an ellipsoid. Color denotes principal orientation: red = left-right, blue = inferior-superior, green = posterior-anterior.

Diffusion MRI is a structural magnetic resonance imaging modality that allows measurement of the diffusion process of molecules. Diffusion is measured by applying a gradient pulse to a magnetic field along a particular direction. In a typical acquisition, a set of uniformly distributed gradient directions is used to create a set of diffusion weighted volumes. In addition, an unweighted volume is acquired under the same magnetic field without application of a gradient pulse. As each acquisition is associated with multiple volumes, diffusion MRI has created a variety of unique challenges in medical image computing.

In medicine, there are two major computational goals in diffusion MRI:

  • Estimation of local tissue properties, such as diffusivity;
  • Estimation of local directions and global pathways of diffusion.

The diffusion tensor,[58] a 3 × 3 symmetric positive-definite matrix, offers a straightforward solution to both of these goals. It is proportional to the covariance matrix of a normally distributed local diffusion profile and, thus, the dominant eigenvector of this matrix is the principal direction of local diffusion. Due to the simplicity of this model, a maximum likelihood estimate of the diffusion tensor can be found by simply solving a system of linear equations at each location independently. However, as the volume is assumed to contain contiguous tissue fibers, it may be preferable to estimate the volume of diffusion tensors in its entirety by imposing regularity conditions on the underlying field of tensors.[59] Scalar values can be extracted from the diffusion tensor, such as the fractional anisotropy and the mean, axial and radial diffusivities, which indirectly measure tissue properties such as the dysmyelination of axonal fibers[60] or the presence of edema.[61] Standard scalar image computing methods, such as registration and segmentation, can be applied directly to volumes of such scalar values. However, to fully exploit the information in the diffusion tensor, these methods have been adapted to account for tensor-valued volumes when performing registration[62][63] and segmentation.[64][65]
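
A minimal sketch of the linear estimation described above follows: with b-value b, gradient directions g, and signals S = S0 exp(-b g^T D g), taking logarithms reduces the tensor fit to one least-squares solve per voxel, after which fractional anisotropy comes from the eigenvalues. The voxel below is synthetic and noise-free:

```python
# Log-linear least-squares fit of the diffusion tensor and fractional
# anisotropy (FA) from its eigenvalues.
import numpy as np

def fit_tensor(signals, s0, bval, bvecs):
    y = -np.log(signals / s0) / bval                  # equals g^T D g
    g = np.asarray(bvecs)
    B = np.column_stack([g[:, 0]**2, g[:, 1]**2, g[:, 2]**2,
                         2*g[:, 0]*g[:, 1], 2*g[:, 0]*g[:, 2],
                         2*g[:, 1]*g[:, 2]])
    dxx, dyy, dzz, dxy, dxz, dyz = np.linalg.lstsq(B, y, rcond=None)[0]
    return np.array([[dxx, dxy, dxz], [dxy, dyy, dyz], [dxz, dyz, dzz]])

def fractional_anisotropy(D):
    lam = np.linalg.eigvalsh(D)
    return np.sqrt(1.5 * np.sum((lam - lam.mean())**2) / np.sum(lam**2))

# Synthetic voxel: strong diffusion along x (a coherent fiber bundle).
D_true = np.diag([1.7e-3, 0.3e-3, 0.3e-3])
rng = np.random.default_rng(8)
g = rng.normal(size=(30, 3)); g /= np.linalg.norm(g, axis=1, keepdims=True)
S = 1000 * np.exp(-1000 * np.einsum("ij,jk,ik->i", g, D_true, g))

D_hat = fit_tensor(S, 1000.0, 1000.0, g)
print(fractional_anisotropy(D_hat))   # ≈ 0.8 for this prolate tensor
```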

Given the principal direction of diffusion at each location in the volume, it is possible to estimate the global pathways of diffusion through a process known as tractography.[66] However, due to the relatively low resolution of diffusion MRI, many of these pathways may cross, kiss or fan at a single location. In this situation, the single principal direction of the diffusion tensor is not an appropriate model for the local diffusion distribution. The most common solution to this problem is to estimate multiple directions of local diffusion using more complex models. These include mixtures of diffusion tensors,[67] Q-ball imaging,[68] diffusion spectrum imaging [69] and fiber orientation distribution functions,[70][71] which typically require HARDI acquisition with a large number of gradient directions. As with the diffusion tensor, volumes valued with these complex models require special treatment when applying image computing methods, such as registration[72][73][74] and segmentation.[75]

Functional MRI


Functional magnetic resonance imaging (fMRI) is a medical imaging modality that indirectly measures neural activity by observing the local hemodynamics, or blood oxygen level dependent signal (BOLD). fMRI data offers a range of insights, and can be roughly divided into two categories:

  • Task related fMRI is acquired as the subject is performing a sequence of timed experimental conditions. In block-design experiments, the conditions are present for short periods of time (e.g., 10 seconds) and are alternated with periods of rest. Event-related experiments rely on a random sequence of stimuli and use a single time point to denote each condition. The standard approach to analyze task related fMRI is the general linear model (GLM).[76]
  • Resting state fMRI is acquired in the absence of any experimental task. Typically, the objective is to study the intrinsic network structure of the brain. Observations made during rest have also been linked to specific cognitive processes such as encoding or reflection. Most studies of resting state fMRI focus on low frequency fluctuations of the fMRI signal (LF-BOLD). Seminal discoveries include the default network,[77] a comprehensive cortical parcellation,[78] and the linking of network characteristics to behavioral parameters.

There is a rich set of methodology used to analyze functional neuroimaging data, and there is often no consensus regarding the best method. Instead, researchers approach each problem independently and select a suitable model/algorithm. In this context there is a relatively active exchange among the neuroscience, computational biology, statistics, and machine learning communities. Prominent approaches include:

  • Massive univariate approaches that probe individual voxels in the imaging data for a relationship to the experiment condition. The prime approach is the general linear model.[76]
  • Multivariate and classifier-based approaches, often referred to as multi-voxel pattern analysis or multi-variate pattern analysis, probe the data for global and potentially distributed responses to an experimental condition. Early approaches used support vector machines to study responses to visual stimuli.[79] More recently, alternative pattern recognition algorithms have been explored, such as random-forest-based Gini contrast[80] or sparse regression and dictionary learning.[81]
  • Functional connectivity analysis studies the intrinsic network structure of the brain, including the interactions between regions. The majority of such studies focus on resting state data to parcellate the brain[78] or to find correlates to behavioral measures.[82] Task-specific data can be used to study causal relationships among brain regions (e.g., dynamic causal modelling[83]).
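
A minimal sketch of the GLM for a block-design task, as referenced above, follows: the task boxcar is convolved with a simple two-gamma hemodynamic response and each voxel's activation weight is estimated by least squares. The timing parameters and HRF shape are illustrative, not a canonical specification:

```python
# Block-design GLM: convolve the task boxcar with a hemodynamic
# response, then solve least squares for the activation weight.
import numpy as np
from scipy.stats import gamma

tr, n_scans = 2.0, 120
t = np.arange(n_scans) * tr
box = ((t // 20) % 2 == 1).astype(float)          # 20 s rest / 20 s task
hrf_t = np.arange(0, 30, tr)
hrf = gamma.pdf(hrf_t, 6) - 0.35 * gamma.pdf(hrf_t, 12)  # two-gamma HRF
hrf /= hrf.sum()                                  # unit-sum, plateau ~1
regressor = np.convolve(box, hrf)[:n_scans]

X = np.column_stack([regressor, np.ones(n_scans)])       # design matrix
rng = np.random.default_rng(9)
voxel = 0.8 * regressor + rng.normal(0, 0.5, n_scans)    # "active" voxel

beta, *_ = np.linalg.lstsq(X, voxel, rcond=None)
print(beta[0])   # estimated task effect, near 0.8
```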

When working with large cohorts of subjects, the normalization (registration) of individual subjects into a common reference frame is crucial. A body of work and tools exist to perform normalization based on anatomy (FSL, FreeSurfer, SPM). Alignment taking spatial variability across subjects into account is a more recent line of work. Examples are the alignment of the cortex based on fMRI signal correlation,[84] the alignment based on the global functional connectivity structure both in task-, or resting state data,[85] and the alignment based on stimulus specific activation profiles of individual voxels.[86]

Software


Software for medical image computing is a complex combination of systems providing IO, visualization and interaction, user interface, data management and computation. Typically, system architectures are layered to serve algorithm developers, application developers, and users. The bottom layers are often libraries and/or toolkits that provide base computational capabilities, while the top layers are specialized applications that address specific medical problems, diseases, or body systems.

from Grokipedia
Medical image computing is an interdisciplinary field that develops and applies computational methods to acquire, process, analyze, and visualize medical images, enabling robust, automated, and quantitative extraction of clinically relevant information to support diagnosis, therapy planning, follow-up, and biomedical research. This domain integrates principles from computer science, biomedical engineering, and medicine, operating primarily on multidimensional data such as 2D images or 3D volumes from modalities including computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET), and ultrasound.

At its core, medical image computing involves several fundamental tasks that transform raw data into actionable insights. These include image enhancement to improve quality by reducing noise or artifacts, segmentation to delineate anatomical structures or pathologies, registration to align images from different modalities or time points, and feature extraction for quantitative measurements like volume or texture. Advanced techniques, such as model-based approaches incorporating prior anatomical knowledge or deep learning algorithms like convolutional neural networks (CNNs), address the inherent challenges of data variability, including differences in imaging physics, patient anatomy, and pathological variations. Advancements as of 2025 emphasize deep learning for tasks like automated segmentation and the synthesis of synthetic images via generative adversarial networks (GANs) and broader generative AI models, along with AI integration in multi-modal imaging, enhancing efficiency and accuracy in handling large-scale datasets.

The applications of medical image computing span diagnostics, interventional procedures, and biomedical research, profoundly impacting healthcare outcomes. In diagnostics, it facilitates early detection of diseases such as tumors or lesions through multi-modal fusion, combining structural (e.g., MRI) and functional (e.g., PET) data for comprehensive assessment. For treatment planning, techniques like image-guided navigation and 3D visualizations enable precise, minimally invasive interventions. In research, it supports longitudinal studies and population-level analyses, though challenges such as limited reproducibility, scarce annotated data, and variability in experimental setups remain critical hurdles for clinical translation. Ongoing trends highlight the integration of scalable computing infrastructure to manage escalating data volumes, from kilobytes in traditional radiographs to terabytes in whole-body scans, promising more personalized and efficient medical practices.

Fundamentals

Definition and Scope

Medical image computing refers to the application of computational algorithms and models to acquire, process, analyze, and interpret digital medical images derived from modalities such as magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound. This field leverages techniques from computer vision and machine learning to extract meaningful information from visual data, enabling automated or semi-automated assistance in medical decision-making.

The scope of medical image computing is broad, encompassing stages from initial image acquisition and enhancement to advanced tasks like segmentation, registration, quantitative feature extraction, and seamless integration into clinical workflows. It is inherently interdisciplinary, drawing on expertise from computer science for algorithm development, biomedical engineering for hardware-software interfaces, and medicine for domain-specific validation and application. This collaborative nature ensures that computational methods align with clinical needs, such as improving image quality or fusing multi-modal data for comprehensive assessment.

The importance of medical image computing lies in its transformative role across healthcare, facilitating precise diagnostics, treatment planning, real-time surgical guidance, and biomedical research. For instance, it supports tumor detection by delineating malignant structures in scans, reducing diagnostic errors and enabling earlier interventions, while also advancing personalized medicine through patient-specific image-derived models for tailored therapies. In surgical contexts, it processes intra-operative data to provide navigational overlays, enhancing procedural accuracy and outcomes. Techniques like segmentation and registration underpin these applications by aligning and partitioning image elements for targeted analysis.

At its foundation, medical image computing relies on key concepts of digital image representation, where two-dimensional images are composed of pixels, discrete units encoding intensity values at spatial coordinates, and three-dimensional volumes use voxels to extend this representation volumetrically. Spatial resolution, defined by the size and density of these units, critically influences the ability to discern fine anatomical details, directly impacting diagnostic reliability and the efficacy of downstream computations.

Historical Development

The field of medical image computing emerged in the 1970s alongside the advent of computed tomography (CT), which marked the transition from analog to digital imaging in medicine. The first clinical CT scanner was developed by Godfrey Hounsfield and installed at Atkinson Morley Hospital in London in 1971, enabling the reconstruction of cross-sectional images through computer processing of X-ray projections. This innovation introduced digital image processing to clinical practice, with early applications focusing on basic enhancement and reconstruction algorithms to handle the computational demands of tomographic data. By the mid-1970s, techniques such as texture analysis for quantitative feature extraction in CT images were proposed, exemplified by Robert M. Haralick's 1973 work on textural features for image classification.

The 1980s saw further foundational progress with the clinical adoption of magnetic resonance imaging (MRI) and the development of initial algorithms for image analysis. The first whole-body MRI scan was achieved in 1977 by Raymond Damadian's team, expanding the scope of diagnostic imaging to soft tissues without ionizing radiation. Concurrently, early segmentation methods emerged, such as the 1986 algorithm by Wells et al. for nuclear magnetic resonance (NMR) images, which laid groundwork for delineating anatomical structures. Pioneering contributions from figures like Dennis Gabor, whose 1940s work on Gabor filters for signal analysis influenced subsequent texture analysis and filtering techniques in medical images, provided essential mathematical tools for these advancements.

In the 1990s, medical image computing matured with the proliferation of registration techniques and probabilistic atlases, driven by the need to align multi-modal data from CT, MRI, and emerging modalities like positron emission tomography (PET). Registration methods gained prominence in the early 1990s amid the era's neuroimaging challenges, enabling spatial correspondence across images for applications like surgical planning. The first International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) was held in 1998, fostering collaboration and standardizing research in the field. The 2000s integrated statistical shape models (SSMs), with Timothy Cootes and Christopher Taylor's active appearance models (AAMs) from the mid-1990s evolving into 3D variants for robust organ segmentation, capturing population-based variability in anatomical shapes. Software frameworks like the Insight Toolkit (ITK), initiated in 1999 by the U.S. National Library of Medicine, provided open-source tools for segmentation and registration, accelerating adoption.

The 2010s witnessed an explosion in deep learning applications, propelled by the 2012 AlexNet architecture, which demonstrated convolutional neural networks' (CNNs) efficacy in image recognition and inspired adaptations for medical tasks. This shift was amplified by hardware advances like graphics processing units (GPUs), enabling training on large datasets, and initiatives such as the UK Biobank, which began imaging 100,000 participants in 2014 to support population-scale analysis. Seminal works like the 2015 U-Net architecture for biomedical image segmentation further entrenched deep learning, achieving high accuracy in delineating complex structures while addressing data scarcity through efficient architectures. These developments, building on decades of computational foundations, continue to drive precision in diagnostics and interventions.

Data Acquisition and Representation

Imaging Modalities

Medical image computing relies on data acquired from various imaging modalities, each employing distinct physical principles to generate representations of anatomical and functional information within the human body. These modalities produce datasets ranging from two-dimensional (2D) projections to three-dimensional (3D) or four-dimensional (4D, incorporating time) volumes, which serve as the foundation for subsequent computational analysis. Key considerations include the use of ionizing versus non-ionizing radiation, as well as inherent data characteristics such as spatial and temporal resolutions, noise profiles, and common artifacts that influence computing workflows.

X-ray imaging is one of the earliest and most fundamental modalities, utilizing high-energy electromagnetic waves generated by accelerating electrons onto a target in an X-ray tube, producing a continuous spectrum via bremsstrahlung and discrete peaks from characteristic radiation. These photons interact with tissues primarily through photoelectric absorption and Compton scattering, where denser structures like bone attenuate more X-rays, appearing brighter on the resulting 2D projection images captured on a detector. This modality offers high spatial resolution for bony structures (typically 0.1–0.5 mm) but limited soft-tissue contrast due to overlapping projections of 3D anatomy. Data characteristics include grayscale images with Poisson-distributed noise from photon counting statistics, and artifacts such as geometric distortion from patient positioning. X-ray imaging uses ionizing radiation, raising concerns about cumulative exposure in repeated scans.

Computed tomography (CT) extends X-ray principles by acquiring multiple projections from rotating X-ray sources around the patient, enabling reconstruction of cross-sectional slices. The physical basis involves measuring X-ray attenuation along lines through the body, formalized by the Radon transform, which integrates the attenuation coefficient along projection paths to form a sinogram dataset subsequently inverted to yield volumetric images. CT provides near-isotropic spatial resolution of 0.5–1 mm and excels in both bone and soft-tissue visualization, though it employs ionizing radiation with doses varying by protocol (e.g., 2–10 mSv for a chest scan). The resulting data are 3D volumes in Hounsfield units, characterized by Poisson noise dominant at low doses, manifesting as granular streaks that degrade low-contrast detection. Common artifacts include beam hardening from polychromatic X-rays and partial volume effects in thin structures.

Magnetic resonance imaging (MRI) operates on non-ionizing principles, exploiting the nuclear spin properties of protons in water and fat molecules. In a strong static magnetic field (typically 1.5–3 T), protons align and precess at the Larmor frequency; a radiofrequency (RF) pulse perturbs this alignment, and upon relaxation, protons emit detectable signals as they return to equilibrium via T1 (spin-lattice) and T2 (spin-spin) processes, with T1 times longer in fluids (e.g., 2000–3000 ms) than in fat (200–500 ms). Gradient fields spatially encode these signals for reconstruction into images. MRI delivers superior soft-tissue contrast and spatial resolution (0.5–2 mm) without radiation, supporting multiplanar and functional (e.g., fMRI) imaging in 3D or 4D formats. Magnitude data exhibit Rician noise, and motion artifacts like ghosting from patient or physiological movement (e.g., respiration) cause blurring or replicas across the phase-encoding direction.

Positron emission tomography (PET) focuses on functional and metabolic imaging using positron-emitting radiotracers (e.g., 18F-FDG) injected into the patient. A tracer nucleus decays by emitting a positron, which annihilates with an electron ~1–2 mm away, producing two oppositely directed 511 keV gamma rays detected in coincidence by a ring of scintillators, defining the lines of response used for tomographic reconstruction. This yields quantitative 3D maps of tracer uptake, with spatial resolution of 4–6 mm limited by positron range and photon non-collinearity. PET data are low-resolution volumes with high noise from random and scatter events, often requiring correction; the modality supports 4D dynamic studies of processes like blood flow. Artifacts include attenuation mismatches in obese patients.

Ultrasound imaging employs non-ionizing, high-frequency sound waves (1–20 MHz) generated by piezoelectric transducers, which propagate through tissues at ~1540 m/s and reflect at interfaces due to acoustic impedance mismatches (Z = density × speed of sound). Strong reflectors like bone appear echogenic (bright), while fluids are anechoic (dark); echoes are amplified with time-gain compensation to form real-time 2D or 3D images. It offers excellent temporal resolution (>30 frames/s) for dynamic visualization, but spatial resolution varies (0.1–1 mm axially, poorer laterally), with limited penetration (10–30 cm) and poor imaging through air- or bone-filled regions. Data characteristics include speckle noise from coherent interference and artifacts like shadowing behind dense structures or reverberation from repetitive echoes. Operator dependence affects reproducibility.

Hybrid modalities integrate complementary principles for enhanced diagnosis, such as PET-MRI, which simultaneously acquires metabolic PET data with high-contrast anatomical MRI in a single session, reducing motion misalignment and radiation dose compared to PET-CT. This produces aligned 4D multimodal volumes well suited to joint analysis, with PET interpretation aided by MRI's soft-tissue detail. These modalities' outputs often require initial preprocessing, such as filtering Poisson noise in CT, to prepare data for computing tasks.

Data Formats and Preprocessing

Medical image data requires standardized formats to facilitate interoperability, storage, and retrieval across diverse systems and applications. The Digital Imaging and Communications in Medicine (DICOM) standard serves as the primary format for most clinical modalities, defining protocols for encoding image data, metadata (including patient demographics, acquisition parameters, and study details), and network communications to enable seamless exchange between devices and institutions. In neuroimaging, the NIfTI format has become a de facto standard, extending the earlier ANALYZE format by incorporating explicit affine transformations for orientation and supporting multidimensional arrays up to 7D, which simplifies handling of functional and structural data. For large, heterogeneous datasets, such as those arising in multi-omics research, HDF5 provides a flexible, hierarchical structure that accommodates complex objects like arrays, groups, and attributes, optimizing storage and access for computational pipelines in medical research.

Preprocessing transforms raw images to mitigate acquisition artifacts and variations, ensuring suitability for downstream analysis. Intensity normalization adjusts pixel values to a common scale, with histogram equalization being a foundational method that spreads out the intensity distribution to enhance contrast, particularly in low-contrast regions. Noise reduction employs filters such as Gaussian smoothing, which convolves the image with a Gaussian kernel to attenuate random fluctuations while maintaining edge integrity, and is commonly applied to reduce thermal or electronic noise in CT and MRI scans. Bias field correction addresses slowly varying intensity inhomogeneities in MRI caused by coil sensitivities; the N4ITK algorithm refines the earlier N3 method by using a deformable B-spline model to estimate and remove the multiplicative bias, achieving superior uniformity in brain tissue segmentation tasks.

Handling medical image data involves inherent challenges that impact computational accuracy. Anisotropic voxels, resulting from slice-selective acquisition in modalities like MRI, introduce directional resolution disparities (e.g., higher in-plane than through-plane resolution), leading to elongated structures in 3D models and errors in quantitative metrics such as those from diffusion tensor imaging. Multi-scale resolutions arise from protocol variations across scanners or sessions, complicating alignment and feature extraction by requiring interpolation that may amplify noise or introduce artifacts during resampling. Metadata extraction poses difficulties due to format-specific inconsistencies, such as optional DICOM tags or proprietary extensions, which hinder automated retrieval of critical details like voxel spacing or contrast-agent use without risking data loss or privacy breaches.

Quality assurance pipelines systematically detect and correct artifacts to uphold data integrity before analysis. These workflows often integrate automated tools for artifact identification, such as motion-induced distortions or susceptibility artifacts in MRI; for instance, models like 3D-QCNet employ 3D DenseNet architectures to classify volumes and localize anomalies in diffusion MRI, achieving high sensitivity (over 90%) and enabling scalable rejection or correction of affected regions.
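
Two of the preprocessing steps above can be sketched with basic array operations: histogram equalization via the cumulative histogram, and Gaussian smoothing. Bias-field correction such as N4 is available in toolkits like SimpleITK and is not re-implemented here:

```python
# Global histogram equalization (via the cumulative histogram) and
# Gaussian smoothing on a synthetic low-contrast slice.
import numpy as np
from scipy import ndimage

def equalize(img, levels=256):
    # Map intensities through the normalized cumulative histogram.
    hist, bin_edges = np.histogram(img.ravel(), bins=levels)
    cdf = hist.cumsum() / img.size
    return np.interp(img.ravel(), bin_edges[:-1], cdf).reshape(img.shape)

rng = np.random.default_rng(10)
slice_ = rng.normal(100, 10, (128, 128))       # low-contrast stand-in
contrasty = equalize(slice_)                   # values spread over [0, 1]
denoised = ndimage.gaussian_filter(contrasty, sigma=1.0)
print(contrasty.min(), contrasty.max())
```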

Mathematical Foundations

Image Formation and Reconstruction

In medical image computing, image formation refers to the mathematical modeling of how raw sensor data is generated from the underlying tissue properties, while reconstruction involves inverting these models to recover the image. For computed tomography (CT), image formation is based on the projection geometry, where X-rays pass through the body and are attenuated; the measurements are modeled by the Radon transform, which integrates the object's attenuation along lines of projection. In parallel-beam geometry, projections are acquired from multiple angles assuming non-diverging rays, forming the basis for analytical reconstruction. Fan-beam geometry, commonly used in modern CT scanners, extends this by accounting for the diverging fan emitted from a point source, which requires rebinning to parallel projections or direct fan-beam formulas to handle the geometry.

In magnetic resonance imaging (MRI), image formation occurs in k-space, the Fourier domain, where the spatial frequency components of the image are encoded through gradient fields modulating the radiofrequency signals from hydrogen protons. The raw MRI data represents samples of the continuous Fourier transform of the spin density distribution, and the image is obtained by applying the inverse Fourier transform. This Fourier basis allows for flexible sampling trajectories, such as Cartesian or radial paths in k-space.

Reconstruction algorithms invert these forward models to estimate the image from measured projections or k-space data. In CT, filtered back-projection (FBP) is a widely adopted analytical method that applies a ramp filter to the projections before back-projecting them onto the image grid. The core formula for parallel-beam FBP is

f(x,y) = \int_0^\pi \int_{-\infty}^\infty p(\theta, s) \, h(x \cos \theta + y \sin \theta - s) \, ds \, d\theta,

where f(x,y) is the reconstructed image density, p(\theta, s) is the projection data at angle \theta and detector position s, and h denotes the ramp filter kernel, which compensates for the blurring inherent in simple back-projection. This approach, originally formulated using convolution instead of Fourier transforms for computational efficiency, enables rapid reconstruction but can amplify noise without apodization. For positron emission tomography (PET), where projections represent line integrals of radionuclide emissions modeled as Poisson processes, iterative methods like expectation-maximization (EM) are preferred to incorporate statistical noise models and system matrices. The EM algorithm iteratively updates the image estimate by maximizing the likelihood, alternating between expectation (computing expected counts given the current estimate) and maximization (adjusting the estimate to fit observed data), improving convergence over direct methods in low-count scenarios.

Compressed sensing has revolutionized reconstruction in MRI by exploiting image sparsity in transform domains to enable accurate recovery from sampling below traditional limits, reducing scan times. The core optimization minimizes the l1-norm of the sparse coefficients subject to data consistency:

\min \| \Psi x \|_1 \quad \text{s.t.} \quad A x = b,

where x is the image, \Psi is a sparsifying transform (e.g., wavelets), A is the undersampled Fourier encoding matrix, and b is the k-space measurements. This nonlinear recovery, solved via convex optimization, allows acceleration factors of 3–5 in clinical protocols while suppressing artifacts.

Resolution in reconstructed images is fundamentally limited by sampling theory, particularly the Nyquist-Shannon theorem, which requires sampling at least twice the highest spatial frequency to avoid aliasing. In practice, this dictates the minimum number of projection angles in CT or the k-space sampling density in MRI; undersampling below this rate introduces wrap-around artifacts, while oversampling enhances resolution at the cost of acquisition time. Preprocessing steps, such as denoising, may follow reconstruction to refine the image representation.
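
The projection-and-FBP pipeline can be sketched with scikit-image, whose radon and iradon functions implement the forward model and the ramp-filtered inverse (the filter_name keyword applies to recent scikit-image versions):

```python
# Simulate a sinogram with the Radon transform, then reconstruct the
# image with filtered back-projection (FBP).
import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon, rescale

phantom = rescale(shepp_logan_phantom(), 0.5)          # 200x200 test object
angles = np.linspace(0.0, 180.0, 180, endpoint=False)  # projection angles

sinogram = radon(phantom, theta=angles)                # forward model
recon = iradon(sinogram, theta=angles, filter_name="ramp")  # FBP

print(np.abs(recon - phantom).mean())   # small error with 180 views
```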

Signal Processing and Filtering

Signal processing and filtering play a crucial role in medical image computing by enhancing image quality, reducing noise, and extracting meaningful features from acquired data such as MRI, CT, and ultrasound images. These techniques operate primarily on pixel intensities or frequency components of images to mitigate artifacts introduced during acquisition, including noise, speckle, or blur, thereby improving diagnostic accuracy and enabling downstream analyses like segmentation.

In the spatial domain, basic filtering methods such as mean and median filters are widely used for denoising medical images. The mean filter, also known as the average filter, smooths an image by replacing each pixel value with the average of its neighbors within a defined window, effectively reducing random noise but potentially blurring edges in CT or MRI scans. The median filter, on the other hand, replaces each pixel with the median value of its neighborhood, making it particularly effective for removing impulse noise like salt-and-pepper artifacts while preserving edges better than the mean filter.

Frequency domain filtering leverages the Fourier transform to analyze and modify the spectral content of medical images, allowing for targeted noise suppression or enhancement. The two-dimensional Fourier transform of an image f(x,y) is given by

F(u,v) = \iint f(x,y) e^{-j2\pi(ux+vy)} \, dx \, dy,

which decomposes the image into its frequency components; low-pass filters attenuate high frequencies to smooth images and reduce noise in modalities like MRI, while high-pass filters emphasize high frequencies to sharpen edges and highlight fine structures. Advanced methods include wavelet transforms for multi-resolution analysis, which decompose medical images into subbands capturing details at varying scales, facilitating denoising and feature extraction in applications such as CT segmentation of regions of interest.

For edge detection, the Canny algorithm is a seminal approach applied to medical images, involving Gaussian smoothing followed by computation of the gradient magnitude |\nabla I| = \sqrt{G_x^2 + G_y^2}, non-maximum suppression, and hysteresis thresholding to yield thin, connected edge maps.
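
Minimal sketches of the filters discussed above follow: mean and median filtering in the spatial domain, an ideal low-pass filter applied in the Fourier domain, and the Sobel-based gradient magnitude at the core of Canny edge detection:

```python
# Spatial-domain mean/median filters, an ideal Fourier-domain low-pass
# filter, and the gradient magnitude used in Canny edge detection.
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(11)
img = rng.normal(100, 15, (128, 128))

mean_f = ndimage.uniform_filter(img, size=3)       # mean filter
median_f = ndimage.median_filter(img, size=3)      # robust to impulses

# Ideal low-pass in the frequency domain: zero out high frequencies.
F = np.fft.fftshift(np.fft.fft2(img))
yy, xx = np.ogrid[:128, :128]
mask = (yy - 64)**2 + (xx - 64)**2 <= 20**2        # keep radius-20 disc
lowpass = np.fft.ifft2(np.fft.ifftshift(F * mask)).real

# Gradient magnitude |∇I| from Sobel derivatives, as used in Canny.
gx, gy = ndimage.sobel(img, axis=1), ndimage.sobel(img, axis=0)
grad_mag = np.hypot(gx, gy)
print(lowpass.std() < img.std())   # True: smoothing reduces variation
```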