Currently, convolutional neural networks (CNNs) dominate image restoration and computer vision.
Image restoration: Image denoising, spatial and spectral deconvolution, superresolution, registration, and more!
Computer vision: Object detection and segmentation.
Current applications: What problems are solved today?
Computational tools: What algorithms drive solutions?
Trust: Can we rely on data-driven methods?
Photon budgets trade off between depth, intensity, and spatial and temporal resolution.
Main idea: Cheat the trade-off by restoring image features from low intensity acquisitions.
Seminal paper: Content-Aware Image Restoration: Pushing the Limits of Fluorescence Microscopy (Weigert et al., 2018) [PDF]
Main idea: Improve image resolution beyond raw acquisition limits.
Resolution artifacts may be due to diffraction limits, hardware limits, sparse axial sampling, others?
Learn statistical superresolution models from synthetic data and alternate superresolution techniques (STORM, PALM).
Image restoration: Learn a statistical model to infer a high-quality image from a low-quality one.
Computer vision: Parse the structural content of an image into constituent objects.
Classic techniques (thresholding, watershedding) work in many cases.
CNNs may improve results in challenging cases, allow novel applications.
Examples adapted from (von Chamier et al., 2019) [PDF]
(Rajaraman et al., 2018) [PDF]. Red blood cell (RBC) slide, its segmentation, and an overlay.
Train a CNN to predict fluorescence targets from bright-field images.
Advantage: Predict multiple fluorescence labels at once, even when acquiring them physically is impractical.
Basically, semantic segmentation with fluorescence-derived class labels.
Main idea: Learn convolutional features from image data to solve a classification task.
Sequential convolution operations + nonlinearities (ReLU) + pooling: learn abstract image-scale features to solve, e.g., classification problems.
The idea has existed since the 1990s; resurgence since the 2012 AlexNet results for natural image classification (Krizhevsky et al., 2012) [PDF]
Idea: Slide a little window along an image, do some computation in each window.
The computation: dot product between a window-sized kernel and the image.
Example: Gaussian blurring convolves a Gaussian kernel with an image.
CNNs learn kernels. Learned kernels are features, convolution output is a feature map.
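The sliding-window dot product above can be written out directly. A minimal numpy sketch (the loop-based implementation and the 3x3 Gaussian kernel are illustrative choices, not from the source):

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide a kernel over the image; each output pixel is the dot
    product of the kernel with the window beneath it ("valid" mode)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Example: convolving with a normalized 3x3 Gaussian kernel blurs the image.
g = np.array([[1., 2., 1.], [2., 4., 2.], [1., 2., 1.]])
g /= g.sum()
image = np.random.default_rng(0).random((8, 8))
blurred = convolve2d(image, g)  # shape (6, 6) in "valid" mode
```

In a CNN, the entries of `kernel` are learned parameters rather than fixed Gaussian weights, and `blurred` would be one channel of a feature map.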
Linear transforms like convolutions are followed by a nonlinear activation function.
Examples: ReLU is $\max(0, x)$; sigmoid is $1/\left(1+e^{-x}\right)$.
Networks with nonlinearities are more expressive: a sequence of linear transforms is just another linear transform.
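A small numpy sketch of both activations and of the linear-collapse argument (the matrix sizes and the $-0.5$ bias are arbitrary illustrative choices):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
W1, W2 = rng.random((4, 3)), rng.random((2, 4))
x = rng.random(3)

# Without a nonlinearity, two stacked linear layers are exactly
# equivalent to the single linear layer W2 @ W1: no expressivity gained.
no_activation = W2 @ (W1 @ x)
single_layer = (W2 @ W1) @ x

# Inserting ReLU between the layers breaks this collapse.
with_relu = W2 @ relu(W1 @ x - 0.5)
```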
Downsample an image or feature map to produce a "zoomed-out" view.
Convolution ops applied to downsampled data find patterns across larger spatial scales.
Max pooling: replace 2x2 pixel block with the single largest value.
Mean pooling: replace 2x2 pixel block with the average block value.
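Both pooling variants can be sketched in a few lines of numpy by reshaping the image into non-overlapping 2x2 blocks (this reshape trick is one possible implementation, not the source's):

```python
import numpy as np

def pool2x2(x, op):
    """Replace each non-overlapping 2x2 block with op applied to the block."""
    h, w = x.shape
    blocks = x.reshape(h // 2, 2, w // 2, 2)
    return op(blocks, axis=(1, 3))

x = np.arange(16, dtype=float).reshape(4, 4)
max_pooled = pool2x2(x, np.max)    # each output pixel: largest of its 2x2 block
mean_pooled = pool2x2(x, np.mean)  # each output pixel: average of its 2x2 block
```

Either way, the output is half the resolution in each dimension, so subsequent convolutions see a "zoomed-out" view.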
Vanilla CNNs provide spatially coarse classification.
What about spatially dense image problems? (Segmentation, restoration, etc.)
U-Net: Use convolutional encoder + decoder with skip connections for image-to-image translation.
Seminal paper: (Ronneberger et al., 2015) [PDF]
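To make the encoder/decoder/skip idea concrete, here is a shape-bookkeeping sketch of one U-Net level in numpy; it is not a trainable network (mean-pool downsampling, nearest-neighbor upsampling, and all sizes are illustrative assumptions):

```python
import numpy as np

def downsample(x):
    """Encoder step: 2x2 mean pooling on a (channels, H, W) feature map."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def upsample(x):
    """Decoder step: nearest-neighbor 2x upsampling."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

x = np.random.default_rng(2).random((8, 32, 32))  # (channels, H, W)
skip = x                                   # saved for the skip connection
bottleneck = downsample(x)                 # (8, 16, 16): coarse features
up = upsample(bottleneck)                  # (8, 32, 32): back to full resolution
merged = np.concatenate([skip, up], axis=0)  # (16, 32, 32): skip + decoder
```

The skip connection reinjects the fine spatial detail that pooling discards, which is what makes dense, full-resolution output possible.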
Simultaneously train two networks, a generator (e.g. U-Net) and discriminator.
Generator trained to perform a task (e.g. image-to-image translation).
Discriminator trained to distinguish generator output from real training samples.
When successful, adversarially trained generators produce more realistic output than the same generator trained without a discriminator.
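The alternating training scheme can be shown on a deliberately tiny example: a 1-D "generator" $G(z) = az + c$ tries to imitate samples from $\mathcal{N}(3, 0.5)$, while a logistic "discriminator" $D(x) = \sigma(wx + b)$ tries to tell them apart. All of this (the toy models, data distribution, learning rate, and hand-derived gradients) is an illustrative assumption, not the image-to-image setup from the source:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

a, c = 1.0, 0.0          # generator parameters: G(z) = a*z + c
w, b = 0.0, 0.0          # discriminator parameters: D(x) = sigmoid(w*x + b)
lr, n = 0.1, 64

for step in range(2000):
    # Discriminator update: push D(real) toward 1 and D(fake) toward 0.
    x_real = rng.normal(3.0, 0.5, n)
    z = rng.normal(0.0, 1.0, n)
    x_fake = a * z + c
    d_real, d_fake = sigmoid(w * x_real + b), sigmoid(w * x_fake + b)
    w -= lr * np.mean((d_real - 1) * x_real + d_fake * x_fake)
    b -= lr * np.mean((d_real - 1) + d_fake)

    # Generator update: adjust (a, c) so the (now fixed) discriminator
    # assigns higher probability to generated samples.
    z = rng.normal(0.0, 1.0, n)
    d_fake = sigmoid(w * (a * z + c) + b)
    a -= lr * np.mean((d_fake - 1) * w * z)
    c -= lr * np.mean((d_fake - 1) * w)
```

After training, the generator's output distribution has drifted from its starting point toward the real data; in real image-restoration GANs the same alternating loop runs over U-Net outputs and image patches.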
Bottom line: No. Not always, not right now.
Neural network image restoration may fail subtly on new data or anomalous structures.
Long term, work on network interpretability will probably make results more trustworthy.
For now, validate like you would validate a human, not an algorithm.
Approach 1: model interpretability.
How does a network reach its prediction? What image features most contribute to it?
Seminal paper: (Olah et al., 2018) [link] for natural image classification.
No rock-solid solutions yet, research is ongoing.
Approach 2: predicting network confidence.
Instead of predicting a single pixel value, predict a per-pixel parametric statistical model.
(Weigert et al., 2018) predicts the mean μ and standard deviation σ of a Laplace distribution per pixel.
Goal: Low σ = high prediction confidence?
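The training objective behind this idea is the per-pixel negative log-likelihood of a Laplace distribution. A sketch, parameterized here by the Laplace scale $b$ (where σ = √2·b) rather than σ directly, with made-up example numbers:

```python
import numpy as np

def laplace_nll(y, mu, b):
    """Negative log-likelihood of observing y under Laplace(mu, b).
    Minimizing this trains the network to report a scale b that matches
    its actual error, so small b can be read as high confidence."""
    return np.log(2.0 * b) + np.abs(y - mu) / b

# A confident (small-b) prediction wins when its error really is small...
confident = laplace_nll(1.0, 1.1, 0.05)
unsure = laplace_nll(1.0, 1.1, 1.0)

# ...but is heavily penalized when the error is large, which is what
# discourages the network from claiming confidence it does not have.
overconfident_miss = laplace_nll(0.0, 2.0, 0.05)
honest_miss = laplace_nll(0.0, 2.0, 1.0)
```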
Approach 3: measuring ensemble disagreement.
Ensemble: Collection of statistical models (neural nets or otherwise).
Train multiple instances of the same model from different random initial conditions.
Measure how well the models agree on a prediction.
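The steps above can be sketched with simple stand-in models: an ensemble of cubic polynomial fits, each trained on a different bootstrap resample (a proxy for retraining a network from different initial conditions). The data, query points, and ensemble size are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.1, x.size)

x_query = np.array([0.5, 1.5])  # inside vs. outside the training range
preds = []
for seed in range(10):
    # Each ensemble member sees a different bootstrap resample of the data.
    idx = np.random.default_rng(seed).integers(0, x.size, x.size)
    coeffs = np.polyfit(x[idx], y[idx], deg=3)
    preds.append(np.polyval(coeffs, x_query))

# Disagreement = per-point standard deviation across ensemble members.
disagreement = np.std(preds, axis=0)
```

The members agree closely inside the training range and diverge outside it, so large disagreement flags predictions (or pixels) that should not be trusted.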
More sophisticated approach: Bayesian CNNs.
Learn probability distributions for weights to quantify model uncertainty.
Predict probability distributions for restored pixels to quantify data uncertainty.
See (Xue et al., 2019) for an application to phase imaging microscopy.
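The model/data uncertainty split can be illustrated with a toy one-weight "Bayesian network"; the weight distribution, noise level, and input below are invented for illustration and have nothing to do with Xue et al.'s actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Bayesian model: the weight is a distribution N(w_mu, w_sigma), not a
# point estimate, and the output also carries a learned noise scale.
w_mu, w_sigma = 2.0, 0.3   # assumed learned weight posterior
noise_sigma = 0.5          # assumed learned data-noise scale

x = 1.5
w_samples = rng.normal(w_mu, w_sigma, 10_000)
means = w_samples * x      # one predicted mean per sampled weight

model_uncertainty = means.std()   # spread caused by uncertain weights
data_uncertainty = noise_sigma    # spread caused by measurement noise
total = np.sqrt(model_uncertainty**2 + data_uncertainty**2)
```

Model uncertainty shrinks with more training data (the weight posterior tightens), while data uncertainty is a property of the measurement itself; separating the two tells you whether collecting more data would help.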
Content-aware computation lets microscopists cheat physical limits of sample acquisition.
This enables scientific experiments that are impossible otherwise.
Results cannot be entirely trusted and may fail in subtle ways; trustworthiness will probably improve.
Question: What do you do today when the problem you want to solve requires content-aware tools?