This project takes the digitized Prokudin-Gorskii glass plate images and, using image processing techniques, automatically produces a color image with as few visual artifacts as possible. In order to do this, I extracted the three color channel images, placed them on top of each other, and aligned them so they formed a single RGB color image.
My program took a glass plate image as input and produced a single color image as output. First, I divided the image into three equal parts, since each plate is essentially the three channels stacked vertically as separate images. Then, I aligned the second and third parts (the G and R channels) to the first (B). I aligned the images by exhaustively searching over a window of possible displacements (I chose [-15, 15] pixels), scoring each alignment with an image matching metric, and taking the displacement with the best score for the final image output.
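A minimal sketch of this exhaustive search, assuming the channels are grayscale NumPy arrays; the function name `align` and the use of SSD as the score here are illustrative choices, not the exact implementation:

```python
import numpy as np

def align(channel, reference, window=15):
    """Exhaustively search shifts in [-window, window] along both axes
    and return the (dy, dx) displacement that best matches reference."""
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)  # SSD, lower is better
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With a [-15, 15] window this scores 31 × 31 = 961 candidate alignments per channel, which is why this approach only stays affordable on small images.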
There are different metrics to score how well the images match. In this case, I used Euclidean distance (L2 norm) and Normalized Cross-Correlation (NCC) on a cropped version of the images (so that the chromatic aberration on the boundaries of the image wouldn't affect the results).
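The cropping step can be sketched as below; the 10% border fraction and the name `crop_interior` are assumptions for illustration, not values from the writeup:

```python
def crop_interior(img, frac=0.1):
    """Drop a fraction of the image from each border so plate edges
    and boundary artifacts don't dominate the matching score."""
    h, w = img.shape[:2]
    dh, dw = int(h * frac), int(w * frac)
    return img[dh:h - dh, dw:w - dw]
```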
This is the simplest metric to use on the images (formula shown below). It worked well on the smaller images, but performance worsened on the larger ones. To compute the Euclidean distance, first take the per-element difference between the two images at each index i. Square each difference, then sum the squared values to obtain the Sum of Squared Differences (SSD). Finally, take the square root to get the Euclidean distance.
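The steps above can be written directly in NumPy; this is a sketch, with the function name chosen for illustration:

```python
import numpy as np

def euclidean_distance(a, b):
    """L2 norm of the per-pixel differences: sqrt(sum_i (a_i - b_i)^2)."""
    diff = a.astype(float) - b.astype(float)
    return np.sqrt(np.sum(diff ** 2))
```

Since the square root is monotonic, minimizing the Euclidean distance over displacements is equivalent to minimizing the SSD, so the final square root can be skipped when only the ranking matters.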
The Normalized Cross-Correlation metric is simply a dot product between two normalized vectors. This metric performed slightly better than the L2 norm, and the effect was most noticeable on the larger images combined with the pyramid speedup. To compute it, first subtract the mean intensity from each image so both have zero mean; this centers the data and makes the measure invariant to brightness shifts. Then, compute the dot product of the zero-mean image vectors, which measures how much the patterns align. Finally, compute the magnitude of each vector using the L2 norm and divide the dot product by the product of the magnitudes to normalize. The result lies between -1 and 1, where 1 indicates perfect alignment. Unlike plain Euclidean distance, this adjusts for differences in brightness and contrast between the channels.
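The procedure above maps directly to a few lines of NumPy; a minimal sketch, with the function name `ncc` chosen for illustration:

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two images: the dot product of
    their zero-mean, flattened pixel vectors divided by the product of
    the vectors' L2 norms. Returns a value in [-1, 1]."""
    a = a.astype(float).ravel() - a.mean()  # center: zero-mean vector
    b = b.astype(float).ravel() - b.mean()
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
```

Note that `ncc` is unchanged if one image is rescaled or offset in brightness (e.g. `2*a + 5` still correlates perfectly with `a`), which is exactly the invariance described above.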
For larger images, exhaustive search becomes too expensive: the true displacements grow with image size, so the [-15, 15] pixel window I set is either too small to find the right shift or too costly to enlarge. In this case, I implemented a faster search procedure, namely an image pyramid, which represents the image at multiple scales and processes from the coarsest (smallest) level to the finest, updating the displacement estimate at each level.
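A coarse-to-fine pyramid can be sketched as follows. This is an illustrative version, not the exact implementation: the function names, the stride-based downsampling, the base size of 64 pixels, and the refinement window of 2 are all assumptions.

```python
import numpy as np

def exhaustive_align(channel, reference, window):
    """Brute-force search over [-window, window] shifts using SSD."""
    best, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best_score, best = score, (dy, dx)
    return best

def pyramid_align(channel, reference, min_size=64):
    """Coarse-to-fine alignment: recurse on half-resolution images,
    double the coarse estimate, then refine with a small local search."""
    if min(channel.shape) <= min_size:
        return exhaustive_align(channel, reference, window=15)
    # Naive downsampling by striding (a real implementation would
    # low-pass filter first to avoid aliasing).
    dy, dx = pyramid_align(channel[::2, ::2], reference[::2, ::2], min_size)
    dy, dx = 2 * dy, 2 * dx
    shifted = np.roll(channel, (dy, dx), axis=(0, 1))
    ry, rx = exhaustive_align(shifted, reference, window=2)
    return (dy + ry, dx + rx)
```

The full [-15, 15] search runs only at the coarsest level, where the image is tiny; every finer level just corrects the doubled estimate within a few pixels, which is what makes the large plates tractable.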
I found a couple of images from the Prokudin-Gorskii collection online and ran my algorithm on them. These runs use the normalized cross-correlation metric without the coarse-to-fine pyramid speedup, since the images are smaller.
Here is the gallery of my algorithm run on all of the example images provided (other than Emir, which is shown up top) using the NCC image metric and the image pyramid speedup.