CS180 Project 1: Colorizing the Prokudin-Gorskii photo collection

Project Overview

The goal of this project is to take the digitized Prokudin-Gorskii glass plate images and, using image processing techniques, automatically produce a color image. The input is a set of images, each containing three plates in the order blue plate, green plate, red plate. For each image, the three plates are placed on top of each other and aligned so that they form a single RGB color image. An example input image is shown below:

An example of input image
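
The splitting and stacking step can be sketched as follows. This is a minimal NumPy sketch, not the project code: it assumes the three plates are stacked vertically in equal thirds of the scan, and the function names `split_plates` and `stack_rgb` are hypothetical.

```python
import numpy as np

def split_plates(scan):
    """Split a stacked scan into (blue, green, red) plates.

    Assumption: the plates are stacked vertically, each taking one third
    of the image height, in the order blue, green, red.
    """
    h = scan.shape[0] // 3  # height of a single plate
    blue = scan[:h]
    green = scan[h:2 * h]
    red = scan[2 * h:3 * h]
    return blue, green, red

def stack_rgb(red, green, blue):
    """Stack three equally sized plates into an (H, W, 3) RGB image."""
    return np.dstack([red, green, blue])
```

Any leftover rows (when the scan height is not divisible by 3) are simply dropped at the bottom in this sketch.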

Simple Alignment

For smaller images, I used a simple algorithm that exhaustively searches over a window of possible displacements along both the X and Y axes. The search window is [-15, 15] pixels in each direction. I used the Euclidean distance as the metric for evaluating the alignment between two images; for two image matrices A and B it is computed as sqrt(sum((A-B)^2)), and the displacement with the smallest distance is chosen. Before performing the alignment, I also removed the borders of the images: denoting the height and width of an image by H and W, I removed 0.05H from the top and the bottom, and 0.05W from the left and the right.
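
The exhaustive search can be sketched roughly as below. This is an illustration under stated assumptions rather than the exact project code: the function names are hypothetical, shifting is done with `np.roll` (which wraps pixels around), and the 5% border is cropped after each shift so that wrapped pixels are mostly excluded from the score.

```python
import numpy as np

def crop_border(img, frac=0.05):
    """Remove frac*H rows at top/bottom and frac*W columns at left/right."""
    h, w = img.shape
    dh, dw = int(h * frac), int(w * frac)
    return img[dh:h - dh, dw:w - dw]

def euclidean_distance(a, b):
    """sqrt(sum((A - B)^2)), computed in float to avoid integer overflow."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return np.sqrt(np.sum(diff ** 2))

def align_simple(plate, ref, radius=15):
    """Return the (dx, dy) in [-radius, radius]^2 that best aligns plate to ref."""
    ref_cropped = crop_border(ref)
    best, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Assumption: np.roll stands in for the shift operation.
            shifted = np.roll(plate, (dy, dx), axis=(0, 1))
            score = euclidean_distance(crop_border(shifted), ref_cropped)
            if score < best_score:
                best_score, best = score, (dx, dy)
    return best
```

With a radius of 15 this evaluates 31 x 31 = 961 candidate displacements per plate, which is cheap for small images but scales poorly with image size.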

Results

The simple alignment algorithm works well on the three smaller images. The results are shown below with the displacement vectors (dx, dy) for the red and green plates.

Image 1
cathedral.jpg

red:(3, 12), green:(2, 5)

Image 2
monastery.jpg

red:(2, 3), green:(2, -3)

Image 3
tobolsk.jpg

red:(3, 6), green:(3, 3)

Image Pyramid Alignment

The simple alignment algorithm works well for smaller images, but it is very inefficient for larger images. For the large images, I used an image pyramid alignment algorithm. The high-level idea is to shrink the large .tif images into small images, then iteratively enlarge them and refine the best displacement vector on each larger image. The details are as follows:
Denote the height and the width of the original image as H and W. Calculate the number of scaling steps as n = log2(min(H, W)/100), so that the shorter side of the shrunken image is about 100 pixels. Resize the red/green plate and the blue plate to height = H/2^n and width = W/2^n. Denote the small red/green plate as s_0, and the small blue plate as t_0.


for i = 0 to n:
    compute the best alignment displacement vector d_i = align(s_i, t_i, d_{i-1})
    scale up s_i and t_i by a factor of 2
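As the images get larger in each iteration, I shrink the search window of displacements: I use [-15, 15] for the smallest images and [-3, 3] for all the larger images. Because each level is twice the size of the previous one, the displacement carried over from the previous level has to be doubled before it is used as the starting point of the next search. At the end of the algorithm, I obtain the best displacement vector for the images at the original size.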
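
The coarse-to-fine procedure can be sketched as below. This is a simplified illustration, not the project code, and it makes several assumptions: the pyramid is built by repeated 2x block averaging instead of a single resize followed by upscaling, border cropping is omitted for brevity, shifts use `np.roll`, and all function names and the `min_size` parameter are hypothetical. As a worked example of the level count, a plate whose shorter side is 3200 pixels gives n = log2(3200/100) = 5.

```python
import numpy as np

def downsample2(img):
    """Halve the image size by averaging 2x2 blocks (odd edges are dropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(plate, ref, center, radius):
    """Exhaustive search for (dx, dy) within radius of center."""
    cx, cy = center
    best, best_score = center, np.inf
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(plate, (dy, dx), axis=(0, 1))
            score = np.sqrt(np.sum((shifted - ref) ** 2))
            if score < best_score:
                best_score, best = score, (dx, dy)
    return best

def align_pyramid(plate, ref, min_size=100):
    """Coarse-to-fine alignment; returns (dx, dy) at the original resolution."""
    plate = plate.astype(np.float64)
    ref = ref.astype(np.float64)
    # Number of halvings so the shorter side is roughly min_size pixels.
    n = max(0, int(np.log2(min(plate.shape) / min_size)))
    pyramid = [(plate, ref)]
    for _ in range(n):
        p, r = pyramid[-1]
        pyramid.append((downsample2(p), downsample2(r)))
    d = (0, 0)
    for level, (p, r) in enumerate(reversed(pyramid)):
        radius = 15 if level == 0 else 3  # wide window only at the coarsest level
        d = search(p, r, d, radius)
        if level < n:
            d = (2 * d[0], 2 * d[1])  # rescale the estimate for the next, larger level
    return d
```

The wide window is only paid for on the smallest images; every finer level evaluates just 7 x 7 = 49 candidates around the doubled estimate.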

Results

The results are shown below with the displacement vectors (dx, dy) for the red and green plates. For these .tif images, the runtime of the image pyramid algorithm is in the range of 15-20 s.

Image 1
church.jpg

red:(-4, 59), green:(3, 25)

Image 2
emir.jpg

red:(44, 36), green:(24, 49)

Image 3
harvesters.jpg

red:(13, 123), green:(16, 59)

Image 4
icon.jpg

red:(23, 89), green:(17, 41)

Image 5
lady.jpg

red:(11, 116), green:(8, 56)

Image 6
melons.jpg

red:(12, 178), green:(9, 82)

Image 7
onion_church.jpg

red:(36, 108), green:(26, 51)

Image 8
sculpture.jpg

red:(-26, 139), green:(-11, 33)

Image 9
self_portrait.jpg

red:(36, 176), green:(29, 79)

Image 10
three_generations.jpg

red:(10, 111), green:(13, 53)

Image 11
train.jpg

red:(32, 87), green:(5, 42)

Bells and Whistles

In the results above, emir.jpg is not aligned properly. This is probably because pixels at the same location have very different intensity values on the red, green, and blue plates. A possible solution is to align on edge features instead of raw pixel values. I used the Canny edge detection algorithm, and this fixed the misalignment problem. In the results below, the left image does not use Canny edge detection, whereas the image on the right does.

emir.jpg without Canny Edge detection

emir.jpg with Canny Edge detection
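
The idea of aligning on edges instead of raw intensities can be illustrated with the sketch below. Note the substitution: instead of the Canny detector used for the results above, this sketch uses a plain gradient-magnitude edge map computed with NumPy, which is simpler but shows the same effect. The function names are hypothetical.

```python
import numpy as np

def edge_map(img):
    """Gradient magnitude as a simple stand-in for a Canny edge map."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.hypot(gx, gy)

def best_shift(plate, ref, radius=5):
    """Exhaustive (dx, dy) search minimizing the Euclidean distance."""
    best, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(plate, (dy, dx), axis=(0, 1))
            score = np.sqrt(np.sum((shifted - ref) ** 2))
            if score < best_score:
                best_score, best = score, (dx, dy)
    return best

def align_on_edges(plate, ref, radius=5):
    """Align the edge maps of the two plates rather than their intensities."""
    return best_shift(edge_map(plate), edge_map(ref), radius)

# Synthetic example: the same square appears bright on one plate and dark on
# the other, mimicking a region with very different per-channel intensities.
ref = np.zeros((40, 40))
ref[10:25, 12:30] = 1.0
plate = 0.9 - 0.6 * np.roll(ref, (3, -2), axis=(0, 1))

edge_result = align_on_edges(plate, ref)  # recovers the true shift
raw_result = best_shift(plate, ref)       # misled by the inverted contrast
```

Because the gradient magnitude ignores the sign of the contrast, the edge maps of the two plates match even when their brightness does not.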