Hello, this blog aims to talk about how to use OpenCV to colorize a grayscale image. Before the advent of color cameras, a lot of cameras took separate images of each of the R, G, B channels so that the actual color could be reconstructed from them.
<p>This blog explores how to use OpenCV to colorize grayscale images. Before the advent of color cameras, photographers captured images of each RGB channel separately to reconstruct the final color image.</p>
</div>
<p>I will use the following image as an example:</p>
<div class="image-container">
<img src="media/cathedral.jpg" alt="Cathedral original grayscale channels">
<div class="image-caption">Original Cathedral image with separate R, G, B channels</div>
</div>
<p>This image is a grayscale image (only one channel instead of RGB), with each of the 3 sections representing the blue, green, and red channels in grayscale. Our job is to combine these three grayscales to create a color image.</p>
<p>A naive attempt at this will look something like this:</p>
<div class="image-caption">Naive alignment - simply stacking the channels without proper alignment</div>
</div>
<p>What we do here is naively chunk the original image into thirds by height and assign the colors respectively. We can see that the images do not align properly.</p>
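<p>The naive approach can be sketched roughly as follows. This is only an illustration using NumPy: the helper names <code>split_channels</code> and <code>naive_colorize</code> are hypothetical, and the tiny synthetic array stands in for a plate that would normally be loaded as a single-channel image.</p>

```python
import numpy as np

def split_channels(plate):
    """Split a vertically stacked grayscale plate into (blue, green, red) thirds."""
    h = plate.shape[0] // 3  # any leftover rows at the bottom are dropped
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

def naive_colorize(plate):
    """Stack the three thirds into one color image without any alignment."""
    b, g, r = split_channels(plate)
    return np.dstack([b, g, r])  # B, G, R order, the channel order OpenCV uses

# Synthetic 9 x 4 plate standing in for a real scan (assumption for the demo).
plate = np.arange(36, dtype=np.uint8).reshape(9, 4)
color = naive_colorize(plate)
```

<p>Because nothing compensates for the small shifts between the three exposures, the result of this sketch shows the same color fringing as the image above.</p>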
<p>Thus, we need to align the three images to get the correct color image. An easy way to start is to find the offset that minimizes the L2 loss (the sum of squared pixel differences) between two channels, since we know that the channels will generally be highly correlated.</p>
<p>Of course, a natural problem that arises here is that after adding some sort of offset, the image dimensions will no longer match up. Thus, we need to find a way to account for this in our norm. One way to do this is to apply the L2 norm on a smaller patch of the image such that both the original and shifted dimensions include the entirety of this smaller patch. I arbitrarily chose the middle 40% of each dimension (meaning we only use 16% total).</p>
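<p>A minimal sketch of this patch-based score and an exhaustive search over offsets is shown below. It assumes NumPy arrays; the names <code>center_crop</code>, <code>l2_score</code> and <code>align_brute_force</code>, as well as the search radius, are illustrative choices rather than the original implementation. Note that <code>np.roll</code> wraps pixels around the border, which does not affect the score as long as the offset stays smaller than the 30% margin left around the central patch.</p>

```python
import numpy as np

def center_crop(img, frac=0.4):
    """Return the central `frac` of each dimension (0.4 x 0.4 = 16% of the pixels)."""
    h, w = img.shape
    ch, cw = int(h * frac), int(w * frac)
    top, left = (h - ch) // 2, (w - cw) // 2
    return img[top:top + ch, left:left + cw]

def l2_score(ref, moving, dy, dx, frac=0.4):
    """Sum of squared differences on the central patch after shifting `moving`."""
    shifted = np.roll(moving, (dy, dx), axis=(0, 1))
    diff = center_crop(ref, frac).astype(np.float64) - center_crop(shifted, frac)
    return float(np.sum(diff ** 2))

def align_brute_force(ref, moving, radius=15):
    """Try every offset in [-radius, radius]^2 and keep the one with the lowest score."""
    best, best_score = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            score = l2_score(ref, moving, dy, dx)
            if score < best_score:
                best, best_score = (dy, dx), score
    return best

# Demo on synthetic data: `moving` is `ref` displaced by a known amount.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(40, 40))
moving = np.roll(ref, (-3, 2), axis=(0, 1))
offset = align_brute_force(ref, moving, radius=5)  # shift that maps `moving` back onto `ref`
```

<p>The returned <code>(dy, dx)</code> pair is the shift to apply to the moving channel before stacking it with the reference channel.</p>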
<p>Applying this technique to our image, we get the following output:</p>
<p>The problem here is that although this works on a small image like this one (~300 x 300 pixels), a brute-force search over every candidate pixel offset takes too much time on larger images. Thus, we want to implement a pyramid-based search to find the best alignment.</p>
<p>A pyramid-based search starts with a much lower resolution (downsampled) version of the image and finds the best offset for this smaller version. Then, we move to a larger version, scale up the offset found at the smaller version, refine it with a small local search, and repeat this process until we reach the original resolution.</p>
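<p>One way to sketch the coarse-to-fine idea is the recursive function below. It is a simplified illustration, not the original code: the 2 x 2 block averaging is a stand-in for a proper downsampling routine such as OpenCV's <code>cv2.pyrDown</code>, and the function names, the <code>min_size</code> threshold and both search radii are assumptions chosen for the example.</p>

```python
import numpy as np

def center_crop(img, frac=0.4):
    """Central `frac` of each dimension, used so shifted borders are ignored."""
    h, w = img.shape
    ch, cw = int(h * frac), int(w * frac)
    top, left = (h - ch) // 2, (w - cw) // 2
    return img[top:top + ch, left:left + cw]

def search(ref, moving, center, radius):
    """Exhaustive L2 search over offsets within `radius` of `center`."""
    target = center_crop(ref).astype(np.float64)
    best, best_score = center, float("inf")
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = float(np.sum((target - center_crop(shifted)) ** 2))
            if score < best_score:
                best, best_score = (dy, dx), score
    return best

def downsample(img):
    """Halve the resolution by averaging 2 x 2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def pyramid_align(ref, moving, min_size=32, coarse_radius=8, refine_radius=2):
    """Estimate the offset at the coarsest level, then double and refine it per level."""
    if min(ref.shape) <= min_size:
        return search(ref, moving, (0, 0), coarse_radius)
    dy, dx = pyramid_align(downsample(ref), downsample(moving),
                           min_size, coarse_radius, refine_radius)
    # One pixel at the coarser level corresponds to two pixels at this level.
    return search(ref, moving, (2 * dy, 2 * dx), refine_radius)

# Demo on synthetic data with a known displacement.
rng = np.random.default_rng(1)
ref = rng.integers(0, 256, size=(128, 128))
moving = np.roll(ref, (-8, 4), axis=(0, 1))
offset = pyramid_align(ref, moving)
```

<p>In this sketch, every level above the coarsest one only evaluates a 5 x 5 window of offsets, so the cost of the full-resolution pass no longer grows with the size of the displacement.</p>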
<p>In this particular case, the pyramid search results in the same output without any significant speed up. However, consider a different image such as this image of 3 generations:</p>
<div class="image-caption">Three Generations - original grayscale channels (9629 x 3714 pixels)</div>
</div>
<p>This image is 9629 x 3714 pixels. If we were to use the brute force approach, we would have to search tens of millions of possible alignments, taking minutes if not hours. On the other hand, the pyramid search only takes a few seconds under the right conditions.</p>
This image is a grayscale image (only one channel instead of RGB), with each of the 3 images representing the blue, green, and red channels in grayscale. Our job is to take these three grayscale images and put them together to form a color image.
What we do here is naively chunk the original image into thirds by height and assign the colors respectively. We see that the images do not align properly.
Thus, we need to align the three images together to get the correct color image. An easy way to start off would be to find the alignment that reduces the L2 loss, since we know that these colors will generally have pretty high correlation.
Of course, a natural problem that arises here is that after adding some sort of offset, the image dimensions will no longer match up. Thus, we need to find a way to account for this in our norm.
One way to do this is to apply the L2 norm on a smaller patch of the image such that both the original and shifted dimensions include the entirety of this smaller patch. I arbitrarily chose the middle 40% of each dimension (meaning we only use 16% total).
Applying this technique to our image, we get the following output:
The problem here is that although this works on a smaller image like this (~300 x 300 pixels), trying to brute force search this on every pixel will take too much time. Thus, we want to implement a pyramid-based search to find the best alignment.
A pyramid-based search starts with a much lower resolution version of the image (downsampled) and finds the best offset for this smaller version.
Then, we look at a larger version of it, find an offset (based on the offset of the smaller version) and repeat this process until we reach the original resolution.
In this particular case, the pyramid search results in the same output without any significant speed up. However, consider a different image such as this image of 3 generations:
This image is 9629 x 3714 pixels. If we were to use the brute force approach, we would have to search tens of millions of possible alignments, taking minutes if not hours. On the other hand, the pyramid search only takes a few seconds under the right conditions.
Here is the output of the pyramid search, as well as the output of pyramid search for various images: