Before reading this page it is highly advisable to quickly review the IplImage
structure, a version of which is kept on
this
page. (Use 'Find' or [Ctrl-F] and type in 'IplImage'.)
When you first start out with OpenCV you tend to use the HighGUI functions or the one-liner methods that take 15 parameters.
As such, you tend to miss some of the important stuff that can be useful to
know when working at the lower levels. This page is an attempt to address
the problem by showing you some of the 'inner workings' of the IplImage
structure.
It is worth noting that images are not stored using the 'traditional' RGB colour space; they're actually stored in BGR (the other way round). Why this is I'm not entirely sure, but you don't tend to notice it, as all of OpenCV's methods are written to use BGR as well.
To draw a red square we'll need to start off by creating an image.
IplImage *img = cvCreateImage(cvSize(100, 100), IPL_DEPTH_8U, 3);
This creates an image 100 pixels wide and 100 pixels high, using 8-bit unsigned integers to represent the colour values, with 3 colour channels.
However, 8-bit unsigned values are not the only type available; values can
also be held as 32-bit floating point numbers (IPL_DEPTH_32F) and a variety
of other ways. In each case the depth is represented as IPL_DEPTH_<bits>{U|S|F}, where U, S and F stand for unsigned, signed and floating point respectively. The full set is:
- IPL_DEPTH_8U
- IPL_DEPTH_8S
- IPL_DEPTH_16U
- IPL_DEPTH_16S
- IPL_DEPTH_32S
- IPL_DEPTH_32F
- IPL_DEPTH_64F
Also notice that it's a pointer to an image - all images should be created in
this way when using OpenCV as most (if not all) of its methods take image
pointers as parameters in order to modify images directly.
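As a quick illustration (a minimal sketch - the printf line is only there for demonstration), an image can be created, inspected and released like this:

#include <cv.h>
#include <stdio.h>

int main(void)
{
    /* 8-bit unsigned values, 3 colour channels - the most common case */
    IplImage *img8u  = cvCreateImage(cvSize(100, 100), IPL_DEPTH_8U, 3);
    /* 32-bit floating point values, 3 colour channels */
    IplImage *img32f = cvCreateImage(cvSize(100, 100), IPL_DEPTH_32F, 3);

    printf("depth: %d, channels: %d\n", img8u->depth, img8u->nChannels);

    /* every cvCreateImage needs a matching cvReleaseImage */
    cvReleaseImage(&img8u);
    cvReleaseImage(&img32f);
    return 0;
}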
Image Data
Images are not stored as an array of pixel structures. Instead, the data is held as one flat array of colour values, which means less processor overhead as you're not constantly dereferencing pointers. These colour values are stored interleaved in BGR order (as mentioned above):
imageData[0] | imageData[1] | imageData[2] | imageData[3] | imageData[4] | imageData[5]
     B       |      G       |      R       |      B       |      G       |      R

(the colour values go directly in these elements)

...as opposed to this:

        imageData[0]        |        imageData[1]
 -> red   -> green  -> blue |  -> red   -> green  -> blue

(one element per pixel, with the colour values hanging off it)
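So, to make the very first pixel of an 8U three-channel image pure red, you write to elements 0, 1 and 2 in BGR order (a minimal sketch, using the img created above):

img->imageData[0] = 0;    /* blue channel of the first pixel  */
img->imageData[1] = 0;    /* green channel of the first pixel */
img->imageData[2] = 255;  /* red channel - note the BGR order */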
Greyscale image structures differ very slightly - instead of having three
channels they have just the one (for brightness) that can be accessed in the
same way.
i.e. cvCreateImage(cvSize(100, 100), IPL_DEPTH_8U, 1)
In this case the first pixel would be imageData[0], the second would be
imageData[1] and so on.
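For instance (a sketch along the same lines as before), a greyscale image and its first two pixels:

IplImage *grey = cvCreateImage(cvSize(100, 100), IPL_DEPTH_8U, 1);
grey->imageData[0] = 255;  /* first pixel: full brightness */
grey->imageData[1] = 127;  /* second pixel: mid grey       */
cvReleaseImage(&grey);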
Finally, images in OpenCV are padded. Most image formats available today, such as JPEG, PNG, TIFF and the like, are padded out so that the number of bytes in each row of the image is divisible by 4 - with the exception of BMPs. This means that if you ever get round to converting between image structures using BMPs, you can get some rather interesting skewing effects if you try to simply copy the data arrays over. (This was discovered whilst trying to convert between Leeds' libRTImage library and OpenCV. If you're interested, this is what I came up with.)
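Because of this padding, the safest way to reach pixel (x, y) is to use the widthStep attribute - the number of bytes in each row, including any padding - rather than assuming a row is exactly width*3 bytes long. A minimal sketch for an 8U three-channel image:

int x = 10, y = 20;   /* some arbitrary pixel coordinates */
unsigned char *row = (unsigned char*)(img->imageData + y * img->widthStep);
unsigned char b = row[x*3];      /* blue  */
unsigned char g = row[x*3 + 1];  /* green */
unsigned char r = row[x*3 + 2];  /* red   */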
Direct Pixel Access
So to get our red square going we'll just have to set every third byte - the red channel. Direct access to the pixels is possible through the imageData attribute, and the total number of bytes in the image (img->imageSize) can be used as a quick way of bounding the for loop.
img->imageData[i] = value;
so we get:
int i;
for (i = 2; i < img->imageSize; i += 3)
    img->imageData[i] = 255;  /* every third byte is a red value */
(For a finished file which displays the square in a window, click
here.)
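In case that file isn't to hand, a minimal version might look something like this (the window name is an arbitrary choice):

#include <cv.h>
#include <highgui.h>

int main(void)
{
    IplImage *img = cvCreateImage(cvSize(100, 100), IPL_DEPTH_8U, 3);
    int i;

    cvZero(img);  /* cvCreateImage doesn't initialise the data, so clear to black first */

    /* set every third byte (the red channel) to 255 */
    for (i = 2; i < img->imageSize; i += 3)
        img->imageData[i] = 255;

    cvNamedWindow("red square", CV_WINDOW_AUTOSIZE);
    cvShowImage("red square", img);
    cvWaitKey(0);  /* wait for a key press before closing */

    cvDestroyWindow("red square");
    cvReleaseImage(&img);
    return 0;
}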
It is worth noting that while most images and methods in OpenCV use or return 8-bit unsigned data (e.g. cvLoadImage always returns an IPL_DEPTH_8U image), this is not how OpenCV is written internally. imageData isn't declared as int or float - it's a char pointer to the raw data inside the IplImage. It would seem that this is done for versatility, but it presents a little confusion the first time you use a 32-bit float image. 32F images are expected to hold values between 0 and 1, but to a rather high degree of accuracy (as you would expect), so we have to adjust the values accordingly. We also have to change the way the for loop is bounded: imageSize is measured in bytes, and as each colour value is now four bytes long (floats are four bytes each), indexing floats all the way up to imageSize runs off the end of the data - only the first quarter of the loop is valid - and causes a segmentation fault if we use the same code as before. Instead we can use the image's width and height attributes, multiplying by 3 so that all channels are filled. Finally, imageData needs to be cast to a float pointer so that the data is stored in the correct format. The following code should clarify things.
int i;
for (i = 0; i < img->width * img->height * 3; i += 3)
{
    ((float*)img->imageData)[i]     =  64/256.0;   /* blue  */
    ((float*)img->imageData)[i + 1] = 196/256.0;   /* green */
    ((float*)img->imageData)[i + 2] = 256/256.0;   /* red   */
}
The orange square program shows the complete version of the above.
Click here to see it.
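On a related note, if you already have an 8U image and want a 32F copy scaled into the 0-1 range (or the other way round), cvConvertScale will do the per-element scaling for you. A sketch, assuming an existing 8U image called img8u:

IplImage *img32f = cvCreateImage(cvGetSize(img8u), IPL_DEPTH_32F, 3);
cvConvertScale(img8u, img32f, 1.0/255.0, 0);     /* 0..255 -> 0..1 */

IplImage *img8uAgain = cvCreateImage(cvGetSize(img8u), IPL_DEPTH_8U, 3);
cvConvertScale(img32f, img8uAgain, 255.0, 0);    /* 0..1 -> 0..255 */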
There are several colour space conversions available within OpenCV, through
use of the cvCvtColor function:
cvCvtColor(source, destination, space_code);
Here, space_code takes the form CV_<source>2<destination>, naming the source colour space and the desired colour space; note that in each case the source and destination images must have the correct number of channels (there is a short sketch after the list below). The possible codes are:
Simple:
- CV_BGR2RGB
- CV_RGB2BGR
- CV_BGR2GRAY
- CV_RGB2GRAY
- CV_GRAY2BGR
- CV_GRAY2RGB
- CV_BGR2XYZ
- CV_RGB2XYZ
- CV_XYZ2BGR
- CV_XYZ2RGB
- CV_BGR2YCrCb
- CV_RGB2YCrCb
- CV_YCrCb2BGR
- CV_YCrCb2RGB
- CV_BGR2HSV
- CV_RGB2HSV
- CV_HSV2BGR
- CV_HSV2RGB
- CV_BGR2HLS
- CV_RGB2HLS
- CV_HLS2BGR
- CV_HLS2RGB
- CV_BGR2Lab
- CV_RGB2Lab
- CV_Lab2BGR
- CV_Lab2RGB
- CV_BGR2Luv
- CV_RGB2Luv
- CV_Luv2BGR
- CV_Luv2RGB
- CV_BayerBG2BGR
- CV_BayerGB2BGR
- CV_BayerRG2BGR
- CV_BayerGR2BGR
- CV_BayerBG2RGB
- CV_BayerGB2RGB
- CV_BayerRG2RGB
- CV_BayerGR2RGB
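As an example of the channel-count requirement (a sketch - the file name is just a placeholder), converting a BGR image to greyscale needs a single-channel destination of the same size and depth:

IplImage *colour = cvLoadImage("photo.jpg", CV_LOAD_IMAGE_COLOR);  /* 3-channel, BGR */
IplImage *grey   = cvCreateImage(cvGetSize(colour), IPL_DEPTH_8U, 1);
cvCvtColor(colour, grey, CV_BGR2GRAY);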
Note, however, that OpenCV does not keep track of which colour space an image is actually in. When displaying or processing an image it simply assumes BGR order, and this can lead to strange-looking output if an image in another colour space is displayed using cvShowImage.
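So if your data really is in RGB order (because it came from another library, say), swap the channels before displaying it - a minimal sketch, where rgbImage is assumed to be an 8U three-channel image holding RGB data:

IplImage *bgr = cvCreateImage(cvGetSize(rgbImage), IPL_DEPTH_8U, 3);
cvCvtColor(rgbImage, bgr, CV_RGB2BGR);  /* swap the R and B channels */
cvShowImage("window", bgr);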