Friday, 13 March 2009

The IplImage Structure


Before reading this page it is highly advisable to quickly review the IplImage structure, a version of which is kept on this page. (Use 'Find' or [Ctrl-F] and type in 'IplImage'.)

OpenCV has a tendency to hide information, especially when you only use the HighGUI functions or the one-liner methods that take 15 parameters. As a result, you can miss some of the important details that are useful to know when working at the lower levels. This page is an attempt to address the problem by showing you some of the 'inner workings' of the IplImage structure.

It is worth noting that images are not stored using the 'traditional' RGB channel order; they are actually stored in BGR (the other way round). Why this is I'm not entirely sure, but you don't tend to notice it, as all of OpenCV's methods are written to use BGR as well.
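A quick way to convince yourself of this (a minimal sketch; 'example.jpg' is just a placeholder for any colour image you have lying around) is to print the first three bytes of a loaded image, which belong to the top-left pixel:

#include <cv.h>
#include <highgui.h>
#include <stdio.h>

int main(void)
{
    IplImage *img = cvLoadImage("example.jpg", 1);   /* 1 = load as colour */
    if (!img) return 1;

    /* The channels are interleaved blue, green, red... */
    printf("B=%d G=%d R=%d\n",
           (unsigned char)img->imageData[0],
           (unsigned char)img->imageData[1],
           (unsigned char)img->imageData[2]);

    cvReleaseImage(&img);
    return 0;
}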

Create An Image

To draw a red square we'll need to start off by creating an image.


IplImage *img = cvCreateImage(cvSize(100, 100), IPL_DEPTH_8U, 3);

This creates an image of width/height 100/100, using 8-bit unsigned integers
to represent the colour values, and with 3 colour channels.
However, 8-bit unsigned values are not the only type available; values can also be held as 32-bit floating point numbers (IPL_DEPTH_32F), among other formats. In each case the depth is written as IPL_DEPTH_<bits>{U|S|F}, where U, S and F stand for unsigned, signed and floating point respectively. The available depths are:

  • IPL_DEPTH_8U
  • IPL_DEPTH_8S
  • IPL_DEPTH_16U
  • IPL_DEPTH_16S
  • IPL_DEPTH_32S
  • IPL_DEPTH_32F
  • IPL_DEPTH_64F

Also notice that img is a pointer to an image: all images should be created this way when using OpenCV, as most (if not all) of its functions take image pointers as parameters so that they can modify the images directly.
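As a minimal sketch, creating (and, just as importantly, releasing) images of different depths all follows the same pattern:

#include <cv.h>

int main(void)
{
    /* 3-channel, 8-bit unsigned image (the most common case). */
    IplImage *img8u  = cvCreateImage(cvSize(100, 100), IPL_DEPTH_8U, 3);

    /* 3-channel, 32-bit floating point image. */
    IplImage *img32f = cvCreateImage(cvSize(100, 100), IPL_DEPTH_32F, 3);

    /* Single-channel (greyscale) image. */
    IplImage *grey   = cvCreateImage(cvSize(100, 100), IPL_DEPTH_8U, 1);

    /* Every cvCreateImage needs a matching cvReleaseImage. */
    cvReleaseImage(&img8u);
    cvReleaseImage(&img32f);
    cvReleaseImage(&grey);
    return 0;
}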

Image Data

Images are not stored as an array of pixel structures. Instead, the data is held as one flat array of colour values, which means less processor overhead because you're not constantly dereferencing pointers. These colour values are stored interleaved in BGR order (as mentioned above).

e.g. IplImage's imageData field looks like this...

  imageData[0]  imageData[1]  imageData[2]  imageData[3]  imageData[4]  imageData[5]
       B             G             R             B             G             R

(the colour values go directly in these elements, interleaved as B, G, R, B, G, R, ...)

...as opposed to this:

  imageData[0] -> { red, green, blue }    imageData[1] -> { red, green, blue }    ...

(i.e. an array of per-pixel structures, with the colour values hanging off each element)

Greyscale image structures differ very slightly - instead of having three channels they have just the one (for brightness), which can be accessed in the same way.

i.e. cvCreateImage(cvSize(100, 100), IPL_DEPTH_8U, 1)

In this case the first pixel would be imageData[0], the second would be
imageData[1] and so on.
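For instance (a minimal sketch):

IplImage *grey = cvCreateImage(cvSize(100, 100), IPL_DEPTH_8U, 1);
int i;

/* One byte per pixel, so pixel n is simply imageData[n];
   here every pixel is set to the same mid-grey brightness. */
for (i = 0; i < grey->imageSize; i++)
    grey->imageData[i] = 100;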

Finally, images in OpenCV are padded: each row of imageData is padded so that its length in bytes (the widthStep attribute) is a multiple of 4, much as BMP files pad their rows to 4-byte boundaries. This means that if you ever get round to converting between image structures from different libraries, you can get some rather interesting skewing effects if you simply copy the data arrays over without accounting for the row padding. (This was discovered whilst trying to convert between Leeds' libRTImage library and OpenCV. If you're interested, this is what I came up with.)
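The safe way round this, and the safe way to address an individual pixel in general, is to step through rows using widthStep (the number of bytes per row, padding included) rather than assuming the rows are packed. A minimal sketch, reusing the 3-channel 8U image created earlier (the coordinates are arbitrary):

int x = 10, y = 20;   /* pixel coordinates, for illustration only */
unsigned char *row = (unsigned char *)(img->imageData + y * img->widthStep);

row[x * 3 + 0] = 0;     /* blue  */
row[x * 3 + 1] = 0;     /* green */
row[x * 3 + 2] = 255;   /* red   */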


Direct Pixel Access

So to get our red square going we'll just have to set every third channel. Direct access to the pixels is possible through the imageData attribute, and the number of bytes in the image (img->imageSize) can be used as a quick way of bounding the for loop:

img->imageData[i] = value;

so we get:

int i;

/* Start at index 2 (the first red channel) and step over each BGR triplet. */
for (i = 2; i < img->imageSize; i += 3)
    img->imageData[i] = 255;


(For a finished file which displays the square in a window, click
here.)
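In case that link isn't available, a minimal sketch of a complete program along those lines might look like the following (the window name and key-wait are my own choices, not necessarily what the original file does):

#include <cv.h>
#include <highgui.h>

int main(void)
{
    IplImage *img = cvCreateImage(cvSize(100, 100), IPL_DEPTH_8U, 3);
    int i;

    /* Zero the image (cvCreateImage doesn't initialise the data),
       then set every third channel (red) to full. */
    cvZero(img);
    for (i = 2; i < img->imageSize; i += 3)
        img->imageData[i] = 255;

    /* Show the result in a window until a key is pressed. */
    cvNamedWindow("red square", CV_WINDOW_AUTOSIZE);
    cvShowImage("red square", img);
    cvWaitKey(0);

    cvDestroyWindow("red square");
    cvReleaseImage(&img);
    return 0;
}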


It is worth noting that while most images and methods in OpenCV use or return 8-bit unsigned data (e.g. cvLoadImage always returns an IPL_DEPTH_8U image), this is not how OpenCV is written internally: imageData isn't an int or float array, it's a char pointer to the data within the IplImage.

It would seem that this is done for versatility, but it causes a little confusion the first time you use a 32-bit float image. Values in a 32F image are expected to lie between 0 and 1 (that is the convention used when they are displayed), but to a rather high degree of accuracy (as you would expect), so we have to scale our values accordingly. We also have to change the way the for loop is bounded: imageSize is measured in bytes, and as there are now four bytes per colour value (floats are four bytes each), the old loop would index four times as much data as the image actually holds - it runs off the end of the buffer a quarter of the way through and will most likely die with a segmentation fault. Instead we can use the image's width and height attributes, multiplying by 3 so that all channels are filled. Finally, imageData needs to be cast to a float pointer so that the values are stored in the correct format. The following code should clarify things.

int i;

/* width*height*3 float values in total; step over each BGR triplet. */
for (i = 0; i < img->width * img->height * 3; i += 3)
{
    ((float*)img->imageData)[i]   =  64/256.0;   /* blue  */
    ((float*)img->imageData)[i+1] = 196/256.0;   /* green */
    ((float*)img->imageData)[i+2] = 256/256.0;   /* red   */
}

The orange square program shows the complete version of the above.
Click here to see it.
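As an aside - and this is a sketch of my own, not part of the linked program - if you want to write such a float image out with cvSaveImage, which expects 8-bit data, you can scale it back up into an 8U image first:

IplImage *out = cvCreateImage(cvGetSize(img), IPL_DEPTH_8U, 3);

/* Map the 0..1 float values back onto 0..255 bytes. */
cvConvertScale(img, out, 255.0, 0);
cvSaveImage("orange_square.png", out);   /* file name is just an example */
cvReleaseImage(&out);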

Image Representations

There are several colour space conversions available within OpenCV, through
use of the cvCvtColor function:

cvCvtColor(source, destination, space_code);

Here, space_code names the source colour space followed by the desired colour space (CV_<source>2<destination>), but note that in each case the source and destination images must have the correct number of channels for their respective spaces. The possible codes are:

Simple:

  • CV_BGR2RGB
  • CV_RGB2BGR
  • CV_BGR2GRAY
  • CV_RGB2GRAY
  • CV_GRAY2BGR
  • CV_GRAY2RGB
CIE XYZ:
  • CV_BGR2XYZ
  • CV_RGB2XYZ
  • CV_XYZ2BGR
  • CV_XYZ2RGB
YCrCb JPEG (a.k.a. YCC):
  • CV_BGR2YCrCb
  • CV_RGB2YCrCb
  • CV_YCrCb2BGR
  • CV_YCrCb2RGB
HSV:
  • CV_BGR2HSV
  • CV_RGB2HSV
  • CV_HSV2BGR
  • CV_HSV2RGB
HLS:
  • CV_BGR2HLS
  • CV_RGB2HLS
  • CV_HLS2BGR
  • CV_HLS2RGB
CIE Lab:
  • CV_BGR2Lab
  • CV_RGB2Lab
  • CV_Lab2BGR
  • CV_Lab2RGB
CIE Luv:
  • CV_BGR2Luv
  • CV_RGB2Luv
  • CV_Luv2BGR
  • CV_Luv2RGB
Bayer (a pattern widely used in CCD and CMOS cameras):
  • CV_BayerBG2BGR
  • CV_BayerGB2BGR
  • CV_BayerRG2BGR
  • CV_BayerGR2BGR
  • CV_BayerBG2RGB
  • CV_BayerGB2RGB
  • CV_BayerRG2RGB
  • CV_BayerGR2RGB
However, whilst IplImage's attributes include a colorModel field, OpenCV completely ignores it when displaying or processing an image. It simply assumes BGR order, which can lead to strange-looking output if, say, an RGB-ordered image is displayed with cvShowImage.
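A minimal sketch that illustrates both the conversion call and the BGR assumption ('example.jpg' is again just a placeholder): converting a loaded (BGR) image to greyscale behaves as expected, while the RGB copy looks wrong when shown with cvShowImage because red and blue end up swapped.

IplImage *bgr  = cvLoadImage("example.jpg", 1);   /* loaded in BGR order */
IplImage *grey = cvCreateImage(cvGetSize(bgr), IPL_DEPTH_8U, 1);
IplImage *rgb  = cvCreateImage(cvGetSize(bgr), IPL_DEPTH_8U, 3);

/* Destination channel counts must match the target space:
   one channel for greyscale, three for RGB. */
cvCvtColor(bgr, grey, CV_BGR2GRAY);
cvCvtColor(bgr, rgb,  CV_BGR2RGB);

cvNamedWindow("grey", CV_WINDOW_AUTOSIZE);
cvNamedWindow("rgb",  CV_WINDOW_AUTOSIZE);

cvShowImage("grey", grey);
cvShowImage("rgb",  rgb);   /* displayed as if it were BGR - colours look swapped */
cvWaitKey(0);

cvReleaseImage(&bgr);
cvReleaseImage(&grey);
cvReleaseImage(&rgb);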
