Saturday, April 11, 2020

Computer Vision learning




Computer vision is a field that grew out of image processing. It covers not just image capture, formation, and restoration but also extracting information from images and, often, extrapolating from that information. This information usually relates to the objects present in the frame. Some of its applications are:

1. Object detection
2. Object tracking
3. Image classification/Segmentation
4. Motion oriented transformations
5. Face, iris, and smile detection
6. Depth estimation and 3D reconstruction
7. Optical Character recognition (OCR)

There are many other applications and fields as well, such as computational photography, automotive (autonomous vehicle control and safety), and medical imaging.

Resources to learn it: Book reference: Computer Vision: Algorithms and Applications by Richard Szeliski, the best place to learn it.

All of the applications mentioned above are mainly based on the relationship between image, geometry, and photometry. They use fundamental 2D and 3D primitives in their algorithms.

OpenCV is an open source computer vision library. It implements a very large number of algorithms and provides utilities for many image operations.

Image in OpenCV: An image is read as a matrix. The Mat class (the image object), together with functions such as imread, imwrite, and imshow, covers the usual image operations: read, write, display, access to image properties such as color, channels, shape, and size, and pixel manipulation. A Mat has a matrix header and a pointer to the matrix containing the pixels. Copying or assigning a Mat copies only the header; the pixel matrix is shared and reference counted, so it is released when the last Mat pointing to it goes away.
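The header-plus-shared-data idea can be illustrated without OpenCV. The following is a minimal sketch in plain C++, assuming a hypothetical `TinyMat` type that uses `std::shared_ptr` for the reference count; it is not how cv::Mat is actually implemented, only the same sharing behavior in miniature.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for cv::Mat: a small header (rows, cols) plus a
// reference-counted pointer to the pixel buffer.
struct TinyMat {
    int rows;
    int cols;
    std::shared_ptr<std::vector<unsigned char>> data;  // shared pixel storage

    TinyMat(int r, int c, unsigned char fill)
        : rows(r), cols(c),
          data(std::make_shared<std::vector<unsigned char>>(
              static_cast<std::size_t>(r) * static_cast<std::size_t>(c), fill)) {}

    // Row-major pixel access with bounds checking.
    unsigned char& at(int y, int x) {
        assert(y >= 0 && y < rows && x >= 0 && x < cols);
        return (*data)[static_cast<std::size_t>(y) * cols + x];
    }

    // Deep copy into a new buffer, in the spirit of Mat::clone().
    TinyMat clone() const {
        TinyMat out(rows, cols, 0);
        *out.data = *data;
        return out;
    }

    // Number of TinyMat objects currently sharing this pixel buffer.
    long refCount() const { return data.use_count(); }
};
```

With this sketch, `TinyMat b = a;` copies only the header, so writing through `b` is visible through `a`, while `a.clone()` gets its own buffer.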

Image Read, Modify and Display :

Mat cv::imread(const String& filename, int flags = IMREAD_COLOR)

IMREAD_GRAYSCALE = 0
IMREAD_COLOR = 1
IMREAD_UNCHANGED = -1  , loads the image as is, including the alpha channel

using namespace cv;
Mat img = imread(imagepath, 0);

Here img is the image object, which is allocated automatically; imread loads the image into it. imread returns color images in blue, green, red (BGR) channel order.

Image data type:



Modify:

img.at<uchar>(1,1)=128;
img(Range(0,1), Range(0,3)).setTo(128);


Display:

For float data, the expected value range is 0 to 1; for 8-bit integer data, it is 0 to 255.
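To make the two ranges concrete, here is a small sketch in plain C++ of mapping float pixels in [0, 1] to 8-bit values in [0, 255]. The function name `toDisplay8u` is hypothetical, and clamping out-of-range values is an assumption of this sketch rather than a statement about what imshow does internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Map float pixels in [0, 1] to 8-bit values in [0, 255].
// Values outside [0, 1] are clamped first (an assumption of this sketch).
std::vector<unsigned char> toDisplay8u(const std::vector<float>& src) {
    std::vector<unsigned char> dst;
    dst.reserve(src.size());
    for (float v : src) {
        float clamped = std::min(std::max(v, 0.0f), 1.0f);
        long scaled = std::lround(clamped * 255.0f);  // round to nearest
        assert(scaled >= 0 && scaled <= 255);
        dst.push_back(static_cast<unsigned char>(scaled));
    }
    return dst;
}
```

A float image whose values run from 0 to 255 would therefore display as almost entirely white unless it is divided by 255 first.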

Matplotlib and imshow:

1. plt::figure_size(600, 400);
   plt::imshow(img);
   auto imgPlot = displayImg(img);
   imgPlot

2. imshow("image", img);
   waitKey(0);
   destroyAllWindows();

Others:
a. namedWindow

void cv::namedWindow(const String& winname, int flags = WINDOW_AUTOSIZE)

b. waitKey

int cv::waitKey(int delay = 0)  , 0 waits indefinitely for a key press

c. destroyWindow

void cv::destroyWindow(const String& winname)

d. destroyAllWindows

void cv::destroyAllWindows()


Saving Image:

bool cv::imwrite(const String& filename, InputArray img,
      const std::vector<int>& params = std::vector<int>())

imwrite("../testImg.jpg", img);

Color space:

OpenCV reads a given image in BGR order by default, so you need to convert it from BGR to RGB before handing it to code that expects RGB, such as Matplotlib.

cv::cvtColor(img, imgRGB, cv::COLOR_BGR2RGB);  // use cv::COLOR_BGR2GRAY instead to get a grayscale image
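For 8-bit images, the BGR to RGB conversion mentioned above amounts to swapping the first and third byte of every pixel. A minimal sketch in plain C++, assuming an interleaved 3-channel buffer and a hypothetical function name `swapBgrRgb`:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Swap the first and third byte of every 3-byte pixel in an interleaved
// buffer. Applying it twice restores the original order.
void swapBgrRgb(std::vector<unsigned char>& pixels) {
    assert(pixels.size() % 3 == 0);  // whole pixels only
    for (std::size_t i = 0; i + 2 < pixels.size(); i += 3) {
        std::swap(pixels[i], pixels[i + 2]);
    }
}
```

The green channel stays where it is, which is why an image shown without this conversion looks right in its greens but has its reds and blues exchanged.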


Color Images

img.size() -> [width x height], i.e. cols x rows
img.channels() -> 3

Channel Operations : Split and Merge

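Mat Chnls[3];

split(img, Chnls);

Each channel can be viewed using displayImage(Chnls[i]) or imshow("channel", Chnls[i]). The channels can be combined back into one image with merge(Chnls, 3, merged), where merged is a Mat.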
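What split and its counterpart merge do to the pixel layout can be sketched in plain C++. The names `splitChannels`, `mergeChannels`, and `Plane` are hypothetical, and the sketch assumes an interleaved 3-channel, 8-bit buffer; it shows the data movement only, not OpenCV's implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Plane = std::vector<unsigned char>;  // one channel, row-major

// Interleaved B,G,R,B,G,R,... -> three separate planes.
std::array<Plane, 3> splitChannels(const std::vector<unsigned char>& interleaved) {
    assert(interleaved.size() % 3 == 0);
    std::array<Plane, 3> planes;
    for (std::size_t i = 0; i < interleaved.size(); ++i) {
        planes[i % 3].push_back(interleaved[i]);
    }
    return planes;
}

// Three equally sized planes -> interleaved buffer.
std::vector<unsigned char> mergeChannels(const std::array<Plane, 3>& planes) {
    assert(planes[0].size() == planes[1].size());
    assert(planes[1].size() == planes[2].size());
    std::vector<unsigned char> interleaved;
    interleaved.reserve(planes[0].size() * 3);
    for (std::size_t p = 0; p < planes[0].size(); ++p) {
        for (const Plane& plane : planes) {
            interleaved.push_back(plane[p]);
        }
    }
    return interleaved;
}
```

Splitting and then merging returns the original buffer, so a single channel can be edited in between and written back.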


Accessing color pixels

img.at<Vec3b>(0,0); --> returns the pixel's 3 bytes as a vector

[1,1,1] --> intensity values in B, G, R order

Modifying:

img.at<Vec3b>(0,0) = Vec3b(0,255,255);  // B, G, R: green plus red gives yellow

Region:

img(Range(0,2), Range(0,3)).setTo(Scalar(255,0,0));  // rows 0-1, columns 0-2 set to blue


Alpha channel :

Mat img = imread(imagepath, -1);  // IMREAD_UNCHANGED keeps the alpha channel

Mat imgChanl[4];

split(img, imgChanl); --> imgChanl[i] holds one channel (B, G, R, A), with intensity values ranging from 0 to 255


========================================================

Create, Crop Images:

Mat img = Mat(400, 600, CV_8UC4, Scalar(128,128,128,0));

Mat img1 = img.clone();

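Cropping:
Mat crop = img(Range(100, 200), Range(100, 400));
y (rows) --> 100 to 200
x (columns) --> 100 to 400
The end of each Range is exclusive, and crop shares pixel data with img; call clone() for an independent copy.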
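The indexing behind a crop can be sketched in plain C++. In the crop above, the first Range selects rows and the second selects columns, and each Range stops just before its end value, so the result is 100 rows by 300 columns. The function `cropCopy` below is hypothetical and works on a single-channel, row-major buffer; unlike the OpenCV expression, which returns a view onto the same pixels, it copies them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copy rows [y0, y1) and columns [x0, x1) out of a row-major,
// single-channel image with the given number of rows and columns.
std::vector<unsigned char> cropCopy(const std::vector<unsigned char>& src,
                                    int rows, int cols,
                                    int y0, int y1, int x0, int x1) {
    assert(rows > 0 && cols > 0);
    assert(src.size() == static_cast<std::size_t>(rows) * static_cast<std::size_t>(cols));
    assert(0 <= y0 && y0 <= y1 && y1 <= rows);
    assert(0 <= x0 && x0 <= x1 && x1 <= cols);
    std::vector<unsigned char> dst;
    dst.reserve(static_cast<std::size_t>(y1 - y0) * static_cast<std::size_t>(x1 - x0));
    for (int y = y0; y < y1; ++y) {
        for (int x = x0; x < x1; ++x) {
            dst.push_back(src[static_cast<std::size_t>(y) * cols + x]);
        }
    }
    return dst;
}
```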

Resizing an Image:

void cv::resize(InputArray src,   --> input image
      OutputArray dst,  --> output image
      Size dsize,       --> output size; if zero, it is computed from fx and fy
      double fx = 0,    --> scale factor along the horizontal axis
      double fy = 0,    --> scale factor along the vertical axis
      int interpolation = INTER_LINEAR)  --> bilinear, better quality than nearest neighbor

Common interpolation methods are nearest neighbor (INTER_NEAREST), bilinear (INTER_LINEAR), and bicubic (INTER_CUBIC).
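Bilinear interpolation estimates a value at a fractional pixel position by blending the four surrounding pixels. The following is a minimal sketch in plain C++ with a hypothetical function `bilinearSample`; it clamps coordinates at the image border as a simplifying assumption and does not reproduce OpenCV's exact coordinate mapping or rounding.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sample a row-major, single-channel image at fractional position (x, y)
// by blending the four nearest pixels. Coordinates are clamped to the image.
double bilinearSample(const std::vector<double>& img, int rows, int cols,
                      double x, double y) {
    assert(rows > 0 && cols > 0);
    assert(img.size() == static_cast<std::size_t>(rows) * static_cast<std::size_t>(cols));
    x = std::min(std::max(x, 0.0), cols - 1.0);
    y = std::min(std::max(y, 0.0), rows - 1.0);
    int x0 = static_cast<int>(std::floor(x));
    int y0 = static_cast<int>(std::floor(y));
    int x1 = std::min(x0 + 1, cols - 1);
    int y1 = std::min(y0 + 1, rows - 1);
    double wx = x - x0;  // horizontal weight of the right-hand pixels
    double wy = y - y0;  // vertical weight of the lower pixels
    auto px = [&](int r, int c) {
        return img[static_cast<std::size_t>(r) * cols + c];
    };
    double top = (1.0 - wx) * px(y0, x0) + wx * px(y0, x1);
    double bottom = (1.0 - wx) * px(y1, x0) + wx * px(y1, x1);
    return (1.0 - wy) * top + wy * bottom;
}
```

Nearest neighbor would instead return the value of the single closest pixel, which is cheaper but produces blocky results when enlarging.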

Image Masking:

A mask is used to separate the region of interest from the rest of the image.

Mat mask = Mat::zeros(image.size(), image.type());

mask(Range(40,100), Range(120, 180)).setTo(255);

inRange function : 

void cv::inRange(InputArray src,
     InputArray lower,
     InputArray upper,
     OutputArray dst)

Mat mask;
inRange(image, Scalar(0,0,150), Scalar(100,100,255), mask);
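The per-pixel test that inRange performs can be sketched in plain C++: a pixel gets 255 in the mask when every channel lies between the lower and upper bound (both inclusive), and 0 otherwise. The names `inRangeMask` and `Pixel` are hypothetical, and the sketch assumes 3-channel, 8-bit pixels in B, G, R order.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Pixel = std::array<unsigned char, 3>;  // B, G, R

// Build a single-channel mask: 255 where all three channels of the pixel
// lie within [lower, upper] (inclusive), 0 elsewhere.
std::vector<unsigned char> inRangeMask(const std::vector<Pixel>& image,
                                       const Pixel& lower, const Pixel& upper) {
    std::vector<unsigned char> mask;
    mask.reserve(image.size());
    for (const Pixel& px : image) {
        bool inside = true;
        for (std::size_t c = 0; c < px.size(); ++c) {
            inside = inside && lower[c] <= px[c] && px[c] <= upper[c];
        }
        mask.push_back(inside ? 255 : 0);
    }
    return mask;
}
```

With the bounds from the call above, (0,0,150) and (100,100,255), the mask keeps pixels that are strongly red with little blue or green.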