Thùng rác của Sea Otter: September 2010

Thursday, September 30, 2010

[CV][Descriptor] Shape Context

1. Authors

Shape Context is a shape-oriented descriptor proposed by Belongie and Malik in their 2000 paper [3].

2. Implementation

Shape context for an arbitrary object (Figure 1.a) is computed in the following steps:

- Find the edges of the object (interior edges as well as the outer boundary) using the Canny detector (Figure 1.c).

- Choose N sample points on the edges (interior or boundary alike). They can be any edge points and need not be interest points.

- For each sample point, compute the distance to the other N-1 points (Figure 1.d).

- Normalize: compute the median ($\lambda$, Figure 1.d) of the N x N distances above and divide each distance by this median.

- For each sample point, represent the vectors to the other N-1 points (normalized distance and angle) in log-polar space. The resulting histogram has $n_r$ bins along the vertical axis (log of radius) and $n_{\theta}$ bins along the horizontal axis (angle measured relative to the tangent at the sample point). The authors call this histogram the shape context. Figure 1.e shows a histogram with $n_r = 5$ and $n_{\theta} = 6$, so it contains 30 bins. The histogram is flattened by concatenating its rows into a vector with $n_r \times n_{\theta}$ elements.

- Stack the N histograms to obtain the shape context representation of the object (a matrix with N rows and $n_r \times n_{\theta}$ columns).
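The steps above (after edge detection and sampling) can be sketched in a few lines. This is only a minimal illustration in Python, not the authors' code: the function name `shape_context`, the assumption that tangent angles are supplied in radians, and the clamping of out-of-range radii to the first and last bin are choices made for this sketch. Sample points are assumed to be distinct.

```python
import math
from statistics import median


def shape_context(points, tangents, n_r=5, n_theta=12, r_min=0.125, r_max=2.0):
    """Return one flattened log-polar histogram (n_r * n_theta bins) per point."""
    # Pairwise distances between all sample points (N x N, zeros on the diagonal).
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Scale normalization: the median of the N x N distances.
    lam = median(d for row in dist for d in row)
    log_span = math.log(r_max / r_min)
    histograms = []
    for i, (xi, yi) in enumerate(points):
        hist = [0] * (n_r * n_theta)
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue
            # Log-radius bin; radii outside [r_min, r_max] are clamped.
            r = dist[i][j] / lam
            r_bin = math.floor(n_r * math.log(r / r_min) / log_span)
            r_bin = min(max(r_bin, 0), n_r - 1)
            # Angle measured relative to the tangent at point i.
            angle = (math.atan2(yj - yi, xj - xi) - tangents[i]) % (2 * math.pi)
            t_bin = min(int(angle / (2 * math.pi) * n_theta), n_theta - 1)
            hist[r_bin * n_theta + t_bin] += 1
        histograms.append(hist)
    return histograms


# Hypothetical sample points and tangent angles, for illustration only.
pts = [(0, 0), (2, 0), (3, 1), (1, 3), (0, 2)]
tan = [0.0, 0.5, 1.0, 1.5, 2.0]
sc = shape_context(pts, tan)
print(len(sc), len(sc[0]))  # 5 rows, 60 columns
```

Each row sums to N-1, since every other sample point falls into exactly one bin.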

Figure 1: Shape context computation and graph matching. (a, b) Original image pair. (c) Edges and tangents of the first letter with 50 sample points. (d) Vectors from a sample point (at left, middle) to all other points. The median distance $\lambda$ for all $N^2$ point pairs is shown at bottom for reference. (e) $(\log r, \theta)$ histogram of the vectors in (d), with 5 and 6 bins, respectively (dark = large value). (f) Correspondences found using the Hungarian method, with weights given by the sum of two terms: histogram dissimilarity and tangent angle dissimilarity. (g, h) The "shape contexts" for the two letters, formed by flattening and concatenating the histograms for all points in each image; each shape context has 50 rows, one per sample point, and 30 columns, one for each histogram bin.

Parameters:

Parameter	Meaning	Initial implementation value
N	The number of sample points	100
$n_{\theta}$	The number of angular bins	12 (360 degrees)
$n_r$	The number of log(radius) bins, covering $0.125\lambda$ to $2\lambda$; values below $0.125\lambda$ go to the first bin and values above $2\lambda$ go to the last bin	5

3. Properties

- Implementation complexity: easy (and few parameters)

- Invariance to transformations

	Illumination	Translation	Rotation	Scale	Affine
Shape Context		x	x	x

+ Translation invariance: distances are measured relative to the sample point.

+ Rotation invariance: angular bins are measured relative to the tangent at the sample point.

+ Scale invariance: distances are normalized by the median of the N x N distances.

4. Matching

- Given 2 objects with histograms $g_i(k)$ and $h_j(k)$, where i and j index the sample points of each object.

- Compute the matching cost between point i in object 1 and point j in object 2 to form an N x N matrix:

\[
C_{ij} = (1-\beta)C^S_{ij} + \beta C^A_{ij}
\]

(ordinarily, $\beta = 0.3$)
Here $C^S_{ij}$ is the shape dissimilarity (chi-square distance) and $C^A_{ij}$ is the local appearance dissimilarity (tangent angle).
\[
C^S_{ij} = \frac{1}{2}\sum_{k = 1}^{n_r \times n_{\theta}}\frac{[g_i(k) - h_j(k)]^2}{g_i(k) + h_j(k)}
\]

\[
C^A_{ij} = \frac{1}{2}\left\|(\cos{\theta_i}-\cos{\theta_j}, \sin{\theta_i}-\sin{\theta_j})\right\|
\]
($\theta_i$ and $\theta_j$ are the tangent angles at points i and j)
- The goal is to minimize the total matching cost under the constraint that the matching is one-to-one. That is, we look for a permutation $\pi(i)$ such that $\sum_i C_{i\pi(i)}$ is minimal.

--> Use the Hungarian method, with computational cost $O(N^3)$
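The cost matrix defined above is straightforward to compute. The sketch below is an illustration in Python under stated assumptions: the function names are hypothetical, bins where both histograms are zero are skipped to avoid dividing by zero, and the assignment is found by brute force over all permutations rather than by the Hungarian method, so it is only usable for very small N.

```python
import math
from itertools import permutations


def chi2_cost(g, h):
    """Chi-square distance between two histograms (empty bin pairs are skipped)."""
    return 0.5 * sum((gk - hk) ** 2 / (gk + hk)
                     for gk, hk in zip(g, h) if gk + hk > 0)


def tangent_cost(theta_i, theta_j):
    """Half the distance between the two unit tangent vectors."""
    return 0.5 * math.hypot(math.cos(theta_i) - math.cos(theta_j),
                            math.sin(theta_i) - math.sin(theta_j))


def cost_matrix(hists1, thetas1, hists2, thetas2, beta=0.3):
    """C[i][j] = (1 - beta) * shape cost + beta * tangent cost."""
    return [[(1 - beta) * chi2_cost(g, h) + beta * tangent_cost(ti, tj)
             for h, tj in zip(hists2, thetas2)]
            for g, ti in zip(hists1, thetas1)]


def best_assignment_bruteforce(cost):
    """Permutation pi minimizing sum_i cost[i][pi[i]]; O(N!), a stand-in for Hungarian."""
    n = len(cost)
    return min(permutations(range(n)),
               key=lambda pi: sum(cost[i][pi[i]] for i in range(n)))


# Toy data: object 2 is object 1 with its points reordered.
hists1 = [[3, 0, 0], [0, 3, 0], [0, 0, 3]]
thetas1 = [0.0, 1.0, 2.0]
hists2 = [hists1[2], hists1[0], hists1[1]]
thetas2 = [2.0, 0.0, 1.0]
pi = best_assignment_bruteforce(cost_matrix(hists1, thetas1, hists2, thetas2))
print(pi)  # (1, 2, 0)
```

For realistic N (50 to 100 sample points) the brute-force search has to be replaced by a proper Hungarian or other linear assignment solver.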

5. Evaluation

- A good shape descriptor for clean objects (especially synthetic objects and noise-free images): handwriting, silhouettes, logos.

- A good approach to scale and rotation invariance

- Hard to use on real images (because of noise) (I think so :D)

6. Source code

- Shape-matching framework by Yanirta

References:
[1] Shape Context on Wikipedia

[2] Log-polar space

[3] S. Belongie and J. Malik (2000). "Matching with Shape Contexts". IEEE Workshop on Content based Access of Image and Video Libraries (CBAIVL-2000).

[4] Another alternative to [3]

[5] Newton Petersen (4/25/2008)

Thursday, September 23, 2010

[CV][Detector] Harris Laplace algorithm

In [2], two implementations of the Harris-Laplace detector are proposed, as follows:

I. Standard Harris-Laplace detector (without extension):

1. Build the scale-space representation with $\sigma _n = s^n \sigma_0$

2. At each scale level, detect the points whose cornerness is a maximum among their 8 neighbours and exceeds a threshold:
\[
\det(\mu (x, \sigma _n)) - \alpha\, \mathrm{trace}^2(\mu(x, \sigma_n))>threshold_H
\]

Here, $\mu$ is the second moment matrix (or auto-correlation matrix):

\[
\mu(\textbf{x}, \sigma_I, \sigma_D) = \begin{bmatrix}\mu_{11} & \mu_{12}\\ \mu_{21} & \mu_{22} \end{bmatrix} = \sigma_D^2g(\sigma_I) \star \begin{bmatrix} L_x^2(\boldsymbol{x}, \sigma_D) & L_xL_y(\boldsymbol{x}, \sigma_D)\\ L_xL_y(\boldsymbol{x}, \sigma_D)& L_y^2(\boldsymbol{x}, \sigma_D) \end{bmatrix}
\]

Here the integration scale is $\sigma_I =\sigma_n$ and the differentiation scale is $\sigma_D= k\sigma_I$.

In detail, let L be the image at scale level n (the image I smoothed with a Gaussian of standard deviation $\sigma_n$).

- Compute the first derivatives $L_x$ and $L_y$ of L using Gaussian derivative kernels with $\sigma_D$.

- Compute the products $L_x^2 = L_x L_x$, $L_xL_y$, and $L_y^2 = L_y L_y$.

- Convolve $L_x^2$, $L_xL_y$, and $L_y^2$ with $g(\sigma_I)$ (that is, the derivative products are weighted by a Gaussian with $\sigma_I$) and multiply by $\sigma_D^2$.

- Compute the cornerness and test it against the threshold:
\[
\det(\mu (x, \sigma _n)) - \alpha\, \mathrm{trace}^2(\mu(x, \sigma_n))>threshold_H
\]
- Keep the points that are maxima among their 8 neighbours and exceed $threshold_H$ --> candidate point set

3. For each point in the candidate set, compute the Laplacian of Gaussian and keep the point if its response attains a maximum over scale and exceeds the Laplacian threshold. The scale-normalized Laplacian of Gaussian test is:
\[
\sigma_n^2 \left| L_{xx}(\boldsymbol{x}, \sigma_n) + L_{yy}(\boldsymbol{x}, \sigma_n) \right| > threshold_L
\]
- Compute the Laplacian of Gaussian with $\sigma_n$ at each candidate point and multiply the result by $\sigma_n^2$

Output: $(x, y, \sigma_n)$ - position and scale

Parameters:

Symbol	Meaning	Value
s	scale factor	1.2
$\sigma_0$	initial scale	1 (inferred from $\sigma_n = s^n$)
$\sigma_n = s^n$	sigma range	n = (1:17)
$threshold_H$	Harris function threshold	1000
$threshold_L$	Laplacian of Gaussian threshold	10
$\alpha$	in Harris function	0.06
k	$\sigma_D/\sigma_I$	0.6
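The selection logic of the detector can be sketched without any image-processing library, assuming the second moment matrix entries and the scale-normalized Laplacian responses have already been computed. The Python functions below are hypothetical helpers written for this note, with defaults taken from the parameter table; in particular, "maximum over scale" is interpreted here as a local maximum with respect to the two adjacent scale levels, which is an assumption.

```python
def scale_levels(s=1.2, sigma0=1.0, n_levels=17):
    """Scales sigma_n = s**n * sigma0 for n = 1..n_levels."""
    return [sigma0 * s ** n for n in range(1, n_levels + 1)]


def harris_cornerness(m11, m12, m22, alpha=0.06):
    """det(mu) - alpha * trace(mu)^2 for a symmetric 2x2 second moment matrix."""
    det = m11 * m22 - m12 * m12
    trace = m11 + m22
    return det - alpha * trace ** 2


def spatial_candidates(response, threshold_h=1000.0):
    """(row, col) positions that exceed threshold_h and all 8 neighbours."""
    found = []
    for r in range(1, len(response) - 1):
        for c in range(1, len(response[0]) - 1):
            v = response[r][c]
            neighbours = [response[r + dr][c + dc]
                          for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                          if (dr, dc) != (0, 0)]
            if v > threshold_h and v > max(neighbours):
                found.append((r, c))
    return found


def select_scales(norm_log, threshold_l=10.0):
    """Scale indices where the normalized |LoG| response peaks over scale.

    norm_log[n] holds sigma_n^2 * |Lxx + Lyy| at one candidate position.
    """
    return [n for n in range(1, len(norm_log) - 1)
            if norm_log[n] > threshold_l
            and norm_log[n] > norm_log[n - 1]
            and norm_log[n] > norm_log[n + 1]]


sigmas = scale_levels()
grid = [[0, 0, 0], [0, 2000, 0], [0, 0, 0]]  # toy cornerness map
print(spatial_candidates(grid))              # [(1, 1)]
print(select_scales([5, 12, 30, 18, 11]))    # [2]
```

A full implementation would compute `response` per scale level from the smoothed derivative products described above and then run `select_scales` at each spatial candidate.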

References:
[1] K. Mikolajczyk, K. and C. Schmid. "Scale and affine invariant interest point detectors". International Journal of Computer Vision. 2004

[2] K. Mikolajczyk's PhD thesis

Wednesday, September 22, 2010

[Theory][Gaussian] 2D Gaussian function and derivatives

The 2D Gaussian function is given by:

\[
G = \frac{1}{2\pi\sigma^2}e^{-\frac{(x^2 + y^2)}{2\sigma^2}}
\]

The first derivatives:

\[
G_x = -\frac{x}{2\pi\sigma^4}e^{-\frac{(x^2 + y^2)}{2\sigma^2}}
\]

\[
G_y = -\frac{y}{2\pi\sigma^4}e^{-\frac{(x^2 + y^2)}{2\sigma^2}}
\]

The second derivatives:

\[
G_{x^2} = \frac{x^2-\sigma^2}{2\pi\sigma^6}e^{-\frac{(x^2 + y^2)}{2\sigma^2}}
\]

\[
G_{y^2} = \frac{y^2-\sigma^2}{2\pi\sigma^6}e^{-\frac{(x^2 + y^2)}{2\sigma^2}}
\]

\[
G_{xy} = \frac{xy}{2\pi\sigma^6}e^{-\frac{(x^2 + y^2)}{2\sigma^2}}
\]

The third derivatives:

\[
G_{x^3} = \frac{3\sigma^2x-x^3}{2\pi\sigma^8}e^{-\frac{(x^2 + y^2)}{2\sigma^2}}
\]

\[
G_{y^3} = \frac{3\sigma^2y-y^3}{2\pi\sigma^8}e^{-\frac{(x^2 + y^2)}{2\sigma^2}}
\]

\[
G_{xxy} = \frac{y(\sigma^2 - x^2)}{2\pi\sigma^8}e^{-\frac{(x^2 + y^2)}{2\sigma^2}}
\]

\[
G_{xyy} = \frac{x(\sigma^2 - y^2)}{2\pi\sigma^8}e^{-\frac{(x^2 + y^2)}{2\sigma^2}}
\]

The fourth derivatives:

\[
G_{x^4} = \frac{x^4 - 6\sigma^2x^2 + 3\sigma^4}{2\pi\sigma^{10}}e^{-\frac{(x^2 + y^2)}{2\sigma^2}}
\]

\[
G_{y^4} = \frac{y^4 - 6\sigma^2y^2 + 3\sigma^4}{2\pi\sigma^{10}}e^{-\frac{(x^2 + y^2)}{2\sigma^2}}
\]

\[
G_{x^3y} = \frac{xy(x^2-3\sigma^2)}{2\pi\sigma^{10}}e^{-\frac{(x^2 + y^2)}{2\sigma^2}}
\]

\[
G_{xy^3} = \frac{xy(y^2-3\sigma^2)}{2\pi\sigma^{10}}e^{-\frac{(x^2 + y^2)}{2\sigma^2}}
\]

\[
G_{x^2y^2} = \frac{(x^2-\sigma^2)(y^2-\sigma^2)}{2\pi\sigma^{10}}e^{-\frac{(x^2 + y^2)}{2\sigma^2}}
\]

Laplacian of Gaussian:

\[
LoG = G_{x^2}+G_{y^2} = \frac{x^2 + y^2 - 2\sigma^2}{2\pi\sigma^6}e^{-\frac{(x^2+y^2)}{2\sigma^2}}
\]
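Closed-form derivatives like these are easy to get wrong by a sign or a power of $\sigma$, so a quick numerical check is useful. The sketch below, in Python with hypothetical function names, compares a few of the formulas against central finite differences; the test point and step size are arbitrary choices.

```python
import math


def gauss(x, y, sigma):
    return math.exp(-(x * x + y * y) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)


def gauss_x(x, y, sigma):
    return -x / sigma ** 2 * gauss(x, y, sigma)


def gauss_xx(x, y, sigma):
    return (x * x - sigma ** 2) / sigma ** 4 * gauss(x, y, sigma)


def gauss_xy(x, y, sigma):
    return x * y / sigma ** 4 * gauss(x, y, sigma)


def laplacian_of_gaussian(x, y, sigma):
    return (x * x + y * y - 2 * sigma ** 2) / sigma ** 4 * gauss(x, y, sigma)


def d_dx(f, x, y, sigma, h=1e-5):
    """Central finite difference of f with respect to x."""
    return (f(x + h, y, sigma) - f(x - h, y, sigma)) / (2 * h)


def d_dy(f, x, y, sigma, h=1e-5):
    """Central finite difference of f with respect to y."""
    return (f(x, y + h, sigma) - f(x, y - h, sigma)) / (2 * h)


x, y, sigma = 0.7, -0.3, 1.5
# Each printed value is the gap between the numeric and the closed-form derivative.
print(abs(d_dx(gauss, x, y, sigma) - gauss_x(x, y, sigma)))
print(abs(d_dx(gauss_x, x, y, sigma) - gauss_xx(x, y, sigma)))
print(abs(d_dy(gauss_x, x, y, sigma) - gauss_xy(x, y, sigma)))
```

By symmetry, $G_{y^2}$ is `gauss_xx(y, x, sigma)`, so the LoG formula can be checked as `gauss_xx(x, y, sigma) + gauss_xx(y, x, sigma)`.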

Tuesday, September 21, 2010

[CV][Descriptor] Histogram of Oriented gradients (HoG)

1. Authors, year, original paper

Proposed by Navneet Dalal and Bill Triggs (INRIA, France) in their 2005 paper [2] for the human detection problem.

2. Main idea of HoG

- HOG combines the strengths and addresses the weaknesses of Shape Context and SIFT to build a strong edge descriptor.

- The distinctive point is that Dalal uses a dense grid and computes a HOG at each grid vertex (the HOGs of two neighboring vertices may overlap), which describes the object's edge information well.

Successfully applied to:

- Human detection/recognition, object recognition

(to be extended)

3. HOG variants

There are two kinds of HOG: static HOG (for still images) and motion HOG (for video).

Static HOG
The authors introduce 4 variants: R-HOG (rectangular HOG), C-HOG (circular HOG), center-surround HOG, and R2-HOG. The static HOG variants differ in how cells are laid out within a block. Figure 1 shows two layouts: R-HOG (a) and C-HOG (C-HOG comes in two forms: the central cell divided into smaller cells (b) and undivided (c)).
Figure 1: HOG variants

- R-HOG (Rectangular HOG): a block is divided into a checkerboard grid, as in SIFT. Within each cell, the orientation vote of each pixel is weighted by a Gaussian.
- C-HOG (Circular HOG): the block is divided into cells by radius and angle. Bins have the same size in the angular direction, but in the radial direction their size increases with the radius. C-HOG is quite similar to Shape Context.
- Center-Surround HOG: this variant does not use Gaussian weighting, so cells are normalized only once, which makes it a fast-to-compute version of HOG.
- R2-HOG: besides the gradient, second-order derivatives are also computed, and the two resulting histograms are concatenated.

Motion HOG
Similar to static HOG, except that motion HOG replaces the gradient computation stage with two steps: compute the optical flow for two consecutive frames, then compute the differential flow of those two flow images.

4. R-HOG implementation and results

According to the authors' experiments, among the HOG variants R-HOG (together with C-HOG) gives the best and most stable results (see Figure 3), so only the R-HOG implementation is presented here.

Implementation of R-HOG ([3: pages 22, 23, 49])

a) Preprocessing:

- Gamma/color normalization: square-root gamma correction is applied to all 3 channels (the input is a 3-channel image)

- Gradient computation: the kernels $[-1, 0, 1]$ and $[-1, 0, 1]^T$ are used to compute the image derivatives for each color channel, and the one with the largest norm is taken as the gradient vector.

b) Block descriptor computation

Parameters:

+ $\eta \times \eta$: number of cells per block

+ $\varsigma \times \varsigma$: number of pixels per cell

+ $\beta$: number of histogram bins

b.1 The sub-window is divided into a grid (dense sampling). At each grid vertex in turn, place a block of $\eta\varsigma \times \eta\varsigma$ pixels centered at that vertex

b.2 The gradient of each pixel in the block is weighted by a Gaussian with $\sigma = 0.5 \times \varsigma \eta$

b.3 The (weighted) gradient magnitude of each pixel in each cell is voted into the orientation histogram of that cell (tri-linear interpolation is used during voting)

b.4 The orientation histograms of the cells in the block are concatenated

c) Block histogram normalization

c.1 Apply L2-Hys or L1-sqrt normalization to each block (the normalized block histogram is the HOG descriptor)

c.2 Concatenate the normalized histograms of all blocks to form the descriptor vector.
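Steps a) to c) can be illustrated for a single block as follows. This Python sketch is heavily simplified and is not the authors' implementation: it works on one grayscale channel, skips gamma correction and the Gaussian block weighting, interpolates votes over orientation only (not tri-linearly), uses unsigned orientations with 9 bins, and addresses the block by its top-left corner instead of its center. All function names and default values are assumptions made for the example.

```python
import math


def gradients(img):
    """Gradient magnitude and unsigned orientation using the [-1, 0, 1] kernels.

    Border pixels are left at zero to keep the sketch short.
    """
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.atan2(gy, gx) % math.pi  # fold into [0, pi]
    return mag, ang


def block_descriptor(mag, ang, top, left, cells=2, cell_size=8, n_bins=9):
    """Concatenated orientation histograms of the cells x cells cells of one block."""
    bin_width = math.pi / n_bins
    descriptor = []
    for cy in range(cells):
        for cx in range(cells):
            hist = [0.0] * n_bins
            for y in range(top + cy * cell_size, top + (cy + 1) * cell_size):
                for x in range(left + cx * cell_size, left + (cx + 1) * cell_size):
                    # Split the vote between the two nearest bin centres.
                    pos = ang[y][x] / bin_width - 0.5
                    lo = math.floor(pos)
                    frac = pos - lo
                    hist[lo % n_bins] += mag[y][x] * (1 - frac)
                    hist[(lo + 1) % n_bins] += mag[y][x] * frac
            descriptor.extend(hist)
    return descriptor


def l2_hys(v, clip=0.2, eps=1e-6):
    """L2-normalize, clip each value at `clip`, then L2-normalize again."""
    norm = math.sqrt(sum(a * a for a in v) + eps ** 2)
    clipped = [min(a / norm, clip) for a in v]
    norm = math.sqrt(sum(a * a for a in clipped) + eps ** 2)
    return [a / norm for a in clipped]


# Toy 16x16 image with a vertical edge in the middle.
img = [[0 if x < 8 else 100 for x in range(16)] for y in range(16)]
mag, ang = gradients(img)
hog_block = l2_hys(block_descriptor(mag, ang, top=0, left=0))
print(len(hog_block))  # 36 = 2 x 2 cells x 9 bins
```

The full descriptor repeats this for every (overlapping) block position in the sub-window and concatenates the results.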

Figure 2: HOG computation steps

Results:

For human detection, the descriptors rank as follows (best first): HOG > Haar-wavelet > PCA-SIFT > Shape Context (see Figure 3; the closer a curve is to the origin, the better). Figure 4 shows that the cells with the largest weights concentrate on the object boundary.

Figure 3: HOG results for human detection

Figure 4: HOG for human detection. The right-hand images (in each group of 3) show the orientations of the cells lying on the object boundary.

5. Properties of HOG

- Implementation difficulty: moderate (personal opinion :D)

- Invariance

	Illumination	Translation	Rotation	Scale	Affine
HOG	x
HOG+Sub-Window	x	x		x

x: supported

HOG is robust to illumination changes thanks to gamma normalization, contrast normalization, and the use of edge information.

6. Remarks (personal, need to be double-checked)

- Prefer R-HOG (easy to implement, good results)

- HOG's effectiveness really comes from combining the grid layout + overlapping blocks + each cell being represented in several blocks --> this makes it possible to represent the object boundary even when the boundary varies to some extent

Kinds of objects it can be applied to:

- Because it uses only edge information, HOG can be used for objects with large color variation inside the object (people wearing clothes, animals with differently colored fur, cars with different paint colors, etc.)

- It handles objects whose shape varies, provided the main shape is preserved, for example:

+ For pedestrians, arm poses can change but the overall shape still looks like this (not sure what to call it).

+ Learning an object from 1 fixed view

7. Comparison with SIFT and Shape Context

R-HOG vs SIFT

- Similarities: they are fairly similar: both divide the region into cells, compute an orientation histogram for each cell, and then normalize --> because both rely on histograms, both tolerate small offsets in the position of the described region.

- Differences:

HOG and SIFT actually serve two different problems (HOG describes a whole object and is used with sub-windows for object detection, while SIFT describes a point for matching under many transformations), which explains the basic differences between them:

+ SIFT normalizes by the dominant orientation (HOG does not) --> SIFT is rotation invariant (HOG is not)

+ SIFT is translation and scale invariant thanks to the interest point detector, while HOG is translation and scale invariant thanks to the sub-window

HOG vs Shape Context (to be updated)

8. Implementation, source code

- The authors' HoG implementation is provided with the INRIA Object Detection and Localization Toolkit

References:

[1] HoG on Wikipedia

[2] Navneet Dalal and Bill Triggs. "Histograms of Oriented Gradients for Human Detection". International Conference on Computer Vision & Pattern Recognition, June 2005.

[3] Navneet Dalal's PhD thesis (Chapter 3, 4)

Sunday, September 19, 2010

[CV][Descriptor] Local Binary Pattern

1. Motivation behind LBP

The idea of LBP is to combine two kinds of information: spatial structure and grayscale image contrast.

2. Authors

LBP was first introduced in 1996 by Timo Ojala (University of Oulu, Finland) in [2] for texture classification.

3. Popularity - application areas

Starting from texture classification, LBP has in recent years been extended to many other areas [3, 4], including:

- Texture classification

- Motion Analysis/Tracking/Background subtraction.

- Human detection/recognition, Object detection/recognition

- Biometrics: eye localization, iris recognition, finger recognition, palmprint recognition, gait recognition and facial age classification.

- LBP for faces divides the face into a non-overlapping grid, computes an LBP histogram for each grid cell, and concatenates them into a global histogram (Figure 1).

Figure 1: An example of using LBP for faces

- Image and video retrieval

- Scene Analysis

- Environmental Modeling

4. Variants - extensions

- Ojala's original LBP uses only the 8 points of a 3x3 neighborhood ($2^8$ labels)

- Extension to arbitrary sizes (Ojala, 2002). The notation (P, R) means P equally spaced points on a circle of radius R

Figure 2: LBP with 8 points and radius R. 1. The pattern applied at a center position c with $g_c = 70$. 2. The differences between the points on the pattern and the center point. 3. Values < 0 map to 0 and the rest to 1, giving the binary pattern 00001111.

A pattern like the one in Figure 2 has $2^8 = 256$ labels. Applying the pattern at any point in the image therefore returns one label. Scanning over all positions in the image yields an LBP histogram for that image. These histograms are usually normalized to [0,1].
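The basic 3x3 operator and the resulting histogram can be sketched as follows. This Python example is only an illustration: the function names, the clockwise bit order starting at the top-left neighbour, and the choice to skip border pixels are assumptions of the sketch; only the thresholding rule (neighbour minus center, negative gives 0) comes from the description above.

```python
def lbp_label(img, y, x):
    """8-bit LBP label of pixel (y, x) from its 3x3 neighbourhood."""
    center = img[y][x]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    label = 0
    for bit, (dy, dx) in enumerate(offsets):
        # Difference >= 0 gives bit 1, difference < 0 gives bit 0.
        if img[y + dy][x + dx] >= center:
            label |= 1 << bit
    return label


def lbp_histogram(img):
    """Normalized 256-bin histogram of LBP labels over all interior pixels."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_label(img, y, x)] += 1
    total = sum(hist)
    return [count / total for count in hist] if total else hist


# Toy patch: center 70 surrounded by darker pixels gives label 0.
patch = [[60, 60, 60],
         [60, 70, 60],
         [60, 60, 60]]
print(lbp_label(patch, 1, 1))  # 0
```

Dividing by the number of labelled pixels makes the bins sum to 1, which is one common way to normalize the histogram to [0,1].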

- Uniform patterns: based on the observation that some patterns occur far more often than others; this also reduces the length of the feature vector and groups patterns into rotation-invariant patterns. Specifically, a pattern is called uniform if the number of transitions (0-->1 or 1-->0) is at most 2, for example:

Pattern 00000000 (0 transitions) --> uniform

Pattern 01110000 (2 transitions) --> uniform

Pattern 11001111 (2 transitions) --> uniform

Pattern 11001001 (4 transitions) --> not uniform

Pattern 01010010 (6 transitions) --> not uniform

In addition, some patterns are merged together; the patterns of no interest can be merged into a single group.
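The uniformity test is a one-liner once the pattern is treated as circular (the last bit is compared with the first). A minimal Python sketch, with hypothetical function names, applied to the five examples above:

```python
def transitions(pattern):
    """Number of 0->1 or 1->0 changes in a circular binary string."""
    n = len(pattern)
    return sum(pattern[i] != pattern[(i + 1) % n] for i in range(n))


def is_uniform(pattern):
    """A pattern is uniform if it has at most 2 circular transitions."""
    return transitions(pattern) <= 2


for p in ["00000000", "01110000", "11001111", "11001001", "01010010"]:
    print(p, transitions(p), is_uniform(p))

# Count the uniform patterns among all 256 8-bit labels.
n_uniform = sum(is_uniform(format(v, "08b")) for v in range(256))
print(n_uniform)  # 58
```

With 58 uniform patterns plus one shared bin for everything else, the histogram shrinks from 256 to 59 bins.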

- VLBP (Volume Local Binary Pattern), proposed by Zhao and Pietikainen (2007), adds the time dimension T to the 2 spatial dimensions X and Y. VLBP is computed like LBP but in a 3-dimensional neighborhood around the center point. The problem with VLBP is that with P sample points the number of patterns is about $2^{3P + 2}$, which becomes very large when P is large. For that reason the two authors proposed LBP-TOP, which computes ordinary LBP on the 3 planes XY, XT, and YT and concatenates the results, so the number of patterns is only $3 \times 2^P$. In LBP-TOP, LBP is computed at each point on each plane, and the radius and number of points of the LBP pattern may differ from plane to plane. Scanning over the XY planes gives the LBP histogram for XY, and likewise for XT and YT, which yields 3 histograms. Normalizing each histogram separately and concatenating them gives the LBP-TOP histogram (Figures 3, 4). Both VLBP and LBP-TOP are introduced in [5].

Figure 3: Computing the LBP-TOP histogram

Figure 4: Image variation along the time axis T

- Invariant LBP: illumination (Li et al., 2007), scale and rotation (Ojala et al., 2001, 2002)

- Study of the important LBP patterns (Liao et al., 2009)

- Combination with other descriptors: CS-LBP (combined with SIFT, Heikkila et al., 2009), Gabor features (Tan et al., 2007; Wang et al., 2009)

- Color LBP (Maenpaa and Pietikainen, 2004)

5. Implementations

Implementations listed on the University of Oulu website [3]:

- LBP Matlab code by Authors

- Spatial-temporal LBP (LBP-TOP) (C++)

- LBP (C++) by Topi Maenpaa

6. When to use it

- When a fast-to-compute descriptor is needed

- Works well for texture, faces ([5] reports 95.19% with 2-fold and 96.26% with 10-fold cross-validation on the Cohn-Kanade facial expression video database), and motion (tracking)

- Using LBP alone, without the LBP histogram --> local descriptor; using the histogram instead gives a global descriptor.

7. Strengths and weaknesses of LBP

Strengths:

- Simple --> computationally efficient --> real time

- Tolerates monotonic changes, including:

+ Illumination changes (because it takes the difference between the sample points and the center point)

+ Variation in human skin color (across ethnic groups)

- LBP solves dynamic texture classification with 99.43% accuracy [5] on the DynTex dataset --> Dynamic Texture classification, like Texture Classification, is not a problem worth pursuing

References:

[1] LBP on Wikipedia

[2] Ojala, T., Pietikäinen, M. and Harwood, D. (1996), "A Comparative Study of Texture Measures with Classification Based on Feature Distributions". Pattern Recognition 29(1):51-59

[3] All LBP in one

[4] LBP on Scholarpedia

[5] Guoying Zhao, Matti Pietikainen. "Dynamic Texture Recognition Using Local Binary Patterns with an Application to Facial Expressions". IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007

Thursday, September 9, 2010

[Lang][Matlab] Change the input argument in Matlab function

A Google search turned up 3 ways:

1. Use a handle object in Matlab

- Create a handle class that wraps the argument you want to change

- Pass an instance of it to your function

2. Use the Inplace package, though it still has many limitations

3. Change the corresponding variable in the caller workspace

- For a scalar variable: use the two functions inputname and assignin [2]

function matlab_exp_increase(a)

A = inputname(1);

disp(['Name of variable a in caller : ' A]);

assignin('caller', A, a + 1);

end

Result:

>>a = 10

>>matlab_exp_increase(a)

a = 11

This approach works well when the variable is a scalar; if the variable is a structure, matrix, or array, you have to reassign the whole variable.

- For a structure, matrix, or array variable:

One solution is to use the evalin function, as in this code:

function matlab_exp_change_array(a)

A = inputname(1);

disp(['Name of variable a in caller : ' A]);

command = sprintf('%s(1) = 0', A);

disp(['Command to run in caller: ' command]);

evalin('caller', command);

end

In the caller (Command Window):

>> b= [1 , 2, 3]

b =

1 2 3

>>matlab_exp_change_array(b)

Name of variable a in caller : b

Command to run in caller: b(1) = 0

>>b

b =

0 2 3

[Download matlab code: matlab_exp_increase.m | matlab_exp_change_array.m]

[1] Mathworks

[2] http://www.dsprelated.com/groups/matlab/show/446.php

Wednesday, September 8, 2010

[Lang][Matlab] Structure Array in Matlab

1. Structure declaration:

s = struct('field1', value1, 'field2', value2, ...) % create a structure with the given fields and values

s = struct('field1', {}, 'field2', {}, ...) % create an empty (0-by-0) structure array with the given fields

s = struct % create a scalar structure with no fields

s = struct([]) % create an empty (0-by-0) structure with no fields

Example 1:

stuMyStruct = struct('color', {'red', 'green', 'yellow'}, 'type', 1)

Results:

stuMyStruct is structure array with:

stuMyStruct(1):

color: 'red'

type: 1

stuMyStruct(2):

color: 'green'

type: 1

stuMyStruct(3):

color: 'yellow'

type: 1

Example 2:

stuMyStruct = struct('color', {'red', 'green', 'yellow'}, 'type', {1, 2, 3})

Results:

stuMyStruct is structure array with:

stuMyStruct(1):

color: 'red'

type: 1

stuMyStruct(2):

color: 'green'

type: 2

stuMyStruct(3):

color: 'yellow'

type: 3

Example 3:

stuMyStruct.color

Results:

ans = red

ans = green

ans = yellow

2. Access structure

There are two ways to access structure fields: by a fixed field name or by a dynamic field name. Only the dynamic form is covered here, since the first one is straightforward.

Example 4:

stuMyStruct = struct('color', 'red', 'type', 1)

strNewField = 'weight';

stuMyStruct.(strNewField) = 100;

Results:

stuMyStruct =

color: 'red'

type: 1

weight: 100

Note: do not confuse this with the following case

stuMyStruct = struct('color', 'red', 'type', 1)

strNewField = 'weight';

stuMyStruct.strNewField = 100;

leads to this result

stuMyStruct =

color: 'red'

type: 1

strNewField : 100

3. Some other functions for field process

fieldnames: list the field names of a structure or structure array

setfield: set the value of a field (very powerful)

getfield: get the value of a specified field (very powerful)

isfield: check the existence of a fieldname

orderfields: reorder the fieldnames

rmfield: remove specified fieldname

isstruct: determine whether input is a structure array

References:

[1] Structure for Matlab

[2] Memory Managements for functions and variables

[3] Dynamic field name
