Friday, October 8, 2010

[CV][Detector] Ferns

1. Authors:

Proposed by Mustafa Ozuysal et al. in PAMI 2010 [1]. Ferns is a detector based on machine learning; specifically, it uses a Naive-Bayes combination of classifiers built from ferns.

2. Details
a) There are two phases: training and evaluation

- Training:
Select some images as model images, from which the training patches are extracted. Deform each model image many times and run the Harris detector on the original and on every deformed version to extract keypoints. Keep the top-n keypoints with the highest detection frequency across the original image and its deformed versions, and take a $K \times K$ patch at each of them (Figure 1). Each selected keypoint defines one of the N classes.

Figure 1: Some classes for training (From [1])
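The keypoint selection step in training can be sketched as a frequency count. This is only a minimal illustration, assuming the Harris detections of every deformed view have already been mapped back into the original image frame. The function name `select_stable_keypoints` and the `tolerance` grid used to merge nearby detections are hypothetical and not taken from [1].

```python
from collections import Counter


def select_stable_keypoints(detections_per_view, top_n, tolerance=2):
    """Return the top_n most frequently detected keypoints.

    detections_per_view: one list of (x, y) detections per view, already
    expressed in the original image's coordinates.
    tolerance: size in pixels of the grid cells used to merge nearby
    detections (a hypothetical choice, not a value from [1]).
    """
    counts = Counter()
    for detections in detections_per_view:
        # A set ensures each grid cell is counted at most once per view.
        cells = {(int(x // tolerance), int(y // tolerance)) for x, y in detections}
        counts.update(cells)
    # Report each cell by its corner position together with its frequency.
    return [((cx * tolerance, cy * tolerance), freq)
            for (cx, cy), freq in counts.most_common(top_n)]
```

Each returned position would then be the center of one $K \times K$ training patch, i.e. one class.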

The affine deformation used to create the image variants is given by:

\[
R_{\theta}R_{-\phi}\text{diag}(\lambda _1, \lambda _2) R_{\phi}
\]
where $R_{\theta}$ and $R_{\phi}$ are rotations by the angles $\theta$ and $\phi$ respectively (so $R_{-\phi}$ is the inverse of $R_{\phi}$), and $\text{diag}(\lambda _1, \lambda _2)$ scales the two axes. One way to build the training set for each class is to randomly generate 30 affine deformations per degree of rotation.
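As a rough sketch, the decomposition $R_{\theta}R_{-\phi}\text{diag}(\lambda _1, \lambda _2) R_{\phi}$ can be built with plain 2x2 matrices. The function names are made up for illustration, and sampling the parameters uniformly from $[0:2\pi]$ and $[0.6:1.5]$ is an assumption; the post only gives the ranges.

```python
import math
import random


def rotation(angle):
    """2x2 rotation matrix for the given angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return ((c, -s), (s, c))


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def affine_deformation(theta, phi, lam1, lam2):
    """Return R_theta * R_{-phi} * diag(lam1, lam2) * R_phi."""
    scale = ((lam1, 0.0), (0.0, lam2))
    left = matmul(rotation(theta), rotation(-phi))
    return matmul(left, matmul(scale, rotation(phi)))


def random_deformation(rng):
    """Draw one deformation; uniform sampling is an assumption."""
    theta = rng.uniform(0.0, 2.0 * math.pi)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    lam1 = rng.uniform(0.6, 1.5)
    lam2 = rng.uniform(0.6, 1.5)
    return affine_deformation(theta, phi, lam1, lam2)
```

With $\lambda _1 = \lambda _2 = 1$ the two $\phi$ rotations cancel and the result is the pure rotation $R_{\theta}$, which is a quick sanity check for the matrix order.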

- Evaluation:
Run the trained detector on a new image that contains all or part of a training image. Each patch detected in the new image is assigned the class label of the corresponding patch in the training image.

b) Naive-Bayes combination of classifiers method:
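Let $c_i$ denote the $i^{th}$ class and let $f_1, f_2, ..., f_N$ be the binary features computed on the patch to classify (in the formulas of this section, N counts features, not classes). The predicted class is:
\[
\hat{c_i} = \operatorname*{argmax}_{c_i} P(C = c_i | f_1, f_2, ..., f_N)
\]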
Apply Bayes' formula:
\[
P(C = c_i | f_1, f_2, ..., f_N) = \frac{P(f_1, f_2, ..., f_N|C = c_i) P(C = c_i)}{P(f_1, f_2, ..., f_N)}
\]
Assuming a uniform prior $P(C = c_i)$, and noting that the denominator $P(f_1, f_2, ..., f_N)$ is the same for all classes, the problem reduces to finding:
\[
\hat{c_i} = \operatorname*{argmax}_{c_i} P(f_1, f_2, ..., f_N|C = c_i)
\]
In this formula, each feature $f_j$ is a binary test that compares the intensities $I_{j,1}$ and $I_{j,2}$ of the patch I at two pixel locations $d_{j,1}$ and $d_{j,2}$:

\[
f_j = \left\{\begin{matrix} 1 & \text{if } I_{j, 1} < I_{j, 2} \\ 0 & \text{otherwise} \end{matrix} \right.
\]
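A binary feature of this kind is a single intensity comparison, as the small sketch below shows. Here a patch is a nested list indexed as `patch[row][col]`. Drawing the location pairs at random is an assumption, and the helper names are hypothetical.

```python
import random


def random_location_pairs(num_features, patch_size, rng):
    """Draw one pair of (row, col) locations per feature (assumed random)."""
    pairs = []
    for _ in range(num_features):
        d1 = (rng.randrange(patch_size), rng.randrange(patch_size))
        d2 = (rng.randrange(patch_size), rng.randrange(patch_size))
        pairs.append((d1, d2))
    return pairs


def binary_feature(patch, d1, d2):
    """Return 1 if the intensity at d1 is lower than the one at d2, else 0."""
    return 1 if patch[d1[0]][d1[1]] < patch[d2[0]][d2[1]] else 0


def evaluate_features(patch, pairs):
    """Evaluate every binary test on the patch, in order."""
    return [binary_feature(patch, d1, d2) for d1, d2 in pairs]
```

Because each test only reads two pixels, evaluating all features on a patch costs a handful of comparisons and needs no descriptor computation.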
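Group the N features into M groups of size $S = \frac{N}{M}$. Each group is called a fern. The features inside a fern are modeled jointly, while the ferns are assumed to be independent of one another, so:
\[
P(f_1, f_2, ..., f_N| C= c_i) = \prod_{k=1}^M P(F_k|C = c_i)
\]
where $F_k$ is the $k^{th}$ fern.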
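The product over ferns can be sketched as a small classifier that stores one histogram per fern and per class and adds up log-probabilities at test time. This is a minimal sketch under stated assumptions, not the authors' implementation. The class name `FernClassifier`, the random choice of pixel pairs, and the add-one smoothing of the counts are all assumptions made for this example.

```python
import math
import random


class FernClassifier:
    def __init__(self, num_classes, num_ferns, fern_size, patch_size, seed=0):
        rng = random.Random(seed)
        self.num_classes = num_classes
        # Each fern holds S pairs of pixel locations ((r1, c1), (r2, c2)).
        self.ferns = [
            [tuple((rng.randrange(patch_size), rng.randrange(patch_size))
                   for _ in range(2))
             for _ in range(fern_size)]
            for _ in range(num_ferns)
        ]
        bins = 2 ** fern_size
        # counts[k][c][v]: class-c training patches whose fern k took value v.
        # Starting every bin at 1 is add-one smoothing (an assumption).
        self.counts = [[[1] * bins for _ in range(num_classes)]
                       for _ in range(num_ferns)]
        self.totals = [[bins] * num_classes for _ in range(num_ferns)]

    def fern_value(self, patch, fern):
        """Pack the S binary tests of one fern into an integer in [0, 2**S)."""
        value = 0
        for (r1, c1), (r2, c2) in fern:
            value = (value << 1) | (1 if patch[r1][c1] < patch[r2][c2] else 0)
        return value

    def train(self, patch, label):
        """Update every fern's histogram with one labeled patch."""
        for k, fern in enumerate(self.ferns):
            self.counts[k][label][self.fern_value(patch, fern)] += 1
            self.totals[k][label] += 1

    def classify(self, patch):
        """Return the class maximizing the sum over ferns of log P(F_k | c)."""
        scores = [0.0] * self.num_classes
        for k, fern in enumerate(self.ferns):
            v = self.fern_value(patch, fern)
            for c in range(self.num_classes):
                scores[c] += math.log(self.counts[k][c][v] / self.totals[k][c])
        return max(range(self.num_classes), key=scores.__getitem__)
```

Summing logarithms instead of multiplying probabilities gives the same argmax and avoids numerical underflow when M is large.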


c) Parameters:
Parameter | Meaning | Value
S | # features per fern | 11
M | # groups (ferns) | 30-50
$K \times K$ | patch size for training and evaluation | 32x32
$\theta$ | angle of deformation | $[0:2\pi]$
$\phi$ | angle of deformation | $[0:2\pi]$
$\lambda _1, \lambda _2$ | scaling factors of the deformation | $[0.6:1.5]$
N | the number of classes | (1500)

3. Conclusion

Novel idea!

Advantages:
- No descriptor and no separate matching step: the class label returned by the detector is itself the match ID.
- 1000 times faster and more accurate than SIFT (Figure 2). Training is also fast (5 minutes on a MacBook Pro laptop for 200 classes), thanks to the Naive-Bayes formulation.
- Invariance depends on the training set: including affine-deformed and illumination-varied images makes the detector invariant to affine and illumination changes.
- Can run on embedded systems.

Disadvantages:
- Only works on images related to a training image (images that contain the training image or are part of it).

Figure 2: Ferns achieve higher accuracy than SIFT (From [1])

4. Applied Areas:
SLAM, wide-baseline matching, 3D objects, panorama and 3D annotation, and hand-held devices.
- Panorama annotation: given an annotated panorama image and an input image (a patch of the panorama), automatically annotate the input image (Figure 3).
- 3D annotation: given an annotated 3D model of a scene or object and an input image, automatically annotate the input image (Figure 3).
Figure 3: Panorama and 3D annotation applications (From [1])

5. Source code
Fern demo on EPFL

References:
[1] Mustafa Ozuysal, Michael Calonder, Vincent Lepetit and Pascal Fua. "Fast Keypoint Recognition Using Random Ferns". PAMI, 2010.
