1. Authors:
Proposed by Mustafa Ozuysal et al. in PAMI 2010 [1]. It is a keypoint detector based on machine learning, specifically a Naive-Bayes combination of classifiers built from ferns.
2. Details
a) There are two phases: training and evaluation
- Training:
Select some images as model images (to extract patches for training). Deform each model image many times and run the Harris detector on the original image and on every deformed version to extract keypoints. Keep the top-N keypoints with the highest detection frequency across the original image and its deformed versions, and take a $K \times K$ patch at each of them. Each kept keypoint defines one class, giving N classes (Figure 1).

Figure 1: Some classes for training (From [1])
The affine deformation used to create the variant images is given by:
\[
R_{\theta}R_{-\phi}\text{diag}(\lambda _1, \lambda _2) R_{\phi}
\]
where $R_{\theta}$ and $R_{\phi}$ are rotations by angles $\theta$ and $\phi$ respectively, and $\lambda _1$, $\lambda _2$ are scale factors. One way to build the training set for each class is to generate 30 random affine deformations per degree of rotation.
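As a rough illustration of the deformation above, the sketch below samples one random affine matrix. It assumes the decomposition ends with a rotation by $\phi$ (that is, $R_{\theta}R_{-\phi}\text{diag}(\lambda _1, \lambda _2)R_{\phi}$) and that all parameters are drawn uniformly from the ranges listed in the parameter table. The helper names (`rotation`, `matmul`, `random_affine`, `determinant`) are made up for this example and do not come from [1].

```python
import math
import random

def rotation(angle):
    """2x2 rotation matrix for an angle given in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def random_affine(rng, scale_range=(0.6, 1.5)):
    """Sample A = R_theta * R_{-phi} * diag(l1, l2) * R_phi (assumed form)."""
    theta = rng.uniform(0.0, 2.0 * math.pi)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    l1 = rng.uniform(*scale_range)
    l2 = rng.uniform(*scale_range)
    scale = [[l1, 0.0], [0.0, l2]]
    a = matmul(rotation(theta), rotation(-phi))
    a = matmul(a, scale)
    return matmul(a, rotation(phi))

def determinant(m):
    """Determinant of a 2x2 matrix; equals l1 * l2 for the matrix above."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Usage: one random deformation matrix from a seeded generator.
A = random_affine(random.Random(0))
print(A, determinant(A))
```

Because rotations have determinant 1, the determinant of the sampled matrix is just $\lambda _1 \lambda _2$, which gives a quick sanity check on the sampled scales.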
- Evaluation:
Run the trained detector on a new image (containing all or part of a training image). Each patch detected in the new image is assigned the class label of the corresponding patch in the training image.
b) Naive-Bayes combination of classifiers method:
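$c_i$ is the $i^{th}$ of the N classes, and $f_1, f_2, ..., f_N$ are binary features computed on the patch (in the feature index, N counts features, not classes). The classifier picks the most probable class:
\[
\hat{c_i} = \underset{c_i}{\operatorname{argmax}} \; P(C = c_i | f_1, f_2, ..., f_N)
\]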
Apply Bayes' formula:
\[
P(C = c_i | f_1, f_2, ..., f_N) = \frac{P(f_1, f_2, ..., f_N|C = c_i) P(C = c_i)}{P(f_1, f_2, ..., f_N)}
\]
Assuming a uniform prior $P(C = c_i)$, and since the denominator $P(f_1, f_2, ..., f_N)$ is the same for all classes, the problem reduces to finding:
\[
\hat{c_i} = \underset{c_i}{\operatorname{argmax}} \; P(f_1, f_2, ..., f_N|C = c_i)
\]
In this formula, each feature $f_j$ depends on the intensities of the patch I at two pixel locations $d_{j,1}$ and $d_{j,2}$:
\[
f_j = \left\{\begin{matrix} 1 & \text{if } I(d_{j,1}) < I(d_{j,2}) \\ 0 & \text{otherwise} \end{matrix} \right.
\]
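A minimal sketch of these binary tests follows, assuming the comparison is $I(d_{j,1}) < I(d_{j,2})$ and that a patch is stored as a list of rows of intensities. The function name `binary_features` and the `(row, column)` encoding of the locations are choices made for this example only.

```python
def binary_features(patch, pairs):
    """Evaluate f_j = 1 if I(d_j1) < I(d_j2), else 0, for every location pair.

    patch: list of rows of pixel intensities.
    pairs: list of ((r1, c1), (r2, c2)) pixel locations, one pair per feature.
    """
    return [1 if patch[r1][c1] < patch[r2][c2] else 0
            for (r1, c1), (r2, c2) in pairs]

# Usage on a toy 4x4 patch whose intensity grows from top-left to bottom-right.
patch = [[r * 4 + c for c in range(4)] for r in range(4)]
pairs = [((0, 0), (3, 3)), ((3, 3), (0, 0)), ((1, 1), (1, 1))]
print(binary_features(patch, pairs))  # [1, 0, 0]
```

Each feature costs a single intensity comparison, which is why evaluating many of them per patch stays cheap.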
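Group the features into M groups of size $S = \frac{N}{M}$. Each group is called a fern. Features within a fern are modeled jointly, while different ferns are assumed to be independent, so the likelihood factorizes as:
\[
P(f_1, f_2, ..., f_N| C= c_i) = \prod_{k=1}^M P(F_k|C = c_i)
\]
where $F_k$ is the $k^{th}$ fern.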
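To make the fern combination concrete, here is a small self-contained sketch under several assumptions of its own: test locations are chosen at random, each fern's binary tests are packed into one integer, class-conditional probabilities are estimated from counts initialized to 1 (so that no probability is zero), and log-probabilities are summed instead of multiplying probabilities. The class `FernClassifier` and the synthetic data in `make_patch` are invented for this example and are not the authors' code.

```python
import math
import random

class FernClassifier:
    def __init__(self, num_classes, num_ferns, fern_size, patch_size, seed=0):
        rng = random.Random(seed)
        self.num_classes = num_classes
        self.table_size = 2 ** fern_size  # possible outcomes of one fern

        def loc():
            return (rng.randrange(patch_size), rng.randrange(patch_size))

        # ferns[k] holds the S location pairs of fern k.
        self.ferns = [[(loc(), loc()) for _ in range(fern_size)]
                      for _ in range(num_ferns)]
        # counts[k][c][v]: how often fern k produced value v for class c.
        # Starting at 1 is an assumed smoothing choice.
        self.counts = [[[1] * self.table_size for _ in range(num_classes)]
                       for _ in range(num_ferns)]

    def fern_value(self, patch, fern):
        """Pack the fern's binary tests into one integer in [0, 2**S)."""
        value = 0
        for (r1, c1), (r2, c2) in fern:
            value = (value << 1) | (1 if patch[r1][c1] < patch[r2][c2] else 0)
        return value

    def train(self, patch, label):
        for k, fern in enumerate(self.ferns):
            self.counts[k][label][self.fern_value(patch, fern)] += 1

    def classify(self, patch):
        """Return argmax over classes of sum_k log P(F_k | C = c)."""
        scores = [0.0] * self.num_classes
        for k, fern in enumerate(self.ferns):
            v = self.fern_value(patch, fern)
            for c in range(self.num_classes):
                row = self.counts[k][c]
                scores[c] += math.log(row[v] / sum(row))
        return max(range(self.num_classes), key=scores.__getitem__)

def make_patch(label, rng, size=8):
    """Synthetic patch: class 0 brightens left to right, class 1 right to left."""
    return [[(c if label == 0 else size - 1 - c) * 10 + rng.uniform(-2, 2)
             for c in range(size)] for _ in range(size)]

def train_demo(seed=1):
    """Train a tiny two-class classifier on synthetic gradient patches."""
    rng = random.Random(seed)
    clf = FernClassifier(num_classes=2, num_ferns=10, fern_size=4, patch_size=8)
    for _ in range(50):
        for label in (0, 1):
            clf.train(make_patch(label, rng), label)
    return clf

clf = train_demo()
print(clf.classify(make_patch(0, random.Random(5))))
print(clf.classify(make_patch(1, random.Random(6))))
```

Summing logarithms is equivalent to taking the product over ferns and avoids numerical underflow when M is large.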
c) Parameters:
| Parameter | Meaning | Value |
|---|---|---|
| S | # features per fern | 11 |
| M | # groups (ferns) | 30-50 |
| $K \times K$ | patch size for training and evaluation | 32x32 |
| $\theta$ | rotation angle of the deformation | $[0:2\pi]$ |
| $\phi$ | rotation angle of the deformation | $[0:2\pi]$ |
| $\lambda _1, \lambda _2$ | scale factors of the deformation | $[0.6:1.5]$ |
| N | number of classes | (1500) |
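A quick back-of-the-envelope computation shows what these values imply for the size of the probability tables. It assumes one table entry per class, per fern, and per possible fern outcome, and picks M = 30 and 1500 classes from the table as an example. The function name `fern_table_stats` is made up for this sketch.

```python
def fern_table_stats(features_per_fern, num_ferns, num_classes):
    """Sizes implied by S, M and the number of classes.

    A fern of S binary features has 2**S possible outcomes, and the
    classifier is assumed to store one probability per
    (fern, class, outcome) triple.
    """
    outcomes = 2 ** features_per_fern
    return {
        "total_features": features_per_fern * num_ferns,
        "outcomes_per_fern": outcomes,
        "table_entries": num_ferns * num_classes * outcomes,
    }

stats = fern_table_stats(features_per_fern=11, num_ferns=30, num_classes=1500)
print(stats)
# {'total_features': 330, 'outcomes_per_fern': 2048, 'table_entries': 92160000}
```

The table count grows exponentially with S but only linearly with M, so S is the parameter that dominates memory use.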
3. Conclusion
A novel idea!
Advantages:
- No descriptor and no matching procedure (the class returned after detection serves as the match ID)
- 1000 times faster and more accurate than SIFT (Figure 2). Training is also fast (5 minutes for 200 classes on a MacBook Pro laptop), thanks to the Naive-Bayes combination.
- Invariance depends on the training set (a training set containing affine-deformed and illumination-varied images makes the detector affine and illumination invariant)
- Runs on embedded systems
Disadvantages:
- Only works on images related to a training image (the new image contains, or is part of, a training image)
4. Applied Areas:
SLAM, wide-baseline matching, 3D objects, panorama and 3D annotation, and hand-held devices
- Panorama annotation: given an annotated panorama image and an input image (a part of the panorama), automatically annotate the input image (Figure 3).
- 3D annotation: given an annotated 3D model of a scene or object and an input image, automatically annotate the input image (Figure 3).
Figure 3: panorama and 3D annotation application (From [1])
5. Source code
Fern demo on EPFL
References:
[1] Mustafa Ozuysal, Michael Calonder, Vincent Lepetit and Pascal Fua. "Fast Keypoint Recognition Using Random Ferns". PAMI. 2010


