Devon's Object Recognition Research Log

Devonator
Administrator

Posts: 47

Post by Devonator on Dec 7, 2012 17:55:50 GMT -5

Since I've decided to take on object recognition for this project, I'm going to document all my findings and decisions here. This benefits everyone, including those not involved in the decisions. We're all smart, and each team member should know why we chose something and what is good or bad about it.

The process: Determine the state of the art in object recognition methods, find their pros and cons, and post them here. I will then present them to the others involved and to the entire team so they can decide which one we use.

Once we decide, I will research the improvements being made to that method and the past methods it builds on, then begin implementing it. This will give us a concrete implementation.

Terms I ran into that need to be defined/understood:

RGB-D Kernel Descriptors
Support Vector Machines (SVMs)

Last Edit: Dec 7, 2012 22:42:46 GMT -5 by Devonator

Post by Devonator on Dec 7, 2012 19:44:26 GMT -5

Format for information mined from research papers

Reference:

Purpose:

Proposal:

Applies to what:

Theory or Application:

What does it improve on:

Pros:

Cons:

Possible issues:

Last Edit: Dec 7, 2012 19:44:37 GMT -5 by Devonator

Post by Devonator on Dec 7, 2012 20:07:54 GMT -5

Reference:

Link to PDF: dl.acm.org/ft_gateway.cfm?id=2396359&ftid=1311354&dwn=1&CFID=154130435&CFTOKEN=59695977

Large-Scale Simultaneous Multi-Object Recognition and Localization via Bottom Up Search-Based Approach, Chun-Che Wu et al., National Taiwan University, 2012

Purpose:

[li] multiple object recognition and localization over large-scale object classes

Proposal:

[/li][li] Proposes a bottom-up, search-based approach that localizes grid-based search candidates using a Markov Random Field (MRF).

Results:

[/li][li] The proposed approach recognizes and localizes multiple objects simultaneously, which reduces response time while preserving accuracy.

[/li][li] Experimental results show a 40% relative improvement over the state-of-the-art bag-of-words model.

[/li][li] Propose a bottom-up framework to recognize multiple objects
[/li][li] Simultaneous recognition of multiple objects by MRF

Applies to what:

[/li][li] Object recognition: extracting and localizing multiple objects in a single image by matching against object databases. Can be used for sorting objects in images and removing noise or uninteresting objects.

Theory or Application:

[/li][li] Both

What does it improve on:

[/li][li] The bag-of-words model for object extraction and classification: it moves from recognizing one object per image to recognizing several.

[/li][li] Efficient Subwindow Search (ESS) was the first proposal for solving multi-object recognition. It uses a top-down approach.
[/li][li] ESS iteratively finds a better boundary until convergence.

[/li][li] Adaptive Window Search (AWS) was proposed next. It works like this: first, find the most likely object in the image and use spatial verification to locate that known object. Then, based on the found object, build a template-based window for further search.

Pros:

[/li][li] Grid-based recognition and searching using Markov Random Fields.
[/li][li] Can do multi-object queries with a single image.
[/li][li] Combines approaches of past techniques (see AWS and ESS).

Pros of bottom-up multi-object recognition:

[/li][li] Suppresses the effect of noise, since the image as a whole contains a large number of noisy features.
[/li][li] With local information from each grid, multiple objects can be recognized concurrently.

Cons:

Possible issues:

How to do Multi-Object recognition:

Large-scale grid-based search and scoring:

[/li][li] First, divide the query image into grids.
[/li][li] Calculate similarity scores for each grid.
[/li][li] Aggregate each candidate's grid similarity scores to form its score distribution.
[/li][li] All candidates compete for ownership of each grid via the MRF.
[/li][li] After optimizing the object partition, calculate the final score of each candidate.
[/li][li] Check spatial consistency between the query and the dataset candidates.

Note: The similarity score is the intersection of normalized bag-of-words histograms.
Note: An inverted file is used as the indexing structure for efficient similarity measurement between the query and large-scale object class databases.
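
To make the grid scoring step concrete for myself, here is a minimal sketch of histogram intersection over an inverted file. This is my own illustration, not the paper's code: the function names, the sparse `{word_id: count}` histogram layout, and the toy data are all assumptions, and I picked L1 normalization because the notes only say "normalized".

```python
from collections import defaultdict

def normalize(hist):
    """L1-normalize a sparse visual-word histogram {word_id: count}."""
    total = sum(hist.values())
    return {w: c / total for w, c in hist.items()} if total else {}

def build_inverted_file(database):
    """Map each visual word to the (candidate, weight) pairs that contain it."""
    inverted = defaultdict(list)
    for name, hist in database.items():
        for word, weight in normalize(hist).items():
            inverted[word].append((name, weight))
    return inverted

def grid_scores(grid_hist, inverted):
    """Histogram-intersection score of one grid against every candidate.

    Only candidates sharing at least one visual word with the grid are
    touched, which is the point of the inverted file.
    """
    scores = defaultdict(float)
    for word, q_weight in normalize(grid_hist).items():
        for name, d_weight in inverted.get(word, []):
            scores[name] += min(q_weight, d_weight)
    return dict(scores)

# Toy data (hypothetical): two database objects and two query grids.
database = {
    "mug": {0: 2, 1: 2},
    "book": {2: 3, 3: 1},
}
grids = [
    {0: 1, 1: 1},  # shares words only with "mug"
    {2: 1, 3: 1},  # shares words only with "book"
]

inverted = build_inverted_file(database)
score_table = [grid_scores(g, inverted) for g in grids]
print(score_table)
```

Each entry of `score_table` is one grid's score per candidate; these per-grid scores are what the candidates would then compete with in the MRF step.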

Multi-object recognition by MRF:

[/li][li] Let the candidates compete for grid ownership using the grid similarity scores.
[/li][li] Formulate the candidates' score distributions as a first-order MRF.
[/li][li] Define the graph G = (V, E), where V is the set of all vertices and E is the set of all edges in G. Here, a vertex v in V is a grid in the query image, and an edge joins two vertices if their corresponding grids are 8-connected in the query.
[/li][li] Each candidate competes for ownership of vertices.
[/li][li] The authors use Hessian-Affine as the local feature detector and SIFT as the descriptor, then apply hierarchical k-means to quantize the local features into a BoW representation.

[/li][/ul]
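
A rough sketch of the ownership competition on the 8-connected grid graph, to check my own understanding. Big assumptions here: my notes do not record the paper's pairwise term or its optimizer, so I swapped in a Potts smoothness penalty and iterated conditional modes (ICM) as stand-ins. The names `neighbors8`, `assign_grids`, and the `smooth` weight are hypothetical.

```python
def neighbors8(r, c, rows, cols):
    """Yield the 8-connected neighbours of grid (r, c) that lie inside the image."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr or dc) and 0 <= r + dr < rows and 0 <= c + dc < cols:
                yield r + dr, c + dc

def assign_grids(scores, labels, smooth=0.3, sweeps=10):
    """Assign one candidate label per grid with iterated conditional modes.

    scores[r][c] maps a label to its similarity score for that grid.
    Assumed energy: -score as the unary term, plus `smooth` for every
    8-connected neighbour that carries a different label (Potts model).
    """
    rows, cols = len(scores), len(scores[0])
    # Start from the best-scoring label of each grid.
    owner = [[max(labels, key=lambda l: scores[r][c].get(l, 0.0))
              for c in range(cols)] for r in range(rows)]
    for _ in range(sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                def energy(label):
                    unary = -scores[r][c].get(label, 0.0)
                    pairwise = sum(smooth
                                   for nr, nc in neighbors8(r, c, rows, cols)
                                   if owner[nr][nc] != label)
                    return unary + pairwise
                best = min(labels, key=energy)
                if best != owner[r][c]:
                    owner[r][c] = best
                    changed = True
        if not changed:  # converged: no grid changed owner in this sweep
            break
    return owner

# Toy 3x4 query (hypothetical scores): mug on the left, book on the right,
# and one noisy grid at (1, 0) that slightly prefers "book".
M = {"mug": 0.9, "book": 0.1}
B = {"mug": 0.1, "book": 0.9}
N = {"mug": 0.45, "book": 0.55}
scores = [[M, M, B, B],
          [N, M, B, B],
          [M, M, B, B]]

owner = assign_grids(scores, ["mug", "book"])
for row in owner:
    print(row)
```

In this toy case the noisy grid is pulled back to its neighbours' label, which is the noise-suppression effect listed under the pros. ICM only finds a local optimum, so a real implementation would likely want a stronger optimizer.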

Last Edit: Dec 7, 2012 22:08:08 GMT -5 by Devonator

Post by Devonator on Dec 8, 2012 18:39:46 GMT -5

Reference:

Spatial-Based Feature for Locating Objects, Lu Cao, Yoshinori Kobayashi, Yoshinori Kuno, 2012

Purpose:

[li] Discusses how humans locate and detect objects using spatial expressions.

Proposal:

[/li][li] A spatial-based feature for object localization and recognition tasks.

Applies to what:

[/li][li] Localization and recognition of objects, using a different method than SIFT.

Theory or Application:

[/li][li] Theory

What does it improve on:

[/li][li] Object recognition

Pros:

[/li][li] Describing spatial relationships between targets and other objects could be more accurate and efficient than training a tremendously complex classifier
[/li][li] Developed a pose estimator

Cons:

Possible issues:

Thunderbots@Home
