pptx - Smart Geometry Processing Group


Annotating RGBD Images of Indoor Scenes

Yu-Shiang Wong and Hung-Kuo Chu
National Tsing Hua University, CGV LAB
SA2014.SIGGRAPH.ORG

SPONSORED BY

Outline

• Motivation
• Related Works
• Annotation Procedure
• User Study

Motivation

Scene understanding is a popular topic.

RGBD datasets with high-quality semantic annotations are valuable for:
• Learning
• Evaluation

Two fundamental problems:
• Data acquisition and annotation


RGBD Indoor Datasets

• Cornell-RGBD (2011-12): 24 labeled office scenes
• NYU2 (2011-12): 1,449 labeled indoor scenes
  – 408,000+ RGBD video frames (unlabeled)
• SUN3D (2013): 415+ fully captured rooms
  – 10+ rooms are fully labeled; annotations are propagated through the video.
• UZH & ETH 3D Scanned Point Datasets (2014): 42 fully captured rooms
  – high-quality point clouds (unlabeled)

Object Detection and Classification from Large-Scale Cluttered Indoor Scans (EG 2014)

…

Motivation

Data annotation is a painstaking and time-consuming task.

OMG! So much data needs to be annotated!

Motivation

• Data annotation is a painstaking and time-consuming task.
• An interactive tool for annotating RGBD indoor scenes.

We need a good tool!

Motivation

• Data annotation is a tedious and time-consuming task.
• An interactive tool for annotating RGBD indoor scenes.
• Leverage both the cognitive ability of humans and the computational power of machines.

RELATED WORKS


Image Annotation

• LabelMe: a database and web-based tool for image annotation. Russell et al., IJCV 2007
• SUN3D: A Database of Big Spaces Reconstructed using SfM and Object Labels. Xiao et al., ICCV 2013
• Cheaper by the Dozen: Group Annotation of 3D Data. Boyko et al., UIST 2014

Scene Understanding using RGBD Data

Image-based
• Indoor segmentation and support inference from RGBD images. Silberman et al., ECCV 2012
• RGB-(D) scene labeling: Features and algorithms. Ren et al., CVPR 2012

Proxy-based
• Imagining the unseen: Stability-based cuboid arrangements for understanding cluttered indoor scenes. Shao et al., SIGGRAPH Asia 2014
• PanoContext: A whole-room 3D context model for panoramic scene understanding. Zhang et al., ECCV 2014
• Holistic scene understanding for 3D object detection with RGBD cameras. Lin et al., ICCV 2013
• 3D-based reasoning with blocks, support, and stability. Xiao et al., CVPR 2013

Annotation Procedure: Overview

Input: RGB-D image
Output: segmentation, labels, box proxies, support structure

[Figure: diagram with panels Input, Machine, User, Output]

Annotation Procedure: Overview

Input RGB-D Image → Extract Room → Draw Scribbles → Estimate Boxes → Annotate Label and Structure → Output Annotated 3D Structure

[Figure: the pipeline steps are grouped into a Machine Session and a User Session]

Annotation Procedure: Preprocessing

• Estimate normals.
• Perform over-segmentation using both the color map and the normal map.
  – Efficient graph-based image segmentation [Felzenszwalb et al. 2004]
  – The coarser segmentation is used for room estimation.
  – The finer segmentation is used for user-assisted object segmentation.
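The over-segmentation step can be illustrated with a minimal sketch of graph-based merging in the style of Felzenszwalb et al., written in plain Python. The per-pixel feature (color channels followed by normal components), the Euclidean edge weight, and the name `segment_graph` are assumptions made for illustration; the slides do not give the exact weighting. The scale parameter `k` controls granularity, so a large `k` stands in for the coarser segmentation and a small `k` for the finer one.

```python
import math

def segment_graph(features, width, height, k=1.0):
    """Graph-based merging on a 4-connected pixel grid.

    features[y * width + x] is a per-pixel tuple, assumed here to hold
    colour channels followed by normal components (illustrative choice).
    Returns one component id per pixel.
    """
    n = width * height
    parent = list(range(n))
    size = [1] * n
    internal = [0.0] * n  # largest edge weight inside each component

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = []
    for y in range(height):
        for x in range(width):
            i = y * width + x
            if x + 1 < width:
                edges.append((math.dist(features[i], features[i + 1]), i, i + 1))
            if y + 1 < height:
                edges.append((math.dist(features[i], features[i + width]), i, i + width))
    edges.sort()

    for w, a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        # Merge when the edge is no heavier than either component's
        # internal difference plus the size-dependent threshold k / |C|.
        if w <= min(internal[ra] + k / size[ra], internal[rb] + k / size[rb]):
            parent[rb] = ra
            size[ra] += size[rb]
            internal[ra] = w  # edges are sorted, so w is the new maximum
    return [find(i) for i in range(n)]

# Toy 4x2 image: left half and right half differ in colour and normal.
left = (0.0, 0.0, 0.0, 0.0, 0.0, 1.0)
right = (1.0, 1.0, 1.0, 1.0, 0.0, 0.0)
toy = [left, left, right, right] * 2

fine = segment_graph(toy, width=4, height=2, k=0.5)
coarse = segment_graph(toy, width=4, height=2, k=20.0)
```

With the small `k` the two halves stay separate, while the large `k` merges them into a single region.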

Annotation Procedure: Extracting Room Layout

• Perform RANSAC plane fitting on each segment.
• Roughly align the point cloud using the gravity information $g$.
• Find the floor segment by:
  $E_i = (1 - \langle n_i, y_e \rangle) + \text{inverse ratio of segment size} + \text{normalized Y coordinate}$
• Estimate wall candidates by:
  $E_i = \langle n_i, n_{floor} \rangle$
  * If gravity information is not available: $E_i = \langle n_i, n_j \rangle, \; i \neq j$
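A minimal sketch of how the floor and wall scores on this slide could be evaluated once each segment has a fitted plane normal. Several details are assumptions, not taken from the slides: the floor is taken as the segment with the lowest energy, the "inverse ratio of segment size" is read as one minus the segment's share of all points, the Y coordinate is normalized over the segments' mean heights, and a wall candidate is any segment whose normal is nearly perpendicular to the floor normal (tolerance `wall_tol`). The segment dictionaries and function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def floor_energy(seg, up, total_size, y_min, y_max):
    # Term 1: small when the segment normal points along the up direction.
    alignment = 1.0 - dot(seg["normal"], up)
    # Term 2 (assumed form): small when the segment covers many points.
    inverse_size = 1.0 - seg["size"] / total_size
    # Term 3: small when the segment lies low in the gravity-aligned scene.
    span = y_max - y_min
    normalized_y = (seg["height"] - y_min) / span if span > 0 else 0.0
    return alignment + inverse_size + normalized_y

def pick_floor_and_walls(segments, up=(0.0, 1.0, 0.0), wall_tol=0.1):
    total = sum(s["size"] for s in segments)
    heights = [s["height"] for s in segments]
    y_min, y_max = min(heights), max(heights)
    energies = [floor_energy(s, up, total, y_min, y_max) for s in segments]
    floor = min(range(len(segments)), key=energies.__getitem__)
    n_floor = segments[floor]["normal"]
    # Walls: normals roughly orthogonal to the floor normal.
    walls = [i for i, s in enumerate(segments)
             if i != floor and abs(dot(s["normal"], n_floor)) < wall_tol]
    return floor, walls

# Hypothetical segments: unit plane normal, point count, mean height (m).
segments = [
    {"normal": (0.0, 1.0, 0.0), "size": 5000, "height": 0.0},  # floor
    {"normal": (1.0, 0.0, 0.0), "size": 4000, "height": 1.2},  # side wall
    {"normal": (0.0, 1.0, 0.0), "size": 800,  "height": 0.7},  # table top
    {"normal": (0.0, 0.0, 1.0), "size": 3000, "height": 1.3},  # back wall
]
floor_index, wall_indices = pick_floor_and_walls(segments)
```

In this toy scene the table top shares the floor's orientation but scores higher because it is smaller and sits above the floor.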

Annotation Procedure: User Scribbles

(User session)

• Check the floor and wall hypotheses.
  – If the hypotheses fail, the user clicks the segments to identify the floor and walls.
• The user draws scribbles to extract the object segments.
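The slides do not say how scribbles are turned into object segments. One plausible reading, sketched below purely as an assumption, is that every fine over-segmentation region touched by a scribble is added to the object. The function names and the toy label map are hypothetical.

```python
def segments_under_scribble(segment_ids, scribble):
    """Collect the ids of all fine segments hit by a scribble.

    segment_ids: 2D list of integer segment ids (row-major).
    scribble: iterable of (x, y) pixel coordinates drawn by the user.
    """
    return {segment_ids[y][x] for x, y in scribble}

def object_mask(segment_ids, scribble):
    """Boolean mask covering the union of the scribbled segments."""
    chosen = segments_under_scribble(segment_ids, scribble)
    return [[sid in chosen for sid in row] for row in segment_ids]

# Toy 4x3 over-segmentation with four regions (ids 0..3).
segment_ids = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [3, 3, 2, 1],
]
scribble = [(0, 0), (1, 1)]  # touches regions 0 and 2
mask = object_mask(segment_ids, scribble)
```

Working on segments rather than pixels means a short stroke can select a whole region, which would fit the slide's goal of reducing manual effort.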

Annotation Procedure: Estimating Boxes

• Box orientation: find an orthogonal basis in the 3D domain (3 unknown directions).
• We assume one direction of the box is parallel to the floor normal (1 unknown direction remains; the last one is obtained by cross product).

Box Fitting Method:
1. Filter the point cloud by KNN.
2. Project the point cloud of a box onto the floor plane.
3. Fit a line in the 2D domain to extract a major direction.
4.
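The projection and line-fitting steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the scene is taken to be gravity-aligned so the floor normal is +Y and projection simply drops the Y coordinate, the 2D line is fitted by total least squares (the principal axis of the projected points), the KNN filtering step is omitted, and the box size is taken from the point extents along the three axes. The name `fit_box` is hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def fit_box(points):
    """Fit an upright box to 3D points (x, y, z), assuming +Y is the floor normal."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    mz = sum(p[2] for p in points) / n
    # Second moments of the points projected onto the floor (XZ) plane.
    sxx = sum((p[0] - mx) ** 2 for p in points)
    szz = sum((p[2] - mz) ** 2 for p in points)
    sxz = sum((p[0] - mx) * (p[2] - mz) for p in points)
    # Orientation of the principal axis = total-least-squares line direction.
    theta = 0.5 * math.atan2(2.0 * sxz, sxx - szz)

    up = (0.0, 1.0, 0.0)
    major = (math.cos(theta), 0.0, math.sin(theta))
    minor = cross(up, major)  # third axis follows from the cross product

    size = []
    for axis in (major, up, minor):
        proj = [dot(p, axis) for p in points]
        size.append(max(proj) - min(proj))
    return {"theta": theta, "axes": (major, up, minor), "size": tuple(size)}

# Synthetic box: 2.0 long, 0.5 high, 1.0 wide, rotated 30 degrees about Y.
c, s = math.cos(math.pi / 6), math.sin(math.pi / 6)
points = [(a * c + b * s, h, a * s - b * c)
          for a in (-1.0, -0.5, 0.0, 0.5, 1.0)
          for b in (-0.5, 0.0, 0.5)
          for h in (0.0, 0.5)]
box = fit_box(points)
```

Fixing one axis to the floor normal reduces the orientation search to a single angle in the floor plane, which is why a 2D line fit is enough.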

Annotation Procedure: Annotate Label and 3D Structure

(User session)

User tasks:
1. Type in the object label.
2. Drag an arrow to specify the support relationships.
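The support relationships collected in this step form a simple graph. The sketch below assumes one supporter per object, stored as a child-to-parent mapping with the floor as the root; this representation and the name `support_chain` are illustrative choices, since the slides do not describe the tool's data format. The labels are taken from the target class list.

```python
def support_chain(supports, name):
    """Follow support arrows from an object down to the root.

    supports maps each object label to the label of the object (or the
    floor) that supports it. Raises ValueError on a cyclic relation.
    """
    chain = [name]
    while chain[-1] in supports:
        parent = supports[chain[-1]]
        if parent in chain:
            raise ValueError("cyclic support relation at " + parent)
        chain.append(parent)
    return chain

# Each dragged arrow adds one entry: supported object -> supporter.
supports = {
    "pillow": "bed",
    "bed": "floor",
    "television": "dresser",
    "dresser": "floor",
}
pillow_chain = support_chain(supports, "pillow")
```

A mapping like this makes it cheap to validate annotations, for example by checking that every chain ends at the floor.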

Annotation Procedure: Box Quality Refinement (Optional)

(User session)

User tasks:
1. Adjust the orientation of the boxes.
2. Adjust the size of the boxes.

USER STUDY


User Study : Settings

• Select 50 scenes from NYU2 across 7 scene classes.
• Recruit 2 users.
• Each user is asked to annotate the 50 scenes.
• Target classes: 24 merged object classes
  – List: bed, chair, cabinet, dresser, television, night stand, table, sofa, picture, pillow, …
• Each scene contains 3-6 objects.

User Study : Results

• System process time (computing normals, fitting planes and boxes): < 3 sec [in C++]
• Annotation time (50 scenes):

Task type        | Mean time per box | Mean time per scene | Total time
Check Room       | -                 | 1.6 sec             | 1.3 min
Draw Scribbles   | 16 sec            | 1 min               | 51 min
Type Labels      | 4 sec             | 17 sec              | 13 min
Drag Supports    | 2 sec             | 9 sec               | 7.5 min
Boxes Adjustment | 11 sec            | 35 sec              | 29 min (Accuracy = 64%)

TOTAL = 101 min
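As a quick sanity check on the reported timings, the short sketch below converts the mean time per scene into minutes for 50 scenes and sums the reported totals. The numbers are copied from the slide; the interpretation is an assumption. The per-task totals add up to 101.8 min, close to the reported 101 min, and the small differences from the per-scene estimates are presumably due to rounding of the displayed values.

```python
SCENES = 50

# task: (mean seconds per scene, reported total minutes), as on the slide
tasks = {
    "Check Room": (1.6, 1.3),
    "Draw Scribbles": (60.0, 51.0),
    "Type Labels": (17.0, 13.0),
    "Drag Supports": (9.0, 7.5),
    "Boxes Adjustment": (35.0, 29.0),
}

def minutes_from_mean(mean_sec_per_scene, scenes=SCENES):
    """Total minutes implied by a mean per-scene time."""
    return mean_sec_per_scene * scenes / 60.0

estimated = {name: minutes_from_mean(sec) for name, (sec, _) in tasks.items()}
reported_sum = sum(total for _, total in tasks.values())
minutes_per_scene = 101 / SCENES  # about two minutes of user time per scene

for name, (_, total) in tasks.items():
    print(f"{name:17s} estimated {estimated[name]:5.1f} min, reported {total:5.1f} min")
```

Drawing scribbles accounts for roughly half of the total, which matches the bottleneck named later in the talk.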

Demo


Conclusion

An interactive system to facilitate annotating RGBD indoor scenes.

Generating high-quality ground-truth data with rich annotations:
• Object segments
• Object labels
• 3D geometry
• 3D structure

Ongoing Work

The major bottleneck lies in manual operations:
• Drawing scribbles
• Refining box proxies
• Typing labels
• Specifying structure

Incorporate inference algorithms and 3D structure analysis to reduce the manual burden on the user.

THANK YOU!
