Tuesday, April 23, 2013

Envisioning sketch recognition: a local feature based approach to recognizing informal sketches

Oltmans, M. (2007). Envisioning sketch recognition: a local feature based approach to recognizing informal sketches (Doctoral dissertation, Massachusetts Institute of Technology).


This dissertation proposes a visual method for recognizing hand drawn sketches. The problem is approached as the recognition of individual "parts" of a sketch. To recognize a sketch, the strokes are first resampled, then a bullseye feature is placed every five pixels of stroke. The bullseye feature is a circular grid which acts as a histogram, counting the amount of ink in each bin. The circles are spaced logarithmically, allowing the outer rings to create context, while the inner rings capture detail. To ensure rotation invariance, the bullseye is rotated to be close to the direction of the stroke. Point orientations are also taken into account when examining a bullseye feature.

The computed features are compared to a codebook of previously learned features and a distance metric identifies the closest match for the bullseye. An SVM is used along with a one-vs-one strategy for classifying each feature. To convert an entire sketch this process is completed on windows of varying sizes across the sketch canvas. Using additional classifiers and overlap information, each area is classified.

This work was evaluated on two datasets and compared against an image-based classifier. When evaluated on the dataset of electrical circuits, the system achieved 89.5% accuracy. On the dataset of PowerPoint style shapes, the system achieved 94.4% accuracy.

The work presented here is a novel modification of vision methods for use on 2D sketches. This system is particularly impressive in its ability to ingest an entire sketch and classify all pieces. There are weaknesses however, for example in the electrical components case the system had to be somewhat modified to understand that unclassified pieces should be considered wires, making this approach not entirely domain independent. It would be interesting to see this technique combined with gesture recognition techniques. I would also like to test it on recognizing geons in 3D sketches.

No comments:

Post a Comment