This paper proposes a new visual method for sketch recognition. The approach represents symbols as feature images, defines a set of visual features that capture stroke orientation and endpoints, and introduces an efficient classification technique that is robust to deformation and rotation.
First, strokes are resampled at a constant spatial frequency. Next, each symbol is normalized by translating its center of mass to the origin and scaling it horizontally and vertically so that it has unit standard deviation on each axis. Then, five features are computed for each sample point (four for orientation, one for endpoints). Each of the five features is rendered onto its own 24x24 feature grid. The grids span 2.5 standard deviations of the original symbol's space in each direction. The intensity of a pixel is the maximum feature value over all sample points that fall within that cell. Next, to increase tolerance to shifts and distortions, a Gaussian blur is applied. Each image is then downsampled to 6x6 using a MAX filter.
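The normalization, rasterization, and MAX downsampling steps above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names (normalize, render, max_pool) are hypothetical, only a single feature channel is rendered, and the resampling and Gaussian blur steps are omitted for brevity.

```python
import math

GRID = 24   # feature image side length, from the text
SPAN = 2.5  # grid covers [-2.5, 2.5] standard deviations on each axis
POOL = 4    # assumed pooling window: 24 / 4 = 6, giving a 6x6 image

def normalize(points):
    """Move the center of mass to the origin and scale each axis to unit std."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Fall back to 1.0 for a degenerate axis (e.g. a perfectly straight line).
    sx = math.sqrt(sum((x - mx) ** 2 for x, _ in points) / n) or 1.0
    sy = math.sqrt(sum((y - my) ** 2 for _, y in points) / n) or 1.0
    return [((x - mx) / sx, (y - my) / sy) for x, y in points]

def render(points, values):
    """Rasterize per-point feature values, keeping the maximum per cell."""
    grid = [[0.0] * GRID for _ in range(GRID)]
    for (x, y), v in zip(points, values):
        col = math.floor((x + SPAN) / (2 * SPAN) * GRID)
        row = math.floor((y + SPAN) / (2 * SPAN) * GRID)
        if 0 <= row < GRID and 0 <= col < GRID:  # drop points outside the span
            grid[row][col] = max(grid[row][col], v)
    return grid

def max_pool(grid, k=POOL):
    """Downsample by taking the maximum over non-overlapping k x k windows."""
    size = len(grid) // k
    return [[max(grid[r * k + i][c * k + j] for i in range(k) for j in range(k))
             for c in range(size)] for r in range(size)]

# Usage: a diagonal stroke with an endpoint feature (1.0 at both ends).
stroke = [(float(i), 2.0 * i) for i in range(11)]
endpoint = [1.0 if i in (0, len(stroke) - 1) else 0.0 for i in range(len(stroke))]
small = max_pool(render(normalize(stroke), endpoint))
```

Taking the maximum rather than the average when downsampling means a single strong response, such as an endpoint, is not diluted by its empty neighbors.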
For recognition, the computed feature images are matched against existing templates using a distance metric based on an image deformation model.
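To make the idea of a deformation-tolerant distance concrete, here is a simplified sketch. It assumes a single feature channel and lets each pixel match the best value within a one-pixel shift window; the function name idm_distance and the shift parameter are illustrative choices, and comparing single pixels rather than local patches makes this version more permissive than a full image deformation model.

```python
def idm_distance(a, b, shift=1):
    """For each pixel of b, take the smallest squared difference to any
    pixel of a within +/- shift, and sum those minima."""
    n = len(a)
    total = 0.0
    for r in range(n):
        for c in range(n):
            best = float("inf")
            for dr in range(-shift, shift + 1):
                for dc in range(-shift, shift + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n and 0 <= cc < n:  # stay inside the image
                        best = min(best, (a[rr][cc] - b[r][c]) ** 2)
            total += best
    return total

def blank(n=6):
    return [[0.0] * n for _ in range(n)]

# Usage: a feature shifted by one pixel costs nothing; a distant one does.
a = blank(); a[2][2] = 1.0
near = blank(); near[2][3] = 1.0
far = blank(); far[5][5] = 1.0
```

With a plain pixelwise squared distance, a and near would differ by 2.0 even though they show the same feature moved by one cell; the local search absorbs that small shift.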
Some of these steps must be optimized to keep recognition time low. First, a "coarse" metric is applied before any exact match is attempted, which eliminates many of the candidate matches. To do this, images are indexed using their first 128 principal components. Additionally, a branch and bound technique is used to generate a clustering tree. For rotational invariance, 32 orientations of the original feature image are considered.
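The coarse-then-exact strategy can be illustrated with a small sketch. Everything named here is hypothetical: the basis argument stands in for precomputed principal components (the text uses the first 128), the template dictionaries and the keep parameter are invented for the example, and the clustering tree and the 32 rotations are left out.

```python
def project(vec, basis):
    """Project a flattened feature image onto each basis vector."""
    return [sum(v * b for v, b in zip(vec, row)) for row in basis]

def sq_dist(u, v):
    return sum((p - q) ** 2 for p, q in zip(u, v))

def classify(query, templates, basis, exact_distance, keep=10):
    """Rank templates with a cheap low-dimensional distance, then run the
    expensive exact distance only on the best `keep` candidates."""
    q = project(query, basis)
    # In practice the template projections would be computed once and cached.
    ranked = sorted(templates, key=lambda t: sq_dist(q, project(t["vec"], basis)))
    shortlist = ranked[:keep]
    best = min(shortlist, key=lambda t: exact_distance(query, t["vec"]))
    return best["label"]

# Usage with toy 4-dimensional "images" and a 2-component basis.
templates = [
    {"label": "a", "vec": [0.0, 0.0, 0.0, 0.0]},
    {"label": "b", "vec": [1.0, 1.0, 0.0, 0.0]},
    {"label": "c", "vec": [1.0, 1.0, 5.0, 5.0]},
]
basis = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
query = [1.0, 1.0, 4.0, 4.0]
label = classify(query, templates, basis, sq_dist, keep=2)
```

The shortlist size trades speed for accuracy: in this toy data, keep=2 retains the true nearest template, while keep=1 prunes it because the coarse metric cannot tell "b" and "c" apart.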
Testing yielded 99.2% accuracy on the Pen Digits dataset, 98.2% on the HHReco dataset, and 96.2% on the Circuits dataset. The system classified about 100 symbols per second on a 2.4 GHz processor.
This paper presents a surprising result: even after significant downsampling, the feature images are rich enough to support high classification accuracy. The paper also shows that optimizations can greatly improve the performance of a sketch recognition system, and it carefully balances the need for features against the need for processing power. Overall, this is an elegant solution.