Wednesday, March 20, 2013

What!?! No Rubine Features?: Using Geometric-based Features to Produce Normalized Confidence Values for Sketch Recognition



This paper explores merging geometric recognition techniques and gesture-based recognition techniques into a single recognizer. The resulting recognizer can classify natural sketches while offering normalized confidence values for alternative interpretations. The most surprising result is that geometric features are more helpful for recognition than gesture-based features when the data is naturally sketched.

The paper uses a statistical classifier, specifically a quadratic classifier, to examine features from both the geometric and gesture spaces. Initially, 44 features were used (31 geometric and 13 Rubine). Testing used a 50% split of the data, divided by user, so that the system was never trained on data from the participant whose sketches it was attempting to classify. Feature subset selection was employed to order the entry of features. Using only the features that were selected at least 50% of the time, the system achieves results that are not statistically significantly different from PaleoSketch.
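To make the user-independent split concrete, here is a minimal sketch in Python. It is not the paper's code: the `split_by_user` function, the `"user"` key, and the sample layout are all assumptions made for illustration. The point is only that whole users, not individual sketches, are assigned to the training or testing half.

```python
import random


def split_by_user(samples, train_fraction=0.5, seed=0):
    """Split samples so that no user appears in both halves.

    `samples` is a list of dicts that each carry a "user" key
    (a hypothetical layout, not the paper's data format).
    """
    users = sorted({s["user"] for s in samples})
    rng = random.Random(seed)
    rng.shuffle(users)

    # Assign whole users, not individual sketches, to the training half.
    n_train = int(len(users) * train_fraction)
    train_users = set(users[:n_train])

    train = [s for s in samples if s["user"] in train_users]
    test = [s for s in samples if s["user"] not in train_users]
    return train, test
```

Splitting by sketch instead of by user would let a participant's drawing habits leak into training, which is exactly what the paper's setup avoids.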

There are several interesting aspects to this paper. First, it is noted that 93% accuracy is still achievable with only the top six features. Additionally, only one gesture-based feature, total rotation, was chosen as one of the top features. This is interesting because the testing setup was closer to a real-world system, where the recognizer is not trained on the user it serves but is independent of individual users. This provides further evidence that geometric-based properties are more useful for user-independent designs.


PaleoSketch: Accurate Primitive Sketch Recognition and Beautification

Brandon Paulson and Tracy Hammond. 2008. PaleoSketch: accurate primitive sketch recognition and beautification. In Proceedings of the 13th international conference on Intelligent user interfaces (IUI '08). ACM, New York, NY, USA, 1-10. DOI=10.1145/1378773.1378775 http://doi.acm.org/10.1145/1378773.1378775


This paper presents PaleoSketch, a system which is capable of classifying eight primitive shapes as well as combinations of the primitives. The system has recognition rates around 98% and attributes its success to using geometric properties, including two new features and a new ranking algorithm for distinguishing polylines from curved segments.

The recognizer as implemented in the paper uses a three-phase approach to recognition. First, duplicate points are removed from the stroke, and a series of measures is computed. Among these measures are Normalized Distance Between Direction Extremes (NDDE) and Direction Change Ratio (DCR). The former is a measure of the stroke length between the highest and lowest points on a direction plot, yielding higher values for more gradual arcs and lower values for more abrupt changes. The latter is the maximum change in direction divided by the average change in direction. Polylines typically have higher DCR values than curved strokes. An amount of overtrace is also computed, and finally the figure is tested for being either open or closed. Next, the stroke data is fed into various recognizers, each of which returns a boolean flag. Finally, the results are sent to a hierarchy function, which sorts them to find the best fit.
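The two measures can be sketched in a few lines of Python. This is a rough illustration based only on the descriptions above, not the authors' implementation. The function names are my own, dividing by total stroke length is an assumption taken from the word "Normalized" in the name, and returning 0.0 for degenerate strokes is a convention chosen here. Any smoothing or trimming of stroke ends that the real system may do is left out.

```python
import math


def directions(points):
    """Unwrapped direction (radians) of each segment of a stroke."""
    raw = [math.atan2(y2 - y1, x2 - x1)
           for (x1, y1), (x2, y2) in zip(points, points[1:])]
    unwrapped = raw[:1]
    for angle in raw[1:]:
        # Map the step into [-pi, pi) so crossing +/-pi causes no jump.
        delta = (angle - unwrapped[-1] + math.pi) % (2 * math.pi) - math.pi
        unwrapped.append(unwrapped[-1] + delta)
    return unwrapped


def ndde(points):
    """Stroke length between the direction extremes over total length."""
    dirs = directions(points)
    lens = [math.hypot(x2 - x1, y2 - y1)
            for (x1, y1), (x2, y2) in zip(points, points[1:])]
    total = sum(lens)
    if total == 0:
        return 0.0  # convention chosen for this sketch
    i_max = max(range(len(dirs)), key=dirs.__getitem__)
    i_min = min(range(len(dirs)), key=dirs.__getitem__)
    lo, hi = sorted((i_min, i_max))
    return sum(lens[lo:hi]) / total


def dcr(points):
    """Maximum change in direction divided by the average change."""
    dirs = directions(points)
    changes = [abs(b - a) for a, b in zip(dirs, dirs[1:])]
    mean = sum(changes) / len(changes) if changes else 0.0
    if mean == 0:
        return 0.0  # perfectly straight stroke; convention chosen here
    return max(changes) / mean
```

On an evenly sampled quarter circle every direction change is about the same size, so the DCR comes out near 1. An L-shaped polyline has a single large change among many near-zero ones, so its DCR is much larger, which matches the observation that polylines score higher than curves.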

Robust testing was performed on PaleoSketch using a large dataset and indicated a 98.56% accuracy rate.

The most interesting aspect of PaleoSketch is its ability to return alternative interpretations. This is a valuable resource, since higher-level components may be able to reason with this information.

Visual Similarity of Pen Gestures

A. Chris Long, Jr., James A. Landay, Lawrence A. Rowe, and Joseph Michiels. 2000. Visual similarity of pen gestures. In Proceedings of the SIGCHI conference on Human Factors in Computing Systems (CHI '00). ACM, New York, NY, USA, 360-367. DOI=10.1145/332040.332458 http://doi.acm.org/10.1145/332040.332458

This paper aims to determine which gestures users perceive as similar and to develop a computational model for predicting the perceived similarity of gestures. The primary contribution appears to be aimed at gesture designers; however, the paper also contributes several new features for gesture recognition, as well as a model of gesture similarity.

Two studies were performed to help gather data on the perception of similarity between gestures. First, 21 participants were asked to view 364 triads of sketches and, for each triad, identify the sketch that was least like the others. To determine similarity, multidimensional scaling (MDS) was run on the collected data. A regression analysis was then run to determine which of the geometric features correlated with the similarity. The features examined consisted of the Rubine features and several additional features (mostly logarithmic interpretations) inspired by the work of Attneave, a psychology researcher.
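Before MDS can be run, the odd-one-out judgments have to be turned into pairwise dissimilarities. The summary above does not say how the paper did this, so the following Python sketch shows only one plausible conversion, and the function name and input layout are assumptions. Each time a pair is shown together and neither member is picked as the odd one out, that pair counts as "judged similar".

```python
from collections import defaultdict
from itertools import combinations


def dissimilarity_from_triads(judgments):
    """Turn (triad, odd_one_out) judgments into pairwise dissimilarities.

    Dissimilarity of a pair = 1 - (times the pair was left together) /
    (times the pair was shown together). This is an illustrative choice,
    not necessarily the conversion used in the paper.
    """
    shown = defaultdict(int)
    kept_together = defaultdict(int)
    for triad, odd in judgments:
        if odd not in triad:
            raise ValueError("odd one out must be a member of the triad")
        for pair in combinations(sorted(triad), 2):
            shown[pair] += 1
        # The two gestures that were not picked are treated as similar.
        remaining = tuple(sorted(g for g in triad if g != odd))
        kept_together[remaining] += 1
    return {pair: 1.0 - kept_together[pair] / shown[pair] for pair in shown}
```

The resulting dictionary can be expanded into a symmetric matrix and handed to any MDS routine.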

A second trial was conducted with 20 participants and was designed to examine specific aspects of the features identified in the first trial. The data from trial 1 was used to predict the results of trial 2. The derived model predicted the results of trial 2 correctly about 70% of the time. Further, the correlation between the predictions derived from trial 1 and the data from trial 2 was about 56%.

This paper illustrates that several significant factors affect the perception of similarity between pen gestures. Unsurprisingly, features that take into consideration the logarithmic nature of some perceptual processes provide great benefit to gesture recognition.

Tuesday, March 5, 2013

Sketch Based Interfaces: Early Processing for Sketch Understanding



Tevfik Metin Sezgin, Thomas Stahovich, and Randall Davis. 2006. Sketch based interfaces: early processing for sketch understanding. In ACM SIGGRAPH 2006 Courses (SIGGRAPH '06). ACM, New York, NY, USA, Article 22. DOI=10.1145/1185657.1185783 http://doi.acm.org/10.1145/1185657.1185783


This paper describes a sketch interface that uses multiple sources of knowledge to process freehand sketches. The approach consists of three phases: approximation, beautification, and basic recognition. The paper spends most of its time describing approximation, which uses data from stroke direction and stroke speed. An interesting aspect of the work is the hybrid fit technique, which generates a set of potential fits and selects the best one. A practical approach to handling curves through discretization of Bézier curves is also provided. Beautification is briefly discussed as adjusting the slopes of lines using a clustering method. Basic recognition is carried out using a set of hand-tailored templates. An evaluation was conducted in which 13 of 14 students liked this system better than a non-sketch alternative tool. The system displayed 96% accuracy.


This work has definite application domains, but the state of the system described in this paper seems unprepared for general usage. Specifically, the hand-tailored recognition templates seem a bit terrifying, but perhaps in a restricted domain that isn't an issue. The user study conducted for this system was very informal and probably shouldn't have been reported in this paper. My final complaint is about beautification. I don't like it in some cases, and I think that, depending on the software, the interface must support a more interactive experience using the pen, rather than just converting what I sketch into primitives.