CSAIL Publications and Digital Archive

Research Abstracts - 2006

Semantic Representation of Digital Ink in the Classroom Learning Partner

Michel Rbeiz & Kimberle Koile


The goal of the Classroom Learning Partner (CLP) project is to increase student interaction and learning in large classes. To this end, CLP will support the use of tablet-PC-based in-class exercises by aggregating students' digital ink answers into equivalence classes, then presenting the summary information to the instructor, e.g., in the form of a histogram and example answers. The research described here addresses a critical issue in introducing systems such as CLP into the classroom: the interpretation and semantic representation of digital ink. Recent advances in educational technology support student interaction, e.g., in the form of in-class exercises, by means of distributed wireless presentation systems. Such systems have proven successful, but their scope is limited because they rely on digital ink as the communication medium between an instructor and his or her students. The research described here contributes a technique for interpreting digital ink, i.e., translating ink into a semantic representation, which enables aggregation of student answers. The technique is being tested in MIT's introductory computer science class, 6.001, Structure and Interpretation of Computer Programs [1].


Previous research in educational psychology has shown that students learn better when they are actively engaged in the learning situation [6][7]. It has also shown that students learn better when engagement takes the form of hands-on activities that yield immediate feedback through interaction with peers and/or instructors [8]. CLP takes inspiration from two teaching methodologies, each of which actively engages students by means of in-class exercises: the use of Personal Response Systems (PRS) in large classrooms, which support multiple-choice and matching questions, and the use of a wide variety of in-class exercises in small classrooms.

Using a wide variety of in-class problems, particularly what are called open-ended questions, has proven beneficial to student learning [8]. This style of teaching is well supported by Classroom Presenter [4], a tablet-PC-based distributed presentation system that enables instructors to annotate slides with digital ink [3]. In a recent pilot study of Classroom Presenter in the introductory computer science class, this engagement style of teaching resulted in more students than expected scoring in the top 10% of the class, and fewer scoring in the bottom 10% [9].


Version one of CLP, to be deployed in the Spring '06 term, is being designed to interpret and aggregate handwritten text and arrows using existing state-of-the-art recognizers. Interpreting both text and arrows presents a challenge because recognizers, e.g., the Microsoft English Handwriting Recognizer and the Sketch Recognizer from the Design Rationale Group at MIT [2], are able to recognize text or a sketch, respectively, but not both in the same sequence of ink.

The CLP ink interpretation module uses a two-tiered architecture similar to that of MathPad [10]:

  • An Interpreter, which performs recognition and semantic information extraction from digital ink.
  • A Renderer, which renders and displays the semantic representation of the digital ink. This module shows the user what the system “thinks” the user's input was; rendering is useful for ensuring that the recognized information matches the user's input.
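The two tiers above can be sketched as minimal interfaces. The class and field names below are hypothetical, chosen only for illustration; the abstract does not specify CLP's actual data structures:

```python
from dataclasses import dataclass

@dataclass
class SemanticAnswer:
    """Hypothetical semantic representation of one inked answer."""
    kind: str   # e.g., "text" or "arrow"
    value: str  # recognized content

class Interpreter:
    """Tier 1: recognition and semantic information extraction from ink."""
    def interpret(self, strokes) -> SemanticAnswer:
        # A real implementation would invoke the handwriting and
        # sketch recognizers described below.
        raise NotImplementedError

class Renderer:
    """Tier 2: display the semantic representation back to the user,
    so the user can verify what the system 'thinks' was written."""
    def render(self, answer: SemanticAnswer) -> str:
        return f"{answer.kind}: {answer.value}"
```

Rendering the recognized form back to the student closes the loop: a misrecognized answer can be caught before it is aggregated.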

Shown in Figure 1 is the ink interpreter architecture, which will leverage the two recognizers mentioned above. The Ink Analyzer will segment text and sketches and pass the corresponding strokes to the appropriate recognizer; the challenge will be to differentiate between intertwined sketch and text. Semantic information will then be extracted from the partial results.
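One way to picture the Ink Analyzer's role is as a router that splits strokes between the two recognizers. The classifier and recognizer interfaces below are hypothetical, and the naive per-stroke split glosses over the hard case the text identifies: intertwined sketch and text.

```python
def analyze_ink(strokes, is_text_stroke, text_recognizer, sketch_recognizer):
    """Segment strokes into text vs. sketch groups and route each group
    to the appropriate recognizer (hypothetical interfaces).

    is_text_stroke: per-stroke classifier deciding text vs. sketch.
    In practice this classification is the difficult part; a simple
    per-stroke split is only a placeholder for it.
    """
    text_strokes = [s for s in strokes if is_text_stroke(s)]
    sketch_strokes = [s for s in strokes if not is_text_stroke(s)]

    # Each recognizer sees only the strokes it can handle; semantic
    # information is then extracted from the partial results.
    results = []
    if text_strokes:
        results.append(("text", text_recognizer(text_strokes)))
    if sketch_strokes:
        results.append(("sketch", sketch_recognizer(sketch_strokes)))
    return results
```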


The Handwriting Recognizer component has been implemented; it interprets text and arrows. Its architecture is shown in Figure 2.


The Handwriting Recognizer works in the following way:

  • The Ink Segmentation module segments the ink into individual chunks. Chunks are elementary units that are, ideally, words or arrows.
  • The Chunk Error Correction module attempts to fix errors common to handwriting recognizers, such as splitting a word into two words or combining two words into one.
  • The strokes of each chunk are then passed to the Microsoft English Recognizer, which outputs several hypotheses ranked by a confidence score.
  • The hypotheses are then sent to the Language Model module, which uses a domain-specific dictionary and knowledge of the expected exercise answer type to choose the best hypothesis.
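The final step can be illustrated with a simple dictionary-based selection. This is only a sketch: the actual Language Model module also uses the expected answer type, which is omitted here, and the function name is our own.

```python
def choose_hypothesis(hypotheses, dictionary):
    """Pick the best recognizer hypothesis using a domain-specific
    dictionary.

    hypotheses: list of (text, confidence) pairs, already ranked by
    confidence score, highest first.
    Returns the highest-ranked hypothesis whose words all appear in
    the dictionary, falling back to the top hypothesis otherwise.
    """
    for text, _confidence in hypotheses:
        if all(word in dictionary for word in text.split()):
            return text
    return hypotheses[0][0]
```

For example, with a 6.001 dictionary of Scheme primitives, a lower-confidence hypothesis "cons" would be chosen over a higher-confidence non-word "cws".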

Preliminary Results

Recognition accuracy has traditionally been measured with a word error rate. In our research, however, it was more appropriate to measure the distance between the input and recognized strings, so as to capture partial improvements. If the input string is “caar”, for example, and successive recognition results are “cr” and “car”, our accuracy measure should account for the partial improvement, something a word error rate would not capture. The Levenshtein distance [5], or edit distance, measures the distance between two strings, defined as the minimum number of operations needed to transform one string into the other, where an operation is an insertion, deletion, or substitution of a single character. In our example, the edit distance between “caar” and “cr” is 2, while the distance between “caar” and “car” is 1.
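The edit distance used above can be computed with the standard dynamic-programming algorithm:

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to transform string a into string b."""
    # prev[j] holds the distance between the first i-1 chars of a
    # and the first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

# The examples from the text:
assert levenshtein("caar", "cr") == 2
assert levenshtein("caar", "car") == 1
```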

In a controlled experiment, users were asked to ink 21 representative answers; 167 inked answers were collected from several users. The inked answers were dynamically interpreted, stripped of spaces, and converted to lower case, then compared to the input. The current version of the handwriting recognizer made 249 character errors out of 1,820 characters, an error rate of roughly 13.7%.
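The reported rate corresponds to summing edit distances over all answers and dividing by the total number of input characters. The exact aggregation used in the experiment is our reading of the text, so the sketch below is one plausible formulation, not a description of the actual test harness:

```python
def levenshtein(a, b):
    """Minimum single-character insertions/deletions/substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]

def character_error_rate(pairs):
    """Aggregate error rate over (input, recognized) string pairs,
    after the normalization described above: spaces stripped and
    text converted to lower case."""
    total_errors = total_chars = 0
    for ref, hyp in pairs:
        ref = ref.replace(" ", "").lower()
        hyp = hyp.replace(" ", "").lower()
        total_errors += levenshtein(ref, hyp)
        total_chars += len(ref)
    return total_errors / total_chars
```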

Future Work

Future work on the ink interpreter will be in two areas:

  • Incorporating sketch recognition, both of standalone sketches and sketches mixed with text;
  • Implementing a more elaborate language model, e.g., one that incorporates knowledge about likely formats for student answers; such a model would improve handwriting recognition accuracy. In the computer science domain, and 6.001 in particular, it would include knowledge of the syntax of expressions written in the course's programming language, Scheme.
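As an illustration of what such Scheme-aware knowledge might buy, a crude proxy is to prefer hypotheses whose parentheses balance. A real model would use the full Scheme grammar; the function names here are hypothetical.

```python
def looks_like_scheme(expr: str) -> bool:
    """Crude well-formedness proxy: parentheses must balance and
    never close before they open."""
    depth = 0
    for ch in expr:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

def rerank(hypotheses):
    """Move plausible Scheme expressions ahead of implausible ones,
    preserving confidence order within each group.
    `hypotheses` is a list of (text, confidence) pairs."""
    return sorted(hypotheses,
                  key=lambda h: (not looks_like_scheme(h[0]), -h[1]))
```

With this reranking, a recognizer's top hypothesis "(car x" would lose to a slightly lower-confidence "(car x)", since only the latter is plausible Scheme.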

Funding Sources

This work is supported by the MIT/Microsoft iCampus initiative.

References


[1] Abelson, H., Sussman, G., and Sussman, J., Structure and Interpretation of Computer Programs, MIT Press and McGraw-Hill, 2nd edition, 1996.

[2] Alvarado, C. and Davis, R., SketchREAD: A Multi-Domain Sketch Recognition Engine, in Proc. of UIST 2004.

[3] Anderson, Richard, Anderson, Ruth, Hoyer, C., Prince, C., Su, J., Videon, F., and Wolfman, S., A Study of Diagrammatic Ink in Lecture, University of Washington, University of Virginia, University of British Columbia, 2005.

[4] Anderson, R., Anderson, R., Simon, B., VanDeGrift, T., Wolfman, S., and Yasuhara, K., Experiences With a Tablet-PC-Based Lecture Presentation System, University of Washington, University of Virginia, University of British Columbia, 2004.

[5] Atallah, M. J. (Editor), Algorithms and Theory of Computation Handbook, “Levenshtein Distance (13-5)”, CRC Press, 1998.

[6] Bransford, J.D., Brown, A.L., and Cocking, R.R., How People Learn: Brain, Mind, Experience, and School, National Academy Press, Washington, D.C., 2003.

[7] Bressler, L., Lessons Learned: Findings from Ten Formative Assessments of Educational Initiatives at MIT (2000-2003), Report of the MIT Teaching and Learning Laboratory, 2004.

[8] Hake, R.R., Interactive-Engagement versus Traditional Methods: A Six-Thousand Student Survey of Mechanics Test Data for Introductory Physics Courses, American Journal of Physics, 66(1):64-74, 1998.

[9] Koile, K. and Singer, D., Development of a Tablet-PC-based System to Increase Instructor-Student Classroom Interactions and Student Learning, to appear in Proc. of WIPTE, 2006.

[10] LaViola, J. and Zeleznik, R., MathPad: A System for the Creation and Exploration of Mathematical Sketches, Brown University, 2005.


Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu