Error analysis
With the introduction of computers and word processors, the way text entry is performed has changed. In the past, with a typewriter, speed was measured with a stopwatch and errors were tallied by hand. With current technology, document preparation relies on the word processor as a composition aid, which changes the meaning of error rate and how it is measured. Research by R. William Soukoreff and I. Scott MacKenzie applied a well-known algorithm to this problem. Using this algorithm and an accompanying analysis technique, two statistics were used: minimum string distance error rate (MSD error rate) and keystrokes per character (KSPC). This technique has two advantages:
• Participants are allowed to enter text naturally, since they may commit errors and correct them.
• The identification of errors and generation of error rate statistics is easy to automate.
Deconstructing the text input process
Through analysis of keystrokes, the input stream was divided into four classes of keystroke: Correct (C), Incorrect Fixed (IF), Fixes (F), and Incorrect Not Fixed (INF). The classes are characterized as follows:
• The two classes Correct (C) and Incorrect Not Fixed (INF) together comprise all of the characters in the transcribed text.
• Fixes (F) keystrokes are easy to identify, and include keystrokes such as backspace, delete, cursor movements, and modifier keys.
• Incorrect Fixed (IF) keystrokes are found in the input stream but not in the transcribed text, and are not editing keys.
Using these classes, both the minimum string distance error rate and the keystrokes per character statistic can be calculated.
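The F and IF classes can be counted by replaying the input stream. The following is a minimal sketch, assuming backspace (written here as "\b") is the only editing key; the function name `classify_stream` and the returned dictionary are illustrative choices, not part of the original work. Separating C from INF requires a comparison against the presented text and is not attempted here.

```python
BACKSPACE = "\b"  # assumption: the only editing key handled in this sketch


def classify_stream(input_stream: str) -> dict:
    """Replay an input stream and count F and IF keystrokes."""
    buffer = []          # characters currently in the transcribed text
    fixes = 0            # F: editing keystrokes (backspace only, here)
    incorrect_fixed = 0  # IF: characters that were typed and later erased

    for key in input_stream:
        if key == BACKSPACE:
            fixes += 1
            if buffer:
                # The erased character is in the input stream but will not
                # be in the transcribed text, so it counts as IF.
                buffer.pop()
                incorrect_fixed += 1
        else:
            buffer.append(key)

    return {"transcribed": "".join(buffer), "F": fixes, "IF": incorrect_fixed}


result = classify_stream("the quix\bck brown")
print(result)
```

Cursor movements, delete, and modifier keys would need their own handling, which depends on how the logging tool records them.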
Minimum string distance error rate
The minimum string distance (MSD) is the minimum number of "primitives" (insertions, deletions, or substitutions) needed to transform one string into another. In terms of the keystroke classes, the MSD error rate is:
MSD Error Rate = (INF / (C + INF)) * 100%
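The minimum string distance is the same quantity as the Levenshtein distance and can be computed with dynamic programming. The sketch below is one way to do so. It assumes that INF is taken as the MSD between the presented and transcribed text, and that C is estimated as the length of the longer string minus INF. That estimate is an assumption of this example and is not stated in this section.

```python
def msd(a: str, b: str) -> int:
    """Minimum string distance (Levenshtein distance) between a and b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def msd_error_rate(presented: str, transcribed: str) -> float:
    """MSD error rate in percent: INF / (C + INF) * 100."""
    inf = msd(presented, transcribed)
    # Assumed estimate of C: longer string length minus uncorrected errors.
    c = max(len(presented), len(transcribed)) - inf
    return 100 * inf / (c + inf) if (c + inf) else 0.0


print(msd("the quick brown", "the quick brpwn"))  # one substitution
print(msd_error_rate("the quick brown", "the quick brpwn"))
```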
Keystrokes per character (KSPC)
With the MSD error rate, errors that are corrected are not counted, because they do not appear in the transcribed text. The following example shows why this can be an important class of errors to consider:
Presented Text: the quick brown
Input Stream: the quix←ck brown
Transcribed Text: the quick brown
In the above example, the incorrect character ('x') was deleted with a backspace ('←'). Because the corrected error does not appear in the transcribed text, the MSD error rate is 0%. The KSPC statistic accounts for such keystrokes:
KSPC = (C + INF + IF + F) / (C + INF)
There are some shortcomings of the KSPC statistic:
• A high KSPC value can result either from many errors that were corrected efficiently or from few errors that were corrected at a high cost in keystrokes; the statistic alone cannot distinguish the two.
• KSPC depends on the text input method, so it cannot be used to meaningfully compare two different input methods, such as a QWERTY keyboard and a multi-tap device.
• There is no obvious way to combine KSPC and the MSD error rate into an overall error rate, even though they have an inverse relationship.
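As a worked illustration, KSPC = (C + INF + IF + F) / (C + INF) can be evaluated for the "the quick brown" example, in which one wrong character was typed and then removed with one backspace. The function name `kspc` and the counts passed to it are this example's own reading of that input stream: 15 correct characters, no uncorrected errors, one erased character, and one fix keystroke.

```python
def kspc(c: int, inf: int, if_: int, f: int) -> float:
    """Keystrokes per character: all keystrokes over transcribed characters."""
    transcribed_chars = c + inf
    if transcribed_chars == 0:
        raise ValueError("KSPC is undefined for an empty transcribed text")
    return (c + inf + if_ + f) / transcribed_chars


# 15 correct characters, the erased 'x' (IF), and one backspace (F).
value = kspc(c=15, inf=0, if_=1, f=1)
print(round(value, 3))  # 17 / 15, prints 1.133
```

A value of exactly 1.0 would mean that no keystrokes were spent on erased characters or fixes, although for input methods such as multi-tap the baseline is above 1 even without errors.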
Further metrics
Using the classes described above, R. William Soukoreff and I. Scott MacKenzie defined further metrics.
Error correction efficiency refers to the ease with which the participant performed error correction:
• Correction Efficiency = IF / F
Participant conscientiousness is the ratio of corrected errors to the total number of errors, which helps distinguish perfectionists from apathetic participants:
• Participant Conscientiousness = IF / (IF + INF)
If C represents the amount of useful information transferred, then INF, IF, and F represent wasted bandwidth, which gives the following proportions:
• Utilized Bandwidth = C / (C + INF + IF + F)
• Wasted Bandwidth = (INF + IF + F) / (C + INF + IF + F)
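The four ratios above can be sketched as small functions over a record of keystroke counts. The class `KeystrokeCounts`, its field names, the sample counts, and the choice to return `None` when a denominator is zero are all assumptions of this example, not part of the original definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KeystrokeCounts:
    """Keystroke tallies for one trial; field names are this sketch's own."""
    c: int    # Correct
    inf: int  # Incorrect Not Fixed
    if_: int  # Incorrect Fixed ("if" is a Python keyword, hence the underscore)
    f: int    # Fixes


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator else None


def correction_efficiency(k: KeystrokeCounts) -> Optional[float]:
    return _ratio(k.if_, k.f)


def participant_conscientiousness(k: KeystrokeCounts) -> Optional[float]:
    return _ratio(k.if_, k.if_ + k.inf)


def utilized_bandwidth(k: KeystrokeCounts) -> Optional[float]:
    return _ratio(k.c, k.c + k.inf + k.if_ + k.f)


def wasted_bandwidth(k: KeystrokeCounts) -> Optional[float]:
    return _ratio(k.inf + k.if_ + k.f, k.c + k.inf + k.if_ + k.f)


sample = KeystrokeCounts(c=90, inf=2, if_=6, f=8)  # made-up counts
print(correction_efficiency(sample))          # 6 / 8
print(participant_conscientiousness(sample))  # 6 / (6 + 2)
print(utilized_bandwidth(sample), wasted_bandwidth(sample))
```

Utilized and wasted bandwidth share a denominator, so for any trial with at least one keystroke they sum to 1.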
Total error rate
The classes described above also provide an intuitive definition of total error rate:
• Total Error Rate = ((INF + IF) / (C + INF + IF)) * 100%
• Not Corrected Error Rate = (INF / (C + INF + IF)) * 100%
• Corrected Error Rate = (IF / (C + INF + IF)) * 100%
Since these three error rates are ratios, they are comparable between different devices, something that cannot be done with the KSPC statistic, which is device dependent.
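Because the three rates share the denominator C + INF + IF, the total error rate is the sum of the corrected and not corrected rates. The sketch below shows this with made-up counts; the function name `error_rates` and the dictionary keys are illustrative.

```python
def error_rates(c: int, inf: int, if_: int) -> dict:
    """Return the total, not corrected, and corrected error rates in percent."""
    denominator = c + inf + if_
    if denominator == 0:
        raise ValueError("error rates are undefined when no characters were entered")
    return {
        "total": 100 * (inf + if_) / denominator,
        "not_corrected": 100 * inf / denominator,
        "corrected": 100 * if_ / denominator,
    }


# Made-up counts: 90 correct, 2 uncorrected errors, 8 corrected errors.
rates = error_rates(c=90, inf=2, if_=8)
print(rates)
```

Fix keystrokes (F) do not enter these rates, since the denominator counts only C, INF, and IF.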
Tools for text entry research
Currently, two tools are publicly available for text entry researchers to record text entry performance metrics. The first is TEMA, which runs only on the Android operating system. The second is WebTEM, which runs on any device with a modern Web browser and works with almost all text entry techniques.
==Keystroke dynamics==