Microsoft was involved in speech recognition and
speech synthesis research for many years before WSR. In 1993, Microsoft hired
Xuedong Huang from
Carnegie Mellon University to lead its speech development efforts; the company's research led to the development of the
Speech API (SAPI) introduced in 1994. Speech recognition had also been used in previous Microsoft products.
Office XP and
Office 2003 provided speech recognition capabilities in
Internet Explorer and
Microsoft Office applications; they also enabled limited speech functionality in
Windows 98,
Windows Me,
Windows NT 4.0, and
Windows 2000.
Windows XP Tablet PC Edition 2002 included speech recognition capabilities with the Tablet PC Input Panel, and
Microsoft Plus! for Windows XP enabled voice commands for Windows Media Player. However, these all required installation of speech recognition as a separate component; before Windows Vista, Windows did not include integrated or extensive speech recognition.
===Windows Vista===
[Image: Windows Vista (then known as "Longhorn") build 4093]
At WinHEC 2002, Microsoft announced that Windows Vista (codenamed "Longhorn") would include advances in speech recognition and in features such as
microphone array support as part of an effort to "provide a consistent quality audio infrastructure for natural (continuous) speech recognition and (discrete) command and control."
Bill Gates stated during
PDC 2003 that Microsoft would "build speech capabilities into the system — a big advance for that in 'Longhorn,' in both recognition and synthesis, real-time"; and pre-release builds during the
development of Windows Vista included a speech engine with training features. A PDC 2003 developer presentation stated Windows Vista would also include a user interface for microphone feedback and control, and user configuration and training features. Microsoft clarified the extent to which speech recognition would be integrated when it stated in a pre-release
software development kit that "the common speech scenarios, like speech-enabling menus and buttons, will be enabled system-wide." During WinHEC 2004 Microsoft included WSR as part of a strategy to improve productivity on mobile PCs. Microsoft later emphasized
accessibility, new mobility scenarios, support for additional languages, and improvements to the speech user experience at WinHEC 2005. Unlike the speech support included in Windows XP, which was integrated with the Tablet PC Input Panel and required switching between separate Commanding and Dictation modes, Windows Vista would introduce a dedicated interface for speech input on the desktop and would unify the separate speech modes; users previously could not speak a command after dictating or vice versa without first switching between these two modes. Windows Vista Beta 1 included integrated speech recognition. To incentivize company employees to analyze WSR for software
glitches and to provide feedback, Microsoft offered an opportunity for its testers to win a Premium model of the
Xbox 360. During a Microsoft demonstration on July 27, 2006, before Windows Vista's
release to manufacturing (RTM), several consecutive attempts to dictate produced recognition errors, resulting in the unintended output "Dear aunt, let's set so double the killer delete select all". The incident drew significant derision from analysts and journalists in the audience, even though a separate demonstration of application management and navigation was successful. Reports from early 2007 indicated that WSR was vulnerable to attackers who could perform malicious operations by playing certain audio commands through a target's speakers; it was the first vulnerability discovered after Windows Vista's
general availability. Microsoft stated that although such an attack is theoretically possible, a number of mitigating factors and prerequisites would limit its effectiveness or prevent it altogether: a target would need the recognizer to be active and configured to properly interpret such commands; microphones and speakers would both need to be enabled and at sufficient volume levels; and an attack would require the computer to perform visible operations and produce audible feedback without users noticing.
User Account Control would also prohibit the occurrence of privileged operations.
===Windows 7===
WSR was updated to use Microsoft UI Automation, and its engine now uses the WASAPI audio stack; these changes substantially enhance its performance and enable support for echo cancellation, respectively. The document harvester, which can analyze and collect text in email and documents to contextualize user terms, has improved performance and now runs periodically in the background instead of only after recognizer startup. Sleep mode has also seen performance improvements and, to address security issues, the recognizer is now turned off by default, rather than suspended, after users say "stop listening". Windows 7 also introduces an option to submit speech training data to Microsoft to improve future recognizer versions. A new dictation scratchpad interface functions as a temporary document into which users can dictate or type text for insertion into applications that are not compatible with the
Text Services Framework.
===Windows 8.x and Windows RT===
WSR can be used to control the
Metro user interface in Windows 8, Windows 8.1, and Windows RT with commands to open the
Charms bar ("Press Windows C"); to dictate or display commands in
Metro-style apps ("Press Windows Z"); to perform tasks in apps (e.g., "Change to Celsius" in
MSN Weather); and to display all installed apps listed by the
Start screen ("Apps").
===Windows 10===
WSR is featured in the
Settings application starting with the Windows 10 April 2018 Update (
Version 1803); the change first appeared in
Insider Preview Build 17083. The April 2018 Update also introduces a new Win+Ctrl+S keyboard shortcut to activate WSR.
===Windows 11===
In Windows 11 version 22H2, Microsoft added a second app, Voice Access, alongside WSR. In December 2023, Microsoft announced that WSR is deprecated in favor of Voice Access and may be removed in a future build or release of Windows.

==Overview and features==