Speech Recognition (Shared)

by Rob Jeffs, February 2017

The code below demonstrates Command and Control speech recognition using a Shared Recognizer, via the API route (not COM automation/ActiveX). Confusingly, though, the unavoidable first step is to create an instance of a speech recognizer object using COM!

Microsoft's online documentation provided the basis for this demonstration, including examples of XML grammar files. The documentation lists each object's methods in VTable order, which determines the offsets used in the SYS commands below. Be aware, however, that a method table also includes the methods inherited from other objects, and these count towards the offset.
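
As a worked example of that offset arithmetic, here is a minimal sketch. It assumes a 32-bit process, where each VTable slot is a 4-byte pointer, and the slot numbers are counted from Microsoft's method tables for ISpRecoContext, including the methods it inherits. FNvtoffset is a hypothetical helper, not part of COMLIB.

```bbcbasic
REM Slots count from zero and include every inherited method.
REM For ISpRecoContext: IUnknown fills slots 0-2, ISpNotifySource
REM slots 3-9, ISpEventSource slots 10-12, and the interface's own
REM methods start at slot 13.
PRINT "GetEvents offset:     "; FNvtoffset(11)
PRINT "CreateGrammar offset: "; FNvtoffset(14)
END

REM Hypothetical helper: byte offset of a VTable slot (4 bytes per slot)
DEF FNvtoffset(slot%) = slot% * 4
```

The two results are 44 and 56, the offsets applied to rc% in the listing below.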

When the code is run, the operating system may open a set-up wizard if speech recognition hasn't been used before. Because a shared recognizer is used, the desktop and other applications may also be listening for commands spoken by the user. A small 'User Interface' should appear, through which the microphone can be turned on and off. A related menu can be accessed via the icon bar, and it may be necessary to set the speech recognition language to English-US via Control Panel / Advanced Speech Options. (The best results were obtained using a plug-in microphone, without training.)

Note: IIDFromString and CLSIDFromString perform identical functions here: each converts a GUID string into its 16-byte binary form.

   INSTALL @lib$+"COMLIB"
   PROC_cominit
   SYS "GetModuleHandle","OLE32.DLL" TO O%
   SYS "GetProcAddress",O%,"IIDFromString" TO `IIDFromString`
   SYS "GetProcAddress",O%,"CoTaskMemFree" TO `CoTaskMemFree`
 
   DIM spevent{idpt%,stream%,audio_lo%,audio_hi%,wparam%,lparam%}:REM SPEVENT structure
   DIM multib% 255:REM For converting 8 bit ASCII to 16 bit
 
   ON CLOSE PROCcleanup:QUIT
   ON ERROR PROCcleanup:REPORT:END
 
   sr%=0:rc%=0:grammar%=0:REM Object pointers
   PROCinit_speech_recog(@dir$+"GrammarColor.xml","ruleColors")
   text%=0:REM Pointer to text returned from Recognition Result
 
   PRINT"Listening..."
   REPEAT
 
     REM Poll the speech recognition event queue
     spevent.lparam%=0
     REM GetEvents by calling RecoContext method
     SYS !(!rc%+44),rc%,1,spevent{},0
     IF spevent.lparam%<>0 THEN
       REM GetText by calling a Result method (Result is object in lparam%)
       SYS !(!spevent.lparam%+20),spevent.lparam%,-1,-1,0,^text%,0
       REM Make string from GetText
       T$="":a=0
       REPEAT
         c=text%?a:a+=2:IF c<>0 THEN T$+=CHR$(c)
       UNTIL c=0
       PRINT T$
       REM Release memory used for GetText
       SYS `CoTaskMemFree`,text%
       REM lparam% contains object pointer to RecoResult
       PROC_releaseobject(spevent.lparam%)
     ENDIF
 
     WAIT 4
   UNTIL FALSE
   END
 
   DEF PROCinit_speech_recog(P$,R$)
   LOCAL C%,I%
   REM Create Shared Recognizer
   DIM C% LOCAL 15,I% LOCAL 15
   PROCconvert_to_wide("{C2B5F241-DAA0-4507-9E16-5A1EAA2B7A5C}"):REM IID_ISpRecognizer
   SYS `IIDFromString`,multib%,I%
   PROCconvert_to_wide("{3BEE4890-4FE9-4A37-8C1E-5E7E12791C1F}"):REM CLSID_SpSharedRecognizer
   SYS `CLSIDFromString`,multib%,C%
   SYS `CoCreateInstance`,C%,0,15,I%,^sr%:REM CLSCTX_ALL
   REM Create RecoContext by calling a SharedRecognizer method
   SYS !(!sr%+48),sr%,^rc%
   REM Create Grammar object by calling a RecoContext method
   SYS !(!rc%+56),rc%,0,0,^grammar%
   REM Load Grammar file (xml) by calling a Grammar method
   PROCconvert_to_wide(P$)
   SYS !(!grammar%+52),grammar%,multib%,0
   REM Set Rule State to active by calling a Grammar method
   PROCconvert_to_wide(R$)
   SYS !(!grammar%+72),grammar%,multib%,0,1
   ENDPROC
 
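   DEF PROCconvert_to_wide(T$)
   REM multib% is 256 bytes, room for 127 wide characters plus a NUL
   IF LEN(T$)>127 THEN ERROR 100,"String too long for multib%"
   SYS "MultiByteToWideChar",0,0,T$,-1,multib%,LEN(T$)+1
   ENDPROC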
 
   DEF PROCcleanup
   ON ERROR OFF
   IF grammar%<>0 THEN PROC_releaseobject(grammar%)
   IF rc%<>0 THEN PROC_releaseobject(rc%):REM RecoContext
   IF sr%<>0 THEN PROC_releaseobject(sr%):REM Shared Recognizer
   PROC_comexit
   ENDPROC
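
The loop that builds T$ from the GetText result could be factored into a reusable function. The sketch below is one way to do it. FNwide_to_ansi is a hypothetical name, and the function assumes the text is a NUL-terminated string of 16-bit characters, in which anything outside the 8-bit range can be shown as a question mark.

```bbcbasic
REM Hypothetical helper: copy a NUL-terminated 16-bit string at ptr%
REM into an ordinary BASIC string
DEF FNwide_to_ansi(ptr%)
LOCAL s$, c%
REPEAT
  c% = ?ptr% + 256 * ptr%?1 : REM Read both bytes of the character
  IF c% > 255 THEN s$ += "?" ELSE IF c% <> 0 THEN s$ += CHR$(c%)
  ptr% += 2
UNTIL c% = 0
= s$
```

With this in place, the string-building loop in the main program could be replaced by PRINT FNwide_to_ansi(text%), followed by the CoTaskMemFree call as before.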

Here is the XML source required for recognizing the names of the BASIC colours. It can be entered in Notepad and should be saved as GrammarColor.xml in the same folder as the program, since the code loads it from @dir$.

  <grammar version="1.0" xml:lang="en-UK" mode="voice" root="ruleColors" xmlns="http://www.w3.org/2001/06/grammar" tag-format="semantics/1.0">
 
  <rule id="ruleColors" scope="public">
  <one-of>
  <item> red </item>
  <item> green </item>
  <item> yellow </item>
  <item> blue </item>
  <item> magenta </item>
  <item> cyan </item>
  <item> white </item>
  <item> black </item>
  </one-of>
  </rule>
 
  </grammar>
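
To act on a recognized word rather than just print it, the text can be mapped to a BASIC colour number. This is a sketch only: FNcolour_number is a hypothetical helper. It assumes the recognizer returns each word in lower case exactly as written in the grammar, and that the standard colour numbering 0 to 7 applies in the current screen mode.

```bbcbasic
REM Hypothetical helper: colour number for a recognized word, or -1
DEF FNcolour_number(word$)
LOCAL n%
n% = -1
CASE word$ OF
  WHEN "black":   n% = 0
  WHEN "red":     n% = 1
  WHEN "green":   n% = 2
  WHEN "yellow":  n% = 3
  WHEN "blue":    n% = 4
  WHEN "magenta": n% = 5
  WHEN "cyan":    n% = 6
  WHEN "white":   n% = 7
ENDCASE
= n%
```

In the main loop, PRINT T$ could then be followed by something like: n% = FNcolour_number(T$) : IF n% >= 0 THEN COLOUR n%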
speech_20recognition_20_28shared_29.txt · Last modified: 2024/01/05 00:21 by 127.0.0.1