Transcript Document
Language Model Grammar Conversion
Wesley Holland, Julie Baca, Dhruva Duncan, Joseph Picone
Center for Advanced Vehicular Systems
Mississippi State University

Speech Recognition
• Acoustic Model
  • Maps audio data to words or phonemes
• Language Model
  • Specifies order in which a sequence of words or phonemes is likely to occur
  • Described using grammar

Language Model Grammar Conversion Page 1 of 10

Grammar Specifications
• Backus-Naur Form (BNF)
• Augmented BNF (ABNF)
• JSpeech Grammar Format (JSGF)
• Speech Recognition Grammar Specification (SRGS)
• ISIP Hierarchical Digraph (IHD)

BNF        <A>::=aB   <B>::=bB   <B>::=ε
ABNF       <A>::=ab*
JSGF       <A>=a(b)*;
XML-SRGS   a <item repeat="0-"> b </item>
IHD        [figure not preserved in this transcript]

Conversion Design
• Goals
  • JSGF ↔ IHD
  • XML-SRGS ↔ IHD
  • Determination of equivalence
  • Grammar minimization
• Final Architecture
  [Figure: conversion architecture linking XML-SRGS, JSGF, ABNF, BNF, and IHD]

JSGF/XML-SRGS → ABNF
• JSGF → ABNF
  • Trivial
  • Similar in syntax and structure to ABNF
• XML-SRGS → ABNF
  • Harder than JSGF
  • Different in syntax and structure from ABNF
  • Requires enumeration of certain repeat attributes

XML-SRGS                           ABNF
<item repeat="1-2"> a b </item>    <S>::=(ab)|(abab)
<item repeat="2-"> a b </item>     <S>::=abab(ab)*

JSGF/XML-SRGS → ABNF
• XML-SRGS → ABNF (continued)
  • Different weighting mechanisms (weight and repeat-prob attributes)

a <item repeat="0-" repeat-prob=".45"> b </item>

<one-of>
  <item weight=".4">c</item>
  <item weight=".6">d</item>
</one-of>

ABNF → BNF
• Normalized BNF
  • Consists of rules of the following formats:
    • (RULE_NAME)::=(TERMINAL),(NON_TERMINAL)
    • (RULE_NAME)::=(NON_TERMINAL)
    • (RULE_NAME)::=ε
• ABNF → BNF
  • Complicated
  • Accomplished using a recursive algorithm that extracts sets of normalized BNF rules from a set of ABNF rules
    1. Break rule into multiple rules at each top-level alternation. Recurse on each rule.
    2. For each concatenation, Kleene star, or Kleene plus, extract a set of left symbols and a set of right symbols.
    3. For n left symbols and m right symbols, create n × m connecting rules.

BNF ↔ IHD
• BNF ↔ IHD
  • Each arc translates to a normalized BNF rule
  • Terminals correspond to nodes; concatenations correspond to arcs

BNF
RS→R0      R3→C,R3
RS→R1      R3→C,RT
R0→A,R3    RT→ε
R1→B,R3

IHD
Nodes    Arcs
1: A     (S,1)   (2,3)
2: B     (S,2)   (3,3)
3: C     (1,3)   (3,T)

BNF → JSGF/XML-SRGS
• BNF → JSGF/XML-SRGS
  • Rule-by-rule
  • Trivial

BNF
<A>::=aB
<B>::=bB
<B>::=ε

JSGF
<A>=aB;
<B>=b*;

XML-SRGS
<rule id="a">
  a <ruleref uri="#b"/>
</rule>
<rule id="b">
  <one-of>
    <item> b <ruleref uri="#b"/> </item>
    <item> <ruleref special="NULL"/> </item>
  </one-of>
</rule>

Software Tools
• ISIP Network Converter
  • Console tool to perform conversions to and from arbitrary grammar formats
• ISIP Network Builder
  • Java-based graphical tool to design grammars as finite state machines
  • Can export grammars to JSGF, XML-SRGS, ABNF, BNF, and IHD
• ISIP Language Model Tester
  • Console tool for testing of grammars
  • Can generate valid sentences in a given grammar
  • Can parse sentences and determine if accepted by a given grammar

Minimization
• Minimization
  • Happens in BNF
  • Iterate over rule set, merging redundant rules
  • Rules can be merged if the non-terminals of both rules reference the same terminal
  • Example: [figure not preserved in this transcript]
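
The repeat enumeration described on the JSGF/XML-SRGS → ABNF slide can be sketched in a few lines of Python. This is only an illustration, not the ISIP Network Converter's code: the function name expand_repeat and the treatment of the item body as a plain string are assumptions made here. A bounded range is enumerated as alternatives, and an open-ended range becomes the mandatory copies followed by a Kleene star, which reproduces both rows of the slide's table.

```python
def expand_repeat(body: str, repeat: str) -> str:
    """Rewrite an SRGS repeat range ("n", "m-n" or "m-") as an ABNF-style expression.

    Hypothetical helper for illustration; `body` is the already-converted
    item content, e.g. "ab" for <item> a b </item>.
    """
    low_text, dash, high_text = repeat.partition("-")
    low = int(low_text)

    if dash and not high_text:
        # Open-ended "m-": m mandatory copies, then zero or more extra copies.
        return body * low + f"({body})*"

    # Exact count "n" is treated as the range n-n.
    high = int(high_text) if dash else low
    if high < low:
        raise ValueError(f"invalid repeat range: {repeat!r}")

    # Bounded range: enumerate every allowed repetition count.
    # Zero repetitions are written as ε (an assumption of this sketch).
    alternatives = [body * n if n else "ε" for n in range(low, high + 1)]
    if len(alternatives) == 1:
        return alternatives[0]
    return "|".join(f"({alt})" for alt in alternatives)


print(expand_repeat("ab", "1-2"))  # (ab)|(abab)
print(expand_repeat("ab", "2-"))   # abab(ab)*
```

Wide bounded ranges grow linearly in the number of alternatives under this scheme, which is one reason the slide singles out repeat attributes as the harder part of the XML-SRGS conversion.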
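
The arc-to-rule correspondence on the BNF ↔ IHD slide can be illustrated with a short Python sketch. It is one reading of that slide under stated assumptions, not the authors' implementation: ihd_to_bnf is a hypothetical helper, the start and terminal nodes are assumed to be named "S" and "T", and rule names are derived directly from node ids (R1, R2, R3), so they differ from the slide's R0, R1, R3 labels while keeping the same structure.

```python
def ihd_to_bnf(nodes, arcs):
    """Turn an IHD graph into normalized BNF rules, one rule per arc.

    nodes: dict mapping node id -> terminal symbol, e.g. {1: "A"}
    arcs:  list of (source, destination) pairs; "S" is the start node
           and "T" the terminal node (naming assumed for this sketch).
    """
    rules = []
    for src, dst in arcs:
        lhs = f"R{src}"
        target = f"R{dst}"
        if src == "S":
            # The start node emits no terminal: RULE ::= NON_TERMINAL
            rules.append(f"{lhs}→{target}")
        else:
            # Leaving a node emits its terminal: RULE ::= TERMINAL,NON_TERMINAL
            rules.append(f"{lhs}→{nodes[src]},{target}")
    # The terminal node closes the derivation: RULE ::= ε
    rules.append("RT→ε")
    return rules


# Graph from the BNF ↔ IHD slide.
slide_nodes = {1: "A", 2: "B", 3: "C"}
slide_arcs = [("S", 1), ("S", 2), (1, 3), (2, 3), (3, 3), (3, "T")]

for rule in ihd_to_bnf(slide_nodes, slide_arcs):
    print(rule)
```

For the slide's three-node graph this yields seven rules: one per arc plus the terminating RT→ε, matching the count of BNF rules shown on the slide.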