1 | Achieving Spoken Communication with Computers | 3 |
1.1 | Problem Solving Environment: Task-Oriented Dialogs | 6 |
1.2 | Integrating Dialog with Task Assistance: The Target Behaviors | 7 |
1.2.1 | Problem Solving to Achieve a Goal | 8 |
1.2.2 | Subdialogs and Effective Movement Between Them | 8 |
1.2.3 | Accounting for User Knowledge and Abilities | 10 |
1.2.4 | Expectation of User Input | 11 |
1.2.5 | Variable Initiative | 11 |
1.2.6 | Integrated Behavior Via the Missing Axiom Theory | 12 |
1.3 | Preliminary Study | 13 |
1.4 | An Outline of the Book | 13 |
2 | Foundational Work in Integrated Dialog Processing | 15 |
2.1 | Problem Solving in an Interactive Environment | 15 |
2.2 | Language Use in a Problem-Solving Environment | 16 |
2.2.1 | The Missing Axiom Theory | 16 |
2.2.2 | Speech Act Theory | 17 |
2.2.3 | Computational Speech Act Theory: Analyzing Intentions | 18 |
2.2.4 | Differing Subdialog Purposes: The Plan-Based Theory of Litman and Allen | 21 |
2.2.5 | Collective Intentions | 22 |
2.3 | User Model | 23 |
2.3.1 | General User Modeling Architecture | 24 |
2.3.2 | Using User Model Information in Generation | 26 |
2.3.3 | Acquiring User Model Information | 27 |
2.4 | Expectation Usage | 29 |
2.4.1 | Speech Recognition | 29 |
2.4.2 | Plan Recognition | 29 |
2.5 | Variable Initiative Theory | 31 |
2.5.1 | Defining Initiative | 31 |
2.5.2 | Discourse Structure in Variable Initiative Dialogs | 32 |
2.5.3 | Plan Recognition for Variable Initiative Dialog | 32 |
2.6 | Integrated Dialog Processing Theory | 33 |
2.6.1 | Subdialog Switching: Reichman's Conversational Moves | 33 |
2.6.2 | Beyond Speech Acts: Conversation Acts of Traum and Hinkelman | 35 |
2.6.3 | Integrated Discourse Structure: The Tripartite Model of Grosz and Sidner | 36 |
2.7 | Dialog Systems | 38 |
2.7.1 | Requirements | 39 |
2.7.2 | Portable Systems | 39 |
2.7.3 | Question-Answer Systems: Keyboard Input | 42 |
2.7.4 | Spoken Input Systems | 42 |
2.7.5 | A Discourse System | 44 |
2.7.6 | Variable Initiative Systems | 45 |
2.8 | Summary | 46 |
3 | Dialog Processing Theory | 47 |
3.1 | System Architecture | 47 |
3.2 | Modeling Interactive Task Processing | 51 |
3.2.1 | Computer and User Prerequisites | 51 |
3.2.2 | A Domain-Independent Language for Describing Goals, Actions, and States | 52 |
3.2.3 | Robust Selection of Task Steps | 54 |
3.2.4 | Determining Task Step Completion | 55 |
3.2.5 | What About Dialog? | 57 |
3.3 | Integrating Task Processing with Dialog: The Missing Axiom Theory | 57 |
3.3.1 | The Role of Language: Supplying Missing Axioms | 58 |
3.3.2 | Interruptible Theorem Proving Required ⇒ IPSIM | 58 |
3.4 | Exploiting Dialog Context: User Model | 59 |
3.4.1 | Accounting for User Knowledge and Abilities | 59 |
3.4.2 | Computing Inferences from User Input | 60 |
3.4.3 | User Model Usage: Integrating Task Processing with Dialog | 60 |
3.5 | Exploiting Dialog Context: Input Expectations | 63 |
3.5.1 | Foundations of Expectation-Driven Processing | 63 |
3.5.2 | Using Expectation-Driven Processing | 64 |
3.6 | A Theory of Variable Initiative Dialog | 68 |
3.6.1 | Defining Variable Initiative and Dialog Mode | 68 |
3.6.2 | Response Formulation in Variable Initiative Dialog | 70 |
3.7 | Putting the Pieces Together | 72 |
3.7.1 | What Is a Dialog? | 72 |
3.7.2 | Integrated Theory | 73 |
4 | Computational Model | 75 |
4.1 | Dialog Processing Algorithm | 75 |
4.1.1 | Motivation and Basic Steps | 75 |
4.1.2 | Tracing the Basic Steps | 77 |
4.2 | Receiving Suggestion from Domain Processor | 78 |
4.3 | Selection of Next Goal | 79 |
4.4 | Attempting Goal Completion | 81 |
4.4.1 | Step 2a: Attempt to Prove Completion | 87 |
4.4.2 | Step 2b: Computing Final Utterance Specification | 88 |
4.4.3 | Step 2c: Computing Expectations for the User's Response | 89 |
4.4.4 | Step 2d: Receiving User Input | 94 |
4.4.5 | Step 2e: Computing World Interpretation | 95 |
4.4.6 | Steps 2f and 2g: Updating Context and Discourse Structure | 96 |
4.4.7 | Step 2h: Computing Inferences from the Input | 97 |
4.4.8 | Step 2i: Selecting Applicable Axiom | 97 |
4.5 | Updating System Knowledge | 101 |
4.6 | Determine Next Domain Processor Operation | 102 |
4.7 | Solutions to Dialog Processing Problems | 103 |
4.7.1 | Interrupts | 103 |
4.7.2 | Robustness and the Handling of Speech Recognition Errors | 115 |
4.7.3 | Variable Initiative Dialog | 117 |
4.8 | Integrated Dialog Processing: A Summary | 119 |
5 | Parsing | 121 |
5.1 | Introduction | 121 |
5.2 | Overview of the Parser | 123 |
5.3 | The Parser Input Lattice | 125 |
5.3.1 | What Is in a Word? | 125 |
5.3.2 | Uncertain Inputs | 126 |
5.3.3 | Arc Weights | 127 |
5.3.4 | Indexing Lattice Nodes | 128 |
5.3.5 | Inputs Used in the Experiments | 129 |
5.4 | Translation Grammars | 130 |
5.5 | Minimum Distance Translation | 132 |
5.5.1 | Distance Between Strings | 132 |
5.5.2 | A Precise Definition of What the MDT Algorithm Does | 133 |
5.6 | An Efficient Algorithm for MDT | 135 |
5.6.1 | Data Structures Used by MDT | 135 |
5.6.2 | The Outer Procedure | 136 |
5.6.3 | The Inner Procedure | 137 |
5.6.4 | An Important Optimization | 141 |
5.7 | Enhancements to the MDT Algorithm | 142 |
5.7.1 | Lexicon Dependent Deletion and Insertion Costs | 142 |
5.7.2 | Grammar Dependent Insertion Costs | 143 |
5.8 | Expectation Processing | 144 |
5.8.1 | Wildcards | 144 |
5.8.2 | Wildcard String Matching | 145 |
5.8.3 | Enhancements to the Minimum Matching String Algorithm | 148 |
5.8.4 | Wildcard String Matching Versus Unification | 149 |
5.8.5 | Expectation Based Hypothesis Selection | 149 |
5.8.6 | The Expectation Function | 149 |
5.9 | Computational Complexity | 151 |
5.9.1 | Notation | 151 |
5.9.2 | The Complexity of Input Lattice Node Renumbering | 151 |
5.9.3 | The Complexity of MDT | 151 |
5.9.4 | The Complexity of Expectation Processing | 153 |
5.9.5 | Overall Parser Complexity | 153 |
6 | System Implementation | 155 |
6.1 | Knowledge Representation | 156 |
6.1.1 | Prolog | 156 |
6.1.2 | GADL | 156 |
6.1.3 | snf | 156 |
6.1.4 | Sef | 156 |
6.1.5 | IPSIM | 157 |
6.1.6 | Discourse Structure | 158 |
6.1.7 | Axioms | 159 |
6.1.8 | Interfaces | 160 |
6.2 | Domain Processor | 160 |
6.2.1 | Debugging Methodology | 161 |
6.2.2 | Decision Making Strategies | 165 |
6.2.3 | Debugging Control Strategy Modifications for Dialog | 170 |
6.3 | Generation | 178 |
6.3.1 | Overview | 178 |
6.3.2 | Natural Language Directions for Locating Objects | 178 |
6.4 | Resource Utilization | 179 |
7 | Experimental Results | 181 |
7.1 | Hypotheses | 181 |
7.2 | Preliminary Results | 181 |
7.3 | Experimental Design | 184 |
7.3.1 | Overview | 184 |
7.3.2 | Problem Selection | 185 |
7.3.3 | Session 1 Procedure | 186 |
7.3.4 | Session 2 Procedure | 190 |
7.3.5 | Session 3 Procedure | 192 |
7.4 | Experimental Setup | 192 |
7.5 | Subject Pool | 196 |
7.6 | Cumulative Results | 197 |
7.6.1 | Basic System Performance | 197 |
7.6.2 | Parameter Definitions | 197 |
7.6.3 | Aggregate Results | 199 |
7.6.4 | Results as a Function of Problem | 206 |
7.6.5 | Statistical Analysis of the Results | 210 |
7.7 | Results from Subject Responses about System Usage | 212 |
7.8 | Conclusions | 214 |
8 | Performance of the Speech Recognizer and Parser | 219 |
8.1 | Preparation of the Data | 219 |
8.2 | Speech Recognizer Performance | 221 |
8.2.1 | Comparison to Other Speech Recognizers | 223 |
8.2.2 | Comparison to Humans | 223 |
8.3 | Parser Performance | 224 |
8.4 | Optimal Expectation Functions | 227 |
9 | Enhanced Dialog Processing: Verifying Doubtful Inputs | 231 |
9.1 | Handling Misunderstandings | 231 |
9.2 | Deciding When to Verify | 232 |
9.2.1 | Confidence Estimates | 232 |
9.2.2 | Selecting a Verification Threshold | 237 |
9.3 | Experimental Results | 238 |
9.4 | Summary of Verification Subdialogs | 239 |
10 | Extending the State of the Art | 241 |
10.1 | Continuing Work | 241 |
10.1.1 | Automatic Switching of Initiative | 241 |
10.1.2 | Exploiting Dialog Context in Response Generation | 242 |
10.1.3 | Miscommunication and Metadialog | 244 |
10.1.4 | Less Restricted Vocabulary | 245 |
10.1.5 | Evaluating Model Applicability | 246 |
10.2 | Where Do We Go Next? | 247 |
A | The Goal and Action Description Language | 249 |
B | User's Guide for the Interruptible Prolog SIMulator (IPSIM) | 253 |
B.1 | Introduction | 253 |
B.2 | Specifying Rules and Axioms for IPSIM | 253 |
B.2.1 | Sample Specification and Description | 254 |
B.2.2 | Additional Requirements for the Specification | 254 |
B.2.3 | The Special Clauses of IPSIM | 256 |
B.3 | Using IPSIM | 257 |
B.3.1 | The IPSIM Command Language | 257 |
B.3.2 | The Use of Knowledge | 262 |
B.3.3 | A Sample Control Scheme | 262 |
B.4 | Creating Dynamic Lists of Missing Axioms | 262 |
B.4.1 | The Defaults | 262 |
B.4.2 | Redefining axiom_need | 262 |
B.5 | Using IPSIM Calls within Theorem Specifications | 264 |
C | Obtaining the System Software Via Anonymous FTP | 265 |
| Bibliography | 267 |
| Index | 279 |