A dialog system or conversational agent (CA) is a computer system intended to converse with a human, with a coherent structure. Dialog systems have employed text, speech, graphics, haptics, gestures and other modes for communication on both the input and output channels.
What does and does not constitute a dialog system may be debatable. The typical GUI wizard does engage in some sort of dialog, but it includes very few of the common dialog system components, and dialog state is trivial.
Components of dialog systems
There are many different architectures for dialog systems. Which components are included in a dialog system, and how those components divide up responsibilities, differ from system to system. Central to any dialog system is the dialog manager, a component that manages the state of the dialog and the dialog strategy. A typical activity cycle in a dialog system contains the following phases:
1. The user provides input (typically speech), which is converted to plain text by the system's input recognizer/decoder, which may include:
o automatic speech recognizer (ASR)
o gesture recognizer
o handwriting recognizer
2. The text is analyzed by a natural language understanding (NLU) unit, which may include:
o proper name identification
o part-of-speech tagging
o syntactic/semantic parser
3. The semantic information is analyzed by the dialog manager (see section below), along with a task manager that has knowledge of the specific task domain.
4. The dialog manager produces output using an output generator, which may include:
o natural language generator
o gesture generator
o layout engine
5. Finally, the output is rendered using an output renderer, which may include:
o text-to-speech engine (TTS)
o talking head
o robot or avatar
Dialog systems that are based on a text-only interface (e.g. text-based chat) contain only stages 2-4.
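As a rough illustration of the activity cycle above, the following Python sketch wires together stages 2 to 4 for a text-only system. The function names (`understand`, `manage`, `generate`, `respond`), the intents, and the keyword rules are all hypothetical and chosen only for illustration; a real NLU unit would use tagging and parsing rather than keyword lookup.

```python
import re
from dataclasses import dataclass, field


@dataclass
class DialogState:
    """Minimal dialog state: the running history of (speaker, content) turns."""
    history: list = field(default_factory=list)


def understand(text: str) -> dict:
    """Stage 2 (NLU): map raw text to a crude semantic frame using toy keyword rules."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if "hello" in tokens or "hi" in tokens:
        return {"intent": "greet"}
    if "weather" in tokens:
        return {"intent": "ask_weather"}
    return {"intent": "unknown"}


def manage(state: DialogState, frame: dict) -> dict:
    """Stage 3 (dialog manager): record the user's intent and choose a system action."""
    state.history.append(("user", frame["intent"]))
    actions = {"greet": "greet_back", "ask_weather": "report_weather"}
    return {"action": actions.get(frame["intent"], "ask_rephrase")}


def generate(action: dict) -> str:
    """Stage 4 (output generator): turn the chosen action into text via templates."""
    templates = {
        "greet_back": "Hello! How can I help?",
        "report_weather": "I cannot look up the weather in this sketch.",
        "ask_rephrase": "Sorry, could you rephrase that?",
    }
    return templates[action["action"]]


def respond(state: DialogState, user_text: str) -> str:
    """Run one activity cycle for a text-only system (stages 2 to 4)."""
    reply = generate(manage(state, understand(user_text)))
    state.history.append(("system", reply))
    return reply
```

Calling `respond(DialogState(), "Hi there")` passes the text through each stage in turn and returns the greeting template. Keeping each stage as a separate function mirrors the division of responsibilities described above, so any one stage could be swapped out without touching the others.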
Dialog manager
The dialog manager is the core component of the dialog system. It maintains the history of the dialog, adopts a certain dialog strategy (see below), retrieves the content (stored in files or databases), and decides on the best response to the user. The dialog manager maintains the dialog flow.
The design of the dialog manager has evolved over time. Common designs include:
• finite-state machine
• frame-based: The system has several slots to be filled. The slots can be filled in any order. This supports a mixed-initiative dialog strategy.
• information-state based
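The frame-based design can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the travel-booking slots, the regular expressions in `SLOT_PATTERNS`, and the wording in `PROMPTS` are all hypothetical, and the patterns are deliberately naive (for example, "to" followed by any word is taken as a destination).

```python
import re
from typing import Optional

# Hypothetical frame for a travel-booking task; slot names and patterns are illustrative.
SLOT_PATTERNS = {
    "origin": re.compile(r"\bfrom (\w+)"),
    "destination": re.compile(r"\bto (\w+)"),
    "day": re.compile(r"\bon (\w+)"),
}
PROMPTS = {
    "origin": "Where are you leaving from?",
    "destination": "Where are you going?",
    "day": "Which day do you want to travel?",
}


def fill_slots(frame: dict, utterance: str) -> dict:
    """Fill any slots mentioned in the utterance, in whatever order they appear."""
    updated = dict(frame)
    text = utterance.lower()
    for slot, pattern in SLOT_PATTERNS.items():
        match = pattern.search(text)
        if match:
            updated[slot] = match.group(1)
    return updated


def next_prompt(frame: dict) -> Optional[str]:
    """Return the prompt for the first unfilled slot, or None when the frame is complete."""
    for slot in SLOT_PATTERNS:
        if slot not in frame:
            return PROMPTS[slot]
    return None
```

Because `fill_slots` accepts several slots from one utterance, the user can volunteer the destination and day up front, and the system only asks about what is still missing. That is the property that lets a frame-based manager support mixed initiative.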
The dialog flow can follow one of the following strategies:
• System-initiative dialog: The system is in control and guides the dialog at each step.
• Mixed-initiative dialog: Users can barge in and change the dialog direction. The system follows the user request, but tries to direct the user back to the original course. This is the most commonly used dialog strategy in today's dialog systems.
• User-initiative dialog: The user takes the lead, and the system responds to whatever the user directs.
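One simple way to realize a system-initiative strategy is with a finite-state machine in which the system asks a fixed sequence of questions. The sketch below assumes a hypothetical pizza-ordering task; the state names and prompts in `STATES` are invented for illustration, and user answers are supplied as a list rather than read interactively.

```python
# Hypothetical states for a pizza-ordering dialog: each maps to (prompt, next state).
STATES = {
    "ask_size": ("What size would you like?", "ask_topping"),
    "ask_topping": ("Which topping would you like?", "confirm"),
    "confirm": ("Shall I place the order?", "done"),
}


def run_dialog(answers):
    """Walk the states in a fixed order, pairing each system prompt with the user's answer."""
    state, transcript = "ask_size", []
    answers = iter(answers)
    while state != "done":
        prompt, next_state = STATES[state]
        transcript.append((prompt, next(answers)))
        state = next_state
    return transcript
```

Here the system never deviates from its script: whatever the user says is simply recorded as the answer to the current question. Supporting mixed initiative would require the transition to depend on what the user said, not only on the current state.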
The dialog manager can be connected with an expert system to give the ability to respond with specific expertise.
Types of systems
Dialog systems fall into the following categories, which are listed here along a few dimensions. Many of the categories overlap and the distinctions may not be well established.
• by modality
o text-based
o spoken dialog system
o graphical user interface
o multi-modal
• by device
o telephone-based systems
o PDA systems
o in-car systems
o robot systems
o desktop/laptop systems (native, in-browser, or in-virtual machine)
o in-virtual environment
o robots
• by style
o command-based
o menu-driven
o natural language
o speech graffiti
• by initiative
o system initiative
o user initiative
o mixed initiative
Applications
Dialog systems can support a broad range of applications in business enterprises, education, government, healthcare, and entertainment. For example:
• Responding to customers' questions about products and services via a company’s website or intranet portal
• Customer service agent knowledge base: Allowing agents to type in a customer's question and guiding them to a response
• Guided selling: Facilitating transactions by providing answers and guidance in the sales process, particularly for complex products being sold to novice customers
• Help desk: Responding to internal employee questions, e.g., responding to HR questions
• Website navigation: Guiding customers to relevant portions of complex websites (a website concierge)
• Technical support: Responding to technical problems, such as diagnosing a problem with a product or device
• Personalized service: Leveraging internal and external databases to personalize interactions, such as answering questions about account balances, providing portfolio information, or delivering frequent flier or membership information
• Training or education: Providing problem-solving advice while the user learns
• Simple dialog systems are widely used to decrease human workload in call centres. In this and other industrial telephony applications, the functionality provided by dialog systems is known as interactive voice response or IVR.
In some cases, conversational agents can interact with users using artificial characters. These agents are then referred to as embodied agents.
Toolkits and architectures
• VoiceXML (VXML): a dialog markup language, primarily for telephony, developed initially by AT&T, then administered by an industry consortium, and finally published as a W3C specification. Commercial systems include:
o Quack.com QXML Development Environment [company bought by AOL]
• AIML NLP system
• ChatScript NLP system, by Bruce Wilcox
• SALT: multimodal dialog markup language developed by Microsoft
• CSLU Toolkit: a state-based speech interface prototyping environment
• VoiceBrowse: architecture enabling the dynamic production of dialogue driven by unstructured online (Internet) sources
Based on http://en.wikipedia.org/wiki/Conversational_agent licensed under the Creative Commons Attribution-Share-Alike License 3.0