CHAPTER 1: CHATBOT SYSTEM OVERVIEW
1.7: NLG (Natural Language Generation)
NLG is the chatbot's answer generator: it maps the conversation manager's actions into natural language responses for the user.
NLG can be a multi-stage process, with each step further refining the data used to produce content with natural-sounding language. The six stages of NLG are as follows:
• Content analysis: Data is filtered to decide what should be included in the generated content. This step involves identifying the main topics in the source data and the relationships between them.
• Data understanding: The data is interpreted, patterns are identified, and its context is established. Machine learning is widely used at this stage.
• Document structuring: A document plan is produced based on the type of data being analyzed, and a narrative framework is selected.
• Sentence aggregation: Relevant phrases or sentence fragments are combined so that the topic is covered succinctly and accurately.
• Grammatical structuring: Grammatical rules are applied to produce natural-sounding text. The software infers the syntactic structure of the sentence and then uses this knowledge to rewrite the sentence with correct grammar.
• Language presentation: The final output is produced according to a format or template chosen by the user or developer.
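To make these stages concrete, the minimal sketch below chains the six stages as plain Python functions over a toy restaurant record. Every function and field name in it is an assumption made for this illustration, not part of any particular NLG library.

```python
# A toy, illustrative NLG pipeline: each stage refines the data produced by
# the previous one (all names here are assumptions for this sketch).

def content_analysis(record):
    # Filter the raw record down to the facts worth reporting.
    return {k: v for k, v in record.items() if v is not None}

def data_understanding(facts):
    # Establish simple context, e.g. classify the price.
    facts["price_label"] = "cheap" if facts["price"] < 10 else "expensive"
    return facts

def document_structuring(facts):
    # Decide the order in which facts will be mentioned.
    return [("name", facts["name"]), ("type", "restaurant"),
            ("price", facts["price_label"])]

def sentence_aggregation(plan):
    # Combine the planned fragments into one message.
    return {"subject": plan[0][1], "predicate": f"is a {plan[2][1]} {plan[1][1]}"}

def grammatical_structuring(message):
    # Apply (very simple) grammar: subject + verb phrase + final period.
    return f"{message['subject']} {message['predicate']}."

def language_presentation(sentence):
    # Final formatting chosen by the developer, e.g. capitalisation.
    return sentence[0].upper() + sentence[1:]

record = {"name": "Z House", "price": 8, "rating": None}
for stage in (content_analysis, data_understanding, document_structuring,
              sentence_aggregation, grammatical_structuring, language_presentation):
    record = stage(record)
print(record)  # -> "Z House is a cheap restaurant."
```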
There are four commonly used mapping methods: template-based, plan-based, class-based, and RNN-based.
1.7.1: Template-based
This mapping method uses predefined bot response templates to generate answers.
Semantic Frame              | Natural Language
ConfirmQ                    | "Please tell me more about the product you are looking for."
Confirm(area=$V)            | "Do you want somewhere in the $V?"
Confirm(food=$V)            | "Do you want a $V restaurant?"
Confirm(food=$V, area=$W)   | "Do you want a $V restaurant in the $W?"
Table 2.0: Template-based example (Yun-Nung (Vivian) Chen et al., 2018)
• Pros: Simple and easy to control; suitable for closed-domain problems.
• Cons: Defining rules is time-consuming and the generated answers can sound unnatural. For large systems, the rules become difficult to control, which makes the system hard to develop and maintain.
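As a rough illustration of the table above, the sketch below maps semantic frames to predefined templates and fills the $-slots. The frame encoding (an act name plus a slot dictionary) and the helper names are assumptions for this sketch, not the cited paper's implementation.

```python
# Template-based NLG: predefined response templates keyed by semantic frame,
# with $-placeholders filled from the dialogue manager's slots.
from string import Template

TEMPLATES = {
    "ConfirmQ": "Please tell me more about the product you are looking for.",
    "Confirm(area)": "Do you want somewhere in the $area?",
    "Confirm(food)": "Do you want a $food restaurant?",
    "Confirm(area,food)": "Do you want a $food restaurant in the $area?",
}

def generate(act, slots=None):
    slots = slots or {}
    # Build the template key from the act and the sorted slot names.
    key = act if not slots else f"{act}({','.join(sorted(slots))})"
    return Template(TEMPLATES[key]).substitute(slots)

print(generate("ConfirmQ"))
print(generate("Confirm", {"food": "Chinese", "area": "centre"}))
# -> "Do you want a Chinese restaurant in the centre?"
```

Because every response needs a matching template, each new act or slot combination requires another rule, which is exactly the maintenance cost noted in the cons above.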
1.7.2: Plan-based
[Figure: plan-based generation pipeline — a Sentence Plan Generator, Sentence Plan Reranker, and Surface Realizer map the frame Inform(name=Z_House, price=cheap) through a syntactic tree to the sentence "Z House is a cheap restaurant".]
Figure 1.9: Plan-based example (Yun-Nung (Vivian) Chen et al., 2018)
• Pros: Can model complex language structures.
• Cons: Heavy design effort; requires a well-defined knowledge domain.
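As a rough, assumed illustration of the pipeline in Figure 1.9 (not the cited authors' system), the sketch below separates a sentence planner, which turns an Inform frame into a small syntactic plan, from a surface realizer, which linearizes that plan into text.

```python
# Plan-based NLG sketch: a sentence planner builds a tiny syntactic plan
# (subject / verb / object with modifiers), then a surface realizer
# linearizes it. The structures and rules here are illustrative assumptions.

def sentence_planner(frame):
    # frame example: {"act": "Inform", "name": "Z House", "price": "cheap"}
    return {
        "subject": frame["name"],
        "verb": "is",
        "object": {"det": "a", "modifiers": [frame["price"]], "head": "restaurant"},
    }

def surface_realizer(plan):
    noun_phrase = " ".join([plan["object"]["det"], *plan["object"]["modifiers"],
                            plan["object"]["head"]])
    return f'{plan["subject"]} {plan["verb"]} {noun_phrase}.'

frame = {"act": "Inform", "name": "Z House", "price": "cheap"}
print(surface_realizer(sentence_planner(frame)))  # -> "Z House is a cheap restaurant."
```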
1.7.3: Class-based
[Figure: class-based generation — the Dialogue Manager sends an input frame (act=query, content=depart_time, depart_date=20001001) to a generation module built on language models trained from tagged corpora; candidate utterances such as "what time on {depart_date}?" are scored, the best utterance is selected, and slot filling produces the complete utterance "what time on Sun, Oct 1st?".]
Figure 1.10: Class-based (Yun-Nung (Vivian) Chen et al., 2018)
With this technique, the bot learns from labeled input-response pairs. Given the action and slot information from the conversation manager, it returns the most suitable response based on the previously trained response dataset.
• Pros: Easy to implement.
• Cons: Depends on response data that was previously labeled for training. In addition, inefficient score calculation can lead to wrong answers being generated.
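The sketch below is a simplified, assumed rendering of the class-based flow in Figure 1.10: slot-tagged candidate utterances from a tiny (assumed) corpus are scored against the input frame — here simply by counting matching slots — the best candidate is selected, and slot filling produces the final utterance.

```python
# Class-based NLG sketch: score slot-tagged candidate utterances against the
# input frame, pick the best, then fill the slots with the frame's values.
import re

CANDIDATES = [
    "what time on {depart_date}?",
    "where would you like to depart from?",
    "what time on {depart_date} from {depart_city}?",
]

def score(candidate, frame):
    slots = set(re.findall(r"\{(\w+)\}", candidate))
    # Reward slots covered by the frame, penalize slots the frame cannot fill.
    return len(slots & frame.keys()) - 2 * len(slots - frame.keys())

def generate(frame):
    best = max(CANDIDATES, key=lambda c: score(c, frame))
    return best.format(**frame)  # slot filling; unused frame keys are ignored

frame = {"act": "query", "content": "depart_time", "depart_date": "Sun, Oct 1st"}
print(generate(frame))  # -> "what time on Sun, Oct 1st?"
```

A weak scoring function can rank an unsuitable candidate first, which is precisely the failure mode mentioned in the cons above.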
1.8: NLG models
• Markov chain: The Markov chain was one of the earliest techniques for language generation. The next word in a phrase is predicted from the sequence of n words already available, and the probability of each candidate word is computed from how all the previous words relate to this sequence. Markov sequences are already familiar from the auto-suggest feature of smartphone keyboards, where they are used to generate word suggestions for sentences.
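The minimal sketch below shows this idea with a bigram model (one word of context); the toy corpus is an assumption, and a real keyboard model would be trained on far more text.

```python
# A minimal bigram Markov chain language model: count word-to-word
# transitions in a tiny corpus, then predict or sample the next word,
# as in keyboard auto-suggest. The corpus is an assumption for this sketch.
import random
from collections import defaultdict, Counter

corpus = "i want a cheap restaurant . i want a chinese restaurant .".split()

transitions = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev][nxt] += 1

def suggest(word):
    # Most probable continuation given the previous word.
    return transitions[word].most_common(1)[0][0]

def generate(start, length=5):
    words = [start]
    for _ in range(length):
        counter = transitions[words[-1]]
        if not counter:
            break
        # Sample the next word proportionally to the transition counts.
        words.append(random.choices(list(counter), weights=list(counter.values()))[0])
    return " ".join(words)

print(suggest("want"))   # -> "a"
print(generate("i"))     # e.g. "i want a cheap restaurant ."
```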
• Recurrent Neural Network (RNN): Neural networks are models that aim to mimic how the human brain works. An RNN passes each item of a sequence through the network, uses the model's output as input for the next item, and stores information from the previous step. At each iteration the model estimates the probability of the next word while keeping the words it has already seen in its memory: it computes a probability for every word in the vocabulary given the preceding words, picks the most likely one, and memorizes it. This paradigm suits language generation because the RNN's "memory" retains the context of the conversation. However, as the sequence grows, the RNN can no longer store words encountered far back in the sentence and makes predictions based mainly on the most recent words. Because of this limitation, RNNs struggle to produce long, coherent sentences.
• LSTM (Long Short-Term Memory): To address the long-range dependency problem, a variant of the RNN called Long Short-Term Memory (LSTM) was introduced. LSTM is a recurrent artificial neural network used in the field of deep learning. Unlike standard feedforward neural networks, LSTMs contain feedback connections, so the network processes not only single data points but entire data sequences.
Although similar to RNNs, LSTM models consist of a four-layer neural network. The LSTM has four parts: the cell, the input gate, the output gate, and the forget gate. These allow the network to remember or forget words at any time by regulating the flow of information through the cell. When a period is encountered, the forget gate recognizes that the context of the sentence may change and can discard the current cell state. This lets the network selectively track only the relevant information while mitigating the vanishing gradient problem, allowing the model to remember information for a longer time.
However, the memory capacity of LSTMs is limited to a few hundred words because of the inherently complex sequential path from past cells to the current cell. The same complexity leads to high computational requirements that make LSTMs difficult to train and to parallelize.
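The sketch below shows how the recurrence and memory described above look in code, using PyTorch's nn.LSTM as an assumed framework; it is an untrained model definition rather than a working language generator, and the vocabulary size and dimensions are arbitrary.

```python
# Minimal LSTM next-word model: an embedding layer, an LSTM whose hidden and
# cell states carry the "memory" of previous words, and a linear layer that
# scores every word in the vocabulary as the next word.
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids, state=None):
        x = self.embed(token_ids)              # (batch, seq_len, embed_dim)
        output, state = self.lstm(x, state)    # state = the network's memory
        logits = self.out(output)              # (batch, seq_len, vocab_size)
        return logits, state

model = LSTMLanguageModel(vocab_size=1000)
tokens = torch.randint(0, 1000, (1, 6))        # one sequence of 6 token ids
logits, state = model(tokens)
next_word_id = logits[0, -1].argmax().item()   # most probable next word id
print(next_word_id)
```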
• Transformer: A relatively new model first introduced in a 2017 Google paper, which proposed a new mechanism known as "self-attention". The Transformer consists of a block of multiple encoders to process input of any length and a decoder to produce output sentences.
In contrast to the LSTM, the Transformer takes only a small, constant number of steps and applies a self-attention mechanism that directly models the relationships between all the words in a sentence. Unlike previous models, the Transformer uses the representations of all words in context without having to compress all the information into a single fixed-length representation, allowing the system to handle longer sentences without the computation skyrocketing.
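The sketch below implements scaled dot-product self-attention, the core operation described above, in PyTorch. The random inputs and single attention head are simplifying assumptions; the full Transformer adds multiple heads, positional encodings, and feed-forward layers.

```python
# Scaled dot-product self-attention: every word representation attends to
# every other word in the sentence in a constant number of steps, with no
# recurrence. Shapes and random values are assumptions for this sketch.
import math
import torch

def self_attention(x, w_q, w_k, w_v):
    # x: (seq_len, d_model) word representations for one sentence.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / math.sqrt(k.shape[-1])   # word-to-word relevance
    weights = torch.softmax(scores, dim=-1)     # attention over all words
    return weights @ v                          # context-aware representations

seq_len, d_model = 5, 16
x = torch.randn(seq_len, d_model)
w_q, w_k, w_v = (torch.randn(d_model, d_model) for _ in range(3))
context = self_attention(x, w_q, w_k, w_v)
print(context.shape)   # torch.Size([5, 16]) — one updated vector per word
```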