An Equal-Size Hard EM Algorithm for Multi-Decoder Dialogue Generation
Abstract
Building intelligent open-domain dialogue systems is a long-standing goal of artificial intelligence. These systems, also known as chatbots, aim to hold open-ended conversations with humans. However, it is well known that standard encoder-decoder dialogue systems tend to generate generic responses. A previous study hypothesizes that this phenomenon is due to the one-to-many mapping in the open-domain dialogue task, where the target distribution is multi-modal: a single context admits many plausible responses. As a result, standard cross-entropy training learns an overly smoothed function that averages over the modes instead of capturing any one of them, which is known as the mode averaging problem.
In this work, we address the mode averaging issue with a multi-decoder model, where each decoder can cover a subset of the modes. We treat the choice of decoder as a latent variable and apply EM-like algorithms. However, we observe that traditional Hard-EM and Soft-EM may not perform well due to the collapse issue: the decoders fail to specialize, and the multi-decoder model degenerates to a single-decoder model. To alleviate this issue, we propose EqHard-EM, an EM variant that assigns an equal number of samples to every decoder. Results show that our EqHard-EM algorithm achieves significant improvements over single-decoder models in terms of both response quality and diversity. In addition, extensive analyses show that our EqHard-EM algorithm indeed alleviates the collapse issue: different decoders are specialized and generate diverse responses.
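The difference between an unconstrained hard assignment and an equal-size one can be illustrated with a small sketch. This is not the thesis's implementation: the abstract does not specify how the balanced assignment is computed, so the greedy procedure below is an assumption chosen for brevity, and the names `hard_assign`, `equal_size_assign`, and the toy loss matrix are hypothetical. Each row of `loss` holds one sample's loss under each decoder.

```python
from typing import List


def hard_assign(loss: List[List[float]]) -> List[int]:
    """Unconstrained hard E-step: each sample picks its lowest-loss decoder."""
    return [min(range(len(row)), key=row.__getitem__) for row in loss]


def equal_size_assign(loss: List[List[float]]) -> List[int]:
    """Balanced hard E-step (greedy, illustrative assumption only).

    Every decoder receives exactly N / K samples, where N is the number of
    samples and K the number of decoders.
    """
    n, k = len(loss), len(loss[0])
    if n % k != 0:
        raise ValueError("number of samples must be divisible by number of decoders")
    capacity = [n // k] * k   # remaining slots per decoder
    assignment = [-1] * n     # -1 marks a sample that is not yet assigned
    # Visit (sample, decoder) pairs from lowest to highest loss.
    pairs = sorted((loss[i][j], i, j) for i in range(n) for j in range(k))
    for _, i, j in pairs:
        if assignment[i] == -1 and capacity[j] > 0:
            assignment[i] = j
            capacity[j] -= 1
    return assignment


# Toy batch: decoder 0 has the lower loss on every sample.
toy_loss = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.4], [0.5, 0.6]]
print(hard_assign(toy_loss))        # all samples go to decoder 0 (collapse)
print(equal_size_assign(toy_loss))  # two samples per decoder
```

In the toy batch, the unconstrained assignment sends every sample to decoder 0, so decoder 1 would receive no training signal, whereas the equal-size constraint forces both decoders to be trained. A greedy pass is not guaranteed to minimize the total loss; an exact solver for the balanced assignment problem could be substituted.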
