
[arXiv Papers] Computation and Language 2020-09-29

Contents

1. PIN: A Novel Parallel Interactive Network for Spoken Language Understanding [PDF] Abstract
2. Injecting Entity Types into Entity-Guided Text Generation [PDF] Abstract
3. Aspects of Terminological and Named Entity Knowledge within Rule-Based Machine Translation Models for Under-Resourced Neural Machine Translation Scenarios [PDF] Abstract
4. Transformers Are Better Than Humans at Identifying Generated Text [PDF] Abstract
5. Similarity Detection Pipeline for Crawling a Topic Related Fake News Corpus [PDF] Abstract
6. Reducing Quantity Hallucinations in Abstractive Summarization [PDF] Abstract
7. Learning to Match Jobs with Resumes from Sparse Interaction Data using Multi-View Co-Teaching Network [PDF] Abstract
8. A Diagnostic Study of Explainability Techniques for Text Classification [PDF] Abstract
9. Pchatbot: A Large-Scale Dataset for Personalized Chatbot [PDF] Abstract
10. Graph-based Multi-hop Reasoning for Long Text Generation [PDF] Abstract
11. Zero-shot Multi-Domain Dialog State Tracking Using Descriptive Rules [PDF] Abstract
12. Augmented Natural Language for Generative Sequence Labeling [PDF] Abstract
13. Dissecting Lottery Ticket Transformers: Structural and Behavioral Study of Sparse Neural Machine Translation [PDF] Abstract
14. Energy-Based Reranking: Improving Neural Machine Translation Using Energy-Based Models [PDF] Abstract
15. Knowledge-Aware Procedural Text Understanding with Multi-Stage Training [PDF] Abstract
16. Incomplete Utterance Rewriting as Semantic Segmentation [PDF] Abstract
17. Generative latent neural models for automatic word alignment [PDF] Abstract
18. Neural Baselines for Word Alignment [PDF] Abstract
19. Deep Transformers with Latent Depth [PDF] Abstract
20. What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams [PDF] Abstract
21. Reactive Supervision: A New Method for Collecting Sarcasm Data [PDF] Abstract
22. A Simple and Efficient Ensemble Classifier Combining Multiple Neural Network Models on Social Media Datasets in Vietnamese [PDF] Abstract
23. Mitigating Gender Bias for Neural Dialogue Generation with Adversarial Learning [PDF] Abstract
24. SPARTA: Efficient Open-Domain Question Answering via Sparse Transformer Matching Retrieval [PDF] Abstract
25. Fancy Man Lauches Zippo at WNUT 2020 Shared Task-1: A Bert Case Model for Wet Lab Entity Extraction [PDF] Abstract
26. Unsupervised Pre-training for Biomedical Question Answering [PDF] Abstract
27. What does it mean to be language-agnostic? Probing multilingual sentence encoders for typological properties [PDF] Abstract
28. TernaryBERT: Distillation-aware Ultra-low Bit BERT [PDF] Abstract
29. Hierarchical Deep Multi-modal Network for Medical Visual Question Answering [PDF] Abstract
30. Inductively Representing Out-of-Knowledge-Graph Entities by Optimal Estimation Under Translational Assumptions [PDF] Abstract
31. Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval [PDF] Abstract
32. Modeling Topical Relevance for Multi-Turn Dialogue Generation [PDF] Abstract
33. Multi-timescale representation learning in LSTM Language Models [PDF] Abstract
34. A Brief Survey and Comparative Study of Recent Development of Pronoun Coreference Resolution [PDF] Abstract
35. Stylized Dialogue Response Generation Using Stylized Unpaired Texts [PDF] Abstract
36. Local and non-local dependency learning and emergence of rule-like representations in speech data by Deep Convolutional Generative Adversarial Networks [PDF] Abstract
37. Neural Proof Nets [PDF] Abstract
38. Techniques to Improve Q&A Accuracy with Transformer-based models on Large Complex Documents [PDF] Abstract
39. Clustering-based Unsupervised Generative Relation Extraction [PDF] Abstract
40. KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning [PDF] Abstract
41. Recurrent Inference in Text Editing [PDF] Abstract
42. DWIE: an entity-centric dataset for multi-task document-level information extraction [PDF] Abstract
43. Automatic Arabic Dialect Identification Systems for Written Texts: A Survey [PDF] Abstract
44. ARPA: Armenian Paraphrase Detection Corpus and Models [PDF] Abstract
45. Metaphor Detection using Deep Contextualized Word Embeddings [PDF] Abstract
46. Topic-Aware Multi-turn Dialogue Modeling [PDF] Abstract
47. iNLTK: Natural Language Toolkit for Indic Languages [PDF] Abstract
48. QuatRE: Relation-Aware Quaternions for Knowledge Graph Embeddings [PDF] Abstract
49. Learning to Plan and Realize Separately for Open-Ended Dialogue Systems [PDF] Abstract
50. Modeling Dyadic Conversations for Personality Inference [PDF] Abstract
51. BET: A Backtranslation Approach for Easy Data Augmentation in Transformer-based Paraphrase Identification Context [PDF] Abstract
52. XTE: Explainable Text Entailment [PDF] Abstract
53. Hierarchical Sparse Variational Autoencoder for Text Encoding [PDF] Abstract
54. Visually Grounded Compound PCFGs [PDF] Abstract
55. RecoBERT: A Catalog Language Model for Text-Based Recommendations [PDF] Abstract
56. BiteNet: Bidirectional Temporal Encoder Network to Predict Medical Outcomes [PDF] Abstract
57. Visual Exploration and Knowledge Discovery from Biomedical Dark Data [PDF] Abstract
58. Reinforcement Learning-based N-ary Cross-Sentence Relation Extraction [PDF] Abstract

Abstracts

1. PIN: A Novel Parallel Interactive Network for Spoken Language Understanding [PDF] Back to Contents
  Peilin Zhou, Zhiqi Huang, Fenglin Liu, Yuexian Zou
Abstract: Spoken Language Understanding (SLU) is an essential part of a spoken dialogue system and typically consists of intent detection (ID) and slot filling (SF) tasks. Recently, recurrent neural network (RNN) based methods have achieved state-of-the-art performance for SLU. In the existing RNN-based approaches, ID and SF tasks are often jointly modeled to utilize the correlation information between them. However, so far, efforts to obtain better performance by supporting bidirectional and explicit information exchange between ID and SF have not been well explored. In addition, few studies attempt to capture local context information to enhance the performance of SF. Motivated by these findings, in this paper, a Parallel Interactive Network (PIN) is proposed to model the mutual guidance between ID and SF. Specifically, given an utterance, a Gaussian self-attentive encoder is introduced to generate a context-aware feature embedding of the utterance that is able to capture local context information. Taking the feature embedding of the utterance, a Slot2Intent module and an Intent2Slot module are developed to capture the bidirectional information flow for the ID and SF tasks. Finally, a cooperation mechanism is constructed to fuse the information obtained from the Slot2Intent and Intent2Slot modules to further reduce the prediction bias. Experiments on two benchmark datasets, i.e., SNIPS and ATIS, demonstrate the effectiveness of our approach, which achieves competitive results against state-of-the-art models. More encouragingly, by using the feature embedding of the utterance generated by the pre-trained language model BERT, our method achieves state-of-the-art performance among all compared approaches.
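
To make the locality idea concrete, below is a minimal numpy sketch of self-attention with a Gaussian positional prior, assuming the prior enters as a squared-distance penalty on the attention logits; the paper's exact parameterization may differ, and sigma and the function name are illustrative.

import numpy as np

def gaussian_self_attention(X, sigma=2.0):
    # X: (n, d) token embeddings for one utterance.
    n, d = X.shape
    logits = X @ X.T / np.sqrt(d)                 # standard dot-product scores
    pos = np.arange(n)
    dist2 = (pos[:, None] - pos[None, :]) ** 2    # squared distance between positions
    logits = logits - dist2 / (2.0 * sigma ** 2)  # Gaussian bias toward local context
    w = np.exp(logits - logits.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)            # row-wise softmax
    return w @ X                                  # context-aware embeddings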

2. Injecting Entity Types into Entity-Guided Text Generation [PDF] Back to Contents
  Xiangyu Dong, Wenhao Yu, Chenguang Zhu, Meng Jiang
Abstract: Recent successes in deep generative modeling have led to significant advances in natural language generation (NLG). Incorporating entities into neural generation models has demonstrated great improvements by assisting to infer the summary topic and to generate coherent content. In order to enhance the role of entities in NLG, in this paper, we aim to model the entity type in the decoding phase to generate contextual words accurately. We develop a novel NLG model to produce a target sequence (i.e., a news article) based on a given list of entities. The generation quality depends significantly on whether the input entities are logically connected and expressed in the output. Our model has a multi-step decoder that injects the entity types into the process of entity mention generation. It first predicts whether the next token is a contextual word or an entity; if an entity, it then predicts the entity mention. It effectively embeds the entity's meaning into hidden states, making the generated words precise. Experiments on two public datasets demonstrate that type injection performs better than type-embedding concatenation baselines.

3. Aspects of Terminological and Named Entity Knowledge within Rule-Based Machine Translation Models for Under-Resourced Neural Machine Translation Scenarios [PDF] Back to Contents
  Daniel Torregrosa, Nivranshu Pasricha, Maraim Masoud, Bharathi Raja Chakravarthi, Juan Alonso, Noe Casas, Mihael Arcan
Abstract: Rule-based machine translation is a machine translation paradigm where linguistic knowledge is encoded by an expert in the form of rules that translate text from source to target language. While this approach grants extensive control over the output of the system, the cost of formalising the needed linguistic knowledge is much higher than training a corpus-based system, where a machine learning approach is used to automatically learn to translate from examples. In this paper, we describe different approaches to leverage the information contained in rule-based machine translation systems to improve a corpus-based one, namely, a neural machine translation model, with a focus on a low-resource scenario. Three different kinds of information were used: morphological information, named entities and terminology. In addition to evaluating the general performance of the system, we systematically analysed the performance of the proposed approaches when dealing with the targeted phenomena. Our results suggest that the proposed models have limited ability to learn from external information, and most approaches do not significantly alter the results of the automatic evaluation, but our preliminary qualitative evaluation shows that in certain cases the hypotheses generated by our system exhibit favourable behaviour such as keeping the use of passive voice.

4. Transformers Are Better Than Humans at Identifying Generated Text [PDF] Back to Contents
  Antonis Maronikolakis, Mark Stevenson, Hinrich Schutze
Abstract: Fake information spread via the internet and social media influences public opinion and user activity. Generative models enable fake content to be generated faster and more cheaply than had previously been possible. This paper examines the problem of identifying fake content generated by lightweight deep learning models. A dataset containing human and machine-generated headlines was created and a user study indicated that humans were only able to identify the fake headlines in 45.3% of the cases. However, the most accurate automatic approach, transformers, achieved an accuracy of 94%, indicating that content generated from language models can be filtered out accurately.

5. Similarity Detection Pipeline for Crawling a Topic Related Fake News Corpus [PDF] Back to Contents
  Inna Vogel, Jeong-Eun Choi, Meghana Meghana
Abstract: Fake news detection is a challenging task aiming to reduce human time and effort to check the truthfulness of news. Automated approaches to combat fake news, however, are limited by the lack of labeled benchmark datasets, especially in languages other than English. Moreover, many publicly available corpora have specific limitations that make them difficult to use. To address this problem, our contribution is threefold. First, we propose a new, publicly available German topic-related corpus for fake news detection. To the best of our knowledge, this is the first corpus of its kind. Second, we developed a pipeline for crawling similar news articles. As our third contribution, we conduct different learning experiments to detect fake news. The best performance was achieved using sentence-level embeddings from SBERT in combination with a Bi-LSTM (k=0.88).

6. Reducing Quantity Hallucinations in Abstractive Summarization [PDF] Back to Contents
  Zheng Zhao, Shay B. Cohen, Bonnie Webber
Abstract: It is well-known that abstractive summaries are subject to hallucination, including material that is not supported by the original text. While summaries can be made hallucination-free by limiting them to general phrases, such summaries would fail to be very informative. Alternatively, one can try to avoid hallucinations by verifying that any specific entities in the summary appear in the original text in a similar context. This is the approach taken by our system, Herman. The system learns to recognize and verify quantity entities (dates, numbers, sums of money, etc.) in a beam-worth of abstractive summaries produced by state-of-the-art models, in order to up-rank those summaries whose quantity terms are supported by the original text. Experimental results demonstrate that the ROUGE scores of such up-ranked summaries have a higher Precision than summaries that have not been up-ranked, without a comparable loss in Recall, resulting in higher F1. Preliminary human evaluation of up-ranked vs. original summaries shows people's preference for the former.
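
As a rough illustration of the up-ranking step (not Herman's learned verifier), one can extract quantity strings with a crude regular expression and move beam candidates whose quantities are all supported by the source ahead of the rest; the regex and the stable reordering below are assumptions.

import re

NUM = re.compile(r"\b\d[\d,.]*%?")  # crude matcher for dates, counts, percentages

def quantities(text):
    return set(NUM.findall(text))

def up_rank(source, beam):
    # beam: candidate summaries ordered by model score; supported candidates
    # are promoted while the relative order within each group is preserved.
    src = quantities(source)
    supported = [s for s in beam if quantities(s) <= src]
    unsupported = [s for s in beam if not quantities(s) <= src]
    return supported + unsupported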

7. Learning to Match Jobs with Resumes from Sparse Interaction Data using Multi-View Co-Teaching Network [PDF] Back to Contents
  Shuqing Bian, Xu Chen, Wayne Xin Zhao, Kun Zhou, Yupeng Hou, Yang Song, Tao Zhang, Ji-Rong Wen
Abstract: With the ever-increasing growth of online recruitment data, job-resume matching has become an important task to automatically match jobs with suitable resumes. This task is typically cast as a supervised text matching problem. Supervised learning is powerful when the labeled data is sufficient. However, on online recruitment platforms, job-resume interaction data is sparse and noisy, which affects the performance of job-resume matching algorithms. To alleviate these problems, in this paper, we propose a novel multi-view co-teaching network that learns from sparse interaction data for job-resume matching. Our network consists of two major components, namely a text-based matching model and a relation-based matching model. The two parts capture semantic compatibility from two different views, and complement each other. In order to address the challenges of sparse and noisy data, we design two specific strategies to combine the two components. First, the two components share the learned parameters or representations, so that the original representations of each component can be enhanced. More importantly, we adopt a co-teaching mechanism to reduce the influence of noise in training data. The core idea is to let the two components help each other by selecting more reliable training instances. The two strategies focus on representation enhancement and data enhancement, respectively. Compared with pure text-based matching models, the proposed approach is able to learn better data representations from limited or even sparse interaction data, which makes it more robust to noise in training data. Experiment results have demonstrated that our model is able to outperform state-of-the-art methods for job-resume matching.

8. A Diagnostic Study of Explainability Techniques for Text Classification [PDF] Back to Contents
  Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein
Abstract: Recent developments in machine learning have introduced models that approach human performance at the cost of increased architectural complexity. Efforts to make the rationales behind the models' predictions transparent have inspired an abundance of new explainability techniques. Provided with an already trained model, they compute saliency scores for the words of an input instance. However, there exists no definitive guide on (i) how to choose such a technique given a particular application task and model architecture, and (ii) the benefits and drawbacks of using each such technique. In this paper, we develop a comprehensive list of diagnostic properties for evaluating existing explainability techniques. We then employ the proposed list to compare a set of diverse explainability techniques on downstream text classification tasks and neural network architectures. We also compare the saliency scores assigned by the explainability techniques with human annotations of salient input regions to find relations between a model's performance and the agreement of its rationales with human ones. Overall, we find that the gradient-based explanations perform best across tasks and model architectures, and we present further insights into the properties of the reviewed explainability techniques.
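
For concreteness, one of the simplest techniques in this family is leave-one-out occlusion: delete a token and measure the drop in the predicted class probability. A minimal sketch, with predict_proba standing in for any trained classifier (a hypothetical interface, not one from the paper):

def occlusion_saliency(predict_proba, tokens, label):
    # predict_proba(tokens, label) -> probability of `label` for the token list.
    base = predict_proba(tokens, label)
    scores = []
    for i in range(len(tokens)):
        reduced = tokens[:i] + tokens[i + 1:]                # occlude token i
        scores.append(base - predict_proba(reduced, label))  # drop = importance
    return scores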

9. Pchatbot: A Large-Scale Dataset for Personalized Chatbot [PDF] Back to Contents
  Xiaohe Li, Hanxun Zhong, Yu Guo, Yueyuan Ma, Hongjin Qian, Zhanliang Liu, Zhicheng Dou, Ji-Rong Wen
Abstract: Natural language dialogue systems have raised great attention recently. As many dialogue models are data-driven, high-quality datasets are essential to these systems. In this paper, we introduce Pchatbot, a large-scale dialogue dataset which contains two subsets collected from Weibo and judicial forums, respectively. Different from existing datasets which only contain post-response pairs, we include anonymized user IDs as well as timestamps. This enables the development of personalized dialogue models which depend on the availability of users' historical conversations. Furthermore, the scale of Pchatbot is significantly larger than that of existing datasets, which might benefit data-driven models. Our preliminary experimental study shows that a personalized chatbot model trained on Pchatbot outperforms the corresponding ad-hoc chatbot models. We also demonstrate that using a larger dataset improves the quality of dialog models.

10. Graph-based Multi-hop Reasoning for Long Text Generation [PDF] Back to Contents
  Liang Zhao, Jingjing Xu, Junyang Lin, Yichang Zhang, Hongxia Yang, Xu Sun
Abstract: Long text generation is an important but challenging task. The main problem lies in learning sentence-level semantic dependencies, which traditional generative models often suffer from. To address this problem, we propose a Multi-hop Reasoning Generation (MRG) approach that incorporates multi-hop reasoning over a knowledge graph to learn semantic dependencies among sentences. MRG consists of two parts, a graph-based multi-hop reasoning module and a path-aware sentence realization module. The reasoning module is responsible for searching skeleton paths from a knowledge graph to imitate the imagination process in human writing for semantic transfer. Based on the inferred paths, the sentence realization module then generates a complete sentence. Unlike previous black-box models, MRG explicitly infers the skeleton path, which provides explanatory views to understand how the proposed model works. We conduct experiments on three representative tasks, including story generation, review generation, and product description generation. Automatic and manual evaluation show that our proposed method can generate more informative and coherent long text than strong baselines, such as pre-trained models (e.g. GPT-2) and knowledge-enhanced models.

11. Zero-shot Multi-Domain Dialog State Tracking Using Descriptive Rules [PDF] Back to Contents
  Edgar Altszyler, Pablo Brusco, Nikoletta Basiou, John Byrnes, Dimitra Vergyri
Abstract: In this work, we present a framework for incorporating descriptive logical rules in state-of-the-art neural networks, enabling them to learn how to handle unseen labels without the introduction of any new training data. The rules are integrated into existing networks without modifying their architecture, through an additional term in the network's loss function that penalizes states of the network that do not obey the designed rules. As a case of study, the framework is applied to an existing neural-based Dialog State Tracker. Our experiments demonstrate that the inclusion of logical rules allows the prediction of unseen labels, without deteriorating the predictive capacity of the original system.

12. Augmented Natural Language for Generative Sequence Labeling [PDF] Back to Contents
  Ben Athiwaratkun, Cicero Nogueira dos Santos, Jason Krone, Bing Xiang
Abstract: We propose a generative framework for joint sequence labeling and sentence-level classification. Our model performs multiple sequence labeling tasks at once using a single, shared natural language output space. Unlike prior discriminative methods, our model naturally incorporates label semantics and shares knowledge across tasks. Our framework is general purpose, performing well on few-shot, low-resource, and high-resource tasks. We demonstrate these advantages on popular named entity recognition, slot labeling, and intent classification benchmarks. We set a new state-of-the-art for few-shot slot labeling, improving substantially upon the previous 5-shot (75.0% → 90.9%) and 1-shot (70.4% → 81.0%) state-of-the-art results. Furthermore, our model generates large improvements (46.27% → 63.83%) in low-resource slot labeling over a BERT baseline by incorporating label semantics. We also maintain competitive results on high-resource tasks, performing within two points of the state-of-the-art on all tasks and setting a new state-of-the-art on the SNIPS dataset.
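
To illustrate what a shared natural-language output space might look like, the sketch below renders BIO-tagged slot spans into a single target string; the bracketing format is an assumption for illustration, not necessarily the paper's.

def to_augmented_target(tokens, tags):
    # ["book", "a", "table", "for", "two"], ["O", "O", "O", "O", "B-count"]
    #   -> "book a table for [ two | count ]"
    out, i = [], 0
    while i < len(tokens):
        if tags[i].startswith("B-"):
            label, span = tags[i][2:], [tokens[i]]
            i += 1
            while i < len(tokens) and tags[i] == "I-" + label:
                span.append(tokens[i])
                i += 1
            out.append("[ " + " ".join(span) + " | " + label + " ]")
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)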

13. Dissecting Lottery Ticket Transformers: Structural and Behavioral Study of Sparse Neural Machine Translation [PDF] Back to Contents
  Rajiv Movva, Jason Y. Zhao
Abstract: Recent work on the lottery ticket hypothesis has produced highly sparse Transformers for NMT while maintaining BLEU. However, it is unclear how such pruning techniques affect a model's learned representations. By probing sparse Transformers, we find that complex semantic information is first to be degraded. Analysis of internal activations reveals that higher layers diverge most over the course of pruning, gradually becoming less complex than their dense counterparts. Meanwhile, early layers of sparse models begin to perform more encoding. Attention mechanisms remain remarkably consistent as sparsity increases.
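
The pruning step behind lottery-ticket experiments is usually global magnitude pruning; a one-round numpy sketch is given below (iterative pruning and weight rewinding are omitted).

import numpy as np

def magnitude_prune(weights, sparsity=0.8):
    # weights: list of numpy arrays; zero out the smallest-magnitude
    # entries globally and return pruned copies plus binary masks.
    flat = np.concatenate([np.abs(w).ravel() for w in weights])
    threshold = np.quantile(flat, sparsity)  # keep the top (1 - sparsity)
    masks = [(np.abs(w) > threshold).astype(w.dtype) for w in weights]
    return [w * m for w, m in zip(weights, masks)], masks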

14. Energy-Based Reranking: Improving Neural Machine Translation Using Energy-Based Models [PDF] Back to Contents
  Subhajit Naskar, Amirmohammad Rooshenas, Simeng Sun, Mohit Iyyer, Andrew McCallum
Abstract: The discrepancy between maximum likelihood estimation (MLE) and task measures such as BLEU score has been studied before for autoregressive neural machine translation (NMT) and resulted in alternative training algorithms (Ranzato et al., 2016; Norouzi et al., 2016; Shen et al., 2016; Wu et al., 2018). However, MLE training remains the de facto approach for autoregressive NMT because of its computational efficiency and stability. Despite this mismatch between the training objective and task measure, we notice that the samples drawn from an MLE-based trained NMT support the desired distribution -- there are samples with much higher BLEU score comparing to the beam decoding output. To benefit from this observation, we train an energy-based model to mimic the behavior of the task measure (i.e., the energy-based model assigns lower energy to samples with higher BLEU score), which is resulted in a re-ranking algorithm based on the samples drawn from NMT: energy-based re-ranking (EBR). Our EBR consistently improves the performance of the Transformer-based NMT: +3 BLEU points on Sinhala-English and +2.0 BLEU points on IWSLT'17 French-English tasks.
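
The re-ranking step itself is simple once the energy model exists; a sketch with the learned scorer stubbed out follows (in the paper, the energy model is trained so that lower energy tracks higher BLEU, which the toy function below does not attempt to reproduce).

def energy_rerank(candidates, energy):
    # energy(hypothesis) -> float; lower is better, so sort ascending.
    return sorted(candidates, key=energy)

# usage with a toy stand-in energy function (length preference is arbitrary)
hyps = ["the cat sat on the mat", "cat mat sit", "the cat sat"]
print(energy_rerank(hyps, energy=lambda h: -len(h.split())))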

15. Knowledge-Aware Procedural Text Understanding with Multi-Stage Training [PDF] Back to Contents
  Zhihan Zhang, Xiubo Geng, Tao Qin, Yunfang Wu, Daxin Jiang
Abstract: We focus on the task of procedural text understanding, which aims to track entities' states and locations during a natural process. Although recent approaches have achieved substantial progress, they are far behind human performance. Two challenges, difficulty of commonsense reasoning and data insufficiency, still remain unsolved. In this paper, we propose a novel KnOwledge-Aware proceduraL text understAnding (KOALA) model, which leverages external knowledge sources to solve these issues. Specifically, we retrieve informative knowledge triples from ConceptNet and perform knowledge-aware reasoning while tracking the entities. Besides, we employ a multi-stage training schema which fine-tunes the BERT model over unlabeled data collected from Wikipedia before further fine-tuning it on the final model. Experimental results on two procedural text datasets, ProPara and Recipes, verify the effectiveness of the proposed methods, in which our model achieves state-of-the-art performance in comparison to various baselines.

16. Incomplete Utterance Rewriting as Semantic Segmentation [PDF] Back to Contents
  Qian Liu, Bei Chen, Jian-Guang Lou, Bin Zhou, Dongmei Zhang
Abstract: In recent years, the task of incomplete utterance rewriting has attracted considerable attention. Previous works usually cast it as a machine translation task and employ sequence-to-sequence architectures with a copy mechanism. In this paper, we present a novel and extensive approach, which formulates it as a semantic segmentation task. Instead of generating from scratch, such a formulation introduces edit operations and shapes the problem as prediction of a word-level edit matrix. Benefiting from being able to capture both local and global information, our approach achieves state-of-the-art performance on several public datasets. Furthermore, our approach is four times faster than the standard approach in inference.
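
A toy application of a word-level edit matrix may help picture the formulation; the operation set {none, insert, replace} below is a simplification of the paper's scheme.

def apply_edit_matrix(context, utterance, M):
    # M[i][j]: context word i is inserted before, or substituted for,
    # utterance word j ("none" leaves word j unchanged).
    out = []
    for j, word in enumerate(utterance):
        out.extend(context[i] for i in range(len(context)) if M[i][j] == "insert")
        replaced = [context[i] for i in range(len(context)) if M[i][j] == "replace"]
        out.extend(replaced if replaced else [word])
    return out

ctx = ["Beijing"]
utt = ["why", "is", "it", "so", "cold"]
M = [["none", "none", "replace", "none", "none"]]  # "it" -> "Beijing"
print(apply_edit_matrix(ctx, utt, M))  # ['why', 'is', 'Beijing', 'so', 'cold']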

17. Generative latent neural models for automatic word alignment [PDF] Back to Contents
  Anh Khoa Ngo Ho, François Yvon
Abstract: Word alignments identify translational correspondences between words in a parallel sentence pair and are used, for instance, to learn bilingual dictionaries, to train statistical machine translation systems or to perform quality estimation. Variational autoencoders have recently been used in various areas of natural language processing to learn, in an unsupervised way, latent representations that are useful for language generation tasks. In this paper, we study these models for the task of word alignment and propose and assess several evolutions of a vanilla variational autoencoder. We demonstrate that these techniques can yield competitive results as compared to Giza++ and to a strong neural network alignment system for two language pairs.

18. Neural Baselines for Word Alignment [PDF] Back to Contents
  Anh Khoa Ngo Ho, François Yvon
Abstract: Word alignments identify translational correspondences between words in a parallel sentence pair and are used, for instance, to learn bilingual dictionaries, to train statistical machine translation systems, or to perform quality estimation. In most areas of natural language processing, neural network models nowadays constitute the preferred approach, a situation that might also apply to word alignment models. In this work, we study and comprehensively evaluate neural models for unsupervised word alignment for four language pairs, contrasting several variants of neural models. We show that in most settings, neural versions of the IBM-1 and hidden Markov models vastly outperform their discrete counterparts. We also analyze typical alignment errors of the baselines that our models overcome, to illustrate the benefits, and the limitations, of these new models for morphologically rich languages.

19. Deep Transformers with Latent Depth [PDF] Back to Contents
  Xian Li, Asa Cooper Stickland, Yuqing Tang, Xiang Kong
Abstract: The Transformer model has achieved state-of-the-art performance in many sequence modeling tasks. However, how to leverage model capacity with large or variable depths is still an open challenge. We present a probabilistic framework to automatically learn which layer(s) to use by learning the posterior distributions of layer selection. As an extension of this framework, we propose a novel method to train one shared Transformer network for multilingual machine translation with different layer selection posteriors for each language pair. The proposed method alleviates the vanishing gradient issue and enables stable training of deep Transformers (e.g. 100 layers). We evaluate on WMT English-German machine translation and masked language modeling tasks, where our method outperforms existing approaches for training deeper Transformers. Experiments on multilingual machine translation demonstrate that this approach can effectively leverage increased model capacity and bring universal improvement for both many-to-one and one-to-many translation with diverse language pairs.

20. What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams [PDF] Back to Contents
  Di Jin, Eileen Pan, Nassim Oufattole, Wei-Hung Weng, Hanyi Fang, Peter Szolovits
Abstract: Open domain question answering (OpenQA) tasks have been recently attracting more and more attention from the natural language processing (NLP) community. In this work, we present the first free-form multiple-choice OpenQA dataset for solving medical problems, MedQA, collected from the professional medical board exams. It covers three languages: English, simplified Chinese, and traditional Chinese, and contains 12,723, 34,251, and 14,123 questions for the three languages, respectively. We implement both rule-based and popular neural methods by sequentially combining a document retriever and a machine comprehension model. Through experiments, we find that even the current best method can only achieve 36.7\%, 42.0\%, and 70.1\% of test accuracy on the English, traditional Chinese, and simplified Chinese questions, respectively. We expect MedQA to present great challenges to existing OpenQA systems and hope that it can serve as a platform to promote much stronger OpenQA models from the NLP community in the future.

21. Reactive Supervision: A New Method for Collecting Sarcasm Data [PDF] Back to Contents
  Boaz Shmueli, Lun-Wei Ku, Soumya Ray
Abstract: Sarcasm detection is an important task in affective computing, requiring large amounts of labeled data. We introduce reactive supervision, a novel data collection method that utilizes the dynamics of online conversations to overcome the limitations of existing data collection techniques. We use the new method to create and release a first-of-its-kind large dataset of tweets with sarcasm perspective labels and new contextual features. The dataset is expected to advance sarcasm detection research. Our method can be adapted to other affective computing domains, thus opening up new research opportunities.

22. A Simple and Efficient Ensemble Classifier Combining Multiple Neural Network Models on Social Media Datasets in Vietnamese [PDF] Back to Contents
  Huy Duc Huynh, Hang Thi-Thuy Do, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Abstract: Text classification is a popular topic of natural language processing, which has currently attracted numerous research efforts worldwide. The significant increase of data in social media requires the vast attention of researchers to analyze such data. There are various studies in this field in many languages but limited to the Vietnamese language. Therefore, this study aims to classify Vietnamese texts on social media from three different Vietnamese benchmark datasets. Advanced deep learning models are used and optimized in this study, including CNN, LSTM, and their variants. We also implement BERT, which has never been applied to these datasets. Our experiments find a suitable model for classification tasks on each specific dataset. To take advantage of single models, we propose an ensemble model combining the highest-performance models. Our single models reach positive results on each dataset. Moreover, our ensemble model achieves the best performance on all three datasets. We reach an F1-score of 86.96% on the HSD-VLSP dataset, 65.79% on the UIT-VSMEC dataset, and 92.79% and 89.70% for sentiments and topics on the UIT-VSFC dataset, respectively. Therefore, our models achieve better performance compared to previous studies on these datasets.

23. Mitigating Gender Bias for Neural Dialogue Generation with Adversarial Learning [PDF] Back to Contents
  Haochen Liu, Wentao Wang, Yiqi Wang, Hui Liu, Zitao Liu, Jiliang Tang
Abstract: Dialogue systems play an increasingly important role in various aspects of our daily life. It is evident from recent research that dialogue systems trained on human conversation data are biased. In particular, they can produce responses that reflect people's gender prejudice. Many debiasing methods have been developed for various natural language processing tasks, such as word embedding. However, they are not directly applicable to dialogue systems because they are likely to force dialogue models to generate similar responses for different genders. This greatly degrades the diversity of the generated responses and immensely hurts the performance of the dialogue models. In this paper, we propose a novel adversarial learning framework Debiased-Chat to train dialogue models free from gender bias while keeping their performance. Extensive experiments on two real-world conversation datasets show that our framework significantly reduces gender bias in dialogue models while maintaining the response quality.

24. SPARTA: Efficient Open-Domain Question Answering via Sparse Transformer Matching Retrieval [PDF] Back to Contents
  Tiancheng Zhao, Xiaopeng Lu, Kyusong Lee
Abstract: We introduce SPARTA, a novel neural retrieval method that shows great promise in performance, generalization, and interpretability for open-domain question answering. Unlike many neural ranking methods that use dense vector nearest neighbor search, SPARTA learns a sparse representation that can be efficiently implemented as an Inverted Index. The resulting representation enables scalable neural retrieval that does not require expensive approximate vector search and leads to better performance than its dense counterpart. We validated our approaches on 4 open-domain question answering (OpenQA) tasks and 11 retrieval question answering (ReQA) tasks. SPARTA achieves new state-of-the-art results across a variety of open-domain question answering tasks in both English and Chinese datasets, including open SQuAD, Natural Questions, CMRC, etc. Analysis also confirms that the proposed method creates human-interpretable representations and allows flexible control over the trade-off between performance and efficiency.
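
A minimal sketch of serving a learned sparse representation through an inverted index: per-document term weights (plain dicts here, standing in for the encoder's output) are inverted once, and answering a query costs one index lookup per query token.

from collections import defaultdict

def build_index(doc_term_weights):
    # doc_term_weights: {doc_id: {term: weight}}, mostly zero/absent weights.
    index = defaultdict(dict)
    for doc_id, weights in doc_term_weights.items():
        for term, w in weights.items():
            if w > 0:
                index[term][doc_id] = w
    return index

def search(index, query_terms, k=3):
    scores = defaultdict(float)
    for term in query_terms:                    # O(query length) lookups
        for doc_id, w in index.get(term, {}).items():
            scores[doc_id] += w
    return sorted(scores.items(), key=lambda x: -x[1])[:k]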

25. Fancy Man Lauches Zippo at WNUT 2020 Shared Task-1: A Bert Case Model for Wet Lab Entity Extraction [PDF] Back to Contents
  Haoding Meng, Qingcheng Zeng, Xiaoyang Fang, Zhexin Liang
Abstract: Automatic or semi-automatic conversion of protocols specifying steps in performing a lab procedure into machine-readable format benefits biological research a lot. The processing of these noisy, dense, and domain-specific lab protocols draws more and more interest with the development of deep learning. This paper presents our teamwork on WNUT 2020 shared task-1: wet lab entity extraction, where we conducted studies on several models, including a BiLSTM-CRF model and a Bert case model which can be used to complete wet lab entity extraction. We mainly discussed the performance differences of the Bert case model under different situations, such as transformers versions and case sensitivity, which may not have received enough attention before.

26. Unsupervised Pre-training for Biomedical Question Answering [PDF] Back to Contents
  Vaishnavi Kommaraju, Karthick Gunasekaran, Kun Li, Trapit Bansal, Andrew McCallum, Ivana Williams, Ana-Maria Istrate
Abstract: We explore the suitability of unsupervised representation learning methods on biomedical text -- BioBERT, SciBERT, and BioSentVec -- for biomedical question answering. To further improve unsupervised representations for biomedical QA, we introduce a new pre-training task from unlabeled data designed to reason about biomedical entities in the context. Our pre-training method consists of corrupting a given context by randomly replacing some mention of a biomedical entity with a random entity mention and then querying the model with the correct entity mention in order to locate the corrupted part of the context. This de-noising task enables the model to learn good representations from abundant, unlabeled biomedical text that helps QA tasks and minimizes the train-test mismatch between the pre-training task and the downstream QA tasks by requiring the model to predict spans. Our experiments show that pre-training BioBERT on the proposed pre-training task significantly boosts performance and outperforms the previous best model from the 7th BioASQ Task 7b-Phase B challenge.

27. What does it mean to be language-agnostic? Probing multilingual sentence encoders for typological properties [PDF] Back to Contents
  Rochelle Choenni, Ekaterina Shutova
Abstract: Multilingual sentence encoders have seen much success in cross-lingual model transfer for downstream NLP tasks. Yet, we know relatively little about the properties of individual languages or the general patterns of linguistic variation that they encode. We propose methods for probing sentence representations from state-of-the-art multilingual encoders (LASER, M-BERT, XLM and XLM-R) with respect to a range of typological properties pertaining to lexical, morphological and syntactic structure. In addition, we investigate how this information is distributed across all layers of the models. Our results show interesting differences in encoding linguistic variation associated with different pretraining strategies.

28. TernaryBERT: Distillation-aware Ultra-low Bit BERT [PDF] Back to Contents
  Wei Zhang, Lu Hou, Yichun Yin, Lifeng Shang, Xiao Chen, Xin Jiang, Qun Liu
Abstract: Transformer-based pre-training models like BERT have achieved remarkable performance in many natural language processing tasks. However, these models are expensive in both computation and memory, hindering their deployment on resource-constrained devices. In this work, we propose TernaryBERT, which ternarizes the weights in a fine-tuned BERT model. Specifically, we use both approximation-based and loss-aware ternarization methods and empirically investigate the ternarization granularity of different parts of BERT. Moreover, to reduce the accuracy degradation caused by the lower capacity of low bits, we leverage the knowledge distillation technique in the training process. Experiments on the GLUE benchmark and SQuAD show that our proposed TernaryBERT outperforms the other BERT quantization methods, and even achieves comparable performance to the full-precision model while being 14.9x smaller.
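
For intuition, a classic ternarization heuristic (the 0.7·mean|W| threshold from ternary weight networks) is sketched below; TernaryBERT's loss-aware, distillation-trained procedure differs, so treat this only as an illustration of what mapping weights to {-alpha, 0, +alpha} means.

import numpy as np

def ternarize(W):
    # Approximate W by alpha * T with T in {-1, 0, +1}.
    delta = 0.7 * np.abs(W).mean()               # heuristic threshold
    T = np.where(W > delta, 1.0, np.where(W < -delta, -1.0, 0.0))
    mask = T != 0
    alpha = np.abs(W[mask]).mean() if mask.any() else 0.0  # scale factor
    return alpha, T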

29. Hierarchical Deep Multi-modal Network for Medical Visual Question Answering [PDF] Back to Contents
  Deepak Gupta, Swati Suman, Asif Ekbal
Abstract: Visual Question Answering in the medical domain (VQA-Med) plays an important role in providing medical assistance to end-users. These users are expected to raise either a straightforward question with a Yes/No answer or a challenging question that requires a detailed and descriptive answer. The existing techniques in VQA-Med fail to distinguish between the different question types, which sometimes complicates the simpler problems or over-simplifies the complicated ones. It is certainly true that for different question types, several distinct systems can lead to confusion and discomfort for the end-users. To address this issue, we propose a hierarchical deep multi-modal network that analyzes and classifies end-user questions/queries and then incorporates a query-specific approach for answer prediction. We refer to our proposed approach as Hierarchical Question Segregation based Visual Question Answering, in short HQS-VQA. Our contributions are three-fold, viz. firstly, we propose a question segregation (QS) technique for VQA-Med; secondly, we integrate the QS model into the hierarchical deep multi-modal neural network to generate proper answers to queries related to medical images; and thirdly, we study the impact of QS in Medical-VQA by comparing the performance of the proposed model with QS and a model without QS. We evaluate the performance of our proposed model on two benchmark datasets, viz. RAD and CLEF18. Experimental results show that our proposed HQS-VQA technique outperforms the baseline models with significant margins. We also conduct a detailed quantitative and qualitative analysis of the obtained results and discover potential causes of errors and their solutions.

30. Inductively Representing Out-of-Knowledge-Graph Entities by Optimal Estimation Under Translational Assumptions [PDF] Back to Contents
  Damai Dai, Hua Zheng, Fuli Luo, Pengcheng Yang, Baobao Chang, Zhifang Sui
Abstract: Conventional Knowledge Graph Completion (KGC) assumes that all test entities appear during training. However, in real-world scenarios, Knowledge Graphs (KG) evolve fast with out-of-knowledge-graph (OOKG) entities added frequently, and we need to represent these entities efficiently. Most existing Knowledge Graph Embedding (KGE) methods cannot represent OOKG entities without costly retraining on the whole KG. To enhance efficiency, we propose a simple and effective method that inductively represents OOKG entities by their optimal estimation under translational assumptions. Given pretrained embeddings of the in-knowledge-graph (IKG) entities, our method needs no additional learning. Experimental results show that our method outperforms the state-of-the-art methods with higher efficiency on two KGC tasks with OOKG entities.
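
Under the TransE assumption h + r ≈ t, an unseen entity can be estimated from the auxiliary triples that mention it; the simple averaging sketch below conveys the idea (the paper's optimal estimation refines how the neighbor estimates are combined).

import numpy as np

def estimate_ookg_entity(triples, entity_emb, relation_emb, new_entity):
    # triples: (head, relation, tail) facts mentioning new_entity; assumes
    # at least one neighbor entity already has an embedding.
    estimates = []
    for h, r, t in triples:
        if h == new_entity and t in entity_emb:
            estimates.append(entity_emb[t] - relation_emb[r])  # h ≈ t - r
        elif t == new_entity and h in entity_emb:
            estimates.append(entity_emb[h] + relation_emb[r])  # t ≈ h + r
    return np.mean(estimates, axis=0)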

31. Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval [PDF] Back to Contents
  Wenhan Xiong, Xiang Lorraine Li, Srini Iyer, Jingfei Du, Patrick Lewis, William Yang Wang, Yashar Mehdad, Wen-tau Yih, Sebastian Riedel, Douwe Kiela, Barlas Oğuz
Abstract: We propose a simple and efficient multi-hop dense retrieval approach for answering complex open-domain questions, which achieves state-of-the-art performance on two multi-hop datasets, HotpotQA and multi-evidence FEVER. Contrary to previous work, our method does not require access to any corpus-specific information, such as inter-document hyperlinks or human-annotated entity markers, and can be applied to any unstructured text corpus. Our system also yields a much better efficiency-accuracy trade-off, matching the best published accuracy on HotpotQA while being 10 times faster at inference time.
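
A sketch of the multi-hop loop: retrieve with the question, fold the retrieved passage into the query, and retrieve again. The concatenation-based reformulation and the encode_query interface are assumptions for illustration.

import numpy as np

def multi_hop_retrieve(encode_query, passage_vecs, passages, question, hops=2):
    # encode_query(text) -> unit vector; passage_vecs: (N, d) unit vectors.
    query_text, chain = question, []
    for _ in range(hops):
        q = encode_query(query_text)
        best = int(np.argmax(passage_vecs @ q))       # max inner-product search
        chain.append(passages[best])
        query_text = question + " " + passages[best]  # query reformulation
    return chain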

32. Modeling Topical Relevance for Multi-Turn Dialogue Generation [PDF] Back to Contents
  Hainan Zhang, Yanyan Lan, Liang Pang, Hongshen Chen, Zhuoye Ding, Dawei Yin
Abstract: Topic drift is a common phenomenon in multi-turn dialogue. Therefore, an ideal dialogue generation model should be able to capture the topic information of each context, detect the relevant context, and produce appropriate responses accordingly. However, existing models usually use word or sentence level similarities to detect the relevant contexts, which fail to well capture topical-level relevance. In this paper, we propose a new model, named STAR-BTM, to tackle this problem. Firstly, the Biterm Topic Model is pre-trained on the whole training dataset. Then, the topic-level attention weights are computed based on the topic representation of each context. Finally, the attention weights and the topic distribution are utilized in the decoding process to generate the corresponding responses. Experimental results on both Chinese customer services data and English Ubuntu dialogue data show that STAR-BTM significantly outperforms several state-of-the-art methods, in terms of both metric-based and human evaluations.

33. Multi-timescale representation learning in LSTM Language Models [PDF] Back to Contents
  Shivangi Mahto, Vy A. Vo, Javier S. Turek, Alexander G. Huth
Abstract: Although neural language models are effective at capturing statistics of natural language, their representations are challenging to interpret. In particular, it is unclear how these models retain information over multiple timescales. In this work, we construct explicitly multi-timescale language models by manipulating the input and forget gate biases in a long short-term memory (LSTM) network. The distribution of timescales is selected to approximate power law statistics of natural language through a combination of exponentially decaying memory cells. We then empirically analyze the timescale of information routed through each part of the model using word ablation experiments and forget gate visualizations. These experiments show that the multi-timescale model successfully learns representations at the desired timescales, and that the distribution includes longer timescales than a standard LSTM. Further, information about high-,mid-, and low-frequency words is routed preferentially through units with the appropriate timescales. Thus we show how to construct language models with interpretable representations of different information timescales.
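
The link between a forget-gate bias and a timescale can be made explicit: a unit whose forget gate sits at f retains information with timescale T = -1/log(f), so a target timescale fixes the bias. A small sketch follows (the paper's power-law assignment of timescales across units is not reproduced here).

import math

def forget_bias_for_timescale(T):
    # Choose b so that sigmoid(b) = exp(-1/T), i.e. decay by 1/e after T steps.
    f = math.exp(-1.0 / T)
    return math.log(f / (1.0 - f))

# e.g., biases for units spanning short to long timescales
print([round(forget_bias_for_timescale(T), 2) for T in (2, 8, 32, 128)])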

34. A Brief Survey and Comparative Study of Recent Development of Pronoun Coreference Resolution [PDF] Back to Contents
  Hongming Zhang, Xinran Zhao, Yangqiu Song
Abstract: Pronoun Coreference Resolution (PCR) is the task of resolving pronominal expressions to all mentions they refer to. Compared with the general coreference resolution task, the main challenge of PCR is the coreference relation prediction rather than the mention detection. As one important natural language understanding (NLU) component, pronoun resolution is crucial for many downstream tasks and still challenging for existing models, which motivates us to survey existing approaches and think about how to do better. In this survey, we first introduce representative datasets and models for the ordinary pronoun coreference resolution task. Then we focus on recent progress on hard pronoun coreference resolution problems (e.g., Winograd Schema Challenge) to analyze how well current models can understand commonsense. We conduct extensive experiments to show that even though current models are achieving good performance on the standard evaluation set, they are still not ready to be used in real applications (e.g., all SOTA models struggle on correctly resolving pronouns to infrequent objects). All experiment codes are available at this https URL.

35. Stylized Dialogue Response Generation Using Stylized Unpaired Texts [PDF] Back to Contents
  Yinhe Zheng, Zikai Chen, Rongsheng Zhang, Shilei Huang, Xiaoxi Mao, Minlie Huang
Abstract: Generating stylized responses is essential to build intelligent and engaging dialogue systems. However, this task is far from well-explored due to the difficulties of rendering a particular style in coherent responses, especially when the target style is embedded only in unpaired texts that cannot be directly used to train the dialogue model. This paper proposes a stylized dialogue generation method that can capture stylistic features embedded in unpaired texts. Specifically, our method can produce dialogue responses that are both coherent to the given context and conform to the target style. In this study, an inverse dialogue model is first introduced to predict possible posts for the input responses, and then this inverse model is used to generate stylized pseudo dialogue pairs based on these stylized unpaired texts. Further, these pseudo pairs are employed to train the stylized dialogue model with a joint training process, and a style routing approach is proposed to intensify stylistic features in the decoder. Automatic and manual evaluations on two datasets demonstrate that our method outperforms competitive baselines in producing coherent and style-intensive dialogue responses.

36. Local and non-local dependency learning and emergence of rule-like representations in speech data by Deep Convolutional Generative Adversarial Networks [PDF] Back to contents
  Gašper Beguš
Abstract: This paper argues that training GANs on local and non-local dependencies in speech data offers insights into how deep neural networks discretize continuous data and how symbolic-like rule-based morphophonological processes emerge in a deep convolutional architecture. Acquisition of speech has recently been modeled as a dependency between latent space and data generated by GANs in Beguš (arXiv:2006.03965), who models learning of a simple local allophonic distribution. We extend this approach to test learning of local and non-local phonological processes that include approximations of morphological processes. We further compare the outputs of the model to the results of a behavioral experiment in which human subjects are trained on the data used for training the GAN network. Four main conclusions emerge: (i) the networks provide useful information for computational models of language acquisition even if trained on a comparatively small dataset of an artificial grammar learning experiment; (ii) local processes are easier to learn than non-local processes, which matches both behavioral data in human subjects and typology in the world's languages. This paper also proposes (iii) how we can actively observe the network's progress in learning and explore the effect of training steps on learned representations by keeping the latent space constant across different training steps. Finally, this paper shows that (iv) the network learns to encode the presence of a prefix with a single latent variable; by interpolating this variable, we can actively observe the operation of a non-local phonological process. The proposed technique for retrieving learned representations has general implications for our understanding of how GANs discretize continuous speech data and suggests that rule-like generalizations in the training data are represented as an interaction between variables in the network's latent space.

37. Neural Proof Nets [PDF] Back to contents
  Konstantinos Kogkalidis, Michael Moortgat, Richard Moot
Abstract: Linear logic and the linear {\lambda}-calculus have a long-standing tradition in the study of natural language form and meaning. Among the proof calculi of linear logic, proof nets are of particular interest, offering an attractive geometric representation of derivations that is unburdened by the bureaucratic complications of conventional proof-theoretic formats. Building on recent advances in set-theoretic learning, we propose a neural variant of proof nets based on Sinkhorn networks, which allows us to translate parsing as the problem of extracting syntactic primitives and permuting them into alignment. Our methodology induces a batch-efficient, end-to-end differentiable architecture that actualizes a formally grounded yet highly efficient neuro-symbolic parser. We test our approach on ÆThel, a dataset of type-logical derivations for written Dutch, where it manages to correctly transcribe raw text sentences into proofs and terms of the linear {\lambda}-calculus with an accuracy as high as 70%.
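
Sinkhorn networks turn a matrix of alignment scores into a near-permutation by repeatedly normalizing rows and columns, which keeps the whole parsing step differentiable. A minimal PyTorch sketch of that operator follows; the sizes and iteration count are illustrative assumptions.

    import torch

    def sinkhorn(log_scores: torch.Tensor, n_iters: int = 20) -> torch.Tensor:
        # Alternate row and column normalization in log space; the result
        # converges toward a doubly stochastic (soft permutation) matrix.
        z = log_scores
        for _ in range(n_iters):
            z = z - torch.logsumexp(z, dim=-1, keepdim=True)  # rows sum to 1
            z = z - torch.logsumexp(z, dim=-2, keepdim=True)  # columns sum to 1
        return z.exp()

    # Toy usage: softly align 4 extracted syntactic primitives with 4 slots.
    soft_perm = sinkhorn(torch.randn(4, 4))
    print(soft_perm.sum(dim=0), soft_perm.sum(dim=1))  # both close to all-ones

At inference time a hard permutation can be recovered from the soft one, e.g. with the Hungarian algorithm.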

38. Techniques to Improve Q&A Accuracy with Transformer-based models on Large Complex Documents [PDF] Back to contents
  Chejui Liao, Tabish Maniar, Sravanajyothi N, Anantha Sharma
Abstract: This paper discusses the effectiveness of various text processing techniques, their combinations, and encodings to achieve a reduction of complexity and size in a given text corpus. The simplified text corpus is sent to BERT (or similar transformer-based models) for question answering and can produce more relevant responses to user queries. This paper takes a scientific approach to determine the benefits and effectiveness of various techniques and concludes with a best-fit combination that produces a statistically significant improvement in accuracy.

39. Clustering-based Unsupervised Generative Relation Extraction [PDF] Back to contents
  Chenhan Yuan, Ryan Rossi, Andrew Katz, Hoda Eldardiry
Abstract: This paper focuses on the problem of unsupervised relation extraction. Existing probabilistic generative model-based relation extraction methods work by extracting sentence features and using these features as inputs to train a generative model. This model is then used to cluster similar relations. However, these methods do not consider correlations between sentences with the same entity pair during training, which can negatively impact model performance. To address this issue, we propose a Clustering-based Unsupervised generative Relation Extraction (CURE) framework that leverages an "Encoder-Decoder" architecture to perform self-supervised learning so the encoder can extract relation information. Given multiple sentences with the same entity pair as inputs, self-supervised learning is deployed by predicting the shortest path between entity pairs on the dependency graph of one of the sentences. After that, we extract the relation information using the well-trained encoder. Then, entity pairs that share the same relation are clustered based on their corresponding relation information. Each cluster is labeled with a few words based on the words in the shortest paths corresponding to the entity pairs in each cluster. These cluster labels also describe the meaning of these relation clusters. We compare the triplets extracted by our proposed framework (CURE) and baseline methods with a ground-truth Knowledge Base. Experimental results show that our model performs better than state-of-the-art models on both New York Times (NYT) and United Nations Parallel Corpus (UNPC) standard datasets.
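
Once the self-supervised encoder is trained, the clustering step groups the relation vectors of entity pairs. A hedged sketch with scikit-learn follows; the abstract does not name the clustering algorithm, so KMeans, the vector dimensionality, and the cluster count are stand-in assumptions.

    import numpy as np
    from sklearn.cluster import KMeans

    # Stand-in for the trained encoder's output: one relation vector per
    # entity-pair context (random here purely for illustration).
    relation_vecs = np.random.randn(1000, 128)

    kmeans = KMeans(n_clusters=10, n_init=10, random_state=0)
    cluster_ids = kmeans.fit_predict(relation_vecs)

    # Each cluster would then be labeled with frequent words from the shortest
    # dependency paths of its member entity pairs, as described above.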

40. KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning [PDF] Back to contents
  Ye Liu, Yao Wan, Lifang He, Hao Peng, Philip S. Yu
Abstract: Generative commonsense reasoning, which aims to empower machines to generate sentences with the capacity of reasoning over a set of concepts, is a critical bottleneck for text generation. Even the state-of-the-art pre-trained language generation models struggle at this task and often produce implausible and anomalous sentences. One reason is that they rarely consider incorporating the knowledge graph, which can provide rich relational information among the commonsense concepts. To promote the ability of commonsense reasoning for text generation, we propose a novel knowledge graph-augmented pre-trained language generation model, KG-BART, which encompasses the complex relations of concepts through the knowledge graph and produces more logical and natural sentences as output. Moreover, KG-BART can leverage the graph attention to aggregate the rich concept semantics that enhances the model generalization on unseen concept sets. Experiments on the benchmark CommonGen dataset verify the effectiveness of our proposed approach by comparing with several strong pre-trained language generation models; in particular, KG-BART outperforms BART by 15.98% and 17.49% in terms of BLEU-3 and BLEU-4, respectively. Moreover, we also show that the context generated by our model can work as background scenarios to benefit downstream commonsense QA tasks.

41. Recurrent Inference in Text Editing [PDF] Back to contents
  Ning Shi, Ziheng Zeng, Haotian Zhang, Yichen Gong
Abstract: In neural text editing, prevalent sequence-to-sequence based approaches directly map the unedited text either to the edited text or to the editing operations, and their performance is degraded by limited source-text encoding and long, varying decoding steps. To address this problem, we propose a new inference method, Recurrence, that iteratively performs editing actions, significantly narrowing the problem space. In each iteration, Recurrence encodes the partially edited text, decodes the latent representation, generates a short, fixed-length action, and applies the action to complete a single edit. For a comprehensive comparison, we introduce three types of text editing tasks: Arithmetic Operators Restoration (AOR), Arithmetic Equation Simplification (AES), and Arithmetic Equation Correction (AEC). Extensive experiments on these tasks with varying difficulties demonstrate that Recurrence achieves improvements over conventional inference methods.
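
The inference loop is easy to picture: encode the current text, emit one short action, apply it, repeat until the model says stop. A toy sketch on the AOR (operator restoration) task follows; predict_action here is a hypothetical rule-based stand-in for the trained model, used only to make the loop runnable.

    def predict_action(text: str) -> str:
        # Hypothetical stand-in for the trained predictor: on the AOR toy
        # task, propose inserting '+' between adjacent digits, else stop.
        for i in range(1, len(text)):
            if text[i - 1].isdigit() and text[i].isdigit():
                return f"INSERT + AT {i}"
        return "STOP"

    def recurrent_inference(text: str, max_iters: int = 10) -> str:
        # Each iteration applies one short, fixed-length edit action.
        for _ in range(max_iters):
            action = predict_action(text)
            if action == "STOP":
                break
            _, symbol, _, pos = action.split()
            text = text[:int(pos)] + symbol + text[int(pos):]
        return text

    print(recurrent_inference("12=3"))  # toy output: "1+2=3"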

42. DWIE: an entity-centric dataset for multi-task document-level information extraction [PDF] Back to contents
  Klim Zaporojets, Johannes Deleu, Chris Develder, Thomas Demeester
Abstract: This paper presents DWIE, the 'Deutsche Welle corpus for Information Extraction', a newly created multi-task dataset that combines four main Information Extraction (IE) annotation sub-tasks: (i) Named Entity Recognition (NER), (ii) Coreference Resolution, (iii) Relation Extraction (RE), and (iv) Entity Linking. DWIE is conceived as an entity-centric dataset that describes interactions and properties of conceptual entities on the level of the complete document. This contrasts with currently dominant mention-driven approaches that start from the detection and classification of named entity mentions in individual sentences. Further, DWIE presents two main challenges when building and evaluating IE models for it. First, the use of traditional mention-level evaluation metrics for NER and RE tasks on the entity-centric DWIE dataset can result in measurements dominated by predictions on more frequently mentioned entities. We tackle this issue by proposing a new entity-driven metric that takes into account the number of mentions that compose each of the predicted and ground truth entities. Second, the document-level multi-task annotations require the models to transfer information between entity mentions located in different parts of the document, as well as between different tasks, in a joint learning setting. To realize this, we propose to use graph-based neural message passing techniques between document-level mention spans. Our experiments show an improvement of up to 5.5 F1 percentage points when incorporating neural graph propagation into our joint model. This demonstrates DWIE's potential to stimulate further research in graph neural networks for representation learning in multi-task IE. We make DWIE publicly available at this https URL.
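
The motivation for the entity-driven metric can be made concrete: if entities (rather than individual mentions) are the unit of scoring, a frequently mentioned entity no longer dominates the measurement. The sketch below scores sets of predicted vs. gold entities, each entity being a set of mention spans; it is a simplified reading of the idea, and the paper's exact weighting by mention counts may differ.

    from typing import FrozenSet, Set

    Entity = FrozenSet[str]  # an entity as a frozen set of mention spans

    def entity_f1(pred: Set[Entity], gold: Set[Entity]) -> float:
        # Each entity counts once, however many mentions compose it.
        tp = len(pred & gold)
        precision = tp / len(pred) if pred else 0.0
        recall = tp / len(gold) if gold else 0.0
        if precision + recall == 0.0:
            return 0.0
        return 2 * precision * recall / (precision + recall)

    gold = {frozenset({"0:5", "40:45", "90:95"}), frozenset({"10:14"})}
    pred = {frozenset({"0:5", "40:45", "90:95"})}
    print(entity_f1(pred, gold))  # ~0.667: one of two entities recovered,
                                  # and its three mentions count only once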

43. Automatic Arabic Dialect Identification Systems for Written Texts: A Survey [PDF] Back to contents
  Maha J. Althobaiti
Abstract: Arabic dialect identification is a specific task of natural language processing, aiming to automatically predict the Arabic dialect of a given text. Arabic dialect identification is the first step in various natural language processing applications such as machine translation, multilingual text-to-speech synthesis, and cross-language text generation. Therefore, in the last decade, interest in addressing the problem of Arabic dialect identification has increased. In this paper, we present a comprehensive survey of Arabic dialect identification research in written texts. We first define the problem and its challenges. Then, the survey extensively discusses, in a critical manner, many aspects related to the Arabic dialect identification task. We review the traditional machine learning methods, deep learning architectures, and complex learning approaches to Arabic dialect identification. We also detail the features and techniques for feature representations used to train the proposed systems. Moreover, we illustrate the taxonomy of Arabic dialects studied in the literature, the various levels of text processing at which Arabic dialect identification is conducted (e.g., token, sentence, and document level), as well as the available annotated resources, including evaluation benchmark corpora. Open challenges and issues are discussed at the end of the survey.

44. ARPA: Armenian Paraphrase Detection Corpus and Models [PDF] Back to contents
  Arthur Malajyan, Karen Avetisyan, Tsolak Ghukasyan
Abstract: In this work, we employ a semi-automatic method based on back-translation to generate a sentential paraphrase corpus for the Armenian language. The initial collection of sentences is translated from Armenian to English and back twice, resulting in pairs of lexically distant but semantically similar sentences. The generated paraphrases are then manually reviewed and annotated. Using this method, train and test datasets are created, containing 2,360 paraphrases in total. In addition, the datasets are used to train and evaluate BERT-based models for detecting paraphrases in Armenian, achieving results comparable to the state of the art for other languages.
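
The generation step amounts to two round trips through a pivot language. A minimal sketch follows; translate is a hypothetical placeholder for whatever MT system is available (the paper's specific engine is not assumed here), and 'hy'/'en' are the ISO codes for Armenian and English.

    def translate(text: str, src: str, tgt: str) -> str:
        # Hypothetical placeholder for an MT system or API call.
        raise NotImplementedError

    def backtranslate_paraphrase(sentence: str) -> str:
        # Two Armenian -> English -> Armenian round trips, as in the paper,
        # yield a lexically distant but semantically similar sentence.
        for _ in range(2):
            pivot = translate(sentence, src="hy", tgt="en")
            sentence = translate(pivot, src="en", tgt="hy")
        return sentence

The (original, round-tripped) pair then goes to manual review before entering the corpus.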

45. Metaphor Detection using Deep Contextualized Word Embeddings [PDF] Back to contents
  Shashwat Aggarwal, Ramesh Singh
Abstract: Metaphors are ubiquitous in natural language, and their detection plays an essential role in many natural language processing tasks, such as language understanding, sentiment analysis, etc. Most existing approaches for metaphor detection rely on complex, hand-crafted and fine-tuned feature pipelines, which greatly limit their applicability. In this work, we present an end-to-end method composed of deep contextualized word embeddings, bidirectional LSTMs and a multi-head attention mechanism to address the task of automatic metaphor detection. Our method, unlike many other existing approaches, requires only the raw text sequences as input features to detect the metaphoricity of a phrase. We compare the performance of our method against the existing baselines on two benchmark datasets, TroFi and MOH-X. Experimental evaluations confirm the effectiveness of our approach.
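
The architecture described above maps cleanly onto a few PyTorch modules. A skeleton follows; the embedding dimensionality (ELMo-sized, 1024), hidden size, and head count are illustrative assumptions, and the contextual embedder itself is assumed to run upstream.

    import torch
    import torch.nn as nn

    class MetaphorTagger(nn.Module):
        # Contextual embeddings -> BiLSTM -> multi-head self-attention ->
        # per-token literal/metaphoric logits (a sketch of the recipe above).
        def __init__(self, emb_dim: int = 1024, hidden: int = 256, heads: int = 4):
            super().__init__()
            self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                                  bidirectional=True)
            self.attn = nn.MultiheadAttention(2 * hidden, heads,
                                              batch_first=True)
            self.classifier = nn.Linear(2 * hidden, 2)

        def forward(self, contextual_embeddings: torch.Tensor) -> torch.Tensor:
            h, _ = self.bilstm(contextual_embeddings)  # (batch, seq, 2*hidden)
            h, _ = self.attn(h, h, h)                  # self-attention over tokens
            return self.classifier(h)                  # (batch, seq, 2) logits

    logits = MetaphorTagger()(torch.randn(2, 12, 1024))  # toy batch of embeddings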

46. Topic-Aware Multi-turn Dialogue Modeling [PDF] Back to contents
  Yi Xu, Hai Zhao, Zhuosheng Zhang
Abstract: In retrieval-based multi-turn dialogue modeling, it remains a challenge to select the most appropriate response by extracting salient features from context utterances. As a conversation goes on, topic shifts at the discourse level naturally happen through the continuous multi-turn dialogue context. However, all known retrieval-based systems are satisfied with exploiting local topic words for context utterance representation but fail to capture such essential global topic-aware clues at the discourse level. Instead of taking topic-agnostic n-gram utterances as the processing unit for matching purposes, as in existing systems, this paper presents a novel topic-aware solution for multi-turn dialogue modeling, which segments and extracts topic-aware utterances in an unsupervised way, so that the resulting model is capable of capturing salient topic shifts at the discourse level where needed and thus effectively tracks topic flow during multi-turn conversation. Our topic-aware modeling is implemented by a newly proposed unsupervised topic-aware segmentation algorithm and a Topic-Aware Dual-attention Matching (TADAM) Network, which matches each topic segment with the response in a dual cross-attention way. Experimental results on three public datasets show TADAM can outperform the state-of-the-art method by a large margin, especially by 3.4% on the E-commerce dataset, which has an obvious topic shift.

47. iNLTK: Natural Language Toolkit for Indic Languages [PDF] Back to contents
  Gaurav Arora
Abstract: We present iNLTK, an open-source NLP library consisting of pre-trained language models and out-of-the-box support for Paraphrase Generation, Textual Similarity, Sentence Embeddings, Word Embeddings, Tokenization and Text Generation in 13 Indic Languages. By using pre-trained models from iNLTK for text classification on publicly available datasets, we significantly outperform previously reported results. On these datasets, we also show that by using pre-trained models and paraphrases from iNLTK, we can achieve more than 95% of the previous best performance by using less than 10% of the training data. iNLTK is already being widely used by the community and has 40,000+ downloads, 600+ stars and 100+ forks on GitHub. The library is available at this https URL.

48. QuatRE: Relation-Aware Quaternions for Knowledge Graph Embeddings [PDF] Back to contents
  Dai Quoc Nguyen, Thanh Vu, Tu Dinh Nguyen, Dinh Phung
Abstract: We propose a simple and effective embedding model, named QuatRE, to learn quaternion embeddings for entities and relations in knowledge graphs. QuatRE aims to enhance the correlations between head and tail entities given a relation within the quaternion space via the Hamilton product. QuatRE achieves this by associating each relation with two quaternion vectors, which are used to rotate the quaternion embeddings of the head and tail entities, respectively. To obtain the triple score, QuatRE rotates the rotated embedding of the head entity using the normalized quaternion embedding of the relation, followed by a quaternion inner product with the rotated embedding of the tail entity. Experimental results show that our QuatRE outperforms up-to-date embedding models on well-known benchmark datasets for knowledge graph completion.
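
The workhorse here is the Hamilton product, which composes two quaternions and acts as a rotation on quaternion embeddings. Below is a small PyTorch sketch of that product plus one plausible reading of the triple score described above; the function names and the omission of quaternion normalization are simplifying assumptions.

    import torch

    def hamilton(q: torch.Tensor, p: torch.Tensor) -> torch.Tensor:
        # Quaternion embeddings stored as (..., 4*k): four equal chunks hold
        # the real, i, j and k components.
        a1, b1, c1, d1 = q.chunk(4, dim=-1)
        a2, b2, c2, d2 = p.chunk(4, dim=-1)
        return torch.cat([
            a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i part
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j part
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k part
        ], dim=-1)

    def triple_score(h, r_h, r, r_t, t):
        # Rotate the head by its relation-specific quaternion and by the
        # relation, rotate the tail by its own, then take the inner product
        # (normalization of the relation quaternions is omitted in this sketch).
        return (hamilton(hamilton(h, r_h), r) * hamilton(t, r_t)).sum(dim=-1)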

49. Learning to Plan and Realize Separately for Open-Ended Dialogue Systems [PDF] Back to contents
  Sashank Santhanam, Zhuo Cheng, Brodie Mather, Bonnie Dorr, Archna Bhatia, Bryanna Hebenstreit, Alan Zemel, Adam Dalton, Tomek Strzalkowski, Samira Shaikh
Abstract: Achieving true human-like ability to conduct a conversation remains an elusive goal for open-ended dialogue systems. We posit this is because extant approaches towards natural language generation (NLG) are typically construed as end-to-end architectures that do not adequately model human generation processes. To investigate, we decouple generation into two separate phases: planning and realization. In the planning phase, we train two planners to generate plans for response utterances. The realization phase uses response plans to produce an appropriate response. Through rigorous evaluations, both automated and human, we demonstrate that decoupling the process into planning and realization performs better than an end-to-end approach.

50. Modeling Dyadic Conversations for Personality Inference [PDF] Back to contents
  Qiang Liu
Abstract: Nowadays, automatic personality inference is drawing extensive attention from both academia and industry. Conventional methods are mainly based on an individual's user-generated content on social media, e.g., profiles, likes, and texts, which is actually not very reliable. In contrast, dyadic conversations between individuals can not only capture how one expresses oneself, but also reflect how one reacts to different situations. Rich contextual information in a dyadic conversation can explain an individual's responses during the conversation. In this paper, we propose a novel augmented Gated Recurrent Unit (GRU) model for learning unsupervised Personal Conversational Embeddings (PCE) based on dyadic conversations between individuals. We adjust the formulation of each layer of a conventional GRU with sequence-to-sequence learning and the personal information of both sides of the conversation. Based on the learned PCE, we can infer the personality of each individual. We conduct experiments on the Movie Script dataset, which is collected from conversations between characters in movie scripts. We find that modeling dyadic conversations between individuals can significantly improve personality inference accuracy. Experimental results illustrate the successful performance of our proposed method.

51. BET: A Backtranslation Approach for Easy Data Augmentation in Transformer-based Paraphrase Identification Context [PDF] Back to contents
  Jean-Philippe Corbeil, Hadi Abdi Ghadivel
Abstract: Newly introduced deep learning architectures, namely BERT, XLNet, RoBERTa and ALBERT, have proven to be robust on several NLP tasks. However, the datasets these architectures are trained on are fixed in terms of size and generalizability. To relieve this issue, we apply one of the most inexpensive solutions to update these datasets. We call this approach BET, by which we analyze backtranslation data augmentation on transformer-based architectures. Using the Google Translate API with ten intermediary languages from ten different language families, we externally evaluate the results in the context of automatic paraphrase identification in a transformer-based framework. Our findings suggest that BET improves paraphrase identification performance on the Microsoft Research Paraphrase Corpus (MRPC) by more than 3% in both accuracy and F1 score. We also analyze the augmentation in the low-data regime with downsampled versions of MRPC, the Twitter Paraphrase Corpus (TPC) and Quora Question Pairs. In many low-data cases, we observe a switch from a failing model on the test set to reasonable performance. The results demonstrate that BET is a highly promising data augmentation technique: it pushes the current state of the art on existing datasets and bootstraps the utilization of deep learning architectures in the low-data regime of a hundred samples.

52. XTE: Explainable Text Entailment [PDF] Back to contents
  Vivian S. Silva, André Freitas, Siegfried Handschuh
Abstract: Text entailment, the task of determining whether a piece of text logically follows from another piece of text, is a key component in NLP, providing input for many semantic applications such as question answering, text summarization, information extraction, and machine translation, among others. Entailment scenarios can range from a simple syntactic variation to more complex semantic relationships between pieces of text, but most approaches try a one-size-fits-all solution that usually favors some scenario to the detriment of another. Furthermore, for entailments requiring world knowledge, most systems still work as a "black box", providing a yes/no answer that does not explain the underlying reasoning process. In this work, we introduce XTE - Explainable Text Entailment - a novel composite approach for recognizing text entailment which analyzes the entailment pair to decide whether it must be resolved syntactically or semantically. Also, if a semantic matching is involved, we make the answer interpretable, using external knowledge bases composed of structured lexical definitions to generate natural language justifications that explain the semantic relationship holding between the pieces of text. Besides outperforming well-established entailment algorithms, our composite approach takes an important step towards Explainable AI, allowing inference model interpretation and making the semantic reasoning process explicit and understandable.

53. Hierarchical Sparse Variational Autoencoder for Text Encoding [PDF] Back to contents
  Victor Prokhorov, Yingzhen Li, Ehsan Shareghi, Nigel Collier
Abstract: In this paper we focus on unsupervised representation learning and propose a novel framework, the Hierarchical Sparse Variational Autoencoder (HSVAE), which imposes sparsity on sentence representations via direct optimisation of the Evidence Lower Bound (ELBO). Our experimental results illustrate that HSVAE is flexible and adapts nicely to the underlying characteristics of the corpus, which are reflected in the level of sparsity and its distributional patterns.
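
For reference, the objective being optimised directly is the standard ELBO; in LaTeX notation (HSVAE's hierarchical and sparsity-inducing structure sits on top of this and is not spelled out in the abstract):

    \mathcal{L}_{\mathrm{ELBO}}(\theta, \phi; x) =
        \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
        - \mathrm{KL}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right)

Sparsity then emerges from how the approximate posterior q and the prior p over the sentence representation z are structured.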

54. Visually Grounded Compound PCFGs [PDF] Back to contents
  Yanpeng Zhao, Ivan Titov
Abstract: Exploiting visual groundings for language understanding has recently been drawing much attention. In this work, we study visually grounded grammar induction and learn a constituency parser from both unlabeled text and its visual groundings. Existing work on this task (Shi et al., 2019) optimizes a parser via REINFORCE and derives the learning signal only from the alignment of images and sentences. While their model is relatively accurate overall, its error distribution is very uneven, with low performance on certain constituent types (e.g., 26.2% recall on verb phrases, VPs) and high on others (e.g., 79.6% recall on noun phrases, NPs). This is not surprising, as the learning signal is likely insufficient for deriving all aspects of phrase-structure syntax and the gradient estimates are noisy. We show that using an extension of the probabilistic context-free grammar model we can do fully-differentiable end-to-end visually grounded learning. Additionally, this enables us to complement the image-text alignment loss with a language modeling objective. On the MSCOCO test captions, our model establishes a new state of the art, outperforming its non-grounded version and, thus, confirming the effectiveness of visual groundings in constituency grammar induction. It also substantially outperforms the previous grounded model, with the largest improvements on more `abstract' categories (e.g., +55.1% recall on VPs).
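
The fully differentiable formulation makes it natural to train with a joint objective; a hedged sketch in LaTeX, where the additive form and the trade-off weight \alpha are assumptions rather than the paper's stated formula:

    \mathcal{L}(x, v) = \mathcal{L}_{\mathrm{align}}(x, v) + \alpha \, \mathcal{L}_{\mathrm{LM}}(x)

Here x is a sentence, v its paired image, \mathcal{L}_{\mathrm{align}} the image-text alignment loss and \mathcal{L}_{\mathrm{LM}} the language modeling objective mentioned above.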

55. RecoBERT: A Catalog Language Model for Text-Based Recommendations [PDF] Back to contents
  Itzik Malkiel, Oren Barkan, Avi Caciularu, Noam Razin, Ori Katz, Noam Koenigstein
Abstract: Language models that utilize extensive self-supervised pre-training from unlabeled text have recently been shown to significantly advance the state-of-the-art performance in a variety of language understanding tasks. However, it is yet unclear if and how these recent models can be harnessed for conducting text-based recommendations. In this work, we introduce RecoBERT, a BERT-based approach for learning catalog-specialized language models for text-based item recommendations. We suggest novel training and inference procedures for scoring similarities between pairs of items that don't require item similarity labels. Both the training and the inference techniques were designed to utilize the unlabeled structure of textual catalogs and minimize the discrepancy between them. By incorporating four scores during inference, RecoBERT can infer text-based item-to-item similarities more accurately than other techniques. In addition, we introduce a new language understanding task for wine recommendations using similarities based on professional wine reviews. As an additional contribution, we publish an annotated recommendations dataset crafted by human wine experts. Finally, we evaluate RecoBERT and compare it to various state-of-the-art NLP models on wine and fashion recommendation tasks.
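
The abstract mentions combining four scores at inference; one natural instantiation, sketched below under the assumption that each item is represented by a title embedding and a description embedding, is to average the four cosine similarities between the two items' fields (the paper's exact score definitions are not reproduced here).

    import torch
    import torch.nn.functional as F

    def item_similarity(title_a, desc_a, title_b, desc_b):
        # Four field-to-field cosine scores between two catalog items,
        # averaged into a single item-to-item similarity (an assumption).
        scores = [
            F.cosine_similarity(title_a, title_b, dim=-1),
            F.cosine_similarity(desc_a, desc_b, dim=-1),
            F.cosine_similarity(title_a, desc_b, dim=-1),
            F.cosine_similarity(desc_a, title_b, dim=-1),
        ]
        return torch.stack(scores).mean(dim=0)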

56. BiteNet: Bidirectional Temporal Encoder Network to Predict Medical Outcomes [PDF] Back to contents
  Xueping Peng, Guodong Long, Tao Shen, Sen Wang, Jing Jiang, Chengqi Zhang
Abstract: Electronic health records (EHRs) are longitudinal records of a patient's interactions with healthcare systems. A patient's EHR data is organized as a three-level hierarchy from top to bottom: patient journey - all the experiences of diagnoses and treatments over a period of time; individual visit - a set of medical codes in a particular visit; and medical code - a specific record in the form of medical codes. As EHRs begin to amass in millions, the potential benefits, which these data might hold for medical research and medical outcome prediction, are staggering - including, for example, predicting future admissions to hospitals, diagnosing illnesses or determining the efficacy of medical treatments. Each of these analytics tasks requires a domain knowledge extraction method to transform the hierarchical patient journey into a vector representation for further prediction procedure. The representations should embed a sequence of visits and a set of medical codes with a specific timestamp, which are crucial to any downstream prediction tasks. Hence, expressively powerful representations are appealing to boost learning performance. To this end, we propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey. An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys, based solely on the proposed attention mechanism. We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset. The empirical results demonstrate the proposed BiteNet model produces higher-quality representations than state-of-the-art baseline methods.

57. Visual Exploration and Knowledge Discovery from Biomedical Dark Data [PDF] Back to contents
  Shashwat Aggarwal, Ramesh Singh
Abstract: Data visualization techniques proffer efficient means to organize and present data in graphically appealing formats, which not only speeds up the process of decision making and pattern recognition but also enables decision-makers to fully understand data insights and make informed decisions. Over time, with the rise in technological and computational resources, there has been an exponential increase in the world's scientific knowledge. However, most of it lacks structure and cannot be easily categorized and imported into regular databases. This type of data is often termed Dark Data. Data visualization techniques provide a promising solution to explore such data by allowing quick comprehension of information, the discovery of emerging trends, the identification of relationships and patterns, etc. In this empirical research study, we use the rich corpus of PubMed, comprising more than 30 million citations from the biomedical literature, to visually explore and understand the underlying key insights using various information visualization techniques. We employ a natural language processing based pipeline to discover knowledge from this biomedical dark data. The pipeline comprises different lexical analysis techniques, such as Topic Modeling, to extract inherent topics and major focus areas, and Network Graphs, to study the relationships between various entities such as scientific documents and journals, researchers, and keywords and terms. With this analytical research, we aim to proffer a potential solution to overcome the problem of analyzing overwhelming amounts of information and to diminish the limitations of human cognition and perception in handling and examining such large volumes of data.
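
The topic-modeling stage of such a pipeline can be illustrated compactly with scikit-learn's LDA; the four in-line documents below are stand-ins for PubMed abstracts, and the topic count is an arbitrary illustrative choice.

    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    docs = [
        "gene expression in tumor cells",
        "protein folding and gene regulation",
        "clinical trial of a new vaccine",
        "vaccine efficacy in a randomized trial",
    ]  # stand-ins for PubMed abstracts

    vectorizer = CountVectorizer(stop_words="english")
    counts = vectorizer.fit_transform(docs)

    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    doc_topics = lda.fit_transform(counts)  # per-document topic mixtures

    vocab = vectorizer.get_feature_names_out()
    for k, weights in enumerate(lda.components_):
        top_words = [vocab[i] for i in weights.argsort()[-3:]]
        print(f"topic {k}:", top_words)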

58. Reinforcement Learning-based N-ary Cross-Sentence Relation Extraction [PDF] Back to contents
  Chenhan Yuan, Ryan Rossi, Andrew Katz, Hoda Eldardiry
Abstract: Models of n-ary cross-sentence relation extraction based on distant supervision assume that consecutive sentences mentioning n entities describe the relation of these n entities. However, on one hand, this assumption introduces noisily labeled data and harms the models' performance. On the other hand, some non-consecutive sentences also describe one relation, and these sentences cannot be labeled under this assumption. In this paper, we relax this strong assumption by a weaker distant supervision assumption to address the second issue, and we propose a novel sentence distribution estimator model to address the first problem. This estimator, which selects correctly labeled sentences to alleviate the effect of noisy data, is a two-level agent reinforcement learning model. In addition, a novel universal relation extractor with a hybrid approach of attention mechanism and PCNN is proposed such that it can be deployed in any task, including those with consecutive and non-consecutive sentences. Experiments demonstrate that the proposed model can reduce the impact of noisy data and achieve better performance on the general n-ary cross-sentence relation extraction task compared to baseline models.
