
[arXiv Papers] Computation and Language 2020-05-28

Contents

1. Unsupervised Dual Paraphrasing for Two-stage Semantic Parsing [PDF] Abstract
2. Syntactic Structure Distillation Pretraining For Bidirectional Encoders [PDF] Abstract
3. Self-Training for Unsupervised Parsing with PRPN [PDF] Abstract
4. Thirty Musts for Meaning Banking [PDF] Abstract
5. CausaLM: Causal Model Explanation Through Counterfactual Language Models [PDF] Abstract
6. The First Shared Task on Discourse Representation Structure Parsing [PDF] Abstract
7. A Multi-modal Approach to Fine-grained Opinion Mining on Video Reviews [PDF] Abstract
8. Transition-based Semantic Dependency Parsing with Pointer Networks [PDF] Abstract
9. Enriched In-Order Linearization for Faster Sequence-to-Sequence Constituent Parsing [PDF] Abstract
10. Tracking, exploring and analyzing recent developments in German-language online press in the face of the coronavirus crisis: cOWIDplus Analysis and cOWIDplus Viewer [PDF] Abstract
11. Catching Attention with Automatic Pull Quote Selection [PDF] Abstract
12. Establishing a New State-of-the-Art for French Named Entity Recognition [PDF] Abstract
13. Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis? [PDF] Abstract
14. Chat as Expected: Learning to Manipulate Black-box Neural Dialogue Models [PDF] Abstract
15. MT-Adapted Datasheets for Datasets: Template and Repository [PDF] Abstract
16. Counterfactual Detection meets Transfer Learning [PDF] Abstract
17. Should Answer Immediately or Wait for Further Information? A Novel Wait-or-Answer Task and Its Predictive Approach [PDF] Abstract
18. Learning with Weak Supervision for Email Intent Detection [PDF] Abstract
19. Examining Racial Bias in an Online Abuse Corpus with Structural Topic Modeling [PDF] Abstract
20. English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too [PDF] Abstract
21. Comparing BERT against traditional machine learning text classification [PDF] Abstract
22. Towards an Open Platform for Legal Information [PDF] Abstract
23. Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport [PDF] Abstract
24. A Study of Neural Matching Models for Cross-lingual IR [PDF] Abstract

Abstracts

1. Unsupervised Dual Paraphrasing for Two-stage Semantic Parsing [PDF] Back to contents
  Ruisheng Cao, Su Zhu, Chenyu Yang, Chen Liu, Rao Ma, Yanbin Zhao, Lu Chen, Kai Yu
Abstract: One daunting problem for semantic parsing is the scarcity of annotation. Aiming to reduce nontrivial human labor, we propose a two-stage semantic parsing framework, where the first stage utilizes an unsupervised paraphrase model to convert an unlabeled natural language utterance into the canonical utterance. The downstream naive semantic parser accepts the intermediate output and returns the target logical form. Furthermore, the entire training process is split into two phases: pre-training and cycle learning. Three tailored self-supervised tasks are introduced throughout training to activate the unsupervised paraphrase model. Experimental results on benchmarks Overnight and GeoGranno demonstrate that our framework is effective and compatible with supervised training.

2. Syntactic Structure Distillation Pretraining For Bidirectional Encoders [PDF] Back to contents
  Adhiguna Kuncoro, Lingpeng Kong, Daniel Fried, Dani Yogatama, Laura Rimell, Chris Dyer, Phil Blunsom
Abstract: Textual representation learners trained on large amounts of data have achieved notable success on downstream tasks; intriguingly, they have also performed well on challenging tests of syntactic competence. Given this success, it remains an open question whether scalable learners like BERT can become fully proficient in the syntax of natural language by virtue of data scale alone, or whether they still benefit from more explicit syntactic biases. To answer this question, we introduce a knowledge distillation strategy for injecting syntactic biases into BERT pretraining, by distilling the syntactically informative predictions of a hierarchical---albeit harder to scale---syntactic language model. Since BERT models masked words in bidirectional context, we propose to distill the approximate marginal distribution over words in context from the syntactic LM. Our approach reduces relative error by 2-21% on a diverse set of structured prediction tasks, although we obtain mixed results on the GLUE benchmark. Our findings demonstrate the benefits of syntactic biases, even in representation learners that exploit large amounts of data, and contribute to a better understanding of where syntactic biases are most helpful in benchmarks of natural language understanding.
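As a rough illustration of the distillation objective described in this abstract, the snippet below computes a KL-divergence loss between a teacher distribution over the vocabulary at a masked position (standing in for the syntactic LM's approximate marginal) and a student masked-LM prediction (standing in for BERT). The random tensors and the exact loss form are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only: distilling a teacher distribution over a masked word
# into a student masked LM via KL divergence.
import torch
import torch.nn.functional as F

vocab_size, batch_size = 30522, 8
teacher_logits = torch.randn(batch_size, vocab_size)  # syntactic LM's approximate marginal (assumed given)
student_logits = torch.randn(batch_size, vocab_size, requires_grad=True)  # BERT's masked-word predictions

# KL divergence between teacher and student at the masked positions, averaged over the batch.
distill_loss = F.kl_div(
    F.log_softmax(student_logits, dim=-1),
    F.softmax(teacher_logits, dim=-1),
    reduction="batchmean",
)
distill_loss.backward()
```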

3. Self-Training for Unsupervised Parsing with PRPN [PDF] Back to contents
  Anhad Mohananey, Katharina Kann, Samuel R. Bowman
Abstract: Neural unsupervised parsing (UP) models learn to parse without access to syntactic annotations, while being optimized for another task like language modeling. In this work, we propose self-training for neural UP models: we leverage aggregated annotations predicted by copies of our model as supervision for future copies. To be able to use our model's predictions during training, we extend a recent neural UP architecture, the PRPN (Shen et al., 2018a) such that it can be trained in a semi-supervised fashion. We then add examples with parses predicted by our model to our unlabeled UP training data. Our self-trained model outperforms the PRPN by 8.1% F1 and the previous state of the art by 1.6% F1. In addition, we show that our architecture can also be helpful for semi-supervised parsing in ultra-low-resource settings.

4. Thirty Musts for Meaning Banking [PDF] Back to contents
  Johan Bos, Lasha Abzianidze
Abstract: Meaning banking--creating a semantically annotated corpus for the purpose of semantic parsing or generation--is a challenging task. It is quite simple to come up with a complex meaning representation, but it is hard to design a simple meaning representation that captures many nuances of meaning. This paper lists some lessons learned in nearly ten years of meaning annotation during the development of the Groningen Meaning Bank (Bos et al., 2017) and the Parallel Meaning Bank (Abzianidze et al., 2017). The paper's format is rather unconventional: there is no explicit related work, no methodology section, no results, and no discussion (and the current snippet is not an abstract but actually an introductory preface). Instead, its structure is inspired by the work of Traum (2000) and Bender (2013). The list starts with a brief overview of the existing meaning banks (Section 1) and the rest of the items are roughly divided into three groups: corpus collection (Sections 2 and 3), annotation methods (Sections 4-11), and design of meaning representations (Sections 12-30). We hope this overview will give inspiration and guidance in creating improved meaning banks in the future.

5. CausaLM: Causal Model Explanation Through Counterfactual Language Models [PDF] Back to contents
  Amir Feder, Nadav Oved, Uri Shalit, Roi Reichart
Abstract: Understanding predictions made by deep neural networks is notoriously difficult, but also crucial to their dissemination. As all ML-based methods, they are as good as their training data, and can also capture unwanted biases. While there are tools that can help understand whether such biases exist, they do not distinguish between correlation and causation, and might be ill-suited for text-based models and for reasoning about high level language concepts. A key problem of estimating the causal effect of a concept of interest on a given model is that this estimation requires the generation of counterfactual examples, which is challenging with existing generation technology. To bridge that gap, we propose CausaLM, a framework for producing causal model explanations using counterfactual language representation models. Our approach is based on fine-tuning of deep contextualized embedding models with auxiliary adversarial tasks derived from the causal graph of the problem. Concretely, we show that by carefully choosing auxiliary adversarial pre-training tasks, language representation models such as BERT can effectively learn a counterfactual representation for a given concept of interest, and be used to estimate its true causal effect on model performance. A byproduct of our method is a language representation model that is unaffected by the tested concept, which can be useful in mitigating unwanted bias ingrained in the data.
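One generic way to realize an adversarial auxiliary task of this kind is a gradient-reversal layer that pushes the encoder to discard information about the treated concept. The PyTorch sketch below is only an assumed illustration of that general construction; CausaLM's actual pre-training tasks are derived from the causal graph of the problem and differ in detail.

```python
# Illustrative sketch only: an adversarial concept-prediction head with gradient reversal.
import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output  # flipped gradients push the encoder to "forget" the concept

features = torch.randn(4, 768, requires_grad=True)  # stand-in for contextual representations
concept_head = torch.nn.Linear(768, 2)              # tries to predict the treated concept
concept_labels = torch.tensor([0, 1, 0, 1])

adv_loss = F.cross_entropy(concept_head(GradReverse.apply(features)), concept_labels)
adv_loss.backward()
```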

6. The First Shared Task on Discourse Representation Structure Parsing [PDF] Back to contents
  Lasha Abzianidze, Rik van Noord, Hessel Haagsma, Johan Bos
Abstract: The paper presents the IWCS 2019 shared task on semantic parsing where the goal is to produce Discourse Representation Structures (DRSs) for English sentences. DRSs originate from Discourse Representation Theory and represent scoped meaning representations that capture the semantics of negation, modals, quantification, and presupposition triggers. Additionally, concepts and event-participants in DRSs are described with WordNet synsets and the thematic roles from VerbNet. To measure similarity between two DRSs, they are represented in a clausal form, i.e. as a set of tuples. Participant systems were expected to produce DRSs in this clausal form. The rich lexical information, explicit scope marking, high number of shared variables among clauses, and highly constrained format of valid DRSs all make DRS parsing a challenging NLP task. The results of the shared task displayed improvements over the existing state-of-the-art parser.

7. A Multi-modal Approach to Fine-grained Opinion Mining on Video Reviews [PDF] Back to contents
  Edison Marrese-Taylor, Cristian Rodriguez-Opazo, Jorge A. Balazs, Stephen Gould, Yutaka Matsuo
Abstract: Despite the recent advances in opinion mining for written reviews, few works have tackled the problem on other sources of reviews. In light of this issue, we propose a multi-modal approach for mining fine-grained opinions from video reviews that is able to determine the aspects of the item under review that are being discussed and the sentiment orientation towards them. Our approach works at the sentence level without the need for time annotations and uses features derived from the audio, video and language transcriptions of its contents. We evaluate our approach on two datasets and show that leveraging the video and audio modalities consistently provides increased performance over text-only baselines, providing evidence these extra modalities are key in better understanding video reviews.

8. Transition-based Semantic Dependency Parsing with Pointer Networks [PDF] Back to contents
  Daniel Fernández-González, Carlos Gómez-Rodríguez
Abstract: Transition-based parsers implemented with Pointer Networks have become the new state of the art in dependency parsing, excelling in producing labelled syntactic trees and outperforming graph-based models in this task. In order to further test the capabilities of these powerful neural networks on a harder NLP problem, we propose a transition system that, thanks to Pointer Networks, can straightforwardly produce labelled directed acyclic graphs and perform semantic dependency parsing. In addition, we enhance our approach with deep contextualized word embeddings extracted from BERT. The resulting system not only outperforms all existing transition-based models, but also matches the best fully-supervised accuracy to date on the SemEval 2015 Task 18 English datasets among previous state-of-the-art graph-based parsers.

9. Enriched In-Order Linearization for Faster Sequence-to-Sequence Constituent Parsing [PDF] Back to contents
  Daniel Fernández-González, Carlos Gómez-Rodríguez
Abstract: Sequence-to-sequence constituent parsing requires a linearization to represent trees as sequences. Top-down tree linearizations, which can be based on brackets or shift-reduce actions, have achieved the best accuracy to date. In this paper, we show that these results can be improved by using an in-order linearization instead. Based on this observation, we implement an enriched in-order shift-reduce linearization inspired by Vinyals et al. (2015)'s approach, achieving the best accuracy to date on the English PTB dataset among fully-supervised single-model sequence-to-sequence constituent parsers. Finally, we apply deterministic attention mechanisms to match the speed of state-of-the-art transition-based parsers, thus showing that sequence-to-sequence models can match them, not only in accuracy, but also in speed.

10. Tracking, exploring and analyzing recent developments in German-language online press in the face of the coronavirus crisis: cOWIDplus Analysis and cOWIDplus Viewer [PDF] Back to contents
  Sascha Wolfer, Alexander Koplenig, Frank Michaelis, Carolin Müller-Spitzer
Abstract: The coronavirus pandemic may be the largest crisis the world has had to face since World War II. It does not come as a surprise that it is also having an impact on language as our primary communication tool. We present three inter-connected resources that are designed to capture and illustrate these effects on a subset of the German language: An RSS corpus of German-language newsfeeds (with freely available untruncated unigram frequency lists), a static but continuously updated HTML page tracking the diversity of the used vocabulary and a web application that enables other researchers and the broader public to explore these effects without any or with little knowledge of corpus representation/exploration or statistical analyses.

11. Catching Attention with Automatic Pull Quote Selection [PDF] Back to contents
  Tanner Bohn, Charles X. Ling
Abstract: Pull quotes are an effective component of a captivating news article. These spans of text are selected from an article and provided with more salient presentation, with the aim of attracting readers with intriguing phrases and making the article more visually interesting. In this paper, we introduce the novel task of automatic pull quote selection, construct a dataset, and benchmark the performance of a number of approaches ranging from hand-crafted features to state-of-the-art sentence embeddings to cross-task models. We show that pre-trained Sentence-BERT embeddings outperform all other approaches, however the benefit over n-gram models is marginal. By closely examining the results of simple models, we also uncover many unexpected properties of pull quotes that should serve as inspiration for future approaches. We believe the benefits of exploring this problem further are clear: pull quotes have been found to increase enjoyment and readability, shape reader perceptions, and facilitate learning.

12. Establishing a New State-of-the-Art for French Named Entity Recognition [PDF] Back to contents
  Pedro Javier Ortiz Suárez, Yoann Dupont, Benjamin Muller, Laurent Romary, Benoît Sagot
Abstract: The French TreeBank developed at the University Paris 7 is the main source of morphosyntactic and syntactic annotations for French. However, it does not include explicit information related to named entities, which are among the most useful types of information for several natural language processing tasks and applications. Moreover, no large-scale French corpus with named entity annotations contains referential information, which complements the type and the span of each mention with an indication of the entity it refers to. We have manually annotated the French TreeBank with such information, after an automatic pre-annotation step. We sketch the underlying annotation guidelines and we provide a few figures about the resulting annotations.

13. Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis? [PDF] Back to contents
  Kobi Leins, Jey Han Lau, Timothy Baldwin
Abstract: As part of growing NLP capabilities, coupled with an awareness of the ethical dimensions of research, questions have been raised about whether particular datasets and tasks should be deemed off-limits for NLP research. We examine this question with respect to a paper on automatic legal sentencing from EMNLP 2019 which was a source of some debate, in asking whether the paper should have been allowed to be published, who should have been charged with making such a decision, and on what basis. We focus in particular on the role of data statements in ethically assessing research, but also discuss the topic of dual use, and examine the outcomes of similar debates in other scientific disciplines.

14. Chat as Expected: Learning to Manipulate Black-box Neural Dialogue Models [PDF] Back to contents
  Haochen Liu, Zhiwei Wang, Tyler Derr, Jiliang Tang
Abstract: Recently, neural network based dialogue systems have become ubiquitous in our increasingly digitalized society. However, due to their inherent opaqueness, some recently raised concerns about using neural models are starting to be taken seriously. In fact, intentional or unintentional behaviors could lead to a dialogue system to generate inappropriate responses. Thus, in this paper, we investigate whether we can learn to craft input sentences that result in a black-box neural dialogue model being manipulated into having its outputs contain target words or match target sentences. We propose a reinforcement learning based model that can generate such desired inputs automatically. Extensive experiments on a popular well-trained state-of-the-art neural dialogue model show that our method can successfully seek out desired inputs that lead to the target outputs in a considerable portion of cases. Consequently, our work reveals the potential of neural dialogue models to be manipulated, which inspires and opens the door towards developing strategies to defend them.

15. MT-Adapted Datasheets for Datasets: Template and Repository [PDF] Back to contents
  Marta R. Costa-jussà, Roger Creus, Oriol Domingo, Albert Domínguez, Miquel Escobar, Cayetana López, Marina Garcia, Margarita Geleta
Abstract: In this report we adopt the standardized model proposed by Gebru et al. (2018) to document the popular EuroParl (Koehn, 2005) and News-Commentary (Barrault et al., 2019) machine translation datasets. Within this documentation process, we have adapted the original datasheet to the particular case of data consumers within the Machine Translation area. We are also proposing a repository for collecting the adapted datasheets in this research area.

16. Counterfactual Detection meets Transfer Learning [PDF] Back to contents
  Kelechi Nwaike, Licheng Jiao
Abstract: Counterfactuals can be considered as belonging to the domain of discourse structure and semantics, a core area in Natural Language Understanding. In this paper, we introduce an approach to counterfactual detection as well as to indexing the antecedents and consequents of counterfactual statements. While transfer learning is already being applied to several NLP tasks, it has the characteristics to excel at a number of novel tasks. We show that detecting counterfactuals is a straightforward binary classification task that can be implemented with minimal adaptation of already existing model architectures, thanks to a well-annotated training data set, and we introduce a new end-to-end pipeline to process antecedents and consequents as an entity recognition task, thus adapting them into token classification.

17. Should Answer Immediately or Wait for Further Information? A Novel Wait-or-Answer Task and Its Predictive Approach [PDF] Back to contents
  Zehao Lin, Shaobo Cui, Xiaoming Kang, Guodun Li, Feng Ji, Haiqing Chen, Yin Zhang
Abstract: Different people have different habits of describing their intents in conversations. Some people may tend to deliberate their full intents in several successive utterances, i.e., they use several consistent messages for readability instead of a long sentence to express their question. This creates a predicament for dialogue system applications, especially in real-world industrial scenarios, in which the dialogue system is unsure whether it should answer the user's query immediately or wait for users' further supplementary input. Motivated by this interesting quandary, we define a novel task, Wait-or-Answer, to better tackle this dilemma faced by dialogue systems. We shed light on a new research topic about how a dialogue system can behave more competently in this Wait-or-Answer quandary. Further, we propose a predictive approach dubbed Imagine-then-Arbitrate (ITA) to resolve this Wait-or-Answer task. More specifically, we take advantage of an arbitrator model to help the dialogue system decide to wait or answer. The arbitrator's decision is made with the assistance of two ancillary imaginator models: a wait imaginator and an answer imaginator. The wait imaginator tries to predict what the user would supplement and uses its prediction to persuade the arbitrator that the user has some information to add, so the dialogue system should wait. The answer imaginator, in contrast, tries to predict the dialogue system's answer and to convince the arbitrator that answering the user's query immediately is the better choice. To the best of our knowledge, our paper is the first work to explicitly define the Wait-or-Answer task in dialogue systems. Additionally, our proposed ITA approach significantly outperforms the existing models in solving this Wait-or-Answer problem.

18. Learning with Weak Supervision for Email Intent Detection [PDF] Back to contents
  Kai Shu, Subhabrata Mukherjee, Guoqing Zheng, Ahmed Hassan Awadallah, Milad Shokouhi, Susan Dumais
Abstract: Email remains one of the most frequently used means of online communication. People spend a significant amount of time every day on emails to exchange information, manage tasks and schedule events. Previous work has studied different ways for improving email productivity by prioritizing emails, suggesting automatic replies or identifying intents to recommend appropriate actions. The problem has been mostly posed as a supervised learning problem where models of different complexities were proposed to classify an email message into a predefined taxonomy of intents or classes. The need for labeled data has always been one of the largest bottlenecks in training supervised models. This is especially the case for many real-world tasks, such as email intent classification, where large scale annotated examples are either hard to acquire or unavailable due to privacy or data access constraints. Email users often take actions in response to intents expressed in an email (e.g., setting up a meeting in response to an email with a scheduling request). Such actions can be inferred from user interaction logs. In this paper, we propose to leverage user actions as a source of weak supervision, in addition to a limited set of annotated examples, to detect intents in emails. We develop an end-to-end robust deep neural network model for email intent identification that leverages both clean annotated data and noisy weak supervision along with a self-paced learning mechanism. Extensive experiments on three different intent detection tasks show that our approach can effectively leverage the weakly supervised data to improve intent detection in emails.
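The core idea of combining a small set of clean annotations with a much larger set of labels inferred from user actions can be pictured with the toy loss below. The fixed down-weighting factor and the linear classifier are assumptions for illustration; the paper instead uses a deep model with a self-paced mechanism to weight the noisy, weakly labeled examples.

```python
# Toy sketch only: mixing a clean-labeled loss with a down-weighted weakly labeled loss.
import torch
import torch.nn.functional as F

model = torch.nn.Linear(128, 4)  # stand-in intent classifier over 4 intent classes

x_clean, y_clean = torch.randn(8, 128), torch.randint(0, 4, (8,))   # small annotated set
x_weak, y_weak = torch.randn(64, 128), torch.randint(0, 4, (64,))   # labels inferred from user actions

loss = F.cross_entropy(model(x_clean), y_clean) \
     + 0.3 * F.cross_entropy(model(x_weak), y_weak)  # assumed fixed weight for the noisy labels
loss.backward()
```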

19. Examining Racial Bias in an Online Abuse Corpus with Structural Topic Modeling [PDF] Back to contents
  Thomas Davidson, Debasmita Bhattacharya
Abstract: We use structural topic modeling to examine racial bias in data collected to train models to detect hate speech and abusive language in social media posts. We augment the abusive language dataset by adding an additional feature indicating the predicted probability of the tweet being written in African-American English. We then use structural topic modeling to examine the content of the tweets and how the prevalence of different topics is related to both abusiveness annotation and dialect prediction. We find that certain topics are disproportionately racialized and considered abusive. We discuss how topic modeling may be a useful approach for identifying bias in annotated data.

20. English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too [PDF] Back to contents
  Jason Phang, Phu Mon Htut, Yada Pruksachatkun, Haokun Liu, Clara Vania, Katharina Kann, Iacer Calixto, Samuel R. Bowman
Abstract: Intermediate-task training has been shown to substantially improve pretrained model performance on many language understanding tasks, at least in monolingual English settings. Here, we investigate whether English intermediate-task training is still helpful on non-English target tasks in a zero-shot cross-lingual setting. Using a set of 7 intermediate language understanding tasks, we evaluate intermediate-task transfer in a zero-shot cross-lingual setting on 9 target tasks from the XTREME benchmark. Intermediate-task training yields large improvements on the BUCC and Tatoeba tasks that use model representations directly without training, and moderate improvements on question-answering target tasks. Using SQuAD for intermediate training achieves the best results across target tasks, with an average improvement of 8.4 points on development sets. Selecting the best intermediate task model for each target task, we obtain a 6.1 point improvement over XLM-R Large on the XTREME benchmark, setting a new state of the art. Finally, we show that neither multi-task intermediate-task training nor continuing multilingual MLM during intermediate-task training offer significant improvements.

21. Comparing BERT against traditional machine learning text classification [PDF] Back to contents
  Santiago González-Carvajal, Eduardo C. Garrido-Merchán
Abstract: The BERT model has arisen as a popular state-of-the-art machine learning model in recent years, able to cope with multiple NLP tasks such as supervised text classification without human supervision. Its flexibility to cope with any type of corpus while delivering great results has made this approach very popular not only in academia but also in industry, although many other approaches have been used over the years with success. In this work, we first present BERT and include a short review of classical NLP approaches. Then, we empirically test, with a suite of experiments covering different scenarios, the behaviour of BERT against traditional TF-IDF features fed to machine learning algorithms. The purpose of this work is to add empirical evidence to support or refute the use of BERT as a default choice for NLP tasks. Experiments show the superiority of BERT and its independence from features of the NLP problem such as the language of the text, adding empirical evidence for using BERT as a default technique in NLP problems.
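For context, the kind of traditional baseline the paper compares BERT against can be assembled in a few lines with scikit-learn: TF-IDF features fed to a linear classifier. The toy corpus and the choice of logistic regression below are illustrative assumptions, not the paper's actual datasets or configuration.

```python
# Illustrative TF-IDF baseline of the kind compared against BERT.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = ["great movie", "terrible plot", "loved it", "boring and slow"]
train_labels = [1, 0, 1, 0]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(train_texts, train_labels)
print(clf.predict(["slow but great"]))
```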

22. Towards an Open Platform for Legal Information [PDF] Back to contents
  Malte Ostendorff, Till Blume, Saskia Ostendorff
Abstract: Recent advances in the area of legal information systems have led to a variety of applications that promise support in processing and accessing legal documents. Unfortunately, these applications have various limitations, e.g., regarding scope or extensibility. Furthermore, we do not observe a trend towards open access in digital libraries in the legal domain as we observe in other domains, e.g., economics or computer science. To improve open access in the legal domain, we present our approach for an open source platform to transparently process and access Legal Open Data. This enables the sustainable development of legal applications by offering a single technology stack. Moreover, the approach facilitates the development and deployment of new technologies. As proof of concept, we implemented six technologies and generated metadata for more than 250,000 German laws and court decisions. Thus, we can provide users of our platform not only with access to legal documents, but also to the information they contain.

23. Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport [PDF] Back to contents
  Kyle Swanson, Lili Yu, Tao Lei
Abstract: Selecting input features of top relevance has become a popular method for building self-explaining models. In this work, we extend this selective rationalization approach to text matching, where the goal is to jointly select and align text pieces, such as tokens or sentences, as a justification for the downstream prediction. Our approach employs optimal transport (OT) to find a minimal cost alignment between the inputs. However, directly applying OT often produces dense and therefore uninterpretable alignments. To overcome this limitation, we introduce novel constrained variants of the OT problem that result in highly sparse alignments with controllable sparsity. Our model is end-to-end differentiable using the Sinkhorn algorithm for OT and can be trained without any alignment annotations. We evaluate our model on the StackExchange, MultiNews, e-SNLI, and MultiRC datasets. Our model achieves very sparse rationale selections with high fidelity while preserving prediction accuracy compared to strong attention baseline models.
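The entropic-regularized optimal transport problem at the heart of this approach is usually solved with the Sinkhorn iteration, sketched below in NumPy with uniform marginals and a random cost matrix. This is plain Sinkhorn, not the paper's constrained variants that additionally enforce controllable sparsity of the alignment.

```python
# Illustrative Sinkhorn iteration for entropic optimal transport between two sets of text spans.
import numpy as np

def sinkhorn(cost, reg=0.1, n_iters=100):
    a = np.full(cost.shape[0], 1.0 / cost.shape[0])  # uniform marginal over side 1
    b = np.full(cost.shape[1], 1.0 / cost.shape[1])  # uniform marginal over side 2
    K = np.exp(-cost / reg)                          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return np.diag(u) @ K @ np.diag(v)               # transport plan = soft alignment weights

cost = np.random.rand(5, 7)  # e.g., 1 - cosine similarity between span embeddings (assumed)
plan = sinkhorn(cost)
print(plan.round(3))
```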

24. A Study of Neural Matching Models for Cross-lingual IR [PDF] Back to contents
  Puxuan Yu, James Allan
Abstract: In this study, we investigate interaction-based neural matching models for ad-hoc cross-lingual information retrieval (CLIR) using cross-lingual word embeddings (CLWEs). With experiments conducted on the CLEF collection over four language pairs, we evaluate and provide insight into different neural model architectures, different ways to represent query-document interactions and word-pair similarity distributions in CLIR. This study paves the way for learning an end-to-end CLIR system using CLWEs.
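A minimal sketch of the query-document interaction matrix that such interaction-based matching models consume is given below, assuming query and document terms have already been mapped into a shared cross-lingual embedding space; random vectors stand in for real CLWEs.

```python
# Illustrative interaction matrix for interaction-based CLIR matching models.
import numpy as np

def interaction_matrix(query_vecs, doc_vecs):
    # Cosine similarity between every query term and every document term.
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    return q @ d.T

query_terms = np.random.randn(3, 300)   # 3 query terms in a 300-dim shared space (assumed)
doc_terms = np.random.randn(50, 300)    # 50 document terms
sim = interaction_matrix(query_terms, doc_terms)
print(sim.shape)                        # (3, 50) grid fed to the neural matching model
```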
