[arXiv Papers] Computation and Language 2020-02-25

Contents

1. Resources for Turkish Dependency Parsing: Introducing the BOUN Treebank and the BoAT Annotation Tool [PDF] Abstract
2. Discriminative Adversarial Search for Abstractive Summarization [PDF] Abstract
3. Multilingual Twitter Corpus and Baselines for Evaluating Demographic Bias in Hate Speech Recognition [PDF] Abstract
4. Low-Resource Knowledge-Grounded Dialogue Generation [PDF] Abstract
5. Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation [PDF] Abstract
6. Semi-Supervised Speech Recognition via Local Prior Matching [PDF] Abstract
7. Word Embeddings Inherently Recover the Conceptual Organization of the Human Mind [PDF] Abstract
8. Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation [PDF] Abstract
9. Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation [PDF] Abstract
10. A Hybrid Approach to Dependency Parsing: Combining Rules and Morphology with Deep Learning [PDF] Abstract
11. Predicting Subjective Features from Questions on QA Websites using BERT [PDF] Abstract
12. GRET: Global Representation Enhanced Transformer [PDF] Abstract
13. Do Multi-Hop Question Answering Systems Know How to Answer the Single-Hop Sub-Questions? [PDF] Abstract
14. A Nepali Rule Based Stemmer and its performance on different NLP applications [PDF] Abstract
15. Fill in the BLANC: Human-free quality estimation of document summaries [PDF] Abstract
16. Unsupervised Question Decomposition for Question Answering [PDF] Abstract
17. Exploiting Typed Syntactic Dependencies for Targeted Sentiment Classification Using Graph Attention Neural Network [PDF] Abstract
18. Incorporating Effective Global Information via Adaptive Gate Attention for Text Classification [PDF] Abstract
19. Machine Translation System Selection from Bandit Feedback [PDF] Abstract
20. Markov Chain Monte-Carlo Phylogenetic Inference Construction in Computational Historical Linguistics [PDF] Abstract
21. Data Augmentation for Copy-Mechanism in Dialogue State Tracking [PDF] Abstract
22. Efficient Sentence Embedding via Semantic Subspace Analysis [PDF] Abstract
23. "Wait, I'm Still Talking!" Predicting the Dialogue Interaction Behavior Using Imagine-Then-Arbitrate Model [PDF] Abstract
24. Emergent Communication with World Models [PDF] Abstract
25. Training Question Answering Models From Synthetic Data [PDF] Abstract
26. Extracting and Validating Explanatory Word Archipelagoes using Dual Entropy [PDF] Abstract
27. Modelling Latent Skills for Multitask Language Generation [PDF] Abstract
28. KBSET -- Knowledge-Based Support for Scholarly Editing and Text Processing with Declarative LaTeX Markup and a Core Written in SWI-Prolog [PDF] Abstract
29. Uncertainty based Class Activation Maps for Visual Question Answering [PDF] Abstract
30. Rhythm, Chord and Melody Generation for Lead Sheets using Recurrent Neural Networks [PDF] Abstract
31. Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learning [PDF] Abstract
32. FONDUE: A Framework for Node Disambiguation Using Network Embeddings [PDF] Abstract
33. Emosaic: Visualizing Affective Content of Text at Varying Granularity [PDF] Abstract
34. Deep Multimodal Image-Text Embeddings for Automatic Cross-Media Retrieval [PDF] Abstract
35. Automata for Hyperlanguages [PDF] Abstract
36. Sketching Transformed Matrices with Applications to Natural Language Processing [PDF] Abstract

Abstracts

1. Resources for Turkish Dependency Parsing: Introducing the BOUN Treebank and the BoAT Annotation Tool [PDF] Back to Contents
  Utku Türk, Furkan Atmaca, Şaziye Betül Özateş, Gözde Berk, Seyyit Talha Bedir, Abdullatif Köksal, Balkız Öztürk Başaran, Tunga Güngör, Arzucan Özgür
Abstract: In this paper, we describe our contributions and efforts to develop Turkish resources, which include a new treebank (BOUN Treebank) with novel sentences, along with the guidelines we adopted and a new annotation tool we developed (BoAT). The manual annotation process we employed was shaped and implemented by a team of four linguists and five NLP specialists. Decisions regarding the annotation of the BOUN Treebank were made in line with the Universal Dependencies framework, which originated from the works of De Marneffe et al. (2014) and Nivre et al. (2016). We took into account the recent unifying efforts based on the re-annotation of other Turkish treebanks in the UD framework (Türk et al., 2019). Through the BOUN Treebank, we introduced a total of 9,757 sentences from various topics including biographical texts, national newspapers, instructional texts, popular culture articles, and essays. In addition, we report the parsing results of a graph-based dependency parser obtained over each text type, the total of the BOUN Treebank, and all Turkish treebanks that we either re-annotated or introduced. We show that a state-of-the-art dependency parser has improved scores for identifying the proper head and the syntactic relationships between the heads and the dependents. In light of these results, we have observed that the unification of the Turkish annotation scheme and the introduction of a more comprehensive treebank improve performance with regard to dependency parsing.

2. Discriminative Adversarial Search for Abstractive Summarization [PDF] Back to Contents
  Thomas Scialom, Paul-Alexis Dray, Sylvain Lamprier, Benjamin Piwowarski, Jacopo Staiano
Abstract: We introduce a novel approach for sequence decoding, Discriminative Adversarial Search (DAS), which has the desirable properties of alleviating the effects of exposure bias without requiring external metrics. Inspired by Generative Adversarial Networks (GANs), wherein a discriminator is used to improve the generator, our method differs from GANs in that the generator parameters are not updated at training time and the discriminator is only used to drive sequence generation at inference time. We investigate the effectiveness of the proposed approach on the task of Abstractive Summarization: the results obtained show that a naive application of DAS improves over the state-of-the-art methods, with further gains obtained via discriminator retraining. Moreover, we show how DAS can be effective for cross-domain adaptation. Finally, all results reported are obtained without additional rule-based filtering strategies, commonly used by the best performing systems available: this indicates that DAS can effectively be deployed without relying on post-hoc modifications of the generated outputs.

3. Multilingual Twitter Corpus and Baselines for Evaluating Demographic Bias in Hate Speech Recognition [PDF] Back to Contents
  Xiaolei Huang, Linzi Xing, Franck Dernoncourt, Michael J. Paul
Abstract: Existing research on fairness evaluation of document classification models mainly uses synthetic monolingual data without ground truth for author demographic attributes. In this work, we assemble and publish a multilingual Twitter corpus for the task of hate speech detection with four inferred author demographic factors: age, country, gender and race/ethnicity. The corpus covers five languages: English, Italian, Polish, Portuguese and Spanish. We evaluate the inferred demographic labels with a crowdsourcing platform, Figure Eight. To examine factors that can cause biases, we conduct an empirical analysis of demographic predictability on the English corpus. We measure the performance of four popular document classifiers and evaluate the fairness and bias of the baseline classifiers on the author-level demographic attributes.

4. Low-Resource Knowledge-Grounded Dialogue Generation [PDF] Back to Contents
  Xueliang Zhao, Wei Wu, Chongyang Tao, Can Xu, Dongyan Zhao, Rui Yan
Abstract: Responding with knowledge has been recognized as an important capability for an intelligent conversational agent. Yet knowledge-grounded dialogues, as training data for learning such a response generation model, are difficult to obtain. Motivated by the challenge in practice, we consider knowledge-grounded dialogue generation under a natural assumption that only limited training examples are available. In such a low-resource setting, we devise a disentangled response decoder in order to isolate parameters that depend on knowledge-grounded dialogues from the entire generation model. By this means, the major part of the model can be learned from a large number of ungrounded dialogues and unstructured documents, while the remaining small parameters can be well fitted using the limited training examples. Evaluation results on two benchmarks indicate that with only 1/8 training data, our model can achieve the state-of-the-art performance and generalize well on out-of-domain knowledge.

5. Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation [PDF] Back to Contents
  Yige Xu, Xipeng Qiu, Ligao Zhou, Xuanjing Huang
Abstract: Fine-tuning pre-trained language models like BERT has become an effective way in NLP and yields state-of-the-art results on many downstream tasks. Recent studies on adapting BERT to new tasks mainly focus on modifying the model structure, re-designing the pre-train tasks, and leveraging external data and knowledge. The fine-tuning strategy itself has yet to be fully explored. In this paper, we improve the fine-tuning of BERT with two effective mechanisms: self-ensemble and self-distillation. The experiments on text classification and natural language inference tasks show our proposed methods can significantly improve the adaption of BERT without any external data or knowledge.
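
The two mechanisms described are concrete enough to sketch. Below is a minimal PyTorch-style sketch assuming a HuggingFace-style `model(...).logits` interface; the exponential moving average stands in for the paper's ensemble of recent checkpoints, and the `lam` weight and `decay` rate are illustrative assumptions, not the authors' exact recipe.

```python
import torch
import torch.nn.functional as F

def self_distill_step(student, teacher, input_ids, labels, optimizer,
                      lam=1.0, decay=0.999):
    """One fine-tuning step with self-distillation from a self-ensemble teacher.
    `teacher` is a frozen copy whose weights track a moving average of the
    student's weights (an EMA stand-in for averaging recent checkpoints)."""
    logits = student(input_ids).logits
    with torch.no_grad():
        teacher_logits = teacher(input_ids).logits
    # Task loss plus a distillation term pulling the student toward its own ensemble.
    loss = F.cross_entropy(logits, labels) + lam * F.mse_loss(logits, teacher_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Self-ensemble update: fold the new student weights into the teacher.
    with torch.no_grad():
        for p_t, p_s in zip(teacher.parameters(), student.parameters()):
            p_t.mul_(decay).add_(p_s, alpha=1.0 - decay)
    return loss.item()
```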

6. Semi-Supervised Speech Recognition via Local Prior Matching [PDF] Back to Contents
  Wei-Ning Hsu, Ann Lee, Gabriel Synnaeve, Awni Hannun
Abstract: For sequence transduction tasks like speech recognition, a strong structured prior model encodes rich information about the target space, implicitly ruling out invalid sequences by assigning them low probability. In this work, we propose local prior matching (LPM), a semi-supervised objective that distills knowledge from a strong prior (e.g. a language model) to provide learning signal to a discriminative model trained on unlabeled speech. We demonstrate that LPM is theoretically well-motivated, simple to implement, and superior to existing knowledge distillation techniques under comparable settings. Starting from a baseline trained on 100 hours of labeled speech, with an additional 360 hours of unlabeled data, LPM recovers 54% and 73% of the word error rate on clean and noisy test sets relative to a fully supervised model on the same data.
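
At its core, the objective treats the prior's scores over hypotheses for an unlabeled utterance as soft targets. A heavily simplified sketch under assumptions: the hypothesis list comes from beam search, both models return log-probabilities per hypothesis, and the paper's exact weighting and normalization details are elided.

```python
import torch
import torch.nn.functional as F

def local_prior_matching_loss(model_logprobs, prior_logprobs):
    """Simplified LPM-style objective for one unlabeled utterance.
    model_logprobs: log p_model(y|x) for B beam hypotheses, shape [B]
    prior_logprobs: log p_LM(y) for the same hypotheses, shape [B]"""
    targets = F.softmax(prior_logprobs, dim=0)   # prior renormalized over the beam
    return -(targets * model_logprobs).sum()     # cross-entropy against the prior
```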

7. Word Embeddings Inherently Recover the Conceptual Organization of the Human Mind [PDF] Back to Contents
  Victor Swift
Abstract: Machine learning is a means to uncover deep patterns from rich sources of data. Here, we find that machine learning can recover the conceptual organization of the human mind when applied to the natural language use of millions of people. Utilizing text from billions of webpages, we recover most of the concepts contained in English, Dutch, and Japanese, as represented in large scale Word Association networks. Our results justify machine learning as a means to probe the human mind, at a depth and scale that has been unattainable using self-report and observational methods. Beyond direct psychological applications, our methods may prove useful for projects concerned with defining, assessing, relating, or uncovering concepts in any scientific field.

8. Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation [PDF] Back to Contents
  Alessandro Raganato, Yves Scherrer, Jörg Tiedemann
Abstract: Transformer-based models have brought a radical change to neural machine translation. A key feature of the Transformer architecture is the so-called multi-head attention mechanism, which allows the model to focus simultaneously on different parts of the input. However, recent works have shown that attention heads learn simple positional patterns which are often redundant. In this paper, we propose to replace all but one attention head of each encoder layer with fixed -- non-learnable -- attentive patterns that are solely based on position and do not require any external knowledge. Our experiments show that fixing the attention heads on the encoder side of the Transformer at training time does not impact the translation quality and even increases BLEU scores by up to 3 points in low-resource scenarios.
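
The positional patterns in question can be hard-coded as attention matrices. A small sketch of one such fixed head, with the clamp-at-boundaries behavior as an illustrative assumption:

```python
import torch

def fixed_offset_attention(seq_len, offset):
    """A non-learnable attention matrix in which token i attends to token
    i + offset (offset -1: previous token, 0: self, +1: next token),
    clamped at the sequence boundaries. It replaces softmax(QK^T / sqrt(d))
    for all but one head of each encoder layer."""
    attn = torch.zeros(seq_len, seq_len)
    for i in range(seq_len):
        j = min(max(i + offset, 0), seq_len - 1)
        attn[i, j] = 1.0
    return attn
```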

9. Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation [PDF] Back to Contents
  Xiaocheng Feng, Yawei Sun, Bing Qin, Heng Gong, Yibo Sun, Wei Bi, Xiaojiang Liu, Ting Liu
Abstract: In this paper, we focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer and aims to preserve text styles while altering the content. In detail, the input is a set of structured records and a reference text for describing another recordset. The output is a summary that accurately describes the partial content in the source recordset with the same writing style as the reference. The task is unsupervised due to the lack of parallel data, and it is challenging to select suitable records and style words from the bi-aspect inputs and to generate a high-fidelity long document. To tackle those problems, we first build a dataset based on a basketball game report corpus as our testbed, and present an unsupervised neural model with an interactive attention mechanism, which is used for learning the semantic relationship between records and reference texts to achieve better content transfer and better style preservation. In addition, we also explore the effectiveness of back-translation in our task for constructing pseudo-training pairs. Empirical results show the superiority of our approaches over competitive methods, and the models also yield a new state-of-the-art result on a sentence-level dataset.

10. A Hybrid Approach to Dependency Parsing: Combining Rules and Morphology with Deep Learning [PDF] Back to Contents
  Şaziye Betül Özateş, Arzucan Özgür, Tunga Güngör, Balkız Öztürk
Abstract: Fully data-driven, deep learning-based models are usually designed as language-independent and have been shown to be successful for many natural language processing tasks. However, when the studied language is low-resourced and the amount of training data is insufficient, these models can benefit from the integration of natural language grammar-based information. We propose two approaches to dependency parsing, especially for languages with a restricted amount of training data. Our first approach combines a state-of-the-art deep learning-based parser with a rule-based approach, and the second one incorporates morphological information into the parser. In the rule-based approach, the parsing decisions made by the rules are encoded and concatenated with the vector representations of the input words as additional information to the deep network. The morphology-based approach proposes different methods to include the morphological structure of words into the parser network. Experiments are conducted on the IMST-UD Treebank, and the results suggest that integrating explicit knowledge about the target language into a neural parser through a rule-based parsing system and morphological analysis leads to more accurate annotations and hence increases the parsing performance in terms of attachment scores. The proposed methods are developed for Turkish, but can be adapted to other languages as well.

11. Predicting Subjective Features from Questions on QA Websites using BERT [PDF] Back to Contents
  Issa Annamoradnejad, Mohammadamin Fazli, Jafar Habibi
Abstract: Modern Question-Answering websites, such as StackOverflow and Quora, have specific user rules to maintain their content quality. These systems rely on user reports for assessing new content, which has serious problems including the slow handling of violations, the loss of normal and experienced users' time, the low quality of some reports, and discouraging feedback to new users. Therefore, with the overall goal of providing solutions for automating moderation actions in Q&A websites, we aim to provide a model to predict 20 quality or subjective aspects of questions on QA websites. To this end, we used data gathered by the CrowdSource team at Google Research in 2019 and fine-tuned a pre-trained BERT model on our problem. The model achieves 95.4% accuracy after two epochs of training and does not improve substantially in subsequent epochs. The results confirm that by simple fine-tuning, we can achieve accurate models in little time and with a small amount of data.

12. GRET: Global Representation Enhanced Transformer [PDF] Back to Contents
  Rongxiang Weng, Haoran Wei, Shujian Huang, Heng Yu, Lidong Bing, Weihua Luo, Jiajun Chen
Abstract: Transformer, based on the encoder-decoder framework, has achieved state-of-the-art performance on several natural language generation tasks. The encoder maps the words in the input sentence into a sequence of hidden states, which are then fed into the decoder to generate the output sentence. These hidden states usually correspond to the input words and focus on capturing local information. However, the global (sentence level) information is seldom explored, leaving room for the improvement of generation quality. In this paper, we propose a novel global representation enhanced Transformer (GRET) to explicitly model global representation in the Transformer network. Specifically, in the proposed model, an external state is generated for the global representation from the encoder. The global representation is then fused into the decoder during the decoding process to improve generation quality. We conduct experiments in two text generation tasks: machine translation and text summarization. Experimental results on four WMT machine translation tasks and LCSTS text summarization task demonstrate the effectiveness of the proposed approach on natural language generation.

13. Do Multi-Hop Question Answering Systems Know How to Answer the Single-Hop Sub-Questions? [PDF] Back to Contents
  Yixuan Tang, Hwee Tou Ng, Anthony K.H. Tung
Abstract: Multi-hop question answering (QA) requires a model to retrieve and integrate information from different parts of a long text to answer a question. Humans answer this kind of complex questions via a divide-and-conquer approach. In this paper, we investigate whether top-performing models for multi-hop questions understand the underlying sub-questions like humans. We adopt a neural decomposition model to generate sub-questions for a multi-hop complex question, followed by extracting the corresponding sub-answers. We show that multiple state-of-the-art multi-hop QA models fail to correctly answer a large portion of sub-questions, although their corresponding multi-hop questions are correctly answered. This indicates that these models manage to answer the multi-hop questions using some partial clues, instead of truly understanding the reasoning paths. We also propose a new model which significantly improves the performance on answering the sub-questions. Our work takes a step forward towards building a more explainable multi-hop QA system.

14. A Nepali Rule Based Stemmer and its performance on different NLP applications [PDF] Back to Contents
  Pravesh Koirala, Aman Shakya
Abstract: Stemming is an integral part of Natural Language Processing (NLP). It is a preprocessing step in almost every NLP application. Arguably, the most important usage of stemming is in Information Retrieval (IR). While there is a large body of work on stemming in languages like English, Nepali stemming has only a few works. This study focuses on creating a rule-based stemmer for Nepali text. Specifically, it is an affix stripping system that identifies two different classes of suffixes in Nepali grammar and strips them separately. Only a single negativity prefix (Na) is identified and stripped. This study focuses on a number of techniques, such as exception word identification, morphological normalization and word transformation, to increase stemming performance. The stemmer is tested intrinsically using Paice's method and extrinsically on a basic tf-idf based IR system and an elementary news topic classifier using a Multinomial Naive Bayes Classifier. The difference in performance of these systems with and without the stemmer is analysed.
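
The stripping logic described, two suffix classes handled separately plus one negativity prefix, reduces to a short routine. A sketch under assumptions: the suffix inventories are caller-supplied (the paper's actual Nepali suffix lists are not given in the abstract), and the morphological normalization and word transformation steps are elided.

```python
def strip_affixes(word, class1_suffixes, class2_suffixes,
                  neg_prefix="\u0928", exceptions=frozenset()):
    """Rule-based affix stripping: strip the two suffix classes separately,
    then the single negativity prefix (Devanagari 'Na')."""
    if word in exceptions:        # exception words are left untouched
        return word
    for suffixes in (class1_suffixes, class2_suffixes):
        for suf in sorted(suffixes, key=len, reverse=True):  # longest match first
            if word.endswith(suf) and len(word) > len(suf):
                word = word[:-len(suf)]
                break
    if word.startswith(neg_prefix) and len(word) > len(neg_prefix):
        word = word[len(neg_prefix):]
    return word
```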

15. Fill in the BLANC: Human-free quality estimation of document summaries [PDF] Back to Contents
  Oleg Vasilyev, Vedant Dharnidharka, John Bohannon
Abstract: We present BLANC, a new approach to the automatic estimation of document summary quality. Our goal is to measure the functional performance of a summary with an objective, reproducible, and fully automated method. Our approach achieves this by measuring the performance boost gained by a pre-trained language model with access to a document summary while carrying out its language understanding task on the document's text. We present evidence that BLANC scores have at least as good correlation with human evaluations as do the ROUGE family of summary quality measurements. And unlike ROUGE, the BLANC method does not require human-written reference summaries, allowing for fully human-free summary quality estimation.
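
The measurement can be sketched directly from this description: compare a masked language model's token-reconstruction accuracy on the document's sentences with and without the summary as context. In the sketch below, `masked_accuracy` is a hypothetical helper (mask tokens in `sentence`, let the model fill them in given `context`, return the fraction recovered); it is not part of any real library, and the filler construction is an assumption.

```python
def blanc_style_score(masked_accuracy, summary, sentences, filler="."):
    """Human-free summary quality estimate: the average boost in masked-token
    reconstruction that the summary provides over an uninformative filler."""
    boost = 0.0
    for sent in sentences:
        with_summary = masked_accuracy(context=summary, sentence=sent)
        with_filler = masked_accuracy(context=filler * len(summary), sentence=sent)
        boost += with_summary - with_filler
    return boost / len(sentences)
```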

16. Unsupervised Question Decomposition for Question Answering [PDF] Back to Contents
  Ethan Perez, Patrick Lewis, Wen-tau Yih, Kyunghyun Cho, Douwe Kiela
Abstract: We aim to improve question answering (QA) by decomposing hard questions into easier sub-questions that existing QA systems can answer. Since collecting labeled decompositions is cumbersome, we propose an unsupervised approach to produce sub-questions. Specifically, by leveraging >10M questions from Common Crawl, we learn to map from the distribution of multi-hop questions to the distribution of single-hop sub-questions. We answer sub-questions with an off-the-shelf QA model and incorporate the resulting answers in a downstream, multi-hop QA system. On a popular multi-hop QA dataset, HotpotQA, we show large improvements over a strong baseline, especially on adversarial and out-of-domain questions. Our method is generally applicable and automatically learns to decompose questions of different classes, while matching the performance of decomposition methods that rely heavily on hand-engineering and annotation.

17. Exploiting Typed Syntactic Dependencies for Targeted Sentiment Classification Using Graph Attention Neural Network [PDF] Back to Contents
  Xuefeng Bai, Pengbo Liu, Yue Zhang
Abstract: Targeted sentiment classification predicts the sentiment polarity on given target mentions in input texts. Dominant methods employ neural networks for encoding the input sentence and extracting relations between target mentions and their contexts. Recently, graph neural network has been investigated for integrating dependency syntax for the task, achieving the state-of-the-art results. However, existing methods do not consider dependency label information, which can be intuitively useful. To solve the problem, we investigate a novel relational graph attention network that integrates typed syntactic dependency information. Results on standard benchmarks show that our method can effectively leverage label information for improving targeted sentiment classification performances. Our final model significantly outperforms state-of-the-art syntax-based approaches.

18. Incorporating Effective Global Information via Adaptive Gate Attention for Text Classification [PDF] Back to Contents
  Xianming Li, Zongxi Li, Yingbin Zhao, Haoran Xie, Qing Li
Abstract: The dominant text classification studies focus on training classifiers using textual instances only or on introducing external knowledge (e.g., hand-crafted features and domain expert knowledge). In contrast, some corpus-level statistical features, like word frequency and distribution, are not well exploited. Our work shows that such simple statistical information can enhance classification performance both efficiently and significantly compared with several baseline models. In this paper, we propose a classifier with a gate mechanism named the Adaptive Gate Attention model with Global Information (AGA+GI), in which the adaptive gate mechanism incorporates global statistical features into latent semantic features and the attention layer captures dependency relationships within the sentence. To alleviate the overfitting issue, we propose a novel Leaky Dropout mechanism to improve generalization ability and performance stability. Our experiments show that the proposed method can achieve better accuracy than CNN-based and RNN-based approaches without global information on several benchmarks.

19. Machine Translation System Selection from Bandit Feedback [PDF] Back to Contents
  Jason Naradowsky, Xuan Zhang, Kevin Duh
Abstract: Adapting machine translation systems in the real world is a difficult problem. In contrast to offline training, users cannot provide the type of fine-grained feedback typically used for improving the system. Moreover, users have different translation needs, and even a single user's needs may change over time. In this work we take a different approach, treating the problem of adapting as one of selection. Instead of adapting a single system, we train many translation systems using different architectures and data partitions. Using bandit learning techniques on simulated user feedback, we learn a policy to choose which system to use for a particular translation task. We show that our approach can (1) quickly adapt to address domain changes in translation tasks, (2) outperform the single best system in mixed-domain translation tasks, and (3) make effective instance-specific decisions when using contextual bandit strategies.
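
As a concrete illustration of the selection setting (a simple stand-in, not the authors' exact learner), an epsilon-greedy bandit over candidate translation systems, with user feedback as the reward signal:

```python
import random

class SystemSelector:
    """Epsilon-greedy bandit over n_systems candidate MT systems."""
    def __init__(self, n_systems, epsilon=0.1):
        self.q = [0.0] * n_systems   # running mean reward per system
        self.n = [0] * n_systems
        self.epsilon = epsilon

    def choose(self):
        if random.random() < self.epsilon:
            return random.randrange(len(self.q))                 # explore
        return max(range(len(self.q)), key=self.q.__getitem__)   # exploit

    def update(self, arm, reward):
        self.n[arm] += 1
        self.q[arm] += (reward - self.q[arm]) / self.n[arm]      # incremental mean
```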

20. Markov Chain Monte-Carlo Phylogenetic Inference Construction in Computational Historical Linguistics [PDF] Back to Contents
  Tianyi Ni
Abstract: More and more of the world's languages are under study nowadays; as a result, the traditional way of conducting historical linguistics research is facing some challenges. For example, comparative linguistic research among languages needs manual annotation, which becomes more and more impractical with the increasing amount of language data coming out all around the world. Although they can hardly replace linguists' work, automatic computational methods have been taken into consideration, and they can help people reduce their workload. One of the most important tasks in historical linguistics is to compare words from different languages and find the cognate words among them, that is, to figure out whether two languages are related to each other or not. In this paper, I use computational methods to cluster the languages and use the Markov Chain Monte Carlo (MCMC) method to build the language typology relationship tree based on the clusters.
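
The MCMC machinery referenced here follows the standard Metropolis-Hastings pattern: propose a perturbed tree, score it, accept or reject. A generic skeleton under assumptions: the proposal is symmetric (so the Hastings correction cancels), and `log_posterior` and `propose` are placeholders for the linguistic likelihood and the tree move.

```python
import math
import random

def metropolis_hastings(init_tree, log_posterior, propose, n_steps=10000):
    """Generic MH sampler over trees with a symmetric proposal
    (e.g., swapping two subtrees)."""
    tree, logp = init_tree, log_posterior(init_tree)
    samples = []
    for _ in range(n_steps):
        candidate = propose(tree)
        cand_logp = log_posterior(candidate)
        if math.log(random.random()) < cand_logp - logp:  # accept w.p. min(1, ratio)
            tree, logp = candidate, cand_logp
        samples.append(tree)
    return samples
```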

21. Data Augmentation for Copy-Mechanism in Dialogue State Tracking [PDF] Back to Contents
  Xiaohui Song, Liangjun Zang, Yipeng Su, Xing Wu, Jizhong Han, Songlin Hu
Abstract: While several state-of-the-art approaches to dialogue state tracking (DST) have shown promising performance on several benchmarks, there is still a significant performance gap between seen slot values (i.e., values that occur in both the training set and the test set) and unseen ones (values that occur in the test set but not in the training set). Recently, the copy-mechanism has been widely used in DST models to handle unseen slot values, copying slot values directly from the user utterance. In this paper, we aim to find out the factors that influence the generalization ability of a common copy-mechanism model for DST. Our key observations include: 1) the copy-mechanism tends to memorize values rather than infer them from contexts, which is the primary reason for unsatisfactory generalization performance; 2) greater diversity of slot values in the training set increases the performance on unseen values but slightly decreases the performance on seen values. Moreover, we propose a simple but effective data augmentation algorithm to train copy-mechanism models, which augments the input dataset by copying user utterances and replacing the real slot values with randomly generated strings. Users can use two hyper-parameters to realize a trade-off between the performance on seen values and unseen ones, as well as a trade-off between overall performance and computational cost. Experimental results on three widely used datasets (WoZ 2.0, DSTC2, and Multi-WoZ 2.0) show the effectiveness of our approach.
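
The augmentation itself is simple enough to state in a few lines. A sketch of the described replacement, with the random-string alphabet and length as illustrative assumptions:

```python
import random
import string

def augment_utterance(utterance, slot_value):
    """Copy the user utterance and replace the real slot value with a random
    string, so the copy-mechanism must rely on context rather than on
    memorized values. Returns the new utterance and its new label."""
    fake = "".join(random.choices(string.ascii_lowercase,
                                  k=max(3, len(slot_value))))
    return utterance.replace(slot_value, fake), fake
```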

22. Efficient Sentence Embedding via Semantic Subspace Analysis [PDF] Back to Contents
  Bin Wang, Fenxiao Chen, Yuncheng Wang, C.-C. Jay Kuo
Abstract: A novel sentence embedding method built upon semantic subspace analysis, called semantic subspace sentence embedding (S3E), is proposed in this work. Given the fact that word embeddings can capture semantic relationships while semantically similar words tend to form semantic groups in a high-dimensional embedding space, we develop a sentence representation scheme by analyzing the semantic subspaces of its constituent words. Specifically, we construct a sentence model from two aspects. First, we represent words that lie in the same semantic group using the intra-group descriptor. Second, we characterize the interaction between multiple semantic groups with the inter-group descriptor. The proposed S3E method is evaluated on both textual similarity tasks and supervised tasks. Experimental results show that it offers comparable or better performance than the state-of-the-art. The complexity of our S3E method is also much lower than that of other parameterized models.

23. "Wait, I'm Still Talking!" Predicting the Dialogue Interaction Behavior Using Imagine-Then-Arbitrate Model [PDF] 返回目录
  Zehao Lin, Xiaoming Kang, Guodun Li, Feng Ji, Haiqing Chen, Yin Zhang
Abstract: Producing natural and accurate responses like human beings is the ultimate goal of intelligent dialogue agents. So far, most past works have concentrated on selecting or generating one pertinent and fluent response according to the current query and its context. These models work in a one-to-one environment, making one response to one utterance each round. However, in real human-human conversations, humans often sequentially send several short messages for readability instead of one long message in a single turn. Thus messages do not end with an explicit ending signal, making it crucial for agents to decide when to reply. So the first step for an intelligent dialogue agent is not replying but deciding whether it should reply at the moment. To address this issue, in this paper, we propose a novel Imagine-then-Arbitrate (ITA) neural dialogue model to help the agent decide whether to wait or to make a response directly. Our method has two imaginator modules and an arbitrator module. The two imaginators learn the agent's and the user's speaking styles respectively and generate possible utterances as the input of the arbitrator, combined with the dialogue history. The arbitrator then decides whether to wait or to respond to the user directly. To verify the performance and effectiveness of our method, we prepared two dialogue datasets and compared our approach with several popular models. Experimental results show that our model performs well on the ending-prediction issue and outperforms the baseline models.

24. Emergent Communication with World Models [PDF] Back to Contents
  Alexander I. Cowen-Rivers, Jason Naradowsky
Abstract: We introduce Language World Models, a class of language-conditional generative models which interpret natural language messages by predicting latent codes of future observations. This provides a visual grounding of the message, similar to an enhanced observation of the world, which may include objects outside of the listening agent's field-of-view. We incorporate this "observation" into a persistent memory state, and allow the listening agent's policy to condition on it, akin to the relationship between memory and controller in a World Model. We show this improves effective communication and task success in 2D gridworld speaker-listener navigation tasks. In addition, we develop two losses framed specifically for our model-based formulation to promote positive signalling and positive listening. Finally, because messages are interpreted in a generative model, we can visualize the model's beliefs to gain insight into how the communication channel is utilized.

25. Training Question Answering Models From Synthetic Data [PDF] Back to Contents
  Raul Puri, Ryan Spring, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro
Abstract: Question and answer generation is a data augmentation method that aims to improve question answering (QA) models given the limited amount of human labeled data. However, a considerable gap remains between synthetic and human-generated question-answer pairs. This work aims to narrow this gap by taking advantage of large language models and explores several factors such as model size, quality of pretrained models, scale of data synthesized, and algorithmic choices. On the SQuAD1.1 question answering task, we achieve higher accuracy using solely synthetic questions and answers than when using the SQuAD1.1 training set questions alone. Removing access to real Wikipedia data, we synthesize questions and answers from a synthetic corpus generated by an 8.3 billion parameter GPT-2 model. With no access to human supervision and only access to other models, we are able to train state of the art question answering networks on entirely model-generated data that achieve 88.4 Exact Match (EM) and 93.9 F1 score on the SQuAD1.1 dev set. We further apply our methodology to SQuAD2.0 and show a 2.8 absolute gain on EM score compared to prior work using synthetic data.

26. Extracting and Validating Explanatory Word Archipelagoes using Dual Entropy [PDF] Back to Contents
  Yukio Ohsawa, Teruaki Hayashi
Abstract: The logical connectivity of text is represented by the connectivity of words that form archipelagoes. Here, each archipelago is a sequence of islands of the occurrences of a certain word. An island here means the local sequence of sentences where the word is emphasized, and an archipelago of a length comparable to the target text is extracted using the co-variation of entropy A (the window-based entropy) on the distribution of the word's occurrences with the width of each time window. Then, the logical connectivity of text is evaluated on entropy B (the graph-based entropy) computed on the distribution of sentences to connected word-clusters obtained on the co-occurrence of words. The results show the parts of the target text with words forming archipelagoes extracted on entropy A, without learned or prepared knowledge, form an explanatory part of the text that is of smaller entropy B than the parts extracted by the baseline methods.
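
One plausible reading of "entropy A (the window-based entropy)" is the entropy of a word's occurrence counts over fixed-width windows of the text; the sketch below follows that reading (the paper's exact definition, including how the co-variation with window width is computed, is not spelled out in the abstract).

```python
import math

def window_entropy(occurrence_positions, text_len, width):
    """Entropy of a word's occurrence distribution over windows of `width`
    sentences; sweeping `width` gives the profile used to extract archipelagoes."""
    counts = [0] * (text_len // width + 1)
    for pos in occurrence_positions:
        counts[pos // width] += 1
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c)
```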

27. Modelling Latent Skills for Multitask Language Generation [PDF] Back to Contents
  Kris Cao, Dani Yogatama
Abstract: We present a generative model for multitask conditional language generation. Our guiding hypothesis is that a shared set of latent skills underlies many disparate language generation tasks, and that explicitly modelling these skills in a task embedding space can help with both positive transfer across tasks and with efficient adaptation to new tasks. We instantiate this task embedding space as a latent variable in a latent variable sequence-to-sequence model. We evaluate this hypothesis by curating a series of monolingual text-to-text language generation datasets - covering a broad range of tasks and domains and comparing the performance of models both in the multitask and few-shot regimes. We show that our latent task variable model outperforms other sequence-to-sequence baselines on average across tasks in the multitask setting. In the few-shot learning setting on an unseen test dataset (i.e., a new task), we demonstrate that model adaptation based on inference in the latent task space is more robust than standard fine-tuning based parameter adaptation and performs comparably in terms of overall performance. Finally, we examine the latent task representations learnt by our model and show that they cluster tasks in a natural way.

28. KBSET -- Knowledge-Based Support for Scholarly Editing and Text Processing with Declarative LaTeX Markup and a Core Written in SWI-Prolog [PDF] Back to Contents
  Jana Kittelmann, Christoph Wernhard
Abstract: KBSET is an environment that provides support for scholarly editing in two flavors: First, as a practical tool KBSET/Letters that accompanies the development of editions of correspondences (in particular from the 18th and 19th century), completely from source documents to PDF and HTML presentations. Second, as a prototypical tool KBSET/NER for experimentally investigating novel forms of working on editions that are centered around automated named entity recognition. KBSET can process declarative application-specific markup that is expressed in LaTeX notation and incorporate large external fact bases that are typically provided in RDF. KBSET includes specially developed LaTeX styles and a core system that is written in SWI-Prolog, which is used there in many roles, utilizing that it realizes the potential of Prolog as a unifying language.

29. Uncertainty based Class Activation Maps for Visual Question Answering [PDF] Back to Contents
  Badri N. Patro, Mayank Lunayach, Vinay P. Namboodiri
Abstract: Understanding and explaining deep learning models is an imperative task. Towards this, we propose a method that obtains gradient-based certainty estimates that also provide visual attention maps. Particularly, we solve for visual question answering task. We incorporate modern probabilistic deep learning methods that we further improve by using the gradients for these estimates. These have two-fold benefits: a) improvement in obtaining the certainty estimates that correlate better with misclassified samples and b) improved attention maps that provide state-of-the-art results in terms of correlation with human attention regions. The improved attention maps result in consistent improvement for various methods for visual question answering. Therefore, the proposed technique can be thought of as a recipe for obtaining improved certainty estimates and explanations for deep learning models. We provide detailed empirical analysis for the visual question answering task on all standard benchmarks and comparison with state of the art methods.

30. Rhythm, Chord and Melody Generation for Lead Sheets using Recurrent Neural Networks [PDF] Back to Contents
  Cedric De Boom, Stephanie Van Laere, Tim Verbelen, Bart Dhoedt
Abstract: Music that is generated by recurrent neural networks often lacks a sense of direction and coherence. We therefore propose a two-stage LSTM-based model for lead sheet generation, in which the harmonic and rhythmic templates of the song are produced first, after which, in a second stage, a sequence of melody notes is generated conditioned on these templates. A subjective listening test shows that our approach outperforms the baselines and increases perceived musical coherence.

31. Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learning [PDF] Back to Contents
  Wei Ye, Rui Xie, Jinglei Zhang, Tianxiang Hu, Xiaoyin Wang, Shikun Zhang
Abstract: Code summarization generates a brief natural language description given a source code snippet, while code retrieval fetches relevant source code given a natural language query. Since both tasks aim to model the association between natural language and programming language, recent studies have combined these two tasks to improve their performance. However, researchers have not yet been able to effectively leverage the intrinsic connection between the two tasks, as they train these tasks in a separate or pipelined manner, which means their performance cannot be well balanced. In this paper, we propose a novel end-to-end model for the two tasks by introducing an additional code generation task. More specifically, we explicitly exploit the probabilistic correlation between code summarization and code generation with dual learning, and utilize the two encoders for code summarization and code generation to train the code retrieval task via multi-task learning. We have carried out extensive experiments on an existing dataset of SQL and Python, and the results show that our model can significantly improve the results of the code retrieval task over state-of-the-art models, as well as achieve competitive performance in terms of BLEU score for the code summarization task.

32. FONDUE: A Framework for Node Disambiguation Using Network Embeddings [PDF] Back to Contents
  Ahmad Mel, Bo Kang, Jefrey Lijffijt, Tijl De Bie
Abstract: Real-world data often presents itself in the form of a network. Examples include social networks, citation networks, biological networks, and knowledge graphs. In their simplest form, networks represent real-life entities (e.g. people, papers, proteins, concepts) as nodes, and describe them in terms of their relations with other entities by means of edges between these nodes. This can be valuable for a range of purposes from the study of information diffusion to bibliographic analysis, bioinformatics research, and question-answering. The quality of networks is often problematic though, affecting downstream tasks. This paper focuses on the common problem where a node in the network in fact corresponds to multiple real-life entities. In particular, we introduce FONDUE, an algorithm based on network embedding for node disambiguation. Given a network, FONDUE identifies nodes that correspond to multiple entities, for subsequent splitting. Extensive experiments on twelve benchmark datasets demonstrate that FONDUE is substantially and uniformly more accurate for ambiguous node identification compared to the existing state-of-the-art, at a comparable computational cost, while less optimal for determining the best way to split ambiguous nodes.

33. Emosaic: Visualizing Affective Content of Text at Varying Granularity [PDF] Back to Contents
  Philipp Geuder, Marie Claire Leidinger, Martin von Lupin, Marian Dörk, Tobias Schröder
Abstract: This paper presents Emosaic, a tool for visualizing the emotional tone of text documents, considering multiple dimensions of emotion and varying levels of semantic granularity. Emosaic is grounded in psychological research on the relationship between language, affect, and color perception. We capitalize on an established three-dimensional model of human emotion: valence (good, nice vs. bad, awful), arousal (calm, passive vs. exciting, active) and dominance (weak, controlled vs. strong, in control). Previously, multi-dimensional models of emotion have been used rarely in visualizations of textual data, due to the perceptual challenges involved. Furthermore, until recently most text visualizations remained at a high level, precluding closer engagement with the deep semantic content of the text. Informed by empirical studies, we introduce a color mapping that translates any point in three-dimensional affective space into a unique color. Emosaic uses affective dictionaries of words annotated with the three emotional parameters of the valence-arousal-dominance model to extract emotional meanings from texts and then assigns to them corresponding color parameters of the hue-saturation-brightness color space. This approach of mapping emotion to color is aimed at helping readers to more easily grasp the emotional tone of the text. Several features of Emosaic allow readers to interactively explore the affective content of the text in more detail; e.g., in aggregated form as histograms, in sequential form following the order of text, and in detail embedded into the text display itself. Interaction techniques have been included to allow for filtering and navigating of text and visualizations.

34. Deep Multimodal Image-Text Embeddings for Automatic Cross-Media Retrieval [PDF] Back to Contents
  Hadi Abdi Khojasteh, Ebrahim Ansari, Parvin Razzaghi, Akbar Karimi
Abstract: This paper considers the task of matching images and sentences by learning a visual-textual embedding space for cross-modal retrieval. Finding such a space is a challenging task since the features and representations of text and image are not comparable. In this work, we introduce an end-to-end deep multimodal convolutional-recurrent network for learning both vision and language representations simultaneously to infer image-text similarity. The model learns which pairs are a match (positive) and which ones are a mismatch (negative) using a hinge-based triplet ranking. To learn about the joint representations, we leverage our newly extracted collection of tweets from Twitter. The main characteristic of our dataset is that the images and tweets are not standardized the same as the benchmarks. Furthermore, there can be a higher semantic correlation between the pictures and tweets contrary to benchmarks in which the descriptions are well-organized. Experimental results on MS-COCO benchmark dataset show that our model outperforms certain methods presented previously and has competitive performance compared to the state-of-the-art. The code and dataset have been made available publicly.

35. Automata for Hyperlanguages [PDF] Back to Contents
  Borzoo Bonakdarpour, Sarai Sheinvald
Abstract: Hyperproperties lift conventional trace properties from a set of execution traces to a set of sets of execution traces. Hyperproperties have been shown to be a powerful formalism for expressing and reasoning about information-flow security policies and important properties of cyber-physical systems such as sensitivity and robustness, as well as consistency conditions in distributed computing such as linearizability. Although there is an extensive body of work on automata-based representation of trace properties, we currently lack such a characterization for hyperproperties. We introduce hyperautomata for hyperlanguages, which are languages over sets of words. Essentially, hyperautomata allow running multiple quantified words over an automaton. We propose a specific type of hyperautomata called nondeterministic finite hyperautomata (NFH), which accept regular hyperlanguages. We demonstrate the ability of regular hyperlanguages to express hyperproperties for finite traces. We then explore the fundamental properties of NFH and show their closure under the Boolean operations. We show that while nonemptiness is undecidable in general, it is decidable for several fragments of NFH. We further show the decidability of the membership problem for finite sets and regular languages for NFH, as well as the containment problem for several fragments of NFH. Finally, we introduce learning algorithms based on Angluin's L-star algorithm for the fragments of NFH in which the quantification is either strictly universal or strictly existential.

36. Sketching Transformed Matrices with Applications to Natural Language Processing [PDF] Back to Contents
  Yingyu Liang, Zhao Song, Mengdi Wang, Lin F. Yang, Xin Yang
Abstract: Suppose we are given a large matrix $A=(a_{i,j})$ that cannot be stored in memory but is in a disk or is presented in a data stream. However, we need to compute a matrix decomposition of the entry-wisely transformed matrix, $f(A):=(f(a_{i,j}))$ for some function $f$. Is it possible to do it in a space efficient way? Many machine learning applications indeed need to deal with such large transformed matrices, for example word embedding method in NLP needs to work with the pointwise mutual information (PMI) matrix, while the entrywise transformation makes it difficult to apply known linear algebraic tools. Existing approaches for this problem either need to store the whole matrix and perform the entry-wise transformation afterwards, which is space consuming or infeasible, or need to redesign the learning method, which is application specific and requires substantial remodeling. In this paper, we first propose a space-efficient sketching algorithm for computing the product of a given small matrix with the transformed matrix. It works for a general family of transformations with provable small error bounds and thus can be used as a primitive in downstream learning tasks. We then apply this primitive to a concrete application: low-rank approximation. We show that our approach obtains small error and is efficient in both space and time. We complement our theoretical results with experiments on synthetic and real data.
