
【arXiv Papers】 Computation and Language 2020-06-08

Contents

1. Spoken dialect identification in Twitter using a multi-filter architecture [PDF] Abstract
2. Sentiment Analysis Based on Deep Learning: A Comparative Study [PDF] Abstract
3. CoCon: A Self-Supervised Approach for Controlled Text Generation [PDF] Abstract
4. Beyond Domain APIs: Task-oriented Conversational Modeling with Unstructured Knowledge Access [PDF] Abstract
5. Unsupervised Translation of Programming Languages [PDF] Abstract
6. ELITR Non-Native Speech Translation at IWSLT 2020 [PDF] Abstract
7. Understanding Self-Attention of Self-Supervised Audio Transformers [PDF] Abstract
8. Aspect-based Sentiment Analysis of Scientific Reviews [PDF] Abstract
9. "To Target or Not to Target": Identification and Analysis of Abusive Text Using Ensemble of Classifiers [PDF] Abstract
10. Evaluating Text Coherence at Sentence and Paragraph Levels [PDF] Abstract
11. Cross-lingual Transfer Learning for COVID-19 Outbreak Alignment [PDF] Abstract
12. Human or Machine: Automating Human Likeliness Evaluation of NLG Texts [PDF] Abstract
13. SOLO: A Corpus of Tweets for Examining the State of Being Alone [PDF] Abstract
14. NewB: 200,000+ Sentences for Political Bias Detection [PDF] Abstract
15. Masked Language Modeling for Proteins via Linearly Scalable Long-Context Transformers [PDF] Abstract
16. AP20-OLR Challenge: Three Tasks and Their Baselines [PDF] Abstract
17. Sponge Examples: Energy-Latency Attacks on Neural Networks [PDF] Abstract
18. Contextual RNN-T For Open Domain ASR [PDF] Abstract
19. A New Method Towards Speech Files Local Features Investigation [PDF] Abstract
20. Classification Aware Neural Topic Model and its Application on a New COVID-19 Disinformation Corpus [PDF] Abstract
21. GMAT: Global Memory Augmentation for Transformers [PDF] Abstract
22. Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing [PDF] Abstract

Abstracts

1. Spoken dialect identification in Twitter using a multi-filter architecture [PDF] Back to Contents
  Mohammadreza Banaei, Rémi Lebret, Karl Aberer
Abstract: This paper presents our approach for SwissText & KONVENS 2020 shared task 2, a multi-stage neural model for Swiss German (GSW) identification on Twitter. Our model outputs either GSW or non-GSW and is not meant to be used as a generic language identifier. Our architecture consists of two independent filters, where the first favors recall and the second favors precision (both towards GSW). Moreover, we do not use binary models (GSW vs. not-GSW) in our filters but rather a multi-class classifier with GSW being one of the possible labels. Our model reaches an F1-score of 0.982 on the test set of the shared task.
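To make the cascade concrete, here is a minimal Python sketch of such a two-filter pipeline. The thresholds, label set, and classifier stubs are illustrative assumptions, not the authors' implementation; in practice each stage would be a trained multi-class language classifier.

```python
# Hypothetical sketch of a two-stage filter cascade for GSW identification.
# Stage 1 uses a permissive threshold on the GSW class (favors recall);
# stage 2 uses a strict one (favors precision). A tweet is labeled GSW
# only if it survives both filters.

def cascade_predict(p_stage1, p_stage2, text, t1=0.1, t2=0.9):
    """p_stage1/p_stage2: callables mapping text -> {label: probability},
    i.e. multi-class classifiers with GSW as one of several labels."""
    if p_stage1(text).get("gsw", 0.0) < t1:   # high-recall filter
        return "non-gsw"
    if p_stage2(text).get("gsw", 0.0) < t2:   # high-precision filter
        return "non-gsw"
    return "gsw"

# Toy usage with dummy probability functions standing in for trained models:
stage1 = lambda text: {"gsw": 0.6, "de": 0.3, "en": 0.1}
stage2 = lambda text: {"gsw": 0.95, "de": 0.05}
print(cascade_predict(stage1, stage2, "grüezi mitenand"))  # -> "gsw"
```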

2. Sentiment Analysis Based on Deep Learning: A Comparative Study [PDF] Back to Contents
  Nhan Cach Dang, María N. Moreno-García, Fernando De la Prieta
Abstract: The study of public opinion can provide us with valuable information. The analysis of sentiment on social networks, such as Twitter or Facebook, has become a powerful means of learning about the users' opinions and has a wide range of applications. However, the efficiency and accuracy of sentiment analysis is being hindered by the challenges encountered in natural language processing (NLP). In recent years, it has been demonstrated that deep learning models are a promising solution to the challenges of NLP. This paper reviews the latest studies that have employed deep learning to solve sentiment analysis problems, such as sentiment polarity. Models using term frequency-inverse document frequency (TF-IDF) and word embedding have been applied to a series of datasets. Finally, a comparative study has been conducted on the experimental results obtained for the different models and input features.
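As a concrete instance of one feature/model pairing from such a comparison, below is a minimal sketch of a TF-IDF pipeline with a classical classifier (scikit-learn; the toy data is invented). The deep-learning and word-embedding variants surveyed in the paper follow the same train/evaluate pattern with different feature extractors.

```python
# Minimal TF-IDF sentiment pipeline; labels: 1 = positive, 0 = negative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["I love this!", "Terrible experience.", "Absolutely great", "Not good at all"]
labels = [1, 0, 1, 0]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["great stuff", "very bad"]))  # e.g. [1 0]
```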

3. CoCon: A Self-Supervised Approach for Controlled Text Generation [PDF] Back to Contents
  Alvin Chan, Yew-Soon Ong, Bill Pung, Aston Zhang, Jie Fu
Abstract: Pretrained Transformer-based language models (LMs) display remarkable natural language generation capabilities. With their immense potential, controlling text generation of such LMs is getting attention. While there are studies that seek to control high-level attributes (such as sentiment and topic) of generated text, there is still a lack of more precise control over its content at the word- and phrase-level. Here, we propose Content-Conditioner (CoCon) to control an LM's output text with a target content, at a fine-grained level. In our self-supervised approach, the CoCon block learns to help the LM complete a partially-observed text sequence by conditioning with content inputs that are withheld from the LM. Through experiments, we show that CoCon can naturally incorporate target content into generated texts and control high-level text attributes in a zero-shot manner.

4. Beyond Domain APIs: Task-oriented Conversational Modeling with Unstructured Knowledge Access [PDF] Back to Contents
  Seokhwan Kim, Mihail Eric, Karthik Gopalakrishnan, Behnam Hedayatnia, Yang Liu, Dilek Hakkani-Tur
Abstract: Most prior work on task-oriented dialogue systems is restricted to a limited coverage of domain APIs, while users oftentimes have domain-related requests that are not covered by the APIs. In this paper, we propose to expand coverage of task-oriented dialogue systems by incorporating external unstructured knowledge sources. We define three sub-tasks: knowledge-seeking turn detection, knowledge selection, and knowledge-grounded response generation, which can be modeled individually or jointly. We introduce an augmented version of MultiWOZ 2.1, which includes new out-of-API-coverage turns and responses grounded on external knowledge sources. We present baselines for each sub-task using both conventional and neural approaches. Our experimental results demonstrate the need for further research in this direction to enable more informative conversational systems.
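The three sub-tasks compose naturally into a pipeline. The sketch below shows that control flow with toy stand-ins for each stage; the rules and data are invented for illustration, and each function would be a conventional or neural model in the actual baselines.

```python
# Toy pipeline for the three sub-tasks (all stage logic is a stand-in).
import re

def tokens(s):
    return set(re.findall(r"\w+", s.lower()))

def detect_knowledge_seeking(turn):
    # Sub-task 1: does this turn need knowledge outside the domain APIs?
    return bool(tokens(turn) & {"policy", "fee", "allowed"})  # toy rule

def select_knowledge(turn, snippets):
    # Sub-task 2: pick the most relevant snippet (toy word-overlap scorer).
    return max(snippets, key=lambda s: len(tokens(turn) & tokens(s)))

def generate_response(turn, snippet):
    # Sub-task 3: generate a response grounded in the selected knowledge.
    return f"According to our information: {snippet}"

turn = "What is your cancellation policy?"
snippets = ["Free cancellation is available up to 24 hours before check-in.",
            "Breakfast is served from 7 to 10 am."]
if detect_knowledge_seeking(turn):
    print(generate_response(turn, select_knowledge(turn, snippets)))
```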

5. Unsupervised Translation of Programming Languages [PDF] Back to Contents
  Marie-Anne Lachaux, Baptiste Roziere, Lowik Chanussot, Guillaume Lample
Abstract: A transcompiler, also known as a source-to-source translator, is a system that converts source code from one high-level programming language (such as C++ or Python) to another. Transcompilers are primarily used for interoperability, and to port codebases written in an obsolete or deprecated language (e.g. COBOL, Python 2) to a modern one. They typically rely on handcrafted rewrite rules applied to the source code's abstract syntax tree. Unfortunately, the resulting translations often lack readability, fail to respect the target language conventions, and require manual modifications in order to work properly. The overall translation process is time-consuming and requires expertise in both the source and target languages, making code-translation projects expensive. Although neural models significantly outperform their rule-based counterparts in the context of natural language translation, their applications to transcompilation have been limited due to the scarcity of parallel data in this domain. In this paper, we propose to leverage recent approaches in unsupervised machine translation to train a fully unsupervised neural transcompiler. We train our model on source code from open source GitHub projects, and show that it can translate functions between C++, Java, and Python with high accuracy. Our method relies exclusively on monolingual source code, requires no expertise in the source or target languages, and can easily be generalized to other programming languages. We also build and release a test set composed of 852 parallel functions, along with unit tests to check the correctness of translations. We show that our model outperforms rule-based commercial baselines by a significant margin.
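The released unit tests check semantic rather than textual equivalence of a translation. Here is a hedged sketch of that validation idea, with a hypothetical example function and invented names:

```python
# Sketch: a candidate translation passes if it agrees with the reference
# semantics on the unit-test inputs (function names are hypothetical).
def reference_max_abs(xs):    # semantics of the original source function
    return max(abs(x) for x in xs)

def candidate_max_abs(xs):    # output of the neural transcompiler
    best = 0
    for x in xs:
        if abs(x) > best:
            best = abs(x)
    return best

for case in ([1, -5, 3], [0], [-2, -2]):
    assert candidate_max_abs(case) == reference_max_abs(case)
print("translation passes all unit tests")
```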

6. ELITR Non-Native Speech Translation at IWSLT 2020 [PDF] Back to Contents
  Dominik Macháček, Jonáš Kratochvíl, Sangeet Sagar, Matúš Žilinec, Ondřej Bojar, Thai-Son Nguyen, Felix Schneider, Philip Williams, Yuekun Yao
Abstract: This paper is an ELITR system submission for the non-native speech translation task at IWSLT 2020. We describe systems for offline ASR, real-time ASR, and our cascaded approach to offline SLT and real-time SLT. We select our primary candidates from a pool of pre-existing systems, and develop a new end-to-end general ASR system as well as a hybrid ASR system trained on non-native speech. The provided small validation set prevents us from carrying out a complex validation, but we submit all the unselected candidates for contrastive evaluation on the test set.

7. Understanding Self-Attention of Self-Supervised Audio Transformers [PDF] Back to Contents
  Shu-wen Yang, Andy T. Liu, Hung-yi Lee
Abstract: Self-supervised Audio Transformers (SAT) enable great success in many downstream speech applications like ASR, but how they work has not been widely explored yet. In this work, we present multiple strategies for the analysis of attention mechanisms in SAT. We categorize attentions into explainable categories, where we discover each category possesses its own unique functionality. We provide a visualization tool for understanding multi-head self-attention, importance ranking strategies for identifying critical attention, and attention refinement techniques to improve model performance.

8. Aspect-based Sentiment Analysis of Scientific Reviews [PDF] Back to Contents
  Souvic Chakraborty, Pawan Goyal, Animesh Mukherjee
Abstract: Scientific papers are complex, and understanding their usefulness requires prior knowledge. Peer reviews are comments on a paper provided by designated experts in that field; they hold a substantial amount of information, not only for the editors and chairs to make the final decision, but also for judging the potential impact of the paper. In this paper, we propose to use aspect-based sentiment analysis of scientific reviews to extract useful information, which correlates well with the accept/reject decision. Working on a dataset of close to 8k reviews from ICLR, one of the top conferences in the field of machine learning, we use an active learning framework to build a training dataset for aspect prediction, which is further used to obtain the aspects and sentiments for the entire dataset. We show that the distribution of aspect-based sentiments obtained from a review is significantly different for accepted and rejected papers. We use the aspect sentiments from these reviews to make an intriguing observation: certain aspects present in a paper and discussed in the review strongly determine the final recommendation. As a second objective, we quantify the extent of disagreement among the reviewers refereeing a paper. We also investigate the extent of disagreement between the reviewers and the chair, and find that the inter-reviewer disagreement may have a link to the disagreement with the chair. One of the most interesting observations from this study is that reviews whose reviewer score is consistent with the aspect sentiments extracted from the review text are also more likely to be concurrent with the chair's decision.

9. "To Target or Not to Target": Identification and Analysis of Abusive Text Using Ensemble of Classifiers [PDF] 返回目录
  Gaurav Verma, Niyati Chhaya, Vishwa Vinay
Abstract: With rising concern around abusive and hateful behavior on social media platforms, we present an ensemble learning method to identify and analyze the linguistic properties of such content. Our stacked ensemble comprises of three machine learning models that capture different aspects of language and provide diverse and coherent insights about inappropriate language. The proposed approach provides comparable results to the existing state-of-the-art on the Twitter Abusive Behavior dataset (Founta et al. 2018) without using any user or network-related information; solely relying on textual properties. We believe that the presented insights and discussion of shortcomings of current approaches will highlight potential directions for future research.

10. Evaluating Text Coherence at Sentence and Paragraph Levels [PDF] Back to Contents
  Sennan Liu, Shuang Zeng, Sujian Li
Abstract: In this paper, to evaluate text coherence, we propose the paragraph ordering task in addition to sentence ordering. We collected four distinct corpora from different domains, on which we investigate the adaptation of existing sentence ordering methods to the paragraph ordering task. We also compare the learnability and robustness of existing models by artificially creating mini datasets and noisy datasets respectively, verifying the efficiency of established models under these circumstances. Furthermore, we carry out human evaluation on the rearranged passages from two competitive models and confirm that WLCS-l is a better metric, achieving significantly higher correlations with human ratings than tau, the most prevalent metric used previously. Results from these evaluations show that, except under certain extreme conditions, the recurrent graph neural network-based model is an optimal choice for coherence modeling.
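For reference, the tau baseline here is Kendall's tau over the gold positions in the model's predicted order, the standard formulation in sentence-ordering work; the implementation below is a sketch under that assumption.

```python
# Kendall's tau for an ordering task: 1.0 = perfect order, -1.0 = reversed.
from itertools import combinations

def kendall_tau(predicted):
    """predicted: gold indices listed in the order the model emitted them."""
    n = len(predicted)
    concordant = sum(1 for i, j in combinations(range(n), 2)
                     if predicted[i] < predicted[j])
    total = n * (n - 1) // 2
    return 2 * concordant / total - 1  # (concordant - discordant) / total

print(kendall_tau([0, 1, 2, 3]))  # 1.0
print(kendall_tau([3, 2, 1, 0]))  # -1.0
```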

11. Cross-lingual Transfer Learning for COVID-19 Outbreak Alignment [PDF] Back to Contents
  Sharon Levy, William Yang Wang
Abstract: The spread of COVID-19 has become a significant and troubling aspect of society in 2020. With millions of cases reported across countries, new outbreaks have occurred and followed patterns of previously affected areas. Many disease detection models do not incorporate the wealth of social media data that can be utilized for modeling and predicting its spread. In this case, it is useful to ask: can we utilize this knowledge in one country to model the outbreak in another? To answer this, we propose the task of cross-lingual transfer learning for epidemiological alignment. Utilizing both macro and micro text features, we train on Italy's early COVID-19 outbreak through Twitter and transfer to several other countries. Our experiments show strong results, with up to 0.85 Spearman correlation in cross-country predictions.
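The reported figure is a rank correlation between the transferred model's predictions and reported case counts; a minimal sketch of that evaluation with synthetic numbers:

```python
# Spearman correlation between predicted and reported outbreak curves.
from scipy.stats import spearmanr

predicted = [12, 30, 80, 150, 260]   # transferred model's outbreak signal
reported  = [10, 35, 70, 160, 240]   # officially reported case counts
rho, _ = spearmanr(predicted, reported)
print(f"Spearman correlation: {rho:.2f}")
```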

12. Human or Machine: Automating Human Likeliness Evaluation of NLG Texts [PDF] Back to Contents
  Erion Çano, Ondřej Bojar
Abstract: Automatic evaluation of various text quality criteria produced by data-driven intelligent methods is very common and useful because it is cheap, fast, and usually yields repeatable results. In this paper, we present an attempt to automate the human likeliness evaluation of the output text samples coming from natural language generation methods used to solve several tasks. We propose to use a human likeliness score that shows the percentage of the output samples from a method that look as if they were written by a human. Instead of having human participants label or rate those samples, we completely automate the process by using a discrimination procedure based on large pretrained language models and their probability distributions. As a follow-up, we plan to perform an empirical analysis of human-written and machine-generated texts to find the optimal setup of this evaluation approach. A validation procedure involving human participants will also check how the automatic evaluation correlates with human judgments.
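A hedged sketch of the discrimination idea: score texts by their likelihood under a large pretrained LM and threshold it. The model choice, the direction of the decision, and the threshold below are illustrative assumptions, not the paper's final setup.

```python
# Score a text by its average per-token log-probability under GPT-2.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def avg_logprob(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = lm(ids, labels=ids).loss   # mean token negative log-likelihood
    return -loss.item()

def looks_machine_generated(text: str, threshold: float = -3.0) -> bool:
    # Heuristic assumption: LM-generated text tends to be unusually
    # probable under the LM; the threshold here is invented.
    return avg_logprob(text) > threshold

print(looks_machine_generated("The quick brown fox jumps over the lazy dog."))
```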

13. SOLO: A Corpus of Tweets for Examining the State of Being Alone [PDF] Back to Contents
  Svetlana Kiritchenko, Will E. Hipson, Robert J. Coplan, Saif M. Mohammad
Abstract: The state of being alone can have a substantial impact on our lives, though experiences with time alone diverge significantly among individuals. Psychologists distinguish between the concept of solitude, a positive state of voluntary aloneness, and the concept of loneliness, a negative state of dissatisfaction with the quality of one's social interactions. Here, for the first time, we conduct a large-scale computational analysis to explore how the terms associated with the state of being alone are used in online language. We present SOLO (State of Being Alone), a corpus of over 4 million tweets collected with query terms 'solitude', 'lonely', and 'loneliness'. We use SOLO to analyze the language and emotions associated with the state of being alone. We show that the term 'solitude' tends to co-occur with more positive, high-dominance words (e.g., enjoy, bliss) while the terms 'lonely' and 'loneliness' frequently co-occur with negative, low-dominance words (e.g., scared, depressed), which confirms the conceptual distinctions made in psychology. We also show that women are more likely to report on negative feelings of being lonely as compared to men, and there are more teenagers among the tweeters that use the word 'lonely' than among the tweeters that use the word 'solitude'.

14. NewB: 200,000+ Sentences for Political Bias Detection [PDF] Back to Contents
  Jerry Wei
Abstract: We present the Newspaper Bias Dataset (NewB), a text corpus of more than 200,000 sentences from eleven news sources regarding Donald Trump. While previous datasets have labeled sentences as either liberal or conservative, NewB covers the political views of eleven popular media sources, capturing more nuanced political viewpoints than a traditional binary classification system does. We train two state-of-the-art deep learning models to predict the news source of a given sentence from eleven newspapers and find that a recurrent neural network achieved top-1, top-3, and top-5 accuracies of 33.3%, 61.4%, and 77.6%, respectively, significantly outperforming a baseline logistic regression model's accuracies of 18.3%, 42.6%, and 60.8%. Using the news source label of sentences, we analyze the top n-grams with our model to gain meaningful insight into the portrayal of Trump by media sources. We hope that the public release of our dataset will encourage further research in using natural language processing to analyze more complex political biases. Our dataset is posted at this https URL.
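The reported metrics are top-k accuracies over the eleven sources; a short sketch of that computation, with random stand-in logits in place of model outputs:

```python
# Top-k accuracy: the gold source must appear among the k highest scores.
import numpy as np

def top_k_accuracy(logits, gold, k):
    topk = np.argsort(logits, axis=1)[:, -k:]   # k highest-scoring sources
    return np.mean([g in row for g, row in zip(gold, topk)])

rng = np.random.default_rng(0)
logits = rng.normal(size=(100, 11))             # 100 sentences, 11 sources
gold = rng.integers(0, 11, size=100)
for k in (1, 3, 5):
    print(f"top-{k}: {top_k_accuracy(logits, gold, k):.3f}")
```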

15. Masked Language Modeling for Proteins via Linearly Scalable Long-Context Transformers [PDF] Back to Contents
  Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Jared Davis, Tamas Sarlos, David Belanger, Lucy Colwell, Adrian Weller
Abstract: Transformer models have achieved state-of-the-art results across a diverse range of domains. However, concern over the cost of training the attention mechanism to learn complex dependencies between distant inputs continues to grow. In response, solutions that exploit the structure and sparsity of the learned attention matrix have blossomed. However, real-world applications that involve long sequences, such as biological sequence analysis, may fall short of meeting these assumptions, precluding exploration of these models. To address this challenge, we present a new Transformer architecture, Performer, based on Fast Attention Via Orthogonal Random features (FAVOR). Our mechanism scales linearly rather than quadratically in the number of tokens in the sequence, is characterized by sub-quadratic space complexity and does not incorporate any sparsity pattern priors. Furthermore, it provides strong theoretical guarantees: unbiased estimation of the attention matrix and uniform convergence. It is also backwards-compatible with pre-trained regular Transformers. We demonstrate its effectiveness on the challenging task of protein sequence modeling and provide detailed theoretical analysis.
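A minimal sketch of the linear-attention idea behind FAVOR: map queries and keys through a positive random feature map so the softmax kernel is approximated unbiasedly, then reassociate the matrix product to avoid materializing the L x L attention matrix. This is a simplification; the actual Performer uses orthogonal random features and additional numerical stabilization.

```python
import numpy as np

def random_feature_attention(Q, K, V, m=256, seed=0):
    """Approximate softmax attention in O(L*m*d) instead of O(L^2*d)."""
    d = Q.shape[-1]
    W = np.random.default_rng(seed).normal(size=(d, m))  # random projections
    def phi(X):
        # Positive feature map: E[phi(q) . phi(k)] = exp(q.k / sqrt(d)).
        X = X / d ** 0.25
        return np.exp(X @ W - (X ** 2).sum(-1, keepdims=True) / 2) / np.sqrt(m)
    Qf, Kf = phi(Q), phi(K)        # (L, m) each
    KV = Kf.T @ V                  # (m, d): cost linear in sequence length L
    Z = Qf @ Kf.sum(axis=0)        # per-query normalizer
    return (Qf @ KV) / Z[:, None]

L, d = 1024, 64
rng = np.random.default_rng(1)
Q, K, V = (rng.normal(size=(L, d)) for _ in range(3))
print(random_feature_attention(Q, K, V).shape)  # (1024, 64)
```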

16. AP20-OLR Challenge: Three Tasks and Their Baselines [PDF] Back to Contents
  Zheng Li, Miao Zhao, Qingyang Hong, Lin Li, Zhiyuan Tang, Dong Wang, Liming Song, Cheng Yang
Abstract: This paper introduces the fifth Oriental Language Recognition (OLR) challenge, AP20-OLR, which intends to improve the performance of language recognition systems, held along with APSIPA Annual Summit and Conference (APSIPA ASC). The data profile, the three tasks, the corresponding baselines, and the evaluation principles are introduced in this paper. The AP20-OLR challenge includes more languages, dialects, and real-life data provided by Speechocean and the NSFC M2ASR project, and all the data is free for participants. The challenge this year still focuses on practical and challenging problems, with three tasks: (1) cross-channel LID, (2) dialect identification, and (3) noisy LID. Based on Kaldi and Pytorch, recipes for i-vector and x-vector systems are also provided as baselines for the three tasks. These recipes will be published online and made available for participants to configure LID systems. The baseline results on the three tasks demonstrate that the tasks in this challenge are worth further effort to achieve better performance.

17. Sponge Examples: Energy-Latency Attacks on Neural Networks [PDF] Back to Contents
  Ilia Shumailov, Yiren Zhao, Daniel Bates, Nicolas Papernot, Robert Mullins, Ross Anderson
Abstract: The high energy costs of neural network training and inference led to the use of acceleration hardware such as GPUs and TPUs. While this enabled us to train large-scale neural networks in datacenters and deploy them on edge devices, the focus so far is on average-case performance. In this work, we introduce a novel threat vector against neural networks whose energy consumption or decision latency are critical. We show how adversaries can exploit carefully crafted $\boldsymbol{sponge}~\boldsymbol{examples}$, which are inputs designed to maximise energy consumption and latency. We mount two variants of this attack on established vision and language models, increasing energy consumption by a factor of 10 to 200. Our attacks can also be used to delay decisions where a network has critical real-time performance, such as in perception for autonomous vehicles. We demonstrate the portability of our malicious inputs across CPUs and a variety of hardware accelerator chips including GPUs, and an ASIC simulator. We conclude by proposing a defense strategy which mitigates our attack by shifting the analysis of energy consumption in hardware from an average-case to a worst-case perspective.
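A hedged sketch of the attack's search loop: mutate inputs and keep those that maximize measured inference time. This random search is a stand-in for the paper's genetic-algorithm attack, and `toy_model` is an invented victim whose cost depends on input content.

```python
# Random-search sponge-example construction against any callable `model`.
import random
import string
import time

def latency(model, text):
    t0 = time.perf_counter()
    model(text)
    return time.perf_counter() - t0

def sponge_search(model, length=64, steps=200, seed=0):
    rng = random.Random(seed)
    best = "".join(rng.choices(string.ascii_letters, k=length))
    best_t = latency(model, best)
    for _ in range(steps):
        cand = list(best)
        cand[rng.randrange(length)] = rng.choice(string.printable)
        cand = "".join(cand)
        if (t := latency(model, cand)) > best_t:  # keep costlier mutations
            best, best_t = cand, t
    return best, best_t

# Toy victim whose compute cost depends on the input's content, not length.
toy_model = lambda s: sum(i * i for i in range(sum(map(ord, s))))
example, cost = sponge_search(toy_model)
print(f"worst-case input found, latency {cost * 1e3:.2f} ms")
```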

18. Contextual RNN-T For Open Domain ASR [PDF] Back to Contents
  Mahaveer Jain, Gil Keren, Jay Mahadeokar, Yatharth Saraf
Abstract: End-to-end (E2E) systems for automatic speech recognition (ASR), such as RNN Transducer (RNN-T) and Listen-Attend-Spell (LAS) blend the individual components of a traditional hybrid ASR system - acoustic model, language model, pronunciation model - into a single neural network. While this has some nice advantages, it limits the system to be trained using only paired audio and text. Because of this, E2E models tend to have difficulties with correctly recognizing rare words that are not frequently seen during training, such as entity names. In this paper, we propose modifications to the RNN-T model that allow the model to utilize additional metadata text with the objective of improving performance on these named entity words. We evaluate our approach on an in-house dataset sampled from de-identified public social media videos, which represent an open domain ASR task. By using an attention model to leverage the contextual metadata that accompanies a video, we observe a relative improvement of about 12% in Word Error Rate on Named Entities (WER-NE) for videos with related metadata.

19. A New Method Towards Speech Files Local Features Investigation [PDF] Back to Contents
  Rustam Latypov, Evgeni Stolov
Abstract: There are a few reasons for the recently increased interest in the study of local features of speech files. Many essential features of the speaker's language can appear in the form of the speech signal. The traditional instruments (the short-time Fourier transform, the wavelet transform, Hadamard transforms, autocorrelation, and the like) cannot detect all the particular properties of the language. In this paper, we suggest a new approach to the exploration of such properties. The source signal is approximated by a new one whose values are taken from a finite set. Then we construct a new sequence of vectors of a fixed size on the basis of those approximations. Examination of the distribution of the produced vectors provides a new method for describing the local characteristics of speech files. Finally, the developed technique is applied to the problem of automatically distinguishing two known languages used in speech files. For this purpose, a simple neural net is used.
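A hedged reconstruction of the described procedure: quantize the signal to a small finite alphabet, slice it into fixed-size vectors, and inspect the distribution of those vectors. All parameter choices (quantization levels, vector width, quantile binning) are assumptions for illustration.

```python
# Quantize a signal to a finite set and histogram fixed-size local vectors.
import numpy as np
from collections import Counter

def local_feature_distribution(signal, levels=4, width=3):
    # Approximate each sample by its nearest of `levels` quantile bins.
    edges = np.quantile(signal, np.linspace(0, 1, levels + 1)[1:-1])
    q = np.digitize(signal, edges)                 # values in {0..levels-1}
    # Build overlapping fixed-size vectors and count how often each occurs.
    vecs = [tuple(q[i:i + width]) for i in range(len(q) - width + 1)]
    counts = Counter(vecs)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}

rng = np.random.default_rng(0)
dist = local_feature_distribution(rng.normal(size=16000))  # ~1 s at 16 kHz
print(len(dist), "distinct local patterns")
```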

20. Classification Aware Neural Topic Model and its Application on a New COVID-19 Disinformation Corpus [PDF] Back to Contents
  Xingyi Song, Johann Petrak, Ye Jiang, Iknoor Singh, Diana Maynard, Kalina Bontcheva
Abstract: The explosion of disinformation related to the COVID-19 pandemic has overloaded fact-checkers and media worldwide. To help tackle this, we developed computational methods to support COVID-19 disinformation debunking and social impacts research. This paper presents: 1) the largest currently available manually annotated COVID-19 disinformation category dataset; and 2) a classification-aware neural topic model (CANTM) that combines classification and topic modelling under a variational autoencoder framework. We demonstrate that CANTM efficiently improves classification performance with low resources and is scalable. In addition, the classification-aware topics help researchers and end-users to better understand the classification results.

21. GMAT: Global Memory Augmentation for Transformers [PDF] Back to Contents
  Ankit Gupta, Jonathan Berant
Abstract: Transformer-based models have become ubiquitous in natural language processing thanks to their large capacity, innate parallelism and high performance. The contextualizing component of a Transformer block is the $\textit{pairwise dot-product}$ attention that has a large $\Omega(L^2)$ memory requirement for length $L$ sequences, limiting its ability to process long documents. This has been the subject of substantial interest recently, where multiple approximations were proposed to reduce the quadratic memory requirement using sparse attention matrices. In this work, we propose to augment sparse Transformer blocks with a dense attention-based $\textit{global memory}$ of length $M$ ($\ll L$) which provides an aggregate global view of the entire input sequence to each position. Our augmentation has a manageable $O(M\cdot(L+M))$ memory overhead, and can be seamlessly integrated with prior sparse solutions. Moreover, global memory can also be used for sequence compression, by representing a long input sequence with the memory representations only. We empirically show that our method leads to substantial improvement on a range of tasks, including (a) synthetic tasks that require global reasoning, (b) masked language modeling, and (c) reading comprehension.
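A toy sketch of the memory mechanism and its cost: the M memory vectors attend densely to the whole sequence, while in this simplified variant regular tokens read only from the memory (the paper integrates this with sparse token-to-token attention). Names and shapes are assumptions for illustration.

```python
# Global-memory attention with O(M*(L+M)) score entries instead of O(L^2).
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def memory_attention(X, mem):
    """X: (L, d) token states, mem: (M, d) global memory."""
    full = np.concatenate([mem, X])             # (M+L, d)
    # Memory attends to everything: (M, M+L) scores, giving a global view.
    mem_out = softmax(mem @ full.T) @ full
    # Each token attends only to the updated memory: (L, M) scores.
    tok_out = softmax(X @ mem_out.T) @ mem_out
    return tok_out, mem_out

L, M, d = 512, 16, 64
rng = np.random.default_rng(0)
tok, mem = memory_attention(rng.normal(size=(L, d)), rng.normal(size=(M, d)))
print(tok.shape, mem.shape)  # (512, 64) (16, 64)
```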

22. Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing [PDF] Back to Contents
  Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le
Abstract: With the success of language pretraining, it is highly desirable to develop more efficient architectures of good scalability that can exploit the abundant unlabeled data at a lower cost. To improve the efficiency, we examine the much-overlooked redundancy in maintaining a full-length token-level presentation, especially for tasks that only require a single-vector presentation of the sequence. With this intuition, we propose Funnel-Transformer which gradually compresses the sequence of hidden states to a shorter one and hence reduces the computation cost. More importantly, by re-investing the saved FLOPs from length reduction in constructing a deeper or wider model, we further improve the model capacity. In addition, to perform token-level predictions as required by common pretraining objectives, Funnel-Transformer is able to recover a deep representation for each token from the reduced hidden sequence via a decoder. Empirically, with comparable or fewer FLOPs, Funnel-Transformer outperforms the standard Transformer on a wide variety of sequence-level prediction tasks, including text classification, language understanding, and reading comprehension. The code and pretrained checkpoints are available at this https URL.
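A toy sketch of the funnel's length reduction and final token-level recovery: mean-pool the hidden-state sequence between blocks to halve its length, then upsample at the end. The real model pools the attention queries and uses a proper decoder rather than the naive repetition shown here.

```python
# Progressive sequence compression and naive upsampling (illustrative only).
import numpy as np

def pool_halve(H):
    """H: (L, d) -> (ceil(L/2), d) by stride-2 mean pooling."""
    if len(H) % 2:
        H = np.concatenate([H, H[-1:]])        # pad odd-length sequences
    return H.reshape(-1, 2, H.shape[-1]).mean(1)

def upsample(H, L):
    """Repeat each pooled state to recover a length-L sequence."""
    return np.repeat(H, 2, axis=0)[:L]

H = np.random.default_rng(0).normal(size=(128, 64))
compressed = pool_halve(pool_halve(H))         # two funnel stages: 128 -> 32
recovered = upsample(upsample(compressed, 64), 128)
print(compressed.shape, recovered.shape)       # (32, 64) (128, 64)
```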
