
[arXiv Papers] Computation and Language, 2020-03-20

Contents

1. Utilizing Language Relatedness to improve Machine Translation: A Case Study on Languages of the Indian Subcontinent [PDF] Abstract
2. Beheshti-NER: Persian Named Entity Recognition Using BERT [PDF] Abstract
3. Boosting Factual Correctness of Abstractive Summarization with Knowledge Graph [PDF] Abstract
4. Diversity, Density, and Homogeneity: Quantitative Characteristic Metrics for Text Collections [PDF] Abstract
5. An Analysis on the Learning Rules of the Skip-Gram Model [PDF] Abstract
6. Normalized and Geometry-Aware Self-Attention Network for Image Captioning [PDF] Abstract
7. Deep Learning for Automatic Tracking of Tongue Surface in Real-time Ultrasound Videos, Landmarks instead of Contours [PDF] Abstract
8. Personalized Taste and Cuisine Preference Modeling via Images [PDF] Abstract
9. Giving Commands to a Self-driving Car: A Multimodal Reasoner for Visual Grounding [PDF] Abstract
10. QnAMaker: Data to Bot in 2 Minutes [PDF] Abstract
11. A Corpus of Adpositional Supersenses for Mandarin Chinese [PDF] Abstract

Abstracts

1. Utilizing Language Relatedness to improve Machine Translation: A Case Study on Languages of the Indian Subcontinent [PDF] Back to Contents
  Anoop Kunchukuttan, Pushpak Bhattacharyya
Abstract: In this work, we present an extensive study of statistical machine translation involving languages of the Indian subcontinent. These languages are related by genetic and contact relationships. We describe the similarities between Indic languages arising from these relationships. We explore how lexical and orthographic similarity among these languages can be utilized to improve translation quality between Indic languages when only limited parallel corpora are available. We also explore how the structural correspondence between Indic languages can be utilized to re-use linguistic resources for English-to-Indic language translation. Our observations span 90 language pairs from 9 Indic languages and English. To the best of our knowledge, this is the first large-scale study specifically devoted to utilizing language relatedness to improve translation between related languages.
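The abstract above leans on lexical and orthographic similarity between related languages. As a rough illustration (not the paper's actual method or data), the sketch below scores hypothetical romanized word pairs with two generic orthographic-similarity measures: a character-bigram Dice coefficient and a longest-common-subsequence ratio.

```python
# Illustrative only: generic orthographic-similarity measures over romanized word
# pairs. The word pairs and measures are toy assumptions, not the paper's data.

def char_bigrams(word: str) -> set:
    return {word[i:i + 2] for i in range(len(word) - 1)}

def dice(a: str, b: str) -> float:
    """Dice coefficient over character bigrams (1.0 = identical bigram sets)."""
    ba, bb = char_bigrams(a), char_bigrams(b)
    if not ba or not bb:
        return 0.0
    return 2 * len(ba & bb) / (len(ba) + len(bb))

def lcs_ratio(a: str, b: str) -> float:
    """Longest common subsequence length, normalized by the longer word."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            dp[i + 1][j + 1] = dp[i][j] + 1 if a[i] == b[j] else max(dp[i][j + 1], dp[i + 1][j])
    return dp[m][n] / max(m, n) if max(m, n) else 0.0

# Hypothetical cognate pair, inflected pair, and unrelated pair.
for a, b in [("paani", "pani"), ("ghar", "ghara"), ("kitab", "pustak")]:
    print(f"{a:8s} {b:8s} dice={dice(a, b):.2f} lcsr={lcs_ratio(a, b):.2f}")
```

High scores for the first two pairs and a low score for the third hint at how such similarity signals could support shared subword vocabularies or transliteration-based transfer between related languages.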

2. Beheshti-NER: Persian Named Entity Recognition Using BERT [PDF] Back to Contents
  Ehsan Taher, Seyed Abbas Hoseini, Mehrnoush Shamsfard
Abstract: Named entity recognition is a natural language processing task that recognizes and extracts spans of text associated with named entities and classifies them into semantic categories. Google's BERT is a deep bidirectional language model, pre-trained on large corpora, that can be fine-tuned to solve many NLP tasks such as question answering, named entity recognition, and part-of-speech tagging. In this paper, we use the pre-trained deep bidirectional network BERT to build a model for named entity recognition in Persian. We also compare the results of our model with the previous state-of-the-art results achieved on Persian NER. Our evaluation metric is the CoNLL 2003 score at both the word and phrase levels. This model achieved second place in the NSURL-2019 Task 7 competition, which focused on NER for the Persian language. Our results in this competition are CoNLL F1 scores of 83.5 and 88.4 in phrase-level and word-level evaluation, respectively.
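The recipe above (fine-tune a pre-trained BERT with a token-classification head) is a standard one; the hedged sketch below shows one toy training step with the Hugging Face transformers API. The checkpoint name, tag set, and example sentence are placeholders for illustration, not the authors' configuration or data.

```python
# Minimal sketch, not the authors' code: fine-tuning BERT for token-level NER.
# Model name, tag set, and the toy sentence are placeholder assumptions.
import torch
from transformers import BertTokenizerFast, BertForTokenClassification

labels = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC", "B-ORG", "I-ORG"]  # toy tag set
model_name = "bert-base-multilingual-cased"  # placeholder checkpoint

tokenizer = BertTokenizerFast.from_pretrained(model_name)
model = BertForTokenClassification.from_pretrained(model_name, num_labels=len(labels))

# One toy step: wordpiece-tokenize a pre-split sentence, align word-level tags to
# subword tokens (-100 marks positions ignored by the loss), then backpropagate.
words = ["Sara", "lives", "in", "Tehran"]
word_tags = ["B-PER", "O", "O", "B-LOC"]
enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
label_ids = [-100 if wid is None else labels.index(word_tags[wid]) for wid in enc.word_ids()]
enc["labels"] = torch.tensor([label_ids])

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
loss = model(**enc).loss  # cross-entropy over the toy tag set
loss.backward()
optimizer.step()
```

In practice this step would loop over an annotated Persian NER corpus and report CoNLL-style phrase- and word-level F1, as the abstract describes.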

3. Boosting Factual Correctness of Abstractive Summarization with Knowledge Graph [PDF] Back to Contents
  Chenguang Zhu, William Hinthorn, Ruochen Xu, Qingkai Zeng, Michael Zeng, Xuedong Huang, Meng Jiang
Abstract: A commonly observed problem with abstractive summarization is the distortion or fabrication of factual information from the article. This inconsistency between summary and original text has led to various concerns over its applicability. In this paper, we propose to boost the factual correctness of summaries via the fusion of knowledge, i.e., factual relations extracted from the article. We present a Fact-Aware Summarization model, FASum. In this model, the knowledge information can be organically integrated into the summary generation process via neural graph computation, effectively improving factual correctness. Empirical results show that FASum generates summaries with significantly higher factual correctness compared with state-of-the-art abstractive summarization systems, both under an independently trained factual correctness evaluator and under human evaluation. For example, on the CNN/DailyMail dataset, FASum obtains fact correctness scores 1.2% higher than UniLM and 4.5% higher than BottomUp.

4. Diversity, Density, and Homogeneity: Quantitative Characteristic Metrics for Text Collections [PDF] Back to Contents
  Yi-An Lai, Xuan Zhu, Yi Zhang, Mona Diab
Abstract: Summarizing data samples by quantitative measures has a long history, with descriptive statistics being a case in point. However, as natural language processing methods flourish, there are still insufficient characteristic metrics to describe a collection of texts in terms of the words, sentences, or paragraphs they comprise. In this work, we propose metrics of diversity, density, and homogeneity that quantitatively measure the dispersion, sparsity, and uniformity of a text collection. We conduct a series of simulations to verify that each metric holds desired properties and resonates with human intuitions. Experiments on real-world datasets demonstrate that the proposed characteristic metrics are highly correlated with text classification performance of a renowned model, BERT, which could inspire future applications.
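To make "characteristic metrics" concrete, here is a minimal sketch, under assumed simplified definitions, of scoring a collection's dispersion and uniformity from sentence embeddings. The functions below are illustrative stand-ins; the paper's actual diversity, density, and homogeneity formulas may differ.

```python
# Illustrative stand-ins only: simplified dispersion/uniformity scores over a
# matrix of sentence embeddings (one row per text). Not the paper's exact metrics.
import numpy as np

def diversity(embeddings: np.ndarray) -> float:
    """Mean pairwise Euclidean distance (higher = more dispersed collection)."""
    n = len(embeddings)
    if n < 2:
        return 0.0
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    return float(dists.sum() / (n * (n - 1)))

def homogeneity(embeddings: np.ndarray) -> float:
    """Mean cosine similarity to the centroid (higher = more uniform collection)."""
    centroid = embeddings.mean(axis=0)
    sims = embeddings @ centroid / (
        np.linalg.norm(embeddings, axis=1) * np.linalg.norm(centroid) + 1e-12
    )
    return float(sims.mean())

rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 64))  # stand-in for sentence embeddings of a collection
print(f"diversity={diversity(emb):.3f}  homogeneity={homogeneity(emb):.3f}")
```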

5. An Analysis on the Learning Rules of the Skip-Gram Model [PDF] Back to Contents
  Canlin Zhang, Xiuwen Liu, Daniel Bis
Abstract: To improve the generalization of the representations for natural language processing tasks, words are commonly represented using vectors, where distances among the vectors are related to the similarity of the words. While word2vec, the state-of-the-art implementation of the skip-gram model, is widely used and improves the performance of many natural language processing tasks, its mechanism is not yet well understood. In this work, we derive the learning rules for the skip-gram model and establish their close relationship to competitive learning. In addition, we provide the global optimal solution constraints for the skip-gram model and validate them by experimental results.
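For readers who want the baseline such an analysis starts from, the standard skip-gram objective and the resulting gradient-ascent update for the center-word vector are shown below (standard background, not the paper's own derived rules); here v and u denote input (center) and output (context) word vectors.

```latex
% Skip-gram objective (softmax form) over a corpus w_1 ... w_T with window size c:
J(\theta) = \frac{1}{T} \sum_{t=1}^{T} \sum_{\substack{-c \le j \le c \\ j \ne 0}}
            \log p(w_{t+j} \mid w_t),
\qquad
p(w_O \mid w_I) = \frac{\exp\!\left(u_{w_O}^{\top} v_{w_I}\right)}
                       {\sum_{w=1}^{W} \exp\!\left(u_{w}^{\top} v_{w_I}\right)}

% Gradient-ascent update for the center-word vector, learning rate \eta:
v_{w_I} \leftarrow v_{w_I} + \eta \left( u_{w_O} - \sum_{w=1}^{W} p(w \mid w_I)\, u_{w} \right)
```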

6. Normalized and Geometry-Aware Self-Attention Network for Image Captioning [PDF] Back to Contents
  Longteng Guo, Jing Liu, Xinxin Zhu, Peng Yao, Shichen Lu, Hanqing Lu
Abstract: The self-attention (SA) network has shown profound value in image captioning. In this paper, we improve SA in two respects to promote the performance of image captioning. First, we propose Normalized Self-Attention (NSA), a reparameterization of SA that brings the benefits of normalization inside SA. While normalization has previously only been applied outside SA, we introduce a novel normalization method and demonstrate that it is both possible and beneficial to perform it on the hidden activations inside SA. Second, to compensate for the major limitation of the Transformer that it fails to model the geometric structure of the input objects, we propose a class of Geometry-aware Self-Attention (GSA) that extends SA to explicitly and efficiently consider the relative geometry relations between the objects in the image. To construct our image captioning model, we combine the two modules and apply them to the vanilla self-attention network. We extensively evaluate our proposals on the MS-COCO image captioning dataset, and superior results are achieved compared to state-of-the-art approaches. Further experiments on three challenging tasks, i.e., video captioning, machine translation, and visual question answering, show the generality of our methods.
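For context, the vanilla scaled dot-product self-attention that NSA and GSA modify is reproduced below; the paper's specific choices for normalizing the hidden activations inside SA and for injecting relative geometry are defined in the paper itself, so only the standard baseline form is shown here.

```latex
% Standard scaled dot-product self-attention over an input sequence X:
Q = X W^{Q}, \quad K = X W^{K}, \quad V = X W^{V},
\qquad
\mathrm{SA}(X) = \mathrm{softmax}\!\left( \frac{Q K^{\top}}{\sqrt{d_k}} \right) V
```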

7. Deep Learning for Automatic Tracking of Tongue Surface in Real-time Ultrasound Videos, Landmarks instead of Contours [PDF] Back to Contents
  M. Hamed Mozaffari, Won-Sook Lee
Abstract: One use of medical ultrasound imaging is to visualize and characterize human tongue shape and motion during real-time speech in order to study healthy or impaired speech production. Due to the low-contrast and noisy nature of ultrasound images, it may require expertise for non-expert users to recognize tongue gestures in applications such as visual training of a second language. Moreover, quantitative analysis of tongue motion requires the tongue dorsum contour to be extracted, tracked, and visualized. Manual tongue contour extraction is a cumbersome, subjective, and error-prone task, and it is not a feasible solution for real-time applications. Deep learning has been vigorously exploited in various computer vision tasks, including ultrasound tongue contour tracking. In current methods, tongue contour extraction comprises two steps: image segmentation and post-processing. This paper presents a novel approach to automatic, real-time tongue contour tracking using deep neural networks. In the proposed method, instead of the two-step procedure, landmarks of the tongue surface are tracked. This idea enables researchers in this field to benefit from previously annotated databases and achieve high-accuracy results. Our experiments demonstrate the outstanding generalization, performance, and accuracy of the proposed technique.
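To illustrate what tracking "landmarks instead of contours" can mean in practice, here is a generic coordinate-regression sketch: a small CNN that maps an ultrasound frame to K normalized (x, y) tongue-surface landmarks, trained with an MSE loss. The architecture, landmark count, and input size are placeholder assumptions, not the authors' model.

```python
# Illustrative sketch only: a generic CNN that regresses K (x, y) tongue-surface
# landmarks from a single ultrasound frame. Architecture, landmark count, and
# input size are placeholders, not the authors' model.
import torch
import torch.nn as nn

class LandmarkRegressor(nn.Module):
    def __init__(self, num_landmarks: int = 10):
        super().__init__()
        self.num_landmarks = num_landmarks
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Two outputs (x, y) per landmark, squashed to normalized [0, 1] coordinates.
        self.head = nn.Linear(64, num_landmarks * 2)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        x = self.features(frames).flatten(1)
        return torch.sigmoid(self.head(x)).view(-1, self.num_landmarks, 2)

model = LandmarkRegressor(num_landmarks=10)
frames = torch.rand(4, 1, 128, 128)   # batch of grayscale ultrasound frames (toy data)
targets = torch.rand(4, 10, 2)        # annotated landmark coordinates (toy data)
loss = nn.functional.mse_loss(model(frames), targets)
loss.backward()
```

Compared with segmenting a full contour and post-processing it, regressing a fixed set of landmark coordinates gives a single-step output, which is one way to understand the real-time motivation stated in the abstract.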

8. Personalized Taste and Cuisine Preference Modeling via Images [PDF] Back to Contents
  Nitish Nag, Bindu Rajanna, Ramesh Jain
Abstract: With the exponential growth in the usage of social media to share live updates about life, taking pictures has become an unavoidable phenomenon. Individuals unknowingly create a unique knowledge base with these images. The food images, in particular, are of interest as they contain a plethora of information. From the image metadata and using computer vision tools, we can extract distinct insights for each user to build a personal profile. Using the underlying connection between cuisines and their inherent tastes, we attempt to develop such a profile for an individual based solely on the images of his food. Our study provides insights about an individual's inclination towards particular cuisines. Interpreting these insights can lead to the development of a more precise recommendation system. Such a system would avoid the generic approach in favor of a personalized recommendation system.

9. Giving Commands to a Self-driving Car: A Multimodal Reasoner for Visual Grounding [PDF] Back to Contents
  Thierry Deruyttere, Guillem Collell, Marie-Francine Moens
Abstract: We propose a new spatial memory module and a spatial reasoner for the Visual Grounding (VG) task. The goal of this task is to find a certain object in an image based on a given textual query. Our work focuses on integrating the regions of a Region Proposal Network (RPN) into a new multi-step reasoning model which we have named a Multimodal Spatial Region Reasoner (MSRR). The introduced model uses the object regions from an RPN as the initialization of a 2D spatial memory and then implements a multi-step reasoning process that scores each region according to the query, hence we call it a multimodal reasoner. We evaluate this new model on challenging datasets, and our experiments show that our model, which jointly reasons over the object regions of the image and the words of the query, largely improves accuracy compared to current state-of-the-art models.

10. QnAMaker: Data to Bot in 2 Minutes [PDF] Back to Contents
  Parag Agrawal, Tulasi Menon, Aya Kamel, Michel Naim, Chaikesh Chouragade, Gurvinder Singh, Rohan Kulkarni, Anshuman Suri, Sahithi Katakam, Vineet Pratik, Prakul Bansal, Simerpreet Kaur, Neha Rajput, Anand Duggal, Achraf Chalabi, Prashant Choudhari, Reddy Satti, Niranjan Nayak
Abstract: Having a bot for seamless conversations is a much-desired feature that products and services today seek for their websites and mobile apps. These bots significantly reduce the traffic received by human support by handling frequent and directly answerable known questions. Many such services have huge reference documents such as FAQ pages, which makes it hard for users to browse through this data. A conversation layer over such raw data can lower traffic to human support by a great margin. We demonstrate QnAMaker, a service that creates a conversational layer over semi-structured data such as FAQ pages, product manuals, and support documents. QnAMaker is the popular choice for extraction and question answering as a service and is used by over 15,000 bots in production. It is also used by search interfaces and not just bots.

11. A Corpus of Adpositional Supersenses for Mandarin Chinese [PDF] Back to Contents
  Siyao Peng, Yang Liu, Yilun Zhu, Austin Blodgett, Yushi Zhao, Nathan Schneider
Abstract: Adpositions are frequent markers of semantic relations, but they are highly ambiguous and vary significantly from language to language. Moreover, there is a dearth of annotated corpora for investigating the cross-linguistic variation of adposition semantics, or for building multilingual disambiguation systems. This paper presents a corpus in which all adpositions have been semantically annotated in Mandarin Chinese; to the best of our knowledge, this is the first Chinese corpus to be broadly annotated with adposition semantics. Our approach adapts a framework that defined a general set of supersenses according to ostensibly language-independent semantic criteria, though its development focused primarily on English prepositions (Schneider et al., 2018). We find that the supersense categories are well-suited to Chinese adpositions despite syntactic differences from English. On a Mandarin translation of The Little Prince, we achieve high inter-annotator agreement and analyze semantic correspondences of adposition tokens in bitext.
