目录
1. Multi-Source Domain Adaptation for Text Classification via DistanceNet-Bandits [PDF] 摘要
2. CLUENER2020: Fine-grained Named Entity Recognition Dataset and Benchmark for Chinese [PDF] 摘要
3. AdaBERT: Task-Adaptive BERT Compression with Differentiable Neural Architecture Search [PDF] 摘要
4. Mining customer product reviews for product development: A summarization process [PDF] 摘要
5. Joint Reasoning for Multi-Faceted Commonsense Knowledge [PDF] 摘要
6. ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training [PDF] 摘要
7. Stochastic Natural Language Generation Using Dependency Information [PDF] 摘要
8. Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study [PDF] 摘要
9. Revisiting Challenges in Data-to-Text Generation with Fact Grounding [PDF] 摘要
10. Learning Cross-Context Entity Representations from Text [PDF] 摘要
11. PatentTransformer-2: Controlling Patent Text Generation by Structural Metadata [PDF] 摘要
12. Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks [PDF] 摘要
13. Reformer: The Efficient Transformer [PDF] 摘要
14. LP-SparseMAP: Differentiable Relaxed Optimization for Sparse Structured Prediction [PDF] 摘要
15. Negative Statements Considered Useful [PDF] 摘要
16. Asymmetrical Hierarchical Networks with Attentive Interactions for Interpretable Review-Based Recommendation [PDF] 摘要
17. Shareable Representations for Search Query Understanding [PDF] 摘要
18. Improving Dysarthric Speech Intelligibility Using Cycle-consistent Adversarial Training [PDF] 摘要
19. Structural Decompositions of Epistemic Logic Programs [PDF] 摘要
20. A logic-based relational learning approach to relation extraction: The OntoILPER system [PDF] 摘要
21. Retouchdown: Adding Touchdown to StreetLearn as a Shareable Resource for Language Grounding Tasks in Street View [PDF] 摘要
摘要
1. Multi-Source Domain Adaptation for Text Classification via DistanceNet-Bandits [PDF] 返回目录
Han Guo, Ramakanth Pasunuru, Mohit Bansal
Abstract: Domain adaptation performance of a learning algorithm on a target domain is a function of its source domain error and a divergence measure between the data distribution of these two domains. We present a study of various distance-based measures in the context of NLP tasks, that characterize the dissimilarity between domains based on sample estimates. We first conduct analysis experiments to show which of these distance measures can best differentiate samples from same versus different domains, and are correlated with empirical results. Next, we develop a DistanceNet model which uses these distance measures, or a mixture of these distance measures, as an additional loss function to be minimized jointly with the task's loss function, so as to achieve better unsupervised domain adaptation. Finally, we extend this model to a novel DistanceNet-Bandit model, which employs a multi-armed bandit controller to dynamically switch between multiple source domains and allow the model to learn an optimal trajectory and mixture of domains for transfer to the low-resource target domain. We conduct experiments on popular sentiment analysis datasets with several diverse domains and show that our DistanceNet model, as well as its dynamic bandit variant, can outperform competitive baselines in the context of unsupervised domain adaptation.
摘要:学习算法在目标域上的域自适应性能,取决于其源域误差以及两个域数据分布之间的差异度量。我们在NLP任务的背景下研究了多种基于距离的度量,这些度量依据样本估计来刻画域之间的差异。我们首先通过分析实验考察哪些距离度量最能区分来自相同域与不同域的样本,并且与实证结果相关。接着,我们提出DistanceNet模型,将这些距离度量或其组合作为附加损失项,与任务损失联合最小化,以实现更好的无监督域自适应。最后,我们将该模型扩展为新颖的DistanceNet-Bandit模型,利用多臂老虎机控制器在多个源域之间动态切换,使模型学习到向低资源目标域迁移的最优轨迹和域组合。我们在包含多个不同领域的常用情感分析数据集上进行实验,结果表明DistanceNet模型及其动态bandit变体在无监督域自适应任务上优于有竞争力的基线方法。
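To make the idea concrete, here is a minimal sketch (in PyTorch, not the authors' code) of adding a domain-distance term to the task loss; the simple mean-feature discrepancy used here merely stands in for the various distance measures studied in the paper, and all names are illustrative.

```python
# Minimal sketch: a task loss on labeled source data plus a divergence term between
# source and target feature distributions (a stand-in for the paper's distance measures).
import torch
import torch.nn.functional as F

def mean_feature_distance(src_feats, tgt_feats):
    # src_feats, tgt_feats: (batch, hidden) encoder outputs from the two domains
    return (src_feats.mean(dim=0) - tgt_feats.mean(dim=0)).pow(2).sum()

def distancenet_style_loss(logits, labels, src_feats, tgt_feats, lam=0.1):
    task_loss = F.cross_entropy(logits, labels)               # supervised loss on the source domain
    dist_loss = mean_feature_distance(src_feats, tgt_feats)   # unsupervised divergence term
    return task_loss + lam * dist_loss
```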
2. CLUENER2020: Fine-grained Named Entity Recognition Dataset and Benchmark for Chinese [PDF] 返回目录
Liang Xu, Qianqian Dong, Cong Yu, Yin Tian, Weitang Liu, Lu Li, Xuanwei Zhang
Abstract: In this paper, we introduce the NER dataset from CLUE organization (CLUENER2020), a well-defined fine-grained dataset for named entity recognition in Chinese. CLUENER2020 contains 10 categories. Apart from common labels like person, organization, and location, it contains more diverse categories. It is more challenging than current other Chinese NER datasets and could better reflect real-world applications. For comparison, we implement several state-of-the-art baselines as sequence labeling tasks and report human performance, as well as its analysis. To facilitate future work on fine-grained NER for Chinese, we release our dataset, baselines, and leader-board.
摘要:本文介绍来自CLUE组织的NER数据集CLUENER2020,这是一个定义明确的中文细粒度命名实体识别数据集。CLUENER2020包含10个类别,除了人名、组织和地点等常见标签外,还涵盖更多样化的类别。它比现有的其他中文NER数据集更具挑战性,也更能反映真实世界的应用。为便于比较,我们将若干最先进的基线方法实现为序列标注任务,并报告了人类表现及其分析。为了促进未来中文细粒度NER的研究,我们公开了数据集、基线和排行榜。
3. AdaBERT: Task-Adaptive BERT Compression with Differentiable Neural Architecture Search [PDF] 返回目录
Daoyuan Chen, Yaliang Li, Minghui Qiu, Zhen Wang, Bofang Li, Bolin Ding, Hongbo Deng, Jun Huang, Wei Lin, Jingren Zhou
Abstract: Large pre-trained language models such as BERT have shown their effectiveness in various natural language processing tasks. However, the huge parameter size makes them difficult to be deployed in real-time applications that require quick inference with limited resources. Existing methods compress BERT into small models while such compression is task-independent, i.e., the same compressed BERT for all different downstream tasks. Motivated by the necessity and benefits of task-oriented BERT compression, we propose a novel compression method, AdaBERT, that leverages differentiable Neural Architecture Search to automatically compress BERT into task-adaptive small models for specific tasks. We incorporate a task-oriented knowledge distillation loss to provide search hints and an efficiency-aware loss as search constraints, which enables a good trade-off between efficiency and effectiveness for task-adaptive BERT compression. We evaluate AdaBERT on several NLP tasks, and the results demonstrate that those task-adaptive compressed models are 12.7x to 29.3x faster than BERT in inference time and 11.5x to 17.0x smaller in terms of parameter size, while comparable performance is maintained.
摘要:BERT等大型预训练语言模型已在各种自然语言处理任务中展现出有效性。然而,其庞大的参数规模使其难以部署在资源受限、需要快速推理的实时应用中。现有方法将BERT压缩为小模型,但这种压缩与任务无关,即所有下游任务共用同一个压缩后的BERT。鉴于面向任务的BERT压缩的必要性和益处,我们提出一种新的压缩方法AdaBERT,它利用可微神经架构搜索,针对特定任务自动将BERT压缩为任务自适应的小模型。我们引入面向任务的知识蒸馏损失以提供搜索提示,并以效率感知损失作为搜索约束,从而在任务自适应BERT压缩的效率与效果之间取得良好权衡。我们在多个NLP任务上评估AdaBERT,结果表明这些任务自适应压缩模型的推理速度比BERT快12.7倍至29.3倍,参数规模小11.5倍至17.0倍,同时保持了相当的性能。
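The objective described above can be pictured roughly as a distillation term plus an efficiency penalty over relaxed architecture parameters. The sketch below is a hypothetical illustration of that combination, not the AdaBERT implementation; the weighting, temperature, and cost vector are placeholders.

```python
# Hypothetical combined objective for task-adaptive compression via differentiable NAS:
# knowledge-distillation "hints" from the teacher plus an efficiency-aware search constraint.
import torch
import torch.nn.functional as F

def compression_loss(student_logits, teacher_logits, arch_weights, op_costs, beta=4.0, T=1.0):
    # Distillation: match the teacher's softened predictions.
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                  F.softmax(teacher_logits / T, dim=-1),
                  reduction="batchmean") * (T * T)
    # Efficiency: expected cost of candidate ops under the softmax-relaxed architecture weights.
    eff = (F.softmax(arch_weights, dim=-1) * op_costs).sum()
    return kd + beta * eff
```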
4. Mining customer product reviews for product development: A summarization process [PDF] 返回目录
Tianjun Hou, Bernard Yannou, Yann Leroy, Emilie Poirson
Abstract: This research set out to identify and structure from online reviews the words and expressions related to customers' likes and dislikes to guide product development. Previous methods were mainly focused on product features. However, reviewers express their preference not only on product features. In this paper, based on an extensive literature review in design science, the authors propose a summarization model containing multiple aspects of user preference, such as product affordances, emotions, usage conditions. Meanwhile, the linguistic patterns describing these aspects of preference are discovered and drafted as annotation guidelines. A case study demonstrates that with the proposed model and the annotation guidelines, human annotators can structure the online reviews with high inter-agreement. As high inter-agreement human annotation results are essential for automatizing the online review summarization process with the natural language processing, this study provides materials for the future study of automatization.
摘要:本研究旨在从在线评论中识别并结构化与顾客好恶相关的词语和表达,以指导产品开发。以往的方法主要关注产品特征,然而评论者表达偏好并不仅限于产品特征。本文在对设计科学领域文献进行广泛综述的基础上,提出一个包含用户偏好多个方面(如产品可供性、情感、使用条件)的摘要模型。同时,我们发现了描述这些偏好方面的语言模式,并将其整理为标注指南。案例研究表明,借助所提出的模型和标注指南,人工标注者能够以较高的一致性对在线评论进行结构化。由于高一致性的人工标注结果是利用自然语言处理实现在线评论摘要自动化的基础,本研究为后续的自动化研究提供了素材。
5. Joint Reasoning for Multi-Faceted Commonsense Knowledge [PDF] 返回目录
Yohan Chalier, Simon Razniewski, Gerhard Weikum
Abstract: Commonsense knowledge (CSK) supports a variety of AI applications, from visual understanding to chatbots. Prior works on acquiring CSK, such as ConceptNet, have compiled statements that associate concepts, like everyday objects or activities, with properties that hold for most or some instances of the concept. Each concept is treated in isolation from other concepts, and the only quantitative measure (or ranking) of properties is a confidence score that the statement is valid. This paper aims to overcome these limitations by introducing a multi-faceted model of CSK statements and methods for joint reasoning over sets of inter-related statements. Our model captures four different dimensions of CSK statements: plausibility, typicality, remarkability and salience, with scoring and ranking along each dimension. For example, hyenas drinking water is typical but not salient, whereas hyenas eating carcasses is salient. For reasoning and ranking, we develop a method with soft constraints, to couple the inference over concepts that are related in in a taxonomic hierarchy. The reasoning is cast into an integer linear programming (ILP), and we leverage the theory of reduction costs of a relaxed LP to compute informative rankings. This methodology is applied to several large CSK collections. Our evaluation shows that we can consolidate these inputs into much cleaner and more expressive knowledge. Results are available at this https URL.
摘要:常识知识(CSK)支持从视觉理解到聊天机器人等多种AI应用。以往获取CSK的工作(如ConceptNet)汇编了将概念(如日常物品或活动)与适用于该概念大多数或部分实例的属性相关联的陈述。每个概念都被孤立地处理,属性的唯一量化度量(或排序)是陈述有效性的置信分数。本文旨在克服这些局限,提出一种CSK陈述的多维模型,以及对相互关联的陈述集合进行联合推理的方法。我们的模型刻画CSK陈述的四个维度:合理性、典型性、独特性(remarkability)和显著性(salience),并沿每个维度进行打分和排序。例如,鬣狗喝水是典型的但不显著,而鬣狗吃腐尸则是显著的。在推理和排序方面,我们提出一种带软约束的方法,对分类学层次结构中相关联的概念进行耦合推理。该推理被形式化为整数线性规划(ILP),并利用松弛LP的归约成本理论来计算富含信息的排序。我们将该方法应用于若干大型CSK集合。评估表明,我们能够将这些输入整合为更干净、更具表达力的知识。结果可在此https URL获取。
6. ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training [PDF] 返回目录
Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang, Ming Zhou
Abstract: In this paper, we present a new sequence-to-sequence pre-training model called ProphetNet, which introduces a novel self-supervised objective named future n-gram prediction and the proposed n-stream self-attention mechanism. Instead of the optimization of one-step ahead prediction in traditional sequence-to-sequence model, the ProphetNet is optimized by n-step ahead prediction which predicts the next n tokens simultaneously based on previous context tokens at each time step. The future n-gram prediction explicitly encourages the model to plan for the future tokens and prevent overfitting on strong local correlations. We pre-train ProphetNet using a base scale dataset (16GB) and a large scale dataset (160GB) respectively. Experimental results show ProphetNet achieves the best performance on both abstractive summarization and question generation tasks compared to the models using the same base scale pre-training dataset. For the large scale dataset pre-training, ProphetNet achieves new state-of-the-art results on Gigaword and comparable results on CNN/DailyMail using only about 1/5 pre-training epochs of the previous model.
摘要:本文提出一种新的序列到序列预训练模型ProphetNet,它引入了一种名为未来n-gram预测的新型自监督目标,以及所提出的n流自注意力机制。不同于传统序列到序列模型中对单步预测的优化,ProphetNet通过n步预测进行优化,即在每个时间步基于之前的上下文token同时预测接下来的n个token。未来n-gram预测显式地鼓励模型为未来的token做规划,并防止模型对强局部相关性过拟合。我们分别使用基础规模数据集(16GB)和大规模数据集(160GB)对ProphetNet进行预训练。实验结果表明,与使用相同基础规模预训练数据集的模型相比,ProphetNet在生成式摘要和问题生成任务上均取得了最佳性能。在大规模数据集预训练的设置下,ProphetNet仅用之前模型约1/5的预训练轮数,就在Gigaword上取得了新的最优结果,并在CNN/DailyMail上取得了可比的结果。
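A schematic of the future n-gram objective might look as follows, assuming the decoder exposes one set of logits per prediction stream (stream i predicting the token i+1 steps ahead); this is an illustrative reading of the abstract rather than the released ProphetNet code.

```python
# Illustrative future n-gram loss: each stream is scored against the token several
# steps ahead, and the per-stream cross-entropies are averaged.
import torch
import torch.nn.functional as F

def future_ngram_loss(stream_logits, tokens, pad_id=0):
    # stream_logits[i]: (batch, L, vocab); at position t it predicts tokens[:, t + i + 1]
    # tokens: (batch, L) decoder token ids
    losses = []
    for i, logits in enumerate(stream_logits):
        shift = i + 1
        pred = logits[:, : tokens.size(1) - shift, :]   # drop positions with no target
        gold = tokens[:, shift:]                        # token (i + 1) steps ahead
        losses.append(F.cross_entropy(pred.reshape(-1, pred.size(-1)),
                                      gold.reshape(-1), ignore_index=pad_id))
    return torch.stack(losses).mean()
```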
7. Stochastic Natural Language Generation Using Dependency Information [PDF] 返回目录
Elham Seifossadat, Hossein Sameti
Abstract: This article presents a stochastic corpus-based model for generating natural language text. Our model first encodes dependency relations from training data through a feature set, then concatenates these features to produce a new dependency tree for a given meaning representation, and finally generates a natural language utterance from the produced dependency tree. We test our model on nine domains from tabular, dialogue act and RDF format. Our model outperforms the corpus-based state-of-the-art methods trained on tabular datasets and also achieves comparable results with neural network-based approaches trained on dialogue act, E2E and WebNLG datasets for BLEU and ERR evaluation metrics. Also, by reporting Human Evaluation results, we show that our model produces high-quality utterances in aspects of informativeness and naturalness as well as quality.
摘要:本文提出一种基于语料库的随机模型,用于生成自然语言文本。该模型首先通过特征集合对训练数据中的依存关系进行编码,然后将这些特征拼接起来,为给定的语义表示生成新的依存树,最后从生成的依存树产生自然语言语句。我们在表格、对话行为和RDF格式的九个领域上测试了该模型。在BLEU和ERR评价指标上,我们的模型优于在表格数据集上训练的最先进的基于语料库的方法,并与在对话行为、E2E和WebNLG数据集上训练的基于神经网络的方法取得了可比的结果。此外,通过报告人工评价结果,我们表明该模型生成的语句在信息量、自然度和质量方面均属高水平。
8. Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study [PDF] 返回目录
Jinlan Fu, Pengfei Liu, Qi Zhang, Xuanjing Huang
Abstract: While neural network-based models have achieved impressive performance on a large body of NLP tasks, the generalization behavior of different models remains poorly understood: Does this excellent performance imply a perfect generalization model, or are there still some limitations? In this paper, we take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives and characterize the differences of their generalization abilities through the lens of our proposed measures, which guides us to better design models and training methods. Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models in terms of breakdown performance analysis, annotation errors, dataset bias, and category relationships, which suggest directions for improvement. We have released the datasets: (ReCoNLL, PLONER) for the future research at our project page: this http URL. As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers and classifies them into different research topics: this https URL.
摘要:虽然基于神经网络的模型已在大量NLP任务上取得了令人瞩目的性能,但不同模型的泛化行为仍然鲜为人知:这种优异表现是否意味着完美的泛化模型,还是仍存在某些局限?本文以NER任务为试验平台,从不同角度分析现有模型的泛化行为,并通过我们提出的度量来刻画它们泛化能力的差异,从而指导我们更好地设计模型和训练方法。通过深入分析的实验,我们从细分性能分析、标注错误、数据集偏差和类别关系等方面诊断了现有神经NER模型的瓶颈,并提出了改进方向。我们已在项目页面发布了数据集(ReCoNLL、PLONER)以供未来研究使用:此http URL。作为本文的副产品,我们开源了一个项目,对近期的NER论文进行了全面总结,并将其归类为不同的研究主题:此https URL。
9. Revisiting Challenges in Data-to-Text Generation with Fact Grounding [PDF] 返回目录
Hongmin Wang
Abstract: Data-to-text generation models face challenges in ensuring data fidelity by referring to the correct input source. To inspire studies in this area, Wiseman et al. (2017) introduced the RotoWire corpus on generating NBA game summaries from the box- and line-score tables. However, limited attempts have been made in this direction and the challenges remain. We observe a prominent bottleneck in the corpus where only about 60% of the summary contents can be grounded to the boxscore records. Such information deficiency tends to misguide a conditioned language model to produce unconditioned random facts and thus leads to factual hallucinations. In this work, we restore the information balance and revamp this task to focus on fact-grounded data-to-text generation. We introduce a purified and larger-scale dataset, RotoWire-FG (Fact-Grounding), with 50% more data from the year 2017-19 and enriched input tables, hoping to attract more research focuses in this direction. Moreover, we achieve improved data fidelity over the state-of-the-art models by integrating a new form of table reconstruction as an auxiliary task to boost the generation quality.
摘要:数据到文本生成模型在通过引用正确的输入源来保证数据保真度方面面临挑战。为了推动该领域的研究,Wiseman等人(2017)提出了RotoWire语料库,用于从box score和line score表格生成NBA比赛摘要。然而,这一方向的尝试仍然有限,挑战依然存在。我们观察到该语料库存在一个突出的瓶颈:摘要内容中只有约60%能够对应到box score记录。这种信息缺失容易误导条件语言模型生成无依据的随机事实,从而导致事实性幻觉。在这项工作中,我们恢复了信息平衡,并将该任务改造为聚焦于基于事实的数据到文本生成。我们提出一个净化后的、更大规模的数据集RotoWire-FG(Fact-Grounding),包含来自2017-19年的多50%的数据以及更丰富的输入表格,希望吸引更多研究关注这一方向。此外,我们通过引入一种新形式的表格重建作为辅助任务来提升生成质量,在最先进模型的基础上实现了更高的数据保真度。
10. Learning Cross-Context Entity Representations from Text [PDF] 返回目录
Jeffrey Ling, Nicholas FitzGerald, Zifei Shan, Livio Baldini Soares, Thibault Févry, David Weiss, Tom Kwiatkowski
Abstract: Language modeling tasks, in which words, or word-pieces, are predicted on the basis of a local context, have been very effective for learning word embeddings and context dependent representations of phrases. Motivated by the observation that efforts to code world knowledge into machine readable knowledge bases or human readable encyclopedias tend to be entity-centric, we investigate the use of a fill-in-the-blank task to learn context independent representations of entities from the text contexts in which those entities were mentioned. We show that large scale training of neural models allows us to learn high quality entity representations, and we demonstrate successful results on four domains: (1) existing entity-level typing benchmarks, including a 64% error reduction over previous work on TypeNet (Murty et al., 2018); (2) a novel few-shot category reconstruction task; (3) existing entity linking benchmarks, where we match the state-of-the-art on CoNLL-Aida without linking-specific features and obtain a score of 89.8% on TAC-KBP 2010 without using any alias table, external knowledge base or in domain training data and (4) answering trivia questions, which uniquely identify entities. Our global entity representations encode fine-grained type categories, such as Scottish footballers, and can answer trivia questions such as: Who was the last inmate of Spandau jail in Berlin?
摘要:语言建模任务基于局部上下文预测词或词片,对于学习词向量和短语的上下文相关表示非常有效。鉴于将世界知识编码为机器可读知识库或人类可读百科全书的工作往往以实体为中心,我们研究利用填空任务,从提及这些实体的文本上下文中学习实体的上下文无关表示。我们表明,对神经模型进行大规模训练可以学到高质量的实体表示,并在四个方面取得了成功的结果:(1)现有的实体级分类(typing)基准,包括相比之前TypeNet(Murty et al., 2018)工作减少64%的错误;(2)一个新颖的少样本类别重建任务;(3)现有的实体链接基准,我们在不使用链接专用特征的情况下达到了CoNLL-Aida上的最优水平,并在不使用任何别名表、外部知识库或领域内训练数据的情况下在TAC-KBP 2010上取得89.8%的分数;(4)回答可唯一确定实体的知识问答题。我们的全局实体表示编码了细粒度的类型类别(如苏格兰足球运动员),并能回答诸如"谁是柏林施潘道监狱的最后一名囚犯?"之类的问题。
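The fill-in-the-blank objective described above can be sketched as scoring a blanked-context encoding against a table of context-independent entity vectors; the snippet below is a rough illustration under that assumption, not the authors' training code.

```python
# Rough sketch: the context around a blanked entity mention is encoded into a vector,
# then scored against learned, context-independent entity embeddings.
import torch
import torch.nn.functional as F

def blank_fill_loss(context_vec, entity_table, gold_entity_id):
    # context_vec: (batch, d) encoding of e.g. "[BLANK] was the last inmate of Spandau jail"
    # entity_table: (num_entities, d) context-independent entity embeddings
    scores = context_vec @ entity_table.t()   # (batch, num_entities)
    return F.cross_entropy(scores, gold_entity_id)
```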
11. PatentTransformer-2: Controlling Patent Text Generation by Structural Metadata [PDF] 返回目录
Jieh-Sheng Lee, Jieh Hsiang
Abstract: PatentTransformer is our codename for patent text generation based on Transformer-based models. Our goal is "Augmented Inventing." In this second version, we leverage more of the structural metadata in patents. The structural metadata includes patent title, abstract, and dependent claim, in addition to independent claim previously. Metadata controls what kind of patent text for the model to generate. Also, we leverage the relation between metadata to build a text-to-text generation flow, for example, from a few words to a title, the title to an abstract, the abstract to an independent claim, and the independent claim to multiple dependent claims. The text flow can go backward because the relation is trained bidirectionally. We release our GPT-2 models trained from scratch and our code for inference so that readers can verify and generate patent text on their own. As for generation quality, we measure it by both ROUGE and Google Universal Sentence Encoder.
摘要:PatentTransformer是我们基于Transformer模型的专利文本生成工作的代号,目标是"增强发明(Augmented Inventing)"。在这个第二版中,我们利用了专利中更多的结构化元数据。除了之前的独立权利要求外,结构化元数据还包括专利标题、摘要和从属权利要求。元数据控制模型生成何种类型的专利文本。此外,我们利用元数据之间的关系构建文本到文本的生成流程,例如从几个词生成标题、从标题生成摘要、从摘要生成独立权利要求、再从独立权利要求生成多个从属权利要求。由于这种关系是双向训练的,文本流程也可以反向进行。我们发布了从头训练的GPT-2模型以及推理代码,以便读者自行验证并生成专利文本。至于生成质量,我们同时用ROUGE和Google Universal Sentence Encoder进行衡量。
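The metadata-driven text-to-text flow could be wired up roughly as chained, tag-controlled generation. The sketch below is purely illustrative: the control-tag format and the use of the base `gpt2` checkpoint are placeholders for a model actually fine-tuned on patent text with structural metadata.

```python
# Illustrative chained generation: a few words -> title -> abstract -> independent claim.
# Tags such as "<|title|>" are hypothetical, not the paper's exact scheme.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")        # placeholder for a patent-tuned checkpoint
model = GPT2LMHeadModel.from_pretrained("gpt2")

def generate(prompt, max_new_tokens=64):
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=max_new_tokens, do_sample=True, top_p=0.95,
                         pad_token_id=tok.eos_token_id)
    return tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True)

seed = "autonomous drone battery swap"
title = generate(f"<|few_words|> {seed} <|title|>")
abstract = generate(f"<|title|> {title} <|abstract|>", max_new_tokens=128)
indep_claim = generate(f"<|abstract|> {abstract} <|claim|>", max_new_tokens=160)
```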
12. Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks [PDF] 返回目录
R. Thomas McCoy, Robert Frank, Tal Linzen
Abstract: Learners that are exposed to the same training data might generalize differently due to differing inductive biases. In neural network models, inductive biases could in theory arise from any aspect of the model architecture. We investigate which architectural factors affect the generalization behavior of neural sequence-to-sequence models trained on two syntactic tasks, English question formation and English tense reinflection. For both tasks, the training set is consistent with a generalization based on hierarchical structure and a generalization based on linear order. All architectural factors that we investigated qualitatively affected how models generalized, including factors with no clear connection to hierarchical structure. For example, LSTMs and GRUs displayed qualitatively different inductive biases. However, the only factor that consistently contributed a hierarchical bias across tasks was the use of a tree-structured model rather than a model with sequential recurrence, suggesting that human-like syntactic generalization requires architectural syntactic structure.
摘要:接触相同训练数据的学习器,可能由于不同的归纳偏置而产生不同的泛化。在神经网络模型中,归纳偏置理论上可能源自模型架构的任何方面。我们研究了哪些架构因素会影响神经序列到序列模型在两个句法任务(英语疑问句构成和英语时态再屈折)上训练后的泛化行为。对于这两个任务,训练集既与基于层次结构的泛化一致,也与基于线性顺序的泛化一致。我们考察的所有架构因素都在定性上影响了模型的泛化方式,包括那些与层次结构没有明显联系的因素。例如,LSTM和GRU表现出定性不同的归纳偏置。然而,在各任务中始终带来层次性偏置的唯一因素是使用树结构模型而非顺序循环模型,这表明类人的句法泛化需要架构上的句法结构。
13. Reformer: The Efficient Transformer [PDF] 返回目录
Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya
Abstract: Large Transformer models routinely achieve state-of-the-art results on a number of tasks but training these models can be prohibitively costly, especially on long sequences. We introduce two techniques to improve the efficiency of Transformers. For one, we replace dot-product attention by one that uses locality-sensitive hashing, changing its complexity from O($L^2$) to O($L\log L$), where $L$ is the length of the sequence. Furthermore, we use reversible residual layers instead of the standard residuals, which allows storing activations only once in the training process instead of $N$ times, where $N$ is the number of layers. The resulting model, the Reformer, performs on par with Transformer models while being much more memory-efficient and much faster on long sequences.
摘要:大型Transformer模型通常能在许多任务上取得最先进的结果,但训练这些模型的代价可能高得令人望而却步,尤其是在长序列上。我们引入两种技术来提高Transformer的效率。其一,我们用基于局部敏感哈希的注意力替代点积注意力,将其复杂度从O($L^2$)降至O($L\log L$),其中$L$是序列长度。其二,我们使用可逆残差层代替标准残差层,使得训练过程中激活值只需存储一次,而不是$N$次,其中$N$是层数。由此得到的模型Reformer在性能上与Transformer模型相当,同时内存效率更高,在长序列上的速度也快得多。
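The reversible-residual idea mentioned above is easy to state concretely: with a two-stream coupling, a layer's inputs can be recomputed exactly from its outputs, so activations need not be stored per layer. A minimal sketch (generic functions standing in for the attention and feed-forward blocks; this is not the Reformer implementation):

```python
# Reversible residual coupling: the inverse recovers the inputs from the outputs.
import torch

def rev_forward(x1, x2, f, g):
    y1 = x1 + f(x2)
    y2 = x2 + g(y1)
    return y1, y2

def rev_inverse(y1, y2, f, g):
    x2 = y2 - g(y1)
    x1 = y1 - f(x2)
    return x1, x2

# Arbitrary functions stand in for the attention / feed-forward sublayers.
f_fn, g_fn = torch.nn.Linear(8, 8), torch.nn.Linear(8, 8)
x1, x2 = torch.randn(2, 8), torch.randn(2, 8)
y1, y2 = rev_forward(x1, x2, f_fn, g_fn)
rx1, rx2 = rev_inverse(y1, y2, f_fn, g_fn)
assert torch.allclose(rx1, x1, atol=1e-6) and torch.allclose(rx2, x2, atol=1e-6)
```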
14. LP-SparseMAP: Differentiable Relaxed Optimization for Sparse Structured Prediction [PDF] 返回目录
Vlad Niculae, André F. T. Martins
Abstract: Structured prediction requires manipulating a large number of combinatorial structures, e.g., dependency trees or alignments, either as latent or output variables. Recently, the SparseMAP method has been proposed as a differentiable, sparse alternative to maximum a posteriori (MAP) and marginal inference. SparseMAP returns a combination of a small number of structures, a desirable property in some downstream applications. However, SparseMAP requires a tractable MAP inference oracle. This excludes, e.g., loopy graphical models or factor graphs with logic constraints, which generally require approximate inference. In this paper, we introduce LP-SparseMAP, an extension of SparseMAP that addresses this limitation via a local polytope relaxation. LP-SparseMAP uses the flexible and powerful domain specific language of factor graphs for defining and backpropagating through arbitrary hidden structure, supporting coarse decompositions, hard logic constraints, and higher-order correlations. We derive the forward and backward algorithms needed for using LP-SparseMAP as a hidden or output layer. Experiments in three structured prediction tasks show benefits compared to SparseMAP and Structured SVM.
摘要:结构化预测需要处理大量组合结构(如依存树或对齐),它们或作为隐变量,或作为输出变量。近来,SparseMAP方法被提出,作为最大后验(MAP)推断和边缘推断的一种可微、稀疏的替代方案。SparseMAP返回少量结构的组合,这在某些下游应用中是理想的性质。然而,SparseMAP需要一个易于求解的MAP推断oracle,这就排除了诸如有环图模型或带逻辑约束的因子图等通常需要近似推断的情形。本文提出LP-SparseMAP,它是SparseMAP的扩展,通过局部多面体松弛来解决上述限制。LP-SparseMAP使用因子图这种灵活而强大的领域特定语言,来定义任意隐藏结构并对其进行反向传播,支持粗粒度分解、硬逻辑约束和高阶相关。我们推导了将LP-SparseMAP用作隐藏层或输出层所需的前向和后向算法。在三个结构化预测任务上的实验表明,该方法相比SparseMAP和结构化SVM均有优势。
15. Negative Statements Considered Useful [PDF] 返回目录
Hiba Arnaout, Simon Razniewski, Gerhard Weikum
Abstract: Knowledge bases (KBs), pragmatic collections of knowledge about notable entities, are an important asset in applications such as search, question answering and dialogue. Rooted in a long tradition in knowledge representation, all popular KBs only store positive information, while they abstain from taking any stance towards statements not contained in them. In this paper, we make the case for explicitly stating interesting statements which are not true. Negative statements would be important to overcome current limitations of question answering, yet due to their potential abundance, any effort towards compiling them needs a tight coupling with ranking. We introduce two approaches towards compiling negative statements. (i) In peer-based statistical inferences, we compare entities with highly related entities in order to derive potential negative statements, which we then rank using supervised and unsupervised features. (ii) In query-log-based text extraction, we use a pattern-based approach for harvesting search engine query logs. Experimental results show that both approaches hold promising and complementary potential. Along with this paper, we publish the first datasets on interesting negative information, containing over 1.1M statements for 100K popular Wikidata entities.
摘要:知识库(KB)是关于重要实体知识的实用集合,是搜索、问答和对话等应用中的重要资产。根植于知识表示的悠久传统,所有流行的知识库都只存储正面信息,而对未包含在其中的陈述不作任何表态。本文主张显式地陈述那些有趣但不成立的陈述。负面陈述对于克服问答系统当前的局限十分重要,但由于其数量可能极其庞大,任何编纂负面陈述的工作都需要与排序紧密结合。我们提出两种编纂负面陈述的方法。(i)在基于同伴(peer)的统计推断中,我们将实体与高度相关的实体进行比较,以推导潜在的负面陈述,然后利用有监督和无监督特征对其排序。(ii)在基于查询日志的文本抽取中,我们采用基于模式的方法挖掘搜索引擎查询日志。实验结果表明,这两种方法都具有可观且互补的潜力。随本文一起,我们发布了首批关于有趣负面信息的数据集,包含针对10万个热门Wikidata实体的超过110万条陈述。
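A toy version of the peer-based inference (i) can be written in a few lines: statements that are common among an entity's peers but absent for the entity itself become candidate negatives, ranked by peer frequency. The knowledge base, the peer list, and the example facts below are illustrative assumptions, not the paper's data or ranking features.

```python
# Toy peer-based candidate negatives, ranked by how many peers assert the statement.
from collections import Counter

def candidate_negatives(target, peers, kb):
    counts = Counter(stmt for p in peers for stmt in kb.get(p, set()))
    have = kb.get(target, set())
    ranked = [(stmt, c / len(peers)) for stmt, c in counts.items() if stmt not in have]
    return sorted(ranked, key=lambda x: -x[1])

kb = {
    "StephenHawking": {("occupation", "physicist"), ("award", "CopleyMedal")},
    "AlbertEinstein": {("occupation", "physicist"), ("award", "NobelPrize")},
    "RichardFeynman": {("occupation", "physicist"), ("award", "NobelPrize")},
}
print(candidate_negatives("StephenHawking", ["AlbertEinstein", "RichardFeynman"], kb))
# -> [(('award', 'NobelPrize'), 1.0)]  i.e. a plausible negative: Hawking never won the Nobel Prize
```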
16. Asymmetrical Hierarchical Networks with Attentive Interactions for Interpretable Review-Based Recommendation [PDF] 返回目录
Xin Dong, Jingchao Ni, Wei Cheng, Zhengzhang Chen, Bo Zong, Dongjin Song, Yanchi Liu, Haifeng Chen, Gerard de Melo
Abstract: Recently, recommender systems have been able to emit substantially improved recommendations by leveraging user-provided reviews. Existing methods typically merge all reviews of a given user or item into a long document, and then process user and item documents in the same manner. In practice, however, these two sets of reviews are notably different: users' reviews reflect a variety of items that they have bought and are hence very heterogeneous in their topics, while an item's reviews pertain only to that single item and are thus topically homogeneous. In this work, we develop a novel neural network model that properly accounts for this important difference by means of asymmetric attentive modules. The user module learns to attend to only those signals that are relevant with respect to the target item, whereas the item module learns to extract the most salient contents with regard to properties of the item. Our multi-hierarchical paradigm accounts for the fact that neither are all reviews equally useful, nor are all sentences within each review equally pertinent. Extensive experimental results on a variety of real datasets demonstrate the effectiveness of our method.
摘要:近年来,推荐系统已能通过利用用户撰写的评论给出显著改进的推荐。现有方法通常将某一用户或某一物品的所有评论合并成一篇长文档,然后以相同的方式处理用户文档和物品文档。然而在实践中,这两组评论有显著差异:用户的评论反映其购买过的各种物品,因而主题上非常异质;而物品的评论只与该物品相关,因而主题上是同质的。在这项工作中,我们开发了一种新的神经网络模型,通过非对称的注意力模块恰当地建模这一重要差异。用户模块学习只关注与目标物品相关的信号,而物品模块学习提取与物品属性有关的最显著内容。我们的多层次范式还考虑了这样一个事实:并非所有评论都同样有用,每条评论中的所有句子也并非同样相关。在多个真实数据集上的大量实验结果证明了该方法的有效性。
17. Shareable Representations for Search Query Understanding [PDF] 返回目录
Mukul Kumar, Youna Hu, Will Headden, Rahul Goutam, Heran Lin, Bing Yin
Abstract: Understanding search queries is critical for shopping search engines to deliver a satisfying customer experience. Popular shopping search engines receive billions of unique queries yearly, each of which can depict any of hundreds of user preferences or intents. In order to get the right results to customers it must be known queries like "inexpensive prom dresses" are intended to not only surface results of a certain product type but also products with a low price. Referred to as query intents, examples also include preferences for author, brand, age group, or simply a need for customer service. Recent works such as BERT have demonstrated the success of a large transformer encoder architecture with language model pre-training on a variety of NLP tasks. We adapt such an architecture to learn intents for search queries and describe methods to account for the noisiness and sparseness of search query data. We also describe cost effective ways of hosting transformer encoder models in context with low latency requirements. With the right domain-specific training we can build a shareable deep learning model whose internal representation can be reused for a variety of query understanding tasks including query intent identification. Model sharing allows for fewer large models needed to be served at inference time and provides a platform to quickly build and roll out new search query classifiers.
摘要:理解搜索查询对于购物搜索引擎提供令人满意的客户体验至关重要。热门购物搜索引擎每年会收到数十亿条不同的查询,每条查询都可能体现数百种用户偏好或意图中的某一种。为了给客户返回正确的结果,系统必须知道诸如"inexpensive prom dresses(便宜的舞会礼服)"这样的查询不仅要求返回某一产品类型的结果,还要求价格低廉的产品。这类被称为查询意图的例子还包括对作者、品牌、年龄段的偏好,或仅仅是对客户服务的需求。BERT等近期工作展示了大型Transformer编码器架构结合语言模型预训练在各种NLP任务上的成功。我们调整这种架构来学习搜索查询的意图,并描述了应对搜索查询数据噪声大、稀疏的方法。我们还描述了在低延迟要求下以较低成本部署Transformer编码器模型的方式。通过恰当的领域特定训练,我们可以构建一个可共享的深度学习模型,其内部表示可被复用于包括查询意图识别在内的多种查询理解任务。模型共享减少了推理时需要部署的大模型数量,并提供了一个快速构建和上线新的搜索查询分类器的平台。
18. Improving Dysarthric Speech Intelligibility Using Cycle-consistent Adversarial Training [PDF] 返回目录
Seung Hee Yang, Minhwa Chung
Abstract: Dysarthria is a motor speech impairment affecting millions of people. Dysarthric speech can be far less intelligible than those of non-dysarthric speakers, causing significant communication difficulties. The goal of our work is to develop a model for dysarthric to healthy speech conversion using Cycle-consistent GAN. Using 18,700 dysarthric and 8,610 healthy control Korean utterances that were recorded for the purpose of automatic recognition of voice keyboard in a previous study, the generator is trained to transform dysarthric to healthy speech in the spectral domain, which is then converted back to speech. Objective evaluation using automatic speech recognition of the generated utterance on a held-out test set shows that the recognition performance is improved compared with the original dysarthic speech after performing adversarial training, as the absolute WER has been lowered by 33.4%. It demonstrates that the proposed GAN-based conversion method is useful for improving dysarthric speech intelligibility.
摘要:构音障碍是一种影响数百万人的运动性言语障碍。构音障碍者的言语可懂度可能远低于非构音障碍者,造成显著的沟通困难。我们的工作目标是利用循环一致GAN,建立从构音障碍言语到健康言语的转换模型。我们使用此前一项研究中为语音键盘自动识别而录制的18,700条构音障碍语句和8,610条健康对照的韩语语句,训练生成器在谱域中将构音障碍言语转换为健康言语,再将其转换回语音。在留出测试集上对生成语句进行自动语音识别的客观评价表明,经过对抗训练后,识别性能相比原始构音障碍语音有所提升,绝对WER降低了33.4%。这表明所提出的基于GAN的转换方法有助于提高构音障碍言语的可懂度。
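The cycle-consistent objective referred to above combines adversarial terms with reconstruction of the input spectra after a round trip through both generators. A compact, illustrative sketch (function and variable names are assumptions, not the paper's code):

```python
# Cycle-consistency in the spectral domain: G_dh maps dysarthric -> healthy features,
# G_hd the reverse; D_h and D_d are the domain critics.
import torch
import torch.nn.functional as F

def cycle_gan_generator_loss(x_dys, x_healthy, G_dh, G_hd, D_h, D_d, lam=10.0):
    fake_h = G_dh(x_dys)          # converted "healthy-like" spectra
    fake_d = G_hd(x_healthy)
    pred_h, pred_d = D_h(fake_h), D_d(fake_d)
    # Least-squares adversarial terms for the generators
    adv = F.mse_loss(pred_h, torch.ones_like(pred_h)) + F.mse_loss(pred_d, torch.ones_like(pred_d))
    # Cycle terms: mapping there and back should reconstruct the input spectra
    cyc = F.l1_loss(G_hd(fake_h), x_dys) + F.l1_loss(G_dh(fake_d), x_healthy)
    return adv + lam * cyc
```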
19. Structural Decompositions of Epistemic Logic Programs [PDF] 返回目录
Markus Hecher, Michael Morak, Stefan Woltran
Abstract: Epistemic logic programs (ELPs) are a popular generalization of standard Answer Set Programming (ASP) that provides means for reasoning over answer sets within the language. This richer formalism comes at the price of higher computational complexity, reaching up to the fourth level of the polynomial hierarchy. However, in contrast to standard ASP, dedicated investigations of tractability have not yet been undertaken. In this paper, we give first results in this direction and show that central ELP problems can be solved in linear time for ELPs exhibiting structural properties in terms of bounded treewidth. We also provide a full dynamic programming algorithm that adheres to these bounds. Finally, we show that applying treewidth to a novel dependency structure, given in terms of epistemic literals, allows one to bound the number of ASP solver calls in typical ELP solving procedures.
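To make the treewidth notion concrete, the sketch below computes an approximate tree decomposition of a small, hypothetical dependency graph over epistemic literals using networkx; the graph and the literal names are invented for illustration, and the min-degree heuristic only yields an upper bound on the true treewidth.

```python
# Hedged sketch: measuring the treewidth of a dependency graph, in the spirit
# of bounding ASP solver calls by the treewidth of a structure over epistemic
# literals. The toy graph below is illustrative only.
import networkx as nx
from networkx.algorithms.approximation import treewidth_min_degree

# Hypothetical dependency graph: nodes are epistemic literals, edges connect
# literals that occur together in a rule.
g = nx.Graph()
g.add_edges_from([("K a", "K b"), ("K b", "M c"), ("M c", "K a"), ("M c", "K d")])

width, decomposition = treewidth_min_degree(g)
print("approximate treewidth:", width)       # small width -> small search structure
for bag in decomposition.nodes:               # bags of the tree decomposition
    print(sorted(bag))
```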
20. A logic-based relational learning approach to relation extraction: The OntoILPER system [PDF] 返回目录
Rinaldo Lima, Bernard Espinasse, Fred Freitas
Abstract: Relation Extraction (RE), the task of detecting and characterizing semantic relations between entities in text, has gained much importance over the last two decades, mainly in the biomedical domain. Many papers have been published on Relation Extraction using supervised machine learning techniques. Most of these techniques rely on statistical methods, such as feature-based and tree-kernel-based methods. Such statistical learning techniques are usually based on a propositional hypothesis space for representing examples, i.e., they employ an attribute-value representation of features. This kind of representation has some drawbacks, particularly in the extraction of complex relations that demand more contextual information about the instances involved; in particular, it cannot effectively capture structural information from parse trees without loss of information. In this work, we present OntoILPER, a logic-based relational learning approach to Relation Extraction that uses Inductive Logic Programming to generate extraction models in the form of symbolic extraction rules. OntoILPER takes advantage of a rich relational representation of examples, which can alleviate the aforementioned drawbacks. The proposed relational approach seems more suitable for Relation Extraction than statistical ones, for several reasons that we discuss. Moreover, OntoILPER uses a domain ontology that guides the background knowledge generation process and is used for storing the extracted relation instances. The induced extraction rules were evaluated on three protein-protein interaction datasets from the biomedical domain, and the performance of OntoILPER extraction models was compared with that of other state-of-the-art RE systems. The encouraging results seem to demonstrate the effectiveness of the proposed solution.
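For intuition about symbolic extraction rules, here is a minimal, hypothetical example of applying one ILP-style rule over a relational representation of a sentence; the predicates and the rule are illustrative only and do not reflect OntoILPER's actual rule language or feature set.

```python
# Hedged sketch of applying a symbolic extraction rule to a relational
# representation of a sentence. Predicates and the rule are hypothetical.
from typing import Dict, List, Tuple

# Relational facts for one sentence (invented): entity types and a property
# of the dependency path between a candidate entity pair.
facts: Dict[str, set] = {
    "protein": {"e1", "e2"},
    "dep_path_contains_verb": {("e1", "e2", "interacts")},
}

def rule_interacts(e1: str, e2: str) -> bool:
    """interaction(E1, E2) :- protein(E1), protein(E2),
                               dep_path_contains_verb(E1, E2, interacts)."""
    return (e1 in facts["protein"] and e2 in facts["protein"]
            and (e1, e2, "interacts") in facts["dep_path_contains_verb"])

candidates: List[Tuple[str, str]] = [("e1", "e2"), ("e2", "e1")]
extracted = [pair for pair in candidates if rule_interacts(*pair)]
print(extracted)    # [('e1', 'e2')] -- the only pair covered by the rule
```

Unlike an attribute-value feature vector, such a rule refers directly to relational structure (entity types plus a relation over pairs), which is the kind of contextual information the abstract argues propositional representations lose.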
21. Retouchdown: Adding Touchdown to StreetLearn as a Shareable Resource for Language Grounding Tasks in Street View [PDF] 返回目录
Harsh Mehta, Yoav Artzi, Jason Baldridge, Eugene Ie, Piotr Mirowski
Abstract: The Touchdown dataset (Chen et al., 2019) provides instructions by human annotators for navigation through New York City streets and for resolving spatial descriptions at a given location. To enable the wider research community to work effectively with the Touchdown tasks, we are publicly releasing the 29k raw Street View panoramas needed for Touchdown. We follow the process used for the StreetLearn data release (Mirowski et al., 2019) to check panoramas for personally identifiable information and blur them as necessary. These have been added to the StreetLearn dataset and can be obtained via the same process as used previously for StreetLearn. We also provide a reference implementation for both of the Touchdown tasks: vision and language navigation (VLN) and spatial description resolution (SDR). We compare our model results to those given in Chen et al. (2019) and show that the panoramas we have added to StreetLearn fully support both Touchdown tasks and can be used effectively for further research and comparison.
Note: the Chinese abstracts in this digest are machine translations.