目录
1. Catplayinginthesnow: Impact of Prior Segmentation on a Model of Visually Grounded Speech [PDF] 摘要
3. Subjective Question Answering: Deciphering the inner workings of Transformers in the realm of subjectivity [PDF] 摘要
4. Graph-Stega: Semantic Controllable Steganographic Text Generation Guided by Knowledge Graph [PDF] 摘要
5. DeepVar: An End-to-End Deep Learning Approach for Genomic Variant Recognition in Biomedical Literature [PDF] 摘要
7. Affective Conditioning on Hierarchical Networks applied to Depression Detection from Transcribed Clinical Interviews [PDF] 摘要
10. An Augmented Translation Technique for low Resource language pair: Sanskrit to Hindi translation [PDF] 摘要
13. Fine-grained Human Evaluation of Transformer and Recurrent Approaches to Neural Machine Translation for English-to-Chinese [PDF] 摘要
16. Evidence-Aware Inferential Text Generation with Vector Quantised Variational AutoEncoder [PDF] 摘要
22. Words ranking and Hirsch index for identifying the core of the hapaxes in political texts [PDF] 摘要
35. Leveraging Multimodal Behavioral Analytics for Automated Job Interview Performance Assessment and Feedback [PDF] 摘要
41. Guided Transformer: Leveraging Multiple External Sources for Representation Learning in Conversational Search [PDF] 摘要
44. How to Avoid Being Eaten by a Grue: Structured Exploration Strategies for Textual Worlds [PDF] 摘要
摘要
1. Catplayinginthesnow: Impact of Prior Segmentation on a Model of Visually Grounded Speech [PDF] 返回目录
William N. Havard, Jean-Pierre Chevrot, Laurent Besacier
Abstract: We investigate the effect of introducing phone, syllable, or word boundaries on the performance of a Model of Visually Grounded Speech and compare the results with a model that does not use any boundary information and with a model that uses random boundaries. We introduce a simple way to introduce such information in an RNN-based model and investigate which type of boundary enables a better mapping between an image and its spoken description. We also explore where, that is, at which level of the network's architecture such information should be introduced. We show that using a segmentation that results in syllable-like or word-like segments and that respects word boundaries are the most efficient. Also, we show that a linguistically informed subsampling is more efficient than a random subsampling. Finally, we show that using a hierarchical segmentation, by first using a phone segmentation and recomposing words from the phone units yields better results than either using a phone or word segmentation in isolation.
摘要:我们研究了引入音素、音节或词语边界对视觉基础语音模型(Model of Visually Grounded Speech)性能的影响,并将结果与不使用任何边界信息的模型以及使用随机边界的模型进行比较。我们提出了一种在基于RNN的模型中引入此类信息的简单方法,并考察哪种类型的边界能在图像及其语音描述之间建立更好的映射。我们还探讨了应在网络结构的哪个层级引入这类信息。结果表明,能够产生类音节或类词语片段、且遵循词语边界的切分方式最为有效;同时,基于语言学知识的下采样比随机下采样更有效。最后,我们表明,采用层级式切分(先进行音素切分,再由音素单元重组出词语)比单独使用音素切分或词语切分能取得更好的结果。
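The abstract contrasts linguistically informed subsampling of RNN states (at phone, syllable, or word boundaries) with random subsampling. The sketch below is a minimal, hypothetical illustration of that idea, not the authors' implementation: given frame-level encoder states and a list of boundary indices, it keeps only the states at boundary positions, with a random-boundary baseline for comparison.

```python
import numpy as np

def subsample_at_boundaries(states: np.ndarray, boundaries: list[int]) -> np.ndarray:
    """Keep only the RNN states at the given boundary frames (e.g. syllable or word ends)."""
    return states[np.asarray(boundaries, dtype=int)]

def random_subsample(states: np.ndarray, n_segments: int, seed: int = 0) -> np.ndarray:
    """Baseline: keep the same number of states, but at random frame positions."""
    rng = np.random.default_rng(seed)
    idx = np.sort(rng.choice(len(states), size=n_segments, replace=False))
    return states[idx]

# Toy example: 100 frames of 256-d encoder states, word boundaries every ~12 frames.
frames = np.random.randn(100, 256)
word_boundaries = [11, 25, 37, 52, 68, 83, 99]
informed = subsample_at_boundaries(frames, word_boundaries)    # (7, 256)
baseline = random_subsample(frames, len(word_boundaries))      # (7, 256)
print(informed.shape, baseline.shape)
```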
2. Wat zei je? Detecting Out-of-Distribution Translations with Variational Transformers [PDF] 返回目录
Tim Z. Xiao, Aidan N. Gomez, Yarin Gal
Abstract: We detect out-of-training-distribution sentences in Neural Machine Translation using the Bayesian Deep Learning equivalent of Transformer models. For this we develop a new measure of uncertainty designed specifically for long sequences of discrete random variables -- i.e. words in the output sentence. Our new measure of uncertainty solves a major intractability in the naive application of existing approaches on long sentences. We use our new measure on a Transformer model trained with dropout approximate inference. On the task of German-English translation using WMT13 and Europarl, we show that with dropout uncertainty our measure is able to identify when Dutch source sentences, sentences which use the same word types as German, are given to the model instead of German.
摘要:我们利用Transformer模型的贝叶斯深度学习等价形式,检测神经机器翻译中处于训练分布之外的句子。为此,我们提出了一种专为离散随机变量长序列(即输出句子中的词)设计的新的不确定性度量。该度量解决了现有方法在长句上朴素应用时面临的主要计算困难。我们将该度量应用于使用dropout近似推断训练的Transformer模型。在基于WMT13和Europarl的德译英任务上,我们表明,借助dropout不确定性,该度量能够识别出输入给模型的是荷兰语源句(其用词类型与德语相同)而非德语句子的情况。
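The method relies on dropout approximate inference: the same Transformer is run several times with dropout active at test time, and the spread of its predictions gives an uncertainty score for flagging out-of-distribution source sentences. The following PyTorch sketch uses a placeholder model and a simple mean token-entropy score; both are assumptions for illustration, not the paper's exact measure.

```python
import torch

def enable_mc_dropout(model: torch.nn.Module) -> None:
    """Keep dropout layers in training mode while the rest of the model stays in eval mode."""
    model.eval()
    for m in model.modules():
        if isinstance(m, torch.nn.Dropout):
            m.train()

@torch.no_grad()
def sequence_uncertainty(model, src_tokens: torch.Tensor, n_samples: int = 8) -> float:
    """Average per-token predictive entropy over several stochastic forward passes.
    `model(src_tokens)` is assumed to return logits of shape (seq_len, vocab)."""
    enable_mc_dropout(model)
    probs = torch.stack([model(src_tokens).softmax(dim=-1) for _ in range(n_samples)])
    mean_probs = probs.mean(dim=0)                                   # (seq_len, vocab)
    entropy = -(mean_probs * mean_probs.clamp_min(1e-9).log()).sum(-1)
    return entropy.mean().item()

# A sentence is flagged as out-of-distribution when its score exceeds a threshold
# calibrated on in-domain validation data, e.g.:
# if sequence_uncertainty(nmt_model, tokens) > threshold: mark_as_ood()
```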
3. Subjective Question Answering: Deciphering the inner workings of Transformers in the realm of subjectivity [PDF] 返回目录
Lukas Muttenthaler
Abstract: Understanding subjectivity demands reasoning skills beyond the realm of common knowledge. It requires a machine learning model to process sentiment and to perform opinion mining. In this work, I've exploited a recently released dataset for span-selection Question Answering, namely SubjQA. SubjQA is the first QA dataset that contains questions that ask for subjective opinions corresponding to review paragraphs from six different domains. Hence, to answer these subjective questions, a learner must extract opinions and process sentiment for various domains, and additionally, align the knowledge extracted from a paragraph with the natural language utterances in the corresponding question, which together enhance the difficulty of a QA task. The primary goal of this thesis was to investigate the inner workings (i.e., latent representations) of a Transformer-based architecture to contribute to a better understanding of these not yet well understood "black-box" models. Transformer's hidden representations, concerning the true answer span, are clustered more closely in vector space than those representations corresponding to erroneous predictions. This observation holds across the top three Transformer layers for both objective and subjective questions and generally increases as a function of layer dimensions. Moreover, the probability to achieve a high cosine similarity among hidden representations in latent space concerning the true answer span tokens is significantly higher for correct compared to incorrect answer span predictions. These results have decisive implications for down-stream applications, where it is crucial to know about why a neural network made mistakes, and in which point, in space and time the mistake has happened (e.g., to automatically predict correctness of an answer span prediction without the necessity of labeled data).
摘要:理解主观性所需的推理能力超出了常识的范畴,它要求机器学习模型能够处理情感并进行观点挖掘。在这项工作中,我使用了最近发布的面向答案片段抽取式问答的数据集SubjQA。SubjQA是第一个包含主观观点类问题的问答数据集,这些问题对应来自六个不同领域的评论段落。因此,要回答这些主观问题,模型必须针对不同领域抽取观点并处理情感,还要将从段落中提取的知识与问题中的自然语言表述对齐,这些因素共同增加了问答任务的难度。本论文的主要目标是研究基于Transformer的架构的内部机制(即潜在表示),以帮助更好地理解这些尚未被充分认识的"黑箱"模型。与真实答案片段相关的Transformer隐藏表示,在向量空间中比对应错误预测的表示聚集得更紧密。这一现象在最上面三层Transformer层中对客观和主观问题均成立,并且总体上随层维度的增加而增强。此外,对于正确的答案片段预测,真实答案片段各词元的隐藏表示在潜在空间中取得高余弦相似度的概率显著高于错误预测。这些结果对下游应用具有重要意义:在这些应用中,了解神经网络为何出错以及错误发生在何处、何时(例如,在无需标注数据的情况下自动预测答案片段预测的正确性)至关重要。
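The analysis measures how tightly the hidden representations of the answer-span tokens cluster, e.g. via pairwise cosine similarity. A small illustrative helper (hypothetical, not the thesis code) could look like this:

```python
import numpy as np

def mean_pairwise_cosine(hidden_states: np.ndarray, span: tuple[int, int]) -> float:
    """Average cosine similarity between all pairs of token vectors inside a (start, end) span.
    `hidden_states` has shape (seq_len, hidden_dim); the span includes start and excludes end."""
    vecs = hidden_states[span[0]:span[1]]
    normed = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    sims = normed @ normed.T
    iu = np.triu_indices(len(vecs), k=1)          # upper triangle: distinct pairs only
    return float(sims[iu].mean()) if len(iu[0]) else 1.0

# Comparing gold vs. predicted spans on one layer's hidden states:
layer_states = np.random.randn(128, 768)
gold_score = mean_pairwise_cosine(layer_states, (40, 45))
pred_score = mean_pairwise_cosine(layer_states, (60, 66))
```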
4. Graph-Stega: Semantic Controllable Steganographic Text Generation Guided by Knowledge Graph [PDF] 返回目录
Zhongliang Yang, Baitao Gong, Yamin Li, Jinshuai Yang, Zhiwen Hu, Yongfeng Huang
Abstract: Most of the existing text generative steganographic methods are based on coding the conditional probability distribution of each word during the generation process, and then selecting specific words according to the secret information, so as to achieve information hiding. Such methods have their limitations which may bring potential security risks. Firstly, with the increase of embedding rate, these models will choose words with lower conditional probability, which will reduce the quality of the generated steganographic texts; secondly, they can not control the semantic expression of the final generated steganographic text. This paper proposes a new text generative steganography method which is quietly different from the existing models. We use a Knowledge Graph (KG) to guide the generation of steganographic sentences. On the one hand, we hide the secret information by coding the path in the knowledge graph, but not the conditional probability of each generated word; on the other hand, we can control the semantic expression of the generated steganographic text to a certain extent. The experimental results show that the proposed model can guarantee both the quality of the generated text and its semantic expression, which is a supplement and improvement to the current text generation steganography.
摘要:现有的文本生成式隐写方法大多在生成过程中对每个词的条件概率分布进行编码,再根据秘密信息选择特定的词,从而实现信息隐藏。此类方法存在局限,可能带来潜在的安全风险。首先,随着嵌入率的提高,这些模型会选择条件概率较低的词,从而降低所生成隐写文本的质量;其次,它们无法控制最终生成的隐写文本的语义表达。本文提出了一种与现有模型颇为不同的文本生成式隐写新方法:我们使用知识图谱(KG)来引导隐写句子的生成。一方面,我们通过对知识图谱中的路径进行编码来隐藏秘密信息,而不是对每个生成词的条件概率进行编码;另一方面,我们可以在一定程度上控制所生成隐写文本的语义表达。实验结果表明,所提模型能够同时保证生成文本的质量及其语义表达,是对现有文本生成式隐写方法的补充和改进。
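The key idea is to hide the secret bit stream in the choice of path through a knowledge graph rather than in per-word probabilities. Below is a toy sketch of that encoding step; the graph contents, node names, and two-bits-per-hop scheme are purely illustrative assumptions, not details from the paper.

```python
# Toy knowledge graph: each node maps to a list of outgoing neighbours.
graph = {
    "Paris":  ["Eiffel Tower", "France", "Seine", "Louvre"],
    "France": ["Europe", "Paris", "wine", "Alps"],
    "Europe": ["Atlantic", "EU", "France", "history"],
}

def encode_bits_as_path(graph: dict, start: str, bits: str, bits_per_hop: int = 2) -> list:
    """Consume the secret bit string two bits at a time; each pair selects one of the
    outgoing edges, so the walked path itself carries the hidden message."""
    path, node = [start], start
    for i in range(0, len(bits), bits_per_hop):
        choices = sorted(graph.get(node, []))
        if not choices:
            break
        index = int(bits[i:i + bits_per_hop], 2) % len(choices)
        node = choices[index]
        path.append(node)
    return path

# The selected path would then guide sentence generation, e.g. "Paris ... France ... Europe".
print(encode_bits_as_path(graph, "Paris", "0101"))
```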
5. DeepVar: An End-to-End Deep Learning Approach for Genomic Variant Recognition in Biomedical Literature [PDF] 返回目录
Chaoran Cheng, Fei Tan, Zhi Wei
Abstract: We consider the problem of Named Entity Recognition (NER) on biomedical scientific literature, and more specifically the genomic variants recognition in this work. Significant success has been achieved for NER on canonical tasks in recent years where large data sets are generally available. However, it remains a challenging problem on many domain-specific areas, especially the domains where only small gold annotations can be obtained. In addition, genomic variant entities exhibit diverse linguistic heterogeneity, differing much from those that have been characterized in existing canonical NER tasks. The state-of-the-art machine learning approaches in such tasks heavily rely on arduous feature engineering to characterize those unique patterns. In this work, we present the first successful end-to-end deep learning approach to bridge the gap between generic NER algorithms and low-resource applications through genomic variants recognition. Our proposed model can result in promising performance without any hand-crafted features or post-processing rules. Our extensive experiments and results may shed light on other similar low-resource NER applications.
摘要:我们研究生物医学科学文献上的命名实体识别(NER)问题,具体而言是基因组变异实体的识别。近年来,在大规模数据集普遍可得的标准任务上,NER已取得显著成功;然而,在许多特定领域,尤其是只能获得少量人工标注的领域,它仍然是一个具有挑战性的问题。此外,基因组变异实体表现出多样的语言异质性,与现有标准NER任务中刻画的实体有很大不同。此类任务中最先进的机器学习方法严重依赖繁重的特征工程来刻画这些独特模式。在这项工作中,我们提出了第一个成功的端到端深度学习方法,通过基因组变异识别弥合通用NER算法与低资源应用之间的差距。所提模型无需任何手工特征或后处理规则即可取得良好性能。我们大量的实验和结果也可为其他类似的低资源NER应用提供借鉴。
6. Open-Domain Question Answering with Pre-Constructed Question Spaces [PDF] 返回目录
Jinfeng Xiao, Lidan Wang, Franck Dernoncourt, Trung Bui, Tong Sun, Jiawei Han
Abstract: Open-domain question answering aims at solving the task of locating the answers to user-generated questions in large collections of documents. There are two families of solutions to this challenge. One family of algorithms, namely retriever-readers, first retrieves some pieces of text that are probably relevant to the question, and then feeds the retrieved text to a neural network to get the answer. Another line of work first constructs some knowledge graphs from the corpus, and queries the graph for the answer. We propose a novel algorithm with a reader-retriever structure that differs from both families. Our algorithm first reads off-line the corpus to generate collections of all answerable questions associated with their answers, and then queries the pre-constructed question spaces online to find answers that are most likely to be asked in the given way. The final answer returned to the user is decided with an accept-or-reject mechanism that combines multiple candidate answers by comparing the level of agreement between the retriever-reader and reader-retriever results. We claim that our algorithm solves some bottlenecks in existing work, and demonstrate that it achieves superior accuracy on a public dataset.
摘要:开放域问答旨在从大规模文档集合中定位用户提出问题的答案。针对这一挑战,现有解决方案可分为两类。一类算法即"检索器-阅读器"(retriever-reader):先检索出可能与问题相关的文本片段,再将其输入神经网络以获得答案;另一类工作先从语料库中构建知识图谱,然后在图上查询答案。我们提出了一种采用"阅读器-检索器"(reader-retriever)结构的新算法,它与上述两类方法都不同。该算法首先离线阅读语料库,生成所有可回答问题及其答案的集合,然后在线查询预先构建的问题空间,找出最有可能以给定方式被提出的问题所对应的答案。最终返回给用户的答案由一种"接受或拒绝"机制决定:该机制通过比较检索器-阅读器与阅读器-检索器两种结果之间的一致程度来组合多个候选答案。我们认为该算法解决了现有工作中的一些瓶颈,并在一个公开数据集上证明了其更高的准确率。
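The final answer is chosen by an accept-or-reject step that compares how much the retriever-reader and reader-retriever candidates agree. A minimal sketch of such an agreement check follows; the token-overlap F1 agreement and the fallback rule are my assumptions, not the paper's exact mechanism.

```python
def token_f1(a: str, b: str) -> float:
    """Simple token-overlap F1 between two answer strings."""
    ta, tb = a.lower().split(), b.lower().split()
    common = sum(min(ta.count(t), tb.count(t)) for t in set(ta))
    if not common:
        return 0.0
    precision, recall = common / len(ta), common / len(tb)
    return 2 * precision * recall / (precision + recall)

def accept_or_reject(rr_answer: str, rv_answer: str, threshold: float = 0.5) -> str:
    """Accept the retriever-reader answer when the two pipelines agree enough,
    otherwise fall back to the reader-retriever answer."""
    return rr_answer if token_f1(rr_answer, rv_answer) >= threshold else rv_answer

print(accept_or_reject("the Eiffel Tower", "Eiffel Tower"))   # agreement is high -> accepted
```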
7. Affective Conditioning on Hierarchical Networks applied to Depression Detection from Transcribed Clinical Interviews [PDF] 返回目录
D. Xezonaki, G. Paraskevopoulos, A. Potamianos, S. Narayanan
Abstract: In this work we propose a machine learning model for depression detection from transcribed clinical interviews. Depression is a mental disorder that impacts not only the subject's mood but also the use of language. To this end we use a Hierarchical Attention Network to classify interviews of depressed subjects. We augment the attention layer of our model with a conditioning mechanism on linguistic features, extracted from affective lexica. Our analysis shows that individuals diagnosed with depression use affective language to a greater extent than not-depressed. Our experiments show that external affective information improves the performance of the proposed architecture in the General Psychotherapy Corpus and the DAIC-WoZ 2017 depression datasets, achieving state-of-the-art 71.6 and 68.6 F1 scores respectively.
摘要:在这项工作中,我们提出了一个从转写的临床访谈中检测抑郁症的机器学习模型。抑郁症是一种精神障碍,它不仅影响患者的情绪,也影响其语言使用。为此,我们使用层级注意力网络(Hierarchical Attention Network)对抑郁受试者的访谈进行分类,并在模型的注意力层中加入一种基于语言学特征的条件机制,这些特征提取自情感词典。我们的分析表明,被诊断为抑郁症的个体比非抑郁个体更多地使用情感性语言。实验表明,外部情感信息提升了所提架构在General Psychotherapy Corpus和DAIC-WoZ 2017抑郁数据集上的性能,分别取得了71.6和68.6的最新F1分数。
8. A Dataset and Benchmarks for Multimedia Social Analysis [PDF] 返回目录
Bofan Xue, David Chan, John Canny
Abstract: We present a new publicly available dataset with the goal of advancing multi-modality learning by offering vision and language data within the same context. This is achieved by obtaining data from a social media website with posts containing multiple paired images/videos and text, along with comment trees containing images/videos and/or text. With a total of 677k posts, 2.9 million post images, 488k post videos, 1.4 million comment images, 4.6 million comment videos, and 96.9 million comments, data from different modalities can be jointly used to improve performances for a variety of tasks such as image captioning, image classification, next frame prediction, sentiment analysis, and language modeling. We present a wide range of statistics for our dataset. Finally, we provide baseline performance analysis for one of the regression tasks using pre-trained models and several fully connected networks.
摘要:我们发布了一个新的公开数据集,旨在通过在同一上下文中同时提供视觉和语言数据来推动多模态学习。该数据集的数据来自一个社交媒体网站,其帖子包含多个配对的图像/视频与文本,同时还包含含有图像/视频和/或文本的评论树。数据集共有67.7万条帖子、290万张帖子图像、48.8万个帖子视频、140万张评论图像、460万个评论视频以及9690万条评论,不同模态的数据可以联合使用,以提升图像描述、图像分类、下一帧预测、情感分析和语言建模等多种任务的性能。我们给出了该数据集的各类统计信息,并针对其中一个回归任务,使用预训练模型和若干全连接网络给出了基线性能分析。
9. StackOverflow vs Kaggle: A Study of Developer Discussions About Data Science [PDF] 返回目录
David Hin
Abstract: Software developers are increasingly required to understand fundamental Data science (DS) concepts. Recently, the presence of machine learning (ML) and deep learning (DL) has dramatically increased in the development of user applications, whether they are leveraged through frameworks or implemented from scratch. These topics attract much discussion on online platforms. This paper conducts large-scale qualitative and quantitative experiments to study the characteristics of 197836 posts from StackOverflow and Kaggle. Latent Dirichlet Allocation topic modelling is used to extract twenty-four DS discussion topics. The main findings include that TensorFlow-related topics were most prevalent in StackOverflow, while meta discussion topics were the prevalent ones on Kaggle. StackOverflow tends to include lower-level troubleshooting, while Kaggle focuses on practicality and optimising leaderboard performance. In addition, across both communities, DS discussion is increasing at a dramatic rate. While TensorFlow discussion on StackOverflow is slowing, interest in Keras is rising. Finally, ensemble algorithms are the most mentioned ML/DL algorithms in Kaggle but are rarely discussed on StackOverflow. These findings can help educators and researchers to more effectively tailor and prioritise efforts in researching and communicating DS concepts towards different developer communities.
摘要:软件开发者越来越需要理解数据科学(DS)的基本概念。近年来,无论是借助框架还是从零实现,机器学习(ML)和深度学习(DL)在用户应用开发中的出现频率都大幅增加,这些主题也在各类在线平台上引发了大量讨论。本文开展了大规模的定性与定量实验,研究来自StackOverflow和Kaggle的197836条帖子的特点,并使用隐含狄利克雷分配(Latent Dirichlet Allocation)主题模型提取了24个数据科学讨论主题。主要发现包括:与TensorFlow相关的主题在StackOverflow上最为普遍,而元讨论类主题在Kaggle上最为常见;StackOverflow上的讨论偏向较底层的故障排查,而Kaggle则更关注实用性和排行榜成绩的优化。此外,在两个社区中,数据科学相关讨论都在迅速增长;StackOverflow上关于TensorFlow的讨论增速放缓,而对Keras的兴趣正在上升。最后,集成算法是Kaggle上被提及最多的ML/DL算法,但在StackOverflow上却很少被讨论。这些发现可以帮助教育者和研究者更有针对性地面向不同的开发者社区研究和传播数据科学概念。
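The study extracts 24 discussion topics with Latent Dirichlet Allocation. Below is a minimal scikit-learn sketch of that step; the toy corpus, preprocessing choices, and hyperparameters are illustrative assumptions, not the paper's configuration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

posts = [
    "tensorflow shape mismatch error in conv2d layer",
    "how to improve leaderboard score with xgboost ensembling",
    "keras model fit loss not decreasing",
]  # stand-in for the 197836 StackOverflow/Kaggle posts

vectorizer = CountVectorizer(stop_words="english", max_df=0.95, min_df=1)
doc_term = vectorizer.fit_transform(posts)

lda = LatentDirichletAllocation(n_components=24, random_state=0)
lda.fit(doc_term)

# Top words per topic, the usual way such topics are inspected and labelled.
vocab = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_[:3]):
    top = [vocab[i] for i in weights.argsort()[::-1][:5]]
    print(f"topic {k}: {', '.join(top)}")
```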
10. An Augmented Translation Technique for low Resource language pair: Sanskrit to Hindi translation [PDF] 返回目录
Rashi Kumar, Piyush Jha, Vineet Sahula
Abstract: Neural Machine Translation (NMT) is an ongoing technique for Machine Translation (MT) using enormous artificial neural network. It has exhibited promising outcomes and has shown incredible potential in solving challenging machine translation exercises. One such exercise is the best approach to furnish great MT to language sets with a little preparing information. In this work, Zero Shot Translation (ZST) is inspected for a low resource language pair. By working on high resource language pairs for which benchmarks are available, namely Spanish to Portuguese, and training on data sets (Spanish-English and English-Portuguese) we prepare a state of proof for ZST system that gives appropriate results on the available data. Subsequently the same architecture is tested for Sanskrit to Hindi translation for which data is sparse, by training the model on English-Hindi and Sanskrit-English language pairs. In order to prepare and decipher with ZST system, we broaden the preparation and interpretation pipelines of NMT seq2seq model in tensorflow, incorporating ZST features. Dimensionality reduction of word embedding is performed to reduce the memory usage for data storage and to achieve a faster training and translation cycles. In this work existing helpful technology has been utilized in an imaginative manner to execute our NLP issue of Sanskrit to Hindi translation. A Sanskrit-Hindi parallel corpus of 300 is constructed for testing. The data required for the construction of parallel corpus has been taken from the telecasted news, published on Department of Public Information, state government of Madhya Pradesh, India website.
摘要:神经机器翻译(NMT)是利用大规模人工神经网络进行机器翻译(MT)的一项持续发展的技术,它已展现出可观的成果,并在解决具有挑战性的机器翻译任务方面显示出巨大潜力。其中一项任务,就是为训练数据很少的语言对提供高质量的机器翻译。在这项工作中,我们针对一个低资源语言对考察了零样本翻译(Zero Shot Translation, ZST)。我们首先在有基准可用的高资源语言对(西班牙语到葡萄牙语)上开展工作,利用西班牙语-英语和英语-葡萄牙语数据集进行训练,验证ZST系统能够在现有数据上给出合理结果;随后,将同一架构用于数据稀疏的梵语到印地语翻译,通过在英语-印地语和梵语-英语语言对上训练模型进行测试。为了使用ZST系统进行训练和解码,我们在TensorFlow中扩展了NMT seq2seq模型的训练与推理流程,加入了ZST相关功能,并对词向量进行降维,以减少数据存储的内存占用并加快训练和翻译周期。在这项工作中,我们以富有创造性的方式利用现有的实用技术来解决梵语到印地语翻译这一NLP问题。我们构建了一个规模为300的梵语-印地语平行语料库用于测试,构建该平行语料库所需的数据取自印度中央邦政府公共信息部网站上发布的电视新闻。
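One concrete step in this abstract is dimensionality reduction of the word embeddings to cut memory usage and speed up training. A small sketch of that idea with PCA follows; the original embedding size, the target size, and the use of PCA itself are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

vocab_size, original_dim, reduced_dim = 5000, 300, 100

# Stand-in for a pretrained embedding matrix (rows = words, columns = dimensions).
embeddings = np.random.randn(vocab_size, original_dim).astype(np.float32)

pca = PCA(n_components=reduced_dim)
reduced = pca.fit_transform(embeddings).astype(np.float32)

print(embeddings.nbytes // 1024, "KiB ->", reduced.nbytes // 1024, "KiB")
print("variance kept: %.2f" % pca.explained_variance_ratio_.sum())
```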
11. Probing Neural Dialog Models for Conversational Understanding [PDF] 返回目录
Abdelrhman Saleh, Tovly Deutsch, Stephen Casper, Yonatan Belinkov, Stuart Shieber
Abstract: The predominant approach to open-domain dialog generation relies on end-to-end training of neural models on chat datasets. However, this approach provides little insight as to what these models learn (or do not learn) about engaging in dialog. In this study, we analyze the internal representations learned by neural open-domain dialog systems and evaluate the quality of these representations for learning basic conversational skills. Our results suggest that standard open-domain dialog systems struggle with answering questions, inferring contradiction, and determining the topic of conversation, among other tasks. We also find that the dyadic, turn-taking nature of dialog is not fully leveraged by these models. By exploring these limitations, we highlight the need for additional research into architectures and training methods that can better capture high-level information about dialog.
摘要:开放域对话生成的主流方法依赖于在聊天数据集上对神经模型进行端到端训练。然而,这种方法几乎无法揭示这些模型在对话方面学到了什么(或没有学到什么)。在这项研究中,我们分析神经开放域对话系统学到的内部表示,并评估这些表示在学习基本对话技能方面的质量。结果表明,标准的开放域对话系统在回答问题、推断矛盾、判断对话主题等任务上都表现欠佳。我们还发现,这些模型并未充分利用对话双方轮流发言的特性。通过揭示这些局限,我们强调需要进一步研究能够更好捕捉对话高层信息的架构和训练方法。
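Probing internal representations typically means freezing the dialog model, extracting hidden states, and training a small classifier to predict a conversational property from them. The sketch below shows that generic recipe with random stand-in features and logistic regression; the probing task, feature extraction, and classifier choice are assumptions, not the paper's exact setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stand-in for hidden states extracted from a frozen dialog model (one vector per dialog turn).
hidden_states = np.random.randn(400, 512)
# Stand-in labels for a probing task, e.g. "is this turn a question?".
labels = np.random.randint(0, 2, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, labels, test_size=0.25, random_state=0)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
# Probe accuracy reflects how much of the property is linearly recoverable from the representations.
print("probe accuracy:", probe.score(X_test, y_test))
```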
12. ETHOS: an Online Hate Speech Detection Dataset [PDF] 返回目录
Ioannis Mollas, Zoe Chrysopoulou, Stamatis Karlos, Grigorios Tsoumakas
Abstract: Online hate speech is a newborn problem in our modern society which is growing at a steady rate exploiting weaknesses of the corresponding regimes that characterise several social media platforms. Therefore, this phenomenon is mainly cultivated through such comments, either during users' interaction or on posted multimedia context. Nowadays, giant companies own platforms where many millions of users log in daily. Thus, protection of their users from exposure to similar phenomena for keeping up with the corresponding law, as well as for retaining a high quality of offered services, seems mandatory. Having a robust and reliable mechanism for identifying and preventing the uploading of related material would have a huge effect on our society regarding several aspects of our daily life. On the other hand, its absence would deteriorate heavily the total user experience, while its erroneous operation might raise several ethical issues. In this work, we present a protocol for creating a more suitable dataset, regarding its both informativeness and representativeness aspects, favouring the safer capture of hate speech occurrence, without at the same time restricting its applicability to other classification problems. Moreover, we produce and publish a textual dataset with two variants: binary and multi-label, called `ETHOS', based on YouTube and Reddit comments validated through figure-eight crowdsourcing platform. Our assumption about the production of more compatible datasets is further investigated by applying various classification models and recording their behaviour over several appropriate metrics.
摘要:网络仇恨言论是现代社会中新出现的问题,它利用多个社交媒体平台管理机制的薄弱环节而持续蔓延。这一现象主要通过用户交互或多媒体内容下的评论滋生。如今,大型公司运营的平台每天有数以百万计的用户登录。因此,为了遵守相关法律并保持所提供服务的高质量,保护用户免受此类内容的侵扰显得十分必要。建立一种稳健可靠的机制来识别并阻止相关内容的上传,将对我们日常生活的多个方面产生重大影响;反之,缺乏这种机制会严重损害整体用户体验,而机制的误判也可能引发伦理问题。在这项工作中,我们提出了一套构建更合适数据集的流程,兼顾信息量与代表性两个方面,使仇恨言论的采集更加可靠,同时不限制其在其他分类问题上的适用性。此外,我们构建并发布了一个包含二分类与多标签两个版本的文本数据集,称为"ETHOS",其数据来自YouTube和Reddit评论,并通过Figure Eight众包平台进行了验证。我们关于构建更兼容数据集的设想,通过应用多种分类模型并在若干合适的评价指标上记录其表现得到了进一步检验。
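ETHOS comes in a binary and a multi-label variant, so a natural baseline evaluation looks like the sketch below: TF-IDF features plus a linear classifier for the binary task and a one-vs-rest wrapper for the multi-label task. The feature choice, the model, and the toy labels are assumptions for illustration, not the paper's experimental setup.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

comments = ["toy comment one", "toy comment two", "toy comment three", "toy comment four"]
binary_labels = np.array([0, 1, 0, 1])                      # hate speech: no / yes
multi_labels = np.array([[0, 0], [1, 0], [0, 0], [1, 1]])   # two hypothetical label columns

features = TfidfVectorizer().fit_transform(comments)

# Binary variant of the dataset.
binary_clf = LogisticRegression(max_iter=1000).fit(features, binary_labels)

# Multi-label variant: one independent classifier per label column.
multi_clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(features, multi_labels)

print(binary_clf.predict(features), multi_clf.predict(features))
```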
13. Fine-grained Human Evaluation of Transformer and Recurrent Approaches to Neural Machine Translation for English-to-Chinese [PDF] 返回目录
Yuying Ye, Antonio Toral
Abstract: This research presents a fine-grained human evaluation to compare the Transformer and recurrent approaches to neural machine translation (MT), on the translation direction English-to-Chinese. To this end, we develop an error taxonomy compliant with the Multidimensional Quality Metrics (MQM) framework that is customised to the relevant phenomena of this translation direction. We then conduct an error annotation using this customised error taxonomy on the output of state-of-the-art recurrent- and Transformer-based MT systems on a subset of WMT2019's news test set. The resulting annotation shows that, compared to the best recurrent system, the best Transformer system results in a 31% reduction of the total number of errors and it produced significantly less errors in 10 out of 22 error categories. We also note that two of the systems evaluated do not produce any error for a category that was relevant for this translation direction prior to the advent of NMT systems: Chinese classifiers.
摘要:本研究通过细粒度人工评测,在英译中方向上比较基于Transformer和基于循环神经网络的神经机器翻译(MT)方法。为此,我们构建了一个符合多维质量度量(MQM)框架的错误分类体系,并针对该翻译方向的相关语言现象进行了定制。随后,我们使用这一定制的错误分类体系,对最先进的循环与Transformer机器翻译系统在WMT2019新闻测试集子集上的输出进行了错误标注。标注结果显示,与最佳循环系统相比,最佳Transformer系统的错误总数减少了31%,并且在22个错误类别中的10个上产生的错误显著更少。我们还注意到,对于中文量词这一在NMT系统出现之前就与该翻译方向密切相关的错误类别,参评系统中有两个没有产生任何错误。
14. On the Multi-Property Extraction and Beyond [PDF] 返回目录
Tomasz Dwojak, Michał Pietruszka, Łukasz Borchmann, Filip Graliński, Jakub Chłędowski
Abstract: In this paper, we investigate the Dual-source Transformer architecture on the WikiReading information extraction and machine reading comprehension dataset. The proposed model outperforms the current state-of-the-art by a large margin. Next, we introduce WikiReading Recycled - a newly developed public dataset, supporting the task of multiple property extraction. It keeps the spirit of the original WikiReading but does not inherit the identified disadvantages of its predecessor.
摘要:在本文中,我们在WikiReading信息抽取与机器阅读理解数据集上研究了双源Transformer架构,所提出的模型大幅超越了当前最先进的水平。此外,我们推出了WikiReading Recycled——一个新构建的公开数据集,支持多属性抽取任务;它保留了原始WikiReading的核心思想,但避免了其前身已知的缺点。
15. Extracting N-ary Cross-sentence Relations using Constrained Subsequence Kernel [PDF] 返回目录
Sachin Pawar, Pushpak Bhattacharyya, Girish K. Palshikar
Abstract: Most of the past work in relation extraction deals with relations occurring within a sentence and having only two entity arguments. We propose a new formulation of the relation extraction task where the relations are more general than intra-sentence relations in the sense that they may span multiple sentences and may have more than two arguments. Moreover, the relations are more specific than corpus-level relations in the sense that their scope is limited only within a document and not valid globally throughout the corpus. We propose a novel sequence representation to characterize instances of such relations. We then explore various classifiers whose features are derived from this sequence representation. For SVM classifier, we design a Constrained Subsequence Kernel which is a variant of Generalized Subsequence Kernel. We evaluate our approach on three datasets across two domains: biomedical and general domain.
摘要:以往的关系抽取工作大多处理发生在单个句子内部、且只有两个实体论元的关系。我们提出了关系抽取任务的一种新形式:其中的关系比句内关系更为一般,可以跨越多个句子,并且可以拥有两个以上的论元;同时,这些关系又比语料库级关系更为具体,其作用范围仅限于单篇文档之内,而不在整个语料库中全局成立。我们提出了一种新的序列表示来刻画此类关系的实例,并探索了多种以该序列表示为特征来源的分类器。针对SVM分类器,我们设计了一种受约束子序列核(Constrained Subsequence Kernel),它是广义子序列核的一个变体。我们在生物医学和通用两个领域的三个数据集上评估了所提方法。
16. Evidence-Aware Inferential Text Generation with Vector Quantised Variational AutoEncoder [PDF] 返回目录
Daya Guo, Duyu Tang, Nan Duan, Jian Yin, Daxin Jiang, Ming Zhou
Abstract: Generating inferential texts about an event in different perspectives requires reasoning over different contexts that the event occurs. Existing works usually ignore the context that is not explicitly provided, resulting in a context-independent semantic representation that struggles to support the generation. To address this, we propose an approach that automatically finds evidence for an event from a large text corpus, and leverages the evidence to guide the generation of inferential texts. Our approach works in an encoder-decoder manner and is equipped with a Vector Quantised-Variational Autoencoder, where the encoder outputs representations from a distribution over discrete variables. Such discrete representations enable automatically selecting relevant evidence, which not only facilitates evidence-aware generation, but also provides a natural way to uncover rationales behind the generation. Our approach provides state-of-the-art performance on both Event2Mind and ATOMIC datasets. More importantly, we find that with discrete representations, our model selectively uses evidence to generate different inferential texts.
摘要:要从不同视角生成关于某一事件的推理文本,需要针对该事件发生的不同语境进行推理。现有工作通常忽略了没有显式给出的语境,导致得到与语境无关的语义表示,难以支撑文本生成。为此,我们提出了一种方法:从大规模文本语料库中自动为事件寻找证据,并利用这些证据来指导推理文本的生成。我们的方法采用编码器-解码器结构,并配备了向量量化变分自编码器(Vector Quantised-Variational Autoencoder),其编码器输出的表示来自离散变量上的分布。这种离散表示使得模型能够自动选择相关证据,不仅有助于证据感知的生成,还为揭示生成背后的依据提供了自然途径。我们的方法在Event2Mind和ATOMIC两个数据集上都取得了最先进的性能。更重要的是,我们发现借助离散表示,模型能够有选择地利用证据来生成不同的推理文本。
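The central mechanism here is the vector quantisation step of the VQ-VAE: each encoder output is snapped to its nearest codebook entry, giving a discrete latent code. A minimal PyTorch sketch of that lookup follows; the codebook size, dimensions, and the straight-through trick shown are generic VQ-VAE practice, not details taken from this paper.

```python
import torch

def vector_quantise(z_e: torch.Tensor, codebook: torch.Tensor):
    """Map each encoder vector to its nearest codebook entry.
    z_e: (batch, dim), codebook: (num_codes, dim). Returns quantised vectors and code indices."""
    distances = torch.cdist(z_e, codebook)          # (batch, num_codes) Euclidean distances
    indices = distances.argmin(dim=1)               # discrete latent code per example
    z_q = codebook[indices]
    # Straight-through estimator: gradients flow back to the encoder as if z_q were z_e.
    z_q = z_e + (z_q - z_e).detach()
    return z_q, indices

codebook = torch.randn(512, 256)                    # 512 discrete codes of dimension 256
encoder_out = torch.randn(8, 256)
z_q, codes = vector_quantise(encoder_out, codebook)
print(codes.tolist())
```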
17. FinBERT: A Pretrained Language Model for Financial Communications [PDF] 返回目录
Yi Yang, Mark Christopher Siy UY, Allen Huang
Abstract: Contextual pretrained language models, such as BERT (Devlin et al., 2019), have made significant breakthrough in various NLP tasks by training on large scale of unlabeled text re-sources.Financial sector also accumulates large amount of financial communication text.However, there is no pretrained finance specific language models available. In this work,we address the need by pretraining a financial domain specific BERT models, FinBERT, using a large scale of financial communication corpora. Experiments on three financial sentiment classification tasks confirm the advantage of FinBERT over generic domain BERT model. The code and pretrained models are available at this https URL. We hope this will be useful for practitioners and researchers working on financial NLP tasks.
摘要:以BERT(Devlin et al., 2019)为代表的上下文预训练语言模型,通过在大规模无标注文本上训练,在各类NLP任务中取得了显著突破。金融行业同样积累了大量金融交流文本,然而目前尚没有面向金融领域的预训练语言模型。在这项工作中,我们使用大规模金融交流语料预训练了金融领域专用的BERT模型FinBERT,以满足这一需求。在三个金融情感分类任务上的实验证实了FinBERT相对于通用领域BERT模型的优势。代码和预训练模型可在此https URL获得。我们希望这项工作对从事金融NLP任务的从业者和研究者有所帮助。
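FinBERT is distributed as a pretrained checkpoint, so downstream use would follow the standard Hugging Face pattern sketched below. The checkpoint identifier is a placeholder (the real name should be taken from the linked repository), and the three-way sentiment labels are an assumption for illustration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder identifier: substitute the actual FinBERT checkpoint from the authors' repository.
CHECKPOINT = "path/or/hub-id-of-finbert"

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT, num_labels=3)

sentence = "The company reported a sharp decline in quarterly revenue."
inputs = tokenizer(sentence, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

# Assumed label order for illustration only; check the model card for the real mapping.
labels = ["negative", "neutral", "positive"]
print(labels[int(logits.argmax(dim=-1))])
```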
18. FinEst BERT and CroSloEngual BERT: less is more in multilingual models [PDF] 返回目录
Matej Ulčar, Marko Robnik-Šikonja
Abstract: Large pretrained masked language models have become state-of-the-art solutions for many NLP problems. The research has been mostly focused on English language, though. While massively multilingual models exist, studies have shown that monolingual models produce much better results. We train two trilingual BERT-like models, one for Finnish, Estonian, and English, the other for Croatian, Slovenian, and English. We evaluate their performance on several downstream tasks, NER, POS-tagging, and dependency parsing, using the multilingual BERT and XLM-R as baselines. The newly created FinEst BERT and CroSloEngual BERT improve the results on all tasks in most monolingual and cross-lingual situations
摘要:大规模预训练掩码语言模型已成为众多NLP问题的最先进解决方案,但相关研究大多集中在英语上。虽然已有覆盖大量语言的多语言模型,研究表明单语模型的效果要好得多。我们训练了两个类BERT的三语模型:一个面向芬兰语、爱沙尼亚语和英语,另一个面向克罗地亚语、斯洛文尼亚语和英语。我们以多语言BERT和XLM-R为基线,在命名实体识别、词性标注和依存句法分析等多个下游任务上评估了它们的性能。新构建的FinEst BERT和CroSloEngual BERT在绝大多数单语和跨语言场景下都提升了所有任务的结果。
19. Vietnamese Word Segmentation with SVM: Ambiguity Reduction and Suffix Capture [PDF] 返回目录
Duc-Vu Nguyen, Dang Van Thin, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Abstract: In this paper, we approach Vietnamese word segmentation as a binary classification by using the Support Vector Machine classifier. We inherit features from prior works such as n-gram of syllables, n-gram of syllable types, and checking conjunction of adjacent syllables in the dictionary. We propose two novel ways to feature extraction, one to reduce the overlap ambiguity and the other to increase the ability to predict unknown words containing suffixes. Different from UETsegmenter and RDRsegmenter, two state-of-the-art Vietnamese word segmentation methods, we do not employ the longest matching algorithm as an initial processing step or any post-processing technique. According to experimental results on benchmark Vietnamese datasets, our proposed method obtained a better F1-score than the prior state-of-the-art methods UETsegmenter, and RDRsegmenter.
摘要:在本文中,我们通过使用支持向量机分类接近越南分词作为一个二元分类。我们沿用现有的作品如音节的n-gram中,n-gram中的音节种类,以及在字典中相邻的音节检查结合特征。我们提出了两种新的方式来特征提取,一个以减少重叠模糊性和另增加预测包含后缀生词的能力。从UETsegmenter和RDRsegmenter两个国家的最先进的越南字分割方法不同,我们不采用最长匹配算法作为初始处理步骤或任何后处理技术。据对基准数据集越南实验结果,我们提出的方法而得到的较好的F1-得分比现有状态的最先进的方法UETsegmenter,和RDRsegmenter。
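The boundary-classification formulation above can be sketched as follows with scikit-learn; the feature templates, the tiny dictionary, and the training pairs are illustrative assumptions rather than the paper's actual feature set.

```python
# Minimal sketch of boundary classification between adjacent syllables with an SVM.
# Features and training pairs are toy illustrations, not the paper's feature set.
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC

DICTIONARY = {"học sinh", "sinh viên"}          # hypothetical word list

def features(prev_syl, next_syl):
    return {
        "prev=" + prev_syl.lower(): 1,
        "next=" + next_syl.lower(): 1,
        "bigram=" + prev_syl.lower() + "_" + next_syl.lower(): 1,
        "in_dict": int((prev_syl + " " + next_syl).lower() in DICTIONARY),
        "next_capitalized": int(next_syl[:1].isupper()),
    }

# label 1 = the two syllables belong to the same word, 0 = word boundary
pairs = [("học", "sinh", 1), ("sinh", "viên", 1), ("tôi", "là", 0), ("là", "một", 0)]
X = [features(a, b) for a, b, _ in pairs]
y = [lab for _, _, lab in pairs]

vec = DictVectorizer()
clf = LinearSVC().fit(vec.fit_transform(X), y)
print(clf.predict(vec.transform([features("học", "sinh")])))
```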
20. Through the Twitter Glass: Detecting Questions in Micro-Text [PDF] 返回目录
Kyle Dent, Sharoda Paul
Abstract: In a separate study, we were interested in understanding people's Q&A habits on Twitter. Finding questions within Twitter turned out to be a difficult challenge, so we considered applying some traditional NLP approaches to the problem. On the one hand, Twitter is full of idiosyncrasies, which make processing it difficult. On the other, it is very restricted in length and tends to employ simple syntactic constructions, which could help the performance of NLP processing. In order to find out the viability of NLP and Twitter, we built a pipeline of tools to work specifically with Twitter input for the task of finding questions in tweets. This work is still preliminary, but in this paper we discuss the techniques we used and the lessons we learned.
摘要:在另一项研究中,我们有兴趣了解在Twitter上人们的Q&A的习惯。内的Twitter查找问题竟然是一个艰巨的挑战,所以我们考虑应用一些传统的NLP方法的问题。在一方面,Twitter是完全的特质,这使得加工困难的。另一方面,这是非常长的限制,往往采用简单的句法结构,这将有助于NLP处理的性能。为了找出NLP和Twitter的可行性,我们建立了一个工具管道与Twitter输入工作专门用于查找鸣叫问题的任务。这项工作还是初步的,但在本文中,我们讨论我们使用的技术和我们的经验教训。
21. Transferring Monolingual Model to Low-Resource Language: The Case of Tigrinya [PDF] 返回目录
Abrhalei Tela, Abraham Woubie, Ville Hautamaki
Abstract: In recent years, transformer models have achieved great success in natural language processing (NLP) tasks. Most of the current state-of-the-art NLP results are achieved by using monolingual transformer models, where the model is pre-trained using a single-language unlabelled text corpus. Then, the model is fine-tuned to the specific downstream task. However, the cost of pre-training a new transformer model is high for most languages. In this work, we propose a cost-effective transfer learning method to adopt a strong source language model, trained on a large monolingual corpus, for a low-resource language. Thus, using the XLNet language model, we demonstrate competitive performance with mBERT and a pre-trained target-language model on the cross-lingual sentiment (CLS) dataset and on a new sentiment analysis dataset for the low-resourced language Tigrinya. With only 10k examples of the given Tigrinya sentiment analysis dataset, English XLNet achieves an F1-score of 78.88%, outperforming BERT and mBERT by 10% and 7%, respectively. More interestingly, fine-tuning the (English) XLNet model on the CLS dataset yields promising results compared to mBERT and even outperforms mBERT on one dataset of the Japanese language.
摘要:近年来,变压器模型已经实现了自然语言处理(NLP)任务取得圆满成功。目前大多数国家的最先进的自然语言处理的结果通过使用单语变压器模型,其中对模型进行预先训练使用单一语言的无标记文本语料库来实现的。然后,该模型被微调的具体任务下游。然而,岗前培训,新的变压器模型的成本是很高的大多数语言。在这项工作中,我们提出了一种经济有效的转移学习方法采用强大的源语言模式,从一个大的单语语料库的低资源语言培训。因此,使用XLNet语言模型,我们证明与mBERT竞争力的性能和对跨语言情绪预先训练目标语言模型(CLS)的数据集和对资源不足地区的语言提格里尼亚一个新的情感分析数据集。只有10K给定蒂格里亚情绪分析数据集的例子,英国XLNet取得78.88%的F1-得分优于BERT和mBERT由分别为10%和7%。更有趣的是,在数据集已经承诺相比mBERT结果,甚至跑赢mBERT为日语的一个数据集的CLS微调(英文)XLNet模型。
22. Words ranking and Hirsch index for identifying the core of the hapaxes in political texts [PDF] 返回目录
Valerio Ficcadenti, Roy Cerqueti, Marcel Ausloos, Gurjeet Dhesi
Abstract: This paper deals with a quantitative analysis of the content of official political speeches. We study a set of about one thousand talks pronounced by the US Presidents, ranging from Washington to Trump. In particular, we search for the relevance of the rare words, i.e. those said only once in each speech -- the so-called hapaxes. We implement a rank-size procedure of Zipf-Mandelbrot type for discussing the hapaxes' frequencies regularity over the overall set of speeches. Starting from the obtained rank-size law, we define and detect the core of the hapaxes set by means of a procedure based on an Hirsch index variant. We discuss the resulting list of words in the light of the overall US Presidents' speeches. We further show that this core of hapaxes itself can be well fitted through a Zipf-Mandelbrot law and that contains elements producing deviations at the low ranks between scatter plots and fitted curve -- the so-called king and vice-roy effect. Some socio-political insights are derived from the obtained findings about the US Presidents messages.
摘要:本文以官方的政治演讲的内容的定量分析交易。我们研究了一套由美国总统,从华盛顿到特朗普宣布大约有一千会谈。特别是,我们寻找的生僻字的相关性,即那些在每个演讲中说只有一次 - 即所谓的hapaxes。我们实行齐普夫 - 曼德尔布罗型的等级规模程序在总体组发言讨论hapaxes'频率的规律性。从所获得的等级规模规律出发,我们定义和检测由基于一个赫希指数变异的程序来设定的hapaxes的核心。我们讨论了总体美国总统的发言的光所产生的单词列表。进一步的研究表明这个核心hapaxes本身可以通过齐普夫 - 曼德尔布罗定律很好地表达和包含在生产散点图和拟合曲线之间的低等级偏差元素 - 即所谓的国王和副罗伊效果。一些社会政治观点来源于对美国总统的消息所获得的调查结果中得出。
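The hapax-plus-Hirsch-index idea can be illustrated with a short sketch; the toy speeches and the exact core-selection rule are assumptions, not the paper's procedure.

```python
# Illustrative sketch (not the paper's exact procedure): collect hapaxes per speech,
# rank them by how often they appear as hapaxes across the corpus, and compute an
# h-index-style cutoff for the "core" of that ranking.
from collections import Counter

speeches = [
    "we the people of the union",
    "the union must endure and the people will prevail",
    "people ask what the union can do",
]

hapax_counts = Counter()
for text in speeches:
    freq = Counter(text.split())
    hapax_counts.update(w for w, c in freq.items() if c == 1)  # hapaxes of this speech

ranked = hapax_counts.most_common()            # rank-size ordering

def hirsch_index(counts):
    """Largest h such that at least h items occur at least h times."""
    values = sorted(counts, reverse=True)
    return sum(1 for rank, c in enumerate(values, start=1) if c >= rank)

h = hirsch_index(hapax_counts.values())
core = ranked[:h]
print(h, core)
```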
23. GIPFA: Generating IPA Pronunciation from Audio [PDF] 返回目录
Xavier Marjou
Abstract: Transcribing spoken audio samples into International Phonetic Alphabet (IPA) has long been reserved for experts. In this study, we instead examined the use of an Artificial Neural Network (ANN) model to automatically extract the IPA pronunciation of a word based on its audio pronunciation, hence its name Generating IPA Pronunciation From Audio (GIPFA). Based on the French Wikimedia dictionary, we trained our model which then correctly predicted 75% of the IPA pronunciations tested. Interestingly, by studying inference errors, the model made it possible to highlight possible errors in the dataset as well as identifying the closest phonemes in French.
摘要:抄写口语音频样本插入到国际音标(IPA)一直被保留的专家。在这项研究中,我们不是研究利用人工神经网络(ANN)模型自动提取基于其音频发音单词的国际音标的,因此它的名字生成国际音标从音频(GIPFA)。根据法国维基词典,我们训练我们的模型,然后正确预测测试的IPA发音的75%。有趣的是,通过研究推定误差,该模型使人们有可能在数据集中凸显可能出现的错误,以及确定在法国最接近的音素。
24. Do Dogs have Whiskers? A New Knowledge Base of hasPart Relations [PDF] 返回目录
Sumithra Bhakthavatsalam, Kyle Richardson, Niket Tandon, Peter Clark
Abstract: We present a new knowledge-base of hasPart relationships, extracted from a large corpus of generic statements. Complementary to other resources available, it is the first which is all three of: accurate (90% precision), salient (covers relationships a person may mention), and has high coverage of common terms (approximated as within a 10 year old's vocabulary), as well as having several times more hasPart entries than in the popular ontologies ConceptNet and WordNet. In addition, it contains information about quantifiers, argument modifiers, and links the entities to appropriate concepts in Wikipedia and WordNet. The knowledge base is available at this https URL
摘要:我们提出hasPart关系的新的知识基础,从大量语料通用报表的提取。补充其他可用资源,它是第一这是所有三个:准确(90%精度),显着(盖关系的人可能会提到),并具有通用术语的高覆盖率(近似为一个10岁的词汇表中) ,以及具有多次比流行的本体ConceptNet和共发现更多hasPart条目。此外,它包含了量词,参数调节剂的信息,以及实体链接到维基百科和共发现合适的概念。知识库可在此HTTPS URL
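A hypothetical record layout and lookup for such hasPart triples is sketched below; the real release may use a different schema, so the field names and sample entries are assumptions.

```python
# Hypothetical record layout and lookup for hasPart triples; this only illustrates
# how quantifiers, modifiers, and entity links could sit alongside each triple.
from dataclasses import dataclass

@dataclass
class HasPart:
    whole: str
    part: str
    quantifier: str = ""        # e.g. "most", "some"
    modifier: str = ""          # e.g. "long", "sensory"
    wikipedia: str = ""         # link for the part entity, if any

KB = [
    HasPart("dog", "whisker", quantifier="most", modifier="sensory",
            wikipedia="https://en.wikipedia.org/wiki/Whiskers"),
    HasPart("dog", "tail"),
    HasPart("bicycle", "wheel", quantifier="all"),
]

def parts_of(whole):
    return [t for t in KB if t.whole == whole]

print([t.part for t in parts_of("dog")])   # ['whisker', 'tail']
```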
25. A Generative Model for Joint Natural Language Understanding and Generation [PDF] 返回目录
Bo-Hsiang Tseng, Jianpeng Cheng, Yimai Fang, David Vandyke
Abstract: Natural language understanding (NLU) and natural language generation (NLG) are two fundamental and related tasks in building task-oriented dialogue systems with opposite objectives: NLU tackles the transformation from natural language to formal representations, whereas NLG does the reverse. A key to success in either task is parallel training data which is expensive to obtain at a large scale. In this work, we propose a generative model which couples NLU and NLG through a shared latent variable. This approach allows us to explore both spaces of natural language and formal representations, and facilitates information sharing through the latent space to eventually benefit NLU and NLG. Our model achieves state-of-the-art performance on two dialogue datasets with both flat and tree-structured formal representations. We also show that the model can be trained in a semi-supervised fashion by utilising unlabelled data to boost its performance.
摘要:自然语言理解(NLU)和自然语言生成(NLG)是建立面向任务的对话系统与目标相反两个基本和相关任务:NLU铲球从自然语言的形式化表示的转变,而NLG正好相反。在任一任务成功的关键是这是昂贵的,以获得在大规模并行训练数据。在这项工作中,我们提出了一个生成模型,夫妇NLU和NLG通过一个共享的潜在变量。这种方法使我们能够探索自然语言和正式交涉的两个空间,并通过潜在的空间有利于信息共享,最终受益NLU和NLG。我们的模型实现与平面和树型结构的形式化表达两人的对话数据集的国家的最先进的性能。我们还表明,该模型可以在半监督方式,利用未标记数据,以提高其性能方面的培训。
26. Measuring Forecasting Skill from Text [PDF] 返回目录
Shi Zong, Alan Ritter, Eduard Hovy
Abstract: People vary in their ability to make accurate predictions about the future. Prior studies have shown that some individuals can predict the outcome of future events with consistently better accuracy. This leads to a natural question: what makes some forecasters better than others? In this paper we explore connections between the language people use to describe their predictions and their forecasting skill. Datasets from two different forecasting domains are explored: (1) geopolitical forecasts from Good Judgment Open, an online prediction forum and (2) a corpus of company earnings forecasts made by financial analysts. We present a number of linguistic metrics which are computed over text associated with people's predictions about the future including: uncertainty, readability, and emotion. By studying linguistic factors associated with predictions, we are able to shed some light on the approach taken by skilled forecasters. Furthermore, we demonstrate that it is possible to accurately predict forecasting skill using a model that is based solely on language. This could potentially be useful for identifying accurate predictions or potentially skilled forecasters earlier.
摘要:人们在他们对未来做出准确的预测能力有所不同。此前的研究表明,有些人可以预测未来事件的结果具有一致的更高的精度。这导致了一个自然的问题:是什么让一些预测比别人做得更好?在本文中,我们探讨人们用来形容他们的预测和预报的技术语言之间的连接。从两个不同的预测域数据集进行了探讨:(1)好的判断力开放,在线预测论坛和(2)的金融分析员的公司的盈利预测语料库的地缘政治预测。我们提出了一些其在计算与人们对未来的预测,包括相关的文本语言的指标,即不确定性,可读性和情感。通过研究与预测相关语言因素,我们能够摆脱对技术预测所采取的做法的一些情况。此外,我们证明它可以使用完全基于语言的模型精确地预测预报技术。这可能是用于识别准确的预测或更早潜在熟练预测有用。
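The general recipe of turning forecast text into linguistic features and predicting skill can be sketched as follows; the word lists, texts, and labels are invented for illustration and are not the paper's metrics.

```python
# Toy sketch: a few crude linguistic features per forecast (uncertainty rate,
# emotion rate, a readability proxy) fed to a classifier of forecaster skill.
import numpy as np
from sklearn.linear_model import LogisticRegression

UNCERTAIN = {"might", "may", "possibly", "unlikely", "perhaps"}
EMOTION = {"worried", "confident", "fear", "hope", "excited"}

def featurize(text):
    tokens = text.lower().split()
    n = max(len(tokens), 1)
    return [
        sum(t in UNCERTAIN for t in tokens) / n,    # hedging rate
        sum(t in EMOTION for t in tokens) / n,      # emotion rate
        float(np.mean([len(t) for t in tokens])),   # crude readability proxy
    ]

forecasts = ["The ceasefire might possibly hold through March",
             "I am confident the deal closes this quarter",
             "Earnings may perhaps miss, hard to say",
             "Revenue will grow ten percent"]
skilled = [1, 1, 0, 0]                               # hypothetical labels

X = np.array([featurize(t) for t in forecasts])
clf = LogisticRegression().fit(X, skilled)
print(clf.predict(X))
```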
27. Evaluating a Multi-sense Definition Generation Model for Multiple Languages [PDF] 返回目录
Arman Kabiri, Paul Cook
Abstract: Most prior work on definition modeling has not accounted for polysemy, or has done so by considering definition modeling for a target word in a given context. In contrast, in this study, we propose a context-agnostic approach to definition modeling, based on multi-sense word embeddings, that is capable of generating multiple definitions for a target word. Furthermore, in contrast to most prior work, which has primarily focused on English, we evaluate our proposed approach on fifteen different datasets covering nine languages from several language families. To evaluate our approach we consider several variations of BLEU. Our results demonstrate that our proposed multi-sense model outperforms a single-sense model on all fifteen datasets.
摘要:定义建模大多数现有的工作还没有占多义词,或者已经考虑定义建模在特定情况下的目标字这样做。相反,在这项研究中,我们提出了一个上下文无关的方法来定义建模的基础上,多义词的嵌入,能够生成多个定义为一个目标词的。在进一步的,与大多数以前的工作,其中主要集中在英国,我们评估的十五种不同的数据集涵盖九种语言从几个语系提出的方法。为了评估我们的做法,我们认为BLEU的几个变化。我们的研究结果表明,我们提出的多义模型优于在所有十五集单感模式。
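For reference, this is how a generated definition could be scored against reference glosses with one BLEU variant (NLTK's sentence-level BLEU with smoothing); the texts are invented examples.

```python
# Minimal sketch of scoring a generated definition against reference glosses with
# BLEU (one of several BLEU variants one could use); texts are invented examples.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

references = [
    "a domesticated carnivorous mammal kept as a pet".split(),
    "a common four legged animal kept by people as a pet".split(),
]
hypothesis = "a domesticated animal kept as a pet".split()

smooth = SmoothingFunction().method1      # avoids zero scores on short strings
score = sentence_bleu(references, hypothesis, smoothing_function=smooth)
print(round(score, 3))
```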
28. Regularized Forward-Backward Decoder for Attention Models [PDF] 返回目录
Tobias Watzel, Ludwig Kürzinger, Lujun Li, Gerhard Rigoll
Abstract: Nowadays, attention models are among the popular candidates for speech recognition. So far, many studies have mainly focused on the encoder structure or the attention module to enhance the performance of these models. However, they mostly ignore the decoder. In this paper, we propose a novel regularization technique incorporating a second decoder during the training phase. This decoder is optimized on time-reversed target labels beforehand and supports the standard decoder during training by adding knowledge from future context. Since it is only added during training, we are not changing the basic structure of the network or adding complexity during decoding. We evaluate our approach on the smaller TEDLIUMv2 and the larger LibriSpeech dataset, achieving consistent improvements on both of them.
摘要:如今,重视模型是热门人选语音识别的一个。到目前为止,许多研究主要集中在编码器结构或注意力模块,以增强这些车型的性能。然而,大多忽视了解码器。在本文中,我们提出在训练阶段期间包括第二解码器的新型正则化技术。该解码器的时间反转的目标标签预先优化,从未来的情况下增加知识培训支持在标准的解码器。既然是训练期间只加了,我们不改变网络的基本结构和解码过程中增加复杂性。我们评估我们对小TEDLIUMv2方法和更大的数据集LibriSpeech,实现对二者的持续改善。
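A rough sketch of the training objective implied by the abstract is given below: the standard decoder is trained on the normal label order while the auxiliary decoder sees time-reversed labels, and the two losses are combined. The mixing weight and the exact regularization form are assumptions, not the paper's specification.

```python
# Sketch of a combined forward/backward decoder loss with dummy tensors; the
# backward branch exists only during training and is dropped at decoding time.
import torch
import torch.nn.functional as F

vocab, T = 30, 7
targets = torch.randint(0, vocab, (1, T))                    # dummy label sequence
fwd_logits = torch.randn(1, T, vocab, requires_grad=True)    # standard decoder output
bwd_logits = torch.randn(1, T, vocab, requires_grad=True)    # reversed-label decoder output

rev_targets = targets.flip(dims=[1])                         # time-reversed labels
loss_fwd = F.cross_entropy(fwd_logits.transpose(1, 2), targets)
loss_bwd = F.cross_entropy(bwd_logits.transpose(1, 2), rev_targets)

lam = 0.3                                                    # assumed mixing weight
loss = loss_fwd + lam * loss_bwd
loss.backward()
print(float(loss))
```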
29. SD-RSIC: Summarization Driven Deep Remote Sensing Image Captioning [PDF] 返回目录
Gencer Sumbul, Sonali Nayak, Begüm Demir
Abstract: Deep neural networks (DNNs) have recently become popular for image captioning problems in remote sensing (RS). Existing DNN-based approaches rely on the availability of a training set made up of a high number of RS images with their captions. However, captions of training images may contain redundant information (they can be repetitive or semantically similar to each other), resulting in information deficiency while learning a mapping from the image domain to the language domain. To overcome this limitation, in this paper we present a novel Summarization Driven Remote Sensing Image Captioning (SD-RSIC) approach. The proposed approach consists of three main steps. The first step obtains the standard image captions by jointly exploiting convolutional neural networks (CNNs) with long short-term memory (LSTM) networks. The second step, unlike the existing RS image captioning methods, summarizes the ground-truth captions of each training image into a single caption by exploiting sequence-to-sequence neural networks and eliminates the redundancy present in the training set. The third step automatically defines the adaptive weights associated with each RS image to combine the standard captions with the summarized captions based on the semantic content of the image. This is achieved by a novel adaptive weighting strategy defined in the context of LSTM networks. Experimental results obtained on the RSCID, UCM-Captions and Sydney-Captions datasets show the effectiveness of the proposed approach compared to the state-of-the-art RS image captioning approaches.
摘要:深层神经网络(DNNs)最近已发现流行在遥感(RS)图像字幕的问题。现有的基于DNN的方法依赖于他们的字幕大量遥感图像的由训练集的可用性。然而,训练图像的字幕可能包含冗余信息(它们可以是重复的或者语义上彼此相似),从而导致信息不足,同时学习从图像域映射到语言域。为了克服这种限制,在本文中,我们提出一个新的综述驱动遥感图像字幕(SD-RSIC)的方法。所提出的方法包括三个主要步骤。第一步通过联合利用卷积神经网络(细胞神经网络)与长短期记忆(LSTM)网络获得标准图像字幕。第二步骤中,不同于现有的RS图像字幕方法,通过利用序列序列神经网络总结了每个训练图像的地面真值字幕成一个单一标题,并且消除存在于所述训练集合中的冗余。第三步自动定义关联到每个RS图像到标准字幕与基于图像的语义内容的概括字幕结合自适应权重。这是通过在网络LSTM的上下文中定义一个新的自适应加权策略实现。在RSCID获得的实验结果,UCM-字幕和悉尼的字幕数据集示出的状态相比的最先进的RS图像字幕接近所提出的方法的有效性。
30. EPIC: An Epidemics Corpus Of Over 20 Million Relevant Tweets [PDF] 返回目录
Junhua Liu, Trisha Singhal, Lucienne T.M. Blessing, Kristin L. Wood, Kwan Hui Lim
Abstract: Since the start of COVID-19, several relevant corpora from various sources have been presented in the literature that contain millions of data points. While these corpora are valuable in supporting many analyses on this specific pandemic, researchers require additional benchmark corpora that contain other epidemics to facilitate cross-epidemic pattern recognition and trend analysis tasks. During our other efforts on COVID-19 related work, we discovered very few disease-related corpora in the literature that are sizable and rich enough to support such cross-epidemic analysis tasks. In this paper, we present EPIC, a large-scale epidemic corpus that contains 20 million micro-blog posts, i.e., tweets crawled from Twitter from 2006 to 2020. EPIC contains a subset of 17.8 million tweets related to three general diseases, namely Ebola, Cholera and Swine Flu, and another subset of 3.5 million tweets on six global epidemic outbreaks, including the 2009 H1N1 Swine Flu, 2010 Haiti Cholera, 2012 Middle-East Respiratory Syndrome (MERS), 2013 West African Ebola, 2016 Yemen Cholera and 2018 Kivu Ebola. Furthermore, we explore and discuss the properties of the corpus with statistics of key terms and hashtags and trend analysis for each subset. Finally, we demonstrate the value and impact that EPIC could create through a discussion of multiple use cases of cross-epidemic research topics that attract growing interest in recent years. These use cases span multiple research areas, such as epidemiological modeling, pattern recognition, natural language understanding and economical modeling.
摘要:由于COVID-19开始,从各种渠道几个相关语料库中包含数百万个数据点的文献中提出的。虽然这些语料库在这个特定的流行病支持许多有价值的分析,研究人员需要包含其他传染病,以促进跨疫情模式识别和趋势分析任务的附加基准语料库。在我们对COVID-19相关的工作等努力,我们发现在文献中很少疾病相关语料是相当大的,丰富的,足以支持这样的跨疫情分析任务。在本文中,我们提出了EPIC,包含2000万的微博客,即鸣叫从Twitter抓取,从2006年到2020年EPIC包含与三个一般疾病的17.8百万鸣叫子集的大规模流行的语料库,即埃博拉,霍乱和猪流感,以及六个全球疫情350万条微博,其中包括2009年H1N1猪流感,2010海地霍乱,2012中东呼吸综合征(MERS),2013西非埃博拉年,2016年也门霍乱和另一个子集2018基伍埃博拉病毒。此外,我们探索并与每个子集关键术语和井号标签和趋势分析的统计讨论语料库的性质。最后,我们证明了价值和影响是EPIC可以通过的交叉流行的研究课题,吸引近几年不断增长的兴趣多个用例的讨论创建。这些用例跨越多个研究领域,如流行病学建模,图形识别,自然语言理解和经济建模。
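The per-subset statistics mentioned above (key terms and hashtags) amount to simple counting; a minimal sketch over invented tweets is shown below.

```python
# Simple sketch of counting key terms and hashtags over a list of tweets.
# The tweets are invented examples, not drawn from the EPIC corpus.
from collections import Counter
import re

tweets = [
    "New #Ebola case confirmed in Kivu, WHO monitoring",
    "#cholera outbreak worsens in Yemen",
    "Swine flu vaccine shipments delayed #H1N1",
]

hashtags = Counter(tag.lower() for t in tweets for tag in re.findall(r"#\w+", t))
terms = Counter(w.lower() for t in tweets for w in re.findall(r"[a-zA-Z]+", t))

print(hashtags.most_common(3))
print(terms.most_common(5))
```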
31. Tamil Vowel Recognition With Augmented MNIST-like Data Set [PDF] 返回目录
Muthiah Annamalai
Abstract: We report the generation of a MNIST [4] compatible data set [1] for Tamil vowels to enable building a classification DNN or other such ML/AI deep learning [2] models for Tamil OCR/Handwriting applications. We report the capability of the 60,000-image grayscale, 28x28 pixel dataset to build a 4-layer CNN with 100,000+ parameters in TensorFlow, reaching 92% training accuracy and 82% cross-validation accuracy. We also report, for the same network, a top-1 classification accuracy of 70% and a top-2 classification accuracy of 92% on handwritten vowels.
摘要:我们报告MNIST的产生[4]泰米尔兼容的数据集[1]元音能够建立一个分类DNN或其他类似ML / AI深度学习[2]模型泰米尔OCR /手写应用。我们报告的60000灰度,28x28像素数据集以建立一个92%的准确度(训练)和82%的交叉验证4层CNN,具有100,000的参数,在TensorFlow的能力。我们还报告的70%上手写元音示出,对于相同的网络的顶1的分类精度和92%的顶2的分类精度。
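A small 4-layer CNN in TensorFlow/Keras on 28x28 grayscale input, in the spirit of the setup above, could look like the following; the layer sizes, the 12-class output, and the random stand-in data are assumptions, not the paper's exact architecture or dataset.

```python
# Sketch of a compact CNN for 28x28 grayscale vowel images with top-1 and top-2
# accuracy metrics; random arrays stand in for the real dataset.
import numpy as np
import tensorflow as tf

num_classes = 12                       # assumed number of Tamil vowel classes
x = np.random.rand(256, 28, 28, 1).astype("float32")     # stand-in for the real images
y = np.random.randint(0, num_classes, size=(256,))

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy",
             tf.keras.metrics.SparseTopKCategoricalAccuracy(k=2, name="top2")],
)
model.fit(x, y, epochs=1, batch_size=32, verbose=0)
print(model.evaluate(x, y, verbose=0))
```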
32. Exploration of End-to-End ASR for OpenSTT -- Russian Open Speech-to-Text Dataset [PDF] 返回目录
Andrei Andrusenko, Aleksandr Laptev, Ivan Medennikov
Abstract: This paper presents an exploration of end-to-end automatic speech recognition (ASR) systems for the largest open-source Russian language data set -- OpenSTT. We evaluate different existing end-to-end approaches such as joint CTC/Attention, RNN-Transducer, and Transformer. All of them are compared with the strong hybrid ASR system based on the LF-MMI TDNN-F acoustic model. For the three available validation sets (phone calls, YouTube, and books), our best end-to-end model achieves word error rates (WER) of 34.8%, 19.1%, and 18.1%, respectively. Under the same conditions, the hybrid ASR system demonstrates 33.5%, 20.9%, and 18.6% WER.
摘要:本文介绍终端到终端的自动语音识别系统(ASR)的勘探最大的开源俄语数据集 - OpenSTT。我们评估不同的现有终端到终端的办法,如联合CTC /注意,RNN换能器,变压器和。所有这些都基于LF-MMI TDNN-F声学模型的强混合动力ASR系统相比。对于这三个可用的验证台(电话,YouTube和书籍),我们最好的终端到终端的模式实现了分别为34.8%,19.1%和18.1%,字错误率(WER)。在相同条件下,该系统hybridASR表明33.5%,20.9%,和18.6%WER。
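The word error rate quoted above is the standard edit-distance metric; a self-contained sketch of its computation is shown below with an invented reference/hypothesis pair.

```python
# Standard word error rate (WER) computation via edit distance.
def wer(reference, hypothesis):
    r, h = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between first i reference and first j hypothesis words
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        dp[i][0] = i
    for j in range(len(h) + 1):
        dp[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = dp[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(r)][len(h)] / max(len(r), 1)

print(wer("привет как дела", "привет дела хорошо"))   # 2 errors / 3 words ≈ 0.667
```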
33. Relational reasoning and generalization using non-symbolic neural networks [PDF] 返回目录
Atticus Geiger, Alexandra Carstensen, Michael C. Frank, Christopher Potts
Abstract: Humans have a remarkable capacity to reason about abstract relational structures, an ability that may support some of the most impressive, human-unique cognitive feats. Because equality (or identity) is a simple and ubiquitous relational operator, equality reasoning has been a key case study for the broader question of abstract relational reasoning. This paper revisits the question of whether equality can be learned by neural networks that do not encode explicit symbolic structure. Earlier work arrived at a negative answer to this question, but that result holds only for a particular class of hand-crafted feature representations. In our experiments, we assess out-of-sample generalization of equality using both arbitrary representations and representations that have been pretrained on separate tasks to imbue them with abstract structure. In this setting, even simple neural networks are able to learn basic equality with relatively little training data. In a second case study, we show that sequential equality problems (learning ABA sequences) can be solved with only positive training instances. Finally, we consider a more complex, hierarchical equality problem, but this requires vastly more data. However, using a pretrained equality network as a modular component of this larger task leads to good performance with no task-specific training. Overall, these findings indicate that neural models are able to solve equality-based reasoning tasks, suggesting that essential aspects of symbolic reasoning can emerge from data-driven, non-symbolic learning processes.
摘要:人类有一个显着的能力,推理抽象关系结构,可以支持一些最令人印象深刻的,人类独特的认知功勋的能力。因为平等(或标识)是一种简单而无处不在的关系运算符,平等推理一直是抽象的关系推理的更广泛的问题的一个关键案例。本文重访是否平等的问题,可以通过不明确的编码符号结构的神经网络来学习。早期的工作来到了否定的回答这个问题,但结果只保存特定类手工制作的特征表示。在我们的实验中,我们使用已经预先训练上不同的任务既武断陈述和表达抽象结构灌输给他们评估出的样本平等的推广。在这种背景下,即使是简单的神经网络是能够学会用相对较少的训练数据基本平等。在第二个案例中,我们表明,连续的平等问题(学习ABA序列)只能与积极的训练情况来解决。最后,我们考虑更复杂,层次平等问题,但是这需要大大更多的数据。然而,使用预训练平等网络这个更大的任务导致良好的性能,没有特定任务的训练的模块化组件。总的来说,这些研究结果表明,神经元模型能够解决基于平等的推理任务,提示符号推理的基本方面可以由数据驱动的,非符号的学习进程中出现。
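The basic equality task can be sketched as follows: pairs of random vectors labelled same/different are fed to a simple feed-forward network. The dimensions and the use of scikit-learn's MLPClassifier are illustrative choices, not the paper's setup.

```python
# Small sketch of the equality task with random vector pairs and a plain MLP.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
dim, n = 10, 2000

left = rng.standard_normal((n, dim))
right = left.copy()
labels = np.ones(n, dtype=int)
right[n // 2:] = rng.standard_normal((n // 2, dim))   # second half: unequal pairs
labels[n // 2:] = 0

X = np.hstack([left, right])                           # concatenate the two inputs
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0).fit(X, labels)

# out-of-sample check on freshly sampled pairs
a = rng.standard_normal((5, dim))
test_equal = np.hstack([a, a])
test_diff = np.hstack([a, rng.standard_normal((5, dim))])
print(clf.predict(test_equal), clf.predict(test_diff))
```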
34. UWSpeech: Speech to Speech Translation for Unwritten Languages [PDF] 返回目录
Chen Zhang, Xu Tan, Yi Ren, Tao Qin, Kejun Zhang, Tie-Yan Liu
Abstract: Existing speech to speech translation systems heavily rely on the text of target language: they usually translate source language either to target text and then synthesize target speech from text, or directly to target speech with target text for auxiliary training. However, those methods cannot be applied to unwritten target languages, which have no written text or phoneme available. In this paper, we develop a translation system for unwritten languages, named as UWSpeech, which converts target unwritten speech into discrete tokens with a converter, and then translates source-language speech into target discrete tokens with a translator, and finally synthesizes target speech from target discrete tokens with an inverter. We propose a method called XL-VAE, which enhances vector quantized variational autoencoder (VQ-VAE) with cross-lingual (XL) speech recognition, to train the converter and inverter of UWSpeech jointly. Experiments on Fisher Spanish-English conversation translation dataset show that UWSpeech outperforms direct translation and VQ-VAE baseline by about 16 and 10 BLEU points respectively, which demonstrate the advantages and potentials of UWSpeech.
摘要:现有的语音到语音翻译系统在很大程度上依赖于目标语言的文字:他们通常要么目标文本,然后合成目标讲话从文本与辅助训练目标文本翻译源语言,或直接向目标讲话。然而,这些方法不能适用于不成文的目标语言,已经没有书面文字或音素可用。在本文中,我们开发了一个翻译系统的文字语言,命名为UWSpeech,其将目标不成文的语音转换成离散的令牌与转换器,然后翻译源语言的语音转换成目标离散令牌与转换,最后从合成目标语音靶向与逆变器的离散标记。我们提出了一个称为XL-VAE方法,该方法提高了矢量量化的变自动编码器(VQ-VAE)与跨语种(XL)的语音识别,训练UWSpeech的转换器和逆变器共同。费舍尔西班牙语,英语对话翻译数据集的实验表明UWSpeech分别优于约16和10个BLEU点的直接翻译和VQ-VAE基线,这表现出的优势和UWSpeech的潜力。
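The discrete tokens come from the vector-quantization step of a VQ-VAE; a minimal sketch of that step (nearest-codebook lookup) is shown below with illustrative sizes and a random codebook.

```python
# Sketch of the vector-quantization step at the heart of a VQ-VAE: each continuous
# encoder frame is replaced by the index of its nearest codebook vector (the
# "discrete token"). Sizes and the random codebook are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.standard_normal((64, 16))       # 64 discrete tokens, 16-dim codes
frames = rng.standard_normal((10, 16))         # stand-in for encoder outputs

# squared L2 distance between every frame and every codebook entry
d2 = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
tokens = d2.argmin(axis=1)                     # discrete token sequence
quantized = codebook[tokens]                   # what the decoder/inverter would see

print(tokens)
```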
35. Leveraging Multimodal Behavioral Analytics for Automated Job Interview Performance Assessment and Feedback [PDF] 返回目录
Anumeha Agrawal, Rosa Anil George, Selvan Sunitha Ravi, Sowmya Kamath S, Anand Kumar M
Abstract: Behavioral cues play a significant part in human communication and cognitive perception. In most professional domains, employee recruitment policies are framed such that both professional skills and personality traits are adequately assessed. Hiring interviews are structured to evaluate expansively a potential employee's suitability for the position - their professional qualifications, interpersonal skills, ability to perform in critical and stressful situations, in the presence of time and resource constraints, etc. Therefore, candidates need to be aware of their positive and negative attributes and be mindful of behavioral cues that might have adverse effects on their success. We propose a multimodal analytical framework that analyzes the candidate in an interview scenario and provides feedback for predefined labels such as engagement, speaking rate, eye contact, etc. We perform a comprehensive analysis that includes the interviewee's facial expressions, speech, and prosodic information, using the video, audio, and text transcripts obtained from the recorded interview. We use these multimodal data sources to construct a composite representation, which is used for training machine learning classifiers to predict the class labels. Such analysis is then used to provide constructive feedback to the interviewee for their behavioral cues and body language. Experimental validation showed that the proposed methodology achieved promising results.
摘要:行为线索在人类沟通和认知感知显著的部分。在最专业的领域,员工招聘政策框架,使得无论是专业技能和个性特征得到充分的评估。招聘面试是结构化的,以豪爽地评估潜在雇员的职位适应性 - 他们的专业资格,人际交往能力,才能在关键和紧张的情况下进行,在时间和资源的限制,存在等。因此,考生需要注意的其正面和负面属性,同时也要注意可能对他们的成功产生不利影响的行为线索。我们提出了一个多模式的分析框架,分析在接受采访时的场景候选人和预定义的标签,如订婚,语速,眼神交流等,我们进行了全面的分析,包括受访者的面部表情,语音和韵律信息提供反馈,使用视频,音频和文字记录的采访中获得的成绩单。我们使用这些多数据源来构建一个复合表示,这是用于训练机器学习分类预测类标签。那么这种分析用来为自己的行为线索和身体语言提供给受访者建设性的反馈意见。实验验证表明,该方法取得了可喜的成果。
36. Examining the Role of Mood Patterns in Predicting Self-Reported Depressive symptoms [PDF] 返回目录
Lucia Lushi Chen, Walid Magdy, Heather Whalley, Maria Wolters
Abstract: Depression is the leading cause of disability worldwide. Initial efforts to detect depression signals from social media posts have shown promising results. Given the high internal validity, results from such analyses are potentially beneficial to clinical judgment. The existing models for automatic detection of depressive symptoms learn proxy diagnostic signals from social media data, such as help-seeking behavior for mental health or medication names. However, in reality, individuals with depression typically experience depressed mood, loss of pleasure nearly in all the activities, feeling of worthlessness or guilt, and diminished ability to think. Therefore, a lot of the proxy signals used in these models lack the theoretical underpinnings for depressive symptoms. It is also reported that social media posts from many patients in the clinical setting do not contain these signals. Based on this research gap, we propose to monitor a type of signal that is well-established as a class of symptoms in affective disorders -- mood. The mood is an experience of feeling that can last for hours, days, or even weeks. In this work, we attempt to enrich current technology for detecting symptoms of potential depression by constructing a 'mood profile' for social media users.
摘要:抑郁症是全世界残疾的主要原因。检测来自社交媒体帖子抑郁信号初步努力已展现出可喜效果。鉴于高内部效度,从这样的分析结果对临床判断潜在有益的。对于抑郁症状自动检测现有车型汲取社交媒体数据,如对精神健康的药物名称求助行为代理诊断信号。然而,在现实中,抑郁症患者的个体通常会经历抑郁心境,快感的损失几乎在所有的活动,无价值或内疚,并思考能力下降的感觉。因此,很多在这些模型中所使用的代理信号的缺乏抑郁症状的理论基础。另据报道,从许多患者社交媒体帖子在临床上不包含这些信号。在此基础上研究的空白,我们建议监控类型的信号是行之有效的,如情感性精神障碍的一类症状 - 情绪。心情的感觉,可以持续数小时,数天,甚至数周的经验。在这项工作中,我们试图以丰富目前的技术,通过构建“心情曲线”的社交媒体用户发现潜在抑郁症的症状。
37. Continual General Chunking Problem and SyncMap [PDF] 返回目录
Danilo Vasconcellos Vargas, Toshitake Asabuki
Abstract: Humans possess an inherent ability to chunk sequences into their constituent parts. In fact, this ability is thought to bootstrap language skills to the learning of image patterns, which might be a key to a more animal-like type of intelligence. Here, we propose a continual generalization of the chunking problem (an unsupervised problem), encompassing fixed and probabilistic chunks, discovery of temporal and causal structures and their continual variations. Additionally, we propose an algorithm called SyncMap that can learn and adapt to changes in the problem by creating a dynamic map which preserves the correlation between variables. Results of SyncMap suggest that the proposed algorithm learns near-optimal solutions, despite the presence of many types of structures and their continual variation. When compared to Word2vec, PARSER and MRIL, SyncMap surpasses or ties with the best algorithm on $77\%$ of the scenarios while being the second best in the remaining $33\%$.
摘要:人类具有的固有能力,以块序列成它们的组成部分。事实上,这种能力被认为是引导语言技能的图像模式的学习,这可能是一种动物般的多种类型的智能的关键。在这里,我们提出了分块问题的不断泛化(无监督的问题),涵盖固定和概率块,时间和因果结构的发现和他们不断变化。此外,我们提出了所谓的SYNCMAP算法,可以学习,并通过创建一个动态地图,保留变量之间的相关性适应问题的变化。 SYNCMAP的结果表明,该算法学习接近最优的解决方案,尽管有许多类型的结构的存在和他们的不断变化。相较于Word2vec,解析器和MRIL,SYNCMAP超越或与$ 77国\%的情景$,同时在remaing $ 33 \%$第二个最好的最好的算法关系。
38. Pot, kettle: Nonliteral titles aren't (natural) science [PDF] 返回目录
Mike Thelwall
Abstract: Researchers may be tempted to attract attention through poetic titles for their publications, but would this be mistaken in some fields? Whilst poetic titles are known to be common in medicine, it is not clear whether the practice is widespread elsewhere. This article investigates the prevalence of poetic expressions in journal article titles 1996-2019 in 3.3 million articles from all 27 Scopus broad fields. Expressions were identified by manually checking all phrases with at least 5 words that occurred at least 25 times, finding 149 stock phrases, idioms, sayings, literary allusions, film names and song titles or lyrics. The expressions found are most common in the social sciences and the humanities. They are also relatively common in medicine, but almost absent from engineering and the natural and formal sciences. The differences may reflect the less hierarchical and more varied nature of the social sciences and humanities, where interesting titles may attract an audience. In engineering, natural science and formal science fields, authors should take extra care with poetic expressions, in case their choice is judged inappropriate. This includes interdisciplinary research overlapping these areas. Conversely, reviewers of interdisciplinary research involving the social sciences should be more tolerant of poetic license.
摘要:研究人员可能会试图通过吸引诗意的标题为关注他们的出版物,但这样做在某些领域弄错了?虽然诗意的标题被称为是在内科常见病,它是不明确的做法是否普遍别处。本文研究的期刊文章标题1996至2019年在所有27个SCOPUS广泛领域330万篇诗意表达的患病率。表达式是由手动检查所有词组与发生至少25次,至少5个字,发现149股票短语,成语,谚语,典故,电影中的名字和歌曲名称或歌词鉴定。发现该表达式是最常见的,在社会科学和人文科学。他们也是医学比较常见的,但是从工程和自然和正式版几乎不存在。的差异可能反映了社会科学和人文科学,其中有趣的标题可以吸引观众的层级更少和更多样化的性质。在工程,自然科学和正式的科学领域,作者应格外小心与诗意的表达,如果他们选择判断不当。这包括跨学科的研究重叠这些区域。相反,涉及社会科学跨学科研究的评论家应该是诗意的许可证的更加宽容。
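The counting part of the procedure described above can be sketched as follows: extract every 5-word phrase from titles and keep those above a frequency threshold (the manual vetting of candidates is not reproduced). The titles and threshold are toy values.

```python
# Sketch of counting 5-word phrases across article titles and keeping the frequent
# ones as candidate stock phrases; the paper used a threshold of 25 on 3.3M titles.
from collections import Counter
import re

titles = [
    "To be or not to be: the question of open access",
    "Deep learning for protein folding",
    "To be or not to be replicated: a field study",
]

def five_grams(title):
    words = re.findall(r"[a-z']+", title.lower())
    return [" ".join(words[i:i + 5]) for i in range(len(words) - 4)]

counts = Counter(p for t in titles for p in five_grams(t))
THRESHOLD = 2                                   # toy threshold for the toy data
candidates = [(p, c) for p, c in counts.items() if c >= THRESHOLD]
print(candidates)                               # e.g. [('to be or not to', 2), ...]
```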
39. SE-MelGAN -- Speaker Agnostic Rapid Speech Enhancement [PDF] 返回目录
Luka Chkhetiani, Levan Bejanidze
Abstract: Recent advances in Generative Adversarial Networks in the speech synthesis domain [3], [2] have shown that it is possible to train GANs [8] reliably for high-quality coherent waveform generation from mel-spectrograms. We propose that it is possible to transfer MelGAN's [3] robustness in learning speech features to the speech enhancement and noise reduction domain without any model modifications. Our proposed method generalizes over a multi-speaker speech dataset and is able to robustly handle unseen background noises during inference. We also show that increasing the batch size for this particular approach not only yields better speech results, but also generalizes more easily over the multi-speaker dataset and leads to faster convergence. Additionally, it outperforms the previous state-of-the-art GAN approach for speech enhancement, SEGAN [5], in two respects: 1. quality; 2. speed. The proposed method runs more than 100x faster than real time on GPU and more than 2x faster than real time on CPU without any hardware optimization, right at the speed of MelGAN [3].
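A minimal PyTorch sketch of the enhancement-by-vocoder pipeline the abstract describes: a noisy waveform is converted to a mel-spectrogram and mapped back to a waveform by a MelGAN-style generator that, in the actual approach, would have been trained on noisy-mel / clean-waveform pairs. The generator below is a stand-in convolutional module, not the real MelGAN architecture, and the sample rate and mel settings are assumptions.

import torch
import torchaudio

# stand-in for a trained MelGAN-style generator: mel frames in, waveform out
class ToyGenerator(torch.nn.Module):
    def __init__(self, n_mels=80, hop=256):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Conv1d(n_mels, 256, kernel_size=7, padding=3),
            torch.nn.LeakyReLU(0.2),
            torch.nn.Conv1d(256, hop, kernel_size=7, padding=3),  # hop samples per frame
        )

    def forward(self, mel):                      # mel: (batch, n_mels, frames)
        out = self.net(mel)                      # (batch, hop, frames)
        return out.transpose(1, 2).reshape(mel.size(0), -1)  # (batch, frames * hop)

sample_rate, n_mels, hop = 22050, 80, 256
to_mel = torchaudio.transforms.MelSpectrogram(
    sample_rate=sample_rate, n_fft=1024, hop_length=hop, n_mels=n_mels)

generator = ToyGenerator(n_mels, hop).eval()     # would be trained, speaker-agnostic
noisy = torch.randn(1, sample_rate)              # 1 second of "noisy speech" (random here)
with torch.no_grad():
    enhanced = generator(torch.log1p(to_mel(noisy)))
print(enhanced.shape)                            # enhanced waveform, one tensor per utterance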
40. Mining Implicit Relevance Feedback from User Behavior forWeb Question Answering [PDF] 返回目录
Linjun Shou, Shining Bo, Feixiang Cheng, Ming Gong, Jian Pei, Daxin Jiang
Abstract: Training and refreshing a web-scale Question Answering (QA) system for a multi-lingual commercial search engine often requires a huge number of training examples. One principled idea is to mine implicit relevance feedback from user behavior recorded in search engine logs. All previous work on mining implicit relevance feedback targets the relevance of web documents rather than passages. Due to several unique characteristics of QA tasks, the existing user behavior models for web documents cannot be applied to infer passage relevance. In this paper, we make the first study to explore the correlation between user behavior and passage relevance, and propose a novel approach for mining training data for Web QA. We conduct extensive experiments on four test datasets, and the results show our approach significantly improves the accuracy of passage ranking without extra human-labeled data. In practice, this work has proved effective in substantially reducing the human labeling cost for the QA service in a global commercial search engine, especially for languages with low resources. Our techniques have been deployed in multi-language services.
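A hedged sketch of the kind of weak labelling the abstract alludes to: turning logged user behaviour on answer passages (clicks, dwell time, quick reformulations) into noisy relevance labels that can train a passage ranker. The feature names and thresholds are hypothetical illustrations, not the paper's actual behaviour model.

from dataclasses import dataclass

@dataclass
class PassageImpression:
    query: str
    passage_id: str
    clicked: bool
    dwell_seconds: float      # time spent after clicking / expanding the answer
    reformulated: bool        # user immediately issued a follow-up query

def weak_relevance_label(imp: PassageImpression) -> float:
    """Map one logged impression to a noisy relevance score in [0, 1]."""
    if not imp.clicked:
        return 0.0
    if imp.reformulated and imp.dwell_seconds < 5:
        return 0.1            # quick abandonment: the passage likely did not answer
    if imp.dwell_seconds >= 30:
        return 1.0            # long dwell: the user was likely satisfied
    return 0.5                # clicked but ambiguous

log = [
    PassageImpression("capital of france", "p1", True, 42.0, False),
    PassageImpression("capital of france", "p2", True, 2.0, True),
    PassageImpression("capital of france", "p3", False, 0.0, False),
]
training_examples = [(i.query, i.passage_id, weak_relevance_label(i)) for i in log]
print(training_examples)     # weakly labeled (query, passage, relevance) triples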
41. Guided Transformer: Leveraging Multiple External Sources for Representation Learning in Conversational Search [PDF] 返回目录
Helia Hashemi, Hamed Zamani, W. Bruce Croft
Abstract: Asking clarifying questions in response to ambiguous or faceted queries has been recognized as a useful technique for various information retrieval systems, especially conversational search systems with limited-bandwidth interfaces. Analyzing and generating clarifying questions have been studied recently, but the accurate utilization of user responses to clarifying questions has been relatively less explored. In this paper, we enrich the representations learned by Transformer networks using a novel attention mechanism from external information sources that weights each term in the conversation. We evaluate this Guided Transformer model in a conversational search scenario that includes clarifying questions. In our experiments, we use two separate external sources, including the top retrieved documents and a set of different possible clarifying questions for the query. We implement the proposed representation learning model for two downstream tasks in conversational search: document retrieval and next clarifying question selection. Our experiments use a public dataset for search clarification and demonstrate significant improvements compared to competitive baselines.
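A minimal PyTorch sketch of the core mechanism: cross-attention from conversation-term representations (queries) to an external source such as the top retrieved documents (keys and values), whose output is fused back into the conversation encoding. The dimensions and the residual fusion step are illustrative assumptions, not the paper's exact architecture.

import torch

d_model, n_heads = 64, 4
cross_attn = torch.nn.MultiheadAttention(d_model, n_heads, batch_first=True)

conv_terms = torch.randn(1, 12, d_model)   # representations of 12 conversation terms
doc_terms = torch.randn(1, 50, d_model)    # representations of terms from retrieved documents

# each conversation term attends over the external source; attn_weights indicates how
# strongly the external evidence guides (weights) each term of the conversation
guided, attn_weights = cross_attn(query=conv_terms, key=doc_terms, value=doc_terms)

# simple residual fusion of the guided signal back into the conversation encoding
fused = torch.nn.LayerNorm(d_model)(conv_terms + guided)
print(fused.shape, attn_weights.shape)     # (1, 12, 64), (1, 12, 50)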
42. A Multifunction Printer CUI for the Blind [PDF] 返回目录
Kyle Dent, Kalai Ramea
Abstract: Advances in interface design using touch surfaces create greater obstacles for blind and visually impaired users of technology. Conversational user interfaces offer a reasonable alternative for interactions and enable greater access and, most importantly, greater independence for the blind. This paper presents a case study of our work to develop a conversational user interface for accessibility for multifunction printers (MFPs). It describes our approach to conversational interfaces in general and the specifics of the solution we created for MFPs. It also presents a user study we performed to assess the solution and guide our future efforts.
43. Understanding Unintended Memorization in Federated Learning [PDF] 返回目录
Om Thakkar, Swaroop Ramaswamy, Rajiv Mathews, Françoise Beaufays
Abstract: Recent works have shown that generative sequence models (e.g., language models) have a tendency to memorize rare or unique sequences in the training data. Since useful models are often trained on sensitive data, to ensure the privacy of the training data it is critical to identify and mitigate such unintended memorization. Federated Learning (FL) has emerged as a novel framework for large-scale distributed learning tasks. However, it differs in many aspects from the well-studied central learning setting where all the data is stored at the central server. In this paper, we initiate a formal study to understand the effect of different components of canonical FL on unintended memorization in trained models, comparing with the central learning setting. Our results show that several differing components of FL play an important role in reducing unintended memorization. Specifically, we observe that the clustering of data according to users---which happens by design in FL---has a significant effect in reducing such memorization, and using the method of Federated Averaging for training causes a further reduction. We also show that training with a strong user-level differential privacy guarantee results in models that exhibit the least amount of unintended memorization.
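A simplified sketch of Federated Averaging, the training method the abstract credits with further reducing memorization: the server averages per-user model updates weighted by their local data sizes, optionally clipping each update and adding Gaussian noise as in user-level differential privacy. This is a schematic version under stated assumptions, not the paper's production setup; in particular, real user-level DP clips the whole per-user update rather than each tensor.

import copy
import torch

def federated_averaging(global_model, client_states, client_sizes,
                        clip_norm=None, noise_std=0.0):
    """One round of FedAvg over per-client model state_dicts.

    client_states: state_dicts trained locally starting from the current global model.
    client_sizes:  number of local examples per client (weights of the average).
    clip_norm / noise_std: optional update clipping and noise, the two knobs behind
    a (simplified) user-level differential privacy guarantee.
    """
    global_state = global_model.state_dict()
    total = float(sum(client_sizes))
    new_state = copy.deepcopy(global_state)
    for key in new_state:
        # aggregate the client *updates* (deltas from the global weights)
        agg = torch.zeros_like(new_state[key], dtype=torch.float32)
        for state, size in zip(client_states, client_sizes):
            delta = state[key].float() - global_state[key].float()
            if clip_norm is not None:                       # per-tensor clipping for brevity
                delta = delta * min(1.0, clip_norm / (delta.norm() + 1e-12))
            agg += (size / total) * delta
        if noise_std > 0:
            agg += noise_std * torch.randn_like(agg)
        new_state[key] = global_state[key].float() + agg
    global_model.load_state_dict(new_state)
    return global_model

# toy usage: two clients, tiny linear model standing in for a language model
model = torch.nn.Linear(4, 2)
clients = [copy.deepcopy(model) for _ in range(2)]
for c in clients:                                           # pretend each client trained locally
    with torch.no_grad():
        c.weight.add_(torch.randn_like(c.weight) * 0.1)
model = federated_averaging(model, [c.state_dict() for c in clients], [100, 300])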
44. How to Avoid Being Eaten by a Grue: Structured Exploration Strategies for Textual Worlds [PDF] 返回目录
Prithviraj Ammanabrolu, Ethan Tien, Matthew Hausknecht, Mark O. Riedl
Abstract: Text-based games are long puzzles or quests, characterized by a sequence of sparse and potentially deceptive rewards. They provide an ideal platform to develop agents that perceive and act upon the world using a combinatorially sized natural language state-action space. Standard Reinforcement Learning agents are poorly equipped to effectively explore such spaces and often struggle to overcome bottlenecks---states that agents are unable to pass through simply because they do not see the right action sequence enough times to be sufficiently reinforced. We introduce Q*BERT, an agent that learns to build a knowledge graph of the world by answering questions, which leads to greater sample efficiency. To overcome bottlenecks, we further introduce MC!Q*BERT, an agent that uses a knowledge-graph-based intrinsic motivation to detect bottlenecks and a novel exploration strategy to efficiently learn a chain of policy modules to overcome them. We present an ablation study and results demonstrating how our method outperforms the current state-of-the-art on nine text games, including the popular game Zork, where, for the first time, a learning agent gets past the bottleneck where the player is eaten by a Grue.
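The knowledge-graph-based intrinsic motivation can be illustrated with a tiny reward function: the agent earns a bonus for actions that add previously unseen triples to its world graph, which keeps exploration moving even when the game's own reward is sparse. The triple extraction and the bonus scale below are placeholders, not the paper's exact formulation.

class KGCuriosity:
    """Intrinsic reward = number of new (subject, relation, object) triples discovered."""

    def __init__(self, bonus_per_triple=1.0):
        self.known = set()
        self.bonus = bonus_per_triple

    def reward(self, extracted_triples):
        new = set(extracted_triples) - self.known
        self.known |= new
        return self.bonus * len(new)

curiosity = KGCuriosity()
# triples would normally be extracted from the game's text observation (e.g. with an OpenIE tool)
step1 = [("you", "in", "kitchen"), ("lamp", "on", "table")]
step2 = [("you", "in", "kitchen"), ("grue", "in", "cellar")]
print(curiosity.reward(step1))  # 2.0 -> two new facts
print(curiosity.reward(step2))  # 1.0 -> only the grue triple is new
# total reward fed to the agent would be r_game + r_intrinsic, so bottleneck states that
# still reveal new facts keep providing a learning signal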
Note: the Chinese text in this digest is machine-translated.