Table of Contents
1. Irony Detection in a Multilingual Context
2. Conversational Structure Aware and Context Sensitive Topic Model for Online Discussions
3. Citation Data of Czech Apex Courts
4. Related Tasks can Share! A Multi-task Framework for Affective language
5. A Neural Topical Expansion Framework for Unstructured Persona-oriented Dialogue Generation
6. Multilingual acoustic word embedding models for processing zero-resource languages
7. Attractive or Faithful? Popularity-Reinforced Learning for Inspired Headline Generation
8. Aligning the Pretraining and Finetuning Objectives of Language Models
9. UNCC Biomedical Semantic Question Answering Systems. BioASQ: Task-7B, Phase-B
10. Zero-Shot Activity Recognition with Videos
11. Understanding Car-Speak: Replacing Humans in Dealerships
12. Stimulating Creativity with FunLines: A Case Study of Humor Generation in Headlines
1. Irony Detection in a Multilingual Context [PDF] [Back to Contents]
Bilal Ghanem, Jihen Karoui, Farah Benamara, Paolo Rosso, Véronique Moriceau
Abstract: This paper proposes the first multilingual (French, English and Arabic) and multicultural (Indo-European languages vs. less culturally close languages) irony detection system. We employ both feature-based models and neural architectures using monolingual word representations. We compare the performance of these systems with state-of-the-art systems to identify their capabilities. We show that monolingual models trained separately on different languages using multilingual word representations or text-based features can open the door to irony detection in languages that lack annotated data for irony.
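As a rough illustration of the transfer setting described above, the sketch below trains a classifier on averaged multilingual word vectors in one language and applies it unchanged to another. The `embed` lookup table, helper names, and toy usage are hypothetical placeholders, not the authors' feature set.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def sentence_vector(tokens, embed, dim=300):
    """Average aligned multilingual word vectors; zero vector if all
    tokens are out of vocabulary. `embed` is assumed to be a dict from
    token (any language) to a vector in a shared space."""
    vecs = [embed[t] for t in tokens if t in embed]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def train_irony_classifier(train_docs, train_labels, embed):
    X = np.stack([sentence_vector(d, embed) for d in train_docs])
    return LogisticRegression(max_iter=1000).fit(X, train_labels)

# Because the embedding space is shared across languages, a model fit on
# English tweets can score French or Arabic ones without retraining:
# clf = train_irony_classifier(en_tokens, en_labels, aligned_embeddings)
# preds = clf.predict(np.stack([sentence_vector(d, aligned_embeddings)
#                               for d in fr_tokens]))
```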
2. Conversational Structure Aware and Context Sensitive Topic Model for Online Discussions [PDF] [Back to Contents]
Yingcheng Sun, Kenneth Loparo, Richard Kolacinski
Abstract: Millions of online discussions are generated every day on social media platforms. Topic modelling is an efficient way of better understanding large text datasets at scale. Conventional topic models have had limited success in online discussions; to overcome their limitations, we use the discussion thread tree structure and propose a "popularity" metric that quantifies the number of replies to a comment to extend the frequency of word occurrences, and a "transitivity" concept to characterize topic dependency among nodes in a nested discussion thread. We build a Conversational Structure Aware Topic Model (CSATM) based on popularity and transitivity to infer topics and their assignments to comments. Experiments on real forum datasets demonstrate improved topic-extraction performance under six different coherence measures and impressive accuracy for topic assignments.
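The "popularity" idea, reply counts boosting a comment's word frequencies, can be made concrete with a small sketch. The weighting below (1 + number of direct replies) is a simplification for illustration, not the exact formulation used inside CSATM's inference.

```python
from collections import Counter

def popularity_weighted_counts(comment, counts=None):
    """Walk a discussion tree; each comment's tokens are weighted by
    1 + number of direct replies. Toy version of the 'popularity'
    metric, not the paper's exact scheme."""
    if counts is None:
        counts = Counter()
    weight = 1 + len(comment["replies"])
    for tok in comment["tokens"]:
        counts[tok] += weight
    for child in comment["replies"]:
        popularity_weighted_counts(child, counts)
    return counts

thread = {"tokens": ["battery", "life"],
          "replies": [{"tokens": ["battery", "drains"], "replies": []},
                      {"tokens": ["screen"], "replies": []}]}
print(popularity_weighted_counts(thread))  # root tokens counted 3x each
```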
3. Citation Data of Czech Apex Courts [PDF] [Back to Contents]
Jakub Harašta, Tereza Novotná, Jaromír Šavelka
Abstract: In this paper, we introduce the citation data of the Czech apex courts (Supreme Court, Supreme Administrative Court and Constitutional Court). This dataset was automatically extracted from CzCDC 1.0, a corpus of texts of Czech court decisions. We obtained the citation data by building a natural language processing pipeline for extracting court decision identifiers. The pipeline included (i) a document segmentation model and (ii) a reference recognition model. Furthermore, the dataset was manually processed to achieve high-quality citation data as a base for subsequent qualitative and quantitative analyses. The dataset will be made available to the general public.
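A minimal stand-in for such a pipeline is sketched below: a segmentation step followed by identifier recognition. The docket-number regex is a hypothetical simplification; the paper builds trained models for both steps rather than rules.

```python
import re

# Hypothetical, simplified pattern for Czech docket-style identifiers
# (e.g. "30 Cdo 1234/2015"); the paper's reference recognition model is
# learned, not rule-based.
DOCKET = re.compile(r"\b\d{1,3}\s+[A-Za-z]{1,4}\s+\d{1,5}/\d{4}\b")

def segment(decision_text):
    """Stand-in for the document segmentation model: split on blank lines."""
    return [p for p in decision_text.split("\n\n") if p.strip()]

def extract_references(decision_text):
    """Stand-in for the reference recognition model: scan each segment."""
    return [m.group(0)
            for part in segment(decision_text)
            for m in DOCKET.finditer(part)]
```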
4. Related Tasks can Share! A Multi-task Framework for Affective language [PDF] [Back to Contents]
Kumar Shikhar Deep, Md Shad Akhtar, Asif Ekbal, Pushpak Bhattacharyya
Abstract: Expressing the polarity of sentiment as 'positive' and 'negative' usually has limited scope compared with the intensity/degree of polarity. These two tasks (i.e. sentiment classification and sentiment intensity prediction) are closely related and may offer assistance to each other during the learning process. In this paper, we propose to leverage the relatedness of multiple tasks in a multi-task learning framework. Our multi-task model is based on a convolutional-Gated Recurrent Unit (GRU) framework, which is further assisted by a diverse hand-crafted feature set. Evaluation and analysis suggest that joint learning of the related tasks in a multi-task framework can outperform each of the individual tasks in single-task frameworks.
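A minimal PyTorch sketch of the multi-task idea follows: a shared convolutional-GRU encoder feeding one classification head and one intensity-regression head. Layer sizes are assumptions, and the hand-crafted feature set is omitted for brevity.

```python
import torch
import torch.nn as nn

class MultiTaskConvGRU(nn.Module):
    """Shared conv + GRU encoder with two task heads: sentiment class
    and sentiment intensity. A sketch of the multi-task setup, not the
    authors' exact architecture."""
    def __init__(self, vocab_size, emb=100, hid=64, n_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.conv = nn.Conv1d(emb, hid, kernel_size=3, padding=1)
        self.gru = nn.GRU(hid, hid, batch_first=True)
        self.cls_head = nn.Linear(hid, n_classes)  # sentiment classification
        self.reg_head = nn.Linear(hid, 1)          # intensity regression

    def forward(self, x):                  # x: (batch, seq_len) token ids
        h = self.embed(x).transpose(1, 2)  # (batch, emb, seq)
        h = torch.relu(self.conv(h)).transpose(1, 2)
        _, last = self.gru(h)              # last hidden: (1, batch, hid)
        shared = last.squeeze(0)
        return self.cls_head(shared), self.reg_head(shared).squeeze(-1)

# Joint training would sum cross-entropy on the class head with MSE on
# the intensity head, so gradients from both tasks shape the encoder.
```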
5. A Neural Topical Expansion Framework for Unstructured Persona-oriented Dialogue Generation [PDF] [Back to Contents]
Minghong Xu, Piji Li, Haoran Yang, Pengjie Ren, Zhaochun Ren, Zhumin Chen, Jun Ma
Abstract: Unstructured Persona-oriented Dialogue Systems (UPDS) have been demonstrated effective in generating persona-consistent responses by utilizing predefined natural language user persona descriptions (e.g., "I am a vegan"). However, the predefined user persona descriptions are usually short and limited to only a few descriptive words, which makes it hard to correlate them with the dialogues. As a result, existing methods either fail to use the persona description or use it improperly when generating persona-consistent responses. To address this, we propose a neural topical expansion framework, namely Persona Exploration and Exploitation (PEE), which is able to extend the predefined user persona description with semantically correlated content before utilizing it to generate dialogue responses. PEE consists of two main modules: persona exploration and persona exploitation. The former learns to extend the predefined user persona description by mining and correlating it with an existing dialogue corpus using a variational auto-encoder (VAE) based topic model. The latter learns to generate persona-consistent responses by utilizing the predefined and extended user persona descriptions. To make persona exploitation learn to utilize the user persona description more properly, we also introduce two persona-oriented loss functions: a Persona-oriented Matching (P-Match) loss and a Persona-oriented Bag-of-Words (P-BoWs) loss, which supervise persona selection in the encoder and decoder, respectively. Experimental results show that our approach outperforms state-of-the-art baselines in terms of both automatic and human evaluations.
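To make the decoder-side supervision concrete, here is a guess at what a persona-oriented bag-of-words loss could look like in PyTorch: project a decoder state onto the vocabulary and reward probability mass on the (extended) persona words. Tensor names and the exact form are assumptions; the paper's P-BoWs loss may differ in detail.

```python
import torch
import torch.nn.functional as F

def p_bows_loss(decoder_state, bow_proj, persona_word_ids):
    """Hypothetical bag-of-words supervision: `bow_proj` is a linear
    layer (hidden -> vocab); `persona_word_ids` is a (batch, n_words)
    LongTensor of persona-description token ids. Maximizes the average
    log-likelihood the decoder state assigns to persona words."""
    logits = bow_proj(decoder_state)           # (batch, vocab)
    logp = F.log_softmax(logits, dim=-1)
    picked = logp.gather(1, persona_word_ids)  # (batch, n_words)
    return -picked.mean()
```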
6. Multilingual acoustic word embedding models for processing zero-resource languages [PDF] [Back to Contents]
Herman Kamper, Yevgen Matusevych, Sharon Goldwater
Abstract: Acoustic word embeddings are fixed-dimensional representations of variable-length speech segments. In settings where unlabelled speech is the only available resource, such embeddings can be used in "zero-resource" speech search, indexing and discovery systems. Here we propose to train a single supervised embedding model on labelled data from multiple well-resourced languages and then apply it to unseen zero-resource languages. For this transfer learning approach, we consider two multilingual recurrent neural network models: a discriminative classifier trained on the joint vocabularies of all training languages, and a correspondence autoencoder trained to reconstruct word pairs. We test these using a word discrimination task on six target zero-resource languages. When trained on seven well-resourced languages, both models perform similarly and outperform unsupervised models trained on the zero-resource languages. With just a single training language, the second model works better, but performance depends more on the particular training--testing language pair.
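A minimal sketch of the correspondence-autoencoder objective: encode one spoken instance of a word and train the decoder to reconstruct the features of a different instance of the same word. The feed-forward layers below are a simplification; the paper uses recurrent networks over variable-length segments.

```python
import torch
import torch.nn as nn

class CorrespondenceAE(nn.Module):
    """Encoder yields a fixed-dimensional acoustic word embedding; the
    decoder reconstructs a *paired* instance of the same word. Layer
    shapes here are illustrative, not the paper's architecture."""
    def __init__(self, feat_dim=39, emb_dim=130):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(),
                                 nn.Linear(256, emb_dim))
        self.dec = nn.Sequential(nn.Linear(emb_dim, 256), nn.ReLU(),
                                 nn.Linear(256, feat_dim))

    def forward(self, x):
        z = self.enc(x)   # the acoustic word embedding
        return self.dec(z), z

# Training pair (x_a, x_b): two utterances of the same word, possibly
# different speakers; loss = F.mse_loss(model(x_a)[0], x_b), which
# forces z to capture word identity rather than speaker detail.
```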
7. Attractive or Faithful? Popularity-Reinforced Learning for Inspired Headline Generation [PDF] [Back to Contents]
Yun-Zhu Song, Hong-Han Shuai, Sung-Lin Yeh, Yi-Lun Wu, Lun-Wei Ku, Wen-Chih Peng
Abstract: With the rapid proliferation of online media sources and published news, headlines have become increasingly important for attracting readers to news articles, since users may be overwhelmed by the sheer volume of information. In this paper, we generate inspired headlines that preserve the nature of news articles while catching the reader's eye. The task of inspired headline generation can be viewed as a specific form of the Headline Generation (HG) task, with the emphasis on creating an attractive headline from a given news article. To generate inspired headlines, we propose a novel framework called POpularity-Reinforced Learning for inspired Headline Generation (PORL-HG). PORL-HG exploits an extractive-abstractive architecture with 1) Popular Topic Attention (PTA) for guiding the extractor to select the attractive sentence from the article and 2) a popularity predictor for guiding the abstractor to rewrite the attractive sentence. Moreover, since the sentence selection of the extractor is not differentiable, techniques of reinforcement learning (RL) are utilized to bridge the gap, with rewards obtained from a popularity score predictor. Through quantitative and qualitative experiments, we show that the proposed PORL-HG significantly outperforms the state-of-the-art headline generation models in terms of attractiveness, as evaluated by both humans (71.03%) and the predictor (by at least 27.60%), while the faithfulness of PORL-HG is also comparable to the state-of-the-art generation model.
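The RL bridge over the non-differentiable sentence choice can be sketched with a REINFORCE-style loss: sample an extraction, then scale its log-probability by the popularity predictor's reward. Function and variable names are illustrative, not the paper's training loop.

```python
import torch

def reinforce_extractor_loss(sentence_logits, reward):
    """Policy-gradient sketch: `sentence_logits` scores each article
    sentence; `reward` is a scalar, e.g. the popularity predictor's
    score for the rewritten headline. Sampling is non-differentiable,
    so the gradient flows through log_prob instead."""
    dist = torch.distributions.Categorical(logits=sentence_logits)
    idx = dist.sample()                    # which sentence to extract
    loss = -(reward * dist.log_prob(idx))  # higher reward -> reinforce
    return loss, idx

# In practice a baseline is usually subtracted from `reward` to reduce
# the variance of the gradient estimate.
```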
8. Aligning the Pretraining and Finetuning Objectives of Language Models [PDF] [Back to Contents]
Nuo Wang Pierse, Jingwen Lu
Abstract: We demonstrate that explicitly aligning the pretraining objectives with the finetuning objectives in language model training significantly improves finetuning task performance and reduces the minimum number of finetuning examples required. The performance margin gained from objective alignment allows us to build language models with smaller sizes for tasks with less available training data. We provide empirical evidence for these claims by applying objective alignment to concept-of-interest tagging and acronym detection tasks. We found that, with objective alignment, our 768 by 3 and 512 by 3 transformer language models can reach accuracies of 83.9%/82.5% for concept-of-interest tagging and 73.8%/70.2% for acronym detection using only 200 finetuning examples per task, outperforming the 768 by 3 model pretrained without objective alignment by +4.8%/+3.4% and +9.9%/+6.3%. We use the term "Few Example Learning" for finetuning small language models with only hundreds of training examples or fewer. In practice, Few Example Learning enabled by objective alignment not only saves human labeling costs, but also makes it possible to leverage language models in more real-time applications.
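One generic way to align the two objectives, hypothetical with respect to this paper's exact recipe, is to phrase the downstream task in the model's pretraining format so the pretrained masked-language-model head is reused directly, as in this cloze-style probe:

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Generic illustration of objective alignment (not this paper's setup):
# cast the task as the same masked-token prediction used in pretraining.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

inputs = tok("The capital of France is [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = mlm(**inputs).logits
mask_pos = (inputs.input_ids == tok.mask_token_id).nonzero()[0, 1]
print(tok.decode([logits[0, mask_pos].argmax().item()]))  # e.g. "paris"
```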
9. UNCC Biomedical Semantic Question Answering Systems. BioASQ: Task-7B, Phase-B [PDF] [Back to Contents]
Sai Krishna Telukuntla, Aditya Kapri, Wlodek Zadrozny
Abstract: In this paper, we detail our submission to the 2019 (7th year) BioASQ competition. We present our approach for Task-7b, Phase B, the Exact Answering Task. These Question Answering (QA) tasks include Factoid, Yes/No, and List Type question answering. Our system is based on a contextual word embedding model. We have used a Bidirectional Encoder Representations from Transformers (BERT) based system, fine-tuned for the biomedical question answering task using BioBERT. In the third test batch set, our system achieved the highest MRR score for the Factoid Question Answering task. Also, for the List type question answering task, our system achieved the highest recall score in the fourth test batch set. Along with our detailed approach, we present the results for our submissions and also highlight identified downsides of our current approach and ways to improve it in future experiments.
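For the factoid setting, an extractive QA call with a BioBERT-family checkpoint looks roughly like the following. The checkpoint name is illustrative, and the authors' system involved task-specific fine-tuning beyond this sketch.

```python
from transformers import pipeline

# Illustrative checkpoint: a BioBERT model fine-tuned for extractive QA.
qa = pipeline("question-answering",
              model="dmis-lab/biobert-base-cased-v1.1-squad")

result = qa(question="Which gene is mutated in cystic fibrosis?",
            context="Cystic fibrosis is caused by mutations in the "
                    "CFTR gene on chromosome 7.")
print(result["answer"], result["score"])  # span extracted from context
```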
10. Zero-Shot Activity Recognition with Videos [PDF] [Back to Contents]
Evin Pinar Ornek
Abstract: In this paper, we examine the zero-shot activity recognition task using videos. We introduce an auto-encoder based model to construct a multimodal joint embedding space between the visual and textual manifolds. On the visual side, we used activity videos and a state-of-the-art 3D convolutional action recognition network to extract the features. On the textual side, we worked with GloVe word embeddings. The zero-shot recognition results are evaluated by top-n accuracy. Then, the manifold learning ability is measured by mean Nearest Neighbor Overlap. In the end, we provide an extensive discussion of the results and future directions.
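The evaluation step reduces to nearest-neighbor search between a projected video embedding and label word vectors; a sketch follows, assuming the auto-encoder projection has already been trained and `label_vectors` maps activity names to GloVe vectors.

```python
import numpy as np

def zero_shot_predict(video_emb, label_vectors, top_n=5):
    """Rank unseen activity labels by cosine similarity between the
    video's embedding (already projected into the joint space) and each
    label's word vector. Evaluation-side sketch only."""
    names = list(label_vectors)
    M = np.stack([label_vectors[n] for n in names])
    M = M / np.linalg.norm(M, axis=1, keepdims=True)
    v = video_emb / np.linalg.norm(video_emb)
    scores = M @ v
    order = np.argsort(-scores)[:top_n]
    return [(names[i], float(scores[i])) for i in order]

# Top-n accuracy then asks whether the true label appears in this list.
```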
11. Understanding Car-Speak: Replacing Humans in Dealerships [PDF] [Back to Contents]
Habeeb Hooshmand, James Caverlee
Abstract: A large portion of the car-buying experience in the United States involves interactions at a car dealership. At the dealership, the car-buyer relays their needs to a sales representative. However, most car-buyers have only an abstract description of the vehicle they need. Therefore, they are only able to describe their ideal car in "car-speak". Car-speak is abstract language that pertains to a car's physical attributes. In this paper, we define car-speak. We also aim to curate a reasonable dataset of car-speak language. Finally, we train several classifiers in order to classify car-speak.
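A plausible baseline for the final classification step, and only one of several classifiers the authors could have trained, is a TF-IDF pipeline. The queries and labels below are invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy car-speak queries with hypothetical target classes.
queries = ["a sporty car that hugs the road", "roomy and safe for kids",
           "something cheap on gas", "fast and fun on weekends"]
cars = ["coupe", "minivan", "hybrid", "coupe"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(queries, cars)
print(model.predict(["fun little car for twisty roads"]))
```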
12. Stimulating Creativity with FunLines: A Case Study of Humor Generation in Headlines [PDF] [Back to Contents]
Nabil Hossain, John Krumm, Tanvir Sajed, Henry Kautz
Abstract: Building datasets of creative text, such as humor, is quite challenging. We introduce FunLines, a competitive game where players edit news headlines to make them funny, and where they rate the funniness of headlines edited by others. FunLines makes the humor generation process fun, interactive, collaborative, rewarding and educational, keeping players engaged and providing humor data at a very low cost compared to traditional crowdsourcing approaches. FunLines offers useful performance feedback, assisting players in getting better over time at generating and assessing humor, as our analysis shows. This helps to further increase the quality of the generated dataset. We show the effectiveness of this data by training humor classification models that outperform a previous benchmark, and we release this dataset to the public.