Contents
1. Predicting Event Time by Classifying Sub-Level Temporal Relations Induced from a Unified Representation of Time Anchors [PDF] Abstract
2. ANDES at SemEval-2020 Task 12: A jointly-trained BERT multilingual model for offensive language detection [PDF] Abstract
3. Graph-based Modeling of Online Communities for Fake News Detection [PDF] Abstract
4. Unsupervised vs. transfer learning for multimodal one-shot matching of speech and images [PDF] Abstract
5. Language Models as Few-Shot Learner for Task-Oriented Dialogue Systems [PDF] Abstract
6. Speech To Semantics: Improve ASR and NLU Jointly via All-Neural Interfaces [PDF] Abstract
7. Studying Dishonest Intentions in Brazilian Portuguese Texts [PDF] Abstract
8. Hate Speech Detection and Racial Bias Mitigation in Social Media based on BERT model [PDF] Abstract
9. Partial Orders, Residuation, and First-Order Linear Logic [PDF] Abstract
10. An Efficient Model Inference Algorithm for Learning-based Testing of Reactive Systems [PDF] Abstract
11. Annotating for Hate Speech: The MaNeCo Corpus and Some Input from Critical Discourse Analysis [PDF] Abstract
12. Adaptable Multi-Domain Language Model for Transformer ASR [PDF] Abstract
13. A Hybrid BERT and LightGBM based Model for Predicting Emotion GIF Categories on Twitter [PDF] Abstract
14. End-to-End Trainable Self-Attentive Shallow Network for Text-Independent Speaker Verification [PDF] Abstract
Abstracts
1. Predicting Event Time by Classifying Sub-Level Temporal Relations Induced from a Unified Representation of Time Anchors [PDF] Back to Contents
Fei Cheng, Yusuke Miyao
Abstract: Extracting event time from news articles is a challenging but attractive task. In contrast to most existing pair-wise temporal link annotation, Reimers et al. (2016) proposed to annotate the time anchor (a.k.a. the exact time) of each event. Their work represents time anchors with discrete representations of Single-Day/Multi-Day and Certain/Uncertain. This increases the complexity of modeling the temporal relations between two time anchors, which cannot be categorized into the relations of Allen's interval algebra (Allen, 1990). In this paper, we propose an effective method to decompose such complex temporal relations into sub-level relations by introducing a unified quadruple representation for both Single-Day/Multi-Day and Certain/Uncertain time anchors. The temporal relation classifiers are trained in a multi-label classification manner. The system structure of our approach is much simpler than the existing decision tree model (Reimers et al., 2018), which is composed of a dozen node classifiers. Another contribution of this work is the construction of a larger event time corpus (256 news documents) with reasonable Inter-Annotator Agreement (IAA), to overcome the data shortage of the existing event time corpus (36 news documents). The empirical results show that our approach outperforms the state-of-the-art decision tree model and that the increased data size yields a significant performance improvement.
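The abstract does not spell out the quadruple format, so the sketch below is one hypothetical reading: each time anchor is four day-level bounds (earliest/latest begin, earliest/latest end), which covers Single-Day, Multi-Day, Certain and Uncertain anchors uniformly, and the relation between two anchors decomposes into one before/same/after label per bound, matching the multi-label training setup described above.

```python
# Hypothetical unified quadruple for a time anchor; the paper's exact
# representation may differ.
from dataclasses import dataclass
from datetime import date

@dataclass
class TimeAnchor:
    begin_early: date  # earliest possible begin day
    begin_late: date   # latest possible begin day
    end_early: date    # earliest possible end day
    end_late: date     # latest possible end day

def sub_level_relations(a: TimeAnchor, b: TimeAnchor) -> dict:
    """Decompose the complex relation between two anchors into
    sub-level before/same/after labels, one per bound pair."""
    def rel(x: date, y: date) -> str:
        return "before" if x < y else "after" if x > y else "same"
    return {
        "begin_early": rel(a.begin_early, b.begin_early),
        "begin_late": rel(a.begin_late, b.begin_late),
        "end_early": rel(a.end_early, b.end_early),
        "end_late": rel(a.end_late, b.end_late),
    }

# A certain single-day event vs. an uncertain multi-day event.
meeting = TimeAnchor(date(2020, 8, 3), date(2020, 8, 3),
                     date(2020, 8, 3), date(2020, 8, 3))
strike = TimeAnchor(date(2020, 8, 1), date(2020, 8, 2),
                    date(2020, 8, 5), date(2020, 8, 7))
print(sub_level_relations(meeting, strike))
```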
2. ANDES at SemEval-2020 Task 12: A jointly-trained BERT multilingual model for offensive language detection [PDF] Back to Contents
Juan Manuel Pérez, Aymé Arango, Franco Luque
Abstract: This paper describes our participation in SemEval-2020 Task 12: Multilingual Offensive Language Detection. We jointly trained a single model by fine-tuning Multilingual BERT to tackle the task across all the proposed languages: English, Danish, Turkish, Greek and Arabic. Our single model had competitive results, with performance close to the top-performing systems in spite of sharing the same parameters across all languages. Zero-shot and few-shot experiments were also conducted to analyze the transference performance among these languages. We make our code public for further research.
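A minimal sketch of the joint training setup, assuming the standard HuggingFace bert-base-multilingual-cased checkpoint; the toy data and hyperparameters are illustrative, not the authors' configuration. The key point is that examples from all five languages are pooled into one training set that updates one shared set of parameters.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2)  # offensive / not offensive

# Toy pooled corpus: examples from every language share one training set,
# so the same parameters are updated for all languages jointly.
pooled = [("you are awful", 1), ("have a nice day", 0),
          ("hav en god dag", 0)]  # ... plus Turkish, Greek, Arabic examples
enc = tokenizer([t for t, _ in pooled], truncation=True, padding=True,
                return_tensors="pt")
ds = TensorDataset(enc["input_ids"], enc["attention_mask"],
                   torch.tensor([y for _, y in pooled]))

optim = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for input_ids, attention_mask, labels in DataLoader(ds, batch_size=2,
                                                    shuffle=True):
    loss = model(input_ids=input_ids, attention_mask=attention_mask,
                 labels=labels).loss
    loss.backward()
    optim.step()
    optim.zero_grad()
```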
3. Graph-based Modeling of Online Communities for Fake News Detection [PDF] Back to Contents
Shantanu Chandra, Pushkar Mishra, Helen Yannakoudakis, Ekaterina Shutova
Abstract: Over the past few years, there has been substantial effort towards automated detection of fake news. Existing research has modeled the structure, style and content of news articles, as well as the demographic traits of users. However, no attention has been directed towards modeling the properties of online communities that interact with fake news. In this work, we propose a novel approach via graph-based modeling of online communities. Our method aggregates information with respect to: 1) the nature of the content disseminated, 2) content-sharing behavior of users, and 3) the social network of those users. We empirically demonstrate that this yields significant improvements over existing text and user-based techniques for fake news detection.
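A sketch of how the three information sources could be laid out as one heterogeneous graph; the node and edge schema is an assumption for illustration, not the authors' exact construction.

```python
# 1) content, 2) content-sharing behaviour, 3) social network of users,
# combined in a single heterogeneous graph.
import networkx as nx

G = nx.Graph()
G.add_node("article_1", kind="article", text="Miracle cure found ...")
G.add_node("u1", kind="user")
G.add_node("u2", kind="user")
G.add_edge("u1", "article_1", kind="shared")  # sharing behaviour
G.add_edge("u2", "article_1", kind="shared")
G.add_edge("u1", "u2", kind="follows")        # social network

def community_view(g: nx.Graph, article: str) -> dict:
    """Aggregate, for one article, who shared it and how densely
    those sharers are connected to each other."""
    sharers = [n for n in g.neighbors(article)
               if g.nodes[n]["kind"] == "user"]
    internal_ties = g.subgraph(sharers).number_of_edges()
    return {"text": g.nodes[article]["text"],
            "n_sharers": len(sharers),
            "sharer_ties": internal_ties}

print(community_view(G, "article_1"))
```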
4. Unsupervised vs. transfer learning for multimodal one-shot matching of speech and images [PDF] Back to Contents
Leanne Nortje, Herman Kamper
Abstract: We consider the task of multimodal one-shot speech-image matching. An agent is shown a picture along with a spoken word describing the object in the picture, e.g. cookie, broccoli and ice-cream. After observing one paired speech-image example per class, it is shown a new set of unseen pictures, and asked to pick the "ice-cream". Previous work attempted to tackle this problem using transfer learning: supervised models are trained on labelled background data not containing any of the one-shot classes. Here we compare transfer learning to unsupervised models trained on unlabelled in-domain data. On a dataset of paired isolated spoken and visual digits, we specifically compare unsupervised autoencoder-like models to supervised classifier and Siamese neural networks. In both unimodal and multimodal few-shot matching experiments, we find that transfer learning outperforms unsupervised training. We also present experiments towards combining the two methodologies, but find that transfer learning still performs best (despite idealised experiments showing the benefits of unsupervised learning).
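The one-shot matching protocol reduces to nearest-neighbour search in a shared embedding space. In this sketch the encoders are random placeholders; in the paper they would be the unsupervised or transferred networks under comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
embed_speech = lambda wav: rng.standard_normal(64)  # placeholder encoder
embed_image = lambda img: rng.standard_normal(64)   # placeholder encoder

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Support set: one paired (speech, image) example per class.
classes = ["cookie", "broccoli", "ice-cream"]
support = {c: embed_speech(f"{c}.wav") for c in classes}

def match(query_images):
    """Given the spoken query 'ice-cream', pick the closest unseen image."""
    q = support["ice-cream"]
    scores = [cosine(q, embed_image(img)) for img in query_images]
    return int(np.argmax(scores))

print(match(["img_a.png", "img_b.png", "img_c.png"]))
```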
5. Language Models as Few-Shot Learner for Task-Oriented Dialogue Systems [PDF] Back to Contents
Andrea Madotto
Abstract: Task-oriented dialogue systems use four connected modules: Natural Language Understanding (NLU), Dialogue State Tracker (DST), Dialogue Policy (DP) and Natural Language Generator (NLG). A research challenge is to learn each module with the least amount of samples (i.e., few-shots) given the high cost of data collection. The most common and effective technique to solve this problem is transfer learning, where large language models, pre-trained on text or task-specific data, are fine-tuned on the few samples. These methods require fine-tuning steps and a set of parameters for each task. In contrast, language models such as GPT-2 (Radford et al., 2019) and GPT-3 (Brown et al., 2020) allow few-shot learning by priming the model with a few examples. In this paper, we evaluate the few-shot ability of language models such as GPT-2 by priming in the NLU, DST, DP and NLG tasks. Importantly, we highlight the current limitations of this approach and discuss possible implications for future work.
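A minimal sketch of priming: a few labelled examples are concatenated into the prompt and the frozen model completes the next one, with no gradient updates. The intent-tagging template is illustrative, not the paper's exact format.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Two in-context examples prime the model; it completes the third.
prompt = (
    "utterance: book a table for two tonight\tintent: restaurant_booking\n"
    "utterance: play some jazz music\tintent: play_music\n"
    "utterance: wake me up at 7 am\tintent:"
)
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=5,
                     pad_token_id=tokenizer.eos_token_id)
# Decode only the newly generated tokens, i.e. the predicted intent.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:]))
```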
6. Speech To Semantics: Improve ASR and NLU Jointly via All-Neural Interfaces [PDF] Back to Contents
Milind Rao, Anirudh Raju, Pranav Dheram, Bach Bui, Ariya Rastrow
Abstract: We consider the problem of spoken language understanding (SLU): extracting natural language intents and associated slot arguments or named entities from speech that is primarily directed at voice assistants. Such a system subsumes both automatic speech recognition (ASR) and natural language understanding (NLU). An end-to-end joint SLU model can be built to a required specification, opening up the opportunity to deploy on hardware-constrained scenarios like devices enabling voice assistants to work offline, in a privacy-preserving manner, whilst also reducing server costs. We first present models that extract utterance intent directly from speech without intermediate text output. We then present a compositional model, which generates the transcript using the Listen Attend Spell ASR system and then extracts interpretation using a neural NLU model. Finally, we contrast these methods with a jointly trained end-to-end joint SLU model, consisting of ASR and NLU subsystems which are connected by a neural network based interface instead of text, that produces transcripts as well as NLU interpretation. We show that the jointly trained model improves ASR by incorporating semantic information from NLU and also improves NLU by exposing it to ASR confusion encoded in the hidden layer.
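A toy sketch of the all-neural interface idea: the NLU module reads the ASR encoder's hidden states rather than a discrete transcript, so ASR confusion remains visible to NLU and gradients flow end to end. Module choices and dimensions are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class JointSLU(nn.Module):
    def __init__(self, n_mels=80, hidden=256, vocab=1000, n_intents=20):
        super().__init__()
        self.asr_encoder = nn.LSTM(n_mels, hidden, batch_first=True)
        self.asr_head = nn.Linear(hidden, vocab)         # transcript logits
        self.nlu = nn.LSTM(hidden, hidden, batch_first=True)
        self.intent_head = nn.Linear(hidden, n_intents)  # interpretation

    def forward(self, features):
        h, _ = self.asr_encoder(features)     # acoustic hidden states
        transcript_logits = self.asr_head(h)  # ASR output (for the ASR loss)
        # Neural interface: NLU consumes hidden states, including any ASR
        # confusion they encode, instead of a hard text transcript.
        z, _ = self.nlu(h)
        intent_logits = self.intent_head(z[:, -1])
        return transcript_logits, intent_logits

model = JointSLU()
feats = torch.randn(2, 120, 80)  # (batch, frames, mel bins)
asr_out, intent_out = model(feats)
print(asr_out.shape, intent_out.shape)
```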
7. Studying Dishonest Intentions in Brazilian Portuguese Texts [PDF] Back to Contents
Francielle Alves Vargas, Thiago Alexandre Salgueiro Pardo
Abstract: Previous work in the social sciences, psychology and linguistics has shown that liars have some control over the content of their stories; however, their underlying state of mind may "leak out" through the way they tell them. To the best of our knowledge, no previous systematic effort exists to describe and model deception language for Brazilian Portuguese. To fill this important gap, we carry out an initial empirical linguistic study on false statements in Brazilian news. We methodically analyze linguistic features using the this http URL corpus, which includes both fake and true news. The results show that they present substantial lexical, syntactic and semantic variations, as well as punctuation and emotion distinctions.
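As an illustration of the kind of surface features such a study compares across fake and true news, here is a small, assumed feature extractor (lexical variety, punctuation, emphasis cues); the paper's actual feature inventory is richer.

```python
import re
from collections import Counter

def surface_features(text: str) -> dict:
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    return {
        "n_tokens": len(tokens),
        "type_token_ratio": len(counts) / max(len(tokens), 1),  # lexical variety
        "exclamations": text.count("!"),                        # punctuation cue
        "all_caps_words": sum(w.isupper() and len(w) > 1
                              for w in text.split()),           # emphasis cue
    }

print(surface_features("URGENT!! Politician caught in HUGE scandal!"))
print(surface_features("The minister presented the annual budget on Tuesday."))
```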
8. Hate Speech Detection and Racial Bias Mitigation in Social Media based on BERT model [PDF] Back to Contents
Marzieh Mozafari, Reza Farahbakhsh, Noel Crespi
Abstract: Disparate biases associated with datasets and trained classifiers in hateful and abusive content identification tasks have raised many concerns recently. Although the problem of biased datasets for abusive language detection has been addressed more frequently, biases arising from trained classifiers have not yet been a matter of concern. Here, we first introduce a transfer learning approach for hate speech detection based on an existing pre-trained language model, BERT, and evaluate the proposed model on two publicly available datasets that have been annotated for racism, sexism, hate or offensive content on Twitter. Next, we introduce a bias alleviation mechanism to mitigate the effect of bias in the training set during the fine-tuning of our pre-trained BERT-based model for hate speech detection. Toward that end, we use a regularization method to reweight input samples, thereby decreasing the effect of the training set's n-grams that are highly correlated with class labels, and then fine-tune our pre-trained BERT-based model with the newly re-weighted samples. To evaluate our bias alleviation mechanism, we employ a cross-domain approach in which we use the trained classifiers on the aforementioned datasets to predict the labels of two new datasets from Twitter, AAE-aligned and White-aligned groups, which indicate tweets written in African-American English (AAE) and Standard American English (SAE), respectively. The results show the existence of systematic racial bias in the trained classifiers, as they tend to assign tweets written in AAE from the AAE-aligned group to negative classes such as racism, sexism, hate, and offensive more often than tweets written in SAE from the White-aligned group. However, the racial bias in our classifiers reduces significantly after our bias alleviation mechanism is incorporated. This work could constitute the first step towards debiasing hate speech and abusive language detection systems.
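A simplified sketch of the reweighting idea: estimate how strongly each n-gram (here, unigrams) correlates with the positive class, then downweight samples dominated by highly correlated n-grams before fine-tuning. The frequency-ratio correlation and the weighting rule are illustrative stand-ins for the paper's regularization method.

```python
from collections import Counter

samples = [("u r trash", 1), ("love u all", 0),
           ("trash take itself out", 0), ("u people are trash", 1)]

pos, total = Counter(), Counter()
for text, label in samples:
    for tok in text.split():
        total[tok] += 1
        pos[tok] += label

# P(class=1 | token): tokens near 1.0 carry the most label correlation.
corr = {t: pos[t] / total[t] for t in total}

weights = []
for text, _ in samples:
    max_corr = max(corr[t] for t in text.split())
    weights.append(1.0 - 0.5 * max_corr)  # illustrative downweighting rule
print(list(zip([s for s, _ in samples], weights)))
```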
9. Partial Orders, Residuation, and First-Order Linear Logic [PDF] Back to Contents
Richard Moot
Abstract: We will investigate proof-theoretic and linguistic aspects of first-order linear logic. We will show that adding partial order constraints in such a way that each sequent defines a unique linear order on the antecedent formulas of a sequent allows us to define many useful logical operators. In addition, the partial order constraints improve the efficiency of proof search.
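For context, residuation in categorial logics such as the Lambek calculus ties the product to its two directional implications; the paper's point is that first-order linear logic with partial-order constraints on sequents can recover such operators. The standard residuation laws read:

```latex
% Residuation laws of the Lambek calculus: the product \otimes and its
% two directional implications / and \backslash determine one another.
A \otimes B \vdash C
  \quad\Longleftrightarrow\quad
A \vdash C / B
  \quad\Longleftrightarrow\quad
B \vdash A \backslash C
```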
10. An Efficient Model Inference Algorithm for Learning-based Testing of Reactive Systems [PDF] Back to Contents
Muddassar A. Sindhu
Abstract: Learning-based testing (LBT) is an emerging methodology to automate iterative black-box requirements testing of software systems. The methodology involves combining model inference with model checking techniques. However, a variety of optimisations on model inference are necessary in order to achieve scalable testing for large systems. In this paper we describe the IKL learning algorithm which is an active incremental learning algorithm for deterministic Kripke structures. We formally prove the correctness of IKL. We discuss the optimisations it incorporates to achieve scalability of testing. We also evaluate a black box heuristic for test termination based on convergence of IKL learning.
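A high-level sketch of the learning-based testing loop that such an algorithm plugs into; every component here is a deliberately trivial stand-in (the "model" is just a table of observations), since IKL's incremental inference of deterministic Kripke structures is the paper's actual contribution and is not reproduced.

```python
import random

def learning_based_testing(sut, requirement, alphabet, max_iter=50):
    """Generic LBT loop: query the SUT, check the requirement on the real
    observation, and incrementally refine the hypothesis model."""
    model = {}
    for _ in range(max_iter):
        # Active step: generate a test input (a real learner would pick
        # inputs that most reduce uncertainty in the hypothesis).
        test = "".join(random.choices(alphabet, k=random.randint(1, 4)))
        out = sut(test)
        if not requirement(test, out):
            return ("FAIL", test, out)  # requirement violated on the SUT
        model[test] = out               # incremental refinement (stub)
    return ("no violation found", len(model))

# Toy SUT: echoes input but drops 'b' characters; requirement: output
# length equals input length (violated whenever the input contains 'b').
sut = lambda s: s.replace("b", "")
req = lambda i, o: len(o) == len(i)
print(learning_based_testing(sut, req, alphabet="ab"))
```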
11. Annotating for Hate Speech: The MaNeCo Corpus and Some Input from Critical Discourse Analysis [PDF] Back to Contents
Stavros Assimakopoulos, Rebecca Vella Muskat, Lonneke van der Plas, Albert Gatt
Abstract: This paper presents a novel scheme for the annotation of hate speech in corpora of Web 2.0 commentary. The proposed scheme is motivated by the critical analysis of posts made in reaction to news reports on the Mediterranean migration crisis and LGBTIQ+ matters in Malta, which was conducted under the auspices of the EU-funded C.O.N.T.A.C.T. project. Based on the realization that hate speech is not a clear-cut category to begin with, appears to belong to a continuum of discriminatory discourse and is often realized through the use of indirect linguistic means, it is argued that annotation schemes for its detection should refrain from directly including the label 'hate speech,' as different annotators might have different thresholds as to what constitutes hate speech and what not. In view of this, we suggest a multi-layer annotation scheme, which is pilot-tested against a binary +/- hate speech classification and appears to yield higher inter-annotator agreement. Motivating the postulation of our scheme, we then present the MaNeCo corpus on which it will eventually be used; a substantial corpus of on-line newspaper comments spanning 10 years.
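A hypothetical record illustrating the multi-layer idea: annotators fill several lower-threshold layers instead of one binary hate-speech label, and a final category is derived afterwards. The layer names and derivation rule below are invented for illustration; the paper defines the actual scheme.

```python
comment_annotation = {
    "comment_id": "mc-00017",
    "annotator": "A2",
    "layers": {
        "target_group": "migrants",   # who the post is about
        "attitude": "negative",       # stance towards the target
        "incites_action": False,      # calls for action against the target
        "offensive_language": True,   # profanity or slurs present
        "generalisation": True,       # attributes a trait to a whole group
    },
}

def derived_label(ann: dict) -> str:
    """One possible rule combining layers into a final category,
    deferring the contested 'hate speech' judgement."""
    l = ann["layers"]
    if l["attitude"] == "negative" and (l["incites_action"]
                                        or l["generalisation"]):
        return "discriminatory"
    return "non-discriminatory"

print(derived_label(comment_annotation))
```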
12. Adaptable Multi-Domain Language Model for Transformer ASR [PDF] Back to Contents
Taewoo Lee, Min-Joong Lee, Tae Gyoon Kang, Seokyeoung Jung, Minseok Kwon, Yeona Hong, Jungin Lee, Kyoung-Gu Woo, Ho-Gyeong Kim, Jiseung Jeong, Jihyun Lee, Hosik Lee, Young Sang Choi
Abstract: We propose an adapter-based multi-domain Transformer language model (LM) for Transformer ASR. The model consists of a large common LM and small adapters. The model can perform multi-domain adaptation with only the small adapters and their related layers. The proposed model can reuse the fully fine-tuned LM, which is fine-tuned using all layers of the original model. The proposed LM can be expanded to new domains by adding about 2% of parameters for a first domain and 13% of parameters after a second domain. The proposed model is also effective in reducing model maintenance cost, because the costly and time-consuming common LM pre-training process can be omitted. Using the proposed adapter-based approach, we observed that a general LM with an adapter can outperform a dedicated music-domain LM in terms of word error rate (WER).
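A sketch of the adapter pattern presumably at work here: a small bottleneck module with a residual connection follows a frozen common-LM layer, and only the per-domain adapters are trained. Sizes and layer choices are assumptions for illustration.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, d_model=512, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))  # residual bottleneck

class AdaptedLayer(nn.Module):
    """A frozen common-LM layer followed by per-domain adapters."""
    def __init__(self, d_model=512, domains=("general", "music")):
        super().__init__()
        self.common = nn.TransformerEncoderLayer(d_model, nhead=8,
                                                 batch_first=True)
        for p in self.common.parameters():
            p.requires_grad = False  # the big common LM stays fixed
        self.adapters = nn.ModuleDict({d: Adapter(d_model) for d in domains})

    def forward(self, x, domain):
        return self.adapters[domain](self.common(x))

layer = AdaptedLayer()
x = torch.randn(2, 10, 512)
print(layer(x, "music").shape)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable share: {trainable / total:.1%}")  # adapters only
```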
13. A Hybrid BERT and LightGBM based Model for Predicting Emotion GIF Categories on Twitter [PDF] Back to Contents
Ye Bi, Shuo Wang, Zhongrui Fan
Abstract: Animated Graphical Interchange Format (GIF) images have been widely used on social media as an intuitive way of expressing emotion. Given their expressiveness, GIFs offer a more nuanced and precise way to convey emotions. In this paper, we present our solution for the EmotionGIF 2020 challenge, the shared task of SocialNLP 2020. To recommend GIF categories for unlabeled tweets, we regard this problem as a kind of matching task and propose a learning-to-rank framework based on Bidirectional Encoder Representations from Transformers (BERT) and LightGBM. Our team won 4th place with a Mean Average Precision @ 6 (MAP@6) score of 0.5394 on the round 1 leaderboard.
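A toy sketch of the hybrid pipeline, under the assumption that BERT-derived embeddings of (tweet, candidate category) pairs feed a LightGBM ranker that orders the candidate categories per tweet; data shapes and features are illustrative, not the authors' design.

```python
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
n_tweets, n_categories, dim = 20, 6, 32

# Stand-in for BERT embeddings of (tweet, candidate category) pairs.
X = rng.standard_normal((n_tweets * n_categories, dim))
y = rng.integers(0, 2, size=n_tweets * n_categories)  # category relevant?
group = [n_categories] * n_tweets  # one ranking group per tweet

ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=50)
ranker.fit(X, y, group=group)

# Recommend: sort one tweet's candidate categories by predicted score,
# then keep the top 6 for the MAP@6 evaluation.
scores = ranker.predict(X[:n_categories])
print(np.argsort(scores)[::-1])  # ranked category indices for tweet 0
```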
14. End-to-End Trainable Self-Attentive Shallow Network for Text-Independent Speaker Verification [PDF] Back to Contents
Hyeonmook Park, Jungbae Park, Sang Wan Lee
Abstract: The generalized end-to-end (GE2E) model is widely used in speaker verification (SV) due to its expandability and generality regardless of specific languages. However, the long short-term memory (LSTM) based GE2E model has two limitations: first, the GE2E embedding suffers from vanishing gradients, which leads to performance degradation for very long input sequences; secondly, utterances are not represented as properly fixed-dimensional vectors. In this paper, to overcome the issues mentioned above, we propose a novel framework for SV, the end-to-end trainable self-attentive shallow network (SASN), incorporating a time-delay neural network (TDNN) and a self-attentive pooling mechanism based on the self-attentive x-vector system during the utterance embedding phase. We demonstrate that the proposed model is highly efficient and provides more accurate speaker verification than GE2E. For the VCTK dataset, at just less than half the size of GE2E, the proposed model showed significant performance improvement over GE2E of about 63%, 67%, and 85% in EER (equal error rate), DCF (detection cost function), and AUC (area under the curve), respectively. Notably, when the input length becomes longer, the DCF score improvement of the proposed model is about 17 times greater than that of GE2E.
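A sketch of the two named ingredients: TDNN layers as dilated 1-D convolutions over frames, and a self-attentive pooling layer that collapses a variable-length sequence into one fixed-size utterance embedding (addressing both limitations cited above). Dimensions are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class SelfAttentivePooling(nn.Module):
    def __init__(self, dim=512):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, 1))

    def forward(self, h):                            # h: (batch, frames, dim)
        w = torch.softmax(self.attention(h), dim=1)  # one weight per frame
        return (w * h).sum(dim=1)        # fixed-size utterance embedding

# TDNN layers are dilated 1-D convolutions over frames (channels first).
tdnn = nn.Sequential(
    nn.Conv1d(40, 512, kernel_size=5, dilation=1), nn.ReLU(),
    nn.Conv1d(512, 512, kernel_size=3, dilation=2), nn.ReLU(),
)
pool = SelfAttentivePooling(512)

feats = torch.randn(2, 40, 300)        # (batch, mel bins, frames)
frames = tdnn(feats).transpose(1, 2)   # -> (batch, frames, 512)
print(pool(frames).shape)              # torch.Size([2, 512])
```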