Table of Contents
1. Linked Credibility Reviews for Explainable Misinformation Detection [PDF]
2. Two Step Joint Model for Drug Drug Interaction Extraction [PDF]
3. The Adapter-Bot: All-In-One Controllable Conversational Model [PDF]
4. Cost-Quality Adaptive Active Learning for Chinese Clinical Named Entity Recognition [PDF]
5. Misogynistic Tweet Detection: Modelling CNN with Small Datasets [PDF]
6. QutNocturnal@HASOC'19: CNN for Hate Speech and Offensive Content Identification in Hindi Language [PDF]
7. Language Models as Emotional Classifiers for Textual Conversations [PDF]
8. Repurposing TREC-COVID Annotations to Answer the Key Questions of CORD-19 [PDF]
9. Neural Generation Meets Real People: Towards Emotionally Engaging Mixed-Initiative Conversations [PDF]
10. Temporal Random Indexing of Context Vectors Applied to Event Detection [PDF]
11. A Dataset and Baselines for Visual Question Answering on Art [PDF]

Abstracts
1. Linked Credibility Reviews for Explainable Misinformation Detection [PDF]
Ronald Denaux, Jose Manuel Gomez-Perez
Abstract: In recent years, misinformation on the Web has become increasingly rampant. The research community has responded by proposing systems and challenges, which are beginning to be useful for (various subtasks of) detecting misinformation. However, most proposed systems are based on deep learning techniques which are fine-tuned to specific domains, are difficult to interpret and produce results which are not machine readable. This limits their applicability and adoption as they can only be used by a select expert audience in very specific settings. In this paper we propose an architecture based on a core concept of Credibility Reviews (CRs) that can be used to build networks of distributed bots that collaborate for misinformation detection. The CRs serve as building blocks to compose graphs of (i) web content, (ii) existing credibility signals --fact-checked claims and reputation reviews of websites--, and (iii) automatically computed reviews. We implement this architecture on top of lightweight extensions to this http URL and services providing generic NLP tasks for semantic similarity and stance detection. Evaluations on existing datasets of social-media posts, fake news and political speeches demonstrate several advantages over existing systems: extensibility, domain-independence, composability, explainability and transparency via provenance. Furthermore, we obtain competitive results without requiring fine-tuning and establish a new state of the art on the Clef'18 CheckThat! Factuality task.
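As a rough illustration of how credibility reviews might compose into a graph, the sketch below defines a minimal review record and an aggregation rule that propagates the strongest linked signal to a new piece of content. The schema, the aggregation rule, and the names (`CredibilityReview`, `review_content`) are hypothetical; the paper's actual data model and provenance representation are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CredibilityReview:
    """One node in a credibility-review graph (hypothetical schema)."""
    item: str                      # the thing being reviewed (URL, claim, post id, ...)
    rating: float                  # -1.0 (not credible) .. 1.0 (credible)
    confidence: float              # 0.0 .. 1.0
    based_on: List["CredibilityReview"] = field(default_factory=list)  # provenance links

def review_content(item: str, linked: List[CredibilityReview]) -> CredibilityReview:
    """Compose a review for `item` from reviews of the things it links to.
    Assumed rule: propagate the most confident, most polarised linked signal."""
    if not linked:
        return CredibilityReview(item, rating=0.0, confidence=0.0)
    strongest = max(linked, key=lambda r: r.confidence * abs(r.rating))
    return CredibilityReview(item, rating=strongest.rating,
                             confidence=strongest.confidence, based_on=linked)

# Usage: a post repeating a fact-checked false claim inherits low credibility,
# and the based_on chain preserves provenance for explanation.
fact_check = CredibilityReview("claim: 'X cures Y'", rating=-0.9, confidence=0.95)
site_rep = CredibilityReview("example.org reputation", rating=0.2, confidence=0.5)
print(review_content("post 12345", [fact_check, site_rep]))
```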
2. Two Step Joint Model for Drug Drug Interaction Extraction [PDF]
Siliang Tang, Qi Zhang, Tianpeng Zheng, Mengdi Zhou, Zhan Chen, Lixing Shen, Xiang Ren, Yueting Zhuang, Shiliang Pu, Fei Wu
Abstract: When patients need to take medicine, particularly when taking more than one kind of drug simultaneously, they should be alert to possible drug-drug interactions. Interaction between drugs may have a negative impact on patients or even cause death. Generally, drugs that conflict with a specific drug (or label drug) are usually described in its drug label or package insert. Since more and more new drug products come into the market, it is difficult to collect such information manually. We take part in the Drug-Drug Interaction (DDI) Extraction from Drug Labels challenge of the Text Analysis Conference (TAC) 2018, choosing task1 and task2 to automatically extract DDI-related mentions and DDI relations respectively. Instead of treating task1 as a named entity recognition (NER) task and task2 as a relation extraction (RE) task and solving them in a pipeline, we propose a two-step joint model to detect DDIs and their related mentions jointly. A sequence tagging system (CNN-GRU encoder-decoder) first finds precipitants; in the second step it searches for each precipitant's fine-grained trigger and determines the corresponding DDI. Moreover, a rule-based model is built to determine the sub-type for pharmacokinetic interactions. Our system achieved the best results in both task1 and task2, with F-measure reaching 0.46 in task1 and 0.40 in task2.
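A minimal sketch of the first-step tagger is given below, assuming a CNN layer feeding a bidirectional GRU with a per-token tag head; the exact encoder-decoder wiring, tag set, and hyper-parameters in the paper may differ, and the second step (trigger search, DDI typing, and the pharmacokinetic rule layer) is only indicated in comments.

```python
import torch
import torch.nn as nn

class CNNGRUTagger(nn.Module):
    """Step 1 of a two-step DDI pipeline: CNN features feed a GRU, and a
    per-token head emits tags (e.g. BIO tags for precipitant mentions).
    Dimensions are placeholders, not the paper's values."""
    def __init__(self, vocab_size, n_tags, emb_dim=100, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.conv = nn.Conv1d(emb_dim, hidden, kernel_size=3, padding=1)
        self.gru = nn.GRU(hidden, hidden, batch_first=True, bidirectional=True)
        self.tag = nn.Linear(2 * hidden, n_tags)

    def forward(self, token_ids):                 # (batch, seq_len)
        x = self.emb(token_ids).transpose(1, 2)   # (batch, emb_dim, seq_len)
        x = torch.relu(self.conv(x)).transpose(1, 2)
        x, _ = self.gru(x)                        # (batch, seq_len, 2*hidden)
        return self.tag(x)                        # per-token tag logits

# Step 2 (not shown): for each predicted precipitant, a second classifier
# picks its trigger and DDI type; a rule layer assigns pharmacokinetic sub-types.
logits = CNNGRUTagger(vocab_size=5000, n_tags=5)(torch.randint(1, 5000, (2, 30)))
print(logits.shape)  # torch.Size([2, 30, 5])
```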
3. The Adapter-Bot: All-In-One Controllable Conversational Model [PDF]
Andrea Madotto, Zhaojiang Lin, Yejin Bang, Pascale Fung
Abstract: Considerable progress has been made towards conversational models that generate coherent and fluent responses by training large language models on large dialogue datasets. These models have little or no control of the generated responses and miss two important features: continuous dialogue skills integration and seamlessly leveraging diverse knowledge sources. In this paper, we propose the Adapter-Bot, a dialogue model that uses a fixed backbone conversational model such as DialGPT (Zhang et al., 2019) and triggers on-demand dialogue skills (e.g., emphatic response, weather information, movie recommendation) via different adapters (Houlsby et al., 2019). Each adapter can be trained independently, thus allowing a continual integration of skills without retraining the entire model. Depending on the skills, the model is able to process multiple knowledge types, such as text, tables, and graphs, in a seamless manner. The dialogue skills can be triggered automatically via a dialogue manager, or manually, thus allowing high-level control of the generated responses. At the current stage, we have implemented 12 response styles (e.g., positive, negative etc.), 8 goal-oriented skills (e.g. weather information, movie recommendation, etc.), and personalized and emphatic responses. We evaluate our model using automatic evaluation by comparing it with existing state-of-the-art conversational models, and we have released an interactive system at this http URL.
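The sketch below shows the core idea of a bottleneck adapter added to a frozen backbone, plus a hypothetical dispatch step where a dialogue manager activates one skill adapter per turn. The dimensions, the residual form, and the `adapters` dictionary are illustrative assumptions, not the released system's code.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Houlsby-style bottleneck adapter: the only parameters trained for a
    new dialogue skill while the backbone LM stays frozen. Sizes are illustrative."""
    def __init__(self, d_model=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)

    def forward(self, hidden):
        return hidden + self.up(torch.relu(self.down(hidden)))  # residual connection

# Hypothetical dispatch: the dialogue manager picks which skill adapter to
# activate for the current turn (automatically or via manual control).
adapters = {"weather": Adapter(), "movies": Adapter(), "empathetic": Adapter()}
hidden = torch.randn(1, 12, 768)        # backbone hidden states for one turn
skill = "weather"                       # chosen by the dialogue manager
print(adapters[skill](hidden).shape)    # torch.Size([1, 12, 768])
```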
4. Cost-Quality Adaptive Active Learning for Chinese Clinical Named Entity Recognition [PDF]
Tingting Cai, Yangming Zhou, Hong Zheng
Abstract: Clinical Named Entity Recognition (CNER) aims to automatically identify clinical terminologies in Electronic Health Records (EHRs), which is a fundamental and crucial step for clinical research. To train a high-performance model for CNER, it usually requires a large number of EHRs with high-quality labels. However, labeling EHRs, especially Chinese EHRs, is time-consuming and expensive. One effective solution to this is active learning, where a model asks labelers to annotate data which the model is uncertain of. Conventional active learning assumes a single labeler that always replies noiseless answers to queried labels. However, in real settings, multiple labelers provide diverse quality of annotation with varied costs and labelers with low overall annotation quality can still assign correct labels for some specific instances. In this paper, we propose a Cost-Quality Adaptive Active Learning (CQAAL) approach for CNER in Chinese EHRs, which maintains a balance between the annotation quality, labeling costs, and the informativeness of selected instances. Specifically, CQAAL selects cost-effective instance-labeler pairs to achieve better annotation quality with lower costs in an adaptive manner. Computational results on the CCKS-2017 Task 2 benchmark dataset demonstrate the superiority and effectiveness of the proposed CQAAL.
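The abstract does not spell out the selection criterion, so the following sketch only illustrates the general shape of a cost-quality trade-off: score each (instance, labeler) pair by model uncertainty times estimated labeler quality per unit cost, and pick the best pair. The multiplicative form and all parameter names are assumptions, not CQAAL's actual adaptive rule.

```python
import numpy as np

def select_pair(uncertainty, labeler_quality, labeler_cost):
    """Pick an (instance, labeler) pair trading off informativeness,
    expected annotation quality, and cost.
    uncertainty:     (n_instances,) model uncertainty per unlabeled instance
    labeler_quality: (n_labelers,)  estimated per-labeler accuracy
    labeler_cost:    (n_labelers,)  price per label
    """
    score = np.outer(uncertainty, labeler_quality / labeler_cost)
    i, j = np.unravel_index(np.argmax(score), score.shape)
    return i, j   # instance index, labeler index

inst, lab = select_pair(np.array([0.9, 0.2, 0.6]),   # uncertain instances
                        np.array([0.95, 0.70]),      # labeler accuracies
                        np.array([3.0, 1.0]))        # labeler costs
print(inst, lab)
```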
5. Misogynistic Tweet Detection: Modelling CNN with Small Datasets [PDF]
Md Abul Bashar, Richi Nayak, Nicolas Suzor, Bridget Weir
Abstract: Online abuse directed towards women on the social media platform Twitter has attracted considerable attention in recent years. An automated method to effectively identify misogynistic abuse could improve our understanding of the patterns, driving factors, and effectiveness of responses associated with abusive tweets over a sustained time period. However, training a neural network (NN) model with a small set of labelled data to detect misogynistic tweets is difficult. This is partly due to the complex nature of tweets which contain misogynistic content, and the vast number of parameters needed to be learned in a NN model. We have conducted a series of experiments to investigate how to train a NN model to detect misogynistic tweets effectively. In particular, we have customised and regularised a Convolutional Neural Network (CNN) architecture and shown that the word vectors pre-trained on a task-specific domain can be used to train a CNN model effectively when a small set of labelled data is available. A CNN model trained in this way yields an improved accuracy over the state-of-the-art models.
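A common way to realise "a CNN on top of task-specific pretrained word vectors with strong regularization" is the frozen-embedding text CNN below; the filter sizes, dropout rate, and dimensions are placeholders rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class TweetCNN(nn.Module):
    """Small-data text CNN: task-domain pretrained word vectors are loaded
    and frozen, and dropout regularizes the few trained parameters."""
    def __init__(self, pretrained_vectors, n_classes=2, n_filters=100):
        super().__init__()
        self.emb = nn.Embedding.from_pretrained(pretrained_vectors, freeze=True)
        emb_dim = pretrained_vectors.size(1)
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, n_filters, k) for k in (2, 3, 4)])
        self.drop = nn.Dropout(0.5)
        self.out = nn.Linear(3 * n_filters, n_classes)

    def forward(self, ids):                        # (batch, seq_len)
        x = self.emb(ids).transpose(1, 2)          # (batch, emb_dim, seq_len)
        pooled = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.out(self.drop(torch.cat(pooled, dim=1)))

vectors = torch.randn(5000, 100)                   # stand-in for pretrained vectors
logits = TweetCNN(vectors)(torch.randint(0, 5000, (4, 30)))
print(logits.shape)                                # torch.Size([4, 2])
```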
6. QutNocturnal@HASOC'19: CNN for Hate Speech and Offensive Content Identification in Hindi Language [PDF]
Md Abul Bashar, Richi Nayak
Abstract: We describe our top-team solution to Task 1 for Hindi in the HASOC contest organised by FIRE 2019. The task is to identify hate speech and offensive language in Hindi. More specifically, it is a binary classification problem where a system is required to classify tweets into two classes: (a) \emph{Hate and Offensive (HOF)} and (b) \emph{Not Hate or Offensive (NOT)}. In contrast to the popular idea of pretraining word vectors (a.k.a. word embedding) with a large corpus from a general domain such as Wikipedia, we used a relatively small collection of relevant tweets (i.e. random and sarcasm tweets in Hindi and Hinglish) for pretraining. We trained a Convolutional Neural Network (CNN) on top of the pretrained word vectors. This approach allowed us to be ranked first for this task out of all teams. Our approach could easily be adapted to other applications where the goal is to predict class of a text when the provided context is limited.
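The distinctive step here is pretraining word vectors on a small, task-relevant tweet collection instead of a general corpus such as Wikipedia. A minimal gensim sketch follows (gensim 4.x API assumed; the tweets and hyper-parameters are toy stand-ins, not the paper's settings).

```python
from gensim.models import Word2Vec

# Toy stand-in corpus: in the paper, vectors are pretrained on a collection
# of relevant Hindi/Hinglish tweets before training the CNN classifier.
tweets = [
    ["yeh", "movie", "bahut", "acchi", "hai"],
    ["aaj", "ka", "din", "kharab", "tha"],
]

# gensim 4.x parameter names assumed (vector_size/epochs); 3.x uses size/iter.
w2v = Word2Vec(sentences=tweets, vector_size=100, window=5,
               min_count=1, sg=1, epochs=10)

# The resulting matrix can seed the CNN's embedding layer
# (e.g. via nn.Embedding.from_pretrained, as in the previous sketch).
print(w2v.wv["movie"].shape)      # (100,)
```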
7. Language Models as Emotional Classifiers for Textual Conversations [PDF]
Connor T. Heaton, David M. Schwartz
Abstract: Emotions play a critical role in our everyday lives by altering how we perceive, process and respond to our environment. Affective computing aims to instill in computers the ability to detect and act on the emotions of human actors. A core aspect of any affective computing system is the classification of a user's emotion. In this study we present a novel methodology for classifying emotion in a conversation. At the backbone of our proposed methodology is a pre-trained Language Model (LM), which is supplemented by a Graph Convolutional Network (GCN) that propagates information over the predicate-argument structure identified in an utterance. We apply our proposed methodology on the IEMOCAP and Friends data sets, achieving state-of-the-art performance on the former and a higher accuracy on certain emotional labels on the latter. Furthermore, we examine the role context plays in our methodology by altering how much of the preceding conversation the model has access to when making a classification.
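A single graph-convolution step over a predicate-argument graph can be written as mean aggregation of neighbouring token states followed by a linear map, as sketched below. How the paper fuses this output with the pretrained LM's classification head is not shown, and the dimensions and the example edge are made up.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph-convolution step over an utterance's predicate-argument
    graph: each token mixes its LM embedding with its graph neighbours'."""
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, h, adj):
        # h:   (n_tokens, dim) contextual embeddings from a pretrained LM
        # adj: (n_tokens, n_tokens) adjacency matrix with self-loops
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        return torch.relu(self.lin((adj / deg) @ h))   # mean aggregation + transform

n = 6
h = torch.randn(n, 768)                                # e.g. BERT token states
adj = torch.eye(n)
adj[0, 3] = adj[3, 0] = 1.0                            # one predicate-argument edge
print(GCNLayer(768)(h, adj).shape)                     # torch.Size([6, 768])
```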
8. Repurposing TREC-COVID Annotations to Answer the Key Questions of CORD-19 [PDF]
Connor T. Heaton, Prasenjit Mitra
Abstract: The novel coronavirus disease 2019 (COVID-19) began in Wuhan, China in late 2019 and to date has infected over 14M people worldwide, resulting in over 750,000 deaths. On March 10, 2020 the World Health Organization (WHO) declared the outbreak a global pandemic. Many academics and researchers, not restricted to the medical domain, began publishing papers describing new discoveries. However, with the large influx of publications, it was hard for these individuals to sift through the large amount of data and make sense of the findings. The White House and a group of industry research labs, led by the Allen Institute for AI, aggregated over 200,000 journal articles related to a variety of coronaviruses and tasked the community with answering key questions related to the corpus, releasing the dataset as CORD-19. The information retrieval (IR) community repurposed the journal articles within CORD-19 to more closely resemble a classic TREC-style competition, dubbed TREC-COVID, with human annotators providing relevancy judgements at the end of each round of competition. Seeing the related endeavors, we set out to repurpose the relevancy annotations for TREC-COVID tasks to identify journal articles in CORD-19 which are relevant to the key questions posed by CORD-19. A BioBERT model trained on this repurposed dataset prescribes relevancy annotations for CORD-19 tasks that have an overall agreement of 0.4430 with majority human annotations in terms of Cohen's kappa. We present the methodology used to construct the new dataset and describe the decision process used throughout.
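Two concrete pieces of this pipeline are easy to sketch: loading a BioBERT checkpoint for sequence classification and measuring agreement with majority human labels via Cohen's kappa. The checkpoint name below is one commonly used public BioBERT release, not necessarily the one the authors used, and the label lists are fabricated solely to show the call.

```python
from sklearn.metrics import cohen_kappa_score
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumed checkpoint name; binary relevant / not-relevant classification head.
tok = AutoTokenizer.from_pretrained("dmis-lab/biobert-v1.1")
model = AutoModelForSequenceClassification.from_pretrained(
    "dmis-lab/biobert-v1.1", num_labels=2)

# Agreement between model-prescribed labels and majority human labels,
# reported as Cohen's kappa (the abstract reports 0.4430 overall).
model_labels = [1, 0, 1, 1, 0, 1, 0, 0]       # toy values
human_majority = [1, 0, 0, 1, 0, 1, 1, 0]     # toy values
print(cohen_kappa_score(human_majority, model_labels))
```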
9. Neural Generation Meets Real People: Towards Emotionally Engaging Mixed-Initiative Conversations [PDF]
Ashwin Paranjape, Abigail See, Kathleen Kenealy, Haojun Li, Amelia Hardy, Peng Qi, Kaushik Ram Sadagopan, Nguyet Minh Phu, Dilara Soylu, Christopher D. Manning
Abstract: We present Chirpy Cardinal, an open-domain dialogue agent, as a research platform for the 2019 Alexa Prize competition. Building an open-domain socialbot that talks to real people is challenging - such a system must meet multiple user expectations such as broad world knowledge, conversational style, and emotional connection. Our socialbot engages users on their terms prioritizing their interests, feelings and autonomy. As a result, our socialbot provides a responsive, personalized user experience, capable of talking knowledgeably about a wide variety of topics, as well as chatting empathetically about ordinary life. Neural generation plays a key role in achieving these goals, providing the backbone for our conversational and emotional tone. At the end of the competition, Chirpy Cardinal progressed to the finals with an average rating of 3.6/5.0, a median conversation duration of 2 minutes 16 seconds, and a 90th percentile duration of over 12 minutes.
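The abstract describes the system rather than an algorithm, so the fragment below is purely a hypothetical illustration of mixed-initiative response selection: several generators (scripted, retrieval, neural) propose candidates and the highest-priority one is returned. Nothing here is taken from the Chirpy Cardinal codebase.

```python
def choose_response(candidates):
    """candidates: list of (priority, text) proposals; higher priority wins."""
    return max(candidates, key=lambda c: c[0])[1]

turn = [
    (0.9, "Oh no, that sounds stressful. What happened next?"),   # neural, empathetic
    (0.4, "Did you know hummingbirds can fly backwards?"),        # scripted fact
]
print(choose_response(turn))
```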
10. Temporal Random Indexing of Context Vectors Applied to Event Detection [PDF]
Yashank Singh, Niladri Chatterjee
Abstract: In this paper we explore new representations for encoding language data. The general method of one-hot encoding grows linearly with the size of the word corpus in space-complexity. We address this by using Random Indexing (RI) of context vectors with nonzero entries. We propose a novel RI representation where we exploit the effect of imposing a probability distribution on the number of randomized entries, which leads to a class of RI representations. We also propose an algorithm to track the semantic relationship of a key word to other words, and hence an algorithm for suggesting events that could happen relevant to the word in question. Finally, we run simulations on the novel RI representations using the proposed algorithms for tweets relevant to the word "iPhone" and present results. The RI representation is shown to be faster and more space-efficient than BoW embeddings.
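Random Indexing itself is straightforward to sketch: each word gets a sparse ternary index vector, and a word's context vector accumulates the index vectors of its neighbours. The paper's novelty, drawing the number of nonzero entries from a probability distribution, is simplified here to a fixed count; the dimensions and the toy corpus are placeholders.

```python
import numpy as np

def random_index_vector(rng, dim=1000, n_nonzero=10):
    """Sparse ternary index vector: n_nonzero entries of +/-1, rest zero.
    The paper varies n_nonzero via a probability distribution; a fixed
    count is used here for simplicity."""
    v = np.zeros(dim)
    idx = rng.choice(dim, size=n_nonzero, replace=False)
    v[idx] = rng.choice([-1.0, 1.0], size=n_nonzero)
    return v

def context_vectors(tokenized_docs, dim=1000, window=2):
    """Each word's context vector is the sum of the index vectors of words
    co-occurring within +/-window positions."""
    rng = np.random.default_rng(0)
    vocab = {w for doc in tokenized_docs for w in doc}
    index = {w: random_index_vector(rng, dim) for w in vocab}
    ctx = {w: np.zeros(dim) for w in vocab}
    for doc in tokenized_docs:
        for i, w in enumerate(doc):
            lo, hi = max(0, i - window), min(len(doc), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    ctx[w] += index[doc[j]]
    return ctx

vecs = context_vectors([["iphone", "launch", "event"],
                        ["iphone", "sales", "event"]])
print(vecs["iphone"][:5])
```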
11. A Dataset and Baselines for Visual Question Answering on Art [PDF]
Noa Garcia, Chentao Ye, Zihua Liu, Qingtao Hu, Mayu Otani, Chenhui Chu, Yuta Nakashima, Teruko Mitamura
Abstract: Answering questions related to art pieces (paintings) is a difficult task, as it implies the understanding of not only the visual information that is shown in the picture, but also the contextual knowledge that is acquired through the study of the history of art. In this work, we introduce our first attempt towards building a new dataset, coined AQUA (Art QUestion Answering). The question-answer (QA) pairs are automatically generated using state-of-the-art question generation methods based on paintings and comments provided in an existing art understanding dataset. The QA pairs are cleansed by crowdsourcing workers with respect to their grammatical correctness, answerability, and answers' correctness. Our dataset inherently consists of visual (painting-based) and knowledge (comment-based) questions. We also present a two-branch model as baseline, where the visual and knowledge questions are handled independently. We extensively compare our baseline model against the state-of-the-art models for question answering, and we provide a comprehensive study about the challenges and potential future directions for visual question answering on art.
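The two-branch baseline can be pictured as a router that sends visual questions to a VQA model over the painting and knowledge questions to a text QA model over the painting's comment. The keyword-based routing rule and the lambda stand-ins below are assumptions for illustration; in the paper the branch decision and both answerers are learned models.

```python
def answer_art_question(question, painting, vqa_model, text_qa_model, comment):
    """Two-branch baseline (hypothetical routing rule): visual questions go
    to a VQA model, knowledge questions to a text QA model over the comment."""
    knowledge_cues = ("who painted", "when", "which century", "what style")
    if any(cue in question.lower() for cue in knowledge_cues):
        return text_qa_model(question, comment)       # knowledge branch
    return vqa_model(question, painting)              # visual branch

# Toy stand-ins so the sketch runs end to end.
print(answer_art_question("Who painted this portrait?", None,
                          vqa_model=lambda q, img: "a dog",
                          text_qa_model=lambda q, ctx: "Rembrandt",
                          comment="Self-portrait by Rembrandt, 1659."))
```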