Abstracts

1. Information Extraction of Clinical Trial Eligibility Criteria
Yitong Tseo, M. I. Salkola, Ahmed Mohamed, Anuj Kumar, Freddy Abnousi
Abstract: Clinical trials predicate subject eligibility on a diversity of criteria ranging from patient demographics to food allergies. Trials post their requirements as semantically complex, unstructured free-text. Formalizing trial criteria to a computer-interpretable syntax would facilitate eligibility determination. In this paper, we investigate an information extraction (IE) approach for grounding criteria from trials in this http URL to a shared knowledge base. We frame the problem as a novel knowledge base population task, and implement a solution combining machine learning and context-free grammar. To our knowledge, this work is the first criteria extraction system to apply an attention-based conditional random field architecture for named entity recognition (NER), and word2vec embedding clustering for named entity linking (NEL). We release the resources and core components of our system on GitHub. Finally, we report our per-module and end-to-end performance; we conclude that our system is competitive with Criteria2Query, which we view as the current state of the art in criteria extraction.

2. Low-resource Languages: A Review of Past Work and Future Challenges
Alexandre Magueresse, Vincent Carles, Evan Heetderks
Abstract: A current problem in NLP is massaging and processing low-resource languages which lack useful training attributes such as supervised data, number of native speakers or experts, etc. This review paper concisely summarizes previous groundbreaking achievements made towards resolving this problem, and analyzes potential improvements in the context of the overall future research direction.

3. SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020)
Marcos Zampieri, Preslav Nakov, Sara Rosenthal, Pepa Atanasova, Georgi Karadzhov, Hamdy Mubarak, Leon Derczynski, Zeses Pitenis, Çağrı Çöltekin
Abstract: We present the results and main findings of SemEval-2020 Task 12 on Multilingual Offensive Language Identification in Social Media (OffensEval 2020). The task involves three subtasks corresponding to the hierarchical taxonomy of the OLID schema (Zampieri et al., 2019a) from OffensEval 2019. The task featured five languages: English, Arabic, Danish, Greek, and Turkish for Subtask A. In addition, English also featured Subtasks B and C. OffensEval 2020 was one of the most popular tasks at SemEval-2020 attracting a large number of participants across all subtasks and also across all languages. A total of 528 teams signed up to participate in the task, 145 teams submitted systems during the evaluation period, and 70 submitted system description papers.

4. Speaker Sensitive Response Evaluation Model
JinYeong Bak, Alice Oh
Abstract: Automatic evaluation of open-domain dialogue response generation is very challenging because there are many appropriate responses for a given context. Existing evaluation models merely compare the generated response with the ground truth response and rate many of the appropriate responses as inappropriate if they deviate from the ground truth. One approach to resolve this problem is to consider the similarity of the generated response with the conversational context. In this paper, we propose an automatic evaluation model based on that idea and learn the model parameters from an unlabeled conversation corpus. Our approach considers the speakers in defining the different levels of similar context. We use a Twitter conversation corpus that contains many speakers and conversations to test our evaluation model. Experiments show that our model outperforms the other existing evaluation metrics in terms of high correlation with human annotation scores. We also show that our model trained on Twitter can be applied to movie dialogues without any additional training. We provide our code and the learned parameters so that they can be used for automatic evaluation of dialogue response generation models.

5. Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue System
Jianhong Wang, Yuan Zhang, Tae-Kyun Kim, Yunjie Gu
Abstract: Designing task-oriented dialogue systems is a challenging research topic, since it needs not only to generate utterances fulfilling user requests but also to guarantee their comprehensibility. Many previous works trained end-to-end (E2E) models with supervised learning (SL); however, the bias in annotated system utterances remains a bottleneck. Reinforcement learning (RL) addresses the problem by using non-differentiable evaluation metrics (e.g., the success rate) as rewards. Nonetheless, existing works with RL showed that the comprehensibility of generated system utterances could be corrupted when improving the performance on fulfilling user requests. In our work, we (1) propose modelling the hierarchical structure between dialogue policy and natural language generator (NLG) with the option framework, called HDNO; (2) train HDNO with hierarchical reinforcement learning (HRL), and suggest alternating updates between dialogue policy and NLG during HRL, inspired by fictitious play, to preserve the comprehensibility of generated system utterances while improving the fulfilment of user requests; and (3) propose using a discriminator modelled with language models as an additional reward to further improve the comprehensibility. We test HDNO on MultiWoz 2.0 and MultiWoz 2.1, datasets of multi-domain dialogues, in comparison with a word-level E2E model trained with RL, LaRL and HDSA, showing a significant improvement in the total performance evaluated with automatic metrics.

6. Comparing Natural Language Processing Techniques for Alzheimer's Dementia Prediction in Spontaneous Speech
Thomas Searle, Zina Ibrahim, Richard Dobson
Abstract: Alzheimer's Dementia (AD) is an incurable, debilitating, and progressive neurodegenerative condition that affects cognitive function. Early diagnosis is important as therapeutics can delay progression and give those diagnosed vital time. Developing models that analyse spontaneous speech could eventually provide an efficient diagnostic modality for earlier diagnosis of AD. The Alzheimer's Dementia Recognition through Spontaneous Speech task offers acoustically pre-processed and balanced datasets for the classification and prediction of AD and associated phenotypes through the modelling of spontaneous speech. We exclusively analyse the supplied textual transcripts of the spontaneous speech dataset, building and comparing performance across numerous models for the classification of AD vs controls and the prediction of Mini-Mental State Exam scores. We rigorously train and evaluate Support Vector Machines (SVMs), Gradient Boosting Decision Trees (GBDT), and Conditional Random Fields (CRFs) alongside deep learning Transformer-based models. We find our top performing models to be a simple Term Frequency-Inverse Document Frequency (TF-IDF) vectoriser as input into an SVM model and a pre-trained Transformer-based model, 'DistilBERT', when used as an embedding layer into simple linear models. We demonstrate test set scores of 0.81-0.82 across classification metrics and an RMSE of 4.58.

7. Dutch General Public Reaction on Governmental COVID-19 Measures and Announcements in Twitter Data
Shihan Wang, Marijn Schraagen, Erik Tjong Kim Sang, Mehdi Dastani
Abstract: Public sentiment (the opinion, attitude or feeling that the public expresses) is a factor of interest for government, as it directly influences the implementation of policies. Given the unprecedented nature of the COVID-19 crisis, having an up-to-date representation of public sentiment on governmental measures and announcements is crucial. While the staying-at-home policy makes face-to-face interactions and interviews challenging, analysing real-time Twitter data that reflects public opinion toward policy measures is a cost-effective way to access public sentiment. In this paper, we collect streaming data using the Twitter API starting from the COVID-19 outbreak in the Netherlands in February 2020, and track Dutch general public reactions on governmental measures and announcements. We provide temporal analysis of tweet frequency and public sentiment over the past four months. We also identify public attitudes towards the Dutch policy on wearing face masks in a case study. By presenting those preliminary results, we aim to provide visibility into the social media discussions around COVID-19 to the general public, scientists and policy makers. The data collection and analysis will be updated and expanded over time.

8. Sparse and Continuous Attention Mechanisms
André F. T. Martins, Marcos Treviso, António Farinhas, Vlad Niculae, Mário A. T. Figueiredo, Pedro M. Q. Aguiar
Abstract: Exponential families are widely used in machine learning; they include many distributions in continuous and discrete domains (e.g., Gaussian, Dirichlet, Poisson, and categorical distributions via the softmax transformation). Distributions in each of these families have fixed support. In contrast, for finite domains, there has been recent work on sparse alternatives to softmax (e.g. sparsemax and alpha-entmax), which have varying support, being able to assign zero probability to irrelevant categories. This paper expands that work in two directions: first, we extend alpha-entmax to continuous domains, revealing a link with Tsallis statistics and deformed exponential families. Second, we introduce continuous-domain attention mechanisms, deriving efficient gradient backpropagation algorithms for alpha in {1,2}. Experiments on attention-based text classification, machine translation, and visual question answering illustrate the use of continuous attention in 1D and 2D, showing that it allows attending to time intervals and compact regions.
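The contrast drawn in the abstract of entry 8 between softmax (fixed, full support) and sparsemax (varying support) can be illustrated in the finite-domain case. The following is a minimal sketch in plain Python, not the authors' code; the function names and the example scores are illustrative assumptions, and the continuous-domain extension described in the paper is not covered here.

```python
import math


def softmax(scores):
    """Dense mapping: every category receives strictly positive probability."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(scores):
    """Euclidean projection of the scores onto the probability simplex.

    Low-scoring categories can receive exactly zero probability.
    """
    ordered = sorted(scores, reverse=True)
    cumulative = 0.0
    tau = 0.0
    for k, z in enumerate(ordered, start=1):
        cumulative += z
        # The k-th largest score stays in the support while this holds.
        if 1.0 + k * z > cumulative:
            tau = (cumulative - 1.0) / k
    return [max(s - tau, 0.0) for s in scores]


if __name__ == "__main__":
    example = [1.0, 0.5, -2.0]  # hypothetical attention scores
    print("softmax  :", softmax(example))
    print("sparsemax:", sparsemax(example))
```

With these example scores, sparsemax assigns zero probability to the third category, while softmax keeps all three strictly positive.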

9. NAS-Bench-NLP: Neural Architecture Search Benchmark for Natural Language Processing
Nikita Klyuchnikov, Ilya Trofimov, Ekaterina Artemova, Mikhail Salnikov, Maxim Fedorov, Evgeny Burnaev
Abstract: Neural Architecture Search (NAS) is a promising and rapidly evolving research area. Training a large number of neural networks requires an exceptional amount of computational power, which makes NAS unreachable for those researchers who have limited or no access to high-performance clusters and supercomputers. A few benchmarks with precomputed neural architecture performance have recently been introduced to overcome this problem and ensure more reproducible experiments. However, these benchmarks are only for the computer vision domain and, thus, are built from image datasets and convolution-derived architectures. In this work, we step outside the computer vision domain by leveraging the language modeling task, which is the core of natural language processing (NLP). Our main contributions are as follows: we have provided a search space of recurrent neural networks on text datasets and trained 14k architectures within it; we have conducted both intrinsic and extrinsic evaluation of the trained models using datasets for semantic relatedness and language understanding evaluation; finally, we have tested several NAS algorithms to demonstrate how the precomputed results can be utilized. We believe that our results have high potential for use by both the NAS and NLP communities.

10. Language-Conditioned Goal Generation: a New Approach to Language Grounding for RL
Cédric Colas, Ahmed Akakzia, Pierre-Yves Oudeyer, Mohamed Chetouani, Olivier Sigaud
Abstract: In the real world, linguistic agents are also embodied agents: they perceive and act in the physical world. The notion of Language Grounding questions the interactions between language and embodiment: how do learning agents connect or ground linguistic representations to the physical world? This question has recently been approached by the Reinforcement Learning community under the framework of instruction-following agents. In these agents, behavioral policies or reward functions are conditioned on the embedding of an instruction expressed in natural language. This paper proposes another approach: using language to condition goal generators. Given any goal-conditioned policy, one could train a language-conditioned goal generator to generate language-agnostic goals for the agent. This method makes it possible to decouple sensorimotor learning from language acquisition and enables agents to demonstrate a diversity of behaviors for any given instruction. We propose a particular instantiation of this approach and demonstrate its benefits.

11. Learning Effective Representations for Person-Job Fit by Feature Fusion
Junshu Jiang, Songyun Ye, Wei Wang, Jingran Xu, Xiaosheng Luo
Abstract: Person-job fit is the task of matching candidates and job posts on online recruitment platforms using machine learning algorithms. The effectiveness of matching algorithms heavily depends on the learned representations for the candidates and job posts. In this paper, we propose to learn comprehensive and effective representations of the candidates and job posts via feature fusion. First, in addition to applying deep learning models for processing the free text in resumes and job posts, which is adopted by existing methods, we extract semantic entities from the whole resume (and job post) and then learn features for them. By fusing the features from the free text and the entities, we get a comprehensive representation for the information explicitly stated in the resume and job post. Second, however, some information of a candidate or a job may not be explicitly captured in the resume or job post. Nonetheless, the historical applications, including accepted and rejected cases, can reveal some implicit intentions of the candidates or recruiters. Therefore, we propose to learn the representations of implicit intentions by processing the historical applications using LSTM. Last, by fusing the representations for the explicit and implicit intentions, we get a more comprehensive and effective representation for person-job fit. Experiments on 10 months of real data show that our solution outperforms existing methods by a large margin. Ablation studies confirm the contribution of each component of the fused representation. The extracted semantic entities help interpret the matching results during the case study.

12. Improving GAN Training with Probability Ratio Clipping and Sample Reweighting
Yue Wu, Pan Zhou, Andrew Gordon Wilson, Eric P. Xing, Zhiting Hu
Abstract: Despite success on a wide range of problems related to vision, generative adversarial networks (GANs) can suffer from inferior performance due to unstable training, especially for text generation. We propose a new variational GAN training framework which enjoys superior training stability. Our approach is inspired by a connection of GANs and reinforcement learning under a variational perspective. The connection leads to (1) probability ratio clipping that regularizes generator training to prevent excessively large updates, and (2) a sample re-weighting mechanism that stabilizes discriminator training by downplaying bad-quality fake samples. We provide theoretical analysis on the convergence of our approach. By plugging the training approach into diverse state-of-the-art GAN architectures, we obtain significantly improved performance over a range of tasks, including text generation, text style transfer, and image generation.
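The abstract of entry 12 does not give the exact form of its probability ratio clipping, so the sketch below only illustrates the general idea using the PPO-style clipped surrogate from reinforcement learning, which is an assumption on our part and not the paper's objective. The function name, the `epsilon` default, and the scalar inputs are all hypothetical.

```python
def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """PPO-style clipped objective for a single sample (assumed form).

    ratio:     new-policy probability divided by old-policy probability
    advantage: how much better than expected the sample was
    epsilon:   half-width of the trust interval around ratio == 1
    """
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Taking the minimum removes any incentive to push the ratio
    # far outside [1 - epsilon, 1 + epsilon] in a single update.
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # A large ratio with positive advantage is capped at (1 + epsilon) * advantage.
    print(clipped_surrogate(ratio=1.5, advantage=1.0))
    # A ratio inside the interval is left unchanged.
    print(clipped_surrogate(ratio=1.1, advantage=1.0))
```

The design point is that the objective stops rewarding a sample once its ratio leaves the interval, which is one way to prevent the excessively large updates the abstract mentions.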

13. High-Precision Extraction of Emerging Concepts from Scientific Literature
Daniel King, Doug Downey, Daniel S. Weld
Abstract: Identification of new concepts in scientific literature can help power faceted search, scientific trend analysis, knowledge-base construction, and more, but current methods are lacking. Manual identification cannot keep up with the torrent of new publications, while the precision of existing automatic techniques is too low for many applications. We present an unsupervised concept extraction method for scientific literature that achieves much higher precision than previous work. Our approach relies on a simple but novel intuition: each scientific concept is likely to be introduced or popularized by a single paper that is disproportionately cited by subsequent papers mentioning the concept. From a corpus of computer science papers on arXiv, we find that our method achieves a Precision@1000 of 99%, compared to 86% for prior work, and a substantially better precision-yield trade-off across the top 15,000 extractions. To stimulate research in this area, we release our code and data (this https URL).
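Entry 13 reports its headline result as Precision@1000. For readers unfamiliar with the metric, a minimal sketch of Precision@k follows; the function name and the toy items are illustrative assumptions and have no connection to the paper's data or code.

```python
def precision_at_k(ranked_items, relevant, k):
    """Fraction of the top-k ranked items that are judged correct.

    ranked_items: extractions ordered from most to least confident
    relevant:     set of items a human judge marked as correct
    k:            cutoff rank (the paper reports k = 1000)
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k


if __name__ == "__main__":
    ranked = ["a", "b", "c", "d"]  # toy ranking, hypothetical
    correct = {"a", "c"}           # toy judgments, hypothetical
    print(precision_at_k(ranked, correct, k=2))
```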

14. FastPitch: Parallel Text-to-speech with Pitch Prediction
Adrian Łańcucki
Abstract: We present FastPitch, a fully-parallel text-to-speech model based on FastSpeech, conditioned on fundamental frequency contours. The model predicts pitch contours during inference, and generates speech that could be further controlled with predicted contours. FastPitch can thus change the perceived emotional state of the speaker or put emphasis on certain lexical units. We find that uniformly increasing or decreasing the pitch with FastPitch generates speech that resembles the voluntary modulation of voice. Conditioning on frequency contours improves the quality of synthesized speech, making it comparable to state-of-the-art. It does not introduce an overhead, and FastPitch retains the favorable, fully-parallel Transformer architecture of FastSpeech with a similar speed of mel-scale spectrogram synthesis, orders of magnitude faster than real-time.

Note: the Chinese summaries in the source listing are machine translations.


[arXiv papers] Computation and Language 2020-06-15
