Biomedical and Clinical English Model Packages in the Stanza Python NLP Library

Paper Title

Biomedical and Clinical English Model Packages in the Stanza Python NLP Library

Authors

Yuhao Zhang, Yuhui Zhang, Peng Qi, Christopher D. Manning, Curtis P. Langlotz

Abstract

We introduce biomedical and clinical English model packages for the Stanza Python NLP library. These packages offer accurate syntactic analysis and named entity recognition capabilities for biomedical and clinical text, by combining Stanza's fully neural architecture with a wide variety of open datasets as well as large-scale unsupervised biomedical and clinical text data. We show via extensive experiments that our packages achieve syntactic analysis and named entity recognition performance that is on par with or surpasses state-of-the-art results. We further show that these models do not compromise speed compared to existing toolkits when GPU acceleration is available, and are made easy to download and use with Stanza's Python interface. A demonstration of our packages is available at: http://stanza.run/bio.
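The abstract notes that the packages can be downloaded and used through Stanza's Python interface. A minimal sketch of what that might look like follows. The package name `mimic` and the NER model name `i2b2` are assumptions drawn from Stanza's public model listing, not from the abstract, and the helper function names and the sample sentence are hypothetical.

```python
def biomedical_pipeline_kwargs(package="mimic", ner_model="i2b2"):
    """Build keyword arguments shared by stanza.download and stanza.Pipeline.

    The defaults ("mimic" for the clinical package, "i2b2" for the NER model)
    are assumed names; substitute whichever package and model you need.
    """
    return {
        "lang": "en",
        "package": package,
        "processors": {"ner": ner_model},
    }


def build_pipeline(**overrides):
    """Download the models (network access required) and return a pipeline."""
    # Imported lazily so the helper above can be used without Stanza installed.
    import stanza

    kwargs = biomedical_pipeline_kwargs(**overrides)
    stanza.download(**kwargs)
    return stanza.Pipeline(**kwargs)


if __name__ == "__main__":
    nlp = build_pipeline()
    # Hypothetical clinical sentence, for illustration only.
    doc = nlp("The patient was started on aspirin for chest pain.")
    for ent in doc.entities:
        print(f"{ent.text}\t{ent.type}")
```

Running the script downloads the models on first use, so it needs network access and an installed `stanza` package. A GPU is optional, though the abstract's speed comparison assumes one.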
