spaCy

spaCy is an open-source software library for advanced natural language processing, written in the programming languages Python and Cython. The library is published under the MIT license and its main developers are Matthew Honnibal and Ines Montani, the founders of the software company Explosion.

History
• Version 1.0 was released on October 19, 2016, and included preliminary support for deep learning workflows by supporting custom processing pipelines. It further included a rule matcher that supported entity annotations, and an officially documented training API.
• Version 2.0 was released on November 7, 2017, and introduced convolutional neural network models for 7 different languages. It also supported custom processing pipeline components and extension attributes, and featured a built-in trainable text classification component.
• Version 3.0 was released on February 1, 2021, and introduced state-of-the-art transformer-based pipelines. It also introduced a new configuration system and training workflow, as well as type hints and project templates. This version dropped support for Python 2.
Main features
• Non-destructive tokenization
• "Alpha tokenization" support for over 65 languages
• Built-in support for trainable pipeline components such as named entity recognition, part-of-speech tagging, dependency parsing, text classification, entity linking and more
• Statistical models for 19 languages
• Multi-task learning with pretrained transformers like BERT
• Support for custom models in PyTorch, TensorFlow and other frameworks
• State-of-the-art speed and accuracy
• Production-ready training system
• Built-in visualizers for syntax and named entities
• Easy model packaging, deployment and workflow management
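To illustrate what "non-destructive tokenization" means in practice, the following is a minimal sketch, assuming spaCy is installed. It uses a blank English pipeline (tokenizer only, no trained model), and the sample sentence and the helper name `rebuild_text` are illustrative assumptions rather than part of the library.

```python
import spacy

# A blank pipeline contains only the language's tokenizer rules,
# so no trained model has to be downloaded for this sketch.
nlp = spacy.blank("en")

# Illustrative input with irregular spacing (an assumption for this example).
text = "spaCy doesn't  discard whitespace, does it?"
doc = nlp(text)


def rebuild_text(doc):
    """Hypothetical helper: reassemble the input from tokens and their trailing whitespace."""
    return "".join(token.text_with_ws for token in doc)


# Each token records its character offset (token.idx) into the original string.
for token in doc:
    print(token.idx, repr(token.text))

# Non-destructive: the original string can be recovered exactly from the tokens.
print(rebuild_text(doc) == text)
```

Because each token keeps its character offset and trailing whitespace, annotations produced later in a pipeline can be mapped back onto the original string.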
Extensions and visualizers
[Figure: visualization generated with the displaCy visualizer]

spaCy comes with several extensions and visualizations that are available as free, open-source libraries:
• Thinc: A machine learning library optimized for CPU usage and deep learning with text input.
• sense2vec: A library for computing word similarities, based on Word2vec.
• displaCy: An open-source dependency parse tree visualizer built with JavaScript, CSS and SVG.
• displaCy ENT: An open-source named entity visualizer built with JavaScript and CSS.
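As a rough sketch of how the named entity visualizer can be driven from Python, the example below renders entity markup with the `displacy` module that ships with spaCy. It assumes spaCy is installed. The entity spans are set by hand on a blank pipeline purely for illustration, whereas a trained pipeline would normally predict them. The sample sentence and the output filename are assumptions.

```python
import spacy
from spacy import displacy
from spacy.tokens import Span

# Blank pipeline: tokenizer only, so the entities below are assigned manually.
nlp = spacy.blank("en")
doc = nlp("Matthew Honnibal and Ines Montani founded Explosion.")

# Hand-labelled spans (token indices are specific to this sentence; illustrative only).
doc.ents = [
    Span(doc, 0, 2, label="PERSON"),
    Span(doc, 3, 5, label="PERSON"),
    Span(doc, 6, 7, label="ORG"),
]

# page=False returns just the markup fragment; jupyter=False forces a string return
# instead of inline notebook rendering.
html = displacy.render(doc, style="ent", page=False, jupyter=False)

# Hypothetical output path for viewing the result in a browser.
with open("entities.html", "w", encoding="utf-8") as f:
    f.write(html)
```

Passing `style="dep"` instead requests the dependency parse view, which needs a pipeline that includes a parser.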