Models

As of v1.7.0, models for spaCy can be installed as Python packages. This means that they're a component of your application, just like any other module. They're versioned and can be defined as a dependency in your requirements.txt. Models can be installed from a download URL or a local directory, manually or via pip. Their data can be located anywhere on your file system.

Quickstart

Install a default model, get the code to load it from within spaCy and an example to test it. For more options, see the section on available models below.

English

python -m spacy download en

import en_core_web_sm
nlp = en_core_web_sm.load()
# or, via the shortcut link: import spacy; nlp = spacy.load('en')
doc = nlp(u"This is a sentence.")
print([(w.text, w.pos_) for w in doc])

German

python -m spacy download de

import de_core_news_md
nlp = de_core_news_md.load()
# or, via the shortcut link: import spacy; nlp = spacy.load('de')
doc = nlp(u"Dies ist ein Satz.")
print([(w.text, w.pos_) for w in doc])

French

python -m spacy download fr

import fr_depvec_web_lg
nlp = fr_depvec_web_lg.load()
# or, via the shortcut link: import spacy; nlp = spacy.load('fr')
doc = nlp(u"C'est une phrase.")
print([(w.text, w.pos_) for w in doc])

Spanish

python -m spacy download es

import es_core_web_md
nlp = es_core_web_md.load()
# or, via the shortcut link: import spacy; nlp = spacy.load('es')
doc = nlp(u"Esto es una frase.")
print([(w.text, w.pos_) for w in doc])

Available models

Model differences are mostly statistical. In general, we do expect larger models to be "better" and more accurate overall. Ultimately, it depends on your use case and requirements, and we recommend starting with the default models (marked with a star below).

Name                | Language | Voc | Dep | Ent | Vec | Size    | License
en_core_web_sm *    | English  |     |     |     |     | 50 MB   | CC BY-SA
en_core_web_md      | English  |     |     |     |     | 1 GB    | CC BY-SA
en_depent_web_md    | English  |     |     |     |     | 328 MB  | CC BY-SA
en_vectors_glove_md | English  |     |     |     |     | 727 MB  | CC BY-SA
de_core_news_md *   | German   |     |     |     |     | 645 MB  | CC BY-SA
fr_depvec_web_lg *  | French   |     |     |     |     | 1.33 GB | CC BY-NC
es_core_web_md *    | Spanish  |     |     |     |     | 377 MB  | CC BY-SA

Downloading models

The easiest way to download a model is via spaCy's download command. It takes care of finding the best-matching model compatible with your spaCy installation.

# out-of-the-box: download best-matching default model
python -m spacy download en
python -m spacy download de
python -m spacy download fr
python -m spacy download es

# download best-matching version of specific model for your spaCy installation
python -m spacy download en_core_web_md

# download exact model version (doesn't create shortcut link)
python -m spacy download en_core_web_md-1.2.0 --direct

The download command will install the model via pip, place the package in your site-packages directory and create a shortcut link that lets you load the model by a custom name. The shortcut link will be the same as the name you passed to the download command, for example en.

# in your shell
pip install spacy
python -m spacy download en

# in Python
import spacy
nlp = spacy.load('en')
doc = nlp(u'This is a sentence.')

Installation via pip

To download a model directly using pip, simply point pip install to the URL or local path of the archive file. To find the direct link to a model, head over to the model releases, right click on the archive link and copy it to your clipboard.

# with external URL
pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_md-1.2.0/en_core_web_md-1.2.0.tar.gz

# with local file
pip install /Users/you/en_core_web_md-1.2.0.tar.gz

By default, this will install the model into your site-packages directory. You can then use spacy.load() to load it via its package name, create a shortcut link to assign it a custom name, or import it explicitly as a module. If you need to download models as part of an automated process, we recommend using pip with a direct link, instead of relying on spaCy's download command.
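
For an automated process, the pip command can be assembled from a model name and version. The following is a minimal sketch, assuming the release URL pattern of the example above also holds for other models and versions. The helpers model_url and pip_install_command are hypothetical names and not part of spaCy.

```python
import sys

# Assumption: every release follows the URL pattern of the example above.
BASE_URL = "https://github.com/explosion/spacy-models/releases/download"


def model_url(name, version):
    """Build the archive URL for a model release (hypothetical helper)."""
    release = "{}-{}".format(name, version)
    return "{}/{}/{}.tar.gz".format(BASE_URL, release, release)


def pip_install_command(name, version):
    """Return the pip command as an argument list for the current interpreter."""
    return [sys.executable, "-m", "pip", "install", model_url(name, version)]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(pip_install_command("en_core_web_md", "1.2.0")))
```

Passing the returned list to subprocess.check_call would run the installation in the same environment as the calling script.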

Manual download and installation

In some cases, you might prefer downloading the data manually, for example to place it into a custom directory. You can download the model via your browser from the latest releases, or configure your own download script using the URL of the archive file. The archive consists of a model directory that contains another directory with the model data.

Directory structure

└── en_core_web_md-1.2.0.tar.gz       # downloaded archive
    ├── meta.json                     # model meta data
    ├── setup.py                      # setup file for pip installation
    └── en_core_web_md                # 📦 model package
        ├── __init__.py               # init for pip installation
        ├── meta.json                 # model meta data
        └── en_core_web_md-1.2.0      # model data

You can place the model package directory anywhere on your local file system. To use it with spaCy, simply assign it a name by creating a shortcut link for the data directory.
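
Before linking a manually downloaded package, it can help to check that the directory matches the layout shown above. This is a minimal sketch using only the standard library, assuming the package directory holds a meta.json file next to exactly one data directory. The helper describe_model_package is a hypothetical name, not a spaCy function.

```python
import os


def describe_model_package(package_dir):
    """Return (meta_path, data_dir) for a model package directory.

    Hypothetical helper: assumes the layout shown above, i.e. a meta.json
    file next to a single data directory such as en_core_web_md-1.2.0.
    """
    meta_path = os.path.join(package_dir, "meta.json")
    if not os.path.isfile(meta_path):
        raise ValueError("no meta.json found in {}".format(package_dir))
    # Ignore hidden and private directories such as __pycache__.
    subdirs = sorted(
        entry for entry in os.listdir(package_dir)
        if os.path.isdir(os.path.join(package_dir, entry))
        and not entry.startswith(("_", "."))
    )
    if len(subdirs) != 1:
        raise ValueError("expected one data directory, found {}".format(subdirs))
    return meta_path, os.path.join(package_dir, subdirs[0])
```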

Using models with spaCy

To load a model, use spacy.load() with the model's shortcut link, package name or a path to the data directory:

import spacy
nlp = spacy.load('en')                       # load model with shortcut link "en"
nlp = spacy.load('en_core_web_sm')           # load model package "en_core_web_sm"
nlp = spacy.load('/path/to/en_core_web_sm')  # load package from a directory

doc = nlp(u'This is a sentence.')

Using custom shortcut links

While previous versions of spaCy required you to maintain a data directory containing the models for each installation, you can now choose how and where you want to keep your data. For example, you could download all models manually and put them into a local directory. Whenever one of your spaCy projects needs a model, you create a shortcut link to tell spaCy to load it from there. This means you'll never end up with duplicate data.

The link command will create a symlink in the spacy/data directory.

python -m spacy link [package name or path] [shortcut] [--force]

The first argument is the package name (if the model was installed via pip), or a local path to the model package. The second argument is the internal name you want to use for the model. Setting the --force flag will overwrite any existing links.

Examples

# set up shortcut link to load installed package as "en_default"
python -m spacy link en_core_web_md en_default

# set up shortcut link to load local model as "my_amazing_model"
python -m spacy link /Users/you/model my_amazing_model
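
Conceptually, a shortcut link is a named symlink that points at the model. The sketch below illustrates that idea with the standard library only. It is not spaCy's implementation: create_shortcut and its data_dir argument are hypothetical names, and the code assumes a platform where os.symlink is permitted.

```python
import os


def create_shortcut(model_path, shortcut, data_dir, force=False):
    """Create data_dir/shortcut as a symlink to model_path (illustrative only)."""
    link_path = os.path.join(data_dir, shortcut)
    if os.path.lexists(link_path):
        if not force:
            raise OSError("link {!r} already exists; pass force=True".format(shortcut))
        # Mirrors the documented effect of --force: replace the existing link.
        os.unlink(link_path)
    os.symlink(os.path.abspath(model_path), link_path)
    return link_path
```

Because only the link is created or replaced, the model data itself stays in one place on disk, which is what avoids duplicate copies.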

Importing models as modules

If you've installed a model via spaCy's downloader, or directly via pip, you can also import it and then call its load() method with no arguments:

import en_core_web_md

nlp = en_core_web_md.load()
doc = nlp(u'This is a sentence.')

How you choose to load your models ultimately depends on personal preference. However, for larger code bases, we usually recommend native imports, as this will make it easier to integrate models with your existing build process, continuous integration workflow and testing framework. It'll also prevent you from ever trying to load a model that is not installed, as your code will raise an ImportError immediately, instead of failing somewhere down the line when calling spacy.load().
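
The fail-fast behavior can also be kept when the package name comes from configuration rather than a hard-coded import. This is a minimal sketch: load_model_package is a hypothetical wrapper, and the only assumption about the model package is that it exposes a load() function, as the packages described in this section do.

```python
import importlib


def load_model_package(package_name):
    """Import an installed model package and return the result of its load().

    Hypothetical wrapper: raises ImportError right away if the package
    is not installed, instead of failing later during processing.
    """
    try:
        module = importlib.import_module(package_name)
    except ImportError as err:
        raise ImportError(
            "model package {!r} is not installed ({}); "
            "install it via pip or the download command".format(package_name, err)
        )
    return module.load()
```

With a model installed, nlp = load_model_package('en_core_web_md') would behave like the explicit import shown above.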

Using your own models

If you've trained your own model, for example for additional languages or custom named entities, you can save its state using the Language.to_disk() method. To make the model more convenient to deploy, we recommend wrapping it as a Python package.