English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter, ner, attribute_ruler, lemmatizer.
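import spacy
from spacy.lang.en.examples import sentences

# Load the small English pipeline (install it first: python -m spacy download en_core_web_sm)
nlp = spacy.load("en_core_web_sm")
doc = nlp(sentences[0])
# Print each token with its part-of-speech tag and dependency label
for token in doc:
    print(token.text, token.pos_, token.dep_)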
‘en’ is the language code: the pipeline is built for English text.
‘core’ is the pipeline type: a general-purpose pipeline that covers the core NLP tasks, such as PoS tagging, dependency parsing, lemmatization and named entity recognition.
‘web’ is the genre of text the pipeline was trained on: written web content such as blogs, news and comments.
‘sm’ is the size: small pipelines load and run faster but are comparatively less accurate. As an alternative to ‘sm’, you can use ‘md’ or ‘lg’, which are larger and generally more accurate.
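To make the naming scheme concrete, here is a minimal sketch that splits a pipeline name into the four parts described above. It uses only the standard library; `PipelineName` and `parse_pipeline_name` are hypothetical helpers written for this illustration, not part of spaCy, and the sketch assumes the name has exactly four underscore-separated parts.

```python
from typing import NamedTuple


class PipelineName(NamedTuple):
    """The four parts of a pipeline name (hypothetical helper type)."""
    lang: str   # language code, e.g. "en"
    kind: str   # pipeline type, e.g. "core"
    genre: str  # genre of the training text, e.g. "web"
    size: str   # size indicator, e.g. "sm", "md" or "lg"


def parse_pipeline_name(name: str) -> PipelineName:
    """Split a name such as 'en_core_web_sm' into its four parts.

    Assumption: the name consists of exactly four parts joined by underscores.
    """
    parts = name.split("_")
    if len(parts) != 4:
        raise ValueError(f"expected 4 underscore-separated parts, got {len(parts)}: {name!r}")
    return PipelineName(*parts)


info = parse_pipeline_name("en_core_web_sm")
print(info.lang, info.kind, info.genre, info.size)
```

Switching to a larger pipeline only changes the last part of the name, for example "en_core_web_md" or "en_core_web_lg" in the call to spacy.load.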