Building an NLP Pipeline with spaCy for Token Classification
Kickstart your journey into token classification by setting up an efficient NLP pipeline, learning about tokenization, POS tagging, and lemmatization with spaCy.
Lessons and practices
Counting Unique Categories in the Reuters Dataset
Exploring the 'Tea' Category in the Reuters Corpus
Fetching Text and Categories for 'Coffee' in the Reuters Corpus
Exploring the 'Gas' Category in the Reuters Corpus
Exploring the Reuters Corpus by Category
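The corpus-exploration practices above all revolve around reading the Reuters dataset by category. A minimal sketch of what they involve, assuming NLTK's copy of the Reuters corpus (the course environment may load the data differently):

```python
# Exploring the Reuters corpus with NLTK's reader.
import nltk
from nltk.corpus import reuters

nltk.download("reuters", quiet=True)  # fetch the corpus on first run

# Counting unique categories in the Reuters dataset
categories = reuters.categories()
print(len(categories))  # the ApteMod split has 90 topic categories

# Exploring a single category, e.g. 'tea': list its document ids
tea_ids = reuters.fileids(categories="tea")
print(len(tea_ids), tea_ids[:3])

# Fetching text and categories for a 'coffee' document
doc_id = reuters.fileids(categories="coffee")[0]
print(reuters.categories(doc_id))  # all categories for this document
print(reuters.raw(doc_id)[:200])   # first 200 characters of the text
```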
Changing the String for Tokenization
Tokenizing Sentences with Missing Code
Tokenizing the First Reuters Document with spaCy
Counting Unique Tokens in a Document
Tokenizing Multiple Reuters Documents with spaCy
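The tokenization practices share one pattern: pass raw text through a loaded spaCy pipeline and work with the resulting Doc. A sketch, assuming the en_core_web_sm model is installed (python -m spacy download en_core_web_sm):

```python
# Tokenizing Reuters documents with spaCy.
import spacy
from nltk.corpus import reuters

nlp = spacy.load("en_core_web_sm")

# Tokenizing the first Reuters document
first_id = reuters.fileids()[0]
doc = nlp(reuters.raw(first_id))
tokens = [token.text for token in doc]
print(tokens[:10])

# Counting unique tokens in the document
print(len(set(tokens)))

# Tokenizing multiple documents: nlp.pipe streams texts efficiently
texts = (reuters.raw(fid) for fid in reuters.fileids()[:5])
for doc in nlp.pipe(texts):
    print(len(doc))  # token count per document
```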
Filtering Non-Alphabetic Stop-Word Tokens
Identifying Out-of-Vocabulary and Digit Tokens
Counting Stop-Word Tokens
Identifying Token Capitalization in Text
Filtering Tokens Using a Simple Pipeline
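These filtering practices rely on spaCy's per-token boolean attributes (is_alpha, is_stop, is_oov, is_digit, is_title, and friends). A sketch on a made-up sentence:

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The U.S. exported 1000 tonnes of coffee in 1987.")

# Filtering non-alphabetic and stop-word tokens out of the text
content = [t.text for t in doc if t.is_alpha and not t.is_stop]

# Counting stop-word tokens
n_stops = sum(t.is_stop for t in doc)

# Identifying out-of-vocabulary and digit tokens
# (note: small models ship without word vectors, so is_oov is True
# for most tokens; en_core_web_md gives a more meaningful signal)
oov = [t.text for t in doc if t.is_oov]
digits = [t.text for t in doc if t.is_digit]

# Identifying capitalization: title-case or all-uppercase tokens
caps = [t.text for t in doc if t.is_title or t.is_upper]

print(content, n_stops, digits, caps)
```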
Changing the Sentence for Lemmatization
Lemmatizing the Reuters Dataset with spaCy
Lemmatization on the Reuters Dataset with spaCy
Integrating Lemmatization into Text Processing Pipeline
Lemmatization with spaCy on the Reuters Dataset
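Each lemmatization practice comes down to reading token.lemma_ from a processed Doc; the pipeline variant simply combines it with the filters above. A sketch over the first Reuters document, under the same model assumptions as before:

```python
import spacy
from nltk.corpus import reuters

nlp = spacy.load("en_core_web_sm")
doc = nlp(reuters.raw(reuters.fileids()[0]))

# Pairing each token with its lemma, skipping punctuation and whitespace
lemmas = [(t.text, t.lemma_) for t in doc if not (t.is_punct or t.is_space)]
print(lemmas[:10])

# Integrating lemmatization into a simple text-processing pipeline:
# keep alphabetic, non-stop tokens and lower-case their lemmas
cleaned = [t.lemma_.lower() for t in doc if t.is_alpha and not t.is_stop]
print(cleaned[:10])
```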
Refining the Output Format of POS Tagging
POS Tagging on a Real-World Text Document
Analyzing Verb Usage in Reuters News
Frequency Analysis on Adjectives Using POS Tagging
Exploring Word Usage with POS Tagging
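The POS-tagging practices read token.pos_ (the coarse universal tag) and token.tag_ (the fine-grained tag) and aggregate them. A sketch, again over a Reuters document:

```python
import spacy
from collections import Counter
from nltk.corpus import reuters

nlp = spacy.load("en_core_web_sm")
doc = nlp(reuters.raw(reuters.fileids()[0]))

# Refining the output format: align token, coarse tag, and fine tag
for token in doc[:10]:
    print(f"{token.text:<15} {token.pos_:<6} {token.tag_}")

# Analyzing verb usage: frequency of verb lemmas
verbs = Counter(t.lemma_ for t in doc if t.pos_ == "VERB")
print(verbs.most_common(5))

# Frequency analysis on adjectives
adjectives = Counter(t.lemma_ for t in doc if t.pos_ == "ADJ")
print(adjectives.most_common(5))
```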