[Pycon] [new paper] "Navid Nobani" - A zero to Hero training on Text Mining and NLP

info a pycon.it
Fri 14 Dec 2018 10:05:00 CET


Title: A zero to Hero training on Text Mining and NLP
Duration: 240 (includes Q&A)
Q&A Session: 0
Language: en
Type: Talk

Abstract: **Objectives**: Assuming you have little or no knowledge of text mining, this hands-on training will show you how to clean your data of common noise, explore its structure, extract information such as sentiment from it, and finally create a working language model from it.
What you’ll learn:
**Data Preparation and Cleaning**
o	Segmentation: splitting a body of text into sentences.
o	Tokenization: splitting sentences into tokens.
o	Stop-word, number and punctuation removal: removing tokens that have no impact, or a negative impact, on performance.
o	Stemming: reducing tokens to their stems by stripping affixes.
o	Lowercase conversion
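
The preparation steps above can be sketched in plain Python. This is only a minimal illustration, not the training's own material: the stop-word list, the suffix rules, and all function names (`segment`, `tokenize`, `remove_noise`, `naive_stem`, `preprocess`) are assumptions made for this example, and a real pipeline would normally rely on a dedicated NLP library.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "of", "to", "in", "it"}


def segment(text):
    """Split raw text into sentences on '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Lowercase a sentence and keep only alphanumeric tokens (punctuation is dropped)."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def remove_noise(tokens):
    """Drop stop-words and purely numeric tokens."""
    return [t for t in tokens if t not in STOP_WORDS and not t.isdigit()]


def naive_stem(token):
    """Strip a few common English suffixes; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run segmentation, tokenization, noise removal and stemming in order."""
    return [
        [naive_stem(t) for t in remove_noise(tokenize(sentence))]
        for sentence in segment(text)
    ]


if __name__ == "__main__":
    print(preprocess("The cats are sleeping. It costs 20 dollars!"))
```

Each sentence becomes a list of cleaned tokens, which is the shape the later steps (term-document matrix, Word2Vec) typically expect.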
**POS tagging**: Identifying the part of speech of each token.
**NER tagging**: Identifying named entities like locations, persons, etc. in the text.
**Term-Document matrix**: creating a numerical representation from textual data
o	BOW: Bag-of-words model
o	TF-IDF: Term frequency – Inverse document frequency
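
As a rough sketch of the two representations above, the following builds a term-document count matrix and then reweights it. The weighting used here, tf × log(N / df) with tf normalized by document length, is only one common TF-IDF variant chosen for illustration; libraries often apply smoothing or other normalization. The function names `bag_of_words` and `tf_idf` are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in document i."""
    vocab = sorted({token for doc in docs for token in doc})
    matrix = []
    for doc in docs:
        counts = Counter(doc)
        matrix.append([counts[term] for term in vocab])
    return vocab, matrix


def tf_idf(docs):
    """Weight each count as (count / doc length) * log(N / document frequency)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears at least once.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    weights = []
    for doc, row in zip(docs, counts):
        length = len(doc) or 1  # guard against empty documents
        weights.append(
            [(c / length) * math.log(n_docs / df[j]) for j, c in enumerate(row)]
        )
    return vocab, weights


if __name__ == "__main__":
    docs = [["good", "movie"], ["bad", "movie"], ["good", "good", "plot"]]
    print(bag_of_words(docs))
    print(tf_idf(docs))
```

With this variant, a term that occurs in every document gets weight zero, while a term confined to one document is weighted up, which is the intuition behind TF-IDF.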
**Sentiment Analysis**: Extracting the positive/negative/neutral sentiment from text
**Word2Vec Modeling**: Converting textual data to a numerical model
o	CBOW: Continuous Bag-of-Words model
o	Skip-Gram
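
The difference between the two Word2Vec variants lies in what is predicted from what. The sketch below only builds the training examples for each variant from a tokenized sentence; it does not train any embeddings. The function names and the `window` parameter are assumptions for illustration.

```python
def context_windows(tokens, window=2):
    """Yield (center, context) for each position, with up to `window` words per side."""
    for i, center in enumerate(tokens):
        left = tokens[max(0, i - window): i]
        right = tokens[i + 1: i + 1 + window]
        yield center, left + right


def cbow_examples(tokens, window=2):
    """CBOW: the input is the list of context words, the target is the center word."""
    return [
        (context, center)
        for center, context in context_windows(tokens, window)
        if context
    ]


def skipgram_examples(tokens, window=2):
    """Skip-gram: the input is the center word, the target is one context word."""
    return [
        (center, word)
        for center, context in context_windows(tokens, window)
        for word in context
    ]


if __name__ == "__main__":
    sentence = ["text", "mining", "is", "fun"]
    print(cbow_examples(sentence, window=1))
    print(skipgram_examples(sentence, window=1))
```

CBOW yields one example per position, while skip-gram yields one example per (center, context word) pair, so skip-gram produces more training examples from the same text.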
**Model Evaluation**: Finding out which model creates a better representation of our data
**Model Visualization using t-SNE**: Visualizing our model using dimension-reduction methods.
**Topic Modelling**: Finding the hidden topics inside our textual data



Tags: Text-Mining, nlp


More information about the Pycon mailing list