[Pycon] [new paper] "Navid Nobani" - A zero to Hero training on Text Mining and NLP
info a pycon.it
Fri 14 Dec 2018 10:05:00 CET
Title: A zero to Hero training on Text Mining and NLP
Duration: 240 (includes Q&A)
Q&A Session: 0
Language: en
Type: Talk
Abstract: **Objectives**: Assuming you have little or no knowledge of text mining, this hands-on training will show you how to clean your data of common noise, explore its structure, extract information such as sentiment from it and, finally, create a working language model from it.
What you’ll learn:
**Data Preparation and Cleaning**
o Segmentation: splitting a body of text into sentences.
o Tokenization: splitting sentences into tokens.
o Stop-word, number and punctuation removal: removing tokens that have no impact, or a negative impact, on model performance.
o Stemming: reducing tokens to their stems by stripping affixes.
o Lowercase conversion
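As a rough illustration of the preparation steps above, here is a minimal sketch that uses only the Python standard library. The function names, the tiny stop-word list and the suffix-stripping stemmer are assumptions made for this example, not the training's actual material; in practice a dedicated NLP library would usually handle these steps far more robustly.

```python
import re
import string

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "in", "on", "of", "and", "to", "it"}

def segment(text):
    """Split raw text into sentences on '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Split a sentence into word/number tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)

def naive_stem(token):
    """Strip a few common English suffixes (a crude stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def clean(text):
    """Run the whole pipeline; return one list of cleaned tokens per sentence."""
    cleaned = []
    for sentence in segment(text):
        # Lowercase first so stop-word matching is case-insensitive.
        tokens = [t.lower() for t in tokenize(sentence)]
        tokens = [
            t for t in tokens
            if t not in STOP_WORDS
            and not t.isdigit()
            and t not in string.punctuation
        ]
        cleaned.append([naive_stem(t) for t in tokens])
    return cleaned
```

Calling `clean("The cats are running in 3 parks. It was raining!")` returns one token list per sentence. Note that the naive stemmer turns "running" into "runn", which is the kind of rough edge a proper stemming algorithm is designed to avoid.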
**POS tagging**: Identifying the part of speech for each token.
**NER tagging**: Identifying named entities like locations, persons, etc. in the text.
**Term-Document matrix**: creating a numerical representation from textual data
o BOW: Bag-of-words model
o TF-IDF: Term frequency – Inverse document frequency
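To make the two representations above concrete, the following sketch builds a document-term count matrix and then reweights it with TF-IDF. It is a simplified illustration under stated assumptions: documents are already tokenized lists, term frequency is normalized by document length, and idf is the plain `log(N / df)` variant. Libraries often apply smoothing or a different normalization, so their numbers will differ. The function names are hypothetical.

```python
import math
from collections import Counter

def bag_of_words(docs):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in doc i."""
    vocab = sorted({term for doc in docs for term in doc})
    matrix = []
    for doc in docs:
        counts = Counter(doc)
        matrix.append([counts[term] for term in vocab])
    return vocab, matrix

def tf_idf(docs):
    """Weight length-normalized counts by idf = log(N / df) (one common variant)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    weights = []
    for row in counts:
        total = sum(row) or 1  # avoid division by zero for empty documents
        weights.append([
            (row[j] / total) * math.log(n_docs / df[j])
            for j in range(len(vocab))
        ])
    return vocab, weights
```

With this variant, a term that appears in every document gets a weight of zero, which reflects the idea that such a term does not help tell documents apart.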
**Sentiment Analysis**: Extracting the positive/negative/neutral sentiment from text
**Word2Vec Modeling**: Converting textual data to a numerical model
o CBOW: Continuous Bag-of-Words model
o Skip-Gram
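The difference between the two Word2Vec variants above is easiest to see in the training examples each one consumes. The sketch below only generates those examples from a token list; it does not train any embeddings. The function names and the `window` parameter are assumptions chosen for illustration.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs: Skip-Gram predicts each context word from the center."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

def cbow_examples(tokens, window=2):
    """Return (context_words, target) examples: CBOW predicts the target from its context."""
    examples = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        if context:  # skip tokens with no neighbours (single-token input)
            examples.append((context, target))
    return examples
```

For the same sentence and window, Skip-Gram yields one pair per neighbouring word, while CBOW yields a single example per position with all neighbours grouped together.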
**Model Evaluation**: Finding out which model creates a better representation of our data
**Model Visualization using t-SNE**: Visualizing our model using dimension-reduction methods.
**Topic Modelling**: Finding the hidden topics inside our textual data
Tags: Text-Mining, nlp