[Pycon] [new paper] "Alisha Aneja" - What's trending this month?

info at pycon.it
Sun 20 Jan 2019 13:57:34 CET


Title: What's trending this month?
Duration: 45 (includes Q&A)
Q&A Session: 15
Language: en
Type: Talk

Abstract: Finding the trending topics among the many interesting topics on the Internet is difficult. A topic can be called trending when a lot of people worldwide are talking about it over a particular period of time.

Event Registry is a tool that collects and analyzes, in real time, news content published by over 30,000 news publishers worldwide. It can be used to find news articles of interest or the global events mentioned in these articles.

To find the trending topics, the global events feature of Event Registry is used. The event-registry-python package provides API access to Event Registry from Python. All of the APIs support filters such as date range, language, and relevance score. Events are found by searching with keywords relevant to the domain. For example, to find events related to artificial intelligence, the keywords can be ‘artificial intelligence’, ‘machine learning’, ‘virtual assistant’, etc. The returned events can then be filtered by their social scores, which aggregate virality scores from social media platforms such as Twitter, Facebook, and Pinterest. Next, the top articles are extracted from these events according to their cosine distance from a set of phrases covering the domain of interest. These phrases are a superset of the keywords initially used to search Event Registry. Finally, the TextRank algorithm is used to find the trending topics and a brief summary of each. To narrow the results further to the most relevant trending topics, the sent2vec embedding model (https://github.com/epfml/sent2vec) can be used. The basics of TextRank and sent2vec will also be discussed in this talk.
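
The two ranking steps in the abstract, ordering articles by cosine distance from the domain phrases and scoring sentences with TextRank, can be sketched roughly as follows. This is a minimal illustration and not the speaker's implementation: the function names (bag_of_words, cosine_distance, rank_articles, textrank) are hypothetical, plain word counts stand in for a real embedding such as sent2vec, and the TextRank edge weights use cosine similarity instead of the overlap measure from the original TextRank paper. Fetching events from Event Registry is left out entirely.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; an assumed stand-in for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty text as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def rank_articles(articles, phrases):
    """Sort articles by increasing cosine distance from the domain phrases."""
    target = bag_of_words(" ".join(phrases))
    return sorted(articles, key=lambda text: cosine_distance(bag_of_words(text), target))


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences with PageRank over a sentence-similarity graph."""
    n = len(sentences)
    bags = [bag_of_words(s) for s in sentences]
    # Edge weight = cosine similarity between two different sentences.
    weights = [
        [0.0 if i == j else 1.0 - cosine_distance(bags[i], bags[j]) for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Each sentence receives score from its neighbours, proportional
        # to the share of the neighbour's total edge weight it holds.
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out_sums[j] * scores[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            for i in range(n)
        ]
    return scores


if __name__ == "__main__":
    phrases = ["artificial intelligence", "machine learning", "virtual assistant"]
    articles = [
        "Local bakery wins regional bread award",
        "New machine learning model improves virtual assistant speech",
    ]
    for article in rank_articles(articles, phrases):
        print(article)
```

In this sketch the highest-scoring sentences from textrank would serve as the brief summary of a topic; swapping bag_of_words for sentence embeddings would change only how the vectors are built, not the ranking logic.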

The last 10 minutes will be reserved for questions from the audience.


Tags: data, nlp, pydata

