[Pycon] [new paper] "Harshdeep Singh" - Dummies guide to Spark (using Python)!

info a pycon.it
Sat 5 Jan 2019 22:22:32 CET


Title: Dummies guide to Spark (using Python)!
Duration: 60 (includes Q&A)
Q&A Session: 0
Language: en
Type: Talk

Abstract: This talk introduces Spark and explains why it is essential in today's data stack, in both research and industry. We will also look in detail at what makes it fast and how it scales jobs.

If you love Python and want to play with big data, this is the talk for you.

The talk covers several topics that matter when working with Spark: what an RDD is, considerations for working with key-value data, why avoiding groupByKey is important, and more. We also include Python-specific considerations, such as the difference between DataFrames/Datasets and traditional RDDs in Python. Finally, we will compare Spark with other libraries such as Pandas and discuss which is better suited to which scenario.
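
To give a rough feel for the groupByKey point, here is a minimal sketch in plain Python. It does not use Spark at all: the two functions and the `partitions` list are hypothetical stand-ins that only count how many records would have to cross a shuffle under each strategy, assuming a simple per-key sum.

```python
from collections import defaultdict


def shuffle_group_by_key(partitions):
    """groupByKey-style: every (key, value) record crosses the shuffle."""
    shuffled = [pair for part in partitions for pair in part]
    grouped = defaultdict(list)
    for key, value in shuffled:
        grouped[key].append(value)
    # The sum only happens after all values for a key have been collected.
    totals = {key: sum(values) for key, values in grouped.items()}
    return totals, len(shuffled)


def shuffle_reduce_by_key(partitions):
    """reduceByKey-style: combine inside each partition before the shuffle."""
    shuffled = []
    for part in partitions:
        local = defaultdict(int)
        for key, value in part:
            local[key] += value  # map-side combine
        shuffled.extend(local.items())  # one record per key per partition
    totals = defaultdict(int)
    for key, value in shuffled:
        totals[key] += value
    return dict(totals), len(shuffled)


# Two simulated partitions of word-count style pairs (made-up data).
partitions = [
    [("a", 1), ("b", 1), ("a", 1), ("a", 1)],
    [("b", 1), ("b", 1), ("a", 1), ("c", 1)],
]

grouped_totals, grouped_records = shuffle_group_by_key(partitions)
reduced_totals, reduced_records = shuffle_reduce_by_key(partitions)

print(grouped_totals == reduced_totals)
print(grouped_records, reduced_records)
```

Both strategies produce the same totals, but the combine-first version moves fewer records between partitions. In PySpark the corresponding choice is roughly between `rdd.groupByKey().mapValues(sum)` and `rdd.reduceByKey(operator.add)`; the real savings depend on the data and the cluster.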

Tags: datamining, pyspark, bigdata, data-science, integration, data, Big-Data, spark

