[Pycon] [new paper] "Christian Barra" - Scaling up your data infrastructure
info at pycon.it
Mon 6 Nov 2017 11:52:07 CET
Title: Scaling up your data infrastructure
Duration: 45 (includes Q&A)
Q&A Session: 15
Language: en
Type: Talk
Abstract: **This talk aims to answer a few questions:**
- What do you do when you need to move your model from your laptop to production?
- Is big data == "I need to use the JVM"?
- What do you do when you need to have GPUs to train your model?
- How do you apply the best software engineering practices (testing and CI, for example) inside your data science process?
- How do you “decouple” your data scientists, developers and DevOps teams?
- How do you guarantee the reproducibility of your models?
- How do you scale your training process when it no longer fits in memory?
- How do you serve your models and provide an easy rollback system?
I’ll share my experience, highlighting some of the challenges I faced and the solutions I came up with to answer these questions.
The principles and best practices I will share are something that you can apply, more or less easily, if you are running, or are in the process of setting up, a production system based on the Python stack.
This talk will focus on (my) best practices for running the Python data stack together. It is the result of more than a year of work on a project called Cassiny, which aims to simplify your life if you want to use a completely Python-based solution in your data science workflow.
Tags: java, Data-Scientist, docker, pydata