[Python] [OT] Cheap MapReduce in Go
    Riccardo Magliocchetti 
    riccardo.magliocchetti at gmail.com
       
    Mon 13 Jul 2015 20:35:01 CEST
    
    
  
On 13/07/2015 20:20, Carlo Miron wrote:
> <http://marcio.io/2015/07/cheap-mapreduce-in-go/>
>
> tl;dr
>
> Sometimes you don’t need overly complex infrastructures or systems to do a job
> well. In this case, we were running these exact same aggregations over close to
> 20 EMR instances that would take a few minutes to execute the entire MapReduce
> job over hundreds of Gigabytes of data each day.
>
> When we decided to take a look at this problem again, we rewrote this task using
> Go, and we now simply run this on a single 8-core machine and the whole daily
> execution takes about 10 minutes. We cut a lot of the costs associated with
> maintaining and running these EMR systems and we just schedule this Go app to
> run once a day over our daily dataset.
>
> You can find the entire code here:
> https://gist.github.com/mcastilho/e051898d129b44e2f502
A while back something similar came out, which used a rather touching
pipeline:
http://aadrake.com/command-line-tools-can-be-235x-faster-than-your-hadoop-cluster.html
-- 
Riccardo Magliocchetti
@rmistaken
http://menodizero.it
    
    
       