by Tomasz Nurkiewicz
MapReduce is a programming model for processing large amounts of data. It works best when the program itself is relatively simple but the data is spread across thousands of servers. MapReduce was invented and popularized by Google. I’ll talk about MapReduce in general and Hadoop in particular.
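To make the model concrete, here is a minimal single-process sketch of the classic word-count job. It is not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and the grouping step that a real framework performs across servers is simulated here with an in-memory dictionary.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Combine all values emitted for one key into a single result."""
    return word, sum(counts)


def word_count(documents):
    """Run map, shuffle, and reduce over an iterable of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["to be or not to be", "to do is to be"]))
```

The map and reduce functions only ever see one document or one key at a time, which is what lets a real cluster run many copies of them in parallel on different machines.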
- MapReduce on Wikipedia
- How Hadoop Works Internally – Inside Hadoop
- Hadoop Combiner – Best Explanation to MapReduce Combiner
- Apache Hive
MapReduce at Google
- MapReduce: Simplified Data Processing on Large Clusters - the original whitepaper
- Why did Google stop using MapReduce and start encouraging Cloud Dataflow?
- Google Dumps MapReduce in Favor of New Hyper-Scale Analytics System