App Engine MapReduce API - Part 4: Combining Sequential MapReduce Jobs

View all articles in the MapReduce API Series. Last time we looked at how to run a full MapReduce Pipeline to count the number of occurrences of a character within each string. In this post we will see how to chain multiple MapReduce Pipelines together to perform sequential tasks. ...

May 13, 2014 · 2 min · Kevin Sookocheff

App Engine MapReduce API - Part 3: Programmatic MapReduce using Pipelines

View all articles in the MapReduce API Series. In the last article we examined how to run one-off tasks that operate on a large dataset using a mapreduce.yaml configuration file. This article will take us a step further and look at how to run a MapReduce job programmatically using the App Engine Pipeline API. ...

April 28, 2014 · 7 min · Kevin Sookocheff

App Engine MapReduce API - Part 2: Running a MapReduce Job Using mapreduce.yaml

View all articles in the MapReduce API Series. Last time we looked at an overview of how MapReduce works. In this article we’ll be getting our hands dirty writing some code to handle the Map Stage. If you’ll recall, the Map Stage is composed of two separate components: an InputReader and a map function. We’ll look at each of these in turn. ...

April 22, 2014 · 11 min · Kevin Sookocheff

App Engine MapReduce API - Part 1: The Basics

View all articles in the MapReduce API Series. The first article in this series provides an overview of the App Engine MapReduce API. We will give a basic overview of what MapReduce is and how it is used to do parallel and distributed processing of large datasets. ...

April 15, 2014 · 6 min · Kevin Sookocheff