App Engine MapReduce API - Part 7: Writing a Custom Output Writer
The MapReduce library ships with a number of default output writers. You can also write your own by implementing the output writer interface. This article examines how to write a custom output writer that pushes data from the App Engine datastore to an Elasticsearch cluster. A similar pattern can be followed to push the output of your MapReduce job almost anywhere.[Read More]
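To give a flavour of what the post covers, here is a minimal sketch of such a writer. The method names follow the Python MapReduce library's OutputWriter base class (exact signatures vary slightly between library versions), and the Elasticsearch URL, index, and type here are hypothetical placeholders:

```python
import json
import urllib2  # HTTP client available on the App Engine Python 2.7 runtime

from mapreduce import output_writers

# Hypothetical cluster endpoint; a real writer would read this from the
# job's mapper_params rather than hard-coding it.
ES_DOC_URL = 'http://elasticsearch.example.com:9200/articles/article/'


class ElasticsearchOutputWriter(output_writers.OutputWriter):
    """Indexes each value emitted by the map stage into Elasticsearch."""

    @classmethod
    def validate(cls, mapper_spec):
        # Check required parameters (e.g. the cluster URL) before the job runs.
        pass

    @classmethod
    def create(cls, mr_spec, shard_number, shard_attempt, _writer_state=None):
        # One writer instance is created per shard.
        return cls()

    @classmethod
    def from_json(cls, state):
        # Writers are serialized between task slices; this one is stateless.
        return cls()

    def to_json(self):
        return {}

    def write(self, data):
        # POST one JSON document per mapped value. A production writer
        # would buffer documents and use the _bulk endpoint instead.
        request = urllib2.Request(ES_DOC_URL, json.dumps(data),
                                  {'Content-Type': 'application/json'})
        urllib2.urlopen(request)

    def finalize(self, ctx, shard_state):
        # Nothing to flush, since write() sends documents immediately.
        pass
```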
App Engine MapReduce API - Part 6: Writing a Custom Input Reader
One of the great things about the MapReduce library is the ability to write a custom InputReader to process data from any data source. In this post we will explore how to write an InputReader that leases tasks from an App Engine pull queue by implementing the InputReader interface.
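As a taste of the approach, here is a simplified sketch of a pull-queue reader. It leans on the task queue API's lease_tasks and delete_tasks calls; the queue name is a hypothetical placeholder, and a real reader would also handle serialization mid-iteration:

```python
from google.appengine.api import taskqueue
from mapreduce import input_readers

# Hypothetical queue name; a real reader would take this from mapper_params.
QUEUE_NAME = 'pull-queue'


class PullQueueInputReader(input_readers.InputReader):
    """Feeds the map stage by leasing tasks from a pull queue."""

    def __init__(self, queue_name):
        self._queue_name = queue_name

    def __iter__(self):
        queue = taskqueue.Queue(self._queue_name)
        while True:
            # Lease a batch of up to 100 tasks for 60 seconds.
            tasks = queue.lease_tasks(60, 100)
            if not tasks:
                return
            for task in tasks:
                # Each task payload becomes one input to the map function.
                yield task.payload
            # Delete the leased tasks so they are not processed again.
            queue.delete_tasks(tasks)

    @classmethod
    def from_json(cls, state):
        return cls(state['queue_name'])

    def to_json(self):
        return {'queue_name': self._queue_name}

    @classmethod
    def split_input(cls, mapper_spec):
        # A pull queue hands out work on demand, so every shard can
        # simply lease from the same queue.
        return [cls(QUEUE_NAME) for _ in xrange(mapper_spec.shard_count)]

    @classmethod
    def validate(cls, mapper_spec):
        pass
```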
App Engine MapReduce API - Part 5: Using Combiners to Reduce Data Throughput
App Engine MapReduce API - Part 4: Combining Sequential MapReduce Jobs
App Engine MapReduce API - Part 3: Programmatic MapReduce using Pipelines
In the last article we examined how to run one-off tasks that operate on a large dataset using a
mapreduce.yaml configuration file. This article will take us a step further and look at how to run a MapReduce job programmatically using the App Engine Pipeline API.
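For a sense of what "programmatic" means here, a job kickoff looks roughly like the sketch below. The dotted paths to the map and reduce functions and the entity kind are hypothetical placeholders for your own handlers:

```python
from mapreduce import mapreduce_pipeline


def start_word_count():
    """Start a MapReduce job in code rather than via mapreduce.yaml."""
    job = mapreduce_pipeline.MapreducePipeline(
        'word_count',                                   # job name shown in the UI
        'myapp.jobs.word_count_map',                    # map function (dotted path)
        'myapp.jobs.word_count_reduce',                 # reduce function
        'mapreduce.input_readers.DatastoreInputReader',
        mapper_params={'entity_kind': 'myapp.models.Document'},
        shards=16)
    job.start()
    return job.pipeline_id
```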
App Engine MapReduce API - Part 2: Running a MapReduce Job Using mapreduce.yaml
Last time we looked at an overview of how MapReduce works. In this article we’ll be getting our hands dirty writing some code to handle the Map Stage. If you’ll recall, the Map Stage is composed of two separate components: an InputReader and a
map function. We’ll look at each of these in turn.
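As a preview, a map function is just a Python function that receives one item from the InputReader and yields operations for the library to apply. A minimal sketch, assuming a hypothetical Person model with a name property:

```python
from mapreduce import operation as op


def lowercase_name(entity):
    """Map function: called once per entity yielded by the InputReader."""
    entity.name = entity.name.lower()
    # Yield operations instead of performing them directly; the library
    # batches datastore writes and counter updates for you.
    yield op.db.Put(entity)
    yield op.counters.Increment('names-lowercased')
```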