Why I’m not (too) worried about Python 2

Python 2 will retire in about one month. Given many organizations’ continued reliance on it, you may be asking, “Now what?” What does this change mean for a company that relies heavily on a deprecated language? Often, the Python 2 code deployed in an organization still generates a lot of value. Yet as this code continues to age, we expose ourselves to potential security vulnerabilities with fewer avenues to address them. [Read More]

Introducing CloudPyPI

A common problem with Python development for large-scale teams is sharing internal libraries. At Vendasta we’ve been solving this problem using a private PyPI installation running on Google App Engine with Python eggs and wheels being served by Google Cloud Storage. Today, we are announcing the open source version of this tool — CloudPyPI. CloudPyPI is a modification of pypiserver for running on Google App Engine. We’ve also introduced a simple user management system to allow authenticated access to your Python packages. [Read More]

Halting Python unittest Execution on First Error

We all know the importance of unit tests, especially in a dynamic language like Python. Occasionally you have a set of unit tests that are failing in a cascading fashion, where the first error case causes subsequent tests to fail (these tests are likely no longer unit tests, but that’s a different discussion). To help isolate the offending test case in a sea of failures, you can set the unittest.TestCase class to halt after the first error by overriding the run method as follows. [Read More]

Managing App Engine Dependencies Using pip

One unfortunate difficulty when working with App Engine is managing your local dependencies. You don’t have access to your Python environment, so all libraries you wish to use must be vendored with your installation. That is, you need to copy all of your library code into a local folder to ship along with your app.
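To make the vendoring idea concrete, here is a minimal sketch using only the standard library. It assumes the libraries were copied into a local folder, for example with pip install -t lib -r requirements.txt (the folder name lib is an assumption), and the helper name add_vendor_dir is hypothetical. On App Engine itself, the SDK's google.appengine.ext.vendor module is typically used for this step instead.

```python
import os
import site


def add_vendor_dir(path):
    """Make packages copied into `path` importable (hypothetical helper)."""
    absolute = os.path.abspath(path)
    if not os.path.isdir(absolute):
        raise ValueError("vendor directory does not exist: %s" % absolute)
    # addsitedir appends the folder to sys.path and also processes any
    # .pth files that pip may have placed inside it.
    site.addsitedir(absolute)
    return absolute
```

A call such as add_vendor_dir("lib") would normally go in a module that loads before the rest of the app, so that later imports can find the vendored packages.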

[Read More]

App Engine MapReduce API - Part 7: Writing a Custom Output Writer

View all articles in the MapReduce API Series.

The MapReduce library supports a number of default output writers. You can also write your own by implementing the output writer interface. This article examines how to write a custom output writer that pushes data from the App Engine datastore to an Elasticsearch cluster. A similar pattern can be followed to push the output from your MapReduce job to any number of places.

[Read More]

App Engine MapReduce API - Part 6: Writing a Custom Input Reader

View all articles in the MapReduce API Series.

One of the great things about the MapReduce library is the ability to write a custom InputReader to process data from any data source. In this post we will explore how to write an InputReader that leases tasks from an App Engine pull queue by implementing the InputReader interface.

[Read More]

App Engine MapReduce API - Part 3: Programmatic MapReduce using Pipelines

View all articles in the MapReduce API Series.

In the last article we examined how to run one-off tasks that operate on a large dataset using a mapreduce.yaml configuration file. This article will take us a step further and look at how to run a MapReduce job programmatically using the App Engine Pipeline API.

[Read More]

App Engine MapReduce API - Part 2: Running a MapReduce Job Using mapreduce.yaml

View all articles in the MapReduce API Series.

Last time we looked at an overview of how MapReduce works. In this article we’ll be getting our hands dirty writing some code to handle the Map Stage. If you’ll recall, the Map Stage is composed of two separate components: an InputReader and a map function. We’ll look at each of these in turn.

[Read More]