When doing a MapReduce operation there are times when you want to edit a set of entities without triggering the post or pre put hooks associated with those entities. On such ocassions using the raw datastore entity allows you to process the data without unwanted side effects. This article will show how to use the RawDatastoreInputReader to process datastore entities.
When doing a MapReduce operation there are times when you want to edit a set of entities without triggering the post or pre put hooks associated with those entities. On such ocassions using the raw datastore entity allows you to process the data without unwanted side effects.
For the sake of this discussion let’s assume we want to move a
phone_number field to a
work_number field for all entities of a certain Kind in the datastore.
Getting the raw datastore entity
The MapReduce library provides a
RawDatastoreInputReader that will feed raw datastore entities to your mapping function. We can set our MapReduce operation to use the
RawDatastoreInputReader using a
- name: move_phone_numbers mapper: input_reader: mapreduce.input_readers.RawDatastoreInputReader handler: app.pipelines.move_phone_numbers_map params: - name: entity_kind default: MyModel
Manipulating a raw datastore entity
raw_datastore_map function to use the datastore entity in its raw form. The raw form of the datastore entity provides a dictionary like interface that we can use to manipulate the entity. With this interface we can move the phone number to the correct field.
def move_phone_numbers_map(entity): phone_number = entity.get('phone_number') if phone_number: entity['work_number'] = phone_number del entity['phone_number'] yield op.db.Put(entity)
op.db.Put will put the entity to the datastore using the raw datastore
API, thereby bypassing any ndb hooks that are in place. For more information on
the raw datastore API the best resource is the source code itself, available
from the App Engine SDK