Map/Reduce Script Stages

The map/reduce script type goes through at least two of five possible stages.

The stages are processed in the following order. Each stage must finish before the next one starts.

getInputData – Gets a collection of data. This stage is always processed first and is required. The input stage runs sequentially.
map – Parses each row of data into a key-value pair. One pair (key-value) is passed per function invocation. If you skip this stage, you need a reduce stage. Data can be processed simultaneously in this stage.
shuffle – Groups values by keys. This happens automatically after the map stage. You can't access this stage directly as it's handled by the framework. Data is processed sequentially in this stage.
reduce – Evaluates the data in each group. One group (key-values) is passed per function invocation. If you skip this stage, you need a map stage. Data can be processed simultaneously in this stage.
summarize – Summarizes the previous stages' output. Use this stage to summarize the data and write it to a file or send an email. This stage is optional and not part of the main map/reduce process. The summarize stage runs sequentially.

Note:

You don't need to use both the map and reduce stages. You can skip one of them.

The following diagram illustrates these stages, in the context of processing a set of invoices.

Here's how the stages are used in this example:

getInputData – Loads invoices that need payment.
map – Pairs each invoice with the customer who should pay it. The output is key-value pairs with customerID as the key and the invoice as the value. The map function runs five times for five invoices.
reduce – Processes the invoices grouped by customerID. There are three unique groups, so the reduce function runs three times. To create a customer payment for every group, custom logic iterates over each group using customerID as the key.
summarize – Custom logic gets metrics (like invoices paid) and sends them in an email.

For a code sample similar to this example, see Processing Invoices Example.
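
The stage order described above can also be sketched in plain JavaScript. This is a minimal simulation, not SuiteScript: it does not use the map/reduce API, it runs everything sequentially (the real framework can process map and reduce invocations simultaneously), and the invoice data, field names, and the runPipeline helper are hypothetical.

```javascript
// Plain-JavaScript simulation of the stage order. Not the SuiteScript API.
// All invoice and customer values below are hypothetical sample data.

// getInputData: returns the collection to process (runs once).
function getInputData() {
  return [
    { invoiceId: 'INV-1', customerId: 'C1', amount: 100 },
    { invoiceId: 'INV-2', customerId: 'C2', amount: 250 },
    { invoiceId: 'INV-3', customerId: 'C1', amount: 75 },
    { invoiceId: 'INV-4', customerId: 'C3', amount: 40 },
    { invoiceId: 'INV-5', customerId: 'C2', amount: 60 }
  ];
}

// map: called once per input row; emits one key-value pair.
function map(invoice) {
  return { key: invoice.customerId, value: JSON.stringify(invoice) };
}

// shuffle: groups values by key (handled by the framework in a real script).
function shuffle(pairs) {
  const groups = new Map();
  for (const { key, value } of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return groups;
}

// reduce: called once per group; here it totals one customer's invoices.
function reduce(key, values) {
  const invoices = values.map((v) => JSON.parse(v));
  const total = invoices.reduce((sum, inv) => sum + inv.amount, 0);
  return { customerId: key, invoicesPaid: invoices.length, total: total };
}

// summarize: collects metrics from the output of the earlier stages.
function summarize(payments) {
  return {
    customers: payments.length,
    invoicesPaid: payments.reduce((n, p) => n + p.invoicesPaid, 0)
  };
}

// Hypothetical driver that chains the stages in order.
function runPipeline() {
  const pairs = getInputData().map((invoice) => map(invoice)); // five map calls
  const groups = shuffle(pairs);                               // three groups
  const payments = [...groups].map(([key, values]) => reduce(key, values));
  return { payments: payments, summary: summarize(payments) };
}

console.log(runPipeline().summary);
```

With five invoices spread over three customers, map runs five times and reduce runs three times, matching the counts in the example above.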

Passing Data to a Map or Reduce Stage

To avoid changing data accidentally, key-value pairs are converted to strings when passed between stages. For map/reduce scripts, SuiteScript 2.x checks whether the data is already a string and, if it is not, uses JSON.stringify() to convert it.

JSON-serialized objects stay in JSON format. To avoid possible errors, SuiteScript does not automatically deserialize the data. For instance, trying to convert non-JSON data types like CSV or XML can cause errors. You can use JSON.parse() to convert the JSON string back to a JS object if you need to.
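
The round trip described above might look like the following sketch. It is plain JavaScript rather than SuiteScript, and serializeForStage and parseStageValue are hypothetical helpers that only mimic the described behavior; they are not part of any NetSuite module.

```javascript
// Sending side: mimics the described check. Strings pass through unchanged,
// anything else is converted with JSON.stringify(). Hypothetical helper.
function serializeForStage(data) {
  return typeof data === 'string' ? data : JSON.stringify(data);
}

// Receiving side: the value always arrives as a string, so the script
// decides whether to parse it. Hypothetical helper.
function parseStageValue(value) {
  try {
    return JSON.parse(value);
  } catch (e) {
    // Not JSON (for example CSV or XML text): keep the raw string.
    return value;
  }
}

const sent = serializeForStage({ invoiceId: 'INV-1', amount: 100 });
const received = parseStageValue(sent);
console.log(typeof sent, received.amount);

const csvLine = serializeForStage('INV-1,100');
console.log(parseStageValue(csvLine));
```

One caveat of the try/catch approach: a plain string that happens to be valid JSON, such as '123', is parsed into a number. When the format of a value is known in advance, calling JSON.parse() only on values that were written as JSON is the safer choice.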

Map/reduce script keys (specifically, in mapContext or reduceContext objects) are limited to 3,000 characters, and values are limited to 10 MB. A key over 3,000 characters returns a KEY_LENGTH_IS_OVER_3000_BYTES error. A value over 10 MB returns a VALUE_LENGTH_IS_OVER_10_MB error.

If you have map/reduce scripts that use the mapContext.write(options) or reduceContext.write(options) methods, keep key strings under 3,000 characters and value strings under 10 MB. Consider the potential length of dynamically generated strings, which might exceed these limits. Also, avoid using keys to carry your data; pass the data in values instead.
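
One way to guard against these limits is to check a pair before writing it. The sketch below is plain JavaScript with a hypothetical checkPair helper. It measures both limits in string length and treats 10 MB as 10 * 1024 * 1024, which are assumptions: the text above speaks of characters while the error name speaks of bytes, so the framework's exact unit may differ.

```javascript
// Limits taken from the text above. Measuring them in string length
// (UTF-16 code units) is an assumption, not documented behavior.
const MAX_KEY_LENGTH = 3000;
const MAX_VALUE_LENGTH = 10 * 1024 * 1024; // assumed meaning of "10 MB"

// Hypothetical helper: returns a list of problems, empty when the pair
// looks safe to write. Assumes key and value are defined.
function checkPair(key, value) {
  const keyText = typeof key === 'string' ? key : JSON.stringify(key);
  const valueText = typeof value === 'string' ? value : JSON.stringify(value);
  const problems = [];
  if (keyText.length > MAX_KEY_LENGTH) {
    problems.push('key length ' + keyText.length + ' exceeds ' + MAX_KEY_LENGTH);
  }
  if (valueText.length > MAX_VALUE_LENGTH) {
    problems.push('value length ' + valueText.length + ' exceeds ' + MAX_VALUE_LENGTH);
  }
  return problems;
}

// Usage: check a dynamically generated key before it would be written.
const generatedKey = ['C1', '2024', 'EUR'].join('|');
const issues = checkPair(generatedKey, { invoiceId: 'INV-1', amount: 100 });
console.log(issues.length === 0 ? 'ok to write' : issues.join('; '));
```

In a real script, a check like this would run just before the call to mapContext.write(options) or reduceContext.write(options), so an oversized pair can be logged or split instead of failing the invocation.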

For more information about map/reduce limits, see Map/Reduce Governance.
