Execution of Restarted Map/Reduce Stages

A map/reduce script can have many jobs. The input, shuffle, and summarize stages each use one job, but multiple jobs can run in the map and reduce stages, and those jobs can run in parallel. Any job can be forcefully ended at any time. The impact depends on what the job was doing and which stage it was running in.

For details, see the following topics:

Termination of getInput Stage

The work of a serial stage (the getInput, shuffle, and summarize stages) is done in a single job. If the getInput stage job is forcefully terminated, it's later restarted. The getInput part of the script can check if it's a restart by examining the isRestarted attribute of the context argument (inputContext.isRestarted). The script is being restarted if and only if inputContext.isRestarted === true.
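For example, a minimal sketch of a getInputData entry point that detects a restart might look like the following (the log message and returned data are illustrative):

    /**
     * @NApiVersion 2.1
     * @NScriptType MapReduceScript
     */
    define(['N/log'], (log) => {
        const getInputData = (inputContext) => {
            if (inputContext.isRestarted === true) {
                // The getInput stage job was forcefully terminated and restarted.
                log.audit({ title: 'getInputData', details: 'Stage was restarted' });
            }
            // A restarted getInputData should return the same data as the original run.
            return [{ id: 1 }, { id: 2 }];
        };
        return { getInputData };
    });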

The next stage gets its input from the return value of the getInput script, which is saved after the getInput stage finishes. Even a restarted getInput script should return the same data. The map/reduce framework makes sure no data is written twice.

However, if the getInput script changes other data (for example, by creating NetSuite records), it must include code to handle duplicate processing. If duplicates are undesired, use idempotent operations to ensure that these records are not created twice.
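As an illustration, the following sketch guards a side effect in getInputData with a search, so a restarted run does not create a duplicate record. The custom record type, field ID, and run identifier are hypothetical:

    /**
     * @NApiVersion 2.1
     * @NScriptType MapReduceScript
     */
    define(['N/record', 'N/search'], (record, search) => {
        const getInputData = (inputContext) => {
            // Hypothetical side effect: record that this run started.
            // Search first so a restarted stage does not create a duplicate.
            const existing = search.create({
                type: 'customrecord_mr_run_log',                       // hypothetical record type
                filters: [['custrecord_mr_run_id', 'is', 'run-001']]   // hypothetical field and value
            }).run().getRange({ start: 0, end: 1 });

            if (existing.length === 0) {
                const runLog = record.create({ type: 'customrecord_mr_run_log' });
                runLog.setValue({ fieldId: 'custrecord_mr_run_id', value: 'run-001' });
                runLog.save();
            }

            // The input itself is returned as usual.
            return search.create({ type: search.Type.CUSTOMER, columns: ['email'] });
        };
        return { getInputData };
    });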

Termination of Shuffle Stage

The shuffle stage doesn't have any custom code, so if its job is forcefully ended, it's restarted later and all the work gets redone. There is no impact other than that the stage takes longer to finish.

Termination of Parallel Stages

Map and reduce stages can run jobs in parallel, so they're considered parallel stages. A forced restart affects both parallel stages in the same way. The following example covers what happens if a map stage restarts; a reduce stage behaves the same way.

The map stage's job is to run a map function on each key-value pair from the previous stage (getInput). Multiple jobs participate in the map stage. Each map job claims a batch of key-value pairs (a configured number of them) on which the map function has not yet run, and flags them so that no other job runs the map function on them. The job then sequentially executes the map function on the key-value pairs it flagged. The map stage finishes when all pairs have been processed.
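For reference, a bare-bones map entry point looks like the following sketch. Each invocation receives one claimed key-value pair, and mapContext.write passes a pair on to the shuffle stage:

    /**
     * @NApiVersion 2.1
     * @NScriptType MapReduceScript
     */
    define([], () => {
        const map = (mapContext) => {
            // mapContext.value is a string; parse it if structured data was written.
            const value = JSON.parse(mapContext.value);
            // Pass the (possibly transformed) pair on to the shuffle stage.
            mapContext.write({ key: mapContext.key, value: value });
        };
        return { map };
    });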

There's no limit to how many jobs can be in the map stage, but concurrency is limited. Initially, the number of map jobs is equal to the selected concurrency in the corresponding map/reduce script deployment. However, to prevent a single map/reduce task from monopolizing all computational resources in the account, each map job can yield itself to allow other jobs to execute. The yield creates an additional map job and the number of yields is unlimited.

Note:

This is a different type of yield from the yield in a SuiteScript 1.0 scheduled script. In SuiteScript 1.0, the yield happens in the middle of a script execution. In a map job, the yield can happen only between two map function executions, not in the middle of one.

If a map job is forcefully terminated, it's later restarted. First, the restarted job executes the map function on all key-value pairs that it claimed but did not mark finished before termination. Only that map job can run the map function on those pairs; they can't be taken by another map job. After processing those pairs, the map job continues as usual, taking other unfinished pairs and running the map function on them.

In some cases, the map function can run again on multiple key-value pairs. The number of pairs on which the map function might run again depends on the buffer size you select on the deployment page. The buffer size determines the number of key-value pairs originally taken in a batch, and the job marks the batch as finished only when the map function has been executed on all of them. Therefore, if the map job is forcefully terminated in the middle of a batch, the entire batch is processed from the beginning when the map job is restarted.

Note that the map/reduce framework deletes all key-value pairs written from a partially executed batch so they aren't written twice. Therefore, the map function does not need to check whether mapContext.write(options) has already been executed for a particular key-value pair. However, if the map function changes other data, it must also use idempotent operations. For example, if a map function creates NetSuite records and duplicates are undesired, the script should ensure that these records are not created twice.

To check whether a map function execution is part of a restarted batch, the script must examine the isRestarted attribute in the context argument (mapContext.isRestarted). The map function is in a restarted batch if and only if mapContext.isRestarted === true.

Be aware that an isRestarted value of true only means that some part of the batch might have already run. Even if mapContext.isRestarted === true, the map function could be running on a particular key-value pair for the first time. For example, the map job could be stopped after claiming a key-value pair but before running the map function on it. This is more likely when a high buffer size is set in the map/reduce deployment.
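Putting these points together, a map function can treat isRestarted as a hint but verify its own side effects independently. In the following sketch, the custom record type and field ID are hypothetical:

    /**
     * @NApiVersion 2.1
     * @NScriptType MapReduceScript
     */
    define(['N/record', 'N/search'], (record, search) => {
        const map = (mapContext) => {
            // isRestarted only signals that this pair belongs to a restarted batch;
            // this may still be the first time the map function runs on it.
            // Verify the side effect instead of trusting the flag alone.
            const alreadyCreated = search.create({
                type: 'customrecord_processed_item',                          // hypothetical record type
                filters: [['custrecord_source_key', 'is', mapContext.key]]    // hypothetical field
            }).run().getRange({ start: 0, end: 1 }).length > 0;

            if (!alreadyCreated) {
                const rec = record.create({ type: 'customrecord_processed_item' });
                rec.setValue({ fieldId: 'custrecord_source_key', value: mapContext.key });
                rec.save();
            }

            // Writes through mapContext.write are deduplicated by the framework
            // and need no such guard.
            mapContext.write({ key: mapContext.key, value: mapContext.value });
        };
        return { map };
    });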

Termination of Summarize Stage

If the summarize stage job is forcefully ended, it's restarted later. The summarize part of the script can check if it's a restart by examining the isRestarted attribute of the summary argument (summaryContext.isRestarted). The script is being restarted if and only if summaryContext.isRestarted === true.
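A minimal summarize sketch that reacts to a restart might look like this (the log messages are illustrative):

    /**
     * @NApiVersion 2.1
     * @NScriptType MapReduceScript
     */
    define(['N/log'], (log) => {
        const summarize = (summaryContext) => {
            if (summaryContext.isRestarted === true) {
                // Earlier summarize work may have partially run; guard any
                // side effects the same way as in the other stages.
                log.audit({ title: 'summarize', details: 'Stage was restarted' });
            }
            // Process the reduce stage output as usual.
            summaryContext.output.iterator().each((key, value) => {
                log.debug({ title: key, details: value });
                return true; // continue iterating
            });
        };
        return { summarize };
    });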
