Calculate the Sum of a Column or the Count of Rows while Processing Large Files
You can process large comma-separated value (CSV) files (up to 1 GB in size) using the Read File in Segments operation of a stage file action. While doing so, you may also need to calculate the sum of a column or the count of rows.
Consider the following payload. Assume you want to calculate the sum of all the values in the column Amount1. Typically, you might declare a variable upstream of the stage file action and update it with the result computed in each chunk that the stage file action processes.


However, updating upstream variables inside the Read File in Segments operation of the stage file action degrades performance and prevents the segments from being processed in parallel. In this case, the following warning message appears in the integration canvas:
Stage File Read File in Segments includes action that will result in segments being processed sequentially
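To illustrate why a shared upstream variable forces sequential processing, the following is a minimal conceptual sketch in Python. It is an analogy only, not Oracle Integration code. The sample data, the segment size, and all function names (read_segments, sequential_total, summarize_segment, parallel_total) are hypothetical; only the column name Amount1 comes from the text above.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data; only the column name Amount1 comes from the text.
SAMPLE_CSV = "Id,Amount1\n1,10.5\n2,20.0\n3,4.5\n4,15.0\n5,50.0\n"


def read_segments(text, segment_size):
    """Split the CSV rows into lists of at most segment_size rows."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return [rows[i:i + segment_size] for i in range(0, len(rows), segment_size)]


def sequential_total(segments):
    """Every segment updates one shared variable, so segments must run in order."""
    total = 0.0  # plays the role of the variable declared upstream
    count = 0
    for segment in segments:
        for row in segment:
            total += float(row["Amount1"])
            count += 1
    return total, count


def summarize_segment(segment):
    """Compute a partial sum and row count from the segment's own rows only."""
    return sum(float(row["Amount1"]) for row in segment), len(segment)


def parallel_total(segments):
    """Segments share no state, so they can be summarized concurrently."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(summarize_segment, segments))
    # Combine the independent partial results once, after all segments finish.
    return sum(p[0] for p in partials), sum(p[1] for p in partials)


if __name__ == "__main__":
    segments = read_segments(SAMPLE_CSV, segment_size=2)
    print(sequential_total(segments))
    print(parallel_total(segments))
```

Both functions return the same sum and row count for the sample data. The difference is that summarize_segment reads nothing outside its own segment, which is the property that allows segments to be processed in parallel.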
As a solution for this use case, perform the following steps to use the aggregate functions sum and count while processing large files: