1.7 About Aggregation Offload

Oracle Big Data SQL uses Oracle In-Memory technology to push aggregation processing down to the Oracle Big Data SQL cells. This enables Oracle Big Data SQL to leverage the processing power of the Hadoop cluster for distributing aggregations across the cluster nodes. The performance gains can be significantly faster compared to aggregations that do not offload especially when there are a moderate number of summary groupings. For single table queries, the aggregation operation should consistently offload.

Oracle Big Data SQL cells support single table and multi-table aggregations (for example, dimension tables joining to a fact table). For multi-table aggregations, the Oracle Database uses the key vector transform optimization in which the key vectors are pushed to the cells for the aggregation process. This transformation type is useful for star join SQL queries that use typical aggregation operators (for example, SUM, MIN, MAX, and COUNT) which are common in business queries.

A vector transformation query is a more efficient query that uses bloom filter for joins. When you use a vector transformed query with Oracle Big Data SQL Cells, the performance of joins in the query is enhanced by the ability to offload filtration for rows used for aggregation. You see a “KEY VECTOR USE” operation in the query plan during this optimization.

In Oracle Big Data SQL cells, vector transformed queries benefit from more efficient processing due to the application of group-by columns (key vectors) to the Oracle Big Data SQL Storage Index.

You may not see the benefit of aggregation offload in certain instances:
  • Missing predicate

    If the SYS_OP_VECTOR_GROUP_BY predicate is missing in the explain plan, aggregation offload is affected. The predicate can be missing due to the following reasons:
    • Presence of a disallowed intervening row source between the table scan and group-by row sources.

    • The table scan does not produce rowsets.

    • Presence of an expression or data type in the query that can not be offloaded.

    • Vector group-by is manually disabled.

    • The table of table scan or configuration does not expect gains from aggregation offload.

  • Missing smart scan

    The cell interconnect bytes returned by XT smart scan and cell XT granules requested for predicate offload statistics must be available.

  • Missing key vectors

    The limit on the data transmitted to the cells is 1 MB. If this threshold is exceeded, then queries can benefit from intelligent key vector filtering but not necessarily offloaded aggregation. This condition is known as Key Vector Lite mode. Due to their large size, some of the key vectors are not fully offloaded. They get offloaded in lite mode along with the key vectors that do not support aggregation offload. Key vectors are not completely serialized in lite mode. The vector group-by offload is disabled when key vectors are offloaded in lite mode.

See Also:

Oracle Database In-Memory Guide for information about how aggregation works in Oracle Database