8 Optimizing In-Memory Joins
Database In-Memory supports multiple features to improve the performance of In-Memory joins.
About In-Memory Joins
Joins are an integral part of data warehousing workloads. The IM column store enhances the performance of joins when the tables being joined are stored in memory.
Because of faster scan and join processing, complex multitable joins and simple joins that use Bloom filters benefit from the IM column store. In a data warehousing environment, the most frequently-used joins involved a fact table and one or more dimension tables.
The following joins run faster when the tables are populated in the IM column store:
-
Joins that are amenable to using Bloom filters
-
Joins of multiple small dimension tables with one fact table
-
Joins between two tables that have a primary key-foreign key relationship
Optimizing Joins with Join Groups
You can optimizing In-Memory joins by creating join groups using the CREATE INMEMORY JOIN GROUP
statement.
Tip:
Automatic In-Memory Management AIM now automatically analyzes a workload to determine where a join group may be beneficial. It then creates the join group, monitors its impact on performance, and reverts change there is if no improvement. AIM is enabled by default.About Join Groups
A join group is a group of between 1 and 255 columns that are frequently joined.
The table set for the join group includes one or more internal tables. External tables are not supported. When the IM column store is enabled, the database can use join groups to optimize joins of populated tables.
The columns in the join group can be in the same or different tables. For example, if the sales
and times
tables frequently join on the time_id
column, then you might create a join group for (times(time_id), sales(time_id))
. If the employees
table often joins to itself on the employee_id
column, then you could create the join group (employees(employee_id))
.
Note:
The same column cannot be a member of multiple join groups.
When you create a join group, the database invalidates the current In-Memory contents of the tables referenced in the join group. Subsequent repopulation causes the database to re-encode the IMCUs of the tables with the common dictionary. For this reason, Oracle recommends that you first create the join group, and then populate the tables.
Create join groups using the CREATE INMEMORY JOIN GROUP
statement. To add columns to or drop columns from a join group, use an ALTER INMEMORY JOIN GROUP
statement. Drop a join group using the DROP INMEMORY JOIN GROUP
statement.
Note:
In Oracle Active Data Guard, a standby database ignores join group definitions. A standby database does not use common dictionaries, and executes queries as if join groups did not exist.
Example 8-1 Creating a Join Group
This example creates a join group named deptid_jg
that includes the department_id
column in the hr.employees
and hr.departments
tables.
CREATE INMEMORY JOIN GROUP deptid_jg (hr.employees(department_id),hr.departments(department_id));
Purpose of Join Groups
In certain queries, join groups eliminate the performance overhead of decompressing and hashing column values.
Without join groups, if the optimizer uses a hash join but cannot use a Bloom filter, or if the Bloom filter does not filter rows effectively, then the database must decompress IMCUs and use an expensive hash join. To illustrate the problem, assume a star schema has a sales
fact table and a vehicles
dimension table. The following query joins these tables, but does not filter the output, which means that the database cannot use a Bloom filter:
SELECT v.year, v.name, s.sales_price
FROM vehicles v, sales s
WHERE v.name = s.name;
The following figure illustrates how the database joins the two data sets.
The database performs a hash join as follows:
-
Scans the
vehicles
table, decompresses the rows that satisfy the predicate (in this case, all rows satisfy the predicate because no filters exist), and sends the rows to the hash join -
Builds a hash table in the PGA based on the decompressed rows
-
Scans the
sales
table and applies any filters (in this case, the query does not specify filters) -
Processes matching rows from the IMCUs and then sends the rows to the join
When the hash join can consume row sets from the probe side (in this case, the
sales
table), the row sets sent by the table scan are in compressed form. Depending on whether the local dictionary or join group is leveraged to find matching rows from the build side, the hash join either decompresses the rows or leaves them uncompressed. -
Probes the hash table using the join column, which in this case is the vehicle name
If a join group exists on the v.name
and s.name
columns, then the database can make the preceding steps more efficient, eliminating the decompression and filtering overhead. The benefits of join groups are:
-
The database operates on compressed data.
-
The database avoids hashing on the join key and probing the hash table, which requires comparing the join keys of the probe rows and hashed rows.
When a join group exists, the database stores codes for each join column value in a common dictionary. The database builds a join group array using dictionary codes. Every array element points to a build-side row stored in the hash area (typically, PGA memory). During the probe, each probe row has a code associated with the join key. The database uses this code to search the array to determine whether a pointer exists in the array element. If a pointer exists, then there is a match; otherwise, there is no match.
-
The dictionary codes are dense and have a fixed length, which makes them space efficient.
-
Optimizing a query with a join group is sometimes possible when it is not possible to use a Bloom filter.
How Join Groups Work
In a join group, the database compresses all columns in the join group using the same common dictionary.
How a Join Group Uses a Common Dictionary
A common dictionary is a table-level set of dictionary codes.
A common dictionary is instance-specific on single-instance Oracle databases and on Oracle RAC One Node databases. On an Oracle RAC cluster it is a global dictionary distributed across all nodes of the cluster.
The database automatically creates a common dictionary in the IM column store when a join group is defined on the underlying columns. The common dictionary enables the join columns to share the same dictionary codes.
A common dictionary provides the following benefits:
-
Encodes the values in the local dictionaries with codes from the common dictionary, which provides compression and increases the cache efficiency of the IMCU
-
Enables joins to use dictionary codes to construct and probe the data structures used during hash joins
-
Enables the optimizer to obtain statistics such as cardinality, distribution of column values, and so on
The following figure illustrates a common dictionary that corresponds to a join group created on the sales.name
and vehicles.name
columns.
Figure 8-2 Common Dictionary for a Join Group

Description of "Figure 8-2 Common Dictionary for a Join Group"
When the database uses a common dictionary, the local dictionary for each CU does not store the original values: AUDI
, BMW
, CADILLAC
, FORD
, and so on. Instead, the local dictionary stores references to the values stored in the common dictionary. For example, the local dictionary might store the value 101
for Audi
and 220
for BMW
. The common dictionary might store the value 0
for Audi
and 1
for BMW
. The 101
(AUDI) in the local dictionary is a pointer to the 0
(AUDI) in the common dictionary.
How a Join Group Optimizes Scans
The key optimization is joining on common dictionary codes instead of column values, thereby avoiding the use of a hash table for the join.
Consider the following query, which uses a join group to join vehicles
and sales
on the name
column:
SELECT v.year, v.name, s.sales_price
FROM vehicles v, sales s
WHERE v.name = s.name
AND v.name IN ('Audi', 'BMW', 'Porsche', 'VW');
The following figure illustrates how the join benefits from the common dictionary created on the join group.
As illustrated in the preceding diagram, the database performs a hash join on the compressed data as follows:
-
Scans the
vehicles
table, and sends the dictionary codes (not the original column values) to the hash join:0
(Audi),1
(BMW),2
(Cadillac), and so on -
Builds an array of distinct common dictionary codes in the PGA
-
Scans the
sales
table and applies any filters (in this case, the filter is for German cars only) -
Sends matching rows to the join in compressed format
-
Looks up corresponding values in the array rather than probing a hash table, thus avoiding the need to compute a hash function on the join key columns
In this example, the vehicles
table has only seven rows. The vehicles.name
column has the following values:
Audi
BMW
Cadillac
Ford
Porsche
Tesla
VW
The common dictionary assigns a dictionary code to each distinct value. Conceptually, the common dictionary looks as follows:
Audi 0
BMW 1
Cadillac 2
Ford 3
Porsche 4
Tesla 5
VW 6
The database scans vehicles.name
, starting at the first dictionary code in the first IMCU and ending at the last code in the last IMCU. It stores a 1
for every row that matches the filter (German cars only), and 0
for every row that does not match the filter. Conceptually, the array might look as follows:
array[0]: 1
array[1]: 1
array[2]: 0
array[3]: 0
array[4]: 1
array[5]: 0
array[6]: 1
The database now scans the sales
fact table. To simplify the example, assume that the sales
table only has 6 rows. The database scans the rows as follows (the common dictionary code for each value is shown in parentheses):
Cadillac (2)
Cadillac (2)
BMW (1)
Ford (3)
Audi (0)
Tesla (5)
The database then proceeds through the vehicles.name
array, looking for matches. If a row matches, then the database sends the matching row with its associated common dictionary code, and retrieves the corresponding column value from the vehicles.name
and sales.name
IMCUs:
2 -> array[2] is 0, so no join
2 -> array[2] is 0, so no join
1 -> array[1] is 1, so join
3 -> array[3] is 0, so no join
0 -> array[0] is 1, so join
5 -> array[5] is 0, so no join
When a Hash Join Uses Common Dictionary Encodings
Joins on columns in a join group typically see a performance benefit.
At join group creation, the database does the following:
-
Caches the hash of the dictionary values for the join key columns
-
Caches the binary representation of the
NUMBER
data for the join key columns -
Encodes columns with the same common dictionary
A join on columns in a join group always uses the first two optimizations to improve performance. For example, if the optimizer chooses a hash join, then the query uses the cached hash values to probe the bloom filter. If the query uses an IM aggregation join, then the query uses the cached binary number to index into the key vector.
A hash join may or may not use dictionary encodings. When dictionary encodings are present in at least one column of the hash join, the query can leverage the encodings in the following ways:
-
Join group-aware hash join
Both columns in the hash join carry common dictionary encoding data during runtime. The execution plan must show either a parallel hash join plan without any distribution involved from both sides of the hash join, or a serial hash join plan.
-
Encoding-aware hash join
One fact table column in the hash join carries dictionary encoding data during runtime. The execution plan must show either a parallel hash join without any distribution from the right side of the hash join, or a serial hash join plan. In some cases, if the common dictionary has good compression ratio, and if a parallel hash join plan cannot leverage a join group-aware hash join (for example, in a parallel broadcast-none plan), then the query can use an encoding-aware hash join for the common dictionary.
In a SQL Monitor report, the following fields show dictionary usage:
Columnar Encodings Observed
, and Columnar Encodings
Leveraged
. The statistics are cumulative. In a parallel hash join, the
fields summarize statistics collected from all child processes involved in executing a
row source. In the context of the local dictionary in an IMCU, the statistics show the
number of encoding IDs observed from the right child row source and the number of
encodings leveraged by the join. If a hash join on a single process leverages the common
dictionary, then Columnar Encodings Leveraged
shows the number of
encodings leveraged in the join.
The following table indicates the possible values for Columnar Encodings Observed
and Columnar Encodings Leveraged
, and what the combinations mean.
Table 8-1 Join Group Usage in a SQL Monitor Report
Columnar Encodings Observed | Columnar Encodings Leveraged | Encoding-Aware Hash Join Used? | Join Group-Aware Hash Join Used? |
---|---|---|---|
Not present |
Not present |
No |
No |
Positive value |
Not present |
No |
No |
Positive value |
Positive value |
Yes |
No |
Not present |
Positive value |
No |
Yes |
For example, if the report shows that the Columnar Encodings Leveraged
field is 4
(for example, because the parallel degree is 4) but the Columnar Encodings Observed
field is absent, then the query leveraged the join group for the hash join. If the Columnar Encodings Observed
field is 4
but the Columnar Encodings Leveraged
field is absent, then dictionary encodings existed, but the query did not use them.
Various factors can prevent a query from engaging an encoding-aware hash join. Factors include the following:
-
The compression ratio of the common dictionary is suboptimal.
-
The query observes too many row sets passed from the table scan without a common dictionary.
-
The build-side row length is too large.
-
The build-side rows cannot fit into PGA memory.
-
The build side has duplicate join keys.
See Also:
Creating Join Groups
Define join groups using the CREATE INMEMORY JOIN GROUP
statement.
Candidates for join groups are columns that are frequently paired in a join predicate. Typical examples include a column joining a fact and dimension table, or a column joining a table to itself.
The CREATE INMEMORY JOIN GROUP
statement immediately defines a join group, which means that its metadata is visible in the data dictionary. The database does not immediately construct the common dictionary. Rather, the database builds the common dictionary the next time that a table referenced in the join group is populated or repopulated in the IM column store.
Guidelines
Creating, modifying, or dropping a join group typically invalidates all the underlying tables referenced in the join group. Thus, Oracle recommends that you create join groups before initially populating the tables.
To create a join group:
-
In SQL*Plus or SQL Developer, log in to the database as a user with the necessary privileges.
-
Create a join group by using a statement in the following form:
CREATE INMEMORY JOIN GROUP join_group_name ( table1(col1), table2(col2) );
For example, the following statement creates a join group named
sales_products_jg
:CREATE INMEMORY JOIN GROUP sales_products_jg (sales(prod_id), products(prod_id));
-
Optionally, view the join group definition by querying the data dictionary (sample output included):
COL JOINGROUP_NAME FORMAT a18 COL TABLE_NAME FORMAT a8 COL COLUMN_NAME FORMAT a7 SELECT JOINGROUP_NAME, TABLE_NAME, COLUMN_NAME, GD_ADDRESS FROM DBA_JOINGROUPS; JOINGROUP_NAME TABLE_NA COLUMN_ GD_ADDRESS ------------------ -------- ------- ---------------- SALES_PRODUCTS_JG SALES PROD_ID 00000000A142AE50 SALES_PRODUCTS_JG PRODUCTS PROD_ID 00000000A142AE50
-
Populate the tables referenced in the join group, or repopulate them if they are currently populated.
Example 8-2 Optimizing a Query Using a Join Group
In this example, you log in to the database as SYSTEM
, and then create a join group on the prod_id
column of sales
and products
, which are not yet populated in the IM column store:
CREATE INMEMORY JOIN GROUP
sh.sales_products_jg (sh.sales(prod_id), sh.products(prod_id));
You enable the sh.sales
and sh.products
tables for population in the IM column store:
ALTER TABLE sh.sales INMEMORY;
ALTER TABLE sh.products INMEMORY;
The following query indicates the tables are not yet populated in the IM column store (sample output included):
COL OWNER FORMAT a3
COL NAME FORMAT a10
COL STATUS FORMAT a20
SELECT OWNER, SEGMENT_NAME NAME,
POPULATE_STATUS STATUS
FROM V$IM_SEGMENTS;
no rows selected
Query both tables to populate them in the IM column store:
SELECT /*+ FULL(s) NO_PARALLEL(s) */ COUNT(*) FROM sh.sales s;
SELECT /*+ FULL(p) NO_PARALLEL(p) */ COUNT(*) FROM sh.products p;
The following query indicates the tables are now populated in the IM column store (sample output included):
COL OWNER FORMAT a3
COL NAME FORMAT a10
COL PARTITION FORMAT a13
COL STATUS FORMAT a20
SELECT OWNER, SEGMENT_NAME NAME, PARTITION_NAME PARTITION,
POPULATE_STATUS STATUS, BYTES_NOT_POPULATED
FROM V$IM_SEGMENTS;
OWN NAME PARTITION STATUS BYTES_NOT_POPULATED
--- ---------- ------------- -------------------- -------------------
SH SALES SALES_Q3_1998 COMPLETED 0
SH SALES SALES_Q4_2001 COMPLETED 0
SH SALES SALES_Q4_1999 COMPLETED 0
SH PRODUCTS COMPLETED 0
SH SALES SALES_Q1_2001 COMPLETED 0
SH SALES SALES_Q1_1999 COMPLETED 0
SH SALES SALES_Q2_2000 COMPLETED 0
SH SALES SALES_Q2_1998 COMPLETED 0
SH SALES SALES_Q3_2001 COMPLETED 0
SH SALES SALES_Q3_1999 COMPLETED 0
SH SALES SALES_Q4_2000 COMPLETED 0
SH SALES SALES_Q4_1998 COMPLETED 0
SH SALES SALES_Q1_2000 COMPLETED 0
SH SALES SALES_Q1_1998 COMPLETED 0
SH SALES SALES_Q2_2001 COMPLETED 0
SH SALES SALES_Q2_1999 COMPLETED 0
SH SALES SALES_Q3_2000 COMPLETED 0
Query DBA_JOINGROUPS
to get information about the join group (sample output included):
COL JOINGROUP_NAME FORMAT a18
COL TABLE_NAME FORMAT a8
COL COLUMN_NAME FORMAT a7
SELECT JOINGROUP_NAME, TABLE_NAME, COLUMN_NAME, GD_ADDRESS
FROM DBA_JOINGROUPS;
JOINGROUP_NAME TABLE_NA COLUMN_ GD_ADDRESS
------------------ -------- ------- ----------------
SALES_PRODUCTS_JG SALES PROD_ID 00000000A142AE50
SALES_PRODUCTS_JG PRODUCTS PROD_ID 00000000A142AE50
The preceding output shows that the join group sales_products_jg
joins on the same common dictionary address.
See Also:
-
Oracle Database SQL Language Reference to learn about the
CREATE INMEMORY JOIN GROUP
statement -
Oracle Database Reference to learn about the
DBA_JOINGROUPS
view
Monitoring Join Group Usage
To determine whether queries are using the join group, you can use either a graphical SQL Monitor report (recommended) or a SQL query that uses the DBMS_SQLTUNE.REPORT_SQL_MONITOR_XML
function.
"When a Hash Join Uses Common Dictionary Encodings" explains how to interpret the SQL Monitor output.
Prerequisites
To monitor join groups, you must meet the following prerequisites:
-
A join group must exist.
-
The columns referenced by the join group must have been populated after join group creation.
-
You must execute a join query that could potentially use the join group.
To monitor join group usage:
-
Log in to the database as a user with the necessary privileges.
-
Create a SQL*Plus variable to store the SQL ID as follows:
VAR b_sqlid VARCHAR2(13)
-
Execute a query that joins on the columns in the join group.
-
Use either following techniques:
-
Graphical SQL Monitor Report
SQL Monitor reports are available in Enterprise Manager. In SQL*Plus, you can use
DBMS_SQL_MONITOR.REPORT_SQL_MONITOR
to generate a SQL Monitor report as follows:SET TRIMSPOOL ON SET TRIM ON SET PAGES 0 SET LINESIZE 1000 SET LONG 1000000 SET LONGCHUNKSIZE 1000000 SPOOL /tmp/long_sql.htm SELECT DBMS_SQL_MONITOR.REPORT_SQL_MONITOR( sql_id => :b_sqlid, report_level => 'ALL', TYPE => 'active') FROM DUAL; SPOOL OFF
Access the report in a browser, and then click the binoculars icon on the hash join to view the join group statistics.
-
Command-Line Query
Use the
DBMS_SQLTUNE.REPORT_SQL_MONITOR_XML
function in a query, as shown in the following example:SELECT encoding_hj.rowsource_id row_source_id, CASE WHEN encoding_hj.encodings_observed IS NULL AND encoding_hj.encodings_leveraged IS NOT NULL THEN 'join group was leveraged on ' || encoding_hj.encodings_leveraged || ' processes' ELSE 'join group was NOT leveraged' END columnar_encoding_usage_info FROM (SELECT DBMS_SQLTUNE.REPORT_SQL_MONITOR_XML(session_id=>-1,sql_id=>:b_sqlid). EXTRACT(q'#//operation[@name='HASH JOIN' and @parent_id]#') xmldata FROM DUAL) hj_operation_data, XMLTABLE('/operation' PASSING hj_operation_data.xmldata COLUMNS "ROWSOURCE_ID" NUMBER PATH '@id', "ENCODINGS_LEVERAGED" NUMBER PATH 'rwsstats/stat[@id="9"]', "ENCODINGS_OBSERVED" NUMBER PATH 'rwsstats/stat[@id="10"]') encoding_hj;
-
Monitoring Join Groups Using a SQL Monitor Report: Example
Your goal is to use a graphical SQL Monitor report to determine whether a query leveraged a join group.
In this example, you create a join group on the prod_id
columns of sh.products
and sh.sales
tables, and then join these tables on this column. You grant the sh
account administrative privileges.
Example 8-3 Monitoring a Join Group Using a SQL Monitor Report
-
In SQL*Plus, log in to the database as user
sh
. -
Create a SQL*Plus variable to store the SQL ID as follows:
VAR b_sqlid VARCHAR2(13)
-
Apply the
INMEMORY
attribute to thesh.products
andsh.sales
tables as follows:ALTER TABLE sales NO INMEMORY; ALTER TABLE products NO INMEMORY; ALTER TABLE sales INMEMORY MEMCOMPRESS FOR QUERY; ALTER TABLE products INMEMORY MEMCOMPRESS FOR QUERY;
-
Create a join group on
prod_id
:CREATE INMEMORY JOIN GROUP jgrp_products_sales (products(prod_id), sales(prod_id));
-
Scan the tables to populate them in the IM column store:
SELECT /*+ FULL(s) */ COUNT(*) FROM sales s; SELECT /*+ FULL(p) */ COUNT(*) FROM products p;
-
Execute a query that joins on the
prod_id
column, and then aggregates product sales:SELECT /*+ USE_HASH(sales) LEADING(products sales) MONITOR */ products.prod_id, products.prod_category_id, SUM(sales.amount_sold) FROM products, sales WHERE products.prod_id = sales.prod_id GROUP BY products.prod_category_id, products.prod_id;
-
Generate an HTML-based SQL Monitor report by using
DBMS_SQLTUNE.REPORT_SQL_MONITOR
.For example, create a SQL script with the following contents, and run it in SQL*Plus:
SET TRIMSPOOL ON SET TRIM ON SET PAGES 0 SET LINESIZE 1000 SET LONG 1000000 SET LONGCHUNKSIZE 1000000 SPOOL /tmp/jg_report.htm SELECT DBMS_SQL_MONITOR.REPORT_SQL_MONITOR( sql_id => :b_sqlid, report_level => 'ALL', TYPE => 'active') FROM DUAL; SPOOL OFF
-
Open the HTML report in a browser.
The following sample report shows the execution plan for the join. The binoculars in the hash join open a window that shows additional statistics.
Figure 8-4 Monitored SQL Execution Details Page
Description of "Figure 8-4 Monitored SQL Execution Details Page" -
Click the binoculars icon to open a window that shows join group statistics.
The following sample window shows the statistics:
Because
Columnar Encodings Leveraged
is a positive value andColumnar Encodings Observed
is not present, the join group was leveraged. -
Optionally, clean up after the example:
DROP INMEMORY JOIN GROUP jgrp_products_sales; ALTER TABLE sales NO INMEMORY; ALTER TABLE products NO INMEMORY;
See Also:
-
Oracle Database PL/SQL Packages and Types Reference to learn about the
DBMS_SQLTUNE.REPORT_SQL_MONITOR_XML
function -
Oracle Database Reference to learn about the
V$SESSION
view
Monitoring Join Groups from the Command Line: Example
Your goal is to use command-line tools to determine whether a query leveraged a join group.
In this example, you create a join group on the prod_id
columns of sh.products
and sh.sales
tables, and then join these tables on this column. You grant the sh
account administrative privileges.
Example 8-4 Monitoring a Join Group from the Command Line
-
Log in to the database as
sh
. -
Create a SQL*Plus variable to store the SQL ID as follows:
VAR b_sqlid VARCHAR2(13)
-
Apply the
INMEMORY
attribute to thesh.products
andsh.sales
tables as follows:ALTER TABLE sales NO INMEMORY; ALTER TABLE products NO INMEMORY; ALTER TABLE sales INMEMORY MEMCOMPRESS FOR QUERY; ALTER TABLE products INMEMORY MEMCOMPRESS FOR QUERY;
-
Create a join group on
prod_id
:CREATE INMEMORY JOIN GROUP jgrp_products_sales (products(prod_id), sales(prod_id));
-
Scan the tables to populate them in the IM column store:
SELECT /*+ FULL(s) */ COUNT(*) FROM sales s; SELECT /*+ FULL(p) */ COUNT(*) FROM products p;
-
Execute a query that joins on the
prod_id
column, and then aggregates product sales:SELECT /*+ USE_HASH(sales) LEADING(products sales) MONITOR */ products.prod_id, products.prod_category_id, SUM(sales.amount_sold) FROM products, sales WHERE products.prod_id = sales.prod_id GROUP BY products.prod_category_id, products.prod_id;
-
Obtain the SQL ID of the preceding aggregation query:
BEGIN SELECT PREV_SQL_ID INTO :b_sqlid FROM V$SESSION WHERE SID=USERENV('SID'); END;
-
Use
DBMS_SQLTUNE.REPORT_SQL_MONITOR_XML
to determine whether the database used the join group.For example, execute the following query:
COL row_source_id FORMAT 999 COL columnar_encoding_usage_info FORMAT A40 SELECT encoding_hj.rowsource_id row_source_id, CASE WHEN encoding_hj.encodings_observed IS NULL AND encoding_hj.encodings_leveraged IS NOT NULL THEN 'join group was leveraged on ' || encoding_hj.encodings_leveraged || ' processes' ELSE 'join group was NOT leveraged' END columnar_encoding_usage_info FROM (SELECT DBMS_SQLTUNE.REPORT_SQL_MONITOR_XML(session_id=>-1,sql_id=>:b_sqlid). EXTRACT(q'#//operation[@name='HASH JOIN' and @parent_id]#') xmldata FROM DUAL ) hj_operation_data, XMLTABLE('/operation' PASSING hj_operation_data.xmldata COLUMNS "ROWSOURCE_ID" NUMBER PATH '@id', "ENCODINGS_LEVERAGED" NUMBER PATH 'rwsstats/stat[@id="9"]', "ENCODINGS_OBSERVED" NUMBER PATH 'rwsstats/stat[@id="10"]' ) encoding_hj;
The following sample output shows that the join group was leveraged in the query:
ROW_SOURCE_ID COLUMNAR_ENCODING_USAGE_INFO ------------- ---------------------------------------- 2 join group was leveraged on 1 processes
-
Optionally, clean up after the example:
DROP INMEMORY JOIN GROUP jgrp_products_sales; ALTER TABLE sales NO INMEMORY; ALTER TABLE products NO INMEMORY;
See Also:
-
Oracle Database PL/SQL Packages and Types Reference to learn about the
DBMS_SQLTUNE.REPORT_SQL_MONITOR_XML
function -
Oracle Database Reference to learn about the
V$SESSION
view
Leveraging Join Groups on a RAC Cluster
RAC now supports dictionaries that are global across the RAC cluster.
A common dictionary is instance-specific on single-instance Oracle databases and on Oracle RAC One Node databases. On these systems, join group aware hash joins are only enabled for joins that do not involve row distribution.
An Oracle RAC cluster supports a distributed, global dictionary that is
shared across instances in a RAC cluster. This enables you to leverage join group aware
hash joins for distributed joins on RAC. When you issue a CREATE INMEMORY JOIN
GROUP
statement on a RAC database, a distributed RAC-level global
dictionary is created as part of the object population.
Consider the example below. A join group is created on the columns
sales.dealer_key
and dealership.dealership_key
,
which are used as join keys between the two tables. When you create an in-memory join
group comprising these tables, a common RAC-level global dictionary is created for these
columns. As you can see below, you do not need to do anything differently in order to
benefit from this enhancement.
create inmemory join group jg1 (sales(dealer_key), dealership(deadlership_key));
Queries that access these columns can then take advantage of the join group.
select sum(revenue) from sales s, dealership d where s.dealership_key = d.dealership_key;
Distributed RAC-level global dictionaries can enhance query performance. Performance runs with join group aware hash joins leveraging RAC-level global dictionaries demonstrate significant gains on the SSB (Star Schema Benchmark).
Optimizing Joins Using In-Memory Deep Vectorization
In-Memory deep vectorization can optimize complex SQL operators by pipelining the physical operators inside each SQL operator and vectorizing them using SIMD techniques. This feature is enabled by default.
About In-Memory Deep Vectorization
In-Memory deep vectorization is a SIMD-based framework that supports vectorization for higher-level query operators in the query plan. The framework includes optimizations such as SIMD, hardware acceleration, and pipelined execution.
The In-Memory vectorized joins feature is key to the deep vectorization framework. Using SIMD vector processing, the framework optimizes aspects of hash joins such as hashing, building, probing, and gathering. This optimization can improve the performance of join processing by 100% or more.
The In-Memory vectorized joins feature is transparent to the user, requiring no plan changes. The optimization is enabled by default, but you can disable it by setting the INMEMORY_DEEP_VECTORIZATION
initialization parameter to false
.
See Also:
-
"CPU Architecture: SIMD Vector Processing" to learn more about SIMD vectorization
-
Oracle Database SQL Tuning Guide to learn more about hash joins
How In-Memory Deep Vectorization Works
The In-Memory deep vectorization framework deconstructs high-level, complex SQL operators such as hash joins into smaller kernel-sized units.
The deconstructed kernels are suitable for SIMD vectorization techniques. The database executes the kernels in a pipelined fashion to accelerate the overall operation.
See Also:
"CPU Architecture: SIMD Vector Processing" to learn more about SIMD vectorization
How an In-Memory Vectorized Join Works
The vectorized joins feature is a key aspect of the In-Memory deep vectorization framework.
An In-Memory vectorized join works as follows:
-
At run time, the database decides whether a hash join would benefit from In-Memory deep vectorization. If so, the database proceeds to the next step.
Note:
Because selection of the vectorized joins operation occurs at runtime, the execution plan does not show the optimization.
-
The database pushes down join processing to the scan operators for evaluation directly on In-Memory columnar data formats.
-
The operation uses a SIMD-optimized hash table data structure instead of a traditional hash table.
-
The database determines matched rows from the left and right side of the join, and sends them back to the parent SQL operator using vectorization techniques.
An In-Memory vectorized join may leverage In-Memory features such as the following:
-
Join groups
If a join group is declared, then join processing with deep vectorization may be significantly faster.
-
IM dynamic scans
Join processing is parallelized further in scan operators using lightweight threads.
-
Aggregation pushdown
Aggregations on top of joins are vectorized with the join operation.
-
In-Memory columnar compression formats
The vectorized joins feature heavily leverages columnar data formats.
User Interface for Deep Vectorization
The deep vectorization framework is enabled when the INMEMORY_DEEP_VECTORIZATION
initialization parameter is true
, which is the default value.
You can use SQL Monitor to determine whether a query used a vectorized join. In a SQL Monitor report, click the binoculars icon next to the HASH JOIN
operation in the Information column. If DeepVec Hash Joins
has the value 1
, then the query used deep vectorization; otherwise, the query did not use it.
Query Types Supported by the Deep Vectorization Framework
Queries containing the following are supported by the In-Memory Deep Vectorization Framework:
- Multiple join key columns
- Semi join and outer join
- These function-based aggregations:
Aggregations are also fully supported. This includes:
- Columns from the build side, or combination of build and probe sides.
- Simple expressions grouping columns or aggregates.
In-Memory Vectorized Join: Example
This example indicates how a hash join benefits from deep vectorization.
In this example, a customers
and orders
table exist, and INMEMORY_DEEP_VECTORIZATION
is currently set to false
. Your goal is to count the orders within a single nation, making use of the deep vectorization optimizations if possible.
-
Log in to the database as administrative user
DB1
. -
Using
ALTER SESSION
, set theINMEMORY_DEEP_VECTORIZATION
initialization parameter toENABLE
. -
Join the
tpch.customer
andtpch.orders
tables as follows, filtering on the value21
in thetpch.customer.c_nationkey
column:SELECT /*+monitor */ COUNT(*) FROM tpch.customer c, tpch.orders o WHERE c.c_custkey = o.o_custkey AND c.c_nationkey = 21
-
To create a SQL Monitor report in HTML, pass the SQL ID to
DBMS_SQL_MONITOR.REPORT_SQL_MONITOR
.See "Monitoring Join Group Usage" for an example showing how to generate the report.
-
Open the SQL Monitor report in a browser.
The overview section of the report appears below.
-
In the Details section of the report, find the
HASH JOIN
operation, and then click the binoculars icon. -
In the Other Information window, look for
DeepVec Hash Joins
. Because the value is1
, the database used an In-Memory vectorized join.Note that
DeepVec Hash Join Flags
is an internal statistic that indicates which optimizations deep vectorization employed.
In-Memory Deep Vectorization for Multi-Level Joins
In-Memory Deep Vectorization enhancements now enable support for complex queries performing multi-table joins and aggregations.
Note:
Hash Hash hash join plans are not currently supported.As with other aspects of In-Memory Deep Vectorization , vectorization for multi-level joins is transparent to the user, requiring no plan changes.
Multiple join key columns in joins like the following are supported.
select * from build, probe
where build.key1 = probe.key1
and build.key2 = probe.key2
and …
select orders_order_id, avg(lineitem_extendedprice)
from orders, lineitem
where orders.zipcode = lineitem.zipcode
and orders.city = lineitem.city
group by orders.order_id;
Types of Multi-Level Joins Supported by the Optimizer
There are two types of multi-level joins generated by the optimizer:
- Left Deep Joins
The result of each join node feeds the Build side of its parent join
- Right Deep Join
The result of each join node feeds the Probe side of its parent join

Description of the illustration inmem_multileveljoins.eps