10.1 Optimizing Queries with Statistics
Query optimization with statistics uses the collected statistics on the tables and indexes in a query to select an execution plan that can process the query in the most efficient manner. As a general rule, Oracle recommends that you collect statistics on your base table if you are interested in improving your query performance. Optimizing with statistics enables a more accurate estimation of the selectivity and costs of the CONTAINS
predicate and thus a better execution plan.
The optimizer attempts to choose the best execution plan based on the following parameters:
-
The selectivity on the
CONTAINS
predicate -
The selectivity of other predicates in the query
-
The CPU and I/O costs of processing the
CONTAINS
predicates
The following topics discuss how to use statistics with the extensible query optimizer:
Note:
Importing and exporting statistics on domain indexes, including Oracle Text indexes, is not supported with the DBMS_STATS
package. For more information on importing and exporting statistics, see the Oracle Database PL/SQL Packages and Types Reference.
See Also:
Oracle Text Reference for information on the CONTAINS
query operator
10.1.1 Collecting Statistics
By default, Oracle Text uses the cost-based optimizer (CBO) to determine the best execution plan for a query.
To enable the optimizer to better estimate costs, calculate the statistics on the table you queried table:
ANALYZE TABLE <table_name> COMPUTE STATISTICS;
Alternatively, estimate the statistics on a sample of the table:
ANALYZE TABLE <table_name> ESTIMATE STATISTICS 1000 ROWS;
or
ANALYZE TABLE <table_name> ESTIMATE STATISTICS 50 PERCENT;
You can also collect statistics in parallel with the DBMS_STATS.GATHER_TABLE_STATS
procedure:
begin DBMS_STATS.GATHER_TABLE_STATS('owner', 'table_name', estimate_percent=>50, block_sample=>TRUE, degree=>4) ; end ;
These statements collect statistics on all objects associated with table_name,
including the table columns and any indexes (b-tree, bitmap, or Text domain) associated with the table.
To re-collect the statistics on a table, enter the ANALYZE
statement as many times as necessary or use the DBMS_STATS
package.
By collecting statistics on the Text domain index, the CBO in Oracle Database can perform the following tasks:
-
Estimate the selectivity of the
CONTAINS
predicate -
Estimate the I/O and CPU costs of using the Oracle Text index (that is, the cost of processing the
CONTAINS
predicate by using the domain index) -
Estimate the I/O and CPU costs of each invocation of
CONTAINS
Knowing the selectivity of a CONTAINS
predicate is useful for queries that contain more than one predicate, such as in structured queries. This way the CBO can better decide whether to use the domain index to evaluate CONTAINS
or to apply the CONTAINS
predicate as a post filter.
See Also:
-
Oracle Database SQL Language Reference for more information about the
ANALYZE
statement -
Oracle Database PL/SQL Packages and Types Reference for information about
DBMS_STATS
package
10.1.2 Query Optimization with Statistics Example
The following structured query provides an example for optimizing statistics:
select score(1) from tab where contains(txt, 'freedom', 1) > 0 and author = 'King' and year > 1960;
Assume the following:
-
The author column is of type
VARCHAR2
and the year column is of typeNUMBER.
-
A b-tree index on the
author
column. -
The structured
author
predicate is highly selective with respect to theCONTAINS
predicate and theyear
predicate. That is, the structured predicate (author = 'King') returns a much smaller number of rows with respect to theyear
andCONTAINS
predicates individually, say 5 rows returned versus 1000 and 1500 rows, respectively.
In this situation, Oracle Text can execute this query more efficiently by first scanning a b-tree index range on the structured predicate (author = 'King'), then accessing a table by rowid, and then applying the other two predicates to the rows returned from the b-tree table access.
Note:
When statistics are not collected for a Oracle Text index, the CBO assumes low selectivity and index costs for the CONTAINS
predicate.
10.1.3 Re-Collecting Statistics
After synchronizing your index, you can re-collect statistics on a single index to update the cost estimates.
If your base table was reanalyzed before the synchronization, it is sufficient to analyze the index after the synchronization without reanalyzing the entire table.
To re-collect statistics, enter one of the following statements:
ANALYZE INDEX <index_name> COMPUTE STATISTICS;
ANALYZE INDEX <index_name> ESTIMATE STATISTICS SAMPLE 50 PERCENT;