Database Analytics Functions
Database analytics functions enable you to perform advanced analysis and data mining analysis, for example, detecting anomalies, clustering data, sampling data, and affinity analysis. Analytics Functions are available when you connect to an Oracle database or Oracle Autonomous Data Warehouse.
To display the Database Analytics step in the data flow editor, you must connect to an Oracle database or Oracle Autonomous Data Warehouse.
Function Types | Description |
---|---|
Dynamic Anomaly Detection |
Detect anomalies in your input data without a pre-defined model. For example, you might want to highlight unusual financial transactions. When you deploy this function with large datasets, configure the partition columns to maximize performance. |
Dynamic Clustering |
Cluster your input data without a pre-defined model. For example, you might want to characterize and discover customer segments for marketing purposes. When you deploy this function with large datasets, configure the partition columns to maximize performance. |
Frequent Item Set |
Discover relationships in your data by identifying sets of items that often appear together. This data mining technique is also known as association rule learning, affinity analysis, or in the retail industry as market basket analysis. If you use frequent item set as a market basket analysis tool, you might find that customers who buy shampoo also buy hair conditioner. This operation is resource intensive and its performance depends on several factors, such as input dataset volume, cardinality of transaction id, and cardinality of Item value column. To avoid potential performance degradation on the database, try with a higher value of minimum support percent (default is 0.25) and gradually reduce it to accommodate more itemsets in your output. |
Sampling Data |
Selects a random sample percentage of data from a table. You simply specify the percentage of data you want to sample. For example, you might want to randomly sample ten percent of your data. |
Text Tokenization |
Analyze textual data by breaking it down into distinct words and counting the occurrences of each word. When you run your data flow, Oracle Analytics creates a table in the database named DR$IndexName$I, which contains the token text and the token count related details. Use the DR$IndexName$I table to create a dataset.
The database connection that you use for your data flow requires special database privileges. Check with your administrator that:
|
Time Series |
Time Series is a data mining technique that forecasts target value based on a known history of target values. The input to time series analysis is a sequence of target values. It provides estimates of the target value for each period of a time window that can include up to 30 periods beyond the historical data. The model also computes various statistics that measure the goodness of fit to historical data. These statistics are available as an additional output dataset via a parameter setting. Note: The Time Series algorithm is only available from Oracle database version 18c onwards. |
Un-pivoting Data |
Transpose data that's stored in columns into row format. For example, you might want to transpose multiple columns showing a revenue metric value for each year to a single revenue column with multiple value rows for the year dimension. You simply select the metric columns to transpose and specify a name for the new column.You get a new dataset with fewer columns and more rows. |
Note: To use analytic functions, make sure that the administrator has enabled analytics functions (see Console, Advanced System Settings, Performance and Compatibility, Enable Database Analytics Node in Data Flows).