2 Working with Oracle Stream Analytics

This topic applies only to Oracle user-managed services.

Oracle Stream Analytics provides various artifacts such as connections, references, streams, targets, cubes, dashboards, predictive models, and custom jars. These artifacts are the resources you use to create pipelines.

Home Page

The Home page is the first page that you see when you log in to Oracle Stream Analytics. This page lists the industry verticals that Oracle Stream Analytics supports.

Each industry vertical has a tag associated with it and the tags are case-sensitive.

  • Distributed Intelligence for IOT - Acquire, analyze, and act on high-volume, high-velocity data from sensors and devices both at the edge and in the data center in real time. Tag for this vertical is IOT.

  • Risk and Fraud Management - Leverage the industry's best stream processing platform to assess risk and prevent financial fraud in real time. Tag for this vertical is risk.

  • Transportation and Logistics - Manage fleets, track assets, and improve supply chain efficiencies by combining streaming data with Oracle's advanced spatial functions. Tag for this vertical is transportation.

  • Customer Experience and Consumer Analytics - Know the sentiment of your customers to reduce churn, improve loyalty, make offers, and attract customers in real time. Tag for this vertical is customer.

  • Telecommunications - Proactively monitor your networks, predict network failures, and prevent distributed denial-of-service attacks. Tag for this vertical is telecom.

  • Retail - Understand and apply instant retail shopping trends, instigate beneficial shelf-life patterns and placements, be responsive to customer cart utilization, and interoperate with advanced vending machines. Tag for this vertical is retail.

The Home page is as shown below:

Description of home_page.png follows
Description of the illustration home_page.png

You can navigate to the Catalog or the Patterns page from the home page to get started with Oracle Stream Analytics.

About the Catalog

The Catalog page is the location where resources including pipelines, streams, references, maps, connections, targets, dashboards, predictive models, custom jars, visualizations, and cubes are listed. This is the go-to place for you to perform any tasks in Oracle Stream Analytics.

You can mark a resource as a favorite in the Catalog by clicking on the Star icon. Click the icon again to remove it from your favorites. You can also delete a resource or view its topology using the menu icon to the right of the favorite icon.

The tags applied to items in the Catalog are also listed on the screen below the left navigation pane. You can click any of these tags to display only the items with that tag in the Catalog. The tag appears at the top of the screen. Click Clear All at the top of the screen to clear the filter and display all the items.

You can include or exclude pipelines, streams, references, predictive models, geo fences, connections, targets, custom jars, visualizations, dashboards, and cubes using the View All link in the left panel under Show Me. When you click View All, a check mark appears beside it and all the components are displayed in the Catalog.

When you want to display or view only a few or selective items in the Catalog, deselect View All and select the individual components. Only the selected components will appear in the Catalog.

Typical Workflow for Administering Oracle Stream Analytics

The typical workflow lists the artifacts required to create a pipeline in Oracle Stream Analytics.

The prerequisites for a pipeline are:

  • A connection is required to create a stream, except for a file stream.

  • A stream is required to create a pipeline.

Creating a Connection

To create a connection:
  1. Click Catalog in the left pane.
  2. From the Create New Item menu, select Connection.
  3. Provide details for the following fields on the Type Properties page and click Next:
    • Name — name of the connection

    • Description — description of the connection

    • Tags — tags you want to use for the connection

    • Connection Type — type of connection: Coherence, Database, Druid, JNDI, or Kafka

    Description of create_connection_type.png follows
    Description of the illustration create_connection_type.png

  4. Enter Connection Details on the next screen and click Save.

    When the connection type is Coherence:

    • Host name — the Coherence Extend Proxy Services TCP/IP Server Socket host

    • Port — the Coherence Extend Proxy Services TCP/IP Server Socket port

    When the connection type is Database:

    • Connect using — select the way you want to identify the database: SID or Service name

    • Service name/SID — the details of the service name or SID

    • Host name — the host name on which the database is running

    • Port — the port on which the database is running. Usually it is 1521

    • Username — the user name with which you connect to the database

    • Password — the password you use to log in to the database (see the JDBC sketch at the end of this topic)

    When the connection type is Druid, provide the Zookeeper URL.

    When the connection type is JNDI:

    • JNDI Provider — select the JNDI service provider

    • Server Url(s) — the server url(s) for the JNDI connection; for example: host1:port1, host2:port2

    • Username — the user name for authenticating the JNDI connection

    • Password — the password for the JNDI connection

    When the connection type is Kafka, provide the Zookeeper URL.

A connection with the specified details is created.
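
For reference, the Database connection fields above map onto a standard Oracle JDBC thin connect string. The following is a minimal, illustrative sketch only; the host, service name, and credentials are placeholder values, and Oracle Stream Analytics assembles the connection internally from the fields you enter:

import java.sql.Connection;
import java.sql.DriverManager;

public class DbConnectionCheck {
    public static void main(String[] args) throws Exception {
        // Requires the Oracle JDBC driver (ojdbc) on the classpath.
        // Service name form: jdbc:oracle:thin:@//<host>:<port>/<service_name>
        // SID form:          jdbc:oracle:thin:@<host>:<port>:<SID>
        String url = "jdbc:oracle:thin:@//dbhost.example.com:1521/ORCLPDB1"; // placeholder values

        try (Connection conn = DriverManager.getConnection(url, "osa_user", "welcome1")) {
            System.out.println("Connected: " + !conn.isClosed());
        }
    }
}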

Cache Configuration for Coherence

Oracle Stream Analytics requires a special Coherence cache configuration and proxy scheme so that it can connect to the Coherence cluster.

To enrich stream data with reference data from an external Coherence cluster, you must access the external cluster using Coherence Extend client APIs. To access the external cluster as a client, you need to configure cache-config with ExtendTcpCacheService and ExtendTcpInvocationService.

Configure the Coherence Cluster

Make sure that Coherence for Java is installed.

To configure the external cluster as client:

  1. Create an XML file named cache-config.xml.

  2. Copy the following XML to the file:

    <?xml version="1.0"?>
    
    <cache-config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns="http://xmlns.oracle.com/coherence/coherence-cache-config"
       xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-cache-config
       coherence-cache-config.xsd">
    		<caching-scheme-mapping>
    			<cache-mapping>
    				<cache-name>
    					externalcache*
    				</cache-name>
    				<scheme-name>
    					remote
    				</scheme-name>
    			</cache-mapping>
    		</caching-scheme-mapping>
    
    		<caching-schemes>
    			<remote-cache-scheme>
    				<scheme-name>
    					remote
    				</scheme-name>
    				<service-name>
    					ExtendTcpCacheService
    				</service-name>
    				<initiator-config>
    					<tcp-initiator>
    						<remote-addresses>
    							<socket-address>
    								<address>localhost</address>
    								<port>9099</port>
    							</socket-address>
    						</remote-addresses>
    					</tcp-initiator>
    					<outgoing-message-handler>
    						<request-timeout>5s</request-timeout>
    					</outgoing-message-handler>
    				</initiator-config>
    			</remote-cache-scheme>
    
    			<remote-invocation-scheme>
    				<scheme-name>extend-invocation</scheme-name>
    				<service-name>ExtendTcpInvocationService</service-name>
    				<initiator-config>
    					<tcp-initiator>
    						<remote-addresses>
    							<socket-address>
    								<address>localhost</address>
    								<port>9099</port>
    							</socket-address>
    						</remote-addresses>
    					</tcp-initiator>
    					<outgoing-message-handler>
    						<request-timeout>5s</request-timeout>
    					</outgoing-message-handler>
    				</initiator-config>
    			</remote-invocation-scheme>
    		</caching-schemes>
    </cache-config>
  3. Save and close the file.

  4. Test the connection to the cluster.

    InvocationService service = (InvocationService) CacheFactory.getConfigurableCacheFactory().ensureService("ExtendTcpInvocationService");

    ensureService() throws an exception if no Coherence cluster is available at the given host and port. A fuller, self-contained sketch of this connection test appears at the end of this section.

  5. Create a coherence reference using a coherence connection.

  6. Register the coherence as reference.

The following sample code registers the Coherence cache as a reference:

override def initialize():Unit = {
    repartition = true
    val externalEvent = EventType("externalorders",IntAttr("orderId"), VarCharAttr("orderDesc", 20))
    val sExtSrcProps = Map(EXT_URL -> "",EXT_ENTITY_NAME -> "externalcache")
    val jExtSrcProps = new java.util.HashMap[String,String](sExtSrcProps)
    val converter = ConverterFactory(ConverterType.COHERENCE,externalEvent)
    cc.registerEventType(externalEvent)
    cc.registerRelation(externalEvent).onExternal(jExtSrcProps,ExtSourceType.COHERENCE,converter)
 }

def main(args: Array[String]) {
    cql = "istream(select R.orderId as orderId, R.orderStatus as orderStatus, Ext.orderDesc as orderDesc from orders[now] as R, externalorders as Ext where R.orderId = Ext.orderId)"
    name = "CoherenceCorrelation"
    processOrders(args)
    }
}
// EXT_URL is not used for coherence as reference , currently used for webservice & database, so this will be set to EMPTY
//EXT_ENTITY_NAME is the cache name of the external coherence cluster

For the above example, the Coherence cache must have a key of type Integer (the orderId) and a value that is a Map of the orderId and orderDesc values. The following sample code populates such a cache:

NamedCache cache = CacheFactory.getCache("externalcache"); 
Map<String,Object> order1 = new HashMap<String, Object>();
order1.put("orderId", new Integer(1)); 
order1.put("orderDesc", "HP Deskjet v2"); 
Map<String,Object> order2 = new HashMap<String, Object>(); 
order2.put("orderId", new Integer(2)); 
order2.put("orderDesc", "Oracle Database 12"); 
Map<String,Object> order3 = new HashMap<String, Object>();
order3.put("orderId", new Integer(3)); 
order3.put("orderDesc", "Apple iPhone6s"); 
Map<String,Object> order4 = new HashMap<String, Object>();
order4.put("orderId", new Integer(4)); 
order4.put("orderDesc", "Logitech Mouse"); 
cache.put(1,order1); 
cache.put(2,order2); 
cache.put(3,order3); 
cache.put(4,order4);
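
The following is a minimal, self-contained sketch of the connection test from step 4. The class name is illustrative, and the cache configuration file is assumed to be supplied through the coherence.cacheconfig system property (older Coherence releases use tangosol.coherence.cacheconfig instead):

import com.tangosol.net.CacheFactory;
import com.tangosol.net.InvocationService;
import com.tangosol.net.NamedCache;

public class ExtendConnectionTest {
    public static void main(String[] args) {
        // Point the Coherence client at the extend configuration created above.
        System.setProperty("coherence.cacheconfig", "cache-config.xml");

        // ensureService() throws an exception if no Coherence cluster is
        // reachable at the host and port in the <remote-addresses> element.
        InvocationService service = (InvocationService) CacheFactory
                .getConfigurableCacheFactory()
                .ensureService("ExtendTcpInvocationService");
        System.out.println("ExtendTcpInvocationService running: " + service.isRunning());

        // Optionally touch the remote cache; the name must match the
        // cache-mapping pattern (externalcache*) in cache-config.xml.
        NamedCache cache = CacheFactory.getCache("externalcache");
        System.out.println("Cache size: " + cache.size());
    }
}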

Creating a Stream

A stream is a source of events with a given content (shape).

To create a stream:

  1. Navigate to Catalog.

  2. Select Stream in the Create New Item menu.

  3. Provide details for the following fields on the Type Properties page and click Next:

    • Name — name of the stream

    • Description — description of the stream

    • Tags — tags you want to use for the stream

    • Stream Type — select suitable stream type. Supported types are File, GoldenGate, JMS, and Kafka.

    Description of create_stream_type.png follows
    Description of the illustration create_stream_type.png

  4. Provide details for the following fields on the Source Details page and click Next:

    When the stream type is File:

    • File Path or URL — the location of the file that you want to upload

    • Read whole content — select this option if you want to read the whole content of the file

    • Number of events per batch — the number of events that you want to process per batch

    • Loop — select this option if you want to process the file in a loop

    • Data Format — the format of the data. The supported types are: CSV and JSON.

    When the stream type is GoldenGate:

    • Connection — the connection for the stream

    • Topic name — the topic name that receives events you want to analyze

    • Data Format — the format of the data. The supported types are: CSV, JSON, AVRO. AVRO is a data serialization system.

    When the stream type is JMS:

    • Connection — the connection for the stream

    • Jndi name — the JNDI name of the topic, distributed topic, queue, or distributed queue from which messages are read

    • Client ID — the client ID to be used for the durable subscriber

    • Message Selector — the message selector to filter messages. If your messaging application needs to filter the messages it receives, you can use a JMS API message selector, which allows a message consumer to specify the messages it is interested in. Message selectors assign the work of filtering messages to the JMS provider rather than to the application.

      A message selector is a String that contains an expression. The syntax of the expression is based on a subset of the SQL92 conditional expression syntax. The message selector in the following example selects any message that has a NewsType property that is set to the value 'Sports' or 'Opinion':

      NewsType = 'Sports' OR NewsType = 'Opinion'

      The createConsumer and createDurableSubscriber methods allow you to specify a message selector as an argument when you create a message consumer (see the JMS sketch at the end of this topic).

    • Subscription ID — the subscription ID for the durable subscriber

    • Data Format — the format of the data. The supported types are: CSV, JSON, AVRO, MapMessage. MapMessage is supported only for JNDI based streams.

      A MapMessage object is used to send a set of name-value pairs. The names are String objects, and the values are primitive data types in the Java programming language. The names must have a value that is not null, and not an empty string. The entries can be accessed sequentially or randomly by name. The order of the entries is undefined.

    When the stream type is Kafka:

    • Connection — the connection for the stream

    • Topic name — the topic name that receives events you want to analyze

    • Data Format — the format of the data within the stream. The supported types are: CSV, JSON, AVRO.

    Description of create_stream_source.png follows
    Description of the illustration create_stream_source.png

  5. Select one of the mechanisms to define the shape on the Shape page:

    • Infer Shape — detects the shape automatically from the input data stream.

      You can infer the shape from Kafka, JSON schema file, or CSV message/data file. You can also save the auto detected shape and use it later.

    • Select Existing Shape — lets you choose one of the existing shapes from the drop-down list.

    • Manual Shape — populates the existing fields and also allows you to add or remove columns from the shape. You can also update the datatype of the fields.

    Description of create_stream_shape.png follows
    Description of the illustration create_stream_shape.png

A stream is created with the specified details.
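
To make the message selector behavior above concrete, here is a minimal JMS sketch that creates a consumer with that selector. The connection-factory and topic JNDI names are placeholder assumptions, not values used by Oracle Stream Analytics:

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageConsumer;
import javax.jms.Session;
import javax.jms.Topic;
import javax.naming.InitialContext;

public class SelectorExample {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();
        // "jms/cf" and "jms/newsTopic" are placeholder JNDI names.
        ConnectionFactory cf = (ConnectionFactory) ctx.lookup("jms/cf");
        Topic topic = (Topic) ctx.lookup("jms/newsTopic");

        Connection connection = cf.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);

        // Only messages whose NewsType property is 'Sports' or 'Opinion' are delivered.
        String selector = "NewsType = 'Sports' OR NewsType = 'Opinion'";
        MessageConsumer consumer = session.createConsumer(topic, selector);

        connection.start();
        System.out.println("Received: " + consumer.receive(5000));
        connection.close();
    }
}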

CSV Data for Pre-defined Formats

When your data format is CSV, select a predefined format based on the variant of CSV data produced by the originating source. The following list describes the CSV data for each of these predefined formats:

  • DEFAULT — standard comma-separated format, as for RFC4180 but allowing empty lines.

  • EXCEL — Excel file format (using a comma as the value delimiter).

  • INFORMIX_UNLOAD_CSV — default Informix CSV UNLOAD format used by the UNLOAD TO file_name operation (escaping is disabled). This is a comma-delimited format with a LF character as the line separator. Values are not quoted and special characters are escaped with '\'. The default NULL string is "\\N".

  • MYSQL — default MySQL format used by the SELECT INTO OUTFILE and LOAD DATA INFILE operations. This is a tab-delimited format with a LF character as the line separator. Values are not quoted and special characters are escaped with '\'. The default NULL string is "\\N".

  • POSTGRESQL_CSV — default PostgreSQL CSV format used by the COPY operation. This is a comma-delimited format with a LF character as the line separator. The default NULL string is "".

  • POSTGRESQL_TEXT — default PostgreSQL text format used by the COPY operation. This is a tab-delimited format with a LF character as the line separator. The default NULL string is "\\N".

  • RFC4180 — comma-separated format as defined by RFC4180.

  • TDF — tab-delimited format.
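
These predefined format names and descriptions match the predefined formats of the Apache Commons CSV library. Assuming that library (the product documentation does not state which parser it uses), the following minimal sketch shows how one of the formats interprets sample data; the data and header handling are illustrative only:

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class PredefinedFormatDemo {
    public static void main(String[] args) throws Exception {
        String data = "orderId,orderDesc\n1,HP Deskjet v2\n2,Oracle Database 12\n";

        // RFC4180 corresponds to the RFC4180 entry in the list above.
        try (CSVParser parser = CSVParser.parse(data,
                CSVFormat.RFC4180.withFirstRecordAsHeader())) {
            for (CSVRecord record : parser) {
                System.out.println(record.get("orderId") + " -> " + record.get("orderDesc"));
            }
        }
    }
}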

Capabilities of JMS Source

The capabilities of the JMS source are:

  • Ability to connect to JMS cluster — the JMS consumer can connect to a JMS cluster and handle JMS server fail-over.

  • Message format support — Map and TextMessage (JSON, CSV, and AVRO). XML and Object messages are not supported.

  • Message selector — a JMS message selector used to filter messages. Only messages that match the selector produce events.

  • Re-connection — reconnect to the JMS server or JMS cluster.

  • Read messages from a queue/distributed queue.

  • Read messages from a topic — read messages from a JMS topic. By default the subscriber is non-durable.

  • Support for durable subscribers — a durable subscriber registers a durable subscription by specifying a unique identity that is retained by the JMS provider. If the consumer reconnects to the JMS topic, it reads messages from where it last read.

  • T3 support — the WebLogic JMS protocol.

JMS Server Clean Up

When you create a JMS stream and select the durable subscription option (by providing a client ID and subscription ID), Oracle Stream Analytics creates the durable subscription (if it is not already present) when the pipeline using this stream is running. When you exit or unpublish the pipeline (or kill the running pipeline), the durable subscription remains on the JMS server. If you do not intend to publish this pipeline anymore, it is advisable to delete the durable subscription from the JMS server and clean up the resources, as sketched below.
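
The following is a minimal sketch of such a cleanup using the standard JMS API. The connection-factory JNDI name, client ID, and subscription ID are placeholders and must match the values configured on the stream:

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Session;
import javax.naming.InitialContext;

public class DurableSubscriptionCleanup {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();
        // "jms/cf" is a placeholder connection-factory JNDI name.
        ConnectionFactory cf = (ConnectionFactory) ctx.lookup("jms/cf");

        Connection connection = cf.createConnection();
        // Must match the Client ID configured on the JMS stream.
        connection.setClientID("osaClient1");
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);

        // Removes the durable subscription the pipeline left behind;
        // the name must match the stream's Subscription ID.
        session.unsubscribe("osaSubscription1");

        connection.close();
    }
}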

Creating a Reference

The reference defines a read-only source of reference data to enrich a stream. A stream containing a customer name could use a reference containing customer data to add the customer’s address to the stream by doing a lookup using the customer name.

A database reference is a reference to a specified table in the database. With caching enabled for a database reference, values pulled from the database are maintained in a Coherence cache and served from there on subsequent requests. A database reference requires a database connection.

A Coherence reference can be any external cache defined in a Coherence cluster that can hold data from an external system.

To create a reference:

  1. Navigate to Catalog.

  2. Select Reference in the Create New Item menu.

  3. Provide details for the following fields on the Type Properties page and click Next:

    • Name — name of the reference

    • Description — description of the reference

    • Tags — tags you want to use for the reference

    • Reference Type — the reference type of the reference. The supported reference types are: Coherence and Database.

      Description of create_reference_type.png follows
      Description of the illustration create_reference_type.png

  4. Provide details for the following fields on the Source Details page and click Next:

    When the reference type is Coherence, enter or select appropriate values for:

    • Connection — the connection for the coherence reference

      Description of create_reference_source.png follows
      Description of the illustration create_reference_source.png

    • Cache name — the name of the cache. Caching improves performance at the cost of higher memory usage in the Spark applications, and is supported only for a single equality join condition. When you update the cache, the application receives the updated data very quickly.

    A Coherence reference holds data as key-value pairs. The key is an object type and the value is a Map<String,Object>, that is, a map of attribute names to values; the attribute list should match the external event type. In this release, only an external schema for the key and value is supported.

    When the reference type is Database Table, enter or select appropriate values for:

    • Connection — the connection for the database reference

    • Enable Caching — select this option if you want to enable caching

    • Expiry Delay — the amount of time from last update that entries will be kept by the cache before being marked as expired. Any attempt to read an expired entry will result in a reloading of the entry from the configured cache store. This field is enabled only when caching is enabled.

  5. Provide details for the following fields on the Shape page and click Save:

    When the reference type is Coherence:

    • Select Existing Shape — select a shape that you want to use for the reference

      Remember:

      Ensure that you do not use any of the CQL reserved words as the column names. If you use the reserved keywords, you cannot deploy the pipeline.
    • Manual Shape — select this option if you want to define your own shape

    Note:

    When you load Coherence data, ensure that you include precision and scale for the number type. The join works only when these values are specified. For example:

    NamedCache cache = CacheFactory.getCache("externalcachetimestamp");

    java.math.BigDecimal big10 = new java.math.BigDecimal("10",
            new MathContext(58)).setScale(56, RoundingMode.HALF_UP);

    Map<String,Object> order1 = new HashMap<String, Object>();
    order1.put("strValue", "Test");
    order1.put("intervalValue", "+000000002 03:04:11.330000000");
    order1.put("orderTag", big10);

    cache.put(big10, order1);

    When the reference type is Database Table:

    • Shape Name — select a shape that you want to use for the reference

When the datatype of the table data is not supported, the table columns do not get an auto-generated datatype. Only the following datatypes are supported:

  • numeric

  • interval day to second

  • text

  • timestamp (without timezone)

  • date time (without timezone)

    Note:

    The date column cannot be mapped to timestamp. This is a limitation in the current release.

A reference is created with the specified details.

Limitations of Coherence as Reference

With Coherence as a reference, there are a few limitations:

  • You cannot test the connection

  • You need to specify the cache name manually

  • Only the equality operator is allowed while establishing a correlation with a Coherence reference

  • You must use a manual shape

Creating a Dashboard

A dashboard is a visualization tool that helps you look at and analyze the data related to a pipeline through various metric-based visualizations. A dashboard can also include visualizations created from cubes.

Dashboard is an analytics feature. You can create dashboards in Oracle Stream Analytics to get a quick view of the metrics.

To create a dashboard:
  1. Go to the Catalog.
  2. Select Dashboard in the Create New Item menu.

    The Create Dashboard screen appears.

    Description of create_dashboard.png follows
    Description of the illustration create_dashboard.png

  3. Provide suitable details for the following fields:
    • Name — enter a name for the dashboard. This is a mandatory field.
    • Description — enter a suitable description for the dashboard. This is an optional field.
    • Tags — enter or select logical tags to easily identify the dashboard in the catalog. This is an optional field.
  4. Click Next.
  5. Enter a custom stylesheet for the dashboard. This is an optional step.
  6. Click Save.
    You can see the dashboard in the Catalog.

After you have created the dashboard, it is just an empty dashboard. You need to start adding visualizations to the dashboard.

Editing a Dashboard

To edit a dashboard:

  1. Click the required dashboard in the catalog.

    The dashboard opens in the dashboard editor.

    Description of edit_dashboard.png follows
    Description of the illustration edit_dashboard.png

  2. Click the Add a new visualization icon to see a list of existing visualizations. Visualizations from the pipelines as well as from the cube explorations appear here. Go through the list, select one or more visualizations, and add them to the dashboard.

  3. Click the Specify refresh interval icon to select the refresh frequency for the dashboard. This is applicable only to cube-based visualizations, not to streaming charts created out of a pipeline.

    This is just a client-side setting and is not persisted with Superset Version 0.17.0.

  4. Click the Apply CSS to the dashboard icon to select a CSS. You can also edit the CSS in the live editor.

    You can also see the active filter applied to the dashboard by clicking the Active dashboard filters icon. You can save the link to the dashboard or email the link to someone using the Copy the link to the clipboard and Email the link icons respectively.

  5. Click the Save icon to save the changes you have made to the dashboard.

  6. Hover over the added visualization, click the Explore chart icon to open the chart editor of the visualization.

    Explore Chart icon

    Description of explore_chart.png follows
    Description of the illustration explore_chart.png

    You can see the metadata of the visualization. You can also move the chart around the canvas, refresh it, or remove it from the dashboard.

    A cube exploration looks like the following:

    Description of cube_exploration.png follows
    Description of the illustration cube_exploration.png

    The various options like time granularity, group by, table timestamp format, row limit, filters, and result filters add more granularity and details to the dashboard.

  7. Click Save as to make the following changes to the dashboard:

    • Overwrite the visualization

    • Overwrite the current visualization with a different name

    • Add the visualization to an existing dashboard

    • Add the visualization to a new dashboard

Creating a Cube

A cube is a data structure that helps you quickly analyze the data related to a business problem on multiple dimensions.

To create a cube:

  1. Go to the Catalog.
  2. From the Create New Item menu, select Cube.
  3. On the Create Cube — Type Properties screen, provide suitable details for the following fields:
    • Name — enter a name for the cube. This is a mandatory field.

      Make sure that the names you use for the underlying sources of the cube, such as the Pipeline Name, Druid Connection, and Kafka Target, contain only alphanumeric, hyphen, and underscore characters.

    • Description — enter a suitable description for the cube. This is an optional field.
    • Tags — enter or select logical tags for the cube. This is an optional field.
    • Source Type — select the source type from the drop-down list. Currently, Published Pipeline is the only supported type. This is a mandatory field.
  4. Click Next and provide suitable details for the following fields on the Ingestion Details screen:
    • Connection — the connection for the cube. This is a mandatory field.
    • Pipelines — select a pipeline to be used as the base for the cube. This is a mandatory field.
    • Kafka Target — the Kafka target for the cube. This is a mandatory field.
    • Timestamp — select a column from the pipeline to be used as the timestamp. This is a mandatory field.
    • Timestamp format — select or set a suitable format for the timestamp using Joda time format. This is a mandatory field. auto is the default value.
    • Metrics — select metrics for creating measures
    • Dimensions — select dimensions for group by
    • High Cardinality Dimensions — high cardinality dimensions such as unique IDs. HyperLogLog approximation will be used.
  5. Click Next and select the required values for the Metric on the Metric Capabilities screen.
  6. Click Next and make any changes, if required, on the Advanced Settings screen.
    • Segment granularity — select the granularity with which you want to create segments
    • Query granularity — select the minimum granularity to be able to query results and the granularity of the data inside the segment
    • Task count — select the maximum number of reading tasks in a replica set. This means that the maximum number of reading tasks is taskCount*replicas and the total number of tasks (reading + publishing) is higher than this. The number of reading tasks is less than taskCount if taskCount > {numKafkaPartitions}.
    • Task duration — select the length of time before tasks stop reading and begin publishing their segment. The segments are only pushed to deep storage and loadable by historical nodes when the indexing task completes.
    • Maximum rows in memory — enter a number greater than or equal to 0. This number indicates the number of rows to aggregate before persisting. This number is the post-aggregation rows, so it is not equivalent to the number of input events, but the number of aggregated rows that those events result in. This is used to manage the required JVM heap size. Maximum heap memory usage for indexing scales with maxRowsInMemory*(2 + maxPendingPersists).
    • Maximum rows per segment — enter a number greater than or equal to 0. This is the number of rows to aggregate into a segment; this number is post-aggregation rows.
    • Immediate Persist Period — select the period that determines the rate at which intermediate persists occur. This allows the data cube to be ready for querying before the indexing task finishes.
    • Report Parse Exception — select this option to throw exceptions encountered during parsing and halt ingestion.
    • Advanced IO Config — specify name-value pairs in CSV format. Available configurations are replicas, startDelay, period, useEarliestOffset, completionTimeout, and lateMessageRejectionPeriod.
    • Advanced Tuning Config — specify name-value pairs in CSV format. Available configurations are maxPendingPersists, handoffConditionTimeout, resetOffsetAutomatically, workerThreads, chatThreads, httpTimeout, and shutdownTimeout.
  7. Click Save to save the changes you have made.
You can see the cube you have created in the catalog.

Exploring a Cube

When you create a Druid-based cube, you can explore the data in it.

To explore a cube:

  1. In the Catalog, click the cube that you want to explore.
    The Cube Exploration canvas appears.
  2. Construct a query by setting the various parameters.
    • Visualization Type — the type of visualization to be used for displaying data. The supported visualizations are:

      Distribution - Bar Chart

      Separator

      Sunburst

      Pie Chart

      Word Cloud

      Sankey

      Time Series - Line Chart

      Treemap

      Directed force Layout

      Time Series - Dual Axis Line Chart

      Calendar Heatmap

      World Map

      Time Series - Bar Chart

      Box Plot

      Filter Box

      Time Series - Percent Change

      Bubble Chart

      iFrame

      Time Series - Stacked

      Bullet Chart

      Streaming Chart

      Table View

      Big Number with Trendline

      Parallel Coordinates

      Markup

      Big Number

      Heatmap

      Pivot Table

      Histogram

      Horizon

    • Time — time related form attributes like time granularity, origin (starting point of time), and time range

    • Group By — parameters to aggregate the query data

    • Not Grouped By — parameter to query atomic rows

    • Options

    • Filters — columns that you can use in filters

    • Result Filters — columns that you can use in result filters

    Description of cube_exploration.png follows
    Description of the illustration cube_exploration.png

  3. Click Query to run the query with the defined parameters.
  4. Click Save As to save the cube exploration. You can save it as a visualization, choose to add it to an existing dashboard, not to add to a dashboard, or to a new dashboard.

Creating a Target

The target defines a destination for output data coming from a pipeline.

To create a target:

  1. Navigate to Catalog.

  2. Select Target in the Create New Item menu.

  3. Provide details for the following fields on the Type Properties page and click Save and Next:

    • Name — name of the target

    • Description — description of the target

    • Tags — tags you want to use for the target

    • Target Type — the transport type of the target. Supported types are JMS, Kafka, and REST. The target is a sink for the output event. Each type of target is a different sink system and therefore different configuration parameters are required for different types.

      Description of create_target_type.png follows
      Description of the illustration create_target_type.png

  4. Provide details for the following fields on the Target Details page and click Next:

    When the target type is JMS:

    • Connection — the connection for the target

    • Jndi name — the topic or queue name defined in JNDI to be used in the target

      Description of create_target_details.png follows
      Description of the illustration create_target_details.png

    • Data Format — select a suitable data format. This is a mandatory field. The supported data format types are: CSV and JSON.

    When the target type is Kafka:

    • Connection — the connection for the target

    • Topic Name — the Kafka topic to be used in the target

    • Data Format — select a suitable data format. This is a mandatory field. The supported data format types are: CSV and JSON.

    When the target type is REST:

    • URL — enter the REST service URL. This is a mandatory field.

    • Custom HTTP headers — set the custom headers for HTTP. This is an optional field.

    • Batch processing — select this option to send events in batches and not one by one. Enable this option for high throughput pipelines. This is an optional field.

    • Data Format — select a suitable data format. This is a mandatory field.

    Click Test connection to check if the connection has been established successfully.

    Testing REST targets is a heuristic process that uses the proxy settings. The test sends a GET request to the given URL and reports success if the server returns OK (status code 200). The returned content is expected to be of type application/json. A conceptual sketch of this check appears at the end of this topic.

  5. Provide details for the following fields on the Data Format page and click Next:

    When the data format type is CSV:

    • CSV Predefined Format — select a predefined CSV format. The supported formats are: Excel, InformixUnload, InformixUnloadCsv, MySQL, PostgreSQLCsv, and PostgreSQLText.

    • Create the header row — select this option if you want to create a header row in the target.

    When the data format type is JSON:

  6. Select one of the mechanisms to define the shape on the Shape page and click Save:

    • Select Existing Shape lets you choose one of the existing shapes from the drop-down list.

    • Manual Shape populates the existing fields and also allows you to add or remove columns from the shape. You can also update the datatype of the fields.

      Description of create_target_shape.png follows
      Description of the illustration create_target_shape.png

A target is created with the specified details.
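
For illustration, the following minimal sketch is a conceptual equivalent of the REST target connection test described above: issue a GET request and treat status code 200 as success. It is not the product's implementation, and the URL is a placeholder:

import java.net.HttpURLConnection;
import java.net.URL;

public class RestTargetPing {
    public static void main(String[] args) throws Exception {
        // Placeholder REST service URL; replace with your target's URL.
        URL url = new URL("http://localhost:8080/events");

        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);

        // The test succeeds when the server answers with OK (status code 200).
        int status = conn.getResponseCode();
        System.out.println(status == 200 ? "Connection OK" : "Unexpected status: " + status);
        conn.disconnect();
    }
}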

Creating Target from Pipeline Editor

Alternatively, you can create a target from the pipeline editor. When you click Create in the target stage, you are navigated to the Create Target dialog box. Provide all the required details and complete the target creation process. When you create a target from the pipeline editor, the shape is pre-populated with the shape from the last stage.

Creating a Geo Fence

Geo fences are classified into two categories: manual geo fences and database-based geo fences.

Create a Manual Geo Fence

To create a manual geo fence:

  1. Navigate to the Catalog page.

  2. Click Create New Item and select Geo Fence from the drop-down list.

    The Create Geo Fence dialog opens.

  3. Enter a suitable name for the Geo Fence.

  4. Select Manually Created Geo Fence as the Type.

  5. Click Save.

    The Geo Fence Editor opens. In this editor you can create the geo fence according to your requirement.

  6. Within the Geo Fence Editor, Zoom In or Zoom Out to navigate to the required area using the zoom icons in the toolbar located on the top-left side of the screen.

    You can also use the Marquee Zoom tool to zoom in to a specific area on the map: mark an area with the marquee zoom, and the map zooms to that area.

  7. Click the Polygon Tool and mark the area around a region to create a geo fence.

    Description of create_geo_fence.png follows
    Description of the illustration create_geo_fence.png

  8. Enter a name and description, and click Save to save your changes.

Update a Manual Geo Fence

To update a manual geo fence:

  1. Navigate to the Catalog page.

  2. Click the name of the geo fence you want to update.

    The Geo Fence Editor opens. You can edit/update the geo fence here.

Search Within a Manual Geo Fence

You can search within a geo fence based on the country and a region or address. The search field allows you to search within the available list of countries. When you click the search results tile in the left center of the geo fence and select any result, the map automatically zooms in to that specific area.

Delete a Manual Geo Fence

To delete a manual geo fence:

  1. Navigate to Catalog page.

  2. Click Actions, then select Delete Item to delete the selected geo fence.

Create a Database-based Geo Fence

To create a database-based geo fence:

  1. Navigate to Catalog page.

  2. Click Create New Item and then select Geo Fence from the drop-down list.

    The Create Geo Fence dialog opens.

  3. Enter a suitable name for the geo fence.

  4. Select Geo Fence from Database as the Type.

  5. Click Next and select Connection.

  6. Click Next.

    All tables that have the field type as SDO_GEOMETRY appear in the drop-down list.

  7. Select the required table to define the shape.

  8. Click Save.

Note:

You cannot edit/update database-based geo fences.

Delete a Database-based Geo Fence

To delete a database-based geo fence:

  1. Navigate to Catalog page.

  2. Click Actions and then select Delete Item to delete the selected geo fence.

Display the Map Using Tile Layers

Tile layer is the base map that provides immediate geographic context. Tiles are stored in the map tile server. Oracle Stream Analytics supports two types of tile layers. The Open Street Maps tile layer is a free map, and the Elocation tile layer is an Oracle tile layer. These tile layers contain a huge amount of data pertaining to:

  • Roads, railways, waterways, etc.

  • Restaurants, shops, stations, ATMs, and more

  • Walking and cycling paths

  • Buildings, campuses, etc.

You can choose whether you would like to see the map in the Elocation tile layer or the Open Street Maps tile layer. To set your preference:

  1. Click the user name in the top right corner of the screen.

  2. Click Preferences. The Preferences page opens.

  3. Click Map.

  4. Under Tile Layer, choose the Open Street Maps Tile Layer option from the drop-down list.

    Description of tilelayer_1.png follows
    Description of the illustration tilelayer_1.png

  5. Click Save. The map looks like this:

    Description of tilelayer_2.png follows
    Description of the illustration tilelayer_2.png

  6. To display the map in Elocation tile layer, follow steps 1 to 3.

  7. From the Tile Layer drop-down list, choose Elocation Tile Layer.

  8. Click Save. The map looks like this:

    Description of tilelayer_4.png follows
    Description of the illustration tilelayer_4.png

Creating a Predictive Model

To create a predictive model:
  1. In the Create New Item menu, select Predictive Model (Beta).
    The Create Predictive Model page opens.
  2. Under Type Properties do the following and then click Next:
    1. In the Name field, enter a meaningful name for your PMML model.
    2. In the Predictive Model Type drop-down list, select PMML Model.

      Note:

      Only PMML Models up to version 4.1 are supported in this release.
  3. Under Predictive Model Details, do the following and click Save:
    1. For Predictive Model URL, upload your PMML file.
    2. In the Model Version field, enter the version of this artifact. For example, 1.0.
    3. (Optional) In the Version Description, enter a meaningful description for your PMML file.
    4. In the Algorithm field, accept the default. The algorithm is derived from the PMML file you have uploaded.
    5. (Optional) In the Tool drop-down list, select the tool with which you created your PMML file.
Your predictive model has been created. It is displayed in the Catalog if you have selected the Predictive Models option under Show Me.

Description of predictive_model.png follows
Description of the illustration predictive_model.png

Limited Support for Predictive Models

The menu commands for creating Predictive Models and Scoring Stages are marked Beta, for example, Predictive Model (Beta). The Beta label indicates that the functionality has been tested, but is not fully supported. The import and scoring of Predictive Models might contain undocumented limitations and you should use them as is.

Creating a Custom Jar

A custom jar is a user-supplied Jar archive containing Java classes for custom stage types or custom functions that will be used within a pipeline.

To create a custom jar:
  1. In the Create New Item menu, select Custom Jar.
    The Import a jar for custom stages and functions wizard appears.
  2. On the Type Properties page, enter/select suitable values and click Next:
    1. In the Name field, enter a meaningful name for the custom jar you are trying to import into the application.
    2. In the Description field, provide a suitable description.
    3. In the Tags field, select one or more of existing tags, or enter your own tags.
    4. In the Custom Jar Type drop-down list, select Custom Jar.
  3. On the Custom Jar Details page, click Upload file, select the jar file that you want to import into the application, and then click Save.
    Make sure that the jar file you select for uploading is a valid jar file and includes all the required dependencies.

Creating a Pipeline

A pipeline is a Spark application where you implement your business logic. It can have multiple stages such as a query stage, a pattern stage, a business rule stage, a query group stage, a custom stage, and many more.

To create a pipeline:

  1. Navigate to Catalog.

  2. Select Pipeline in the Create New Item menu.

  3. Provide details for the following fields and click Save:

    • Name — name of the pipeline

    • Description — description of the pipeline

    • Tags — tags you want to use for the pipeline

    • Stream — select the stream on which the pipeline is based

A pipeline is created with specified details.

Configuring a Pipeline

You can configure the pipeline to use various stages like query, pattern, rules, query group, scoring, and custom stage from custom jars.

Pipeline Editor

The canvas on which you edit/update a pipeline and add different stages to the pipeline is called Pipeline Editor.

The pipelines in Oracle Stream Analytics can vary from very simple to highly complex. Complex pipelines have various stages branching out from any stage of the pipeline. In other words, you can add any type of stage to any of the existing stages in the pipeline.

You can delete any stage that does not have any children without breaking the pipeline. You can expand/collapse a pipeline, switch the layout of the pipeline to vertical or horizontal, and zoom in or zoom out the pipeline. You can adjust the pipeline pane, editor pane, and the live output table pane using the resizing arrows.

Description of expand_collapse_pipeline.png follows
Description of the illustration expand_collapse_pipeline.png

The pipeline editor allows you to see the relationship and dependencies between various stages of the pipeline.

Working with Live Output Table

The streaming data in the pipeline appears in a live output table.

Hide/Unhide Columns

In the live output table, right-click columns and click Hide to hide that column from the output. To unhide the hidden columns, click Columns and then click the eye icon to make the columns visible in the output.

Select/Unselect the Columns

Click the Columns link at the top of the output table to view all the columns available. Use the arrow icons to either select or unselect individual columns or all columns. Only columns you select appear in the output table.

Pause/Restart the Table

Click Pause/Resume to pause or resume the streaming data in the output table.

Perform Operations on Column Headers

Right-click on any column header to perform the following operations:

  • Hide — hides the column from the output table. Click the Columns link and unhide the hidden columns.

  • Remove from output — removes the column from the output table. Click the Columns link and select the columns to be included in the output table.

  • Rename — renames the column to the specified name.

  • Function — captures the column in the Expression Builder, where you can perform various operations using the built-in functions.

Add a Timestamp

Include timestamp in the live output table by clicking the clock icon in the output table.

Reorder the Columns

Click and drag the column headers to right or left in the output table to reorder the columns.

Adding a Query Stage

You can include simple or complex queries on the data stream without any coding to obtain refined results in the output.

  1. Open a pipeline in the Pipeline Editor.
  2. Right-click the stage after which you want to add a query stage, click Add a Stage, and then select Query.
  3. Enter a Name and Description for the Query Stage.
  4. Click Save.
Adding and Correlating Sources and References

You can correlate sources and references in a pipeline.

To add a correlating source or reference:
  1. Open a pipeline in the Pipeline Editor.
  2. Select the required query stage.
  3. Click the Sources tab.
  4. Click Add a Source.
  5. Select a source (stream or reference) from the available list.
  6. Click the Window Area in the source next to the clock icon and select appropriate values for Range and Evaluation Frequency.
  7. Under Correlation Conditions, select Match All or Match Any as per your requirement. Then click Add a Condition.
  8. Select the fields from the sources and the appropriate operator to correlate.
    Ensure that the fields you use on one correlation line are of compatible types. The fields that appear in the right drop-down list depend on the field you select in the left drop-down list.
  9. Repeat these steps for as many sources or references as you want to correlate.
Adding Filters

You can add filters in a pipeline to obtain more accurate streaming data.

To add a filter:
  1. Open a pipeline in the Pipeline Editor.
  2. Select the required query stage.
  3. Navigate to the Filters tab.
  4. Click Add a Filter.
  5. Select the required column and a suitable operator and value.

    You can also use calculated fields within filters.

  6. Click Add a Condition to add and apply a condition to the filter.
  7. Click Add a Group to add a group to the filter.
  8. Repeat these steps for as many filters, conditions, or groups as you want to add.
Adding Summaries

To add a summary:
  1. Open a pipeline in the Pipeline Editor.
  2. Select the required query stage and click the Summaries tab.
  3. Click Add a Summary.
  4. Select the suitable function and the required column.
  5. Repeat the above steps to add as many summaries as you want.
Adding Group Bys

To add a group by:
  1. Open a pipeline in the Pipeline Editor.
  2. Select the required query stage and click the Summaries tab.
  3. Click Add a Group By.
  4. Click Add a Field and select the column on which you want to group by.
    A group by is created on the selected column.

When you create a group by, the live output table shows the group by column alone by default. Turn ON Retain All Columns to display all columns in the output table.

You can add multiple group bys as well.

Using the Expression Builder

You can perform calculations on the data streaming in the pipeline using in-built functions of the Expression Builder.

Oracle Stream Analytics supports various functions. For a list of supported functions, see Understanding Expression Builder Functions.

Note:

Currently, you can use expressions only within a query stage.

Adding a Constant Value Column

A constant value is a simple string or number. No calculation is performed on a constant value. Enter a constant value directly in the expression builder to add it to the live output table.

Description of expr_constant_value.png follows
Description of the illustration expr_constant_value.png

Using Functions

You can select a CQL function from the list of available functions and select the input parameters. Make sure to begin the expression with "=". Click Apply to apply the function to the streaming data.

Description of list_of_functions.png follows
Description of the illustration list_of_functions.png

You can see custom functions in the list of available functions when you add/import a custom jar in your pipeline.
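
For example, assuming the stage has a numeric column named orderTotal (a hypothetical name) and a rounding function is available in the supported function list, an expression of the form =round(orderTotal) would add a computed column to the live output table. See Understanding Expression Builder Functions for the functions that are actually available.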

Adding Visualizations

Visualizations are graphical representations of the streaming data in a pipeline. You can add visualizations on all stages in the pipeline except a target stage.

Select an appropriate visualization that suits your requirement.
Creating Visualization - Area Visualization

Area visualization represents data as a filled-in area. Area visualization requires at least two groups of data along an axis. The X axis is a single consecutive dimension, such as a date-time field, and the data lines are unlikely to cross. The Y axis represents the metrics (measured value). The X axis can also have non-date-time categories. This visualization is mainly suitable for presenting accumulative value changes over time.

To add an area visualization:
  1. Open a pipeline in the Pipeline Editor.
  2. Select the required stage and click the Visualizations tab.
  3. Click Add a Visualization and then click Area Chart.
  4. Enter/select values for the following fields:
    • Name: a suitable name for the visualization. This is a mandatory field.

    • Description: a suitable description. This is an optional field.

    • Tags: suitable tags for easy identification. This is an optional field.

    • Y Axis Field Selection: the column to be used as the Y axis. This is a mandatory field.

    • Axis Label: a label for the Y axis. This is an optional field.

    • X Axis Field Selection: the column to be used as the X axis. This is a mandatory field.

    • Axis Label: a label for the X axis. This is an optional field.

    • Orientation: select this check box if you want the visualization to appear with a horizontal orientation in the Pipeline Editor. This is optional; you can decide based on your use case or requirement whether to change the orientation.

    • Data Series Selection: the column to be used as the data series. This is an optional field.

  5. Click Create.
The visualization is created and you can see the data populated in it.
Creating Visualization - Bar Visualization

Bar visualization is one of the most widely used visualization types; it represents data as a series of vertical bars. It is best suited for comparing values represented along the Y axis, where different categories are spread across the X axis. In a bar visualization, vertical columns represent metrics (measured values). The horizontal axis displays multiple or non-consecutive categories.

To add a bar visualization:
  1. Open a pipeline in the Pipeline Editor.
  2. Select the required stage and click the Visualizations tab.
  3. Click Add a Visualization and then click Bar Chart.
  4. Enter/select values for the following fields:
    • Name: a suitable name for the visualization. This is a mandatory field.

    • Description: a suitable description. This is an optional field.

    • Tags: suitable tags for easy identification. This is an optional field.

    • Y Axis Field Selection: the column to be used as the Y axis. This is a mandatory field.

    • Axis Label: a label for the Y axis. This is an optional field.

    • X Axis Field Selection: the column to be used as the X axis. This is a mandatory field.

    • Axis Label: a label for the X axis. This is an optional field.

    • Orientation: select this check box if you want the visualization to appear with a horizontal orientation in the Pipeline Editor. This is optional; you can decide based on your use case or requirement whether to change the orientation.

  5. Click Create.
The visualization is created and you can see the data populated in it.
Creating Visualization - Bubble Visualization

A bubble chart is a good option when you want to add an additional dimension to a scatter plot chart. Scatter charts compare two values, but in a bubble chart you can add bubble size as the third variable and thus enable comparison. A good example of a bubble chart is showing marketing expenditure vs. revenue vs. profit.

To add a bubble chart:
  1. Open a pipeline in the Pipeline Editor.
  2. Select the required stage and click the Visualizations tab.
  3. Click Add a Visualization and then click Bubble Chart.
  4. Enter/select values for the following fields:
    • Name: a suitable name for the visualization. This is a mandatory field.

    • Description: a suitable description. This is an optional field.

    • Tags: suitable tags for easy identification. This is an optional field.

    • Y Axis Field Selection: the column to be used as the Y axis. This is a mandatory field.

    • Axis Label: a label for the Y axis. This is an optional field.

    • X Axis Field Selection: the column to be used as the X axis. This is a mandatory field.

    • Axis Label: a label for the X axis. This is an optional field.

    • Bubble Size Field Selection: select the field that you want to use as the bubble size. This is a mandatory field.

  5. Click Create.
The visualization is created and you can see the data populated in it.
Creating Visualization - Geo Spatial Visualization

Geo Spatial visualization allows you to display the location of an object on a geo fence and takes you to the area where events are occurring. You can configure the visualization to specify the latitude, longitude, identifier, and so on. You can also customize the visualization by specifying different pins, such as arrows with different colors, based on certain conditions.

To add a geo spatial visualization:
  1. Open a pipeline in the Pipeline Editor.
  2. Select the required stage and click the Visualizations tab.
  3. Click Add a Visualization and then click Geo Spatial.
  4. Enter/select values for the following fields:

    On the Properties tab:

    • Name: a suitable name for the visualization. This is a mandatory field.

    • Description: a suitable description. This is an optional field.

    • Tags: suitable tags for easy identification. This is an optional field.

    • Lat: select the field that you want to use as the latitude. This is a mandatory field.

    • Long: select the field that you want to use as the longitude. This is a mandatory field.

    • Key: select the field that you want to use as the key. This is a mandatory field.

    On the Customizations tab:

    • Click the Add icon and select/enter appropriate values for Field, Operator, Value, and Style. This is an optional step.

  5. Click Create.
The visualization is created and you can see the data populated in it.
Creating Visualization - Line Visualization

Line visualization represents data as a line, as a series of data points, or as data points that are connected by a line. Line visualization requires data for at least two points for each member in a group. The X axis is a single consecutive dimension, such as a date-time field, and the data lines are likely to cross. The X axis can also have non-date-time categories. The Y axis represents the metrics (measured value). It is preferable to use a line visualization when the data set is continuous in nature. It is best suited for trend-based plotting of data over a period of time.

To add a line visualization:
  1. Open a pipeline in the Pipeline Editor.
  2. Select the required stage and click the Visualizations tab.
  3. Click Add a Visualization and then click Line Chart.
  4. Enter/select values for the following fields:
    • Name: a suitable name for the visualization. This is a mandatory field.

    • Description: a suitable description. This is an optional field.

    • Tags: suitable tags for easy identification. This is an optional field.

    • Y Axis Field Selection: the column to be used as the Y axis. This is a mandatory field.

    • Axis Label: a label for the Y axis. This is an optional field.

    • X Axis Field Selection: the column to be used as the X axis. This is a mandatory field.

    • Axis Label: a label for the X axis. This is an optional field.

    • Orientation: select this check box if you want the visualization to appear with a horizontal orientation in the Pipeline Editor. This is optional; change the orientation based on your use case or requirement.

    • Data Series Selection: the field that you want to use for data series selection.

  5. Click Create.
The visualization is created and you can see the data populated in it.
Creating Visualization - Pie Visualization

A pie chart is a circular graph that represents statistical data in slices. The size of each slice is proportional to the quantity of the value it represents.
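
For example, if one group's value is 25 and the values of all groups add up to 100, that slice covers 25% of the circle, or 90 of 360 degrees. The following is a minimal Python sketch of this proportion calculation, using hypothetical group names and values:

```python
# Minimal sketch: each slice's size is proportional to its share of the total.
# Group names and values are hypothetical.
values = {"north": 25, "south": 50, "east": 15, "west": 10}
total = sum(values.values())

for group, value in values.items():
    share = value / total
    print(f"{group}: {share:.0%} of the pie ({share * 360:.0f} degrees)")
```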

To add a pie chart:
  1. Open a pipeline in the Pipeline Editor.
  2. Select the required stage and click the Visualizations tab.
  3. Click Add a Visualization and then click Pie Chart.
  4. Enter/select values for the following fields:
    • Name: a suitable name for the visualization. This is a mandatory field.

    • Description: a suitable description. This is an optional field.

    • Tags: suitable tags for easy identification. This is an optional field.

    • Measure: the field to be used as the measure of the visualization. This is a mandatory field.

    • Group: the field to be used as the group for the visualization. This is a mandatory field.

    • Use 3D rendering: select this check box if you want to render the visualization with a 3D effect. This is an optional field.

  5. Click Create.
The visualization is created and you can see the data populated in it.
Creating Visualization - Scatter Visualization

Scatter charts are primarily used for correlation and distribution analysis. This type of chart is good for showing the relationship between two different variables where one correlates to another.

To add a scatter visualization:
  1. Open a pipeline in the Pipeline Editor.
  2. Select the required stage and click the Visualizations tab.
  3. Click Add a Visualization and then click Scatter Chart.
  4. Enter/select values for the following fields:
    • Name: a suitable name for the visualization. This is a mandatory field.

    • Description: a suitable description. This is an optional field.

    • Tags: suitable tags for easy identification. This is an optional field.

    • Y Axis Field Selection: the column to be used as the Y axis. This is a mandatory field.

    • Axis Label: a label for the Y axis. This is an optional field.

    • X Axis Field Selection: the column to be used as the X axis. This is a mandatory field.

    • Axis Label: a label for the X axis. This is an optional field.

    • Data Series Selection: the field that you want to use for data series selection. This is an optional field.

  5. Click Create.
The visualization is created and you can see the data populated in it.
Creating Visualization - Stacked Bar Visualization

A stacked visualization displays sets of values stacked in a single segmented column instead of side-by-side in separate columns. It is used to show a composition. Bars for each set of data are appended to previous sets of data. The size of the stack represents a cumulative data total.

To add a stacked visualization:
  1. Open a pipeline in the Pipeline Editor.
  2. Select the required stage and click the Visualizations tab.
  3. Click Add a Visualization and then click Stacked Bar Chart.
  4. Enter/select values for the following fields:
    • Name: a suitable name for the visualization. This is a mandatory field.

    • Description: a suitable description. This is an optional field.

    • Tags: suitable tags for easy identification. This is an optional field.

    • Y Axis Field Selection: the column to be used as the Y axis. This is a mandatory field.

    • Axis Label: a label for the Y axis. This is an optional field.

    • X Axis Field Selection: the column to be used as the X axis. This is a mandatory field.

    • Axis Label: a label for the X axis. This is an optional field.

    • Orientation: select this check box if you want the visualization to appear with a horizontal orientation in the Pipeline Editor. This is optional; change the orientation based on your use case or requirement.

  5. Click Create.
The visualization is created and you can see the data populated in it.
Creating Visualization - Thematic Map

A thematic map is used to represent a particular theme in data connected to a geographical area. This type of map depicts the political, cultural, agricultural, sociological, and many other aspects of the geographic region, be it a city, state, country, or region.

To add a thematic map:
  1. Open a pipeline in the Pipeline Editor.
  2. Select the required stage and click the Visualizations tab.
  3. Click Add a Visualization and then click Thematic Map.
  4. Enter/select values for the following fields:
    • Name: a suitable name for the visualization. This is a mandatory field.

    • Description: a suitable description. This is an optional field.

    • Tags: suitable tags for easy identification. This is an optional field.

    • Map Type: the map of the region that you want to use. This is a mandatory field.

    • Location Field: the field that you want to use as the location. This is a mandatory field.

    • Data Field: the field that you want to use as the data field. This is a mandatory field.

    • Show Data Value: select this check box if you want to display the data value as a marker on the visualization. This is an optional field.

  5. Click Create.
The visualization is created and you can see the data populated in it.

Updating Visualizations

You can perform update operations like edit and delete on the visualizations after you add them.

You can open the visualization in a new window/tab using the Maximize Visualizations icon in the visualization canvas.

Edit Visualization

To edit a visualization:

  1. On the stage that has visualizations, click the Visualizations tab.

  2. Identify the visualization that you want to edit and click the pencil icon next to the visualization name.

  3. In the Edit Visualization dialog box that appears, make the changes you want. You can even change the Y Axis and X Axis selections; when you do, the visualization changes because the basis on which the graph is plotted has changed.

Change Orientation

Based on the data that you have in the visualization or your requirement, you can change the orientation of the visualization. You can toggle between horizontal and vertical orientations by clicking the Flip Chart Layout icon in the visualization canvas.

Delete Visualization

You can delete the visualization if you no longer need it in the pipeline. In the visualization canvas, click the Delete icon available beside the visualization name to delete the visualization from the pipeline. Be careful while you delete the visualization, as it is deleted with immediate effect and there is no way to restore it once deleted.

Delete All Visualizations

You can delete all the visualizations in the stage if you no longer need them. In the visualization canvas, click the Delete All icon to delete all the visualizations of the stage at one go. Be careful while you delete the visualizations, as the effect is immediate and there is no way to restore the deleted visualizations.

Adding a Pattern Stage

Patterns are templatized stages. You supply a few parameters for the template and a stage is generated based on the template.

For detailed information about the various types of patterns, see Patterns.

To add a pattern stage:
  1. Open a pipeline in the Pipeline Editor.
  2. Right-click the stage after which you want to add a pattern stage, click Add a Stage, and then select Pattern.
  3. Choose the required pattern from the list of available patterns.
  4. Enter a Name and Description for the pattern stage.
    The selected pattern stage is added to the pipeline.
  5. Click Parameters and provide the required values for the parameters.
  6. Click Visualizations and add the required visualizations to the pattern stage.

Adding a Rule Stage

Using a rule stage, you can add the IF-THEN logic to your pipeline. A rule is a set of conditions and actions applied to a stream.

To add a rule stage:
  1. Open a pipeline in the Pipeline Editor.
  2. Right-click the stage after which you want to add a rule stage, click Add a Stage, and then select Rule.
  3. Enter a Name and Description for the rule stage.
  4. Click Add a Rule.
  5. Enter Rule Name and Description for the rule and click Done to save the rule.
  6. Select a suitable condition in the IF statement and the THEN statement, and click Add Action to add actions within the business rule.
The rules are applied to the incoming events one by one and actions are triggered if the conditions are met.
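
Conceptually, a rule is evaluated against each event in turn, much like the hedged Python sketch below. The event field (temperature), the threshold, and the action (setting an alert value) are hypothetical examples, not how Oracle Stream Analytics implements rules internally.

```python
# Illustrative sketch only: the IF-THEN shape of a rule applied to each event.
# Field names, the threshold, and the action are hypothetical.

def apply_rule(event):
    # IF: the condition checks a field of the incoming event
    if event.get("temperature", 0) > 100:
        # THEN: the action sets a value on the event
        event["alert"] = "HIGH_TEMP"
    return event

events = [
    {"sensor": "s1", "temperature": 75},
    {"sensor": "s2", "temperature": 120},
]

# Rules are applied to incoming events one by one.
processed = [apply_rule(dict(e)) for e in events]
print(processed)
```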

Adding a Query Group Stage

A query group stage allows you to use more than one query group to process your data - a stream or a table in memory. A query group is a combination of summaries (aggregation functions), group-bys, filters and a range window. Different query groups process your input in parallel and the results are combined in the query group stage output. You can also define input filters that process the incoming stream before the query group logic is applied, and result filters that are applied on the combined output of all query groups together.

A query group stage of the stream type applies processing logic to a stream. It is in essence similar to several parallel query stages grouped together for the sake of simplicity.

A query group stage of the table type can be added to a stream containing transactional semantics, such as a change data capture stream produced, for example, by the Oracle GoldenGate BigData plugin. A stage of this type recreates the original database table in memory using the transactional semantics contained in the stream. You can then apply query groups to this in-memory table to run real-time analytics on your transactional data without affecting the performance of your database.
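
As a rough mental model, each query group is an independent aggregation over the same (filtered) input, and the stage combines the group results into one output. The Python sketch below illustrates that idea with two groups that use different group-bys; the field names are hypothetical, and range windows are omitted for brevity.

```python
# Conceptual sketch of a query group stage: an input filter, two query groups
# with different group-bys, and a combined output. Field names are hypothetical
# and range windows are omitted.
from collections import defaultdict

events = [
    {"city": "Austin", "product": "A", "amount": 10},
    {"city": "Austin", "product": "B", "amount": 5},
    {"city": "Boston", "product": "A", "amount": 7},
]

def input_filter(event):
    # Input filters run before any query group logic is applied.
    return event["amount"] > 0

def sum_by(rows, group_by):
    # One "query group": SUM(amount) grouped by a single key.
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_by]] += row["amount"]
    return dict(totals)

filtered = [e for e in events if input_filter(e)]
by_city = sum_by(filtered, "city")        # query group 1
by_product = sum_by(filtered, "product")  # query group 2

# The stage output combines the results of all query groups.
print({"by_city": by_city, "by_product": by_product})
```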

Adding Query Group: Stream

You can apply aggregate functions with different group-bys and window ranges to your streaming data.

To add a query group stage of type stream:
  1. Open a pipeline in the Pipeline Editor.
  2. Right-click the stage after which you want to add a query group stage, click Add a Stage, select Query Group, and then Stream.

    You can add a query group stage only at the end of the pipeline.

  3. Enter a name and a description for the query group stage of the type stream and click Save.

    The query group stage of the type stream appears in the pipeline.

  4. On the Input Filters tab, click Add a Filter. See Adding Filters to understand the steps for creating filters.

    These filters process data before it enters the query group stage. Hence, you can see only the fields of the original incoming shape.

  5. On the Groups tab, click Add a Group. A group can consist of one or more summaries, filters, and group-bys.
  6. Repeat the previous step to add as many groups as you want.
  7. On the Result Filters tab, click Add a Filter to filter the results.

    These filters process data before it exits the query group stage. Hence, you can see the combined set of fields produced in the outgoing shape.

  8. On the Visualizations tab, click Add a Visualization and add the required type of visualization. See Adding Visualizations for the procedure.
Adding Query Group: Table

You can apply aggregate functions with different group-bys and window ranges to database table data recreated in memory.

To add a query group stage of the type table:
  1. Open a pipeline in the Pipeline Editor.
  2. Right-click the stage after which you want to add a query group stage, click Add a Stage, select Query Group, and then Table.
  3. Enter a name and a description for the Query Group Table and click Next.
  4. On the Transactions Settings screen, select a column in the Transaction Field drop-down list.

    The transaction column is a column from the output of the previous stage that carries the transaction semantics (insert/update/delete). Make sure that you use values that correspond to your change data capture dataset. The default values work for an Oracle GoldenGate change data capture dataset.

  5. On the Field Mappings screen, select the columns that carry the before and after transaction values from the original database table. For example, in the case of Oracle GoldenGate, the before and after values have before_ and after_ as prefixes, respectively. Specify a column as the primary key of the table. A conceptual sketch of how such a change stream rebuilds a table in memory follows this procedure.
  6. Click Save to create a query group stage of the type table.
    You can see the table configuration that you have specified while creating the table stage in the Table Configuration tab.
  7. On the Input Filters tab, click Add a Filter. See Adding Filters to understand the procedure.
  8. On the Groups tab, click Add a Group. A group can consist of one or more summaries, filters, and group-bys.
  9. Repeat the previous step to add as many groups as you want.
  10. On the Result Filters tab, click Add a Filter to filter the results.
  11. On the Visualizations tab, click Add a Visualization and add the required type of visualization. See Adding Visualizations for the procedure.
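
As referenced in the field-mapping step above, the table-type stage can be pictured as replaying insert, update, and delete operations keyed by the primary key. The Python sketch below is a simplified illustration only; the op_type values and the before_/after_ column names are hypothetical examples that follow the prefix convention described in the procedure, not the product's internal implementation.

```python
# Simplified illustration of rebuilding a database table in memory from a
# change data capture stream. Column names (before_id, after_id, after_name)
# and op_type values are hypothetical.

table = {}  # primary key -> current row

def apply_change(event):
    op = event["op_type"]
    if op in ("insert", "update"):
        table[event["after_id"]] = {"name": event["after_name"]}
    elif op == "delete":
        table.pop(event["before_id"], None)

changes = [
    {"op_type": "insert", "after_id": 1, "after_name": "widget"},
    {"op_type": "update", "before_id": 1, "after_id": 1, "after_name": "gadget"},
    {"op_type": "delete", "before_id": 1},
]

for change in changes:
    apply_change(change)

# Query groups would then aggregate over this in-memory table.
print(table)
```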

Adding a Scoring Stage

To add a scoring stage:
  1. Open the required pipeline in Pipeline Editor.
  2. Right-click the stage after which you want to add a scoring stage, click Add a Stage, and then select Scoring (Beta).
  3. Enter a meaningful name and suitable description for the scoring stage and click Save.
  4. In the stage editor, select appropriate values for the following:
    1. Model name — the predictive model that you want to use in the scoring stage
    2. Model Version — the version of the predictive model
    3. Mapping — the corresponding model fields that appropriately map to the stage fields
You can add multiple scoring stages based on your use case.
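
At a high level, the mapping tells the stage which stage fields to feed into which model inputs before the model scores each event. The Python sketch below is a hypothetical illustration of that mapping step; the model, field names, and scoring function are placeholders, not the Oracle Stream Analytics API.

```python
# Hypothetical illustration of per-event scoring with a field mapping.
# The mapping, field names, and placeholder model are illustrative only.

mapping = {"temp_in": "temperature", "vib_in": "vibration"}  # model field -> stage field

def model_score(inputs):
    # Placeholder standing in for a real predictive model.
    return 0.7 * inputs["temp_in"] + 0.3 * inputs["vib_in"]

def score_event(event):
    model_inputs = {model_field: event[stage_field]
                    for model_field, stage_field in mapping.items()}
    event["score"] = model_score(model_inputs)
    return event

print(score_event({"temperature": 80.0, "vibration": 12.0}))
```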

Adding a Custom Stage

You can add custom stages, based on custom stage types previously installed through custom JARs, to extend the processing logic of your pipeline.

To add a custom stage:
  1. Open the required pipeline in Pipeline Editor.
  2. Right-click the stage after which you want to add a custom stage, click Add a Stage, then Custom, and then select Custom Stage from Custom Jars.
  3. Enter a meaningful name and suitable description for the custom stage and click Save.
  4. In the stage editor, select appropriate values for the following:
    1. Custom Stage Type — the custom stage that was previously installed through a custom jar
    2. Input Mapping — the corresponding column from the previous stage for every input parameter
You can add multiple custom stages based on your use case.

Adding a Target Stage

To add a target stage:
  1. Open the required pipeline in Pipeline Editor.
  2. Right-click the stage after which you want to add a target stage, click Add a Stage, and then select Target.
  3. Enter a meaningful name and suitable description for the target stage and click Save.
  4. In the stage editor, select a target that suits your requirement and start mapping the fields.
  5. If the existing targets do not suit your requirement, or if there is no existing target, click Create Target to create one.

Configuring a Target

A target defines the destination for the output data coming from a pipeline.

To configure a target:
  1. Open a pipeline in the Pipeline Editor.
  2. Select the target node in the pipeline.
  3. Select a target for the pipeline from the drop-down list.
  4. Map each Target Property to the corresponding Output Stream Property.

You can also create the target directly from within the Pipeline Editor. See Creating a Target for the procedure. You can also edit an existing target.

Description of create_edit_target.png follows
Description of the illustration create_edit_target.png

The pipeline is configured with the specified target.

Exporting and Importing a Pipeline and Its Dependent Artifacts

The export and import feature lets you migrate your pipeline and its contents between Oracle Stream Analytics systems (such as development and production) in a matter of a few clicks. You also have the option to migrate only selected artifacts. You can import a pipeline developed with the latest version of Oracle Stream Analytics. On re-import, the existing metadata is overwritten with the newly imported metadata if the pipeline is not published. You can delete the imported artifacts by right-clicking them and selecting Delete.

You can export and import pipelines and artifacts except for the following:
  • Cubes

  • Dashboards

  • Custom Stages

  • Visualizations

  • File Streams

  • Predictive Models

  1. In your Oracle Stream Analytics instance, under Catalog, right-click the pipeline or artifact that you want to export to another instance of Oracle Stream Analytics and then select Export.
    Your items are exported as a ZIP file.
  2. Go to the Oracle Stream Analytics instance to which you want to import the exported metadata.
  3. On the toolbar, click Import.
  4. In the Import dialog box, click Select file and then select the exported ZIP file.
  5. Click Import.
    When the metadata is imported successfully, a message similar to the following appears:

    Description of importresults.png follows
    Description of the illustration importresults.png

Publishing a Pipeline

You must publish a pipeline to make it available to all users of Oracle Stream Analytics and to send data to targets.

A published pipeline continues to run on your Spark cluster after you exit the Pipeline Editor, unlike draft pipelines, which are undeployed to release resources.

To publish a pipeline:

  1. Open a draft pipeline in the Pipeline Editor.
  2. Click Publish.
    The Pipeline Settings dialog box opens.
  3. Update any required settings.

    Note:

    Make sure to allot more memory to executors in scenarios where you have large windows.
  4. Click Publish to publish the pipeline.
    A confirmation message appears when the pipeline is published.
You can also publish a pipeline from the Catalog using the Publish option in the Actions menu.

Using the Topology Viewer

A topology is a graphical representation of the connected entities and the dependencies between the artifacts.

The topology viewer helps you identify the dependencies that a selected entity has on other entities. Understanding these dependencies helps you exercise caution when deleting or undeploying an entity. Oracle Stream Analytics supports two contexts for the topology: Immediate Family and Extended Family.

You can launch the Topology Viewer by clicking the Show Topology icon at the top-right corner of the editor.

By default, the topology of the entity from which you launch the Topology Viewer is displayed. The context of this topology is Immediate Family, which indicates that only the immediate dependencies and connections between the entity and other entities are shown. You can switch the context to display the full topology of the entity from which you launched the Topology Viewer. The topology in an Extended Family context displays all the dependencies and connections in the topology in a hierarchical manner.

Note:

The entity for which the topology is shown has a grey box surrounding it in the Topology Viewer.

Immediate Family

The Immediate Family context displays the dependencies between the selected entity and its child or parent entities.

The following figure illustrates how a topology looks in the Immediate Family.

Description of topology_viewer_immediate.png follows
Description of the illustration topology_viewer_immediate.png

Extended Family

The Extended Family context displays the dependencies between the entities in full context; that is, if an entity has a child entity and a parent entity, and the parent entity has other dependencies, all of those dependencies are shown.

The following figure illustrates how a topology looks in the Extended Family.

Description of topology_viewer_full.png follows
Description of the illustration topology_viewer_full.png