Select AI Concepts

Explores the concepts and terms related to Select AI.

Actions

An action in Select AI is a keyword that specifies the behavior Select AI performs when acting on the prompt. By specifying an action, users can instruct Select AI to process their natural language prompt to generate SQL code, respond to a chat prompt, narrate the output, display the generated SQL statement, or explain the SQL code, leveraging LLMs to interact efficiently with the data in their database environment.
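
For illustration, and assuming an AI profile named MY_AI_PROFILE already exists (the profile name and prompts are placeholders), the following sketch shows how different actions change what Select AI does with the same kind of prompt:

    -- Set the AI profile for the current session (profile name is a placeholder)
    EXEC DBMS_CLOUD_AI.SET_PROFILE('MY_AI_PROFILE');

    -- runsql (default action): generate the SQL and run it, returning query results
    SELECT AI how many customers are in San Francisco;

    -- showsql: display the generated SQL statement instead of running it
    SELECT AI showsql how many customers are in San Francisco;

    -- narrate: run the query and describe the result in natural language
    SELECT AI narrate how many customers are in San Francisco;

    -- chat: send the prompt directly to the LLM for a general response
    SELECT AI chat what is a star schema;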

See Use AI Keyword to Enter Prompts for supported Select AI actions.

AI Agent

See Select AI Agent Concepts.

AI Model

A general term encompassing various types of artificial intelligence models, including large language models (LLMs) and transformers (also referred to as embedding models), used for tasks like text generation, translation, and image recognition. An AI model is a program, trained on data, that detects patterns and makes predictions or decisions based on new inputs. Within the context of Oracle, AI model specifically refers to the various machine learning and large language models (LLMs) available through Oracle's services. See Concepts for Generative AI for more information.

AI Profile

An AI profile is a specification that includes the AI provider to use and other details regarding metadata and database objects required for generating responses to natural language prompts. See CREATE_PROFILE Procedure and Profile Attributes.
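
For illustration, the following sketch creates a minimal AI profile; the profile name, provider, credential name, and object list are placeholders for this example:

    BEGIN
      -- Profile name, credential, and object list are placeholders for this example
      DBMS_CLOUD_AI.CREATE_PROFILE(
        profile_name => 'MY_AI_PROFILE',
        attributes   => '{"provider": "oci",
                          "credential_name": "MY_AI_CRED",
                          "object_list": [{"owner": "SH", "name": "SALES"},
                                          {"owner": "SH", "name": "CUSTOMERS"}]}');
    END;
    /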

AI Provider

An AI provider in Select AI refers to the service provider that supplies the LLM, the transformer (embedding model), or both for processing and generating responses to natural language prompts. These providers offer models that can interpret and convert natural language for the use cases highlighted under the LLM concept. See Select your AI Provider and LLMs for the supported providers.

Chatbot

An AI-powered conversational agent designed to interact with users in natural language, often used for customer service or information retrieval. In the context of Select AI, the Ask Oracle chatbot helps users ask natural language questions and receive AI-generated responses backed by their database and private content. Through this UI, users can:
  • Ask natural language questions and get SQL generated automatically (NL2SQL).
  • Run queries against database tables and views using Select AI.
  • Use Retrieval-Augmented Generation (RAG) to include private document content stored in Autonomous AI Database.
  • Interact with agent teams you defined with Select AI Agent.

See Ask Oracle for more details.

Cloud Link

A cloud link establishes secure, private connectivity between Oracle Cloud Infrastructure and external cloud providers or on-premises networks, facilitating seamless data exchange. In Select AI, cloud links enable Autonomous AI Database to incorporate external data into NL2SQL interactions without public exposure, empowering users to query hybrid environments conversationally while adhering to Oracle's security standards, such as encryption and access controls, for compliant AI-driven analytics. See Use Cloud Links for Read Only Data Access on Autonomous AI Database for more details.

Conversations

Conversations in Select AI represent an interactive exchange between the user and the system, enabling users to query or interact with the database through a series of natural language prompts. Select AI incorporates session-based short-term conversations to generate context-aware responses for the current prompt based on prior interactions. With short-term conversations, up to 10 previous prompts are incorporated into the current request, creating an augmented prompt that is sent to the LLM. Select AI also supports customizable long-term conversations, configured through the conversation APIs in the DBMS_CLOUD_AI package, which let you use Select AI with different topics without mixing context. See Select AI Conversations.
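
For example, the following sketch enables conversations on an existing AI profile through its conversation attribute; the profile name is a placeholder:

    BEGIN
      -- Enable conversation history so prompts are augmented with prior interactions
      DBMS_CLOUD_AI.SET_ATTRIBUTE(
        profile_name    => 'MY_AI_PROFILE',
        attribute_name  => 'conversation',
        attribute_value => 'true');
    END;
    /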

Database Credentials

Database credentials are authentication credentials used to access and interact with databases. They typically consist of a user name and a password, sometimes supplemented by additional authentication factors like security tokens. These credentials are used to establish a secure connection between an application or user and a database, such that only authorized individuals or systems can access and manipulate the data stored within the database.
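
In Select AI, a credential commonly holds the sign-in details or API key for the chosen AI provider. A minimal sketch with placeholder values:

    BEGIN
      -- Placeholder values; for many AI providers the API key is supplied as the password
      DBMS_CLOUD.CREATE_CREDENTIAL(
        credential_name => 'MY_AI_CRED',
        username        => 'my_user',
        password        => 'my_password_or_api_key');
    END;
    /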

Database Link

A database link connects an Oracle database to remote databases, enabling transparent access to external data as if it were local. In Select AI, database links let Autonomous AI Database or on-premises Oracle AI Database extend NL2SQL capabilities to federated sources, supporting natural language queries that securely span on-premises or other cloud environments. See CREATE DATABASE LINK and Use Database Links with Autonomous AI Database for more details.
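
As an illustrative sketch (the link name, user, password, and connect string are placeholders), a database link can be created with standard SQL and then queried like a local object:

    -- Create a link to a remote Oracle database (placeholder names and connect string)
    CREATE DATABASE LINK sales_link
      CONNECT TO remote_user IDENTIFIED BY remote_password
      USING 'remote_tns_alias';

    -- Query a remote table through the link as if it were local
    SELECT COUNT(*) FROM orders@sales_link;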

Embedding Model

An AI model that converts input data into vector embeddings to capture semantic relationships, often used in tasks like language understanding and image recognition. Select AI uses embedding models to compute embeddings for your documents, tables, and query text. These embeddings power semantic search, RAG workflows, similarity scoring, and relevance ranking inside Autonomous AI Database.

Hallucination in LLM

Hallucination in the context of Large Language Models refers to a phenomenon where the model generates text that is incorrect, nonsensical, or unrelated to the input prompt. Despite being a result of the model's attempt to generate coherent text, these responses can contain information that is fabricated, misleading, or purely imaginative. Hallucination can occur due to biases in training data, lack of proper context understanding, or limitations in the model's training process.

IAM

Oracle Cloud Infrastructure Identity and Access Management (IAM) lets you control who has access to your cloud resources. You can control what type of access a group of users has and to which specific resources. To learn more, see Overview of Identity and Access Management.

Iterative Refinement

Iterative refinement is a process of gradually improving a solution or a model through repeated cycles of adjustments based on feedback or evaluation. It starts with an initial approximation, refines it step by step, and continues until the desired accuracy or outcome is achieved. Each iteration builds on the previous one, incorporating corrections or optimizations to move closer to the goal.

In text summary generation, iterative refinement can be useful for processing large files or documents. The process splits the text into manageable-sized chunks, for example, that fit within an LLM's token limits, generates a summary for one chunk, and then improves the summary by sequentially incorporating the following chunks.
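
The following PL/SQL sketch illustrates the idea using DBMS_CLOUD_AI.GENERATE with the chat action; the chunking, prompt wording, and profile name are assumptions and not the internal Select AI implementation:

    DECLARE
      -- Hypothetical pre-split chunks of a large document
      TYPE chunk_list IS TABLE OF CLOB;
      l_chunks  chunk_list := chunk_list('...chunk 1 text...',
                                         '...chunk 2 text...',
                                         '...chunk 3 text...');
      l_summary CLOB;
    BEGIN
      -- Summarize the first chunk
      l_summary := DBMS_CLOUD_AI.GENERATE(
                     prompt       => 'Summarize the following text: ' || l_chunks(1),
                     profile_name => 'MY_AI_PROFILE',
                     action       => 'chat');
      -- Refine the summary by folding in each subsequent chunk
      FOR i IN 2 .. l_chunks.COUNT LOOP
        l_summary := DBMS_CLOUD_AI.GENERATE(
                       prompt       => 'Refine this summary: ' || l_summary ||
                                       ' using this additional text: ' || l_chunks(i),
                       profile_name => 'MY_AI_PROFILE',
                       action       => 'chat');
      END LOOP;
      DBMS_OUTPUT.PUT_LINE(l_summary);
    END;
    /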

Use cases for iterative refinement:

  • Best suited for situations where contextual accuracy and coherence are critical, such as when summarizing complex or highly interconnected texts where each part builds on the previous.
  • Ideal for smaller-scale tasks where sequential processing is acceptable.

See Summarization Techniques.

Large Language Model (LLM)

A Large Language Model (LLM) refers to an advanced type of artificial intelligence model that is trained on massive amounts of text data to support a range of use cases depending on their training data. This includes understanding and generating human-like language as well as software code and database queries. These models are capable of performing a wide range of natural language processing tasks, including text generation, translation, summarization, question answering, sentiment analysis, and more. LLMs are typically based on sophisticated deep learning neural network models that learn patterns, context, and semantics from the input data, enabling them to generate coherent and contextually relevant text.

MapReduce

In general, the MapReduce programming model enables processing large-volume data by dividing tasks into two phases: Map and Reduce.
  • Map: Processes input data and transforms it into key-value pairs.
  • Reduce: Aggregates and summarizes the mapped data based on keys.

MapReduce performs parallel processing of large data sets.

In the case of Select AI Summarize, MapReduce partitions text into multiple chunks and processes them in parallel and independently, generating individual summaries for each chunk. These summaries are then combined to form a cohesive overall summary.
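
Conceptually, the map step summarizes each chunk independently and the reduce step combines the partial summaries, as in this sketch using DBMS_CLOUD_AI.GENERATE with an assumed profile (not the internal Select AI implementation):

    DECLARE
      TYPE chunk_list IS TABLE OF CLOB;
      l_chunks   chunk_list := chunk_list('...chunk 1...', '...chunk 2...', '...chunk 3...');
      l_partials CLOB;
      l_summary  CLOB;
    BEGIN
      -- Map: summarize each chunk independently (these calls could run in parallel)
      FOR i IN 1 .. l_chunks.COUNT LOOP
        l_partials := l_partials ||
                      DBMS_CLOUD_AI.GENERATE(
                        prompt       => 'Summarize: ' || l_chunks(i),
                        profile_name => 'MY_AI_PROFILE',
                        action       => 'chat') || CHR(10);
      END LOOP;
      -- Reduce: combine the per-chunk summaries into one overall summary
      l_summary := DBMS_CLOUD_AI.GENERATE(
                     prompt       => 'Combine these summaries into one summary: ' || l_partials,
                     profile_name => 'MY_AI_PROFILE',
                     action       => 'chat');
      DBMS_OUTPUT.PUT_LINE(l_summary);
    END;
    /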

Use cases for MapReduce:

  • Best suited for large-scale, parallel tasks where speed and scalability are priorities, such as summarizing very large data sets or documents.
  • Ideal for situations where chunk independence is acceptable, and the summaries can be aggregated later.

See Summarization Techniques.

Metadata

Metadata is data that describes data. In the case of Select AI, metadata is database metadata, which refers to the data that describes the structure, organization, and properties of the database tables and views.

For database tables and views, metadata includes column names and types, constraints and keys, view definitions, relationships, lineage, quality and freshness indicators, security classifications, and access policies. Well-managed metadata enables discoverability, correct usage, performance tuning, and compliance. Select AI augments NL2SQL prompts with table metadata that includes the table definition (table name, column names and their data types), and optionally table and column comments, annotations, and constraints.

Metadata Clone

A metadata clone or an Autonomous AI Database clone creates a copy of the metadata defining a database or schema, containing only the structure, not the actual data. This clone includes tables, indexes, views, statistics, procedures, and triggers without any data rows. Developers, testers, or those building database templates find this useful. To learn more, see Clone, Move, or Upgrade an Autonomous AI Database Instance.

Metadata Enrichment

The practice of augmenting database schemas with high-quality descriptions, comments, and annotations so an LLM can better understand the intent of tables and columns, clarify business meaning, and generate more accurate SQL. It turns bare table or column names into well-documented assets with clear intent, relationships, and constraints, as illustrated in the example after the following list.

Candidate information to include:

  • Table and column descriptions: purpose, business definitions, units, and allowed value ranges
  • Keys and relationships: primary/foreign keys, join paths
  • Data semantics: time granularity, slowly changing dimensions, deduplication rules
  • Constraints and quality: nullability, uniqueness, validation rules, data freshness
  • Synonyms and aliases: common business terms that map to technical names
  • Examples and patterns: sample values, common filters or aggregations
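
For example, standard SQL comments are one simple way to capture such descriptions; the table and column names below are illustrative:

    -- Describe the business purpose of a table and its columns so the LLM
    -- can map natural language terms to the right objects
    COMMENT ON TABLE sh.sales IS
      'Point-of-sale transactions; one row per product sold per order line';
    COMMENT ON COLUMN sh.sales.amount_sold IS
      'Sale amount in US dollars, net of discounts';
    COMMENT ON COLUMN sh.sales.time_id IS
      'Date of the transaction at day granularity';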

See Overview of AI Enrichment to learn more about adding such metadata using Oracle SQL Developer for VS Code in Visual Studio Code.

Natural Language Prompt

A natural language prompt consists of instructions, questions, or input statements expressed in everyday human language (such as English) that guide an LLM's response. Instead of requiring code or specialized syntax, users interact with the LLM by typing sentences or phrases that describe their intent, ask for information, or specify a task.

For example:

  • "What is the revenue in the last quarter in each corporate region?"
  • "What is our internal corporate policy on parental leave?"
  • "Summarize this article."
  • "Write an email to a customer apologizing for a delayed shipment."
  • "What are the key differences between SQL and NoSQL databases?"

These prompts leverage the model’s understanding of human language to generate useful, contextually relevant outputs. Natural language prompts are central to LLM usability, making advanced AI capabilities accessible to users without technical expertise.

Network Access Control List (ACL)

A Network Access Control List is a set of rules or permissions that define what network traffic is allowed to pass through a network device, such as a router, firewall, or gateway. ACLs are used to control and filter incoming and outgoing traffic based on various criteria such as IP addresses, port numbers, and protocols. They play a crucial role in network security by enabling administrators to manage and restrict network traffic to prevent unauthorized access, potential attacks, and data breaches.
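
In Select AI, an ACL typically must grant the database user outbound HTTP access to the AI provider's endpoint. A minimal sketch, with the host and principal name as placeholders:

    BEGIN
      -- Allow the database user SELECT_AI_USER to make HTTP calls to an AI provider host
      -- (host and principal are placeholders for this example)
      DBMS_NETWORK_ACL_ADMIN.APPEND_HOST_ACE(
        host => 'api.openai.com',
        ace  => xs$ace_type(privilege_list => xs$name_list('http'),
                            principal_name => 'SELECT_AI_USER',
                            principal_type => xs_acl.ptype_db));
    END;
    /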

NL2SQL

Natural Language to SQL (NL2SQL) converts natural language questions into SQL statements using generative AI.

Select AI actively uses NL2SQL to interpret user prompts and generate correct, runnable SQL against your Autonomous AI Database or connected external sources. This enables business users to ask questions like “Show me last quarter’s revenue by region” and receive accurate SQL queries and results without SQL expertise.
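
The same capability is available programmatically; for example, this sketch (profile name assumed) uses DBMS_CLOUD_AI.GENERATE to return the generated SQL without running it:

    -- Return the SQL that would answer the question, without executing it
    SELECT DBMS_CLOUD_AI.GENERATE(
             prompt       => 'Show me last quarter''s revenue by region',
             profile_name => 'MY_AI_PROFILE',
             action       => 'showsql')
    FROM dual;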

ONNX

ONNX (Open Neural Network Exchange) is an open standard format for representing machine learning and deep-learning models. ONNX standardizes the representation and interchange of machine learning models across frameworks, enabling seamless deployment and interoperability. See ONNX for more details.

Select AI can use generative AI models exported in ONNX format to run AI workloads directly inside Autonomous AI Database or through supported runtimes, enabling organizations to leverage pre-trained models for natural language processing tasks like query generation. By using ONNX models, you keep inference close to your data, reduce data movement, and enable consistent model processing across different tools and environments, ensuring compliant AI operations.

ONNX Runtime

ONNX Runtime runs ONNX-formatted models efficiently across hardware platforms, optimizing inference for real-time AI applications.

Select AI users can specify in-database ONNX-format models in their AI profile in support of RAG. ONNX Runtime is embedded in Oracle AI Database 26ai and Autonomous AI Database, so using it avoids sending content to an external engine to produce, for example, vector embeddings. ONNX Runtime powers the runtime evaluation of transformer-based models within Autonomous AI Database, letting developers load ONNX models and then compute embeddings, classify data, or run inference inside the database engine without sending data to an external service. Keeping this processing in the database supports fast natural language to SQL (NL2SQL) conversion and improves performance, security, latency, and governance. See Example: Select AI with In-database Transformer Models and ONNX Runtime for more details.
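
As a rough sketch (the directory object, file name, and model name are assumptions, and parameters can vary by release), an ONNX embedding model can be loaded into the database and then used to compute embeddings directly in SQL:

    BEGIN
      -- Load an ONNX embedding model from a directory object into the database
      DBMS_VECTOR.LOAD_ONNX_MODEL(
        directory  => 'DATA_PUMP_DIR',
        file_name  => 'all_minilm_l12_v2.onnx',
        model_name => 'DOC_EMBED_MODEL');
    END;
    /

    -- Compute a vector embedding in the database with the loaded model
    SELECT VECTOR_EMBEDDING(DOC_EMBED_MODEL USING 'database security policy' AS data)
    FROM dual;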

Private Endpoint

A private endpoint establishes a secure, dedicated connection that restricts access to specific services or resources, ensuring isolated communication. In Select AI, organizations can configure private endpoints in Oracle Cloud Infrastructure (OCI) to connect with privately hosted LLMs, such as Ollama or Llama.cpp on virtual machines (VMs), addressing security needs by processing AI workloads within the Oracle Virtual Cloud Network. This setup includes a public subnet with a jump server for controlled access and a private subnet housing the Autonomous AI Database and AI models, preventing internet exposure and keeping all components compliant with enterprise isolation requirements.

Retrieval Augmented Generation (RAG)

Retrieval Augmented Generation (RAG) is a technique that involves retrieving relevant information for a user's query and supplying that information to a large language model (LLM) to improve responses and reduce hallucination.

Most commonly, RAG involves vector search, but more generally it includes augmenting a prompt with database content (either manually or automatically), such as schema metadata for SQL generation or database content that is explicitly queried. Other forms of augmentation can involve technologies such as graph analytics and traditional machine learning.
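
In Select AI, RAG is typically set up by creating a vector index over your documents and associating it with an AI profile. The following is a rough sketch only; the index name, Object Storage location, and attribute values are assumptions, and additional attributes (for example, an object storage credential) are typically required:

    BEGIN
      -- Index document content so relevant chunks can be retrieved and added to prompts;
      -- the names, location, and attribute values are placeholders
      DBMS_CLOUD_AI.CREATE_VECTOR_INDEX(
        index_name => 'MY_DOC_INDEX',
        attributes => '{"vector_db_provider": "oracle",
                        "location": "https://objectstorage.../my-docs/",
                        "profile_name": "MY_AI_PROFILE",
                        "vector_distance_metric": "cosine"}');
    END;
    /

Once the AI profile references the vector index, actions such as narrate can answer questions from the indexed content.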

Semantic Similarity Search

Semantic similarity search identifies and retrieves data points that closely match a given query by comparing feature vectors in a vector store.
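
For example, a typical similarity query in Oracle SQL looks like the following sketch, assuming a doc_chunks table with a VECTOR column and a bind variable holding the query vector:

    -- Return the five chunks closest to the query vector by cosine distance
    SELECT chunk_id, chunk_text
    FROM   doc_chunks
    ORDER  BY VECTOR_DISTANCE(chunk_vector, :query_vector, COSINE)
    FETCH FIRST 5 ROWS ONLY;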

Sidecar

The sidecar architecture allows one database to act as the central metadata repository for both local and remote data sources, that is, Oracle and non-Oracle. Select AI uses this architecture by leveraging the metadata to build an augmented prompt that is sent to the user’s chosen LLM, which then generates a federated SQL query. A key benefit of the sidecar is that it enables data to remain in its original location, eliminating the need for data duplication or complex ETL processes.

It supports federated access to diverse external systems, such as BigQuery, Redshift, and other multi-cloud or on-premises databases, by securely bridging these sources to Autonomous AI Database.

Similarity Threshold

A similarity threshold sets a minimum score to classify two items as related, filtering results based on their vector proximity or distance. In Select AI, the similarity threshold helps filter results that fall below a required level of semantic closeness, ensuring that only highly related document chunks, rows, or embeddings are returned.

Synthetic Data Generation

In the context of Select AI, Synthetic Data Generation is the capability to automatically generate artificial data that conforms to your database schema, enabling you to populate tables for development, testing, training, or proof-of-concept scenarios without using sensitive or production data. Select AI provides the DBMS_CLOUD_AI.GENERATE_SYNTHETIC_DATA procedure to produce synthetic data sets. See Synthetic Data Generation for more details.
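
A minimal hedged sketch, assuming an AI profile MY_AI_PROFILE and a target table SH.CUSTOMERS; all values are illustrative:

    BEGIN
      -- Generate 100 artificial rows that conform to the CUSTOMERS table definition
      DBMS_CLOUD_AI.GENERATE_SYNTHETIC_DATA(
        profile_name => 'MY_AI_PROFILE',
        object_name  => 'CUSTOMERS',
        owner_name   => 'SH',
        record_count => 100);
    END;
    /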

Transformer

A type of deep learning model architecture commonly used for natural language processing tasks, such as vector embedding generation or text generation and translation. In Select AI, transformer-based LLMs drive the conversion of user queries into SQL queries that can be run within your database.

Vector

In the context of semantic similarity search, a vector is a mathematical representation that captures the semantic meaning of data points, such as words, documents, or images, in a multi-dimensional space.

In the context of Select AI, vectors support retrieval augmented generation by capturing the meaning of text content to enable fast semantic retrieval from the database.

Vector Database

A database that stores vector embeddings, which are mathematical representations of data points used in AI applications to support efficient semantic similarity search. Oracle Autonomous AI Database and Oracle AI Database serve as vector databases with optimized vector indexes.

In Select AI, the vector database component (powered by Oracle AI Vector Search) indexes embeddings generated from enterprise data. This enables natural language queries to retrieve semantically similar results, improves relevance for AI-powered search and RAG workflows, and provides seamless integration with Oracle Cloud environments.

Vector Distance

Vector distance measures the similarity or dissimilarity between feature vectors by calculating the distance between them in a multidimensional space.

Vector Index

A vector index organizes and stores vectors to enable efficient similarity search and retrieval of related data.
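
As an illustrative sketch (the table, column, and index names are assumptions), a vector index over a VECTOR column can be created with Oracle SQL:

    -- Create a neighbor-partition (IVF) vector index using cosine distance
    CREATE VECTOR INDEX doc_chunks_vec_idx
      ON doc_chunks (chunk_vector)
      ORGANIZATION NEIGHBOR PARTITIONS
      DISTANCE COSINE
      WITH TARGET ACCURACY 95;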

Vector Store

A vector store is any system that stores, manages, and enables semantic similarity search over vector embeddings. This includes standalone vector databases and Oracle AI Database 26ai AI Vector Search.