Ingestion API

The Ingestion API handles document retrieval, chunking, and the creation of vector embeddings. It is one of the key differentiators between Coherence RAG and its competitors, because it uses the full processing power of the Coherence RAG cluster to parallelize every stage of ingestion: documents are loaded and split in parallel, and vector embeddings are then created in parallel.

To start document ingestion, submit a list of document URIs to the ingestion endpoint:

Endpoint
 http request
 POST /api/kb/<storeName>/docs
 
 Sample Request Payload
 json
 [
  "https://docs.oracle.com/en/database/oracle/oracle-database/23/vecse/ai-vector-search-users-guide.pdf",
  "https://docs.oracle.com/en/middleware/standalone/coherence/14.1.1.2206/manage/managing-oracle-coherence.pdf",
  "https://docs.oracle.com/en/middleware/standalone/coherence/14.1.1.2206/secure/securing-oracle-coherence.pdf",
  "oci.os://odx-stateservice/docs/coherence/administering-http-session-management-oracle-coherenceweb.pdf"
 ]
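
As a rough client-side illustration, the request above could be assembled with Python's standard library as follows. The host, port, and store name (`localhost:7001`, `my-store`) are hypothetical placeholders, and the `Content-Type: application/json` header is an assumption based on the JSON payload rather than something this section states. The sketch only builds the request; the commented lines at the end show how it might be sent to a running server.

```python
import json
import urllib.request
from urllib.parse import quote


def build_ingest_request(base_url, store_name, uris):
    """Build (but do not send) a POST request for the ingestion endpoint."""
    # Percent-encode the store name so it is safe to place in the URL path.
    url = f"{base_url.rstrip('/')}/api/kb/{quote(store_name, safe='')}/docs"
    # The payload is a plain JSON array of document URIs.
    body = json.dumps(uris).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        # Assumed header: the payload is JSON, so we label it as such.
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host, port, and store name; substitute your deployment's values.
request = build_ingest_request(
    "http://localhost:7001",
    "my-store",
    [
        "https://docs.oracle.com/en/middleware/standalone/coherence/14.1.1.2206/manage/managing-oracle-coherence.pdf",
        "oci.os://odx-stateservice/docs/coherence/administering-http-session-management-oracle-coherenceweb.pdf",
    ],
)

# To submit the request against a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Keeping request construction separate from sending makes it easy to inspect the URL and payload before any documents are actually ingested.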

The protocol (scheme) component of each URI determines which document loader is used to retrieve the document, as described in the [Document Loaders](#document-loaders) section later in this document.
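
To make the idea of the protocol component concrete, the sketch below extracts it from each sample URI on the client side and groups the URIs accordingly. It is only an illustration of how the scheme can be read from a URI, not a description of how the server selects loaders; the helper name `group_by_scheme` is hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_scheme(uris):
    """Group document URIs by their protocol (scheme) component."""
    groups = defaultdict(list)
    for uri in uris:
        # urlsplit lower-cases the scheme; dots are legal, so "oci.os" parses as one scheme.
        scheme = urlsplit(uri).scheme
        if not scheme:
            raise ValueError(f"URI has no protocol component: {uri!r}")
        groups[scheme].append(uri)
    return dict(groups)


uris = [
    "https://docs.oracle.com/en/database/oracle/oracle-database/23/vecse/ai-vector-search-users-guide.pdf",
    "https://docs.oracle.com/en/middleware/standalone/coherence/14.1.1.2206/manage/managing-oracle-coherence.pdf",
    "https://docs.oracle.com/en/middleware/standalone/coherence/14.1.1.2206/secure/securing-oracle-coherence.pdf",
    "oci.os://odx-stateservice/docs/coherence/administering-http-session-management-oracle-coherenceweb.pdf",
]

# Print each scheme with the number of URIs that use it.
for scheme, members in group_by_scheme(uris).items():
    print(scheme, len(members))
```

A check like this can be useful before submitting a batch, for example to confirm that every URI carries a scheme at all.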