Quick Start
The easiest way to get started with Coherence RAG is to deploy the pre-built `coherence-rag-server` container image to Kubernetes using the [Coherence Operator](https://oracle.github.io/coherence-operator/docs/latest/#/docs/about/01_overview).
For example, to deploy a 3-member Coherence RAG cluster that uses the built-in `all-MiniLM-L6-v2` embedding model and the OpenAI `gpt-4o-2024-08-06` chat model, you could use the following deployment YAML:
coherence-rag-demo.yaml
```yaml
apiVersion: coherence.oracle.com/v1
kind: Coherence
metadata:
  name: coherence-rag-demo
spec:
  replicas: 3
  image: ghcr.io/coherence-community/coherence-rag-server:15.1.1-0-0
  cluster: coherence-rag-demo
  env:
    - name: MODEL_EMBEDDING
      value: -/all-MiniLM-L6-v2
    - name: MODEL_CHAT
      value: OpenAI/gpt-4o-2024-08-06
    - name: OPENAI_API_KEY
      valueFrom:
        secretKeyRef:
          name: openai-api-key
          key: key
  jvm:
    memory:
      heapSize: 16g
  ports:
    - name: server
      port: 7001
```
The example above exposes the Coherence RAG REST API on port 7001 on each pod. It also creates a `coherence-rag-demo-server` Kubernetes service that maps to that port on all of the pods, allowing you to expose the REST API via an ingress (see the sketch below), or to forward a local port to it for testing.
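As an illustration of the ingress option, a minimal Ingress sketch targeting the generated service might look like the following. This is not part of the demo: it assumes an NGINX ingress controller is installed in your cluster, and `rag.example.com` is a placeholder hostname.

```yaml
# Hypothetical ingress for the coherence-rag-demo-server service;
# adjust ingressClassName and host for your environment
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: coherence-rag-demo-ingress
spec:
  ingressClassName: nginx
  rules:
    - host: rag.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: coherence-rag-demo-server
                port:
                  number: 7001
```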
Note:
For security reasons, the OpenAI API key required by the chat model is passed as an environment variable that references a Kubernetes secret. To create the secret, run the following command, specifying your own OpenAI API key within the `from-literal` argument:
```bash
kubectl create secret generic openai-api-key --from-literal=key=sk-...
```
Now that we have the secret configured, we can deploy our demo cluster:
```bash
kubectl apply -f coherence-rag-demo.yaml
```
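Before moving on, you may want to confirm that all three members started successfully. A quick sketch of how to check (the `coherenceCluster` pod label is applied by the Coherence Operator, but verify it against your operator version):

```bash
# Check the status of the Coherence resource created above
kubectl get coherence coherence-rag-demo -n default

# Watch the three member pods become ready (assumes the operator's
# standard coherenceCluster pod label)
kubectl get pods -l coherenceCluster=coherence-rag-demo -n default -w
```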
Finally, we can forward local port 7001 to the `coherence-rag-demo-server` service, which will allow us to make the REST API calls described in the following sections to ingest documents, perform vector searches, and augment chat conversations with the results of those searches:
```bash
kubectl port-forward service/coherence-rag-demo-server 7001:7001 -n default
```
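With the tunnel in place, a quick way to verify that the server is reachable is to issue any HTTP request against the forwarded port. The actual endpoint paths are covered in the following sections; this is just a connectivity check:

```bash
# Any HTTP response (even a 404) confirms the port-forward is working
curl -i http://localhost:7001/
```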