Security Considerations

Scope: This article covers security considerations for the Agent Memory Python SDK. It applies to applications that use the SDK's active-memory features as well as to applications that use only the store layer.

Why it matters: Agent Memory can persist thread content and memory records in Oracle AI Database and, when LLM-backed features are enabled, send content to configured model endpoints for summarization, memory extraction, or embeddings. Secure deployment therefore depends on careful handling of application data, retrieval scope, database access, external model endpoints, and retention policies.

Considerations Regarding LLM-Backed Memory Processing

Agent Memory supports active-memory features such as thread summarization and automatic memory extraction. When these features are enabled, the SDK may send recent messages, thread summaries, retrieved memories, or search text to the configured LLM or embedding endpoint.

Note: Only send content to Agent Memory that is appropriate for the configured model endpoint and your deployment policies. If active-memory features are enabled for data that may include secrets, credentials, or unnecessary sensitive data, minimize or redact that content before messages enter the memory pipeline. Treat extracted memories, summaries, context cards, and other model-derived text as untrusted output that must be reviewed and handled safely by the integrating application.
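As one way to minimize content before it enters the memory pipeline, the following is a minimal sketch of a redaction step that an application could run on message text first. The function name redact_secrets and both patterns are illustrative assumptions, not part of the SDK, and pattern-based redaction will miss secrets that do not match these shapes.

```python
import re

# Illustrative patterns only; tune them to the secret formats your application sees.
_KEY_VALUE_SECRET = re.compile(
    r"(?i)\b(password|passwd|secret|api[_-]?key|token)\b(\s*[:=]\s*)(\S+)"
)
_PEM_PRIVATE_KEY = re.compile(
    r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
    re.DOTALL,
)


def redact_secrets(text: str, placeholder: str = "[REDACTED]") -> str:
    """Return text with credential-like values replaced by a placeholder."""
    # Replace whole PEM private key blocks first.
    redacted = _PEM_PRIVATE_KEY.sub(placeholder, text)
    # Keep the key name and separator, replace only the value.
    return _KEY_VALUE_SECRET.sub(
        lambda m: f"{m.group(1)}{m.group(2)}{placeholder}", redacted
    )
```

The application would call redact_secrets() on each message before handing it to the SDK, so that neither the stored message nor any prompt built from it carries the original value.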

Warning: Model-derived text can become persistent memory state. When automatic extraction, summarization, or context-card features are enabled, a summary, extracted memory, or retrieved record may be inserted by the SDK into later prompts, such as memory-extraction, summarization, context-card, or agent prompts, before the application can review that specific intermediate value. Treat this as normal untrusted LLM data flow: review and validate the outputs your application consumes, and do not let memory-derived content authorize privileged actions or bypass policy.

Follow these recommendations when using active-memory features:

Validate and minimize application data: Review which messages, metadata, and IDs your application sends into the SDK. Avoid passing more data than the memory workflow needs.
Use trusted model endpoints: Configure LLM and embedding endpoints that meet your requirements for transport security, data residency, retention, and operational monitoring.
Treat generated memory as application data and untrusted output: Extracted memories, summaries, and context cards are derived outputs. Review how your application uses them, especially before they influence privileged actions, external tool calls, or customer-visible decisions.
Account for persistent prompt injection: Caller-provided, retrieved, or model-derived text stored in memory can be replayed into later summarization, extraction, context-card, or agent prompts. Prompt delimiters, escaping, and extraction instructions can help structure model input, but they are not a security boundary. Review extracted memories, summaries, context cards, and other persisted or prompt-bound intermediate text before relying on them. If your workflow requires review before model-derived text can influence future extraction or context construction, disable automatic extraction and use explicit memory writes or another application-controlled review gate.
Sanitize or escape derived text for its destination: If extracted memories, summaries, context cards, or other model-derived text are rendered into HTML, Markdown, templates, logs, or other output surfaces, apply context-appropriate escaping or sanitization. Use the same care before reusing derived text in downstream prompts, tool inputs, commands, or other interpreter-like contexts.
Choose the right operating mode: If your application needs review before model-derived text can influence later extraction or context construction, consider using explicit memory writes, store-only integrations, or extract_memories=False for workflows that should not perform automatic extraction.
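For workflows that need review before model-derived text becomes persistent memory, an application-controlled review gate might look like the sketch below. It assumes automatic extraction is disabled (for example with extract_memories=False) and that the application performs explicit writes. ReviewGate, PendingMemory, and the write_memory callable are hypothetical names; write_memory stands in for whatever explicit write path the application wraps.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PendingMemory:
    user_id: str
    text: str


@dataclass
class ReviewGate:
    # Hypothetical writer supplied by the application; it is called only on approval.
    write_memory: Callable[[str, str], None]
    pending: List[PendingMemory] = field(default_factory=list)

    def propose(self, user_id: str, text: str) -> int:
        """Queue candidate memory text and return its index in the queue."""
        self.pending.append(PendingMemory(user_id, text))
        return len(self.pending) - 1

    def approve(self, index: int) -> None:
        """Persist one reviewed item through the explicit write path."""
        item = self.pending.pop(index)
        self.write_memory(item.user_id, item.text)

    def reject(self, index: int) -> None:
        """Discard an item without persisting it."""
        self.pending.pop(index)
```

Nothing reaches the store until approve() is called, so rejected text cannot be replayed into later extraction, summarization, or agent prompts.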

Considerations Regarding Persistence and Data Minimization

Agent Memory is designed to persist messages, memories, metadata, and embeddings in Oracle AI Database when the DB-backed store is used. This allows durable retrieval and cross-session memory, but it also means the application should plan what data is appropriate to retain.

The following guidance helps keep deployments aligned with secure data-handling practices:

For store-only usage, persist only what is needed: Design your application so that only useful, business-appropriate content is written to the memory store.
When active-memory features are enabled, plan for derived records: In addition to caller-provided content such as messages and metadata, a workflow may also persist extracted memories, summaries, or embeddings.
Treat write-capable memory paths as trusted: Database credentials and backend code paths that can write messages, summaries, memories, metadata, embeddings, or thread runtime state can affect future prompts and retrieval results. Active-memory features intentionally persist model-derived state; if that is not appropriate for a workflow, disable automatic extraction or use a store-only/manual-write integration with narrower application controls.
Choose the right deletion scope for retention work: delete_message() removes the raw message record only. Derived memories or other downstream thread-scoped artifacts created from that message can remain searchable because extracted memories do not currently persist per-message provenance. When you need thread-scoped cleanup that also removes associated memories and managed vector/chunk data, use OracleAgentMemory.delete_thread().
Define retention and deletion policies up front: If your application offers deletion or retention commitments, make sure they cover raw messages, extracted memories, metadata, and other related records created by the workflow.
Avoid relying on memory as a source of truth: Stored memories are intended to improve context and retrieval. Applications should continue to rely on authoritative systems for important decisions.
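A retention policy that relies on thread-scoped deletion could be sketched as follows. The last_activity mapping is assumed to be maintained by the application, and delete_thread is an injected callable that the application would implement around OracleAgentMemory.delete_thread(); the exact SDK signature is not shown in this article, so it is not called directly here.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, List, Optional


def expired_thread_ids(
    last_activity: Dict[str, datetime], retention: timedelta, now: datetime
) -> List[str]:
    """Return IDs of threads whose last activity is older than the retention window."""
    cutoff = now - retention
    return sorted(tid for tid, ts in last_activity.items() if ts < cutoff)


def purge_expired_threads(
    last_activity: Dict[str, datetime],
    retention: timedelta,
    delete_thread: Callable[[str], None],  # hypothetical wrapper around the SDK call
    now: Optional[datetime] = None,
) -> List[str]:
    """Delete every expired thread and return the IDs that were deleted."""
    now = now or datetime.now(timezone.utc)
    expired = expired_thread_ids(last_activity, retention, now)
    for thread_id in expired:
        delete_thread(thread_id)
    return expired
```

Deleting at thread scope rather than message by message matches the guidance above: it avoids leaving derived memories searchable after the raw messages are gone.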

Considerations Regarding Retrieval Scope and Access Control

Agent Memory uses caller-provided user_id, agent_id, and thread_id values to scope retrieval. This is a powerful filtering model, but it should not be the only control your application relies on when deciding how retrieved content is used or shown.

By default, thread-scoped retrieval uses exact matching for user_id and agent_id and a broader match for thread_id so relevant results can span past threads for the same user-agent pair. Top-level OracleAgentMemory.search() and search_async() calls require explicit user scoping and exact user matching: they reject calls that omit the user scope or pass exact_user_match=False, so the public client API does not accidentally search across multiple users. Passing user_id=None is allowed only with exact user matching and targets only unscoped records.

Use the following practices when designing retrieval:

Map application rules to memory scope: Ensure that the scopes your application passes to the SDK match your tenant, user, and data-sharing rules.
Pass an explicit user scope on every client search: Derive the user_id from the authenticated request context rather than from request JSON or other caller-controlled input, and provide it on each top-level OracleAgentMemory.search() or search_async() call. Use user_id=None only for workflows intentionally restricted to unscoped records.
Prefer the narrowest scope that satisfies the use case: Use exact matching and tighter filters for workflows that handle more sensitive data.
Review cross-thread retrieval intentionally: Broader retrieval can improve continuity across sessions, but applications should enable it only where that approach is appropriate.
Treat search results as retrieved content, not final decisions: Returned memories may be relevant, but the application remains responsible for deciding whether and how they should be shown or acted on.
Handle retrieved text safely at the integration boundary: Retrieved records can include caller-provided or model-derived text. If retrieved memories or other returned text are rendered into HTML, Markdown, templates, logs, or other output surfaces, apply context-appropriate escaping or sanitization before displaying it, transforming it, logging it, or passing it to downstream systems.
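Context-appropriate escaping at the integration boundary can be illustrated with two small helpers, one for HTML output and one for log lines. Both function names are assumptions made for this sketch; it uses only the Python standard library and does not cover other destinations such as Markdown, templates, or tool inputs.

```python
import html


def render_memory_html(text: str) -> str:
    """Escape retrieved text before placing it inside an HTML list item."""
    return f"<li>{html.escape(text, quote=True)}</li>"


def sanitize_for_log(text: str, max_len: int = 200) -> str:
    """Replace non-printable characters and truncate before logging.

    Replacing CR/LF keeps stored text from forging extra log lines.
    """
    cleaned = "".join(ch if ch.isprintable() else " " for ch in text)
    return cleaned[:max_len]
```

Each destination needs its own treatment; HTML escaping does not make text safe for a shell command or a downstream prompt.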

Considerations Regarding Application Integration and Caller Trust

Agent Memory is meant to be called by the integrating application or other trusted backend code, not directly by end users. Raw memory APIs are not an end-user-facing security boundary, and they do not perform end-user authentication or authorization on their own. The package trusts the caller to provide the correct user_id, agent_id, thread_id, and retrieval scope for each operation.

Note: The integrating application is responsible for authenticating the end user, authorizing access, and deriving the correct user_id and scope before it calls Agent Memory APIs. A caller-supplied user_id is a scoping value, not proof of identity.

Use the following practices when integrating the SDK into an agentic application:

Treat user_id as security-sensitive application input: If the integrating application derives user_id from request JSON or other caller-controlled input instead of authenticated context, that can allow cross-user memory access. Derive user_id from your authenticated application context instead of letting end users choose arbitrary values.
Apply application authorization before every memory call: The integrating application must decide which user_id, agent_id, thread_id, and search scope values are valid for the current request and keep reads and writes inside the intended tenant and user boundary.
Do not expose raw memory APIs to end users: Package APIs such as add_memory or search helpers should be wrapped in application logic that validates the caller, enforces policy, and controls what data can be written or returned.
Keep user-ID discovery and enumeration privileged: If the package adds helpers for listing or enumerating user_id values, treat them as administrative capabilities only and never expose them to end users through the integrating application.
Review scope overrides carefully: Any workflow that broadens thread scope, disables exact matching, or drops to lower-level store APIs should be restricted to trusted components and reviewed for cross-user or cross-tenant effects.
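The practices above can be sketched as a thin application wrapper that takes the user scope from authenticated context and ignores any user_id in the request body. AuthContext and search_memories_for_request are hypothetical names, and the search callable is a stand-in for an application function that calls OracleAgentMemory.search(); its (query, user_id=...) shape is an assumption for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class AuthContext:
    # Populated by the application's authentication layer, never by request JSON.
    user_id: str


def search_memories_for_request(
    auth: AuthContext,
    request_json: Dict[str, Any],
    search: Callable[..., List[Any]],  # hypothetical wrapper around the SDK search
) -> List[Any]:
    """Run a memory search scoped to the authenticated user only."""
    query = str(request_json.get("query", ""))
    if not query.strip():
        raise ValueError("query must not be empty")
    # Any request_json["user_id"] is deliberately ignored: it is caller-controlled.
    return search(query, user_id=auth.user_id)
```

Because the scope comes only from AuthContext, a request that supplies another user's ID in its JSON body still searches the authenticated user's records.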

Considerations Regarding Logging and Diagnostics

Agent Memory uses standard Python logging and does not configure application log handlers or log levels for the integrating application. If the integrating application enables DEBUG logging for the SDK, debug logs may include additional troubleshooting detail. Keep production deployments at a non-DEBUG level; DEBUG logging is intended only for controlled development or support diagnostics and is not suitable for production log collection.
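Because the SDK uses standard Python logging, the integrating application can pin the SDK's logger to a non-DEBUG level in production. The sketch below assumes the logger name "agent_memory", which is a placeholder; check the package for the logger name it actually uses.

```python
import logging

# Assumption: placeholder logger name; replace with the name the SDK really uses.
SDK_LOGGER_NAME = "agent_memory"


def configure_production_logging(
    sdk_logger_name: str = SDK_LOGGER_NAME,
) -> logging.Logger:
    """Configure application logging and keep the SDK logger above DEBUG."""
    # The application owns handler and root-level configuration.
    logging.basicConfig(level=logging.INFO)
    sdk_logger = logging.getLogger(sdk_logger_name)
    # Setting WARNING here suppresses the SDK's DEBUG and INFO records.
    sdk_logger.setLevel(logging.WARNING)
    return sdk_logger
```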

Considerations Regarding Database Access, Schema Management, and Secrets

Agent Memory uses a caller-provided Oracle AI Database connection or pool. The package does not create or manage database credentials itself. It also does not create, negotiate, or upgrade database network encryption on behalf of the caller.

Note:

For production deployments, create the Oracle AI Database connection or pool with encrypted transport enabled before passing it into Agent Memory. Do not use plaintext database connections across untrusted, shared, or external networks. When using python-oracledb, follow the official section Securely Encrypting Network Traffic to Oracle AI Database and configure TLS or another approved encrypted transport as part of connection or pool creation.
Never embed API keys, passwords, or other secrets directly in application code, checked-in configuration, or exported artifacts. Always use secure injection mechanisms and follow the principle of least privilege for credential access.

The following deployment practices are recommended:

Use database users with only the required privileges: Grant only what is needed for the selected deployment model and schema policy.
Use a separate database user for deletion workflows where practical: If your application needs to remove records, prefer a dedicated connection or pool for those paths and grant DELETE on the managed Agent Memory tables only to that database user. Keep the main runtime connection limited to the non-deletion privileges required for its normal operations so accidental or unwanted deletes have a narrower blast radius. If a caller invokes delete() through a connection that does not have DELETE permission, Oracle AI Database rejects the statement.
Create encrypted database connections and pools: Production code should pass a TLS-enabled Oracle AI Database connection or pool into the SDK. Agent Memory uses the connection or pool exactly as provided by the caller. For python-oracledb, prefer TLS-enabled connections such as protocol="tcps" or an equivalent TCPS DSN, configure the required wallet or CA material, and keep server certificate validation enabled.
Keep the default schema policy unless you explicitly need DDL changes: SchemaPolicy.REQUIRE_EXISTING is the default and avoids creating, modifying, or dropping schema objects during standard application startup.
Restrict destructive setup modes: SchemaPolicy.RECREATE is intended for setup, testing, or administrative workflows and should not be used in standard production paths.
Rely on package-managed SQL paths, not dynamic SQL assembly in application code: In the managed DB paths, record values and search filters are sent with bind variables, and managed object names are derived from validated prefixes.
Protect connection and provider credentials: Store database, LLM, and embedding credentials in a secrets manager such as OCI Vault, and rotate them regularly.
Prefer validated TLS in both Thin and Thick mode: The official python-oracledb docs note that both Thin and Thick modes support TLS, and Thick mode can also use Oracle Native Network Encryption where that is your approved standard.
Use secure transport to the database: Database network security, TLS configuration, and authentication method are determined by the caller-provided connection and should follow your organization’s standards.
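As an illustration of the connection guidance above, the following sketch builds python-oracledb pool parameters for a TCPS connection from environment variables and creates the pool. The environment variable names and pool sizes are assumptions; in production the values would typically come from a secrets manager, and wallet or CA settings depend on your environment and are omitted here.

```python
import os
from typing import Any, Dict, Mapping


def build_pool_params(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Build keyword arguments for oracledb.create_pool() with TLS enabled."""
    return {
        # Hypothetical variable names; secrets are injected, not hard-coded.
        "user": env["AGENT_MEMORY_DB_USER"],
        "password": env["AGENT_MEMORY_DB_PASSWORD"],
        "host": env["AGENT_MEMORY_DB_HOST"],
        "port": int(env["AGENT_MEMORY_DB_PORT"]),
        "service_name": env["AGENT_MEMORY_DB_SERVICE"],
        "protocol": "tcps",  # TLS transport
        "ssl_server_dn_match": True,  # keep server certificate DN matching enabled
        "min": 1,
        "max": 4,
        "increment": 1,
    }


def create_encrypted_pool(env: Mapping[str, str] = os.environ):
    """Create the pool that the application then passes to Agent Memory."""
    # Imported lazily so build_pool_params() can be checked without the driver.
    import oracledb

    return oracledb.create_pool(**build_pool_params(env))
```

Keeping parameter construction separate from pool creation lets a deployment check assert that protocol is "tcps" before any connection is attempted.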

Considerations Regarding Network Communication and External Endpoints

Agent Memory can communicate with external services when the deployment configures remote LLM or embedding providers. The SDK forwards prompts and request parameters through the configured client path, but the surrounding application and deployment remain responsible for securing these connections.

The following practices are recommended:

Use HTTPS for model endpoints and prefer private or restricted network paths where available.
Monitor outbound traffic and provider usage for unexpected destinations, unusual request volume, or anomalous token consumption.
Choose providers that match your compliance and residency needs before enabling active-memory features on regulated or sensitive workflows.
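A deployment can enforce the HTTPS recommendation at configuration time. The sketch below rejects endpoint URLs that are not HTTPS or whose host is not on an application-maintained allowlist; validate_model_endpoint and the allowlist are assumptions for illustration, not SDK features.

```python
from typing import FrozenSet
from urllib.parse import urlsplit


def validate_model_endpoint(url: str, allowed_hosts: FrozenSet[str]) -> str:
    """Return url unchanged if it uses HTTPS and targets an allowed host."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError("model endpoint must use https")
    host = (parts.hostname or "").lower()
    if host not in allowed_hosts:
        raise ValueError(f"model endpoint host is not allowed: {host!r}")
    return url
```

Running this check once at startup, before the LLM or embedding client is constructed, keeps a misconfigured endpoint from ever receiving memory content.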

Considerations Regarding Resource-Exhaustion Vectors

Memory workflows can increase database usage, embedding traffic, and LLM token consumption over time. The risk comes both from malicious overuse and from innocent implementation mistakes such as oversized messages or overly broad retrieval patterns.

Use these controls as part of your production hardening:

Set practical prompt and message bounds: Configure values such as max_message_token_length and memory_extraction_token_limit to fit your workload and provider limits. max_message_token_length limits the prompt-time copy used by extraction workflows; stored messages remain unchanged.
Bound retrieval sizes: Use reasonable max_results values and record-type filters for application searches.
Apply infrastructure limits outside the SDK: Use database quotas, connection limits, network controls, endpoint timeouts, and rate limiting in the surrounding deployment.
Monitor growth over time: Track stored message volume, durable memory growth, provider usage, and query latency so retention or tuning changes can be made before they affect reliability.
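Application-side bounds can complement the SDK settings named above. The following sketch clamps a caller-requested result count and rejects oversized message text before it reaches the SDK. The constants and function names are illustrative assumptions, and the character limit is only a rough proxy for size, not a token count.

```python
MAX_RESULTS_CAP = 20  # assumed application-level ceiling for search results
MAX_MESSAGE_CHARS = 8000  # assumed ceiling; characters, not tokens


def clamp_max_results(requested: int, cap: int = MAX_RESULTS_CAP) -> int:
    """Keep a requested result count within the range 1..cap."""
    return max(1, min(int(requested), cap))


def check_message_size(text: str, max_chars: int = MAX_MESSAGE_CHARS) -> str:
    """Reject message text longer than the application allows."""
    if len(text) > max_chars:
        raise ValueError(f"message exceeds {max_chars} characters")
    return text
```

The clamped value would be passed as max_results on application searches. Rejecting oversized input at the boundary also limits stored volume, which max_message_token_length does not do because stored messages remain unchanged.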