1.3 Best Practices for Enriching Your Database Schema

For optimal AI enrichment of your database schema, adhere to the following best practices. These recommendations improve an LLM’s contextual understanding, resulting in more precise and reliable SQL query generation.

  • Use clear language

    Write all annotations in clear, domain-specific English to accurately express the business meaning of each database object.

  • Write effective descriptions

    Create an annotation using the DESCRIPTION label to clarify the purpose of a database object, like a table or column. For example, you can annotate a table named T_EMP as follows to describe its role:

    This table stores both active and former employees, including contractors, with one row per employee.

  • List aliases for database objects

    Use the ALIASES label to provide common synonyms or alternative names for a database object. Because users often refer to objects using various synonyms, the ALIASES annotation helps the LLM map these variations accurately for a particular object, improving its ability to interpret user intent. For example, a column named EMP_ID may be referred to as employee number, person number, or worker id. You can list these alternatives in an annotation as follows:

    Common aliases for this column include employee number, worker id, and person number.

  • Specify measurement units for numeric data

    For columns containing numeric data, use the UNITS annotation to specify the unit of measurement. This prevents the LLM from misinterpreting values. For example, you can annotate a salary_amount column as: Expressed in United States Dollars (USD). This ensures the model correctly identifies the currency type. Similarly, for a distance column, you can annotate Units: kilometers to ensure the model performs accurate conversions or aggregations.

  • Define the join logic

    When joining tables, use the JOIN COLUMN label to specify preferred join partners for a column. Because automated SQL generation with LLMs often struggles with join logic, this annotation guides the model toward the correct join. For example, you can annotate a column named EMPLOYEE.DEPT_NO as follows:

    This column is most commonly joined with DEPARTMENT.DEPT_NO and represents the employee's home department.

  • Enumerate sample values

    Use the VALUES label to provide the LLM with example or distinct values for a column. This information helps the model recognize valid predicates and directly improves its filtering capabilities. For example, you can annotate a column named status_code as follows:

    Typical values for status_code include A (active), I (inactive), and T (terminated).

  • Group related tables

    Organize tables into groups that align with your application domains and annotate the groups. By grouping tables, you help the LLM understand relationships that aren't explicitly defined by foreign keys. For example, you can organize all tables related to Human Resources (HR) into a group named HR_TableGrp and annotate it as follows:

    This group contains all tables related to Human Resources (HR).

Example 1-1 Annotating an HR schema

To understand how to apply annotations in practice, let's look at a simplified HR schema with two tables: EMPLOYEE and DEPARTMENT. Using the best practices, you can annotate it as follows.

This example set of annotations provides a rich, machine-readable layer of context that dramatically improves how an LLM interacts with the schema.

-- Annotations for the EMPLOYEE table
ALTER TABLE EMPLOYEE ANNOTATIONS (
  DESCRIPTION 'Current and former employees, including contractors',
  ALIASES 'staff, worker, personnel'
);

-- Annotations for the columns of EMPLOYEE table
ALTER TABLE EMPLOYEE MODIFY (
  emp_id ANNOTATIONS (
    DESCRIPTION 'Unique employee identifier',
    ALIASES 'employee number, person number, worker id'
  ),
  salary_usd ANNOTATIONS (
    DESCRIPTION 'Annual base salary',
    UNITS 'currency: USD',
    VALUES '35000, 100000'
  ),
  dept_no ANNOTATIONS (
    DESCRIPTION 'Department foreign key',
    JOIN_COLUMN 'DEPARTMENT.dept_no'
  )
);

-- Annotations for the DEPARTMENT table
ALTER TABLE DEPARTMENT ANNOTATIONS (
  DESCRIPTION 'Company departments'
);

-- Annotations for the columns of DEPARTMENT table
ALTER TABLE DEPARTMENT MODIFY (
  dept_no ANNOTATIONS (
    DESCRIPTION 'Department primary key',
    ALIASES 'department id'
  ),
  dept_name ANNOTATIONS (
    DESCRIPTION 'Human-readable department name'
  )
);