1.3 Best Practices for Enriching Your Database Schema
For optimal AI enrichment of your database schema, adhere to the following best practices. These recommendations improve an LLM’s contextual understanding, resulting in more precise and reliable SQL query generation.
-
Use clear language
Write all annotations in clear, domain-specific English to accurately express the business meaning of each database object.
-
Write effective descriptions
Create an annotation using the
DESCRIPTIONlabel to clarify the purpose of a database object, like a table or column. For example, you can annotate a table namedT_EMPas follows to describe its role:This table stores both active and former employees, including contractors, with one row per employee. -
List aliases for database objects
Use the
ALIASESlabel to provide common synonyms or alternative names for a database object. Because users often refer to objects using various synonyms, theALIASESannotation helps the LLM map these variations accurately for a particular object, improving its ability to interpret user intent. For example, a column namedEMP_IDmay be referred to as employee number, person number, or worker id. You can list these alternatives in an annotation as follows:Common aliases for this column include employee number, worker id, and person number. -
Specify measurement units for numeric data
For columns containing numeric data, use the
UNITSannotation to specify the unit of measurement. This prevents the LLM from misinterpreting values. For example, you can annotate asalary_amountcolumn as:Expressed in United States Dollars (USD). This ensures the model correctly identifies the currency type. Similarly, for a distance column, you can annotateUnits: kilometersto ensure the model performs accurate conversions or aggregations. -
Define the join logic
When joining tables, use the
JOIN COLUMNlabel to specify preferred join partners for a column. Because automated SQL generation with LLMs often struggles with join logic, this annotation guides the model toward the correct join. For example, you can annotate a column namedEMPLOYEE.DEPT_NOas follows:This column is most commonly joined with DEPARTMENT.DEPT_NO and represents the employee's home department. -
Enumerate sample values
Use the
VALUESlabel to provide the LLM with example or distinct values for a column. This information helps the model recognize valid predicates and directly improves its filtering capabilities. For example, you can annotate a column namedstatus_codeas follows:Typical values for status_code include A (active), I (inactive), and T (terminated). - Group related tables
Organize tables into groups that align with your application domains and annotate the groups. By grouping tables, you help the LLM understand relationships that aren't explicitly defined by foreign keys. For example, you can organize all tables related to Human Resources (HR) into a group named
HR_TableGrpand annotate it as follows:This group contains all tables related to Human Resources (HR).
Example 1-1 Annotating an HR schema
To understand how to apply annotations in practice, let's
look at a simplified HR schema with two tables: EMPLOYEE and
DEPARTMENT. Using the best practices, you can annotate it as follows.
This example set of annotations provides a rich, machine-readable layer of context that dramatically improves how an LLM interacts with the schema.
-- Annotations for the EMPLOYEE table
ALTER TABLE EMPLOYEE ANNOTATIONS (
DESCRIPTION 'Current and former employees, including contractors',
ALIASES 'staff, worker, personnel'
);
-- Annotations for the columns of EMPLOYEE table
ALTER TABLE EMPLOYEE MODIFY (
emp_id ANNOTATIONS (
DESCRIPTION 'Unique employee identifier',
ALIASES 'employee number, person number, worker id'
),
salary_usd ANNOTATIONS (
DESCRIPTION 'Annual base salary',
UNITS 'currency: USD',
VALUES '35000, 100000'
),
dept_no ANNOTATIONS (
DESCRIPTION 'Department foreign key',
JOIN_COLUMN 'DEPARTMENT.dept_no'
)
);
-- Annotations for the DEPARTMENT table
ALTER TABLE DEPARTMENT ANNOTATIONS (
DESCRIPTION 'Company departments'
);
-- Annotations for the columns of DEPARTMENT table
ALTER TABLE DEPARTMENT MODIFY (
dept_no ANNOTATIONS (
DESCRIPTION 'Department primary key',
ALIASES 'department id'
),
dept_name ANNOTATIONS (
DESCRIPTION 'Human-readable department name'
)
);