8.1 Highlighting Query Terms
In text query applications, you can present selected documents with query terms highlighted for text queries or with themes highlighted for ABOUT queries.
You can generate three types of output associated with highlighting:
- A marked-up version of the document
- Query offset information for the document
- A concordance of the document, in which occurrences of the query term are returned with their surrounding text
This section contains the following topics:
8.1.1 Text Highlighting
For text highlighting, you supply the query, and Oracle Text highlights words in the document that satisfy the query. You can obtain plain-text or HTML highlighting.
8.1.2 Theme Highlighting
For ABOUT queries, the CTX_DOC procedures highlight and mark up words or phrases that best represent the ABOUT query.
8.1.3 CTX_DOC Highlighting Procedures
These are the highlighting procedures in CTX_DOC:
- CTX_DOC.MARKUP and CTX_DOC.POLICY_MARKUP
- CTX_DOC.HIGHLIGHT and CTX_DOC.POLICY_HIGHLIGHT
- CTX_DOC.SNIPPET and CTX_DOC.POLICY_SNIPPET
The POLICY and non-POLICY versions of the procedures are equivalent, except that the POLICY versions do not require an index.
Note:
You can also generate snippets by using the Result Set Interface.
See Also:
Oracle Text Reference for information on CTX_QUERY.RESULT_SET
This section contains these topics:
8.1.3.1 Markup Procedure
The CTX_DOC.MARKUP and CTX_DOC.POLICY_MARKUP procedures take a document reference and a query, and return a marked-up version of the document.
The output can be either marked-up plain text or marked-up HTML. For example, you can specify that the marked-up document be returned with the query term surrounded by angle brackets (<<<tansu>>>) or by HTML tags (<b>tansu</b>).
CTX_DOC.MARKUP and CTX_DOC.POLICY_MARKUP are equivalent, except that CTX_DOC.POLICY_MARKUP does not require an index.
You can customize the markup sequence for HTML navigation.
CTX_DOC.MARKUP Example
The following example is taken from the web application described in CONTEXT Query Application. The showDoc procedure takes an HTML document and a query, creates the highlight markup (in this case, the query term is displayed in red), and outputs the result to an in-memory buffer. It then uses htp.print to display it in the browser.
procedure showDoc (p_id in varchar2, p_query in varchar2) is
  v_clob_selected CLOB;
  v_read_amount   integer;
  v_read_offset   integer;
  v_buffer        varchar2(32767);
  v_query         varchar(2000);
  v_cursor        integer;
begin
  htp.p('<html><title>HTML version with highlighted terms</title>');
  htp.p('<body bgcolor="#ffffff">');
  htp.p('<b>HTML version with highlighted terms</b>');
  begin
    -- Mark up each query hit in red italics and return the result in a CLOB.
    ctx_doc.markup (index_name => 'idx_search_table',
                    textkey    => p_id,
                    text_query => p_query,
                    restab     => v_clob_selected,
                    starttag   => '<i><font color=red>',
                    endtag     => '</font></i>');
    v_read_amount := 32767;
    v_read_offset := 1;
    begin
      -- Send the CLOB to the browser in chunks. DBMS_LOB.READ raises
      -- NO_DATA_FOUND once the offset moves past the end of the CLOB,
      -- which ends the loop.
      loop
        dbms_lob.read(v_clob_selected, v_read_amount, v_read_offset, v_buffer);
        htp.print(v_buffer);
        v_read_offset := v_read_offset + v_read_amount;
        v_read_amount := 32767;
      end loop;
    exception
      when no_data_found then
        null;
    end;
  exception
    when others then
      null;
      --showHTMLdoc(p_id);
  end;
end showDoc;
end;  -- closes the enclosing package body, which is not shown here
/
show errors
set define on
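Because CTX_DOC.POLICY_MARKUP does not require an index, it can mark up text that is not stored in an indexed table. The following is a minimal sketch under stated assumptions, not a definitive implementation: the policy name markup_demo_policy and the sample sentence are hypothetical, the parameter names follow the signature given in Oracle Text Reference, and the session is assumed to have the privileges needed to run CTX_DDL (for example, the CTXAPP role).

```sql
-- Hypothetical policy name, created with default preferences.
BEGIN
  ctx_ddl.create_policy(policy_name => 'markup_demo_policy');
END;
/

SET SERVEROUTPUT ON
DECLARE
  -- Sample document text (assumption, for illustration only).
  v_doc    VARCHAR2(200) := 'A tansu is a traditional Japanese chest.';
  v_marked CLOB;  -- left NULL so that a temporary CLOB is allocated
BEGIN
  ctx_doc.policy_markup(policy_name => 'markup_demo_policy',
                        document    => v_doc,
                        text_query  => 'tansu',
                        restab      => v_marked,
                        plaintext   => TRUE,
                        starttag    => '<b>',
                        endtag      => '</b>');

  -- Print the first part of the marked-up text.
  dbms_output.put_line(dbms_lob.substr(v_marked, 4000, 1));

  -- Release the temporary CLOB allocated by the call.
  dbms_lob.freetemporary(v_marked);
END;
/

-- Optional cleanup:
-- EXEC ctx_ddl.drop_policy('markup_demo_policy');
```

The same call shape should apply to CTX_DOC.MARKUP, with index_name and textkey in place of policy_name and document, as in the showDoc example.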
8.1.3.2 Highlight Procedure
CTX_DOC.HIGHLIGHT and CTX_DOC.POLICY_HIGHLIGHT take a query and a document and return offset information for the query terms in either the plain-text or the HTML version of the document. You can use this offset information to write your own custom routines for displaying documents.
CTX_DOC.HIGHLIGHT and CTX_DOC.POLICY_HIGHLIGHT are equivalent, except that CTX_DOC.POLICY_HIGHLIGHT does not require an index.
With offset information, you can display a highlighted version of a document in whatever style you choose (for example, with different font types or colors) instead of the standard plain-text markup obtained from CTX_DOC.MARKUP.
See Also:
Oracle Text Reference for more information about using CTX_DOC.HIGHLIGHT and CTX_DOC.POLICY_HIGHLIGHT
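To illustrate how the offset information might be consumed, the following sketch calls CTX_DOC.HIGHLIGHT with an in-memory result table and prints each hit. It rests on several assumptions: the table hl_demo, the index hl_demo_idx, and the sample text are hypothetical; the key type is set explicitly to PRIMARY_KEY so that textkey can be the primary key value; and SUBSTR is applied to the stored text on the assumption that the document is plain text, so that the plain-text offsets line up with the stored characters.

```sql
-- Hypothetical demo objects: hl_demo and hl_demo_idx are illustrative names.
CREATE TABLE hl_demo (id NUMBER PRIMARY KEY, text VARCHAR2(200));

INSERT INTO hl_demo VALUES
  (1, 'A tansu is a traditional Japanese chest. This tansu is made of cedar.');
COMMIT;

CREATE INDEX hl_demo_idx ON hl_demo (text) INDEXTYPE IS CTXSYS.CONTEXT;

SET SERVEROUTPUT ON
DECLARE
  v_hits ctx_doc.highlight_tab;  -- in-memory table of (offset, length) records
  v_text hl_demo.text%TYPE;
  i      PLS_INTEGER;
BEGIN
  -- Interpret textkey as a primary key value rather than a rowid.
  ctx_doc.set_key_type('PRIMARY_KEY');

  SELECT text INTO v_text FROM hl_demo WHERE id = 1;

  -- Request offsets into the plain-text version of the document.
  ctx_doc.highlight(index_name => 'hl_demo_idx',
                    textkey    => '1',
                    text_query => 'tansu',
                    restab     => v_hits,
                    plaintext  => TRUE);

  -- Walk the result table and show the text at each reported offset.
  i := v_hits.FIRST;
  WHILE i IS NOT NULL LOOP
    dbms_output.put_line('offset=' || v_hits(i).offset ||
                         ' length=' || v_hits(i).length ||
                         ' text=' ||
                         SUBSTR(v_text, v_hits(i).offset, v_hits(i).length));
    i := v_hits.NEXT(i);
  END LOOP;
END;
/

-- Optional cleanup:
-- DROP TABLE hl_demo PURGE;
```

A custom display routine could use the same loop to wrap each offset range in its own tags or styles instead of printing it.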
8.1.3.3 Concordance
CTX_DOC.SNIPPET and CTX_DOC.POLICY_SNIPPET produce a concordance of the document, in which occurrences of the query term are returned with their surrounding text. This result is sometimes known as Key Word in Context (KWIC) because, instead of returning the entire document (with or without the query term highlighted), it returns the query term in text fragments, allowing a user to see it in context. You can control how the query term is highlighted in the returned fragments.
CTX_DOC.SNIPPET and CTX_DOC.POLICY_SNIPPET are equivalent, except that CTX_DOC.POLICY_SNIPPET does not require an index. Both include two new attributes: radius specifies the approximate desired length of each segment, whereas max_length puts an upper bound on the total length of all segments.
See Also:
Oracle Text Reference for more information about CTX_DOC.SNIPPET and CTX_DOC.POLICY_SNIPPET
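As a rough sketch of how radius and max_length might be used together, the following example requests a short concordance for a single document. The table snip_demo, the index snip_demo_idx, the sample text, and the chosen values (20 and 120) are hypothetical; the remaining parameter names (starttag, endtag, separator) are taken from the CTX_DOC.SNIPPET signature in Oracle Text Reference and should be checked against your release.

```sql
-- Hypothetical demo objects: snip_demo and snip_demo_idx are illustrative names.
CREATE TABLE snip_demo (id NUMBER PRIMARY KEY, text VARCHAR2(4000));

INSERT INTO snip_demo VALUES (1,
  'The shop sells antique furniture from Japan. A tansu is a traditional ' ||
  'Japanese storage chest. Collectors value each tansu for its joinery and ' ||
  'iron hardware. The shop also restores chairs and tables.');
COMMIT;

CREATE INDEX snip_demo_idx ON snip_demo (text) INDEXTYPE IS CTXSYS.CONTEXT;

SET SERVEROUTPUT ON
DECLARE
  v_snippet VARCHAR2(4000);
BEGIN
  -- Interpret textkey as a primary key value rather than a rowid.
  ctx_doc.set_key_type('PRIMARY_KEY');

  v_snippet := ctx_doc.snippet(index_name => 'snip_demo_idx',
                               textkey    => '1',
                               text_query => 'tansu',
                               starttag   => '<b>',    -- placed before each hit
                               endtag     => '</b>',   -- placed after each hit
                               separator  => ' ... ',  -- placed between segments
                               radius     => 20,       -- approximate segment length
                               max_length => 120);     -- cap on total segment length
  dbms_output.put_line(v_snippet);
END;
/

-- Optional cleanup:
-- DROP TABLE snip_demo PURGE;
```

A smaller radius should give tighter fragments around each hit, while max_length limits how much text is returned overall, which is useful for fixed-size result listings.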