8.1 Highlighting Query Terms

In text query applications, you can present selected documents with query terms highlighted for text queries or with themes highlighted for ABOUT queries.

You can generate three types of output associated with highlighting:

  • A marked-up version of the document

  • Query offset information for the document

  • A concordance of the document, in which occurrences of the query term are returned with their surrounding text

This section contains the following topics:

8.1.1 Text Highlighting

For text highlighting, you supply the query, and Oracle Text highlights words in the document that satisfy the query. You can obtain plain-text or HTML highlighting.

8.1.2 Theme Highlighting

For ABOUT queries, the CTX_DOC procedures highlight and mark up words or phrases that best represent the ABOUT query.
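As a minimal sketch of theme highlighting, the following anonymous block passes an ABOUT query to CTX_DOC.MARKUP and prints the start of the marked-up result. The index name idx_search_table is borrowed from the example later in this section; the document key '1' and the theme politics are hypothetical, and the sketch assumes the index was created with a theme component.

```sql
DECLARE
  v_markup CLOB;  -- in-memory result returned by CTX_DOC.MARKUP
BEGIN
  CTX_DOC.MARKUP(
    index_name => 'idx_search_table',  -- assumed CONTEXT index with themes
    textkey    => '1',                 -- hypothetical document key
    text_query => 'about(politics)',   -- hypothetical ABOUT query
    restab     => v_markup,
    tagset     => 'HTML_DEFAULT');     -- predefined HTML tagset

  -- Print only the first 2000 characters of the marked-up document.
  DBMS_OUTPUT.PUT_LINE(DBMS_LOB.SUBSTR(v_markup, 2000, 1));

  -- Release the temporary CLOB allocated for the in-memory result.
  DBMS_LOB.FREETEMPORARY(v_markup);
END;
/
```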

8.1.3 CTX_DOC Highlighting Procedures

These are the highlighting procedures in CTX_DOC:

  • CTX_DOC.MARKUP and CTX_DOC.POLICY_MARKUP

  • CTX_DOC.HIGHLIGHT and CTX_DOC.POLICY_HIGHLIGHT

  • CTX_DOC.SNIPPET and CTX_DOC.POLICY_SNIPPET

The POLICY and non-POLICY versions of the procedures are equivalent, except that the POLICY versions do not require an index.

Note:

You can also generate snippets by using the Result Set Interface.

See Also:

Oracle Text Reference for information on CTX_QUERY.RESULT_SET

This section contains these topics:

8.1.3.1 Markup Procedure

The CTX_DOC.MARKUP and CTX_DOC.POLICY_MARKUP procedures take a document reference and a query, and return a marked-up version of the document.

The output can be either marked-up plain text or marked-up HTML. For example, you can specify that the document be returned with each query term enclosed in angle brackets (<<<tansu>>>) or in HTML tags (<b>tansu</b>).

CTX_DOC.MARKUP and CTX_DOC.POLICY_MARKUP are equivalent, except that CTX_DOC.POLICY_MARKUP does not require an index.
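The index-free variant might be used as follows. This is a sketch under stated assumptions: the policy name my_markup_policy and the sample sentence are hypothetical, and the policy is created with default preferences.

```sql
BEGIN
  -- One-time setup: a policy stores indexing preferences without an index.
  CTX_DDL.CREATE_POLICY(policy_name => 'my_markup_policy');  -- hypothetical name
END;
/

DECLARE
  v_markup CLOB;  -- in-memory result
BEGIN
  CTX_DOC.POLICY_MARKUP(
    policy_name => 'my_markup_policy',
    document    => 'Antique tansu chests are prized by collectors.',  -- sample text
    text_query  => 'tansu',
    restab      => v_markup,
    plaintext   => TRUE,      -- return plain-text markup
    starttag    => '<b>',
    endtag      => '</b>');

  DBMS_OUTPUT.PUT_LINE(DBMS_LOB.SUBSTR(v_markup, 2000, 1));
  DBMS_LOB.FREETEMPORARY(v_markup);
END;
/
```

Because the document is passed directly, no table row or index is involved; only the policy must exist.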

You can customize the markup sequence for HTML navigation.
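A sketch of navigation markup follows. It assumes the %CURNUM, %PREVNUM, and %NEXTNUM substitution macros described in Oracle Text Reference, and a hypothetical document key '1'. Each hit becomes a named anchor with links to the previous and next hits.

```sql
DECLARE
  v_markup CLOB;  -- in-memory result
BEGIN
  CTX_DOC.MARKUP(
    index_name => 'idx_search_table',
    textkey    => '1',                                  -- hypothetical document key
    text_query => 'tansu',
    restab     => v_markup,
    starttag   => '<a name="ctx%CURNUM"><b>',           -- anchor for the current hit
    endtag     => '</b></a>',
    prevtag    => '<a href="#ctx%PREVNUM">&lt;</a>',    -- link to the previous hit
    nexttag    => '<a href="#ctx%NEXTNUM">&gt;</a>');   -- link to the next hit

  DBMS_OUTPUT.PUT_LINE(DBMS_LOB.SUBSTR(v_markup, 2000, 1));
  DBMS_LOB.FREETEMPORARY(v_markup);
END;
/
```

If the default navigation tags are sufficient, specifying tagset => 'HTML_NAVIGATE' instead of the four explicit tags should give a similar result.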

CTX_DOC.MARKUP Example

The following example is taken from the web application described in CONTEXT Query Application. The showDoc procedure takes the key of an HTML document and a query, and creates the highlight markup, in this case displaying the query term in red. It writes the result to an in-memory buffer and then uses htp.print to display it in the browser.

create or replace procedure showDoc (p_id in varchar2, p_query in varchar2) is

 v_clob_selected   CLOB;             -- marked-up document returned in memory
 v_read_amount     integer;          -- characters to read per chunk
 v_read_offset     integer := 1;     -- 1-based read position in the CLOB
 v_buffer          varchar2(32767);

 begin
   htp.p('<html><title>HTML version with highlighted terms</title>');
   htp.p('<body bgcolor="#ffffff">');
   htp.p('<b>HTML version with highlighted terms</b>');

   begin
     -- Passing a CLOB as restab returns the markup in memory
     ctx_doc.markup (index_name => 'idx_search_table',
                     textkey    => p_id,
                     text_query => p_query,
                     restab     => v_clob_selected,
                     starttag   => '<i><font color=red>',
                     endtag     => '</font></i>');

     -- Send the CLOB to the browser in chunks;
     -- dbms_lob.read raises no_data_found at the end of the CLOB
     begin
      loop
        v_read_amount := 32767;
        dbms_lob.read(v_clob_selected,v_read_amount,v_read_offset,v_buffer);
        htp.print(v_buffer);
        v_read_offset := v_read_offset + v_read_amount;
      end loop;
     exception
      when no_data_found then
         null;
     end;

     -- Release the temporary CLOB allocated by ctx_doc.markup
     dbms_lob.freetemporary(v_clob_selected);

     exception
      when others then
        -- Markup errors are ignored here; an application could
        -- display the unmarked document instead
        null; --showHTMLdoc(p_id);
   end;

   htp.p('</body></html>');
end showDoc;
/
show errors

See Also:

Oracle Text Reference for more information about CTX_DOC.MARKUP and CTX_DOC.POLICY_MARKUP

8.1.3.2 Highlight Procedure

CTX_DOC.HIGHLIGHT and CTX_DOC.POLICY_HIGHLIGHT take a query and a document and return the offset and length of each query hit in either the plain-text or the HTML version of the document. You can use this offset information to write your own custom routines for displaying documents.

CTX_DOC.HIGHLIGHT and CTX_DOC.POLICY_HIGHLIGHT are equivalent, except that CTX_DOC.POLICY_HIGHLIGHT does not require an index.

With offset information, you can display the highlighted terms in a style of your own choosing (for example, with different fonts or colors) instead of using the standard plain-text markup obtained from CTX_DOC.MARKUP.
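The following sketch retrieves the offsets into an in-memory table and prints them. It assumes the CTX_DOC.HIGHLIGHT_TAB type with offset and length fields, as described in Oracle Text Reference, and a hypothetical document key '1'.

```sql
DECLARE
  v_hits CTX_DOC.HIGHLIGHT_TAB;  -- in-memory table of offset and length records
  i      PLS_INTEGER;
BEGIN
  CTX_DOC.HIGHLIGHT(
    index_name => 'idx_search_table',
    textkey    => '1',          -- hypothetical document key
    text_query => 'tansu',
    restab     => v_hits,
    plaintext  => TRUE);        -- offsets refer to the plain-text version

  -- Walk the table with FIRST/NEXT so that no starting subscript is assumed.
  i := v_hits.FIRST;
  WHILE i IS NOT NULL LOOP
    DBMS_OUTPUT.PUT_LINE('offset=' || v_hits(i).offset ||
                         ' length=' || v_hits(i).length);
    i := v_hits.NEXT(i);
  END LOOP;
END;
/
```

A display routine could then use each offset and length pair to wrap the matching span of the document text in whatever styling the application needs.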

See Also:

Oracle Text Reference for more information about using CTX_DOC.HIGHLIGHT and CTX_DOC.POLICY_HIGHLIGHT

8.1.3.3 Concordance

CTX_DOC.SNIPPET and CTX_DOC.POLICY_SNIPPET produce a concordance of the document, in which occurrences of the query term are returned with their surrounding text. This result is sometimes known as Key Word in Context (KWIC) because, instead of returning the entire document (with or without the query term highlighted), it returns the query term in text fragments, allowing a user to see it in context. You can control how the query term is highlighted in the returned fragments.

CTX_DOC.SNIPPET and CTX_DOC.POLICY_SNIPPET are equivalent, except that CTX_DOC.POLICY_SNIPPET does not require an index. Both accept two new parameters: radius specifies the approximate desired length of each segment, and max_length sets an upper bound on the combined length of all segments.
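As a sketch, CTX_DOC.SNIPPET can be called as a function that returns the concordance as a VARCHAR2 value. The document key '1' and the radius and max_length values are hypothetical, and the example assumes a release that supports these two parameters.

```sql
DECLARE
  v_snippet VARCHAR2(4000);  -- concordance text returned by the function
BEGIN
  v_snippet := CTX_DOC.SNIPPET(
    index_name => 'idx_search_table',
    textkey    => '1',       -- hypothetical document key
    text_query => 'tansu',
    starttag   => '<b>',     -- highlight applied to each hit
    endtag     => '</b>',
    radius     => 40,        -- approximate length of each segment
    max_length => 200);      -- upper bound on all segments combined

  DBMS_OUTPUT.PUT_LINE(v_snippet);
END;
/
```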

See Also:

Oracle Text Reference for more information about CTX_DOC.SNIPPET and CTX_DOC.POLICY_SNIPPET