17.5 Performing Sentiment Analysis with the RSI

The XML Query Result Set Interface (RSI) enables you to perform sentiment analysis on a set of documents by using either the default sentiment classifier or a user-defined sentiment classifier. The documents on which sentiment analysis must be performed are stored in a document table.

Use the sentiment element in the input Result Set Descriptor (RSD) to indicate that sentiment analysis must be performed at query time, in addition to the other operations that the RSD specifies. If you specify a value for the classifier attribute of the sentiment element, then the specified sentiment classifier is used to perform the sentiment analysis. If the classifier attribute is omitted, then Oracle Text uses the default sentiment classifier. The sentiment element contains a child element called item that specifies the topic or concept for which a sentiment must be generated.

You can generate either a single sentiment score for each document or separate sentiment scores for each topic within the document. To generate a single aggregated sentiment score for each document, set the agg attribute of the item element to true.
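
As a rough illustration of how the sentiment element, its classifier attribute, and its item children fit together, the following Python sketch assembles a descriptor string on the client side. The function name build_sentiment_rsd and its parameters are hypothetical and are not part of Oracle Text. The sketch assumes that the sentiment element is nested inside hitlist, as in Example 17-6.

```python
import xml.etree.ElementTree as ET


def build_sentiment_rsd(topics, classifier=None, start_hit=1, end_hit=10):
    """Build a result set descriptor string that contains a sentiment element.

    topics: list of (topic, aggregate) pairs; aggregate=True adds agg="true".
    classifier: name of a user-defined classifier, or None for the default.
    (Hypothetical helper for illustration only.)
    """
    root = ET.Element("ctx_result_set_descriptor")
    hitlist = ET.SubElement(root, "hitlist", {
        "start_hit_num": str(start_hit),
        "end_hit_num": str(end_hit),
        "order": "score desc",
    })
    # Leaving out the classifier attribute requests the default classifier.
    attrs = {"classifier": classifier} if classifier else {}
    sentiment = ET.SubElement(hitlist, "sentiment", attrs)
    for topic, aggregate in topics:
        item_attrs = {"topic": topic}
        if aggregate:
            item_attrs["agg"] = "true"
        ET.SubElement(sentiment, "item", item_attrs)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_sentiment_rsd(
        [("lens", False), ("picture quality", True)],
        classifier="clsfier_camera",
    ))
```

The returned string can be passed as the descriptor argument of a result set query. Generating it programmatically mainly helps when the list of topics is not known in advance.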

You can perform sentiment classification by using a keyword query or the ABOUT operator. When you use the ABOUT operator, the result set includes synonyms of the keyword that are identified by using the thesaurus.

To perform sentiment analysis by using the RSI:

  1. Create and train the sentiment classifier you will use to perform sentiment analysis.
  2. Create the document table that contains the documents to be analyzed and a context index on the document table.
  3. Use the required elements and attributes within a query to perform sentiment analysis.

    The RSD that you pass to the query must contain the sentiment element.

Example 17-6 Input the RSD to Perform Sentiment Analysis

The following example performs sentiment analysis for two topics: it generates segment-level sentiment scores for the ‘lens’ topic and a single aggregate score for the ‘picture quality’ topic. The driving query is a keyword query for ‘camera’. The sentiment element specifies that sentiment analysis must be performed by using the clsfier_camera sentiment classifier, which was previously created and trained by using the CTX_CLS.SA_TRAIN_MODEL procedure. The camera_revidx context index is defined on the document table.

The sentiment score ranges from -100 to 100. A positive score indicates positive sentiment, whereas a negative score indicates negative sentiment. The absolute value of the score indicates the strength of the sentiment.
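
The score interpretation above can be sketched in a few lines of Python. The function name describe_sentiment is hypothetical. Treating a score of 0 as neutral is an assumption, because the text does not say how a zero score should be read.

```python
def describe_sentiment(score):
    """Return (polarity, magnitude) for a sentiment score from -100 to 100.

    Hypothetical helper for illustration only.
    """
    if not -100 <= score <= 100:
        raise ValueError("sentiment score must be between -100 and 100")
    if score > 0:
        polarity = "positive"
    elif score < 0:
        polarity = "negative"
    else:
        # Assumption: a score of exactly 0 is reported as neutral.
        polarity = "neutral"
    # The absolute value expresses how strong the sentiment is.
    return polarity, abs(score)


if __name__ == "__main__":
    print(describe_sentiment(-81))
    print(describe_sentiment(67))
```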

To perform sentiment analysis and obtain a sentiment score for each topic within the document:

  1. Declare the rs bind variable and initialize it as a temporary CLOB that will store the results of the search operation.

    SQL> var rs clob;
    SQL> exec dbms_lob.createtemporary(:rs, TRUE, DBMS_LOB.SESSION);
    
  2. Perform sentiment analysis as part of a search query.

    The keyword being searched for is ‘camera’. Sentiment analysis is performed for the ‘lens’ and ‘picture quality’ topics, and the score for ‘picture quality’ is aggregated for each document.

    begin
      ctx_query.result_set('camera_revidx', 'camera', '
        <ctx_result_set_descriptor>
          <hitlist start_hit_num="1" end_hit_num="10" order="score desc">
            <sentiment classifier="clsfier_camera">
              <item topic="lens" />
              <item topic="picture quality" agg="true" />
            </sentiment>
          </hitlist>
        </ctx_result_set_descriptor>', :rs);
    end;
    /
    
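  3. View the results stored in the rs variable.

    Other applications can use the XML result set for further processing. Some output has been removed for brevity. For an item without the agg attribute, each segment element contains the text of a segment that mentions the topic and the sentiment score for that segment. For an item with agg="true", a single score element contains the aggregate score for the document.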

    SQL> select xmltype(:rs) from dual; 
    XMLTYPE(:RS) 
    -------------------------------------------------------------------------------- 
    <ctx_result_set>
      <hitlist>
        <hit>
          <sentiment>
            <item topic="lens">
              <segment>
                <segment_text>The first time it was sent in was because the <b>lens </b> door failed to turn on the camera
                and it was almost to come off of its track . Eight months later, the flash quit working in all modes AND the door was
                failing AGAIN!</segment_text>
                <segment_score>-81</segment_score>
              </segment>
            </item>
            <item topic="picture quality">
              <score> -75 </score>
            </item>
          </sentiment>
        </hit>
        <hit>
          <sentiment>
            <item topic="lens">
              <segment>
                <segment_text>I was actually quite impressed with it. Powerful zoom , sharp <b>lens</b>, decent picture
                quality. I also played with some other Panasonic models in various stores just to get a better feel for them, as well as
                spent a few hours on </segment_text>
                <segment_score> 67 </segment_score>
              </segment>
            </item>
            <item topic="picture quality">
              <score>-1</score>
            </item>
          </sentiment>
        </hit>
        . . .
      . . .
      </hitlist>
    </ctx_result_set>
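
Because other applications can consume the XML result set, the following Python sketch shows one way that a client might extract the scores from it. The function name extract_sentiment and the shape of the returned data are hypothetical. The sketch assumes that the result set has already been fetched into a string, that it is complete (without the elided portions shown above), and that it follows the element layout in the sample output.

```python
import xml.etree.ElementTree as ET


def extract_sentiment(result_set_xml):
    """Return one dict per hit, mapping each topic to its scores.

    Each topic maps to {"segments": [(text, score), ...], "score": int or None},
    where "score" holds the aggregate score when the item carries one.
    (Hypothetical helper for illustration only.)
    """
    root = ET.fromstring(result_set_xml)
    hits = []
    for hit in root.iterfind("hitlist/hit"):
        topics = {}
        for item in hit.iterfind("sentiment/item"):
            segments = []
            for seg in item.iterfind("segment"):
                # itertext() keeps the words inside the <b> highlight tags.
                text = "".join(seg.find("segment_text").itertext())
                # Collapse line breaks and repeated spaces in the segment text.
                text = " ".join(text.split())
                # int() tolerates surrounding whitespace, as in " 67 ".
                segments.append((text, int(seg.findtext("segment_score"))))
            aggregate = item.findtext("score")
            topics[item.get("topic")] = {
                "segments": segments,
                "score": int(aggregate) if aggregate is not None else None,
            }
        hits.append(topics)
    return hits


if __name__ == "__main__":
    sample = (
        '<ctx_result_set><hitlist><hit><sentiment>'
        '<item topic="lens"><segment>'
        '<segment_text>Powerful zoom , sharp <b>lens</b></segment_text>'
        '<segment_score> 67 </segment_score></segment></item>'
        '<item topic="picture quality"> <score>-1</score> </item>'
        '</sentiment></hit></hitlist></ctx_result_set>'
    )
    for number, topics in enumerate(extract_sentiment(sample), start=1):
        print(number, topics)
```

Keeping segment-level scores and aggregate scores in separate fields mirrors the two forms that an item element takes in the result set.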