17.4 Performing Sentiment Analysis with the CTX_DOC Package
Use the procedures in the CTX_DOC package to perform sentiment analysis on a single document within a document set. For each document, you can determine either a single sentiment score for the entire document or individual sentiment scores for each topic within the document.
Before you perform sentiment analysis, you must create a context index on the document set. The following command creates a camera_revidx context index on the document set in the camera_reviews table:
create index camera_revidx on camera_reviews(review_text) indextype is
ctxsys.context parameters ('lexer mylexer stoplist ctxsys.default_stoplist');
To perform sentiment analysis with the CTX_DOC package, use one of the following methods:
Example 17-2 Obtaining a Single Sentiment Score for a Document
The following example uses the clsfier_camera sentiment classifier to provide a single aggregate sentiment score for the document. The sentiment classifier has already been created and trained. The table containing the document set has a camera_revidx context index. The document to analyze has a doc_id of 49, and the topic for which the sentiment score is generated is 'Nikon'.
select ctx_doc.sentiment_aggregate('camera_revidx','49','Nikon','clsfier_camera') from dual;
CTX_DOC.SENTIMENT_AGGREGATE('CAMERA_REVIDX','49','NIKON','CLSFIER_CAMERA')
--------------------------------------------------------------------------
74
1 row selected.
Example 17-3 Obtaining a Single Sentiment Score with the Default Classifier
The following example uses the default sentiment classifier to provide an aggregate sentiment score for the entire document. The table containing the document set has a camera_revidx context index. The document to analyze has a doc_id of 1.
select ctx_doc.sentiment_aggregate('camera_revidx','1') from dual;
CTX_DOC.SENTIMENT_AGGREGATE('CAMERA_REVIDX','1')
--------------------------------------------
2
1 row selected.
Example 17-4 Obtaining Sentiment Scores for Each Topic Within a Document
The following example uses the clsfier_camera sentiment classifier to generate a sentiment score for each segment within the document. The sentiment classifier has already been created and trained. The table containing the document set has a camera_revidx context index. The document to analyze has a doc_id of 49, and the topic for which sentiment scores are generated is 'Nikon'. The restab result table, which is populated with the analysis results, was created with the columns snippet (CLOB) and score (NUMBER).
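As a minimal sketch, the restab result table described above could be created as follows before ctx_doc.sentiment is called. The column names and types are taken from the description; leaving all storage options at their defaults is an assumption.

```sql
-- Result table for the CTX_DOC.SENTIMENT output (columns as described above).
create table restab (
  snippet clob,    -- text segment that contains the topic
  score   number   -- sentiment score for that segment
);
```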
exec ctx_doc.sentiment('camera_revidx','49','Nikon','restab','clsfier_camera', starttag=>'<<', endtag=>'>>');
SQL> select * from restab;
SNIPPET
--------------------------------------------------------------------------------
SCORE
----------
It took <<Nikon>> a while to produce a superb compact 85mm lens, but this time they finally got it right.
65
Without a doubt, this is a fine portrait lens for photographing head-and-shoulder portraits (The only lens which is optically better is <<Nikon>>'s legendary 105mm f2.5 Nikkor lens, and its close optical twin, the 105mm f2.8 Micro Nikkor.
75
Since the 105mm f2.5 Nikkor lens doesn't have an autofocus version, then this might be the perfect moderate telephoto lens for owners of <<Nikon>> autofocus SLR cameras.
84
3 rows selected.
Example 17-5 Obtaining a Sentiment Score for a Topic Within a Document
The following example uses the tdrbrtsent03_cl sentiment classifier to generate a sentiment score for each segment within the document. The sentiment classifier has already been created and trained. The table containing the document set has a tdrbrtsent03_idx context index. The document to analyze has a doc_id of 1, and the topic for which a sentiment score is generated is 'movie'. The tdrbrtsent03_rtab result table, which is populated with the analysis results, was created with the columns snippet and score. Because no start and end tags are specified in the call, the topic is marked up with <b> and </b> in the output.
SQL> exec ctx_doc.sentiment('tdrbrtsent03_idx','1','movie','tdrbrtsent03_rtab','tdrbrtsent03_cl');
PL/SQL procedure successfully completed.
SQL> select * from tdrbrtsent03_rtab;
SNIPPET
--------------------------------------------------------------------------------
SCORE
----------
the <b>movie</b> is a bit overlong , but nicholson is such good fun that the running time passes by pretty quickly
-62
1 row selected.
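To make a result table easier to scan, the scores can be labeled by sign. The following query is a sketch only: it assumes that a positive score indicates positive sentiment and a negative score indicates negative sentiment, and the polarity column alias is introduced here purely for illustration.

```sql
-- Label each scored segment by the sign of its score (assumed interpretation).
select snippet,
       score,
       case
         when score > 0 then 'positive'
         when score < 0 then 'negative'
         else 'neutral'
       end as polarity
from tdrbrtsent03_rtab
order by score;
```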
See Also:
- CTX_DOC.SENTIMENT_AGGREGATE in the Oracle Text Reference
- CTX_DOC.SENTIMENT in the Oracle Text Reference