Extracting Text from a PDF File

To extract text from a PDF file of any length, use documentCapture.documentToText(options). For a sample, see Extract Text from a PDF File.

Provide the following parameters:

The documentCapture.documentToText(options) method returns a string with the text of the PDF file. If you want to analyze the text further, you can provide the extracted text to the llm.generateText(options) method in the N/llm module, as the following example shows:

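// Load the modules this snippet uses
require(['N/file', 'N/documentCapture', 'N/llm'], (file, documentCapture, llm) => {
    // "14" is the unique ID of a PDF stored in the NetSuite File Cabinet
    const fileObj = file.load({
        id: '14'
    });

    // Extract the text of the PDF as a string
    const extractedData = documentCapture.documentToText({
        file: fileObj
    });

    // Provide the extracted text to the LLM as a source document
    const response = llm.generateText({
        prompt: 'What is this invoice for?',
        documents: [{
            id: '14',
            data: extractedData
        }]
    });
});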

        

Keep the following considerations in mind:
