Describe Image Content

These examples show how to generate a textual analysis or description of the contents of a given image.

You supply an image together with a text question as the prompt (for example, "What is this image about?" or "How many birds are there in this image?"). The LLM responds with a textual answer or description based on the task specified in the prompt. You can then use that text for image classification, object detection, or similarity search.

Describe Images Using Public REST Providers
Perform an image-to-text transformation by supplying an image along with a text question as the prompt, using publicly hosted third-party LLMs from Google AI, Hugging Face, OpenAI, or Vertex AI.
Describe Images Using the Local REST Provider Ollama
Perform an image-to-text transformation by supplying an image along with a text question as the prompt to open LLMs served through Ollama, the local host REST endpoint provider.
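
As a rough illustration of the image-plus-question prompt, the following Python sketch sends both to a local Ollama endpoint and returns the text answer. It is not taken from the linked examples. It assumes Ollama is running at its default address (http://localhost:11434) and that a vision-capable model, here assumed to be "llava", has already been pulled. The helper names build_payload and describe_image are hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: default local Ollama address; adjust for your setup.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(image_bytes: bytes, question: str, model: str = "llava") -> dict:
    """Pair an image with a text question in one non-streaming request body."""
    return {
        "model": model,  # assumption: a vision-capable model pulled locally
        "prompt": question,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON object instead of a stream
    }


def describe_image(image_path: str, question: str, model: str = "llava") -> str:
    """Send the image and question to Ollama and return the text answer."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read(), question, model)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Supplying data makes urllib issue a POST request.
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With a running server, a call such as describe_image("birds.jpg", "How many birds are there in this image?") would return the model's answer as plain text, where birds.jpg stands in for any local image file.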
