Azure AI Vision – Image Analysis – Generate Image Captions in Optimizely CMS

In this article, I demonstrate how to generate a list of image captions when an image is uploaded in Optimizely CMS. This is done with the image captions feature provided by the Azure AI Vision service.

This builds on the article Azure AI Vision – Image Analysis – Generating a Single Image Caption in Optimizely CMS, where I demonstrated how to generate a single image caption. Here, I explain how the image caption feature can be used to generate a list of captions associated with the image.

By default, the captions generated by the Azure AI Vision service may contain gender-specific terms such as “man,” “woman,” “boy,” and “girl.” To obtain gender-neutral captions, these terms can be replaced with “person” by setting the optional API request parameter “gender-neutral-caption” to true in the request URL. The image used for generating image captions is provided above.
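To make the request parameter concrete, here is a minimal sketch of how the analyze URL could be assembled. It is written in Python purely to illustrate the REST call and is not the C# code used in the CMS. The path /computervision/imageanalysis:analyze and the feature name denseCaptions are my assumptions based on the Image Analysis 4.0 REST API, the API version is taken from the model version shown in the response, and the endpoint host and the function name build_analyze_url are hypothetical.

```python
from urllib.parse import urlencode


def build_analyze_url(endpoint: str,
                      gender_neutral: bool = True,
                      api_version: str = "2023-02-01-preview") -> str:
    """Build the analyze URL for a dense captions request.

    Assumption: the path and the 'features' value follow the
    Image Analysis 4.0 REST API; adjust them to the API version you use.
    """
    query = urlencode({
        "api-version": api_version,
        "features": "denseCaptions",
        # The optional parameter described above; the service expects "true"/"false".
        "gender-neutral-caption": str(gender_neutral).lower(),
    })
    return f"{endpoint.rstrip('/')}/computervision/imageanalysis:analyze?{query}"


# Hypothetical endpoint, for illustration only.
url = build_analyze_url("https://example.cognitiveservices.azure.com/")
print(url)
```

The image bytes would then be sent to this URL in a POST request, with the resource key in the Ocp-Apim-Subscription-Key header.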

Below is a screenshot of a code snippet used to retrieve the list of image captions.

Response from the API

Image Analysis- Dense Captions Operation has finished :
Image height = 769
Image width = 1024
Model version = 2023-02-01-preview
Dense Captions Generated Count = 10
Caption Content: “a living room with a chandelier and couches”
Caption Content: “a plant in a room”
Caption Content: “a red chair with a pillow”
Caption Content: “a chair with a plant on a table”
Caption Content: “a glass coffee table with legs”
Caption Content: “a living room with a chandelier and a couch”
Caption Content: “a lamp with a white shade”
Caption Content: “a table with books and a plant on it”
Caption Content: “a lamp shade on a table”
Caption Content: “a brown liquid with a brown substance”
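The output above is a formatted view of the API result. As a rough illustration of how a raw JSON response might be reduced to a plain list of caption strings, here is a small Python sketch. The denseCaptionsResult.values[].text shape is my assumption based on the Image Analysis 4.0 REST response, the function name extract_dense_captions is hypothetical, and the sample payload is trimmed to two of the captions shown above, with fields such as confidence and bounding box omitted.

```python
import json
from typing import List


def extract_dense_captions(response_body: str) -> List[str]:
    """Return the caption texts from an analyze response, in the order received.

    Assumption: captions live under denseCaptionsResult -> values -> text.
    """
    payload = json.loads(response_body)
    values = payload.get("denseCaptionsResult", {}).get("values", [])
    # Skip entries without usable text rather than storing empty strings.
    return [item["text"].strip() for item in values if item.get("text", "").strip()]


# Trimmed, illustrative payload; a real response carries more fields per caption.
sample = json.dumps({
    "modelVersion": "2023-02-01-preview",
    "metadata": {"width": 1024, "height": 769},
    "denseCaptionsResult": {
        "values": [
            {"text": "a living room with a chandelier and couches"},
            {"text": "a plant in a room"},
        ]
    },
})

captions = extract_dense_captions(sample)
print(len(captions), captions)
```

On the CMS side, the equivalent list of strings is what would be assigned to the IList<string> property.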

When the image caption feature has finished processing the image, the list of image captions is returned from the API, as shown above. The image used in this example produced 10 captions. These captions are then stored in an IList<string> property and published in the CMS. The resulting list is provided below: