Azure AI Vision – Image Analysis – Generating a Single Image Caption in Optimizely CMS

This article demonstrates how to generate a single caption for an image as it is uploaded to Optimizely CMS, using the image caption feature of the Azure AI Vision service.

Azure AI Vision, which is powered by Microsoft’s Florence large foundation model, brings significant enhancements to image analysis along with new customisation capabilities. Image Analysis can return a range of information about an image, including a succinct, one-sentence description of its contents: the caption.

The image used to generate the caption is shown above.

The screenshot below shows the code snippet used to retrieve the single image caption.
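For readers who cannot see the screenshot, here is a minimal sketch of the same kind of call, made against the Image Analysis REST endpoint using only the Python standard library. It is not the code from the screenshot. The function names are hypothetical, and the API version is an assumption taken from the model version string reported in the response. The endpoint and key would come from your own Azure AI Vision resource.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumed API version: same string as the model version reported in the response.
API_VERSION = "2023-02-01-preview"


def build_caption_request(endpoint: str, key: str, image_bytes: bytes) -> urllib.request.Request:
    """Build a POST request that asks Image Analysis for the caption feature only."""
    query = urlencode({"api-version": API_VERSION, "features": "caption"})
    url = f"{endpoint.rstrip('/')}/computervision/imageanalysis:analyze?{query}"
    headers = {
        "Ocp-Apim-Subscription-Key": key,            # resource key from the Azure portal
        "Content-Type": "application/octet-stream",  # raw image bytes in the body
    }
    return urllib.request.Request(url, data=image_bytes, headers=headers, method="POST")


def analyze_image(endpoint: str, key: str, image_bytes: bytes) -> dict:
    """Send the request and return the decoded JSON body (needs network access and a valid key)."""
    request = build_caption_request(endpoint, key, image_bytes)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Building the request separately from sending it keeps the URL and headers easy to inspect before any network call is made. In a CMS integration the image bytes would typically be read from the uploaded file's stream.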

Response from the API

Image Analysis- Image Caption Operation has finished :
Image height = 1600
Image width = 2400
Model version = 2023-02-01-preview
Image Caption = a room with a large chandelier and couches
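As a rough illustration of how output like the above could be produced from the raw JSON body, the sketch below extracts the same four values. The JSON layout (`captionResult`, `modelVersion`, `metadata`) and the caption key name are assumptions about the response shape. The code accepts either `content` or `text` for the caption because the key is believed to differ between API versions. The sample body is built only from the values printed above, and the function name is hypothetical.

```python
import json

# Sample body assembled from the values shown in the response above (assumed layout).
SAMPLE_BODY = json.dumps({
    "captionResult": {"content": "a room with a large chandelier and couches"},
    "modelVersion": "2023-02-01-preview",
    "metadata": {"width": 2400, "height": 1600},
})


def summarize_caption_response(body: str) -> list[str]:
    """Turn an Image Analysis JSON body into the report lines shown in this article."""
    result = json.loads(body)
    caption = result.get("captionResult", {})
    # Assumption: the caption string sits under "content" or "text", depending on API version.
    caption_text = caption.get("content") or caption.get("text", "")
    metadata = result.get("metadata", {})
    return [
        f"Image height = {metadata.get('height')}",
        f"Image width = {metadata.get('width')}",
        f"Model version = {result.get('modelVersion')}",
        f"Image Caption = {caption_text}",
    ]


if __name__ == "__main__":
    print("\n".join(summarize_caption_response(SAMPLE_BODY)))
```

The caption string extracted here is the value that would then be stored on the image content in the CMS.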

Once the analysis has finished, the image is uploaded to the CMS and the generated caption is assigned to a string property on the ImageFile type. The screenshot below shows that string property in the CMS, populated with the caption returned by the Azure AI Vision service.