Azure AI Vision – Image Analysis – Object detection in Optimizely CMS

In this article, I demonstrate how the object detection feature of the Azure AI Vision service can identify a range of objects within an image uploaded to Optimizely CMS.

For instance, if an image contains a dog, a cat, and a person, object detection lists these objects along with their coordinates within the image. This makes it possible to analyse the spatial relationships between the objects in an image, and to identify multiple instances of the same object.
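To make the idea of spatial relationships and repeated instances concrete, here is a minimal sketch in Python. It assumes each detection carries a name and a bounding box given as x, y, width, and height in pixels; the `Detection` class, the helper functions, and the sample coordinates are all hypothetical and are not taken from the service or from the CMS integration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical container for one detected object (pixel coordinates)."""
    name: str
    x: int  # left edge of the bounding box
    y: int  # top edge of the bounding box
    w: int  # width of the bounding box
    h: int  # height of the bounding box


def count_instances(detections):
    """Count how many times each object name was detected."""
    return Counter(d.name for d in detections)


def is_left_of(a, b):
    """True when a's box ends at or before the point where b's box begins."""
    return a.x + a.w <= b.x


# Made-up sample data for an image containing two dogs, a cat, and a person.
detections = [
    Detection("dog", 50, 400, 300, 250),
    Detection("cat", 500, 450, 200, 180),
    Detection("person", 900, 100, 350, 800),
    Detection("dog", 1300, 420, 250, 230),
]

counts = count_instances(detections)
print(counts["dog"])                               # number of dogs found
print(is_left_of(detections[0], detections[1]))    # is the first dog left of the cat?
```

The same pattern extends to other relationships (above, below, overlapping) by comparing the other box edges.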

Object detection applies tags based on the objects or living entities recognised within the image. Conceptually, object detection is confined to identifying objects and living entities, whereas the tagging function also covers contextual terms, such as “indoor”, which cannot be localised using bounding boxes. It is important to understand the limitations of object detection in order to avoid or mitigate the impact of false negatives (missed objects) and limited detail:

  • Objects smaller than 5% of the image are typically not detected.
  • Objects arranged closely together, such as a stack of plates, are generally not detected.
  • Brand or product names are not used to differentiate objects.
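The size limitation above can be turned into a rough pre-check. The sketch below assumes that "5% of the image" refers to the share of the image area covered by the object's bounding box, which is my interpretation rather than a documented definition; the function names and the sample sizes are hypothetical.

```python
def area_fraction(box_w, box_h, image_w, image_h):
    """Share of the image area covered by a box (0.0 to 1.0)."""
    return (box_w * box_h) / (image_w * image_h)


def likely_too_small(box_w, box_h, image_w, image_h, threshold=0.05):
    """True when the box covers less than the assumed 5% threshold."""
    return area_fraction(box_w, box_h, image_w, image_h) < threshold


# Example sizes for a 1600 x 1000 pixel image.
IMAGE_W, IMAGE_H = 1600, 1000

small = likely_too_small(200, 150, IMAGE_W, IMAGE_H)   # covers 1.875% of the image
large = likely_too_small(800, 500, IMAGE_W, IMAGE_H)   # covers 25% of the image
print(small, large)
```

A check like this is only a guide for deciding whether an object of interest might be missed; the service does not expose such a threshold as a setting.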

The image shown above will be used for the object detection process.

Below is a screenshot of a code snippet used for the object detection process.

Response from the API

Image Analysis- Image Objects Detection Operation has finished :
Image height = 1000
Image width = 1600
Model version = 2023-02-01-preview
Image Objects Detected Count = 3
“couch”, Confidence 0.6550
“kitchen appliance”, Confidence 0.5340
“couch”, Confidence 0.5430
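The response above can be reduced to a plain list of names before it is stored. The following is a minimal sketch in Python rather than the CMS code itself: the `to_tag_list` helper, the 0.5 confidence cut-off, and the removal of duplicates are all assumptions for illustration, and the names and confidence values are copied from the response.

```python
def to_tag_list(detections, min_confidence=0.5):
    """Return distinct object names, in first-seen order, above a confidence cut-off.

    The cut-off and the de-duplication are illustrative choices, not service behaviour.
    """
    seen = set()
    tags = []
    for name, confidence in detections:
        if confidence >= min_confidence and name not in seen:
            seen.add(name)
            tags.append(name)
    return tags


# (name, confidence) pairs taken from the API response shown above.
detections = [
    ("couch", 0.6550),
    ("kitchen appliance", 0.5340),
    ("couch", 0.5430),
]

tags = to_tag_list(detections)
print(tags)
```

Dropping the duplicate check keeps one entry per detected instance, which is closer to a count of objects than to a set of tags.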

As the API response shows, object detection found three objects in the image. Their names are subsequently stored as captions in an IList<string> property and published in the CMS as part of the image upload process. Below is the resultant list of captions.