Azure AI Vision Image Analysis Study Guide

๐Ÿ” Problem

You need to analyze images to automatically:

  • ๐Ÿ“ Generate descriptive captions in natural language
  • ๐Ÿท๏ธ Suggest tags representing objects, scenery, or actions
  • ๐ŸŽฏ Detect and locate objects or people in the image

You want to implement this with Azure services, using the appropriate architecture, components, and configuration.


โ˜๏ธ Solution with Azure

Use Azure AI Vision (Computer Vision) service to analyze images and extract information. Provision an Azure AI Vision resource and connect to it through REST API or SDK (Python, .NET, etc.).

Key capabilities include:

  • CAPTION: Generate a natural language description of the image
  • TAGS: Identify objects, scenery, setting, and actions
  • OBJECTS: Locate objects with bounding boxes
  • PEOPLE: Locate people with bounding boxes
  • DENSE_CAPTIONS: Generate detailed captions for detected objects
  • SMART_CROPS: Suggest crop regions for a specified aspect ratio
  • READ: Extract readable text from images (OCR)

Components Required

  • Azure AI Vision resource, provisioned in one of the following ways:

    • Azure AI Foundry project → AI Foundry hub → multi-service resource (includes AI Vision)
    • Azure AI services multi-service resource
    • Standalone Computer Vision resource (includes a free tier for testing)
  • Client app (Python, .NET, etc.) using:

    • REST API
    • SDK (e.g., azure.ai.vision.imageanalysis for Python)
  • Authentication:

    • Key-based (authorization key)
    • Microsoft Entra ID token
    • (Production) Managed identity or Azure Key Vault for securing credentials

๐Ÿ—๏ธ Architecture / Development

1. ๐Ÿš€ Provision the resource:

  • Create AI Vision / multi-service resource
  • Obtain endpoint (e.g., https://<resource_name>.cognitiveservices.azure.com/)
  • Obtain key or set up Entra ID access

2. ๐Ÿ”Œ Connect client app:

Example in Python (key-based):

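from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.ai.vision.imageanalysis.models import VisualFeatures
from azure.core.credentials import AzureKeyCredential

# Authenticate with the resource endpoint and key
client = ImageAnalysisClient(
    endpoint="<YOUR_RESOURCE_ENDPOINT>",
    credential=AzureKeyCredential("<YOUR_AUTHORIZATION_KEY>"),
)

# Read the image as bytes ("<IMAGE_FILE_PATH>" is a placeholder)
with open("<IMAGE_FILE_PATH>", "rb") as image_file:
    image_data = image_file.read()

# Request only the visual features you need
result = client.analyze(
    image_data=image_data,
    visual_features=[VisualFeatures.CAPTION, VisualFeatures.TAGS],
    gender_neutral_caption=True,
)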

3. ๐Ÿ–ผ๏ธ Image requirements:

  • Format: JPEG, PNG, GIF, BMP
  • Size: < 4 MB
  • Dimensions: > 50 x 50 pixels
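The requirements above can be checked locally before an image is sent. The following is a minimal sketch using only the standard library. The function names are hypothetical, "4 MB" is read as 4 × 1024 × 1024 bytes, and "> 50 x 50" is read as both sides exceeding 50 pixels. These readings are assumptions. Dimensions are read only for PNG, GIF, and BMP (assuming the common BITMAPINFOHEADER layout); JPEG dimensions are not parsed here.

```python
import struct

MAX_BYTES = 4 * 1024 * 1024  # assumption: "< 4 MB" read as 4 MiB
MIN_SIDE = 50                # assumption: each side must exceed 50 pixels


def sniff_format(data: bytes):
    """Return 'JPEG', 'PNG', 'GIF', 'BMP', or None, based on magic bytes."""
    if data.startswith(b"\xff\xd8\xff"):
        return "JPEG"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "PNG"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "GIF"
    if data.startswith(b"BM"):
        return "BMP"
    return None


def read_dimensions(data: bytes, fmt):
    """Return (width, height) for PNG, GIF, and BMP; None when not parsed."""
    if fmt == "PNG" and len(data) >= 24:
        return struct.unpack(">II", data[16:24])   # IHDR width, height
    if fmt == "GIF" and len(data) >= 10:
        return struct.unpack("<HH", data[6:10])    # logical screen size
    if fmt == "BMP" and len(data) >= 26:
        width, height = struct.unpack("<ii", data[18:26])
        return abs(width), abs(height)             # negative height = top-down rows
    return None                                    # JPEG is not parsed in this sketch


def check_image(data: bytes):
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    fmt = sniff_format(data)
    if fmt is None:
        problems.append("unsupported format")
    if len(data) >= MAX_BYTES:
        problems.append("file is not smaller than 4 MB")
    dims = read_dimensions(data, fmt)
    if dims is not None and (dims[0] <= MIN_SIDE or dims[1] <= MIN_SIDE):
        problems.append("image is not larger than 50 x 50 pixels")
    return problems
```

Calling check_image on the bytes you are about to upload lets the client report a clear message instead of waiting for the service to reject the request.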

4. Submit image:

  • Upload the image bytes
  • Or provide a URL using analyze_from_url
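For the REST route, the two submission options differ only in the request body and content type. The sketch below builds the request with the standard library but does not send it. The route (/computervision/imageanalysis:analyze), the api-version value, the Ocp-Apim-Subscription-Key header, and the lowercase feature names are assumptions taken from the Image Analysis 4.0 REST API, not from this guide, and should be verified against the current reference. The helper name build_analyze_request is hypothetical.

```python
import json
import urllib.parse
import urllib.request

API_VERSION = "2023-10-01"  # assumption: check the version your resource supports


def build_analyze_request(endpoint, key, features, image_data=None, image_url=None):
    """Build (but do not send) an Image Analysis REST request.

    Pass exactly one of image_data (bytes) or image_url (str).
    """
    if (image_data is None) == (image_url is None):
        raise ValueError("pass exactly one of image_data or image_url")

    # Keep commas literal so the features list stays readable in the URL.
    query = urllib.parse.urlencode(
        {"api-version": API_VERSION, "features": ",".join(features)},
        safe=",",
    )
    url = f"{endpoint.rstrip('/')}/computervision/imageanalysis:analyze?{query}"

    if image_data is not None:
        # Option 1: upload the raw image bytes.
        body, content_type = image_data, "application/octet-stream"
    else:
        # Option 2: let the service fetch the image from a public URL.
        body = json.dumps({"url": image_url}).encode("utf-8")
        content_type = "application/json"

    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": content_type,
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it. Building the request separately makes it easy to inspect the URL and headers first.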

5. Receive the response (JSON with captions, tags, bounding boxes, confidence scores, etc.)

Example JSON excerpt:

{
    "denseCaptionsResult": {
        "values": [
            {
                "text": "a house in the woods",
                "confidence": 0.705,
                "boundingBox": { "x": 0, "y": 0, "w": 640, "h": 640 }
            }
        ]
    }
}
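A response shaped like the excerpt above can be read with the standard json module. This sketch keeps only dense captions at or above a confidence threshold. The function name and the 0.5 default are hypothetical choices for illustration, and the code assumes every entry carries the text, confidence, and boundingBox fields shown in the excerpt.

```python
import json


def dense_captions_above(response_json: str, min_confidence: float = 0.5):
    """Return (text, confidence, (x, y, w, h)) tuples at or above min_confidence."""
    payload = json.loads(response_json)
    values = payload.get("denseCaptionsResult", {}).get("values", [])
    hits = []
    for item in values:
        if item["confidence"] >= min_confidence:
            box = item["boundingBox"]
            hits.append(
                (item["text"], item["confidence"],
                 (box["x"], box["y"], box["w"], box["h"]))
            )
    return hits


# The excerpt from this guide, used as sample input.
sample = """
{
    "denseCaptionsResult": {
        "values": [
            {
                "text": "a house in the woods",
                "confidence": 0.705,
                "boundingBox": { "x": 0, "y": 0, "w": 640, "h": 640 }
            }
        ]
    }
}
"""

for text, confidence, box in dense_captions_above(sample):
    print(f"{text} ({confidence:.3f}) at {box}")
```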

โญ Best Practice / Considerations

  • ๐Ÿ”’ Use Microsoft Entra ID authentication + managed identity for production security
  • ๐Ÿ—๏ธ Secure keys in Azure Key Vault if key-based authentication is required
  • ๐Ÿค For collaborative or multi-AI service solutions, prefer Azure AI Foundry projects
  • ๐Ÿ“ Ensure image size, format, and dimensions meet requirements to avoid API errors
  • โšก Specify only necessary visual features to reduce processing time and cost

๐Ÿ“ Simulated Exam Questions

1๏ธโƒฃ You need to generate a natural language caption and identify tags for an image using Azure. Which visual features should you request?

  • โœ… CAPTION, TAGS
  • โŒ OBJECTS, PEOPLE
  • โŒ READ, SMART_CROPS

2๏ธโƒฃ A client app sends images for analysis but fails due to large file size. What is the maximum allowed file size for image analysis in Azure AI Vision?

  • โœ… 4 MB
  • โŒ 10 MB
  • โŒ 50 MB

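3. Which authentication approach should you use to secure a production app that connects to Azure AI Vision?

  • ✅ Microsoft Entra ID with managed identity
  • ❌ Hardcoded authorization key in app
  • ❌ Anonymous access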

4๏ธโƒฃ Which resource type should you choose if you want to experiment with Azure AI Vision at no cost?

  • โœ… Standalone Computer Vision resource (free tier)
  • โŒ AI Foundry project
  • โŒ Multi-service AI resource