page_type | languages | name | description | products | |||||
---|---|---|---|---|---|---|---|---|---|
sample |
|
Azure Image Analysis samples |
C++, C# and Python samples for Image Analysis using Azure AI Vision SDK (Preview) |
|
This repository hosts sample code and setup documents for the Microsoft Azure AI Vision SDK (Preview).
- Vision SDK 0.11.1-beta.1 released May 2023. Image Analysis APIs were updated to support Background Removal.
- Vision SDK 0.10.0-beta.1 released April 2023. Image Analysis APIs were updated to support Dense Captions.
- Vision SDK 0.9.0-beta.1 was first released in March 2023, targeting Image Analysis applications on Windows and Linux platforms.
This repository hosts samples that help you get started with several features of the SDK in public preview. This includes the following API sets:
- Image Analysis
Other API sets are under development.
Please open a new issue in this repo if you encounter any problems building or running the samples, or have any additional questions about the SDK. This is the preferred method of getting support. Note that these issues will be visible to the public, so please do not include any sensitive information.
Alternatively, you can contact Microsoft's Vision SDK development team directly by sending an e-mail to [email protected].
- Running the samples in this repository requires you to install the Azure AI Vision SDK. By doing so, you acknowledge the Azure AI Vision SDK license agreement.
- The easiest way to get access to these samples is to download the content of this repo as a ZIP file.
- Alternatively, you can use a Git client to clone this repository to your hard drive by running:
  git clone https://github.com/Azure-Samples/azure-ai-vision-sdk.git
See the Microsoft documentation for an overview of Image Analysis. The Vision SDK Image Analysis APIs (preview) use the Image Analysis REST API v4.0 (preview).
The Image Analysis APIs support the extraction of one or more of the following visual features using a single REST call:
- Caption - Generates a human-readable phrase that describes the whole image content. For example, for the above image, "A woman wearing a mask sitting at a table with a laptop".
- Dense Captions - Generates a human-readable phrase that describes the whole image content, and up to 9 additional descriptions that describe sub-regions of the image.
- Tags - Returns content tags for recognizable objects, living beings, scenery, and actions that appear in the image.
- Objects - Detects various objects within an image, including their approximate location. See example in the above image: person, two chairs, laptop, dining table.
- People - Detects people in the image, including their approximate location.
- Text - Also known as Read or OCR. Performs Optical Character Recognition (OCR) and returns the text detected in the image, including the approximate location of every text line and word.
- Crop Suggestions - Also known as Smart Crop. Recommendations for cropping operations that preserve content (e.g. for thumbnail generation).
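As a rough illustration of requesting several visual features in a single REST call, the sketch below builds one analyze URL from a list of feature names. The URL path, the `api-version` value, and the query-string names in `FEATURE_QUERY_NAMES` are assumptions that this README does not confirm, and `build_analyze_url` is a hypothetical helper, not part of the SDK. Check the REST API reference before relying on any of them.

```python
from urllib.parse import urlencode

# Assumed mapping from the feature names used in this README to the
# query-string values of the REST API v4.0 (preview). Not confirmed here.
FEATURE_QUERY_NAMES = {
    "Caption": "caption",
    "Dense Captions": "denseCaptions",
    "Tags": "tags",
    "Objects": "objects",
    "People": "people",
    "Text": "read",
    "Crop Suggestions": "smartCrops",
}

# Assumed preview API version string.
API_VERSION = "2023-02-01-preview"


def build_analyze_url(endpoint, features):
    """Return one analyze URL that requests all the given features."""
    unknown = [f for f in features if f not in FEATURE_QUERY_NAMES]
    if unknown:
        raise ValueError(f"Unknown feature(s): {unknown}")
    query = urlencode(
        {
            "api-version": API_VERSION,
            "features": ",".join(FEATURE_QUERY_NAMES[f] for f in features),
        },
        safe=",",  # keep the comma-separated feature list readable
    )
    return f"{endpoint.rstrip('/')}/computervision/imageanalysis:analyze?{query}"


# Placeholder endpoint; substitute your own Vision resource endpoint.
url = build_analyze_url(
    "https://example.cognitiveservices.azure.com/",
    ["Caption", "Text", "People"],
)
print(url)
```

Keeping the mapping in one dictionary means that a typo in a feature name fails early with a `ValueError` instead of producing a request that the service rejects.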
The Image Analysis APIs also support background removal (segmentation). This feature can either output an image of the detected foreground object with a transparent background, or a gray-scale alpha matte image showing the opacity of the detected foreground object.
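The two output kinds described above could be selected with a query parameter when calling the service over REST. The sketch below assumes a `segment` operation with a `mode` parameter whose values are `backgroundRemoval` and `foregroundMatting`. These names, the URL path, and the API version are assumptions, and `build_segment_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed values of the "mode" query parameter (not confirmed by this README).
SEGMENTATION_MODES = {
    "transparent_foreground": "backgroundRemoval",  # foreground on a transparent background
    "alpha_matte": "foregroundMatting",             # gray-scale opacity image
}


def build_segment_url(endpoint, output_kind):
    """Return a segmentation URL for the requested kind of output."""
    if output_kind not in SEGMENTATION_MODES:
        raise ValueError(f"output_kind must be one of {sorted(SEGMENTATION_MODES)}")
    query = urlencode(
        {
            "api-version": "2023-02-01-preview",  # assumed preview version
            "mode": SEGMENTATION_MODES[output_kind],
        }
    )
    return f"{endpoint.rstrip('/')}/computervision/imageanalysis:segment?{query}"


# Placeholder endpoint; substitute your own Vision resource endpoint.
matte_url = build_segment_url(
    "https://example.cognitiveservices.azure.com", "alpha_matte"
)
print(matte_url)
```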
For all scenarios, you can either upload an image for analysis by providing the name of an image file on disk, or provide a publicly accessible URL of the image.
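When calling the service over REST, the two input options typically map to two different request bodies. The sketch below assumes that a local file is sent as raw bytes with an `application/octet-stream` content type and that a URL is sent as a small JSON document with a `url` field. Both conventions are assumptions here, and `build_image_payload` is a hypothetical helper.

```python
import json
from pathlib import Path


def build_image_payload(image_file=None, image_url=None):
    """Return (headers, body) for exactly one image source."""
    if (image_file is None) == (image_url is None):
        raise ValueError("Provide exactly one of image_file or image_url")
    if image_file is not None:
        # Local file: send the raw image bytes.
        headers = {"Content-Type": "application/octet-stream"}
        return headers, Path(image_file).read_bytes()
    # Remote image: send a JSON body that points at the URL.
    headers = {"Content-Type": "application/json"}
    return headers, json.dumps({"url": image_url}).encode("utf-8")


headers, body = build_image_payload(image_url="https://example.com/image.jpg")
print(headers, body)
```

Rejecting calls that pass both sources or neither keeps the choice explicit, so a request never silently ignores one of the inputs.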
At the moment the SDK is available for the following platforms and programming languages:
- Platforms:
  - Windows 10 x64 (and above)
  - Linux x64 running Ubuntu 18.04/20.04/22.04, Debian 9/10/11, Red Hat Enterprise Linux (RHEL) 7/8
- Programming languages:
  - Python
  - C# (.NET Core)
  - C++
Support for other platforms and programming languages (including Android, iOS, and macOS) is planned for future releases.
If your platform and/or programming language is not listed above, your application will need to directly implement REST calls to the Vision Service using the Image Analysis REST API v4.0 (preview).
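For readers who take the direct REST route, the following is a minimal sketch of assembling such a call with only the Python standard library. The URL path, the `api-version` value, the `Ocp-Apim-Subscription-Key` header, and the JSON body shape are assumptions not confirmed by this README. `make_analyze_request` and `send` are hypothetical helpers, and the endpoint and key are placeholders.

```python
import json
import urllib.request


def make_analyze_request(endpoint, key, image_url, features="caption"):
    """Build (but do not send) a POST request that analyzes an image URL."""
    url = (
        f"{endpoint.rstrip('/')}/computervision/imageanalysis:analyze"
        f"?api-version=2023-02-01-preview&features={features}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps({"url": image_url}).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,  # assumed authentication header
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = make_analyze_request(
    "https://example.cognitiveservices.azure.com",
    "<your-key>",
    "https://example.com/image.jpg",
)
print(request.get_method(), request.full_url)
# send(request) would perform the network call; it is not invoked here.
```

Separating request construction from sending makes the request easy to inspect or log before any network traffic occurs.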
The samples will show how to analyze an image file from local disk or an image URL. Click on the links below for detailed setup, build and run instructions corresponding to your programming language:
Programming Language |
---|
C++ |
C# .NET Core |
Python |
There are currently four samples, with more to come:
- Analyze all features from a JPEG image file on disk and print detailed results to the console. This is done using the synchronous (blocking) API. Start with this sample.
- Analyze one feature from an image URL, using the asynchronous (non-blocking) API, while registering for an event to get the analysis results.
- Analyze an image using a custom-trained model. To run this sample, you first need to create a custom model. See Image Analysis overview for more details.
- Analyze an image for background removal (segmentation).
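The difference between the synchronous (blocking) and asynchronous (event-based) styles used by the first two samples can be illustrated generically. The sketch below does not use the Vision SDK. `analyze` is a hypothetical stand-in that returns a fake result, and the callback mechanism comes from Python's `concurrent.futures`, not from the SDK's own event API.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze(image_source):
    """Hypothetical stand-in for an analysis call; returns a fake result."""
    return {"source": image_source, "caption": "placeholder caption"}


# Synchronous (blocking): the caller waits until the result is available.
sync_result = analyze("sample.jpg")
print("sync:", sync_result)

# Asynchronous (non-blocking): register a handler, then continue working.
received = []


def on_analyzed(future):
    """Handler invoked once the analysis has finished."""
    received.append(future.result())


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(analyze, "https://example.com/image.jpg")
    future.add_done_callback(on_analyzed)
    # Other work could run here while the analysis is in flight.

# Leaving the with-block waits for outstanding work, so the handler has run.
print("async:", received)
```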