Detect Objects Using Deep Learning (Image Analyst Tools)
Summary
Runs a trained deep learning model on an input raster to produce a feature class containing the objects it finds. The features can be bounding boxes or polygons around the objects found or points at the centers of the objects.
This tool requires a model definition file containing trained model information. The model can be trained using the Train Deep Learning Model tool or with third-party training software such as PyTorch. The model definition file can be an Esri model definition JSON file (.emd) or a deep learning model package (.dlpk), and it must contain the path to the Python raster function that will be called to process each object and the path to the trained binary deep learning model file.
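As a rough illustration of the requirement above, the following sketch checks that a model definition supplied as a JSON string names both the Python raster function and the trained model file. The key names InferenceFunction and ModelFile, the helper name, and all file names are assumptions for illustration; confirm them against the .emd file provided by the model's creator.

```python
import json

# Assumed key names for the two required paths; verify against your own .emd file.
REQUIRED_KEYS = ("InferenceFunction", "ModelFile")


def missing_model_definition_keys(emd_text):
    """Return the required keys that are absent or empty in an .emd JSON string."""
    definition = json.loads(emd_text)
    return [key for key in REQUIRED_KEYS if not definition.get(key)]


# Hypothetical model definition; both paths are placeholders.
example_emd = json.dumps({
    "InferenceFunction": "./CustomObjectDetector.py",
    "ModelFile": "./trees.pth",
})

# A complete definition reports no missing keys.
print(missing_model_definition_keys(example_emd))
# A definition without the raster function path reports that key.
print(missing_model_definition_keys('{"ModelFile": "./trees.pth"}'))
```

A check like this only confirms that the entries are present. It does not verify that the referenced files exist or that the deep learning framework is installed.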
Usage
You must install the proper deep learning framework Python API (such as PyTorch) in the ArcGIS Pro Python environment; otherwise, an error will occur when you add the Esri model definition file to the tool. Obtain the appropriate framework information from the creator of the Esri model definition file.
To set up your machine to use deep learning frameworks in ArcGIS Pro, see Install deep learning frameworks for ArcGIS.
This tool calls a third-party deep learning Python API (such as PyTorch) and uses the specified Python raster function to process each object.
Sample use cases for this tool are available on the Esri Python raster function GitHub page. You can also write custom Python modules by following examples and instructions in the GitHub repository.
The Model Definition parameter value can be an Esri model definition JSON file (.emd), a JSON string, or a deep learning model package (.dlpk). A JSON string is useful when this tool is used on the server so you can paste the JSON string rather than upload the .emd file. The .dlpk file must be stored locally.
The tool can process input imagery that is in map space or in pixel space. Imagery in map space is in a map-based coordinate system. Imagery in pixel space is based on rows and columns with no rotation and no distortion. The reference system can be specified when generating the training data in the Export Training Data For Deep Learning tool using the Reference System parameter. If the model is trained in a third-party training software, the reference system must be specified in the .emd file using the ImageSpaceUsed parameter, which can be set to MAP_SPACE or PIXEL_SPACE.
For oriented imagery layers, the processing will always occur in pixel space. When pixel space is used for processing, the pixel space detections are preserved in the output table in the IShape field.
Increasing the batch size can improve tool performance; however, as the batch size increases, more memory is used. If an out of memory error occurs, use a smaller batch size. The batch_size value can be adjusted using the Arguments parameter.
Batch sizes are square numbers, such as 1, 4, 9, 16, 25, 64 and so on. If the input value is not a perfect square, the highest possible square value is used. For example, if a value of 6 is specified, the batch size is set to 4.
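The rounding rule for batch sizes can be sketched in a few lines of Python. This is not the tool's own code; it only mirrors the rule as stated, reading "highest possible square value" as the largest perfect square that does not exceed the requested value. The function name is hypothetical.

```python
import math


def effective_batch_size(requested):
    """Round a requested batch size down to the nearest perfect square."""
    if requested < 1:
        raise ValueError("batch size must be at least 1")
    # math.isqrt returns the integer square root, rounded down.
    return math.isqrt(requested) ** 2


for requested in (1, 6, 16, 30):
    print(requested, "->", effective_batch_size(requested))
```

With the value from the example above, a requested batch size of 6 yields 4.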
Use the Non Maximum Suppression parameter to identify and remove duplicate features from the object detection. To learn more about this parameter, see the Usage section of the Non Maximum Suppression tool. When the inputs are oriented imagery layers, the duplicates are retained with null ground geometries.
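To make the idea concrete, the following is a generic greedy non maximum suppression sketch over axis-aligned boxes, using intersection over union as the overlap ratio. It is not the tool's implementation. The box format (xmin, ymin, xmax, ymax), the function names, the sample values, and the choice to treat boxes whose overlap exceeds the ratio as duplicates are all assumptions for illustration.

```python
def overlap_ratio(a, b):
    """Intersection area over union area for boxes given as (xmin, ymin, xmax, ymax)."""
    width = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    height = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = width * height
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def suppress_duplicates(detections, max_overlap_ratio=0.0):
    """Keep the highest-confidence box among boxes that overlap more than the ratio."""
    kept = []
    # Visit detections from highest to lowest confidence score.
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(overlap_ratio(box, other) <= max_overlap_ratio for other, _ in kept):
            kept.append((box, score))
    return kept


detections = [
    ((0, 0, 10, 10), 0.9),
    ((1, 1, 11, 11), 0.6),    # overlaps the first box heavily
    ((20, 20, 30, 30), 0.8),  # a separate object
]
print(suppress_duplicates(detections, max_overlap_ratio=0.5))
```

In this sample, the second box overlaps the first with a ratio of 81/119 (about 0.68), so it is dropped in favor of the higher-confidence detection, while the third box is kept.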
Use the Process candidate items only option for the Processing Mode parameter to only detect objects on select images in the mosaic dataset. You can use the Compute Mosaic Candidates tool to find the image candidates in a mosaic dataset and image service that best represent the mosaic area.
This tool supports and uses multiple GPUs if available. To use a specific GPU, specify the GPU ID environment. When the GPU ID is not set, the tool uses all available GPUs. This is the default.
When the Processor Type environment is set to CPU, and the Parallel Processing Factor environment is unspecified, the tool will use a Parallel Processing Factor value of 50%.
The input raster can be a single raster, multiple rasters in a mosaic dataset, an oriented imagery layer or dataset, an image service, a folder of images, or a feature class with images attached. For more information about attachments, see Add or remove file attachments.
For information about requirements for running this tool and issues you may encounter, see Deep Learning frequently asked questions.
For more information about deep learning, see Deep learning using the ArcGIS Image Analyst extension.
Parameters
| Label | Explanation | Data type |
|---|---|---|
| Input Raster | The input image that will be used to detect objects. The input can be a single raster, multiple rasters in a mosaic dataset, an image service, a folder of images, a feature class with image attachments, or an oriented imagery dataset or layer. | Raster Dataset; Raster Layer; Mosaic Layer; Image Service; Map Server; Map Server Layer; Internet Tiled Layer; Folder; Feature Layer; Feature Class; Oriented Imagery Layer |
| Output Detected Objects | The output feature class that will contain geometries circling the object or objects detected in the input image. If the feature class already exists, the results will be appended to the existing feature class. | Feature Class |
| Model Definition | This parameter can be an Esri model definition JSON file (.emd), a JSON string, or a deep learning model package (.dlpk). It contains the path to the deep learning binary model file, the path to the Python raster function to be used, and other parameters such as preferred tile size or padding. | File; String |
| Arguments (Optional) | The information from the Model Definition parameter will be used to populate this parameter. These arguments vary, depending on the model architecture. The following are supported model arguments for models trained in ArcGIS. ArcGIS pretrained models and custom deep learning models may have additional arguments that the tool supports. Value table columns: | Value Table |
| Non Maximum Suppression (Optional) | Specifies whether nonmaximum suppression will be performed, in which duplicate objects are identified and the duplicate features with lower confidence values are removed. | Boolean |
| Confidence Score Field (Optional) | The name of the field in the feature class that will contain the confidence scores as output by the object detection method. This parameter is required when the Non Maximum Suppression parameter is checked. | String |
| Class Value Field (Optional) | The name of the class value field in the input feature class. If no field name is provided, a | String |
| Max Overlap Ratio (Optional) | The maximum overlap ratio for two overlapping features, which is defined as the ratio of intersection area over union area. The default is 0. | Double |
| Processing Mode (Optional) | Specifies how all raster items in a mosaic dataset or an image service will be processed. This parameter is applied when the input raster is a mosaic dataset or an image service. | String |
| Use pixel space (Optional) | Specifies whether inferencing will be performed on images in pixel space. | Boolean |
| Objects of Interest (Optional) | Specifies the object names that will be detected by the tool. The available options will be based on the Model Definition parameter value. This parameter is only active when the model detects more than one type of object. | String |
Derived output
| Label | Explanation | Data type |
|---|---|---|
| Output Classified Raster | The output classified raster for pixel classification. The name of the raster dataset will be the same as the Output Detected Objects parameter value. This parameter is only applicable when the model type is Panoptic Segmentation. | Raster Dataset |
Environments
Cell Size, Current Workspace, Extent, Geographic Transformations, GPU ID, Mask, Output Coordinate System, Parallel Processing Factor, Processor Type, Scratch Workspace
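The parameters and environments above map onto a Python call roughly as follows. This is a hedged sketch, not a verified sample. It assumes the tool is available as DetectObjectsUsingDeepLearning in arcpy.ia with positional arguments in the order of the parameter table, that the Arguments value is passed as semicolon-separated name and value pairs, that "NMS" is the keyword for performing non maximum suppression, and that the Processor Type and GPU ID environments are exposed as arcpy.env.processorType and arcpy.env.gpuId. The helper names, paths, field names, and argument values (including threshold) are placeholders.

```python
def format_arguments(arguments):
    """Join model arguments into 'name value;name value' text (assumed format)."""
    return ";".join(f"{name} {value}" for name, value in arguments.items())


def detect_objects(in_raster, out_features, model_definition, arguments):
    """Run the tool; requires ArcGIS Pro with the Image Analyst extension."""
    # Imported here so format_arguments can be used without ArcGIS installed.
    import arcpy
    from arcpy.ia import DetectObjectsUsingDeepLearning

    arcpy.CheckOutExtension("ImageAnalyst")
    arcpy.env.processorType = "GPU"  # assumed name of the Processor Type setting
    arcpy.env.gpuId = "0"            # assumed name of the GPU ID setting

    DetectObjectsUsingDeepLearning(
        in_raster,
        out_features,
        model_definition,
        format_arguments(arguments),
        "NMS",         # perform non maximum suppression (assumed keyword)
        "Confidence",  # confidence score field, required with NMS
        "Class",       # class value field
        0.1,           # max overlap ratio
    )


# Placeholder argument values; batch_size is a perfect square.
print(format_arguments({"padding": 0, "threshold": 0.5, "batch_size": 4}))
```

A call might then look like detect_objects("C:/data/image.tif", "C:/data/results.gdb/detected", "C:/data/model.emd", {"batch_size": 4}), with every path replaced by your own data.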
Licensing information
- Basic: Requires Image Analyst
- Standard: Requires Image Analyst
- Advanced: Requires Image Analyst
Related topics
- An overview of the Deep Learning toolset
- Install deep learning frameworks for ArcGIS
- Train Deep Learning Model
- Deep learning arguments
- Deep learning model architectures
- Find a geoprocessing tool
- Object Detection
- Compute Accuracy For Object Detection
- Classify Objects Using Deep Learning
- Export Training Data For Deep Learning