Translate Pixels Using Deep Learning (Image Analyst Tools)
Summary
Runs a trained deep learning model to translate pixels in an image.
Usage
This tool translates pixels in an image using a trained deep learning model. Examples of translations include increasing the pixel resolution and converting a grayscale image into a color image.
The Super-resolution model type can be used to increase image resolution.
The CycleGAN model type can be used to perform image translation, such as grayscale to red, green, blue (RGB) color.
The Pix2Pix model type can be used to perform image translation, such as SAR to RGB.
See the sample below for a model definition JSON file (.emd).
Parameters
Label
Explanation
Data type
Input Raster
The input raster image or images to be translated.
Raster Dataset; Raster Layer; Mosaic Layer; Image Service; Map Server; Map Server Layer; Internet Tiled Layer; Folder; Feature Layer; Feature Class
Model Definition
The path to the model definition JSON file (*.emd).
File; String
Arguments
(Optional)
The information from the Model Definition parameter will be used to populate this parameter. These arguments vary, depending on the model architecture. The following are supported model arguments for models trained in ArcGIS. ArcGIS pretrained models and custom deep learning models may have additional arguments that the tool supports.
batch_size—The number of image tiles processed in each step of the model inference. This depends on the memory of your graphics card. The argument is available for all model architectures.
direction—The image is translated from one domain to another. Options are AtoB and BtoA. The argument is only available for the CycleGAN architecture. For more information about this argument, see How CycleGAN works.
n_timestep—The number of time steps that will be used. The default is 200. The value can be increased or decreased depending on the quality of the generated output. The argument is only supported for the Super Resolution with SR3 backbone model architecture.
padding—The number of pixels at the border of image tiles from which predictions are blended for adjacent tiles. To smooth the output while reducing artifacts, increase the value. The maximum value of the padding can be half the tile size value. The argument is available for all model architectures.
sampling_type—The type of sampling that will be used. Two types of sampling are available: ddim (Denoising Diffusion Implicit Models) and ddpm (Denoising Diffusion Probabilistic Models). The default is ddim, which generates results in fewer time steps compared to ddpm. The argument is only supported for the Super Resolution with SR3 backbone model architecture.
schedule—The type of learning rate schedule to use. The schedule adjusts the model's learning rate during training. The default is the schedule the model was trained with. The argument is only supported for the Super Resolution with SR3 backbone model architecture.
tile_size—The width and height of image tiles into which the imagery will be subsectioned for prediction. The argument is only available for the CycleGAN architecture.
Value table columns:
Name—The name of the function argument.
Value—The value of the function argument.
Value Table
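In Python, a value table parameter such as this can be supplied as a list of [Name, Value] pairs. A minimal sketch (the argument values shown are illustrative examples, not verified defaults):

```python
# Model arguments expressed as [Name, Value] pairs, matching the
# value table columns above. Values are illustrative examples only.
arguments = [
    ["batch_size", "4"],    # tiles processed per inference step
    ["direction", "AtoB"],  # CycleGAN only: translation direction
    ["padding", "64"],      # border pixels blended between tiles
    ["tile_size", "256"],   # CycleGAN only: tile width and height
]

# The padding value may be at most half the tile size.
d = dict(arguments)
assert int(d["padding"]) <= int(d["tile_size"]) // 2
```

Many geoprocessing tools also accept the equivalent semicolon-delimited string form, for example "direction AtoB;padding 64".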
Processing Mode
Specifies how all raster items in a mosaic dataset or an image service will be processed. This parameter is applied when the input raster is a mosaic dataset or an image service.
Process as mosaicked image—All raster items in the mosaic dataset or image service will be mosaicked together and processed. This is the default.
Process all raster items separately—All raster items in the mosaic dataset or image service will be processed as separate images.
String
Output Folder
(Optional)
The folder where the output rasters will be stored.
This parameter is required when the input raster is a folder of images or a mosaic dataset in which all items are to be processed separately. The default is a folder in the project folder.
Folder
Output Features
(Optional)
The feature class where the output rasters will be stored.
This parameter is required when the input raster is a feature class of images.
If the feature class already exists, the results will be appended to the existing feature class.
Feature Class
Overwrite attachments
(Optional)
Specifies whether existing image attachments will be overwritten.
This parameter is only valid when the Input Raster parameter value is a feature class with image attachments.
Checked—The existing feature class will be overwritten with the updated attachments.
Unchecked—Existing image attachments will not be overwritten and new image attachments will be stored in a new feature class. When this option is specified, the Output Features parameter must have a value. This is the default.
Boolean
Use pixel space
(Optional)
Specifies whether inferencing will be performed on images in pixel space. Pixel space is the x,y coordinate space defined by the number of pixels in the display area, such as 1024 x 768.
Checked—Inferencing will be performed in pixel space, and the output will be transformed back to map space. This option is useful when using oblique imagery or street-view imagery, which may cause the features to become distorted using map space.
Unchecked—Inferencing will be performed in map space. This is the default.
Boolean
Return value
Label
Explanation
Data type
Output Raster Dataset
The name of the raster or mosaic dataset containing the result.
Raster
Syntax
TranslatePixelsUsingDeepLearning(in_raster, in_model_definition, {arguments}, {processing_mode}, {out_translated_folder}, {out_featureclass}, {overwrite_attachments}, {use_pixelspace})
Name
Explanation
Data type
in_raster
The input raster image or images to be translated.
Raster Dataset; Raster Layer; Mosaic Layer; Image Service; Map Server; Map Server Layer; Internet Tiled Layer; Folder; Feature Layer; Feature Class
in_model_definition
The path to the model definition JSON file (*.emd).
File; String
arguments[arguments,...]
(Optional)
The information from the in_model_definition parameter will be used to set the default values for this parameter. These arguments vary, depending on the model architecture. The following are supported model arguments for models trained in ArcGIS. ArcGIS pretrained models and custom deep learning models may have additional arguments that the tool supports.
batch_size—The number of image tiles processed in each step of the model inference. This depends on the memory of your graphics card. The argument is available for all model architectures.
direction—The image is translated from one domain to another. Options are AtoB and BtoA. The argument is only available for the CycleGAN architecture. For more information about this argument, see How CycleGAN works.
n_timestep—The number of time steps that will be used. The default is 200. The value can be increased or decreased depending on the quality of the generated output. The argument is only supported for the Super Resolution with SR3 backbone model architecture.
padding—The number of pixels at the border of image tiles from which predictions are blended for adjacent tiles. To smooth the output while reducing artifacts, increase the value. The maximum value of the padding can be half the tile size value. The argument is available for all model architectures.
sampling_type—The type of sampling that will be used. Two types of sampling are available: ddim (Denoising Diffusion Implicit Models) and ddpm (Denoising Diffusion Probabilistic Models). The default is ddim, which generates results in fewer time steps compared to ddpm. The argument is only supported for the Super Resolution with SR3 backbone model architecture.
schedule—The type of learning rate schedule to use. The schedule adjusts the model's learning rate during training. The default is the schedule the model was trained with. The argument is only supported for the Super Resolution with SR3 backbone model architecture.
tile_size—The width and height of image tiles into which the imagery will be subsectioned for prediction. The argument is only available for the CycleGAN architecture.
Value table columns:
Name—The name of the function argument.
Value—The value of the function argument.
Value Table
processing_mode
Specifies how all raster items in a mosaic dataset or an image service will be processed. This parameter is applied when the input raster is a mosaic dataset or an image service.
PROCESS_AS_MOSAICKED_IMAGE—All raster items in the mosaic dataset or image service will be mosaicked together and processed. This is the default.
PROCESS_ITEMS_SEPARATELY—All raster items in the mosaic dataset or image service will be processed as separate images.
String
out_translated_folder
(Optional)
The folder where the output rasters will be stored.
This parameter is required when the input raster is a folder of images or a mosaic dataset in which all items are to be processed separately. The default is a folder in the project folder.
Folder
out_featureclass
(Optional)
The feature class where the output rasters will be stored.
This parameter is required when the input raster is a feature class of images.
If the feature class already exists, the results will be appended to the existing feature class.
Feature Class
overwrite_attachments
(Optional)
Specifies whether existing image attachments will be overwritten.
This parameter is only valid when the in_raster parameter value is a feature class with image attachments.
OVERWRITE—The existing feature class will be overwritten with the updated attachments.
NO_OVERWRITE—Existing image attachments will not be overwritten and new image attachments will be stored in a new feature class. When this option is specified, the out_featureclass parameter must have a value. This is the default.
Boolean
use_pixelspace
(Optional)
Specifies whether inferencing will be performed on images in pixel space. Pixel space is the x,y coordinate space defined by the number of pixels in the display area, such as 1024 x 768.
PIXELSPACE—Inferencing will be performed in pixel space, and the output will be transformed back to map space. This option is useful when using oblique imagery or street-view imagery, which may cause the features to become distorted using map space.
NO_PIXELSPACE—Inferencing will be performed in map space. This is the default.
Boolean
Return value
Name
Explanation
Data type
out_raster
The name of the raster or mosaic dataset containing the result.
Raster
Code sample
TranslatePixelsUsingDeepLearning example 1 (Python window)
This example outputs a translated image using a deep learning model.
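A sketch of the call, assuming ArcGIS Pro with the Image Analyst extension is available. The paths, model, and argument values below are illustrative placeholders, and the function is assumed to return a raster object that can be saved, as other arcpy.ia deep learning functions do:

```python
# Requires ArcGIS Pro with the Image Analyst extension.
# All paths below are illustrative placeholders.
import arcpy
from arcpy.ia import TranslatePixelsUsingDeepLearning

arcpy.CheckOutExtension("ImageAnalyst")
arcpy.env.processorType = "GPU"  # optional: run inference on the GPU

in_raster = "C:/data/grayscale.tif"
in_model_definition = "C:/models/cyclegan/cyclegan.emd"

# Model arguments as [Name, Value] pairs (see the arguments parameter).
arguments = [["direction", "AtoB"], ["padding", "64"],
             ["batch_size", "4"], ["tile_size", "256"]]

out_raster = TranslatePixelsUsingDeepLearning(
    in_raster, in_model_definition, arguments,
    "PROCESS_AS_MOSAICKED_IMAGE")
out_raster.save("C:/output/translated_rgb.tif")

arcpy.CheckInExtension("ImageAnalyst")
```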