Train Point Cloud Object Detection Model (3D Analyst Tools)

Summary

Trains an object detection model for point clouds using deep learning.

Usage

This tool requires the installation of Deep Learning Essentials, which provides multiple neural network solutions that include neural architectures for classifying point clouds.

To set up your machine to use deep learning frameworks in ArcGIS Pro, see Install deep learning frameworks for ArcGIS.
If you will be training models in a disconnected environment, see Additional Installation for Disconnected Environment for more information.
Using a pretrained model in the training process is helpful, especially when you have limitations in data, time, or computational resources. Pretrained models reduce the need for extensive training and provide a reliable starting point for quickly creating a useful model. To use a pretrained model, the new training data must be compatible with the pretrained model. This means that the new training data must have the same attributes and object codes as the training data that was used to create the pretrained model. If object codes in the training data do not match the codes in the pretrained model, the training data's object codes must be remapped accordingly.
The point cloud object detection model can only be trained using a CUDA-capable NVIDIA graphics card. When a specific graphics card is not assigned, the card with the best hardware will be used for training. Otherwise, the graphics card assigned in the GPU ID environment setting will be used.
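The Processor Type and GPU ID environments can also be set from Python before the tool runs. The following is a minimal sketch, assuming that `arcpy.env.processorType` and `arcpy.env.gpuId` are the Python properties for those two environments. The `configure_gpu` helper and the `SimpleNamespace` stand-in are hypothetical and exist only so the sketch runs on a machine without ArcGIS Pro.

```python
from types import SimpleNamespace


def configure_gpu(env, gpu_id=None):
    """Request GPU processing on an environment object.

    env    -- arcpy.env, or any object that accepts attribute assignment
    gpu_id -- index of the graphics card to use; None leaves the choice
              of card to the tool
    """
    env.processorType = "GPU"
    if gpu_id is not None:
        env.gpuId = gpu_id
    return env


try:
    import arcpy  # only available in an ArcGIS Pro Python environment
    env = arcpy.env
except ImportError:
    env = SimpleNamespace()  # stand-in so the sketch runs without ArcGIS Pro

# Assumption: card 0 is the one to train on.
configure_gpu(env, gpu_id=0)
```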
The following metrics will be reported during the training process:
- Epoch—The epoch number with which the result is associated
- Training Loss—The result of the entropy loss function that was averaged for the training data
- Validation Loss—The result of the entropy loss function that was determined when applying the model trained in the epoch on the validation data
- Average Precision—The ratio of points in the validation data that were correctly classified by the model trained in the epoch (true positives) over all the points in the validation data
A model that achieves a low training loss but a high validation loss is considered to be overfitting the training data: it detects patterns from artifacts in the training data, so it does not work well for the validation data. A model that achieves a high training loss and a high validation loss is considered to be underfitting the training data: no patterns are learned effectively enough to produce a usable model.
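The overfitting and underfitting rules of thumb above can be expressed as a small check on the reported losses. This is a sketch only: the `high_loss` threshold and the sample loss values are made-up illustrations, not values produced or recommended by the tool, and what counts as a high loss depends on the data.

```python
def diagnose_fit(training_loss, validation_loss, high_loss=1.0):
    """Label one epoch using the rules of thumb for loss values.

    high_loss is an arbitrary illustrative cutoff for a "high" loss.
    """
    train_high = training_loss >= high_loss
    val_high = validation_loss >= high_loss
    if train_high and val_high:
        return "underfitting"
    if not train_high and val_high:
        return "overfitting"
    return "no obvious problem"


# Hypothetical (epoch, training loss, validation loss) rows.
epochs = [(0, 1.8, 1.9), (5, 0.4, 0.5), (9, 0.1, 1.4)]
for epoch, train, val in epochs:
    print(epoch, diagnose_fit(train, val))
```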

Learn more about assessing point cloud training results
A folder is created to store the checkpoint models, which are the models created at the end of each epoch. The name of the checkpoints folder begins with the name of the model and ends with the suffix .checkpoints. The folder is stored in the location specified by the Output Model Location parameter.
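The checkpoint folder can be located from a script using the naming rule described above. The sketch below assumes only that the folder name starts with the model name and ends with `.checkpoints`; the helper name `find_checkpoint_folders` is hypothetical.

```python
from pathlib import Path


def find_checkpoint_folders(out_model_location, out_model_name):
    """Return folders in out_model_location whose names start with the
    model name and end with the .checkpoints suffix."""
    pattern = f"{out_model_name}*.checkpoints"
    root = Path(out_model_location)
    return sorted(p for p in root.glob(pattern) if p.is_dir())
```

For the model in the code sample, `find_checkpoint_folders("D:/DL_Models", "Cars")` would list any matching folder once training has written it.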

Parameters

Label	Explanation	Data type
Input Training Data	The point cloud object detection training data (`*.pcotd` file) that will be used to train the model.	File
Output Model Location	An existing folder that will store the new directory containing the deep learning model.	Folder
Output Model Name	The name of the output Esri model definition file (`.emd`), deep learning package (`.dlpk`), and the directory that will be created to store them.	String
Pre-trained Model Definition File (Optional)	The pretrained object detection model that will be refined. When a pretrained model is provided, the input training data must have the same attributes and maximum number of points that were used by the training data that generated the model.	File
Architecture (Optional)	Specifies the architecture that will be used to train the model. Sparsely Embedded Convolutional Detection—The Sparsely Embedded Convolutional Detection (SECOND) architecture will be used. This is the default. Point Transformer V3—The Point Transformer V3 architecture will be used.	String
Attribute Selection (Optional)	Specifies the point attributes that will be used with the classification code when training the model. Only the attributes that are present in the point cloud training data will be available. No additional attributes are included by default. Intensity—The measure of the magnitude of the lidar pulse return will be used. Return Number—The ordinal position of the point obtained from a given lidar pulse will be used. Number of Returns—The total number of lidar returns that were identified as points from the pulse associated with a given point will be used. Red Band—The red band's value from a point cloud with color information will be used. Green Band—The green band's value from a point cloud with color information will be used. Blue Band—The blue band's value from a point cloud with color information will be used. Near Infrared Band—The near infrared band's value from a point cloud with near infrared information will be used. Relative Height—The relative height of each point in relation to a reference surface, which would typically be a bare earth DEM, will be used.	String
Minimum Points Per Block (Optional)	The minimum number of points that must be present in a given block for it to be used when training the model. The default is 0.	Long
Remap Object Codes (Optional)	Defines how object codes will be remapped to new values before training the deep learning model. Value table columns: Current Code—The object code value in the training data. Remapped Code—The object code value that the existing code will be changed to.	Value Table
Object Codes of Interest (Optional)	The object codes that will be used to filter the objects in the training data. When object codes are provided, the objects that are not included will be ignored.	Long
Only train blocks that contain objects (Optional)	Specifies whether the model will be trained using only blocks that contain objects or using all blocks, including those that do not contain objects. Checked—The model will be trained using only blocks that contain objects. The data used for validation will not be modified. Unchecked—The model will be trained using all blocks, including those that do not contain objects. This is the default.	Boolean
Object Descriptions (Optional)	The descriptions for each object code in the training data. Value table columns: Object Code—The object code value that was learned by the model. Description—The object described by the class code.	Value Table
Model Selection Criteria (Optional)	Specifies the statistical basis that will be used to determine the final model. Validation Loss—The model that achieves the lowest result when the entropy loss function is applied to the validation data will be used. Average Precision—The model that achieves the highest ratio of points in the validation data that were correctly classified by the model trained in the epoch (true positives) over all the points in the validation data will be used. This is the default.	String
Maximum Number of Epochs (Optional)	The number of times each block of data will be passed forward and backward through the neural network. The default is 25.	Long
Learning Rate Strategy (Optional)	Specifies how the learning rate will be modified during training. One Cycle Learning Rate—The learning rate will be cycled throughout each epoch using Fast.AI's implementation of the 1cycle technique for training neural networks to help improve the training of a convolutional neural network. This is the default. Fixed Learning Rate—The same learning rate will be used throughout the training process.	String
Learning Rate (Optional)	The rate at which existing information will be overwritten with new information. If no value is provided, the optimal learning rate will be extracted from the learning curve during the training process. This is the default.	Double
Batch Size (Optional)	The number of training data blocks that will be processed at any given time. The default is 2.	Long
Stop training when model no longer improves (Optional)	Specifies whether the model training will stop when the metric specified in the Model Selection Criteria parameter value does not register any improvement after five consecutive epochs. Checked—The model training will stop when the model is no longer improving. Unchecked—The model training will continue until the maximum number of epochs has been reached. This is the default.	Boolean
Architecture Settings (Optional)	The architecture settings that can be modified to improve training results. Value table columns: Option—The architecture-specific options that can be modified. Voxel Width—The x- and y-dimensions of the voxel used during training. The corresponding value is in linear units of meters and can be expressed as a double value. Voxel Height—The z-dimension of the voxel used during training. The corresponding value is in linear units of meters and can be expressed as a double value. Voxel Point Limit—The number of points in a given voxel. The corresponding value should be a positive integer. When no value is provided, this limit is calculated during the training process based on the block size and block point limit of the training data. Maximum Training Voxels—The maximum number of voxels that can be used in the training data. The corresponding value should be a positive integer. When no value is provided, this limit is calculated during training. Maximum Validation Voxels—The maximum number of voxels that can be used in the validation data. The corresponding value should be a positive integer. When no value is provided, this limit is calculated during training. Value—The value that corresponds with the option being modified.	Value Table
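
The interaction between Model Selection Criteria, Maximum Number of Epochs, and Stop training when model no longer improves can be illustrated with a generic patience loop. This is not the tool's internal code: it is a sketch that assumes "no improvement after five consecutive epochs" means five epochs in a row without beating the best value so far. The function name and the metric values are hypothetical.

```python
def select_epoch(scores, higher_is_better=True, early_stop=False, patience=5):
    """Return (best_epoch, last_epoch_run) for per-epoch metric values.

    higher_is_better -- True for Average Precision, False for Validation Loss
    early_stop       -- stop once `patience` consecutive epochs fail to improve
    """
    best_epoch, best_score, stale, last = None, None, 0, -1
    for epoch, score in enumerate(scores):
        last = epoch
        if best_score is None:
            improved = True
        elif higher_is_better:
            improved = score > best_score
        else:
            improved = score < best_score
        if improved:
            best_epoch, best_score, stale = epoch, score, 0
        else:
            stale += 1
            if early_stop and stale >= patience:
                break
    return best_epoch, last


# Hypothetical average precision values for ten epochs.
precision = [0.2, 0.4, 0.5, 0.49, 0.48, 0.5, 0.47, 0.46, 0.6, 0.61]
print(select_epoch(precision, early_stop=True))
print(select_epoch(precision, early_stop=False))
```

In this made-up series, stopping early ends training before the late improvement is reached, which is the trade-off to weigh when checking the option.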

Derived output

Label	Explanation	Data type
Output Model	The output object detection model that is produced.	File
Output Epoch Statistics	The output ASCII table that contains the epoch statistics that were obtained during the training process.	Text File

arcpy.ddd.TrainPointCloudObjectDetectionModel(in_training_data, out_model_location, out_model_name, {pretrained_model}, {architecture}, {attributes}, {min_points}, {remap_objects}, {target_objects}, {train_blocks}, {object_descriptions}, {model_selection_criteria}, {max_epochs}, {learning_rate_strategy}, {learning_rate}, {batch_size}, {early_stop}, {architecture_settings})

Name	Explanation	Data type
in_training_data	The point cloud object detection training data (`*.pcotd` file) that will be used to train the model.	File
out_model_location	An existing folder that will store the new directory containing the deep learning model.	Folder
out_model_name	The name of the output Esri model definition file (`.emd`), deep learning package (`.dlpk`), and the directory that will be created to store them.	String
pretrained_model (Optional)	The pretrained object detection model that will be refined. When a pretrained model is provided, the input training data must have the same attributes and maximum number of points that were used by the training data that generated the model.	File
architecture (Optional)	Specifies the architecture that will be used to train the model. `SECD`—The Sparsely Embedded Convolutional Detection (SECOND) architecture will be used. This is the default. `POINT_TRANSFORMER_V3`—The Point Transformer V3 architecture will be used.	String
attributes [attributes,...] (Optional)	Specifies the point attributes that will be used with the classification code when training the model. Only the attributes that are present in the point cloud training data will be available. No additional attributes are included by default. `INTENSITY`—The measure of the magnitude of the lidar pulse return will be used. `RETURN_NUMBER`—The ordinal position of the point obtained from a given lidar pulse will be used. `NUMBER_OF_RETURNS`—The total number of lidar returns that were identified as points from the pulse associated with a given point will be used. `RED`—The red band's value from a point cloud with color information will be used. `GREEN`—The green band's value from a point cloud with color information will be used. `BLUE`—The blue band's value from a point cloud with color information will be used. `NEAR_INFRARED`—The near infrared band's value from a point cloud with near infrared information will be used. `RELATIVE_HEIGHT`—The relative height of each point in relation to a reference surface, which would typically be a bare earth DEM, will be used.	String
min_points (Optional)	The minimum number of points that must be present in a given block for it to be used when training the model. The default is 0.	Long
remap_objects [remap_objects,...] (Optional)	Defines how object codes will be remapped to new values before training the deep learning model. Value table columns: `Current Code`—The object code value in the training data. `Remapped Code`—The object code value that the existing code will be changed to.	Value Table
target_objects [target_objects,...] (Optional)	The object codes that will be used to filter the objects in the training data. When object codes are provided, the objects that are not included will be ignored.	Long
train_blocks (Optional)	Specifies whether the model will be trained using only blocks that contain objects or using all blocks, including those that do not contain objects. `OBJECT_BLOCKS`—The model will be trained using only blocks that contain objects. The data used for validation will not be modified. `ALL_BLOCKS`—The model will be trained using all blocks, including those that do not contain objects. This is the default.	Boolean
object_descriptions [object_descriptions,...] (Optional)	The descriptions for each object code in the training data. Value table columns: `Object Code`—The object code value that was learned by the model. `Description`—The object described by the class code.	Value Table
model_selection_criteria (Optional)	Specifies the statistical basis that will be used to determine the final model. `VALIDATION_LOSS`—The model that achieves the lowest result when the entropy loss function is applied to the validation data will be used. `AVERAGE_PRECISION`—The model that achieves the highest ratio of points in the validation data that were correctly classified by the model trained in the epoch (true positives) over all the points in the validation data will be used. This is the default.	String
max_epochs (Optional)	The number of times each block of data will be passed forward and backward through the neural network. The default is 25.	Long
learning_rate_strategy (Optional)	Specifies how the learning rate will be modified during training. `ONE_CYCLE`—The learning rate will be cycled throughout each epoch using Fast.AI's implementation of the 1cycle technique for training neural networks to help improve the training of a convolutional neural network. This is the default. `FIXED`—The same learning rate will be used throughout the training process.	String
learning_rate (Optional)	The rate at which existing information will be overwritten with new information. If no value is provided, the optimal learning rate will be extracted from the learning curve during the training process. This is the default.	Double
batch_size (Optional)	The number of training data blocks that will be processed at any given time. The default is 2.	Long
early_stop (Optional)	Specifies whether the model training will stop when the metric specified in the `model_selection_criteria` parameter value does not register any improvement after five consecutive epochs. `EARLY_STOP`—The model training will stop when the model is no longer improving. `NO_EARLY_STOP`—The model training will continue until the maximum number of epochs has been reached. This is the default.	Boolean
architecture_settings [architecture_settings,...] (Optional)	The architecture settings that can be modified to improve training results. Value table columns: `Option`—The architecture-specific options that can be modified. `VOXEL_WIDTH`—The x- and y-dimensions of the voxel used during training. The corresponding value is in linear units of meters and can be expressed as a double value. `VOXEL_HEIGHT`—The z-dimension of the voxel used during training. The corresponding value is in linear units of meters and can be expressed as a double value. `VOXEL_POINT_LIMIT`—The number of points in a given voxel. The corresponding value should be a positive integer. When no value is provided, this limit is calculated during the training process based on the block size and block point limit of the training data. `MAX_TRAINING_VOXELS`—The maximum number of voxels that can be used in the training data. The corresponding value should be a positive integer. When no value is provided, this limit is calculated during training. `MAX_VALIDATION_VOXELS`—The maximum number of voxels that can be used in the validation data. The corresponding value should be a positive integer. When no value is provided, this limit is calculated during training. `Value`—The value that corresponds with the option being modified.	Value Table
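
Value table parameters such as `architecture_settings` can be passed as a list of rows, following the nested-list pattern the code sample uses for `object_descriptions`. The sketch below checks rows against the option names and value types listed above before they are handed to the tool. The helper name and the example values are hypothetical, and requiring voxel dimensions to be greater than zero is an assumption.

```python
FLOAT_OPTIONS = {"VOXEL_WIDTH", "VOXEL_HEIGHT"}
INT_OPTIONS = {"VOXEL_POINT_LIMIT", "MAX_TRAINING_VOXELS", "MAX_VALIDATION_VOXELS"}


def validate_architecture_settings(rows):
    """Check [option, value] rows and return them unchanged if valid."""
    for option, value in rows:
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if option in FLOAT_OPTIONS:
            # Voxel dimensions are in meters; positivity is assumed here.
            ok = is_number and value > 0
        elif option in INT_OPTIONS:
            # Limits should be positive integers.
            ok = is_number and isinstance(value, int) and value > 0
        else:
            raise ValueError(f"Unknown option: {option}")
        if not ok:
            raise ValueError(f"Invalid value for {option}: {value!r}")
    return rows


# Hypothetical settings; suitable values depend on the training data.
settings = validate_architecture_settings(
    [["VOXEL_WIDTH", 0.1], ["VOXEL_HEIGHT", 0.2], ["VOXEL_POINT_LIMIT", 32]]
)
```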

Derived output

Name	Explanation	Data type
out_model	The output object detection model that is produced.	File
out_epoch_stats	The output ASCII table that contains the epoch statistics that were obtained during the training process.	Text File

Code sample

TrainPointCloudObjectDetectionModel example (Python window)

The following sample demonstrates the use of this tool in the Python window:

import arcpy

arcpy.env.workspace = "D:/Deep_Learning_Workspace"
arcpy.ddd.TrainPointCloudObjectDetectionModel("Cars.pcotd", "D:/DL_Models", "Cars",
    attributes=["INTENSITY", "RETURN_NUMBER", "NUMBER_OF_RETURNS", "RELATIVE_HEIGHT"],
    object_descriptions=[[31, "Cars"]], train_blocks="OBJECT_BLOCKS",
    model_selection_criteria="AVERAGE_PRECISION", max_epochs=10)
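
A stand-alone script might collect the keyword arguments first and call the tool only when `arcpy` is available. The sketch below uses only the parameter names and keywords documented above. The paths, object codes, learning rate, and batch size are illustrative assumptions, and the helper functions are hypothetical.

```python
def build_training_kwargs():
    """Assemble keyword arguments using the documented parameter names."""
    return {
        "in_training_data": "D:/Deep_Learning_Workspace/Cars.pcotd",
        "out_model_location": "D:/DL_Models",
        "out_model_name": "Cars_EarlyStop",
        "attributes": ["INTENSITY", "RELATIVE_HEIGHT"],
        "remap_objects": [[30, 31]],   # hypothetical: treat code 30 as 31
        "target_objects": [31],        # only train on object code 31
        "object_descriptions": [[31, "Cars"]],
        "model_selection_criteria": "VALIDATION_LOSS",
        "max_epochs": 25,
        "learning_rate_strategy": "FIXED",
        "learning_rate": 0.001,        # hypothetical value
        "batch_size": 4,               # hypothetical value
        "early_stop": "EARLY_STOP",
    }


def train(kwargs):
    """Run the tool if arcpy can be imported; return whether it ran."""
    try:
        import arcpy  # only available in an ArcGIS Pro Python environment
    except ImportError:
        print("arcpy is not available; skipping training.")
        return False
    arcpy.ddd.TrainPointCloudObjectDetectionModel(**kwargs)
    return True


if __name__ == "__main__":
    train(build_training_kwargs())
```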

Environments

GPU ID, Processor Type

Licensing information

Basic: Requires 3D Analyst
Standard: Requires 3D Analyst
Advanced: Requires 3D Analyst
