Predict Missing Values Using AI Model (GeoAI Tools)

Summary

Replaces missing values (nulls) in the fields of a feature class or table with estimated values. The tool uses a machine learning or deep learning model trained on patterns in the dataset so that the filled values remain statistically consistent with the existing data.

Learn more about how Predict Missing Values Using AI Model works

Usage

  • The Input Features or Table parameter value can be point, line, or polygon features, or a stand-alone table, containing missing values (nulls) to be filled.

  • The Input Features or Table parameter is the source for training/fine-tuning and applying the AI model. Whether it is updated in place or a new output is created depends on the Fill Values in Input Data parameter value.

  • You can choose whether to update the input feature class or table directly with the filled values using the Fill Values in Input Data parameter. When checked (true), no separate output is created, and the input data is modified in place.

  • The Input Model Definition parameter lets you either train an XGBoost model or fine-tune a pretrained auto-regressive LLM (DistilGPT-2) on the provided input dataset and then use the model to predict the missing values. The model file is an Esri model definition file (.emd) or a deep learning package file (.dlpk) that is stored locally or hosted on ArcGIS Living Atlas of the World.

  • The Null Value parameter represents the null (missing) values. This parameter is used differently depending on the input format as follows:

    • For geodatabase feature classes or tables, <Null> is assumed to be the null (missing) value if no value is provided for the Null Value parameter. If a value is provided, both that value and the <Null> values will be estimated in the tool output.

    • For shapefiles and dBASE tables, the Null Value parameter is required. You must provide a value that represents null or missing values in the input data (for example, 0 or -9999).

  • Messages describing details of the analysis and characteristics of the filled fields are written at the bottom of the Geoprocessing pane during tool execution. To access the messages, hover over the progress bar and click the pop-out button, or expand the messages section in the Geoprocessing pane. You can also access the messages for a previous run of this tool through the geoprocessing history.

  • Example use cases include imputing missing demographic or health data in spatial datasets, completing property attributes within real estate databases, and filling gaps in environmental sensor networks or remote sensing products such as climate, soil, or air quality data.
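The Null Value rules in the usage notes above can be sketched in plain Python. This is a minimal illustration, not the tool's implementation: `missing_mask`, `source_ext`, and the `.shp`/`.dbf` extension check are hypothetical names chosen for the example, and the sample values are made up.

```python
from typing import List, Optional

# Hypothetical helper, not part of ArcGIS: it only mirrors the Null Value
# rules described in the usage notes.
FORMATS_REQUIRING_PLACEHOLDER = {".shp", ".dbf"}  # shapefiles and dBASE tables


def missing_mask(values: List[Optional[float]],
                 null_value: Optional[float] = None,
                 source_ext: str = "") -> List[bool]:
    """Mark which values would be treated as missing.

    A true null (None) always counts as missing. If a placeholder is given,
    values equal to it count as missing as well.
    """
    if source_ext.lower() in FORMATS_REQUIRING_PLACEHOLDER and null_value is None:
        raise ValueError("Shapefiles and dBASE tables need an explicit null placeholder")
    return [v is None or (null_value is not None and v == null_value)
            for v in values]


population = [1200.0, None, -9999.0, 845.0]  # made-up sample field values
print(missing_mask(population))                    # only the true null is flagged
print(missing_mask(population, null_value=-9999))  # the null and the placeholder are flagged
```

With no placeholder, only the second value is flagged; with -9999 as the placeholder, the second and third values are both flagged, matching the geodatabase behavior described above.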

Parameters

Label Explanation Data type

Input Features or Table

The input feature class or stand-alone table containing missing values (nulls) to be filled.

Table View

Output Features or Table

The output features or stand-alone table containing the filled values saved at the specified path.

Feature Class; Table

Fields to Fill

The numeric fields in the input containing the missing values (nulls) to be filled.

Field

Input Model Definition

The input model definition accepts an Esri model definition JSON file (.emd) or a deep learning package (.dlpk) that is stored locally or hosted on ArcGIS Living Atlas (.dlpk_remote).

File

Model Arguments

(Optional)

The information from the Input Model Definition parameter value will be used to populate this parameter. Currently, two models are supported: XGBoost and DistilGPT-2. The available arguments vary depending on the model.

Value table columns:

  • Name: The name of the function argument.

  • Value: The value of the function argument.

Value Table

Null Value

(Optional)

The value that represents null (missing) values. If no value is provided, <Null> is assumed as null for geodatabase feature classes and tables. If a value is provided, both that value and all <Null> values will be filled. If the input is a shapefile or dBASE table, a numeric placeholder value representing null is required.

Double

Fill Values in Input Data

(Optional)

Specifies whether the input feature class or table will be updated with the filled values or an output feature class or table will be created with the filled values.

  • Checked: The fields containing the filled values will be updated in the input data. This option modifies the input data.

  • Unchecked: An output feature class or table will be created containing the filled value fields. This is the default.

Boolean
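The difference between the checked and unchecked states of Fill Values in Input Data can be illustrated with a small, self-contained sketch. It assumes rows are plain dictionaries and uses a simple field mean as a stand-in for the trained XGBoost or DistilGPT-2 model; `fill_missing` and `fill_in_input` are hypothetical names, and none of this is the tool's actual code.

```python
import copy
from statistics import mean
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def fill_missing(rows: List[Row], field: str, fill_in_input: bool = False) -> List[Row]:
    """Fill None values in `field` and return the filled rows.

    The field mean is only a placeholder for a real model's prediction.
    fill_in_input=True modifies `rows` directly (the checked state);
    fill_in_input=False fills a deep copy and leaves `rows` untouched.
    """
    known = [r[field] for r in rows if r[field] is not None]
    estimate = mean(known)  # raises StatisticsError if every value is missing
    target = rows if fill_in_input else copy.deepcopy(rows)
    for row in target:
        if row[field] is None:
            row[field] = estimate
    return target


rows = [{"income": 40.0}, {"income": None}, {"income": 60.0}]  # made-up values

out = fill_missing(rows, "income")                      # unchecked: new output
print(rows[1]["income"], out[1]["income"])              # input still has its null

same = fill_missing(rows, "income", fill_in_input=True)  # checked: in place
print(same is rows, rows[1]["income"])
```

Because the checked state overwrites the source data, keeping the default and writing to a separate output is the safer choice when the original nulls may be needed later.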

Derived output

Label Explanation Data type

Updated Features or Table

The updated input features or table containing the filled value fields.

Table View

Environments

Output Coordinate System, Geographic Transformations

Licensing information

  • Basic: No
  • Standard: No
  • Advanced: Yes