Compute resources

Compute resources provide the processing power for interactively authoring data pipelines and running data pipeline jobs from scheduled tasks. Compute resources support capabilities such as running the data pipeline, inferring dataset schemas, generating data previews, caching inputs, and generating error and warning messages.

Status

The Data Pipelines editor connects to a dedicated compute resource to power your processing. The status of the compute resource is displayed in the connection details dialog box at the top of the application. Compute resources have the following status types:

  • Connecting—The compute resource for the data pipeline is being provisioned and the connection is initiating.

  • Connected—The compute resource has started and is currently active. To stop the compute resource, click the Disconnect all button on the connection details dialog box. This will disconnect all editors.

  • Disconnected—The compute resource has stopped and is currently inactive. When the editor is disconnected, the following options are available in the disconnected modal:

    • Leave—Leave the editor without saving changes made to the data pipeline. You will be returned to the Data Pipelines gallery page.

    • Save and leave—Save any changes made to the data pipeline and leave. You will be returned to the Data Pipelines gallery page.

    • Reconnect—Reconnect and continue working in the data pipeline editor.

  • Reconnecting—The compute resource has stopped and the application is attempting to reconnect.
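The four status types above can be read as a small state machine. The following Python sketch is illustrative only: the names `ComputeStatus`, `ALLOWED_TRANSITIONS`, `can_transition`, and `is_active` are hypothetical, and the transition table is an assumption inferred from the status descriptions, not a documented Data Pipelines API.

```python
from enum import Enum


class ComputeStatus(Enum):
    """The status types listed above (hypothetical model, not a product API)."""
    CONNECTING = "Connecting"
    CONNECTED = "Connected"
    DISCONNECTED = "Disconnected"
    RECONNECTING = "Reconnecting"


# Assumed transitions, inferred from the status descriptions above.
ALLOWED_TRANSITIONS = {
    # Provisioning either succeeds or fails.
    ComputeStatus.CONNECTING: {ComputeStatus.CONNECTED, ComputeStatus.DISCONNECTED},
    # An active resource can be stopped, or can drop and trigger a reconnect attempt.
    ComputeStatus.CONNECTED: {ComputeStatus.DISCONNECTED, ComputeStatus.RECONNECTING},
    # From the disconnected state, the Reconnect option starts a reconnect attempt.
    ComputeStatus.DISCONNECTED: {ComputeStatus.RECONNECTING},
    # A reconnect attempt either succeeds or gives up.
    ComputeStatus.RECONNECTING: {ComputeStatus.CONNECTED, ComputeStatus.DISCONNECTED},
}


def can_transition(current: ComputeStatus, target: ComputeStatus) -> bool:
    """Return True if the assumed model allows moving from current to target."""
    return target in ALLOWED_TRANSITIONS[current]


def is_active(status: ComputeStatus) -> bool:
    """Only the Connected status means the compute resource is active."""
    return status is ComputeStatus.CONNECTED
```

In this model, a disconnected editor cannot jump straight to Connected. It passes through Reconnecting first, which mirrors the Reconnect option described above.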

For jobs run using scheduled data pipeline tasks or using the run option from the Data Pipelines gallery page, a compute resource is active while the job is running, and inactive when the job completes.

Considerations

When working with compute resources, consider the following:

  • For interactive editing, each user has at most one compute resource, which powers all browser tabs that have a data pipeline editor open.

  • For jobs run using scheduled data pipeline tasks, each job run uses its own dedicated compute resource.

  • The editor can be connected even if there are no input datasets, tools, or outputs configured.

  • When the editor shows a disconnected status, the following data is lost:

    • Cached sources and datasets

    • Long-running data pipeline jobs and any associated messages (warnings, errors, and results)

  • When you return to the data pipeline editor after closing the tab, you will not have access to any warnings, errors, or results from the previous run.
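The considerations for interactive editing can be pictured with a small model. This is a minimal sketch under stated assumptions: `InteractiveSession` and `SessionRegistry` are hypothetical names invented for illustration, and the code only mimics the documented behavior (one compute resource per user shared across tabs, with cached data and run messages lost on disconnect). It does not represent how the service is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class InteractiveSession:
    """Hypothetical stand-in for one user's interactive compute resource."""
    user: str
    open_tabs: set = field(default_factory=set)
    cached_datasets: dict = field(default_factory=dict)
    run_messages: list = field(default_factory=list)
    connected: bool = True

    def disconnect(self) -> None:
        # On disconnect, cached sources and datasets and any run messages
        # (warnings, errors, and results) are lost.
        self.connected = False
        self.cached_datasets.clear()
        self.run_messages.clear()


class SessionRegistry:
    """Hypothetical registry that enforces at most one session per user."""

    def __init__(self) -> None:
        self._sessions = {}

    def open_editor(self, user: str, tab_id: str) -> InteractiveSession:
        # Every tab opened by the same user reuses that user's session.
        session = self._sessions.get(user)
        if session is None:
            session = InteractiveSession(user)
            self._sessions[user] = session
        session.open_tabs.add(tab_id)
        return session
```

A short usage example with made-up user and tab names:

```python
registry = SessionRegistry()
first = registry.open_editor("user_a", "tab-1")
second = registry.open_editor("user_a", "tab-2")
print(first is second)  # True: both tabs share one session

first.cached_datasets["parcels"] = "preview rows"
first.disconnect()
print(second.cached_datasets)  # {}
```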
