Extract Floor Plan Features From PDF (Indoors Tools)

Summary

Extracts polyline features from a PDF floor plan. Optionally, extracts text features as points from the PDF.

The output from this tool can be used as input to the Import Features To Indoor Dataset tool to populate an Indoors workspace.

Usage

  • This tool accepts a .pdf file as input and creates polylines based on the PDF linework in the following ways:

    • For vector PDFs, the tool extracts the vector information directly.

    • For raster PDFs, the tool extracts polylines based on the pixel width of each line. For lines narrower than 10 pixels, the centerlines of the pixels are used. For lines wider than 10 pixels, the outlines of the pixels are used.

    Note:

    For PDFs that contain both vector and raster data, the tool will only read the vector elements.

  • You can use the optional Output Text Point Features parameter to extract text from the PDF. Extracted text is written to point features that contain the text in the attribute table. Points are placed at the center of detected text. This tool extracts text in the following ways:

    • PDF text or PDF comment—Text stored as text objects in the PDF that the tool can read directly.

    • OCR text—For text not stored as text objects in the PDF, the tool uses Optical Character Recognition (OCR) to detect and recognize text for extraction.

    Note:

    Optical Character Recognition (OCR) is a technology that detects text in images and outputs the detected text as character strings. This tool uses OCR, which leverages machine learning models to improve text detection and recognition. This tool supports using OCR for English and horizontal text only. Report any security or privacy concerns regarding Optical Character Recognition technology to the ArcGIS Trust Center.

  • If the input .pdf file is georeferenced, georeferencing information will be honored. If the input .pdf file is not georeferenced, the resulting features will be created in WGS 1984 Web Mercator at the coordinates 0,0. When georeferencing, ensure the PDF is added to the map with the default resolution in DPI setting in the PDF Options dialog box.

  • For PDFs with multiple pages, use the Page Number parameter to specify the page to import.

  • The Output Line Features parameter supports creating a new feature class or adding new polyline features to an existing layer. If an existing layer is provided that contains polyline features with PDF_NAME and PDF_PAGE field values that match the input PDF, those polyline features will be deleted and new polyline features will be added. The tool stores attribute information in the output.

  • Output features are created with a z-value of 0. Set the z-value of the level when running the Import Features To Indoor Dataset tool.

  • The Output Text Point Features parameter supports creating a new feature class or adding new point features to an existing layer. If an existing layer is provided that contains features with PDF_NAME and PDF_PAGE field values that match the input PDF, those point features will be deleted and new point features will be added. The tool stores attribute information, such as text color and size, for the detected text in the point features output.

  • Output text point features are centrally located within the detected text's bounding box.

  • Use the Extent parameter to limit the processing extent and to exclude PDF elements such as legends, text boxes, and leader lines.

  • You can use the output of this tool as input to the Import Features To Indoor Dataset tool as part of the Extract floor plan features from PDFs workflow.
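The following plain-Python sketch illustrates two behaviors described in the list above: placing a text point at the center of the detected text's bounding box, and replacing features in an existing layer when their PDF_NAME and PDF_PAGE values match the input PDF. It does not call the tool. The `TextPoint` class, the helper names, the field types, and the sample values are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextPoint:
    """Hypothetical stand-in for one output text point feature."""
    pdf_name: str  # mirrors the PDF_NAME field (type assumed)
    pdf_page: int  # mirrors the PDF_PAGE field (type assumed)
    text: str
    x: float
    y: float


def bbox_center(xmin, ymin, xmax, ymax):
    """Return the center of a text bounding box as (x, y)."""
    return ((xmin + xmax) / 2.0, (ymin + ymax) / 2.0)


def replace_features(existing, new, pdf_name, pdf_page):
    """Drop features matching the input PDF name and page, then append the new ones."""
    kept = [
        f for f in existing
        if not (f.pdf_name == pdf_name and f.pdf_page == pdf_page)
    ]
    return kept + list(new)


# An existing layer holding features from two PDFs and two pages.
existing = [
    TextPoint("plan.pdf", 1, "LOBBY", 5.0, 5.0),
    TextPoint("plan.pdf", 2, "ROOM 201", 3.0, 4.0),
    TextPoint("other.pdf", 1, "CAFE", 1.0, 1.0),
]

# Rerunning the extraction for page 1 of plan.pdf yields one new point,
# placed at the center of its bounding box.
cx, cy = bbox_center(10.0, 20.0, 14.0, 22.0)
rerun = [TextPoint("plan.pdf", 1, "LOBBY", cx, cy)]

# Only the features from plan.pdf page 1 are replaced; the rest are kept.
result = replace_features(existing, rerun, "plan.pdf", 1)
```

Because matching uses both the PDF name and the page number, rerunning the tool for one page leaves features extracted from other pages or other PDFs untouched.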

Parameters

Label Explanation Data type

Input PDF

The input .pdf file from which features will be extracted.

File

Output Line Features

The output polyline feature layer that extracted polylines will be written to.

Feature Layer

Page Number

(Optional)

The page number of the input .pdf file from which features will be extracted. The default is 1.

Long

Extent

(Optional)

The extent of the data that will be evaluated.

  • Current Display Extent—The extent will be based on the active map or scene.

  • Draw Extent—The extent will be based on a rectangle drawn on the map or scene.

  • Extent of a Layer—The extent will be based on an active map layer. Choose an available layer or use the Extent of data in all layers option. Each map layer has the following options:

    • All Features—The extent of all features.

    • Selected Features—The extent of the selected features.

    • Visible Features—The extent of visible features.

  • Browse—The extent will be based on a dataset.

  • Intersection of Inputs—The extent will be the intersecting extent of all inputs.

  • Union of Inputs—The extent will be the combined extent of all inputs.

  • Clipboard—The extent can be copied to and from the clipboard.

    • Copy Extent—Copies the extent and coordinate system to the clipboard.

    • Paste Extent—Pastes the extent and coordinate system from the clipboard. If the clipboard does not include a coordinate system, the extent will use the map's coordinate system.

  • Reset Extent—The extent will be reset to the default value.

When coordinates are manually provided, the coordinates must be numeric values and in the active map's coordinate system. The map may use different display units than the provided coordinates. Use a negative value sign for south and west coordinates.

Extent

Output Text Point Features

(Optional)

The output point feature layer that extracted text will be written to.

Feature Layer
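
A minimal Python sketch of running the tool with the parameters above, passed positionally in the order they are listed. The function name `arcpy.indoors.ExtractFloorPlanFeaturesFromPDF` is an assumption based on the tool and toolbox names and should be confirmed against the help for your installed ArcGIS Pro version. The helper names, file paths, and the page-number sanity check are hypothetical.

```python
def build_tool_args(in_pdf, out_lines, page_number=1, extent=None,
                    out_text_points=None):
    """Assemble positional arguments in the parameter order listed above."""
    if not in_pdf.lower().endswith(".pdf"):
        raise ValueError("Input must be a .pdf file")
    # Assumed sanity check: pages are numbered from 1 (the documented default).
    if int(page_number) < 1:
        raise ValueError("Page number must be 1 or greater")
    # None leaves an optional parameter at its default.
    return [in_pdf, out_lines, page_number, extent, out_text_points]


def run_extraction(args):
    """Run the tool; requires ArcGIS Pro with an Indoors license."""
    # Imported lazily so build_tool_args works without ArcGIS Pro installed.
    import arcpy
    # Assumed function name; verify before use.
    return arcpy.indoors.ExtractFloorPlanFeaturesFromPDF(*args)


# Hypothetical paths: extract page 2 and also write text points.
args = build_tool_args(
    r"C:\Data\FloorPlans\building_a.pdf",
    r"C:\Data\Indoors.gdb\floorplan_lines",
    page_number=2,
    out_text_points=r"C:\Data\Indoors.gdb\floorplan_text",
)
# run_extraction(args)  # run inside an ArcGIS Pro Python environment
```

Leaving the extent as None processes the whole page. To exclude legends or title blocks, supply an extent in the active map's coordinate system instead.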

Environments

Extent

Licensing information

  • Basic: No
  • Standard: No
  • Advanced: Requires ArcGIS Indoors Pro or ArcGIS Indoors Maps.