Extract Floor Plan Features From PDF (Indoors Tools)
Summary
Extracts polyline features from a PDF floor plan. Optionally, extracts text features as points from the PDF.
The output from this tool can be used as input to the Import Features To Indoor Dataset tool to populate an Indoors workspace.
Usage
This tool accepts a .pdf file as input and creates polylines based on the PDF linework in the following ways:
For vector PDFs, the tool extracts the vector information directly.
For raster PDFs, the tool extracts polylines based on line pixel width. For lines with a width less than 10 pixels, the centerlines of the pixels are used. For lines with a width greater than 10 pixels, the outlines of the pixels are used.
Use this tool as part of a larger workflow to extract floor plans from PDF files.
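The raster rule above can be read as a simple width test. The following is a minimal Python sketch of that rule only, not the tool's implementation; the function name `raster_trace_mode` and the constant `THRESHOLD_PX` are hypothetical. The text does not state what happens at exactly 10 pixels, so the sketch returns `None` for that case rather than guessing.

```python
from typing import Optional

# Pixel-width threshold taken from the usage notes above.
THRESHOLD_PX = 10


def raster_trace_mode(line_width_px: float) -> Optional[str]:
    """Return which geometry is described as being traced for a raster line.

    Hypothetical helper for illustration; it only restates the documented rule.
    Returns None for a width of exactly 10 pixels, which the text leaves unspecified.
    """
    if line_width_px <= 0:
        raise ValueError("line width must be positive")
    if line_width_px < THRESHOLD_PX:
        return "centerline"  # thin lines: centerlines of the pixels are used
    if line_width_px > THRESHOLD_PX:
        return "outline"     # thick lines: outlines of the pixels are used
    return None              # exactly 10 pixels: not specified in the text
```

For example, a scanned wall drawn 3 pixels wide would fall under the centerline case, while a 25-pixel-wide filled wall would fall under the outline case.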
Note:
For PDFs that contain both vector and raster data, the tool will only read the vector elements.
You can use the optional Output Text Point Features parameter to extract text from the PDF. Extracted text is written to point features that contain the text in the attribute table. Points are placed at the center of detected text. This tool extracts text in the following ways:
PDF text or PDF comment—Text stored as text objects in the PDF that the tool can read directly.
OCR text—For text not stored as text objects in the PDF, the tool uses Optical Character Recognition (OCR) to detect and recognize text for extraction.
Note:
Optical Character Recognition (OCR) is a technology that detects text in images and outputs the detected text as character strings. This tool uses OCR, which leverages machine learning models to improve text detection and recognition. This tool supports using OCR for English and horizontal text only. Report any security or privacy concerns regarding Optical Character Recognition technology to the ArcGIS Trust Center.
If the input .pdf file is georeferenced, georeferencing information will be honored. If the input .pdf file is not georeferenced, the resulting features will be created in WGS 1984 Web Mercator at the coordinates 0,0. When georeferencing, ensure the PDF is added to the map with the default resolution in DPI setting in the PDF Options dialog box.
For PDFs with multiple pages, use the Page Number parameter to specify the page to import.
The Output Line Features parameter value supports creating a new feature class or adding new polyline features to an existing layer. When adding to an existing layer containing polyline features with PDF_NAME and PDF_PAGE values that match the input PDF, those polyline features will be deleted and new polyline features will be added. The tool stores attribute information in the output.
Output features are created with a z-value of 0. Set the z-value of the level when running the Import Features To Indoor Dataset tool.
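The delete-and-replace behavior described above for an existing Output Line Features layer can be modeled with plain Python dictionaries. This is a sketch of the documented outcome only, assuming each feature is a dictionary with `PDF_NAME` and `PDF_PAGE` keys; the function name `replace_matching_features`, the `id` key, and the use of integers for page values are illustrative assumptions, not the tool's internals.

```python
from typing import Any, Dict, List

Feature = Dict[str, Any]


def replace_matching_features(
    existing: List[Feature],
    new_features: List[Feature],
    pdf_name: str,
    pdf_page: Any,
) -> List[Feature]:
    """Model the documented rerun behavior on an existing layer.

    Features whose PDF_NAME and PDF_PAGE both match the input PDF are dropped,
    all other features are kept, and the newly extracted features are appended.
    """
    kept = [
        f for f in existing
        if not (f["PDF_NAME"] == pdf_name and f["PDF_PAGE"] == pdf_page)
    ]
    return kept + list(new_features)


# Hypothetical layer contents: two pages of a.pdf and one page of b.pdf.
layer = [
    {"id": 1, "PDF_NAME": "a.pdf", "PDF_PAGE": 1},
    {"id": 2, "PDF_NAME": "a.pdf", "PDF_PAGE": 2},
    {"id": 3, "PDF_NAME": "b.pdf", "PDF_PAGE": 1},
]
rerun = [{"id": 4, "PDF_NAME": "a.pdf", "PDF_PAGE": 1}]
updated = replace_matching_features(layer, rerun, "a.pdf", 1)
print([f["id"] for f in updated])  # [2, 3, 4]
```

Under this model, rerunning the tool on the same page of the same PDF does not accumulate duplicate linework, while features from other pages or other PDFs are left in place.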
The Output Text Point Features parameter supports creating a new feature class or adding new point features to an existing layer. If an existing layer is provided that contains features with PDF_NAME and PDF_PAGE field values that match the input PDF, those point features will be deleted and new point features will be added. The tool stores attribute information, such as text color and size, for the detected text in the point features output.
Output text point features are centrally located within the detected text's bounding box.
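Placing a point centrally within a text bounding box amounts to taking the midpoint of the box on each axis. The sketch below shows that arithmetic, assuming an axis-aligned box given as minimum and maximum x and y values; the function name `text_anchor_point` is hypothetical, and the exact method the tool uses internally is not stated in this page.

```python
from typing import Tuple


def text_anchor_point(
    xmin: float, ymin: float, xmax: float, ymax: float
) -> Tuple[float, float]:
    """Return the center of an axis-aligned bounding box as (x, y)."""
    if xmin > xmax or ymin > ymax:
        raise ValueError("bounding box minimums must not exceed maximums")
    return ((xmin + xmax) / 2.0, (ymin + ymax) / 2.0)


# A label whose box spans x from 0 to 4 and y from 0 to 2.
print(text_anchor_point(0, 0, 4, 2))  # (2.0, 1.0)
```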
Use the Extent parameter to limit the processing extent and to exclude PDF elements such as legends, text boxes, and leader lines.
You can use the output of this tool as input to the Import Features To Indoor Dataset tool as part of the Extract floor plan features from PDFs workflow.
Parameters
| Label | Explanation | Data type |
|---|---|---|
| Input PDF | The input .pdf file. | File |
| Output Line Features | The output polyline feature layer that extracted polylines will be written to. | Feature Layer |
| Page Number (Optional) | The page number of the input .pdf file to import. | String |
| Extent (Optional) | The extent of the data that will be evaluated. When coordinates are manually provided, the coordinates must be numeric values and in the active map's coordinate system. The map may use different display units than the provided coordinates. Use a negative value sign for south and west coordinates. | Extent |
| Output Text Point Features (Optional) | The output point feature layer that extracted text will be written to. | Feature Layer |
Environments
Licensing information
- Basic: No
- Standard: No
- Advanced: Requires ArcGIS Indoors Pro or ArcGIS Indoors Maps.