Video Multiplexer (Image Analyst Tools)
Summary
Creates a single geospatial video file that combines an archived video stream file and a separate, associated metadata file synchronized by a time stamp. The process of combining the video file and the metadata file is called multiplexing.
Usage
This tool is designed for archived video files and does not work with live video streams.
The format of the video file that will be multiplexed must be one of the supported video formats. The separate metadata file is a comma-separated values (CSV), JSON, or GPS Exchange Format (GPX) file containing the proper field headings and associated values.
The video formats supported as an input to the tool are listed in the following table:
| Description | Extension |
|---|---|
| AOMedia Video 1 File | .av1 |
| Audio Video Interleaved | .avi |
| H264 Video File¹ | .h264 |
| H265 Video File¹ | .h265 |
| HLS (Adaptive Bitrate (ABR)) | .m3u8 |
| MOV File | .mov |
| MPEG-2 Transport Stream | .ts |
| MPEG-2 Program Stream | .ps |
| M2TS Transport Stream | .m2ts |
| MPEG File | .mpg |
| MPEG-2 File | .mpg2 |
| MPEG-2 File | .mp2 |
| MPEG File | .mpeg |
| MPEG-4 Movie | .mp4 |
| MPEG-4 File | .mpg4 |
| MPEG-Dash | .mpd |
| VLC (mpeg2) | .mpeg2 |
| VLC Media File (mpeg4) | .mpeg4 |
| VLC Media File (vob) | .vob |
| Windows Media Video File | .wmv |

¹ Requires multiplexing
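If you are scripting around the tool, a quick extension check against the table above can catch unsupported inputs early. The following is a minimal sketch; the helper name and workflow are illustrative and not part of the tool:

```python
from pathlib import Path

# Extensions from the supported-format table above.
SUPPORTED_EXTENSIONS = {
    ".av1", ".avi", ".h264", ".h265", ".m3u8", ".mov", ".ts", ".ps",
    ".m2ts", ".mpg", ".mpg2", ".mp2", ".mpeg", ".mp4", ".mpg4", ".mpd",
    ".mpeg2", ".mpeg4", ".vob", ".wmv",
}

def is_supported_video(path):
    """Return True if the file extension appears in the supported-format table."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS

print(is_supported_video("flight_042.mp4"))  # True
print(is_supported_video("flight_042.mkv"))  # False; .mkv is not listed
```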
The video file output format is .ts only.
The metadata associated with the video stream file will be used to compute the flight path of the video sensor, the video image frame center, and the four corners of the video frame footprint on the map.
Geospatial video supports the Motion Imagery Standards Board (MISB) specification. The full MISB specification defines many more parameters than those required for traditional full motion video. Whether you provide the full set of MISB parameters or a subset, the parameters provided will be encoded into the final video.
To compute and display the relative corner points of the video image footprint as a frame outline on the map, you need the 13 essential metadata fields listed below and further detailed in the parameter descriptions. When the metadata is complete and accurate, the tool will calculate the video frame corners and the size, shape, and position of the video frame outline, which can then be displayed on the map.
Precision Timestamp
Sensor Latitude
Sensor Longitude
Sensor Ellipsoid Height, or Sensor True Altitude
Platform Heading Angle
Platform Pitch Angle
Platform Roll Angle
Sensor Relative Roll Angle
Sensor Relative Elevation Angle
Sensor Relative Azimuth Angle
Sensor Horizontal Field of View
Sensor Vertical Field of View
Far Distance
This is the minimum metadata required to compute the transform between video and map, display the video footprint on the map, and enable other functionality such as digitizing and marking on the video and the map.
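As an illustration of this minimum metadata, the sketch below writes a one-row CSV containing only the 13 essential fields. The column names and sample values are placeholders; the actual headers must match the field names listed in the Multiplexer_Field_Mapping_Template.csv template described below.

```python
import csv

# Hypothetical header names for the 13 essential fields; use the exact field
# names from Multiplexer_Field_Mapping_Template.csv in a real metadata file.
ESSENTIAL_FIELDS = [
    "PrecisionTimeStamp", "SensorLatitude", "SensorLongitude",
    "SensorEllipsoidHeight", "PlatformHeadingAngle", "PlatformPitchAngle",
    "PlatformRollAngle", "SensorRelativeRollAngle",
    "SensorRelativeElevationAngle", "SensorRelativeAzimuthAngle",
    "SensorHorizontalFieldOfView", "SensorVerticalFieldOfView", "FarDistance",
]

# One illustrative record: time in UNIX microseconds, positions in decimal
# degrees, height in meters, angles and fields of view in degrees.
sample_row = [1546300800231000, 36.91118708, -76.1309338, 152.4,
              270.0, -1.2, 0.5, 0.0, -35.0, 10.0, 12.5, 9.4, 1200.0]

with open("minimal_metadata.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(ESSENTIAL_FIELDS)
    writer.writerow(sample_row)
```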
The values for the metadata fields can be entered in the Multiplexer_Field_Mapping_Template.csv metadata template file, located in C:\Program Files\ArcGIS\Pro\Resources\MotionImagery. The Multiplexer_Field_Mapping_Template.csv file contains all the required metadata fields. Only the 13 parameters defined above are needed to create a metadata-compliant video file; you do not need to provide all the parameters defined in the MISB specification. If additional MISB parameters are provided, they will be encoded into the video file.
The performance of the resulting multiplexed video file depends on the type and quality of the data contained in the metadata file and how accurately the video data and metadata files are synchronized.
If the Multiplexer_Field_Mapping_Template.csv file only contains the UNIX Time Stamp, Sensor Latitude, and Sensor Longitude fields, the location of the sensor will be displayed on the map, but the footprint of the video frames will not be displayed. Some functionality, such as digitizing features and measuring distance in the video, will not be supported.
If the time stamp linking the video and metadata is not accurately synchronized, the video footprint and sensor location on the map will be offset from the view in the video player. In this case, use the Multiplexer_TimeShift_Template.csv template in C:\Program Files\ArcGIS\Pro\Resources\MotionImagery to adjust the timing of the video and metadata.
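Because the metadata time reference is the UNIX time in microseconds (see the Metadata File parameter below), a small, exact conversion can help when preparing metadata or diagnosing synchronization offsets. This is a sketch only; the function name is illustrative:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_precision_timestamp(dt):
    """Integer microseconds since the UNIX epoch, the metadata time reference."""
    return (dt - EPOCH) // timedelta(microseconds=1)

# 2019-01-01 00:00:00.231 UTC -> 1546300800231000
t = to_precision_timestamp(datetime(2019, 1, 1, 0, 0, 0, 231000, tzinfo=timezone.utc))
print(t)

# A difference of 500,000 between two entries is half a second of elapsed time.
half_second_later = t + 500_000
```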
One set of parameters in the Multiplexer_Field_Mapping_Template.csv file includes the map coordinates of the four corners of the video image frame projected to the ground. If the four corner map coordinates are provided, they will be used in creating the video. If the four corner map coordinates are not provided, provide a source for digital elevation model (DEM) data in the Digital Elevation Model parameter, and the tool will compute the video footprint from the required parameters listed above.
The accuracy of the video footprint and frame center depends on the accuracy of the DEM data source provided. If you do not have access to DEM data, you can provide an average elevation and unit relative to sea level, such as 15 feet or 10 meters. In the case of a submersible, you can enter -15 feet or -10 meters, for example. Using an average elevation or ocean depth is not as accurate as providing DEM or bathymetric data. It is recommended that you provide a DEM layer or image service if it is available.
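The per-record rule described for the Digital Elevation Model parameter below, that the average elevation must be below the sensor altitude recorded in the metadata, can be expressed as a simple check. The sketch below uses illustrative values in meters:

```python
def corners_computable(avg_elevation, sensor_altitude):
    """Frame corners can be computed for a record only when the average
    elevation or ocean depth is below the sensor's recorded altitude."""
    return avg_elevation < sensor_altitude

print(corners_computable(9, 10))     # True: 9 m ground under a sensor at 10 m
print(corners_computable(-11, -10))  # True: seafloor at -11 m under a sensor at -10 m
print(corners_computable(15, 10))    # False: corners are not calculated for this record
```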
Geospatial video supports Video Moving Target Indicator (VMTI) data, based on object tracking methods in motion imagery. If VMTI data is recorded in a file separate from the associated video file, it can be encoded into the video file using the video multiplexer tool. Geospatial video supports the MISB Video Moving Target Indicator and Track Metadata standard.
Encode the VMTI data into the video by supplying the required VMTI information for the proper video frame using a .csv, .json, or .gpx metadata file containing the data for the following fields:
LDSVer,TimeStamp,FrameCenterLongitude,FrameCenterLatitude,SensorLongitude,SensorLatitude,vmtilocaldataset
5,1546300800231000,-76.1309338,36.91118708,-76.1309338,36.91118708,1 0.9938099 1611919 1815608 1711844;1 0.39056745 1438997 1556213 1496645
The last column, vmtilocaldataset, contains the detected object's bounding box information, in which each detection is defined by five space-delimited values: Object_ID Confidence_Level Top_Left_Pixel Bottom_Right_Pixel Center_Pixel. You can specify multiple object detections for a given time stamp with a semicolon (;) delimiter, as shown in the example above. Use MISB Tag 74 in the Multiplexer_Field_Mapping_Template.csv file.
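The sketch below shows one way to assemble the vmtilocaldataset column for a frame with two detections, reproducing the semicolon-delimited layout in the example above. The output file name is illustrative, and the detection values are taken from that example:

```python
import csv

def vmti_value(detections):
    """Build the vmtilocaldataset string: one space-delimited detection
    (Object_ID Confidence_Level Top_Left_Pixel Bottom_Right_Pixel Center_Pixel)
    per object, separated by semicolons."""
    return ";".join(" ".join(str(v) for v in d) for d in detections)

# Two detections sharing the same time stamp.
detections = [
    (1, 0.9938099, 1611919, 1815608, 1711844),
    (1, 0.39056745, 1438997, 1556213, 1496645),
]

row = {
    "LDSVer": 5,
    "TimeStamp": 1546300800231000,
    "FrameCenterLongitude": -76.1309338,
    "FrameCenterLatitude": 36.91118708,
    "SensorLongitude": -76.1309338,
    "SensorLatitude": 36.91118708,
    "vmtilocaldataset": vmti_value(detections),
}

with open("vmti_metadata.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(row))
    writer.writeheader()
    writer.writerow(row)
```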
Parameters
| Label | Explanation | Data type |
|---|---|---|
|
Input Video File |
The input video file that will be converted to a geospatial video file. The supported file types are listed in the table in the usage notes above. |
File |
|
Metadata File |
A comma-separated values (CSV), JSON, or GPX file containing the metadata associated with the video stream. Each column in the metadata file represents one metadata field, and one of the columns must be a time reference. The time reference is the UNIX time stamp (seconds since January 1, 1970) multiplied by one million and stored as an integer. The time is stored this way so that any instant in time (down to one millionth of a second) can be referenced with an integer. Consequently, a time difference between two entries of 500,000 represents one half of a second in elapsed time. The first row contains the field names for the metadata columns. These field names are listed in the Multiplexer_Field_Mapping_Template.csv template file. The metadata field names can be in any order and should be named exactly as listed in the template file. |
File |
|
Output Video File |
The name of the output video file, including the file extension. The only supported output video file format is MPEG-2 Transport Stream (.ts). |
File |
|
Metadata Mapping File (Optional) |
A CSV file that maps the metadata field names used in the Metadata File parameter value to MISB-compliant metadata field names. The Multiplexer_Field_Mapping_Template.csv template file in C:\Program Files\ArcGIS\Pro\Resources\MotionImagery contains all the required metadata fields and can be used as a starting point. |
File |
|
Timeshift File (Optional) |
A file containing defined time shift intervals. Ideally, the video images and the metadata are synchronized in time. In this case, the image footprint in the video surrounds features that can be seen in the video image. Sometimes there is a mismatch between the timing of the video and the timing in the metadata. This leads to an apparent time delay between when a ground feature is surrounded by the image footprint and when that ground feature is visible in the video image. If this time shift is observable and consistent, the multiplexer can adjust the timing of the metadata to match the video. If there is a mismatch between the timing of the video and metadata, specify the time shift in the timeshift file. For example, if the video image has a five-second lag for the entire time, the time shift observation file will contain a single entry defining that five-second shift. If there is a five-second lag at the 0:18 mark of the video and a nine-second lag at the 2:21 mark of the video, the file will contain two entries, one for each observed lag.
In this case, the video is shifted differently at the beginning of the video and at the end of the video. You can define any number of time shift intervals in the time shift observation file. |
File |
|
Digital Elevation Model (Optional) |
The source of the elevation needed for calculating the video frame corner coordinates. The source can be a layer, an image service, or an average ground elevation or ocean depth. The average elevation value must include a unit of measurement, such as meters, feet, or another measure of length. The accuracy of the video footprint and frame center depends on the accuracy of the DEM data source provided. It is recommended that you provide a DEM layer or image service. If you do not have access to DEM data, you can provide an average elevation and unit relative to sea level, such as 15 feet or 10 meters. In the case of a submersible, you can enter -15 feet or -10 meters, for example. Using an average elevation or ocean depth is not as accurate as providing DEM or bathymetric data. To calculate frame corner coordinates, the average elevation value must always be less than the sensor's altitude or depth as recorded in the metadata. For example, if the video was filmed at a sensor altitude of 10 meters and higher, a valid average elevation could be 9 meters or less. If a video was filmed underwater at a depth of -10 meters and deeper, a valid average elevation (relative to sea level) could be -11 meters or deeper. If the sensor altitude value is less than the average elevation value, the four corner coordinates will not be calculated for that record. If you do not know the average elevation of the project area, use a DEM. |
Raster Layer; Image Service; Linear Unit |
|
Input Coordinate System (Optional) |
The coordinate system that will be used for the Metadata File parameter value. |
Coordinate System |
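The tool can also be run from Python. The call below is a sketch only: it assumes the tool is exposed in the Image Analyst module as arcpy.ia.VideoMultiplexer with arguments in the order of the table above, and the paths are placeholders. Verify the tool's Python syntax in your ArcGIS Pro installation before relying on the exact name and argument order.

```python
import arcpy

# The Video Multiplexer tool requires the Image Analyst extension.
arcpy.CheckOutExtension("ImageAnalyst")

# Assumed tool name and argument order; confirm against the tool's Python syntax.
arcpy.ia.VideoMultiplexer(
    r"C:\data\flight_042.mp4",           # Input Video File
    r"C:\data\flight_042_metadata.csv",  # Metadata File
    r"C:\data\flight_042_fmv.ts",        # Output Video File (.ts only)
    None,                                # Metadata Mapping File (optional)
    None,                                # Timeshift File (optional)
    "10 Meters",                         # Digital Elevation Model (optional average elevation)
)
```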
Environments
Adjust for Daylight Saving, Time Zone
Licensing information
- Basic: Requires Image Analyst
- Standard: Requires Image Analyst
- Advanced: Requires Image Analyst