# 360 Camera Pipeline

This document provides step-by-step guidance for mapping with panoramic (360°) cameras.

A Pro license is required to use this feature.

Immersal map construction requires two elements as input: images and camera poses. The mapping process with phones/tablets and the BLK2GO is straightforward (image 0.1): both inputs come directly from the device, with the camera poses provided by the device's SLAM.

Mapping with Mapper or LiDAR

However, a panoramic (360°) camera has no SLAM at all, so the camera poses have to be estimated with a Structure-from-Motion (SfM) approach. In practice, you may utilize third-party photogrammetric software (e.g. Metashape) for that.

Mapping with 360 camera

# Tools

  • Immersal account (Enterprise level)
  • 360 camera. Recommended models (as of 06/2024):
    1. Insta 360 X4 (resolution 8k, good for day-time)
    2. Insta 360 ONE RS 1-inch sensor edition (resolution 6.5k, good for low-light environment)
  • Photogrammetric software, e.g. Metashape (Pro license needed).
  • Scripts for converting and uploading to Immersal backend. We offer a sample script for Metashape.
  • MeshLab (optional, for Map editing).

# 1. Mapping

# 1.1 Basic Principles

The primary principle for mapping is to ensure that the camera's position and orientation during mapping match the poses from which users are expected to localize. Hence, we must first determine the anticipated user locations and the directions the users' cameras will face.

# 1.2 Route Planning (Urban Area)

In order to obtain the correct camera pose, we should plan the route during mapping, which must meet the following requirements:

  • Ensure that the mapping route has a loop. The size of the loop should be determined according to the actual environment. For example, if we want to cover the streets in the right picture, a loop can be made for each block, and each loop should align with the others (i.e., ensure that the camera can capture overlapping frames).
  • At the end, we need to return to the starting point.

Plan your route carefully

# 1.3 Route Planning (Open Area)

For open areas, such as squares, with no surrounding objects (e.g. buildings), we can walk freely. However, the basic principles still apply: we need to think about what routes users might take for localization, where they might stop, and which direction the camera should face. This allows us to plan the best mapping route.

If there are key objects or landmarks in the environment that need to be covered, such as a sculpture in the center of the square, we can do more mapping around it, but please note:

  • Try not to rotate the camera; it is best to keep the camera's orientation constant, otherwise a jelly (rolling-shutter) effect may occur and degrade the mapping.
  • Ensure a uniform and slow walking speed, and do not suddenly stop or accelerate.

Mapping outdoor area

# 1.4. Mapping options

Users can choose to conduct the mapping through either taking panoramic photos or shooting panoramic videos. The former generally results in a spatial map of higher quality but is also more time-consuming and laborious. This is due to the fact that photographs often achieve a higher resolution than videos and can avoid motion blur, thus ensuring the high quality of the images. Additionally, since users can freely control the shooting density, such as densely capturing key areas and sparsely capturing non-key areas, the final spatial map generated is usually smaller in size yet superior in quality compared to that derived from videos.

# Option A: Taking 360 Photos

# Option B: Shooting 360 Videos

# 2. Processing The Mapping Data

# 2.1. Processing Data: Photos

# 2.1.1 Importing 360 photos into Metashape

First export the 360 photos from the camera, then drag them into Metashape.

Drag photos to Metashape

# 2.2. Processing Data: Importing From Video

Export the 360 video and prepare it for import into photogrammetry software (e.g. Metashape). You need to export the video from your camera as a panoramic flat view, i.e., an equirectangular projection. Use a format that Metashape can accept (mov, avi, flv, mp4, wmv). For Insta360 cameras, export the video through Insta360 Studio (desktop software). Make sure you are exporting a 360 video; the encoding can be H.264 or ProRes (the file size will be much larger).

Shooting 360 video
Equirectangular projection view

# 2.2.1 Frame Extraction:

It is not recommended to use Metashape's built-in frame extraction tool, as its sampling can be uneven. You may extract frames yourself or use the sample frame extraction script we provide. Specify the input (multiple videos are supported), the output directory, and the frame extraction interval (in seconds).
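The sampling logic behind such a script can be sketched as follows. This is a minimal illustration, not Immersal's actual script: it assumes a fixed-fps video and only computes which frame numbers to grab; the actual extraction of those frames could then be done with OpenCV or ffmpeg.

```python
# Sketch of even frame sampling for a fixed-fps video.
# Function and variable names are illustrative, not Immersal's script.

def frame_indices(fps: float, duration_s: float, interval_s: float) -> list[int]:
    """Return the frame numbers to extract, one every `interval_s` seconds."""
    step = max(1, round(fps * interval_s))  # frames between samples, at least 1
    total = int(fps * duration_s)           # total frames in the clip
    return list(range(0, total, step))

# Example: a 30 fps, 10 s clip sampled every 2 s
print(frame_indices(30.0, 10.0, 2.0))  # [0, 60, 120, 180, 240]
```

Sampling by a fixed frame step (rather than relying on the encoder's keyframes) is what keeps the extracted images evenly spaced along the walking route.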

Extracting frames
Then drag the extracted images to Metashape.
Drag the extracted images to Metashape

# 2.3. Setting the coordinate system:

The default coordinate system might be WGS84 (latitude, longitude, altitude). Make sure to switch to "Local coordinates" so that positions are represented in x/y/z format.

Setting the coordinate system

# 2.4. Aligning Photos to generate point-cloud

Click Tools -> Camera Calibration and set Camera type to 'Spherical'.

Camera Calibration

Then click Workflow -> Align Photos and select 'Accuracy' = 'Highest' and 'Reference preselection' = 'Estimated'. You may customize the 'Key point limit' and 'Tie point limit'; e.g. increasing the 'Key point limit' allows Metashape to extract more features from each image, which can help with finding matches.

Align Photos Settings
Generated Point Cloud

# 2.5. Generating Mesh and Texture

There are two purposes for generating the mesh and texture:

  • Having a mesh/texture would make it easy for us to add reference points/markers for fixing the scale.
  • The mesh/texture can be later used in placing AR content to the space in Unity.

You may build mesh and texture by clicking:

  • Workflow -> Build Mesh
  • Workflow -> Build Texture
Building mesh
Building texture

Generated Mesh with Texture

# 2.6. Checking the point-cloud and mesh

Please check the point-cloud and mesh! Look for any obvious errors. For example:

  • Some areas are missing in the point cloud (as shown in the left image below).
  • Misalignment or distortion that does not match the physical space (as shown in the right image below).

Misalignment (in point-cloud)
Misalignment (camera pose)
Misalignment (in mesh)

# 2.7. Fixing the scale and setting the coordinate system

The default scale of the point cloud generated by photogrammetry software differs from the real-world scale of the scene. To fix the scale, we need to find objects with known dimensions as references and add at least 3 points (markers) in the point cloud. More markers are preferred, to minimise inaccuracy.

For example, if we can clearly see the shape of the square tiles on the street surface in the mesh or point cloud, we can mark three of the four corners of the tiles. For each corner, right-click -> Add Marker, then you will see newly added points in the Markers list on the left, with a default position value.

Finding reference objects and adding markers

We now need to assign the correct position value to each point, so that we can set the orientation while correcting the scale. For example, we may take the first of the three points as the origin of the entire coordinate system (0, 0, 0), and rename it as "origin". Based on the actual scale of the object (tile side length is 2.8 meters), we can calculate the coordinates of the other two points as (0, 0, 2.8) and (2.8, 0, 0), and rename them as "point2" and "point3." Please be aware that you may first need to confirm the coordinate system of the photogrammetry software, e.g. Metashape uses a right-handed coordinate system, unlike Unity's left-handed coordinate system.
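The marker coordinates in this example can also be computed rather than typed by hand. A minimal sketch, where the 2.8 m tile side and the axis assignment follow the example above and the function name is illustrative:

```python
# Compute reference-marker coordinates for scale fixing, following the
# tile example above: the first corner is the origin, and the other two
# corners lie one tile side away along two perpendicular ground axes.

TILE_SIDE_M = 2.8  # measured side length of one square tile, in meters

def tile_markers(side: float) -> dict[str, tuple[float, float, float]]:
    """Return (x, y, z) positions for three corner markers of one tile."""
    return {
        "origin": (0.0, 0.0, 0.0),
        "point2": (0.0, 0.0, side),   # one side length along +z
        "point3": (side, 0.0, 0.0),   # one side length along +x
    }

print(tile_markers(TILE_SIDE_M))
```

Keeping the second coordinate (height) at zero places the ground plane at y = 0; remember that the handedness of these axes follows Metashape's right-handed convention, not Unity's.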

Then select all the markers and click Model -> Transform Object -> Update Transform. Note that you can switch between point-cloud mode and mesh mode at any time using the buttons at the top, to check whether the marker positions are correct.

Adding at least three markers

# 2.9. Optimize camera pose

Remove any unnecessary or incorrect points from the area. You can use the selection tool to select the points and then Delete (removes the selected points, retaining the others), or select the desired area and then Crop (removes the points outside the selection, retaining the selected points).

(img: 2.9.1)

Select Model -> Gradual Selection and set the parameters below. After each selection, if any points are selected, go to Edit -> Delete to delete them, leaving only a very small number of points in the end.

Selecting points

  • Re-projection error: 0.5-1 (recommended: 1)
  • Reconstruction uncertainty: 10-30 (recommended: 30)
  • Projection accuracy: 3-5 (recommended: 3)
  • Image count: 3-5 (recommended: 3)

When completed, click Tools -> Optimize Cameras.

Deleting points

# 2.10. Export camera poses

Click File -> Export -> Cameras; the camera poses will be saved in XML format. Then save your project.

# 3. Converting data and starting map construction

# 3.1. Convert the camera pose data

To use the exported camera pose data, we must prepare it per image. You may write your own script to convert the exported XML into a JSON file for each image. For Metashape, we provide a sample script 'metashape-xml-to-json.py' for this purpose. To run it, specify the path of the exported XML file and the path of the images.
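If you write your own converter, the core of it can be sketched as below. This assumes Metashape's camera-export layout, where each image appears as a `<camera label="...">` element containing a `<transform>` of 16 space-separated floats; the output JSON field shown here is illustrative only, so use the provided metashape-xml-to-json.py for the exact format Immersal expects.

```python
# Sketch: convert a Metashape camera-export XML into one JSON per image.
import json
import xml.etree.ElementTree as ET

def cameras_from_xml(xml_text: str) -> dict[str, list[float]]:
    """Map each camera label to its 4x4 world transform (16 floats, row-major)."""
    root = ET.fromstring(xml_text)
    poses = {}
    for cam in root.iter("camera"):
        transform = cam.find("transform")
        if transform is None or transform.text is None:
            continue  # skip cameras that Metashape failed to align
        poses[cam.attrib["label"]] = [float(v) for v in transform.text.split()]
    return poses

def write_pose_json(poses: dict[str, list[float]], out_dir: str = ".") -> None:
    """Write one <label>.json per image; the 'transform' field is illustrative."""
    for label, matrix in poses.items():
        with open(f"{out_dir}/{label}.json", "w") as f:
            json.dump({"transform": matrix}, f)
```

Skipping cameras without a `<transform>` matters in practice: Metashape omits the transform for images it could not align, and those images should not be uploaded with pose data.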

The script for converting camera pose

Once this step is completed, each image in the photo directory will have an associated JSON file containing its camera pose data.

The final output of camera pose for each image

# 3.2. Submitting data to start map construction

Now you can use Immersal's processing script 'submit-images-and-json.py' to upload the data to the Immersal server for map construction. Inside the script, specify the URL of the Immersal server, your Immersal token, the map name, and the image directory. Please note that the map name must consist only of letters and numbers (A-Z/a-z/0-9), and must not contain spaces or special characters (such as -, _, /, etc.).
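The map-name rule above can be checked before uploading. A minimal sketch; the helper name is illustrative, and the upload script itself may not perform this check:

```python
# Validate a map name: letters and digits only (A-Z/a-z/0-9),
# no spaces or special characters such as -, _, /.
import re

MAP_NAME_RE = re.compile(r"^[A-Za-z0-9]+$")

def is_valid_map_name(name: str) -> bool:
    return bool(MAP_NAME_RE.fullmatch(name))

print(is_valid_map_name("Square01"))  # True
print(is_valid_map_name("my-map"))    # False: '-' is not allowed
```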

Please pay special attention when there are many images: it is recommended to use a wired network for uploading (to prevent interruptions partway through). If the console output shows "error": "none" at the end, the upload succeeded; otherwise, an error message will be printed.

Script for submitting data to Immersal server

# 4. Testing the map

You can use the Immersal Mapper app to test localization within the space.

Test localization

# 5. Adding AR content

We can directly use the mesh/texture generated by Photogrammetry software to develop AR scenes.

  • In Metashape, navigate to File -> Export -> Export Model and File -> Export -> Export Texture to export the model and texture.
  • Please note that the default rotation of the model might be (-90, 0, 0), so you might need to reset it to (0, 0, 0).
  • After that, you can start adding AR content. When you make the final build for your app, it's good practice to disable or delete the model.
    Correcting the pose of the imported mesh

# 6. Editing and optimising the map (optional)

Panoramic cameras often capture extraneous objects, including the photographer themselves, trees, and so on. We can remove these unnecessary objects by manually editing the spatial map (point cloud), thereby increasing the success rate of positioning.

  • Download the point cloud file with the suffix '-sparse.ply' from the Immersal developer portal, and open it with the third-party software 'MeshLab'.
  • Within MeshLab, click View -> Toggle Orthographic Camera to switch to an orthographic view, which makes it easier to observe the point cloud.
  • In the point cloud information panel on the right, select 'None' for Shading to make the point cloud more visible. You may also manually adjust the Point Size for visibility.

Making it easier to visualize the point-cloud

You can download all scripts related to the 360 pipeline from GitHub here.