diff --git a/docs/developer_guides/viewer/local_viewer.md b/docs/developer_guides/viewer/local_viewer.md index ef8b1ac1f9..b2a36171a1 100644 --- a/docs/developer_guides/viewer/local_viewer.md +++ b/docs/developer_guides/viewer/local_viewer.md @@ -1,6 +1,6 @@ # (Legacy Viewer) Local Server -**Note:** this doc only applies to the legacy version of the viewer, which was the default in in Nerfstudio versions `<=0.3.4`. It was deprecated starting Nerfstudio version `1.0.0`, where it needs to be opted into via the `--vis viewer_legacy` argument. +**Note:** this doc only applies to the legacy version of the viewer, which was the default in Nerfstudio versions `<=0.3.4`. It was deprecated starting Nerfstudio version `1.0.0`, where it needs to be opted into via the `--vis viewer_legacy` argument. --- diff --git a/docs/quickstart/custom_dataset.md b/docs/quickstart/custom_dataset.md index 6444ca4456..7453ccb5b2 100644 --- a/docs/quickstart/custom_dataset.md +++ b/docs/quickstart/custom_dataset.md @@ -268,6 +268,31 @@ ns-process-data record3d --data {data directory} --output-dir {output directory} ns-train nerfacto --data {output directory} ``` +### Adding a Point Cloud + +Adding a point cloud is useful for avoiding random initialization when training gaussian splats. To add a point cloud using Record3D follow these steps: + +1. Export a Zipped sequence of PLY point clouds from Record3D. + + + + + + +2. Move the exported zip file to your computer from your iPhone. + + +3. Unzip the file and move all extracted `.ply` files to a directory. + + +4. Convert the data to nerfstudio format with the `--ply` flag and the directory from step 3. + +```bash +ns-process-data record3d --data {data directory} --ply {ply directory} --output-dir {output directory} +``` + +Additionally you can specify `--voxel-size {float}` which determines the level of sparsity when downsampling from the dense point clouds generated by Record3D to the sparse point cloud used in Nerfstudio. 
The default value is 0.8, lower is less sparse, higher is more sparse. + (spectacularai)= ## Spectacular AI @@ -292,13 +317,13 @@ pip install spectacularAI[full] 2. Install FFmpeg. Linux: `apt install ffmpeg` (or similar, if using another package manager). Windows: [see here](https://www.editframe.com/guides/how-to-install-and-start-using-ffmpeg-in-under-10-minutes). FFmpeg must be in your `PATH` so that `ffmpeg` works on the command line. 3. Data capture. See [here for specific instructions for each supported device](https://github.com/SpectacularAI/sdk-examples/tree/main/python/mapping#recording-data). - + 4. Process and export. Once you have recorded a dataset in Spectacular AI format and have it stored in `{data directory}` it can be converted into a Nerfstudio supported format with: ```bash sai-cli process {data directory} --preview3d --key_frame_distance=0.05 {output directory} ``` -The optional `--preview3d` flag shows a 3D preview of the point cloud and estimated trajectory live while VISLAM is running. The `--key_frame_distance` argument can be tuned based on the recorded scene size: 0.05 (5cm) is good for small scans and 0.15 for room-sized scans. If the processing gets slow, you can also try adding a --fast flag to `sai-cli process` to trade off quality for speed. +The optional `--preview3d` flag shows a 3D preview of the point cloud and estimated trajectory live while VISLAM is running. The `--key_frame_distance` argument can be tuned based on the recorded scene size: 0.05 (5cm) is good for small scans and 0.15 for room-sized scans. If the processing gets slow, you can also try adding a --fast flag to `sai-cli process` to trade off quality for speed. 5. Train. No separate `ns-process-data` step is needed. 
The data in `{output directory}` can now be trained with Nerfstudio: @@ -453,7 +478,7 @@ If cropping only needs to be done from the bottom, you can use the `--crop-botto ## 🥽 Render VR Video -Stereo equirectangular rendering for VR video is supported as VR180 and omni-directional stereo (360 VR) Nerfstudio camera types for video and image rendering. +Stereo equirectangular rendering for VR video is supported as VR180 and omni-directional stereo (360 VR) Nerfstudio camera types for video and image rendering. ### Omni-directional Stereo (360 VR) This outputs two equirectangular renders vertically stacked, one for each eye. Omni-directional stereo (ODS) is a method to render VR 3D 360 videos, and may introduce slight depth distortions for close objects. For additional information on how ODS works, refer to this [writeup](https://developers.google.com/vr/jump/rendering-ods-content.pdf). @@ -464,7 +489,7 @@ This outputs two equirectangular renders vertically stacked, one for each eye. O ### VR180 -This outputs two 180 deg equirectangular renders horizontally stacked, one for each eye. VR180 is a video format for VR 3D 180 videos. Unlike in omnidirectional stereo, VR180 content only displays front facing content. +This outputs two 180 deg equirectangular renders horizontally stacked, one for each eye. VR180 is a video format for VR 3D 180 videos. Unlike in omnidirectional stereo, VR180 content only displays front facing content.
@@ -524,4 +549,4 @@ If the depth of the scene is unviewable and looks too close or expanded when vie - The IPD can be modified in the `cameras.py` script as the variable `vr_ipd` (default is 64 mm). - Compositing with Blender Objects and VR180 or ODS Renders - Configure the Blender camera as panoramic and equirectangular. For the VR180 Blender camera, set the panoramic longitude min and max to -90 and 90. - - Change the Stereoscopy mode to "Parallel" set the Interocular Distance to 0.064 m. + - Change the Stereoscopy mode to "Parallel" set the Interocular Distance to 0.064 m. diff --git a/docs/quickstart/imgs/record_3d_export_button.png b/docs/quickstart/imgs/record_3d_export_button.png new file mode 100644 index 0000000000..01fed42831 Binary files /dev/null and b/docs/quickstart/imgs/record_3d_export_button.png differ diff --git a/docs/quickstart/imgs/record_3d_ply_selection.png b/docs/quickstart/imgs/record_3d_ply_selection.png new file mode 100644 index 0000000000..33c3b57bee Binary files /dev/null and b/docs/quickstart/imgs/record_3d_ply_selection.png differ diff --git a/docs/quickstart/imgs/record_3d_video_example.png b/docs/quickstart/imgs/record_3d_video_example.png new file mode 100644 index 0000000000..9c4c99880c Binary files /dev/null and b/docs/quickstart/imgs/record_3d_video_example.png differ diff --git a/nerfstudio/data/utils/pixel_sampling_utils.py b/nerfstudio/data/utils/pixel_sampling_utils.py index ae77b3287f..717124f762 100644 --- a/nerfstudio/data/utils/pixel_sampling_utils.py +++ b/nerfstudio/data/utils/pixel_sampling_utils.py @@ -23,7 +23,7 @@ def dilate(tensor: Float[Tensor, "bs 1 H W"], kernel_size=3) -> Float[Tensor, "bs 1 H W"]: - """Dilate a tensor with 0s and 1s. 0s will be be expanded based on the kernel size. + """Dilate a tensor with 0s and 1s. 0s will be expanded based on the kernel size. Args: kernel_size: Size of the pooling region. Dilates/contracts 1 pixel if kernel_size is 3. 
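Note on `--voxel-size`: the `custom_dataset.md` change above says that a lower value gives a less sparse cloud and a higher value a sparser one. The sketch below illustrates why, using a simplified pure-Python stand-in for voxel downsampling. It is not Open3D's implementation: the helper name `voxel_downsample`, the grid anchored at the origin, and the sample points are all assumptions made for illustration. Each point is assigned to a cubic cell of side `voxel_size`, and every occupied cell is reduced to the mean of its points.

```python
from collections import defaultdict
from math import floor
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]
VoxelKey = Tuple[int, int, int]


def voxel_downsample(points: List[Point], voxel_size: float) -> List[Point]:
    """Hypothetical helper: keep one averaged point per occupied voxel.

    Assumes a grid anchored at the origin; real libraries may align it differently.
    """
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")

    # Group points by the integer index of the voxel that contains them.
    buckets: Dict[VoxelKey, List[Point]] = defaultdict(list)
    for x, y, z in points:
        key = (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))
        buckets[key].append((x, y, z))

    # Replace each group with its centroid.
    result: List[Point] = []
    for group in buckets.values():
        n = len(group)
        result.append(
            (
                sum(p[0] for p in group) / n,
                sum(p[1] for p in group) / n,
                sum(p[2] for p in group) / n,
            )
        )
    return result


# Made-up sample points, only to show the effect of the parameter.
dense = [(0.1, 0.1, 0.1), (0.3, 0.2, 0.1), (0.9, 0.1, 0.1), (1.7, 0.1, 0.1)]
coarse = voxel_downsample(dense, voxel_size=0.8)  # larger voxels merge more points
fine = voxel_downsample(dense, voxel_size=0.2)    # smaller voxels keep more points
print(len(coarse), len(fine))
```

With the larger voxel the first two sample points share a cell and are merged, while with the smaller voxel every point lands in its own cell, which matches the "higher is more sparse" wording in the docs.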
diff --git a/nerfstudio/process_data/record3d_utils.py b/nerfstudio/process_data/record3d_utils.py index 2bbe7b62de..50212c2b20 100644 --- a/nerfstudio/process_data/record3d_utils.py +++ b/nerfstudio/process_data/record3d_utils.py @@ -16,23 +16,32 @@ import json from pathlib import Path -from typing import List +from typing import List, Optional import numpy as np +import open3d as o3d from scipy.spatial.transform import Rotation from nerfstudio.process_data.process_data_utils import CAMERA_MODELS from nerfstudio.utils import io -def record3d_to_json(images_paths: List[Path], metadata_path: Path, output_dir: Path, indices: np.ndarray) -> int: +def record3d_to_json( + images_paths: List[Path], + metadata_path: Path, + output_dir: Path, + indices: np.ndarray, + ply_dirname: Optional[Path], + voxel_size: Optional[float], +) -> int: """Converts Record3D's metadata and image paths to a JSON file. Args: - images_paths: list if image paths. + images_paths: list of image paths. metadata_path: Path to the Record3D metadata JSON file. output_dir: Path to the output directory. indices: Indices to sample the metadata_path. Should be the same length as images_paths. + ply_dirname: Path to the directory of exported ply files. Returns: The number of registered images. 
@@ -87,6 +96,23 @@ def record3d_to_json(images_paths: List[Path], metadata_path: Path, output_dir: out["frames"] = frames + # If .ply directory exists add the sparse point cloud for gsplat point initialization + if ply_dirname is not None: + assert ply_dirname.exists(), f"Directory not found: {ply_dirname}" + assert ply_dirname.is_dir(), f"Path given is not a directory: {ply_dirname}" + + # Create sparse point cloud + pcd = o3d.geometry.PointCloud() + for ply_filename in ply_dirname.iterdir(): + temp_pcd = o3d.io.read_point_cloud(str(ply_filename)) + pcd += temp_pcd.voxel_down_sample(voxel_size=voxel_size) + + # Save point cloud + points3D = np.asarray(pcd.points) + pcd.points = o3d.utility.Vector3dVector(points3D) + o3d.io.write_point_cloud(str(output_dir / "sparse_pc.ply"), pcd, write_ascii=True) + out["ply_file_path"] = "sparse_pc.ply" + with open(output_dir / "transforms.json", "w", encoding="utf-8") as f: json.dump(out, f, indent=4) diff --git a/nerfstudio/scripts/process_data.py b/nerfstudio/scripts/process_data.py index 5e1d869f32..1fdd36f7f2 100644 --- a/nerfstudio/scripts/process_data.py +++ b/nerfstudio/scripts/process_data.py @@ -49,6 +49,11 @@ class ProcessRecord3D(BaseConverterToNerfstudioDataset): 2. Converts Record3D poses into the nerfstudio format. """ + ply_dir: Optional[Path] = None + """Path to the Record3D directory of point export ply files.""" + voxel_size: Optional[float] = 0.8 + """Voxel size for down sampling dense point cloud""" + num_downscales: int = 3 """Number of times to downscale the images. Downscales by 2 each time. 
For example a value of 3 will downscale the images by 2x, 4x, and 8x.""" @@ -101,7 +106,14 @@ def main(self) -> None: ) metadata_path = self.data / "metadata.json" - record3d_utils.record3d_to_json(copied_image_paths, metadata_path, self.output_dir, indices=idx) + record3d_utils.record3d_to_json( + copied_image_paths, + metadata_path, + self.output_dir, + indices=idx, + ply_dirname=self.ply_dir, + voxel_size=self.voxel_size, + ) CONSOLE.rule("[bold green]:tada: :tada: :tada: All DONE :tada: :tada: :tada:") for summary in summary_log: diff --git a/nerfstudio/utils/tensor_dataclass.py b/nerfstudio/utils/tensor_dataclass.py index 293d978d7e..fb3331a355 100644 --- a/nerfstudio/utils/tensor_dataclass.py +++ b/nerfstudio/utils/tensor_dataclass.py @@ -280,7 +280,7 @@ def _apply_fn_to_fields( ) -> TensorDataclassT: """Applies a function to all fields of the tensor dataclass. - TODO: Someone needs to make a high level design choice for whether not not we want this + TODO: Someone needs to make a high level design choice for whether or not we want this to apply the function to any fields in arbitray superclasses. This is an edge case until we upgrade to python 3.10 and dataclasses can actually be subclassed with vanilla python and no janking, but if people try to jank some subclasses that are grandchildren of TensorDataclass