BaoNguyenz/LR3D-CULT

3D Reconstruction of Vietnamese Pottery Using Neural Radiance Fields

Capstone Project — Single/Few-Shot 3D Reconstruction of cultural pottery artifacts using PixelNeRF, with vanilla NeRF as a baseline for comparison.

Overview

This project reconstructs 3D models of Vietnamese pottery from a small number of 2D images. It uses Neural Radiance Fields (NeRF) to synthesize novel views and exports the results as 3D point clouds (PLY files).

| Approach | Framework | Role | Input Required |
|---|---|---|---|
| PixelNeRF | PyTorch | Primary method | 1–3 source views (generalizable) |
| NeRF (vanilla) | TensorFlow | Baseline comparison | ~100 views per object |

Key Features

  • End-to-end pipeline: from Blender renders → data preparation → training → 3D export
  • Custom dataset of Vietnamese pottery (bowls, vases, cups, dishes) from the LR3D-CULT dataset
  • Automated COLMAP camera pose estimation with fallback to synthetic turntable poses
  • PLY point cloud export for visualization and downstream use

Project Structure

Capstone/
├── README.md
│
├── scripts/                           # All utility scripts
│   ├── data_processing/               #   Dataset standardization, resize, split
│   ├── camera_poses/                  #   Camera pose generation & fixing
│   ├── colmap/                        #   COLMAP automation
│   ├── export/                        #   3D export (PLY point clouds)
│   └── evaluation/                    #   Metrics & visualization
│
├── NeRF_finetuning/                   # NeRF baseline
│   ├── nerf/                          #   Original NeRF repo (TensorFlow)
│   ├── All_data/                      #   Raw dataset (~180 views/object, 512×512)
│   ├── data_nerf/                     #   Prepared NeRF data (train/val/test splits)
│   └── outputs/                       #   NeRF training outputs
│
├── PixelNerf_finetuning/              # PixelNeRF (primary)
│   ├── pixel-nerf/                    #   PixelNeRF repo (PyTorch)
│   │   ├── src/                       #     Model, data loaders, renderer, utils
│   │   ├── train/                     #     Training script
│   │   ├── eval/                      #     Evaluation & video generation
│   │   └── conf/                      #     HOCON config files
│   └── LR3D-CULT/                     #   Source dataset (Blender renders)
│
├── docs/                              # Documentation
│   ├── papers/                        #   Reference papers (NeRF, PixelNeRF, etc.)
│   ├── reports/                       #   Project reports & guides
│   ├── figures/                       #   Pipeline diagrams & visualizations
│   └── presentations/                 #   Slide decks
│
├── notebooks/                         # Jupyter notebooks
│   ├── analysis.ipynb                 #   Dataset analysis
│   ├── render_demo.ipynb              #   NeRF rendering demo
│   ├── extract_mesh.ipynb             #   Mesh extraction
│   └── tiny_nerf.ipynb                #   Minimal NeRF tutorial
│
└── results/                           # Final outputs & benchmarks

Requirements

System

  • OS: Windows 10/11
  • GPU: NVIDIA GPU with CUDA support (recommended ≥ 8GB VRAM)
  • COLMAP: Required for camera pose estimation

Python Dependencies

PixelNeRF (Primary):

torch >= 1.10
torchvision
numpy
Pillow
imageio
tqdm
dotmap
pyhocon
open3d          # for PLY export
lpips           # for perceptual metrics

NeRF Baseline:

tensorflow >= 2.x
numpy
imageio

Installation

# Clone the project
git clone <repo_url>
cd Capstone

# Install PixelNeRF dependencies
cd PixelNerf_finetuning/pixel-nerf
pip install -r requirements.txt

# Install NeRF baseline dependencies (optional); path is relative to pixel-nerf/
cd ../../NeRF_finetuning/nerf
conda env create -f environment.yml

Data Pipeline

Dataset Overview

The dataset consists of Vietnamese pottery objects rendered in Blender with 180 views per object (2 cameras × 90 frames, 360° turntable):

| Category | Prefix | Description |
|---|---|---|
| Bowls | bat_gom_ | Ceramic bowls |
| Vases | binh_gom_ | Ceramic vases |
| Vases (BT) | binh_gom_bt_ | Ceramic vases (variant) |
| Cups | chen_gom_ | Ceramic cups |
| Dishes | dia_gom_ | Ceramic dishes |
| Bronze cups | ly_dong_ | Bronze cups |

Data Format

Each object folder contains:

object_name/
├── images/              # 180 PNG images (512×512)
├── transforms.json      # Camera poses (NeRF format)
└── metadata.json        # Blender render parameters

transforms.json format:

{
  "camera_angle_x": 0.6911,
  "frames": [
    {
      "file_path": "./images/0001.png",
      "transform_matrix": [[4×4 camera-to-world matrix]],
      "w": 512, "h": 512,
      "fl_x": 349.2, "fl_y": 349.2,
      "cx": 256.0, "cy": 256.0
    }
  ]
}
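A minimal sketch of reading this format, assuming standard pinhole intrinsics. The helper names `focal_from_fov` and `load_frames` are hypothetical and do not come from the repository. When a frame has no `fl_x`, the focal length is derived from `camera_angle_x` as `0.5 * w / tan(0.5 * camera_angle_x)`. For a 512-pixel-wide image and `camera_angle_x = 0.6911` this gives about 711 px, which differs from the 349.2 in the sample above, so the sample numbers may be format placeholders rather than measured values.

```python
import json
import math


def focal_from_fov(camera_angle_x: float, width: int) -> float:
    """Pinhole focal length in pixels from the horizontal FOV in radians."""
    return 0.5 * width / math.tan(0.5 * camera_angle_x)


def load_frames(transforms_text: str):
    """Parse transforms.json text into (file_path, 4x4 matrix, focal) tuples."""
    meta = json.loads(transforms_text)
    frames = []
    for frame in meta["frames"]:
        # Prefer the per-frame focal length; fall back to the global FOV.
        focal = frame.get("fl_x")
        if focal is None:
            focal = focal_from_fov(meta["camera_angle_x"], frame["w"])
        frames.append((frame["file_path"], frame["transform_matrix"], focal))
    return frames


# In-memory example: identity pose, no explicit fl_x (illustrative values).
example = json.dumps({
    "camera_angle_x": 0.6911,
    "frames": [{
        "file_path": "./images/0001.png",
        "transform_matrix": [[1, 0, 0, 0], [0, 1, 0, 0],
                             [0, 0, 1, 0], [0, 0, 0, 1]],
        "w": 512, "h": 512,
    }],
})
path, c2w, focal = load_frames(example)[0]
print(path, round(focal, 1))
```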

Preparing Data for PixelNeRF

# Full standardization pipeline: select N frames uniformly, resize to 128×128, split 70:20:10
python scripts/data_processing/standard_dataset.py

# Or use the CLI version with custom args:
python scripts/data_processing/standad_resize_dataset.py \
  --src "path/to/All_data" \
  --dst "path/to/output_dataset" \
  --n 90 --resize 128 128 --fov 50 --seed 42
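The selection and split steps can be sketched as follows. This is not the code of the scripts above. It assumes that "uniformly" means evenly spaced indices, that the split is a seeded shuffle, and that the 70:20:10 parts are train, validation, and test in that order. `select_uniform` and `split_70_20_10` are hypothetical names.

```python
import random


def select_uniform(items, n):
    """Pick n items at evenly spaced indices (all items if n >= len)."""
    if n >= len(items):
        return list(items)
    return [items[i * len(items) // n] for i in range(n)]


def split_70_20_10(items, seed=42):
    """Shuffle deterministically, then cut into 70% / 20% / 10% parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids float rounding such as 0.7 * 90 < 63.
    n_first = len(shuffled) * 7 // 10
    n_second = len(shuffled) * 2 // 10
    return (shuffled[:n_first],
            shuffled[n_first:n_first + n_second],
            shuffled[n_first + n_second:])


frames = [f"{i:04d}.png" for i in range(1, 181)]  # 180 views per object
picked = select_uniform(frames, 90)               # mirrors --n 90
train, val, test = split_70_20_10(picked)         # mirrors --seed 42
print(len(train), len(val), len(test))            # 63 18 9
```

Fixing the seed keeps the split reproducible across runs, which matters when comparing PixelNeRF and the NeRF baseline on the same held-out views.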

Fixing Camera Poses

If COLMAP fails to register enough frames:

# Generate poses from Blender metadata (most accurate)
python scripts/camera_poses/generate_transforms_from_metadata.py

# Or use synthetic turntable poses as fallback
python scripts/camera_poses/fix_all_transforms.py

# Check dataset quality
python scripts/camera_poses/check_nerf_dataset.py
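For reference, synthetic turntable poses can be generated along these lines. This is a sketch under the conventions this README uses (Y-up, camera looking along −Z, camera-to-world matrices), not the contents of `fix_all_transforms.py`. The radius, height, and view count are illustrative assumptions, and the function names are hypothetical.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def normalize(v):
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def look_at_origin(eye):
    """4x4 camera-to-world matrix for a camera at `eye` aimed at the origin.

    The camera looks along its -Z axis, so its +Z axis ("back") points
    from the origin toward the camera. Undefined if `eye` lies on the Y axis.
    """
    back = normalize(eye)
    right = normalize(cross([0.0, 1.0, 0.0], back))
    up = cross(back, right)
    rows = [[right[i], up[i], back[i], eye[i]] for i in range(3)]
    return rows + [[0.0, 0.0, 0.0, 1.0]]


def turntable_poses(n_views, radius, height=0.0):
    """Evenly spaced poses on a circle around the Y axis."""
    poses = []
    for k in range(n_views):
        theta = 2.0 * math.pi * k / n_views
        eye = [radius * math.sin(theta), height, radius * math.cos(theta)]
        poses.append(look_at_origin(eye))
    return poses


poses = turntable_poses(n_views=90, radius=4.0, height=1.0)
print(len(poses))  # 90
```

Such poses are only as good as the assumed radius and height, which is why poses derived from Blender metadata are preferred when available.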

Training

PixelNeRF (Primary)

cd PixelNerf_finetuning/pixel-nerf

# Train on pottery dataset
python train/train.py \
  -n pottery_experiment \
  -c conf/exp/multi_obj.conf \
  -D path/to/dataset_pottery \
  --gpu_id 0 \
  --epochs 200

Training logs are saved to logs/ and checkpoints to checkpoints/.

NeRF Baseline

cd NeRF_finetuning/nerf

# Train on a single object
python run_nerf.py \
  --config config_lego.txt \
  --datadir path/to/object_data \
  --basedir outputs/

Evaluation

Render Novel Views & Compute Metrics

cd PixelNerf_finetuning/pixel-nerf

# Evaluate on test set
python eval/eval.py \
  -n pottery_experiment \
  -c conf/exp/multi_obj.conf \
  -D path/to/dataset_pottery \
  --gpu_id 0

# Calculate PSNR, SSIM, LPIPS
python eval/calc_metrics.py \
  -n pottery_experiment

# Generate 360° rotation video
python eval/gen_video.py \
  -n pottery_experiment
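As a reminder of what the PSNR figure means, here is a small standalone sketch of the standard definition, PSNR = 10 · log10(MAX² / MSE). It is not taken from `eval/calc_metrics.py`, and it assumes pixel values are floats in [0, 1] stored as flat sequences.

```python
import math


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


rendered = [0.5, 0.5, 0.5, 0.5]
reference = [0.6, 0.4, 0.5, 0.5]
print(round(psnr(rendered, reference), 2))
```

Higher is better. SSIM and LPIPS need windowed statistics and a pretrained network respectively, so they are left to the existing scripts.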

Perceptual Quality (LPIPS)

python scripts/evaluation/predict_lpips.py

3D Export (PLY Point Cloud)

cd PixelNerf_finetuning/pixel-nerf

# Export PLY from a trained model
python scripts/export/export_ply.py \
  --weights checkpoints/pottery_experiment/pixel_nerf_latest \
  --input path/to/source_images \
  --transforms path/to/transforms.json \
  --output results/output.ply \
  --n_views 36

The exported .ply file can be viewed in MeshLab, CloudCompare, or Open3D.
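To make the output format concrete, the sketch below writes a colored point cloud as an ASCII PLY file using only the standard library. It illustrates the file layout those viewers expect and is not the implementation of `export_ply.py`, which may use Open3D or a binary PLY variant. `write_ascii_ply` is a hypothetical name.

```python
import io


def write_ascii_ply(stream, points, colors):
    """Write XYZ points with 8-bit RGB colors as an ASCII PLY file."""
    if len(points) != len(colors):
        raise ValueError("points and colors must have the same length")
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]
    stream.write("\n".join(header) + "\n")
    for (x, y, z), (r, g, b) in zip(points, colors):
        stream.write(f"{x:.6f} {y:.6f} {z:.6f} {r} {g} {b}\n")


# Write two points to an in-memory buffer; pass an open file to save to disk.
buffer = io.StringIO()
write_ascii_ply(buffer,
                [(0.0, 0.0, 0.0), (0.1, 0.2, 0.3)],
                [(255, 255, 255), (180, 90, 40)])
ply_text = buffer.getvalue()
print(ply_text.splitlines()[2])  # element vertex 2
```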


Model Architecture

PixelNeRF

| Component | Details |
|---|---|
| Encoder | ResNet-34 (ImageNet pretrained), 4 feature levels |
| MLP | ResNet-style, 3 blocks, 512 hidden dims (coarse + fine) |
| Renderer | Coarse: 64 samples, Fine: 32 samples, Depth: 16 samples |
| Positional Encoding | 6 frequencies, freq_factor=1.5 |
| Background | White |
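The positional-encoding setting can be illustrated with a short sketch. It assumes the frequencies are `freq_factor * 2**k` for `k = 0..5` and that the raw input is concatenated with the sine and cosine features. The exact feature ordering in the PixelNeRF code may differ, so treat this as an illustration of the idea rather than a drop-in replacement.

```python
import math


def positional_encoding(x, num_freqs=6, freq_factor=1.5, include_input=True):
    """Map each coordinate to sin/cos features at geometrically spaced frequencies."""
    freqs = [freq_factor * 2.0 ** k for k in range(num_freqs)]
    out = list(x) if include_input else []
    for f in freqs:
        out.extend(math.sin(f * c) for c in x)
        out.extend(math.cos(f * c) for c in x)
    return out


encoded = positional_encoding([0.1, -0.2, 0.3])
print(len(encoded))  # 39 = 3 + 3 * 2 * 6
```

With a 3D point as input, the encoding has 39 dimensions under these assumptions.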

NeRF (Baseline)

| Component | Details |
|---|---|
| MLP | 8 layers × 256 units, skip at layer 4 |
| Positional Encoding | 10 frequencies (position), 4 frequencies (view direction) |
| Rendering | Coarse + Fine hierarchical sampling |
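Both the coarse and fine passes use the same compositing step from the NeRF paper: each sample contributes `T_i * alpha_i * c_i`, where `alpha_i = 1 - exp(-sigma_i * delta_i)` and `T_i` is the transmittance accumulated before sample `i`. The sketch below is a plain-Python illustration of that quadrature for one ray. The function name and example values are assumptions, not repository code.

```python
import math


def composite_ray(sigmas, colors, deltas):
    """Alpha-composite samples along one ray; returns (rgb, weights)."""
    transmittance = 1.0  # fraction of light not yet absorbed
    rgb = [0.0, 0.0, 0.0]
    weights = []
    for sigma, color, delta in zip(sigmas, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        weights.append(weight)
        rgb = [acc + weight * c for acc, c in zip(rgb, color)]
        transmittance *= 1.0 - alpha
    return rgb, weights


# Empty space, then an opaque green surface, then an occluded blue sample.
rgb, weights = composite_ray(
    sigmas=[0.0, 1e9, 5.0],
    colors=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    deltas=[0.1, 0.1, 0.1],
)
print(rgb, weights)
```

In hierarchical sampling, the coarse-pass weights act as a distribution along the ray from which the fine-pass sample locations are drawn.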

Camera Conventions

  • Coordinate system: Y-up
  • Camera orientation: Looks along −Z axis
  • camera_angle_x: Horizontal FOV in radians (~0.6911 rad ≈ 39.6°)
  • transform_matrix: 4×4 camera-to-world (c2w) transformation
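Under these conventions, the ray through a pixel can be sketched as below. The sketch assumes the principal point is at the image center and that image rows grow downward, as in the original NeRF ray generation. `pixel_ray_direction` and `rotate_to_world` are hypothetical helper names.

```python
import math


def pixel_ray_direction(col, row, width, height, focal):
    """Camera-space ray direction through a pixel (camera looks along -Z)."""
    cx, cy = 0.5 * width, 0.5 * height
    # Image rows grow downward while camera Y points up, hence the sign flip.
    return [(col - cx) / focal, -(row - cy) / focal, -1.0]


def rotate_to_world(c2w, direction):
    """Apply the 3x3 rotation part of a camera-to-world matrix."""
    return [sum(c2w[r][c] * direction[c] for c in range(3)) for r in range(3)]


focal = 0.5 * 512 / math.tan(0.5 * 0.6911)  # from camera_angle_x
identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

center = pixel_ray_direction(256, 256, 512, 512, focal)
corner = pixel_ray_direction(0, 0, 512, 512, focal)
print(rotate_to_world(identity, center))
print(rotate_to_world(identity, corner))
```

The ray origin is the translation column of the same matrix, so every ray from one image shares an origin.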

References

  1. Mildenhall, B., et al. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. ECCV 2020.
  2. Yu, A., et al. pixelNeRF: Neural Radiance Fields from One or Few Images. CVPR 2021.
  3. Cai, S., et al. Pix2NeRF: Unsupervised Conditional π-GAN for Single Image to Neural Radiance Fields Translation. CVPR 2022.
  4. Liu, R., et al. Zero-1-to-3: Zero-shot One Image to 3D Object. ICCV 2023.
  5. Tancik, M., et al. Nerfstudio: A Modular Framework for Neural Radiance Field Development. SIGGRAPH 2023.
  6. LR3D-CULT Dataset — 3D cultural heritage pottery dataset.

License

This project is for academic purposes (Capstone Project). The NeRF and PixelNeRF codebases retain their original upstream licenses; see the LICENSE file in each repository.
