Efficient4D: Fast Dynamic 3D Object Generation from a Single-view Video
Zijie Pan, Zeyu Yang, Xiatian Zhu, Li Zhang
IJCV 2026
Official implementation of "Efficient4D: Fast Dynamic 3D Object Generation from a Single-view Video".
demo.mp4
conda create -n efficient4d python=3.9
conda activate efficient4d
pip install -r requirements.txt
Follow SyncDreamer to download the checkpoints under ./ckpt.
We provide one example case under ./data; please refer to Consistent4D for more data.
- Tested on Ubuntu 20 with torch 1.13 and CUDA 11.7 on eight A6000 GPUs.
# set path
DEVICE="0,1,2,3"
N_FRAME=16 # 4 frames per GPU, 40G memory
INPUT="./data/walking_white_faced_egret"
OUTPUT="./output/walking_white_faced_egret"
OUTPUT_RECON="./recon_data/walking_white_faced_egret"
# generate pseudo multi-view and multi-frame consistent images
CUDA_VISIBLE_DEVICES=$DEVICE python generate.py \
--input $INPUT \
--output $OUTPUT \
--crop_size 200 \
--elevation 0 \
--frame_num $N_FRAME \
--smooth_filter \
--decomposed_sampling \
--is_cyc 0 \
--seed 3407
# frame interpolation (optional)
for ((i = 0; i < N_FRAME - 1; i++))
do
    IMG1="$OUTPUT/$i.png"
    IMG2="$OUTPUT/$((i + 1)).png"
    SAVE_PATH="$OUTPUT/interp/${i}_$((i + 1))"
    CUDA_VISIBLE_DEVICES=$DEVICE python frame_interpolation.py --img "$IMG1" "$IMG2" --cycle2 --save_path "$SAVE_PATH"
done
# preview
CUDA_VISIBLE_DEVICES=$DEVICE python concat_video.py --root_path $OUTPUT --num_frame $N_FRAME --interp
# confidence map
CUDA_VISIBLE_DEVICES=$DEVICE python video_confidence.py --root_path $OUTPUT
# pose
CUDA_VISIBLE_DEVICES=$DEVICE python data_convertor.py --input $OUTPUT --output $OUTPUT_RECON --confidence
We also provide a script, run.sh, that runs the above process in one command:
# bash run.sh DATA_NAME OUTPUT_NAME DEVICES N_FRAMES CYCLE
# We can set CYCLE=1 to enhance consistency if the input video is periodic
# or the first frame and the last frame of the input video are similar.
bash run.sh walking_white_faced_egret walking_white_faced_egret 4,5,6,7 16 0
We have now released the 4D reconstruction part under the generation branch of 4DGS. The branch provides a general framework combining 4DGS and the SDS loss, and it supports multi-view images as input (e.g., from this repo, Diffusion^2, and SV4D).
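The commands above pass a comma-separated GPU list together with a frame count, and the "4 frames per GPU" comment corresponds to 16 frames on 4 GPUs. The sketch below is a hypothetical pre-flight check, not part of the official scripts: it counts the GPUs in the list and reports the resulting frames per GPU. The variable names GPU_IDS, N_GPU, and FRAMES_PER_GPU are illustrative, and the even split across GPUs is an assumption inferred from that comment rather than a documented requirement of generate.py.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of the official Efficient4D scripts.
DEVICES="4,5,6,7"   # same comma-separated format as the DEVICES argument above
N_FRAMES=16

# Split the comma-separated GPU list into an array and count its entries.
IFS=',' read -r -a GPU_IDS <<< "$DEVICES"
N_GPU=${#GPU_IDS[@]}

# Assumption: frames are distributed evenly across the visible GPUs.
if (( N_FRAMES % N_GPU != 0 )); then
    echo "Warning: $N_FRAMES frames do not divide evenly across $N_GPU GPUs" >&2
fi

FRAMES_PER_GPU=$(( N_FRAMES / N_GPU ))
echo "GPUs: $N_GPU, frames per GPU: $FRAMES_PER_GPU"
```

With the values shown, this reports 4 GPUs and 4 frames per GPU, matching the memory note in the step-by-step script. If fewer GPUs are available, reducing the frame count proportionally is one way to keep the per-GPU load similar, although the actual memory use depends on the model and is not verified here.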
We have borrowed code extensively from the following repositories. Many thanks to the authors for sharing their code.
If you find our repository useful, please consider giving it a star ⭐ and citing our paper in your work:
@article{pan2024efficient4d,
title={Efficient4D: Fast Dynamic 3D Object Generation from a Single-view Video},
author={Pan, Zijie and Yang, Zeyu and Zhu, Xiatian and Zhang, Li},
journal={International Journal of Computer Vision},
year={2026},
}
