Hello,
First of all, thank you for your great work.
I am currently reviewing the results of your model. Since the outputs are saved as .npy files and images, I am having difficulty reproducing the visualizations shown in the paper and on the project page. Could you provide guidance on the following?
- How to create the loudness map visualization (Figure 5 in the paper).
- How to generate the demo videos shown on the project page.
Any scripts or documentation you could share would be extremely helpful.
Thanks!