Representing Sounds as Neural Amplitude Fields: A Benchmark of Coordinate-MLPs and A Fourier Kolmogorov-Arnold Framework
- Configure a Python environment and install the required dependencies.
pip install -r requirements.txt
- Download the required datasets from the following websites.
- Organize the datasets according to the following file structure.

    --data
      --demo
        --gt_bach.wav
        --gt_counting.wav
        --gt_blues00000.wav  # from GTZAN dataset blues_00000.wav
      --gtzan
        --genres
          --blues
          ...
      --VCTK
        --wav48_silence_trimmed
          --p231
          ...
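Before launching the benchmark scripts, it can help to confirm that the layout is in place. The following is a minimal sketch, assuming the file structure above is rooted at a `data` directory in the repository root; the helper `missing_paths` and the `EXPECTED` list are hypothetical and not part of the repository.

```python
from pathlib import Path

# Entries named in the file structure above, relative to the data root.
# Items elided with "..." in the README are not checked.
EXPECTED = [
    "demo/gt_bach.wav",
    "demo/gt_counting.wav",
    "demo/gt_blues00000.wav",
    "gtzan/genres/blues",
    "VCTK/wav48_silence_trimmed/p231",
]


def missing_paths(root="data"):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing entries:")
        for p in missing:
            print("  " + p)
    else:
        print("Data layout looks complete.")
```

Running the file from the repository root lists any entry that still needs to be downloaded or moved.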
- Testing coordinate-MLPs on Bach, Counting, and Blues.
bash scripts/benchmark_MLPs_demo.sh
- Testing coordinate-MLPs on the CSTR VCTK dataset.
bash scripts/benchmark_MLPs_vctk.sh
- Testing coordinate-MLPs on the GTZAN dataset.
bash scripts/benchmark_MLPs_gtzan.sh
- Testing KANs on Bach, Counting, and Blues.
bash scripts/benchmark_KANs_demo.sh
- Testing KANs on the CSTR VCTK dataset.
bash scripts/benchmark_KANs_vctk.sh
- Testing KANs on the GTZAN dataset.
bash scripts/benchmark_KANs_gtzan.sh
- RFF positional encoding is sensitive to the dimension parameter $L$.
bash scripts/benchmark_FFN_L.sh
- RFF positional encoding is sensitive to the variance parameter $\sigma$.
bash scripts/benchmark_FFN_sigma.sh
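To make the roles of $L$ and $\sigma$ concrete, here is a minimal standard-library sketch of a random Fourier feature (RFF) encoding. It assumes the common formulation in which $L$ frequencies are sampled from a zero-mean Gaussian with standard deviation $\sigma$ and the coordinate is mapped to cosines and sines of the projections. The function names are hypothetical, and the repository's implementation may scale or sample the frequencies differently.

```python
import math
import random


def make_rff_matrix(L, sigma, in_dim=1, seed=0):
    """Sample an L x in_dim frequency matrix with entries from N(0, sigma^2)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, sigma) for _ in range(in_dim)] for _ in range(L)]


def rff_encode(x, B):
    """Map coordinate vector x to [cos(2*pi*Bx), sin(2*pi*Bx)], length 2L."""
    proj = [2.0 * math.pi * sum(b * xi for b, xi in zip(row, x)) for row in B]
    return [math.cos(p) for p in proj] + [math.sin(p) for p in proj]


# Assumed illustrative values, not the settings used by the scripts.
B = make_rff_matrix(L=4, sigma=10.0)
features = rff_encode([0.25], B)
print(len(features))  # 8 features: 2 * L
```

In this sketch, a larger `L` widens the encoding, while a larger `sigma` spreads the sampled frequencies and so lets the encoding represent faster oscillations in time.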
- NeRF positional encoding is sensitive to the dimension parameter $L$.
bash scripts/benchmark_NeRF_L.sh
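For reference, a NeRF-style positional encoding can be sketched as follows. This assumes the standard form with $L$ frequency octaves, $\sin(2^k \pi x)$ and $\cos(2^k \pi x)$ for $k = 0, \dots, L-1$; the function name `nerf_encode` is hypothetical and the repository's version may differ in scaling or ordering.

```python
import math


def nerf_encode(x, L):
    """Encode scalar x with L octaves: sin and cos of 2^k * pi * x."""
    out = []
    for k in range(L):
        freq = (2.0 ** k) * math.pi
        out.append(math.sin(freq * x))
        out.append(math.cos(freq * x))
    return out


features = nerf_encode(0.5, L=3)
print(len(features))  # 6 features: 2 * L
```

Each increment of `L` doubles the highest frequency in the encoding, which is one way to see why the fit can change sharply with this parameter.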
- Gaussian-type activation functions are sensitive to the variance factor $a$.
bash scripts/benchmark_gaussian.sh
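The effect of $a$ can be illustrated with a small sketch. It assumes the Gaussian activation takes the form $\exp(-(a x)^2)$, which is one common parameterization; the repository may define the factor differently (for example as a width in the denominator), so treat the function below as hypothetical.

```python
import math


def gaussian_activation(x, a):
    """Gaussian bump exp(-(a*x)^2); in this form, larger a gives a narrower bump."""
    return math.exp(-((a * x) ** 2))


# Assumed illustrative values of a, evaluated at a fixed pre-activation.
for a in (1.0, 10.0, 30.0):
    print(a, gaussian_activation(0.1, a))
```

Under this assumed form, the response at a fixed input shrinks quickly as `a` grows, so the same pre-activations can move from a nearly flat region to a nearly zero one.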
- Sine-type activation functions are sensitive to the frequency factor $\omega$.
# Sine
bash scripts/benchmark_siren.sh
# Incode-Sine
bash scripts/benchmark_incode-sine.sh
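As a rough illustration of why $\omega$ matters, the sketch below counts how often a plain sine activation $\sin(\omega x)$ changes sign over inputs in $[-1, 1]$. It covers only the plain sine case, not Incode-Sine, and the helper names are hypothetical rather than taken from the repository.

```python
import math


def sine_activation(x, omega):
    """SIREN-style activation sin(omega * x)."""
    return math.sin(omega * x)


def count_sign_changes(omega, n=1000):
    """Count sign changes of sin(omega * x) sampled at n points on [-1, 1]."""
    xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    ys = [sine_activation(x, omega) for x in xs]
    return sum(1 for y0, y1 in zip(ys, ys[1:]) if y0 * y1 < 0)


# Assumed illustrative values of omega.
for omega in (3.0, 30.0, 300.0):
    print(omega, count_sign_changes(omega))
```

A larger `omega` produces more oscillations over the same input range, which raises the frequencies a layer can express but also makes the output more sensitive to small changes in its input.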
bash scripts/benchmark_sensitive_init.sh
When model capacity is limited, larger
bash scripts/benchmark_Fourier_omega.sh