Commit 9e2a8a7

Merge pull request #2 from libAudioFlux/dev1
update python and README
2 parents 1092866 + a576530 commit 9e2a8a7


72 files changed, 5346 insertions(+), 402 deletions(-)

README.md

Lines changed: 167 additions & 11 deletions
@@ -34,7 +34,29 @@
3434

3535
A library for audio and music analysis, feature extraction.
3636

37-
Can be used for deep learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc
37+
38+
# Table of Contents
39+
40+
- [Overview](#overview)
41+
- [Description](#description)
42+
- [Functionality](#functionality)
43+
- [transform](#1-transform)
44+
- [feature](#2-feature)
45+
- [mir](#3-mir)
46+
- [Quickstart](#quickstart)
47+
- [Mel & MFCC](#mel--mfcc)
48+
- [CWT & Synchrosqueezing](#cwt--synchrosqueezing)
49+
- [Other examples](#other-examples)
50+
- [Installation](#installation)
51+
- [Python Package Intsall](#python-package-intsall)
52+
- [iOS build](#ios-build)
53+
- [Android build](#android-build)
54+
- [Compiling from source](#compiling-from-source)
55+
- [Documentation](#documentation)
56+
- [Contributing](#contributing)
57+
- [Citing](#citing)
58+
- [License](#license)
59+
3860

3961
## Overview
4062

@@ -46,6 +68,8 @@ In the above tasks, **mel spectrogram** and **mfcc** features are commonly used
4668

4769
**`audioFlux`** provides systematic, comprehensive and multi-dimensional feature extraction and combination, and can be combined with various deep learning network models for research and development in different fields.
4870

71+
It can be used for deep learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc.
72+
4973
### Functionality
5074

5175
**`audioFlux`** is based on a data-flow design. It decouples each algorithm module structurally, which makes extracting features from large batches convenient, fast and efficient. The following are the main feature architecture diagrams; see the documentation for a detailed description.
@@ -109,10 +133,130 @@ The mir module contains the following algorithms:
109133
- `onset` - Spectrum flux, novelty, etc. algorithms.
110134
- `hpss` - Median filtering, NMF algorithm.
111135

136+
137+
## Quickstart
138+
139+
### Mel & MFCC
140+
141+
Mel spectrogram and Mel-frequency cepstral coefficients
142+
143+
```python
144+
# Feature extraction example
145+
import numpy as np
146+
import audioflux as af
147+
import matplotlib.pyplot as plt
148+
from audioflux.display import fill_spec
149+
from audioflux.type import SpectralFilterBankScaleType
150+
151+
# Get the path of the 220 Hz sample audio file
152+
sample_path = af.utils.sample_path('220')
153+
154+
# Read audio data and sample rate
155+
audio_arr, sr = af.read(sample_path)
156+
157+
# Extract mel spectrogram
158+
bft_obj = af.BFT(num=128, radix2_exp=12, samplate=sr,
159+
scale_type=SpectralFilterBankScaleType.MEL)
160+
spec_arr = bft_obj.bft(audio_arr)
161+
spec_arr = np.abs(spec_arr)
162+
163+
# Create XXCC object and extract mfcc
164+
xxcc_obj = af.XXCC(bft_obj.num)
165+
xxcc_obj.set_time_length(time_length=spec_arr.shape[1])
166+
mfcc_arr = xxcc_obj.xxcc(spec_arr)
167+
168+
audio_len = audio_arr.shape[0]
169+
fig, ax = plt.subplots()
170+
img = fill_spec(spec_arr, axes=ax,
171+
x_coords=bft_obj.x_coords(audio_len),
172+
y_coords=bft_obj.y_coords(),
173+
x_axis='time', y_axis='log',
174+
title='Mel Spectrogram')
175+
fig.colorbar(img, ax=ax)
176+
177+
fig, ax = plt.subplots()
178+
img = fill_spec(mfcc_arr, axes=ax,
179+
x_coords=bft_obj.x_coords(audio_len), x_axis='time',
180+
title='MFCC')
181+
fig.colorbar(img, ax=ax)
182+
183+
plt.show()
184+
```
185+
186+
<img src='image/demo_mel.png' width="415" /><img src='image/demo_mfcc.png' width="415" />
187+
188+
### CWT & Synchrosqueezing
189+
190+
Continuous Wavelet Transform spectrogram and its corresponding synchrosqueezing reassignment spectrogram
191+
192+
```python
193+
# Feature extraction example
194+
import numpy as np
195+
import audioflux as af
196+
import matplotlib.pyplot as plt
197+
from audioflux.display import fill_spec
198+
from audioflux.type import SpectralFilterBankScaleType, WaveletContinueType
199+
from audioflux.utils import note_to_hz
200+
201+
# Get the path of the 220 Hz sample audio file
202+
sample_path = af.utils.sample_path('220')
203+
204+
# Read audio data and sample rate
205+
audio_arr, sr = af.read(sample_path)
206+
audio_arr = audio_arr[:4096]
207+
208+
cwt_obj = af.CWT(num=84, radix2_exp=12, samplate=sr, low_fre=note_to_hz('C1'),
209+
bin_per_octave=12, wavelet_type=WaveletContinueType.MORSE,
210+
scale_type=SpectralFilterBankScaleType.OCTAVE)
211+
212+
cwt_spec_arr = cwt_obj.cwt(audio_arr)
213+
214+
synsq_obj = af.Synsq(num=cwt_obj.num,
215+
radix2_exp=cwt_obj.radix2_exp,
216+
samplate=cwt_obj.samplate)
217+
218+
synsq_arr = synsq_obj.synsq(cwt_spec_arr,
219+
filter_bank_type=cwt_obj.scale_type,
220+
fre_arr=cwt_obj.get_fre_band_arr())
221+
222+
# Show CWT
223+
fig, ax = plt.subplots(figsize=(7,4))
224+
img = fill_spec(np.abs(cwt_spec_arr), axes=ax,
225+
x_coords=cwt_obj.x_coords(),
226+
y_coords=cwt_obj.y_coords(),
227+
x_axis='time', y_axis='log',
228+
title='CWT')
229+
fig.colorbar(img, ax=ax)
230+
# Show Synsq
231+
fig, ax = plt.subplots(figsize=(7,4))
232+
img = fill_spec(np.abs(synsq_arr), axes=ax,
233+
x_coords=cwt_obj.x_coords(),
234+
y_coords=cwt_obj.y_coords(),
235+
x_axis='time', y_axis='log',
236+
title='Synsq')
237+
fig.colorbar(img, ax=ax)
238+
239+
plt.show()
240+
```
241+
242+
<img src='image/demo_cwt.png' width="415" /><img src='image/demo_synsq.png' width="415" />
243+
244+
245+
### Other examples
246+
247+
- [CQT & Chroma](docs/examples.md#cqt--chroma)
248+
- [Different Wavelet Type](docs/examples.md#different-wavelet-type)
249+
- [Spectral Features](docs/examples.md#spectral-features)
250+
- [Pitch Estimate](docs/examples.md#pitch-estimate)
251+
- [Onset Detection](docs/examples.md#onset-detection)
252+
- [Harmonic Percussive Source Separation](docs/examples.md#harmonic-percussive-source-separation)
253+
254+
More example scripts are provided in the [Documentation](https://audioflux.top/) section.
255+
112256
## Installation
113257
![language](https://img.shields.io/badge/platform-iOS%20|%20android%20|%20macOS%20|%20linux%20|%20windows%20-lyellow.svg)
114258

115-
The library is cross-platform and currently supports Linux, macOS, Windows, iOS and Android systems.
259+
The library is cross-platform and currently supports Linux, macOS, Windows, iOS and Android systems.
116260

117261
### Python Package Intsall
118262

@@ -122,23 +266,19 @@ Using PyPI:
122266
$ pip install audioflux
123267
```
124268

125-
Using Anaconda:
269+
<!--Using Anaconda:
126270
127271
```
128272
$ conda install -c conda-forge audioflux
129-
```
273+
```-->
130274

131-
Building from source:
132-
133-
```
134-
$ python setup.py build
135-
$ python setup.py install
136-
```
137275

138276
<!--Read installation instructions:
139277
https://audioflux.top/install-->
140278

279+
141280
### iOS build
281+
142282
To compile for iOS on a Mac, Xcode Command Line Tools must be installed on the system:
143283

144284
- Install the full Xcode package
@@ -156,6 +296,7 @@ $ ./build_iOS.sh
156296
After a successful build, the compilation results are in the **`build`** folder.
157297

158298
### Android build
299+
159300
The Android build requires the [**android NDK**](https://developer.android.com/ndk) (NDK version >= 16). After installation, set the NDK path environment variable.
160301

161302
For example, if the NDK installation path is `~/Android/android-ndk-r16b`:
@@ -175,6 +316,13 @@ $ ./build_android.sh
175316

176317
After a successful build, the compilation results are in the **`build`** folder.
177318

319+
320+
### Compiling from source
321+
322+
For Linux, macOS and Windows systems, read the installation instructions:
323+
324+
* [docs/installing.md](docs/installing.md)
325+
178326
## Documentation
179327

180328
Documentation of the package can be found online:
@@ -186,7 +334,15 @@ We are more than happy to collaborate and receive your contributions to **`audio
186334

187335
You are also more than welcome to suggest improvements. If you need help, find a bug, have a feature request, want to propose a new algorithm, or want to ask a general question, <a href="https://github.com/libAudioFlux/audioFlux/issues/new">open an issue</a>.
188336

189-
<!-- ## Citing -->
337+
338+
## Citing
339+
340+
If you want to cite **`audioFlux`** in a scholarly work, there are two ways to do it.
341+
342+
- If you are using the library for your work, for the sake of reproducibility, please cite
343+
the version you used as indexed at Zenodo:
344+
345+
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.7548289.svg)](https://doi.org/10.5281/zenodo.7548289)
190346

191347
## License
192348
The audioFlux project is available under the MIT License.

audioflux/__version__.py

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
11
__title__ = 'audioflux'
22
__description__ = 'A library for audio and music analysis, feature extraction.'
3-
__version__ = '0.0.1'
3+
__version__ = '0.1.1'

audioflux/bft.py

Lines changed: 6 additions & 15 deletions
@@ -104,8 +104,6 @@ class BFT(Base):
104104
>>> import audioflux as af
105105
>>> audio_path = af.utils.sample_path('220')
106106
>>> audio_arr, sr = af.read(audio_path)
107-
array([-5.5879354e-09, -9.3132257e-09, 0.0000000e+00, ...,
108-
3.2826858e-03, 3.2447521e-03, 3.0795704e-03], dtype=float32)
109107
110108
Create BFT object of Linear (STFT)
111109
@@ -124,19 +122,6 @@ class BFT(Base):
124122
>>> spec_arr = obj.bft(audio_arr)
125123
>>> spec_arr = np.abs(spec_arr)
126124
>>> spec_dB_arr = power_to_db(spec_arr)
127-
array([[-41.382824, -37.95072 , -50.98091 , ..., -48.275932, -66.01512 ,
128-
-53.229565],
129-
[-29.873356, -33.225224, -32.94691 , ..., -49.855965, -49.439796,
130-
-53.827766],
131-
[-27.326801, -36.17459 , -32.978054, ..., -56.360283, -51.485504,
132-
-51.036415],
133-
...,
134-
[-80. , -80. , -80. , ..., -80. , -80. ,
135-
-80. ],
136-
[-80. , -80. , -80. , ..., -80. , -80. ,
137-
-80. ],
138-
[-80. , -80. , -80. , ..., -80. , -80. ,
139-
-80. ]], dtype=float32)
140125
141126
Show spectrogram plot
142127
@@ -164,6 +149,12 @@ def __init__(self, num, radix2_exp=12, samplate=32000,
164149
is_reassign=False, is_temporal=False):
165150
super(BFT, self).__init__(pointer(OpaqueBFT()))
166151

152+
self.fft_length = fft_length = 1 << radix2_exp
153+
154+
# check num
155+
if num > (fft_length // 2 + 1):
156+
raise ValueError(f'num={num} is too large')
157+
167158
# check BPO
168159
if scale_type == SpectralFilterBankScaleType.OCTAVE and bin_per_octave < 1:
169160
raise ValueError(f'bin_per_octave={bin_per_octave} must be a positive integer')
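
The `num` check added in this hunk caps the number of bands at `fft_length // 2 + 1`, which is the count of non-redundant bins in a real-input FFT of length `fft_length = 1 << radix2_exp`. The arithmetic can be sketched on its own as below. This is a minimal illustration, not audioflux code: the helper names `max_num_bands` and `check_num` are hypothetical, and the error message is extended with the limit for readability.

```python
def max_num_bands(radix2_exp: int) -> int:
    """Largest `num` the check accepts for a given FFT size exponent."""
    fft_length = 1 << radix2_exp  # e.g. radix2_exp=12 gives 4096
    return fft_length // 2 + 1    # non-redundant bins of a real-input FFT


def check_num(num: int, radix2_exp: int = 12) -> int:
    """Hypothetical stand-in for the validation in BFT.__init__."""
    limit = max_num_bands(radix2_exp)
    if num > limit:
        raise ValueError(f'num={num} is too large (limit is {limit})')
    return num


print(max_num_bands(12))  # 2049
print(check_num(128))     # 128, well within the limit
```

With the default `radix2_exp=12`, the `num=128` used in the README quickstart passes, while any `num` above 2049 would be rejected.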
