Commit 93cae2a

Updates for Docker tags and README
1 parent 1f4c5d1 commit 93cae2a

2 files changed: +52 -40 lines changed

.github/workflows/docker-publish.yaml

Lines changed: 2 additions & 1 deletion
@@ -58,7 +58,8 @@ jobs:
  # Push bleeding-edge image ("<branch name>" tag) to registries
  - name: Bleeding Edge Docker Hub (Default Option)
  run: |
- echo "IMAGE_TAG=${GITHUB_REF_NAME}" >> $GITHUB_ENV
+ TAG=$(echo ${GITHUB_REF_NAME} | sed 's/\//-/')
+ echo "IMAGE_TAG=${TAG}" >> $GITHUB_ENV

  # Push nightly image ("nightly" tag) to registries
  - name: Nightly Docker Hub
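
A note on the substitution added in this hunk: `sed 's/\//-/'` has no `g` flag, so it replaces only the first `/` on the line. A branch name with more than one slash would therefore still contain a `/`, which is not a valid character in a Docker tag. The Python sketch below mirrors both behaviours for comparison. The function names and the sample branch name are illustrative only and are not part of the commit.

```python
def tag_first_slash_only(ref_name: str) -> str:
    # Mirrors sed 's/\//-/' (no g flag): only the first '/' is replaced.
    return ref_name.replace("/", "-", 1)


def tag_all_slashes(ref_name: str) -> str:
    # Mirrors sed 's/\//-/g': every '/' is replaced.
    return ref_name.replace("/", "-")


if __name__ == "__main__":
    branch = "feature/audio/resample"  # hypothetical branch name
    print(tag_first_slash_only(branch))
    print(tag_all_slashes(branch))
```

If branch names in the repository never contain more than one slash, the committed form is sufficient; otherwise the `g` variant may be the safer choice.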

README.rst

Lines changed: 50 additions & 39 deletions
@@ -46,45 +46,56 @@ Spectrogram Extraction

  Here are the steps for extracting the compressed spectrogram:

- - Create the STFT
- - Load the original waveform at the original sample rate
- - Resample waveform to 250kHz
- - Convert to a STFT spectrogram (fft=512, method=blackmanharris, window=256, hop=16)
- - Convert complex power STFT to amplitude STFT (dB)
- - Normalize the STFT
- - Trim STFT to minimum and maximum frequencies (5kHz to 120kHz)
- - Subtract the per-freqency median dB (reduce any spectral bias / shift)
- - Set global dynamic range to -80 dB from the global maximum amplitude
- - Calculate the global median non-minimum dB (greater than -80dB)
- - Calculate the median absolute deviation (MAD)
- - Autogain the dynamic range to (5 * MAD) below the global amplitude median, if necessary
- - Quantize the STFT
- - Quantize the floating-point amplitude STFT to a 16-bit integer representation spanning the full dynamic range (65,536 bins)
- - Vertically flip the spectrogram (low frequencies on bottom) and convert to a C-contiguous array
- - Find Candidate Chirps
- - Create a 12ms sliding window with a 3ms stride
- - Keep the time windows that show a substantial right-skew across 10% of the frequency range
- - Add any user-provided time windows (annotations) to the found candidates windows
- - Merge any overlapping time windows into a set of contiguous time ranges
- - Tighten the candidate time ranges (and separate as needed) by repeating the same skew-based filter with a smaller sliding window and stride
- - Extract Chirp Metrics
- - *for each candidate chirp*
- - *Start*: First, find the peak amplitude location.
- - Step 1 - Normalize the chirp to the full 16-bit range. Calculate a histogram and identify the most common dB and standard deviation. Scale the amplitude values using an inverted PDF, weighting each value by its inverse probability of being noise (values below the most common dB are set to zero)
- - Step 2 - Apply a median filter and re-normalize
- - Step 3 - Apply a morphological open operation
- - Step 4 - Blur the chirp (k=5) and re-normalize
- - Step 5 - Find contours using the "marching squares" algorithm and select the one that contains the peak amplitude. Extract the convex hull of the contour and smooth the resulting outline
- - Step 6 - Extract a segmentation mask for the contour
- - Step 7 - Locate the harmonic (doubling the frequency) and echo (right edge of the contour to the end of the chirp time range) regions. Remove any overlapping noise from the chirp contour.
- - Step 8 - Locate the start, end, and characteristic frequency points (peak amplitude) and calculate an optimization cost grid for the contour using the masked amplitudes.
- - Step 9 - Solve a minimum distance optimization using A* that also maximizes the amplutide values from start to end points.
- - Step 10 - Smooth the contour path, extract the contour's slope, then identify the knee, heel, and other defining attributes.
- - *End*: Finally, if any of the above steps fails, or the chirp's attributes do not make semantic sense, then skip the candidate chirp.
- - Create Output
- - Collect all valid chirps regions and metadata, create a compressed spectrogram
- - Write the 16-bit spectrogram as a series of 8-bit JPEGs image chunks (max width per chunk 50k pixels)
- - Write the file and chirp metadata to a JSON file.
+ * Create the STFT
+
+ * Load the original waveform at the original sample rate
+ * Resample waveform to 250kHz
+ * Convert to a STFT spectrogram (fft=512, method=blackmanharris, window=256, hop=16)
+ * Convert complex power STFT to amplitude STFT (dB)
+
+ * Normalize the STFT
+
+ * Trim STFT to minimum and maximum frequencies (5kHz to 120kHz)
+ * Subtract the per-freqency median dB (reduce any spectral bias / shift)
+ * Set global dynamic range to -80 dB from the global maximum amplitude
+ * Calculate the global median non-minimum dB (greater than -80dB)
+ * Calculate the median absolute deviation (MAD)
+ * Autogain the dynamic range to (5 * MAD) below the global amplitude median, if necessary
+
+ * Quantize the STFT
+
+ * Quantize the floating-point amplitude STFT to a 16-bit integer representation spanning the full dynamic range (65,536 bins)
+ * Vertically flip the spectrogram (low frequencies on bottom) and convert to a C-contiguous array
+
+ * Find Candidate Chirps
+
+ * Create a 12ms sliding window with a 3ms stride
+ * Keep the time windows that show a substantial right-skew across 10% of the frequency range
+ * Add any user-provided time windows (annotations) to the found candidates windows
+ * Merge any overlapping time windows into a set of contiguous time ranges
+ * Tighten the candidate time ranges (and separate as needed) by repeating the same skew-based filter with a smaller sliding window and stride
+
+ * Extract Chirp Metrics
+
+ * *for each candidate chirp*
+ * *Start*: First, find the peak amplitude location.
+ * Step 1 - Normalize the chirp to the full 16-bit range. Calculate a histogram and identify the most common dB and standard deviation. Scale the amplitude values using an inverted PDF, weighting each value by its inverse probability of being noise (values below the most common dB are set to zero)
+ * Step 2 - Apply a median filter and re-normalize
+ * Step 3 - Apply a morphological open operation
+ * Step 4 - Blur the chirp (k=5) and re-normalize
+ * Step 5 - Find contours using the "marching squares" algorithm and select the one that contains the peak amplitude. Extract the convex hull of the contour and smooth the resulting outline
+ * Step 6 - Extract a segmentation mask for the contour
+ * Step 7 - Locate the harmonic (doubling the frequency) and echo (right edge of the contour to the end of the chirp time range) regions. Remove any overlapping noise from the chirp contour.
+ * Step 8 - Locate the start, end, and characteristic frequency points (peak amplitude) and calculate an optimization cost grid for the contour using the masked amplitudes.
+ * Step 9 - Solve a minimum distance optimization using A* that also maximizes the amplutide values from start to end points.
+ * Step 10 - Smooth the contour path, extract the contour's slope, then identify the knee, heel, and other defining attributes.
+ * *End*: Finally, if any of the above steps fails, or the chirp's attributes do not make semantic sense, then skip the candidate chirp.
+
+ * Create Output
+
+ * Collect all valid chirps regions and metadata, create a compressed spectrogram
+ * Write the 16-bit spectrogram as a series of 8-bit JPEGs image chunks (max width per chunk 50k pixels)
+ * Write the file and chirp metadata to a JSON file.

  How to Install
  --------------
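
The "Normalize the STFT" and "Quantize the STFT" steps listed in the README hunk can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the project's implementation: the function names `normalize_db` and `quantize_16bit` are hypothetical, the spectrogram is assumed to be a list of frequency rows (row 0 being the lowest frequency) holding dB values, and "autogain ... if necessary" is read as "raise the floor only when the MAD-based floor is higher than the -80 dB floor". Plain lists keep the sketch dependency-free; an array library would be the more likely choice in practice, and the C-contiguous conversion has no equivalent here.

```python
from statistics import median


def normalize_db(rows, dynamic_range_db=80.0, mad_scale=5.0):
    """Normalize a dB spectrogram given as a list of frequency rows.

    Returns the clamped rows and the dB floor that was applied.
    """
    # Subtract each frequency row's own median to reduce spectral bias.
    centered = []
    for row in rows:
        row_median = median(row)
        centered.append([value - row_median for value in row])

    # Limit the dynamic range to 80 dB below the global maximum.
    peak = max(max(row) for row in centered)
    floor = peak - dynamic_range_db

    # Median and median absolute deviation (MAD) of the values above the floor.
    above = [v for row in centered for v in row if v > floor]
    if above:
        global_median = median(above)
        mad = median(abs(v - global_median) for v in above)
        # Assumption: "if necessary" means the floor is only ever raised.
        floor = max(floor, global_median - mad_scale * mad)

    clamped = [[max(v, floor) for v in row] for row in centered]
    return clamped, floor


def quantize_16bit(rows, floor):
    """Map dB values in [floor, peak] onto integers 0..65535, then flip rows."""
    peak = max(max(row) for row in rows)
    span = peak - floor
    if span <= 0:
        quantized = [[0] * len(row) for row in rows]
    else:
        quantized = [
            [round((v - floor) / span * 65535) for v in row] for row in rows
        ]
    # Vertical flip: with row 0 as the lowest frequency, it ends up last (bottom).
    return quantized[::-1]
```

A caller would chain the two, for example `clamped, floor = normalize_db(rows)` followed by `quantize_16bit(clamped, floor)`.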

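The "Find Candidate Chirps" steps in the README hunk (a 12ms sliding window with a 3ms stride, followed by merging overlapping windows into contiguous ranges) can be illustrated as follows. This is a sketch only: the function names are hypothetical, times are assumed to be integer milliseconds, windows that merely touch are treated as overlapping, and the skew-based filter is left to a caller-supplied predicate because the README does not specify it further.

```python
def sliding_windows(duration_ms, width_ms=12, stride_ms=3):
    """Return (start, end) windows that fit entirely inside the recording."""
    last_start = duration_ms - width_ms
    return [(s, s + width_ms) for s in range(0, last_start + 1, stride_ms)]


def merge_windows(windows):
    """Merge overlapping (or touching) windows into contiguous time ranges."""
    merged = []
    for start, end in sorted(windows):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous range: extend it if this window ends later.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(w) for w in merged]


def candidate_ranges(duration_ms, keep, annotations=()):
    """Filter sliding windows with `keep`, add annotations, then merge."""
    kept = [w for w in sliding_windows(duration_ms) if keep(w)]
    return merge_windows(kept + list(annotations))
```

For instance, `merge_windows([(0, 12), (3, 15), (30, 42)])` collapses the first two windows into one range and leaves the third on its own.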