[not_for_merge_yet] fixing github workflows with automatic tests #2061
KarelVesely84 wants to merge 6 commits into k2-fsa:master from …
Conversation
- test `egs/librispeech/ASR/zipformer/test_rknn_on_cpu_simulator_ctc_streaming.py` is crashing because of this missing dependency

Hi Fangyun @csukuangfj, I found out that the corresponding workflow is .github/workflows/test.yml, and it uses a Docker image. But I am not sure which Dockerfile is used (perhaps this one: .github/scripts/docker/Dockerfile?). I can add a …

Best regards
📝 Walkthrough

Added two pip packages to CI/Docker installs; adjusted import/path references and rewrote one prepare_transcripts file (no logic change); fixed encoder call and averaged-model loading; corrected duration check and pruning logic; removed one test file and minor whitespace edits across utilities.

Sequence Diagram(s): skipped; the changes do not introduce a new multi-component control flow that requires a sequence diagram.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks: ❌ 2 failed (warnings) | ✅ 1 passed

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@egs/iwslt22_ta/ST/local/prepare_transcripts.py`:
- Around line 1-3: Move the shebang line (#!/usr/bin/python) to the very top of
the file with no preceding bytes or comments so the shell can recognize it;
place the copyright/comment lines (e.g., "Copyright 2023 Johns Hopkins
University (Amir Hussein)") immediately after the shebang and ensure there are
no blank lines or other characters before the shebang.
- Around line 51-63: The code builds langdirs from args.src_langdir and
optionally args.tgt_langdir but then always accesses langdirs[1], causing an
IndexError when args.tgt_langdir is None; update the block that opens transcript
files to handle the single-langdir case: if len(langdirs) == 1, open only
langdirs[0] / "transcript_words.txt" and write only src_txt for each cut (using
c.supervisions[0].text), otherwise open both langdirs[0] and langdirs[1] and
write src_txt and tgt_txt (c.supervisions[0].custom['translated_text']['eng'])
respectively. Ensure you reference the same variables (langdirs, cuts, src_txt,
tgt_txt, args.tgt_langdir) so the change integrates cleanly.
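The fix described above can be sketched in plain Python. This is a hypothetical stand-in, not the repo's code: `write_transcripts` is a made-up helper, and each cut is a plain dict with `"text"` and an optional `{"translated_text": {"eng": ...}}` instead of a real lhotse cut; only the single-vs-dual `langdirs` handling mirrors the suggested change.

```python
from pathlib import Path


def write_transcripts(cuts, langdirs):
    """Write transcript_words.txt per language dir; tolerate a single langdir.

    cuts: iterable of dicts standing in for lhotse cuts, each with "text"
    and optionally {"translated_text": {"eng": ...}}.
    langdirs: one (src only) or two (src, tgt) directory paths.
    """
    src_path = Path(langdirs[0]) / "transcript_words.txt"
    # Only open a target file when a second langdir was actually given.
    tgt_path = Path(langdirs[1]) / "transcript_words.txt" if len(langdirs) > 1 else None
    with open(src_path, "w") as src:
        tgt = open(tgt_path, "w") if tgt_path is not None else None
        try:
            for c in cuts:
                src.write(c["text"] + "\n")
                if tgt is not None:
                    tgt.write(c["translated_text"]["eng"] + "\n")
        finally:
            if tgt is not None:
                tgt.close()
```

With one langdir the function never indexes `langdirs[1]`, which is exactly the `IndexError` the review flags.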
🧹 Nitpick comments (3)

egs/iwslt22_ta/ST/local/prepare_transcripts.py (2)

11-11: Remove unused `pdb` import. The `pdb` module is imported but not used. This appears to be a leftover debug artifact.

Proposed fix:

```diff
-import pdb
```

48-48: Use `is not None` for None comparison. Per PEP 8, comparisons to `None` should use identity operators (`is`/`is not`) rather than equality operators.

Proposed fix:

```diff
-    if args.tgt_langdir != None:
+    if args.tgt_langdir is not None:
```

.github/workflows/test.yml (1)

105-105: Redundant install once the Docker image is updated. Since `kaldi_native_fbank` is now added to the Dockerfile (line 47), this `pip install` will be redundant once the Docker image is rebuilt and published. Consider removing this line after confirming the updated image is available, to keep the workflow clean and avoid potential version drift. If this is intentional as a temporary fix until the image is updated, that's acceptable; just ensure it's removed afterward.
```python
# Copyright 2023 Johns Hopkins University (Amir Hussein)


#!/usr/bin/python
```
Shebang should be at the beginning of the file.
The shebang (#!/usr/bin/python) on line 3 should come before the copyright comment on line 1 to be recognized by the shell.
Proposed fix:

```diff
+#!/usr/bin/python
 # Copyright 2023 Johns Hopkins University (Amir Hussein)
-
-#!/usr/bin/python
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
#!/usr/bin/python
# Copyright 2023 Johns Hopkins University (Amir Hussein)
```
🧰 Tools
🪛 Ruff (0.14.13)
3-3: Shebang should be at the beginning of the file (EXE005)
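As a quick illustration of the fix Ruff is asking for, here is a minimal, hypothetical helper (not part of the repo) that hoists a misplaced shebang line to the top of a file's lines:

```python
def hoist_shebang(lines):
    """Move the first shebang line to index 0 (sketch of the EXE005 fix).

    Returns the lines unchanged when no shebang is present.
    """
    for i, line in enumerate(lines):
        if line.startswith("#!"):
            # Put the shebang first, keep every other line in order.
            return [line] + lines[:i] + lines[i + 1:]
    return lines
```

The shell only honours a shebang when `#!` are literally the first two bytes of the file, which is why the interpreter line must precede the copyright comment.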
```python
else:
    langdirs = [Path(args.src_langdir)]

for langdir in langdirs:
    if not os.path.exists(langdir):
        os.makedirs(langdir)

with open(langdirs[0] / "transcript_words.txt", 'w') as src, open(langdirs[1] / "transcript_words.txt", 'w') as tgt:
    for c in cuts:
        src_txt = c.supervisions[0].text
        tgt_txt = c.supervisions[0].custom['translated_text']['eng']
        src.write(src_txt + '\n')
        tgt.write(tgt_txt + '\n')
```
IndexError when tgt_langdir is None.
When args.tgt_langdir is None, langdirs contains only one element (line 52), but line 58 unconditionally accesses langdirs[1], causing an IndexError.
Proposed fix (handle the single-langdir case):

```diff
-    with open(langdirs[0] / "transcript_words.txt", 'w') as src, open(langdirs[1] / "transcript_words.txt", 'w') as tgt:
-        for c in cuts:
-            src_txt = c.supervisions[0].text
-            tgt_txt = c.supervisions[0].custom['translated_text']['eng']
-            src.write(src_txt + '\n')
-            tgt.write(tgt_txt + '\n')
+    if args.tgt_langdir is not None:
+        with open(langdirs[0] / "transcript_words.txt", 'w') as src, \
+                open(langdirs[1] / "transcript_words.txt", 'w') as tgt:
+            for c in cuts:
+                src_txt = c.supervisions[0].text
+                tgt_txt = c.supervisions[0].custom['translated_text']['eng']
+                src.write(src_txt + '\n')
+                tgt.write(tgt_txt + '\n')
+    else:
+        with open(langdirs[0] / "transcript_words.txt", 'w') as src:
+            for c in cuts:
+                src_txt = c.supervisions[0].text
+                src.write(src_txt + '\n')
```
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
egs/multi_conv_zh_es_ta/ST/zipformer_multijoiner_st/train.py (3)

1118-1205: `--st-scale` currently has no effect on loss computation. `alpha_st`/`alpha_asr` are computed but never used, so the CLI option doesn't influence training. Either remove the option or apply it to the loss terms.

🛠️ Proposed fix (apply scales to ASR/ST loss terms):

```diff
 if params.st_scale != 1:
     alpha_st = params.st_scale
-    alpha_asr = 1-params.st_scale
+    alpha_asr = 1 - params.st_scale
 else:
-    alpha_st, alpha_asr = 1, 1
+    alpha_st, alpha_asr = 1.0, 1.0
@@
-    loss += simple_loss_scale * simple_loss + pruned_loss_scale * pruned_loss
+    loss += alpha_asr * (
+        simple_loss_scale * simple_loss + pruned_loss_scale * pruned_loss
+    )
     if params.use_st_joiner:
-        loss += simple_loss_scale * st_simple_loss + pruned_loss_scale * st_pruned_loss
+        loss += alpha_st * (
+            simple_loss_scale * st_simple_loss + pruned_loss_scale * st_pruned_loss
+        )
@@
-    loss += params.ctc_loss_scale * ctc_loss
+    loss += alpha_asr * params.ctc_loss_scale * ctc_loss
     if use_asr_cr_ctc:
-        loss += params.cr_loss_scale * cr_loss
+        loss += alpha_asr * params.cr_loss_scale * cr_loss
     if params.use_st_ctc:
-        loss += params.st_ctc_loss_scale * st_ctc_loss
+        loss += alpha_st * params.st_ctc_loss_scale * st_ctc_loss
     if use_st_cr_ctc:
-        loss += params.st_cr_loss_scale * st_cr_loss
+        loss += alpha_st * params.st_cr_loss_scale * st_cr_loss
```
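Reduced to plain numbers, the intended `--st-scale` weighting from the diff above could look like the following sketch. The function names and the float inputs are illustrative only; in the real training loop the losses are tensors:

```python
def loss_scales(st_scale):
    """Return (alpha_st, alpha_asr) as computed in the proposed fix."""
    if st_scale != 1:
        # ST and ASR split the weight: alpha_st + alpha_asr == 1.
        return st_scale, 1.0 - st_scale
    return 1.0, 1.0


def combine(st_scale, asr_loss, st_loss):
    """Weighted sum of the ASR and ST loss terms."""
    alpha_st, alpha_asr = loss_scales(st_scale)
    return alpha_asr * asr_loss + alpha_st * st_loss
```

With `st_scale=0.25`, the ASR terms get weight 0.75 and the ST terms 0.25, so the CLI option actually changes the training objective.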
1602-1612: Duration filter comment doesn't match thresholds. The comment says 1–20 seconds, but the code uses 0.1–30. Please align the comment or the thresholds.

✍️ Comment-only fix (if the thresholds are intended):

```diff
-    # Keep only utterances with duration between 1 second and 20 seconds
+    # Keep only utterances with duration between 0.1 second and 30 seconds
```
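The corrected comment corresponds to a predicate along these lines. This is a sketch with made-up names; whether the bounds are inclusive depends on the actual filter code:

```python
def keep_utterance(duration, min_s=0.1, max_s=30.0):
    """Keep only utterances with duration between 0.1 s and 30 s."""
    return min_s <= duration <= max_s
```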
1637-1649: ST pruning boundary is stricter than the ASR branch. ST uses `T <= len(st_tokens)` while ASR uses `T < len(tokens)`. If the requirement is `T >= S`, this should likely match ASR.

✅ Suggested adjustment:

```diff
-    if T <= len(st_tokens):
+    if T < len(st_tokens):
         return False
```
- if T <= len(st_tokens): + if T < len(st_tokens): return Falseegs/multi_conv_zh_es_ta/ST/hent_srt/export.py (1)
594-601: Log incompatible keys when usingstrict=False.When
strict=False, mismatched/unexpected weights are silently ignored. Capture and log these incompatibilities to prevent export problems from going unnoticed.Suggested fix
- model.load_state_dict( + incompatible = model.load_state_dict( average_checkpoints_with_averaged_model( filename_start=filename_start, filename_end=filename_end, device=device, - ), strict=False + ), + strict=False, ) + if incompatible.missing_keys or incompatible.unexpected_keys: + logging.warning( + "load_state_dict strict=False: missing=%s unexpected=%s", + incompatible.missing_keys, + incompatible.unexpected_keys, + )
kaldi_native_fbank package to Docker for test.yml
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)

egs/iwslt22_ta/ST/zipformer/train.py (3)

1074-1081: Undefined `sp` causes `NameError` and signature mismatch. The pipeline failure confirms `sp` is undefined. Additionally, `compute_validation_loss` (lines 858-864) doesn't have an `sp` parameter, only `sp_tgt`. This will crash at runtime.

🐛 Proposed fix:

```diff
     valid_info = compute_validation_loss(
         params=params,
         model=model,
-        sp=sp,
         sp_tgt=sp_tgt,
         valid_dl=valid_dl,
         world_size=world_size,
     )
```

1131-1136: Undefined `sp` used for `blank_id` and `vocab_size` initialization. `sp` is never defined in this function; only `sp_tgt` is loaded. This will cause a `NameError` at runtime.

🐛 Proposed fix:

```diff
     sp_tgt = spm.SentencePieceProcessor()
     sp_tgt.load(params.bpe_tgt_model)

     # <blk> is defined in local/train_bpe_model.py
-    params.blank_id = sp.piece_to_id("<blk>")
-    params.vocab_size = sp.get_piece_size()
+    params.blank_id = sp_tgt.piece_to_id("<blk>")
+    params.vocab_size = sp_tgt.get_piece_size()
```

1314-1329: Docstring still references removed `sp` parameter. The docstring at lines 1328-1329 still documents the `sp` parameter, which has been commented out from the signature.

📝 Proposed fix:

```diff
     """Display the batch statistics and save the batch into disk.

     Args:
       batch:
         A batch of data. See `lhotse.dataset.K2SpeechRecognitionDataset()`
         for the content in it.
       params:
         Parameters for training. See :func:`get_params`.
-      sp:
-        The BPE model.
+      sp_tgt:
+        The target BPE model.
     """
```
🤖 Fix all issues with AI agents
In `@egs/iwslt22_ta/ST/zipformer/beam_search.py`:
- Line 1212: Remove the stray debug call `breakpoint()` inside the beam search
implementation (the one found in `modified_beam_search_hat`), delete that line
so decoding won't drop into a debugger in production, and scan the file for any
other leftover `breakpoint()` or debugging statements to remove; then run the
unit/decoding tests or lint to confirm no runtime interruptions remain.
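The file-wide scan the prompt asks for can be done with the standard-library `ast` module; this mirrors what Ruff's T100 rule checks. The helper name is made up, and it only catches bare `breakpoint()` calls, not `pdb.set_trace()`:

```python
import ast


def find_breakpoints(source):
    """Return line numbers of bare breakpoint() calls in Python source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "breakpoint"
        ):
            hits.append(node.lineno)
    return hits
```

Running it over `beam_search.py` before merging would flag the stray call at line 1212 and any others that were missed.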
🧹 Nitpick comments (2)

egs/iwslt22_ta/ST/zipformer/train.py (2)

795-798: Commented-out code should be removed. The commented lines `#y = sp.encode(...)` and `#y = k2.RaggedTensor(y)...` are dead code. If no longer needed, remove them entirely rather than leaving commented code.

♻️ Proposed cleanup:

```diff
     texts = batch["supervisions"]["text"]
     tgt_texts = batch["supervisions"]["tgt_text"]['eng']
-    #y = sp.encode(texts, out_type=int)
     y_tgt = sp_tgt.encode(tgt_texts, out_type=int)
-    #y = k2.RaggedTensor(y).to(device)
     y_tgt = k2.RaggedTensor(y_tgt).to(device)
```

1317-1318: Remove commented-out code in `display_and_save_batch`. Similar to the above, these commented lines should be removed if no longer needed.

♻️ Proposed cleanup:

```diff
 def display_and_save_batch(
     batch: dict,
     params: AttributeDict,
-    #sp: spm.SentencePieceProcessor,
     sp_tgt: spm.SentencePieceProcessor,
 ) -> None:
```

```diff
-    #y = sp.encode(supervisions["text"], out_type=int)
     y_tgt = sp_tgt.encode(supervisions["tgt_text"], out_type=int)
```

Also applies to: 1342-1342.
```python
    # Additionally, to ensure the the probs of blank and non-blank sum to 1, we
    # need to add the following term to the log-probs of non-blank symbols. This
    # is equivalent to log(1 - sigmoid(logits[..., 0])).
    breakpoint()
```

Remove debug `breakpoint()` before merging. This `breakpoint()` will halt execution and drop into the debugger in production, breaking any decoding that uses `modified_beam_search_hat`.

🐛 Proposed fix:

```diff
-    breakpoint()
```
🧰 Tools
🪛 Ruff (0.14.13)
1212-1212: Trace found: `breakpoint` used (T100)
kaldi_native_fbank package to Docker for test.yml
Hi Karel, thanks for your contribution!

Yes, you are right. The exact image is given in icefall/.github/workflows/test.yml, line 62 (at 0904e49), where … icefall/.github/workflows/test.yml, line 33 (at 0904e49), which uses the code in icefall/.github/scripts/docker/generate_build_matrix.py, lines 68 to 83 (at 0904e49). See also …

Yes, you are right. It is …

It is updated by … I usually trigger it manually due to …

Perhaps you don't have permission to trigger it in the k2-fsa/icefall repo, but you should have permission to do that in your own fork. If you want to test that, make sure you remove the following line …

Yes, I agree with you. I will try to fix all CI failures in the following days.


`egs/librispeech/ASR/zipformer/test_rknn_on_cpu_simulator_ctc_streaming.py` is crashing for missing this dependency

Summary by CodeRabbit
- Chores
- Bug Fixes / Reliability
- API Changes
- Portability
- Tests

✏️ Tip: You can customize this high-level summary in your review settings.