WIP [not-for-merge]: run aishell with latest recipe in Kaldi #3868

qindazhu · 2020-01-22T12:11:38Z

Run aishell with the latest Kaldi recipe, which is copied from tedlium/s5_r3/:

run_kaldi.sh: the main script, covering the steps before chain model training; it uses mfcc features instead of mfcc_pitch
run_tdnn_1d.sh: chain model with ivector
run_tdnn_1c.sh: chain model without ivector

Result

chain model without ivector

==> exp/chain_cleaned_1c/tdnn1c_sp/decode_test/scoring_kaldi/best_cer <==
%WER 6.65 [ 6964 / 104765, 155 ins, 247 del, 6562 sub ] exp/chain_cleaned_1c/tdnn1c_sp/decode_test/cer_12_0.5

==> exp/chain_cleaned_1c/tdnn1c_sp/decode_test/scoring_kaldi/best_wer <==
%WER 15.18 [ 9783 / 64428, 900 ins, 1398 del, 7485 sub ] exp/chain_cleaned_1c/tdnn1c_sp/decode_test/wer_12_0.5

==> exp/chain_cleaned_1c/tdnn1c_sp/decode_dev/scoring_kaldi/best_cer <==
%WER 5.71 [ 11724 / 205341, 245 ins, 346 del, 11133 sub ] exp/chain_cleaned_1c/tdnn1c_sp/decode_dev/cer_11_0.0

==> exp/chain_cleaned_1c/tdnn1c_sp/decode_dev/scoring_kaldi/best_wer <==
%WER 13.49 [ 17226 / 127698, 1606 ins, 2402 del, 13218 sub ] exp/chain_cleaned_1c/tdnn1c_sp/decode_dev/wer_11_0.0

chain model with ivector

==> exp/chain_cleaned_1d/tdnn1d_sp/decode_test/scoring_kaldi/best_cer <==
%WER 6.46 [ 6768 / 104765, 155 ins, 250 del, 6363 sub ] exp/chain_cleaned_1d/tdnn1d_sp/decode_test/cer_12_1.0

==> exp/chain_cleaned_1d/tdnn1d_sp/decode_test/scoring_kaldi/best_wer <==
%WER 14.91 [ 9604 / 64428, 1035 ins, 1241 del, 7328 sub ] exp/chain_cleaned_1d/tdnn1d_sp/decode_test/wer_13_0.0

==> exp/chain_cleaned_1d/tdnn1d_sp/decode_dev/scoring_kaldi/best_cer <==
%WER 5.51 [ 11310 / 205341, 254 ins, 359 del, 10697 sub ] exp/chain_cleaned_1d/tdnn1d_sp/decode_dev/cer_11_0.5

==> exp/chain_cleaned_1d/tdnn1d_sp/decode_dev/scoring_kaldi/best_wer <==
%WER 13.19 [ 16843 / 127698, 1533 ins, 2413 del, 12897 sub ] exp/chain_cleaned_1d/tdnn1d_sp/decode_dev/wer_12_0.0
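
As a sanity check on the numbers above, here is a minimal sketch that parses a `%WER` line of the form shown in these results and recomputes the rate from the bracketed counts. The names `parse_wer_line` and `check_wer_line` are made up for illustration and are not part of Kaldi; the regular expression only assumes the line layout visible in this thread.

```python
import re

# Matches a scoring line such as:
# %WER 6.65 [ 6964 / 104765, 155 ins, 247 del, 6562 sub ] <path>
WER_LINE = re.compile(
    r"%WER\s+(?P<rate>[\d.]+)\s+\[\s*(?P<errors>\d+)\s*/\s*(?P<total>\d+),\s*"
    r"(?P<ins>\d+)\s+ins,\s*(?P<dele>\d+)\s+del,\s*(?P<sub>\d+)\s+sub\s*\]"
)


def parse_wer_line(line):
    """Return the reported rate and the error counts from one %WER line."""
    match = WER_LINE.search(line)
    if match is None:
        raise ValueError("not a %WER line: " + line)
    fields = {k: int(v) for k, v in match.groupdict().items() if k != "rate"}
    fields["rate"] = float(match.group("rate"))
    return fields


def check_wer_line(line):
    """Return (counts_add_up, recomputed_rate) for one %WER line."""
    f = parse_wer_line(line)
    # Insertions + deletions + substitutions should equal the error total.
    counts_add_up = f["ins"] + f["dele"] + f["sub"] == f["errors"]
    # The rate is errors divided by the number of reference tokens.
    recomputed_rate = round(100.0 * f["errors"] / f["total"], 2)
    return counts_add_up, recomputed_rate


if __name__ == "__main__":
    sample = (
        "%WER 6.65 [ 6964 / 104765, 155 ins, 247 del, 6562 sub ] "
        "exp/chain_cleaned_1c/tdnn1c_sp/decode_test/cer_12_0.5"
    )
    print(check_wer_line(sample))
```

The same check can be run over every `best_cer` and `best_wer` line before copying values into a summary table.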

TODO

Try different network configs and training parameters in @csukuangfj's PyTorch training recipe for comparison.

danpovey · 2020-01-22T12:19:28Z

Can you remind me how this compares with the currently checked-in results?

qindazhu · 2020-01-22T12:37:08Z

Copied from @csukuangfj 's commit https://github.com/mobvoi/kaldi/blob/e8a28b5c96d1f2bc428ebbfa0cc20c51cbccd77b/egs/aishell/s10/RESULTS

pytorch: Results for kaldi pybind LF-MMI training with PyTorch

## head exp/chain/decode_res/*/scoring_kaldi/best_* > RESULTS
#
==> exp/chain/decode_res/dev/scoring_kaldi/best_cer <==
%WER 8.22 [ 16888 / 205341, 774 ins, 1007 del, 15107 sub ] exp/chain/decode_res/dev/cer_10_1.0

==> exp/chain/decode_res/dev/scoring_kaldi/best_wer <==
%WER 16.66 [ 21278 / 127698, 1690 ins, 3543 del, 16045 sub ] exp/chain/decode_res/dev/wer_11_0.5

==> exp/chain/decode_res/test/scoring_kaldi/best_cer <==
%WER 9.98 [ 10454 / 104765, 693 ins, 802 del, 8959 sub ] exp/chain/decode_res/test/cer_11_1.0

==> exp/chain/decode_res/test/scoring_kaldi/best_wer <==
%WER 18.89 [ 12170 / 64428, 1112 ins, 1950 del, 9108 sub ] exp/chain/decode_res/test/wer_12_0.5

tdnn_1b: Results for kaldi nnet3 LF-MMI training https://github.com/mobvoi/kaldi/blob/44ae951ea9c6f509dda24c60d29e5dddb482e3e1/egs/aishell/s10/local/run_tdnn_1b.sh#L100

#
==> exp/chain_nnet3/tdnn_1b/decode_dev/scoring_kaldi/best_cer <==
%WER 7.06 [ 14494 / 205341, 466 ins, 726 del, 13302 sub ] exp/chain_nnet3/tdnn_1b/decode_dev/cer_10_0.5

==> exp/chain_nnet3/tdnn_1b/decode_dev/scoring_kaldi/best_wer <==
%WER 15.11 [ 19296 / 127698, 1800 ins, 2778 del, 14718 sub ] exp/chain_nnet3/tdnn_1b/decode_dev/wer_11_0.0

==> exp/chain_nnet3/tdnn_1b/decode_test/scoring_kaldi/best_cer <==
%WER 8.63 [ 9041 / 104765, 367 ins, 668 del, 8006 sub ] exp/chain_nnet3/tdnn_1b/decode_test/cer_11_1.0

==> exp/chain_nnet3/tdnn_1b/decode_test/scoring_kaldi/best_wer <==
%WER 17.40 [ 11210 / 64428, 1059 ins, 1654 del, 8497 sub ] exp/chain_nnet3/tdnn_1b/decode_test/wer_11_0.5

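| | pytorch | tdnn_1b | tdnn_1c | tdnn_1d |
| --- | --- | --- | --- | --- |
| dev_cer (%) | 8.22 | 7.06 | 5.71 | 5.51 |
| dev_wer (%) | 16.66 | 15.11 | 13.49 | 13.19 |
| test_cer (%) | 9.98 | 8.63 | 6.65 | 6.46 |
| test_wer (%) | 18.89 | 17.40 | 15.18 | 14.91 |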
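
To make the gap between the systems easier to read, here is a rough sketch that turns the table into relative error reductions against a chosen baseline column. The dictionary just restates the table values; `relative_reduction` and `reductions_vs` are illustrative names, not anything from the recipes.

```python
# Error rates in percent, copied from the summary table.
RESULTS = {
    "dev_cer": {"pytorch": 8.22, "tdnn_1b": 7.06, "tdnn_1c": 5.71, "tdnn_1d": 5.51},
    "dev_wer": {"pytorch": 16.66, "tdnn_1b": 15.11, "tdnn_1c": 13.49, "tdnn_1d": 13.19},
    "test_cer": {"pytorch": 9.98, "tdnn_1b": 8.63, "tdnn_1c": 6.65, "tdnn_1d": 6.46},
    "test_wer": {"pytorch": 18.89, "tdnn_1b": 17.40, "tdnn_1c": 15.18, "tdnn_1d": 14.91},
}


def relative_reduction(baseline, candidate):
    """Relative error reduction of candidate over baseline, in percent."""
    return 100.0 * (baseline - candidate) / baseline


def reductions_vs(baseline_system, results=RESULTS):
    """For each metric, compare every other system against baseline_system."""
    table = {}
    for metric, row in results.items():
        base = row[baseline_system]
        table[metric] = {
            system: relative_reduction(base, value)
            for system, value in row.items()
            if system != baseline_system
        }
    return table


if __name__ == "__main__":
    for metric, row in reductions_vs("pytorch").items():
        cells = ", ".join(f"{name}: {value:.1f}%" for name, value in row.items())
        print(f"{metric}: {cells}")
```

Calling `reductions_vs("tdnn_1c")` instead isolates the effect of adding ivectors, since tdnn_1d is the only column in the table that uses them.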

danpovey · 2020-01-22T12:39:42Z

OK, so we have some way to go, but it's all straightforward in principle. I am trying to relax on this vacation so I can get to work hard when I come back...

csukuangfj · 2020-01-29T12:44:19Z

How long did the training part of run_tdnn_1c.sh take?

It took me 6 hours and 37 minutes to reach Iter: 39/78 Epoch: 2.03/6.0 (33.8% complete).
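
For a back-of-the-envelope estimate of the full run, the elapsed time can be divided by the completed fraction. This is only a sketch: it assumes time scales linearly with the "% complete" figure, which is not guaranteed because the number of jobs changes during training. `projected_total` is a hypothetical helper name.

```python
from datetime import timedelta


def projected_total(elapsed, fraction_complete):
    """Linearly extrapolate total run time from elapsed time and progress."""
    if not 0.0 < fraction_complete <= 1.0:
        raise ValueError("fraction_complete must be in (0, 1]")
    return timedelta(seconds=elapsed.total_seconds() / fraction_complete)


if __name__ == "__main__":
    elapsed = timedelta(hours=6, minutes=37)  # time reported at Iter 39/78
    total = projected_total(elapsed, 0.338)   # 33.8% complete
    print("projected total:", total)
    print("projected remaining:", total - elapsed)
```

With these inputs the projection comes out at roughly 19.6 hours in total.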

fanlu · 2020-01-29T14:41:34Z

It took about 4 hours.

2020-01-28 19:11:02,058 [steps/nnet3/chain/train.py:428 - train - INFO ] Copying the properties from exp/chain_cleaned_1c/tdnn1c_sp/egs to exp/chain_cleaned_1c/tdnn1c_sp
2020-01-28 19:11:02,222 [steps/nnet3/chain/train.py:442 - train - INFO ] Computing the preconditioning matrix for input features
2020-01-28 19:11:57,945 [steps/nnet3/chain/train.py:451 - train - INFO ] Preparing the initial acoustic model.
2020-01-28 19:12:14,562 [steps/nnet3/chain/train.py:485 - train - INFO ] Training will run for 6.0 epochs = 79 iterations
2020-01-28 19:12:14,562 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 0/78   Jobs: 3   Epoch: 0.00/6.0 (0.0% complete)   lr: 0.000750
2020-01-28 19:15:25,749 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 1/78   Jobs: 3   Epoch: 0.03/6.0 (0.5% complete)   lr: 0.000741
2020-01-28 23:02:12,888 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 76/78   Jobs: 12   Epoch: 5.58/6.0 (92.9% complete)   lr: 0.000353
2020-01-28 23:05:12,423 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 77/78   Jobs: 12   Epoch: 5.70/6.0 (94.9% complete)   lr: 0.000337
2020-01-28 23:08:08,939 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 78/78   Jobs: 12   Epoch: 5.82/6.0 (97.0% complete)   lr: 0.000300
2020-01-28 23:11:32,723 [steps/nnet3/chain/train.py:585 - train - INFO ] Doing final combination to produce final.mdl
2020-01-28 23:11:32,724 [steps/libs/nnet3/train/chain_objf/acoustic_model.py:571 - combine_models - INFO ] Combining {60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79} models.
2020-01-28 23:12:26,242 [steps/nnet3/chain/train.py:614 - train - INFO ] Cleaning up the experiment directory exp/chain_cleaned_1c/tdnn1c_sp
exp/chain_cleaned_1c/tdnn1c_sp: num-iters=79 nj=3..12 num-params=9.3M dim=40->3448 combine=-0.030->-0.030 (over 1) xent:train/valid[51,78]=(-0.682,-0.513/-0.693,-0.540) logprob:train/valid[51,78]=(-0.045,-0.030/-0.051,-0.039)
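
The "about 4 hours" figure can be read off the log timestamps directly. Below is a minimal sketch that measures the time from the `Iter: 0/` line to the `Doing final combination` line. It assumes the timestamp layout shown in this log (`YYYY-MM-DD HH:MM:SS,mmm` at the start of each line); `parse_stamp` and `training_duration` are illustrative names, not part of train.py.

```python
import re
from datetime import datetime

# Timestamp prefix as it appears in the log lines quoted in this thread.
STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ")


def parse_stamp(line):
    """Return the leading timestamp of a log line, or None if there is none."""
    match = STAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")


def training_duration(log_lines,
                      start_marker="Iter: 0/",
                      end_marker="Doing final combination"):
    """Time between the first iteration line and the final combination line."""
    start = end = None
    for line in log_lines:
        stamp = parse_stamp(line)
        if stamp is None:
            continue  # skip wrapped or summary lines without a timestamp
        if start is None and start_marker in line:
            start = stamp
        if end_marker in line:
            end = stamp
    if start is None or end is None:
        raise ValueError("start or end marker not found in log")
    return end - start


if __name__ == "__main__":
    sample = [
        "2020-01-28 19:12:14,562 [steps/nnet3/chain/train.py:529 - train - INFO ] "
        "Iter: 0/78   Jobs: 3   Epoch: 0.00/6.0 (0.0% complete)   lr: 0.000750",
        "2020-01-28 23:11:32,723 [steps/nnet3/chain/train.py:585 - train - INFO ] "
        "Doing final combination to produce final.mdl",
    ]
    print(training_duration(sample))
```

Because full dates are parsed, the same function also handles runs that cross midnight.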

csukuangfj · 2020-01-30T02:15:05Z

It took me more than 19 hours for the nnet3 training part, and it gives me results similar to haowen's:

==> exp/chain_cleaned_1c/tdnn1c_sp/decode_test/scoring_kaldi/best_cer <==
%WER 6.66 [ 6975 / 104765, 150 ins, 228 del, 6597 sub ] exp/chain_cleaned_1c/tdnn1c_sp/decode_test/cer_11_0.5

==> exp/chain_cleaned_1c/tdnn1c_sp/decode_test/scoring_kaldi/best_wer <==
%WER 15.14 [ 9755 / 64428, 1019 ins, 1255 del, 7481 sub ] exp/chain_cleaned_1c/tdnn1c_sp/decode_test/wer_13_0.0

==> exp/chain_cleaned_1c/tdnn1c_sp/decode_dev/scoring_kaldi/best_cer <==
%WER 5.69 [ 11691 / 205341, 253 ins, 345 del, 11093 sub ] exp/chain_cleaned_1c/tdnn1c_sp/decode_dev/cer_11_0.0

==> exp/chain_cleaned_1c/tdnn1c_sp/decode_dev/scoring_kaldi/best_wer <==
%WER 13.45 [ 17179 / 127698, 1584 ins, 2408 del, 13187 sub ] exp/chain_cleaned_1c/tdnn1c_sp/decode_dev/wer_11_0.0

@qindazhu I think you mixed dev and test in your table.

Part of the training log is as follows:

2020-01-29 14:00:34,599 [steps/nnet3/chain/train.py:428 - train - INFO ] Copying the properties from exp/chain_cleaned_1c/tdnn1c_sp/egs to exp/chain_cleaned_1c/tdnn1c_sp
2020-01-29 14:00:34,600 [steps/nnet3/chain/train.py:485 - train - INFO ] Training will run for 6.0 epochs = 79 iterations
2020-01-29 14:00:34,600 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 0/78   Jobs: 3   Epoch: 0.00/6.0 (0.0% complete)   lr: 0.000750
2020-01-29 14:07:15,371 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 1/78   Jobs: 3   Epoch: 0.03/6.0 (0.5% complete)   lr: 0.000741
2020-01-29 14:13:03,763 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 2/78   Jobs: 3   Epoch: 0.06/6.0 (1.0% complete)   lr: 0.000733
2020-01-29 14:18:51,418 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 3/78   Jobs: 3   Epoch: 0.09/6.0 (1.5% complete)   lr: 0.000724

2020-01-30 08:25:31,335 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 76/78   Jobs: 12   Epoch: 5.58/6.0 (92.9% complete)   lr: 0.000353
2020-01-30 08:49:30,420 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 77/78   Jobs: 12   Epoch: 5.70/6.0 (94.9% complete)   lr: 0.000337
2020-01-30 09:13:33,554 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 78/78   Jobs: 12   Epoch: 5.82/6.0 (97.0% complete)   lr: 0.000300
2020-01-30 09:37:31,684 [steps/nnet3/chain/train.py:585 - train - INFO ] Doing final combination to produce final.mdl

qindazhu · 2020-01-30T06:22:13Z

@csukuangfj Yes, I mixed up the Kaldi results in the table. I have updated the table, thanks!

stale · 2020-06-19T06:36:12Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

run aishell with latest recipe

3c61cf5

qindazhu mentioned this pull request Jan 22, 2020

add RESULTS for kaldi pybind LF-MMI pipeline with PyTorch. #3831

Merged

csukuangfj mentioned this pull request Jan 31, 2020

WIP: add TDNNF to pytorch. #3892

Merged

remove natural gradient

870778f

qindazhu force-pushed the haowen-kaldi-aishell branch from bacf975 to 870778f Compare February 5, 2020 02:42

fanlu mentioned this pull request Feb 13, 2020

show L2 norm of parameters during training. #3925

Merged

qindazhu changed the title ~~WIP: run aishell with latest recipe in Kaldi~~ WIP [not-for-merge]: run aishell with latest recipe in Kaldi Feb 17, 2020

stale bot added the stale Stale bot on the loose label Jun 19, 2020

kkm000 added the stale-exclude Stale bot ignore this issue label Jul 15, 2020

stale bot removed the stale Stale bot on the loose label Jul 15, 2020
