Training forced aligners on (mis)matched data: the effect of dialect and age

Tünde Szalay, Mostafa Shahin, Kirrie Ballard, Beena Ahmed

Wednesday, December 14th, 2022, 2pm – 2.30pm


Training forced phonemic aligners for novel language varieties is non-trivial because it requires aligned corpora, yet aligning novel corpora in turn requires accurate forced aligners. To align AusKidTalk, an audio corpus of children speaking Australian English (AusE), we trained three custom aligners on different datasets: age-matched children speaking American English (AmE), dialect-matched AusE-speaking adults, and the combination of the two. We evaluated the three custom aligners and the Munich Automatic Segmentation System (MAUS) against manual segmentation. The dialect-matched and combined custom aligners outperform MAUS, but the age-matched aligner does not. Our aligners’ improved forced segmentation will reduce the time needed for manual correction.
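
As an illustration of what evaluating an aligner against manual segmentation can involve, the sketch below compares automatic and manual phone boundary times. It is not the authors' evaluation procedure: the abstract does not state which metric was used, so the median absolute boundary error, the 20 ms tolerance, the function name `boundary_agreement`, and the boundary times are all assumptions chosen for illustration.

```python
from statistics import median


def boundary_agreement(manual, automatic, tolerance=0.020):
    """Compare two equal-length lists of phone boundary times (in seconds).

    Hypothetical helper: the 20 ms default tolerance is an assumption,
    not a value taken from the abstract.
    """
    if len(manual) != len(automatic):
        raise ValueError("both segmentations need the same number of boundaries")
    if not manual:
        raise ValueError("no boundaries to compare")

    # Absolute deviation of each automatic boundary from its manual counterpart.
    errors = [abs(a - m) for m, a in zip(manual, automatic)]

    # Share of automatic boundaries that fall within the tolerance.
    within = sum(e <= tolerance for e in errors) / len(errors)

    return {"median_abs_error": median(errors), "within_tolerance": within}


# Made-up boundary times for a single utterance (illustrative only).
manual = [0.10, 0.25, 0.40, 0.62]
automatic = [0.11, 0.24, 0.45, 0.62]
result = boundary_agreement(manual, automatic)
```

Under these assumptions, running the same comparison for each aligner on the same manually segmented utterances would give one way to rank them; a real evaluation would also need to handle inserted or deleted segments, which this sketch ignores.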