Abstract: The aim of this research is to develop a method that effectively handles speaker diarization and target-speaker voice activity detection under varying levels of background noise and in the presence of overlapping speech, while remaining usable in streaming processing conditions.