Speech Samples for "Error-Resilient Semantic Communication for Speech Transmission over Packet-Loss Networks"

Traditional speech codec + Neural PLC:


Opus: A modern low-delay SOTA traditional speech codec supporting variable bit rates, equipped with neural packet loss concealment (PLC).
J.-M. Valin, A. Mustafa et al., “Very low complexity speech synthesis using framewise autoregressive GAN (FARGAN) with pitch prediction,” IEEE Signal Processing Letters, 2024.

Neural speech codec + Neural PLC:


FDPLC: An attention-based neural PLC method that operates in the latent space of the neural speech codec SoundStream.
H. Xue, X. Peng, X. Jiang, and Y. Lu, “Towards error-resilient neural speech coding,” in Proc. Interspeech 2022, 2022, pp. 4217–4221.

Neural speech codec + Neural PLC + In-band FEC:


SoundSpring: A loss-resilient audio transceiver with dual-functional masked language modeling over the token sequences of SoundStream. It uses the coarse layer of the RVQ tokens as in-band FEC.
S. Yao, J. Dai, X. Qin, S. Wang, S. Wang, K. Niu, and P. Zhang, “SoundSpring: Loss-resilient audio transceiver with dual-functional masked language modeling,” IEEE Journal on Selected Areas in Communications, 2025.
Glaris: An error-resilient semantic communication framework for speech transmission over packet-loss networks, which performs two-stage coding and PLC in the generative latent space of a VQ-VAE. It uses side information as in-band FEC.
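The general idea of in-band FEC mentioned for the two systems above can be sketched as follows. This is a minimal illustration under stated assumptions, not the scheme used by SoundSpring or Glaris: the packet layout, the `coarse` stand-in, and the function names `packetize` and `receive` are all hypothetical. Each packet carries its own frame plus a coarse copy of the previous frame, so a single lost packet can be recovered at reduced quality from the packet that follows it.

```python
from typing import List, Optional, Tuple

# Hypothetical packet layout: (index, primary payload, coarse copy of previous frame)
Packet = Tuple[int, str, Optional[str]]


def coarse(frame: str) -> str:
    """Stand-in for a low-bitrate description of a frame (assumption)."""
    return frame.lower()


def packetize(frames: List[str]) -> List[Packet]:
    """Attach a coarse copy of frame i-1 to packet i as in-band FEC."""
    packets: List[Packet] = []
    for i, frame in enumerate(frames):
        fec = coarse(frames[i - 1]) if i > 0 else None
        packets.append((i, frame, fec))
    return packets


def receive(packets: List[Optional[Packet]]) -> List[Optional[str]]:
    """Decode frames; a None entry in `packets` marks a lost packet."""
    out: List[Optional[str]] = [None] * len(packets)
    for i, pkt in enumerate(packets):
        if pkt is None:
            continue
        out[i] = pkt[1]
        # If the previous packet was lost, recover it from the FEC copy.
        if i > 0 and packets[i - 1] is None and pkt[2] is not None:
            out[i - 1] = pkt[2]
    return out


frames = ["A", "B", "C", "D"]
sent = packetize(frames)
sent[1] = None  # simulate losing the second packet
print(receive(sent))
```

Frames that remain `None` after this step (for example, when the last packet or two consecutive packets are lost) are the ones a PLC module would have to conceal.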

Actual traces from the PLC Challenge dataset (audio samples 1 to 6 for each system):
Reference
Glaris (FEC = 3 kbps, ours)
Opus (FARGAN)
FDPLC
SoundSpring (FEC = 3 kbps)
