Privacy-Preserving Speaker Verification in End-to-End Encrypted Chats: A Feasibility Study

Altaibek M., Zulkhazhav A., Yergesh B., Bekmanova G., Omarbekova A., Sharipbay A.
2025 Institute of Electrical and Electronics Engineers Inc.

IEEE Access
2025, vol. 13, pp. 217556-217572

We present a deployment-oriented, privacy-preserving speaker-verification pipeline purpose-built for end-to-end encrypted chats. All nonlinear acoustic modeling and embedding extraction run on the user device, while the cloud performs only fixed-point cosine matching and a single threshold comparison via secure multiparty computation, so that the verifier learns only a one-bit accept/reject decision. Integer templates are rendered cancelable with per-tenant salts, enabling revocation and cross-realm unlinkability throughout the template lifecycle. To align evaluation with deployment arithmetic, we introduce a decision-only calibration procedure that operates in the same fixed-point regime as inference; it uses isotonic regression with binomial-uncertainty intervals to select operating points without exposing scores. Empirically, the system yields accuracy comparable to its plaintext baseline at matched operating points on standard VoxCeleb protocols, maintains stable operating characteristics under a unified tie policy across floating-point, fixed-point, and multiparty computation instantiations, and exhibits latency and communication footprints compatible with interactive use. The design further generalizes across conditions (e.g., duration and noise) and to a low-resource language. We release trial lists, per-threshold tables, and code to support independent audits. Overall, this work should be read as a feasibility study rather than a ready-to-deploy product: our goal is to surface the trade-offs and limitations of privacy-preserving speaker verification in realistic encrypted messaging environments, not to claim production-grade performance. 
Within this feasibility scope, our contribution is an engineering integration under end-to-end encryption and secure multiparty computation (MPC) constraints: 1) a practical split-inference architecture that keeps all nonlinear acoustic modeling and embeddings on device; 2) a fixed-point, decision-only calibration pipeline that is numerically aligned with the secure deployment regime; and 3) a revocable, unlinkable template mechanism instantiated via per-tenant salts and orthonormal transforms, consistent with ISO/IEC 24745. In contrast to homomorphic encryption (HE) schemes that evaluate large parts of the embedding network under encryption, or MPC designs based on probabilistic linear discriminant analysis (PLDA) that assume server-side access to embeddings, our verifier outsources only cosine-family matching and a single threshold comparison to MPC, never exposing raw audio, float embeddings, or plaintext templates. Together, these elements offer a pragmatic path to piloting privacy-preserving speaker verification in controlled encrypted channels without relaxing end-to-end encryption guarantees.
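The matching step described above can be illustrated with a minimal plaintext sketch. It is not the authors' implementation: the function names, the 12-bit scale, the signed-permutation transform (one simple member of the orthonormal family), and the accept-on-tie rule are all assumptions made for illustration, and the MPC layer is not modelled. In the described system the comparison inside `accept` would be evaluated under MPC so that only the resulting bit is revealed.

```python
import hashlib
import math
import random

SCALE_BITS = 12  # hypothetical fixed-point precision, not taken from the paper


def quantize(vec, scale_bits=SCALE_BITS):
    """L2-normalize a float embedding on device and round it to integers."""
    norm = math.sqrt(sum(x * x for x in vec))
    scale = 1 << scale_bits
    return [round(x / norm * scale) for x in vec]


def salted_transform(int_vec, salt):
    """Apply a salt-derived signed permutation to an integer template.

    A signed permutation is orthonormal, so dot products and norms are
    preserved exactly in integer arithmetic. Changing the salt yields a new
    template (revocation). Python's `random` is used only for illustration;
    a real system would need a cryptographic generator.
    """
    seed = int.from_bytes(hashlib.sha256(salt).digest(), "big")
    rng = random.Random(seed)
    perm = list(range(len(int_vec)))
    rng.shuffle(perm)
    signs = [rng.choice((-1, 1)) for _ in int_vec]
    return [signs[i] * int_vec[perm[i]] for i in range(len(int_vec))]


def accept(template, probe, thr_num, thr_den):
    """One-bit decision: cos(template, probe) >= thr_num / thr_den.

    Uses integers only (no division, no square root). Assumes a positive
    threshold and, as an illustrative tie policy, accepts on equality.
    """
    dot = sum(a * b for a, b in zip(template, probe))
    if dot <= 0:
        return False  # cosine is non-positive, below any positive threshold
    norm_t = sum(a * a for a in template)
    norm_p = sum(b * b for b in probe)
    # cos >= num/den  <=>  dot^2 * den^2 >= num^2 * |t|^2 * |p|^2 (dot > 0)
    return dot * dot * thr_den * thr_den >= thr_num * thr_num * norm_t * norm_p


# Usage: enroll and verify under the same hypothetical tenant salt.
salt = b"tenant-A"
enrolled = salted_transform(quantize([0.2, -0.5, 0.7, 0.1]), salt)
probe = salted_transform(quantize([0.25, -0.45, 0.65, 0.05]), salt)
decision = accept(enrolled, probe, 4, 5)  # threshold 0.8, illustrative only
```

Because the transform preserves inner products exactly, the decision on transformed templates matches the decision on the untransformed integer vectors, which is what allows the threshold to be calibrated in the same fixed-point regime that is used at inference time.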

Cancelable biometrics, end-to-end encryption, feasibility study, fixed-point matching, on-device embeddings, secure multiparty computation, speaker verification, unlinkability


L. N. Gumilyov Eurasian National University, Astana, 010000, Kazakhstan

