Binaural Modeling for High-Fidelity Spatial Audio
Uppsala University, Disciplinary Domain of Science and Technology, Technology, Department of Electrical Engineering, Signals and Systems.
ORCID iD: 0000-0002-7879-8513
2024 (English)
Doctoral thesis, comprehensive summary (Other academic)
Description
Abstract [en]

The enjoyment of reproduced sound and music is a prime pleasure for many, and the high-fidelity reproduction of binaural audio is integral to many applications in augmented and virtual reality. This thesis introduces a framework for binaural headphone auralization of sound systems, together with an in-depth analysis and proposed solutions to address sources of coloration within the signal chain.

The framework includes a novel method for binaural auralization of microphone array impulse responses. Employing a hybrid parametric approach, it utilizes causal multichannel Wiener filtering to synthesize the directional response of the ear, as described by head-related transfer functions (HRTFs), using the microphone array and a model of its acoustic properties. A time-domain polynomial matrix framework is employed for filter computations, and direct and reflected sound are treated separately. Results demonstrate a small perceptual difference from reference measured binaural room impulse responses.

Additionally, the thesis addresses the impact of binaural measurement uncertainty and proposes a new measurement technique for HRTFs and headphone transfer functions (HpTFs). The method is based on a cardioid microphone array for open ear canal measurements. Results indicate that the method significantly reduces measurement uncertainty compared to omnidirectional measurements in the ear canal.  

Moreover, a phase pre-processing method for HRTFs is introduced that reduces spatial phase variability of the HRTF set at high frequencies while retaining correct interaural coherence for diffuse sound. It is demonstrated that the HRTF phase pre-processing greatly reduces spectral coloration in headphone simulation of amplitude panning on virtual speakers. The method also improves performance in binaural rendering of microphone array recordings. 

Finally, the thesis presents a comprehensive model for addressing coloration at the ear-signal level inherent in amplitude panning on speaker arrays. The analysis focuses on pairwise panning on symmetrical speaker setups, and monaural correction filters are proposed that are robust to head movements around the sweet spot. The proposed filters are found to mitigate the phantom source elevation effect in stereophonic panning and enhance the perceived spectral similarity between discrete and panned sound sources, with effectiveness contingent on the speaker setup geometry.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2024, p. 72
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 2410
Keywords [en]
Binaural measurements, binaural reproduction, headphones, MIMO Wiener filtering, sound reproduction, spatial audio, cardioid measurements
National Category
Signal Processing
Research subject
Electrical Engineering with specialization in Signal Processing
Identifiers
URN: urn:nbn:se:uu:diva-527750
ISBN: 978-91-513-2151-6 (print)
OAI: oai:DiVA.org:uu-527750
DiVA, id: diva2:1856435
Public defence
2024-09-13, Polhemsalen, Ångströmlaboratoriet, Lägerhyddsvägen 1, Uppsala, 13:15 (English)
Available from: 2024-06-04 Created: 2024-05-06 Last updated: 2024-06-04
List of papers
1. Binaural Auralization of Microphone Array Room Impulse Responses Using Causal Wiener Filtering
2021 (English)
In: IEEE/ACM Transactions on Audio, Speech, and Language Processing, ISSN 2329-9290, E-ISSN 2329-9304, Vol. 29, p. 2899-2914
Article in journal (Refereed) Published
Abstract [en]

Binaural room auralization involves Binaural Room Impulse Responses (BRIRs). Dynamic binaural synthesis (i.e., head-tracked presentation) requires BRIRs for multiple head poses. Artificial heads can be used to measure BRIRs, but BRIR modeling from microphone array room impulse responses (RIRs) is becoming popular since personalized BRIRs can be obtained for any head pose with low extra effort. We present a novel framework for estimating a binaural signal from microphone array signals, using causal Wiener filtering and polynomial matrix formalism. The formulation places no explicit constraints on the geometry of the microphone array and enables directional weighting of the estimation error. A microphone noise model is used for regularization and to balance filter performance and noise gain. A complete procedure for BRIR modeling from microphone array RIRs is also presented, employing the proposed Wiener filtering framework. An application example illustrates the modeling procedure using a 19-channel spherical microphone array. Direct and reflected sound segments are modeled separately. The modeled BRIRs are compared to measured BRIRs and are shown to be waveform-accurate up to at least 1.5 kHz. At higher frequencies, correct statistical properties of diffuse sound field components are aimed for. A listening test indicates small perceptual differences to measured BRIRs. The presented method facilitates fast BRIR data set acquisition for use in dynamic binaural synthesis and is a viable alternative to Ambisonics-based binaural room auralization.
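The idea of fitting array filters to a target ear response, with directional weighting and noise-based regularization, can be illustrated with a much simpler stand-in than the paper's method. The sketch below is a per-frequency, non-causal regularized least-squares fit for a hypothetical two-microphone free-field array; it is not the causal polynomial-matrix Wiener design described in the abstract. The array geometry, the function names `steering` and `regularized_filter`, and the synthetic target response are all assumptions made for illustration.

```python
import cmath
import math

def steering(freq_hz, angle_rad, spacing_m=0.02, c=343.0):
    """Free-field response of a hypothetical 2-mic array to a plane wave."""
    delay = spacing_m * math.cos(angle_rad) / (2 * c)
    w = 2 * math.pi * freq_hz
    return (cmath.exp(1j * w * delay), cmath.exp(-1j * w * delay))

def regularized_filter(A, h, weights, noise_var):
    """Solve (A^H W A + noise_var * I) g = A^H W h for two microphones.

    A: list of (a1, a2) array responses, one per direction.
    h: target ear response per direction (stand-in for an HRTF).
    weights: non-negative directional weights on the estimation error.
    noise_var: microphone noise power, acting as regularization.
    """
    r11 = r22 = complex(noise_var)
    r12 = p1 = p2 = 0j
    for (a1, a2), target, w in zip(A, h, weights):
        r11 += w * abs(a1) ** 2
        r22 += w * abs(a2) ** 2
        r12 += w * a1.conjugate() * a2
        p1 += w * a1.conjugate() * target
        p2 += w * a2.conjugate() * target
    r21 = r12.conjugate()
    det = r11 * r22 - r12 * r21
    # Closed-form inverse of the 2x2 normal-equation matrix.
    g1 = (r22 * p1 - r12 * p2) / det
    g2 = (r11 * p2 - r21 * p1) / det
    return g1, g2

# Twelve plane-wave directions at 1 kHz and a synthetic target response.
angles = [math.radians(a) for a in range(0, 360, 30)]
A = [steering(1000.0, a) for a in angles]
h = [0.5 * a1 + 0.25 * a2 for a1, a2 in A]
weights = [1.0] * len(angles)

g_low = regularized_filter(A, h, weights, noise_var=1e-9)
g_high = regularized_filter(A, h, weights, noise_var=1.0)
```

Raising `noise_var` shrinks the filter coefficients, which lowers noise gain at the cost of a larger fitting error; this is the trade-off the abstract refers to when it mentions balancing filter performance and noise gain.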

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021
Keywords
Microphone arrays, Estimation, Ear, Speech processing, Acoustics, Array signal processing, Magnetic heads, Beamforming, binaural recording, binaural room impulse response (BRIR), head-related transfer function (HRTF), interaural coherence, MIMO, virtual acoustic environment, virtual artificial head (VAH)
National Category
Signal Processing; Fluid Mechanics
Identifiers
urn:nbn:se:uu:diva-456921 (URN)
10.1109/TASLP.2021.3110340 (DOI)
000697817100002 ()
Funder
Vinnova
Available from: 2021-10-25 Created: 2021-10-25 Last updated: 2025-08-28
Bibliographically approved
2. Robust Binaural Measurements in the Ear Canal Using a Two-Microphone Array
2023 (English)
Conference paper, Published paper (Refereed)
Abstract [en]

Accurate binaural rendering requires accurate reproduction of binaural signals at the eardrum, which in turn requires adequate binaural technology. We propose a method to measure head-related and headphone transfer functions with a two-microphone array in the ear canal. By implementing a cardioid directional pattern, the forward and reverse propagating sound pressure components are measured separately, thus avoiding the influence of standing waves in the ear canal on the measurements. The method is useful in filter design for individualized binaural rendering and, unlike the blocked-canal method, does not assume that acoustically 'open' headphones are used. The method also mitigates the excessive sensitivity to microphone position of regular open-canal measurements. Validation measurements are conducted using a natural scale replica ear and a MEMS microphone array.
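Why two closely spaced microphones can separate forward and reverse propagating components may be easier to see in a one-dimensional plane-wave model. The sketch below is the textbook two-microphone wave decomposition in a narrow duct, offered as an illustration of the principle rather than as the paper's cardioid processing. The speed of sound, the microphone spacing, the wave amplitudes, and the function names are assumptions.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def pressure(x_m, forward, reverse, freq_hz):
    """Plane-wave pressure p(x) = A*exp(-j*k*x) + B*exp(+j*k*x)."""
    k = 2 * math.pi * freq_hz / SPEED_OF_SOUND
    return forward * cmath.exp(-1j * k * x_m) + reverse * cmath.exp(1j * k * x_m)

def separate_waves(p1, p2, freq_hz, spacing_m):
    """Recover (A, B) at x = 0 from pressures at x = 0 (p1) and x = spacing_m (p2)."""
    kd = 2 * math.pi * freq_hz / SPEED_OF_SOUND * spacing_m
    # The denominator vanishes when the spacing is a multiple of half a wavelength.
    denom = 2j * math.sin(kd)
    forward = (p1 * cmath.exp(1j * kd) - p2) / denom
    reverse = (p2 - p1 * cmath.exp(-1j * kd)) / denom
    return forward, reverse

# Hypothetical standing-wave field: incident wave plus a strong reflection.
freq = 3000.0
true_forward = 1.0 + 0j
true_reverse = 0.8 * cmath.exp(0.3j)
spacing = 0.005  # 5 mm between the two microphones (assumed)

p1 = pressure(0.0, true_forward, true_reverse, freq)
p2 = pressure(spacing, true_forward, true_reverse, freq)
est_forward, est_reverse = separate_waves(p1, p2, freq, spacing)
```

In this model the total pressure magnitude changes with position because the two components interfere, while the magnitude of the recovered forward component does not, which is consistent with the reduced position sensitivity the abstract reports.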

Place, publisher, year, edition, pages
The Audio Engineering Society, 2023
Keywords
Headphone measurements, HRTFs, Cardioid
National Category
Engineering and Technology
Identifiers
urn:nbn:se:uu:diva-525352 (URN)
Conference
AES 2023 International Conference on Spatial and Immersive Audio, Huddersfield, UK, August 23-25, 2023
Available from: 2024-03-21 Created: 2024-03-21 Last updated: 2024-05-06
Bibliographically approved
3. A practical method for HRTF phase pre-processing
2022 (English)
In: Proceedings of the 24th International Congress on Acoustics, International Congress on Acoustics (ICA), 2022, article id ABS-0222
Conference paper, Published paper (Refereed)
Abstract [en]

Pre-processing of HRTF phase has proved useful to improve binaural rendering of order-limited spherical harmonics (SH) signals. The adjustment is typically applied at high frequencies and makes it possible to reduce magnitude errors for directional sound field components. This article proposes a practical method for HRTF phase pre-processing using linear phase above a cutoff frequency, and a direction-dependent phase offset to maintain correct diffuse-field interaural coherence. Two applications are discussed: filter design for binaural rendering of microphone array or SH signals, and reduced coloration in virtual source panning on virtual speaker setups. Factors influencing the perceptual transparency of the phase modification are evaluated subjectively and objectively.
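The core operation named in the abstract, replacing the phase with a linear phase above a cutoff while keeping the magnitude, can be sketched in a few lines. This is a minimal illustration only: the abstract does not state how the delay or the direction-dependent phase offset is chosen, so both are plain input parameters here, and the function name `linearize_phase` and the toy spectrum are assumptions.

```python
import cmath
import math

def linearize_phase(freqs_hz, spectrum, cutoff_hz, delay_s, phase_offset_rad=0.0):
    """Keep bins at or below cutoff_hz; above it keep |H| and impose linear phase.

    The imposed phase is -2*pi*f*delay_s + phase_offset_rad. How delay_s and
    phase_offset_rad should be chosen per direction is not specified here.
    """
    out = []
    for f, value in zip(freqs_hz, spectrum):
        if f <= cutoff_hz:
            out.append(value)
        else:
            phase = -2 * math.pi * f * delay_s + phase_offset_rad
            out.append(cmath.rect(abs(value), phase))
    return out

# Toy single-direction spectrum with an arbitrary, non-linear phase.
freqs = [500.0 * n for n in range(1, 21)]  # 0.5 kHz to 10 kHz
spectrum = [cmath.rect(1.0 / (1.0 + f / 4000.0), math.sin(f / 700.0)) for f in freqs]

processed = linearize_phase(freqs, spectrum, cutoff_hz=2000.0,
                            delay_s=0.0003, phase_offset_rad=0.2)
```

Because only the phase is altered, the magnitude response of each HRTF is unchanged; what changes is how the responses for neighbouring directions add up when several of them are combined, as in panning over virtual speakers.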

Place, publisher, year, edition, pages
International Congress on Acoustics (ICA), 2022
Keywords
Auralization, Ambisonics decoding, HRTF interpolation
National Category
Other Engineering and Technologies
Identifiers
urn:nbn:se:uu:diva-509587 (URN)
Conference
24th International Congress on Acoustics, Gyeongju, Korea, October 24-28, 2022
Available from: 2023-08-21 Created: 2023-08-21 Last updated: 2025-02-18
Bibliographically approved
4. Spectral Correction of Audio Objects in Stereophonic Rendering
2024 (English)
In: IEEE/ACM Transactions on Audio, Speech, and Language Processing, ISSN 2329-9290, E-ISSN 2329-9304, Vol. 32, p. 3141-3156
Article in journal (Refereed) Published
Abstract [en]

This paper presents a comprehensive model for ear-signal level coloration in stereo amplitude panning, enabling the calculation of monaural correction filters that equalize the average coloration over a small area around the sweet spot. The model takes into account the speaker setup geometry, listener Head-Related Transfer Functions (HRTFs), the employed pan-law, the direct-to-reflected sound ratio, and the correlation between the speaker signals at the listening position. Coloration in diffuse sound reproduction is also investigated. The coloration model is validated using binaural room impulse response measurements, and the correction filters are found to effectively reduce the difference in composite ear power spectrum between a discrete and virtual center source. A listening test on the perceived spectral difference between these two cases, with stereo setups in front of and behind the listener, indicates that the correction filter improves timbral similarity between a virtual and discrete center source for rear speaker panning. The test also indicates that remaining unmodeled coloration sources are large, especially for front panning. However, a second listening test finds that the correction filter improves accuracy of perceived direction in front panning by mitigating the phantom image elevation effect.
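The origin of this coloration can be shown with a deliberately crude free-field model: each ear receives the near speaker directly and the far speaker delayed and attenuated, so coherent speaker signals interfere at the ear. The sketch below uses that toy model, not the paper's model, which also accounts for HRTFs, reflections, signal correlation, and averaging around the sweet spot. The interaural delay of 0.25 ms, the head-shadow factor of 0.7, the choice of a single hard-panned speaker as reference, and the function names are assumptions.

```python
import cmath
import math

def composite_ear_power(freq_hz, gain_l, gain_r, delay_s=0.00025, shadow=0.7):
    """Sum of left and right ear power for a symmetric two-speaker setup.

    Each ear hears the near speaker directly and the far speaker delayed by
    delay_s and scaled by shadow (crude stand-in for head shadowing).
    """
    cross = shadow * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    left_ear = gain_l + gain_r * cross
    right_ear = gain_r + gain_l * cross
    return abs(left_ear) ** 2 + abs(right_ear) ** 2

def correction_gain(freq_hz, gain_l, gain_r, **model):
    """Monaural gain matching panned ear power to that of one discrete speaker.

    Assumes the panned power is non-zero, which holds for shadow < 1.
    """
    reference = composite_ear_power(freq_hz, 1.0, 0.0, **model)
    panned = composite_ear_power(freq_hz, gain_l, gain_r, **model)
    return math.sqrt(reference / panned)

# Centre phantom source with a constant-power pan law.
g = 1.0 / math.sqrt(2.0)
power_dc = composite_ear_power(0.0, g, g)
power_dip = composite_ear_power(2000.0, g, g)  # f = 1 / (2 * delay_s)
gain_dip = correction_gain(2000.0, g, g)
```

With these assumed values the panned source is boosted at low frequencies and has a deep notch near 2 kHz relative to a single speaker, and the correction gain is the magnitude a monaural equalizer would need at each frequency to undo that difference in this toy model.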

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
National Category
Signal Processing
Identifiers
urn:nbn:se:uu:diva-527266 (URN)
10.1109/TASLP.2024.3414283 (DOI)
001256333200003 ()
Available from: 2024-04-26 Created: 2024-04-26 Last updated: 2025-08-28
Bibliographically approved

Open Access in DiVA

UUThesis_Gunnarsson,V-2024 (28377 kB), 3337 downloads
File information
File name: FULLTEXT01.pdf
File size: 28377 kB
Checksum (SHA-512): 92fd7e7d6f947bb7fd730459d556c2f529acdbc4126f395acab95bb3691b0cf9adaed11cf2116995138af835d6b78b010aad11f8f899b8e0294c6b3a6382b782
Type: fulltext
Mimetype: application/pdf

Authority records

Gunnarsson, Viktor
