Citation: | XIE Bosun. Spatial Sound—History, Principle, Progress and Challenge[J]. Chinese Journal of Electronics, 2020, 29(3): 397-416. doi: 10.1049/cje.2020.02.016 |
F. Rumsey, Spatial Audio, Focal Press, Oxford, England, 2001.
|
B.S. Xie, Head-related Transfer Function and Virtual Auditory Display (Second Edition), J Ross Publishing, USA, 2013.
|
J. Blauert, Spatial Hearing: The Psychophysics of Human Sound Localization (Revised Edition), MIT Press, Cambridge, MA, USA, 1997.
|
J.L. Jiang, B.S. Xie, H.M. Mai, et al., “The role of dynamic cue in auditory vertical localization”, Applied Acoustics, Vol.146, pp.398-408, 2019.
|
H.A.M. Clark, G.F. Dutton and P.B. Vanderlyn, “The ‘stereosonic’ recording and reproduction system”, IRE Transactions on Audio, Vol.5, No.4, pp.96-111, 1957.
|
D.M. Leakey, “Some measurements on the effects of interchannel intensity and time differences in two channel sound systems”, Journal of the Acoustical Society of America, Vol.31, No.7, pp.977-986, 1959.
|
B.S. Xie, “Signal mixing for a 5.1 channel surround sound system-Analysis and experiment”, Journal of the Audio Engineering Society, Vol.49, No.4, pp.263-274, 2001.
|
H. Mertens, “Directional hearing in stereophony theory and experimental verification”, EBU Rev., Part A, No.92(Aug.), pp.146-158, 1965.
|
R.Y. Litovsky, H.S. Colburn, W.A. Yost, et al., “The precedence effect”, Journal of the Acoustical Society of America, Vol.106, No.4, pp.1633-1654, 1999.
|
A.W. Bronkhorst, “The cocktail party phenomenon: A review of research on speech intelligibility in multiple-talker conditions”, Acta Acustica united with Acustica, Vol.86, No.1, pp.117-128, 2000.
|
M. Barron and A.H. Marshall, “Spatial impression due to early lateral reflections in concert halls: The derivation of a physical measure”, Journal of Sound and Vibration, Vol.77, No.2, pp.211-232, 1981.
|
M. Morimoto, H. Fujimori and Z. Maekawa, “Discrimination between auditory source width and envelopment”, Journal of the Acoustical Society of Japan, Vol.46, No.6, pp.448-457, 1990.
|
Y. Ando, Auditory and Visual Sensation, Springer-Verlag, New York, USA, 2009.
|
P. Damaske and Y. Ando, “Interaural crosscorrelation for multichannel loudspeaker reproduction”, Acta Acustica united with Acustica, Vol.27, No.4, pp.232-238, 1972.
|
B. Shi and B.S. Xie, “The cross-correlation of signals and spatial impression in surround sound reproduction”, Chinese Journal of Acoustics, Vol.29, No.3, pp.308-320, 2010.
|
B.S. Xie and S.Q. Gan, “Development and psychoacoustic principle of multichannel surround sound”, Audio Engineering, Vol.2002, No.2, pp.11-18, 2002. (in Chinese)
|
B.S. Xie, “Spatial interpolation of HRTFs and signal mixing for multichannel surround sound”, Chinese Journal of Acoustics (AES), Vol.25, No.4, pp.330-341, 2006.
|
M.A. Gerzon, “Periphony: With-height sound reproduction”, Journal of the Audio Engineering Society, Vol.21, No.1, pp.2-10, 1973.
|
M.A. Gerzon, “Ambisonics in multichannel broadcasting and video”, Journal of the Audio Engineering Society, Vol.33, No.11, pp.859-871, 1985.
|
X.F. Xie, “The 4-3-N matrix multi-channel sound system”, Chinese Journal of Acoustics, Vol.1, No.2, pp.210-218, 1982.
|
J.S. Bamford and J. Vanderkooy, “Ambisonic sound for us”, the AES 99th Convention, New York, USA, Paper No.4138, 1995.
|
J. Daniel and S. Moreau, “Further study of sound field coding with higher order Ambisonics”, the AES 116th Convention, Berlin, Germany, Paper No.6017, 2004.
|
D.B. Ward and T.D. Abhayapala, “Reproduction of a plane-wave sound field using an array of loudspeakers”, IEEE Transactions on Speech and Audio Processing, Vol.9, No.6, pp.697-707, 2001.
|
B.S. Xie and X.F. Xie, “Analyse and sound image localization experiment on multi-channel planar surround sound system”, Chinese Journal of Acoustics, Vol.15, No.1, pp.52-64, 1996.
|
X.F. Xie, “The 4-3-4 matrix system for quadraphone”, Journal of South China University of Technology, Vol.5, No.1, pp.40-48, 1977. (in Chinese)
|
X.F. Xie, “The 4-3-4 transform and N ≥ channels reproduction for panoramic (stereophonic) reproduction”, Journal of South China University of Technology, Vol.6, No.2, pp.54-70, 1978. (in Chinese)
|
A.J. Berkhout, D. De Vries and P. Vogel, “Acoustic control by wave field synthesis”, Journal of the Acoustical Society of America, Vol.93, No.5, pp.2764-2778, 1993.
|
M.M. Boone, E.N.G. Verheijen and P.F. Van Tol, “Spatial sound field reproduction by wave field synthesis”, Journal of the Audio Engineering Society, Vol.43, No.12, pp.1003-1012, 1995.
|
S. Spors, R. Rabenstein and J. Ahrens, “The theory of wave field synthesis revisited”, the AES 124th Convention, Amsterdam, the Netherlands, Paper No.7358, 2008.
|
J. Ahrens, Analytic Methods of Sound Field Synthesis, Springer-Verlag Berlin Heidelberg, Berlin, Germany, 2012.
|
A.D. Blumlein, “Improvements in and relating to sound transmission, sound recording and sound reproducing systems”, British Patent Specification 394,325, Reprint in Journal of the Audio Engineering Society, Vol.6, No.2, pp.91-98/130, 1958.
|
K. De Boer, “Stereophonic sound reproduction”, Philips Tech. Rev., Vol.1940, No.5, pp.107-114, 1940.
|
G. Theile and G. Plenge, “Localization of lateral phantom sources”, Journal of the Audio Engineering Society, Vol.25, No.4, pp.196-200, 1977.
|
X.F. Xie, The Principle of Stereophonic Sound, Science Press, Beijing, China, 1981. (in Chinese)
|
ITU-R BS.775-1:1994, Multichannel Stereophonic Sound System with and without Accompanying Picture.
|
Dolby Laboratories, “Home theater speaker guide”, http://www.dolby.com, 2007.
|
DTS Inc., “DTS-HD audio, consumer white paper for Blu-ray Disc and HD DVD applications”, http://www.dts.com, 2006.
|
M.A. Gerzon, “General metatheory of auditory localisation”, the AES 92nd Convention, Vienna, Austria, Paper No.3306, 1992.
|
X.F. Xie, “A mathematical analysis of three dimensional surround sound field”, Acta Acustica, Vol.13, No.5, pp.321-328, 1988. (in Chinese)
|
ITU-R Report BS.2159-7:2015, Multichannel Sound Technology in Home and Broadcasting Applications.
|
G. Theile and H. Wittek, “Principles in surround recordings with height”, the AES 130th Convention, London, UK, Paper No.8403, 2011.
|
T. Holman, “The number of loudspeaker channels”, the AES 19th International Conference, Schloss Elmau, Germany, 2001.
|
T. Holman, Surround Sound, Up and Running (Second Edition), Focal Press, Burlington, MA, USA, 2008.
|
S. Kim, Y.W. Lee and V. Pulkki, “New 10.2-channel vertical surround system (10.2-VSS); comparison study of perceived audio quality in various multichannel sound systems with height loudspeakers”, the AES 129th Convention, San Francisco, USA, Paper No.8296, 2010.
|
K. Hamasaki, “The 22.2 multichannel sounds and its reproduction at home and personal environment”, the AES 43rd International Conference, Pohang, Korea, 2011.
|
Dolby Laboratories, “Dolby Atmos specifications”, http://www.dolby.com, 2015.
|
J. Herre, J. Hilpert, A. Kuntz, et al., “MPEG-H audio-The new standard for coding of immersive spatial audio”, IEEE Journal of Selected Topics in Signal Processing, Vol.9, No.5, pp.770-779, 2015.
|
J.G. Woodward, “Quadraphony-A review”, Journal of the Audio Engineering Society, Vol.25, No.10/11, pp.843-854, 1977.
|
B.S. Xie and X.F. Xie, “The study of planar surround sound field”, Acta Acustica, Vol.17, No.3, pp.225-231, 1992. (in Chinese)
|
Dolby Laboratories, “Dolby surround mixing manual”, http://www.dolby.com, 1998.
|
R. Dressler, “Dolby surround Pro Logic II decoder principles of operation”, http://www.dolby.com, 2000.
|
N. Tsingos, C. Chabanne, C. Robinson, et al., “Surround sound with height in games using Dolby Pro Logic IIz”, the AES 129th Convention, San Francisco, USA, Paper No.8248, 2010.
|
M.R. Bai and G.Y. Shih, “Upmixing and downmixing two-channel stereo audio for consumer electronics”, IEEE Transactions on Consumer Electronics, Vol.53, No.3, pp.1011-1019, 2007.
|
C. Faller, “Multiple-loudspeakers playback of stereo signals”, Journal of the Audio Engineering Society, Vol.54, No.11, pp.1051-1064, 2006.
|
S. Kraft and U. Zölzer, “Low-complexity stereo signal decomposition and source separation for application in stereo to 3D upmixing”, the AES 140th Convention, Paris, France, Paper No.9586, 2016.
|
H. Møller, “Fundamentals of binaural technology”, Applied Acoustics, Vol.36, No.3/4, pp.171-218, 1992.
|
F.L. Wightman and D.J. Kistler, “Headphone simulation of free-field listening, I: Stimulus synthesis”, Journal of the Acoustical Society of America, Vol.85, No.2, pp.858-867, 1989.
|
F.L. Wightman and D.J. Kistler, “Headphone simulation of free-field listening, II: Psycho-physical validation”, Journal of the Acoustical Society of America, Vol.85, No.2, pp.868-878, 1989.
|
V.R. Algazi, R.O. Duda, D.M. Thompson, et al., “The CIPIC HRTF database”, Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics, New York, USA, pp.99-102, 2001.
|
W.G. Gardner and K.D. Martin, “HRTF measurements of a KEMAR”, Journal of the Acoustical Society of America, Vol.97, No.6, pp.3907-3908, 1995.
|
B.S. Xie, X.L. Zhong, D. Rao, et al., “Head-related transfer function database and its analyses”, Science in China Series G, Physics, Mechanics & Astronomy, Vol.50, No.3, pp.267-280, 2007.
|
D.S. Brungart and W.M. Rabinowitz, “Auditory localization of nearby sources, head-related transfer functions”, Journal of the Acoustical Society of America, Vol.106, No.3, pp.1465-1479, 1999.
|
T.S. Qu, Z. Xiao and M. Gong, “Distance-dependent head-related transfer functions measured with high spatial resolution using a spark gap”, IEEE Transactions on Audio, Speech, and Language Processing, Vol.17, No.6, pp.1124-1132, 2009.
|
G.Z. Yu, B.S. Xie and D. Rao, “Near-field head-related transfer functions of an artificial head and its characteristics”, Acta Acustica, Vol.37, No.4, pp.378-385, 2012. (in Chinese)
|
G.Z. Yu, R.X. Wu, Y. Liu, et al., “Near-field head-related transfer-function measurement and database of human subjects”, Journal of the Acoustical Society of America, Vol.143, No.3, pp.EL194-EL198, 2018.
|
R.O. Duda and W.L. Martens, “Range dependence of the response of a spherical head model”, Journal of the Acoustical Society of America, Vol.104, No.5, pp.3048-3058, 1998.
|
V.R. Algazi, R.O. Duda, R. Duraiswami, et al., “Approximating the head-related transfer function using simple geometric models of the head and torso”, Journal of the Acoustical Society of America, Vol.112, No.5, pp.2053-2064, 2002.
|
B.F.G. Katz, “Boundary element method calculation of individual head-related transfer function. I. Rigid model calculation”, Journal of the Acoustical Society of America, Vol.110, No.5, pp.2440-2448, 2001.
|
Y. Kahana and P.A. Nelson, “Boundary element simulations of the transfer function of human heads and baffled pinnae using accurate geometric models”, Journal of Sound and Vibration, Vol.300, No.3/5, pp.552-579, 2007.
|
M. Otani and S. Ise, “Fast calculation system specialized for head-related transfer function based on boundary element method”, Journal of the Acoustical Society of America, Vol.119, No.5, pp.2589-2598, 2006.
|
N.A. Gumerov, A.E. O’Donovan, R. Duraiswami, et al., “Computation of the head-related transfer function via the fast multipole accelerated boundary element method and its spherical harmonic representation”, Journal of the Acoustical Society of America, Vol.127, No.1, pp.370-386, 2010.
|
X.L. Zhong and B.S. Xie, “Maximal azimuthal resolution needed in measurements of head-related transfer functions”, Journal of the Acoustical Society of America, Vol.125, No.4, pp.2209-2220, 2009.
|
D.J. Kistler and F.L. Wightman, “A model of head-related transfer functions based on principal components analysis and minimum-phase reconstruction”, Journal of the Acoustical Society of America, Vol.91, No.3, pp.1637-1647, 1992.
|
V. Larcher, J.M. Jot, J. Guyard, et al., “Study and comparison of efficient methods for 3D audio spatialization based on linear decomposition of HRTF data”, the AES 108th Convention, Paris, France, Paper No.5097, 2000.
|
B.S. Xie, “Recovery of individual head-related transfer functions from a small set of measurements”, Journal of the Acoustical Society of America, Vol.132, No.1, pp.282-294, 2012.
|
J. Mackenzie, J. Huopaniemi, V. Valimaki, et al., “Low-order modeling of head-related transfer functions using balanced model truncation”, IEEE Signal Processing Letters, Vol.4, No.2, pp.39-41, 1997.
|
Y. Haneda, S. Makino, Y. Kaneda, et al., “Common acoustical pole and zero modeling of room transfer functions”, IEEE Transactions on Speech and Audio Processing, Vol.7, No.2, pp.188-196, 1999.
|
M.A. Blommer and G.H. Wakefield, “Pole-zero approximations for head-related transfer functions using a logarithmic error criterion”, IEEE Transactions on Speech and Audio Processing, Vol.5, No.3, pp.278-287, 1997.
|
A. Kulkarni, S.K. Isabelle and H.S. Colburn, “Sensitivity of human subjects to head-related transfer-function phase spectra”, Journal of the Acoustical Society of America, Vol.105, No.5, pp.2821-2840, 1999.
|
A. Härmä, M. Karjalainen, L. Savioja, et al., “Frequency-warped signal processing for audio applications”, Journal of the Audio Engineering Society, Vol.48, No.11, pp.1011-1031, 2000.
|
E.M. Wenzel, M. Arruda, D.J. Kistler, et al., “Localization using nonindividualized head-related transfer functions”, Journal of the Acoustical Society of America, Vol.94, No.1, pp.111-123, 1993.
|
X.L. Zhong and B.S. Xie, “Approximation of individualized head-related transfer function-Current progresses and problems”, Applied Acoustics, Vol.31, No.6, pp.410-415, 2012. (in Chinese)
|
B.S. Xie, X.L. Zhong and N.N. He, “Typical data and cluster analysis on head-related transfer functions from Chinese subjects”, Applied Acoustics, Vol.94, No.1, pp.1-13, 2015.
|
E.M. Wenzel, “What perception implies about implementation of interactive virtual acoustic environments”, the AES 101st Convention, Los Angeles, CA, USA, Paper No.4353, 1996.
|
L. Savioja, J. Huopaniemi, T. Lokki, et al., “Creating interactive virtual acoustic environments”, Journal of the Audio Engineering Society, Vol.47, No.9, pp.675-705, 1999.
|
J. Blauert, H. Lehnert, J. Sahrhage, et al., “An interactive virtual-environment generator for psychoacoustic research I: Architecture and implementation”, Acta Acustica united with Acustica, Vol.86, No.1, pp.94-102, 2000.
|
C.Y. Zhang and B.S. Xie, “Platform for dynamic virtual auditory environment real-time rendering system”, Chinese Science Bulletin, Vol.58, No.3, pp.316-327, 2013.
|
M.R. Schroeder and B.S. Atal, “Computer simulation of sound transmission in rooms”, Proceedings of the IEEE, Vol.51, No.3, pp.536-537, 1963.
|
J. Bauck and D.H. Cooper, “Generalized transaural stereo and applications”, Journal of the Audio Engineering Society, Vol.44, No.9, pp.683-705, 1996.
|
O. Kirkeby, P.A. Nelson and H. Hamada, “The ‘stereo dipole’-A virtual source imaging system using two closely spaced loudspeakers”, Journal of the Audio Engineering Society, Vol.46, No.5, pp.387-395, 1998.
|
Dolby Laboratories, “Dolby headphone”, http://www.dolby.com, 1996.
|
B.S. Xie, J. Wang, S.Q. Guan, et al., “Virtual reproduction of 5.1 channel surround sound by headphone”, Chinese Journal of Acoustics, Vol.24, No.1, pp.63-75, 2005.
|
M.F. Davis and M.C. Fellers, “Virtual surround presentation of Dolby AC-3 and Pro Logic signal”, the AES 103rd Convention, New York, USA, Paper No.4542, 1997.
|
B.S. Xie, Y. Shi, Z.W. Xie, et al., “Virtual reproducing system for 5.1 channel surround sound”, Chinese Journal of Acoustics, Vol.24, No.1, pp.76-88, 2005.
|
P. He, B.S. Xie and D. Rao, “Subjective and objective analyses of timbre equalized algorithms for virtual sound reproduction by loudspeakers”, Applied Acoustics, Vol.25, No.1, pp.4-12, 2006. (in Chinese)
|
B. Bernfeld, “Simple equations for multichannel stereophonic sound localization”, Journal of the Audio Engineering Society, Vol.23, No.7, pp.553-557, 1975.
|
D. Rao and B.S. Xie, “Head rotation and sound image localization in the median plane”, Chinese Science Bulletin, Vol.50, No.5, pp.412-416, 2005.
|
B.S. Xie, H.M. Mai, D. Rao, et al., “Analysis of and experiments on vertical summing localization of multichannel sound reproduction with amplitude panning”, Journal of the Audio Engineering Society, Vol.67, No.6, pp.1-18, 2019.
|
G. Theile, “Natural 5.1 channel recording based on psychoacoustic principles”, the AES 19th International Conference, Schloss Elmau, Germany, 2001.
|
H. Wittek and G. Theile, “Development and application of a stereophonic multichannel recording technique for 3D audio and VR”, the AES 143rd Convention, New York, USA, Paper No.9869, 2017.
|
B. Rafaely, “Analysis and design of spherical microphone arrays”, IEEE Transactions on Speech and Audio Processing, Vol.13, No.1, pp.135-143, 2005.
|
D.N. Zotkin, R. Duraiswami and N.A. Gumerov, “Plane-wave decomposition of acoustical scenes via spherical and cylindrical microphone arrays”, IEEE Transactions on Audio, Speech and Language Processing, Vol.18, No.1, pp.2-16, 2010.
|
V. Pulkki, “Virtual sound source positioning using vector base amplitude panning”, Journal of the Audio Engineering Society, Vol.45, No.6, pp.456-466, 1997.
|
V. Välimäki, J.D. Parker and L. Savioja, “Fifty years of artificial reverberation”, IEEE Transactions on Audio, Speech and Language Processing, Vol.20, No.5, pp.1421-1447, 2012.
|
U.P. Svensson and U.R. Kristiansen, “Computational modeling and simulation of acoustic spaces”, the AES 22nd International Conference, Espoo, Finland, 2002.
|
ISO/IEC 23008-3:2015, Information Technology-High Efficiency Coding and Media Delivery in Heterogeneous Environments, Part 3: 3D Audio.
|
K.C. Pohlmann, Principles of Digital Audio (6th Edition), McGraw-Hill Companies, Inc., New York, USA, 2011.
|
ETSI TS 102366 V1.4.1:2017, Digital Audio Compression (AC-3, Enhanced AC-3) Standard.
|
ETSI TS 103190-2 V1.2.1:2018, Digital Audio Compression (AC-4) Standard, Part 2: Immersive and Personalized Audio.
|
V. Pulkki, M. Karjalainen and J. Huopaniemi, “Analyzing virtual sound source attributes using a binaural auditory model”, Journal of the Audio Engineering Society, Vol.47, No.4, pp.203-217, 1999.
|
R. Baumgartner and P. Majdak, “Modeling localization of amplitude-panned virtual sources in sagittal planes”, Journal of the Audio Engineering Society, Vol.63, No.7/8, pp.562-569, 2015.
|
J. Huopaniemi, N. Zacharov and M. Karjalainen, “Objective and subjective evaluation of head-related transfer function filter design”, Journal of the Audio Engineering Society, Vol.47, No.4, pp.218-239, 1999.
|
Y. Liu and B.S. Xie, “Analysis with binaural auditory model and experiment on the timbre of Ambisonics recording and reproduction”, Chinese Journal of Acoustics, Vol.34, No.4, pp.337-356, 2015.
|
ITU-R BS.1116-3:2015, Methods for the Subjective Assessment of Small Impairments in Audio Systems.
|
ITU-R BS.1534-3:2015, Method for the Subjective Assessment of Intermediate Quality Level of Audio Systems.
|
AES Technical Council, “Multichannel surround sound systems and operations”, AES Technical Council Document, AESTD1001.1.01-10, 2001.
|
W. Krebber, H.W. Gierlich and K. Genuit, “Auditory virtual environments: Basics and applications for interactive simulations”, Signal Processing, Vol.80, No.11, pp.2307-2322, 2000.
|
T.A. DeFanti, G. Dawe, D.J. Sandin, et al., “The StarCAVE, a third-generation CAVE and virtual reality Optiportal”, Future Generation Computer Systems, Vol.25, No.2, pp.169-178, 2009.
|
K.U. Doerr, H. Rademacher, S. Huesgen, et al., “Evaluation of a low-cost 3D sound system for immersive virtual reality training systems”, IEEE Transactions on Visualization and Computer Graphics, Vol.13, No.2, pp.204-212, 2007.
|
C. Jin, T. Tan, A. Kan, et al., “Real-time, head-tracked 3D audio with unlimited simultaneous sounds”, Proceedings of Eleventh Meeting of the International Conference on Auditory Display (ICAD 05), Limerick, Ireland, 2005.
|
M.J. Evans, A.I. Tew and J.A.S. Angus, “Spatial audio teleconferencing-Which way is better?”, Proceedings of the Fourth International Conference on Auditory Displays (ICAD 97), Palo Alto, California, USA, pp.29-37, 1997.
|
D.R. Begault, “Virtual acoustics, aeronautics, and communications”, Journal of the Audio Engineering Society, Vol.46, No.6, pp.520-530, 1998.
|
C. Sander, F. Wefers and D. Leckschat, “Scalable binaural synthesis on mobile devices”, the AES 133rd Convention, San Francisco, USA, Paper No.8783, 2012.
|
M.A. Ericson, “Multichannel sound reproduction in the environment for auditory research”, the AES 131st Convention, New York, USA, Paper No.8513, 2011.
|
M. Vorländer, Auralization: Fundamentals of Acoustics, Modelling, Simulation, Algorithms and Acoustic Virtual Reality, Springer-Verlag Berlin Heidelberg, Berlin, Germany, 2008.
|
B.G. Shinn-Cunningham, “Applications of virtual auditory displays”, Proceedings of the 20th International Conference of the IEEE Engineering in Medicine and Biology Society, Hong Kong, China, Vol.20, No.3, pp.1105-1108, 1998.
|
B.U. Seeber, U. Baumann and H. Fastl, “Localization ability with bimodal hearing aids and bilateral cochlear implants”, Journal of the Acoustical Society of America, Vol.116, No.3, pp.1698-1709, 2004.
|
B.S. Xie, The Principle of Spatial Sound, Science Press, Beijing, China, 2019. (in Chinese)
|
J. Blauert, “Modeling binaural processing: What next?”, Journal of the Acoustical Society of America, Vol.132, No.3, p.1911, 2012.
|
F. Rumsey, “Automotive audio: They know where you sit”, Journal of the Audio Engineering Society, Vol.64, No.9, pp.705-708, 2016.
|
F. Rumsey, “Broadcast and streaming: Immersive audio, objects and OTT TV”, Journal of the Audio Engineering Society, Vol.65, No.4, pp.338-341, 2017.
|