Saturday, October 19, 2:30 pm — 4:15 pm (1E10)
Chair:
Dave Moffat, Queen Mary University of London - London, UK
EB7-1 Realistic Procedural Sound Synthesis of Bird Song Using Particle Swarm Optimization—Jorge Zúñiga, Queen Mary University of London - London, UK; Joshua D. Reiss, Queen Mary University of London - London, UK
We present a synthesis algorithm for approximating bird song using particle swarm optimization to match real bird recordings. Frequency and amplitude envelope curves are first extracted from a bird recording. Further analysis identifies the presence of even and odd harmonics. A particle swarm algorithm is then used to find cubic Bezier curves which emulate the envelopes. These curves are applied to modulate a sine oscillator and its harmonics. The synthesized syllable can then be repeated to generate the sound. Thirty-six bird sounds have been emulated this way, and a real-time web-based demonstrator is available, with user control of all parameters. Objective evaluation showed that the synthesized bird sounds captured most audio features of the recordings.
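The sketch below is a hypothetical illustration of the synthesis stage described above, not the authors' implementation: cubic Bezier frequency and amplitude envelopes (control points such as those a particle swarm would find) drive a sine oscillator with harmonics to render one repeatable syllable. All control-point values, harmonic gains, and the names bezier and synth_syllable are assumptions for demonstration.

```python
import numpy as np

def bezier(p0, p1, p2, p3, n):
    """Sample a cubic Bezier curve (scalar control points) at n points."""
    t = np.linspace(0.0, 1.0, n)
    return ((1 - t)**3 * p0 + 3 * (1 - t)**2 * t * p1
            + 3 * (1 - t) * t**2 * p2 + t**3 * p3)

def synth_syllable(freq_cp, amp_cp, harmonics=(1.0, 0.3, 0.1),
                   duration=0.25, sr=44100):
    """Render one bird-song syllable from Bezier envelope control points."""
    n = int(duration * sr)
    f = bezier(*freq_cp, n)                # instantaneous frequency (Hz)
    a = bezier(*amp_cp, n)                 # amplitude envelope (0..1)
    phase = 2 * np.pi * np.cumsum(f) / sr  # integrate frequency -> phase
    out = np.zeros(n)
    for k, gain in enumerate(harmonics, start=1):
        out += gain * np.sin(k * phase)    # fundamental plus harmonic stack
    return a * out

# The synthesized syllable can be repeated to generate the song:
syllable = synth_syllable(freq_cp=(2500, 4200, 3800, 2200),
                          amp_cp=(0.0, 1.0, 0.8, 0.0))
song = np.tile(syllable, 5)
```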
EB7-2 Multi-Scale Auralization for Multimedia Analytical Feature Interaction—Nguyen Le Thanh Nguyen, University of Miami - Coral Gables, FL, USA; Hyunhwan Lee, University of Miami - Coral Gables, FL, USA; Joseph Johnson, University of Miami - Coral Gables, FL, USA; Mitsunori Ogihara, University of Miami - Coral Gables, FL, USA; Gang Ren, University of Miami - Coral Gables, FL, USA; James W. Beauchamp, Univ. of Illinois at Urbana-Champaign - Urbana, IL, USA
Modern human-computer interaction systems use multiple perceptual dimensions to improve the user's situational awareness and thereby enhance intuition and efficiency. A signal processing and interaction framework is proposed for auralizing signal patterns and augmenting the visualization-focused tasks of social media content analysis and annotation, with the goal of assisting the user in analyzing, retrieving, and organizing relevant information for marketing research. Within this auralization framework, audio signals are generated from video/audio signal patterns, for example by frequency-modulating a tone so that it follows the magnitude contour of video color saturation. The integration of visual and aural presentations benefits user interaction by reducing fatigue and sharpening the users' sensitivity, thereby improving work efficiency, confidence, and satisfaction.
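As an illustration of one such mapping, the sketch below (hypothetical; the function auralize_saturation and the frequency range are assumptions, not the proposed framework's code) frequency-modulates a tone so that its pitch follows a per-frame color-saturation contour.

```python
import numpy as np

def auralize_saturation(saturation, sr=44100, duration=5.0,
                        f_low=200.0, f_high=2000.0):
    """Map a per-frame saturation contour (values 0..1) to an FM audio signal."""
    n = int(sr * duration)
    # Resample the frame-rate contour to audio rate.
    contour = np.interp(np.linspace(0, 1, n),
                        np.linspace(0, 1, len(saturation)), saturation)
    freq = f_low + (f_high - f_low) * contour      # saturation -> frequency
    phase = 2 * np.pi * np.cumsum(freq) / sr       # integrate to phase
    return 0.5 * np.sin(phase)

# Example: a contour that rises then falls produces a rising then falling tone.
sat = np.concatenate([np.linspace(0, 1, 100), np.linspace(1, 0.2, 100)])
audio = auralize_saturation(sat)
```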
EB7-3 Perceptually Motivated Hearing Loss Simulation for Audio Mixing Reference—Angeliki Mourgela, Queen Mary University of London - London, UK; Trevor Agus, Queen's University Belfast - Belfast, UK; Joshua D. Reiss, Queen Mary University of London - London, UK
This paper proposes the development of a hearing loss simulation for use in audio mix referencing, designed according to findings from psychoacoustic and audiology research. The simulation aims to reproduce four perceptual aspects of hearing loss: threshold elevation, loss of dynamic range, reduced frequency resolution, and reduced temporal resolution, while providing audio input/output functionality.
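A minimal sketch of one of these aspects, assuming a simple filter-bank implementation of threshold elevation only; the band edges, loss values, and function name are illustrative assumptions, not the simulation proposed in the paper.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def simulate_threshold_elevation(x, sr, band_edges_hz, loss_db):
    """Attenuate each frequency band by the hearing-loss amount for that band."""
    out = np.zeros_like(x)
    for (lo, hi), loss in zip(band_edges_hz, loss_db):
        sos = butter(4, [lo, hi], btype="bandpass", fs=sr, output="sos")
        out += sosfilt(sos, x) * 10 ** (-loss / 20.0)  # dB loss -> linear gain
    return out

# Example: an audiogram-like high-frequency loss applied to a stand-in mix.
sr = 44100
x = np.random.randn(sr)                 # stand-in for an audio mix
bands = [(125, 500), (500, 2000), (2000, 8000)]
loss = [10, 25, 50]                     # dB of threshold elevation per band
y = simulate_threshold_elevation(x, sr, bands, loss)
```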
EB7-4 Modeling between Partial Components for Musical Timbre Imitation and Migration—Angela C. Kihiko, Spelman College - Atlanta, GA, USA; Mitsunori Ogihara, University of Miami - Coral Gables, FL, USA; Gang Ren, University of Miami - Coral Gables, FL, USA; James W. Beauchamp, Univ. of Illinois at Urbana-Champaign - Urbana, IL, USA
Most musical sounds have strong, regularly distributed spectral components such as harmonic partials. However, the low-energy signal patterns that lie between any two such partials, for example performance articulations or instrument signatures, are also important for characterizing musical sounds. This paper presents a timbre-modeling framework for detecting and modeling these between-partial components for musical timbre analysis and synthesis. The framework focuses on timbre imitation and migration for electronic music instruments, where timbral patterns obtained from acoustic instruments are re-interpreted for electronic instruments and new music interfaces. The proposed framework will help musicians and audio engineers better explore musical timbre and performance expression, enhancing the naturalness, expressiveness, and creativity of electronic/computer music systems.
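As a hypothetical illustration of the distinction between partial and between-partial energy (not the paper's model), the sketch below masks FFT bins near integer multiples of an assumed fundamental and keeps the remaining bins as the between-partial residual; the bin width and function name are assumptions.

```python
import numpy as np

def split_partials(frame, sr, f0, width_hz=20.0):
    """Return (partial_spectrum, between_partial_spectrum) for one frame."""
    spec = np.fft.rfft(frame * np.hanning(len(frame)))
    freqs = np.fft.rfftfreq(len(frame), 1.0 / sr)
    mask = np.zeros_like(freqs, dtype=bool)
    for k in range(1, int(freqs[-1] // f0) + 1):
        mask |= np.abs(freqs - k * f0) < width_hz   # bins near each harmonic
    return spec * mask, spec * ~mask

# Example on a synthetic tone with low-level noise between the partials.
sr, f0 = 44100, 220.0
t = np.arange(2048) / sr
frame = sum(np.sin(2 * np.pi * k * f0 * t) / k for k in range(1, 6))
frame = frame + 0.01 * np.random.randn(2048)
partials, residual = split_partials(frame, sr, f0)
```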
EB7-5 Coherence as an Indicator of Distortion for Wide-Band Audio Signals such as M-Noise and Music—Merlijn van Veen, Meyer Sound Laboratories - Berkeley, CA, USA; Roger Schwenke, Meyer Sound Laboratories - Berkeley, CA, USA
M-Noise is a new scientifically derived test signal whose crest factor as a function of frequency is modeled after real music. M-Noise should be used with a complementary procedure for determining a loudspeaker’s maximum linear SPL. The M-Noise Procedure contains criteria for the maximum allowable change in coherence as well as frequency response. When the loudspeaker and microphone are positioned as prescribed by the procedure, reductions in coherence are expected to be caused by distortion. Although higher precision methods for measuring distortion exist, coherence has the advantage that it can be calculated for wide-band signals such as M-Noise as well as music. Examples will demonstrate the perceived audio quality associated with different amounts of distortion-induced coherence loss.
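A minimal sketch of the underlying measurement, magnitude-squared coherence gamma^2(f) = |Sxy(f)|^2 / (Sxx(f) * Syy(f)) computed with Welch-averaged spectra: coherence between the drive signal and the microphone signal drops in bands where nonlinearities add uncorrelated energy. The soft-clipping "loudspeaker," the signal lengths, and the 0.95 limit are illustrative assumptions, not the M-Noise Procedure's criteria.

```python
import numpy as np
from scipy.signal import coherence

sr = 48000
x = np.random.randn(10 * sr)      # stand-in for a wide-band test signal
y = np.tanh(2.0 * x)              # stand-in loudspeaker: soft clipping adds distortion
f, gamma2 = coherence(x, y, fs=sr, nperseg=4096)

# Flag bands where coherence falls below an (illustrative) allowable limit.
low_coherence_bands = f[gamma2 < 0.95]
```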
EB7-6 Fast Time Domain Stereo Audio Source Separation Using Fractional Delay Filters—Oleg Golokolenko, TU Ilmenau - Ilmenau, Germany; Gerald Schuller, Ilmenau University of Technology - Ilmenau, Germany; Fraunhofer Institute for Digital Media Technology (IDMT) - Ilmenau, Germany
Our goal is a system for separating two speakers during teleconferencing or for hearing aids. To be useful in real time, it must work online with as low a delay as possible. The proposed approach works in the time domain, using attenuation factors and fractional delays between microphone signals to minimize cross-talk, following the principle of a fractional delay-and-sum beamformer. Compared to other approaches, this has the advantages of lower computational complexity, no system delay, and none of the musical noise that frequency-domain algorithms introduce. We evaluate our approach on convolutive mixtures generated from speech signals taken from the TIMIT dataset using a room impulse response simulator.
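The sketch below illustrates the time-domain cross-talk cancellation principle (attenuation factors, delays, and filter length are illustrative assumptions, not the paper's optimized parameters): each output subtracts an attenuated, fractionally delayed copy of the other microphone signal.

```python
import numpy as np

def fractional_delay_fir(delay, n_taps=31):
    """Windowed-sinc FIR approximating a (possibly non-integer) sample delay."""
    n = np.arange(n_taps)
    h = np.sinc(n - (n_taps - 1) / 2 - delay) * np.hamming(n_taps)
    return h / np.sum(h)

def unmix(mic1, mic2, a12, d12, a21, d21):
    """Subtract attenuated, fractionally delayed cross-talk estimates."""
    out1 = mic1 - a12 * np.convolve(mic2, fractional_delay_fir(d12), mode="same")
    out2 = mic2 - a21 * np.convolve(mic1, fractional_delay_fir(d21), mode="same")
    return out1, out2

# Example with a synthetic two-speaker mixture (illustrative parameters).
sr = 16000
s1, s2 = np.random.randn(sr), np.random.randn(sr)
mic1 = s1 + 0.6 * np.roll(s2, 3)    # speaker 2 leaks into microphone 1
mic2 = s2 + 0.6 * np.roll(s1, 3)    # speaker 1 leaks into microphone 2
est1, est2 = unmix(mic1, mic2, a12=0.6, d12=3.0, a21=0.6, d21=3.0)
```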
EB7-7 Line Array Optimization through Innovative Multichannel Filtering—Paolo Martignon, Contralto Audio srl - Casoli, Italy; Mario Di Cola, Contralto Audio srl - Casoli, Italy; Letizia Chisari, Contralto Audio srl - Casoli, Italy
Element-dependent filtering offers the possibility to optimize the sound coverage of vertical line arrays: distance-dependent frequency response, as well as mid-low frequency beaming and air absorption, can be partially compensated. Simulations of the array elements' contributions to the venue acoustics normally provide the input data for filter calculation, but some real-world phenomena are hardly addressed by simulations: for example, the spread of transducer responses and the differing atmospheric conditions along the acoustic paths of the individual array elements. This awareness led us to develop an algorithm designed to be robust against these inaccuracies.
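A hypothetical per-element filter design in this spirit (the simplified air-absorption model, distances, and firwin2-based design are assumptions for illustration, not the proposed optimization algorithm): each array element receives a high-frequency boost that grows with its path length to roughly offset air absorption toward its target zone.

```python
import numpy as np
from scipy.signal import firwin2

def element_filter(distance_m, n_taps=257, sr=48000,
                   air_db_per_m_at_10k=0.15):
    """Design a linear-phase FIR whose response rises with frequency to offset
    distance-proportional air absorption (simple frequency-squared model)."""
    freqs = np.linspace(0, sr / 2, 64)
    # Assumed absorption grows roughly with frequency squared, scaled to the
    # chosen value at 10 kHz and multiplied by the path length.
    loss_db = air_db_per_m_at_10k * (freqs / 10000.0) ** 2 * distance_m
    gains = 10 ** (np.clip(loss_db, 0, 12) / 20.0)   # cap the boost at 12 dB
    return firwin2(n_taps, freqs, gains, fs=sr)

# One filter per array element, aimed at zones of increasing distance.
filters = [element_filter(d) for d in (8.0, 15.0, 25.0, 40.0)]
```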