This is a spectrogram of the spoken syllables "ann-all-ack", repeated three times.
The input signal is fed into a bank of second-order band-pass filters with center frequencies ranging from 300 Hz to 3600 Hz on a logarithmic (or mel) scale. The Q-factor is chosen high enough that the peaks of the amplitude responses start to separate from each other. (I know that it's definitely not state of the art, but then, nothing on these pages is.) The pixel values indicate, on a logarithmic scale, the energy stored in each band-pass filter, which is a quadratic form in the filter's two state variables. Many artifacts in the "spectrogram", e.g. the trumpet shapes that open to the left, are due to ringing in the filters; they are not visible in a true spectrogram. A sine wave that is suddenly switched on excites a broad range of filters, but only filters with the correct center frequency will store significant amounts of energy. These correspond to the bright horizontal lines.
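To make the spacing of the filter bank concrete, here is a minimal sketch that computes center frequencies with a constant ratio between neighbours. The function name log_spaced_centers is made up for this illustration, the count of 64 filters is an assumption taken from the dimension of the output vector mentioned further down, and pure geometric spacing is only one reading of "logarithmic (or mel)"; a true mel scale would differ somewhat at the low end.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: n center frequencies from f_lo to f_hi (in Hz),
// spaced so that neighbouring filters differ by a constant frequency ratio.
std::vector<double> log_spaced_centers(double f_lo, double f_hi, std::size_t n)
{
    assert(n >= 2 && f_lo > 0.0 && f_hi > f_lo);
    std::vector<double> centers(n);
    for (std::size_t i = 0; i < n; ++i) {
        // t runs from 0 (first filter) to 1 (last filter)
        const double t = static_cast<double>(i) / static_cast<double>(n - 1);
        centers[i] = f_lo * std::pow(f_hi / f_lo, t);
    }
    return centers;
}
```

Calling log_spaced_centers(300.0, 3600.0, 64) would give 64 frequencies, each a fixed factor above the previous one, that could be handed one by one to the init function of the filter class below.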
All in all, there is an astonishing level of detail, both temporal and spectral. If we want to distinguish difficult consonants in spoken language, we will need them both. Some sounds differ only in small details, yet broad variations may be completely insignificant. Is this any different from our ability to recognize each and every A?

The bank of band-pass filters is a crude model of cochlear signal processing. The inner ear achieves its high frequency resolution by active filters. The wave that travels through the cochlea is amplified by tiny biomechanical amplifiers (outer hair cells) that compensate for energy loss. The result is a large increase in sensitivity and frequency resolution.

Connecting to a pattern engine

The output of the filter bank is a time-varying vector in R^64, but pattern engines need a bunch of discretization maps into small, discrete address spaces. An often overlooked map of this kind is given by cmp: R^2 ⟶ {0,1}, cmp(x,y) = 1 iff x > y. A comparator attached between the output of a filter and a time-delayed output of another filter gives a map from R^64 × [0,∞[ to {0,1} × [0,∞[. As time passes, the spectrogram scrolls to the left and the output toggles between 0 and 1 in a complicated manner. The output space {0,1} is, of course, far too small to be useful. If we use 16 to 24 comparators, we get a map h_1 into {0,1}^16 (or up to {0,1}^24). This is the space of small bitmaps again. Different delays and different filter outputs will give different maps h_1 … h_n. No sane engineer will consider these wildly nonlinear maps as part of an audio processing system, but a pattern engine does not require much from its input functions. Each h_i will stay constant for short amounts of time, then it will change again. Because the target spaces X_i of these maps are small, however, it is quite likely that a value of h_i will repeat. If this happens, we know that a certain pattern of spectral and temporal variation has occurred again. A pattern engine will combine the locally constant functions h_i into a new function that is constant on much larger domains.
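One way to picture such a map is the following sketch. It is not taken from the original implementation: the names Comparator, address and history are hypothetical, and it assumes that the filter-bank outputs are stored frame by frame, so that delays are measured in whole frames rather than in continuous time. Each comparator contributes one bit, and 16 of them are packed into one 16-bit address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One comparator: compares the current output of filter `a` with the
// output of filter `b` as it was `delay` frames ago (assumed layout).
struct Comparator {
    std::size_t a;
    std::size_t b;
    std::size_t delay;
};

// Hypothetical sketch of one map h: filter-bank history -> {0,1}^16.
// history[t] holds the filter outputs at frame t; `now` is the current frame.
std::uint16_t address(const std::vector<std::vector<double>>& history,
                      std::size_t now,
                      const std::vector<Comparator>& taps)
{
    assert(taps.size() <= 16 && now < history.size());
    std::uint16_t bits = 0;
    for (std::size_t k = 0; k < taps.size(); ++k) {
        const Comparator& c = taps[k];
        assert(c.delay <= now);
        const double x = history[now][c.a];
        const double y = history[now - c.delay][c.b];
        if (x > y)   // cmp(x, y) = 1 iff x > y
            bits = static_cast<std::uint16_t>(bits | (1u << k));
    }
    return bits;
}
```

Evaluating address for every frame gives a sequence of 16-bit values that stays constant for a while and then jumps. A different list of taps gives a different map, which is all that is meant by h_1 … h_n here.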

Implementation of Band-Pass Filters

/* See, for example, the book "Musical Applications of Microprocessors" by Hal Chamberlin,
 * published in 1985. It definitely shows its age, but I've enjoyed reading it.
 */
#include <cmath>   // sin, M_PI (some compilers need _USE_MATH_DEFINES for M_PI)

class Filter
{
    double d1, d2, f, q ;   // d1: band-pass state, d2: low-pass state
public:
    Filter(void) { d1 = d2 = 0 ; f = q = 0 ; }
    void init( double center_frequency, double Q, double sample_frequency )
    // set up center frequency and Q-factor
    {
        f  = 2 * sin( M_PI * center_frequency / sample_frequency ) ;
        q  = Q ;
        d1 = d2 = 0 ;
    }
    // Each of the following calls advances the filter by one sample,
    // so call only one of them per input sample.
    double lpf( double x ) // return a low-pass filtered sample
    {
        d2 += f * d1 ;
        d1 += f * ( x - d2 - d1 / q ) ;
        return d2 ;
    }
    double bpf( double x ) // return a band-pass filtered sample
    {
        d2 += f * d1 ;
        d1 += f * ( x - d2 - d1 / q ) ;
        return d1 ;
    }
    double hamilton( double x ) // return the stored energy, a quadratic form in the state variables
    {
        d2 += f * d1 ;
        d1 += f * ( x - d2 - d1 / q ) ;
        return d1*d1 + f*d1*d2 + d2*d2 ;
    }
} ;
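As a quick check of the claim that only filters with the matching center frequency store much energy, here is a small self-contained sketch. It repeats the same update step inline instead of using the class, and the function name energy_after_tone as well as the numbers in the usage note are assumptions chosen for illustration only.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: feed n samples of a unit sine at `tone` Hz into one
// band-pass stage tuned to `center` Hz and return the stored energy
// (the same quadratic form as above) after the last sample.
double energy_after_tone(double center, double Q, double fs, double tone, int n)
{
    assert(Q > 0.0 && fs > 0.0 && center > 0.0 && center < fs / 2.0 && n > 0);
    const double pi = std::acos(-1.0);
    const double f  = 2.0 * std::sin(pi * center / fs);
    double d1 = 0.0, d2 = 0.0;   // band-pass and low-pass state
    for (int i = 0; i < n; ++i) {
        const double x = std::sin(2.0 * pi * tone * i / fs);
        d2 += f * d1;
        d1 += f * (x - d2 - d1 / Q);
    }
    return d1 * d1 + f * d1 * d2 + d2 * d2;
}
```

With, say, a 1000 Hz filter, Q = 20 and a 16 kHz sample rate, comparing energy_after_tone(1000, 20, 16000, 1000, 4000) with energy_after_tone(1000, 20, 16000, 2000, 4000) should show the on-frequency tone leaving far more energy in the filter than the tone an octave higher.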
 Date Posted: 31 Aug 2009 @ 07 21 PM
Last Modified: 04 Sep 2009 @ 06 06 PM
Posted By: Hardy
