=== Dynamic time warping ===
Dynamic time warping (DTW) is an algorithm for measuring similarity between two temporal sequences that may vary in speed. In general, DTW calculates an optimal match between two given sequences (e.g. time series) subject to certain restrictions and rules. The optimal match is the one that satisfies all the restrictions and rules and has the minimal cost, where the cost is computed as the sum, over all matched pairs of indices, of the absolute differences between their values.
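The cost minimisation described above can be sketched with dynamic programming. This is a minimal illustration, not a reference implementation: it assumes the usual DTW rules (the first elements are matched to each other, the last elements are matched to each other, and matched indices never move backwards), and the function name `dtw_distance` is chosen here for illustration.

```python
def dtw_distance(a, b):
    """Return the minimal total cost of aligning numeric sequences a and b.

    Assumed rules: first elements match, last elements match, and the
    alignment is monotonic. The cost of a matched pair is the absolute
    difference of the two values.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair_cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = pair_cost + min(
                cost[i - 1][j],      # a[i-1] matched to an already-used b[j-1]
                cost[i][j - 1],      # b[j-1] matched to an already-used a[i-1]
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# The second sequence is a slowed-down copy of the first one.
slow_copy_cost = dtw_distance([1, 2, 3], [1, 2, 2, 3])
```

Because one element of the shorter sequence may be matched to several elements of the longer one, a sequence and a time-stretched copy of it can align with zero cost, which is the sense in which DTW tolerates variation in speed.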
=== Hidden Markov models ===
A hidden Markov model can be represented as the simplest dynamic Bayesian network. The goal of the algorithm is to estimate a hidden variable x(t) given a list of observations y(t). By the Markov property, the conditional probability distribution of the hidden variable x(t) at time t, given the values of the hidden variable x at all earlier times, depends only on the value of the hidden variable x(t − 1). Similarly, the value of the observed variable y(t) depends only on the value of the hidden variable x(t) (both at time t).
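One way to estimate x(t) from the observations is forward filtering, which uses exactly the two dependencies described above. The sketch below assumes a discrete model with a finite number of hidden states and observation symbols. The function name `forward_filter`, the parameter layout, and the example probabilities are all illustrative assumptions, not values from any particular speech system.

```python
def forward_filter(initial, transition, emission, observations):
    """Return P(x(t) | y(1), ..., y(t)) for every t as a list of lists.

    initial[i]       : assumed probability that the first hidden state is i
    transition[i][j] : assumed probability of moving from state i to state j
    emission[j][y]   : assumed probability of observing symbol y in state j
    """
    n_states = len(initial)
    beliefs = []
    prior = list(initial)
    for t, y in enumerate(observations):
        if t > 0:
            # Prediction step: x(t) depends only on x(t - 1).
            prior = [
                sum(beliefs[-1][i] * transition[i][j] for i in range(n_states))
                for j in range(n_states)
            ]
        # Update step: y(t) depends only on x(t).
        unnormalised = [prior[j] * emission[j][y] for j in range(n_states)]
        total = sum(unnormalised)  # zero only if y is impossible under the model
        beliefs.append([p / total for p in unnormalised])
    return beliefs


# Hypothetical two-state model with two observation symbols (0 and 1).
example_beliefs = forward_filter(
    initial=[0.5, 0.5],
    transition=[[0.7, 0.3], [0.4, 0.6]],
    emission=[[0.9, 0.1], [0.2, 0.8]],
    observations=[0, 0, 1],
)
```

Each entry of `example_beliefs` is a probability distribution over the hidden states at one time step. Picking the most probable state at each step gives a simple per-frame estimate of x(t).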
=== Artificial neural networks ===
An artificial neural network (ANN) is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal from one artificial neuron to another. An artificial neuron that receives a signal can process it and then signal additional artificial neurons connected to it. In common ANN implementations, the signal at a connection between artificial neurons is a real number, and the output of each artificial neuron is computed by some non-linear function of the sum of its inputs.
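The computation performed by a single artificial neuron can be sketched as follows. The text above only specifies "some non-linear function of the sum of its inputs". The per-connection weights, the bias term, and the choice of the logistic sigmoid as the non-linear function are common conventions assumed here for illustration, and the function names are hypothetical.

```python
import math


def neuron_output(inputs, weights, bias=0.0):
    """Output of one artificial neuron for real-valued input signals.

    Assumptions: each connection scales its signal by a weight, a bias is
    added to the sum, and the non-linear function is the logistic sigmoid.
    """
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))


def layer_output(inputs, weight_rows, biases):
    """Outputs of several neurons that all receive the same input signals."""
    return [
        neuron_output(inputs, weights, bias)
        for weights, bias in zip(weight_rows, biases)
    ]


# Two input signals feeding a hypothetical layer of three neurons.
hidden = layer_output(
    [0.5, -1.0],
    weight_rows=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    biases=[0.0, 0.0, 0.0],
)
```

Feeding the outputs of one layer into `layer_output` again, with a different set of weights, corresponds to neurons signalling the further neurons connected to them.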
=== Phase-aware processing ===
Phase is often assumed to be random, but it contains useful information. Phase wrapping can be introduced by periodic jumps of 2\pi. After phase unwrapping (see Chapter 2.3; Instantaneous phase and frequency), the phase can be expressed as \phi(h,l) = \phi_{lin}(h,l) + \Psi(h,l), where \phi_{lin}(h,l) = \omega_0(l') \Delta t is the linear phase (\Delta t is the temporal shift at each frame of analysis) and \Psi(h,l) is the phase contribution of the vocal tract and the phase source. Phase-aware methods make use of the phase and its derivatives with respect to time (instantaneous frequency) and frequency (group delay), as well as smoothing of the phase across frequency. Joint amplitude and phase estimators can recover speech more accurately, based on the assumption of a von Mises distribution of the phase.
== Applications ==