2. Background to Pattern Matching and Alignment via DTW

2.1. DTW in Speech Recognition

What is Dynamic Time Warping?

Wikipedia Entry:

“In time series analysis, dynamic time warping (DTW) is an algorithm for measuring similarity between two temporal sequences, which may vary in speed”

Dynamic Time Warping was first developed as a tool for speech pattern recognition and matching. It uses dynamic programming to find the alignment between two time series and/or depth series that minimizes an accumulated measure of the differences between the curves. That minimum accumulated difference also quantifies how similar the two series are.
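To make the dynamic programming idea concrete, here is a minimal sketch in Python. It is not the formulation used in any of the papers cited in this section: the function name `dtw` is hypothetical, the local difference metric is assumed to be the absolute difference between samples, and only the three basic step moves are allowed, without the path constraints that practical implementations often add.

```python
import math


def dtw(x, y):
    """Basic DTW for two 1-D sequences; returns (total_cost, path)."""
    n, m = len(x), len(y)

    # acc[i][j] = minimum accumulated cost of aligning x[:i] with y[:j]
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])  # assumed local difference metric
            acc[i][j] = cost + min(
                acc[i - 1][j - 1],  # both series advance
                acc[i - 1][j],      # x advances, y sample is reused
                acc[i][j - 1],      # y advances, x sample is reused
            )

    # Backtrack from the end of both series to recover the mapping path
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return acc[n][m], path


cost, path = dtw([0, 1, 2, 3], [0, 1, 1, 2, 3])
```

In this example the second series repeats one sample, so the total cost is zero and the path maps `x[1]` to both `y[1]` and `y[2]`. The list of index pairs in `path` is the kind of mapping that a warping-path figure plots.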

A typical mapping figure:

_images/ItakuraFig1.pdf

Fig. 2.1 Mapping path from Itakura, 1975 (their Fig. 1). The resemblance to a depth-time or depth-depth geological mapping may be apparent.

The first publications introducing this technique were Vintsyuk [1], Itakura [2], and Sakoe and Chiba [3].

2.1.1. Section Bibliography

[1] T. K. Vintsyuk. Speech discrimination by dynamic programming. Cybernetics, 4(1):52–57, 1968. URL: https://link.springer.com/content/pdf/10.1007/BF01074755.pdf, doi:10.1007/BF01074755.

[2] F. Itakura. Minimum prediction residual principle applied to speech recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, ASSP-23(1):67–72, 1975. URL: https://www.ee.columbia.edu/~dpwe/papers/Itak75-lpcasr.pdf, doi:10.1109/TASSP.1975.1162641.

[3] H. Sakoe and S. Chiba. Dynamic programming algorithm optimization for spoken word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, ASSP-26(1):43–49, 1978. URL: https://www.irit.fr/~Julien.Pinquier/Docs/TP_MABS/res/dtw-sakoe-chiba78.pdf, doi:10.1109/TASSP.1978.1163055.