The Karplus-Strong Algorithm

In 1983, Alex Strong and Kevin Karplus published a simple but effective algorithm for synthesizing the sound of a plucked string.

Pick the period \(N\). Then:

  1. The first \(N\) outputs \(y[0], \ldots, y[N-1]\) are random.

  2. For \(n \ge N\), output \(y[n] = (y[n - N] + y[n - (N+1)])/2\). (By convention \(y[-1] = 0\).)

If played at the sampling rate \(f_s\), this sequence sounds like a string plucked at frequency \(f_s / (N+1/2)\).
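The two steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `karplus_strong`, the `seed` parameter, and the choice of uniform noise in \([-1, 1]\) are our assumptions, since the algorithm only asks for random initial samples.

```python
import random

def karplus_strong(N, num_samples, seed=None):
    """Return num_samples outputs of the basic recurrence with period N."""
    rng = random.Random(seed)
    # Step 1: the first N outputs are random (uniform in [-1, 1] is an assumption).
    y = [rng.uniform(-1.0, 1.0) for _ in range(min(N, num_samples))]
    # Step 2: each later output averages the samples N and N+1 steps back.
    for n in range(N, num_samples):
        older = y[n - N - 1] if n - N - 1 >= 0 else 0.0  # convention: y[-1] = 0
        y.append((y[n - N] + older) / 2)
    return y
```

For example, with \(f_s = 44100\) and \(N = 100\), `karplus_strong(100, 44100)` gives one second of a tone at roughly \(44100 / 100.5 \approx 438.8\) Hz.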

Explanation

The Karplus-Strong algorithm is an example of digital waveguide synthesis, in which an instrument is physically modeled and simulated. In this case, the random samples crudely represent the initial pluck: each part of the string is in a random position moving at a random velocity.

The delay and feedback cause the waveform to repeat itself, oscillating as a string would. If we just had \(y[n] = y[n-N]\), we would have a waveform that repeats with frequency \(f_s / N\).

Instead, taking the average of two consecutive samples acts as a one-zero low-pass filter, mimicking the damping of a real string as it vibrates. Higher-frequency oscillations lose energy more quickly than lower-frequency oscillations.

The filter \(y[n] = (x[n] + x[n-1])/2\) has the transfer function \(H(z) = (1 + z^{-1})/2\). When \(z = e^{2 i a}\), this is

\[ e^{-i a} (e^{i a} + e^{-i a})/2 = e^{-i a} \cos a \]

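Thus an input \(e^{2 i a n}\) comes out as \(\cos a \cdot e^{2 i a (n - 1/2)}\): it is scaled by \(\cos a\) and delayed by half a sample. Together with the \(N\)-sample delay, the loop has a total delay of \(N + 1/2\) samples, explaining why we divide the sampling frequency by \(N+1/2\) to arrive at the frequency of the plucked string.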
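We can check the gain and the half-sample delay numerically. The sketch below evaluates \(H(z) = (1 + z^{-1})/2\) at \(z = e^{2 i a}\); the helper name `averager_response` and the test value \(a = 0.3\) are arbitrary choices of ours.

```python
import cmath
import math

def averager_response(a):
    """Evaluate H(z) = (1 + z^-1)/2 at z = e^{2ia}."""
    z = cmath.exp(2j * a)
    return (1 + 1 / z) / 2

a = 0.3                             # arbitrary test value with 0 < a < pi/2
H = averager_response(a)
gain = abs(H)                       # should agree with cos(a)
delay = -cmath.phase(H) / (2 * a)   # phase delay in samples, should be 1/2
```

Here the phase of \(H\) is \(-a\) while the input advances by \(2a\) radians per sample, so the phase delay is \(a / 2a = 1/2\) sample regardless of frequency.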

Extensions

Although the basic algorithm produces surprisingly good results, we can do better.

At higher frequencies, restricting \(N\) to an integer is too crude: the achievable pitches \(f_s / (N+1/2)\) are spaced too far apart. We can correct for the error by introducing an allpass filter in the loop: \(y[n] = C x[n] + x[n-1] - C y[n-1]\).
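As a rough sketch of how the tuning might be split up, suppose we want the loop delay to be \(f_s / f\) samples: \(N\) from the delay line, \(1/2\) from the averager, and the remaining fraction \(d\) from the allpass filter. The code below assumes the low-frequency approximation \(C = (1 - d)/(1 + d)\) for the allpass phase delay, which is not stated in the text above, and the names `tuning_parameters` and `allpass` are hypothetical.

```python
import math

def tuning_parameters(freq, fs):
    """Split the loop delay fs/freq into N + 1/2 + d, and pick C for delay d."""
    total = fs / freq                  # desired loop delay in samples
    N = int(math.floor(total - 0.5))   # integer delay-line length
    d = total - N - 0.5                # leftover fractional delay, 0 <= d < 1
    C = (1 - d) / (1 + d)              # assumption: low-frequency approximation
    return N, d, C

def allpass(x, C):
    """Apply y[n] = C x[n] + x[n-1] - C y[n-1], assuming zero initial state."""
    y = []
    prev_x = prev_y = 0.0
    for xn in x:
        yn = C * xn + prev_x - C * prev_y
        y.append(yn)
        prev_x, prev_y = xn, yn
    return y
```

For \(f = 1000\) and \(f_s = 44100\) this gives \(N = 43\), \(d \approx 0.6\), and \(C \approx 0.25\). When \(d\) is close to 0, \(C\) is close to 1 and the filter's pole sits near \(z = -1\), so a practical implementation may prefer to shift \(d\) into a safer range.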

At lower frequencies, the sound decays too slowly. We can shorten the decay by introducing a loss factor \(\rho \lt 1\) and setting \(y[n] = \rho (y[n - N] + y[n - (N+1)]) / 2\).

At higher frequencies, we have the opposite problem. We can stretch the decay by weighting the average. Pick some \(0 \lt S \lt 1\) and set \(y[n] = (1-S) y[n - N] + S y[n - (N+1)] \). This changes the phase delay; see Jaffe and Smith for the exact formula (or derive it yourself).
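Both decay adjustments fit into one small change to the recurrence. The sketch below is illustrative only: combining \(\rho\) and \(S\) in a single function is our choice, and the name `pluck` and its defaults are hypothetical. With \(S = 1/2\) it reduces to the loss-factor formula, and with \(\rho = 1\) to the weighted average.

```python
import random

def pluck(N, num_samples, rho=1.0, S=0.5, seed=None):
    """Karplus-Strong with loss factor rho and averaging weight S."""
    rng = random.Random(seed)
    # Random initial samples (uniform in [-1, 1] is an assumption).
    y = [rng.uniform(-1.0, 1.0) for _ in range(min(N, num_samples))]
    for n in range(N, num_samples):
        older = y[n - N - 1] if n - N - 1 >= 0 else 0.0  # convention: y[-1] = 0
        # rho < 1 shortens the decay; S away from 1/2 stretches it.
        y.append(rho * ((1 - S) * y[n - N] + S * older))
    return y
```

For instance, `pluck(400, 44100, rho=0.99)` dies away faster than the basic algorithm with the same `N`, while `pluck(20, 44100, S=0.1)` rings longer.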

When a real string is plucked harder, the waveform contains more high-frequency components. Thus by putting the output through an appropriate low-pass filter we can change the perceived loudness of the output. One possible dynamics filter is \(y[n] = (1 - R)x[n] + R y[n-1]\) for some \(0 \lt R \lt 1\) that depends on the frequency and desired loudness.
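A minimal sketch of this dynamics filter follows. The function name `dynamics_filter` and the zero initial state \(y[-1] = 0\) are our assumptions; choosing \(R\) from the frequency and loudness is left open, as in the text.

```python
def dynamics_filter(x, R):
    """Apply y[n] = (1 - R) x[n] + R y[n-1], assuming y[-1] = 0."""
    y = []
    prev = 0.0
    for xn in x:
        prev = (1 - R) * xn + R * prev
        y.append(prev)
    return y
```

The filter passes a constant input unchanged in the long run, while an input alternating between \(+1\) and \(-1\) (the highest representable frequency) is scaled by \((1 - R)/(1 + R)\). A larger \(R\) therefore removes more of the high frequencies, giving a softer-sounding pluck.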

To simulate string muting, we can introduce a loss factor when a note ends.

Slurs can be simulated by using a new value of N on the fly. Similarly, glissandi can be simulated by changing N gradually.

References


Ben Lynn blynn@cs.stanford.edu