This post demonstrates an extremely simple but foundational method in digital audio synthesis born in the US in the late 1970s, the Karplus-Strong algorithm.
We will work towards the following final result.
Basic idea, on the guitar
Let’s first start with the guitar sound. The drum case is a mere variation on the guitar and uses the same general approach.
We start with a set of initial values s_0, …, s_{T-1}. In the plucked string case, we generate T random numbers between -0.5 and 0.5.
We compute the next value with the following rule:
s_T = 0.995 \frac{s_0 + s_1}{2}
To produce s_{T+1}, we similarly combine s_1 and s_2:
And so on. “Sliding” this rule until the attenuation makes the sound die out is the essence of the Karplus-Strong algorithm. We still have some details to fill in though.
What is T ? How do we determine it?
T is going to be the (quasi-)period of our audio signal, i.e even though the attenuation guarantees that our signal isn’t really periodic (because the amplitudes go down as we progress), the different T-samples-long slices will bear quite a bit of resemblance.
T is determined by dividing the sampling rate we want to use to describe our sound, by the frequency of the note we would like to play. The higher the pitch, the shorter the (quasi) period. It is precisely by making audio signals with short periods that we can produce high pitches.
With all these elements in hand, let’s look at an actual audio file.
Audio example
Here is an example of audio we can produce with this method.
Zooming on the beginning reveals more details on the initial randomness and the beginning of the smoothing and attenuation process
Snare…ish
To produce snare-ish sounds, we need to make 2 very simple adjustments.
- instead of generating random numbers for the first T samples, we just fill those T slots with a constant, e.g 0.5
- instead of s_T = 0.995 \frac{s_0 + s_1}{2}, we will do:
s_T = \begin{cases} 0.995 \frac{s_0 + s_1}{2} & \text{with probability} \frac{1}{2} \\ - 0.995 \frac{s_0 + s_1}{2} & \text{with probability} \frac{1}{2} \end{cases}
In other words, instead of applying a deterministic rule to a random initial buffer, we apply a probabilistic rule to a constant buffer.
Audio example
Those simple tweaks allow us to get this sort of sound:
We can very clearly see the initial (constant) buffer here, the vertical rectangle at the very beginning of the signal. And we can see the roughly-even split between up and down occuring as well.
Haskell implementation
My first implementation of this algorithm many years ago was generating audio in PCM format and saving it to some file with a library.
When brushing up my memories of the algorithm for this article, I looked around for new Haskell libraries in the audio space and found two very useful additions.
- proteaaudio: portable (Windows, OSX, Linux) library for loading, playing and saving audio, at a low level.
- lambdasound: higher level library for defining sounds, twisting them, applying effects to them, and of course loading/saving/playing them.
We will be using lambdasound here. (Which uses
proteaaudio
underneath to play the audio)
{-# LANGUAGE DataKinds #-}
import Data.List (unfoldr)
import LambdaSound
import System.Random
import qualified Deque.Strict as DQ
The goal here isn’t efficiency, so we’ll use a simple deque to hold
onto and update our “sliding buffer” of T floats. In order to produce a Sound
from lambdasound
sample-by-sample, I use unfoldrSoundPulse
,
which is exactly like its list counterpart and fits our needs
perfectly.
Here is our implementation of the algorithm, abstracting over the initialization of the first T values and the sliding step.
karplusStrong :: (StdGen -> Maybe (Pulse, StdGen)) -- ^ how to initialize the wavetable
-> (StdGen -> Pulse -> Pulse -> (Pulse, StdGen)) -- ^ "slide step"
-> Int -- ^ sample rate
-> (Hz -> Sound 'I Pulse) -- ^ instrument
=
karplusStrong genPulse f sampleRate freq let waveTable = take waveTableLen $ unfoldr genPulse gen
= DQ.fromConsAndSnocLists waveTable []
deque0 = case DQ.uncons dq of
slide (dq, g) Just (a, as)
| Just a' <- DQ.head as ->
let (new_a, g') = f g a a' in
(a, (DQ.snoc new_a as, g'))-> error "welp"
_ in unfoldrSoundPulse slide (deque0, gen)
where waveTableLen = floor (fromIntegral sampleRate / freq) :: Int
= mkStdGen 142 gen
The sliding function merely wraps f
with a bit of
uncons
ing and snoc
ing, and everything else has
been covered earlier in this post. I use a fixed seed for the random
number generator because I don’t care about subtle variations in the
sound for this post.
We can “instantiate” this instrument family to our guitar and drums and put them to work in a non-trivial way with a few more lines.
main :: IO ()
= do
main let sampleRate = 44100
= 0.995*(a+b)/2
attenuate a b = case randomR (-0.5, 0.5) g of
centeredPulse g -> Just (Pulse a, g')
(a, g')
-- guitar
= karplusStrong centeredPulse (\_gen a a' -> (attenuate a a', _gen)) sampleRate
instr1 -- snare
= karplusStrong (\g -> Just (0.5, g))
instr2 -> case random gen of
(\gen a a' ->
(b, gen') let v = attenuate a a'
in (if b then v else negate v, gen')
) sampleRate
-- guitar sounds
= setDuration 2 $ asNote instr1 a3
sound1 = simpleReverb 0.01 sound1
sound2 = setDuration 2 $ parallel [ asNote instr1 x | x <- [c3, e3, g3] ]
sound3 = simpleReverb 0.01 sound3
sound4
-- drum sound
= setDuration 0.3 $ asNote instr2 a3
sound5
-- demo piece
=
sound6 let gtrLoop =
+1])
[ ([c3, g3], [c4, d4+1, g3], [c4, d4+1])
, ([a2+1, f3], [g4, f4])
, ([g2+1], [f4, d4+1])
, ([f2, d3
]= repeatSound 2 $ sequentially
gtr 4 $ parallel
[ repeatSound -- we play the notes from 'l' in parallel, followed by the
-- notes from 'r' in sequence
0.9 (parallel [ asNote instr1 x | x <- l ])
[ setDuration 0.3 silence >>> sequentially (map (setDuration 0.3 . asNote instr1) r)
, setDuration
]| (l, r) <- gtrLoop
]= repeatSound 48 (setDuration 0.6 sound5)
dr1 = repeatSound 96 (reduce 0.8 sound5)
dr2 = repeatSound 32 (amplify 1.2 $ setDuration 0.75 silence >>> setDuration 0.15 sound5)
dr3 in setDuration 2 silence >>> parallel [gtr, dr1, dr2, dr3]
-- all the sounds from this post
= [sound1, sound2, sound3, sound4, sound5, sound6]
sounds
mapM_ (play sampleRate 1) sounds
sequence_ [ saveWav ("karplus_strong_" ++ show n ++ ".wav") (fromIntegral sampleRate) s | (n, s) <- zip [(1 :: Int)..] sounds ]
sound6
defines the demo piece from the beginning of the
post, using both instruments and a few functions from
lambdasound
. Probably very unidiomatic, and I plan to study
the library more, but for the purpose of this post this will do.
You can find the code, along with a suitable cabal project,
shell.nix
and the audio files, in the following repo: https://github.com/alpmestan/sounds
Conclusion
The paper and many subsequent ones discuss variations on the initial buffer and sliding rule choices. Some links:
- The original paper by Kevin Karplus and Alex Strong
- Extensions of the Karplus-Strong Pucked-String Algorithm for the reader interested in improving the guitar sound
- Digital waveguide synthesis covering the wider class of algorithms that KS belongs to
Of course, there’s a lot more. I am fascinated by virtual instruments that are physically modelled, where the physics producing the sound in the real instrument is simulated to an extremely convincing degree in real-time, as opposed to playing pre-recorded samples or making cheap approximations like here. But this is the first audio synthesis algorithm I ever studied and implemented. It’s so simple!
Posted: