Cheap guitars and drums in Haskell

This post demonstrates the Karplus-Strong algorithm, an extremely simple but foundational digital audio synthesis method born in the US in the late 1970s.

We will work towards the following final result.

Basic idea, on the guitar

Let’s first start with the guitar sound. The drum case is a mere variation on the guitar and uses the same general approach.

[Diagram: the initial buffer s[0], s[1], s[2], …, s[T-2], s[T-1]]

We start with a set of initial values s_0, …, s_{T-1}. In the plucked string case, we generate T random numbers between -0.5 and 0.5.

We compute the next value with the following rule:

s_T = 0.995 \frac{s_0 + s_1}{2}

[Diagram: s[T] computed from s[0] and s[1] ("average and attenuate") and appended after s[T-1]]

To produce s_{T+1}, we similarly combine s_1 and s_2:

[Diagram: s[T+1] computed from s[1] and s[2] ("average and attenuate") and appended after s[T]]

And so on. “Sliding” this rule until the attenuation makes the sound die out is the essence of the Karplus-Strong algorithm. We still have some details to fill in though.
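If you like seeing things in code early, here is a minimal, library-free sketch of this sliding process on a plain Haskell list (ksSamples is just an illustrative name, not part of the final implementation):

-- Lazy, self-referential list: the first T samples are the initial
-- buffer, and every later sample is the attenuated average of the
-- pair of samples one quasi-period earlier.
ksSamples :: [Float] -> [Float]
ksSamples initial = out
  where
    out = initial ++ zipWith step out (tail out)
    step a a' = 0.995 * (a + a') / 2

Taking take n (ksSamples buf) for increasing n shows the signal slowly dying out as the attenuation does its work.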

What is T? How do we determine it?

T is going to be the (quasi-)period of our audio signal: even though the attenuation guarantees that our signal isn't truly periodic (the amplitudes decrease as we progress), the different T-samples-long slices bear quite a bit of resemblance to one another.

T is determined by dividing the sampling rate we use to describe our sound by the frequency of the note we would like to play. The higher the pitch, the shorter the (quasi-)period: it is precisely by producing audio signals with short periods that we get high pitches.
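To make this concrete with the 44.1 kHz sample rate used later in this post:

-- Quasi-period length in samples: sample rate divided by frequency.
-- floor (44100 / 220) == 200  -- 220 Hz (the A below middle C)
-- floor (44100 / 880) == 50   -- 880 Hz: 4x the pitch, 1/4 the period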

With all these elements in hand, let’s look at an actual audio file.

Audio example

Here is an example of audio we can produce with this method.

Zooming in on the beginning reveals more detail: the initial randomness, and the start of the smoothing and attenuation process.

Snare…ish

To produce snare-ish sounds, we need to make two very simple adjustments. The sliding rule becomes:

s_T = \begin{cases} 0.995 \frac{s_0 + s_1}{2} & \text{with probability } \frac{1}{2} \\ -0.995 \frac{s_0 + s_1}{2} & \text{with probability } \frac{1}{2} \end{cases}

In other words, instead of applying a deterministic rule to a random initial buffer, we apply a probabilistic rule to a constant buffer.
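As a standalone sketch of that probabilistic rule (snareStep is an illustrative helper; the actual instrument is defined with lambdasound below):

import System.Random (StdGen, random)

-- Average and attenuate as before, then flip the sign of the result
-- with probability 1/2, threading the random generator through.
snareStep :: StdGen -> Float -> Float -> (Float, StdGen)
snareStep g a a' =
  let (flipSign, g') = random g   -- a uniform Bool: True half the time
      v = 0.995 * (a + a') / 2
  in (if flipSign then negate v else v, g')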

Audio example

Those simple tweaks allow us to get this sort of sound:

We can very clearly see the initial (constant) buffer here: the vertical rectangle at the very beginning of the signal. And we can see the roughly even split between upward and downward values occurring as well.

Haskell implementation

My first implementation of this algorithm, many years ago, generated audio in PCM format and saved it to a file with a library.

While brushing up on the algorithm for this article, I looked around for new Haskell libraries in the audio space and found two very useful additions.

We will be using lambdasound here (which uses proteaaudio underneath to play the audio).

{-# LANGUAGE DataKinds #-}

import Data.List (unfoldr)
import LambdaSound
import System.Random

import qualified Deque.Strict as DQ

The goal here isn’t efficiency, so we’ll use a simple deque to hold onto and update our “sliding buffer” of T floats. In order to produce a Sound from lambdasound sample-by-sample, I use unfoldrSoundPulse, which is exactly like its list counterpart and fits our needs perfectly.
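As a quick refresher on that list counterpart, Data.List.unfoldr grows a list from an evolving seed:

-- unfoldr stops when the step function returns Nothing; our slide
-- step below always produces a sample, and the sound's duration is
-- set separately with setDuration.
countdown :: Int -> [Int]
countdown = unfoldr (\n -> if n <= 0 then Nothing else Just (n, n - 1))
-- countdown 3 == [3,2,1]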

Here is our implementation of the algorithm, abstracting over the initialization of the first T values and the sliding step.

karplusStrong
  :: (StdGen -> Maybe (Pulse, StdGen)) -- ^ how to initialize the wavetable
  -> (StdGen -> Pulse -> Pulse -> (Pulse, StdGen)) -- ^ "slide step"
  -> Int -- ^ sample rate
  -> (Hz -> Sound 'I Pulse) -- ^ instrument
karplusStrong genPulse f sampleRate freq =
  let waveTable = take waveTableLen $ unfoldr genPulse gen
      deque0 = DQ.fromConsAndSnocLists waveTable []
      slide (dq, g) = case DQ.uncons dq of
        Just (a, as)
          | Just a' <- DQ.head as ->
              let (new_a, g') = f g a a' in
                (a, (DQ.snoc new_a as, g'))
        _ -> error "welp"
  in unfoldrSoundPulse slide (deque0, gen)

  where waveTableLen = floor (fromIntegral sampleRate / freq) :: Int
        gen = mkStdGen 142

The sliding function merely wraps f with a bit of unconsing and snocing, and everything else has been covered earlier in this post. I use a fixed seed for the random number generator because I don’t care about subtle variations in the sound for this post.
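To make the unconsing and snocing concrete, here is one slide step performed by hand on a toy 3-sample buffer, with made-up values:

demoSlide :: Maybe (Pulse, DQ.Deque Pulse)
demoSlide =
  let dq = DQ.fromConsAndSnocLists [0.4, -0.2, 0.1] []
  in case DQ.uncons dq of
       -- pop s[0], peek at s[1], emit s[0], snoc the new sample
       Just (a, as) | Just a' <- DQ.head as ->
         Just (a, DQ.snoc (0.995 * (a + a') / 2) as)
       _ -> Nothing
-- emits 0.4; the buffer becomes [-0.2, 0.1, 0.0995]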

We can “instantiate” this instrument family to our guitar and drums and put them to work in a non-trivial way with a few more lines.

main :: IO ()
main = do
  let sampleRate = 44100

      attenuate a b = 0.995*(a+b)/2
      centeredPulse g = case randomR (-0.5, 0.5) g of
          (a, g') -> Just (Pulse a, g')

      -- guitar
      instr1 = karplusStrong centeredPulse (\_gen a a' -> (attenuate a a', _gen)) sampleRate
      -- snare
      instr2 = karplusStrong (\g -> Just (0.5, g))
        (\gen a a' -> case random gen of
            (b, gen') ->
              let v = attenuate a a'
              in (if b then v else negate v, gen')
        ) sampleRate

      -- guitar sounds
      sound1 = setDuration 2 $ asNote instr1 a3
      sound2 = simpleReverb 0.01 sound1
      sound3 = setDuration 2 $ parallel [ asNote instr1 x | x <- [c3, e3, g3] ]
      sound4 = simpleReverb 0.01 sound3

      -- drum sound
      sound5 = setDuration 0.3 $ asNote instr2 a3

      -- demo piece
      sound6 =
        let gtrLoop =
              [ ([c3, g3], [c4, d4+1])
              , ([a2+1, g3], [c4, d4+1])
              , ([g2+1, f3], [g4, f4])
              , ([f2, d3+1], [f4, d4+1])
              ]
            gtr = repeatSound 2 $ sequentially
              [ repeatSound 4 $ parallel
                -- we play the notes from 'l' in parallel, followed by the
                -- notes from 'r' in sequence
                [ setDuration 0.9 (parallel [ asNote instr1 x | x <- l ])
                , setDuration 0.3 silence >>> sequentially (map (setDuration 0.3 . asNote instr1) r)
                ]
              | (l, r) <- gtrLoop
              ]
            dr1 = repeatSound 48 (setDuration 0.6 sound5)
            dr2 = repeatSound 96 (reduce 0.8 sound5)
            dr3 = repeatSound 32 (amplify 1.2 $ setDuration 0.75 silence >>> setDuration 0.15 sound5)
        in setDuration 2 silence >>> parallel [gtr, dr1, dr2, dr3]

      -- all the sounds from this post
      sounds = [sound1, sound2, sound3, sound4, sound5, sound6]

  mapM_ (play sampleRate 1) sounds
  sequence_ [ saveWav ("karplus_strong_" ++ show n ++ ".wav") (fromIntegral sampleRate) s | (n, s) <- zip [(1 :: Int)..] sounds ]

sound6 defines the demo piece from the beginning of the post, using both instruments and a few functions from lambdasound. Probably very unidiomatic, and I plan to study the library more, but for the purpose of this post this will do.

You can find the code, along with a suitable cabal project, shell.nix and the audio files, in the following repo: https://github.com/alpmestan/sounds

Conclusion

The original paper and many subsequent ones discuss variations on the choice of initial buffer and sliding rule.

Of course, there’s a lot more. I am fascinated by physically modelled virtual instruments, where the physics producing the sound in the real instrument is simulated to an extremely convincing degree in real time, as opposed to playing pre-recorded samples or making cheap approximations like the ones in this post. But this is the first audio synthesis algorithm I ever studied and implemented. It’s so simple!
