The Web Audio API

The Web Audio API for hackers

(or, how to properly make elaborate noise on the web)

Outline of the talk

  • Overview
  • Nodes presentation
  • AudioParam
  • Demos/Livecoding
  • Known shortcomings
  • Firefox DevTools
  • Gecko/Blink/WebKit Compatibility

What is Web Audio, in one sentence

What does the spec tell us?

[...] A high-level JavaScript API for processing and synthesizing audio in web applications. The primary paradigm is of an audio routing graph, where a number of AudioNode objects are connected together to define the overall audio rendering.

What is Web Audio, in one picture (and some code)

Web Audio Graph
var ac = new AudioContext();
// Decode a compressed file (an ArrayBuffer) into an AudioBuffer, asynchronously
ac.decodeAudioData(ogg_arraybuffer, function(audiobuffer) {
  // Sample player for the decoded audio
  var source = ac.createBufferSource();
  source.buffer = audiobuffer;
  var d = ac.createDelay();
  var osc = ac.createOscillator();
  osc.type = "square";
  osc.frequency.value = 100; // Hz
  var g = ac.createGain();
}, function error() { alert("This is broken"); });
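
The snippet above creates the nodes but stops before wiring them together. Here is a minimal sketch of one possible routing. The routing itself (sample through the delay, oscillator mixed in, everything through the gain to the speakers), the parameter values and the helper name `buildGraph` are assumptions for illustration, not the graph from the slide's picture.

```javascript
// Hypothetical helper: wires the node types from the snippet above into a graph.
// Only the node types come from the slide; the routing and values are assumed.
function buildGraph(ac, audiobuffer) {
  var source = ac.createBufferSource();
  source.buffer = audiobuffer;

  var d = ac.createDelay();
  d.delayTime.value = 0.25; // seconds (assumed value)

  var osc = ac.createOscillator();
  osc.type = "square";
  osc.frequency.value = 100; // Hz

  var g = ac.createGain();
  g.gain.value = 0.5; // assumed value, halves the volume

  // source -> delay -> gain -> speakers, with the oscillator mixed in at the gain
  source.connect(d);
  d.connect(g);
  osc.connect(g);
  g.connect(ac.destination);

  // Nothing is heard until the source nodes are started
  source.start(0);
  osc.start(0);

  return { source: source, delay: d, osc: osc, gain: g };
}
```

In a browser this would be called from the decodeAudioData success callback, e.g. `buildGraph(ac, audiobuffer)`.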

Use cases

  • Games
  • Audio visualization
  • Web Application audio feedback
  • Musical applications
  • Fancy streaming music player (until Media Source Extensions are widely available)
  • (Ideally) Anything on the web that makes non-trivial noises
  • (including offline rendering)

Feature Overview

  • Asynchronous decoding of compressed audio from ArrayBuffers into AudioBuffers (formats: same as <audio>)
  • Precise timing for sample playback
  • Arbitrary audio routing/mixing
  • Effect transition, scheduling (automation)
  • Ability to analyse audio (e.g. FFT)
  • Integration with the web platform (MediaRecorder, WebRTC, <audio>)
  • Low-latency playback

Internal Processing model

  • Can't share AudioNodes between AudioContexts
  • Can share AudioBuffers, though
  • Dedicated thread for audio processing
  • Main thread for control (shared with everything else: <canvas>, WebGL, etc.)
  • Message passing (between main thread and audio thread): no locking
  • Float32 everywhere
  • Block processing (128 sample-frames)
  • Feedback loops allowed (iff at least one DelayNode in the cycle)
  • Low latency (not low enough yet for Firefox but coming)
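
The feedback rule above (a cycle needs at least one DelayNode) can be sketched as a simple echo. The helper name `createFeedbackEcho` and its parameters are hypothetical; it assumes `input` is any AudioNode created on the same context.

```javascript
// Hypothetical helper: a feedback echo. The cycle delay -> feedback -> delay
// is legal because it contains a DelayNode.
function createFeedbackEcho(ctx, input, delaySeconds, feedbackAmount) {
  var delay = ctx.createDelay();
  delay.delayTime.value = delaySeconds;

  var feedback = ctx.createGain();
  // Keep this below 1, otherwise each repeat gets louder instead of fading out
  feedback.gain.value = feedbackAmount;

  input.connect(delay);
  delay.connect(feedback);
  feedback.connect(delay); // closes the cycle

  input.connect(ctx.destination); // dry signal
  delay.connect(ctx.destination); // echoes

  return { delay: delay, feedback: feedback };
}
```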

Processing model


  • Message passing for everything to avoid blocking the audio thread (that would result in audible glitches)
  • You schedule things to happen on the audio thread; they do not happen immediately, but at the next audio callback
  • You can't retrieve values from the audio thread (they would be meaningless, because late)


  • Input nodes: produce audio -- audio sources
  • Output nodes: receive audio -- audio sinks
  • Processing nodes: process incoming audio, output processed signal

Source Nodes

  • AudioBufferSourceNode, sample player, plays AudioBuffer. One shot, but cheap to create
  • OscillatorNode, sine, square, sawtooth, triangle, or custom (from FFT coefficients, via a PeriodicWave). One shot (use a GainNode to switch it on and off)
  • ScriptProcessorNode (with no input) to generate arbitrary waveforms using JavaScript (but beware, it's broken!)
  • MediaElementAudioSourceNode to pipe the audio from <audio> or <video> in an AudioContext
  • MediaStreamAudioSourceNode, from PeerConnection, getUserMedia.
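
Because an AudioBufferSourceNode is one shot, playing a sample several times means creating one node per hit, which is fine since they are cheap. A minimal sketch, assuming a decoded AudioBuffer is already available; the helper name `scheduleBeats` and the 0.1 s lookahead are assumptions.

```javascript
// Hypothetical helper: plays `buffer` `count` times, one hit per beat.
// Each hit gets its own AudioBufferSourceNode, since a source can only be started once.
function scheduleBeats(ctx, buffer, bpm, count) {
  var secondsPerBeat = 60 / bpm;
  var startTime = ctx.currentTime + 0.1; // small lookahead (assumed value)
  var times = [];
  for (var i = 0; i < count; i++) {
    var source = ctx.createBufferSource();
    source.buffer = buffer;
    source.connect(ctx.destination);
    var when = startTime + i * secondsPerBeat;
    source.start(when); // scheduled on the audio clock, not with setTimeout
    times.push(when);
  }
  return times; // absolute start times, in seconds
}
```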

Processing Nodes

  • GainNode: Change the volume
  • DelayNode: Delays the input, in time. Needed for cycles.
  • ScriptProcessorNode (with both input and output connected): arbitrary processing (but beware, it's broken)
  • PannerNode: Position a source and a listener in 3D and get the sound panned accordingly
  • Channel{Splitter,Merger}Node: Merge or Split multi-channel audio from/to mono
  • ConvolverNode: Perform one-dimensional convolution between an AudioBuffer and the input (e.g. reverb)

Processing Nodes (moar)

  • WaveShaperNode: Non-linear wave shaping (e.g. guitar distortion)
  • BiquadFilterNode: low-pass, high-pass, band-pass, all-pass, etc.
  • DynamicsCompressorNode: adjust audio dynamics
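
As an illustration of the WaveShaperNode, here is a sketch of a distortion curve. The tanh soft-clipping shape and the helper names `makeDistortionCurve` and `createDistortion` are assumptions chosen for the example; any curve mapping [-1; 1] to [-1; 1] would do.

```javascript
// Hypothetical helper: builds a soft-clipping curve for a WaveShaperNode.
// `amount` controls how hard the signal is driven, `samples` is the curve resolution.
function makeDistortionCurve(amount, samples) {
  var curve = new Float32Array(samples);
  for (var i = 0; i < samples; i++) {
    var x = (2 * i) / (samples - 1) - 1; // input level, from -1 to 1
    curve[i] = Math.tanh(amount * x);    // output level, squashed towards +/-1
  }
  return curve;
}

// Hypothetical helper: creates the node and installs the curve.
function createDistortion(ctx, amount) {
  var shaper = ctx.createWaveShaper();
  shaper.curve = makeDistortionCurve(amount, 1024);
  return shaper;
}
```

The node returned by `createDistortion` is then connected like any other processing node, between a source and the destination.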

Output Nodes

  • MediaStreamAudioDestinationNode: Outputs to a MediaStream (send to e.g. a WebRTC PeerConnection or a MediaRecorder to encode)
  • AudioDestinationNode: Outputs to speakers/headphones
  • ScriptProcessorNode: arbitrary processing on input audio
  • AnalyserNode: Get time-domain or frequency-domain audio data, in typed arrays (Uint8Array or Float32Array)
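
A sketch of reading frequency-domain data from an AnalyserNode: the node fills a typed array that the caller allocates. The helper names `binFrequency` and `loudestFrequency` are hypothetical, and the sample rate is passed in explicitly (in a browser it would be `ctx.sampleRate`).

```javascript
// Frequency, in Hz, at the start of FFT bin `index`.
function binFrequency(index, sampleRate, fftSize) {
  return (index * sampleRate) / fftSize;
}

// Hypothetical helper: returns the frequency of the loudest bin right now.
function loudestFrequency(analyser, sampleRate) {
  // frequencyBinCount is half of fftSize
  var bins = new Uint8Array(analyser.frequencyBinCount);
  analyser.getByteFrequencyData(bins); // fills `bins` with values in 0..255
  var best = 0;
  for (var i = 1; i < bins.length; i++) {
    if (bins[i] > bins[best]) {
      best = i;
    }
  }
  return binFrequency(best, sampleRate, analyser.fftSize);
}
```

For visualizations, the same call is typically made once per animation frame.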

AudioParam for automation

  • Almost every AudioNode parameter is an AudioParam
  • Set directly: g.gain.value = 0.5;
  • Use automation curve functions (with ct = ctx.currentTime):
    • g.gain.setValueAtTime(0.5, ct + 1.0);
    • g.gain.linearRampToValueAtTime(0.5, ct + 1.0);
    • g.gain.exponentialRampToValueAtTime(0.01, ct + 1.0); // target must be non-zero
    • g.gain.setTargetAtTime(0.0, ct + 1.0, 0.66 /* time constant, in seconds */);
    • g.gain.setValueCurveAtTime(curve, ct + 1.0, curve.length / ctx.sampleRate); // curve: Float32Array
    • g.gain.cancelScheduledValues(ct);
  • De-zippering
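
The automation functions combine naturally into an envelope. Here is a minimal sketch of an attack/decay/sustain/release shape applied to any AudioParam (for example `g.gain`); the helper name `applyEnvelope`, its parameters and the 0.0001 floor are assumptions.

```javascript
// Hypothetical helper: schedules an ADSR-style envelope on an AudioParam.
// All durations are in seconds; sustainLevel must be > 0 for the final ramp.
function applyEnvelope(param, startTime, attack, decay, sustainLevel, hold, release) {
  param.cancelScheduledValues(startTime); // drop anything scheduled after startTime
  param.setValueAtTime(0, startTime);
  param.linearRampToValueAtTime(1, startTime + attack);
  param.linearRampToValueAtTime(sustainLevel, startTime + attack + decay);

  var releaseStart = startTime + attack + decay + hold;
  // Anchor the sustain level so the release ramp starts from a known point
  param.setValueAtTime(sustainLevel, releaseStart);
  // An exponential ramp cannot reach 0, so aim for an inaudible value instead
  param.exponentialRampToValueAtTime(0.0001, releaseStart + release);

  return releaseStart + release; // time at which the envelope ends
}
```

A typical call would be `applyEnvelope(g.gain, ctx.currentTime, 0.01, 0.1, 0.5, 1.0, 0.3)`, followed by stopping the source at the returned time.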

AudioParam + OscillatorNode = LFO automation

var osc = ctx.createOscillator(); // default: sine
osc.frequency.value = 4; // Hz
var lpf = ctx.createBiquadFilter(); // default: low-pass
var gain = ctx.createGain();
gain.gain.value = 1000; // scales [-1.0; 1.0] to [-1000; 1000]
osc.connect(gain);
gain.connect(lpf.frequency); // osc -> gain -> AudioParam
osc.start(0);
// dubstep

Known shortcomings

  • The ScriptProcessorNode is broken by design
  • Sometimes Float32 samples are too memory-heavy (OfflineAudioContext trick to save memory)
  • No FFT/IFFT exposed other than in the AnalyserNode
  • Does not work (yet) in Web Workers
  • No way to enqueue buffers and have them play (but you have MediaSource for that)
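
To put a number on the Float32 memory cost: a decoded buffer takes 4 bytes per sample, per channel. A small sketch of the arithmetic follows; the helper name `decodedSizeBytes` is hypothetical.

```javascript
// Hypothetical helper: size in bytes of decoded Float32 audio.
function decodedSizeBytes(durationSeconds, sampleRate, channels) {
  var frames = Math.ceil(durationSeconds * sampleRate);
  return frames * channels * 4; // 4 bytes per Float32 sample
}

// A 3 minute stereo track at 44.1 kHz:
var bytes = decodedSizeBytes(180, 44100, 2); // 63504000 bytes, about 63.5 MB
```

The same track stored as 16-bit integers would take half of that, and far less as a compressed file.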


  • The <audio> element still exists!
  • It can play a Blob (from IndexedDB, XMLHttpRequest, local file, ArrayBuffer)
  • For when you don't care about perf, scheduling effects
  • For when you don't want to have the data loaded in memory
  • Think about using MediaElementAudioSourceNode, best of both worlds!
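
A sketch of the MediaElementAudioSourceNode approach: the media element keeps handling loading and streaming, while its output goes through the graph. The helper name `routeMediaElement` and the single GainNode are assumptions; any processing nodes could sit in between.

```javascript
// Hypothetical helper: pipes an <audio> or <video> element through a GainNode.
function routeMediaElement(ctx, mediaElement, volume) {
  var source = ctx.createMediaElementSource(mediaElement);
  var gain = ctx.createGain();
  gain.gain.value = volume;

  // The element's audio now flows through the graph before reaching the speakers
  source.connect(gain);
  gain.connect(ctx.destination);

  return { source: source, gain: gain };
}

// In a browser (not run here):
//   var nodes = routeMediaElement(new AudioContext(), document.querySelector("audio"), 0.8);
```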

