Audio (perry/audio)

The perry/audio module is Perry’s low-latency, game-engine-style audio mixer. It is built around three concepts:

  • Sound: a loaded asset. loadSound("click.wav") returns one handle; the PCM data stays in memory until you unload() it.
  • PlaybackId: one live voice. play(sound) returns a new PlaybackId on every call, so the same sound can overlap with itself (think: multiple gunshots, multiple footsteps).
  • Bus: a mixer group. Sounds route through a Bus, and Buses route through their parent (default: master). A single setVolume(musicBus, 0.3) call scales every voice on that bus.
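
To make the bus routing concrete, here is a toy model of how a voice's volume might combine with the gains of its bus and that bus's ancestors. It is a sketch of the arithmetic only, not Perry's implementation: the `BusNode` type and `effectiveGain` function are hypothetical, and it assumes gains are linear and simply multiply along the path to the root.

```typescript
// Toy model of a bus tree. NOT Perry's implementation, just the arithmetic.
interface BusNode {
  label: string;
  volume: number;          // linear gain, 0.0 to 1.0 (assumption)
  parent: BusNode | null;  // null marks the root (master) bus
}

// Multiply the voice's own volume by every bus gain on the way to the root.
function effectiveGain(voiceVolume: number, bus: BusNode): number {
  let gain = voiceVolume;
  for (let node: BusNode | null = bus; node !== null; node = node.parent) {
    gain *= node.volume;
  }
  return gain;
}

const masterBus: BusNode = { label: "master", volume: 0.8, parent: null };
const musicBus: BusNode = { label: "music", volume: 0.3, parent: masterBus };

// A voice played at full volume on the music bus: 1.0 * 0.3 * 0.8
const musicGain = effectiveGain(1.0, musicBus);
console.log(musicGain.toFixed(2)); // prints "0.24"
```

Under this model, changing one bus's volume rescales every voice beneath it without touching the voices themselves, which matches the behaviour described for setVolume on a bus.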

Use perry/audio for SFX, music loops, voice prompts, and any UI feedback where you want overlap or sub-20ms latency. For long-form streaming with a seek bar, lock-screen controls, and Now Playing metadata, use perry/media instead.

Quick start

import {
  loadSound, play, stop, setVolume,
  createBus, setMasterVolume,
} from "perry/audio";

// Optional: organise sounds into buses
const sfx   = createBus("sfx");
const music = createBus("music");

// Load assets — decode happens in the background. The handle is
// returned immediately; play() before decode finishes just queues
// the playback.
const click = loadSound("assets/click.wav", sfx);
const bgm   = loadSound("assets/bgm.mp3",   music, /* stream */ true);

// Fire-and-forget — overlap is automatic, each play() returns a new
// PlaybackId you can stop / fade / tune independently.
const a = play(click);
const b = play(click, 0.7, false, 0.95);  // slightly lower pitch
const bgmId = play(bgm, 1.0, true);        // looping

// Mix
setVolume(music, 0.3);
setMasterVolume(0.8);

// Stop
stop(a);          // one voice
stop(click);      // every live voice of this sound

Game-engine patterns

Pitch variation on repeated SFX

The single biggest “doesn’t feel robotic” trick: randomise the rate (±5%) on every play of high-frequency SFX (footsteps, gunshots, hits).

const rate = 0.95 + Math.random() * 0.1;  // 0.95 – 1.05
play(footstep, 1.0, false, rate);
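
If the same variation is needed in several places, the rate calculation can be pulled into a small helper. The sketch below is one way to do it; `jitteredRate` is a hypothetical name, not part of perry/audio, and the random source is injectable only so the result can be checked deterministically.

```typescript
// Hypothetical helper: returns a playback rate in [1 - spread, 1 + spread).
// `rng` defaults to Math.random and must return a value in [0, 1).
function jitteredRate(spread: number = 0.05, rng: () => number = Math.random): number {
  return 1 - spread + rng() * 2 * spread;
}

// Deterministic check with a stubbed random source.
console.log(jitteredRate(0.05, () => 0).toFixed(2));    // prints "0.95"
console.log(jitteredRate(0.05, () => 0.5).toFixed(2));  // prints "1.00"
```

It would then be used as `play(footstep, 1.0, false, jitteredRate())`.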

Crossfade music tracks

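// Assumes `calm` is a loaded Sound and `intenseId` is the PlaybackId of the track already playing.
const calmId = play(calm, 0.0, true);   // start the incoming track silent, looping
crossfade(intenseId, calmId, 2000);     // 2s linear crossfade from intenseId to calmId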
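
For intuition, a linear crossfade can be modelled as two gains that move in opposite directions over the fade duration. The function below is an illustrative sketch, not Perry's code: `linearCrossfade` is a hypothetical name, and it assumes the outgoing voice starts at full volume and the incoming voice ends at full volume.

```typescript
interface CrossfadeGains {
  outgoing: number; // gain of the voice being faded out
  incoming: number; // gain of the voice being faded in
}

// Gains at `elapsedMs` into a linear crossfade lasting `durationMs`.
function linearCrossfade(elapsedMs: number, durationMs: number): CrossfadeGains {
  // Clamp progress to [0, 1]; a zero-length fade jumps straight to the end.
  const t = durationMs <= 0 ? 1 : Math.min(1, Math.max(0, elapsedMs / durationMs));
  return { outgoing: 1 - t, incoming: t };
}

for (const ms of [0, 500, 1000, 2000]) {
  const g = linearCrossfade(ms, 2000);
  console.log(ms, g.outgoing.toFixed(2), g.incoming.toFixed(2));
}
```

With a linear ramp the two gains always sum to 1, so at the midpoint each track sits at half amplitude.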

Pause when backgrounded

// from your app lifecycle hook (perry/system / onAppDidEnterBackground)
suspend();                              // silences everything
// onAppDidBecomeActive:
resumeAll();

Three-bus mix template

const sfx   = createBus("sfx");
const music = createBus("music");
const voice = createBus("voice");

// User-facing sliders bind to these:
setVolume(sfx,   userPreferences.sfxVolume);
setVolume(music, userPreferences.musicVolume);
setVolume(voice, userPreferences.voiceVolume);
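
Slider values that come from saved preferences may be out of range or missing, so it can help to sanitise them before they reach the mixer. The sketch below shows one way to do that. `applyMix` and `clamp01` are hypothetical helpers, the volume range of 0.0 to 1.0 is an assumption, and the `setVolume` function is passed in as a parameter so the example stays self-contained.

```typescript
interface MixPreferences {
  sfxVolume: number;
  musicVolume: number;
  voiceVolume: number;
}

// Clamp to [0, 1]; non-finite input (NaN, Infinity) falls back to full volume (assumption).
function clamp01(v: number): number {
  return isFinite(v) ? Math.min(1, Math.max(0, v)) : 1;
}

// B stands in for whatever handle type createBus() returns.
function applyMix<B>(
  prefs: MixPreferences,
  buses: { sfx: B; music: B; voice: B },
  setVolume: (bus: B, volume: number) => void
): void {
  setVolume(buses.sfx, clamp01(prefs.sfxVolume));
  setVolume(buses.music, clamp01(prefs.musicVolume));
  setVolume(buses.voice, clamp01(prefs.voiceVolume));
}

// Demo with a recording stub in place of the real setVolume.
const applied: string[] = [];
applyMix(
  { sfxVolume: 0.8, musicVolume: 1.7, voiceVolume: -0.2 },
  { sfx: "sfx", music: "music", voice: "voice" },
  (bus, volume) => { applied.push(`${bus}=${volume}`); }
);
console.log(applied.join(" ")); // prints "sfx=0.8 music=1 voice=0"
```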

Format compatibility

WAV (PCM) and MP3 are portable across every platform. The rest depend on the platform decoder:

Format | macOS / iOS / tvOS / visionOS | Linux / Windows / Android | Web
WAV | ✓ | ✓ | ✓
MP3 | ✓ | ✓ | ✓
AAC / M4A | | |
OGG Vorbis | | | ✓ (most browsers)
FLAC | ✓ (10.13+) | | partial (no Safari)
Opus | ✓ (iOS 11+) | |

When in doubt, ship WAV for SFX (small, instant decode) and MP3 for music (good compression, universal).
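
A build script or asset check could flag files outside the two portable formats. The sketch below is a minimal illustration based only on the statement that WAV and MP3 work everywhere; `isPortableFormat` is a hypothetical helper, and it looks at the file extension rather than the actual container or codec.

```typescript
// Extensions treated as portable, per the guidance above (WAV and MP3).
const PORTABLE_EXTENSIONS = ["wav", "mp3"];

// True when the path ends in a portable extension (case-insensitive).
function isPortableFormat(path: string): boolean {
  const dot = path.lastIndexOf(".");
  if (dot < 0 || dot === path.length - 1) return false; // no extension
  const ext = path.slice(dot + 1).toLowerCase();
  return PORTABLE_EXTENSIONS.indexOf(ext) !== -1;
}

const assetPaths = ["assets/click.wav", "assets/bgm.mp3", "assets/ambience.ogg"];
for (const assetPath of assetPaths) {
  if (!isPortableFormat(assetPath)) {
    console.log(`check platform support before shipping: ${assetPath}`);
  }
}
```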

Performance notes

  • Preload, decode once. loadSound decodes a file to a single shared PCM buffer. Every subsequent play() of that sound schedules the same buffer — no re-decode, no second allocation. 1MB WAV = 1MB in RAM no matter how many times you play it.
  • Voice pool. Voices are preallocated and recycled. The hot path through play() is one indexed table read plus a scheduleBuffer call. No malloc, no string lookup.
  • One shared audio graph. A single AVAudioEngine (Apple) / AudioContext (Web) drives every sound. Bus volume / mute / solo are O(1) on a mixer node, not a walk over voices.
  • Streaming for big files only. Pass true as the third (stream) argument of loadSound for music or files >2MB. Perry then reads chunks from disk as the voice consumes them, so a 60-minute track is never fully resident in RAM.
  • Target latency. <10ms on Apple, <30ms on Web. On par with Unity / Godot.
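
To see why streaming matters for long tracks, it helps to estimate how large a fully decoded buffer would be. The sketch below does that arithmetic and applies the 2MB guideline. Both helpers are hypothetical; the 16-bit stereo 44.1 kHz defaults are assumptions (Perry's internal sample format is not stated here), and the threshold is read as 2 MiB.

```typescript
// Size of uncompressed PCM for a clip of the given length.
function decodedPcmBytes(
  durationSec: number,
  sampleRate: number = 44100,
  channels: number = 2,
  bytesPerSample: number = 2
): number {
  return durationSec * sampleRate * channels * bytesPerSample;
}

// The ">2MB" guideline, interpreted as 2 MiB of file size (assumption).
const STREAM_THRESHOLD_BYTES = 2 * 1024 * 1024;

function shouldStream(fileSizeBytes: number): boolean {
  return fileSizeBytes > STREAM_THRESHOLD_BYTES;
}

const hourOfPcm = decodedPcmBytes(60 * 60);
console.log((hourOfPcm / 1e6).toFixed(0) + " MB"); // prints "635 MB"
console.log(shouldStream(300 * 1024));             // prints false (small SFX)
console.log(shouldStream(8 * 1024 * 1024));        // prints true (music track)
```

The result of `shouldStream` could be passed as the third argument to loadSound.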

Platform implementation

Platform | Backend
macOS / iOS / tvOS / visionOS | AVAudioEngine + AVAudioPlayerNode + AVAudioPCMBuffer + AVAudioUnitVarispeed (per-voice rate).
watchOS | Same AVAudioEngine stack as iOS. Background audio requires the host app to declare the audio background mode entitlement; foreground playback works out of the box.
Web (WASM) | Web Audio API (AudioContext + AudioBufferSourceNode + GainNode).
Linux / Windows / Android | miniaudio v0.11.22 (perry-audio-miniaudio crate). PulseAudio / PipeWire / ALSA on Linux, WASAPI / DirectSound / WinMM on Windows, AAudio (API 26+) / OpenSL ES on Android; the backend is chosen at runtime.

Web autoplay policy

Browsers don’t allow audio playback before a user gesture. The AudioContext is created lazily on the first loadSound() / play() call; if that call happens before any user interaction, the context starts in a suspended state and your play() is queued. To release it, call resumeAll() from a user-gesture handler, or simply make any other play() call inside a click handler.
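
One common way to handle this is a one-shot "unlock" listener that resumes audio on the first gesture and then removes itself. The sketch below illustrates the pattern under stated assumptions: `unlockOnFirstGesture` and the `GestureTarget` interface are hypothetical, the event names are ordinary DOM event types, and `resumeAll` is passed in rather than imported so the example stays self-contained.

```typescript
type GestureListener = () => void;

// Minimal shape of an event source; in a browser, `document` would typically fit.
interface GestureTarget {
  addEventListener(type: string, listener: GestureListener): void;
  removeEventListener(type: string, listener: GestureListener): void;
}

// Calls `resumeAll` once, on the first of the given gesture events, then detaches.
function unlockOnFirstGesture(
  target: GestureTarget,
  resumeAll: () => void,
  eventTypes: string[] = ["pointerdown", "keydown"]
): void {
  const handler: GestureListener = () => {
    for (const type of eventTypes) target.removeEventListener(type, handler);
    resumeAll();
  };
  for (const type of eventTypes) target.addEventListener(type, handler);
}
```

In a browser build this might be called once at startup as `unlockOnFirstGesture(document, resumeAll)`, with `resumeAll` imported from perry/audio.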

API reference

See the TypeScript declarations for full parameter documentation. Summary:

Function | Purpose
loadSound(path, bus?, stream?) -> Sound | Decode (or open for streaming) an audio file.
unload(sound) | Free the PCM buffer / stream decoder.
play(sound, volume?, loop?, rate?, pan?, fadeInMs?) -> PlaybackId | Start a new voice.
stop(handle, fadeOutMs?) | Stop one voice or every voice of a sound.
pause(playback) / resume(playback) | Pause/resume a single voice.
setVolume(handle, volume, fadeMs?) | Sound default / live voice / bus.
setRate(playback, rate) / setPan(playback, pan) | Per-voice pitch and stereo position.
fadeIn(playback, ms, toVol?) / fadeOut(playback, ms) / crossfade(a, b, ms) | Linear ramps.
createBus(name, parent?) -> Bus / destroyBus(bus) / muteBus(bus, muted) / soloBus(bus, soloed) | Mixer tree.
setMasterVolume(volume, fadeMs?) | Root-bus gain.
suspend() / resumeAll() | Whole-graph pause for foreground/background transitions.
isPlaying(handle) / getDuration(sound) / getPosition(playback) | Introspection.
onEnded(playback, cb) / onLoaded(sound, cb) | Lifecycle callbacks.
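
As an illustration of how the lifecycle callbacks might compose, the sketch below chains several sounds so each starts when the previous one ends. It is a hedged example rather than documented behaviour: `playSequence` and `SequencerApi` are hypothetical, the callback passed to onEnded is assumed to take no arguments, and the API is injected (with a stub for the demo) so the snippet runs on its own.

```typescript
// The two calls the sequencer needs; S and P stand in for Sound and PlaybackId.
interface SequencerApi<S, P> {
  play(sound: S): P;
  onEnded(playback: P, cb: () => void): void;
}

// Plays each sound in order, starting the next when the current voice ends.
function playSequence<S, P>(sounds: S[], api: SequencerApi<S, P>, onDone?: () => void): void {
  let index = 0;
  const next = (): void => {
    if (index >= sounds.length) {
      if (onDone) onDone();
      return;
    }
    const playback = api.play(sounds[index]);
    index += 1;
    api.onEnded(playback, next);
  };
  next();
}

// Demo with a stub API that records calls and lets us fire "ended" by hand.
const startedSounds: string[] = [];
const pendingEnded: Array<() => void> = [];
const stubApi: SequencerApi<string, number> = {
  play: (sound) => { startedSounds.push(sound); return startedSounds.length; },
  onEnded: (_playback, cb) => { pendingEnded.push(cb); },
};
const sequenceState = { done: false };
playSequence(["intro", "loop", "outro"], stubApi, () => { sequenceState.done = true; });
while (pendingEnded.length > 0) pendingEnded.shift()!(); // simulate each voice ending
console.log(startedSounds.join(","), sequenceState.done); // prints "intro,loop,outro true"
```

With the real module, the `api` argument would be `{ play, onEnded }` from perry/audio, provided onEnded behaves as assumed here.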

Tracked in issue #1867.