Audio (perry/audio)
The perry/audio module is Perry’s low-latency, game-engine-style audio
mixer. Three concepts:
- Sound: a loaded asset. loadSound("click.wav") returns one handle; the PCM data lives in memory until you unload().
- PlaybackId: one live voice. play(sound) returns a new PlaybackId every time it’s called, so the same sound can overlap with itself (think: multiple gunshots, multiple footsteps).
- Bus: a mixer group. Sounds route through a Bus, Buses route through their parent (default: master). One setVolume(musicBus, 0.3) scales every voice on it.
Use perry/audio for SFX, music loops, voice prompts, and any UI
feedback where you want overlap or sub-20ms latency. For long-form
streaming with a seek bar, lock-screen controls, and Now Playing
metadata, use perry/media instead.
Quick start
import {
  loadSound, play, stop, setVolume,
  createBus, setMasterVolume,
} from "perry/audio";
// Optional: organise sounds into buses
const sfx = createBus("sfx");
const music = createBus("music");
// Load assets — decode happens in the background. The handle is
// returned immediately; play() before decode finishes just queues
// the playback.
const click = loadSound("assets/click.wav", sfx);
const bgm = loadSound("assets/bgm.mp3", music, /* stream */ true);
// Fire-and-forget — overlap is automatic, each play() returns a new
// PlaybackId you can stop / fade / tune independently.
const a = play(click);
const b = play(click, 0.7, false, 0.95); // slightly lower pitch
const bgmId = play(bgm, 1.0, true); // looping
// Mix
setVolume(music, 0.3);
setMasterVolume(0.8);
// Stop
stop(a); // one voice
stop(click); // every live voice of this sound
Game-engine patterns
Pitch variation on repeated SFX
The single biggest “doesn’t feel robotic” trick: randomise the rate (±5%) on every play of high-frequency SFX (footsteps, gunshots, hits).
const rate = 0.95 + Math.random() * 0.1; // 0.95 – 1.05
play(footstep, 1.0, false, rate);
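The one-liner above can be wrapped in a small helper so that every SFX call site uses the same spread. This is a minimal sketch: jitteredRate and its parameters are hypothetical names, not part of perry/audio, and the random source is injectable only so the helper can be tested deterministically.

```typescript
// Returns a playback rate in [1 - spread, 1 + spread).
// `spread` is a fraction (0.05 means roughly ±5%); `rng` must return a value in [0, 1).
function jitteredRate(spread: number = 0.05, rng: () => number = Math.random): number {
  // Map [0, 1) onto [-1, 1), then scale by the spread around a neutral rate of 1.
  return 1 + (rng() * 2 - 1) * spread;
}

// Usage with the real module (not executed here):
//   play(footstep, 1.0, false, jitteredRate());
//   play(gunshot, 1.0, false, jitteredRate(0.03));
```

A narrower spread tends to suit tonal sounds, where pitch shifts are easier to hear; that is a rule of thumb rather than anything the module enforces.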
Crossfade music tracks
// intenseId is the PlaybackId of the track that is already playing.
const calmId = play(calm, 0.0, true); // start silent
crossfade(intenseId, calmId, 2000); // 2 s linear crossfade from intense to calm
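To make "linear crossfade" concrete, the sketch below computes the two gain values at a given point in the fade. It illustrates the general technique only and is not Perry's implementation: linearCrossfadeGains is a hypothetical name, and it assumes the outgoing voice ramps from 1 to 0 while the incoming voice ramps from 0 to 1.

```typescript
// Gains for a linear crossfade, `elapsedMs` into a fade lasting `durationMs`.
// `from` applies to the outgoing voice, `to` to the incoming voice.
function linearCrossfadeGains(
  elapsedMs: number,
  durationMs: number,
): { from: number; to: number } {
  // Normalised progress, clamped to [0, 1]; a non-positive duration jumps straight to the end.
  const t = durationMs <= 0 ? 1 : Math.min(1, Math.max(0, elapsedMs / durationMs));
  return { from: 1 - t, to: t };
}
```

With a linear ramp the two gains always sum to 1, so halfway through a 2000 ms fade both voices sit at 0.5.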
Pause when backgrounded
// from your app lifecycle hook (perry/system / onAppDidEnterBackground)
suspend(); // silences everything
// onAppDidBecomeActive:
resumeAll();
Three-bus mix template
const sfx = createBus("sfx");
const music = createBus("music");
const voice = createBus("voice");
// User-facing sliders bind to these:
setVolume(sfx, userPreferences.sfxVolume);
setVolume(music, userPreferences.musicVolume);
setVolume(voice, userPreferences.voiceVolume);
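If the sliders report a percentage rather than a gain, a small conversion step keeps out-of-range values away from the mixer. The sketch below assumes setVolume expects a linear gain between 0 and 1, which matches the values used in the examples here but is not stated explicitly. sliderToVolume and its curve parameter are hypothetical.

```typescript
// Converts a 0-100 slider position to a gain in [0, 1].
// `curve` is an optional exponent: 1 keeps the mapping linear, 2 squares it so
// the lower half of the slider changes loudness more gradually.
function sliderToVolume(percent: number, curve: number = 1): number {
  if (!isFinite(percent)) return 0; // treat NaN / Infinity as silence
  const linear = Math.min(100, Math.max(0, percent)) / 100;
  return Math.pow(linear, curve);
}

// Usage with the real module (not executed here):
//   setVolume(music, sliderToVolume(userPreferences.musicPercent));
```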
Format compatibility
WAV (PCM) and MP3 are portable across every platform. The rest depend on the platform decoder:
| Format | macOS / iOS / tvOS / visionOS | Linux / Windows / Android | Web |
|---|---|---|---|
| WAV | ✓ | ✓ | ✓ |
| MP3 | ✓ | ✓ | ✓ |
| AAC / M4A | ✓ | ✗ | ✓ |
| OGG Vorbis | ✗ | ✓ | ✓ (most browsers) |
| FLAC | ✓ (10.13+) | ✓ | partial (no Safari) |
| Opus | ✓ (iOS 11+) | ✓ | ✓ |
When in doubt, ship WAV for SFX (small, instant decode) and MP3 for music (good compression, universal).
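For build scripts or asset checks, the table can be encoded as data. The sketch below is one hypothetical way to do that; the type names, the platform grouping, and the "conditional" label (used for entries gated on an OS version or on browser support) are assumptions made for illustration, not part of perry/audio.

```typescript
type Support = "yes" | "no" | "conditional";
type PlatformGroup = "apple" | "native" | "web"; // native = Linux / Windows / Android
type Format = "wav" | "mp3" | "aac" | "ogg" | "flac" | "opus";

// Transcribed from the compatibility table above.
const SUPPORT: Record<Format, Record<PlatformGroup, Support>> = {
  wav:  { apple: "yes",         native: "yes", web: "yes" },
  mp3:  { apple: "yes",         native: "yes", web: "yes" },
  aac:  { apple: "yes",         native: "no",  web: "yes" },
  ogg:  { apple: "no",          native: "yes", web: "conditional" }, // most browsers
  flac: { apple: "conditional", native: "yes", web: "conditional" }, // 10.13+, no Safari
  opus: { apple: "conditional", native: "yes", web: "yes" },         // iOS 11+
};

// Formats that are unconditionally supported on every listed target.
function portableFormats(targets: PlatformGroup[]): Format[] {
  return (Object.keys(SUPPORT) as Format[]).filter((format) =>
    targets.every((target) => SUPPORT[format][target] === "yes"),
  );
}
```

Asking for all three groups leaves only WAV and MP3, which is the recommendation above.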
Performance notes
- Preload, decode once. loadSound decodes a file to a single shared PCM buffer. Every subsequent play() of that sound schedules the same buffer: no re-decode, no second allocation. 1MB WAV = 1MB in RAM no matter how many times you play it.
- Voice pool. Voices are preallocated and recycled. The hot path through play() is one indexed table read plus a scheduleBuffer call. No malloc, no string lookup.
- One shared audio graph. A single AVAudioEngine (Apple) / AudioContext (Web) drives every sound. Bus volume / mute / solo are O(1) on a mixer node, not a walk over voices.
- Streaming for big files only. Pass stream: true to loadSound for music or files >2MB. Perry reads chunks from disk as the voice consumes them, so a 60-minute track doesn’t occupy 60MB of RAM.
- Target latency. <10ms on Apple, <30ms on Web. On par with Unity / Godot.
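When deciding whether a file should be streamed, it can help to estimate how large it would be once fully decoded. The sketch below is plain arithmetic for uncompressed PCM. The defaults (44.1 kHz, stereo, 16-bit) are assumptions, since the engine's internal sample format is not stated here, and decodedPcmBytes is a hypothetical helper.

```typescript
// Size in bytes of uncompressed PCM audio:
//   frames * channels * bytesPerSample, where frames = duration * sampleRate.
function decodedPcmBytes(
  durationSec: number,
  sampleRate: number = 44100,
  channels: number = 2,
  bytesPerSample: number = 2, // 2 = 16-bit integer, 4 = 32-bit float
): number {
  const frames = Math.round(durationSec * sampleRate);
  return frames * channels * bytesPerSample;
}

const oneHour = decodedPcmBytes(60 * 60); // 635040000 bytes, about 606 MiB
const shortClick = decodedPcmBytes(0.25, 44100, 1); // 22050 bytes
```

Under these assumptions a compressed music track expands by roughly an order of magnitude when decoded, which is why streaming pays off for long files while short SFX are cheap to keep resident.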
Platform implementation
| Platform | Backend |
|---|---|
| macOS / iOS / tvOS / visionOS | AVAudioEngine + AVAudioPlayerNode + AVAudioPCMBuffer + AVAudioUnitVarispeed (per-voice rate). |
| watchOS | Same AVAudioEngine stack as iOS. Background audio requires the host app to declare the audio background mode entitlement; foreground playback works out of the box. |
| Web (WASM) | Web Audio API (AudioContext + AudioBufferSourceNode + GainNode) |
| Linux / Windows / Android | miniaudio v0.11.22 (perry-audio-miniaudio crate). PulseAudio / PipeWire / ALSA on Linux, WASAPI / DirectSound / WinMM on Windows, AAudio (API 26+) / OpenSL ES on Android — chosen at runtime. |
Web autoplay policy
Browsers don’t allow audio playback before a user gesture. The
AudioContext is lazily created on the first loadSound() / play()
call; if that call happens before any user interaction, the context
starts in a suspended state and your play() is queued. Trigger a
user-interaction-bound resumeAll() (or just any other play()
inside a click handler) to release it.
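One way to wire this up is to listen for the first user gesture and resume the graph from inside that handler. The sketch below assumes nothing about Perry beyond resumeAll(): the target and the resume function are passed in, and GestureTarget, GESTURE_EVENTS and unlockAudioOnFirstGesture are hypothetical names. In a browser, document would be passed as the target.

```typescript
// Minimal structural type so the sketch does not depend on DOM typings.
interface GestureTarget {
  addEventListener(type: string, listener: () => void): void;
  removeEventListener(type: string, listener: () => void): void;
}

// Events commonly treated as user gestures; adjust to your app's input model.
const GESTURE_EVENTS: string[] = ["pointerdown", "keydown", "touchend"];

// Calls `resume` on the first gesture, then removes every listener it added.
function unlockAudioOnFirstGesture(target: GestureTarget, resume: () => void): void {
  const handler = (): void => {
    for (const type of GESTURE_EVENTS) target.removeEventListener(type, handler);
    resume();
  };
  for (const type of GESTURE_EVENTS) target.addEventListener(type, handler);
}

// Usage with the real module (not executed here):
//   unlockAudioOnFirstGesture(document, resumeAll);
```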
API reference
See the TypeScript declarations for full parameter documentation. Summary:
| Function | Purpose |
|---|---|
| loadSound(path, bus?, stream?) -> Sound | Decode (or open for streaming) an audio file. |
| unload(sound) | Free the PCM buffer / stream decoder. |
| play(sound, volume?, loop?, rate?, pan?, fadeInMs?) -> PlaybackId | Start a new voice. |
| stop(handle, fadeOutMs?) | Stop one voice or every voice of a sound. |
| pause(playback) / resume(playback) | Pause/resume a single voice. |
| setVolume(handle, volume, fadeMs?) | Sound default / live voice / bus. |
| setRate(playback, rate) / setPan(playback, pan) | Per-voice pitch and stereo position. |
| fadeIn(playback, ms, toVol?) / fadeOut(playback, ms) / crossfade(a, b, ms) | Linear ramps. |
| createBus(name, parent?) -> Bus / destroyBus(bus) / muteBus(bus, muted) / soloBus(bus, soloed) | Mixer tree. |
| setMasterVolume(volume, fadeMs?) | Root-bus gain. |
| suspend() / resumeAll() | Whole-graph pause for foreground/background transitions. |
| isPlaying(handle) / getDuration(sound) / getPosition(playback) | Introspection. |
| onEnded(playback, cb) / onLoaded(sound, cb) | Lifecycle callbacks. |
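The lifecycle callbacks can be adapted to promises when sequencing sounds with async/await. This is a sketch under one assumption: that onEnded and onLoaded take a handle followed by a callback that needs no arguments, as the summary suggests. Register and once are hypothetical names, not part of the module.

```typescript
// Callback-registration shape assumed from the summary table: (handle, cb).
type Register<H> = (handle: H, cb: () => void) => void;

// Wraps a single callback registration in a promise that resolves when it fires.
function once<H>(register: Register<H>, handle: H): Promise<void> {
  return new Promise<void>((resolve) => register(handle, () => resolve()));
}

// Usage with the real module (not executed here):
//   await once(onLoaded, click);
//   const id = play(click);
//   await once(onEnded, id);
```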
Tracked in issue #1867.