Sonogram: Transforming Sound into Visual Fingerprints

A deep dive into Sonogram, a full-stack web app that creates interactive 3D visualizations from the mathematical properties of audio.

  • FastAPI
  • Three.js
  • DSP
  • Generative Art
  • Sonogram

Sonogram is a full-stack web app that transforms audio files into unique generative visual artworks — “sonic fingerprints.” Upload a song, and it analyzes the audio’s mathematical properties to produce an interactive 3D visualization.

What it does

  • Studio: Upload or record audio (MP3, FLAC, WAV, etc.). The app runs DSP analysis (spectral features, MFCCs, chroma, tempo, and key detection via librosa) and renders a real-time 3D visualization using Three.js.
  • Gallery: Browse, search, and filter saved public visualizations by title, artist, genre, or BPM. Each piece has a shareable page with server-rendered OG tags for social previews.
  • Compare Mode: Place two tracks side-by-side in a “Mirror Matrix” visualization.
  • Admin Dashboard: Manage records, toggle visibility, and bulk-export artwork as ZIP files.
  • Embeddable: Each visualization has an iframe-friendly /embed/ route for external use.

Tech Stack

Layer            Technologies
Backend          FastAPI, SQLAlchemy (SQLite), Uvicorn
Deployment       Fly.io, Docker
Audio Analysis   librosa, scipy, numpy (single-pass STFT)
Frontend         Vanilla JS (~9K lines), Three.js (WebGL), PWA
Imaging          Pillow (for PNG generation)

Interesting Facts

Steganography

UUIDs are invisibly embedded in artwork PNGs using LSB (Least Significant Bit) encoding in the blue channel.

  • Magic bytes: “SONO”
  • Location: First 320 pixels
  • Benefit: Re-upload a PNG to recover metadata; no server lookup is needed
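
To make the encoding concrete, here is a minimal sketch of blue-channel LSB embedding. It is not Sonogram's actual code: the payload layout (4 magic bytes followed by the 36-character UUID string, which happens to total 40 bytes, or 320 bits, one per pixel) is an assumption that fits the figures above, and the function names are hypothetical. Pixels are plain (R, G, B) tuples so the example runs without Pillow.

```python
import uuid
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]
MAGIC = b"SONO"
UUID_TEXT_LEN = 36  # canonical hyphenated form, e.g. 1234...-5678 (assumed layout)


def embed_uuid(pixels: List[Pixel], artwork_id: uuid.UUID) -> List[Pixel]:
    """Write MAGIC + UUID text into the blue-channel LSB of the first pixels."""
    payload = MAGIC + str(artwork_id).encode("ascii")  # 4 + 36 = 40 bytes
    # Most significant bit first, one bit per pixel.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(pixels) < len(bits):
        raise ValueError("image has too few pixels to hold the payload")
    out = list(pixels)
    for i, bit in enumerate(bits):
        r, g, b = out[i]
        out[i] = (r, g, (b & 0xFE) | bit)  # change only the lowest blue bit
    return out


def extract_uuid(pixels: List[Pixel]) -> Optional[uuid.UUID]:
    """Read the payload back; return None if the magic bytes are absent."""
    n_bits = (len(MAGIC) + UUID_TEXT_LEN) * 8
    if len(pixels) < n_bits:
        return None
    data = bytearray()
    for start in range(0, n_bits, 8):
        byte = 0
        for _, _, blue in pixels[start:start + 8]:
            byte = (byte << 1) | (blue & 1)
        data.append(byte)
    if bytes(data[:len(MAGIC)]) != MAGIC:
        return None
    try:
        return uuid.UUID(data[len(MAGIC):].decode("ascii"))
    except (UnicodeDecodeError, ValueError):
        return None
```

With Pillow, the pixel list could come from `list(img.convert("RGB").getdata())` and be written back with `img.putdata(...)`. The file has to stay in a lossless format such as PNG, because lossy re-encoding would scramble the low-order bits.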

Hardware-aware Rendering

The frontend auto-detects GPU, CPU cores, RAM, and display size. It then classifies the device into LOW / MEDIUM / HIGH tiers to dynamically adjust:

  • Resolution scaling
  • Bloom strength
  • Particle density
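
The real detection runs in the browser, but the decision logic itself is language-neutral, so here is a rough sketch of it in Python. Every threshold, preset value, and name below is hypothetical; the post does not say how Sonogram weighs GPU, cores, RAM, and display size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceInfo:
    cpu_cores: int
    ram_gb: float
    screen_pixels: int    # width * height of the display
    integrated_gpu: bool  # e.g. guessed from the GPU renderer string


@dataclass(frozen=True)
class RenderSettings:
    resolution_scale: float
    bloom_strength: float
    particle_count: int


# Hypothetical presets, one per tier.
PRESETS = {
    "LOW": RenderSettings(0.5, 0.4, 5_000),
    "MEDIUM": RenderSettings(0.75, 0.8, 20_000),
    "HIGH": RenderSettings(1.0, 1.2, 60_000),
}


def classify(device: DeviceInfo) -> str:
    """Score the device on each axis and map the total to a tier."""
    score = 0
    score += 2 if device.cpu_cores >= 8 else 1 if device.cpu_cores >= 4 else 0
    score += 2 if device.ram_gb >= 8 else 1 if device.ram_gb >= 4 else 0
    score += 0 if device.integrated_gpu else 2
    # Very large displays need more fill rate, so they cost a point.
    if device.screen_pixels > 2560 * 1440:
        score -= 1
    if score >= 5:
        return "HIGH"
    if score >= 3:
        return "MEDIUM"
    return "LOW"


def settings_for(device: DeviceInfo) -> RenderSettings:
    return PRESETS[classify(device)]
```

In a browser, comparable inputs are exposed through `navigator.hardwareConcurrency`, `navigator.deviceMemory` (where supported), and the WebGL renderer string.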

Deterministic Visuals

A SHA-256 hash of the raw audio bytes serves as the seed. This ensures that the same file always produces the exact same visualization, maintaining a 1:1 relationship between sound and art.
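
A minimal sketch of the idea, using only the standard library. The function names, the choice to keep the first 8 digest bytes, and the use of `random.Random` are assumptions for illustration; the post only states that a SHA-256 hash of the raw bytes is the seed.

```python
import hashlib
import random
from typing import List


def seed_from_audio(audio_bytes: bytes) -> int:
    """Derive a 64-bit integer seed from the SHA-256 digest of the file."""
    digest = hashlib.sha256(audio_bytes).digest()
    return int.from_bytes(digest[:8], "big")


def visual_parameters(audio_bytes: bytes, count: int = 5) -> List[float]:
    """Draw reproducible pseudo-random parameters in [0, 1) for a file."""
    rng = random.Random(seed_from_audio(audio_bytes))  # private, seeded generator
    return [rng.random() for _ in range(count)]
```

Because the hash covers the raw bytes, the guarantee is per file rather than per song: re-encoding the same track at a different bitrate changes the bytes, and therefore the seed.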

Three Visualization Modes

Users can switch between three distinct modes in fullscreen:

  • Spectrogram: Time-frequency heatmap
  • Crystal: Abstract 3D geometry
  • Landscape: Terrain-like surface

Key Detection

The app uses the Krumhansl-Schmuckler algorithm. It correlates chroma vectors against major and minor pitch-class profiles, rotated through all 12 possible tonics, and reports the best-correlated of the 24 candidates as the estimated key.
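
The correlation step can be sketched in plain Python as follows. The profile values are the commonly cited Krumhansl-Kessler ratings; the function names are hypothetical, and the input is assumed to be a single 12-element chroma vector indexed from C (for example, a time average of a librosa chromagram), which may differ from how Sonogram prepares its input.

```python
import math
from typing import List, Sequence, Tuple

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Krumhansl-Kessler key profiles, index 0 = tonic.
MAJOR_PROFILE = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR_PROFILE = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        return 0.0
    return cov / (dev_x * dev_y)


def rotate(profile: List[float], tonic: int) -> List[float]:
    """Shift a profile so that its tonic weight lands on pitch class `tonic`."""
    return [profile[(i - tonic) % 12] for i in range(12)]


def estimate_key(chroma: Sequence[float]) -> Tuple[str, str, float]:
    """Return (tonic name, mode, correlation) for the best of 24 candidate keys."""
    if len(chroma) != 12:
        raise ValueError("chroma must have 12 pitch-class bins")
    best = None
    for mode, profile in (("major", MAJOR_PROFILE), ("minor", MINOR_PROFILE)):
        for tonic in range(12):
            score = pearson(chroma, rotate(profile, tonic))
            if best is None or score > best[0]:
                best = (score, NOTE_NAMES[tonic], mode)
    return best[1], best[2], best[0]
```

The returned correlation doubles as a rough confidence value: relative keys such as C major and A minor share most pitch classes, so their scores are often close.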

Theme System for AI Art

The codebase includes theme definitions (Bauhaus, Neon Cyberpunk, Aerial Objects) with ControlNet scale parameters, hinting at a future generative AI triptych feature.

Hybrid Architecture

The Gallery and Studio operate as a SPA-style vanilla JS application for speed, while individual sonogram pages (/s/{id}) are server-rendered to ensure rich social sharing metadata and SEO optimization.
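
The server-rendered half matters because social crawlers generally read the initial HTML and do not execute JavaScript. Below is a minimal sketch of what such a page could emit; the function name, the artwork image path, and the exact set of tags are hypothetical, and it uses only the standard library so it runs on its own.

```python
from html import escape


def render_share_page(sonogram_id: str, title: str, artist: str, base_url: str) -> str:
    """Build an HTML page whose <head> carries Open Graph tags for previews."""
    page_url = f"{base_url}/s/{sonogram_id}"
    image_url = f"{base_url}/static/artwork/{sonogram_id}.png"  # hypothetical path
    tags = {
        "og:type": "website",
        "og:title": f"{title} by {artist}",
        "og:description": "An interactive 3D sonic fingerprint.",
        "og:url": page_url,
        "og:image": image_url,
    }
    # Escape every value so user-supplied titles cannot break out of attributes.
    meta = "\n".join(
        f'    <meta property="{escape(name)}" content="{escape(value)}">'
        for name, value in tags.items()
    )
    return (
        "<!doctype html>\n"
        "<html>\n"
        "  <head>\n"
        f"    <title>{escape(title)}</title>\n"
        f"{meta}\n"
        "  </head>\n"
        '  <body><div id="app"></div></body>\n'
        "</html>\n"
    )
```

In a FastAPI app, a string like this could be returned from the handler for /s/{id} wrapped in `fastapi.responses.HTMLResponse`, with the client-side viewer attaching to the page once it loads.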
