Real-world ATC audio
We collect audio from real flights and public sources — busy frequencies, regional accents and phrasing, stepped-on or partial calls, radio noise and distortion. The edge cases are the whole point.
Safety · Beta
In aviation, “almost right” isn't enough. We're staying in beta until ATC transcription is pilot-level accurate — not the demo-reel version.
Before any recording enters a training set, pilots and controllers listen to the original audio and correct the transcripts: callsigns, numbers, clearances, phraseology. Only human-validated segments ever go into training.
Each training cycle is measured against held-out flights. Generic transcription models are tuned on podcasts and meetings. Ours is tuned on VHF AM radio with a pilot in the loop.
Our bar
Generic ASR models are trained on podcasts and meetings. Aviation uses VHF AM radio, domain-specific phraseology, numbers spoken as digits, and callsigns that do not appear in any dictionary. PilotGPT is purpose-built for that.
Roadmap
Each milestone is measured on held-out real-flight audio, not synthetic benchmarks.
Questions