Pro

Better Transcription

Pro includes access to premium cloud transcription services that offer higher accuracy than local models, especially for accented speech, technical jargon, and noisy environments.

Pro Curated Models

Pro subscribers get access to curated cloud transcription models that work out of the box with no configuration required. These models are selected for quality and reliability, and API keys are managed automatically.

How Provider Selection Works

When you use Pro's curated transcription, you don't pick a specific provider — Char's server automatically selects the best one based on your configured languages. The routing logic evaluates each provider's language support quality and picks the highest-quality match.

Provider priority order:

| Priority | Provider | Default Model | Best For |
| --- | --- | --- | --- |
| 1 | Deepgram | nova-3 | English, general use |
| 2 | Soniox | stt-rt-v3 | Multilingual (e.g., Korean + English) |
| 3 | AssemblyAI | universal | Speaker diarization |
| 4 | Gladia | solaria-1 | Code switching |
| 5 | ElevenLabs | scribe_v2_realtime | Real-time quality |
| 6 | Fireworks | whisper-v3-turbo | Whisper-based |
| 7 | OpenAI | gpt-4o-transcribe | Final fallback |
```rust
priorities: vec![
    Provider::Deepgram,
    Provider::Soniox,
    Provider::AssemblyAI,
    Provider::Gladia,
    Provider::ElevenLabs,
    Provider::Fireworks,
    Provider::OpenAI,
],
```

For example, when transcribing Korean + English, Soniox is selected over Deepgram because it has better multilingual support — even though Deepgram has higher base priority. The router sorts by language quality first, then falls back to priority order:

```rust
pub fn select_provider_chain(
    &self,
    languages: &[Language],
    available_providers: &HashSet<Provider>,
) -> Vec<Provider> {
    let mut candidates: Vec<_> = self
        .priorities
        .iter()
        .copied()
        .filter_map(|p| {
            // ...
```
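The idea can be sketched with toy types and a toy quality score (illustrative names only, not Char's actual scoring): candidates are sorted by language-support quality first, with the position in the priority list as the tie-breaker.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Provider {
    Deepgram,
    Soniox,
    AssemblyAI,
}

// Toy quality score: in this sketch Soniox scores higher whenever more
// than one language is configured, mirroring the Korean + English example.
fn language_quality(p: Provider, languages: &[&str]) -> u8 {
    match (p, languages.len() > 1) {
        (Provider::Soniox, true) => 2,
        _ => 1,
    }
}

// Sort by language quality (descending), breaking ties by the
// provider's index in the priority list (ascending).
fn select_provider_chain(priorities: &[Provider], languages: &[&str]) -> Vec<Provider> {
    let mut indexed: Vec<(usize, Provider)> =
        priorities.iter().copied().enumerate().collect();
    indexed.sort_by(|a, b| {
        language_quality(b.1, languages)
            .cmp(&language_quality(a.1, languages))
            .then(a.0.cmp(&b.0))
    });
    indexed.into_iter().map(|(_, p)| p).collect()
}
```

With `["ko", "en"]` configured, Soniox moves to the front even though Deepgram has the higher base priority; with a single language, the plain priority order is preserved.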

How Audio Flows

Your Device ──WebSocket──▶ Char API Server ──WebSocket──▶ STT Provider
                           (pro.hyprnote.com)               (e.g., Deepgram, Soniox)
  1. Your device opens a WebSocket to the Char API server, authenticated with your Supabase JWT token.
  2. Char API server validates your Pro subscription, selects a provider based on your language, and opens a WebSocket to that provider.
  3. Your device streams raw audio (16kHz, 16-bit PCM, mono or stereo) through the Char server to the STT provider.
  4. The STT provider returns real-time transcription results back through the same chain.
  5. If a provider fails, the server retries with the next provider in the chain (up to 2 retries with exponential backoff).
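The failover in step 5 can be sketched as follows. This is a hypothetical loop, not the actual server code: `connect` stands in for opening a provider WebSocket, and the backoff delay is only computed, not slept, to keep the example self-contained.

```rust
// Toy stand-in for opening a WebSocket to one provider: in this sketch,
// every provider except "Soniox" fails, so the chain falls through.
fn connect(provider: &str, attempt: u32) -> Result<String, String> {
    if provider == "Soniox" {
        Ok(format!("connected to {provider}"))
    } else {
        Err(format!("{provider} failed (attempt {attempt})"))
    }
}

// Walk the provider chain; retry each provider up to `max_retries`
// times with exponentially increasing delays before moving on.
fn try_chain(chain: &[&str], max_retries: u32, base_backoff_ms: u64) -> Option<String> {
    for provider in chain {
        for attempt in 0..=max_retries {
            match connect(provider, attempt) {
                Ok(conn) => return Some(conn),
                Err(_) if attempt < max_retries => {
                    // Exponential backoff: base, 2x base, 4x base, ...
                    let _delay_ms = base_backoff_ms * 2u64.pow(attempt);
                    // A real implementation would sleep here before retrying.
                }
                Err(_) => {} // out of retries; fall through to the next provider
            }
        }
    }
    None
}
```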

Bring Your Own Key (BYOK)

If you want to use a specific transcription provider, you can bring your own API key. Supported providers include:

| Provider | Best For | Languages |
| --- | --- | --- |
| Deepgram | Real-time accuracy, keyword handling | 30+ |
| AssemblyAI | Speaker diarization, streaming | 20+ |
| Gladia | Code switching, multi-channel audio | 90+ |
| OpenAI | Batch transcription, Whisper API | 50+ |
| Soniox | High accuracy, enterprise features | 70+ |
| ElevenLabs | High-quality real-time transcription | 30+ |
| DashScope | Qwen3-ASR real-time speech recognition | 10+ |
| Mistral | Voxtral audio transcription | 10+ |

To use BYOK, go to Settings > Transcription and enter your API key for your preferred provider.

How to Enable

  1. Subscribe to Pro or start a free trial
  2. Go to Settings > Transcription
  3. Use the curated Pro models (default) or enter your own API key for a specific provider

Language Support

Char checks if your selected provider supports your configured languages. If there's a mismatch, you'll see a warning with suggestions for compatible providers. Configure your languages in Settings > Language & Vocabulary.
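A minimal sketch of such a compatibility check (a hypothetical helper, not Char's actual code), assuming each provider exposes its supported language codes as a set:

```rust
use std::collections::HashSet;

// Return the configured language codes that the provider's supported
// set does not cover; an empty result means no warning is needed.
fn unsupported_languages<'a>(
    configured: &[&'a str],
    supported: &HashSet<&str>,
) -> Vec<&'a str> {
    configured
        .iter()
        .copied()
        .filter(|lang| !supported.contains(lang))
        .collect()
}
```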

How Your Audio Data Is Handled

When using cloud transcription, your recorded audio is sent to the selected provider for processing:

  • Pro curated models: Your audio is proxied through pro.hyprnote.com and forwarded to a curated STT provider. The proxy does not store your audio.
  • BYOK: Your audio is sent directly from your device to the provider you selected. Char acts only as the client.

Each provider has its own adapter that handles the audio stream. Here is the signature of the listener task Char spawns for your configured provider:

```rust
pub(super) async fn spawn_rx_task(
    args: ListenerArgs,
    myself: ActorRef<ListenerMsg>,
) -> Result<
    (
        ChannelSender,
        tokio::task::JoinHandle<()>,
        tokio::sync::oneshot::Sender<()>,
        String,
    ),
    // ...
```

What Data Is Sent to the Provider

Sent alongside your audio stream:

```rust
ListenerArgs {
    app: state.ctx.app.clone(),
    languages: state.ctx.params.languages.clone(),
    onboarding: state.ctx.params.onboarding,
    model: state.ctx.params.model.clone(),
    base_url: state.ctx.params.base_url.clone(),
    api_key: state.ctx.params.api_key.clone(),
    keywords: state.ctx.params.keywords.clone(),
    mode,
    session_started_at: state.ctx.started_at_instant,
    // ...
```

  • Raw audio (Linear PCM, 16kHz sample rate, mono or stereo)
  • Configuration: model name, language codes, optional keyword boost list, sample rate, channel count

NOT sent to the provider:

  • Your user ID, email, or name
  • Your device fingerprint or JWT token
  • Meeting metadata (title, participants, notes)

Your audio files and transcripts are always stored locally on your device regardless of which transcription method you use. Cloud providers only receive the audio stream for processing and return the transcript.

What Char Logs

Char logs metadata about each STT session to PostHog for usage tracking. No audio or transcription text is ever logged.

```rust
let payload = AnalyticsPayload::builder("$stt_request")
    .with("$stt_provider", event.provider.clone())
    .with("$stt_duration", event.duration.as_secs_f64());
```

Logged: provider name, session duration. Not logged: audio content, transcription text, meeting content.

Provider Privacy Policies

All STT providers used by Pro have zero data retention for real-time/streaming transcription and are SOC 2 compliant.

Deepgram (Primary)

| Policy | Details |
| --- | --- |
| Data retention | Zero storage by default — no audio or transcript retained after processing |
| Training | Does not train on customer data (acts as data processor) |
| Compliance | SOC 2 Type 2, GDPR, HIPAA, PCI, CCPA |
| Encryption | TLS 1.2+ (transit), AES-256 (rest) |
| Data location | US (default), EU available (api.eu.deepgram.com) |

"Deepgram's default configuration meets the strictest requirements with zero retention after processing."

Deepgram Compliance Guide

Official docs: Privacy Policy · Data Security · Information Security & Privacy

Soniox (Multilingual)

| Policy | Details |
| --- | --- |
| Data retention | No retention for real-time API |
| Training | Never uses customer audio or transcripts for model training |
| Compliance | SOC 2 Type 2, GDPR, HIPAA |
| Encryption | TLS 1.2+ (transit) |
| Data location | US (default), EU (api.eu.soniox.com), Japan (api.jp.soniox.com) |

"No retention – Soniox does not store your audio or transcript data unless explicitly requested through a service that supports storage."

"No model training – your audio and transcripts are never used to improve Soniox models or services."

Soniox Security & Privacy

Official docs: Privacy Policy · Security & Privacy · Data Residency

AssemblyAI (Fallback)

| Policy | Details |
| --- | --- |
| Data retention | Zero for streaming API (when opted out of model training) |
| Training | Optional — can opt out |
| Compliance | SOC 2 Type 2, GDPR, HIPAA, PCI-DSS 4.0 Level 1 |
| Encryption | TLS 1.3 (transit), AES-256 (rest) |
| Data location | US (default), EU (Dublin, Ireland) |

"If you are opted out of model training, we offer zero data retention of audio and transcripts for our Streaming product."

AssemblyAI Data Retention FAQ

Official docs: Privacy Policy · Security · Trust Center

For the full details on every data flow, see AI Models & Data Privacy.

When to Use Cloud vs Local

Use cloud transcription when you need maximum accuracy and have internet access. Use local transcription (Whisper models) when privacy is paramount or you're offline. Local models support 50+ languages and run entirely on your device.

For local STT model details and manual download instructions, see Local Models.