# Speak-Y STT

> Speak-Y STT is an AI-powered speech-to-text transcription service that transforms audio and video files into accurate text. Built on advanced Whisper AI technology, it offers speaker diarization, support for 57+ transcription languages, 12 interface languages, smart summaries, and seamless integrations.

## Overview

Speak-Y STT is a modern web-based transcription platform designed for journalists, content creators, podcasters, researchers, and businesses who need reliable audio-to-text conversion. The service uses state-of-the-art AI models to deliver high-accuracy transcriptions with additional features like automatic speaker detection and intelligent summarization.

## Key Features

- **AI-Powered Transcription**: Utilizes advanced Whisper AI models for industry-leading accuracy up to 99%
- **Speaker Diarization**: Automatically identifies and labels different speakers in multi-speaker audio
- **57+ Language Support**: Transcribe content in 57+ languages with auto-detection
- **12 Interface Languages**: Full UI localization in English, Spanish, Russian, French, German, Italian, Chinese, Japanese, Korean, Portuguese, Arabic, Hindi
- **Smart Summaries**: AI-generated summaries and chapter markers for long recordings
- **Multiple Export Formats**: Download transcripts as TXT, SRT, VTT, DOCX, or JSON
- **YouTube Integration**: Transcribe videos directly from YouTube URLs
- **Custom Dictionary**: Add technical terms and proper nouns for improved accuracy
- **API Access**: RESTful API for developers to integrate transcription into their workflows
- **Real-time Processing**: Queue-based processing with live status updates via WebSocket
- **AI Chapters**: Automatic chapter generation for long recordings

## Use Cases

1. **Journalism**: Transcribe interviews and press conferences quickly
2. **Podcasting**: Generate show notes and searchable transcripts
3. **Academic Research**: Transcribe focus groups and interviews
4. **Legal**: Document depositions and court proceedings
5. **Business**: Convert meeting recordings to searchable text
6. **Content Creation**: Generate subtitles for videos
7. **Accessibility**: Make audio content accessible to deaf/hard-of-hearing audiences

## Pricing Plans

### Free Plan

- $0/month
- 300 minutes/month
- 60 minutes per file limit
- 20 files/month
- 7 days storage
- Basic noise reduction
- Export: TXT, SRT

### Pro Plan

- $19/month
- 25 hours/month (1,500 minutes)
- 4 hours per file
- 300 files/month
- 30 days storage
- Advanced noise reduction
- Priority processing
- All export formats
- AI chapters generation
- Custom dictionary (100 terms)

### Creator Plan

- $39/month
- 100 hours/month (6,000 minutes)
- Unlimited file duration
- Unlimited files
- 90 days storage
- Premium accuracy
- Fastest processing
- Translation to 50+ languages
- AI chapters generation
- Custom dictionary (unlimited)
- Full API access

## Technical Specifications

- **Supported Audio Formats**: MP3, WAV, M4A, FLAC, OGG, AAC, WMA
- **Supported Video Formats**: MP4, MOV, AVI, MKV, WebM, FLV
- **Maximum File Size**: 500MB (Free), 1.5GB (Pro), 2GB (Creator)
- **Processing Speed**: ~10x real-time for Pro/Creator plans
- **API Rate Limits**: 100 requests/hour (Pro), 500 requests/hour (Creator)

## Documentation

- [API Reference](/api-reference): Complete API documentation with examples
- [User Guide](/user-guide): Step-by-step guide for using the platform
- [Integration Examples](/integration-examples): Code samples for popular frameworks

## Contact

- Website: https://stt.speak-y.com
- Email: support@speak-y.com
- Documentation: https://stt.speak-y.com/api-reference

## Full Documentation

For complete documentation including code examples and detailed API reference, see:

- [llms-full.txt](/llms-full.txt)
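
## Example: Checking a File Against Plan Limits

The per-plan limits listed under Pricing Plans and Technical Specifications can be checked on the client before a file is submitted. The following is a minimal sketch, not part of the Speak-Y STT API: the names `PlanLimits`, `PLANS`, `check_upload`, and `min_seconds_between_requests` are hypothetical, and it assumes that sizes are counted in decimal units (1 MB = 1,000,000 bytes), which the specifications above do not state. It does not call any Speak-Y endpoint; see the API Reference for the actual request format.

```python
from dataclasses import dataclass
from pathlib import PurePath
from typing import List, Optional

# Assumption: decimal megabytes. The service may count sizes differently.
MB = 1_000_000


@dataclass(frozen=True)
class PlanLimits:
    max_file_bytes: int
    max_file_minutes: Optional[int]       # None means unlimited duration
    api_requests_per_hour: Optional[int]  # None means no rate limit is listed


# Values copied from the Pricing Plans and Technical Specifications sections.
PLANS = {
    "free": PlanLimits(500 * MB, 60, None),
    "pro": PlanLimits(1500 * MB, 4 * 60, 100),
    "creator": PlanLimits(2000 * MB, None, 500),
}

SUPPORTED_EXTENSIONS = {
    # audio
    "mp3", "wav", "m4a", "flac", "ogg", "aac", "wma",
    # video
    "mp4", "mov", "avi", "mkv", "webm", "flv",
}


def check_upload(plan: str, filename: str, size_bytes: int,
                 duration_minutes: float) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    limits = PLANS[plan]
    problems: List[str] = []

    ext = PurePath(filename).suffix.lower().lstrip(".")
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: .{ext}")

    if size_bytes > limits.max_file_bytes:
        problems.append(
            f"file is {size_bytes / MB:.0f} MB, "
            f"limit is {limits.max_file_bytes / MB:.0f} MB"
        )

    if (limits.max_file_minutes is not None
            and duration_minutes > limits.max_file_minutes):
        problems.append(
            f"duration is {duration_minutes:.0f} min, "
            f"limit is {limits.max_file_minutes} min"
        )

    return problems


def min_seconds_between_requests(plan: str) -> Optional[float]:
    """Average spacing that stays within the plan's hourly API rate limit."""
    rate = PLANS[plan].api_requests_per_hour
    return None if rate is None else 3600 / rate
```

For example, `check_upload("free", "interview.mp3", 600 * MB, 90)` reports both a size and a duration problem, while the same file passes on the Creator plan. Spacing requests evenly is only one way to respect an hourly limit; a client could also send bursts and back off when the service signals that the limit has been reached.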