# Speak-Y STT

> Speak-Y STT is an AI-powered speech-to-text transcription service that converts audio and video files into text. Built on Whisper AI models, it offers speaker diarization, 57+ transcription languages, 12 interface languages, smart summaries, YouTube import, and a RESTful API.

## Overview

Speak-Y STT is a web-based transcription platform for journalists, content creators, podcasters, researchers, and businesses who need reliable audio-to-text conversion. Beyond transcription, it provides automatic speaker detection and AI-generated summaries.

## Key Features

- **AI-Powered Transcription**: Uses Whisper AI models, with accuracy of up to 99%
- **Speaker Diarization**: Automatically identifies and labels different speakers in multi-speaker audio
- **57+ Language Support**: Transcribe content in 57+ languages with auto-detection
- **12 Interface Languages**: Full UI localization in English, Spanish, Russian, French, German, Italian, Chinese, Japanese, Korean, Portuguese, Arabic, Hindi
- **Smart Summaries**: AI-generated summaries of long recordings
- **Multiple Export Formats**: Download transcripts as TXT, SRT, VTT, DOCX, or JSON
- **YouTube Integration**: Transcribe videos directly from YouTube URLs
- **Custom Dictionary**: Add technical terms and proper nouns for improved accuracy
- **API Access**: RESTful API for developers to integrate transcription into their workflows
- **Real-time Status Updates**: Jobs are processed through a queue, with live status updates delivered over WebSocket
- **AI Chapters**: Automatic chapter generation for long recordings
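
The SRT export listed above is a plain-text subtitle format made of numbered, timestamped cues. As a rough illustration of what such a file contains, the sketch below renders timed, speaker-labeled segments into SRT text. The `Segment` class and its field names are assumptions made for this example; they are not the service's documented JSON schema.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical segment shape, assumed for illustration only.
    start: float  # start time in seconds
    end: float    # end time in seconds
    speaker: str  # label such as "Speaker 1"
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.speaker}: {seg.text}\n"
        )
    return "\n".join(cues)
```

Calling `to_srt([Segment(0.0, 2.5, "Speaker 1", "Hello")])` yields a single cue numbered `1` with the time range `00:00:00,000 --> 00:00:02,500`.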

## Use Cases

1. **Journalism**: Transcribe interviews and press conferences quickly
2. **Podcasting**: Generate show notes and searchable transcripts
3. **Academic Research**: Transcribe focus groups and interviews
4. **Legal**: Document depositions and court proceedings
5. **Business**: Convert meeting recordings to searchable text
6. **Content Creation**: Generate subtitles for videos
7. **Accessibility**: Make audio content accessible to deaf/hard-of-hearing audiences

## Pricing Plans

### Free Plan - $0/month
- 300 minutes/month
- 60 minutes per file
- 20 files/month
- 7 days storage
- Basic noise reduction
- Export: TXT, SRT

### Pro Plan - $19/month
- 25 hours/month (1,500 minutes)
- 4 hours per file
- 300 files/month
- 30 days storage
- Advanced noise reduction
- Priority processing
- All export formats
- AI chapter generation
- Custom dictionary (100 terms)

### Creator Plan - $39/month
- 100 hours/month (6,000 minutes)
- Unlimited file duration
- Unlimited files
- 90 days storage
- Premium accuracy
- Fastest processing
- Translation to 50+ languages
- AI chapter generation
- Custom dictionary (unlimited)
- Full API access
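
To compare the three plans against an expected workload, the quotas above can be encoded as data. The sketch below picks the cheapest plan whose monthly minutes, per-file duration, and file count cover a given usage pattern. The `Plan` class and the `cheapest_plan` helper are hypothetical names for this example; only the numbers come from the plan lists.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    name: str
    price_usd: int
    minutes_per_month: int
    max_file_minutes: Optional[int]  # None means unlimited
    files_per_month: Optional[int]   # None means unlimited


# Quotas copied from the plan lists, ordered by price.
PLANS = [
    Plan("Free", 0, 300, 60, 20),
    Plan("Pro", 19, 1_500, 240, 300),
    Plan("Creator", 39, 6_000, None, None),
]


def fits(plan: Plan, total_minutes: int, longest_file: int, files: int) -> bool:
    """Return True if the monthly workload stays within the plan's quotas."""
    if total_minutes > plan.minutes_per_month:
        return False
    if plan.max_file_minutes is not None and longest_file > plan.max_file_minutes:
        return False
    if plan.files_per_month is not None and files > plan.files_per_month:
        return False
    return True


def cheapest_plan(total_minutes: int, longest_file: int, files: int) -> Optional[Plan]:
    """Return the lowest-priced plan that fits, or None if none does."""
    for plan in PLANS:  # already sorted by price
        if fits(plan, total_minutes, longest_file, files):
            return plan
    return None
```

For example, ten 45-minute interviews totaling 200 minutes fit the Free plan, while a single 90-minute recording exceeds its per-file limit and requires Pro.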

## Technical Specifications

- **Supported Audio Formats**: MP3, WAV, M4A, FLAC, OGG, AAC, WMA
- **Supported Video Formats**: MP4, MOV, AVI, MKV, WebM, FLV
- **Maximum File Size**: 500 MB (Free), 1.5 GB (Pro), 2 GB (Creator)
- **Processing Speed**: ~10x faster than real time on Pro/Creator plans (for example, a 60-minute recording in roughly 6 minutes)
- **API Rate Limits**: 100 requests/hour (Pro), 500 requests/hour (Creator)
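
A client can check these limits locally before uploading or calling the API. The sketch below validates a file's extension and size against the specifications above and derives the average spacing between requests that keeps a client under its hourly rate limit. It assumes that each format name maps to the lowercase file extension of the same name and that MB and GB are decimal units; neither is stated in the specifications. The function names are hypothetical.

```python
from pathlib import PurePath

# Extensions assumed to match the listed format names.
AUDIO_EXTENSIONS = {"mp3", "wav", "m4a", "flac", "ogg", "aac", "wma"}
VIDEO_EXTENSIONS = {"mp4", "mov", "avi", "mkv", "webm", "flv"}

# Size limits per plan, assuming decimal units (1 MB = 10**6 bytes).
MAX_BYTES = {
    "free": 500 * 10**6,
    "pro": 1_500 * 10**6,
    "creator": 2_000 * 10**6,
}

# Requests per hour; the specifications list no limit for the Free plan.
RATE_LIMITS = {"pro": 100, "creator": 500}


def check_upload(filename: str, size_bytes: int, plan: str) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = PurePath(filename).suffix.lower().lstrip(".")
    if extension not in AUDIO_EXTENSIONS | VIDEO_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_BYTES[plan]:
        problems.append(f"file exceeds the {plan} plan size limit")
    return problems


def min_seconds_between_requests(plan: str) -> float:
    """Average spacing that keeps a steady client within the hourly limit."""
    return 3600 / RATE_LIMITS[plan]
```

Under these assumptions, a Pro client that spaces requests 36 seconds apart stays at 100 requests per hour, and a Creator client can send one roughly every 7.2 seconds.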

## Documentation

- [API Reference](/api-reference): Complete API documentation with examples
- [User Guide](/user-guide): Step-by-step guide for using the platform
- [Integration Examples](/integration-examples): Code samples for popular frameworks

## Contact

- Website: https://stt.speak-y.com
- Email: support@speak-y.com
- Documentation: https://stt.speak-y.com/api-reference

## Full Documentation

For complete documentation including code examples and detailed API reference, see:
- [llms-full.txt](/llms-full.txt)