magic-echo

Requirements & Compatibility

OS Support: Windows (no special permissions), macOS (requires Accessibility and Screen Recording permissions). Linux is not supported.
AnythingLLM Version: Desktop v1.15.0 or later.
Hardware/Model Requirements: Smart Transcription with On-Screen Awareness requires a configured LLM provider supporting vision/multi-modal models.

Operating Modes

1. Quick Dictation (Press-and-Speak)

Start: Press the activation shortcut (Default: Option+Z on macOS, Alt+Z on Windows).
Process: Speak naturally; recording stops and auto-submits when silence is detected (timing is controlled by the Silence Detection setting).
Cancel: Press Esc while active.
Force Submit: Press Enter to bypass silence detection.
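
The auto-submit behavior described above can be pictured as a silence timer that resets whenever speech is heard. The following is a minimal sketch, not AnythingLLM's actual implementation: the timeout values per preset, the per-frame speech flags, and the function name are all assumptions made for illustration.

```python
# Hypothetical timeouts in seconds. The real values behind each
# Silence Detection preset are not documented here.
SILENCE_TIMEOUTS = {"Aggressive": 0.8, "Average": 1.5, "Relaxed": 2.5}


def auto_submit_frame(frame_is_speech, frame_seconds, preset="Average"):
    """Return the index of the frame where dictation would auto-submit.

    frame_is_speech: iterable of booleans, one per audio frame
                     (True if the frame contains speech).
    frame_seconds:   duration of one frame in seconds.
    Returns None if the silence timeout is never reached.
    """
    timeout = SILENCE_TIMEOUTS[preset]
    silent_for = 0.0
    heard_speech = False
    for index, is_speech in enumerate(frame_is_speech):
        if is_speech:
            heard_speech = True
            silent_for = 0.0  # any speech resets the silence timer
        elif heard_speech:
            # Assumption: silence before the first word is ignored.
            silent_for += frame_seconds
            if silent_for >= timeout:
                return index
    return None
```

Under these assumed values, a shorter timeout (Aggressive) submits sooner after the speaker pauses, while Relaxed tolerates longer pauses mid-sentence.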

2. Extended Dictation (Hold-to-Speak)

Start: Hold the activation shortcut.
Process: Recording continues until you stop it manually.
Cancel: Press Esc to discard.
Submit: Click the “Stop” button in the widget.

Processing Modes

Smart Transcription: Transcribes audio, then passes the transcript to the configured LLM to clean up grammar, punctuation, and formatting, using screen context when On-Screen Awareness is enabled. Subject to daily limits on the free tier; unlimited with local models or Pro. Falls back to Raw Transcription automatically if LLM processing fails.
Raw Transcription: Local on-device transcription without LLM processing. Faster and unlimited, but skips grammar and punctuation cleanup.
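
The fallback from Smart to Raw Transcription can be sketched as a try/except around the LLM step. This is an illustrative sketch only: `raw_transcribe` and `llm_cleanup` are hypothetical callables standing in for the on-device transcriber and the configured LLM provider.

```python
def transcribe(audio, mode, raw_transcribe, llm_cleanup):
    """Return (text, mode_used) for one dictation.

    raw_transcribe: hypothetical callable, audio -> raw transcript.
    llm_cleanup:    hypothetical callable, raw transcript -> cleaned text.
                    It may raise if the LLM call fails.
    """
    raw_text = raw_transcribe(audio)  # always runs locally first
    if mode != "Smart":
        return raw_text, "Raw"
    try:
        return llm_cleanup(raw_text), "Smart"
    except Exception:
        # LLM processing failed: keep the raw transcript instead of
        # losing the dictation.
        return raw_text, "Raw"
```

The design point is that the raw transcript is produced before the LLM is called, so a failed cleanup never discards what was spoken.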

Configuration Settings

Navigate to Settings -> Magic Echo to configure:

Setting	Description / Options
Activation Key	Key combined with `Option` (macOS) / `Alt` (Windows) to trigger dictation. Default: `Z`
Default Processing Mode	`Smart` or `Raw`
On-Screen Awareness	Enable/Disable visual screen context for LLM. (Requires vision-compatible LLM)
Preferred Microphone	Select input device
Silence Detection	`Aggressive` / `Average` / `Relaxed` (Controls auto-submit timing)
Widget Size	`Default` / `Large` / `Huge` / `Max`
Voice Commands	Configure static keyword triggers to paste predefined text blocks.
Custom Vocabulary	Add comma-separated jargon, technical terms, or names (Smart mode only).
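
As a quick reference for the allowed values in the table above, the sketch below checks a settings record and splits a comma-separated Custom Vocabulary string. The field names and the dictionary layout are hypothetical; the settings are configured through the Settings screen, and no settings file format is implied.

```python
# Allowed options copied from the settings table. The keys are
# hypothetical names chosen for this example.
ALLOWED_OPTIONS = {
    "default_processing_mode": {"Smart", "Raw"},
    "silence_detection": {"Aggressive", "Average", "Relaxed"},
    "widget_size": {"Default", "Large", "Huge", "Max"},
}


def validate_settings(settings):
    """Return a list of error messages; an empty list means valid."""
    errors = []
    for field, allowed in ALLOWED_OPTIONS.items():
        value = settings.get(field)
        if value not in allowed:
            errors.append(f"{field}: {value!r} is not one of {sorted(allowed)}")
    return errors


def parse_vocabulary(text):
    """Split a comma-separated vocabulary string into clean terms."""
    return [term.strip() for term in text.split(",") if term.strip()]
```

For example, `parse_vocabulary("AnythingLLM, RAG ,, kubectl")` yields the three terms with surrounding whitespace and empty entries removed.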

Voice Commands

Configure at Settings -> Magic Echo -> Voice Commands.

Each voice command maps an exact spoken phrase directly to a text snippet. Voice commands bypass the LLM and have unlimited usage.
Example: "PRD Template" -> pastes Markdown template; "sign off" -> pastes "Best regards,\n[Name]".
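
Conceptually, a voice command is a dictionary lookup on the whole transcript rather than an LLM call. The sketch below illustrates that idea under stated assumptions: matching is treated as case-insensitive and whitespace-tolerant (the source only says "exact phrase"), and the PRD template body is placeholder text.

```python
# Keys are stored in normalized form (lowercase, single spaces).
VOICE_COMMANDS = {
    "prd template": "# PRD\n\n## Problem\n\n## Goals\n",  # placeholder template
    "sign off": "Best regards,\n[Name]",
}


def resolve_voice_command(transcript, commands=VOICE_COMMANDS):
    """Return the snippet for an exactly matching phrase, else None.

    Assumption: the whole transcript must equal the trigger phrase;
    a phrase embedded in a longer sentence does not match.
    """
    key = " ".join(transcript.casefold().split())
    return commands.get(key)
```

With this sketch, `resolve_voice_command("Sign off")` returns the sign-off snippet, while `resolve_voice_command("please sign off")` returns `None` and would be handled as ordinary dictation.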

Troubleshooting & Performance Optimization

macOS Permissions Fix

If text insertion or screen capture fails:

1. Quit AnythingLLM.
2. Open macOS System Settings -> Privacy & Security.
3. Manually add and enable AnythingLLM under both Accessibility and Screen Recording.
4. Restart AnythingLLM.
5. Disable and re-enable Magic Echo in settings.

Latency Reduction

Disable On-Screen Awareness to skip screenshot processing.
Set Default Processing Mode to Raw Transcription.
Note: The first dictation session after a period of inactivity takes longer because the model must be loaded; subsequent sessions are faster while the model stays warm.
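
The warm-up behavior in the note can be modeled as a lazily loaded model that is reloaded only after an idle period. This is a sketch of the general pattern, not a description of AnythingLLM internals: the class name, the `loader` callable, and the idle timeout are all hypothetical.

```python
import time


class WarmModel:
    """Load a model on first use and reload it after an idle period."""

    def __init__(self, loader, idle_timeout, clock=time.monotonic):
        self._loader = loader              # hypothetical: returns a loaded model
        self._idle_timeout = idle_timeout  # seconds of inactivity before unload
        self._clock = clock                # injectable for testing
        self._model = None
        self._last_used = None
        self.load_count = 0

    def get(self):
        now = self._clock()
        expired = (
            self._last_used is not None
            and now - self._last_used > self._idle_timeout
        )
        if self._model is None or expired:
            # Slow path: this is the extra delay on the first dictation.
            self._model = self._loader()
            self.load_count += 1
        self._last_used = now
        return self._model
```

Back-to-back calls to `get()` reuse the loaded model, which corresponds to the faster subsequent sessions; only a call after the idle timeout pays the loading cost again.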