Technical Overview
The AnythingLLM Desktop Assistant is an operating-system-level overlay interface for chat, agent execution, and Model Context Protocol (MCP) tasks.
System Requirements & Platform Support
| Platform | Support Level | Screen Capture / OCR Features |
|---|---|---|
| MacOS (Apple Silicon/Intel) | Full | Enabled |
| Windows (x64/ARM64) | Full | Enabled |
| Linux (x64/ARM64) | Limited | Disabled (No screen, application, or area capture) |
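The table can be read as a simple capability lookup. The sketch below is illustrative only: `CAPTURE_SUPPORT` and `capture_supported` are hypothetical names, not part of AnythingLLM, and the values simply mirror the table rows.

```python
# Hypothetical lookup mirroring the platform table above.
# These names are illustrative and are not an AnythingLLM API.
CAPTURE_SUPPORT = {
    "macos": True,    # Full support, capture/OCR enabled
    "windows": True,  # Full support, capture/OCR enabled
    "linux": False,   # Limited support, no screen/application/area capture
}


def capture_supported(platform: str) -> bool:
    """Return True if the table lists screen capture / OCR as enabled."""
    key = platform.strip().lower()
    if key not in CAPTURE_SUPPORT:
        raise ValueError(f"unknown platform: {platform!r}")
    return CAPTURE_SUPPORT[key]
```

A caller could use `capture_supported("Linux")` to decide whether to offer capture-based features at all.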
Keyboard Shortcuts
- Open/Toggle Overlay (Default):
  - MacOS: CMD + /
  - Windows/Linux: CTRL + /
- Modify Shortcut: Navigate to Settings -> Desktop Assistant in the main AnythingLLM menu.
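As a minimal sketch of the defaults listed above, the function below maps Python's `sys.platform` value to the default toggle shortcut. The function name `default_toggle_shortcut` is hypothetical, and this is not how AnythingLLM itself resolves the shortcut; it only restates the documented defaults.

```python
import sys
from typing import Optional


def default_toggle_shortcut(sys_platform: Optional[str] = None) -> str:
    """Return the documented default overlay shortcut for a platform.

    `sys_platform` uses Python's sys.platform values ("darwin" for MacOS,
    "win32" for Windows, "linux" for Linux). This helper is hypothetical.
    """
    platform = sys_platform if sys_platform is not None else sys.platform
    # MacOS uses the Command key; Windows and Linux share the Control key.
    return "CMD + /" if platform == "darwin" else "CTRL + /"
```

Any user-defined shortcut set in the settings menu would take precedence over these defaults.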
Model Selection & Vision Requirements
The Desktop Assistant utilizes screen capture and image data for context.
Vision-Enabled (Multi-modal) Cloud & Custom Models
Ensure that your selected cloud model provider natively supports image input. If the model does not accept images, requests that include image data will fail with API errors.
Local & Non-Vision Models (Auto-Fallback)
When you use the Default LLM Provider, Ollama, or LM Studio, AnythingLLM processes images locally on-device.
- If the local model lacks vision capabilities, the system runs Optical Character Recognition (OCR) locally on the screen capture and sends the extracted text to the model.
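The routing described in this section can be sketched as follows. Everything here is an assumption made for illustration: the provider identifiers in `LOCAL_PROVIDERS`, the `Payload` type, the `build_payload` function, and the injected `ocr` callable are hypothetical and do not reflect AnythingLLM's actual internals.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical identifiers for the Default LLM Provider, Ollama, and LM Studio.
LOCAL_PROVIDERS = {"default", "ollama", "lmstudio"}


@dataclass
class Payload:
    text: str
    image: Optional[bytes] = None


def build_payload(
    prompt: str,
    screenshot: bytes,
    provider: str,
    model_has_vision: bool,
    ocr: Callable[[bytes], str],
) -> Payload:
    """Decide whether to send the image itself or OCR-extracted text."""
    is_local = provider in LOCAL_PROVIDERS
    if is_local and not model_has_vision:
        # Auto-fallback: extract text on-device and send text only.
        return Payload(text=f"{prompt}\n\n{ocr(screenshot)}")
    # Vision-capable local models and all cloud/custom models get the image.
    # A cloud model without image support would reject this request.
    return Payload(text=prompt, image=screenshot)
```

Passing `ocr` as a parameter keeps the sketch independent of any particular OCR engine; a stub such as `lambda img: "extracted text"` is enough to exercise both branches.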
Recommended Local Models (Quantized)
- Qwen3-VL 2B Instruct (Q8)
- Qwen3-VL 4B Instruct (Q4)
- Qwen3-VL 8B Instruct (Q4)
- Gemma3 4B+ (Q4)