Technical Overview

The AnythingLLM Desktop Assistant is an operating-system-level overlay interface for chat, agent execution, and Model Context Protocol (MCP) tasks.

System Requirements & Platform Support

Platform                 Support Level   Screen Capture / OCR Features
MacOS (Silicon/Intel)    Full            Enabled
Windows (x64/ARM64)      Full            Enabled
Linux (x64/ARM64)        Limited         Disabled (no screen, application, or area capture)
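
The support matrix above can be expressed as a small lookup. This is a minimal sketch, not AnythingLLM's actual code: the names `CAPTURE_SUPPORT` and `capture_enabled` are hypothetical, and the platform keys are assumed to be the values returned by Python's `platform.system()`.

```python
import platform

# Hypothetical lookup mirroring the platform support table.
# Keys are the OS names reported by platform.system().
CAPTURE_SUPPORT = {
    "Darwin": True,    # MacOS: full support, capture/OCR enabled
    "Windows": True,   # Windows: full support, capture/OCR enabled
    "Linux": False,    # Linux: limited support, no screen/application/area capture
}


def capture_enabled(system=None):
    """Return True if screen capture / OCR features are listed as enabled."""
    name = system if system is not None else platform.system()
    # Treating unknown platforms as unsupported is an assumption of this sketch.
    return CAPTURE_SUPPORT.get(name, False)


if __name__ == "__main__":
    print(capture_enabled())
```

A check like this lets a caller hide or disable capture controls on Linux instead of failing when a capture is requested.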

Keyboard Shortcuts

  • Open/Toggle Overlay (Default):
    • MacOS: CMD + /
    • Windows/Linux: CTRL + /
  • Modify Shortcut: Navigate to Settings -> Desktop Assistant in the main AnythingLLM menu.
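
As an illustration of the per-platform defaults listed above, the sketch below picks the default toggle shortcut from the OS name. The function `default_toggle_shortcut` is hypothetical and only reflects the defaults stated here; a shortcut changed in Settings would override it.

```python
import platform


def default_toggle_shortcut(system=None):
    """Return the default overlay toggle shortcut for the given OS name."""
    name = system if system is not None else platform.system()
    # "Darwin" is what platform.system() reports on MacOS;
    # Windows and Linux share the CTRL-based default.
    modifier = "CMD" if name == "Darwin" else "CTRL"
    return f"{modifier} + /"


if __name__ == "__main__":
    print(default_toggle_shortcut())
```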

Model Selection & Vision Requirements

The Desktop Assistant uses screen captures and image data as context, so the selected model's support for image input determines how that context is delivered.

Vision-Enabled (Multi-modal) Cloud & Custom Models

Ensure that your selected cloud model natively supports image input. If the model does not accept images, requests that include a screen capture will fail with API errors.

Local & Non-Vision Models (Auto-Fallback)

When you use the Default LLM Provider, Ollama, or LM Studio, AnythingLLM processes images locally on-device.

  • If the local model lacks vision capabilities: The system performs local Optical Character Recognition (OCR) on the screen capture and sends the extracted text to the model.
  • Vision-capable local models:
    • Qwen3-VL 2B Instruct (Q8)
    • Qwen3-VL 4B Instruct (Q4)
    • Qwen3-VL 8B Instruct (Q4)
    • Gemma3 4B+ (Q4)
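
The OCR fallback described above can be sketched as a simple routing step. This is an illustrative sketch under stated assumptions, not AnythingLLM's implementation: `ModelInput`, `build_model_input`, and the `run_ocr` callable are hypothetical names, and passing the image through unchanged for vision-capable models is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ModelInput:
    """Hypothetical shape of what is sent to the model for one request."""
    prompt: str
    image: Optional[bytes] = None     # raw screen capture, for vision-capable models
    ocr_text: Optional[str] = None    # extracted text, for models without vision


def build_model_input(
    prompt: str,
    capture: bytes,
    supports_vision: bool,
    run_ocr: Callable[[bytes], str],
) -> ModelInput:
    """Attach the capture as an image, or fall back to locally extracted text."""
    if supports_vision:
        # Assumption: a vision-capable model receives the image directly.
        return ModelInput(prompt=prompt, image=capture)
    # Model lacks vision: run OCR locally and send only the extracted text.
    return ModelInput(prompt=prompt, ocr_text=run_ocr(capture))


if __name__ == "__main__":
    # Stub OCR function standing in for a real on-device OCR engine.
    stub_ocr = lambda image_bytes: "text found on screen"
    print(build_model_input("Summarize this window", b"<png bytes>", False, stub_ocr))
```

Passing the OCR routine in as a callable keeps the routing logic independent of any particular OCR engine and makes it easy to test with a stub.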