1. Download Model Archives
Download the required model ZIP archive:
- Llama-3.2-3B-Chat (8k context):
  https://cdn.anythingllm.com/support/qnn/llama_v3_2_3b_chat_8k.zip
- Llama-3.2-3B-Chat (16k context):
  https://cdn.anythingllm.com/support/qnn/llama_v3_2_3b_chat_16k.zip
- Llama-3.1-8B-Chat (8k context):
  https://cdn.anythingllm.com/support/qnn/llama_v3_1_8b_chat_8k.zip
- Phi 3.5-mini-instruct (4k context):
  https://cdn.anythingllm.com/support/qnn/phi_3_5_mini_instruct_4k.zip
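The download can also be scripted. The following is a minimal sketch using only the Python standard library; the URLs are the ones listed above, while the short keys in `ARCHIVES` and the function names `archive_url` and `download_archive` are hypothetical names chosen for illustration.

```python
import shutil
import urllib.request
from pathlib import Path

# Base URL and file names come from the list above.
# The dictionary keys are hypothetical shorthand, not official identifiers.
BASE_URL = "https://cdn.anythingllm.com/support/qnn"
ARCHIVES = {
    "llama-3.2-3b-8k": "llama_v3_2_3b_chat_8k.zip",
    "llama-3.2-3b-16k": "llama_v3_2_3b_chat_16k.zip",
    "llama-3.1-8b-8k": "llama_v3_1_8b_chat_8k.zip",
    "phi-3.5-mini-4k": "phi_3_5_mini_instruct_4k.zip",
}


def archive_url(key: str) -> str:
    """Return the download URL for one of the short keys in ARCHIVES."""
    return f"{BASE_URL}/{ARCHIVES[key]}"


def download_archive(key: str, dest_dir: Path) -> Path:
    """Stream the archive into dest_dir and return the path of the saved ZIP."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / ARCHIVES[key]
    # Copy in chunks so the whole archive is never held in memory at once.
    with urllib.request.urlopen(archive_url(key)) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

For example, `download_archive("llama-3.2-3b-8k", Path.home() / "Downloads")` would save the 8k-context Llama 3.2 archive to the user's Downloads folder, assuming that folder is where you want it.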
2. Extraction and Placement
- Navigate to the desktop storage directory.
- Create the target directory if it does not exist:
  models/QNN
- Move the downloaded ZIP archive into models/QNN.
- Extract the ZIP archive.
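The placement steps above can be sketched in Python as follows. This is an illustration under stated assumptions: `storage_dir` stands for the desktop storage directory, whose actual location is not given here and must be supplied by you; `install_archive` is a hypothetical helper name; and the archive is assumed to contain a top-level folder named after the model.

```python
import shutil
import zipfile
from pathlib import Path


def install_archive(zip_path: Path, storage_dir: Path) -> Path:
    """Move zip_path into <storage_dir>/models/QNN and extract it there.

    storage_dir is the desktop storage directory (supplied by the caller).
    Returns the models/QNN directory.
    """
    qnn_dir = storage_dir / "models" / "QNN"
    # Create models/QNN if it does not exist yet.
    qnn_dir.mkdir(parents=True, exist_ok=True)

    # Move the archive into models/QNN unless it is already there.
    moved = qnn_dir / zip_path.name
    if zip_path.resolve() != moved.resolve():
        shutil.move(str(zip_path), str(moved))

    # Extract in place; the model folder is assumed to be inside the archive.
    with zipfile.ZipFile(moved) as archive:
        archive.extractall(qnn_dir)
    return qnn_dir
```

The ZIP file is left in models/QNN after extraction, matching the steps as written; whether it can be deleted afterwards is not stated here.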
Target Directory Structure
models/QNN/
└── llama_v3_2_3b_chat_8k/
    ├── genie_config.json
    ├── htp_backend_etc.bin
    ├── related-model-bin-file.bin
    └── tokenizer.json
3. Apply Changes
- Restart the desktop application.
- Select the model via the graphical user interface.
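If the model does not show up for selection after the restart, one thing worth checking is whether the extracted folder matches the target directory structure. The sketch below is a hypothetical helper, not part of the application: it assumes, based only on the tree shown above, that a model folder should contain `genie_config.json`, `tokenizer.json`, and at least one `.bin` file.

```python
from pathlib import Path

# File names taken from the target directory structure above.
# Treating exactly these as required is an assumption of this sketch.
REQUIRED_FILES = ("genie_config.json", "tokenizer.json")


def missing_model_files(model_dir: Path) -> list[str]:
    """Return the expected entries that are absent from model_dir."""
    missing = [name for name in REQUIRED_FILES if not (model_dir / name).is_file()]
    # The tree lists several .bin files whose exact names vary by model,
    # so only the presence of at least one .bin file is checked.
    if not any(model_dir.glob("*.bin")):
        missing.append("*.bin")
    return missing
```

An empty list means the folder has the expected shape; a non-empty list names what appears to be missing, for instance when the archive was extracted one level too deep.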