# CLI Command Reference
Model names use the format `namespace/name`, e.g. `Qwen/Qwen3-0.6B-GGUF`.
## Commands
| Command | Description |
|---|---|
| `csghub-lite run <model>` | Pull, start the server, and chat (all automatic) |
| `csghub-lite chat <model>` | Chat with a locally downloaded model |
| `csghub-lite ps` | List currently running models and their keep-alive |
| `csghub-lite stop <model>` | Stop/unload a running model |
| `csghub-lite serve` | Start the API server (auto-started by `run`) |
| `csghub-lite pull <model>` | Download a model from CSGHub |
| `csghub-lite list` / `ls` | List locally downloaded models |
| `csghub-lite show <model>` | Show model details (format, size, files) |
| `csghub-lite rm <model>` | Remove a locally downloaded model |
| `csghub-lite login` | Set CSGHub access token |
| `csghub-lite search <query>` | Search models on CSGHub |
| `csghub-lite config set <key> <value>` | Set a configuration value |
| `csghub-lite config get <key>` | Get a configuration value |
| `csghub-lite config show` | Show current configuration |
| `csghub-lite uninstall` | Remove csghub-lite, llama-server, and all data |
| `csghub-lite --version` | Show version information |
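As a quick illustration of the model-management commands in the table above, a typical download, inspect, and clean-up session might look like the sketch below. It uses only commands documented on this page, with the example model name from the top of the reference; exact output depends on your installation.

```shell
# Download a model from CSGHub
csghub-lite pull Qwen/Qwen3-0.6B-GGUF

# Confirm it appears in the local model list
csghub-lite list

# Inspect format, size, and files
csghub-lite show Qwen/Qwen3-0.6B-GGUF

# Remove it when no longer needed
csghub-lite rm Qwen/Qwen3-0.6B-GGUF
```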
## `run` vs `chat`
- **`run`**: Downloads the model if not present, auto-starts the background server, and opens a chat session. After you exit, the model stays loaded for 5 minutes (configurable), so the next `run` is instant.
- **`chat`**: Starts a chat session with a model that is already downloaded. Supports the `--system` flag for custom system prompts.
```shell
# Auto-download and chat (first time)
csghub-lite run Qwen/Qwen3-0.6B-GGUF

# Exit chat; the model stays loaded, so reconnecting is instant
csghub-lite run Qwen/Qwen3-0.6B-GGUF

# Check which models are still loaded
csghub-lite ps

# Chat with a custom system prompt (model must already be downloaded)
csghub-lite chat Qwen/Qwen3-0.6B-GGUF --system "You are a coding assistant."
```