# 🤖 LLM GGUF Tools

Python tools for transforming language models into optimised [GGUF format](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md) using proven quantisation strategies. Based on analysis of community patterns, these tools replicate Bartowski's acclaimed quantisation profiles whilst handling edge cases that break naive conversion approaches.

The project bridges the gap between HuggingFace's SafeTensors ecosystem and llama.cpp's GGUF inference engine, with particular focus on models that fall outside llama.cpp's supported architecture list.

> 💡 **Looking for quantised models?** Check out [tcpipuk's HuggingFace profile](https://huggingface.co/tcpipuk)
> for models quantised using these tools!

## Available Tools

| Tool | Purpose | Documentation |
|------|---------|---------------|
| [quantise_gguf.py](./quantise_gguf.py) | Advanced GGUF quantisation with Bartowski's proven profiles (Q3_K-Q6_K variants) | [📖 Docs](docs/quantise_gguf.md) • [🔬 Analysis](docs/bartowski_analysis.md) |
| [safetensors2gguf.py](./safetensors2gguf.py) | Direct SafeTensors conversion for unsupported architectures | [📖 Docs](docs/safetensors2gguf.md) |

## Quick Start

The project uses [`uv`](https://docs.astral.sh/uv/) for Rust-fast dependency management with automatic Python version handling:

```bash
# Install uv (or update an existing install: uv self update)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone and set up the project
git clone https://git.tomfos.tr/tom/llm-gguf-tools.git
cd llm-gguf-tools
uv sync  # Installs llama-cpp-python with CUDA support if available

# Generate a HuggingFace token for uploads (optional)
# Visit https://huggingface.co/settings/tokens
export HF_TOKEN=your_token_here
```

Then quantise any HuggingFace model:

```bash
# Fire-and-forget quantisation with automatic upload
uv run quantise_gguf.py https://huggingface.co/meta-llama/Llama-3.2-1B

# Or convert unsupported architectures directly
uv run safetensors2gguf.py ./path/to/model
```

For importance matrix (imatrix) data and calibration techniques, see the [📖 IMatrix Data Guide](docs/imatrix_data.md).

## Development

Contributions welcome for pragmatic solutions. See the [📖 Development Guide](docs/development.md) for setup, standards, and architectural decisions.

## License

Apache 2.0 License - see the [LICENSE](./LICENSE) file for details.
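As a follow-up to the quick-start commands, it can be useful to sanity-check that a conversion or quantisation run produced a valid output file. Per the GGUF spec linked above, every GGUF file begins with the ASCII magic `GGUF` followed by a little-endian `uint32` format version. A minimal, stdlib-only sketch of that check (the helper name is illustrative and not part of these tools):

```python
import struct
from pathlib import Path

def read_gguf_header(path):
    """Return (magic_ok, version) from the first 8 bytes of a file.

    GGUF files start with the ASCII magic b"GGUF" followed by a
    little-endian uint32 format version (currently 3).
    """
    data = Path(path).read_bytes()[:8]
    if len(data) < 8:
        # Too short to contain the magic and version fields
        return False, None
    magic, version = struct.unpack("<4sI", data)
    return magic == b"GGUF", version
```

For full metadata inspection (architecture, tensor types, quantisation), the `gguf` Python package that ships with llama.cpp provides a reader, though the header check above needs no dependencies.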