# 🤖 LLM GGUF Tools
A collection of Python tools for converting and quantising language models to GGUF format, featuring advanced quantisation methods and direct SafeTensors conversion capabilities.
> 💡 Looking for quantised models? Check out tcpipuk's HuggingFace profile for models quantised using these tools!
## Available Tools
| Tool | Purpose | Documentation |
| --- | --- | --- |
| `quantise_gguf.py` | ⚡ GGUF quantisation using a variant of Bartowski's method | 📖 Docs |
| `safetensors2gguf.py` | 🔄 Direct SafeTensors to GGUF conversion | 📖 Docs |
## Installation

1. You need `uv` for the dependencies:

   ```shell
   # Install uv (see https://docs.astral.sh/uv/#installation for more options)
   curl -LsSf https://astral.sh/uv/install.sh | sh
   # Or update your existing installation
   uv self update
   ```

2. Then set up the environment for these scripts:

   ```shell
   # Clone the repository
   git clone https://git.tomfos.tr/tom/llm-gguf-tools.git
   cd llm-gguf-tools

   # Set up virtual environment and install dependencies
   uv sync
   ```
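Once `uv sync` completes, the scripts run inside uv's managed virtual environment via `uv run`. A minimal sketch; the `--help` flag is an assumption (the usual argparse convention), so check each tool's documentation for its actual options:

```shell
# Run a tool from the repository root through uv's managed environment.
# The --help flag is assumed, not documented here -- consult the tool docs.
if [ -f quantise_gguf.py ] && command -v uv >/dev/null 2>&1; then
  uv run python quantise_gguf.py --help
else
  echo "run this from the repository root after installing uv"
fi
```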
## Requirements

- For quantisation: llama.cpp binaries (`llama-quantize`, `llama-cli`, `llama-imatrix`)
- For BFloat16 models: PyTorch (optional, auto-detected)
- For uploads: HuggingFace API token (set the `HF_TOKEN` environment variable)
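A quick sketch for verifying these prerequisites before running the tools (the helper function name is illustrative, not part of the repo):

```shell
# check_bins: report any executables from the given list missing from PATH
check_bins() {
  for bin in "$@"; do
    command -v "$bin" >/dev/null 2>&1 || echo "missing: $bin"
  done
}

# llama.cpp binaries needed for quantisation
check_bins llama-quantize llama-cli llama-imatrix

# Uploads need a HuggingFace token in the HF_TOKEN environment variable
[ -n "$HF_TOKEN" ] || echo "HF_TOKEN not set (uploads disabled)"
```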
## Development
For development setup and contribution guidelines, see 📖 Development Guide.
## Notes

The `resources/imatrix_data.txt` file contains importance matrix calibration data from Bartowski's Gist, based on calibration data provided by Dampf, building upon Kalomaze's foundational work.
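For reference, calibration data like this is typically fed to llama.cpp's `llama-imatrix` tool to produce an importance matrix for quantisation. A sketch under assumptions: the model filename is illustrative, and while `-m`/`-f`/`-o` are standard `llama-imatrix` options, check your build's help output:

```shell
# Generate an importance matrix from the repo's calibration data.
# model-f16.gguf is an assumed input filename; adjust to your model.
if command -v llama-imatrix >/dev/null 2>&1; then
  llama-imatrix -m model-f16.gguf -f resources/imatrix_data.txt -o imatrix.dat
else
  echo "llama-imatrix not on PATH; build llama.cpp first"
fi
```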
## License
Apache 2.0 License - see LICENSE file for details.