# 🤖 LLM GGUF Tools
A collection of Python tools for converting and quantising language models to GGUF format, featuring advanced quantisation methods and direct SafeTensors conversion capabilities.
> 💡 Looking for quantised models? Check out tcpipuk's HuggingFace profile for models quantised using these tools!
## Available Tools
| Tool | Purpose | Documentation |
| --- | --- | --- |
| `quantise_gguf.py` | ⚡ GGUF quantisation using a variant of Bartowski's method | 📖 Docs |
| `safetensors2gguf.py` | 🔄 Direct SafeTensors to GGUF conversion | 📖 Docs |
## Installation

1. You need `uv` for the dependencies:

   ```shell
   # Install uv (see https://docs.astral.sh/uv/#installation for more options)
   curl -LsSf https://astral.sh/uv/install.sh | sh
   # Or update your existing installation
   uv self update
   ```

2. Then set up the environment for these scripts:

   ```shell
   # Clone the repository
   git clone https://git.tomfos.tr/tom/llm-gguf-tools.git
   cd llm-gguf-tools

   # Set up virtual environment and install dependencies
   uv sync
   ```
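Once `uv sync` completes, the scripts run inside uv's managed virtual environment via `uv run`. A minimal sketch; the `--help` flag is an assumption (the usual argparse convention), so check each tool's documentation for its actual options:

```shell
# Run a tool from the repository root through uv's managed environment.
# The --help flag is assumed, not documented here -- consult the tool docs.
if [ -f quantise_gguf.py ] && command -v uv >/dev/null 2>&1; then
  uv run python quantise_gguf.py --help
else
  echo "run this from the repository root after installing uv"
fi
```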
## Requirements

- For quantisation: llama.cpp binaries (`llama-quantize`, `llama-cli`, `llama-imatrix`)
- For BFloat16 models: PyTorch (optional, auto-detected)
- For uploads: HuggingFace API token (set the `HF_TOKEN` environment variable)
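A quick sketch for verifying these prerequisites before running the tools (the helper function name is illustrative, not part of the repo):

```shell
# check_bins: report any executables from the given list missing from PATH
check_bins() {
  for bin in "$@"; do
    command -v "$bin" >/dev/null 2>&1 || echo "missing: $bin"
  done
}

# llama.cpp binaries needed for quantisation
check_bins llama-quantize llama-cli llama-imatrix

# Uploads need a HuggingFace token in the HF_TOKEN environment variable
[ -n "$HF_TOKEN" ] || echo "HF_TOKEN not set (uploads disabled)"
```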
## Development
For development setup and contribution guidelines, see 📖 Development Guide.
## Notes

The `resources/imatrix_data.txt` file contains importance matrix calibration data from Bartowski's Gist, based on calibration data provided by Dampf, building upon Kalomaze's foundational work.
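For reference, calibration data like this is typically fed to llama.cpp's `llama-imatrix` tool to produce an importance matrix for quantisation. A sketch under assumptions: the model filename is illustrative, and while `-m`/`-f`/`-o` are standard `llama-imatrix` options, check your build's help output:

```shell
# Generate an importance matrix from the repo's calibration data.
# model-f16.gguf is an assumed input filename; adjust to your model.
if command -v llama-imatrix >/dev/null 2>&1; then
  llama-imatrix -m model-f16.gguf -f resources/imatrix_data.txt -o imatrix.dat
else
  echo "llama-imatrix not on PATH; build llama.cpp first"
fi
```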
## License
Apache 2.0 License - see LICENSE file for details.