# 🤖 LLM GGUF Tools

A collection of Python tools for converting and quantising language models to
[GGUF format](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md), featuring advanced
quantisation methods and direct SafeTensors conversion capabilities.

> 💡 **Looking for quantised models?** Check out [tcpipuk's HuggingFace profile](https://huggingface.co/tcpipuk)
> for models quantised using these tools!

## Available Tools

| Tool | Purpose | Documentation |
|------|---------|---------------|
| [quantise_gguf.py](./quantise_gguf.py) | ⚡ GGUF quantisation using a variant of [Bartowski's method](https://huggingface.co/bartowski) | [📖 Docs](docs/quantise_gguf.md) |
| [safetensors2gguf.py](./safetensors2gguf.py) | 🔄 Direct SafeTensors to GGUF conversion | [📖 Docs](docs/safetensors2gguf.md) |

## Installation

1. Install [`uv`](https://docs.astral.sh/uv/) to manage the dependencies:

   ```bash
   # Install uv (see https://docs.astral.sh/uv/#installation for more options)
   curl -LsSf https://astral.sh/uv/install.sh | sh

   # Or update your existing installation
   uv self update
   ```

2. Set up the environment for these scripts:

   ```bash
   # Clone the repository
   git clone https://git.tomfos.tr/tom/llm-gguf-tools.git
   cd llm-gguf-tools

   # Set up virtual environment and install dependencies
   uv sync
   ```

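With the environment synced, a quick sanity check is to print each script's help text. This is a sketch only — it assumes both scripts expose a standard argparse-style `--help` flag, which this README does not confirm:

```shell
# Run from inside the llm-gguf-tools directory; uv run executes the
# scripts inside the project's virtual environment
uv run quantise_gguf.py --help
uv run safetensors2gguf.py --help
```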
## Requirements

- **For quantisation**: [llama.cpp](https://github.com/ggerganov/llama.cpp) binaries
  (`llama-quantize`, `llama-cli`, `llama-imatrix`)
- **For BFloat16 models**: PyTorch (optional, auto-detected)
- **For uploads**: HuggingFace API token (set the `HF_TOKEN` environment variable)

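The llama.cpp binaries must be discoverable on your `PATH`, and uploads need `HF_TOKEN` exported in your shell. A minimal setup sketch — the build path and token below are placeholders, not values from this repository:

```shell
# Hypothetical location: adjust to wherever you built or downloaded llama.cpp
export PATH="$HOME/llama.cpp/build/bin:$PATH"

# Placeholder token: create a real one in your HuggingFace account settings
export HF_TOKEN="hf_example_token"

# Confirm the token is visible to child processes
test -n "$HF_TOKEN" && echo "HF_TOKEN is set"
```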
## Development

For development setup and contribution guidelines, see the [📖 Development Guide](docs/development.md).

## Notes

The `resources/imatrix_data.txt` file contains importance matrix calibration data from
[Bartowski's Gist](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8),
based on calibration data provided by Dampf, building upon Kalomaze's foundational work.

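As a sketch of how such calibration data is typically consumed by llama.cpp's `llama-imatrix` tool (the model and output filenames here are hypothetical, and this README does not confirm the exact workflow):

```shell
# Generate an importance matrix from the bundled calibration data;
# the resulting imatrix.dat can then be passed to quantisation
llama-imatrix -m model.gguf -f resources/imatrix_data.txt -o imatrix.dat
```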
## License

Apache 2.0 License - see the [LICENSE](./LICENSE) file for details.