Refactor to extract helpers

This commit is contained in:
parent 485e838bf5
commit 0294945904

27 changed files with 6701 additions and 115 deletions
113 .gitignore (vendored)

@@ -1,4 +1,3 @@
-# ---> Python
 # Byte-compiled / optimized / DLL files
 __pycache__/
 *.py[cod]
@@ -27,16 +26,6 @@ share/python-wheels/
 *.egg
 MANIFEST
 
-# PyInstaller
-# Usually these files are written by a python script from a template
-# before PyInstaller builds the exe, so as to inject date/other infos into it.
-*.manifest
-*.spec
-
-# Installer logs
-pip-log.txt
-pip-delete-this-directory.txt
-
 # Unit test / coverage reports
 htmlcov/
 .tox/
@@ -52,76 +41,6 @@ coverage.xml
 .pytest_cache/
 cover/
 
-# Translations
-*.mo
-*.pot
-
-# Django stuff:
-*.log
-local_settings.py
-db.sqlite3
-db.sqlite3-journal
-
-# Flask stuff:
-instance/
-.webassets-cache
-
-# Scrapy stuff:
-.scrapy
-
-# Sphinx documentation
-docs/_build/
-
-# PyBuilder
-.pybuilder/
-target/
-
-# Jupyter Notebook
-.ipynb_checkpoints
-
-# IPython
-profile_default/
-ipython_config.py
-
-# pyenv
-# For a library or package, you might want to ignore these files since the code is
-# intended to run in multiple environments; otherwise, check them in:
-# .python-version
-
-# pipenv
-# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
-# However, in case of collaboration, if having platform-specific dependencies or dependencies
-# having no cross-platform support, pipenv may install dependencies that don't work, or not
-# install all needed dependencies.
-#Pipfile.lock
-
-# poetry
-# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
-# This is especially recommended for binary packages to ensure reproducibility, and is more
-# commonly ignored for libraries.
-# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
-#poetry.lock
-
-# pdm
-# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
-#pdm.lock
-# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
-# in version control.
-# https://pdm.fming.dev/latest/usage/project/#working-with-version-control
-.pdm.toml
-.pdm-python
-.pdm-build/
-
-# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
-__pypackages__/
-
-# Celery stuff
-celerybeat-schedule
-celerybeat.pid
-
-# SageMath parsed files
-*.sage.py
-
 # Environments
 .env
 .venv
@@ -130,35 +49,3 @@ venv/
 ENV/
 env.bak/
 venv.bak/
-
-# Spyder project settings
-.spyderproject
-.spyproject
-
-# Rope project settings
-.ropeproject
-
-# mkdocs documentation
-/site
-
-# mypy
-.mypy_cache/
-.dmypy.json
-dmypy.json
-
-# Pyre type checker
-.pyre/
-
-# pytype static type analyzer
-.pytype/
-
-# Cython debug symbols
-cython_debug/
-
-# PyCharm
-# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
-# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
-# and can be added to the global gitignore or merged into this file. For a more nuclear
-# option (not recommended) you can uncomment the following to ignore the entire idea folder.
-#.idea/
-
57 README.md

@@ -1,3 +1,56 @@
-# llm-gguf-tools
+# LLM GGUF Tools
 
-Tools to convert/quantise language models in GGUF format
+A collection of Python tools for converting and quantising language models to
+[GGUF format](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md), featuring advanced
+quantisation methods and direct SafeTensors conversion capabilities.
+
+## Available Tools
+
+| Tool | Purpose | Documentation |
+|------|---------|---------------|
+| [quantize_gguf.py](./quantize_gguf.py) | GGUF quantisation using [Bartowski's method](https://huggingface.co/bartowski) | [📖 Docs](docs/quantize_gguf.md) |
+| [safetensors2gguf.py](./safetensors2gguf.py) | Direct SafeTensors to GGUF conversion | [📖 Docs](docs/safetensors2gguf.md) |
+
+## Installation
+
+1. Install [`uv`](https://docs.astral.sh/uv/) to manage the dependencies:
+
+   ```bash
+   # Install uv (see https://docs.astral.sh/uv/#installation for more options)
+   curl -LsSf https://astral.sh/uv/install.sh | sh
+
+   # Or update your existing instance
+   uv self update
+   ```
+
+2. Set up the environment for these scripts:
+
+   ```bash
+   # Clone the repository
+   git clone https://git.tomfos.tr/tom/llm-gguf-tools.git
+   cd llm-gguf-tools
+
+   # Set up virtual environment and install dependencies
+   uv sync
+   ```
+
+## Requirements
+
+- **For quantisation**: [llama.cpp](https://github.com/ggerganov/llama.cpp) binaries
+  (`llama-quantize`, `llama-cli`, `llama-imatrix`)
+- **For BFloat16 models**: PyTorch (optional, auto-detected)
+- **For uploads**: HuggingFace API token (set `HF_TOKEN` environment variable)
+
+## Development
+
+For development setup and contribution guidelines, see the [📖 Development Guide](docs/development.md).
+
+## Notes
+
+The `resources/imatrix_data.txt` file contains importance matrix calibration data from
+[Bartowski's Gist](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8),
+based on calibration data provided by Dampf, building upon Kalomaze's foundational work.
+
+## License
+
+Apache 2.0 License - see the [LICENSE](./LICENSE) file for details.
86 docs/development.md (new file)

@@ -0,0 +1,86 @@
# Development Guide

This guide covers development setup, code quality standards, and project structure for contributors.

## Code Quality

```bash
# Run linting
uv run ruff check

# Format code
uv run ruff format

# Run with debug logging
DEBUG=true uv run <script>
```

## Project Structure

```plain
llm-gguf-tools/
├── quantize_gguf.py        # Bartowski quantisation tool
├── safetensors2gguf.py     # Direct conversion tool
├── helpers/                # Shared utilities
│   ├── __init__.py
│   ├── logger.py           # Colour-coded logging
│   ├── config/             # Quantisation configurations
│   ├── models/             # Pydantic data models
│   └── services/           # Filesystem, GGUF and HuggingFace services
├── resources/              # Resource files
│   └── imatrix_data.txt    # Calibration data for imatrix
├── docs/                   # Detailed documentation
│   ├── quantize_gguf.md
│   ├── safetensors2gguf.md
│   └── development.md
└── pyproject.toml          # Project configuration
```

## Contributing Guidelines

Contributions are welcome! Please ensure:

1. Code follows the existing style (run `uv run ruff format`)
2. All functions have Google-style docstrings
3. Type hints are used throughout
4. Tests pass (if applicable)

## Development Workflow

### Setting Up Development Environment

```bash
# Clone the repository
git clone https://git.tomfos.tr/tom/llm-gguf-tools.git
cd llm-gguf-tools

# Install all dependencies including dev
uv sync --all-groups
```

### Code Style

- Follow PEP 8 with ruff enforcement
- Use UK English spelling in comments and documentation
- Maximum line length: 100 characters
- Use type hints for all function parameters and returns
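
For reference, a function following these conventions might look like this (an illustrative sketch, not code from the repository):

```python
from pathlib import Path


def count_gguf_files(model_dir: Path) -> int:
    """Count GGUF files in a model directory.

    Args:
        model_dir: Directory to search for .gguf files.

    Returns:
        Number of GGUF files found.
    """
    return len(list(model_dir.glob("*.gguf")))
```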

### Testing

While formal tests are not yet implemented, ensure:

- Scripts run without errors on sample models
- Logger output is correctly formatted
- File I/O operations handle errors gracefully

### Debugging

Enable debug logging for verbose output:

```bash
DEBUG=true uv run quantize_gguf.py <model_url>
```

This will show additional information about:

- Model download progress
- Conversion steps
- File operations
- Error details
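
The verbose output comes from the shared colour-coded logger in `helpers/logger.py`; a minimal usage sketch (the message text is illustrative):

```python
from helpers.logger import logger

logger.debug("Checking llama.cpp binaries...")  # only shown when DEBUG=true
logger.info("Conversion complete")
logger.warning("PyTorch not available, BFloat16 models may not convert properly")
```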
102 docs/quantize_gguf.md (new file)

@@ -0,0 +1,102 @@
# quantize_gguf.py - Advanced GGUF Quantisation

Advanced GGUF quantisation tool implementing Bartowski's sophisticated quantisation pipeline.

## Overview

This tool automates the complete quantisation workflow for converting models to GGUF format with
multiple precision variants, importance matrix generation, and automatic upload to HuggingFace.

## Quantisation Variants

The tool produces four quantisation variants based on Bartowski's method:

- **Q4_K_M**: Standard baseline quantisation
- **Q4_K_L**: Q6_K embeddings + Q6_K attention layers for better quality
- **Q4_K_XL**: Q8_0 embeddings + Q6_K attention layers for enhanced precision
- **Q4_K_XXL**: Q8_0 embeddings + Q8_0 attention for maximum precision
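
These variants map to tensor-level overrides defined in `helpers/config/quantisation_configs.py` (added in this commit). For example, the Q4_K_L entry raises the embedding, output, and attention tensors to Q6_K while the rest of the model stays at Q4_K_M:

```python
# Q4_K_L tensor overrides, from helpers/config/quantisation_configs.py
tensor_types = {
    "token_embd.weight": "Q6_K",
    "output.weight": "Q6_K",
    "lm_head.weight": "Q6_K",
    "blk.*.attn_q.weight": "Q6_K",
    "blk.*.attn_k.weight": "Q6_K",
    "blk.*.attn_v.weight": "Q6_K",
}
```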

## Features

- **Automatic model download**: Downloads models from HuggingFace automatically
- **Importance matrix generation**: Creates an imatrix for improved quantisation quality
- **Parallel processing**: Uploads multiple variants simultaneously
- **Progress tracking**: Real-time status updates during conversion
- **README generation**: Automatically creates model cards with quantisation details
- **HuggingFace integration**: Direct upload to HuggingFace with proper metadata

## Usage

### Basic Usage

```bash
# Quantise a model from HuggingFace
uv run quantize_gguf.py https://huggingface.co/meta-llama/Llama-3.2-1B
```

### Command Line Options

```bash
# Skip imatrix generation for faster processing
uv run quantize_gguf.py <model_url> --no-imatrix

# Local testing without upload
uv run quantize_gguf.py <model_url> --no-upload

# Custom output directory
uv run quantize_gguf.py <model_url> --output-dir ./my-models

# Use a specific HuggingFace token
uv run quantize_gguf.py <model_url> --hf-token YOUR_TOKEN
```

## Environment Variables

- `HF_TOKEN`: HuggingFace API token for uploads
- `LLAMA_CPP_DIR`: Custom path to llama.cpp binaries
- `DEBUG`: Enable debug logging when set to "true"

## Requirements

- **llama.cpp binaries**: `llama-quantize`, `llama-cli`, `llama-imatrix`
- **Calibration data**: `resources/imatrix_data.txt` for importance matrix generation
- **HuggingFace account**: For uploading quantised models (optional)

## Workflow

1. **Download**: Fetches the model from HuggingFace
2. **Convert**: Converts to initial GGUF format (F32)
3. **Generate imatrix**: Creates an importance matrix using calibration data
4. **Quantise**: Produces multiple quantisation variants in parallel
5. **Upload**: Pushes quantised models to HuggingFace with metadata
6. **Clean up**: Removes temporary files and caches

## Output Structure

```plain
output_dir/
├── model-F32.gguf            # Full precision conversion
├── model-Q4_K_M.gguf         # Standard quantisation
├── model-Q4_K_M-imat.gguf    # With importance matrix
├── model-Q4_K_L-imat.gguf    # Enhanced embeddings/attention
├── model-Q4_K_XL-imat.gguf   # High precision embeddings
├── model-Q4_K_XXL-imat.gguf  # Maximum precision
└── imatrix.dat               # Generated importance matrix
```

## Error Handling

The tool includes comprehensive error handling for:

- Network failures during download
- Missing binaries or dependencies
- Insufficient disk space
- HuggingFace API errors
- Conversion failures

## Performance Considerations

- **Disk space**: Requires ~3x the model size in free space
- **Memory**: Needs RAM proportional to model size
- **Processing time**: Varies from minutes to hours based on model size
- **Network**: Downloads can be large (10-100+ GB for large models)
164 docs/safetensors2gguf.md (new file)

@@ -0,0 +1,164 @@
# safetensors2gguf.py - Direct SafeTensors Conversion

Direct SafeTensors to GGUF converter for unsupported architectures.

## Overview

This tool converts SafeTensors models directly to GGUF format without requiring specific
architecture support in llama.cpp. It's particularly useful for experimental models, custom
architectures, or when llama.cpp's standard conversion tools don't recognise your model
architecture.

## Features

- **Architecture-agnostic**: Works with unsupported model architectures
- **Automatic mapping**: Intelligently maps tensor names to GGUF conventions
- **BFloat16 support**: Handles BF16 tensors with PyTorch (optional)
- **Vision models**: Supports models with vision components
- **Tokeniser preservation**: Extracts and includes tokeniser metadata
- **Fallback mechanisms**: Provides sensible defaults for unknown architectures

## Usage

### Basic Usage

```bash
# Convert a local SafeTensors model
uv run safetensors2gguf.py /path/to/model/directory
```

### Command Line Options

```bash
# Specify the output file
uv run safetensors2gguf.py /path/to/model -o output.gguf

# Force a specific architecture mapping
uv run safetensors2gguf.py /path/to/model --force-arch qwen2

# Convert with a custom output path
uv run safetensors2gguf.py ./my-model --output ./converted/my-model.gguf
```

## Supported Input Formats

The tool automatically detects and handles:

1. **Single-file models**: `model.safetensors`
2. **Sharded models**: `model-00001-of-00005.safetensors`, etc.
3. **Custom names**: any `*.safetensors` files in the directory
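
This priority order is implemented by `FilesystemService.find_safetensor_files` in `helpers/services/filesystem.py` (part of this commit); a trimmed sketch (the real method raises `FileNotFoundError` when nothing matches):

```python
from pathlib import Path


def find_safetensor_files(model_path: Path) -> list[Path]:
    """Return SafeTensors files in priority order (trimmed sketch)."""
    # 1. A single-file model takes priority
    single_file = model_path / "model.safetensors"
    if single_file.exists():
        return [single_file]
    # 2. Sharded files, sorted for deterministic ordering
    sharded = sorted(model_path.glob("model-*-of-*.safetensors"))
    if sharded:
        return sharded
    # 3. Fall back to any .safetensors files in the directory
    return sorted(model_path.glob("*.safetensors"))
```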

## Architecture Mapping

The tool includes built-in mappings for several architectures:

- `DotsOCRForCausalLM` → `qwen2`
- `GptOssForCausalLM` → `llama`
- Unknown architectures → `llama` (fallback)

You can override these with the `--force-arch` parameter.

## Tensor Name Mapping

The converter automatically maps common tensor patterns:

| Original Pattern | GGUF Name |
|-----------------|-----------|
| `model.embed_tokens.weight` | `token_embd.weight` |
| `model.norm.weight` | `output_norm.weight` |
| `lm_head.weight` | `output.weight` |
| `layers.N.self_attn.q_proj` | `blk.N.attn_q` |
| `layers.N.self_attn.k_proj` | `blk.N.attn_k` |
| `layers.N.self_attn.v_proj` | `blk.N.attn_v` |
| `layers.N.mlp.gate_proj` | `blk.N.ffn_gate` |
| `layers.N.mlp.up_proj` | `blk.N.ffn_up` |
| `layers.N.mlp.down_proj` | `blk.N.ffn_down` |
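
A minimal sketch of how this pattern-based renaming can be applied (illustrative only; the converter's actual rules live in the `TensorMapping` model in `helpers/models/conversion.py`):

```python
import re

DIRECT = {
    "model.embed_tokens.weight": "token_embd.weight",
    "model.norm.weight": "output_norm.weight",
    "lm_head.weight": "output.weight",
}
LAYER = {
    "self_attn.q_proj.weight": "attn_q.weight",
    "self_attn.k_proj.weight": "attn_k.weight",
    "self_attn.v_proj.weight": "attn_v.weight",
    "mlp.gate_proj.weight": "ffn_gate.weight",
    "mlp.up_proj.weight": "ffn_up.weight",
    "mlp.down_proj.weight": "ffn_down.weight",
}


def map_tensor_name(name: str) -> str | None:
    """Map a HuggingFace tensor name to its GGUF equivalent, or None to skip."""
    if name in DIRECT:
        return DIRECT[name]
    match = re.match(r"model\.layers\.(\d+)\.(.+)", name)
    if match:
        layer, rest = match.groups()
        if rest in LAYER:
            return f"blk.{layer}.{LAYER[rest]}"
    return None  # unmapped tensors are skipped
```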

## Configuration Requirements

The model directory must contain:

- **config.json**: Model configuration file (required)
- **\*.safetensors**: One or more SafeTensors files (required)
- **tokenizer_config.json**: Tokeniser configuration (optional)
- **tokenizer.json**: Tokeniser data (optional)

## Output Format

The tool produces a single GGUF file containing:

- All model weights in F32 format
- Model architecture metadata
- Tokeniser configuration (if available)
- Special token IDs (BOS, EOS, UNK, PAD)
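
Where the tokeniser configuration omits these IDs, `GGUFWriter.add_tokeniser` in `helpers/services/gguf.py` (also part of this commit) falls back to BOS = 1, EOS = 2, UNK = 0 and PAD = 0, and records `llama` as the default tokeniser model.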

## Error Handling

| Error | Message | Solution |
|-------|---------|----------|
| Missing config.json | `FileNotFoundError: Config file not found` | Ensure the model directory contains a valid `config.json` file |
| No SafeTensors files | `FileNotFoundError: No safetensor files found` | Check that the directory contains `.safetensors` files |
| BFloat16 without PyTorch | `Warning: PyTorch not available, BFloat16 models may not convert properly` | Install PyTorch for BF16 support: `uv add torch` |
| Unknown architecture | `Warning: Unknown architecture X, using llama as fallback` | Use `--force-arch` to specify a known compatible architecture |

## Technical Details

### Parameter Inference

The tool infers GGUF parameters from the model configuration:

- `vocab_size` → vocabulary size (default: 32000)
- `max_position_embeddings` → context length (default: 2048)
- `hidden_size` → embedding dimension (default: 4096)
- `num_hidden_layers` → number of transformer blocks (default: 32)
- `num_attention_heads` → attention head count (default: 32)
- `num_key_value_heads` → KV head count (defaults to attention heads)
- `rope_theta` → RoPE frequency base (default: 10000.0)
- `rms_norm_eps` → layer normalisation epsilon (default: 1e-5)
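
In this commit, this inference is implemented by `ModelConfig.to_gguf_params` in `helpers/models/conversion.py`. A self-contained sketch of the same logic over a raw `config.json` dictionary (`infer_gguf_params` is an illustrative name, not repository code):

```python
from typing import Any


def infer_gguf_params(config: dict[str, Any]) -> dict[str, Any]:
    """Map HuggingFace config.json fields to GGUF parameters, with defaults."""
    heads = config.get("num_attention_heads", 32)
    hidden = config.get("hidden_size", 4096)
    return {
        "vocab_size": config.get("vocab_size", 32000),
        "context_length": config.get("max_position_embeddings", 2048),
        "embedding_length": hidden,
        "block_count": config.get("num_hidden_layers", 32),
        "attention.head_count": heads,
        # KV head count falls back to the full attention head count (plain MHA)
        "attention.head_count_kv": config.get("num_key_value_heads") or heads,
        "rope.freq_base": config.get("rope_theta", 10000.0),
        "attention.layer_norm_rms_epsilon": config.get("rms_norm_eps", 1e-5),
        # RoPE dimension count is derived from the per-head dimension
        "rope.dimension_count": hidden // heads,
    }
```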

### Vision Model Support

For models with vision components, the tool extracts:

- Vision embedding dimensions
- Vision transformer block count
- Vision attention heads
- Vision feed-forward dimensions
- Patch size and spatial merge parameters

## Limitations

- **F32 only**: Currently outputs only full-precision (F32) models
- **Architecture guessing**: May require manual architecture specification
- **Tokeniser compatibility**: Uses the llama tokeniser as the default fallback
- **Memory usage**: Requires loading full tensors into memory

## Examples

### Converting a custom model

```bash
# Download a model first
git clone https://huggingface.co/my-org/my-model ./my-model

# Convert to GGUF
uv run safetensors2gguf.py ./my-model

# Output will be at ./my-model/my-model-f32.gguf
```

### Converting with a specific architecture

```bash
# For a Qwen2-based model
uv run safetensors2gguf.py ./qwen-model --force-arch qwen2
```

### Batch conversion

```bash
# Convert multiple models
for model in ./models/*; do
    uv run safetensors2gguf.py "$model" -o "./gguf/$(basename "$model").gguf"
done
```
6 helpers/__init__.py (new file)

@@ -0,0 +1,6 @@
"""Helper utilities for LLM GGUF tools.

This package provides common utilities, logging, and shared functionality
used across the quantisation and conversion tools. Uses UK English spelling
conventions throughout.
"""
6 helpers/config/__init__.py (new file)

@@ -0,0 +1,6 @@
"""Configuration module for quantisation settings and tensor-level precision control.

Provides structured configuration definitions for Bartowski quantisation methods
including Q4_K_M, Q4_K_L, Q4_K_XL, and Q4_K_XXL variants with fallback strategies
for different model architectures and deployment scenarios.
"""
95 helpers/config/quantisation_configs.py (new file)

@@ -0,0 +1,95 @@
"""Quantisation configuration definitions.

Pre-defined quantisation configurations for the Bartowski method, supporting
Q4_K_M, Q4_K_L, Q4_K_XL, and Q4_K_XXL variants with tensor-level precision control.
"""

from __future__ import annotations

from helpers.models.quantisation import QuantisationConfig, QuantisationType

QUANTISATION_CONFIGS: dict[QuantisationType, QuantisationConfig] = {
    QuantisationType.Q4_K_M: QuantisationConfig(
        name="Q4_K_M",
        description="Standard Q4_K_M quantisation (baseline)",
        tensor_types={},  # No special tensor overrides - uses default Q4_K_M
        fallback_methods=[],
    ),
    QuantisationType.Q4_K_L: QuantisationConfig(
        name="Q4_K_L",
        description="Q6_K embeddings + Q6_K attention (+753MB for vocab + reasoning)",
        tensor_types={
            "token_embd.weight": "Q6_K",
            "output.weight": "Q6_K",
            "lm_head.weight": "Q6_K",
            "blk.*.attn_q.weight": "Q6_K",
            "blk.*.attn_k.weight": "Q6_K",
            "blk.*.attn_v.weight": "Q6_K",
        },
        fallback_methods=[
            {
                "embed_tokens.weight": "Q6_K",
                "output.weight": "Q6_K",
                "lm_head.weight": "Q6_K",
                "blk.*.attn_q.weight": "Q6_K",
                "blk.*.attn_k.weight": "Q6_K",
                "blk.*.attn_v.weight": "Q6_K",
            },
            {"token-embedding-type": "Q6_K", "output-tensor-type": "Q6_K"},
        ],
    ),
    QuantisationType.Q4_K_XL: QuantisationConfig(
        name="Q4_K_XL",
        description="Q8_0 embeddings + Q6_K attention (+2.1GB for vocabulary + reasoning)",
        tensor_types={
            "token_embd.weight": "Q8_0",
            "output.weight": "Q8_0",
            "lm_head.weight": "Q8_0",
            "blk.*.attn_q.weight": "Q6_K",
            "blk.*.attn_k.weight": "Q6_K",
            "blk.*.attn_v.weight": "Q6_K",
        },
        fallback_methods=[
            {
                "embed_tokens.weight": "Q8_0",
                "output.weight": "Q8_0",
                "lm_head.weight": "Q8_0",
                "blk.*.attn_q.weight": "Q6_K",
                "blk.*.attn_k.weight": "Q6_K",
                "blk.*.attn_v.weight": "Q6_K",
            },
            {"token-embedding-type": "Q8_0", "output-tensor-type": "Q8_0"},
        ],
    ),
    QuantisationType.Q4_K_XXL: QuantisationConfig(
        name="Q4_K_XXL",
        description="Q8_0 embeddings + Q8_0 attention (+2.8GB total, maximum precision)",
        tensor_types={
            "token_embd.weight": "Q8_0",
            "output.weight": "Q8_0",
            "lm_head.weight": "Q8_0",
            "blk.*.attn_q.weight": "Q8_0",
            "blk.*.attn_k.weight": "Q8_0",
            "blk.*.attn_v.weight": "Q8_0",
        },
        fallback_methods=[
            {
                "embed_tokens.weight": "Q8_0",
                "output.weight": "Q8_0",
                "lm_head.weight": "Q8_0",
                "blk.*.attn_q.weight": "Q8_0",
                "blk.*.attn_k.weight": "Q8_0",
                "blk.*.attn_v.weight": "Q8_0",
            },
            {"token-embedding-type": "Q8_0", "output-tensor-type": "Q8_0"},
        ],
    ),
}


SUPPORTED_QUANTISATION_TYPES: list[QuantisationType] = [
    QuantisationType.Q4_K_M,
    QuantisationType.Q4_K_L,
    QuantisationType.Q4_K_XL,
    QuantisationType.Q4_K_XXL,
]
94 helpers/logger.py (new file)

@@ -0,0 +1,94 @@
"""Colour-coded logging configuration for LLM GGUF tools.

Provides a consistent logging interface with colour-coded output for different
log levels, making it easier to identify warnings, errors, and informational
messages at a glance during tool execution and debugging sessions.
"""

from __future__ import annotations

from logging import (
    CRITICAL,
    DEBUG,
    ERROR,
    INFO,
    WARNING,
    Formatter as LoggingFormatter,
    Logger,
    LogRecord,
    StreamHandler as LoggingStreamHandler,
    getLogger,
)
from os import getenv as os_getenv
from sys import stdout as sys_stdout
from typing import ClassVar

DEBUG_MODE = os_getenv("DEBUG", "false").lower() == "true"


class ColourFormatter(LoggingFormatter):
    """Custom formatter adding colours to log messages based on severity level.

    Uses ANSI escape codes to provide visual distinction between different
    log levels in terminal output. Supports standard logging levels with
    appropriate colour coding: DEBUG (cyan), INFO (green), WARNING (yellow),
    ERROR (red), and CRITICAL (bold red) for immediate visual feedback.
    """

    # ANSI colour codes
    COLOURS: ClassVar[dict[int, str]] = {
        DEBUG: "\033[36m",  # Cyan
        INFO: "\033[32m",  # Green
        WARNING: "\033[33m",  # Yellow
        ERROR: "\033[31m",  # Red
        CRITICAL: "\033[1;31m",  # Bold Red
    }
    RESET = "\033[0m"

    # Emoji prefixes for different levels
    EMOJIS: ClassVar[dict[int, str]] = {
        DEBUG: "🔍",
        INFO: "ℹ️ ",  # noqa: RUF001
        WARNING: "⚠️ ",
        ERROR: "❌",
        CRITICAL: "🔥",
    }

    def format(self, record: LogRecord) -> str:
        """Format log record with colour and emoji based on severity level.

        Enhances standard log formatting by prepending ANSI colour codes and
        emoji indicators, then appending reset codes to prevent colour bleeding.
        Maintains standard log structure whilst adding visual enhancements for
        improved readability in terminal environments.

        Returns:
            str: Formatted log message with colour and emoji.
        """
        # Get colour for this level
        colour = self.COLOURS.get(record.levelno, "")
        emoji = self.EMOJIS.get(record.levelno, "")

        # Format the message
        record.msg = f"{emoji} {record.msg}"
        formatted = super().format(record)

        # Add colour codes
        return f"{colour}{formatted}{self.RESET}"


# Create and configure the logger
logger: Logger = getLogger("llm-gguf-tools")
logger.setLevel(DEBUG if DEBUG_MODE else INFO)

# Create console handler with colour formatter
handler = LoggingStreamHandler(sys_stdout)
handler.setLevel(DEBUG if DEBUG_MODE else INFO)

# Set formatter without timestamp for cleaner output
formatter = ColourFormatter(fmt="%(message)s", datefmt="%H:%M:%S")
handler.setFormatter(formatter)
logger.addHandler(handler)

# Prevent propagation to root logger
logger.propagate = False
35 helpers/models/__init__.py (new file)

@@ -0,0 +1,35 @@
"""Pydantic models for llm-gguf-tools.

This module provides structured data models for quantisation and conversion
operations, ensuring type safety and validation across the toolset.
"""

from __future__ import annotations

from helpers.models.conversion import (
    GGUFParameters,
    ModelConfig,
    TensorMapping,
    VisionConfig,
)
from helpers.models.quantisation import (
    LlamaCppEnvironment,
    ModelSource,
    QuantisationConfig,
    QuantisationResult,
    QuantisationType,
    URLType,
)

__all__ = [
    "GGUFParameters",
    "LlamaCppEnvironment",
    "ModelConfig",
    "ModelSource",
    "QuantisationConfig",
    "QuantisationResult",
    "QuantisationType",
    "TensorMapping",
    "URLType",
    "VisionConfig",
]
150 helpers/models/conversion.py (new file)

@@ -0,0 +1,150 @@
"""Pydantic models for GGUF conversion operations.

Contains data models for SafeTensors to GGUF conversion including
model configurations, parameter mappings, and tensor specifications.
Uses UK English spelling conventions throughout.
"""

from __future__ import annotations

from typing import Any

from pydantic import BaseModel, ConfigDict, Field


class ModelConfig(BaseModel):
    """Parsed model configuration from HuggingFace config.json.

    Represents the standard configuration metadata extracted from HuggingFace
    models, providing structured access to architecture details, hyperparameters,
    and quantisation settings required for GGUF conversion.
    """

    model_config = ConfigDict(extra="allow")

    architectures: list[str] = Field(default_factory=lambda: ["Unknown"])
    model_type: str = "unknown"
    vocab_size: int = 32000
    max_position_embeddings: int = 2048
    hidden_size: int = 4096
    num_hidden_layers: int = 32
    intermediate_size: int = 11008
    num_attention_heads: int = 32
    num_key_value_heads: int | None = None
    rope_theta: float = 10000.0
    rope_scaling: dict[str, Any] | None = None
    rms_norm_eps: float = 1e-5
    vision_config: VisionConfig | None = None

    def to_gguf_params(self) -> GGUFParameters:
        """Convert model configuration to GGUF parameters.

        Translates HuggingFace model configuration values to GGUF-specific
        parameter format, handling defaults and calculating derived values
        like RoPE dimension count from head dimensions.

        Returns:
            GGUFParameters instance with converted values.
        """
        params = {
            "vocab_size": self.vocab_size,
            "context_length": self.max_position_embeddings,
            "embedding_length": self.hidden_size,
            "block_count": self.num_hidden_layers,
            "feed_forward_length": self.intermediate_size,
            "attention.head_count": self.num_attention_heads,
            "attention.head_count_kv": self.num_key_value_heads or self.num_attention_heads,
            "attention.layer_norm_rms_epsilon": self.rms_norm_eps,
            "rope.freq_base": self.rope_theta,
            "rope.dimension_count": self.hidden_size // self.num_attention_heads,
        }
        return GGUFParameters(**params)  # type: ignore[arg-type]


class VisionConfig(BaseModel):
    """Vision model configuration for multimodal models.

    Contains parameters specific to vision components in multimodal architectures,
    including patch sizes, embedding dimensions, and spatial merge configurations
    for proper GGUF metadata generation.
    """

    model_config = ConfigDict(extra="allow")

    hidden_size: int = 1536
    num_hidden_layers: int = 42
    num_attention_heads: int = 12
    intermediate_size: int = 4224
    patch_size: int = 14
    spatial_merge_size: int = 2
    rms_norm_eps: float | None = None


class GGUFParameters(BaseModel):
    """GGUF-specific parameters inferred from model configuration.

    Translates HuggingFace configuration values to GGUF parameter names and
    formats, providing a standardised interface for GGUF writer configuration
    across different model architectures and quantisation strategies.
    """

    model_config = ConfigDict(extra="allow")

    # Basic parameters
    vocab_size: int
    context_length: int
    embedding_length: int
    block_count: int
    feed_forward_length: int

    # Attention parameters
    attention_head_count: int = Field(alias="attention.head_count")
    attention_head_count_kv: int = Field(alias="attention.head_count_kv")
    attention_layer_norm_rms_epsilon: float = Field(alias="attention.layer_norm_rms_epsilon")

    # RoPE parameters
    rope_freq_base: float = Field(alias="rope.freq_base")
    rope_dimension_count: int = Field(alias="rope.dimension_count")
    rope_scaling_type: str | None = Field(default=None, alias="rope.scaling.type")
    rope_scaling_factor: float | None = Field(default=None, alias="rope.scaling.factor")


class TensorMapping(BaseModel):
    """Mapping configuration for tensor name conversion.

    Defines rules for translating between HuggingFace tensor naming conventions
    and GGUF tensor names, supporting both direct mappings and pattern-based
    transformations for layer-specific tensors.
    """

    model_config = ConfigDict(frozen=True)

    # Direct mappings (exact name matches)
    direct_mappings: dict[str, str] = Field(
        default_factory=lambda: {
            "model.embed_tokens.weight": "token_embd.weight",
            "model.norm.weight": "output_norm.weight",
            "lm_head.weight": "output.weight",
        }
    )

    # Layer component patterns (for .layers.N. tensors)
    layer_patterns: dict[str, str] = Field(
        default_factory=lambda: {
            "self_attn.q_proj.weight": "attn_q.weight",
            "self_attn.q_proj.bias": "attn_q.bias",
            "self_attn.k_proj.weight": "attn_k.weight",
            "self_attn.k_proj.bias": "attn_k.bias",
            "self_attn.v_proj.weight": "attn_v.weight",
            "self_attn.v_proj.bias": "attn_v.bias",
            "self_attn.o_proj": "attn_output.weight",
            "mlp.gate_proj": "ffn_gate.weight",
            "mlp.up_proj": "ffn_up.weight",
            "mlp.down_proj": "ffn_down.weight",
            "input_layernorm": "attn_norm.weight",
            "post_attention_layernorm": "ffn_norm.weight",
        }
    )

    # Architecture-specific overrides
    architecture_overrides: dict[str, dict[str, str]] = Field(default_factory=dict)
168 helpers/models/quantisation.py (new file)

@@ -0,0 +1,168 @@
"""Pydantic models for quantisation operations.

Contains data models specific to the quantisation workflow including
quantisation types, configurations, and results. Uses UK English spelling
conventions throughout (quantisation, not quantization).
"""

from __future__ import annotations

from enum import StrEnum
from typing import TYPE_CHECKING

from pydantic import BaseModel, ConfigDict, Field, field_validator

if TYPE_CHECKING:
    from pathlib import Path


class QuantisationType(StrEnum):
    """Available quantisation types for Bartowski-method GGUF model conversion.

    Defines the specific quantisation strategies supported by this tool, ranging
    from Q4_K_M baseline to Q4_K_XXL maximum precision variants. Each type
    represents different trade-offs between model size and quality preservation
    for embeddings, attention layers, and feed-forward networks.
    """

    Q4_K_M = "Q4_K_M"
    Q4_K_L = "Q4_K_L"
    Q4_K_XL = "Q4_K_XL"
    Q4_K_XXL = "Q4_K_XXL"


class URLType(StrEnum):
    """Supported URL formats for model source specification.

    Categorises input URL formats to enable appropriate handling strategies.
    HuggingFace URLs require full model download and conversion, whilst Ollama
    GGUF URLs allow direct GGUF file downloads with pattern matching for
    efficient processing of pre-quantised models.
    """

    HUGGINGFACE = "huggingface"
    OLLAMA_GGUF = "ollama_gguf"


class QuantisationConfig(BaseModel):
    """Configuration for a specific quantisation method with tensor-level precision control.

    Defines quantisation parameters including tensor type mappings and fallback
    methods for handling different model architectures. Enables fine-grained
    control over which layers receive higher precision treatment whilst
    maintaining compatibility across diverse model structures.
    """

    model_config = ConfigDict(use_enum_values=True)

    name: str
    description: str
    tensor_types: dict[str, str] = Field(default_factory=dict)
    fallback_methods: list[dict[str, str]] = Field(default_factory=list)


class ModelSource(BaseModel):
    """Represents a model source with parsed information from URL analysis.

    Contains comprehensive metadata extracted from model URLs including source
    repository details, author information, and GGUF file patterns. Enables
    differentiation between regular HuggingFace repositories requiring conversion
    and GGUF repositories allowing direct file downloads.
    """

    model_config = ConfigDict(use_enum_values=True, protected_namespaces=())

    url: str
    url_type: URLType
    source_model: str
    original_author: str
    model_name: str
    gguf_file_pattern: str | None = None
    is_gguf_repo: bool = False

    @field_validator("url")
    @classmethod
    def validate_url(cls, v: str) -> str:
        """Validate that URL is not empty.

        Ensures the provided URL string is not empty or None,
        as this is required for model source identification.

        Returns:
            The validated URL string.

        Raises:
            ValueError: If URL is empty or None.
        """
        if not v:
            msg = "URL cannot be empty"
            raise ValueError(msg)
        return v


class QuantisationResult(BaseModel):
    """Result of a quantisation operation with comprehensive status tracking.

    Captures the outcome of individual quantisation attempts including success
    status, file paths, sizes, and error details. Supports workflow status
    tracking from planning through processing to completion, enabling real-time
    progress reporting and parallel upload coordination.
    """

    model_config = ConfigDict(use_enum_values=True, arbitrary_types_allowed=True)

    quantisation_type: QuantisationType
    success: bool
    file_path: Path | None = None
    file_size: str | None = None
    method_used: str | None = None
    error_message: str | None = None
    status: str = "pending"  # planned, processing, uploading, completed, failed


class LlamaCppEnvironment(BaseModel):
    """Represents llama.cpp environment setup with binary and script locations.

    Encapsulates the runtime environment for llama.cpp tools including paths
    to quantisation binaries, CLI tools, and conversion scripts. Handles both
    local binary installations and repository-based setups to provide flexible
    deployment options across different system configurations.
    """

    model_config = ConfigDict(arbitrary_types_allowed=True)

    quantise_binary: Path  # UK spelling
    cli_binary: Path
    convert_script: str
    use_repo: bool = False


class QuantisationContext(BaseModel):
    """Context object containing all parameters needed for quantisation execution.

    Encapsulates quantisation parameters to reduce method argument counts
    and improve code maintainability following the parameter object pattern.
    """

    model_config = ConfigDict(frozen=True)

    f16_model_path: Path
    model_source: ModelSource
    config: QuantisationConfig
    llama_env: LlamaCppEnvironment
    models_dir: Path
    imatrix_path: Path | None = None
    base_quant: str = "Q4_K_M"

    def get_output_path(self) -> Path:
        """Generate output path for quantised model.

        Returns:
            Path to the output GGUF file.
        """
        output_filename = (
            f"{self.model_source.original_author}-"
            f"{self.model_source.model_name}-"
            f"{self.config.name}.gguf"
        )
        return self.models_dir / self.model_source.model_name / output_filename
20 helpers/services/__init__.py (new file)

@@ -0,0 +1,20 @@
"""Service layer for llm-gguf-tools.

Provides high-level service interfaces for interacting with external systems
including HuggingFace, llama.cpp, and filesystem operations. Uses UK English
spelling conventions throughout.
"""

from __future__ import annotations

from helpers.services.filesystem import FilesystemService
from helpers.services.huggingface import HuggingFaceService, ReadmeGenerator
from helpers.services.llama_cpp import EnvironmentManager, IMatrixGenerator

__all__ = [
    "EnvironmentManager",
    "FilesystemService",
    "HuggingFaceService",
    "IMatrixGenerator",
    "ReadmeGenerator",
]
174 helpers/services/filesystem.py (new file)

@@ -0,0 +1,174 @@
"""Filesystem operations service.

Provides unified filesystem operations including file discovery, size
calculation, and path management. Consolidates common filesystem patterns
used across quantisation and conversion workflows.
"""

from __future__ import annotations

import json
import subprocess
from pathlib import Path
from typing import Any

from helpers.logger import logger

BYTES_PER_UNIT = 1024.0


class FilesystemService:
    """Handles filesystem operations with consistent error handling.

    Provides methods for file discovery, size formatting, and JSON loading
    with proper error handling and logging. Ensures consistent behaviour
    across different tools and workflows.
    """

    @staticmethod
    def get_file_size(file_path: Path) -> str:
        """Get human-readable file size using system utilities.

        Attempts to use `du -h` for human-readable output, falling back to
        Python calculation if the system command fails. Provides consistent
        size formatting across the toolset.

        Returns:
            Human-readable file size string (e.g., "1.5G", "750M").
        """
        try:
            result = subprocess.run(
                ["du", "-h", str(file_path)], capture_output=True, text=True, check=True
            )
            return result.stdout.split()[0]
        except (subprocess.CalledProcessError, FileNotFoundError):
            # Fallback to Python calculation
            try:
                size_bytes: float = float(file_path.stat().st_size)
                for unit in ["B", "K", "M", "G", "T"]:
                    if size_bytes < BYTES_PER_UNIT:
                        return f"{size_bytes:.1f}{unit}"
                    size_bytes /= BYTES_PER_UNIT
            except Exception:
                return "Unknown"
            else:
                return f"{size_bytes:.1f}P"

    @staticmethod
    def load_json_config(config_path: Path) -> dict[str, Any]:
        """Load and parse JSON configuration file.

        Provides consistent JSON loading with proper error handling and
        encoding specification. Used for loading model configurations,
        tokeniser settings, and other JSON-based metadata.

        Returns:
            Parsed JSON content as dictionary.

        Raises:
            FileNotFoundError: If config file doesn't exist.
        """
        if not config_path.exists():
            msg = f"Configuration file not found: {config_path}"
            raise FileNotFoundError(msg)

        with Path(config_path).open(encoding="utf-8") as f:
            return json.load(f)

    @staticmethod
    def find_safetensor_files(model_path: Path) -> list[Path]:
        """Find all SafeTensor files in model directory using priority search.

        Searches for tensor files in order of preference: single model.safetensors,
        sharded model-*-of-*.safetensors files, then any *.safetensors files. This
        approach handles both single-file and multi-shard model distributions whilst
        ensuring predictable file ordering for conversion consistency.

        Returns:
            List of SafeTensor file paths in priority order.

        Raises:
            FileNotFoundError: If no SafeTensor files are found.
        """
        # Check for single file
        single_file = model_path / "model.safetensors"
        if single_file.exists():
            return [single_file]

        # Check for sharded files
        pattern = "model-*-of-*.safetensors"
        sharded_files = sorted(model_path.glob(pattern))
        if sharded_files:
            return sharded_files

        # Check for any safetensor files
        any_files = sorted(model_path.glob("*.safetensors"))
        if any_files:
            return any_files

        msg = f"No SafeTensor files found in {model_path}"
        raise FileNotFoundError(msg)

    @staticmethod
    def find_gguf_files(model_path: Path, pattern: str | None = None) -> list[Path]:
        """Find GGUF files in directory, optionally filtered by pattern.

        Searches for GGUF files with optional pattern matching. Prioritises
        multi-part files (00001-of-*) over single files for proper handling
        of large models split across multiple files.

        Returns:
            List of GGUF file paths, sorted with multi-part files first.
        """
        if pattern:
            gguf_files = list(model_path.glob(f"*{pattern}*.gguf"))
        else:
            gguf_files = list(model_path.glob("*.gguf"))

        # Sort to prioritise 00001-of-* files
        gguf_files.sort(
            key=lambda x: (
                "00001-of-" not in x.name,  # False sorts before True
                x.name,
            )
        )

        return gguf_files

    @staticmethod
    def ensure_directory(path: Path) -> Path:
        """Ensure directory exists, creating if necessary.

        Creates directory and all parent directories if they don't exist.
        Returns the path for method chaining convenience.

        Returns:
            The directory path.
        """
        path.mkdir(parents=True, exist_ok=True)
        return path

    @staticmethod
    def cleanup_directory(path: Path, pattern: str = "*") -> int:
        """Remove files matching pattern from directory.

        Safely removes files matching the specified glob pattern. Returns
        count of files removed for logging purposes.

        Returns:
            Number of files removed.
        """
        if not path.exists():
            return 0

        files_removed = 0
        for file_path in path.glob(pattern):
            if file_path.is_file():
                try:
                    file_path.unlink()
                    files_removed += 1
                except Exception as e:
                    logger.warning(f"Failed to remove {file_path}: {e}")

        return files_removed
210 helpers/services/gguf.py (new file)

@@ -0,0 +1,210 @@
"""GGUF file operations service.

Provides unified interface for creating, writing, and manipulating GGUF files.
Consolidates GGUF-specific operations from conversion and quantisation workflows.
Uses UK English spelling conventions throughout.
"""

from __future__ import annotations

from typing import TYPE_CHECKING, Any

import gguf
from safetensors import safe_open

from helpers.logger import logger
from helpers.services.filesystem import FilesystemService
from helpers.utils.config_parser import ConfigParser

try:
    import torch
except ImportError:
    # PyTorch is optional: without it, BFloat16 tensors cannot be upcast
    torch = None

if TYPE_CHECKING:
    from pathlib import Path

    import numpy as np

    from helpers.models.conversion import ModelConfig


class GGUFWriter:
    """Manages GGUF file creation and metadata writing.

    Provides high-level interface for GGUF file operations including metadata
    configuration, tensor addition, and tokeniser integration. Encapsulates
    low-level GGUF library interactions for consistent error handling.
    """

    def __init__(self, output_path: Path, architecture: str) -> None:
        """Initialise GGUF writer with output path and architecture.

        Creates the underlying GGUF writer instance and prepares for metadata
        and tensor addition. Sets up the file structure for the specified
        model architecture.
        """
        self.output_path = output_path
        self.architecture = architecture
        self.writer = gguf.GGUFWriter(str(output_path), architecture)
        logger.info(f"Created GGUF writer for {architecture} architecture")

    def add_metadata(self, model_config: ModelConfig, model_name: str) -> None:
        """Add comprehensive metadata from model configuration.

        Writes general model information, architectural parameters, and
        quantisation settings to the GGUF file header. Handles both standard
        and vision model configurations with appropriate parameter mapping.
        """
        # General metadata
        self.writer.add_name(model_name)
        self.writer.add_description(f"Converted from {model_config.architectures[0]}")
        self.writer.add_file_type(gguf.LlamaFileType.ALL_F32)

        # Model parameters from config
        params = model_config.to_gguf_params()
        self.writer.add_context_length(params.context_length)
        self.writer.add_embedding_length(params.embedding_length)
        self.writer.add_block_count(params.block_count)
        self.writer.add_feed_forward_length(params.feed_forward_length)
        self.writer.add_head_count(params.attention_head_count)
        self.writer.add_head_count_kv(params.attention_head_count_kv)
        self.writer.add_layer_norm_rms_eps(params.attention_layer_norm_rms_epsilon)
        self.writer.add_rope_freq_base(params.rope_freq_base)
        self.writer.add_rope_dimension_count(params.rope_dimension_count)

        logger.info(f"Added metadata: {params.block_count} layers, {params.context_length} context")

    def add_vision_metadata(self, vision_config: Any) -> None:
        """Add vision model parameters to GGUF metadata.

        Configures vision-specific parameters for multimodal models including
        embedding dimensions, attention heads, and spatial processing settings.
        """
        if not vision_config:
            return

        logger.info("Adding vision model parameters...")
        self.writer.add_vision_embedding_length(vision_config.hidden_size)
        self.writer.add_vision_block_count(vision_config.num_hidden_layers)
        self.writer.add_vision_head_count(vision_config.num_attention_heads)
        self.writer.add_vision_feed_forward_length(vision_config.intermediate_size)
        self.writer.add_vision_patch_size(vision_config.patch_size)
        self.writer.add_vision_spatial_merge_size(vision_config.spatial_merge_size)

        if hasattr(vision_config, "rms_norm_eps") and vision_config.rms_norm_eps:
            self.writer.add_vision_attention_layernorm_eps(vision_config.rms_norm_eps)

    def add_tokeniser(self, tokeniser_config: dict[str, Any]) -> None:
        """Add tokeniser metadata to GGUF file.

        Writes special token IDs and tokeniser model type to enable proper
        text processing during inference. Uses sensible defaults for missing
        configuration values.
        """
        self.writer.add_bos_token_id(tokeniser_config.get("bos_token_id", 1))
        self.writer.add_eos_token_id(tokeniser_config.get("eos_token_id", 2))
        self.writer.add_unk_token_id(tokeniser_config.get("unk_token_id", 0))
        self.writer.add_pad_token_id(tokeniser_config.get("pad_token_id", 0))
        self.writer.add_tokenizer_model(tokeniser_config.get("model_type", "llama"))

        logger.info("Added tokeniser configuration")

    def add_tensor(self, name: str, data: np.ndarray) -> None:
        """Add a tensor to the GGUF file.

        Writes tensor data with the specified name to the file. Handles
        data type conversions and validates tensor shapes.
        """
        self.writer.add_tensor(name, data)

    def finalise(self) -> None:
        """Write all data to file and close writer.

        Completes the GGUF file creation by writing headers, key-value data,
        and tensor data in the correct order. Ensures proper file closure.
        """
        logger.info(f"Writing GGUF file to {self.output_path}")
        self.writer.write_header_to_file()
        self.writer.write_kv_data_to_file()
        self.writer.write_tensors_to_file()
        self.writer.close()
        logger.info("GGUF file written successfully")


class GGUFConverter:
    """High-level GGUF conversion orchestrator.

    Coordinates the complete conversion workflow from source models to GGUF
    format, managing metadata extraction, tensor mapping, and file writing.
    """

    @staticmethod
    def convert_safetensors(
        model_path: Path,
        output_path: Path,
        model_config: ModelConfig,
        architecture: str,
        tensor_mapper: Any,
    ) -> bool:
        """Convert SafeTensors model to GGUF format.

        Orchestrates the conversion process including metadata setup, tensor
        loading with BFloat16 support, name mapping, and tokeniser integration.

        Returns:
            True if conversion successful, False otherwise.
        """
        logger.info(f"Converting {model_path.name} to GGUF...")

        # Create writer
        writer_wrapper = GGUFWriter(output_path, architecture)

        # Add metadata
        writer_wrapper.add_metadata(model_config, model_path.name)

        # Add vision metadata if present
        if model_config.vision_config:
            writer_wrapper.add_vision_metadata(model_config.vision_config)

        # Load and add tensors
        fs = FilesystemService()
        tensor_files = fs.find_safetensor_files(model_path)
        logger.info(f"Found {len(tensor_files)} tensor file(s)")

        tensor_count = 0
        for tensor_file in tensor_files:
            logger.info(f"Loading {tensor_file.name}...")
            with safe_open(tensor_file, framework="pt") as f:
                for tensor_name in f.keys():  # noqa: SIM118 - safe_open handle is not a dict
                    tensor_data = f.get_tensor(tensor_name)

                    # Convert BFloat16 to Float32
                    if hasattr(tensor_data, "numpy"):
                        if torch and tensor_data.dtype == torch.bfloat16:
                            tensor_data = tensor_data.float()
                        tensor_data = tensor_data.numpy()

                    # Map tensor name
                    gguf_name = tensor_mapper.map_tensor_name(tensor_name)

                    if gguf_name:
                        writer_wrapper.add_tensor(gguf_name, tensor_data)
                        tensor_count += 1

                    if tensor_count % 100 == 0:
                        logger.info(f"  Processed {tensor_count} tensors...")

        logger.info(f"Total tensors processed: {tensor_count}")

        # Add tokeniser
        try:
            tok_config = ConfigParser.load_tokeniser_config(model_path)
            writer_wrapper.add_tokeniser(tok_config)
            logger.info("Tokeniser added")
        except Exception as e:
            logger.warning(f"Could not add tokeniser: {e}")

        # Finalise file
        writer_wrapper.finalise()

        file_size = fs.get_file_size(output_path)
        logger.info(f"Conversion complete! Output: {output_path} ({file_size})")

        return True
454
helpers/services/huggingface.py
Normal file
454
helpers/services/huggingface.py
Normal file
|
@ -0,0 +1,454 @@
|
|||
"""HuggingFace operations service.
|
||||
|
||||
Handles all interactions with HuggingFace including model downloads,
|
||||
uploads, README generation, and repository management. Uses UK English
|
||||
spelling conventions throughout.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import re
|
||||
import subprocess
|
||||
import tempfile
|
||||
from pathlib import Path
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
from helpers.logger import logger
|
||||
from helpers.models.quantisation import QuantisationType
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from helpers.models.quantisation import ModelSource, QuantisationResult
|
||||
|
||||
|
||||
class HuggingFaceService:
|
||||
"""Manages HuggingFace repository operations.
|
||||
|
||||
Provides methods for downloading models, uploading files, and managing
|
||||
repositories. Handles authentication, error recovery, and progress tracking
|
||||
for robust interaction with HuggingFace services.
|
||||
"""
|
||||
|
||||
@staticmethod
|
||||
def get_username() -> str:
|
||||
"""Get authenticated HuggingFace username.
|
||||
|
||||
Retrieves the current user's HuggingFace username using the CLI.
|
||||
Requires prior authentication via `huggingface-cli login`.
|
||||
|
||||
Returns:
|
||||
HuggingFace username.
|
||||
|
||||
Raises:
|
||||
RuntimeError: If not authenticated or CLI not available.
|
||||
"""
|
||||
try:
|
||||
result = subprocess.run(
|
||||
["huggingface-cli", "whoami"],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=True,
|
||||
)
|
||||
return result.stdout.strip()
|
||||
except (subprocess.CalledProcessError, FileNotFoundError) as err:
|
||||
msg = "Please log in to HuggingFace first: huggingface-cli login"
|
||||
raise RuntimeError(msg) from err
|
||||
|
||||
@staticmethod
|
||||
def download_model(
|
||||
model_name: str, output_dir: Path, include_pattern: str | None = None
|
||||
) -> None:
|
||||
"""Download model from HuggingFace.
|
||||
|
||||
Downloads a complete model or specific files matching a pattern.
|
||||
Creates the output directory if it doesn't exist. Supports filtered
|
||||
downloads for efficient bandwidth usage when only certain files are needed.
|
||||
"""
|
||||
logger.info(f"Downloading {model_name} to {output_dir}")
|
||||
|
||||
cmd = [
|
||||
"huggingface-cli",
|
||||
"download",
|
||||
model_name,
|
||||
"--local-dir",
|
||||
str(output_dir),
|
||||
]
|
||||
|
||||
if include_pattern:
|
||||
cmd.extend(["--include", include_pattern])
|
||||
|
||||
subprocess.run(cmd, check=True)
|
||||
logger.info("Download complete")
|
||||
|
||||
@staticmethod
|
||||
def upload_file(
|
||||
repo_id: str,
|
||||
local_path: Path,
|
||||
repo_path: str | None = None,
|
||||
create_repo: bool = False,
|
||||
) -> None:
|
||||
"""Upload a file to HuggingFace repository.
|
||||
|
||||
Uploads a single file to the specified repository path. Can create
|
||||
the repository if it doesn't exist. Handles repository creation conflicts
|
||||
gracefully by retrying without the create flag when needed.
|
||||
|
||||
Raises:
|
||||
CalledProcessError: If upload fails.
|
||||
"""
|
||||
repo_path = repo_path or local_path.name
|
||||
logger.info(f"Uploading {local_path.name} to {repo_id}/{repo_path}")
|
||||
|
||||
cmd = [
|
||||
"huggingface-cli",
|
||||
"upload",
|
||||
repo_id,
|
||||
str(local_path),
|
||||
repo_path,
|
||||
]
|
||||
|
||||
if create_repo:
|
||||
cmd.append("--create")
|
||||
|
||||
try:
|
||||
subprocess.run(cmd, check=True, capture_output=True)
|
||||
logger.info(f"Uploaded {repo_path}")
|
||||
except subprocess.CalledProcessError:
|
||||
if create_repo:
|
||||
# Repository might already exist, retry without --create
|
||||
cmd = cmd[:-1] # Remove --create flag
|
||||
subprocess.run(cmd, check=True)
|
||||
logger.info(f"Updated {repo_path}")
|
||||
else:
|
||||
raise
|
||||
|
||||
|
||||
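# --- Illustrative usage sketch (not part of this commit) ---
# A minimal driver for the service above; the repository names are
# placeholders, not real repos.
#
#   from pathlib import Path
#
#   user = HuggingFaceService.get_username()  # requires `huggingface-cli login`
#   HuggingFaceService.download_model(
#       "some-org/some-model", Path("./work/some-model"),
#       include_pattern="*.safetensors",
#   )
#   HuggingFaceService.upload_file(
#       f"{user}/some-model-GGUF", Path("./README.md"), create_repo=True
#   )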
class ReadmeGenerator:
    """Generates README files for quantised models.

    Creates comprehensive README documentation including model cards,
    quantisation details, and status tracking. Supports both initial
    planning documentation and final result summaries.
    """

    def generate(
        self,
        model_source: ModelSource,
        results: dict[QuantisationType, QuantisationResult],
        models_dir: Path,
        output_repo: str | None = None,
    ) -> Path:
        """Generate README file for quantised model repository.

        Creates a comprehensive README with frontmatter, quantisation table,
        and original model information. Handles status tracking for planned,
        processing, and completed quantisations.

        Returns:
            Path to generated README file.
        """
        logger.info("Creating model card...")

        model_dir = models_dir / model_source.model_name
        readme_path = model_dir / "README.md"

        # Get original README content
        original_content = self._get_original_readme(model_source, model_dir)

        # Generate new README
        readme_content = self._generate_readme_content(
            model_source, results, original_content, output_repo
        )

        readme_path.write_text(readme_content)
        return readme_path

    def _get_original_readme(self, model_source: ModelSource, model_dir: Path) -> dict[str, str]:
        """Extract original README and metadata.

        Downloads or reads the original model's README for inclusion in the
        quantised model documentation. Parses YAML frontmatter if present.

        Returns:
            Dictionary with readme content, licence, tags, and frontmatter.
        """
        content = {"readme": "", "licence": "apache-2.0", "tags": "", "frontmatter": ""}

        # Try local file first
        readme_path = model_dir / "README.md"
        if readme_path.exists():
            content["readme"] = readme_path.read_text(encoding="utf-8")
            logger.info(f"Found original README ({len(content['readme'])} characters)")
        else:
            # Download separately
            content = self._download_readme(model_source)

        # Parse frontmatter if present
        if content["readme"].startswith("---\n"):
            content = self._parse_frontmatter(content["readme"])

        return content

    def _download_readme(self, model_source: ModelSource) -> dict[str, str]:
        """Download README from HuggingFace repository.

        Attempts to download just the README.md file from the source repository
        for efficient documentation extraction.

        Returns:
            Dictionary with readme content and default metadata.
        """
        content = {"readme": "", "licence": "apache-2.0", "tags": "", "frontmatter": ""}

        with tempfile.TemporaryDirectory() as temp_dir:
            try:
                logger.info(f"Downloading README from {model_source.source_model}...")
                subprocess.run(
                    [
                        "huggingface-cli",
                        "download",
                        model_source.source_model,
                        "--include",
                        "README.md",
                        "--local-dir",
                        temp_dir,
                    ],
                    check=True,
                    capture_output=True,
                )

                readme_path = Path(temp_dir) / "README.md"
                if readme_path.exists():
                    content["readme"] = readme_path.read_text(encoding="utf-8")
                    logger.info(f"Downloaded README ({len(content['readme'])} characters)")
            except subprocess.CalledProcessError as e:
                logger.warning(f"Failed to download README: {e}")

        return content

    def _parse_frontmatter(self, readme_text: str) -> dict[str, str]:
        """Parse YAML frontmatter from README.

        Extracts metadata from YAML frontmatter including licence, tags,
        and other model card fields.

        Returns:
            Dictionary with separated content and metadata.
        """
        lines = readme_text.split("\n")
        if lines[0] != "---":
            return {
                "readme": readme_text,
                "licence": "apache-2.0",
                "tags": "",
                "frontmatter": "",
            }

        frontmatter_end = -1
        for i, line in enumerate(lines[1:], 1):
            if line == "---":
                frontmatter_end = i
                break

        if frontmatter_end == -1:
            return {
                "readme": readme_text,
                "licence": "apache-2.0",
                "tags": "",
                "frontmatter": "",
            }

        frontmatter = "\n".join(lines[1:frontmatter_end])
        content = "\n".join(lines[frontmatter_end + 1 :])

        # Extract licence
        licence_match = re.search(r"^license:\s*(.+)$", frontmatter, re.MULTILINE)
        licence_val = licence_match.group(1).strip().strip('"') if licence_match else "apache-2.0"

        # Extract tags
        tags = []
        in_tags = False
        for line in frontmatter.split("\n"):
            if line.startswith("tags:"):
                in_tags = True
                continue
            if in_tags:
                if line.startswith("- "):
                    tags.append(line[2:].strip())
                elif line and not line.startswith(" "):
                    break

        return {
            "readme": content,
            "licence": licence_val,
            "tags": ",".join(tags),
            "frontmatter": frontmatter,
        }
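    # --- Worked example (not part of this commit) ---
    # Given a README starting with YAML frontmatter, _parse_frontmatter splits
    # it from the body and pulls out licence and tags:
    #
    #   text = "---\nlicense: mit\ntags:\n- chat\n- code\n---\nBody text\n"
    #   parsed = ReadmeGenerator()._parse_frontmatter(text)
    #   # parsed["licence"] == "mit"
    #   # parsed["tags"] == "chat,code"
    #   # parsed["readme"] == "Body text\n"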
    def _generate_readme_content(
        self,
        model_source: ModelSource,
        results: dict[QuantisationType, QuantisationResult],
        original_content: dict[str, str],
        output_repo: str | None = None,
    ) -> str:
        """Generate complete README content with quantisation details.

        Creates the full README including YAML frontmatter, quantisation status
        table, and original model information.

        Returns:
            Complete README markdown content.
        """
        # Build tags
        our_tags = [
            "quantised",
            "gguf",
            "q4_k_m",
            "q4_k_l",
            "q4_k_xl",
            "q4_k_xxl",
            "bartowski-method",
        ]
        original_tags = original_content["tags"].split(",") if original_content["tags"] else []
        all_tags = sorted(set(our_tags + original_tags))

        # Build frontmatter
        frontmatter = f"""---
license: {original_content["licence"]}
library_name: gguf
base_model: {model_source.source_model}
tags:
"""
        for tag in all_tags:
            if tag.strip():
                frontmatter += f"- {tag.strip()}\n"

        frontmatter += "---\n\n"

        # Build main content
        hf_url = f"https://huggingface.co/{model_source.source_model}"
        content = f"""# {model_source.original_author}-{model_source.model_name}-GGUF

GGUF quantisations of [{model_source.source_model}]({hf_url}) using Bartowski's method.

| Quantisation | Embeddings/Output | Attention | Feed-Forward | Status |
|--------------|-------------------|-----------|--------------|--------|
"""

        # Add results table
        for quant_type in [
            QuantisationType.Q4_K_M,
            QuantisationType.Q4_K_L,
            QuantisationType.Q4_K_XL,
            QuantisationType.Q4_K_XXL,
        ]:
            result = results.get(quant_type)
            if not result:
                result = type("Result", (), {"status": "planned", "success": False})()

            layers = self._get_layers_config(quant_type)
            status = self._format_status(result, model_source, quant_type, output_repo)

            content += (
                f"| {quant_type.value} | {layers['embeddings']} | "
                f"{layers['attention']} | {layers['ffn']} | {status} |\n"
            )

        content += "\n---\n\n"

        # Add original content
        if original_content["readme"]:
            content += "# Original Model Information\n\n" + original_content["readme"]
        else:
            content += f"## Original Model\n\nQuantisation of [{model_source.source_model}](https://huggingface.co/{model_source.source_model}).\n"

        return frontmatter + content

    def _get_layers_config(self, quant_type: QuantisationType) -> dict[str, str]:
        """Get layer configuration for quantisation type.

        Returns layer precision specifications for the quantisation table.

        Returns:
            Dictionary with embeddings, attention, and ffn precision labels.
        """
        configs = {
            QuantisationType.Q4_K_M: {
                "embeddings": "Q4_K_M",
                "attention": "Q4_K_M",
                "ffn": "Q4_K_M",
            },
            QuantisationType.Q4_K_L: {"embeddings": "Q6_K", "attention": "Q6_K", "ffn": "Q4_K_M"},
            QuantisationType.Q4_K_XL: {"embeddings": "Q8_0", "attention": "Q6_K", "ffn": "Q4_K_M"},
            QuantisationType.Q4_K_XXL: {"embeddings": "Q8_0", "attention": "Q8_0", "ffn": "Q4_K_M"},
        }
        return configs.get(
            quant_type, {"embeddings": "Unknown", "attention": "Unknown", "ffn": "Unknown"}
        )

    def _format_status(
        self,
        result: QuantisationResult,
        model_source: ModelSource,
        quant_type: QuantisationType,
        output_repo: str | None,
    ) -> str:
        """Format status indicator for README table.

        Creates appropriate status indicator based on quantisation state
        including progress indicators, file sizes, and download links.

        Returns:
            Formatted status string for table cell.
        """
        status_map = {
            "planned": "⏳ Planned",
            "processing": "🔄 Processing...",
            "uploading": "⬆️ Uploading...",
            "failed": "❌ Failed",
        }

        if hasattr(result, "status") and result.status in status_map:
            base_status = status_map[result.status]

            if result.status == "uploading" and hasattr(result, "file_size") and result.file_size:
                return f"{base_status} ({result.file_size})"
            if result.status == "completed" or (hasattr(result, "success") and result.success):
                return self._format_success_status(result, model_source, quant_type, output_repo)
            return base_status

        # Legacy support
        if hasattr(result, "success") and result.success:
            return self._format_success_status(result, model_source, quant_type, output_repo)
        return "❌ Failed"

    def _format_success_status(
        self,
        result: QuantisationResult,
        model_source: ModelSource,
        quant_type: QuantisationType,
        output_repo: str | None,
    ) -> str:
        """Format successful quantisation status with download link.

        Creates a download link if repository information is available,
        otherwise shows file size.

        Returns:
            Formatted success status string.
        """
        if not output_repo:
            return (
                f"✅ {result.file_size}"
                if hasattr(result, "file_size") and result.file_size
                else "✅ Available"
            )

        filename = (
            f"{model_source.original_author}-{model_source.model_name}-{quant_type.value}.gguf"
        )
        url = f"https://huggingface.co/{output_repo}?show_file_info={filename}"

        if hasattr(result, "file_size") and result.file_size:
            return f"[✅ {result.file_size}]({url})"
        return f"[✅ Available]({url})"
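# --- For reference (not part of this commit) ---
# A completed row rendered by the table-building code above looks like this
# (author, model, and size illustrative):
#
#   | Q4_K_L | Q6_K | Q6_K | Q4_K_M | [✅ 4.5GB](https://huggingface.co/user/Author-Model-GGUF?show_file_info=Author-Model-Q4_K_L.gguf) |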
417 helpers/services/llama_cpp.py Normal file
@@ -0,0 +1,417 @@
"""llama.cpp environment and operations service.
|
||||
|
||||
Manages llama.cpp binary discovery, environment setup, and imatrix generation.
|
||||
Provides consistent interface for interacting with llama.cpp tools across
|
||||
different installation methods.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import subprocess
|
||||
from pathlib import Path
|
||||
|
||||
from helpers.logger import logger
|
||||
from helpers.models.quantisation import LlamaCppEnvironment
|
||||
from helpers.services.filesystem import FilesystemService
|
||||
|
||||
|
||||
class EnvironmentManager:
|
||||
"""Manages llama.cpp environment setup and binary discovery.
|
||||
|
||||
Handles detection of local binaries, repository setup, and conversion
|
||||
script location. Provides fallback strategies for different installation
|
||||
scenarios including local builds and repository-based setups.
|
||||
"""
|
||||
|
||||
def __init__(self, work_dir: Path) -> None:
|
||||
"""Initialise EnvironmentManager."""
|
||||
self.work_dir = work_dir
|
||||
self.llama_cpp_dir = work_dir / "llama.cpp"
|
||||
self.fs = FilesystemService()
|
||||
|
||||
def setup(self) -> LlamaCppEnvironment:
|
||||
"""Set up llama.cpp environment with automatic detection.
|
||||
|
||||
Checks for local llama.cpp binaries first, then falls back to
|
||||
repository-based setup if needed. Handles conversion script location,
|
||||
dependency installation, and path resolution.
|
||||
|
||||
Returns:
|
||||
Configured LlamaCppEnvironment instance.
|
||||
"""
|
||||
# Check for local binaries first
|
||||
local_env = self._check_local_binaries()
|
||||
if local_env:
|
||||
return local_env
|
||||
|
||||
# Setup repository if needed
|
||||
return self.setup_repository()
|
||||
|
||||
def _check_local_binaries(self) -> LlamaCppEnvironment | None:
|
||||
"""Check for existing llama.cpp binaries in current directory.
|
||||
|
||||
Searches for quantise and CLI binaries in the current directory
|
||||
and standard installation paths. Also locates conversion scripts.
|
||||
|
||||
Returns:
|
||||
LlamaCppEnvironment if binaries found, None otherwise.
|
||||
"""
|
||||
quantise_bin = Path("./llama-quantize")
|
||||
cli_bin = Path("./llama-cli")
|
||||
|
||||
if not (quantise_bin.exists() and cli_bin.exists()):
|
||||
return None
|
||||
|
||||
logger.info("Found llama.cpp binaries in current directory")
|
||||
|
||||
# Check for conversion script
|
||||
convert_script = self._find_convert_script()
|
||||
if convert_script:
|
||||
logger.info(f"Found conversion script: {convert_script}")
|
||||
return LlamaCppEnvironment(
|
||||
quantise_binary=quantise_bin.resolve(),
|
||||
cli_binary=cli_bin.resolve(),
|
||||
convert_script=convert_script,
|
||||
use_repo=False,
|
||||
)
|
||||
|
||||
logger.warning("No conversion script found in current directory")
|
||||
logger.info("Will use llama.cpp repository method for conversion")
|
||||
return LlamaCppEnvironment(
|
||||
quantise_binary=quantise_bin.resolve(),
|
||||
cli_binary=cli_bin.resolve(),
|
||||
convert_script=f"python3 {self.llama_cpp_dir}/convert_hf_to_gguf.py",
|
||||
use_repo=True,
|
||||
)
|
||||
|
||||
def _find_convert_script(self) -> str | None:
|
||||
"""Find conversion script in current directory.
|
||||
|
||||
Searches for various naming conventions of the HF to GGUF
|
||||
conversion script.
|
||||
|
||||
Returns:
|
||||
Command to run conversion script, or None if not found.
|
||||
"""
|
||||
scripts = [
|
||||
"./llama-convert-hf-to-gguf",
|
||||
"python3 ./convert_hf_to_gguf.py",
|
||||
"python3 ./convert-hf-to-gguf.py",
|
||||
]
|
||||
|
||||
for script in scripts:
|
||||
if script.startswith("python3"):
|
||||
script_path = script.split(" ", 1)[1]
|
||||
if Path(script_path).exists():
|
||||
return script
|
||||
elif Path(script).exists():
|
||||
return script
|
||||
return None
|
||||
|
||||
def setup_repository(self) -> LlamaCppEnvironment:
|
||||
"""Setup llama.cpp repository for conversion scripts.
|
||||
|
||||
Clones the llama.cpp repository if not present and installs
|
||||
Python dependencies for model conversion.
|
||||
|
||||
Returns:
|
||||
LlamaCppEnvironment configured with repository paths.
|
||||
"""
|
||||
if not self.llama_cpp_dir.exists():
|
||||
logger.info("Cloning llama.cpp for conversion script...")
|
||||
subprocess.run(
|
||||
[
|
||||
"git",
|
||||
"clone",
|
||||
"https://github.com/ggerganov/llama.cpp.git",
|
||||
str(self.llama_cpp_dir),
|
||||
],
|
||||
check=True,
|
||||
)
|
||||
|
||||
# Install Python requirements
|
||||
logger.info("Installing Python requirements...")
|
||||
subprocess.run(
|
||||
[
|
||||
"pip3",
|
||||
"install",
|
||||
"-r",
|
||||
"requirements.txt",
|
||||
"--break-system-packages",
|
||||
"--root-user-action=ignore",
|
||||
],
|
||||
cwd=self.llama_cpp_dir,
|
||||
check=True,
|
||||
)
|
||||
|
||||
# Install additional conversion dependencies
|
||||
logger.info("Installing additional conversion dependencies...")
|
||||
subprocess.run(
|
||||
[
|
||||
"pip3",
|
||||
"install",
|
||||
"transformers",
|
||||
"sentencepiece",
|
||||
"protobuf",
|
||||
"--break-system-packages",
|
||||
"--root-user-action=ignore",
|
||||
],
|
||||
check=True,
|
||||
)
|
||||
else:
|
||||
logger.info("llama.cpp repository already exists")
|
||||
|
||||
# Use local binaries but repo conversion script
|
||||
return LlamaCppEnvironment(
|
||||
quantise_binary=Path("./llama-quantize").resolve(),
|
||||
cli_binary=Path("./llama-cli").resolve(),
|
||||
convert_script=f"python3 {self.llama_cpp_dir}/convert_hf_to_gguf.py",
|
||||
use_repo=False,
|
||||
)
|
||||
|
||||
|
||||
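# --- Illustrative usage sketch (not part of this commit) ---
# Resolution order implemented above: local ./llama-quantize and ./llama-cli
# with a local conversion script; the same binaries with the cloned
# repository's convert_hf_to_gguf.py; otherwise a full repository setup.
#
#   from pathlib import Path
#
#   env = EnvironmentManager(Path.cwd() / "quantisation_work").setup()
#   print(env.quantise_binary, env.convert_script, env.use_repo)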
class IMatrixGenerator:
    """Handles importance matrix generation for quantisation guidance.

    Generates or locates importance matrices that guide quantisation
    decisions, helping preserve model quality by identifying critical
    tensors requiring higher precision.
    """

    def __init__(self) -> None:
        """Initialise IMatrixGenerator."""
        self.fs = FilesystemService()

    def generate_imatrix(
        self, f16_model_path: Path, llama_env: LlamaCppEnvironment, model_dir: Path
    ) -> Path | None:
        """Generate importance matrix for quantisation guidance.

        Searches for existing imatrix files first, provides interactive
        prompts for user-supplied matrices, then generates new matrices
        using calibration data if necessary.

        Returns:
            Path to imatrix file, or None if generation fails.
        """
        imatrix_path = model_dir / "imatrix.dat"

        # Check for existing imatrix
        if imatrix_path.exists():
            logger.info(f"Found existing imatrix: {imatrix_path.name}")
            return imatrix_path

        # Try user-provided imatrix
        user_imatrix = self._prompt_for_user_imatrix(model_dir, imatrix_path)
        if user_imatrix:
            return user_imatrix

        # Generate new imatrix
        calibration_file = self._get_calibration_file()
        if not calibration_file:
            return None

        return self._generate_new_imatrix(f16_model_path, llama_env, imatrix_path, calibration_file)

    def _prompt_for_user_imatrix(self, model_dir: Path, imatrix_path: Path) -> Path | None:
        """Prompt user for existing imatrix file.

        Returns:
            Path to user-provided imatrix, or None if not available.
        """
        logger.info(f"Model directory: {model_dir}")
        logger.info(f"Looking for imatrix file at: {imatrix_path}")
        logger.info(
            "Tip: You can download pre-computed imatrix files from Bartowski's repositories!"
        )
        logger.info(
            "  Example: https://huggingface.co/bartowski/MODEL-NAME-GGUF/resolve/main/MODEL-NAME.imatrix"
        )

        response = (
            input("\n❓ Do you have an imatrix file to place in the model directory? (y/N): ")
            .strip()
            .lower()
        )

        if response != "y":
            return None

        logger.info(f"Please place your imatrix.dat file in: {model_dir}")
        input("⏳ Press Enter when you've placed the imatrix.dat file (or Ctrl+C to cancel)...")

        if imatrix_path.exists():
            file_size = self.fs.get_file_size(imatrix_path)
            logger.info(f"Found imatrix file! ({file_size})")
            return imatrix_path

        logger.warning("No imatrix.dat file found - continuing with automatic generation")
        return None

    def _get_calibration_file(self) -> Path | None:
        """Get calibration data file for imatrix generation.

        Returns:
            Path to calibration file, or None if not found.
        """
        calibration_file = Path(__file__).parent.parent.parent / "resources" / "imatrix_data.txt"
        if not calibration_file.exists():
            logger.warning("resources/imatrix_data.txt not found - skipping imatrix generation")
            logger.info(
                "Download from: https://gist.githubusercontent.com/bartowski1182/"
                "eb213dccb3571f863da82e99418f81e8/raw/calibration_datav3.txt"
            )
            return None
        return calibration_file

    def _generate_new_imatrix(
        self,
        f16_model_path: Path,
        llama_env: LlamaCppEnvironment,
        imatrix_path: Path,
        calibration_file: Path,
    ) -> Path | None:
        """Generate new importance matrix using calibration data.

        Returns:
            Path to generated imatrix, or None if generation fails.
        """
        logger.info("Generating importance matrix (this may take 1-4 hours for large models)...")
        logger.info(f"Model: {f16_model_path.name}")
        logger.info(f"Calibration: {calibration_file}")
        logger.info(f"Output: {imatrix_path}")

        # Find imatrix binary
        imatrix_binary = self._find_imatrix_binary(llama_env)
        if not imatrix_binary:
            logger.warning("llama-imatrix binary not found - skipping imatrix generation")
            logger.info("Make sure llama-imatrix is in the same directory as llama-quantize")
            return None

        # Build and execute command
        cmd = self._build_imatrix_command(
            imatrix_binary, f16_model_path, calibration_file, imatrix_path
        )
        return self._execute_imatrix_generation(cmd, imatrix_path)

    def _build_imatrix_command(
        self, binary: Path, model_path: Path, calibration_file: Path, output_path: Path
    ) -> list[str]:
        """Build imatrix generation command.

        Returns:
            Command arguments as list.
        """
        return [
            str(binary),
            "-m",
            str(model_path),
            "-f",
            str(calibration_file),
            "-o",
            str(output_path),
            "--process-output",
            "--output-frequency",
            "10",
            "--save-frequency",
            "50",
            "-t",
            "8",
            "-c",
            "2048",
            "-b",
            "512",
        ]

    def _execute_imatrix_generation(self, cmd: list[str], imatrix_path: Path) -> Path | None:
        """Execute imatrix generation command with real-time output.

        Returns:
            Path to generated imatrix file, or None if generation fails.
        """
        logger.info(f"Running: {' '.join(cmd)}")
        logger.info("Starting imatrix generation... (progress will be shown)")

        try:
            process = subprocess.Popen(
                cmd,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                universal_newlines=True,
                bufsize=1,
            )

            self._stream_imatrix_output(process)

            return_code = process.poll()
            if return_code == 0:
                return self._validate_imatrix_output(imatrix_path)

        except KeyboardInterrupt:
            logger.info("imatrix generation cancelled by user")
            process.terminate()
            return None
        except Exception as e:
            logger.error(f"imatrix generation failed with exception: {e}")
            return None
        else:
            logger.error(f"imatrix generation failed with return code {return_code}")
            return None

    def _stream_imatrix_output(self, process: subprocess.Popen) -> None:
        """Stream imatrix generation output in real-time."""
        while True:
            if process.stdout is not None:
                output = process.stdout.readline()
            else:
                break
            if not output and process.poll() is not None:
                break
            if output:
                line = output.strip()
                if self._should_log_imatrix_line(line):
                    logger.info(line)

    def _should_log_imatrix_line(self, line: str) -> bool:
        """Determine if imatrix output line should be logged.

        Returns:
            True if line should be logged, False otherwise.
        """
        keywords = ["Computing imatrix", "perplexity:", "save_imatrix", "entries =", "ETA"]
        return any(keyword in line for keyword in keywords) or line.startswith("[")

    def _validate_imatrix_output(self, imatrix_path: Path) -> Path | None:
        """Validate generated imatrix file.

        Returns:
            Path to imatrix if valid, None otherwise.
        """
        if imatrix_path.exists():
            file_size = self.fs.get_file_size(imatrix_path)
            logger.info(f"imatrix generation successful! ({file_size})")
            return imatrix_path
        logger.error("imatrix generation completed but file not found")
        return None

    def _find_imatrix_binary(self, llama_env: LlamaCppEnvironment) -> Path | None:
        """Find llama-imatrix binary in common locations.

        Searches for the imatrix binary in the current directory and
        standard installation paths.

        Returns:
            Path to imatrix binary, or None if not found.
        """
        candidates = [
            Path("./llama-imatrix"),
            llama_env.quantise_binary.parent / "llama-imatrix",
            Path("/usr/local/bin/llama-imatrix"),
            Path("/usr/bin/llama-imatrix"),
        ]

        for candidate in candidates:
            if candidate.exists() and candidate.is_file():
                return candidate

        return None
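# --- For reference (not part of this commit) ---
# The argument list built by _build_imatrix_command above expands to a
# command line of roughly this shape (paths illustrative):
#
#   ./llama-imatrix -m model-f16.gguf -f imatrix_data.txt -o imatrix.dat \
#       --process-output --output-frequency 10 --save-frequency 50 \
#       -t 8 -c 2048 -b 512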
397 helpers/services/orchestrator.py Normal file
@@ -0,0 +1,397 @@
"""Quantisation orchestration service.
|
||||
|
||||
High-level orchestration of the complete quantisation workflow from model
|
||||
acquisition through processing to upload. Manages parallel processing,
|
||||
status tracking, and cleanup operations for efficient resource utilisation.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from concurrent.futures import Future, ThreadPoolExecutor
|
||||
from dataclasses import dataclass, field
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
from helpers.config.quantisation_configs import QUANTISATION_CONFIGS, SUPPORTED_QUANTISATION_TYPES
|
||||
from helpers.logger import logger
|
||||
from helpers.models.quantisation import (
|
||||
ModelSource,
|
||||
QuantisationContext,
|
||||
QuantisationResult,
|
||||
QuantisationType,
|
||||
)
|
||||
from helpers.services.huggingface import ReadmeGenerator
|
||||
from helpers.services.llama_cpp import EnvironmentManager, IMatrixGenerator
|
||||
from helpers.services.quantisation import HuggingFaceUploader, ModelManager, QuantisationEngine
|
||||
from helpers.utils.tensor_mapping import URLParser
|
||||
|
||||
|
||||
@dataclass(slots=True)
|
||||
class QuantisationOrchestrator:
|
||||
"""Orchestrates the complete quantisation workflow.
|
||||
|
||||
Uses dataclass with slots for efficient memory usage and dependency injection
|
||||
for modular service interaction following SOLID principles.
|
||||
"""
|
||||
|
||||
work_dir: Path = field(default_factory=lambda: Path.cwd() / "quantisation_work")
|
||||
use_imatrix: bool = True
|
||||
imatrix_base: str = "Q4_K_M"
|
||||
no_upload: bool = False
|
||||
|
||||
# Service dependencies with factory defaults
|
||||
url_parser: URLParser = field(default_factory=URLParser)
|
||||
quantisation_engine: QuantisationEngine = field(default_factory=QuantisationEngine)
|
||||
imatrix_generator: IMatrixGenerator = field(default_factory=IMatrixGenerator)
|
||||
readme_generator: ReadmeGenerator = field(default_factory=ReadmeGenerator)
|
||||
uploader: HuggingFaceUploader = field(default_factory=HuggingFaceUploader)
|
||||
|
||||
# Computed properties
|
||||
models_dir: Path = field(init=False)
|
||||
environment_manager: EnvironmentManager = field(init=False)
|
||||
model_manager: ModelManager = field(init=False)
|
||||
|
||||
def __post_init__(self) -> None:
|
||||
"""Initialise computed properties after dataclass construction."""
|
||||
self.models_dir = self.work_dir / "models"
|
||||
self.environment_manager = EnvironmentManager(self.work_dir)
|
||||
self.model_manager = ModelManager(self.models_dir, self.environment_manager)
|
||||
|
||||
def quantise(self, url: str) -> dict[QuantisationType, QuantisationResult]:
|
||||
"""Main quantisation workflow orchestrating model processing from URL to upload.
|
||||
|
||||
Returns:
|
||||
dict[QuantisationType, QuantisationResult]: Quantisation results for each type.
|
||||
"""
|
||||
logger.info("Starting Bartowski quantisation process...")
|
||||
|
||||
# Setup and preparation
|
||||
model_source, llama_env, f16_model_path, imatrix_path, output_repo = (
|
||||
self._setup_environment(url)
|
||||
)
|
||||
|
||||
# Create initial repository
|
||||
self._create_initial_repository(model_source, output_repo)
|
||||
|
||||
# Execute all quantisations
|
||||
results = self._execute_quantisations(
|
||||
model_source, llama_env, f16_model_path, imatrix_path, output_repo
|
||||
)
|
||||
|
||||
# Cleanup
|
||||
self._cleanup_files(f16_model_path, model_source)
|
||||
|
||||
self._print_completion_summary(model_source, results, output_repo)
|
||||
return results
|
||||
|
||||
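    # --- Illustrative usage sketch (not part of this commit) ---
    # End-to-end driver for the orchestrator; the URL is a placeholder.
    #
    #   orchestrator = QuantisationOrchestrator(no_upload=True)
    #   results = orchestrator.quantise("https://huggingface.co/some-org/some-model")
    #   for quant_type, result in results.items():
    #       print(quant_type.value, result.status)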
    def _setup_environment(self, url: str) -> tuple[ModelSource, Any, Path, Path | None, str]:
        """Set up environment and prepare model for quantisation.

        Returns:
            Tuple of (model_source, llama_env, f16_model_path, imatrix_path, output_repo).
        """
        model_source = self.url_parser.parse(url)
        self._print_model_info(model_source)

        self.models_dir.mkdir(parents=True, exist_ok=True)
        llama_env = self.environment_manager.setup()

        f16_model_path = self.model_manager.prepare_model(model_source, llama_env)

        imatrix_path = None
        if self.use_imatrix:
            logger.info("Generating importance matrix (imatrix)...")
            imatrix_path = self.imatrix_generator.generate_imatrix(
                f16_model_path, llama_env, self.models_dir / model_source.model_name
            )

        output_repo = (
            f"{self.uploader.get_username()}/"
            f"{model_source.original_author}-{model_source.model_name}-GGUF"
        )

        return model_source, llama_env, f16_model_path, imatrix_path, output_repo

    def _create_initial_repository(self, model_source: ModelSource, output_repo: str) -> None:
        """Create initial repository with planned quantisations."""
        logger.info("Creating initial README with planned quantisations...")
        planned_results = {
            qt: QuantisationResult(quantisation_type=qt, success=False, status="planned")
            for qt in SUPPORTED_QUANTISATION_TYPES
        }
        readme_path = self.readme_generator.generate(
            model_source, planned_results, self.models_dir, output_repo
        )

        if not self.no_upload:
            logger.info("Creating repository with planned quantisations...")
            self.uploader.upload_readme(output_repo, readme_path)
        else:
            logger.info("Skipping repository creation (--no-upload specified)")

    def _execute_quantisations(
        self,
        model_source: ModelSource,
        llama_env: Any,
        f16_model_path: Path,
        imatrix_path: Path | None,
        output_repo: str,
    ) -> dict[QuantisationType, QuantisationResult]:
        """Execute all quantisation types with parallel uploads.

        Returns:
            dict[QuantisationType, QuantisationResult]: Quantisation results for each type.
        """
        results: dict[QuantisationType, QuantisationResult] = {}
        upload_futures: list[Future[None]] = []

        with ThreadPoolExecutor(max_workers=1, thread_name_prefix="uploader") as upload_executor:
            for quant_type in SUPPORTED_QUANTISATION_TYPES:
                result = self._process_single_quantisation(
                    quant_type,
                    model_source,
                    llama_env,
                    f16_model_path,
                    imatrix_path,
                    output_repo,
                    results,
                    upload_executor,
                    upload_futures,
                )
                results[quant_type] = result

            self._wait_for_uploads(upload_futures)

        return results
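    # --- Design note (not part of this commit) ---
    # The ThreadPoolExecutor above is deliberately capped at max_workers=1:
    # quantisations run sequentially in the main thread while each finished
    # GGUF uploads in the background, so at most one multi-gigabyte upload is
    # in flight and disk space is reclaimed as soon as each upload completes.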
    def _process_single_quantisation(
        self,
        quant_type: QuantisationType,
        model_source: ModelSource,
        llama_env: Any,
        f16_model_path: Path,
        imatrix_path: Path | None,
        output_repo: str,
        results: dict[QuantisationType, QuantisationResult],
        upload_executor: ThreadPoolExecutor,
        upload_futures: list,
    ) -> QuantisationResult:
        """Process a single quantisation type.

        Returns:
            QuantisationResult: Result of the quantisation attempt.
        """
        try:
            logger.info(f"Starting {quant_type.value} quantisation...")
            config = QUANTISATION_CONFIGS[quant_type]

            # Update status to processing
            result = QuantisationResult(quantisation_type=quant_type, success=False)
            result.status = "processing"
            results[quant_type] = result

            self._update_readme_status(model_source, results, output_repo)

            # Perform quantisation
            context = QuantisationContext(
                f16_model_path=f16_model_path,
                model_source=model_source,
                config=config,
                llama_env=llama_env,
                models_dir=self.models_dir,
                imatrix_path=imatrix_path,
                base_quant=self.imatrix_base,
            )
            result = self.quantisation_engine.quantise(context)

            self._handle_quantisation_result(
                result,
                quant_type,
                model_source,
                results,
                output_repo,
                upload_executor,
                upload_futures,
            )
        except Exception as e:
            return self._handle_quantisation_error(
                e, quant_type, model_source, results, output_repo
            )
        else:
            return result

    def _handle_quantisation_result(
        self,
        result: QuantisationResult,
        quant_type: QuantisationType,
        model_source: ModelSource,
        results: dict[QuantisationType, QuantisationResult],
        output_repo: str,
        upload_executor: ThreadPoolExecutor,
        upload_futures: list,
    ) -> None:
        """Handle successful or failed quantisation result."""
        if result.success and result.file_path:
            quant_str = getattr(result.quantisation_type, "value", result.quantisation_type)
            logger.info(f"Starting parallel upload of {quant_str}...")
            upload_future = upload_executor.submit(
                self._upload_and_cleanup,
                output_repo,
                result.file_path,
                quant_type,
                model_source,
                results,
            )
            upload_futures.append(upload_future)
            result.file_path = None  # Mark as being uploaded
            result.status = "uploading"
        else:
            result.status = "failed"

        self._update_readme_status(model_source, results, output_repo)

    def _handle_quantisation_error(
        self,
        error: Exception,
        quant_type: QuantisationType,
        model_source: ModelSource,
        results: dict[QuantisationType, QuantisationResult],
        output_repo: str,
    ) -> QuantisationResult:
        """Handle quantisation processing error.

        Returns:
            QuantisationResult: Failed quantisation result with error information.
        """
        logger.error(f"Error processing {quant_type.value}: {error}")
        result = QuantisationResult(quantisation_type=quant_type, success=False)
        result.status = "failed"
        result.error_message = str(error)

        try:
            self._update_readme_status(model_source, results, output_repo)
        except Exception as readme_error:
            logger.error(f"Failed to update README after error: {readme_error}")

        return result

    def _update_readme_status(
        self,
        model_source: ModelSource,
        results: dict[QuantisationType, QuantisationResult],
        output_repo: str,
    ) -> None:
        """Update README with current quantisation status."""
        if not self.no_upload:
            updated_readme_path = self.readme_generator.generate(
                model_source, results, self.models_dir, output_repo
            )
            self.uploader.upload_readme(output_repo, updated_readme_path)

    def _wait_for_uploads(self, upload_futures: list) -> None:
        """Wait for all parallel uploads to complete."""
        logger.info("Waiting for any remaining uploads to complete...")
        for future in upload_futures:
            try:
                future.result(timeout=300)  # 5 minute timeout per upload
            except Exception as e:
                logger.warning(f"Upload error: {e}")

    def _cleanup_files(self, f16_model_path: Path, model_source: ModelSource) -> None:
        """Clean up temporary files after processing."""
        if f16_model_path.exists():
            logger.info(f"Removing F16 model {f16_model_path.name} to save disk space...")
            f16_model_path.unlink()

        if not model_source.is_gguf_repo:
            self._cleanup_original_model(model_source)

    def _cleanup_original_model(self, model_source: ModelSource) -> None:
        """Clean up original safetensors/PyTorch files after successful conversion."""
        model_dir = self.models_dir / model_source.model_name

        pytorch_files = list(model_dir.glob("pytorch_model*.bin"))
        if pytorch_files:
            logger.info(f"Removing {len(pytorch_files)} PyTorch model files to save disk space...")
            for file in pytorch_files:
                file.unlink()

        logger.info("Keeping config files, tokeniser, and metadata for reference")

    def _upload_and_cleanup(
        self,
        output_repo: str,
        file_path: Path,
        quant_type: QuantisationType,
        model_source: ModelSource,
        results: dict[QuantisationType, QuantisationResult],
    ) -> None:
        """Upload file and clean up (runs in background thread)."""
        try:
            logger.info(f"[PARALLEL] Uploading {quant_type}...")
            self.uploader.upload_model_file(output_repo, file_path)

            logger.info(f"[PARALLEL] Removing {file_path.name} to save disk space...")
            file_path.unlink()

            results[quant_type].status = "completed"
            updated_readme_path = self.readme_generator.generate(
                model_source, results, self.models_dir, output_repo
            )
            self.uploader.upload_readme(output_repo, updated_readme_path)

            logger.info(f"[PARALLEL] {quant_type} upload and cleanup complete")
        except Exception as e:
            logger.error(f"[PARALLEL] Failed to upload {quant_type}: {e}")
            results[quant_type].status = "failed"
            results[quant_type].error_message = str(e)

            updated_readme_path = self.readme_generator.generate(
                model_source, results, self.models_dir, output_repo
            )
            self.uploader.upload_readme(output_repo, updated_readme_path)
            raise

    def _print_model_info(self, model_source: ModelSource) -> None:
        """Print model information."""
        logger.info(f"Source URL: {model_source.url}")
        logger.info(f"Source model: {model_source.source_model}")
        logger.info(f"Original author: {model_source.original_author}")
        logger.info(f"Model name: {model_source.model_name}")
        logger.info(f"Your HF username: {self.uploader.get_username()}")
        logger.info(f"Working directory: {self.work_dir}")

    def _print_completion_summary(
        self,
        model_source: ModelSource,
        results: dict[QuantisationType, QuantisationResult],
        output_repo: str,
    ) -> None:
        """Print completion summary."""
        successful_results = [r for r in results.values() if r.success]

        if successful_results:
            logger.info("Complete! Your quantised models are available at:")
            logger.info(f"  https://huggingface.co/{output_repo}")
            logger.info("Model info:")
            logger.info(f"  - Source URL: {model_source.url}")
            logger.info(f"  - Original: {model_source.source_model}")
            logger.info(
                "  - Method: "
                f"{'Direct GGUF download' if model_source.is_gguf_repo else 'HF model conversion'}"
            )
            logger.info(f"  - Quantised: {output_repo}")

            for result in successful_results:
                if result.file_size:
                    filename = (
                        f"{model_source.original_author}-{model_source.model_name}-"
                        f"{result.quantisation_type}.gguf"
                    )
                    logger.info(f"  - {result.quantisation_type}: {filename} ({result.file_size})")
        else:
            logger.error(
                "All quantisations failed - repository created with documentation "
                "but no model files"
            )
            logger.error(f"  Repository: https://huggingface.co/{output_repo}")
486 helpers/services/quantisation.py Normal file
@@ -0,0 +1,486 @@
"""Quantisation operations service.
|
||||
|
||||
Provides modular quantisation engine, model management, and upload capabilities
|
||||
for GGUF model processing. Consolidates quantisation logic from various tools
|
||||
into reusable components following SOLID principles.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import shutil
|
||||
import subprocess
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
from helpers.logger import logger
|
||||
from helpers.models.quantisation import (
|
||||
ModelSource,
|
||||
QuantisationContext,
|
||||
QuantisationResult,
|
||||
QuantisationType,
|
||||
)
|
||||
from helpers.services.filesystem import FilesystemService
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from pathlib import Path
|
||||
|
||||
from helpers.models.quantisation import LlamaCppEnvironment
|
||||
from helpers.services.llama_cpp import EnvironmentManager
|
||||
|
||||
|
||||
class QuantisationEngine:
|
||||
"""Handles the actual quantisation process with configurable methods.
|
||||
|
||||
Provides flexible quantisation execution supporting multiple tensor
|
||||
precision configurations, importance matrices, and fallback strategies.
|
||||
Encapsulates llama-quantize binary interactions with real-time output.
|
||||
"""
|
||||
|
||||
def __init__(self) -> None:
|
||||
"""Initialise quantisation engine."""
|
||||
self.fs = FilesystemService()
|
||||
|
||||
def quantise(self, context: QuantisationContext) -> QuantisationResult:
|
||||
"""Perform quantisation using the specified configuration.
|
||||
|
||||
Executes quantisation with primary and fallback methods, handling
|
||||
tensor-specific precision overrides and importance matrix guidance.
|
||||
|
||||
Returns:
|
||||
QuantisationResult with success status and file information.
|
||||
"""
|
||||
logger.info(
|
||||
f"⚙️ Creating {context.config.name} quantisation ({context.config.description})..."
|
||||
)
|
||||
|
||||
output_path = context.get_output_path()
|
||||
|
||||
logger.info(f"🎯 Attempting {context.config.name} quantisation...")
|
||||
logger.info(f"📝 Source: {context.f16_model_path}")
|
||||
logger.info(f"📝 Target: {output_path}")
|
||||
|
||||
# Try primary method
|
||||
if self._try_quantisation_method(
|
||||
context, output_path, context.config.tensor_types, "method 1"
|
||||
):
|
||||
return self._create_success_result(context.config.name, output_path, "method 1")
|
||||
|
||||
# Try fallback methods
|
||||
for i, fallback_method in enumerate(context.config.fallback_methods, 2):
|
||||
method_name = f"method {i}"
|
||||
if self._try_quantisation_method(context, output_path, fallback_method, method_name):
|
||||
return self._create_success_result(context.config.name, output_path, method_name)
|
||||
|
||||
logger.error("All %s quantisation methods failed", context.config.name)
|
||||
return QuantisationResult(
|
||||
quantisation_type=QuantisationType(context.config.name),
|
||||
success=False,
|
||||
error_message="All quantisation methods failed",
|
||||
)
|
||||
|
||||
def _try_quantisation_method(
|
||||
self,
|
||||
context: QuantisationContext,
|
||||
output_path: Path,
|
||||
tensor_config: dict[str, str],
|
||||
method_name: str,
|
||||
) -> bool:
|
||||
"""Try a specific quantisation method with real-time output.
|
||||
|
||||
Builds and executes llama-quantize command with appropriate parameters,
|
||||
streaming output for progress monitoring.
|
||||
|
||||
Returns:
|
||||
True if quantisation successful, False otherwise.
|
||||
"""
|
||||
logger.info(f"🔍 Trying {method_name}...")
|
||||
|
||||
cmd = self._build_quantisation_command(context, output_path, tensor_config)
|
||||
return self._execute_quantisation_command(cmd, method_name)
|
||||
|
||||
def _build_quantisation_command(
|
||||
self, context: QuantisationContext, output_path: Path, tensor_config: dict[str, str]
|
||||
) -> list[str]:
|
||||
"""Build quantisation command with all required parameters.
|
||||
|
||||
Returns:
|
||||
List of command arguments.
|
||||
"""
|
||||
cmd = [str(context.llama_env.quantise_binary)]
|
||||
|
||||
# Add imatrix if available
|
||||
if context.imatrix_path and context.imatrix_path.exists():
|
||||
cmd.extend(["--imatrix", str(context.imatrix_path)])
|
||||
logger.info(f"🧮 Using imatrix: {context.imatrix_path.name}")
|
||||
|
||||
# Add tensor type arguments
|
||||
self._add_tensor_type_arguments(cmd, tensor_config)
|
||||
|
||||
cmd.extend([str(context.f16_model_path), str(output_path), context.base_quant])
|
||||
return cmd
|
||||
|
||||
def _add_tensor_type_arguments(self, cmd: list[str], tensor_config: dict[str, str]) -> None:
|
||||
"""Add tensor type arguments to command."""
|
||||
if not tensor_config:
|
||||
return
|
||||
|
||||
for tensor_name, quant_type in tensor_config.items():
|
||||
if tensor_name.startswith(("token-embedding-type", "output-tensor-type")):
|
||||
cmd.extend([f"--{tensor_name}", quant_type])
|
||||
else:
|
||||
cmd.extend(["--tensor-type", f"{tensor_name}={quant_type}"])
|
||||
|
||||
def _execute_quantisation_command(self, cmd: list[str], method_name: str) -> bool:
|
||||
"""Execute quantisation command with real-time output.
|
||||
|
||||
Returns:
|
||||
True if quantisation successful, False otherwise.
|
||||
"""
|
||||
logger.info(f"💻 Running: {' '.join(cmd)}")
|
||||
logger.info("⏳ Quantisation in progress... (this may take several minutes)")
|
||||
|
||||
try:
|
||||
process = subprocess.Popen(
|
||||
cmd,
|
||||
stdout=subprocess.PIPE,
|
||||
stderr=subprocess.STDOUT,
|
||||
universal_newlines=True,
|
||||
bufsize=1,
|
||||
)
|
||||
|
||||
self._stream_quantisation_output(process)
|
||||
|
||||
return_code = process.poll()
|
||||
if return_code == 0:
|
||||
logger.info(f"✅ {method_name} quantisation successful!")
|
||||
return True
|
||||
except Exception as e:
|
||||
logger.info(f"❌ {method_name} failed with exception: {e}")
|
||||
return False
|
||||
else:
|
||||
logger.info(f"❌ {method_name} failed with return code {return_code}")
|
||||
return False
|
||||
|
||||
def _stream_quantisation_output(self, process: subprocess.Popen) -> None:
|
||||
"""Stream quantisation output in real-time."""
|
||||
while True:
|
||||
if process.stdout is not None:
|
||||
output = process.stdout.readline()
|
||||
else:
|
||||
break
|
||||
if not output and process.poll() is not None:
|
||||
break
|
||||
if output:
|
||||
logger.info(f"📊 {output.strip()}")
|
||||
|
||||
def _create_success_result(
|
||||
self, quant_type: str, output_path: Path, method_used: str
|
||||
) -> QuantisationResult:
|
||||
"""Create successful quantisation result with file metadata.
|
||||
|
||||
Returns:
|
||||
QuantisationResult with file path and size information.
|
||||
"""
|
||||
file_size = self.fs.get_file_size(output_path)
|
||||
return QuantisationResult(
|
||||
quantisation_type=QuantisationType(quant_type),
|
||||
success=True,
|
||||
file_path=output_path,
|
||||
file_size=file_size,
|
||||
method_used=method_used,
|
||||
)
|
||||
|
||||
|
||||
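# --- For reference (not part of this commit) ---
# _add_tensor_type_arguments above turns a tensor_types mapping into
# llama-quantize flags. A hypothetical config such as
#
#   {"token-embedding-type": "q8_0", "output-tensor-type": "q8_0", "ffn_down": "q6_k"}
#
# would yield:
#
#   --token-embedding-type q8_0 --output-tensor-type q8_0 --tensor-type ffn_down=q6_k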
class ModelManager:
    """Handles model downloading and preparation for quantisation.

    Manages both GGUF repository downloads and HuggingFace model conversions,
    providing a unified interface for model acquisition and preparation.
    """

    def __init__(self, models_dir: Path, environment_manager: EnvironmentManager) -> None:
        """Initialise model manager with storage and environment configuration.

        Sets up model storage directory and links to environment manager for
        conversion script access and llama.cpp tool discovery.
        """
        self.models_dir = models_dir
        self.environment_manager = environment_manager
        self.fs = FilesystemService()

    def prepare_model(self, model_source: ModelSource, llama_env: LlamaCppEnvironment) -> Path:
        """Prepare model for quantisation and return F16 model path.

        Handles both GGUF repository downloads and regular HuggingFace model
        conversion workflows with automatic format detection.

        Returns:
            Path to F16 GGUF model ready for quantisation.
        """
        model_dir = self.models_dir / model_source.model_name

        if model_source.is_gguf_repo:
            return self._handle_gguf_repo(model_source, model_dir)
        return self._handle_regular_repo(model_source, model_dir, llama_env)

    def _handle_gguf_repo(self, model_source: ModelSource, model_dir: Path) -> Path:
        """Handle GGUF repository download with pattern matching.

        Downloads GGUF files matching specified patterns, prioritising
        multi-part files and F16 variants.

        Returns:
            Path to downloaded or existing GGUF file.
        """
        logger.info(f"⬇️ Downloading GGUF file from repository: {model_source.source_model}")
        logger.info(f"🔍 Looking for file pattern: *{model_source.gguf_file_pattern}*")

        f16_model = model_dir / f"{model_source.model_name}-f16.gguf"

        if f16_model.exists():
            logger.info(f"✅ Found existing F16 file: {f16_model.name}")
            return f16_model

        # Check for existing GGUF files
        model_dir.mkdir(parents=True, exist_ok=True)
        existing_gguf = self.fs.find_gguf_files(model_dir)

        if existing_gguf:
            logger.info(f"✅ Found existing GGUF file: {existing_gguf[0].name}")
            return existing_gguf[0]

        # Download with patterns
        downloaded_file = self._download_gguf_with_patterns(
            model_source.source_model, model_source.gguf_file_pattern, model_dir
        )

        if downloaded_file:
            # Handle multi-part files
            if "00001-of-" in downloaded_file.name:
                return downloaded_file
            if "-00002-of-" in downloaded_file.name or "-00003-of-" in downloaded_file.name:
                base_name = downloaded_file.name.replace("-00002-of-", "-00001-of-").replace(
                    "-00003-of-", "-00001-of-"
                )
                first_part = downloaded_file.parent / base_name
                if first_part.exists():
                    logger.info(f"🔄 Using first part: {first_part.name}")
                    return first_part

            # Rename single file to standard name
            downloaded_file.rename(f16_model)
            return f16_model

        # Fallback to regular conversion
        logger.info("💡 Falling back to downloading full repository and converting...")
        return self._handle_regular_repo(
            ModelSource(**{**model_source.dict(), "is_gguf_repo": False}),
            model_dir,
            None,
        )

    def _download_gguf_with_patterns(
        self, source_model: str, pattern: str | None, model_dir: Path
    ) -> Path | None:
        """Download GGUF file using various pattern strategies.

        Tries multiple pattern variations to find and download appropriate
        GGUF files, handling timeouts and temporary directories.

        Returns:
            Path to downloaded file, or None if all patterns fail.
        """
        if pattern:
            patterns = [
                f"*{pattern}*",
                f"*{pattern.lower()}*",
                f"*{pattern.upper()}*",
                "*f16*",
                "*F16*",
                "*fp16*",
            ]
        else:
            patterns = ["*f16*", "*F16*", "*fp16*"]

        temp_dir = model_dir / "gguf_temp"

        for search_pattern in patterns:
            logger.info(f"🔍 Trying pattern: {search_pattern}")
            temp_dir.mkdir(exist_ok=True)

            try:
                subprocess.run(
                    [
                        "timeout",
                        "300",
                        "huggingface-cli",
                        "download",
                        source_model,
                        "--include",
                        search_pattern,
                        "--local-dir",
                        str(temp_dir),
                    ],
                    check=True,
                    capture_output=True,
                )

                # Find downloaded GGUF files
                gguf_files = self.fs.find_gguf_files(temp_dir, pattern)
                if gguf_files:
                    found_file = gguf_files[0]
                    logger.info(f"✅ Found GGUF file: {found_file.name}")

                    # Move to parent directory
                    final_path = model_dir / found_file.name
                    shutil.move(str(found_file), str(final_path))
                    shutil.rmtree(temp_dir)
                    return final_path

            except subprocess.CalledProcessError:
                logger.info(f"⚠️ Pattern {search_pattern} failed or timed out")
                continue
            finally:
                if temp_dir.exists():
                    shutil.rmtree(temp_dir, ignore_errors=True)

        return None

    def _handle_regular_repo(
        self,
        model_source: ModelSource,
        model_dir: Path,
        llama_env: LlamaCppEnvironment | None,
    ) -> Path:
        """Handle regular HuggingFace repository conversion.

        Downloads the full model repository and converts to F16 GGUF format
        using llama.cpp conversion scripts.

        Returns:
            Path to converted F16 GGUF model.
        """
        logger.info(f"⬇️ Downloading source model: {model_source.source_model}")

        if not model_dir.exists():
            subprocess.run(
                [
                    "huggingface-cli",
                    "download",
                    model_source.source_model,
                    "--local-dir",
                    str(model_dir),
                ],
                check=True,
            )
        else:
            logger.info("✅ Model already downloaded")

        logger.info("🔄 Converting to GGUF F16 format...")
        f16_model = model_dir / f"{model_source.model_name}-f16.gguf"

        if not f16_model.exists():
            if not llama_env:
                llama_env = self.environment_manager.setup()

            # Ensure conversion script is available
            if llama_env.use_repo or not self.environment_manager.llama_cpp_dir.exists():
                logger.info("Getting conversion script from llama.cpp repository...")
                llama_env = self.environment_manager.setup_repository()

            subprocess.run(
                [
                    *llama_env.convert_script.split(),
                    str(model_dir),
                    "--outtype",
                    "f16",
                    "--outfile",
                    str(f16_model),
                ],
                check=True,
            )
        else:
            logger.info("✅ F16 model already exists")

        return f16_model
class HuggingFaceUploader:
|
||||
"""Handles uploading models and documentation to HuggingFace.
|
||||
|
||||
Provides methods for repository creation, file uploads, and README
|
||||
updates with proper error handling and retry logic.
|
||||
"""
|
||||
|
||||
@staticmethod
|
||||
def get_username() -> str:
|
||||
"""Get authenticated HuggingFace username.
|
||||
|
||||
Returns:
|
||||
HuggingFace username from CLI authentication.
|
||||
|
||||
Raises:
|
||||
RuntimeError: If not authenticated.
|
||||
"""
|
||||
try:
|
||||
result = subprocess.run(
|
||||
["huggingface-cli", "whoami"],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=True,
|
||||
)
|
||||
return result.stdout.strip()
|
||||
except (subprocess.CalledProcessError, FileNotFoundError) as err:
|
||||
msg = "Please log in to HuggingFace first: huggingface-cli login"
|
||||
raise RuntimeError(msg) from err
|
||||
|
||||
def upload_readme(self, output_repo: str, readme_path: Path) -> None:
|
||||
"""Upload or update README file to repository.
|
||||
|
||||
Creates repository if needed, handles existing repository updates.
|
||||
"""
|
||||
logger.info("Uploading README...")
|
||||
try:
|
||||
subprocess.run(
|
||||
[
|
||||
"huggingface-cli",
|
||||
"upload",
|
||||
output_repo,
|
||||
str(readme_path),
|
||||
"README.md",
|
||||
"--create",
|
||||
],
|
||||
check=True,
|
||||
capture_output=True,
|
||||
)
|
||||
logger.info("README uploaded")
|
||||
except subprocess.CalledProcessError:
|
||||
# Repository exists, update without --create
|
||||
subprocess.run(
|
||||
[
|
||||
"huggingface-cli",
|
||||
"upload",
|
||||
output_repo,
|
||||
str(readme_path),
|
||||
"README.md",
|
||||
],
|
||||
check=True,
|
||||
)
|
||||
logger.info("README updated")
|
||||
|
||||
def upload_model_file(self, output_repo: str, model_path: Path) -> None:
|
||||
"""Upload model file to repository.
|
||||
|
||||
Uploads GGUF model file to specified repository path.
|
||||
"""
|
||||
logger.info(f"Uploading {model_path.name}...")
|
||||
subprocess.run(
|
||||
[
|
||||
"huggingface-cli",
|
||||
"upload",
|
||||
output_repo,
|
||||
str(model_path),
|
||||
model_path.name,
|
||||
],
|
||||
check=True,
|
||||
)
|
||||
logger.info(f"{model_path.name} uploaded")
|
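A minimal sketch of how this uploader is driven, assuming an authenticated huggingface-cli session; the repository name and file paths below are illustrative:

from pathlib import Path

uploader = HuggingFaceUploader()
username = HuggingFaceUploader.get_username()  # raises RuntimeError if not logged in
output_repo = f"{username}/model-name-GGUF"    # illustrative repository id
uploader.upload_readme(output_repo, Path("README.md"))
uploader.upload_model_file(output_repo, Path("model-name-Q4_K_M.gguf"))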
16
helpers/utils/__init__.py
Normal file
@@ -0,0 +1,16 @@
"""Utility functions for llm-gguf-tools.
|
||||
|
||||
Provides low-level utilities for tensor mapping, configuration parsing,
|
||||
and other common operations. Uses UK English spelling conventions throughout.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from helpers.utils.config_parser import ConfigParser
|
||||
from helpers.utils.tensor_mapping import TensorMapper, URLParser
|
||||
|
||||
__all__ = [
|
||||
"ConfigParser",
|
||||
"TensorMapper",
|
||||
"URLParser",
|
||||
]
|
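Since the package re-exports these names, callers can import them from the package root rather than the individual modules:

from helpers.utils import ConfigParser, TensorMapper, URLParser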
171
helpers/utils/config_parser.py
Normal file
@@ -0,0 +1,171 @@
"""Configuration parsing utilities.
|
||||
|
||||
Provides utilities for parsing model configurations, inferring parameters,
|
||||
and handling architecture-specific settings. Uses UK English spelling
|
||||
conventions throughout.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import TYPE_CHECKING, Any
|
||||
|
||||
from helpers.models.conversion import GGUFParameters, ModelConfig, VisionConfig
|
||||
from helpers.services.filesystem import FilesystemService
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from pathlib import Path
|
||||
|
||||
|
||||
class ConfigParser:
|
||||
"""Parses and transforms model configuration files.
|
||||
|
||||
Handles loading of HuggingFace config.json files, parameter inference,
|
||||
and conversion to GGUF-compatible formats. Provides sensible defaults
|
||||
for missing values and architecture-specific handling.
|
||||
"""
|
||||
|
||||
def __init__(self) -> None:
|
||||
"""Initialise ConfigParser."""
|
||||
self.fs = FilesystemService()
|
||||
|
||||
def load_model_config(self, model_path: Path) -> ModelConfig:
|
||||
"""Load model configuration from config.json file.
|
||||
|
||||
Reads the standard HuggingFace config.json file and parses it into
|
||||
a structured ModelConfig instance with proper type validation. Handles
|
||||
vision model configurations and provides sensible defaults for missing values.
|
||||
|
||||
Returns:
|
||||
Parsed ModelConfig instance.
|
||||
"""
|
||||
config_file = model_path / "config.json"
|
||||
raw_config = self.fs.load_json_config(config_file)
|
||||
|
||||
# Parse vision config if present
|
||||
vision_config = None
|
||||
if "vision_config" in raw_config:
|
||||
vision_config = VisionConfig(**raw_config["vision_config"])
|
||||
|
||||
# Create ModelConfig with parsed values
|
||||
return ModelConfig(
|
||||
architectures=raw_config.get("architectures", ["Unknown"]),
|
||||
model_type=raw_config.get("model_type", "unknown"),
|
||||
vocab_size=raw_config.get("vocab_size", 32000),
|
||||
max_position_embeddings=raw_config.get("max_position_embeddings", 2048),
|
||||
hidden_size=raw_config.get("hidden_size", 4096),
|
||||
num_hidden_layers=raw_config.get("num_hidden_layers", 32),
|
||||
intermediate_size=raw_config.get("intermediate_size", 11008),
|
||||
num_attention_heads=raw_config.get("num_attention_heads", 32),
|
||||
num_key_value_heads=raw_config.get("num_key_value_heads"),
|
||||
rope_theta=raw_config.get("rope_theta", 10000.0),
|
||||
rope_scaling=raw_config.get("rope_scaling"),
|
||||
rms_norm_eps=raw_config.get("rms_norm_eps", 1e-5),
|
||||
vision_config=vision_config,
|
||||
)
|
||||
|
||||
def infer_gguf_parameters(self, config: ModelConfig) -> GGUFParameters:
|
||||
"""Infer GGUF parameters from model configuration.
|
||||
|
||||
Translates HuggingFace model configuration to GGUF parameter format,
|
||||
providing sensible defaults for missing values and handling various
|
||||
architecture conventions.
|
||||
|
||||
Args:
|
||||
config: Parsed ModelConfig instance.
|
||||
|
||||
Returns:
|
||||
GGUFParameters with inferred values.
|
||||
"""
|
||||
# Calculate derived parameters
|
||||
num_heads = config.num_attention_heads
|
||||
embedding_length = config.hidden_size
|
||||
rope_dimension_count = embedding_length // num_heads
|
||||
|
||||
# Handle KV heads (for GQA models)
|
||||
num_kv_heads = config.num_key_value_heads or num_heads
|
||||
|
||||
# Create GGUFParameters using dict with aliases
|
||||
params_dict = {
|
||||
"vocab_size": config.vocab_size,
|
||||
"context_length": config.max_position_embeddings,
|
||||
"embedding_length": embedding_length,
|
||||
"block_count": config.num_hidden_layers,
|
||||
"feed_forward_length": config.intermediate_size,
|
||||
"attention.head_count": num_heads,
|
||||
"attention.head_count_kv": num_kv_heads,
|
||||
"attention.layer_norm_rms_epsilon": config.rms_norm_eps,
|
||||
"rope.freq_base": config.rope_theta,
|
||||
"rope.dimension_count": rope_dimension_count,
|
||||
}
|
||||
|
||||
params = GGUFParameters.model_validate(params_dict)
|
||||
|
||||
# Add RoPE scaling if present
|
||||
if config.rope_scaling:
|
||||
params.rope_scaling_type = config.rope_scaling.get("type", "linear")
|
||||
params.rope_scaling_factor = config.rope_scaling.get("factor", 1.0)
|
||||
|
||||
return params
|
||||
|
||||
@staticmethod
|
||||
def get_architecture_mapping(architecture: str) -> str:
|
||||
"""Map architecture names to known GGUF architectures.
|
||||
|
||||
Provides fallback mappings for architectures not directly supported
|
||||
by GGUF, mapping them to similar known architectures.
|
||||
|
||||
Args:
|
||||
architecture: Original architecture name from config.
|
||||
|
||||
Returns:
|
||||
GGUF-compatible architecture name.
|
||||
"""
|
||||
# Architecture mappings to known GGUF types
|
||||
mappings = {
|
||||
"DotsOCRForCausalLM": "qwen2", # Similar architecture
|
||||
"GptOssForCausalLM": "llama", # Use llama as fallback
|
||||
"MistralForCausalLM": "llama", # Mistral is llama-like
|
||||
"Qwen2ForCausalLM": "qwen2",
|
||||
"LlamaForCausalLM": "llama",
|
||||
"GemmaForCausalLM": "gemma",
|
||||
"Phi3ForCausalLM": "phi3",
|
||||
# Add more mappings as needed
|
||||
}
|
||||
|
||||
return mappings.get(architecture, "llama") # Default to llama
|
||||
|
||||
@staticmethod
|
||||
def load_tokeniser_config(model_path: Path) -> dict[str, Any]:
|
||||
"""Load tokeniser configuration from model directory.
|
||||
|
||||
Reads tokenizer_config.json to extract special token IDs and
|
||||
other tokenisation parameters.
|
||||
|
||||
Args:
|
||||
model_path: Path to model directory.
|
||||
|
||||
Returns:
|
||||
Tokeniser configuration dictionary.
|
||||
"""
|
||||
fs = FilesystemService()
|
||||
tokeniser_config_path = model_path / "tokenizer_config.json"
|
||||
|
||||
if not tokeniser_config_path.exists():
|
||||
# Return defaults if no config found
|
||||
return {
|
||||
"bos_token_id": 1,
|
||||
"eos_token_id": 2,
|
||||
"unk_token_id": 0,
|
||||
"pad_token_id": 0,
|
||||
}
|
||||
|
||||
config = fs.load_json_config(tokeniser_config_path)
|
||||
|
||||
# Extract token IDs with defaults
|
||||
return {
|
||||
"bos_token_id": config.get("bos_token_id", 1),
|
||||
"eos_token_id": config.get("eos_token_id", 2),
|
||||
"unk_token_id": config.get("unk_token_id", 0),
|
||||
"pad_token_id": config.get("pad_token_id", 0),
|
||||
"model_type": config.get("model_type", "llama"),
|
||||
}
|
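A minimal sketch of the parsing flow, assuming a downloaded checkpoint directory at ./model-dir containing config.json and tokenizer_config.json (the path is illustrative):

from pathlib import Path

from helpers.utils.config_parser import ConfigParser

parser = ConfigParser()
config = parser.load_model_config(Path("./model-dir"))    # parses config.json
params = parser.infer_gguf_parameters(config)             # HF config -> GGUF parameters
arch = ConfigParser.get_architecture_mapping(config.architectures[0])
tokens = ConfigParser.load_tokeniser_config(Path("./model-dir"))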
196
helpers/utils/tensor_mapping.py
Normal file
@@ -0,0 +1,196 @@
"""Tensor mapping and URL parsing utilities.
|
||||
|
||||
Provides utilities for mapping tensor names between different formats,
|
||||
parsing model URLs, and handling architecture-specific conversions.
|
||||
Uses UK English spelling conventions throughout.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import re
|
||||
from typing import ClassVar
|
||||
|
||||
from helpers.models.quantisation import ModelSource, URLType
|
||||
|
||||
|
||||
class TensorMapper:
|
||||
"""Maps tensor names between HuggingFace and GGUF conventions.
|
||||
|
||||
Provides flexible tensor name translation supporting direct mappings,
|
||||
layer-aware transformations, and architecture-specific overrides.
|
||||
Handles both simple renames and complex pattern-based conversions.
|
||||
"""
|
||||
|
||||
# Common direct mappings across architectures
|
||||
DIRECT_MAPPINGS: ClassVar[dict[str, str]] = {
|
||||
"model.embed_tokens.weight": "token_embd.weight",
|
||||
"model.norm.weight": "output_norm.weight",
|
||||
"lm_head.weight": "output.weight",
|
||||
}
|
||||
|
||||
# Layer component patterns for transformer blocks
|
||||
LAYER_PATTERNS: ClassVar[dict[str, str]] = {
|
||||
"self_attn.q_proj.weight": "attn_q.weight",
|
||||
"self_attn.q_proj.bias": "attn_q.bias",
|
||||
"self_attn.k_proj.weight": "attn_k.weight",
|
||||
"self_attn.k_proj.bias": "attn_k.bias",
|
||||
"self_attn.v_proj.weight": "attn_v.weight",
|
||||
"self_attn.v_proj.bias": "attn_v.bias",
|
||||
"self_attn.o_proj": "attn_output.weight",
|
||||
"mlp.gate_proj": "ffn_gate.weight",
|
||||
"mlp.up_proj": "ffn_up.weight",
|
||||
"mlp.down_proj": "ffn_down.weight",
|
||||
"input_layernorm": "attn_norm.weight",
|
||||
"post_attention_layernorm": "ffn_norm.weight",
|
||||
}
|
||||
|
||||
@classmethod
|
||||
def map_tensor_name(cls, original_name: str) -> str | None:
|
||||
"""Map original tensor name to GGUF format.
|
||||
|
||||
Translates HuggingFace tensor naming to GGUF format, handling embeddings,
|
||||
attention layers, feed-forward networks, and normalisation layers. Uses
|
||||
layer-aware mapping for transformer blocks whilst maintaining consistency
|
||||
across different model architectures.
|
||||
|
||||
Returns:
|
||||
GGUF tensor name, or None if unmappable.
|
||||
"""
|
||||
# Check direct mappings first
|
||||
if original_name in cls.DIRECT_MAPPINGS:
|
||||
return cls.DIRECT_MAPPINGS[original_name]
|
||||
|
||||
# Handle layer-specific tensors
|
||||
if ".layers." in original_name:
|
||||
return cls._map_layer_tensor(original_name)
|
||||
|
||||
# Return None for unmapped tensors
|
||||
return None
|
||||
|
||||
@classmethod
|
||||
def _map_layer_tensor(cls, tensor_name: str) -> str | None:
|
||||
"""Map layer-specific tensor names.
|
||||
|
||||
Handles tensors within transformer layers, extracting layer indices
|
||||
and mapping component names to GGUF conventions.
|
||||
|
||||
Args:
|
||||
tensor_name: Layer tensor name containing .layers.N. pattern.
|
||||
|
||||
Returns:
|
||||
Mapped GGUF tensor name, or None if unmappable.
|
||||
"""
|
||||
# Extract layer number
|
||||
parts = tensor_name.split(".")
|
||||
layer_idx = None
|
||||
for i, part in enumerate(parts):
|
||||
if part == "layers" and i + 1 < len(parts):
|
||||
layer_idx = parts[i + 1]
|
||||
break
|
||||
|
||||
if layer_idx is None:
|
||||
return None
|
||||
|
||||
# Check each pattern
|
||||
for pattern, replacement in cls.LAYER_PATTERNS.items():
|
||||
if pattern in tensor_name:
|
||||
return f"blk.{layer_idx}.{replacement}"
|
||||
|
||||
return None
|
||||
|
||||
|
||||
class URLParser:
|
||||
"""Parses and validates model URLs from various sources.
|
||||
|
||||
Handles HuggingFace URLs, Ollama-style GGUF references, and other
|
||||
model source formats. Extracts metadata including author, model name,
|
||||
and file patterns for appropriate download strategies.
|
||||
"""
|
||||
|
||||
@staticmethod
|
||||
def parse(url: str) -> ModelSource:
|
||||
"""Parse URL and extract model source information.
|
||||
|
||||
Analyses URL format to determine source type and extract relevant
|
||||
metadata for model download and processing.
|
||||
|
||||
Args:
|
||||
url: Model URL in supported format.
|
||||
|
||||
Returns:
|
||||
ModelSource with parsed information.
|
||||
|
||||
Raises:
|
||||
ValueError: If URL format is not recognised.
|
||||
"""
|
||||
if not url:
|
||||
msg = "URL cannot be empty"
|
||||
raise ValueError(msg)
|
||||
|
||||
# Try Ollama-style GGUF URL first (hf.co/author/model:pattern)
|
||||
ollama_match = re.match(r"^hf\.co/([^:]+):(.+)$", url)
|
||||
if ollama_match:
|
||||
source_model = ollama_match.group(1)
|
||||
gguf_pattern = ollama_match.group(2)
|
||||
return URLParser._create_model_source(
|
||||
url,
|
||||
URLType.OLLAMA_GGUF,
|
||||
source_model,
|
||||
gguf_file_pattern=gguf_pattern,
|
||||
is_gguf_repo=True,
|
||||
)
|
||||
|
||||
# Try regular HuggingFace URL
|
||||
hf_match = re.match(r"https://huggingface\.co/([^/]+/[^/?]+)", url)
|
||||
if hf_match:
|
||||
source_model = hf_match.group(1)
|
||||
return URLParser._create_model_source(
|
||||
url, URLType.HUGGINGFACE, source_model, is_gguf_repo=False
|
||||
)
|
||||
|
||||
msg = (
|
||||
"Invalid URL format\n"
|
||||
"Supported formats:\n"
|
||||
" - https://huggingface.co/username/model-name\n"
|
||||
" - hf.co/username/model-name-GGUF:F16"
|
||||
)
|
||||
raise ValueError(msg)
|
||||
|
||||
@staticmethod
|
||||
def _create_model_source(
|
||||
url: str,
|
||||
url_type: URLType,
|
||||
source_model: str,
|
||||
gguf_file_pattern: str | None = None,
|
||||
is_gguf_repo: bool = False,
|
||||
) -> ModelSource:
|
||||
"""Create ModelSource with parsed information.
|
||||
|
||||
Constructs a ModelSource instance with extracted metadata,
|
||||
handling author/model name splitting and GGUF suffix removal.
|
||||
|
||||
Args:
|
||||
url: Original URL.
|
||||
url_type: Type of URL (HuggingFace or Ollama GGUF).
|
||||
source_model: Repository identifier (author/model).
|
||||
gguf_file_pattern: Optional GGUF file pattern.
|
||||
is_gguf_repo: Whether this is a GGUF repository.
|
||||
|
||||
Returns:
|
||||
Configured ModelSource instance.
|
||||
"""
|
||||
author, model_name = source_model.split("/", 1)
|
||||
|
||||
# Strip -GGUF suffix for GGUF repos
|
||||
if is_gguf_repo and model_name.endswith("-GGUF"):
|
||||
model_name = model_name[:-5]
|
||||
|
||||
return ModelSource(
|
||||
url=url,
|
||||
url_type=url_type,
|
||||
source_model=source_model,
|
||||
original_author=author,
|
||||
model_name=model_name,
|
||||
gguf_file_pattern=gguf_file_pattern,
|
||||
is_gguf_repo=is_gguf_repo,
|
||||
)
|
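Both helpers operate purely on strings, so their behaviour is easy to illustrate; the repository name below is made up:

from helpers.utils.tensor_mapping import TensorMapper, URLParser

source = URLParser.parse("hf.co/example/TinyModel-GGUF:F16")
# source.model_name == "TinyModel", source.gguf_file_pattern == "F16"

TensorMapper.map_tensor_name("model.embed_tokens.weight")
# -> "token_embd.weight" (direct mapping)
TensorMapper.map_tensor_name("model.layers.3.mlp.up_proj.weight")
# -> "blk.3.ffn_up.weight" (layer-aware mapping)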
96
pyproject.toml
Normal file
@@ -0,0 +1,96 @@
[project]
name = "llm-gguf-tools"
version = "0.1.0"
description = "Tools to convert and quantise language models in GGUF format"
readme = "README.md"
license = { text = "Apache-2.0" }
authors = [{ name = "Tom Foster", email = "tom@tomfos.tr" }]
maintainers = [{ name = "Tom Foster", email = "tom@tomfos.tr" }]
requires-python = ">=3.13"
classifiers = [
    "Development Status :: 3 - Alpha",
    "License :: OSI Approved :: Apache Software License",
    "Programming Language :: Python",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.13",
    "Topic :: Scientific/Engineering :: Artificial Intelligence",
    "Topic :: Software Development :: Libraries :: Python Modules",
]
dependencies = ["gguf>=0", "pydantic>=2", "safetensors>=0", "torch>=2"]

[project.urls]
Homepage = "https://git.tomfos.tr/tom/llm-gguf-tools"
"Bug Reports" = "https://git.tomfos.tr/tom/llm-gguf-tools/issues"
"Source" = "https://git.tomfos.tr/tom/llm-gguf-tools"

[dependency-groups]
dev = ["pytest>=8", "ruff>=0", "uv>=0"]

[tool.uv]
package = true

[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"

[tool.uv.sources]
torch = { index = "pytorch-cpu" }

[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project.scripts]
quantise = "quantise:main"
safetensors-to-gguf = "direct_safetensors_to_gguf:main"

[tool.setuptools]
packages = { find = {} }

[tool.ruff]
cache-dir = "/tmp/.ruff_cache"
fix = true
line-length = 100
preview = true
show-fixes = false
target-version = "py313"
unsafe-fixes = true

[tool.ruff.format]
line-ending = "auto"
skip-magic-trailing-comma = false

[tool.ruff.lint]
fixable = ["ALL"]
ignore = [
    "ANN401",  # use of Any type
    "BLE001",  # blind Exception usage
    "COM812",  # missing trailing comma
    "CPY",     # flake8-copyright
    "FBT",     # boolean arguments
    "PLR0912", # too many branches
    "PLR0913", # too many arguments
    "PLR0915", # too many statements
    "PLR0917", # too many positional arguments
    "PLR6301", # method could be static
    "RUF029",  # async methods that don't await
    "S104",    # binding to all interfaces
    "S110",    # passed exceptions
    "S404",    # use of subprocess
    "S603",    # check subprocess input
    "S607",    # subprocess with partial path
    "TRY301",  # raise inside try block
]
select = ["ALL"]
unfixable = [
    "F841",   # local variable assigned but never used
    "RUF100", # unused noqa comments
    "T201",   # don't strip print statement
]

[tool.ruff.lint.isort]
combine-as-imports = true
required-imports = ["from __future__ import annotations"]

[tool.ruff.lint.pydocstyle]
convention = "google"
101
quantize_gguf.py
Normal file
@@ -0,0 +1,101 @@
#!/usr/bin/env python3
"""Bartowski Quantisation Script for advanced GGUF model processing.

Implements a sophisticated quantisation pipeline supporting Q4_K_M, Q4_K_L,
Q4_K_XL, and Q4_K_XXL methods with tensor-level precision control. Features
parallel processing, status tracking, automatic README generation, and
HuggingFace integration for streamlined model distribution workflows.

Usage: python quantise.py <huggingface_url>
"""

from __future__ import annotations

import argparse
import shutil
import sys
from pathlib import Path

from helpers.logger import logger
from helpers.services.orchestrator import QuantisationOrchestrator


def main() -> None:
    """Main entry point for the Bartowski quantisation workflow.

    Parses command-line arguments, initialises the quantisation orchestrator,
    and executes the complete model processing pipeline from HuggingFace URL
    to quantised GGUF files with optional HuggingFace upload and cleanup.
    """
    parser = argparse.ArgumentParser(
        description="Bartowski Quantisation Script - Supports Q4_K_M, Q4_K_L, Q4_K_XL, Q4_K_XXL",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  python quantise.py https://huggingface.co/DavidAU/Gemma-3-4b-it-Uncensored-DBL-X
  python quantise.py hf.co/DavidAU/Gemma-3-it-4B-Uncensored-DBL-X-GGUF:F16
        """,
    )
    parser.add_argument("url", help="HuggingFace model URL")
    parser.add_argument(
        "--work-dir", type=Path, help="Working directory (default: ./quantisation_work)"
    )
    parser.add_argument(
        "--no-imatrix",
        action="store_true",
        help="Skip imatrix generation (faster but lower quality)",
    )
    parser.add_argument(
        "--imatrix-base",
        choices=[
            "Q2_K",
            "Q3_K_L",
            "Q3_K_M",
            "Q3_K_S",
            "Q4_K_S",
            "Q4_K_M",
            "Q5_K_S",
            "Q5_K_M",
            "Q6_K",
            "Q8_0",
        ],
        default="Q4_K_M",
        help="Base quantisation for imatrix generation",
    )
    parser.add_argument(
        "--no-upload",
        action="store_true",
        help="Skip uploading to HuggingFace (local testing only)",
    )

    args = parser.parse_args()

    if not args.url:
        parser.print_help()
        sys.exit(1)

    try:
        orchestrator = QuantisationOrchestrator(
            work_dir=args.work_dir or Path.cwd() / "quantisation_work",
            use_imatrix=not args.no_imatrix,
            imatrix_base=args.imatrix_base,
            no_upload=args.no_upload,
        )
        orchestrator.quantise(args.url)

        # Cleanup prompt
        logger.info("Cleaning up...")
        response = input("Delete working files? (y/N): ").strip().lower()
        if response == "y":
            shutil.rmtree(orchestrator.work_dir)
            logger.info("Cleanup complete")
        else:
            logger.info(f"Working files kept in: {orchestrator.work_dir}")

    except Exception as e:
        logger.error(f"Error: {e}")
        sys.exit(1)


if __name__ == "__main__":
    main()
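The orchestrator can also be driven programmatically rather than via the CLI; a minimal sketch using the same constructor arguments the script passes (the model URL is illustrative):

from pathlib import Path

from helpers.services.orchestrator import QuantisationOrchestrator

orchestrator = QuantisationOrchestrator(
    work_dir=Path.cwd() / "quantisation_work",
    use_imatrix=True,
    imatrix_base="Q4_K_M",
    no_upload=True,  # keep everything local
)
orchestrator.quantise("https://huggingface.co/example/TinyModel")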
2482
resources/imatrix_data.txt
Normal file
File diff suppressed because one or more lines are too long
95
safetensors2gguf.py
Normal file
@@ -0,0 +1,95 @@
#!/usr/bin/env python3
"""Direct SafeTensors to GGUF converter for unsupported architectures.

This script attempts to convert SafeTensors models to GGUF format directly,
without relying on llama.cpp's architecture-specific conversion logic.
"""

from __future__ import annotations

import sys
import traceback
from argparse import ArgumentParser
from pathlib import Path

from helpers.logger import logger
from helpers.services.gguf import GGUFConverter
from helpers.utils.config_parser import ConfigParser
from helpers.utils.tensor_mapping import TensorMapper


def convert_safetensors_to_gguf(
    model_path: Path, output_path: Path, force_architecture: str | None = None
) -> bool:
    """Convert SafeTensors model to GGUF format with comprehensive metadata handling.

    Orchestrates the complete conversion workflow: loads configuration, maps
    architecture to known GGUF types, creates writer with proper metadata,
    processes all tensor files with name mapping, and adds tokeniser data.
    Handles BFloat16 conversion and provides fallback architecture mapping
    for unsupported model types to ensure maximum compatibility.

    Returns:
        True if conversion was successful, False otherwise.
    """
    # Use ConfigParser to load configuration
    config_parser = ConfigParser()
    model_config = config_parser.load_model_config(model_path)

    arch_name = model_config.architectures[0]
    model_type = model_config.model_type

    logger.info(f"Architecture: {arch_name}")
    logger.info(f"Model type: {model_type}")

    # Use forced architecture or try to map to a known one
    if force_architecture:
        arch = force_architecture
        logger.warning(f"Using forced architecture: {arch}")
    else:
        # Use ConfigParser's architecture mapping
        arch = config_parser.get_architecture_mapping(arch_name)
        if arch != arch_name:
            logger.warning(f"Unknown architecture {arch_name}, using {arch} as fallback")

    # Use the new GGUFConverter for the conversion
    tensor_mapper = TensorMapper()
    return GGUFConverter.convert_safetensors(
        model_path, output_path, model_config, arch, tensor_mapper
    )


def main() -> None:
    """Main entry point for SafeTensors to GGUF conversion command-line interface.

    Parses command-line arguments, validates input paths, and orchestrates the
    conversion process with proper error handling. Supports forced architecture
    mapping and flexible output path specification. Provides comprehensive
    error reporting and exit codes for integration with automated workflows.
    """
    parser = ArgumentParser(description="Convert SafeTensors to GGUF directly")
    parser.add_argument("model_path", help="Path to SafeTensors model directory")
    parser.add_argument("-o", "--output", help="Output GGUF file path")
    parser.add_argument("--force-arch", help="Force a specific architecture mapping")

    args = parser.parse_args()

    model_path = Path(args.model_path)
    if not model_path.exists():
        logger.error(f"Model path not found: {model_path}")
        sys.exit(1)

    output_path = Path(args.output) if args.output else model_path / f"{model_path.name}-f32.gguf"

    try:
        success = convert_safetensors_to_gguf(model_path, output_path, args.force_arch)
        sys.exit(0 if success else 1)
    except Exception as e:
        logger.error(f"Conversion failed: {e}")

        traceback.print_exc()
        sys.exit(1)


if __name__ == "__main__":
    main()
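The converter is equally usable as a library call; a minimal sketch with illustrative paths:

from pathlib import Path

ok = convert_safetensors_to_gguf(
    Path("./model-dir"),                    # directory with config.json + *.safetensors
    Path("./model-dir/model-f32.gguf"),     # output file
    force_architecture=None,                # fall back to ConfigParser's mapping
)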
425
uv.lock
generated
Normal file
@@ -0,0 +1,425 @@
version = 1
|
||||
revision = 2
|
||||
requires-python = ">=3.13"
|
||||
resolution-markers = [
|
||||
"sys_platform != 'darwin'",
|
||||
"sys_platform == 'darwin'",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "annotated-types"
|
||||
version = "0.7.0"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/ee/67/531ea369ba64dcff5ec9c3402f9f51bf748cec26dde048a2f973a4eea7f5/annotated_types-0.7.0.tar.gz", hash = "sha256:aff07c09a53a08bc8cfccb9c85b05f1aa9a2a6f23728d790723543408344ce89", size = 16081, upload-time = "2024-05-20T21:33:25.928Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/78/b6/6307fbef88d9b5ee7421e68d78a9f162e0da4900bc5f5793f6d3d0e34fb8/annotated_types-0.7.0-py3-none-any.whl", hash = "sha256:1f02e8b43a8fbbc3f3e0d4f0f4bfc8131bcb4eebe8849b8e5c773f3a1c582a53", size = 13643, upload-time = "2024-05-20T21:33:24.1Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "colorama"
|
||||
version = "0.4.6"
|
||||
source = { registry = "https://download.pytorch.org/whl/cpu" }
|
||||
wheels = [
|
||||
{ url = "https://download.pytorch.org/whl/colorama-0.4.6-py2.py3-none-any.whl", hash = "sha256:4f1d9991f5acc0ca119f9d443620b77f9d6b33703e51011c16baf57afb285fc6" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "filelock"
|
||||
version = "3.13.1"
|
||||
source = { registry = "https://download.pytorch.org/whl/cpu" }
|
||||
wheels = [
|
||||
{ url = "https://download.pytorch.org/whl/filelock-3.13.1-py3-none-any.whl", hash = "sha256:57dbda9b35157b05fb3e58ee91448612eb674172fab98ee235ccb0b5bee19a1c" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "fsspec"
|
||||
version = "2024.6.1"
|
||||
source = { registry = "https://download.pytorch.org/whl/cpu" }
|
||||
wheels = [
|
||||
{ url = "https://download.pytorch.org/whl/fsspec-2024.6.1-py3-none-any.whl", hash = "sha256:3cb443f8bcd2efb31295a5b9fdb02aee81d8452c80d28f97a6d0959e6cee101e" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "gguf"
|
||||
version = "0.17.1"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
dependencies = [
|
||||
{ name = "numpy" },
|
||||
{ name = "pyyaml" },
|
||||
{ name = "tqdm" },
|
||||
]
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/08/08/7de1ca4b71e7bf33b547f82bb22505e221b5fa42f67d635e200e0ad22ad6/gguf-0.17.1.tar.gz", hash = "sha256:36ad71aad900a3e75fc94ebe96ea6029f03a4e44be7627ef7ad3d03e8c7bcb53", size = 89338, upload-time = "2025-06-19T14:00:33.705Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/fc/31/6a93a887617ee7deeaa602ca3d02d1c12a6cb8a742a695de5d128f5fa46a/gguf-0.17.1-py3-none-any.whl", hash = "sha256:7bc5aa7eeb1931f7d39b48fdc5b38fda6b294b9dca75cf607ac69557840a3943", size = 96224, upload-time = "2025-06-19T14:00:32.88Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "iniconfig"
|
||||
version = "2.1.0"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/f2/97/ebf4da567aa6827c909642694d71c9fcf53e5b504f2d96afea02718862f3/iniconfig-2.1.0.tar.gz", hash = "sha256:3abbd2e30b36733fee78f9c7f7308f2d0050e88f0087fd25c2645f63c773e1c7", size = 4793, upload-time = "2025-03-19T20:09:59.721Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/2c/e1/e6716421ea10d38022b952c159d5161ca1193197fb744506875fbb87ea7b/iniconfig-2.1.0-py3-none-any.whl", hash = "sha256:9deba5723312380e77435581c6bf4935c94cbfab9b1ed33ef8d238ea168eb760", size = 6050, upload-time = "2025-03-19T20:10:01.071Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "jinja2"
|
||||
version = "3.1.4"
|
||||
source = { registry = "https://download.pytorch.org/whl/cpu" }
|
||||
dependencies = [
|
||||
{ name = "markupsafe" },
|
||||
]
|
||||
wheels = [
|
||||
{ url = "https://download.pytorch.org/whl/Jinja2-3.1.4-py3-none-any.whl", hash = "sha256:bc5dd2abb727a5319567b7a813e6a2e7318c39f4f487cfe6c89c6f9c7d25197d" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "llm-gguf-tools"
|
||||
version = "0.1.0"
|
||||
source = { editable = "." }
|
||||
dependencies = [
|
||||
{ name = "gguf" },
|
||||
{ name = "pydantic" },
|
||||
{ name = "safetensors" },
|
||||
{ name = "torch", version = "2.8.0", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform == 'darwin'" },
|
||||
{ name = "torch", version = "2.8.0+cpu", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform != 'darwin'" },
|
||||
]
|
||||
|
||||
[package.dev-dependencies]
|
||||
dev = [
|
||||
{ name = "pytest" },
|
||||
{ name = "ruff" },
|
||||
{ name = "uv" },
|
||||
]
|
||||
|
||||
[package.metadata]
|
||||
requires-dist = [
|
||||
{ name = "gguf", specifier = ">=0" },
|
||||
{ name = "pydantic", specifier = ">=2" },
|
||||
{ name = "safetensors", specifier = ">=0" },
|
||||
{ name = "torch", specifier = ">=2", index = "https://download.pytorch.org/whl/cpu" },
|
||||
]
|
||||
|
||||
[package.metadata.requires-dev]
|
||||
dev = [
|
||||
{ name = "pytest", specifier = ">=8" },
|
||||
{ name = "ruff", specifier = ">=0" },
|
||||
{ name = "uv", specifier = ">=0" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "markupsafe"
|
||||
version = "3.0.2"
|
||||
source = { registry = "https://download.pytorch.org/whl/cpu" }
|
||||
wheels = [
|
||||
{ url = "https://download.pytorch.org/whl/MarkupSafe-3.0.2-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:15ab75ef81add55874e7ab7055e9c397312385bd9ced94920f2802310c930396" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "mpmath"
|
||||
version = "1.3.0"
|
||||
source = { registry = "https://download.pytorch.org/whl/cpu" }
|
||||
wheels = [
|
||||
{ url = "https://download.pytorch.org/whl/mpmath-1.3.0-py3-none-any.whl", hash = "sha256:a0b2b9fe80bbcd81a6647ff13108738cfb482d481d826cc0e02f5b35e5c88d2c" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "networkx"
|
||||
version = "3.3"
|
||||
source = { registry = "https://download.pytorch.org/whl/cpu" }
|
||||
wheels = [
|
||||
{ url = "https://download.pytorch.org/whl/networkx-3.3-py3-none-any.whl", hash = "sha256:28575580c6ebdaf4505b22c6256a2b9de86b316dc63ba9e93abde3d78dfdbcf2" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "numpy"
|
||||
version = "2.1.2"
|
||||
source = { registry = "https://download.pytorch.org/whl/cpu" }
|
||||
wheels = [
|
||||
{ url = "https://download.pytorch.org/whl/numpy-2.1.2-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:a84498e0d0a1174f2b3ed769b67b656aa5460c92c9554039e11f20a05650f00d" },
|
||||
{ url = "https://download.pytorch.org/whl/numpy-2.1.2-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:4d6ec0d4222e8ffdab1744da2560f07856421b367928026fb540e1945f2eeeaf" },
|
||||
{ url = "https://download.pytorch.org/whl/numpy-2.1.2-cp313-cp313-macosx_14_0_arm64.whl", hash = "sha256:259ec80d54999cc34cd1eb8ded513cb053c3bf4829152a2e00de2371bd406f5e" },
|
||||
{ url = "https://download.pytorch.org/whl/numpy-2.1.2-cp313-cp313-macosx_14_0_x86_64.whl", hash = "sha256:675c741d4739af2dc20cd6c6a5c4b7355c728167845e3c6b0e824e4e5d36a6c3" },
|
||||
{ url = "https://download.pytorch.org/whl/numpy-2.1.2-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:05b2d4e667895cc55e3ff2b56077e4c8a5604361fc21a042845ea3ad67465aa8" },
|
||||
{ url = "https://download.pytorch.org/whl/numpy-2.1.2-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:43cca367bf94a14aca50b89e9bc2061683116cfe864e56740e083392f533ce7a" },
|
||||
{ url = "https://download.pytorch.org/whl/numpy-2.1.2-cp313-cp313-win_amd64.whl", hash = "sha256:f2ded8d9b6f68cc26f8425eda5d3877b47343e68ca23d0d0846f4d312ecaa445" },
|
||||
{ url = "https://download.pytorch.org/whl/numpy-2.1.2-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:2ffef621c14ebb0188a8633348504a35c13680d6da93ab5cb86f4e54b7e922b5" },
|
||||
{ url = "https://download.pytorch.org/whl/numpy-2.1.2-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:ad369ed238b1959dfbade9018a740fb9392c5ac4f9b5173f420bd4f37ba1f7a0" },
|
||||
{ url = "https://download.pytorch.org/whl/numpy-2.1.2-cp313-cp313t-macosx_14_0_arm64.whl", hash = "sha256:d82075752f40c0ddf57e6e02673a17f6cb0f8eb3f587f63ca1eaab5594da5b17" },
|
||||
{ url = "https://download.pytorch.org/whl/numpy-2.1.2-cp313-cp313t-macosx_14_0_x86_64.whl", hash = "sha256:1600068c262af1ca9580a527d43dc9d959b0b1d8e56f8a05d830eea39b7c8af6" },
|
||||
{ url = "https://download.pytorch.org/whl/numpy-2.1.2-cp313-cp313t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a26ae94658d3ba3781d5e103ac07a876b3e9b29db53f68ed7df432fd033358a8" },
|
||||
{ url = "https://download.pytorch.org/whl/numpy-2.1.2-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:13311c2db4c5f7609b462bc0f43d3c465424d25c626d95040f073e30f7570e35" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "packaging"
|
||||
version = "24.1"
|
||||
source = { registry = "https://download.pytorch.org/whl/cpu" }
|
||||
wheels = [
|
||||
{ url = "https://download.pytorch.org/whl/packaging-24.1-py3-none-any.whl", hash = "sha256:5b8f2217dbdbd2f7f384c41c628544e6d52f2d0f53c6d0c3ea61aa5d1d7ff124" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "pluggy"
|
||||
version = "1.6.0"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/f9/e2/3e91f31a7d2b083fe6ef3fa267035b518369d9511ffab804f839851d2779/pluggy-1.6.0.tar.gz", hash = "sha256:7dcc130b76258d33b90f61b658791dede3486c3e6bfb003ee5c9bfb396dd22f3", size = 69412, upload-time = "2025-05-15T12:30:07.975Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/54/20/4d324d65cc6d9205fabedc306948156824eb9f0ee1633355a8f7ec5c66bf/pluggy-1.6.0-py3-none-any.whl", hash = "sha256:e920276dd6813095e9377c0bc5566d94c932c33b27a3e3945d8389c374dd4746", size = 20538, upload-time = "2025-05-15T12:30:06.134Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "pydantic"
|
||||
version = "2.11.7"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
dependencies = [
|
||||
{ name = "annotated-types" },
|
||||
{ name = "pydantic-core" },
|
||||
{ name = "typing-extensions" },
|
||||
{ name = "typing-inspection" },
|
||||
]
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/00/dd/4325abf92c39ba8623b5af936ddb36ffcfe0beae70405d456ab1fb2f5b8c/pydantic-2.11.7.tar.gz", hash = "sha256:d989c3c6cb79469287b1569f7447a17848c998458d49ebe294e975b9baf0f0db", size = 788350, upload-time = "2025-06-14T08:33:17.137Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/6a/c0/ec2b1c8712ca690e5d61979dee872603e92b8a32f94cc1b72d53beab008a/pydantic-2.11.7-py3-none-any.whl", hash = "sha256:dde5df002701f6de26248661f6835bbe296a47bf73990135c7d07ce741b9623b", size = 444782, upload-time = "2025-06-14T08:33:14.905Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "pydantic-core"
|
||||
version = "2.33.2"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
dependencies = [
|
||||
{ name = "typing-extensions" },
|
||||
]
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/ad/88/5f2260bdfae97aabf98f1778d43f69574390ad787afb646292a638c923d4/pydantic_core-2.33.2.tar.gz", hash = "sha256:7cb8bc3605c29176e1b105350d2e6474142d7c1bd1d9327c4a9bdb46bf827acc", size = 435195, upload-time = "2025-04-23T18:33:52.104Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/46/8c/99040727b41f56616573a28771b1bfa08a3d3fe74d3d513f01251f79f172/pydantic_core-2.33.2-cp313-cp313-macosx_10_12_x86_64.whl", hash = "sha256:1082dd3e2d7109ad8b7da48e1d4710c8d06c253cbc4a27c1cff4fbcaa97a9e3f", size = 2015688, upload-time = "2025-04-23T18:31:53.175Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/3a/cc/5999d1eb705a6cefc31f0b4a90e9f7fc400539b1a1030529700cc1b51838/pydantic_core-2.33.2-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:f517ca031dfc037a9c07e748cefd8d96235088b83b4f4ba8939105d20fa1dcd6", size = 1844808, upload-time = "2025-04-23T18:31:54.79Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/6f/5e/a0a7b8885c98889a18b6e376f344da1ef323d270b44edf8174d6bce4d622/pydantic_core-2.33.2-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0a9f2c9dd19656823cb8250b0724ee9c60a82f3cdf68a080979d13092a3b0fef", size = 1885580, upload-time = "2025-04-23T18:31:57.393Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/3b/2a/953581f343c7d11a304581156618c3f592435523dd9d79865903272c256a/pydantic_core-2.33.2-cp313-cp313-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:2b0a451c263b01acebe51895bfb0e1cc842a5c666efe06cdf13846c7418caa9a", size = 1973859, upload-time = "2025-04-23T18:31:59.065Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/e6/55/f1a813904771c03a3f97f676c62cca0c0a4138654107c1b61f19c644868b/pydantic_core-2.33.2-cp313-cp313-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:1ea40a64d23faa25e62a70ad163571c0b342b8bf66d5fa612ac0dec4f069d916", size = 2120810, upload-time = "2025-04-23T18:32:00.78Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/aa/c3/053389835a996e18853ba107a63caae0b9deb4a276c6b472931ea9ae6e48/pydantic_core-2.33.2-cp313-cp313-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:0fb2d542b4d66f9470e8065c5469ec676978d625a8b7a363f07d9a501a9cb36a", size = 2676498, upload-time = "2025-04-23T18:32:02.418Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/eb/3c/f4abd740877a35abade05e437245b192f9d0ffb48bbbbd708df33d3cda37/pydantic_core-2.33.2-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9fdac5d6ffa1b5a83bca06ffe7583f5576555e6c8b3a91fbd25ea7780f825f7d", size = 2000611, upload-time = "2025-04-23T18:32:04.152Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/59/a7/63ef2fed1837d1121a894d0ce88439fe3e3b3e48c7543b2a4479eb99c2bd/pydantic_core-2.33.2-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:04a1a413977ab517154eebb2d326da71638271477d6ad87a769102f7c2488c56", size = 2107924, upload-time = "2025-04-23T18:32:06.129Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/04/8f/2551964ef045669801675f1cfc3b0d74147f4901c3ffa42be2ddb1f0efc4/pydantic_core-2.33.2-cp313-cp313-musllinux_1_1_aarch64.whl", hash = "sha256:c8e7af2f4e0194c22b5b37205bfb293d166a7344a5b0d0eaccebc376546d77d5", size = 2063196, upload-time = "2025-04-23T18:32:08.178Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/26/bd/d9602777e77fc6dbb0c7db9ad356e9a985825547dce5ad1d30ee04903918/pydantic_core-2.33.2-cp313-cp313-musllinux_1_1_armv7l.whl", hash = "sha256:5c92edd15cd58b3c2d34873597a1e20f13094f59cf88068adb18947df5455b4e", size = 2236389, upload-time = "2025-04-23T18:32:10.242Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/42/db/0e950daa7e2230423ab342ae918a794964b053bec24ba8af013fc7c94846/pydantic_core-2.33.2-cp313-cp313-musllinux_1_1_x86_64.whl", hash = "sha256:65132b7b4a1c0beded5e057324b7e16e10910c106d43675d9bd87d4f38dde162", size = 2239223, upload-time = "2025-04-23T18:32:12.382Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/58/4d/4f937099c545a8a17eb52cb67fe0447fd9a373b348ccfa9a87f141eeb00f/pydantic_core-2.33.2-cp313-cp313-win32.whl", hash = "sha256:52fb90784e0a242bb96ec53f42196a17278855b0f31ac7c3cc6f5c1ec4811849", size = 1900473, upload-time = "2025-04-23T18:32:14.034Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/a0/75/4a0a9bac998d78d889def5e4ef2b065acba8cae8c93696906c3a91f310ca/pydantic_core-2.33.2-cp313-cp313-win_amd64.whl", hash = "sha256:c083a3bdd5a93dfe480f1125926afcdbf2917ae714bdb80b36d34318b2bec5d9", size = 1955269, upload-time = "2025-04-23T18:32:15.783Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/f9/86/1beda0576969592f1497b4ce8e7bc8cbdf614c352426271b1b10d5f0aa64/pydantic_core-2.33.2-cp313-cp313-win_arm64.whl", hash = "sha256:e80b087132752f6b3d714f041ccf74403799d3b23a72722ea2e6ba2e892555b9", size = 1893921, upload-time = "2025-04-23T18:32:18.473Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/a4/7d/e09391c2eebeab681df2b74bfe6c43422fffede8dc74187b2b0bf6fd7571/pydantic_core-2.33.2-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:61c18fba8e5e9db3ab908620af374db0ac1baa69f0f32df4f61ae23f15e586ac", size = 1806162, upload-time = "2025-04-23T18:32:20.188Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/f1/3d/847b6b1fed9f8ed3bb95a9ad04fbd0b212e832d4f0f50ff4d9ee5a9f15cf/pydantic_core-2.33.2-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:95237e53bb015f67b63c91af7518a62a8660376a6a0db19b89acc77a4d6199f5", size = 1981560, upload-time = "2025-04-23T18:32:22.354Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/6f/9a/e73262f6c6656262b5fdd723ad90f518f579b7bc8622e43a942eec53c938/pydantic_core-2.33.2-cp313-cp313t-win_amd64.whl", hash = "sha256:c2fc0a768ef76c15ab9238afa6da7f69895bb5d1ee83aeea2e3509af4472d0b9", size = 1935777, upload-time = "2025-04-23T18:32:25.088Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "pygments"
|
||||
version = "2.19.2"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/b0/77/a5b8c569bf593b0140bde72ea885a803b82086995367bf2037de0159d924/pygments-2.19.2.tar.gz", hash = "sha256:636cb2477cec7f8952536970bc533bc43743542f70392ae026374600add5b887", size = 4968631, upload-time = "2025-06-21T13:39:12.283Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/c7/21/705964c7812476f378728bdf590ca4b771ec72385c533964653c68e86bdc/pygments-2.19.2-py3-none-any.whl", hash = "sha256:86540386c03d588bb81d44bc3928634ff26449851e99741617ecb9037ee5ec0b", size = 1225217, upload-time = "2025-06-21T13:39:07.939Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "pytest"
|
||||
version = "8.4.1"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
dependencies = [
|
||||
{ name = "colorama", marker = "sys_platform == 'win32'" },
|
||||
{ name = "iniconfig" },
|
||||
{ name = "packaging" },
|
||||
{ name = "pluggy" },
|
||||
{ name = "pygments" },
|
||||
]
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/08/ba/45911d754e8eba3d5a841a5ce61a65a685ff1798421ac054f85aa8747dfb/pytest-8.4.1.tar.gz", hash = "sha256:7c67fd69174877359ed9371ec3af8a3d2b04741818c51e5e99cc1742251fa93c", size = 1517714, upload-time = "2025-06-18T05:48:06.109Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/29/16/c8a903f4c4dffe7a12843191437d7cd8e32751d5de349d45d3fe69544e87/pytest-8.4.1-py3-none-any.whl", hash = "sha256:539c70ba6fcead8e78eebbf1115e8b589e7565830d7d006a8723f19ac8a0afb7", size = 365474, upload-time = "2025-06-18T05:48:03.955Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "pyyaml"
|
||||
version = "6.0.2"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/54/ed/79a089b6be93607fa5cdaedf301d7dfb23af5f25c398d5ead2525b063e17/pyyaml-6.0.2.tar.gz", hash = "sha256:d584d9ec91ad65861cc08d42e834324ef890a082e591037abe114850ff7bbc3e", size = 130631, upload-time = "2024-08-06T20:33:50.674Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/ef/e3/3af305b830494fa85d95f6d95ef7fa73f2ee1cc8ef5b495c7c3269fb835f/PyYAML-6.0.2-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:efdca5630322a10774e8e98e1af481aad470dd62c3170801852d752aa7a783ba", size = 181309, upload-time = "2024-08-06T20:32:43.4Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/45/9f/3b1c20a0b7a3200524eb0076cc027a970d320bd3a6592873c85c92a08731/PyYAML-6.0.2-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:50187695423ffe49e2deacb8cd10510bc361faac997de9efef88badc3bb9e2d1", size = 171679, upload-time = "2024-08-06T20:32:44.801Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/7c/9a/337322f27005c33bcb656c655fa78325b730324c78620e8328ae28b64d0c/PyYAML-6.0.2-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0ffe8360bab4910ef1b9e87fb812d8bc0a308b0d0eef8c8f44e0254ab3b07133", size = 733428, upload-time = "2024-08-06T20:32:46.432Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/a3/69/864fbe19e6c18ea3cc196cbe5d392175b4cf3d5d0ac1403ec3f2d237ebb5/PyYAML-6.0.2-cp313-cp313-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:17e311b6c678207928d649faa7cb0d7b4c26a0ba73d41e99c4fff6b6c3276484", size = 763361, upload-time = "2024-08-06T20:32:51.188Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/04/24/b7721e4845c2f162d26f50521b825fb061bc0a5afcf9a386840f23ea19fa/PyYAML-6.0.2-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:70b189594dbe54f75ab3a1acec5f1e3faa7e8cf2f1e08d9b561cb41b845f69d5", size = 759523, upload-time = "2024-08-06T20:32:53.019Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/2b/b2/e3234f59ba06559c6ff63c4e10baea10e5e7df868092bf9ab40e5b9c56b6/PyYAML-6.0.2-cp313-cp313-musllinux_1_1_aarch64.whl", hash = "sha256:41e4e3953a79407c794916fa277a82531dd93aad34e29c2a514c2c0c5fe971cc", size = 726660, upload-time = "2024-08-06T20:32:54.708Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/fe/0f/25911a9f080464c59fab9027482f822b86bf0608957a5fcc6eaac85aa515/PyYAML-6.0.2-cp313-cp313-musllinux_1_1_x86_64.whl", hash = "sha256:68ccc6023a3400877818152ad9a1033e3db8625d899c72eacb5a668902e4d652", size = 751597, upload-time = "2024-08-06T20:32:56.985Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/14/0d/e2c3b43bbce3cf6bd97c840b46088a3031085179e596d4929729d8d68270/PyYAML-6.0.2-cp313-cp313-win32.whl", hash = "sha256:bc2fa7c6b47d6bc618dd7fb02ef6fdedb1090ec036abab80d4681424b84c1183", size = 140527, upload-time = "2024-08-06T20:33:03.001Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/fa/de/02b54f42487e3d3c6efb3f89428677074ca7bf43aae402517bc7cca949f3/PyYAML-6.0.2-cp313-cp313-win_amd64.whl", hash = "sha256:8388ee1976c416731879ac16da0aff3f63b286ffdd57cdeb95f3f2e085687563", size = 156446, upload-time = "2024-08-06T20:33:04.33Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "ruff"
|
||||
version = "0.12.7"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/a1/81/0bd3594fa0f690466e41bd033bdcdf86cba8288345ac77ad4afbe5ec743a/ruff-0.12.7.tar.gz", hash = "sha256:1fc3193f238bc2d7968772c82831a4ff69252f673be371fb49663f0068b7ec71", size = 5197814, upload-time = "2025-07-29T22:32:35.877Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/e1/d2/6cb35e9c85e7a91e8d22ab32ae07ac39cc34a71f1009a6f9e4a2a019e602/ruff-0.12.7-py3-none-linux_armv6l.whl", hash = "sha256:76e4f31529899b8c434c3c1dede98c4483b89590e15fb49f2d46183801565303", size = 11852189, upload-time = "2025-07-29T22:31:41.281Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/63/5b/a4136b9921aa84638f1a6be7fb086f8cad0fde538ba76bda3682f2599a2f/ruff-0.12.7-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:789b7a03e72507c54fb3ba6209e4bb36517b90f1a3569ea17084e3fd295500fb", size = 12519389, upload-time = "2025-07-29T22:31:54.265Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/a8/c9/3e24a8472484269b6b1821794141f879c54645a111ded4b6f58f9ab0705f/ruff-0.12.7-py3-none-macosx_11_0_arm64.whl", hash = "sha256:2e1c2a3b8626339bb6369116e7030a4cf194ea48f49b64bb505732a7fce4f4e3", size = 11743384, upload-time = "2025-07-29T22:31:59.575Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/26/7c/458dd25deeb3452c43eaee853c0b17a1e84169f8021a26d500ead77964fd/ruff-0.12.7-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:32dec41817623d388e645612ec70d5757a6d9c035f3744a52c7b195a57e03860", size = 11943759, upload-time = "2025-07-29T22:32:01.95Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/7f/8b/658798472ef260ca050e400ab96ef7e85c366c39cf3dfbef4d0a46a528b6/ruff-0.12.7-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:47ef751f722053a5df5fa48d412dbb54d41ab9b17875c6840a58ec63ff0c247c", size = 11654028, upload-time = "2025-07-29T22:32:04.367Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/a8/86/9c2336f13b2a3326d06d39178fd3448dcc7025f82514d1b15816fe42bfe8/ruff-0.12.7-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a828a5fc25a3efd3e1ff7b241fd392686c9386f20e5ac90aa9234a5faa12c423", size = 13225209, upload-time = "2025-07-29T22:32:06.952Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/76/69/df73f65f53d6c463b19b6b312fd2391dc36425d926ec237a7ed028a90fc1/ruff-0.12.7-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:5726f59b171111fa6a69d82aef48f00b56598b03a22f0f4170664ff4d8298efb", size = 14182353, upload-time = "2025-07-29T22:32:10.053Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/58/1e/de6cda406d99fea84b66811c189b5ea139814b98125b052424b55d28a41c/ruff-0.12.7-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:74e6f5c04c4dd4aba223f4fe6e7104f79e0eebf7d307e4f9b18c18362124bccd", size = 13631555, upload-time = "2025-07-29T22:32:12.644Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/6f/ae/625d46d5164a6cc9261945a5e89df24457dc8262539ace3ac36c40f0b51e/ruff-0.12.7-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:5d0bfe4e77fba61bf2ccadf8cf005d6133e3ce08793bbe870dd1c734f2699a3e", size = 12667556, upload-time = "2025-07-29T22:32:15.312Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/55/bf/9cb1ea5e3066779e42ade8d0cd3d3b0582a5720a814ae1586f85014656b6/ruff-0.12.7-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:06bfb01e1623bf7f59ea749a841da56f8f653d641bfd046edee32ede7ff6c606", size = 12939784, upload-time = "2025-07-29T22:32:17.69Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/55/7f/7ead2663be5627c04be83754c4f3096603bf5e99ed856c7cd29618c691bd/ruff-0.12.7-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:e41df94a957d50083fd09b916d6e89e497246698c3f3d5c681c8b3e7b9bb4ac8", size = 11771356, upload-time = "2025-07-29T22:32:20.134Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/17/40/a95352ea16edf78cd3a938085dccc55df692a4d8ba1b3af7accbe2c806b0/ruff-0.12.7-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:4000623300563c709458d0ce170c3d0d788c23a058912f28bbadc6f905d67afa", size = 11612124, upload-time = "2025-07-29T22:32:22.645Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/4d/74/633b04871c669e23b8917877e812376827c06df866e1677f15abfadc95cb/ruff-0.12.7-py3-none-musllinux_1_2_i686.whl", hash = "sha256:69ffe0e5f9b2cf2b8e289a3f8945b402a1b19eff24ec389f45f23c42a3dd6fb5", size = 12479945, upload-time = "2025-07-29T22:32:24.765Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/be/34/c3ef2d7799c9778b835a76189c6f53c179d3bdebc8c65288c29032e03613/ruff-0.12.7-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:a07a5c8ffa2611a52732bdc67bf88e243abd84fe2d7f6daef3826b59abbfeda4", size = 12998677, upload-time = "2025-07-29T22:32:27.022Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/77/ab/aca2e756ad7b09b3d662a41773f3edcbd262872a4fc81f920dc1ffa44541/ruff-0.12.7-py3-none-win32.whl", hash = "sha256:c928f1b2ec59fb77dfdf70e0419408898b63998789cc98197e15f560b9e77f77", size = 11756687, upload-time = "2025-07-29T22:32:29.381Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/b4/71/26d45a5042bc71db22ddd8252ca9d01e9ca454f230e2996bb04f16d72799/ruff-0.12.7-py3-none-win_amd64.whl", hash = "sha256:9c18f3d707ee9edf89da76131956aba1270c6348bfee8f6c647de841eac7194f", size = 12912365, upload-time = "2025-07-29T22:32:31.517Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/4c/9b/0b8aa09817b63e78d94b4977f18b1fcaead3165a5ee49251c5d5c245bb2d/ruff-0.12.7-py3-none-win_arm64.whl", hash = "sha256:dfce05101dbd11833a0776716d5d1578641b7fddb537fe7fa956ab85d1769b69", size = 11982083, upload-time = "2025-07-29T22:32:33.881Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "safetensors"
|
||||
version = "0.6.1"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/6c/d2/94fe37355a1d4ff86b0f43b9a018515d5d29bf7ad6d01318a80f5db2fd6a/safetensors-0.6.1.tar.gz", hash = "sha256:a766ba6e19b198eff09be05f24cd89eda1670ed404ae828e2aa3fc09816ba8d8", size = 197968, upload-time = "2025-08-06T09:39:38.376Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/6b/c0/40263a2103511917f9a92b4e114ecaff68586df07f12d1d877312f1261f3/safetensors-0.6.1-cp38-abi3-macosx_10_12_x86_64.whl", hash = "sha256:81ed1b69d6f8acd7e759a71197ce3a69da4b7e9faa9dbb005eb06a83b1a4e52d", size = 455232, upload-time = "2025-08-06T09:39:32.037Z" },
{ url = "https://files.pythonhosted.org/packages/86/bf/432cb4bb1c336d338dd9b29f78622b1441ee06e5868bf1de2ca2bec74c08/safetensors-0.6.1-cp38-abi3-macosx_11_0_arm64.whl", hash = "sha256:01b51af8cb7a3870203f2735e3c7c24d1a65fb2846e75613c8cf9d284271eccc", size = 432150, upload-time = "2025-08-06T09:39:31.008Z" },
{ url = "https://files.pythonhosted.org/packages/05/d7/820c99032a53d57279ae199df7d114a8c9e2bbce4fa69bc0de53743495f0/safetensors-0.6.1-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:64a733886d79e726899b9d9643813e48a2eec49f3ef0fdb8cd4b8152046101c3", size = 471634, upload-time = "2025-08-06T09:39:22.17Z" },
{ url = "https://files.pythonhosted.org/packages/ea/8b/bcd960087eded7690f118ceeda294912f92a3b508a1d9a504f9c2e02041b/safetensors-0.6.1-cp38-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:f233dc3b12fb641b36724844754b6bb41349615a0e258087560968d6da92add5", size = 487855, upload-time = "2025-08-06T09:39:24.142Z" },
{ url = "https://files.pythonhosted.org/packages/41/64/b44eac4ad87c4e1c0cf5ba5e204c032b1b1eac8ce2b8f65f87791e647bd6/safetensors-0.6.1-cp38-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:6f16289e2af54affd591dd78ed12b5465e4dc5823f818beaeddd49a010cf3ba7", size = 607240, upload-time = "2025-08-06T09:39:25.463Z" },
{ url = "https://files.pythonhosted.org/packages/52/75/0347fa0c080af8bd3341af26a30b85939f6362d4f5240add1a0c9d793354/safetensors-0.6.1-cp38-abi3-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:1b62eab84e2c69918b598272504c5d2ebfe64da6c16fdf8682054eec9572534d", size = 519864, upload-time = "2025-08-06T09:39:26.872Z" },
{ url = "https://files.pythonhosted.org/packages/ea/f3/83843d1fe9164f44a267373c55cba706530b209b58415f807b40edddcd3e/safetensors-0.6.1-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d498363746555dccffc02a47dfe1dee70f7784f3f37f1d66b408366c5d3a989e", size = 485926, upload-time = "2025-08-06T09:39:29.109Z" },
{ url = "https://files.pythonhosted.org/packages/b8/26/f6b0cb5210bab0e343214fdba7c2df80a69b019e62e760ddc61b18bec383/safetensors-0.6.1-cp38-abi3-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:eed2079dca3ca948d7b0d7120396e776bbc6680637cf199d393e157fde25c937", size = 518999, upload-time = "2025-08-06T09:39:28.054Z" },
{ url = "https://files.pythonhosted.org/packages/90/b7/8910b165c97d3bd6d445c6ca8b704ec23d0fa33849ce9a51dc783827a302/safetensors-0.6.1-cp38-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:294040ff20ebe079a2b4976cfa9a5be0202f56ca4f7f190b4e52009e8c026ceb", size = 650669, upload-time = "2025-08-06T09:39:32.997Z" },
{ url = "https://files.pythonhosted.org/packages/00/bc/2eeb025381d0834ae038aae2d383dfa830c2e0068e2e4e512ea99b135a4b/safetensors-0.6.1-cp38-abi3-musllinux_1_2_armv7l.whl", hash = "sha256:75693208b492a026b926edeebbae888cc644433bee4993573ead2dc44810b519", size = 750019, upload-time = "2025-08-06T09:39:34.397Z" },
{ url = "https://files.pythonhosted.org/packages/f9/38/5dda9a8e056eb1f17ed3a7846698fd94623a1648013cdf522538845755da/safetensors-0.6.1-cp38-abi3-musllinux_1_2_i686.whl", hash = "sha256:a8687b71ac67a0b3f8ce87df9e8024edf087e94c34ef46eaaad694dce8d2f83f", size = 689888, upload-time = "2025-08-06T09:39:35.584Z" },
{ url = "https://files.pythonhosted.org/packages/dd/60/15ee3961996d951002378d041bd82863a5c70738a71375b42d6dd5d2a6d3/safetensors-0.6.1-cp38-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:5dd969a01c738104f707fa0e306b757f5beb3ebdcd682fe0724170a0bf1c21fb", size = 655539, upload-time = "2025-08-06T09:39:37.093Z" },
{ url = "https://files.pythonhosted.org/packages/91/d6/01172a9c77c566800286d379bfc341d75370eae2118dfd339edfd0394c4a/safetensors-0.6.1-cp38-abi3-win32.whl", hash = "sha256:7c3d8d34d01673d1a917445c9437ee73a9d48bc6af10352b84bbd46c5da93ca5", size = 308594, upload-time = "2025-08-06T09:39:40.916Z" },
{ url = "https://files.pythonhosted.org/packages/6c/5d/195dc1917d7ae93dd990d9b2f8b9c88e451bcc78e0b63ee107beebc1e4be/safetensors-0.6.1-cp38-abi3-win_amd64.whl", hash = "sha256:4720957052d57c5ac48912c3f6e07e9a334d9632758c9b0c054afba477fcbe2d", size = 320282, upload-time = "2025-08-06T09:39:39.54Z" },
]
[[package]]
name = "setuptools"
version = "70.2.0"
source = { registry = "https://download.pytorch.org/whl/cpu" }
wheels = [
{ url = "https://download.pytorch.org/whl/setuptools-70.2.0-py3-none-any.whl", hash = "sha256:b8b8060bb426838fbe942479c90296ce976249451118ef566a5a0b7d8b78fb05" },
]
[[package]]
name = "sympy"
version = "1.13.3"
source = { registry = "https://download.pytorch.org/whl/cpu" }
dependencies = [
{ name = "mpmath" },
]
wheels = [
{ url = "https://download.pytorch.org/whl/sympy-1.13.3-py3-none-any.whl" },
]
[[package]]
name = "torch"
version = "2.8.0"
source = { registry = "https://download.pytorch.org/whl/cpu" }
resolution-markers = [
"sys_platform == 'darwin'",
]
dependencies = [
{ name = "filelock", marker = "sys_platform == 'darwin'" },
{ name = "fsspec", marker = "sys_platform == 'darwin'" },
{ name = "jinja2", marker = "sys_platform == 'darwin'" },
{ name = "networkx", marker = "sys_platform == 'darwin'" },
{ name = "setuptools", marker = "sys_platform == 'darwin'" },
{ name = "sympy", marker = "sys_platform == 'darwin'" },
{ name = "typing-extensions", marker = "sys_platform == 'darwin'" },
]
wheels = [
{ url = "https://download.pytorch.org/whl/cpu/torch-2.8.0-cp313-cp313t-macosx_14_0_arm64.whl", hash = "sha256:fbe2e149c5174ef90d29a5f84a554dfaf28e003cb4f61fa2c8c024c17ec7ca58" },
{ url = "https://download.pytorch.org/whl/cpu/torch-2.8.0-cp313-none-macosx_11_0_arm64.whl", hash = "sha256:057efd30a6778d2ee5e2374cd63a63f63311aa6f33321e627c655df60abdd390" },
]
[[package]]
name = "torch"
version = "2.8.0+cpu"
source = { registry = "https://download.pytorch.org/whl/cpu" }
resolution-markers = [
"sys_platform != 'darwin'",
]
dependencies = [
{ name = "filelock", marker = "sys_platform != 'darwin'" },
{ name = "fsspec", marker = "sys_platform != 'darwin'" },
{ name = "jinja2", marker = "sys_platform != 'darwin'" },
{ name = "networkx", marker = "sys_platform != 'darwin'" },
{ name = "setuptools", marker = "sys_platform != 'darwin'" },
{ name = "sympy", marker = "sys_platform != 'darwin'" },
{ name = "typing-extensions", marker = "sys_platform != 'darwin'" },
]
wheels = [
{ url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313-linux_s390x.whl", hash = "sha256:8b5882276633cf91fe3d2d7246c743b94d44a7e660b27f1308007fdb1bb89f7d" },
{ url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:a5064b5e23772c8d164068cc7c12e01a75faf7b948ecd95a0d4007d7487e5f25" },
{ url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:8f81dedb4c6076ec325acc3b47525f9c550e5284a18eae1d9061c543f7b6e7de" },
{ url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313-win_amd64.whl", hash = "sha256:e1ee1b2346ade3ea90306dfbec7e8ff17bc220d344109d189ae09078333b0856" },
{ url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313-win_arm64.whl", hash = "sha256:64c187345509f2b1bb334feed4666e2c781ca381874bde589182f81247e61f88" },
{ url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313t-manylinux_2_28_aarch64.whl", hash = "sha256:af81283ac671f434b1b25c95ba295f270e72db1fad48831eb5e4748ff9840041" },
{ url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:a9dbb6f64f63258bc811e2c0c99640a81e5af93c531ad96e95c5ec777ea46dab" },
{ url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp313-cp313t-win_amd64.whl", hash = "sha256:6d93a7165419bc4b2b907e859ccab0dea5deeab261448ae9a5ec5431f14c0e64" },
]
[[package]]
name = "tqdm"
version = "4.66.5"
source = { registry = "https://download.pytorch.org/whl/cpu" }
dependencies = [
{ name = "colorama", marker = "sys_platform == 'win32'" },
]
wheels = [
{ url = "https://download.pytorch.org/whl/tqdm-4.66.5-py3-none-any.whl", hash = "sha256:90279a3770753eafc9194a0364852159802111925aa30eb3f9d85b0e805ac7cd" },
]
[[package]]
name = "typing-extensions"
version = "4.12.2"
source = { registry = "https://download.pytorch.org/whl/cpu" }
wheels = [
{ url = "https://download.pytorch.org/whl/typing_extensions-4.12.2-py3-none-any.whl", hash = "sha256:04e5ca0351e0f3f85c6853954072df659d0d13fac324d0072316b67d7794700d" },
]
[[package]]
name = "typing-inspection"
version = "0.4.1"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/f8/b1/0c11f5058406b3af7609f121aaa6b609744687f1d158b3c3a5bf4cc94238/typing_inspection-0.4.1.tar.gz", hash = "sha256:6ae134cc0203c33377d43188d4064e9b357dba58cff3185f22924610e70a9d28", size = 75726, upload-time = "2025-05-21T18:55:23.885Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/17/69/cd203477f944c353c31bade965f880aa1061fd6bf05ded0726ca845b6ff7/typing_inspection-0.4.1-py3-none-any.whl", hash = "sha256:389055682238f53b04f7badcb49b989835495a96700ced5dab2d8feae4b26f51", size = 14552, upload-time = "2025-05-21T18:55:22.152Z" },
]
[[package]]
name = "uv"
version = "0.8.5"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/83/94/e18a40fe6f6d724c1fbf2c9328806359e341710b2fd42dc928a1a8fc636b/uv-0.8.5.tar.gz", hash = "sha256:078cf2935062d5b61816505f9d6f30b0221943a1433b4a1de8f31a1dfe55736b", size = 3451272, upload-time = "2025-08-05T20:50:21.159Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/d9/b9/78cde56283b6b9a8a84b0bf9334442ed75a843310229aaf7f1a71fe67818/uv-0.8.5-py3-none-linux_armv6l.whl", hash = "sha256:e236372a260e312aef5485a0e5819a0ec16c9197af06d162ad5a3e8bd62f9bba", size = 18146198, upload-time = "2025-08-05T20:49:18.859Z" },
{ url = "https://files.pythonhosted.org/packages/ed/83/5deda1a19362ce426da7f9cc4764a0dd57e665ecbaddd9900d4200bc10ab/uv-0.8.5-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:53a40628329e543a5c5414553f5898131d5c1c6f963708cb0afc2ecf3e8d8167", size = 18242690, upload-time = "2025-08-05T20:49:23.409Z" },
{ url = "https://files.pythonhosted.org/packages/06/6e/80b08ee544728317d9c8003d4c10234007e12f384da1c3dfe579489833c9/uv-0.8.5-py3-none-macosx_11_0_arm64.whl", hash = "sha256:43a689027696bc9c62e6da3f06900c52eafc4debbf4fba9ecb906196730b34c8", size = 16913881, upload-time = "2025-08-05T20:49:26.631Z" },
{ url = "https://files.pythonhosted.org/packages/34/f6/47a44dabfc25b598ea6f2ab9aa32ebf1cbd87ed8af18ccde6c5d36f35476/uv-0.8.5-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.musllinux_1_1_aarch64.whl", hash = "sha256:a34d783f5cef00f1918357c0cd9226666e22640794e9e3862820abf4ee791141", size = 17527439, upload-time = "2025-08-05T20:49:30.464Z" },
{ url = "https://files.pythonhosted.org/packages/ef/7d/ee7c2514e064412133ee9f01c4c42de20da24617b8c25d81cf7021b774d8/uv-0.8.5-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:2140383bc25228281090cc34c00500d8e5822877c955f691d69bbf967e8efa73", size = 17833275, upload-time = "2025-08-05T20:49:33.783Z" },
{ url = "https://files.pythonhosted.org/packages/f9/e7/5233cf5cbcca8ea65aa1f1e48bf210dc9773fb86b8104ffbc523be7f6a3f/uv-0.8.5-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:6b449779ff463b059504dc30316a634f810149e02482ce36ea35daea8f6ce7af", size = 18568916, upload-time = "2025-08-05T20:49:37.031Z" },
{ url = "https://files.pythonhosted.org/packages/d8/54/6cabb2a0347c51c8366ca3bffeeebd7f829a15f6b29ad20f51fd5ca9c4bd/uv-0.8.5-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:a7f8739d05cc513eee2f1f8a7e6c482a9c1e8860d77cd078d1ea7c3fe36d7a65", size = 19993334, upload-time = "2025-08-05T20:49:40.361Z" },
{ url = "https://files.pythonhosted.org/packages/3c/7a/b84d994d52f20bc56229840c31e77aff4653e5902ea7b7c2616e9381b5b8/uv-0.8.5-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:62ebbd22f780ba2585690332765caf9e29c9758e48a678148e8b1ea90580cdb9", size = 19643358, upload-time = "2025-08-05T20:49:43.955Z" },
{ url = "https://files.pythonhosted.org/packages/c8/f1/7552f2bea528456d34bc245f2959ce910631e01571c4b7ea421ead9a9fc6/uv-0.8.5-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:4f8dd0555f05d66ff46fdab551137cc2b1ea9c5363358913e2af175e367f4398", size = 18947757, upload-time = "2025-08-05T20:49:47.381Z" },
{ url = "https://files.pythonhosted.org/packages/57/9b/46aadd186a1e16a23cd0701dda0e640197db49a3add074a47231fed45a4f/uv-0.8.5-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:38c04408ad5eae7a178a1e3b0e09afeb436d0c97075530a3c82de453b78d0448", size = 18906135, upload-time = "2025-08-05T20:49:50.985Z" },
{ url = "https://files.pythonhosted.org/packages/c0/31/6661adedaba9ebac8bb449ec9901f8cbf124fa25e0db3a9e6cf3053cee88/uv-0.8.5-py3-none-manylinux_2_28_aarch64.whl", hash = "sha256:73e772caf7310af4b21eaf8c25531b934391f1e84f3afa8e67822d7c432f6dad", size = 17787943, upload-time = "2025-08-05T20:49:54.59Z" },
{ url = "https://files.pythonhosted.org/packages/11/f2/73fb5c3156fdae830b83edec2f430db84cb4bc4b78f61d21694bd59004cb/uv-0.8.5-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:3ddd7d8c01073f23ba2a4929ab246adb30d4f8a55c5e007ad7c8341f7bf06978", size = 18675864, upload-time = "2025-08-05T20:49:57.87Z" },
{ url = "https://files.pythonhosted.org/packages/b5/29/774c6f174c53d68ae9a51c2fabf1b09003b93a53c24591a108be0dc338d7/uv-0.8.5-py3-none-musllinux_1_1_armv7l.whl", hash = "sha256:7d601f021cbc179320ea3a75cd1d91bd49af03d2a630c4d04ebd38ff6b87d419", size = 17808770, upload-time = "2025-08-05T20:50:01.566Z" },
{ url = "https://files.pythonhosted.org/packages/a9/b0/5d164ce84691f5018c5832e9e3371c0196631b1f1025474a179de1d6a70a/uv-0.8.5-py3-none-musllinux_1_1_i686.whl", hash = "sha256:6ee97b7299990026619c20e30e253972c6c0fb6fba4f5658144e62aa1c07785a", size = 18076516, upload-time = "2025-08-05T20:50:04.94Z" },
{ url = "https://files.pythonhosted.org/packages/d1/73/4d8baefb4f4b07df6a4db7bbd604cb361d4f5215b94d3f66553ea26edfd4/uv-0.8.5-py3-none-musllinux_1_1_x86_64.whl", hash = "sha256:09804055d6346febf0767767c04bdd2fab7d911535639f9c18de2ea744b2954c", size = 19031195, upload-time = "2025-08-05T20:50:08.211Z" },
{ url = "https://files.pythonhosted.org/packages/44/2a/3d074391df2c16c79fc6bf333e4bde75662e64dac465050a03391c75b289/uv-0.8.5-py3-none-win32.whl", hash = "sha256:6362a2e1fa535af0e4c0a01f83e666a4d5f9024d808f9e64e3b6ef07c97aff54", size = 18026273, upload-time = "2025-08-05T20:50:11.868Z" },
{ url = "https://files.pythonhosted.org/packages/3c/2f/e850d3e745ccd1125b7a48898421824700fd3e996d27d835139160650124/uv-0.8.5-py3-none-win_amd64.whl", hash = "sha256:dd89836735860461c3a5563731e77c011d1831f14ada540f94bf1a7011dbea14", size = 19822158, upload-time = "2025-08-05T20:50:15.428Z" },
{ url = "https://files.pythonhosted.org/packages/6f/df/e5565b3faf2c6147a877ab7e96ef31e2333f08c5138a98ce77003b1bf65e/uv-0.8.5-py3-none-win_arm64.whl", hash = "sha256:37c1a22915392014d8b4ade9e69e157c8e5ccdf32f37070a84f749a708268335", size = 18430102, upload-time = "2025-08-05T20:50:18.785Z" },
]