
Scripts

This directory contains all benchmarking and utility scripts for the GPU LLM Benchmarking project.

  1. Available Scripts
  2. Running Locally
  3. Running on Vast.ai

Available Scripts

| Script | About |
| --- | --- |
| `llm_benchmark.py` | The main benchmarking script that tests LLM inference performance across different context window sizes using dual-scenario testing (short prompts vs half-context), comprehensive GPU monitoring, and statistical analysis across multiple runs. |
| `run_vast.ai_benchmark.py` | Remote benchmarking runner for executing LLM benchmarks on Vast.ai GPU instances. Handles the complete lifecycle, including instance provisioning, Ollama installation and configuration, benchmark execution, results retrieval, and resource cleanup. |

Running Locally

For testing on your own hardware:

# Install uv if missing
curl -LsSf https://astral.sh/uv/install.sh | sh

# Or update your existing copy
uv self update

# Run from the repository root so the .env one level above scripts/ is loaded
uv run scripts/llm_benchmark.py

Copy `.env.example` to `.env` locally and configure it as required. This script does not install Ollama; it expects Ollama to already be running, with your desired model pre-pulled.
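As an illustration, a local `.env` might look something like the sketch below. The variable names here are assumptions for the purpose of the example, not the project's actual keys; `.env.example` in the repository root is the authoritative reference.

```shell
# Hypothetical .env sketch for a local run. Variable names are illustrative
# assumptions; check .env.example (one directory above scripts/) for the real ones.
OLLAMA_HOST=http://localhost:11434   # Ollama's default listen address
MODEL_NAME=llama3.1:8b               # must already be pulled: ollama pull llama3.1:8b
NUM_RUNS=3                           # repeat count for the statistical analysis
```

You can confirm Ollama is up and the model is present with `ollama list` before starting a run.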

Running on Vast.ai

For testing across different GPU configurations:

# Install uv if missing
curl -LsSf https://astral.sh/uv/install.sh | sh

# Or update your existing copy
uv self update

# Run from the repository root so the .env one level above scripts/ is loaded
uv run scripts/run_vast.ai_benchmark.py

Copy `.env.example` to `.env` locally and configure it as required, including an API key from Vast.ai. The script automatically provisions the cheapest instance that meets your configured parameters, installs dependencies, executes the benchmarks, and retrieves the results before cleaning up the instance.
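For the Vast.ai runner, the same `.env` would additionally carry the API key and, presumably, the instance-selection parameters. Only the API key requirement is stated above; the filter variables in this sketch are illustrative assumptions, so again defer to `.env.example` for the real names.

```shell
# Hypothetical Vast.ai settings. VAST_API_KEY is required per the README;
# the selection filters below are illustrative assumptions only.
VAST_API_KEY=your-api-key-here    # generated in your Vast.ai account settings
GPU_NAME=RTX_4090                 # assumed filter for the instance search
MAX_PRICE=0.50                    # assumed $/hr ceiling when picking the cheapest offer
```

Keep the key out of version control; `.env` should stay untracked while `.env.example` documents the expected variables.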