raytune as docker
Some checks failed
build_docker / essential (pull_request) Successful in 1s
build_docker / build_cpu (pull_request) Successful in 4m14s
build_docker / build_easyocr (pull_request) Successful in 12m19s
build_docker / build_easyocr_gpu (pull_request) Successful in 14m2s
build_docker / build_doctr (pull_request) Successful in 12m24s
build_docker / build_doctr_gpu (pull_request) Successful in 13m10s
build_docker / build_raytune (pull_request) Successful in 1m50s
build_docker / build_gpu (pull_request) Has been cancelled

2026-01-19 16:32:45 +01:00
parent d67cbd4677
commit 94b25f9752
20 changed files with 7214 additions and 112 deletions


@@ -1,74 +1,153 @@
# Running Notebooks in Background
## Quick: Check Ray Tune Progress
```bash
# Is papermill still running?
ps aux | grep papermill | grep -v grep
# View live log
tail -f papermill.log
# Find latest Ray Tune run and count completed trials
LATEST=$(ls -td ~/ray_results/trainable_* 2>/dev/null | head -1)
echo "Run: $LATEST"
COMPLETED=$(find "$LATEST" -name "result.json" -size +0 2>/dev/null | wc -l)
TOTAL=$(ls -d "$LATEST"/trainable_*/ 2>/dev/null | wc -l)
echo "Completed: $COMPLETED / $TOTAL"
# Check workers are healthy
# Check workers are healthy
for port in 8001 8002 8003; do
  status=$(curl -s "localhost:$port/health" 2>/dev/null | python3 -c "import sys,json; print(json.load(sys.stdin).get('status','down'))" 2>/dev/null || echo "down")
  echo "Worker $port: $status"
done
# Show best result so far
if [ "$COMPLETED" -gt 0 ]; then
  find "$LATEST" -name "result.json" -size +0 -exec cat {} \; 2>/dev/null | \
    python3 -c "import sys,json; results=[json.loads(l) for l in sys.stdin if l.strip()]; best=min(results,key=lambda x:x.get('CER',999)); print(f'Best CER: {best[\"CER\"]:.4f}, WER: {best[\"WER\"]:.4f}')" 2>/dev/null
fi
```
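The same progress check can be scripted in Python. This is a sketch that mirrors the shell commands above, assuming the default `~/ray_results/trainable_*` layout and newline-delimited `result.json` files with `CER`/`WER` keys; the function name is illustrative.

```python
import glob
import json
import os

def tune_progress(results_root):
    """Summarize the latest Ray Tune run under results_root.

    Mirrors the shell checks above: one trial dir per trainable,
    a non-empty result.json marks a completed trial.
    """
    runs = sorted(glob.glob(os.path.join(results_root, "trainable_*")),
                  key=os.path.getmtime, reverse=True)
    if not runs:
        return None
    latest = runs[0]
    trial_dirs = [d for d in glob.glob(os.path.join(latest, "trainable_*"))
                  if os.path.isdir(d)]
    completed, results = 0, []
    for trial in trial_dirs:
        path = os.path.join(trial, "result.json")
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            completed += 1
            with open(path) as f:
                results.extend(json.loads(line) for line in f if line.strip())
    best = min(results, key=lambda r: r.get("CER", 999)) if results else None
    return {"run": latest, "completed": completed,
            "total": len(trial_dirs), "best": best}
```

Call it with e.g. `tune_progress(os.path.expanduser("~/ray_results"))`; it returns `None` when no run exists yet.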
---
## Option 1: Papermill (Recommended)
Runs notebooks directly without conversion.
```bash
pip install papermill
nohup papermill <notebook>.ipynb output.ipynb > papermill.log 2>&1 &
```
Monitor:
```bash
tail -f papermill.log
```
## Option 2: Convert to Python Script
```bash
jupyter nbconvert --to script <notebook>.ipynb
nohup python <notebook>.py > output.log 2>&1 &
```
**Note:** `%pip install` magic commands must be removed manually before running the notebook as a `.py` script.
## Important Notes
- Ray Tune notebooks require the OCR service running first (Docker)
- For Ray workers, imports must be inside trainable functions
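The second note can be sketched as a pattern. `make_trainable` below is a hypothetical, stdlib-only illustration: the imports live inside the function body so that when Ray pickles the trainable and ships it to a worker process, each worker re-resolves the modules locally on deserialization.

```python
def make_trainable(ports):
    def trainable(config):
        # Imports inside the function: resolved on the Ray worker,
        # not on the driver that defined the function.
        import os
        import random
        host = os.environ.get("OCR_HOST", "localhost")
        return f"http://{host}:{random.choice(ports)}"
    return trainable
```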
## Example: Ray Tune PaddleOCR
```bash
# 1. Start OCR service
cd src/paddle_ocr && docker compose up -d ocr-cpu
# 2. Run notebook with papermill
cd src
nohup papermill paddle_ocr_raytune_rest.ipynb output_raytune.ipynb > papermill.log 2>&1 &
# 3. Monitor
tail -f papermill.log
```
# OCR Hyperparameter Tuning with Ray Tune
This directory contains the Docker setup for running automated hyperparameter optimization on OCR services using Ray Tune with Optuna.
## Prerequisites
- Docker with NVIDIA GPU support (`nvidia-container-toolkit`)
- NVIDIA GPU with CUDA support
## Quick Start
```bash
cd src
# Start PaddleOCR service and run tuning (images pulled from registry)
docker compose -f docker-compose.tuning.paddle.yml up -d paddle-ocr-gpu
docker compose -f docker-compose.tuning.paddle.yml run raytune --service paddle --samples 64
```
## Available Services
| Service | Port | Compose File |
|---------|------|--------------|
| PaddleOCR | 8002 | `docker-compose.tuning.paddle.yml` |
| DocTR | 8003 | `docker-compose.tuning.doctr.yml` |
| EasyOCR | 8002 | `docker-compose.tuning.easyocr.yml` |
**Note:** PaddleOCR and EasyOCR both use port 8002. Run them separately.
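Since both stacks map to host port 8002, a quick stdlib-only check (hypothetical helper) tells you whether the port is already taken before you start the other service:

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        return s.connect_ex((host, port)) == 0
```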
## Usage Examples
### PaddleOCR Tuning
```bash
# Start service
docker compose -f docker-compose.tuning.paddle.yml up -d paddle-ocr-gpu
# Wait for the health check to pass:
curl http://localhost:8002/health
# Run tuning (64 samples)
docker compose -f docker-compose.tuning.paddle.yml run raytune --service paddle --samples 64
# Stop service
docker compose -f docker-compose.tuning.paddle.yml down
```
### DocTR Tuning
```bash
docker compose -f docker-compose.tuning.doctr.yml up -d doctr-gpu
curl http://localhost:8003/health
docker compose -f docker-compose.tuning.doctr.yml run raytune --service doctr --samples 64
docker compose -f docker-compose.tuning.doctr.yml down
```
### EasyOCR Tuning
```bash
docker compose -f docker-compose.tuning.easyocr.yml up -d easyocr-gpu
curl http://localhost:8002/health
docker compose -f docker-compose.tuning.easyocr.yml run raytune --service easyocr --samples 64
docker compose -f docker-compose.tuning.easyocr.yml down
```
### Run Multiple Services (PaddleOCR + DocTR)
```bash
# Start both services
docker compose -f docker-compose.tuning.yml up -d paddle-ocr-gpu doctr-gpu
# Run tuning for each
docker compose -f docker-compose.tuning.yml run raytune --service paddle --samples 64
docker compose -f docker-compose.tuning.yml run raytune --service doctr --samples 64
# Stop all
docker compose -f docker-compose.tuning.yml down
```
## Command Line Options
```bash
docker compose -f <compose-file> run raytune --service <service> --samples <n>
```
| Option | Description | Default |
|--------|-------------|---------|
| `--service` | OCR service: `paddle`, `doctr`, `easyocr` | Required |
| `--samples` | Number of hyperparameter trials | 64 |
## Output
Results are saved to `src/results/` as CSV files:
- `raytune_paddle_results_<timestamp>.csv`
- `raytune_doctr_results_<timestamp>.csv`
- `raytune_easyocr_results_<timestamp>.csv`
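To inspect a finished run without rerunning anything, the CSV can be loaded with pandas. This sketch assumes the `CER` metric column and the `config/` column prefixes these files use:

```python
import pandas as pd

def best_trial(csv_path):
    """Return the best row (lowest CER) and its config/* values from a results CSV."""
    df = pd.read_csv(csv_path)
    best = df.loc[df["CER"].idxmin()]
    config = {c[len("config/"):]: best[c]
              for c in df.columns if c.startswith("config/")}
    return best, config
```

Usage: `best, cfg = best_trial("results/raytune_paddle_results_<timestamp>.csv")`.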
## Directory Structure
```
src/
├── docker-compose.tuning.yml          # All services (PaddleOCR + DocTR)
├── docker-compose.tuning.paddle.yml   # PaddleOCR only
├── docker-compose.tuning.doctr.yml    # DocTR only
├── docker-compose.tuning.easyocr.yml  # EasyOCR only
├── raytune/
│   ├── Dockerfile
│   ├── requirements.txt
│   ├── raytune_ocr.py
│   └── run_tuning.py
├── dataset/                           # Input images and ground truth
├── results/                           # Output CSV files
└── debugset/                          # Debug output
```
## Docker Images
All images are pre-built and pulled from the registry:
- `seryus.ddns.net/unir/raytune:latest` - Ray Tune tuning service
- `seryus.ddns.net/unir/paddle-ocr-gpu:latest` - PaddleOCR GPU
- `seryus.ddns.net/unir/doctr-gpu:latest` - DocTR GPU
- `seryus.ddns.net/unir/easyocr-gpu:latest` - EasyOCR GPU
### Build locally (development)
```bash
# Build raytune image locally
docker build -t seryus.ddns.net/unir/raytune:latest ./raytune
```
## Troubleshooting
### Service not ready
Wait for the health check to pass before running tuning:
```bash
# Check service health
curl http://localhost:8002/health
# Expected: {"status": "ok", "model_loaded": true, ...}
```
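The same check can be scripted. In the sketch below (stdlib only, field names taken from the `/health` response shown above), `is_ready` encodes the readiness condition and `wait_for_health` polls until it passes or times out:

```python
import json
import time
import urllib.request

def is_ready(health):
    """Ready when the service reports ok and the model is loaded."""
    return health.get("status") == "ok" and health.get("model_loaded", False)

def wait_for_health(url, timeout=180, interval=5):
    """Poll <url>/health until is_ready() passes or timeout expires."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(f"{url}/health", timeout=10) as resp:
                health = json.load(resp)
            if is_ready(health):
                return health
        except OSError:
            pass  # service not reachable yet
        time.sleep(interval)
    raise TimeoutError(f"{url} not healthy after {timeout}s")
```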
### GPU not detected
Ensure `nvidia-container-toolkit` is installed:
```bash
nvidia-smi # Should show your GPU
docker run --rm --gpus all nvidia/cuda:12.4.1-base nvidia-smi
```
### Port already in use
Stop any running OCR services:
```bash
docker compose -f docker-compose.tuning.paddle.yml down
docker compose -f docker-compose.tuning.easyocr.yml down
```


@@ -0,0 +1,50 @@
# docker-compose.tuning.doctr.yml - Ray Tune with DocTR GPU
# Usage:
# docker compose -f docker-compose.tuning.doctr.yml up -d doctr-gpu
# docker compose -f docker-compose.tuning.doctr.yml run raytune --service doctr --samples 64
# docker compose -f docker-compose.tuning.doctr.yml down
services:
  raytune:
    image: seryus.ddns.net/unir/raytune:latest
    command: ["--service", "doctr", "--host", "doctr-gpu", "--port", "8000", "--samples", "64"]
    volumes:
      - ./results:/app/results:rw
    environment:
      - PYTHONUNBUFFERED=1
    depends_on:
      doctr-gpu:
        condition: service_healthy

  doctr-gpu:
    image: seryus.ddns.net/unir/doctr-gpu:latest
    container_name: doctr-gpu-tuning
    ports:
      - "8003:8000"
    volumes:
      - ./dataset:/app/dataset:ro
      - ./debugset:/app/debugset:rw
      - doctr-cache:/root/.cache/doctr
    environment:
      - PYTHONUNBUFFERED=1
      - CUDA_VISIBLE_DEVICES=0
      - DOCTR_DET_ARCH=db_resnet50
      - DOCTR_RECO_ARCH=crnn_vgg16_bn
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 180s

volumes:
  doctr-cache:
    name: doctr-model-cache


@@ -0,0 +1,51 @@
# docker-compose.tuning.easyocr.yml - Ray Tune with EasyOCR GPU
# Usage:
# docker compose -f docker-compose.tuning.easyocr.yml up -d easyocr-gpu
# docker compose -f docker-compose.tuning.easyocr.yml run raytune --service easyocr --samples 64
# docker compose -f docker-compose.tuning.easyocr.yml down
#
# Note: EasyOCR uses port 8002 (same as PaddleOCR). Cannot run simultaneously.
services:
  raytune:
    image: seryus.ddns.net/unir/raytune:latest
    command: ["--service", "easyocr", "--host", "easyocr-gpu", "--port", "8000", "--samples", "64"]
    volumes:
      - ./results:/app/results:rw
    environment:
      - PYTHONUNBUFFERED=1
    depends_on:
      easyocr-gpu:
        condition: service_healthy

  easyocr-gpu:
    image: seryus.ddns.net/unir/easyocr-gpu:latest
    container_name: easyocr-gpu-tuning
    ports:
      - "8002:8000"
    volumes:
      - ./dataset:/app/dataset:ro
      - ./debugset:/app/debugset:rw
      - easyocr-cache:/root/.EasyOCR
    environment:
      - PYTHONUNBUFFERED=1
      - CUDA_VISIBLE_DEVICES=0
      - EASYOCR_LANGUAGES=es,en
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 120s

volumes:
  easyocr-cache:
    name: easyocr-model-cache


@@ -0,0 +1,50 @@
# docker-compose.tuning.paddle.yml - Ray Tune with PaddleOCR GPU
# Usage:
# docker compose -f docker-compose.tuning.paddle.yml up -d paddle-ocr-gpu
# docker compose -f docker-compose.tuning.paddle.yml run raytune --service paddle --samples 64
# docker compose -f docker-compose.tuning.paddle.yml down
services:
  raytune:
    image: seryus.ddns.net/unir/raytune:latest
    command: ["--service", "paddle", "--host", "paddle-ocr-gpu", "--port", "8000", "--samples", "64"]
    volumes:
      - ./results:/app/results:rw
    environment:
      - PYTHONUNBUFFERED=1
    depends_on:
      paddle-ocr-gpu:
        condition: service_healthy

  paddle-ocr-gpu:
    image: seryus.ddns.net/unir/paddle-ocr-gpu:latest
    container_name: paddle-ocr-gpu-tuning
    ports:
      - "8002:8000"
    volumes:
      - ./dataset:/app/dataset:ro
      - ./debugset:/app/debugset:rw
      - paddlex-cache:/root/.paddlex
    environment:
      - PYTHONUNBUFFERED=1
      - CUDA_VISIBLE_DEVICES=0
      - PADDLE_DET_MODEL=PP-OCRv5_mobile_det
      - PADDLE_REC_MODEL=PP-OCRv5_mobile_rec
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s

volumes:
  paddlex-cache:
    name: paddlex-model-cache


@@ -0,0 +1,82 @@
# docker-compose.tuning.yml - Ray Tune with all OCR services (PaddleOCR + DocTR)
# Usage:
# docker compose -f docker-compose.tuning.yml up -d paddle-ocr-gpu doctr-gpu
# docker compose -f docker-compose.tuning.yml run raytune --service paddle --samples 64
# docker compose -f docker-compose.tuning.yml run raytune --service doctr --samples 64
# docker compose -f docker-compose.tuning.yml down
#
# Note: EasyOCR uses port 8002 (same as PaddleOCR). Use docker-compose.tuning.easyocr.yml separately.
services:
  raytune:
    image: seryus.ddns.net/unir/raytune:latest
    network_mode: host
    shm_size: '5gb'
    volumes:
      - ./results:/app/results:rw
    environment:
      - PYTHONUNBUFFERED=1

  paddle-ocr-gpu:
    image: seryus.ddns.net/unir/paddle-ocr-gpu:latest
    container_name: paddle-ocr-gpu-tuning
    ports:
      - "8002:8000"
    volumes:
      - ./dataset:/app/dataset:ro
      - ./debugset:/app/debugset:rw
      - paddlex-cache:/root/.paddlex
    environment:
      - PYTHONUNBUFFERED=1
      - CUDA_VISIBLE_DEVICES=0
      - PADDLE_DET_MODEL=PP-OCRv5_mobile_det
      - PADDLE_REC_MODEL=PP-OCRv5_mobile_rec
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s

  doctr-gpu:
    image: seryus.ddns.net/unir/doctr-gpu:latest
    container_name: doctr-gpu-tuning
    ports:
      - "8003:8000"
    volumes:
      - ./dataset:/app/dataset:ro
      - ./debugset:/app/debugset:rw
      - doctr-cache:/root/.cache/doctr
    environment:
      - PYTHONUNBUFFERED=1
      - CUDA_VISIBLE_DEVICES=0
      - DOCTR_DET_ARCH=db_resnet50
      - DOCTR_RECO_ARCH=crnn_vgg16_bn
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 180s

volumes:
  paddlex-cache:
    name: paddlex-model-cache
  doctr-cache:
    name: doctr-model-cache

src/raytune/Dockerfile

@@ -0,0 +1,18 @@
FROM python:3.12-slim
WORKDIR /app
# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application files
COPY raytune_ocr.py .
COPY run_tuning.py .
# Create results directory
RUN mkdir -p /app/results
ENV PYTHONUNBUFFERED=1
ENTRYPOINT ["python", "run_tuning.py"]

src/raytune/README.md

@@ -0,0 +1,131 @@
# Ray Tune OCR Hyperparameter Optimization
Docker-based hyperparameter tuning for OCR services using Ray Tune with Optuna search.
## Structure
```
raytune/
├── Dockerfile        # Python 3.12-slim with Ray Tune + Optuna
├── requirements.txt  # Dependencies
├── raytune_ocr.py    # Shared utilities and search spaces
├── run_tuning.py     # CLI entry point
└── README.md
```
## Quick Start
```bash
cd src
# Build the raytune image
docker compose -f docker-compose.tuning.paddle.yml build raytune
# Or pull from registry
docker pull seryus.ddns.net/unir/raytune:latest
```
## Usage
### PaddleOCR Tuning
```bash
# Start PaddleOCR service
docker compose -f docker-compose.tuning.paddle.yml up -d paddle-ocr-gpu
# Wait for health check, then run tuning
docker compose -f docker-compose.tuning.paddle.yml run raytune --service paddle --samples 64
# Stop when done
docker compose -f docker-compose.tuning.paddle.yml down
```
### DocTR Tuning
```bash
docker compose -f docker-compose.tuning.doctr.yml up -d doctr-gpu
docker compose -f docker-compose.tuning.doctr.yml run raytune --service doctr --samples 64
docker compose -f docker-compose.tuning.doctr.yml down
```
### EasyOCR Tuning
```bash
# Note: EasyOCR uses port 8002 (same as PaddleOCR). Cannot run simultaneously.
docker compose -f docker-compose.tuning.easyocr.yml up -d easyocr-gpu
docker compose -f docker-compose.tuning.easyocr.yml run raytune --service easyocr --samples 64
docker compose -f docker-compose.tuning.easyocr.yml down
```
## CLI Options
```
python run_tuning.py --service {paddle,doctr,easyocr} --samples N
```
| Option | Description | Default |
|------------|--------------------------------------|---------|
| --service | OCR service to tune (required) | - |
| --samples | Number of hyperparameter trials | 64 |
## Search Spaces
### PaddleOCR
- `use_doc_orientation_classify`: [True, False]
- `use_doc_unwarping`: [True, False]
- `textline_orientation`: [True, False]
- `text_det_thresh`: uniform(0.0, 0.7)
- `text_det_box_thresh`: uniform(0.0, 0.7)
- `text_rec_score_thresh`: uniform(0.0, 0.7)
### DocTR
- `assume_straight_pages`: [True, False]
- `straighten_pages`: [True, False]
- `preserve_aspect_ratio`: [True, False]
- `symmetric_pad`: [True, False]
- `disable_page_orientation`: [True, False]
- `disable_crop_orientation`: [True, False]
- `resolve_lines`: [True, False]
- `resolve_blocks`: [True, False]
- `paragraph_break`: uniform(0.01, 0.1)
### EasyOCR
- `text_threshold`: uniform(0.3, 0.9)
- `low_text`: uniform(0.2, 0.6)
- `link_threshold`: uniform(0.2, 0.6)
- `slope_ths`: uniform(0.0, 0.3)
- `ycenter_ths`: uniform(0.3, 1.0)
- `height_ths`: uniform(0.3, 1.0)
- `width_ths`: uniform(0.3, 1.0)
- `add_margin`: uniform(0.0, 0.3)
- `contrast_ths`: uniform(0.05, 0.3)
- `adjust_contrast`: uniform(0.3, 0.8)
- `decoder`: ["greedy", "beamsearch"]
- `beamWidth`: [3, 5, 7, 10]
- `min_size`: [5, 10, 15, 20]
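The `uniform(...)` and list entries above correspond to Ray Tune's `tune.uniform` and `tune.choice` sampling primitives. The plain-`random` sketch below (a hypothetical helper, no Ray required) shows what one sampled EasyOCR configuration looks like for an excerpt of the space:

```python
import random

def sample_easyocr_excerpt(rng=random):
    """Draw one config from an excerpt of the EasyOCR space above."""
    return {
        "text_threshold": rng.uniform(0.3, 0.9),   # tune.uniform(0.3, 0.9)
        "contrast_ths": rng.uniform(0.05, 0.3),    # tune.uniform(0.05, 0.3)
        "decoder": rng.choice(["greedy", "beamsearch"]),
        "beamWidth": rng.choice([3, 5, 7, 10]),
        "min_size": rng.choice([5, 10, 15, 20]),
    }
```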
## Output
Results are saved to `src/results/` as CSV files:
- `raytune_paddle_results_YYYYMMDD_HHMMSS.csv`
- `raytune_doctr_results_YYYYMMDD_HHMMSS.csv`
- `raytune_easyocr_results_YYYYMMDD_HHMMSS.csv`
Each row contains:
- Configuration parameters (prefixed with `config/`)
- Metrics: CER, WER, TIME, PAGES, TIME_PER_PAGE
- Worker URL used for the trial
## Network Mode
The raytune container uses `network_mode: host` to access OCR services on localhost ports:
- PaddleOCR: port 8002
- DocTR: port 8003
- EasyOCR: port 8002 (conflicts with PaddleOCR)
## Dependencies
- ray[tune]==2.52.1
- optuna==4.7.0
- requests>=2.28.0
- pandas>=2.0.0

src/raytune/raytune_ocr.py

@@ -0,0 +1,371 @@
# raytune_ocr.py
# Shared Ray Tune utilities for OCR hyperparameter optimization
#
# Usage:
# from raytune_ocr import check_workers, create_trainable, run_tuner, analyze_results
#
# Environment variables:
# OCR_HOST: Host for OCR services (default: localhost)
import os
from datetime import datetime
from typing import List, Dict, Any, Callable

import requests
import pandas as pd
import ray
from ray import tune
from ray.tune.search.optuna import OptunaSearch


def check_workers(
    ports: List[int],
    service_name: str = "OCR",
    timeout: int = 180,
    interval: int = 5,
) -> List[str]:
    """
    Wait for workers to be fully ready (model + dataset loaded) and return healthy URLs.

    Args:
        ports: List of port numbers to check
        service_name: Name for error messages
        timeout: Max seconds to wait for each worker
        interval: Seconds between retries

    Returns:
        List of healthy worker URLs

    Raises:
        RuntimeError if no healthy workers found after timeout
    """
    import time

    host = os.environ.get("OCR_HOST", "localhost")
    worker_urls = [f"http://{host}:{port}" for port in ports]
    healthy_workers = []
    for url in worker_urls:
        print(f"Waiting for {url}...")
        start = time.time()
        while time.time() - start < timeout:
            try:
                health = requests.get(f"{url}/health", timeout=10).json()
                model_ok = health.get('model_loaded', False)
                dataset_ok = health.get('dataset_loaded', False)
                if health.get('status') == 'ok' and model_ok:
                    gpu = health.get('gpu_name', 'CPU')
                    print(f"{url}: ready ({gpu})")
                    healthy_workers.append(url)
                    break
                elapsed = int(time.time() - start)
                print(f"  [{elapsed}s] model={model_ok} dataset={dataset_ok}")
            except requests.exceptions.RequestException:
                elapsed = int(time.time() - start)
                print(f"  [{elapsed}s] not reachable")
            time.sleep(interval)
        else:
            print(f"{url}: timeout after {timeout}s")
    if not healthy_workers:
        raise RuntimeError(
            f"No healthy {service_name} workers found.\n"
            f"Checked ports: {ports}"
        )
    print(f"\n{len(healthy_workers)}/{len(worker_urls)} workers ready\n")
    return healthy_workers


def create_trainable(ports: List[int], payload_fn: Callable[[Dict], Dict]) -> Callable:
    """
    Factory to create a trainable function for Ray Tune.

    Args:
        ports: List of worker ports for load balancing
        payload_fn: Function that takes config dict and returns API payload dict

    Returns:
        Trainable function for Ray Tune

    Note:
        Ray Tune 2.x API: tune.report(metrics_dict) - pass dict directly, NOT kwargs.
        See: https://docs.ray.io/en/latest/tune/api/doc/ray.tune.report.html
    """
    def trainable(config):
        import os
        import random
        import requests
        from ray.tune import report  # Ray 2.x: report(dict), not report(**kwargs)

        host = os.environ.get("OCR_HOST", "localhost")
        api_url = f"http://{host}:{random.choice(ports)}"
        payload = payload_fn(config)
        try:
            response = requests.post(f"{api_url}/evaluate", json=payload, timeout=None)
            response.raise_for_status()
            metrics = response.json()
            metrics["worker"] = api_url
            report(metrics)  # Ray 2.x API: pass dict directly
        except Exception as e:
            report({  # Ray 2.x API: pass dict directly
                "CER": 1.0,
                "WER": 1.0,
                "TIME": 0.0,
                "PAGES": 0,
                "TIME_PER_PAGE": 0,
                "worker": api_url,
                "ERROR": str(e)[:500]
            })
    return trainable


def run_tuner(
    trainable: Callable,
    search_space: Dict[str, Any],
    num_samples: int = 64,
    num_workers: int = 1,
    metric: str = "CER",
    mode: str = "min",
) -> tune.ResultGrid:
    """
    Initialize Ray and run hyperparameter tuning.

    Args:
        trainable: Trainable function from create_trainable()
        search_space: Dict of parameter names to tune.* search spaces
        num_samples: Number of trials to run
        num_workers: Max concurrent trials
        metric: Metric to optimize
        mode: "min" or "max"

    Returns:
        Ray Tune ResultGrid
    """
    ray.init(
        ignore_reinit_error=True,
        include_dashboard=False,
        configure_logging=False,
        _metrics_export_port=0,  # Disable metrics export to avoid connection warnings
    )
    print(f"Ray Tune ready (version: {ray.__version__})")
    tuner = tune.Tuner(
        trainable,
        tune_config=tune.TuneConfig(
            metric=metric,
            mode=mode,
            search_alg=OptunaSearch(),
            num_samples=num_samples,
            max_concurrent_trials=num_workers,
        ),
        param_space=search_space,
    )
    return tuner.fit()


def analyze_results(
    results: tune.ResultGrid,
    output_folder: str = "results",
    prefix: str = "raytune",
    config_keys: List[str] = None,
) -> pd.DataFrame:
    """
    Analyze and save tuning results.

    Args:
        results: Ray Tune ResultGrid
        output_folder: Directory to save CSV
        prefix: Filename prefix
        config_keys: List of config keys to show in best result (without 'config/' prefix)

    Returns:
        Results DataFrame
    """
    os.makedirs(output_folder, exist_ok=True)
    df = results.get_dataframe()

    # Save to CSV
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    filename = f"{prefix}_results_{timestamp}.csv"
    filepath = os.path.join(output_folder, filename)
    df.to_csv(filepath, index=False)
    print(f"Results saved: {filepath}")

    # Best configuration
    best = df.loc[df["CER"].idxmin()]
    print(f"\nBest CER: {best['CER']:.6f}")
    print(f"Best WER: {best['WER']:.6f}")
    if config_keys:
        print("\nOptimal Configuration:")
        for key in config_keys:
            col = f"config/{key}"
            if col in best:
                val = best[col]
                if isinstance(val, float):
                    print(f"  {key}: {val:.4f}")
                else:
                    print(f"  {key}: {val}")
    return df


def correlation_analysis(df: pd.DataFrame, param_keys: List[str]) -> None:
    """
    Print correlation of numeric parameters with CER/WER.

    Args:
        df: Results DataFrame
        param_keys: List of config keys (without 'config/' prefix)
    """
    param_cols = [f"config/{k}" for k in param_keys if f"config/{k}" in df.columns]
    numeric_cols = [c for c in param_cols if df[c].dtype in ['float64', 'int64']]
    if not numeric_cols:
        print("No numeric parameters for correlation analysis")
        return
    corr_cer = df[numeric_cols + ["CER"]].corr()["CER"].sort_values(ascending=False)
    corr_wer = df[numeric_cols + ["WER"]].corr()["WER"].sort_values(ascending=False)
    print("Correlation with CER:")
    print(corr_cer)
    print("\nCorrelation with WER:")
    print(corr_wer)


# =============================================================================
# OCR-specific payload functions
# =============================================================================

def paddle_ocr_payload(config: Dict) -> Dict:
    """Create payload for PaddleOCR API. Uses pages 5-10 (first doc) for tuning."""
    return {
        "pdf_folder": "/app/dataset",
        "use_doc_orientation_classify": config.get("use_doc_orientation_classify", False),
        "use_doc_unwarping": config.get("use_doc_unwarping", False),
        "textline_orientation": config.get("textline_orientation", True),
        "text_det_thresh": config.get("text_det_thresh", 0.0),
        "text_det_box_thresh": config.get("text_det_box_thresh", 0.0),
        "text_det_unclip_ratio": config.get("text_det_unclip_ratio", 1.5),
        "text_rec_score_thresh": config.get("text_rec_score_thresh", 0.0),
        "start_page": 5,
        "end_page": 10,
        "save_output": False,
    }


def doctr_payload(config: Dict) -> Dict:
    """Create payload for DocTR API. Uses pages 5-10 (first doc) for tuning."""
    return {
        "pdf_folder": "/app/dataset",
        "assume_straight_pages": config.get("assume_straight_pages", True),
        "straighten_pages": config.get("straighten_pages", False),
        "preserve_aspect_ratio": config.get("preserve_aspect_ratio", True),
        "symmetric_pad": config.get("symmetric_pad", True),
        "disable_page_orientation": config.get("disable_page_orientation", False),
        "disable_crop_orientation": config.get("disable_crop_orientation", False),
        "resolve_lines": config.get("resolve_lines", True),
        "resolve_blocks": config.get("resolve_blocks", False),
        "paragraph_break": config.get("paragraph_break", 0.035),
        "start_page": 5,
        "end_page": 10,
        "save_output": False,
    }


def easyocr_payload(config: Dict) -> Dict:
    """Create payload for EasyOCR API. Uses pages 5-10 (first doc) for tuning."""
    return {
        "pdf_folder": "/app/dataset",
        "text_threshold": config.get("text_threshold", 0.7),
        "low_text": config.get("low_text", 0.4),
        "link_threshold": config.get("link_threshold", 0.4),
        "slope_ths": config.get("slope_ths", 0.1),
        "ycenter_ths": config.get("ycenter_ths", 0.5),
        "height_ths": config.get("height_ths", 0.5),
        "width_ths": config.get("width_ths", 0.5),
        "add_margin": config.get("add_margin", 0.1),
        "contrast_ths": config.get("contrast_ths", 0.1),
        "adjust_contrast": config.get("adjust_contrast", 0.5),
        "decoder": config.get("decoder", "greedy"),
        "beamWidth": config.get("beamWidth", 5),
        "min_size": config.get("min_size", 10),
        "start_page": 5,
        "end_page": 10,
        "save_output": False,
    }


# =============================================================================
# Search spaces
# =============================================================================

PADDLE_OCR_SEARCH_SPACE = {
    "use_doc_orientation_classify": tune.choice([True, False]),
    "use_doc_unwarping": tune.choice([True, False]),
    "textline_orientation": tune.choice([True, False]),
    "text_det_thresh": tune.uniform(0.0, 0.7),
    "text_det_box_thresh": tune.uniform(0.0, 0.7),
    "text_det_unclip_ratio": tune.choice([0.0]),
    "text_rec_score_thresh": tune.uniform(0.0, 0.7),
}

DOCTR_SEARCH_SPACE = {
    "assume_straight_pages": tune.choice([True, False]),
    "straighten_pages": tune.choice([True, False]),
    "preserve_aspect_ratio": tune.choice([True, False]),
    "symmetric_pad": tune.choice([True, False]),
    "disable_page_orientation": tune.choice([True, False]),
    "disable_crop_orientation": tune.choice([True, False]),
    "resolve_lines": tune.choice([True, False]),
    "resolve_blocks": tune.choice([True, False]),
    "paragraph_break": tune.uniform(0.01, 0.1),
}

EASYOCR_SEARCH_SPACE = {
    "text_threshold": tune.uniform(0.3, 0.9),
    "low_text": tune.uniform(0.2, 0.6),
    "link_threshold": tune.uniform(0.2, 0.6),
    "slope_ths": tune.uniform(0.0, 0.3),
    "ycenter_ths": tune.uniform(0.3, 1.0),
    "height_ths": tune.uniform(0.3, 1.0),
    "width_ths": tune.uniform(0.3, 1.0),
    "add_margin": tune.uniform(0.0, 0.3),
    "contrast_ths": tune.uniform(0.05, 0.3),
    "adjust_contrast": tune.uniform(0.3, 0.8),
    "decoder": tune.choice(["greedy", "beamsearch"]),
    "beamWidth": tune.choice([3, 5, 7, 10]),
    "min_size": tune.choice([5, 10, 15, 20]),
}

# =============================================================================
# Config keys for results display
# =============================================================================

PADDLE_OCR_CONFIG_KEYS = [
    "use_doc_orientation_classify", "use_doc_unwarping", "textline_orientation",
    "text_det_thresh", "text_det_box_thresh", "text_det_unclip_ratio", "text_rec_score_thresh",
]

DOCTR_CONFIG_KEYS = [
    "assume_straight_pages", "straighten_pages", "preserve_aspect_ratio", "symmetric_pad",
    "disable_page_orientation", "disable_crop_orientation", "resolve_lines", "resolve_blocks",
    "paragraph_break",
]

EASYOCR_CONFIG_KEYS = [
    "text_threshold", "low_text", "link_threshold", "slope_ths", "ycenter_ths",
    "height_ths", "width_ths", "add_margin", "contrast_ths", "adjust_contrast",
    "decoder", "beamWidth", "min_size",
]


@@ -0,0 +1,4 @@
ray[tune]==2.52.1
optuna==4.7.0
requests>=2.28.0
pandas>=2.0.0

src/raytune/run_tuning.py

@@ -0,0 +1,80 @@
#!/usr/bin/env python3
"""Run hyperparameter tuning for OCR services."""
import os
import sys
import argparse

from raytune_ocr import (
    check_workers, create_trainable, run_tuner, analyze_results,
    paddle_ocr_payload, doctr_payload, easyocr_payload,
    PADDLE_OCR_SEARCH_SPACE, DOCTR_SEARCH_SPACE, EASYOCR_SEARCH_SPACE,
    PADDLE_OCR_CONFIG_KEYS, DOCTR_CONFIG_KEYS, EASYOCR_CONFIG_KEYS,
)

SERVICES = {
    "paddle": {
        "payload_fn": paddle_ocr_payload,
        "search_space": PADDLE_OCR_SEARCH_SPACE,
        "config_keys": PADDLE_OCR_CONFIG_KEYS,
        "name": "PaddleOCR",
    },
    "doctr": {
        "payload_fn": doctr_payload,
        "search_space": DOCTR_SEARCH_SPACE,
        "config_keys": DOCTR_CONFIG_KEYS,
        "name": "DocTR",
    },
    "easyocr": {
        "payload_fn": easyocr_payload,
        "search_space": EASYOCR_SEARCH_SPACE,
        "config_keys": EASYOCR_CONFIG_KEYS,
        "name": "EasyOCR",
    },
}


def main():
    parser = argparse.ArgumentParser(description="Run OCR hyperparameter tuning")
    parser.add_argument("--service", choices=["paddle", "doctr", "easyocr"], required=True)
    parser.add_argument("--host", type=str, default="localhost", help="OCR service host")
    parser.add_argument("--port", type=int, default=8000, help="OCR service port")
    parser.add_argument("--samples", type=int, default=64, help="Number of samples")
    args = parser.parse_args()

    # Set environment variable for raytune_ocr module
    os.environ["OCR_HOST"] = args.host

    cfg = SERVICES[args.service]
    ports = [args.port]

    print(f"\n{'='*50}")
    print(f"Hyperparameter Tuning: {cfg['name']}")
    print(f"Host: {args.host}:{args.port}")
    print(f"Samples: {args.samples}")
    print(f"{'='*50}\n")

    # Check workers
    healthy = check_workers(ports, cfg["name"])

    # Create trainable and run tuning
    trainable = create_trainable(ports, cfg["payload_fn"])
    results = run_tuner(
        trainable=trainable,
        search_space=cfg["search_space"],
        num_samples=args.samples,
        num_workers=len(healthy),
    )

    # Analyze results
    df = analyze_results(
        results,
        output_folder="results",
        prefix=f"raytune_{args.service}",
        config_keys=cfg["config_keys"],
    )

    print(f"\n{'='*50}")
    print("Tuning complete!")
    print(f"{'='*50}")


if __name__ == "__main__":
    main()