raytune as docker

2026-01-19 16:32:45 +01:00
parent d67cbd4677
commit 94b25f9752
20 changed files with 7214 additions and 112 deletions


@@ -1,74 +1,153 @@
# Running Notebooks in Background
## Quick: Check Ray Tune Progress
```bash
# Is papermill still running?
ps aux | grep papermill | grep -v grep
# View live log
tail -f papermill.log
# Find latest Ray Tune run and count completed trials
LATEST=$(ls -td ~/ray_results/trainable_* 2>/dev/null | head -1)
echo "Run: $LATEST"
COMPLETED=$(find "$LATEST" -name "result.json" -size +0 2>/dev/null | wc -l)
TOTAL=$(ls -d "$LATEST"/trainable_*/ 2>/dev/null | wc -l)
echo "Completed: $COMPLETED / $TOTAL"
# Check workers are healthy
for port in 8001 8002 8003; do
status=$(curl -s "localhost:$port/health" 2>/dev/null | python3 -c "import sys,json; print(json.load(sys.stdin).get('status','down'))" 2>/dev/null || echo "down")
echo "Worker $port: $status"
done
# Show best result so far
if [ "$COMPLETED" -gt 0 ]; then
find "$LATEST" -name "result.json" -size +0 -exec cat {} \; 2>/dev/null | \
python3 -c "import sys,json; results=[json.loads(l) for l in sys.stdin if l.strip()]; best=min(results,key=lambda x:x.get('CER',999)); print(f'Best CER: {best[\"CER\"]:.4f}, WER: {best[\"WER\"]:.4f}')" 2>/dev/null
fi
```
---
## Option 1: Papermill (Recommended)
Runs notebooks directly without conversion.
```bash
pip install papermill
nohup papermill <notebook>.ipynb output.ipynb > papermill.log 2>&1 &
```
Monitor:
```bash
tail -f papermill.log
```
## Option 2: Convert to Python Script
```bash
jupyter nbconvert --to script <notebook>.ipynb
nohup python <notebook>.py > output.log 2>&1 &
```
**Note:** `%pip install` magic commands must be removed manually before running the converted `.py` script.
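The manual cleanup can be scripted with `sed`. The sketch below assumes that `jupyter nbconvert --to script` has rewritten each magic or shell escape as a `get_ipython()` call at the start of its own line, which is its usual output, so check the converted file first. The function name `strip_ipython_magics` is made up for this example.

```bash
# Replace every line that begins with a get_ipython() call by `pass`, keeping
# the original call as a trailing comment. Leading indentation is preserved so
# that magics inside `if`/`for` blocks still leave valid Python behind.
# Lines that only use get_ipython() mid-expression are left untouched.
strip_ipython_magics() {
    sed -E 's/^([[:space:]]*)(get_ipython\(\).*)$/\1pass  # \2/' "$@"
}

# Usage: reads the named file (or stdin) and writes the cleaned script to stdout.
#   strip_ipython_magics <notebook>.py > <notebook>_clean.py
```

Writing to a new file rather than editing in place keeps the converted script available for comparison.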
## Important Notes
- Ray Tune notebooks require the OCR service to be running first (in Docker).
- For Ray workers, imports must be inside trainable functions, because Ray serializes each trainable and runs it in a separate worker process.
## Example: Ray Tune PaddleOCR
```bash
# 1. Start OCR service
cd src/paddle_ocr && docker compose up -d ocr-cpu
# 2. Run notebook with papermill
cd src
nohup papermill paddle_ocr_raytune_rest.ipynb output_raytune.ipynb > papermill.log 2>&1 &
# 3. Monitor
tail -f papermill.log
```
# OCR Hyperparameter Tuning with Ray Tune
This directory contains the Docker setup for running automated hyperparameter optimization on OCR services using Ray Tune with Optuna.
## Prerequisites
- Docker with NVIDIA GPU support (`nvidia-container-toolkit`)
- NVIDIA GPU with CUDA support
## Quick Start
```bash
cd src
# Start PaddleOCR service and run tuning (images pulled from registry)
docker compose -f docker-compose.tuning.paddle.yml up -d paddle-ocr-gpu
docker compose -f docker-compose.tuning.paddle.yml run raytune --service paddle --samples 64
```
## Available Services
| Service | Port | Compose File |
|---------|------|--------------|
| PaddleOCR | 8002 | `docker-compose.tuning.paddle.yml` |
| DocTR | 8003 | `docker-compose.tuning.doctr.yml` |
| EasyOCR | 8002 | `docker-compose.tuning.easyocr.yml` |
**Note:** PaddleOCR and EasyOCR both use port 8002. Run them separately.
## Usage Examples
### PaddleOCR Tuning
```bash
# Start service
docker compose -f docker-compose.tuning.paddle.yml up -d paddle-ocr-gpu
# Wait for the health check to pass; check with:
curl http://localhost:8002/health
# Run tuning (64 samples)
docker compose -f docker-compose.tuning.paddle.yml run raytune --service paddle --samples 64
# Stop service
docker compose -f docker-compose.tuning.paddle.yml down
```
### DocTR Tuning
```bash
docker compose -f docker-compose.tuning.doctr.yml up -d doctr-gpu
curl http://localhost:8003/health
docker compose -f docker-compose.tuning.doctr.yml run raytune --service doctr --samples 64
docker compose -f docker-compose.tuning.doctr.yml down
```
### EasyOCR Tuning
```bash
docker compose -f docker-compose.tuning.easyocr.yml up -d easyocr-gpu
curl http://localhost:8002/health
docker compose -f docker-compose.tuning.easyocr.yml run raytune --service easyocr --samples 64
docker compose -f docker-compose.tuning.easyocr.yml down
```
### Run Multiple Services (PaddleOCR + DocTR)
```bash
# Start both services
docker compose -f docker-compose.tuning.yml up -d paddle-ocr-gpu doctr-gpu
# Run tuning for each
docker compose -f docker-compose.tuning.yml run raytune --service paddle --samples 64
docker compose -f docker-compose.tuning.yml run raytune --service doctr --samples 64
# Stop all
docker compose -f docker-compose.tuning.yml down
```
## Command Line Options
```bash
docker compose -f <compose-file> run raytune --service <service> --samples <n>
```
| Option | Description | Default |
|--------|-------------|---------|
| `--service` | OCR service: `paddle`, `doctr`, `easyocr` | Required |
| `--samples` | Number of hyperparameter trials | 64 |
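As a rough illustration of how the options fit together, the helper below prints (without running) the command sequence for one service. The compose files, container names, and ports are taken from this README; the function name `tuning_plan` and the validation rules are assumptions for this sketch, not part of `run_tuning.py`.

```bash
# Print the start / health-check / tune / stop commands for one OCR service.
# Arguments: service name, then an optional sample count (defaults to 64).
tuning_plan() {
    local service="$1" samples="${2:-64}" file container port
    case "$service" in
        paddle)  file=docker-compose.tuning.paddle.yml;  container=paddle-ocr-gpu; port=8002 ;;
        doctr)   file=docker-compose.tuning.doctr.yml;   container=doctr-gpu;      port=8003 ;;
        easyocr) file=docker-compose.tuning.easyocr.yml; container=easyocr-gpu;    port=8002 ;;
        *) echo "unknown service: $service (expected paddle, doctr or easyocr)" >&2; return 1 ;;
    esac
    # Reject empty, non-numeric, or zero sample counts before they reach --samples.
    case "$samples" in
        ''|*[!0-9]*|0) echo "--samples must be a positive integer, got: $samples" >&2; return 1 ;;
    esac
    echo "docker compose -f $file up -d $container"
    echo "curl http://localhost:$port/health"
    echo "docker compose -f $file run raytune --service $service --samples $samples"
    echo "docker compose -f $file down"
}
```

For example, `tuning_plan doctr 16` prints the four DocTR commands with `--samples 16`, which can be reviewed before pasting them into a terminal.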
## Output
Results are saved to `src/results/` as CSV files:
- `raytune_paddle_results_<timestamp>.csv`
- `raytune_doctr_results_<timestamp>.csv`
- `raytune_easyocr_results_<timestamp>.csv`
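To pick up the newest file for a service, a small helper along these lines may be handy. It is only a sketch: the function name `latest_results_csv` is hypothetical, it picks the newest file by modification time rather than by parsing `<timestamp>` (whose format is not specified here), and it assumes file names without newlines.

```bash
# Print the path of the most recently modified results CSV for a service.
# Arguments: service name, then an optional results directory
# (defaults to src/results, relative to the repository root).
latest_results_csv() {
    local service="$1" dir="${2:-src/results}" latest
    # ls -t sorts newest first; the glob follows the
    # raytune_<service>_results_<timestamp>.csv naming pattern.
    latest=$(ls -t "$dir"/raytune_"$service"_results_*.csv 2>/dev/null | head -n 1)
    if [ -z "$latest" ]; then
        echo "no results found for '$service' in $dir" >&2
        return 1
    fi
    printf '%s\n' "$latest"
}
```

A typical use would be `head -n 5 "$(latest_results_csv paddle)"` to preview the first rows of the latest PaddleOCR run.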
## Directory Structure
```
src/
├── docker-compose.tuning.yml # All services (PaddleOCR + DocTR)
├── docker-compose.tuning.paddle.yml # PaddleOCR only
├── docker-compose.tuning.doctr.yml # DocTR only
├── docker-compose.tuning.easyocr.yml # EasyOCR only
├── raytune/
│ ├── Dockerfile
│ ├── requirements.txt
│ ├── raytune_ocr.py
│ └── run_tuning.py
├── dataset/ # Input images and ground truth
├── results/ # Output CSV files
└── debugset/ # Debug output
```
## Docker Images
All images are pre-built and pulled from the registry:
- `seryus.ddns.net/unir/raytune:latest` - Ray Tune tuning service
- `seryus.ddns.net/unir/paddle-ocr-gpu:latest` - PaddleOCR GPU
- `seryus.ddns.net/unir/doctr-gpu:latest` - DocTR GPU
- `seryus.ddns.net/unir/easyocr-gpu:latest` - EasyOCR GPU
### Build locally (development)
```bash
# Build raytune image locally
docker build -t seryus.ddns.net/unir/raytune:latest ./raytune
```
## Troubleshooting
### Service not ready
Wait for the health check to pass before running tuning:
```bash
# Check service health
curl http://localhost:8002/health
# Expected: {"status": "ok", "model_loaded": true, ...}
```
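Instead of re-running `curl` by hand, the wait can be automated with a polling loop. This is a minimal sketch: the function name `wait_for_health`, the 120-second default timeout, and the 5-second interval are assumptions, and readiness is judged by finding `"model_loaded": true` in the response body shown above.

```bash
# Poll a health endpoint until it reports a loaded model or the timeout expires.
# Arguments: URL, timeout in seconds (default 120), poll interval (default 5).
# The timeout counts time spent sleeping, so it is approximate.
wait_for_health() {
    local url="$1" timeout="${2:-120}" interval="${3:-5}" elapsed=0 body
    while [ "$elapsed" -lt "$timeout" ]; do
        # -f makes curl fail on HTTP errors; --max-time bounds each attempt.
        if body=$(curl -fsS --max-time 2 "$url" 2>/dev/null) &&
           printf '%s' "$body" | grep -Eq '"model_loaded"[[:space:]]*:[[:space:]]*true'; then
            echo "ready: $url"
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    echo "timed out after ${timeout}s waiting for $url" >&2
    return 1
}
```

Chaining it with `&&`, as in `wait_for_health http://localhost:8002/health && docker compose -f docker-compose.tuning.paddle.yml run raytune --service paddle --samples 64`, starts tuning only once the service responds as ready.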
### GPU not detected
Ensure `nvidia-container-toolkit` is installed:
```bash
nvidia-smi # Should show your GPU
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi  # CUDA 12.x image tags include an OS suffix
```
### Port already in use
Stop any running OCR services:
```bash
docker compose -f docker-compose.tuning.paddle.yml down
docker compose -f docker-compose.tuning.easyocr.yml down
```