raytune as docker
Some checks failed
build_docker / essential (pull_request) Successful in 1s
build_docker / build_cpu (pull_request) Successful in 4m14s
build_docker / build_easyocr (pull_request) Successful in 12m19s
build_docker / build_easyocr_gpu (pull_request) Successful in 14m2s
build_docker / build_doctr (pull_request) Successful in 12m24s
build_docker / build_doctr_gpu (pull_request) Successful in 13m10s
build_docker / build_raytune (pull_request) Successful in 1m50s
build_docker / build_gpu (pull_request) Has been cancelled
227
src/README.md
@@ -1,74 +1,153 @@
# Running Notebooks in Background

## Quick: Check Ray Tune Progress

```bash
# Is papermill still running?
ps aux | grep papermill | grep -v grep

# View live log
tail -f papermill.log

# Find latest Ray Tune run and count completed trials
LATEST=$(ls -td ~/ray_results/trainable_* 2>/dev/null | head -1)
echo "Run: $LATEST"
COMPLETED=$(find "$LATEST" -name "result.json" -size +0 2>/dev/null | wc -l)
TOTAL=$(ls -d "$LATEST"/trainable_*/ 2>/dev/null | wc -l)
echo "Completed: $COMPLETED / $TOTAL"

# Check workers are healthy
for port in 8001 8002 8003; do
  status=$(curl -s "localhost:$port/health" 2>/dev/null | python3 -c "import sys,json; print(json.load(sys.stdin).get('status','down'))" 2>/dev/null || echo "down")
  echo "Worker $port: $status"
done

# Show best result so far
if [ "$COMPLETED" -gt 0 ]; then
  find "$LATEST" -name "result.json" -size +0 -exec cat {} \; 2>/dev/null | \
    python3 -c "import sys,json; results=[json.loads(l) for l in sys.stdin if l.strip()]; best=min(results,key=lambda x:x.get('CER',999)); print(f'Best CER: {best[\"CER\"]:.4f}, WER: {best[\"WER\"]:.4f}')" 2>/dev/null
fi
```

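
The counting and best-result steps of the shell snippet above can also be done from Python, which avoids the nested quoting. This is a minimal sketch, assuming the same layout as the snippet (trial directories named `trainable_*`, each with a line-delimited `result.json` that carries `CER` and `WER`); the function name `summarize_run` is hypothetical.

```python
import json
from pathlib import Path


def summarize_run(run_dir):
    """Count trials and find the lowest-CER record under one Ray Tune run directory."""
    run_dir = Path(run_dir)
    # Trial directories are assumed to be named trainable_*, as in the shell snippet.
    total = sum(1 for path in run_dir.glob("trainable_*") if path.is_dir())
    completed = 0
    best = None
    for result_file in run_dir.rglob("result.json"):
        if result_file.stat().st_size == 0:
            continue  # the trial exists but has not reported a result yet
        completed += 1
        for line in result_file.read_text().splitlines():
            if not line.strip():
                continue
            record = json.loads(line)
            # Records without a CER field are skipped instead of given a sentinel value.
            if "CER" in record and (best is None or record["CER"] < best["CER"]):
                best = record
    return {"completed": completed, "total": total, "best": best}
```

Calling `summarize_run` on the newest directory under `~/ray_results` returns the two counts plus the best record, or `None` for `best` when no trial has reported a CER yet.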
---

## Option 1: Papermill (Recommended)

Runs notebooks directly without conversion.

```bash
pip install papermill
nohup papermill <notebook>.ipynb output.ipynb > papermill.log 2>&1 &
```

Monitor:

```bash
tail -f papermill.log
```

## Option 2: Convert to Python Script

```bash
jupyter nbconvert --to script <notebook>.ipynb
nohup python <notebook>.py > output.log 2>&1 &
```

**Note:** `%pip install` magic commands must be removed manually before running the converted `.py` file.

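
The manual cleanup in the note above can be scripted. `jupyter nbconvert --to script` rewrites a line such as `%pip install ray` into a `get_ipython().run_line_magic('pip', 'install ray')` call, which fails under plain `python` because `get_ipython` is not defined. The sketch below is one possible approach, not part of the repository: the helper name `comment_out_ipython_lines` is hypothetical, and it neutralizes every line that starts with a `get_ipython()` call, not only the pip ones.

```python
import re

# Lines that only work inside IPython: converted magics and shell escapes
# (get_ipython().run_line_magic(...), get_ipython().system(...)) and any raw
# %pip / !pip lines that survived conversion.
_IPYTHON_LINE = re.compile(r"^\s*(get_ipython\(\)\.|%pip\b|!pip\b)")


def comment_out_ipython_lines(source):
    """Return the script text with IPython-only lines replaced by 'pass'."""
    cleaned = []
    for line in source.splitlines():
        if _IPYTHON_LINE.match(line):
            indent = line[: len(line) - len(line.lstrip())]
            # 'pass' keeps an indented block valid if the magic was its only statement.
            cleaned.append(f"{indent}pass  # removed: {line.strip()}")
        else:
            cleaned.append(line)
    return "\n".join(cleaned) + "\n"
```

Reading the converted file, passing its text through this function, and writing it back leaves a script that plain `python` can parse; the removed commands stay visible as comments so the packages can be installed separately.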
## Important Notes

- Ray Tune notebooks require the OCR service running first (Docker)
- For Ray workers, imports must be inside trainable functions

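
To illustrate the second note, here is a minimal sketch of a trainable whose imports sit inside the function body, so they are resolved in whichever process runs the trial. Everything specific in it is an assumption for illustration: the config keys `ocr_url`, `threshold`, and `timeout`, the request payload, and the response carrying `CER` and `WER` do not come from this repository.

```python
def trainable(config):
    """Hypothetical trainable: every import it needs happens inside the function."""
    # Imports live here so they run in the worker process that executes the
    # trial, not only in the notebook process that defined the function.
    import json
    import urllib.request

    # 'ocr_url' and the payload fields are illustrative assumptions.
    payload = json.dumps({"threshold": config["threshold"]}).encode("utf-8")
    request = urllib.request.Request(
        config["ocr_url"],
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=config.get("timeout", 60)) as response:
        metrics = json.load(response)
    # CER and WER are the metric names used by the progress check in this README.
    return {"CER": metrics["CER"], "WER": metrics["WER"]}
```

The function depends only on its `config` argument and on modules it imports itself, which is the property the note asks for.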
## Example: Ray Tune PaddleOCR

```bash
# 1. Start OCR service
cd src/paddle_ocr && docker compose up -d ocr-cpu

# 2. Run notebook with papermill
cd src
nohup papermill paddle_ocr_raytune_rest.ipynb output_raytune.ipynb > papermill.log 2>&1 &

# 3. Monitor
tail -f papermill.log
```

# OCR Hyperparameter Tuning with Ray Tune

This directory contains the Docker setup for running automated hyperparameter optimization on OCR services using Ray Tune with Optuna.

## Prerequisites

- Docker with NVIDIA GPU support (`nvidia-container-toolkit`)
- NVIDIA GPU with CUDA support

## Quick Start

```bash
cd src

# Start PaddleOCR service and run tuning (images pulled from registry)
docker compose -f docker-compose.tuning.paddle.yml up -d paddle-ocr-gpu
docker compose -f docker-compose.tuning.paddle.yml run raytune --service paddle --samples 64
```

## Available Services

| Service | Port | Compose File |
|---------|------|--------------|
| PaddleOCR | 8002 | `docker-compose.tuning.paddle.yml` |
| DocTR | 8003 | `docker-compose.tuning.doctr.yml` |
| EasyOCR | 8002 | `docker-compose.tuning.easyocr.yml` |

**Note:** PaddleOCR and EasyOCR both use port 8002. Run them separately.

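
The port clash in the note above can be checked mechanically before starting containers. The sketch below copies the ports and compose files from the table and keys them by the `--service` values; the dictionary and the `port_conflicts` helper are hypothetical and not part of the repository.

```python
from itertools import combinations

# Ports and compose files copied from the table above.
SERVICES = {
    "paddle": {"port": 8002, "compose_file": "docker-compose.tuning.paddle.yml"},
    "doctr": {"port": 8003, "compose_file": "docker-compose.tuning.doctr.yml"},
    "easyocr": {"port": 8002, "compose_file": "docker-compose.tuning.easyocr.yml"},
}


def port_conflicts(names):
    """Return pairs of requested services that would bind the same host port."""
    return [
        (first, second)
        for first, second in combinations(sorted(names), 2)
        if SERVICES[first]["port"] == SERVICES[second]["port"]
    ]


print(port_conflicts(["paddle", "doctr"]))    # []
print(port_conflicts(["paddle", "easyocr"]))  # [('easyocr', 'paddle')]
```

An empty list means the requested services can run side by side; any returned pair has to be run one after the other.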
## Usage Examples

### PaddleOCR Tuning

```bash
# Start service
docker compose -f docker-compose.tuning.paddle.yml up -d paddle-ocr-gpu

# Wait until the health check passes; check with:
curl http://localhost:8002/health

# Run tuning (64 samples)
docker compose -f docker-compose.tuning.paddle.yml run raytune --service paddle --samples 64

# Stop service
docker compose -f docker-compose.tuning.paddle.yml down
```

### DocTR Tuning

```bash
docker compose -f docker-compose.tuning.doctr.yml up -d doctr-gpu
curl http://localhost:8003/health
docker compose -f docker-compose.tuning.doctr.yml run raytune --service doctr --samples 64
docker compose -f docker-compose.tuning.doctr.yml down
```

### EasyOCR Tuning

```bash
docker compose -f docker-compose.tuning.easyocr.yml up -d easyocr-gpu
curl http://localhost:8002/health
docker compose -f docker-compose.tuning.easyocr.yml run raytune --service easyocr --samples 64
docker compose -f docker-compose.tuning.easyocr.yml down
```

### Run Multiple Services (PaddleOCR + DocTR)

```bash
# Start both services
docker compose -f docker-compose.tuning.yml up -d paddle-ocr-gpu doctr-gpu

# Run tuning for each
docker compose -f docker-compose.tuning.yml run raytune --service paddle --samples 64
docker compose -f docker-compose.tuning.yml run raytune --service doctr --samples 64

# Stop all
docker compose -f docker-compose.tuning.yml down
```

## Command Line Options

```bash
docker compose -f <compose-file> run raytune --service <service> --samples <n>
```

| Option | Description | Default |
|--------|-------------|---------|
| `--service` | OCR service: `paddle`, `doctr`, `easyocr` | Required |
| `--samples` | Number of hyperparameter trials | 64 |

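
For readers who want to see how these two options behave, the following is a minimal `argparse` sketch that matches the table: `--service` is required and limited to the three listed values, and `--samples` defaults to 64. It is an illustration only; the actual parser in `raytune/run_tuning.py` is not shown in this README and may differ, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options table above."""
    parser = argparse.ArgumentParser(description="Run Ray Tune against an OCR service.")
    parser.add_argument(
        "--service",
        required=True,
        choices=["paddle", "doctr", "easyocr"],
        help="OCR service to tune",
    )
    parser.add_argument(
        "--samples",
        type=int,
        default=64,
        help="number of hyperparameter trials",
    )
    return parser


args = build_parser().parse_args(["--service", "paddle"])
print(args.service, args.samples)  # paddle 64
```

With this shape, omitting `--service` or passing an unlisted value makes `argparse` exit with a usage error before any tuning starts.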
## Output

Results are saved to `src/results/` as CSV files:
- `raytune_paddle_results_<timestamp>.csv`
- `raytune_doctr_results_<timestamp>.csv`
- `raytune_easyocr_results_<timestamp>.csv`

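
A short sketch of how the newest results file for one service could be located and ranked. The file name pattern comes from the list above; the presence of a `CER` column in the CSV is an assumption (the README does not list the columns), and both helper names are hypothetical.

```python
import csv
from pathlib import Path


def latest_results(results_dir, service):
    """Return the most recently modified results CSV for a service, or None."""
    pattern = f"raytune_{service}_results_*.csv"
    candidates = sorted(Path(results_dir).glob(pattern), key=lambda p: p.stat().st_mtime)
    return candidates[-1] if candidates else None


def best_rows(csv_path, metric="CER", top=5):
    """Return the rows with the lowest metric value (assumes a column named like metric)."""
    with open(csv_path, newline="") as handle:
        # Rows with an empty or missing metric are ignored.
        rows = [row for row in csv.DictReader(handle) if row.get(metric)]
    return sorted(rows, key=lambda row: float(row[metric]))[:top]
```

For example, `best_rows(latest_results("src/results", "paddle"))` would return up to five rows with the lowest CER as dictionaries keyed by column name, provided a matching file exists.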
## Directory Structure

```
src/
├── docker-compose.tuning.yml          # All services (PaddleOCR + DocTR)
├── docker-compose.tuning.paddle.yml   # PaddleOCR only
├── docker-compose.tuning.doctr.yml    # DocTR only
├── docker-compose.tuning.easyocr.yml  # EasyOCR only
├── raytune/
│   ├── Dockerfile
│   ├── requirements.txt
│   ├── raytune_ocr.py
│   └── run_tuning.py
├── dataset/                           # Input images and ground truth
├── results/                           # Output CSV files
└── debugset/                          # Debug output
```

## Docker Images

All images are pre-built and pulled from registry:
- `seryus.ddns.net/unir/raytune:latest` - Ray Tune tuning service
- `seryus.ddns.net/unir/paddle-ocr-gpu:latest` - PaddleOCR GPU
- `seryus.ddns.net/unir/doctr-gpu:latest` - DocTR GPU
- `seryus.ddns.net/unir/easyocr-gpu:latest` - EasyOCR GPU

### Build locally (development)

```bash
# Build raytune image locally
docker build -t seryus.ddns.net/unir/raytune:latest ./raytune
```

## Troubleshooting

### Service not ready
Wait for the health check to pass before running tuning:
```bash
# Check service health
curl http://localhost:8002/health
# Expected: {"status": "ok", "model_loaded": true, ...}
```

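
Waiting for the health check can be automated instead of repeating `curl` by hand. The sketch below polls the endpoint until the response matches the expected payload shown above (`status` equal to `ok` and `model_loaded` true). The helper names, the attempt count, and the delay are assumptions for illustration; the `fetch` parameter exists only so the polling logic can be exercised without a running service.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_health(url, timeout=5):
    """GET the health endpoint and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def wait_until_ready(url, attempts=30, delay=2.0, fetch=fetch_health):
    """Poll the endpoint; return True once the service reports a loaded model."""
    for _ in range(attempts):
        try:
            health = fetch(url)
        except (urllib.error.URLError, OSError, ValueError):
            health = {}  # not reachable yet, or the body was not valid JSON
        if health.get("status") == "ok" and health.get("model_loaded"):
            return True
        time.sleep(delay)
    return False
```

A call such as `wait_until_ready("http://localhost:8002/health")` returns `True` as soon as the service is ready and `False` if it never becomes ready within the configured attempts.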
### GPU not detected
Ensure `nvidia-container-toolkit` is installed:
```bash
nvidia-smi  # Should show your GPU
# CUDA image tags on Docker Hub include the OS suffix
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```

### Port already in use
Stop any running OCR services:
```bash
docker compose -f docker-compose.tuning.paddle.yml down
docker compose -f docker-compose.tuning.easyocr.yml down
```