Ray Tune OCR Hyperparameter Optimization
Docker-based hyperparameter tuning for OCR services using Ray Tune with Optuna search.
Structure
raytune/
├── Dockerfile # Python 3.12-slim with Ray Tune + Optuna
├── requirements.txt # Dependencies
├── raytune_ocr.py # Shared utilities and search spaces
├── run_tuning.py # CLI entry point
└── README.md
Quick Start
cd src
# Build the raytune image
docker compose -f docker-compose.tuning.paddle.yml build raytune
# Or pull from registry
docker pull seryus.ddns.net/unir/raytune:latest
Usage
PaddleOCR Tuning
# Start PaddleOCR service
docker compose -f docker-compose.tuning.paddle.yml up -d paddle-ocr-gpu
# Wait for health check, then run tuning
docker compose -f docker-compose.tuning.paddle.yml run raytune --service paddle --samples 64
# Stop when done
docker compose -f docker-compose.tuning.paddle.yml down
DocTR Tuning
docker compose -f docker-compose.tuning.doctr.yml up -d doctr-gpu
docker compose -f docker-compose.tuning.doctr.yml run raytune --service doctr --samples 64
docker compose -f docker-compose.tuning.doctr.yml down
EasyOCR Tuning
# Note: EasyOCR uses port 8002 (same as PaddleOCR). Cannot run simultaneously.
docker compose -f docker-compose.tuning.easyocr.yml up -d easyocr-gpu
docker compose -f docker-compose.tuning.easyocr.yml run raytune --service easyocr --samples 64
docker compose -f docker-compose.tuning.easyocr.yml down
CLI Options
python run_tuning.py --service {paddle,doctr,easyocr} --samples N
| Option | Description | Default |
|---|---|---|
| --service | OCR service to tune (required) | - |
| --samples | Number of hyperparameter trials | 64 |
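The option table can be expressed as an `argparse` parser. This is a minimal sketch built only from the documented options; `build_parser` is a hypothetical name, and the actual `run_tuning.py` may be structured differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented options (a sketch, not the real run_tuning.py)."""
    parser = argparse.ArgumentParser(prog="run_tuning.py")
    parser.add_argument(
        "--service",
        required=True,
        choices=["paddle", "doctr", "easyocr"],
        help="OCR service to tune",
    )
    parser.add_argument(
        "--samples",
        type=int,
        default=64,
        help="Number of hyperparameter trials",
    )
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--service", "paddle"])
print(f"Tuning {args.service} with {args.samples} trials")
```

Using `choices` makes an unknown service name fail at parse time rather than partway through a tuning run.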
Search Spaces
PaddleOCR
- use_doc_orientation_classify: [True, False]
- use_doc_unwarping: [True, False]
- textline_orientation: [True, False]
- text_det_thresh: uniform(0.0, 0.7)
- text_det_box_thresh: uniform(0.0, 0.7)
- text_rec_score_thresh: uniform(0.0, 0.7)
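To make the notation concrete, the sketch below draws one trial configuration from the PaddleOCR space using only the standard library. It is illustrative: the real tuner relies on Ray Tune with Optuna search, which proposes configurations adaptively rather than purely at random. `PADDLE_SPACE` and `sample_config` are hypothetical names, not taken from `raytune_ocr.py`.

```python
import random

# PaddleOCR search space as listed above.
# A list means "pick one option"; a (low, high) tuple means uniform(low, high).
PADDLE_SPACE = {
    "use_doc_orientation_classify": [True, False],
    "use_doc_unwarping": [True, False],
    "textline_orientation": [True, False],
    "text_det_thresh": (0.0, 0.7),
    "text_det_box_thresh": (0.0, 0.7),
    "text_rec_score_thresh": (0.0, 0.7),
}


def sample_config(space: dict, rng: random.Random) -> dict:
    """Draw one value per parameter: uniform for tuples, random choice for lists."""
    config = {}
    for name, spec in space.items():
        if isinstance(spec, tuple):
            low, high = spec
            config[name] = rng.uniform(low, high)
        else:
            config[name] = rng.choice(spec)
    return config


# A seeded generator keeps the example reproducible.
trial = sample_config(PADDLE_SPACE, random.Random(0))
print(trial)
```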
DocTR
- assume_straight_pages: [True, False]
- straighten_pages: [True, False]
- preserve_aspect_ratio: [True, False]
- symmetric_pad: [True, False]
- disable_page_orientation: [True, False]
- disable_crop_orientation: [True, False]
- resolve_lines: [True, False]
- resolve_blocks: [True, False]
- paragraph_break: uniform(0.01, 0.1)
EasyOCR
- text_threshold: uniform(0.3, 0.9)
- low_text: uniform(0.2, 0.6)
- link_threshold: uniform(0.2, 0.6)
- slope_ths: uniform(0.0, 0.3)
- ycenter_ths: uniform(0.3, 1.0)
- height_ths: uniform(0.3, 1.0)
- width_ths: uniform(0.3, 1.0)
- add_margin: uniform(0.0, 0.3)
- contrast_ths: uniform(0.05, 0.3)
- adjust_contrast: uniform(0.3, 0.8)
- decoder: ["greedy", "beamsearch"]
- beamWidth: [3, 5, 7, 10]
- min_size: [5, 10, 15, 20]
Output
Results are saved to src/results/ as CSV files:
- raytune_paddle_results_YYYYMMDD_HHMMSS.csv
- raytune_doctr_results_YYYYMMDD_HHMMSS.csv
- raytune_easyocr_results_YYYYMMDD_HHMMSS.csv
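The naming pattern corresponds to a `strftime` format of `%Y%m%d_%H%M%S`. The helper below is a hypothetical illustration of that pattern; whether the tuner stamps files with local time or UTC is not stated here, so the timestamp is passed in explicitly.

```python
from datetime import datetime


def results_filename(service: str, when: datetime) -> str:
    """Build a results file name following raytune_<service>_results_YYYYMMDD_HHMMSS.csv."""
    return f"raytune_{service}_results_{when:%Y%m%d_%H%M%S}.csv"


# Arbitrary example timestamp, chosen only to show the zero-padded fields.
name = results_filename("paddle", datetime(2025, 1, 31, 14, 5, 9))
print(name)
```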
Each row contains:
- Configuration parameters (prefixed with config/)
- Metrics: CER, WER, TIME, PAGES, TIME_PER_PAGE
- Worker URL used for the trial
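Given that layout, the best trial can be picked out with the standard `csv` module. The sketch below assumes the metric columns are named exactly `CER` and `WER` and that lower values are better; `best_trial` and `config_of` are hypothetical helpers, and the two sample rows are synthetic values made up for illustration.

```python
import csv
import io


def best_trial(csv_file, metric: str = "CER") -> dict:
    """Return the row with the lowest value of `metric` (lower CER/WER is better)."""
    rows = list(csv.DictReader(csv_file))
    if not rows:
        raise ValueError("results file contains no trials")
    return min(rows, key=lambda row: float(row[metric]))


def config_of(row: dict) -> dict:
    """Keep only hyperparameter columns, stripping the 'config/' prefix."""
    prefix = "config/"
    return {k[len(prefix):]: v for k, v in row.items() if k.startswith(prefix)}


# Synthetic two-row example; these numbers are not real tuning results.
SAMPLE = io.StringIO(
    "config/text_det_thresh,config/use_doc_unwarping,CER,WER\n"
    "0.30,False,0.12,0.25\n"
    "0.55,True,0.08,0.19\n"
)
best = best_trial(SAMPLE)
best_config = config_of(best)
print(best_config, best["CER"])
```

For a real results file, open it with `open(path, newline="")` and pass the file object to `best_trial`. Values come back as strings, so convert them before reusing them as parameters.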
Network Mode
The raytune container uses network_mode: host to access OCR services on localhost ports:
- PaddleOCR: port 8002
- DocTR: port 8003
- EasyOCR: port 8002 (conflicts with PaddleOCR)
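Because the services are reached on localhost ports, a plain TCP check can confirm that something is listening before tuning starts. This is a sketch under stated assumptions: `port_is_open` and `wait_for_port` are hypothetical helpers, and no particular health endpoint is assumed. An open port does not prove the OCR model has finished loading, nor does it tell PaddleOCR and EasyOCR apart on port 8002.

```python
import socket
import time


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host: str, port: int, deadline_s: float = 120.0,
                  interval_s: float = 2.0) -> bool:
    """Poll until the port accepts connections or the deadline passes."""
    end = time.monotonic() + deadline_s
    while time.monotonic() < end:
        if port_is_open(host, port):
            return True
        time.sleep(interval_s)
    return False
```

For example, `wait_for_port("localhost", 8003)` would block until the DocTR port accepts connections or two minutes elapse.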
Dependencies
- ray[tune]==2.52.1
- optuna==4.7.0
- requests>=2.28.0
- pandas>=2.0.0