Paddle ocr gpu support. #4
@@ -2,25 +2,31 @@

## Quick: Check Ray Tune Progress

**Current run:** PaddleOCR hyperparameter optimization via Ray Tune + Optuna.
- 64 trials searching for optimal detection/recognition thresholds
- 2 CPU workers running in parallel (Docker containers on ports 8001-8002)
- Notebook: `paddle_ocr_raytune_rest.ipynb` → `output_raytune.ipynb`
- Results saved to: `~/ray_results/trainable_paddle_ocr_2026-01-18_17-25-43/`

```bash
# Is it still running?
# Is papermill still running?
ps aux | grep papermill | grep -v grep

# View live log
tail -f papermill.log

# Count completed trials (64 total)
find ~/ray_results/trainable_paddle_ocr_2026-01-18_17-25-43/ -name "result.json" ! -empty | wc -l
# Find latest Ray Tune run and count completed trials
LATEST=$(ls -td ~/ray_results/trainable_* 2>/dev/null | head -1)
echo "Run: $LATEST"
COMPLETED=$(find "$LATEST" -name "result.json" -size +0 2>/dev/null | wc -l)
TOTAL=$(ls -d "$LATEST"/trainable_*/ 2>/dev/null | wc -l)
echo "Completed: $COMPLETED / $TOTAL"

# Check workers are healthy
curl -s localhost:8001/health | jq -r '.status'
curl -s localhost:8002/health | jq -r '.status'
for port in 8001 8002 8003; do
  status=$(curl -s "localhost:$port/health" 2>/dev/null | python3 -c "import sys,json; print(json.load(sys.stdin).get('status','down'))" 2>/dev/null || echo "down")
  echo "Worker $port: $status"
done

# Show best result so far
if [ "$COMPLETED" -gt 0 ]; then
  find "$LATEST" -name "result.json" -size +0 -exec cat {} \; 2>/dev/null | \
    python3 -c "import sys,json; results=[json.loads(l) for l in sys.stdin if l.strip()]; best=min(results,key=lambda x:x.get('CER',999)); print(f'Best CER: {best[\"CER\"]:.4f}, WER: {best[\"WER\"]:.4f}')" 2>/dev/null
fi
```
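
The "best result so far" one-liner above can be hard to read and debug. A more explicit Python sketch of the same idea follows. It assumes, as the shell snippet does, that each trial directory holds a `result.json` with one JSON object per line and a `CER` field. `best_trial` is a hypothetical helper name, not part of this repository. Unlike the one-liner, it skips records that lack the metric and tolerates a half-written last line.

```python
import json
from pathlib import Path


def best_trial(run_dir, metric="CER"):
    """Return the record with the lowest `metric` under run_dir, or None."""
    best = None
    for result_file in Path(run_dir).expanduser().rglob("result.json"):
        for line in result_file.read_text(encoding="utf-8").splitlines():
            if not line.strip():
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                # A trial that is still running may leave a half-written line.
                continue
            if not isinstance(record, dict) or metric not in record:
                continue
            if best is None or record[metric] < best[metric]:
                best = record
    return best


# Example (directory pattern taken from the shell snippet above):
# runs = Path("~/ray_results").expanduser().glob("trainable_*")
# latest = max(runs, key=lambda p: p.stat().st_mtime)
# print(best_trial(latest))
```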

---

@@ -101,6 +101,55 @@ Run OCR evaluation with given hyperparameters.

**Note:** `model_reinitialized` indicates if the model was reloaded due to changed processing flags (adds ~2-5s overhead).

## Debug Output (debugset)

The `debugset` folder allows saving OCR predictions for debugging and analysis. When `save_output=True` is passed to `/evaluate`, predictions are written to `/app/debugset`.

### Enable Debug Output

```json
{
  "pdf_folder": "/app/dataset",
  "save_output": true,
  "start_page": 5,
  "end_page": 10
}
```
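
A minimal sketch of sending this payload from Python with only the standard library. Several things here are assumptions rather than facts from this README: that `/evaluate` accepts a JSON `POST` body, that the service is reachable at `http://localhost:8003` (taken from the `docker run` port mapping), and the helper name `build_evaluate_request`, which is hypothetical.

```python
import json
import urllib.request


def build_evaluate_request(base_url, start_page=5, end_page=10,
                           pdf_folder="/app/dataset"):
    """Build a POST request for /evaluate with debug output enabled."""
    payload = {
        "pdf_folder": pdf_folder,
        "save_output": True,  # ask the service to write pages to /app/debugset
        "start_page": start_page,
        "end_page": end_page,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/evaluate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sending the request (needs a running container, so it is left commented out):
# req = build_evaluate_request("http://localhost:8003")
# with urllib.request.urlopen(req, timeout=600) as resp:
#     print(json.load(resp))
```

Building the request separately from sending it keeps the payload easy to inspect before any OCR work is triggered.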

### Output Structure

```
debugset/
├── doc1/
│   └── doctr/
│       ├── page_0005.txt
│       ├── page_0006.txt
│       └── ...
├── doc2/
│   └── doctr/
│       └── ...
```

Each `.txt` file contains the OCR-extracted text for that page.
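
Because each engine writes to its own subfolder, pages from two engines can be compared with a few lines of Python. The sketch below is illustrative only. `diff_engines` is a hypothetical helper, and it assumes a second engine (for example `easyocr` or `paddle_ocr`) has already written pages for the same document into the same `debugset` tree.

```python
import difflib
from pathlib import Path


def diff_engines(debugset, doc, engine_a, engine_b):
    """Yield (page_name, diff_lines) for pages where the two engines differ.

    Only pages produced by both engines are compared.
    """
    dir_a = Path(debugset) / doc / engine_a
    dir_b = Path(debugset) / doc / engine_b
    pages_a = {p.name for p in dir_a.glob("page_*.txt")}
    pages_b = {p.name for p in dir_b.glob("page_*.txt")}
    for name in sorted(pages_a & pages_b):
        text_a = (dir_a / name).read_text(encoding="utf-8").splitlines()
        text_b = (dir_b / name).read_text(encoding="utf-8").splitlines()
        delta = list(difflib.unified_diff(
            text_a, text_b,
            fromfile=f"{engine_a}/{name}",
            tofile=f"{engine_b}/{name}",
            lineterm="",
        ))
        if delta:  # identical pages produce an empty diff and are skipped
            yield name, delta
```

For example, `for name, delta in diff_engines("debugset", "doc1", "doctr", "easyocr"): print("\n".join(delta))` prints a unified diff for every page on which the two engines disagree.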

### Docker Mount

Add the debugset volume to your docker run command:

```bash
docker run -d -p 8003:8000 \
  -v $(pwd)/../dataset:/app/dataset:ro \
  -v $(pwd)/../debugset:/app/debugset:rw \
  -v doctr-cache:/root/.cache/doctr \
  doctr-api:cpu
```

### Use Cases

- **Compare OCR engines**: Run the same pages through PaddleOCR, DocTR, and EasyOCR with `save_output=True`, then diff the results
- **Debug hyperparameters**: See how different settings affect text extraction
- **Ground truth comparison**: Compare predictions against expected output
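
For the ground-truth use case, a saved page can be scored against a reference transcription. The sketch below uses the usual Levenshtein-based definitions of character and word error rate. How the service itself computes `CER` and `WER` is not shown in this README, so treat this as an independent approximation. The function names are hypothetical, and the handling of an empty reference is a convention chosen for this sketch.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # delete r from the reference
                curr[j - 1] + 1,         # insert h into the reference
                prev[j - 1] + (r != h),  # substitute, free when equal
            ))
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance normalised by the reference length."""
    if len(ref) == 0:
        # Convention for this sketch: empty vs. empty is a perfect match.
        return 0.0 if len(hyp) == 0 else 1.0
    return edit_distance(ref, hyp) / len(ref)


def cer(reference, hypothesis):
    """Character error rate of hypothesis against reference."""
    return error_rate(reference, hypothesis)


def wer(reference, hypothesis):
    """Word error rate, splitting on whitespace."""
    return error_rate(reference.split(), hypothesis.split())
```

A page could then be scored with `cer(expected_text, Path("debugset/doc1/doctr/page_0005.txt").read_text())`, where `expected_text` comes from wherever your ground truth is stored.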

## Hyperparameters

### Processing Flags (Require Model Reinitialization)

@@ -96,6 +96,55 @@ Run OCR evaluation with given hyperparameters.
{"CER": 0.0234, "WER": 0.1156, "TIME": 45.2, "PAGES": 5, "TIME_PER_PAGE": 9.04}
```

## Debug Output (debugset)

The `debugset` folder allows saving OCR predictions for debugging and analysis. When `save_output=True` is passed to `/evaluate`, predictions are written to `/app/debugset`.

### Enable Debug Output

```json
{
  "pdf_folder": "/app/dataset",
  "save_output": true,
  "start_page": 5,
  "end_page": 10
}
```

### Output Structure

```
debugset/
├── doc1/
│   └── easyocr/
│       ├── page_0005.txt
│       ├── page_0006.txt
│       └── ...
├── doc2/
│   └── easyocr/
│       └── ...
```

Each `.txt` file contains the OCR-extracted text for that page.

### Docker Mount

Add the debugset volume to your docker run command:

```bash
docker run -d -p 8002:8000 \
  -v $(pwd)/../dataset:/app/dataset:ro \
  -v $(pwd)/../debugset:/app/debugset:rw \
  -v easyocr-cache:/root/.EasyOCR \
  easyocr-api:cpu
```

### Use Cases

- **Compare OCR engines**: Run the same pages through PaddleOCR, DocTR, and EasyOCR with `save_output=True`, then diff the results
- **Debug hyperparameters**: See how different settings affect text extraction
- **Ground truth comparison**: Compare predictions against expected output

## Hyperparameters

### Detection (CRAFT Algorithm)

@@ -110,6 +110,52 @@ Run OCR evaluation with given hyperparameters.
### `POST /evaluate_full`
Same as `/evaluate` but runs on ALL pages (ignores start_page/end_page).

## Debug Output (debugset)

The `debugset` folder allows saving OCR predictions for debugging and analysis. When `save_output=True` is passed to `/evaluate`, predictions are written to `/app/debugset`.

### Enable Debug Output

```json
{
  "pdf_folder": "/app/dataset",
  "save_output": true,
  "start_page": 5,
  "end_page": 10
}
```

### Output Structure

```
debugset/
├── doc1/
│   └── paddle_ocr/
│       ├── page_0005.txt
│       ├── page_0006.txt
│       └── ...
├── doc2/
│   └── paddle_ocr/
│       └── ...
```

Each `.txt` file contains the OCR-extracted text for that page.

### Docker Mount

The `debugset` folder is mounted read-write in docker-compose:

```yaml
volumes:
  - ../debugset:/app/debugset:rw
```

### Use Cases

- **Compare OCR engines**: Run the same pages through PaddleOCR, DocTR, and EasyOCR with `save_output=True`, then diff the results
- **Debug hyperparameters**: See how different settings affect text extraction
- **Ground truth comparison**: Compare predictions against expected output

## Building Images

### CPU Image (Multi-Architecture)