word gen
2026-01-24 16:09:26 +01:00
parent a071b82b38
commit 0e074f6101
2 changed files with 193 additions and 168 deletions


@@ -117,7 +117,12 @@ flowchart LR
#### ImageTextDataset Class
-A Python class was implemented to load image-text pairs, returning (PIL.Image, str) tuples from paired folders. The implementation is in [`src/prepare_dataset.ipynb`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/prepare_dataset.ipynb) and in [`src/paddle_ocr/dataset_manager.py`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/paddle_ocr/dataset_manager.py), [`src/easyocr_service/dataset_manager.py`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/easyocr_service/dataset_manager.py), [`src/doctr_service/dataset_manager.py`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/doctr_service/dataset_manager.py).)
+A Python class was implemented to load image-text pairs, returning (PIL.Image, str) tuples from paired folders. The implementation is in:
+- [`src/prepare_dataset.ipynb`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/prepare_dataset.ipynb)
+- [`src/paddle_ocr/dataset_manager.py`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/paddle_ocr/dataset_manager.py)
+- [`src/easyocr_service/dataset_manager.py`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/easyocr_service/dataset_manager.py)
+- [`src/doctr_service/dataset_manager.py`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/doctr_service/dataset_manager.py)
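The paired-folder loading described in this hunk can be sketched roughly as follows. This is an illustrative sketch, not the code in the linked `dataset_manager.py` files: the constructor arguments `img_dir` and `txt_dir`, the `pairs` attribute, the accepted image suffixes, and the rule of matching files by stem are all assumptions made for the example.

```python
from pathlib import Path


class ImageTextDataset:
    """Pairs each image in img_dir with the same-named .txt file in txt_dir.

    Hypothetical sketch: the folder layout and matching rule are assumed.
    """

    IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}

    def __init__(self, img_dir, txt_dir):
        img_dir, txt_dir = Path(img_dir), Path(txt_dir)
        self.pairs = []
        for img_path in sorted(img_dir.iterdir()):
            if img_path.suffix.lower() not in self.IMAGE_SUFFIXES:
                continue  # ignore non-image files
            txt_path = txt_dir / (img_path.stem + ".txt")
            if txt_path.exists():  # skip images without a reference text
                self.pairs.append((img_path, txt_path))

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        # Pillow is imported lazily so the pairing logic works without it.
        from PIL import Image

        img_path, txt_path = self.pairs[idx]
        image = Image.open(img_path).convert("RGB")
        text = txt_path.read_text(encoding="utf-8")
        return image, text  # (PIL.Image, str), as described above
```

Indexing the dataset, for example `image, text = ImageTextDataset("imgs", "txts")[0]`, then yields one (PIL.Image, str) tuple per matched pair.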
### Phase 2: Comparative Benchmark
@@ -134,7 +139,11 @@ Fuente: [`docs/metrics/metrics.md`](https://seryus.ddns.net/unir/MastersThesis/s
#### Evaluation Metrics
-The `jiwer` library was used to compute CER and WER by comparing the reference text with the OCR model's prediction. The implementation is in [`src/paddle_ocr/paddle_ocr_tuning_rest.py`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/paddle_ocr/paddle_ocr_tuning_rest.py), [`src/easyocr_service/easyocr_tuning_rest.py`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/easyocr_service/easyocr_tuning_rest.py) and [`src/doctr_service/doctr_tuning_rest.py`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/doctr_service/doctr_tuning_rest.py).)
+The `jiwer` library was used to compute CER and WER by comparing the reference text with the OCR model's prediction. The implementation is in:
+- [`src/paddle_ocr/paddle_ocr_tuning_rest.py`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/paddle_ocr/paddle_ocr_tuning_rest.py)
+- [`src/easyocr_service/easyocr_tuning_rest.py`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/easyocr_service/easyocr_tuning_rest.py)
+- [`src/doctr_service/doctr_tuning_rest.py`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/doctr_service/doctr_tuning_rest.py)
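To make the two metrics concrete, the sketch below computes CER and WER with a plain Levenshtein distance. It is a dependency-free illustration rather than the project's code, which the text says relies on `jiwer` (where `jiwer.cer(reference, hypothesis)` and `jiwer.wer(reference, hypothesis)` play this role). The helper names `edit_distance`, `cer`, and `wer` are hypothetical, and `jiwer` may apply its own text transformations, so its values are not guaranteed to match this sketch on every input.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits / reference length.

    Assumes a non-empty reference string.
    """
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits / number of reference words.

    Assumes the reference contains at least one word.
    """
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

Both functions take the reference first and the prediction second, the same argument order used by `jiwer`.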
### Phase 3: Search Space
@@ -165,7 +174,12 @@ Se implementó una arquitectura basada en contenedores Docker para aislar los se
#### Running with Docker Compose
-The services are orchestrated with Docker Compose ([`src/docker-compose.tuning.paddle.yml`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/docker-compose.tuning.paddle.yml), [`src/docker-compose.tuning.doctr.yml`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/docker-compose.tuning.doctr.yml), [`src/docker-compose.tuning.easyocr.yml`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/docker-compose.tuning.easyocr.yml), [`src/docker-compose.tuning.yml`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/docker-compose.tuning.yml)):
+The services are orchestrated with Docker Compose:
+- [`src/docker-compose.tuning.paddle.yml`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/docker-compose.tuning.paddle.yml)
+- [`src/docker-compose.tuning.doctr.yml`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/docker-compose.tuning.doctr.yml)
+- [`src/docker-compose.tuning.easyocr.yml`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/docker-compose.tuning.easyocr.yml)
+- [`src/docker-compose.tuning.yml`](https://seryus.ddns.net/unir/MastersThesis/src/branch/main/src/docker-compose.tuning.yml)
```bash
# Start the OCR service
