regen docs

2026-01-19 17:38:43 +01:00
parent b1539fd79f
commit 07a7ba8c01
4 changed files with 138 additions and 541 deletions
--- a/docs/03_objetivos_metodologia.md
+++ b/docs/03_objetivos_metodologia.md
@@ -104,16 +104,7 @@ flowchart LR

 #### ImageTextDataset Class

-A Python class was implemented to load image-text pairs:
-
-```python
-class ImageTextDataset:
-    def __init__(self, root):
-        # Loads (image, text) pairs from paired folders
-
-    def __getitem__(self, idx):
-        # Returns (PIL.Image, str)
-```
+A Python class was implemented to load image-text pairs from paired folders, returning each pair as a (PIL.Image, str) tuple. The full implementation is available in `src/ocr_benchmark_notebook.ipynb` (see Appendix A).

 ### Phase 2: Comparative Benchmark

@@ -131,17 +122,7 @@ class ImageTextDataset:

 #### Evaluation Metrics

-The `jiwer` library was used to compute:
-
-```python
-from jiwer import wer, cer
-
-def evaluate_text(reference, prediction):
-    return {
-        'WER': wer(reference, prediction),
-        'CER': cer(reference, prediction)
-    }
-```
+The `jiwer` library was used to compute CER and WER by comparing the reference text against the OCR model's prediction. The implementation is available in `src/ocr_benchmark_notebook.ipynb` (see Appendix A).

 ### Phase 3: Search Space

@@ -163,66 +144,45 @@ def evaluate_text(reference, prediction):

 #### Ray Tune Configuration

-```python
-from ray import tune
-from ray.tune.search.optuna import OptunaSearch
-
-search_space = {
-    "use_doc_orientation_classify": tune.choice([True, False]),
-    "use_doc_unwarping": tune.choice([True, False]),
-    "textline_orientation": tune.choice([True, False]),
-    "text_det_thresh": tune.uniform(0.0, 0.7),
-    "text_det_box_thresh": tune.uniform(0.0, 0.7),
-    "text_det_unclip_ratio": tune.choice([0.0]),
-    "text_rec_score_thresh": tune.uniform(0.0, 0.7),
-}
-
-tuner = tune.Tuner(
-    trainable_paddle_ocr,
-    tune_config=tune.TuneConfig(
-        metric="CER",
-        mode="min",
-        search_alg=OptunaSearch(),
-        num_samples=64,
-        max_concurrent_trials=2
-    )
-)
-```
+The search space was defined with `tune.choice()` for boolean parameters and `tune.uniform()` for continuous parameters, using OptunaSearch as the search algorithm, configured to minimize CER over 64 trials. The full implementation is available in `src/raytune/raytune_ocr.py` (see Appendix A).

 ### Phase 4: Optimization Run

 #### Execution Architecture

-Because Ray and PaddleOCR are incompatible within the same process, a subprocess-based architecture was implemented:
+A Docker container-based architecture was implemented to isolate the OCR services and make runs easier to reproduce:

 ```mermaid
 ---
-title: "Subprocess-based execution architecture"
+title: "Execution architecture with Docker Compose"
 ---
 flowchart LR
-    A["Ray Tune (main process)"]
+    subgraph Docker["Docker Compose"]
+        A["RayTune Container"]
+        B["OCR Service Container"]
+    end

-    A --> B["Subprocess 1: paddle_ocr_tuning.py --config"]
-    B --> B_out["Returns JSON with metrics"]
-
-    A --> C["Subprocess 2: paddle_ocr_tuning.py --config"]
-    C --> C_out["Returns JSON with metrics"]
+    A -->|"HTTP POST /evaluate"| B
+    B -->|"JSON {CER, WER, TIME}"| A
+    A -.->|"Health check /health"| B
 ```

-#### Evaluation Script (paddle_ocr_tuning.py)
+#### Running with Docker Compose

-The script receives hyperparameters via the command line:
+The services are orchestrated with Docker Compose (`src/docker-compose.tuning.*.yml`):

 ```bash
-python paddle_ocr_tuning.py \
-    --pdf-folder ./dataset \
-    --textline-orientation True \
-    --text-det-box-thresh 0.5 \
-    --text-det-thresh 0.4 \
-    --text-rec-score-thresh 0.6
+# Start the OCR service
+docker compose -f docker-compose.tuning.doctr.yml up -d doctr-gpu
+
+# Run the optimization (64 trials)
+docker compose -f docker-compose.tuning.doctr.yml run raytune --service doctr --samples 64
+
+# Stop the services
+docker compose -f docker-compose.tuning.doctr.yml down
 ```

-And returns metrics in JSON format:
+The OCR service exposes a REST API that returns metrics in JSON format:

 ```json
 {