- `instructions/plantilla_individual.pdf` - Official template preview
**IMPORTANT:** When styling elements (tables, figures, notes, quotes), ALWAYS check `plantilla_individual.htm` for existing Word/CSS classes (e.g., `MsoQuote`, `MsoCaption`, `Piedefoto-tabla`). Use these classes instead of custom inline styles.
### UNIR Color Palette (from plantilla_individual.htm)
- All CER/WER values must match those in `docs/metrics/*.md`
- Verify: baseline, optimized, best trial, percentage improvement
- Verify: GPU vs CPU acceleration factor
- Verify: dataset size (pages)
### UNIR Formatting
- Tables: `**Tabla N.** *Descriptive title in italics.*` followed by the table, then a line starting with `Fuente:` immediately after the table (no blank line in between), e.g., `Fuente: ...`
- Table titles must describe the content (e.g., "Comparación de modelos OCR")
- Figures: `**Figura N.** *Descriptive title in italics.*`
- Figure titles must describe the content (e.g., "Pipeline de un sistema OCR moderno")
- Sequential numbering (no duplicates, no gaps)
- APA citation format for references
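
The sequential-numbering rule above can be checked mechanically. The following is a minimal sketch, assuming captions appear at the start of a line exactly as `**Tabla N.**` or `**Figura N.**` and that numbering starts at 1; the function name `numbering_problems` is hypothetical and is not part of the project's tooling.

```python
import re

# Matches caption lines such as "**Tabla 3.** *Title.*" (assumed caption shape).
CAPTION_RE = re.compile(r"^\*\*(Tabla|Figura) (\d+)\.\*\*", re.MULTILINE)


def numbering_problems(markdown_text):
    """Return {kind: [problem, ...]} for duplicated or missing caption numbers."""
    seen = {"Tabla": [], "Figura": []}
    for kind, number in CAPTION_RE.findall(markdown_text):
        seen[kind].append(int(number))

    problems = {}
    for kind, numbers in seen.items():
        issues = []
        duplicates = sorted({n for n in numbers if numbers.count(n) > 1})
        if duplicates:
            issues.append(f"duplicates: {duplicates}")
        # Assumption: numbering is expected to run 1..max without holes.
        expected = set(range(1, max(numbers) + 1)) if numbers else set()
        gaps = sorted(expected - set(numbers))
        if gaps:
            issues.append(f"gaps: {gaps}")
        if issues:
            problems[kind] = issues
    return problems
```

If numbering is global across the thesis, the function should be run on the concatenated text of the relevant `docs/*` files rather than on each file separately.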
### Word Generation Alignment
- Table sources are only captured when the line **immediately after** the table starts with `Fuente:` (per `apply_content.py`).
- Mermaid figures use the YAML `title:` for captions in Word output; `**Figura N.**` lines are ignored by the generator but should remain for UNIR compliance.
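
A quick way to catch tables whose source line would be dropped is to look for table rows that are not directly followed by `Fuente:`. This is a rough sketch, assuming tables are Markdown pipe tables whose rows start with `|`; it does not reproduce the actual logic of `apply_content.py`, and `tables_missing_source` is a hypothetical helper name.

```python
def tables_missing_source(markdown_text):
    """Return 1-based line numbers of final table rows not followed by 'Fuente:'."""
    lines = markdown_text.splitlines()
    missing = []
    for i, line in enumerate(lines):
        is_row = line.lstrip().startswith("|")
        next_line = lines[i + 1] if i + 1 < len(lines) else ""
        next_is_row = next_line.lstrip().startswith("|")
        # The last row of a table must be followed immediately by "Fuente:".
        if is_row and not next_is_row and not next_line.startswith("Fuente:"):
            missing.append(i + 1)
    return missing
```

A blank line between the table and its `Fuente:` line is reported, since that is the case the rule above warns about.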
### Mermaid Diagrams
- **All diagrams must be in Mermaid format** (no external images for flowcharts/charts)
- All Mermaid diagrams must use the UNIR color theme
- List of incorrect values found (with file:line references)
- Formatting issues detected
- Specific corrections needed
- Overall documentation health assessment
5. **Language**: All docs/* files must be in Spanish. README.md and CLAUDE.md can be in English.
6. **Audit Run (repeatable process)**:
   - Validate each Mermaid diagram that contains numbers against its stated source (CSV or metrics file).
   - Confirm every figure/table that includes metrics has a valid `*Fuente:*` line pointing to:
     - `src/results/*.csv`, `src/results/correlations/*.csv`, or `docs/metrics/*.md`, or
     - External sources listed in `docs/07_anexo_a.md`.
   - Record any missing or mismatched sources before making edits.
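
The source check in the audit run could be partly automated along these lines. This is only a sketch: it assumes that local source paths are written in backticks on the `Fuente:` line, which the guidelines do not specify, and the helper names are hypothetical. Lines it flags are not necessarily wrong, because they may cite an external source listed in `docs/07_anexo_a.md`; they simply need a manual look.

```python
import re
from fnmatch import fnmatchcase

# Local source locations named in the audit instructions.
ALLOWED_PATTERNS = (
    "src/results/*.csv",
    "src/results/correlations/*.csv",
    "docs/metrics/*.md",
)


def is_allowed_source(path):
    """True if `path` matches one of the allowed local source patterns."""
    return any(fnmatchcase(path, pattern) for pattern in ALLOWED_PATTERNS)


def lines_needing_review(markdown_text):
    """Return (line_number, line) for 'Fuente:' lines citing no allowed local path."""
    flagged = []
    for number, line in enumerate(markdown_text.splitlines(), start=1):
        if "Fuente:" not in line:
            continue
        # Assumption: paths are wrapped in backticks on the source line.
        paths = re.findall(r"`([^`]+)`", line)
        if not any(is_allowed_source(p) for p in paths):
            flagged.append((number, line))
    return flagged
```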
## Writing Style (Required)
- Use fluent Spanish with standard punctuation; avoid long dashes.
- Prefer commas, semicolons, or short sentences over em dashes.
- Keep paragraphs concise and clear; avoid overly long sentences.
## Data Integrity (Required)
- Do not invent or estimate values. Every numeric claim must be sourced from `src/results/*.csv`, `docs/metrics/*.md`, or external documentation explicitly listed in `docs/07_anexo_a.md`.
- If a value is not present in those sources, remove it or mark it as unknown and request clarification.
- Source of truth for OCR metrics in `docs/00-07`: use `docs/metrics/*.md` for both "Resultados del Subconjunto de Ajuste" and "Evaluación del Dataset Completo", and `src/results/*.csv` for tuning subset values referenced by those sections.
## CSV Verification (Required)
Use the CSVs to validate best-trial values and to confirm that tuning-only figures are not confused with full-dataset results.
### Interpretation Rules
- The CSVs are from tuning on pages 5-10, not the full 45-page dataset.
- Values like “best trial CER” and “best trial WER” must match the CSVs.
- Full-dataset metrics must be sourced elsewhere and clearly labeled as full evaluation.
- `src/raytune_paddle_subproc_results_20251207_192320.csv` is CPU-only timing reference; do not use it for accuracy claims.
- GPU results are the primary research driver. CPU results are only used to illustrate timing without GPU.
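
Checking a documented best-trial value against a tuning CSV might look like the sketch below. The column name `cer` and the helper names are assumptions; the real column headers in `src/results/*.csv` should be confirmed before use. The result only describes the tuning subset (pages 5-10), never the full dataset.

```python
import csv
import math


def best_trial(csv_lines, metric="cer"):
    """Return the row (as a dict) with the lowest value in column `metric`.

    `csv_lines` is any iterable of CSV text lines, e.g. an open file.
    The column name "cer" is an assumption about the CSV layout.
    """
    rows = [row for row in csv.DictReader(csv_lines) if row.get(metric)]
    if not rows:
        raise ValueError(f"no rows with a value for column {metric!r}")
    return min(rows, key=lambda row: float(row[metric]))


def agrees(csv_value, documented_value, tolerance=1e-4):
    """True if the documented number matches the CSV value within `tolerance`."""
    return math.isclose(float(csv_value), float(documented_value),
                        rel_tol=0.0, abs_tol=tolerance)
```

In practice the function would be called with an open file, for example `with open(path, newline="") as f: best_trial(f)`, and any mismatch recorded rather than silently corrected. If the documents report rounded percentages, `tolerance` should be widened to match the rounding used.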