Documentation review and data consistency.
Some checks failed
build_docker / essential (push) Successful in 0s
build_docker / build_paddle_ocr (push) Successful in 4m57s
build_docker / build_raytune (push) Has been cancelled
build_docker / build_easyocr_gpu (push) Has been cancelled
build_docker / build_doctr (push) Has been cancelled
build_docker / build_doctr_gpu (push) Has been cancelled
build_docker / build_paddle_ocr_gpu (push) Has been cancelled
build_docker / build_easyocr (push) Has been cancelled
51
AGENTS.md
Normal file
@@ -0,0 +1,51 @@
# Repository Guidelines
## Project Structure & Module Organization
- `docs/`: Thesis chapters 00-07 in Spanish (UNIR structure). Edit these for narrative changes.
- `src/`: OCR tuning code, services, notebooks, and results. Key subfolders: `raytune/`, `paddle_ocr/`, `doctr_service/`, `easyocr_service/`, `results/`, `results/correlations/`.
- `instructions/`: UNIR template and writing rules (`plantilla_individual.htm` is the styling source of truth).
- `thesis_output/`: Generated thesis HTML and figures (do not edit by hand).
- Root scripts: `generate_mermaid_figures.py` (Mermaid to PNG) and `apply_content.py` (template assembly).
- Temporary scripts go in `tem/scripts/`.
## Build, Test, and Development Commands
- `source .venv/bin/activate`: activate the virtual environment before installing or running Python tools.
- `npm install`: install Mermaid CLI (`node_modules/.bin/mmdc`) for figure generation.
- `python3 generate_mermaid_figures.py`: write PNGs to `thesis_output/figures/` from `docs/*.md`.
- `python3 apply_content.py`: generate `thesis_output/plantilla_individual.htm` from `docs/` + `instructions/`.
- `jupyter notebook src/prepare_dataset.ipynb`: prepare OCR dataset from PDFs.
- `jupyter notebook src/paddle_ocr_fine_tune_unir_raytune.ipynb`: run the main tuning experiment.
- Docker tuning (GPU):
  - `docker compose -f src/docker-compose.tuning.paddle.yml up -d paddle-ocr-gpu`
  - `docker compose -f src/docker-compose.tuning.paddle.yml run raytune --service paddle --samples 64`
  - `docker compose -f src/docker-compose.tuning.paddle.yml down`
- Use `.claude/commands/word-generation.md` to regenerate the thesis output.
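
The two root scripts listed above can be chained from a single helper. The following is a minimal sketch, not part of the repository: the function name `run_build`, the `dry_run` flag, and the ordering (figures first, then template assembly) are assumptions made for illustration.

```python
import subprocess
from pathlib import Path

# Commands quoted from the list above. Running figure generation before
# template assembly is an assumption, not something the guidelines state.
BUILD_STEPS = [
    ["python3", "generate_mermaid_figures.py"],
    ["python3", "apply_content.py"],
]


def run_build(repo_root=".", dry_run=False):
    """Run each build step from repo_root and return the commands handled."""
    handled = []
    for cmd in BUILD_STEPS:
        print("$ " + " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError at the first failing step.
            subprocess.run(cmd, cwd=Path(repo_root), check=True)
        handled.append(" ".join(cmd))
    return handled
```

Calling `run_build(dry_run=True)` only prints the commands, which is a cheap way to confirm the sequence before running it inside the activated virtual environment.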
## Coding Style & Naming Conventions
- Python: PEP 8, 4-space indentation, `snake_case`.
- Notebooks live in `src/` and should keep execution order clean when committed.
- Documentation in `docs/` is Spanish; code comments stay in English.
## Data, Documentation, and Formatting Rules
- Run `.claude/commands/documentation-review.md` before editing chapters 00-07 in `docs/`.
- Do not invent numbers. Every numeric claim must come from `src/results/*.csv`, `src/results/correlations/*.csv`, `docs/metrics/*.md`, or external sources listed in `docs/07_anexo_a.md`.
- Tables and figures must use UNIR caption format: `**Tabla N.** *Título.*` / `**Figura N.** *Título.*` plus `*Fuente: ...*`.
- Mermaid diagrams require YAML frontmatter with a quoted `title:` and UNIR theme variables.
- Use full repository links in `*Fuente:*` lines, e.g. `https://seryus.ddns.net/unir/MastersThesis/src/branch/main/docs/metrics/metrics.md`.
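
The caption rule above lends itself to a quick mechanical check. The sketch below is an illustration under assumptions, not an existing tool in the repository: the names `CAPTION_RE`, `SOURCE_RE`, and `caption_report` are hypothetical, and because the guidelines do not say where the `*Fuente: ...*` line sits relative to its caption, the check only compares how many of each appear.

```python
import re

# Caption line, e.g. **Tabla 3.** *Some title.* or **Figura 3.** *Some title.*
CAPTION_RE = re.compile(r"^\*\*(Tabla|Figura) (\d+)\.\*\* \*(.+)\*$")
# Source line, e.g. *Fuente: ...*
SOURCE_RE = re.compile(r"^\*Fuente: .+\*$")


def caption_report(markdown_text):
    """Collect captions and count source lines in a Markdown document."""
    captions = []
    sources = 0
    for raw_line in markdown_text.splitlines():
        line = raw_line.strip()
        match = CAPTION_RE.match(line)
        if match:
            captions.append((match.group(1), int(match.group(2))))
        elif SOURCE_RE.match(line):
            sources += 1
    return {
        "captions": captions,
        "sources": sources,
        # Every caption is expected to come with one source line.
        "balanced": len(captions) == sources,
    }
```

A report with `"balanced": False` points at a caption that is missing its source line, or the reverse; it does not verify that the cited link is a full repository URL.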
## Testing Guidelines
- There are no automated tests. Validate changes by running a small tuning run and checking the CSV output in `src/results/`.
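
Checking the CSV output can be partly scripted. This is a rough sketch under assumptions: `summarize_results` and `empty_results` are hypothetical helpers, and nothing is assumed about the column names the tuning run writes, so only column and row counts are reported.

```python
import csv
from pathlib import Path


def summarize_results(results_dir="src/results"):
    """Map each CSV file name in results_dir to (column_count, data_row_count)."""
    summary = {}
    for path in sorted(Path(results_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.reader(handle))
        header = rows[0] if rows else []
        # The first row is treated as the header; the rest are data rows.
        summary[path.name] = (len(header), max(len(rows) - 1, 0))
    return summary


def empty_results(summary):
    """Return the names of files that have no header or no data rows."""
    return [name for name, (cols, count) in summary.items()
            if cols == 0 or count == 0]
```

After a small tuning run, `empty_results(summarize_results())` should return an empty list if every CSV in `src/results/` has a header and at least one data row. It says nothing about whether the values themselves are plausible.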
## Commit & Pull Request Guidelines
- Commit messages are short, sentence case, and may include a tracker reference in parentheses.
- Keep commits focused; mention generated outputs (figures, HTML) when relevant.
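
The commit message convention above could be approximated with a small check. The sketch below is illustrative only: the 72-character limit and the shape of a tracker reference are assumptions (the guidelines only say "short" and "in parentheses"), and `looks_like_valid_subject` is a hypothetical name.

```python
import re

MAX_LENGTH = 72  # assumed limit; the guideline only says "short"
# Optional trailing reference in parentheses; its exact format is assumed.
TRACKER_RE = re.compile(r"\s*\([^()]+\)\s*$")


def looks_like_valid_subject(subject):
    """Heuristic check for a short, sentence-case commit subject."""
    subject = subject.strip()
    text = TRACKER_RE.sub("", subject)
    if not text or len(subject) > MAX_LENGTH:
        return False
    if not text[0].isupper() or text.isupper():
        return False
    # Reject Title Case for multi-word subjects; a single word is fine.
    if len(text.split()) > 1 and text.istitle():
        return False
    return True
```

For example, the subject of this commit, "Documentation review and data consistency.", passes the check, while an all-lowercase or Title Case subject does not.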
## Agent-Specific Notes
- Follow `claude.md` for thesis-specific constraints and templates.