diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..b585786
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1 @@
+~$*.docx
diff --git a/README.md b/README.md
index b5d0e0f..6bf17d6 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,30 @@
-# MastersThesis
+# 🧠 Intelligent OCR System for Scanned PDF Documents
+
+**Master’s Thesis – Software Development Project**  
+**Research line:** Computational Perception & Machine Learning  
+**Author:** Sergio Jiménez   
+**Institution:** [UNIR - Universidad Internacional de La Rioja](https://www.unir.net/ingenieria/master-inteligencia-artificial/)  
+**Date:** 2025  
+
+---
+
+## 📘 Overview
+
+This project develops an **intelligent system for text extraction from scanned PDF documents**, combining **computer vision techniques** and **modern OCR models based on deep learning**.  
+The goal is to overcome the limitations of traditional OCR tools (e.g., Tesseract) when dealing with **low-quality, skewed, or noisy scanned documents**, particularly in **Spanish**.
+
+---
+
+## 🎯 Objectives
+
+- Develop a **modular OCR pipeline** that processes scanned PDFs end-to-end.
+- Compare classical OCR tools with **state-of-the-art deep learning approaches** (EasyOCR, TrOCR, CRNN).
+- Evaluate performance using **Character Error Rate (CER)** and **Word Error Rate (WER)**.
+- Provide a **CLI-based demonstration tool** and analysis module for automated evaluation.
+
+---
+
+## 🧩 System Architecture
+
+TODO
 
diff --git a/thesis_report.docx b/thesis_report.docx
new file mode 100644
index 0000000..16a9ac5
Binary files /dev/null and b/thesis_report.docx differ