2026-01-17 10:46:36 +01:00
parent 27609a0ed0
commit 78fe3e8c81
5 changed files with 358 additions and 45 deletions


@@ -66,8 +66,10 @@ docker compose up ocr-cpu
| `dataset_manager.py` | Dataset loader |
| `test.py` | API test client |
| `Dockerfile.cpu` | CPU-only image (multi-arch) |
| `Dockerfile.gpu` | GPU/CUDA image (x86_64 + ARM64 with local wheel) |
| `Dockerfile.build-paddle` | PaddlePaddle GPU wheel builder for ARM64 |
| `docker-compose.yml` | Service orchestration |
| `wheels/` | Local PaddlePaddle wheels (created by build-paddle) |
## API Endpoints
@@ -147,54 +149,172 @@ docker run -d -p 8000:8000 --gpus all \
paddle-ocr-api:gpu
```
## GPU Support Analysis
### Host System Reference (DGX Spark)
This section documents GPU support findings based on testing on an NVIDIA DGX Spark (ARM64 Grace CPU + NVIDIA GPU):
| Component | Value |
|-----------|-------|
| Architecture | ARM64 (aarch64) |
| CPU | NVIDIA Grace (ARM) |
| GPU | NVIDIA GB10 |
| CUDA Version | 13.0 |
| Driver | 580.95.05 |
| OS | Ubuntu 24.04 LTS |
| Container Toolkit | nvidia-container-toolkit 1.18.1 |
| Docker | 28.5.1 |
| Docker Compose | v2.40.0 |
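For reference, these values can be collected on the host with standard tooling (a sketch; exact flags and output formats vary by driver and distro):
```bash
# Gather the host details listed above
uname -m                                                          # architecture (aarch64)
nvidia-smi --query-gpu=name,driver_version --format=csv,noheader  # GPU and driver
nvidia-smi | grep "CUDA Version"                                  # CUDA version reported by the driver
lsb_release -ds                                                   # OS release
nvidia-ctk --version                                              # container toolkit version
docker --version && docker compose version
```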
### PaddlePaddle GPU Platform Support
**Critical Finding:** PaddlePaddle-GPU does **NOT** support ARM64/aarch64 architecture.
| Platform | CPU | GPU |
|----------|-----|-----|
| Linux x86_64 | ✅ | ✅ CUDA 10.2/11.x/12.x |
| Windows x64 | ✅ | ✅ CUDA 10.2/11.x/12.x |
| macOS x64 | ✅ | ❌ |
| macOS ARM64 (M1/M2) | ✅ | ❌ |
| Linux ARM64 (Jetson/DGX) | ✅ | ❌ No wheels |
**Source:** [PaddlePaddle-GPU PyPI](https://pypi.org/project/paddlepaddle-gpu/) - only `manylinux_x86_64` and `win_amd64` wheels available.
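The missing wheels can be confirmed from any machine by asking pip for an aarch64 binary distribution (a sketch; the plain `pip install` on an ARM64 host fails the same way):
```bash
# Ask pip for an ARM64 (aarch64) binary wheel only; no source builds allowed.
# This fails with "No matching distribution found for paddlepaddle-gpu"
# because no aarch64 wheels are published on PyPI.
pip download paddlepaddle-gpu --no-deps --only-binary=:all: \
  --platform manylinux2014_aarch64 -d /tmp/paddle-check
```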
### Why GPU Doesn't Work on ARM64
1. **No prebuilt wheels**: `pip install paddlepaddle-gpu` fails on ARM64 - no compatible wheels exist
2. **Not a CUDA issue**: The NVIDIA CUDA base images work fine on ARM64 (`nvidia/cuda:12.4.1-cudnn-runtime-ubuntu22.04`)
3. **Not a container toolkit issue**: `nvidia-container-toolkit` is installed and functional
4. **PaddlePaddle limitation**: The Paddle team hasn't compiled GPU wheels for ARM64
When you run `pip install paddlepaddle-gpu` on ARM64:
```
ERROR: No matching distribution found for paddlepaddle-gpu
```
### Options for ARM64 Systems
#### Option 1: CPU-Only (Recommended)
Use `Dockerfile.cpu` which works on ARM64:
```bash
# On DGX Spark or ARM64 machine
docker compose up ocr-cpu
# Or build directly
docker build -f Dockerfile.cpu -t paddle-ocr-api:cpu .
```
**Performance:** CPU inference on ARM64 Grace is surprisingly fast due to high core count. Expect ~2-5 seconds per page.
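A rough way to check the per-page latency claim against the running service (the endpoint path and form field below are assumptions; adjust to the actual API routes):
```bash
# Time a single OCR request against the local API (hypothetical /ocr endpoint)
time curl -s -X POST http://localhost:8000/ocr \
  -F "file=@../dataset/sample_page.png" -o /dev/null
```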
#### Option 2: Build PaddlePaddle from Source (Docker-based)
Use the included Docker builder to compile PaddlePaddle GPU for ARM64:
```bash
cd src/paddle_ocr
# Step 1: Build the PaddlePaddle GPU wheel (one-time, 2-4 hours)
docker compose --profile build run --rm build-paddle
# Verify wheel was created
ls -la wheels/paddlepaddle*.whl
# Step 2: Build the GPU image (uses local wheel)
docker compose build ocr-gpu
# Step 3: Run with GPU
docker compose up ocr-gpu
# Verify GPU is working
docker compose exec ocr-gpu python -c "import paddle; print(paddle.device.is_compiled_with_cuda())"
```
**What this does:**
1. `build-paddle` compiles PaddlePaddle from source inside a CUDA container
2. The wheel is saved to `./wheels/` directory
3. `Dockerfile.gpu` detects the local wheel and uses it instead of PyPI
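A minimal sketch of how that fallback can be expressed; the actual logic in `Dockerfile.gpu` may differ:
```bash
# Inside the image build: prefer a locally built wheel, fall back to PyPI (x86_64 only)
if ls /wheels/paddlepaddle*.whl >/dev/null 2>&1; then
    pip install /wheels/paddlepaddle*.whl
else
    pip install paddlepaddle-gpu
fi
```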
**Caveats:**
- Build takes 2-4 hours on first run
- Requires ~20GB disk space during build
- Not officially supported by PaddlePaddle team
- May need adjustments for future PaddlePaddle versions
See: [GitHub Issue #17327](https://github.com/PaddlePaddle/PaddleOCR/issues/17327)
#### Option 3: Alternative OCR Engines
For ARM64 GPU acceleration, consider alternatives:
| Engine | ARM64 GPU | Notes |
|--------|-----------|-------|
| **Tesseract** | ❌ CPU-only | Good fallback, widely available |
| **EasyOCR** | ⚠️ Via PyTorch | PyTorch has ARM64 GPU support |
| **TrOCR** | ⚠️ Via Transformers | Hugging Face Transformers + PyTorch |
| **docTR** | ⚠️ Via TensorFlow/PyTorch | Both backends have ARM64 support |
EasyOCR with PyTorch is a viable alternative:
```bash
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
pip install easyocr
```
### x86_64 GPU Setup (Working)
For x86_64 systems with an NVIDIA GPU, the GPU Docker image works as-is:
```bash
# Verify GPU is accessible
nvidia-smi
# Verify Docker GPU access
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
# Build and run GPU version
docker compose up ocr-gpu
```
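Once the container is up (use a second terminal if `up` is running in the foreground), PaddlePaddle's own self-check verifies that the GPU is actually used; the service name comes from `docker-compose.yml`:
```bash
# Run PaddlePaddle's built-in installation/GPU check inside the running service
docker compose exec ocr-gpu python -c "import paddle; paddle.utils.run_check()"
```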
### GPU Docker Compose Configuration
The `docker-compose.yml` configures GPU access via:
```yaml
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
```
This requires Docker Compose v2 and nvidia-container-toolkit.
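Both prerequisites can be checked quickly (a sketch; output wording differs slightly across versions):
```bash
docker compose version                 # should report Docker Compose version v2.x
nvidia-ctk --version                   # present if nvidia-container-toolkit is installed
docker info --format '{{.Runtimes}}'   # should include an "nvidia" runtime
```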
## DGX Spark / ARM64 Quick Start
For ARM64 systems (DGX Spark, Jetson, Graviton), use the CPU-only image:
```bash
cd src/paddle_ocr
# Build ARM64-native CPU image
docker build -f Dockerfile.cpu -t paddle-ocr-api:arm64 .
```
For GPU acceleration on ARM64, the base image in `Dockerfile.gpu` does not need to change: the CUDA base image is multi-arch, so the same tag resolves to an ARM64 variant when pulled on an ARM machine. The missing piece is the PaddlePaddle GPU wheel, which must be built locally first (see Option 2 above).
```dockerfile
# Base image in Dockerfile.gpu (multi-arch: works on x86_64 and ARM64)
FROM nvcr.io/nvidia/cuda:12.4.1-cudnn-runtime-ubuntu22.04
```
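The multi-arch claim can be checked by inspecting the published manifest (assumes network access to nvcr.io):
```bash
# List the architectures published for the CUDA base image
docker manifest inspect nvcr.io/nvidia/cuda:12.4.1-cudnn-runtime-ubuntu22.04 \
  | grep '"architecture"'
```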
Then build the GPU image on the DGX Spark (with the wheel from Option 2 in `./wheels/`):
```bash
docker build -f Dockerfile.gpu -t paddle-ocr-api:gpu-arm64 .
```
Run and test the ARM64-native image:
```bash
# Run
docker run -d -p 8000:8000 \
  -v $(pwd)/../dataset:/app/dataset:ro \
  paddle-ocr-api:arm64
# Test
curl http://localhost:8000/health
```
### x86_64 Emulation via QEMU (Slow)
You CAN run x86_64 images on ARM via emulation, but it's ~10-20x slower:
```bash
# On DGX Spark, enable QEMU emulation
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
# Run an x86_64 image with emulation
docker run --platform linux/amd64 -p 8000:8000 \
  -v $(pwd)/../dataset:/app/dataset:ro \
  paddle-ocr-api:cpu
```
**Not recommended** for production due to the severe performance penalty.
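If emulation is needed anyway, this confirms the binfmt handler registered by the setup step above (a sketch; the path can vary by distro):
```bash
# qemu-x86_64 should be listed once registration succeeded
ls /proc/sys/fs/binfmt_misc/ | grep qemu-x86_64
```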
### Cross-Compile from x86_64
Build ARM64 images from an x86_64 machine:
```bash
# Setup buildx for multi-arch
@@ -209,6 +329,7 @@ docker buildx build -f Dockerfile.cpu \
# Save and transfer to DGX Spark
docker save paddle-ocr-api:arm64 | gzip > paddle-ocr-arm64.tar.gz
scp paddle-ocr-arm64.tar.gz dgx-spark:~/
# On DGX Spark:
docker load < paddle-ocr-arm64.tar.gz
```
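After loading, the image architecture can be confirmed before starting the service:
```bash
# Expect "arm64"
docker image inspect paddle-ocr-api:arm64 --format '{{.Architecture}}'
```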