gpu dgx
@@ -66,8 +66,10 @@ docker compose up ocr-cpu
|
||||
| `dataset_manager.py` | Dataset loader |
|
||||
| `test.py` | API test client |
|
||||
| `Dockerfile.cpu` | CPU-only image (multi-arch) |
|
||||
| `Dockerfile.gpu` | GPU/CUDA image (x86_64) |
|
||||
| `Dockerfile.gpu` | GPU/CUDA image (x86_64 + ARM64 with local wheel) |
|
||||
| `Dockerfile.build-paddle` | PaddlePaddle GPU wheel builder for ARM64 |
|
||||
| `docker-compose.yml` | Service orchestration |
|
||||
| `wheels/` | Local PaddlePaddle wheels (created by build-paddle) |
|
||||
|
||||
## API Endpoints
|
||||
|
||||
@@ -147,54 +149,172 @@ docker run -d -p 8000:8000 --gpus all \
|
||||
paddle-ocr-api:gpu
|
||||
```
|
||||
|
||||
## DGX Spark (ARM64 + CUDA)
|
||||
## GPU Support Analysis
|
||||
|
||||
DGX Spark uses ARM64 (Grace CPU) with an NVIDIA Blackwell GPU (GB10). You have two options:
|
||||
### Host System Reference (DGX Spark)
|
||||
|
||||
### Option 1: Native ARM64 Build (Recommended)
|
||||
This section documents GPU support findings based on testing on an NVIDIA DGX Spark:
|
||||
|
||||
PaddlePaddle has ARM64 support. Build natively:
|
||||
| Component | Value |
|-----------|-------|
| Architecture | ARM64 (aarch64) |
| CPU | NVIDIA Grace (ARM) |
| GPU | NVIDIA GB10 |
| CUDA Version | 13.0 |
| Driver | 580.95.05 |
| OS | Ubuntu 24.04 LTS |
| Container Toolkit | nvidia-container-toolkit 1.18.1 |
| Docker | 28.5.1 |
| Docker Compose | v2.40.0 |
|
||||
|
||||
### PaddlePaddle GPU Platform Support
|
||||
|
||||
**Critical Finding:** PaddlePaddle-GPU ships **no** prebuilt wheels for ARM64/aarch64; GPU support on this architecture requires building from source.
|
||||
|
||||
| Platform | CPU | GPU |
|----------|-----|-----|
| Linux x86_64 | ✅ | ✅ CUDA 10.2/11.x/12.x |
| Windows x64 | ✅ | ✅ CUDA 10.2/11.x/12.x |
| macOS x64 | ✅ | ❌ |
| macOS ARM64 (M1/M2) | ✅ | ❌ |
| Linux ARM64 (Jetson/DGX) | ✅ | ❌ No wheels |
|
||||
|
||||
**Source:** [PaddlePaddle-GPU PyPI](https://pypi.org/project/paddlepaddle-gpu/) - only `manylinux_x86_64` and `win_amd64` wheels available.
|
||||
|
||||
### Why GPU Doesn't Work on ARM64
|
||||
|
||||
1. **No prebuilt wheels**: `pip install paddlepaddle-gpu` fails on ARM64 - no compatible wheels exist
2. **Not a CUDA issue**: The NVIDIA CUDA base images work fine on ARM64 (`nvidia/cuda:12.4.1-cudnn-runtime-ubuntu22.04`)
3. **Not a container toolkit issue**: `nvidia-container-toolkit` is installed and functional
4. **PaddlePaddle limitation**: The Paddle team hasn't compiled GPU wheels for ARM64
|
||||
|
||||
When you run `pip install paddlepaddle-gpu` on ARM64:
|
||||
```
ERROR: No matching distribution found for paddlepaddle-gpu
```
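
A wrapper script could account for this by choosing the compose service from the host architecture. The sketch below is illustrative only: the function name `select_ocr_service` is hypothetical, it relies on the `ocr-cpu` and `ocr-gpu` service names used in this README, and it assumes that an x86_64 host has an NVIDIA GPU available.

```bash
# Hypothetical helper: map a CPU architecture to a compose service name.
# $1 = architecture string (defaults to the output of `uname -m`).
select_ocr_service() {
  local arch="${1:-$(uname -m)}"

  case "$arch" in
    x86_64|amd64)
      # Prebuilt paddlepaddle-gpu wheels exist for x86_64.
      echo "ocr-gpu"
      ;;
    *)
      # ARM64 and anything else: no prebuilt GPU wheel, fall back to CPU.
      echo "ocr-cpu"
      ;;
  esac
}
```

It could then be used as `docker compose up "$(select_ocr_service)"`.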
|
||||
|
||||
### Options for ARM64 Systems
|
||||
|
||||
#### Option 1: CPU-Only (Recommended)
|
||||
|
||||
Use `Dockerfile.cpu` which works on ARM64:
|
||||
|
||||
```bash
|
||||
# On DGX Spark or ARM64 machine
|
||||
# On DGX Spark
|
||||
docker compose up ocr-cpu
|
||||
|
||||
# Or build directly
|
||||
docker build -f Dockerfile.cpu -t paddle-ocr-api:cpu .
|
||||
```
|
||||
|
||||
**Performance:** CPU inference on ARM64 Grace is surprisingly fast due to high core count. Expect ~2-5 seconds per page.
|
||||
|
||||
#### Option 2: Build PaddlePaddle from Source (Docker-based)
|
||||
|
||||
Use the included Docker builder to compile PaddlePaddle GPU for ARM64:
|
||||
|
||||
```bash
cd src/paddle_ocr

# Step 1: Build the PaddlePaddle GPU wheel (one-time, 2-4 hours)
docker compose --profile build run --rm build-paddle

# Verify wheel was created
ls -la wheels/paddlepaddle*.whl

# Step 2: Build the GPU image (uses local wheel)
docker compose build ocr-gpu

# Step 3: Run with GPU (detached, so the check below can run in the same shell)
docker compose up -d ocr-gpu

# Verify the installed wheel was compiled with CUDA support (should print True)
docker compose exec ocr-gpu python -c "import paddle; print(paddle.device.is_compiled_with_cuda())"
```
|
||||
|
||||
**What this does:**
1. `build-paddle` compiles PaddlePaddle from source inside a CUDA container
2. The wheel is saved to `./wheels/` directory
3. `Dockerfile.gpu` detects the local wheel and uses it instead of PyPI
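
The wheel detection in step 3 can be pictured as the shell logic below. This is a sketch of the idea, not the actual contents of `Dockerfile.gpu`: the function name `paddle_install_target` is hypothetical, and the fallback package name `paddlepaddle-gpu` is the PyPI package referenced earlier in this README.

```bash
# Hypothetical helper: print the argument to hand to `pip install`.
# $1 = directory that may contain a locally built wheel (default: wheels).
paddle_install_target() {
  local wheel_dir="${1:-wheels}"
  local wheel

  for wheel in "$wheel_dir"/paddlepaddle*.whl; do
    # An unmatched glob stays literal, so confirm the file really exists.
    if [ -f "$wheel" ]; then
      echo "$wheel"
      return 0
    fi
  done

  # No local wheel found: fall back to the package on PyPI.
  echo "paddlepaddle-gpu"
}
```

A build step could then run `pip install "$(paddle_install_target wheels)"`.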
|
||||
|
||||
**Caveats:**
- Build takes 2-4 hours on first run
- Requires ~20GB disk space during build
- Not officially supported by PaddlePaddle team
- May need adjustments for future PaddlePaddle versions
|
||||
|
||||
See: [GitHub Issue #17327](https://github.com/PaddlePaddle/PaddleOCR/issues/17327)
|
||||
|
||||
#### Option 3: Alternative OCR Engines
|
||||
|
||||
For ARM64 GPU acceleration, consider alternatives:
|
||||
|
||||
| Engine | ARM64 GPU | Notes |
|--------|-----------|-------|
| **Tesseract** | ❌ CPU-only | Good fallback, widely available |
| **EasyOCR** | ⚠️ Via PyTorch | PyTorch has ARM64 GPU support |
| **TrOCR** | ⚠️ Via Transformers | Hugging Face Transformers + PyTorch |
| **docTR** | ⚠️ Via TensorFlow/PyTorch | Both backends have ARM64 support |
|
||||
|
||||
EasyOCR with PyTorch is a viable alternative:
|
||||
```bash
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
pip install easyocr
```
|
||||
|
||||
### x86_64 GPU Setup (Working)
|
||||
|
||||
For x86_64 systems with NVIDIA GPU, the GPU Docker works:
|
||||
|
||||
```bash
# Verify GPU is accessible
nvidia-smi

# Verify Docker GPU access (CUDA image tags need the full version and OS suffix)
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi

# Build and run GPU version
docker compose up ocr-gpu
```
|
||||
|
||||
### GPU Docker Compose Configuration
|
||||
|
||||
The `docker-compose.yml` configures GPU access via:
|
||||
|
||||
```yaml
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu]
```
|
||||
|
||||
This requires Docker Compose v2 and nvidia-container-toolkit.
|
||||
|
||||
## DGX Spark / ARM64 Quick Start
|
||||
|
||||
For ARM64 systems (DGX Spark, Jetson, Graviton), use CPU-only:
|
||||
|
||||
```bash
|
||||
cd src/paddle_ocr
|
||||
|
||||
# Build ARM64-native CPU image
|
||||
docker build -f Dockerfile.cpu -t paddle-ocr-api:arm64 .
|
||||
```
|
||||
|
||||
For GPU acceleration on ARM64, you'll need to modify `Dockerfile.gpu` to use ARM-compatible base image:
|
||||
|
||||
```dockerfile
|
||||
# Change this line in Dockerfile.gpu:
|
||||
FROM nvcr.io/nvidia/cuda:12.4.1-cudnn-runtime-ubuntu22.04
|
||||
|
||||
# To ARM64-compatible version:
|
||||
FROM nvcr.io/nvidia/cuda:12.4.1-cudnn-runtime-ubuntu22.04
|
||||
# (same image works on ARM64 when pulled on ARM machine)
|
||||
```
|
||||
|
||||
Then build on the DGX Spark:
|
||||
```bash
|
||||
docker build -f Dockerfile.gpu -t paddle-ocr-api:gpu-arm64 .
|
||||
```
|
||||
|
||||
### Option 2: x86_64 Emulation via QEMU (Slow)
|
||||
|
||||
You CAN run x86_64 images on ARM via emulation, but it's ~10-20x slower:
|
||||
|
||||
```bash
|
||||
# On DGX Spark, enable QEMU emulation
|
||||
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
|
||||
|
||||
# Run x86_64 image with emulation
|
||||
docker run --platform linux/amd64 -p 8000:8000 \
|
||||
# Run
|
||||
docker run -d -p 8000:8000 \
|
||||
-v $(pwd)/../dataset:/app/dataset:ro \
|
||||
paddle-ocr-api:cpu
|
||||
paddle-ocr-api:arm64
|
||||
|
||||
# Test
|
||||
curl http://localhost:8000/health
|
||||
```
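
Because the container is started detached, the health endpoint may not answer right away. A small polling loop can cover that gap. The sketch below is an assumption-laden example: the function name `wait_for_health` and its default of 30 attempts with a 2 second delay are hypothetical, and only the `/health` URL comes from this README.

```bash
# Hypothetical helper: poll a URL until it answers or attempts run out.
# $1 = URL, $2 = max attempts (default 30), $3 = delay in seconds (default 2).
wait_for_health() {
  local url="$1"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=1

  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP error responses (4xx/5xx).
    if curl -fsS -o /dev/null "$url"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done

  return 1
}
```

For example: `wait_for_health http://localhost:8000/health && echo "API is up"`.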
|
||||
|
||||
**Not recommended** for production due to severe performance penalty.
|
||||
### Cross-Compile from x86_64
|
||||
|
||||
### Option 3: Cross-compile from x86_64
|
||||
|
||||
Build ARM64 images from your x86_64 machine:
|
||||
Build ARM64 images from an x86_64 machine:
|
||||
|
||||
```bash
|
||||
# Setup buildx for multi-arch
|
||||
@@ -209,6 +329,7 @@ docker buildx build -f Dockerfile.cpu \
|
||||
# Save and transfer to DGX Spark
|
||||
docker save paddle-ocr-api:arm64 | gzip > paddle-ocr-arm64.tar.gz
|
||||
scp paddle-ocr-arm64.tar.gz dgx-spark:~/
|
||||
|
||||
# On DGX Spark:
|
||||
docker load < paddle-ocr-arm64.tar.gz
|
||||
```
|
||||
|
||||