# RT-DETR Training for OAK-D Camera Deployment

RT-DETR (Real-Time Detection Transformer) is Apache 2.0 licensed - **free for commercial use**. It's designed for real-time detection and works great on edge devices like the OAK-D 4 Pro.

## Why RT-DETR?

- ✅ **Apache 2.0 license** - truly free for commercial use
- ✅ **Excellent OAK camera compatibility** - exports cleanly to OpenVINO
- ✅ **Real-time performance** - 30-60 FPS on OAK-D 4 Pro
- ✅ **Modern transformer architecture** - accuracy competitive with YOLO
- ✅ **Easy deployment** - direct export to OpenVINO format

## Quick Start

### 1. Annotate Images

Use the annotation GUI:

```bash
.venv/bin/python annotation_gui.py
```

- Load your images from Settings
- Annotate knots manually or use auto-labeling
- Aim for 100+ annotated images for good results

### 2. Train Model

From the GUI:

1. Go to the **Training** tab
2. Click "Prepare Dataset" (creates train/valid/test splits)
3. Select the **RT-DETR** framework
4. Choose a model size:
   - `nano` (r18): Fastest, 30-40 FPS on OAK
   - `small` (r34): Balanced
   - `medium` (r50): More accurate
   - `base` (l): Best accuracy, slower
5. Click "Start Training"

Or from the command line:

```bash
.venv/bin/python train_rtdetr.py \
    --dataset-dir dataset_prepared \
    --model rtdetr-r18 \
    --epochs 100 \
    --batch-size 8
```

### 3. Test Model

```bash
.venv/bin/python predict_rtdetr.py \
    --weights runs/rtdetr_training/training/weights/best.pt \
    --image test_image.jpg
```

### 4. Export for OAK-D

Export to OpenVINO format:

```bash
.venv/bin/python export_rtdetr_oak.py \
    --weights runs/rtdetr_training/training/weights/best.pt \
    --img-size 640
```

This creates:

- `best_openvino_model/` - OpenVINO IR format (.xml + .bin files)
- `best.onnx` - ONNX format (intermediate)

### 5. Convert to Blob for OAK

**Option A: Online converter** (easiest)

1. Go to https://blobconverter.luxonis.com/
2. Upload `best_openvino_model/model.xml`
3. Select "OAK-D 4 Pro"
4. Download the `.blob` file

**Option B: Command line**

```bash
pip install blobconverter
blobconverter --openvino-xml best_openvino_model/model.xml \
    --shaves 6
```

### 6. Deploy to OAK-D Camera

Example DepthAI script:

```python
import depthai as dai
import cv2

# Create pipeline
pipeline = dai.Pipeline()

# Camera: preview size must match the model input (640x640)
cam = pipeline.createColorCamera()
cam.setPreviewSize(640, 640)
cam.setInterleaved(False)

# Neural network: loads the blob produced in step 5
nn = pipeline.createNeuralNetwork()
nn.setBlobPath("best.blob")
cam.preview.link(nn.input)

# Output: stream the network results back to the host
xout = pipeline.createXLinkOut()
xout.setStreamName("detections")
nn.out.link(xout.input)

# Run
with dai.Device(pipeline) as device:
    queue = device.getOutputQueue("detections")

    while True:
        # Blocks until the next result arrives; this is the raw network
        # output, so decoding boxes and scores is model-specific
        detections = queue.get()
        # Process detections...
```

## Model Comparison

| Model | Size | Speed (OAK-D) | Accuracy | License |
|-------|------|---------------|----------|---------|
| RT-DETR r18 | ~15MB | 30-40 FPS | Good | Apache 2.0 ✅ |
| RT-DETR r34 | ~30MB | 20-30 FPS | Better | Apache 2.0 ✅ |
| YOLOv11n | ~6MB | 50-60 FPS | Good | AGPL ❌ |
| YOLOv6n | ~10MB | 40-50 FPS | Good | MIT ✅ |
| RF-DETR nano | ~15MB | 10-20 FPS* | Good | Check repo |

*May have compatibility issues with OpenVINO

## Training Tips

1. **Dataset size**:
   - Minimum: 50 images
   - Good: 200+ images
   - Excellent: 1000+ images

2. **Data diversity**:
   - Different wood types
   - Various lighting conditions
   - Multiple knot sizes/types
   - Different angles

3. **Training settings**:
   - Start with `rtdetr-r18` for fastest iteration
   - Use `--batch-size 8` if your GPU has 8GB+ of memory
   - Train for 100-200 epochs
   - Use early stopping (patience=20)

4. **Data augmentation** (automatic):
   - Flips, rotations
   - Color adjustments
   - Crops and scales

## Troubleshooting

**Training is slow:**

- Reduce batch size
- Use a smaller model (r18)
- Check GPU usage with `nvidia-smi`

**Low accuracy:**

- Add more training data
- Train longer (more epochs)
- Use a larger model (r34 or r50)
- Check your annotations for errors

**OAK deployment fails:**

- Ensure the OpenVINO export succeeded
- Check blob size (<200MB for OAK-D)
- Verify the input size matches training (640x640)
- Try FP16 instead of FP32 to reduce size

## Resources

- [RT-DETR Paper](https://arxiv.org/abs/2304.08069)
- [Ultralytics RT-DETR Docs](https://docs.ultralytics.com/models/rtdetr/)
- [OAK-D Docs](https://docs.luxonis.com/)
- [DepthAI Examples](https://github.com/luxonis/depthai-experiments)

## License

RT-DETR is Apache 2.0 licensed - you can use it for:

- ✅ Personal projects
- ✅ Commercial products
- ✅ Internal business tools
- ✅ Proprietary software

No paid license is required. Apache 2.0 only asks that you keep the license and attribution notices when you redistribute.
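
## Appendix: Pre-deployment Check Sketch

The first two "OAK deployment fails" checks from Troubleshooting (export succeeded, blob under 200MB) can be scripted before copying anything to the camera. The following is a minimal sketch, not part of the project's scripts: `check_export` is a hypothetical helper, the default paths assume the export and conversion steps wrote `best_openvino_model/` and `best.blob` into the current directory, and the 200MB limit is simply the figure quoted in Troubleshooting.

```python
from pathlib import Path

MAX_BLOB_MB = 200  # limit quoted in the Troubleshooting section (assumption)


def check_export(ir_dir, blob_path, max_blob_mb=MAX_BLOB_MB):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    ir_dir, blob_path = Path(ir_dir), Path(blob_path)

    # OpenVINO IR is an .xml file (topology) plus a .bin file (weights)
    # that share the same file stem.
    xml_files = sorted(ir_dir.glob("*.xml")) if ir_dir.is_dir() else []
    if not xml_files:
        problems.append(f"no .xml file found in {ir_dir}")
    for xml in xml_files:
        if not xml.with_suffix(".bin").is_file():
            problems.append(f"{xml.name} has no matching .bin file")

    # The blob must exist and stay under the size limit.
    if not blob_path.is_file():
        problems.append(f"blob not found: {blob_path}")
    else:
        size_mb = blob_path.stat().st_size / (1024 * 1024)
        if size_mb >= max_blob_mb:
            problems.append(f"blob is {size_mb:.1f} MB, limit is {max_blob_mb} MB")

    return problems


if __name__ == "__main__":
    # Hypothetical default locations; adjust to where your files actually are.
    found = check_export("best_openvino_model", "best.blob")
    for problem in found:
        print("FAIL:", problem)
    if not found:
        print("All checks passed")
```

The function returns problems instead of raising, so every failed check is reported in one run. It does not verify the 640x640 input size or the FP16/FP32 precision; those still need to be checked against the export settings.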