# RT-DETR Training for OAK-D Camera Deployment

RT-DETR (Real-Time Detection Transformer) is a transformer-based object detector built for real-time inference. Its original implementation is released under the Apache 2.0 license, which permits **commercial use**, and it is a good fit for edge devices like the OAK-D 4 Pro.

## Why RT-DETR?

- ✅ **Apache 2.0 license** - commercial use is permitted
- ✅ **OAK camera compatibility** - exports directly to OpenVINO format
- ✅ **Real-time performance** - roughly 20-40 FPS on OAK-D, depending on model size (see Model Comparison)
- ✅ **Modern transformer architecture** - accuracy competitive with YOLO
- ✅ **Easy deployment** - the OpenVINO export converts to a `.blob` that runs on the camera

## Quick Start

### 1. Annotate Images

Use the annotation GUI:
```bash
.venv/bin/python annotation_gui.py
```

- Load your images from Settings
- Annotate knots manually or use auto-labeling
- Aim for 100+ annotated images for good results

### 2. Train Model

From the GUI:
1. Go to the **Training** tab
2. Click "Prepare Dataset" (creates train/valid/test splits)
3. Select the **RT-DETR** framework
4. Choose model size:
   - `nano` (r18): Fastest, 30-40 FPS on OAK
   - `small` (r34): Balanced
   - `medium` (r50): More accurate
   - `base` (l): Best accuracy, slower
5. Click "Start Training"

Or from command line:
```bash
.venv/bin/python train_rtdetr.py \
    --dataset-dir dataset_prepared \
    --model rtdetr-r18 \
    --epochs 100 \
    --batch-size 8
```
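
The train/valid/test split created by "Prepare Dataset" can be pictured with the sketch below. It is only an illustration: the function name `split_dataset`, the 70/20/10 ratios, and the fixed seed are assumptions, not the behavior of the GUI or of `train_rtdetr.py`.

```python
import random
from pathlib import Path


def split_dataset(image_paths, ratios=(0.7, 0.2, 0.1), seed=42):
    """Shuffle image paths reproducibly and cut them into three splits.

    Hypothetical helper: the ratios and seed are illustrative defaults.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1.0")
    paths = sorted(image_paths)          # stable starting order
    random.Random(seed).shuffle(paths)   # same seed -> same split
    n_train = int(len(paths) * ratios[0])
    n_valid = int(len(paths) * ratios[1])
    return {
        "train": paths[:n_train],
        "valid": paths[n_train:n_train + n_valid],
        "test": paths[n_train + n_valid:],  # remainder goes to test
    }


if __name__ == "__main__":
    images = [Path(f"images/board_{i:03d}.jpg") for i in range(100)]
    splits = split_dataset(images)
    print({name: len(items) for name, items in splits.items()})
```

Fixing the seed keeps the test split stable between runs, so accuracy numbers from different training runs stay comparable.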

### 3. Test Model

```bash
.venv/bin/python predict_rtdetr.py \
    --weights runs/rtdetr_training/training/weights/best.pt \
    --image test_image.jpg
```

### 4. Export for OAK-D

Export to OpenVINO format:
```bash
.venv/bin/python export_rtdetr_oak.py \
    --weights runs/rtdetr_training/training/weights/best.pt \
    --img-size 640
```

This creates:
- `best_openvino_model/` - OpenVINO IR format (.xml + .bin files)
- `best.onnx` - ONNX format (intermediate)
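
Before moving on, it can help to confirm that the export produced a complete IR pair. The sketch below assumes the `.xml` and `.bin` files share a base name inside `best_openvino_model/`; `find_ir_pairs` is a hypothetical helper, not part of `export_rtdetr_oak.py`.

```python
from pathlib import Path


def find_ir_pairs(export_dir):
    """Return (xml, bin) path pairs found in an OpenVINO export directory."""
    pairs = []
    for xml_path in sorted(Path(export_dir).glob("*.xml")):
        bin_path = xml_path.with_suffix(".bin")  # weights share the base name
        if bin_path.exists():
            pairs.append((xml_path, bin_path))
    return pairs


if __name__ == "__main__":
    pairs = find_ir_pairs("best_openvino_model")
    if not pairs:
        print("No complete .xml/.bin pair found; the export may have failed.")
    for xml_path, bin_path in pairs:
        size_mb = bin_path.stat().st_size / 1e6
        print(f"{xml_path.name} + {bin_path.name} ({size_mb:.1f} MB of weights)")
```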

### 5. Convert to Blob for OAK

**Option A: Online converter** (easiest)
1. Go to https://blobconverter.luxonis.com/
2. Upload `best_openvino_model/model.xml` together with its matching `.bin` weights file
3. Select "OAK-D 4 Pro"
4. Download the `.blob` file

**Option B: Command line**
```bash
pip install blobconverter
.venv/bin/python -m blobconverter \
    --openvino-xml best_openvino_model/model.xml \
    --openvino-bin best_openvino_model/model.bin \
    --shaves 6
```
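
The same conversion can also be scripted through the `blobconverter` Python package. This is a minimal sketch, assuming the weights file sits next to `model.xml` with a `.bin` suffix; `blob_request` and `convert_to_blob` are hypothetical helper names, and the conversion call itself needs network access.

```python
from pathlib import Path


def blob_request(xml_path, shaves=6):
    """Build keyword arguments for blobconverter.from_openvino."""
    xml_path = Path(xml_path)
    return {
        "xml": str(xml_path),
        "bin": str(xml_path.with_suffix(".bin")),  # assumed to sit next to the .xml
        "data_type": "FP16",
        "shaves": shaves,
    }


def convert_to_blob(xml_path, shaves=6):
    """Compile the IR files to a .blob and return the path to it."""
    import blobconverter  # imported lazily; only needed for the actual conversion

    return blobconverter.from_openvino(**blob_request(xml_path, shaves))


if __name__ == "__main__":
    print(convert_to_blob("best_openvino_model/model.xml"))
```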

### 6. Deploy to OAK-D Camera

Example DepthAI script:
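```python
import depthai as dai

# Create pipeline (DepthAI v2 API)
pipeline = dai.Pipeline()

# Camera: the preview size must match the model input (640x640)
cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(640, 640)
cam.setInterleaved(False)

# Neural network: generic node that runs the converted blob
nn = pipeline.create(dai.node.NeuralNetwork)
nn.setBlobPath("best.blob")
cam.preview.link(nn.input)

# Output: stream the raw network output back to the host
xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("detections")
nn.out.link(xout.input)

# Run
with dai.Device(pipeline) as device:
    queue = device.getOutputQueue("detections", maxSize=4, blocking=False)

    while True:
        detections = queue.get()  # dai.NNData holding the raw output tensors
        # Process detections (decode boxes and scores on the host)...
```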
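
The "process detections" step depends on how the model was exported. The sketch below assumes each output row is `[cx, cy, w, h, score_0, score_1, ...]` with box values normalized to `[0, 1]`; that layout is an assumption to verify against your own export, and `to_rows` and `decode_detections` are hypothetical helpers. In DepthAI v2 the flat list of values could come from something like `getFirstLayerFp16()` on the message returned by `queue.get()`.

```python
def to_rows(flat, row_len):
    """Reshape a flat list of values into rows of length row_len."""
    if row_len <= 0 or len(flat) % row_len:
        raise ValueError("flat length must be a multiple of row_len")
    return [flat[i:i + row_len] for i in range(0, len(flat), row_len)]


def decode_detections(rows, img_w, img_h, conf_threshold=0.5):
    """Turn raw rows into pixel-space boxes, keeping confident ones only.

    Assumed layout per row: cx, cy, w, h (normalized), then one score per class.
    """
    detections = []
    for row in rows:
        cx, cy, w, h = row[:4]
        scores = row[4:]
        class_id = max(range(len(scores)), key=scores.__getitem__)
        score = scores[class_id]
        if score < conf_threshold:
            continue
        box = (
            (cx - w / 2) * img_w,  # x1
            (cy - h / 2) * img_h,  # y1
            (cx + w / 2) * img_w,  # x2
            (cy + h / 2) * img_h,  # y2
        )
        detections.append({"box": box, "score": score, "class_id": class_id})
    return detections


if __name__ == "__main__":
    # Two fake rows for a two-class model; only the first passes the threshold.
    flat = [0.5, 0.5, 0.25, 0.5, 0.9, 0.1,
            0.2, 0.2, 0.1, 0.1, 0.3, 0.2]
    for det in decode_detections(to_rows(flat, 6), img_w=640, img_h=640):
        print(det)
```

No non-maximum suppression step is included because RT-DETR is designed to produce final boxes without it; a confidence threshold is usually the only filtering needed.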

## Model Comparison

| Model | Approx. size | Approx. speed (OAK-D) | Accuracy | License |
|-------|--------------|-----------------------|----------|---------|
| RT-DETR r18 | ~15 MB | 30-40 FPS | Good | Apache 2.0 ✅ |
| RT-DETR r34 | ~30 MB | 20-30 FPS | Better | Apache 2.0 ✅ |
| YOLO11n | ~6 MB | 50-60 FPS | Good | AGPL-3.0 ❌ |
| YOLOv6n | ~10 MB | 40-50 FPS | Good | GPL-3.0 ❌ |
| RF-DETR nano | ~15 MB | 10-20 FPS* | Good | Check repo |

*May have compatibility issues with OpenVINO
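
The speed figures in the table are estimates, so it is worth measuring throughput on your own device. Here is a minimal sketch of a rolling FPS counter; `FpsMeter` is a hypothetical helper that you would call once per received frame in the host loop.

```python
import time
from collections import deque


class FpsMeter:
    """Rolling frames-per-second estimate over the last `window` frames."""

    def __init__(self, window=30):
        self.timestamps = deque(maxlen=window)

    def tick(self, now=None):
        """Record one frame and return the current FPS estimate.

        Returns 0.0 until at least two frames have been seen.
        """
        self.timestamps.append(time.monotonic() if now is None else now)
        if len(self.timestamps) < 2:
            return 0.0
        elapsed = self.timestamps[-1] - self.timestamps[0]
        return (len(self.timestamps) - 1) / elapsed if elapsed > 0 else 0.0


if __name__ == "__main__":
    meter = FpsMeter()
    for _ in range(10):
        time.sleep(0.02)  # stand-in for waiting on the next detection message
        fps = meter.tick()
    print(f"{fps:.1f} FPS")
```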

## Training Tips

1. **Dataset size**: 
   - Minimum: 50 images
   - Good: 200+ images
   - Excellent: 1000+ images

2. **Data diversity**:
   - Different wood types
   - Various lighting conditions
   - Multiple knot sizes/types
   - Different angles

3. **Training settings**:
   - Start with `rtdetr-r18` for fastest iteration
   - Use `--batch-size 8` if your GPU has 8 GB or more of memory
   - Train for 100-200 epochs
   - Use early stopping (patience=20)

4. **Data augmentation** (automatic):
   - Flips, rotations
   - Color adjustments
   - Crops and scales
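
The early-stopping tip (patience=20) boils down to counting epochs without improvement. The sketch below shows the idea only; `EarlyStopping` is a hypothetical class, the monitored metric is assumed to be "higher is better" (for example validation mAP), and it is not the logic used inside `train_rtdetr.py`.

```python
class EarlyStopping:
    """Stop when the monitored metric has not improved for `patience` epochs."""

    def __init__(self, patience=20, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.stale_epochs = 0

    def step(self, metric):
        """Record this epoch's metric (higher is better); return True to stop."""
        if metric > self.best + self.min_delta:
            self.best = metric
            self.stale_epochs = 0      # improvement resets the counter
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


if __name__ == "__main__":
    stopper = EarlyStopping(patience=3)
    history = [0.41, 0.48, 0.52, 0.51, 0.52, 0.50, 0.49]
    for epoch, value in enumerate(history, start=1):
        if stopper.step(value):
            print(f"Stopping at epoch {epoch}, best metric {stopper.best}")
            break
```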

## Troubleshooting

**Training is slow:**
- Reduce batch size if the GPU is running out of memory
- Use smaller model (r18)
- Check GPU usage with `nvidia-smi`

**Low accuracy:**
- Add more training data
- Train longer (more epochs)
- Use larger model (r34 or r50)
- Check your annotations for errors

**OAK deployment fails:**
- Ensure OpenVINO export succeeded
- Check blob size (<200MB for OAK-D)
- Verify input size matches training (640x640)
- Try FP16 instead of FP32 to reduce size
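
Two of the checks above (blob size and input size) are easy to automate before deploying. In this sketch, `check_blob` and `MAX_BLOB_MB` are hypothetical names, and the 200 MB limit is simply the figure quoted in the checklist, not an independently verified device limit.

```python
from pathlib import Path

MAX_BLOB_MB = 200  # figure quoted in the checklist above; treat as an assumption


def check_blob(blob_path, trained_size=640, deploy_size=640):
    """Return a list of problems found; an empty list means nothing was flagged."""
    problems = []
    blob = Path(blob_path)
    if not blob.is_file():
        problems.append(f"{blob} does not exist")
    else:
        size_mb = blob.stat().st_size / 1e6
        if size_mb >= MAX_BLOB_MB:
            problems.append(f"blob is {size_mb:.1f} MB, limit is {MAX_BLOB_MB} MB")
    if trained_size != deploy_size:
        problems.append(
            f"input size mismatch: trained at {trained_size}, deploying at {deploy_size}"
        )
    return problems


if __name__ == "__main__":
    for problem in check_blob("best.blob") or ["no problems found"]:
        print(problem)
```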

## Resources

- [RT-DETR Paper](https://arxiv.org/abs/2304.08069)
- [Ultralytics RT-DETR Docs](https://docs.ultralytics.com/models/rtdetr/)
- [OAK-D Docs](https://docs.luxonis.com/)
- [DepthAI Examples](https://github.com/luxonis/depthai-experiments)

## License

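The original RT-DETR implementation is Apache 2.0 licensed - you can use it for:
- ✅ Personal projects
- ✅ Commercial products
- ✅ Internal business tools
- ✅ Proprietary software

No paid license is required. Apache 2.0 does ask you to keep the license text and copyright notices when you redistribute the code or derived works. The Ultralytics package, which also ships an RT-DETR implementation, is AGPL-3.0 licensed, so confirm which implementation `train_rtdetr.py` uses before shipping a commercial product.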

Initial commit: Wood knot detection model and GUI (2025-12-22 14:11:39 -07:00)
