OAK-D 4 Pro Workflow: Label, Train, and Convert AI Model
This guide walks through the complete workflow for building a custom wood knot detection model for the OAK-D 4 Pro camera: manual image annotation, training, and conversion of the trained model for edge deployment.
📋 Prerequisites
Environment Setup
# Clone the repository
git clone git@143.244.157.110:dillon_stuff/saw_mill_knot_detection.git
cd saw_mill_knot_detection
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
Required Dependencies
- Python 3.8+
- Pillow (for image processing)
- Ultralytics (for YOLO/RT-DETR models)
- RF-DETR (optional, for RF-DETR models)
- OpenVINO (installed via convert script)
🏷️ Step 1: Label Images
Use the Tkinter-based annotation GUI to manually label your wood surface images.
1.1 Prepare Images
Place your images in a directory (e.g., IMAGE/):
IMAGE/
├── image1.jpg
├── image2.jpg
└── annotations.json # Will be created/updated
1.2 Launch Annotation GUI
# Using the convenience script
./run_tk_gui.sh --images-dir IMAGE/
# Or directly
python tk_annotation_gui.py --images-dir IMAGE/
1.3 Annotate Images
- Navigate: Use Prev/Next buttons or click image thumbnails
- Draw Boxes: Click and drag on the image to create bounding boxes
- Auto-Label (optional): Load trained weights and auto-detect knots
- Enter weights path (e.g., runs/yolox_training/training/weights/best.pt)
- Select model type (auto-detect usually works)
- Set confidence threshold (0.3-0.7 recommended)
- Click "Load Model" then "Auto-Label Current"
- Edit Annotations: Double-click list items to delete, or manually draw corrections
- Save: Annotations auto-save to IMAGE/annotations.json
1.4 Annotation Format
Each image gets entries like:
{
"image1.jpg": [
{
"bbox": [x1, y1, x2, y2],
"label": "knot",
"confidence": 1.0,
"source": "manual"
}
]
}
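The structure above is plain JSON, so it is easy to sanity-check before training. This sketch (standard library only, not part of the repo's tooling) counts boxes per image; in practice you would load the dict with json.load from IMAGE/annotations.json.

```python
import json  # use json.load(open("IMAGE/annotations.json")) on a real file

def box_counts(annotations):
    """Map each image name to its number of bounding boxes."""
    return {name: len(boxes) for name, boxes in annotations.items()}

# Example using the structure shown above
annotations = {
    "image1.jpg": [
        {"bbox": [10, 20, 50, 60], "label": "knot",
         "confidence": 1.0, "source": "manual"},
    ]
}
print(box_counts(annotations))  # {'image1.jpg': 1}
```

Images with zero boxes show up immediately, which helps catch frames you forgot to label.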
Tips:
- Aim for 100-500 annotated images for good results
- Focus on challenging cases (small knots, lighting variations)
- Use auto-labeling to speed up the process, then manually correct
🏋️ Step 2: Train Model
Train a detection model using your annotated images.
2.1 Prepare Dataset (Optional)
The training script can prepare the dataset automatically, but you can do it manually:
python train_model.py --prepare-dataset --images-dir IMAGE --annotations annotations.json --dataset dataset_prepared
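For YOLO-style frameworks, dataset preparation converts each corner-format box ([x1, y1, x2, y2] in pixels) into a normalized center-format label line (class cx cy w h). A minimal sketch of that conversion (the function name is illustrative, not taken from the repo):

```python
def to_yolo(bbox, img_w, img_h, class_id=0):
    """Convert [x1, y1, x2, y2] pixels -> 'class cx cy w h' normalized to [0, 1]."""
    x1, y1, x2, y2 = bbox
    cx = (x1 + x2) / 2 / img_w   # box center, as a fraction of image width
    cy = (y1 + y2) / 2 / img_h   # box center, as a fraction of image height
    w = (x2 - x1) / img_w        # box width fraction
    h = (y2 - y1) / img_h        # box height fraction
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

# A 100x100 box centered at (100, 100) in a 640x640 image
print(to_yolo([50, 50, 150, 150], 640, 640))
```

One such line per box is written to a .txt file alongside each image; RF-DETR instead expects COCO-format JSON (see Troubleshooting).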
2.2 Choose Model Framework
Available frameworks (all MIT/Apache 2.0 licensed):
- RF-DETR: Highest accuracy, slower inference
- RT-DETR: Good balance, optimized for edge devices
- YOLOv6: Fast inference, good for real-time
- YOLOX: Versatile, widely supported
2.3 Train Model
# Basic training
python train_model.py \
--framework rtdetr \
--dataset dataset_prepared \
--output runs/rtdetr_training \
--model-size small \
--epochs 100
# Advanced options
python train_model.py \
--framework yolox \
--dataset dataset_prepared \
--output runs/yolox_training \
--model-size nano \
--epochs 50 \
--batch-size 8 \
--lr 0.001 \
--prepare-dataset \
--images-dir IMAGE \
--annotations annotations.json
2.4 Monitor Training
- Check runs/*/training/ for logs and checkpoints
- Training saves the best model as best.pt
- Use TensorBoard or Weights & Biases for monitoring (if configured)
Training Tips:
- Start with nano or small models for faster iteration
- 50-200 epochs are typically sufficient
- Monitor validation mAP for convergence
- Use data augmentation for better generalization
🔄 Step 3: Convert for OAK-D Deployment
Convert the trained model to OpenVINO format for OAK-D 4 Pro.
3.1 Run Conversion
# Basic conversion
python convert_for_deployment.py \
--model runs/rtdetr_training/training/weights/best.pt \
--output oak_d_deployment
# Advanced options
python convert_for_deployment.py \
--model runs/yolox_training/training/weights/best.pt \
--output oak_d_deployment \
--img-size 640 \
--framework auto
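For checkpoints trained through the ultralytics package (RT-DETR and supported YOLO variants), the library's built-in export is an alternative route to OpenVINO IR. This is a sketch, not the repo's convert script; it assumes ultralytics is installed and uses the checkpoint path from Step 2.

```python
def export_settings(imgsz=640, half=False):
    """Collect export keyword arguments in one place so they are easy to tweak."""
    return {"format": "openvino", "imgsz": imgsz, "half": half}

def main():
    # Imported inside the function so the helper above has no dependencies
    from ultralytics import YOLO
    model = YOLO("runs/rtdetr_training/training/weights/best.pt")
    # Writes an OpenVINO IR (model.xml / model.bin) next to the checkpoint
    model.export(**export_settings(imgsz=640))

# main()  # uncomment after training; requires ultralytics and the checkpoint
```

Note that convert_for_deployment.py remains the supported path; use this only if you want ultralytics' own export behavior.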
3.2 Output Files
After conversion, you'll get:
oak_d_deployment/
├── model.xml # OpenVINO IR model
├── model.bin # OpenVINO IR weights
├── model.onnx # ONNX format (intermediate)
└── config.yaml # Model configuration
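A quick check that conversion produced everything can be scripted; this sketch (standard library only) compares the output directory against the expected listing above.

```python
from pathlib import Path

# Expected deployment files, matching the listing above
EXPECTED = ("model.xml", "model.bin", "model.onnx", "config.yaml")

def missing_files(output_dir, expected=EXPECTED):
    """Return the expected deployment files that are absent from output_dir."""
    out = Path(output_dir)
    return [name for name in expected if not (out / name).is_file()]

missing = missing_files("oak_d_deployment")
if missing:
    print("Conversion incomplete, missing:", ", ".join(missing))
else:
    print("All deployment files present")
```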
3.3 Convert to Blob Format
For OAK-D deployment, convert to .blob format:
Option A: Online Converter (Recommended)
- Go to https://blobconverter.luxonis.com/
- Upload model.xml
- Select "OAK-D 4 Pro"
- Download the .blob file
Option B: Command Line
pip install blobconverter
blobconverter --openvino-xml oak_d_deployment/model.xml
🧪 Step 4: Test and Deploy
4.1 Test OpenVINO Model
# Verify model loads
python -c "from openvino.runtime import Core; core = Core(); model = core.read_model('oak_d_deployment/model.xml'); print('✓ Model loaded')"
4.2 Deploy to OAK-D
Use DepthAI Python API or OAK-D examples:
import depthai as dai

# Create pipeline
pipeline = dai.Pipeline()

# Camera node feeding the network
cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(640, 640)  # must match the model input size

# Load your blob
detection_nn = pipeline.create(dai.node.NeuralNetwork)
detection_nn.setBlobPath("model.blob")
cam.preview.link(detection_nn.input)

# Configure output streams
# ... (see DepthAI documentation)
4.3 Performance Optimization
- Quantization: Use 8-bit quantization for faster inference
- Model Size: Nano models work best on edge devices
- Input Resolution: 320x320 or 416x416 balances speed/accuracy
- Calibration: Test with real-world images for best results
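The input-resolution tradeoff above interacts with preprocessing: when the camera frame and the model input differ in aspect ratio, a letterbox resize (scale to fit, pad the remainder) avoids distorting knot shapes. A small sketch of the scale/padding math (generic, not taken from the repo):

```python
def letterbox_params(src_w, src_h, dst_w, dst_h):
    """Scale factor and (left, top) padding to fit src into dst without distortion."""
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_left = (dst_w - new_w) // 2  # symmetric horizontal padding
    pad_top = (dst_h - new_h) // 2   # symmetric vertical padding
    return scale, pad_left, pad_top

# Fit a 1920x1080 camera frame into a 416x416 model input
print(letterbox_params(1920, 1080, 416, 416))
```

The same scale and padding must be inverted when mapping detected boxes back onto the full-resolution frame.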
🔧 Troubleshooting
Common Issues
GUI won't start:
- Ensure Pillow and Tkinter are installed
- Check Python version (3.8+ required)
Training fails:
- Verify dataset format (COCO for RF-DETR, YOLO for others)
- Check GPU memory if using CUDA
- Reduce batch size if out of memory
Conversion fails:
- Ensure model is compatible with OpenVINO
- Check input/output shapes match expectations
- Try different image sizes (320, 416, 512, 640)
OAK-D deployment issues:
- Verify blob was created for correct OAK model (4 Pro)
- Check camera calibration and input preprocessing
- Ensure model input size matches camera output
Getting Help
- Check existing issues in the repository
- Review DepthAI documentation: https://docs.luxonis.com/
- Test with provided example models first
📊 Performance Benchmarks
Expected performance on OAK-D 4 Pro:
| Model | Size | FPS | mAP | Use Case |
|---|---|---|---|---|
| RT-DETR | Nano | 25-35 | 0.75 | Balanced |
| YOLOX | Nano | 30-45 | 0.70 | Fast |
| RF-DETR | Nano | 15-25 | 0.80 | Accurate |
Results vary based on model training and calibration
🎯 Next Steps
- Iterate: Collect more data, retrain, redeploy
- Optimize: Experiment with quantization and pruning
- Integrate: Add your model to production applications
- Monitor: Track performance in real-world conditions
License: All models are MIT/Apache 2.0 licensed - free for commercial use!