
OAK-D 4 Pro Workflow: Label, Train, and Convert AI Model

This guide walks through the complete workflow for creating a custom wood knot detection model optimized for the OAK-D 4 Pro camera: manual image annotation, model training, and conversion for edge deployment.

📋 Prerequisites

Environment Setup

# Clone the repository
git clone git@143.244.157.110:dillon_stuff/saw_mill_knot_detection.git
cd saw_mill_knot_detection

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Required Dependencies

  • Python 3.8+
  • Pillow (for image processing)
  • Ultralytics (for YOLO/RT-DETR models)
  • RF-DETR (optional, for RF-DETR models)
  • OpenVINO (installed via convert script)

🏷️ Step 1: Label Images

Use the Tkinter-based annotation GUI to manually label your wood surface images.

1.1 Prepare Images

Place your images in a directory (e.g., IMAGE/):

IMAGE/
├── image1.jpg
├── image2.jpg
└── annotations.json  # Will be created/updated

1.2 Launch Annotation GUI

# Using the convenience script
./run_tk_gui.sh --images-dir IMAGE/

# Or directly
python tk_annotation_gui.py --images-dir IMAGE/

1.3 Annotate Images

  1. Navigate: Use Prev/Next buttons or click image thumbnails
  2. Draw Boxes: Click and drag on the image to create bounding boxes
  3. Auto-Label (optional): Load trained weights and auto-detect knots
    • Enter weights path (e.g., runs/yolox_training/training/weights/best.pt)
    • Select model type (auto-detect usually works)
    • Set confidence threshold (0.3-0.7 recommended)
    • Click "Load Model" then "Auto-Label Current"
  4. Edit Annotations: Double-click list items to delete, or manually draw corrections
  5. Save: Annotations auto-save to IMAGE/annotations.json

1.4 Annotation Format

Each image gets entries like:

{
  "image1.jpg": [
    {
      "bbox": [x1, y1, x2, y2],
      "label": "knot",
      "confidence": 1.0,
      "source": "manual"
    }
  ]
}
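
The per-box fields above can be converted to other training formats in a few lines. Below is a minimal sketch (the helper name is illustrative, not part of the repo) that turns the corner-format bbox used here ([x1, y1, x2, y2] in pixels) into YOLO's normalized center format:

```python
def bbox_to_yolo(bbox, img_w, img_h):
    """Convert [x1, y1, x2, y2] pixel corners to YOLO (cx, cy, w, h), normalized to 0-1."""
    x1, y1, x2, y2 = bbox
    cx = (x1 + x2) / 2 / img_w   # normalized box center x
    cy = (y1 + y2) / 2 / img_h   # normalized box center y
    w = (x2 - x1) / img_w        # normalized width
    h = (y2 - y1) / img_h        # normalized height
    return cx, cy, w, h

# Example: a 200x100 px box on a 640x480 image
print(bbox_to_yolo([100, 100, 300, 200], 640, 480))
# → (0.3125, 0.3125, 0.3125, 0.2083...)
```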

Tips:

  • Aim for 100-500 annotated images for good results
  • Focus on challenging cases (small knots, lighting variations)
  • Use auto-labeling to speed up the process, then manually correct

🏋️ Step 2: Train Model

Train a detection model using your annotated images.

2.1 Prepare Dataset (Optional)

The training script can prepare the dataset automatically, but you can do it manually:

python train_model.py --prepare-dataset --images-dir IMAGE --annotations annotations.json --dataset dataset_prepared
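
Dataset preparation needs, at minimum, a reproducible train/validation split. A minimal seeded sketch of that step (the `split_dataset` helper and the 80/20 ratio are illustrative, not necessarily what train_model.py does internally):

```python
import random

def split_dataset(image_names, val_fraction=0.2, seed=42):
    """Deterministically split image names into (train, val) lists."""
    names = sorted(image_names)          # sort first so the split is reproducible
    random.Random(seed).shuffle(names)   # seeded shuffle, independent of global RNG state
    n_val = max(1, int(len(names) * val_fraction))
    return names[n_val:], names[:n_val]

train, val = split_dataset([f"image{i}.jpg" for i in range(10)])
print(len(train), len(val))  # 8 2
```

Seeding the shuffle means reruns produce the same split, which keeps validation metrics comparable across training runs.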

2.2 Choose Model Framework

Available frameworks (all MIT/Apache 2.0 licensed):

  • RF-DETR: Highest accuracy, slower inference
  • RT-DETR: Good balance, optimized for edge devices
  • YOLOv6: Fast inference, good for real-time
  • YOLOX: Versatile, widely supported

2.3 Train Model

# Basic training
python train_model.py \
    --framework rtdetr \
    --dataset dataset_prepared \
    --output runs/rtdetr_training \
    --model-size small \
    --epochs 100

# Advanced options
python train_model.py \
    --framework yolox \
    --dataset dataset_prepared \
    --output runs/yolox_training \
    --model-size nano \
    --epochs 50 \
    --batch-size 8 \
    --lr 0.001 \
    --prepare-dataset \
    --images-dir IMAGE \
    --annotations annotations.json

2.4 Monitor Training

  • Check runs/*/training/ for logs and checkpoints
  • Training saves best model as best.pt
  • Use TensorBoard or Weights & Biases for monitoring (if configured)

Training Tips:

  • Start with nano or small models for faster iteration
  • 50-200 epochs typically sufficient
  • Monitor validation mAP for convergence
  • Use data augmentation for better generalization
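
One way to act on the convergence tip above: track per-epoch validation mAP and stop once it plateaus. The `has_converged` check below is an illustrative early-stopping sketch, not part of train_model.py; how you collect the mAP history depends on your framework's logging:

```python
def has_converged(map_history, patience=10, min_delta=1e-3):
    """True if validation mAP has not improved by min_delta in the last `patience` epochs."""
    if len(map_history) <= patience:
        return False  # not enough epochs to judge
    best_recent = max(map_history[-patience:])
    best_earlier = max(map_history[:-patience])
    return best_recent < best_earlier + min_delta

# Still improving: the most recent epochs beat everything earlier
print(has_converged([0.40, 0.55, 0.62, 0.68, 0.71], patience=2))  # False
# Plateaued: the last epochs are no better than the earlier best
print(has_converged([0.40, 0.60, 0.70, 0.70, 0.70], patience=2))  # True
```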

🔄 Step 3: Convert for OAK-D Deployment

Convert the trained model to OpenVINO format for OAK-D 4 Pro.

3.1 Run Conversion

# Basic conversion
python convert_for_deployment.py \
    --model runs/rtdetr_training/training/weights/best.pt \
    --output oak_d_deployment

# Advanced options
python convert_for_deployment.py \
    --model runs/yolox_training/training/weights/best.pt \
    --output oak_d_deployment \
    --img-size 640 \
    --framework auto

3.2 Output Files

After conversion, you'll get:

oak_d_deployment/
├── model.xml          # OpenVINO IR model
├── model.bin          # OpenVINO IR weights
├── model.onnx         # ONNX format (intermediate)
└── config.yaml        # Model configuration

3.3 Convert to Blob Format

For OAK-D deployment, convert to .blob format:

Option A: Online Converter (Recommended)

  1. Go to https://blobconverter.luxonis.com/
  2. Upload model.xml
  3. Select "OAK-D 4 Pro"
  4. Download .blob file

Option B: Command Line

pip install blobconverter
python -m blobconverter --openvino-xml oak_d_deployment/model.xml --openvino-bin oak_d_deployment/model.bin

🧪 Step 4: Test and Deploy

4.1 Test OpenVINO Model

# Verify model loads
python -c "from openvino.runtime import Core; core = Core(); model = core.read_model('oak_d_deployment/model.xml'); print('✓ Model loaded')"

4.2 Deploy to OAK-D

Use DepthAI Python API or OAK-D examples:

import depthai as dai

# Create pipeline
pipeline = dai.Pipeline()

# Color camera feeds frames to the network
cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(416, 416)  # must match the model's input size
cam.setInterleaved(False)

# Load your blob
detection_nn = pipeline.create(dai.node.NeuralNetwork)
detection_nn.setBlobPath("model.blob")
cam.preview.link(detection_nn.input)

# Stream the raw network output back to the host
xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("nn")
detection_nn.out.link(xout.input)

with dai.Device(pipeline) as device:
    q = device.getOutputQueue("nn", maxSize=4, blocking=False)
    # ... (see DepthAI documentation for decoding detections)

4.3 Performance Optimization

  • Quantization: Use 8-bit quantization for faster inference
  • Model Size: Nano models work best on edge devices
  • Input Resolution: 320x320 or 416x416 balances speed/accuracy
  • Calibration: Test with real-world images for best results
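
The input-resolution and preprocessing points interact: camera frames must be scaled (and usually padded) to the model's square input, and detections mapped back afterwards. A minimal sketch of the letterbox geometry (pure arithmetic; the helper names are illustrative):

```python
def letterbox_params(src_w, src_h, size=416):
    """Scale factor, resized dims, and padding to fit a frame into a size x size model input."""
    scale = size / max(src_w, src_h)                  # shrink the longer side to `size`
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x, pad_y = (size - new_w) // 2, (size - new_h) // 2
    return scale, (new_w, new_h), (pad_x, pad_y)

def to_original(x, y, scale, pad_x, pad_y):
    """Map a model-space detection coordinate back to the original frame."""
    return (x - pad_x) / scale, (y - pad_y) / scale

scale, dims, pads = letterbox_params(640, 480, size=416)
print(scale, dims, pads)  # 0.65 (416, 312) (0, 52)
```

Getting this mapping wrong is a common source of the "boxes slightly off" symptom mentioned in the troubleshooting section below.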

🔧 Troubleshooting

Common Issues

GUI won't start:

  • Ensure Pillow and Tkinter are installed
  • Check Python version (3.8+ required)

Training fails:

  • Verify dataset format (COCO for RF-DETR, YOLO for others)
  • Check GPU memory if using CUDA
  • Reduce batch size if out of memory

Conversion fails:

  • Ensure model is compatible with OpenVINO
  • Check input/output shapes match expectations
  • Try different image sizes (320, 416, 512, 640)

OAK-D deployment issues:

  • Verify blob was created for correct OAK model (4 Pro)
  • Check camera calibration and input preprocessing
  • Ensure model input size matches camera output

Getting Help

  • Check existing issues in the repository
  • Review DepthAI documentation: https://docs.luxonis.com/
  • Test with provided example models first

📊 Performance Benchmarks

Expected performance on OAK-D 4 Pro:

| Model | Size | FPS | mAP | Use Case |
|---------|------|-------|------|----------|
| RT-DETR | Nano | 25-35 | 0.75 | Balanced |
| YOLOX | Nano | 30-45 | 0.70 | Fast |
| RF-DETR | Nano | 15-25 | 0.80 | Accurate |

Results vary based on model training and calibration

🎯 Next Steps

  1. Iterate: Collect more data, retrain, redeploy
  2. Optimize: Experiment with quantization and pruning
  3. Integrate: Add your model to production applications
  4. Monitor: Track performance in real-world conditions

License: All models are MIT/Apache 2.0 licensed - free for commercial use!