
OAK-D 4 Pro Workflow: Label, Train, and Convert AI Model

This guide walks through the complete workflow for creating a custom wood knot detection model optimized for the OAK-D 4 Pro camera: manual image annotation, model training, and conversion for edge deployment.

📋 Prerequisites

Environment Setup

# Clone the repository
git clone git@143.244.157.110:dillon_stuff/saw_mill_knot_detection.git
cd saw_mill_knot_detection

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Required Dependencies

  • Python 3.8+
  • Pillow (for image processing)
  • Ultralytics (for YOLO/RT-DETR models)
  • RF-DETR (optional, for RF-DETR models)
  • OpenVINO (installed via convert script)

🏷️ Step 1: Label Images

Use the Tkinter-based annotation GUI to manually label your wood surface images.

1.1 Prepare Images

Place your images in a directory (e.g., IMAGE/):

IMAGE/
├── image1.jpg
├── image2.jpg
└── annotations.json  # Will be created/updated

1.2 Launch Annotation GUI

# Using the convenience script
./run_tk_gui.sh --images-dir IMAGE/

# Or directly
python tk_annotation_gui.py --images-dir IMAGE/

1.3 Annotate Images

  1. Navigate: Use Prev/Next buttons or click image thumbnails
  2. Draw Boxes: Click and drag on the image to create bounding boxes
  3. Auto-Label (optional): Load trained weights and auto-detect knots
    • Enter weights path (e.g., runs/yolox_training/training/weights/best.pt)
    • Select model type (auto-detect usually works)
    • Set confidence threshold (0.3-0.7 recommended)
    • Click "Load Model" then "Auto-Label Current"
  4. Edit Annotations: Double-click list items to delete, or manually draw corrections
  5. Save: Annotations auto-save to IMAGE/annotations.json

1.4 Annotation Format

Each image gets entries like:

{
  "image1.jpg": [
    {
      "bbox": [x1, y1, x2, y2],
      "label": "knot",
      "confidence": 1.0,
      "source": "manual"
    }
  ]
}
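
The per-box fields above can be converted to other training formats in a few lines. Below is a minimal sketch (the helper name is illustrative, not part of the repo) that turns the corner-format bbox used here ([x1, y1, x2, y2] in pixels) into YOLO's normalized center format:

```python
def bbox_to_yolo(bbox, img_w, img_h):
    """Convert [x1, y1, x2, y2] pixel corners to YOLO (cx, cy, w, h), normalized to 0-1."""
    x1, y1, x2, y2 = bbox
    cx = (x1 + x2) / 2 / img_w   # normalized box center x
    cy = (y1 + y2) / 2 / img_h   # normalized box center y
    w = (x2 - x1) / img_w        # normalized width
    h = (y2 - y1) / img_h        # normalized height
    return cx, cy, w, h

# Example: a 200x100 px box on a 640x480 image
print(bbox_to_yolo([100, 100, 300, 200], 640, 480))
# → (0.3125, 0.3125, 0.3125, 0.2083...)
```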

Tips:

  • Aim for 100-500 annotated images for good results
  • Focus on challenging cases (small knots, lighting variations)
  • Use auto-labeling to speed up the process, then manually correct

🏋️ Step 2: Train Model

Train a detection model using your annotated images.

2.1 Prepare Dataset (Optional)

The training script can prepare the dataset automatically, but you can do it manually:

python train_model.py --prepare-dataset --images-dir IMAGE --annotations annotations.json --dataset dataset_prepared
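
Dataset preparation needs, at minimum, a reproducible train/validation split. A minimal seeded sketch of that step (the `split_dataset` helper and the 80/20 ratio are illustrative, not necessarily what train_model.py does internally):

```python
import random

def split_dataset(image_names, val_fraction=0.2, seed=42):
    """Deterministically split image names into (train, val) lists."""
    names = sorted(image_names)          # sort first so the split is reproducible
    random.Random(seed).shuffle(names)   # seeded shuffle, independent of global RNG state
    n_val = max(1, int(len(names) * val_fraction))
    return names[n_val:], names[:n_val]

train, val = split_dataset([f"image{i}.jpg" for i in range(10)])
print(len(train), len(val))  # 8 2
```

Seeding the shuffle means reruns produce the same split, which keeps validation metrics comparable across training runs.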

2.2 Choose Model Framework

Available frameworks (all MIT/Apache 2.0 licensed):

  • RF-DETR: Highest accuracy, slower inference
  • RT-DETR: Good balance, optimized for edge devices
  • YOLOv6: Fast inference, good for real-time
  • YOLOX: Versatile, widely supported

2.3 Train Model

# Basic training
python train_model.py \
    --framework rtdetr \
    --dataset dataset_prepared \
    --output runs/rtdetr_training \
    --model-size small \
    --epochs 100

# Advanced options
python train_model.py \
    --framework yolox \
    --dataset dataset_prepared \
    --output runs/yolox_training \
    --model-size nano \
    --epochs 50 \
    --batch-size 8 \
    --lr 0.001 \
    --prepare-dataset \
    --images-dir IMAGE \
    --annotations annotations.json

2.4 Monitor Training

  • Check runs/*/training/ for logs and checkpoints
  • Training saves best model as best.pt
  • Use TensorBoard or Weights & Biases for monitoring (if configured)

Training Tips:

  • Start with nano or small models for faster iteration
  • 50-200 epochs typically sufficient
  • Monitor validation mAP for convergence
  • Use data augmentation for better generalization
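
One way to act on the convergence tip above: track per-epoch validation mAP and stop once it plateaus. The `has_converged` check below is an illustrative early-stopping sketch, not part of train_model.py; how you collect the mAP history depends on your framework's logging:

```python
def has_converged(map_history, patience=10, min_delta=1e-3):
    """True if validation mAP has not improved by min_delta in the last `patience` epochs."""
    if len(map_history) <= patience:
        return False  # not enough epochs to judge
    best_recent = max(map_history[-patience:])
    best_earlier = max(map_history[:-patience])
    return best_recent < best_earlier + min_delta

# Still improving: the most recent epochs beat everything earlier
print(has_converged([0.40, 0.55, 0.62, 0.68, 0.71], patience=2))  # False
# Plateaued: the last epochs are no better than the earlier best
print(has_converged([0.40, 0.60, 0.70, 0.70, 0.70], patience=2))  # True
```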

🔄 Step 3: Convert for OAK-D Deployment

Convert the trained model to OpenVINO format for OAK-D 4 Pro.

3.1 Run Conversion

# Basic conversion
python convert_for_deployment.py \
    --model runs/rtdetr_training/training/weights/best.pt \
    --output oak_d_deployment

# Advanced options
python convert_for_deployment.py \
    --model runs/yolox_training/training/weights/best.pt \
    --output oak_d_deployment \
    --img-size 640 \
    --framework auto

3.2 Output Files

After conversion, you'll get:

oak_d_deployment/
├── model.xml          # OpenVINO IR model
├── model.bin          # OpenVINO IR weights
├── model.onnx         # ONNX format (intermediate)
└── config.yaml        # Model configuration

3.3 Convert to Blob Format

For OAK-D deployment, convert to .blob format:

Option A: Online Converter (Recommended)

  1. Go to https://blobconverter.luxonis.com/
  2. Upload model.xml
  3. Select "OAK-D 4 Pro"
  4. Download .blob file

Option B: Command Line

pip install blobconverter
python -m blobconverter --openvino-xml oak_d_deployment/model.xml --openvino-bin oak_d_deployment/model.bin

🧪 Step 4: Test and Deploy

4.1 Test OpenVINO Model

# Verify model loads
python -c "from openvino.runtime import Core; core = Core(); model = core.read_model('oak_d_deployment/model.xml'); print('✓ Model loaded')"

4.2 Deploy to OAK-D

Use DepthAI Python API or OAK-D examples:

import depthai as dai

# Create pipeline
pipeline = dai.Pipeline()

# Color camera feeds frames to the network
cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(416, 416)  # must match the model's input size
cam.setInterleaved(False)

# Load your blob
detection_nn = pipeline.create(dai.node.NeuralNetwork)
detection_nn.setBlobPath("model.blob")
cam.preview.link(detection_nn.input)

# Stream the raw network output back to the host
xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("nn")
detection_nn.out.link(xout.input)

with dai.Device(pipeline) as device:
    q = device.getOutputQueue("nn", maxSize=4, blocking=False)
    # ... (see DepthAI documentation for decoding detections)

4.3 Performance Optimization

  • Quantization: Use 8-bit quantization for faster inference
  • Model Size: Nano models work best on edge devices
  • Input Resolution: 320x320 or 416x416 balances speed/accuracy
  • Calibration: Test with real-world images for best results
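
The input-resolution and preprocessing points interact: camera frames must be scaled (and usually padded) to the model's square input, and detections mapped back afterwards. A minimal sketch of the letterbox geometry (pure arithmetic; the helper names are illustrative):

```python
def letterbox_params(src_w, src_h, size=416):
    """Scale factor, resized dims, and padding to fit a frame into a size x size model input."""
    scale = size / max(src_w, src_h)                  # shrink the longer side to `size`
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x, pad_y = (size - new_w) // 2, (size - new_h) // 2
    return scale, (new_w, new_h), (pad_x, pad_y)

def to_original(x, y, scale, pad_x, pad_y):
    """Map a model-space detection coordinate back to the original frame."""
    return (x - pad_x) / scale, (y - pad_y) / scale

scale, dims, pads = letterbox_params(640, 480, size=416)
print(scale, dims, pads)  # 0.65 (416, 312) (0, 52)
```

Getting this mapping wrong is a common source of the "boxes slightly off" symptom mentioned in the troubleshooting section below.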

🔧 Troubleshooting

Common Issues

GUI won't start:

  • Ensure Pillow and Tkinter are installed
  • Check Python version (3.8+ required)

Training fails:

  • Verify dataset format (COCO for RF-DETR, YOLO for others)
  • Check GPU memory if using CUDA
  • Reduce batch size if out of memory

Conversion fails:

  • Ensure model is compatible with OpenVINO
  • Check input/output shapes match expectations
  • Try different image sizes (320, 416, 512, 640)

OAK-D deployment issues:

  • Verify blob was created for correct OAK model (4 Pro)
  • Check camera calibration and input preprocessing
  • Ensure model input size matches camera output

Getting Help

  • Check existing issues in the repository
  • Review DepthAI documentation: https://docs.luxonis.com/
  • Test with provided example models first

📊 Performance Benchmarks

Expected performance on OAK-D 4 Pro:

| Model | Size | FPS | mAP | Use Case |
|---------|------|-------|------|----------|
| RT-DETR | Nano | 25-35 | 0.75 | Balanced |
| YOLOX | Nano | 30-45 | 0.70 | Fast |
| RF-DETR | Nano | 15-25 | 0.80 | Accurate |

Results vary based on model training and calibration

🎯 Next Steps

  1. Iterate: Collect more data, retrain, redeploy
  2. Optimize: Experiment with quantization and pruning
  3. Integrate: Add your model to production applications
  4. Monitor: Track performance in real-world conditions

License: All models are MIT/Apache 2.0 licensed - free for commercial use!