# OAK-D 4 Pro Workflow: Label, Train, and Convert AI Model

This guide walks you through the complete workflow for creating a custom wood knot detection model optimized for the OAK-D 4 Pro camera: from manual image annotation to trained-model conversion for edge deployment.

## 📋 Prerequisites

### Environment Setup

```bash
# Clone the repository
git clone git@143.244.157.110:dillon_stuff/saw_mill_knot_detection.git
cd saw_mill_knot_detection

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

### Required Dependencies

- Python 3.8+
- Pillow (for image processing)
- Ultralytics (for YOLO/RT-DETR models)
- RF-DETR (optional, for RF-DETR models)
- OpenVINO (installed by the conversion script)

## 🏷️ Step 1: Label Images

Use the Tkinter-based annotation GUI to manually label your wood surface images.

### 1.1 Prepare Images

Place your images in a directory (e.g., `IMAGE/`):

```
IMAGE/
├── image1.jpg
├── image2.jpg
└── annotations.json   # Will be created/updated
```

### 1.2 Launch Annotation GUI

```bash
# Using the convenience script
./run_tk_gui.sh --images-dir IMAGE/

# Or directly
python tk_annotation_gui.py --images-dir IMAGE/
```

### 1.3 Annotate Images

1. **Navigate**: Use the Prev/Next buttons or click image thumbnails.
2. **Draw Boxes**: Click and drag on the image to create bounding boxes.
3. **Auto-Label** (optional): Load trained weights and auto-detect knots:
   - Enter the weights path (e.g., `runs/yolox_training/training/weights/best.pt`)
   - Select the model type (auto-detect usually works)
   - Set the confidence threshold (0.3-0.7 recommended)
   - Click "Load Model", then "Auto-Label Current"
4. **Edit Annotations**: Double-click list items to delete them, or manually draw corrections.
5. **Save**: Annotations auto-save to `IMAGE/annotations.json`.

### 1.4 Annotation Format

Each image gets entries like:

```json
{
  "image1.jpg": [
    {
      "bbox": [x1, y1, x2, y2],
      "label": "knot",
      "confidence": 1.0,
      "source": "manual"
    }
  ]
}
```

**Tips**:

- Aim for 100-500 annotated images for good results
- Focus on challenging cases (small knots, lighting variations)
- Use auto-labeling to speed up the process, then manually correct

## 🏋️ Step 2: Train Model

Train a detection model using your annotated images.

### 2.1 Prepare Dataset (Optional)

The training script can prepare the dataset automatically, but you can also do it manually:

```bash
python train_model.py --prepare-dataset --images-dir IMAGE --annotations annotations.json --dataset dataset_prepared
```

### 2.2 Choose Model Framework

Available frameworks (all MIT/Apache 2.0 licensed):

- **RF-DETR**: Highest accuracy, slower inference
- **RT-DETR**: Good balance, optimized for edge devices
- **YOLOv6**: Fast inference, good for real-time
- **YOLOX**: Versatile, widely supported

### 2.3 Train Model

```bash
# Basic training
python train_model.py \
    --framework rtdetr \
    --dataset dataset_prepared \
    --output runs/rtdetr_training \
    --model-size small \
    --epochs 100

# Advanced options
python train_model.py \
    --framework yolox \
    --dataset dataset_prepared \
    --output runs/yolox_training \
    --model-size nano \
    --epochs 50 \
    --batch-size 8 \
    --lr 0.001 \
    --prepare-dataset \
    --images-dir IMAGE \
    --annotations annotations.json
```

### 2.4 Monitor Training

- Check `runs/*/training/` for logs and checkpoints
- Training saves the best model as `best.pt`
- Use TensorBoard or Weights & Biases for monitoring (if configured)

**Training Tips**:

- Start with `nano` or `small` models for faster iteration
- 50-200 epochs are typically sufficient
- Monitor validation mAP for convergence
- Use data augmentation for better generalization

## 🔄 Step 3: Convert for OAK-D Deployment

Convert the trained model to OpenVINO format for the OAK-D 4 Pro.
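The converted model runs at a fixed input size (`--img-size`, 640 by default), so camera frames must be scaled to match before inference, and detections mapped back to original pixel coordinates afterwards. A common approach is letterbox resizing: scale to fit, then pad the remainder. The sketch below covers only the coordinate math; the helper names are illustrative and not part of the repo's scripts:

```python
def letterbox_params(src_w, src_h, dst=640):
    """Compute scale and padding to fit a src_w x src_h frame into a dst x dst square."""
    scale = min(dst / src_w, dst / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x = (dst - new_w) // 2  # horizontal padding, split evenly left/right
    pad_y = (dst - new_h) // 2  # vertical padding, split evenly top/bottom
    return scale, pad_x, pad_y

def unletterbox_box(box, scale, pad_x, pad_y):
    """Map an (x1, y1, x2, y2) box from model-input space back to original pixels."""
    x1, y1, x2, y2 = box
    return ((x1 - pad_x) / scale, (y1 - pad_y) / scale,
            (x2 - pad_x) / scale, (y2 - pad_y) / scale)

# Example: a 1920x1080 camera frame into a 640x640 model input
scale, px, py = letterbox_params(1920, 1080)  # scale ~0.33, pads (0, 140)
```

The same numbers are needed twice: once when building the input frame, and once when drawing detections back on the full-resolution image.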
### 3.1 Run Conversion

```bash
# Basic conversion
python convert_for_deployment.py \
    --model runs/rtdetr_training/training/weights/best.pt \
    --output oak_d_deployment

# Advanced options
python convert_for_deployment.py \
    --model runs/yolox_training/training/weights/best.pt \
    --output oak_d_deployment \
    --img-size 640 \
    --framework auto
```

### 3.2 Output Files

After conversion, you'll get:

```
oak_d_deployment/
├── model.xml    # OpenVINO IR model
├── model.bin    # OpenVINO IR weights
├── model.onnx   # ONNX format (intermediate)
└── config.yaml  # Model configuration
```

### 3.3 Convert to Blob Format

For OAK-D deployment, convert to `.blob` format:

**Option A: Online Converter (Recommended)**

1. Go to https://blobconverter.luxonis.com/
2. Upload `model.xml`
3. Select "OAK-D 4 Pro"
4. Download the `.blob` file

**Option B: Command Line**

```bash
pip install blobconverter
blobconverter --openvino-xml oak_d_deployment/model.xml
```

## 🧪 Step 4: Test and Deploy

### 4.1 Test OpenVINO Model

```bash
# Verify the model loads
python -c "from openvino.runtime import Core; core = Core(); model = core.read_model('oak_d_deployment/model.xml'); print('✓ Model loaded')"
```

### 4.2 Deploy to OAK-D

Use the DepthAI Python API or the OAK-D examples:

```python
import depthai as dai

# Create pipeline
pipeline = dai.Pipeline()

# Load your blob
detection_nn = pipeline.create(dai.node.NeuralNetwork)
detection_nn.setBlobPath("model.blob")

# Configure camera and output streams
# ... (see the DepthAI documentation)
```

### 4.3 Performance Optimization

- **Quantization**: Use 8-bit quantization for faster inference
- **Model Size**: Nano models work best on edge devices
- **Input Resolution**: 320x320 or 416x416 balances speed and accuracy
- **Calibration**: Test with real-world images for best results

## 🔧 Troubleshooting

### Common Issues

**GUI won't start**:

- Ensure Pillow and Tkinter are installed
- Check the Python version (3.8+ required)

**Training fails**:

- Verify the dataset format (COCO for RF-DETR, YOLO for others)
- Check GPU memory if using CUDA
- Reduce the batch size if out of memory

**Conversion fails**:

- Ensure the model is compatible with OpenVINO
- Check that input/output shapes match expectations
- Try different image sizes (320, 416, 512, 640)

**OAK-D deployment issues**:

- Verify the blob was created for the correct OAK model (4 Pro)
- Check camera calibration and input preprocessing
- Ensure the model input size matches the camera output

### Getting Help

- Check existing issues in the repository
- Review the DepthAI documentation: https://docs.luxonis.com/
- Test with the provided example models first

## 📊 Performance Benchmarks

Expected performance on the OAK-D 4 Pro:

| Model | Size | FPS | mAP | Use Case |
|-------|------|-----|-----|----------|
| RT-DETR | Nano | 25-35 | 0.75 | Balanced |
| YOLOX | Nano | 30-45 | 0.70 | Fast |
| RF-DETR | Nano | 15-25 | 0.80 | Accurate |

*Results vary based on model training and calibration.*

## 🎯 Next Steps

1. **Iterate**: Collect more data, retrain, redeploy
2. **Optimize**: Experiment with quantization and pruning
3. **Integrate**: Add your model to production applications
4. **Monitor**: Track performance in real-world conditions

---

**License**: All models are MIT/Apache 2.0 licensed - free for commercial use!
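To support the "Monitor" step above, the simplest field check is to time inference over a batch of frames and compare the result against the benchmark table. A minimal timing sketch; the `infer` callable is a placeholder for your real per-frame work (e.g., a DepthAI output-queue `get()`), here simulated with a fixed delay:

```python
import time

def measure_fps(infer, n_frames=50):
    """Return average frames per second over n_frames calls to infer()."""
    start = time.perf_counter()
    for _ in range(n_frames):
        infer()
    elapsed = time.perf_counter() - start
    return n_frames / elapsed

# Stand-in for a real inference call; each "frame" simulates ~10 ms of work,
# so the measured rate should come out a little under 100 FPS.
print(f"{measure_fps(lambda: time.sleep(0.01)):.1f} FPS")
```

Run it for a few hundred frames under real lighting and conveyor speed; a sustained rate well below the table's figures usually points at preprocessing overhead rather than the model itself.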