# OAK-D 4 Pro Workflow: Label, Train, and Convert AI Model

This guide walks you through the complete workflow for creating a custom wood knot detection model optimized for the OAK-D 4 Pro camera: from manual image annotation to trained-model conversion for edge deployment.

## 📋 Prerequisites

### Environment Setup

```bash
# Clone the repository
git clone git@143.244.157.110:dillon_stuff/saw_mill_knot_detection.git
cd saw_mill_knot_detection

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

### Required Dependencies

- Python 3.8+
- Pillow (for image processing)
- Ultralytics (for YOLO/RT-DETR models)
- RF-DETR (optional, for RF-DETR models)
- OpenVINO (installed via the convert script)

## 🏷️ Step 1: Label Images

Use the Tkinter-based annotation GUI to manually label your wood surface images.

### 1.1 Prepare Images

Place your images in a directory (e.g., `IMAGE/`):

```
IMAGE/
├── image1.jpg
├── image2.jpg
└── annotations.json  # Will be created/updated
```
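
When returning to a partially labeled set, it helps to know which images still have no entries. A stdlib-only sketch (`find_unlabeled` is a hypothetical helper for illustration, not part of this repo; it assumes the `annotations.json` layout shown in section 1.4):

```python
import json
from pathlib import Path

def find_unlabeled(images_dir):
    """Return image filenames that have no annotation entries yet."""
    root = Path(images_dir)
    ann_path = root / "annotations.json"
    annotations = json.loads(ann_path.read_text()) if ann_path.exists() else {}
    images = sorted(p.name for p in root.iterdir()
                    if p.suffix.lower() in {".jpg", ".jpeg", ".png"})
    # An absent key or an empty list both count as unlabeled
    return [name for name in images if not annotations.get(name)]
```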

### 1.2 Launch Annotation GUI

```bash
# Using the convenience script
./run_tk_gui.sh --images-dir IMAGE/

# Or directly
python tk_annotation_gui.py --images-dir IMAGE/
```

### 1.3 Annotate Images

1. **Navigate**: Use the Prev/Next buttons or click image thumbnails
2. **Draw Boxes**: Click and drag on the image to create bounding boxes
3. **Auto-Label** (optional): Load trained weights and auto-detect knots
   - Enter the weights path (e.g., `runs/yolox_training/training/weights/best.pt`)
   - Select the model type (auto-detect usually works)
   - Set a confidence threshold (0.3-0.7 recommended)
   - Click "Load Model", then "Auto-Label Current"
4. **Edit Annotations**: Double-click list items to delete them, or manually draw corrections
5. **Save**: Annotations auto-save to `IMAGE/annotations.json`

### 1.4 Annotation Format

Each image gets entries like:

```json
{
  "image1.jpg": [
    {
      "bbox": [x1, y1, x2, y2],
      "label": "knot",
      "confidence": 1.0,
      "source": "manual"
    }
  ]
}
```
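
For reference, converting these absolute-corner `[x1, y1, x2, y2]` boxes to the normalized center-format lines used by YOLO label files is a few lines of arithmetic. The training script already handles dataset preparation; `to_yolo_line` below is a hypothetical helper shown only to make the format concrete:

```python
def to_yolo_line(bbox, img_w, img_h, class_id=0):
    """Convert an [x1, y1, x2, y2] pixel box to a normalized YOLO label line."""
    x1, y1, x2, y2 = bbox
    cx = (x1 + x2) / 2 / img_w  # box center x, normalized to [0, 1]
    cy = (y1 + y2) / 2 / img_h  # box center y
    w = (x2 - x1) / img_w       # box width
    h = (y2 - y1) / img_h       # box height
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

# Example: a 100x50 box at (100, 100) in a 640x480 image
print(to_yolo_line([100, 100, 200, 150], 640, 480))
# → 0 0.234375 0.260417 0.156250 0.104167
```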

**Tips**:
- Aim for 100-500 annotated images for good results
- Focus on challenging cases (small knots, lighting variations)
- Use auto-labeling to speed up the process, then correct manually

## 🏋️ Step 2: Train Model

Train a detection model using your annotated images.

### 2.1 Prepare Dataset (Optional)

The training script can prepare the dataset automatically, but you can also run this step on its own:

```bash
python train_model.py --prepare-dataset --images-dir IMAGE --annotations annotations.json --dataset dataset_prepared
```
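
A prepared dataset is largely a reproducible train/val split of the labeled images. If you want to reason about that step, a minimal sketch (the actual split logic inside `train_model.py` may differ; `split_dataset` is a hypothetical helper):

```python
import random

def split_dataset(image_names, val_fraction=0.2, seed=42):
    """Shuffle reproducibly, then split into (train, val) lists."""
    names = sorted(image_names)  # sort first so the split is stable across runs
    rng = random.Random(seed)
    rng.shuffle(names)
    n_val = max(1, int(len(names) * val_fraction))
    return names[n_val:], names[:n_val]

train, val = split_dataset([f"img{i}.jpg" for i in range(10)])
```

Fixing the seed means re-running preparation never silently moves images between splits, which would contaminate validation metrics.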

### 2.2 Choose Model Framework

Available frameworks (all MIT/Apache 2.0 licensed):

- **RF-DETR**: Highest accuracy, slower inference
- **RT-DETR**: Good balance, optimized for edge devices
- **YOLOv6**: Fast inference, good for real-time use
- **YOLOX**: Versatile, widely supported

### 2.3 Train Model

```bash
# Basic training
python train_model.py \
    --framework rtdetr \
    --dataset dataset_prepared \
    --output runs/rtdetr_training \
    --model-size small \
    --epochs 100

# Advanced options
python train_model.py \
    --framework yolox \
    --dataset dataset_prepared \
    --output runs/yolox_training \
    --model-size nano \
    --epochs 50 \
    --batch-size 8 \
    --lr 0.001 \
    --prepare-dataset \
    --images-dir IMAGE \
    --annotations annotations.json
```

### 2.4 Monitor Training

- Check `runs/*/training/` for logs and checkpoints
- Training saves the best model as `best.pt`
- Use TensorBoard or Weights & Biases for monitoring (if configured)

**Training Tips**:
- Start with `nano` or `small` models for faster iteration
- 50-200 epochs is typically sufficient
- Monitor validation mAP for convergence
- Use data augmentation for better generalization
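
Convergence is easiest to judge from the validation mAP curve: a plateau check can be as simple as asking whether the best value improved meaningfully over the last few epochs. A framework-independent sketch (`has_plateaued` is a hypothetical helper, not something the training script exposes):

```python
def has_plateaued(map_history, patience=10, min_delta=0.001):
    """True if best validation mAP hasn't improved by min_delta in the last `patience` epochs."""
    if len(map_history) <= patience:
        return False  # not enough history to judge
    best_before = max(map_history[:-patience])
    best_recent = max(map_history[-patience:])
    return best_recent - best_before < min_delta
```

If this returns `True` for several checks in a row, further epochs are unlikely to help and it is time to stop or adjust hyperparameters.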

## 🔄 Step 3: Convert for OAK-D Deployment

Convert the trained model to OpenVINO format for the OAK-D 4 Pro.

### 3.1 Run Conversion

```bash
# Basic conversion
python convert_for_deployment.py \
    --model runs/rtdetr_training/training/weights/best.pt \
    --output oak_d_deployment

# Advanced options
python convert_for_deployment.py \
    --model runs/yolox_training/training/weights/best.pt \
    --output oak_d_deployment \
    --img-size 640 \
    --framework auto
```

### 3.2 Output Files

After conversion, you'll get:

```
oak_d_deployment/
├── model.xml    # OpenVINO IR model
├── model.bin    # OpenVINO IR weights
├── model.onnx   # ONNX format (intermediate)
└── config.yaml  # Model configuration
```

### 3.3 Convert to Blob Format

For OAK-D deployment, convert the OpenVINO IR to `.blob` format:

**Option A: Online Converter (Recommended)**

1. Go to https://blobconverter.luxonis.com/
2. Upload `model.xml`
3. Select "OAK-D 4 Pro"
4. Download the `.blob` file

**Option B: Command Line**

```bash
pip install blobconverter
blobconverter --openvino-xml oak_d_deployment/model.xml
```

## 🧪 Step 4: Test and Deploy

### 4.1 Test OpenVINO Model

```bash
# Verify the model loads
python -c "from openvino.runtime import Core; core = Core(); model = core.read_model('oak_d_deployment/model.xml'); print('✓ Model loaded')"
```

### 4.2 Deploy to OAK-D

Use the DepthAI Python API or the OAK-D examples:

```python
import depthai as dai

# Create pipeline
pipeline = dai.Pipeline()

# Load your blob
detection_nn = pipeline.create(dai.node.NeuralNetwork)
detection_nn.setBlobPath("model.blob")

# Configure camera and output streams
# ... (see DepthAI documentation)
```

### 4.3 Performance Optimization

- **Quantization**: Use 8-bit quantization for faster inference
- **Model Size**: Nano models work best on edge devices
- **Input Resolution**: 320x320 or 416x416 balances speed and accuracy
- **Calibration**: Test with real-world images for best results
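
Dropping the input resolution only pays off if preprocessing preserves aspect ratio, which detection pipelines usually do by letterboxing: scale to fit, then pad the remainder. The padding arithmetic alone, as a sketch (`letterbox_params` is a hypothetical helper; camera nodes can also handle resizing on-device):

```python
def letterbox_params(src_w, src_h, dst=416):
    """Scale factor, scaled size, and (left, top) padding to fit src into a dst x dst square."""
    scale = min(dst / src_w, dst / src_h)          # fit the larger dimension
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_left = (dst - new_w) // 2                  # center horizontally
    pad_top = (dst - new_h) // 2                   # center vertically
    return scale, new_w, new_h, pad_left, pad_top

# A 1280x720 frame into a 416x416 input: scaled to 416x234, padded top and bottom
print(letterbox_params(1280, 720))
# → (0.325, 416, 234, 0, 91)
```

The same scale and offsets must be inverted on the model's output boxes to map detections back to full-frame coordinates.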

## 🔧 Troubleshooting

### Common Issues

**GUI won't start**:
- Ensure Pillow and Tkinter are installed
- Check your Python version (3.8+ required)

**Training fails**:
- Verify the dataset format (COCO for RF-DETR, YOLO for the others)
- Check GPU memory if using CUDA
- Reduce the batch size if you run out of memory

**Conversion fails**:
- Ensure the model is compatible with OpenVINO
- Check that input/output shapes match expectations
- Try different image sizes (320, 416, 512, 640)

**OAK-D deployment issues**:
- Verify the blob was created for the correct OAK model (4 Pro)
- Check camera calibration and input preprocessing
- Ensure the model input size matches the camera output

### Getting Help

- Check existing issues in the repository
- Review the DepthAI documentation: https://docs.luxonis.com/
- Test with the provided example models first

## 📊 Performance Benchmarks

Expected performance on the OAK-D 4 Pro:

| Model   | Size | FPS   | mAP  | Use Case |
|---------|------|-------|------|----------|
| RT-DETR | Nano | 25-35 | 0.75 | Balanced |
| YOLOX   | Nano | 30-45 | 0.70 | Fast     |
| RF-DETR | Nano | 15-25 | 0.80 | Accurate |

*Results vary based on model training and calibration.*

## 🎯 Next Steps

1. **Iterate**: Collect more data, retrain, redeploy
2. **Optimize**: Experiment with quantization and pruning
3. **Integrate**: Add your model to production applications
4. **Monitor**: Track performance in real-world conditions

---

**License**: All models are MIT/Apache 2.0 licensed - free for commercial use!