# OAK-D 4 Pro Workflow: Label, Train, and Convert AI Model

This guide walks you through the complete workflow for creating a custom wood knot detection model optimized for the OAK-D 4 Pro camera: from manual image annotation to trained-model conversion for edge deployment.

## 📋 Prerequisites

### Environment Setup
```bash
# Clone the repository
git clone git@143.244.157.110:dillon_stuff/saw_mill_knot_detection.git
cd saw_mill_knot_detection

# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

### Required Dependencies
- Python 3.8+
- Pillow (for image processing)
- Ultralytics (for YOLO/RT-DETR models)
- RF-DETR (optional, for RF-DETR models)
- OpenVINO (installed via the convert script)

## 🏷️ Step 1: Label Images

Use the Tkinter-based annotation GUI to manually label your wood surface images.

### 1.1 Prepare Images
Place your images in a directory (e.g., `IMAGE/`):
```
IMAGE/
├── image1.jpg
├── image2.jpg
└── annotations.json  # Created/updated by the GUI
```

### 1.2 Launch the Annotation GUI
```bash
# Using the convenience script
./run_tk_gui.sh --images-dir IMAGE/

# Or directly
python tk_annotation_gui.py --images-dir IMAGE/
```

### 1.3 Annotate Images
1. **Navigate**: Use the Prev/Next buttons or click image thumbnails
2. **Draw Boxes**: Click and drag on the image to create bounding boxes
3. **Auto-Label** (optional): Load trained weights and auto-detect knots
   - Enter the weights path (e.g., `runs/yolox_training/training/weights/best.pt`)
   - Select the model type (auto-detect usually works)
   - Set a confidence threshold (0.3-0.7 recommended)
   - Click "Load Model", then "Auto-Label Current"
4. **Edit Annotations**: Double-click list items to delete them, or manually draw corrections
5. **Save**: Annotations auto-save to `IMAGE/annotations.json`

### 1.4 Annotation Format
Each image gets entries like:
```json
{
  "image1.jpg": [
    {
      "bbox": [x1, y1, x2, y2],
      "label": "knot",
      "confidence": 1.0,
      "source": "manual"
    }
  ]
}
```
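
The format above is plain JSON, so it is easy to load and sanity-check from Python. A minimal sketch (the sample data and temp-file path here are illustrative, not produced by the GUI):

```python
import json
import tempfile
from pathlib import Path

# A minimal annotations.json in the format the GUI writes (sample data)
sample = {
    "image1.jpg": [
        {"bbox": [120, 45, 210, 130], "label": "knot",
         "confidence": 1.0, "source": "manual"}
    ]
}

ann_path = Path(tempfile.mkdtemp()) / "annotations.json"
ann_path.write_text(json.dumps(sample))

# Parse and sanity-check every box: bbox is [x1, y1, x2, y2] in pixels
annotations = json.loads(ann_path.read_text())
for image_name, boxes in annotations.items():
    for box in boxes:
        x1, y1, x2, y2 = box["bbox"]
        assert x1 < x2 and y1 < y2, f"degenerate box in {image_name}"
    print(f"{image_name}: {len(boxes)} box(es)")
```

A check like this is worth running before training, since a single degenerate box can silently hurt dataset conversion.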

**Tips**:
- Aim for 100-500 annotated images for good results
- Focus on challenging cases (small knots, lighting variations)
- Use auto-labeling to speed up the process, then manually correct the results

## 🏋️ Step 2: Train Model

Train a detection model on your annotated images.

### 2.1 Prepare Dataset (Optional)
The training script can prepare the dataset automatically, but you can also run this step manually:
```bash
python train_model.py --prepare-dataset --images-dir IMAGE --annotations annotations.json --dataset dataset_prepared
```
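
If you prepare data yourself, note that YOLO-style datasets store each box as normalized center coordinates rather than the GUI's pixel corners. A hypothetical conversion helper (the function name and class id are illustrative):

```python
# Convert one [x1, y1, x2, y2] pixel box to a YOLO label line:
# "class cx cy w h", with all coordinates normalized to the image size.
def to_yolo_line(bbox, img_w, img_h, class_id=0):
    x1, y1, x2, y2 = bbox
    cx = (x1 + x2) / 2 / img_w
    cy = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

print(to_yolo_line([100, 50, 300, 250], img_w=640, img_h=480))
# → 0 0.312500 0.312500 0.312500 0.416667
```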

### 2.2 Choose Model Framework
Available frameworks (all MIT/Apache 2.0 licensed):
- **RF-DETR**: Highest accuracy, slower inference
- **RT-DETR**: Good balance, optimized for edge devices
- **YOLOv6**: Fast inference, good for real-time use
- **YOLOX**: Versatile, widely supported

### 2.3 Train Model
```bash
# Basic training
python train_model.py \
    --framework rtdetr \
    --dataset dataset_prepared \
    --output runs/rtdetr_training \
    --model-size small \
    --epochs 100

# Advanced options
python train_model.py \
    --framework yolox \
    --dataset dataset_prepared \
    --output runs/yolox_training \
    --model-size nano \
    --epochs 50 \
    --batch-size 8 \
    --lr 0.001 \
    --prepare-dataset \
    --images-dir IMAGE \
    --annotations annotations.json
```
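
Training needs a held-out validation set. If your preparation step does not already create one, a simple deterministic 80/20 split can be sketched as follows (the function name, fraction, and seed are illustrative):

```python
import random

# Hypothetical 80/20 train/val split of annotated image names.
# Sorting + a fixed seed makes the split reproducible across runs.
def split_dataset(names, val_fraction=0.2, seed=0):
    names = sorted(names)
    random.Random(seed).shuffle(names)
    n_val = max(1, int(len(names) * val_fraction))
    return names[n_val:], names[:n_val]

train, val = split_dataset([f"image{i}.jpg" for i in range(10)])
print(len(train), len(val))  # → 8 2
```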

### 2.4 Monitor Training
- Check `runs/*/training/` for logs and checkpoints
- Training saves the best model as `best.pt`
- Use TensorBoard or Weights & Biases for monitoring (if configured)

**Training Tips**:
- Start with `nano` or `small` models for faster iteration
- 50-200 epochs are typically sufficient
- Monitor validation mAP for convergence
- Use data augmentation for better generalization
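
The validation mAP you monitor is built on box IoU (intersection over union). As a quick reference, IoU between two `[x1, y1, x2, y2]` boxes is:

```python
def iou(a, b):
    # Intersection-over-union of two [x1, y1, x2, y2] boxes
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Two equal boxes overlapping by half: intersection is one third of the union
print(iou([0, 0, 100, 100], [50, 0, 150, 100]))
```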

## 🔄 Step 3: Convert for OAK-D Deployment

Convert the trained model to OpenVINO format for the OAK-D 4 Pro.

### 3.1 Run Conversion
```bash
# Basic conversion
python convert_for_deployment.py \
    --model runs/rtdetr_training/training/weights/best.pt \
    --output oak_d_deployment

# Advanced options
python convert_for_deployment.py \
    --model runs/yolox_training/training/weights/best.pt \
    --output oak_d_deployment \
    --img-size 640 \
    --framework auto
```

### 3.2 Output Files
After conversion, you'll get:
```
oak_d_deployment/
├── model.xml   # OpenVINO IR model
├── model.bin   # OpenVINO IR weights
├── model.onnx  # ONNX format (intermediate)
└── config.yaml # Model configuration
```

### 3.3 Convert to Blob Format
For OAK-D deployment, convert the OpenVINO IR to `.blob` format:

**Option A: Online Converter (Recommended)**
1. Go to https://blobconverter.luxonis.com/
2. Upload `model.xml`
3. Select "OAK-D 4 Pro"
4. Download the `.blob` file

**Option B: Command Line**
```bash
pip install blobconverter
blobconverter --openvino-xml oak_d_deployment/model.xml
```

## 🧪 Step 4: Test and Deploy

### 4.1 Test the OpenVINO Model
```bash
# Verify the model loads
python -c "from openvino.runtime import Core; core = Core(); model = core.read_model('oak_d_deployment/model.xml'); print('✓ Model loaded')"
```

### 4.2 Deploy to OAK-D
Use the DepthAI Python API or the OAK-D examples. A minimal pipeline sketch (the camera and output wiring shown here is one common arrangement; adapt it to your setup per the DepthAI documentation):
```python
import depthai as dai

# Create pipeline
pipeline = dai.Pipeline()

# Camera node feeding the network (preview size must match the model input)
cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(416, 416)

# Load your blob
detection_nn = pipeline.create(dai.node.NeuralNetwork)
detection_nn.setBlobPath("model.blob")
cam.preview.link(detection_nn.input)

# Stream network output back to the host
xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("nn")
detection_nn.out.link(xout.input)
```

### 4.3 Performance Optimization
- **Quantization**: Use 8-bit quantization for faster inference
- **Model Size**: Nano models work best on edge devices
- **Input Resolution**: 320x320 or 416x416 balances speed and accuracy
- **Calibration**: Test with real-world images for best results
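
Matching the camera frame to the model's input resolution is usually done with a letterbox resize (scale preserving aspect ratio, then pad to a square). A sketch using Pillow, which is already a project dependency (the gray fill value follows a common YOLO convention; adjust to match your training preprocessing):

```python
from PIL import Image

def letterbox(img, size=416, fill=(114, 114, 114)):
    # Resize keeping aspect ratio, then pad onto a size x size canvas
    scale = size / max(img.size)
    new_w, new_h = round(img.width * scale), round(img.height * scale)
    resized = img.resize((new_w, new_h))
    canvas = Image.new("RGB", (size, size), fill)
    canvas.paste(resized, ((size - new_w) // 2, (size - new_h) // 2))
    return canvas

out = letterbox(Image.new("RGB", (1280, 720)))
print(out.size)  # → (416, 416)
```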

## 🔧 Troubleshooting

### Common Issues

**GUI won't start**:
- Ensure Pillow and Tkinter are installed
- Check your Python version (3.8+ required)

**Training fails**:
- Verify the dataset format (COCO for RF-DETR, YOLO for the others)
- Check GPU memory if using CUDA
- Reduce the batch size if you run out of memory

**Conversion fails**:
- Ensure the model is compatible with OpenVINO
- Check that input/output shapes match expectations
- Try different image sizes (320, 416, 512, 640)

**OAK-D deployment issues**:
- Verify the blob was created for the correct OAK model (4 Pro)
- Check camera calibration and input preprocessing
- Ensure the model input size matches the camera output

### Getting Help
- Check existing issues in the repository
- Review the DepthAI documentation: https://docs.luxonis.com/
- Test with the provided example models first

## 📊 Performance Benchmarks

Expected performance on the OAK-D 4 Pro:

| Model | Size | FPS | mAP | Use Case |
|-------|------|-----|-----|----------|
| RT-DETR | Nano | 25-35 | 0.75 | Balanced |
| YOLOX | Nano | 30-45 | 0.70 | Fast |
| RF-DETR | Nano | 15-25 | 0.80 | Accurate |

*Results vary based on model training and calibration.*
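
When budgeting an end-to-end processing loop, it can help to restate the FPS figures above as per-frame latency, using the low end of each range as a conservative estimate:

```python
# Per-frame inference latency implied by a frame rate
def fps_to_latency_ms(fps):
    return 1000.0 / fps

# Low end of each benchmark range above (conservative)
for name, fps in [("RT-DETR nano", 25), ("YOLOX nano", 30), ("RF-DETR nano", 15)]:
    print(f"{name}: {fps_to_latency_ms(fps):.1f} ms/frame")
```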

## 🎯 Next Steps

1. **Iterate**: Collect more data, retrain, redeploy
2. **Optimize**: Experiment with quantization and pruning
3. **Integrate**: Add your model to production applications
4. **Monitor**: Track performance in real-world conditions

---
**License**: All models are MIT/Apache 2.0 licensed - free for commercial use!