# Saw Mill Knot Detection (YOLOX/YOLO)

This repository contains a complete wood defect detection system built on YOLOX/YOLO models and trained to detect 10 types of wood surface defects. It includes a web-based annotation GUI and an automated training pipeline, and it is optimized for deployment on OAK-D cameras.

## 🎯 Project Overview

- **Model**: YOLOX-nano (Ultralytics YOLO framework)
- **Dataset**: 20,276 wood surface defect images with 10 defect categories
- **Training**: 5 epochs, mAP50: 0.612, mAP50-95: 0.357
- **Deployment Target**: OAK-D 4 Pro camera
- **Framework**: Ultralytics 8.3.240

## 📊 Dataset Information

**Source**: [Kaggle Wood Surface Defects Dataset](https://www.kaggle.com/datasets/kirs0816/wood-surface-defects)

**Classes** (10 total):
- Live knot
- Dead knot
- Knot with crack
- Crack
- Resin
- Marrow
- Quartzity
- Knot missing
- Blue stain
- Overgrown

**Dataset Split**:
- Train: 16,220 images
- Valid: 2,027 images
- Test: 2,029 images
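
As a quick sanity check, the split sizes above can be summed and turned into fractions. This is a minimal sketch that uses only the counts listed in this README; the variable names are illustrative.

```python
# Image counts taken from the split listed above.
split_counts = {"train": 16_220, "valid": 2_027, "test": 2_029}

total = sum(split_counts.values())
fractions = {name: count / total for name, count in split_counts.items()}

print(f"total images: {total}")
for name, frac in fractions.items():
    # Prints each split as a percentage of the full dataset.
    print(f"{name}: {frac:.1%}")
```

The counts add up to the 20,276 images quoted in the overview and correspond to roughly an 80/10/10 split.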

**Formats Available**:
- `dataset_coco/` → COCO format for RF-DETR
- `dataset_yolo/` → YOLO format for YOLOX, YOLOv6, YOLOv8
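
The two formats store boxes differently: COCO uses `[x_min, y_min, width, height]` in pixels, while YOLO label files use a normalized `x_center y_center width height`. The sketch below shows the conversion for a single box. The function name `coco_to_yolo` is hypothetical and is not necessarily how `split_coco_dataset.py` implements it.

```python
def coco_to_yolo(bbox, img_w, img_h):
    """Convert a COCO box [x_min, y_min, width, height] in pixels to a
    YOLO box (x_center, y_center, width, height) normalized to [0, 1]."""
    x_min, y_min, w, h = bbox
    return (
        (x_min + w / 2) / img_w,  # box center x, as a fraction of image width
        (y_min + h / 2) / img_h,  # box center y, as a fraction of image height
        w / img_w,
        h / img_h,
    )


# Hypothetical example: a 200x100 px box at (100, 50) in a 1000x500 px image.
print(coco_to_yolo([100, 50, 200, 100], img_w=1000, img_h=500))
```

YOLO class indices are zero-based, so COCO category IDs may also need remapping if they start at 1.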

## 🚀 Quick Start

### 1. Environment Setup

```bash
# Clone the repository
git clone git@143.244.157.110:dillon_stuff/saw_mill_knot_detection.git
cd saw_mill_knot_detection

# Create virtual environment
python -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -U pip
pip install ultralytics gradio rfdetr
```

### 2. Set Up Datasets

```bash
# Download dataset from Kaggle (requires Kaggle API)
kaggle datasets download -d kirs0816/wood-surface-defects
unzip wood-surface-defects.zip

# Create multi-format datasets
python split_coco_dataset.py  # Creates dataset_yolo/
python setup_datasets.py      # Creates dataset_coco/ and updates configs
```

### 3. Launch Annotation GUI

```bash
python annotation_gui.py
```

Open http://localhost:7860 in your browser to access the web-based annotation interface with:
- Image navigation with index display
- Auto-labeling with trained YOLOX model
- Manual annotation tools
- Real-time result visualization

### 4. Train Models

Choose from three different frameworks:

#### RF-DETR (Highest accuracy, slower training)
```bash
python train_rfdetr.py \
  --dataset-dir dataset_coco \
  --output-dir runs/rfdetr_medium \
  --model medium \
  --epochs 50 \
  --batch-size 4 \
  --grad-accum-steps 4 \
  --lr 1e-4
```
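
With gradient accumulation, the optimizer sees a larger effective batch than `--batch-size` alone suggests. The arithmetic for the command above can be sketched as follows, assuming the last partial batch of each epoch is kept rather than dropped (the training script may behave differently).

```python
import math

train_images = 16_220   # training split size from this README
batch_size = 4          # --batch-size
grad_accum_steps = 4    # --grad-accum-steps

# Each optimizer update accumulates gradients over this many images.
effective_batch = batch_size * grad_accum_steps

# Assumption: partial batches at the end of an epoch still trigger an update.
optimizer_steps_per_epoch = math.ceil(train_images / effective_batch)

print(effective_batch, optimizer_steps_per_epoch)
```

Under these assumptions the effective batch size is 16, which gives about 1,014 optimizer updates per epoch.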

#### YOLOX (Balanced performance/speed)
```bash
python train_yolox.py \
  --dataset-dir dataset_yolo \
  --model yolox-nano \
  --epochs 50 \
  --batch-size 8
```

#### YOLOv6 (Fastest, edge-optimized)
```bash
python train_yolov6.py \
  --dataset-dir dataset_yolo \
  --model yolov6n \
  --epochs 50 \
  --batch-size 8
```

## 📁 Project Structure

```
saw_mill_knot_detection/
├── annotation_gui.py          # Gradio web interface for annotation
├── train_rfdetr.py           # RF-DETR training script
├── train_yolox.py            # YOLOX training script
├── train_yolov6.py           # YOLOv6 training script
├── setup_datasets.py         # Multi-format dataset setup script
├── split_coco_dataset.py     # Dataset splitting utility
├── config.py                 # Configuration settings
├── dataset_coco/             # RF-DETR dataset (COCO format)
│   ├── train/
│   │   ├── *.jpg             # Training images
│   │   └── _annotations.coco.json
│   ├── valid/
│   │   ├── *.jpg             # Validation images
│   │   └── _annotations.coco.json
│   └── test/
│       ├── *.jpg             # Test images
│       └── _annotations.coco.json
├── dataset_yolo/             # YOLOX/YOLOv6/YOLOv8 dataset (YOLO format)
│   ├── train/
│   │   ├── images/           # Training images
│   │   └── labels/           # YOLO format labels
│   ├── valid/
│   │   ├── images/           # Validation images
│   │   └── labels/           # YOLO format labels
│   ├── test/
│   │   ├── images/           # Test images
│   │   └── labels/           # YOLO format labels
│   └── data.yaml             # YOLO dataset configuration
├── runs/                     # Training outputs (excluded from git)
├── bbox_coco_dataset.json     # Original COCO annotations
├── requirements.txt           # Python dependencies
├── .gitignore                # Excludes large data files
└── README.md                 # This file
```
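
For reference, `dataset_yolo/data.yaml` follows the Ultralytics dataset layout (`path`, `train`, `val`, `test`, `names`). The sketch below builds such a file as text for the directory tree above. The helper `build_data_yaml` is hypothetical, and the class order is assumed to match the class list in this README; the file generated by the setup scripts may differ.

```python
# Class names copied from this README; the index order is an assumption.
CLASS_NAMES = [
    "Live knot", "Dead knot", "Knot with crack", "Crack",
    "Resin", "Marrow", "Quartzity", "Knot missing",
    "Blue stain", "Overgrown",
]


def build_data_yaml(root="dataset_yolo"):
    """Return the text of a YOLO-style data.yaml for the layout shown above."""
    lines = [
        f"path: {root}",
        "train: train/images",
        "val: valid/images",
        "test: test/images",
        "names:",
    ]
    # One "index: name" entry per class, indented under "names:".
    lines += [f"  {i}: {name}" for i, name in enumerate(CLASS_NAMES)]
    return "\n".join(lines) + "\n"


print(build_data_yaml())
```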

## 🤖 Framework Comparison

| Framework | Accuracy | Speed | Memory | Deployment | Best For |
|-----------|----------|-------|--------|------------|----------|
| **RF-DETR** | ⭐⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐ | CPU/GPU | Highest accuracy, research |
| **YOLOX** | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ | Edge devices | Balanced performance |
| **YOLOv6** | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ | Mobile/Edge | Fast inference, production |

## 🛠️ Usage Guide

### Annotation GUI Features

The Gradio-based annotation interface provides:

- **Image Navigation**: Browse through dataset with current index display
- **Auto-Labeling**: One-click defect detection using trained YOLOX model
- **Manual Annotation**: Draw bounding boxes for corrections
- **Real-time Visualization**: Immediate display of detection results
- **Export Options**: Save annotations in multiple formats
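
To illustrate what saving an auto-labeled box in YOLO format involves, the sketch below turns a pixel-space `(x1, y1, x2, y2)` box into one line of a YOLO label file. The function `to_yolo_label_line` is a hypothetical helper and is not taken from `annotation_gui.py`.

```python
def to_yolo_label_line(class_id, xyxy, img_w, img_h):
    """Format one detection as a YOLO label line:
    'class x_center y_center width height', all coordinates normalized."""
    x1, y1, x2, y2 = xyxy
    cx = (x1 + x2) / 2 / img_w
    cy = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


# Hypothetical detection of class 1 in a 1000x500 px image.
print(to_yolo_label_line(1, (100, 50, 300, 150), img_w=1000, img_h=500))
```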

### Training

```bash
# Basic training
python train_yolox.py --dataset-dir dataset_yolo --model yolox-nano --epochs 10

# Advanced training with custom parameters
python train_yolox.py \
  --dataset-dir dataset_yolo \
  --model yolox-nano \
  --epochs 20 \
  --batch-size 8 \
  --img-size 640
```

### Inference

```python
from ultralytics import YOLO

# Load trained model
model = YOLO('runs/yolox_training/training/weights/best.pt')

# Predict on image
results = model.predict('path/to/image.jpg', conf=0.4)

# Process results
for result in results:
    boxes = result.boxes  # Bounding boxes
    for box in boxes:
        cls = int(box.cls)  # Class index
        conf = float(box.conf)  # Confidence score
        xyxy = box.xyxy.tolist()[0]  # Box coordinates
```
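
The class indices from the loop above can be mapped to defect names and tallied per image. The sketch below works on plain `(class_index, confidence)` pairs so it runs without a model. It assumes the model was trained with the class order listed in this README, and `summarize` is a hypothetical helper. Ultralytics results also expose a `names` mapping that can be used in place of the hard-coded list.

```python
from collections import Counter

# Assumed to match the class order used at training time.
WOOD_DEFECT_CLASSES = [
    "Live knot", "Dead knot", "Knot with crack", "Crack",
    "Resin", "Marrow", "Quartzity", "Knot missing",
    "Blue stain", "Overgrown",
]


def summarize(detections, min_conf=0.4):
    """Count detections per defect name, ignoring low-confidence ones.

    detections: iterable of (class_index, confidence) pairs.
    """
    counts = Counter(
        WOOD_DEFECT_CLASSES[cls] for cls, conf in detections if conf >= min_conf
    )
    return dict(counts)


# Hypothetical detections for one board image.
sample = [(0, 0.91), (0, 0.55), (3, 0.42), (1, 0.30)]
print(summarize(sample))  # {'Live knot': 2, 'Crack': 1}
```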

## 🔧 Configuration

Key settings in `config.py`:

```python
DEFAULT_MODEL_WEIGHTS = "runs/yolox_training/training/weights/best.pt"
DEFAULT_IMAGES_DIR = "IMAGE/"
WOOD_DEFECT_CLASSES = [
    'Live knot', 'Dead knot', 'Knot with crack', 'Crack',
    'Resin', 'Marrow', 'Quartzity', 'Knot missing',
    'Blue stain', 'Overgrown'
]
```

## 📈 Model Performance

**YOLOX-nano Results** (5 epochs):
- mAP50: 0.612
- mAP50-95: 0.357
- Precision: 0.68
- Recall: 0.55
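
Precision and recall can be combined into a single F1 score (their harmonic mean). The value below is derived from the two numbers reported above; it is not a separately measured result.

```python
precision = 0.68
recall = 0.55

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"F1: {f1:.3f}")  # F1: 0.608
```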

## 🎯 Deployment on OAK-D

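The trained model can be exported to ONNX as the first step toward OAK-D deployment. The ONNX file must then be converted to the camera's native model format with Luxonis tooling: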

```python
from ultralytics import YOLO

# Load and export model
model = YOLO('runs/yolox_training/training/weights/best.pt')
model.export(format='onnx')  # Export to ONNX for OAK-D
```

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
5. Submit a pull request

## 📄 License

This project uses the Kaggle Wood Surface Defects dataset. Please refer to the original dataset license for usage terms.

## 🙏 Acknowledgments

- Kaggle for providing the wood surface defects dataset
- Ultralytics for the YOLO framework
- Gradio for the web interface framework