# 🛠️ GARNET: AI-Driven P&ID Symbol Detection and Analysis
**G**CME **A**I-**R**ecognition **N**etwork for **E**ngineering **T**echnology
_Precision in Every Connection_
[Ultralytics](https://github.com/ultralytics/ultralytics)
[OpenCV](https://opencv.org/)
[NetworkX](https://networkx.org/)
[EasyOCR](https://github.com/JaidedAI/EasyOCR)
[FastAPI](https://fastapi.tiangolo.com/)
[React](https://react.dev/)
[TypeScript](https://www.typescriptlang.org/)
[Vite](https://vitejs.dev/)
GARNET is an AI-powered tool designed to **automate symbol detection, classification, and connectivity analysis** in Piping and Instrumentation Diagrams (P&IDs). Built for engineers and maintenance teams, it combines state-of-the-art object detection (YOLOv11/YOLOv8) with graph-based analytics to transform P&ID workflows.
---
## 🚀 Features
### Frontend Features
- **Interactive Canvas**: Pan, zoom, and navigate through large P&ID images with minimap support
- **Object Detection Visualization**: Color-coded bounding boxes with confidence scores
- **Object Editing**: Create, update, and delete detected objects directly on the canvas
- **Review Workflow**: Accept/reject objects with visual status indicators
- **Batch Processing**: Process multiple images with queue management, pause/resume, and progress tracking
- **Undo/Redo**: Full history support for all editing operations
- **Keyboard Shortcuts**: Efficient navigation and editing with keyboard shortcuts
- **Export Formats**: Export to JSON, YOLO, COCO, LabelMe, or PDF
- **Dark Mode**: Toggle between light and dark themes
- **Confidence Filtering**: Filter objects by confidence threshold
- **Class Visibility**: Toggle visibility of specific object classes
### Backend Features
- **Symbol Detection**: Identify valves (gate, globe, check), pumps, tanks, and more using YOLOv11/YOLOv8.
- **SAHI Integration**: Slicing Aided Hyper Inference for accurate detection on large images.
- **Automated Counting**: Generate counts for each symbol type in a P&ID.
- **Text Recognition (OCR)**: Extract text annotations from symbols using EasyOCR with support for vertical text.
- **Model Caching**: Cache loaded models for faster subsequent detections.
- **Results Caching**: In-memory cache with TTL for detection results.
- **Automatic Cleanup**: Periodic cleanup of old prediction images and expired cache entries.
- **Health Monitoring**: Health check endpoint with model loading status and memory usage.
- **Environment Configuration**: Configurable via environment variables for development and production.
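
As an illustration of the results caching and automatic cleanup features listed above, here is a minimal sketch of an in-memory cache with a size cap and per-entry expiry. It is not GARNET's actual implementation: the `TTLCache` class, its method names, and the injectable `clock` are hypothetical and only show the general idea.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Hypothetical in-memory cache with a size cap and per-entry TTL."""

    def __init__(self, max_size=100, ttl_seconds=3600, clock=time.monotonic):
        self.max_size = max_size
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self._items = OrderedDict()  # key -> (expires_at, value), oldest first

    def set(self, key, value):
        self._items.pop(key, None)
        self._items[key] = (self.clock() + self.ttl_seconds, value)
        # Evict the oldest entries once the size cap is exceeded.
        while len(self._items) > self.max_size:
            self._items.popitem(last=False)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._items[key]  # lazily drop an expired entry on access
            return default
        return value

    def cleanup(self):
        """Remove every expired entry and return how many were removed."""
        now = self.clock()
        expired = [k for k, (exp, _) in self._items.items() if now >= exp]
        for key in expired:
            del self._items[key]
        return len(expired)
```

A periodic task could call `cleanup()` at a fixed interval, while `get()` already hides expired entries between runs.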
---
## 📐 Architecture
### System Overview
GARNET follows a modern client-server architecture with a clear separation between the interactive frontend and the AI-powered backend:
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                             GARNET Architecture                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌──────────────────────────────┐          ┌─────────────────────────────┐  │
│  │        React Frontend        │          │       FastAPI Backend       │  │
│  │  ┌────────────────────────┐  │          │  ┌───────────────────────┐  │  │
│  │  │ React 18 + TypeScript  │  │◄────────►│  │ FastAPI               │  │  │
│  │  │ Zustand (State)        │  │   HTTP   │  │ SAHI (Slicing)        │  │  │
│  │  │ Tailwind CSS + Radix   │  │          │  │ Ultralytics (YOLO)    │  │  │
│  │  │ Vite (Build Tool)      │  │          │  │ EasyOCR (Text)        │  │  │
│  │  └────────────────────────┘  │          │  │ OpenCV (Image Proc)   │  │  │
│  │                              │          │  └───────────────────────┘  │  │
│  │  Port: 5173 (dev) / 80/443   │          │  Port: 8001                 │  │
│  │  frontend/                   │          │  backend/api.py             │  │
│  └──────────────────────────────┘          └─────────────────────────────┘  │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```
### Frontend Stack
| Technology | Version | Purpose |
| ---------------- | ------- | ----------------------------------------------- |
| **React** | 18.3.1 | UI library with hooks and functional components |
| **TypeScript** | 5.7.3 | Type-safe development |
| **Vite** | 6.0.7 | Fast development server and optimized builds |
| **Zustand** | 5.0.3 | Lightweight state management |
| **Tailwind CSS** | 3.4.17 | Utility-first CSS framework |
| **Radix UI** | Latest | Accessible, unstyled UI primitives |
| **Lucide React** | Latest | Modern icon library |
| **jsPDF** | 4.0.0 | PDF export functionality |
#### Frontend Structure
```
frontend/
├── src/
│ ├── components/ # React components
│ │ ├── ui/ # Reusable UI primitives (Radix-based)
│ │ ├── App.tsx # Main application component
│ │ ├── UploadZone.tsx # Image upload with drag-and-drop
│ │ ├── DetectionSetup.tsx # Detection parameter configuration
│ │ ├── ResultsView.tsx # Detection results with interactive canvas
│ │ ├── BatchResultsView.tsx # Batch processing interface
│ │ ├── ProcessingView.tsx # Processing status view
│ │ ├── ZoomControls.tsx # Canvas zoom controls
│ │ ├── ObjectSidebar.tsx # Object list and review controls
│ │ ├── Header.tsx # Application header
│ │ ├── ErrorBoundary.tsx # Error boundary component
│ │ ├── CanvasView.tsx # Main canvas implementation
│ │ ├── ReviewCanvasLayers.tsx # Canvas layers for review
│ │ ├── PipelineResultsView.tsx # Pipeline results view
│ │ ├── PipelineHitlReviewView.tsx # Pipeline HITL review
│ │ ├── PipelineReviewWorkspaceView.tsx # Pipeline workspace
│ │ ├── Stage6LineAssociationReview.tsx # Stage 6 line association review
│ │ ├── PdfPageSelector.tsx # PDF page selector
│ │ ├── PipelineArtifactCanvas.tsx # Pipeline artifact display
│ │ └── ...
│ ├── stores/
│ │ ├── appStore.ts # Main application state (Zustand)
│ │ └── historyStore.ts # Undo/redo action history
│ ├── lib/
│ │ ├── api.ts # API client functions
│ │ ├── pdfExport.ts # PDF generation utilities
│ │ ├── exportFormats.ts # CSV/JSON export utilities
│ │ ├── utils.ts # Utility functions
│ │ ├── categoryColors.ts # Category color mappings
│ │ └── objectKey.ts # Object key generation
│ ├── hooks/ # Custom React hooks
│ │ ├── useKeyboardShortcuts.ts # Keyboard shortcuts handler
│ │ └── useInlineEdit.ts # Inline editing hook
│ ├── types.ts # TypeScript type definitions
│ └── styles/ # Global styles and CSS variables
│ └── index.css # Global CSS file
├── package.json
├── vite.config.ts # Vite configuration with API proxy
└── tsconfig.json
```
### Backend Stack
| Technology | Purpose |
| -------------------- | ---------------------------------------------- |
| **FastAPI** | High-performance Python web framework |
| **SAHI** | Slicing Aided Hyper Inference for large images |
| **Ultralytics** | YOLOv11/YOLOv8 object detection models |
| **EasyOCR** | Text recognition from detected symbols |
| **OpenCV** | Image processing and manipulation |
| **NetworkX** | Graph construction and connectivity analysis |
| **Pydantic** | Data validation and settings management |
| **python-multipart** | File upload handling |
#### Backend Structure
```
backend/
├── api.py # Canonical FastAPI application entry point
├── main.py # Compatibility shim that re-exports api:app
├── requirements.txt # Python dependencies
├── .env # Environment configuration
├── garnet/ # Core logic package
│ ├── __init__.py # Package initializer
│ ├── Settings.py # Settings management
│ ├── model_defaults.py # Model weight file discovery
│ ├── object_detection_sahi.py # SAHI-based object detection
│ ├── easyocr_sahi.py # EasyOCR with SAHI
│ ├── gemini_ocr_sahi.py # Gemini OCR with SAHI
│ ├── paddle_ocr_sahi.py # PaddleOCR with SAHI
│ ├── ocrmac_sahi.py # OCRMac with SAHI
│ ├── pipeline/ # Pipeline processing stages
│ │ ├── pid_extractor.py # Main pipeline orchestrator
│ │ ├── stage1_input_normalization.py
│ │ ├── stage2_ocr_discovery.py
│ │ ├── stage4_object_detection.py
│ │ ├── stage4_line_number_fusion.py
│ │ ├── stage4_instrument_tag_fusion.py
│ │ ├── stage5_pipe_mask.py
│ │ ├── stage5b_pipe_trace.py
│ │ ├── stage6_trace_associations.py
│ │ ├── stage7_geometric_graph_assembly.py
│ │ ├── stage7c_page_connector_labeling.py
│ │ ├── stage7b_graph_export.py
│ │ ├── stage8_graph_qa.py
│ │ ├── stage9_apply_review_decisions.py
│ │ ├── stage10_process_exports.py
│ │ └── stage11_connection_overlay.py
│ ├── review/ # Review and HITL functionality
│ │ ├── review_state.py # Review state management
│ │ ├── review_workspace.py # Review workspace handling
│ │ ├── stage8_review_package.py
│ │ ├── stage9_review_decisions.py
│ │ └── stage10_review_package.py
│ ├── tracing/ # Path tracing and graph construction
│ │ ├── path_tracer/
│ │ │ ├── cv_pipe_tracer.py
│ │ │ └── stage5b_pipeline.py
│ │ ├── trace_associations.py
│ │ ├── trace_graph_builder.py
│ │ ├── trace_graph_qa.py
│ │ └── topolopy_markers.py
│ ├── utils/ # Utility functions
│ │ ├── utils.py # Common utilities
│ │ └── __init__.py
│ ├── graph_export_adapter.py # Graph export functionality
│ ├── pipe_mask.py # Pipe mask generation
│ ├── pipe_sheet_merge.py # Pipe sheet merging
│ ├── reviewed_outputs.py # Reviewed output generation
│ ├── topology_markers.py # Topology detection
│ ├── line_number_fusion.py # Line number fusion
│ ├── instrument_tag_fusion.py # Instrument tag fusion
│ ├── page_connector.py # Page connector handling
│ └── render_connection_pipeline_overlay.py # Connection overlay rendering
├── datasets/ # Dataset configuration files
├── yolo_weights/ # Model weights
└── static/ # Static assets
└── images/predictions/ # Generated prediction images
```
### API Endpoints
| Method | Endpoint | Description |
| ------ | ------------------------------------------- | ----------------------------------------------- |
| GET | `/` | API root with service info |
| GET | `/api/health` | Health check with model status and memory usage |
| GET | `/api/model-types` | Get available model types |
| GET | `/api/models` | Get model type values |
| GET | `/api/weight-files` | Get available model weight files |
| GET | `/api/config-files` | Get available dataset config files |
| POST | `/api/pdf-extract` | Extract PDF pages to PNG images |
| POST | `/api/detect` | Run object detection on uploaded image |
| GET | `/api/results/{result_id}` | Fetch a previously detected result |
| PATCH | `/api/results/{result_id}/objects/{obj_id}` | Update a detected object |
| POST | `/api/results/{result_id}/objects` | Create a new object |
| DELETE | `/api/results/{result_id}/objects/{obj_id}` | Delete a detected object |
| POST | `/api/pipeline/jobs` | Start a new pipeline job |
| GET | `/api/pipeline/jobs/{job_id}` | Get pipeline job status and details |
| GET | `/api/pipeline/jobs/{job_id}/stage-status` | Get status of all pipeline stages |
| POST | `/api/pipeline/jobs/{job_id}/resume-from/{stage}` | Resume pipeline from a specific stage |
| GET | `/api/pipeline/jobs/{job_id}/review-state` | Get pipeline review state |
| PUT | `/api/pipeline/jobs/{job_id}/review-state` | Update pipeline review state |
| GET | `/api/pipeline/jobs/{job_id}/review-workspace` | Get pipeline review workspace |
| PUT | `/api/pipeline/jobs/{job_id}/review-workspace` | Update pipeline review workspace |
| POST | `/api/pipeline/jobs/{job_id}/review-workspace/recompute` | Recompute pipeline review workspace |
| POST | `/api/pipeline/jobs/{job_id}/review-workspace/commit` | Commit pipeline review workspace |
| GET | `/api/pipeline/jobs/{job_id}/reviewed-graph`| Get reviewed graph from pipeline |
| GET | `/api/pipeline/jobs/{job_id}/reviewed-qa` | Get reviewed QA report from pipeline |
| PUT | `/api/pipeline/jobs/{job_id}/artifacts/{artifact_name}` | Update pipeline artifact |
| GET | `/api/pipeline/jobs/{job_id}/artifacts/{artifact_name}` | Get pipeline artifact |
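
The endpoint table can be exercised from Python with the standard library alone. The sketch below only builds the requests, so it runs without a live backend. The base URL is the default development address, and the helper names, the JSON body, and its field names are assumptions for illustration; the real request schema is defined by the backend.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:8001"  # default development address of the backend


def health_request(base_url=BASE_URL):
    """Build a GET request for the health check endpoint."""
    return Request(f"{base_url}/api/health", method="GET")


def update_object_request(result_id, obj_id, fields, base_url=BASE_URL):
    """Build a PATCH request for one detected object.

    `fields` is a hypothetical JSON payload; the field names the backend
    accepts are not documented here.
    """
    url = (
        f"{base_url}/api/results/{quote(str(result_id), safe='')}"
        f"/objects/{quote(str(obj_id), safe='')}"
    )
    body = json.dumps(fields).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )
```

With the backend running, either request can be sent with `urllib.request.urlopen(request)`.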
### Data Flow
```
1. Upload P&ID Image (Frontend)
├─ Drag & drop or file selection
└─ Preview image with metadata
│
▼
2. Configure Detection Parameters (Frontend)
├─ Select model type (ultralytics or gemini)
├─ Choose weight file (for ultralytics)
├─ Set confidence threshold (0.2 - 0.95)
├─ Configure image size (128 - 1280)
├─ Set overlap ratio (0.0 - 0.5)
├─ Configure post-processing options
├─ Enable/disable OCR
└─ Select processing mode: Detection or Pipeline
│
▼
3. Frontend → POST /api/detect (Detection Mode)
OR Frontend → POST /api/pipeline/jobs (Pipeline Mode)
├─ FormData with image and parameters
└─ AbortController for cancellation
│
▼
4. Backend: Validation & Processing
├─ Validate file extension and size
├─ Load cached model or create new one
├─ Decode image with OpenCV
├─ (For Pipeline Mode: Store job info and start background processing)
└─ Store result in memory cache
│
▼
5. Backend: SAHI Slicing (if applicable)
├─ Slice large image into tiles (configurable size)
├─ Run YOLO inference on each tile
└─ Merge overlapping detections (NMM postprocessing)
│
▼
6. Backend: OCR (optional, if enabled)
├─ Extract text from symbol regions
├─ Rotate vertical text objects
├─ Apply image preprocessing
└─ Run the selected OCR engine: EasyOCR (wordbeamsearch decoder), Gemini, PaddleOCR, or OCRMac
│
▼
7. Backend: Response
├─ (Detection Mode) JSON with detections + image URL
├─ (Pipeline Mode) Job ID for tracking progress
├─ Store in RESULTS_STORE with TTL (Detection Mode)
└─ Store job info in PIPELINE_JOBS (Pipeline Mode)
│
▼
8. Frontend: ResultsView/ProcessingView
├─ (Detection Mode) Interactive canvas with pan/zoom/minimap
│ ├─ Color-coded bounding boxes by category
│ ├─ Object list with editing capabilities
│ ├─ Accept/reject workflow with status indicators
│ ├─ Undo/redo support for all operations
│ └─ Export options (JSON, YOLO, COCO, LabelMe, PDF)
└─ (Pipeline Mode) Progress tracking and stage-wise results
├─ Real-time stage status updates
├─ Access to intermediate artifacts
└─ Review workflow for HITL stages
```
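
Step 5 of the flow above relies on cutting a large drawing into overlapping tiles. The sketch below shows one way to compute tile coordinates from a slice size and an overlap ratio. It is an illustration of the idea, not SAHI's own slicing code, and the `slice_boxes` helper is hypothetical.

```python
def slice_boxes(width, height, slice_size=640, overlap_ratio=0.2):
    """Return (x1, y1, x2, y2) tiles that cover a width x height image."""
    if not 0.0 <= overlap_ratio < 1.0:
        raise ValueError("overlap_ratio must be in [0, 1)")
    # Distance between tile origins; a larger overlap means a smaller step.
    step = max(1, round(slice_size * (1.0 - overlap_ratio)))

    def starts(length):
        if length <= slice_size:
            return [0]  # the image fits in a single tile along this axis
        positions = list(range(0, length - slice_size, step))
        positions.append(length - slice_size)  # last tile sits flush with the edge
        return positions

    return [
        (x, y, min(x + slice_size, width), min(y + slice_size, height))
        for y in starts(height)
        for x in starts(width)
    ]
```

Each tile would then be passed to the detector, and the per-tile boxes shifted back by the tile origin before overlapping detections are merged.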
---
## 📦 Installation
### Prerequisites
- Python 3.9+
- Node.js 18+ (for frontend)
- Bun or npm (package manager)
- Git
### 1. Clone the Repository
```bash
git clone https://github.com/may3rd/GARNET.git
cd GARNET
```
### 2. Install Python Dependencies
```bash
cd backend
pip install -r requirements.txt
cd ..
```
### 3. Install Frontend Dependencies
```bash
cd frontend
bun install
# or: npm install
cd ..
```
---
## 🖥️ Usage
### 1. React Frontend + API Backend (Recommended)
This is the primary mode for interactive P&ID analysis. The React frontend provides a modern UI for uploading images, configuring detection parameters, reviewing results, and exporting data.
#### Tech Stack Summary
| Component | Technology | Version |
| ---------- | ------------------ | ------- |
| Frontend | React + TypeScript | 18.3.1 |
| Build Tool | Vite | 6.4.3 |
| Styling | Tailwind CSS | 3.4.19 |
| State | Zustand | 5.0.14 |
| Backend | FastAPI | Latest |
| AI Engine | SAHI + Ultralytics | Latest |
#### Quick Start
**Terminal 1 - Start the API backend:**
```bash
# Copy environment file and configure
cp .env.example .env
# Start FastAPI server
cd backend
uvicorn api:app --reload --port 8001
```
Backend runs at `http://localhost:8001` with auto-reload enabled for development.
**Terminal 2 - Start the React frontend:**
```bash
cd frontend
# Copy environment file and configure
cp .env.example .env.local
# Start development server
bun run dev
# or: npm run dev
```
Frontend runs at `http://localhost:5173` with hot module replacement.
#### Environment Variables
**Backend Configuration (`.env`):**
```bash
# Environment (development, production)
ENV=development
DEBUG=true
# Server Configuration
HOST=localhost
PORT=8001
# CORS - Comma-separated list of allowed origins
ALLOWED_ORIGINS=http://localhost:5173,http://localhost:4173
# File Upload Limits
MAX_FILE_SIZE_MB=50
MAX_PDF_PAGES=50
PDF_DPI=300
ALLOWED_IMAGE_EXTENSIONS=.jpg,.jpeg,.png,.webp,.bmp,.tiff
# Model Defaults
DEFAULT_CONF_THRESHOLD=0.8
DEFAULT_IMAGE_SIZE=640
DEFAULT_OVERLAP_RATIO=0.2
# Gemini / OpenRouter (required when using model=gemini)
OPENROUTER_API_KEY=
OPENROUTER_MODEL=google/gemini-3-flash-preview
OPENROUTER_BASE_URL=https://openrouter.ai/api/v1
OPENROUTER_TEMPERATURE=0.7
# Cache Configuration
RESULTS_CACHE_MAX_SIZE=100
RESULTS_CACHE_TTL=3600
MODEL_CACHE_MAX_SIZE=10
# Cleanup Configuration
PREDICTION_IMAGE_TTL_HOURS=24
CLEANUP_INTERVAL_MINUTES=60
# OCR Configuration
OCR_CACHE_ENABLED=true
OCR_LANGUAGES=en
OCR_GPU=true
# Logging
LOG_LEVEL=INFO
LOG_FILE=garnet.log
# Paths
PREDICTIONS_DIR=static/images/predictions
```
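
To show how values like these are typically consumed, here is a small sketch that parses a few of the variables into typed settings. The backend has its own settings module whose behaviour is not shown in this README, so the `load_settings` function, the returned keys, and the fallback defaults are assumptions.

```python
import os


def load_settings(env=None):
    """Parse a subset of the backend variables into typed values (illustrative)."""
    env = os.environ if env is None else env
    origins = env.get("ALLOWED_ORIGINS", "http://localhost:5173")
    extensions = env.get("ALLOWED_IMAGE_EXTENSIONS", ".jpg,.jpeg,.png")
    return {
        # Booleans arrive as strings, so compare against accepted spellings.
        "debug": env.get("DEBUG", "false").strip().lower() in {"1", "true", "yes"},
        "port": int(env.get("PORT", "8001")),
        # Comma-separated lists are split and stripped of stray whitespace.
        "allowed_origins": [o.strip() for o in origins.split(",") if o.strip()],
        "allowed_extensions": {
            e.strip().lower() for e in extensions.split(",") if e.strip()
        },
        # Convert megabytes to bytes once, at load time.
        "max_file_size_bytes": int(env.get("MAX_FILE_SIZE_MB", "50")) * 1024 * 1024,
        "default_conf_threshold": float(env.get("DEFAULT_CONF_THRESHOLD", "0.8")),
    }
```

Passing a plain dictionary instead of `os.environ` keeps the parsing easy to test.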
**Frontend Configuration (`frontend/.env.local`):**
```bash
# API Configuration
VITE_API_URL=http://localhost:8001
# Development Server
VITE_PORT=5173
VITE_HOST=localhost
# Build Configuration
VITE_SOURCEMAP=false
VITE_OUT_DIR=dist
```
#### Configuration Options
The Detection Setup panel in the frontend provides the following options:
| Parameter | Description | Default | Range |
| ------------------------ | ---------------------------------------- | ------------------------- | ------------- |
| **Model** | Detection model type | `ultralytics` | `ultralytics`, `gemini` |
| **Weight File** | Path to model weights (`.pt` or `.onnx`) | Auto-selected | Ultralytics only |
| **Config File** | YAML dataset configuration | `backend/datasets/yaml/data.yaml` | - |
| **Confidence Threshold** | Minimum detection confidence | `0.8` | `0.2 - 0.95` |
| **Image Size** | Input size for model inference | `640` | `128 - 1280` |
| **Overlap Ratio** | SAHI slice overlap for large images | `0.2` | `0.0 - 0.5` |
| **Text OCR** | Enable text extraction from symbols | `false` | `true/false` |
| **Postprocess Type** | Detection post-processing method | `GREEDYNMM` | `NMM`, `GREEDYNMM`, `NMS` |
| **Postprocess Match Metric** | Matching metric for post-processing | `IOS` | `IOU`, `IOS` |
| **Postprocess Match Threshold** | Threshold for post-processing | `0.1` | `0.0 - 1.0` |
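
The two match metrics differ in their denominator: IOU divides the intersection by the union of both boxes, while IOS divides it by the area of the smaller box. The sketch below uses hypothetical helper names and `(x1, y1, x2, y2)` boxes to show why IOS merges a small box nested inside a large one even when the IOU is low.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box; degenerate boxes have area 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def intersection_area(a, b):
    """Area shared by two boxes, or 0 when they do not overlap."""
    width = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    height = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    return width * height


def iou(a, b):
    """Intersection over union."""
    inter = intersection_area(a, b)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def ios(a, b):
    """Intersection over the smaller of the two box areas."""
    smaller = min(box_area(a), box_area(b))
    return intersection_area(a, b) / smaller if smaller > 0 else 0.0
```

For a 10x10 box and a 5x5 box in its corner, `iou` gives 0.25 and `ios` gives 1.0, so a duplicate partial detection from a neighbouring tile is matched much more readily under IOS.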
**Frontend Features:**
| Feature | Description |
| ---------------------- | ------------------------------------------------------------------------ |
| **Minimap** | Navigate large images with a minimap showing viewport position |
| **Zoom Controls** | Zoom in/out, reset to 100%, fit to screen |
| **Keyboard Shortcuts** | Arrow keys for navigation, Enter/Delete for accept/reject, Ctrl+Z/Y for undo/redo |
| **Object Editing** | Click to select, drag to move, resize handles to adjust bounding box |
| **Create Object** | Draw new bounding boxes on canvas to add custom objects |
| **Delete Object** | Remove objects with confirmation |
| **Review Status** | Mark objects as accepted (green) or rejected (red/dashed) |
| **Export** | Download results in JSON, YOLO, COCO, LabelMe, or PDF format |
| **Batch Mode** | Queue multiple images, pause/resume processing, navigate between results |
#### Production Deployment
**Build the frontend for production:**
```bash
cd frontend
bun run build
# or: npm run build
```
This creates an optimized build in the `frontend/dist/` directory.
**Production deployment options:**
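1. **Separate Services (Recommended)**
   - Serve the frontend via nginx/Apache or a CDN
   - Run the backend with a production ASGI server (Gunicorn managing Uvicorn workers)
   ```bash
   # Backend production start
   cd backend
   # Note: the in-memory results cache is per-process, so a result stored by
   # one worker is not visible to the others.
   gunicorn api:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8001
   ```
2. **Combined Serving**
   - Mount the built frontend static files in FastAPI:
   ```python
   from fastapi.staticfiles import StaticFiles
   # Register this mount after all API routes so that "/" does not shadow them.
   # The directory is resolved relative to the process working directory.
   app.mount("/", StaticFiles(directory="frontend/dist", html=True), name="static")
   ```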
**Production Environment Variables:**
```bash
# Backend
ENV=production
DEBUG=false
HOST=0.0.0.0
PORT=8001
ALLOWED_ORIGINS=https://yourdomain.com
# Frontend (build time)
VITE_API_URL=https://api.yourdomain.com
VITE_SOURCEMAP=false
```
---
### 2. Batch Inference Script
Run inference on multiple P&IDs in a folder using `garnet/predict_images.py`:
**Command-Line Arguments:**
```bash
python garnet/predict_images.py \
--image_path path/to/pids_folder \
--model_type yolov8 \
--model_path path/to/model_weights.pt \
--output_path results/
```
**Output:**
- Annotated images, saved in `output_path`.
- A CSV file with symbol counts (`output_path/symbol_counts.csv`).
**Example code snippet:**
```python
from garnet.predict_images import predict_images

# Run inference on every P&ID image in the folder
predict_images(
    image_path="path/to/pids_folder",
    model_type="yolov8",
    model_path="path/to/model_weights.pt",
    output_path="results/",
)
```
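
The symbol-count CSV mentioned in the output can be pictured with the following sketch. It is not the script's actual code: the helper names and the `symbol`/`count` column headers are assumptions, and the input is simply a list of predicted class names.

```python
import csv
import io
from collections import Counter


def count_symbols(labels):
    """Count how often each class name occurs in a list of detections."""
    return Counter(labels)


def write_symbol_counts(counts, stream):
    """Write one row per symbol type, sorted by name (hypothetical columns)."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["symbol", "count"])
    for name, count in sorted(counts.items()):
        writer.writerow([name, count])


# Example: three detections collected from one drawing
buffer = io.StringIO()
write_symbol_counts(count_symbols(["gate valve", "check valve", "gate valve"]), buffer)
csv_text = buffer.getvalue()
```

Writing to a stream keeps the function usable with both real files and in-memory buffers.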
---
### 3. Pipeline: End-to-End P&ID Digitization and Connectivity Analysis
A staged pipeline with a CLI is available in `garnet/pipeline/pid_extractor.py` (under `backend/`). It runs the stages described under **Pipeline Stages** below.
**Usage:**
```bash
python garnet/pipeline/pid_extractor.py \
--image path/to/pid_image.png \
--coco path/to/coco_annotations.json \
--ocr path/to/ocr_results.json \
--arrow-coco path/to/coco_arrows.json \
--out output/ \
--stop-after 11
```
**Pipeline Stages:**
1. **Stage 1: Input Normalization**
   - Load the P&ID image (BGR format)
   - Apply normalization (deskew, contrast enhancement)
   - Generate a binary mask for pipe detection
   - Output: Normalized image, grayscale image, binary mask
2. **Stage 2: OCR Discovery**
   - Run the selected OCR route (EasyOCR, Gemini, PaddleOCR, or OCRMac)
   - Detect text regions and extract text content
   - Output: OCR results with bounding boxes and text
3. **Stage 3: HITL Review (Manual)**
   - Reserved for human-in-the-loop review
   - Major equipment bounding boxes are supplied through LabelMe/frontend review
   - Note: This stage is managed outside the automatic CLI flow
4. **Stage 4: Object Detection & Fusion**
   - Run SAHI-based object detection (YOLOv8/YOLOv11)
   - Process line number fusion and instrument tag fusion
   - Apply topology-marker routing for junction detection
   - Output: Detected objects with bounding boxes, class labels, and confidence scores
5. **Stage 5: Pipe Mask Generation**
   - Create a provisional pipe mask from the binary image
   - Suppress OCR and object regions to focus on pipe regions
   - Output: Pipe mask image and summary statistics
6. **Stage 5b: Pipe Tracing**
   - Trace pipe centerlines from the mask using computer vision algorithms
   - Start from detected ports and follow pipe paths
   - Handle inline objects (valves, reducers) by jumping over them
   - Detect terminals (equipment, tags, junctions, sheet edges)
   - Output: Traced pipe paths, junctions, and endpoints
7. **Stage 6: Trace Associations**
   - Associate traced paths with ports, inline objects, line numbers, and instruments
   - Connect detected objects to the pipe network
   - Fill missing line numbers with simulated-HITL placeholders
   - Output: Object-to-pipe associations and connection data
8. **Stage 7: Geometric Graph Assembly**
   - Assemble and normalize the geometric trace graph
   - Label page connectors for multi-document support
   - Run graph quality assurance checks
   - Export the initial graph payload (v1)
   - Output: Geometric graph representation (nodes, edges, connectivity)
9. **Stage 8: HITL Review Package**
   - Build the graph/line-number human-in-the-loop review package
   - Prepare items for validation (junctions, connections, line numbers)
   - Output: Review items ready for frontend inspection
10. **Stage 9: Apply Review Decisions**
    - Apply human review decisions to correct the graph
    - Pass the graph through unchanged when no decisions exist
    - Output: Corrected graph based on review feedback
11. **Stage 10: Process Exports**
    - Generate line lists, equipment connectivity, and inline MTO
    - Create inline observations and the instrument index
    - Output: Process exports in JSON/CSV formats
12. **Stage 11: Connection Overlay**
    - Render the connection-pipeline overlay for visual review
    - Create a visual representation of connections on the original image
    - Output: Connection pipeline overlay image
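
The graph produced by the assembly stage can be thought of as nodes (equipment, valves, junctions) joined by traced pipe segments. The sketch below uses a plain adjacency map as a stand-in for a NetworkX graph, so it runs with the standard library only. The tag names and helper functions are hypothetical and do not reflect the pipeline's real graph schema.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build an undirected adjacency map from (node_a, node_b) pipe segments."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connected_components(graph):
    """Group nodes that are reachable from each other (breadth-first search)."""
    seen = set()
    components = []
    for start in sorted(graph):
        if start in seen:
            continue
        queue = deque([start])
        seen.add(start)
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in sorted(graph[node]):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(sorted(component))
    return components


# Two separate lines on one sheet: P-101 -> V-1 -> T-1 and P-102 -> V-2
edges = [("P-101", "V-1"), ("V-1", "T-1"), ("P-102", "V-2")]
components = connected_components(build_graph(edges))
```

A check of this kind is one way a QA step could flag isolated objects or unexpectedly disconnected lines.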
**Configurable Parameters (PipelineConfig):**
- Image processing: DPI, Canny thresholds, binarization settings
- Detection: Confidence thresholds per class, image size, overlap ratio
- Text association: Multiplier for bbox diagonal
- Graph: Connection radius, port counts, angle separation
- Valve linking: Directional strategy, edge offset, raycast step
- Template matching: For valve orientation detection
- Cleanup: Bridge max distance, angle tolerance
- Tracing: Branch stub length, merge angle tolerance, opposite angle tolerance
- Crossing resolution: Center blob radius, threshold, marker match distance
**Note:** The pipeline is modular, so each stage can be run on its own or as part of the full digitization workflow. Use `--stop-after N` to end a run after stage N. Stage 3 is intentionally skipped in the automatic CLI flow because HITL input is managed outside this runner.
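
The staged design can be sketched as an ordered list of stage functions with an optional cut-off. This is a simplified illustration under the assumption that `--stop-after N` runs every stage up to and including N. The `run_pipeline` function, the stage names, and the placeholder stage bodies are hypothetical.

```python
def run_pipeline(stages, stop_after=None):
    """Run stages in order and stop once the stage number exceeds stop_after."""
    context = {}  # shared results, keyed by stage name
    completed = []
    for number, name, stage_fn in stages:
        if stop_after is not None and number > stop_after:
            break
        context[name] = stage_fn(context)
        completed.append(number)
    return completed, context


# Placeholder stages; each receives the results of the earlier ones.
STAGES = [
    (1, "normalized", lambda ctx: "normalized image"),
    (2, "ocr", lambda ctx: ["text regions"]),
    # Stage 3 (HITL review) is absent: it is handled outside the runner.
    (4, "objects", lambda ctx: ["detected objects"]),
    (5, "pipe_mask", lambda ctx: "pipe mask"),
]

completed, context = run_pipeline(STAGES, stop_after=4)
```

Keeping every stage behind the same call signature is what lets a run stop early or resume later from stored intermediate results.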
---
### 4. Model Training (Optional)
To train custom YOLO models for P&ID symbols using Ultralytics:
```bash
cd backend
yolo train \
data=datasets/yaml/data.yaml \
model=yolov8n.pt \
epochs=100 \
imgsz=640 \
batch=16
cd ..
```
**Available dataset configurations:**
- `backend/datasets/yaml/data.yaml` - Default dataset configuration
- `backend/datasets/yaml/balanced.yaml` - Balanced class distribution
- `backend/datasets/yaml/iso.yaml` - ISO standard symbols
- `backend/datasets/yaml/pttep.yaml` - PTTEP-specific symbols
**Training tips:**
- Use balanced datasets for better model performance
- Adjust `imgsz` based on your P&ID image resolution
- Increase epochs for better convergence (100-300 typical)
- Use data augmentation for improved generalization
## ⌨️ Keyboard Shortcuts
The frontend supports the following keyboard shortcuts for efficient navigation and editing:
| Shortcut | Action |
| ------------------------------- | -------------------------------- |
| `←` / `→` | Navigate to previous/next object |
| `Enter` | Accept selected object |
| `Delete` / `Backspace` | Reject selected object |
| `Ctrl + Z` | Undo last action |
| `Ctrl + Y` / `Ctrl + Shift + Z` | Redo last action |
| `F` | Fit image to screen |
| `0` | Reset zoom to 100% |
| `+` / `-` | Zoom in/out |
| `Esc` | Deselect object / Cancel editing |
## 📂 Dataset
GARNET uses the YOLOv8 dataset format. Example structure:
```
dataset/
├── train/
│ ├── images/ # P&ID images (.jpg, .png)
│ └── labels/ # YOLO-format labels (.txt)
├── val/
│ ├── images/
│ └── labels/
└── data.yaml # Dataset config (class names, paths)
```
**Available dataset configurations:**
- `backend/datasets/yaml/data.yaml` - Default dataset configuration
- `backend/datasets/yaml/balanced.yaml` - Balanced class distribution
- `backend/datasets/yaml/iso.yaml` - ISO standard symbols
- `backend/datasets/yaml/pttep.yaml` - PTTEP-specific symbols
**Class definitions:**
- `backend/datasets/classes.txt` - List of all class names
- `backend/datasets/predefined_classes.txt` - Predefined class mappings
- `backend/datasets/settings_labels.json` - Label settings configuration
Example `backend/datasets/yaml/data.yaml`:
```yaml
train: images/train
val: images/val

# Classes
names:
  0: butterfly valve
  1: check valve
  2: control valve
  3: gate valve
  4: globe valve
  5: heat exchanger
  6: instrument DCS
  7: instrument tag
  8: page connection
  9: three way valve
  10: utility connection
```
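
Each line in a YOLO-format label file holds a class index followed by the box centre and size, all normalized by the image dimensions. The sketch below converts a pixel-space box to such a line. The `to_yolo_label` helper is hypothetical, and the class index 3 corresponds to `gate valve` in the example configuration above.

```python
def to_yolo_label(class_id, box, image_width, image_height):
    """Convert a pixel (x1, y1, x2, y2) box to a YOLO label line."""
    x1, y1, x2, y2 = box
    # Centre and size are expressed as fractions of the image dimensions.
    center_x = (x1 + x2) / 2 / image_width
    center_y = (y1 + y2) / 2 / image_height
    width = (x2 - x1) / image_width
    height = (y2 - y1) / image_height
    return f"{class_id} {center_x:.6f} {center_y:.6f} {width:.6f} {height:.6f}"


# A gate valve (class 3) drawn at pixels (100, 200)-(300, 400) on a 1000x800 image
label_line = to_yolo_label(3, (100, 200, 300, 400), 1000, 800)
```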
---
## 📊 Results
_Example output for a P&ID, in two panels: **Detection** (symbol counts) and **Graph Analysis** (connectivity graph)._
---
## 📈 Future Outcomes
Additional planned outcomes from the GARNET project include:
- **MTO for Valves**: Automated generation of material take-off lists for all detected valve types, including specifications and quantities.
- **Line List**: Extraction and tabulation of pipeline data, including line tags, sizes, service, and connected equipment.
---
## 🤝 Contributing
Contributions are welcome! Please fork the repository and submit a pull request.
For major changes, open an issue first to discuss your ideas.
---
## 📜 License
This project is licensed under the **MIT License**. See [LICENSE](LICENSE) for details.
---
## 📧 Contact
For questions or collaborations, contact:
- **Maetee Lorprajuksiri** - [may3rd@gmail.com](mailto:may3rd@gmail.com)
- **GCME (GC Maintenance and Engineering Co., Ltd.)** - [www.gcme.com](https://www.gcme.com)