
1 · The raster bottleneck & why vectors win
Rasters are excellent for storing raw Earth-observation data, but a flat pixel grid is hard to query, visualise, or fuse with BIM, CAD or enterprise GIS. Segmenting the imagery and vectorising the result, by contrast:
- Distils pixels into labelled polygons—immediately searchable with spatial SQL.
- Reduces data volume—a single polygon represents thousands of pixels.
- Enables object-level analytics—area, perimeter, adjacency, change-detection.
- Slots straight into digital-twin and dashboard workflows—no manual tracing.
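To make the object-level analytics point concrete, here is a minimal sketch of how area and perimeter fall out of a polygon's vertices. It assumes a simple (non-self-intersecting) ring in projected coordinates measured in metres; the function names and the sample footprint are hypothetical and are not part of the engine.

```python
from math import hypot


def ring_area(ring):
    """Unsigned area of a simple ring via the shoelace formula.

    `ring` is a list of (x, y) vertices without a repeated closing vertex.
    """
    total = 0.0
    # Pair every vertex with its successor, wrapping back to the first one.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def ring_perimeter(ring):
    """Sum of the edge lengths of the ring."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += hypot(x2 - x1, y2 - y1)
    return total


# Hypothetical 20 m x 10 m building footprint (projected CRS, metres).
footprint = [(0.0, 0.0), (20.0, 0.0), (20.0, 10.0), (0.0, 10.0)]
area = ring_area(footprint)            # 200.0 square metres
perimeter = ring_perimeter(footprint)  # 60.0 metres
```

Four vertices describe the whole footprint here, whereas the same building on a centimetre-level raster would occupy a very large number of pixels.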
Segmentation quality is always a product of input data and model tuning; that’s where our engine steps in.
2 · Inside the engine (alpha release)
| Stage | What it does | Tech highlights |
|---|---|---|
| 1 · Smart Tiler | Streams tiles from disk or Azure Blob, validates CRS, fixes label geometry, packages NPZ files. | Rasterio, GeoPandas, Azure SDK |
| 2 · Adaptive Trainer | Accepts any band count, plugs in U-Net, DeepLabv3+ or your own backbone, auto-logs to MLflow. | PyTorch ecosystem |
| 3 · Inference Runner | Pads edges to avoid artefacts, writes class masks directly to Azure, CPU/GPU switchable. | PyTorch, Albumentations |
| 4 · Vectoriser | Converts masks to clean polygons, repairs validity, filters by area. | rasterio.features.shapes, GeoPandas |
| 5 · Auto-Reporter | Compiles a brand-ready PDF—run metadata, class stats, example plates. | ReportLab |
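The edge handling in stage 3 is not specified in detail here, so the following is only a sketch of one common approach: run inference on windows that overlap their neighbours by a halo of `pad` pixels and keep only the central core of each prediction. The function name, the tile size and the pad width are assumptions chosen for illustration.

```python
def padded_windows(width, height, tile=512, pad=32):
    """Yield (core, padded) pixel windows as (col_off, row_off, w, h) tuples.

    core   : the region whose predictions are kept
    padded : the core grown by `pad` pixels on each side, clipped to the raster
    """
    for row in range(0, height, tile):
        for col in range(0, width, tile):
            # Tiles on the right and bottom edges may be smaller than `tile`.
            core_w = min(tile, width - col)
            core_h = min(tile, height - row)
            left = max(col - pad, 0)
            top = max(row - pad, 0)
            right = min(col + core_w + pad, width)
            bottom = min(row + core_h + pad, height)
            yield (col, row, core_w, core_h), (left, top, right - left, bottom - top)


# Hypothetical 1000 x 600 pixel raster.
windows = list(padded_windows(1000, 600, tile=512, pad=32))
core_pixels = sum(w * h for (_, _, w, h), _ in windows)
```

The cores tile the raster exactly once, so stitching them back together leaves no seams or double-counted pixels. At the outer raster boundary this sketch simply clips the halo; a real pipeline would typically add reflection or constant padding there.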
Flex deployment
- Azure-native today—runs as containerised jobs or Azure ML notebooks.
- On-prem/edge ready—identical build executes offline for secure sites.
- Other clouds? We’re cloud-agnostic by design and open to integration feedback.
3 · Proof-of-concept snapshot
On a pilot dataset of very-high-resolution imagery (centimetre-level GSD), the alpha engine achieved a mean IoU approaching 0.68 on its very first training pass and produced tens of thousands of clean polygons. We expect the score to climb as we introduce hyper-parameter sweeps and larger training corpora, but the takeaway is already clear: the workflow delivers usable results out of the box.
4 · What is mIoU and why should you care?
| Term | Plain-English definition | Why it matters |
|---|---|---|
| IoU (Intersection over Union) | For one class, IoU = area of overlap between the predicted polygon and the ground-truth polygon ÷ area of their union. A perfect match = 1.0; total miss = 0.0. | Tells you how well the model captures shape and location, not just presence. |
| mIoU (mean IoU) | The arithmetic mean of the per-class IoU values, with every class usually weighted equally regardless of how many pixels it covers. | A single, easy-to-track score that penalises models which favour dominant classes and ignore rare ones. |
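The two definitions above can be sketched in a few lines. This version works pixel-wise on flattened class masks rather than on polygons, which is an assumption about how the score is evaluated; the function names and the toy labels are hypothetical.

```python
def per_class_iou(y_true, y_pred, num_classes):
    """Return one IoU per class, or None for a class absent from both masks."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        if t == p:
            # Pixel agrees: it counts towards both overlap and union of class t.
            intersection[t] += 1
            union[t] += 1
        else:
            # Pixel disagrees: it enlarges the union of both classes involved.
            union[t] += 1
            union[p] += 1
    return [i / u if u else None for i, u in zip(intersection, union)]


def mean_iou(ious):
    """Unweighted mean over the classes that actually occur."""
    present = [v for v in ious if v is not None]
    return sum(present) / len(present)


# Toy example: 8 pixels, classes 0 = background, 1 = building, 2 = vegetation.
truth = [0, 0, 0, 0, 1, 1, 2, 2]
pred = [0, 0, 0, 1, 1, 1, 2, 0]
ious = per_class_iou(truth, pred, num_classes=3)  # 3/5, 2/3, 1/2
miou = mean_iou(ious)                             # about 0.589
```

Pixel accuracy on this toy example is 6/8 = 0.75, yet mIoU is noticeably lower because the small vegetation class, half of which was missed, carries the same weight as the background.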
Why a first-pass mIoU ≈ 0.68 is news-worthy
- Most production segmentation systems start around 0.40-0.55 before tuning.
- Our alpha run used default hyper-parameters, no class re-weighting and a small training set—yet it is already nudging 0.70.
- A high opening score signals that the data pipeline is sound (label alignment, CRS, tiling) and the architecture is a good fit for the problem space.
Clear levers to push it higher
- Larger & cleaner label sets – more diverse scenes, better balance between classes.
- Hyper-parameter sweep – LR schedulers, deeper backbones, mixed precision.
- Class-aware augmentation – synthetic minority oversampling, CutMix, colour jitter tuned per class.
- Self-supervised pre-training – replace ImageNet weights with MAE/SimCLR encoders trained on unlabelled imagery.
- Ensembling & TTA – vote across multiple model checkpoints, apply flip/rotate test-time augmentation at inference.
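As an illustration of the voting idea in the last lever, the sketch below fuses several class masks by per-pixel majority vote. It assumes hard voting on already predicted class masks; many ensembles average class probabilities instead. The function name and the sample masks are hypothetical.

```python
from collections import Counter


def majority_vote(masks):
    """Fuse equally shaped class masks (lists of rows) by per-pixel majority vote.

    Ties are resolved in favour of the class seen first, i.e. the earliest mask.
    """
    rows, cols = len(masks[0]), len(masks[0][0])
    fused = []
    for r in range(rows):
        fused_row = []
        for c in range(cols):
            votes = Counter(mask[r][c] for mask in masks)
            fused_row.append(votes.most_common(1)[0][0])
        fused.append(fused_row)
    return fused


# Hypothetical 2 x 2 predictions from three checkpoints (or three TTA passes
# that have already been flipped or rotated back to the original orientation).
mask_a = [[0, 1], [1, 1]]
mask_b = [[0, 1], [0, 1]]
mask_c = [[1, 1], [0, 2]]
fused = majority_vote([mask_a, mask_b, mask_c])  # [[0, 1], [0, 1]]
```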
Our goal for beta is mIoU ≥ 0.80 on very-high-resolution scenes—closing the gap between automated and manual digitising.
5 · Industries we already serve
Stoian Co.’s broader consulting practice covers GIS, digital twins, drone capture and spatial analytics for urban planning, AEC, real estate, infrastructure, environment and risk management. (Services – Stoian Co, About – Stoian Co)
The segmentation engine slots directly into those solutions, letting clients:
- Automate land-use & encroachment audits.
- Feed BIM-GIS pipelines with up-to-date feature layers.
- Build ESG baselines—from vegetation to impervious surfaces.
- Generate rapid asset inventories for utilities and insurers.
6 · Why this matters—right now
- Consulting access first – While in alpha, the engine ships as a service engagement. We run the pipeline, deliver the vectors, KPIs and PDF dossier—no install hassle.
- Image-agnostic – Works with imagery ranging from sub-decimetre drone orthophotos to metre-class satellite scenes.
- No black box – Outputs carry full metadata and MLflow lineage for audit.
Keen to turn pixels into actionable vectors?
Reach out at contact@stoian.co and let’s explore an alpha-stage pilot. Stay tuned for public licensing announcements!