Zone Delineation — HDBSCAN + Network-Constrained Buffering

Raw perception data gives us 315 pole locations. The question: what shape does each school zone take on the road? We answer this by directly buffering the computed extent LineStrings (Deliverable 2) — each extent already traces the road from START sign to END sign. HDBSCAN clustering was explored for initial grouping but superseded: since extents already define the zone path, buffering them is simpler and more accurate.

Stage 1: Spatial Clustering (HDBSCAN)

HDBSCAN (Campello, Moulavi & Sander, 2013, PAKDD) is a hierarchical density-based clustering algorithm. Unlike DBSCAN, it requires no manual ε parameter: it discovers clusters at varying density scales automatically.

Input poles: 315
Final zone polygons: 82
Confirmed zones: 43

Parameter | Value | Rationale
min_cluster_size | 2 | Even 2 nearby poles form a valid school zone
min_samples | 1 | Permissive: we want all poles assigned, not discarded as noise
metric | Euclidean | On UTM EPSG:25833 coordinates (metres, not degrees)
cluster_selection_method | EOM | Excess of mass: produces stable, well-separated clusters

Noise points (label = −1) are promoted to singleton clusters. With spatial-tree acceleration the algorithm typically runs in roughly O(n log n) time; 315 poles cluster in under 1 second.
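
The noise-promotion step can be sketched as follows. This is a minimal illustration, assuming the cluster labels arrive as a plain list of integers with -1 marking noise; the function name `promote_noise` is hypothetical and not taken from the pipeline.

```python
def promote_noise(labels):
    """Give every noise point (label -1) its own new cluster id."""
    next_id = max(labels, default=-1) + 1  # first unused cluster id
    promoted = []
    for label in labels:
        if label == -1:
            promoted.append(next_id)  # singleton cluster for this pole
            next_id += 1
        else:
            promoted.append(label)
    return promoted


labels = [0, 0, -1, 1, -1, 1]  # illustrative labels, not real pipeline output
print(promote_noise(labels))
```

After this step no pole carries the noise label, so every pole belongs to exactly one cluster.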

Stage 2: Road-Network Buffering

Alpha shapes or convex hulls around point clusters produce circular blobs — geometrically incorrect for road-following speed restrictions. Instead, we use network-constrained buffering, a standard GIS technique from transport planning:

  1. Spatial join: pole → road link. For each pole, query the R-tree spatial index (STRtree, the same engine as PostGIS) to find the nearest road segment from road_geometry.csv (27,943 links). Also collect secondary segments within 50m for junction coverage.
  2. Buffer road LineStrings. Buffer each matched road segment by 15m on each side (flat cap style), producing a 30m-wide rectangular corridor that follows the road geometry exactly.
  3. Union per cluster. Merge all buffered road segments within the same HDBSCAN cluster using unary_union. At junctions, overlapping buffers merge into natural polygon shapes.
  4. Simplify. Simplify merged polygons (3m tolerance) to reduce vertex count without visible loss of shape fidelity.
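
To make the geometry of step 2 concrete, here is a minimal sketch of a flat-cap corridor around a single straight segment, in pure Python. It is an illustration only: it assumes metre coordinates (such as UTM), and the helper names `flat_cap_corridor` and `polygon_area` are hypothetical. Buffering full polylines and the per-cluster union would normally be left to a GIS library.

```python
import math


def flat_cap_corridor(p, q, half_width=15.0):
    """Corners of the rectangle obtained by offsetting segment p->q by
    half_width on each side. Flat caps: no extension past the endpoints."""
    (x1, y1), (x2, y2) = p, q
    length = math.hypot(x2 - x1, y2 - y1)
    if length == 0:
        raise ValueError("degenerate segment")
    # Unit normal to the segment (direction rotated by +90 degrees).
    nx, ny = -(y2 - y1) / length, (x2 - x1) / length
    dx, dy = nx * half_width, ny * half_width
    return [(x1 + dx, y1 + dy), (x2 + dx, y2 + dy),
            (x2 - dx, y2 - dy), (x1 - dx, y1 - dy)]


def polygon_area(ring):
    """Shoelace formula for a simple polygon given as a vertex list."""
    total = 0.0
    for (xa, ya), (xb, yb) in zip(ring, ring[1:] + ring[:1]):
        total += xa * yb - xb * ya
    return abs(total) / 2.0


corridor = flat_cap_corridor((0.0, 0.0), (100.0, 0.0))
print(polygon_area(corridor))  # a 100 m segment gives a 100 m x 30 m corridor
```

The same area helper could serve a sliver filter of the kind described under Post-Processing, where fragments below 200m² are dropped.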

Post-Processing

Step | Operation | Result
1. Extent clipping | Trim zone polygons to computed extent endpoints (from Deliverable 2) | 89 zones trimmed; eliminates road overshoot beyond restriction boundary
2. Same-status merge | Adjacent clusters with identical classification within 30m are merged | 97 extent-based zones → 82 after merging nearby same-status zones
3. Sliver removal | Remove polygon fragments < 200m² | 2 fragments removed

Cross-Validation

Pole containment rate: 98.7% (311/315 poles inside their zone)
Maximum distance for the 4 near-miss poles: 12m

Why not full network-constrained clustering? The ideal approach (Boeing, 2018) uses road-network distance instead of Euclidean for the clustering step itself. This is more accurate at junctions but significantly slower (O(n² · Dijkstra)). For 315 poles in a ~25km² area, Euclidean clustering produces equivalent groupings — network distance only diverges when two poles are nearby as-the-crow-flies but far apart by road (rare in our study area). The pragmatic composition (Euclidean HDBSCAN + network boundaries) runs in 3 seconds versus ~10 minutes for full network-constrained clustering. At scale (100K+ poles), the Euclidean approach is necessary.
[Chart: status breakdown of the 82 zones. CONFIRMED 43 (52%), MIXED 2 (2%), NEEDS_REVIEW 8 (10%), EXCLUDED 29 (35%)]

Zone Status Definitions

Status | Count | Definition
CONFIRMED | 43 | ≥70% of poles classified AUTO_APPROVE
MIXED | 2 | Both APPROVE and EXCLUDE poles present, neither dominant
NEEDS_REVIEW | 8 | REVIEW poles present, no APPROVE poles
EXCLUDED | 29 | 100% of poles classified EXCLUDE
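
The definitions above can be read as a small decision function. The sketch below is one possible reading, not the pipeline's code: the order in which the rules are checked and the fallback for combinations the table does not cover are assumptions, and `zone_status` is a hypothetical name.

```python
from collections import Counter


def zone_status(pole_classes):
    """Derive a zone status from its poles' classifications."""
    n = len(pole_classes)
    if n == 0:
        raise ValueError("zone has no poles")
    counts = Counter(pole_classes)
    approve = counts["AUTO_APPROVE"]
    exclude = counts["EXCLUDE"]
    if approve * 100 >= 70 * n:  # at least 70% AUTO_APPROVE
        return "CONFIRMED"
    if exclude == n:  # 100% EXCLUDE
        return "EXCLUDED"
    if approve > 0 and exclude > 0:  # both present, neither dominant
        return "MIXED"
    # Covers "REVIEW present, no APPROVE". Any other leftover combination
    # also lands here, which is an assumption of this sketch.
    return "NEEDS_REVIEW"


print(zone_status(["AUTO_APPROVE", "AUTO_APPROVE", "AUTO_APPROVE", "EXCLUDE"]))
```

Integer arithmetic is used for the 70% test so that a zone sitting exactly on the threshold is not affected by floating-point rounding.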

1. Dataset Summary

File | Rows | Key Columns | Purpose
schoolzone_speedlimit_signs.csv | 326 | pole_lat/lon, bearing, ndetections, speedlimit_gfrgroup, supplemental_gfrgroup | School zone START signs (our primary input)
remaining_speedlimit_signs.csv | 11,607 | lat/lon, bearing, pole_lat/lon, ndetections, speedlimit_gfrgroup | END signs + all other speed signs (extent validation)
road_geometry.csv | 27,943 | id (UUID), link_geometry (WKT), travel_direction, road_name, average_speed | Road network for graph walking
school_pointsofinterest.csv | 4,780 (filtered: educational only) | suppliers, longitude, latitude | School proximity gate (G2)
traffic_data/HERE_DA_*.csv.gz | 48.5M records | LINK-DIR, DATE-TIME, SPDLIMIT, MEAN, COUNT | Speed anomaly cross-validation (Feb 2026)
mapillary_images/ | 1,850 images | pt{ID}_id{MAPILLARY_ID}_date_lat_lon.jpg | Visual confirmation, OCR temporal classification
enriched_master_v7_classified.csv | 315 poles | pole_id, classification, decision_path, confidence, nearest_school_dist_m | Final output of gate pipeline

School POI Pollution: The raw school_pointsofinterest.csv contains 654 unique supplier strings including Photography, Entertainment, SharedOfficeSpaces. We filter to educational keywords only (school, schule, gymnasium, kindergarten, kita, kinder, university, etc.) before use in Gate G2. Unfiltered use would produce false school proximity matches.
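
A keyword filter of this kind can be sketched in a few lines. This is a minimal illustration assuming case-insensitive substring matching on the supplier string; the keyword tuple only contains the terms named above (the source list continues with "etc."), and `is_educational` is a hypothetical name.

```python
# Keywords named in the text; the real list is longer.
EDU_KEYWORDS = ("school", "schule", "gymnasium", "kindergarten",
                "kita", "kinder", "university")


def is_educational(supplier):
    """True if the supplier string contains any educational keyword."""
    text = supplier.lower()
    return any(keyword in text for keyword in EDU_KEYWORDS)


suppliers = ["Photography", "Kindergarten and Childcare",
             "SharedOfficeSpaces", "School", "Entertainment"]
print([s for s in suppliers if is_educational(s)])
```

Substring matching is deliberately permissive, so any production list would need a check for over-matching terms.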
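[Figure: Detection Count Distribution. x-axis: n_detections per pole (bin start). Bar heights: 163 at 2, 93 at 259, 24 at 515, 11 at 772, 9 at 1028, 4 at 1285, 4 at 1542, 2 at 1798, 0 at 2055, 5 at 2311]
[Figure: Nearest School Distance (metres). x-axis: distance to nearest school (m). Bar heights: 247 for 5-239, 52 for 239-473, 11 for 473-707, 3 for 707-941, 0 for 941-1175, 0 for 1175-1409, 0 for 1409-1644, 2 for 1644-1878]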
Gate Deep Dives — The sections below give a plain-language explanation of each gate, real statistics from the 315-pole dataset, and concrete example poles to illustrate each decision.

Gate 1 — Tempo-30 Zone Check

Does this sign sit inside an existing 30 km/h zone?

Before treating a school sign as a standalone speed restriction, we check whether the area already has a blanket Tempo-30 zone. If it does, the school sign is just an advisory warning (Zeichen 136), not a separate speed limit. Including it as a new restriction would double-count the rule. We exclude those poles immediately — the 30 km/h limit was already mapped through the zone, not the school sign.

Pass (no zone overlap): 230
Excluded (inside Tempo-30 zone): 80
Keep but stricter review: 5
Total poles entering G1: 315

How it works

We load the Berlin open-data Tempo-30 zone polygons and test whether each sign's position falls inside one. Signs within a zone boundary go to EXCLUDE. Signs right on the edge (within ~20 m) get flagged as KEEP_STRICTER — possible boundary cases that need a human check.
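
The zone test can be sketched in pure Python as a point-in-polygon check plus a distance-to-boundary check. This is a hedged illustration, not the pipeline's code: it assumes projected metre coordinates, treats any sign within the tolerance of the boundary (inside or outside) as KEEP_STRICTER, and the names `gate1`, `point_in_polygon` and `dist_to_segment` are hypothetical.

```python
import math


def point_in_polygon(pt, ring):
    """Ray-casting test; ring is a list of (x, y) vertices."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def dist_to_segment(pt, a, b):
    """Shortest distance from pt to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = pt, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def gate1(pt, zones, edge_tol=20.0):
    """Return PASS, EXCLUDE or KEEP_STRICTER for a sign position."""
    result = "PASS"
    for ring in zones:
        edges = zip(ring, ring[1:] + ring[:1])
        if min(dist_to_segment(pt, a, b) for a, b in edges) <= edge_tol:
            return "KEEP_STRICTER"  # boundary case, needs a human check
        if point_in_polygon(pt, ring):
            result = "EXCLUDE"
    return result


zone = [(0, 0), (100, 0), (100, 100), (0, 100)]  # illustrative square zone
for position in [(50, 50), (105, 50), (200, 50)]:
    print(position, gate1(position, [zone]))
```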

German traffic law: Permanent school zone speed limits apply even at night, weekends, and during school holidays (ADAC confirmed). Only conditional zones have time restrictions — and only zones created by the school sign itself, not inherited zone limits.

Example — pole excluded by zone overlap

Pole 30 sits inside a Tempo-30 zone. The area already has a blanket 30 km/h limit. This sign contributes no new restriction.
Field | Value
Pole ID | 30
Gate 1 result | EXCLUDE
Spatial evidence strength | VERY_STRONG
Sign type | SpeedLimit2V30
Weekly detections | 863
Nearest school | 80 m

Contrast — pole that passes G1

For comparison, here is a sign that is not inside any Tempo-30 zone and continues through the pipeline:

Field | Value
Pole ID | 55
Gate 1 result | PASS
Nearest school | 8 m
Weekly detections | 71
Final classification | AUTO_APPROVE

Gate 2 — School Proximity Check

Is there actually a school nearby?

A school zone sign should be near a school. We search our database of 4,780 educational facilities — schools, kindergartens, after-school care — filtered to genuine educational use only (photography studios and shared office spaces were removed from the raw data). The sign must be within a sensible distance to count as truly associated with a school.

Confirmed near a school: 231
School present but distant: 3
No school found nearby: 1
Within 100 m of school: 123/315

Distance distribution

Nearest school distance for all 315 poles:
≤ 100 m: 123
101–300 m: 152
301–500 m: 28
> 500 m: 12

Most school zone signs (275 of 315, or 87%) sit within 300 m of an educational facility, which indicates that the sign placement is genuinely associated with a school rather than being a stray detection.

Example — strong proximity confirmation

Field | Value
Pole ID | 319
Gate 2 result | positive
Nearest school | 5 m
School type |
Confirmed facility name | Kindervilla am Griebnitzsee
Schools within 300 m |
Confidence score | TRIPLE_CONFIRMED

Example — no school found (gate negative)

Pole 112: the nearest educational facility is 412 m away. This may be an edge-of-zone sign or a misclassified detection.
Field | Value
Pole ID | 112
Gate 2 result | negative
Nearest facility | 412 m
Facility type | Kindergarten and Childcare
Final classification | AUTO_APPROVE

Gate 3 — Street-Level Image Check

Can we see the sign in Mapillary photos?

We pull street-level photos from Mapillary — a crowdsourced image platform where drivers upload dashcam footage. An AI model (Florence-2 vision) scans each image looking for school zone signs, children crossing warning signs, and speed plates. A visual score from 0 to 1 is assigned based on: how many images show the sign, how recent they are, and whether the camera was facing the sign front-on rather than shooting it from behind.

Images older than a few years are down-weighted — a 2017 photo confirming a sign does not tell us whether that sign is still there in 2026.

Visually confirmed: 190
Partial evidence: 19
No usable images: 26
Mean visual score: 0.80

Example — high confidence visual confirmation (Pole 240)

Field | Value
Pole ID | 240
Gate 3 result | positive
Visual score | 1.000
Evidence tier | HIGH
Mapillary images found | 8.0
Camera quality score | 0.90
Front-facing camera ratio | 1.00
Matched road | Onkel-Tom-Straße

[Image: detection overlay for pole 240; the AI model marks the sign]

[Image: cropped sign patch for pole 240, sent to OCR for text reading]

Unverified case — no usable images

Some poles have no Mapillary coverage at all, or only low-quality images taken from too far away. These are marked unverified — the sign may well be real, but visual confirmation is unavailable. The pipeline falls back to gates 1–2 and 5–6 for its confidence score.

Field | Value
Pole ID | 0
Gate 3 result | unverified
Mapillary images |
Visual score | 0.0
Final classification | AUTO_APPROVE

Florence-2 Detection Examples

Florence-2 (Microsoft, 2024) runs on GPU to detect and localise signs in street-level images. Below: bounding box detections and cropped sign regions fed to OCR.

[Images: Pole 00250, bounding box detection and cropped sign for OCR]
[Images: Pole 00235, bounding box detection and cropped sign for OCR]
[Images: Pole 00272, bounding box detection and cropped sign for OCR]
[Images: Pole 00240, bounding box detection and cropped sign for OCR]

Gate 4 — Permanent vs Conditional Check

Is this sign always active, or only at certain times?

German school zone signs come in two flavours. Permanent signs (Zeichen 274, plain speed plate) apply 24/7 — including weekends, evenings, and school holidays. Conditional signs have a supplemental plate below them showing something like "Mo–Fr 7–16 Uhr" (weekdays 7 am–4 pm only).

We determine which type each sign is through three independent checks, then cross-validate them:

  1. Supplemental flag in probe data — HERE's fleet tagged whether a sub-plate was observed
  2. OCR on cropped sign images — we read the text directly from the Mapillary photo
  3. Speed pattern from 48 million probe readings — if traffic slows down specifically during school hours and speeds up after, that is a conditional zone signal

Permanent (always active): 129
Conditional (time-restricted): 101
Inconclusive: 5
Permanent fraction: 57.1%

Key nuance: No dataset contains actual school timetables. We infer conditional hours from image OCR first, then fall back to probe speed patterns. Standard German school hours (~07:00–16:00) are used as a prior when both sources are silent.

Example — permanent restriction (Pole 107)

Field | Value
Pole ID | 107
Gate 4 result | positive_permanent
Temporal class | PERMANENT
Speed limit | 30.0 km/h
Mean speed (school hours) | 15.7 km/h
Mean speed (rest of day) | 15.2 km/h
Supplemental flag |
OCR result |

[Chart: average speed, school hours (07:00–16:00) vs rest of day: 15.7 km/h vs 15.2 km/h, against the 30 km/h limit]

Example — conditional restriction with OCR confirmation (Pole 322)

OCR found a time restriction on the sign plate: Mo-Fr
Field | Value
Pole ID | 322
Gate 4 result | positive_conditional
Time restriction found | Mo-Fr
Raw OCR text (excerpt) | "人 | 30 | Mo-Fr | 6-18h | 7 | 人 | 30 | moto- | store | GNOH | Mo-Fr | 6-18h | 人 | 30 | Mo-Fr | 6-18h | moto- | store | GM"
Speed read from image | 30.0 km/h
Useful sign crops | 11.0
Mean speed (school hours) | n/a
Mean speed (rest of day) | n/a
Speed drop | n/a

[Chart: average speed, school hours (07:00–16:00) vs rest of day: 25.0 km/h vs 30.0 km/h, against the 30 km/h limit; 5.0 km/h drop during school hours]

Gate 5 — Detection Confidence Check

How many independent probe vehicles confirmed this sign?

HERE's fleet of connected vehicles drives past signs repeatedly. Each week, the system clusters nearby detections and produces a count: how many vehicles reported seeing this sign this week. A sign seen by 2 vehicles once is much less reliable than one seen by 400 vehicles over many weeks.

We use two metrics together: the weekly detection count (how many vehicles per week) and the number of weekly snapshots (how many separate weeks this sign was detected). A sign that appears consistently over many weeks is far more trustworthy than a one-off burst.

High confidence (50+ detections): 177
Medium confidence (10–49): 44
Low confidence (< 10): 8
Median weekly detections: 243

Distribution of weekly detection counts

Weekly detection count distribution (n = 315 poles):
2–9: 19
10–49: 37
50–199: 85
200–499: 115
500+: 59

Most school zone signs have strong repeated confirmation — the median is 243 detections per week. The small tail of low-confidence signs gets flagged for human review rather than auto-approved.
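
The confidence tiers quoted above (50+, 10–49, < 10) map directly onto a threshold function. The sketch below covers only the weekly detection count; how the pipeline combines it with the number of weekly snapshots is not stated, so that part is left out. `detection_tier` is a hypothetical name.

```python
def detection_tier(weekly_detections):
    """Map a weekly detection count to the tiers quoted above."""
    if weekly_detections >= 50:
        return "HIGH"
    if weekly_detections >= 10:
        return "MEDIUM"
    return "LOW"


# Detection counts taken from the two example poles in this section.
for pole_id, count in [(107, 2568), (180, 9)]:
    print(pole_id, detection_tier(count))
```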

Example — high confidence (Pole 107)

Field | Value
Pole ID | 107
Gate 5 result | HIGH
Weekly detections | 2568
Peak detections (best week) | 2568
Weeks with detections | 1
Final classification | AUTO_APPROVE

Example — low confidence (Pole 180)

Pole 180 was detected only 9 times in a single week. It could be a new sign, a rarely travelled road, or a false positive.

Field | Value
Pole ID | 180
Gate 5 result | LOW
Weekly detections | 9
Weeks with detections | 1
Final classification | EXCLUDE

Gate 6 — Bearing & Alignment Check

Does the sign face the right direction for the road it is on?

A school zone sign governs traffic approaching from a specific direction. The sign faces the oncoming traffic — so if the sign face points east (bearing 90°), it is controlling westbound traffic (the cars heading towards it). We check whether that implied road direction matches an actual road near the sign's location.

We also check whether the Mapillary cameras that photographed the sign were approaching it head-on (front-facing) rather than driving away from it (rear-facing). A clear front-facing view confirms the sign placement relative to traffic flow.

Good alignment: 53
Acceptable (within 45°): 87
Ambiguous (needs review): 95
Bearing correction applied: 180°

Bearing maths note: The sign face direction and the road direction it governs are opposite. A sign at 90° faces east, but it controls westbound drivers (heading west towards the sign face). We apply a 180° flip, then use wrap-around angle comparison to handle the 0°/360° boundary correctly.
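
The flip and the wrap-around comparison fit in two short functions. This is a minimal sketch of the arithmetic described in the note; the function names are hypothetical.

```python
def governed_direction(sign_bearing):
    """Direction of travel the sign controls: the face bearing flipped by 180°."""
    return (sign_bearing + 180.0) % 360.0


def angular_diff(a, b):
    """Smallest absolute difference between two bearings, in [0, 180]."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


print(governed_direction(90.0))   # sign faces east, governs westbound traffic
print(angular_diff(350.0, 10.0))  # wraps across north
```

Reducing the difference modulo 360 before taking the minimum keeps the comparison correct even if an input bearing falls outside the 0–360 range.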

Example — good alignment (Pole 272)

Field | Value
Pole ID | 272
Gate 6 result | GOOD
Sign bearing (face direction) | 138.0°
Angle to matched road | n/a (match via proximity)
Matched road name | Rudolf-Breitscheid-Straße
Cameras facing sign front-on | 1%
Camera faces sign? |

Example — ambiguous bearing (Pole 0)

Ambiguous cases usually occur at junctions, one-way systems, or where the road network topology does not clearly match the sign orientation. These are sent to human review rather than auto-approved.

Field | Value
Pole ID | 0
Gate 6 result | AMBIGUOUS
Sign bearing | 73.0°
Matched road | Am Weinberg
Front-facing ratio | 0.0

Gate 7 — Sign Context & OCR Check

What do the nearby signs and the sign text itself tell us?

The final gate combines two checks in one pass:

  1. Context signs on the same pole or within a few metres — what other signs share this pole? A school sign next to a 30 km/h plate and a "StartBUA" (built-up area) marker is exactly what we expect. A school sign surrounded by motorway signs is not.
  2. OCR text extraction from the sign crop — we read the cropped sign image and look for: speed numbers, day/time text ("Mo–Fr 6–18 Uhr"), and any other restriction text on sub-plates.

Images older than 2020 are down-weighted because the sign may have changed since the photo was taken. The crop_n_useful field counts how many readable sign crops the model found in recent images.

Confirmed by context/OCR: 254
Context contradicts sign: 53
Conflicting signals: 8
Poles with no images (G7 skipped): 34

Example — positive with time restriction OCR (Pole 228)

OCR extracted a time restriction from the sign plate image: Mo-Fr
Field | Value
Pole ID | 228
Gate 7 result | positive
Context signs found | confirming context: SpeedLimit2V30 at 15m; StartBUA at 35m; SpeedLimit2V30 at 0m
OCR text (all crops) | "大 | 30 | Mo-Fr | 6-18h | 人人 | 30 | Mo-Fr | 人人 | 30 | Mo-Fr | Mo-Fr | 6-18h | 人人 | 30"
Time restriction matched | Mo-Fr
Speed read by OCR | 30.0 km/h
Useful sign crops | 7.0
Has time restriction | True

Example — confirmed by context signs (no time restriction) (Pole 208)

Field | Value
Pole ID | 208
Gate 7 result | positive
Nearby sign context | confirming context: SpeedLimit2V30 at 40m; SpeedLimit2V30 at 0m; SpeedLimit2V30 at 17m
Speed plate seen | 30.0 km/h
Useful sign crops | 0.0
Final classification | AUTO_APPROVE

Example — negative (context does not support school zone) (Pole 107)

Pole 107: the sign context or OCR text contradicts a school zone classification. Sent to REVIEW or EXCLUDE.
Field | Value
Pole ID | 107
Gate 7 result | negative
Gate 7 detail | CONTRADICTIONS: SpeedLimit2V90 at 0m (same direction, bearing_diff=0°)
OCR text | "H | E | 3 | 3 | L | 1"
Weekly detections | 2568
Final classification | AUTO_APPROVE

Total poles processed: 315
AUTO_APPROVE: 219 (69.5%)
EXCLUDE: 80 (25.4%)
REVIEW: 16 (5.1%)

Classification | Count | % | Meaning
AUTO_APPROVE | 219 | 69.5% | All gates passed; high-confidence valid school zone
EXCLUDE | 80 | 25.4% | One or more gates failed; not a valid school zone speed restriction
REVIEW | 16 | 5.1% | Borderline; gates partially passed, requires human validator

School Zone Analysis

Outcome | Zones | Notes
Confirmed | 60 | Of 74 evaluated zones
Needs review | 2 | Borderline confidence
Excluded | 12 | Not valid school zone speed restrictions

Key design note: The pipeline is designed to surface uncertainty, not hide it. The 16 REVIEW poles have full decision_paths showing exactly which gates partially failed and by what margin. This makes human review efficient: validators know exactly what to check.

DBSCAN Cluster Analysis

DBSCAN (ε=300m, min_samples=1, UTM EPSG:25833) groups poles sharing a geographic cluster into a single logical school zone unit. This corrects the broken nearest-school assignment (91% of poles were >1km from their assigned school under naive nearest-neighbour matching) and handles directional sign pairs (opposing signs for the same zone).
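
With min_samples=1 every point is a core point, so DBSCAN reduces to finding connected components of the graph that links any two poles at most ε apart. The sketch below illustrates that equivalence in pure Python with a union-find; it is not the pipeline's code, the function name is hypothetical, and coordinates are assumed to be in metres (for example UTM EPSG:25833). The pairwise loop is quadratic, which is acceptable for a few hundred poles.

```python
import math


def cluster_min_samples_1(points, eps=300.0):
    """Label points by connected component of the eps-neighbourhood graph."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= eps:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    labels, seen = [], {}
    for i in range(len(points)):
        root = find(i)
        if root not in seen:
            seen[root] = len(seen)
        labels.append(seen[root])
    return labels


# Illustrative coordinates: three poles chained 200 m apart, one far away.
poles = [(0.0, 0.0), (200.0, 0.0), (400.0, 0.0), (2000.0, 0.0)]
print(cluster_min_samples_1(poles))
```

The chained example shows why opposing sign pairs and strings of poles along one street end up in the same cluster even when the first and last pole are more than ε apart.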

Total poles: 315
Clusters (ε=300m): 137
Matched to school ≤500m: 64
No school nearby: 6

Zone status breakdown: CONFIRMED: 84 | EXCLUDED: 25 | NEEDS_REVIEW: 9 | MIXED: 19

Educational POIs used: 3985 | Mean poles per cluster: 4.5

Cluster | Poles | Nearest School | ≤500m? | School Name | Status
1 | 55 | 106.5m | | unified_schools | NEEDS_REVIEW
10 | 28 | 350.2m | | unified_schools | MIXED
2 | 26 | 39.1m | | School | NEEDS_REVIEW
5 | 26 | 83.7m | | School | NEEDS_REVIEW
23 | 8 | 62.6m | | unified_schools | MIXED
0 | 13 | 414.9m | | unified_schools | NEEDS_REVIEW
22 | 11 | 187.1m | | unified_schools | MIXED
33 | 10 | 94.4m | | School | MIXED
26 | 7 | 77.2m | | School | MIXED
44 | 6 | 35.1m | | School | CONFIRMED
13 | 5 | 15.8m | | School,Kindergarten and Childcare | MIXED
17 | 5 | 50.2m | | Kindergarten and Childcare | MIXED
24 | 5 | 37.6m | | School,Education Facility,Kindergar... | MIXED
35 | 5 | 200.8m | | School | MIXED
4 | 4 | 86.6m | | Kindergarten and Childcare | CONFIRMED

Finding: 0 school zone classifications changed versus pole-level analysis. The gate logic is already robust at the individual pole level. DBSCAN provides grouping for zone-level reporting and school matching.

4. Deliverable 2: Zone Extent — Graph Walking Algorithm

Once a pole is classified as AUTO_APPROVE, we determine how far the speed restriction extends along the road network.

  1. Spatial join: pole → road link. Project pole coordinates to UTM EPSG:25833. Find nearest road geometry from road_geometry.csv. Note: road_geometry uses UUIDs, traffic data uses numeric HERE IDs, so a spatial join is required, not string matching.
  2. Determine governed direction. Road direction the sign governs: (pole_bearing + 180) % 360. Sign faces oncoming traffic; adding 180° gives the direction of travel it restricts. Angular comparison uses modular arithmetic: min(|a-b|, 360-|a-b|) to handle 0°/360° wraparound.
     [Diagram: Bearing Math, Sign Facing → Governed Direction. On an east-west road, the sign faces oncoming traffic (270°) and the governed direction, which traffic must obey at 30 km/h, is 90°. governed = (pole_bearing + 180) % 360. Example: (270° + 180°) % 360 = 90°. Angular diff: min(|a−b|, 360−|a−b|)]
  3. Zone sign branching. SpeedLimitZoneV30 signs (Zeichen 274.1) define 2D areas, not linear stretches. These are branched separately before the linear walk begins. 1,057 such records in the dataset.
  4. Graph walk forward. Follow road topology in governed bearing direction. At each junction, select the outgoing link whose bearing is closest to the governed direction (modular angular difference).
  5. Termination detection. Walk terminates on first match: END_SIGN (Zeichen 274 Ende) | new higher speed limit | DEAD_END (no outgoing links) | ROAD_END (link boundary) | MIRROR (opposing direction sign found) | MAX_DIST (500m cap).
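
The walk in steps 4 and 5 can be sketched on a toy graph. This is a simplified illustration under stated assumptions, not the pipeline's implementation: the graph is a hypothetical dict mapping a node to `(next_node, bearing_deg, length_m)` tuples, END signs are given as a set of nodes, already visited nodes are skipped to avoid loops, and only the END_SIGN, DEAD_END and MAX_DIST rules are modelled.

```python
def angular_diff(a, b):
    """Smallest absolute difference between two bearings."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def walk_extent(graph, start, governed, end_sign_nodes, max_dist=500.0):
    """Follow the best-aligned outgoing link until a termination rule fires.
    Returns (visited nodes, metres walked, termination type)."""
    node, walked, path = start, 0.0, [start]
    while True:
        if node != start and node in end_sign_nodes:
            return path, walked, "END_SIGN"
        links = [link for link in graph.get(node, []) if link[0] not in path]
        if not links:
            return path, walked, "DEAD_END"
        # Junction rule: pick the link whose bearing is closest to governed.
        nxt, _, length = min(links, key=lambda l: angular_diff(l[1], governed))
        if walked + length > max_dist:
            return path, max_dist, "MAX_DIST"  # extent is capped mid-link
        walked += length
        node = nxt
        path.append(node)


# Toy network: A -> B, then a junction with a straight-on and a side link.
toy_graph = {
    "A": [("B", 90.0, 100.0)],
    "B": [("C", 85.0, 120.0), ("D", 180.0, 50.0)],
    "C": [],
}
print(walk_extent(toy_graph, "A", governed=90.0, end_sign_nodes={"C"}))
```

Because every choice is a deterministic minimum over link bearings, the same input always yields the same extent.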

Extent Results

Extents computed: 208
Mean zone length: 224m
Validated by END sign: 80.3%

Termination Type | Count | % | Description
END_SIGN | 77 | 37.0% | Zeichen 274 termination sign found during walk
DEAD_END | 57 | 27.4% | Road topology ends (cul-de-sac, dead end)
ROAD_END | 49 | 23.6% | Link boundary reached
MIRROR | 17 | 8.2% | Opposing direction sign found; symmetric zone
MAX_DIST | 8 | 3.8% | 500m cap reached; unusual, flagged for review

Mapping to HERE’s 4 Official Termination Types

HERE’s hackathon brief defines exactly 4 ways a school zone speed limit ends. We implement 3 of the 4 types directly.

HERE Type | Description | Our Rule | Count
Type 1 | Text with distance: sign says “Schule 30 / 2 km” | Not implemented |
Type 2 + 3 | “End of SL” / “End of all restrictions” / EndBUA or new higher speed limit sign (e.g. 50 km/h) | END_SIGN | 77
Type 4 | Bounding logic: road graph terminates (cul-de-sac) | DEAD_END | 57
Type 4 | Bounding logic: road link ends, no continuation | ROAD_END | 49
Type 4 | Bounding logic: inferred from opposite-direction pole | MIRROR_FALLBACK | 17
Type 4 | Bounding logic: safety cap at 500m, flagged for review | MAX_DISTANCE | 8

We implement 3 of 4 HERE-defined termination types. Type 1 (distance from sign text) requires OCR extraction of distance values from sign images — a natural extension of our Florence-2 pipeline. END_SIGN catches both HERE Type 2 (cancellation signs: EndSpeedLimit2V30, EndRestriction, EndBUA) and Type 3 (new higher speed limits) in a single road graph walk against all 11,607 signs in remaining_speedlimit_signs.csv.
Data provenance: what is given vs. what we computed
Input | Source | Role in extent computation
road_geometry.csv | Given (HERE) | 27,943 road links: the graph we walk
remaining_speedlimit_signs.csv | Given (HERE) | 11,607 speed signs: END_SIGN termination points
Pole bearings | Given (HERE perception) | Sign facing direction → governed direction via +180° flip
Graph walk algorithm | Our computation | Walk road topology in governed direction, select best-bearing link at junctions
Termination detection | Our computation | 5 termination rules: END_SIGN, DEAD_END, ROAD_END, MIRROR, MAX_DIST
Spatial join (pole → road) | Our computation | UTM EPSG:25833 nearest-neighbour matching (not string ID matching)

Extents are purely topological — no images, no ML, no external APIs. The algorithm is fully deterministic: same input always produces same output.

Verification: How do we know the extents are correct?

Three independent verification methods confirm extent accuracy:

Extents confirmed by independent END sign: 80.3%
Matched within 50m of predicted endpoint: 167/208
Independent END signs searched (fully separate dataset): 11,607

  1. End sign cross-validation (80.3%) — For each computed extent, we searched remaining_speedlimit_signs.csv (11,607 records — a completely separate dataset from the school zone signs) for END signs within 50m of the predicted termination point. 167 of 208 extents have an independently confirmed end sign. This is the strongest validation: two independent HERE perception datasets agreeing on the same physical location.
  2. Road topology validation — DEAD_END and ROAD_END terminations (106 extents, 51%) are structurally correct by definition: the road graph has no further links. These cannot be wrong unless the road geometry itself is wrong.
  3. Distance sanity check — Mean extent length is 224m (range: 15m–500m). 92% of extents are under 400m, consistent with German school zone regulations which typically span 100–300m around school entrances. The 8 extents that hit MAX_DIST (500m cap) are flagged for human review.
What we cannot verify: The exact metre-position of the zone boundary between the last confirmed sign and the first higher-speed sign. Our extent marks the termination trigger (the END sign or dead end), but the legal boundary may be slightly before or after. For HERE’s purposes, this is within the acceptable accuracy band for map enrichment.
[Diagram: School Zone Pole (315) → Snap to Road Link → Walk Graph in Bearing Dir → Find END Sign/Dead End → Extent LineString. Annotations: 219 approved; 27,943 links; mean 224m; 80.3% confirmed; 208 extents]
[Chart: Extent Termination Types. END_SIGN 77, DEAD_END 57, ROAD_END 49, MIRROR_FALLBACK 17, MAX_DISTANCE 8]

End Sign Cross-Validation

For each computed extent, we queried remaining_speedlimit_signs.csv (11,607 records) for independent END signs within 50m of the predicted termination point. This is a fully independent dataset from the schoolzone signs — any match is genuine cross-validation.

Extents with confirmed END sign: 167/208
Validation rate: 80.3%

When the graph-walk predicts "zone ends here" and an independent END sign exists within 50m of that point, this is strong evidence that both the walk algorithm and the underlying HERE perception data are accurate.
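
The 50m check is a plain distance query in projected coordinates. The sketch below is a minimal illustration assuming endpoints and END sign positions are already in metres (such as UTM EPSG:25833); the function names are hypothetical, and a real run over 11,607 signs would more likely use a spatial index than a linear scan.

```python
import math


def validated_by_end_sign(endpoint, end_signs, radius=50.0):
    """True if any independent END sign lies within radius of the endpoint."""
    return any(math.dist(endpoint, sign) <= radius for sign in end_signs)


def validation_rate(endpoints, end_signs, radius=50.0):
    """Return (confirmed extents, total extents, confirmed share)."""
    hits = sum(validated_by_end_sign(e, end_signs, radius) for e in endpoints)
    return hits, len(endpoints), hits / len(endpoints)


# Illustrative coordinates only: one endpoint has a sign 50 m away, one has none.
endpoints = [(0.0, 0.0), (1000.0, 0.0)]
end_signs = [(30.0, 40.0)]
print(validation_rate(endpoints, end_signs))

# The reported rate follows from the counts quoted above.
print(round(167 / 208 * 100, 1))
```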

5. Florence-2 Detection Pipeline

Signs occupy <5% of a typical Mapillary dashcam frame. Direct VLM queries on full images hallucinate at high rates. Our pipeline: detect → isolate → read.

  1. 1,850 Mapillary images fed to Florence-2 (GPU: Lightning AI L4, 23GB VRAM)
  2. Florence-2 object detection: 451 images contain sign-type objects
  3. 1,640 individual sign detections across all images
  4. Bounding box crops extracted (1,639 crops)
  5. Tesseract OCR on crops → regex extraction
  6. 35 poles yield time text; 32 yield confirmed speed text

5.1 Detection Example — Pole 00250

[Image: Pole 00250, Florence-2 detection with green bounding boxes]

Full scene: Florence-2 draws bounding boxes around the 30 km/h sign and the supplemental time plate.

[Image: cropped sign showing 30 and Mo-Fr 6-18 h]

Bounding box crop: the sign fills the frame. Tesseract reads "30" + "Mo-Fr 6-18 h".

OCR result: "30", "Mo-Fr 6-18 h" → CONDITIONAL

5.2 Detection Example — Pole 00235

[Image: Pole 00235, school crossing sign detection]

Zeichen 136 (school crossing) + 30 km/h plate: both detected.

[Image: Pole 00235 crop]

Crop: clear school crossing warning sign.

5. Deliverable 3: Permanent or Conditional? — Temporal Classification

German law (ADAC confirmed) states that permanent school zone speed limits apply even during holidays, weekends, and night. Only conditional zones have time restrictions. The supplemental_gfrgroup flag in the CSV indicates a time modifier exists — but does not provide the actual hours. The only source of truth is the physical sign text.

Permanent (always 30 km/h): 57.1%
Conditional (school hours only): 42.9%

Florence-2 Detection Pipeline

  1. 1,850 Mapillary images loaded. Downloaded for all 326 sign point IDs. 292 OK, 34 not found (10.4% coverage gap).
  2. Florence-2 object detection. Run on each image. 451 of 1,850 images contained detectable sign regions. 1,640 individual bounding boxes generated.
  3. Bounding box crop extraction. 1,639 crops extracted (110% padding to include supplemental plate below main sign). Failed crops dropped (1 image corruption).
  4. Tesseract OCR (German). OCR run on each crop with lang=deu. Raw text normalised: strip whitespace, lowercase, handle OCR errors (Uhr→Uhr, rn→m etc.).
  5. Strict German time regex. Pattern: Mo[-–]Fr\s+\d{1,2}[-–]\d{1,2}\s+Uhr and variants (Sa, Schulzeit, Schulbetrieb, während Schulbetrieb). Only strict matches accepted. No partial/fuzzy matching.
  6. Recency-weighted classification. Mapillary images span 2014–2025. A 2017 image ≠ 2026 confirmation. Weight recent images (2023+) heavily. Conflict resolution: latest image wins.
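
Steps 5 and 6 can be sketched with the standard `re` module. The pattern is the one quoted in step 5; everything else is an assumption made for illustration: the variants (Sa, Schulzeit, ...) are omitted, "latest image wins" is read as "take the strict match from the most recent crop that has one", and the function names are hypothetical.

```python
import re

# Strict pattern quoted in step 5 (hyphen or en dash between days and hours).
TIME_PLATE = re.compile(r"Mo[-–]Fr\s+\d{1,2}[-–]\d{1,2}\s+Uhr")


def strict_time_text(ocr_text):
    """Return the matched time restriction, or None without a strict match."""
    match = TIME_PLATE.search(ocr_text)
    return match.group(0) if match else None


def latest_time_text(crops):
    """crops: iterable of (capture_year, ocr_text) pairs for one pole.
    Returns the strict match from the most recent crop that has one."""
    for _, text in sorted(crops, key=lambda crop: crop[0], reverse=True):
        found = strict_time_text(text)
        if found:
            return found
    return None


# Illustrative OCR strings, not taken from the dataset.
crops = [(2017, "30 Mo-Fr 7-16 Uhr"), (2024, "30 Mo-Fr 6-18 Uhr"), (2025, "30")]
print(latest_time_text(crops))
print(strict_time_text("Mo-Fr 6-18h"))  # no "Uhr", so the strict pattern rejects it
```

Rejecting near misses is the point of the strict pattern: a crop that does not match contributes no time text at all rather than a guessed one.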

Anti-Hallucination Benchmark

Gemini Flash hallucination rate: 89.8%. When asked to describe school zone sign time restrictions from images, Gemini Flash fabricated time restriction text ("Mo–Fr 7–17 Uhr") for permanent signs with no supplemental plate visible. This is the core motivation for our strict OCR-only approach.
Approach | False Positive Rate | Method
Our pipeline (Florence-2 + strict OCR) | 5.4% | Detect crop → OCR → strict regex only
Gemini Flash (VLM baseline) | 89.8% | Direct VLM query for time restrictions

The 5.4% residual false positive rate comes from OCR misreads on poor-quality images (distance blur, vehicle motion, weather). These are flagged with low confidence scores for human review.

6. ML Experiments — What We Tried

Approach | Outcome | Root Cause | Lesson
Random Forest classifier | Circular | Feature engineering derived from ground truth labels | Feature leakage invalidates cross-validation
CLIP-GTSRB fusion | Failed | Model load failure; German signs not in GTSRB training set | Domain mismatch between GTSRB (US/generic) and German road signs
Attention model (sign detection) | Insufficient data | 315 poles insufficient for deep learning generalisation | Deep learning requires orders of magnitude more samples
Gemini Flash VLM (temporal) | 89.8% FP | VLMs hallucinate plausible-sounding time text from scene context | No VLM in decision loop; only strict OCR regex
QwenVL (image classification) | Lower accuracy | Lower performance than Florence-2 on sign region crops | Florence-2 better calibrated for object detection pre-processing
Florence-2 + OCR (final) | 5.4% FP | | Detection → crop → strict OCR is the right architecture

The ML experiments weren't failures — they were the empirical proof that deterministic gates are the right architecture for this problem. With 315 samples, interpretable threshold-based rules outperform black-box ML on every metric.

Map Enrichment — Correcting Road Speed Limits

Our pipeline does more than validate school zone signs — it surfaces cases where HERE's base map speed limit is wrong. When our confidence pipeline approves a school zone with a 30 km/h restriction, but the road link in the map database still carries a higher speed (50 or 70 km/h), that is a direct enrichment opportunity: the school zone sign changes the legal limit, and the map needs updating.

Roads where map speed > 30 km/h (confirmed school zone says 30): 113
Share of matched AUTO_APPROVE poles that represent enrichment cases: 62.1%
Highest map speed on a confirmed school zone road: 43 km/h
Roads already ≤ 30 km/h (map agrees with sign): 69

Road Speed Distribution — AUTO_APPROVE School Zones

The chart below shows the distribution of road average_speed for all AUTO_APPROVE poles that could be matched to a road link. Red bars indicate roads where the current map speed exceeds 30 km/h — these are the enrichment candidates.

[Chart: histogram of road average_speed (km/h) for AUTO_APPROVE school zones. Legend: road avg > 30 km/h (enrichment); already ≤ 30 km/h.]

Top Enrichment Cases

Roads with the largest discrepancy between sign speed (30 km/h) and the current map speed. Compliance is the proportion of probe vehicle observations at or below 30 km/h — where this is high, real-world drivers are already honouring the restriction the map doesn't yet record.

Pole ID | Road | Sign Speed | Map Speed | Probe Mean (km/h) | Compliance | n_detections
304 | L77 | 30 km/h | 43.6 km/h | 22.3 | 100% | 153
147 | B1 | 30 km/h | 43.4 km/h | 31.6 | 85% | 23
240 | Onkel-Tom-Straße | 30 km/h | 41.8 km/h | 35.6 | 60% | 425
199 | Fischerhüttenstraße | 30 km/h | 41.5 km/h | 26.8 | 100% | 389
238 | B1 | 30 km/h | 41.2 km/h | 26.1 | 100% | 140
232 | Konrad-Wolf-Allee | 30 km/h | 40.9 km/h | 30.4 | 85% | 182
236 | Lindenthaler Allee | 30 km/h | 40.7 km/h | 34.0 | 85% | 2479
139 | Mariannenstraße | 30 km/h | 40.5 km/h | 24.1 | 100% | 18
227 | Clayallee | 30 km/h | 40.4 km/h | 29.4 | 100% | 61
235 | Mariannenstraße | 30 km/h | 40.2 km/h | 24.1 | 100% | 16
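
The compliance figure is a simple proportion, and can be sketched as follows. This is a minimal illustration, assuming each road's probe observations are available as a plain list of speeds in km/h; the function name `compliance` and the sample values are hypothetical and not taken from the pipeline.

```python
def compliance(speeds_kmh, limit_kmh=30.0):
    """Share of probe observations at or below the limit (0.0 to 1.0)."""
    if not speeds_kmh:
        return None  # no observations: compliance is undefined
    at_or_below = sum(1 for s in speeds_kmh if s <= limit_kmh)
    return at_or_below / len(speeds_kmh)

# Hypothetical observations for one road link (km/h)
sample = [22.0, 28.5, 30.0, 31.2, 35.0]
print(f"{compliance(sample):.0%}")  # 3 of the 5 observations are <= 30 km/h
```

Returning `None` for an empty list keeps "no data" distinct from "0% compliant", which matters for roads with very few probe observations.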
Pitch angle: We found 113 road links where the current HERE map shows a speed above 30 km/h, but our pipeline has independently confirmed a school zone speed restriction of 30 km/h with AUTO_APPROVE confidence. Each of these is a direct, actionable enrichment record — feed it back into the map compilation pipeline to correct the posted speed. No human review required for AUTO_APPROVE cases; the 7-gate evidence chain already documents why each one is trusted.
Methodology note: Road link matching uses matched_link_id (UUID) joined to road_geometry.id. Of 219 AUTO_APPROVE poles, 182 had a matched road link with a valid average_speed. The remaining 37 poles either had no link match or a null average_speed in the road geometry dataset.
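
The join described in the methodology note can be sketched in a few lines. The field names `matched_link_id`, `id` and `average_speed` come from the text above; the `decision` and `pole_id` fields, the function name, and the sample records are assumptions made for illustration only.

```python
SIGN_LIMIT_KMH = 30.0

def find_enrichment_candidates(poles, road_geometry):
    """Split AUTO_APPROVE poles into enrichment, agreeing and unmatched groups."""
    # Index road links by id so each pole lookup is a dictionary access.
    speed_by_link = {r["id"]: r.get("average_speed") for r in road_geometry}
    enrich, agree, unmatched = [], [], []
    for pole in poles:
        if pole["decision"] != "AUTO_APPROVE":
            continue
        speed = speed_by_link.get(pole.get("matched_link_id"))
        if speed is None:
            unmatched.append(pole["pole_id"])   # no link match or null speed
        elif speed > SIGN_LIMIT_KMH:
            enrich.append(pole["pole_id"])      # map speed exceeds sign speed
        else:
            agree.append(pole["pole_id"])       # map already <= 30 km/h
    return enrich, agree, unmatched

# Hypothetical records; link ids are placeholders, not real UUIDs
roads = [{"id": "link-a", "average_speed": 43.6},
         {"id": "link-b", "average_speed": 28.0},
         {"id": "link-c", "average_speed": None}]
poles = [{"pole_id": 1, "decision": "AUTO_APPROVE", "matched_link_id": "link-a"},
         {"pole_id": 2, "decision": "AUTO_APPROVE", "matched_link_id": "link-b"},
         {"pole_id": 3, "decision": "AUTO_APPROVE", "matched_link_id": "link-c"},
         {"pole_id": 4, "decision": "AUTO_APPROVE", "matched_link_id": None},
         {"pole_id": 5, "decision": "EXCLUDE", "matched_link_id": "link-a"}]
print(find_enrichment_candidates(poles, roads))
```

The reported counts are consistent with this three-way split: 113 + 69 = 182 matched poles, 113 / 182 ≈ 62.1%, and 219 − 182 = 37 unmatched.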

7. Speed Anomaly Audit

We analysed 48.5M HERE probe speed records (Feb 2026, weekdays 06:00–20:00) to cross-validate school zone classifications via observed vehicle behaviour.

0
Classifications changed by speed audit
48.5M
Speed records analysed

The null result is informative: the gate pipeline classifications are not contradicted by observed probe vehicle speeds. Zones classified as AUTO_APPROVE show speed patterns consistent with active 30 km/h restrictions during school hours. Zones classified as EXCLUDE show no such pattern.

Limitation: Traffic LINK-DIR uses numeric HERE link IDs, while road_geometry uses UUIDs. A spatial join (UTM EPSG:25833) is required to link these datasets. Direct string matching fails and returns no results.
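
One way to bridge the two ID schemes is a nearest-geometry match in projected coordinates. The sketch below is a brute-force illustration in plain Python, assuming each traffic link is represented by a single point and each road link by a polyline, both already in UTM metres. The function names, the 10 m tolerance and the sample coordinates are hypothetical; a production join over tens of thousands of links would normally use a spatial index rather than nested loops.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to segment a-b (metres, projected CRS)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)  # degenerate segment
    # Project p onto the segment and clamp to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def match_links(traffic_points, road_lines, max_dist_m=10.0):
    """Map numeric traffic link ID to the UUID of the nearest road line."""
    matches = {}
    for link_id, point in traffic_points.items():
        best_uuid, best_dist = None, max_dist_m
        for uuid, coords in road_lines.items():
            for a, b in zip(coords, coords[1:]):
                d = point_segment_distance(point, a, b)
                if d <= best_dist:
                    best_uuid, best_dist = uuid, d
        if best_uuid is not None:
            matches[link_id] = best_uuid
    return matches

# Hypothetical data: two traffic links, two road polylines
traffic_points = {1001: (5.0, 1.0), 1002: (500.0, 500.0)}
road_lines = {"uuid-a": [(0.0, 0.0), (10.0, 0.0)],
              "uuid-b": [(0.0, 50.0), (10.0, 50.0)]}
print(match_links(traffic_points, road_lines))
```

Links with no road geometry inside the tolerance are left out of the result rather than forced onto a distant match.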

8. Tempo-30 Overlap Analysis

Germany has extensive Tempo-30-Zone (SpeedLimitZoneV30) coverage in urban areas. These zones (Zeichen 274.1) establish area-wide 30 km/h limits independent of school zone signs.

89.3%
Study area already ≤30 km/h
G6
Gate that handles this exclusion

School signs inside an existing Tempo-30-Zone are excluded (Gate G6). In that context, the school sign (Zeichen 136 — "Achtung Kinder") is advisory only, not a separate speed restriction. The 30 km/h limit comes from the zone, not the school sign. Counting these as independent school zone speed restrictions would double-count existing restrictions.
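
The core of the G6 check is a containment test: does the pole fall inside any Tempo-30 zone polygon? A minimal sketch, assuming zones are simple polygons given as lists of (x, y) vertices in projected metres. The function names, the `PASS` label and the sample square are hypothetical, and the real gate may also use a distance measure, which is not modelled here.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point lies inside polygon (list of (x, y))."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def gate_g6(pole_xy, tempo30_zones):
    """Return 'EXCLUDE' if the pole is inside any Tempo-30 zone, else 'PASS'."""
    for zone in tempo30_zones:
        if point_in_polygon(pole_xy, zone):
            return "EXCLUDE"
    return "PASS"

# Hypothetical 100 m x 100 m zone
zone = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
print(gate_g6((20.0, 20.0), [zone]))   # pole inside the zone
print(gate_g6((150.0, 20.0), [zone]))  # pole outside the zone
```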

9. Scalability and Business Value

Pipeline Properties

Property | Detail
Deterministic | No LLM in decision loop; same data always produces same output
Threshold-tunable | Gate parameters (min n_detections, max proximity, max zone length) are config values; adjust without retraining
City-agnostic | OSM road network + Mapillary available globally; deploy to any city with HERE perception data
Auditable | Every pole has a full decision_path field showing exactly which gates passed/failed and by what margin
QA prioritised | REVIEW poles sorted by uncertainty score; human validators check highest-risk first
Siemens AX4 ready | Confidence scores + decision paths exportable as structured JSON for logistics routing integration
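
To make the deterministic, threshold-tunable and auditable properties concrete, here is a minimal sketch of two config-driven gates that write a `decision_path` and export it as JSON. The 300 m G2 threshold and the Pole 89 figures are quoted in the case studies; the `min_n_detections` value, the record layout and the function name are assumptions for illustration, not the pipeline's actual schema.

```python
import json

# Gate thresholds live in config, so tuning means editing values, not retraining.
CONFIG = {
    "min_n_detections": 10,          # hypothetical value, not from the pipeline
    "max_school_distance_m": 300.0,  # G2 proximity threshold quoted in Case Study 3
}

def evaluate_pole(pole, config=CONFIG):
    """Run two illustrative gates and record pass/fail plus margin for each."""
    g1_margin = pole["n_detections"] - config["min_n_detections"]
    g2_margin = round(config["max_school_distance_m"] - pole["school_distance_m"], 1)
    decision_path = [
        {"gate": "G1", "passed": g1_margin >= 0, "margin": g1_margin},
        {"gate": "G2", "passed": g2_margin >= 0, "margin": g2_margin},
    ]
    return {"pole_id": pole["pole_id"], "decision_path": decision_path}

# Figures for Pole 89 as quoted in the case studies
pole_89 = {"pole_id": 89, "n_detections": 565, "school_distance_m": 187.9}
record = evaluate_pole(pole_89)
print(json.dumps(record, sort_keys=True, indent=2))
```

Because the function is pure and the JSON keys are sorted, running it twice on the same input yields byte-identical output, which is the property the "Deterministic" row relies on.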

Data Sources Fused

Source | Used For | Scale
HERE Perception (44M vehicles) | Sign detections, n_detections confidence | 315 poles (Potsdam/SW Berlin)
Mapillary | Visual confirmation, temporal OCR | 1,850 images, 1,639 crops
OpenStreetMap | Road graph for extent walking | 27,943 links
Berlin Geoportal schools | School proximity gate G2 | 4,780 POIs (filtered)
HERE probe speed data | Speed anomaly cross-validation | 48.5M records (Feb 2026)
Florence-2 (ML) | Sign region detection in images | 1,640 bounding boxes

10. Exclusion Case Studies

A robust validation pipeline must demonstrate it rejects false positives, not merely that it approves true ones. The 80 EXCLUDED poles are not pipeline failures — they are correct decisions. The three cases below show the pipeline correctly withholding enrichment where a school sign exists physically but does not constitute a standalone speed restriction in law.

Why exclusions matter: If the pipeline rubber-stamped every school zone sign, HERE's map database would double-count restrictions already covered by Tempo-30-Zones (Zeichen 274.1). Gate G6 is the mechanism that prevents this. These three poles demonstrate Gate G6 working correctly on high-signal, high-confidence inputs — cases where a naive system would have incorrectly approved.

Case Study 1 — Pole 89 (The Clearest False-Positive Trap)

Location: 52.423073°N, 13.313412°E  |  Gate result: EXCLUDE — G6 Tempo-30 zone overlap at 19.9m

[Image: Pole 89 street-level image (Mapillary)]
565
n_detections (weekly)
187.9m
Nearest school
1.00
Visual score
18
Mapillary images
80 days
Image age (recent)
19.9m
Zone overlap distance

Pole 89 carries a SupplementalTimeModifier flag, indicating a conditional restriction. It sits 187.9m from a school, has 18 recent Mapillary images (80 days old), a perfect visual confirmation score of 1.00, and a weekly detection count of 565 — 47% above the dataset mean of 385.

The false-positive trap: 565 probe vehicles per week, perfect visual confirmation, a school 188m away, and a conditional time modifier. Every surface signal points to include. A pipeline relying only on Gates G1–G5 would approve this pole. Gate G6 catches what the others cannot: this pole sits 19.9m inside an existing SpeedLimitZoneV30 area. The 30 km/h restriction is already mandated by the zone (Zeichen 274.1). The school sign here is Zeichen 136 — "Achtung Kinder" — advisory only. Enriching this as an independent school zone speed restriction would create a duplicate entry in HERE's map database. Excluded correctly.

Case Study 2 — Pole 80

Location: 52.418526°N, 13.336783°E  |  Gate result: EXCLUDE — G6 Tempo-30 zone overlap at 24.0m

[Image: Pole 80 street-level image (Mapillary)]
145
n_detections (weekly)
185.2m
Nearest school
1.00
Visual score
18
Mapillary images
320 days
Image age
24.0m
Zone overlap distance

Pole 80 is structurally similar to Pole 89: school nearby, perfect visual confirmation, SupplementalTimeModifier present. Weekly detections (145) are below the dataset mean, but still comfortably above the minimum threshold — a naive detector would pass this. The Mapillary images are 320 days old (versus 80 for Pole 89), slightly reducing temporal confidence, but Gate G6 is the operative gate here. Zone overlap at 24.0m places it firmly inside Tempo-30 territory. Excluded correctly.

Case Study 3 — Pole 12 (Multiple Red Flags)

Location: 52.391018°N, 13.092919°E — Rudolf-Breitscheid-Straße, Potsdam  |  Gate result: EXCLUDE — G6 Tempo-30 zone overlap at 22.0m

[Image: Pole 12 street-level image (Mapillary)]
7
n_detections (weekly)
1,733m
Nearest school
1.00
Visual score
12
Mapillary images
22.0m
Zone overlap distance
3 flags
Concurrent red flags

Pole 12 is the most straightforwardly incorrect candidate in this set. Three concurrent failure signals converge:

  1. Zone overlap (G6): 22.0m inside a SpeedLimitZoneV30 — operative exclusion gate.
  2. Low detection count (G1 pressure): n_detections = 7, compared to a dataset mean of 385. Only 7 probe vehicles per week confirmed this sign — the noise floor, not a reliable signal.
  3. School far away (G2 pressure): Nearest verified educational POI is 1,733m distant — nearly six times the G2 proximity threshold of 300m. No school is in range to justify a school zone restriction at this location.
Pipeline behaviour on convergent failure: Gate G6 alone is sufficient to exclude Pole 12. The additional pressure from G1 (low detections) and G2 (no nearby school) provides independent corroboration. The pipeline records all gate outcomes in the decision_path field, so a human validator reviewing this EXCLUDE decision sees the full picture — not just why it was excluded, but how many independent signals agree. This is the QA prioritisation layer at work: Pole 12 ranks near the bottom of any validator queue, requiring minimal human attention.
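
The ratios quoted across the three case studies can be checked with a short calculation. This is only a worked arithmetic sketch using the figures stated above (dataset mean of 385 weekly detections, 300 m G2 threshold); the helper name is hypothetical.

```python
DATASET_MEAN_DETECTIONS = 385  # weekly mean quoted in the case studies
G2_THRESHOLD_M = 300           # school proximity threshold quoted for G2

def pct_vs_mean(n_detections, mean=DATASET_MEAN_DETECTIONS):
    """Percentage difference between a pole's weekly detections and the mean."""
    return (n_detections / mean - 1) * 100

for pole_id, n in [(89, 565), (80, 145), (12, 7)]:
    print(f"Pole {pole_id}: {pct_vs_mean(n):+.1f}% vs dataset mean")

# Pole 12: distance to nearest school as a multiple of the G2 threshold
print(f"Pole 12 school distance: {1733 / G2_THRESHOLD_M:.2f}x threshold")
```

Pole 89 comes out at roughly 47% above the mean, and Pole 12's school distance at about 5.8 times the threshold, matching the figures in the text.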

HERE Technologies Berlin Hackathon 2026 · School Zone Speed Limit Validation · Technical Evidence Document
Generated: April 2026 · Map: http://89.167.96.65:8081/transport_map.html