Design: Neighbor-site following (multi-site)
Status: proposal (design-first; no behaviour change in this document). Scope: P25 Phase 1/2 first; the data model is protocol-neutral so DMR T3 / NXDN can follow.
Problem
GopherTrunk already decodes and displays the P25 broadcasts that describe a
system’s topology, but it does not act on them. The package doc in
internal/trunking/site.go:1-4 lists “(later) multi-site neighbor tracking” as
an explicit gap. SDRTrunk uses neighbor broadcasts two ways operators ask for:
- Roaming follow — when a radio moves to an adjacent site, follow its calls there instead of losing the conversation at the site boundary.
- Stronger-site selection — when the camped control channel degrades, hop to a neighbor site’s CC that decodes more cleanly.
This proposal designs how to add those two behaviours on top of the topology data GopherTrunk already has, without widening the behaviour-change surface into a refactor (per CONTRIBUTING.md / CLAUDE.md “design first”).
What already exists (reuse, do not rebuild)
The decode + display path is complete and is the foundation here:
- Decode. internal/radio/p25/phase1/opcodes.go parses the three topology broadcasts: Network Status (0x3B), RFSS Status (0x3A), and Adjacent Site Status (0x3C, ParseAdjacentSiteStatusBroadcast, ~opcodes.go:648).
- Accumulate. internal/radio/p25/phase1/network.go folds them into a NetworkModel: ApplyAdjacentSite (~network.go:200) de-duplicates neighbors by (RFSS, Site) and votes System ID; Snapshot() returns the topology with neighbors sorted by (RFSS, Site).
- Resolve + expose. control.go:TopologySnapshot() (~control.go:225) resolves each neighbor’s channel ID/number to Hz via the IDEN_UP band plan and emits TopoNeighborRef{RFSS, Site, ChannelID, ChannelNumber, FrequencyHz}. internal/trunking/grant.go carries it on SiteUpdate.Topology; internal/trunking/site_tracker.go:Topology() keeps the latest snapshot per system; internal/api/{types,handlers}.go serves it as SystemDTO.Neighbors; internal/trunking/network_report.go renders it (ReportNeighbor, RenderNeighborLines).
Key point: the neighbor control-channel frequencies are already resolved and live in the per-system topology snapshot. Following is a control-flow / device-allocation problem, not a decode problem.
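To make the accumulate step concrete, here is a minimal sketch of de-duplicating neighbors by (RFSS, Site) and returning them in sorted order. It is not the real NetworkModel: NeighborRef, NeighborSet, and all field types are assumptions chosen for illustration, and System ID voting is left out.

```go
package main

import "sort"

// NeighborRef mirrors the fields listed for TopoNeighborRef.
// The field types are assumptions; the real struct may differ.
type NeighborRef struct {
	RFSS, Site    uint8
	ChannelID     uint8
	ChannelNumber uint16
	FrequencyHz   uint64
}

// siteKey identifies a neighbor site for de-duplication.
type siteKey struct{ RFSS, Site uint8 }

// NeighborSet is a hypothetical accumulator, not the real NetworkModel.
type NeighborSet struct {
	bySite map[siteKey]NeighborRef
}

// Apply records a neighbor, replacing any earlier entry for the same (RFSS, Site).
func (s *NeighborSet) Apply(n NeighborRef) {
	if s.bySite == nil {
		s.bySite = make(map[siteKey]NeighborRef)
	}
	s.bySite[siteKey{n.RFSS, n.Site}] = n
}

// Snapshot returns a copy of the neighbors sorted by (RFSS, Site).
func (s *NeighborSet) Snapshot() []NeighborRef {
	out := make([]NeighborRef, 0, len(s.bySite))
	for _, n := range s.bySite {
		out = append(out, n)
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].RFSS != out[j].RFSS {
			return out[i].RFSS < out[j].RFSS
		}
		return out[i].Site < out[j].Site
	})
	return out
}
```

Sorting in Snapshot rather than on insert keeps Apply cheap on the broadcast path and gives every consumer the same deterministic order.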
Non-goals
- No automatic wideband multi-site coverage (one front end watching N sites at once) — that is a separate, hardware-bound feature. This design covers a single control receiver that re-tunes between sites, plus the existing voice-follow device.
- No change to how neighbors are decoded, voted, or displayed.
- No new config DSL beyond a couple of opt-in toggles (below).
Design
Data already in hand
SiteTracker.Topology(system) → TopologySnapshot with:
PrimaryCC, Secondary[], and Neighbors []TopoNeighborRef (each with a
resolved FrequencyHz). That is the candidate set for both behaviours.
Behaviour 1 — stronger-site selection (control-channel roam)
Today the CC hunter (System.HuntOrder, site.go:416) ranks only the
operator-configured ControlChannels. Extension:
- Feed decoded neighbor CC frequencies into the hunt candidate set as a secondary tier, below the configured list (configured CCs remain the floor/seed; neighbors are discovered extras). Keep them separate so a bad decode can’t permanently pollute the configured order.
- Add a lock-quality signal to the decision. The decoder already surfaces per-lock SNR/EVM-style metrics (see CLAUDE.md DSP notes and the gophertrunk_sdr_* metrics); gate a roam on a sustained deficit on the camped CC (hysteresis + dwell, not a single bad frame) so we don’t flap.
- On roam, re-tune the control receiver to the chosen neighbor CC and let the existing hunter re-lock. The neighbor’s own broadcasts then refine topology from the new vantage point.
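The hysteresis + dwell gate in the second bullet could look roughly like the sketch below. RoamGate and its fields are hypothetical names, and the quality value is assumed to be a single number where higher is better (for example an SNR in dB); the proposal does not yet fix the metric.

```go
package main

import "time"

// RoamGate is a hypothetical hysteresis + dwell gate for the camped CC.
type RoamGate struct {
	EnterBelow float64       // quality below this starts or continues a deficit
	ExitAbove  float64       // quality at or above this clears it (should exceed EnterBelow)
	Dwell      time.Duration // how long the deficit must persist before roaming

	inDeficit bool
	since     time.Time // when the current deficit began
}

// Observe feeds one quality sample taken at now and reports whether the
// deficit has lasted long enough to justify a roam.
func (g *RoamGate) Observe(now time.Time, quality float64) bool {
	switch {
	case quality < g.EnterBelow:
		if !g.inDeficit {
			g.inDeficit = true
			g.since = now
		}
	case quality >= g.ExitAbove:
		g.inDeficit = false
	}
	// Between the two thresholds the previous state is kept (hysteresis).
	return g.inDeficit && now.Sub(g.since) >= g.Dwell
}
```

With two thresholds, a sample that hovers around one cut-off cannot toggle the state on every frame, and the dwell check means a single bad frame never triggers a re-tune.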
Selection policy (start simple, document it): prefer the configured CC; consider
a neighbor only when the camped CC is below a quality threshold for N seconds;
among neighbors, pick highest recent lock quality, breaking ties by (RFSS,
Site) for determinism.
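The neighbor-ranking part of that policy can be sketched as follows. Candidate and PickNeighbor are hypothetical names, and Quality stands in for whatever lock-quality metric is eventually chosen (higher is assumed to be better).

```go
package main

import "sort"

// Candidate is a hypothetical neighbor candidate with a recent lock-quality score.
type Candidate struct {
	RFSS, Site  uint8
	FrequencyHz uint64
	Quality     float64 // higher is better; the unit is an assumption
}

// PickNeighbor returns the candidate with the highest Quality, breaking ties
// by ascending (RFSS, Site). ok is false when there are no candidates.
func PickNeighbor(cands []Candidate) (best Candidate, ok bool) {
	if len(cands) == 0 {
		return Candidate{}, false
	}
	// Sort a copy so the caller's slice order is left untouched.
	sorted := append([]Candidate(nil), cands...)
	sort.Slice(sorted, func(i, j int) bool {
		a, b := sorted[i], sorted[j]
		if a.Quality != b.Quality {
			return a.Quality > b.Quality
		}
		if a.RFSS != b.RFSS {
			return a.RFSS < b.RFSS
		}
		return a.Site < b.Site
	})
	return sorted[0], true
}
```

Because the comparison is a total order over (Quality, RFSS, Site), the same inputs always produce the same choice regardless of the order in which neighbors were decoded.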
Behaviour 2 — roaming voice follow
A grant on the camped site references a channel resolvable from that site’s band
plan, so same-site follow is unchanged. Cross-site follow is the harder case and
is explicitly staged last because it needs a second receiver or a re-tune
window the control path can tolerate. Initial version: surface a roam event
(the call’s talkgroup last seen here, now active on neighbor (RFSS, Site)) on
the bus and in the API, so an operator/automation can act, without yet
stealing the single control receiver mid-call. Full automatic cross-site audio
follow is a follow-on once Behaviour 1’s re-tune machinery and quality gating are
proven.
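As a starting point for discussion, the roam event might carry a payload like the one sketched here. Every name and JSON key is hypothetical; the proposal only fixes what the event conveys (a talkgroup last seen on the camped site is now active on a neighbor identified by (RFSS, Site)).

```go
package main

import "encoding/json"

// RoamEvent is a hypothetical payload for the bus and the API.
// Field names and JSON keys are illustrative, not an agreed schema.
type RoamEvent struct {
	System    string `json:"system"`
	Talkgroup uint32 `json:"talkgroup"`
	FromRFSS  uint8  `json:"from_rfss"` // site where the talkgroup was last seen
	FromSite  uint8  `json:"from_site"`
	ToRFSS    uint8  `json:"to_rfss"` // neighbor where it is now active
	ToSite    uint8  `json:"to_site"`
	ToCCHz    uint64 `json:"to_cc_hz"` // neighbor control-channel frequency
}

// Encode renders the event as JSON for a bus or API consumer.
func (e RoamEvent) Encode() ([]byte, error) {
	return json.Marshal(e)
}
```

Including the neighbor's resolved control-channel frequency lets an operator or automation act on the event without a second lookup against the topology snapshot.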
Config surface (opt-in, minimal)
- system.neighbor_follow: off|select|roam (default off). select enables Behaviour 1; roam additionally emits roam events (Behaviour 2 stage 1).
- Reuse existing quality thresholds where possible; expose at most one neighbor_roam_dwell knob rather than a new tuning DSL.
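One possible way to model the three-valued toggle is sketched below. FollowMode and ParseFollowMode are hypothetical names; treating an empty value as off follows the stated default, while trimming surrounding whitespace is an added assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// FollowMode is a hypothetical enum for system.neighbor_follow.
type FollowMode int

const (
	FollowOff    FollowMode = iota // default: no behaviour change
	FollowSelect                   // Behaviour 1: stronger-site selection
	FollowRoam                     // Behaviour 1 plus roam events
)

// ParseFollowMode maps a config string to a FollowMode.
// An empty value means the key was omitted and resolves to off.
func ParseFollowMode(s string) (FollowMode, error) {
	switch strings.TrimSpace(s) {
	case "", "off":
		return FollowOff, nil
	case "select":
		return FollowSelect, nil
	case "roam":
		return FollowRoam, nil
	}
	return FollowOff, fmt.Errorf("neighbor_follow: unknown mode %q (want off, select, or roam)", s)
}

// SelectsSites reports whether Behaviour 1 is enabled.
func (m FollowMode) SelectsSites() bool { return m >= FollowSelect }

// EmitsRoamEvents reports whether Behaviour 2 stage 1 is enabled.
func (m FollowMode) EmitsRoamEvents() bool { return m == FollowRoam }
```

Ordering the constants so that each mode includes the previous one keeps the "roam additionally emits" relationship a simple comparison.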
Interaction with existing machinery
- CC hunter (site.go / the hunt supervisor): neighbor CCs extend the candidate set; HuntOrder stays the configured-floor source of truth.
- Grant follow (internal/trunking grant path): unchanged for same-site; gains a roam-event emission for cross-site.
- SiteTracker: already the per-system topology owner; it becomes the source the selection policy reads from. No new global state.
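The two-tier candidate set described for the CC hunter can be sketched like this. HuntCandidate and BuildCandidates are hypothetical and do not reflect the real HuntOrder signature; the sketch also assumes an unresolved neighbor frequency is represented as 0.

```go
package main

// HuntCandidate is a hypothetical hunt entry.
type HuntCandidate struct {
	FrequencyHz uint64
	Discovered  bool // true for neighbor-derived entries, false for configured CCs
}

// BuildCandidates returns configured CCs first, in configured order, followed
// by neighbor CC frequencies that are not already present. Neighbor entries
// with frequency 0 (assumed to mean "not resolved") are skipped.
func BuildCandidates(configured, neighbors []uint64) []HuntCandidate {
	seen := make(map[uint64]bool, len(configured)+len(neighbors))
	out := make([]HuntCandidate, 0, len(configured)+len(neighbors))
	for _, hz := range configured {
		if !seen[hz] {
			seen[hz] = true
			out = append(out, HuntCandidate{FrequencyHz: hz})
		}
	}
	for _, hz := range neighbors {
		if hz == 0 || seen[hz] {
			continue
		}
		seen[hz] = true
		out = append(out, HuntCandidate{FrequencyHz: hz, Discovered: true})
	}
	return out
}
```

Rebuilding the list from the configured slice on every call means a bad neighbor decode only affects the discovered tier and disappears once the topology snapshot drops it.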
Staged implementation plan
1. Plumb neighbor CCs into the hunt candidate set (no auto-roam yet): expose them as discovered candidates, log/metric them, keep configured order authoritative. Verifiable with a topology fixture → expected candidate list.
2. Quality-gated control-channel roam (Behaviour 1): add hysteresis/dwell selection over camped-vs-neighbor lock quality; re-tune on sustained deficit. Verifiable with a synthetic two-CC replay where one CC degrades.
3. Roam-event emission (Behaviour 2 stage 1): bus + API event when a tracked talkgroup appears on a neighbor site. No receiver stealing.
4. Automatic cross-site voice follow (Behaviour 2 stage 2): only after 1-3 are proven and the device/re-tune budget is understood.
Each stage is a separate PR with its own failing-first test; stages 1-3 are testable offline against topology/replay fixtures (no second radio required).
Open questions for the maintainer
- Single re-tuning control receiver vs. requiring a dedicated dongle per site for true simultaneous multi-site (the gophertrunk_sdr_iq_power_dbfs/per-tap story in daemon.go already warns weak co-tenant sites need their own front end).
- Whether select should ever override a configured CC, or only ever add neighbors as extras.
- Quality metric to gate on: reuse the demod SNR/EVM the replay path reports, or a cheaper proxy (TSBK CRC pass rate) for the live decision.