Network Topology via GraphDB | AcropolisDocs
Architecture Blueprint

Mobile Network Topology GraphDB

Mobile networks are fundamentally relational systems — cell neighbor meshes, transport backhaul hierarchies, and 5G slice service chains are all graphs that relational and time-series stores model poorly at scale. A purpose-built GraphDB layer, operating as a federated topology intelligence plane over existing RAN, Core, and Transport systems, stores relationships and traversal-critical properties in the graph while leaving authoritative attribute detail in source systems. Agents traverse causal chains across domain boundaries in under a second — identifying common transport ancestors behind alarm storms, tracing handover failure paths through neighbor meshes, assessing slice SLA exposure from a single NF fault.

  • 30–90 min: manual cross-domain correlation time
  • <1 s: with GraphDB topology traversal
  • 1,000×: faster complex network queries
  • 20×: less code required

The Structural Problem with Today's Network Data

Mobile networks generate enormous volumes of event, alarm, KPI, and configuration data — but the data is siloed by domain. Your RAN OSS doesn't natively know that a gNodeB backhaul link traverses a specific transport segment that also carries AMF signaling. Your Core NMS doesn't know which cells are co-located neighbors or which UEs are currently handing over between two sites experiencing simultaneous degradation.

The result is the classic correlation gap: Network engineering teams manually stitch together context across four or five different systems, taking 30–90 minutes to build a picture that a graph query can assemble in under a second. This is not a tooling problem — it is a data model problem. The relationships between network entities are first-class information, and they need first-class storage.

Importantly, solving this does not require consolidating all topology data into a single system. The GraphDB can operate as a federated inventory layer — holding identity keys and relationships across domains, while individual source systems retain ownership of their full attribute sets. Traversal-critical properties (band, operational state, role) are replicated into the graph for query performance; everything else is fetched on demand from the authoritative source by reference.
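The split between graph-resident and federated data can be illustrated with a minimal Python sketch. Everything named here is hypothetical: `GraphNode`, `SOURCE_SYSTEMS` (an in-memory stand-in for a real RAN OSS API client), and `fetch_full_attributes` are illustrative only and do not correspond to any product interface.

```python
from dataclasses import dataclass, field


@dataclass
class GraphNode:
    key: str      # identity key shared with the owning source system
    label: str    # node label, e.g. "Cell"
    source: str   # owning system, e.g. "ran_oss"
    props: dict = field(default_factory=dict)  # traversal-critical properties only


# Hypothetical stand-in for source-system APIs (assumption, not a real client).
SOURCE_SYSTEMS = {
    "ran_oss": {"CELL-001": {"txPower": 43, "dlBW": 100, "pci": 212}},
}


def fetch_full_attributes(node: GraphNode) -> dict:
    """Resolve full attributes by identity key; the source system wins on conflict."""
    authoritative = SOURCE_SYSTEMS[node.source].get(node.key, {})
    return {**node.props, **authoritative}


cell = GraphNode("CELL-001", "Cell", "ran_oss", {"band": "n77", "opState": "ENABLED"})
full = fetch_full_attributes(cell)
```

The graph node carries only `band` and `opState`; `txPower`, `dlBW`, and `pci` are resolved on demand, so the graph never becomes a second copy of the inventory.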

Why Graphs Fit Mobile Networks
  • Cells have overlapping coverage — inherently a graph edge (NEIGHBOR_OF)
  • Handover sequences trace paths through adjacency relationships
  • Transport paths hop through multiple nodes and logical segments
  • Multi-band environments create layered spectrum relationships per site
  • 5G slices traverse RAN → Transport → UPF → DN as a graph traversal
  • Failure propagation follows topology — downstream nodes relate to upstream faults
  • X2/Xn neighbor tables are literal graph edge lists already managed in the RAN
  • Federated model means source systems stay authoritative — graph stores relationships, not duplicated inventory
Where Flat Models Break Down
  • TSDB: great for KPI trends, blind to topology relationships
  • Elasticsearch: full-text alarm correlation, no causal chain traversal
  • Relational DB: join explosions when modeling N:M neighbor meshes
  • No native representation of "which cells share the same transport segment"
  • Cannot model sector-to-carrier-to-band-to-spectrum-block relationships efficiently
  • Multi-hop queries (cell → site → transport ring → AMF) require complex ETL
  • Schema migrations are expensive when network topology changes

Should You Build This? — The Honest Assessment

This is a significant platform investment. Before committing, weigh the structural advantages against the real operational and organizational costs. The decision should hinge on your RCA scale problem, your willingness to invest in graph data engineering, and your agentic AI maturity. Critically, the investment profile changes significantly depending on whether you pursue a full consolidation model versus a federated inventory model. The federated approach is almost always the right starting point.

Strong Reasons To Build
  • Network Operations spending >45 min/incident correlating cross-domain context manually
  • Operating multi-band (n77+n41+B4/13) with complex inter-frequency neighbor meshes
  • Repeated cascading failures — single transport fault triggers 50+ alarms across domains
  • Building agentic AI — agents need a structured topology query interface, not free-text alarm logs
  • Handover failure RCA requires traversing: cell → neighbor list → X2 interface → transport → shared node
  • RAN has 1M+ cells: at this scale, neighbor relationship management is a graph problem that is impractical to handle without a purpose-built store
  • Network slicing (5G SA) maps service chains across 4+ NFs — a natural graph traversal
  • Proactively identifying single points of failure in your transport mesh
  • Running spectrum refarming — impact radius queries are graph-native
  • Federated model: RAN OSS, Transport NMS, and Core NMS remain authoritative — the graph adds relationship intelligence without displacing existing investments
Reasons to Pause or Simplify
  • A small network (<5k cells) — a well-structured relational model may suffice
  • No agentic AI roadmap — incremental value may not justify cost
  • Source topology data is incomplete or unreliable: the graph reflects what you feed it
  • No graph-skilled engineering resources — Neo4j requires dedicated expertise
  • Existing NMS provides cross-domain correlation at adequate depth
  • Topology change velocity is low — pipeline complexity may exceed benefit
  • Data governance is immature — incomplete topology data creates false RCA conclusions
  • Source systems have poor or no APIs — federated model requires reliable on-demand query capability
  • Short-term MTTR improvement may be achieved faster with simpler enrichment pipelines
Verdict

For agentic RCA at scale across 1M+ cells in multi-band, multi-domain environments, a GraphDB topology plane is not optional — it is the enabling infrastructure. Start with the federated inventory model: the graph owns relationships and identity, source systems own attributes. This limits initial scope, preserves existing system investments, and delivers agent-ready topology traversal without a full consolidation program.

Network Topology as a Graph — Conceptual Model

The critical insight is that every physical and logical relationship in your network maps cleanly to graph primitives. Nodes represent network entities; edges represent relationships with properties describing the nature of that relationship — capacity, state, protocol, neighbor priority, and more.

Entity / Relationship | Graph Type | Key Properties | Source System
NODE Site / Tower | Site | siteId, address, lat, lon, powerFeed, backhaul type | Network Inventory / PPM
NODE gNodeB / eNodeB | RadioNode | nodeId, vendor, swVersion, lat, lon, site, powerClass | RAN OSS / Network Inventory
NODE Cell / Sector | Cell/Carrier | cellId, PCI, TAC, earfcn/nrArfcn, txPower, bandClass, dlBW, ulBW, state, MIMO Layers | RAN OSS
NODE Spectrum Carriers | Carrier Aggregation | allowedBandClass, dlBW, ulBW, scs, MIMO Layers | RAN OSS
NODE Transport Segment | TransportSegment | segmentId, type (fiber/MW/MPLS), capacity, SLA, latency | Transport NMS
NODE Transport Node | TransportNode | nodeId, role (CE/PE/P), vendor, swVersion, protectedBy | Transport NMS / IP OSS / Network Inventory
NODE 5G CNF | CoreNF | nfType (AMF/SMF/UPF), nfId, plmn, slice, capacity, state | Core NMS / 3GPP O&M
NODE Network Slice | Slice | sst, sd, dnn, eMBB/URLLC/mMTC, SLA, subscriberProvision | RAN / NSMF / Core NMS
REL NEIGHBOR_OF | Cell → Cell | nrRelType, freq (intra/inter), weight | RAN OSS
REL CARRIED_BY | Cell → TransportSegment | interface, vlan, bandwidth, protectionPath | Transport NMS / Network Inventory
REL CONNECTED_TO | gNodeB → CoreNF (AMF) | ngInterface, ngState, sctp | Core NMS / RAN OSS
REL SERVES_SLICE | Cell → Slice | nssai, admittedUE, configuredCapacity | Core NMS / RAN OSS
REL BACKHAULED_VIA | Site → TransportNode | linkType, protection, latencyMsec, bwMbps | Transport NMS(s)
REL HAS_CARRIER | Cell → Cell (Carrier) | isPrimary, aggregationType (CA/DC), scgOrMcg | RAN OSS
REL COLO_WITH | Cell → Cell | sameAntenna, sameRRH, coTower | Network Inventory
REL ANCHORED_BY | NRCell → LTECell | enDCCapable, psCell, scgBearers | RAN OSS

This schema naturally accommodates multi-band environments: a single site has multiple cells per sector (one per band), each with their own carrier nodes, all connected to the same transport segment and the same core network nodes — creating a rich, queryable topology that is impossible to represent in a star-schema relational model without dozens of joins.

Worked Example · Use Case 01
Graph Layer Interactions on a Physical Network View
The property graph is an intelligence overlay, not a replacement for physical inventory. Below, a multi-band three-site cluster is drawn as physical infrastructure — cell sites on a shared metro fibre, a transport node, and the 5G core — with its property-graph representation floating above and tie-lines mapping every graph node to the physical element it represents. The highlighted traversal is the cascading-alarm RCA of Use Case 01: six cell alarms are walked back along CARRIED_BY edges to a single shared transport segment as the root cause.
[Figure: physical network (cell sites, metro backhaul, 5G core) beneath the graph intelligence layer (property-graph nodes and relationships), with tie-lines mapping each graph node to its physical element and the UC-01 RCA traversal highlighted. SITE-A, SITE-B and SITE-C each host a gNodeB with an n77 cell and a B13 cell, linked by NEIGHBOR_OF and HAS_CARRIER edges, and each site has a CSR on the metro fibre. All six CRITICAL cells resolve over CARRIED_BY to the single TransportSegment SEG-METRO-7 (fibre degradation), which reaches TransportNode TN-7 (metro PE) and the UPF (CoreNF) in the 5G core DC over the N3 user-plane path.]
  1. Alarm storm detected: 6 cells CRITICAL, 3 sites, in under 5 min
  2. Traverse the graph: agent walks CARRIED_BY from each cell
  3. Root cause isolated: SEG-METRO-7, common to all 6 cells
  4. Closed-loop action: suppress 6 cell alarms, escalate 1 fault

Integration Architecture — From Source Systems to Graph

The architecture positions the GraphDB as a topology intelligence plane — not a replacement for existing OSS/NMS systems, but an integration layer that creates a unified, traversable model of your network. All writes are driven by topology change events; the graph maintains a live representation of network relationships enriched with the minimum node properties required for traversal.

For source systems with mature, well-documented APIs — Transport NMS, Core NMS, Network Inventory — a federated access pattern is preferable to full ingestion. The graph holds only node identity keys and relationship structure. When an agent completes a traversal and needs full attribute context, it calls the owning system directly using the identity key returned by the graph. This keeps the graph lean, avoids dual-maintenance of authoritative data, and means agents interact with source systems as a natural part of their reasoning workflow.

Integration Architecture
From Source Systems to Graph — the Topology Intelligence Plane
Topology change events flow up the pipeline from RAN, Core, and Transport source systems into the Neo4j graph store; agents and operational tooling query back down through Cypher. Source systems remain authoritative — the federated return path lets agents resolve full attribute detail by identity key without duplicating inventory into the graph.
[Figure: five-layer pipeline from source systems to graph consumers, with a federated attribute-fetch return path.]
  • SOURCE: RAN OSS (Ericsson · Nokia · Samsung); Core NMS (5GC · EPC O&M); Transport NMS (IP/MPLS · MW · Fibre); Network Inventory (sites · physical plant). Interfaces: NETCONF/YANG · gNMI · RESTCONF · 3GPP O1 · SNMP
  • INGEST: Topology Change Event Stream (change-data-capture from OSS / NMS); Config Sync Polling (gNMI subscribe · bulk CM / neighbour export). Output: Apache Kafka topology-events topic (topology changes only)
  • PROCESS: Topology Normalization (vendor model → canonical schema); Schema Mapping Layer (identity-key resolution · federation tag); Conflict Resolution (truth-source priority · idempotent merge). Output: validated graph mutations, batch + streaming (Apache Flink)
  • GRAPH STORE: Neo4j (primary topology GraphDB · GDS); Topology Change Log (soft-delete history · RCA replay); Graph Schema Registry (node / relationship property contracts). Query interface: Cypher · openCypher · Gremlin for agent and tool queries
  • CONSUME: RCA Agent (root-cause isolation); Correlation Agent (alarm de-duplication); SLA Impact Agent (slice exposure); NOC Dashboard (operator topology view); Capacity Planning (growth modelling)
  • FEDERATED ATTRIBUTE FETCH: after a traversal, agents resolve full attribute detail from the owning source system by identity key; the graph stays lean
Key Integration Protocols
  • gNMI/gRPC — streaming telemetry for real-time state updates (link state, cell state)
  • NETCONF/YANG — configuration retrieval for topology bootstrap and change events
  • RESTCONF — transport node configuration and topology APIs
  • SNMP traps → Kafka — legacy transport equipment state changes
  • Vendor OSS APIs — bulk neighbor table export, CM file parsing
  • 3GPP O1 — RAN element management, neighbor sync
Critical Design Decisions
  • Topology truth source priority — OSS wins over NMS wins over inferred
  • Change detection must be idempotent — duplicate events cannot corrupt graph
  • Soft-delete for removed neighbors — retain historical adjacency for RCA replay
  • Classify each node type: graph-resident vs. federated (identity key only)
  • Traversal-critical properties (band, opState, role, alarmState) replicated into graph; CM detail and SLA remain in source systems
  • Time-stamped relationship properties for handover metric trending
  • Anchor cell (LTE) identified explicitly for EN-DC configurations
  • Agent tool library must include both graph query tools and source system API tools
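Three of the decisions above (truth-source priority, idempotent change handling, and soft-delete) can be sketched together in a few lines of Python. This is an illustration under stated assumptions, not the pipeline's actual merge logic: the event shape (`src`, `rel`, `dst`, `origin`, `ts`, `op`), the `PRIORITY` ranking values, and `apply_event` are all hypothetical.

```python
# Assumed ranking: OSS wins over NMS wins over inferred.
PRIORITY = {"oss": 3, "nms": 2, "inferred": 1}


def apply_event(edges: dict, event: dict) -> None:
    """Apply one topology event to an edge store keyed by (src, rel, dst)."""
    key = (event["src"], event["rel"], event["dst"])
    current = edges.get(key)
    if current is not None:
        new_rank = PRIORITY[event["origin"]]
        old_rank = PRIORITY[current["origin"]]
        if new_rank < old_rank:
            return  # a lower-priority source never overrides a higher one
        if new_rank == old_rank and event["ts"] <= current["ts"]:
            return  # duplicate or stale event: replaying it changes nothing
    edges[key] = {
        "origin": event["origin"],
        "ts": event["ts"],
        "deleted": event["op"] == "delete",  # soft-delete keeps the edge for replay
    }


edges = {}
upsert = {"src": "CELL-1", "rel": "NEIGHBOR_OF", "dst": "CELL-2",
          "origin": "oss", "ts": 10, "op": "upsert"}
apply_event(edges, upsert)
apply_event(edges, upsert)  # duplicate delivery, ignored
apply_event(edges, {**upsert, "origin": "inferred", "ts": 20, "op": "delete"})  # ignored
apply_event(edges, {**upsert, "ts": 30, "op": "delete"})  # soft-delete from the OSS
```

After the last call the edge is still present but flagged `deleted`, which is what allows an RCA replay to see the adjacency as it existed before removal.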

Agentic RCA Use Cases — Graph-Powered Intelligence

The following use cases illustrate how an AI agent with access to the GraphDB fundamentally changes the RCA workflow. Each includes the graph traversal pattern that powers it — impossible to replicate efficiently in a non-graph data store.

01
Cascading Alarm Root Cause Isolation
TRIGGER: Alarm storm >20 cells in <5 min · DOMAIN: RAN + Transport
Agent Reasoning Pattern

When 20+ cell alarms fire simultaneously, the agent queries the graph to find the minimal common ancestor — the upstream transport node or segment shared by the largest subset of alarming cells. If a single transport segment is the common backhaul path for 80% of alarming cells, that segment is the probable root cause, not the cells themselves. The agent suppresses downstream alarms, escalates the transport fault, and auto-populates the incident ticket with the full impact radius.

Cypher Query Pattern
// Find common transport ancestor for alarming cells
MATCH (c:Cell)-[:CARRIED_BY]->(seg:TransportSegment)
WHERE c.alarmState = 'CRITICAL'
  AND c.alarmTimestamp > $windowStart
WITH seg, collect(c) AS affectedCells, count(c) AS cellCount
RETURN seg.segmentId, seg.type, cellCount, affectedCells
ORDER BY cellCount DESC
LIMIT 5
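The agent-side reasoning on top of that query result can be sketched in Python. This is an in-memory illustration only: `rank_common_segments`, the `carried_by` mapping, and the cell and segment identifiers are hypothetical, and the 80% share is taken from the reasoning pattern above as a default threshold.

```python
from collections import defaultdict


def rank_common_segments(carried_by: dict, alarming: set, min_share: float = 0.8) -> list:
    """Rank transport segments by the share of alarming cells they carry."""
    cells_by_segment = defaultdict(set)
    for cell in alarming:
        for seg in carried_by.get(cell, ()):
            cells_by_segment[seg].add(cell)
    ranked = sorted(cells_by_segment.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [
        {
            "segment": seg,
            "cells": sorted(cells),
            "share": len(cells) / len(alarming),
            "probable_root_cause": len(cells) / len(alarming) >= min_share,
        }
        for seg, cells in ranked
    ]


# Illustrative data: six cells on one metro segment plus one unrelated alarm.
carried_by = {f"{site}-{band}": ["SEG-METRO-7"]
              for site in ("A", "B", "C") for band in ("n77", "B13")}
carried_by["D-n77"] = ["SEG-RURAL-2"]
result = rank_common_segments(carried_by, set(carried_by))
```

Here SEG-METRO-7 carries six of the seven alarming cells, so it clears the 80% threshold and is flagged as the probable root cause, while SEG-RURAL-2 is not.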
02
Handover Failure Chain Analysis
TRIGGER: HO Success Rate < threshold on target cell · DOMAIN: RAN neighbor mesh
Agent Reasoning Pattern

The agent identifies cells with degraded HO success rate, then traverses the neighbor graph to determine whether the failure is confined to a single source-target pair (PCI confusion, coverage gap) or is systemic across all neighbors of the target cell. It also checks whether the target and source share the same transport backhaul — a transport fault can masquerade as a handover failure. Multi-band environments require checking both intra-frequency and inter-frequency neighbor relationships separately.

Cypher Query Pattern
// Find all source cells for a failing target cell
MATCH (src:Cell)-[r:NEIGHBOR_OF]->(tgt:Cell)
WHERE tgt.cellId = $targetCellId
  AND r.hoSuccessRate < $threshold
OPTIONAL MATCH (src)-[:CARRIED_BY]->(seg:TransportSegment)<-[:CARRIED_BY]-(tgt)
RETURN src.cellId, src.band, r.hoSuccessRate, r.handoverCount,
       seg.segmentId AS sharedTransport, src.alarmState
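How an agent might classify the rows returned for one target cell can be sketched as follows. The function name, the row shape, the category labels, and the 50% `systemic_share` cut-off are all assumptions made for illustration; the source does not define them.

```python
def classify_ho_failure(relations: list, threshold: float, systemic_share: float = 0.5) -> str:
    """Classify handover degradation toward one target cell.

    relations: one dict per inbound NEIGHBOR_OF edge, with 'ho_success_rate'
    and an optional 'shared_segment' (segment shared by source and target).
    """
    failing = [r for r in relations if r["ho_success_rate"] < threshold]
    if not failing:
        return "healthy"
    if all(r.get("shared_segment") for r in failing):
        return "transport-suspect"  # every failing pair shares backhaul with the target
    if len(failing) == 1:
        return "isolated-pair"      # e.g. PCI confusion or a coverage gap
    if len(failing) / len(relations) >= systemic_share:
        return "systemic"
    return "partial"


rows = [
    {"src": "CELL-1", "ho_success_rate": 0.91, "shared_segment": "SEG-9"},
    {"src": "CELL-2", "ho_success_rate": 0.99, "shared_segment": None},
    {"src": "CELL-3", "ho_success_rate": 0.98, "shared_segment": None},
]
verdict = classify_ho_failure(rows, threshold=0.95)
```

Checking the shared-transport case first reflects the point made above: a transport fault can masquerade as a handover failure, so it is worth ruling out before tuning neighbor parameters.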
03
Multi-Band Coverage Hole Detection
TRIGGER: Drive test / MDT anomaly · DOMAIN: Multi-spectrum RAN topology
Agent Reasoning Pattern

In a multi-band environment (n77 for capacity + n41 for peak + B13 for coverage), a coverage hole may appear when the mid-band anchor cell is degraded and fallback to coverage-layer cells fails because the inter-frequency neighbor relationship is misconfigured or missing. The agent traverses the band-layered neighbor graph to identify whether a geographic area's cells have proper inter-frequency and inter-RAT neighbor relationships, flagging gaps where coverage continuity depends on missing or improperly weighted neighbors.

Cypher Query Pattern
// Find co-located cells missing inter-freq neighbors
MATCH (c1:Cell)-[:COLO_WITH]->(c2:Cell)
WHERE c1.band = 'n77' AND c2.band = 'B13'
  AND NOT (c1)-[:NEIGHBOR_OF {nrRelType:'interFreq'}]->(c2)
RETURN c1.cellId AS midBandCell, c2.cellId AS coverageCell,
       c1.sector, c1.site
ORDER BY c1.site
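The same gap check can be expressed over plain Python collections, which is useful for unit-testing the logic before it runs against the graph. The function and data names are hypothetical; the sketch also assumes COLO_WITH may be stored in either direction, so it checks both orderings of each pair.

```python
def missing_interlayer_neighbors(colo_pairs, neighbor_edges, cell_band,
                                 mid_band="n77", coverage_band="B13"):
    """Return (midBandCell, coverageCell) pairs that are co-located but not neighbors."""
    gaps = set()
    for a, b in colo_pairs:
        for c1, c2 in ((a, b), (b, a)):  # co-location is symmetric
            if (cell_band[c1] == mid_band and cell_band[c2] == coverage_band
                    and (c1, c2) not in neighbor_edges):
                gaps.add((c1, c2))
    return sorted(gaps)


# Illustrative data only.
cell_band = {"A-n77": "n77", "A-B13": "B13", "B-n77": "n77", "B-B13": "B13"}
colo_pairs = [("A-n77", "A-B13"), ("B-B13", "B-n77")]
neighbor_edges = {("A-n77", "A-B13")}  # SITE-B lacks the fallback relation
gaps = missing_interlayer_neighbors(colo_pairs, neighbor_edges, cell_band)
```

In this example only the SITE-B pair is reported, because its mid-band cell has no neighbor relation toward the co-located coverage cell.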
04
5G Slice SLA Impact Assessment
TRIGGER: UPF degradation event · DOMAIN: Core + RAN + Transport
Agent Reasoning Pattern

When a UPF instance degrades, the agent traverses the slice graph to determine which network slices are served by that UPF, then traces which cells are configured to offer those slices, then determines the subscriber count and enterprise SLA commitments at risk. The cross-domain traversal (Core NF → Slice → Cell → Site → Subscriber segment) takes under 1 second in the graph but would require 4–5 system queries and manual correlation without it. The output drives automated SLA notification and remediation prioritization.

Cypher Query Pattern
// Trace SLA impact from UPF fault
// Note: the ANCHORED_TO_UPF relationship (Slice → CoreNF) is not listed in the schema table
MATCH (upf:CoreNF {nfId: $upfId})<-[:ANCHORED_TO_UPF]-(sl:Slice)
      <-[:SERVES_SLICE]-(c:Cell)-[:COLO_WITH*0..1]-(co:Cell)
RETURN sl.sst, sl.sd, sl.dnn, sl.slaClass, sl.subscriberCount,
       collect(DISTINCT c.site) AS impactedSites,
       count(DISTINCT c) AS impactedCells
ORDER BY sl.slaClass
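The cross-domain walk (Core NF → Slice → Cell → Site) reduces to a chain of lookups once the relationships are available. The Python sketch below uses hypothetical dictionaries in place of graph edges, and the slice identifiers are invented for illustration.

```python
def slice_impact(upf_id, slices_by_upf, cells_by_slice, site_of_cell):
    """Walk UPF -> Slice -> Cell -> Site and summarise exposure per slice."""
    report = []
    for slice_id in sorted(slices_by_upf.get(upf_id, ())):
        cells = cells_by_slice.get(slice_id, set())
        report.append({
            "slice": slice_id,
            "impactedCells": len(cells),
            "impactedSites": sorted({site_of_cell[c] for c in cells}),
        })
    return report


# Illustrative topology, not real network data.
slices_by_upf = {"UPF-1": {"1-000001", "2-0000A1"}}
cells_by_slice = {"1-000001": {"A-n77", "B-n77"}, "2-0000A1": {"A-n77"}}
site_of_cell = {"A-n77": "SITE-A", "B-n77": "SITE-B"}
impact = slice_impact("UPF-1", slices_by_upf, cells_by_slice, site_of_cell)
```

Subscriber counts and SLA commitments would then be fetched from the owning systems by slice identity key, in line with the federated model.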
05
Single Point of Failure Risk Identification
TRIGGER: Proactive / Scheduled · DOMAIN: Transport topology
Agent Reasoning Pattern

Without a graph, identifying transport single points of failure requires a network architect to manually trace topologies. The agent periodically traverses the transport graph to find segments or nodes that, if removed, would disconnect the largest number of RAN sites from core. It weights by subscriber population and SLA tier to rank remediation priority. This proactive use case — converting graph centrality analysis into a risk report — is a zero-human-effort operation once the graph is live, turning topology intelligence into a continuous reliability program.

Cypher Query Pattern
// Find critical transport nodes by site dependency
// coalesce() treats a missing isProtected property as unprotected,
// so nodes without the flag are not silently dropped from the report
MATCH (site:Site)-[:BACKHAULED_VIA]->(tn:TransportNode)
WITH tn, collect(site) AS dependentSites, count(site) AS siteCount
WHERE siteCount > $spofThreshold
  AND NOT coalesce(tn.isProtected, false)
RETURN tn.nodeId, tn.role, siteCount,
       [s IN dependentSites | s.siteId] AS exposedSites
ORDER BY siteCount DESC
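The Cypher pattern counts direct site dependencies. The stricter test described in the reasoning pattern, which node would disconnect the most sites from core if removed, can be sketched as a brute-force reachability check in Python. This is an illustration on a toy undirected adjacency map with hypothetical node names; at production scale a graph-algorithm library would be the more likely choice.

```python
from collections import deque


def reachable(adj: dict, start: str, removed: str) -> set:
    """Breadth-first search from start, pretending one node has failed."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def rank_spof(adj: dict, sites: set, core: str) -> list:
    """Rank transport nodes by the sites cut off from core when each one fails."""
    impact = {}
    for node in set(adj) - set(sites) - {core}:
        connected = reachable(adj, core, removed=node)
        lost = sorted(s for s in sites if s not in connected)
        if lost:
            impact[node] = lost
    return sorted(impact.items(), key=lambda kv: (-len(kv[1]), kv[0]))


# Toy topology: AGG1 is dual-homed to PE1 and PE2, AGG2 hangs off PE1 only.
adj = {
    "CORE": {"PE1", "PE2"},
    "PE1": {"CORE", "AGG1", "AGG2"},
    "PE2": {"CORE", "AGG1"},
    "AGG1": {"PE1", "PE2", "SITE-A", "SITE-B"},
    "AGG2": {"PE1", "SITE-C"},
    "SITE-A": {"AGG1"},
    "SITE-B": {"AGG1"},
    "SITE-C": {"AGG2"},
}
spof = rank_spof(adj, {"SITE-A", "SITE-B", "SITE-C"}, "CORE")
```

PE2 does not appear in the result because AGG1 stays reachable through PE1, which is the kind of protection a direct-dependency count alone cannot see.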

Technology Stack

Graph Database
Neo4j Enterprise

Preferred for Cypher expressiveness, APOC procedures, and GDS (Graph Data Science) library — critical for centrality, community detection, and path analysis used in RCA agents.

Streaming Backbone
Apache Kafka

Topology Events Topic with schema validation via Confluent Schema Registry. Separate topics for RAN, Core, Transport topology changes. Change Data Capture (CDC) pattern for OSS/NMS integration.

Stream Processing
Apache Flink

Stateful stream processing for topology normalization, deduplication, and graph mutation generation. Flink preferred for windowed join logic across domain events. Kafka Streams for simpler topologies.

RAN Integration
gNMI / NETCONF

gNMI streaming for real-time state (cell state, link state). NETCONF/YANG for topology bootstrap. Vendor adapters required for Ericsson ENM/ENIQ, Nokia NetAct, Samsung SON.

Agent Framework
LangGraph / Custom

Graph-aware RCA agents with Neo4j tool calls as agent capabilities. Each agent exposes Cypher query templates as tools. LLM layer (Claude / GPT-5) for natural language alarm narrative generation and recommendation synthesis.
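One way to expose Cypher templates as agent tools is a small registry that validates parameters before anything reaches the database. The sketch below is framework-agnostic and deliberately stops short of executing the query; `CypherTool` and `bind` are hypothetical names, not LangGraph or Neo4j driver APIs.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CypherTool:
    name: str
    description: str
    query: str  # parameterised Cypher; values are never interpolated into the text

    @property
    def required_params(self) -> set:
        return set(re.findall(r"\$(\w+)", self.query))

    def bind(self, **params) -> tuple:
        """Validate parameters and return (query, params) for a driver to execute."""
        missing = self.required_params - set(params)
        if missing:
            raise ValueError(f"missing parameters: {sorted(missing)}")
        unknown = set(params) - self.required_params
        if unknown:
            raise ValueError(f"unknown parameters: {sorted(unknown)}")
        return self.query, dict(params)


common_ancestor = CypherTool(
    name="common_transport_ancestor",
    description="Rank transport segments shared by CRITICAL cells since windowStart.",
    query=(
        "MATCH (c:Cell)-[:CARRIED_BY]->(seg:TransportSegment) "
        "WHERE c.alarmState = 'CRITICAL' AND c.alarmTimestamp > $windowStart "
        "RETURN seg.segmentId, count(c) AS cellCount ORDER BY cellCount DESC LIMIT 5"
    ),
)
query, params = common_ancestor.bind(windowStart=1700000000)
```

Keeping the query text fixed and passing values as parameters means the LLM layer chooses a tool and its arguments but never writes free-form Cypher.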

Complementary Stores
Multi-Modal Storage

TSDB (InfluxDB/Prometheus) for KPI time-series linked by nodeId. Elasticsearch for alarm text. Apache Iceberg for historical topology snapshots. GraphDB stores relationships; other stores handle their native data types.

Implementation Roadmap

Phase 1 · M01–M04

Foundation & Graph Bootstrap

  • Deploy Neo4j cluster
  • Define graph schema — nodes, relationships, property contracts
  • Federated vs. consolidated evaluation — assess each source system's API maturity and data quality
  • RAN Physical topology via Planning Tool
  • RAN Element topology bootstrap via NETCONF/YANG bulk export
  • Neighbor table import from Network Element Managers
  • Transport topology import from NMS APIs
  • Core NF topology from 5GC O&M interfaces
  • Kafka topology events pipeline — RAN topology changes
  • Data quality validation framework
  • Basic Cypher query library for Operations Tooling
Phase 2 · M05–M09

Real-Time State & Open-Loop Agents

  • gNMI streaming integration — cell/link state → graph properties
  • FM enrichment — alarm state written to node properties
  • Transport state sync from NMS
  • Flink normalization layer for cross-domain joins
  • Cascading alarm RCA agent (Use Case 01)
  • Handover failure chain agent (Use Case 02)
  • Agent recommendations surfaced in NOC dashboard
  • Human-in-the-loop validation workflow
  • KPI → TSDB linkage by nodeId for enriched context
Phase 3 · M10–M18

Advanced Intelligence & Closed-Loop

  • Multi-band coverage hole agent (Use Case 03)
  • Slice SLA impact agent (Use Case 04)
  • SPOF risk identification — scheduled graph analysis (Use Case 05)
  • Neo4j GDS — centrality and community detection
  • Historical topology snapshots (Iceberg) for RCA replay
  • Trust-gated autonomous remediation (neighbor add/delete)
  • Closed-loop transport rerouting recommendations
  • Agent performance measurement — MTTR delta, false positive rate
Critical Success Factor

Graph data quality is the single largest risk to this program. A neighbor list that is 85% complete produces RCA conclusions that are correct 85% of the time — which erodes NOC trust rapidly. Invest heavily in Phase 1 data validation before building agents on top of it. A graph with known, bounded incompleteness is far more valuable than one with unknown gaps.

Decision Framework Summary

Build If You Have
  • 1M+ cells with complex multi-band neighbor meshes
  • Active agentic AI / automation roadmap for NOC operations
  • Documented cross-domain RCA gaps adding >30 min to MTTR
  • 5G SA or NSA with network slicing SLA commitments
  • Engineering capacity for graph data pipeline development
  • Willingness to invest 6–9 months before agent value is realized
  • Executive sponsorship for a multi-year network intelligence program
Consider Alternatives If
  • Network is small or topology is relatively static and simple
  • No agentic AI roadmap in the next 18 months
  • Source system data quality is poor or API access is limited
  • No engineering resources with graph/data pipeline experience
  • Current NMS provides adequate cross-domain visibility
  • Budget constraints require single-system consolidation first
  • Simpler enrichment layer on Elasticsearch can close the gap short-term
Bottom Line

The GraphDB topology plane is not a nice-to-have for large-scale agentic network operations at 1M+ cells — it is the foundational data structure that makes cross-domain RCA traversal computationally tractable. Every agentic use case described above degrades to a slower, less reliable version of itself without it. The investment is justified precisely because mobile network relationships are the data, not metadata — and graphs are the only model that treats them as such.

Looking Ahead — Virtualisation Expands the Graph

Mobile networks are mid-transformation. The RAN is decomposing into RU, DU, and CU functions running on COTS compute under O-RAN and AI-RAN reference designs; the 5G core's VNFs are being re-platformed as containerised network functions (CNFs) orchestrated by Kubernetes and Nephio; and both increasingly share the same multi-tenant edge and regional data-centre fabric. The clean boundary between a network element and a physical platform is disappearing — and the topology graph has to follow.

The same property-graph plane that today stops at cells, transport segments, and core NFs will extend one layer further down — into the compute hosts, accelerators, smart NICs, fabric switches, racks, and power feeds that the virtualised functions sit on top of. RCA traversal then crosses the virtualisation boundary: an SMF processing-latency anomaly walks not just through its SBA peers, but through the Pod, the K8s Node, the host's NUMA domain, the SmartNIC, and out to the rack-level leaf switch that connects them. Slice SLA exposure stops being only about which UPF is degraded and starts being about which rack lost a PSU.

What's Changing in the Network
  • vRAN / Open RAN — RU, DU, CU disaggregated onto COTS x86/ARM compute with FEC and inline acceleration
  • 5G core VNFs/CNFs spanning AMF, SMF, UPF, PCF, UDM, NSSF — each scaled independently as Kubernetes Pods across multiple compute hosts and edge sites
  • Far-edge cloud — DU and edge UPF colocated on the same compute platform at aggregation sites
  • AI-RAN — RAN xApps and inference workloads co-tenanting GPU and DPU pools (AI-RAN Alliance reference designs)
  • Multi-tenant infrastructure — enterprise slices, MVNOs, and internal tenants sharing the underlying fabric
  • Standards velocity — 3GPP Rel-18 → Rel-20, O-RAN ALLIANCE quarterly releases, ETSI NFV → CNF transition, TM Forum AN L4+ targets
What's Changing in the Graph
  • New node types model the platform layer — NFInstance, Pod, K8sNode, Hypervisor, ComputeHost, SmartNIC, Accelerator, FabricSwitch, Rack, DCSite, PowerFeed
  • New edges traverse the virtualisation boundary — HOSTED_BY, RUNS_IN, ON_HOST, ON_ACCEL, FABRIC_VIA, RACKED_IN, POWERED_BY
  • Telemetry attaches by identity key — Kubernetes metrics, host counters, switch port stats, DCIM facility data — queryable from the same traversal
  • Federated model still wins — most attributes live in source systems (CNF orchestrator, Kubernetes, DCIM, Redfish, OpenConfig); the graph holds identity and relationships
  • Schema aligns with O-RAN SMO topology, 3GPP NRM Rel-18, ETSI NFV-IFA inventory, and DCIM / Redfish for the facility layer
  • Closed-loop scope widens — autonomous remediation moves from "reroute traffic" to "drain the K8s Node and reschedule the Pod"

The schema extension below sketches the platform layer added on top of the cell, transport, and core nodes already defined in the conceptual model above. Identity keys remain authoritative in the source systems; the graph stores them alongside the relationships needed for cross-layer traversal.

Entity / Relationship | Graph Type | Key Properties | Source System
NODE NF Instance | NFInstance | nfId, version, sliceMembership, replicaId, region | CNF Orchestrator / SMO
NODE Pod | Pod | podId, namespace, restartCount, qosClass | Kubernetes API
NODE K8s Node | K8sNode | nodeId, kernel, cpuModel, gpuPresent, taints | Kubernetes API
NODE Compute Host | ComputeHost | hostId, chassisSerial, biosVer, numaTopology | DCIM / Redfish
NODE Accelerator | Accelerator | acceleratorId, type (FEC/GPU/DPU), driverVer, partitionable | CNF Orchestrator / DCIM
NODE Fabric Switch | FabricSwitch | switchId, role (leaf/spine), portCount, fabricSegment | DC NMS / OpenConfig
NODE Rack | Rack | rackId, location, designKw, coolingZone | DCIM
NODE DC Site | DCSite | siteId, region, tier, latencyClass | DCIM / Network Inventory
REL HOSTED_BY | NFInstance → Pod | replicaId, started, podPhase | CNF Orchestrator
REL RUNS_IN | Pod → K8sNode | resourceLimits, nodeSelector, qosClass | Kubernetes API
REL ON_HOST | K8sNode → ComputeHost | hypervisor, virtType, vCPU/pCPU map | Orchestrator / DCIM
REL ON_ACCEL | Pod → Accelerator | partitionId, MIG/SR-IOV profile, share | CNF Orchestrator
REL FABRIC_VIA | ComputeHost → FabricSwitch | port, vlan, bandwidth, lagId | DC NMS / OpenConfig
REL RACKED_IN | ComputeHost → Rack | unitU, orientation | DCIM
REL POWERED_BY | Rack → PowerFeed | breaker, redundancyTier, designKw | DCIM

These extensions unlock a class of cross-layer traversals that today require manual correlation across orchestrator dashboards, monitoring stacks, and DCIM tools. A few examples worth scoping early in the program roadmap:

Use Case 06
Cross-Layer NF Placement RCA

An SMF instance shows elevated processing latency. The agent traverses HOSTED_BY → RUNS_IN → ON_HOST → FABRIC_VIA and isolates whether the cause lives in the SBA peer, the Pod, the K8s Node, the host's NUMA pinning, or leaf-switch port congestion — in one query rather than four tools.

Use Case 07
Slice SLA Blast Radius Through the Fabric

A leaf-switch port fails. The agent traverses FABRIC_VIA back to every ComputeHost dependent on that port, then forward through HOSTED_BY → SERVES_SLICE to enumerate the slices and subscriber populations at risk. Notification precedes SLA breach instead of trailing it.

Use Case 08
Slice-Level Energy Attribution

Sustainability KPIs become a graph query. NFInstance → ON_HOST → RACKED_IN → POWERED_BY, with Pod resource share apportioning host power draw, yields per-slice kWh. Scope 2 reporting moves from spreadsheet inference to traversable evidence.
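The apportioning step can be made concrete with a small worked example. The figures, names, and the proportional-share rule are assumptions for illustration: each Pod is charged its resource share of the measured energy of the host it runs on, and the result is summed per slice.

```python
def slice_energy_kwh(host_kwh: dict, pod_placement: dict, pod_slice: dict) -> dict:
    """Attribute host energy to slices in proportion to each Pod's resource share.

    host_kwh:      {host: measured kWh over the reporting window}
    pod_placement: {pod: (host, share_of_host)} with share in the range 0..1
    pod_slice:     {pod: slice identifier}
    """
    totals = {}
    for pod, (host, share) in pod_placement.items():
        slice_id = pod_slice[pod]
        totals[slice_id] = totals.get(slice_id, 0.0) + host_kwh[host] * share
    return totals


# Illustrative numbers only.
host_kwh = {"HOST-1": 12.0, "HOST-2": 8.0}
pod_placement = {"upf-a": ("HOST-1", 0.5), "smf-a": ("HOST-1", 0.25), "upf-b": ("HOST-2", 0.5)}
pod_slice = {"upf-a": "eMBB", "smf-a": "URLLC", "upf-b": "eMBB"}
per_slice = slice_energy_kwh(host_kwh, pod_placement, pod_slice)
```

The eMBB slice is charged 12.0 × 0.5 + 8.0 × 0.5 = 10.0 kWh and URLLC 12.0 × 0.25 = 3.0 kWh; the remaining 7.0 kWh of idle or unallocated host energy is left unattributed in this sketch, and how to distribute it is a policy decision.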

Use Case 09
AI-RAN Noisy-Neighbour Detection

RAN xApps and slice-bound inference workloads share GPU partitions. ON_ACCEL traversal surfaces co-tenancy as a queryable property; contention symptoms attributed to RAN performance can be traced to specific Accelerator partitions and rescheduled by the orchestrator.

Use Case 10
Compute-Layer SPOF Discovery

Extending GDS centrality from transport into the platform graph surfaces single hosts and single fabric links carrying disproportionate Pod load. The remediation is a placement decision in the orchestrator — not a procurement request — and the program returns value before new hardware is purchased.

Use Case 11
Closed-Loop Pod Reschedule

The closed-loop scope expands. When the graph isolates a degraded host as the root cause, the agent's remediation is no longer just "reroute traffic" — it is "drain the K8s Node, evict the affected Pod, reschedule onto a healthier host," coordinated through the CNF orchestrator API.

Strategic Note — Plan the Schema, Stage the Implementation

For most operators this is not a Phase 1 conversation. The organisational boundary between RAN/Core engineering and DC/platform engineering is sharp, and the operational habit of separating network and platform inventory runs deep. But the schema for the extended graph should be sketched in Phase 1 — even if only the cell, transport, and core layer is populated initially — so the property contracts, identity-key strategy, and federated-access paths to Kubernetes, DCIM, and Redfish do not have to be retrofitted when the platform layer comes online. The federated model is what makes this affordable: the graph remains a thin relationship plane and platform-attribute detail stays in the systems that already own it.

Looking Ahead

The cascading-alarm RCA walked in the Use Case 01 worked example, six cell alarms resolving to a shared transport segment, is the simplest case. The same traversal pattern, applied to a graph that includes the compute, accelerator, and facility layer, becomes the operational foundation for fully autonomous network operations across the disaggregated, virtualised network. The investment in the topology plane today is the investment in tomorrow's TM Forum AN L4+ posture.