Commit a417bb6

Merge pull request #5 from OpenCHAMI/update-readme
Update readme
2 parents 5fb6c20 + 0939b12 commit a417bb6

1 file changed

Lines changed: 112 additions & 0 deletions

README.md
@@ -10,6 +10,118 @@ Unlike a simple CRUD API, this service is designed to be populated by a collecto
4. The reconciler performs a "get-or-create" for each `Device` in the payload, using the **Redfish URI** as the unique key (to handle components without serial numbers).
5. A two-pass system ensures that after all devices are created, parent/child relationships are linked by resolving the `parentSerialNumber` (from the collector) to the `parentID` (the parent's UUID in the database).

## What Can You Do To Work With This Today?

You can run the service locally and simulate a hardware discovery event to see the event-driven reconciliation in action. This requires neither real hardware nor prior knowledge of the system.

### 1. Start the Server
Open a terminal and start the API server:

```bash
go mod tidy
go run ./cmd/server serve
```

*What is happening:* The server initializes the local file database, starts the internal event bus, and spins up the background reconciliation workers. It is now listening for requests on `http://localhost:8080`.

### 2. Simulate a Hardware Discovery
Open a **second** terminal. Create a file named `upload_request.json` containing a mock payload with a Node and a DIMM:

```bash
cat << 'EOF' > upload_request.json
{
  "apiVersion": "example.fabrica.dev/v1",
  "kind": "DiscoverySnapshot",
  "metadata": {
    "name": "manual-snapshot-01"
  },
  "spec": {
    "rawData": [
      {
        "deviceType": "Node",
        "serialNumber": "NODE12345",
        "manufacturer": "Intel",
        "properties": {
          "redfish_uri": "/Systems/NODE12345"
        }
      },
      {
        "deviceType": "DIMM",
        "partNumber": "16GB-DDR4",
        "serialNumber": "DIMM67890",
        "parentSerialNumber": "NODE12345",
        "properties": {
          "redfish_uri": "/Systems/NODE12345/Memory/1"
        }
      }
    ]
  }
}
EOF
```

Post this payload to the server:

```bash
curl -X POST http://localhost:8080/discoverysnapshots \
  -H "Content-Type: application/json" \
  -d @upload_request.json
```

*What is happening:* You are acting as the collector. The server accepts the snapshot and publishes a `fru-tracker.resource.discoverysnapshot.created` event. The reconciler picks up this event and processes the payload in the background, creating the two devices and linking the DIMM to the Node.

### 3. Verify the Results
Retrieve the parsed devices from the API to see the results of the reconciliation:

```bash
curl -s http://localhost:8080/devices
```

*What is happening:* The output shows two distinct `Device` resources. In the `spec` of the DIMM, the `parentID` field has been automatically populated with the UUID of the Node, which confirms that the two-pass reconciler ran successfully.

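To check the parent link programmatically instead of by eye, a small Go helper along these lines could be used. This is only a sketch: it assumes that `GET /devices` returns a plain JSON array and that each item exposes `metadata.uid` next to a `spec` holding `serialNumber` and `parentID`. Only `spec.parentID` is confirmed by this README; the other paths, the `parentSerials` helper, and the `uid-node`/`uid-dimm` identifiers are hypothetical and may need adjusting to the real response.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// device mirrors only the fields this check needs. The JSON paths used here
// (metadata.uid, spec.serialNumber, spec.parentID) are assumptions.
type device struct {
	Metadata struct {
		UID string `json:"uid"`
	} `json:"metadata"`
	Spec struct {
		SerialNumber string `json:"serialNumber"`
		ParentID     string `json:"parentID"`
	} `json:"spec"`
}

// parentSerials decodes a device list and maps each child's serial number to
// the serial number of the device that its parentID points at.
func parentSerials(body []byte) (map[string]string, error) {
	var devices []device
	if err := json.Unmarshal(body, &devices); err != nil {
		return nil, fmt.Errorf("decode devices: %w", err)
	}
	serialByUID := make(map[string]string, len(devices))
	for _, d := range devices {
		serialByUID[d.Metadata.UID] = d.Spec.SerialNumber
	}
	links := make(map[string]string)
	for _, d := range devices {
		if d.Spec.ParentID == "" {
			continue // top-level device, nothing to resolve
		}
		if parent, ok := serialByUID[d.Spec.ParentID]; ok {
			links[d.Spec.SerialNumber] = parent
		}
	}
	return links, nil
}

// sampleDevices is hand-written stand-in data, not captured server output.
const sampleDevices = `[
  {"metadata": {"uid": "uid-node"}, "spec": {"serialNumber": "NODE12345"}},
  {"metadata": {"uid": "uid-dimm"}, "spec": {"serialNumber": "DIMM67890", "parentID": "uid-node"}}
]`

func main() {
	links, err := parentSerials([]byte(sampleDevices))
	if err != nil {
		panic(err)
	}
	for child, parent := range links {
		fmt.Printf("%s is linked to %s\n", child, parent)
	}
}
```

In practice the bytes passed to `parentSerials` would come from the `curl` call above (or an `http.Get`) rather than from the embedded sample.
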
### Intended Use Cases

The primary use case for `fru-tracker` is tracking hardware state changes over time using an event-driven architecture.

Instead of requiring clients to manually compute diffs between raw hardware snapshots, the system provides a workflow for detecting hardware modifications (e.g., a DIMM replacement or CPU swap):

1. **Initial Collection:** A collector pushes an initial `DiscoverySnapshot` containing the baseline hardware state. The reconciler parses this payload and populates the database with individual `Device` resources.
2. **Hardware Modification:** A physical or configuration change occurs on the target machine.
3. **Subsequent Collection:** The collector pushes a new `DiscoverySnapshot` reflecting the current state.
4. **Event-Driven Delta Tracking:** During the reconciliation process, the system identifies differences between the newly observed state and the existing database state. For any modified component, the reconciler updates the corresponding `Device` record and automatically emits a `fru-tracker.resource.device.updated` event over the message bus.
5. **Downstream Consumption:** External services or scripts can subscribe to this event stream to log the delta, trigger inventory alerts, or update external dashboards in real time without needing to parse the raw snapshots.

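The delta-tracking step (step 4) can be pictured with the following Go sketch. It is not the reconciler's actual code: `DeviceSpec`, `Event`, `changedFields`, and `detectUpdates` are hypothetical names, the compared fields are a simplified subset, and the replacement serial `DIMM99999` is invented for the demonstration. Only the event type string is taken from this README.

```go
package main

import (
	"fmt"
	"sort"
)

// updatedEvent is the event type named in the README.
const updatedEvent = "fru-tracker.resource.device.updated"

// DeviceSpec is a simplified, hypothetical stand-in for the stored Device spec.
type DeviceSpec struct {
	SerialNumber string
	PartNumber   string
	Manufacturer string
}

// Event describes one detected change. Its shape is illustrative only.
type Event struct {
	Type       string
	RedfishURI string
	Changed    []string
}

// changedFields lists the spec fields whose values differ.
func changedFields(stored, observed DeviceSpec) []string {
	var changed []string
	if stored.SerialNumber != observed.SerialNumber {
		changed = append(changed, "serialNumber")
	}
	if stored.PartNumber != observed.PartNumber {
		changed = append(changed, "partNumber")
	}
	if stored.Manufacturer != observed.Manufacturer {
		changed = append(changed, "manufacturer")
	}
	return changed
}

// detectUpdates compares observed devices with stored ones, both keyed by
// Redfish URI, and returns one event per known device whose spec changed.
func detectUpdates(stored, observed map[string]DeviceSpec) []Event {
	uris := make([]string, 0, len(observed))
	for uri := range observed {
		uris = append(uris, uri)
	}
	sort.Strings(uris) // deterministic event order

	var events []Event
	for _, uri := range uris {
		old, known := stored[uri]
		if !known {
			continue // a new device is a creation, not an update
		}
		if fields := changedFields(old, observed[uri]); len(fields) > 0 {
			events = append(events, Event{Type: updatedEvent, RedfishURI: uri, Changed: fields})
		}
	}
	return events
}

func main() {
	// Simulate a DIMM replacement: same slot (URI), different serial number.
	stored := map[string]DeviceSpec{
		"/Systems/NODE12345/Memory/1": {SerialNumber: "DIMM67890", PartNumber: "16GB-DDR4"},
	}
	observed := map[string]DeviceSpec{
		"/Systems/NODE12345/Memory/1": {SerialNumber: "DIMM99999", PartNumber: "16GB-DDR4"},
	}
	for _, e := range detectUpdates(stored, observed) {
		fmt.Println(e.Type, e.RedfishURI, e.Changed)
	}
}
```

Keying both maps by Redfish URI matches the get-or-create key described earlier, so a swapped DIMM in the same slot shows up as an update to an existing record rather than as a new device.
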
#### Collector Integration

The `fru-tracker` service is designed to be passive and agnostic to the specific hardware management protocols used in a data center. It expects users to deploy their own collectors tailored to their environment, collecting only information useful to each site.

A custom collector only needs to gather the hardware state, format it as a JSON array of device specifications (the `spec.rawData` field of a `DiscoverySnapshot`), and `POST` it to the `/discoverysnapshots` endpoint.

A reference implementation of a Redfish-based collector is provided in `cmd/collector` to demonstrate this interaction and serve as a starting point for development. Also, see below for a sample payload.

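As a rough illustration of that contract, a minimal custom collector in Go might look like the sketch below. The field names and the `apiVersion`/`kind` values are copied from the sample payload in this README; the `DeviceRecord`, `Snapshot`, `newSnapshot`, and `postSnapshot` names, the snapshot name `manual-snapshot-02`, and the five-second timeout are assumptions made for the example, not part of `cmd/collector`.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// DeviceRecord is one entry of spec.rawData, modeled on the sample payload.
type DeviceRecord struct {
	DeviceType         string            `json:"deviceType"`
	SerialNumber       string            `json:"serialNumber,omitempty"`
	PartNumber         string            `json:"partNumber,omitempty"`
	Manufacturer       string            `json:"manufacturer,omitempty"`
	ParentSerialNumber string            `json:"parentSerialNumber,omitempty"`
	Properties         map[string]string `json:"properties,omitempty"`
}

// Snapshot is the DiscoverySnapshot envelope shown in the sample payload.
type Snapshot struct {
	APIVersion string `json:"apiVersion"`
	Kind       string `json:"kind"`
	Metadata   struct {
		Name string `json:"name"`
	} `json:"metadata"`
	Spec struct {
		RawData []DeviceRecord `json:"rawData"`
	} `json:"spec"`
}

// newSnapshot wraps the collected devices in the expected envelope.
func newSnapshot(name string, devices []DeviceRecord) Snapshot {
	snap := Snapshot{APIVersion: "example.fabrica.dev/v1", Kind: "DiscoverySnapshot"}
	snap.Metadata.Name = name
	snap.Spec.RawData = devices
	return snap
}

// postSnapshot sends the snapshot to <baseURL>/discoverysnapshots and treats
// any non-2xx response as an error.
func postSnapshot(client *http.Client, baseURL string, snap Snapshot) error {
	body, err := json.Marshal(snap)
	if err != nil {
		return fmt.Errorf("encode snapshot: %w", err)
	}
	resp, err := client.Post(baseURL+"/discoverysnapshots", "application/json", bytes.NewReader(body))
	if err != nil {
		return fmt.Errorf("post snapshot: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return nil
}

func main() {
	snap := newSnapshot("manual-snapshot-02", []DeviceRecord{
		{
			DeviceType:   "Node",
			SerialNumber: "NODE12345",
			Properties:   map[string]string{"redfish_uri": "/Systems/NODE12345"},
		},
	})
	client := &http.Client{Timeout: 5 * time.Second}
	if err := postSnapshot(client, "http://localhost:8080", snap); err != nil {
		fmt.Println("post failed:", err)
	}
}
```

How the `[]DeviceRecord` slice gets filled is site-specific; the only fixed part is the envelope and the endpoint.
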
### Current Capabilities

The current implementation has been validated with an end-to-end workflow using the provided Redfish collector and the event-driven reconciliation controller.

* **Redfish Discovery Collector (`cmd/collector`):** Capable of authenticating with a BMC, walking the Redfish `/Systems` tree, and extracting hardware data for Nodes, Processors (CPUs), and Memory (DIMMs). It packages this data into a `DiscoverySnapshot` payload and posts it to the API.
* **Event-Driven Triggering:** The server publishes a `fru-tracker.resource.discoverysnapshot.created` event upon receiving a snapshot, which triggers the background reconciler.
* **Two-Pass Reconciliation:**
    * **Pass 1 (Ingestion):** The reconciler parses the raw JSON payload and performs a get-or-create operation for each device, using the `redfish_uri` from the properties map as the unique key.
    * **Pass 2 (Relationship Linking):** The reconciler evaluates the `parentSerialNumber` provided by the collector, identifies the corresponding parent device in the database, and updates the child device's `parentID` with the parent's UUID.
* **Storage Backend:** Validated using the local file storage backend for persisting resources.

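The two passes can be sketched in Go as follows. This is an in-memory illustration under stated assumptions, not the `DiscoverySnapshotReconciler` itself: `Store`, `RawDevice`, `Device`, and `Reconcile` are hypothetical names, and the counter-based IDs (`dev-1`, `dev-2`) stand in for the UUIDs that the real service assigns.

```go
package main

import "fmt"

// RawDevice is one collector-supplied entry (simplified).
type RawDevice struct {
	SerialNumber       string
	ParentSerialNumber string
	RedfishURI         string
}

// Device is the stored record. ID is a placeholder for the real UUID.
type Device struct {
	ID           string
	SerialNumber string
	RedfishURI   string
	ParentID     string
}

// Store is an in-memory stand-in for the database, keyed by Redfish URI.
type Store struct {
	byURI  map[string]*Device
	nextID int
}

func NewStore() *Store {
	return &Store{byURI: make(map[string]*Device)}
}

// getOrCreate returns the device stored under the raw entry's Redfish URI,
// creating it first if it does not exist yet.
func (s *Store) getOrCreate(raw RawDevice) *Device {
	if d, ok := s.byURI[raw.RedfishURI]; ok {
		return d
	}
	s.nextID++
	d := &Device{
		ID:           fmt.Sprintf("dev-%d", s.nextID),
		SerialNumber: raw.SerialNumber,
		RedfishURI:   raw.RedfishURI,
	}
	s.byURI[raw.RedfishURI] = d
	return d
}

// Reconcile runs both passes over one snapshot's raw devices.
func (s *Store) Reconcile(raw []RawDevice) {
	// Pass 1: make sure every device exists before any linking happens.
	for _, r := range raw {
		s.getOrCreate(r)
	}
	// Index stored devices by serial number so parents can be looked up.
	idBySerial := make(map[string]string, len(s.byURI))
	for _, d := range s.byURI {
		if d.SerialNumber != "" {
			idBySerial[d.SerialNumber] = d.ID
		}
	}
	// Pass 2: resolve parentSerialNumber to the parent's ID.
	for _, r := range raw {
		if r.ParentSerialNumber == "" {
			continue
		}
		if parentID, ok := idBySerial[r.ParentSerialNumber]; ok {
			s.byURI[r.RedfishURI].ParentID = parentID
		}
	}
}

func main() {
	store := NewStore()
	// The child is deliberately listed before its parent.
	store.Reconcile([]RawDevice{
		{SerialNumber: "DIMM67890", ParentSerialNumber: "NODE12345", RedfishURI: "/Systems/NODE12345/Memory/1"},
		{SerialNumber: "NODE12345", RedfishURI: "/Systems/NODE12345"},
	})
	dimm := store.byURI["/Systems/NODE12345/Memory/1"]
	fmt.Println("DIMM parentID:", dimm.ParentID)
}
```

Splitting the work into two passes means the order of entries in the payload does not matter: a child that appears before its parent is still linked, because linking starts only after every device has been created.
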
### Future Work

While the core event-driven ingestion pipeline is functional, several enhancements are planned to make `fru-tracker` production-ready:

* **Production Storage Backend:** Migrate testing and deployment documentation from the local `file` storage backend to a robust relational database (e.g., SMD using Fabrica's `ent` backend option).
* **Hardware Removal Handling:** Enhance the `DiscoverySnapshotReconciler` to detect missing components. If a previously tracked child device is absent from a new snapshot, the reconciler should update the existing `Device` record to mark it as removed, offline, or inactive.
* **Event Delta Consumer:** Build a reference implementation of an event subscriber. This service will listen to the message bus for `fru-tracker.resource.device.updated` and `deleted` events to generate human-readable changelogs and trigger alerts.
* **Collector Enhancements:**
    * Expand the reference Redfish collector to support additional component types (e.g., Drives, PowerSupplies, NetworkAdapters).
    * Implement secure credential management for the collector (replacing hardcoded BMC credentials).
    * Develop examples of non-Redfish collectors (e.g., an OS-level script using `dmidecode` or `lshw`).
* **CI/CD and Release Pipeline:** Implement a formal build and release process (utilizing `Make`, `GoReleaser`, and GitHub Actions) aligned with the OpenCHAMI ecosystem.

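Since hardware removal handling is not implemented yet, the following Go sketch only illustrates one way the detection step could work. Everything in it is hypothetical: the `Tracked` type, the `ParentURI` field (the real model links children through `parentID`), and the rule that a child counts as missing only when its parent is present in the snapshot, which is an assumed safeguard so that a snapshot of one node does not flag devices belonging to another.

```go
package main

import (
	"fmt"
	"sort"
)

// Tracked is a simplified view of a stored device. ParentURI is a
// hypothetical field and is empty for top-level devices.
type Tracked struct {
	RedfishURI string
	ParentURI  string
}

// missingChildren returns the URIs of tracked child devices whose parent
// appears in the snapshot while the child itself does not.
func missingChildren(tracked []Tracked, snapshotURIs []string) []string {
	seen := make(map[string]bool, len(snapshotURIs))
	for _, uri := range snapshotURIs {
		seen[uri] = true
	}
	var missing []string
	for _, t := range tracked {
		if t.ParentURI != "" && seen[t.ParentURI] && !seen[t.RedfishURI] {
			missing = append(missing, t.RedfishURI)
		}
	}
	sort.Strings(missing) // stable output regardless of input order
	return missing
}

func main() {
	tracked := []Tracked{
		{RedfishURI: "/Systems/NODE12345"},
		{RedfishURI: "/Systems/NODE12345/Memory/1", ParentURI: "/Systems/NODE12345"},
		{RedfishURI: "/Systems/NODE12345/Memory/2", ParentURI: "/Systems/NODE12345"},
	}
	// The new snapshot no longer reports the second DIMM.
	snapshot := []string{"/Systems/NODE12345", "/Systems/NODE12345/Memory/1"}
	fmt.Println(missingChildren(tracked, snapshot))
}
```

The reconciler would then mark each returned device as removed, offline, or inactive, as described in the list above.
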
### Device Data Model
All hardware data is stored in the `spec` field, representing the observed state from the last snapshot.
