SST Debug Stories

This repository contains a collection of debug use case examples for SST. These are small, artificial examples of situations that might occur in an SST simulation where a debugger could be used to detect or analyze the behavior. Each is a simple SST model with a small topology. Some examples demonstrate debugger features available today, while others may inspire new debugger features or companion tools.

For each debug story we include a "use case report": a short write-up that explains the scenario, what behavior to observe, and how the SST debugger can be used to investigate it. Many reports also include thoughts and wishlist items for debugger improvements. The stories and links to their reports are in the story status table. For debugger ideas that cut across multiple stories, see the wishlist document, which also catalogs the wishlist items mentioned in individual stories.

Overview

All stories are built around a single SST component named Node (implemented in Node.cpp and Node.h) and are launched from a single SST simulation configuration script, runStory.py, which is passed the name of the story to run. Valid story names are listed in the first column of the story status table.

This repository is still a work in progress. All of the use cases listed below are implemented; our current effort is focused on hand-verifying each case and evaluating how it can be addressed with the SST debugger as it exists today.
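As a rough illustration of the single-script design, the sketch below shows one way a configuration script could map a story name to a topology. It is not taken from runStory.py: the builder functions, the `STORY_BUILDERS` registry, and the topologies assigned to each story are all hypothetical, and plain tuples stand in for the SST components and links that a real configuration script would create.

```python
# Hypothetical sketch: dispatching on a story name inside one config script.
# Nothing here is copied from runStory.py; names and topologies are invented.

def build_chain(num_nodes):
    """Return links for a linear chain: 0-1, 1-2, ..."""
    return [(i, i + 1) for i in range(num_nodes - 1)]

def build_ring(num_nodes):
    """Return links for a chain whose last node connects back to node 0."""
    return build_chain(num_nodes) + [(num_nodes - 1, 0)]

# Story names come from this README; the topology choices are assumptions.
STORY_BUILDERS = {
    "wrongPath": lambda: build_chain(4),
    "infiniteLoop": lambda: build_ring(3),
}

def build_story(story_name):
    """Look up the builder for a story and return its list of links."""
    try:
        builder = STORY_BUILDERS[story_name]
    except KeyError:
        valid = ", ".join(sorted(STORY_BUILDERS))
        raise ValueError(f"unknown story {story_name!r}; valid: {valid}") from None
    return builder()
```

With a registry like this, adding a story means adding one entry, and an unknown name fails early with a list of valid choices.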

How to Run

From this directory:

Build and run in one step:

./doit <storyName>

Or run manually:

make clean && make

sst --interactive-stop ./runStory.py <storyName>

Where <storyName> is any valid story name from the story descriptions section.
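When working through several stories in a row, it can help to generate the manual run command for each one. The helper below is a small convenience sketch, not part of the repository: the function names are hypothetical, and the command line is simply the one shown above with the story name filled in.

```python
import shlex

def story_command(story_name):
    """Return the argv list for the manual run command shown in this README."""
    return ["sst", "--interactive-stop", "./runStory.py", story_name]

def format_commands(story_names):
    """Return one shell-ready command string per story name."""
    return [shlex.join(story_command(name)) for name in story_names]

# Example: commands for a few stories from the story status table.
commands = format_commands(["wrongPath", "missingLink", "directDeadlock"])
```

Each argv list from `story_command` could also be passed to `subprocess.run(cmd, check=True)` after running `make`, assuming `sst` is on the `PATH`.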

Story Status

This table lists the use case stories included in this repository and summarizes their status. For a short description of each story, see the story descriptions section.

All stories have been implemented, so we are now focused on confirming that each is implemented correctly and on writing a use case report for each.

The "Verified?" column indicates whether the story has been hand-verified: ✅ means verified, ❌ means that something is wrong, and ❓ means that I have read the code and believe it to be correct, but I do not know of an easy way to confirm that it works as intended today.

In the "Use Case Report" column, ♦ symbols indicate how mature I believe the report is. One diamond means the report includes an example script showing how to use the SST debugger to address the case, but I have not yet thought deeply about how effective it is. Two diamonds mean the report has more content and some thoughts on wishlist items for the SST debugger. Three diamonds mean I consider the content complete.

| Story | Verified? | Use Case Report | Notes |
| --- | --- | --- | --- |
| **Event Tracing** | | | |
| wrongPath | ✅ | ♦♦ | Works in the debugger, but requires advanced topology knowledge and requires the event to set a side effect on components |
| infiniteLoop | ✅ | ♦ | |
| unexpectedDisappear | ✅ | ♦ | |
| missedDeadline | ✅ | ♦ | |
| outOfOrderReceipt | ✅ | ♦ | |
| duplicateSepTimes | ✅ | ♦ | |
| duplicateSameTime | ✅ | ♦ | |
| **Event Processing** | | | |
| broadcastStorm | ✅ | ♦ | |
| badMerge | ✅ | ♦ | |
| **Incorrect Topology** | | | |
| missingLink | ✅ | ♦ | |
| wrongLink | ✅ | ♦ | |
| unexpectedDuplicateLink | ✅ | ♦ | |
| **Deadlock** | | | |
| directDeadlock | ❓ | ♦ | |
| indirectDeadlock | ❓ | ♦ | |
| **Fault Detection And Attribution** | | | |
| detectWhenComponentBecomesInvalid | ✅ | ♦ | |
| badInvariantBetweenComponents | ✅ | ♦ | |
| componentsLoseParity | ✅ | ♦ | |
| divergedModels | ✅ | ♦ | |
| componentCausesSegfault | ✅ | ♦ | |
| badInitialState | ✅ | ♦ | |
| badTerminatingState | ✅ | ♦ | |
| findFirstToComplete | ❓ | ♦ | |
| determineWhatNotComplete | ❓ | ♦ | |
| **Load Imbalances** | | | |
| findEventHeavyComponent | ✅ | ♦ | |
| findSlowProcessingComponent | ❓ | ♦ | |
| findMemHeavyComponent | ❓ | ♦ | |
| findMemHeavyEvent | ❓ | ♦ | |
| findStarvedComponent | ✅ | ♦ | |

Story Descriptions

| Category | Stories |
| --- | --- |
| Event Tracing | `wrongPath`, `infiniteLoop`, `unexpectedDisappear`, `missedDeadline`, `outOfOrderReceipt`, `duplicateSepTimes`, `duplicateSameTime` |
| Event Processing | `broadcastStorm`, `badMerge` |
| Incorrect Topology | `missingLink`, `wrongLink`, `unexpectedDuplicateLink` |
| Deadlock | `directDeadlock`, `indirectDeadlock` |
| Fault Detection And Attribution | `detectWhenComponentBecomesInvalid`, `badInvariantBetweenComponents`, `componentsLoseParity`, `divergedModels` (`divergedModels_A` and `divergedModels_B` substories), `componentCausesSegfault`, `badInitialState`, `badTerminatingState`, `findFirstToComplete`, `determineWhatNotComplete` |
| Load Imbalances | `findEventHeavyComponent`, `findSlowProcessingComponent`, `findMemHeavyComponent`, `findMemHeavyEvent`, `findStarvedComponent` |
