Commit a311b39

Merge branch 'ngscopeclient:master' into stream-browser-dialog-update
2 parents 65c8c73 + d193989 commit a311b39

10 files changed

Lines changed: 264 additions & 43 deletions

CMakeLists.txt

Lines changed: 11 additions & 0 deletions
@@ -186,6 +186,17 @@ if(NOT SIGCXX_FOUND)
     message(FATAL_ERROR "Unable to find any version of sigc++; this is required to build ngscopeclient.")
 endif()
 
+# Check if the user has provided a path for nvtx3 (not autodetected currently)
+include(CheckIncludeFileCXX)
+if(NVTX_PATH)
+    check_include_file_cxx("${NVTX_PATH}/include/nvtx3/nvtx3.hpp" NVTX_FOUND)
+    if(NVTX_FOUND)
+        message("-- Found NVIDIA NVTX3: ${NVTX_PATH}")
+        include_directories("${NVTX_PATH}/include")
+        add_compile_definitions(HAVE_NVTX=1)
+    endif()
+endif()
+
 # We still use gtk on Linux for the file browser dialog in "native" mode
 if(LINUX)
     pkg_check_modules(GTK QUIET IMPORTED_TARGET REQUIRED gtk+-3.0)

README.md

Lines changed: 8 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,14 @@
1-
# scopehal-apps
1+
# ngscopeclient and scopehal-apps
22

3-
https://www.ngscopeclient.org
3+
This is the top level repository for ngscopeclient, as well as the unit tests for libscopehal.
44

5-
Applications for libscopehal
5+
Project website: [https://www.ngscopeclient.org](https://www.ngscopeclient.org)
66

7-
[C++ coding policy](https://github.com/azonenberg/coding-policy/blob/master/cpp-coding-policy.md)
7+
## Policies
8+
9+
* [C++ coding policy](https://github.com/azonenberg/coding-policy/blob/master/cpp-coding-policy.md)
10+
* [Code of Conduct](https://github.com/ngscopeclient/scopehal-apps/blob/master/CODE_OF_CONDUCT.md)
11+
* We are a proudly AI-free project. Only human created code is acceptable for contribution.
812

913
## Installation
1014

lib

Submodule lib updated 59 files

release-notes/CHANGELOG.md

Lines changed: 17 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -4,11 +4,26 @@ This is a running list of significant bug fixes and new features since the last
44

55
## New features since v0.1.1
66

7-
* Filters: Eye pattern is now GPU accelerated for the common case (DDR clock on uniformly sampled input) and runs about 25x faster than before (no github ticket)
8-
* Filters: CDR PLL is now GPU accelerated for the common case (no gating, deep waveform) and runs about 7.5x faster than before (https://github.com/ngscopeclient/scopehal/issues/977)
7+
* Core: Changed rate limiting sleep in InstrumentThread loop from 10ms to 1ms to avoid bogging down high performance instruments like the ThunderScope
8+
* Drivers: ThunderScope now overlaps socket IO and GPU processing of waveforms giving a significant increase in WFM/s rate
9+
* Filters: Added GPU acceleration for several filters including CDR PLL (7.5x speedup), 100baseTX (2.5x speedup), eye pattern (25x speedup), histogram (12x speedup), TIE (5.3x speedup) and more (https://github.com/ngscopeclient/scopehal/issues/977).
10+
* Filters: CDR PLL now outputs the input signal sampled by the recovered clock in a second data stream.
11+
* Filters: Peak detector for FFT etc now does quadratic interpolation for sub-sample peak fitting
912
* Filters: Horizontal bathtub curve now works properly with MLT-3 / PAM-3 eyes as well as NRZ. No PAM-4 or higher support yet.
13+
* Filters: PcapNG export now has an additional mode selector for use with named pipes, allowing live streaming of PcapNG formatted data to WireShark
1014
* GUI: enabled mouseover BER measurements on MLT-3 / PAM-3 eyes as well as NRZ. No PAM-4 or higher support yet.
1115

16+
## Breaking changes since v0.1.1
17+
18+
We try to maintain compatibility with older versions of ngscopeclient but occasionally we have no choice to change the interface of a block in a way that requires old filter graphs to be updated.
19+
20+
* Many serial protocol filters (currently 100baseTX) no longer take the input signal and recovered clock as separate inputs. Instead, they take the new sampled output from the CDR block. This eliminates redundant sampling and is significantly faster but was not possible to do in a fully backwards compatible fashion.
21+
1222
## Bugs fixed since v0.1.1
1323

24+
* Filters: PcapNG export did not handle named pipes correctly (no github ticket)
25+
* Filters: FFT waveforms were shifted one bin to the right of the correct position
26+
* Filters: Frequency and period measurement had a rounding error during integer-to-floating-point conversion causing half a cycle of the waveform to be dropped under some circumstances leading to an incorrect result, with worse error at low frequencies and short memory depths. This only affected the "summary" output not the trend plot.
27+
* GUI: Pressing middle mouse on the Y axis to autoscale would fail, setting the full scale range to zero volts, if the waveform was resident in GPU memory and the CPU-side copy of the buffer was stale
28+
1429
## Other changes since v0.1.1
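
The changelog entry on sub-sample peak fitting can be illustrated with the textbook three-point parabolic fit. This is a generic sketch and not the code from libscopehal: the names `PeakFit` and `InterpolatePeak` are hypothetical, and the real filter may weight or scale its bins differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: fractional bin position and interpolated amplitude
struct PeakFit
{
    double position;
    double value;
};

// Fit a parabola through bins i-1, i, i+1 and return its vertex.
// Precondition: i is a valid index into bins.
PeakFit InterpolatePeak(const std::vector<double>& bins, size_t i)
{
    assert(i < bins.size());

    // Edge bins lack a neighbor on one side, so return them unchanged
    if(i == 0 || i + 1 >= bins.size())
        return { static_cast<double>(i), bins[i] };

    const double a = bins[i - 1];
    const double b = bins[i];
    const double c = bins[i + 1];

    // Second difference; zero means the three points are collinear
    const double denom = a - 2 * b + c;
    if(denom == 0)
        return { static_cast<double>(i), b };

    // Vertex offset from bin i, in units of one bin
    const double delta = 0.5 * (a - c) / denom;
    return { static_cast<double>(i) + delta, b - 0.25 * (a - c) * delta };
}
```

For samples taken from an exact parabola the fit recovers the true vertex; for a real spectrum it is only an approximation of the peak location between bins.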

src/ngscopeclient/InstrumentThread.cpp

Lines changed: 8 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22
* *
33
* ngscopeclient *
44
* *
5-
* Copyright (c) 2012-2025 Andrew D. Zonenberg and contributors *
5+
* Copyright (c) 2012-2026 Andrew D. Zonenberg and contributors *
66
* All rights reserved. *
77
* *
88
* Redistribution and use in source and binary forms, with or without modification, are permitted provided that the *
@@ -43,6 +43,10 @@ void InstrumentThread(InstrumentThreadArgs args)
4343
{
4444
pthread_setname_np_compat("InstrumentThread");
4545

46+
#ifdef HAVE_NVTX
47+
NVTX3_FUNC_RANGE();
48+
#endif
49+
4650
auto inst = args.inst;
4751
if(!inst)
4852
{
@@ -79,7 +83,7 @@ void InstrumentThread(InstrumentThreadArgs args)
7983
while(!*args.shuttingDown)
8084
{
8185
//Flush any pending commands
82-
inst->GetTransport()->FlushCommandQueue();
86+
inst->BackgroundProcessing();
8387

8488
//Scope processing
8589
if(scope)
@@ -389,9 +393,9 @@ void InstrumentThread(InstrumentThreadArgs args)
389393
//TODO: does this make sense to do in the instrument thread?
390394
session->RefreshDirtyFiltersNonblocking();
391395

392-
//Rate limit to 100 Hz to avoid saturating CPU with polls
396+
//Rate limit to 1 kHz to avoid saturating CPU with polls
393397
//(this also provides a yield point for the gui thread to get mutex ownership etc)
394-
this_thread::sleep_for(chrono::milliseconds(10));
398+
this_thread::sleep_for(chrono::milliseconds(1));
395399
}
396400

397401
LogTrace("Shutting down instrument thread\n");

src/ngscopeclient/WaveformArea.cpp

Lines changed: 76 additions & 30 deletions
Original file line numberDiff line numberDiff line change
@@ -80,10 +80,6 @@ DisplayedChannel::DisplayedChannel(StreamDescriptor stream, Session& session)
8080
m_rasterizedWaveform.SetCpuAccessHint(AcceleratorBuffer<float>::HINT_LIKELY);
8181
m_rasterizedWaveform.SetGpuAccessHint(AcceleratorBuffer<float>::HINT_LIKELY);
8282

83-
//Use pinned memory for index buffer since it should only be read once
84-
m_indexBuffer.SetCpuAccessHint(AcceleratorBuffer<uint32_t>::HINT_LIKELY);
85-
m_indexBuffer.SetGpuAccessHint(AcceleratorBuffer<uint32_t>::HINT_UNLIKELY);
86-
8783
//Create tone map pipeline depending on waveform type
8884
switch(m_stream.GetType())
8985
{
@@ -111,6 +107,23 @@ DisplayedChannel::DisplayedChannel(StreamDescriptor stream, Session& session)
111107
m_toneMapPipe = make_shared<ComputePipeline>(
112108
"shaders/WaveformToneMap.spv", 1, sizeof(WaveformToneMapArgs), 1);
113109
}
110+
111+
//If we have native int64 support we can do the index search for sparse waveforms on the GPU
112+
if(g_hasShaderInt64)
113+
{
114+
m_indexSearchComputePipeline = make_shared<ComputePipeline>(
115+
"shaders/IndexSearch.spv", 2, sizeof(IndexSearchConstants));
116+
117+
//Use GPU local memory for index buffer
118+
m_indexBuffer.SetCpuAccessHint(AcceleratorBuffer<uint32_t>::HINT_LIKELY);
119+
m_indexBuffer.SetGpuAccessHint(AcceleratorBuffer<uint32_t>::HINT_LIKELY);
120+
}
121+
else
122+
{
123+
//Use pinned memory for index buffer since it should only be read once
124+
m_indexBuffer.SetCpuAccessHint(AcceleratorBuffer<uint32_t>::HINT_LIKELY);
125+
m_indexBuffer.SetGpuAccessHint(AcceleratorBuffer<uint32_t>::HINT_UNLIKELY);
126+
}
114127
}
115128

116129
DisplayedChannel::~DisplayedChannel()
@@ -1339,7 +1352,6 @@ ImVec2 WaveformArea::ClosestPointOnLineSegment(ImVec2 lineA, ImVec2 lineB, ImVec
13391352
void WaveformArea::RenderSpectrumPeaks(ImDrawList* list, shared_ptr<DisplayedChannel> channel)
13401353
{
13411354
auto stream = channel->GetStream();
1342-
auto data = stream.GetData();
13431355
auto& peaks = dynamic_cast<PeakDetectionFilter*>(stream.m_channel)->GetPeaks();
13441356

13451357
//TODO: add a preference for peak circle color and size?
@@ -1378,9 +1390,8 @@ void WaveformArea::RenderSpectrumPeaks(ImDrawList* list, shared_ptr<DisplayedCha
13781390
for(auto p : peaks)
13791391
{
13801392
//Draw the circle for the peak
1381-
auto x = (p.m_x * data->m_timescale) + data->m_triggerPhase;
13821393
list->AddCircle(
1383-
ImVec2(m_group->XAxisUnitsToXPosition(x), YAxisUnitsToYPosition(p.m_y)),
1394+
ImVec2(m_group->XAxisUnitsToXPosition(p.m_x), YAxisUnitsToYPosition(p.m_y)),
13841395
radius,
13851396
circleColor,
13861397
0,
@@ -1390,11 +1401,11 @@ void WaveformArea::RenderSpectrumPeaks(ImDrawList* list, shared_ptr<DisplayedCha
13901401
bool hit = false;
13911402
for(size_t i=0; i<channel->m_peakLabels.size(); i++)
13921403
{
1393-
if( llabs(channel->m_peakLabels[i].m_peakXpos - x) < neighborThresholdXUnits )
1404+
if( llabs(channel->m_peakLabels[i].m_peakXpos - p.m_x) < neighborThresholdXUnits )
13941405
{
13951406
//This peak is close enough we'll call it the same. Update the position.
13961407
hit = true;
1397-
channel->m_peakLabels[i].m_peakXpos = x;
1408+
channel->m_peakLabels[i].m_peakXpos = p.m_x;
13981409
channel->m_peakLabels[i].m_peakYpos = p.m_y;
13991410
channel->m_peakLabels[i].m_peakAlpha = 255;
14001411
channel->m_peakLabels[i].m_fwhm = p.m_fwhm;
@@ -1408,8 +1419,8 @@ void WaveformArea::RenderSpectrumPeaks(ImDrawList* list, shared_ptr<DisplayedCha
14081419
PeakLabel npeak;
14091420

14101421
//Initial X position is just left of the peak
1411-
npeak.m_labelXpos = x - m_group->PixelsToXAxisUnits(5 * ImGui::GetFontSize());
1412-
npeak.m_peakXpos = x;
1422+
npeak.m_labelXpos = p.m_x - m_group->PixelsToXAxisUnits(5 * ImGui::GetFontSize());
1423+
npeak.m_peakXpos = p.m_x;
14131424
npeak.m_peakYpos = p.m_y;
14141425
npeak.m_fwhm = p.m_fwhm;
14151426

@@ -2108,38 +2119,72 @@ void WaveformArea::RasterizeAnalogOrDigitalWaveform(
     }
 
     //Bind input buffers
-    if(uadata)
-        comp->BindBufferNonblocking(1, uadata->m_samples, cmdbuf);
-    if(uddata)
-        comp->BindBufferNonblocking(1, uddata->m_samples, cmdbuf);
     if(sdata)
     {
+        //Calculate indexes for X axis
+        auto& ibuf = channel->GetIndexBuffer();
+
+        //FIXME: what still depends on m_offsets CPU side??
+        //If we don't copy this, nothing is drawn
+        sdata->m_offsets.PrepareForCpuAccessNonblocking(cmdbuf);
+
+        //If we have native int64, do this on the GPU
+        if(g_hasShaderInt64)
+        {
+            IndexSearchConstants cfg;
+            cfg.len = data->size();
+            cfg.w = w;
+            cfg.xscale = xscale;
+            cfg.offset_samples = offset_samples;
+
+            const uint32_t threadsPerBlock = 64;
+            const uint32_t numBlocks = (w + threadsPerBlock - 1) / threadsPerBlock;
+
+            auto ipipe = channel->GetIndexSearchPipeline();
+            ipipe->BindBufferNonblocking(0, sdata->m_offsets, cmdbuf);
+            ipipe->BindBufferNonblocking(1, ibuf, cmdbuf, true);
+            ipipe->Dispatch(cmdbuf, cfg, numBlocks);
+            ipipe->AddComputeMemoryBarrier(cmdbuf);
+            ibuf.MarkModifiedFromGpu();
+        }
+
+        //otherwise CPU fallback
+        else
+        {
+            ibuf.PrepareForCpuAccess();
+            sdata->m_offsets.PrepareForCpuAccess();
+            for(size_t i=0; i<w; i++)
+            {
+                int64_t target = floor(i / xscale) + offset_samples;
+                ibuf[i] = BinarySearchForGequal(
+                    sdata->m_offsets.GetCpuPointer(),
+                    data->size(),
+                    target);
+
+                if(i < 16)
+                    LogDebug("ibuf[%zu] = %u\n", i, ibuf[i]);
+            }
+            ibuf.MarkModifiedFromCpu();
+        }
+
+        //Bind the buffers
         if(sadata)
             comp->BindBufferNonblocking(1, sadata->m_samples, cmdbuf);
         if(sddata)
             comp->BindBufferNonblocking(1, sddata->m_samples, cmdbuf);
 
         //Map offsets and, if requested, durations
         comp->BindBufferNonblocking(2, sdata->m_offsets, cmdbuf);
+        comp->BindBufferNonblocking(3, ibuf, cmdbuf);
         if(channel->ShouldMapDurations())
             comp->BindBufferNonblocking(4, sdata->m_durations, cmdbuf);
-
-        //Calculate indexes for X axis
-        auto& ibuf = channel->GetIndexBuffer();
-        ibuf.PrepareForCpuAccess();
-        sdata->m_offsets.PrepareForCpuAccess();
-        for(size_t i=0; i<w; i++)
-        {
-            int64_t target = floor(i / xscale) + offset_samples;
-            ibuf[i] = BinarySearchForGequal(
-                sdata->m_offsets.GetCpuPointer(),
-                data->size(),
-                target);
-        }
-        ibuf.MarkModifiedFromCpu();
-        comp->BindBufferNonblocking(3, ibuf, cmdbuf);
     }
 
+    if(uadata)
+        comp->BindBufferNonblocking(1, uadata->m_samples, cmdbuf);
+    if(uddata)
+        comp->BindBufferNonblocking(1, uddata->m_samples, cmdbuf);
+
     //Bind output texture and bail if there's nothing there
     auto& imgOut = channel->GetRasterizedWaveform();
     if(imgOut.empty())
@@ -2745,6 +2790,7 @@ void WaveformArea::RenderYAxis(ImVec2 size, map<float, float>& gridmap, float vb
     for(auto& c : m_displayedChannels)
     {
         auto data = c->GetStream().GetData();
+        data->PrepareForCpuAccess();
         auto sdata = dynamic_cast<SparseAnalogWaveform*>(data);
         auto udata = dynamic_cast<UniformAnalogWaveform*>(data);
         if(!sdata && !udata)
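
The CPU fallback in the sparse-waveform path maps each pixel column to the first sample whose offset is at or after that column's timestamp. A minimal standalone sketch of that mapping follows. It uses `std::lower_bound` in place of the project's `BinarySearchForGequal`, whose exact signature and edge-case behavior are not shown in this diff, so treating the two as equivalent is an assumption. The names `BuildColumnIndex` and `BlockCount` are hypothetical; `BlockCount` shows the usual round-up used when sizing a compute dispatch so that a partial trailing block is still launched.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// For each of w pixel columns, find the index of the first sample whose
// offset is >= the column's target timestamp. offsets must be sorted ascending.
// If no such sample exists the result is offsets.size().
std::vector<uint32_t> BuildColumnIndex(
    const std::vector<int64_t>& offsets,
    size_t w,
    float xscale,
    int64_t offset_samples)
{
    std::vector<uint32_t> index(w);
    for(size_t i = 0; i < w; i++)
    {
        // Same target computation as the loop in the diff
        const int64_t target =
            static_cast<int64_t>(std::floor(i / xscale)) + offset_samples;

        // Binary search for the first element not less than target
        const auto it = std::lower_bound(offsets.begin(), offsets.end(), target);
        index[i] = static_cast<uint32_t>(it - offsets.begin());
    }
    return index;
}

// Number of thread blocks needed to cover `items` work items, rounded up
uint32_t BlockCount(uint32_t items, uint32_t threadsPerBlock)
{
    assert(threadsPerBlock > 0);
    return (items + threadsPerBlock - 1) / threadsPerBlock;
}
```

Each column's search is independent of the others, which is what makes the same loop body a natural fit for one GPU thread per column.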

src/ngscopeclient/WaveformArea.h

Lines changed: 18 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22
* *
33
* ngscopeclient *
44
* *
5-
* Copyright (c) 2012-2025 Andrew D. Zonenberg and contributors *
5+
* Copyright (c) 2012-2026 Andrew D. Zonenberg and contributors *
66
* All rights reserved. *
77
* *
88
* Redistribution and use in source and binary forms, with or without modification, are permitted provided that the *
@@ -145,10 +145,21 @@ struct ConfigPushConstants
145145
float persistScale;
146146
};
147147

148+
class IndexSearchConstants
149+
{
150+
public:
151+
int64_t offset_samples;
152+
float xscale;
153+
uint32_t len;
154+
uint32_t w;
155+
};
156+
148157
/**
149158
@brief State for a single peak label
150159
151160
All positions/sizes are in waveform units, not screen units, so that they scale/move correctly with the waveform
161+
162+
X axis positions are in base units, not scaled by timebase
152163
*/
153164
struct PeakLabel
154165
{
@@ -333,6 +344,9 @@ class DisplayedChannel
333344
std::shared_ptr<ComputePipeline> GetToneMapPipeline()
334345
{ return m_toneMapPipe; }
335346

347+
std::shared_ptr<ComputePipeline> GetIndexSearchPipeline()
348+
{ return m_indexSearchComputePipeline; }
349+
336350
bool ZeroHoldFlagSet()
337351
{
338352
return m_stream.GetFlags() & Stream::STREAM_DO_NOT_INTERPOLATE;
@@ -430,6 +444,9 @@ class DisplayedChannel
430444
///@brief Compute pipeline for rendering sparse digital waveforms
431445
std::shared_ptr<ComputePipeline> m_sparseDigitalComputePipeline;
432446

447+
///@brief Compute pipeline for index searching
448+
std::shared_ptr<ComputePipeline> m_indexSearchComputePipeline;
449+
433450
///@brief Y axis position of our button within the view
434451
float m_yButtonPos;
435452

src/ngscopeclient/WaveformThread.cpp

Lines changed: 12 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22
* *
33
* ngscopeclient *
44
* *
5-
* Copyright (c) 2012-2024 Andrew D. Zonenberg and contributors *
5+
* Copyright (c) 2012-2026 Andrew D. Zonenberg and contributors *
66
* All rights reserved. *
77
* *
88
* Redistribution and use in source and binary forms, with or without modification, are permitted provided that the *
@@ -57,6 +57,9 @@ void RenderAllWaveforms(vk::raii::CommandBuffer& cmdbuf, Session* session, share
5757
void WaveformThread(Session* session, atomic<bool>* shuttingDown)
5858
{
5959
pthread_setname_np_compat("WaveformThread");
60+
#ifdef HAVE_NVTX
61+
nvtx3::scoped_range range("WaveformThread");
62+
#endif
6063

6164
LogTrace("Starting\n");
6265

@@ -125,6 +128,10 @@ void WaveformThread(Session* session, atomic<bool>* shuttingDown)
125128
//Wait for data to be available from all scopes
126129
if(!session->CheckForPendingWaveforms())
127130
{
131+
#ifdef HAVE_NVTX
132+
nvtx3::scoped_range range2("No data ready");
133+
#endif
134+
128135
this_thread::sleep_for(chrono::milliseconds(1));
129136
continue;
130137
}
@@ -146,6 +153,10 @@ void WaveformThread(Session* session, atomic<bool>* shuttingDown)
146153

147154
void RenderAllWaveforms(vk::raii::CommandBuffer& cmdbuf, Session* session, shared_ptr<QueueHandle> queue)
148155
{
156+
#ifdef HAVE_NVTX
157+
nvtx3::scoped_range range("RenderAllWaveforms");
158+
#endif
159+
149160
double tstart = GetTime();
150161

151162
//Must lock mutexes in this order to avoid deadlock
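
The `#ifdef HAVE_NVTX` guard is repeated at every instrumented site in this commit. One way to keep call sites shorter is to wrap the guard in a macro once. The sketch below is illustrative only and is not part of the commit: `PROFILE_SCOPE` and `MeanOf` are hypothetical names, and the include path assumes the same `nvtx3/nvtx3.hpp` layout that the CMake check above looks for.

```cpp
#include <cassert>
#include <vector>

#ifdef HAVE_NVTX
#include <nvtx3/nvtx3.hpp>
// Opens an NVTX range that ends when the enclosing scope exits
#define PROFILE_SCOPE(name) nvtx3::scoped_range profileScopeRange(name)
#else
// Without NVTX the macro expands to a no-op statement
#define PROFILE_SCOPE(name) ((void)0)
#endif

// Hypothetical workload: arithmetic mean of a non-empty sample buffer
float MeanOf(const std::vector<float>& samples)
{
    PROFILE_SCOPE("MeanOf");
    assert(!samples.empty());

    float sum = 0;
    for(float s : samples)
        sum += s;
    return sum / samples.size();
}
```

Because the macro declares a fixed variable name, it can be used at most once per scope; nested blocks each get their own range, much as `range` and `range2` are distinct variables in the diff.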

src/ngscopeclient/shaders/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -31,6 +31,7 @@ add_compute_shaders(
3131
SOURCES
3232
ConstellationToneMap.glsl
3333
EyeToneMap.glsl
34+
IndexSearch.glsl
3435
ScopeDeskewUniform4xRate.glsl
3536
ScopeDeskewUniformUnequalRate.glsl
3637
ScopeDeskewUniformEqualRate.glsl
