SCP: Enable C99 standard 'bool'#485
Conversation
|
@pkoning2: Originally, I separated out the simulator changes as separate commits and can still do that if preferable. support/sim_bool.h: Seemed like a good idea to put the header file in a subdirectory vice the top level directory, where future utility headers similar to this one can be placed. Happy to move elsewhere. No negative performance impacts. |
|
So, serious question: is it worth supporting platforms before C99? Machines that don't have a C99 compiler are probably candidates for simulation rather than candidates for running simulators. |
@markpizz compiles with VS 2008, the reasons for which completely escape me and probably would escape you too. There are a few users who compile on live VAXen and HP hardware which aren't C99 (and I wouldn't be surprised if @sethm has SIMH compiled in an AT&T 3B2.) My attitude is to accommodate these users insofar as it's possible; cross-compiling isn't quite as satisfying as compiling on the real hardware. |
|
There are C99 versions of GCC for the VAX. I don't know about HP PA but I'm pretty sure that exists too. The 3B2 is pretty old of course; I haven't touched one of those in 40 years. However, those probably don't have a compiler that knows about prototypes. |
|
Please identify what actual bug or bugs this change will fix? C99 capable compilers haven't stopped correctly compiling earlier versions of C code, nor is anyone rushing to change deployed code everywhere throughout the coding universe to conform to the evolved coding syntax. Making these many changes in the simh codebase is merely change for the sake of change. It doesn't fix actual bugs and it doesn't add new functionality. If the simh steering committee is interested in changes like this then great. |
Maybe look at the commit before you hack off. |
I did. I notice changes to almost 41 source files. Nowhere in your commit comments identify functionality which didn't produce the desired results the author intended. I see that your definition of a bug is that the source code merely doesn't use C99 syntax. Not that anything which is broken by any other definition. |
|
@markpizz: Other than the "make sure a Boolean expression produces a Boolean value" (which I did note is likely overkill and can be relaxed when it gets reviewed) in some of the source, you might have noticed there were mistyped function prototypes. Not a strictly operational bug, but code hygiene. There are benefits to allowing the compiler to make use of a virtualized 1-bit register (aka the "Z" flag) that opens up optimization options, which is why the C99 standard writers (many of whom are compiler writers) put it into the standard. |
I asked for you to identify any cases where the existing code doesn't provide the functionality the author of that code intended. Any such cases (which you haven't called out) would need fixes, but certainly not changes with the consequences your PR provides.
You call out this benefit here, and elsewhere you change legacy SCP API signatures where an argument was originally described as an int32 value, and you find specific value changing these to size_t. You then have to use different format descriptions to actually print these values either directly or in debug output. The amazing thing is that this parameter only ever takes on values between 0 and 8. This hardly needs to be a size_t on any platform. You describe the performance benefit of not having to expand the int32 value to 64 bits on some environments. Since you need to mention these performance benefits, you must think that performance gains add significant weight to your justification. I will be glad to personally pay you $1000 if you can provide a scenario where a simulator on any platform can measure any difference between that simulator, running precisely the same workload. I'm so confident to challenge you based on when and how these APIs are used. Specifically simulators exist to simulate instruction execution of a particular machine. The code paths you're changing here have essentially nothing to do the instruction execution. Simulated instruction execution will typically run on the order of likely 10's of million simulated instructions per second on most modern systems. During that time, maybe SCP timer routines are invoked 60 to 100 times per second. Therefore some 100's of thousands of simulated instruction per call to these routines. I'm guessing, but simulating each instruction likely takes dozen's of host system instructions so we're back up to millions of host instructions per call to one of these size_t or bool enhanced codepaths which might have 1 or 2 extra instructions removed due to your optimization. A couple of host instructions out of millions will hardly be measurable. Change for the sake of change. No bugs fixed. Removal of the range of compilers which can actually digest the codebase. Or another case of "Gee if I was to write these things from scratch, this is what I'd do." I suggest that you come back to these issues once Linux and a significant percent of the existing C language application codebase has also converted these things. Bugs in the simh codebase have historically been identified and fixed with at most a handful of lines of code changes in the codebase. This PR impacts some 42 files across the SCP and simulator codebase. New functionality is a different story and the needed code for things like that varies immensely. It's hard to think of changes to the simh source code from you that fit that category. |
You've completely missed the point of this PR, nor have you articulated a reason why SIMH's
I never claimed there are performance benefits or consequences. OTOH, I never take money from children. |
UTSL. If you actually read through the code, you'll see that I specifically ensure that the older |
Allow SIMH to make use of the C standard bool type when supported,
implemented in the support/sim_bool.h header.
- If the compiler claims that __STDC_VERSION__ is C99 and above, or
if the Microsoft C compiler version is at least 1800, typedef t_bool
to bool, set TRUE and FALSE to true and false, respectively.
- If not a standard compliant compiler or older Microsoft compiler,
typedef t_bool to int, TRUE = 1 and FALSE = 0 (i.e., the old SIMH
settings.)
Make a pass through the SIMH code base for t_bool variable declarations
and associated references:
- Ensure boolean expressions are boolean expressions, e.g.,
"(foo & mask) != 0" vs. "foo & mask" to remove any ambiguity.
- Fix expressions, declarations where t_bool was assumed to be the
same as int.
- Fix inconsistent function prototypes.
Changes to individual simulators submitted as separate Git commits.
CMAKE: Use target_compiler_feature() to request C99 support at a
minimum.
3b2_mau.h: Adjust *_SIGN() macros to produce Boolean expressions.
3b2_mau.c: Change sign's type from t_bool to t_uint64 in
xfp_to_decimal().
s100_dazzler.c: Change daz_0e, daz_0f and daz_frame's type from t_bool to
uint8, remove assumption that t_bool is type-equivalent to int.
m68k/m68k.h, m68k/m68kcpu.c: Rename TRUE to NMI_TRUE, FALSE to NMI_FALSE
to resolve namespace collision with SIMH TRUE and FALSE.
h316_hi.c, h316_mi.c: Ensure assignment result is Boolean. hp2100_defs.h, FLIP_FLOP enum: Alias CLEAR and SET to FALSE and TRUE, respectively, to avoid any compiler ambiguity with respect to Boolean values. hp2100_sys.c: Update assignment to Boolean result.
hp3000_lp.c: Ensure (device_command ^ J2W10_INSTALLED) evaluates to a Boolean expression. Instantiates a separate boolean variable to preserve comment formatting.
i1401_cpu.c: Initialize clear to FALSE (vice 0), ensure conv_old is
assigned a Boolean value in cpu_set_conv() (t_bool is not int32.)
i1401_iq.c: Assign use_h from a Boolean expression.
i1401_sys.c: Assign use_h from a Boolean expression in dcw() and
frpint_sym().
Fix ch6_req_wr() return type: should be t_bool, not t_stat.
ibm1130_cr.c: Remove extraneous extern declaraction for cgi.
ibm1130_defs.h: Change cgi and cgiwritable's extern declaration type to
t_bool to be consistent with actual definitions.
ibm1130_disk.c: Assign raw_disk_debug from a Boolean expression.
ibm1139_gui.c: Update update_gui() prototype to be consistent across the
GUI and non-GUI compile paths.
ibm1130_sca.c: Assign any_timer_running from a Boolean expression.
ibm1130_stddev.c: Remove extraneous extern declaraction for cgi.
t_bool for update_gui
id_idc.c: Fix idc_wds() return type (was t_stat, should be t_bool.)
nd100_sys.c: Update sim_load()'s flag declaration. Should be int, not
t_bool (flag is an unused parameter.)
pdp11_cr.c: Assign eof_pending to TRUE (vice 1.) pdp11_rq.c: Assign dontchangecapac from Boolean expression.
m68k_sys.c: Fix sim_load()'s flag parameter type, should be int, not
t_bool.
vax_sysdev.c: Initialize ka_hltenab, tmr_inst[] to TRUE, FALSE to be
consistent with t_bool.
vax_va.c, vax_vc.c, vax4xx_va.c, vax4xx_vc.c: Assign va_input_captured
from Boolean expression.
vax780_fload.c: Update rtfile_read() return type to t_bool (not t_stat.)
Change build_dib_tab() return type to t_stat return type (was t_bool)
|
Broke the PR into individual commits for (somewhat) easier reading. |
|
I don't actually get why @markpizz is commenting on this. This isn't his codebase. He hard forked SIMH, caused a lot of chaos and trouble, and has his own sandbox to play in. |
|
@bscottm I'm interested in adopting this change into ZIMH, but I think that when I do so, I'm going to simply change t_bool to bool as ZIMH is committed to C99 and above (in truth, C11 and above). It looks like there's a lot of fallout from such a change that you have demonstrated above, and I'm going to probably base my patches on what you did. |
|
@bscottm So my goal in ZIMH is twofold:
Can you tell me what kind of testing and auditing you used to find the cases where t_bool was being abused? I want to make very sure none exist in the codebase at all when I rip t_bool out. If you feel it is more appropriate, we could move this to an issue on ZIMH. |
|
I found that I could use clang-tidy to find problems of this class. I found a bunch more, and have fixed them in my tree. Assuming I can smoke test everything and it works okay, I will be pushing fixes later today. My overall decision is I'm simply switching everything to <stdbool.h> naming; 1999 was 27 years ago and I think that's enough time to adopt something. |
|
Okay, and now, I've gotten Codex CLI to put together a custom clang-query tool to find more problem spots. I had never thought to try this before, but it feels like having superpowers. |
|
See: pmetzger/zimh@268ae2f9 and a dozen related commits that paved the way for this. The use of It was a huge job, but it paid off in lots of improved code and a few real bug fixes (including nightmarish stuff like where someone tried left-shifting a t_bool by 29 bits and depended on that working.) Anyway, it's all clean now in my repository. I will happily give advice to the SIMH team if they decide to do something like this. I highly recommend it; not biting the bullet and modernizing to current language dialects means you won't catch bugs and eventually will not have compiler support if only you wait long enough; it's like having your codebase embalmed because you assume it's not really a live project any more. |
|
I don't have a problem with the proposed change. The change in confined to sim_defs.h, and all the rest is fixing mistakes, some serious. I would note that Mark P stamped out a lot of the 'not quite a real boolean expression' issues when the code was being regularly scanned by Coverity. I do have a problem with ripping out t_bool. 'typedef bool t_bool' provides the rigor of the new data type with a lot less typing. The abstract data types are there for a reason. Hardware is very standardized these days; that could change. @pmetzger - If you can provide a list of additional fixes, beyond what @bscottm listed, I would appreciate it. |
|
Not to nit pick, but the commit message for the PDP10 should be KA10 and not PDP10. PDP10 is Bob's original PDP10 simulator, KA10 is my new simulator. KA10 is mostly for historical purposes since it original started out as only a KA10 emulator. |
There should be a pretty long list of them in the ZIMH commit log, which has hundreds of bug fixes in it at this point. Generally: a modern bool is not a machine int, but is usually represented as a single byte, and cannot store a full int width of data. Even just building with warnings on and t_bool set to bool got a lot of additional compile warnings about loss of precision; one of the more bizarre ones was a too clever by a half left shift of a bool value by 29 bits, which probably saved a comparison were a bool a 32 bit number and might have been very fine code fifty years ago. clang-query can be used as well to sweep the whole tree for AST shapes indicating bugs; I smoked out a bunch of things that way. BTW, with a small tool constructed in python, you do not need to do any hand patching. There is a pair of python scripts in the ZIMH tools/ directory that was used for the actual patching, and another pair that was used for my purge of the pre-c99 int type names. (ZIMH is cleanly using |
|
Generally: the biggest problem with SIMH, which I have been fixing in ZIMH, is the lack of testing. If you have unit tests covering something you are changing, you can make changes without much fear. If you can make changes without fear, you can clean things up and do experiments you might have to back out without spending all your time chasing accidental fallout. Cleaning the codebase, everything from the layout to the type names used to removing half-broken support for versions of Android no one has run in many years, makes it easier to read, more maintainable, easier to work with, but if one has no ability to make changes without fear, one gets scared, code encrusts, known limitations get frozen in place, and what one has is not a living codebase but one that has been embalmed. One does not embalm the living. |
Teach SIMH to make use of the C standard bool type when supported, implemented in the support/sim_bool.h header.
If the compiler claims that STDC_VERSION is C99 and above, or if the Microsoft C compiler version is at least 1800, include stdbool.h, typedef t_bool to bool, set TRUE and FALSE to true and false, respectively.
If not a standard compliant compiler or older Microsoft compiler, typedef t_bool to int, TRUE = 1 and FALSE = 0 (i.e., the old SIMH settings.)
CMake: Use target_compiler_feature() to baseline C99 dialect.
Make a pass through the SIMH code base for t_bool variable declarations and associated references:
Ensure boolean expressions are boolean expressions, e.g., "(foo & mask) != 0" vs. "foo & mask" to remove any ambiguity. Overkill, perhaps, but explicitly and unambiguously Boolean.
Fix expressions, declarations where t_bool was assumed to be the same as int.
Fix inconsistent function prototypes.
3B2:
3b2_mau.h: Adjust *_SIGN() macros to produce Boolean expressions.
AltairZ80:
s100_dazzler.c: Change daz_0e, daz_0f and daz_frame's type from t_bool to
uint8, remove assumption that t_bool is type-equivalent to int.
m68k/m68k.h, m68k/m68kcpu.c: Rename TRUE to NMI_TRUE, FALSE to NMI_FALSE
to resolve namespace collision with SIMH TRUE and FALSE.
H316:
h316_hi.c, h316_mi.c: Ensure assignment result is Boolean.
hp2100_defs.h, FLIP_FLOP enum: Alias CLEAR and SET to FALSE and TRUE,
respectively, to avoid any compiler ambiguity with respect to
Boolean values.
hp2100_sys.c: Update assignment to Boolean result.
HP3000:
hp3000_lp.c: Ensure (device_command ^ J2W10_INSTALLED) evaluates to a
Boolean expression. Instantiates a separate boolean variable to
preserve comment formatting.
I1401:
i1401_cpu.c: Initialize clear to FALSE (vice 0), ensure conv_old is
assigned a Boolean value in cpu_set_conv() (t_bool is not int32.)
i1401_iq.c: Assign use_h from a Boolean expression.
i1401_sys.c: Assign use_h from a Boolean expression in dcw() and
fprint_sym().
I7094:
Fix ch6_req_wr() return type: should be t_bool, not t_stat.
Ibm1130:
ibm1130_cr.c: Remove extraneous extern declaraction for cgi.
ibm1130_defs.h: Change cgi and cgiwritable's extern declaration type to
t_bool to be consistent with actual definitions.
ibm1130_disk.c: Assign raw_disk_debug from a Boolean expression.
ibm1139_gui.c: Update update_gui() prototype to be consistent across the
GUI and non-GUI compile paths.
ibm1130_sca.c: Assign any_timer_running from a Boolean expression.
ibm1130_stddev.c: Remove extraneous extern declaraction for cgi.
t_bool for update_gui
Interdata:
id_idc.c: Fix idc_wds() return type (was t_stat, should be t_bool.)
ND100:
nd100_sys.c: Update sim_load()'s flag declaration. Should be int, not
t_bool (flag is an unused parameter.)
PDP10:
pdp10_ksio.c: Return type from build_dib_tab should be t_stat,
not t_bool.
PDP11:
pdp11_cr.c: Assign eof_pending to TRUE (vice 1.)
pdp11_rq.c: Assign dontchangecapac from Boolean expression.
SAGE:
m68k_sys.c: Fix sim_load()'s flag parameter type, should be int, not
t_bool.
VAX:
vax_sysdev.c: Initialize ka_hltenab, tmr_inst[] to TRUE, FALSE to be
consistent with t_bool.
vax_va.c, vax_vc.c, vax4xx_va.c, vax4xx_vc.c: Assign va_input_captured
from Boolean expression.
vax780_fload.c: Update rtfile_read() return type to t_bool (not t_stat.)