Speed comparison 32-bit vs 64-bit

Post by **Richard Russell** » Tue 03 Feb 2026, 14:46

I am not a great believer in benchmarks, as they can be very misleading, but the discussion of how the results from the recent array scan challenge differed depending on the version of BBC BASIC led me to run the supplied timing.bbc program (in examples/tools/) again.

Here the first set of results is from the 32-bit coded-in-assembler version of BBCSDL and the second from the 64-bit coded-in-C version, running on the same hardware (Intel Core i7 clocked at 4 GHz):

 BBC BASIC for Win32 version 1.43b                     Average:      79 ns

 Colon:                       1 ns       A%=B%<<C%:                  53 ns
 Dispatch (ENDCASE):          5 ns       A%=B%>>C%:                  55 ns
 Dispatch (ENDIF):            6 ns       A%=B%>>>C%:                 55 ns
 NEXT (integer):             22 ns       A%%=B%%:                    71 ns
 NEXT (default real):        44 ns       A%%=B%%+C%%:               109 ns
 NEXT (64-bit real):         64 ns       A%%=B%%-C%%:               110 ns
 NEXT N%:                    18 ns       A%%=B%%*C%%:               107 ns
 NEXT n%:                    41 ns       A%%=B%%DIVC%%:             107 ns
 NEXT N:                     65 ns       A%%=B%%/C%%:               213 ns
 NEXT N#:                    88 ns       A%%=B%%^3:                 139 ns
 REPEATUNTILTRUE:            28 ns       A%%=B%%<<C%%:              115 ns
 WHILEFALSE:ENDWHILE:        24 ns       A%%=B%%>>C%%:              116 ns
 A%=FALSE:                   26 ns       A%%=B%%>>>C%%:             115 ns
 A%=0:                       34 ns       A=B+PI:                    101 ns
 A=PI:                       46 ns       A=B-PI:                    100 ns
 ANTIDISESTABLISHMENT=PI:    56 ns       A=B*PI:                    101 ns
 A=(PI):                     64 ns       A=B/PI:                    105 ns
 A%=1234567890:              50 ns       A=B^3:                     206 ns
 A%=&499602D2:               41 ns       A=B^PI:                    230 ns
 A=1.23456789E38:           102 ns       A=SINB:                    110 ns
 A%=B%:                      30 ns       A=TANB:                    116 ns
 A=B:                        69 ns       A=LOGB:                    104 ns
 ANTI=ANTI:                  72 ns       A=EXPB:                    113 ns
 A%=B%+C%:                   47 ns       A=SQRB:                     89 ns
 A%=B%-C%:                   47 ns       A=ATNB:                    119 ns
 A%=B%*C%:                   46 ns       A=ABSB:                     74 ns
 A%=B%DIVC%:                 47 ns       A=INTB:                     88 ns
 A%=B%/C%:                  152 ns       PROC1:                      39 ns
 A%=C%^D%:                   95 ns       A%=FN1:                    128 ns

 BBC BASIC for Win64 version 1.43c                     Average:      85 ns

 Colon:                       3 ns       A%=B%<<C%:                  94 ns
 Dispatch (ENDCASE):          4 ns       A%=B%>>C%:                  94 ns
 Dispatch (ENDIF):            5 ns       A%=B%>>>C%:                 95 ns
 NEXT (integer):             15 ns       A%%=B%%:                    62 ns
 NEXT (default real):        22 ns       A%%=B%%+C%%:               109 ns
 NEXT (64-bit real):         25 ns       A%%=B%%-C%%:               108 ns
 NEXT N%:                    20 ns       A%%=B%%*C%%:               108 ns
 NEXT n%:                    25 ns       A%%=B%%DIVC%%:             115 ns
 NEXT N:                     31 ns       A%%=B%%/C%%:               132 ns
 NEXT N#:                    35 ns       A%%=B%%^3:                 206 ns
 REPEATUNTILTRUE:            43 ns       A%%=B%%<<C%%:              118 ns
 WHILEFALSE:ENDWHILE:        33 ns       A%%=B%%>>C%%:              117 ns
 A%=FALSE:                   40 ns       A%%=B%%>>>C%%:             117 ns
 A%=0:                       45 ns       A=B+PI:                    108 ns
 A=PI:                       48 ns       A=B-PI:                    105 ns
 ANTIDISESTABLISHMENT=PI:    56 ns       A=B*PI:                    109 ns
 A=(PI):                     79 ns       A=B/PI:                    106 ns
 A%=1234567890:              52 ns       A=B^3:                     260 ns
 A%=&499602D2:               48 ns       A=B^PI:                    219 ns
 A=1.23456789E38:            84 ns       A=SINB:                    123 ns
 A%=B%:                      46 ns       A=TANB:                    130 ns
 A=B:                        58 ns       A=LOGB:                    126 ns
 ANTI=ANTI:                  61 ns       A=EXPB:                    127 ns
 A%=B%+C%:                   80 ns       A=SQRB:                    101 ns
 A%=B%-C%:                   82 ns       A=ATNB:                    134 ns
 A%=B%*C%:                   81 ns       A=ABSB:                     67 ns
 A%=B%DIVC%:                 92 ns       A=INTB:                     78 ns
 A%=B%/C%:                  104 ns       PROC1:                      33 ns
 A%=C%^D%:                  228 ns       A%=FN1:                    119 ns

What is most notable is not how the two versions differ but how similar they are overall; I probably wouldn't have expected that, and it shows how far compilers have come. But drilling down into the detail reveals some interesting comparisons. For example, on the 32-bit version dividing two integers (A%=B%/C%, 152 ns) is significantly slower than raising to an integer power (A%=C%^D%, 95 ns), whereas on the 64-bit version the opposite is the case (104 ns versus 228 ns).
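
For anyone who wants to compare individual rows rather than the averages, here is a minimal Python sketch of the arithmetic. The timings are copied from the two tables above; the selection of rows and the helper names `ratio` and `faster_version` are my own choices for illustration and are not part of timing.bbc.

```python
# Timings in nanoseconds copied from the two tables: (32-bit, 64-bit).
# Only a handful of rows are included; the choice is illustrative.
timings = {
    "NEXT (default real)": (44, 22),
    "A%=B%+C%": (47, 80),
    "A%=B%/C%": (152, 104),
    "A%=C%^D%": (95, 228),
    "A%%=B%%/C%%": (213, 132),
    "A=B*PI": (101, 109),
}


def ratio(t32, t64):
    """Return the 64-bit time as a multiple of the 32-bit time."""
    return t64 / t32


def faster_version(t32, t64):
    """Name the version with the smaller time, or 'equal' on a tie."""
    if t32 < t64:
        return "32-bit"
    if t64 < t32:
        return "64-bit"
    return "equal"


for name, (t32, t64) in timings.items():
    print(f"{name:<22}{t32:>5} ns{t64:>6} ns   "
          f"x{ratio(t32, t64):.2f}   faster: {faster_version(t32, t64)}")
```

A ratio below 1 means the 64-bit C version was quicker for that statement, and a ratio above 1 means the 32-bit assembler version was. As with the benchmark itself, these per-row ratios should be read with caution.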

Of course this program doesn't exercise the numeric SUM() function, which is where the difference between the two versions really stood out in the case of the challenge (until I made the experimental change of in-lining the addition routine).