String allocation comparison


Post by RichardRussell »

This is nothing new, but I thought I'd re-visit the various string allocation strategies used by different versions of BBC BASIC. I ran this simple test program (extending two strings alternately forces reallocation rather than expansion into the heap):

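   10 FOR I% = 1 TO 255
   20   S$ = S$ + "X" : REM extend S$ by one character
   30   T$ = T$ + "X" : REM alternate with T$ to force reallocation
   40 NEXT
   50 DIM P% -1 : REM P% = current top of heap (reserves no memory)
   60 PRINT P% - LOMEM : REM heap bytes used since LOMEM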
The program prints the total heap usage; here is what I got for the different versions:
  • BASIC 2 (BeebEm): 7764 bytes
  • BASIC 5 (RedSquirrel): 16664 bytes
  • Matrix Brandy 1.22.8: 12792 bytes
  • BBC BASIC for SDL 2.0: 1160 bytes
Make of that what you will.
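
For contrast, a minimal sketch of the single-string case follows. It is the same program with the second string removed. The expectation that versions able to grow the most recently allocated string in place will report a much smaller figure is an assumption rather than a measured result, and no figures are claimed for it:

```bbcbasic
   10 REM Sketch only: extend ONE string, so nothing is allocated after it
   20 FOR I% = 1 TO 255
   30   S$ = S$ + "X"
   40 NEXT
   50 DIM P% -1 : REM P% = current top of heap (reserves no memory)
   60 PRINT P% - LOMEM : REM heap bytes used since LOMEM
```

Comparing this figure with the two-string result on the same interpreter should show how much of the total is caused by the alternation itself.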

Post by David Williams »

RichardRussell wrote: Wed 04 Nov 2020, 10:11
  • BASIC 2 (BeebEm): 7764 bytes
  • BASIC 5 (RedSquirrel): 16664 bytes
  • Matrix Brandy 1.22.8: 12792 bytes
  • BBC BASIC for SDL 2.0: 1160 bytes
Make of that what you will.

In case it's of interest:
  • BASIC 4 (BeebEm): 7764 bytes
  • BBC BASIC 86 (DOS emulator online): 62266* bytes
  • BB4W v1.07f: 8470 bytes
* Based on 249 loop iterations because higher values result in a 'No room' error


David.

Post by RichardRussell »

David Williams wrote: Wed 04 Nov 2020, 18:25
  • BBC BASIC 86 (DOS emulator online): 62266* bytes

That must be the small-memory-model (64K) version. With the 640K version I get:
  • BBC BASIC (86) for MS-DOS v4.81: 4397 bytes

Post by Richard Russell »

RichardRussell wrote: Wed 04 Nov 2020, 10:11
The program prints the total heap usage; here is what I got for the different versions:

I'm not sure whether waking up a 4-year-old thread is appropriate, but I wanted to add the result from BBC BASIC (Z80) version 5.00 which didn't exist then. The total (heap) memory usage from alternately lengthening two strings, one byte at a time, to 255 bytes each (a 'pathological case' for BBC BASIC) is:
  • 6502 BASIC 2 (BeebEm): 7764 bytes
  • 6502 BASIC 4 (BeebEm): 7764 bytes
  • ARM BASIC 5 (RedSquirrel): 16664 bytes
  • BBC BASIC (86) for MS-DOS v4.81†: 4397 bytes
  • Matrix Brandy v1.22.8 (Windows): 12792 bytes
  • Matrix Brandy v1.23.4 (Windows): 12864 bytes
  • BBC BASIC for Windows v1.07f: 8470 bytes
  • BBC BASIC for Windows v6.15a: 1159 bytes
  • BBC BASIC for SDL 2.0 v1.40a 32-bit: 1160 bytes
  • BBC BASIC for SDL 2.0 v1.40a 64-bit: 1287 bytes
  • BBC BASIC (Z80) v5.00: 1020 bytes
† Large memory model version ('BIGBASIC')

Post by artfizz »

ARM BASIC 5 (RedSquirrel): 16664 bytes
Matrix Brandy v1.23.4 (Windows): 12864 bytes
BBC BASIC for Windows v1.07f: 8470 bytes
The Matrix Brandy value got worse in the later version!

Post by Richard Russell »

As stated, alternately lengthening two strings is a 'pathological case' for BBC BASIC and, whilst it illustrates the worst-case situation, is unlikely to happen in a real program. A more realistic test of the string allocation performance is to alternately set two strings to a random length between 1 and 255 characters:

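   10 S% = RND(-1234) : REM seed the random number generator for repeatability
   20 DIM A% -1 : REM A% = top of heap before the test
   30 FOR I% = 1 TO 1000
   40   A$ = STRING$(RND(255), "A") : REM random length, 1 to 255
   50   B$ = STRING$(RND(255), "B")
   60 NEXT I%
   70 DIM B% -1 : REM B% = top of heap after the test
   80 PRINT "Total memory used = "; B%-A%; " bytes"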
Running that program on a similar set of platforms gives the following results:
  • 6502 BASIC 2 (BeebEm): 911 bytes
  • 6502 BASIC 4 (BeebEm): 1189 bytes
  • ARM BASIC 5 (RedSquirrel): 7708 bytes
  • BBC BASIC (86) for MS-DOS v4.81†: 1248 bytes
  • Matrix Brandy v1.23.4 (Windows): 8392 bytes
  • BBC BASIC for Windows v6.15a: 1147 bytes
  • BBC BASIC for SDL 2.0 v1.40a 32-bit: 1147 bytes
  • BBC BASIC for SDL 2.0 v1.40a 64-bit: 1271 bytes
  • BBC BASIC (Z80) v2.20: 1932 bytes
  • BBC BASIC (Z80) v5.00: 954 bytes
† Large memory model version ('BIGBASIC')

There's less variation than with the 'pathological' case, but ARM BASIC 5 and Matrix Brandy stand out as using a surprising amount of memory compared with the rest.

Post by Richard Russell »

artfizz wrote: Sat 21 Dec 2024, 12:05
The Matrix Brandy value got worse in the later version!

It does look like it, but other factors could be at play because I no longer have the earlier version installed here to compare with. For example it's quite possible that four years ago I was running the 32-bit version of Matrix Brandy and now I'm running the 64-bit version. So I wouldn't read too much into it, especially as the difference is tiny as a proportion.

Post by Richard Russell »

Richard Russell wrote: Sat 21 Dec 2024, 12:11
There's less variation than with the 'pathological' case, but ARM BASIC 5 and Matrix Brandy stand out as using a surprising amount of memory compared with the rest.

I should point out that although the program seeds the random number generator (so it always gives the same result on any given platform) the values returned by RND(255) won't be identical on different versions of BBC BASIC. So I'm not actually comparing 'like with like', although looping over 1000 iterations should reduce the consequences of this.

If I'd thought about it - thinking is not one of my strengths now! - it would have been better to use RND AND 255 rather than RND(255) because the former should return an identical set of values (between 0 and 255) on all versions of BBC BASIC. If anybody cares enough I can repeat the tests with that change.
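
For reference, the amended test would look something like the sketch below. It is the earlier program with RND(255) replaced by RND AND 255 and nothing else changed. Note that the lengths then range from 0 to 255 rather than 1 to 255, so any figures from it would not be directly comparable with those already posted. No results are claimed for it here:

```bbcbasic
   10 S% = RND(-1234) : REM seed the random number generator
   20 DIM A% -1 : REM A% = top of heap before the test
   30 FOR I% = 1 TO 1000
   40   A$ = STRING$(RND AND 255, "A") : REM random length, 0 to 255
   50   B$ = STRING$(RND AND 255, "B")
   60 NEXT I%
   70 DIM B% -1 : REM B% = top of heap after the test
   80 PRINT "Total memory used = "; B%-A%; " bytes"
```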

Post by Richard Russell »

Richard Russell wrote: Sat 21 Dec 2024, 12:11
There's less variation than with the 'pathological' case, but ARM BASIC 5 and Matrix Brandy stand out as using a surprising amount of memory compared with the rest.

When you consider that two 255-character strings will require 510 bytes of storage, plus the memory required to hold the variable references themselves, the fact that the majority of BBC BASIC versions only use a total of a Kilobyte or so of heap when running this program really isn't too bad at all, considering there is no traditional 'garbage collection'.

The old BBC BASIC (Z80) v2.20 isn't quite as good, at around 2 Kilobytes (v5.00 is much better), and both ARM BASIC 5 and Matrix Brandy are really rather poor at around 8 Kilobytes.
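
To put those figures in proportion, here is a small sketch that divides a few of the totals from the RND(255) test by the 510-byte minimum. The variable names (n%, name$, bytes%) are my own, and only four of the reported platforms are included for brevity. The 510-byte baseline ignores the variable references themselves, so the true overhead is slightly lower than the ratios printed:

```bbcbasic
   10 REM Heap use from the RND(255) test as a multiple of 510 bytes
   20 READ n%
   30 FOR I% = 1 TO n%
   40   READ name$, bytes%
   50   PRINT name$; TAB(34); bytes% / 510
   60 NEXT I%
   70 DATA 4
   80 DATA "6502 BASIC 2", 911
   90 DATA "ARM BASIC 5", 7708
  100 DATA "BBC BASIC for Windows v6.15a", 1147
  110 DATA "BBC BASIC (Z80) v5.00", 954
```

Lower-case variable names are used so that they cannot clash with keywords on any version.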

It's instructive to see what happens if you initialise the strings to their maximum possible length, which is a strategy that was recommended with the old (1980s) 8-bit versions in order to minimise the use of string memory:

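   10 S% = RND(-1234) : REM seed the random number generator
   20 DIM A% -1 : REM A% = top of heap before the test
   22 A$ = STRING$(255, "A") : REM pre-allocate both strings at maximum length
   24 B$ = STRING$(255, "B")
   30 FOR I% = 1 TO 1000
   40   A$ = STRING$(RND AND 255, "A") : REM random length, 0 to 255
   50   B$ = STRING$(RND AND 255, "B")
   60 NEXT I%
   70 DIM B% -1 : REM B% = top of heap after the test
   80 PRINT "Total memory used = "; B%-A%; " bytes"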
Now the results are somewhat different:
  • 6502 BASIC 2 (BeebEm): 526 bytes
  • 6502 BASIC 4 (BeebEm): 526 bytes
  • ARM BASIC 5 (RedSquirrel): 8124 bytes
  • BBC BASIC (86) for MS-DOS v4.81†: 544 bytes
  • Matrix Brandy v1.23.4 (Windows): 8392 bytes
  • BBC BASIC for Windows v6.15a: 1130 bytes
  • BBC BASIC for SDL 2.0 v1.40a 32-bit: 1130 bytes
  • BBC BASIC for SDL 2.0 v1.40a 64-bit: 1246 bytes
  • BBC BASIC (Z80) v2.20: 526 bytes
  • BBC BASIC (Z80) v5.00: 526 bytes
† Large memory model version ('BIGBASIC')

Here there seem to be three distinct groups: the old 8-bit versions (6502 and Z80) all use 526 bytes, with the 16-bit BBC BASIC (86) close behind at 544; my 'modern' versions (BB4W and BBCSDL) use around a Kilobyte; and once again the outliers are ARM BASIC 5 and Matrix Brandy at around 8 Kilobytes.