Programming Challenge - count capital letters in a string
Re: Programming Challenge - count capital letters in a string
Hi Richard,
I'm interested! I didn't bother to comment, because I thought your explanation was perfectly clear. I'd been looking for a solution that would catch ALL the non-alphabetic characters in the range 64-95 at once, but I couldn't find one. Given that there are only a handful, though, filtering them individually is perfectly reasonable.
Best wishes,
D
I'm interested! I didn't bother to comment, because I thought your explanation was perfectly clear. I'd been looking for a solution that would catch ALL the non-alphabetic characters in the range 64-95 at once, but I couldn't find one. Given that there are only a handful, though, filtering them individually is perfectly reasonable.
Best wishes,
D
Re: Programming Challenge - count capital letters in a string
On 28/02/2023 00:45, J.G.Harston wrote (cross-posted from the Discussion Group):
For a 1 Mbyte string (which is not at all unreasonable in a Word Processing context) David's code took 70 ms to count the capital letters on this PC, yours took three minutes.
After all, the only reason the 'array arithmetic' operators were added to BBC BASIC (in 1986 or thereabouts) was speed; you can do the same things without them using a trivial loop (two nested loops in the case of the dot-product). If you eschew them in favour of the naïve approach, you pay a high price in performance.
By leveraging the array arithmetic in a non-standard ('clever') way you can reap even greater speed benefits, as I have shown.
A naïve approach is fine, when you don't care how long it takes, but with an interpreted language like BBC BASIC speed is often an important factor. I did a quick comparison of your method compared with David's, yours took about 2,500 (two-and-a-half-thousand) times longer than his!I think I'd just do it naively:
DEFFNstr_caps(A$)
LOCAL i%,n%
IF A$="":=0
FOR i%=1 TO LEN A$
IF MID$(A$,i%,1)>="A":IF MID$(A$,i%,1)<="Z":n%=n%+1
NEXT i%
=n%
For a 1 Mbyte string (which is not at all unreasonable in a Word Processing context) David's code took 70 ms to count the capital letters on this PC, yours took three minutes.

After all, the only reason the 'array arithmetic' operators were added to BBC BASIC (in 1986 or thereabouts) was speed; you can do the same things without them using a trivial loop (two nested loops in the case of the dot-product). If you eschew them in favour of the naïve approach, you pay a high price in performance.
By leveraging the array arithmetic in a non-standard ('clever') way you can reap even greater speed benefits, as I have shown.
-
- Posts: 1
- Joined: Thu 05 Apr 2018, 07:15
Re: Programming Challenge - count capital letters in a string
They absolutely are worthy of note and you have pointed out just how valuable an in-depth understanding of the language can be. Although this particular application might be narrowly specific, you have used it to demonstrate the worth of an important feature of the language which even experienced programmers might overlook.Hated Moron wrote: ↑Mon 27 Feb 2023, 23:14 Judging by the deafening silence, the solution doesn't seem to have been of any interest either! Does nobody think BBC BASIC 'hacks' of this kind are worthy of note?
I tend to keep a copy of these things in a folder which could be named "Don't when I will need this BUT one day ..."
Please keep up the good work!
- hellomike
- Posts: 192
- Joined: Sat 09 Jun 2018, 09:47
- Location: Amsterdam
Re: Programming Challenge - count capital letters in a string
To start with, BBC BASIC array arithmetic operators are extremely powerful. No question about it.
However I like to point out that using a loop as such is not the reason for the performance drop. I.e. this ´naïve´ approach is 160 times faster (tested using a 1Mbyte string as input).
Regards,
Mike
However I like to point out that using a loop as such is not the reason for the performance drop. I.e. this ´naïve´ approach is 160 times faster (tested using a 1Mbyte string as input).
Code: Select all
DEFFNstr_caps_fast(RETURN A$)
LOCAL i%,n%,b&()
IF A$="":=0
DIM b&(LEN(A$)) : $$^b&(0)=A$
FOR i%=0 TO LEN A$-1
IF b&(i%)>=65 IF b&(i%)<=90 n%+=1
NEXT i%
=n%
Mike
Re: Programming Challenge - count capital letters in a string
It's still about fifteen times slower than DDRM's code though (by my measurement), so using array arithmetic wins by a large margin. And anyway your code is by no means 'naïve', in my judgement: 'standard' BBC BASIC doesn't even have byte arrays or the 'address of' operator so copying a string into a byte array is quite an advanced technique.hellomike wrote: ↑Tue 28 Feb 2023, 14:59 However I like to point out that using a loop as such is not the reason for the performance drop. I.e. this ´naïve´ approach is 160 times faster (tested using a 1Mbyte string as input).
Code: Select all
DEFFNstr_caps_fast(RETURN A$) LOCAL i%,n%,b&() IF A$="":=0 DIM b&(LEN(A$)) : $$^b&(0)=A$ FOR i%=0 TO LEN A$-1 IF b&(i%)>=65 IF b&(i%)<=90 n%+=1 NEXT i% =n%
But of course you're right that the reason Jonathan's code is so terribly slow is not primarily the loop, it's the use of the MID$() string-slicing function. And the reason that MID$() is so slow is because it has to assume that evaluating the numeric parameter(s) might modify the string (although it probably won't) and therefore it has to copy the string first.
If the syntax of the MID$() function had been MID$(start, count, string$) rather than MID$(string$, start, count) it could be much faster, because the string could be used in-situ rather than a copy being made. A case of what was probably an arbitrary choice, originally, having major consequences downstream.
(Incidentally your code doesn't need to use RETURN A$ since it doesn't modify the string. The RETURN will slow it down fractionally).
- hellomike
- Posts: 192
- Joined: Sat 09 Jun 2018, 09:47
- Location: Amsterdam
Re: Programming Challenge - count capital letters in a string
Thanks for the information Richard. You are always very thorough and brilliant in explaining such details.
I used RETURN A$ because I thought calling by reference prevents the interpreter making a copy (and releasing it on exit) of the input on the stack, saving time, especially when 1MB chunks are involved.
Also I realized that the following function body is even easier and faster (240 times instead of 160 times).
Still considerably slower than using array arithmetic operators but arguably easier to read/understand and maintain for a common programmer (like me).
Regards,
Mike
I used RETURN A$ because I thought calling by reference prevents the interpreter making a copy (and releasing it on exit) of the input on the stack, saving time, especially when 1MB chunks are involved.
Also I realized that the following function body is even easier and faster (240 times instead of 160 times).
Code: Select all
DEFFNstr_caps_fast(RETURN A$)
LOCAL i%,n%
IF A$="":=0
FOR i%=PTR(A$) TO PTR(A$)+LENA$-1
IF ?i%>64 IF ?i%<91 n%+=1
NEXT i%
=n%
Regards,
Mike
-
- Posts: 327
- Joined: Wed 04 Apr 2018, 06:36
Re: Programming Challenge - count capital letters in a string
But this is faster still and - to my simple brain at least - seems to be simpler both to write and to execute.
Not much in it, but still ...
Code: Select all
l%=1024*500
a$=STRING$(l%,"*")
FORi%=1TOl%:MID$(a$,i%,1)=CHR$(32+RND(124)):NEXT
TIME=0
PRINTFNstr_caps_fast(a$)
PRINTTIME
TIME=0
PRINTFNcaps(a$)
PRINTTIME
END
:
DEFFNstr_caps_fast(RETURN A$)
LOCAL i%,n%
IF A$="":=0
FOR i%=PTR(A$) TO PTR(A$)+LENA$-1
IF ?i%>64 IF ?i%<91 n%+=1
NEXT i%
=n%
:
DEFFNcaps(a$):LOCALi%,n%,p%,e%
IFa$="":=0
p%=PTR(a$):e%=p%+LEN(a$)-1
FORi%=p%TOe%
IF?i%>64IF?i%<91n%+=1
NEXT
=n%
Re: Programming Challenge - count capital letters in a string
No, the opposite is the case: adding RETURN increases the number of string copies, it doesn't reduce them! When the function or procedure is called, the actual parameter is copied into the formal parameter; this happens whether a RETURN is present or not. On return from the FN/PROC, if there is no RETURN the formal parameter is simply discarded, but with RETURN the formal parameter is copied back into the actual parameter. So, in essence, without RETURN there is one copy and with RETURN there are two copies.
Although the documentation describes RETURN as passing the string parameter by reference it's a lie; the end result is similar but the mechanism is quite different. BBC BASIC has no variable aliases, there is no way that two different variables - in this case the actual parameter and the formal parameter - can refer to the same object in memory. Consider the following code:
Code: Select all
10 actual$ = "Input string"
20 PROCtest(actual$)
30 PRINT actual$
40 END
50 DEF PROCtest(RETURN formal$)
60 PRINT formal$
70 formal$ = "Output string"
80 ENDPROC
This can easily be demonstrated by replacing the ENDPROC with GOTO 30.
Re: Programming Challenge - count capital letters in a string
I wrote this program to time the various different methods of counting the number of capital letters (more precisely the number of characters in columns 4 and 5 of the ASCII table) in a string:
Try it yourself, you should find that DDRM's non-loop (array arithmetic) method is far faster than any of the others, albeit that it would also count characters in columns 0 and 1. That can be fixed, without introducing a loop, but will increase the time.
Code: Select all
HIMEM = PAGE + 10000000
test$ = STRING$(10000, "This is a TEST string. It has some Capital Letters, some 123456789 numbers, and some that are NOT so capital.")
TIME = 0 : PRINT FNstr_caps1(test$), TIME
TIME = 0 : PRINT FNstr_caps2(test$), TIME
TIME = 0 : PRINT FNstr_caps3(test$), TIME
TIME = 0 : PRINT FNstr_caps4(test$), TIME
END
DEFFNstr_caps1(a$)
LOCAL a&()
DIM a&(LENa$)
$$^a&(0) = a$
a&() AND= %100000
=LEN(a$)-SUM(a&())/32
DEFFNstr_caps2(a$)
LOCAL N%,p%%
IF a$="":=0
FOR p%% = PTR(a$) TO PTR(a$)+LEN(a$)-1
IF ?p%%>=&40 IF ?p%%<=&5F N%+=1
NEXT
=N%
DEFFNstr_caps3(a$)
LOCAL I%,N%,b&()
IF a$="":=0
DIM b&(LEN(a$)) : $$^b&(0)=a$
FOR I%=0 TO LEN a$-1
IF b&(I%)>=&40 IF b&(I%)<=&5F N%+=1
NEXT
=N%
DEFFNstr_caps4(a$)
LOCAL I%,N%
IF a$="":=0
FOR I%=1 TO LEN a$
IF MID$(a$,I%,1)>="@" IF MID$(a$,I%,1)<="_" N%+=1
NEXT
=N%
Re: Programming Challenge - count capital letters in a string
Putting to one side that your code is broken (pointers need to be 64-bit capable, so for example i%% here rather than i%) this is yet another method using a loop. The challenge stated: "you are not allowed to use any kind of loop, iteration or recursive function call - linear code only!".KenDown wrote: ↑Tue 28 Feb 2023, 21:17 But this is faster still and - to my simple brain at least - seems to be simpler both to write and to execute.
Code: Select all
DEFFNcaps(a$):LOCALi%,n%,p%,e% IFa$="":=0 p%=PTR(a$):e%=p%+LEN(a$)-1 FORi%=p%TOe% IF?i%>64IF?i%<91n%+=1 NEXT =n%
There are several 'non qualifying' methods using a loop, and whilst they vary in their efficiency even the fastest of them is about seven times slower than DDRM's non-loop solution, using his test string, or about four times slower if his solution is modified not to count characters in columns 0 and 1.
It's also worth noting that the 'loop' solutions will get even slower if the proportion of capital letters in the string increases, whereas the non-loop solution takes the same time however few or many capital letters there are.
So the moral of the story, if there is one, is that whenever iterating through all the elements of an array (and a string is conceptually an array of characters) think about whether you might be able to use array arithmetic. If you can, there will probably be a big speed benefit.