SORTLIB of 2D array

Edja · Post by **Edja** » Sun 05 Aug 2018, 09:59

I have some doubts about the way I go about to sort a 2D array.
My understanding of the SORTLIB documentation is that the sorting proceeds based on the values in a row of the array. But I want to sort the array based on the content of a column. To do this, I do a transposition of the array so I can choose a row (instead of a column) as the key of the sort.

The following code makes this more tangible. When executed you will see 4 times an array content.
The first is the original array, still to be sorted. I want to sort all rows based upon the values in column 3.
The second is the transposed array, still to be sorted.
The third one is the transposed array, after sorting based upon the data in row 3
The fourth is the result of transposing the third one. This delivers the original array but sorted based on the content of column 3

This effectively gives me the result I'm looking for. So I have no immediate problem.
Still, I feel uncomfortable because I think I'm not correctly using SORTLIB.
Question 1: Is there a way to perform the sort based upon the content of a column without having to transpose the array first.
Question 2: Row 0 (array elements A$(0,0) up to A$(0,4) can contain irrelevant data. How to avoid that this row is also taken into the sort. I'm not sure if I need to use the line C% = DIM(B$(),DIM(B$()))+1 or just C% = DIM(B$(),DIM(B$())) or another approach.

Thanks for any feedback,
Eddy

      MODE 28
      INSTALL @lib$+"SORTLIB"
      INSTALL @lib$+"ARRAYLIB"
      DIM A$(8,4),B$(4,8)
      FOR r=1 TO 8:FOR c=1 TO 4:READ A$(r,c):NEXT c:NEXT r

      PROC_Show_A$
      PROC_transpose$(A$(), B$())
      PROC_Show_B$

      sort% = FN_sortinit(0,2)
      C% = DIM(B$(),DIM(B$()))+1
      CALL sort%, B$(3,0),B$(1,0),B$(2,0),B$(4,0)

      PROC_Show_B$
      PROC_transpose$(B$(),A$())
      PROC_Show_A$

      END
      REM ----------------------------------------------------------------
      DEF PROC_Show_A$
      FOR r=1 TO 8:PRINT SPC2 A$(r,1) TAB(32)  A$(r,2) TAB(40) A$(r,3) TAB(55) A$(r,4):NEXT r
      PRINT STRING$(120,"_")
      ENDPROC
      REM ----------------------------------------------------------------
      DEF PROC_Show_B$
      FOR r=1 TO 4:FOR c=1 TO 7:PRINT B$(r,c) SPC5;:NEXT c:PRINT B$(r,c),:NEXT r
      PRINT STRING$(120,"_")
      ENDPROC
      REM ----------------------------------------------------------------
      DATA Gaztransport,177,IE003262,BRU,GLAXOSMITHKLINE,306,LU020157,NAV,GRANDVISION,199,AU031893
      DATA OTC,MYTHRA,103,LU017361,NAV,SISF UK OPP AD,66,ZS464286,BRU,TH UK EQ GBP,231,US500631
      DATA LSE,UMICORE,12114,FR001040,AEX,VODAFONE,196,LU010939,LSE,BOMBARDIER,1900,KL100584,PSE

Edja · Post by **Edja** » Mon 06 Aug 2018, 10:59

With some experimenting I've managed to find the answer to my question 2 :
"Question 2: Row 0 (array elements A$(0,0) up to A$(0,4) can contain irrelevant data. How to avoid that this row is also taken into the sort. I'm not sure if I need to use the line C% = DIM(B$(),DIM(B$()))+1 or just C% = DIM(B$(),DIM(B$())) or another approach."

It was all in the manual.

The two lines
C% = DIM(B$(),DIM(B$()))+1
CALL sort%, B$(3,0),B$(1,0),B$(2,0),B$(4,0)
should have been :
C% = DIM(B$(),DIM(B$()))
CALL sort%, B$(3,1),B$(1,1),B$(2,1),B$(4,1)

The first question is still open :"Is there a way to perform the sort based upon the content of a column without having to transpose the array first? "
Here is the corrected code :

      MODE 28
      INSTALL @lib$+"SORTLIB":INSTALL @lib$+"ARRAYLIB"
      DIM A$(8,4),B$(4,8)
      A$()="0":B$()="0"
      FOR r=1 TO 8:FOR c=1 TO 4:READ A$(r,c):NEXT c:NEXT r

      PROC_Show_A$:PROC_transpose$(A$(),B$()):PROC_Show_B$

      sort% = FN_sortinit(0,2)
      C% = DIM(B$(),DIM(B$()))
      CALL sort%, B$(3,1),B$(1,1),B$(2,1),B$(4,1)

      PROC_Show_B$:PROC_transpose$(B$(),A$()):PROC_Show_A$

      END
      REM ----------------------------------------------------------------
      DEF PROC_Show_A$
      LOCAL r,c
      FOR r=0 TO 8:PRINT A$(r,0) SPC2 A$(r,1) TAB(32)  A$(r,2) TAB(40) A$(r,3) TAB(55) A$(r,4):NEXT r
      PRINT STRING$(120,"_")
      ENDPROC
      REM ----------------------------------------------------------------
      DEF PROC_Show_B$
      LOCAL r,c
      FOR r=0 TO 4
        FOR c=0 TO 7
          PRINT B$(r,c) SPC5;
        NEXT c
        PRINT B$(r,c),
      NEXT r
      PRINT STRING$(120,"_")
      ENDPROC
      REM ----------------------------------------------------------------
      DATA Gaztransport,177,IE003262,BRU,GLAXOSMITHKLINE,306,LU020157,NAV,GRANDVISION,199,AU031893
      DATA OTC,MYTHRA,103,LU017361,NAV,SISF UK OPP AD,66,ZS464286,BRU,TH UK EQ GBP,231,US500631
      DATA LSE,UMICORE,12114,FR001040,AEX,VODAFONE,196,LU010939,LSE,BOMBARDIER,1900,KL100584,PSE

Eddy

DDRM · Post by **DDRM** » Mon 13 Aug 2018, 08:28

Hi Eddy,

Your second question, as you discovered, is essentially a question of using the "1 based array" technique shown in the manual. If you had several header cells I guess you could adapt the version for sorting only part of an array, but I haven't tried how that would work with multiple columns.

I suspect it's not possible to sort on columns rather than rows because of the way the array data is arranged in memory, which is probably critical to the way the assembler sorting algorithm works, though I haven't worked through it to find out!

If anyone knows better, I'm happy to be corrected...

Best wishes,

D

Edja · Post by **Edja** » Mon 13 Aug 2018, 14:12

Hi D,
thanks for the reply. I agree with your views.
When the sort-routines were written in assembler, the decision was made to do it based on rows. To do the same based on columns would just be doubling the work, unnecessarily, and perhaps be more complicated (just guessing,I'm not able evaluate this correctly).

So the way to sort a 2D-array based on columns is to first transpose these columns to rows, perform the sort, then do the reverse transpose as shown in the code I've submitted. Simple. And also, because the sort algorithm and PROC_transpose$ have both been written in assembler it remains very fast, no speed penalty because of the 2x-transpose (at least not for the array dimensions I use in my programs).

Maybe the code I've included in my previous post can be used by others as an example for "Sorting 2D-arrays based on column content".

Regards,
Eddy

DDRM · Post by **DDRM** » Tue 14 Aug 2018, 08:36

Hi Eddy,

I think it's more than just an arbitrary decision to write it for rows: the data in memory is contiguous for a row. If there are multiple rows, each row follows the last. Therefore by sorting the data in a row, the routine can simply run along the memory locations. If you tried to do the same with columns, you'd need to jump by (data size x no of elements) each time, which would be more complex, and require more inputs to deal with partial sorts.

I agree the double transposition is an acceptable solution for modest-sized arrays. An alternative is to realise from the start that you want to be able to sort, and set the arrays up with the data in rows in the first place!

For most uses it doesn't really matter which are the rows and which are the columns (it's just a matter of which way you put your index variables). Of course, if you are going to use matrix multiplication (for example to rotate location data), that will constrain things - but then it may not make sense to sort the data...

Best wishes,

D

Edja · Post by **Edja** » Tue 14 Aug 2018, 09:16

Hi D,

An alternative is to realize from the start that you want to be able to sort, and set the arrays up with the data in rows in the first place!

True and, if I remember correctly, Richard advised me along those lines some 3 years ago and I followed his advice. This time however, the array is the output of another, now stable, program that I've developed over the years and that I am reluctant to modify to avoid introducing bugs. For not too big arrays the transposition is an elegant way out.

it's more than just an arbitrary decision to write it for rows

After re-reading my previous reply, I realize I may have implied inadvertently that the decision was arbitrary. That was not my intention. My apologies. Your explanation clarifies in more precise terminology what was already my "feeling"
Thanks !
Regards,
Eddy

Edja · Post by **Edja** » Sat 25 Aug 2018, 13:58

if I remember correctly, Richard advised me along those lines some 3 years ago

In fact this goes back to 2012. I've now found the exchange of information about SORTLIB in https://groups.io/g/bb4w. I've done a search with the term "Multidimensional array sort" .
Lots of detailed and useful answers from Richard to my own questions. Since then the topic has been revisited a few times by others
I should have remembered to look there.

Eddy

BBC BASIC forum

SORTLIB of 2D array

SORTLIB of 2D array

Re: SORTLIB of 2D array

Re: SORTLIB of 2D array

Re: SORTLIB of 2D array

Re: SORTLIB of 2D array

Re: SORTLIB of 2D array

Re: SORTLIB of 2D array