using_20regular_20expressions

Differences

This shows you the differences between two versions of the page.

using_20regular_20expressions [2018/03/31 13:19] – external edit 127.0.0.1
using_20regular_20expressions [2024/01/05 00:21] (current) – external edit 127.0.0.1
Line 6: Line 6:
 | [abc]\\ | matches "a", "b" or "c"\\ |
 | [a-z]\\ | matches any lowercase letter\\ |
-| [^b]at\\ | matches "cat", "fat", "hat" etc. but not "bat"\\ |
+| <nowiki>[^b]</nowiki>at\\ | matches "cat", "fat", "hat" etc. but not "bat"\\ |
 \\  For more information on regular expression syntax, see this [[http://en.wikipedia.org/wiki/Regular_expression|Wikipedia article]].\\ \\  You can use regular expressions in a BBC BASIC program by means of the **gnu_regex** DLL, which can be downloaded from [[http://people.delphiforums.com/gjc/gnu_regex.html|here]][[/Using%20regular%20expressions#footnote|[1]]]. First load the DLL in the usual way and look up the addresses of its **regcomp** and **regexec** functions:\\ \\ 
 +<code bb4w>
         SYS "LoadLibrary", "gnu_regex.dll" TO gnu_regex%
         IF gnu_regex% = 0 ERROR 100, "Cannot load gnu_regex.dll"
         SYS "GetProcAddress", gnu_regex%, "regcomp" TO regcomp%
         SYS "GetProcAddress", gnu_regex%, "regexec" TO regexec%
 +</code>
 For this to work, **gnu_regex.dll** must be in the current directory, the Windows directory (often C:\WINDOWS), the Windows system directory (often C:\WINDOWS\SYSTEM32) or one of the directories listed in the PATH environment variable. Alternatively, you can copy the file to your BBC BASIC for Windows library folder and load it explicitly from there:\\ \\ 
 +<code bb4w>
         SYS "LoadLibrary", @lib$+"gnu_regex.dll" TO gnu_regex%
 +</code>
 The code below is a very simple example: it sets up a pattern, then repeatedly reads a string from the user and tests it against that pattern:\\ \\ 
 +<code bb4w>
         DIM buffer% 255
  
Line 26: Line 31:
           IF result% PRINT "Not matched" ELSE PRINT "Matched"
         UNTIL FALSE
 +</code>
 You should ensure that **buffer%** points to a memory buffer large enough to hold the //compiled// regular expression (although it's not clear how you are supposed to ascertain the required size!). As always, execute the **DIM** statement only once, or use **DIM LOCAL**, to avoid a memory leak and an eventual **No room** error.\\ \\  In this example the pattern matches any of the characters "a", "b", "c", "x", "y" or "z" anywhere in the string. The program as listed gives no information on //where// in the string the match occurred. You can discover that by amending the program as follows:\\ \\ 
 +<code bb4w>
         DIM offsets{start%, finish%}
         REPEAT
Line 33: Line 40:
           IF result% PRINT "Not matched" ELSE PRINT "Matched at ";offsets.start%
         UNTIL FALSE
 +</code>
 Here **offsets.start%** is set to the offset, from the beginning of the string, of the first match.\\ \\  You can make the matching //case insensitive// by changing the final parameter of **regcomp** from 0 to 2 as follows:\\ \\ 
 +<code bb4w>
         _REG_ICASE = 2
         SYS regcomp%, buffer%, pattern$, _REG_ICASE TO result%
 +</code>
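 The calls to **regcomp** and **regexec** shown so far look like the POSIX C functions of the same names. As an illustration under that assumption, the sketch below uses a platform's own **regex.h** rather than gnu_regex.dll to compile a pattern and report the offset of the first match. The helper name **first_match_offset** and the pattern **[abcxyz]** are hypothetical, and the value of **REG_ICASE** comes from the C header, so it may not equal the value the DLL expects.

```c
#include <regex.h>

/* Hypothetical helper (not part of gnu_regex.dll): compile `pattern` with
   `cflags` and return the offset of the first match in `text`, -1 if there
   is no match, or -2 if the pattern fails to compile. */
int first_match_offset(const char *pattern, const char *text, int cflags)
{
    regex_t compiled;      /* plays the role of buffer% */
    regmatch_t offsets[1]; /* plays the role of offsets{start%, finish%} */
    int result;

    if (regcomp(&compiled, pattern, cflags) != 0)
        return -2;

    /* regexec returns 0 on a match and non-zero otherwise, which is why
       the BASIC code prints "Not matched" when result% is non-zero. */
    result = regexec(&compiled, text, 1, offsets, 0);
    regfree(&compiled);

    return result == 0 ? (int)offsets[0].rm_so : -1;
}
```

 With this helper, **first_match_offset("[abcxyz]", "HELLO XYZ", 0)** finds no match, whereas passing **REG_ICASE** as the last argument reports offset 6, the position of the "X".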
 You can also select **extended regular expressions** by setting the final parameter to 1:\\ \\ 
 +<code bb4w>
         _REG_EXTENDED = 1
         SYS regcomp%, buffer%, pattern$, _REG_EXTENDED TO result%
 +</code>
 In this mode additional //metacharacters// are recognised; for example, the vertical bar (|) separates alternatives:\\ \\ 
  
-| abc|def\\ | matches "abc" or "def"\\ |
+| <nowiki>abc|def</nowiki>\\ | matches "abc" or "def"\\ |
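 The effect of extended mode on the vertical bar can be checked with a short C sketch. It assumes the POSIX **regex.h** interface, which the DLL's **regcomp** and **regexec** appear to follow; it does not call gnu_regex.dll itself. The helper name **pattern_matches** is hypothetical, and the flag constants come from the C header rather than from the DLL.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of gnu_regex.dll): returns 1 if `pattern`,
   compiled with `cflags`, matches anywhere in `text`, 0 if it does not,
   and -1 if the pattern fails to compile. */
int pattern_matches(const char *pattern, const char *text, int cflags)
{
    regex_t compiled;
    int result;

    /* REG_NOSUB: only a yes/no answer is wanted, not match offsets. */
    if (regcomp(&compiled, pattern, cflags | REG_NOSUB) != 0)
        return -1;

    result = regexec(&compiled, text, 0, NULL, 0);
    regfree(&compiled);

    return result == 0 ? 1 : 0;
}
```

 Compiled with **REG_EXTENDED**, the pattern **abc|def** matches a string containing "def". Compiled with flags of 0, the bar is an ordinary character, so only the literal text "abc|def" matches. In C the flags are combined with a bitwise OR, so **REG_EXTENDED | REG_ICASE** also matches "DEF". Whether the DLL accepts the corresponding combined value is an assumption that should be tested.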
 \\
 ----
 [1] When last checked, the file **gnu_regex.exe** was corrupted (missing the last byte). To repair it you can use this simple BBC BASIC program, which appends a single zero byte to the end of the file:\\ \\ 
 +<code bb4w>
         F% = OPENUP("gnu_regex.exe")
         PTR#F% = EXT#F%
         BPUT #F%,0
         CLOSE #F%
 +</code>
using_20regular_20expressions.1522502389.txt.gz · Last modified: 2024/01/05 00:16 (external edit)