Oxygen Basic

Topic started by: Charles Pegge on May 18, 2014, 06:20:58 AM

Title: Token engine concept
Post by: Charles Pegge on May 18, 2014, 06:20:58 AM

Code: [Select]

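'FOR READING SIMPLE SPACED WORDS
'i is the 1-based scan position in s (updated on return); lw receives the word length

function Getword(string s,sys *i,*lw) as string
===============================================
sys ps = strptr s
byte b at (i-1+ps)   'byte overlay on the string data, starting at position i
sys bi,ei
do                   'skip leading white space and control characters (1 to 32)
  select b
  case 0         : exit do
  case 33 to 255 : exit do
  end select
  @b++
enddo
bi=@b                'address of the first character of the word
do                   'scan to the end of the word
  select b
  case 0       : exit do
  case 1 to 32 : exit do
  end select
  @b++
enddo
ei=@b                'address just past the last character of the word
lw=ei-bi
i=ei-ps+1            'next scan position, 1-based
if lw then return mid s,bi-ps+1,lw
end function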



string tokerr
sys    ert
string cr=chr(13,10)


function tokenise(string s) as string
=====================================
sys aw
sys lw
sys iw=1
sys it=1
string wr
string tok
do
  wr=lcase getword s,iw,lw
  if wr="" then exit do
  if it>len tok then tok+=nuls 16000
  aw=asc wr
  select aw 'simple lookup
  case "a"
    if wr="any"
      mid tok,it,"A" : it++
    else
      ert=1
    end if
  case "b"
    if wr="brave"
      mid tok,it,"B" : it++
    else
      ert=1
    end if
  case "c"
    if wr="cat"
      mid tok,it,"C" : it++
    else
      ert=1
    end if
  case else
    ert=1
  end select
  if ert=1 then tokerr += "unknown " wr cr
  ert=0
end do
return left tok,it-1
end function

function exec(string toks)
==========================

  'JUMP BY TOKEN

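  sys t[256] : t[65]=>{@FAA,@FBB,@FCC}   'jump table indexed by token byte: "A"=65, "B"=66, "C"=67
  '
  sys    e   = len(toks)
  sys    i   = 0
  byte  *b   = strptr(toks)-1            'byte overlay, starts one byte before the first token
  sys    eb  = @b+e                      'address of the last token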

  NextItem:

  @b++
  if @b>eb then exit function
  i=b : goto t[i]
  'goto t[b]
  FAA:
  print "AA"
  goto NextItem

  FBB:
  print "BB"
  goto NextItem
  '
  FCC:
  print "CC"
  goto NextItem

end function

string toks=tokenise " any brave cat "

if tokerr then
  print tokerr
else
  exec toks
end if

Title: Re: Token engine concept
Post by: Aurel on May 18, 2014, 09:33:25 AM
Thanks Charles .  :)
Charles...
Can you present a bytecode engine concept?
I mean something simple to understand..
Title: Re: Token engine concept
Post by: Peter on May 18, 2014, 10:53:18 AM
Quote
Can you present a bytecode engine concept?

What about a storybook: the byte prince married the 64-bit princess!
They are living happily in a 64k ROM for the rest of their lives.    ;D

Title: Re: Token engine concept
Post by: Mike Lobanovsky on May 18, 2014, 01:03:30 PM
(http://www.beat102103.com/beat-breakfast-blog/wp-content/uploads/2012/10/thumbs-up.jpg)
Title: Re: Token engine concept
Post by: Aurel on May 18, 2014, 01:08:33 PM
Peter...Better stick to your gold coin... :P here  -> (Y)
so what is wrong with my question now ?
Title: Re: Token engine concept
Post by: Frankolinox on May 20, 2014, 03:33:26 AM
@charles: "token engine concept": I get an error at line 82 (involving an array indexed by i) in your example above, and I cannot run it.

regards, frank
Title: Re: Token engine concept
Post by: jack on May 20, 2014, 05:06:10 PM
Quote
Thanks Charles .  :)
Charles...
Can you present a bytecode engine concept?
I mean something simple to understand..

Hello Aurel,
if I were you, I would look at the UCSD Pascal implementation. Here's some info: http://en.wikipedia.org/wiki/UCSD_Pascal
Title: Re: Token engine concept
Post by: Charles Pegge on May 20, 2014, 10:11:44 PM
Aurel,

This example simply demonstrates converting words into tokens, then using the tokens to execute code. It minimises string processing by using a stretchable buffer tok and a byte array overlay b. Mid is used to patch bytes into the buffer, without creating any additional pieces of string.
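
The two stages described above can be sketched in Python rather than OxygenBasic. This is only an illustration under stated assumptions: the names WORD_TO_TOKEN, tokenise and run are hypothetical, a dictionary lookup stands in for the first-letter select, and a list of Python callables stands in for the table of label addresses used with goto.

```python
# Hypothetical sketch: words -> one-byte tokens -> dispatch through a 256-entry table.
# None of these names come from the OxygenBasic example; they are illustrative only.

WORD_TO_TOKEN = {"any": ord("A"), "brave": ord("B"), "cat": ord("C")}


def tokenise(text):
    """Return (tokens, errors): one byte per known word, one message per unknown word."""
    tokens = bytearray()
    errors = []
    for word in text.lower().split():
        code = WORD_TO_TOKEN.get(word)
        if code is None:
            errors.append("unknown " + word)
        else:
            tokens.append(code)
    return bytes(tokens), errors


def run(tokens):
    """Execute each token by indexing a handler table with its byte value."""
    out = []
    handlers = [None] * 256
    handlers[ord("A")] = lambda: out.append("AA")
    handlers[ord("B")] = lambda: out.append("BB")
    handlers[ord("C")] = lambda: out.append("CC")
    for code in tokens:          # iterating over bytes yields integers 0..255
        handlers[code]()
    return out


tokens, errors = tokenise(" any brave cat ")
if errors:
    print("\n".join(errors))
else:
    print(run(tokens))           # ['AA', 'BB', 'CC']
```

The point of the table is that executing a token costs one index operation instead of a string comparison, which is the same reason the original indexes t[] with the token byte.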


Frank,

Try copying again. Line 82 is a blank line, and line 81 is a comment :)


Jack,

I remember Apple Pascal. One of my friends earned a good living developing a bespoke database on the Apple IIc. Compiling a module to P-code on his system usually took about 20 minutes! It kept 2 floppy disk drives very busy. The 8-bit 6502 processor would have about 1/10000 the performance of a Pentium at best.


Title: Re: Token engine concept
Post by: Frankolinox on May 21, 2014, 05:58:13 AM
Strange, this line doesn't work for me and produces the error message shown in the picture I've attached:

Code: [Select]
i=b : goto t[i]
?

regards, frank
Title: Re: Token engine concept
Post by: Mike Lobanovsky on May 21, 2014, 06:55:30 AM
Hello Frank,

Sorry but you haven't attached anything to your message.

"Jump by array" is a new language feature. Make sure to download the latest O2 package from the Wizard at the top of this page. :)
Title: Re: Token engine concept
Post by: Aurel on May 21, 2014, 12:05:39 PM
Frank..frank...frank
I asked you a question in the Oxide topic.. but you did not respond... :o
maaaan... sometimes I really cannot figure you out at all  ???
please don't get me wrong  :D
Title: Re: Token engine concept
Post by: Aurel on May 21, 2014, 12:19:24 PM
Quote
Aurel,

This example simply demonstrates converting words into tokens, then using the tokens to execute code. It minimises string processing by using a stretchable buffer tok and a byte array overlay b. Mid is used to patch bytes into the buffer, without creating any additional pieces of string.

Charles
I will try to study this example...
But what about the error...
In the source code I found a few enddo mistakes... they look trivial, but
I am not sure that this is the best or the easiest way..

Jack
hmmm Pascal OS....
I found this interesting:

1.Token threading
2.Huffman threading
3.Lesser used threading

Title: Re: Token engine concept
Post by: Charles Pegge on May 21, 2014, 01:53:08 PM
Hi Aurel

The error is a little below your screen shot: goto t[ i]

The earlier syntax was something like : j=t[ i] : jmp j


You need a more recent o2 to support array goto.

http://www.oxygenbasic.org/o2zips/Oxygen.zip
Title: Re: Token engine concept
Post by: Aurel on May 21, 2014, 02:13:20 PM
OK... I see
Oops, I forgot to set up a new folder with the new DLL and the supporting changes
my fault  :-\
Title: Re: Token engine concept
Post by: Frankolinox on May 22, 2014, 03:24:26 AM
Well, the "token engine" example works now, after I installed the latest oxygen.dll (latest O2 build), and the array problem is gone. Sometimes I have problems with Oxygen examples from the Oxygen folders, and after installing a newer Oxygen version they are gone. In general: sorry, but I wonder when an example works fine for some months without any problems and then produces errors. I look for the reason why this happened and cannot find a solution. For me it's better to ask here or show the problems, because Charles often changes functions or declarations, or adds new macros, and so on.. :)

@aurel: what question did you ask about Oxide? I am not here every day, and I had a lot of trouble with my e-mail account last week (it was hacked), plus many other things around my compiler stuff and of course the hot weather outside .. :) And of course there's not enough time for programming beside my main job.

Thanks for the feedback, Mike, Charles.

nice day, frank
PS: I sent the wrong picture this afternoon, sorry gentlemen, but now the example works correctly, and that was the important thing :)
Title: Re: Token engine concept
Post by: Charles Pegge on May 23, 2014, 04:35:21 AM

The OpenGL material went through a major overhaul recently, and OxygenBasic has been through quite a lot of internal changes as well. I will try to hold back on posting examples that depend on new features, and allow a bit more time to test and to make any necessary adjustments.
Title: Re: Token engine concept
Post by: Aurel on March 19, 2019, 04:41:43 AM
Back to the token engine concept..
This time the example is very general (I hope) and useful (I hope too),
and it also gives me something to post.. ::)
The MAIN goal is for this small program to become part of a small language interpreter (I hope again).

Anyway, so far it seems to tokenize these properly:

ident (keywords, variables: [a-z][a-z0-9]*), lower case only (for now)
operators (+ , - , * , / , = , < , > )
quoted string (literal) : "quoted"
parens (brackets) - () , []
special - comma, colon, ...

Code: [Select]
'microB tokenizer by Aurel 19.3.2019
int tkNULL=0,tkPLUS=1,tkMINUS=2,tkMULTI=3,tkDIVIDE=4
int tkCOLON=5,tkCOMMA=6,tkLPAREN=7,tkRPAREN=8,tkLBRACKET=9,tkRBRACKET=10
int tkPRINT=11,tkDOT=12,tkLINE=13,tkCIRCLE=14 ,tkEOL = 20
string tokList[256] : int typList[256]   'token/type arrays
int p = 1 ,start = p ,tp ,n              'init (start was declared twice; it is not used below)
string code,ch,tk ,crlf=chr(13)+chr(10),bf
'--------------------------------------------------------------------
code = "arr[10]: func(a,b): var1+ 0.5*7: str s="+ chr(34)+ "micro" + chr(34)  ' test or load_src?
'--------------------------------------------------------------------
sub tokenizer(src as string) as int
'ch = mid(src,p,1) : print "CH:" + ch' get first char
while p <= len(src)
    ' print "P:" + str(p)       
     ch = mid(src,p,1)                   'get char

 If asc(ch)=32 then p=p+1 : end if             'skip blank space[ ]
 If asc(ch)=9  then p=p+1 : end if             'skip TAB [    ]
'--------------------------------------------------------
 If asc(ch)=34 ' if char is QUOTE "
' print mid(src,p+1,1)
 p++ :  ch = mid(src,p,1) : tk=ch : p++        'skip quote :add ch TO tk buffer: p+1
while asc(ch) <> 34 'and mid(src,p+1,1)<> chr(34)       
   ch = mid(src,p,1) : if asc(ch)= 34 then exit while
        tk=tk+ch : p++
        IF ch = chr(10): print "Unclosed Quote! Exit...": exit sub : end if
wend
    tp++ : tokList[tp] = tk : tk="":ch="": p++  'add quoted string to token list
 End if
'-------------------------------------------------------           
 If (asc(ch)>96 and asc(ch)<123)          ' [a-z]
   while (asc(ch)>96 and asc(ch)<123) or (asc(ch)>47 and asc(ch)<58) ' [a-z0-9]*
         'print "AZ:" + ch
         tk=tk+ch : p++ : ch = mid(src,p,1)
   wend
      'print "TOK-AZ:" + tk + " PAZ:" + p
       tp++ : tokList[tp] = tk : tk="":ch=""       
       'return IDENT;
 End If
'--------------------------------------------------------------
'While (Asc(Look) > 47 And Asc(Look) < 58) Or Asc(Look) = 46'
 If (asc(ch)>47 and asc(ch)<58)                    ' [0-9.]
    while (asc(ch)>47 AND asc(ch)<58) OR asc(ch)=46  '[0-9[0.0]]*
        tk=tk+ch :p++
        ch = mid(src,p,1)
    wend
        'print "Pnum:" + str(p)
       tp++ : tokList[tp] = tk : tk="":ch=""
       'return NUMBER;
 End if
'---------------------------------------------------
 If asc(ch)=43 : tp++ : tokList[tp] = ch : ch="" : p++ : End if  ' + plus
 If asc(ch)=45 : tp++ : tokList[tp] = ch : ch="" : p++ : End if  ' - minus
 If asc(ch)=42 : tp++ : tokList[tp] = ch : ch="" : p++ : End if  ' * multiply
 If asc(ch)=47 : tp++ : tokList[tp] = ch : ch="" : p++ : End if  ' / divide
 If asc(ch)=40 : tp++ : tokList[tp] = ch : ch="" : p++ : End if  ' ( Lparen
 If asc(ch)=41 : tp++ : tokList[tp] = ch : ch="" : p++ : End if  ' ) Rparen
 If asc(ch)=44 : tp++ : tokList[tp] = ch : ch="" : p++ : End if  ' , comma
 If asc(ch)=58 : tp++ : tokList[tp] = ch : ch="" : p++ : End if  ' : colon
 If asc(ch)=60 : tp++ : tokList[tp] = ch : ch="" : p++ : End if  ' < less
 If asc(ch)=61 : tp++ : tokList[tp] = ch : ch="" : p++ : End if  ' = equal
 If asc(ch)=62 : tp++ : tokList[tp] = ch : ch="" : p++ : End if  ' > more(greater)
 If asc(ch)=91 : tp++ : tokList[tp] = ch : ch="" : p++ : End if  ' [ Lbracket
 If asc(ch)=93 : tp++ : tokList[tp] = ch : ch="" : p++ : End if  ' ] Rbracket

 'elseif...
 'End if
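'any visible character still held in ch here was not consumed above; stop instead of looping forever
IF ASC(ch)>32: print "Unknown token!-[" +ch +" ]-Exit...": RETURN 0: END IF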

wend
return tp
end sub

'call tokenizer..tested(ident,numbers)
int tn: tn = tokenizer(code) : print "number of tokens:" + str(tn)
for n = 1 to tn : bf = bf + tokList[n] + crlf : next n
print  bf
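
For comparison, the same token classes (identifier, number, quoted string, single-character operators and brackets) can be sketched in Python rather than OxygenBasic. This is a hedged illustration, not a port: the names SINGLE_CHARS and tokenize are hypothetical, and unlike the program above it raises an error for an unclosed quote at the end of the input and for any character it does not recognise.

```python
# Illustrative sketch only; SINGLE_CHARS and tokenize are hypothetical names.
SINGLE_CHARS = "+-*/=<>()[],:"


def tokenize(src):
    """Split src into identifiers, numbers, quoted strings and single-character tokens."""
    tokens = []
    p = 0
    while p < len(src):
        ch = src[p]
        if ch in " \t":                       # skip blanks and tabs
            p += 1
        elif ch == '"':                       # quoted string: keep the text between the quotes
            end = src.find('"', p + 1)
            if end == -1:
                raise ValueError("unclosed quote at position %d" % p)
            tokens.append(src[p + 1:end])
            p = end + 1
        elif "a" <= ch <= "z":                # identifier: [a-z][a-z0-9]*
            start = p
            while p < len(src) and ("a" <= src[p] <= "z" or "0" <= src[p] <= "9"):
                p += 1
            tokens.append(src[start:p])
        elif "0" <= ch <= "9":                # number: a digit followed by digits or dots
            start = p
            while p < len(src) and ("0" <= src[p] <= "9" or src[p] == "."):
                p += 1
            tokens.append(src[start:p])
        elif ch in SINGLE_CHARS:              # operators, brackets, comma, colon
            tokens.append(ch)
            p += 1
        else:                                 # anything else always ends the scan
            raise ValueError("unknown character %r at position %d" % (ch, p))
    return tokens


code = 'arr[10]: func(a,b): var1+ 0.5*7: str s="micro"'
print(len(tokenize(code)), tokenize(code))
```

Using one if/elif chain means exactly one branch consumes input on each pass, so the scan position always advances or the function stops with an error.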
Title: Re: Token engine concept
Post by: Aurel on March 20, 2019, 09:32:34 AM
In addition to the tokenizer example above, here is a very simple recursive descent evaluator for math tokens.
At first I tried to evaluate expr() and term() with a while loop, and that worked, but without the loops it
did not work: the recursion simply stopped at the last operator.
But when I added a check with the instr() function in factor(), the recursion continued..

The program is just for testing purposes because it is very primitive.
By combining these two programs you can build a primitive expression evaluator which uses tokens.

Code: [Select]
'recursive descent token evaluator
#lookahead
int tc=0 : string token,look
string tokens[8]
tokens[1] = "2"
tokens[2] = "*"
tokens[3] = "("
tokens[4] = "10"
tokens[5] = "-"
tokens[6] = "2"
tokens[7] = ")"
'tokens[8] = "+"
'-----------------------------------------------------
sub gettok()
tc++
token = tokens[tc] : look = tokens[tc+1]
print "TOKEN: " + token + "  ,TC: " + str(tc)
if tokens[tc+1] <> "" then return
end sub
'----------------------------------------------------
sub expr() as float
float v=0.0
If token = "-"
 v = -(term())
else
 v = term()
end if
 
'while token = "+" or token = "-"
'while look <> ""
if token = "+": gettok() : v = v + term(): end if
if token = "-": gettok() : v = v - term(): end if

'wend


return v
end sub
'---------------------------------------------------
sub term() as float
float v
v = factor()

'while token = "*" or token = "/"
if token = "*": gettok() : v = v * factor(): end if
if token = "/": gettok() : v = v / factor(): end if
'wend

return v
end sub
'-------------------------------------------------------

sub factor() as float
float v

if asc(token)>47  and asc(token)<58 'nums
v = val(token) : gettok()
 if instr("+-*/",look) = 0
      return v
 end if 
end if


if asc(token)=40 and asc(token)<>41 'match (...)
gettok() : v = expr() : gettok()
end if


return v
end sub

'execute-----------------------------------------------------
gettok()'start
float res = expr()
print "RESULT=" + str(res)
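
As a point of comparison, here is a minimal Python sketch (not OxygenBasic) of a recursive descent evaluator over a token list, using the while-loop form of expr() and term() so that chains such as 1 + 2 - 3 are handled. The names evaluate, peek and advance are hypothetical, and treating unary minus inside factor() is an assumption of this sketch rather than what the program above does.

```python
# Illustrative sketch only; evaluate, peek and advance are hypothetical names.
def evaluate(tokens):
    """Evaluate a list of string tokens such as ["2", "*", "(", "10", "-", "2", ")"]."""
    pos = 0

    def peek():
        # Current token, or "" once the list is exhausted.
        return tokens[pos] if pos < len(tokens) else ""

    def advance():
        # Return the current token and move to the next one.
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():
        # expr := term (("+" | "-") term)*
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():
        # term := factor (("*" | "/") factor)*
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= factor()
            else:
                value /= factor()
        return value

    def factor():
        # factor := number | "-" factor | "(" expr ")"
        tok = advance()
        if tok == "-":
            return -factor()
        if tok == "(":
            value = expr()
            if advance() != ")":
                raise ValueError("expected ')'")
            return value
        return float(tok)        # raises ValueError if tok is not a number

    result = expr()
    if pos != len(tokens):
        raise ValueError("unexpected token %r" % peek())
    return result


print(evaluate(["2", "*", "(", "10", "-", "2", ")"]))   # 16.0
```

Each loop keeps consuming operators at its own precedence level, which is why no extra lookahead check is needed in factor().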