Oxygen Basic

Information => Open Forum => Topic started by: JRS on July 24, 2011, 08:35:33 PM

Title: O2 Source ?
Post by: JRS on July 24, 2011, 08:35:33 PM
Charles,

Where is the maco() function defined? I'm looking over o2sema.bas and see this function used a lot, but I couldn't find where it's defined.

John
Title: Re: O2 Source ?
Post by: Charles Pegge on July 25, 2011, 07:53:05 AM

Hi John,

maco(record,field) is a database record array for all defined entities. This includes keywords, types, variables, functions, classes, macros, everything. That is why you see it very frequently.

Most of these things are defined in o2glob.bas.

Very brave of you to examine the source code  ;D

Charles
Title: Re: O2 Source ?
Post by: JRS on July 25, 2011, 07:59:32 AM
Quote
Very brave of you to examine the source code

I'm just trolling for show stoppers before I make any serious attempts at converting O2 to SB.

It is going to take a bit more work to determine what is an array and what is a function/sub when both use (). I guess I'm spoiled with SB using [] for arrays.


Title: Re: O2 Source ?
Post by: Charles Pegge on July 25, 2011, 09:34:25 AM
The o2lex functions are probably the most challenging because they are mainly in assembler. What they do is quite simple but they have to be fast.
Title: Re: O2 Source ?
Post by: JRS on July 25, 2011, 10:06:26 AM
As long as this is nothing more than converting text to a binary file format with no external tools, things should go smoothly. What I have seen so far isn't that hard to convert to SB.

I'm counting on the move to Linux being a run-time code tweak, with the logic for converting O2 Basic code to its final executable state not changing much. With that said, I should be able to create an O2 compiler framework in SB and generate platform-specific executables from any OS that SB runs under. (Windows, Linux and Mac 32/64 bit)

Things SB programmers don't need to be concerned with and learn to appreciate.


I could go on but these are at the top of the list.
Title: Re: O2 Source ?
Post by: Charles Pegge on July 25, 2011, 03:15:54 PM

With 757K of source code it is a major project! I wish I could make it shorter.

Quite a number of the functions written in Assembler could be rewritten in Basic with minimal loss of performance.

Select / switch / case constructs produce very efficient code - as good as doing the same in Assembler.

Charles
Title: Re: O2 Source ?
Post by: JRS on July 25, 2011, 03:25:32 PM
I'm not looking at changing any of the flow of O2 IF I decide to take a stab at converting O2 to SB. I'm assuming that it really doesn't matter if the text is Basic or ASM: your assembler/linker does its magic and out pops an executable. It's a string processing job as I see it. If I'm missing something, please let me know so I don't waste a bunch of time on this effort.
Title: Re: O2 Source ?
Post by: Charles Pegge on July 25, 2011, 03:51:16 PM

Yes, you are right John. It's a string job all the way down to machine code and the PE file (or ELF file for Linux). The only thing you can't do in SB is direct execution in memory.

I do it in these stages:

Basic to Assembler
Assembler to machine script
machine script to binary

It is easier to keep everything in text form until the final stage, though it is technically possible to go directly down to binary.

I noticed that SB does not have MKL() and similar functions though these are mentioned as planned in the manual. You will have to construct binary encodings using asc() / chr().

Charles
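
For readers who have not met MKL(): it packs a 32-bit integer into a four-character string. The following is a minimal sketch of building the same encoding one byte at a time, in the spirit of the asc() / chr() approach described above. It is written in Python rather than SB, the names mkl and cvl are borrowed purely for illustration, and little-endian byte order is assumed because the target is x86.

```python
import struct

def mkl(value):
    """Pack a 32-bit integer into 4 little-endian bytes, one byte at a time."""
    value &= 0xFFFFFFFF  # wrap negatives into the unsigned 32-bit range
    return bytes((value >> (8 * i)) & 0xFF for i in range(4))

def cvl(data):
    """Inverse: rebuild the unsigned value from 4 little-endian bytes."""
    return sum(byte << (8 * i) for i, byte in enumerate(data[:4]))

# Cross-check against the standard library's packer.
assert mkl(0x00905A4D) == struct.pack("<I", 0x00905A4D)
print(mkl(0x00905A4D).hex())  # 4d5a9000
```

The example value is the DOS signature constant used later in this thread: packed little-endian, it starts with the characters MZ.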
Title: Re: O2 Source ?
Post by: JRS on July 25, 2011, 03:59:41 PM
Quote
I noticed that SB does not have MKL() and similar functions though these are mentioned as planned in the manual. You will have to construct binary encodings using asc() / chr().

I'm thinking it would be easier just to create a SB extension module in C to deal with any low level stuff that would be a pain in the ass in Basic to do.

Where do you suggest I start with the conversion?

Title: Re: O2 Source ?
Post by: Charles Pegge on July 25, 2011, 04:32:17 PM

I would start by investigating Oxygen.bas. It will give you a feel for how the compiler is organised.

For the self-compile I already had the run-time library RTL32.inc (from o2runt). Then I ported o2link, then o2glob and o2lexi. These 3 files are sufficient to make the machine script to binary stage. Then the other files were ported working down from the top of list and ending with o2assm. This is the least bumpy ride for test and debug.

The major part of the effort is testing all the components and flushing out bugs. Some of them, in my experience will be quite subtle and won't be obvious by looking at the code alone.

Charles
Title: Re: O2 Source ?
Post by: JRS on July 25, 2011, 06:47:33 PM
Quote
The only thing you can't do in SB is direct execution in memory.

 ???  I remember when there was a SBO2 directory out of examples, executing O2 scripts in memory.

I think SB with its internal preprocessor support could make a nice debugger.

My interest in this fork of O2 is to provide a means to roll your own compiler from anywhere for anything.

Scriptpiler  :D
Title: Re: O2 Source ?
Post by: JRS on July 25, 2011, 09:32:16 PM
Charles,

Do you see any problems with the asm/end asm being a string until the assembler looks at it?

Code: [Select]
 asm = """
    mov ecx,[p]                        'ADDRESS OF CODE
    cmp dword ptr [ecx],&h00905A4D     'DOS  `MZ` SIGNATURE: `MZ` 90 00
    jnz npef                           'SKIP IF NOT PE
    add ecx,&hff0                      'MOVE o2 ENTRY POINT
    npef:                              '
    mov ebx,[psysfuns]                 'BASE ADDRESS OF RUNTIME FUNCTION TABLE
    call ecx                           'CALL CODE
    mov [function],eax                 'RETURN DATA
  """

I understand that within the ASM code there may be variable references defined by the compiler Basic code.

Code: [Select]
    npef:                              '
    mov ebx,[""" & psysfuns & """]  'BASE ADDRESS OF RUNTIME FUNCTION TABLE
    call ecx                                  'CALL CODE

Title: Re: O2 Source ?
Post by: Charles Pegge on July 25, 2011, 10:24:35 PM
Yes the assembler receives the Basic output as a single string of text.

Each variable is converted to a register and an offset like [ebx+100] before it is passed to the assembler. [ebx+100] representing the location of the variable.

Charles
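
The substitution described above can be pictured as a plain text rewrite. This is only a sketch in Python under stated assumptions: the symbol table, its offsets, and the function name resolve_variables are all hypothetical and are not taken from the O2 source.

```python
import re

# Hypothetical symbol table: variable name -> (base register, byte offset).
SYMBOLS = {"psysfuns": ("ebx", 100), "p": ("ebp", 8)}

def resolve_variables(asm_text, symbols=SYMBOLS):
    """Replace [name] operands with [register+offset] before assembly."""
    def substitute(match):
        name = match.group(1)
        if name not in symbols:
            return match.group(0)  # leave registers and unknown names alone
        register, offset = symbols[name]
        return f"[{register}+{offset}]"
    return re.sub(r"\[(\w+)\]", substitute, asm_text)

source = "mov ecx,[p]\nmov ebx,[psysfuns]\ncall ecx\nmov eax,[ecx]"
print(resolve_variables(source))
```

Operands such as [ecx] pass through unchanged because they are not in the table, so the assembler stage only ever sees registers and numeric offsets.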
Title: Re: O2 Source ?
Post by: JRS on July 26, 2011, 09:36:30 PM
Charles,

Unfortunately O2 is too Windows specific at this time to attempt a meaningful translation to SB. There is still too much FreeBASIC specific code in the source to ignore as you transition to self compile.

I may have another look at this when you have a 64 bit Linux version to play with.

As much as I like what you're doing with O2, resorting to Windows as the only target is too debilitating.

John

 
Title: Re: O2 Source ?
Post by: Charles Pegge on July 26, 2011, 10:13:47 PM
Hi John,

These are the core calls to the operating system.

Code: OxygenBasic
'def LoadLibrary           [ebx+024]
'def GetProcAddress        [ebx+040]
'def FreeLibrary           [ebx+032]
 def SysAllocStringByteLen [ebx+160]
 def SysFreeString         [ebx+168]
 def GetModuleHandle       [ebx+440]
 def GetGetCommandLine     [ebx+448]
 def GetExitCodeProcess    [ebx+456]
 def ExitProcess           [ebx+464]
 def CreateFile            [ebx+480]
 def ReadFile              [ebx+488]
 def CloseHandle           [ebx+496]
 def MessageBox            [ebx+472]
 def MessageBoxW           [ebx+504]
 +
 WriteFile

Apart from these there is the more complex task of building an ELF header to replace the PE header. o2hdrs.bas deals with this aspect. I need to study some real ELF  headers to get a clear picture of how they are used. But it promises to be a lot cleaner than PE.

And in conjunction with this, resource building will also be totally different. An area I have not researched yet.

Charles
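
To give a feel for how compact the ELF side can be, here is a hedged sketch that packs the fixed 52-byte ELF32 file header for a little-endian i386 executable. The field order follows the standard ELF32 layout. The function name elf32_header and the entry address are illustrative assumptions, and this is not what o2hdrs.bas does.

```python
import struct

def elf32_header(entry, phnum=1):
    """Build the 52-byte ELF32 file header for a little-endian i386 executable."""
    # Magic, then EI_CLASS=1 (32-bit), EI_DATA=1 (LSB), EI_VERSION=1, zero padding.
    e_ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
    return struct.pack(
        "<16sHHIIIIIHHHHHH",
        e_ident,
        2,       # e_type: ET_EXEC
        3,       # e_machine: EM_386
        1,       # e_version
        entry,   # e_entry: virtual address of the first instruction
        52,      # e_phoff: program headers follow this header directly
        0,       # e_shoff: no section header table
        0,       # e_flags
        52,      # e_ehsize
        32,      # e_phentsize
        phnum,   # e_phnum
        0,       # e_shentsize
        0,       # e_shnum
        0,       # e_shstrndx
    )

header = elf32_header(entry=0x08048054)  # entry address is an arbitrary example
print(len(header), header[:4])
```

A runnable file would still need at least one program header and the code itself, so this covers only the first piece of the job.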





Title: Re: O2 Source ?
Post by: kryton9 on July 26, 2011, 10:58:18 PM
There are more than those, Charles, if you start to think about the windows, console and other API calls.

I wish I had a better understanding of the very low level stuff to help out. But I think finding a cross-platform approach would be the way to go. As you see by your lines of code, and as Eros has pointed out with how big thinBasic has gotten, it just gets harder and harder to port it over later.

But what options does one have to make it cross platform, I don't know anymore. I have been researching this topic and the more I attempt to learn the more confused I get :)

Java and Mono are just in time compilers. I have seen via the Java Monkey Game Engine that you can compile to Windows, Linux and Mac. On windows you can make an executable file with the engine.
But via just Java, I have not been able to do that.

Mono, is .net but cross platform. I never thought the guys working on Mono would come as far as they have. Their IDE is even really nice.

The only one I see that is a real compiler rather than a just-in-time compiler is Free Pascal. But again, you need to find and use cross-platform libraries. gcc can work too if only cross-platform libraries are used and you are a wizard at writing makefiles; it is not easy.

QT, I have not played with long enough to make a judgement.

What are you guys' thoughts? What options does Charles have?

Title: Re: O2 Source ?
Post by: JRS on July 26, 2011, 11:00:06 PM
Quote
I noticed that SB does not have MKL() and similar functions though these are mentioned as planned in the manual. You will have to construct binary encodings using asc() / chr().

A new user on the ScriptBasic forum reminded me that the s = t::ArrayToString(Array) T extension module function also works with LONG and REAL numbers. This should return in a string the binary (internal) value.

Code: [Select]
INCLUDE t.bas

a = 1
b = 1.0

sl_a = t::ArrayToString(a)
sr_b = t::ArrayToString(b)

PRINT HTA(sl_a),"\n"
PRINT HTA(sr_b),"\n"

END

FUNCTION HTA(arg)
  FOR x = 1 TO LEN(arg)
    z &= HEX(ASC(MID(arg,x,1)))
  NEXT
  HTA = z
END FUNCTION

I think it would be better to put my efforts into the O2 assembler and linker first and then work backwards. There is too much framework being defined for O2 that SB doesn't need to make the translation from O2 code to the pre-assembly state.
Title: Re: O2 Source ?
Post by: jcfuller on July 27, 2011, 02:18:57 AM
Quote
Hi John,
Apart from these there is the more complex task of building an ELF header to replace the PE header. o2hdrs.bas deals with this aspect. I need to study some real ELF  headers to get a clear picture of how they are used. But it promises to be a lot cleaner than PE.

And in conjunction with this, resource building will also be totally different. An area I have not researched yet.

Charles



Charles,
  You might find some gems of information in the JWasm source?
http://www.japheth.de/JWasm.html
James
Title: Re: O2 Source ?
Post by: Charles Pegge on July 27, 2011, 08:58:28 AM

Thanks James,

I love the list of "Hello World"s. This is very useful information.

Here is the one for Linux32

Code: OxygenBasic
;--- "hello world" for Linux which uses int 80h.
;--- assemble: jwasm -Fo=Linux1.o Linux1.asm
;--- link:     wlink format ELF runtime linux file Linux1.o name Linux1.

    .386
    .model flat

stdout    equ 1
SYS_EXIT  equ 1
SYS_WRITE equ 4

    .data

string  db 10,"Hello, world!",10

    .code

_start:

    mov ecx, offset string
    mov edx, sizeof string
    mov ebx, stdout
    mov eax, SYS_WRITE
    int 80h
    mov eax, SYS_EXIT
    int 80h

    end _start

Charles
Title: Re: O2 Source ?
Post by: Peter on July 27, 2011, 12:14:16 PM
Here is my version.


Section .Data
msg      db "Hello World!"
msglen  equ 12

Section .Text
global  _start

_start:  mov edx,msglen
           mov ecx,msg
           mov ebx,1
           mov eax,4
           int 0x80
   
           mov ebx,0
           mov eax,1
           int 0x80
Title: Re: O2 Source ?
Post by: Charles Pegge on July 27, 2011, 05:17:49 PM

This is how I would implement print in Oxygen's Linux run-time library:

Code: OxygenBasic
function print cpu (string s)
  asm
  push ebx
                  'string ptr in ecx
  mov edx,[ecx-4] 'length
  mov ebx,1       'stdout
  mov eax,4       'SYS_WRITE
  int 0x80
  pop ebx
  end asm
end function

print "Hello World!"

Charles
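
The `mov edx,[ecx-4]` line relies on the string's length being stored in the four bytes just before its first character, the same layout a Windows BSTR uses. A small Python sketch of that layout follows. The helper names are hypothetical, and the pointer is modelled as an index into a byte buffer.

```python
import struct

def make_length_prefixed(text):
    """Return (buffer, pointer): a 4-byte length, then the characters.

    The pointer indexes the first character, so the length sits at pointer - 4.
    """
    data = text.encode("ascii")
    buffer = struct.pack("<I", len(data)) + data
    return buffer, 4

def string_length(buffer, pointer):
    """Read the length the way `mov edx,[ecx-4]` would."""
    return struct.unpack_from("<I", buffer, pointer - 4)[0]

buffer, ptr = make_length_prefixed("Hello World!")
length = string_length(buffer, ptr)
print(length, buffer[ptr:ptr + length])
```

Because the length travels with the string, the write call needs no scan for a terminating null.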
Title: Re: O2 Source ?
Post by: JRS on July 27, 2011, 06:14:03 PM
Quote
This is how I would implement print in Oxygen's Linux run-time library:

Sounds like a commitment to Linux to me.  ;D
Title: Re: O2 Source ?
Post by: Charles Pegge on July 27, 2011, 06:38:14 PM
Well certainly John, If Linux is that simple!


Kent
Quote
There are more than those, Charles, if you start to think about the windows, console and other API calls.

Those 16 OS functions are the essentials required to run the Oxygen core in Windows. That is bloated in my opinion :) but it is what Windows requires. The list for Linux will be quite similar.

For the GUI I still advocate OpenGL as the best means of getting uniformity across the platforms and OpenGL code looks the same across many different languages. So it is easy to port.

Charles


Title: Re: O2 Source ?
Post by: JRS on July 27, 2011, 10:41:50 PM
Charles,

I noticed ASM code in the assembler and linker. That means you have to be using external tools. Am I seeing old FreeBASIC code and not looking at the latest self compile version that you may not have released yet?

John
Title: Re: O2 Source ?
Post by: Charles Pegge on July 28, 2011, 12:54:12 AM
John,

Those Asm routines you see in o2link are for converting decimal and hexadecimal numbers into binary and patching them directly into the binary output buffer t. They perform the equivalent of first converting the text into a value, then performing binary conversions with chr(), mkshort(), mklong(), mkquad(), mksingle() or mkdouble() as required.

o2link is the first one to go for. It is the original o2 machine script encoder. Hardcore programmers who think that Assembler is a superfluous luxury will require nothing more  ;D


Charles
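
As a rough illustration of that text-to-binary step, here is a Python sketch that encodes a small subset of machine-script tokens. The token prefixes are taken from the "Hello World!" machine script posted later in this thread, but their meanings are inferred from that listing rather than from the o2link source. Anything outside the subset (labels, padding directives and so on) is not handled and raises an error.

```python
import re
import struct

TOKEN = re.compile(r"`([^`]*)`|(\S+)")

def encode_tokens(script):
    """Encode a small subset of machine-script tokens into bytes.

    Assumed meanings, inferred from the listing in this thread:
      `text`  literal characters      4D     one raw hex byte
      hwNNNN  hex 16-bit word         hlNNNN hex 32-bit long
      nwNNNN  decimal 16-bit word     nlNNNN decimal 32-bit long
    """
    out = bytearray()
    for line in script.splitlines():
        code = line.split("'", 1)[0]  # text after ' is treated as a comment
        for literal, word in TOKEN.findall(code):
            if word == "":
                out += literal.encode("ascii")
            elif word[:2] in ("hw", "nw", "hl", "nl"):
                base = 16 if word[0] == "h" else 10
                fmt = "<H" if word[1] == "w" else "<I"
                out += struct.pack(fmt, int(word[2:], base))
            else:
                out.append(int(word, 16))  # raw byte; unknown tokens raise ValueError
    return bytes(out)

print(encode_tokens("`PE` 00 00\nhw014c\nhw0005 'sections\nnl2057").hex())
```

Keeping the encoder as one loop over text tokens makes it easy to test each token type in isolation before moving on to labels and relocations.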
Title: Re: O2 Source ?
Post by: JRS on July 28, 2011, 12:54:35 AM
Quote
Well certainly John, If Linux is that simple!

I figure if Wine can fake a Windows environment and run a wide array of software (including O2), it must not make a lot of difference when you're dealing directly with the CPU in an environment like ASM.
Title: Re: O2 Source ?
Post by: JRS on July 28, 2011, 01:11:35 AM
Quote
o2link is the first one to go for.

Once I convert it to SB, can I test to see if it works? Can O2 provide the string I need to link?

Title: Re: O2 Source ?
Post by: Charles Pegge on July 28, 2011, 12:24:13 PM
Hi John

One obstacle to doing this:

The Assembler produces a script which defines the memory image to be linked and o2link turns this into binary. But this memory image must then be compacted and post-processed to the PE file format.

This is done with the MakePE function in o2hdrs.bas.

Oxygen can emit a machine script for the memory image but not for the PE file itself.

This means you will need to port the MakePE function and its dependencies as well as o2link then feed the output of the linker into this function to generate the PE file.

I am refactoring MakePE right now and will post an update soon.

Charles

Here is a machine script for "Hello World!"
Code: [Select]

`MZ`  90 00
03 00 00 00
04 00 00 00
FF FF 00 00
B8 00 00 00
00 00 00 00
40 00 00 00
$20
hle8
0E 1F BA 0E
00 B4 09 CD
21 B8 01 4C
CD 21
`This program cannot be run in DOS mode.` 0d 0a 00
/e8
`PE` 00 00
hw014c
hw0005
hl4E331D62
hl0
hl0
hw00e0
hw0102
hw010b 'MAGIC
02
00
hl0
hl0
hl00005000
ga _main_
ga _code_
 ga base_of_data
hg00400000
hl1000
hl200
hw4
hw0
hw1
hw0
hw4
hw0
hl0
hl5000
hl400
hl0
hw2
hw0
hg00100000
hg00001000
hg00100000
hg00001000
hl0
hl10
hl0 hl0
ga imports hl200
ga resources hl0
$68
`.text`    00 00 00 hl00000000  ga _code_    hl0000  $10   hl60000020
`.data`    00 00 00 hl40000000  ga _data     hl0200  $10   hlc0000040
`.bss`  00 00 00 00 hl00005000  ga bssdata   hl0000  $10   hlc0000080
`.idata`      00 00 hl02000000  ga imports   hl0200  $10   hlc0000040
`.rsrc`    00 00 00 hl02000000  ga resources hl0200  $10   hl40000040
/0ff0 e9 gf _o2_
/+1000
'  
._code_                         '  ._code_
._main_                         '  ._main_
 E8 gl get_bss                  '  call get_bss
 8B F8                          '  mov edi,eax
 E8 gl get_iat                  '  call get_iat
 8B D8                          '  mov ebx,eax
 89 9F 58 02 00 00              '  mov [edi+600],ebx
 6A 00                          '  push 0
 FF 53 0C                       '  call [ebx+12]
 89 87 60 02 00 00              '  mov [edi+608],eax
 9b db e3                       '  finit
 E8 gf get_oxygen               '  call fwd get_oxygen
 83 F8 00                       '  cmp eax,0
 0F 84 gf nend                  '  jz fwd nend
 FF D0                          '  call eax
 8B D8                          '  mov ebx,eax
 E8 gf _o2_                     '  call fwd _o2_
 E8 gl get_bss                  '  call get_bss
 8B F8                          '  mov edi,eax
 8B 9F 58 02 00 00              '  mov ebx,[edi+600]
 8B 87 68 02 00 00              '  mov eax,[edi+616]
 83 F8 00                       '  cmp eax,0
 0F 84 gf noxyfree              '  jz fwd noxyfree
 50                             '  push eax
 FF 53 08                       '  call [ebx+8]
.noxyfree                       '  .noxyfree
.nend                           '  .nend
 6A 00                          '  push 0
 8B C4                          '  mov eax,esp
 50                             '  push eax
 FF B7 60 02 00 00              '  push [edi+608]
 FF 53 14                       '  call [ebx+20]
 FF 53 18                       '  call [ebx+24]
.get_iat                        '  .get_iat
 E8 gf here                     '  call fwd here
.here                           '  .here
 58                             '  pop eax
 81 E8 ga here                  '  sub eax,here
 81 C0 ga import_address_table  '  add eax,import_address_table
 C3                             '  ret
.get_bss                        '  .get_bss
 E8 gf here                     '  call fwd here
.here                           '  .here
 58                             '  pop eax
 81 E8 ga here                  '  sub eax,here
 81 C0 ga bssdata               '  add eax,bssdata
 C3                             '  ret
.get_oxygen                     '  .get_oxygen
 57                             '  push edi
 68 66 67 00 00                 '  push 0x00006766
 68 65 6E 2E 63                 '  push 0x632e6e65
 68 6F 78 79 67                 '  push 0x6779786f
 8B C4                          '  mov eax,esp
 6A 00                          '  push 0
 68 80 00 00 00                 '  push 128
 6A 03                          '  push 3
 6A 00                          '  push 0
 6A 01                          '  push 1
 68 00 00 00 80                 '  push 0x80000000
 50                             '  push eax
 FF 53 1C                       '  call [ebx+28]
 83 C4 0C                       '  add esp,12
 83 F8 00                       '  cmp eax,0
 0F 84 gf nconfig               '  jz fwd nconfig
 8B F0                          '  mov esi,eax
 81 EC 04 08 00 00              '  sub esp,2052
 8B FC                          '  mov edi,esp
 C7 07 00 00 00 00              '  mov [edi],0
 6A 00                          '  push 0
 8B C4                          '  mov eax,esp
 6A 00                          '  push 0
 50                             '  push eax
 68 00 08 00 00                 '  push 2048
 57                             '  push edi
 56                             '  push esi
 FF 53 20                       '  call [ebx+32]
 56                             '  push esi
 FF 53 24                       '  call [ebx+36]
 8B CF                          '  mov ecx,edi
 FF C9                          '  dec ecx
 (                              '  (
 FF C1                          '  inc ecx
 8A 01                          '  mov al,[ecx]
 80 F8 20                       '  cmp al,32
 0F 87 rl                       '  ja repeat
 )                              '  )
 C7 01 00 00 00 00              '  mov [ecx],0
 57                             '  push edi
 FF 13                          '  call [ebx]
 81 C4 08 08 00 00              '  add esp,2056
 83 F8 00                       '  cmp eax,0
 0F 85 gf ndll                  '  jnz fwd ndll
.nconfig                        '  .nconfig
 68 6C 6C 00 00                 '  push 0x00006c6c
 68 65 6E 2E 64                 '  push 0x642e6e65
 68 6F 78 79 67                 '  push 0x6779786f
 8B C4                          '  mov eax,esp
 50                             '  push eax
 FF 13                          '  call [ebx]
 83 C4 0C                       '  add esp,12
 83 F8 00                       '  cmp eax,0
 0F 85 gf ndll                  '  jnz fwd ndll
 60                             '  pushad
 83 EC 20                       '  sub esp,32
 8B CC                          '  mov ecx,esp
 C7 01 4F 32 00 00              '  mov [ecx],0x324f
 C7 41 04 6F 78 79 67           '  mov [ecx+4],0x6779786f
 C7 41 08 65 6E 2E 64           '  mov [ecx+8],0x642e6e65
 C7 41 0C 6C 6C 3F 00           '  mov [ecx+12],0x003f6c6c
 6A 00                          '  push 0
 51                             '  push ecx
 83 C1 04                       '  add ecx,4
 51                             '  push ecx
 6A 00                          '  push 0
 FF 53 2C                       '  call [ebx+44]
 83 C4 20                       '  add esp,32
 61                             '  popad
 5F                             '  pop edi
 C3                             '  ret
.ndll                           '  .ndll
 5F                             '  pop edi
 89 87 68 02 00 00              '  mov [edi+616],eax
 8B D0                          '  mov edx,eax
 83 EC 10                       '  sub esp,16
 8B C4                          '  mov eax,esp
 C7 00 6F 32 5F 6C              '  mov [eax],0x6c5f326f
 C7 40 04 69 62 00 00           '  mov [eax+4],0x00006269
 50                             '  push eax
 52                             '  push edx
 FF 53 04                       '  call [ebx+04]
 83 C4 10                       '  add esp,16
 C3                             '  ret
._o2_                           '  ._o2_
 E9 gf _main                    '  jmp fwd _main
._mem                           '  ._mem
 E8 gl get_bss                  '  call get_bss
 8B D8                          '  mov ebx,eax
 C3                             '  ret
._newbuf                        '  ._newbuf
 68 00 10 00 00                 '  push 4096
 6A 00                          '  push 0  
 FF 93 A0 00 00 00              '  call [ebx+160]
 C7 C1 00 10 00 00              '  mov ecx,4096
 8B D0                          '  mov edx,eax
 (                              '  (
 C7 02 00 00 00 00              '  mov [edx],0
 83 C2 04                       '  add edx,4
 83 E9 04                       '  sub ecx,4
 0F 8F rl                       '  jg repeat
 )                              '  )
 89 07                          '  mov [edi],eax
 83 C7 08                       '  add edi,8
 C3                             '  ret
._main                          '  ._main
 55                             '  push ebp
 8B EC                          '  mov ebp,esp
 53                             '  push ebx
 56                             '  push esi
 57                             '  push edi
 8B FB                          '  mov edi,ebx
 E8 gl get_bss                  '  call get_bss
 8B D8                          '  mov ebx,eax
 8B F0                          '  mov esi,eax
 C7 C1 64 00 00 00              '  mov ecx,100
 (                              '  (
 8B 07                          '  mov eax,[edi]
 89 06                          '  mov [esi],eax
 83 C7 04                       '  add edi,4
 83 C6 04                       '  add esi,4
 FF C9                          '  dec ecx
 0F 8F rl                       '  jg repeat
 )                              '  )
 81 C7 6C 06 00 00              '  add edi,1644
 81 C6 6C 06 00 00              '  add esi, 1644
 C7 C1 00 02 00 00              '  mov ecx,512
 (                              '  (
 8B 07                          '  mov eax,[edi]
 89 06                          '  mov [esi],eax
 83 C7 04                       '  add edi,4
 83 C6 04                       '  add esi,4
 FF C9                          '  dec ecx
 0F 8F rl                       '  jg repeat
 )                              '  )
 8D BB 80 02 00 00              '  lea edi,[ebx+640]
 E8 gl _newbuf                  '  call _newbuf
 E8 gl _newbuf                  '  call _newbuf
 E8 gl _newbuf                  '  call _newbuf
 E8 gl _newbuf                  '  call _newbuf
 8D BB A8 02 00 00              '  lea edi,[ebx+680]
 E8 gl _newbuf                  '  call _newbuf
                                '  '_1
 8D 83 gc 1                     '  
 FF 93 68 08 00 00              '  call [ebx+2152]
 C6 C2 01                       '  mov dl,1
 FF 93 80 08 00 00              '  call [ebx+2176]
 C6 C2 01                       '  mov dl,1
 33 C0                          '  xor eax,eax
 FF 93 88 08 00 00              '  call [ebx+2184]
 50                             '  push eax
 C7 C0 01 00 00 00              '  mov eax,1
 FF 93 B0 08 00 00              '  call [ebx+2224]
 58                             '  pop eax
 FF 93 B0 08 00 00              '  call [ebx+2224]
 50                             '  push eax
 C6 C2 01                       '  mov dl,1
 FF 93 C0 09 00 00              '  call [ebx+2496]
 FF 93 38 08 00 00              '  call [ebx+2104]
                                '  '_2
 50                             '  push eax
 FF 93 90 09 00 00              '  call [ebx+2448]
 FF 93 60 08 00 00              '  call [ebx+2144]
 58                             '  pop eax
._end_                          '  ._end_
 5F                             '  pop edi
 5E                             '  pop esi
 5B                             '  pop ebx
 8B E5                          '  mov esp,ebp
 5D                             '  pop ebp
 C3                             '  ret
._error_                        '  ._error_
 51                             '  push ecx
 FF 93 00 08 00 00              '  call [ebx+2048]
 E9 gl _end_                    '  jmp _end_
.get_bss                        '  .get_bss
 E8 gf here                     '  call fwd here
.here                           '  .here
 58                             '  pop eax
 81 E8 ga here                  '  sub eax,here
 81 C0 ga bssdata               '  add eax,bssdata
 C3                             '  ret
 01 `end_of_code` 01 /+1000     '  
 
.base_of_data
._data
 gd 1 "Hello World!" 00 00

00 01 `end_of_data` 01 /+1000
.bssdata
$5000
.imports
ga name_list1 hl0 hl0 ga module_name1 ga address_list1
ga name_list3 hl0 hl0 ga module_name3 ga address_list3
hl0 hl0 hl0 hl0 hl0 ' termination
.module_name1 `KERNEL32.DLL` 00
.module_name3 `USER32.DLL` 00
.name_list1    gg _LoadLibrary gg _GetProcAddress gg _FreeLibrary
gg _GetModuleHandle gg _GetCommandLine gg _GetExitCodeProcess gg _ExitProcess
gg _CreateFile gg _ReadFile gg _CloseHandle hg0
.name_list3    gg _MessageBoxA gg _MessageBoxW hg0
/+8 .import_address_table
.address_list1    gg _LoadLibrary gg _GetProcAddress gg _FreeLibrary
gg _GetModuleHandle gg _GetCommandLine gg _GetExitCodeProcess gg _ExitProcess
gg _CreateFile gg _ReadFile gg _CloseHandle hg0
.address_list3    gg _MessageBoxA gg _MessageBoxW hg0
/+4 ._LoadLibrary        hw0 `LoadLibraryA` 00
/+4 ._GetProcAddress     hw0 `GetProcAddress` 00
/+4 ._FreeLibrary        hw0 `FreeLibrary` 00
/+4 ._GetModuleHandle    hw0 `GetModuleHandleA` 00
/+4 ._GetCommandLine     hw0  `GetCommandLineA` 00
/+4 ._GetExitCodeProcess hw0 `GetExitCodeProcess` 00
/+4 ._ExitProcess        hw0 `ExitProcess` 00
/+4 ._CreateFile         hw0 `CreateFileA` 00
/+4 ._ReadFile           hw0 `ReadFile` 00
/+4 ._CloseHandle        hw0 `CloseHandle` 00
/+4 ._MessageBoxA hw0 `MessageBoxA` 00
/+4 ._MessageBoxW hw0 `MessageBoxW` 00
01 `end_of_imports` 01
/+1000
.resources
hl0 hl0 hl1 hw0 hw1
hl6 hl80000018
hl0 hl0 hl1 hw0 hw1
hl1 hl80000030
hl0 hl0 hl1 hw0 hw1
nl2057 hl48
ga data1 nl32 nl0 hl0
.data1 nw11 w`OxygenBasic` nw3 w`o2h` nw0
/+200
01 `end_of_resources` 01
/+1000
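
Much of the listing above consists of opcode bytes followed by a label reference, such as `E8 gl get_bss`. The sketch below shows, in Python, how a two-pass resolver could turn such references into 32-bit relative displacements. It rests on assumptions: that `gl` and `gf` emit a displacement measured from the end of the 4-byte field, and that each label has a single address. The real listing reuses names such as `.here`, which this toy does not model.

```python
import struct

def link(items):
    """Two-pass resolver for a toy subset of the listing.

    items is a list of:
      bytes             raw opcode bytes, copied as-is
      ("label", name)   marks the current offset (like `.name`)
      ("rel32", name)   4-byte displacement to a label, relative to the
                        end of the field (assumed meaning of `gl` / `gf`)
    """
    labels, offset = {}, 0
    for item in items:                      # pass 1: measure and record labels
        if isinstance(item, bytes):
            offset += len(item)
        elif item[0] == "label":
            labels[item[1]] = offset
        else:
            offset += 4
    out = bytearray()
    for item in items:                      # pass 2: emit bytes and patch
        if isinstance(item, bytes):
            out += item
        elif item[0] == "rel32":
            displacement = labels[item[1]] - (len(out) + 4)
            out += struct.pack("<i", displacement)
    return bytes(out)

program = [
    b"\xE8", ("rel32", "get_bss"),   # call get_bss (forward reference)
    b"\xC3",                         # ret
    ("label", "get_bss"),
    b"\x58",                         # pop eax
    b"\xC3",                         # ret
]
print(link(program).hex())  # e801000000c358c3
```

The first pass exists only so that forward references can be resolved; backward references would work in a single pass.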

Title: Re: O2 Source ?
Post by: Aurel on July 28, 2011, 01:04:05 PM
Quote
Oxygen can emit a machine script for the memory image but not for the PE file itself.
I'm no expert, but I see that the old LibryCompiler, written in VB 6.0, can create a PE file directly.
Title: Re: O2 Source ?
Post by: JRS on July 28, 2011, 01:17:44 PM
Quote
Here is a machine script for "Hello World!"

WOW!

That has to be the most complex 'Hello World" I have ever seen.  :o
Title: Re: O2 Source ?
Post by: Charles Pegge on July 28, 2011, 01:29:39 PM
It will provide good exercise for your program  ;D

This stuff squashes down to a 4k EXE file, of which about 1.5k is padding.

Normally you never see it, so it is quite a shock when all the opcodes and link information are made visible.

Charles
Title: Re: O2 Source ?
Post by: efgee on July 28, 2011, 07:33:38 PM
Quote
Oxygen can emit a machine script for the memory image but not for the PE file itself.
I'm no expert, but I see that the old LibryCompiler, written in VB 6.0, can create a PE file directly.

I think you misunderstood.

Charles talked about different source code files - all part of Oxygen.

The LibryCompiler has different files like assembler, linker etc. as well.


BTW: The LibryCompiler creates a "Hello World!" executable with 2560 bytes.  :P

Title: Re: O2 Source ?
Post by: Charles Pegge on July 28, 2011, 10:07:09 PM
It depends on how many sections are used. Each section is padded out to the nearest 512 bytes, and when loaded into memory, these sections are aligned to the nearest 4K bytes.

The smallest storage unit on disk is typically 4K.

I think the only way to get an indication of a file's actual content is to Zip it. (about 1K in this case).

Charles
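
The padding arithmetic is the usual round-up-to-a-multiple calculation. A short Python sketch, using the 512-byte and 4K figures mentioned above; the code size is a made-up example value.

```python
def align_up(size, alignment):
    """Round size up to the next multiple of alignment (a power of two)."""
    return (size + alignment - 1) & ~(alignment - 1)

FILE_ALIGNMENT = 0x200      # 512 bytes on disk
SECTION_ALIGNMENT = 0x1000  # 4K in memory

code_size = 0x2A7  # hypothetical size of the raw code bytes
print(hex(align_up(code_size, FILE_ALIGNMENT)),
      hex(align_up(code_size, SECTION_ALIGNMENT)))  # 0x400 0x1000
```

With five sections, each rounded up separately, it is easy to see how a few hundred bytes of real content grows to several kilobytes on disk.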
Title: Re: O2 Source ?
Post by: JRS on July 28, 2011, 10:46:08 PM
Quote
Normally you never see it, so it is quite a shock when all the opcodes and link information are made visible.

(http://t3.gstatic.com/images?q=tbn:ANd9GcQy6xd2whvtb1KWyCjacRscNdpG-9cATfdGIfO9yYfoOhq31cyi)
Title: Re: O2 Source ?
Post by: Aurel on July 29, 2011, 12:18:56 AM
Quote
The LibryCompiler has different files like assembler, linker etc. as well.

Where do you see assembler in the Libry source code ???
There is only pure machine hex code...
Title: Re: O2 Source ?
Post by: Peter on July 29, 2011, 01:55:06 AM
  LOL

  :D  :D   :D
Title: Re: O2 Source ?
Post by: Aurel on July 29, 2011, 05:09:36 AM
LOL 2
 :D :D :D
Title: Re: O2 Source ?
Post by: efgee on July 29, 2011, 10:44:41 AM
Quote
The LibryCompiler has different files like assembler, linker etc. as well.

Where do you see assembler in the Libry source code ???
There is only pure machine hex code...

Well, every time you see stuff like:

AddSectionByte(...
AddSectionWord(...
AddSectionDWord(...

hex numbers are added to different arrays that hold the information for different sections.

If you see stuff like:
AddSectionByte($58, ES_CODE)
you know that this function adds $58 to the program code section.

To continue this example, if you see something like this in assembler code:
pop eax
the assembler adds $58 to the code section (only valid for the intel processors).

As you can see "pop eax" in assembler is the same as AddSectionByte($58, ES_CODE) in the LibryCompiler source code.

So if somebody knows what he is doing it's pretty straightforward... and a lot of work.
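
A minimal sketch of the idea in Python, assuming one byte array per section. Only the name `ES_CODE` and the value $58 come from the post; `ES_DATA`, the numeric section ids and the function names are hypothetical stand-ins, not the LibryCompiler's actual code.

```python
# Section ids: ES_CODE is named in the post, ES_DATA and the values are assumed.
ES_CODE, ES_DATA = 0, 1

# One growing byte array per section, written to the file at the end.
sections = {ES_CODE: bytearray(), ES_DATA: bytearray()}

def add_section_byte(value, section):
    """Append a single byte to the given section."""
    sections[section].append(value & 0xFF)

def add_section_dword(value, section):
    """Append a 32-bit value in little-endian byte order."""
    sections[section] += (value & 0xFFFFFFFF).to_bytes(4, "little")

add_section_byte(0x58, ES_CODE)  # pop eax
add_section_byte(0xC3, ES_CODE)  # ret

print(" ".join(f"{b:02x}" for b in sections[ES_CODE]))  # 58 c3
```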
Title: Re: O2 Source ?
Post by: JRS on July 29, 2011, 02:46:01 PM
Is there a section of O2 binary executable code that is static and in all executables that I could just stitch into the final cut?
Title: Re: O2 Source ?
Post by: Charles Pegge on July 29, 2011, 08:24:20 PM

No, I don't think it would help. There are too many mappings. The best way to test the code initially would be to create a program for viewing the hexadecimal byte output. Then run some very simple test scripts to exercise each instruction.

I reckon testing is by far the greatest task in program development, and the earlier you can test something the easier it is.

Charles
Title: Re: O2 Source ?
Post by: JRS on July 29, 2011, 09:51:05 PM
Talk about taking a leap of faith ...

If I didn't think I might learn something in my journey to being overwhelmed, I would admit this is over my head and move on.

It might be fun trying though.  8)

Quote
The best way to test the code initially would be to create a program for viewing the hexadecimal byte output.

Is the first task on the list to convert THIS (http://www.oxygenbasic.org/forum/index.php?topic=276.msg1785#msg1785).

(http://www.radare.org/img/arale.jpg)

RADARE

Quote
The project aims to create a complete, portable, multi-architecture,
 unix-like toolchain for reverse engineering.

It is composed by an hexadecimal editor (radare) with a wrapped
IO layer supporting multiple backends for local/remote files,
debugger (osx,bsd,linux,w32), stream analyzer, assembler/disassembler
(rasm) for x86,arm,ppc,m68k,java,msil,sparc code analysis modules and
scripting facilities. A bindiffer named radiff, base converter (rax),
shellcode development helper (rasc), a binary information extracter
supporting (pe, mach0, elf, class, ...) named rabin, and a block-based
hash utility called rahash.

This package contains the gtk enabled edition of radare.

Code: [Select]
jrs@laptop:~$ radare -h
radare [options] [file]
  -s [offset]      seek to the desired offset (cfg.seek)
  -b [blocksize]   change the block size (512) (cfg.bsize)
  -i [script]      interpret radare or ruby/python/perl/lua script
  -p [project]     load metadata from project file
  -l [plugin.so]   link against a plugin (.so or .dll)
  -e [key=val]     evaluates a configuration string
  -d [program|pid] debug a program. same as --args in gdb
  -f               set block size to fit file size
  -L               list all available plugins
  -w               open file in read-write mode
  -x               dump block in hexa and exit
  -n               do not load ~/.radarerc and ./radarerc
  -v               same as -e cfg.verbose=false
  -V               show version information
  -u               unknown size (no seek limits)
  -h               this help message
jrs@laptop:~$ radare -L
haret       Read WCE memory ( haret://host:port )
shm         shared memory ( shm://key )
mmap        memory mapped device ( mmap://file )
serial      serial port access ( serial://path/to/dev:speed )
debug       Debugs or attach to a process ( dbg://file or pid://PID )
malloc      memory allocation ( malloc://size )
remote      TCP IO ( listen://:port or connect://host:port )
winedbg     Wine Debugger interface ( winedbg://program.exe )
windbg      Windbg serial port debugger ( windbg:///path/to/socket )
socket      socket stream access ( socket://host:port or socket://./socket.file )
gxemul      GxEmul Debugger interface ( gxemul://program.arm )
bfdbg       brainfuck debugger
gdbwrap     Connect to remote GDB using eresi's gdbwrap (gdbwrap://host:port)
gdb         Debugs/attach with gdb (gdb://file, gdb://PID, gdb://host:port)
gdbx        GDB shell interface 'gdbx://program.exe args' )
posix       plain posix file access
jrs@laptop:~$

Screen Shots (http://www.radare.org/new/?img)

Code: [Select]
jrs@laptop:~/.wine/drive_c/o2h/A36$ radare Oxygen.dll
open ro Oxygen.dll
Generating default ~/.radarerc...
> Importing file information...
[Information]
class=PE32
dll=True
machine=i386
big_endian=False
subsystem=Windows CUI
relocs=False
line_nums=True
local_syms=True
debug=True
number_of_sections=8
baddr=0x10000000
section_alignment=4096
file_alignment=512
image_size=6721536
[Entrypoint]
Memory address: 0x100010c0
> Importing symbols...
8 sections added
TODO
63 imports added
20 symbols added
4096 strings added
0
0
> Analyzing code...
                                                                        
strings: 2398
functions: 445
structs: 0
data_xrefs: 1199
code_xrefs: 1379
[0x100010C0]>
Title: Re: O2 Source ?
Post by: Aurel on July 30, 2011, 01:41:05 AM
Quote
the assembler adds $58 to the code section (only valid for the intel processors).
Oops, I didn't know that...
Anyway, Libry is not a bad example of how to write a native compiler, right?
Title: Re: O2 Source ?
Post by: Charles Pegge on July 30, 2011, 03:27:05 AM
Radare is new to me John. Very interesting.

Here is a piece of code for testing your linker binary output. The first function displays the file in hexadecimal bytes. The second function executes the program directly in memory, bypassing all that PE stuff.

To demonstrate, a program written in machine code is generated. It is six bytes long and returns the long integer 0x01020304 (mov eax,0x01020304 followed by ret).

putfile "t.bin", chr(0xb8) chr(0x4) chr(0x3) chr(0x2) chr(0x1) chr(0xc3)

Code: OxygenBasic
'Create a prog for direct call in mem
'====================================


putfile "t.bin", chr(0xb8) chr(0x4) chr(0x3) chr(0x2) chr(0x1) chr(0xc3)

'====

string prog=getfile "t.bin"

function display(string prog)
'============================
sys a,c,i
string cr=chr(13)+chr(10)
string pr="Program Bytes" cr cr
'
for i=1 to len prog
  a=asc prog,i
  'pad each byte to 2 hex digits
  pr+=mid ("0"+hex(a),-2) " "
  c+=1
  'start a new line after every 16 bytes
  if c>=16 then pr+=cr : c=0
next
'
print pr
end function


function exec(string prog)
'=========================
'
'call the bytes held in the string as a function
sys v=call ?prog
print "Result: 0x" hex v
'
end function


display prog
exec prog

Charles
Title: Re: O2 Source ?
Post by: Peter on July 30, 2011, 04:46:45 AM
Code: [Select]
pr+=mid ("0"+hex(a),-2) " "
pr+=asc ("0"+hex(a),-2) " " 
Title: Re: O2 Source ?
Post by: Charles Pegge on July 30, 2011, 05:00:34 AM

The intention is to produce hex byte codes  padded with a leading 0 to display consistent 2 character codes:

B8 04 03 02 01 C3
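
For comparison, here is a Python sketch of the same padded dump (not the O2 code itself; `hex_dump` is a name chosen for illustration). It also decodes the operand: 0xB8 is `mov eax, imm32`, so the four bytes after it are a little-endian 32-bit value.

```python
import struct

prog = bytes([0xB8, 0x04, 0x03, 0x02, 0x01, 0xC3])

def hex_dump(data, per_line=16):
    """Format bytes as two-digit upper-case hex, per_line bytes per row."""
    lines = []
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        lines.append(" ".join(f"{b:02X}" for b in chunk))
    return "\n".join(lines)

print(hex_dump(prog))  # B8 04 03 02 01 C3

# Read the 32-bit little-endian immediate that follows the 0xB8 opcode.
(value,) = struct.unpack_from("<I", prog, 1)
print(hex(value))  # 0x1020304
```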

Charles
Title: Re: O2 Source ?
Post by: Peter on July 30, 2011, 06:19:50 AM
yes, I have understood this.  :D
Title: Re: O2 Source ?
Post by: efgee on July 30, 2011, 07:40:28 AM
Quote
the assembler adds $58 to the code section (only valid for the intel processors).
Oops, I didn't know that...
Anyway, Libry is not a bad example of how to write a native compiler, right?

It's a bit of a hack but on the other side it's a great learning tool because the code is very small.

Like Oxygen, it's an all-in-one compiler with tokenizer, lexer, parser, assembler and linker, and if you know VB it's easy to read and understand.
(the linker just writes the content of different arrays to a file - so it's not a "real" linker that handles object files)



Title: Re: O2 Source ?
Post by: JRS on July 30, 2011, 09:27:51 PM
Charles,

I'm convinced that Radare is the best tool to create a multi-platform O2 Basic JIT compiler. I think this direction would shorten the multi-platform development cycle and allow others to help/learn. The (graphic) self documenting feature is also a plus.

I would be willing to learn the guts of O2 if Radare was my framework/toolset.

Once I get my arms around how this all works, I might even try to contribute in some way.

John
Title: Re: O2 Source ?
Post by: Charles Pegge on July 30, 2011, 11:39:13 PM

Hi John,

I would use that Oxygen program to test the binary output file  from your linker. I think getting involved with C modules would be too cumbersome.

I had difficulties getting SB to convert hexadecimal strings to values >9

SB
Code: [Select]
'TEST HEXADECIMAL TO BINARY CONVERSION IN SB

  s="B8 04 03 02 01 c3"
  le=len(s)
  i=1
  j=1
  while i<le
    w="0x"+mid(s,i,2)
    print val(w) & " "
    v[j]=chr(w)
    i+=3
    j+=1
  wend

  op=join("",v)

  print "  " & len(op) & " >> "

  for i=1 to len(op)
    print hex(asc(mid(op,i,1))) & " "
  next

  line input z

Charles
Title: Re: O2 Source ?
Post by: JRS on July 31, 2011, 12:04:16 AM
Code: [Select]
s = CHR(0xB8)&CHR(0x04)&CHR(0x03)&CHR(0x02)&CHR(0x01)&CHR(0xC3)


Quote
There is no internal difference between decimal and hexadecimal numbers for ScriptBasic. The lexical analyzer converts both format to internal representation and stores the value of the number. In other words wherever you are allowed to use a decimal number you are allowed to use a hexadecimal number or any radix number as well. You should decide whether to use decimal or hexadecimal or any other radix number for your convenience taking care of BASIC source code readability.
Title: Re: O2 Source ?
Post by: Charles Pegge on July 31, 2011, 02:15:03 AM
Providing the full hex conversion algo on each character: :)

SB Script
Code: OxygenBasic
' TEST HEXADECIMAL TO BINARY CONVERSION
' =====================================


s="B8 04 03 02  01    c3"

'CONVERT TO BINARY
'-----------------

le=len(s)
i=1
j=1
a=0
n=0
while i<=le
  w=asc(mid(s,i,1))
  if w>96 then
    w-=39
  elseif w>64 then
    w-=7
  end if
  w-=48
  if w>=0 then
    a=a*16+w
    n=1
  else
    if n=1 then
      v[j]=chr(a)
      a=0
      j+=1
      n=0
    end if
  end if
  i+=1
wend
if n=1 then
  v[j]=chr(a)
end if


'SAVE BINARY
'-----------

op=join("",v)

open "t.bin" for output as 1
print #1,op
close 1

'LOAD BINARY
'-----------

open "t.bin" for binary as 1
op=input(1000,1)
close 1
'
'DISPLAY
'-------

print "LENGTH=" & len(op) & "  BYTE CONTENT="

for i=1 to len(op)
  s="0" & hex(asc(mid(op,i,1)))
  print right(s,2) & " "
next


line input z
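
The same character-by-character conversion can be sketched in Python. This is an illustration, not a port: `hex_to_bytes` is a hypothetical name, and unlike the script it treats every non-hex character as a separator.

```python
def hex_to_bytes(text):
    """Accumulate hex digits into bytes; any other character ends a byte."""
    out = bytearray()
    acc = 0
    in_number = False
    for ch in text:
        if ch in "0123456789abcdefABCDEF":
            acc = acc * 16 + int(ch, 16)
            in_number = True
        elif in_number:
            out.append(acc & 0xFF)  # separator reached: emit the byte
            acc = 0
            in_number = False
    if in_number:
        out.append(acc & 0xFF)      # flush the last byte
    return bytes(out)

data = hex_to_bytes("B8 04 03 02  01    c3")
print(len(data))  # 6
print(" ".join(f"{b:02X}" for b in data))  # B8 04 03 02 01 C3
```

For well-formed two-digit input, Python's built-in `bytes.fromhex` does the same job in one call.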

Charles
Title: Re: O2 Source ?
Post by: JRS on July 31, 2011, 02:48:55 AM
Using associative arrays keyed by ASM keyword and register, each instruction is assigned its binary string value (in effect, assembled).

Code: [Select]
MODULE ASM

pop{"eax"} = CHR(0x58)
pop{"ebx"} = CHR(0x5B)
pop{"edi"} = CHR(0x5F)
pop{"esi"} = CHR(0x5E)
pop{"ebp"} = CHR(0x5D)

push{"eax"} = CHR(0x50)
push{"edi"} = CHR(0x57)

END MODULE

Do_Stuff = _
ASM::pop{"eax"}   & _
ASM::push{"edi"}  & _
ASM::pop{"edi"}
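
The table-driven idea translates almost directly to dictionaries. The following Python sketch reuses the opcode values from the listing above; the `assemble` helper and its one-mnemonic-one-register line format are assumptions made for the example.

```python
# Single-byte opcodes keyed by register, values as in the SB module above.
POP = {"eax": 0x58, "ebx": 0x5B, "edi": 0x5F, "esi": 0x5E, "ebp": 0x5D}
PUSH = {"eax": 0x50, "edi": 0x57}

TABLE = {"pop": POP, "push": PUSH}

def assemble(lines):
    """Translate lines of the form '<mnemonic> <register>' to machine code."""
    out = bytearray()
    for line in lines:
        mnemonic, reg = line.split()
        out.append(TABLE[mnemonic][reg])
    return bytes(out)

code = assemble(["pop eax", "push edi", "pop edi"])
print(code.hex())  # 58575f
```

This only works for instructions whose whole encoding is one fixed byte; anything with memory operands or immediates needs more than a lookup.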
Title: Re: O2 Source ?
Post by: JRS on July 31, 2011, 03:12:20 AM
Code: [Select]
  open "t.bin" for binary as 1 
  op=input(1000,1) 

Try ...

Code: [Select]
  t_size = filelen("t.bin")
  open "t.bin" for binary as 1 
  op=input(t_size,1) 
Title: Re: O2 Source ?
Post by: JRS on August 01, 2011, 06:36:18 PM
I guess everyone is stunned and speechless about my associative array ASM keyword emulator. Here are a few advantages as I see them.


SB's free form associative arrays should be flexible enough to define the ASM syntax.

ASM::move{"ecx","[ecx]"}

Anyone willing to help me build a ScriptBasic ASM module? (must have good understanding of ASM and Basic)
Title: Re: O2 Source ?
Post by: Charles Pegge on August 02, 2011, 01:11:42 AM

Hi John,

Yes, I've been studying associative arrays. They are an elegant idea but quite expensive to process behind the scenes.

I use quite a few lookup techniques in O2, the main one being hash encoding.

For short lists of around 10-15 words, it is economical to use instr-based procedures, placing the most frequently used words first:

Code: OxygenBasic
  m=instr(" mov inc dec push pop add sub mul div cmp and or xor test xchg movsx movzx ",wd)
          '01  05  09  13  18  22  26  30  34  38  42   46 49  53   58   63    69    75
  select case m...
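
A Python sketch of the same lookup, assuming the word being searched for is wrapped in spaces so that "mov" cannot match inside "movsx" (how `wd` is delimited in O2 is not shown in the post). The returned positions match the numbers in the comment line.

```python
MNEMONICS = " mov inc dec push pop add sub mul div cmp and or xor test xchg movsx movzx "

def lookup(word):
    """Return the 1-based position of ' word ' in the list, or 0 if absent."""
    return MNEMONICS.find(" " + word + " ") + 1

print(lookup("mov"))    # 1
print(lookup("push"))   # 13
print(lookup("movzx"))  # 69
print(lookup("nop"))    # 0
```

The position then serves directly as the case selector, so no separate index table is needed.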

Charles
Title: Re: O2 Source ?
Post by: efgee on August 02, 2011, 09:23:44 AM
@JRS
Haven't looked at the SB source but would it not be easier to change the SB interpreter to a VM?

As long as Oxygen does not have certain things that SB has, you would need to create them yourself, because you need them as native code, don't you?

Title: Re: O2 Source ?
Post by: JRS on August 02, 2011, 09:41:50 AM
The idea is to port O2 to SB as a development environment. A tool to build Basic compilers so to speak.

My first goal is to translate ASM syntax to its binary image format but still keep it readable.

Title: Re: O2 Source ?
Post by: efgee on August 02, 2011, 01:59:57 PM
Then this is what you need:

http://ref.x86asm.net/
Title: Re: O2 Source ?
Post by: Charles Pegge on August 02, 2011, 02:54:54 PM

This is the definitive reference for x86 programming:

Intel® 64 and IA-32 Architectures Software Developer's Manual
Volume 2A: Instruction Set Reference, A-M

Intel® 64 and IA-32 Architectures Software Developer's Manual
Volume 2B: Instruction Set Reference, N-Z


http://www.intel.com/products/processor/manuals/

The Assembler has to combine the various bit patterns to create the right opcodes. There are a huge number of combinations. So if you do not want to make a literal translation of o2assm.bas, I would start off with a small instruction set with the essentials like mov add sub mul div push pop

Assembly code follows the 10%/90% rule that 10% of the instructions are used 90% of the time.

Charles
Title: Re: O2 Source ?
Post by: JRS on August 03, 2011, 12:48:09 AM
Quote
Yes, I've been studying associative arrays. They are an elegant idea but quite expensive to process behind the scenes.

I'm not concerned about lightning speed with this ASM scripting extension to SB. My first step is to create an assembly language module in SB that covers the basics. I can use discrete binary strings for one-off code segments. I should be able to add a little C to the mix and run these ASM images in memory.

SB can already interface seamlessly to C functions. Having an ASM extension module that can execute a binary image in memory created by a script should be interesting.

Title: Re: O2 Source ?
Post by: JRS on August 03, 2011, 06:40:31 PM
Quote
This is the definitive reference for x86 programming:

These are good reference guides, but what I need is the binary translation of the ASM statements and opcodes.
Title: Re: O2 Source ?
Post by: JRS on August 03, 2011, 11:02:26 PM
Here is Charles's contribution from a previous post. I prefixed the ASM instructions with a _ to prevent conflicts with SB.

Code: [Select]
_mov = CHR(0x01)
_inc = CHR(0x05)
_dec = CHR(0x09)
_push = CHR(0x13)
_pop = CHR(0x18)
_add = CHR(0x22)
_sub = CHR(0x26)
_mul = CHR(0x30)
_div = CHR(0x34)
_cmp = CHR(0x38)
_and = CHR(0x42)
_or = CHR(0x46)
_xor = CHR(0x49)
_test = CHR(0x53)
_xchg = CHR(0x58)
_movsx = CHR(0x63)
_movzx = CHR(0x69)

I would love to see a minimal ASM function and the hex display of the binary instructions used that could be called from a C based DLL/so.
Title: Re: O2 Source ?
Post by: Charles Pegge on August 04, 2011, 08:13:37 AM
x86 encoding is quite complex and often irregular due to its ancient history and requirements for backward compatibility.

Each Assembler instruction generates several pieces of information, which are used in conjunction with the operands to generate the correct opcodes.

I have tidied up this table (and fixed 2 bugs relating to 'test' and 'xchg').

Code: OxygenBasic
'GENERAL ARITHMETIC AND LOGIC


  m=instr(" mov inc dec push pop add sub mul div cmp and or xor test xchg movsx movzx ",wd)
          '01  05  09  13  18  22  26  30  34  38  42   46 49  53   58   63    69    75
  if m>0 then
    '
    select case m
    '------------
    case 01 : pn=2 : b1=&h88 : im=&hc6 : ic=0 : ima=2 : axdi=&hb0 : axim=&ha0 : goto mos 'mov
    case 05 : pn=1 : b1=&hfe : im=0    : c=0  : ima=0 :             goto mos  ' inc    ' sb=&h40 not 64 bit mode
    case 09 : pn=1 : b1=&hfe : im=0    : c=1  : ima=0 :             goto mos  ' dec    ' sb=&h48 not 64 bit mode
    case 13 : pn=1 : b1=&hff : im=&h68 : c=6  : ima=1 : sb=&h50 : ic=0 : goto mos ' push
    case 18 : pn=1 : b1=&h8f : im=0    : c=0  : ima=0 : sb=&h58   : goto mos  ' pop
    case 22 : pn=2 : b1=&h00 : im=&h80 : ic=0 : ima=1 :             goto mos  ' add
    case 26 : pn=2 : b1=&h28 : im=&h80 : ic=5 : ima=1 :             goto mos  ' sub
    case 38 : pn=2 : b1=&h38 : im=&h80 : ic=7 : ima=1 :             goto mos  ' cmp
    case 30 : pn=1 : b1=-1   : im=0    : c=4  : ima=1 : axdi=&hf6 : goto mos  ' mul
    case 34 : pn=1 : b1=-1   : im=0    : c=6  : ima=1 : axdi=&hf6 : goto mos  ' div
    case 42 : pn=2 : b1=&h20 : im=&h80 : ic=4 : ima=1 :             goto mos  ' and
    case 46 : pn=2 : b1=&h08 : im=&h80 : ic=1 : ima=1 :             goto mos  ' or
    case 49 : pn=2 : b1=&h30 : im=&h80 : ic=6 : ima=1 :             goto mos  ' xor
    case 53 : pn=2 : b1=&h84 : im=&hf6 : ic=0 : ima=2 :             goto mos  ' test (no di)
    case 58 : pn=2 : b1=&h86 : im=0    :        ima=0 : axdi=&h90 : goto mos  ' XCHG (no di) Exchange Register/Memory with mrm and AL/AX/EAX Register 90+reg
    case 63 : pn=3 : b1=&hbe : im=0    :        ima=0 : b0=&h0f   : goto mos  ' MOVSX Move with Sign-Extend
    case 69 : pn=3 : b1=&hb6 : im=0    :        ima=0 : b0=&h0f   : goto mos  ' MOVZX Move with Zero-Extend
    end select
    'ers="Internal error: Misaligned "+wd:ert=19: goto exita
  end if

Charles
Title: Re: O2 Source ?
Post by: JRS on August 04, 2011, 01:47:41 PM
Wow!

I wonder if I'm getting in over my head on this project. I may just have to be happy with calling C functions (SB extension modules) and oxygen.dll (Windows only) for ASM speed enhancements.

Thanks for the info and saving me from spinning my wheels. I have a whole new respect for O2 and the difficulty in offering a Basic compiler.

Title: Re: O2 Source ?
Post by: efgee on August 04, 2011, 06:39:58 PM
JRS,
if you have time you could take a look at sljit (http://sourceforge.net/projects/sljit/).

It's a jit compiler that takes simplified asm code and is available for the following platforms:
Intel x86-32, AMD x86-64, ARM (Including ARM-v5, ARM-v7 and Thumb2 instruction sets), IBM PowerPC-32, IBM PowerPC-64 and MIPS-32.

Haven't looked into it deeper but maybe it's easier for you to work with this and would give you the benefit that it works on all these platforms.
(As SB obviously works on several OS as well)

bye
Title: Re: O2 Source ?
Post by: JRS on August 04, 2011, 09:55:00 PM
sljit compiled fine on my Ubuntu 11.04 64 bit system.

The fun part will be making a SB SLJIT extension module out of this.  8)

Code: [Select]
jrs@laptop:~/sljit/bin$ ./sljit_test
Generating code for: x86-64
Executable allocator: ok
test1 ok
test2 ok
test3 ok
test4 ok
test5 ok
test6 ok
test7 ok
test8 ok
test9 ok
test10 ok
test11 ok
test12 ok
test13 ok
test14 ok
test15 ok
test16 ok
test17 ok
test18 ok
test19 ok
test20 ok
test21 ok
test22 ok
test23 ok
test24 ok
test25 ok
test26 ok
test27 ok
test28 ok
test29 ok
test30 ok
test31 ok
test32 ok
test33 ok
test34 ok
test35 ok
test36 ok
test37 ok
test38 ok
test39 ok
test40 ok
All tests are passed.
jrs@laptop:~/sljit/bin$

FYI: This test suite executed instantly. Don't blink.  :o

Thanks for the link!

Title: Re: O2 Source ?
Post by: Charles Pegge on August 05, 2011, 02:08:38 PM
Only a small subset of processor opcodes is required for implementing a high level language. One could bypass the assembly code stage and generate the opcodes directly.

This model provides most of the data you would need to go from operator directly to x86 opcodes. (I have not included floating point.)

Code: OxygenBasic
type coding
string nam 'operator name
sys mod    'encoding mode
sys dir    'direct from mem
sys cdi    'second octal code
sys imm    'immediate literal
sys cim    'second octal code
end type

coding c[32]

'=================================================
'        name     mode  direct coder, immed, coder
'        nam      mod , dir  , cdi  , imm  , cim
'=================================================
c[01] <= "load" , 0x1 , 0x8b , -1   , 0xb8 , -1
c[02] <= "stor" , 0x1 , 0x89 , -1   , -1   , -1
c[03] <= "+"    , 0x2 , 0x03 , -1   , 0x81 , 0x0
c[04] <= "-"    , 0x2 , 0x2b , -1   , 0x81 , 0x5
c[05] <= "cmp"  , 0x2 , 0x3b , -1   , 0x81 , 0x7
c[06] <= "*"    , 0x3 , 0xf7 , 0x05 , -1   , -1
c[07] <= "/"    , 0x3 , 0xf7 , 0x07 , -1   , -1
c[08] <= "and"  , 0x4 , 0x23 , -1   , 0x81 , 0x4
c[09] <= "or"   , 0x4 , 0x0b , -1   , 0x81 , 0x1
c[10] <= "xor"  , 0x5 , 0x31 , -1   , 0x81 , 0x6
c[11] <= "++"   , 0x6 , 0xff , 0x0  , -1   , -1
c[12] <= "--"   , 0x6 , 0xff , 0x1  , -1   , -1
c[13] <= "push" , 0x6 , 0x50 , -1   , 0x68 , -1
c[14] <= "pop"  , 0x6 , 0x58 , -1   , -1   , -1
c[15] <= "call" , 0x7 , 0xff , 0x2  , 0xe8 , -1
c[16] <= "jump" , 0x7 , 0xff , 0x5  , 0xe9 , -1
c[17] <= "=="   , 0x8 , -1   , 0x85 , 0x0f , 0x85
c[18] <= "<>"   , 0x8 , -1   , 0x84 , 0x0f , 0x84
c[19] <= ">="   , 0x8 , -1   , 0x82 , 0x0f , 0x8c
c[20] <= "<="   , 0x8 , -1   , 0x87 , 0x0f , 0x8f
c[21] <= ">"    , 0x8 , -1   , 0x86 , 0x0f , 0x8e
c[22] <= "<"    , 0x8 , -1   , 0x83 , 0x0f , 0x8d
'=================================================

Charles
Title: Re: O2 Source ?
Post by: JRS on August 06, 2011, 11:52:25 PM
I started putting together a SB equivalent to your example.

Code: [Select]
ASM[1]{"nam"} = "load"
ASM[1]{"mod"} = CHR(0x1)
ASM[1]{"dir"} = CHR(0x8b)
ASM[1]{"cdi"} = -1
ASM[1]{"imm"} = CHR(0xb8)
ASM[1]{"cim"} = -1

ASM[2]{"nam"} = "stor"
ASM[2]{"mod"} = CHR(0x1)
ASM[2]{"dir"} = CHR(0x89)
ASM[2]{"cdi"} = -1
ASM[2]{"imm"} = -1
ASM[2]{"cim"} = -1

ASM[3]{"nam"} = "+"
ASM[3]{"mod"} = CHR(0x2)
ASM[3]{"dir"} = CHR(0x03)
ASM[3]{"cdi"} = -1
ASM[3]{"imm"} = CHR(0x81)
ASM[3]{"cim"} = CHR(0x0)

ASM[4]{"nam"} = "-"
ASM[4]{"mod"} = CHR(0x2)
ASM[4]{"dir"} = CHR(0x2b)
ASM[4]{"cdi"} = -1
ASM[4]{"imm"} = CHR(0x81)
ASM[4]{"cim"} = CHR(0x5)

ASM[5]{"nam"} = "cmp"
ASM[5]{"mod"} = CHR(0x2)
ASM[5]{"dir"} = CHR(0x3b)
ASM[5]{"cdi"} = -1
ASM[5]{"imm"} = CHR(0x81)
ASM[5]{"cim"} = CHR(0x7)

ASM[6]{"nam"} = "*"
ASM[6]{"mod"} = CHR(0x3)
ASM[6]{"dir"} = CHR(0xf7)
ASM[6]{"cdi"} = CHR(0x05)
ASM[6]{"imm"} = -1
ASM[6]{"cim"} = -1

ASM[7]{"nam"} = "/"
ASM[7]{"mod"} = CHR(0x3)
ASM[7]{"dir"} = CHR(0xf7)
ASM[7]{"cdi"} = CHR(0x07)
ASM[7]{"imm"} = -1
ASM[7]{"cim"} = -1

ASM[8]{"nam"} = "and"
ASM[8]{"mod"} = CHR(0x4)
ASM[8]{"dir"} = CHR(0x23)
ASM[8]{"cdi"} = -1
ASM[8]{"imm"} = CHR(0x81)
ASM[8]{"cim"} = CHR(0x4)

Title: Re: O2 Source ?
Post by: Charles Pegge on August 07, 2011, 07:17:13 AM
Hi John,

 I tried putting each data row into an array and then using splita to separate the row data into strings. But these split units would not convert from hexadecimal to numeric values no matter what I tried. So I think setting flag values directly in a case block, as I do in the o2 assembler, is going to work better.

Charles
Title: Re: O2 Source ?
Post by: JRS on August 07, 2011, 11:52:22 AM
I can better understand the problem you may be having with SB arrays if you can show me an example in O2 Basic of using your base table to generate a binary executable image. It doesn't need to do much other than indicate it worked.

Thanks!



Title: Re: O2 Source ?
Post by: Charles Pegge on August 07, 2011, 12:35:38 PM

This describes the problem John:

ScriptBasic
Code: OxygenBasic
' =================================================
'        name     mode  direct coder, immed, coder
'        nam      mod , dir  , cdi  , imm  , cim
' =================================================
c["load"] =      "0x1 , 0x8b , -1   , 0xb8 , -1"

cc=c["load"]
splita cc by "," to d
dt=ltrim(rtrim(d[1]))
print dt
v=val(dt)
print "   " & v
line input a

There's too much processing involved here anyway so I favour a more direct approach.

Charles
Title: Re: O2 Source ?
Post by: JRS on August 07, 2011, 12:47:02 PM
If I get the big picture here, I need to convert this into a binary string including the -1s to their binary integer format, correct?

 
Title: Re: O2 Source ?
Post by: Charles Pegge on August 07, 2011, 02:26:54 PM
Sample Binary Encoding for + and -

SB
Code: [Select]
 
  nam="+"
  vai=-1
  van="a"
  vav=123
  

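  'convert one hexadecimal digit character to its value 0..15
  function hexval(c)
    w=asc(c)
    if w>96 then
      w-=39
    elseif w>64 then
      w-=7
    end if
    hexval=w-48
  end function

  'encode v as 4 bytes, least significant byte first
  function mklong(v)
    h=right("0000000" & hex(v),8)
    'print h
    mklong=chr(hexval(mid(h,7,1))*16+hexval(mid(h,8,1))) & _
           chr(hexval(mid(h,5,1))*16+hexval(mid(h,6,1))) & _
           chr(hexval(mid(h,3,1))*16+hexval(mid(h,4,1))) & _
           chr(hexval(mid(h,1,1))*16+hexval(mid(h,2,1)))
  end function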
  
  
  if nam="+" then
    'direct or immediate
    if vai<>0 then
      cod=chr(0x03) & chr(0x85) & mklong(vai*4)
    else
     cod=chr(0x81) & chr(0*8) & mklong(vav)
    end if
  elseif nam="-" then
    'direct or immediate
    if vai<>0 then
      cod=chr(0x2b) & chr(0x85) & mklong(vai*4)
    else
     cod=chr(0x81) & chr(5*8) & mklong(vav)
    end if
  end if


  print "ok"


  line input a

Can I put comments on the same line as code?
Direct means from a memory location.
Immediate means a number
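
As a cross-check of the two forms, here is a Python sketch that emits both encodings of an add. `struct.pack` takes the place of `mklong`. The ModRM byte 0xC0 in the immediate form selects eax as a register operand, which is an assumption of this example and differs from the `chr(0*8)` used in the script above.

```python
import struct

def mklong(v):
    """Pack a signed 32-bit integer as four little-endian bytes."""
    return struct.pack("<i", v)

def add_direct(disp):
    # 03 /r with ModRM 0x85: add eax, [ebp + disp32]
    return bytes([0x03, 0x85]) + mklong(disp)

def add_immediate(n):
    # 81 /0 with ModRM 0xC0: add eax, imm32
    return bytes([0x81, 0xC0]) + mklong(n)

def show(code):
    return " ".join(f"{b:02x}" for b in code)

print(show(add_direct(-4)))      # 03 85 fc ff ff ff
print(show(add_immediate(123)))  # 81 c0 7b 00 00 00
```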

Charles

Title: Re: O2 Source ?
Post by: JRS on August 07, 2011, 06:02:47 PM
Quote
Can I put comments on the same line as code?

SB only allows one statement per line. A remark is a statement.

Title: Re: O2 Source ?
Post by: Charles Pegge on August 07, 2011, 07:00:58 PM
The -1s indicate code not available, but they are not relevant when coding directly. In fact the x86 instruction set is so irregular that it is probably easier to deal with each operator as an individual case and not attempt to formulate generic rules. For instance, there is no way of dividing directly by a number. The number has to be preloaded into another register first. Then the edx register has to be set to 0 (or, for the signed idiv, sign-extended from eax with cdq). These things have to be done as a macro before the idiv instruction is given.
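
A small Python sketch of such a macro, emitting the bytes for a signed division of eax by a constant. The choice of ecx as the scratch register and the function name are assumptions made for the example.

```python
import struct

def idiv_by_constant(n):
    """Emit x86-32 code for: eax = eax / n (signed), using ecx as scratch."""
    code = bytearray()
    code += b"\xB9" + struct.pack("<i", n)  # mov ecx, n
    code += b"\x99"                          # cdq  (sign-extend eax into edx)
    code += b"\xF7\xF9"                      # idiv ecx
    return bytes(code)

print(" ".join(f"{b:02x}" for b in idiv_by_constant(10)))
# b9 0a 00 00 00 99 f7 f9
```

One high-level operator therefore expands to three instructions and eight bytes, which is why a flat operator-to-opcode table is not enough on its own.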

Charles
Title: Re: O2 Source ?
Post by: JRS on August 07, 2011, 09:49:29 PM
Before I get too far down the road with this, would brain surgery be easier to learn?  :'(

I agree that individual functions for each primary ASM instruction (mov, stor, pop, push, ...)  would make more sense.

I still would like to see a complete ASM binary executable that would run in memory in a hex format string. Until I see what the end game looks like, I'm bouncing off of walls.

 
Title: Re: O2 Source ?
Post by: Charles Pegge on August 07, 2011, 11:12:56 PM
The best advice I've heard is 'Follow your excitement'. If you enjoy working with the hardcore of software then it is well worth the effort learning about processors and machine code.( The fundamentals are in many ways simpler than the high level stuff).

Otherwise brain surgery may well be the softer option :)

Charles
Title: Re: O2 Source ?
Post by: JRS on August 07, 2011, 11:27:07 PM
I'm curious, do you know of a high-level language that will translate assembly code (text) to its binary executable format that the language can execute? (bypassing the assembler part)
Title: Re: O2 Source ?
Post by: Charles Pegge on August 08, 2011, 05:48:34 AM

Not specifically John. I've done this thing before on three projects in my early career. Going from the language/user script directly to binary was the most practical way to go, especially when computing power was very limited and the target hardware was customised.

Charles
Title: Re: O2 Source ?
Post by: efgee on August 08, 2011, 08:57:49 AM
I'm curious, do you know of a high-level language that will translate assembly code (text) to its binary executable format that the language can execute? (bypassing the assembler part)

Are you looking for an assembler that has high level commands as well ???

If so the only one I know is hla (http://en.wikipedia.org/wiki/High_Level_Assembly).
It's available for windows, Linux, Mac and FreeBSD.
Here is the link to the hla library (http://hla-stdlib.sourceforge.net/) which can be used for other languages as well.

I thought Charles made the Oxygen.dll work with SB, are there any side effects that you look for a different route?
 
Title: Re: O2 Source ?
Post by: Charles Pegge on August 08, 2011, 10:08:41 AM

I think John was trying to find a quicker way of moving Oxygen cross-platform using SB as an intermediary, but this is not easy to do, and when we investigate this route further the real complexities begin to emerge.

I have to take Oxygen through the self-compile stage before tackling Linux directly as this will flush out any design defects, and I am always keen to find the fastest possible development route.

Charles

Title: Re: O2 Source ?
Post by: JRS on August 08, 2011, 06:42:47 PM
Charles,

I don't see this as worth the effort based on the difficulty and time to implement.

I'll just wait till you have a Linux version and use OxygenBasic as it was intended. O2 works fine with SB under Windows as efgee  mentioned.

John