Author Topic: Polymorph Tutorial  (Read 6499 times)

Peter

  • Guest
Polymorph Tutorial
« on: May 06, 2014, 03:33:21 AM »
Deleted
« Last Edit: May 05, 2015, 09:14:15 AM by Peter »

Charles Pegge

  • Guest
Re: Polymorph Tutorial
« Reply #1 on: May 06, 2014, 08:19:18 AM »
Another trick: :)

Ellipsis & the built-in param array

Code: [Select]
sys f(n,...)
{
  indexbase 0          ' with base 0, param(0) is n itself
  for i=1 to n         ' param(1) .. param(n) are the values after n
    print param(i)
  next
}

f  3,10,20,30

Charles Pegge

  • Guest
Re: Polymorph Tutorial
« Reply #2 on: May 06, 2014, 09:53:33 AM »

Convenience really. With indexbase 0, param[0] is the same as n
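
For readers more at home in C, the same "count first, values after" idea can be sketched with stdarg. This is only an analogue, not OxygenBasic: the name sum_counted is hypothetical, and it sums the values instead of printing them so the result is easy to check.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical C analogue of f(n,...): n says how many long values follow. */
static long sum_counted(int n, ...)
{
    va_list args;
    long total = 0;

    assert(n >= 0);                     /* the count is itself the first argument */
    va_start(args, n);
    for (int i = 1; i <= n; i++) {
        total += va_arg(args, long);    /* i-th value after the count */
    }
    va_end(args);
    return total;
}
```

A call such as sum_counted(3, 10L, 20L, 30L) mirrors f 3,10,20,30. C has no built-in way to learn how many variadic arguments were passed, so there the count (or a sentinel) must be supplied by the caller.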

Mike Lobanovsky

  • Guest
Re: Polymorph Tutorial
« Reply #3 on: May 06, 2014, 09:55:22 AM »
My 2 cents: that's because n (the number of parameters that follow) is logically also a parameter to this function. Its own index should've been 0 but is it, really? :)

Mike Lobanovsky

  • Guest
Re: Polymorph Tutorial
« Reply #4 on: May 06, 2014, 09:56:21 AM »
Haha! :D

You posted just before me, Charles.

[EDIT] Then why bother passing n at all? Let it be just Foo(...) where ... is a ParamArray whereby index 0 is always ParamArray's UBound, in fact.
« Last Edit: May 06, 2014, 10:03:43 AM by Mike Lobanovsky »

Mike Lobanovsky

  • Guest
Re: Polymorph Tutorial
« Reply #5 on: May 06, 2014, 10:14:11 AM »
Peter is correct, Charles! You shouldn't be talking about index bases and arrays here at all. Just accept what I'm suggesting and document that the arguments in such a vararg function as Foo(...) are accessed through a param() accessor where param(0) is the number of "meaningful" parameters passed to the function.

Your approach confuses the O2 users.

Mike Lobanovsky

  • Guest
Re: Polymorph Tutorial
« Reply #6 on: May 06, 2014, 10:21:35 AM »
Peter and Charles,

AFAIK Oxygen subs already have push/pop prologs and epilogs, so pushad/popad aren't really necessary here. Am I correct in saying that? Peter's subs in his sw.dll all seem to have such extra frames, which shouldn't really be there. :)

Charles Pegge

  • Guest
Re: Polymorph Tutorial
« Reply #7 on: May 06, 2014, 10:58:06 AM »
Hi Mike

When using assembler in procedures, the ebx, ebp and esp registers should be protected.

All the others can be used safely without stacking. (ebx references all static entities)


Full Ellipsis, no indexes :)
Code: [Select]
sys f(...)
{
  sys n at (@param)
  sys v at (@param)
  do
    if n=0 then exit do
    n--
    @v+=sizeof sys
    print v
  end do
}

f  3,10,20,30
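
The pointer walk above can be modelled in C as a rough analogue. The layout is an assumption made for illustration: slots[0] holds the count and the values follow in equally sized cells. The name sum_slots is hypothetical. Unlike the original, which decrements the count in place through the n overlay, this sketch copies the count and leaves the argument area untouched.

```c
/* Hypothetical model of the argument area: slots[0] is the count,
   slots[1..count] are the values, all of the same integer width. */
static long sum_slots(const long *slots)
{
    long remaining = slots[0];   /* plays the role of "sys n at (@param)" */
    const long *v = slots;       /* plays the role of "sys v at (@param)" */
    long total = 0;

    while (remaining != 0) {
        remaining--;
        v++;                     /* step one cell, like "@v+=sizeof sys" */
        total += *v;             /* the original prints v at this point */
    }
    return total;
}
```

Given an array holding 3, 10, 20, 30, the loop visits 10, 20 and 30 in that order, the same sequence the O2 version prints.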

PS: exportable (extern) functions have more stuff in the prologs and epilogs to conform with the standard calling conventions, and protect non-volatile registers.
« Last Edit: May 06, 2014, 11:32:20 AM by Charles Pegge »

Mike Lobanovsky

  • Guest
Re: Polymorph Tutorial
« Reply #8 on: May 06, 2014, 03:25:37 PM »
Charles,

1.

Your noble British answer ( :) ) doesn't specify that "the ebx, ebp and esp registers should be protected" if and only if their content is modified by the user-coded assembly within that very function. Also, once modified by the user, the ebx register will not let the user access static data and Oxygen's table of user-declared imported functions until the register state is restored.

So my humble Slavonic assertion ( :) ) would be that Peter's brave German pushad-popad-ing ( :) ) is unnecessary and redundant, since none of these three registers is used or modified anywhere in his assembly code. The states of the other registers are either protected by Oxygen itself or are simply irrelevant to the proper functioning of the language.

2.
Quote
(ebx references all static entities)

This reminds me of a couple of other questions I wanted to ask. Here's a typical skeleton listing of a function exported from Peter's sw.dll as seen in IDA Pro:

Code: [Select]
0001  public Foo
0002  Foo    proc near
0003
0004  push   ebx ; Conventional protection of registers
0005  push   esi ; - do -
0006  push   edi ; - do -
0007  push   eax ; What's this for? (see line 0015)
0008
0009  call   sub_XXXX ; Oxygen-specific (see line 0022)
0010
0011  push   ebp ; Oxygen's sub stack frame prolog
0012  mov    ebp, esp ; - do -
XXXX  ........................ ; user code
0013  mov    esp, ebp ; Oxygen's sub stack frame epilog
0014  pop    ebp ; - do -
0015  add    esp, 4 ; This effectively discards eax saved on line 0007!
0016  pop    edi ; Restore conventionally protected registers
0017  pop    esi ; - do -
0018  pop    ebx ; - do -
0019  retn   N ; N depends on the number and size of sub's actual parameters
0020  Foo    endp
0021
0022  sub_XXXX proc near
0023  call    $+5 ; Get current EIP value into ebx
0024  pop     ebx ; - do -
0025  sub     ebx, YYYYY ; Get the address of Oxygen's table of static data and imports into ebx
0026  add     ebx, ZZZZZ ; - do -
0027  retn
0028  sub_XXXX endp

Question 0: Why aren't protectable registers preserved within the function stack frame but rather outside of it? Are there any benefits in such a design decision?

Question 1: Why is eax preserved at all if its value is discarded at least throughout Peter's library (see line 0015)? Are there any cases at all when Oxygen uses this register statically (perhaps for its own purposes)?

Question 2: Why would static data go into the table that's stored in the .text (i.e. code) section of the binary? This section should be marked READABLE and EXECUTABLE only and should deny writes else the AV software may flag it as suspicious.

3.

Perfect! :D

Are you going to provide a more BASIC-stylish param() access to ParamArray too? I realize that your code is perfect for the fastest access to parameters possible. But variadic functions are also very useful in non-time critical scenarios, and that coding style looks too much like good old C or sometimes even C-- (a HLL assembler). I'm thinking about total beginners... :)

Charles Pegge

  • Guest
Re: Polymorph Tutorial
« Reply #9 on: May 06, 2014, 09:40:38 PM »
Mike,

Quote
Question 0: Why aren't protectable registers preserved within the function stack frame but rather outside of it? Are there any benefits in such a design decision?

This was an arbitrary decision, but it works well for internal/external functions on both 32 and 64 bit systems. One minor advantage is that I can use constant ebp offsets for the garbage-collector and concatenator lists.


Quote
Question 1: Why is eax preserved at all if its value is discarded at least throughout Peter's library (see line 0015)? Are there any cases at all when Oxygen uses this register statically (perhaps for its own purposes)?

Oxygen makes very little distinction between subs and functions. The eax and edx registers are preserved only when invoking the garbage collector in the epilog. When strings are not used, there is no need, as these registers remain unaltered by the epilog.

You can inspect code directly with #show

sub f()
sys s
#show end sub

sub f()
string s
#show end sub



Quote
Question 2: Why would static data go into the table that's stored in the .text (i.e. code) section of the binary? This section should be marked READABLE and EXECUTABLE only and should deny writes else the AV software may flag it as suspicious.

In PE files, the .text section flag settings I use are:
0x60000020 ' code executable readable
what does your disassembler show?
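
As a quick cross-check, the value 0x60000020 can be decoded against the section characteristic bits of the PE/COFF format. The sketch below is in C; the SCN_* names are shortened stand-ins for the IMAGE_SCN_* constants, and section_has is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Section characteristic bits from the PE/COFF format
   (shortened local names for the IMAGE_SCN_* constants). */
#define SCN_CNT_CODE     0x00000020u   /* section contains code */
#define SCN_MEM_EXECUTE  0x20000000u   /* section is executable */
#define SCN_MEM_READ     0x40000000u   /* section is readable   */
#define SCN_MEM_WRITE    0x80000000u   /* section is writable   */

/* True when the given flag bit is set in a section's characteristics. */
static bool section_has(uint32_t characteristics, uint32_t flag)
{
    return (characteristics & flag) != 0;
}
```

For 0x60000020 the code, execute and read bits are set and the write bit is clear, which matches the "code executable readable" comment.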

3.

Quote
Perfect! :D
Are you going to provide a more BASIC-stylish param() access to ParamArray too? I realize that your code is perfect for the fastest access to parameters possible. But variadic functions are also very useful in non-time critical scenarios, and that coding style looks too much like good old C or sometimes even C-- (a HLL assembler). I'm thinking about total beginners... :)

Ellipsis and param are for meddlers only! :)

Variants are possible but relatively expensive for compiled code when types, overlays, polymorphs and pseudo-typeless parameters are available.

If the type spec is omitted then sys integers are assumed:

function f(a,b,c)
return a+b+c
end function

print f 1,2,3




PS: To restore the vital ebx register, call _mem. This call is present in the prolog of all external functions.
« Last Edit: May 06, 2014, 10:33:56 PM by Charles Pegge »

Mike Lobanovsky

  • Guest
Re: Polymorph Tutorial
« Reply #10 on: May 07, 2014, 04:24:27 AM »
Charles,

Thank you very much for this exhaustive overview.

Just one small clarification, please:

PS: To restore the vital ebx register, call _mem. This call is present in the prolog of all external functions.

Are you talking about a call to this small sub I mentioned:

Code: [Select]
0022  _mem  proc near
0023  call    $+5 ; Get current EIP value into ebx
0024  pop     ebx ; - do -
0025  sub     ebx, YYYYY ; Get the absolute address of Oxygen's table of static data and imports into ebx by its relative offset with respect to EIP
0026  add     ebx, ZZZZZ ; - do -
0027  retn
0028  _mem  endp

Here YYYYY and ZZZZZ are numeric literals that define the position of Oxygen's static data and declared function imports with respect to the current EIP value! This is why I assumed that the table itself would be located in the code (aka .text) section of the binary image. Otherwise, its absolute address would be known to the compiler by, say, a corresponding label and it would go into ebx directly without that arithmetic jugglery. Or am I missing something?

P.S. PE Explorer shows the .text section flags as READABLE+EXECUTABLE which is exactly what is expected of a code section including the imports table. But that isn't sufficient for static data which should also be WRITABLE...

Charles Pegge

  • Guest
Re: Polymorph Tutorial
« Reply #11 on: May 07, 2014, 04:54:10 AM »
Mike,

OxygenBasic produces relocatable code (it requires no fixups), so it must discover where it is and then work out where a label, in this case bssdata, is located. Unfortunately, this cannot be done directly in 32-bit mode, hence the juggling to retrieve the instruction pointer from the stack. This is remedied in 64-bit mode with RIP addressing (Relative to Instruction Pointer addressing), and all we have to do is:

lea rip rbx,bssdata

instead of:

._mem
call fwd here
.here
pop ebx
sub ebx,here
add ebx,bssdata
ret



where bssdata is given as a relative displacement from the current IP. This resolves the effective (absolute) address and stores it in rbx.

All call tables and other static data are mapped out in the bssdata (uninitialised data) section, and ebx/rbx holds its base address.
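
The arithmetic in the 32-bit _mem sequence can be written out as a small C sketch. It assumes the labels here and bssdata stand for offsets known when the code is assembled, while the popped return address is where here actually sits at run time. The function and parameter names are hypothetical.

```c
#include <stdint.h>

/* runtime_here:   address of "here" popped from the stack at run time
   offset_here:    offset of "here" as known when the code was assembled
   offset_bssdata: offset of "bssdata" as known when the code was assembled */
static uintptr_t resolve_base(uintptr_t runtime_here,
                              uintptr_t offset_here,
                              uintptr_t offset_bssdata)
{
    uintptr_t load_base = runtime_here - offset_here;   /* sub ebx,here    */
    return load_base + offset_bssdata;                  /* add ebx,bssdata */
}
```

With made-up numbers: if here is at offset 0x1005 and is found at 0x00401005 at run time, then a bssdata offset of 0x3000 resolves to 0x00403000. Loading the same image elsewhere shifts both addresses by the same amount, which is why no fixups are needed.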

PS: Adding to our previous discussion: external function prologs push the eax/rax register, as you observed. This has a role to play in relayed callbacks, where incoming callbacks are routed to a single function, carrying an id number in the eax register.
« Last Edit: May 07, 2014, 05:05:29 AM by Charles Pegge »

Mike Lobanovsky

  • Guest
Re: Polymorph Tutorial
« Reply #12 on: May 07, 2014, 05:12:50 AM »
Thank you Charles,

Everything is perfectly clear now under your enlightening guidance. And this makes me realize how little I know myself. But it also makes me eager to learn more.

:)

Charles Pegge

  • Guest
Re: Polymorph Tutorial
« Reply #13 on: May 07, 2014, 01:46:26 PM »

Deep is your enquiry :)