Author Topic: Dynamically compiled functions (Read 7879 times)

Charles Pegge · « **on:** July 15, 2014, 07:31:06 AM »

Dynamically compiled functions can now use the 'link' syntax to create a function address table for the host. This is consistent with primary compiling (in thinBasic, linking for calls from thinBasic).

function ffs( string v) as string, link pf[2]

UPDATE:
http://oxygenbasic.org/o2zips/Oxygen.zip

Code:


  '#FILE "T.EXE"

  '=================================
  'Dynamically compiled mini library
  '=================================


  function Build(string src,sys *pf) as sys
  =========================================
  static sys base,libr
  static string er
  base = ebx
  libr = compile src
  er=error
  if er then
    print "Dynamic " er
    libr=0
  else
    call libr 'initialise library
    indexbase 0
  end if
  return libr
  end function
  '
  string funlib=quote """
  '
  function ffi(sys v) as sys, link pf[1]
  return v*3
  end function
  '
  function ffs( string v) as string, link pf[2]
  return v+" "+v
  end function
  '
  function fff( float v) as float, link pf[3]
  return v*2
  end function
  '
  """


'TEST
'====

sys a
sys p[3]
a=build(funlib,p)
if a then
  ! fi (sys v)    as sys    at p[1]
  ! fs (string v) as string at p[2]
  ! ff (float v)  as float  at p[3]
  cr=chr(13,10)
  print fi(14)+cr+ff(1.25)+cr+fs("Qwerty")
  freememory a
end if
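
For readers without OxygenBasic at hand, the pattern above (compile a source string at run time, let each function register itself in a numbered slot, then call through the table from the host) can be sketched in Python. This is only an analogy and not how OxygenBasic works internally: `build`, `link` and the table `pf` are illustrative names, and Python's `compile`/`exec` stand in for O2's `compile`.

```python
# Rough Python analogy of the 'link pf[n]' pattern: the dynamically
# compiled source registers each function in a numbered slot, and the
# host calls through the table. All names here are illustrative.

FUNLIB = '''
@link(1)
def ffi(v):
    return v * 3

@link(2)
def ffs(v):
    return v + " " + v

@link(3)
def fff(v):
    return v * 2
'''


def build(src, pf):
    """Compile src at run time and fill pf; return True on success."""

    def link(slot):
        # Decorator that stores the function at pf[slot].
        def register(fn):
            pf[slot] = fn
            return fn
        return register

    try:
        code = compile(src, "<dynamic>", "exec")
    except SyntaxError as err:
        print("Dynamic", err)
        return False
    exec(code, {"link": link})  # running the body registers the functions
    return True


pf = [None] * 4  # slots 1..3 are used, slot 0 stays empty
if build(FUNLIB, pf):
    fi, fs, ff = pf[1], pf[2], pf[3]
    print(fi(14), ff(1.25), fs("Qwerty"), sep="\n")
```

The host never refers to the dynamically compiled functions by name; it only sees the slots, which is the point of exporting a function address table.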

RobbeK · « **Reply #1 on:** July 16, 2014, 01:51:12 PM »

Thanks Charles ,

Gives me some ideas -- it may be possible to avoid foreign function calls in (at least) NewLisp -- see attachment.
I finally managed to allocate contiguous memory within NewLisp that can be indexed in the usual way (huge C strings containing NULL characters). I don't need foreign calls to read/write data any more ...

NewLisp is a very fast interpreter (depending on the application, 2-5x slower than native code), but some JIT code may be welcome.
Such a combination would be perfect. As for Lisp, its inventor Mr McCarthy was not sure that a Lisp source could be compiled completely (Lisp was born in 1958), and nowadays the people behind NewLisp and PicoLisp make no attempts in this direction. Maybe Mr Burger from PicoLisp said it correctly: compiled Lisp is no Lisp any more.

(I have the same experience: in a recent program I wrote formulae that morph themselves depending on the situation. It ran fine interpreted, but upon compiling, the machine told me it could not do such things. But then in Common Lisp, definitions can be compiled or not, resulting in something like a tB+Oxygen situation, but with dynamically compiled functions ...)

best Rob


pber · « **Reply #2 on:** July 16, 2014, 04:48:13 PM »

Hi robbek,

Quote from: RobbeK on July 16, 2014, 01:51:12 PM

Maybe Mr Burger from PicoLisp said it correctly: compiled Lisp is no Lisp any more.

Why should something continue to be itself once compiled?

Does "this is not Lisp" stand for "it's not a sexp" (or maybe "it's not a list")?
Or is there something more I do not see?

RobbeK · « **Reply #3 on:** July 17, 2014, 06:03:20 AM »

Hi Paolo,

Quoting him :
"Only an interpreted Lisp can fully support such "Equivalence of Code and Data". If executable pieces of data are used frequently, like in PicoLisp's dynamically generated GUI, a fast interpreter is preferable over any compiler. "

Attached is something that may make it clearer: in an interpreted Lisp it is possible to treat the elements of a definition as if they were lists, and to change them dynamically. Here the function changes itself every time it is called (some very clever things are possible this way ...).
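
The attachment is not reproduced here, so as a stand-in, here is a minimal Python sketch of the same idea. It assumes a toy prefix-expression evaluator; `evaluate`, `body` and `call` are made-up names and this is not the NewLisp program. The function body is an ordinary list, and the function edits that list after every call.

```python
# A tiny interpreter for prefix expressions stored as nested lists.
# Because the "definition" is ordinary data, the program can rewrite it
# between calls. Illustrative only; not the attached NewLisp program.
import operator

OPS = {"+": operator.add, "*": operator.mul}


def evaluate(expr, env):
    """Evaluate a nested-list expression such as ["+", "x", 1]."""
    if isinstance(expr, list):
        op, *args = expr
        values = [evaluate(a, env) for a in args]
        result = values[0]
        for v in values[1:]:
            result = OPS[op](result, v)
        return result
    if isinstance(expr, str):
        return env[expr]   # variable lookup
    return expr            # literal number


body = ["+", "x", 1]       # the function body, held as a plain list


def call(x):
    """Evaluate the body, then let the function modify itself."""
    result = evaluate(body, {"x": x})
    body[2] += 1           # the next call adds a larger constant
    return result


results = [call(10) for _ in range(3)]
print(results)  # [11, 12, 13]
```

A compiler that turned `body` into fixed machine code once would lose the effect of the later edits unless it recompiled after each change.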

As for interpreted vs compiled Lisp: PicoLisp and NewLisp run very fast. I made a comparison (sorting a simple-array of 100000 integers and summing the elements):
(i) = interpreted / (c) = compiled
CLisp (i) 390 mSec
CLisp (c) 90 mSec Bytecode

Clozure CL (native code)
(i) 63 mSec
(c) 47 mSec

Gnu CL (C obj. file)
(i) 190 mSec
(c) 50 mSec

Steel Bank CL (native optimized code)
(c) 31 mSec

NewLisp (i)
using setq 66 mSec
using incf 62 mSec
using apply on a sequence 31 mSec

LispWorks (Personal version -- Deep Space 1 used a Lispworks product on board)
(i) 2484 mSec (no GC interaction ?!)
(c) 62 mSec

Racket Scheme (bytecode + GNU Lightning JIT)
with lists : 62 mSec
Vector 32 mSec
Vector->List + apply 16 mSec
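
The benchmark sources are not shown in the thread. For anyone who wants to repeat the measurement in another language, the task as described (sort 100000 integers, then sum them) might look like the Python sketch below. The random input, the seed and the use of `time.perf_counter` are assumptions, so the timing it prints is not comparable with the figures above.

```python
# Sketch of the described task: sort 100000 integers and sum them,
# timing both steps together. Input data and seed are assumptions.
import random
import time

random.seed(1)
data = [random.randrange(1_000_000) for _ in range(100_000)]

start = time.perf_counter()
ordered = sorted(data)
total = sum(ordered)
elapsed_ms = (time.perf_counter() - start) * 1000.0

print(f"sorted and summed {len(ordered)} integers in {elapsed_ms:.1f} ms")
```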

best Rob


pber · « **Reply #4 on:** July 17, 2014, 06:22:57 AM »

Homoiconicity, you're right.

A compiled Lisp should implement a different reader
for the compiled code. After all, ASCII bytes are neither code nor data.

RobbeK · « **Reply #5 on:** July 18, 2014, 03:24:11 AM »

Hi Paolo,

If you consider trying NewLisp: I wrote the Japi, OpenGL, GLU and GLUT bindings for it. Just ask.

Rob

Charles Pegge · « **Reply #6 on:** July 18, 2014, 05:51:40 AM »

A geometric surface explorer would be quite a useful tool - making use of dynamic compiling. All the components are available in Oxygen, including snapshots. It just needs stitching together:

My concept: (using bumpy metallic surface)

jack · « **Reply #7 on:** July 18, 2014, 06:36:02 PM »

Quote from: RobbeK on July 18, 2014, 03:24:11 AM

Hi Paolo,

If you consider trying NewLisp: I wrote the Japi, OpenGL, GLU and GLUT bindings for it. Just ask.

Rob

hello RobbeK,
obviously I am not Paolo but I am interested in those bindings for NewLisp.

RobbeK · « **Reply #8 on:** July 26, 2014, 12:50:27 PM »

Hi Jack, (oops this message escaped me)

Are you running Win or Linux? I wrote a 3D and 2D skeleton for Win32, and the Japi bindings are complete (in this case I also have the Japi.so file somewhere). It's even more complete than what comes with GNU CL (which has no progress bar and seems to be based on a very early Japi distribution) ... it has a Japi-Primitives package.

best Rob

Mike Lobanovsky · « **Reply #9 on:** July 26, 2014, 02:06:37 PM »

Quote from: RobbeK on July 17, 2014, 06:03:20 AM

.........
Quoting him :
"Only an interpreted Lisp can fully support such "Equivalence of Code and Data". If executable pieces of data are used frequently, like in PicoLisp's dynamically generated GUI, a fast interpreter is preferable over any compiler. "
............
Here every time the function is called it changes itself (some very clever things are possible this way .. )
............

Hi Rob,

"Mr Burger from PicoLisp" may be perfectly correct from a very technical point of view.

Compiled code goes into the process memory sections that are usually marked READABLE/EXECUTABLE. The Windows memory manager, backed by the CPU's page protection, ensures that the code stays unmodified (non-writable) for as long as the process runs. OTOH data sections are usually marked READABLE/WRITABLE and their content can be freely modified as per the process's intended purposes. At the same time, DEP ensures that data within these sections isn't executed as code.

While it is possible to create so-called "self-modifying" code and change section attributes at run time, such activities are considered extremely suspicious from the AV perspective. That's why it is highly unlikely that language authors would risk their reputation by triggering the AV software, even if these are 100% false alarms.

This isn't the case with interpretative languages/modes of operation. Here the executable code of the language's virtual machine stays unmodified at all times, but what it really does depends entirely on the program data that defines the chain of commands (tokens) which the virtual machine is supposed to execute. Naturally enough, this data (and consequently the execution flow) is totally reconfigurable (re-writable) at run time while the memory protection, DEP, and AV software stay unaware. Good virtual machines can do (albeit somewhat slower) everything that static code can, and much more. Look at such products as VMware, VirtualBox, Virtual PC etc.: these are all virtual machines even more powerful than individual interpretative languages. They can emulate entire workstations including hardware, operating systems, and a plethora of different processes running in them, all in one.

This is one of the unbeatable and decisive advantages of interpretative languages/modes of operation over their static, and therefore restricted, compiler-only counterparts.
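
To make the "fixed executable code, rewritable program data" point concrete, here is a small Python sketch of a toy stack machine. The token names and the `run` function are invented for the illustration and do not correspond to any of the products mentioned.

```python
# Toy stack machine: run() is fixed code that never changes; the
# program is an ordinary list of tokens that may be edited at any time.

def run(program):
    stack = []
    for token in program:
        if token == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif token == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            stack.append(token)  # anything else is a literal operand
    return stack.pop()


program = [6, 7, "add"]
first = run(program)     # 6 + 7
program[2] = "mul"       # rewrite the "code" in plain writable memory
second = run(program)    # 6 * 7
print(first, second)     # 13 42
```

Nothing in this sketch needs memory that is both writable and executable: only the token list changes, and the token list is data.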

pber · « **Reply #10 on:** July 28, 2014, 01:07:23 PM »

Quote from: Mike Lobanovsky on July 26, 2014, 02:06:37 PM

Compiled code goes into the process memory sections that are usually marked READABLE/EXECUTABLE. The Windows memory manager, backed by the CPU's page protection, ensures that the code stays unmodified (non-writable) for as long as the process runs. OTOH data sections are usually marked READABLE/WRITABLE and their content can be freely modified as per the process's intended purposes. At the same time, DEP ensures that data within these sections isn't executed as code.

While it is possible to create so-called "self-modifying" code and change section attributes at run time, such activities are considered extremely suspicious from the AV perspective.
[...]
This is one of the unbeatable and decisive advantages of interpretative languages/modes of operation over their static, and therefore restricted, compiler-only counterparts.

Thanks Mike: valuable clarification to me.

But now a question knocks at my poor mind:
how can Oxygen generate and run machine-code?
Also, in NewLISP, I saw examples of dynamically generated (and executable) machine-code too.

Is machine code freely runnable/writable if it stands in the heap?

Charles Pegge · « **Reply #11 on:** July 28, 2014, 07:13:35 PM »

Hi Paolo,

Code:

'creating and calling machine code

'using an array of bytes
'mov eax, 0x04030201 : ret
byte b={0xb8,1,2,3,4,0xc3}

print hex call @b
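
As a cross-check of the comment in the snippet above, the six bytes can be decoded by hand. The Python sketch below only decodes them and does not execute anything: `0xB8` is the 32-bit x86 opcode for `mov eax, imm32`, `0xC3` is `ret`, and the immediate operand is stored little-endian.

```python
# Decode the six bytes by hand: 0xB8 is 'mov eax, imm32' and 0xC3 is
# 'ret' in 32-bit x86. The immediate operand is stored little-endian.
code = bytes([0xB8, 1, 2, 3, 4, 0xC3])

opcode, imm, ret = code[0], code[1:5], code[5]
value = int.from_bytes(imm, "little")

assert opcode == 0xB8 and ret == 0xC3
print(f"mov eax, 0x{value:08x} : ret")  # mov eax, 0x04030201 : ret
```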

Mike Lobanovsky · « **Reply #12 on:** July 28, 2014, 07:51:12 PM »

Morning Charles,

Quote

byte b={0xb8,1,2,3,4,0xc3}
print hex call @b

Not possible with your DEP turned on: access violation exception.

Hi Paolo,

Quote from: paolo on July 28, 2014, 01:07:23 PM

Is machine code freely runnable/writable if it stands in the heap?
how can Oxygen generate and run machine-code?
Also, in NewLISP, I saw examples of dynamically generated (and executable) machine-code too.

No, machine code doesn't reside in the heap. The memory image of an executable file is (very roughly and generally) divided into 4 parts:

-- program header that stores pointers to and sizes of program code, data, and resource sections of the program in the process memory;
-- code section(s) proper marked by the OS as non-modifiable (read/execute-only) memory areas;
-- data section(s) proper marked by the OS as modifiable (readable and writable) but non-executable memory areas; and
-- optional resource section that carries various icons, images, menus, dialog templates, strings etc. that the program may need for its purposes.

Apart from that, the program loader reserves two big chunks of memory that the program can use additionally outside its data section(s) in case it needs to create and destroy dynamic objects or pieces of data:

-- stack;
-- heap.

The stack is generally used for passing function parameters and allocating temporary pieces of data local to these functions. The heap is used to create both global and local dynamic objects such as arrays, strings (actually arrays of bytes), compound structures, and class instances. Both of these chunks are also usually marked as modifiable (readable and writable) but non-executable.

As you see, each piece of process memory has a strictly predefined purpose and an associated expected behavior. Most user programs fit this pattern very well, but not all. Programs such as just-in-time compilers (e.g. OxygenBasic, FBSL, LuaJIT and some others) also need memory into which they can dynamically generate and write additional executable code. In this case the developer may use the VirtualAlloc/VirtualProtect/VirtualFree WinAPIs with the required set of readable/writable/executable attributes. These APIs allocate, re-protect, and free additional memory regions with an arbitrary mixture of attributes in the process's address space.
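
The attribute rules described above can be illustrated with a toy model in Python. It does not call the Windows API: the `Region` class and its flags are invented for the illustration, and the "write first, then switch to executable" sequence is one common approach assumed here, not necessarily what OxygenBasic does.

```python
# Toy model of page protection. It does not call the Windows API; the
# Region class and its flags only mimic the read/write/execute
# attributes discussed above.

class Region:
    def __init__(self, size, writable, executable):
        self.data = bytearray(size)
        self.writable = writable
        self.executable = executable

    def write(self, payload):
        if not self.writable:
            raise PermissionError("region is not writable")
        self.data[:len(payload)] = payload

    def execute(self):
        if not self.executable:
            raise PermissionError("region is not executable (DEP)")
        return bytes(self.data)  # a real CPU would jump here


def try_execute(region):
    try:
        region.execute()
        return "ran"
    except PermissionError as err:
        return f"blocked: {err}"


heap_block = Region(16, writable=True, executable=False)
heap_block.write(b"\xb8\x01\x02\x03\x04\xc3")
print(try_execute(heap_block))   # blocked: region is not executable (DEP)

# What a JIT might do instead: write first, then flip the attributes,
# loosely analogous to VirtualAlloc followed by VirtualProtect.
jit_block = Region(16, writable=True, executable=False)
jit_block.write(b"\xb8\x01\x02\x03\x04\xc3")
jit_block.writable, jit_block.executable = False, True
print(try_execute(jit_block))    # ran
```

In this model a region is never writable and executable at the same moment, which is the attribute combination the following paragraphs describe as suspicious to AV software.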

Unfortunately, this is exactly what most existing viruses do too for their malicious and destructive activities. Therefore, anti-virus software is taught to sniff out unusual attribute combinations (mainly concurrent writable and executable attributes), and also to monitor VirtualAlloc calls.

Since anti-virus protection is considered more important than the presumption of the developer's innocence, a lot of less intelligent AV packages (abundant at such sites as e.g. the notorious VirusTotal.com) invariably flag JIT compilers as suspicious/malicious software. Hehe, and it is a serious challenge to outsmart the VirusTotal.com bunch of shitcode AV software to avoid false alarms.

OTOH more intelligent packages with well-developed heuristics engines, such as Kaspersky, Norton, ESET and a few others, would never flag a JIT compiler as suspicious.

So there are 3 conceivable ways to break this vicious circle for a JIT compiler project:

-- hire a guru that's smart enough to disguise your "suspicious" activities against the stupidity of shitcode AV's;
-- disclose your sources to the world and pay off the VirusTotal.com gang to include your JIT compiler on their exception list; and
-- resign yourself forever to the turtle pace of fully interpreted code, which does not create writable/executable regions in the process memory at all.

Obviously, "Mr Burger from PicoLisp" has chosen behavioral pattern number 3 from this list.

My own indisputable option is however pattern number 1.

Charles Pegge · « **Reply #13 on:** July 28, 2014, 08:18:42 PM »

Avoiding DEP issues.

Embedded / Inline machine code:

o2 machine script enables the programmer to access the lowest (and original) level of OxygenBasic.

Code:

'creating and calling embedded machine code
'using o2 machine script

print hex call fx
end

fx:
'mov eax, 0x04030201 : ret
o2 b8 01 02 03 04 c3

Aurel · « **Reply #14 on:** July 28, 2014, 09:01:54 PM »

I have a question...
I hope it is not too stupid...
For example, Ed's toy interpreter is a bytecode interpreter, right?
But it doesn't produce bytecode with a byte type of variable, right?
It uses an integer array instead...
So is it possible to force this thing to produce a byte array,
or is this idea nonsense?

thanks
