Author Topic: Tiny Benchmark Test  (Read 34701 times)

0 Members and 7 Guests are viewing this topic.

Aurel

  • Guest
Re: Tiny Benchmark Test
« Reply #45 on: May 17, 2014, 10:05:47 PM »
thanks mike  ;)
If we talk about oxygen ...it looks to me that  if/elseif method is somehow faster than
select/case ..how is that possible ?

Mike Lobanovsky

  • Guest
Re: Tiny Benchmark Test
« Reply #46 on: May 17, 2014, 10:27:56 PM »
Hi Aurel,

I don't know yet how Charles implemented the two blocks at the assembly level but in modern C with an optimizing compiler, there are two differences between a multiline if block and a switch block:

1. A switch block is supposed to pre-scan the incoming value for the range of values the block handles. If the block doesn't contain default: and the value is totally out of its range of handled values, the block is supposed to be skipped altogether. In many cases, pre-scanning is much faster than going through the long long chain of mispredicted case's only to find that no actual case matches the value.

A multiline if will always perform the entire range of comparisons regardless of actual range of values it can handle.

2. A long switch block will be converted to a jump table and computed goto's by the optimizing compiler automatically if a corresponding optimization option is specified. Multiline if blocks do not enjoy such an optimization.

Now let's wait for Charles to tell us how the two blocks are actually handled in OxygenBasic.

JRS

  • Guest
Re: Tiny Benchmark Test
« Reply #47 on: May 17, 2014, 11:06:21 PM »
Are nested SELECT/CASE's more efficient than a IF/THEN/ELSEIF/ELSE block structure?

SB doesn't have SELECT/CASE syntax. (uses IF/THEN/ELSEIF/ELSE instead) I have never run into a SELECT/CASE that I couldn't easily convert to a IF/THEN/ELSEIF/ELSE structure. Then again, SB isn't a compiler and ease of use with a consistent syntax trumps speed & obscurity.
« Last Edit: May 17, 2014, 11:48:59 PM by John »

Mike Lobanovsky

  • Guest
Re: Tiny Benchmark Test
« Reply #48 on: May 18, 2014, 12:34:46 AM »
Are nested SELECT/CASE's more efficient than a IF/THEN/ELSEIF/ELSE block structure?

Nested Select Case's are exactly as awkward as nested If/ElseIf's especially if the both are long and not properly indented (and preferably highlighted with indentation guides).

I usually frame the outer conditionals in a Select Case block, and the inner ones, in If/ElseIf blocks if I have to. Also, I will never ever use an editor that isn't equipped with auto-indentation and indentation guides features.

Quote
... consistent syntax trumps speed & obscurity.

For obvious reasons that I described in my previous message, the use of multiple if/else if's will not be considered good programming style in the C language. Long chains of conditionals should be framed in a switch block while shorter and simpler if/else's are acceptable in the case sub-blocks.

It's not a matter of obscurity or consistency but rather of implied functionality. Long switch'es are generally executed faster than if/else if's due to inherent pre-scanning, even if not optimized into jump tables and computed goto's altogether.

JRS

  • Guest
Re: Tiny Benchmark Test
« Reply #49 on: May 18, 2014, 12:38:53 AM »
This is as close as I can come emulating a SELECT/CASE in SB.

Code: [Select]
v = 3

IF v = 1 THEN GOTO CASE_1
IF v = 2 THEN GOTO CASE_2
IF v = 3 THEN GOTO CASE_3
GOTO DEFAULT

CASE_1:
  PRINT "ONE\n"
  GOTO BREAK
CASE_2:
  PRINT "TWO\n"
  GOTO BREAK
CASE_3:
  PRINT "THREE\n"
  GOTO BREAK
DEFAULT:
  PRINT "DEFAULT\n"
 
BREAK:   


Last alternative before shutting down.

Code: [Select]
CASE{"1"} = ADDRESS(One())
CASE{"2"} = ADDRESS(Two())
CASE{"3"} = ADDRESS(Three())

SUB One
  PRINT "One\n"
END SUB 

SUB Two
  PRINT "Two\n"
END SUB 

SUB Three
  PRINT "Three\n"
END SUB 

v = 3

ICALL CASE{STR(v)}
« Last Edit: May 18, 2014, 01:18:24 AM by John »

Mike Lobanovsky

  • Guest
Re: Tiny Benchmark Test
« Reply #50 on: May 18, 2014, 01:23:34 AM »
The first variant is very very close to what a  C compiler will do decoding a C if/else if block into equivalent assembly. No code relocations, no pre-scanning - just straight-forward conversion of every C equivalent to your IF/THEN GOTO into a corresponding cmp/je pair of assembly instructions.

The second variant is worse - it is one function call slower in each constituent case.

The preprocessing of a real switch block in C will be much more thorough. That's why I'm looking forward to Charles' response as to how his BASIC parser treats the two blocks in OxygenBasic.

On a side note, I'm in favor of both BASIC GoTo and C goto keywords. Bellard's TinyCC which FBSL's DynC is based on uses goto's heavily. I'd say his C compiler wouldn't be so very fast (~30MB of source code per second on a 2GHz computer!) if he didn't use goto's so heavily throughout its source code. C++ goto's may be detrimental to structured programming but ANSI C goto's and BASIC GoTo's are certainly a bliss.

Aurel

  • Guest
Re: Tiny Benchmark Test
« Reply #51 on: May 18, 2014, 01:33:21 AM »
Mike thanks again...
I have used in my toy interpreter select to select token ( integer) from integer array
tok[n]...But mine main problem is slow interpreter loop ..in this case for/loop.
why?
For loop is not slow in oxygen BUT when i do some internal loop inside this main
for loop then thigs go slow..no mather what i do in this sub-loop the speed is same
what is really crazy... :( :o

Mike Lobanovsky

  • Guest
Re: Tiny Benchmark Test
« Reply #52 on: May 18, 2014, 01:45:32 AM »
I can't say it off the top of my head, Aurel.

Let me take some time to look through your sources if they are available from your site and perhaps I will be able to spot the bottleneck. I don't however promise that I'll do it right away, sorry. :)

Charles Pegge

  • Guest
Re: Tiny Benchmark Test
« Reply #53 on: May 18, 2014, 02:01:49 AM »
OxygenBasic select case is faster than if .. elseif, especially when the variable being tested is  indirect, or calculated or an array. This is because the test value is held in the accumulator register, and does not require reloading for each case evaluation.

Another advantage is case ranges. These produce very efficient binary, compared to the if equivalent.

case 1 to 10 : ...

if (a>=1) and (a<=10) ..


Oxygen does not preprocess its case blocks or try to map them into jump tables.

A simple token loop with a jump table: (goto is indispensible for this kind of code :) )

Code: [Select]
'JUMP BY TOKEN
'=============

sys t[256] : t[65]=>{@FAA,@FBB,@FCC}
'
string src = "ABCA" 'representing token stream
sys    e   = len(src)
sys    i   = 0
byte  *b   = strptr(src)-1
sys    eb  = @b+e

NextItem:

  @b++
  if @b>eb then end
  i=b : goto t[i]
  FAA:
  print "AA"
  goto NextItem

  FBB:
  print "BB"
  goto NextItem
  '
  FCC:
  print "CC"
  goto NextItem


The most recent Oxygen supports jump by array.
« Last Edit: May 18, 2014, 02:10:20 AM by Charles Pegge »

Aurel

  • Guest
Re: Tiny Benchmark Test
« Reply #54 on: May 18, 2014, 02:11:49 AM »
Thanks Mike for interest   :D
I must say that this is pure interpreting ..so there is no any intermediate representation.
From my experience ..in Oxygen is the fastest Do/EndDo loop.
I even try to code some asembler code to increase speed( as far as i understand asm coding  :-\ ).
Ok Charles maybe i am wrong about speed of select vs if/elseif... ;)

Mike Lobanovsky

  • Guest
Re: Tiny Benchmark Test
« Reply #55 on: May 18, 2014, 02:26:13 AM »
Hi Charles,

I think case ranges and faster evaluation are already reasons enough to keep two distinct constructs in the language.

And yes, while If/ElseIf allows the programmer to evaluate absolutely unrelated conditions in each comparison, Select Case always evaluates only one quantity against a number of other similar quantities. This enables the compiler to keep the former always in one place (accumulator register in O2 case) without reloading. This will naturally yield a boost compared to ElseIf where a completely different pair of objects may be evaluated and reloading must be done to allow for that.

Thanks also for yet another example of jump tables. The O2 code seems to be exactly as efficient in this regard as pure C. :)

Charles Pegge

  • Guest
Re: Tiny Benchmark Test
« Reply #56 on: May 18, 2014, 02:30:36 AM »
I've made ifs more efficient quite recently, so there is not much difference when testing numbers/equates.

if

mov eax,a
cmp eax,42
jnz _exit_if


case

cmp eax,42
jnz _exit_cas

Mike Lobanovsky

  • Guest
Re: Tiny Benchmark Test
« Reply #57 on: May 18, 2014, 02:48:21 AM »
I must say that this is pure interpreting ..so there is no any intermediate representation.

@Aurel:

As you see from Charles' examples, it is extremely easy to implement a fast jump table/computed GoTo's scheme in Oxygen. What is left to do for you is in fact to write a tokenizer which will substitute each of your language keywords in a given program with a corresponding token - an integer within the range of 0 to 255 (byte integers will suffice). You can then store the program's entire succession of tokens in a string and feed it to the loop exactly as shown in Charles' latest example.

That way you will have an extremely fast interpreting loop. Then, the only thing left to do will be optimize the function for every token (Charles uses simple Print instead of it) to make it running as fast as your main loop, et voila: your toy interpreter becomes the Interpreter. :)

@Charles:

Yeah, the difference is ultimately 1 clock per comparison.  But you can improve on that yet further by comparing two things simultaneously as both mov and cmp are perfectly UV-pairable. Then the overhead will be just 0.5 clocks.  :D

Further, sources say that Pentium branch predictors are optimized for jz more than other instructions of this mnemonic. Mispredicted jumps lead to processor stalls with heavy clock-wise penalties. If you could compare for zero then your evaluators will be lightning fast. You can even add an extra unconditional jmp at the end if needed as its clocks are fewer than the misprediction penalty. :)
« Last Edit: May 18, 2014, 03:02:28 AM by Mike Lobanovsky »

Charles Pegge

  • Guest
Re: Tiny Benchmark Test
« Reply #58 on: May 18, 2014, 03:03:43 AM »
Mike,

Thanks for Pentium jump tips.

Yes, inefficient assembler is too painful to behold, so I am driven to optimise!

switch is also supported, as well as select. It is occasionally used to allow cases to fall through for more case testing and processing. In this situation the accumulator has to be reloaded with the switch variable, before testing further cases.

Mike Lobanovsky

  • Guest
Re: Tiny Benchmark Test
« Reply #59 on: May 18, 2014, 03:12:20 AM »
It is occasionally used to allow cases to fall through for more case testing and processing. In this situation the accumulator has to be reloaded with the switch variable, before testing further cases.

Yup, that's when the values to compare against don't fall within a contiguous range. But in my practice such cases occur seldom enough. So I guess we can live with an occasional reload here and there. :)