Author Topic: Lorenz Landscape  (Read 8408 times)

Aurel

  • Guest
Re: Lorenz Landscape
« Reply #15 on: April 14, 2014, 11:31:43 AM »
oh..man
thanks John...stupid me  ::)

JRS

  • Guest
Re: Lorenz Landscape
« Reply #16 on: April 14, 2014, 11:44:43 AM »
Don't beat yourself up. I had to look twice at Peter's code as I thought it was a typo at first. (x and y reversed)


Mike Lobanovsky

  • Guest
Re: Lorenz Landscape
« Reply #17 on: April 14, 2014, 02:18:45 PM »
@ John & Aurel:

No, no, gentlemen, there's no need to blame yourselves for that. It happens because BASIC and most other languages treat array indexing differently. SDL treats indices the way C does, in what's called "row-major" order, whereby the elements are stored in contiguous memory row by row (or line by line, if you will). A Lorenz plot is actually a matrix (a two-dimensional array) of X's and Y's that SDL and C store in successive horizontal scans (a.k.a. rows, a.k.a. lines) from top to bottom, if the plot's point of origin [0,0] lies in the top left corner of the canvas. BASICs traditionally store matrices in "column-major" order, i.e. in vertical scans (a.k.a. columns) from left to right under the same coordinate origin conditions.

So when you access a linear memory array of values in different languages, you have to swap the order of indices to convert (a.k.a. "transpose") the matrix to match a particular language's array indexing rules. It is easy for a 2-dimensional matrix (just swap X's with Y's in between the square brackets) but it really adds a lot of extra headache in multidimensional arrays. For example, JPEG compression uses 3-dimensional arrays and it is really difficult to interface them for use in both BASIC and assembly (or C) code in one script.
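
To make the index swap concrete, here is a minimal sketch in C of how the same logical element (row, col) maps to a linear memory offset under each convention. The function names are hypothetical and only illustrate the arithmetic; they are not taken from SDL or from any BASIC runtime.

```c
#include <stddef.h>

/* Row-major (C, SDL): rows are contiguous in memory,
   so the column index varies fastest. */
size_t row_major_offset(size_t row, size_t col, size_t ncols)
{
    return row * ncols + col;
}

/* Column-major (Fortran and, per the post above, many BASICs):
   columns are contiguous in memory, so the row index varies fastest. */
size_t col_major_offset(size_t row, size_t col, size_t nrows)
{
    return col * nrows + row;
}
```

For a matrix of 3 rows by 4 columns, element (row 1, col 2) sits at offset 1*4 + 2 = 6 in row-major storage but at 2*3 + 1 = 7 in column-major storage, which is why the same pair of indices read through the other convention lands on a different element.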

@ Charles:

Yes, it's all in God's hands as long as SpeedStep isn't switched off completely. But even when it is, there are also multiple cores to be taken into account which yield their own time slices to each other unpredictably, and that affects the clock and tick counts for each particular core very badly.

I have done extensive research on this problem in my big OpenGL project mentioned elsewhere on this forum, and I can tell you that the best timing results can be achieved only through the RDTSC instruction, and only if SpeedStep is switched off in BIOS or UEFI and the process affinity is programmatically reduced to just one of the CPU cores. That's why I had the following option in my OpenGL System Settings editor:



Now that I have four cores, there would've been many more option buttons on that panel! :)

Of course, confining the task to just one core of many effectively nullifies any chance for efficient multi-threading. So like with everything in this world, if you gain something here you immediately lose something else elsewhere.

As you see, the current OpenGL process is confined to Core 1 only (middle button colored blue) and all the load goes to the corresponding core, as is seen in the green plots and yellow CPU load figures. The current mode of game timing is set to RDTSC. It produces the smoothest camera movement and character animation possible. The other two options are either simple GetTickCount() or QueryPerformanceFrequency()/QueryPerformanceCounter(). RDTSC uses its own calibratable equivalent to Sleep() written in assembly, while the other two use Winmm.dll's timeBeginPeriod(1)/timeEndPeriod(1) to set Sleep() to a guaranteed granularity of 1 millisecond (tick) instead of its usual 16 msecs.
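
A minimal sketch of the arithmetic behind these timing modes, assuming nothing about the actual project: a raw counter reading is converted to seconds by dividing by the counter frequency, and a coarse timer only ever reports whole multiples of its granularity. All names are hypothetical; on Windows the counter and frequency values would come from QueryPerformanceCounter()/QueryPerformanceFrequency(), which this portable sketch does not call.

```c
#include <stdint.h>

/* Convert two raw counter readings to elapsed seconds.
   freq is the number of counter ticks per second. */
double ticks_to_seconds(uint64_t start, uint64_t end, uint64_t freq)
{
    return (double)(end - start) / (double)freq;
}

/* What a timer with the given granularity reports for a true elapsed
   time: the value is rounded down to a whole number of steps. */
uint64_t quantize_us(uint64_t elapsed_us, uint64_t granularity_us)
{
    return (elapsed_us / granularity_us) * granularity_us;
}
```

With a 16 msec granularity (16000 microseconds), a true 20 msec interval is reported as 16 msecs and anything shorter than 16 msecs is reported as 0, whereas with a 1 msec granularity the same 20 msec interval is reported exactly.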

Please note that contrary to a common belief a multimedia timer timeGetTime() by itself is exactly as poor as GetTickCount() and its accuracy is not affected by timeBeginPeriod(1)!
« Last Edit: April 14, 2014, 02:25:38 PM by Mike Lobanovsky »

JRS

  • Guest
Re: Lorenz Landscape
« Reply #18 on: April 14, 2014, 05:26:36 PM »
The Lorenz example didn't use arrays. It was SIN/COS driven using a fractional step.

Mike Lobanovsky

  • Guest
Re: Lorenz Landscape
« Reply #19 on: April 14, 2014, 09:36:25 PM »
Quote
The Lorenz example didn't use arrays. It was SIN/COS driven using a fractional step.

Don't disillusion me, John. Can't you read between the lines? ;)

Whenever you see a Foo(X, Y), be sure it's built on top of a Bar[X, Y] somewhere in the long run. SetPixel, glVertex2f, and pix are all Foo()'s.

Besides, your entire screen as well as any canvas you draw on is always a Pixel[X, Y] matrix.
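
As a rough illustration of that point, a pixel-setting function can be sketched as a thin wrapper over a linear buffer. This is a hypothetical helper, not the implementation of SetPixel, glVertex2f, or pix; the buffer layout (row-major, one 32-bit value per pixel) is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical Foo(X, Y): writes one pixel into a row-major buffer of
   width * height 32-bit values. Returns 1 on success, 0 if (x, y)
   lies outside the canvas. */
int set_pixel(uint32_t *pixels, int width, int height,
              int x, int y, uint32_t color)
{
    if (x < 0 || x >= width || y < 0 || y >= height)
        return 0;
    /* Row-major addressing: y selects the row, x the column within it. */
    pixels[(size_t)y * (size_t)width + (size_t)x] = color;
    return 1;
}
```

The bounds check matters for plots like this one, where the coordinates come out of floating-point math and a single stray value would otherwise write outside the buffer.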

I didn't want to advertise my extraneous FBSL here again but this was what its DynC loop looked like:

Code: [Select]
...
void main(int picbits[800][800])
{
    double x, y, t;
    for (t = 0.; t < 7000.; t += .006) {
        x = sin(.99 * t) - .7 * cos(3.01 * t);
        y = cos(1.01 * t) + 0.1 * sin(15.03 * t);
        // scale to the 800x800 canvas and center the plot at (400, 400)
        x = x * 200. + 400.;
        y = y * 200. + 400.;
        // plot a white pixel; note the [X][Y] index order
        picbits[(int)x][(int)y] = 0xFFFFFF;
    }
}
...

Note the order of matrix indices: [X][Y].

Does that sound convincing enough now, John?

Charles Pegge

  • Guest
Re: Lorenz Landscape
« Reply #20 on: April 15, 2014, 12:15:00 AM »

I found that OpenGL hardware antialiasing (multisampling, 4 samples per pixel) increased the GPU rendering time from 5 microseconds per frame to about 9. To be more specific, this is the time period from the call to glCallList to the return from glFlush().

Frankolinox

  • Guest
Re: Lorenz Landscape
« Reply #21 on: April 15, 2014, 01:45:15 AM »
I've made a "Lorenz" test too, but with my own compiler, on my old notebook. I have written a converter to read Oxygen OpenGL include files and I was very astounded that it worked here :) The compile time isn't so important for me, only that the code works well. I added only a

Code: [Select]
long lorenz

or in Oxygen

Code: [Select]
sys lorenz

to Charles' code because I got this little error about what the compiling list "Lorenz" is:

Code: [Select]
BeginGlCompile Lorenz

Nice work, Peter, and thanks to Charles for the OpenGL version.

regards, frank

Mike Lobanovsky

  • Guest
Re: Lorenz Landscape
« Reply #22 on: April 15, 2014, 01:46:28 AM »
4x MSAA is the absolute noticeable minimum, Charles. 8x looks much better, but MSAA is an extremely hard and time-consuming task for the OpenGL renderer: each 1x means one extra pass over the entire canvas. On my dual SLI-interconnected GeForces, 50 thousand polygons are the absolute maximum for a 1680x1050 full-screen mode with 8x MSAA (my central Philips is 1680x1050 and my two side Philipses are 1920x1080), and only if the textures are mipmapped. If not, then even 25 or 30 thousand would be almost unreachable and the FPS rate would drop abruptly from VSYNC'ed 60 frames to 30.

There's a lighter algorithm involving a GLSL or HLSL shader; it's called FSAA ("full-screen anti-aliasing" IIRC). It's easy to find on the net. It's even been incorporated in the open-source Ogre rendering system. It's slightly less effective for sloping lines and animated edges but still much lighter for the renderer and practically indistinguishable to the eye.

And Charles, you don't have to use glFlush() at all except before taking a snapshot using OpenGL's own glReadPixels() call. OpenGL is a very clever asynchronous state machine and it knows much better when to start and end than we do. Just skip it; you're simply losing time on an unnecessary function call. Your calling glFlush() still doesn't mean OpenGL will actually obey. :)

Charles Pegge

  • Guest
Re: Lorenz Landscape
« Reply #23 on: April 15, 2014, 03:56:28 AM »
I don't normally use glFlush, Mike, but it was needed here, only to get a time estimate of the list's transit through the GPU pipeline. It was an additional 4 microseconds, except when in full screen, when the processing time fell back to 5 microseconds, even though many more pixels required smudging. This would seem to indicate automatically switching to FSAA.

Just realised I made the measurements with 8 samples! I think I prefer 4. 8 is a bit too soft. Fog also helps to keep pixel noise down on distant moving objects.

Mike Lobanovsky

  • Guest
Re: Lorenz Landscape
« Reply #24 on: April 15, 2014, 04:52:56 AM »
Excuse me Charles,

Peter came up with this first so we'll do it the FIFO style. :)

@Peter:

No Peter, this is not correct and you're misleading people who might be reading this thread in the future.

1. Charles uses a PFD_DOUBLEBUFFER format for his OpenGL context, which means the OpenGL scene is first drawn into the memory backbuffer. It is blitted and becomes visible on the screen only after a call to Gdi32's SwapBuffers() (that's what Charles' OpenglSceneFrame.inc uses) or OpenGL's wglSwapBuffers(). In other words, you will not see any effect whatsoever from either glFlush() or glFinish() simply because they are drawing to the invisible backbuffer.

2. The immediate effect of glFlush() and glFinish() may be seen only if OpenGL is supposed to draw directly onto the window's canvas, in which case pfd.dwFlags should not have the PFD_DOUBLEBUFFER flag but rather PFD_DRAW_TO_WINDOW and PFD_SUPPORT_OPENGL only. But such drawing flickers badly, much like an unbuffered CS_HREDRAW+CS_VREDRAW window resize.

3. glFlush() does not wait until the drawing operations are completed! It exits immediately on notifying OpenGL that completion is requested in a finite period of time, which in plain English means not later than the next frame. Contrary to that, glFinish() holds the caller up until all draw calls are completed, before the current frame is swapped onto the window. In other words, this pair of functions complement each other much like SendMessage()/PostMessage() or GetMessage()/PeekMessage().

4. I repeat, you do not have to use either of these functions with a PFD_DOUBLEBUFFER pixel format except for a couple of rare and very special cases, because you will not see any visible effect no matter what you do!
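
The difference described in point 3 can be sketched with a toy model in C. This is NOT OpenGL code and none of these names exist in any library; it only mimics an asynchronous command queue where one call hands work over and returns at once while the other also waits for completion.

```c
#include <stddef.h>

/* Toy model of an asynchronous command queue (hypothetical, not OpenGL). */
typedef struct {
    size_t queued;     /* commands recorded but not yet submitted */
    size_t submitted;  /* commands handed over, still in flight */
    size_t completed;  /* commands whose results are ready */
} ToyQueue;

/* Record one draw command; nothing is executed yet. */
void toy_draw(ToyQueue *q)
{
    q->queued++;
}

/* Like glFlush(): hand everything over and return at once.
   The work is in flight, not finished. */
void toy_flush(ToyQueue *q)
{
    q->submitted += q->queued;
    q->queued = 0;
}

/* Like glFinish(): hand everything over, then wait until it is all done. */
void toy_finish(ToyQueue *q)
{
    toy_flush(q);
    q->completed += q->submitted;  /* stands in for blocking on the GPU */
    q->submitted = 0;
}
```

In this model a stopwatch stopped right after toy_flush() would only cover the hand-over, since completed has not changed at that point; stopped after toy_finish() it covers the whole workload.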

@Charles:

Now the two very rare and special cases that might occur for an extremely inquisitive OpenGL coder are:

1. calling glReadPixels() in order to make a copy of the backbuffer contents, e.g. to save a screenshot; and

2. taking a measurement related to draw calls in a frame, in which case a successful tester should use a call to glFinish() for the reasons I stated above.

As for the timings, I was talking about MSAA for tens of thousands of textured polies where each extra draw pass is a heavy strain. The difference between 4x and 8x MSAA will not be that significant in a poorly populated scene.

Gentlemen, I am very serious now and I know what I'm talking about. Just google for "glFlush glFinish" and you will find a ton of proof to what I'm telling you here.

Thanks for your attention.

P.S. Charles, will you please inform us of your MSAA timings with glFinish() instead of glFlush()? And please don't use either of them for ordinary rendering. You'll be wasting your time even typing them to say nothing of waiting for the actual calls to take any visible effect.
« Last Edit: April 15, 2014, 05:06:15 AM by Mike Lobanovsky »

Aurel

  • Guest
Re: Lorenz Landscape
« Reply #25 on: April 15, 2014, 05:33:39 AM »
Quote
I've made a "Lorenz" test too, but with my compiler as a test
Frank, is your 'compiler' a modification of BINT created in PowerBasic?
From what I see (0.9xx), you get a very good response time  ;)

People, I really don't get what you see in OpenGL,
because most of the great Windows games are created with DirectX... right?

Charles Pegge

  • Guest
Re: Lorenz Landscape
« Reply #26 on: April 15, 2014, 05:58:46 AM »
Thanks Mike.

Yes, that's more like it, using glFinish().
The results are now dependent on Window size.

16 is rather nice, but maybe not for smooth animations

Code: [Select]
  % title        "Lorenz Plot"
  % width        600
  % height       800
  % Multisamples 16 '0..4..16
  include     OpenglSceneFrame.inc

  ' PERFORMANCE
  ' samples  PipeLine..glFinish (Approx Seconds)
  ' n        t
  '----------------
  ' 0        0.01
  ' 4        0.0125
  ' 8        0.0165
  ' 16       0.0223
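
The table can also be read as relative cost. A small sketch in C (the function name is hypothetical) that expresses a timing as a percentage increase over the 0-sample baseline:

```c
/* Percentage increase of a timing t over a baseline timing base. */
double overhead_percent(double base, double t)
{
    return (t - base) / base * 100.0;
}
```

With the figures above, 4 samples cost roughly 25% more than no multisampling, 8 samples roughly 65% more, and 16 samples roughly 123% more.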


PS: DirectX is Microsoft's own brand of OpenGL. It's ClosedGL :)
« Last Edit: April 15, 2014, 07:06:44 AM by Charles Pegge »

Mike Lobanovsky

  • Guest
Re: Lorenz Landscape
« Reply #27 on: April 15, 2014, 08:36:33 PM »
Thanks for the benchmarks, Charles. I guess they were taken in fractions of a millisecond? Those would seem natural for the given window size and minimum complexity of the scene being rendered.

And the last question please, are you back to Windows again or were these measurements still taken under Mesa (Linux OpenGL) and Wine?

Charles Pegge

  • Guest
Re: Lorenz Landscape
« Reply #28 on: April 15, 2014, 10:39:24 PM »

These measurements were made under Vista. Both platforms are stable again, after I removed the cobwebs from the interior of my PC, and reinserted a few of the components :)

Mike Lobanovsky

  • Guest
Re: Lorenz Landscape
« Reply #29 on: April 16, 2014, 04:31:51 AM »
Quote
People, I really don't get what you see in OpenGL, because most of the great Windows games are created with DirectX... right?

Nowadays it's more a matter of the game author's personal preferences or corporate entity's marketing policy and targets.

OpenGL is a cross-platform standard abstracted from the underlying operating systems while DirectX is exclusively Windows-oriented proprietary stuff. Starting with Win XP's DX9, it depends heavily on many other components of this operating system.

OpenGL is older and it has always been more flexible, feature-rich, and responsive to immediate innovations in GPU technologies than its rival. Direct3D requires more profound knowledge of hardware specifics and it is more conservative about GPU innovation.

While immediate-mode OpenGL is easier for beginners, its advanced features are a perfect choice for professional platform-independent development. OpenGL covers all flavors of Linux, Windows, Mac OS X and portable device systems, while Direct3D is confined to Windows desktops and laptops and to the Xbox and Sega's Dreamcast video game consoles only.

OpenGL has always been the industry innovator with Direct3D being the eternal follower.

OpenGL is one of very few software technologies where I'm a staunch opponent of Microsoft's policies. Otherwise, I'm a dedicated Windows partisan.