FlayedOne's picture

Performance, Rendering loop and VSync question

Hello,
I have the following problem. My rendering loop looks as follows(or rather main loop):

        [System.Security.SuppressUnmanagedCodeSecurity]
        [DllImport("User32.dll", CharSet = CharSet.Auto)]
 
        public static extern bool PeekMessage(
            out Message msg,
            IntPtr hWnd,
            uint messageFilterMin,
            uint messageFilterMax,
            uint flags);
 
           Application.Idle += (idle_sender, idle_e) =>
            {
                while (AppStillIdle)
                {
                    simulate();
                    updateLabels();
                    SNEngine.render();
                }
            };
 
        private bool AppStillIdle
        {
            get
            {
                Message msg;
                return !PeekMessage(out msg,
                                    IntPtr.Zero,
                                    0, 0, 0);
            }
        }

In update labels I'm measuring performance(average time per frame, max time per frame), and once every x frames I update a label on my main form with those values. I thought, that by using VSync, I'll get a steady 60 frames per second as long as I have computational power to spare, but the results are odd:
with VSync I get:
avg tpf = 16,5ms max tpf = 85ms
without VSync:
avg tpf = 0,35ms max tpf = 3ms

As you can see, VSync does not enforce a steady framerate even though the average time seems right. If my application can compute and render scenes in<3ms, then why do I get 85ms frames from time to time? They are noticable, and look really ugly.


Comments

the Fiddler's picture

The most likely culprit is the garbage collector. The issue is that the GC stops all threads while collecting memory, which causes very noticeable hiccups in rendering.

There are two commonly suggested solutions:

  • allocate all memory up front and avoid allocating temporary objects while rendering. You can use memory pooling or value types to eliminate runtime allocations.
  • allocate memory at will, but keep the allocation graph as simple as possible. The GC will then be able to run quickly enough as to not cause noticeable delays.

OpenTK itself avoids allocating any memory after startup - unless, of course you create a new object (e.g. window, context, etc) or call a few specific methods that are documented to allocate memory (e.g. querying monitors, changing resolutions and a few more). If OpenTK *does* allocate memory, this is a bug that will be fixed.

The best approach is to run a profiler that can track object lifetime and optimize your code to avoid runtime allocations. While this may not be always possible, it should be possible to avoid gen-1 or higher GCs, which impact performance a lot (gen-0 GCs can be performed in less than 10ms, which is fine if done relatively infrequently).

Be warned that WinForms are trigger-happy and will allocate memory for every single event they raise. Imagine moving the mouse around - this hurts a lot. You can work around this issue by pre-processing WM_* messages as they arrive and raising the relevant events yourself (caching their parameters), but this is pretty ugly. OpenTK.GameWindow was created to avoid exactly this sort of issues.

Off-topic: OpenTK.GLControl has an IsIdle property that can be used instead of p/invoking PeekMessage manually. The implentation is identical to yours on win32, but it also supports linux and macos.

Edit:

Application.Idle += (idle_sender, idle_e) =>
            {
                while (glControl.IsIdle)
                {
                    simulate();
                    updateLabels();
                    SNEngine.render();
                }
            };
FlayedOne's picture

Thanks for the reply,

I honestly doub't it's the GCs fault, because I'm only allocating about 40 additional kbs per second. And from observing my memory manager, with GC.Collect() thrown into my loop each 100 or 1000 frames and without, I gather it rarely runs at all without my calling it.

My scene is extremely simple - 3 rotating cubes - 24 vertices. Adding more doesn't change the times(checked up to 15 cubes), so the problem lies somwhere else.

I also haven't really declared many events, and right now all are Keyboard events, so no room for going trigger happy;).

Thanks for the tip with glControl.IsIdle ;) It'll come in handy.

the Fiddler's picture

How are you measuring the allocation rate?

It doesn't matter how many events you have declared, memory is allocated regardless. (The OnFooChanged(), OnBarMoved() methods are always called, even if you don't have the relevant events hooked).

Suggestion: instead of calling GC.Collect() manually (this has nasty side-effects to GC performance), add a Console.WriteLine() call whenever you detect a hiccup in the frametime.

if (hiccup_detected)
{
    Console.WriteLine("Hiccup detected. Gen-0: {0}, Gen-1: {1}, Gen-2: {2}",
        GC.CollectionCount(0), GC.CollectionCount(1), GC.CollectionCount(2));
}

This will likely reveal whether the hiccups are correlated with GC collections. There *are* other things that may cause hiccups but this is by far the most common, in my experience.

FlayedOne's picture

I measure allocation rate using Memory Manager software which is more accurate then task manager, but similar. It can display memory allocation graph for a process.

Damn, stupid Forms interface:P

I wasn't suggesting it as a solution, but rather as a means of identifying the problem. Thanks for the much better solution;). Here are the results where hiccup is a >80ms frame:

Hiccup detected. Gen-0:1, Gen-1: 0, Gen-2: 0
Hiccup detected. Gen-0:1, Gen-1: 0, Gen-2: 0
Hiccup detected. Gen-0:1, Gen-1: 0, Gen-2: 0
Hiccup detected. Gen-0:1, Gen-1: 0, Gen-2: 0
Hiccup detected. Gen-0:2, Gen-1: 1, Gen-2: 0
Hiccup detected. Gen-0:2, Gen-1: 1, Gen-2: 0
Hiccup detected. Gen-0:2, Gen-1: 1, Gen-2: 0
Hiccup detected. Gen-0:2, Gen-1: 1, Gen-2: 0
Hiccup detected. Gen-0:2, Gen-1: 1, Gen-2: 0
Hiccup detected. Gen-0:3, Gen-1: 1, Gen-2: 0
Hiccup detected. Gen-0:4, Gen-1: 1, Gen-2: 0
Hiccup detected. Gen-0:4, Gen-1: 1, Gen-2: 0

I guess this means they aren't related. The strangest thing is that my app is very simple - there is almost no room for the hiccups...

I went further and logged some times per frames. Results:

Frame took: 28
Frame took: 14
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 15
Frame took: 18
Frame took: 16
Frame took: 17
Frame took: 82
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 66
Frame took: 0
Frame took: 13
Frame took: 16
Frame took: 16
Frame took: 0
Frame took: 33
Frame took: 16
Frame took: 18
Frame took: 31
Frame took: 1
Frame took: 0
Frame took: 31
Frame took: 17
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 64
Frame took: 16
Frame took: 16
Frame took: 17
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 1
Frame took: 31
Frame took: 17
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 16
Frame took: 19
Frame took: 14
Frame took: 1
Frame took: 0
Frame took: 0
Frame took: 80
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 17
Frame took: 16
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 16
Frame took: 18
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 64
Frame took: 0
Frame took: 33
Frame took: 32
Frame took: 2
Frame took: 30
Frame took: 2
Frame took: 0
Frame took: 47
Frame took: 0
Frame took: 1
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 15
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 67
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 63
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 0
Frame took: 32
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 66
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 66
Frame took: 15
Frame took: 28
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 54
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 66
Frame took: 0
Frame took: 31
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 81
Frame took: 1
Frame took: 16
Frame took: 16
Frame took: 32
Frame took: 17
Frame took: 1
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 15
Frame took: 1
Frame took: 33
Frame took: 2
Frame took: 0
Frame took: 0
Frame took: 63
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 66
Frame took: 14
Frame took: 0
Frame took: 32
Frame took: 16
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 67
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 15
Frame took: 33
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 66
Frame took: 16
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 64
Frame took: 16
Frame took: 0
Frame took: 33
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 66
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 1
Frame took: 0
Frame took: 0
Frame took: 80
Frame took: 1
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 0
Frame took: 0
Frame took: 58
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 56
Frame took: 0
Frame took: 31
Frame took: 0
Frame took: 33
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 94
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 70
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 85
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 62
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 67
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 13
Frame took: 17
Frame took: 85
Frame took: 0
Frame took: 32
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 62
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 67
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 66
Frame took: 0
Frame took: 0
Frame took: 46
Frame took: 17
Frame took: 16
Frame took: 32
Frame took: 16
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 64
Frame took: 0
Frame took: 33
Frame took: 16
Frame took: 32
Frame took: 0
Frame took: 16
Frame took: 0
Frame took: 32
Frame took: 0
Frame took: 0
Frame took: 49
Frame took: 16
Frame took: 16
Frame took: 16
Frame took: 17
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 83
Frame took: 16
Frame took: 0
Frame took: 16
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 15
Frame took: 16
Frame took: 17
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 15
Frame took: 17
Frame took: 16
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 16
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 16
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 16
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 83
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 16
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 81
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 18
Frame took: 15
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 16
Frame took: 17
Frame took: 65
Frame took: 16
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 18
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 18
Frame took: 16
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 18
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 67
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 18
Frame took: 15
Frame took: 18
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 83
Frame took: 1
Frame took: 15
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 67
Frame took: 16
Frame took: 34
Frame took: 0
Frame took: 12
Frame took: 16
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 17
Frame took: 82
Frame took: 0
Frame took: 16
Frame took: 20
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 12
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 66
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 64
Frame took: 20
Frame took: 12
Frame took: 16
Frame took: 32
Frame took: 3
Frame took: 13
Frame took: 16
Frame took: 16
Frame took: 16
Frame took: 32
Frame took: 1
Frame took: 31
Frame took: 1
Frame took: 0
Frame took: 34
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 13
Frame took: 66
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 67
Frame took: 14
Frame took: 16
Frame took: 16
Frame took: 16
Frame took: 20
Frame took: 13
Frame took: 16
Frame took: 16
Frame took: 16
Frame took: 16
Frame took: 16
Frame took: 0
Frame took: 34
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 16
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 82
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 18
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 83
Frame took: 1
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 18
Frame took: 16
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 82
Frame took: 1
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 18
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 16
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 82
Frame took: 3
Frame took: 14
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 67
Frame took: 17
Frame took: 14
Frame took: 0
Frame took: 0
Frame took: 62
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 2
Frame took: 17
Frame took: 65
Frame took: 19
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 11
Frame took: 83
Frame took: 1
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 67
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 13
Frame took: 17
Frame took: 66
Frame took: 16
Frame took: 16
Frame took: 20
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 13
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 82
Frame took: 2
Frame took: 14
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 65
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 66
Frame took: 31
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 67
Frame took: 16
Frame took: 18
Frame took: 0
Frame took: 0
Frame took: 46
Frame took: 16
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 82
Frame took: 2
Frame took: 14
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 81
Frame took: 1
Frame took: 18
Frame took: 0
Frame took: 0
Frame took: 62
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 66
Frame took: 18
Frame took: 0
Frame took: 0
Frame took: 1
Frame took: 13
Frame took: 65
Frame took: 18
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 66
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 13
Frame took: 17
Frame took: 67
Frame took: 15
Frame took: 18
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 13
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 83
Frame took: 15
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 66
Frame took: 17
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 79
Frame took: 1
Frame took: 15
Frame took: 16
Frame took: 32
Frame took: 3
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 82
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 13
Frame took: 68
Frame took: 13
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 16
Frame took: 16
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 82
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 18
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 16
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 67
Frame took: 16
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 16
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 18
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 66
Frame took: 0
Frame took: 0
Frame took: 0
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 18
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 17
Frame took: 15
Frame took: 18
Frame took: 17
Frame took: 15

Very strange... I thought VSync will block every frame from being rendered faster then 16ms from the earlier one, but here it looks as if sometimes it blocks correctly giving smootch rendering, and sometimes it let's 3-4 frames get rendered in 1 ms, and blocks the 4th-5th one for 60-80ms.

the Fiddler's picture

Yes, there doesn't seem to be a strong correlation between the two.

Other low hanging fruit:

  • maybe the hiccups are related to mouse movement? This may sound strange, but it's likely that Application.Idle is not raised for as long as you are moving the mouse!
  • do the hiccups still occur with vsync off?
  • are you sleeping the rendering thread at all?
  • are you running any background processes that could steal CPU time? Try bumping the priority of your rendering thread by one click (any more and general system performance will suffer)
  • if you are using a Nvidia card, try turning off the "threaded optimization" option found somewhere in the advanced section of their control center.
  • if your CPU has multiple cores, try pinning your process to a single one (I don't remember off hand how to do this, either a p/invoke or an external utility can be used for this).

Edit: are you using System.Diagnostics.Stopwatch to measure frame time? Ensure that Stopwatch.IsHighResolution returns true, otherwise the results will be completely inaccurate (~10ms accuracy or worse).

FlayedOne's picture

Thanks again for the reply,

the Fiddler wrote:

maybe the hiccups are related to mouse movement? This may sound strange, but it's likely that Application.Idle is not raised for as long as you are moving the mouse>

I haven't moved the mouse at all during the tests. They appear even when I don't move the mouse, and I haven't noticed any changes in framerate even when I furiously did move it;).

the Fiddler wrote:

do the hiccups still occur with vsync off?

They do not, or you could say that they do, but are much smaller. Most frames take 0.30ms, and occasional(one every 20-50 frames) take more then 1ms.(but less then 4ms)

the Fiddler wrote:

are you sleeping the rendering thread at all?

No, no sleeps at all in my application.

the Fiddler wrote:

are you running any background processes that could steal CPU time? Try bumping the priority of your rendering thread by one click (any more and general system performance will suffer)

Standart XP system garbage. After setting priority to AboveNormal I don't see any changes. I don't see any changes if it's on Highest too.

the Fiddler wrote:

if you are using a Nvidia card, try turning off the "threaded optimization" option found somewhere in the advanced section of their control center.

Ati Radeon 3870

the Fiddler wrote:

if your CPU has multiple cores, try pinning your process to a single one (I don't remember off hand how to do this, either a p/invoke or an external utility can be used for this).

This one I've never tried, Ill look into that.

the Fiddler wrote:

Edit: are you using System.Diagnostics.Stopwatch to measure frame time? Ensure that Stopwatch.IsHighResolution returns true, otherwise the results will be completely inaccurate (~10ms accuracy or worse).

Yes, I'm using a Stopwatch and it is high resolution.

As a sidenote, some of the hiccups from the logs are supposedly because of logging itself. The application works much better without console writes, but it doesn't change the fact that I still get a hiccup around once a second.

the Fiddler's picture

Another suggestion: install CPU drivers! (especially important for AMD CPUs running on WinXP). WinXP are rather dump and will move processes between CPU cores much more than they should, especially if they sleep (I guess vsynced SwapBuffers() counts as such). By itself, this will not cause 80ms hiccups, but add a timestamp counter running out of sync between cores and you have a real problem in your hands (CPU drivers will fix this).

Finally, try modifying your power settings to "high performance" or similar - in short, disable any performance scaling performed by your CPU (in the most extreme cases, this can be disabled through your BIOS). This is hardware and OS-dependent, but CPU scaling can introduce unwanted hiccups - which is why many commercial games recommend disabling this feature.

FlayedOne's picture

I do have CPU&Motherboard drivers installed(Intel Pentium Dual-Core E5200). I also don't have any power-preserve settings on... I also get similar hiccups when the only things that happen in my loop is logging and swapping buffers.

It seems Im gonna have to use something other than VSync to get a steady update rate of the world, and let my rendering go without frame limiters. Thanks for your help. I appreciate it.

ebeckers's picture

Hi

I had the same problem some time ago with a C# windows application which used DirectX
The reason for these hickups was the onidle messageloop , same thing that OpenTK is using.
With this principle the main thread is handling both windows messages and rendering frames.
Sometimes a windows message takes a bit longer and thus FPS drop.
I solved it then by doing all the rendering in a seperate thread

Now i'm working on an OpenTK app which runs under linux & windows and I do see the same thing again.
So i would suggest create a new thread which handles all the rendering and let the main thread only handle windows messages

the Fiddler's picture

Actually, that's a pretty good idea for a tutorial.

kanato's picture

Did you install SQL server when you installed Visual Studio? I think it is usually installed by default. In the past I ran into this sort of thing caused by SQL server running in the background. Maybe a long shot, but it might be worth checking if SQL server is the cause before blaming OpenTK or the garbage collector.

aberghage's picture

I can second ebeckers' idea of running message handling on a separate thread. I'm working with a thread which just handles windows messages and raises the appropriate internal events (pulling 'em from an object pool to reduce garbage, as was mentioned above). If anyone has an interest in seeing numbers I can try benchmarking this when I get a few spare minutes.

Just a quick note: I'm not sure how to do it from code, never tried, but pinning a process to one cpu is pretty quick to do once-off with the windows task manager. Right click the process and hit "Set Affinity..." (super descriptive name, Microsoft!).

Inertia's picture

Since you're using the same graphics card this may be it: I've experienced similar hiccups with Radeon 3870 due to a bug in Catalyst 9.5 (9.6? 9.7? idk, it has been fixed for a while now) where any OpenGL app with vsync enabled would crawl at 10-15fps, while with vsync disabled it'd go up to ~2400fps. Updating drivers may help. It may also help to see if you can replicate the problem on other PCs.

the Fiddler's picture

Just a word of warning, I've heard that older Nvidia drivers used to have issues with message loops running on a separate thread (some form of conflict between Nvidia's "threaded optimizations"). This should no longer be the case, but it's something to keep an eye out for.

I've looked into moving OpenTK's message loop to a different thread, but decided against this in the end - at least for the 1.0 release. Do post some numbers when you find the time, this is something that might be worth revisiting in a future release.

You can set CPU affinity through code if you pinvoke the relevant API (IIRC "SetProcessorAffinity" or something close to that). There's a similar function for Linux and there should be something for Mac OS X (but no idea what it is called).

Finally, OpenTK reuses event args to avoid garbage. Yes, this means that the user shouldn't store event args outside the relevant event without cloning them, but this need rarely comes up in practice so it's not an issue. (An event pool faces the same issue, so it doesn't help here - how can you know that the user didn't store a reference to your pooled event?)

plohrmann's picture

Your issue is likely due to the fact that the CPU can't really time what the GPU is doing. VSync prevents the window from REFRESHING 60 times per second, but it doesn't directly stall the GPU. The cyclic nature of your results supports this - let me try to explain. Most OpenGL calls do not return data, they just send data to the GPU, this allows the CPU to give a bunch of info to the GPU without stalling. The SwapBuffers just says "i'm done with that frame, now I'm starting another", and the GPU schedules that message with everything else. If you use the CPU to time that frame, it may have a low value because the CPU didn't wait at all. However, at some point the GPU has to actually work (after, say, 1/60 of a second) which may happen after a few frames have been provided by the CPU (thus you have several frames of zero ms) , the GPU will process them all and display them as needed; during this processing time, the CPU may switch threads / processes while the GPU catches up (thus the hiccup).

Once you cross the boundary between being CPU bound vs GPU bound, then the CPU timings will start to be more accurate. You will need A LOT more than 15 rotating cubes to do that on a 3870!

You're measurements would be better served if you either average them over all the frames, or a rolling average over the last 1 second worth of frames.