MemoryLeak despite disposing of V8 Script Engines

Aug 6, 2015 at 7:22 PM
Greetings! I could really use some help figuring out how to handle this. I've spent a lot of time trying to isolate the problem. I suspect I have multiple problems, actually.

My scenario is what I need to have a user-defined script hook into an event I expose. The event can be raised at any time in the future, so it's going to be a long-running script that sits around. I pass the event a managed object (a kind of event args) that contains information about the event.

The memory usage was continuously climbing as events were raised.

Reading other discussions here, I gather that passing managed objects to scripts can cause some tricky memory management issues. You suggested disposing of them when they are done being used. I can't really do that easily, though. The object that is passed to script contains methods that the script can use to get access to even more objects. To be able to clean them all up, I'd have to have an event-scoped list of objects that were passed in so that I could easily dispose them later, which would be possible, but difficult.

So instead of doing that, I keep track of how often I call the event. After every 100 calls, or so, I will completely dispose of the V8 Engine and the script, recompile it, and re-evalulate it. I figured whatever memory use issues I have should be covered by that, so no need to dispose the host objects directly. Or so I thought.

Even with that, memory continues to climb. I've created a reproduction app to demonstrate:
    class Program
    {
        static void Main(string[] args)
        {
            for (var i = 0; i < 50; i++)
            {
                Console.WriteLine("Iteration " + i + ". Memory = " + Process.GetCurrentProcess().PrivateMemorySize64.ToString("n0"));

                var engine = new Microsoft.ClearScript.V8.V8ScriptEngine();
                var host = new HostObject();
                engine.AddHostObject("host", host);

                var compiledScript = engine.Compile(@"host.onEvent(function() {});");
                engine.Execute(compiledScript);

                for (var j = 0; j < 50000; j++)
                {
                    Raise(host);
                }

                engine.Dispose();
                GC.Collect();
            }

            Console.WriteLine("Done. Memory: " + Process.GetCurrentProcess().PrivateMemorySize64.ToString("n0"));
            Console.ReadLine();
        }

        private static void Raise(HostObject host)
        {
            host.RaiseEvent(new SomeEventArgs());
        }
    }

    public class HostObject
    {
        private dynamic _callback;

        public void onEvent(dynamic callback)
        {
            _callback = callback;
        }

        internal void RaiseEvent(SomeEventArgs args)
        {
            _callback(args);
        }
    }

    public class SomeEventArgs
    {
        public object getThing()
        {
            return new Thing();
        }
    }

    public class Thing
    {
    }
So it's just a loop that continues to raise events. After 50,000 events have been raised, it disposes of the V8 engine and forces GC.Collect. Then it starts over. Each iteration it spits out the private memory use. The output looks like this:
Iteration 0. Memory = 27,127,808
Iteration 1. Memory = 48,844,800
Iteration 2. Memory = 50,040,832
Iteration 3. Memory = 54,665,216
Iteration 4. Memory = 61,140,992
Iteration 5. Memory = 59,023,360
Iteration 6. Memory = 60,551,168
Iteration 7. Memory = 65,290,240
Iteration 8. Memory = 63,279,104
Iteration 9. Memory = 68,558,848
Iteration 10. Memory = 66,580,480
Iteration 11. Memory = 72,167,424
Iteration 12. Memory = 69,926,912
Iteration 13. Memory = 68,534,272
Iteration 14. Memory = 75,706,368
Iteration 15. Memory = 71,839,744
Iteration 16. Memory = 79,609,856
Iteration 17. Memory = 79,757,312
Iteration 18. Memory = 76,496,896
Iteration 19. Memory = 83,943,424
Iteration 20. Memory = 79,884,288
Iteration 21. Memory = 87,781,376
Iteration 22. Memory = 87,670,784
Iteration 23. Memory = 84,738,048
Iteration 24. Memory = 91,852,800
Iteration 25. Memory = 88,100,864
Iteration 26. Memory = 95,801,344
Iteration 27. Memory = 95,752,192
Iteration 28. Memory = 92,495,872
Iteration 29. Memory = 100,003,840
Iteration 30. Memory = 95,817,728
Iteration 31. Memory = 103,780,352
Iteration 32. Memory = 104,042,496
Iteration 33. Memory = 100,737,024
Iteration 34. Memory = 107,982,848
Iteration 35. Memory = 104,165,376
Iteration 36. Memory = 111,931,392
Iteration 37. Memory = 112,066,560
Iteration 38. Memory = 108,818,432
Iteration 39. Memory = 116,264,960
Iteration 40. Memory = 112,078,848
Iteration 41. Memory = 120,107,008
Iteration 42. Memory = 119,910,400
Iteration 43. Memory = 117,051,392
Iteration 44. Memory = 124,166,144
Iteration 45. Memory = 120,414,208
Iteration 46. Memory = 128,114,688
Iteration 47. Memory = 128,196,608
Iteration 48. Memory = 125,067,264
Iteration 49. Memory = 132,448,256
Done. Memory: 128,397,312
Using windbg I looked at the heap. There's a single dictionary in memory that seems to grow forever:

000007fe938e5848 1 1206920 System.Runtime.CompilerServices.ConditionalWeakTable2+Entry[[System.Object, mscorlib],[System.Collections.Generic.List1[[System.WeakReference, mscorlib]], mscorlib]][]

I presume this is some kind of weakreference dictionary that keeps track of proxy objects. It contains 75,000 elements in this example.

0:000> !do 0000000012805c40
Name: System.Runtime.CompilerServices.ConditionalWeakTable2+Entry[[System.Object, mscorlib],[System.Collections.Generic.List1[[System.WeakReference, mscorlib]], mscorlib]][]
MethodTable: 000007fe938e5848
EEClass: 000007fe938e57b8
Size: 1206920(0x126a88) bytes
Array: Rank 1, Number of elements 75431, Type VALUETYPE
Fields:
None

To be fair, while this does seem to reproduce a problem, I'm not even sure that it's my actual problem. I think I have multiple. My actual program is leaking memory a lot faster than I can reproduce with this sample app. That may just be because the host objects I'm passing in are larger though. I could perhaps use some advice or guidelines on what to do or what not to do for the scenario I'm in.
Aug 6, 2015 at 10:46 PM
I wonder if this issue isn't the cause? I just noticed that the thread count on my process goes up, too. For example, I end up with 4 gigs of memory use and over 600 threads.

https://clearscript.codeplex.com/workitem/86
Coordinator
Aug 7, 2015 at 3:40 PM
Hello infinitiesloop!

You've discovered a very subtle bug that's unrelated to Issue #86. We're already testing fixes for both. Thank you!

The excessive thread usage you're seeing is also unrelated, and we cannot reproduce it. Do you have any relevant stacks?

Thanks again!
Aug 7, 2015 at 8:54 PM
I'm very interested in those fixes! I appreciate how active you guys are on this project still. We're using it for some compelling things :)

It may be that my next move is to wait for your fixes and see where I go from there. If it helps though, I do see some threads that look like this:
OS Thread Id: 0xf260 (17)
Current frame: ntdll!NtWaitForSingleObject+0xa
Child-SP         RetAddr          Caller, Callee
000000002c1bfd10 000007fefd2410dc KERNELBASE!WaitForSingleObjectEx+0x9c, calling ntdll!NtWaitForSingleObject
000000002c1bfd90 000000007712413d ntdll!RtlAllocateHeap+0xd9d, calling ntdll!RtlAllocateHeap+0xdb0
000000002c1bfdb0 000007fee3b46319 *** WARNING: Unable to verify checksum for v8-x64.dll
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for v8-x64.dll - 
v8_x64!v8::ResourceConstraints::code_range_size+0xeb9, calling v8_x64!v8::ArrayBuffer::Contents::Contents+0x26a0
000000002c1bfe40 000007fee3d5f970 v8_x64!v8::ArrayBuffer::Contents::Contents+0x1550
000000002c1bfe70 000007fee8164f7f *** ERROR: Symbol file could not be found.  Defaulted to export symbols for msvcr120.dll - 
msvcr120!beginthreadex+0x107
000000002c1bfea0 000007fee8165126 msvcr120!endthreadex+0x192, calling msvcr120!beginthreadex+0xf0
000000002c1bfed0 0000000076fc652d kernel32!BaseThreadInitThunk+0xd
000000002c1bff00 00000000770fc521 ntdll!RtlUserThreadStart+0x21
So some thread sitting on a Wait within v8-x64.dll. There are a bunch that look like this.

I will continue to play around see if I can come up with a simple reproduction of the run away threads, but not too hopeful I can do it at the moment.
Aug 7, 2015 at 11:21 PM
Here's the callstack with arguments. It's strange to me that NtWaitForSingleObject is called with all 0s. It's also strange to me that code_range_size is waiting on anything, because in the V8 code, this is just a field lookup:

https://v8docs.nodesource.com/io.js-1.2/d4/da0/v8_8h_source.html#l04338
0:000> ~113 kb
RetAddr           : Args to Child                                                           : Call Site
000007fe`fda210dc : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!NtWaitForSingleObject+0xa
000007fe`c6906319 : 00000000`69160df0 00000000`00000000 00000000`00000000 00000000`000010cc : KERNELBASE!WaitForSingleObjectEx+0x9c
000007fe`c6b1f970 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : v8_x64!v8::ResourceConstraints::code_range_size+0xeb9
000007fe`f2b74f7f : 00000000`69274840 00000000`00000000 00000000`00000000 00000000`00000000 : v8_x64!v8::ArrayBuffer::Contents::Contents+0x1550
000007fe`f2b75126 : 000007fe`f2c2cb80 00000000`00000000 00000000`00000000 00000000`00000000 : msvcr120!beginthreadex+0x107
00000000`77a459cd : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : msvcr120!endthreadex+0x192
00000000`77c7b981 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : kernel32!BaseThreadInitThunk+0xd
00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x21
Any thoughts?
Coordinator
Aug 8, 2015 at 1:48 AM
Hi again,

Hmm, that stack doesn't look correct at all. As you mentioned, v8::ResourceConstraints::code_range_size() doesn't call anything, and neither does the v8::ArrayBuffer::Contents constructor. Are you sure you have the right symbols?

Cheers!
Aug 10, 2015 at 6:28 PM
I don't think it is right, but I'm not sure how to get the correct symbols. Here's the 'noisy' output from windbg. I added a MS and Chrome symbol path:
000007fe`fda210dc : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!NtWaitForSingleObject+0xa
SYMSRV:  d:\symbols\v8-x64.pdb\AEFF2A3642C44ACE9BB070E1CFCFCC8D1\v8-x64.pdb not found
SYMSRV:  https://msdl.microsoft.com/download/symbols/v8-x64.pdb/AEFF2A3642C44ACE9BB070E1CFCFCC8D1/v8-x64.pdb not found
SYMSRV:  d:\symbols\v8-x64.pdb\AEFF2A3642C44ACE9BB070E1CFCFCC8D1\v8-x64.pdb not found
SYMSRV:  https://chromium-browser-symsrv.commondatastorage.googleapis.com/v8-x64.pdb/AEFF2A3642C44ACE9BB070E1CFCFCC8D1/v8-x64.pdb not found
DBGHELP: C:\Projects\!OPEN_SOURCE\clearscript\ClearScript\V8\V8\build\v8-x64\build\Release\v8-x64.pdb - file not found
*** WARNING: Unable to verify checksum for v8-x64.dll
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for v8-x64.dll - 
DBGHELP: v8_x64 - export symbols
Where does v8-x64 come from?
Coordinator
Aug 10, 2015 at 7:49 PM
V8 symbols are generated when you build ClearScript. If you obtained ClearScript via NuGet or another binary package, you could try searching for symbols somewhere within the package directory.

By the way, what ClearScript version are you using?
Aug 10, 2015 at 8:27 PM
Edited Aug 10, 2015 at 9:23 PM
I am using 5.4.2.1, from Nuget. It does not ship with the symbols.

I've tried to build from source, but upon opening the solution, VS complains that it is using an older version of the VC++ compiler/libraries. It tries to update them, but then the build still fails due to missing "v120" tools. Perhaps I just don't have an old enough VS/.NET SDK? Or is there a way for me to just grab the symbols from somewhere?

EDIT: I have successfully built the v8 component and am currently testing it. I plan on posting an update with hopefully better windbg results soon.
Coordinator
Aug 10, 2015 at 9:33 PM
ClearScript's build procedure is outlined in the ReadMe. It currently requires Visual Studio 2013 (which includes the "v120" toolset), but the next point release, with the fixes mentioned above, will support Visual Studio 2015 as well.
Aug 10, 2015 at 10:51 PM
Got it! Here's a much better call stack. After a few thousand iterations in my system (which equates to a few thousand events being raised -- e.g. js callbacks being called), I end up with almost 200 threads stuck at the same place:
0:000> ~197 kb
RetAddr           : Args to Child                                                           : Call Site
000007fe`fda210dc : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!NtWaitForSingleObject+0xa
000007fe`c59e6989 : 00000000`747e3880 00000000`00000000 00000000`00000000 00000000`000006e0 : KERNELBASE!WaitForSingleObjectEx+0x79
000007fe`c5bffc20 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : v8_x64!v8::internal::OptimizingCompilerThread::Run+0x59 [d:\git\clearscript\clearscript\v8\v8\build\v8-x64\src\optimizing-compiler-thread.cc @ 132]
000007fe`f2b74f7f : 00000000`1e35c9d0 00000000`00000000 00000000`00000000 00000000`00000000 : v8_x64!v8::base::ThreadEntry+0x20 [d:\git\clearscript\clearscript\v8\v8\build\v8-x64\src\base\platform\platform-win32.cc @ 1296]
000007fe`f2b75126 : 000007fe`f2c2cb80 00000000`00000000 00000000`00000000 00000000`00000000 : msvcr120!_callthreadstartex+0x17 [f:\dd\vctools\crt\crtw32\startup\threadex.c @ 376]
00000000`77a459cd : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : msvcr120!_threadstartex+0x102 [f:\dd\vctools\crt\crtw32\startup\threadex.c @ 354]
00000000`77c7b981 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : kernel32!BaseThreadInitThunk+0xd
00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x1d
Coordinator
Aug 11, 2015 at 4:44 AM
It appears that V8 maintains one "optimizing compiler thread" per isolate - an implementation detail beyond ClearScript's control. If your application uses the V8ScriptEngine constructor, it creates one isolate per engine instance.

You mentioned that you're seeing roughly 200 compiler threads. Is it possible that your application at that point is holding that many engine instances? Our normal recommendation is to use roughly as many concurrent engine instances as there are CPU cores in the machine.
Aug 11, 2015 at 5:14 AM
I have at most 10-15 scripts active at a time. For each script, I create a separate V8Runtime and create an engine off of that. I compile the script, too. So for each script I have a Tuple<V8Runtime, V8Engine, V8Script>.

At first I did not create an explicit runtime and used the V8Engine constructor. But in an effort to solve this problem I added the explicit V8Runtime so that I could dispose of it explicitly.

So now, after a particular script receives 250 callbacks, I will completely dispose of the script, engine, and runtime, then build them back up again. I was hoping whatever was leaking would get cleaned up that way, but it didn't make any difference.

I could instead keep a pool of V8Runtimes and reuse them and see if that helps. But if I do that, I have a question. These are all long-running scripts, because they are waiting for a callback initiated by managed code. But I want each script to be isolated and not collide any globals between them. I also would like them to execute in parallel if possible -- because each one really has nothing to do with any of the others, so serializing execution of them would be silly and a possible bottleneck for me. So, the question is, if two V8Engines share the same V8Runtime, will they still be isolated from each other in that regard?
Aug 11, 2015 at 7:09 AM
TL;DR: I may have identified my problem, which was my fault, but I'm not sure if it is expected behavior from ClearScript.

Alright I figured out why my thread count was continuously rising. You saying that it creates that optimization thread per isolate was what helped me find it.

Sometimes my javascript will call into a managed object and expect back a javascript array (I always try to avoid exposing .NET details to the scripts). On the managed side, I have a List<T>. It converts it into an Array by a ClearScript Evaluate("[]") call, then pushes each item into it. Well, I did that wrong. The Evaluate("[]") was being done on a NEW ScriptEngine rather than the Engine from the calling code. So the result was I was returning a js array created from one ScriptEngine back into code from another ScriptEngine.

Apparently when you do that, the object won't be cleaned up until disposing of the receiving engine.

I fixed that, and it's been a huge improvement. I'm still monitoring it for leaks though -- I'm not yet convinced I'm out of the woods. Some of that may just be the known problems you mentioned.

I'm also a bit confused by the results, because my app is very explicitly disposing of the script, engine, and runtime after every 200 calls. It even calls GC.Collect. But the thread count was still rising forever. This sample app sort of reproduces by problem, but I can't reproduce the inability to explicitly clean it up by disposing of things. Perhaps there's yet more subtlety to my app preventing that.

Sample app:
        private const string _script = @"
project.onIssueExceptionStatsChanged(function(ev) {
    var things = ev.getThings();
});
";

        static void Main(string[] args)
        {
            var runtime = new V8Runtime();
            var engine = runtime.CreateScriptEngine();
            var compiledScript = runtime.Compile(_script);

            var host = new HostObject();
            engine.AddHostObject("project", host);
            
            engine.Execute(compiledScript);

            for (var i = 0; i < 10; i++)
            {
                var iteration = i;

                Console.WriteLine("Iteration {0}. Threads = {1}. Memory = {2}.",
                    iteration,
                    Process.GetCurrentProcess().Threads.Count,
                    Process.GetCurrentProcess().PrivateMemorySize64.ToString("n0"));

                for (var j = 0; j < 100; j++)
                {
                    host.RaiseEvent(new SomeEventArgs());
                }
            }

            GC.Collect();
            GC.WaitForPendingFinalizers();

            Console.WriteLine("Done. Hit enter to Dispose. Threads: {0}. Memory: {1}",
                Process.GetCurrentProcess().Threads.Count,
                Process.GetCurrentProcess().PrivateMemorySize64.ToString("n0"));
            Console.ReadLine();


            compiledScript.Dispose();
            engine.Dispose();
            runtime.Dispose();

            // need to do this explicitly or it may not get cleaned up yet...
            GC.Collect();
            GC.WaitForPendingFinalizers();

            Console.WriteLine("Done. Threads: {0}. Memory: {1}",
                Process.GetCurrentProcess().Threads.Count,
                Process.GetCurrentProcess().PrivateMemorySize64.ToString("n0"));
            Console.ReadLine();
        }
    }

    public class HostObject 
    {
        private dynamic _callback;

        public void onIssueExceptionStatsChanged(dynamic callback)
        {
            _callback = callback;
        }

        internal void RaiseEvent(SomeEventArgs args)
        {
            _callback(args);
        }
    }

    public class SomeEventArgs
    {
        public object getThings()
        {
            dynamic list = new V8ScriptEngine().Evaluate("[]");
            list.push("foo");
            return list;
        }
    }
Output:
Iteration 0. Threads = 5. Memory = 30,679,040.
Iteration 1. Threads = 105. Memory = 942,821,376.
Iteration 2. Threads = 205. Memory = 1,850,576,896.
Iteration 3. Threads = 305. Memory = 2,758,254,592.
Iteration 4. Threads = 405. Memory = 3,662,749,696.
Iteration 5. Threads = 505. Memory = 4,570,488,832.
Iteration 6. Threads = 605. Memory = 5,477,806,080.
Iteration 7. Threads = 705. Memory = 6,386,348,032.
Iteration 8. Threads = 808. Memory = 7,291,899,904.
Iteration 9. Threads = 908. Memory = 8,198,541,312.
Done. Hit enter to Dispose. Threads: 1008. Memory: 9,104,228,352

Done. Threads: 4. Memory: 38,354,944
Coordinator
Aug 11, 2015 at 12:40 PM
Hi infinitiesloop,

The Evaluate("[]") was being done on a NEW ScriptEngine rather than the Engine from the calling code. So the result was I was returning a js array created from one ScriptEngine back into code from another ScriptEngine.

Ah, that explains it. Although ClearScript supports exposing foreign script objects, we don't recommend it. One of the problems is that the receiving engine has no way to discover the "weight" of an external object; it only knows the size of its local proxy. That can lead to suboptimal garbage collection behavior. In this case the external object is very large (as it holds another script engine), whereas the proxy is tiny, so the engine doesn't prioritize its garbage collection.

Apparently when you do that, the object won't be cleaned up until disposing of the receiving engine.

That's correct as long as the receiving engine continues to use the proxy. If it garbage-collects the proxy, the object becomes available for managed garbage collection.

I can't reproduce the inability to explicitly clean it up by disposing of things. Perhaps there's yet more subtlety to my app preventing that.

Please let us know what you find. By the way, the next ClearScript release will use a newer V8 version that seems to have done away with the compiler thread. Here's what your test program's output will look like:
Iteration 0. Threads = 3. Memory = 23,080,960.
Iteration 1. Threads = 3. Memory = 545,955,840.
Iteration 2. Threads = 3. Memory = 1,064,095,744.
Iteration 3. Threads = 3. Memory = 1,583,087,616.
Iteration 4. Threads = 3. Memory = 2,100,662,272.
Iteration 5. Threads = 3. Memory = 2,618,945,536.
Iteration 6. Threads = 3. Memory = 3,135,750,144.
Iteration 7. Threads = 5. Memory = 3,653,799,936.
Iteration 8. Threads = 5. Memory = 4,169,682,944.
Iteration 9. Threads = 5. Memory = 4,688,252,928.
Done. Hit enter to Dispose. Threads: 5. Memory: 5,203,578,880

Done. Threads: 3. Memory: 29,364,224
Cheers!