Search

11/30/2009

Javascript Memory Leaks

JScript Memory Leaks

When a DOM object contains a reference to a JavaScript object (such an event handling function), and when that JavaScript object contains a reference to that DOM object, then a cyclic structure is formed. This is not in itself a problem. At such time as there are no other references to the DOM object and the event handler, then the garbage collector (an automatic memory resource manager) will reclaim them both, allowing their space to be reallocated. The JavaScript garbage collector understands about cycles and is not confused by them. Unfortunately, IE's DOM is not managed by JScript. It has its own memory manager that does not understand about cycles and so gets very confused. As a result, when cycles occur, memory reclamation does not occur. The memory that is not reclaimed is said to have leaked. Over time, this can result in memory starvation. In a memory space full of used cells, the browser starves to death.

IBM developerWorks - Memory leak patterns in JavaScript
Internet Explorer and Mozilla Firefox are two browsers that use reference counting to handle memory for DOM objects. In a reference counting system, each object referenced maintains a count of how many objects are referencing it. If the count becomes zero, the object is destroyed and the memory is returned to the heap. Although this solution is generally very efficient, it has a blind spot when it comes to circular (or cyclic) references.

What's wrong with circular references?
A circular reference is formed when two objects reference each other, giving each object a reference count of 1. In a purely garbage collected system, a circular reference is not a problem: If neither of the objects involved is referenced by any other object, then both are garbage collected. In a reference counting system, however, neither of the objects can be destroyed, because the reference count never reaches zero. In a hybrid system, where both garbage collection and reference counting are being used, leaks occur because the system fails to identify a circular reference. In this case, neither the DOM object nor the JavaScript object is destroyed. Listing 1 shows a circular reference between a JavaScript object and a DOM object.

<html>
<body>
<script type="text/javascript">
document.write("Circular references between JavaScript and DOM!");
var obj;
window.onload = function(){
obj=document.getElementById("DivElement");
document.getElementById("DivElement").expandoProperty=obj;
obj.bigString=new Array(1000).join(new Array(2000).join("XXXXX"));
};
</script>
<div id="DivElement">Div Element</div>
</body>
</html>

As you can see in the above listing, the JavaScript object obj has a reference to the DOM object represented by DivElement. The DOM object, in turn, has a reference to the JavaScript object through the expandoProperty. A circular reference exists between the JavaScript object and the DOM object. Because DOM objects are managed through reference counting, neither object will ever be destroyed.

Learning about closures

<html>
<body>
<script type="text/javascript">
document.write("Closure Demo!!");
window.onload=
function closureDemoParentFunction(paramA)
{
var a = paramA;
return function closureDemoInnerFunction (paramB)
{
alert( a +" "+ paramB);
};
};
var x = closureDemoParentFunction("outer x");
x("inner x");
</script>
</body>
</html>

In the above listing closureDemoInnerFunction is the inner function defined within the parent function closureDemoParentFunction. When a call is made to closureDemoParentFunction with a parameter of outer x, the outer function variable a is assigned the value outer x. The function returns with a pointer to the inner function closureDemoInnerFunction, which is contained in the variable x.

It is important to note that the local variable a of the outer function closureDemoParentFunction will exist even after the outer function has returned. This is different from programming languages such as C/C++, where local variables no longer exist once a function has returned. In JavaScript, the moment closureDemoParentFunction is called, a scope object with property a is created. This property contains the value of paramA, also known as "outer x". Similarly, when the closureDemoParentFunction returns, it will return the inner function closureDemoInnerFunction, which is contained in the variable x.

Because the inner function holds a reference to the outer function's variables, the scope object with property a will not be garbage collected. When a call is made on x with a parameter value of inner x -- that is, x("inner x") -- an alert showing "outer x innerx" will pop up.

Closures and circular references
In Listing 5 you see a closure in which a JavaScript object (obj) contains a reference to a DOM object (referenced by the id "element"). The DOM element, in turn, has a reference to the JavaScript obj(innerFunction). The resulting circular reference between the JavaScript object and the DOM object causes a memory leak.

<html>
<body>
<script type="text/javascript">
document.write("Program to illustrate memory leak via closure");
window.onload=function outerFunction(){
var obj = document.getElementById("element");
obj.onclick=function innerFunction(){
alert("Hi! I will leak");
};
obj.bigString=new Array(1000).join(new Array(2000).join("XXXXX"));
// This is used to make the leak significant
};
</script>
<button id="element">Click Me</button>
</body>
</html>

Avoiding memory leaks
Listing 6. Break the circular reference
<html>
<body>
<script type="text/javascript">
document.write("Avoiding memory leak via closure by breaking the circular
reference");
window.onload=function outerFunction(){
var obj = document.getElementById("element");
obj.onclick=function innerFunction()
{
alert("Hi! I have avoided the leak");
// Some logic here
};
obj.bigString=new Array(1000).join(new Array(2000).join("XXXXX"));
obj = null; //This breaks the circular reference
};
</script>
<button id="element">"Click Here"</button>
</body>
</html>

Listing 7. Add another closure
<html>
<body>
<script type="text/javascript">
document.write("Avoiding a memory leak by adding another closure");
window.onload=function outerFunction(){
var anotherObj = function innerFunction()
{
// Some logic here
alert("Hi! I have avoided the leak");
};
(function anotherInnerFunction(){
var obj = document.getElementById("element");
obj.onclick=anotherObj })();
};
</script>
<button id="element">"Click Here"</button>
</body>
</html>

Listing 8. Avoid the closure altogether
<html>
<head>
<script type="text/javascript">
document.write("Avoid leaks by avoiding closures!");
window.onload=function()
{
var obj = document.getElementById("element");
obj.onclick = doesNotLeak;
}
function doesNotLeak()
{
//Your Logic here
alert("Hi! I have avoided the leak");
}

</script>
</head>
<body>
<button id="element">"Click Here"</button>
</body>
</html>


Mozilla developer center - A re-introduction to JavaScript ( Simon Willison )
Browser hosts need to manage a large number of objects representing the HTML page being presented - the objects of the DOM. It is up to the browser to manage the allocation and recovery of these.

Internet Explorer uses its own garbage collection scheme for this, separate from the mechanism used by JavaScript. It is the interaction between the two that can cause memory leaks.

A memory leak in IE occurs any time a circular reference is formed between a JavaScript object and a native object. Consider the following:

function leakMemory() {
var el = document.getElementById('el');
var o = { 'el': el };
el.o = o;
}
The circular reference formed above creates a memory leak; IE will not free the memory used by el and o until the browser is completely restarted.
Closures make it easy to create a memory leak without meaning to. Consider this:

function addHandler() {
var el = document.getElementById('el');
el.onclick = function() {
this.style.backgroundColor = 'red';
}
}

The above code sets up the element to turn red when it is clicked. It also creates a memory leak. Why? Because the reference to el is inadvertently caught in the closure created for the anonymous inner function. This creates a circular reference between a JavaScript object (the function) and a native object (el).

There are a number of workarounds for this problem. The simplest is this:

function addHandler() {
var el = document.getElementById('el');
el.onclick = function() {
this.style.backgroundColor = 'red';
}
delete el;
}

Surprisingly, one trick for breaking circular references introduced by a closure is to add another closure:

function addHandler() {
var clickHandler = function() {
this.style.backgroundColor = 'red';
}
(function() {
var el = document.getElementById('el');
el.onclick = clickHandler;
})();
}

The inner function is executed straight away, and hides its contents from the closure created with clickHandler.

Fixing Leaks - Drip IE Leak Detector

function loadMyPage() {
var elem = document.getElementById('myelement');
elem.onclick = function () {
window.alert('hi!');
};
}

To solve this, you could add:
elem = null;

Or, you could refactor the code somewhat:

function onMyElemClick() {
window.alert('hi!');
}
function loadMyPage() {
var elem = document.getElementById('myelement');
elem.onclick = onMyElemClick;
}

Understanding and Solving Internet Explorer Leak Patterns
Leak Patterns
1. Circular References—When mutual references are counted between Internet Explorer's COM infrastructure and any scripting engine, objects can leak memory. This is the broadest pattern.
2. Closures—Closures are a specific form of circular reference that pose the largest pattern to existing Web application architectures. Closures are easy to spot because they rely on a specific language keyword and can be searched for generically.
3. Cross-Page Leaks—Cross-page leaks are often very small leaks of internal book-keeping objects as you move from site to site. We'll examine the DOM Insertion Order issue, along with a workaround that shows how small changes to your code can prevent the creation of these book-keeping objects.
4. Pseudo-Leaks—These aren't really leaks, but can be extremely annoying if you don't understand where your memory is going. We'll examine the script element rewriting and how it appears to leak quite a bit of memory, when it is really performing as required.

Circular References
Circular references are the root of nearly every leak. Normally, script engines handle circular references through their garbage collectors, but certain unknowns can prevent their heuristics from working properly. The unknown in the case of IE would be the status of any DOM elements that a portion of script has access to. The basic principle would be as follows:
Figure 1 Basic Circular Reference Pattern

CodeProject: Memory Leakage in Internet Explorer - revisited. Free source code and programming help
Fabulous Adventures In Coding : How Do The Script Garbage Collectors Work?
Interestingly enough though, JScript and VBScript have completely different garbage collectors. Occasionally people ask me how the garbage collectors work and what the differences are.

JScript uses a nongenerational mark-and-sweep garbage collector. It works like this:

* Every variable which is "in scope" is called a "scavenger". A scavenger may refer to a number, an object, a string, whatever. We maintain a list of scavengers -- variables are moved on to the scav list when they come into scope and off the scav list when they go out of scope.
* Every now and then the garbage collector runs. First it puts a "mark" on every object, variable, string, etc – all the memory tracked by the GC. (JScript uses the VARIANT data structure internally and there are plenty of extra unused bits in that structure, so we just set one of them.)
* Second, it clears the mark on the scavengers and the transitive closure of scavenger references. So if a scavenger object references a nonscavenger object then we clear the bits on the nonscavenger, and on everything that it refers to. (I am using the word "closure" in a different sense than in my earlier post.)
* At this point we know that all the memory still marked is allocated memory which cannot be reached by any path from any in-scope variable. All of those objects are instructed to tear themselves down, which destroys any circular references.
Actually it is a little more complex than that, as we must worry about details like "what if freeing an item causes a message loop to run, which handles an event, which calls back into the script, which runs code, which triggers another garbage collection?" But those are just implementation details. (Incidentally, every JScript engine running on the same thread shares a GC, which complicates the story even further...)

You'll note that I hand-waved a bit there when I said "every now and then..." Actually what we do is keep track of the number of strings, objects and array slots allocated. We check the current tallies at the beginning of each statement, and when the numbers exceed certain thresholds we trigger a collection.

The benefits of this approach are numerous, but the principle benefit is that circular references are not leaked unless the circular reference involves an object not owned by JScript.

However, there are some down sides as well. Performance is potentially not good on large-working-set applications -- if you have an app where there are lots of long-term things in memory and lots of short-term objects being created and destroyed then the GC will run often and will have to walk the same network of long-term objects over and over again. That's not fast.

The opposite problem is that perhaps a GC will not run when you want one to. If you say "blah = null" then the memory owned by blah will not be released until the GC releases it. If blah is the sole remaining reference to a huge array or network of objects, you might want it to go away as soon as possible. Now, you can force the JScript garbage collector to run with the CollectGarbage() method, but I don't recommend it. The whole point of JScript having a GC is that you don't need to worry about object lifetime. If you do worry about it then you're probably using the wrong tool for the job!

沒有留言: