It’s been a long time since I felt such satisfaction debugging something, so I decided to write about it.
Let’s assume that you need to store (cache) a large object tree in memory during some operations. In practice this happens because of regulatory constraints, so you end up having to parse a very large file and keep the resulting object tree around. In effect you have a single-entry cache: you parse your object tree and store it in memory for searching and processing while the current object tree is in use.
<pre lang="java">
public ObjectHandler getObjectHandler(Long id) throws Exception {
    if (cachedObjectHandler != null) {
        if (cachedObjectHandler.getId().equals(id)) {
            return cachedObjectHandler;
        }
    }
    // else
    cachedObjectHandler = parse(...);
    return cachedObjectHandler;
}
</pre>
The code above is a simplified way to do it, no? Please note that the parse(…) function creates the object tree by parsing a stream, allocating a new object tree in the process. In my particular case the object tree held a maximum of about 120k objects (~150 MB) and involved a lot of large XML parsing using StAX.
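For context, a StAX-based parse function walks the stream event by event rather than loading a DOM. Here is a minimal, self-contained sketch of that style; the method name and the idea of just collecting element names are my own illustration, not the author's actual parse(…) code:

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class StaxSketch {
    // Walks the XML stream with StAX and collects every element name.
    // A real parse(...) would build the object tree here instead.
    static List<String> elementNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                names.add(reader.getLocalName());
            }
        }
        reader.close();
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<tree><node/><node/></tree>"));
    }
}
```

Because StAX is a pull parser, memory pressure comes from the object tree you build, not from the parser itself, which is exactly why the tree's lifetime matters below.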
So what is wrong with the code above? Take a look at what a single change can do:
<pre lang="java">
public ObjectHandler getObjectHandler(Long id) throws Exception {
    if (cachedObjectHandler != null) {
        if (cachedObjectHandler.getId().equals(id)) {
            return cachedObjectHandler;
        }
    }
    // else
    cachedObjectHandler = null;
    cachedObjectHandler = parse(...);
    return cachedObjectHandler;
}
</pre>
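To make the pattern concrete, here is a runnable sketch of the whole single-entry cache. The ObjectHandler shape and the stand-in parse implementation are my assumptions for illustration; only the caching and null-before-reparse logic comes from the article:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CacheSketch {
    // Counts how many times parse(...) runs, to show the cache hit path works.
    static final AtomicInteger parseCount = new AtomicInteger();

    static final class ObjectHandler {
        final Long id;
        ObjectHandler(Long id) { this.id = id; }
        Long getId() { return id; }
    }

    private static ObjectHandler cachedObjectHandler;

    // Stand-in for the real StAX parse(...); here it just allocates a handler.
    static ObjectHandler parse(Long id) {
        parseCount.incrementAndGet();
        return new ObjectHandler(id);
    }

    static ObjectHandler getObjectHandler(Long id) {
        if (cachedObjectHandler != null
                && cachedObjectHandler.getId().equals(id)) {
            return cachedObjectHandler; // cache hit, no allocation
        }
        // Drop the old tree before allocating the new one, so the GC may
        // reclaim it while parse(...) is still running if memory gets tight.
        cachedObjectHandler = null;
        cachedObjectHandler = parse(id);
        return cachedObjectHandler;
    }

    public static void main(String[] args) {
        ObjectHandler a = getObjectHandler(1L);
        ObjectHandler b = getObjectHandler(1L); // hit: same instance back
        getObjectHandler(2L);                   // miss: re-parse
        System.out.println(a == b);             // true
        System.out.println(parseCount.get());   // 2
    }
}
```

The cache behavior is identical with or without the null assignment; only the peak memory during a miss changes.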
Did we just halve the peak memory requirement? In the first version, Java evaluates the right-hand side of the assignment first: parse allocates the entire new object tree, and only once it finishes is the result assigned to cachedObjectHandler, which is what finally makes the old tree eligible for garbage collection. So during parsing both trees are reachable at the same time. With the null assignment, the old tree becomes unreachable before parsing starts, so the GC can reclaim it while the new allocation takes place if memory is needed.
As I said, a small change with a big smile.