It’s been a long time since I felt such satisfaction debugging something, so I decided to write about it.
Let’s assume that you need to store (cache) a large object tree in memory during some operations. In practice this happens because of regulatory constraints, so you end up having to parse a very large file and keep the resulting object tree around. In effect you have a single-entry cache: you parse your object tree and store it in memory for searching and processing while the current object tree is in use.
<pre lang="java">
public ObjectHandler getObjectHandler(Long id) throws Exception {
    if (cachedObjectHandler != null) {
        if (cachedObjectHandler.getId().equals(id)) {
            return cachedObjectHandler;
        }
    }
    // else
    cachedObjectHandler = parse(...);
    return cachedObjectHandler;
}
</pre>
The code above is a simplified way to do it, no? Please note that the parse(…) function creates the object tree by parsing a stream, allocating a new object tree in the process. In my particular case the object tree held a maximum of about 120k objects (~150 MB) and involved a lot of large XML parsing using StAX.
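For context, a StAX-based parse function walks the stream event by event rather than loading a DOM. Here is a minimal, self-contained sketch of that style; the method name and the idea of just collecting element names are my own illustration, not the author's actual parse(…) code:

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class StaxSketch {
    // Walks the XML stream with StAX and collects every element name.
    // A real parse(...) would build the object tree here instead.
    static List<String> elementNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                names.add(reader.getLocalName());
            }
        }
        reader.close();
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<tree><node/><node/></tree>"));
    }
}
```

Because StAX is a pull parser, memory pressure comes from the object tree you build, not from the parser itself, which is exactly why the tree's lifetime matters below.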
So what is wrong with the code above? Take a look at what a single change can do:
<pre lang="java">
public ObjectHandler getObjectHandler(Long id) throws Exception {
    if (cachedObjectHandler != null) {
        if (cachedObjectHandler.getId().equals(id)) {
            return cachedObjectHandler;
        }
    }
    // else
    cachedObjectHandler = null;
    cachedObjectHandler = parse(...);
    return cachedObjectHandler;
}
</pre>
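To make the pattern concrete, here is a runnable sketch of the whole single-entry cache. The ObjectHandler shape and the stand-in parse implementation are my assumptions for illustration; only the caching and null-before-reparse logic comes from the article:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CacheSketch {
    // Counts how many times parse(...) runs, to show the cache hit path works.
    static final AtomicInteger parseCount = new AtomicInteger();

    static final class ObjectHandler {
        final Long id;
        ObjectHandler(Long id) { this.id = id; }
        Long getId() { return id; }
    }

    private static ObjectHandler cachedObjectHandler;

    // Stand-in for the real StAX parse(...); here it just allocates a handler.
    static ObjectHandler parse(Long id) {
        parseCount.incrementAndGet();
        return new ObjectHandler(id);
    }

    static ObjectHandler getObjectHandler(Long id) {
        if (cachedObjectHandler != null
                && cachedObjectHandler.getId().equals(id)) {
            return cachedObjectHandler; // cache hit, no allocation
        }
        // Drop the old tree before allocating the new one, so the GC may
        // reclaim it while parse(...) is still running if memory gets tight.
        cachedObjectHandler = null;
        cachedObjectHandler = parse(id);
        return cachedObjectHandler;
    }

    public static void main(String[] args) {
        ObjectHandler a = getObjectHandler(1L);
        ObjectHandler b = getObjectHandler(1L); // hit: same instance back
        getObjectHandler(2L);                   // miss: re-parse
        System.out.println(a == b);             // true
        System.out.println(parseCount.get());   // 2
    }
}
```

The cache behavior is identical with or without the null assignment; only the peak memory during a miss changes.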
Did we just halve the peak memory requirement? In the first version, Java evaluates the right-hand side of the assignment first: parse allocates the entire new object tree, and only once it finishes is the result assigned to cachedObjectHandler, which is what finally makes the old tree eligible for garbage collection. So during parsing both trees are reachable at the same time. With the null assignment, the old tree becomes unreachable before parsing starts, so the GC can reclaim it while the new allocation takes place if memory is needed.
As I said, a small change with a big smile.