optimization – Most efficient way to increment a Map value in Java

Some test results

I've gotten a lot of good answers to this question (thanks, folks), so I decided to run some tests and figure out which method is actually fastest. The five methods I tested are these:

  • the ContainsKey method that I presented in the question
  • the TestForNull method suggested by Aleksandar Dimitrov
  • the AtomicLong method suggested by Hank Gay
  • the Trove method suggested by jrudolph
  • the MutableInt method suggested by phax.myopenid.com

Method

Here's what I did:

  1. created five classes that were identical except for the differences shown below. Each class had to perform an operation typical of the scenario I presented: opening a 10MB file and reading it in, then performing a frequency count of all the word tokens in the file. Since this took an average of only 3 seconds, I had it perform the frequency count (not the I/O) 10 times.
  2. timed the loop of 10 iterations, but not the I/O operation, and recorded the total time taken (in clock seconds), essentially using Ian Darwin's method in the Java Cookbook.
  3. performed all five tests in series, and then did this another three times.
  4. averaged the four results for each method.
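
The steps above can be sketched as a small harness. This is only an illustration, not the code behind the numbers in this answer: the class name `FrequencyTimer`, the tiny in-memory token array (standing in for the tokens of the 10MB file), and the use of `System.nanoTime` are all assumptions made for the sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical harness: times only the counting loop, never the I/O.
class FrequencyTimer {

    // Runs the counting step `iterations` times and returns elapsed seconds.
    static double timeSeconds(String[] tokens, int iterations, Consumer<String[]> counter) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            counter.accept(tokens);
        }
        return (System.nanoTime() - start) / 1e9;
    }

    // One candidate under test: the null-check approach.
    static Map<String, Integer> countTestForNull(String[] tokens) {
        Map<String, Integer> freq = new HashMap<>();
        for (String word : tokens) {
            Integer count = freq.get(word);
            freq.put(word, count == null ? 1 : count + 1);
        }
        return freq;
    }

    // Arithmetic mean of the recorded runs.
    static double average(double[] runs) {
        double sum = 0;
        for (double r : runs) {
            sum += r;
        }
        return sum / runs.length;
    }

    public static void main(String[] args) {
        // Assumption: tokens are already loaded, so the file read is not timed.
        String[] tokens = "the quick brown fox jumps over the lazy dog the end".split(" ");
        double[] runs = new double[4];          // four passes, as in steps 3 and 4
        for (int r = 0; r < runs.length; r++) {
            runs[r] = timeSeconds(tokens, 10, FrequencyTimer::countTestForNull);
        }
        System.out.printf("average: %.6f s%n", average(runs));
    }
}
```

Swapping in another counting method only requires passing a different method reference to `timeSeconds`.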

Results

I'll present the results first and the code below for those who are interested.

The ContainsKey method was, as expected, the slowest, so I'll give the speed of each method in comparison to the speed of that method.

  • ContainsKey: 30.654 seconds (baseline)
  • AtomicLong: 29.780 seconds (1.03 times as fast)
  • TestForNull: 28.804 seconds (1.06 times as fast)
  • Trove: 26.313 seconds (1.16 times as fast)
  • MutableInt: 25.747 seconds (1.19 times as fast)
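
For anyone who wants to check the ratios, each one is simply the baseline time divided by the method's time. A minimal sketch of that arithmetic follows; the class and method names are made up for illustration, and the timings are copied from the list above.

```java
// Hypothetical helper that reproduces the "times as fast" figures.
class Speedup {

    // Values above 1 mean the method is faster than the baseline.
    static double relativeSpeed(double baselineSeconds, double methodSeconds) {
        return baselineSeconds / methodSeconds;
    }

    public static void main(String[] args) {
        double baseline = 30.654; // ContainsKey
        String[] names = {"AtomicLong", "TestForNull", "Trove", "MutableInt"};
        double[] seconds = {29.780, 28.804, 26.313, 25.747};
        for (int i = 0; i < names.length; i++) {
            System.out.printf("%s: %.2f times as fast%n",
                    names[i], relativeSpeed(baseline, seconds[i]));
        }
    }
}
```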

Conclusions

It would appear that only the MutableInt method and the Trove method are significantly faster, in that only they give a performance boost of more than 10%. However, if threading is an issue, AtomicLong might be more attractive than the others (I'm not really sure). I also ran TestForNull with final variables, but the difference was negligible.

Note that I haven't profiled memory usage in the different scenarios. I'd be happy to hear from anybody who has good insights into how the MutableInt and Trove methods would be likely to affect memory usage.

Personally, I find the MutableInt method the most attractive, since it doesn't require loading any third-party classes. So unless I discover problems with it, that's the way I'm most likely to go.

The code

Here is the crucial code from each method.

ContainsKey

import java.util.HashMap;
import java.util.Map;
...
Map<String, Integer> freq = new HashMap<String, Integer>();
...
int count = freq.containsKey(word) ? freq.get(word) : 0;
freq.put(word, count + 1);

TestForNull

import java.util.HashMap;
import java.util.Map;
...
Map<String, Integer> freq = new HashMap<String, Integer>();
...
Integer count = freq.get(word);
if (count == null) {
    freq.put(word, 1);
}
else {
    freq.put(word, count + 1);
}

AtomicLong

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;
...
final ConcurrentMap<String, AtomicLong> map = 
    new ConcurrentHashMap<String, AtomicLong>();
...
map.putIfAbsent(word, new AtomicLong(0));
map.get(word).incrementAndGet();
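
On the threading point: the AtomicLong variant is the only one of the five that is safe to call from several threads without extra locking. Here is a minimal sketch of that use, under some assumptions: it needs Java 8 or later, it swaps `putIfAbsent` plus `get` for `computeIfAbsent` (which avoids allocating a new `AtomicLong` on every call), and the `ConcurrentCounter` class and its method names are hypothetical. It was not part of the benchmark.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical wrapper around the AtomicLong approach for multi-threaded counting.
class ConcurrentCounter {
    private final ConcurrentMap<String, AtomicLong> counts = new ConcurrentHashMap<>();

    // Creates the counter on first sight of the word, then increments atomically.
    void increment(String word) {
        counts.computeIfAbsent(word, k -> new AtomicLong()).incrementAndGet();
    }

    // Returns 0 for words that were never counted.
    long get(String word) {
        AtomicLong c = counts.get(word);
        return c == null ? 0L : c.get();
    }

    // Starts `threads` workers that each increment `word` `perThread` times.
    static long countFromThreads(String word, int threads, int perThread) {
        ConcurrentCounter counter = new ConcurrentCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment(word);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join(); // wait so that every increment is visible below
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.get(word);
    }

    public static void main(String[] args) {
        System.out.println(countFromThreads("word", 4, 1000)); // prints 4000
    }
}
```

With a plain `HashMap` and either `Integer` or `MutableInt` values, the same test could lose updates unless the callers synchronize themselves.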

Trove

import gnu.trove.TObjectIntHashMap;
...
TObjectIntHashMap<String> freq = new TObjectIntHashMap<String>();
...
freq.adjustOrPutValue(word, 1, 1);

MutableInt

import java.util.HashMap;
import java.util.Map;
...
class MutableInt {
  int value = 1; // note that we start at 1 since we're counting
  public void increment () { ++value;      }
  public int  get ()       { return value; }
}
...
Map<String, MutableInt> freq = new HashMap<String, MutableInt>();
...
MutableInt count = freq.get(word);
if (count == null) {
    freq.put(word, new MutableInt());
}
else {
    count.increment();
}

Now there is a shorter way with Java 8 using Map::merge.

myMap.merge(key, 1, Integer::sum);

What it does:

  • if the key does not exist, it puts 1 as the value
  • otherwise, it adds 1 to the value already linked to the key

More information is available in the Javadoc for Map.merge.
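
To make the two cases concrete, here is a minimal, self-contained sketch of a word count built on `merge`. The class name `MergeCount` and the sample words are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class showing Map.merge used as a counter.
class MergeCount {

    static Map<String, Integer> count(String[] words) {
        Map<String, Integer> freq = new HashMap<>();
        for (String word : words) {
            // Absent key: stores 1. Present key: stores Integer.sum(oldValue, 1).
            freq.merge(word, 1, Integer::sum);
        }
        return freq;
    }

    public static void main(String[] args) {
        Map<String, Integer> freq = count(new String[]{"a", "b", "a"});
        System.out.println(freq.get("a")); // prints 2
        System.out.println(freq.get("b")); // prints 1
    }
}
```

Note that this still stores boxed `Integer` values, so it is shorter to write than the MutableInt version but not necessarily faster.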

A little research from 2016, with benchmark source code: https://github.com/leventov/java-word-count

Best results per method (smaller is better):

                 time, ms
kolobokeCompile  18.8
koloboke         19.8
trove            20.8
fastutil         22.7
mutableInt       24.3
atomicInteger    25.3
eclipse          26.9
hashMap          28.0
hppc             33.6
hppcRt           36.5
