Thirteen Great Ways to Increase Java Performance
1. Use buffered I/O.
Using unbuffered I/O causes a lot of system calls for methods like InputStream.read(). This is common in code that parses input, such as commands from the network or configuration data from the disk.
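As a minimal sketch of this tip, the class below wraps a stream in a BufferedInputStream before parsing it one byte at a time. The class name BufferedReadSketch, the method countLines, and the 8192-byte buffer size are illustrative assumptions, not anything from the original code. An in-memory stream stands in for the file or socket that would normally be the expensive source.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical example class; the name and buffer size are assumptions.
final class BufferedReadSketch {
    /** Counts newline bytes, calling read() once per byte through a buffer. */
    static int countLines(InputStream raw) throws IOException {
        // The wrapper pulls data from the underlying stream in large chunks,
        // so most read() calls are served from memory.
        InputStream in = new BufferedInputStream(raw, 8192);
        int lines = 0;
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                lines++;
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for configuration data read from disk.
        byte[] config = "port=8080\nhost=example\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(countLines(new ByteArrayInputStream(config)));
    }
}
```

With a FileInputStream or a socket stream as the source, the same wrapper is what keeps each read() from reaching the operating system.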
2. Garbage collection is rarely a serious performance overhead. However, synchronization inside the Java virtual machine (JVM) caused by the new operation can cause lock contention in applications with lots of threads. Sometimes new can be avoided by reusing byte arrays, or by reusing objects that have some notion of a state-resetting method.
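One way to picture an object with a state-resetting method is the sketch below. The class ReusableBufferSketch and its methods are hypothetical names made up for illustration. The byte array is allocated once and then reused for every request instead of calling new each time.

```java
// Hypothetical example class; all names here are assumptions.
final class ReusableBufferSketch {
    private final byte[] data;
    private int length;

    ReusableBufferSketch(int capacity) {
        data = new byte[capacity]; // the only allocation
    }

    /** State-resetting method: makes the object ready for another use. */
    void reset() {
        length = 0;
    }

    void append(byte b) {
        data[length++] = b;
    }

    int length() {
        return length;
    }

    public static void main(String[] args) {
        ReusableBufferSketch buf = new ReusableBufferSketch(64);
        for (int request = 0; request < 3; request++) {
            buf.reset(); // reuse instead of new
            for (int i = 0; i <= request; i++) {
                buf.append((byte) i);
            }
            System.out.println(buf.length());
        }
    }
}
```

The trade-off is that a reused object must not be shared between threads without its own synchronization, so this fits best when each thread owns its buffer.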
3. This sounds silly, but I had heard that the overhead of invoking a native method was so high that small Java methods might be faster. Not! In my test case, I implemented System.arraycopy in plain Java and compared it against System.arraycopy using arrays of different sizes. The native (original) method was about an order of magnitude faster, depending on the array size. The native-method overhead may be high, but native methods are still fast compared to interpreting byte code. If you can use native methods in the JDK, then you can remain 100% pure and have a faster implementation than if you used interpreted methods to accomplish the same thing.
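The comparison described above can be sketched roughly as follows. This is not the original test case. The class ArrayCopySketch and its method names are assumptions, and the sketch only shows the two copies side by side and checks that they agree, without claiming any timing result.

```java
import java.util.Arrays;

// Hypothetical example class; not the author's original benchmark.
final class ArrayCopySketch {
    /** Plain-Java copy, one element per loop iteration. */
    static void copyInJava(int[] src, int[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
    }

    /** Same copy done by the JDK's native method. */
    static void copyNative(int[] src, int[] dst) {
        System.arraycopy(src, 0, dst, 0, src.length);
    }

    public static void main(String[] args) {
        int[] src = new int[1000];
        for (int i = 0; i < src.length; i++) {
            src[i] = i * 2;
        }
        int[] a = new int[src.length];
        int[] b = new int[src.length];
        copyInJava(src, a);
        copyNative(src, b);
        // Both approaches should produce the same contents.
        System.out.println(Arrays.equals(a, b));
    }
}
```

To measure the difference on a given VM, each method would need to be called many times inside a timing loop for several array sizes.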
4. String operations are fast.
Using x + y (where x and y are strings) is faster than doing a getBytes of the two and then creating a new String from the byte array. However, String operations can hide a lot of new operations.
5. InetAddress.getHostAddress() has a lot of new operations. It creates a lot of intermediate strings to return the host address. Avoid it, if possible.
6. java.util.Date has some performance problems, particularly with internationalization.
If you frequently print out the current time in some form other than its usual representation (a long holding milliseconds since the epoch), you may be able to cache your representation of the current time and then create a separate thread to update that representation every N seconds (N depends on how accurately you need to represent the current time). You could also delay converting the time until a client needs it and the cached representation is known to be stale.
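The second idea, converting lazily and only when the cached text is stale, might look something like this sketch. The class CachedTimeSketch, its maxAgeMillis parameter, and the conversions counter are all hypothetical names introduced for illustration. The current time is passed in as an argument so that the staleness check is easy to follow.

```java
import java.util.Date;

// Hypothetical example class; names and the staleness policy are assumptions.
final class CachedTimeSketch {
    private final long maxAgeMillis;
    private long cachedAt;
    private String cachedText;
    int conversions; // counts how often the expensive conversion ran

    CachedTimeSketch(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    /** Returns a printable time, reformatting only when the cache is stale. */
    synchronized String text(long nowMillis) {
        if (cachedText == null || nowMillis - cachedAt >= maxAgeMillis) {
            cachedText = new Date(nowMillis).toString(); // the expensive part
            cachedAt = nowMillis;
            conversions++;
        }
        return cachedText;
    }

    public static void main(String[] args) {
        CachedTimeSketch clock = new CachedTimeSketch(1000); // N = 1 second
        System.out.println(clock.text(System.currentTimeMillis()));
        System.out.println(clock.text(System.currentTimeMillis()));
    }
}
```

A background-thread variant would move the conversion into a loop that sleeps N seconds between updates, at the cost of one extra thread.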
7. If a String's length exceeds 16 characters, hashCode() samples only a portion of the String. So if the positions at which a set of Strings differ don't get sampled, you can see lots of colliding hash values. This can turn your hash tables into linked lists!
8. In most applications, good performance comes from getting the architecture right. Using the right data structures for the problem you're solving is a lot more important than tweaking String operations. Thread architecture is also important. (Try to avoid wait/notify operations; they can cause a lot of lock contention in some VMs.) And of course you should use caching for your most expensive operations.
9. I have mixed feelings about java.util.Hashtable. It's nice to get so much functionality for free, but it is heavily synchronized. For instance, get() is a synchronized method. This means that the entire table is locked even while the hashCode() of the target key is computed.
10. String.getBytes() takes about ten times as long as String.getBytes(int srcBegin, int srcEnd, byte dst[], int dstBegin). This is because the former does correct char-to-byte conversion, which involves a function call per character. The latter is deprecated, but you can run about 10% faster than it without any deprecated methods using the following code:
    {
        String str = "the dark brown frog jumps the green tree";
        // alloc the buffer outside loop so
        // all methods do one new per iteration...
        char buffer[] = new char[str.length()];
        for (int i = 0; i < 10000; i++)
        {
            int length = str.length();
            // copy the chars out in one call, then narrow each one
            str.getChars(0, length, buffer, 0);
            byte b[] = new byte[length];
            for (int j = 0; j < length; j++)
                b[j] = (byte) buffer[j];
        }
    }
This still does an incorrect char-to-byte conversion, though: the cast keeps only the low 8 bits of each char.
11. Synchronized method invocation takes about six times longer than non-synchronized invocation.
This cost hasn't been a problem in the Java Web Server, so we tend to break locks up into smaller locks to avoid lock contention. People often synchronize on an entire class for everything, even though the class contains variables that can be read and written concurrently without any loss of consistency. This calls for locking on the variables or creating dummy objects to serve as locks for the variables.
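Here is a minimal sketch of the dummy-object approach, assuming a class with two counters that never need to be consistent with each other. The class SplitLockSketch and its field names are hypothetical. Each counter gets its own lock object, so a thread updating one does not block a thread updating the other.

```java
// Hypothetical example class; the counters are assumptions for illustration.
final class SplitLockSketch {
    // Dummy objects that exist only to serve as locks.
    private final Object hitsLock = new Object();
    private final Object errorsLock = new Object();
    private long hits;
    private long errors;

    void recordHit() {
        synchronized (hitsLock) {
            hits++;
        }
    }

    void recordError() {
        synchronized (errorsLock) {
            errors++;
        }
    }

    long hits() {
        synchronized (hitsLock) {
            return hits;
        }
    }

    long errors() {
        synchronized (errorsLock) {
            return errors;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final SplitLockSketch stats = new SplitLockSketch();
        Thread hitter = new Thread(() -> {
            for (int i = 0; i < 10000; i++) stats.recordHit();
        });
        Thread failer = new Thread(() -> {
            for (int i = 0; i < 10000; i++) stats.recordError();
        });
        hitter.start();
        failer.start();
        hitter.join();
        failer.join();
        System.out.println(stats.hits() + " " + stats.errors());
    }
}
```

This only works when the variables really are independent. If an operation must see both counters in a consistent state, it needs a single lock covering both.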
12. A lot of people do something like the following:
debug("foobar: " + x + y + "afasdfasdf");
public static void debug(String s) {
// System.err.println(s);
}
Then they think that they've turned off debugging overhead. Nope! If there are enough debugging statements, you can see a lot of time spent in creating new strings to evaluate "foobar: " + x + y + "afasdfasdf", which is then tossed after calling debug.
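One common way around this, sketched here under the assumption that debugging is controlled by a compile-time flag, is to test the flag before the message is built. The class DebugGuardSketch, the DEBUG constant, and the messagesBuilt counter are hypothetical names. The counter is there only to make the hidden string construction visible.

```java
// Hypothetical example class; the flag and counter are assumptions.
final class DebugGuardSketch {
    static final boolean DEBUG = false;
    static int messagesBuilt = 0; // counts how often the string was created

    static void debug(String s) {
        if (DEBUG) {
            System.err.println(s);
        }
    }

    /** Stands in for the string concatenation at the call site. */
    static String describe(int x, int y) {
        messagesBuilt++;
        return "foobar: " + x + y + "afasdfasdf";
    }

    /** The message is built even though debug() then ignores it. */
    static void unguarded(int x, int y) {
        debug(describe(x, y));
    }

    /** The message is only built when debugging is on. */
    static void guarded(int x, int y) {
        if (DEBUG) {
            debug(describe(x, y));
        }
    }

    public static void main(String[] args) {
        unguarded(1, 2);
        guarded(1, 2);
        System.out.println(messagesBuilt);
    }
}
```

The guard costs one extra line per debugging statement, which is the price of skipping the string creation entirely.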
13. Profiles of the Java Web Server show it spending about 1-2% of its time running the garbage collector under most uses. So we rarely worry about performance of the garbage collector. The one thing you want to be careful about is response time. Running with a large heap size decreases the frequency of a garbage collection, but increases the hit taken when one occurs. The current VM pauses all your threads when a garbage collection occurs, so your users can see long pauses. Smaller heap sizes increase the frequency of garbage collections, but decrease their length.