|
The Java Specialists' Newsletter
Issue 024 2001-07-06
Category:
Performance
Java version: Self-tuning FIFO Queuesby Dr. Heinz M. Kabutz
Welcome to the 24th issue of "The Java(tm) Specialists'
Newsletter". Winter has arrived in Cape Town with great rain
and cold temperatures, much to the chagrin of the author. The
only way to combat such is to sit next to the heater with a glass
of South African 1997 Zonnebloem Cabernet Sauvignon and write
Java newsletters. This newsletter has taken the most time to
date - performance tests tend to have that effect on me :(((
Before I go on today's voyage, some feedback regarding the Socket
Wheel idea:
I'm not quite sure who gave the best contributions regarding
possible improvements for last newsletter's SocketWheel. Neil
Bartlett (Algorithmics Canada) and John Vlissides (IBM USA)
pointed out that the design was very close to the Reactor pattern
by Schmidt. This is a great plus mark for patterns; they truly
are things that OO developers around the world would come up with
independently. Josh Rehman pointed out that it is not too clever
to try minimze memory as that is very cheap nowadays, which I
agree with. He also pointed out that JDK 1.4 has non-blocking IO
making it possibly a lot easier to implement such a SocketWheel.
Ecce Jezuch (Poland) suggested using the available() method to
find out how many bytes would be available without blocking, but
unfortunately under Windows the JDK always returned 0.
James Pereira provided some excellent information regarding
sockets. It's quite technical, so I'm including it verbatim:
"Registered Ports, ports between 1024 and 49151, are listed by
the IANA and on most systems can be used by applications or
programs executed by users. Table C.2 specifies the port used by
the server process as its contact port. The IANA registers uses
of these ports as a convenience to the Internet community. To
the extent possible, these same port assignments are used with
UDP. The Registered Ports are in the numerical range of
1024-49151. The Registered Ports between 1024 and 5000 are also
referred to as the Ephemeral Ports. At least on Windows , The
TCP/Stack (OS) re-uses these ports internally on every socket
connection cycling from 1024...5000 wrapping around to 1024
again. This could lead to some interesting problems if sockets
are opened and close very quickly as there is usually a time
delay before that port is made available again...
"Second, the number of user-accessible ephemeral ports that can
be used to source outbound connections is configurable with the
MaxUserPort registry entry
(HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters). By
default, when an application requests any socket from the system
to use for an outbound call, a port between the values of 1024
and 5000 is supplied. You can use the MaxUserPort registry entry
to set the value of the highest port number to be used for
outbound connections. For example, setting this value to 10000
would make approximately 9000 user ports available for outbound
connections. For more details, see RFC 793. See also the
MaxFreeTcbs and MaxHashTableSize registry settings
(HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters).
"BTW: The stack and memory size issue is only an NT issue. On NT
the stack size is configurabale via linker and EditBin.exe
unfortunately we don't want to mess with the VM."
This information should solve the problem that I encountered with
too many sockets causing exceptions.
Ideally we should be able to specify stack size per thread as
some threads will need a lot and others only a little, so I still
think that we have a problem with many threads and their stack
size.
Another excellent contribution was made by Craig Larman, author
of "Java 2 Performance and Idiom Guide", who suggested using an
approach of multiplexing sockets described in his book to
minimize the number of sockets needed per client connection.
Next week I will try and write about this idea and give you a
solution that hopefully will work acceptably. I always recommend
Craig's book as an excellent book for the Java fundi who wants to
have his mind activated regarding performance ideas. It is very
difficult to write anything about performance as it is so
dependent on your implementation and hardware. This leads me to
this week's newsletters on self-tuning FIFO queues....
Self-tuning FIFO Queues
HotSpot(tm) has caused a lot of trouble for the Java Specialist
in the field, especially those of us with a few years experience.
All of a sudden, all the hard-earned performance knowledge was
wiped out in one foul swoop. The only thing left for us to do
was to write the simplest code that could possibly work and let
the HotStop compiler sort out performance for us. I liken it to
the great financial crisis of Japan in the 90's, where no one
knew whether he was coming or going and all the old certainties
went out of the window. Luckily, unlike in Japan, we are not
taking this too seriously, so our windows are still closed.
Fortunately everyone knows that Java is slow ;-)
A lot of the performance tricks we used in old code actually make
our code slower under HotSpot. Since we don't know what the
performance of our code will be for a specific platform, would it
be completely hairbrained to write self-tuning code? Write 3
algorithms, let the program measure on the target platform which
performs best, and choose that algorithm for the duration of the
VM.
To illustrate this idea, I want to write a FIFO queue that is
based on a java.util.List implementation. A while ago I
discovered that java.util.ArrayList is sometimes faster than
java.util.LinkedList for FIFO queue implementations. The switch
over occurs at a specific point in time, which we can measure
beforehand. The switch-over point is dependent on the VM,
whether we are using HotSpot(tm), etc. For example, on my little
notebook with 256MB RAM and Pentium III 700, the cross-over point
is 50 elements in the list when I use HotSpot, but 500 elements
when I switch off hotspot compiling (sometimes!).
The interface that the FIFO queues will implement is very simple:
public interface FIFO {
/** Add an object to the end of the FIFO queue */
boolean add(Object o);
/** Remove an object from the front of the FIFO queue */
Object remove();
/** Return the number of elements in the FIFO queue */
int size();
}
We implement this interface and extend ArrayList and LinkedList:
// FIFOArrayList.java
import java.util.ArrayList;
public class FIFOArrayList extends ArrayList implements FIFO {
public Object remove() {
return remove(0);
}
}
// FIFOLinkedList.java
import java.util.LinkedList;
public class FIFOLinkedList extends LinkedList implements FIFO {
public Object remove() {
return remove(0);
}
}
We also write a SwappingOverFIFOQueue which has values for HIGH
and LOW water marks. When we reach a HIGH water mark and we are
busy using an ArrayList, we start using a LinkedList. On the
contrary, if we reach a LOW water mark and we are busy using a
LinkedList we start using an ArrayList.
In foresight to my next example, I have made it possible to set
the watermarks, which also checks the optimal list types for all
the lists currently in the system. We have to be careful to not
get a memory leak by keeping handles to the instances of the
SwappingOverFIFOQueue so we use WeakReferences to hold the
references. Have a look at newsletter 15 for a discussion on
Weak / Soft References.
// SwappingOverFIFOQueue.java
import java.util.*;
import java.lang.ref.WeakReference;
public final class SwappingOverFIFOQueue implements FIFO {
/** The low value after which we switch over to ArrayList */
private static int LOW = 30;
/** The high value after which we switch down to LinkedList */
private static int HIGH = 70;
/** This list contains weak references of instances of this
class */
private static List instances = new LinkedList();
/** We add the weak references in an initializer block */
{
instances.add(new WeakReference(this));
}
/** When we set the low and high water marks we go through all
the existing instances and check for the optimal list type.
If the references is null we remove the WeakReference from
our instance list. */
public static void setWaterMarks(int low, int high) {
LOW = low;
HIGH = high;
Iterator it = instances.iterator();
while(it.hasNext()) {
WeakReference ref = (WeakReference)it.next();
SwappingOverFIFOQueue q = (SwappingOverFIFOQueue)ref.get();
if (q == null) {
it.remove();
} else {
q.checkOptimalListType();
}
}
}
private List list = new ArrayList();
public Class getListType() { return list.getClass(); }
private int size = 0;
public boolean add(Object o) {
try {
list.add(o);
return true;
} finally {
if (++size == HIGH) checkOptimalListType();
}
}
public Object remove() {
try {
return list.remove(0);
} finally {
if (--size == LOW) checkOptimalListType();
}
}
public int size() {
return size;
}
private void checkOptimalListType() {
if (size >= HIGH && (!(list instanceof LinkedList))) {
list = new LinkedList(list);
} else if (size <= LOW && (!(list instanceof ArrayList))) {
list = new ArrayList(list);
}
}
}
My test program takes the number of entries in the queue and then
illustrates how often we can add/remove in 2 seconds for each of
the queues. I found that you get the best performance results
when you run your tests for about 2 seconds each, so I count
iterations rather than milliseconds.
// SwappingOverFIFOQueueTest.java
import java.util.*;
public class SwappingOverFIFOQueueTest {
private static int ENTRIES;
public static void test(FIFO queue) {
for (int i=0; i<ENTRIES; i++) {
queue.add(new Object());
}
long up_to = System.currentTimeMillis() + 2000;
int iterations = 0;
while(System.currentTimeMillis() <= up_to) {
queue.add(new Object());
queue.remove();
iterations++;
}
System.out.println(queue.getClass());
System.out.println("\t" + iterations + " iterations");
}
public static void main(String[] args) {
if (args.length != 1) {
System.out.println(
"Usage: java SwappingOverFIFOQueueTest entries");
System.exit(1);
}
ENTRIES = Integer.parseInt(args[0]);
System.out.println("Entries = " + ENTRIES);
test(new FIFOArrayList());
test(new FIFOLinkedList());
SwappingOverFIFOQueue q = new SwappingOverFIFOQueue();
test(q);
System.out.println("Current queue implementation " +
q.getListType().getName());
}
}
On my notebook, when I run this program with 0 entries, I get the
following output:
Entries = 0
class FIFOArrayList
4552883 iterations
class FIFOLinkedList
2551017 iterations
class SwappingOverFIFOQueue
3594810 iterations
With 100 entries I get:
Entries = 100
class FIFOArrayList
1800877 iterations
class FIFOLinkedList
2509328 iterations
class SwappingOverFIFOQueue
2158451 iterations
And with 10000 entries I get:
Entries = 10000
class FIFOArrayList
49500 iterations
class FIFOLinkedList
812933 iterations
class SwappingOverFIFOQueue
758657 iterations
We can thus see that the SwappingFIFOQueue is always faster than
the worst case and slower than the best case, as one would
logically expect. However, I chose the HIGH and LOW values from
some tests that I made on my notebook, for that specific JVM. If
I take the JDK 1.2.2 that comes with JBuilder, for 100 entries I
get:
Entries = 100
class FIFOArrayList
1434122 iterations
class FIFOLinkedList
1307108 iterations
class SwappingOverFIFOQueue
1178115 iterations
Or if I use the -Xint mode for JDK 1.3 under Windows to switch
off the hotspot compiler, for 100 entries I get
Entries = 100
class FIFOArrayList
497550 iterations
class FIFOLinkedList
480599 iterations
class SwappingOverFIFOQueue
392314 iterations
In both cases, the values of the SwappingOverFIFOQueue were worse
than for both the ArrayList and the LinkedList.
We therefore write a Profiler that finds ideal HIGH/LOW water
marks for the JVM that is running on your system and sets up the
SwappingOver water marks.
// SwappingOverFIFOQueueProfiler.java
import java.util.*;
/*
For the sake of brevity we only consider two implementations
of java.util.List, namely java.util.ArrayList and
java.util.LinkedList. */
public class SwappingOverFIFOQueueProfiler {
private static boolean isArrayListFaster(int entries) {
System.out.println("isArrayListFaster(" + entries + ")");
return countIterations(new ArrayList(), entries) >
countIterations(new LinkedList(), entries);
}
private static int countIterations(List list, int entries) {
for (int i=0; i<entries; i++) {
list.add(new Object());
}
long end = System.currentTimeMillis() + 1000;
int iterations = 0;
while(System.currentTimeMillis() <= end) {
iterations++;
list.add(new Object());
list.remove(0);
}
return iterations;
}
static {
int checks = 0;
int watermark = 1;
int bestWatermark = 0;
for (int i=0; i<16; i++) {
if (isArrayListFaster(watermark)) {
bestWatermark = Math.max(watermark, bestWatermark);
watermark *= 2.0;
} else {
watermark *= 0.75;
if (watermark <= bestWatermark)
watermark *= 1.25;
}
}
System.out.println("Best watermark = " + bestWatermark);
int low = (int)(bestWatermark * 0.75);
int high = (int)(bestWatermark * 1.25);
System.out.println("Setting LOW to " + low +
" and HIGH to " + high);
SwappingOverFIFOQueue.setWaterMarks(low, high);
}
public static void main(String[] args) {
SwappingOverFIFOQueueTest.main(new String[] { "0" });
SwappingOverFIFOQueueTest.main(new String[] { "100" });
SwappingOverFIFOQueueTest.main(new String[] { "10000" });
}
}
If we load this class in our system then it will do measurements
of where the swap-over between ArrayList and LinkedList
performance occurs. On my computer, with JDK 1.3 and HotSpot,
the swap-over was measured to happen at about 32 entries in the
list. When I switch off the HotSpot, it occurs at about 121
entries, and under JDK 1.2.2 it happens at about 303.
After spending about 10 hours on this stupid newsletter, I have
to conclude that it would be better to stick to a LinkedList for
a FIFO queue as it is a better "average" performer. Perhaps the
lesson I've learnt from this newsletter is that we must be
careful of writing code which is too complicated as it tends to
be more difficult to optimize. As performance guru Craig Larman
pointed out though, we must be sure to not ignore performance
altogether; our customers might just kill the project if the
prototypes perform like dogs.
I always appreciate any feedback, both positive and negative, so
please keep sending your ideas and suggestions. Please also
remember to take the time to send this newsletter to others who
are interested in Java.
Heinz
Performance Articles
Related Java Course
Discuss at The Java Specialist Club
|