The Java Specialists' Newsletter
Issue 088 2004-05-19
Category: Performance
Java version: Sun JDK 1.4.2_04
Resetting ObjectOutputStream
by Dr. Heinz M. Kabutz
Welcome to the 88th edition of The Java(tm) Specialists' Newsletter. Our readership has
increased to 100 countries, with the recent
addition of Botswana. A special welcome to my neighbouring
country :-) Please remember to forward these newsletters to
friends and colleagues who would be interested in joining.
The bigger the readership, the more pressure I will be under to
write new, good, newsletters :-)
There comes a time in any company, when it becomes important to
know what its "vision" and "mission" are. A vision and a
mission are supposed to help staff be more focused and provide
better, more consistent service to customers. After some
thought, we came up with the following Vision for
Maximum Solutions - The Java(tm) Specialists (not set in concrete yet):
Maximum Solutions develops and provides the best
training in the world for professional Java programmers.
There is a saying that goes round: "Those who cannot do,
teach." The impression is that trainers are usually teaching
others because they themselves are not good enough to be in
the real world. The saying is perhaps a bit unfair, so I
am determined to change this perception. A part of our
Mission is that in order to stay
"the best" at training, we make sure that all our trainers
are active in developing real software.
Resetting ObjectOutputStream
A class with many mysteries is java.io.ObjectOutputStream.
For instance, when and why should you reset the stream?
Let's look at an example. First we have class Person, which
is the class that we want to send over the network:
public class Person implements java.io.Serializable {
  private final String firstName;
  private final String surname;
  private int age;

  public Person(String firstName, String surname, int age) {
    this.firstName = firstName;
    this.surname = surname;
    this.age = age;
  }

  public String toString() {
    return firstName + " " + surname + ", " + age;
  }

  public void setAge(int age) {
    this.age = age;
  }
}
Next we have the Receiver, which reads lots of Person objects,
and the Sender, which writes them:
import java.net.*;
import java.io.*;

public class Receiver {
  public static void main(String[] args) throws Exception {
    ServerSocket ss = new ServerSocket(7000);
    Socket socket = ss.accept();
    ObjectInputStream ois = new ObjectInputStream(
      socket.getInputStream());
    int count = 0;
    while (true) {
      Person p = (Person) ois.readObject();
      if (count++ % 1000 == 0) {
        System.out.println(p);
      }
    }
  }
}

import java.net.Socket;
import java.io.*;

public class Sender {
  public static void main(String[] args) throws IOException {
    long start = System.currentTimeMillis();
    Socket s = new Socket("localhost", 7000);
    ObjectOutputStream oos = new ObjectOutputStream(
      s.getOutputStream());
    Person p = new Person("Heinz", "Kabutz", 0);
    for (int age = 0; age < 1500 * 1000; age++) {
      p.setAge(age);
      oos.writeObject(p);
    }
    long end = System.currentTimeMillis();
    System.out.println("That took " + (end - start) + "ms");
  }
}
The output was:
java Receiver:
*snip*
Heinz Kabutz, 0
Heinz Kabutz, 0
Heinz Kabutz, 0
Heinz Kabutz, 0
Heinz Kabutz, 0
Heinz Kabutz, 0
java Sender:
That took 19548ms
When we run this, we will see lots of Person objects on the
Receiver side, but all the age values will be 0, even though
we changed the age on the Sender side. Why is this?
When you construct an ObjectOutputStream and an ObjectInputStream,
they each contain a cache of objects that have already
been sent across this stream. The cache relies on object
identity rather than on equals() and hashCode(), so it is
more similar to a java.util.IdentityHashMap than to a normal java.util.HashMap.
So, if you resend the same
object, only a pointer (a handle to the earlier copy) is sent across the
network. This is very clever, and saves network bandwidth.
However, the ObjectOutputStream cannot detect whether your
object was changed internally, resulting in the Receiver
just seeing the same object over and over again. You will
notice that this was quite fast. We sent 1'500'000 objects
in 19548ms (on my machine). Well, strictly speaking, we only sent one object
and 1'499'999 pointers to that object.
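The stale-age effect can be reproduced without a network. The following is a minimal sketch, assuming an in-memory ByteArrayOutputStream in place of the socket and a cut-down Person class; the class name IdentityCacheDemo and the helper roundTripAges are made up for this illustration and use try-with-resources, so it needs a newer JDK than the one used for the measurements in this newsletter.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class IdentityCacheDemo {
  // Cut-down stand-in for the newsletter's Person class (illustrative only).
  static class Person implements Serializable {
    private final String name;
    private int age;
    Person(String name, int age) { this.name = name; this.age = age; }
    void setAge(int age) { this.age = age; }
    int getAge() { return age; }
  }

  // Writes the same Person twice, changing the age in between,
  // and returns the two ages seen by the reading side.
  static int[] roundTripAges(boolean resetBetweenWrites)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
      Person p = new Person("Heinz", 0);
      oos.writeObject(p);            // full object is written
      p.setAge(42);                  // the stream cannot see this change
      if (resetBetweenWrites) oos.reset();
      oos.writeObject(p);            // handle only, unless we reset first
    }
    try (ObjectInputStream ois = new ObjectInputStream(
        new ByteArrayInputStream(bytes.toByteArray()))) {
      Person first = (Person) ois.readObject();
      Person second = (Person) ois.readObject();
      return new int[] { first.getAge(), second.getAge() };
    }
  }

  public static void main(String[] args) throws Exception {
    int[] stale = roundTripAges(false);
    int[] fresh = roundTripAges(true);
    System.out.println("without reset: " + stale[0] + ", " + stale[1]); // 0, 0
    System.out.println("with reset:    " + fresh[0] + ", " + fresh[1]); // 0, 42
  }
}
```

Without the reset, the second writeObject() only emits a handle, so the reader gets the object it already has, still with age 0.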
There seemed to be some problem with sending the same Person
object many times, especially if the contents of that Person
changed. Due to the optimisation in ObjectOutputStream,
only the pointer to the Person would be sent each time.
So, what would happen if we simply sent a new Person each
time? Let's try it out...
import java.net.Socket;
import java.io.*;

public class Sender2 {
  public static void main(String[] args) throws IOException {
    long start = System.currentTimeMillis();
    Socket s = new Socket("localhost", 7000);
    ObjectOutputStream oos = new ObjectOutputStream(
      s.getOutputStream());
    for (int age = 0; age < 1500 * 1000; age++) {
      oos.writeObject(new Person("Heinz", "Kabutz", age));
    }
    long end = System.currentTimeMillis();
    System.out.println("That took " + (end - start) + "ms");
  }
}
This seems to run fine for a while, until all of a sudden we
see an OutOfMemoryError on both the Receiver and Sender2.
Someone once challenged me regarding the pathetic speed of Java.
They claimed that Java was so slow that the Garbage Collector
could not even keep up with objects that were being read over
the network. It sounded strange to me that Java should run
out of memory, so after some questioning, we traced the
problem to the object cache growing in the Receiver and never
being cleared. Since the Person objects are always distinct,
they are put into the cache on both sides of the
stream. The Receiver's side cannot clear entries
from the table, since it does not know which entries the
Sender might send again. The table then keeps on growing until the
JVM runs out of memory.
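To see why the reading side has to hold on to everything it has received, here is a small sketch, again assuming in-memory streams instead of a socket. The class name ReceiverCacheDemo and the helper sameInstanceAfterRoundTrip are hypothetical names for this illustration. It checks whether two reads return the very same instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

public class ReceiverCacheDemo {
  // Writes one list twice and reports whether the reader hands back
  // the identical instance for both reads.
  static boolean sameInstanceAfterRoundTrip(boolean resetBetweenWrites)
      throws IOException, ClassNotFoundException {
    ArrayList<String> payload = new ArrayList<>();
    payload.add("Heinz");
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
      oos.writeObject(payload);
      if (resetBetweenWrites) oos.reset();
      oos.writeObject(payload);
    }
    try (ObjectInputStream ois = new ObjectInputStream(
        new ByteArrayInputStream(bytes.toByteArray()))) {
      Object first = ois.readObject();
      Object second = ois.readObject();
      // Identity comparison on purpose: we care about the cache, not equals().
      return first == second;
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("no reset, same instance: "
        + sameInstanceAfterRoundTrip(false)); // true
    System.out.println("reset, same instance:    "
        + sameInstanceAfterRoundTrip(true));  // false
  }
}
```

Because the reader must be able to resolve a handle to any earlier object, it keeps a reference to each one until it is told to reset, which is what stops the Garbage Collector from reclaiming them.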
Resetting ObjectOutputStream
One hack^H^H^H^Hsolution to the OutOfMemory problem is to
reset the cache on both sides every time that you send an
object. Let's try out what that does to our performance:
import java.net.Socket;
import java.io.*;

public class Sender3 {
  public static void main(String[] args) throws IOException {
    long start = System.currentTimeMillis();
    Socket s = new Socket("localhost", 7000);
    ObjectOutputStream oos = new ObjectOutputStream(
      s.getOutputStream());
    for (int age = 0; age < 1500 * 1000; age++) {
      oos.writeObject(new Person("Heinz", "Kabutz", age));
      oos.reset();
    }
    long end = System.currentTimeMillis();
    System.out.println("That took " + (end - start) + "ms");
  }
}
When I ran that, it worked without causing any
OutOfMemoryError, so I should be happy. But am I happy? I am old,
after having to wait for 314242ms for it to complete, i.e.
16 times longer than with Sender. Sender was fast, but
incorrect. Sender2 ran out of memory. Sender3 was correct,
but slow. Is there no better way?
The problem with reset() is that it clears the cache of ALL
objects, even constants such as the Strings "Heinz" and
"Kabutz". So, we end up sending these constants over the
network time and time again! Unfortunately the reset() is
an all-or-nothing approach, so the entire cache will be lost.
But perhaps, if we don't clear it all the time, we can get
the advantage of speed and correctness? Let's try that out:
import java.net.Socket;
import java.io.*;

public class Sender4 {
  public static void main(String[] args) throws IOException {
    long start = System.currentTimeMillis();
    Socket s = new Socket("localhost", 7000);
    ObjectOutputStream oos = new ObjectOutputStream(
      s.getOutputStream());
    for (int age = 0; age < 1500 * 1000; age++) {
      oos.writeObject(new Person("Heinz", "Kabutz", age));
      if (age % 1000 == 0) oos.reset();
    }
    long end = System.currentTimeMillis();
    System.out.println("That took " + (end - start) + "ms");
  }
}
Because I don't reset the cache on every call, Sender4
avoids sending the Strings "Heinz" and "Kabutz" over the
network 1'500'000 times, and it finishes in just 66015ms. In fact, it only has
to send these Strings 1'500 times. If we reset the
ObjectOutputStream too frequently, we will increase the
network traffic, and if we do not reset it often enough, we
will increase the burden on our Garbage Collector. Like all
things in Java Performance Tuning, you have to set it to the
correct number, not too big and not too small.
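The bandwidth side of this trade-off can be estimated without a network by counting the bytes the stream produces. This is a rough sketch under stated assumptions: it writes to a ByteArrayOutputStream rather than a socket, uses a local copy of Person, and the names ResetIntervalDemo and bytesWritten are invented for the example. It says nothing about timing, only about stream size.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ResetIntervalDemo {
  // Local copy of the newsletter's Person, reduced to its fields.
  static class Person implements Serializable {
    private final String firstName;
    private final String surname;
    private final int age;
    Person(String firstName, String surname, int age) {
      this.firstName = firstName;
      this.surname = surname;
      this.age = age;
    }
  }

  // Serializes `objects` distinct Person instances, resetting the stream
  // whenever the counter is a multiple of `resetInterval`, as in Sender4.
  // Returns the total number of bytes produced.
  static int bytesWritten(int objects, int resetInterval) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
      for (int age = 0; age < objects; age++) {
        oos.writeObject(new Person("Heinz", "Kabutz", age));
        if (age % resetInterval == 0) oos.reset();
      }
    }
    return bytes.size();
  }

  public static void main(String[] args) throws IOException {
    int objects = 10000;
    // Interval 1 behaves like Sender3, interval 1000 like Sender4.
    System.out.println("reset every object:     "
        + bytesWritten(objects, 1) + " bytes");
    System.out.println("reset every 1000th one: "
        + bytesWritten(objects, 1000) + " bytes");
  }
}
```

With an interval of 1, the class description and both Strings are written again for every Person, so the first figure should come out considerably larger than the second.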
What About RMI?
I seem to recall that at some point, RMI used the
ObjectOutputStream mechanism to convert the parameters of
your methods into a byte[]. The interesting part was that
it would create the ObjectOutputStream, write the objects, and
then close the ObjectOutputStream. This is akin to resetting
the stream each time that you write to it.
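The general pattern of "new stream per message" might look like the sketch below. This is only an illustration of the idea, not RMI's actual implementation; the class FreshStreamDemo and its helpers toBytes and fromBytes are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

public class FreshStreamDemo {
  // Serializes one object with its own short-lived ObjectOutputStream,
  // so no cache survives from one call to the next.
  static byte[] toBytes(Serializable obj) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
      oos.writeObject(obj);
    }
    return bytes.toByteArray();
  }

  // Reads a single object back from a byte[] produced by toBytes().
  static Object fromBytes(byte[] data)
      throws IOException, ClassNotFoundException {
    try (ObjectInputStream ois = new ObjectInputStream(
        new ByteArrayInputStream(data))) {
      return ois.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    ArrayList<Integer> list = new ArrayList<>();
    list.add(1);
    byte[] before = toBytes(list);
    list.add(2);                      // mutate the same instance
    byte[] after = toBytes(list);     // fresh stream, so the change is seen
    System.out.println(fromBytes(before)); // [1]
    System.out.println(fromBytes(after));  // [1, 2]
  }
}
```

Every call pays for the stream header and the class descriptions again, which is the same cost that Sender3 pays by resetting after each object.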
Depending on how you would want to transfer your data between
two machines, and depending on how many times there will be
identical objects sent across the network, it may pay you to
use ObjectOutputStreams directly, and be careful to reset the
stream before you run out of memory.
In the
last newsletter, I suggested that you could use the sun.*
classes in your code. I did not emphasize strongly enough
that you should be very careful of using sun.*
classes in your code, since it would make your Java code
non-portable between JVM vendors. This is a newsletter for
Java Specialists so I will sometimes leave out such
obvious details :-) However, several readers mentioned that
you could achieve the same with a SecurityManager, which I
had forgotten about. I guess if you were not able to use the
SecurityManager, you could generate a stack trace and find
out who called you. However, generating a stack trace
would be rather inefficient (another obvious fact that is
hardly worth mentioning ;-)
I want to personally thank you for taking the time to read
my newsletters. They are a wonderful hobby for me and I
thoroughly enjoy publishing them as a free resource to other
Java Specialists. Please remember to forward them to
friends, mention them on mailing lists, tell colleagues, etc.
so that others can also enjoy them :-)
Lastly, I am collecting quotes of what my happy readers think
of my newsletter. If you have some nice words that would
make others subscribe to The Java(tm) Specialists' Newsletter, would you please send them to me?
Kind regards
Heinz