I read an article about bad Java coding practices. It was in Korean, and when I searched for its source I found the original text below.

The original can be found at [http://www.odi.ch/weblog/index.php]. The content seemed worth keeping around, so I copied it here in full.
It appears to have been written around July 2007. Some of it may no longer fit the way we solve these problems today, but I hope it helps us produce more valuable code.


Java Anti-Patterns


This page collects some bad code that may not look so obviously bad to beginners. Beginners often struggle with the language syntax. They also have little knowledge about the standard JDK class library and how to make the best use of it. In fact I have collected all the examples from everyday junior code. I have modified the original code to give it example character and to highlight the problems.


String concatenation

String s = "";
for (Person p : persons) {
s += ", " + p.getName();
}
s = s.substring(2); // remove first comma
This is a real performance killer: O(persons.length²). The repeated concatenation of strings in a loop causes excess garbage and array copying. Moreover it is ugly that the resulting string has to be fixed for an extra comma.
StringBuilder sb = new StringBuilder(persons.size() * 16); // well estimated buffer
for (Person p : persons) {
if (sb.length() > 0) sb.append(", ");
sb.append(p.getName());
}

Lost StringBuffer performance

StringBuffer sb = new StringBuffer();
sb.append("Name: ");
sb.append(name + '\n');
sb.append("!");

String s = sb.toString();
Despite good intentions the above code is not perfect. The most obvious mistake is the string concatenation in line 3. In line 4, appending a char would be faster than appending a String. Another major omission is the missing length initialization of the buffer, which may incur unnecessary resizing (array copying). In JDK 1.5 a StringBuilder should have been used instead of StringBuffer: because it is only a local variable, the synchronization is overkill.
StringBuilder sb = new StringBuilder(100);
sb.append("Name: ");
sb.append(name);
sb.append('\n');
sb.append('!');
String s = sb.toString();

Testing for string equality

if (name.compareTo("John") == 0) …
if (name == "John") …
if (name.equals("John")) …
if ("".equals(name)) …
None of the above comparisons is wrong – but none of them is really good either. The compareTo method is overkill and too verbose. The == operator tests for object identity, which is probably not what you want. The equals method is the way to go, but reversing the constant and the variable gives you extra safety if name is null. When testing for empty strings it is faster to check whether the length is 0 than to call equals.
if ("John".equals(name)) …
if (name.length() == 0) …
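A tiny standalone sketch (my own, not from the original article) shows the null-safety difference between the two orderings:

```java
public class EqualsDemo {
    public static void main(String[] args) {
        String name = null; // e.g. a form field the user left empty

        // constant-first: null-safe, simply evaluates to false
        System.out.println("John".equals(name)); // false

        // variable-first: dereferences null and blows up
        try {
            name.equals("John");
        } catch (NullPointerException e) {
            System.out.println("NPE on name.equals(...)");
        }
    }
}
```

Note that name.length() == 0 also throws if name is null, so the length check only replaces the equals call when name is known to be non-null.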

Converting numbers to Strings

"" + set.size()
new Integer(set.size()).toString()
The return type of the Set.size() method is int. A conversion to String is wanted. These two examples in fact do the conversion. But the first incurs the penalty of a concatenation operation. And the second creates an intermediate Integer wrapper. The correct way of doing it is
String.valueOf(set.size())

Not taking advantage of immutable objects

zero = new Integer(0);
return Boolean.valueOf("true");
Integer as well as Boolean are immutable. Thus it doesn’t make sense to create several objects that represent the same value. Those classes have built-in caches for frequently used instances. In the case of Boolean there are even only two possible instances. The programmer can take advantage of this:
zero = Integer.valueOf(0);
return Boolean.TRUE;
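The caching is directly observable with the identity operator; a small sketch of my own (not part of the original article):

```java
public class ValueOfDemo {
    public static void main(String[] args) {
        // valueOf serves values in -128..127 from a shared cache,
        // so both calls return the very same object
        System.out.println(Integer.valueOf(0) == Integer.valueOf(0)); // true

        // the constructor always allocates a fresh instance
        System.out.println(new Integer(0) == new Integer(0)); // false

        // Boolean has exactly two canonical instances
        System.out.println(Boolean.valueOf("true") == Boolean.TRUE); // true
    }
}
```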

XML parsers are for sissies

int start = xml.indexOf("<name>");
int end = xml.indexOf("</name>");
String name = xml.substring(start, end);
This naïve XML parsing only works for the simplest XML documents. It will fail if a) the name element is not unique in the document, b) the content of name is not only character data, c) the text data of name contains escaped characters, d) the text data is specified as a CDATA section, or e) the document uses XML namespaces. XML is far too complex for string operations. There is a reason why XML parsers like Xerces are over one megabyte of jar files! The equivalent with JDOM is:
SAXBuilder builder = new SAXBuilder(false);
Document doc = builder.build(new StringReader(xml));
String name = doc.getRootElement().getChild("name").getText();

The XML encoding trap

String xml = FileUtils.readTextFile("my.xml");
It is a very bad idea to read an XML file and store it in a String. An XML file specifies its encoding in its XML declaration, but when reading a file into a String you have to know the encoding beforehand! Storing an XML file in a String also wastes memory. All XML parsers accept an InputStream as a parsing source, and they figure out the encoding themselves correctly. The byte order (big-endian, little-endian) is another trap when a multi-byte encoding such as UTF-16 is used. XML files may carry a byte order mark at the beginning that specifies the byte order.

Undefined encoding

Reader r = new FileReader(file);
Writer w = new FileWriter(file);
Reader r = new InputStreamReader(inputStream);
Writer w = new OutputStreamWriter(outputStream);
String s = new String(byteArray); // byteArray is a byte[]
byte[] a = string.getBytes();
Each line of the above converts between byte and char using the default platform encoding. The code behaves differently depending on the platform it runs on. This is harmful if the data flows from one platform to another. It is considered bad practice to rely on the default platform encoding at all. Conversions should always be performed with a defined encoding.
Reader r = new InputStreamReader(new FileInputStream(file), "ISO-8859-1");
Writer w = new OutputStreamWriter(new FileOutputStream(file), "ISO-8859-1");
Reader r = new InputStreamReader(inputStream, "UTF-8");
Writer w = new OutputStreamWriter(outputStream, "UTF-8");
String s = new String(byteArray, "ASCII");
byte[] a = string.getBytes("ASCII");
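To see why the encoding must be pinned down, here is a small sketch of my own (the string "Zürich" is just an example, and I use the Java 7 StandardCharsets constants to avoid the checked exception): the same text produces different bytes per charset, and decoding with the wrong charset silently corrupts it.

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    public static void main(String[] args) {
        String s = "Zürich";
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);        // 'ü' becomes two bytes
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1); // 'ü' becomes one byte
        System.out.println(utf8.length);   // 7
        System.out.println(latin1.length); // 6

        // decoding UTF-8 bytes as ISO-8859-1 yields mojibake, with no exception
        System.out.println(new String(utf8, StandardCharsets.ISO_8859_1)); // ZÃ¼rich
    }
}
```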

Unbuffered streams

InputStream in = new FileInputStream(file);
int b;
while ((b = in.read()) != -1) {

}
The above code reads a file byte by byte. Every read() call on the stream will cause a native (JNI) call to the filesystem of the underlying operating system. JNI calls are expensive. The number of native calls can be reduced dramatically by wrapping the stream into a BufferedInputStream. Reading 1 MB of data from /dev/zero with the above code took about 1 second on my laptop. With the fixed code below it was down to 60 milliseconds! That’s a 94% saving. This also applies for output streams of course. And it is true not only for the file system but also for sockets.
InputStream in = new BufferedInputStream(new FileInputStream(file));
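A self-contained version of that measurement can be sketched as follows (my own code, using a temp file as a portable stand-in for /dev/zero; absolute timings vary by machine):

```java
import java.io.*;

public class BufferedReadDemo {
    public static void main(String[] args) throws IOException {
        // create a 1 MB scratch file
        File file = File.createTempFile("buffered-demo", ".bin");
        file.deleteOnExit();
        try (OutputStream out = new BufferedOutputStream(new FileOutputStream(file))) {
            for (int i = 0; i < 1_000_000; i++) out.write(0);
        }

        // byte-by-byte read through a buffer: one native call per buffer fill,
        // not one per byte
        long count = 0;
        long t0 = System.nanoTime();
        try (InputStream in = new BufferedInputStream(new FileInputStream(file))) {
            while (in.read() != -1) count++;
        }
        System.out.println(count + " bytes in " + (System.nanoTime() - t0) / 1_000_000 + " ms");
    }
}
```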

Infinite heap

byte[] pdf = toPdf(file);
Here a method creates a PDF file from some input and returns the binary PDF data as a byte array. This code assumes that the generated file is small enough to fit into the available heap memory. If it cannot guarantee that, it is vulnerable to an out-of-memory condition – especially when run server-side, where many threads usually execute in parallel. Bulk data must never be handled with byte arrays. Use streams and spool the data to disk or a database.
File pdf = toPdf(file);

Catch all: I don’t know the right runtime exception

Query q = …
Person p;
try {
p = (Person) q.getSingleResult();
} catch(Exception e) {
p = null;
}
This is an example of a J2EE EJB3 query. The getSingleResult method throws runtime exceptions when a) the result is not unique, b) there is no result, or c) the query could not be executed, e.g. due to database failure. The code above just catches any exception – a typical catch-all block. Using null as a result may be the right thing for case b), but not for cases a) or c). In general, one should not catch more exceptions than necessary. The correct exception handling is
Query q = …
Person p;
try {
p = (Person) q.getSingleResult();
} catch(NoResultException e) {
p = null;
}

Exceptions are annoying

try {
doStuff();
} catch(Exception e) {
log.fatal("Could not do stuff");
}
doMoreStuff();
There are two problems with this tiny piece of code. First, if this is really a fatal condition, then the method should abort and notify the caller of it with an appropriate exception; you can hardly ever just continue after a fatal condition. Second, this code is very hard to debug because the reason for the failure is lost. Exception objects carry detailed information about where the error occurred and what caused it. If you catch high-level exceptions, then at least log the message and the stack trace.
try {
doStuff();
} catch(Exception e) {
throw new MyRuntimeException("Could not do stuff because: " + e.getMessage(), e);
}

Catching to log

try {

} catch(ExceptionA e) {
log.error(e.getMessage(), e);
throw e;
} catch(ExceptionB e) {
log.error(e.getMessage(), e);
throw e;
}
This code catches exceptions only to write a log statement and then rethrows the same exception. This is stupid. Let the caller decide whether the message is important enough to log, and remove the whole try/catch clause.

Incomplete exception handling

try {
is = new FileInputStream(inFile);
os = new FileOutputStream(outFile);
} finally {
try {
is.close();
os.close();
} catch(IOException e) {
/* we can't do anything */
}
}
If streams are not closed, the underlying operating system can’t free native resources. This programmer wanted to be careful about closing both streams. So he put the close in a finally clause. But if is.close() throws an IOException then os.close is not even executed. Both close statements must be wrapped in their own try/catch clause. Moreover, if creating the input stream throws an exception (because the file was not found) then os is null and os.close() will throw a NullPointerException. To make this less verbose I have stripped some newlines.
try {
is = new FileInputStream(inFile);
os = new FileOutputStream(outFile);
} finally {
try { if (is != null) is.close(); } catch(IOException e) {/* we can't do anything */}
try { if (os != null) os.close(); } catch(IOException e) {/* we can't do anything */}
}
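Since Java 7, which postdates this article, the try-with-resources statement removes this boilerplate entirely: resources are closed automatically in reverse declaration order, and an exception thrown by one close() does not prevent the others. A sketch under that assumption:

```java
import java.io.*;

public class CopyDemo {
    static void copy(File inFile, File outFile) throws IOException {
        // both streams are closed automatically, in reverse declaration order,
        // even if the body or an earlier close() throws; secondary exceptions
        // are attached as "suppressed" rather than lost
        try (InputStream is = new FileInputStream(inFile);
             OutputStream os = new FileOutputStream(outFile)) {
            int b;
            while ((b = is.read()) != -1) os.write(b);
        }
    }

    public static void main(String[] args) throws IOException {
        File in = File.createTempFile("in", ".tmp");
        File out = File.createTempFile("out", ".tmp");
        in.deleteOnExit();
        out.deleteOnExit();
        copy(in, out);
        System.out.println("copied " + out.length() + " bytes");
    }
}
```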

The exception that never happens

try {
… do risky stuff …
} catch(SomeException e) {
// never happens
}
… do some more …
Here the developer executes some code in a try/catch block. He doesn't want to rethrow the exception that, to his annoyance, one of the called methods declares. As the developer is clever, he knows that in his particular situation the exception will never be thrown, so he just inserts an empty catch block. He even puts a nice comment in it – but those are famous last words... The problem is: how can he be sure? What if the implementation of the called method changes? What if the exception is still thrown in some special case he just didn't think of? The code after the try/catch may do the wrong thing in that situation, and the exception will go completely unnoticed. The code can be made much more reliable by throwing a runtime exception in that case. This works like an assertion and adheres to the "crash early" principle: the developer will notice if his assumption was wrong, and the code after the try/catch will not be executed if the exception occurred against all honest hope and expectation. If the exception really never occurs – fine, nothing changed.
try {
… do risky stuff …
} catch(SomeException e) {
// never happens hopefully
throw new IllegalStateException(e.getMessage(), e); // crash early, passing all information
}
… do some more …

The transient trap

public class A implements Serializable {
private String someState;
private transient Log log = LogFactory.getLog(getClass());

public void f() {
log.debug("enter f");

}
}

Log objects are not serializable. The programmer knew this and correctly declared the log field as transient so it is not serialized. However, the initialization of this variable happens in the field initializer. Upon deserialization, field initializers and constructors are not executed! This leaves the deserialized object with a null log variable, which subsequently causes a NullPointerException in f(). Rule of thumb: never initialize transient variables in field initializers. You can solve this case either with a static variable or with a local variable:
public class A implements Serializable {
private String someState;
private static Log log = LogFactory.getLog(A.class); // getClass() is not available in a static initializer

public void f() {
log.debug("enter f");

}
}

public class A implements Serializable {
private String someState;

public void f() {
Log log = LogFactory.getLog(getClass());
log.debug("enter f");

}
}


Overkill initialization

public class B {
private int count = 0;
private String name = null;
private boolean important = false;
}
This programmer used to code in C. So naturally he wants to make sure every variable is properly initialized. Here however it is not necessary. The Java language specification guarantees that member variables are initialized with certain values automatically: 0, null, false. By declaring them explicitly the programmer causes a class initializer to be executed before the constructor. This is unnecessary overkill and should be avoided.
public class B {
private int count;
private String name;
private boolean important;
}

Too much static

private static Log LOG = LogFactory.getLog(MyClass.class);
Many developers store Log instances in static member variables. This is somewhat correct from a design point of view, but totally unnecessary here. LogFactory already keeps static references to the Log objects, so the getLog call is essentially a lookup in a hash table and not expensive. Having every class in your project equipped with a static variable is very bad, as it prevents class garbage collection. Only in rare cases, such as serializable classes, are static variables the way to go. A static variable also prevents you from using getClass() to find the current class name; instead you must hardcode the class name, so you can't share the Log instance with subclasses but must declare it private.
protected Log log = LogFactory.getLog(getClass());

Choosing the wrong class loader

Class clazz = Class.forName(name);
This code uses the class loader that loaded the current class. That is hardly ever what you want when dynamically loading an additional class. Especially in managed environments like application servers, servlet engines or Java Web Start, this is most certainly wrong. The code will behave very differently depending on the environment it runs in. Such environments use the context class loader to provide applications with the class loader they should use to retrieve "their own" classes.
ClassLoader cl = Thread.currentThread().getContextClassLoader();
if (cl == null) cl = getClass().getClassLoader(); // fallback
Class clazz = cl.loadClass(name);

Poor use of reflection

Class beanClass = …
if (beanClass.newInstance() instanceof TestBean) …
This programmer is struggling with the reflection API. He needs a way to check for inheritance but didn't find one, so he just creates a new instance and uses the familiar instanceof operator. Creating an instance of a class you don't know is dangerous: you never know what that class does. Instantiation may be very expensive, or the default constructor may not even exist – then this if statement would throw an exception. The correct way to do this check is the Class.isAssignableFrom(Class) method. Note that its semantics are the reverse of instanceof.
Class beanClass = …
if (TestBean.class.isAssignableFrom(beanClass)) …
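The reversed reading becomes clear in a small sketch (my own, using JDK classes as stand-ins for TestBean):

```java
public class AssignableDemo {
    public static void main(String[] args) {
        // (x instanceof Number) corresponds to Number.class.isAssignableFrom(x.getClass())
        System.out.println(Number.class.isAssignableFrom(Integer.class));      // true
        System.out.println(Integer.class.isAssignableFrom(Number.class));      // false
        // works for interfaces too, and never instantiates anything
        System.out.println(CharSequence.class.isAssignableFrom(String.class)); // true
    }
}
```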

Synchronization overkill

Collection l = new Vector();
for (…) {
l.add(object);
}
Vector is a synchronized ArrayList. And Hashtable is a synchronized HashMap. Both classes should only be used if concurrent access requires it. If however those collections are used as local temporary variables the synchronization is complete overkill and degrades performance considerably.
Collection l = new ArrayList();
for (…) {
l.add(object);
}

Wrong list type

Without sample code. Junior developers often have difficulty choosing the right list type; they usually pick quite randomly from Vector, ArrayList and LinkedList. But there are performance considerations to make! The implementations behave quite differently when adding, iterating, or accessing objects by index. I'll ignore Vector in this list because it behaves like an ArrayList, just slower.

                 ArrayList               LinkedList
add (append)     O(1) or ~O(log(n))      O(1)
insert (middle)  O(n) or ~O(n*log(n))    O(1)
iterate          O(n)                    O(n)
get by index     O(1)                    O(n)
While linked lists have constant insert performance, they use more memory per list entry. The insert performance of the ArrayList depends on whether it has to grow during the insert or whether the initial size was reasonably set. Growing occurs exponentially (by factor 2), so growing costs are O(log(n)). The exponential growth however may use much more memory than you actually need, and the sudden need to resize makes the response time sluggish and will probably cause a major garbage collection if the list is large. Iterating over both lists is equally inexpensive. Indexed element access, however, is of course very slow on linked lists.
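The indexed-access row of the table is easy to feel in practice. A benchmark sketch of my own (size chosen arbitrarily; timings are machine-dependent and indicative only):

```java
import java.util.*;

public class ListAccessDemo {
    // indexed traversal: O(n) total on ArrayList, O(n^2) on LinkedList,
    // because LinkedList.get(i) walks the chain from the nearest end
    static long sumByIndex(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) sum += list.get(i);
        return sum;
    }

    public static void main(String[] args) {
        int n = 50_000;
        List<Integer> array = new ArrayList<>(n); // pre-sized: never has to grow
        List<Integer> linked = new LinkedList<>();
        for (int i = 0; i < n; i++) { array.add(i); linked.add(i); }

        for (List<Integer> list : Arrays.asList(array, linked)) {
            long t0 = System.nanoTime();
            long sum = sumByIndex(list);
            long ms = (System.nanoTime() - t0) / 1_000_000;
            System.out.println(list.getClass().getSimpleName()
                    + ": sum=" + sum + " in " + ms + " ms");
        }
    }
}
```

Iterating with a for-each loop instead of get(i) is O(n) on both implementations.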

I love arrays

/**
* @returns [1]: Location, [2]: Customer, [3]: Incident
*/
Object[] getDetails(int id) {…
Even though documented, this kind of passing back values from a method is ugly and error-prone. You should really declare a small class that holds the objects together. This is analogous to a struct in C.
Details getDetails(int id) {…}

public static class Details { // must be visible to callers of getDetails
public Location location;
public Customer customer;
public Incident incident;
}


Premature object decomposition

public void notify(Person p) {

sendMail(p.getName(), p.getFirstName(), p.getEmail());

}
class PhoneBook {
String lookup(String employeeId) {
Employee emp = …
return emp.getPhone();
}
}
In the first example it is painful to decompose an object just to pass its state on to a method. In the second example the usefulness of the method is very limited. If the overall design allows it, pass the object itself.
public void notify(Person p) {

sendMail(p);

}
class EmployeeDirectory {
Employee lookup(String employeeId) {
Employee emp = …
return emp;
}
}

Modifying setters

private String name;

public void setName(String name) {
this.name = name.trim();
}

public String getName() {
return this.name;
}

This poor developer suffered from spaces at the beginning or end of names entered by users. He thought he was being clever and simply removed the spaces inside the setter method of a bean. But how odd is a bean that modifies its data instead of just holding it? Now the getter returns different data than was set through the setter! If this were done inside an EJB3 entity bean, a simple read from the DB would actually modify the data: for every INSERT there would be an UPDATE statement. Let alone how hard it is to debug such side-effects! In general, a bean should not modify its data: it is a data container, not business logic. Do the trimming where it makes sense: in the controller where the input occurs, or in the logic where the spaces are not wanted.
person.setName(textInput.getText().trim());

Unnecessary Calendar

Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("Europe/Zurich"));
cal.setTime(date);
cal.add(Calendar.HOUR_OF_DAY, 8);
date = cal.getTime();
A typical mistake by a developer who is confused about date, time, calendars and time zones. To add 8 hours to a Date there is no need for a Calendar, and the time zone is of no relevance whatsoever. (Think about it if you don't understand this!) However, if we wanted to add days (not hours), we would need a Calendar, because the length of a day is not fixed: on DST changes, days may have 23 or 25 hours.
date = new Date(date.getTime() + 8L * 3600L * 1000L); // add 8 hrs
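For the arithmetic itself, a quick sanity check of my own (values chosen for determinism):

```java
import java.util.Date;

public class AddHoursDemo {
    public static void main(String[] args) {
        Date date = new Date(0L); // 1970-01-01T00:00:00 UTC
        // 8 hours = 8 * 3600 s * 1000 ms; the long literals avoid int overflow
        date = new Date(date.getTime() + 8L * 3600L * 1000L);
        System.out.println(date.getTime()); // 28800000
    }
}
```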

Relying on the default TimeZone

Calendar cal = new GregorianCalendar();
cal.setTime(date);
cal.set(Calendar.HOUR_OF_DAY, 0);
cal.set(Calendar.MINUTE, 0);
cal.set(Calendar.SECOND, 0);
Date startOfDay = cal.getTime();
The developer wanted to calculate the start of the day (0h00). First, he missed the millisecond field of the Calendar. But the really big mistake is not setting the TimeZone of the Calendar object, which therefore uses the default time zone. This may be fine in a desktop application, but in server-side code it is hardly ever what you want: 0h00 in Shanghai is a very different moment than 0h00 in London. The developer needs to determine which time zone is relevant for this computation.
Calendar cal = new GregorianCalendar(user.getTimeZone());
cal.setTime(date);
cal.set(Calendar.HOUR_OF_DAY, 0);
cal.set(Calendar.MINUTE, 0);
cal.set(Calendar.SECOND, 0);
cal.set(Calendar.MILLISECOND, 0);
Date startOfDay = cal.getTime();

Time zone “conversion”

public static Date convertTz(Date date, TimeZone tz) {
Calendar cal = Calendar.getInstance();
cal.setTimeZone(TimeZone.getTimeZone("UTC"));
cal.setTime(date);
cal.setTimeZone(tz);
return cal.getTime();
}

If you think this method does something useful, please go and read the article about time. This developer had not read that article and was desperately trying to "fix" the time zone of his date. Actually the method does nothing: the returned Date has exactly the same value as the input, because a Date does not carry time zone information. It is always UTC. The getTime/setTime methods of Calendar always convert between UTC and the actual time zone of the Calendar.


Calling Date.setTime()

account.changePassword(oldPass, newPass);
Date lastmod = account.getLastModified();
lastmod.setTime(System.currentTimeMillis());

The above code updates the last modified date of the account entity. The programmer wants to be conservative and avoid creating a new Date object, so she uses the setTime method to modify the existing Date instance.


There is actually nothing wrong with that. But it is just not recommended practice. Date objects are usually passed around carelessly. The same Date instance could be passed to numerous objects, which don’t make a copy in their setters. Dates are often used like primitives. Thus if you modify a Date instance, other objects that use this instance might behave unexpectedly. Of course it is unclean design if an object exposes its intrinsic Date instance to the outside world, if you write code that strictly adheres to classical OO-principles (which I think is too inconvenient). General everyday Java practice however is to just copy Date references and not clone the object in setters. Thus every programmer should treat Date as immutable and should not modify existing instances. This should only be done for performance reasons in special situations. Even then the use of a simple long is probably equally good.

account.changePassword(oldPass, newPass);
account.setLastModified(new Date());
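The aliasing problem is easy to reproduce. In this sketch of my own (with arbitrary timestamps), two variables silently share one Date instance:

```java
import java.util.Date;

public class DateAliasingDemo {
    public static void main(String[] args) {
        Date created = new Date(0L);  // stored in some entity
        Date lastmod = created;       // setters typically copy the reference only

        lastmod.setTime(86_400_000L); // "update" one field...

        // ...and the other one changed too: both names point at the same object
        System.out.println(created == lastmod); // true
        System.out.println(created.getTime());  // 86400000, no longer 0
    }
}
```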

Not noticing overflows

public int getFileSize(File f) {
long l = f.length();
return (int) l;
}

This developer, for whatever reason, wrapped a call to determine the size of a file into a method that returns an int instead of a long. This code does not support files larger than 2 GB and just returns a wrong length in that case. Code that casts a value to a smaller size type must first check for a possible overflow and throw an exception.

public int getFileSize(File f) {
long l = f.length();
if (l > Integer.MAX_VALUE) throw new IllegalStateException("int overflow");
return (int) l;
}

Another version of an overflow bug is the following. Note the missing parentheses in the first println statement.

long a = System.currentTimeMillis();
long b = a + 100;
System.out.println((int) b-a);
System.out.println((int) (b-a));
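With fixed values instead of System.currentTimeMillis() the difference becomes deterministic (a sketch of my own):

```java
public class CastPrecedenceDemo {
    public static void main(String[] args) {
        long a = 1_000_000_000_000L;
        long b = a + 100;

        // the cast binds tighter than '-': this is ((int) b) - a,
        // i.e. long arithmetic on a truncated b -- garbage
        System.out.println((int) b - a);   // -1000727379868

        // this casts the small difference, which fits into an int
        System.out.println((int) (b - a)); // 100
    }
}
```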

Storing money in floating point variables

float total = 0.0f;
for (OrderLine line : lines) {
total += line.price * line.count;
}
double a = 1.14 * 75; // 85.5 represented as 85.4999…
System.out.println(Math.round(a)); // surprising output: 85

I have seen many developers code such a loop – including myself in my early days. When this code sums 100 order lines, each holding a single $0.30 item, the resulting total is calculated as exactly 29.999971. The developer notices the strange behaviour and changes float to the more precise double, only to get 30.000001192092896. The somewhat surprising result is of course due to the difference in number representation between humans (decimal) and computers (binary). It always strikes in its most annoying form when you add fractional amounts of money or calculate the VAT.


There are business cases where you cannot afford to lose precision. You lose precision when converting between decimal and binary, and when rounding happens in a not well-defined manner or at indeterminate points. To avoid losing precision you must use fixed-point or integer arithmetic. This does not only apply to monetary values, but money is a frequent source of annoyance in business applications and therefore makes a good example. In the second example above, an unsuspecting user of the program would simply say the computer's calculator is broken. That is of course very embarrassing for the programmer.


Consequently a non-integer amount of money should never ever be stored in a floating point data type. Even a simple multiplication with an integer can already yield an inexact result. If you see a float or double in your financial code base, the code will most likely yield inexact results. Instead either a string or fixed point representation should be chosen. A text representation must be in a well-defined format and is not to be confused with user input/output in a locale specific format. Both representations must define the precision (number of digits before and after the decimal point) that is stored.


For calculations the class BigDecimal provides an excellent facility. The class can be used such that it throws runtime exceptions if precision is unexpectedly lost in an operation. This is very helpful to uproot subtle numerical bugs and enables the developer to correct the calculation.

BigDecimal total = BigDecimal.ZERO;
for (OrderLine line : lines) {
BigDecimal price = new BigDecimal(line.price); // line.price must be an exact representation (a String), never a double
BigDecimal count = new BigDecimal(line.count);
total = total.add(price.multiply(count)); // BigDecimal is immutable!
}
total = total.setScale(2, RoundingMode.HALF_UP);
BigDecimal a = new BigDecimal("1.14").multiply(new BigDecimal(75)); // 85.50 exact; note the String constructor
a = a.setScale(0, RoundingMode.HALF_UP); // 86
System.out.println(a); // correct output: 86

Abusing finalize()

public class FileBackedCache {
private RandomAccessFile backingStore; // a File has no close(); a RandomAccessFile holds an actual OS handle

protected void finalize() throws IOException {
if (backingStore != null) {
backingStore.close();
backingStore = null;
}
}
}


This class uses the finalize method to release a file handle. The problem is that you don't know when the method is called: it is called by the garbage collector. If you are running out of file handles you want this method to be called sooner rather than later, but the GC will probably only invoke it when you are about to run out of heap – a very different situation. It may take anything from milliseconds to days until GC and finalization run. The garbage collector manages memory, and only memory. It does that very well, but it must not be abused to manage any other resource. The GC is not a generic resource management mechanism! I find Sun's API doc of the finalize method very misleading in that respect: it actually suggests using this method to close I/O resources – complete bullshit if you ask me.


Better code provides a public close method, which must be called by a well-defined lifecycle management, like JBoss MBeans or so.

public class FileBackedCache {
private RandomAccessFile backingStore;

public void close() throws IOException {
if (backingStore != null) {
backingStore.close();
backingStore = null;
}
}
}

