What are anti-patterns in Java design

Performance anti-patterns

Java magazine

06/08

Written by:
Mirko Novakovic

Performance is a critical requirement in Java EE projects. Across many tuning projects, recurring anti-patterns have been identified that are easy to recognize and can therefore be avoided in your own projects.

Java Enterprise projects play an increasingly integrative role in many companies. Modern architecture approaches such as SOA, which integrate many external and internal services, place new demands on performance and stability and, at the same time, make it more difficult to analyze and resolve problems because the application is increasingly distributed. So-called “troubleshooting” assignments have yielded practical experience on how performance and stability problems can be avoided in order to improve the quality of these complex architectures. This article describes performance anti-patterns that help to identify and avoid such problems.

Patterns originated in building architecture, where they were first written down by Christopher Alexander [1], and were later transferred to software engineering. But it is only since the publication of the book "Design Patterns" by Erich Gamma [2] and his colleagues that patterns have reached the masses and become very important in software development. Design patterns describe a proven solution to an object-oriented design problem in a uniformly defined form. In modern frameworks such as Spring, the described patterns (e.g. factory or proxy) can be found everywhere in the code, which makes the code much easier to understand. In the Java Enterprise world there are the Java EE patterns and blueprints from Sun [3], as well as patterns for many specialized disciplines such as EJB [4] or EAI [5]. These templates help with your daily work and save you from reinventing the wheel over and over again.

Anti-Patterns and Performance

Anti-patterns, on the other hand, are less widespread. "Bitter Java" by Bruce Tate [6] provides a list of Java anti-patterns, some of which also affect Java performance. Anti-patterns are examples of poorly executed solution patterns and in this way provide hints for improvement. Uniform documentation has proven its worth for anti-patterns as well. A clear name is particularly important so that everyone in a team understands what is being discussed. (When you talk about “Singleton” today, everyone knows what is meant. Whether that was the case before “Design Patterns” is questionable - this is a crucial value of patterns.) Performance anti-patterns specialize general anti-patterns for performance and stability-related problems. This means that a “successfully” implemented performance anti-pattern usually leads to poor response times, low throughput and/or poor availability - and probably also to dissatisfied users and angry managers. The problem is that implementing a design pattern often causes performance problems, so that a design pattern can sometimes become a performance anti-pattern.

A good example of this is the Java Petstore Blueprint application, which was "misused" by Microsoft as a performance benchmark application in comparison to .NET [7] and exhibited rather poor performance behavior. In response, JPetstore [8] was built, a performance-optimized variant of the application in which many Java EE patterns were eliminated. Among other things, the lean O/R mapping framework iBatis [9] was created - here as a solution for the EJB 2.0 Entity Bean performance anti-pattern. (The Spring Framework was also created as a solution to the complexity anti-pattern of EJB [10] - though EJB also had many positive effects on the Java world.)

Performance problems in Java projects

From experience in many troubleshooting assignments, three core problems can be identified in Java applications that lead to performance and stability problems:

  • Memory problems (memory leaks, garbage collection and session sizes)
  • DB problems (number of statements, O/R mapping, DB design and tuning)
  • Incorrect use of external frameworks.

In addition to these code and design problems, organizational and architectural reasons for performance problems are repeatedly uncovered, which are usually even more serious and can only be remedied with great effort, e.g. by refactoring. In the following, the performance anti-patterns are therefore divided into three categories:

  • Organizational anti-patterns affect the organization, the procedure and the communication in a project.
  • Architecture anti-patterns affect the strategic and structural decisions for an application.
  • Implementation anti-patterns affect the implementation and configuration of the application and infrastructure components.

In the following sections examples of performance anti-patterns in the different categories are given.

Performance tools

A doctor needs tools to diagnose a patient's symptoms so that the right therapy can be chosen and, hopefully, the patient recovers quickly. In medicine, a diagnosis is made through a clinical examination supported by tools such as X-ray machines, ultrasound or magnetic resonance imaging. Only the addition of these tools to the clinical examination made it possible to diagnose and cure certain diseases precisely. The same diagnostic principle applies to performance analysis: tools are required to diagnose performance or stability “complaints”. Unfortunately, the "laying on of hands" is still the norm in software development. The software components described here form a tool kit for the performance tuner.

Profilers and memory debuggers

Profilers and memory debuggers are developer tools that enable runtime analyses of the application down to the code level and allow a snapshot of the heap to be taken in order to examine the object graphs for possible memory leaks. Almost all profilers use the JVM Tool Interface (JVMTI) for this, which among other things enables bytecode instrumentation of the application at runtime. Profilers and memory debuggers are well suited to the detailed analysis of a performance problem, but their overhead is usually far too high for use under load or in production. Modern JVMs now also ship with many performance analysis tools, such as jhat for analyzing the heap. The BEA JRockit JVM even has an integrated profiler and memory leak analysis tool. Ideally, the tool is controlled from Ant so that profiling runs can be generated automatically with the build. JProbe, JProfiler and YourKit are commercial profiling tools that integrate all the functions you need.

Diagnostic tools

Diagnostic tools make it possible to perform performance and memory analyses under real conditions, i.e. in production or in the load test environment. Usually, in addition to the runtimes, information about the application server, the operating system and the peripheral systems involved (DB, MOM) is also collected and aggregated. Most tools work with statistical data, although there are now modern tools that measure the real access path of each call and thus allow targeted error analyses. The overhead of these tools should be between 3 and 10 percent. These diagnostic tools can be used to analyze performance and stability problems that only occur under load. dynaTrace diagnostics, Quest PerformaSure and CA / Wily Introscope are three representatives of this category.

Monitoring tools

Monitoring tools observe the application in operation. They mostly offer threshold-based alarm and escalation functions and provide so-called dashboards for monitoring and analyzing problems. Many monitoring tools also have diagnostic functions that are switched on when required, triggered by an escalation, or run permanently (but then mostly restricted by filters in order to keep the overhead low). Monitoring is important in order to be able to react proactively when bottlenecks loom. Where a Service Level Agreement (SLA) exists, these tools can also check compliance with it and generate corresponding reports. With the help of the Java Management Extensions (JMX) you can integrate your own application data and frameworks into the monitoring. dynaTrace diagnostics, Quest Foglight, IBM ITCAM and CA / Wily Introscope are examples of tools in this category.
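As a hedged illustration of the JMX integration mentioned above, the following sketch exposes two application counters as a Standard MBean on the platform MBean server. The class names, the counters and the object name com.example.shop:type=OrderStats are invented for this example and do not come from the article.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class JmxRegistrationExample {

    // Management interface. By the Standard MBean convention its name must be the
    // implementing class name plus "MBean", and it must be public.
    public interface OrderStatsMBean {
        long getProcessedOrders();
        long getFailedOrders();
    }

    // Hypothetical application component that counts its own business events.
    public static class OrderStats implements OrderStatsMBean {
        private final AtomicLong processed = new AtomicLong();
        private final AtomicLong failed = new AtomicLong();

        public void recordSuccess() { processed.incrementAndGet(); }
        public void recordFailure() { failed.incrementAndGet(); }

        @Override public long getProcessedOrders() { return processed.get(); }
        @Override public long getFailedOrders() { return failed.get(); }
    }

    // Registers the bean under an illustrative object name, replacing an earlier registration.
    static ObjectName register(OrderStats stats) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("com.example.shop:type=OrderStats");
            if (server.isRegistered(name)) {
                server.unregisterMBean(name);
            }
            server.registerMBean(stats, name);
            return name;
        } catch (JMException e) {
            throw new IllegalStateException("could not register MBean", e);
        }
    }

    // Reads an attribute the same way a monitoring tool would, through the MBean server.
    static Object read(ObjectName name, String attribute) {
        try {
            return ManagementFactory.getPlatformMBeanServer().getAttribute(name, attribute);
        } catch (JMException e) {
            throw new IllegalStateException("could not read " + attribute, e);
        }
    }

    public static void main(String[] args) {
        OrderStats stats = new OrderStats();
        ObjectName name = register(stats);
        stats.recordSuccess();
        stats.recordFailure();
        System.out.println("ProcessedOrders = " + read(name, "ProcessedOrders"));
        System.out.println("FailedOrders    = " + read(name, "FailedOrders"));
    }
}
```

Any JMX-capable monitoring tool or console attached to the JVM should then see the two attributes alongside the JVM's own memory and thread beans.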

Load generators

Load generators record user interactions and can play them back with a certain number of virtual users in order to generate a targeted load on a system. The tools differ mainly in the protocols supported (HTTP, HTTPS, RMI, CORBA, ...), the scripting languages used and the range of functions and adaptability of the APIs. The more complex the simulated use cases, the more important it is to be able to easily develop and expand the scripts with your own data and functions. AJAX also places new demands on load test tools, which should be taken into account when selecting a tool. A detailed overview and experience report on the subject of load test tools will follow in a later issue of Java Magazine. Borland SilkPerformer, HP / Mercury LoadRunner and the open source tool JMeter are well-known representatives of this category.

Fig. 1: Multi-layering architecture

Organizational performance anti-patterns

When it comes to organizational performance anti-patterns, there are two candidates that can be found in most troubleshooting projects: the "parallel screwing" and the "shot in the dark" performance anti-pattern (please email the authors if someone has a brilliant idea for better names). With the “parallel screwing” anti-pattern, the developers are let loose by the project management with concentrated force on the performance problems. Each team member develops their own tuning theories and implements the first measures as quickly as possible. The performance of the improved application is measured at regular intervals - usually with little success.

The consequence of this procedure is very high personnel costs with little measurable benefit. This is often because the implemented tuning measures influence each other, so the effect of each individual measure cannot be assessed. For example, one developer's tuning measure might have brought a 20 percent improvement, while a measure implemented in parallel made things 50 percent worse. After the measurement, both measures are then wrongly discarded. As a negative side effect, new errors are introduced, which then make performance testing more difficult or impossible.
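The masking effect in this example can be worked through with a few lines of arithmetic. The sketch below assumes that both percentages refer to response time and that the effects multiply; the 1000 ms baseline is an invented figure, and the article does not state how the two measures combine.

```java
class CombinedTuningEffect {

    // Applies relative changes to a baseline response time one after another.
    // A change of -0.20 means "20 percent faster", +0.50 means "50 percent slower".
    static double apply(double baselineMillis, double... relativeChanges) {
        double result = baselineMillis;
        for (double change : relativeChanges) {
            result *= (1.0 + change);
        }
        return result;
    }

    public static void main(String[] args) {
        double baseline = 1000.0; // assumed baseline response time in milliseconds

        // Measured separately, the first measure is clearly worth keeping.
        System.out.println("measure A alone:  " + apply(baseline, -0.20) + " ms");
        System.out.println("measure B alone:  " + apply(baseline, 0.50) + " ms");

        // Measured together, the result is worse than the baseline,
        // so the team is tempted to throw away both changes.
        System.out.println("A and B together: " + apply(baseline, -0.20, 0.50) + " ms");
    }
}
```

Under these assumptions the combined run is slower than the baseline even though one of the two changes is a real improvement, which is why the measures need to be evaluated one at a time.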

To resolve this anti-pattern, a central performance team should be set up that records and evaluates all measurements and measures and implements the most successful measure in each round. Individual developers are brought in selectively to support it. The tuning is iterative and ends when the performance goals have been achieved. (Having no performance goals is also an organizational performance anti-pattern that is frequently encountered.) Appropriate performance tuning tools are required to evaluate the tuning measures (see box).

The “shot in the dark” anti-pattern describes teams that do not have the appropriate tools and thus try to search for and evaluate performance problems manually. In practice, this means that code inspections or walkthroughs [11] are used to find errors - this is time-consuming and succeeds only with a lot of luck. The production load, the infrastructure and external systems can be taken into account only partially or not at all. So-called microbenchmarks also consider only sections of an application and can lead to incorrect conclusions, as Brian Goetz describes in his articles [12]. In many projects, poking around in the dark leads to “finger-pointing” between operations and development (keyword: everything is fast in Eclipse and slow in production), to stress throughout the development team, and ultimately to management losing trust in the team.

Solving this anti-pattern is simple, but requires some investment in tools and staff training. "Measure, don’t guess" is the only way to successfully find and fix performance and stability problems - this requires monitoring, diagnosis and profiling tools that provide all the necessary data from the systems involved. In addition, a load test tool should be available in order to be able to simulate the real load in everyday operation and to make production problems reproducible. A corresponding production-related test environment must therefore also be available. In addition to the tools, a performance management process must be established from development to operation and employees must be trained accordingly in how to use the tools and technologies.

The two organizational performance anti-patterns described reinforce each other when they occur in combination, i.e. "parallel screwing" on the black box. In most troubleshooting assignments, an investment in sensible processes, tools and training would have paid for itself quickly. In addition to these organizational anti-patterns, there are others, such as the “test data trap”, in which measurements are made with unrepresentative data and use cases and thus produce misleadingly positive performance statements. Or “wrong timing”, which concerns the point in the development process at which performance is considered. Carrying out load tests at the end of development is usually too late - just as it is unfavorable to make decisions at the beginning without being able to collect performance data.

Architecture performance anti-patterns

There are many performance anti-patterns in architecture - this section describes those that are found most often in projects. The “Multi Layering” anti-pattern describes an architecture that tries to achieve a high level of abstraction through as many independent, logical application layers as possible. As a developer, you can quickly recognize such an architecture from the fact that a large part of the time is lost in mapping and converting data, and that even simple access from the user interface to the database is complex.

Such architectures usually arise because the application should be kept as flexible as possible, so that e.g. GUIs can be exchanged easily and quickly and the dependencies on other systems and components can be kept low. The decoupling of the layers leads to a loss of performance when mapping and exchanging data - especially when the layers are also physically separated and data is exchanged using remoting technologies such as SOAP or RMI-IIOP. The many mapping and conversion operations can also lead to increased garbage collection activity, known as the "cycling object problem". As a solution to this anti-pattern, the architecture drivers should be examined carefully in order to clarify which flexibility and decoupling is necessary. New framework approaches, such as JBoss Seam [13], have taken on the problem and try to avoid the mapping of data as completely as possible.

Another architecture anti-pattern is the so-called “session cache”. The web session of an application is misused as a large data cache, which severely restricts the scalability of the application. Session sizes well over 1 MB have often been measured during tuning assignments - in most cases no team member knows the exact content of the session. Large sessions mean that the Java heap is heavily loaded and only a small number of parallel users is possible. Especially when clustering applications with session replication, the loss of performance due to serialization and data transfer is very high, depending on the technology used. Some projects resort to buying new hardware and more memory, but in the long run this is a very expensive and risky solution.
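A rough first impression of what a session would cost during replication can be obtained by serializing each attribute and counting the bytes. The sketch below is only an approximation under stated assumptions: a plain Map stands in for the HTTP session so that the example runs without a servlet container, the attribute names are invented, and standard Java serialization is used, which may differ from what a particular application server does.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

class SessionSizeEstimator {

    // Number of bytes standard Java serialization writes for one attribute value.
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException("attribute could not be serialized", e);
        }
        return buffer.size();
    }

    // Serialized size per attribute, in the iteration order of the input map.
    static Map<String, Integer> sizesByAttribute(Map<String, ? extends Serializable> attributes) {
        Map<String, Integer> sizes = new LinkedHashMap<>();
        for (Map.Entry<String, ? extends Serializable> entry : attributes.entrySet()) {
            sizes.put(entry.getKey(), serializedSize(entry.getValue()));
        }
        return sizes;
    }

    public static void main(String[] args) {
        // Stand-in for session attributes: one small value and one "cached" list.
        ArrayList<String> catalog = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            catalog.add("product-" + i);
        }
        Map<String, Serializable> session = new LinkedHashMap<>();
        session.put("userId", "u-4711");
        session.put("productCatalog", catalog);

        sizesByAttribute(session).forEach(
            (name, bytes) -> System.out.println(name + ": " + bytes + " bytes"));
    }
}
```

A per-attribute listing like this usually makes it obvious which entries are genuine session state and which are data that was merely parked in the session.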

Session caches arise because the application architecture did not clearly define which data is session-dependent and which is persistent, i.e. can be restored at any time. During development, all data is quickly stored in the session because this is a very convenient solution - and this data is often never removed from the session again. To solve this problem, a memory analysis of the session should first be performed using a heap dump from production, and the session should be cleaned of data that is not session-dependent. Caching can have a positive effect on performance if fetching the data is performance-critical - for example when accessing databases. Ideally, caching is then carried out transparently for the developer within the framework. Hibernate, for example, offers a first-level and a second-level cache to optimize access to data, but be careful: the configuration and tuning of such frameworks should be carried out by experts, otherwise you will quickly have a new performance anti-pattern.
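To illustrate the difference between a session used as a cache and a dedicated cache, here is a minimal sketch of a size-bounded, shared cache with least-recently-used eviction, built on LinkedHashMap. It is not how Hibernate implements its caches; the class name, the loader function and the size limit are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

class BoundedCache<K, V> {
    private final Map<K, V> entries;
    private final Function<K, V> loader;
    private int loads = 0;

    BoundedCache(final int maxEntries, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder = true keeps the least recently used entry at the head,
        // and removeEldestEntry evicts it once the limit is exceeded.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Returns the cached value or loads it (e.g. from the database) on a miss.
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            entries.put(key, value);
        }
        return value;
    }

    synchronized int loadCount() { return loads; }

    synchronized int size() { return entries.size(); }

    public static void main(String[] args) {
        // The loader stands in for an expensive database access.
        BoundedCache<Integer, String> cache = new BoundedCache<>(2, id -> "row-" + id);
        cache.get(1);
        cache.get(1); // served from the cache, no second load
        cache.get(2);
        cache.get(3); // exceeds the limit, the least recently used entry is evicted
        System.out.println("loads: " + cache.loadCount() + ", entries: " + cache.size());
    }
}
```

Unlike data parked in each user's session, such a cache is shared by all users and its memory consumption has a fixed upper bound.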

Implementation performance anti-patterns

There are many Java performance anti-patterns and tuning tips - the problem with these technological anti-patterns is that they depend heavily on the Java version and vendor and, above all, on the use case. A very common anti-pattern, however, is the “underestimated front end”. In web applications, the front end is often the Achilles' heel of performance. For "real" application developers, HTML and JavaScript development is often just a tiresome chore, and the front end is therefore often insufficiently optimized for performance. Even with the increasing spread of DSL, the connection is often still a bottleneck - especially with a mobile connection via UMTS or GPRS. Web applications - driven by the Web 2.0 hype - are becoming more and more complex, and their functionality increasingly resembles that of desktop applications. This convenience leads to long waiting times and higher server and network load due to many server round trips and large pages.

There is a whole range of solutions for optimizing web-based interfaces. Compressing the HTML pages with GZip significantly reduces the amount of data transferred and has been supported by all browsers since HTTP 1.1. Web servers such as Apache have appropriate modules (mod_gzip, or mod_deflate in Apache 2.x) to perform the compression without changing the application. Page sizes can also be reduced quickly by using CSS consistently and moving CSS and JavaScript sources into their own files, which the browser can then cache more effectively. If used correctly, AJAX can also significantly improve performance, because it avoids completely reloading pages, e.g. by re-transferring only the contents of lists.
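How much GZip saves depends on the content, but repetitive HTML markup typically compresses very well. The following sketch uses the JDK's GZIPOutputStream on an invented sample table to compare sizes; in a real deployment the web server module would do this, and the sample markup and row count are assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

class GzipSavings {

    // Compresses a byte array with GZip, the same format a browser accepts
    // when it sends "Accept-Encoding: gzip".
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Builds a deliberately repetitive HTML table, as generated pages often are.
    static String sampleTable(int rows) {
        StringBuilder html = new StringBuilder("<table>\n");
        for (int i = 0; i < rows; i++) {
            html.append("<tr><td class=\"name\">Item ").append(i)
                .append("</td><td class=\"price\">9.99</td></tr>\n");
        }
        return html.append("</table>\n").toString();
    }

    public static void main(String[] args) {
        byte[] raw = sampleTable(200).getBytes(StandardCharsets.UTF_8);
        byte[] compressed = gzip(raw);
        System.out.println("uncompressed bytes: " + raw.length);
        System.out.println("gzip bytes:         " + compressed.length);
    }
}
```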

Even at the analysis stage, the performance of the interfaces can be improved significantly by tailoring the content of the pages to the users' requirements. For example, if a page displays only the fields that are needed in 80 percent of cases, the average transfer volume can be reduced significantly - the omitted fields are moved to pages of their own. Many web applications contain forms with more than 30 input fields, of which only two are filled in in 90 percent of the use cases - yet all fields are always displayed and transferred, including all the lists for the selection boxes.

Another common anti-pattern is “phantom logging”, which is found in almost all projects. With phantom logging, log messages are constructed even though they are not written at the active log level. The code below is an example of the problem:

logger.debug("A log message" + param_1 + "Text" + param_2);

Although the message is not logged at the INFO level, the string is still assembled. Depending on the number and complexity of the debug and trace messages, this can lead to enormous performance losses - especially when objects have an overridden and costly toString() method. The solution is simple:

if (logger.isDebugEnabled()) {
    logger.debug("A log message" + param_1 + "Text" + param_2);
}

In this case, the log level is first queried and the log message is only generated if the DEBUG log level is active. In order to avoid performance bottlenecks during development, the frameworks used should be properly understood. Most commercial and open source solutions have sufficient documentation on the subject of performance - experts should also be consulted at regular intervals when implementing the solution. Even if the bottleneck is found within a framework during profiling, this does not mean that the problem also lies within the framework. In most cases, the problem is incorrect usage or configuration.
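Returning to phantom logging, its cost can be made visible by counting how often toString() is called. The sketch below uses java.util.logging from the JDK instead of the logging framework in the article's snippets, so that it runs without extra libraries; Level.FINE plays the role of DEBUG, and the Expensive class is a hypothetical parameter object. The Supplier-based variant is an additional option that java.util.logging has offered since Java 8.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class PhantomLoggingDemo {

    // Hypothetical parameter object that counts calls to its toString() method.
    static final class Expensive {
        int toStringCalls = 0;

        @Override
        public String toString() {
            toStringCalls++;
            return "Expensive[...]";
        }
    }

    // Phantom logging: the message string is built even if FINE is disabled.
    static int unguarded(Logger logger, Expensive param) {
        logger.fine("A log message " + param);
        return param.toStringCalls;
    }

    // Guarded: the level is checked first, so the string is built only when needed.
    static int guarded(Logger logger, Expensive param) {
        if (logger.isLoggable(Level.FINE)) {
            logger.fine("A log message " + param);
        }
        return param.toStringCalls;
    }

    // Deferred: the Supplier is evaluated only if the level is enabled.
    static int deferred(Logger logger, Expensive param) {
        logger.fine(() -> "A log message " + param);
        return param.toStringCalls;
    }

    public static void main(String[] args) {
        Logger logger = Logger.getLogger("phantom.demo");
        logger.setLevel(Level.INFO); // FINE messages are not logged

        System.out.println("unguarded toString() calls: " + unguarded(logger, new Expensive()));
        System.out.println("guarded toString() calls:   " + guarded(logger, new Expensive()));
        System.out.println("deferred toString() calls:  " + deferred(logger, new Expensive()));
    }
}
```

With the level set to INFO, only the unguarded variant pays for the string concatenation, which is the overhead the guard removes.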

Conclusion

Performance anti-patterns exist not only in development, but above all within the project organization and architecture. The right processes and tools [14], as well as the necessary expert knowledge, are the basis for avoiding the anti-patterns described and achieving good performance and stability in your own project. In the future, the authors plan to publish the anti-patterns in the various categories on their website and to establish a community around performance and stability anti-patterns in order to avoid as many problems as possible from existing projects in the future.

Links & literature

[1] Christopher Alexander: A Pattern Language. Oxford University Press, 1977.

[2] Erich Gamma: Design Patterns. Addison-Wesley Longman, 1995.

[3] Sun Java Blueprints: java.sun.com/reference/blueprints

[4] Floyd Marinescu: EJB Design Patterns. Wiley & Sons, 2002.

[5] Gregor Hohpe: Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions. Addison-Wesley Longman, October 2003.

[6] Bruce Tate: Bitter Java. Manning, May 2002.

[7] Pet Store J2EE vs. .NET: www.onjava.com/pub/a/onjava/2001/11/28/catfight.html

[8] JPetstore: sourceforge.net/projects/ibatisjpetstore

[9] iBatis Data Mapper Framework: ibatis.apache.org

[10] Rod Johnson: J2EE Development Without EJB. Wiley & Sons, 2004.

[11] Glenford Myers: The Art of Software Testing. Wiley & Sons, 2004.

[12] Java theory and practice series: www-128.ibm.com/developerworks/views/java/libraryview.jsp?search_by=practice:

[13] JBoss Seam: labs.jboss.com/jbossseam

[14] Niklas Schlimm: Performance analysis and optimization in software development. Informatik-Spektrum, volume 30, 04/2007.
