Why can't I write good code

Part 1: Clean Code - Rules for good, clean code

Writing clean and easy-to-understand code is the ultimate goal of a good IT project. Much depends on:

  • Maintainability
  • Training period for other programmers, you quickly understand what individual functions do
  • Robustness with changes
  • Testability, everything collapses, with small changes, stable updates can be provided quickly
  • Popularity with other programmers, e.g. in open source projects, a negative example is XT-Commerce

The highly recommended standard work on the subject is "Clean Code - Refactoring, patterns, testing, and clean code techniques" of Robert C. Martin. Chapters 1 through 3 are covered in this article.

Meaningful names

The name of a variable, function, or class should immediately explain why it exists, what it does, and how it's used. When a variable needs a comment, it does not express its purpose.

E.g:

int d // number of days past is better: int daysSinceCreation;

Use pronounceable names

No constructs with unclear abbreviations how: int daSiCre instead of daysSinceCreation.

Use searchable names

Modern IDEs make searching easy, but there's no point in searching for the letter e of a run variable and being inundated with results.

Variable names with a letter are only to be used as local variables in short methods.

No member prefixes for class variables

Prefices such as m_ for class variables are unnecessary because the classes and functions should be as small as possible and modern IDEs highlight the variables in color.

So don't write $ m_variable or $ _variable, but $ variable.

Interfaces and implementations

An interface class should not contain a mention of the name in the identifier, such as IShape, but simply Shape be called. The implementation of the interface should be identified by, at the end of the name, e.g. ShapeImp.

Loop counter variables

Loop counter Variables such as i, j or k are permitted, but never l (“small L”) because they are difficult to distinguish from 1. The scope of such variables must be very small.

Class names

Should consist of a noun or a noun phrase.

Method names

Method names should begin with a verb or with a verb. The JavaBean standard specifies, for example, the following prefixes: get, set, is.

When constructors are overloaded, static factory methods should be used to describe the arguments:

// bad: GeoPoint geoPoint = new GeoPoint (2,3.5); // good: GeoPoint geoPoint = new GeoPoint.fromLatLong (2,3.5);

One word per concept

For example, it's confusing fetch, retrieve and get to use as names for equivalent methods of different classes.

Same goes for controller, manager, driver: A consistent lexicon should always be used to provide clarity.

Use the name of the solution domain

The readers of your code will also be computer scientists, which is why technical terms from computer science or mathematics should always be used and not customer terms (e.g. business administration terms).

A term from the problem domain can only be used if there is no suitable expression, because then the customer may be able to help.

No unnecessary information

Some programmers put certain prefixes in front of the class name for the classes, such as the Zend_Framework (Zend_). In a programming language like PHP, where name spaces were only introduced with PHP 5.3, that's ok, otherwise the information is redundant.

Shorter names are generally better as long as you are clear. Do not add more context than necessary.

Functions

  • Features should be very small, barely longer than 20 lines. All functions that are longer can be refactored to smaller functions.
  • Indentation depth not more than 2 levels, increases readability immensely
  • they should be doing a task
  • one abstraction level per function, the generation of HTML and the configuration of tools, for example, are far apart
  • Code should be read from top to bottom; there should be one from the next abstraction level after each function.

SWITCH instructions

Switch instructions tend to do multiple jobs, which is bad. If they cannot be avoided then at least they should be buried deep in classes and never repeated. The solution is Polymorphism and the Abstract factory pattern.

E.g.

public Money calculatePay (Employee e) throws InvalidEmployeeType {switch (e.type) {case FREELANCER: return calculatePayFreelancer (e); case SALARIED: return calculatePaySalary (e); case COMISSIONED: return calculatePayComissioned (e); default: throw new InvalidEmployeeType (e.type); }}

There are several problems with this example:

  1. The function is too long and gets longer with each new type of employee.
  2. It fulfills several tasks: error handling and salary calculation
  3. It has to be changed when a new type is added
  4. There are probably several other functions with the same switch conditions, such as: getPayPerHour (Emplayee e)

The solution is to hide the switch statement behind an abstract factory and let the employee interface implement the functions polymorphically.

public abstract class Employee {public abstract Money caluclatePay (); public abstract Money getPayPerHour (); } ----- public interface EmployeeFactory {public Employee makeEmployee (EmployeeRecord r) throws InvalidEmployeeType; } ---- public class EmployeeFacotryImpl implements EmployeeFactory {public Employee makeEmployee (Employee e) throws InvalidEmployeeType {switch (r.type) {case FREELANCER: return new FreelancerEmployee (r); case SALARIED: return SalariedEmployee (r); case COMISSIONED: return CommissionedEmployee (r); default: throw new InvalidEmployeeType (r.type); }}}

Number of function arguments

At best, no argument and a maximum of 2 to 3 arguments should be used. The less, the easier it is to understand the functions.

The advantage is that the functions are easy to test because increasing the number of arguments potentially increases the number of combinations.

Output arguments are more difficult to understand than return values:

// bad: use of output arguments generateHTML ("
lalala
", $ html); // better: use return values ​​$ html = generateHTML ("
lalala
");

Functions with one input parameter (monadic functions)

  1. Ask a question about the argument: boolean is_int ($ var)
  2. Manipulate the argument, e.g. transform: InputStream fileOpen (String filename)

Flag arguments

Are absolutely forbidden and indicate that more than one task is being processed, e.g. render (true), here you have to look up what the parameter stands for. The function should be broken down into 2 functions, e.g. renderHighQuality () and renderLowQuality ().

Functions with two input parameters (dyadic functions)

Difficult to understand and the order of parameters can lead to problems, e.g. assertEquals (expected, actual). It makes sense for functions where the parameters logically belong together, e.g. new point (1,1).

Many dyadic functions can be converted into monadic functions, e.g. by introducing class variables. The same is true for triadic functions.

Functions with more than 2 input parameters (polyadic functions)

It is likely that the arguments should be wrapped in separate classes:

drawCycle (double x, double y, double radius); // better: own class, more understandable naming drawCycle (Point center, double radius);

Avoid side effects

When functions promise to fulfill a task, but also perform a hidden function, this often leads to bugs. This can be, for example, changing class variables / global variables. This leads to errors that are difficult to find, which depend on the time the function is called and which are difficult to reproduce.

function checkPassword ($ userName, $ password) {if (isAuthorized ($ userName, $ password)) {// leads to side effects session.start (); return true; } else {return false; }}

Separate instructions and queries

Functions should either do something or answer something, neither of which is allowed.

boolean setString ($ value)

is confusing because

if (setString ($ value))

This can mean that it is checked whether the value has already been set or whether the value has been set successfully.

Exception vs. error codes

Returning error codes as arguments provokes a lot of if loops and makes the code ugly. Using exceptions can separate error handling from logic. The try-catch blocks should always be separated from the logic and separated into individual functions:

public void setString (String value) {try {setStringValue (); } catch (exception e) {logError (e); }} private void setStringValue (String value) throws Exception {this.string = value; } private void logError (exception e) {logger.log (e.getMessage ()); }

DRY - Don't Repeat Yourself

Repetitions inflate the code and require additional work when changes are made.

Structured programming

According to Edsger Dijkstra, every function should have an output and an input: It is forbidden to use multiple return statements as well as break, continue and goto.

This only applies to long functions; to short functions, multiple returns, break and continue are absolutely permissible and useful because they make the code more expressive.

GOTO really should never be used.

How do you program correctly in order to be able to write such functions?

When writing code, it is perfectly natural that the first draft is awkward, too long, and unstructured. This draft should be modified and improved, accompanied by unit tests, until one is satisfied with the result and all criteria are met.

continue to: Part 2: Clean Code - correct and insane comments

Continue reading?