Guidelines To Reduce Coupling And Increase Cohesion

Orthogonality improves a software system’s quality attributes, and low coupling and high cohesion are cornerstones for achieving orthogonality. Therefore, being aware of the most important guidelines to reduce coupling and increase cohesion is crucial for one’s work as a software developer.

The previous blog post talked about the importance of building loosely coupled classes or modules having high cohesion as a means to achieve orthogonality and hence improve the overall quality attributes of the resulting software system. In the following sections, we’re going to build on that knowledge by introducing guidelines for how to achieve low coupling and high cohesion. As always, you can find the full source code for the given examples in this GitHub repository.

Avoid Inappropriate Intimacy

Guiding question: Does this class or module make use of too many internals of another class or module, either by invoking a lot of its methods or by changing state in its fields?

Whenever a class or module knows too much about the internals of another class or module by using many of its methods and fields, we call it inappropriate intimacy. Although invoking many methods or using many fields of another class or module is not by definition evil, it’s at least a smell that coupling and cohesion may not be quite optimal yet, and that the affected classes or modules may therefore not yet be orthogonal.

The last blog post illustrated how high coupling decreases the quality attributes of the affected classes or modules by means of a trivial example that involved a Customer1 and an InvoiceReader1 class. Their separation of concerns was terrible – the Customer1 class ended up doing the invoice initialization and therefore got very tightly coupled to the strings returned by InvoiceReader1. In a first attempt to untangle this mess, we might be tempted to move the lines for parsing individual string elements to the invoice reading class, so we’d arrive at something like the following in the customer class (the invoice reader it invokes is called InvoiceReader3 here because InvoiceReader2 already appeared in the last blog post’s improved example):

    //Truncated
    public void initInvoices() {

        var invoiceReader = new InvoiceReader3("invoices/%d.csv".formatted(id));
        List<String> dates = invoiceReader.parseDates();
        List<String> subjects = invoiceReader.parseSubjects();
        List<Double> amounts = invoiceReader.parseAmounts();
        for (int i = 0; i < dates.size(); i++) {
            invoices.add(new Invoice(dates.get(i), subjects.get(i), amounts.get(i)));
        }

    }
    //Truncated

Since InvoiceReader3 now contains the knowledge of how to turn a raw string into more usable pieces of information, this first attempt is a step in the right direction, but we still couldn’t reasonably answer the guiding question above with a loud and clear No. Customer3 using so many of InvoiceReader3’s methods is a pretty clear indication of inappropriate intimacy.

In such cases, a good first treatment is to move each method or field invocation from the calling to the called class or module, one at a time, and then aggregate the result in the latter (maybe within a new method). Once all invocations have been moved, ask yourself: “Should this result still be the responsibility of this class or module?” If the answer is Yes, then – congratulations! – you only need to return the aggregated result to the caller.

For example, if you moved the parseDates(), parseSubjects(), and parseAmounts() invocations shown above to a new method in InvoiceReader3, the aggregated result would be all information necessary to create a new Invoice object. Putting the responsibility to create those objects into a class called InvoiceReader feels pretty reasonable, and you might even go a step further and have the new method assemble all invoice objects related to a particular customer and return that list to the caller (which is exactly the solution shown in the previous blog post).
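As a rough sketch of where such a refactoring might end up (this is not the code from the repository): the hypothetical InvoiceReaderSketch below takes already-loaded CSV lines instead of a file path, assumes a date,subject,amount column layout, and uses a simplified Invoice record. Callers see a single readInvoices() method, while the per-column parsing stays private to the reader.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the Invoice class used in the examples above.
record Invoice(String date, String subject, double amount) {}

// Hypothetical reader; the class name and the "date,subject,amount" layout are assumptions.
final class InvoiceReaderSketch {

    private final List<String> lines;

    InvoiceReaderSketch(List<String> lines) {
        this.lines = List.copyOf(lines);
    }

    // The only method callers need: parsing and aggregation both happen in here.
    List<Invoice> readInvoices() {
        List<Invoice> invoices = new ArrayList<>();
        for (String line : lines) {
            String[] columns = line.split(",");
            invoices.add(new Invoice(parseDate(columns), parseSubject(columns), parseAmount(columns)));
        }
        return invoices;
    }

    // Parsing details are private, so no caller can become intimate with them.
    private String parseDate(String[] columns) { return columns[0].trim(); }

    private String parseSubject(String[] columns) { return columns[1].trim(); }

    private double parseAmount(String[] columns) { return Double.parseDouble(columns[2].trim()); }
}
```

With something like this in place, the customer class would only need a single call to readInvoices() and would no longer depend on how dates, subjects, and amounts are laid out in the file.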

Avoiding inappropriate intimacy both decreases coupling and increases cohesion because it forces you to think about responsibility. Even if the larger picture is not visible yet and reasoning about responsibility is still difficult, moving a couple of method calls and result aggregations from the calling class or module to the called class or module is a good approach to quickly find out if the latter is actually a better place for them. If some method calls still don’t feel as if they should be in a particular place, keep moving them around until you feel comfortable with the solution.

Tell, Don’t Ask

Guiding question: Does this class or module query the internal state of another class or module, alter that state or use it to make decisions, and then pass back the state?

From the caller’s perspective, this guideline is about achieving a certain goal using another class or module without knowing how that class or module achieves that goal or which state transitions are involved in doing so. From the called class’ or module’s perspective, this guideline is about encapsulating not only state, but also the methods to act on it. Generally, you’ll want your classes or modules to expose as little direct write access to internal state as possible to callers and give them only a well-defined API (a set of public methods they can invoke) to trigger state changes.

To illustrate this, let’s expand the Customer-and-Invoice example with an Account class and some functionality to pay invoices using the account’s balance. A first attempt to implement this might look like the following:

    // Bad!
    var customer4 = new Customer4(401, "Tylor", new Account1(965.75));
    List<Invoice> newInvoices = new ArrayList<>();
    var accountBalance = customer4.getAccount().getBalance();
    for (Invoice invoice : customer4.getInvoices()) {
        if(accountBalance > invoice.getAmount()) {
            accountBalance -= invoice.getAmount();
        } else {
            System.out.println("Insufficient funds, cannot pay invoice: %s".formatted(invoice));
            newInvoices.add(invoice);
        }
    }
    customer4.setInvoices(newInvoices);
    customer4.getAccount().setBalance(accountBalance);

This code queries all state it needs (the current account balance as well as all invoices to be paid) to achieve a certain goal within the business domain (pay all the invoices if the account balance permits it) by making decisions about that state and eventually passing back some new state. Here, the business rules have been implemented outside the entities they affect, and those entities have been degraded to mere state containers.

Violating Tell, Don’t Ask is problematic because it introduces a significant risk of knowledge – and, hence, code – duplication: Since the business rules that act on entities are not stored with those entities themselves, it’s not immediately obvious where they have been implemented (if implemented at all), so different developers might implement the logic multiple times. This is undesirable since duplication of code and knowledge is always a form of coupling, and we want to reduce coupling as far as possible. Another way of phrasing this is that violating Tell, Don’t Ask incurs the risk of also violating the DRY principle (which we’ll look at later in this blog post).

To implement Tell, Don’t Ask, the functionality above could be refactored like so:

public class Customer5 {

    private final long id;
    private final String name;
    private final Account2 account2;
    private final List<Invoice> invoices;

    public Customer5(long id, String name, Account2 account2) { /* ... */ }

    // No getters for account and invoices!
    public long getId() { return id; }

    public String getName() { return name; }

    public void payInvoices() {

        invoices.removeIf(invoice -> account2.deduct(invoice.getAmount()));
        printInvoicesSummary();

    }

    // Truncated

    private void printInvoicesSummary() { /* ... */ }

}

This new customer class still exposes everything necessary to achieve the business goal of paying invoices, but it does so only via a public method that callers can invoke. On the other hand, the state that must be acted upon to achieve this goal is hidden from the outside world. The same pattern was applied to the account class:

public class Account2 {

    private double balance;

    public Account2(double balance) {
        this.balance = balance;
    }

    public double getBalance() {
        return balance;
    }

    public boolean deduct(double amount) {
        if(balance > amount) {
            balance -= amount;
            return true;
        }
        return false;
    }
}

Importantly, none of those classes exposes a setter anymore – state is either final or can be altered only by the class itself. With this in place, a caller can simply invoke the customer’s payInvoices() method:

    // Better:
    var customer5 = new Customer5(501, "Mike", new Account2(185.94));
    customer5.payInvoices();

The aforementioned guiding question can now be answered with No, and Tell, Don’t Ask is adhered to. It’s now very unlikely the functionality to pay invoices would ever be implemented twice as it now resides with the affected entities themselves. This reduced risk of duplication translates to reduced coupling risk.

The Law Of Demeter

Guiding question: Does this class or module traverse abstraction boundaries by chaining method calls?

(The Law of Demeter isn’t really a law, it’s more like a rule of thumb. But because the Internet and the literature refer to it as the Law of Demeter and not the Pretty solid advice of Demeter, we’ll follow that convention and still call it a law.)

The Law of Demeter was originally created in 1987 to help developers of the Demeter project achieve better decoupling in their modules. It’s a set of guidelines that can be summarized as only talk to your immediate friends. To illustrate this, let’s imagine we extended our previous customer-invoice-account example such that each account is linked to a policy in the form of a Policy class, and that the Policy class defines an overdraft limit. A piece of code interested in this overdraft limit for a particular customer could do the following:

    // Violates Law of Demeter
    someCustomer.getAccount().getPolicy().getOverDraftLimit();
    // Equally bad
    var account = someCustomer.getAccount();
    var policy = account.getPolicy();
    var overDraftLimit = policy.getOverDraftLimit();

Note that the second section is equally bad – abstraction boundaries are still violated, the code only stretches this out across multiple lines of code (in fact, this is even worse since the boundary violation is now a lot less obvious).

An exception to the Law of Demeter is an API that you know is very, very unlikely to change. A good example of this is the Stream API introduced to the Java language in version 8:

    double sum = someCustomersInvoices.stream()
        .filter(invoice -> invoice.getDate().startsWith("2020-08"))
        .mapToDouble(Invoice::getAmount)
        .sum();

This is a method chain, too, but it only consists of calls to the Stream API, and it’s highly unlikely this API will ever change so fundamentally that the code above would stop working.

A violation of the Law of Demeter is often a smell of poor design in the involved abstraction layers, so once you find yourself answering the aforementioned guiding question with Yes, the question to ask next is: “Should this class or module really be responsible for X?” In the example above illustrating a typical Law of Demeter violation, the answer would be No – because the caller is so “far away” (in terms of class boundaries) from the state it desires, it probably shouldn’t be responsible for working with that state in the first place. A better option might be to put code working with accounts’ overdraft limits a bit closer to either the Account or the Policy class.
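One possible shape for this, sketched under the assumption that callers legitimately need a customer’s overdraft limit: each class delegates only to its direct collaborator. The class names and the getOverdraftLimit() methods below are hypothetical and not taken from the repository.

```java
// Hypothetical classes; only the parts relevant to the overdraft limit are shown.
final class PolicySketch {

    private final double overdraftLimit;

    PolicySketch(double overdraftLimit) { this.overdraftLimit = overdraftLimit; }

    double getOverdraftLimit() { return overdraftLimit; }
}

final class AccountSketch {

    private final PolicySketch policy;

    AccountSketch(PolicySketch policy) { this.policy = policy; }

    // Talks only to its immediate friend, the policy.
    double getOverdraftLimit() { return policy.getOverdraftLimit(); }
}

final class CustomerSketch {

    private final AccountSketch account;

    CustomerSketch(AccountSketch account) { this.account = account; }

    // Talks only to its immediate friend, the account.
    double getOverdraftLimit() { return account.getOverdraftLimit(); }
}
```

A caller would then write someCustomer.getOverdraftLimit() and no longer needs to know that accounts and policies exist. If the limit is only needed to decide whether a payment may go through, moving that decision into the account itself would probably remove the need for the getter altogether.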

If all code in a software system adheres to the Law of Demeter, it’s very likely all code in that system is in exactly the classes or modules where it should be because of the responsibility it fulfils. Thus, the Law of Demeter encourages good, modular design that leads to higher cohesion and eliminates the coupling introduced by method call chains.

Don’t Repeat Yourself: DRY

Guiding question: Is there any piece of knowledge that does not have a single representation within the software system?

The DRY principle (sometimes also referred to as Say It Once, And Only Once) states that each piece of knowledge within a system must have a single, unambiguous, authoritative representation. We’ve already encountered this principle above: A violation of Tell, Don’t Ask increases the risk of knowledge duplication, and this is a violation of DRY.

Adhering to the DRY principle is important because each duplication introduces coupling: If you have multiple representations of a piece of knowledge and that knowledge changes, you’ll have to update it in all places, and it’s almost guaranteed you’ll someday forget to update one of them. On the other hand, if classes or modules conform to DRY, updating knowledge of any kind is very simple because the change will always be restricted to only one single spot.
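As a small illustration, take the file name pattern invoices/%d.csv from the first example. If several classes built that path themselves, a change to the directory layout would have to be repeated in each of them. The sketch below gives that piece of knowledge one home; the InvoicePaths helper and its forCustomer() method are hypothetical names introduced only for this example.

```java
// Hypothetical helper: the single authoritative place for the invoice file layout.
final class InvoicePaths {

    private static final String PATTERN = "invoices/%d.csv";

    private InvoicePaths() {} // utility class, not meant to be instantiated

    // Callers ask here instead of formatting their own copy of the path.
    static String forCustomer(long customerId) {
        return PATTERN.formatted(customerId);
    }
}
```

If the files later move to another directory, only PATTERN has to change.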

It’s obvious how DRY applies to source code (implement any piece of functionality only once), but DRY applies to more areas than code. For example, consider the following:

    /**
     * Calculates the sum of all invoices.
     * 
     * This method iterates over the list of given invoices, retrieves the invoice 
     * amount on each, and adds the amount to a variable. After iterating over  
     * all invoices, this variable holds the sum of all invoice amounts, and this 
     * sum is then returned.
     * 
     * @param invoices The list of all invoices.
     * @return The total sum of amounts across all invoices.
     */
    public double calculateInvoicesSum(List<Invoice> invoices) {
        
        double sum = 0.0;
        for (Invoice invoice: invoices) {
            sum += invoice.getAmount();
        }
        
        return sum;
        
    }

The duplication sits in the combination of code and documentation – both describe how exactly the calculation works. It’s easy to imagine how those two descriptions can get out of sync – all it takes is someone replacing the clunky code with some elegant stream operations who then forgets to change the documentation accordingly. (In fact, you could even go a step further and delete the documentation altogether – the method’s name describes what it does and the code describes how it’s done, so there’s really no benefit keeping it around.)
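A sketch of what the method might look like after such a cleanup, here with a one-line summary kept rather than deleting the comment entirely. The wrapping InvoiceMath class, the static modifier, and the minimal Invoice class are stand-ins added only so that the snippet compiles on its own.

```java
import java.util.List;

// Minimal stand-in for the Invoice class used throughout this post.
final class Invoice {

    private final double amount;

    Invoice(double amount) { this.amount = amount; }

    double getAmount() { return amount; }
}

// Hypothetical holder class for the method.
final class InvoiceMath {

    /** Returns the total of all invoice amounts, or 0.0 for an empty list. */
    static double calculateInvoicesSum(List<Invoice> invoices) {
        return invoices.stream()
            .mapToDouble(Invoice::getAmount)
            .sum();
    }
}
```

The comment now only states what the signature cannot express (the result for an empty list), so changing the implementation again would not make it wrong.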

Other frequent sources for duplication are more subtle and harder to spot. Building on the previous example, consider the following:

// Not contained in source code repository, so just call it 'Customer'
public class Customer {
    
    private final List<Invoice> invoices;
    private double invoicesSum;

    public Customer() { /* ... */ }

    public double getInvoicesSum() {
        return invoicesSum;
    }

}

The invoicesSum field is duplication of the knowledge about a customer’s current total sum of invoices because that sum can be calculated anytime by means of the invoices list (though one might argue this violation of DRY is acceptable for performance reasons when customers can have very large numbers of invoices). Therefore, we could simply drop the field and beef up the method a little:

public class Customer {
    
    private final List<Invoice> invoices;

    public Customer() { /* ... */ }
    
    public double getInvoicesSum() {
        return invoices.stream()
            .mapToDouble(Invoice::getAmount)
            .sum();
    }

}

It’s important to note here that callers of this method don’t know whether the value they receive was retrieved from memory or calculated. Generally, all methods offered by a class or module should be named according to a uniform convention that does not reveal whether they are implemented through storage or through computation.
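To illustrate why this matters, here is a sketch of the performance-motivated variant mentioned earlier. The hypothetical CachingCustomer stores the sum, but it keeps the field private and updates it in the only place where the invoice list changes, so callers see the same getInvoicesSum() signature as in the computed version. The addInvoice() method and the minimal Invoice class are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for the Invoice class used throughout this post.
final class Invoice {

    private final double amount;

    Invoice(double amount) { this.amount = amount; }

    double getAmount() { return amount; }
}

// Hypothetical variant that trades a little duplication for faster reads.
final class CachingCustomer {

    private final List<Invoice> invoices = new ArrayList<>();
    private double invoicesSum; // cached copy of knowledge already held by 'invoices'

    // Single place where the list changes, so list and cache are updated together.
    void addInvoice(Invoice invoice) {
        invoices.add(invoice);
        invoicesSum += invoice.getAmount();
    }

    // Same name and signature as the computed version; callers cannot tell the difference.
    double getInvoicesSum() {
        return invoicesSum;
    }
}
```

Because the method name does not reveal the implementation, switching between the stored and the computed variant would not require any change on the caller’s side.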

Wrap-Up

In the last blog post, we saw that building orthogonal classes or modules is beneficial for the overall quality attributes of the resulting software system and its code base, and we learnt about coupling and cohesion as the two cornerstones of achieving orthogonality. The previous sections then introduced four guidelines or principles that help us reduce coupling and increase cohesion:

  • Avoid Inappropriate Intimacy: Moves field access or method invocations to the class or module that should be responsible for the result. This eliminates unnecessary coupling between classes or modules and increases their cohesion.
  • Tell, Don’t Ask: Encapsulates the methods for acting on entity state with those entities, thus reducing the risk of implementing the corresponding functionality multiple times and therefore incurring coupling by knowledge duplication.
  • Law of Demeter: Eliminates cross-class/cross-module coupling by removing chained method calls that would otherwise traverse the boundaries of multiple connected classes or modules.
  • DRY: Eliminates coupling incurred by knowledge duplication.

Those four guidelines are essential tools in each developer’s toolbox because they help us reduce coupling and increase cohesion and thus, ultimately, lead to more orthogonal classes or modules. Therefore, applying the four guidelines shown here will have an immediate, positive impact on the quality attributes of the overall software system.