Friday 16 September 2011

Public final and data encapsulation


Dear Junior

We were discussing immutable value objects and using "public final" data-fields for their representation. In that discussion I started off mixing up immutability and encapsulation. Now that we have covered immutability, let us return to encapsulation.

Obviously, declaring a field as public will break data encapsulation. You lose your freedom to change data representation without having to bother the clients.

Let us have a look at a name-class with some actual usage.

public class Name {
    public final String fullname;

    public Name(String fullname) {
        if(!fullname.matches("[a-zA-Z\\ ]+"))
            throw new IllegalArgumentException();
        this.fullname = fullname;
    }

    public String[] names() {
        return fullname.split(" ");
    }
}

The public API of this class consists of three parts: the construction of a name from a string, the attribute “fullname” (accessible through property data field), and the attribute “names (accessible through accessor method).

An example of what client code looks like we can find in the tests.

public class NameTest {

    private final String danbjson = "Dan Bergh Johnsson";

    @Test
    public void shouldHaveFullNameAsAttribute() {
      Assert.assertEquals(danbjson, new Name(danbjson).fullname);
    }

    @Test(expected = IllegalArgumentException.class)
    public void shouldNotAllowWeiredCharsInName() {
      new Name("#€%&/");
    }

    @Test
    public void shouldSplitIntoNamesAtSpaces()  {
      Assert.assertEquals(
        new String[] {"Dan", "Bergh", "Johnsson"},
        new Name(danbjson).names());
    }

}

Imaging that we want to change the internal data representation to using a char-array instead. Doing so will break all the clients as they rely on having that public data-field “fullname”. Thus, it is a bad design. Or?

I would argue that it is still a good design. Using modern tools it is easy to add the encapsulation when needed. Applying "Encapsulate Field" in e g IntelliJ yields.

public class Name {
    private final String fullname;

    public Name(String fullname) {
        if(!fullname.matches("[a-zA-Z\\ ]+"))
            throw new IllegalArgumentException();
        this.fullname = fullname;
    }

    public String[] names() {
        return fullname.split(" ");
    }

    public String fullname() {
        return fullname;
    }
}

And of course the client code has been changed accordingly.

    @Test
    public void shouldHaveFullNameAsAttribute() {
        Assert.assertEquals(danbjson, new Name(danbjson).fullname());
    }


Now, I could have chosen "getFullname" as the method name for the new method. However, I have always found that naming convention a little bit awkward, the property is “full name” and adding a boilerplate “get” does not add any value in my opinion. By the way, JavaBeans is just one naming convention in Java, the naming convention for CORBA predates JavaBeans. In the CORBA convention if you have a property “fullname” of type String, then the way to access it was to call a method “String fullname()” and the way to change the property was to call a method  “void fullname(String)”. So the convention I use is not new to Java at all.

In some languages there is no distinction in syntax between accessing a field and calling a no-arg method. Had the code been written in Eiffel or Scala, the syntax would have been the same and there would be no change to the client code at all.

Now that we have encapsulated the usage via “fullname()” we can change the internal representation to a char array.

public class Name {
    private final char[] chars;

    public Name(String name) {
        if(!name.matches("[a-zA-Z\\ ]+"))
            throw new IllegalArgumentException();
        this.chars = name.toCharArray();
    }

    public String[] names() {
        return new String(chars).split(" ");
    }

    public String fullname() {
        return new String(chars);
    }
}

Just for reference, in Scala the corresponding change would start with this "final public" representation – here represented by the keyword “val”.

class Name(val fullname: String)
{
  if(!fullname.matches("[a-zA-Z\\ ]+")) throw new IllegalArgumentException();

  def names = fullname.split(' ')
}

The change would take us to the slightly more verbose char array representation. Here we have no “val” in external class declaration, but a private val-field hidden inside the class-block instead.

class Name(name: String)
{
  if(!name.matches("[a-zA-Z\\ ]+")) throw new IllegalArgumentException();
  private val chars = name.toCharArray

  def fullname = chars.toString

  def names = fullname.split(' ')
}


In conclusion: As the public final field is accessible by the clients, it is also a part of the API for Name: This breaks data encapsulation. If you want to change data representation, you will need to create an accessor method to which you direct the client.

In Scala or Eiffel this is a no-issue, as the new accessor could transparently have the same name as the old datafield ("fullname") and be accessed with exactly the same syntax ("name.fullname"). However, in Java you have to include an empty pair of parenthesis to call the accessor method - thus the client code must be changed.

Now, I consider this a small issue, as there is excellent refactoring support in modern IDEs that eliminate the change to a fully automated four-click, one-minute exercise. Thus, there is no point in designing for that change up front.

So, even if we *do* break data encapsulation, I would say that it is not much of a problem.

Now, there are other kinds of encapsulations that are more interesting than data encapsulation, but that analysis will have to wait for some other letter.

Yours

   Dan