Blog from October, 2009

I've done a lot of thinking about this and I believe I made a mistake in the final ST 3.2.1 version before my current rebuild (v4). It's too confusing, and makes the code too complex, to distinguish between missing and present but null. There is also a long history with ST suggesting that it works okay to treat a missing attribute and a null attribute as the same thing (i.e., not there). We have the null option that lets us say what to replace null with. There are a few corner cases that I've cleaned up, but ST ain't broke so I don't think I will "fix" this part. This has the advantage of being backward compatible, which will be important when I move on to rebuilding the front end of ANTLR v3 so that it does not use ANTLR v2.

Here are two of my unit tests. The difference you will notice is that adding null (setAttribute in v3) stores a null-valued attribute in the attributes table. This is consistent with what happens when I pass an array of size 1 with a null element.

    @Test public void testNullValueAndNullOption() throws Exception {
        STGroup group = new STGroup();
        group.defineTemplate("test", "<name; null=\"n/a\">");
        ST st = group.getInstanceOf("test");
        st.add("name", null);
        String expected = "n/a";
        String result = st.render();
        assertEquals(expected, result);
    }

Here it uses the null option because name is missing (same as a null-valued attribute).

    @Test public void testMissingValueAndNullOption() throws Exception {
        STGroup group = new STGroup();
        group.defineTemplate("test", "<name; null=\"n/a\">");
        ST st = group.getInstanceOf("test");
        String expected = "n/a";
        String result = st.render();
        assertEquals(expected, result);
    }

The use case I had for "missing" being important was, I now realize, a flaw in my thinking. Filtering should be done in the model and not the view, so my use case was an abuse case (smile)

When I yanked out the code that handled all of this missing versus null crap, it became much simpler, easier to understand, faster, etc.
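To make the simplified rule concrete, here is a minimal sketch of it, assuming a plain map as a stand-in for the attributes table. The class MergedNullDemo and its render method are hypothetical names for illustration and are not part of ST.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an attributes table; not the real ST classes. */
final class MergedNullDemo {
    /** Mimics <name; null="..."> when missing and null are treated alike. */
    static String render(Map<String, Object> attributes, String name, String nullOption) {
        // Map.get returns null both for a missing key and for a key mapped to null,
        // so the two cases collapse into one without any extra code.
        Object value = attributes.get(name);
        if (value == null) {
            return nullOption != null ? nullOption : "";
        }
        return value.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> attrs = new HashMap<>();
        System.out.println(render(attrs, "name", "n/a")); // missing: prints n/a
        attrs.put("name", null);
        System.out.println(render(attrs, "name", "n/a")); // present but null: prints n/a
        attrs.put("name", "Ter");
        System.out.println(render(attrs, "name", "n/a")); // prints Ter
    }
}
```

A single lookup covers both cases, which is roughly why dropping the distinction removes code rather than adding it.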

Part 1: Null-valued attributes

Let's consider values inside arrays. If names={"Tom", null, null, "Ter"}, what should we get here:

<names>

or here

<names; separator=", ">

My preference would be: TomTer and Tom, Ter. That is what v3 does now. We recently introduced the null option so we can say:

<names; null="foo">

to get "foo" instead of a missing element when names[i] is null.
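The three outputs above can be reproduced with a small sketch, assuming null elements are skipped unless a null option is given. NullElementDemo and render are hypothetical names, not ST's API, and the real implementation may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of rendering a multi-valued attribute; not ST's code. */
final class NullElementDemo {
    /** Mimics <names; separator="...", null="...">; either option may be null. */
    static String render(List<String> values, String separator, String nullOption) {
        List<String> pieces = new ArrayList<>();
        for (String v : values) {
            if (v != null) {
                pieces.add(v);
            } else if (nullOption != null) {
                pieces.add(nullOption); // replace the null element
            }
            // otherwise the null element is skipped, separator included
        }
        return String.join(separator == null ? "" : separator, pieces);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Tom", null, null, "Ter");
        System.out.println(render(names, null, null));  // TomTer
        System.out.println(render(names, ", ", null));  // Tom, Ter
        System.out.println(render(names, null, "foo")); // TomfoofooTer
    }
}
```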

HOWEVER, you cannot set an attribute to null. So, if instead of passing the list, we set them individually, we get a different answer.

    st.add("names", "Tom");
    st.add("names", null);  // do nothing
    st.add("names", null);  // do nothing
    st.add("names", "Ter");

Previously, ST would receive the list {"Tom", "Ter"}. All null values are ignored by add (actually called setAttribute in v3). The output would be "TomTer" even with the null option. Oops.

I'm proposing that we allow null-valued attributes in v4 to normalize the handling of single and multivalued attributes. In other words, null and a list of one element with null in it should be the same.
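The difference between the old behavior and the proposal can be sketched as follows. AddNullDemo, addAll and the keepNulls flag are hypothetical, introduced only to contrast the two policies. ST has no such flag.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch contrasting the two add policies; not ST's code. */
final class AddNullDemo {
    /**
     * Simulates repeated add("names", v) calls.
     * keepNulls == false models the old behavior (null values are ignored);
     * keepNulls == true models the proposal (null values are stored).
     */
    static List<String> addAll(boolean keepNulls, String... values) {
        List<String> stored = new ArrayList<>();
        for (String v : values) {
            if (v == null && !keepNulls) {
                continue; // old behavior: add(name, null) does nothing
            }
            stored.add(v);
        }
        return stored;
    }

    public static void main(String[] args) {
        System.out.println(addAll(false, "Tom", null, null, "Ter")); // [Tom, Ter]
        System.out.println(addAll(true, "Tom", null, null, "Ter"));  // [Tom, null, null, Ter]
    }
}
```

With nulls kept, the stored list matches what passing the array directly would give, so a null option has something to apply to in both cases.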

Part 2. Missing versus null versus non-null

In v4, I want to clearly identify the exact meanings of missing versus null versus non-null. Consider what this means:

<name>

There are three situations:

  1. name doesn't exist as an attribute
  2. name exists but has no value (it's null)
  3. name exists and has a value

Similarly, what about properties (using getProp or isProp or the actual field name):

<user.name>

Again, there are three situations:

  1. name doesn't exist as a property of the user object
  2. name exists but has no value (it's null)
  3. name exists and has a value

Currently, <name> is no problem if it doesn't exist, but <user.name> throws an exception if name is not a valid property. The reason I did this was that it's okay to have an attribute you don't set, but accessing a nonexistent field is most likely a programming error. (I think I'm going to set up a list of flags you can set in order to throw exceptions upon certain conditions; otherwise ST will be fairly permissive.)
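The three property situations can be illustrated with a reflection-based sketch. Everything here is an assumption for illustration: the User class, the getProperty helper, the lookup order (getProp, then isProp, then a public field) and the exception type. It is not ST's actual lookup code.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

/** Hypothetical property lookup sketch; not ST's implementation. */
final class PropertyLookupDemo {
    /** Hypothetical model object. */
    public static final class User {
        public String email = null;               // exists but has no value
        private final String name;
        public User(String name) { this.name = name; }
        public String getName() { return name; }  // exists and has a value
        public boolean isAdmin() { return false; }
    }

    /** Tries getProp(), then isProp(), then a public field named prop. */
    static Object getProperty(Object o, String prop) {
        String suffix = Character.toUpperCase(prop.charAt(0)) + prop.substring(1);
        for (String prefix : new String[] {"get", "is"}) {
            try {
                Method m = o.getClass().getMethod(prefix + suffix);
                return m.invoke(o);
            } catch (NoSuchMethodException e) {
                // no such accessor; try the next lookup style
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot read property: " + prop, e);
            }
        }
        try {
            Field f = o.getClass().getField(prop);
            return f.get(o); // may legitimately be null
        } catch (NoSuchFieldException e) {
            // the property does not exist at all: most likely a programming error
            throw new IllegalArgumentException("no such property: " + prop);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException("cannot read property: " + prop, e);
        }
    }

    public static void main(String[] args) {
        User user = new User("Tom");
        System.out.println(getProperty(user, "name"));  // Tom
        System.out.println(getProperty(user, "email")); // null
        try {
            getProperty(user, "phone");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());         // no such property: phone
        }
    }
}
```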

Anyway, given that we are going to allow null-valued attributes, plain old <name> could be missing, could be null, or could have a value. Given this, what does the following yield?

<name; null="foo">

Personally, I think it should be:

  1. EMPTY if name doesn't exist as an attribute
  2. foo if name exists but has no value (it's null)
  3. name's value if name exists and has a value

So the null option literally means the attribute exists but is null (has no value). If the attribute is simply missing, the null option has no effect.
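A minimal sketch of this three-way rule, again assuming a plain map as a stand-in for the attributes table. ThreeStateDemo and render are hypothetical names and not ST's API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the missing / null / non-null rule; not ST's code. */
final class ThreeStateDemo {
    /** Mimics <name; null="..."> when missing and null are kept distinct. */
    static String render(Map<String, Object> attributes, String name, String nullOption) {
        if (!attributes.containsKey(name)) {
            return ""; // 1. missing: EMPTY, the null option has no effect
        }
        Object value = attributes.get(name);
        if (value == null) {
            return nullOption != null ? nullOption : ""; // 2. present but null
        }
        return value.toString(); // 3. present with a value
    }

    public static void main(String[] args) {
        Map<String, Object> attrs = new HashMap<>();
        System.out.println("[" + render(attrs, "name", "foo") + "]"); // []
        attrs.put("name", null);
        System.out.println("[" + render(attrs, "name", "foo") + "]"); // [foo]
        attrs.put("name", "Ter");
        System.out.println("[" + render(attrs, "name", "foo") + "]"); // [Ter]
    }
}
```

The only difference from treating missing and null alike is the containsKey check, which is the distinction this part argues for.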

This is then consistent with lists and arrays. The null option applies to all null-valued elements because they exist physically in the list; they just have no value.

Ok, I think I just convinced myself that we'll allow null-valued attributes and that we will treat them differently than missing attributes. Secondly, the null option only applies to present but null-valued attributes.