In the last posting, we created the macro if-ends?, which was a combination
   of let, if and the function ends?. Today, we’ll look at another similar
   macro, one that combines let, cond, and ends?.
The Purpose of cond-ends?
The problem with if-ends? is that sometimes we’ll want to look for twenty or
   more different endings on a single stemmer and handle each case differently.
   We can do that with if-ends?, but it would be a lot cleaner if we had a
   construct like cond.
Input Patterns
For example, we might want to do something like this:
(cond-ends? st (make-stemmer "names") "ing" (do (println "ends with -ing") st) "ed" (do (println "ends with -ed") st) "s" (do (println "ends with -s") st) :else (do (println "no ending") st))
First, notice that this starts out like if-ends?. It has a variable (st)
   that the new stemmer will be assigned to, followed by a stemmer instance to
   test.
These are followed by a series of ending/expression pairs. The endings are
   tested one by one until a match is found. Once it is, the expression following
   that ending is executed with the variable st available to it and assigned to
   the new stemmer, with the index changed to reflect the ending. The return
   value of that expression is the value of the entire cond-ends? expression.
Finally, if no ending matches, the cond-ends? should return the original
   stemmer unchanged. Or, optionally, the last ending can be the keyword :else.
   If no ending before that point matched, the expression following :else is
   executed and its value returned.
Before we start on the expected output for this macro, let’s look at a short expression that uses the default value if no ending matches:
(cond-ends? st (make-stemmer "names") "ing" (do (println "ends with -ing") st))
Of course, that would be better expressed with if-ends?, but using it as an
   input pattern will help us build the default ending.
Output Patterns
Here’s what the macro should construct from the input above.
;(cond-ends? st (make-stemmer "names") ; "ing" (do (println "ends with -ing") st) ; "ed" (do (println "ends with -ed") st) ; "s" (do (println "ends with -s") st) ; :else (do (println "no ending") st)) ;; Set up the environment (let [stemmer# (make-stemmer "names"), word# (subword stemmer#)] ;; -ing (let [j# (- (count word#) 3)] (if (and (pos? j#) (= (subvec word# j#) [\i \n \g])) (let [st (assoc stemmer# :index (dec j#))] (do (println "ends with -ing") st)) ;; -ed (let [j# (- (count word#) 2)] (if (and (pos? j#) (= (subvec word# j#) [\e \d])) (let [st (assoc stemmer# :index (dec j#))] (do (println "ends with -ed") st)) ;; -s (let [j# (- (count word#) 1)] (if (and (pos? j#) (= (subvec word# j#) [\s])) (let [st (assoc stemmer# :index (dec j#))] (do (println "ends with -s") st)) ;; :else (let [st stemmer#] (do (println "no ending") st)))))))))
That’s scary.
Actually, it’s not that bad. Notice that after the first let, it is a series
   of (let ... (if ...)) constructions. (I’ve added comments to separate each
   of these sections.) Each of those three let/if constructions are almost
   identical, until the :else expression. Also, notice that I’ve not only
   pre-compiled the vector itself, but also the length of the vector, and I’ve
   inserted them into the result example as their literal values. (Pre-computing
   the length of the vector was suggested by Holger Durer in the comments to the
   last posting.)
Before we start writing the macro, let’s look at the expected output for the second example above.
;(cond-ends? st (make-stemmer "names") ; "ing" (do (println "ends with -ing") st)) ;; Set up the environment (let [stemmer# (make-stemmer "names"), word# (subword stemmer#)] ;; -ing (let [j# (- (count word#) 3)] (if (and (pos? j#) (= (subvec word# j#) [\i \n \g])) (let [st (assoc stemmer# :index (dec j#))] (do (println "ends with -ing") st)) ;; default :else stemmer#)))
One of the things to notice about these output expressions is that a lot of
   the state for it can be computed once, and used throughout all the
   conditionals in the cond-ends? body. That observation will be useful later.
The Macro
Looking at the output above, we really have three tasks: set up the
   environment for the cond-ends? in the outer let; for each test generate a
   let/if construction; and finally generate an :else value, maybe using
   the default.
One way to tackle this problem would be to separate it into two macros:
- One macro sets up the environment and calls another macro to build the test expressions; 
- The second macro builds one - let/- ifpair, similar to what- if-ends?did, for one test expression, then it calls itself on the rest of the test expressions. It does this until it reaches the last test expression and that test is- :elseor until it is out of test expressions.
Let’s start with the second macro. That will allow us to see what the first macro needs to provide to the test macro, and it has to be defined first anyway, because that’s the way Clojure likes things.
Generating the Tests
The test macro will be overloaded. Either it will take one ending/expression
   pair, or it will take many. If there is only one, the ending could be an
   :else, or it could be the last ending to test for, and the :else will be a
   default. So the first, shorter version of this will return one of these two
   constructions (taken from above):
;;; With :else ;; :else (let [st stemmer#] (do (println "no ending") st)) ;;; With a test + the default :else ;; -ing (let [j# (- (count word#) 3)] (if (and (pos? j#) (= (subvec word# j#) [\i \n \g])) (let [st (assoc stemmer# :index (dec j#))] (do (println "ends with -ing") st)) ;; default :else stemmer#))
Another point to make about the test expressions is that—as an astute commenter pointed out (thanks, Holger)—there is a lot of duplication between the two versions of the macro that builds the expressions. We can capture that duplication in a function and just call that function with the difference between the two structures. The function will then create the macro’s output structure.
All right. Let’s roll up our sleeves. The helper function will be called
   make-cond-ends-test. It take the variable symbol, the stemmer variable, an
   ending string, a true expression, and a false expression.
(defn make-cond-ends-test [var stemmer word end true-expr false-expr] (let [vend (vec end)] `(let [j# (- (count ~word) ~(count vend))] (if (and (pos? j#) (= (subvec ~word j#) ~vend)) (let [~var (assoc ~stemmer :index (dec j#))] ~true-expr) ~false-expr))))
Notice that this function requires several parameters beyond the ending
   (end) and the result expressions (true-expr and false-expr)? It also has
   var, stemmer, and word.  But the stemmer and the subword should have
   been stored in variables in the first, outer let by this point. That’s
   right. In this function, these are the variables that those values were
   stored in, represented here as symbols.  Throughout this helper function,
   ~stemmer inserts the variable that the stemmer will have been assigned to in
   the outer let.
Now we can define the secondary macro.  We’ll call it cond-ends-helper,
   because I’m not feeling creative. It’s main purpose is to examine the input
   and call make-cond-ends-test with the appropriate false expression. The
   first overload for it will handle the last expression by testing for :else
   and deciding whether to use the default :else:
(defmacro cond-ends-helper "This helps cond-ends? by processing the " test-exprs pairs in cond-ends?'s environment." ([var stemmer word end true-expr] (if (= end :else) `(let [~var ~stemmer] ~true-expr) (make-cond-ends-test var stemmer word end true-expr stemmer))) ;; The overload will go here ... )
What happens here? First, at compile time (before the quasi-quote or call to
   make-cond-ends-test), it tests whether the ending is :else. If it is, it
   just outputs the shorter version of the ending. If it isn’t :else, it calls
   the helper function, with all the same parameters plus the false expression as
   a data structure. In this case, the false expression is the stemmer variable.
If your head is starting to hurt, take a moment. Unless you have programmed in
   lisp before, this isn’t like any programming you’ve ever done. We’re writing a
   program to write programs.  Like any meta-activity, it can make your brain
   explode.  Unfortunately, this is a particularly convoluted macro, because it
   has these two levels of macro, plus a function. Don’t worry if you don’t quite
   get it at first. There’s nothing wrong with copying and pasting the macro’s
   code into your source file and moving on. Start building macros like debug
   and if-ends?, and when you’re comfortable there, come back and read this
   through again. It will make a lot more sense.
Now, if you’re still with me, here’s the overloaded version of the helper
   macro. This takes an ending/expression pair and a list of the rest of the
   pairs in the expression. It then calls make-cond-ends-test with a new
   version of 
(defmacro cond-ends-helper "This helps cond-ends? by processing the test-exprs pairs in cond-ends?'s environment." ;; The earlier override goes here.... ([var stemmer word end true-expr & more] (make-cond-ends-test var stemmer word end true-expr `(cond-ends-helper ~var ~stemmer ~word ~@more))))
In some ways, this part of cond-ends-helper is much simpler. There are just
   a few things to comment on.
- In the parameter list, - & morecollects the values of the rest of the parameters and stores them in a variable I’ve named- more. This works with functions too.
- The false expression that this passes to - make-cond-ends-testbegins with a quasi-quote (back quote). This just generates an expression to pass as the false expression. This expression gets inserted into the expression that- make-conds-ends-testcreates.
- This quasi-quoted expression, when finally evaluated, is a macro call to the macro that created it. It starts the process all over again, passing the new macro expansion the values of - var,- stemmer, and- wordunchanged, and splicing all of the rest of the parameters that haven’t been handled yet into the expression using- ~@more.
Creating the Environment
So we know that the cond-ends? sets up the environment by creating an
   expression like this:
;; Set up the environment (let [stemmer# (make-stemmer "names"), word# (subword stemmer#)] ; ... )
And it replaces the ellipses by calling the macro cond-ends-helper with the
   variable and the variables stemmer# and word#. Put like that, it almost
   writes itself.
(defmacro cond-ends? "This is the same as a stacked series of if-ends?. This just sets up the environment for cond-ends-helper." [var stemmer & test-exprs] `(let [stemmer# ~stemmer, word# (subword stemmer#)] (cond-ends-helper ~var stemmer# word# ~@test-exprs)))
Notice that we use & test-exprs to collect all of the test expressions, and
   we use ~@test-exprs to splice them into the cond-ends-helper call.
Summary
For posterity’s sake, here is all the code for the cond-ends? macro, in one
   place:
(defn make-cond-ends-test [var stemmer word end true-expr false-expr] (let [vend (vec end)] `(let [j# (- (count ~word) ~(count vend))] (if (and (pos? j#) (= (subvec ~word j#) ~vend)) (let [~var (assoc ~stemmer :index (dec j#))] ~true-expr) ~false-expr)))) (defmacro cond-ends-helper "This helps cond-ends? by processing the test-exprs pairs in cond-ends?'s environment." ([var stemmer word end true-expr] (if (= end :else) `(let [~var ~stemmer] ~true-expr) (make-cond-ends-test var stemmer word end true-expr stemmer))) ([var stemmer word end true-expr & more] (make-cond-ends-test var stemmer word end true-expr `(cond-ends-helper ~var ~stemmer ~word ~@more)))) (defmacro cond-ends? "This is the same as a stacked series of if-ends?. This just sets up the environment for cond-ends-helper." [var stemmer & test-exprs] `(let [stemmer# ~stemmer, word# (subword stemmer#)] (cond-ends-helper ~var stemmer# word# ~@test-exprs)))
In the next posting, I’ll to wrap up macros by pointing out how they’re used and why.
 

4 comments:
Hi Eric,
First of all, the obligatory, but extremely well-deserved, thanks for this series. It's *really* interesting! I've just discovered it and read all the posts up to here in the last couple of days.
Not being steeped in macros, perhaps I oughtn't comment, but it seems to me that:
1. if-ends? should make use of the ends? function rather than include the logic directly in the macro
2. cond-ends? should use cond (and ends?) rather than tackling the harder task of building nested lets and ifs.
The reason I suggest this is to make it easier for beginners to understand. If there are performance reasons to do as much work as you've done, that's probably not enough justification in a pedagogical example. If there are "correctness" reasons for it, then maybe they should be discussed.
Thanks again. Looking forward to reading the rest, and any other series you feel like producing :)
Gavin Sinclair
Hi Gavin,
Glad you're getting something out of it. Unfortunately, it's far from finished at this point, although it does have a lot of useful information. (I also need to update the code to reflect the latest changes in Clojure, but that's another story.)
Now for your questions. (I've answered them together, since they're really the same question.)
I'm not sure that using ends? would make cond-ends? that much simpler, since ends? isn't really a straight predicate function. You'd still have to test at every stage. You might be able to wrap ends? in another function and use "or" to short-cut the tests. That would work, although it might also be a little too clever. You'd also still need to test for the ":else" condition.
You're right, though, the primary reason why I approached things this way was for performance. It does also illustrate one of the advantages of macros, though: they can "insert" expressions into the code at compile time, so that whenever it's executed, it's just a constant. For example, in "cond-ends?", the length of each of the endings is pre-computed ("~(count vend)") and inserted into the resulting expression as a constant integer. Similarly, the ending vector itself is a constant, although I'd have to think about whether that gains anything or not.
Pedagogically, I really should have stuck with a simpler stemming algorithm, as someone suggested (and you commented on) on another posting. That's one of the joys of blogs or any other serial publishing format: by the time realize what a thicket you're in, you may be too far along to turn back.
But you're probably right, using a simpler macro here would have made the explanation much simpler, too.
Glad you're enjoying it, though.
Eric
Thanks for your detailed response. One unfortunate thing I find about Clojure is that it's often hard for me to understand unless I'm writing it (and thereby experimenting at the REPL). Accordingly, I don't fully follow what you've written because it's been some time (a week or so) since I read and grokked the code involved. The solution to this is for me to type in all the code and try to implement if-ends? and cond-ends? differently myself. I'll let you know how I go (won't be very soon though).
Cheers,
Gavin
LOL. I know exactly what you mean. It's been months since I last looked at this code, and it took me a while to remember what i was thinking at the time. I'm still not sure that I did.
And I started to implement the suggestions I made in my last post, but I didn't really have time right then.
Life does sometimes get in the way.
Eric
Post a Comment