<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-2062021423309914146</id><updated>2011-11-27T20:04:53.132-05:00</updated><category term='facebook'/><category term='reading'/><category term='java'/><category term='clojure'/><category term='programming'/><category term='tutorial'/><category term='language'/><category term='lisp'/><category term='django'/><category term='concurrency'/><category term='life'/><category term='grammar'/><category term='sf'/><category term='haiku'/><category term='python'/><category term='coding'/><category term='clojure-series'/><category term='.net'/><category term='vim'/><category term='tv'/><category term='heroes'/><category term='blogging'/><category term='writing'/><category term='poverty'/><category term='Erlang'/><title type='text'>Writing/Coding</title><subtitle type='html'>Writing and Coding, Literature and Computers</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><generator version='7.00' 
uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>87</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-4823169986839729317</id><published>2009-06-10T12:38:00.002-05:00</published><updated>2009-06-10T12:46:17.461-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='blogging'/><title type='text'>Not Dead Yet!</title><content type='html'>&lt;p&gt;I'm back. For the moment, anyway. More about that below.&lt;/p&gt;
&lt;p&gt;What have I been up to? A lot, really.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;General life stuff, of course. It always gets in the way.&lt;/li&gt;
&lt;li&gt;My wife and I have adopted a little girl, and that keeps us busy, both with adoption busyness and with new parent busyness.&lt;/li&gt;
&lt;li&gt;I've started a new job.&lt;/li&gt;
&lt;li&gt;As far as programming goes, I'm exploring functional programming.  Clojure got me started, and since then I've branched out to Haskell and F#. I primarily wanted to learn Haskell, but I've had more opportunity to use F#.  At some point, I should compare the two.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Speaking of Clojure, I recently finished scanning my dead-tree edition of &lt;a href='http://www.amazon.com/gp/product/1934356336?ie=UTF8&amp;tag=httpwwwericro-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1934356336'&gt;&lt;em&gt;Programming Clojure&lt;/em&gt;&lt;/a&gt;.  It has good, practical examples. (I've taught enough to know how hard it is to come up with practical examples.) For example, &lt;a href="http://github.com/stuarthalloway/lancet/tree/master"&gt;Lancet&lt;/a&gt; is an extended example that Stuart Halloway develops throughout the book. It creates a DSL for build files (think &lt;em&gt;make&lt;/em&gt;). The system itself is built upon Ant, but it creates a nice interface layer on top of that.&lt;/p&gt;
&lt;p&gt;The book is also a good introduction to functional programming, including how it differs from object-oriented programming and how best to use it for the benefits it brings to concurrency and other problems.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;I'll close this post by outlining the direction I'd like to take this blog.  I think I've implied in the past that I haven't been happy that it's turned into a Clojure tutorial (nice though that is).  I just haven't been sure what I wanted it to become. I think I've finally got some ideas.&lt;/p&gt;
&lt;p&gt;Primarily, I'd like Writing/Coding to be more results-oriented. I'll still talk about text and natural-language processing, but instead of a lot of code and nuts-and-bolts, I'll focus on the results of those analyses and on visualizations and pretty pictures. There will still be some code, but much, much less. For a while now, I've been interested in data visualization, and this will give me an opportunity to explore that.&lt;/p&gt;
&lt;p&gt;This will also be much less Clojure-centric. Clojure will still be here, certainly, but you can also expect some Python, Haskell, F#, and  who-knows-what.  (&lt;a href='http://en.wikipedia.org/wiki/Brainfuck'&gt;BF&lt;/a&gt;, anyone?)&lt;/p&gt;
&lt;p&gt;First off, I need to brainstorm some post ideas, outline a dozen or so of those, and draft a few to have ready. Once that's taken care of, I'll start posting here again. That won't happen immediately, but I hope to get it done real soon now.&lt;/p&gt;
&lt;p&gt;In the meantime, go get &lt;em&gt;Programming Clojure&lt;/em&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-4823169986839729317?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/4823169986839729317/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=4823169986839729317' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/4823169986839729317'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/4823169986839729317'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2009/06/not-dead-yet.html' title='Not Dead Yet!'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-4072628998906962512</id><published>2008-11-11T20:50:00.001-05:00</published><updated>2008-11-11T20:50:14.128-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='haiku'/><category scheme='http://www.blogger.com/atom/ns#' term='facebook'/><title type='text'>Facebook Haiku/Doggerel #1</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;The other night, bored, I half watched TV and half watched my wife on
   Facebook. It inspired me to take up pen and produce some dreadful haikus, and
   in the best tradition of the Internet and blogs, I'm inflicting them on you.
   To draw out your pain and my evil glee, I'll deliver them one at a time.
&lt;/p&gt;
&lt;p&gt;Here's number one.
&lt;/p&gt;
&lt;p&gt;Heh. Enjoy.
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Friends write on my wall. &lt;br /&gt;
   Messages of cheer and joy &lt;br /&gt;
   Why do they hassle me? &lt;br /&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-4072628998906962512?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/4072628998906962512/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=4072628998906962512' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/4072628998906962512'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/4072628998906962512'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/11/facebook-haikudoggerel-1.html' title='Facebook Haiku/Doggerel #1'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-4206131960814832214</id><published>2008-10-24T14:52:00.001-05:00</published><updated>2008-10-24T14:52:16.526-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Concordances, Part 3: Positioning Tokens</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;Today, we're going to extend the processing to record where each token
   appears in the original document.
&lt;/p&gt;

&lt;h2&gt;Tokens Today&lt;/h2&gt;
&lt;p&gt;Currently, a token is just the string containing the token data.
&lt;/p&gt;
&lt;p&gt;Easy enough.
&lt;/p&gt;

&lt;h2&gt;Tokens Tomorrow&lt;/h2&gt;
&lt;p&gt;To hold more information about the token, we'll need a richer data type. To
   accommodate that, here's a &lt;code&gt;struct&lt;/code&gt; for tokens:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;defstruct &lt;/span&gt;&lt;span class="nv"&gt;token&lt;/span&gt; &lt;span class="no"&gt;:text&lt;/span&gt; &lt;span class="no"&gt;:raw&lt;/span&gt; &lt;span class="no"&gt;:line&lt;/span&gt; &lt;span class="no"&gt;:start&lt;/span&gt; &lt;span class="no"&gt;:end&lt;/span&gt; &lt;span class="no"&gt;:filename&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This gives us slots to hold the token's text; its original text before case
   normalization, stemming, or whatever; the line it occurred on; the start and
   end indices where it can be found on that line; and the name of the file the
   token was read from.
&lt;/p&gt;
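&lt;p&gt;As a quick illustration (the values are made up for this example, not taken
   from the concordance code), a token for the word &lt;code&gt;Hello&lt;/code&gt; at the start
   of the first line might be built and queried like this:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;(defstruct token :text :raw :line :start :end :filename)

;; Hypothetical values. Slots are filled in order, and any slot left off
;; the end (here, :filename) defaults to nil.
(def example-token (struct token &amp;quot;hello&amp;quot; &amp;quot;Hello&amp;quot; 0 0 5))

(:text example-token)      ; the normalized text
(:raw example-token)       ; the text as it appeared in the input
(:filename example-token)  ; nil until a filename is assoc-ed in
&lt;/pre&gt;&lt;/div&gt;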
&lt;p&gt;Again, pretty simple.
&lt;/p&gt;

&lt;h2&gt;Updating the Tokenization&lt;/h2&gt;
&lt;p&gt;The big changes happen in the tokenization procedure. Currently, it doesn't
   take lines into account.
&lt;/p&gt;
&lt;p&gt;Let's start with the highest-level functions and drill down to the lowest.
   First, these functions tokenize either a file or a string.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;split-lines&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;input-string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;split&lt;/span&gt; &lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;\\r\\n|\\r|\\n&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;input-string&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;tokenize-str&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;input-string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;tokenize-str-seq&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;split-lines&lt;/span&gt; &lt;span class="nv"&gt;input-string&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;input-string&lt;/span&gt; &lt;span class="nv"&gt;stop-word?&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;filter &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;comp &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;complement &lt;/span&gt;&lt;span class="nv"&gt;stop-word?&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="no"&gt;:text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
           &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;tokenize-str&lt;/span&gt; &lt;span class="nv"&gt;input-string&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;tokenize&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;with-open &lt;/span&gt;&lt;span class="nv"&gt;in&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;BufferedReader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;FileReader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;doall&lt;/span&gt;
       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;map &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;tkn&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;tkn&lt;/span&gt; &lt;span class="no"&gt;:filename&lt;/span&gt; &lt;span class="nv"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;tokenize-str-seq&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;line-seq &lt;/span&gt;&lt;span class="nv"&gt;in&lt;/span&gt;&lt;span class="p"&gt;))))))&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;filename&lt;/span&gt; &lt;span class="nv"&gt;stop-word?&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;with-open &lt;/span&gt;&lt;span class="nv"&gt;in&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;BufferedReader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;FileReader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;doall&lt;/span&gt;
       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;map &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;tkn&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;tkn&lt;/span&gt; &lt;span class="no"&gt;:filename&lt;/span&gt; &lt;span class="nv"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;filter &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;comp &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;complement &lt;/span&gt;&lt;span class="nv"&gt;stop-word?&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="no"&gt;:text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;tokenize-str-seq&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;line-seq &lt;/span&gt;&lt;span class="nv"&gt;in&lt;/span&gt;&lt;span class="p"&gt;))))))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;split-lines&lt;/code&gt; breaks a string into lines based on a regex of line endings.
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;tokenize-str&lt;/code&gt; uses &lt;code&gt;split-lines&lt;/code&gt; to break its input into lines, and it calls
   &lt;code&gt;tokenize-str-seq&lt;/code&gt; with them. The second overload for this function then
   filters the tokens with a stop list.
&lt;/p&gt;
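&lt;p&gt;One thing to watch in a filter like this: &lt;code&gt;comp&lt;/code&gt; applies its rightmost
   function first. Here's a minimal sketch, assuming the stop list is held in a
   set. The set and the sample tokens are hypothetical, and plain maps stand in
   for the token structs.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Hypothetical stop list; a Clojure set can be called as a predicate.
(def stop-word? #{&amp;quot;the&amp;quot; &amp;quot;a&amp;quot; &amp;quot;of&amp;quot;})

;; Plain maps stand in for token structs here.
(def sample-tokens [{:text &amp;quot;the&amp;quot;} {:text &amp;quot;whale&amp;quot;}])

;; :text runs first, then the negated stop-word? test runs on its result.
(def content-tokens
  (filter (comp (complement stop-word?) :text) sample-tokens))
;; content-tokens holds only the {:text &amp;quot;whale&amp;quot;} map
&lt;/pre&gt;&lt;/div&gt;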
&lt;p&gt;&lt;code&gt;tokenize&lt;/code&gt; opens a file with a &lt;code&gt;java.io.BufferedReader&lt;/code&gt;, reads its lines
   with &lt;code&gt;line-seq&lt;/code&gt;, and calls &lt;code&gt;tokenize-str-seq&lt;/code&gt; with them. It then sets
   the &lt;code&gt;:filename&lt;/code&gt; key on each token structure.
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;doall&lt;/code&gt; is thrown in there because &lt;code&gt;map&lt;/code&gt; is lazy, but &lt;code&gt;with-open&lt;/code&gt; isn't.
   &lt;code&gt;doall&lt;/code&gt; forces &lt;code&gt;map&lt;/code&gt; to evaluate everything. Without it, &lt;code&gt;with-open&lt;/code&gt; would
   close the file before its contents could be read. This is a common mistake,
   and it will probably bite you regularly. It does me.
&lt;/p&gt;
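&lt;p&gt;Here's a small sketch of the difference. It uses the binding-vector form of
   &lt;code&gt;with-open&lt;/code&gt; from later Clojure releases rather than the form shown above,
   and it reads from a &lt;code&gt;java.io.StringReader&lt;/code&gt; instead of a file. Both are
   assumptions made so the example stands alone, and the function names are made
   up.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;(defn read-lines-eagerly [text]
  ;; doall realizes every line while the reader is still open.
  (with-open [in (java.io.BufferedReader. (java.io.StringReader. text))]
    (doall (line-seq in))))

(defn read-lines-lazily [text]
  ;; Returns a sequence whose tail is still unrealized; the reader is
  ;; closed by the time anyone walks past the first line.
  (with-open [in (java.io.BufferedReader. (java.io.StringReader. text))]
    (line-seq in)))

(count (read-lines-eagerly &amp;quot;one\ntwo\nthree&amp;quot;))  ; safe to consume
;; (count (read-lines-lazily &amp;quot;one\ntwo\nthree&amp;quot;)) would try to read from
;; a closed reader, which normally fails with an IOException.
&lt;/pre&gt;&lt;/div&gt;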
&lt;p&gt;We haven't seen &lt;code&gt;tokenize-str-seq&lt;/code&gt; yet. What does it do?
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;def &lt;/span&gt;&lt;span class="nv"&gt;token-regex&lt;/span&gt; &lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;\\w+&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;defn- &lt;/span&gt;&lt;span class="nv"&gt;tokenize-str-seq&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This tokenizes a sequence of strings.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;strings&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;tokenize-str-seq&lt;/span&gt; &lt;span class="nv"&gt;strings&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;strings&lt;/span&gt; &lt;span class="nv"&gt;line-no&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;when-first &lt;/span&gt;&lt;span class="nv"&gt;line&lt;/span&gt; &lt;span class="nv"&gt;strings&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;lazy-cat &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;tokenize-line&lt;/span&gt; &lt;span class="nv"&gt;line-no&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;re-matcher &lt;/span&gt;&lt;span class="nv"&gt;token-regex&lt;/span&gt; &lt;span class="nv"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
               &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;tokenize-str-seq&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;rest &lt;/span&gt;&lt;span class="nv"&gt;strings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="nv"&gt;line-no&lt;/span&gt;&lt;span class="p"&gt;))))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This function tokenizes a sequence of strings. It walks through the sequence,
   numbering each line (&lt;code&gt;line-no&lt;/code&gt;). For each input line, it constructs a lazy
   sequence by concatenating the tokens for that line (&lt;code&gt;tokenize-line&lt;/code&gt;) with the
   tokens for the rest of the lines.
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;when-first&lt;/code&gt; is new. It is exactly equivalent to &lt;code&gt;when&lt;/code&gt; plus &lt;code&gt;let&lt;/code&gt;:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;macroexpand-1 &lt;/span&gt;&lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;when-first&lt;/span&gt; &lt;span class="nv"&gt;line&lt;/span&gt; &lt;span class="nv"&gt;strings&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="nv"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;clojure/when&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;clojure/seq&lt;/span&gt; &lt;span class="nv"&gt;strings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;clojure/let&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;line&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;clojure/first&lt;/span&gt; &lt;span class="nv"&gt;strings&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="nv"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;tokenize-line&lt;/code&gt; constructs a lazy sequence of the tokens in that line.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;defn- &lt;/span&gt;&lt;span class="nv"&gt;tokenize-line&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This tokenizes a single line into a lazy sequence of tokens.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;line-no&lt;/span&gt; &lt;span class="nv"&gt;matcher&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;tokenize-line&lt;/span&gt; &lt;span class="nv"&gt;line-no&lt;/span&gt; &lt;span class="nv"&gt;matcher&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;line-no&lt;/span&gt; &lt;span class="nv"&gt;matcher&lt;/span&gt; &lt;span class="nv"&gt;start&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;when &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;find&lt;/span&gt; &lt;span class="nv"&gt;matcher&lt;/span&gt; &lt;span class="nv"&gt;start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;lazy-cons &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;mk-token&lt;/span&gt; &lt;span class="nv"&gt;line-no&lt;/span&gt; &lt;span class="nv"&gt;matcher&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;tokenize-line&lt;/span&gt; &lt;span class="nv"&gt;line-no&lt;/span&gt; &lt;span class="nv"&gt;matcher&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;end&lt;/span&gt; &lt;span class="nv"&gt;matcher&lt;/span&gt;&lt;span class="p"&gt;))))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;mk-token&lt;/code&gt; constructs a &lt;code&gt;token&lt;/code&gt; struct from a regex matcher and a line number.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;defn- &lt;/span&gt;&lt;span class="nv"&gt;mk-token&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This creates a token given a line number and regex matcher.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;line-no&lt;/span&gt; &lt;span class="nv"&gt;matcher&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;raw&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;group&lt;/span&gt; &lt;span class="nv"&gt;matcher&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;struct &lt;/span&gt;&lt;span class="nv"&gt;token&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;toLowerCase&lt;/span&gt; &lt;span class="nv"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;raw&lt;/span&gt;
            &lt;span class="nv"&gt;line-no&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;start&lt;/span&gt; &lt;span class="nv"&gt;matcher&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;end&lt;/span&gt; &lt;span class="nv"&gt;matcher&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;That's it. &lt;code&gt;tokenize&lt;/code&gt; and &lt;code&gt;tokenize-str&lt;/code&gt; create a sequence of strings of input
   data. Each item in the sequence is a line in the input.
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;tokenize-str-seq&lt;/code&gt; takes that input sequence and creates a lazy sequence of
   the tokens from the first line and the tokens from the rest of the input
   sequence.
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;tokenize-line&lt;/code&gt; takes a line and constructs a lazy sequence of the tokens in
   it, as defined by the regex held in &lt;code&gt;token-regex&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;Finally, &lt;code&gt;mk-token&lt;/code&gt; constructs the token from the regex &lt;code&gt;Matcher&lt;/code&gt; and the line
   number.
&lt;/p&gt;
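&lt;p&gt;As a rough sketch of the idea behind &lt;code&gt;tokenize-line&lt;/code&gt; and &lt;code&gt;mk-token&lt;/code&gt;, here is
   a condensed version that walks a &lt;code&gt;Matcher&lt;/code&gt; across one line. The names
   &lt;code&gt;word-regex&lt;/code&gt; and &lt;code&gt;line-tokens&lt;/code&gt; and the map keys are made up for illustration.
   It also uses plain maps instead of the &lt;code&gt;token&lt;/code&gt; struct and builds an eager
   vector rather than a lazy sequence, so treat it as an approximation only.
&lt;/p&gt;

```clojure
;; Hypothetical stand-in for token-regex: runs of word characters.
(def word-regex #"\w+")

(defn line-tokens
  "Returns a vector of token maps for one line of text."
  [line-no line]
  (let [matcher (re-matcher word-regex line)]
    (loop [tokens []]
      ;; .find advances the matcher to the next match, if there is one.
      (if (.find matcher)
        (let [raw (.group matcher)]
          (recur (conj tokens {:text  (.toLowerCase raw)
                               :raw   raw
                               :line  line-no
                               :start (.start matcher)
                               :end   (.end matcher)})))
        tokens))))

(line-tokens 1 "Not Dead Yet")
;; three tokens; the second is
;; {:text "dead", :raw "Dead", :line 1, :start 4, :end 8}
```

&lt;p&gt;The &lt;code&gt;:end&lt;/code&gt; index is exclusive, which matches what &lt;code&gt;Matcher.end&lt;/code&gt; returns.
&lt;/p&gt;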
&lt;hr /&gt;

&lt;p&gt;If you've made it this far, you've probably got Clojure up and running, but if
   not, Bill Clementson has a great post on how to &lt;a href="http://bc.tech.coop/blog/081023.html"&gt;set up
Clojure+Emacs+SLIME&lt;/a&gt;. In the future, he'll be exploring Clojure in more
   detail. He's got a lot of good posts on Common Lisp and Scheme, and I'm
   looking forward to seeing what he does with Clojure.
&lt;/p&gt;
&lt;hr /&gt;

&lt;p&gt;I haven't really explained Clojure's laziness yet. Next, I'll talk about
   that.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-4206131960814832214?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/4206131960814832214/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=4206131960814832214' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/4206131960814832214'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/4206131960814832214'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/10/concordances-part-3-positioning-tokens.html' title='Concordances, Part 3: Positioning Tokens'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-8849197773126129228</id><published>2008-10-17T10:35:00.001-05:00</published><updated>2008-10-17T10:35:57.679-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='writing'/><title type='text'>Work in Progress, Draft 2....</title><content type='html'>&lt;div&gt;
&lt;p&gt;Is done!
&lt;/p&gt;
&lt;p&gt;This is an SF/F novel I've been working on all year.
&lt;/p&gt;
&lt;p&gt;Is it any good? Who knows? It's certainly better than my last two efforts.
&lt;/p&gt;
&lt;p&gt;For draft three, I'm just going to decide what point of view to use.  I wrote
   it in first person, but I rewrote the first half of draft two in third person.
   I'm going to read the whole thing and decide which half has more potential,
   and then I'll beat the other half into submission.
&lt;/p&gt;
&lt;p&gt;Then I will take a break to get some perspective on it, just in time for
   &lt;a href="http://www.nanowrimo.org/"&gt;NaNoWriMo&lt;/a&gt;.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-8849197773126129228?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/8849197773126129228/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=8849197773126129228' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/8849197773126129228'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/8849197773126129228'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/10/work-in-progress-draft-2.html' title='Work in Progress, Draft 2....'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-5126854426325144823</id><published>2008-10-15T12:44:00.002-05:00</published><updated>2008-10-15T12:49:09.260-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='poverty'/><title type='text'>Blog Action Day</title><content type='html'>&lt;div&gt;
&lt;p&gt;Lately, I've donated some time writing for a non-profit that promotes
   education in India to diminish poverty, hunger, child labor, illness, and
   other concerns.  So when I saw that the topic for &lt;a href="http://blogactionday.org/"&gt;Blog Action Day
2008&lt;/a&gt; was poverty, I thought I'd lend my voice.
&lt;/p&gt;
&lt;p&gt;When I sat down to write this entry, I looked on the Internet for some facts
   to throw at you.
&lt;/p&gt;
&lt;p&gt;Then I changed my mind.
&lt;/p&gt;
&lt;p&gt;In all those numbers and percent signs, it's easy to forget that poverty has a
   face.
&lt;/p&gt;
&lt;p&gt;It is the man who doesn't have the livestock and knowledge to get the most
   from his farm, and who thus cannot provide enough food for his family.
&lt;/p&gt;
&lt;p&gt;It is the single mother who doesn't have the education and resources to
   support her children.
&lt;/p&gt;
&lt;p&gt;It is the child who has to work in a factory to help her family survive.
&lt;/p&gt;
&lt;p&gt;The current world economic crisis makes poverty an even more pressing concern.
   Bill Gates and Warren Buffett will be fine.  As worried as the stock brokers
   and bankers are, they won't bear the full impact of the crisis. They'll be
   fine. I'm no Bill Gates, but I'll be fine, too, thanks for your concern.
&lt;/p&gt;
&lt;p&gt;But the economic downturn will hit first and hardest on those who already have
   trouble getting enough to eat each day.
&lt;/p&gt;
&lt;p&gt;I would say, "Remember them," but I don't want you to remember them. I want
   you to find a reputable charity, such as &lt;a href="http://blogactionday.org/live_updates/fundraising"&gt;one of these&lt;/a&gt;, and
   &lt;em&gt;give&lt;/em&gt;.
&lt;/p&gt;
&lt;p&gt;Now. Go. Do it.
&lt;/p&gt;

&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-5126854426325144823?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/5126854426325144823/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=5126854426325144823' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/5126854426325144823'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/5126854426325144823'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/10/blog-action-day.html' title='Blog Action Day'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-2399530862750624814</id><published>2008-10-15T09:18:00.001-05:00</published><updated>2008-10-15T09:18:09.671-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Concordances, Part 2: Making a Plan</title><content type='html'>&lt;div&gt;
&lt;p&gt;In the &lt;a href="http://writingcoding.blogspot.com/2008/10/concordances-part-1-what-that.html"&gt;last entry&lt;/a&gt;, I showed what a concordance looks like. Today,
   I'm going to talk about the changes that I'll need to make to what we already
   have and about what I'll need to add to the system that's here now.
&lt;/p&gt;

&lt;h2&gt;Position&lt;/h2&gt;
&lt;p&gt;The biggest underlying change is that tokens will need to know their position
   in the input document. The system that we're building can only handle plain
   text documents, so I get to pretend that &lt;em&gt;position&lt;/em&gt; is a simple concept. But
   it's really not. For example, in an XML document, the position could be an
   &lt;a href="http://en.wikipedia.org/wiki/Xpath"&gt;XPath&lt;/a&gt; expression.
&lt;/p&gt;
&lt;p&gt;Even for a text document, position isn't entirely clear. Is the position of a
   token the byte it started on in the original raw file? Or is it the character
   offset of the token in the text of the file after Unicode decoding?
&lt;/p&gt;
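&lt;p&gt;To make the difference concrete, here is a small sketch comparing the two
   kinds of offset. The function name &lt;code&gt;offsets&lt;/code&gt; is made up for this example, UTF-8
   is assumed as the file encoding, and the substring is assumed to be present
   in the text.
&lt;/p&gt;

```clojure
(defn offsets
  "Returns the character offset and the UTF-8 byte offset of sub in s.
  Assumes sub occurs somewhere in s."
  [^String s ^String sub]
  (let [char-offset (.indexOf s sub)
        ;; Encode everything before the match and count the bytes.
        prefix      (subs s 0 char-offset)
        byte-offset (count (.getBytes ^String prefix "UTF-8"))]
    {:char char-offset :byte byte-offset}))

;; \u00e9 is e-acute, which takes two bytes in UTF-8.
(offsets "caf\u00e9 au lait" "au")
;; returns {:char 5, :byte 6}
```

&lt;p&gt;For pure ASCII text the two numbers agree, which is why the question is easy
   to overlook.
&lt;/p&gt;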
&lt;p&gt;For our case, I'm going to say that the position of a token is its line number
   plus its beginning and ending character indices in the decoded Unicode text of
   the document. This is what my example in the last entry had.
&lt;/p&gt;

&lt;h2&gt;Input Text&lt;/h2&gt;
&lt;p&gt;To keep track of that, instead of slurping in a file's contents all at once,
   I'll now need to read in each line separately and keep track of the line
   numbers. That shouldn't be difficult, though.
&lt;/p&gt;
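&lt;p&gt;One way this might look, assuming lines are numbered from 1 as in the sample
   concordance. The name &lt;code&gt;numbered-lines&lt;/code&gt; is hypothetical, and the string reader
   only stands in for a real file.
&lt;/p&gt;

```clojure
(import '(java.io BufferedReader StringReader))

(defn numbered-lines
  "Returns a vector of [line-no line] pairs, counting lines from 1."
  [^BufferedReader reader]
  ;; vec forces the lazy sequence while the reader is still open.
  (vec (map-indexed (fn [i line] [(inc i) line])
                    (line-seq reader))))

(with-open [r (BufferedReader. (StringReader. "first line\nsecond line"))]
  (numbered-lines r))
;; returns [[1 "first line"] [2 "second line"]]
```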

&lt;h2&gt;Tokens&lt;/h2&gt;
&lt;p&gt;Tokens will also need to be more than just plain strings. They'll now need to
   be structures that keep track of their location: file names, line numbers, and
   starting and ending character indices.
&lt;/p&gt;
&lt;p&gt;As an added bonus, tokens will also be able to keep track of the original form
   of the word, as well as a normalized or a stemmed form.
&lt;/p&gt;
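&lt;p&gt;A structure along these lines could be declared with &lt;code&gt;defstruct&lt;/code&gt;. This is
   only a sketch: the field names are guesses for illustration, and the file
   name and numbers in the example are invented.
&lt;/p&gt;

```clojure
;; Hypothetical field names: normalized text, original text, and location.
(defstruct token :text :raw :file :line :start :end)

(def example-token
  (struct token "clojure" "Clojure" "post.txt" 12 4 11))

(:raw example-token)    ; returns "Clojure"
(:line example-token)   ; returns 12
```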

&lt;h2&gt;Indexing&lt;/h2&gt;
&lt;p&gt;To display the words in alphabetical order and to pull all the occurrences
   of each word together easily, I'll need to index the tokens by word. The index
   won't be industrial-strength, really. It will probably bog down on too large
   a document. Other options would be to use &lt;a href="http://lucene.apache.org/"&gt;Lucene&lt;/a&gt; or store the index in
   a database of some form. For now, though, I'll just keep things simple.
&lt;/p&gt;
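&lt;p&gt;A simple in-memory index might be nothing more than a sorted map from each
   word to its tokens. This sketch assumes tokens are maps with a &lt;code&gt;:text&lt;/code&gt; key
   holding the normalized word, and it relies on &lt;code&gt;group-by&lt;/code&gt; from current
   versions of &lt;code&gt;clojure.core&lt;/code&gt;. The name &lt;code&gt;index-tokens&lt;/code&gt; is hypothetical.
&lt;/p&gt;

```clojure
(defn index-tokens
  "Groups tokens by :text into a map sorted alphabetically by word."
  [tokens]
  (into (sorted-map) (group-by :text tokens)))

(def sample-index
  (index-tokens [{:text "belts" :line 3}
                 {:text "and"   :line 3}
                 {:text "and"   :line 4}]))

(keys sample-index)           ; returns ("and" "belts")
(count (sample-index "and"))  ; returns 2
```

&lt;p&gt;Everything lives in memory, so this has the same scaling limits mentioned
   above.
&lt;/p&gt;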

&lt;h2&gt;Display&lt;/h2&gt;
&lt;p&gt;I can imagine displaying a concordance a number of different ways: text files,
   HTML, a GUI form. To keep the system flexible, I'm going to defer that
   decision until later, and the core concordance generator will just return some
   basic Clojure data types. I'll also add a function that prints them to the
   screen.
&lt;/p&gt;
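&lt;p&gt;For the printing function, something like the following could format a single
   occurrence. It is a sketch that loosely follows the layout of the sample in
   the last entry, with the whole line used as context. The name &lt;code&gt;format-hit&lt;/code&gt; and
   the token keys are assumptions.
&lt;/p&gt;

```clojure
(defn format-hit
  "Formats one occurrence as 'line:start:end context', marking the word with
  asterisks. line-text is the full text of the line the token came from."
  [line-text {:keys [line start end]}]
  (format "%3d:%3d:%3d %s*%s*%s"
          line start end
          (subs line-text 0 start)
          (subs line-text start end)
          (subs line-text end)))

(println (format-hit "take a look" {:line 5 :start 5 :end 6}))
;; prints "  5:  5:  6 take *a* look"
```

&lt;p&gt;Because it returns a string instead of printing directly, the same function
   could feed a text file or an HTML page later.
&lt;/p&gt;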
&lt;p&gt;That's it. This should be a lot simpler than the Stemmer we just worked on,
   but it should give us a good idea of how various words are being used in the
   documents.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-2399530862750624814?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/2399530862750624814/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=2399530862750624814' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/2399530862750624814'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/2399530862750624814'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/10/concordances-part-2-making-plan.html' title='Concordances, Part 2: Making a Plan'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-7823182809905455365</id><published>2008-10-09T11:19:00.001-05:00</published><updated>2008-10-09T11:19:47.468-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Concordances, Part 1: What's that?</title><content type='html'>&lt;div&gt;
&lt;p&gt;OK, that's enough of a break, don't you think?
&lt;/p&gt;
&lt;p&gt;Now we've got some processing tools under our belts: tokenizing and stemming.
   We'll tweak those, and we'll add more later. But for now let's take a look at
   our data.
&lt;/p&gt;
&lt;p&gt;One of the earliest ways of analyzing texts is the &lt;a href="http://en.wikipedia.org/wiki/Concordance_(publishing)"&gt;concordance&lt;/a&gt;.  Simply
   put, it's an alphabetical list of the words used in a document, followed by
   every occurrence of that word, a little context around it (typically just a
   few words), and the location of that occurrence.
&lt;/p&gt;
&lt;p&gt;The first concordance was done in the thirteenth century for the Vulgate
   translation of the Christian Bible. However, because complete concordances are
   labor-intensive, they really weren't practical as a wide-spread tool. Because
   of this, concordances were only made for works deemed sufficiently important,
   such as religious texts or the works of Shakespeare or Chaucer.
&lt;/p&gt;
&lt;p&gt;But computers changed that. Now, given a text file, a computer can spit out a
   concordance in little time.
&lt;/p&gt;
&lt;p&gt;For example, here are the first few entries of a concordance of the first few
   paragraphs of this blog entry (although this may reflect an earlier draft).
   The numbers at the beginning are the line in a text file I copied the entry
   into and the start and end indices of the word in that line. This should give
   you a picture of what a concordance is, although a better one would pull
   context from surrounding lines. This one doesn't.  Yet.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;a
=
  1: 23: 24 K, that&amp;#39;s enough of *a* break.
  5:  7:  8                take *a* look at our data.
 10: 35: 36 eren&amp;#39;t practical as *a* wide-spread tool. B

add
===
  4: 17: 20      We&amp;#39;ll probably *add* more later, and we&amp;#39;

analyzing
=========
  7: 30: 39 he earliest ways of *analyzing* texts is the [conco

and
===
  3: 66: 69 r belts: tokenizing *and* stemming.
  4: 33: 36 bly add more later, *and* we&amp;#39;ll definitely tw

are
===
  9: 58: 61 mplete concordances *are* labor-intensive,
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Next we'll analyze what we'll need to build the concordance and plan the next
   steps.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-7823182809905455365?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/7823182809905455365/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=7823182809905455365' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/7823182809905455365'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/7823182809905455365'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/10/concordances-part-1-what-that.html' title='Concordances, Part 1: What&amp;#39;s that?'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-8549035313425610696</id><published>2008-09-12T19:10:00.001-05:00</published><updated>2008-09-12T19:10:32.250-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Stemming, Part 19: Debugging</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;Before I leave the Porter Stemmer behind, I want to show you some of the tools
   I used to debug the code as I went along.
&lt;/p&gt;
&lt;p&gt;There are some more modern options for debugging Clojure than what I'm
   presenting here. (Search the &lt;a href="http://groups.google.com/group/clojure"&gt;mailing list&lt;/a&gt; for details.) Personally, I
   generally use print statements for debugging. It's primitive, but effective.
   In some languages, it can also be painful. Fortunately, Lisp languages take
   much of the pain out of print-debugging.
&lt;/p&gt;

&lt;h2&gt;Tracing&lt;/h2&gt;
&lt;p&gt;One common way to debug a program is to watch each call to a function and
   see what it returns. This is called &lt;em&gt;tracing&lt;/em&gt;, and the function and macro below handle it.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;trace-call&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;f&lt;/span&gt; &lt;span class="nv"&gt;tag&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;print &lt;/span&gt;&lt;span class="nv"&gt;tag&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;:&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;-&amp;gt; &amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;result&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;apply &lt;/span&gt;&lt;span class="nv"&gt;f&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="nv"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="nv"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;trace-call&lt;/code&gt; returns a new function that prints the input arguments to a
   function, calls the function, prints the result, and returns it. It takes the
   function and a tag to identify what is being traced.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defmacro &lt;/span&gt;&lt;span class="nv"&gt;trace&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;fn-name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;def &lt;/span&gt;&lt;span class="nv"&gt;~fn-name&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;trace-call&lt;/span&gt; &lt;span class="nv"&gt;~fn-name&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;~fn-name&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code&gt;trace&lt;/code&gt; macro is syntactic sugar on &lt;code&gt;trace-call&lt;/code&gt;. It replaces the function
   with a traced version of it that uses its own name as a tag. For example, this
   creates and traces a function that upper-cases strings:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;upper-case&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;toUpperCase&lt;/span&gt; &lt;span class="nv"&gt;string&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;user/upper-case&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;upper-case&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;NAME&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;trace&lt;/span&gt; &lt;span class="nv"&gt;upper-case&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;user/upper-case&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;upper-case&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;upper-case&lt;/span&gt; &lt;span class="nv"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;NAME&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;NAME&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
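&lt;p&gt;Because &lt;code&gt;trace&lt;/code&gt; redefines the function in place, the original is replaced from
   then on. If you would rather leave it alone, one option is to call &lt;code&gt;trace-call&lt;/code&gt;
   directly and give the traced version its own name. This is only a sketch:
   &lt;code&gt;traced-plus&lt;/code&gt; is a name made up for the example.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; trace-call, repeated from above so the example stands alone.
(defn trace-call
  [f tag]
  (fn [&amp;amp; input]
    (print tag &amp;quot;:&amp;quot; input &amp;quot;-&amp;gt; &amp;quot;) (flush)
    (let [result (apply f input)]
      (println result) (flush)
      result)))

;; Hypothetical name: a traced wrapper around +. The + function itself is untouched.
(def traced-plus (trace-call + &amp;#39;plus))

(traced-plus 1 2 3)
;; prints: plus : (1 2 3) -&amp;gt; 6
;; returns: 6
&lt;/pre&gt;&lt;/div&gt;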

&lt;h2&gt;The &lt;code&gt;debug&lt;/code&gt; Macro&lt;/h2&gt;
&lt;p&gt;Another common trick in print-debugging is to print the value of an
   expression. The macro below evaluates an expression, prints both the
   expression and the result, and returns the result.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defmacro &lt;/span&gt;&lt;span class="nv"&gt;debug&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;~expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;~expr&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;=&amp;gt;&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For example:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;debug&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Lisp macros are especially helpful here, because they allow you to treat the
   expression both as data to print and as code to evaluate.
&lt;/p&gt;
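&lt;p&gt;Since &lt;code&gt;debug&lt;/code&gt; returns the value it prints, it can be wrapped around any
   sub-expression without changing what the surrounding code computes. Here is a small
   sketch of that. The &lt;code&gt;average&lt;/code&gt; function is made up for the example and is not
   part of the stemmer.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; debug, repeated from above so the example stands alone.
(defmacro debug
  [expr]
  `(let [value# ~expr]
     (println &amp;#39;~expr &amp;quot;=&amp;gt;&amp;quot; value#)
     (flush)
     value#))

;; Hypothetical function with two sub-expressions wrapped in debug.
(defn average
  [numbers]
  (/ (debug (reduce + numbers))
     (debug (count numbers))))

(average [1 2 3 6])
;; prints: (reduce + numbers) =&amp;gt; 12
;; prints: (count numbers) =&amp;gt; 4
;; returns: 3
&lt;/pre&gt;&lt;/div&gt;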

&lt;h2&gt;The &lt;code&gt;debug-stem&lt;/code&gt; Function&lt;/h2&gt;
&lt;p&gt;This function is a debugging version of &lt;code&gt;stem&lt;/code&gt;. It uses &lt;code&gt;binding&lt;/code&gt; to replace
   all the major functions of the stemmer with traced versions of them.
&lt;/p&gt;
&lt;p&gt;(We'll talk more about &lt;code&gt;binding&lt;/code&gt; later, when we deal with concurrency. Right
   now, just understand that &lt;code&gt;binding&lt;/code&gt; temporarily replaces the value of a top-level
   variable, like a function name, with a new value. The variable only has that value
   for the duration of the &lt;code&gt;binding&lt;/code&gt;. Afterward, it returns to its former
   value.)
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;debug-stem&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="c1"&gt;;; Use trace-call, not trace: trace would redefine each function permanently.&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;binding &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stem&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;trace-call&lt;/span&gt; &lt;span class="nv"&gt;stem&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;stem&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nv"&gt;make-stemmer&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;trace-call&lt;/span&gt; &lt;span class="nv"&gt;make-stemmer&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;make-stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nv"&gt;step-1ab&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;trace-call&lt;/span&gt; &lt;span class="nv"&gt;step-1ab&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;step-1ab&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nv"&gt;step-1c&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;trace-call&lt;/span&gt; &lt;span class="nv"&gt;step-1c&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;step-1c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nv"&gt;step-2&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;trace-call&lt;/span&gt; &lt;span class="nv"&gt;step-2&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;step-2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nv"&gt;step-3&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;trace-call&lt;/span&gt; &lt;span class="nv"&gt;step-3&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;step-3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nv"&gt;step-4&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;trace-call&lt;/span&gt; &lt;span class="nv"&gt;step-4&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;step-4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nv"&gt;step-5&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;trace-call&lt;/span&gt; &lt;span class="nv"&gt;step-5&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;step-5&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;stem&lt;/span&gt; &lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
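&lt;p&gt;To see the temporary nature of &lt;code&gt;binding&lt;/code&gt; on something smaller than the stemmer,
   here is a minimal sketch. It rests on a few assumptions: &lt;code&gt;add-one&lt;/code&gt; and &lt;code&gt;add-two&lt;/code&gt; are
   made-up functions, and the &lt;code&gt;^:dynamic&lt;/code&gt; marker is what Clojure 1.3 and later
   require before &lt;code&gt;binding&lt;/code&gt; will rebind a var. That requirement is newer than this post.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; trace-call, repeated from above so the example stands alone.
(defn trace-call
  [f tag]
  (fn [&amp;amp; input]
    (print tag &amp;quot;:&amp;quot; input &amp;quot;-&amp;gt; &amp;quot;) (flush)
    (let [result (apply f input)]
      (println result) (flush)
      result)))

;; Hypothetical functions. ^:dynamic allows add-one to be rebound with binding.
(defn ^:dynamic add-one [n] (+ n 1))
(defn add-two [n] (add-one (add-one n)))

;; Inside the binding, add-two calls the traced add-one.
(binding [add-one (trace-call add-one &amp;#39;add-one)]
  (add-two 1))
;; prints: add-one : (1) -&amp;gt; 2
;; prints: add-one : (2) -&amp;gt; 3
;; returns: 3

;; Outside the binding, add-one is the original again, so nothing is printed.
(add-two 1)
;; returns: 3
&lt;/pre&gt;&lt;/div&gt;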
&lt;p&gt;That's it. These were the main functions I used in debugging the stemmer as I
   ported it from C and made it more Clojure-native.
&lt;/p&gt;
&lt;p&gt;Next up, we'll create a concordance and look at other ways of presenting the
   texts that we're analyzing.
&lt;/p&gt;
&lt;p&gt;By the way, I've also finally updated the &lt;a href="http://code.google.com/p/word-clj/"&gt;repository&lt;/a&gt; for sample code.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-8549035313425610696?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/8549035313425610696/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=8549035313425610696' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/8549035313425610696'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/8549035313425610696'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/09/stemming-part-19-debugging.html' title='Stemming, Part 19: Debugging'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-6324684456346618602</id><published>2008-09-11T16:14:00.001-05:00</published><updated>2008-09-11T16:14:41.039-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='blogging'/><title type='text'>I'm Ba-a-ack</title><content type='html'>&lt;div&gt;
&lt;p&gt;Boy, but doesn't life just get in the way sometimes?
&lt;/p&gt;
&lt;p&gt;On the other hand, I have also let it keep me from posting. I was getting so
   bored with the Clojure tutorial series. I hate to think how tired you must
   have been with it.
&lt;/p&gt;
&lt;p&gt;But I'm back, and I'm going to make some changes.
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;&lt;p&gt;I've put a link to the table of contents for the Clojure tutorial in the
   sidebar. No matter how tired I am of it, it's still the main content on
   here.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;I'm going to continue the Clojure tutorial, but the pace won't be quite as
   relentless as it was. Hopefully, I'll be able to inject a little more
   energy into it, and it won't be quite as boring.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;I'm going to intersperse the tutorial with some other postings. I'll catch
   you up on what I've been doing, as well as talk about some other things
   that have caught my interest.
&lt;/p&gt;

 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;That's it. The moral: When you're getting tired, take a break and retool.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-6324684456346618602?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/6324684456346618602/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=6324684456346618602' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/6324684456346618602'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/6324684456346618602'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/09/i-ba-ack.html' title='I&amp;#39;m Ba-a-ack'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-1122839372407284583</id><published>2008-08-07T20:56:00.001-05:00</published><updated>2008-08-07T20:56:57.658-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Stemming, Part 18: Processing and Testing</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;All the other pieces are in place, so here is the final one. I’ll also describe
   how I tested the stemmer to make sure it was working correctly.
&lt;/p&gt;

&lt;h2&gt;The &lt;code&gt;stem&lt;/code&gt; Function&lt;/h2&gt;
&lt;p&gt;Everything that we’ve written so far happens under the hood. This is
   the one function that other code will actually call. Without further
   ado, here it is.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;stem&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;&amp;lt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;word&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;apply &lt;/span&gt;&lt;span class="nv"&gt;str&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;-&amp;gt; &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt; &lt;span class="nv"&gt;make-stemmer&lt;/span&gt;
                   &lt;span class="nv"&gt;step-1ab&lt;/span&gt; &lt;span class="nv"&gt;step-1c&lt;/span&gt; &lt;span class="nv"&gt;step-2&lt;/span&gt; &lt;span class="nv"&gt;step-3&lt;/span&gt; &lt;span class="nv"&gt;step-4&lt;/span&gt; &lt;span class="nv"&gt;step-5&lt;/span&gt;
                   &lt;span class="no"&gt;:word&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If the word has one or two letters, just return it. If it is longer, use the
   &lt;code&gt;-&amp;gt;&lt;/code&gt; macro to thread the word through &lt;code&gt;make-stemmer&lt;/code&gt; and the steps, and
   then extract the word vector with &lt;code&gt;:word&lt;/code&gt;.
&lt;/p&gt;
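&lt;p&gt;If the &lt;code&gt;-&amp;gt;&lt;/code&gt; macro is new to you, this small sketch may help. It threads a
   value through each form in turn, passing the result along as the first argument of the next
   one. The two functions here are made up for the example and are much simpler than the
   real stemmer steps.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Hypothetical stand-ins for make-stemmer and one step.
(defn make-state [word] {:word (vec word)})
(defn drop-last-letter [state] (update-in state [:word] pop))

;; Threaded form: reads left to right.
(-&amp;gt; &amp;quot;cats&amp;quot; make-state drop-last-letter :word)
;; returns: [\c \a \t]

;; The same call written inside out, without the macro.
(:word (drop-last-letter (make-state &amp;quot;cats&amp;quot;)))
;; returns: [\c \a \t]

;; Turning the vector of characters back into a string.
(apply str (-&amp;gt; &amp;quot;cats&amp;quot; make-state drop-last-letter :word))
;; returns: &amp;quot;cat&amp;quot;
&lt;/pre&gt;&lt;/div&gt;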
&lt;p&gt;The word vector gets passed to the &lt;code&gt;apply&lt;/code&gt; function. This is a
   higher-order function that takes a function and a sequence of arguments.
   It &lt;em&gt;applies&lt;/em&gt; the function to those arguments and returns the result. Let’s look
   at how it works.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;6&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;apply &lt;/span&gt;&lt;span class="nv"&gt;+&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="mi"&gt;6&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;apply &lt;/span&gt;&lt;span class="nv"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="mi"&gt;6&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can see that only the last argument to &lt;code&gt;apply&lt;/code&gt; has to be a sequence of
   arguments to pass to the function. Other arguments can be listed
   individually before the final sequence, and they are passed to the function ahead of
   the items in the sequence. The last argument must still be a sequence, though, so you can’t do this:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;apply &lt;/span&gt;&lt;span class="nv"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   
&lt;span class="nv"&gt;java&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;lang&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;IllegalArgumentException:&lt;/span&gt; &lt;span class="nv"&gt;Don&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;t&lt;/span&gt; &lt;span class="nv"&gt;know&lt;/span&gt; &lt;span class="nv"&gt;how&lt;/span&gt; &lt;span class="nv"&gt;to&lt;/span&gt; &lt;span class="nv"&gt;create&lt;/span&gt; &lt;span class="nv"&gt;ISeq&lt;/span&gt; &lt;span class="nv"&gt;from:&lt;/span&gt; &lt;span class="nv"&gt;Integer&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Of course, if you’re doing that, you already know how many arguments you’re
   calling the function with, and in that case, you should just call it as is
   (that is, just call &lt;code&gt;(+ 1 2 3)&lt;/code&gt;).
&lt;/p&gt;
&lt;p&gt;So in &lt;code&gt;stem&lt;/code&gt;, we take a word vector and pass all of the characters in it to
   the &lt;code&gt;str&lt;/code&gt; function. &lt;code&gt;str&lt;/code&gt; converts each of its arguments to a string and
   concatenates them.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str &lt;/span&gt;&lt;span class="sc"&gt;\w&lt;/span&gt; &lt;span class="sc"&gt;\o&lt;/span&gt; &lt;span class="sc"&gt;\r&lt;/span&gt; &lt;span class="sc"&gt;\d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;word&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;apply &lt;/span&gt;&lt;span class="nv"&gt;str&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sc"&gt;\w&lt;/span&gt; &lt;span class="sc"&gt;\o&lt;/span&gt; &lt;span class="sc"&gt;\r&lt;/span&gt; &lt;span class="sc"&gt;\d&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;word&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Well, we have a new toy now, so let’s play with it:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;stem&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;porter&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;porter&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;stem&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;porting&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;port&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;stem&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ports&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;port&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;stem&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ported&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;port&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;stem&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;stemming&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;stem&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h2&gt;Testing&lt;/h2&gt;
&lt;p&gt;I’ve been presenting the code here as a finished product, perfect (I guess) as
   written. But it didn’t begin that way. In fact, I originally wrote something
   very close to the C version of the algorithm and made sure that worked right.
   Then I gradually changed it to make it more lispy. That is the result I have
   presented here.
&lt;/p&gt;
&lt;p&gt;To make sure it worked correctly, I downloaded the &lt;a href="http://tartarus.org/~martin/PorterStemmer/voc.txt"&gt;test input data&lt;/a&gt;
   and &lt;a href="http://tartarus.org/~martin/PorterStemmer/output.txt"&gt;expected output&lt;/a&gt; from the Porter Stemmer web site. The first
   file contains 23,531 words for a test set. The second contains those same
   words after they’ve been run through the stemmer.
&lt;/p&gt;
&lt;p&gt;Next, I wrote a function that reads from both files, stems each input word,
   and compares the result to the expected output. I don’t always need to test
   every item in the test set. Sometimes I can get by with testing only the
   first so many words, so I’ve included a parameter, &lt;code&gt;n&lt;/code&gt;, to limit how many
   words to test. Also, sometimes I may want to see the output for every word in
   the test set, but most of the time I really only want to see the errors, so a
   second parameter, &lt;code&gt;output-all?&lt;/code&gt;, controls that. Finally, the function
   returns the total number of words tested, the number the stemmer got right,
   and the number it got wrong.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;read-lines&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;with-open &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;reader&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new &lt;/span&gt;&lt;span class="nv"&gt;java&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;io&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;BufferedReader&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new &lt;/span&gt;&lt;span class="nv"&gt;java&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;io&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;FileReader&lt;/span&gt; &lt;span class="nv"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;doall &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;line-seq &lt;/span&gt;&lt;span class="nv"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;test-porter&lt;/span&gt;
  &lt;span class="p"&gt;([]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-porter&lt;/span&gt; &lt;span class="nv"&gt;Integer/MAX_VALUE&lt;/span&gt; &lt;span class="nv"&gt;false&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;n&lt;/span&gt; &lt;span class="nv"&gt;output-all?&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;loop &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;take &lt;/span&gt;&lt;span class="nv"&gt;n&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;read-lines&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;porter-test/voc.txt&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
          &lt;span class="nv"&gt;expected&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;take &lt;/span&gt;&lt;span class="nv"&gt;n&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;read-lines&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;porter-test/output.txt&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
          &lt;span class="nv"&gt;total&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;correct&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;error&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
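     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;seq &lt;/span&gt;&lt;span class="nv"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;seq &lt;/span&gt;&lt;span class="nv"&gt;expected&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;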
       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;first &lt;/span&gt;&lt;span class="nv"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;e&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;first &lt;/span&gt;&lt;span class="nv"&gt;expected&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;a&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;stem&lt;/span&gt; &lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt; &lt;span class="nv"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
           &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;do&lt;/span&gt;
             &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;when &lt;/span&gt;&lt;span class="nv"&gt;output-all?&lt;/span&gt;
               &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;OK:&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pr-str &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
             &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;recur &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;rest &lt;/span&gt;&lt;span class="nv"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;rest &lt;/span&gt;&lt;span class="nv"&gt;expected&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="nv"&gt;total&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="nv"&gt;correct&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="nv"&gt;error&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
           &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;do&lt;/span&gt;
             &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;ERROR:&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pr-str &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                      &lt;span class="s"&gt;&amp;quot;=&amp;gt; (&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pr-str &lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;!=&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pr-str &lt;/span&gt;&lt;span class="nv"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;)&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
             &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;recur &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;rest &lt;/span&gt;&lt;span class="nv"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;rest &lt;/span&gt;&lt;span class="nv"&gt;expected&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="nv"&gt;total&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="nv"&gt;correct&lt;/span&gt;
                    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="nv"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)))))&lt;/span&gt;
       &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;total&lt;/span&gt; &lt;span class="nv"&gt;correct&lt;/span&gt; &lt;span class="nv"&gt;error&lt;/span&gt;&lt;span class="p"&gt;]))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The highlights of this are:
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;&lt;p&gt;&lt;code&gt;read-lines&lt;/code&gt; is a utility that opens a file using a Java &lt;code&gt;BufferedReader&lt;/code&gt;
   and binds that to &lt;code&gt;reader&lt;/code&gt;. &lt;code&gt;with-open&lt;/code&gt; always calls &lt;code&gt;(. reader close)&lt;/code&gt; when
   it exits. &lt;code&gt;line-seq&lt;/code&gt; takes a reader and returns a lazy sequence of the lines
   in the reader, and &lt;code&gt;doall&lt;/code&gt; forces Clojure to read all the items in that lazy
   sequence before the reader is closed. Basically, &lt;code&gt;read-lines&lt;/code&gt; reads all the
   lines in a file and returns them in a sequence.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;As we’ve seen before, &lt;code&gt;take&lt;/code&gt; pulls the first &lt;code&gt;n&lt;/code&gt; items from a list, which
   limits the number of words to be tested.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;The &lt;code&gt;loop&lt;/code&gt; continues while there is input from &lt;code&gt;input&lt;/code&gt; and &lt;code&gt;expected&lt;/code&gt;.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;The input is stemmed and stored as the variable &lt;code&gt;a&lt;/code&gt; (short for &lt;em&gt;actual&lt;/em&gt;).
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;If the actual is the same as the expected, optionally output that, and loop,
   incrementing the number of total words tested and the number of words stemmed
   correctly.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;If the actual and expected are not the same, always write this out and loop,
   incrementing the number of total words tested and the number of errors.
&lt;/p&gt;

 &lt;/li&gt;
&lt;/ul&gt;
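&lt;p&gt;The bookkeeping in that loop can also be written as a &lt;code&gt;reduce&lt;/code&gt; over pairs of
   words. What follows is only a sketch, not the code used in this series:
   &lt;code&gt;tally-results&lt;/code&gt; and &lt;code&gt;toy-stem&lt;/code&gt; are made-up names, the stemming function is
   passed in as an argument so the example stands on its own, and it assumes a
   recent version of Clojure.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Hypothetical helper, not part of the stemmer itself.
(defn tally-results
  &amp;quot;Applies stem-fn to each input word and compares the result with the
  matching expected word. Returns [total correct error].&amp;quot;
  [stem-fn inputs expecteds]
  (reduce (fn [[total correct error] [i e]]
            (if (= (stem-fn i) e)
              [(inc total) (inc correct) error]
              [(inc total) correct (inc error)]))
          [0 0 0]
          ;; map stops at the shorter sequence, like the (and ...) test above
          (map vector inputs expecteds)))

;; A toy stand-in for the real stemmer: it only strips a trailing s.
(defn toy-stem [word]
  (if (.endsWith word &amp;quot;s&amp;quot;)
    (subs word 0 (dec (count word)))
    word))

(tally-results toy-stem
               [&amp;quot;ports&amp;quot; &amp;quot;port&amp;quot; &amp;quot;ported&amp;quot;]
               [&amp;quot;port&amp;quot; &amp;quot;port&amp;quot; &amp;quot;port&amp;quot;])
;; =&amp;gt; [3 2 1]
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This version drops the printing, so it only answers how many words passed.
   The explicit &lt;code&gt;loop&lt;/code&gt; is handier when you want to report each error as you go.
&lt;/p&gt;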
&lt;p&gt;Tomorrow, I’ll talk about how I tracked down bugs that cropped up during
   testing.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-1122839372407284583?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/1122839372407284583/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=1122839372407284583' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/1122839372407284583'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/1122839372407284583'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/08/stemming-part-18-processing-and-testing.html' title='Stemming, Part 18: Processing and Testing'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-3161626986205840959</id><published>2008-08-06T19:58:00.002-05:00</published><updated>2009-01-26T11:09:48.202-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Stemming, Part 17: The Final Step</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;Today, we pick apart step five. This involves removing a final &lt;em&gt;-e&lt;/em&gt; and
   reducing a final &lt;em&gt;-ll&lt;/em&gt; to &lt;em&gt;-l&lt;/em&gt;. Each of these cases is handled in its own
   function.
&lt;/p&gt;

&lt;h2&gt;Final E&lt;/h2&gt;
&lt;p&gt;Silent &lt;em&gt;-e&lt;/em&gt; is one of the more, um, endearing quirks of English spelling.
   Whenever someone suggests spelling reform, it is invariably the first letter
   on the chopping block. Stemming isn&amp;#8217;t spelling reform, but this step does get
   rid of a final &lt;em&gt;-e&lt;/em&gt; in all words with more than one internal consonant
   cluster. It also removes the &lt;em&gt;-e&lt;/em&gt; from words with exactly one cluster, as
   long as the &lt;em&gt;-e&lt;/em&gt; is not preceded by a consonant-vowel-consonant sequence.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;rule-e&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This removes the final -e from a word if&lt;/span&gt;
&lt;span class="s"&gt;    - there is more than one internal consonant cluster; or&lt;/span&gt;
&lt;span class="s"&gt;    - there is exactly one final consonant cluster and&lt;/span&gt;
&lt;span class="s"&gt;      it is not preceded by a CVC sequence.&lt;/span&gt;
&lt;span class="s"&gt;  &amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;last &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="sc"&gt;\e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;m&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;or &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;&amp;gt; &lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cvc?&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))))))&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;pop-word&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
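&lt;p&gt;The count of internal consonant clusters is the value that &lt;code&gt;(m stemmer)&lt;/code&gt;
   returns above. To make the idea concrete, here is a rough sketch that
   computes the same kind of count on a plain string. It is not the &lt;code&gt;m&lt;/code&gt; used by
   &lt;code&gt;rule-e&lt;/code&gt;: the names &lt;code&gt;consonant-at?&lt;/code&gt; and &lt;code&gt;measure&lt;/code&gt; are made up for this
   example, and it assumes lower-case input.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Hypothetical, string-based illustration; assumes lower-case words.
(defn consonant-at?
  &amp;quot;True if the character at index i is a consonant in the Porter sense:
  not a, e, i, o, or u, and not a y that follows a consonant.&amp;quot;
  [word i]
  (let [c (nth word i)]
    (cond
      (#{\a \e \i \o \u} c) false
      (= c \y) (or (zero? i) (not (consonant-at? word (dec i))))
      :else true)))

(defn measure
  &amp;quot;Counts the places where a vowel is directly followed by a consonant.
  In the Porter notation, [C](VC)^m[V], this count is m.&amp;quot;
  [word]
  (let [kinds (map #(consonant-at? word %) (range (count word)))]
    (count (filter (fn [[prev cur]] (and (not prev) cur))
                   (partition 2 1 kinds)))))

(measure &amp;quot;locate&amp;quot;) ;; =&amp;gt; 2, more than one cluster, so the -e goes
(measure &amp;quot;rate&amp;quot;)   ;; =&amp;gt; 1, so the CVC test decides
(measure &amp;quot;tree&amp;quot;)   ;; =&amp;gt; 0, so the -e stays
&lt;/pre&gt;&lt;/div&gt;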

&lt;h2&gt;Final L&lt;/h2&gt;
&lt;p&gt;Handling final &lt;em&gt;-l&lt;/em&gt; just changes &lt;em&gt;-ll&lt;/em&gt; to &lt;em&gt;-l&lt;/em&gt; if there is more than one
   consonant cluster in the word. This cleans up any double-l&amp;#8217;s that may be left
   around.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;rule-l&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This changes -ll to -l if (&amp;gt; (m) 1).&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;last &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="sc"&gt;\l&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
           &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;double-c?&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
           &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;&amp;gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;m&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;pop-word&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h2&gt;Step 5&lt;/h2&gt;
&lt;p&gt;Once again, the function for step five is pretty simple:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     It pulls the word from the input stemmer and creates a new one with an
     index pointing to the end of the word;
 &lt;/li&gt;

 &lt;li&gt;
     It runs that through both of the functions listed above.
 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Again, notice that we used &lt;code&gt;-&amp;gt;&lt;/code&gt; to make it easier to read.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;step-5&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;Removes a final -e and changes -ll to -l if (&amp;gt; (m) 1).&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;-&amp;gt; &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;reset-index&lt;/span&gt; &lt;span class="nv"&gt;rule-e&lt;/span&gt; &lt;span class="nv"&gt;rule-l&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
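&lt;p&gt;If &lt;code&gt;-&amp;gt;&lt;/code&gt; is still unfamiliar, a tiny stand-alone example may help. It threads
   a value through a series of forms, passing each result as the first argument
   of the next. The map below is just a stand-in for a stemmer, invented for this
   illustration.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; A plain map standing in for a stemmer (hypothetical data).
(def sample {:word &amp;quot;locate&amp;quot;, :index 5})

;; Threaded form: read left to right.
(-&amp;gt; sample :word count inc)
;; =&amp;gt; 7

;; The nested form it stands for: read inside out.
(inc (count (:word sample)))
;; =&amp;gt; 7
&lt;/pre&gt;&lt;/div&gt;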

&lt;h2&gt;Surprise&lt;/h2&gt;
&lt;p&gt;One quirk with this algorithm is that it sometimes removes letters from
   standard English words. For example, &lt;em&gt;locate&lt;/em&gt; becomes &lt;em&gt;locat&lt;/em&gt;. The output it
   produces isn&amp;#8217;t standard, but it is correct. That is, all forms of &lt;em&gt;locate&lt;/em&gt;
   collapse down to one stem: &lt;em&gt;locat&lt;/em&gt;. So if you see a stem that isn&amp;#8217;t a real
   word, don&amp;#8217;t worry; it&amp;#8217;s still correct. Remember, we&amp;#8217;re identifying &lt;em&gt;stems&lt;/em&gt;, not
   &lt;em&gt;words&lt;/em&gt;.
&lt;/p&gt;
&lt;p&gt;Next, we&amp;#8217;ll look at the &lt;code&gt;stem&lt;/code&gt; function and at how to test a system like this.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-3161626986205840959?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/3161626986205840959/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=3161626986205840959' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/3161626986205840959'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/3161626986205840959'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/08/stemming-part-17-final-step.html' title='Stemming, Part 17: The Final Step'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-3638120454907476677</id><published>2008-08-05T20:00:00.001-05:00</published><updated>2008-08-05T20:00:01.419-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Stemming, Part 16: More Suffixes</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;Today, we’ll look at step four of the Porter stemmer. This step is a little
   different from the earlier and later ones, so it gets a post to itself.
&lt;/p&gt;

&lt;h2&gt;Utilities&lt;/h2&gt;
&lt;p&gt;Before we outline the step itself, however, we need to define another utility
   function. This tests whether the stemmer’s word has more than one internal
   consonant cluster. If it does, it strips off the ending; otherwise, it returns
   the original stemmer.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;chop&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;If there is more than one internal&lt;/span&gt;
&lt;span class="s"&gt;  consonant cluster in the stem, this chops&lt;/span&gt;
&lt;span class="s"&gt;  the ending (as identified by the index).&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;&amp;gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;m&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;subword&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h2&gt;Step Four&lt;/h2&gt;
&lt;p&gt;Once &lt;code&gt;chop&lt;/code&gt; is defined, the rest of step four pretty much defines itself. Like
   steps two and three, it is a &lt;code&gt;cond-ends?&lt;/code&gt; that tests for a variety of endings
   and strips them off.
&lt;/p&gt;
&lt;p&gt;There is one special case: If a word ends in &lt;em&gt;-ion&lt;/em&gt;, preceded by a &lt;em&gt;-s-&lt;/em&gt; or
   &lt;em&gt;-t-&lt;/em&gt;, the &lt;em&gt;-ion&lt;/em&gt; is removed, but the &lt;em&gt;-s-&lt;/em&gt; or &lt;em&gt;-t-&lt;/em&gt; is left. You can see how
   that is handled about halfway through &lt;code&gt;step-4&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;step-4&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;takes off -ant, -ence, etc., in context &amp;lt;c&amp;gt;vcvc&amp;lt;v&amp;gt;.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cond-ends?&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;al&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ance&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ence&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;er&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ic&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;able&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ible&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ant&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ement&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ment&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ent&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ion&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sc"&gt;\s&lt;/span&gt; &lt;span class="sc"&gt;\t&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;index-char&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                      &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ou&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ism&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ate&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;iti&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ous&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ive&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ize&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chop&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;That’s it for step four. In the next posting, I’ll outline step five, which
   is, if anything, more like step one than steps two to four.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-3638120454907476677?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/3638120454907476677/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=3638120454907476677' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/3638120454907476677'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/3638120454907476677'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/08/stemming-part-16-more-suffixes.html' title='Stemming, Part 16: More Suffixes'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-1575436540282848818</id><published>2008-08-01T18:52:00.001-05:00</published><updated>2008-08-01T18:52:43.073-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Stemming, Part 15: Morphological Suffix</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;Now on to steps two and three. Here we strip off a bunch of morphological
   suffixes.
&lt;/p&gt;

&lt;h2&gt;What Is a Morphological Suffix?&lt;/h2&gt;
&lt;p&gt;A &lt;a href="http://en.wikipedia.org/wiki/Morphology_(linguistics)"&gt;morphological suffix&lt;/a&gt; changes a word from one part of speech to
   another. These can be chained together almost indefinitely:
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     sense (&lt;em&gt;verb&lt;/em&gt;)
 &lt;/li&gt;

 &lt;li&gt;
     sense + -ate = sensate (&lt;em&gt;adjective&lt;/em&gt;)
 &lt;/li&gt;

 &lt;li&gt;
     sensate + -ion = sensation (&lt;em&gt;noun&lt;/em&gt;)
 &lt;/li&gt;

 &lt;li&gt;
     sensation + -al = sensational (&lt;em&gt;adjective&lt;/em&gt;)
 &lt;/li&gt;

 &lt;li&gt;
     sensational + -ly = sensationally (&lt;em&gt;adverb&lt;/em&gt;)
 &lt;/li&gt;

 &lt;li&gt;
     sensational + -ize = sensationalize (&lt;em&gt;verb&lt;/em&gt;)
 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You get the idea. We could play this all day.
&lt;/p&gt;
&lt;p&gt;(I should drop in a warning here. The derivations above are purely
   morphological. I’m not making any statement about how a word developed
   historically or how it came into the language. There, I feel better. Thanks
   for putting up with my moment of pedantry.)
&lt;/p&gt;
&lt;p&gt;A set of rules governs how morphological suffixes can be combined. You
   can’t just stick &lt;em&gt;sensation&lt;/em&gt; and &lt;em&gt;-ize&lt;/em&gt; together to make a verb. Also,
   different morphological suffixes change the root in different ways.
   &lt;em&gt;Sense&lt;/em&gt; + &lt;em&gt;-ate&lt;/em&gt; (&lt;em&gt;sensate&lt;/em&gt;) is very different from &lt;em&gt;sense&lt;/em&gt; + &lt;em&gt;-ible&lt;/em&gt;
   (&lt;em&gt;sensible&lt;/em&gt;), even though both &lt;em&gt;-ate&lt;/em&gt; and &lt;em&gt;-ible&lt;/em&gt; turn &lt;em&gt;sense&lt;/em&gt; into an
   adjective.
&lt;/p&gt;
&lt;p&gt;The Porter stemmer leverages these rules to test for two different sets of
   endings in two different steps. The two steps are structured almost
   identically as two large &lt;code&gt;cond-ends?&lt;/code&gt; expressions. In each case, they test for
   an ending, and if it is found, they replace it with another ending using &lt;code&gt;r&lt;/code&gt;.
   The &lt;code&gt;r&lt;/code&gt; function only makes the change if the stem has an internal
   consonant cluster. If it doesn’t, the ending is assumed to be part of the
   word and left alone.
&lt;/p&gt;
&lt;p&gt;For example, step 1c changes &lt;em&gt;sensationally&lt;/em&gt; to &lt;em&gt;sensationalli&lt;/em&gt;. Step 2
   changes &lt;em&gt;sensationalli&lt;/em&gt; to &lt;em&gt;sensational&lt;/em&gt;.
&lt;/p&gt;
&lt;p&gt;On the other hand, the name &lt;em&gt;calli&lt;/em&gt;, which also ends in &lt;em&gt;-alli&lt;/em&gt;, should not be
   truncated to &lt;em&gt;cal&lt;/em&gt;. Because the stemmer only truncates the ending if the stem
   has an internal consonant cluster, which &lt;em&gt;calli&lt;/em&gt; does not, &lt;em&gt;calli&lt;/em&gt; is left the
   way it is.
&lt;/p&gt;
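&lt;p&gt;To make that check concrete, here is a minimal sketch that works on plain
   strings instead of the stemmer structure used in this series. It assumes the
   test is Porter’s measure (the number of vowel-to-consonant transitions in the
   stem) being greater than zero, as in the reference C implementation. The
   names &lt;code&gt;consonant?&lt;/code&gt;, &lt;code&gt;measure&lt;/code&gt;, and &lt;code&gt;replace-suffix&lt;/code&gt; are hypothetical; they
   are not the functions from the earlier postings.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Sketch only: plain strings stand in for the stemmer structure.

(defn consonant?
  &amp;quot;True if the character at index i of word is a consonant.
  y counts as a consonant at the start of a word or after a vowel.&amp;quot;
  [word i]
  (let [c (nth word i)]
    (cond
      (#{\a \e \i \o \u} c) false
      (= c \y) (or (zero? i) (not (consonant? word (dec i))))
      :else true)))

(defn measure
  &amp;quot;Counts the vowel-to-consonant transitions in stem (Porter's m).&amp;quot;
  [stem]
  (let [kinds (map #(consonant? stem %) (range (count stem)))]
    (count (filter (fn [[prev cur]] (and (not prev) cur))
                   (partition 2 1 kinds)))))

(defn replace-suffix
  &amp;quot;Swaps suffix for replacement when word ends in suffix and the
  remaining stem has a measure above zero; otherwise returns word.&amp;quot;
  [word suffix replacement]
  (if (.endsWith word suffix)
    (let [stem (subs word 0 (- (count word) (count suffix)))]
      (if (pos? (measure stem))
        (str stem replacement)
        word))
    word))

(replace-suffix &amp;quot;sensationalli&amp;quot; &amp;quot;alli&amp;quot; &amp;quot;al&amp;quot;) ; &amp;quot;sensational&amp;quot;
(replace-suffix &amp;quot;calli&amp;quot; &amp;quot;alli&amp;quot; &amp;quot;al&amp;quot;)         ; &amp;quot;calli&amp;quot;, since the stem &amp;quot;c&amp;quot; has measure 0
&lt;/pre&gt;&lt;/div&gt;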

&lt;h2&gt;Our Functions for Today&lt;/h2&gt;
&lt;p&gt;Today, we’ll look at the functions for steps 2 and 3. As I said before, they
   are almost structurally identical, so I’ll show them both and comment on them
   together.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;step-2&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cond-ends?&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ational&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ate&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;tional&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;tion&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;enci&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ence&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;anci&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ance&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;izer&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ize&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;bli&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ble&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;alli&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;al&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;entli&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ent&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;eli&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;e&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ousli&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ous&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ization&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ize&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ation&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ate&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ator&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ate&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;alism&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;al&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;iveness&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ive&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;fulness&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ful&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ousness&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ous&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;aliti&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;al&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;iviti&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ive&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;biliti&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ble&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;logi&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;log&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;step-3&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;deals with -ic-, -ful, -ness, etc., using&lt;/span&gt;
&lt;span class="s"&gt;  a similar strategy to step-2.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cond-ends?&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;icate&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ative&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;alize&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;al&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;iciti&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ical&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ic&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ful&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="s"&gt;&amp;quot;ness&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;r&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Both functions are nothing but &lt;code&gt;cond-ends?&lt;/code&gt; expressions. Each tests the
   input stemmer for a series of endings, and on the first ending found, it
   tests for an internal consonant cluster and changes the ending. If either of
   those conditions is false, the original stemmer is returned.
&lt;/p&gt;

&lt;h2&gt;Possible Improvements&lt;/h2&gt;
&lt;p&gt;One obvious improvement would be to make a macro that takes an input stemmer
   and a sequence of ending/replacement pairs. It would expand into the
   &lt;code&gt;cond-ends?&lt;/code&gt; above. It might look something like:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;replace-ending&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;
                &lt;span class="s"&gt;&amp;quot;icate&amp;quot;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ic&amp;quot;&lt;/span&gt;
                &lt;span class="s"&gt;&amp;quot;ative&amp;quot;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;&amp;quot;&lt;/span&gt;
                &lt;span class="s"&gt;&amp;quot;alize&amp;quot;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;al&amp;quot;&lt;/span&gt;
                &lt;span class="s"&gt;&amp;quot;iciti&amp;quot;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ic&amp;quot;&lt;/span&gt;
                &lt;span class="s"&gt;&amp;quot;ful&amp;quot;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;&amp;quot;&lt;/span&gt;
                &lt;span class="s"&gt;&amp;quot;ness&amp;quot;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I’ll leave that as an exercise for the reader.
&lt;/p&gt;
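&lt;p&gt;For readers who want a starting point, the macro might be sketched as
   follows. This is one possible version rather than a definitive one. It
   assumes that &lt;code&gt;cond-ends?&lt;/code&gt; and &lt;code&gt;r&lt;/code&gt; are defined in the current namespace and
   behave as they do in &lt;code&gt;step-3&lt;/code&gt;, and it generates a fresh symbol for the
   binding that the step functions call &lt;code&gt;st&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Hypothetical sketch: relies on cond-ends? and r from this series.
(defmacro replace-ending
  &amp;quot;Expands into a cond-ends? form in which every ending/replacement
  pair becomes a clause that rewrites the ending with r.&amp;quot;
  [stemmer &amp;amp; pairs]
  (let [st (gensym &amp;quot;st&amp;quot;)] ; fresh name for the stemmer bound by cond-ends?
    `(cond-ends? ~st ~stemmer
                 ~@(mapcat (fn [[ending replacement]]
                             [ending `(r ~st ~stemmer ~replacement)])
                           (partition 2 pairs)))))

;; Inspect the generated code without evaluating it:
(macroexpand-1 '(replace-ending stemmer &amp;quot;icate&amp;quot; &amp;quot;ic&amp;quot; &amp;quot;ful&amp;quot; &amp;quot;&amp;quot;))
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Because the macro pastes its first argument into every &lt;code&gt;r&lt;/code&gt; call, it is
   safest to pass it a symbol, as the step functions do, rather than an
   arbitrary expression.
&lt;/p&gt;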
&lt;p&gt;In the next posting, we’ll look at step 4, which is slightly different from
   steps 2 and 3.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-1575436540282848818?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/1575436540282848818/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=1575436540282848818' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/1575436540282848818'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/1575436540282848818'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/08/stemming-part-15-morphological-suffix.html' title='Stemming, Part 15: Morphological Suffix'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-7692943308898817173</id><published>2008-07-31T17:48:00.001-05:00</published><updated>2008-07-31T17:48:36.981-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Stemming, Part 14: Verb Endings</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;In today’s posting, we’ll take a look at the initial step in the Porter
   stemmer.
&lt;/p&gt;

&lt;h2&gt;An Overview&lt;/h2&gt;
&lt;p&gt;The first step in stemming tokens involves removing plural &lt;em&gt;-s&lt;/em&gt; and verb
   endings &lt;em&gt;-ed&lt;/em&gt; and &lt;em&gt;-ing&lt;/em&gt;. It also turns &lt;em&gt;-y&lt;/em&gt; to &lt;em&gt;-i&lt;/em&gt;, so it will be recognized
   as a suffix in later steps. The documentation in the C source code lists some
   examples:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;caresses -&amp;gt; caress&lt;br /&gt;
   ponies -&amp;gt; poni&lt;br /&gt;
   ties -&amp;gt; ti&lt;br /&gt;
   caress -&amp;gt; caress&lt;br /&gt;
   cats -&amp;gt; cat&lt;br /&gt;
   feed -&amp;gt; feed&lt;br /&gt;
   agreed -&amp;gt; agree&lt;br /&gt;
   disabled -&amp;gt; disable&lt;br /&gt;
   matting -&amp;gt; mat&lt;br /&gt;
   mating -&amp;gt; mate&lt;br /&gt;
   meeting -&amp;gt; meet&lt;br /&gt;
   milling -&amp;gt; mill&lt;br /&gt;
   messing -&amp;gt; mess&lt;br /&gt;
   meetings -&amp;gt; meet
&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;Plurals&lt;/h2&gt;
&lt;p&gt;The first step is to strip off plurals. If the word ends in &lt;em&gt;-sses&lt;/em&gt; or
   &lt;em&gt;-ies&lt;/em&gt;, this removes the &lt;em&gt;-es&lt;/em&gt;; otherwise, if it ends in &lt;em&gt;-s&lt;/em&gt; but not &lt;em&gt;-ss&lt;/em&gt;,
   it takes off the &lt;em&gt;-s&lt;/em&gt;. If none of those apply, it returns the original
   stemmer.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;stem-plural&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This is part of step 1ab. It removes plurals (-s) from a stem.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;last &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="sc"&gt;\s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cond-ends?&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;
      &lt;span class="s"&gt;&amp;quot;sses&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;reset-index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
      &lt;span class="s"&gt;&amp;quot;ies&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;set-to&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;i&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="no"&gt;:else&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;&amp;gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;not= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;nth &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                           &lt;span class="sc"&gt;\s&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
              &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
              &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;It’s been a while since we’ve seen some of these functions. Here are a few
   reminders about what they do:
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;reset-index&lt;/code&gt; returns a new stemmer from a word vector (here with the last two
   letters removed from the word).
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;set-to&lt;/code&gt; removes the &lt;em&gt;-ies&lt;/em&gt; from the end of the word and replaces it with an
   &lt;em&gt;-i&lt;/em&gt;.
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;cond-ends?&lt;/code&gt; is the macro we created in the last few postings. I just wanted
   to point that out.
&lt;/p&gt;
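&lt;p&gt;If you want to try the plural rule at the REPL without the stemmer structure
   from the earlier postings, here is a minimal sketch that works on plain
   strings. &lt;code&gt;strip-plural&lt;/code&gt; is a hypothetical name, not part of the stemmer
   itself, and the sketch assumes a Clojure recent enough to include
   &lt;code&gt;clojure.string/ends-with?&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;(require &amp;#39;[clojure.string :as str])

(defn strip-plural
  &amp;quot;String-only sketch of the plural rule. Not the stem-plural from this series.&amp;quot;
  [word]
  (let [n (count word)]
    (cond
      ;; -sses and -ies both lose their final two letters
      (str/ends-with? word &amp;quot;sses&amp;quot;) (subs word 0 (- n 2))
      (str/ends-with? word &amp;quot;ies&amp;quot;)  (subs word 0 (- n 2))
      ;; a remaining -ss is left alone
      (str/ends-with? word &amp;quot;ss&amp;quot;)   word
      ;; any other final -s is dropped (single-letter words are skipped)
      (and (&amp;gt; n 1) (str/ends-with? word &amp;quot;s&amp;quot;)) (subs word 0 (dec n))
      :else word)))

(map strip-plural [&amp;quot;caresses&amp;quot; &amp;quot;ponies&amp;quot; &amp;quot;ties&amp;quot; &amp;quot;caress&amp;quot; &amp;quot;cats&amp;quot;])
;; =&amp;gt; (&amp;quot;caress&amp;quot; &amp;quot;poni&amp;quot; &amp;quot;ti&amp;quot; &amp;quot;caress&amp;quot; &amp;quot;cat&amp;quot;)
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The order of the tests matters here: &lt;em&gt;-ss&lt;/em&gt; has to be ruled out before the
   plain &lt;em&gt;-s&lt;/em&gt; test, or &lt;em&gt;caress&lt;/em&gt; would lose a letter.
&lt;/p&gt;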

&lt;h2&gt;Verb Endings&lt;/h2&gt;
&lt;p&gt;Another part of step 1 involves removing the verb suffixes &lt;em&gt;-ed&lt;/em&gt; and &lt;em&gt;-ing&lt;/em&gt;
   from the word. It first looks for &lt;em&gt;-eed&lt;/em&gt;: if the stem before it is long enough
   (that is, if it has at least one internal sequence of consonants), it removes
   the &lt;em&gt;-d&lt;/em&gt;. Words that end in &lt;em&gt;-ed&lt;/em&gt; (but not &lt;em&gt;-eed&lt;/em&gt;) or in &lt;em&gt;-ing&lt;/em&gt; lose that
   suffix, as long as the stem still contains a vowel. In
   either of the last two cases, it also expands truncated suffixes.
&lt;/p&gt;
&lt;p&gt;Expanding suffixes just tests for certain endings and, if they are found,
   appends an &lt;em&gt;-e&lt;/em&gt; to the word. Specifically, it looks for &lt;em&gt;-at&lt;/em&gt;, &lt;em&gt;-bl&lt;/em&gt;, and
   &lt;em&gt;-iz&lt;/em&gt;. It also checks for a double-consonant ending. Some are all right in
   English (&lt;em&gt;-ll&lt;/em&gt;, &lt;em&gt;-ss&lt;/em&gt;, and &lt;em&gt;-zz&lt;/em&gt;), but any other doubled consonant is cut
   back to a single letter. Finally, it appends an &lt;em&gt;-e&lt;/em&gt; to short words (those
   where &lt;code&gt;m&lt;/code&gt; is 1) that end in CVC (with exceptions).
   This process is handled by the &lt;code&gt;stem-expand-suffix&lt;/code&gt; function, which is listed
   first.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;stem-expand-suffix&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This is part of step 1ab. It expands -at, -bl,&lt;/span&gt;
&lt;span class="s"&gt;  and -iz by adding an -e in certain circumstances.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cond-ends?&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;
    &lt;span class="s"&gt;&amp;quot;at&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;set-to&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ate&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="s"&gt;&amp;quot;bl&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;set-to&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ble&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="s"&gt;&amp;quot;iz&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;set-to&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ize&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="no"&gt;:else&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cond&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;double-c?&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
              &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sc"&gt;\l&lt;/span&gt; &lt;span class="sc"&gt;\s&lt;/span&gt; &lt;span class="sc"&gt;\z&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;last &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
                &lt;span class="nv"&gt;st&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;m&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cvc?&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)))))&lt;/span&gt;
              &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;set-to&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;e&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="no"&gt;:else&lt;/span&gt;
              &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;stem-verb-ending&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This is part of step 1ab. It removes verb endings -ed&lt;/span&gt;
&lt;span class="s"&gt;  and -ing.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cond-ends?&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;
    &lt;span class="s"&gt;&amp;quot;eed&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pos? &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;m&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
            &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="s"&gt;&amp;quot;ed&amp;quot;&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;vowel-in-stem?&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;stem-expand-suffix&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;subword&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
            &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="s"&gt;&amp;quot;ing&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;vowel-in-stem?&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;stem-expand-suffix&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;subword&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
            &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;double-c?&lt;/code&gt; returns true if the word ends in a double consonant, such as
   &lt;em&gt;-ll&lt;/em&gt;, &lt;em&gt;-ss&lt;/em&gt;, or &lt;em&gt;-tt&lt;/em&gt;.
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;cvc?&lt;/code&gt; returns true if the word ends in a consonant-vowel-consonant sequence
   and if the final consonant is not &lt;em&gt;w&lt;/em&gt;, &lt;em&gt;x&lt;/em&gt;, or &lt;em&gt;y&lt;/em&gt;.
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;m&lt;/code&gt; returns the number of consonant sequences between the start of a word and
   the index position. But if the word starts with a consonant sequence, it isn’t
   counted.
&lt;/p&gt;
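&lt;p&gt;These helpers are easier to follow with something runnable in front of you.
   What follows is a rough, string-based sketch of them, plus the suffix
   expansion they support. It is not the code from the earlier postings: these
   versions take a whole string rather than a stemmer and an index, and the
   names &lt;code&gt;consonant?&lt;/code&gt;, &lt;code&gt;measure&lt;/code&gt;, and &lt;code&gt;expand-suffix&lt;/code&gt; are made up for this
   example.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;(require &amp;#39;[clojure.string :as str])

(defn consonant?
  &amp;quot;True if the letter at index i is a consonant. A y counts as a consonant
  at the start of the word or right after a vowel.&amp;quot;
  [word i]
  (let [c (nth word i)]
    (cond
      (#{\a \e \i \o \u} c) false
      (= c \y) (or (zero? i) (not (consonant? word (dec i))))
      :else true)))

(defn measure
  &amp;quot;Counts the places where a vowel is followed by a consonant, which is
  the m in the pattern [C](VC)^m[V].&amp;quot;
  [word]
  (let [flags (map #(consonant? word %) (range (count word)))]
    (count (filter (fn [[a b]] (and (not a) b))
                   (map vector flags (rest flags))))))

(defn double-c?
  &amp;quot;True if the word ends in the same consonant twice.&amp;quot;
  [word]
  (let [n (count word)]
    (and (&amp;gt;= n 2)
         (= (nth word (- n 1)) (nth word (- n 2)))
         (consonant? word (- n 1)))))

(defn cvc?
  &amp;quot;True if the word ends consonant-vowel-consonant and the last letter
  is not w, x, or y.&amp;quot;
  [word]
  (let [n (count word)]
    (and (&amp;gt;= n 3)
         (consonant? word (- n 3))
         (not (consonant? word (- n 2)))
         (consonant? word (- n 1))
         (not (#{\w \x \y} (nth word (- n 1)))))))

(defn expand-suffix
  &amp;quot;Applied to what is left after -ed or -ing has been removed.&amp;quot;
  [stem]
  (cond
    ;; -at, -bl, and -iz get their -e back
    (some #(str/ends-with? stem %) [&amp;quot;at&amp;quot; &amp;quot;bl&amp;quot; &amp;quot;iz&amp;quot;]) (str stem &amp;quot;e&amp;quot;)
    ;; doubled consonants other than l, s, and z lose a letter
    (double-c? stem) (if (#{\l \s \z} (last stem))
                       stem
                       (subs stem 0 (dec (count stem))))
    ;; short CVC words get an -e
    (and (= (measure stem) 1) (cvc? stem)) (str stem &amp;quot;e&amp;quot;)
    :else stem))

(map expand-suffix [&amp;quot;matt&amp;quot; &amp;quot;mat&amp;quot; &amp;quot;meet&amp;quot; &amp;quot;mill&amp;quot; &amp;quot;mess&amp;quot; &amp;quot;disabl&amp;quot;])
;; =&amp;gt; (&amp;quot;mat&amp;quot; &amp;quot;mate&amp;quot; &amp;quot;meet&amp;quot; &amp;quot;mill&amp;quot; &amp;quot;mess&amp;quot; &amp;quot;disable&amp;quot;)
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The inputs in that last call are what remains of &lt;em&gt;matting&lt;/em&gt;, &lt;em&gt;mating&lt;/em&gt;,
   &lt;em&gt;meeting&lt;/em&gt;, &lt;em&gt;milling&lt;/em&gt;, &lt;em&gt;messing&lt;/em&gt;, and &lt;em&gt;disabled&lt;/em&gt; once the verb ending is
   gone.
&lt;/p&gt;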

&lt;h2&gt;Step 1AB&lt;/h2&gt;
&lt;p&gt;The actual function for step 1AB (&lt;em&gt;A&lt;/em&gt; calls &lt;code&gt;stem-plural&lt;/code&gt; and &lt;em&gt;B&lt;/em&gt; calls
   &lt;code&gt;stem-verb-ending&lt;/code&gt;) is simple. It just passes its input through the two
   functions:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;step-1ab&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;stem-verb-ending&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;stem-plural&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;One thing about functional languages is that nested calls often have to be
   read from right to left. The stemmer gets passed to &lt;code&gt;stem-plural&lt;/code&gt; first and
   &lt;code&gt;stem-verb-ending&lt;/code&gt; second, but the code lists them in the opposite order.
   That is counter-intuitive, and it can make functional code harder to read.
&lt;/p&gt;
&lt;p&gt;Clojure provides an improvement to this. It uses the &lt;code&gt;-&amp;gt;&lt;/code&gt; macro to build
   expressions like above. Let’s spend some time understanding what this macro
   does, first by using it and then by looking at the expressions it outputs.
&lt;/p&gt;
&lt;p&gt;The first parameter to &lt;code&gt;-&amp;gt;&lt;/code&gt; is an expression. The remaining parameters are
   functions. The expression parameter gets passed as the first parameter to the
   first function, and the output of this gets passed as the first parameter to
   the second function parameter. This continues until all the functions have
   been chained together.
&lt;/p&gt;
&lt;p&gt;To play with this, let’s define some functions that operate on strings.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;to-lower-case&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;toLowerCase&lt;/span&gt; &lt;span class="nv"&gt;string&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;porter/to-lower-case&lt;/span&gt;
&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;trim&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;trim&lt;/span&gt; &lt;span class="nv"&gt;string&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;porter/trim&lt;/span&gt;
&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;to-lower-case&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;   ThIs NeEdS ClEaNiNg  &amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;this needs cleaning&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If we make that last call with &lt;code&gt;-&amp;gt;&lt;/code&gt;, it looks like this:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;-&amp;gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;   ThIs NeEdS ClEaNiNg   &amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;to-lower-case&lt;/span&gt; &lt;span class="nv"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;this needs cleaning&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This is really handy if there aren’t any other arguments. But what if you want
   to use a function that needs more than one argument? As long as the first
   argument is the expression parameter or the result of the previous function in
   the sequence, it’s no problem. Just enclose the function and the remaining
   parameters to that function in parentheses. For example, this defines a
   wrapper for &lt;code&gt;String.substring&lt;/code&gt; that returns everything from the index
   given as its second parameter on.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;substring&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;index&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;substring&lt;/span&gt; &lt;span class="nv"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;index&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;porter/substring&lt;/span&gt;
&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;-&amp;gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;   ThIs NeEdS ClEaNiNg   &amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;to-lower-case&lt;/span&gt; &lt;span class="nv"&gt;trim&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;substring&lt;/span&gt; &lt;span class="mi"&gt;11&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;cleaning&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We can see what this does using &lt;code&gt;macroexpand-1&lt;/code&gt;:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;macroexpand-1 &lt;/span&gt;&lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;   ThIs NeEdS ClEaNiNg   &amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;to-lower-case&lt;/span&gt; &lt;span class="nv"&gt;trim&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;substring&lt;/span&gt; &lt;span class="mi"&gt;11&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;clojure/-&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;clojure/-&amp;gt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;   ThIs NeEdS ClEaNiNg   &amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;to-lower-case&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;trim&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;substring&lt;/span&gt; &lt;span class="mi"&gt;11&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Umm. It changed, but not by much. Let’s just take the inner, second &lt;code&gt;-&amp;gt;&lt;/code&gt; and
   try expanding it:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;macroexpand-1 &lt;/span&gt;&lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;   ThIs NeEdS ClEaNiNg   &amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;to-lower-case&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;to-lower-case&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;   ThIs NeEdS ClEaNiNg   &amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;That seems like a normal function call. If we keep breaking it down, we get
   this:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;substring&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;to-lower-case&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;   ThIs NeEdS ClEaNiNg   &amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="mi"&gt;11&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Which is what we want and expect.
&lt;/p&gt;
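&lt;p&gt;A note if you are following along in a newer Clojure: the step-by-step
   expansion above comes from the version these postings were written against.
   As far as I can tell, current releases build the whole nested form in a single
   pass, so one call to &lt;code&gt;macroexpand-1&lt;/code&gt; should already show the final result.
   Here is a small check, where &lt;code&gt;x&lt;/code&gt;, &lt;code&gt;f&lt;/code&gt;, &lt;code&gt;g&lt;/code&gt;, and &lt;code&gt;h&lt;/code&gt; are placeholder
   symbols that don’t need to be defined, because the form is quoted:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Assumes a current Clojure release. Older versions expand one step at a time.
(macroexpand-1 &amp;#39;(-&amp;gt; x f g (h 11)))
;; =&amp;gt; (h (g (f x)) 11)
&lt;/pre&gt;&lt;/div&gt;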
&lt;p&gt;Now, we can rewrite &lt;code&gt;step-1ab&lt;/code&gt; to use this macro and be much more readable.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;step-1ab&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;step-1ab gets rid of plurals and -ed or -ing. E.g.,&lt;/span&gt;
&lt;span class="s"&gt;    caresses -&amp;gt; caress&lt;/span&gt;
&lt;span class="s"&gt;    ponies -&amp;gt; poni&lt;/span&gt;
&lt;span class="s"&gt;    ties -&amp;gt; ti&lt;/span&gt;
&lt;span class="s"&gt;    caress -&amp;gt; caress&lt;/span&gt;
&lt;span class="s"&gt;    cats -&amp;gt; cat&lt;/span&gt;
&lt;span class="s"&gt;    feed -&amp;gt; feed&lt;/span&gt;
&lt;span class="s"&gt;    agreed -&amp;gt; agree&lt;/span&gt;
&lt;span class="s"&gt;    disabled -&amp;gt; disable&lt;/span&gt;
&lt;span class="s"&gt;    matting -&amp;gt; mat&lt;/span&gt;
&lt;span class="s"&gt;    mating -&amp;gt; mate&lt;/span&gt;
&lt;span class="s"&gt;    meeting -&amp;gt; meet&lt;/span&gt;
&lt;span class="s"&gt;    milling -&amp;gt; mill&lt;/span&gt;
&lt;span class="s"&gt;    messing -&amp;gt; mess&lt;/span&gt;
&lt;span class="s"&gt;    meetings -&amp;gt; meet&lt;/span&gt;
&lt;span class="s"&gt;  &amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;-&amp;gt; &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;stem-plural&lt;/span&gt; &lt;span class="nv"&gt;stem-verb-ending&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h2&gt;Step 1C&lt;/h2&gt;
&lt;p&gt;The rest of step one just tests to see if the word ends in &lt;em&gt;-y&lt;/em&gt;. If it does,
   and if there is a vowel in the stem, it removes the &lt;em&gt;-y&lt;/em&gt; and adds an &lt;em&gt;-i&lt;/em&gt; to
   the word.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;step-1c&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;Turns terminal y to i when there is another vowel in the stem.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;if-ends?&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;y&amp;quot;&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;vowel-in-stem?&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;reset-index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;conj &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="sc"&gt;\i&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
              &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;That’s pretty straightforward. It uses &lt;code&gt;if-ends?&lt;/code&gt; to test for the &lt;em&gt;-y&lt;/em&gt;, and
   &lt;code&gt;vowel-in-stem?&lt;/code&gt; looks for a vowel before the &lt;em&gt;-y&lt;/em&gt;. If either of these is not
   the case, the original stemmer is returned.
&lt;/p&gt;
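&lt;p&gt;To see the rule in isolation, here is a small string-based sketch. As before,
   it is only an approximation of the real thing: &lt;code&gt;terminal-y-to-i&lt;/code&gt; and
   &lt;code&gt;consonant?&lt;/code&gt; are hypothetical names, and they work on plain strings instead of
   the stemmer structure.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;(defn consonant?
  &amp;quot;True if the letter at index i is a consonant. A y counts as a consonant
  at the start of the word or right after a vowel.&amp;quot;
  [word i]
  (let [c (nth word i)]
    (cond
      (#{\a \e \i \o \u} c) false
      (= c \y) (or (zero? i) (not (consonant? word (dec i))))
      :else true)))

(defn terminal-y-to-i
  &amp;quot;Swaps a final y for an i when some letter before it is a vowel.&amp;quot;
  [word]
  (let [n (count word)
        stem-len (dec n)]
    (if (and (pos? n)
             (= (last word) \y)
             ;; look for a vowel anywhere before the final y
             (some #(not (consonant? word %)) (range stem-len)))
      (str (subs word 0 stem-len) &amp;quot;i&amp;quot;)
      word)))

(map terminal-y-to-i [&amp;quot;happy&amp;quot; &amp;quot;sky&amp;quot; &amp;quot;say&amp;quot; &amp;quot;pony&amp;quot;])
;; =&amp;gt; (&amp;quot;happi&amp;quot; &amp;quot;sky&amp;quot; &amp;quot;sai&amp;quot; &amp;quot;poni&amp;quot;)
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;em&gt;sky&lt;/em&gt; is left alone because nothing before its &lt;em&gt;-y&lt;/em&gt; is a vowel.
&lt;/p&gt;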
&lt;p&gt;Those two functions make up step 1. In the next posting, we’ll look at the
   next several steps.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-7692943308898817173?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/7692943308898817173/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=7692943308898817173' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/7692943308898817173'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/7692943308898817173'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/07/stemming-part-14-verb-endings.html' title='Stemming, Part 14: Verb Endings'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-4364269957306101684</id><published>2008-07-29T18:46:00.001-05:00</published><updated>2008-07-29T18:46:26.945-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Stemming, Part 13: The Steps for Processing</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;We finally have all the pieces in place to actually put the Porter stemmer
   together. But it’s been so long, I’ve certainly forgotten what goes next, so
   let’s take a moment to remember where we are going with this.
&lt;/p&gt;
&lt;p&gt;Earlier I outlined the process that the stemmer will perform in five steps:
&lt;/p&gt;
&lt;blockquote&gt;&lt;ol&gt;
 &lt;li&gt;
     Get rid of plurals, &lt;em&gt;-ed&lt;/em&gt;, and &lt;em&gt;-ing&lt;/em&gt;, and turn &lt;em&gt;-y&lt;/em&gt; to &lt;em&gt;-i&lt;/em&gt;, so it will
be recognized as a suffix in later steps;
 &lt;/li&gt;

 &lt;li&gt;
     Collapse multiple suffixes, such as &lt;em&gt;-ational&lt;/em&gt;, &lt;em&gt;-ator&lt;/em&gt;, &lt;em&gt;-iveness&lt;/em&gt;,
and others, to a single suffix, such as &lt;em&gt;-ate&lt;/em&gt;, &lt;em&gt;-ate&lt;/em&gt;, and &lt;em&gt;-ive&lt;/em&gt;,
respectively;
 &lt;/li&gt;

 &lt;li&gt;
     Collapse a different set of multiple suffixes or remove a small set of
     single suffixes;
 &lt;/li&gt;

 &lt;li&gt;
     Remove a set of suffixes including &lt;em&gt;-ance&lt;/em&gt;, &lt;em&gt;-ic&lt;/em&gt;, and &lt;em&gt;-ive&lt;/em&gt;; and
 &lt;/li&gt;

 &lt;li&gt;
     Remove final &lt;em&gt;-e&lt;/em&gt; and change &lt;em&gt;-ll&lt;/em&gt; to &lt;em&gt;-l&lt;/em&gt; in some circumstances.
 &lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;&lt;p&gt;In the next posting, we’ll pick apart what needs to be done for step 1.
&lt;/p&gt;
&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;(Sorry this posting isn’t longer. I’m still taking a breath after the macro
death march.)&lt;/em&gt;
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-4364269957306101684?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/4364269957306101684/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=4364269957306101684' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/4364269957306101684'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/4364269957306101684'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/07/stemming-part-13-steps-for-processing.html' title='Stemming, Part 13: The Steps for Processing'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-1289308672942674742</id><published>2008-07-28T18:10:00.001-05:00</published><updated>2008-07-28T18:10:44.983-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Stemming, Part 12: Macros and Moving on</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;All right, let’s review what we’ve been doing for the last few days.
&lt;/p&gt;

&lt;h2&gt;What Have We Done?&lt;/h2&gt;
&lt;p&gt;In a real sense, we’ve been writing a program that writes programs. A macro is
   a function that accepts Clojure code—represented as lists, vectors, symbols,
   strings, numbers, and other Clojure data types—and returns another list,
   vector, etc., that represents Clojure code. A macro can change its input any
   way it wants to, but of course the output needs to be valid Clojure code.
&lt;/p&gt;
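&lt;p&gt;To make that concrete, here is a minimal sketch of a macro as a plain
   list-to-list transformation. The name &lt;code&gt;swap-args&lt;/code&gt; is hypothetical and isn’t
   part of the stemmer; it only reorders the pieces of the form it receives.
&lt;/p&gt;

```clojure
;; Hypothetical example: the macro receives the unevaluated list (- 1 10)
;; and returns a new list with the two arguments swapped.
(defmacro swap-args [form]
  (let [[f a b] form]
    (list f b a)))

;; macroexpand-1 shows the code the macro returns, without running it.
(macroexpand-1 '(swap-args (- 1 10)))
;; => (- 10 1)

;; Evaluating the macro call runs the returned code.
(swap-args (- 1 10))
;; => 9
```

&lt;p&gt;The input and the output are both ordinary Clojure lists, which is why the
   usual sequence functions are enough to write the transformation.
&lt;/p&gt;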

&lt;h2&gt;Creating Abstractions&lt;/h2&gt;
&lt;p&gt;All computer languages allow us to make abstractions. We can abstract the data
   “given-name,” “surname,” “age,” and “address” into a class (or structure in
   Clojure) called “Person.” We can abstract the action of printing “Greeting”
   and a person’s name into the function “hello-world.” All languages in common
   use let us do that. Object-oriented languages also give us other abstractions
   for associating actions with data.
&lt;/p&gt;
&lt;p&gt;But lisps, including Clojure, go a step beyond that. They allow us to create
   abstractions of the syntax of the language. This allows us to control when and
   how expressions get executed.
&lt;/p&gt;
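&lt;p&gt;As a small illustration of controlling when an expression runs, compare a
   function and a macro that look the same from the outside. Both names here
   (&lt;code&gt;unless-fn&lt;/code&gt; and &lt;code&gt;unless-macro&lt;/code&gt;) are made up for this sketch.
&lt;/p&gt;

```clojure
;; Counts how many times the side-effecting expression actually runs.
(def calls (atom 0))

;; A function: its arguments are evaluated before the body runs.
(defn unless-fn [test then]
  (if test nil then))

;; A macro: the `then` form is only placed in the branch that needs it.
(defmacro unless-macro [test then]
  `(if ~test nil ~then))

(unless-fn true (swap! calls inc))     ; swap! runs even though the result is discarded
(unless-macro true (swap! calls inc))  ; swap! is never evaluated

@calls
;; => 1
```

&lt;p&gt;Only the function call bumped the counter. The macro decided that the
   expression should not be executed at all.
&lt;/p&gt;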
&lt;p&gt;Does this allow us to do things we cannot do in other languages? Strictly
   speaking, no.
&lt;/p&gt;
&lt;p&gt;But it allows us to do things more concisely and readably than we otherwise
   could. And because we can abstract more things away and worry less about them,
   we can build and understand programs that are more complex than we would
   otherwise be able to comprehend.
&lt;/p&gt;

&lt;h2&gt;Domain Specific Languages&lt;/h2&gt;
&lt;p&gt;Another benefit of macros is that they allow us to create a mini-language
   within Clojure. If you look at Clojure’s source code, most of it is written in
   Clojure. &lt;em&gt;And you have that same power.&lt;/em&gt; Once written, your code isn’t really
   that different from the code that makes up Clojure’s core.
&lt;/p&gt;
&lt;p&gt;You can use that ability to extend Clojure, to build up from its foundation,
   to create your own language ideally suited to the work you need to do.
&lt;/p&gt;
&lt;p&gt;You can do this in other languages, of course. It’s called creating &lt;a href="http://en.wikipedia.org/wiki/Domain_specific_languages"&gt;domain
specific languages&lt;/a&gt;. But in lisp, it is natural in a way it is in no
   other language.
&lt;/p&gt;

&lt;h2&gt;The Spiderman Clause&lt;/h2&gt;
&lt;p&gt;Of course, as Peter Parker was told, “With great power comes great
   responsibility.” Like operator overloading or multiple inheritance, macros
   make it easy for you to shoot yourself in the foot. If you’re not careful, you
   can create something that is so far from Clojure that others have trouble
   understanding it, and you will have trouble getting your bearings when you
   come back to it in six months.
&lt;/p&gt;
&lt;p&gt;Remember: a few macros go a long way.
&lt;/p&gt;
&lt;p&gt;In the next posting, we’ll finally start on the steps involved in stemming.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-1289308672942674742?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/1289308672942674742/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=1289308672942674742' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/1289308672942674742'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/1289308672942674742'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/07/stemming-part-12-macros-and-moving-on.html' title='Stemming, Part 12: Macros and Moving on'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-849926136049353238</id><published>2008-07-25T16:09:00.001-05:00</published><updated>2008-07-25T16:09:35.621-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Stemming, Part 11: the cond-ends? Macro</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;In the last posting, we created the macro &lt;code&gt;if-ends?&lt;/code&gt;, which was a combination
   of &lt;code&gt;let&lt;/code&gt;, &lt;code&gt;if&lt;/code&gt; and the function &lt;code&gt;ends?&lt;/code&gt;. Today, we’ll look at another similar
   macro, one that combines &lt;code&gt;let&lt;/code&gt;, &lt;code&gt;cond&lt;/code&gt;, and &lt;code&gt;ends?&lt;/code&gt;.
&lt;/p&gt;

&lt;h2&gt;The Purpose of cond-ends?&lt;/h2&gt;
&lt;p&gt;The problem with &lt;code&gt;if-ends?&lt;/code&gt; is that sometimes we’ll want to look for twenty or
   more different endings on a single stemmer and handle each case differently.
   We can do that with &lt;code&gt;if-ends?&lt;/code&gt;, but it would be a lot cleaner if we had a
   construct like &lt;code&gt;cond&lt;/code&gt;.
&lt;/p&gt;

&lt;h2&gt;Input Patterns&lt;/h2&gt;
&lt;p&gt;For example, we might want to do something like this:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cond-ends?&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;names&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;ing&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;do &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;ends with -ing&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;ed&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;do &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;ends with -ed&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;s&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;do &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;ends with -s&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="no"&gt;:else&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;do &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;no ending&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;First, notice that this starts out like &lt;code&gt;if-ends?&lt;/code&gt;. It has a variable (&lt;code&gt;st&lt;/code&gt;)
   that the new stemmer will be assigned to, followed by a stemmer instance to
   test.
&lt;/p&gt;
&lt;p&gt;These are followed by a series of ending/expression pairs. The endings are
   tested one by one until a match is found. Once it is, the expression following
   that ending is executed with the variable &lt;code&gt;st&lt;/code&gt; available to it and assigned to
   the new stemmer, with the index changed to reflect the ending. The return
   value of that expression is the value of the entire &lt;code&gt;cond-ends?&lt;/code&gt; expression.
&lt;/p&gt;
&lt;p&gt;Finally, if no ending matches, the &lt;code&gt;cond-ends?&lt;/code&gt; should return the original
   stemmer unchanged. Or, optionally, the last ending can be the keyword &lt;code&gt;:else&lt;/code&gt;.
   If no ending before that point matched, the expression following &lt;code&gt;:else&lt;/code&gt; is
   executed and its value returned.
&lt;/p&gt;
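&lt;p&gt;The matching rule itself (the first ending that matches wins, otherwise
   fall through to the default) can be sketched with an ordinary function over
   plain strings. This is only an illustration of the rule: &lt;code&gt;first-ending&lt;/code&gt; is a
   hypothetical name, it uses &lt;code&gt;clojure.string/ends-with?&lt;/code&gt; rather than the
   stemmer’s &lt;code&gt;ends?&lt;/code&gt;, and it ignores the stemmer structure and its index.
&lt;/p&gt;

```clojure
(require '[clojure.string :as string])

;; Hypothetical helper: return the first ending that `word` ends with,
;; or :else if none of them match. Endings are tried in the order given.
(defn first-ending [word endings]
  (or (first (filter (fn [ending] (string/ends-with? word ending)) endings))
      :else))

(first-ending "naming" ["ing" "ed" "s"])  ; => "ing"
(first-ending "names"  ["ing" "ed" "s"])  ; => "s"
(first-ending "name"   ["ing" "ed" "s"])  ; => :else
```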
&lt;p&gt;Before we start on the expected output for this macro, let’s look at a short
   expression that uses the default value if no ending matches:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cond-ends?&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;names&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;ing&amp;quot;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;do &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;ends with -ing&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Of course, that would be better expressed with &lt;code&gt;if-ends?&lt;/code&gt;, but using it as an
   input pattern will help us build the default ending.
&lt;/p&gt;

&lt;h2&gt;Output Patterns&lt;/h2&gt;
&lt;p&gt;Here’s what the macro should construct from the input above.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="c1"&gt;;(cond-ends? st (make-stemmer &amp;quot;names&amp;quot;)&lt;/span&gt;
&lt;span class="c1"&gt;;  &amp;quot;ing&amp;quot; (do (println &amp;quot;ends with -ing&amp;quot;) st)&lt;/span&gt;
&lt;span class="c1"&gt;;  &amp;quot;ed&amp;quot; (do (println &amp;quot;ends with -ed&amp;quot;) st)&lt;/span&gt;
&lt;span class="c1"&gt;;  &amp;quot;s&amp;quot; (do (println &amp;quot;ends with -s&amp;quot;) st)&lt;/span&gt;
&lt;span class="c1"&gt;;  :else (do (println &amp;quot;no ending&amp;quot;) st))&lt;/span&gt;
&lt;span class="c1"&gt;;; Set up the environment&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;names&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
      &lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;subword&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
  &lt;span class="c1"&gt;;; -ing&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pos? &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sc"&gt;\i&lt;/span&gt; &lt;span class="sc"&gt;\n&lt;/span&gt; &lt;span class="sc"&gt;\g&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;do &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;ends with -ing&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
      &lt;span class="c1"&gt;;; -ed&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pos? &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sc"&gt;\e&lt;/span&gt; &lt;span class="sc"&gt;\d&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
          &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;do &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;ends with -ed&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
          &lt;span class="c1"&gt;;; -s&lt;/span&gt;
          &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pos? &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sc"&gt;\s&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
              &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;do &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;ends with -s&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
              &lt;span class="c1"&gt;;; :else&lt;/span&gt;
              &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;do &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;no ending&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)))))))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;That’s scary.
&lt;/p&gt;
&lt;p&gt;Actually, it’s not that bad. Notice that after the first let, it is a series
   of &lt;code&gt;(let ... (if ...))&lt;/code&gt; constructions. (I’ve added comments to separate each
   of these sections.) Each of those three &lt;code&gt;let&lt;/code&gt;/&lt;code&gt;if&lt;/code&gt; constructions is almost
   identical, until the &lt;code&gt;:else&lt;/code&gt; expression. Also, notice that I’ve not only
   pre-computed the vector itself, but also the length of the vector, and I’ve
   inserted them into the result example as their literal values. (Pre-computing
   the length of the vector was suggested by Holger Durer in the comments to the
   last posting.)
&lt;/p&gt;
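&lt;p&gt;Here is one way pre-computing might look in isolation. This is a sketch,
   not the macro this series builds: &lt;code&gt;ends-with-literal?&lt;/code&gt; is a hypothetical name,
   and it assumes the ending is always given as a literal string so that the
   macro can work on it at expansion time.
&lt;/p&gt;

```clojure
;; Hypothetical macro: `ending` must be a literal string. Its character
;; vector and its length are computed once, when the macro is expanded,
;; and are spliced into the generated code as literal values.
(defmacro ends-with-literal? [word ending]
  (let [end-vec (vec ending)
        n (count end-vec)]
    `(let [w# ~word
           j# (- (count w#) ~n)]
       (and (pos? j#) (= (subvec w# j#) ~end-vec)))))

(ends-with-literal? (vec "naming") "ing")  ; => true
(ends-with-literal? (vec "names") "ing")   ; => false
(ends-with-literal? (vec "ing") "ing")     ; => false, nothing is left before the ending
```

&lt;p&gt;The generated code contains &lt;code&gt;3&lt;/code&gt; and &lt;code&gt;[\i \n \g]&lt;/code&gt; directly, just as in the
   output pattern above, so nothing about the ending has to be recomputed at
   run time.
&lt;/p&gt;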
&lt;p&gt;Before we start writing the macro, let’s look at the expected output for the
   second example above.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="c1"&gt;;(cond-ends? st (make-stemmer &amp;quot;names&amp;quot;)&lt;/span&gt;
&lt;span class="c1"&gt;;  &amp;quot;ing&amp;quot; (do (println &amp;quot;ends with -ing&amp;quot;) st))&lt;/span&gt;
&lt;span class="c1"&gt;;; Set up the environment&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;names&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
      &lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;subword&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
  &lt;span class="c1"&gt;;; -ing&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pos? &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sc"&gt;\i&lt;/span&gt; &lt;span class="sc"&gt;\n&lt;/span&gt; &lt;span class="sc"&gt;\g&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;do &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;ends with -ing&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
      &lt;span class="c1"&gt;;; default :else&lt;/span&gt;
      &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;One of the things to notice about these output expressions is that a lot of
   their state can be computed once and used throughout all the
   conditionals in the &lt;code&gt;cond-ends?&lt;/code&gt; body. That observation will be useful later.
&lt;/p&gt;

&lt;h2&gt;The Macro&lt;/h2&gt;
&lt;p&gt;Looking at the output above, we really have three tasks: set up the
   environment for the &lt;code&gt;cond-ends?&lt;/code&gt; in the outer &lt;code&gt;let&lt;/code&gt;; for each test generate a
   &lt;code&gt;let&lt;/code&gt;/&lt;code&gt;if&lt;/code&gt; construction; and finally generate an &lt;code&gt;:else&lt;/code&gt; value, maybe using
   the default.
&lt;/p&gt;
&lt;p&gt;One way to tackle this problem would be to separate it into two macros:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;&lt;p&gt;One macro sets up the environment and calls another macro to build the
   test expressions;
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;The second macro builds one &lt;code&gt;let&lt;/code&gt;/&lt;code&gt;if&lt;/code&gt; pair, similar to what &lt;code&gt;if-ends?&lt;/code&gt;
   did, for one test expression, then it calls itself on the rest of the test
   expressions. It does this until it reaches the last test expression and
   that test is &lt;code&gt;:else&lt;/code&gt; or until it is out of test expressions.
&lt;/p&gt;

 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Let’s start with the second macro. That will allow us to see what the first
   macro needs to provide to the test macro, and it has to be defined first
   anyway, because that’s the way Clojure likes things.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generating the Tests&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;The test macro will be overloaded. Either it will take one ending/expression
   pair, or it will take many. If there is only one, the ending could be an
   &lt;code&gt;:else&lt;/code&gt;, or it could be the last ending to test for, and the &lt;code&gt;:else&lt;/code&gt; will be a
   default. So the first, shorter version of this will return one of these two
   constructions (taken from above):
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;  &lt;span class="c1"&gt;;;; With :else&lt;/span&gt;
              &lt;span class="c1"&gt;;; :else&lt;/span&gt;
              &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;do &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;no ending&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

  &lt;span class="c1"&gt;;;; With a test + the default :else&lt;/span&gt;
  &lt;span class="c1"&gt;;; -ing&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pos? &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sc"&gt;\i&lt;/span&gt; &lt;span class="sc"&gt;\n&lt;/span&gt; &lt;span class="sc"&gt;\g&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;st&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;do &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;ends with -ing&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;st&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
      &lt;span class="c1"&gt;;; default :else&lt;/span&gt;
      &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Another point to make about the test expressions is that—as an astute
   commenter pointed out (thanks, Holger)—there is a lot of duplication between
   the two versions of the macro that builds the expressions. We can capture that
   duplication in a function and just call that function with the difference
   between the two structures. The function will then create the macro’s output
   structure.
&lt;/p&gt;
&lt;p&gt;All right. Let’s roll up our sleeves. The helper function will be called
   &lt;code&gt;make-cond-ends-test&lt;/code&gt;. It takes the variable symbol, the stemmer variable, the
   word variable, an ending string, a true expression, and a false expression.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;make-cond-ends-test&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;var &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;word&lt;/span&gt; &lt;span class="nv"&gt;end&lt;/span&gt; &lt;span class="nv"&gt;true-expr&lt;/span&gt; &lt;span class="nv"&gt;false-expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;vend&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;vec &lt;/span&gt;&lt;span class="nv"&gt;end&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;~word&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;~&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;vend&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pos? &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="nv"&gt;~word&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;~vend&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;~var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;~stemmer&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
           &lt;span class="nv"&gt;~true-expr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
         &lt;span class="nv"&gt;~false-expr&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Notice that this function requires several parameters beyond the ending
   (&lt;code&gt;end&lt;/code&gt;) and the result expressions (&lt;code&gt;true-expr&lt;/code&gt; and &lt;code&gt;false-expr&lt;/code&gt;). It also takes
   &lt;code&gt;var&lt;/code&gt;, &lt;code&gt;stemmer&lt;/code&gt;, and &lt;code&gt;word&lt;/code&gt;. But shouldn’t the stemmer and the subword already
   have been stored in variables by the first, outer &lt;code&gt;let&lt;/code&gt;? That’s
   right. In this function, these parameters &lt;em&gt;are&lt;/em&gt; the variables that those values were
   stored in, represented here as symbols. Throughout this helper function,
   &lt;code&gt;~stemmer&lt;/code&gt; inserts the symbol that the stemmer was bound to in
   the outer &lt;code&gt;let&lt;/code&gt;, and &lt;code&gt;~word&lt;/code&gt; does the same for the subword.
&lt;/p&gt;
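&lt;p&gt;To see that these parameters really are just symbols, here is a small sketch
   that calls the helper function directly. The names &lt;code&gt;my-stemmer&lt;/code&gt;, &lt;code&gt;my-word&lt;/code&gt;,
   and &lt;code&gt;run-test&lt;/code&gt; are made up for this illustration, and the stemmer is assumed
   to be a plain map. In the real macro, the symbols come from the outer &lt;code&gt;let&lt;/code&gt;.
&lt;/p&gt;

```clojure
;; Copied from the post so that this sketch is self-contained.
(defn make-cond-ends-test
  [var stemmer word end true-expr false-expr]
  (let [vend (vec end)]
    `(let [j# (- (count ~word) ~(count vend))]
       (if (and (pos? j#) (= (subvec ~word j#) ~vend))
         (let [~var (assoc ~stemmer :index (dec j#))]
           ~true-expr)
         ~false-expr))))

;; Hypothetical symbols standing in for the ones cond-ends? would generate.
(def form
  (make-cond-ends-test 'st 'my-stemmer 'my-word "ing" '(:index st) :no-match))

;; The result is plain data: a list that starts with the symbol for let.
(first form)
;; clojure.core/let

;; Hypothetical helper: bind the two symbols by hand, then evaluate the form.
;; The stemmer here is assumed to be a simple map.
(defn run-test
  [word-vec]
  (eval (list 'let ['my-stemmer {:word word-vec} 'my-word word-vec] form)))

(run-test (vec "hopping"))
;; 3
(run-test (vec "hop"))
;; :no-match
```

&lt;p&gt;Because the helper is an ordinary function, its output is ordinary data that
   we can inspect or evaluate before any macro gets involved.
&lt;/p&gt;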
&lt;p&gt;Now we can define the secondary macro.  We’ll call it &lt;code&gt;cond-ends-helper&lt;/code&gt;,
   because I’m not feeling creative. Its main purpose is to examine the input
   and call &lt;code&gt;make-cond-ends-test&lt;/code&gt; with the appropriate false expression. The
   first overload for it will handle the last expression by testing for &lt;code&gt;:else&lt;/code&gt;
   and deciding whether to use the default &lt;code&gt;:else&lt;/code&gt;:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defmacro &lt;/span&gt;&lt;span class="nv"&gt;cond-ends-helper&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This helps cond-ends? by processing the&lt;/span&gt;
&lt;span class="s"&gt;  test-exprs pairs in cond-ends?&amp;#39;s environment.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="k"&gt;var &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;word&lt;/span&gt; &lt;span class="nv"&gt;end&lt;/span&gt; &lt;span class="nv"&gt;true-expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="nv"&gt;end&lt;/span&gt; &lt;span class="no"&gt;:else&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;~var&lt;/span&gt; &lt;span class="nv"&gt;~stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="nv"&gt;~true-expr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-cond-ends-test&lt;/span&gt; &lt;span class="k"&gt;var &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;word&lt;/span&gt; &lt;span class="nv"&gt;end&lt;/span&gt; &lt;span class="nv"&gt;true-expr&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
  &lt;span class="c1"&gt;;; The overload will go here ...&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;What happens here? First, at compile time (before the quasi-quote or call to
   &lt;code&gt;make-cond-ends-test&lt;/code&gt;), it tests whether the ending is &lt;code&gt;:else&lt;/code&gt;. If it is, it
   just outputs the shorter version of the ending. If it isn’t &lt;code&gt;:else&lt;/code&gt;, it calls
   the helper function, with all the same parameters plus the false expression as
   a data structure. In this case, the false expression is the stemmer variable.
&lt;/p&gt;
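&lt;p&gt;One way to check this compile-time decision is to expand the macro by hand
   with &lt;code&gt;macroexpand-1&lt;/code&gt;. This sketch repeats the helper function and defines only
   the five-argument arity of the macro. The symbols &lt;code&gt;st&lt;/code&gt;, &lt;code&gt;s&lt;/code&gt;, and &lt;code&gt;w&lt;/code&gt; are
   placeholders chosen for the example, not names from the stemmer.
&lt;/p&gt;

```clojure
;; Copied from the post so that this sketch is self-contained.
(defn make-cond-ends-test
  [var stemmer word end true-expr false-expr]
  (let [vend (vec end)]
    `(let [j# (- (count ~word) ~(count vend))]
       (if (and (pos? j#) (= (subvec ~word j#) ~vend))
         (let [~var (assoc ~stemmer :index (dec j#))]
           ~true-expr)
         ~false-expr))))

;; Only the five-argument arity; the variadic overload is left out here.
(defmacro cond-ends-helper
  [var stemmer word end true-expr]
  (if (= end :else)
    `(let [~var ~stemmer]
       ~true-expr)
    (make-cond-ends-test var stemmer word end true-expr stemmer)))

;; With :else, the macro emits only the short let form.
(def else-form
  (macroexpand-1 '(cond-ends-helper st s w :else (:index st))))
;; else-form is (clojure.core/let [st s] (:index st))

;; With a real ending, it emits the let/if test. The false branch is the
;; stemmer symbol itself, which is the default :else.
(def test-form
  (macroexpand-1 '(cond-ends-helper st s w "ed" (:index st))))

(last (last test-form))
;; s
```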
&lt;p&gt;If your head is starting to hurt, take a moment. Unless you have programmed in
   Lisp before, this isn’t like any programming you’ve ever done. We’re writing a
   program to write programs.  Like any meta-activity, it can make your brain
   explode.  Unfortunately, this is a particularly convoluted macro, because it
   has these two levels of macro, plus a function. Don’t worry if you don’t quite
   get it at first. There’s nothing wrong with copying and pasting the macro’s
   code into your source file and moving on. Start building macros like &lt;code&gt;debug&lt;/code&gt;
   and &lt;code&gt;if-ends?&lt;/code&gt;, and when you’re comfortable there, come back and read this
   through again. It will make a lot more sense.
&lt;/p&gt;
&lt;p&gt;Now, if you’re still with me, here’s the overloaded version of the helper
   macro. This takes an ending/expression pair and a list of the rest of the
   pairs in the expression. It then calls &lt;code&gt;make-cond-ends-test&lt;/code&gt; with a new
   false expression: a call back to &lt;code&gt;cond-ends-helper&lt;/code&gt; on the remaining pairs.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defmacro &lt;/span&gt;&lt;span class="nv"&gt;cond-ends-helper&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This helps cond-ends? by processing the&lt;/span&gt;
&lt;span class="s"&gt;  test-exprs pairs in cond-ends?&amp;#39;s environment.&amp;quot;&lt;/span&gt;
  &lt;span class="c1"&gt;;; The earlier overload goes here....&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="k"&gt;var &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;word&lt;/span&gt; &lt;span class="nv"&gt;end&lt;/span&gt; &lt;span class="nv"&gt;true-expr&lt;/span&gt; &lt;span class="nv"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nv"&gt;more&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-cond-ends-test&lt;/span&gt;
      &lt;span class="k"&gt;var &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;word&lt;/span&gt; &lt;span class="nv"&gt;end&lt;/span&gt; &lt;span class="nv"&gt;true-expr&lt;/span&gt;
      &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cond-ends-helper&lt;/span&gt; &lt;span class="nv"&gt;~var&lt;/span&gt; &lt;span class="nv"&gt;~stemmer&lt;/span&gt; &lt;span class="nv"&gt;~word&lt;/span&gt; &lt;span class="nv"&gt;~@more&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In some ways, this part of &lt;code&gt;cond-ends-helper&lt;/code&gt; is much simpler. There are just
   a few things to comment on.
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;&lt;p&gt;In the parameter list, &lt;code&gt;&amp;amp; more&lt;/code&gt; collects the values of the rest of the
   parameters and stores them in a variable I’ve named &lt;code&gt;more&lt;/code&gt;. This works
   with functions too.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;The false expression that this passes to &lt;code&gt;make-cond-ends-test&lt;/code&gt; begins with
   a quasi-quote (back quote). This just generates an expression to pass as
   the false expression. This expression gets inserted into the expression
   that &lt;code&gt;make-cond-ends-test&lt;/code&gt; creates.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;This quasi-quoted expression, when finally evaluated, is a macro call to
   the macro that created it. It starts the process all over again, passing
   the new macro expansion the values of &lt;code&gt;var&lt;/code&gt;, &lt;code&gt;stemmer&lt;/code&gt;, and &lt;code&gt;word&lt;/code&gt;
   unchanged, and splicing all of the rest of the parameters that haven’t
   been handled yet into the expression using &lt;code&gt;~@more&lt;/code&gt;.
&lt;/p&gt;

 &lt;/li&gt;
&lt;/ol&gt;
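&lt;p&gt;The last two points can be tried out without defining any macro at all. The
   sketch below uses a hypothetical function, &lt;code&gt;next-helper-call&lt;/code&gt;, that builds the
   same quasi-quoted expression from a list of leftover pairs. The function name
   and the placeholder symbols &lt;code&gt;st&lt;/code&gt;, &lt;code&gt;s&lt;/code&gt;, and &lt;code&gt;w&lt;/code&gt; are assumptions made for
   the example.
&lt;/p&gt;

```clojure
;; Hypothetical function: builds the false expression that the overloaded
;; cond-ends-helper passes to make-cond-ends-test.
(defn next-helper-call
  [var stemmer word more]
  `(cond-ends-helper ~var ~stemmer ~word ~@more))

(def call-form
  (next-helper-call 'st 's 'w '("ing" (:index st) :else nil)))

;; The leftover pairs are spliced in as separate arguments, not as one list.
(rest call-form)
;; (st s w "ing" (:index st) :else nil)

;; The head of the list is the helper macro's own name, qualified with the
;; current namespace by the quasi-quote.
(name (first call-form))
;; "cond-ends-helper"
```

&lt;p&gt;Nothing is expanded at this point. The list is only data, and the compiler
   expands it later, when it meets it in the false branch of the generated &lt;code&gt;if&lt;/code&gt;.
&lt;/p&gt;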
&lt;p&gt;&lt;strong&gt;Creating the Environment&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;So we know that &lt;code&gt;cond-ends?&lt;/code&gt; sets up the environment by creating an
   expression like this:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="c1"&gt;;; Set up the environment&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;names&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
      &lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;subword&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
  &lt;span class="c1"&gt;; ...&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And it replaces the ellipsis with a call to the macro &lt;code&gt;cond-ends-helper&lt;/code&gt;, passing
   the caller’s variable name and the generated symbols &lt;code&gt;stemmer#&lt;/code&gt; and &lt;code&gt;word#&lt;/code&gt;. Put like that, it almost
   writes itself.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defmacro &lt;/span&gt;&lt;span class="nv"&gt;cond-ends?&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This is the same as a stacked series of if-ends?.&lt;/span&gt;
&lt;span class="s"&gt;  This just sets up the environment for cond-ends-helper.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;var &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nv"&gt;test-exprs&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;~stemmer,&lt;/span&gt; &lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;subword&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cond-ends-helper&lt;/span&gt; &lt;span class="nv"&gt;~var&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;~@test-exprs&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Notice that we use &lt;code&gt;&amp;amp; test-exprs&lt;/code&gt; to collect all of the test expressions, and
   we use &lt;code&gt;~@test-exprs&lt;/code&gt; to splice them into the &lt;code&gt;cond-ends-helper&lt;/code&gt; call.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Summary&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;For posterity’s sake, here is all the code for the &lt;code&gt;cond-ends?&lt;/code&gt; macro, in one
   place:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;make-cond-ends-test&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;var &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;word&lt;/span&gt; &lt;span class="nv"&gt;end&lt;/span&gt; &lt;span class="nv"&gt;true-expr&lt;/span&gt; &lt;span class="nv"&gt;false-expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;vend&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;vec &lt;/span&gt;&lt;span class="nv"&gt;end&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;~word&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;~&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;vend&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pos? &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="nv"&gt;~word&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;~vend&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;~var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;~stemmer&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
           &lt;span class="nv"&gt;~true-expr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
         &lt;span class="nv"&gt;~false-expr&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defmacro &lt;/span&gt;&lt;span class="nv"&gt;cond-ends-helper&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This helps cond-ends? by processing the&lt;/span&gt;
&lt;span class="s"&gt;  test-exprs pairs in cond-ends?&amp;#39;s environment.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="k"&gt;var &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;word&lt;/span&gt; &lt;span class="nv"&gt;end&lt;/span&gt; &lt;span class="nv"&gt;true-expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="nv"&gt;end&lt;/span&gt; &lt;span class="no"&gt;:else&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;~var&lt;/span&gt; &lt;span class="nv"&gt;~stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="nv"&gt;~true-expr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-cond-ends-test&lt;/span&gt; &lt;span class="k"&gt;var &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;word&lt;/span&gt; &lt;span class="nv"&gt;end&lt;/span&gt;
                          &lt;span class="nv"&gt;true-expr&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="k"&gt;var &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;word&lt;/span&gt; &lt;span class="nv"&gt;end&lt;/span&gt; &lt;span class="nv"&gt;true-expr&lt;/span&gt; &lt;span class="nv"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nv"&gt;more&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-cond-ends-test&lt;/span&gt;
      &lt;span class="k"&gt;var &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;word&lt;/span&gt; &lt;span class="nv"&gt;end&lt;/span&gt; &lt;span class="nv"&gt;true-expr&lt;/span&gt;
      &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cond-ends-helper&lt;/span&gt; &lt;span class="nv"&gt;~var&lt;/span&gt; &lt;span class="nv"&gt;~stemmer&lt;/span&gt; &lt;span class="nv"&gt;~word&lt;/span&gt; &lt;span class="nv"&gt;~@more&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defmacro &lt;/span&gt;&lt;span class="nv"&gt;cond-ends?&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This is the same as a stacked series of if-ends?.&lt;/span&gt;
&lt;span class="s"&gt;  This just sets up the environment for cond-ends-helper.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;var &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nv"&gt;test-exprs&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;~stemmer,&lt;/span&gt; &lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;subword&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cond-ends-helper&lt;/span&gt; &lt;span class="nv"&gt;~var&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;~@test-exprs&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In the next posting, I’ll wrap up macros by pointing out how they’re used
   and why.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-849926136049353238?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/849926136049353238/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=849926136049353238' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/849926136049353238'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/849926136049353238'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/07/stemming-part-11-cond-ends-macro.html' title='Stemming, Part 11: the cond-ends? Macro'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-8453412012795265372</id><published>2008-07-22T20:07:00.001-05:00</published><updated>2008-07-22T20:07:19.384-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Stemming, Part 10: The if-ends? Macro</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;For the last two postings, we’ve looked at macros. Today, we’ll finally put
   them to use for the Porter Stemmer. Remember that we’re looking to simplify
   the &lt;code&gt;ends?&lt;/code&gt; function, so let’s start by reviewing it.
&lt;/p&gt;

&lt;h2&gt;The ends? Function&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;ends?&lt;/code&gt; function tests whether a stemmer ends with a given suffix. If it
   does, it moves the working &lt;code&gt;:index&lt;/code&gt; to exclude that ending. If it does not, it
   leaves the &lt;code&gt;:index&lt;/code&gt; where it was originally. Moreover, code that calls &lt;code&gt;ends?&lt;/code&gt;
   needs to know whether that ending was found or not. If it was, it may want to
   take special action; if it wasn’t, it may want to test the stemmer against
   another ending. Currently, &lt;code&gt;ends?&lt;/code&gt; returns a vector containing the resulting
   stemmer (which may be the original one) and a flag indicating whether the
   ending was found or not:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;ends?&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;true if the word ends with s.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;s&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;subword&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
        &lt;span class="nv"&gt;sv&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;vec &lt;/span&gt;&lt;span class="nv"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
        &lt;span class="nv"&gt;j&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;sv&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pos? &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;sv&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
      &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="nv"&gt;true&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;false&lt;/span&gt;&lt;span class="p"&gt;])))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;A quick explanation:
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     &lt;code&gt;(subword stemmer)&lt;/code&gt; is the part of the word &lt;em&gt;before&lt;/em&gt; the working index;
 &lt;/li&gt;

 &lt;li&gt;
     &lt;code&gt;(vec s)&lt;/code&gt; turns the ending into a vector, if it isn’t already; and
 &lt;/li&gt;

 &lt;li&gt;
     &lt;code&gt;(- (count word) (count sv))&lt;/code&gt; is the index where the ending starts in the
  word. It will be the new value of the working index if the ending is found.
 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;First, this tests whether the ending is shorter than the word and whether the
   word ends with the ending. If both are true, it returns a new stemmer with the
   working index (&lt;code&gt;:index&lt;/code&gt;) set to the position just before the ending in the
   word, paired with &lt;code&gt;true&lt;/code&gt;. Otherwise, it returns the original stemmer paired
   with &lt;code&gt;false&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;Let’s see what calling it looks like.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;ends?&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;walking&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ing&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sc"&gt;\w&lt;/span&gt; &lt;span class="sc"&gt;\a&lt;/span&gt; &lt;span class="sc"&gt;\l&lt;/span&gt; &lt;span class="sc"&gt;\k&lt;/span&gt; &lt;span class="sc"&gt;\i&lt;/span&gt; &lt;span class="sc"&gt;\n&lt;/span&gt; &lt;span class="sc"&gt;\g&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nv"&gt;true&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;ends?&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;walking&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;s&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sc"&gt;\w&lt;/span&gt; &lt;span class="sc"&gt;\a&lt;/span&gt; &lt;span class="sc"&gt;\l&lt;/span&gt; &lt;span class="sc"&gt;\k&lt;/span&gt; &lt;span class="sc"&gt;\i&lt;/span&gt; &lt;span class="sc"&gt;\n&lt;/span&gt; &lt;span class="sc"&gt;\g&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nv"&gt;false&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h2&gt;The Purpose of if-ends?&lt;/h2&gt;
&lt;p&gt;We want a macro in place of the &lt;code&gt;ends?&lt;/code&gt; function because returning the
   extra value, the flag indicating whether the ending was actually found, is
   messy: every caller has to unpack the vector and then branch on the flag.
   Instead, we want something like an &lt;code&gt;if&lt;/code&gt; expression. We want a conditional
   that tests whether a stemmer ends with a suffix, but instead of returning true
   or false, it evaluates a different expression, depending on the result of the
   test.
&lt;/p&gt;
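&lt;p&gt;To make the messiness concrete, here is a sketch of a caller that tries two
   endings in turn using the vector-returning &lt;code&gt;ends?&lt;/code&gt;. The &lt;code&gt;make-stemmer&lt;/code&gt; and
   &lt;code&gt;subword&lt;/code&gt; definitions are stand-ins inferred from the REPL output above, not
   necessarily the versions from the earlier postings, and &lt;code&gt;strip-ing-or-ed&lt;/code&gt; is
   a hypothetical name used only for illustration.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Assumed stand-ins for helpers defined in earlier postings.
(defn make-stemmer [word]
  {:word (vec word), :index (dec (count word))})

(defn subword [stemmer]
  (subvec (:word stemmer) 0 (inc (:index stemmer))))

;; The function version of ends?, as reviewed above.
(defn ends? [stemmer s]
  (let [word (subword stemmer),
        sv (vec s),
        j (- (count word) (count sv))]
    (if (and (pos? j) (= (subvec word j) sv))
      [(assoc stemmer :index (dec j)) true]
      [stemmer false])))

;; Hypothetical caller: try &amp;quot;ing&amp;quot; first, then fall back to &amp;quot;ed&amp;quot;.
;; Each attempt has to destructure the vector and branch on the flag.
(defn strip-ing-or-ed [stemmer]
  (let [[st found?] (ends? stemmer &amp;quot;ing&amp;quot;)]
    (if found?
      st
      (let [[st found?] (ends? stemmer &amp;quot;ed&amp;quot;)]
        (if found?
          st
          stemmer)))))

(:index (strip-ing-or-ed (make-stemmer &amp;quot;walking&amp;quot;)))
(:index (strip-ing-or-ed (make-stemmer &amp;quot;walked&amp;quot;)))
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Every additional ending adds another destructuring &lt;code&gt;let&lt;/code&gt; and another nested
   &lt;code&gt;if&lt;/code&gt;, which is the boilerplate a conditional form can absorb.
&lt;/p&gt;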
&lt;p&gt;Another wrinkle is that we’ll want to have access to the newly created stemmer
   in the true- or false-expressions. In this sense, &lt;code&gt;if-ends?&lt;/code&gt; is like a &lt;code&gt;let&lt;/code&gt;.
   We need to provide a variable that will be used in the bodies of the true- and
   false-expressions. The altered stemmer will be assigned to it.
&lt;/p&gt;

&lt;h2&gt;Sample Inputs&lt;/h2&gt;
&lt;p&gt;What would calling this look like? Here are some examples:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;if-ends?&lt;/span&gt; &lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;names&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;s&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;YES: had a plural suffix&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;NO : never had a plural suffix&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;if-ends?&lt;/span&gt; &lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;walking&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ing&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;with -ing:&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;without -ing:&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;if-ends?&lt;/span&gt; &lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;walked&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ed&amp;quot;&lt;/span&gt;
  &lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;From these expressions, we can see that &lt;code&gt;if-ends?&lt;/code&gt; will need to take four or
   five parameters:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;&lt;p&gt;The variable that the resulting stemmer will be assigned to and that can
   be used in the body expressions of the &lt;code&gt;if-ends?&lt;/code&gt;. In all of these
   expressions, that variable is &lt;code&gt;x&lt;/code&gt;.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;The next parameter is the input stemmer. This could be an expression, as
   it is here, so we’ll need to evaluate this first and store it in a
   variable.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;Next is the ending.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;Next is the expression to evaluate if the stemmer does end with the
   ending.  In it, the altered stemmer is assigned to the variable name given
   in the first parameter.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;Finally, the last parameter is the expression to evaluate if the stemmer
   does not end with the ending. Just as with &lt;code&gt;if&lt;/code&gt;, this branch is optional.
   If it is not given, whatever expression the &lt;code&gt;if-ends?&lt;/code&gt; creates should return the
   original, unchanged stemmer.
&lt;/p&gt;

 &lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;Sample Outputs&lt;/h2&gt;
&lt;p&gt;Given these constraints, what would the output for each of these &lt;code&gt;if-ends?&lt;/code&gt;
   look like?
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="c1"&gt;;(if-ends? x (make-stemmer &amp;quot;names&amp;quot;) &amp;quot;s&amp;quot;&lt;/span&gt;
&lt;span class="c1"&gt;;  (println x &amp;quot;YES: had a plural suffix&amp;quot;)&lt;/span&gt;
&lt;span class="c1"&gt;;  (println x &amp;quot;NO : never had a plural suffix&amp;quot;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer-var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;names&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
      &lt;span class="nv"&gt;end-var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;vec &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;s&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
      &lt;span class="nv"&gt;word-var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;subword&lt;/span&gt; &lt;span class="nv"&gt;stemmer-var&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
      &lt;span class="nv"&gt;j-var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;word-var&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;end-var&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pos? &lt;/span&gt;&lt;span class="nv"&gt;j-var&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="nv"&gt;word-var&lt;/span&gt; &lt;span class="nv"&gt;j-var&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;end-var&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;stemmer-var&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;j-var&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;YES: had a plural suffix&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="nv"&gt;stemmer-var&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;NO : never had a plural suffix&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;(Here I’ve used &lt;code&gt;-var&lt;/code&gt; in place of the &lt;code&gt;#&lt;/code&gt; ending that marks auto-generated symbols in a macro.)
&lt;/p&gt;
&lt;p&gt;First, this assigns the input stemmer and ending to temporary variables.
   Second, it assigns &lt;code&gt;word-var&lt;/code&gt; and &lt;code&gt;j-var&lt;/code&gt; to values, just as &lt;code&gt;ends?&lt;/code&gt; does.
   Third, it performs the test. In either branch, it uses a &lt;code&gt;let&lt;/code&gt; to assign
   the stemmer, new or old, to the variable &lt;code&gt;x&lt;/code&gt;. Finally, it inserts the body of the
   branch expression into each branch.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="c1"&gt;;(if-ends? x (make-stemmer &amp;quot;walking&amp;quot;) &amp;quot;ing&amp;quot;&lt;/span&gt;
&lt;span class="c1"&gt;;  (println &amp;quot;with -ing:&amp;quot; x)&lt;/span&gt;
&lt;span class="c1"&gt;;  (println &amp;quot;without -ing:&amp;quot; x))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer-var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;walking&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
      &lt;span class="nv"&gt;end-var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;vec &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;ing&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
      &lt;span class="nv"&gt;word-var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;subword&lt;/span&gt; &lt;span class="nv"&gt;stemmer-var&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
      &lt;span class="nv"&gt;j-var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;word-var&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;end-var&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pos? &lt;/span&gt;&lt;span class="nv"&gt;j-var&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="nv"&gt;word-var&lt;/span&gt; &lt;span class="nv"&gt;j-var&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;end-var&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;stemmer-var&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;j-var&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;with -ing:&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="nv"&gt;stemmer-var&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;without -ing:&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;

&lt;span class="c1"&gt;;(if-ends? x (make-stemmer &amp;quot;walked&amp;quot;) &amp;quot;ed&amp;quot;&lt;/span&gt;
&lt;span class="c1"&gt;;  x)&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer-var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;walked&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
      &lt;span class="nv"&gt;end-var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;vec &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;ed&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
      &lt;span class="nv"&gt;word-var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;subword&lt;/span&gt; &lt;span class="nv"&gt;stemmer-var&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
      &lt;span class="nv"&gt;j-var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;word-var&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;end-var&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pos? &lt;/span&gt;&lt;span class="nv"&gt;j-var&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="nv"&gt;word-var&lt;/span&gt; &lt;span class="nv"&gt;j-var&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;end-var&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;stemmer-var&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;j-var&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
      &lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;stemmer-var&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The last output shows what it should return when the optional false-expression
   is omitted.
&lt;/p&gt;

&lt;h2&gt;Writing the Macro&lt;/h2&gt;
&lt;p&gt;Actually, let’s start writing the macro with the shorter version. You can
   overload a macro’s parameter list just as you do with functions. Just enclose
   each set of parameters and body in its own set of parentheses.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defmacro &lt;/span&gt;&lt;span class="nv"&gt;if-ends?&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;Instead of the function ends?, I&amp;#39;m using this:&lt;/span&gt;
&lt;span class="s"&gt;  (if-ends? x (make-stemmer \&amp;quot;names\&amp;quot;) [\\s]&lt;/span&gt;
&lt;span class="s"&gt;            (println x \&amp;quot;no longer has a plural suffix\&amp;quot;)&lt;/span&gt;
&lt;span class="s"&gt;            (println x \&amp;quot;never had a plural suffix\&amp;quot;))&lt;/span&gt;
&lt;span class="s"&gt;  &amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="k"&gt;var &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;end&lt;/span&gt; &lt;span class="nv"&gt;true-expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;~stemmer,&lt;/span&gt;
          &lt;span class="nv"&gt;end&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;vec &lt;/span&gt;&lt;span class="nv"&gt;~end&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
          &lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;subword&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
          &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;end&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pos? &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;end&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;~var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
          &lt;span class="nv"&gt;~true-expr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
  &lt;span class="c1"&gt;;; ... the full version of if-ends? goes here&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This is a fairly straightforward transformation. The &lt;code&gt;-var&lt;/code&gt; variables are
   replaced with their &lt;code&gt;#&lt;/code&gt; counterparts. Otherwise, the parameters are inserted
   into the macro result with tilde expansion (&lt;code&gt;~expression&lt;/code&gt;).
&lt;/p&gt;
&lt;p&gt;Here is the full version of &lt;code&gt;if-ends?&lt;/code&gt;:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defmacro &lt;/span&gt;&lt;span class="nv"&gt;if-ends?&lt;/span&gt;
  &lt;span class="c1"&gt;;; ... the short version of if-ends? goes here&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="k"&gt;var &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;end&lt;/span&gt; &lt;span class="nv"&gt;true-expr&lt;/span&gt; &lt;span class="nv"&gt;false-expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;~stemmer,&lt;/span&gt;
          &lt;span class="nv"&gt;end&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;vec &lt;/span&gt;&lt;span class="nv"&gt;~end&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
          &lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;subword&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
          &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;end&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pos? &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;end&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;~var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
          &lt;span class="nv"&gt;~true-expr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;~var&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
          &lt;span class="nv"&gt;~false-expr&lt;/span&gt;&lt;span class="p"&gt;)))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
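&lt;p&gt;One way to check a macro like this is to expand it with &lt;code&gt;macroexpand-1&lt;/code&gt;,
   compare the result with the hand-written sample outputs, and then evaluate a
   few calls. The sketch below puts both arities together so it can be pasted
   into a REPL. The &lt;code&gt;make-stemmer&lt;/code&gt; and &lt;code&gt;subword&lt;/code&gt; definitions are assumed
   stand-ins inferred from the REPL output shown earlier, and the
   &lt;code&gt;:no-match&lt;/code&gt; keyword is only there for illustration.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Assumed stand-ins for helpers defined in earlier postings.
(defn make-stemmer [word]
  {:word (vec word), :index (dec (count word))})

(defn subword [stemmer]
  (subvec (:word stemmer) 0 (inc (:index stemmer))))

(defmacro if-ends?
  ;; Short version: no false-expression, so the original stemmer is returned.
  ([var stemmer end true-expr]
   `(let [stemmer# ~stemmer,
          end# (vec ~end),
          word# (subword stemmer#),
          j# (- (count word#) (count end#))]
      (if (and (pos? j#) (= (subvec word# j#) end#))
        (let [~var (assoc stemmer# :index (dec j#))]
          ~true-expr)
        stemmer#)))
  ;; Full version: both branches see the stemmer under the caller&amp;#39;s name.
  ([var stemmer end true-expr false-expr]
   `(let [stemmer# ~stemmer,
          end# (vec ~end),
          word# (subword stemmer#),
          j# (- (count word#) (count end#))]
      (if (and (pos? j#) (= (subvec word# j#) end#))
        (let [~var (assoc stemmer# :index (dec j#))]
          ~true-expr)
        (let [~var stemmer#]
          ~false-expr)))))

;; Inspect the generated code. The names ending in # come back as
;; unique generated symbols, so the exact names vary from run to run.
(macroexpand-1 &amp;#39;(if-ends? x (make-stemmer &amp;quot;walked&amp;quot;) &amp;quot;ed&amp;quot; x))

;; Evaluate a few calls.
(if-ends? x (make-stemmer &amp;quot;walking&amp;quot;) &amp;quot;ing&amp;quot; (:index x) :no-match) ; 3
(if-ends? x (make-stemmer &amp;quot;walking&amp;quot;) &amp;quot;s&amp;quot; (:index x) :no-match)   ; :no-match
(:index (if-ends? x (make-stemmer &amp;quot;walked&amp;quot;) &amp;quot;s&amp;quot; x))              ; 5
&lt;/pre&gt;&lt;/div&gt;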

&lt;h2&gt;Compile-Time Optimization&lt;/h2&gt;
&lt;p&gt;We’ve got all the pieces in place, but we’re still not quite finished. Let’s
   make one assumption about our input, an assumption that will hold as we use
   this macro in the stemmer code. This will allow us to make an optimization
   that also leverages one of the advantages of macros: they are expanded at
   compile-time.
&lt;/p&gt;
&lt;p&gt;The assumption is this: the ending will always be a string literal. That is,
   we will always start &lt;code&gt;if-ends?&lt;/code&gt; like this:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;if-ends?&lt;/span&gt; &lt;span class="nv"&gt;v&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;ing&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;"ing"&lt;/code&gt; will get passed to the macro as a string literal, and we can convert
   it to a vector when the macro is expanded &lt;em&gt;at compile time&lt;/em&gt;. In the macros
   that I’ve outlined above, the ending is converted to a vector in the outer
   &lt;code&gt;let&lt;/code&gt;, so the conversion happens every time the expanded code runs. But we
   could also move it to a new &lt;code&gt;let&lt;/code&gt; &lt;em&gt;outside the macro expansion
expression&lt;/em&gt;. The value of that pre-converted ending is then inserted into the macro
   result.
&lt;/p&gt;
&lt;p&gt;With this optimization, the entirety of &lt;code&gt;if-ends?&lt;/code&gt; looks like this:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defmacro &lt;/span&gt;&lt;span class="nv"&gt;if-ends?&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;Instead of the function ends?, I&amp;#39;m using this:&lt;/span&gt;
&lt;span class="s"&gt;  (if-ends? x (make-stemmer \&amp;quot;names\&amp;quot;) [\\s]&lt;/span&gt;
&lt;span class="s"&gt;            (println x \&amp;quot;no longer has a plural suffix\&amp;quot;)&lt;/span&gt;
&lt;span class="s"&gt;            (println x \&amp;quot;never had a plural suffix\&amp;quot;))&lt;/span&gt;
&lt;span class="s"&gt;  &amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="k"&gt;var &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;end&lt;/span&gt; &lt;span class="nv"&gt;true-expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;vend&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;vec &lt;/span&gt;&lt;span class="nv"&gt;end&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
     &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;~stemmer,&lt;/span&gt;
            &lt;span class="nv"&gt;end&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;~vend,&lt;/span&gt;
            &lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;subword&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;end&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pos? &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;end&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
          &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;~var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
            &lt;span class="nv"&gt;~true-expr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="k"&gt;var &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;end&lt;/span&gt; &lt;span class="nv"&gt;true-expr&lt;/span&gt; &lt;span class="nv"&gt;false-expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;vend&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;vec &lt;/span&gt;&lt;span class="nv"&gt;end&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
     &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;~stemmer,&lt;/span&gt;
            &lt;span class="nv"&gt;end&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;~vend,&lt;/span&gt;
            &lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;subword&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;end&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pos? &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;end&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
          &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;~var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
            &lt;span class="nv"&gt;~true-expr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;~var&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="nv"&gt;~false-expr&lt;/span&gt;&lt;span class="p"&gt;))))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here, in both versions of &lt;code&gt;if-ends?&lt;/code&gt;, we first convert the ending to a vector
   and assign it to &lt;code&gt;vend&lt;/code&gt;. Inside the macro expansion expression, &lt;code&gt;vend&lt;/code&gt; is what
   gets assigned to &lt;code&gt;end#&lt;/code&gt;. Clojure calculates &lt;code&gt;vend&lt;/code&gt; when the macro is expanded
   at compile time. The time saved by the optimization won’t be much, but it does
   illustrate another benefit of macros.
&lt;/p&gt;
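&lt;p&gt;To see the difference between expansion time and run time in isolation, here
   is a minimal sketch. The macro &lt;code&gt;ending-length&lt;/code&gt; is a hypothetical helper
   made up for this illustration, not part of the stemmer, and the output shown
   assumes a current Clojure release, where the core namespace is called
   &lt;code&gt;clojure.core&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Hypothetical helper, for illustration only.
(defmacro ending-length
  [end]
  ;; This let runs while the macro is being expanded.
  (let [vend (vec end)]
    ;; Only the finished vector is inserted into the expansion.
    `(count ~vend)))

(macroexpand-1 '(ending-length "ing"))
;; =&amp;gt; (clojure.core/count [\i \n \g])
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The call to &lt;code&gt;vec&lt;/code&gt; does not appear in the expansion at all. This only works
   because the ending is a literal: if it were a variable, the macro would be handed
   the variable’s name, not its value.
&lt;/p&gt;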
&lt;p&gt;This simplifies things a lot, but it doesn’t cover all the use cases for
   &lt;code&gt;ends?&lt;/code&gt;. We will also want to use it in a nested set of twenty or more
   comparisons. That is less like an &lt;code&gt;if&lt;/code&gt; and more like a &lt;code&gt;cond&lt;/code&gt; expression. In
   the next posting, we’ll create &lt;code&gt;cond-ends?&lt;/code&gt;.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-8453412012795265372?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/8453412012795265372/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=8453412012795265372' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/8453412012795265372'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/8453412012795265372'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/07/stemming-part-10-if-ends-macro.html' title='Stemming, Part 10: The if-ends? Macro'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-2206479100113489621</id><published>2008-07-21T19:14:00.001-05:00</published><updated>2008-07-21T19:14:25.222-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Stemming, Part 9: Writing Macros</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;In the last posting, I introduced macros. Today, I’m going to point out some
   of the pitfalls in writing macros and suggest a method for writing them that
   helps you avoid those potential problems.
&lt;/p&gt;
&lt;p&gt;To illustrate these issues, let’s take a look at a simplified, incorrect
   version of the &lt;code&gt;debug&lt;/code&gt; macro from the last posting:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defmacro &lt;/span&gt;&lt;span class="nv"&gt;debug&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;do&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;~expr&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;=&amp;gt;&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;~expr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="nv"&gt;~expr&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Superficially, this version appears to work correctly. At least, if you try it
   out, it seems to work:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;debug&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h2&gt;Recomputing Values&lt;/h2&gt;
&lt;p&gt;The first problem with &lt;code&gt;debug&lt;/code&gt; as given above can best be illustrated by
   passing it an expression that has a side-effect, such as a &lt;code&gt;println&lt;/code&gt;
   expression.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;debug&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;Count Me!&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nv"&gt;Count&lt;/span&gt; &lt;span class="nv"&gt;Me!&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="nv"&gt;Count&lt;/span&gt; &lt;span class="nv"&gt;Me!&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;nil&lt;/span&gt;
&lt;span class="nv"&gt;Count&lt;/span&gt; &lt;span class="nv"&gt;Me!&lt;/span&gt;
&lt;span class="nv"&gt;nil&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;“Count Me!” gets printed twice. That’s because the expression’s getting
   executed twice: once when its value is printed and once when its value is
   being returned.  In this case, that might be fine, but imagine if the value
   took a long time to compute or if it deleted a file or inserted a new value
   into the database. The macro would take twice as long to run; it would cause
   an error when it tried again to delete a file it had just deleted; or it would
   insert a value into the database twice. None of those are good options.
&lt;/p&gt;
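&lt;p&gt;A counter makes the double evaluation easy to measure. This is only a sketch:
   &lt;code&gt;debug-twice&lt;/code&gt; is a made-up name for the incorrect macro above, and
   &lt;code&gt;calls&lt;/code&gt; is a hypothetical counter held in an atom, which assumes a
   Clojure release that has atoms.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; The incorrect macro from above, under a hypothetical name.
(defmacro debug-twice
  [expr]
  `(do
     (println '~expr "=&amp;gt;" ~expr)
     ~expr))

;; Hypothetical counter: each evaluation of the expression bumps it.
(def calls (atom 0))

(debug-twice (swap! calls inc))

;; After a single call to debug-twice, the counter holds 2, not 1.
@calls
&lt;/pre&gt;&lt;/div&gt;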
&lt;p&gt;Instead of having the value computed twice, we need it to be computed only
   once. We can get that by adding a &lt;code&gt;let&lt;/code&gt; to the macro that computes the
   expression’s value once and stores it in a variable.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defmacro &lt;/span&gt;&lt;span class="nv"&gt;debug&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;value&lt;/span&gt; &lt;span class="nv"&gt;~expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;~expr&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;=&amp;gt;&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now if we try it again:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;debug&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;Count Me!&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nv"&gt;java&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;lang&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;Exception:&lt;/span&gt; &lt;span class="nv"&gt;Can&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;t&lt;/span&gt; &lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="nv"&gt;qualified&lt;/span&gt; &lt;span class="nv"&gt;name:&lt;/span&gt; &lt;span class="nv"&gt;user/value&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Hmm. That introduces the second type of error.
&lt;/p&gt;

&lt;h2&gt;Variable Capture&lt;/h2&gt;
&lt;p&gt;Basically, Clojure was complaining because the backquote attaches a namespace
   (&lt;code&gt;user&lt;/code&gt;, in this case) to the plain symbols in the expression that the
   macro returns. But a variable created by &lt;code&gt;let&lt;/code&gt; has to have a name that
   exists outside of any namespace. To fix it, we change &lt;code&gt;value&lt;/code&gt; to use the
   &lt;code&gt;#&lt;/code&gt; variable notation I mentioned in the last posting.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defmacro &lt;/span&gt;&lt;span class="nv"&gt;debug&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;~expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;~expr&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;=&amp;gt;&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Let’s try it out.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;debug&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;Count Me!&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nv"&gt;Count&lt;/span&gt; &lt;span class="nv"&gt;Me!&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="nv"&gt;Count&lt;/span&gt; &lt;span class="nv"&gt;Me!&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;nil&lt;/span&gt;
&lt;span class="nv"&gt;nil&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;There. That fixed both problems. Now it only evaluates the expression once,
   and it won’t clobber any variables from the surrounding context. Now it’s
   correct.
&lt;/p&gt;
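&lt;p&gt;Both claims can be checked at the REPL. The sketch below repeats the corrected
   macro so that it can be run on its own. The local variable named
   &lt;code&gt;value&lt;/code&gt; in the last expression is a made-up example of a surrounding
   variable that a careless macro could have clobbered.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; A plain symbol inside a backquote gets a namespace attached...
(namespace `value)    ; the current namespace, "user" at a fresh REPL
;; ...but a symbol ending in # becomes a fresh name with no namespace.
(namespace `value#)   ; nil

(defmacro debug
  [expr]
  `(let [value# ~expr]
     (println '~expr "=&amp;gt;" value#)
     value#))

;; The caller's own value is left alone.
(let [value 10]
  (debug (* value 2)))
;; prints: (* value 2) =&amp;gt; 20
&lt;/pre&gt;&lt;/div&gt;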

&lt;h2&gt;Writing Macros&lt;/h2&gt;
&lt;p&gt;So how do we go about writing macros that are correct?
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Input Examples&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;First, create several examples of what you want the input to look like. For
   the debug macro, that might look like:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;debug&lt;/span&gt; &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;debug&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;debug&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:name&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Make sure to include many different types of input parameters. In the &lt;code&gt;debug&lt;/code&gt;
   macro, the first problem—recomputing values—won’t appear if you always pass it
   a variable or another expression that doesn’t require computation. This is
   just good development and testing: test thoroughly.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Output Examples&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;Next, for each input expression, write what you want the corresponding output
   expression to be:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="c1"&gt;;(debug name)&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;name&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;=&amp;gt;&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;;(debug (+ 1 2))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;=&amp;gt;&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;;(debug (:name person))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:name&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:name&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;=&amp;gt;&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Write the Macro&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;Now that we have the pairs of input and output, we can write a macro that
   converts the first expression into the second. The result is what we had
   in the last posting:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defmacro &lt;/span&gt;&lt;span class="nv"&gt;debug&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;~expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;~expr&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;=&amp;gt;&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;(The &lt;code&gt;flush&lt;/code&gt; function just makes sure that the output is written out
   immediately. Otherwise, it might sit in an output buffer for a while.)
&lt;/p&gt;
&lt;p&gt;As you write the macro, think about the two common macro mistakes I outlined
   above.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Debugging&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;Of course, occasionally you might make a mistake. You can see what expression
   a macro produces using the &lt;code&gt;macroexpand-1&lt;/code&gt; function:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;macroexpand-1 &lt;/span&gt;&lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;debug&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;clojure/let&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;value__3063&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;clojure/println&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;quote &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;=&amp;gt;&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;value__3063&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;clojure/flush&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nv"&gt;value__3063&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;(I’ve broken the lines up to make them more readable.)
&lt;/p&gt;
&lt;p&gt;Examining the output of &lt;code&gt;macroexpand-1&lt;/code&gt; should make it clear where the problem
   is.
&lt;/p&gt;
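&lt;p&gt;For example, expanding the version of &lt;code&gt;debug&lt;/code&gt; that failed earlier shows
   the qualified name directly. This is a sketch: &lt;code&gt;debug-broken&lt;/code&gt; is a
   made-up name for that version, and the output assumes a current Clojure release
   and the &lt;code&gt;user&lt;/code&gt; namespace. Again, I’ve broken the lines up to make them
   more readable.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; The version without the # notation, under a hypothetical name.
;; Defining it works; the error only appears when the expansion is compiled.
(defmacro debug-broken
  [expr]
  `(let [value ~expr]
     (println '~expr "=&amp;gt;" value)
     value))

(macroexpand-1 '(debug-broken (+ 1 2)))
;; =&amp;gt; (clojure.core/let [user/value (+ 1 2)]
;;      (clojure.core/println (quote (+ 1 2)) "=&amp;gt;" user/value)
;;      user/value)
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code&gt;let&lt;/code&gt; is trying to create a variable called &lt;code&gt;user/value&lt;/code&gt;,
   which is exactly what the error message complained about.
&lt;/p&gt;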
&lt;p&gt;In the next posting, we’ll create a macro to use in place of the &lt;code&gt;ends?&lt;/code&gt;
   function.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-2206479100113489621?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/2206479100113489621/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=2206479100113489621' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/2206479100113489621'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/2206479100113489621'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/07/stemming-part-9-writing-macros.html' title='Stemming, Part 9: Writing Macros'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-5037078518105589834</id><published>2008-07-18T16:42:00.001-05:00</published><updated>2008-07-18T16:42:34.779-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Stemming, Part 8: Macros</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;At the end of the last posting, I mentioned that we could probably simplify
   the &lt;code&gt;ends?&lt;/code&gt; function. Currently, it returns a vector with the result and a
   boolean indicating whether it had been changed.
&lt;/p&gt;

&lt;h2&gt;Data as Code&lt;/h2&gt;
&lt;p&gt;Compared to most programming languages, Clojure (and Lisp) programs are odd:
   functions and expressions look like lists; parameter lists for functions or
   &lt;code&gt;let&lt;/code&gt; expressions look like vectors; variables look like symbols.
&lt;/p&gt;
&lt;p&gt;In fact, those programming constructs start their lives &lt;em&gt;as&lt;/em&gt; those data
   structures. Remember that our interactive environment is called a REPL—a
   &lt;strong&gt;r&lt;/strong&gt;ead, &lt;strong&gt;e&lt;/strong&gt;valuate, and &lt;strong&gt;p&lt;/strong&gt;rint &lt;strong&gt;l&lt;/strong&gt;oop. In the read stage, a function
   or expression is read in &lt;em&gt;as a list, vector, symbol, string, or number&lt;/em&gt;. It’s
   not actually compiled into computer code until the evaluate stage. Before
   then, it’s all data.
&lt;/p&gt;
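&lt;p&gt;Here is a minimal sketch of that idea, using the built-in functions
   &lt;code&gt;read-string&lt;/code&gt; and &lt;code&gt;eval&lt;/code&gt; to run the read and evaluate stages by hand. The
   name &lt;code&gt;form&lt;/code&gt; is just one I made up for this illustration.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Read stage: turn text into plain data without evaluating it.
(def form (read-string &amp;quot;(+ 1 2)&amp;quot;))

(list? form)   ; true: the expression is an ordinary list
(first form)   ; the symbol +, not the addition function
(count form)   ; 3

;; Evaluate stage: only now is the data treated as code.
(eval form)    ; 3
&lt;/pre&gt;&lt;/div&gt;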
&lt;p&gt;This characteristic of Clojure, that programs are data, is called
   &lt;a href="http://en.wikipedia.org/wiki/Homoiconicity"&gt;homoiconicity&lt;/a&gt;. Part of what makes Lisp so powerful and special is that it
   introduces a new step in the evaluation process. In most computer languages,
   the computer reads in an expression and immediately turns it into an action or
   byte code or whatever. In Lisp, the computer reads in the expression as a
   native Lisp data structure, and before it evaluates that data, it gives the
   programmer a chance to say how it should be transformed.
&lt;/p&gt;
&lt;p&gt;This makes more sense with an example. Consider the Clojure built-in &lt;code&gt;when&lt;/code&gt;.
   You’ll recall that &lt;code&gt;when&lt;/code&gt; is just like &lt;code&gt;if&lt;/code&gt;, except that the true expression
   can be a sequence of expressions and the false expression doesn’t exist.
   Consider this &lt;code&gt;when&lt;/code&gt; expression:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;when &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;nth &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;abc&amp;quot;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="sc"&gt;\b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;b&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
         &lt;span class="sc"&gt;\b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;b&lt;/span&gt;
&lt;span class="sc"&gt;\b&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;(&lt;code&gt;println&lt;/code&gt; prints its arguments, separated by spaces, and followed by a new
   line.)
&lt;/p&gt;
&lt;p&gt;The first “b” is what is printed. The second, “\b”, is the value returned by the &lt;code&gt;when&lt;/code&gt;
   expression, from the third line of the expression.
&lt;/p&gt;
&lt;p&gt;Of course, that &lt;code&gt;when&lt;/code&gt; expression is the same as this &lt;code&gt;if&lt;/code&gt; expression:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;nth &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;abc&amp;quot;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="sc"&gt;\b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;do&lt;/span&gt;
           &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;b&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
           &lt;span class="sc"&gt;\b&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nv"&gt;b&lt;/span&gt;
&lt;span class="sc"&gt;\b&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;(&lt;code&gt;do&lt;/code&gt; just executes a list of expressions and returns the value of the last
   one. It’s a way of taking a sequence of expressions and using them in places
   that only want one expression.)
&lt;/p&gt;
&lt;p&gt;In fact, the &lt;code&gt;when&lt;/code&gt; expression &lt;em&gt;is&lt;/em&gt; exactly equal to the &lt;code&gt;if&lt;/code&gt; expression.
   Clojure reads the &lt;code&gt;when&lt;/code&gt; expression and transforms it into the &lt;code&gt;if&lt;/code&gt; expression
   before evaluating it.
&lt;/p&gt;
&lt;p&gt;How is this done and how can we create our own code transformations?
&lt;/p&gt;
&lt;p&gt;Macros.
&lt;/p&gt;

&lt;h2&gt;Code Transformations&lt;/h2&gt;
&lt;p&gt;A macro is a code transformation. If you know XSLT, a macro transforms a
   Clojure expression in almost the same way as XSLT transforms an XML document.
   (If you don’t know XSLT, forget I mentioned it.)
&lt;/p&gt;
&lt;p&gt;A macro definition looks a lot like a function definition. It has a macro name
   and a vector of parameters. But macros start with the symbol &lt;code&gt;defmacro&lt;/code&gt;, and
   the body of the macro has a template that builds the output expression.
&lt;/p&gt;
&lt;p&gt;Here are the tools you use to create the template.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Quote&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;A quote tells Clojure to return the next expression as data, exactly the way it
   is written, not as an expression or as a macro template. For example, the
   first &lt;code&gt;let&lt;/code&gt; evaluates &lt;code&gt;x&lt;/code&gt; before returning, but the quote in the second
   &lt;code&gt;let&lt;/code&gt; means that Clojure returns the symbol &lt;code&gt;x&lt;/code&gt;, not the value that the
   variable &lt;code&gt;x&lt;/code&gt; evaluates to.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;x&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Quasi-Quote&lt;/strong&gt;
&lt;/p&gt;
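&lt;p&gt;A quasi-quote (Clojure’s documentation calls it a syntax-quote) is a
   back-quote character. It indicates that the expression it quotes should be
   scanned for the constructs below, and the list or expression generated should
   be returned. It also qualifies symbols with their namespaces, which is why the
   output below shows &lt;code&gt;clojure/let&lt;/code&gt; and &lt;code&gt;user/x&lt;/code&gt;. Apart from that, it’s a lot
   like a regular quote.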
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;clojure/let&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;user/x&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="nv"&gt;user/x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Tilde&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;Inside a macro template, a tilde forces Clojure to evaluate the expression
   that follows it and insert its value into that place in the result.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
         &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;y&lt;/span&gt; &lt;span class="nv"&gt;~x&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="nv"&gt;y&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;clojure/let&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;user/y&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="nv"&gt;user/y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In this, the outer &lt;code&gt;let&lt;/code&gt; sets &lt;code&gt;x&lt;/code&gt; to 2. Inside that &lt;code&gt;let&lt;/code&gt;, it defines a macro
   template, which returns a list that looks like another &lt;code&gt;let&lt;/code&gt; expression.
   Inside the inner &lt;code&gt;let&lt;/code&gt; it inserts the value of &lt;code&gt;x&lt;/code&gt; from the outer let.
&lt;/p&gt;
&lt;p&gt;In many ways, macro templates are shortcuts for longer expressions that build
   lists and vectors. For example, the last expression is nearly the same as this
   expression, which uses functions to build the output expression. The only
   difference is that the symbols in its output are not namespace-qualified:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list &lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;let&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;vector &lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;y&lt;/span&gt; &lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;y&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;y&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="nv"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Tilde-At&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;The tilde-at sequence is a variation of tilde that evaluates its argument,
   which should return a sequence, and inserts the elements from that sequence
   into a list in the macro template.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
         &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="nv"&gt;~@&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;map &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;y&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;* &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="nv"&gt;y&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;done&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;clojure/println&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;quote &lt;/span&gt;&lt;span class="nv"&gt;user/done&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In this expression, the elements from &lt;code&gt;x&lt;/code&gt; are run through &lt;code&gt;map&lt;/code&gt;, which doubles
   their values. The output of &lt;code&gt;map&lt;/code&gt; is inserted into the list that starts with
   &lt;code&gt;println&lt;/code&gt; and ends with &lt;code&gt;'done&lt;/code&gt;. Notice that &lt;code&gt;'done&lt;/code&gt; in the output is still
   quoted. It was read in as data—&lt;code&gt;'done&lt;/code&gt;—and output as the same data.
&lt;/p&gt;
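&lt;p&gt;With tilde and tilde-at in hand, we can sketch how a transformation like the
   &lt;code&gt;when&lt;/code&gt;-to-&lt;code&gt;if&lt;/code&gt; one from earlier might be written. This is only an
   illustration under the hypothetical name &lt;code&gt;my-when&lt;/code&gt;; I’m not claiming it is
   how Clojure itself defines &lt;code&gt;when&lt;/code&gt;. The &lt;code&gt;&amp;amp; body&lt;/code&gt; in the parameter vector
   collects all of the remaining arguments into one sequence.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; my-when is a hypothetical name; this is not the source of the built-in when.
(defmacro my-when
  [test &amp;amp; body]
  `(if ~test (do ~@body)))

;; Expand once, without evaluating, to see the generated expression.
(macroexpand-1 &amp;#39;(my-when (= 1 1) :a :b))
;; (if (= 1 1) (do :a :b))

;; Calling it evaluates the body and returns the last value.
(my-when (= 1 1) :a :b)   ; :b
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The tilde drops the test expression into the &lt;code&gt;if&lt;/code&gt;, and the tilde-at splices
   every body expression into the &lt;code&gt;do&lt;/code&gt;.
&lt;/p&gt;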
&lt;p&gt;&lt;strong&gt;gensym and #&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;Sometimes when you’re creating a macro template, you want to create a
   variable. One danger is that it could shadow a variable in the surrounding
   code.
&lt;/p&gt;
&lt;p&gt;For example, in this code,
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
         &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
           &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;clojure/let&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;user/x&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;user/x&lt;/span&gt; &lt;span class="nv"&gt;user/x&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;what do the two &lt;code&gt;x&lt;/code&gt; variables refer to on the third line? The &lt;code&gt;x&lt;/code&gt; from the
   outer &lt;code&gt;let&lt;/code&gt; (42), or the &lt;code&gt;x&lt;/code&gt; from the inner &lt;code&gt;let&lt;/code&gt; (13)? In this case, they
   both refer to the inner &lt;code&gt;x&lt;/code&gt;. Occasionally, you may do this on purpose, but
   usually it’s a mistake.
&lt;/p&gt;
&lt;p&gt;To avoid this mistake, append a &lt;code&gt;#&lt;/code&gt; to the end of any variable name you
   declare in a macro template. Clojure then generates a unique name for that
   variable, so it is never used elsewhere and cannot clobber any other
   variable name.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
         &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
           &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;clojure/let&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x__1888&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;user/x&lt;/span&gt; &lt;span class="nv"&gt;x__1888&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In this case, the two &lt;code&gt;x&lt;/code&gt; variables in the final line now refer to two
   different variables. The first refers to the variable in the outer &lt;code&gt;let&lt;/code&gt;, and
   the second refers to the variable in the inner &lt;code&gt;let&lt;/code&gt;, in the macro template.
   You can see that it’s different because Clojure has removed the &lt;code&gt;#&lt;/code&gt; character
   and replaced it with &lt;code&gt;__1888&lt;/code&gt;. (That number always changes. If you run this
   code, it will be different for you.) Moreover, Clojure guarantees that that
   variable name will never be used again during that run.
&lt;/p&gt;
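&lt;p&gt;The &lt;code&gt;#&lt;/code&gt; suffix works along the same lines as calling the built-in function
   &lt;code&gt;gensym&lt;/code&gt; yourself. Here is a small sketch of both; the names &lt;code&gt;a&lt;/code&gt;, &lt;code&gt;b&lt;/code&gt;, and
   &lt;code&gt;form&lt;/code&gt; are made up for this example.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; gensym returns a fresh symbol each time, even for the same prefix.
(def a (gensym &amp;quot;x&amp;quot;))
(def b (gensym &amp;quot;x&amp;quot;))

(symbol? a)   ; true
(= a b)       ; false

;; Within one back-quoted template, every value# becomes the same symbol.
(def form `(let [value# 13] value#))

(= (first (second form)) (last form))   ; true
&lt;/pre&gt;&lt;/div&gt;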

&lt;h2&gt;A Quick Macro&lt;/h2&gt;
&lt;p&gt;So let’s write a quick macro, just to see what one looks like.
&lt;/p&gt;
&lt;p&gt;I usually develop with just a REPL and the text editor Vim, and if I need to
   debug anything, often I just put in print statements. In languages like
   Python, many of these statements look like this:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;x+4 =&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mf"&gt;4&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;counts[word] =&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;That’s a lot of duplication and typing, and aren’t computers supposed to get
   rid of that?
&lt;/p&gt;
&lt;p&gt;The problem is that we want to bypass the way the language normally evaluates
   expressions. We want the computer to print the first expression out, just like
   it would data; we want it to evaluate the second expression and print out the
   result.
&lt;/p&gt;
&lt;p&gt;Because we want to change Clojure’s normal evaluation rules, this is a perfect
   place for a macro. We will pass in an expression and have Clojure print both
   the expression and its value. As an added bonus, we can have it also return
   the value. Here’s what we want it to look like:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;=&amp;gt;&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
         &lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;55&lt;/span&gt;
&lt;span class="mi"&gt;55&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Let’s look at what this would look like as a macro named &lt;code&gt;debug&lt;/code&gt;:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defmacro &lt;/span&gt;&lt;span class="nv"&gt;debug&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt; &lt;span class="nv"&gt;~expr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;~expr&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;=&amp;gt;&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Let’s pick this apart.
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;(defmacro debug&lt;/code&gt;: Create a macro named &lt;code&gt;debug&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;[expr]&lt;/code&gt;: The macro takes one parameter, which will be called &lt;code&gt;expr&lt;/code&gt; in the
   body of this macro.
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;(let [value# ~expr]&lt;/code&gt;: Start the macro template with a &lt;code&gt;let&lt;/code&gt; expression. The
   &lt;code&gt;let&lt;/code&gt; will define one variable, named &lt;code&gt;value#&lt;/code&gt;, which will be assigned the
   value of evaluating &lt;code&gt;expr&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;(println '~expr "=&amp;gt;" value#)&lt;/code&gt;: Print the expression &lt;code&gt;expr&lt;/code&gt; as data (because
   it is quoted), an arrow, and the value of the variable &lt;code&gt;value#&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;(flush)&lt;/code&gt;: Flush the output buffer.
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;value#))&lt;/code&gt;: Return the value of the variable &lt;code&gt;value#&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;Before we actually try it, we can see what calling it would return using the
   function &lt;code&gt;macroexpand-1&lt;/code&gt;. This takes a quoted expression and, if it is a
   macro call, returns the expression that the macro generates, without
   evaluating it.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;macroexpand-1 &lt;/span&gt;&lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;debug&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;clojure/let&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;value__1892&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;clojure/println&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;quote &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;=&amp;gt;&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;value__1892&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;clojure/flush&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nv"&gt;value__1892&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;(I reformatted this to make it readable.) With a little mangling, we can see
   that the &lt;code&gt;debug&lt;/code&gt; macro will generate the expression
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;value__1892&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;println &lt;/span&gt;&lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;=&amp;gt;&amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;value__1892&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nv"&gt;value__1892&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;That is exactly what we want. If we then call the macro, we get this:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;debug&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;55&lt;/span&gt;
&lt;span class="mi"&gt;55&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;(The first line is the printed output of the expression. The final 55 is the
   result of the expression.)
&lt;/p&gt;
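&lt;p&gt;Because &lt;code&gt;debug&lt;/code&gt; returns the value as well as printing it, it can be wrapped
   around a sub-expression without changing the result of the surrounding code.
   Here is a short sketch, repeating the definition from above so the example
   stands on its own. The name &lt;code&gt;result&lt;/code&gt; is just for illustration.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;(defmacro debug
  [expr]
  `(let [value# ~expr]
     (println &amp;#39;~expr &amp;quot;=&amp;gt;&amp;quot; value#)
     (flush)
     value#))

;; Only the inner expression is reported; the multiplication still sees 3.
(def result (* 2 (debug (+ 1 2))))
;; prints: (+ 1 2) =&amp;gt; 3

result   ; 6
&lt;/pre&gt;&lt;/div&gt;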
&lt;p&gt;In the next posting, we’re going to look at the process of how to write macros
   and how to avoid the pitfalls and dangers of macro writing, and we’re going to
   write a number of macros for the Porter Stemmer.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-5037078518105589834?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/5037078518105589834/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=5037078518105589834' title='13 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/5037078518105589834'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/5037078518105589834'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/07/stemming-part-8-macros.html' title='Stemming, Part 8: Macros'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>13</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-1010273030250414572</id><published>2008-07-17T18:58:00.001-05:00</published><updated>2008-07-17T18:58:32.986-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Stemming, Part 7: More Functions</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;Today, we’ll define some more utilities for the Stemmer.
&lt;/p&gt;

&lt;h2&gt;Internal Functions&lt;/h2&gt;
&lt;p&gt;One facet of functions that these utilities will use is internal functions.
   Remember that we can declare a function literal using &lt;code&gt;fn&lt;/code&gt;. Also, we can
   declare variables using &lt;code&gt;let&lt;/code&gt;. Internal functions combine both of these to
   create functions that are only visible and usable within a function. For
   example, in the last posting we redefined &lt;code&gt;count-item&lt;/code&gt; to use &lt;code&gt;cond&lt;/code&gt;. We could
   also rewrite it to use an internal function instead of &lt;code&gt;loop&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;count-item&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;ci&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt; &lt;span class="nv"&gt;accum&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
             &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cond &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;seq &lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="nv"&gt;accum&lt;/span&gt;
                   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;peek &lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;recur &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="nv"&gt;accum&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                   &lt;span class="no"&gt;:else&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;recur &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;accum&lt;/span&gt;&lt;span class="p"&gt;)))]&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;ci&lt;/span&gt; &lt;span class="nv"&gt;sequence&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;user/count-item&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here the loop is redefined as a recursive function. It is bound to the local
   name &lt;code&gt;ci&lt;/code&gt;, which is called in the body of the &lt;code&gt;let&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;In this case, using an internal function instead of a &lt;code&gt;loop&lt;/code&gt; isn’t really a
   win, but if you have several internal loops that interact, defining them as
   internal functions can greatly clarify what the function does.
&lt;/p&gt;
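&lt;p&gt;To see the pattern in isolation, here is a minimal sketch. The names
   &lt;code&gt;count-vowels&lt;/code&gt; and &lt;code&gt;vowel-char?&lt;/code&gt; are made up for this illustration and are
   not part of the stemmer; the point is only that &lt;code&gt;vowel-char?&lt;/code&gt; exists
   nowhere outside the &lt;code&gt;let&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Hypothetical example, not part of the stemmer.
(defn count-vowels [word]
  ;; vowel-char? is an internal function: it is only visible inside this let.
  (let [vowel-char? (fn [c] (contains? #{\a \e \i \o \u} c))]
    (count (filter vowel-char? word))))

(count-vowels &amp;quot;stemming&amp;quot;) ; =&amp;gt; 2
&lt;/pre&gt;&lt;/div&gt;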

&lt;h2&gt;Stemmer Utilities&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;(m &lt;em&gt;stemmer&lt;/em&gt;)&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;One of the utilities that will use internal functions is &lt;code&gt;m&lt;/code&gt;. It counts how
   many consonant sequences are between the beginning of a word and the stemmer’s
   index. It uses three internal functions: &lt;code&gt;count-v&lt;/code&gt;, which skips letters until
   it reaches a vowel; &lt;code&gt;count-c&lt;/code&gt;, which skips letters until it reaches a
   consonant; and &lt;code&gt;count-cluster&lt;/code&gt;, which walks over the vowel and consonant
   clusters in the word, counting each consonant cluster that follows a vowel.
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;count-v&lt;/code&gt; and &lt;code&gt;count-c&lt;/code&gt; both return vectors. The first item in the vector
   indicates what the caller should do after the function returns:
   &lt;code&gt;:return&lt;/code&gt; means that the caller should just return the count immediately;
   &lt;code&gt;:break&lt;/code&gt; means that it should continue processing. The second and third
   items in the vector are the current consonant cluster count and the index of
   the letter that is currently being considered.
&lt;/p&gt;
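&lt;p&gt;The tagged-vector convention can be tried out on its own. This is only a
   sketch: &lt;code&gt;skip-until&lt;/code&gt; and &lt;code&gt;vowel-letter?&lt;/code&gt; are hypothetical names, and the
   helper works on a plain vector of characters rather than on a stemmer.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Hypothetical helper standing in for count-v and count-c.
;; It advances i until (pred letter) is true or the word runs out.
(defn skip-until [pred word n i]
  (cond (&amp;gt;= i (count word)) [:return n i]
        (pred (nth word i)) [:break n i]
        :else (recur pred word n (inc i))))

(def vowel-letter? #{\a \e \i \o \u})

;; The caller destructures the vector and branches on the first item.
(let [[stage n i] (skip-until vowel-letter? (vec &amp;quot;tree&amp;quot;) 0 0)]
  (if (= stage :return) n [:keep-going n i])) ; =&amp;gt; [:keep-going 0 2]
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Destructuring in the &lt;code&gt;let&lt;/code&gt; gives each item of the returned vector its own
   name, which keeps the caller’s branching short.
&lt;/p&gt;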
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;m&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;Measures the number of consonant sequences between&lt;/span&gt;
&lt;span class="s"&gt;  the start of word and position j. If c is a consonant&lt;/span&gt;
&lt;span class="s"&gt;  sequence and v a vowel sequence, and &amp;lt;...&amp;gt; indicates&lt;/span&gt;
&lt;span class="s"&gt;  arbitrary presence,&lt;/span&gt;
&lt;span class="s"&gt;    &amp;lt;c&amp;gt;&amp;lt;v&amp;gt;       -&amp;gt; 0&lt;/span&gt;
&lt;span class="s"&gt;    &amp;lt;c&amp;gt;vc&amp;lt;v&amp;gt;     -&amp;gt; 1&lt;/span&gt;
&lt;span class="s"&gt;    &amp;lt;c&amp;gt;vcvc&amp;lt;v&amp;gt;   -&amp;gt; 2&lt;/span&gt;
&lt;span class="s"&gt;    &amp;lt;c&amp;gt;vcvcvc&amp;lt;v&amp;gt; -&amp;gt; 3&lt;/span&gt;
&lt;span class="s"&gt;    ...&lt;/span&gt;
&lt;span class="s"&gt;  &amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="nv"&gt;j&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-index&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nv"&gt;count-v&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;n&lt;/span&gt; &lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cond &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;&amp;gt; &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="no"&gt;:return&lt;/span&gt; &lt;span class="nv"&gt;n&lt;/span&gt; &lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;vowel?&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="no"&gt;:break&lt;/span&gt; &lt;span class="nv"&gt;n&lt;/span&gt; &lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                        &lt;span class="no"&gt;:else&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;recur &lt;/span&gt;&lt;span class="nv"&gt;n&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
        &lt;span class="nv"&gt;count-c&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;n&lt;/span&gt; &lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cond &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;&amp;gt; &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="no"&gt;:return&lt;/span&gt; &lt;span class="nv"&gt;n&lt;/span&gt; &lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;consonant?&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="no"&gt;:break&lt;/span&gt; &lt;span class="nv"&gt;n&lt;/span&gt; &lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                        &lt;span class="no"&gt;:else&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;recur &lt;/span&gt;&lt;span class="nv"&gt;n&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
        &lt;span class="nv"&gt;count-cluster&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;n&lt;/span&gt; &lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="nv"&gt;stage1&lt;/span&gt; &lt;span class="nv"&gt;n1&lt;/span&gt; &lt;span class="nv"&gt;i1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-c&lt;/span&gt; &lt;span class="nv"&gt;n&lt;/span&gt; &lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
                          &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="nv"&gt;stage1&lt;/span&gt; &lt;span class="no"&gt;:return&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                            &lt;span class="nv"&gt;n1&lt;/span&gt;
                            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="nv"&gt;stage2&lt;/span&gt; &lt;span class="nv"&gt;n2&lt;/span&gt; &lt;span class="nv"&gt;i2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-v&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="nv"&gt;n1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="nv"&gt;i1&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
                              &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="nv"&gt;stage2&lt;/span&gt; &lt;span class="no"&gt;:return&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                                &lt;span class="nv"&gt;n2&lt;/span&gt;
                                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;recur &lt;/span&gt;&lt;span class="nv"&gt;n2&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="nv"&gt;i2&lt;/span&gt;&lt;span class="p"&gt;)))))))&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stage&lt;/span&gt; &lt;span class="nv"&gt;n&lt;/span&gt; &lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-v&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="nv"&gt;stage&lt;/span&gt; &lt;span class="no"&gt;:return&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="nv"&gt;n&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-cluster&lt;/span&gt; &lt;span class="nv"&gt;n&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
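&lt;p&gt;To get a feel for the values &lt;code&gt;m&lt;/code&gt; produces, here is a rough sketch that works
   on plain strings. It is not the stemmer’s implementation: &lt;code&gt;simple-measure&lt;/code&gt;
   is a hypothetical name, and it assumes that only a, e, i, o and u are vowels,
   so the stemmer’s own &lt;code&gt;vowel?&lt;/code&gt; and &lt;code&gt;consonant?&lt;/code&gt; may classify some letters
   (such as y) differently.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Simplified sketch: only a, e, i, o, u count as vowels here.
(defn simple-measure [word]
  (let [flags (map #(if (#{\a \e \i \o \u} %) :v :c) word)
        ;; Collapse runs of the same kind: &amp;quot;trouble&amp;quot; becomes (:c :v :c :v).
        runs  (map first (partition-by identity flags))]
    ;; Each vowel run followed by a consonant run adds one to the measure.
    (count (filter #(= [:v :c] %) (partition 2 1 runs)))))

(map simple-measure [&amp;quot;tree&amp;quot; &amp;quot;trouble&amp;quot; &amp;quot;troubles&amp;quot;]) ; =&amp;gt; (0 1 2)
&lt;/pre&gt;&lt;/div&gt;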
&lt;p&gt;&lt;strong&gt;(ends? &lt;em&gt;stemmer&lt;/em&gt; &lt;em&gt;suffix&lt;/em&gt;)&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;ends?&lt;/code&gt; tests whether the stemmer’s word ends with a given suffix. If it does,
   it moves the stemmer’s &lt;code&gt;:index&lt;/code&gt; to the letter just before the suffix and
   returns the new stemmer. The caller also needs to know whether the ending was
   actually found. To accommodate this, &lt;code&gt;ends?&lt;/code&gt; returns a vector containing the
   new (or old) stemmer and &lt;code&gt;true&lt;/code&gt; or &lt;code&gt;false&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;ends?&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;true if the word ends with s.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;s&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;subword&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;sv&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;vec &lt;/span&gt;&lt;span class="nv"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;sv&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pos? &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;sv&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
      &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="nv"&gt;true&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;false&lt;/span&gt;&lt;span class="p"&gt;])))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
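&lt;p&gt;A caller has to take the returned pair apart before it can use it. The sketch
   below shows the shape of that on plain strings; &lt;code&gt;strip-suffix&lt;/code&gt; is a
   hypothetical stand-in for &lt;code&gt;ends?&lt;/code&gt;, not part of the stemmer, and it returns the
   remaining stem instead of a stemmer.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Hypothetical stand-in for ends? that works on plain strings.
(defn strip-suffix [word suffix]
  (let [j (- (count word) (count suffix))]
    (if (and (pos? j) (.endsWith word suffix))
      [(subs word 0 j) true]
      [word false])))

;; Destructure the pair, then branch on the flag.
(let [[stem found?] (strip-suffix &amp;quot;caresses&amp;quot; &amp;quot;sses&amp;quot;)]
  (if found? (str stem &amp;quot;ss&amp;quot;) stem)) ; =&amp;gt; &amp;quot;caress&amp;quot;
&lt;/pre&gt;&lt;/div&gt;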
&lt;p&gt;&lt;strong&gt;(set-to &lt;em&gt;stemmer&lt;/em&gt; &lt;em&gt;new-ending&lt;/em&gt;)&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;set-to&lt;/code&gt; sets the stemmer’s word to the prefix of the word (everything before
   the stemmer’s index) and the new ending.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;set-to&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;Sets the word to its stem (as given by subword) followed by new-end, then resets the index.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;new-end&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;reset-index&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;into &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;subword&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;new-end&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
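&lt;p&gt;The work here is done by &lt;code&gt;into&lt;/code&gt;, which adds each item of its second
   argument to the end of a vector. A small sketch, assuming that &lt;code&gt;subword&lt;/code&gt;
   returns a vector of characters (its use with &lt;code&gt;subvec&lt;/code&gt; in &lt;code&gt;ends?&lt;/code&gt; suggests
   it does):
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; A string is treated as a sequence of characters.
(into [\c \a \r \e] &amp;quot;ss&amp;quot;)             ; =&amp;gt; [\c \a \r \e \s \s]
(apply str (into (vec &amp;quot;care&amp;quot;) &amp;quot;ss&amp;quot;)) ; =&amp;gt; &amp;quot;caress&amp;quot;
&lt;/pre&gt;&lt;/div&gt;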
&lt;p&gt;&lt;strong&gt;(r &lt;em&gt;stemmer&lt;/em&gt; &lt;em&gt;orig-stemmer&lt;/em&gt; &lt;em&gt;suffix&lt;/em&gt;)&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;r&lt;/code&gt; tests whether there are any consonant clusters in the stem. If so, the
   ending is set to &lt;code&gt;suffix&lt;/code&gt;. Otherwise, the original stemmer is returned. This
   is used in some of the steps to add a suffix only if the stem is long enough.
   For example, you want to replace the ending of “restive” with nothing (to
   produce “rest”); but you don’t want to strip the “-ive” off “five.”
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;r&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;Sets the ending to s if the stem has at least one consonant sequence; otherwise returns orig-stemmer.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;orig-stemmer&lt;/span&gt; &lt;span class="nv"&gt;s&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pos? &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;m&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;set-to&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;orig-stemmer&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Some of these are pretty messy. For instance, returning a vector of multiple
   values may be fine for a collection of internal functions, but it makes for a
   complicated interface to the &lt;code&gt;ends?&lt;/code&gt; predicate. In the next posting, we’ll
   look at how we can simplify &lt;code&gt;ends?&lt;/code&gt;.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-1010273030250414572?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/1010273030250414572/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=1010273030250414572' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/1010273030250414572'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/1010273030250414572'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/07/stemming-part-7-more-functions.html' title='Stemming, Part 7: More Functions'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-1447290780428625058</id><published>2008-07-15T20:39:00.001-05:00</published><updated>2008-07-15T20:39:03.925-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Stemming, Part 6: Stemmer Predicates</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;In the last few postings we’ve been looking at functions and how they’re used
   in Clojure. One of the fundamental kinds of functions is the predicate: a
   function that tests something and returns true or false. By convention, the
   names of these functions end in &lt;code&gt;?&lt;/code&gt;. Clojure has a number of these, and for
   the Porter Stemmer, we’ll define a few.
&lt;/p&gt;

&lt;h2&gt;Sets&lt;/h2&gt;
&lt;p&gt;Sets can act as predicates. As we saw when we were discussing &lt;a href="http://writingcoding.blogspot.com/2008/06/tokenization-part-5-stop-words.html"&gt;stop
words&lt;/a&gt;, sets are also functions that test for membership.
&lt;/p&gt;
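&lt;p&gt;As a quick reminder of how that looks, here is a small sketch. The name
   &lt;code&gt;vowel-letter?&lt;/code&gt; is made up for this example. Calling a set returns the item
   itself when it is a member and &lt;code&gt;nil&lt;/code&gt; when it is not, which is enough for
   &lt;code&gt;if&lt;/code&gt;, &lt;code&gt;cond&lt;/code&gt; and &lt;code&gt;filter&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Hypothetical example: a set of characters used as a predicate.
(def vowel-letter? #{\a \e \i \o \u})

(vowel-letter? \e)                ; =&amp;gt; \e
(vowel-letter? \x)                ; =&amp;gt; nil
(filter vowel-letter? &amp;quot;stemmer&amp;quot;) ; =&amp;gt; (\e \e)
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;One caveat: a set that contains &lt;code&gt;false&lt;/code&gt; or &lt;code&gt;nil&lt;/code&gt; returns that value for a
   member, so it will not work as a predicate for those items.
&lt;/p&gt;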

&lt;h2&gt;Built-Ins&lt;/h2&gt;
&lt;p&gt;Clojure defines a number of built-in predicates, and higher-order functions
   are often useful for creating other predicates.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;(zero? &lt;em&gt;num&lt;/em&gt;)&lt;/strong&gt; Returns whether its argument is zero.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;(pos? &lt;em&gt;num&lt;/em&gt;)&lt;/strong&gt; Returns whether its argument is a positive number.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;(neg? &lt;em&gt;num&lt;/em&gt;)&lt;/strong&gt; Returns whether its argument is a negative number.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;(complement &lt;em&gt;fn&lt;/em&gt;)&lt;/strong&gt; Returns a new function that returns the opposite of the
   predicate function passed into it. For example, &lt;code&gt;(complement zero?)&lt;/code&gt; returns a
   predicate that tests whether its argument is &lt;em&gt;not&lt;/em&gt; zero.
&lt;/p&gt;
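&lt;p&gt;A short sketch of these predicates at work. The name &lt;code&gt;non-zero?&lt;/code&gt; is
   introduced here only for illustration.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;(zero? 0) ; =&amp;gt; true
(pos? 5)  ; =&amp;gt; true
(neg? 5)  ; =&amp;gt; false

;; complement builds a new predicate from an existing one.
(def non-zero? (complement zero?))
(non-zero? 3) ; =&amp;gt; true

;; The result can be passed straight to a higher-order function.
(filter (complement neg?) [-2 0 3]) ; =&amp;gt; (0 3)
&lt;/pre&gt;&lt;/div&gt;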
&lt;p&gt;&lt;strong&gt;(cond &lt;em&gt;test&lt;/em&gt; &lt;em&gt;expression&lt;/em&gt; ...)&lt;/strong&gt; A structure that acts as a series of nested
   &lt;code&gt;if&lt;/code&gt; expressions. Each &lt;em&gt;test&lt;/em&gt; is followed by one &lt;em&gt;expression&lt;/em&gt;. The tests are
   tried in order; for the first &lt;em&gt;test&lt;/em&gt; that evaluates as true, its &lt;em&gt;expression&lt;/em&gt; is
   evaluated and its value is returned by the &lt;code&gt;cond&lt;/code&gt; expression. A final catch-all
   test, by convention &lt;code&gt;:else&lt;/code&gt; (any value that counts as true would work), can
   be used if no previous tests evaluated as true. If no test succeeds, &lt;code&gt;cond&lt;/code&gt;
   returns &lt;code&gt;nil&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;For example, in the last post, we defined &lt;code&gt;count-item&lt;/code&gt; with two nested
   &lt;code&gt;if&lt;/code&gt; expressions:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;count-item&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;loop &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt; &lt;span class="nv"&gt;sequence,&lt;/span&gt; &lt;span class="nv"&gt;accum&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;seq &lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
      &lt;span class="nv"&gt;accum&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;peek &lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;recur &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="nv"&gt;accum&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;recur &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;accum&lt;/span&gt;&lt;span class="p"&gt;)))))&lt;/span&gt;
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;user/count-item&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This could be defined more simply using &lt;code&gt;cond&lt;/code&gt;:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;count-item&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;loop &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt; &lt;span class="nv"&gt;sequence,&lt;/span&gt; &lt;span class="nv"&gt;accum&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cond &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;seq &lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="nv"&gt;accum&lt;/span&gt;
          &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;peek &lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;recur &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="nv"&gt;accum&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
          &lt;span class="no"&gt;:else&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;recur &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;accum&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;user/count-item&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In the last post, we also defined &lt;code&gt;member?&lt;/code&gt;. How would you define it using
   &lt;code&gt;cond&lt;/code&gt;?
&lt;/p&gt;

&lt;h2&gt;Stemmer Predicates&lt;/h2&gt;
&lt;p&gt;With all that we’ve learned, we’re ready to define a number of predicates that
   we can use later in the Porter Stemmer.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;vowel-letter?&lt;/strong&gt; is a set of the standard vowel letters. This will only be
   used to define &lt;code&gt;consonant?&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;def &lt;/span&gt;&lt;span class="nv"&gt;vowel-letter?&lt;/span&gt; &lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt; &lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="nv"&gt;e&lt;/span&gt; &lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt; &lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="nv"&gt;o&lt;/span&gt; &lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="nv"&gt;u&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
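&lt;p&gt;&lt;strong&gt;consonant?&lt;/strong&gt; returns true if the character at the &lt;code&gt;stemmer&lt;/code&gt;’s index is a
   consonant. With a second argument, it tests the character at that position
   instead. The letter &lt;em&gt;y&lt;/em&gt; counts as a consonant at the start of the word or
   after a vowel, and as a vowel after a consonant.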
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;consonant?&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;Returns true if the ith character in a stemmer&lt;/span&gt;
&lt;span class="s"&gt;  is a consonant. i defaults to :index.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;consonant?&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-index&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;c&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;nth &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cond &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;vowel-letter?&lt;/span&gt; &lt;span class="nv"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;false&lt;/span&gt;
           &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="nv"&gt;c&lt;/span&gt; &lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="nv"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;zero? &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                      &lt;span class="nv"&gt;true&lt;/span&gt;
                      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;consonant?&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
           &lt;span class="no"&gt;:else&lt;/span&gt; &lt;span class="nv"&gt;true&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
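&lt;p&gt;The treatment of &lt;em&gt;y&lt;/em&gt; is the one subtle branch, so a quick check at the REPL may help. This is only a sketch: it assumes the definitions above are loaded in the &lt;code&gt;porter&lt;/code&gt; namespace and that &lt;code&gt;make-stemmer&lt;/code&gt; from the earlier posts stores the word unchanged. The sample words are arbitrary.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;porter=&amp;gt; (consonant? (make-stemmer &amp;quot;yes&amp;quot;) 0)     ; y starts the word
true
porter=&amp;gt; (consonant? (make-stemmer &amp;quot;tray&amp;quot;) 3)    ; y follows the vowel a
true
porter=&amp;gt; (consonant? (make-stemmer &amp;quot;syzygy&amp;quot;) 1)  ; y follows the consonant s
false
&lt;/pre&gt;&lt;/div&gt;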
&lt;p&gt;&lt;strong&gt;vowel?&lt;/strong&gt; is the logical opposite of &lt;code&gt;consonant?&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;def &lt;/span&gt;&lt;span class="nv"&gt;vowel?&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;complement &lt;/span&gt;&lt;span class="nv"&gt;consonant?&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;vowel-in-stem?&lt;/strong&gt; returns true if any of the characters &lt;em&gt;up to and including&lt;/em&gt; the index is
   a vowel.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;vowel-in-stem?&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;true iff 0 ... j contains a vowel&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-index&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;loop &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cond &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;&amp;gt; &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;false&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;consonant?&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;recur &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="no"&gt;:else&lt;/span&gt; &lt;span class="nv"&gt;true&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;double-c?&lt;/strong&gt; returns true if the character at the index (or at another given
   position) is a consonant and is the same as the character before it.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;double-c?&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;returns true if this is a double consonant.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;double-c?&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-index&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;&amp;gt;= &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;nth &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
           &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;nth &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;consonant?&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
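&lt;p&gt;&lt;strong&gt;cvc?&lt;/strong&gt; returns true if the three characters ending at the index (or at another
   given position) form a CVC sequence (consonant-vowel-consonant) whose final
   consonant is not &lt;em&gt;w&lt;/em&gt;, &lt;em&gt;x&lt;/em&gt;, or &lt;em&gt;y&lt;/em&gt;.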
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;cvc?&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;true if (i-2 i-1 i) has the form CVC and&lt;/span&gt;
&lt;span class="s"&gt;  also if the second C is not w, x, or y.&lt;/span&gt;
&lt;span class="s"&gt;  This is used when trying to restore an *e*&lt;/span&gt;
&lt;span class="s"&gt;  at the end of a short word.&lt;/span&gt;
&lt;span class="s"&gt;  E.g.,&lt;/span&gt;
&lt;span class="s"&gt;    cav(e), lov(e), hop(e), crim(e)&lt;/span&gt;
&lt;span class="s"&gt;    but snow, box, tray&lt;/span&gt;
&lt;span class="s"&gt;  &amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cvc?&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-index&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;&amp;gt;= &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;consonant?&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;vowel?&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;consonant?&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="nv"&gt;w&lt;/span&gt; &lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="nv"&gt;y&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;nth &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;))))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Notice that we’ve established a pattern here: most of these predicates take one or two
   arguments. With one argument, they test against the &lt;code&gt;:index&lt;/code&gt; character in the
   &lt;code&gt;stemmer&lt;/code&gt;. With two arguments, they test against the character at the given position:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;consonant?&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;secrets&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nv"&gt;true&lt;/span&gt;
&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;consonant?&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-stemmer&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;secrets&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;false&lt;/span&gt;
&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;nth &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;secrets&amp;quot;&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="nv"&gt;e&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
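&lt;p&gt;The two-argument form is also handy for spot-checking the other predicates. The session below is a sketch under the same assumptions as the example above (the predicates are loaded, and &lt;code&gt;make-stemmer&lt;/code&gt; stores the word unchanged). The sample words are arbitrary.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;porter=&amp;gt; (double-c? (make-stemmer &amp;quot;hopp&amp;quot;) 3)  ; p repeats and is a consonant
true
porter=&amp;gt; (double-c? (make-stemmer &amp;quot;see&amp;quot;) 2)   ; e repeats but is a vowel
false
porter=&amp;gt; (cvc? (make-stemmer &amp;quot;hop&amp;quot;) 2)        ; h-o-p is consonant-vowel-consonant
true
porter=&amp;gt; (cvc? (make-stemmer &amp;quot;box&amp;quot;) 2)        ; CVC, but the final consonant is x
false
&lt;/pre&gt;&lt;/div&gt;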
&lt;p&gt;Read over these and make sure you understand them. There’s nothing in them
   that we haven’t covered already. And if you have any questions, feel free to
   ask in the comments.
&lt;/p&gt;
&lt;p&gt;In the next posting, we’ll define some more utilities for the stemmer.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-1447290780428625058?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/1447290780428625058/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=1447290780428625058' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/1447290780428625058'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/1447290780428625058'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/07/stemming-part-6-stemmer-predicates.html' title='Stemming, Part 6: Stemmer Predicates'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-2249381655289904686</id><published>2008-07-14T17:16:00.001-05:00</published><updated>2008-07-14T17:16:07.535-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='tutorial'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Stemming, Part 5: Functions and Recursion</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;So far in this Clojure tutorial/NLP tutorial, we’ve mainly been looking at
   Clojure’s functions, but the last posting actually included a good chunk of
   code for the Porter Stemmer. Today, we’ll review functions. Some of this we’ve
   seen before, but some of it will be new.
&lt;/p&gt;

&lt;h2&gt;Functions&lt;/h2&gt;
&lt;p&gt;As with any functional language, functions are the building blocks of Clojure.
   I’m going to briefly summarize what we’ve learned about functions so far, and
   then we’ll explore one important topic—recursion—in more detail.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Defining Functions&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;Generally, functions are defined using &lt;code&gt;defn&lt;/code&gt;, followed by the name of the
   function, a vector listing the parameters it takes, and the expressions in the
   body of the function. You can also include a documentation string between the
   function name and the parameter vector:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;greetings&lt;/span&gt; 
         &lt;span class="s"&gt;&amp;quot;Say hello.&amp;quot;&lt;/span&gt;
         &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;Hello, &amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;user/greetings&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;greetings&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;Eric&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;Hello, Eric&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;(The &lt;code&gt;str&lt;/code&gt; function here just creates strings out of all its arguments and
   concatenates them together.)
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Higher-Order Functions&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;Higher-order functions are functions that take other functions as arguments.
   These may use the other function in a calculation, or they may create a new
   function from the existing one. For example, &lt;code&gt;map&lt;/code&gt; calls a function on each
   element in a sequence:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;map &lt;/span&gt;&lt;span class="nv"&gt;greetings&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Eric&lt;/span&gt; &lt;span class="nv"&gt;Elsa&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;Hello, Eric&amp;quot;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Hello, Elsa&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;complement&lt;/code&gt;, however, creates a new function that is equivalent to &lt;code&gt;(not
(original-function *args...*))&lt;/code&gt;:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;def &lt;/span&gt;&lt;span class="nv"&gt;not-zero?&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;complement &lt;/span&gt;&lt;span class="nv"&gt;zero?&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;user/not-zero?&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;not-zero?&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;false&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;not-zero?&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;true&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Function Literals&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;Sometimes, particularly when you’re using a higher-order function, you may
   want to create a short function just for that one place, and giving it a name
   would clutter up your program. Or you may want to define a function &lt;em&gt;inside&lt;/em&gt;
   another function, in a &lt;code&gt;let&lt;/code&gt; expression, to use only within that function.
&lt;/p&gt;
&lt;p&gt;For either of these, use a &lt;em&gt;function literal&lt;/em&gt;. A function literal looks like a
   regular function definition, except instead of &lt;code&gt;defn&lt;/code&gt;, use &lt;code&gt;fn&lt;/code&gt;; the name is
   optional; and generally it cannot have a documentation string. For example,
   suppose that &lt;code&gt;greetings&lt;/code&gt;, which we defined above, doesn’t exist, and we want to
   greet a list of people. We could do this by creating a function literal and
   passing it to &lt;code&gt;map&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;map &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;n&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;Hello, &amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;n&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Eric&lt;/span&gt; &lt;span class="nv"&gt;Elsa&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;                  
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;Hello, Eric&amp;quot;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Hello, Elsa&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Or, we could temporarily define &lt;code&gt;greet&lt;/code&gt; using &lt;code&gt;let&lt;/code&gt;, and pass that to &lt;code&gt;map&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;greet&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;n&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;Hello, &amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;n&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;map &lt;/span&gt;&lt;span class="nv"&gt;greet&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Eric&lt;/span&gt; &lt;span class="nv"&gt;Elsa&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;Hello, Eric&amp;quot;&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Hello, Elsa&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Overloading Functions&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;I’ve already mentioned that Clojure allows you to provide different versions
   of a function for different numbers of arguments. Do this by wrapping each
   parameter vector and its body expressions in its own list. The documentation string, if
   there is one, comes before all of the groups:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;count-parameters&lt;/span&gt;
         &lt;span class="s"&gt;&amp;quot;This returns the number of parameters&lt;/span&gt;
&lt;span class="s"&gt;         passed to the function.&amp;quot;&lt;/span&gt;
         &lt;span class="p"&gt;([]&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
         &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
         &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt; &lt;span class="nv"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
         &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt; &lt;span class="nv"&gt;b&lt;/span&gt; &lt;span class="nv"&gt;c&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
         &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt; &lt;span class="nv"&gt;b&lt;/span&gt; &lt;span class="nv"&gt;c&lt;/span&gt; &lt;span class="nv"&gt;d&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
         &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt; &lt;span class="nv"&gt;b&lt;/span&gt; &lt;span class="nv"&gt;c&lt;/span&gt; &lt;span class="nv"&gt;d&lt;/span&gt; &lt;span class="nv"&gt;e&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;user/count-parameters&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-parameters&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;p0&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;p1&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;p2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-parameters&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;0&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This also works in function literals. The following example creates a shortened version of
   &lt;code&gt;count-parameters&lt;/code&gt; called &lt;code&gt;cp&lt;/code&gt; and calls it twice within a vector. The two
   calls are collected in a vector because &lt;code&gt;let&lt;/code&gt; only returns one value, and we
   want to see the value of both calls to &lt;code&gt;cp&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;cp&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn &lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt; &lt;span class="nv"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt; &lt;span class="nv"&gt;b&lt;/span&gt; &lt;span class="nv"&gt;c&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
         &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="nf"&gt;cp&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;p0&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;p1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cp&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h2&gt;Recursion&lt;/h2&gt;
&lt;p&gt;Many problems can be broken into smaller versions of the same problem.
&lt;/p&gt;
&lt;p&gt;For example, you can test whether a list contains a particular item by looking
   only at the first element of the list. At each step, ask yourself: is
   the list empty?  If it is, the item is not in the list. Otherwise, if the first
   element is what you’re looking for, you’re done; if it’s not, strip the first element
   off the list and start over with what remains.
&lt;/p&gt;
&lt;p&gt;This way of attacking problems, by calling the solution within itself, is called
   &lt;a href="http://en.wikipedia.org/wiki/Recursion_(computer_science)"&gt;recursion&lt;/a&gt;. Recursion is fundamental to functional programming,
   so let’s look at it in more detail.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;General Recursion&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;The main pitfall in writing a recursive function is forgetting to make sure that it will
   end eventually. To avoid this, you need at least one &lt;em&gt;end condition&lt;/em&gt;.
   In the list-membership problem I described above, the end
   conditions are the empty list and the first element being the item you are
   looking for.
&lt;/p&gt;
&lt;p&gt;Let’s look at what this would look like in Clojure.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;member?&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;seq &lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;    
           &lt;span class="nv"&gt;nil&lt;/span&gt;
           &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;peek &lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
             &lt;span class="nv"&gt;sequence&lt;/span&gt;
             &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;member?&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;user/member?&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In this, the first &lt;code&gt;if&lt;/code&gt; tests whether the sequence is empty (that is, whether &lt;code&gt;seq&lt;/code&gt;
   returns &lt;code&gt;nil&lt;/code&gt; for it); if it is, the function returns &lt;code&gt;nil&lt;/code&gt;, which
   counts as false in a test. The second &lt;code&gt;if&lt;/code&gt; tests whether the first element in the
   sequence is what we’re looking for; if it is, the function returns the sequence at
   that point. This is a common idiom in Clojure. Since anything other than &lt;code&gt;nil&lt;/code&gt; and
   &lt;code&gt;false&lt;/code&gt; counts as true, the returned sequence fulfills its contract as a test, but it
   carries more information than that.  This function is useful for more than
   just determining whether an item is in a sequence: it also finds the item and
   returns the rest of the sequence starting there. Finally, if the first element is not the item being sought, this
   &lt;em&gt;calls &lt;code&gt;member?&lt;/code&gt; again with everything except the first item of the list&lt;/em&gt;.
   This is the recursive part of the function. Let’s test it.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;member?&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;member?&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;nil&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
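&lt;p&gt;Because &lt;code&gt;member?&lt;/code&gt; returns either &lt;code&gt;nil&lt;/code&gt; or a sequence, its result can go straight into a test. Here is a minimal sketch of that idiom. The definition is repeated so the block stands on its own, and &lt;code&gt;describe&lt;/code&gt; is a hypothetical helper made up for this illustration, not part of Clojure.&lt;/p&gt;

```clojure
;; Same definition as above, repeated so this block runs on its own.
(defn member? [sequence item]
  (if (not (seq sequence))
    nil
    (if (= (peek sequence) item)
      sequence
      (member? (pop sequence) item))))

;; Hypothetical helper: uses the return value of member? directly as a test.
(defn describe [sequence item]
  (if (member? sequence item)
    "found"
    "missing"))

(describe '(1 2 3 4) 2)   ; "found", because (2 3 4) counts as true
(describe '(1 2 3 4) 5)   ; "missing", because nil counts as false
```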
&lt;p&gt;Sometimes it’s helpful to map this out. In the diagram below, a call within
   another call is indented under it. A function call returning is indicated by a
   “=&amp;gt;” and a value at the same level of indentation as the original call. Here’s
   the sequence of calls in the first example above.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;member?&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;member?&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The second example call to &lt;code&gt;member?&lt;/code&gt; would graph out like this:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;member?&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;member?&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;member?&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;member?&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;member?&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;nil&lt;/span&gt;
            &lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;nil&lt;/span&gt;
        &lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;nil&lt;/span&gt;
    &lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;nil&lt;/span&gt;
&lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;nil&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In this case, we’re just passing the result of the membership test back up
   unchanged, but we could modify the results as they’re being returned. For
   example, if we wanted to count how many times an item occurs in a sequence, we
   could do this:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;count-item&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;seq &lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
           &lt;span class="mi"&gt;0&lt;/span&gt;
           &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;peek &lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
             &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-item&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
             &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-item&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;user/count-item&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In many ways, this is very similar to &lt;code&gt;member?&lt;/code&gt;. It first tests whether the
   sequence is empty, and if it is, it returns zero. If the first element is the
   item, this calls itself (it recurses) and increments the result by one, to
   count the current item. Otherwise, it recurses, but it does not increment the
   result. Let’s see this in action.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-item&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;4&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-item&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-item&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;0&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here, the first example (in shortened form) graphs out like so:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-item&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-item&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-item&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-item&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-item&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-item&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
                &lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
            &lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="nv"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You see that the result gets incremented as the computer leaves each function
   call where 2 is the first element in the input list.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tail-Recursive Functions&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;While these two examples of recursion are superficially similar, on a deeper
   level they are very different. In &lt;code&gt;member?&lt;/code&gt;, once you’ve computed the result,
   nothing more needs to be done to it: each pending call just hands it back unchanged. If Clojure
   had some way to jump completely out of a function from any level, you could return it
   immediately to the function that originally called &lt;code&gt;member?&lt;/code&gt;. On the other hand, when &lt;code&gt;count-item&lt;/code&gt; recurses,
   it is not finished with its calculation yet. It has to wait for itself to
   return and possibly add one to the result. (Sentences like that remind me why
   recursion can be confusing. Don’t worry. Eventually your brain will get used
   to being twisted into a pretzel.)
&lt;/p&gt;
&lt;p&gt;There’s a term to describe functions like &lt;code&gt;member?&lt;/code&gt; that are finished with
   their calculations when they recurse. They are called &lt;a href="http://en.wikipedia.org/wiki/Tail_recursion"&gt;&lt;em&gt;tail-recursive&lt;/em&gt;, and the technique is called
&lt;em&gt;tail-call recursion&lt;/em&gt;&lt;/a&gt;. This is a very good quality. Because nothing is left to do after the recursive call, the computer
   can reuse the current stack frame instead of adding a new one each time, which makes these calls fast and keeps long recursions from running out of stack.
&lt;/p&gt;
&lt;p&gt;But the Java Virtual Machine doesn’t optimize tail calls on its own, so
   Clojure needs a little help to make these optimizations. You signal a
   tail-recursive function call by using the &lt;code&gt;recur&lt;/code&gt; special form instead of the
   function name when you recurse. Thus, we could rewrite &lt;code&gt;member?&lt;/code&gt; to be
   tail-recursive like this.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;member?&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;          
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;seq &lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;               
           &lt;span class="nv"&gt;nil&lt;/span&gt;                                    
           &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;peek &lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;           
             &lt;span class="nv"&gt;sequence&lt;/span&gt;                               
             &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;recur &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;         
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;user/member?&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;See the &lt;code&gt;recur&lt;/code&gt; in the last line? That’s all that has changed.
&lt;/p&gt;
&lt;p&gt;This works exactly the same (try it), but for long lists it is much more
   efficient: it runs in constant stack space, where the original version could eventually overflow the stack.
&lt;/p&gt;
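&lt;p&gt;Here is a rough sketch of two consequences of using &lt;code&gt;recur&lt;/code&gt;. The name &lt;code&gt;member-recur?&lt;/code&gt; is made up for this illustration, and it uses &lt;code&gt;first&lt;/code&gt; and &lt;code&gt;rest&lt;/code&gt; in place of &lt;code&gt;peek&lt;/code&gt; and &lt;code&gt;pop&lt;/code&gt; so that it can walk the lazy sequence that &lt;code&gt;range&lt;/code&gt; produces. The second half shows that Clojure only accepts &lt;code&gt;recur&lt;/code&gt; in tail position; the bad function is compiled through &lt;code&gt;eval&lt;/code&gt; only so that the rest of the block still loads.&lt;/p&gt;

```clojure
;; Variant of member? for illustration. It uses first and rest instead of
;; peek and pop (a substitution made here) so that it also accepts lazy
;; sequences such as the result of range.
(defn member-recur? [sq item]
  (if (not (seq sq))
    nil
    (if (= (first sq) item)
      sq
      (recur (rest sq) item))))

;; One hundred thousand steps, but no growth in the call stack.
(def long-result (member-recur? (range 100000) 99999))
;; long-result is (99999)

;; recur is only accepted in tail position. Wrapping it in inc leaves work
;; to do after the call, so the compiler refuses the function.
(def non-tail-recur
  (try
    (eval '(fn [n] (inc (recur n))))
    :accepted
    (catch Throwable e
      :rejected)))
;; non-tail-recur is :rejected
```

&lt;p&gt;That second result is why the original &lt;code&gt;count-item&lt;/code&gt; cannot simply swap its own name for &lt;code&gt;recur&lt;/code&gt;: one of its recursive calls sits inside &lt;code&gt;inc&lt;/code&gt;.&lt;/p&gt;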
&lt;p&gt;In fact, it’s often worth putting in a little extra work to make
   non-tail-recursive functions tail-recursive. There’s a straightforward
   transformation you can use to make almost any function tail-recursive. Just
   add an extra parameter and use it to accumulate the results before you make
   the recursive function call. The first time you call the function, you need to
   pass the base value into the function for that parameter. For example,
   &lt;code&gt;count-item&lt;/code&gt; with tail recursion would look like this:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;count-item&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt; &lt;span class="nv"&gt;accum&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;seq &lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;              
           &lt;span class="nv"&gt;accum&lt;/span&gt;                                 
           &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;peek &lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;          
             &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;recur &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="nv"&gt;accum&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
             &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;recur &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt; &lt;span class="nv"&gt;accum&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;   
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;user/count-item&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The difference here is the &lt;code&gt;accum&lt;/code&gt; parameter. When &lt;code&gt;item&lt;/code&gt; equals the first
   element of the sequence, &lt;code&gt;accum&lt;/code&gt; is incremented &lt;em&gt;before the recursive
function call is made&lt;/em&gt;. When the function starts again on the shorter list,
   &lt;code&gt;accum&lt;/code&gt; already holds the incremented count. Finally, when the end of the list is reached, &lt;code&gt;accum&lt;/code&gt;
   is returned.  It contains the count accumulated as &lt;code&gt;count-item&lt;/code&gt; walked down
   the sequence.
&lt;/p&gt;
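&lt;p&gt;To make the accumulator easier to follow, here is a sketch that writes out each step of one run as a separate call. The definition is the same three-parameter &lt;code&gt;count-item&lt;/code&gt;, repeated so the block runs on its own; the name &lt;code&gt;steps&lt;/code&gt; is just for this illustration. Unlike the earlier diagrams, there is nothing left to do on the way back out.&lt;/p&gt;

```clojure
;; The accumulator version from above, repeated so this block runs on its own.
(defn count-item [sequence item accum]
  (if (not (seq sequence))
    accum
    (if (= (peek sequence) item)
      (recur (pop sequence) item (inc accum))
      (recur (pop sequence) item accum))))

;; Each call below is the one that recur makes from the call before it,
;; so every one of them evaluates to the same final answer.
(def steps
  [(count-item '(1 2 3 2 1) 2 0)
   (count-item '(2 3 2 1) 2 0)
   (count-item '(3 2 1) 2 1)
   (count-item '(2 1) 2 1)
   (count-item '(1) 2 2)
   (count-item '() 2 2)])
;; steps is [2 2 2 2 2 2]
```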
&lt;p&gt;Of course, now we also have to call &lt;code&gt;count-item&lt;/code&gt; differently:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-item&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-item&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-item&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;0&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Whenever we call &lt;code&gt;count-item&lt;/code&gt;, we have to include a superfluous zero that we
   really don’t care about. That seems messy and error-prone. How can we get rid
   of it?
&lt;/p&gt;
&lt;p&gt;Essentially, we want to hide &lt;em&gt;this&lt;/em&gt; version of &lt;code&gt;count-item&lt;/code&gt; and replace it
   with a new version that handles the extra zero for us. Then, we call the new
   version and forget about this one. There are several ways to actually do this:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     Have the public function named &lt;code&gt;count-item&lt;/code&gt; and create a private version
named something like &lt;code&gt;count-item-&lt;/code&gt;;
 &lt;/li&gt;

 &lt;li&gt;
     Use &lt;code&gt;let&lt;/code&gt; to define the private function inside &lt;code&gt;count-item&lt;/code&gt;; or
 &lt;/li&gt;

 &lt;li&gt;
     Use &lt;code&gt;loop&lt;/code&gt;.
 &lt;/li&gt;
&lt;/ol&gt;
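&lt;p&gt;The first two options might look something like the sketch below. The names &lt;code&gt;count-item-1&lt;/code&gt;, &lt;code&gt;count-item-2&lt;/code&gt;, and &lt;code&gt;helper&lt;/code&gt; are made up here so that the two versions can sit side by side; in real code each public function would simply be called &lt;code&gt;count-item&lt;/code&gt;. The third option needs a little more explanation.&lt;/p&gt;

```clojure
;; Option 1: a private helper (defn- makes it private to the namespace)
;; plus a public wrapper that supplies the starting count.
(defn- count-item- [sequence item accum]
  (if (not (seq sequence))
    accum
    (if (= (peek sequence) item)
      (recur (pop sequence) item (inc accum))
      (recur (pop sequence) item accum))))

(defn count-item-1 [sequence item]
  (count-item- sequence item 0))

;; Option 2: the helper is a local function bound with let.
;; Inside the fn, recur jumps back to the start of the fn.
(defn count-item-2 [sequence item]
  (let [helper (fn [sq accum]
                 (if (not (seq sq))
                   accum
                   (if (= (peek sq) item)
                     (recur (pop sq) (inc accum))
                     (recur (pop sq) accum))))]
    (helper sequence 0)))

(count-item-1 '(1 2 3 2 3 4 3 2 1) 2)   ; 3
(count-item-2 '(1 2 3 2 3 4 3 2 1) 2)   ; 3
```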
&lt;p&gt;&lt;code&gt;loop&lt;/code&gt;? Yes, this is new. &lt;code&gt;loop&lt;/code&gt; is a cross between a function call and &lt;code&gt;let&lt;/code&gt;.
   It looks a lot like &lt;code&gt;let&lt;/code&gt; because it allows you to define variables. But it
   also acts as a target for &lt;code&gt;recur&lt;/code&gt;. How would &lt;code&gt;count-item&lt;/code&gt; look with &lt;code&gt;loop&lt;/code&gt;?
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;count-item&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;sequence&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;loop &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt; &lt;span class="nv"&gt;sequence,&lt;/span&gt; &lt;span class="nv"&gt;accum&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
           &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;seq &lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
             &lt;span class="nv"&gt;accum&lt;/span&gt;
             &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;peek &lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
               &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;recur &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="nv"&gt;accum&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
               &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;recur &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="nv"&gt;sq&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;accum&lt;/span&gt;&lt;span class="p"&gt;)))))&lt;/span&gt;
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;user/count-item&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Notice that the first line of &lt;code&gt;loop&lt;/code&gt; looks a lot like the first line of &lt;code&gt;let&lt;/code&gt;.
   Both declare and initialize a series of variables. Just to keep things clear,
   I’ve renamed &lt;code&gt;sequence&lt;/code&gt; to &lt;code&gt;sq&lt;/code&gt; within the &lt;code&gt;loop&lt;/code&gt;. Also, &lt;code&gt;item&lt;/code&gt; isn’t included
   in the list of variables that &lt;code&gt;loop&lt;/code&gt; declares, since it doesn’t change.
   Finally, at the end of the &lt;code&gt;loop&lt;/code&gt; are two &lt;code&gt;recur&lt;/code&gt; statements. When they are
   evaluated, they cause the program to jump back to the &lt;code&gt;loop&lt;/code&gt; statement, but
   this time, instead of the original values used to initialize the &lt;code&gt;loop&lt;/code&gt;
   variables, the values in the &lt;code&gt;recur&lt;/code&gt; call are used.
&lt;/p&gt;
&lt;p&gt;Now we can again call &lt;code&gt;count-item&lt;/code&gt; without the extra parameter:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-item&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-item&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;count-item&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;0&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
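&lt;p&gt;The same &lt;code&gt;loop&lt;/code&gt;/&lt;code&gt;recur&lt;/code&gt; pattern works for any accumulating loop. As a
   minimal sketch (the name &lt;code&gt;sum-all&lt;/code&gt; is made up for this illustration), here
   is a function that adds up a sequence of numbers. Each &lt;code&gt;recur&lt;/code&gt; supplies new
   values for &lt;code&gt;remaining&lt;/code&gt; and &lt;code&gt;total&lt;/code&gt;, in the same order that &lt;code&gt;loop&lt;/code&gt;
   declares them:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; sum-all is a hypothetical helper, not part of the stemmer.
(defn sum-all
  [numbers]
  (loop [remaining numbers, total 0]
    (if (seq remaining)
      ;; Jump back to loop with a shorter sequence and a bigger total.
      (recur (rest remaining) (+ total (first remaining)))
      ;; Nothing left to add, so return the accumulated total.
      total)))

(sum-all [1 2 3 4])
;; =&amp;gt; 10
&lt;/pre&gt;&lt;/div&gt;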

&lt;h2&gt;Stemmer Utility&lt;/h2&gt;
&lt;p&gt;With recursion, we can define another utility to use on the &lt;code&gt;stemmer&lt;/code&gt;
   structures. This function will take a predicate and a stemmer. For each step,
   it will test the stemmer with the predicate, and if true, it will pop one
   character from the word and recurse. If there are no letters left in the word
   or if the predicate evaluates to false, it will return the stemmer the way it
   is:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;pop-stemmer-on&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This is an amalgam of a number of&lt;/span&gt;
&lt;span class="s"&gt;  different functions: pop (it walks&lt;/span&gt;
&lt;span class="s"&gt;  through the :word sequence using pop);&lt;/span&gt;
&lt;span class="s"&gt;  drop-while (it drops items off while&lt;/span&gt;
&lt;span class="s"&gt;  testing the sequence against drop-while);&lt;/span&gt;
&lt;span class="s"&gt;  and maplist from Common Lisp (the&lt;/span&gt;
&lt;span class="s"&gt;  predicate is tested against the entire&lt;/span&gt;
&lt;span class="s"&gt;  current stemmer, not just the first&lt;/span&gt;
&lt;span class="s"&gt;  element).&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;predicate&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;seq &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;predicate&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;recur &lt;/span&gt;&lt;span class="nv"&gt;predicate&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;pop-word&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
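&lt;p&gt;To see how this behaves, here is a small sketch. It assumes a plain hash map
   in place of the &lt;code&gt;stemmer&lt;/code&gt; structure, repeats &lt;code&gt;pop-word&lt;/code&gt; and
   &lt;code&gt;pop-stemmer-on&lt;/code&gt; without their documentation strings so that it runs on its
   own, and uses a made-up predicate, &lt;code&gt;ends-in-s?&lt;/code&gt;, that is true as long as the
   word ends in the letter s:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Repeated here so the example is self-contained.
(defn pop-word
  [stemmer]
  (assoc stemmer :word (pop (:word stemmer))))

(defn pop-stemmer-on
  [predicate stemmer]
  (if (and (seq (:word stemmer)) (predicate stemmer))
    (recur predicate (pop-word stemmer))
    stemmer))

;; ends-in-s? is a hypothetical predicate, only for illustration.
(defn ends-in-s?
  [stemmer]
  (= (peek (:word stemmer)) \s))

;; Both trailing s characters are popped; the loop stops at the a.
(pop-stemmer-on ends-in-s? {:word (vec &amp;quot;class&amp;quot;), :index 4})
;; =&amp;gt; {:word [\c \l \a], :index 4}
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Notice that the predicate receives the whole stemmer, not just one character,
   and that &lt;code&gt;:index&lt;/code&gt; is left as it was.
&lt;/p&gt;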
&lt;p&gt;I’m ready for a break. Next time, we’ll look at some more of the functions
   that Clojure provides, and we’ll define a few predicates of our own.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-2249381655289904686?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/2249381655289904686/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=2249381655289904686' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/2249381655289904686'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/2249381655289904686'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/07/stemming-part-5-functions-and-recursion.html' title='Stemming, Part 5: Functions and Recursion'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-2797771103176683080</id><published>2008-07-11T16:45:00.001-05:00</published><updated>2008-07-11T16:45:23.252-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Stemming, Part 4: Tracking the Stemmer’s Data</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;After the last several postings, we finally have seen enough of Clojure’s
   native data structures and the functions associated with them to define the
   data structure that the Porter Stemmer will use, as well as some of the
   functions that will operate on it.
&lt;/p&gt;

&lt;h2&gt;The Stemmer Structure&lt;/h2&gt;
&lt;p&gt;Recall that the stemmer needs to track two things: the word (whose end it
   will need to manipulate) and an index into that word. A
   &lt;code&gt;struct&lt;/code&gt; is an obvious way to keep those two pieces of data together.
&lt;/p&gt;
&lt;p&gt;For the word, a string is probably not the best option, because Java
   strings are immutable: we would have to copy the whole string every time we
   wanted to remove a letter. Instead, a &lt;em&gt;vector&lt;/em&gt; gives us several advantages:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     We can make changed versions of the vector without copying the whole thing
     every time, because the new vector shares structure with the old one; and
 &lt;/li&gt;

 &lt;li&gt;
     It is still an immutable data structure, so we’re staying within Clojure’s
     functional framework, where things are easiest.
 &lt;/li&gt;
&lt;/ol&gt;
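&lt;p&gt;A quick sketch of both points, using throwaway names (&lt;code&gt;word&lt;/code&gt; and
   &lt;code&gt;shorter&lt;/code&gt; are only for illustration): &lt;code&gt;pop&lt;/code&gt; gives back a new, shorter
   vector, and the original is left untouched.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; word and shorter are hypothetical names for this example.
(def word (vec &amp;quot;running&amp;quot;))

;; pop returns a new vector without the last item.
(def shorter (pop word))

shorter
;; =&amp;gt; [\r \u \n \n \i \n]

;; The original vector has not changed.
word
;; =&amp;gt; [\r \u \n \n \i \n \g]
&lt;/pre&gt;&lt;/div&gt;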
&lt;p&gt;So open up &lt;code&gt;porter.clj&lt;/code&gt; and add these lines to the bottom. This defines the
   &lt;code&gt;stemmer&lt;/code&gt; structure to have two fields, &lt;code&gt;:word&lt;/code&gt; and &lt;code&gt;:index&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="c1"&gt;;; :word = input string&lt;/span&gt;
&lt;span class="c1"&gt;;; :index = general offset into string&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;defstruct &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="no"&gt;:index&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h2&gt;Creating A Stemmer Structure&lt;/h2&gt;
&lt;p&gt;Like other data structures, a &lt;code&gt;stemmer&lt;/code&gt; structure is defined by the functions
   that operate on it.
&lt;/p&gt;
&lt;p&gt;The first function we’ll need is one to create a &lt;code&gt;stemmer&lt;/code&gt; structure from a
   word. It converts the word to a vector and sets the index to the index of the
   last character (one less than the number of characters in the word).
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;make-stemmer&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This returns a stemmer structure for the given word.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;struct &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;vec &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Notice the string between the function name and the list of parameters
   (&lt;code&gt;[word]&lt;/code&gt;). This is a &lt;em&gt;documentation string&lt;/em&gt;. You can use &lt;code&gt;doc&lt;/code&gt;
   at the REPL to retrieve it later:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;porter=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;doc &lt;/span&gt;&lt;span class="nv"&gt;make-stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;-------------------------&lt;/span&gt;
&lt;span class="nv"&gt;porter/make-stemmer&lt;/span&gt;
&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
  &lt;span class="nv"&gt;This&lt;/span&gt; &lt;span class="nv"&gt;returns&lt;/span&gt; &lt;span class="nv"&gt;a&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;structure&lt;/span&gt; &lt;span class="nv"&gt;for&lt;/span&gt; &lt;span class="nv"&gt;the&lt;/span&gt; &lt;span class="nv"&gt;given&lt;/span&gt; &lt;span class="nv"&gt;word&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
&lt;span class="nv"&gt;nil&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
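&lt;p&gt;Here is roughly what calling &lt;code&gt;make-stemmer&lt;/code&gt; looks like. This sketch repeats
   the structure and function definitions (without the documentation string) so
   that it can be pasted into a fresh REPL; the word &lt;code&gt;&amp;quot;caresses&amp;quot;&lt;/code&gt; is just an
   example input.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Repeated here so the example is self-contained.
(defstruct stemmer :word :index)

(defn make-stemmer
  [word]
  (struct stemmer (vec word) (dec (count word))))

;; Eight characters, so the last index is 7.
(make-stemmer &amp;quot;caresses&amp;quot;)
;; =&amp;gt; {:word [\c \a \r \e \s \s \e \s], :index 7}
&lt;/pre&gt;&lt;/div&gt;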

&lt;h2&gt;Resetting the Index&lt;/h2&gt;
&lt;p&gt;Occasionally, we’ll need to reset the index to the last character. Generally,
   we’ll only need to do this after making a change to the word vector, so this
   function takes a word vector and creates a new &lt;code&gt;stemmer&lt;/code&gt; structure with the
   correct index value from it.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;reset-index&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This returns a new stemmer with the :word vector and&lt;/span&gt;
&lt;span class="s"&gt;  :index set to the last index.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;word-vec&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;struct &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="nv"&gt;word-vec&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;word-vec&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
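&lt;p&gt;As a small sketch of the intended use, here a word vector is shortened by hand
   and then handed to &lt;code&gt;reset-index&lt;/code&gt;. The definitions are repeated (without the
   documentation string) so the example stands alone, and &lt;code&gt;shortened&lt;/code&gt; is a
   name made up for the illustration.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Repeated here so the example is self-contained.
(defstruct stemmer :word :index)

(defn reset-index
  [word-vec]
  (struct stemmer word-vec (dec (count word-vec))))

;; shortened is a hypothetical name: the word with its last letter removed.
(def shortened (pop (vec &amp;quot;cats&amp;quot;)))

;; Three characters are left, so the index is reset to 2.
(reset-index shortened)
;; =&amp;gt; {:word [\c \a \t], :index 2}
&lt;/pre&gt;&lt;/div&gt;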

&lt;h2&gt;Retrieving the Index&lt;/h2&gt;
&lt;p&gt;We will also need to retrieve the index sometimes. Of course, there’s a chance
   that the index was not set and is &lt;code&gt;nil&lt;/code&gt;, or that it has gotten out of sync and
   points beyond the end of the word. &lt;code&gt;get-index&lt;/code&gt; checks for both of these cases:
   it returns the stored index if it is valid, and the index of the last character otherwise.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;get-index&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This returns a valid value of j.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;if-let &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:index&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;min &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
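&lt;p&gt;The three cases can be sketched like this. The example uses plain hash maps in
   place of &lt;code&gt;stemmer&lt;/code&gt; structures and repeats &lt;code&gt;get-index&lt;/code&gt; so that it runs on
   its own. It writes the &lt;code&gt;if-let&lt;/code&gt; binding inside a vector, which is the form
   that current versions of Clojure require.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Repeated here so the example is self-contained.
(defn get-index
  [stemmer]
  (if-let [j (:index stemmer)]
    (min j (dec (count (:word stemmer))))
    (dec (count (:word stemmer)))))

;; A valid index is returned unchanged.
(get-index {:word (vec &amp;quot;cat&amp;quot;), :index 1})
;; =&amp;gt; 1

;; A missing index falls back to the last character.
(get-index {:word (vec &amp;quot;cat&amp;quot;), :index nil})
;; =&amp;gt; 2

;; An index past the end is clamped to the last character.
(get-index {:word (vec &amp;quot;cat&amp;quot;), :index 10})
;; =&amp;gt; 2
&lt;/pre&gt;&lt;/div&gt;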

&lt;h2&gt;Retrieving the Word&lt;/h2&gt;
&lt;p&gt;A major role of the index is to mark a subsection of the word for later
   consideration. &lt;code&gt;subword&lt;/code&gt; returns the part of the word up to and including the index.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;subword&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This returns the subword in the stemmer from 0..j.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;b&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-index&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;&amp;lt; &lt;/span&gt;&lt;span class="nv"&gt;j&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="nv"&gt;b&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="nv"&gt;b&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="nv"&gt;j&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="nv"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If the index points to the last character in the word, it just returns the
   original word vector. Otherwise, it returns the part of the word up to and
   including the index. The end index that &lt;code&gt;subvec&lt;/code&gt; takes is exclusive (the
   result stops just before it), so the index has to be incremented before getting
   passed to &lt;code&gt;subvec&lt;/code&gt;.
&lt;/p&gt;
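&lt;p&gt;A short sketch of why the increment is needed, using a throwaway vector named
   &lt;code&gt;letters&lt;/code&gt; (a name made up for this example):
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; letters is a hypothetical name for this example.
(def letters (vec &amp;quot;stemming&amp;quot;))

;; The end index is exclusive, so this stops before index 3.
(subvec letters 0 3)
;; =&amp;gt; [\s \t \e]

;; To include the character at index 3, pass one more than the index.
(subvec letters 0 (inc 3))
;; =&amp;gt; [\s \t \e \m]
&lt;/pre&gt;&lt;/div&gt;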

&lt;h2&gt;Retrieving a Character&lt;/h2&gt;
&lt;p&gt;Sometimes we just want the single character that the index points to.
   &lt;code&gt;index-char&lt;/code&gt; handles this.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;index-char&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This returns the index-char character in the word.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;nth &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-index&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h2&gt;Removing a Character&lt;/h2&gt;
&lt;p&gt;So far, we’ve just been messing with the index. The most common operation
   we’ll perform on the word itself is removing the last character. &lt;code&gt;pop-word&lt;/code&gt;
   handles this by popping a letter off the stemmer’s word and creating a new
   structure that associates &lt;code&gt;:word&lt;/code&gt; with that new, shorter word.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn &lt;/span&gt;&lt;span class="nv"&gt;pop-word&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;This returns the stemmer with one character popped from the end of the&lt;/span&gt;
&lt;span class="s"&gt;  list.&amp;quot;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;stemmer&lt;/span&gt; &lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
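&lt;p&gt;One thing worth seeing in action: &lt;code&gt;pop-word&lt;/code&gt; only replaces &lt;code&gt;:word&lt;/code&gt;, so
   the stored &lt;code&gt;:index&lt;/code&gt; can end up pointing past the end of the shorter word.
   That is the out-of-sync case that &lt;code&gt;get-index&lt;/code&gt; guards against. The sketch below
   uses a plain hash map in place of a &lt;code&gt;stemmer&lt;/code&gt; structure, repeats both
   functions so that it runs on its own (with the &lt;code&gt;if-let&lt;/code&gt; binding written in
   a vector, as current versions of Clojure require), and uses &lt;code&gt;popped&lt;/code&gt; as a
   made-up name.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Repeated here so the example is self-contained.
(defn get-index
  [stemmer]
  (if-let [j (:index stemmer)]
    (min j (dec (count (:word stemmer))))
    (dec (count (:word stemmer)))))

(defn pop-word
  [stemmer]
  (assoc stemmer :word (pop (:word stemmer))))

;; popped is a hypothetical name for this example.
(def popped (pop-word {:word (vec &amp;quot;cats&amp;quot;), :index 3}))

;; The word is shorter, but :index still holds the old value.
popped
;; =&amp;gt; {:word [\c \a \t], :index 3}

;; get-index clamps it to the new last character.
(get-index popped)
;; =&amp;gt; 2
&lt;/pre&gt;&lt;/div&gt;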
&lt;p&gt;In the next posting, I’ll go over functions again in more detail.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-2797771103176683080?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/2797771103176683080/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=2797771103176683080' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/2797771103176683080'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/2797771103176683080'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/07/stemming-part-4-tracking-stemmers-data.html' title='Stemming, Part 4: Tracking the Stemmer’s Data'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-8544627021265832014</id><published>2008-07-10T16:20:00.003-05:00</published><updated>2009-02-13T15:58:37.093-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Stemming, Part 3: More Basics</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;In the last posting, I introduced a number of Clojure data structures. Today,
   I’ll introduce a few more; then I’ll show you some common functions.
&lt;/p&gt;

&lt;h2&gt;More Data Structures&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Keywords&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;We’ve seen symbols before. Every function name is a symbol, and so is just
   about any other bare word in Clojure code that isn’t inside a string. In
   programs, symbols act as variable names.
&lt;/p&gt;
&lt;p&gt;Clojure also has &lt;em&gt;keyword symbols&lt;/em&gt;, which are like symbols, except they cannot
   be used as variables. Instead, a keyword always stands for itself. To write a
   keyword, put a colon (&lt;code&gt;:&lt;/code&gt;) before its name:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="no"&gt;:word&lt;/span&gt;
&lt;span class="no"&gt;:word&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Keywords are used a lot in Clojure, particularly as keys for hash maps. There
   is good reason for this: a keyword is also a function that takes a hash map
   and returns the value associated with itself in the mapping.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="no"&gt;:frequency&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;the&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;the&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If you try to retrieve a keyword’s value from a mapping that doesn’t have the
   keyword, it returns &lt;code&gt;nil&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:location&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="no"&gt;:frequency&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;the&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="nv"&gt;nil&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
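&lt;p&gt;If &lt;code&gt;nil&lt;/code&gt; isn’t what you want for a missing key, a keyword also accepts a
   second argument to return instead. A brief sketch, where &lt;code&gt;entry&lt;/code&gt; and the
   fallback string &lt;code&gt;&amp;quot;unknown&amp;quot;&lt;/code&gt; are made up for the example:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; entry is a hypothetical map for this example.
(def entry {:frequency 4, :word &amp;quot;the&amp;quot;})

;; Missing key, no default: nil.
(:location entry)
;; =&amp;gt; nil

;; Missing key with a default: the default comes back.
(:location entry &amp;quot;unknown&amp;quot;)
;; =&amp;gt; &amp;quot;unknown&amp;quot;

;; The default is ignored when the key is present.
(:word entry &amp;quot;unknown&amp;quot;)
;; =&amp;gt; &amp;quot;the&amp;quot;
&lt;/pre&gt;&lt;/div&gt;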
&lt;p&gt;&lt;strong&gt;Structures&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;Keywords’ acting as functions makes both keywords and mappings incredibly
   useful as flexible, generic data structures in Clojure. This is so common in
   Clojure that Rich Hickey has added structures, which are mappings with
   predefined sets of keys, and which are very efficient.
&lt;/p&gt;
&lt;p&gt;To define a structure, call &lt;code&gt;defstruct&lt;/code&gt; with a name and a list of keyword
   fields. Here’s a structure that stores a word and its frequency:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;defstruct &lt;/span&gt;&lt;span class="nv"&gt;word-data&lt;/span&gt; &lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="no"&gt;:frequency&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;user/word-data&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now, you can define an instance of that data type (a &lt;code&gt;word-data&lt;/code&gt;), using
   &lt;code&gt;struct&lt;/code&gt;, which is called with the name of the structure and the values for
   the fields in the same order as they’re defined in the &lt;code&gt;defstruct&lt;/code&gt;:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;def &lt;/span&gt;&lt;span class="nv"&gt;the-word&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;struct &lt;/span&gt;&lt;span class="nv"&gt;word-data&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;the&amp;quot;&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;user/the-word&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;the-word&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;the&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;:frequency&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Use the keyword field names as functions to retrieve the value of the field
   from a structure:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="nv"&gt;the-word&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;the&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:frequency&lt;/span&gt; &lt;span class="nv"&gt;the-word&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;400&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Of course, &lt;code&gt;the-word&lt;/code&gt; still behaves like a hash map, so you can add other
   fields to it and treat it like a hash map in other ways too:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;assoc &lt;/span&gt;&lt;span class="nv"&gt;the-word&lt;/span&gt; &lt;span class="no"&gt;:location&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;here&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="no"&gt;:word&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;the&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;:frequency&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;:location&lt;/span&gt; &lt;span class="nv"&gt;here&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
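&lt;p&gt;As a further sketch, &lt;code&gt;struct-map&lt;/code&gt; and &lt;code&gt;accessor&lt;/code&gt; (both in
   &lt;code&gt;clojure.core&lt;/code&gt;) build a structure from named fields and look up one field
   through a pre-built function. The names &lt;code&gt;word-count&lt;/code&gt;, &lt;code&gt;a-word&lt;/code&gt;, and
   &lt;code&gt;get-frequency&lt;/code&gt; are hypothetical, chosen only for this illustration:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Hypothetical structure, for illustration only.
(defstruct word-count :word :frequency)

;; struct-map takes field/value pairs, so their order does not matter.
(def a-word (struct-map word-count :frequency 12 :word &amp;quot;stem&amp;quot;))

;; accessor returns a function that looks up one field of the structure.
(def get-frequency (accessor word-count :frequency))

(get-frequency a-word)                     ; 12
(:word a-word)                             ; keyword lookup still works
(:frequency (assoc a-word :frequency 13))  ; 13; a-word itself is unchanged
&lt;/pre&gt;&lt;/div&gt;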

&lt;h2&gt;Other Useful Functions&lt;/h2&gt;
&lt;p&gt;Of course, the functions we’ve just seen won’t do everything that we’ll need.
   Here are some useful functions, many of which we’ve already seen.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;(dec n)&lt;/strong&gt; Return one less than &lt;em&gt;n&lt;/em&gt;. This is faster than &lt;code&gt;(- n 1)&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;(inc n)&lt;/strong&gt; Return one more than &lt;em&gt;n&lt;/em&gt;. This is faster than &lt;code&gt;(+ n 1)&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;dec &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;inc &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;5&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;(let [&lt;em&gt;variables&lt;/em&gt;] &lt;em&gt;expressions&lt;/em&gt;)&lt;/strong&gt; Defines one or more variables.
   &lt;em&gt;variables&lt;/em&gt; is a vector of variable/value pairs, arranged just like the
   key/value pairs in a hash mapping. &lt;em&gt;expressions&lt;/em&gt; is one or more expressions.
   The entire &lt;code&gt;let&lt;/code&gt; returns the value of the last expression.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;y&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="nv"&gt;y&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="mi"&gt;9&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
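&lt;p&gt;The pairs in a &lt;code&gt;let&lt;/code&gt; are bound in order, so a later value can use an earlier
   variable. A minimal sketch (the name &lt;code&gt;product&lt;/code&gt; is only for illustration):
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; y is computed from x, which was bound earlier in the same vector.
(def product
  (let [x 4
        y (inc x)]
    (* x y)))

product   ; 20
&lt;/pre&gt;&lt;/div&gt;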
&lt;p&gt;&lt;strong&gt;(if &lt;em&gt;test&lt;/em&gt; &lt;em&gt;true-expression&lt;/em&gt; &lt;em&gt;false-expression&lt;/em&gt;)&lt;/strong&gt; Executes &lt;em&gt;test&lt;/em&gt;, and if
   it returns a true value (anything but &lt;code&gt;false&lt;/code&gt; or &lt;code&gt;nil&lt;/code&gt;), it executes and
   returns the value of &lt;em&gt;true-expression&lt;/em&gt;; otherwise, it executes and returns the
   value of &lt;em&gt;false-expression&lt;/em&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="no"&gt;:name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="no"&gt;:name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="no"&gt;:yes&lt;/span&gt; &lt;span class="no"&gt;:no&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="no"&gt;:yes&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;(if-let [&lt;em&gt;var&lt;/em&gt; &lt;em&gt;test&lt;/em&gt;] &lt;em&gt;true-expression&lt;/em&gt; &lt;em&gt;false-expression&lt;/em&gt;)&lt;/strong&gt; Combines
   &lt;code&gt;let&lt;/code&gt; and &lt;code&gt;if&lt;/code&gt;, capturing a common pattern:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;age&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:age&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="nv"&gt;age&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;My age is &amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;age&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="s"&gt;&amp;quot;No age given&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here, you define a variable from an expression; if its value is true, one
   expression is executed, and if it’s false, another is. In current Clojure
   releases, the variable and the test go in a binding vector, as in &lt;code&gt;let&lt;/code&gt;.
   Here’s what this looks like in practice:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;def &lt;/span&gt;&lt;span class="nv"&gt;person&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="no"&gt;:given&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Eric&amp;quot;&lt;/span&gt; &lt;span class="no"&gt;:surname&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Rochester&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;user/person&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="no"&gt;:given&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Eric&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;:surname&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Rochester&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:age&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;nil&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;if-let &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;age&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:age&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;My age is &amp;quot;&lt;/span&gt; &lt;span class="nv"&gt;age&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="s"&gt;&amp;quot;No age given&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;No age given&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;(when &lt;em&gt;test&lt;/em&gt; &lt;em&gt;expressions&lt;/em&gt;)&lt;/strong&gt; If the value of the &lt;em&gt;test&lt;/em&gt; expression is true,
   executes &lt;em&gt;expressions&lt;/em&gt; and returns the value of the last; otherwise, returns &lt;code&gt;nil&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;when &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="mi"&gt;41&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list &lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;expression&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;one&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list &lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;expression&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;two&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nv"&gt;nil&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;when &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list &lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;expression&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;one&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list &lt;/span&gt;&lt;span class="ss"&gt;&amp;#39;expression&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;two&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;expression&lt;/span&gt; &lt;span class="nv"&gt;two&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;(min &lt;em&gt;values...&lt;/em&gt;)&lt;/strong&gt; Returns the least value in its arguments.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;min &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;min &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;(and &lt;em&gt;expressions&lt;/em&gt;)&lt;/strong&gt; Evaluates its expressions until one returns a false
   value (&lt;code&gt;false&lt;/code&gt; or &lt;code&gt;nil&lt;/code&gt;), at which point it returns that value; otherwise,
   it returns the value of the last expression.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;(or &lt;em&gt;expressions&lt;/em&gt;)&lt;/strong&gt; Evaluates its expressions until one returns a true
   value, at which point it returns that; otherwise, it returns the value of the
   last expression (&lt;code&gt;nil&lt;/code&gt; or &lt;code&gt;false&lt;/code&gt;).
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;(not &lt;em&gt;expression&lt;/em&gt;)&lt;/strong&gt; Evaluates its one expression and returns the logical
   complement of it. An expression evaluating to &lt;code&gt;nil&lt;/code&gt; or &lt;code&gt;false&lt;/code&gt; will return
   &lt;code&gt;true&lt;/code&gt;; a true expression will return &lt;code&gt;false&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:given&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:surname&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;Rochester&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:given&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:surname&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:age&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nv"&gt;nil&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;or &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:given&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:surname&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;Eric&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;or &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:given&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:surname&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:age&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="s"&gt;&amp;quot;Eric&amp;quot;&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:given&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nv"&gt;false&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;:age&lt;/span&gt; &lt;span class="nv"&gt;person&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nv"&gt;true&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
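&lt;p&gt;Because &lt;code&gt;or&lt;/code&gt; and &lt;code&gt;and&lt;/code&gt; return one of their argument values instead of
   a plain boolean, they are handy for supplying defaults and for guarding a
   computation. A small sketch, using a hypothetical &lt;code&gt;reader&lt;/code&gt; map invented for
   this example:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Hypothetical data with no :age key.
(def reader {:given &amp;quot;Ada&amp;quot; :surname &amp;quot;Lovelace&amp;quot;})

;; or returns the first true value, so the string acts as a default.
(def age-label (or (:age reader) &amp;quot;unknown&amp;quot;))

;; and returns the last value when every expression is true.
(def full-name
  (and (:given reader)
       (:surname reader)
       (str (:given reader) &amp;quot; &amp;quot; (:surname reader))))

age-label   ; &amp;quot;unknown&amp;quot;
full-name   ; &amp;quot;Ada Lovelace&amp;quot;
&lt;/pre&gt;&lt;/div&gt;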
&lt;p&gt;&lt;strong&gt;+, -, *, /&lt;/strong&gt; Perform arithmetic operations on their arguments.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;- &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;9&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;16&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;/ &lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;4&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
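&lt;p&gt;One detail worth a quick illustration: when integers do not divide evenly,
   &lt;code&gt;/&lt;/code&gt; returns an exact ratio. The core functions &lt;code&gt;quot&lt;/code&gt; and &lt;code&gt;rem&lt;/code&gt;
   give the integer quotient and the remainder:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;(/ 7 2)      ; 7/2, a ratio rather than 3 or 3.5
(quot 7 2)   ; 3, the integer quotient
(rem 7 2)    ; 1, the remainder
(/ 7.0 2)    ; 3.5, because one argument is floating-point
&lt;/pre&gt;&lt;/div&gt;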
&lt;p&gt;&lt;strong&gt;=, not=, &amp;lt;, &amp;gt;, &amp;lt;=, &amp;gt;=&lt;/strong&gt; Compare their arguments, returning a boolean.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;= &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;false&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;not= &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;true&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;not= &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;false&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;&amp;lt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;false&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;&amp;gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;true&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;&amp;lt;= &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;false&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;&amp;gt;= &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;true&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
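&lt;p&gt;These comparison functions also accept more than two arguments, in which
   case the comparison must hold along the whole chain. A short sketch:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;(&amp;lt; 1 2 3)      ; true: each argument is less than the next
(&amp;lt; 1 3 2)      ; false: 3 is not less than 2
(= 4 4 4)      ; true: all of the arguments are equal
(not= 4 4 5)   ; true: the arguments are not all equal
&lt;/pre&gt;&lt;/div&gt;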
&lt;p&gt;For the next posting, we’ll apply what we’ve learned about Clojure and its
   data structures to the Porter Stemmer algorithm.
&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2062021423309914146-8544627021265832014?l=writingcoding.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://writingcoding.blogspot.com/feeds/8544627021265832014/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2062021423309914146&amp;postID=8544627021265832014' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/8544627021265832014'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2062021423309914146/posts/default/8544627021265832014'/><link rel='alternate' type='text/html' href='http://writingcoding.blogspot.com/2008/07/stemming-part-3-more-basics.html' title='Stemming, Part 3: More Basics'/><author><name>Eric Rochester</name><uri>http://www.blogger.com/profile/15840004674816343941</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://2.bp.blogspot.com/-2LD6TB4B8vY/TpjrP24MbNI/AAAAAAAACkk/SLQV5nF4ki8/s1600/0e72db523b0c799c871b7755eda209f5.png'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2062021423309914146.post-8289396449593875947</id><published>2008-07-09T19:30:00.002-05:00</published><updated>2008-10-22T06:44:28.342-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure-series'/><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Stemming, Part 2: Functional Programming</title><content type='html'>&lt;div&gt;
&lt;style type="text/css"&gt;
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #FF0000 } /* Generic.Error */
.gh { color: #000080; font-weight: bold } /* Generic.Heading */
.gi { color: #00A000 } /* Generic.Inserted */
.go { color: #808080 } /* Generic.Output */
.gp { color: #000080; font-weight: bold } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #800080; font-weight: bold } /* Generic.Subheading */
.gt { color: #0040D0 } /* Generic.Traceback */
.kc { color: #008000; font-weight: bold } /* Keyword.Constant */
.kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
.kp { color: #008000 } /* Keyword.Pseudo */
.kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
.kt { color: #B00040 } /* Keyword.Type */
.m { color: #666666 } /* Literal.Number */
.s { color: #BA2121 } /* Literal.String */
.na { color: #7D9029 } /* Name.Attribute */
.nb { color: #008000 } /* Name.Builtin */
.nc { color: #0000FF; font-weight: bold } /* Name.Class */
.no { color: #880000 } /* Name.Constant */
.nd { color: #AA22FF } /* Name.Decorator */
.ni { color: #999999; font-weight: bold } /* Name.Entity */
.ne { color: #D2413A; font-weight: bold } /* Name.Exception */
.nf { color: #0000FF } /* Name.Function */
.nl { color: #A0A000 } /* Name.Label */
.nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
.nt { color: #008000; font-weight: bold } /* Name.Tag */
.nv { color: #19177C } /* Name.Variable */
.ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #666666 } /* Literal.Number.Float */
.mh { color: #666666 } /* Literal.Number.Hex */
.mi { color: #666666 } /* Literal.Number.Integer */
.mo { color: #666666 } /* Literal.Number.Oct */
.sb { color: #BA2121 } /* Literal.String.Backtick */
.sc { color: #BA2121 } /* Literal.String.Char */
.sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
.s2 { color: #BA2121 } /* Literal.String.Double */
.se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
.sh { color: #BA2121 } /* Literal.String.Heredoc */
.si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
.sx { color: #008000 } /* Literal.String.Other */
.sr { color: #BB6688 } /* Literal.String.Regex */
.s1 { color: #BA2121 } /* Literal.String.Single */
.ss { color: #19177C } /* Literal.String.Symbol */
.bp { color: #008000 } /* Name.Builtin.Pseudo */
.vc { color: #19177C } /* Name.Variable.Class */
.vg { color: #19177C } /* Name.Variable.Global */
.vi { color: #19177C } /* Name.Variable.Instance */
.il { color: #666666 } /* Literal.Number.Integer.Long */
&lt;/style&gt;
&lt;p&gt;In the last posting, we looked at what the Porter Stemmer will need to do, and
   we glanced at functional programming. Now, we’ll consider what information the
   stemmer needs to keep track of while it processes a word, and we’ll start
   examining what data structures Clojure provides to hold that information.
&lt;/p&gt;

&lt;h2&gt;What We Need to Keep Track Of&lt;/h2&gt;
&lt;p&gt;The primary data that the stemmer needs to track is the word itself. It strips
   characters off the end of the word, and occasionally it adds a character back
   on or changes the last character.
&lt;/p&gt;
&lt;p&gt;Also, the stemmer needs an index into the word. Sometimes, one function will
   analyze the word and mark a location. A later function may then change the
   word based upon that index.
&lt;/p&gt;
&lt;p&gt;Of course, to keep the functions from getting too complicated, the stemmer
   should package those two pieces of data, the word and the index, together.
&lt;/p&gt;
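&lt;p&gt;As a rough sketch of that packaging, the word and the index could travel
   together in a single map. The function name &lt;code&gt;make-stemmer-state&lt;/code&gt;, the keys
   &lt;code&gt;:word&lt;/code&gt; and &lt;code&gt;:index&lt;/code&gt;, and the choice to start the index at the last
   character are all assumptions made for illustration, not necessarily the
   representation this series ends up using:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;;; Hypothetical packaging of the word and the index in one value.
(defn make-stemmer-state [word]
  {:word  (vec word)             ; the characters of the word, as a vector
   :index (dec (count word))})   ; assumed start: the last character

(def state (make-stemmer-state &amp;quot;running&amp;quot;))

(:word state)    ; [\r \u \n \n \i \n \g]
(:index state)   ; 6
&lt;/pre&gt;&lt;/div&gt;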

&lt;h2&gt;Clojure Data Structures&lt;/h2&gt;
&lt;p&gt;So what does Clojure give us to manage this data?
&lt;/p&gt;
&lt;p&gt;I’ll list a few of Clojure’s data structures here.  Also, since in a
   functional language a data type is defined by the functions that operate on
   it, we’ll look at them too. And as we go along, we’ll try everything out.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lists&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;The most basic data type in almost any Lisp is the list. It is a sequence of
   elements that is optimized for adding items onto and removing items from the
   beginning of the list.
&lt;/p&gt;
&lt;p&gt;List literals have parentheses around the items. Of course, this also looks
   like a function call, so we have to quote the list to indicate that we want it
   treated as a list:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Lists can also be constructed using &lt;code&gt;list&lt;/code&gt;:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Finally, you can convert a vector or another collection to a list-like
   sequence using the &lt;code&gt;seq&lt;/code&gt; function. (The thing in square brackets below is
   a vector. See the next section for details.)
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;seq &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; 
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
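&lt;p&gt;Since lists are optimized for work at the front, the core functions that
   grow a list or take it apart act on its first item. A brief sketch:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;(conj &amp;#39;(1 2 3) 0)   ; (0 1 2 3): conj adds to the front of a list
(cons 0 &amp;#39;(1 2 3))   ; (0 1 2 3)
(first &amp;#39;(1 2 3))    ; 1
(rest &amp;#39;(1 2 3))     ; (2 3)
&lt;/pre&gt;&lt;/div&gt;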
&lt;p&gt;&lt;strong&gt;Vectors&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;Vectors are conceptually similar to lists, except that they make it easy to
   add items to or remove items from the end of the vector.
&lt;/p&gt;
&lt;p&gt;Vector literals also look like lists, except they use square brackets (&lt;code&gt;[&lt;/code&gt; and
   &lt;code&gt;]&lt;/code&gt;). Since these can’t be confused with function calls, you don’t need to
   quote them. You can also create a vector using the &lt;code&gt;vector&lt;/code&gt; function. And
   finally, you can convert a list to a vector using &lt;code&gt;vec&lt;/code&gt;:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;vector &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;vec &lt;/span&gt;&lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can also get part of a vector using &lt;code&gt;subvec&lt;/code&gt;. It takes one or two indexes
   into the vector. If called with one index, it returns everything from that
   index to the end of the vector; if called with two, it returns everything from
   the starting index up to, but not including, the second. The index for the
   first item is always zero, because that’s the way computers do things. (I
   could explain, but trust me: you don’t want to know.)
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;subvec &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
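&lt;p&gt;To illustrate the point about the end of a vector: &lt;code&gt;conj&lt;/code&gt; appends there,
   &lt;code&gt;peek&lt;/code&gt; reads the last item, and &lt;code&gt;get&lt;/code&gt; looks up an item by its
   zero-based index. A short sketch:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;(conj [1 2 3] 4)   ; [1 2 3 4]: conj adds to the end of a vector
(peek [1 2 3])     ; 3, the last item
(get [1 2 3] 0)    ; 1, the item at index 0
&lt;/pre&gt;&lt;/div&gt;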
&lt;p&gt;&lt;strong&gt;Sequences&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;Both lists and vectors (as well as some other things) are &lt;em&gt;sequences&lt;/em&gt;.
   Sequence functions take any type of sequence and operate on it. Some of these
   functions always return lists; some return the same type that was passed in;
   and some return information about the sequence or about what it contains.
   Here are some of the more important sequence functions:
&lt;/p&gt;
&lt;p&gt;&lt;em&gt;count&lt;/em&gt; returns the number of items in the sequence.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;em&gt;nth&lt;/em&gt; returns an element from the sequence. Remember that the index of the
   first item is zero.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;nth &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;em&gt;pop&lt;/em&gt; returns a sequence without one item. In lists, this is the first item;
   in vectors, it is the last.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;pop &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;em&gt;peek&lt;/em&gt; returns one item from the sequence. In lists, this is the first item;
   in vectors, it is the last.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;peek &lt;/span&gt;&lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;peek &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;em&gt;take&lt;/em&gt; returns the first &lt;em&gt;n&lt;/em&gt; items of the sequence. The result is a lazy sequence, which prints like a list.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;take &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;em&gt;conj&lt;/em&gt; returns the sequence with one item added to it. In lists, the item is
   added to the beginning of the list; in vectors, it is added to the end.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;conj &lt;/span&gt;&lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;conj &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;em&gt;into&lt;/em&gt; takes two sequences. It returns the result of calling &lt;code&gt;conj&lt;/code&gt; on the
   first sequence repeatedly, once for each item in the second sequence.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;into &lt;/span&gt;&lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;into &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
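&lt;p&gt;One way to picture &lt;em&gt;into&lt;/em&gt; is as a &lt;code&gt;reduce&lt;/code&gt; over &lt;code&gt;conj&lt;/code&gt;. (&lt;code&gt;reduce&lt;/code&gt; is a standard
   Clojure function that feeds each item of a collection, in turn, into a
   function along with a running result.) This is only a sketch of the idea, not
   necessarily how Clojure implements &lt;code&gt;into&lt;/code&gt; internally, but the two calls
   should agree:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;user=&amp;gt; (reduce conj &amp;#39;(1 2 3) [4 5 6])
(6 5 4 1 2 3)
user=&amp;gt; (= (into &amp;#39;(1 2 3) [4 5 6]) (reduce conj &amp;#39;(1 2 3) [4 5 6]))
true
user=&amp;gt; (reduce conj [1 2 3] &amp;#39;(4 5 6))
[1 2 3 4 5 6]
&lt;/pre&gt;&lt;/div&gt;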
&lt;p&gt;You can see that different types of sequences share functions, and that the
   functions do conceptually similar things. But exactly what happens depends on
   the combination of function and sequence type.
&lt;/p&gt;
&lt;p&gt;Also, note that vectors make it easy to add and remove items from the end of
   the vector. Because of this, they are ideal for storing the word while the
   stemmer processes it.
&lt;/p&gt;
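&lt;p&gt;Here’s a small sketch of what that might look like. This is not the stemmer’s
   actual code, just an illustration with a made-up word. &lt;code&gt;vec&lt;/code&gt; turns the
   string into a vector of characters, &lt;code&gt;peek&lt;/code&gt; and &lt;code&gt;pop&lt;/code&gt; work on the end of it,
   and &lt;code&gt;apply str&lt;/code&gt; glues the characters back into a string:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;user=&amp;gt; (vec &amp;quot;cats&amp;quot;)
[\c \a \t \s]
user=&amp;gt; (peek (vec &amp;quot;cats&amp;quot;))
\s
user=&amp;gt; (pop (vec &amp;quot;cats&amp;quot;))
[\c \a \t]
user=&amp;gt; (apply str (pop (vec &amp;quot;cats&amp;quot;)))
&amp;quot;cat&amp;quot;
&lt;/pre&gt;&lt;/div&gt;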
&lt;p&gt;&lt;strong&gt;Mappings&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;A mapping is the same as a hash table in Perl or a dictionary in Python.
   Actually, just as Clojure has different types of sequences, it also has a
   couple of different types of mappings. Hash maps are the most common, and that
   is the one I’ll be referring to below.
&lt;/p&gt;
&lt;p&gt;Literals for hash maps use curly braces (&lt;code&gt;{&lt;/code&gt; and &lt;code&gt;}&lt;/code&gt;), and they don’t require
   any punctuation between keys and values or between key/value pairs. You can
   include a comma, but that’s whitespace and is just ignored. Notice that when
   Clojure prints the mapping, it adds the commas, because they make it easier to
   read. We’ll use them too for that reason.
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="nv"&gt;user=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;a&amp;quot;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;the&amp;quot;&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;but&amp;quot;&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;a&amp;quot;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;but&amp;quot;&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;the&amp;quot;&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
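&lt;p&gt;If you want to convince yourself that the commas really are just whitespace,
   a quick check at the REPL should do it. The maps here are throwaway examples.
   The same holds for other literals, such as vectors:
&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;user=&amp;gt; (= {&amp;quot;a&amp;quot; 1 &amp;quot;the&amp;quot; 2} {&amp;quot;a&amp;quot; 1, &amp;quot;the&amp;quot; 2})
true
user=&amp;gt; [1, 2, 3]
[1 2 3]
&lt;/pre&gt;&lt;/div&gt;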
&lt;p&gt;&lt;em&gt;hash-map&lt;/em&gt; is a function that also creates a mapping.

