Clojure code, round 2
Clojure: continuing to look
My second look at Clojure
Casting and complements
As demonstrated to me, (. listener (isClosed)) returns a fresh Boolean object, rather than Boolean.TRUE or Boolean.FALSE. So, while the value looks fine in the REPL, when passed to (if (. listener (isClosed)) ...), it is always treated as true.
So, I made the following pair of complementary1 utility functions:
complement takes a function as its argument and returns a function that does exactly the same thing as that function, only with its return value inverted. Thus, (complement listener-closed?) returns a function that does exactly what listener-closed? does and then returns the opposite value.
In this case, it saves me from having to repeat listener-closed?'s details. That way, if I ever need to change listener-closed?, in order, say, to handle more than simple Java ServerSockets, I can do so without having to make any changes to listener-open?.
Note, though, that listener-open? is a def and not a defn. That's because defn is a macro that wraps an fn around its body. complement, on the other hand, returns an fn, so I only need to bind it to a Var.
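A minimal sketch of such a pair, written in present-day Clojure syntax. The ^ServerSocket type hint and the (.isClosed listener) call form are assumptions about a newer reader than the one used in this post, and the bodies are a guess at the shape rather than the original definitions:

```clojure
(import '(java.net ServerSocket))

;; Coerce the Java result with `boolean` so that `if` sees a canonical
;; true/false rather than a freshly allocated Boolean object.
(defn listener-closed? [^ServerSocket listener]
  (boolean (.isClosed listener)))

;; complement already returns a function, so a plain def is enough.
(def listener-open? (complement listener-closed?))
```

Any change to listener-closed? is picked up by listener-open? only if the latter is re-evaluated afterwards, since the def captures the function value at the time it runs.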
I then amended listener-run so that it uses this explicit cast rather than just taking what ServerSocket gives it.
So, while I'm adding casts, I should likely take care of the warnings on loading this file, if for no reason other than that they're starting to get excessively verbose.
Simply, I went through and placed type hints on functions that call out to Java in order to ensure that their methods resolve on load.
Since I also noticed a few new things while reformatting the Clojure manual into a single texinfo document for my own use, I'll also change my use of (. var (toString)) to use Clojure's str function:
(str x) Returns x.toString(). (str nil) returns ""
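As a small illustration of that manual entry (the name now is only a placeholder for this sketch):

```clojure
(import '(java.util Date))

(def now (Date.))

;; Both forms produce the same string for a non-nil value...
(= (str now) (. now (toString)))   ; true

;; ...but str also copes with nil, where calling toString would throw.
(str nil)                          ; ""
```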
Building a testing framework
All of those casts are fairly minor refactorings that just make it clearer to the compiler what a listener and a connection are. On the other hand, the bug that I started with this time would have been caught by the other thing I've been putting off. Namely, an actual testing framework, rather than just eyeballing results in the REPL.
Now, I could go find JUnit, install that, and then see how well Clojure links into it. But, well, that sounds like a new project1.
So, instead, what I'll look at is defining what it is that my functions are supposed to be doing, write a few tests, then add a framework to glue those tests together.
Ugliness ensues...
Which ended up looking like:
And, running it, I get:
So, it works. (Noting that I deliberately built one of those tests to
fail, to make sure that the testing function catches failure.) Except
that, well, how best to put it... Eww. I know that so far this is
only 150 lines, but I'm not going to remember to update the list of
tests down in the testing function every time I make a new one. So,
I'll run test-all, realise that my new test is missing, then go back and add the new test.
Adding structure...
No, I think what I need here is for the test itself to hold its own payload. Sounds like a reasonable time to add in a struct, really. A struct is just a defined hashmap taking keywords as arguments. For starters, I'll build one that just duplicates the list that I have below:
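A minimal sketch of what a struct looks like. The key names here (:name, :string, :function) and the placeholder values are assumptions for illustration, not the original definition:

```clojure
;; A struct definition names its keys up front.
(defstruct unit-test :name :string :function)

;; struct fills the keys positionally; keys left out default to nil.
(def example-test
  (struct unit-test "test-example" "An always-passing placeholder"))

(:name example-test)       ; "test-example"
(:function example-test)   ; nil
```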
Well, that does let me wrap tests into structures, but doesn't exactly let me manipulate them:
What I'm going to need is a way to build these tests and load them when I load the file, otherwise I'm just making the process of eyeballing my results in the REPL harder on myself.
After a bit of thought, I settle on this as my test list:
Nothing special at the moment. Just a simple structure defining a unit test as a name, description and function. Now, let's use that on one of the three rudimentary tests that I already have:
All right. Improving a tiny bit. Except that, after evaluating the file twice, I still have:
Makes sense, really. My new instance of test-listener-new has a new symbol. So, even though I know that they're the same function, there's no good reason for the compiler to.
What I really want, though, is to make defining a new test as simple
as adding a new (albeit specialised) function, rather than playing
around with remembering to call unit-test-new after. So, rather than
trying to get unit-test-new to parse out the name of the function, I'll rethink my approach.
What do I actually want? Well, the function, name and description added to my set of tests seamlessly. I think it's fair to assume that before I'm done, I'll want a few more qualifiers on tests as well. Since I don't know what those are, I won't add them to my structure yet.
Admitting that I need macros...
However, I'm in a conundrum here, because I don't want to add logic to my test definition every time I think of something new. Really, this is starting to sound like a good argument for a macro.
I'll start by just imitating what defn does. Because if I can't imitate that in a macro, the rest of this is going to start smelling like week-old moose.
So that takes a function name, a descriptive string and a bunch of unspecified "extras" and then proceeds to ignore the latter two before shoving the function name and the declaration into a defn. The backtick (`) marks the following list as a template, meaning that the defn won't be resolved right away.
The ~ and ~@ macro characters inside that template mean, respectively, to unquote fname (so that I'm not defining a function called user/fname) and to resolve fdecl into the sequence of values which it represents.
I have to do this because the & rest argument to an fn is considered to be a sequence. Were I simply passing it as an unquoted symbol, I'd end up with an extra set of parentheses around the function body when it's evaluated, which I don't want.
So, having rationalised every jot and tittle of what I just did, I'm going to dump test-listener-new - that is to say, a working function - into deftest via macroexpand-1 and see if it generates a plausible defn. (macroexpand-1 because I don't really want to see the full expansion of defn's internals, just this first layer that I added.)
And that resulting definition looks just like the definition of test-listener-new, only on one line and with defn as a qualified namespace/symbol pair.
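A sketch of a macro of that shape and its first-level expansion, in present-day Clojure. It is named deftest* here purely to avoid clashing with clojure.test/deftest, and the test body and description string are placeholders:

```clojure
;; Named deftest* only to avoid clashing with clojure.test/deftest.
(defmacro deftest* [fname description extras & fdecl]
  ;; description and extras are accepted but ignored for now
  `(defn ~fname ~@fdecl))

(macroexpand-1
 '(deftest* test-listener-new "Can a listener be created?" nil
    [] true))
;; => (clojure.core/defn test-listener-new [] true)
```

Without the ~@, fdecl would be inserted as one nested list, giving (clojure.core/defn test-listener-new ([] true)), which is the extra set of parentheses described above.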
Seeing as that worked, it means that all I should have to do is wrap defining the function in a do and add, as the second half, a function call adding the test to my list of tests.
Which, when expanded, gets me:
Which counts as an almost-but-not-quite. Because I'm quoting the empty list in the arguments that I'm passing to deftest, it expands to (quote ()), which is then unquoted to quote (). Which isn't exactly what I want.
On the other hand, the working syntax:
It works fine, and saves me a spurious quote. Furthermore, values in a struct default to nil, so if I add more bits to the unit-test structure later, I don't have to cope with accounting for them in any way other than doing nothing for nil, and my previously-defined tests will still work.
Applying these new tests.
So, I modify my test to use this macro and get:
And evaluating that adds it to ALL-TESTS.
I could have put the extras (that part that I have a suspicion that I'll need, but don't know why yet) at the end of the deftest macro. However, that means that I'd likely forget to add them. Also, it means that I have to make more changes to an existing test function than simply adding two pieces of information.
On the other hand, because it returns the result of unit-tests-add, deftest isn't quite a drop-in replacement for defn yet. Time to get it to return the function instead.
So, knowing the function name, and establishing the reasonable qualification that the function has been defined (which it should have been, as defining it is a part of the macro), what I want to return is the Var named by fname. Which leads me to:
Which, when I evaluate my test, returns the test function, ensuring that this is now a wrapper around defn that actually returns the right values.
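Putting those pieces together, a sketch of how such a macro might look in present-day Clojure. Holding ALL-TESTS in an atom, the body of unit-tests-add, the struct keys and the name deftest* (chosen to avoid clojure.test/deftest) are all assumptions of this sketch rather than the original code:

```clojure
(defstruct unit-test :name :description :function)

;; An atom is an assumption here; the original ALL-TESTS may be held differently.
(def ALL-TESTS (atom []))

(defn unit-tests-add [test]
  (swap! ALL-TESTS conj test))

(defmacro deftest* [fname description extras & fdecl]
  `(do
     ;; 1. define the function exactly as defn would
     (defn ~fname ~@fdecl)
     ;; 2. register it, using the function's own name as the test name
     (unit-tests-add (struct unit-test ~(str fname) ~description ~fname))
     ;; 3. return the Var, as defn itself would
     (var ~fname)))

(deftest* test-truth "A placeholder test that always passes" nil
  [] true)
```

Note that this sketch, like the approach it imitates, appends a fresh entry every time the form is re-evaluated.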
That done, I can finally get to converting my remaining two tests and the test-all function. The tests are easy:
To make the testing function work, it needs to be able to get at my vector of tests and to examine the tests.
A side-trip into accessors...
Here, however, I'm noticing that I'm duplicating what's already written once in my struct, just to have accessors. Also, :string is a daft name.
So, I'll make :string into :description and generate my accessors based directly off of the structure.
And testing that:
Hmm, not so good. Ok, time to poke at what exactly it is that I'm doing wrong here.
So, it appears that both def and instance? are seeing the unexpanded list before it gets turned into a symbol. Curiouser and curiouser.
Ok, so I redefined my-sym three times. After I got as far as:
I've decided that I'll no longer make a big deal out of building my accessors from my structure, as I seem to have a merry clash between what def wants and what I know how to offer to it.
I'll admit that this failure irks me, but I'm going to leave it as something to poke at later, as I'm getting confused as to operator and macro precedence.
So be it:
Giving up and going back to the test function.
Now, on to making the test function actually work with this structure, rather than playing with trying to be clever with generating accessors.
However, testing that, it shows only that the unit test functions exist. Nothing more.
I need it to evaluate the returned function, preferably with the capacity to insert arguments as needed. Hmm.
Will leave that as a working test framework for the moment and actually move back to writing tests.
So there's my complete set of tests to date. Which I can then run without changing the test-all that I had already defined.
Ok, so I notice two things here. One is that every test works except what should be the last one. The other is that, well, it isn't the last one. In fact, the tests are in no particular order when, in fact, they should have a certain level of ordering in them.
Setting up after statements...
Sounds like I finally have a use for that extras argument that, until now, I haven't been using.
There. Now tests that should have sequence can have them. But this means that I need to actually extract my function for generating the list of unit tests and make it respect the rules that I just added.
Which gives me the same result for test-all. Now, to sort this data rather than just dropping it in a list.
So what that's supposed to do is run through the unsorted list, adding things to the sorted list when they either have no after statement or their after statement is met.
Now, I'll bind that helper into a loop:
And running it:
It hangs.
Stepping through the helper function in a long series of examinations tells me that it's hanging at the last step. Then, after changing it to reflect my actual intent (I typoed sorted as unsorted), I start to think about whether this is actually a good way to build a list. After all, it will take up to the factorial of the length of the list of tests to actually build it.
I poke with comparators for a bit and find that my somewhat fuzzy requirements don't exactly meet comparator's exacting demands2. Then I decide to revisit my iffy list function, but move blocks of tests at a time.
This one takes the unsorted list, breaks it into three parts, and moves what it can over to sorted at each step. It sorts my list of functions in three steps, rather than twenty. However, those steps are relatively expensive. I might as well compare the two functions' performances while I've them both on my screen.
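To make the "move blocks at a time" idea concrete, here is one way it could be sketched in present-day Clojure. This is not the function from this post: the name sort-tests, the use of plain maps with keyword names, and the decision to append tests with unmet :after values rather than loop forever are all assumptions of the sketch.

```clojure
(defn sort-tests
  "Orders tests so that each one follows the test named in its :after key.
   Tests whose :after is :all are held back and placed last."
  [tests]
  (let [finals (filterv #(= :all (:after %)) tests)]
    (loop [sorted   []
           unsorted (vec (remove #(= :all (:after %)) tests))]
      (let [done   (set (map :name sorted))
            ;; a test is ready when it has no :after, or its :after is already placed
            ready? (fn [t] (or (nil? (:after t))
                               (contains? done (:after t))))
            ready  (filterv ready? unsorted)]
        (if (empty? ready)
          ;; nothing left to move: append any stragglers, then the :all tests
          (vec (concat sorted unsorted finals))
          ;; move the whole block of ready tests across in one step
          (recur (into sorted ready)
                 (vec (remove ready? unsorted))))))))

(map :name
     (sort-tests [{:name :c :after :b}
                  {:name :z :after :all}
                  {:name :a}
                  {:name :b :after :a}]))
;; => (:a :b :c :z)
```

Each pass scans the whole unsorted list, so a long dependency chain still costs one pass per link; the gain is only that independent tests move together.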
Interesting. The old one actually did better than the new. Now, I could count their operations, but it would be more fun to get the REPL testing this for me.
There. Made a hundred test macros with no structure to them.
gensym just makes a symbol with a guaranteed-unique name. That ensures that, in generating these tests, I don't have to first come up with my own means to make unique names.
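A quick way to see that property at the REPL (the "test-" prefix and the name generated-names are only placeholders for this sketch):

```clojure
;; Each call to gensym yields a symbol not handed out before in this session.
(def generated-names (doall (repeatedly 100 #(gensym "test-"))))

(count (set generated-names))   ; 100
```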
Now to try again:
A gap is starting to form, I think. What about adding some test macros that do rely on other tests?
So, that adds in a thousand tests that do follow a structure, as they all have an after statement this time.
And I run the test again ...
And I go get coffee ...
5 minutes...
It seems to be still running...
Maybe a thousand was a bad seed...
Oh, there it is:
So, definitely, as the structure gets more complicated, the slightly more sensible approach is pulling ahead.
But just for fun, as I'm going to bed:
Ok, somewhere before generating a 100000-length tree of tests, it crashed, due to a lack of symbols. Fair enough. That test was just waving things around to see if I can.
Revisiting generating my accessors.
However, going back one step, all that silly mucking around with gensyms and defines gives me an idea:
There. Now I know that, should I later decide to add more keys to
that structure, the accessors will be there and waiting for me.
Except that, in a freshly-loaded REPL (with this file loaded via load-file rather than dumping it in), I find that my accessors are no longer loaded.
All right, I'll take the hint, stop picking at this, and leave them as normal functions rather than generated ones.
Revisiting actually running the tests
So, I try running all my tests again. And the test function (which I haven't changed since I started that little digression into sorting my tests) still does what it's told, albeit with a tiny wart and a failed test.
And here's the offending test. I'll step through it one bit at a time and see what's going wrong in actuality.
Ok, so I messed up the ordering of greater-than and less-than. I find that I do this a lot in Lisp, because of how I read (> 0 ... mentally. Namely, as a single function, testing if what follows is greater than the integer that I've put there. Or, in other words, exactly backwards.
So, change that to (> (. result (length)) 0) and re-run the test function:
Erk. All right, that's getting a bit odd. Changing it back gets me the previous, broken test, but that's not exactly helpful. Looking at that dump, I can see that, first off, the error happened in test-connection-run, but, secondly, it happened when trying to invoke an instance method.
Further, on that line, the only one I've been changing, there is indeed an instance method. However, on a successful connection, even one with no data, it should return an empty string.
The answer here lies with the next line.
Before, the failure at the successful length was causing and to stop and return false, in a "short-circuiting" behaviour that's fairly normal3.
Now, what that reminds me is that (connection-run) is going to return nil when it can't open a connection at all. So thus, I'm calling (length) on nil: (. nil (length)) which, deservedly, throws an error. So, in this case, the reflection warnings about length not being resolved were helping me rather than nagging.
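One way to guard against that, sketched in present-day Clojure. The function name non-empty-result? is hypothetical, and the ^String hint assumes the result is always either a string or nil:

```clojure
(defn non-empty-result? [result]
  ;; and stops at the first falsy value, so .length is never called on nil
  (and result
       (> (.length ^String result) 0)))

(non-empty-result? nil)      ; nil
(non-empty-result? "")       ; false
(non-empty-result? "12:00")  ; true
```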
And, testing all of that, it does indeed work out the way it ought to, except for the aforementioned wart. To point it out, "Could not connect to 127.0.0.1 on port 51345" would appear to indicate an error, whereas I know that it's actually a valid part of both the tests for which it appears.
So, here's the problem: those errors are useful rather than throwing an exception when connecting manually, but they give the wrong indication (to me at least) when connecting automatically.
So, what I'd like is to shove any messages generated into something other than standard output, where I can inspect them if I so choose.
So, what that does is, before doing anything, create a StringWriter outside the loop. Then, when it comes time to perform the test, it temporarily re-binds *out* (the variable telling Clojure where to print to) to that StringWriter.
This means, now, that I can return the success/failure of the set of tests, as well as any error messages, as a list.
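The capture technique can be sketched for a single test as follows. The name run-captured is hypothetical, and unlike the version described above it creates one StringWriter per call rather than sharing one across a loop:

```clojure
(import '(java.io StringWriter))

(defn run-captured
  "Calls test-fn with *out* redirected, returning (result printed-text)."
  [test-fn]
  (let [writer (StringWriter.)
        ;; *out* is rebound only for the duration of the call
        result (binding [*out* writer] (test-fn))]
    (list result (str writer))))

(run-captured (fn [] (println "Could not connect") nil))
;; => (nil "Could not connect\n")
```

Because binding is thread-local and dynamically scoped, anything printed by functions called from test-fn on the same thread is captured too, while output elsewhere is unaffected.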
Afterthoughts
So, now I have a very basic (albeit ostensibly extensible) testing framework, as well as a setup that generates that framework automatically.
So, now a few (code-dump-free) thoughts as I actually make sure that my in-file commentary is up to date.
Mistakes
What are the current downsides to this system and this approach?
Well, for starters, the test framework only recognises nil and true. This means that, when an early test manages to hang and throw an exception, it brings down the test framework as well.
In laziness, I chose to use return values in order to not have to play too much with caught/thrown errors, and instead passed truth/falsity around. Not necessarily the best approach, but I haven't decided whether throwing/catching is better in this case5.
Also, I'll be the first to admit that I'm handling errors in a bit of a cheap way, and one that, were I actually turning this into a real component, rather than something that I'm building as the "build one to throw away" of my learning process, I'd have started to question at about the point that I was wrapping a binding around println in order to isolate my error handling away from my testing function.
Further, the 100 lines defining my simple testing framework should really be extracted from this client-server application. They're only loosely attached to each other, so I should be able to give them their own namespaces with minimal hassle. (I'll wait for the next Clojure release to do that, though, because I know I'll have to change a bunch in order to handle upcoming changes, and would prefer to discuss namespaces with myself once the new system is in.)
Lessons
In contrast, what have I learned? Well, for one, do not try to generate 17 000 lines of Lisp at once via an elisp command. Or, at least, save your document first. (I lost all of my poking at comparators via that gaffe.)
Also, that macros really aren't especially terrifying. Believe me or don't, but deftest was the first Lisp macro that I've ever written. I'm still tripping over order of operations and how forms get expanded and names get resolved, especially when dealing with things like def and load-file, but I do think that that comes down simultaneously to me trying to be too clever and not actually understanding what it is that I'm doing.
As I write more, I keep finding new boot.clj functions that let me write a verbose expression a lot more succinctly and clearly. My new ones are with-open, some, not-any? and anything beginning with sort.
I'm finding that having the manual as a single, indexed document is making me try harder to use Clojure's functions for doing something, rather than rolling my own off of Java's library.
As I've found before, having a single unified environment in which to write code, then evaluate it selectively is a great asset. It means, for me at least, that I spend less time having to mentally page-swap as I try to remember what I was going to do next.
There are some caveats to that, though: Number one is that my current environment state doesn't necessarily match what I have written down. I found myself, this session, killing my REPL on occasion just to make sure that it was matching my work as exactly as possible.
Also, it makes me lazy about making sure that, e.g., functions are declared before another function references them. Which works just fine, right up until I try to load the file and realise that my selective evaluation has (yet again) created a file that needs minor reordering in order to load.
But that's a trivial irritation. And far outweighed by being able to write this (as HTML-heavy Markdown), manipulate the actual source of this program and run the REPL, all in the same environment, and all sharing data6.
Future / Goals
Well, beyond the nebulous goals I dropped at the end of my initial look, I now have a few more.
- Cease using nil as an error value.
- Move to separate test / application namespaces.
- Look at existing functions that do the same thing (the closes and length-of-string come to mind) and push them into multimethods instead. (Yes, the long-term goal here is to meander through all 21 chapters of the manual as I find needs.)
- Explore wrapping an extant Java unit testing facility in Clojure rather than rolling my own (a nice learning exercise but not exactly a valid solution to anything beyond making myself try out the language).
Ok, it sounds like a fine idea for a project, but I'm trying not to let my focus wander all over the place, so I'll keep this experiment/application self-contained.
To clarify my failure to use comparators a little bit, a comparator is an object/function that grabs some items, compares them, and replies which one is greater.
    user=> (sort (comparator >) [5 4 8 4 3 8 4 6])
    (8 8 6 5 4 4 4 3)
    user=> (sort (comparator (fn [x y] (println x y (> x y)) >)) [5 4 8 4 3 8 4 6])
    5 4 true
    4 8 false
    8 4 true
    3 8 false
    8 4 true
    4 6 false
    4 3 true
    (5 4 8 4 3 8 4 6)
    user=> (sort (comparator (fn [x y] (println x y (>= x y)) >)) [5 4 4 4])
    5 4 true
    4 4 true
    4 4 true
    (5 4 4 4)

Here's where that falls apart for my purposes, though: when sorting numbers, letters, names, etc., I know the relative ordering of any two. On the other hand, with these tests, unless one of four conditions is met, the ordering of the two items is unknown.
- :after is nil. The test comes before all others.
- :after is :all. The test comes after all others.
- Both tests have the same :after. They are equivalent.
- One test's :name is the other test's :after. They have to go in that order.
My problem is that most of my comparisons do not meet these standards. Hence my sort functions both having a means to defer looking at an item until a condition is met.
This meaning, in a logical condition, to stop evaluating the condition as soon as its truth / falsity becomes known. Here, and in other tests, I was using it as somewhat of a shorthand to say, "If the test fails at any point, the entire test has failed."
On the other hand, using logical predicates (and, or, etc.) as control structure can quickly get unreadable at a glance.

    (or (or (and (test1) (test2))
            (test3))
        (or (and (test4) (test5) (test6))
            (and (test6) (test7)
                 (or (test8) (test8a) (test8b)))
            (test9)))

A contrived example, sure, but I lost the flow of that logical tree midway through writing the test. I also tried to devise it so that the tree structure would bear at least a passing resemblance to something that had been modified to deal with a later condition4
(For what it's worth, it says: Perform test1. On success, perform test2. On success of test1 and test2, return success. Otherwise, perform test3. On success of test3, return success. On failure of test3, try test4, test5, test6, stopping if there's failure. And so on, until, if all other tests have failed, return the success/failure of test9.)
Yes, I'm aware that the consistent (and possibly even correct) answer is throw/catch. However, for the purposes of a simple demo application, building Throwables to toss around strikes me as unnecessarily obfuscating the mechanics of what I'm doing.
Emacs, 164 characters wide, divided into two columns, with a full column devoted to the REPL, 20 lines to the source and 40 to this text. Which raises a question: With the admonition to not write any function longer than a single screen, with whose screen length?
For the record, the longest function in this example is 28 lines and the shortest is 1 line long. That's likely a direct consequence of my having kept my source scrunched up with only a little window into it7.
-
(save-excursion (goto-char (point-min)) (re-search-forward "^ *$" (point-max) nil) (message (mapconcat (lambda (lc) (format "%d" lc)) (let ((lcount nil)) (while (not (eobp)) (forward-sexp) (setq lcount (append lcount (list (count-lines (save-excursion (backward-sexp) (point)) (point)))))) lcount) " ")))
And that's why Emacs one-liners are scary. Because before you know it, they're actually 15 lines long, all scrunched into an Eval: prompt.
My first look at Clojure: Source
Here's the resulting source code from my experimentation.
    (import '(java.net BindException ServerSocket Socket)
            '(java.lang.reflect InvocationTargetException)
            '(java.util Date)
            '(java.io InputStream OutputStream)
            '(java.util.concurrent Executors))

    ;;; Utility functions

    (defn current-time []
      (. (new Date) (toString)))

    (defn byte-arr-from-string [str]
      (. str (getBytes)))

    (defn test-byte-array-from-string
      ([] (test-byte-array-from-string (current-time)))
      ([str]
         (let [barr (byte-arr-from-string str)
               bseq (map (comp char (appl aget barr))
                         (range (alength barr)))
               chseq (map char str)]
           (and (== (alength barr) (count bseq) (count chseq))
                (nil? (first (filter false? (map eql? bseq chseq))))))))

    (defn string-from-byte-sequence [coll]
      (reduce strcat (map char coll)))

    ;;; Listener functions
    ;;; These control the server

    (defn listener-new [port]
      (try (new ServerSocket port)
           (catch BindException except
             (println "Address is already in use."))))

    (defn listener-wait [listener]
      (. listener (accept)))

    (defn listener-close [listener]
      (. listener (close)))

    (defn listener-send [lsocket]
      (.. lsocket (getOutputStream) (write (byte-arr-from-string (current-time))))
      (.. lsocket (getOutputStream) (close))
      lsocket)

    (defn listener-run [listener port]
      (loop [socket nil]
        (if (. listener (isClosed))
          listener
          (do (when socket
                (. (listener-send socket) (close)))
              (recur (listener-wait listener))))))

    (defn listener-run-in-background [port]
      (let [listener (listener-new port)
            exec (. Executors (newSingleThreadExecutor))
            run (appl listener-run listener port)]
        (when listener
          (. exec (submit run)))
        listener))

    ;;; Connection functions
    ;;; These control the client.

    (defn connection-new
      ([port] (connection-new "127.0.0.1" port))
      ([address port]
         (try (doto (new Socket address port)
                (setSoTimeout 5000))
              (catch InvocationTargetException except
                (println (strcat "Could not connect to " address " on port " port))))))

    (defn connection-read [conn]
      (let [instream (. conn (getInputStream))
            reader (fn [] (try (. instream (read))
                               (catch InvocationTargetException except -1)))]
        (loop [bytes nil
               current-byte (reader)]
          (if (== current-byte -1)
            bytes
            (recur (concat bytes (list current-byte))
                   (reader))))))

    (defn connection-close [conn]
      (. conn (close)))

    (defn connection-run [port]
      (let [conn (connection-new port)
            str (when conn
                  (string-from-byte-sequence (connection-read conn)))]
        (when conn
          (connection-close conn))
        str))