Clojure transducers from the ground up: the practice

This is the second part of an article dedicated to Clojure transducers. In the first part we discussed transducer fundamentals and the functional abstraction they represent. In this article, we are going to explore how they are used in practice, including:

  • Composability
  • Reuse across transports with core.async
  • A logging stateless transducer example
  • An interleave stateful transducer example
  • Parallelisation

Composability

Transducers have the important property of isolating transforming reducing functions (like (map inc) or (filter odd?)) from the necessary sequential iteration. One interesting consequence of this design is that transducers compose like any other function. Have a look at the following example:

(def inc-and-filter (comp (map inc) (filter odd?)))
(def special+ (inc-and-filter +))
(special+ 1 1)
;; 1
(special+ 1 2)
;; 4

(map inc) and (filter odd?) each generate a function with the same interface: taking a reducing function and returning a new reducing function. We can compose them with comp to form a new function inc-and-filter which is the composition of the two.

We can also provide an argument + to inc-and-filter, which returns a new function. We call the new function special+ because it enhances the normal + with two transformations. We can see the effect of incorporating inc and odd? into special+ by calling it with two arguments: the second argument is incremented and then added to the first only if the incremented value is odd. We can use reduce with special+ as usual and compare it with the same version using a simple +:

(reduce special+ 0 (range 10))
;; 25


(reduce + 0 (filter odd? (map inc (range 10))))
;; 25

The two reduce calls appear the same on the surface, but the mechanics are completely different. In the first case, our special+ applies the transformations while iterating through the input sequence. The second case produces an intermediate sequence for each transformation, plus one for the final reduction. More importantly, we can now isolate the transformations from reduce. By using transduce we make that explicit in the arguments:

(transduce (comp (map inc) (filter odd?)) + (range 10))
;; 25

How many transducers (our transforming reducing functions) can be composed this way? The answer is … as many as you like:

(def x-form
  (comp
    (map inc)
    (filter even?)
    (dedupe)
    (mapcat range)
    (partition-all 3)
    (partition-by #(< (apply + %) 7))
    (mapcat flatten)
    (random-sample 1.0)
    (take-nth 1)
    (keep #(when (odd? %) (* % %)))
    (keep-indexed #(when (even? %1) (* %1 %2)))
    (replace {2 "two" 6 "six" 18 "eighteen"})
    (take 11)
    (take-while #(not= 300 %))
    (drop 1)
    (drop-while string?)
    (remove string?)))


(transduce x-form + (vec (interleave (range 18) (range 20))))
;; 246

The example above is a famous gist created by Rich Hickey (the inventor of Clojure) to show off the available transducers in the standard library.

But there’s more: composition can be applied on top of additional composition, allowing programmers to isolate and name concepts accordingly. For example, we could group the previous, long chain of transducers into more manageable chunks giving each chunk a specific name:

(def x-clean
  (comp
    (map inc)
    (filter even?)
    (dedupe)
    (mapcat range)))


(def x-filter
  (comp
    (partition-all 3)
    (partition-by #(< (apply + %) 7))
    (mapcat flatten)
    (random-sample 1.0)))


(def x-additional-info
  (comp
    (take-nth 1)
    (keep #(when (odd? %) (* % %)))
    (keep-indexed #(when (even? %1) (* %1 %2)))
    (replace {2 "two" 6 "six" 18 "eighteen"})))


(def x-calculate
  (comp
    (take 11)
    (take-while #(not= 300 %))
    (drop 1)
    (drop-while string?)
    (remove string?)))


(def x-prepare
  (comp x-clean x-filter))


(def x-process
  (comp x-additional-info x-calculate))


(def x-form
  (comp x-prepare x-process))


(def data (vec (interleave (range 18) (range 20))))

(transduce x-form + data)
;; 246

As you can see, comp can be used several times to compose over already composed transducers, producing named computations that are now easy to read and reuse.

Note that, although there are structural similarities, the thread-last macro ->> doesn’t allow for a similar kind of composition. For example, the following expand-and-group composition is not possible:

(def coll (range 10))

(def expand
  (->> coll
    (map inc)
    (filter even?)
    (dedupe)
    (mapcat range)))

(def group
  (->> coll 
    (partition-all 3)
    (partition-by #(< (apply + %) 7))))

(def expand-and-group
  (comp expand group))

(expand-and-group)
;; Execution error (ClassCastException)
;; clojure.lang.LazySeq cannot be cast to clojure.lang.IFn

There are many things not working as expected in the example above. First of all, the expand and group definitions already contain the result of the computation. Their composition is not the composition of their computational recipes but of their results. comp works as expected, but trying to invoke the composition produces the nonsensical use of a lazy sequence as a function.

Reuse across transports

There is another important aspect of transducers that contributes to reuse (apart from composition). The composition of transducers happens without any knowledge of the input they will be applied to. As a consequence, we can reuse transducers with other transports.

A transport is essentially the way a collection of items is iterated. One of the most common transports in the standard library is sequential iteration (that is, using sequence functions like map, filter, etc.). But there are other examples of transports. The core.async library for instance, implements iteration through an abstraction called “channel” which behaves similarly to a blocking queue.

The following example shows a transducer chain, similar to the ones we have seen before, processing incoming elements from a core.async channel:

(require '[clojure.core.async :refer [>! <! <!!] :as a])


(def xform (comp (filter odd?) (map inc)))


(defn process [items]
  (let [out (a/chan 1 xform)
        in (a/to-chan items)]
    (a/go-loop []
      (if-some [item (<! in)]
        (do
          (>! out item)
          (recur))
        (a/close! out)))
    (<!! (a/reduce conj [] out))))


(process (range 10))
;; [2 4 6 8 10]

The transducer xform is now attached to the out channel. Every item pushed onto the channel inside the go-loop passes through the transducer chain. Here the input is a range of 10 numbers turned into a channel, but items could be streamed asynchronously (and often are; for example, as server-side events on a web page).

The go-loop works sequentially in this case, but core.async also contains facilities to apply the same transducers in parallel. A pipeline is an abstraction that supports multi-threaded access. It sits between an input and an output channel and it can be used with transducers:

(defn process [items]
  (let [out (a/chan (a/buffer 100))]
    (a/pipeline 4 out xform (a/to-chan items))
    (<!! (a/reduce conj [] out))))


(process (range 10))
;; [2 4 6 8 10]

The pipeline construct accepts a maximum number of parallel threads (usually the same as the number of physical cores available in the system). Incoming items from the input channel are processed in parallel through the transducer chain, up to the maximum number of threads.

Custom transducers

The standard library provides some transducer-enabled functions, which cover most of the common scenarios. When those are not enough, you can create your own. However, a custom transducer needs to obey a few rules to play well with other transducers:

  • It must support zero-, one- and two-argument calls.
  • The zero-argument call defines the initial value for the reduction. This arity is not currently used by transducers, but it is recommended to implement it anyway by invoking the reducing function (usually abbreviated “rf”) with no arguments. This conservative approach plays well with the current behaviour of transduce, in case a future change to the standard library requires the zero-argument call.
  • The single-argument call defines the “tear-down” behaviour. This is especially useful if the transducer contains any state that needs deallocation. It will be called once with the final result of the reduction. After providing any custom logic (optional), the custom transducer is expected to call “rf” with the result, so other transducers in the chain also have their chance for cleanup.
  • The two-argument call represents the standard reduction step and should contain the business logic of the custom transducer. This is the typical reduce operation: the first argument represents the result so far, followed by the next item from the input. The reducing function “rf” is expected to be invoked after applying any transformation, propagating the call to the other transducers in the chain.
  • A transducer can decide to terminate the reduction at any time by calling reduced on the result. Other transducers should pay attention to reduced elements and prevent unnecessary computation. This is, for example, what happens with transducers like take-while. When the predicate becomes false, no other items should be reduced and the computation should stop. If there are other computationally intensive transducers after take-while, they should also stop processing and let the result through without doing anything else.

This set of rules describes the protocol to follow to execute a pipeline of transducers. The primary goal of the protocol is to give a fair chance to each transducer to execute (or stop) a computation, initialise internal state or perform cleanup logic.
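As a minimal sketch of these rules, here is a hand-rolled version of (map f) showing the three required arities (the name map-xform is illustrative, not from the standard library):

```clojure
(defn map-xform [f]
  (fn [rf]
    (fn
      ([] (rf))                       ;; init: delegate to rf
      ([result] (rf result))          ;; completion: let rf perform any cleanup
      ([result input]                 ;; step: transform, then propagate to rf
       (rf result (f input))))))

(transduce (map-xform inc) + (range 5))
;; 15
```

Being stateless, this transducer has nothing to deallocate, so the single-argument arity simply forwards to “rf”.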

Not surprisingly, transducers maintaining state are called “stateful” to distinguish them from “stateless” transducers. Maintaining state in a transducer is quite common and roughly half of transducers in the standard library do.

It is so common that a new concurrency primitive called volatile! was created for the purpose. A volatile! is a concurrency construct that allows multiple threads to promptly see a value as soon as it’s updated. It’s very different from the other concurrency primitives (var, atom, ref and agent), which instead “protect” state from concurrent access.

If you are wondering why volatile! is necessary, the answer is that with core.async pipelines the same transducer instance could run on multiple threads. A volatile! guarantees that the state inside the transducer is seen by all threads as soon as possible. So if the same transducer happens to run on a different thread, the thread will see the most recent internal state of the transducer. The reason why this might not be the case without a volatile! has to do with aggressive use of CPU registers as caches and instruction re-ordering, a common JVM optimisation.
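For reference, the volatile! API mirrors the atom API minus the atomicity guarantees; a quick sketch (the var name state is illustrative):

```clojure
(def state (volatile! 0))  ;; create a volatile holding 0
(vswap! state inc)         ;; like swap!, but without atomic compare-and-set
(vreset! state 42)         ;; like reset!
(deref state)
;; 42
```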

A logging (stateless) transducer

So, let’s get practical and create our first stateless transducer. In this example, we assume that we just created a complicated, but nicely composed, transducer chain for data processing (we are simulating it here with a much shorter and simpler one).

How would you go about debugging it? You might need to understand at which step in the chain things are not working as expected. Here’s an idea for a logging transducer with a printing side effect. Other than printing on screen, the transducer is not altering the reduction:

(defn log [& [idx]]
  (fn [rf]
    (fn
      ([] (rf))
      ([result] (rf result))
      ([result el]
        (let [n-step (if idx (str "Step: " idx ". ") "")]
          (println (format "%sResult: %s, Item: %s" n-step result el)))
        (rf result el)))))


(sequence (log) [:a :b :c])
;; Result: null, Item: :a
;; Result: null, Item: :b
;; Result: null, Item: :c
;; (:a :b :c)

In this example, log is a transducer accepting an optional single argument “idx”. When “idx” is present, log additionally prints it, assuming “idx” is the position of the transducer in a composed chain (we’ll see in a second how that information can be used). Before being composed via comp, transducers are just a list of functions. The idea is that we can interleave the list with the logging transducer ahead of comp and use a dynamic variable to control when to print to the console:

(def ^:dynamic *dbg?* false)


(defn comp* [& xforms]
  (apply comp
    (if *dbg?*
      (->>
        (range)
        (map log)
        (interleave xforms))
      xforms)))


(transduce
  (comp*
    (filter odd?)
    (map inc))
  +
  (range 5))
;; 6


(binding [*dbg?* true]
  (transduce
    (comp*
      (filter odd?)
      (map inc))
    +
    (range 5)))
;; Step: 0. Result: 0, Item: 1
;; Step: 1. Result: 0, Item: 2
;; Step: 0. Result: 2, Item: 3
;; Step: 1. Result: 2, Item: 4
;; 6

Here, comp* is a thin wrapper around the normal comp function in the standard library. It has the responsibility to check the *dbg?* dynamic variable. When *dbg?* is true, we interleave our logging transducer to an already existing chain of transducers passed as input. Otherwise we do nothing.

The first example invocation of transduce shows that comp* behaves like the normal comp. When we bind *dbg?* to true though, the logging transducer starts printing. Several logging transducer instances are created, as many as necessary to interleave the input chain. Each logging transducer carries the “idx” information about its position in the chain, so it can print it. By looking at the source code, we know that ‘Step: 0’ logs the output of the transducer at index 0 in the (zero-indexed) list passed to comp* (which is (filter odd?)), while ‘Step: 1’ logs the output of (map inc). If we see some odd value of ‘Item’ or ‘Result’, we know which step is producing it and we can take action.

An interleave (stateful) transducer

Let’s now see an example of a stateful custom transducer. The following chain of sequential transformations shows a way to implement the Egyptian multiplication algorithm. The Egyptians didn’t use multiplication tables to multiply numbers; they worked out the operation by decomposing numbers into powers of two:

(defn egypt-mult [x y]
    (->> (interleave
           (iterate #(quot % 2) x)
           (iterate #(* % 2) y))
      (partition-all 2)
      (take-while #(pos? (first %)))
      (filter #(odd? (first %)))
      (map second)
      (reduce +)))


(egypt-mult 640 10)
;; 6400

We would like to express this algorithm with transducers, but there is no interleave transducer in the standard library, just the normal interleave. All the other operations inside the thread-last ->> form are available as transducers.

How should we design the interleave transducer to work with transduce? First of all, transduce does not support multiple collections as input. So the idea is to use one sequence as the main input for transduce and the other as the sequence of interleaving elements. The sequence of interleaving elements lives inside the state of the interleave transducer, which takes the first item to interleave at each reducing step. The remaining elements are stored as state inside the transducer while waiting for the next call. Without further ado, here’s the interleave transducer:

(defn interleave-xform                    ; 1
  [coll]
  (fn [rf]
    (let [fillers (volatile! (seq coll))] ; 2
      (fn
        ([] (rf))
        ([result] (rf result))
        ([result input]
         (if-let [[filler] @fillers]      ; 3
           (let [step (rf result input)]
             (if (reduced? step)          ; 4
               step
               (do
                 (vswap! fillers next)    ; 5
                 (rf step filler))))
           (reduced result)))))))         ; 6
  1. interleave-xform models the same semantics as the interleave function in the standard library: it interleaves elements up to the end of the shortest sequence. interleave-xform contains all the required arities: zero-argument, single-argument and two-argument.
  2. interleave-xform assumes the interleaving elements come from a collection passed when creating the transducer. We need to keep track of the remaining items in the sequence as we consume them, so the rest are stored in a volatile! instance.
  3. During the reducing step, we verify that there is at least one more element to interleave before allowing the reduction. Note the use of if-let and destructuring on the first element of the content of the volatile! instance.
  4. We need to check whether another transducer along the chain has requested the end of the reduction. In that case, we obey without propagating any further reducing step.
  5. If, instead, we are not at the end of the reduction and we have more elements to interleave, we can proceed to update our volatile! state and call the next transducer with the “filler” element coming from the internal state. Note that at this point this is the second time we invoke “rf”: the first was the normal reducing step, the second is an additional reducing step for the interleaving.
  6. In case we don’t have any more items to interleave, we end the reduction using reduced. This prevents nil elements from appearing in the final output, exactly like normal interleave.
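Before plugging it into the multiplication, we can exercise interleave-xform on its own (the definition is repeated verbatim so the snippet is self-contained). Note how the reduction stops as soon as the shorter interleaving collection runs out:

```clojure
(defn interleave-xform [coll]
  (fn [rf]
    (let [fillers (volatile! (seq coll))]
      (fn
        ([] (rf))
        ([result] (rf result))
        ([result input]
         (if-let [[filler] @fillers]
           (let [step (rf result input)]
             (if (reduced? step)
               step
               (do
                 (vswap! fillers next)
                 (rf step filler))))
           (reduced result)))))))

;; Only two fillers are available, so only two input items are consumed.
(sequence (interleave-xform [:a :b]) (range 5))
;; (0 :a 1 :b)
```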

With interleave-xform in place, we can express the egyptian multiplication method as follows:

(defn egypt-mult [x y]
  (transduce
    (comp
      (interleave-xform (iterate #(* % 2) y))
      (partition-all 2)
      (take-while #(pos? (first %)))
      (filter #(odd? (first %)))
      (map second))
    +
    (iterate #(quot % 2) x)))


(egypt-mult 4 5)
;; 20

The second iteration of increasingly doubling numbers is now considered the interleaving sequence that we pass when creating the transducer. The other iteration with increasingly halved numbers is instead the normal input for transduce. The two sequences are interleaved together and partitioned into vectors as part of the transducing step. Apart from the interleaving part, the rest of the processing is a mechanical refactoring from a sequence operation into the related transducer version.

Laziness

There are four main ways to apply transducers in the standard library: transduce, sequence, eduction and into. Up until now we’ve seen one of the most popular, transduce, which is designed to completely consume the input collection. Even when transduce is used to output a new sequence, it doesn’t work lazily, as we can quickly verify by using a counter on the number of transformations happening on each item:

(def cnt (atom 0))
(take 10 (transduce (map #(do (swap! cnt inc) %)) conj () (range 1000)))
;; (999 998 997 996 995 994 993 992 991 990)
@cnt
;; 1000

In the example above, you can see that transduce completely consumes the lazy sequence, despite the fact that we want just the first 10 elements (note that conj on a list prepends elements at the beginning of the current list). The counter shows that the transducer has been called on all of the items, fully evaluating the range. into uses transduce underneath and has the same behaviour. If you are interested in applying a transducer chain lazily, gradually consuming the input, sequence or eduction will do that:

(def cnt1 (atom 0))
(let [res (eduction (map #(do (swap! cnt1 inc) %)) (range 1000))]
  (doall (take 10 res))
  @cnt1)
;; 33


(def cnt2 (atom 0))
(let [res (sequence (map #(do (swap! cnt2 inc) %)) (range 1000))]
  (doall (take 10 res))
  @cnt2)
;; 33
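For completeness, the eagerness of into (which, as mentioned, is built on transduce) can be checked in the same style (the counter name cnt3 is illustrative):

```clojure
(def cnt3 (atom 0))
;; into eagerly pours the whole transformed input into the vector.
(into [] (map #(do (swap! cnt3 inc) %)) (range 1000))
@cnt3
;; 1000
```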

The type of laziness in eduction and sequence is called “chunked”. They are lazy, but not maximally lazy: they allow some consumption of the input sequence ahead of the actual position in the iteration. New items are processed in chunks of 32 elements: once the 32nd element is reached, the next chunk up to the 64th is processed, and so on. In our example, we consume 10 elements but process the whole first chunk of 32 (the counters show 33 because one extra element is realized beyond the first chunk).
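The same 32-element chunking can be observed on ordinary lazy sequences, independently of transducers; realizing just the first element of a mapped range processes a whole chunk:

```clojure
(def chunk-cnt (atom 0))
;; range produces a chunked seq, so map realizes 32 elements at a time.
(first (map #(do (swap! chunk-cnt inc) %) (range 100)))
@chunk-cnt
;; 32
```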

So, what’s the difference between eduction and sequence? eduction allows a variable number of transducers to be passed as arguments without comp. More importantly, eduction does not cache results, which means it potentially runs all the transducers again if the result is iterated more than once. Here’s a way to demonstrate this behaviour:

(def cnt1 (atom 0))
(let [res (eduction (map #(do (swap! cnt1 inc) %)) (range 10))]
  (conj (rest res) (first res))
  @cnt1)
;; 20


(def cnt2 (atom 0))
(let [res (sequence (map #(do (swap! cnt2 inc) %)) (range 10))]
  (conj (rest res) (first res))
  @cnt2)
;; 10

In the new examples, we use both first and rest on the results of eduction and sequence, respectively. You can see that first and rest each force eduction to scan the input sequence, as demonstrated by the counter showing “20”, twice the number of items in the input sequence. sequence caches results internally, so it doesn’t execute the transducers again. The eduction approach has benefits in terms of memory allocation at the price of multiple evaluations. A rule of thumb for picking between the two:

  • Use sequence when you plan to use the produced output multiple times; for example, assigning it to a local binding and then proceeding to further process it. Internal caching results in better overall performance. At the same time, it could consume more memory, as the entire sequence could be loaded in memory if the last element is requested.
  • Use eduction when there is no plan to perform multiple scans of the output, saving the cost of unnecessary caching. This is the case, for example, with transducer chains that depend on some user-generated search parameters, where the application needs to produce a one-off view of the system that is discarded as soon as the response is returned.

Parallelism

We have seen an elegant example of parallelism in transducers using core.async pipelines. There is also another option to parallelise transducible operations. fold is a function from the reducers namespace in the standard library. fold offers parallelism based on the “divide and conquer” model: chunks of work are created and computation happens in parallel while, at the same time, finished tasks are combined back into the final result.

The following example shows ptransduce, a function that works like a parallel transduce. Since reducers are based on the same functional abstraction, we can leverage fold without changing the transducer chain:

(require '[clojure.core.reducers :refer [fold]])


(def xform (comp (map inc) (filter odd?)))


(defn ptransduce [xform rf combinef coll]
  (fold
    combinef
    (xform rf)
    (into [] coll)))


(ptransduce xform + + (range 1000000))
;; 250000000000

Note that the input collection needs to be a vector (or a map) for reducers to work in parallel. It also needs to be bigger than a certain size (512 items by default) to enable parallelism. Apart from this, the xform transducer chain is the same as before, but we need to call it as (xform rf) because fold expects a reducing function, which is what xform returns when invoked with a basic reducing function like +.
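fold also accepts an explicit chunk size as an optional first argument, which controls how the input vector is split for parallel processing. A sketch using only the stateless (map inc) part of the chain:

```clojure
(require '[clojure.core.reducers :refer [fold]])

;; 1024 is the chunk size: sub-vectors at or below this size are reduced
;; sequentially; larger ones are split and the halves combined with +.
(fold 1024 + ((map inc) +) (into [] (range 1000)))
;; 500500
```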

Both pipeline and fold parallelism work, unless stateful transducers are involved. Observe the following:

(require '[clojure.core.async :refer [<!!] :as a])
(require '[clojure.core.reducers :refer [fold]])
(def xform (comp (drop 100) (map inc) (filter odd?)))


(transduce xform + (range 10000))
;; 24997500


(let [out (a/chan (a/buffer 100))]
  (a/pipeline 4 out xform (a/to-chan (range 10000)))
  (<!! (a/reduce + 0 out)))
;; 0


(distinct
  (for [_ (range 100)]
    (fold +
      (xform +)
      (into [] (range 10000)))))
;; (24997500 24877997 24781126 24912310 ....

(drop 100) is now part of the transducer chain. We can see the expected result of 24997500 with a normal transduce, but both pipeline and fold produce unexpected results. This is because there are multiple instances of the same stateful transducer.

In the case of pipeline, one transducer chain is instantiated for each item (so drop is always dropping the single element available). In the case of fold, there is one transducer chain per chunk, and the situation is further complicated by “work stealing”, a feature in which threads that are currently not doing any work can “steal” items from other threads. When that happens, the stateful transducer is not initialised again, with the result that the same state is suddenly shared across threads. That’s why, to see the inconsistency, you need to run fold multiple times.

The problem of parallelism with stateful transducers is both technical and semantic. The drop operation, for instance, is well defined as a sequential operation: the requested number of elements is skipped at the beginning of the collection. But this definition needs to be reformulated when parallel chunks are involved: should drop remove the same number of elements from each chunk? In that case parallel drop would diverge from sequential drop.

Since a solution that produces consistent results does not exist in the standard library, the problem of parallel stateful transducers remains unsolved and they should be avoided with fold or pipeline. If you’re interested, I provided a solution to this problem in the parallel library. The proposed solution is opinionated, but it provides a consistent model to execute stateful transducers in a parallel context.

Conclusions

As we’ve seen in the previous part of this article, transducers are, at their core, a simple but powerful functional abstraction. reduce encapsulates a pattern for recursion that can be adapted to many situations and, at the same time, it promotes the design of code in terms of a standard reducing interface. Since transducers’ building blocks conform to the same contract, they are easy to compose and reuse. Transducers were introduced late in Clojure, perhaps explaining some struggle in their initial adoption.

There are still some rough edges and room for improvement in transducers as they are today. A few libraries have started to emerge to provide transducer versions of other functions (most notably, Christophe Grand’s xforms), hinting at the fact that more could be added to the standard library. Transducers are also amenable to parallel computation, but there is no solid semantics for parallel stateful transducers, so they can’t be used with fold. This somewhat discourages their parallel use as a whole.

On the positive side, transducers already cover a fair amount of common use cases and you should consider reaching for them as the default for everyday programming.

Clojure The Essential Reference

Did you enjoy reading this article? You might find my book Clojure: The Essential Reference also interesting! The book has an entire chapter dedicated to all the functions related to both reducers and transducers. There you can find more examples and insights.

Resources

Clojure transducers from the ground-up: the essence.

This is the first part of an article dedicated to Clojure transducers. I initially wrote this article for uSwitch Labs (my former employer in 2017) but the articles were later removed. This first part illustrates the functional foundations of transducers. The second part contains practical examples of their use in real-life scenarios.

Introduction

Transducers were introduced in Clojure 1.7 (at the end of 2014) and they never got the attention they deserved. The author of Clojure, Rich Hickey, recently stated in his A History of Clojure paper:

I think transducers are a fundamental primitive that decouples critical logic from list/sequence processing and construction, and if I had Clojure to do all over I would put them at the bottom.

Maybe because of some confusion with reducers (a similar Clojure feature which focuses on parallelism), or because not all functions in the standard library are transducer-aware, many Clojure programmers are still reluctant to use them extensively. Transducers are still relegated to advanced scenarios, but there are compelling reasons to use them more often, for example to replace some common cases of sequential processing.

In this article I’m going to show you that transducers are essentially a functional abstraction (similar to a combination of object-oriented patterns). They can be derived with a few refactoring moves on top of existing collection-processing functions. The fact that Clojure offers them out of the box removes any excuse not to start using them today!

Same function, different implementations

map and filter are very common operations in stream-oriented programming (there are many more in the Clojure standard library, and the following examples apply to most of them as well). Here’s a simplified version of how map and filter are implemented in Clojure:

(defn map [f coll]
  (when (not= '() coll)
    (conj
      (map f (rest coll))
      (f (first coll)))))

(defn filter [pred coll]
  (when (not= '() coll)
    (let [f (first coll) r (rest coll)]
      (if (pred f)
        (conj (filter pred r) f)
        (filter pred r)))))

map and filter clearly share some common traits in terms of iterating the input, building the output, the recursion mechanism and the actual “essence” of the operation:

  • The access mechanism to the input collection (first, rest, the empty list '() are all specific to the Clojure sequential interface).
  • Building the output (conj is used to put elements in the final list, but something else could be used).
  • The recursion mechanism is used to consume the input (note that this is a stack-consuming, non-tail-recursive loop).
  • The “essence” of the operation itself, which is the way “mapping” or “filtering” works (filter requires a conditional for example).

There are similar operations for data pipelines in other Clojure libraries. For example core.async is a library inspired by CSP (Communicating Sequential Processes) in which processes exchange information using channels. A common case for the sender is to apply transformations to the outgoing messages, including operations like map, filter and many others. Let’s have a look at how they could be implemented in core.async (this is a simplified version of the now deprecated ones that appeared in the initial implementation):

(defn map [f in out]
 (go-loop []
  (let [val (<! in)]
   (if (nil? val)
    (close! out)
    (do (doseq [v (f val)]
         (>! out v))
     (when-not (impl/closed? out)
      (recur)))))))

(defn filter [pred ch]
 (let [out (chan)]
  (go-loop []
    (let [val (<! ch)]
    (if (nil? val)
     (close! out)
     (do (when (pred val)
          (>! out val))
      (recur)))))
  out))

Again, there are similarities and common traits:

  • The access mechanism uses core.async primitives (<! and >!) to read and write to channels.
  • The recursion mechanism is implemented by the go-loop macro and related recur instruction.
  • The “essence” of the operation itself is the same as before: map consists of applying “f” to each value and filter uses a predicate on each value in a when condition.

We are going to see one last example, inspired by another library: the Clojure Reactive extensions. RxClojure is a library implementing the Clojure bindings for RxJava. Reactive programming is a push-based model based on streams: events (called “observables”) are collected and routed to components that “react” to compose behaviour. How could map or filter be implemented in this case? The following are not in RxClojure, as the library just calls into the corresponding Java versions. But if we had to implement them in Clojure, they would probably look something like this:

(defn map [f xs]
 (let [op (operator*
           (fn [o]
            (subscriber o
             (fn [o v]
              (catch-error-value o v
               (on-next o (f v)))))))]
  (lift op xs)))

(defn filter [pred xs]
 (let [op (operator*
           (fn [o]
            (subscriber o
             (fn [o v]
              (catch-error-value o v
               (when (pred v)
                (on-next o v)))))))]
  (lift op xs)))

We start to see a pattern emerging. Once again we can distinguish between:

  • The access mechanism uses lift to iterate through the incoming sequence “xs” in conjunction with on-next inside the operator implementation.
  • Building the output is not explicit as before. Events are consumed downstream without accumulating.
  • The recursion mechanism is implicit. Somewhere else in the code a loop is happening, but it’s not exposed as part of the main API.
  • The “essence” of the operation is the same as before: map consists of (f v) for each value and filter uses a when condition.

Do we need to repeat variants of the same implementation over and over? Is there a better way?

Combinatorial Explosion

By looking at the three implementations of map and filter above, we learned that the essence of the operation and some form of iteration are general aspects. Accessing the input and building the output depend on the specific transport. We just looked at map and filter, but the same separation of concerns applies to other sequential processing functions, for example:

mapcat, remove, take, take-while  
take-nth, drop, drop-while, replace  
partition-by, partition-all, keep, keep-indexed  
map-indexed, distinct, interpose, dedupe, random-sample
[...]

The list above should also include any custom functions that you might need beyond what’s offered by the Clojure standard library.

The dilemma is: how can we deal with the ensuing combinatorial explosion? Are we doomed to implement the same functions with slight variations for each new type of transport/collection? Could we just write map once and use it everywhere? Transducers are the solution to this problem (and much more).

An exercise in refactoring

To enable reuse of the general aspects of sequential processing, we need to isolate the “essence” of map or filter (or other functions from the list above) and provide a way to run them in a transport-independent fashion. If we succeed, we’ll have a recipe to build processing pipelines that can be reused in different contexts.

It turns out that reduce, a well known operation in functional programming, is the key to achieving this goal. It is not a coincidence that Graham Hutton dedicated an entire paper to The universality and expressiveness of fold (fold is another name for reduce). reduce is very general because it encapsulates the prototypical tail-recursive loop. Have a look, for example, at the following reduce-based “sum of all numbers in a list”:

(defn reduce [f result coll]
  (if (not= '() coll)
    (reduce f (f result (first coll)) (rest coll))
    result))

(reduce + 0 (range 10))

Here reduce accumulates the result explicitly as one of the parameters. This form of recursion is also called “iterative” and, once transformed into a Clojure loop-recur, it doesn’t consume the stack. The other interesting fact about reduce is that it decouples the iteration mechanism from the transformation semantics, which is part of our plan.
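To make the “iterative” point concrete, here is the same function written with loop-recur. This is a sketch, and reduce* is a hypothetical name chosen to avoid shadowing the definition above:

```clojure
;; Stack-safe reduce: recur compiles to a jump back to the loop head,
;; so no new stack frame is created per element.
(defn reduce* [f result coll]
  (loop [acc result
         xs coll]
    (if (seq xs)
      (recur (f acc (first xs)) (rest xs))
      acc)))

(reduce* + 0 (range 10))
;; 45
```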

map and filter (as well as many other recursive algorithms) can be rewritten “reduce style”. The fact that a stack-consuming algorithm can be rewritten as iterative is a well known property in the theory of computation. By rewriting map and filter (and possibly other sequential functions) as iterative, we are offered the possibility to extract the “essence” of the operation:

;; refactoring step 1: iterative recursion style.

(defn map [f result coll]
  (if (not= '() coll)
    (map f (f result (first coll)) (rest coll))
    result))

(map (fn [result el] (conj result (inc el))) [] (range 10))

(defn filter [f result coll]
  (if (not= '() coll)
    (filter f (f result (first coll)) (rest coll))
    result))

(filter (fn [result el] (if (odd? el) (conj result el) result)) [] (range 10))

“f” is now passed as part of the parameters in our new implementations. If you look carefully, the two functions map and filter are now identical (except for the name). Invoking them requires a more sophisticated “f” function taking two arguments: the result so far (also called accumulator) and the next element to process.

One big plus after this change is that the essence of filtering (or mapping) is now isolated from recursion and input iteration. It is not yet isolated from the way the output is built (conj in both cases) or from the actual transformation (inc and odd? respectively). But let’s take baby steps and do some renaming: map and filter can be renamed reduce, because that’s what they are now. Second, we can extract two new functions called “mapping” for map and “filtering” for filter:

;; refactoring step 2: rename and reuse.

(defn reduce [f result coll]
  (if (not= '() coll)
    (reduce f (f result (first coll)) (rest coll))
    result))

(defn mapping [result el]
  (conj result (inc el)))

(reduce mapping [] (range 10))

(defn filtering [result el]
  (if (odd? el)
    (conj result el)
    result))

(reduce filtering [] (range 10))

reduce encapsulates the iteration and the sequential access mechanism. But there is still a problem with “mapping” and “filtering”: if we wanted to use them on a core.async channel, for instance, we’d need to abstract conj away (because conj doesn’t work on channels). We can’t modify the “mapping” or “filtering” interface, because it is part of the reduce contract. But we can add a parameter “rf” (for Reducing Function) in a wrapping lambda and return another function of two parameters:

;; refactoring step 3: extract output construction parameter.

(defn reduce [f result coll]
  (if (not= '() coll)
    (reduce f (f result (first coll)) (rest coll))
    result))

(defn mapping [rf]
  (fn [result el]
    (rf result (inc el))))

(reduce (mapping conj) [] (range 10))

(defn filtering [rf]
  (fn [result el]
    (if (odd? el)
      (rf result el)
      result)))

(reduce (filtering conj) [] (range 10))

We also need to extract inc and odd?, which are just example functions and should be passed in generically as parameters. Again, we don’t want to alter the two-argument interface required by reduce, so we use another wrapping function and introduce the new parameter “f” (or “pred” for filter):

;; refactoring step 4: extract transforming and predicate functions.

(defn mapping [f]
  (fn [rf]
    (fn [result el]
      (rf result (f el)))))

(reduce ((mapping inc) conj) [] (range 10))

(defn filtering [pred]
  (fn [rf]
    (fn [result el]
      (if (pred el)
        (rf result el)
        result))))

(reduce ((filtering odd?) conj) [] (range 10))

Finally, let’s rename the relevant functions back to map and filter (because this is what they are after all):

;; refactoring step 5: final clean-up.

(defn map [f]
  (fn [rf]
    (fn [result el]
      (rf result (f el)))))

(defn filter [pred]
  (fn [rf]
    (fn [result el]
      (if (pred el)
        (rf result el)
        result))))

This is exactly how the single-arity versions of clojure.core/map and clojure.core/filter appear in the Clojure standard library (modulo some complexity related to multiple sequence arguments in map).

Along with the enriched versions of many sequential processing functions, Clojure 1.7 also introduced a new function called transduce that enables the use of map or filter without necessarily having to call reduce directly. This mainly improves readability:

(transduce (map inc) conj (range 10))
;; same as: (reduce ((map inc) conj) [] (range 10))

The standard library also provides transducer awareness in other places. The new versions of sequence and into, for example, remove the need for an explicit conj:

(sequence (map inc) (range 10))
(into [] (map inc) (range 10))

conj is not explicit because the reducing function can be inferred from the specific call to sequence (because we want to build a sequence) or into [] (because we want to build a vector). Now that we have the basic recipe, it’s time to put the new constructs into practice and see how they can be used in our daily programming.
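For instance, a composed transducer can be handed unchanged to transduce, into or sequence. The following sketch uses the real clojure.core/map and clojure.core/filter, which also implement the completing arities that the simplified versions derived above omit:

```clojure
;; One transformation pipeline, reused with three different "transports".
(def xform (comp (map inc) (filter odd?)))

(transduce xform + 0 (range 10))
;; 25

(into [] xform (range 10))
;; [1 3 5 7 9]

(sequence xform (range 10))
;; (1 3 5 7 9)
```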

Conclusions

The article shows that transducers are built on top of a simple functional abstraction and there is nothing magic happening under the hood. Apart from the interesting refactoring exercise, transducers have deeper consequences in terms of reusability, composability and performance that we are going to explore in the second part of this article.

Clojure The Essential Reference

Did you enjoy reading this article? You might find my book Clojure: The Essential Reference also interesting! The book has an entire chapter dedicated to all the functions related to both reducers and transducers. There you can find more examples and insights.

Exploring the memoize function

*In this article we will explore some concrete examples of the many uses and intricacies of the memoize function from the Clojure standard library. If you enjoy this article, you might very well enjoy the Clojure Standard Library book: save 37% off the book with code fccborgatti.*

memoize is a function in the Clojure standard library that adds caching capabilities to an existing function, using the invocation arguments as the key. When the wrapped function is invoked with the same list of arguments, the result is returned immediately from the cache without any additional computation. The effects of memoize are readily visible if we print a message from the wrapped function:

(defn- f* [a b]
  (println (format "Cache miss for [%s %s]" a b))
  (+ a b))

(def f (memoize f*))

(f 1 2)
;; Cache miss for [1 2]
;; 3

(f 1 2)
;; 3

(f 1 3)
;; Cache miss for [1 3]
;; 4

An invocation with new arguments generates the message, while following invocations with the same arguments don’t, confirming that the wrapped function f* is not invoked again. Conventionally, if the wrapped function is not meant to be used directly, the star * character gets added at the end of its name, while the “memoized” version is named without one.

“Memoize” invocation contract

The formal specification of the memoize function signature is as follows:

INPUT

  • “f” needs to be a function and is mandatory, so (fn? f) should return true. You can still provide memoize with a non-invokable object but the resulting function would be useless and throw an exception.

NOTABLE EXCEPTIONS

  • ClassCastException if “f” is not invokable.

OUTPUT

  • A new function of a variable number of arguments. memoize will eventually delegate the call to the wrapped function, so the number of arguments passed to the generated function needs to be compatible.
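To make the contract concrete, here is a minimal sketch of how such a wrapper can be built. It is simplified but close in spirit to the actual clojure.core/memoize implementation; memoize* is a hypothetical name chosen to avoid shadowing:

```clojure
(defn memoize* [f]
  (let [mem (atom {})]               ; cache: argument list -> result
    (fn [& args]
      (if-let [e (find @mem args)]   ; find returns the map entry, or nil
        (val e)                      ; cache hit: skip calling f
        (let [ret (apply f args)]    ; cache miss: compute and store
          (swap! mem assoc args ret)
          ret)))))

((memoize* +) 1 2)
;; 3
```

Note how the returned function is variadic ([& args]): memoize delegates whatever arguments it receives to the wrapped function, which is why the arities must be compatible.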

Examples

memoize works well for non-trivial computations that accept and return values with a relatively small memory footprint. The following example illustrates this point. The Levenshtein distance (this Wikipedia article contains a good introduction to the Levenshtein Distance algorithm) is a simple metric to measure how different two strings are. Levenshtein distance can be used, for example, to suggest corrections for the most common spelling mistakes. The Levenshtein distance is straightforward to implement, but becomes computationally intensive for longer strings (above 10 characters). We could use memoize to save us from computing the distance of the same pair of strings over and over again. The input (the string arguments) and the output (a small integer) are relatively small in size, so we can cache a large number of them without exhausting memory (assuming the set of words with which the function is invoked is finite and can be estimated).

To feed our example we are going to use the dictionary available on Unix systems at “/usr/share/dict/words”, which is a plain text list of words. If we were asked to implement an auto-correction service, it could work as follows:

  1. The user inputs a misspelled word
  2. The system checks the distance of the word against the words in the dictionary
  3. The results are returned back in order of smaller distance

Although, in our example, we are just implementing the essentials, concentrating mainly on the application of memoize, we are also going to pre-compute a dictionary of words starting with the same initials as the misspelled word, a technique that further speeds up the Levenshtein distance calculation:

(defn levenshtein* [[c1 & rest1 :as str1]         ; 1.
                    [c2 & rest2 :as str2]]
  (let [len1 (count str1)
        len2 (count str2)]
    (cond (zero? len1) len2
          (zero? len2) len1
          :else
          (min (inc (levenshtein* rest1 str2))
               (inc (levenshtein* str1 rest2))
               (+ (if (= c1 c2) 0 1) (levenshtein* rest1 rest2))))))

(def levenshtein (memoize levenshtein*))          ; 2.

(defn to-words [txt init]                         ; 3.
  (->> txt slurp clojure.string/split-lines
       (filter #(.startsWith % init))
       (remove #(> (count %) 8))
       doall))

(defn best [misp dict]                            ; 4.
  (->> dict
       (map #(-> [% (levenshtein misp %)]))
       (sort-by last)
       (take 3)))

(defn dict [init]
  (to-words "/usr/share/dict/words" init))

(def dict-ac (dict "ac"))                         ; 5.

(time (best "achive" dict-ac))
;; "Elapsed time: 4671.226198 msecs"              ; 6.
(["achieve" 1] ["achime" 1] ["active" 1])

(time (best "achive" dict-ac))
;; "Elapsed time: 0.854094 msecs"                 ; 7.
(["achieve" 1] ["achime" 1] ["active" 1])

  1. The Levenshtein algorithm presented here is a variation of the many available on-line. The important aspect to remember is that it grows roughly as O(n*m) where m and n are the lengths of the strings, which is O(n^2) in the worst scenario.
  2. This def actually builds the wrapping function through memoize, conveniently called levenshtein without the final * that is reserved for the non-memoized version.
  3. to-words is a helper function to prepare the dictionary filtered by the initial string. to-words is part of the “static” or “learning” phase of the algorithm, since we can prepare words by initial off-line and store them for later use.
  4. The best function is responsible for the application of the levenshtein memoized function to the words in the dictionary. It then sorts the results with sort-by and returns the lowest distances.
  5. The def invocation defines a filtered dictionary starting by “ac” so it doesn’t need to be computed multiple times. This also prevents the time function from reporting on the time needed to read and process the file.
  6. The first invocation to search the best matches for the misspelled word returns in almost 5 seconds.
  7. The second invocation returns much faster.

The memoized version of the Levenshtein distance function stores each new pair of strings as a key and the returned distance as the value of an internal map. Each time the function is invoked with the same arguments, the return value is looked up in the map. Comparing long strings benefits from caching intermediate results, and the elapsed time confirms the theory.

This example also shows the way the memoized Levenshtein distance is “trained” before actual use. The application could pre-compute the set of dictionaries by initials (similar to the indexing happening inside a database). This technique contributes to the speed-up seen in our Levenshtein implementation, but consider that there are also other algorithms out-performing Levenshtein (See the list of metrics available on Wikipedia).

What’s in a name: memoize?

There is a reason why storing arguments and return values is called “memoization” instead of just “caching”. Memoization is more specific because it implies two features normally present in functional languages: pure and higher order functions.

Pure functions The wrapped function needs to be referentially transparent. If there was a way for the output to be influenced by factors other than the function arguments, then cached results could be different depending on the context. The cache would then need to be aware of the context and use it as part of the key. Memoization becomes straightforward in functional languages supporting referential transparency.

Higher order functions “Higher order” refers to the property of a function to be treated as a value. As such, the function can be stored, passed to other functions or returned. Not all languages offer higher order functions, although it is now more common to offer this feature. By describing this kind of caching as “memoization” it is implied that a function can be transparently decorated with caching capabilities. “Transparently” in this context means that the original wrapped function remains untouched.

See Also

These are other functions in the Clojure standard library that have a similar or related use to memoize.

  • lazy-seq creates a “thunk” (a wrapper function around a value) that evaluates its content on the first access and returns a cached version on following calls. When the thunks are joined together in a sequence they form a lazy sequence. Lazy sequences are comparable to a cache where the order and value of the keys is predetermined. An “evaluate once” semantic on collections can be achieved with lazy-seq. Since all Clojure sequences are lazy, you might already be using a “cached data structure” without knowing it.
  • atom creates a Clojure Atom, one of the possible Clojure reference types. Atoms are used by memoize to store intermediate results. Use an atom directly if the memoize implementation is too restrictive for the kind of caching you need to implement. You could, for example, look into something different from a Clojure hash-map to store items, like a mutable Java map with soft references (there are several examples of the use of SoftReference for caching in Java; this is a good starting point). Keep in mind that there are already libraries like core.cache providing common caching strategies, if that is what you’re looking for.

Memoize Performance and Implementation Details

  • O(1) steps (function generation)
  • O(n log n) steps (generated function), n number of unique keys
  • O(n) space (generated function), n number of unique keys

The main aspect to consider about memoize is that it stores cached items indefinitely. Constant accumulation of new cached values will eventually exhaust memory. memoize users should pay attention to this fact when designing their solution, more specifically to the expected distribution of keys in the cache. memoize should not be used in long-running services when the number of argument permutations is potentially infinite or hard to predict.
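As a crude illustration of one way to bound the cache, the following hypothetical memoize-bounded simply flushes everything once a size threshold is passed. Real applications should prefer proper eviction strategies (LRU, TTL) such as those offered by libraries like core.cache:

```clojure
;; Sketch only: memoize-bounded is not part of the standard library.
(defn memoize-bounded [f limit]
  (let [mem (atom {})]
    (fn [& args]
      (if-let [e (find @mem args)]
        (val e)
        (let [ret (apply f args)]
          (when (>= (count @mem) limit)
            (reset! mem {}))          ; crude eviction: drop the whole cache
          (swap! mem assoc args ret)
          ret)))))
```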

We can gather some statistics about the key distribution with a few changes to the original memoize function. The following memoize2 contains additional atoms to collect data about cache hits, misses and the total number of calls at run-time.

(defn memoize2 [f]
  (let [mem (atom {}) ; 1.
        hits (atom 0)
        miss (atom 0)
        calls (atom 0)]
    (fn [& args]
      (if (identical? :done (first args)) ; 2.
        (let [count-chars (reduce + (map count (flatten (keys @mem))))]
          {:calls @calls :hits @hits :misses @miss
           :count-chars count-chars
           :bytes (* (int (/ (+ (* count-chars 2) 45) 8)) 8)}) ; 3.
        (do (swap! calls inc) ; 4.
            (if-let [e (find @mem args)]
              (do (swap! hits inc) (val e))
              (let [ret (apply f args)]
                (swap! miss inc)
                (swap! mem assoc args ret)
                ret)))))))

  1. Along with the actual cache, additional counters are added to the initial let block.
  2. :done is a sentinel value that can be used to extract statistics during run-time.
  3. This is an estimate of the amount of memory necessary to store the keys given the number of chars (a good enough formula to estimate the amount of memory necessary to store strings in Java can be found at http://www.javamex.com/tutorials/memory/string_memory_usage.shtml).
  4. Additional swap! operations are performed to update counters.

By accessing the additional stats at run-time, we can estimate the key-space size or the memory footprint. For example, if we run the previous Levenshtein memoize example replacing memoize with memoize2 we can extract the following results:

(def levenshtein (memoize2 levenshtein*))

(best "achive" dict-ac)
(["achieve" 1] ["achime" 1] ["active" 1])

(levenshtein :done)
{:calls 400, :hits 0, :misses 400, :count-chars 5168, :bytes 10376}

(best "achive" dict-ac)
(["achieve" 1] ["achime" 1] ["active" 1])

(levenshtein :done)
{:calls 800, :hits 400, :misses 400, :count-chars 5168, :bytes 10376}

As you can see, the first time the best function is invoked it generates 400 misses, while the second time it results in all hits. We can also estimate the memory taken by the strings stored in the cache, which is around 10KB.

The second aspect to consider when using memoize is the additional hash-map assoc operation and atom swap! that are performed for each new key combination presented as input. The hash-map adds O(log n) steps to store each new key, while the atom could under-perform under heavy thread contention. Depending on the application requirements, memoize could be built on top of a transient data structure to avoid the performance penalty of filling the cache. Another option to consider, when possible, is “warming the cache”: while the application is still not serving live traffic, the cache could be populated artificially with the most common keys.
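Warming can be as simple as invoking the memoized function once for each expected input before going live. A sketch, where warm! and common-args are hypothetical names:

```clojure
(defn warm! [memoized-f common-args]
  ;; call once per expected argument vector, discarding the results
  (doseq [args common-args]
    (apply memoized-f args)))

;; e.g. with the earlier Levenshtein example:
;; (warm! levenshtein (for [w dict-ac] ["achive" w]))
```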

Now that you know some of the ins and outs of using memoize, perhaps you’re interested in learning more about the Clojure standard library. If so, download the free first chapter of Clojure Standard Library and see this Slideshare presentation for more details.

March 2017 Book Updates

For a quick introduction to the content of the book (and a good 42% discount code) check-out the following slides at slideshare.

I started writing the Clojure Standard Library book 1.5 years ago and time is really flying. Only a few chapters of the 30+ expected are out after all this time, and you might be wondering why. There are important factors to consider:

  • I’m using my spare time to write it, which amounts to something like 20 hours per week when everything goes perfectly. I throw in occasional holidays as well. It’s not a lot, but it’s consistent.
  • It’s a monumental work. Each function is a small essay that requires anything between 15 and 40 hours to complete. There are 700 or so of them; you do the math. The final manuscript is projected at well over 1000 pages.

I recently completed another couple of chapters that are going through the reviewing pipeline at the moment and will be released in the following weeks. I’m also trying to give a sense of the content of the book through the Clojure Pills screencast.

Writing the book is such a pleasure that I’m not really concerned if it takes a long time to finish. The care the Clojure core team takes to maintain backward compatibility is making my life easier in this case. Sure there will be changes (like the latest “spec” introduction in 1.9) but that’s the only change of that size I can think of since Clojure 1.3. At the same time (as my publisher politely reminds me) I’m also aware that I should converge to a conclusion in a relatively short time! :D

For this reason I started searching for contributors and possibly co-authors. The book project is fully parallelizable: I can work on one chapter while somebody else contributes another (or even individual functions). Since I know you’ll ask: it’s written in asciidoc on top of git. The differences between contributing and co-authoring are the following:

  • Contributing: less formal and smaller in scope, from a few functions to an entire chapter. It puts your name in the book’s list of contributors and on the chapters you write. Anyone interested will be asked to contribute first, to get a sense of the effort, challenges and pleasure of writing this book.
  • Authoring: a co-author will sign a formal contract, split the revenue from selling the book with me (don’t expect to become rich tho) and is expected to make a consistent contribution, in the order of at least 4-5 chapters.

I’m happy to announce that a new co-author is about to join me on the cover of the book, but I’ll wait to announce the name until that is formalized. This is great news, as you’ll see a speed-up in releasing chapters and a general increase in content quality.

If you’re also interested in contributing to this amazing book, please get in touch! r e b o r g @ r e b o r g . n e t

New: Clojure Pills Screencast


I’ve recently started a weekly series of screencasts called the “Clojure Pills”. Each episode is dedicated to a function or macro from the Clojure standard library and is between 15 and 20 minutes long. The following episodes are available at the moment:

The idea with the Clojure Pills is to give developers an idea of the content of the standard library, how it can be used and what kind of performance to expect. Any level of experience is welcome, although the screencast doesn’t teach Clojure from scratch and you might need at least some basic understanding of the language to be able to follow.

I’m partially using the content of my book about Clojure Standard Library in the screencast, distilled down to a digestible “pill” format. Longer examples aren’t that practical to show in the screencasts along with many other things like comparison tables and other details.

I hope I can keep up with the demanding schedule but for now all seems to be going okay. For any questions or feedback please leave a comment below.

Knowing Your Tools

This article is excerpted from chapter 1 of the “Clojure Standard Library, An Annotated reference”, the Manning book I’m currently working on.


; TLDR

The post shows you why it’s important to dedicate time to learn the content of the Clojure standard library. The standard library contains loads of interesting functions and knowing them often leads to a better functional design. An example is given to show how a simple program can be improved by learning what you have available.

The book is available from Manning website along with a few sample chapters.

Tools

Software development is often compared to a craft, despite the fact that it’s predominantly an intellectual activity. While software development is abstract in nature, there are many craft-oriented aspects to it:

  • The keyboard requires time and dedication to operate correctly. There are endless discussions on the best keyboard layout for programmers, for example to speed up typing (Dvorak users often claim huge benefits compared to QWERTY users; here’s one comparison, including other kinds of layouts: http://lifehacker.com/should-i-use-an-alternative-keyboard-layout-like-dvorak-1447772004).
  • The development environment is a key aspect of programmer productivity and another source of debate (almost reaching a religious connotation). Mastering a development environment often translates into learning useful key combinations and ways to customize the most common operations.
  • Libraries, tools and idioms surrounding the language. Almost everything above the pure syntax rules.
  • Proficiency in several programming languages is definitely a plus in the job marketplace and the way to achieve it is by practicing them on a regular basis including getting familiar with APIs and libraries the language offers.
  • Many other aspects require specific skills depending on the area of application: teaching, presenting or leadership.

The focus on mastering programming skills is so important that it became one of the key objectives of the Software Craftsmanship Movement. Software Craftsmanship advocates learning through practice and promotes an apprenticeship process similar to other professions.

The standard library is definitely one of the most important tools to master a language. One aspect that characterizes the standard library is the fact that it is already packaged with a language when you first experiment with it. Interestingly, it doesn’t get the amount of attention you would expect for such an easy to reach tool.

Why should I care about the Standard Library?

The expressiveness of a language is often described as the speed at which ideas can be translated into working software. Part of the expressiveness comes from the language itself in terms of syntax, but another fundamental part comes from the standard library, which is usually provided out of the box. A good standard library liberates the programmer from the most mundane tasks like connecting to data sources, parsing XML, dealing with numbers and a lot more. When the standard library does a good job, developers are free to concentrate on the core business aspects of an application, boosting productivity and return on investment.

Consider also that a deep knowledge of the standard library is often what distinguishes the average developer from the expert. The expert can solve problems more elegantly and faster than the beginner because, apart from having solved the same problem before, they can compose a complex solution by pulling small pieces together from the standard library.

Finally, the standard library contains solutions to common programming problems that have been battle-tested over generations of previous applications. This is certainly the case for Clojure. The robustness and reliability that come with that kind of stress are difficult to achieve otherwise. There will possibly be just a handful of cases where something in the standard library won’t fit your needs and will need to be re-implemented.

What’s inside the standard library?

The Clojure standard library is quite comprehensive and can be divided roughly into 3 parts:

  1. What is commonly referred as “core”, the content of the single namespace clojure.core. Core contains the functions that have evolved to be the main public API for the language, including basic math operators, functions to create and manipulate other functions, conditionals. Core currently contains around 700 definitions between functions and macros. Functions in core are always available without any explicit reference from any namespace.
  2. Namespaces other than core that are shipped as part of the Clojure installation. These are usually prefixed with clojure followed by a descriptive name, like clojure.test, clojure.zip or clojure.string. Functions in these namespaces are sometimes available just by prefixing their namespace (like clojure.string/upper-case), but in other cases they need to be imported into the current namespace using require (this is due to the fact that, while bootstrapping, Clojure already imports several namespaces that are then automatically available to the end user. Very popular tools like nREPL or CIDER also load libraries while bootstrapping, which are then available at the prompt. It is good practice to always explicitly require what a namespace uses).
  3. Finally the content of the Java SDK which is easily available as part of Clojure Java interoperability features.

The standard library content can be roughly categorized by looking at the major features Clojure introduces and by the most common programming tasks. There are, for example, big groups of functions dedicated to Software Transactional Memory, concurrency and persistent collections. Of course Clojure also adds all the necessary support for common tasks like IO, sequence processing, math operations, XML, strings and many others. Apparently missing from the Clojure standard library are solutions already provided by the Java SDK, for example cryptography, low-level networking, HTTP, 2D graphics and so on. For all practical purposes those features are not missing, but usable as they are from Java without the need to re-write them in Clojure. Java interoperability is one of the big strengths of Clojure, opening the possibility to easily use the Java SDK (Software Development Kit) from a Clojure program. Here’s a broad categorization:

  • Core support namespaces integrate core with additional functionalities on top of those already present. clojure.string is possibly the best example. Core already contains str, but any other useful string functionality has been moved out into the clojure.string namespace. clojure.template contains a few helpers for macro creation. clojure.set is about the “set” data structure. clojure.pprint contains formatters for almost all Clojure data types so they can print in a nice, human-readable form. Finally, clojure.stacktrace contains functions for Java exception manipulation and formatting.
  • REPL namespaces contain functionality dedicated to the REPL, the read-eval-print loop Clojure offers. clojure.main includes handling of the main entry point into the Clojure executable and part of the REPL functionality that was later split out into clojure.repl. The latest addition, clojure.core.server, implements the socket server functionality.
  • General support namespaces provide additional APIs beyond what core has to offer, enriching Clojure with new functionality. clojure.walk and clojure.zip, for example, are two ways to walk and manipulate tree-like data structures. clojure.xml offers XML parsing capabilities. clojure.test is the unit test framework included with Clojure. clojure.java.shell contains functions to “shell out” commands to the operating system. clojure.core.reducers offers a model of parallel computation.
  • Java interop namespaces are dedicated to Java interoperation beyond what core already offers. clojure.java.browse and clojure.java.javadoc offer the possibility of opening a native browser to display generic web pages or javadoc documentation respectively. clojure.reflect wraps the Java reflection APIs, offering an idiomatic Clojure layer on top of them. clojure.java.io offers a sane approach to java.io, removing the idiosyncrasies that made Java IO so confusing, like knowing the correct combination of constructors to transform a Stream into a Reader and vice versa. Finally, clojure.inspector offers a simple UI to navigate data structures.
  • Data serialization namespaces are about ways in which Clojure data can be encoded as strings for use as an exchange format. clojure.edn is the main entry point into EDN (https://github.com/edn-format/edn) serialization. clojure.data contains a single user-facing function, diff, to compute differences between data structures. clojure.instant defines the encoding of time-related types.
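A quick REPL tour gives a feel for a few of these namespaces; the calls below are standard library functions, shown here purely as illustration:

```clojure
(require '[clojure.string :as str]
         '[clojure.set :as set]
         '[clojure.walk :as walk])

;; clojure.string: richer string functions than core's str
(str/upper-case "clojure")            ; "CLOJURE"
(str/join ", " ["a" "b" "c"])         ; "a, b, c"

;; clojure.set: operations on the set data structure
(set/union #{1 2} #{2 3})             ; #{1 2 3}
(set/difference #{1 2 3} #{2})        ; #{1 3}

;; clojure.walk: transforming nested data structures
(walk/keywordize-keys {"a" 1 "b" {"c" 2}}) ; {:a 1, :b {:c 2}}
```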

Making Your Development Life Easier

The standard library is not just there to solve the usual recurring programming problems but also to offer elegant solutions to new development challenges. “Elegant” in this context translates to composable solutions that are easy to read and maintain. Let’s look at the following example.

Suppose you’re given the task of creating a report to display information on screen in a human-readable form. The information comes from an external system, and a library already takes care of that communication. All you know is that the input arrives structured as the following XML (here saved as a local balance var definition):

The balance needs to be displayed in a user-friendly way:

  1. Removing any unwanted symbols other than letters (like the colon at the beginning of each key)
  2. Separating the words (using uppercase letters as delimiters)
  3. Formatting the balance as a currency with 2 decimal digits.

You might be tempted to solve the problem like this:

  1. parse takes the XML input string and parses it into a hash-map containing just the necessary keys. parse also converts :currentBalance into a double.
  2. clean-key solves the problem of removing the “:” at the beginning of each attribute name. It checks the beginning of the attribute before removing potentially unwanted characters.
  3. separate-words takes care of searching for upper-case letters and prepending a space. reduce is used here to store the accumulation of changes so far while we read the original string as input. up-first was extracted as a handy helper to upper-case the first letter.
  4. format-decimals handles floating-point number formatting. It searches for digits with re-find and then either appends padding zeros or truncates the decimal digits.
  5. Finally, print-balance puts all the transformations together. Again reduce is used to create a new map of transformations while we read the original one. The reducing function was big enough to suggest an anonymous function in a letfn form. The core of the function is assoc-ing the newly formatted attribute and value into the new map to display.
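The chapter's full listing is not reproduced in these notes, but a minimal sketch of the reduce-based helpers described above could look like this (function names follow the description; the actual implementation may differ):

```clojure
;; Hypothetical sketch of the reduce-based approach described above.

(defn up-first [[head & tail]]
  ;; upper-case the first letter, keep the rest as-is
  (apply str (Character/toUpperCase head) tail))

(defn separate-words [s]
  ;; accumulate the output string, prepending a space
  ;; before each upper-case letter found in the input
  (reduce (fn [acc c]
            (if (Character/isUpperCase c)
              (str acc " " c)
              (str acc c)))
          "" s))

(defn format-decimals [d]
  ;; pad with zeros or truncate so there are exactly 2 decimal digits
  (let [[_ int-part dec-part] (re-find #"(\d+)\.(\d+)" (str d))]
    (cond
      (nil? dec-part)        (str d ".00")
      (< (count dec-part) 2) (str int-part "." dec-part "0")
      :else                  (str int-part "." (subs dec-part 0 2)))))

(separate-words "currentBalance") ; "current Balance"
(up-first "current balance")      ; "Current balance"
(format-decimals 100.7)           ; "100.70"
```

Note how both string helpers carry the "accumulate while reading the input" shape that reduce imposes, which is exactly the complexity the rewrite below removes.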

While relatively easy to read (the 3 formatting rules are somewhat separated into functions), the example makes minimal use of what the standard library has to offer. It contains map, reduce, apply and a few others, including XML parsing, which are of course important functions (and usually what beginners learn first). But there are definitely other functions in the standard library that would make the same code more concise and readable.

Let’s have a second look at the requirements to see if we can do a better job. The sources of complexity in the code above can be traced back to the following:

  • String processing: strings need to be analyzed and decomposed. The clojure.string namespace comes to mind, and possibly subs.
  • Hash-map related computations: both keys and values need specific processing. reduce is used because we want to gradually transform both the key and the value at the same time. But zipmap sounds like a viable alternative worth exploring.
  • Formatting rules for the final output: things like string padding of numerals or rounding of decimals. There is an interesting clojure.pprint/cl-format function that might come in handy.
  • Other details like nested forms and IO side effects. For the first, threading macros can be used to improve readability. Finally, macros like with-open remove the need for developers to remember to initialize the correct Java IO type and close it at the end.

By reasoning about the aspects of the problem we need to solve, we listed a few functions and macros that might be helpful. The next step is to verify our assumptions and rewrite the example:

  1. parse now avoids the let block, including the annoying side effect of having to close the input stream, by making use of the with-open macro. The ->> threading macro has been used to give a linear flow to the previously nested XML processing.
  2. subs makes it really easy to process sub-strings. We don’t need an additional function anymore, because turning the first letter to upper-case is now a short one-liner.
  3. The key function in the new separate-words version is clojure.string/replace. The regex finds groups of one upper-case letter followed by lower-case letters. The last argument conveniently offers the possibility of referring back to matching groups. We just need to add a space.
  4. format-decimals delegates almost completely to clojure.pprint/cl-format, which does all the work of formatting decimals.
  5. zipmap brings in another dramatic change in the way we process the map. We can isolate changes to the keys (composing word separation and removal of the unwanted “:”) and changes to the values into two separate map operations. zipmap conveniently combines them back into a new map without the need for reduce or assoc.
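Again, the chapter's actual listing is not shown here; the following is a hedged sketch of the reworked helpers, where the string keys, clean-key and the format-balance wrapper are my own assumptions for illustration:

```clojure
(require '[clojure.string :as str]
         '[clojure.pprint :refer [cl-format]])

;; Hypothetical sketch of the rewritten approach described above.

(defn up-first [s]
  ;; turning the first letter upper-case is now a one-liner with subs
  (str (str/upper-case (subs s 0 1)) (subs s 1)))

(defn separate-words [s]
  ;; $1 refers back to the matched group; we just add a space before it
  (str/replace s #"([A-Z][a-z]*)" " $1"))

(defn clean-key [k]
  ;; drop the unwanted ":" at the beginning of the attribute name
  (if (str/starts-with? k ":") (subs k 1) k))

(defn format-decimals [d]
  ;; ~,2f prints a floating point number with exactly 2 decimal digits
  (cl-format nil "~,2f" d))

(defn format-balance [balance]
  ;; transform keys and values independently, then zip them back together
  (zipmap (map (comp up-first str/trim separate-words clean-key) (keys balance))
          (map #(if (number? %) (format-decimals %) %) (vals balance))))

(format-balance {":currentBalance" 100.7, ":accountName" "jane"})
;; => {"Current Balance" "100.70", "Account Name" "jane"}
```

The key transformations are now a single comp pipeline over the keys, while the values get their own map pass; zipmap stitches the two sequences back into a map.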

The second example shows an important fact about “knowing your tools” (in this case the Clojure standard library): using a different set of functions not only cuts the number of lines from 45 to 30, but also opens the design up to completely different decisions. Apart from the cases where we delegated entire sub-tasks to other functions (like cl-format for decimals or subs to clean a key), the main algorithmic logic took a different approach that does not use reduce or assoc. A solution that is shorter and more expressive is clearly easier to evolve and maintain.

The well-kept secret of the Clojure Ninja

Learning the functions of the standard library is a process that usually starts at the very beginning: it happens when you first approach a tutorial or book, for example when the author shows a beautiful one-liner that solves an apparently big problem.

Usually developers don’t pay explicit attention to the functions in the standard library, assuming that knowledge will somehow accrue while studying the features of the language. This approach can work up to a certain point, but it is unlikely to scale. If you are serious about learning the language, consider allocating explicit time to understand the different nuances of similar functions or the content of some obscure namespace. Proof that this is time well spent can be found by reading other people’s experiences: the web contains many articles describing the process of learning Clojure or documenting discoveries (possibly the best example is Jay Fields’ blog).

The following is a trick that works wonders for becoming a true Clojure Master. Along with learning tools like tutorials, books or exercises like the Clojure Koans, consider adding the following:

  • Select a function from the Clojure standard library every day. It could be at lunch or during your commute, for example.
  • Study the details of the function in front of you. Look at the official docs first, try out examples at the REPL, and search the web or www.github.com for Clojure projects using it.
  • Try to find where the function breaks or other special corner cases. Pass nil or unexpected types as arguments and see what happens.
  • Rinse and repeat the next day.

Don’t forget to open up the source for the function, especially if it belongs to the “core” Clojure namespace. By looking at the Clojure sources, you have the unique opportunity to learn from the work of Rich Hickey and the core team. You’ll be surprised to see how much design and thinking goes into a function in the standard library. You could even find the history of a function intriguing, especially if it goes back to the origins of Lisp: apply, for example, links directly to the MIT AI labs where Lisp was born in 1958! (eval and apply are at the core of the meta-circular interpreter of Lisp fame. The whole history of Lisp is another fascinating read on its own; see any paper by Herbert Stoyan on the matter.) Only by expanding your knowledge of the content of the standard library will you be able to fully appreciate the power of Clojure.

Summary

  • The standard library is the collection of functions and macros that comes out of the box by installing Clojure.
  • The Clojure Standard Library is rich and robust, allowing developers to concentrate on core business aspects of an application.
  • Information about the Standard Library tends to be fragmented.
  • Deep knowledge of the content of the Standard Library dramatically improves code expressiveness.

XT-16 Conference Notes

XT-16

XT-16 was an amazing single-day conference organized by our friends at http://juxt.pro. Almost everything about the conference was special, starting with the location. The conference was held inside a well-known automotive proving ground just outside Millbrook, a little village in the English countryside. After clearing security (including covering each and every camera with special stickers to prevent taking pictures), a small van took us to the “Pod”, a semi-spherical structure where the conference took place. Unfortunately, apart from a little glimpse of the tracks and the far-away noise of powerful engines, we weren’t able to see any Ferrari testing from there :(

At the Pod we were welcomed by the wonderful smell of the Bogota Coffee Company. I enjoyed a couple of high-quality coffees just after registration and one more in the afternoon: can’t think of a better way to digest the hog roast sandwich we had for lunch! Just before kick-off, a meditation session was available to attendees in a separate room to prepare the mind and soul for the conference. The welcome pack included the t-shirt, a bottle of specially brewed XT-16 beer, a signature glass and a ukulele strap. A whole bunch of ukuleles were distributed in the room along with a little guide and chord progressions for a couple of songs. Ukuleles played a central part in the conference, being used by speakers, hosts and guests to all play together on stage!

During the day the signature beer was always available (along with other drinks) to ease the task of unpacking the dense and thought-provoking talks. Drinking carried on with pizzas after the conference, including Sam Aaron’s spot-on dynamically generated grooves. We were finally taken back to the train station (and Milton Keynes) in a classic-style double-decker bus. If you don’t mind the unreliable wifi and the overcrowded toilets (two minor points in my opinion), the rest was perfect.

Two of the guiding themes of the conference were “playing” and “exploring”. The ukulele covered the first aspect, along with the unique possibility of relaxing by throwing axes (!) or shooting archery outside the room. The speakers often explored topics inside and beyond Clojure, ranging from “the roots” (Håkan Råberg - In Search of Simplicity) to “the future” (The Humanity of Our Industry - James Woudhuysen). The talks were all high-quality, entertaining and interesting. I have a few personal favourites, but your mileage may vary.

I was blown away by Karsten Schmidt, Håkan Råberg and James Woudhuysen. The density of information in those talks will take me years to unroll, if ever. More approachable but still very enjoyable talks: Tommy Hall and Sam Aaron. Let me spend one more word on Sam: I started attending his talks sometime in 2012, during the Overtone era. Although his talks tend to repeat the same themes over and over (and he explained this on stage), Sam continues to be an inspiration for me about how to look at our profession from a different angle. His talks are always inspiring, no matter what.

After a quick introduction by Malcolm and Jon, the conference kicked off. I took notes on each talk, which you can see below. I wrote them quite fast and couldn’t pay too much attention to syntax and good English; apologies for that. Although they are not perfectly written, they are sequential and made in such a way that you can follow the main points of what was said. Many thanks to JUXT, I’m looking forward to XT-17!

01 The state of open source - James Lewis

  • where does the best software come from?
  • open-source, the past, most-sold? What is best
  • thompson, ritchie, multics, interactive OS
  • bell, move away from batch systems to interactive systems
  • new operating system inspired by multics, unix, a success story
  • C is B with data types, dennis ritchie wrote it at the same time as unix
  • not fast enough for system programming language

successful software: present and future

  • spring framework is extremely successful
  • spring in the age of XML configuration
  • a response to J2EE by sun
  • they seem to reinvent themselves every time, with spring-boot for example
  • back to the past: fetchmail, a well documented utility
  • the cathedral and the bazaar describes the software dev of fetchmail
  • cruisecontrol the integration server, actually by thoughtworks
  • one of the first big agile projects in the world, with people in us, india and uk
  • cc was a response to create automation around integration of software coming from different teams
  • jMock original story: the servlet API was horrendous to deal with in testing
  • you had to implement every single method of a huge interface
  • early ‘90s is when linux dev started, a new open source unix kernel
  • linux was an aggressive collaborative software project, 1k people working on the same project
  • richard gabriel wrote the article “worse is better”
  • linux broke the phased software process development
  • all bugs in open source are shallow: more eyes looking for bugs, more likely for the bug to be found and fixed

the technology radar

  • technology radar data back to 2010, what are the trends in open source?
  • 2010: distributed vcs, 2016: docker docker
  • clojure on the radar, appeared 2010, then JVM as a platform in general
  • trial in 2012, and adopt in 2014, but then nothing else.
  • is it because TW is biased by using a lot of clj? not sure
  • let’s compare this to JS
  • JS first heard in 2010, then always mentioned till 2016 with reactJS
  • nodeJS is similar, touching on every year on the radar with different aspects for it
  • NPM drinking game: pull a name, if you can npm install it, you have to drink.
  • you’ll be surprised about how many things getting installed
  • next observations about microservices
  • you can see gradually building up on microservices tooling and infrastructure
  • but microservices never got into “adopt”
  • but since 2013 there is a huge explosion into services related to microservices

monetize the open source movement

  • microsoft recently open sourced a lot of things, arriving at virtuous companies by OSS
  • many companies are opensourcing on large scale: should be worried?
  • what concerns?
  • they opensource for attracting good people
  • but the community they attract is a closed community, leaving out people who are not github junkies
  • TW is for example looking into promising school drop-outs that cannot afford going to school in india
  • at the beginning it was someone scratching an itch
  • from the ‘80s onward companies got interested in monetizing this effort

the future

  • can we interpolate how the future looks like?
  • humanity augmented: software augmenting aspects of humanity
  • VR, augmented reality
  • retinal projection: lasers scanning directly onto the back of your retina
  • a way of stimulating your retina directly to experience true vision
  • security, privacy, transparency is another important aspect for the future
  • autonomous corporation: charles stross book, accelerando
  • automating creation of companies
  • nano-technologies and the rise of the robots
  • “we are living in the future (it’s just not evenly distributed yet)”
  • ska square kilometer array: a radio telescope being built atm
  • ska software is open source: you can find aliens from your mobile
  • industry seems to be moving to this utopian future
  • who is going to create this future? at the moment: FB, google, IBM
  • it was not very different 30 years ago.
  • the cathedral is dead: open source is the way forward.
  • companies are open sourcing more
  • as developer we’ll continue scratching itches
  • the best way to predict the future is to invent it (Alan Kay)

02 Computational Design - Karsten Schmidt

  • too much going on visually in this talk, not many meaningful notes
  • postspectacular design practice since 2007
  • computational design, turn design into a process to dissolve things into their essence
  • his work on the boundary of commercial and art-oriented
  • open source is one the clearest idea of contributing actively to society
  • “the limits of my language are the limits of my world” Ludwig Wittgenstein
  • the atari machine ran at 1.97 MHz and was quite limited compared to current hardware
  • started programming on paper, because there was 1 hour of programming per week
  • idea of going from nothing to big things
  • like points: in one dimensions is just a point, but in 2D can become a line and so on
  • bezier curves shaped from randomly generated points in 3D
  • after you get the basic shape, by mixing them, you can get letters and so on
  • toxiclibs.org, home made code
  • 300 building classes in java
  • several examples of generative design, like longest UK led wall
  • reaction-diffusion simulation
  • from 0D to higher-D
  • thi.ng, largely done in java
  • over the years, going back to other languages that enable more platforms
  • visual programming language with 8 different words, can be used in the browser
  • idea of branching taken to 3D object evolving into something else
  • thi.ng/morphogen project
  • 8 different operators: split, scale, pull, etc
  • pick a segment, then using the operators (like reflection) one could create a grid
  • transplant object from one design to another, automated by genetic algorithms
  • the grow is then verified for fitness creating incredible shapes
  • every object is 3D printable and derives from a deterministic tree-structure
  • using genetic programming to evolve a “logo”

javascript implementation of forth in the browser

  • @forthcharlie
  • charles moore: inventor of forth
  • almost no syntax, just basic stack rules
  • implementation of forth in javascript
  • interesting approach: data first, action next
  • the “.” operator pops the head element from the stack
  • writing pixel shaders in forth through charlierepl
  • forth compiles itself to glsl and you can inspect it
  • the more iterations are added the more the painting becomes complex
  • just math applied to basic colors
  • why forth? super-light, very simple, multiple devices
  • the same process can be applied to audio!!!
  • showing a synth and composer in 50k of forth code
  • brain melt.

03 The Search for Simplicity - Håkan Råberg

  • the search for simplicity
  • hakan raberg
  • from an RDF semantic web project
  • regain sanity, lower abstraction, remove metalevel, go back to school
  • early '90s: amiga 68k assembler, then x86 assembler at school
  • what has changed in 20 years?
  • not necessarily going into retro-programming
  • asm-one simpler amiga editor for assembly
  • guru meditation!
  • devpac 3 “the new standard” from 1991, an assembly IDE from HiSoft!
  • Borland Turbo Assembler!
  • TIOBE today: assembly language is back, it’s climbing! Probably for IOT or mobile

Akeem Scheme

  • Scheme R7RS small, a version of modern Scheme from the last 5 years
  • It’s JIT: code as data
  • 5636 LOC of x86-64 bootstrapping
  • 1611 LOC Scheme on top of that
  • glibc based, for sanity
  • there is no “assemblyUnit”! using diff and “entr” instead
  • the unit test suite is a scheme file looking into the golden master for diffs
  • basic mark and sweep GC, after reading a lot of papers, decided to go for something simple
  • 2 months full time, then closed emacs and moved on with life

Modern x86 assembly

  • 16 registers and 16 XMM registers
  • CISC instruction set, 1-2k instructions! Big big language.
  • Find your subset and feel comfortable with it.
  • mov, jmp, add, xor, for example and many many more.
  • stack grows downward
  • return everything in rax, rdx registers
  • scheme primitives are using assembly macros
  • use assembler to assembly data to build up bytes
  • for a simple lambda:
  • establish a frame
  • you don’t want to go into the RED ZONE!
  • take the bytes that have been assembled and compile them
  • old school scheme technique to pass things around using conventions in bits
  • convention to clear registers by “xor” them

conclusion

  • modern CPUs are so complex that down to the hardware is just an illusion (again)
  • need to be strict about mutability, if you don’t know what’s going on you trash the CPU
  • conventions are extremely important to scale it up
  • you need to build the layers up before doing anything useful

04 Unleash Your Play Brain - Portia Tung

  • unleash your play brain - portia tung
  • play for life, working knowledge of play
  • play assessment, play or nay
  • why should I play, do I have the permission to play
  • stuart brown M.D.: seemingly purposeless, voluntary, inherently attractive, time flies by
  • reduced sense of self-consciousness, potential improvisation
  • playing: shapes organism brain, makes smarter and adaptable, foster creativity
  • seeker VS skeptic, knowhow-desire: can/cannot play, want/wouldn’t play 4 quadrants
  • opposite of play: work, punishment, TV, middle-management, depression
  • world: without movies, fairy tales, or anything.
  • does work foster creativity?
  • business in Chinese means meaning of life
  • spring source never met face 2 face: how did you decide how to indent square brackets?
  • they agreed on a book they believed was good. they became great thanks to that level of collaboration
  • adult playing: breaks down barriers, open mind to learning, create social connections, joy and hope
  • 5-10’ to play per day is enough, 1 day of play lasts one week. Little and often is better.
  • the chimp paradox book, Dr. Steve Peters
  • the “chimp” is in the limbic (core) part of our brain.
  • if the input goes through the chimp it’s emotional thinking, “staying alive”
  • the limbic system is the lizard brain, very very concerned about being dead
  • think about a toy you remember from your childhood: how do you feel after that?
  • nostalgic? embarrassed? happier? and so on.
  • find your “chimp” and play with it. externalize it and try to name it.
  • distinctive thinking is very valuable
  • alternative way to look at intelligence: 8 different types
  • in addition to school mantra: reading, writing, arithmetic
  • our identity risks becoming tied to only reading, writing, arithmetic
  • growing tomato plants from a seed is a form of intelligence
  • as are great visual-spatial awareness, music, and so on: all forms of intelligence
  • no longer do we have IQ, we also have playing IQ
  • playing: anticipation, surprise, pleasure, understanding, mastery, poise
  • stop distinguishing between playing and learning: you’ll see people growing a little bit

05 Adventures in User Interfaces - Kris Jenkins

  • adventures in user interfaces
  • 12-18 months kris has been in other techs
  • 1995 javascript was invented, at the time we thought it was for popup boxes
  • it was great for doing stuff that didn’t really matter
  • until 2001: into the time of jquery, turning javascript into a single platform
  • until 2006-7: gmail, the first mainstream thing that you could do in the browser
  • we suddenly realized we could build entire front-ends with it
  • with angular: get away from low level dom manipulation stuff into raising abstraction
  • then our expectations just grew: how far can we take things?
  • but reality didn’t grow that fast at all, javascript is not keeping up

there is hope

  • wave of new things coming, hope for saner languages
  • new flush of languages for the browser to build new things
  • first: because they have given time to mature. Celebration of hammock time. Like clojurescript
  • second: other languages were trying to avoid JS, writing something else.
  • clojurescript, elm, purescript: we have to build with better tools
  • also bringing new architectures, way of thinking into play
  • the biggest idea in programming is data and data modelling
  • running a wikipedia search example: it returns an answer which is the straight page, or a list of possibilities
  • if you add an extra part to the description coming back, all of a sudden the compiler says nope.
  • it’s a good feature if your language has a way to describe data and comes back with messages about it
  • that kind of description of data comes usually with academic languages that are now more mainstream
  • 2 big ideas: data and functions. But there is another important feature: illuminating design choices
  • the structure of the function is separated from the body of the function, so you can read them separately
  • the type declaration is the shape: the icon function takes some dom and returns a string (for example)
  • the body then executes that declaration into a code implementation
  • instead of string->string it takes an “Icon” and returns “HTML”. much more precise.
  • elm is a language with a particular idea about how architecting web applications should be done
  • example architecture: events in -> state of the world -> f(state) -> another state -> out-html
  • functional architecture: f(data)->data or a f(data,data)->data,tasks
  • internal-model: changing the data in a html form is changing a data structure underneath
  • a progress-bar: a function from data->int that takes exactly the same form-data model into another shape
  • elm-rays example: tracking the pointer position on the square. Architecture is the same as the input-form.
  • seats experiment live: you can connect, give your position and your position in the audience shows up on screen
  • a function(active-seat,name)->some SVG to render your seat on the canvas
  • now we are going to change it so that if you answered questions you show up in red instead
  • now we change the seat render so that we render out the question you answered
  • elm: haskell made accessible to the mainstream, way less scary learning curve

06 Unlimited Register Machines - Tommy Hall

  • daniel dennett “intuition pumps and other tools for thinking” book
  • URM unlimited register machines
  • program, a finite sequence of instructions
  • machine we are going to look at: (end), (inc n m) (deb n m p)
  • end instruction closes the program
  • deb is decrement and branch: is the only branching instruction in this language
  • state: a program counter, some registers, the program
  • computation is a sequence of states
  • first program example: a few steps through simple registers that are incremented
  • this first program is just recurring into registers to sum up the two initial numbers
  • there is a graphical representation for a program like this.
  • the double arrow is a branch on zero condition, single arrow is the default sequence
  • showing a graph for a program that is doing multiplication
  • let’s build a DSL in Clojure for this simple URM
  • urm->fn is the macro taking our instruction and creating a program
  • the program can be invoked as a function and so on
  • lecture notes: computation theory for the computer science tripos part 1b
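The three-instruction machine described in these notes can be sketched as a small Clojure interpreter. Everything below (the step/run names and the keyword encoding of instructions) is my own reconstruction for illustration, not the urm->fn macro from the talk:

```clojure
;; Hypothetical URM interpreter: (end), (inc n m), (deb n m p).
;; State is a program counter, some registers, and the program.

(defn step [{:keys [pc registers program] :as state}]
  (let [[op n m p] (nth program pc)]
    (case op
      :end (assoc state :halted true)
      ;; increment register n, jump to instruction m
      :inc (-> state (update-in [:registers n] (fnil inc 0)) (assoc :pc m))
      ;; decrement-and-branch: the only branching instruction
      :deb (if (pos? (get registers n 0))
             (-> state (update-in [:registers n] dec) (assoc :pc m))
             (assoc state :pc p)))))

(defn run [program registers]
  ;; a computation is a sequence of states; run until halted
  (->> {:pc 0 :registers registers :program program}
       (iterate step)
       (drop-while (complement :halted))
       first
       :registers))

;; the addition program from the talk: move R1 into R0 one unit at a time
(def add-program
  [[:deb 1 1 2]   ; 0: if R1 > 0, dec R1 and go to 1, else go to 2
   [:inc 0 0]     ; 1: inc R0, go back to 0
   [:end]])       ; 2: halt

(run add-program {0 3, 1 4}) ; => {0 7, 1 0}
```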

encode everything as numbers

  • godelization: a key move in computability theory, a computer has a way to express programs as numbers
  • we take a program, represent it as numbers, feed it into a URM, and it will express the result as numbers
  • we need first a way to represent “pairs” as binary numbers
  • we can code pairs with code-pair clojure function and a decode-pair[binary] that brings them back
  • once we got pairs, we can get lists. Showing a way to encode lists and binaries, for example 0 is null or empty.
  • pairs are used as cons cell binaries: two zeros represent a 2, three zeros represent a 3 and so on
  • all separated by ones, that’s more or less the idea.
  • next step: encoding the end, inc and deb instructions considering them lists: (inc i j)
  • (inc i j) can be encoded as a binary and when decoded can be executed
  • 0 is used as the terminator, so it has a special meaning.
  • there is a way to decode-encode pairs (ending with the * symbol) which include or exclude the zeros for encoding
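A sketch of the pairing described above, using the standard encoding <<x,y>> = 2^x(2y+1) from the lecture notes the talk references; the function names follow the notes, but this is my reconstruction rather than the talk's actual code:

```clojure
;; <<x,y>> = 2^x * (2y + 1): in binary this is y's bits,
;; then a single 1, then x trailing zeros.

(defn code-pair [x y]
  (* (bit-shift-left 1 x) (inc (* 2 y))))

(defn decode-pair [n]
  ;; count trailing zero bits to recover x; the rest recovers y
  (loop [n n, x 0]
    (if (even? n)
      (recur (quot n 2) (inc x))
      [x (quot (dec n) 2)])))

(code-pair 2 3)  ; => 28  (binary 11100: y=3 is "11", a 1, then two 0s)
(decode-pair 28) ; => [2 3]

;; lists as iterated pairs, with 0 as the empty list
(defn code-list [xs]
  (reduce (fn [acc x] (code-pair x acc)) 0 (reverse xs)))
```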

stacks are really good ideas

  • adding 3 more instructions: copy, push, pop to implement a stack
  • once we have these new instructions we can push and pop from registers
  • we can make the URM a UURM, a universal unlimited register machine
  • like a good turing machine is universal, we can create programs that can execute other programs

next steps

  • graphical editor and simulator
  • faster interpreter for 3 instructions URM
  • property based testing
  • book reference: martin davis “the universal computer”
  • book reference: computability by Cutland

07 ClojureScript without Borders - Frankie Sardo

  • rebuilding inventory written in xml and xslt
  • rewritten with clojure and datomic and able to pull the nice to have things
  • around 20k loc of clojurescript to build the art gallery CMS
  • one goal was to add the mobile friendly site
  • they wanted to access the inventory and modify it from the mobile app
  • it looks simple to just send the same thing to mobile, but it wasn’t
  • how to port the existing cljs code into a mobile app?
  • using an android emulator to browse the site and it looks native
  • the good part is that you can use chrome to just debug it
  • changing things and seeing them live as you change them
  • the next level is to evaluate some code on the IDE and see that reflected on the running app
  • the live demo showing that re-evaluation of clojurescript in the IDE is changing the app
  • another good point is the possibility to send a bug report along with the state of the app at that point
  • same goes for sending the list of the last 1k states instead, so you can see how it got there
  • how hard is it to create a notification when doing web development, or take a picture?
  • when the REPL is connected, with a few lines you can set an alarm, make a phone call etc

exploring the way the application is rendered

  • loading the same application on ios, android emulators and a real browser
  • click-on-a-button bug: a button doesn’t work, but it should submit a form
  • so I go into the application and print an alert instead. The state is the same
  • but the code has changed to fix the problem
  • another example
  • changing the order of graphs on the page to put the pie-chart at the top
  • what if something broke on the mobile app?
  • so wouldn’t it be nice for the change to be propagated to all applications at once?
  • the code that is enabling this is figwheel BTW
  • it seems to be the only way to do development for multiple platforms at once
  • the shorter the feedback loop the more you stay in the loop
  • the reload-demo is available open-source to check that out

more tools

  • devcards
  • built on the idea that a view can be played as a series of application states
  • look at how the view behaves while states are changing
  • edge cases show up in all possible detail
  • microsoft
  • MS created a tool that allows you to change the JS and that gets deployed live right away

08 Looking Beyond Clojure - Martin Trojer

  • looking at some of the typed languages with clojure glasses
  • all roads lead to haskell
  • extremely subjective distorted reality bold statements are just about to happen
  • clj pro for about 5 years, some other FP experiences here and there
  • worked on several >30k LOC of Clojure projects, mostly webapps

there must be a better way

  • people who adopted FP are explorers, rejecting the status quo
  • apparently when you take that lead, that voice in your head doesn’t go away
  • stuff that I really value:
  • refactor with confidence, being able to do change, being confident
  • code must scale: it shouldn’t be a massive problem for the code to grow
  • it should be possible to make code changes 3 months to 3 years later without too much remembering
  • what to look at when shopping new languages?
  • higher order functions
  • no null, too old for nulls
  • values instead of variables, immutable data structures
  • good type systems, controlled side-effects

let’s talk types

  • very vague term with a lot of misconception
  • c-style types: quite limited
  • ml-style: modern type, what you really want
  • what is a bad type? they are “punitive”
  • there is so much typing going on that I feel like I’m working for the compiler
  • it should be the other way around clearly
  • difficult to express types, lot of repetition and so on
  • good types
  • simply describe a lot more of what is going on
  • look for: higher kinded, rank-N, traits, type-classes and so on
  • controlled side effects
  • Kris Jenkins: “side-effects are the complexity iceberg”
  • Knowing what part of the code is pure is very important

what else is out there

  • can it run in production, how do I deploy
  • how do I log, how to profile and so on and so forth
  • how are the stack traces?
  • Haskell
  • full and mature compiler
  • improving tooling history: after so many years of haskell hell
  • after cabal era
  • there are also quite a bit of libraries, even good ones
  • some down sides:
  • lazy evaluation sometimes bites you from behind: time/space reasoning
  • what about the M word?
  • it turns out that it’s a very important construct, but the implementation is so simple
  • it’s just about implications of that idea

elm

  • mature (ish) fantastic tooling, opinionated
  • faster and better code
  • really nice compiler errors, nothing like that in other type systems
  • very opinionated, so better use the elm architecture
  • some magic to make your life easier
  • coming from haskell (equality signature for instance) you’ll go: what?
  • but you don’t need to go full-haskell
  • FFI: when you want to call out, it’s not exactly there

purescript

  • JS won, deal with it
  • influenced by haskell
  • lightweight, not a framework
  • FFI is a little better, though it’s easier to blow up your program
  • runs well on node and browsers
  • really nice react wrapper
  • new and immature, with a small community
  • lack of libraries

09 Communicative Programming - Sam Aaron

  • languages are not that important, ideas are
  • how do you think about programming
  • standard engineering: efficiency, not the only thing
  • what other efficacies are there? what about communicative power?

communicative programming

  • the business value of making people dance is keeping them dancing
  • the question becomes: how should a language be in order to keep people dancing?
  • the other aspect is how to change the language so that we can communicate about applications
  • how about education? kids don’t care about quick sort or bubble sort
  • kids have their own idea about the world, the language needs to talk their language
  • human interaction with programming environments
  • how to take those ideas into education and arts
  • a program easy to understand is easy to communicate to other people
  • art and expression
  • code is the most powerful medium ever invented, why do we use it for business only?
  • you wouldn’t use English for contracts only
  • what would be other possible aspects to target with code?
  • how to talk with code, how to dance with code?

how to think

  • not using the hammock time like Rich does
  • spec was thought through really deeply, and that was ok
  • but you could go for a walk, and while walking taking notes and keep thinking
  • the office space is the worst place to think, when I want to work I go out of the office
  • physical activity is really important for thinking
  • practice a diary: communication happens over time
  • going into a school and saying this is your super learning curve is not interesting
  • how to simplify and package idea in a way anyone can understand?
  • it’s really easy to come up with ideas, but how to find a way that a child can understand them
  • so having a lot of ideas all floating around concurrently and walking is a way to think about them
  • this solves the problem of simplifying adoption of an idea or application
  • state of mind
  • when designing a system there are ideas that are great at the whiteboard, but they turn out bad in practice
  • the constraints of the use case are fundamentally important when thinking about the solution
  • a school is such a constrained environment: whatever the great idea, it needs to work within those constraints
  • every day you should practice what you preach: use the product you create and experience the pain
  • the flow of the context gives more ideas, which then flow into a new practice and so on

examples

  • boilerplate code is nuts, annoying and useless
  • play a sound should be the simplest possible action
  • a loop should be as easy, couple lines of code
  • but having two playing at the same time is not possible apparently: they are sequential!
  • so our environment should know that things evaluated at the same time should play at the same time
  • the live_loop construct is a new DSL that applies that concept, playing together.
  • sleep 1, for example, doesn’t really work with music, so it’s simple only on the surface
  • follow sam playing loops at amazing speed
  • live performing has shaped the DSL for making music in the way it is now
  • education needs for children shaped simplicity in a similar way

10 The Humanity of Our Industry - James Woudhuysen

  • deconstructing and reconstructing IT
  • how can anybody predict the future?
  • what is it that I don’t know that I don’t know
  • younger him on the oil rig platform
  • reported on Piper Alpha: the most dangerous time was off-time
  • accidents tend to happen
  • you can predict the future a bit
  • in design land everybody seems to be sure about their intuition
  • rationality and instinct are important to what you are going to do next
  • is it right to future proof your business?
  • the answer is don’t!
  • you get the idea that the future is something happening to you instead of to everybody
  • if you look at the news though, it sounds very bad: health panic is everywhere
  • are fears justified?
  • banning a lot of stuff in the UK, this is the british response
  • the great thing about innovation is that you don’t need to think about it
  • Donald Tusk: because of the Brexit vote, we are dealing with the end of civilization?
  • Harvard Business Review: what managers will say about bad things tomorrow is coming from here
  • prototypes and experiments are the way you convert unquantifiable uncertainty into quantifiable certainty
  • you should not future proof your business

myths about IT

  • this one was very difficult to note down, very very dense, load of words and ideas.
  • a language is what defines us as a human being
  • is it instead our playful activity?
  • but the way economy defines that is through work and industry
  • the invention of fire brings problems, but overall this is not a problem
  • attack your colleagues in IT when they say: exponential and disruptive
  • back in 1957-58, around the time Lisp was invented: first pressurised atomic reactor, double helix revealed, laser and so on
  • and today we say that technology is really accelerating? if anything it is slowing down
  • much deeper and subtle changes
  • HR: small H and small R, more important to your life

IT will destroy work

  • in 2013 it was predicted that 47% of american jobs are at high risk of automation and 19% at medium risk
  • the luddite triangle: stop being critical about IT
  • they broke up pieces of equipment in 1779
  • teenage scribblers are saying this, the thesis that IT will destroy millions of jobs
  • one of the great claims is that algorithms are neutral, unbiased and fair
  • the Financial Times compared algorithmic management to the crudest forms of exploitation
  • IT as fingersmiths
  • unemployment in the USA and Japan dropped, but not in Spain and Greece
  • IT is not destroying jobs
  • it is not happening because of barriers to innovations, lawyers and so on
  • and above all the obsession of risk
  • if you want to carry on coding and move higher in the salary scale, it’s time to get serious about forecasting
  • look at the slowing of innovation: when will you be able to buy personalized medicine at a booth?
  • read an innovation book each month, that alone would be an innovation

why innovation is not happening at the pace we expect?

  • crisis of capital investment in the West and India
  • we get too few of them
  • technological unemployment: there is not enough tech going on
  • even in IT: spending as a share of GDP has dropped 1% since the beginning of the century
  • without the capital investment you’re not going to get the innovation you expect
  • USA is spending proportionally less on IT than 15 years ago
  • in the last few years USA productivity stayed flat instead of increasing
  • laboratory is too risky
  • being bought is just around the corner: it’s easier to just buy a company than invest in innovation
  • trump means a slump: there will be anyway, bad strategy, bad management, too many wars
  • Germany and the USA are more reliant on immigration than automation
  • agile now means traveling light, not making investment
  • it’s the lack of IT that is the danger for our future
  • hardware is as important as software in order to keep productivity up
  • investing in air quality, lighting, all factors are important

more IT future fields

  • less agreement, more heat
  • look at biometrics
  • eye tracking, gesture control, brain computer interfaces
  • open a file for biometrics, you’re going to work for that field pretty soon
  • mobile-first interfaces
  • cyber security
  • quantum computing encryption
  • insect drones?
  • wearable improvements: the medicalisation of our society is pushing us to new frontiers
  • 3D printed villas in 3 hours
  • building insoles with sensors to improve your walking
  • robots are not coming: small and more precise, but nowhere near taking over more than warehouses

Clojure Weekly, Aug 26th, 2016

There is always time for a late Friday edition! The Weekly is a collection of bookmarks, normally 4/5, pointing at articles, docs, screencasts, podcasts and anything else that attracts my attention in the clojure-sphere for the last 7 (or so) days. I add a small comment so you can decide if you want to look at the whole thing or not. That’s it, enjoy!

conway.clj Searching for programming perfection is a very healthy activity when constrained to a controlled environment (such as spare time and not at work!). This is the very essence of programming katas and, less formally, of any small problem that can be solved multiple times at will. Game of life is one such small problem that is fun to solve. There’s plenty of game of life implementations in Clojure (I have my very own posted at https://gist.github.com/reborg/09752a1409688365541bda89489d2c6f) and this is one example. I like the predetermined encoding of the neighbours space and the reduce on the tick entry point that can be put in a loop. You probably have yours, feel free to share @reborg.
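
A minimal tick in this spirit (a sketch of the well-known set-based approach popularised by Christophe Grand, not the code from the linked gist):

```clojure
(defn neighbours [[x y]]
  (for [dx [-1 0 1], dy [-1 0 1] :when (not= 0 dx dy)]
    [(+ x dx) (+ y dy)]))

(defn tick [cells]
  ;; a cell is born with exactly 3 live neighbours, survives with 2 or 3
  (set (for [[loc n] (frequencies (mapcat neighbours cells))
             :when (or (= n 3) (and (= n 2) (cells loc)))]
         loc)))

;; the blinker oscillates between a column and a row
(tick #{[1 0] [1 1] [1 2]})  ;; => #{[0 1] [1 1] [2 1]}
```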

Design, Composition, and Performance Here’s one from Rich that I never watched. In terms of pure talk performance this is probably one of his best (IMHO): a concise, to the point, coherent exposition (like always, but possibly more this time). I very much enjoyed the first part where Rich is teaching how to design, which is essentially about taking things apart on several different dimensions before putting everything back together. The second part compares programming with instrument playing and, despite still being an interesting topic, it becomes more about music than anything else. Still from this second part, an interesting takeaway is that instruments (like programming languages) aren’t designed for the first 5 seconds of the beginner’s experience. Some complexity in Clojure for beginners seems to be related to this aspect. I’d come back to this talk (transcriptions are available) to learn more about good design, especially at the beginning of a project.

(((Bruce Durling))) on Twitter: “Any clojure people going to use spec…” Interesting tweets on specs, the design up-front mindset, when to use them, how to use them. clojure.spec is definitely pushing the Clojure community to discuss the benefits of types (as well as the problems related to them). This Twitter thread illustrates some of the possibilities: use them all the way down, at the boundaries, for generative testing only, upfront, spec-after and so on. The first question from Bruce is about using them along with Plumatic/Schema: Schema for coercion, clojure.spec for generative testing.

Category Theory for the Working Hacker Tried hard, but got lost somewhere in the middle :) I suppose that if you can get away without understanding all the details, then the picture you have is exactly as described by Phil: something simple made complicated. I have enough brain to understand that there is something powerful in there, a useful unification of concepts. With category theory, you can work in the abstract for so many things at once: logic, programming, sets, types etc. You just need to replace the placeholders and everything is connected back together in your favourite field. Applied category theory is probably more relevant when the types are exposed by the language, so it probably pays off to learn some of it if you are a Haskell programmer. With Lisps, category theory can still make sense, but it’s not a game changer.

Combining Clojure macros: cond-> and as-> Despite the good offering from the standard library, there are entire libraries of extensions to threading macros, because they are so useful and flexible. This post is not a library so you’re free to copy and paste. The example combines cond-> with as-> just in case you need the placement of the threaded expression to appear in different places in a cond-> like thread. The proposed macro is short and easy to read.
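
The gist of the combination is visible even with the plain built-in macros nested by hand (an illustrative snippet, not the macro from the post):

```clojure
;; cond-> threads the value into first argument position; wrapping a
;; clause body in as-> lets that clause place the value anywhere.
(cond-> 10
  true  inc                 ;; first position: 10 -> 11
  true  (as-> x (- 100 x))  ;; arbitrary position: 100 - 11 -> 89
  false (* 1000))           ;; condition false: clause skipped
;; => 89
```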

jimpil/fudje: Unit testing library for Clojure Never had specific issues with Midje, although nowadays I’m using it mostly for historical reasons (it was the first serious testing framework around, designed around concepts I was used to in OOP). Specifically, this library mentions a couple of problems with Midje: the load of dependencies and the large syntax that Midje brings in. It also mentions the need to AOT compile code with tests, but I never had that need personally (tests being always outside the uberjar). If those are your problems but you still like mock-driven TDD, this library can be for you.

Clojure Weekly, Aug 10th, 2016

Hello readers! I was busy for a while and also enjoyed 3 weeks of deserved vacation! The Weekly is now back in business with articles, docs, screencasts, podcasts and anything else that attracts my attention in the clojure-sphere. I just add a small comment to each link so you can decide if you want to look at the whole thing or not. That’s it, enjoy!

Alan Kay and Rich Hickey about “data” : Clojure Interesting discussion going on here, albeit a bit philosophical (but this is what I’m expecting from them). Alan is defending the idea that data without an interpreter is difficult (if not impossible) to parse, implying that objects, with their mix of data and behaviour, can convey more meaningful data. Rich instead thinks that data can be interpreted just by looking at the surrounding metadata (like the source, arrival time and so on), picking a specific interpreter more by convention. I don’t think there is an absolute truth here and both positions are valid. Both the Reddit and the Hacker News threads are adding interesting comments/questions to the main discussion.

What the Hell is Symbolic Computation? / Steve Losh Beautiful introduction to the simple (but somewhat complicated) concept of symbolic programming. It’s a lengthy article but it’s worth the read. A Symbol is just another data structure that Lisps usually offer (along with the usual suspects: numbers, strings and so on). It’s not that special on its own, but fundamental in the way it is treated by the reader: it can be created (interned), dereferenced for use, quoted and so on. Apart from replacing “package” with “namespace” everything here applies to Clojure, despite the examples being in Lisp.
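
The core idea translates directly to Clojure, where a quoted form is just data made of symbols until you choose to evaluate it:

```clojure
;; a quoted list is data: a symbol followed by two numbers
(def expr '(+ 1 2))

(first expr)            ;; => + (the symbol, not the function)
(symbol? (first expr))  ;; => true
(eval expr)             ;; => 3
(symbol "foo")          ;; symbols can be built from strings at runtime
```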

Clojure News This is a nice (new for me) Clojure news aggregator. It looks and feels like yCombinator news (similar to filtering by “clojure” in the search box on HN). It contains also the “question”, “jobs” and “events” section that are quite interesting too. Sources of the website are available on Github: it’s basically cljs running on Heroku.

[PDF] The future of Standard ML - Robert Harper SML is the precursor and inspiration for many popular typed functional languages (including Scala, Rust, OCaml). It is still a very popular research and teaching language, but failed to get modern industry traction. There is no inherent impediment for SML to become more widely adopted, but it is certainly lacking robustness and standardization at this point, as explained here. Robert Harper, one of the original contributors to the definition of SML illustrates in these slides his vision for the future of the language.

[#CLJ-1553] Parallel transduce - Clojure JIRA Parallel computation options for Clojure haven’t changed (yet) with the introduction of transducers. If you are in search of easy parallelization there are a couple of options at the moment (apart from rolling your own): pmap and reducers. The first is especially simple and good if the function you map over each item is computationally intensive. Reducers are instead good for large collections that can be easily chunked (provided ordering is not a problem). This placeholder ticket should introduce the parallel option for transducers as well, closing the loop.
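
Side by side, the two options look like this (a sketch; `slow-inc` is a made-up stand-in for a CPU-heavy function):

```clojure
(require '[clojure.core.reducers :as r])

;; pmap: per-item parallelism, worthwhile when each call is expensive
(defn slow-inc [n] (Thread/sleep 10) (inc n))
(pmap slow-inc (range 8))  ;; => (1 2 3 4 5 6 7 8)

;; reducers: fork/join over a chunkable vector; the combine step
;; is order-insensitive, which is why chunking is fine here
(r/fold + (r/map inc (vec (range 1000000))))  ;; => 500000500000
```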

leifp/spec-play: Experiments with clojure.spec Unofficial attempt at spec-ing out the core library. Alex confirmed in the mailing list that the official version is coming but this project could give an idea about what to expect. Spec-ing of the standard library is very welcome both for human consumption and tool automation. I can imagine editors can take advantage of a function specification to highlight possible problems or making refactoring easier. I didn’t count them, but this project contains quite a good number from core.

Clojure Weekly, Jun 16th, 2016

Welcome to another issue of the Clojure Weekly! The Weekly is a collection of bookmarks, normally 4/5, pointing at articles, docs, screencasts, podcasts and anything else that attracts my attention in the clojure-sphere for the last 7 (or so) days. I add a small comment so you can decide if you want to look at the whole thing or not. That’s it, enjoy!

datafun Datafun is a hybrid functional/logic programming language. It follows the principle of adding higher-order functions on top of Datalog and putting everything together into a new language (as opposed, for example, to Datalog implementations embedded in Clojure). The idea is to extract what makes Datalog expressive into a language with the same characteristics which you can actually program in. The linked page shows some examples of what it looks like. A more formal paper is available at www.cs.bham.ac.uk/~krishnan/datafun.pdf. The linked project is also the Racket implementation of Datafun by the author of the paper.

Clojure spec with Rich Hickey - Cognicast Episode 103 In this Cognicast, Rich iterates through the rationale of clojure.spec, its features and possible uses. Of all the resources out there, this podcast along with the official guide are the most authoritative resources about the design and use of clojure.spec, respectively. I had a look at how it was implemented, skimming briefly through the sources, and was impressed by the relative simplicity of the solution. At the same time it is bringing a lot into Clojure core, since starting from 1.9 you won’t need libraries to spec out functions, generate sample data and use generative testing against your code.

A Tool For Thought The community is still absorbing the clojure.spec news and blog posts keep appearing trying to describe what this is all about. David Nolen’s introductory post correlates spec with papers and ideas in computer science, coming to the conclusion that this is an optimal tool to achieve crispness in large codebases. It might be a little too celebratory in tone, but it contains a nice example of spec-ing out a let form at the end.

Scripts · boot-clj/boot Wiki Boot has a quick way to write an executable Clojure script that doesn’t need any “java -jar” if you have boot installed. It includes the possibility to specify dependencies and a main with args parser helper. The main use case here is to put together an executable without necessarily going bash.

What are some uses of Clojure metadata? Right, I was wondering how and when metadata is used outside of documentation and compiler internals. Clojure metadata is advertised as a public feature, not just a compiler helper. There have been some propagation bugs in the past, but not enough to justify the lack of use of the feature. This Stackoverflow question contains one interesting option: using metadata as the basis for a “taint” marker for insecure strings.
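
Since strings themselves can’t carry metadata in Clojure (only collections, symbols and other IObj types can), a taint sketch has to mark a collection holding the values; `taint` and `tainted?` here are hypothetical helpers, not from the linked answer:

```clojure
(defn taint    [coll] (vary-meta coll assoc :tainted true))
(defn tainted? [coll] (boolean (:tainted (meta coll))))

(def params (taint ["some user input"]))

(tainted? params)               ;; => true
;; metadata is invisible to equality, so tainted and clean
;; values still compare equal:
(= params ["some user input"])  ;; => true
```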

My Increasing Frustration With Clojure This article and related Reddit thread are a good description of how the Clojure core team works and what to expect from them. Compared to other languages, Clojure is more conservative and pushes back on issues that some users consider unacceptable. It’s true that Clojure has many rough edges and inconsistencies but it’s also a remarkably pragmatic language if your main goal is to deliver working code.

Some thoughts on clojure.spec Nice distinction between static typing and contracts and what they can and cannot do. Some (mild) critique about clojure spec in terms of input validation compared to Schema coercions, although it is pretty much possible that the very young Clojure feature will evolve that way eventually.

Clojure Weekly, June 8th, 2016

ClojureScript Unraveled Nice book. I like the fact that it takes ClojureScript as a language on its own, thus assuming readers are not necessarily coming from Clojure. It is also “language” and not “browser” oriented, so you’re not wasting time with html processing. There are of course other resources showing cljs in action to build web apps (the majority), which usually lack the basic ClojureScript coverage this book provides in detail.

clojure/java.jmx and STM java.jmx, apart from being the library to use from Clojure to interact with JMX, contains a way to expose a Clojure ref type as a JMX bean. This turns an in-memory data structure into an in-memory concurrent data structure, potentially subject to concurrent changes coming from outside the JVM. Not different from an in-memory database. So one option is to use the STM and wrap those changes in a dosync.
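
The STM side of that idea is plain Clojure (sketch only; the java.jmx calls that expose the ref as a bean are omitted here):

```clojure
;; a ref holds the state that would be exposed as a JMX bean
(def config (ref {:pool-size 4}))

;; writers, wherever they come from, coordinate through transactions
(dosync (alter config assoc :pool-size 8))

@config  ;; => {:pool-size 8}
```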

PLaneT Package Repository : drocaml.plt I’m tinkering with the idea of integrating DrRacket to handle Clojure. DrRacket is a fine language editor with refactoring capabilities and it would maybe be worth the effort. So if there is this OCaml plugin, what the heck, there must be a way to integrate another Lisp! Saving here for future hacking.

Extending Software Transactional Memory in Clojure with Side-Effects and Transaction Control - YouTube Ah there it is. So last Weekly I gave a description of the eClojure extension to the STM, but I couldn’t quite explain the details of those changes. This is the video of the same presentation with a little more detail. It seems to converge on the idea of a state monad. The implementation allows operations to be sent on a queue and lifecycle events to be received in other parts of the code, such as onAbort. At that point, if some side effect needs to be attempted or retracted, it is under the developer’s control. Not sure I would use the STM if side effects are an important part of the equation though.

Clojure Remote - Keynote: Designing with Data (Michael Drogalis) - YouTube First Clojure Remote talk I watched. There are a few important ideas, although I’m not sold on all of them. First: the emphasis on data modelling. The information model comes before everything (but it is not Schema; Schema can be optionally generated from it). From the information model different API layers can be built on top, not just one. On top of everything, an optional DSL to drive human interaction. Finally: a log-driven approach inspired by the State Monad. Sure it’s cool, but it seems to be there to enable time-travel, which ultimately is a debugging tool. Do I always need it? Probably not.

Specter 0.11.0: Performance without the tradeoffs · nathanmarz/specter Wiki I was unsure about Specter when it first came out, because I wasn’t doing any of the fancy deep query paths that Specter supports. But with this latest release I start to see more compelling arguments. So I think: what if I change all of the operations on maps in my application to Specter, will I see a performance boost just for all the pre-compilation paths stuff? Perhaps. So making a Clojure weekly note.

Apache River - About Apache River Ohh, so Jini is alive? Didn’t know that. Wondering how much of the original code is still living there. Jini was an ancient Java distributed programming model based on the tuple spaces concept (something about Linda comes to mind). It went into obscurity after it failed to push the internet of things idea in the late ’90s, possibly ahead of the hardware available at the time (do you remember the Java-running-on-a-toaster meme?). What about a Clojure wrapper for that? Of the whole Jini package, the auto-discovery mechanism and the registry would certainly be interesting things to interface with.

Alan Kay’s reading list for his students | Hacker News Priceless Q&A with Alan Kay about the importance of reading, good books and reading lists. It of course includes at the beginning a reading list that Alan Kay shared at some point in the ’90s. He stresses how important it is to remember the essentials of each book, especially if you are a voracious reader like he is (4 books/week now, 10 books/week at his peak).

Clojure Weekly, May 31st, 2016

Welcome to another issue of the Clojure Weekly! The Weekly is a collection of bookmarks, normally 4/5, pointing at articles, docs, screencasts, podcasts and anything else that attracts my attention in the clojure-sphere for the last 7 (or so) days. I add a small comment so you can decide if you want to look at the whole thing or not. That’s it, enjoy!

Clojure - spec Guide The Clojure world was taken by surprise by the alpha release of a new 1.9 feature called “spec”. The reaction has been 90% positive, with people rushing to blog about how to use it and what you can do with it. One of the most important aspects in my opinion is the fact that it was included in Clojure core (unlike core.async for example). This implies that Clojure wants to embrace and push down to all developers a new vision about how types could be used to write a different Clojure. It is also important to note that spec doesn’t want to be a type system but more of a constraint checker. spec’s goal is to increase the likelihood of spotting bugs before going live (with property based testing) and get out of the way when the code is finally released.
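
A taste of the API from the guide (a minimal sketch; in the released versions the namespace ended up as clojure.spec.alpha):

```clojure
(require '[clojure.spec.alpha :as s])

;; a spec is a named predicate (or a combination of predicates)
(s/def ::age (s/and int? #(<= 0 % 150)))

(s/valid? ::age 42)   ;; => true
(s/valid? ::age -1)   ;; => false
(s/conform ::age 42)  ;; => 42
```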

17th Annual Scheme and Functional Programming Workshop 2016 Just to announce this co-located workshop with ICFP 2016 in Japan. The workshop is accepting presentations (not just papers) and Clojure talks are welcome. The rest of ICFP is also interesting, with workshops from other languages (Erlang, Haskell, OCaml for example) and paper discussions. If you want to see in which direction the functional world is going, this is the place to be.

skejserjensen/eclojure: Extending Software Transactional Memory in Clojure with Side-Effects and Transaction Control eClojure was presented at the last European Lisp Symposium. It is a fork of Clojure that implements changes to the STM (LockingTransaction and Ref mainly) to enable synchronisation of side effects inside a transaction. I have to admit that even looking at the slides wasn’t illuminating about what exactly the changes achieve. Unfortunately the code is missing commit diffs, so it’s almost impossible to understand what changed. So, if you understand what it does, leave me a message below.

eratosthenesia/lispc: “Lispsy” Lisp(ish) to C Converter (designed for CLISP) Similar to Ferret, this is a CL to C compiler. Like Ferret the main driving reason to have Lisp translated into C is low level interoperability with drivers or hardware. The readme, which is also a tutorial, includes a CUDA example. Why as a Clojurian should you be interested? Because it’s fascinating to know how a Lisp program parses itself and generates another language!

Clojure Remote 2016 - YouTube Juicy Clojure Remote conference videos are out. There are a couple I definitely want to see, possibly more, for example DrogalisM, ZachT, ColinJ and EricN. Others have intriguing titles so I might skim over them to see what they are about. In any case, unlike normal conferences, you get to see the speaker and the slides really close and clean, not to mention the crystal-quality audio, which is nice :)

jmgimeno/okasaki-clojure: Clojure implementation of some data structures described in Okasaki’s book A translation of Okasaki’s purely functional data structures into Clojure. Each namespace contains a data structure, including a helper “datatype” namespace that with some core.match is able to emulate some ML features, like pattern matching. With this in place the author was able to translate more literally from ML. It seems easy enough to read and it contains tests for each data structure.

jstepien/flip: ╯°□°╯︵ʍoɹɥʇ Yes, this project is for fun, but there is an interesting idea in there, namely the use of ASCII art and Unicode to visually mean something. Next time you need to throw an exception, use this little library and throw the exception literally and visually! Not recommended for your production project tho :)

(defn podcast [themes] | (conj themes 'Clojure 'ClojureScript)) Fantastic news! I’m an avid podcast listener and the Functional Geekery + Cognicast combo isn’t nearly enough to keep me busy. Hopefully this is a core technical Clojure podcast, no fluff just stuff, which is what I like the most. It’s based in Europe, which means that hopefully we’ll get more coverage of what’s happening in Clojure land on this side of the pond. Keep up the great job.

Clojure Weekly, May 18th, 2016

Welcome to another issue of the Clojure Weekly! The Weekly is a collection of bookmarks, normally 4/5, pointing at articles, docs, screencasts, podcasts and anything else that attracts my attention in the clojure-sphere for the last 7 (or so) days. I add a small comment so you can decide if you want to look at the whole thing or not. That’s it, enjoy!

SE-Radio Episode 254: Mike Barker on the LMAX Architecture : Software Engineering Radio Not strictly Clojure, but inherently functional. The Disruptor is essentially an in-memory message queue with configurable consumer priority policies. The topic is dense but brilliantly explained in this podcast featuring the current Disruptor maintainer, Mike Barker. Mike also takes us inside the business perspective of brokers and currency exchanges and why speed is so essential to that domain. The functional aspect comes from the fact that logic built on the Disruptor should avoid side effects as much as possible, so events can be re-played at any time to recreate the same state. Would it be possible to build something like this on the Clojure STM? Probably yes, but at the cost of performance. Garbage collection, allocation and locking are all parameters that need to be under control inside the Disruptor. Even if the STM is a convincing model for concurrency, it won’t be very deterministic (a lot of CAS, “compare and swap”).

Macro Grammars - Clojure Design - Clojure Development In a not so distant future, Clojure could implement macros using grammars instead of the ad-hoc implementations currently in core.clj. It won’t likely have an impact on end users (except for improvements to the existing way of writing macros). I think the discussion started after Colin Fleming presented the implementation he was forced to use in Cursive to parse macros. The underlying theory seems to come from PEGs and their flexibility for parsing programming languages. Some examples of how to describe macros using the new grammar are given at the bottom. That means the macro parsing code will be produced “for free” by the library used to parse the grammar, reducing the current complexity in core.clj.

Parsing Text with a Virtual Machine - Ghadi Shayban - YouTube In this Clojure/West talk Ghadi Shayban describes a nice little parser with a lot of good thinking behind it. First of all, PEGs (Parsing Expression Grammars) are a recurrent acronym in Clojure land, after it has been demonstrated that they are good at parsing Clojure itself; they are also included in a proposal for future implementations of macros in core. They are less formal than other grammars and easier to use. The presentation shows a grammar describing JSON in a screenful (go to the project on GitHub to see a CSV example). Second, the library uses the concept of a virtual machine, first translating the PEG description into a stack-based array of instructions that are then interpreted. This idea was borrowed from LPeg, the equivalent Lua library.

KLIPSE: a simple and elegant online cljs compiler and evaluator Interesting. So, taking advantage of cljs self-compilation abilities, it is now possible to edit and compile cljs from a browser without round-trips to the server for snippet evaluation. Apart from the actual tool, which can show you the JavaScript translated from cljs to understand how certain things are done, the entire site is a series of interesting blog posts, teaching you how to create something like Klipse in om.next from scratch.

daly/axiom: Axiom is a free, open source computer algebra system Speaking of large and used Lisp systems, Axiom is a computer algebra system like Mathematica. It has a very long history, starting in 1964 (in Fortran) and revived in the ’70s by IBM to create a commercial product. Once the royalties expired it was donated as open source and has been maintained as such since 2001. It is written with literate programming, so entire books are created when the project is built. Needless to say, Lisp has an intimate relationship with symbolic representation (the “s” in s-expression) and this is one of the use cases it was built for. What? Yeah, it would be nice to port it to Clojure.

Clojure Weekly, May 5th, 2016


Beyond Clojure: Prelude An article by Martin Trojer that looks critically at Clojure. The main pain point seems to be Clojure lacking static typing and becoming increasingly hard to maintain for larger codebases, especially on the ClojureScript side. I do agree that types might be useful (especially ones that can be added gradually), but the major source of pain for the projects I work on has been people not knowing how to structure their code. I also produce bad code on a first pass, but then I refactor, bringing clarity to functions and concepts. I struggle when people bang on Clojure, extracting functions at random. Types might help in carrying ideas over to other developers. So I’m not against them, I just use them when I see fit.

Clojure, The Good Parts Another article with constructive criticism about parts of Clojure that are sometimes misused or simply don’t work that great. I don’t agree with all of it, but certainly a good 75%. Some interesting discussions (especially about uses of the STM) continue on Hacker News. So why all this criticism all of a sudden? What I think is happening is that long-time users of Clojure (5+ years) are starting to see what is not working that great, after a few projects (big and small) under their belt.

Akeem is a small JIT-ed subset of R7RS Scheme written in x86-64 assembler as an experiment. This project is a treasure chest of interesting concepts. Akeem implements the R7RS Scheme standard on top of a just in time compiler written in assembler. It means that, once compiled down to an executable, assembled snippets are loaded into memory and executed on demand, including possible dynamic modifications made on the fly by the compiler. The README contains pointers to papers explaining the main concepts and also assembler literature to get you started on the hardcore part.

XT16 Conference We’ve got a new UK-based high-profile conference out there. It sounds like invitation only, lacking a CfP, but the line-up is already published and interesting. One good aspect is that it’s not strictly Clojure, but more functional and beyond. A second good aspect is that it’s not London-based. We need some countryside conferences. Seating is restricted to just 128 and I expect tickets will go pretty soon, despite the conference being in October. Buy yours.

Ferret User’s Guide Ferret is a Clojure (not 100% sure, but more likely a restricted subset of it) to C++ compiler. Once the C++ is produced it can be compiled to a normal executable (and there is a script for doing that in a single pass). The concept is interesting and well documented on the website using literate programming. Not sure if there is any production usage, but I suspect the main use case is driver/hardware interoperability more than a faster or better Clojure.

Why are body-macros more fashionable than thunks? - Google Groups The first rule of macros says: don’t write macros. Even more so if you consider that wrapping a chunk of executable code into a lambda (a thunk) lets you send it around. The real need for macros appears to be very specific (sometimes performance related, due to compile-time propagation). Despite this, there are many examples in the standard library of macros that could be rewritten as normal functions taking thunks. So what to do: write or not write a macro? As with anything in computer science: it depends. There is a readability cost implied in creating lambdas and, at the same time, there is some performance gain in avoiding stack frames. Probably the ultimate judgment should come down to readability, but always after first asking whether I could do without a macro.
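To make the trade-off concrete, here is a hypothetical `with-timing` (my own example, not from the thread) written both ways: the macro takes a bare body, while the function takes a thunk the caller must wrap:

```clojure
;; Macro version: callers pass the body directly.
(defmacro with-timing-macro [& body]
  `(let [start# (System/nanoTime)
         ret#   (do ~@body)]
     (println "elapsed ns:" (- (System/nanoTime) start#))
     ret#))

;; Function version: callers wrap the body in a thunk themselves.
(defn with-timing-fn [thunk]
  (let [start (System/nanoTime)
        ret   (thunk)]
    (println "elapsed ns:" (- (System/nanoTime) start))
    ret))

(with-timing-macro (+ 1 2))  ;; body passed bare
(with-timing-fn #(+ 1 2))    ;; body wrapped in #(...)
```

The function version is a first-class value you can pass around, while the macro version reads more naturally at the call site; that readability difference is the whole debate.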

Types are like the Weather, Type Systems are like Weathermen - Matthias Felleisen More discussion about types from the last Clojure/West. This time from a prominent academic in the dynamic language world, Matthias Felleisen of Racket fame. The approach Racket is taking (and that Clojure will soon follow) is gradual typing, a way to declare types in a typed world and then carry that information into the untyped world, more or less transparently. This enables fast time to release, refinement of the design, and types that can then be described incrementally. Once type information is in, the type system wraps around untyped instructions and, when an error occurs, it knows where the wrong assumption was made.

Being A Developer After 40 — Medium Enough said. Not Clojure related, but, oh boy, the wisdom per line in this article is sky high. #1 about not following the hype is really number one. KISS definitely comes second on my list. Teaching is something I always dedicate some effort to as well. Somewhat opinionated in other sections but interesting nonetheless. LLVM as the long-term future investment is a convincing argument.

Clojure Weekly, April 21st, 2016


One Million Clicks per Minute with Kafka and Clojure - Devon Peticolas - YouTube I started watching some of the latest Clojure/West presentations and I have to say the quality is overall very high. So let’s start with this 1M-clicks-per-minute architecture that makes good use of Kafka and Clojure consumers. What is basically happening here is that data is partitioned and packaged up in a smart way that makes processing a lot of it easier. The other key enabler is the way Clojure processes streams, allowing processing and reading to happen simultaneously with really simple code.

Fast Clojure Interesting blog post that shows the process that is sometimes necessary to answer apparently innocuous questions. We start from three ways of extracting the first element from a vector and ask which is faster (and why). The article then follows the Clojure code base in search of the right path, to show how they are implemented, and presents its conclusions. The only thing I would suggest is to avoid micro-benchmarking using the time function.
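For reference, a plausible trio of variants (my guess, not a quote from the article) looks like this:

```clojure
;; Three ways to read the first element of a vector — each goes
;; through a different code path in the Clojure code base.
(def v [10 20 30])

(first v)  ;; => 10, via the seq abstraction
(nth v 0)  ;; => 10, indexed lookup
(v 0)      ;; => 10, a vector is a function of its indices
```

For real measurements, a dedicated benchmarking library such as Criterium (which warms up the JIT and reports statistics) is a much better tool than wrapping calls in time.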

Clojure Hashing Report - Google Docs When it was discovered in 1.5 that Java-style hashing was inefficient for compound keys (vectors, sets, maps and so on), there was a very interesting debate around what hashing strategy it should be replaced with. Mark Engelberg (of Instaparse fame) wrote this very interesting document, which not only highlights Clojure’s problems with hashing and their solutions, but also general aspects of hashing. The discussion mainly resulted in Murmur3 being added in Clojure 1.6+ to alleviate this kind of problem.

macourtney/Conjure: A Rails like framework for Clojure. Neither new nor maintained. But I’m not sure why Clojure ran away from opinionated frameworks for web development. You might argue that LuminusWeb is the Rails-like framework nowadays. Maybe. But I can’t see here what was really the time saver while using Rails: rake scaffolding, partials and a strong convention on where to place them.

The Joys and Perils of Interactive Development - Stuart Sierra - YouTube Enjoyable talk from Stuart Sierra at the last Clojure/West. It is a nice summary of the challenges in tools.namespace, reloading and consistency at the REPL. Even if you might have heard this already, it’s an interesting and inspiring talk. I just disagree at the very end, when Stuart suggests “passing in” components to the code that requires them. That’s not bad in principle, but as soon as it happens, people with an OO background almost immediately slip back into the OO component mentality. The risk? Things that are not components (i.e. not stateful) become components. The trick is to keep the stateful system map available for namespaces to use, no function-injection of components required.
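A minimal sketch of what I mean (names and shapes are mine, not from the talk): keep one stateful system map that namespaces can reach, so pure functions never have to become “components”:

```clojure
;; One place for state; namespaces reach it when they genuinely need it.
(defonce system (atom {:db nil}))

;; Pure logic stays component-free — no injected dependencies.
(defn enrich-user [user]
  (assoc user :enriched? true))

;; Only the functions that actually touch state consult the system map.
(defn save-user! [user]
  (swap! (:db @system) assoc (:id user) (enrich-user user)))
```

This keeps the component boundary where the state actually is, instead of threading stateful arguments through every layer of otherwise pure code.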

bhauman/lein-figwheel: sidecar Since the amount of configuration that potentially needs to be in the REPL for ClojureScript can be overwhelming, figwheel also offers the possibility to “script” that configuration out of the project.clj. This looks similar to including a “user.clj” namespace at REPL startup for stuff related to in-REPL development. Since it’s very common these days to have a server serving a single-page app written in ClojureScript, you’ll likely use two REPLs, one for the client and the other for the server side. figwheel-sidecar can be useful to keep the two bootstrap sequences separated.

mapv - clojure.core | ClojureDocs mapv is like map, but it produces a vector as a result instead of a sequence. The other important aspect to consider is that mapv will walk the entire input to produce the vector (no laziness), unlike map, which produces a lazy (chunked) sequence. This is something you can verify with (take 4 (mapv #(do (println ".") %) (range 100))) and watching the 100 dots on the screen (instead of 32).
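The dot-counting experiment can be made precise by capturing the printed output, which shows both the eagerness of mapv and the 32-element chunking of map over a range:

```clojure
;; mapv realises every element before take even runs: 100 dots.
(count (with-out-str
         (take 4 (mapv #(do (print ".") %) (range 100)))))
;; => 100

;; map realises only the first chunk of the range: 32 dots.
(count (with-out-str
         (doall (take 4 (map #(do (print ".") %) (range 100))))))
;; => 32
```

Note the doall in the second form: without it, nothing is realised inside with-out-str and no dots are printed at all.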

Call for Proposals | JavaOne 2016 The JavaOne CfP is out! It’s common to see some Clojure-related talks appear there, maybe more on the Java interoperability side of Clojure, but not limited to that. Personally I have an interest in the Java sources, which I sometimes consult to understand some Clojure behaviour better. Apart from that, knowing some of the JVM specification and HotSpot internals helps explain some performance-related Clojure problems. Pondering a talk.