libpython-clj has opened the door for Clojure to directly interop with Python libraries. That means we can take just about any Python library and directly use it in our Clojure REPL. But what about matplotlib?

Matplotlib.pyplot is a standard fixture in most tutorials and Python data science code. How do we interop with a Python graphics library?

How do you interop?

It turns out that matplotlib has a headless mode where we can export the graphics and then display them using any method that we would normally use to display a .png file. In my case, I made a quick macro for it using the shell open command. I’m sure that someone out there could improve upon it, (and maybe even make it a cool utility lib), but it suits what I’m doing so far:

(ns gigasquid.plot
  (:require [libpython-clj.require :refer [require-python]]
            [libpython-clj.python :as py :refer [py. py.. py.-]]
            [clojure.java.shell :as sh]))


;;; This uses the headless version of matplotlib to generate a graph then copy it to the JVM
;; where we can then print it

;;;; have to set the headless mode before requiring pyplot
(def mplt (py/import-module "matplotlib"))
(py. mplt "use" "Agg")

(require-python 'matplotlib.pyplot)
(require-python 'matplotlib.backends.backend_agg)
(require-python 'numpy)


(defmacro with-show
  "Takes forms with mathplotlib.pyplot to then show locally"
  [& body]
  `(let [_# (matplotlib.pyplot/clf)
         fig# (matplotlib.pyplot/figure)
         agg-canvas# (matplotlib.backends.backend_agg/FigureCanvasAgg fig#)]
     ~(cons 'do body)
     (py. agg-canvas# "draw")
     (matplotlib.pyplot/savefig "temp.png")
     (sh/sh "open" "temp.png")))

Parens for Pyplot!

Now that we have our wrapper, let’s take it for a spin. We’ll be following along, more or less, with this tutorial for numpy plotting.

For setup you will need the following installed in your python environment:

  • numpy
  • matplotlib
  • pillow

We are also going to use the latest and greatest syntax from libpython-clj so you are going to need to install the snapshot version locally until the next version goes out:

  • git clone git@github.com:cnuernber/libpython-clj.git
  • cd libpython-clj
  • lein install

After that is all set up, we can require the libs we need in Clojure.

(ns gigasquid.numpy-plot
  (:require [libpython-clj.require :refer [require-python]]
            [libpython-clj.python :as py :refer [py. py.. py.-]]
            [gigasquid.plot :as plot]))

The plot namespace contains the macro for with-show above. py. and friends are the new and improved syntax for interop.
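As a quick orientation (my own illustrative snippet, not from the tutorial): py. calls a method on a Python object, py.- reads an attribute, and py.. chains them, much like Clojure’s .. macro for Java interop.

;; illustrative only -- assumes numpy has been loaded with require-python
(let [a (numpy/array [[1 2] [3 4]])]
  (py. a reshape [4 1])               ;; call the reshape method
  (py.- a shape)                      ;; read the shape attribute => (2, 2)
  (py.. a (reshape [4 1]) (tolist)))  ;; chain calls, like Clojure's ..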

Simple Sin and Cos

Let’s start off with simple sine and cosine functions. This code will create an x numpy vector with a range from 0 to 3 * pi in 0.1 increments, then create a y numpy vector of the sin of that, and plot it.

(let [x (numpy/arange 0 (* 3 numpy/pi) 0.1)
        y (numpy/sin x)]
    (plot/with-show
      (matplotlib.pyplot/plot x y)))

sin

Beautiful yes!

Let’s get a bit more complicated now and plot both the sine and cosine as well as add labels, title, and legend.

(let [x (numpy/arange 0 (* 3 numpy/pi) 0.1)
        y-sin (numpy/sin x)
        y-cos (numpy/cos x)]
    (plot/with-show
      (matplotlib.pyplot/plot x y-sin)
      (matplotlib.pyplot/plot x y-cos)
      (matplotlib.pyplot/xlabel "x axis label")
      (matplotlib.pyplot/ylabel "y axis label")
      (matplotlib.pyplot/title "Sine and Cosine")
      (matplotlib.pyplot/legend ["Sine" "Cosine"])))

sin and cos

We can also add subplots. Subplots are when you divide the plots into different portions. It is a bit stateful and involves making one subplot active and making changes and then making the other subplot active. Again not too hard to do with Clojure.

(let [x (numpy/arange 0 (* 3 numpy/pi) 0.1)
        y-sin (numpy/sin x)
        y-cos (numpy/cos x)]
    (plot/with-show
      ;;; set up a subplot grid that has a height of 2 and width of 1
      ;; and set the first such subplot as active
      (matplotlib.pyplot/subplot 2 1 1)
      (matplotlib.pyplot/plot x y-sin)
      (matplotlib.pyplot/title "Sine")

      ;;; set the second subplot as active and make the second plot
      (matplotlib.pyplot/subplot 2 1 2)
      (matplotlib.pyplot/plot x y-cos)
      (matplotlib.pyplot/title "Cosine")))

sin and cos subplots

Plotting with Images

Pyplot also has functions for working directly with images. Here we take a picture of my cat and create another version of it that is tinted.

(let [img (matplotlib.pyplot/imread "resources/cat.jpg")
        img-tinted (numpy/multiply img [1 0.95 0.9])]
    (plot/with-show
      (matplotlib.pyplot/subplot 1 2 1)
      (matplotlib.pyplot/imshow img)
      (matplotlib.pyplot/subplot 1 2 2)
      (matplotlib.pyplot/imshow (numpy/uint8 img-tinted))))

cat tinted

Pie charts

Finally, we can show how to do a pie chart. I asked people in a twitter thread what they wanted an example of in Python interop and one of them was a pie chart. This is for you!

The original code for this example came from this tutorial.

(let [labels ["Frogs" "Hogs" "Dogs" "Logs"]
        sizes [15 30 45 10]
        explode [0 0.1 0 0] ; only explode the 2nd slice (Hogs)
        ]
    (plot/with-show
      (let [[fig1 ax1] (matplotlib.pyplot/subplots)]
        (py. ax1 "pie" sizes :explode explode :labels labels :autopct "%1.1f%%"
                             :shadow true :startangle 90)
        (py. ax1 "axis" "equal")) ;; equal aspect ratio ensures that pie is drawn as a circle
      ))

pie chart

Onwards and Upwards!

This is just the beginning. In upcoming posts, I will be showcasing examples of interop with different libraries from the Python ecosystem. Part of the goal is to get people used to interop, but also to raise awareness of the capabilities of the Python libraries out there right now, since they have historically been outside of our ecosystem.

If you have any libraries that you would like examples of, I’m taking requests. Feel free to leave them in the comments of the blog or in the twitter thread.

Until next time, happy interoping!

PS All the code examples are here https://github.com/gigasquid/libpython-clj-examples

A new age in Clojure has dawned. We now have interop access to any python library with libpython-clj.


Let me pause a minute to repeat.


You can now interop with ANY python library.


I know. It’s overwhelming. It took a bit for me to come to grips with it too.


Let’s take an example of something that I’ve always wanted to do and have struggled mightily to find a way to do in Clojure:
I want to use the latest cutting edge GPT2 code out there to generate text.

Right now, that library is Hugging Face Transformers.


Get ready. We will wrap that sweet hugging face code in Clojure parens!

The setup

The first thing you will need to do is to have python3 installed and the two libraries that we need:


  • pytorch – sudo pip3 install torch
  • hugging face transformers – sudo pip3 install transformers

Right now, some of you may not want to proceed. You might have had a bad relationship with Python in the past. It’s ok; remember that some of us had bad relationships with Java, but still lead happy and fulfilled lives with Clojure and can still enjoy it from interop. The same is true with Python. Keep an open mind.


There might be some others that don’t want to have anything to do with Python and want to keep their Clojure pure. Well, that is a valid choice. But you are missing out on what the big, vibrant, and chaotic Python Deep Learning ecosystem has to offer.


For those of you that are still along for the ride, let’s dive in.


Your deps file should have just a single extra dependency in it:

:deps {org.clojure/clojure {:mvn/version "1.10.1"}
        cnuernber/libpython-clj {:mvn/version "1.30"}}

Diving Into Interop

The first thing that we need to do is require the libpython library.

(ns gigasquid.gpt2
  (:require [libpython-clj.require :refer [require-python]]
            [libpython-clj.python :as py]))

It has a very nice require-python syntax that we will use to load the python libraries so that we can use them in our Clojure code.

(require-python '(transformers))
(require-python '(torch))

Here we are going to follow along with the OpenAI GPT-2 tutorial and translate it into interop code. The original tutorial is here


Let’s take the python side first:

import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load pre-trained model tokenizer (vocabulary)
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

This is going to translate in our interop code to:

(def tokenizer (py/$a transformers/GPT2Tokenizer from_pretrained "gpt2"))

The py/$a function is used to call attributes on a Python object. We get the transformers/GPT2Tokenizer object that we have available to use and call from_pretrained on it with the string argument "gpt2".


Next in the Python tutorial is:

# Encode a text inputs
text = "Who was Jim Henson ? Jim Henson was a"
indexed_tokens = tokenizer.encode(text)

# Convert indexed tokens in a PyTorch tensor
tokens_tensor = torch.tensor([indexed_tokens])

This is going to translate to Clojure:

(def text "Who was Jim Henson ? Jim Henson was a")
;; encode text input
(def indexed-tokens  (py/$a tokenizer encode text))
indexed-tokens ;=>[8241, 373, 5395, 367, 19069, 5633, 5395, 367, 19069, 373, 257]

;; convert indexed tokens to pytorch tensor
(def tokens-tensor (torch/tensor [indexed-tokens]))
tokens-tensor
;; ([[ 8241,   373,  5395,   367, 19069,  5633,  5395,   367, 19069,   373,
;;    257]])

Here we are again using py/$a to call the encode method on the text. However, when we are just calling a function, we can do so directly with (torch/tensor [indexed-tokens]). We can even directly use vectors.


Again, you are doing this in the REPL, so you have full power for inspection and display of the python objects. It is a great interop experience – (cider even has doc information on the python functions in the minibuffer)!


The next part is to load the model itself. This will take a few minutes, since it has to download a big file from s3 and load it up.


In Python:

# Load pre-trained model (weights)
model = GPT2LMHeadModel.from_pretrained('gpt2')

In Clojure:

;;; Load pre-trained model (weights)
;;; Note: this will take a few minutes to download everything
(def model (py/$a transformers/GPT2LMHeadModel from_pretrained "gpt2"))

The next part is to run the model with the tokens and make the predictions.


Here the code starts to diverge a tiny bit.


Python:

# Set the model in evaluation mode to deactivate the DropOut modules
# This is IMPORTANT to have reproducible results during evaluation!
model.eval()

# If you have a GPU, put everything on cuda
tokens_tensor = tokens_tensor.to('cuda')
model.to('cuda')

# Predict all tokens
with torch.no_grad():
    outputs = model(tokens_tensor)
    predictions = outputs[0]

# get the predicted next sub-word (in our case, the word 'man')
predicted_index = torch.argmax(predictions[0, -1, :]).item()
predicted_text = tokenizer.decode(indexed_tokens + [predicted_index])
assert predicted_text == 'Who was Jim Henson? Jim Henson was a man'

And Clojure

;;; Set the model in evaluation mode to deactivate the DropOut modules
;;; This is IMPORTANT to have reproducible results during evaluation!
(py/$a model eval)


;;; Predict all tokens
(def predictions (py/with [r (torch/no_grad)]
                          (first (model tokens-tensor))))

;;; get the predicted next sub-word
(def predicted-index (let [last-word-predictions (-> predictions first last)
                           arg-max (torch/argmax last-word-predictions)]
                       (py/$a arg-max item)))

predicted-index ;=>582

(py/$a tokenizer decode (-> (into [] indexed-tokens)
                            (conj predicted-index)))

;=> "Who was Jim Henson? Jim Henson was a man"

The main difference is that we are obviously not using the Python array syntax in our code to manipulate the lists. For example, instead of using outputs[0], we are going to use (first outputs). But, other than that, it is a pretty good match, even with the py/with.

Also note that we are not making the call to configure it with GPU. This is intentionally left out to keep things simple for people to try it out. Sometimes, GPU configuration can be a bit tricky to set up depending on your system. For this example, you definitely won’t need it since it runs fast enough on CPU. If you do want to do something more complicated later, like fine tuning, you will need to invest some time to get it set up.
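For reference, the GPU lines from the Python tutorial would translate to interop calls roughly like the following (an untested sketch that assumes a working CUDA setup):

;;; sketch only -- mirrors tokens_tensor.to('cuda') and model.to('cuda') from the tutorial
(def tokens-tensor-cuda (py/$a tokens-tensor to "cuda"))
(py/$a model to "cuda")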

Doing Longer Sequences

The next example in the tutorial goes on to cover generating longer text.


Python

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained('gpt2')

generated = tokenizer.encode("The Manhattan bridge")
context = torch.tensor([generated])
past = None

for i in range(100):
    print(i)
    output, past = model(context, past=past)
    token = torch.argmax(output[0, :])

    generated += [token.tolist()]
    context = token.unsqueeze(0)

sequence = tokenizer.decode(generated)

print(sequence)

And Clojure

(def tokenizer (py/$a transformers/GPT2Tokenizer from_pretrained "gpt2"))
(def model (py/$a transformers/GPT2LMHeadModel from_pretrained "gpt2"))

(def generated (into [] (py/$a tokenizer encode "The Manhattan bridge")))
(def context (torch/tensor [generated]))


(defn generate-sequence-step [{:keys [generated-tokens context past]}]
  (let [[output past] (model context :past past)
        token (-> (torch/argmax (first output)))
        new-generated  (conj generated-tokens (py/$a token tolist))]
    {:generated-tokens new-generated
     :context (py/$a token unsqueeze 0)
     :past past
     :token token}))

(defn decode-sequence [{:keys [generated-tokens]}]
  (py/$a tokenizer decode generated-tokens))

(loop [step {:generated-tokens generated
             :context context
             :past nil}
       i 10]
  (if (pos? i)
    (recur (generate-sequence-step step) (dec i))
    (decode-sequence step)))

;=> "The Manhattan bridge\n\nThe Manhattan bridge is a major artery for"

The great thing is once we have it embedded in our code, there is no stopping. We can create a nice function:

(defn generate-text [starting-text num-of-words-to-predict]
  (let [tokens (into [] (py/$a tokenizer encode starting-text))
        context (torch/tensor [tokens])
        result (reduce
                (fn [r i]
                  (println i)
                  (generate-sequence-step r))

                {:generated-tokens tokens
                 :context context
                 :past nil}

                (range num-of-words-to-predict))]
    (decode-sequence result)))

And finally we can generate some fun text!

(generate-text "Clojure is a dynamic, general purpose programming language, combining the approachability and interactive" 20)

;=> "Clojure is a dynamic, general purpose programming language, combining the approachability and interactive. It is a language that is easy to learn and use, and is easy to use for anyone"

Clojure is a dynamic, general purpose programming language, combining the approachability and interactive. It is a language that is easy to learn and use, and is easy to use for anyone


So true GPT2! So true!

Wrap-up

libpython-clj is a really powerful tool that will allow Clojurists to better explore, leverage, and integrate Python libraries into their code.


I’ve been really impressed with it so far and I encourage you to check it out.


There is a repo with the examples out there if you want to check them out. There is also an example of doing MXNet MNIST classification there as well.

clojure.spec allows you to write specifications for data and use them for validation. It also provides a generative aspect that allows for robust testing as well as an additional way to understand your data through manual inspection. The dual nature of validation and generation is a natural fit for deep learning models that consist of paired discriminator/generator models.


TLDR: In this post we show that you can leverage the dual nature of clojure.spec’s validator/generator to incorporate a deep learning model’s classifier/generator.


A common use of clojure.spec is at the boundaries to validate that incoming data is indeed in the expected form. Again, this boundary is a fitting place to integrate deep learning models with our traditional software code.

Before we get into the deep learning side of things, let’s take a quick refresher on how to use clojure.spec.

quick view of clojure.spec
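The snippets that follow assume the usual spec aliases are required, something like this (the namespace name here is just for illustration):

(ns gigasquid.mnist-spec ;; illustrative namespace name
  (:require [clojure.spec.alpha :as s]
            [clojure.spec.gen.alpha :as gen]))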

To create a simple spec for keywords that are cat sounds, we can use s/def.

(s/def ::cat-sounds #{:meow :purr :hiss})

To do the validation, you can use the s/valid? function.

(s/valid? ::cat-sounds :meow) ;=> true
(s/valid? ::cat-sounds :bark) ;=> false

For the generation side of things, we can turn the spec into a generator and sample it.

(gen/sample (s/gen ::cat-sounds))
;=>(:hiss :hiss :hiss :meow :meow :purr :hiss :meow :meow :meow)

There is the ability to compose specs by adding them together with s/and.

(s/def ::even-number (s/and int? even?))
(gen/sample (s/gen ::even-number))
;=> (0 0 -2 2 0 10 -4 8 6 8)

We can also control the generation by creating a custom generator using s/with-gen. In the following the spec is only that the data be a general string, but using the custom generator, we can restrict the output to only be a certain set of example cat names.

(s/def ::cat-name
  (s/with-gen
    string?
    #(s/gen #{"Suki" "Bill" "Patches" "Sunshine"})))

(s/valid? ::cat-name "Peaches") ;=> true
(gen/sample (s/gen ::cat-name))
;; ("Patches" "Sunshine" "Sunshine" "Suki" "Suki" "Sunshine"
;;  "Suki" "Patches" "Sunshine" "Suki")

For further information on clojure.spec, I whole-heartedly recommend the spec Guide. But, now with a basic overview of spec, we can move on to creating specs for our Deep Learning models.

Creating specs for Deep Learning Models

In previous posts, we covered making simple autoencoders for handwritten digits.

handwritten digits

Then, we made models that would:

  • Take an image of a digit and give you back the string value (ex: “2”) – post
  • Take a string number value and give you back a digit image. – post

We will use both of the models to make a spec with a custom generator.


Note: For the sake of simplicity, some of the supporting code is left out. But if you want to see the whole code, it is on github.


With the help of the trained discriminator model, we can make a function that takes in an image and returns the number string value.

(defn discriminate [image]
  (-> (m/forward discriminator-model {:data [image]})
      (m/outputs)
      (ffirst)
      (ndarray/argmax-channel)
      (ndarray/->vec)
      (first)
      (int)))

Let’s test it out with a test-image:

test-discriminator-image

(discriminate my-test-image) ;=> 6

Likewise, with the trained generator model, we can make a function that takes a string number and returns the corresponding image.

(defn generate [label]
  (-> (m/forward generator-model {:data [(ndarray/array [label] [batch-size])]})
      (m/outputs)
      (ffirst)))

Giving it a test drive as well:

(def generated-test-image (generate 3))
(viz/im-sav {:title "generated-image"
             :output-path "results/"
           :x (ndarray/reshape generated-test-image [batch-size 1 28 28])})

generated-test-image

Great! Let’s go ahead and start writing specs. First let’s make a quick spec to describe an MNIST number – which is a single digit between 0 and 9.

(s/def ::mnist-number (s/and int? #(<= 0 % 9)))
(s/valid? ::mnist-number 3) ;=> true
(s/valid? ::mnist-number 11) ;=> false
(gen/sample (s/gen ::mnist-number))
;=> (0 1 0 3 5 3 7 5 0 1)

We now have both parts to validate and generate and can create a spec for it.

(s/def ::mnist-image
    (s/with-gen
      #(s/valid? ::mnist-number (discriminate %))
      #(gen/fmap (fn [n]
                   (do (ndarray/copy (generate n))))
                 (s/gen ::mnist-number))))

The ::mnist-number spec is used for the validation after the discriminate model is used. On the generator side, we use the generator for the ::mnist-number spec and feed that into the deep learning generator model to get sample images.

We have a test function that will help us test out this new spec, called test-model-spec. It will return a map with the following form:

{:spec name-of-the-spec
 :valid? whether or not the `s/valid?` called on the test value is true or not
 :sample-values This calls the discriminator model on the generated values
 }

It will also write an image of all the sample images to a file named sample-<spec-name> (e.g. sample-mnist-image).
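The helper itself isn’t shown here, but a minimal sketch of it (assuming the discriminate function above, and leaving the image-writing part out) could look something like:

;; minimal sketch of test-model-spec -- the full supporting code is in the repo
(defn test-model-spec [spec-key test-value]
  (let [samples (gen/sample (s/gen spec-key) 10)]
    ;; (writing the sampled images out to "sample-<spec-name>" is elided here)
    {:spec (name spec-key)
     :valid? (s/valid? spec-key test-value)
     :sample-values (mapv discriminate samples)}))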

Let’s try it on our test image:

test-discriminator-image

(s/valid? ::mnist-image my-test-image) ;=> true


(test-model-spec ::mnist-image my-test-image)
;; {:spec "mnist-image"
;;  :valid? true
;;  :sample-values [0 0 0 1 3 1 0 2 7 3]}

sample-mnist-image

Pretty cool!

Let’s do some more specs. But first, our spec is going to be a bit repetitive, so we’ll make a quick macro to make things easier.

(defmacro def-model-spec [spec-key spec discriminate-fn generate-fn]
    `(s/def ~spec-key
       (s/with-gen
         #(s/valid? ~spec (~discriminate-fn %))
         #(gen/fmap (fn [n#]
                      (do (ndarray/copy (~generate-fn n#))))
                    (s/gen ~spec)))))

More Specs – More Fun

This time let’s define an even mnist image spec

 (def-model-spec ::even-mnist-image
    (s/and ::mnist-number even?)
    discriminate
    generate)

  (test-model-spec ::even-mnist-image my-test-image)

  ;; {:spec "even-mnist-image"
  ;;  :valid? true
  ;;  :sample-values [0 0 2 0 8 2 2 2 0 0]}

sample-even-mnist-image

And Odds

  (def-model-spec ::odd-mnist-image
    (s/and ::mnist-number odd?)
    discriminate
    generate)

  (test-model-spec ::odd-mnist-image my-test-image)

  ;; {:spec "odd-mnist-image"
  ;;  :valid? false
  ;;  :sample-values [5 1 5 1 3 3 3 1 1 1]}

sample-odd-mnist-image

Finally, let’s do Odds that are over 2!

  (def-model-spec ::odd-over-2-mnist-image
    (s/and ::mnist-number odd? #(> % 2))
    discriminate
    generate)

  (test-model-spec ::odd-over-2-mnist-image my-test-image)

  ;; {:spec "odd-over-2-mnist-image"
  ;;  :valid? false
  ;;  :sample-values [3 3 3 5 3 5 7 7 7 3]}

sample-odd-over-2-mnist-image

Conclusion

We have shown some of the potential of integrating deep learning models with Clojure. clojure.spec is a powerful tool and it can be leveraged in new and interesting ways for both deep learning and AI more generally.

I hope that more people are intrigued to experiment and take a further look into what we can do in this area.

SIMULACRA by Karina Smigla-Bobinski

In the first post of this series, we took a look at a simple autoencoder. It took an image and transformed it back to an image. Then, we focused in on the discriminator portion of the model, where we took an image and transformed it to a label. Now, we focus in on the generator portion of the model to do the inverse operation: we transform a label to an image. In recap:

  • Autoencoder: image –> image
  • Discriminator: image –> label
  • Generator: label –> image (This is what we are doing now!)

generator

Still Need Data of Course

Nothing changes here. We are still using the MNIST handwritten digit set and have an input and output to our model.

(def
  train-data
  (mx-io/mnist-iter {:image (str data-dir "train-images-idx3-ubyte")
                     :label (str data-dir "train-labels-idx1-ubyte")
                     :input-shape [784]
                     :flat true
                     :batch-size batch-size
                     :shuffle true}))

(def
  test-data (mx-io/mnist-iter
             {:image (str data-dir "t10k-images-idx3-ubyte")
              :label (str data-dir "t10k-labels-idx1-ubyte")
              :input-shape [784]
              :batch-size batch-size
              :flat true
              :shuffle true}))

(def input (sym/variable "input"))
(def output (sym/variable "input_"))

The Generator Model

The model does change to one hot encode the label for the number. Other than that, it’s pretty much the exact same second half of the autoencoder model.

(defn get-symbol []
  (as-> input data
    (sym/one-hot "onehot" {:indices data :depth 10})
    ;; decode
    (sym/fully-connected "decode1" {:data data :num-hidden 50})
    (sym/activation "sigmoid3" {:data data :act-type "sigmoid"})

    ;; decode
    (sym/fully-connected "decode2" {:data data :num-hidden 100})
    (sym/activation "sigmoid4" {:data data :act-type "sigmoid"})

    ;;output
    (sym/fully-connected "result" {:data data :num-hidden 784})
    (sym/activation "sigmoid5" {:data data :act-type "sigmoid"})

    (sym/linear-regression-output {:data data :label output})))

(def data-desc
  (first
   (mx-io/provide-data-desc train-data)))
(def label-desc
  (first
   (mx-io/provide-label-desc train-data)))

When binding the shapes to the model, we now need to specify that the input data shape is the label instead of the image, and that the output of the model is going to be the image.

(def
  model
  ;;; change data shapes to label shapes
  (-> (m/module (get-symbol) {:data-names ["input"] :label-names ["input_"]})
      (m/bind {:data-shapes [(assoc label-desc :name "input")]
               :label-shapes [(assoc data-desc :name "input_")]})
      (m/init-params {:initializer  (initializer/uniform 1)})
      (m/init-optimizer {:optimizer (optimizer/adam {:learning-rate 0.001})})))

(def my-metric (eval-metric/mse))

Training

The training of the model is pretty straightforward. Just be mindful that we are using the batch-label, (number label), as the input and validating with the batch-data, (image).

(defn train [num-epochs]
  (doseq [epoch-num (range 0 num-epochs)]
    (println "starting epoch " epoch-num)
    (mx-io/do-batches
     train-data
     (fn [batch]
       ;;; change input to be the label
       (-> model
           (m/forward {:data (mx-io/batch-label batch)
                       :label (mx-io/batch-data batch)})
           (m/update-metric my-metric (mx-io/batch-data batch))
           (m/backward)
           (m/update))))
    (println "result for epoch " epoch-num " is "
             (eval-metric/get-and-reset my-metric))))

Results Before Training

(def my-test-batch (mx-io/next test-data))
  ;;; change to input labels
  (def test-labels (mx-io/batch-label my-test-batch))
  (def preds (m/predict-batch model {:data test-labels} ))
  (viz/im-sav {:title "before-training-preds"
               :output-path "results/"
               :x (ndarray/reshape (first preds) [100 1 28 28])})

  (->> test-labels first ndarray/->vec (take 10))
  ;=> (6.0 1.0 0.0 0.0 3.0 1.0 4.0 8.0 0.0 9.0)

before training

Not very impressive… Let’s train

(train 3)

starting epoch  0
result for epoch  0  is
[mse 0.0723091]
starting epoch  1
result for epoch  1  is  [mse 0.053891845]
starting epoch  2
result for epoch  2  is  [mse 0.05337505]

Results After Training

 (def my-test-batch (mx-io/next test-data))
  (def test-labels (mx-io/batch-label my-test-batch))
  (def preds (m/predict-batch model {:data test-labels}))
  (viz/im-sav {:title "after-training-preds"
               :output-path "results/"
               :x (ndarray/reshape (first preds) [100 1 28 28])})
  (->> test-labels first ndarray/->vec (take 10))

  ;=>   (9.0 5.0 7.0 1.0 8.0 6.0 6.0 0.0 8.0 1.0)

after training

Cool! The first row is indeed

(9.0 5.0 7.0 1.0 8.0 6.0 6.0 0.0 8.0 1.0)

Save Your Model

Don’t forget to save the generator model off – we are going to use it next time.

(m/save-checkpoint model {:prefix "model/generator" :epoch 2})

Happy Deep Learning until next time …

sunflowers

In the last post, we took a look at a simple autoencoder. The autoencoder is a deep learning model that takes in an image and, (through an encoder and decoder), works to produce the same image. In short:

  • Autoencoder: image –> image

For a discriminator, we are going to focus on only the first half of the autoencoder.

discriminator

Why only half? We want a different transformation. We are going to want to take an image as input and then do some discrimination of the image and classify what type of image it is. In our case, the model is going to input an image of a handwritten digit and attempt to decide which number it is.

  • Discriminator: image –> label

As always with deep learning, to do anything we need data.

MNIST Data

Nothing changes here from the autoencoder code. We are still using the MNIST dataset for handwritten digits.

;;; Load the MNIST datasets
(def train-data
  (mx-io/mnist-iter
   {:image (str data-dir "train-images-idx3-ubyte")
    :label (str data-dir "train-labels-idx1-ubyte")
    :input-shape [784]
    :flat true
    :batch-size batch-size
    :shuffle true}))

(def test-data
  (mx-io/mnist-iter
   {:image (str data-dir "t10k-images-idx3-ubyte")
    :label (str data-dir "t10k-labels-idx1-ubyte")
    :input-shape [784]
    :batch-size batch-size
    :flat true
    :shuffle true}))

The model will change since we want a different output.

The Model

We are still taking in the image as input, and using the same encoder layers from the autoencoder model. However, at the end, we use a fully connected layer that has 10 hidden nodes – one for each label of the digits 0-9. Then we use a softmax for the classification output.

(def input (sym/variable "input"))
(def output (sym/variable "input_"))

(defn get-symbol []
  (as-> input data
    ;; encode
    (sym/fully-connected "encode1" {:data data :num-hidden 100})
    (sym/activation "sigmoid1" {:data data :act-type "sigmoid"})

    ;; encode
    (sym/fully-connected "encode2" {:data data :num-hidden 50})
    (sym/activation "sigmoid2" {:data data :act-type "sigmoid"})

    ;;; this last bit changed from autoencoder
    ;;output
    (sym/fully-connected "result" {:data data :num-hidden 10})
    (sym/softmax-output {:data data :label output})))

In the autoencoder, we were never actually using the label, but we will certainly need to use it this time. It is reflected in the model’s bindings with the data and label shapes.
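The data-desc and label-desc shape descriptions used below come from the training iterator, the same way as in the other posts in this series:

(def data-desc (first (mx-io/provide-data-desc train-data)))
(def label-desc (first (mx-io/provide-label-desc train-data)))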

(def model (-> (m/module (get-symbol) {:data-names ["input"] :label-names ["input_"]})
               (m/bind {:data-shapes [(assoc data-desc :name "input")]
                        :label-shapes [(assoc label-desc :name "input_")]})
               (m/init-params {:initializer (initializer/uniform 1)})
               (m/init-optimizer {:optimizer (optimizer/adam {:learning-rate 0.001})})))

For the evaluation metric, we are also going to use an accuracy metric instead of a mean squared error (mse) metric.

(def my-metric (eval-metric/accuracy))

With these items in place, we are ready to train the model.

Training

The training from the autoencoder needs to change to use the real label for the forward pass and for updating the metric.

(defn train [num-epochs]
  (doseq [epoch-num (range 0 num-epochs)]
    (println "starting epoch " epoch-num)
    (mx-io/do-batches
     train-data
     (fn [batch]
       ;;; here we make sure to use the label
       ;;; now for forward and update-metric
       (-> model
           (m/forward {:data (mx-io/batch-data batch)
                       :label (mx-io/batch-label batch)})
           (m/update-metric my-metric (mx-io/batch-label batch))
           (m/backward)
           (m/update))))
    (println {:epoch epoch-num
              :metric (eval-metric/get-and-reset my-metric)})))

Let’s Run Things

It’s always a good idea to take a look at things before you start training.

The first batch of the training data looks like:

  (def my-batch (mx-io/next train-data))
  (def images (mx-io/batch-data my-batch))
  (viz/im-sav {:title "originals"
               :output-path "results/"
               :x (-> images
                      first
                      (ndarray/reshape [100 1 28 28]))})

training-batch

Before training, if we take the first batch from the test data and predict what the labels are:

  (def my-test-batch (mx-io/next test-data))
  (def test-images (mx-io/batch-data my-test-batch))
  (viz/im-sav {:title "test-images"
               :output-path "results/"
               :x (-> test-images
                      first
                      (ndarray/reshape [100 1 28 28]))})

test-batch

  (def preds (m/predict-batch model {:data test-images} ))
  (->> preds
       first
       (ndarray/argmax-channel)
       (ndarray/->vec)
       (take 10))
 ;=> (1.0 8.0 8.0 8.0 8.0 8.0 2.0 8.0 8.0 1.0)

Yeah, not even close. The real first line of the images is 6 1 0 0 3 1 4 8 0 9

Let’s Train!

  (train 3)

;; starting epoch  0
;; {:epoch 0, :metric [accuracy 0.83295]}
;; starting epoch  1
;; {:epoch 1, :metric [accuracy 0.9371333]}
;; starting epoch  2
;; {:epoch 2, :metric [accuracy 0.9547667]}

After the training, let’s have another look at the predicted labels.

  (def preds (m/predict-batch model {:data test-images} ))
  (->> preds
       first
       (ndarray/argmax-channel)
       (ndarray/->vec)
       (take 10))
 ;=> (6.0 1.0 0.0 0.0 3.0 1.0 4.0 8.0 0.0 9.0)
  • Predicted = (6.0 1.0 0.0 0.0 3.0 1.0 4.0 8.0 0.0 9.0)
  • Actual = 6 1 0 0 3 1 4 8 0 9

Rock on!

Closing

In this post, we focused on the first half of the autoencoder and made a discriminator model that took in an image and gave us a label.

Don’t forget to save the trained model for later; we’ll be using it.

  (m/save-checkpoint model {:prefix "model/discriminator"
                            :epoch 2})

Until then, here is a picture of Otto the cat in a basket to keep you going.

Otto in basket

P.S. If you want to run all the code for yourself, it is here

Perfect mirror

If you look long enough into the autoencoder, it looks back at you.

The Autoencoder is a fun deep learning model to look into. Its goal is simple: given an input image, we would like to have the same output image.

It’s sort of an identity function for deep learning models, but it is composed of two parts: an encoder and a decoder, with the encoder translating the images to a latent space representation and the decoder translating that back to regular images that we can view.

We are going to make a simple autoencoder with Clojure MXNet for handwritten digits using the MNIST dataset.

The Dataset

We first load up the training data into an iterator that will allow us to cycle through all the images.

(def train-data (mx-io/mnist-iter {:image (str data-dir "train-images-idx3-ubyte")
                                   :label (str data-dir "train-labels-idx1-ubyte")
                                   :input-shape [784]
                                   :flat true
                                   :batch-size batch-size
                                   :shuffle true}))

Notice that the input shape is 784. We are purposely flattening out our 28x28 image of a number to be a one-dimensional flat array. The reason is so that we can use a simpler model for the autoencoder.

We also load up the corresponding test data.

(def test-data (mx-io/mnist-iter {:image (str data-dir "t10k-images-idx3-ubyte")
                                  :label (str data-dir "t10k-labels-idx1-ubyte")
                                  :input-shape [784]
                                  :batch-size batch-size
                                  :flat true
                                  :shuffle true}))

When we are working with deep learning models we keep the training and the test data separate. When we train the model, we won’t use the test data. That way we can evaluate it later on the unseen test data.

The Model

Now we need to define the layers of the model. We know we are going to have an input and an output. The input will be the array that represents the image of the digit and the output will also be an array which is a reconstruction of that image.

(def input (sym/variable "input"))
(def output (sym/variable "input_"))

(defn get-symbol []
  (as-> input data
    ;; encode
    (sym/fully-connected "encode1" {:data data :num-hidden 100})
    (sym/activation "sigmoid1" {:data data :act-type "sigmoid"})

    ;; encode
    (sym/fully-connected "encode2" {:data data :num-hidden 50})
    (sym/activation "sigmoid2" {:data data :act-type "sigmoid"})

    ;; decode
    (sym/fully-connected "decode1" {:data data :num-hidden 50})
    (sym/activation "sigmoid3" {:data data :act-type "sigmoid"})

    ;; decode
    (sym/fully-connected "decode2" {:data data :num-hidden 100})
    (sym/activation "sigmoid4" {:data data :act-type "sigmoid"})

    ;;output
    (sym/fully-connected "result" {:data data :num-hidden 784})
    (sym/activation "sigmoid5" {:data data :act-type "sigmoid"})

    (sym/linear-regression-output {:data data :label output})))

From the model above we can see the input (image) being passed through the simple encoder layers down to its latent representation, and then boosted back up through the decoder into an output (image). It goes through the pleasingly symmetric transformation of:

784 (image) –> 100 –> 50 –> 50 –> 100 –> 784 (output)

We can now construct the full model with the module API from clojure-mxnet.

(def data-desc (first (mx-io/provide-data-desc train-data)))

(def model (-> (m/module (get-symbol) {:data-names ["input"] :label-names ["input_"]})
               (m/bind {:data-shapes [(assoc data-desc :name "input")]
                        :label-shapes [(assoc data-desc :name "input_")]})
               (m/init-params {:initializer  (initializer/uniform 1)})
               (m/init-optimizer {:optimizer (optimizer/adam {:learning-rate 0.001})})))

Notice that when we are binding the data-shapes and label-shapes we are using only the data from our handwritten digit dataset, (the images), and not the labels. This will ensure that as it trains it will seek to recreate the input image for the output image.

Before Training

Before we start our training, let’s get a baseline of what the original images look like and what the output of the untrained model is.

To look at the original images we can take the first training batch of 100 images and visualize them. Since we are initially using the flattened [784] image representation, we need to reshape it to the 28x28 images that we can recognize.

(def my-batch (mx-io/next train-data))
(def images (mx-io/batch-data my-batch))
(ndarray/shape (ndarray/reshape (first images) [100 1 28 28]))
(viz/im-sav {:title "originals" :output-path "results/" :x (ndarray/reshape (first images) [100 1 28 28])})

originals

We can also do the same visualization with the test batch of data images by putting them into the predict-batch and using our model.

;;; before training
 (def my-test-batch (mx-io/next test-data))
 (def test-images (mx-io/batch-data my-test-batch))
 (def preds (m/predict-batch model {:data test-images} ))
 (viz/im-sav {:title "before-training-preds" :output-path "results/" :x (ndarray/reshape (first preds) [100 1 28 28])})

before-training-preds

They are not anything close to recognizable as numbers.

Training

The next step is to train the model on the data. We set up a training function to step through all the batches of data.

(def my-metric (eval-metric/mse))

(defn train [num-epochs]
  (doseq [epoch-num (range 0 num-epochs)]
    (println "starting epoch " epoch-num)
    (mx-io/do-batches
     train-data
     (fn [batch]
       (-> model
           (m/forward {:data (mx-io/batch-data batch) :label (mx-io/batch-data batch)})
           (m/update-metric my-metric (mx-io/batch-data batch))
           (m/backward)
           (m/update))))
    (println "result for epoch " epoch-num " is " (eval-metric/get-and-reset my-metric))))

For each batch of 100 images it is doing the following:

  • Run the forward pass of the model with both the data and label being the image
  • Update the accuracy of the model with the mse (mean squared error metric)
  • Do the backward computation
  • Update the model according to the optimizer and the forward/backward computation.

Let’s train it for 3 epochs.
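The call that produces the output below is just the train function we defined above:

(train 3)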

starting epoch  0
result for epoch  0  is  [mse 0.06460866]
starting epoch  1
result for epoch  1  is  [mse 0.033874355]
starting epoch  2
result for epoch  2  is  [mse 0.027255038]

After training

We can check the test images again and see if they look better.

;;; after training
(def my-test-batch (mx-io/next test-data))
(def test-images (mx-io/batch-data my-test-batch))
(def preds (m/predict-batch model {:data test-images} ))
(viz/im-sav {:title "after-training-preds" :output-path "results/" :x (ndarray/reshape (first preds) [100 1 28 28])})

after-training-preds

Much improved! They definitely look like numbers.

Wrap up

We’ve made a simple autoencoder that can take images of digits and compress them down to a latent space representation that can later be decoded into the same image.

If you want to check out the full code for this example, you can find it here.

Stay tuned. We’ll take this example and build on it in future posts.

Spring is bringing some beautiful new things to Clojure MXNet. Here are some highlights for the month of April.

Shipped

We’ve merged 10 PRs over the last month. Many of them focus on core improvements to documentation and usability which is very important.

The MXNet project is also preparing a new release 1.4.1, so keep on the lookout for that to hit in the near future.

Clojure MXNet Made Simple Article Series

Arthur Caillau added another post to his fantastic series – MXNet made simple: Pretrained Models for image classification – Inception and VGG

Cool Stuff in Development

New APIs

Great progress was made on the new version of the API for the Clojure NDArray and Symbol APIs by Kedar Bellare. We now have an experimental new version of the APIs that are generated more directly from the C code so that we can have more control over the output.

For example, the new version of the generated API for NDArray looks like:

(defn
 activation
 "Applies an activation function element-wise to the input.
  
  The following activation functions are supported:
  
  - `relu`: Rectified Linear Unit, :math:`y = max(x, 0)`
  - `sigmoid`: :math:`y = \\frac{1}{1 + exp(-x)}`
  - `tanh`: Hyperbolic tangent, :math:`y = \\frac{exp(x) - exp(-x)}{exp(x) + exp(-x)}`
  - `softrelu`: Soft ReLU, or SoftPlus, :math:`y = log(1 + exp(x))`
  - `softsign`: :math:`y = \\frac{x}{1 + abs(x)}`
  
  
  
  Defined in src/operator/nn/activation.cc:L167
  
  `data`: The input array.
  `act-type`: Activation function to be applied.
  `out`: Output array. (optional)"
 ([data act-type] (activation {:data data, :act-type act-type}))
 ([{:keys [data act-type out], :or {out nil}, :as opts}]
  (util/coerce-return
   (NDArrayAPI/Activation data act-type (util/->option out)))))

as opposed to:

(defn
 activation
 ([& nd-array-and-params]
  (util/coerce-return
   (NDArray/Activation
    (util/coerce-param
     nd-array-and-params
     #{"scala.collection.Seq"})))))

So much nicer!!!

BERT (State of the Art for NLP)

We also have some really exciting examples for BERT in a PR that will be merged soon. If you are not familiar with BERT, this blog post is a good overview. Basically, it’s the state of the art in NLP right now. With the help of exported models from GluonNLP, we can do both inference and fine tuning of BERT models in MXNet with Clojure! This is an excellent example of cross fertilization across the GluonNLP, Scala, and Clojure MXNet projects.

There are two examples.

1) BERT question and answer inference based off of a fine tuned model of the SQuAD Dataset in GluonNLP which is then exported. It allows one to actually do some natural language question and answering like:

Question Answer Data
{:input-answer
 "Rich Hickey is the creator of the Clojure language. Before Clojure, he developed dotLisp, a similar project based on the .NET platform, and three earlier attempts to provide interoperability between Lisp and Java: a Java foreign language interface for Common Lisp, A Foreign Object Interface for Lisp, and a Lisp-friendly interface to Java Servlets.",
 :input-question "Who created Clojure?",
 :ground-truth-answers ["rich" "hickey"]}

  Predicted Answer:  [rich hickey]

2) The second example is using the exported BERT base model and then fine tuning it in Clojure to do a task with sentence pair classification to see if two sentences are equivalent or not.

The nice thing about this is that we were able to convert the existing tutorial in GluonNLP over to a Clojure Jupyter notebook with the lein-jupyter plugin. I didn’t realize that there is a nifty save-as command in Jupyter that can generate a markdown file, which makes for very handy documentation. Take a peek at the tutorial here. It might make its way into a blog post on its own in the next week or two.

Upcoming Events

  • I’ll be speaking about Clojure MXNet at the next Scicloj Event on May 15th at 10PM UTC. Please join us and get involved in making Clojure a great place for Data Science.

  • I’m also really excited to attend ICLR in a couple weeks. It is a huge conference that I’m sure will melt my mind with the latest research in Deep Learning. If anyone else is planning to attend, please say hi :)

Get Involved

As always, we welcome involvement in the true Apache tradition. If you have questions or want to say hi, head on over to the closest #mxnet room on your preferred server. We are on the Clojurians slack and Zulip.

Cat Picture of the Month

To close out, let’s take a lesson from my cats Otto and Pi and don’t forget the importance of naps.

Have a great rest of April!

I’m starting a monthly update for Clojure MXNet. The goal is to share the progress and exciting things that are happening in the project and our community.

Here are some highlights for the month of March.

Shipped

Under the shipped heading, the 1.4.0 release of MXNet has been released, along with the Clojure MXNet Jars. There have been improvements to the JVM memory management and an Image API addition. You can see the full list of changes here

Clojure MXNet Made Simple Article Series

Arthur Caillau authored a really nice series of blog posts to help get people started with Clojure MXNet.

Lein Template & Docker file

Nicolas Modrzyk created a Leiningen template that allows you to easily get an MXNet project started – with a notebook too! It’s a great way to take Clojure MXNet for a spin.

# create project
lein new clj-mxnet hello

# run included sample
lein run

# start notebook engine
lein notebook

# open notebook
http://0.0.0.0:10000/worksheet.html?filename=notes/practice.clj
# open empty notebook with all namespaces
http://0.0.0.0:10000/worksheet.html?filename=notes/empty.clj

There is a docker file as well.

docker run -it -p 10000:10000 hellonico/mxnet

After starting the container, you can open the same notebooks as above:

# open notebook
http://0.0.0.0:10000/worksheet.html?filename=notes/practice.clj
# open empty notebook with all namespaces
http://0.0.0.0:10000/worksheet.html?filename=notes/empty.clj

Cool Stuff in Development

There are a few really interesting things cooking for the future.

One is a PR for memory fixes from the Scala team that is getting really close to merging. This will be a solution to some of the memory problems that were encountered by early adopters of the Module API.

Another is the new version of the API for the Clojure NDArray and Symbol APIs that is being spearheaded by Kedar Bellare.

Finally, work is being started to create a Gluon API for the Clojure package which is quite exciting.

Get Involved

As always, we welcome involvement in the true Apache tradition. If you have questions or want to say hi, head on over to the closest #mxnet room on your preferred server. We are on the Clojurians slack and Zulip.

Cat Picture of the Month

There is no better way to close out an update than a cat picture, so here is a picture of my family cat, Otto, watching birds at the window.

Have a great rest of March!

Object detection just landed in MXNet thanks to the work of contributors Kedar Bellare and Nicolas Modrzyk. Kedar ported over the infer package to Clojure, making inference and prediction much easier for users, and Nicolas integrated his Origami OpenCV library into the examples to make the visualizations happen.

We’ll walk through the main steps to use infer for object detection, which include creating the detector with a model, then loading the image and running the inference on it.

Creating the Detector

To create the detector you need to define a couple of things:

  • How big is your image?
  • What model are you going to be using for object detection?

In the code below, we are going to be giving it a color image of size 512 x 512.

(defn create-detector []
  (let [descriptors [{:name "data"
                      :shape [1 3 512 512]
                      :layout layout/NCHW
                      :dtype dtype/FLOAT32}]
        factory (infer/model-factory model-path-prefix descriptors)]
    (infer/create-object-detector factory)))
  • The shape is going to be [1 3 512 512].
    • The 1 is for the batch size which in our case is a single image.
    • The 3 is for the channels in the image which for an RGB image is 3
    • The 512 is for the image height and width.
  • The layout specifies that the shape given is in terms of NCHW which is batch size, channel size, height, and width.
  • The dtype is the image data type which will be the standard FLOAT32
  • The model-path-prefix points to the place where the trained model we are using for object detection lives.

The model we are going to use is the Single Shot Multiple Box Object Detector (SSD). You can download the model yourself using this script.

How to Load an Image and Run the Detector

Now that we have a model and a detector, we can load an image up and run the object detection.

To load the image, use load-image-from-file, which will load the image from the path.

(infer/load-image-from-file input-image)

Then run the detection using infer/detect-objects which will give you the top five predictions by default.

(infer/detect-objects detector image)

It will give an output something like this:

[[{:class "person",
   :prob 0.9657765,
   :x-min 0.021868259,
   :y-min 0.049295247,
   :x-max 0.9975169,
   :y-max 0.9734151}
  {:class "dog",
   :prob 0.17513266,
   :x-min 0.16772352,
   :y-min 0.45792937,
   :x-max 0.55409217,
   :y-max 0.72507095}
   ...
]]

which you can then use to draw bounding boxes on the image.
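The coordinates come back as fractions of the image width and height, so to draw the boxes you first scale them to pixel values. Here is a small sketch in plain Clojure (independent of any particular drawing library, and assuming the 512 x 512 image size from above):

;; convert the fractional box coordinates from detect-objects into pixel coordinates
(defn prediction->pixel-box [{:keys [x-min y-min x-max y-max] :as pred} img-width img-height]
  (assoc pred
         :x-min (int (* x-min img-width))
         :y-min (int (* y-min img-height))
         :x-max (int (* x-max img-width))
         :y-max (int (* y-max img-height))))

;; (prediction->pixel-box {:class "person" :prob 0.96
;;                         :x-min 0.02 :y-min 0.05 :x-max 0.99 :y-max 0.97} 512 512)
;; => {:class "person", :prob 0.96, :x-min 10, :y-min 25, :x-max 506, :y-max 496}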

Try Running the Example

One of the best ways to explore using it is with the object detection example in the MXNet repo. It will be coming out officially in the 1.5.0 release, but you can get an early peek at it by building the project and running the example with the nightly snapshot.

You can do this by cloning the MXNet Repo and changing directory to contrib/clojure-package.

Next, edit the project.clj to look like this:

(defproject org.apache.mxnet.contrib.clojure/clojure-mxnet "1.5.0-SNAPSHOT"
  :description "Clojure package for MXNet"
  :url "https://github.com/apache/incubator-mxnet"
  :license {:name "Apache License"
            :url "http://www.apache.org/licenses/LICENSE-2.0"}
  :dependencies [[org.clojure/clojure "1.9.0"]
                 [t6/from-scala "0.3.0"]

                 ;; To use with nightly snapshot
                 ;[org.apache.mxnet/mxnet-full_2.11-osx-x86_64-cpu "<insert-snapshot-version>"]
                 ;[org.apache.mxnet/mxnet-full_2.11-linux-x86_64-cpu "<insert-snapshot-version>"]
                 ;[org.apache.mxnet/mxnet-full_2.11-linux-x86_64-gpu "<insert-snapshot-version"]

                 [org.apache.mxnet/mxnet-full_2.11-osx-x86_64-cpu "1.5.0-SNAPSHOT"]

                 ;;; CI
                 #_[org.apache.mxnet/mxnet-full_2.11 "INTERNAL"]

                 [org.clojure/tools.logging "0.4.0"]
                 [org.apache.logging.log4j/log4j-core "2.8.1"]
                 [org.apache.logging.log4j/log4j-api "2.8.1"]
                 [org.slf4j/slf4j-log4j12 "1.7.25" :exclusions [org.slf4j/slf4j-api]]]
  :pedantic? :skip
  :plugins [[lein-codox "0.10.3" :exclusions [org.clojure/clojure]]
            [lein-cloverage "1.0.10" :exclusions [org.clojure/clojure]]
            [lein-cljfmt "0.5.7"]]
  :codox {:namespaces [#"^org\.apache\.clojure-mxnet\.(?!gen).*"]}
  :aot [dev.generator]
  :repositories [["staging" {:url "https://repository.apache.org/content/repositories/staging"
                             :snapshots true
                             :update :always}]
                 ["snapshots" {:url "https://repository.apache.org/content/repositories/snapshots"
                               :snapshots true
                               :update :always}]])

If you are running on linux, you should change the mxnet-full_2.11-osx-x86_64-cpu to mxnet-full_2.11-linux-x86_64-cpu.

Next, go ahead and do lein test to make sure that everything builds ok. If you run into any trouble, please refer to the README for any missing dependencies.

After that do a lein install to install the clojure-mxnet jar to your local maven. Now you are ready to cd examples/infer/object-detection to try it out. Refer to the README for more details.

If you run into any problems getting started, feel free to reach out in the Clojurian #mxnet slack room or open an issue at the MXNet project. We are a friendly group and happy to help out.

Thanks again to the community for the contributions to make this possible. It’s great seeing new things coming to life.

Happy Object Detecting!

It’s holiday time and that means parties and getting together with friends. Bringing a baked good or dessert to a gathering is a time-honored tradition. But what if this year, you could take it to the next level? Everyone brings actual food. But with the help of Deep Learning, you can bring something completely different – you can bring an image of a baked good! I’m not talking about just any old image that someone captured with a camera or created with a pen and paper. I’m talking about the computer itself creating. This image would be never before seen, totally unique, and crafted by the creative process of the machine.

That is exactly what we are going to do. We are going to create a flan.

Photo by Lucia Sanchez on Flickr

If you’ve never had a flan before, it’s a yummy dessert made of a baked custard with caramel sauce on it.

“Why a flan?”, you may ask. There are quite a few reasons:

  • It’s tasty in real life.
  • Flan rhymes with GAN, (unless you pronounce it “Gaaahn”).
  • Why not?

Onto the recipe. How are we actually going to make this work? We need some ingredients:

  • Clojure – the most advanced programming language to create generative desserts.
  • Apache MXNet – a flexible and efficient deep learning library that has a Clojure package.
  • 1000-5000 pictures of flans – for Deep Learning you need data!

Gather Flan Pictures

The first thing you want to do is gather your 1000 or more images with a scraper. The scraper will crawl google, bing, or instagram and download pictures of mostly flans to your computer. You may have to eyeball and remove any clearly wrong ones from your stash.

Next, you need to gather all these images in a directory and run a tool called im2rec.py on them to turn them into an image record file for use with MXNet. This will produce an optimized format that will allow our deep learning program to efficiently cycle through them.

Run:

python3 im2rec.py --resize 28 root flan

to produce a flan.rec file with images resized to 28x28 that we can use next.

Load Flan Pictures into MXNet

The next step is to import the image record iterator into MXNet with the Clojure API. We can do this with the io namespace.

Add this to your require:

[org.apache.clojure-mxnet.io :as mx-io]

Now, we can load our images:

(def flan-iter (mx-io/image-record-iter {:path-imgrec "flan.rec"
                                         :data-shape [3 28 28]
                                         :batch-size batch-size}))

Now that we have the images, we need to create our model. This is what is actually going to do the learning and creating of images.

Creating a GAN model.

GAN stands for Generative Adversarial Network. This is an incredibly cool deep learning technique that has two different models pitted against each other, yet both learning and getting better at the same time. The two models are a generator and a discriminator. The generator model creates a new image from a random noise vector. The discriminator then tries to tell whether the image is a real image or a fake image. We need to create both of these models for our network.

First, the discriminator model. We are going to use the symbol namespace for the clojure package:

(defn discriminator []
  (as-> (sym/variable "data") data
    (sym/convolution "d1" {:data data
                           :kernel [4 4]
                           :pad [3 3]
                           :stride [2 2]
                           :num-filter ndf
                           :no-bias true})
    (sym/batch-norm "dbn1" {:data data :fix-gamma true :eps eps})
    (sym/leaky-re-lu "dact1" {:data data :act-type "leaky" :slope 0.2})

  ...

There is a variable for the data coming in, (which is the picture of the flan); it then flows through the other layers, which consist of convolution, normalization, and activation layers. The last three layers actually repeat another two times before ending in the output, which tells whether it thinks the image was a fake or not.

The generator model looks similar:

(defn generator []
  (as-> (sym/variable "rand") data
    (sym/deconvolution "g1" {:data data
                             :kernel [4 4]
                             :pad [0 0]
                             :stride [1 1]
                             :num-filter
                             (* 4 ndf) :no-bias true})
    (sym/batch-norm "gbn1" {:data data :fix-gamma true :eps eps})
    (sym/activation "gact1" {:data data :act-type "relu"})
  
  ...
  

There is a variable for the data coming in, but this time it is a random noise vector. Another interesting point is that it is using a deconvolution layer instead of a convolution layer. The generator is basically the inverse of the discriminator. It starts with a random noise vector, and that is translated up through the layers until it is expanded to an image output.

Next, we iterate through all of our training images in our flan-iter with reduce-batches. Here is just an excerpt where we get a random noise vector and have the generator run the data through and produce the output image:

(mx-io/reduce-batches
       flan-iter
       (fn [n batch]
         (let [rbatch (mx-io/next rand-noise-iter)
               dbatch (mapv normalize-rgb-ndarray (mx-io/batch-data batch))
               out-g (-> mod-g
                         (m/forward rbatch)
                         (m/outputs))

The whole code is here for reference, but let’s skip forward and run it and see what happens.

FLANS!! Well, they could be flans if you squint a bit.

Now that we have them kinda working for a small image size 28x28, let’s biggerize it.

Turn on the Oven and Bake

Turning up the size to 128x128 requires some alterations in the layers’ parameters to make sure that it processes and generates the correct size, but other than that we are good to go.
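For example, the images get re-encoded with im2rec at the larger size, and the record iterator’s :data-shape has to match it (a sketch; the layer parameter changes are in the full code):

;; re-run im2rec with --resize 128, then point the iterator at the new shape
(def flan-iter (mx-io/image-record-iter {:path-imgrec "flan.rec"
                                         :data-shape [3 128 128]
                                         :batch-size batch-size}))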

Here comes the fun part, watching it train and learn:

Epoch 0

In the beginning there was nothing but random noise.

Epoch 10

It’s beginning to learn colors! Red, yellow, brown seem to be important to flans.

Epoch 23

It’s learning shapes! It has learned that flans seem to be blob shaped.

Epoch 33

It is moving into its surreal phase. Salvador Dali would be proud of these flans.

Epoch 45

Things take a weird turn. Does that flan have eyes?

Epoch 68

Even worse. Are those demonic flans? Should we even continue down this path?

Answer: Yes – the training must go on.

Epoch 161

Big moment here. It looks like something that could possibly be edible.

Epoch 170

Ick! Green Flans! No one is going to want that.

Epoch 195

We’ve achieved maximum flan, (for the time being).

Explore

If you are interested in playing around with the pretrained model, you can check it out here with the pretrained function. It will load up the trained model and generate flans for you to explore and bring to your dinner parties.

Wrapping up, training GANs is a lot of fun. With MXNet, you can bring the fun with you to Clojure.

Want more? Check out this Clojure Conj video – Can You GAN?