Vector Symbolic Architectures in Clojure
generated with Stable Diffusion
Before diving into the details of what Vector Symbolic Architectures are and what it means to implement Clojure data structures in them, I’d like to start with some of my motivation in this space.
Small AI for More Personal Enjoyment
Over the last few years, I’ve spent time learning, exploring, and contributing to open source deep learning. It continues to amaze me with its rapid movement and achievements at scale. However, the scale is really too big and too slow for me to enjoy it anymore.
Between work and family, I don’t have a lot of free time. When I do get a few precious hours to do some coding just for me, I want it it to be small enough for me to fire up and play with it in a REPL on my local laptop and get a result back in under two minutes.
I also believe that the current state of AI is not likely to produce any more meaningful revolutionary innovations in the current mainstream deep learning space. This is not to say that there won’t be advances. Just as commercial airlines transformed the original first flight, I’m sure we are going to continue to see the transformation of society with current big models at scale - I just think the next leap forward is going to come from somewhere else. And that somewhere else is going to be small AI.
Vector Symbolic Architures aka Hyperdimensional Computing
Although I’m talking about small AI, VSA or Hyperdimensional computing is based on really big vectors - like 1,000,000 dimensions. The beauty and simplicity in it is that everything is a hypervector - symbols, maps, lists. Through the blessing of high dimensionality, any random hypervector is mathematically guaranteed to be orthogonal to any other one. This all enables some cool things:
- Random hypervectors can be used to represent symbols (like numbers, strings, keywords, etc..)
- We can use an algebra to operate on hypervectors: bundling and binding operations create new hypervectors that are compositions of each other and can store and retrieve key value pairs. These operations furthermore are fuzzy due to the nature of working with vectors. In the following code examples, I will be using the concrete model of MAP (Multiply, Add, Permute) by R. Gayler.
- We can represent Clojure data structures such as maps and vectors in them and perform operations such as
get
with probabilistic outcomes. - Everything is a hypervector! I mean you have a keyword that is a symbol that is a hypervector, then you bundle that with other keywords to be a map. The result is a single hypervector. You then create a sequence structure and add some more in. The result is a single hypervector. The simplicity in the algebra and form of the VSA is beautiful - not unlike LISP itself. Actually, P. Kanerva thought that a LISP could be made from it. In my exploration, I only got as far as making some Clojure data structures, but I’m sure it’s possible.
Start with an Intro and a Paper
A good place to start with Vector Symbolic Architectures is actually the paper referenced above - An Introduction to Hyperdimensional Computing for Robots. In general, I find the practice of taking a paper and then trying to implement it a great way to learn.
To work with VSAs in Clojure, I needed a high performing Clojure library with tensors and data types. I reached for https://github.com/techascent/tech.datatype. It could handle a million dimensions pretty easily on my laptop.
To create a new hypervector - simply chose random values between -1 and 1. This gives us a direction in space which is enough.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
|
The only main operations to create key value pairs is addition and matrix multiplication.
Adding two hyperdimensional vectors, (hdvs), together is calling bundling. Note we clip the values to 1 or -1. At high dimensions, only the direction really matters not the magnitude.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
|
We can assign key values using bind
which is matrix multiplication.
1 2 3 4 |
|
One cool thing is that the binding of a key value pair is also the inverse of itself. So to unbind is just to bind again.
The final thing we need is a cleanup memory. The purpose of this is to store the hdv somewhere without any noise. As the hdv gets bundled with other operations there is noise associated with it. It helps to use the cleaned up version by comparing the result to the memory version for future operations. For Clojure, this can be a simple atom.
Following along the example in the paper, we reset the cleanup memory and add some symbols.
1 2 3 4 5 6 7 8 |
|
Next we create the key value map with combinations of bind
and bundle
.
1 2 3 4 5 6 |
|
So H
is just one hypervector as a result of this. We can then query it. unbind-get
is using the bind
operation as inverse. So if we want to query for the :name
value, we get the :name
hdv from memory and do the bind
operation on the H
data structure which is the inverse.
1 2 3 4 5 |
|
We can find other values like :high-score
.
1 2 3 4 5 |
|
Or go the other way and look for Alice
.
1 2 3 4 5 |
|
Now that we have the fundamentals from the paper, we can try to implement some Clojure data structures.
Clojure Data Structures in VSAs
First things first, let’s clear our cleanup memory.
1
|
|
Let’s start off with a map, (keeping to non-nested versions to keep things simple).
1
|
|
The result is a 1,000,000 dimension hypervector - but remember all the parts are also hypervectors as well. Let’s take a look at what is in the cleanup memory so far.
1 2 3 4 5 6 7 |
|
We can write a vsa-get
function that takes the composite hypervector of the map and get the value from it by finding the closest match with cosine similarity to the cleanup memory.
1 2 3 4 |
|
In the example above, the symbolic value is the first item in the vector, in this case the number 1, and the actual hypervector is the second value.
We can add onto the map with a new key value pair.
1 2 3 4 5 6 |
|
We can represent Clojure vectors as VSA data structures as well by using the permute (or rotate) and adding them like a stack.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
|
We can also add onto the Clojure vector with a conj.
1 2 3 4 5 6 7 8 |
|
What is really cool about this is that we have built in fuzziness or similarity matching. For example, with this map, we have more than one possibility of matching.
1
|
|
We can see all the possible matches and scores
1 2 3 4 5 6 7 8 9 |
|
This opens up the possibility of defining compound symbolic values and doing fuzzy matching. For example with colors.
1 2 3 |
|
Let’s add a new compound value to the cleanup memory that is green based on yellow and blue.
1 2 3 |
|
Now we can query the hdv color map for things that are close to green.
1 2 3 4 5 |
|
We can also define an inspect
function for a hdv by comparing the similarity of all the values of the cleanup memory in it.
1 2 3 4 5 6 7 8 9 10 |
|
Finally, we can implement clojure map
and filter
functions on the vector data structures that can also include fuzziness.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
|
Wrap Up
VSAs and hyperdimensional computing seem like a natural fit for LISP and Clojure. I’ve only scratched the surface here in how the two can fit together. I hope that more people are inspired to look into it and small AI with big dimensions.
Full code and examples here https://github.com/gigasquid/vsa-clj.
Special thanks to Ross Gayler in helping me to implement VSAs and understanding their coolness.