Here's a quick journey through the key milestones: the discoveries and breakthroughs that paved the way for the creation of Hopfield Networks and beyond.
Donald Hebb introduced the Hebbian theory, which describes how the simultaneous activation of two neurons strengthens the synaptic connection between them. This is often summarized as: "neurons that fire together, wire together." In the context of Hopfield Networks, this principle is used to update the weights of the network. The weight for each connection is calculated as:

\[ w_{jk} = \sum_{p} (2 \epsilon_j^p - 1)(2 \epsilon_k^p - 1), \qquad w_{jj} = 0 \]

where \( \epsilon^p \) is the \( p \)-th training pattern.
One thing to observe is the seemingly odd \( 2 \epsilon - 1 \) instead of just using \( \epsilon \). That is simply a way to map the states so that we get the same energy irrespective of whether we go with the \( \pm 1 \) representation or the binary representation \( (1, 0) \).
We've seen how the network learns to store memories using the mathematical framework discussed earlier. But how do we actually retrieve a stored memory once the network is trained?
Hopfield, in his original paper, proposed a binary activation update rule for neurons. The idea is simple: if we provide a partial or noisy memory as input, each neuron updates its state based on the weighted contributions from all the other connected neurons. If this summed input crosses a certain threshold, the neuron activates (1); otherwise, it deactivates (0). The update rule can be written as:

\[ V_i \leftarrow \begin{cases} 1 & \text{if } \sum_{j \neq i} w_{ij}\, (2 V_j - 1) \ge \theta \\ 0 & \text{otherwise} \end{cases} \]

Where:
- \( V_i \) is the state of neuron \( i \) (1 for active, 0 for inactive),
- \( w_{ij} \) is the weight of the connection between neurons \( i \) and \( j \),
- \( \theta \) is the activation threshold.

By repeatedly applying this rule, the network gradually converges from the given partial input to the closest stored memory pattern, effectively completing the memory.
The Hnet class below represents a basic Hopfield Network model in C++. It stores the connection strengths between neurons (the weights), keeps track of the number of neurons (n), and uses a threshold value to determine activation.

The main components are:
- weights: A 2D matrix of doubles representing connections between neurons.
- n: The total number of neurons in the network.
- threshold: The activation threshold for neuron state changes (default is 0.0).

The class provides methods to:
- energy: Calculate the current energy of the network given a state.
- learn: Update weights using a chosen learning method and training states.
- infer: Evolve the network state until it stabilizes (or until max iterations).
- save_weights / load_weights: Store and retrieve trained weights from a file.
#include <string>
#include <vector>

// Supported learning rules (Storkey is planned, see the notes below).
enum Learning_method { Hebbian, Storkey };

class Hnet {
private:
    std::vector<std::vector<double>> weights;
    int n;
    double threshold = 0.0;

public:
    Hnet(int n);
    Hnet(int n, double threshold);
    double energy(const std::vector<int>& state);
    void learn(Learning_method lm, const std::vector<std::vector<int>>& states);
    void infer(std::vector<int>& state, int max_iters = 100);
    void save_weights(const std::string& filename) const;
    void load_weights(const std::string& filename);
};
The learn function is where the Hopfield Network actually trains. It uses the Hebbian learning rule to update the weights based on the training states.

- Learning method check: If the method is not Hebbian, it simply returns without doing anything.
- Hebbian update: For every pair of neurons (j, k), the weight is updated using the rule (2*state[j] - 1) * (2*state[k] - 1), which reinforces matching states and penalizes opposite ones.
- Weight merge: In the parallelized version (linked at the end of the post), each thread accumulates its own local weights, which are combined into the main weight matrix after all threads finish; the simplified serial version is shown here.

In short: this function takes the training states, applies Hebbian learning, and updates the network's weights so that the given patterns become stable energy minima.
void Hnet::learn(Learning_method lm, const std::vector<std::vector<int>>& states) {
    if (lm != Hebbian) return;
    int total_states = states.size();
    for (int idx = 0; idx < total_states; idx++) {
        const auto& state = states[idx];
        for (int j = 0; j < n; j++) {
            for (int k = 0; k < n; k++) {
                if (j == k) continue;
                weights[j][k] += (2 * state[j] - 1) * (2 * state[k] - 1);
            }
        }
    }
}
Starting from an initial (possibly incomplete or noisy) state, the infer
function repeatedly updates each
neuron based on the weighted input from all other neurons until the network reaches a stable state or the maximum number of
iterations is reached. Each neuron's new state is computed using the net weighted sum of all other neurons. The process stops
early if the state stops changing (converges).
void Hnet::infer(std::vector<int>& incomplete_state, int max_iters) {
    for (int iter = 0; iter < max_iters; iter++) {
        std::vector<int> next_state(n);
        for (int i = 0; i < n; i++) {
            double net_input = 0.0;
            for (int j = 0; j < n; j++) {
                if (i == j) continue;
                net_input += weights[i][j] * (2 * incomplete_state[j] - 1);
            }
            net_input -= threshold;
            next_state[i] = net_input >= 0 ? 1 : 0;
        }
        if (next_state == incomplete_state) break;
        incomplete_state.swap(next_state);
    }
}
You can check out the full implementation of this code, with parallelization, applied to the MNIST handwritten digit recognition dataset
here. My stupid self just tried training it on all the images, even though Hopfield mentioned the \( 0.15N \) soft limit, which basically
means that a network with \( N \) nodes can memorize up to about \( 0.15N \) patterns. So all I get is some random spurious pattern. One more
thing worth mentioning: I found a paper where they tried a Hopfield network on the MNIST dataset, and they note that the Hebbian method doesn't
work well, so I will try implementing the Storkey method for this.
Now that we have a basic understanding of what a Hopfield Network is, how it can be used, and to some extent why it works, we can move on to a deeper question: why is it able to store memories? To answer this, we need to explore its convergence properties. Jehoshua Bruck published a paper that examines the convergence behavior of Hopfield networks in great detail. In the following section, I have drawn upon some of the theorems and proofs from his work to explain the convergence of Hopfield networks, specifically for the case of serial mode with symmetric weights. Here, serial means that the neuron states are updated one at a time, and symmetric means that \( w_{i,j} = w_{j,i} \).
Theorem: Let \( N = (W, T) \), where \( N \) is the neural network, \( W \) is the weight matrix of the network, and \( T \) is a vector of threshold values (essentially a zero vector in our example), operating in serial mode. \( W \) is a symmetric matrix with non-negative diagonal elements (zero in our case, since we are not considering self-connections). Then the network \( N \) will always converge to a stable state.
Proof:
Do note that I have taken the state values to be \( \epsilon = \pm 1 \) instead of \( 0/1 \), because that is how it is done in the paper.
The idea is still the same, and maybe you can try using \( 0/1 \) to write the proof on your own for fun.
Threshold vector: \( T = [ \theta_1, \theta_2, \ldots, \theta_n ]^T \)
State vector at time \( k \): \( V(k) = [ \epsilon^k_1, \epsilon^k_2, \ldots, \epsilon^k_n ]^T \)
The energy function (as defined in Bruck's paper; this version is to be maximized) is:

\[ E(k) = V^T(k)\, W\, V(k) - 2\, V^T(k)\, T \]

We are doing this computation in serial mode, so let us say we update neuron \( i \); then \( \Delta \epsilon_j^k = 0 \) for \( j \neq i \), and the change in energy works out to

\[ \Delta E = 2\, \Delta \epsilon_i^k \left( \sum_{j=1}^{n} w_{ij}\, \epsilon_j^k - \theta_i \right) + w_{ii} \left( \Delta \epsilon_i^k \right)^2 \]

It is easy to see from the definition of \( \Delta \epsilon_i^k \) that the first term is always positive or zero: the update rule flips \( \epsilon_i \) only in the direction of the sign of the net input, so \( \Delta \epsilon_i^k \) is either zero or has the same sign as \( \sum_j w_{ij}\, \epsilon_j^k - \theta_i \). And since the diagonal elements of \( W \) are non-negative, the second term is non-negative as well, so \( \Delta E \ge 0 \) for every update. We also know that the energy function is defined on a finite state space for a given neural network, which means it is a bounded function.

\( \therefore E \) will always converge, and the network settles into a stable state.
Hopfield Networks may seem like a relic of early neural network research, but their core ideas continue to ripple through modern AI, from associative memory mechanisms to attention-based architectures. What makes them remarkable is not just their ability to store and retrieve patterns, but how they bridge diverse disciplines: the physics of energy minimization, the biology of neural plasticity, and the mathematics of dynamical systems.
While they are no longer the state-of-the-art for large-scale pattern recognition, they remain a powerful mental model: a reminder that intelligence can emerge from simple local interactions when embedded in the right structure. For anyone curious about where neural networks came from, studying Hopfield Networks is like looking at the DNA of modern AI.
This blog only scratches the surface. I plan to keep updating this post and experiment with extensions like continuous Hopfield networks and the Storkey learning rule, and see how these early architectures can still surprise us today.