Master's Thesis Defense

Uncovering Structural Changes in Open-Ended Continual Learning

Igor Piotr Urbanik
Supervisor: Dr. Paweł Gajewski
Faculty of Computer Science · AGH University of Krakow

Agenda

01 Prologue: unpacking the title and the field.
02 The Story: how SatSOM appeared, surprised us, and exposed gaps.
03 Epilogue: what remains useful after the numbers fade.
Part 1 / Prologue

Uncovering Structural Changes in Open-Ended Continual Learning

We will walk through the title, one idea at a time.
Prologue

Continual Learning

A model learns from a stream of tasks or data, without retraining from scratch each time the world changes.

  • new classes, users, objects, or contexts arrive sequentially
  • old knowledge should remain usable
  • the system should adapt without full supervision loops
Prologue

Open-Ended Continual Learning

Most current approaches still require substantial human preparation before new data can be introduced.

Closed setup: curated task boundaries; a human decides what arrives and how.
Open-ended setup: unbounded change; the system must organize knowledge as it grows.
That requirement rules out many of the most interesting applications.
Prologue

Uncovering Structural Changes

The goal is not only to improve a score.

It is to understand, almost mechanistically, what happens inside a learning model when it remembers, adapts, or forgets.

Prologue

The Stability-Plasticity Dilemma

Stability: preserve old knowledge.
Plasticity: absorb new knowledge.

Continual learning lives in the tension between these two forces.

Prologue

Two Reference Points

EWC / Kirkpatrick et al.

Protect important weights

Constrain changes to weights that carry high Fisher information. Simple, general, and easy to implement (see the sketch below).

DSDM / Pourcel et al.

Dynamic Sparse Distributed Memory

A strong contemporary reference point and state of the art at the time of writing.
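
To make the EWC idea concrete, here is a minimal sketch of the quadratic penalty, assuming PyTorch; the dicts theta_star (weights saved after the previous task) and fisher (diagonal Fisher estimates), as well as the function name, are illustrative, not Kirkpatrick et al.'s code.

import torch

# Minimal EWC penalty sketch (illustrative, not the original implementation).
# theta_star: parameters saved after the old task; fisher: diagonal Fisher
# estimates; both are dicts keyed by parameter name.
def ewc_penalty(model, theta_star, fisher, lam=1.0):
    penalty = torch.zeros(())
    for name, p in model.named_parameters():
        # Weights with high Fisher information are expensive to move.
        penalty = penalty + (fisher[name] * (p - theta_star[name]) ** 2).sum()
    return 0.5 * lam * penalty

# total_loss = task_loss + ewc_penalty(model, theta_star, fisher)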

Prologue

Datasets

All benchmarks were evaluated in a class-incremental setting.

  • KMNIST
  • FashionMNIST
  • CIFAR-10
  • CIFAR-100
  • CORe50
Figure: a sample image from each dataset.

For CIFAR-10, CIFAR-100, and CORe50, learning was performed on embeddings from an ImageNet-pretrained ResNet50 rather than on raw pixels; this is a commonly used protocol.
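
A sketch of this frozen-encoder protocol, assuming torchvision's ImageNet-pretrained ResNet50 with its classifier head removed; the exact weights and preprocessing used in the thesis may differ.

import torch
from torchvision.models import resnet50, ResNet50_Weights

# Frozen encoder: embed images once, then run continual learning on the
# 2048-d embeddings instead of raw pixels (assumed setup, for illustration).
encoder = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)
encoder.fc = torch.nn.Identity()   # strip the classification head
encoder.eval()

@torch.no_grad()
def embed(batch):                  # batch: (N, 3, H, W), already preprocessed
    return encoder(batch)          # -> (N, 2048) feature vectors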

Part 2 / The Story

After Many Failed Ideas

Igor
What if the structure itself could tell us something?
The Story

Self-Organizing Map

A SOM is a grid of prototypes that organizes itself topologically.

Similar inputs activate nearby regions, so the map becomes a visible representation of structure.

Figure: BMU and neighborhood adaptation.
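
For reference, the standard SOM update in textbook notation (not thesis-specific): given an input $x$ whose best-matching unit (BMU) is $b$, each prototype $w_i$ is pulled toward $x$ with a strength that decays with grid distance,

$$w_i \leftarrow w_i + \lambda \, h_\sigma(i, b)\,(x - w_i), \qquad h_\sigma(i, b) = \exp\!\left(-\frac{d(i, b)^2}{2\sigma^2}\right),$$

where $d(i, b)$ is the distance between neuron $i$ and the BMU on the grid.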
The Story

Two Knobs: λ and σ

λ (learning rate): how strongly neurons move.
Figure: prototype appearance from noise under low vs. high λ; high λ converges faster.
σ (neighborhood radius): how far the update spreads.
Figure: σ expands or contracts the affected neighborhood.
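
A minimal NumPy sketch of one update step, to make the two knobs concrete; W is the N×D prototype matrix and coords the N×2 grid coordinates (illustrative names, not the thesis code).

import numpy as np

def som_step(W, x, lam, sigma, coords):
    b = np.argmin(((W - x) ** 2).sum(axis=1))      # best-matching unit (BMU)
    d2 = ((coords - coords[b]) ** 2).sum(axis=1)   # squared grid distance to BMU
    h = np.exp(-d2 / (2 * sigma ** 2))             # sigma: how far the update spreads
    W += lam * h[:, None] * (x - W)                # lam: how strongly neurons move
    return W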
The Story

The SatSOM Idea

What if every neuron had its own λ and σ?

01 Track use: each neuron accumulates how much it has already moved during learning.
02 Saturate: as local memory grows, that neuron's own learning rate λ and neighborhood radius σ shrink.
03 Redirect: stable regions stop being overwritten, so new information flows toward less-used parts of the map (a sketch follows the figures below).
Figure 7: SatSOM neighborhood dynamics
Figure 8: SatSOM learning dynamics
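
An illustrative sketch of the saturation mechanism; the accumulation and shrinkage rules here are assumptions chosen for clarity, not the exact SatSOM equations.

import numpy as np

def satsom_step(W, s, x, lam0, sigma0, coords):
    b = np.argmin(((W - x) ** 2).sum(axis=1))          # best-matching unit
    d2 = ((coords - coords[b]) ** 2).sum(axis=1)
    lam = lam0 * (1.0 - s)                             # per-neuron learning rate...
    sigma = sigma0 * (1.0 - s) + 1e-6                  # ...and radius shrink with saturation
    h = np.exp(-d2 / (2 * sigma ** 2))
    move = (lam * h)[:, None] * (x - W)
    W += move
    s = np.clip(s + np.linalg.norm(move, axis=1), 0.0, 1.0)  # accumulate movement
    return W, s                                        # saturated neurons stop moving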
The Story

SatSOM Inference

During inference, we project an input onto the map, then read the class from the local neighborhood.

01 Project: find the best-matching neuron in the embedding map.
02 Vote: aggregate nearby class evidence with distance weighting.
03 Predict: return the class with the strongest local support (see the sketch below).
Figure 1: SatSOM overview
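
A sketch of the project/vote/predict readout; the Gaussian distance weighting and the per-neuron class-evidence matrix labels (N×C) are assumptions for illustration.

import numpy as np

def satsom_predict(W, labels, coords, x, sigma_read=1.0):
    b = np.argmin(((W - x) ** 2).sum(axis=1))        # 1) project: find the BMU
    d2 = ((coords - coords[b]) ** 2).sum(axis=1)
    w = np.exp(-d2 / (2 * sigma_read ** 2))          # 2) vote: weight neighbors by distance
    return int(np.argmax(w @ labels))                # 3) predict: strongest local support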
The Story

First Benchmark

kNN: perfectly stable
SatSOM: local saturation
SOM family: interpretable baselines

SatSOM turned out to be simple, interpretable and strong.

kNN: 85.03
SatSOM: 76.98
CSOM: 75.13
FashionMNIST, incremental by one class, ACC.
The Story

Ablation Study

Paweł
Let us ablate it.
Igor
Good idea.
Cyfronet
BRRR
Igor
Oh.
The Story

Breakthrough 1

σ distributes knowledge: the neighborhood radius spreads updates across the map and protects structure from overwriting.
λ regularizes under pressure: the learning rate matters most when there is no comfortable space for new information.
Figures: ablation charts for the neighborhood radius, the learning rate, and the saturation parameter.
The Story

Lost But Not Forgotten?

Professor Roberto Corizzo from American University visits AGH. Paweł invites him to talk with Igor. The discussion shifts from performance to interpretation.

Igor
SatSOM does not seem to forget much.
Roberto
But here accuracy drops.
Igor
Look at the map. The knowledge is still visible.
Roberto
Oh.
Figure: KMNIST SOM heatmap.
The Story

Breakthrough 2

Accuracy drop is not always forgetting.

Stored: the representation remains in the map.
Accessible: the classifier may fail to retrieve it cleanly.
Like having knowledge on the tip of your tongue.
The Story

Reviewer Enters

The paper goes to JAISCR. A missing comparison becomes the next experiment.

Reviewer
Interesting, but where is the SOTA comparison?
Igor
Yes, sir.
Cyfronet
BRRR
The Story

The Morning After

Igor
Paweł, we have state of the art.
Paweł
Oh.
SatSOM > DSDM
The Story

Breakthrough 3

On selected contemporary benchmarks, SatSOM beat DSDM, the state-of-the-art reference model.

SatSOM: 90.06
DSDM: 70.26
KMNIST, incremental by one class, ACC.
The Story

Something Is Off

Igor
Strange that such a simple method reached SOTA...
Paweł
True.
Igor
Wait. kNN was beating SatSOM.
Cyfronet
BRRR
Igor
Oh.
The Story

Breakthrough 4

On these benchmarks, kNN is effectively the state of the art.

  • large pretrained encoder
  • fixed embeddings
  • stable classifier
kNN: 96.05
SatSOM: 90.06
DSDM: 70.26
Same benchmark as before: KMNIST, incremental by one class, ACC.
The benchmarks outsource plasticity to the encoder and mostly measure stability.
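
A sketch of why kNN is so stable under this protocol (scikit-learn, with a synthetic stand-in for the embedding stream): its "training" only appends to memory, so old classes are never overwritten.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# Hypothetical class-incremental stream: one new class of embeddings per task.
task_stream = [(rng.normal(c, 0.1, (50, 2048)), np.full(50, c)) for c in range(3)]

knn = KNeighborsClassifier(n_neighbors=5)
X_seen, y_seen = [], []
for X_task, y_task in task_stream:
    X_seen.append(X_task)
    y_seen.append(y_task)
    # Fitting kNN just stores the data: perfect stability, zero plasticity
    # beyond what the frozen encoder already provides.
    knn.fit(np.concatenate(X_seen), np.concatenate(y_seen))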
Part 3 / Epilogue

Epilogue

SatSOM is simple, interpretable, and reaches state-of-the-art results on selected benchmarks.

But the deeper value is the gaps it exposes.
Epilogue

What We Built

A saturation-based Self-Organizing Map for continual learning.

  • local plasticity
  • interpretable structure
  • visible saturation dynamics

A SOM-like grid with per-neuron saturation: local memory, local plasticity.

Epilogue

What We Learned

Locality: knowledge is preserved better when updates are distributed locally instead of globally.
Access: an accuracy drop does not always mean erasure; often the model still stores the information but loses clean access to it.
Benchmarks: many modern setups reward stability more than plasticity, so very stable methods can look stronger than they actually are.
Epilogue

Limitations

Plasticity: SatSOM is still more stable than plastic, which limits how well it can absorb genuinely new structure.
Practical baseline: in practice, kNN remains extremely hard to beat in absolute accuracy.
Epilogue

Future Work

Hierarchy: a hierarchical extension is a direct path to better plasticity; since Dittenbach et al. already proposed a hierarchical SOM, a hierarchical SatSOM is a concrete open direction.
Unlearning: Polowczyk et al. recently reported the reverse phenomenon as well: unlearned models may not truly forget; they mostly lose access, and the hidden information can be recovered surprisingly easily.
Metrics: we need better metrics that separate memory, accessibility, stability, and plasticity more directly.

Thank You!