
Interactive Neural Network Visualizer

Step through neural network training and see every computation in mathematical detail.

Quick Start

Click ▶ Play to start training • Explore the tabs at the top (Network, Computation, Data, Predictions, Parameters) • Adjust the learning rate slider to see different training behaviors

📖 Detailed Controls & Tips

Control Buttons

  • ▶ Play - Auto-train | ⏸ Pause - Stop | ⏭ Step - Manual advance | ↻ Reset - New weights

Learning Rate

  • 0.1 - Steady (default) | 0.5-1.0 - Faster | 2.0+ - Unstable (try it!)

What Each Tab Shows

  • Network - Visual diagram (node size = activation, edge color = weight sign)
  • Computation - Step-by-step math with notation glossary
  • Data - Scatter plot & table of training points
  • Predictions - How well the network solves XOR
  • Parameters - Weight evolution graphs over time

Learning Tips

  1. Start with defaults (100 samples, no noise, LR 0.1)
  2. Watch loss decrease to near 0
  3. Try LR 0.5, 1.0, 2.0 to see different behaviors
  4. Add noise to test robustness
  5. Reset and compare different runs


What is XOR?

XOR (Exclusive OR) is a logical operation that outputs true (1) only when inputs differ:

| Input A | Input B | XOR Output |
|---------|---------|------------|
| 0       | 0       | 0          |
| 0       | 1       | 1          |
| 1       | 0       | 1          |
| 1       | 1       | 0          |
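
In code, the target function is a one-liner; here is a minimal TypeScript version:

```typescript
// XOR: outputs 1 exactly when the two binary inputs differ.
const xor = (a: 0 | 1, b: 0 | 1): 0 | 1 => (a !== b ? 1 : 0);

console.log(xor(0, 0), xor(0, 1), xor(1, 0), xor(1, 1)); // → 0 1 1 0
```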

Why XOR is Perfect for Learning

XOR is the simplest problem that requires a hidden layer:

  • Not linearly separable: no single straight line in the input plane can separate the 1 outputs from the 0 outputs
  • Requires feature combination: The network must learn that "inputs differ" is the pattern
  • Classic test: If a neural network can learn XOR, it has the capacity for non-linear learning

What the Network is Learning

The demo shows a 2→4→1 network learning XOR through these steps:

  1. Random initialization: Weights start random, predictions are terrible
  2. Forward pass: Input flows through network, producing a prediction (0 to 1)
  3. Calculate error: Compare prediction to actual XOR answer
  4. Backpropagation: Calculate how to adjust each weight to reduce error
  5. Update weights: Apply gradients to make predictions slightly better
  6. Repeat: After many iterations, the network learns the XOR pattern

Watch for:

  • Early training: Loss is high (~0.7), predictions random
  • Middle training: Loss decreases, patterns start emerging
  • Convergence: Loss near 0, all 4 XOR cases predicted correctly

The Data Visualization tab shows the 4 training points and how the network classifies them in the input space.

What This Demo Teaches

1. How Neural Networks Learn

Watch a network solve the XOR problem from random initialization:

  • Initial state: Random weights, terrible predictions
  • Training: Gradual weight adjustments via backpropagation
  • Convergence: Network learns the correct pattern

2. Forward Pass in Detail

See each computation step:

Input layer → Hidden layer (shown for hidden neuron 0; wᵢⱼ is the weight from input i to hidden neuron j):
  z₀ = x₀ · w₀₀ + x₁ · w₁₀ + b₀
  a₀ = sigmoid(z₀)

Hidden layer → Output (uⱼ is the weight from hidden neuron j to the output):
  z_out = a₀ · u₀ + a₁ · u₁ + a₂ · u₂ + a₃ · u₃ + b_out
  prediction = sigmoid(z_out)

Every value is shown with actual numbers.
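
A compact TypeScript sketch of this computation (the `Net` shape and names are illustrative, not the demo's actual API; `w1` holds the wᵢⱼ above and `w2` the uⱼ):

```typescript
// Illustrative 2→4→1 network state.
type Net = {
  w1: number[][]; // w1[i][j]: weight from input i to hidden j (wᵢⱼ above), 2 × 4
  b1: number[];   // hidden biases, length 4
  w2: number[];   // w2[j]: weight from hidden j to the output (uⱼ above), length 4
  b2: number;     // output bias
};

const sigmoid = (z: number): number => 1 / (1 + Math.exp(-z));

function forward(net: Net, x: number[]): { a: number[]; prediction: number } {
  // Hidden layer: zⱼ = x₀·w1[0][j] + x₁·w1[1][j] + b1[j]; aⱼ = sigmoid(zⱼ)
  const a = net.b1.map((b, j) =>
    sigmoid(x[0] * net.w1[0][j] + x[1] * net.w1[1][j] + b)
  );
  // Output: z_out = Σⱼ aⱼ·w2[j] + b2; prediction = sigmoid(z_out)
  const zOut = a.reduce((sum, aj, j) => sum + aj * net.w2[j], net.b2);
  return { a, prediction: sigmoid(zOut) };
}
```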

3. Backward Pass (Backpropagation)

See how errors propagate backward:

Output error → Hidden layer error → Weight gradients

The chain rule is made explicit at each step.
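
A matching gradient sketch, reusing `Net` and `forward` from above. It assumes a binary cross-entropy loss (consistent with the initial loss of ~0.7 ≈ ln 2 noted on this page), under which the output error collapses to `prediction − y`:

```typescript
// Backprop for the 2→4→1 sketch above.
function backward(net: Net, x: number[], y: number) {
  const { a, prediction } = forward(net, x);
  const dOut = prediction - y; // ∂L/∂z_out for sigmoid output + cross-entropy

  // Hidden error: ∂L/∂zⱼ = dOut · w2[j] · aⱼ(1 − aⱼ), since sigmoid′(z) = a(1 − a)
  const dHidden = a.map((aj, j) => dOut * net.w2[j] * aj * (1 - aj));

  return {
    dW1: [0, 1].map((i) => dHidden.map((dj) => x[i] * dj)), // ∂L/∂w1[i][j]
    dB1: dHidden,
    dW2: a.map((aj) => aj * dOut), // ∂L/∂w2[j]
    dB2: dOut,
  };
}
```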

4. Why Hidden Layers Matter

XOR is not linearly separable:

  • Without hidden layer: Cannot solve XOR
  • With hidden layer: Network creates new features (intermediate representations)
  • Visualization: See how hidden neurons create separable representations

Features

Interactive Controls

  • Step-by-step: Move forward/backward through training
  • Play/Pause: Watch training in real-time or at your own pace
  • Reset: Start over with new random weights
  • Randomize: Try different initializations
  • Learning rate: Adjust from 0.01 to 2.0

Training Data Options

  • Data size: 50 to 500 training examples
  • Noise level: 0% to 50% label corruption
  • Confidence penalty: Regularization that discourages over-confident predictions pinned at exactly 0 or 1
  • Regenerate: Create new random datasets
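
One plausible way such a dataset could be built (a sketch, not the demo's actual sampling code): draw random XOR points and flip each label with probability equal to the noise level.

```typescript
// Hypothetical dataset generator: n random XOR samples with label corruption.
type Sample = { x: [0 | 1, 0 | 1]; y: 0 | 1; corrupted: boolean };

function makeDataset(n: number, noise: number): Sample[] {
  const bit = (): 0 | 1 => (Math.random() < 0.5 ? 0 : 1);
  return Array.from({ length: n }, () => {
    const x: [0 | 1, 0 | 1] = [bit(), bit()];
    const clean: 0 | 1 = x[0] !== x[1] ? 1 : 0;
    const corrupted = Math.random() < noise; // flip label with probability `noise`
    const y: 0 | 1 = corrupted ? (clean === 1 ? 0 : 1) : clean;
    return { x, y, corrupted };
  });
}
```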

Multiple Visualizations

Network Graph:

  • Nodes show activations
  • Edges show weights (color = sign, thickness = magnitude)
  • Click edges to see detailed gradient info

Computation Panel:

  • Forward pass step-by-step
  • Loss calculation
  • Backward pass gradients
  • Notation glossary

Data Visualization:

  • Scatter plot of training points
  • Table view of all samples
  • Noisy samples highlighted

Parameter Graphs:

  • Weight evolution over time
  • Loss curve
  • Path strength analysis

Predictions Panel:

  • Network output on canonical XOR points
  • Accuracy and loss metrics

Understanding the Architecture

Network Structure: 2 → 4 → 1

Input layer (2 neurons):

  • Takes two binary inputs (0 or 1)

Hidden layer (4 neurons):

  • Creates intermediate representations
  • Learns features like "AND" and "OR"
  • Uses sigmoid activation

Output layer (1 neuron):

  • Combines hidden features
  • Produces final prediction (0 to 1)
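
Counting biases, that is 2·4 + 4 + 4·1 + 1 = 17 trainable parameters in total.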

Why 4 Hidden Neurons?

The network could solve XOR with only 2-3 hidden neurons, but 4 provides:

  • More capacity for learning
  • Clearer visualization of path strengths
  • Redundancy for noisy data

Common Questions

Why does loss sometimes increase?

Reasons:

  1. Learning rate too high → overshooting optima
  2. Noisy training data → memorizing wrong patterns
  3. Stochastic training → batch-to-batch variance

Experiment: Lower the learning rate and watch convergence smooth out.

Why do some weights grow very large?

Answer: The network is becoming confident. Large weights → steep sigmoid → predictions close to 0 or 1.

Experiment: Enable "confidence penalty" to discourage extreme weights.

What are path strengths?

Path strength = product of weights along a path from input to output.

Example: Input x₀ → Hidden h₂ → Output

Path strength = w[0→2] × w[2→out]

Strong paths dominate the network's computation.
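
In terms of the hypothetical `Net` shape from the forward-pass sketch, a path strength is a two-weight product:

```typescript
// Strength of the path input i → hidden j → output: the product of the
// two weights along it (reusing the Net type from the forward-pass sketch).
function pathStrength(net: Net, i: number, j: number): number {
  return net.w1[i][j] * net.w2[j];
}
// pathStrength(net, 0, 2) is w[0→2] × w[2→out], the example above.
```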

Experiment: Open the Parameters tab to watch the path strengths evolve.

Learning Exercises

Beginner

  1. Watch one full training run: Observe loss decreasing
  2. Inspect final weights: Which paths matter most?
  3. Reset and compare: Do different random starts converge to similar solutions?

Intermediate

  1. Learning rate experiments:

    • Try 0.1 (default), 0.5, 1.0, 2.0
    • When does training become unstable?
  2. Data size impact:

    • Train with 50 vs 500 samples
    • Does more data always help?
  3. Noise robustness:

    • Add 10%, 20%, 30% noise
    • At what point does learning fail?

Advanced

  1. Hypothesis testing:

    • Form a hypothesis (e.g., "larger learning rate = faster convergence")
    • Test it systematically
    • Analyze results
  2. Path analysis:

    • Identify the dominant path at convergence
    • What does this path compute?
    • Why is it stronger than others?
  3. Initialization sensitivity:

    • Run 10 training sessions
    • Do they all find similar solutions?
    • What varies? What stays constant?

Technical Implementation

Pure TypeScript

No ML libraries - everything implemented from scratch:

  • Forward pass computation
  • Backpropagation algorithm
  • Weight updates
  • Loss calculation

Why? Understanding the implementation is part of the learning.
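
Tying the earlier sketches together, a per-sample SGD step and epoch loop might look like the following. This is a sketch under the same assumptions as before (cross-entropy loss, the hypothetical `Net`, `forward`, `backward`, and `Sample` definitions from above); the demo's real code is organized differently.

```typescript
// Gradient step: move every weight against its gradient.
function update(net: Net, g: ReturnType<typeof backward>, lr: number): void {
  net.w1.forEach((row, i) => row.forEach((_, j) => (net.w1[i][j] -= lr * g.dW1[i][j])));
  net.b1.forEach((_, j) => (net.b1[j] -= lr * g.dB1[j]));
  net.w2.forEach((_, j) => (net.w2[j] -= lr * g.dW2[j]));
  net.b2 -= lr * g.dB2;
}

// One epoch of per-sample training; returns the mean cross-entropy loss.
function trainEpoch(net: Net, data: Sample[], lr = 0.1): number {
  let loss = 0;
  for (const { x, y } of data) {
    const { prediction: p } = forward(net, x);
    loss += -(y * Math.log(p) + (1 - y) * Math.log(1 - p)); // cross-entropy
    update(net, backward(net, x, y), lr);
  }
  return loss / data.length; // ≈ 0.693 (ln 2) at random initialization
}
```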

Testing

Comprehensive test suite ensures correctness:

  • Forward pass produces correct shapes
  • Gradients computed accurately
  • Weight updates applied correctly
  • Loss decreases over training
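
The "gradients computed accurately" check, for example, is typically done against a centered finite difference; a minimal sketch of the idea (hypothetical, not the project's actual test code):

```typescript
// Centered finite difference: perturb one weight, measure the loss slope,
// and compare it with the analytic gradient from backward().
function numericalGrad(lossAt: (w: number) => number, w: number, eps = 1e-5): number {
  return (lossAt(w + eps) - lossAt(w - eps)) / (2 * eps);
}
// A test would assert |numericalGrad(...) − analytic gradient| < 1e-6 per weight.
```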

View the tests

Source Code

All code is open source.

Next Steps

After mastering this demo:

  1. Read the implementation: See how backprop works in code
  2. Explore the learnings: What we discovered while building this
  3. Try modifying it: Fork the code and experiment
  4. Wait for more demos: Attention mechanisms coming soon!

Feedback

Found a bug? Have a feature idea? Something confusing?

Open an issue on GitHub to help improve this learning tool!


This visualizer is part of the LLM Workings learning project exploring neural networks and language models from the inside out.
