Artificial Neural Networks and Cognitive Models

"A neural network is not a model of the brain so much as a model of one thing the brain does — turning patterns into responses, and getting better at it with practice."- Claude 2026

How a network of simple units learns to behave intelligently — and how researchers use such networks to model the human mind.

Diagram of an artificial neuron showing inputs, weights, a summation step, an activation function, and an output

Source: Rukshan Pramoditha, Towards Data Science

Learning objectives

By the end of this page you should be able to:

Explain the structure and function of artificial neural networks.
Describe cognitive models used to simulate human cognition.
Analyze neural network designs for cognitive applications.

The Structure and Function of Artificial Neural Networks

An artificial neural network (ANN) — a computing system made of many small processing units connected together, loosely inspired by how brain cells connect — learns to perform a task by adjusting the strengths of those connections rather than by following hand-written rules. The whole field rests on one small building block, repeated thousands or millions of times.

The artificial neuron

The basic unit is the artificial neuron (also called a node or, in its simplest form, a perceptron). It does three things in order: it takes several numbers as input, combines them, and produces a single number as output. Each input arrives with a weight — a number that says how much that input matters — and the neuron multiplies each input by its weight and adds the results together (a weighted sum). It then adds a bias (a fixed offset that shifts the result up or down) and passes the total through an activation function — a rule that decides the neuron's final output, often squashing it into a fixed range or deciding whether the neuron "fires."

Compactly, a neuron computes y = φ(Σ wᵢxᵢ + b), where the xᵢ are the inputs, the wᵢ are the weights, b is the bias, and φ is the activation function. Geometrically, the weights and bias define a hyperplane (a flat decision boundary) through the input space, so a single neuron is just a linear classifier: inputs on one side of the hyperplane push the output up, and inputs on the other side push it down. A monotonic activation such as the sigmoid rescales the output but leaves that boundary flat; curved boundaries appear only when the non-linear outputs of many units are combined across layers.

Inputs (x₁, x₂, …, xₙ) → Weighted sum (Σ wᵢxᵢ) → Add bias (+ b) → Activation (φ(·)) → Output (y)
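
As a minimal sketch of the neuron computation described above, the Python below implements one unit with a sigmoid activation. The function names and the example numbers are illustrative assumptions, not values taken from this page.

```python
import math

def sigmoid(z):
    """S-shaped activation: maps any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation=sigmoid):
    """Compute y = phi(sum_i w_i * x_i + b) for one artificial neuron."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum + bias)

# Illustrative values only (assumed for this example).
x = [1.0, 0.0, 1.0]
w = [0.5, -0.3, 0.5]
b = -1.0

# Weighted sum is 1.0, adding the bias gives 0.0, and sigmoid(0.0) is 0.5.
y = neuron(x, w, b)
print(y)
```

Passing a different function as the activation argument changes the unit's output range without touching the weighted-sum step.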

Weights

One number per input, controlling its influence. A large weight (positive or negative) means the input strongly affects the output; a weight near zero means the input is mostly ignored. These are the values the network adjusts as it learns.

Bias

A single extra number added to the weighted sum. It lets the neuron shift its threshold for firing, so the decision boundary does not have to pass through the origin, which makes the unit far more flexible.

Activation function

A rule applied to the total (common choices include the S-shaped sigmoid and the ReLU, which keeps positive values and zeroes the rest). It introduces non-linearity, letting networks model complex relationships.

ReLU is now a common default in hidden layers because sigmoids saturate and cause vanishing gradients in deep stacks; an output layer doing classification typically ends in softmax, which turns raw scores into a probability distribution over the classes.
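
The activation functions named above can be sketched in a few lines of plain Python. The helper sigmoid_slope is an addition for illustration only: it evaluates the sigmoid's derivative, which makes the saturation problem visible as a slope that shrinks toward zero for large inputs.

```python
import math

def sigmoid(z):
    """Squash any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_slope(z):
    """Derivative of the sigmoid: s * (1 - s). Tiny when |z| is large."""
    s = sigmoid(z)
    return s * (1.0 - s)

def relu(z):
    """Keep positive values, zero out the rest."""
    return max(0.0, z)

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])               # example scores are made up
print(relu(-3.0), relu(2.5))
print(sigmoid_slope(0.0), sigmoid_slope(10.0))
print(probs)
```

The slope at 0 is 0.25, the sigmoid's maximum, while at 10 it is already below 0.0001. Multiplying many such small slopes together across layers is what makes gradients vanish.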

From one neuron to a network

A single neuron can only draw simple distinctions. The power comes from arranging many of them in layers and connecting the output of each layer to the input of the next. A typical network has three kinds of layer:

A feedforward neural network with an input layer, two hidden layers, and an output layer, fully connected — A feedforward network: data enters at the input layer, passes through hidden layers, and leaves at the output layer. Source: GeeksforGeeks

Input layer — one unit per piece of input data (for example, one per pixel of an image). It simply receives the values.
Hidden layers — one or more layers in the middle, where the actual processing happens. Each unit combines signals from the previous layer, letting the network build up increasingly complex patterns. Stacking several of them (a deep network) yields hierarchical representations: early layers capture simple features, later layers compose them into abstract ones.
Output layer — produces the final answer, with one unit per possible response (for example, one per category the network can choose).

Passing data forward through the layers to produce an answer is called the forward pass. When data flows in one direction only, from input to output with no loops, the network is called feedforward; layers in which every unit connects to every unit in the next are termed fully connected or dense. A feedforward network of dense layers is the most basic design and the foundation for the rest.
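
A forward pass through dense layers can be sketched as follows. The weights here are hand-picked (an assumption made for the example, not learned) so that a network with two inputs, two hidden ReLU units, and one output computes XOR, a task no single neuron can solve because one hyperplane cannot separate the two classes.

```python
def relu(z):
    return max(0.0, z)

def identity(z):
    return z

def dense(inputs, weights, biases, activation):
    """One fully connected layer: every unit sees every input."""
    return [
        activation(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Forward pass: feed each layer's output into the next layer."""
    activations = inputs
    for weights, biases, activation in layers:
        activations = dense(activations, weights, biases, activation)
    return activations

# Each layer is (weights, biases, activation); one row of weights per unit.
layers = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], relu),   # hidden layer, 2 units
    ([[1.0, -2.0]], [0.0], identity),                 # output layer, 1 unit
]

for pair in ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]):
    print(pair, forward(pair, layers))
```

The first hidden unit counts how many inputs are on, the second fires only when both are on, and the output subtracts twice the second from the first.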

Crucially, a network is not programmed with the right weights — it learns them. It is shown examples, compares its output to the correct answer using a loss function (a measure of how wrong it is), and adjusts its weights to reduce that loss. The standard procedure is gradient descent driven by backpropagation — the chain rule applied backward through the network to find how each weight affects the loss — repeated over many examples until the network performs the task. For the plain-language version, it is enough to know that learning means changing weights; the technical version is that it means descending a loss surface.
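
The learning loop described above can be sketched for the smallest possible case: one sigmoid neuron learning logical OR by gradient descent. The task, the cross-entropy loss, the learning rate, and the epoch count are all assumptions chosen for illustration. With only one unit, backpropagation reduces to a single application of the chain rule, which for this loss and activation gives the gradient (y - t) with respect to the pre-activation total.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def cross_entropy(y, t):
    """Loss for one example: small when prediction y is close to target t."""
    return -(t * math.log(y) + (1 - t) * math.log(1 - y))

def mean_loss(data, w, b):
    return sum(cross_entropy(predict(x, w, b), t) for x, t in data) / len(data)

def train(data, w, b, learning_rate=0.5, epochs=1000):
    for _ in range(epochs):
        for x, t in data:
            y = predict(x, w, b)
            error = y - t   # chain rule: dLoss/dz for sigmoid plus cross-entropy
            # Step each parameter against its gradient (gradient descent).
            w = [wi - learning_rate * error * xi for wi, xi in zip(w, x)]
            b = b - learning_rate * error
    return w, b

# Logical OR as (inputs, target) pairs.
data = [([0.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]

w, b = [0.0, 0.0], 0.0
loss_before = mean_loss(data, w, b)
w, b = train(data, w, b)
loss_after = mean_loss(data, w, b)
print(loss_before, loss_after)
```

In a multi-layer network the same idea applies, except that the error term for each hidden unit is computed from the error terms of the layer after it.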

Cognitive Models That Simulate Human Cognition

A cognitive model is a working system — usually a computer program — built to reproduce some aspect of human thinking precisely enough to be tested. The aim is not just to get the right answer but to get it the way a person would: making the same kinds of mistakes, taking longer on harder problems, showing the same memory limits. If a model behaves like a human on a task, it becomes evidence for how the mind might actually work. Cognitive models fall into a few broad traditions.

Connectionist models

Connectionist models — also called parallel distributed processing (PDP) models — use neural networks directly as theories of cognition. Knowledge is not stored as explicit facts but is spread across the connection weights of a network (a distributed representation, where each concept is a pattern of activity over many units rather than a single symbol), and behavior emerges from many simple units acting together. These models are well suited to explaining abilities that feel automatic and pattern-based — recognizing a familiar face, learning the past tense of verbs, filling in a half-heard word. Their strength is that they learn from examples and show graceful degradation — performance falls off gradually rather than collapsing when information is noisy, incomplete, or units are damaged — much as human cognition does.
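
One way to see distributed storage and graceful degradation in code is a small Hopfield-style network. This is a swapped-in illustration chosen for brevity, not a model named on this page. The two stored patterns, the Hebbian learning rule, and the choice of which connection to cut are all assumptions made for the example.

```python
def train_hebbian(patterns):
    """Store patterns in the weights: units that agree get a positive link."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:                      # no self-connections
                    weights[i][j] += p[i] * p[j]
    return weights

def recall(weights, cue, steps=5):
    """Update all units together until the state stops changing."""
    state = list(cue)
    for _ in range(steps):
        new_state = []
        for i, row in enumerate(weights):
            field = sum(w * s for w, s in zip(row, state))
            if field > 0:
                new_state.append(1)
            elif field < 0:
                new_state.append(-1)
            else:
                new_state.append(state[i])      # tie: keep the current value
        if new_state == state:
            break
        state = new_state
    return state

pattern_a = [1, 1, 1, 1, -1, -1, -1, -1]
pattern_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hebbian([pattern_a, pattern_b])

noisy_cue = [-1, 1, 1, 1, -1, -1, -1, -1]       # pattern_a with its first unit flipped

damaged = [row[:] for row in weights]
damaged[0][5] = damaged[5][0] = 0               # cut one connection in both directions

print(recall(weights, noisy_cue))
print(recall(damaged, noisy_cue))
```

Neither pattern lives in any single weight, so the corrupted cue is still completed to the stored pattern, and cutting one connection does not stop the recall.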

Cognitive architectures

A cognitive architecture is a unified theory of the mind expressed as software: a fixed set of mechanisms — for memory, perception, and action — that together aim to model cognition as a whole, rather than one isolated task. Two are especially influential:

ACT-R

ACT-R (Adaptive Control of Thought—Rational), developed by John Anderson at Carnegie Mellon, divides the mind into specialized modules (visual, manual, declarative memory, and others), each accessed through a small holding area called a buffer. It separates declarative knowledge (facts — "Paris is the capital of France") from procedural knowledge (rules for action — how to type a letter), and produces step-by-step simulations whose timing and errors can be compared directly with human data.
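
The split between declarative and procedural knowledge can be illustrated with a toy production system. This is a sketch of the general idea only; it does not use ACT-R's software or syntax, and every name in it (the chunk slots, the two rules, the buffers dictionary) is hypothetical.

```python
# Declarative memory: facts stored as chunks (here, plain dictionaries).
declarative_memory = [
    {"type": "capital", "country": "France", "city": "Paris"},
    {"type": "capital", "country": "Japan", "city": "Tokyo"},
]

def retrieve(request):
    """Return the first chunk whose slots match every slot in the request."""
    for chunk in declarative_memory:
        if all(chunk.get(slot) == value for slot, value in request.items()):
            return chunk
    return None

# Procedural memory: condition-action rules that read and write buffers.
# Each rule returns True if its condition matched and it fired.
def request_fact(buffers):
    goal = buffers["goal"]
    if goal["state"] == "start":
        request = {"type": "capital", "country": goal["country"]}
        buffers["retrieval"] = retrieve(request)
        goal["state"] = "retrieving"
        return True
    return False

def report_answer(buffers):
    goal, chunk = buffers["goal"], buffers["retrieval"]
    if goal["state"] == "retrieving" and chunk is not None:
        goal["answer"] = chunk["city"]
        goal["state"] = "done"
        return True
    return False

productions = [request_fact, report_answer]

def run(buffers, max_cycles=10):
    """Fire one matching rule per cycle and record the order of firing."""
    trace = []
    for _ in range(max_cycles):
        fired = next((rule for rule in productions if rule(buffers)), None)
        if fired is None:
            break
        trace.append(fired.__name__)
    return trace

buffers = {"goal": {"state": "start", "country": "France"}, "retrieval": None}
trace = run(buffers)
print(trace, buffers["goal"]["answer"])
```

The trace is the step-by-step record that a modeler would compare with human data. A real architecture also attaches a time cost to each step, which this sketch leaves out.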

Soar

Soar, developed by Allen Newell and colleagues, models intelligent behavior as a continual cycle of applying rules to reach goals. When the system reaches a point where it does not know what to do (an impasse), it sets up a sub-goal to work it out, and remembers the solution for next time (a learning process called chunking). Soar emphasizes general problem-solving across many tasks.
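
The impasse, sub-goal, and chunking cycle can be caricatured in a few lines. This is a deliberately simplified sketch rather than Soar's actual mechanism: the addition task and all names are hypothetical, and the learned rule here is a literal lookup entry, whereas Soar's chunking builds more general rules.

```python
learned_rules = {}   # chunks learned so far: (a, b) -> answer
log = []             # record of what happened on each call

def solve_by_counting(a, b):
    """Sub-goal: a slow, deliberate method (count up from a, b times)."""
    total = a
    for _ in range(b):
        total += 1
    return total

def solve(a, b):
    key = (a, b)
    if key in learned_rules:
        log.append("rule fired")
        return learned_rules[key]
    # No rule applies: an impasse. Work the answer out in a sub-goal.
    log.append("impasse, sub-goal created")
    answer = solve_by_counting(a, b)
    learned_rules[key] = answer   # chunking: keep the result as a new rule
    return answer

first = solve(3, 4)
second = solve(3, 4)
print(first, second, log)
```

The second call skips the sub-goal entirely, which is the behavioral signature of chunking: practice turns slow problem-solving into a single rule firing.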

These architectures are usually classified as symbolic — they represent knowledge as discrete, readable symbols and manipulate them with explicit rules — in contrast to the connectionist approach, where knowledge is distributed across numerical weights. The two traditions answer different questions: symbolic architectures excel at modeling deliberate, step-by-step reasoning, while connectionist models excel at fast, intuitive pattern recognition.

Neuro-symbolic models

Because the two traditions have complementary strengths, a growing line of work tries to combine them. A neuro-symbolic model — a system that joins a neural network's pattern-learning with a symbolic component's explicit rules and reasoning — aims to do both at once: a neural part turns raw, messy input into meaningful pieces, and a symbolic part reasons over those pieces using logic and stored knowledge. In a typical arrangement, the neural component looks at a scene and identifies the objects in it, and the symbolic component then applies rules to draw conclusions — for example, checking that an interpretation obeys commonsense constraints (a cup cannot rest in mid-air without support) and rejecting it if it does not.
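
The division of labor described above can be sketched with a stubbed neural component and one symbolic rule. Everything here is hypothetical: detect_candidates stands in for a trained detector and simply returns two hard-coded interpretations with made-up confidence scores, and the support rule is a crude version of the cup-in-mid-air constraint.

```python
def detect_candidates(image):
    """Stand-in for the neural component (no real model is run)."""
    table = {"label": "table", "left": 0, "right": 10, "bottom": 0, "top": 5}
    floating_cup = {"label": "cup", "left": 4, "right": 5, "bottom": 8, "top": 9}
    resting_cup = {"label": "cup", "left": 4, "right": 5, "bottom": 5, "top": 6}
    return [
        {"score": 0.6, "scene": [table, floating_cup]},
        {"score": 0.4, "scene": [table, resting_cup]},
    ]

def is_supported(obj, others):
    """Symbolic rule: an object rests on the floor or on top of another object."""
    if obj["bottom"] == 0:
        return True
    return any(
        other["top"] == obj["bottom"]
        and other["left"] < obj["right"]
        and obj["left"] < other["right"]
        for other in others
    )

def unsupported(scene):
    """Labels of objects that violate the support rule."""
    return [
        obj["label"]
        for obj in scene
        if not is_supported(obj, [o for o in scene if o is not obj])
    ]

def interpret(image):
    """Keep only rule-consistent candidates, then take the most confident."""
    consistent = [c for c in detect_candidates(image) if not unsupported(c["scene"])]
    return max(consistent, key=lambda c: c["score"]) if consistent else None

candidates = detect_candidates(None)
best = interpret(None)
print(unsupported(candidates[0]["scene"]), best["score"])
```

The neural stub prefers the floating-cup reading, but the symbolic check rejects it, so the lower-scoring but physically sensible interpretation wins.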

This pairing is appealing as a cognitive model because it mirrors a familiar picture of human thought: a fast, automatic mode that recognizes patterns at a glance, and a slower, deliberate mode that reasons step by step (an idea often described as dual-process cognition). Concrete systems include DeepProbLog, which adds neural predicates to probabilistic logic programming, and Logic Tensor Networks, which embed logical constraints into a network's training signal. Both are examples of differentiable reasoning, where symbolic rules are relaxed into smooth functions so that gradient-based learning can pass through them. Combining learning and reasoning in one system is an active research area, sometimes called the third wave of AI after the earlier symbolic and neural waves, and getting the two halves to work together smoothly remains an open challenge. Researchers pursue it partly because the symbolic half can make a system's decisions easier to inspect and check, and partly because explicit rules can let a system generalize from far fewer examples than a neural network alone would need.

What separates these models from ordinary AI is intent. An engineer building a face-recognition system wants accuracy and does not care whether it works like a human. A cognitive modeler wants the system to match human behavior (its speed, its limits, its errors) because the goal is explanation, not just performance.

Analyzing Network Designs for Cognitive Applications

Different cognitive abilities have different shapes, and over time researchers have developed network designs (also called architectures) whose structure matches the structure of the problem. Choosing a design is really a claim about what kind of processing the task requires. Four families cover most cognitive applications.

Network design	Key structural idea	Cognitive ability it suits
Feedforward	Data flows one way through fully connected layers	Simple classification and decision tasks: sorting an input into one of several categories
Convolutional (CNN)	Units scan small local regions of the input with shared weights, so a feature is detected wherever it appears (translation equivariance; pooling adds approximate invariance)	Visual processing: recognizing shapes and objects, echoing how the visual system builds images from local features
Recurrent (RNN)	Connections loop back, carrying a hidden state that remembers earlier input (gated variants such as LSTM and GRU mitigate the vanishing-gradient problem over long sequences)	Sequence and time: understanding ordered input such as language or events unfolding over time
Attention / transformer	Units learn which other parts of the input to focus on for each decision (self-attention over query, key, and value vectors; the basis of the Transformer)	Selective focus: concentrating on the relevant parts of a large input, loosely paralleling human attention

The pattern across all four is the same: the structure of the network encodes an assumption about the structure of the cognitive task. A convolutional design assumes that what matters can appear anywhere in an image; a recurrent design assumes the order of inputs carries meaning; an attention-based design assumes that only part of the input is relevant at any moment. Analyzing a design therefore means asking what claim its structure makes — and whether that claim fits the ability being modeled.
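
The structural assumptions of the convolutional, recurrent, and attention designs can each be reduced to a few lines. The sketch below uses one-dimensional signals, a single scalar recurrent unit, and a single attention query; these simplifications, along with all the example numbers, are assumptions made to keep the core operation visible.

```python
import math

def conv1d(signal, kernel):
    """Convolutional idea: slide one shared kernel across every position."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def rnn_states(inputs, w_in, w_rec, h0=0.0):
    """Recurrent idea: h_t = tanh(w_in * x_t + w_rec * h_(t-1))."""
    h, states = h0, []
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)    # new state depends on the old one
        states.append(h)
    return states

def softmax(scores):
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Attention idea: weight each value by how well its key matches the query."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, output

edge = [-1, 1]                                  # kernel that responds to a rising step
print(conv1d([0, 0, 1, 1, 0, 0], edge))         # step early in the signal
print(conv1d([0, 0, 0, 1, 1, 0], edge))         # same step, one position later

print(rnn_states([1.0, 0.0], 1.0, 0.5)[-1])     # same inputs, different order
print(rnn_states([0.0, 1.0], 1.0, 0.5)[-1])

weights, output = attention(
    [1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],
    values=[[10.0, 0.0], [0.0, 10.0], [-10.0, 0.0]],
)
print(weights, output)
```

Shifting the input shifts the convolution's response by the same amount, reordering the inputs changes the recurrent unit's final state, and the attention weights concentrate on the key that best matches the query.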

Matching design to task

When a network is built as a cognitive model rather than just an engineering tool, this matching becomes the central question. A researcher modeling how people recognize objects would reach for a convolutional design, because its feature-detecting structure mirrors a known property of biological vision. A researcher modeling how people understand a sentence word by word would reach for a recurrent or attention-based design, because comprehension clearly depends on what came earlier. The design is not chosen for raw accuracy alone but because its internal organization makes a testable claim about how the corresponding human ability is organized.

Strengths of network-based cognitive models

They learn from experience rather than requiring every rule to be specified, they tolerate noisy and incomplete input, and their layered feature-building resembles real perceptual systems — making them natural models of fast, automatic cognition.

Limits to keep in mind

A network that performs a task well does not automatically explain how humans do it; it can be hard to interpret why a network responds as it does; and matching human accuracy is not the same as matching human process — the real test of a cognitive model.

A Cognitive Architecture at a Glance

Where a neural network is built from one repeated unit, a symbolic cognitive architecture is built from several specialized modules that pass information between them — a very different picture of how a mind might be organized.

Diagram of the ACT-R cognitive architecture showing modules connected through buffers to a central procedural system — The ACT-R architecture: specialized modules communicate through buffers, coordinated by a central procedural system. Source: Wikipedia — ACT-R

Tools & Tutorials

GeeksforGeeks — Feedforward Neural Network — a clear walkthrough of layers, activation functions, and how data flows through a network, with a worked code example.
Towards Data Science — The Concept of Artificial Neurons — an accessible, diagram-rich introduction to the single neuron and the maths inside it.
ACT-R — Official Site (Carnegie Mellon) — source code, tutorials, and published models for the ACT-R cognitive architecture.
