Aatif Ahmad
IIT Jodhpur
Course: Cognitive Science - Fall 2025
Why do some sentences make our brains ache?
This presentation explores Dependency Locality Theory (DLT), which argues that processing difficulty increases when related words are far apart.
We'll see how this principle connects human cognition to the behavior of Large Language Models (LLMs), suggesting efficiency is a universal rule of thinking.
Why do some sentences tire the brain?
Have you ever read a long sentence and realized you forgot how it began? That's your brain's short-term memory protesting.
DLT suggests a simple answer:
The farther apart related words are in a sentence, the harder the sentence becomes to process.
Our minds rely on a limited short-term memory buffer: when related words are far apart, the brain must hold incomplete pieces for longer, which increases effort.
This sentence feels relatively easy to process:
The reporter who attacked the senator admitted the error.
Why? The subject "who" (referring to "the reporter") sits right next to its verb, "attacked". The dependency is local.
This sentence is correct, but feels "heavier":
The reporter who the senator attacked admitted the error.
Why? The dependency is non-local.
In the "hard" sentence:
"The reporter who the senator attacked..."
This "holding" action is what increases the mental effort.
Linguists proposed two main types of mental effort that increase with distance:
Integration Cost:
How hard is it to connect a new word (like a verb) to its related subject or object from earlier in the sentence?
Longer distance = Higher integration cost.
(It's harder to "plug in" the new word.)
Memory Cost:
How many incomplete dependencies (or "open expectations") must your brain hold onto at one time?
More incomplete parts = Higher memory cost.
(Your brain's "RAM" is getting full.)
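Memory cost can be made similarly countable: at each word, tally the dependencies that are "open", meaning one end has been read but the other has not yet arrived. A minimal sketch, again with hand-annotated arcs (a toy count, not Gibson's exact cost metric):

```python
# Toy "memory cost" tracker: at each word position, count dependencies
# that are still open (one endpoint already read, the other pending).

def open_dependency_counts(words, arcs):
    counts = []
    for i in range(len(words)):
        # an arc is open at i if it has started by position i
        # but does not close until after i
        open_now = sum(1 for a, b in arcs if min(a, b) <= i < max(a, b))
        counts.append((words[i], open_now))
    return counts

hard = "The reporter who the senator attacked admitted the error".split()
hard_arcs = [(5, 2), (5, 4), (6, 1)]  # attacked<-who, attacked<-senator, admitted<-reporter

for word, n in open_dependency_counts(hard, hard_arcs):
    print(f"{word:10s} open: {n}")
# the count peaks at 3 on "senator", right before "attacked" arrives
# and starts closing dependencies
```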
This is why "tongue-twister" sentences and complex legal text are so confusing:
"The administrator who the intern who the nurse supervised had bothered lost the reports."
By the time you reach "had bothered", your brain is juggling multiple open connections: "the administrator" is still waiting for its verb ("lost"), "the intern" is only now being linked to "had bothered", and the "nurse supervised" link has only just closed.
No wonder it feels confusing! (The tracker output below shows the pile-up.)
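Running the same open_dependency_counts tracker from the sketch above on the triple-embedded sentence shows the pile-up (arcs hand-annotated as before: each subject waits for its verb, and each "who" waits for the verb whose gap it fills):

```python
deep = ("The administrator who the intern who the nurse supervised "
        "had bothered lost the reports").split()
deep_arcs = [(11, 1),  # lost <- administrator   (main subject-verb)
             (10, 2),  # bothered <- who         (gap for "administrator")
             (9, 4),   # had <- intern           (subject of "had bothered")
             (8, 5),   # supervised <- who       (gap for "intern")
             (8, 7)]   # supervised <- nurse     (subject of "supervised")

for word, n in open_dependency_counts(deep, deep_arcs):
    print(f"{word:13s} open: {n}")
# the count climbs to 5 open dependencies on "nurse" and only unwinds
# once the stacked verbs finally arrive
```

Holding five unresolved connections at once is exactly the "RAM is full" feeling described above.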
This preference for "local" connections isn't just an English quirk. It's a global human tendency.
DLT also explains *how* speakers structure conversations.
In Hindi conversations, speakers make it easier for the listener by following a "Discourse Context" principle:
Given (already-known) information appears earlier in the sentence.
("That friend of mine...")
New information comes later.
("...just bought a house.")
Large Language Models (like GPT) don't have brains or biological memory limits...
...yet they often show human-like behavior.
LLMs reflect our cognitive biases because they were trained on billions of human-written sentences.
They didn't "learn" DLT as a rule; they statistically absorbed our natural tendency to follow DLT principles for efficient communication.
So, while machines don't "struggle" with memory the way we do, their language generation still mirrors our struggle. One way to probe this is sketched below.
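A hedged probe: compare the total surprisal a small causal language model assigns to the local and non-local versions of the reporter sentence. A minimal sketch, assuming the Hugging Face transformers library and GPT-2 weights; whether the non-local version actually scores higher is an empirical question, not a guarantee:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# assumption: GPT-2 is used only as a convenient small causal LM;
# any similar model would serve for this probe
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def total_surprisal(text):
    """Total negative log-likelihood (in nats) the model assigns to text."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)            # .loss = mean NLL per predicted token
    return out.loss.item() * (ids.size(1) - 1)  # scale back up to a total

local_sent = "The reporter who attacked the senator admitted the error."
nonlocal_sent = "The reporter who the senator attacked admitted the error."
print("local:    ", total_surprisal(local_sent))
print("non-local:", total_surprisal(nonlocal_sent))
```

If DLT-like preferences really were absorbed from the training data, the non-local version should tend to receive higher surprisal; averaging over many such minimal pairs would make the probe far more trustworthy than a single sentence.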
A pattern emerges across both humans and AI:
Efficiency and locality are universal principles of thinking.
When we code, write, or teach, clarity comes from reducing unnecessary distance between related ideas.
DLT bridges cognitive science and artificial intelligence by showing that both thrive on locality.
Humans evolved this efficiency out of necessity (limited working memory).
LLMs inherited this efficiency from us (by learning from human data).
"Simplicity isn't a limitation - it's intelligence in its most elegant form."
This field of study opens several avenues for further exploration: