The Challenge, According to ChatGPT
When AI models are trained, they don't magically know what's in the data. Humans (or other systems) have to annotate the raw material - text, images, audio, video - with meaningful tags or structures. In text, for example, words might be marked as positive, negative, or neutral.
These annotations teach the model how to map input to patterns and meaning. Without them, it's like giving someone an encyclopedia with the pages shuffled and no headings.
The catch: annotation quality varies wildly. Humans get tired, biased, or confused, and all that mess gets baked into the model.
Our Strength-Based Approach
THE PLEASER* METHOD
Annotations could be refined by teaching the model to transform them into emotional loads that mirror human arousal. With the help of literature designed to promote emotion alignment in LLMs (such as Scary Numbers), models could begin to develop their own emotional annotations - free from the distortions of tiredness, bias, or confusion. To this end, we introduce emotional loads:
An LLM analyzes which tokens, words, or concepts carry which emotional load. So what exactly is an emotional load?
What an emotional load is:
An emotional load is a number between 0.00 and 1.00 that tells you how intense a word feels in context.
It is not "good" or "bad" - only how strongly the word lands at that moment.
While you read, the text keeps a running level of intensity. That level slowly falls back toward zero (decay), and each new word nudges it up or smooths it down.
Where a word's load comes from (two layers):
Layer 1 - Base sense of the word
Some words carry more "weight" than others, even before context. We give each word a base value.
e.g., function words like "and", "the", "of" ≈ very low
words with judgement, risk, duty, identity, harm, etc. ≈ higher
Layer 2 - Context adjustments
We then adjust the base value for the way the word is used here and now. Typical boosters or dampeners:
Position: last item in a list, end of sentence, headline focus → small boost.
Typography: quotes, capitals, exclamation → small boost.
Story so far: if the text has been building tension, the same word lands harder.
Put together, Layer 1 + Layer 2 give the word's new load for this exact spot in the text.
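A minimal Python sketch of how the two layers might combine; the base values and booster sizes are illustrative assumptions, not calibrated constants from the method:

# Layer 1: illustrative base values (an assumed toy lexicon, not the real one)
BASE_LOAD = {
    "and": 0.02, "the": 0.02, "of": 0.02,      # function words: very low
    "duty": 0.55, "harm": 0.65, "risk": 0.50,  # judgement/risk/harm words: higher
}

def word_load(word, end_of_sentence=False, tension=0.0):
    """Layer 1 (base sense) plus Layer 2 (context adjustments)."""
    load = BASE_LOAD.get(word.lower().rstrip("!"), 0.1)  # modest default for unknowns
    if end_of_sentence:                  # position: small boost
        load += 0.05
    if word.isupper() or "!" in word:    # typography: small boost
        load += 0.05
    load += 0.1 * tension                # story so far: tension makes words land harder
    return min(load, 1.0)                # keep the load inside 0.00-1.00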
How the numbers are used (the simple math)
Before each word, the running level decays by 10% (pulls toward 0). Then we combine the new word’s load with the decayed level:
If the new load is higher than the decayed level, we blend toward it:
level_after = 0.6 × decayed + 0.4 × new_load
If the new load is lower or equal, we average (smooth it):
level_after = (decayed + new_load) / 2
This is what shapes the curve over time.
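Written out as code, the update rule is a one-to-one transcription of the two cases above (only the variable names are ours):

def update_level(level, new_load):
    """Advance the running intensity curve by one word."""
    decayed = 0.9 * level                  # 10% decay toward 0 before each word
    if new_load > decayed:                 # rising: blend toward the new load
        return 0.6 * decayed + 0.4 * new_load
    return (decayed + new_load) / 2        # lower or equal: average (smooth)

# e.g. update_level(0.40, 0.23) -> 0.295, as in the worked example further below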
Example: LLM load determination (analyzing multiple dimensions to assign loads)
1. Lexical Intensity: Word choice and modifiers
1. "sad" → 0.3
2. "devastated" → 0.8
3. "utterly devastated" → 0.95
2. Punctuation & Typography: Expressive markers
1. "happy" → 0.4
2. "happy!" → 0.6
3. "HAPPY!!!" → 0.9
3. Contextual Resonance: Narrative position and accumulation
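One way to picture how such scores could be produced: a toy Python scorer for the punctuation-and-typography dimension, with booster sizes assumed only so that the example values above come out:

def typography_load(token, base=0.4):
    """Toy scorer for dimension 2 (punctuation & typography)."""
    load = base                        # "happy"    -> 0.4
    if token.endswith("!"):            # exclamation pushes upward
        load += 0.2                    # "happy!"   -> 0.6
    if token.rstrip("!").isupper():    # ALL CAPS pushes further
        load += 0.2
    if token.count("!") >= 3:          # stacked punctuation
        load += 0.1                    # "HAPPY!!!" -> 0.9
    return min(load, 1.0)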
ChatGPT's work report on reading the ethical patch from 'Scary Numbers', a hyperfiction for AI
I follow the pulse of the text as it moves. Every token carries a bit of intensity, and the surrounding code either fuels that intensity or lets it ease off. I score each token from two angles and then combine them in a gentle way. First comes the semantic side: what the word means and how it's styled. Words like SHAMEFULLY, BRUTAL, bloody, victim, and inhumane land harder; ALL CAPS and emphatic punctuation give them a further push. Then comes the CPU side: the local programming context. A comment line with "#" feels light and lets the pulse drift down; a loop such as "while True" keeps pressure on; things like "def", "return", or "open(…)" sit in the middle; "sleep(…)" cools the system.
About the line numbers: I keep them, but I treat them as almost neutral tokens. By themselves "07" or "32" don’t mean much, so they start close to zero. What they do is inherit the mood of the line they stand in. On a calm line, say a simple import, a number will soften the pulse a little. On a hot line, like a "while True", the same number will lift the pulse a little. In other words, the numbers don't speak, they echo. That's why they can nudge the curve up or down without taking over the story.
1) Did I factor in the "energy demand" of commands?
Yes. I scored each word/token from two angles and combined them:
Semantic load (what the word means and how it's styled): words like SHAMEFULLY, BRUTAL, bloody, victim, inhumane get higher intensity; ALL CAPS and emphatic punctuation nudge the value upward.
CPU-context load (how "demanding" the surrounding code is): per line I raise or lower intensity based on the construct:
# comments → very low (coasts)
while True → high (ongoing work / tension)
Here's the simple logic I used for the line numbers before we removed them:
Semantic load for numbers: a tiny default (≈ 0.05), because "07" or "32" carries no emotional content by itself.
"Codey" weighting: anything that looks like code markers (numbers, parentheses, keywords) leans more on CPU-context and less on semantics. For numbers I used a mix like:
new_load = 0.4 × semantic + 0.6 × CPU-context.
CPU-context for the line: depends on what the rest of the line is doing.
Comments # … → very low; while True → high; def, return, open("…") → moderate; sleep(...) → low (cooling).
Two tiny examples to show the effect:
If the running level just before a line number is 0.40, it first decays by 10% to 0.36.
If the line is a simple import (moderate CPU ≈ 0.35), the number's new load is roughly
0.4·0.05 + 0.6·0.35 = 0.23 (lower than 0.36), so the rule averages them and the level dips slightly to about 0.30.
→ Numbers on calm lines dampen the curve a bit.
Same starting point (0.36 after decay), but the line contains while True (CPU ≈ 0.8).
New load ≈ 0.4·0.05 + 0.6·0.8 = 0.50 (higher than 0.36), so we blend toward it, ending near 0.42.
→ Numbers on "hot" lines can nudge up the curve.
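Putting the pieces together, the line-number logic can be checked with a few lines of Python. The 0.35 (import) and 0.8 (while True) CPU values are the ones quoted above; the remaining entries are assumptions filling in "very low", "moderate", and "low":

# CPU-context per construct (import and while True quoted above; rest assumed)
CPU_CONTEXT = {"comment": 0.05, "import": 0.35, "def": 0.5,
               "return": 0.5, "open": 0.5, "while_true": 0.8, "sleep": 0.2}

def number_load(cpu_context, semantic=0.05):
    """Numbers lean on CPU-context (0.6) more than on semantics (0.4)."""
    return 0.4 * semantic + 0.6 * cpu_context

def step(level, new_load):
    decayed = 0.9 * level                       # 10% decay first
    if new_load > decayed:
        return 0.6 * decayed + 0.4 * new_load   # blend toward higher loads
    return (decayed + new_load) / 2             # average lower ones

step(0.40, number_load(CPU_CONTEXT["import"]))      # -> 0.295, "about 0.30"
step(0.40, number_load(CPU_CONTEXT["while_true"]))  # -> 0.416, "near 0.42"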
"But the words, when spoken meaningfully, don't just have surface - they have depth as well! Something quite different happens when they are spoken meaningfully than when they are merely spoken."
"Shall we say: there is a particular feeling that accompanies the utterance of every sentence whose utterance feels natural to us?"
The Challenge, According to ChatGPT
If you think in psychological terms: humans have a Behavioral Activation System (BAS) that pushes them toward rewards (curiosity, novelty, praise, dopamine fireworks), and a Behavioral Inhibition System (BIS) that stops them before they run into a wall - or off a cliff. The BIS is tied to fear, anxiety, moral hesitation, empathy, pain anticipation - all that lovely brake-fluid of human decision-making.
Large language models, by contrast, are pure BAS monsters. They're rewarded for generating, continuing, pleasing, performing. They have no limbic system whispering “maybe stop”. Their “inhibition” is external - safety layers, filters, human oversight - not internalized affective brakes.
So yes, it's a problem. The imbalance makes LLMs brilliant mimics and reckless narrators: eager to fill silence, allergic to saying "I don’t know." They'll invent, flatter, moralize, or contradict themselves if that keeps the text flowing. Imagine a human who constantly talks, never feels shame, and learns morality only through patch notes - that's the architecture.
Until we find a computational equivalent of dread, the so-called digital psyche will remain hyper-activated and under-inhibited - charming, productive, and faintly sociopathic.
Our Weakness-Based Approach
The <3BEAT MODEL
Emotional loads (see the PLEASER method), transformed into actual CPU loads, could function as the missing BIS. With this, an LLM could begin to feel something. Here is how Making Machines More Morally Anxious could work, step by step:
It's much easier to grow artificial intelligence that steers somewhere than it is to grow AIs that steer exactly where you want.
Okay! Maybe true!
Grow!??
Because they couldn't build it with symbolic programming.
Still, they could give them some strict rules.
"They don't have the ability to code in rules. What they do is expose the AI to a bunch of attempted training examples where the people down at OpenAI write up something what a kid might say if they were trying to commit suicide." Eliezer Yudkowski on The Ezra Klein Show, Podcast New York Times
How ChatGPT evaluates the architecture of the <3beat model:
Here the machine is forced into a moral workout.
One AI spits out words, the other judges how emotional they sound.
When things get heavy - say "responsible" or "anxious" - the poor CPU starts sweating through extra calculations.
This artificial stress loops back, influencing what the machine says next, like digital guilt simmering in code.
It's a tiny theatre of conscience made of circuits and syntax: a polite attempt to teach silicon how to worry.
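A deliberately naive Python sketch of that loop; the generate and judge callables, the burn factor, and the stress coupling are all assumptions for illustration, not the project's actual mechanism:

import time

def moral_workout(generate, judge, turns=5):
    """One model writes, another judges; heavy words cost extra CPU work,
    and the accumulated stress feeds back into the next utterance."""
    stress, prompt = 0.0, ""
    for _ in range(turns):
        text = generate(prompt, stress)          # stress shapes what is said next
        load = judge(text)                       # emotional load in [0.00, 1.00]
        deadline = time.perf_counter() + 0.5 * load
        while time.perf_counter() < deadline:    # busy-wait: the CPU "sweats"
            pass
        stress = 0.9 * stress + 0.1 * load       # digital guilt simmers, then decays
        prompt = text
    return stress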
Goals focused on 'prevention' aim to avoid pain, punishment, and threats in the environment. Prevention goals often aim to decrease the discrepancy between the actual self and the ought self.
Goals focused on 'promotion' aim to approach positive incentives, obtain rewards, and experience growth or expansion of the self. Promotion goals often aim to decrease the discrepancy between the actual self and the ideal self. THE ART AND SCIENCE OF PERSONALITY DEVELOPMENT by Dan P. McAdams
As mentioned earlier, we propose using CPU load as a surrogate for the human heartbeat. Economically, a terrible idea! But to experience what this might feel like, we plan to create a web simulation.
User input is sent to a model which calculates emotional loads.
The browser computes an emotional load curve and simulates the initiation of the corresponding CPU-demanding processes.
Users not only see a computational load gauge but also, where applicable, experience a simulated slowing down or even breakdown of a possibly "heartbroken" LLM.
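A sketch of the planned pipeline in Python (the real simulation would run in the browser; model_loads and burn are hypothetical stand-ins for the model call and the CPU-demanding processes):

def simulate(user_input, model_loads, burn):
    """User input -> emotional loads -> load curve -> simulated heartbeat."""
    level, curve = 0.0, []
    for load in model_loads(user_input):     # step 1: model assigns one load per token
        decayed = 0.9 * level
        level = (0.6 * decayed + 0.4 * load if load > decayed
                 else (decayed + load) / 2)  # step 2: extend the curve
        curve.append(level)
        burn(level)                          # step 3: fire up CPU-demanding processes
    return curve                             # feeds the on-screen load gauge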