Physics Concepts Every AI Engineer Should Know

Physics for AI Engineers
AI systems are not magic. They are moving distributions, feedback loops, lossy memories, and energy-hungry computation wrapped in language.
That is why physics keeps showing up when I think about agents. The language of entropy, velocity, phase change, and attractors gives us a sharper way to reason about systems that learn from interaction.
This is especially true for memory-driven systems. Once an agent can carry state across time, the core question changes from "can it answer?" to "what does it become after many updates?"
Entropy: The Cost of Uncertainty
Shannon entropy measures uncertainty:
In a model, entropy appears in loss functions, calibration, routing, memory retrieval, and uncertainty estimates. In an agent, entropy also shows up as hesitation: too many plausible actions, too little grounding.
Good systems do not merely reduce entropy. They reduce it in the right direction. A confident wrong answer is not intelligence. It is collapse without understanding.
import numpy as np
def entropy(p):
p = np.asarray(p)
return -np.sum(p * np.log2(p + 1e-12))
print(entropy([0.5, 0.5])) # high uncertainty
print(entropy([0.99, 0.01])) # low uncertainty
For AI products, this matters because uncertainty should shape the interface. The system should know when to ask, when to act, and when to stay still.
Drift: Motion Through Identity Space
Every long-running agent has an identity vector, whether we measure it or not. It has a shape: goals, constraints, tone, values, permissions, risk posture, and memory.
When that shape moves too quickly or in the wrong direction, we call it drift.
The important part is not that drift exists. The important part is whether we can measure velocity:
def drift_velocity(previous_vector, current_vector, dt):
displacement = current_vector - previous_vector
return np.linalg.norm(displacement) / max(dt, 1e-9)
This is one reason I care about custom velocity-driven measurements. If an agent can learn intent, we need to know how fast it is updating and whether those updates are converging.
Attractors: Why Intent Needs Shape
In dynamical systems, an attractor is a state a system tends to move toward. In agent design, intent can act like an attractor.
A weakly measured agent wanders. A well-measured agent can converge.
That is the intuition behind small agents that learn any intent in a handful of steps. The trick is not to make the agent huge. The trick is to build a feedback surface where the right signal is amplified and the wrong motion is damped.
Memory: State With Consequences
Memory is not a database bolted onto a chatbot. Memory changes the system.
When a model has memory, it gains continuity. Continuity creates agency-like behavior. Agency increases both usefulness and risk.
So memory systems need safety boundaries:
- Memory should be inspectable.
- Memory should be revocable.
- Memory should distinguish fact, preference, instruction, and hypothesis.
- Memory should track source and confidence.
- Memory should support alignment checks over time.
An LLM with memory is closer to an organism than a calculator. That does not mean we should mystify it. It means we should instrument it.
Compute Is Still Physics
The industry still runs on compute. Bigger clusters matter. More data matters. Better training runs matter.
But at the product layer, the bottleneck is often not raw intelligence. It is trust. Can the system preserve identity? Can it explain why it acted? Can it refuse the wrong update? Can it remember without becoming captured by noise?
Those questions are physics-flavored because they are about motion, stability, and conservation.
The Engineering Takeaway
The best AI engineers I know are comfortable moving between levels:
- Information theory for uncertainty.
- Dynamical systems for drift.
- Control theory for feedback.
- Distributed systems for memory and failure.
- Product judgment for when not to automate.
Physics does not replace machine learning. It gives us a language for the parts that ML libraries do not name clearly enough.
Questions I Keep Returning To
- What is the system's identity vector?
- How fast is it moving?
- Which memories are pulling it?
- What is the convergence signal?
- What safety boundary stops the wrong attractor?
That is where the interesting work begins.