ESSAY · 13 June 2026 · 8 min read

Four Letters, Zero Data: the case for quantifying the pupillary light reflex

In 2019 I wrote that the doctor's bedside instruments belong in a museum. Pupilux is one of that essay's predictions, now built. This is the case for measuring a reflex that has been judged by eye for over a century.

At the end of 2019, in the second part of a two-essay series, I wrote a sentence I have had cause to remember since. The tools a doctor uses to examine a patient's body, I argued, "are at least a century old, if not more." I named them by their dates — the stethoscope (1816), the sphygmomanometer (1881), the knee hammer (1888), the ophthalmoscope (1857), the thermometer (1714) — and called them "antiquated relics that should be sent to museums quickly." Then I made a prediction. "Image recording and recognition systems (e.g., the phone camera — duh!) will get empowered with backend AI algorithms" that help a doctor examine a patient. I called it "an invitation for some start-ups to invade into the doctors' office."

Six years later I have built one of them. This essay is about the test it measures, which is older than any instrument on that list — and about why a reflex that has guided physicians for two centuries is still, in most of the world, recorded in four letters and no data.

Pupilux — quantitative bilateral pupillometry from an ordinary iPhone

The oldest signal at the bedside

The pupillary light reflex is among the first things a clinician learns and among the last things they stop trusting. Shine a light into one eye; both pupils should constrict, briskly and equally. The arc of that response runs through the optic nerve, the midbrain, and the third cranial nerve — a circuit that passes directly beside the parts of the brain that fail first when pressure rises inside the skull. When a head-injured patient is deteriorating, the pupil often knows before anything else does. A blown, fixed pupil is one of the few bedside findings that can change a decision in the next sixty seconds.

So it is strange how we record it. A nurse or physician brings a penlight to the bedside, shines it, watches, and writes a word: reactive, or sluggish, or fixed. Often it is only the shorthand PERL, for pupils equal and reactive to light. Four letters. No number, no trace, no way for the next clinician three hours later to know whether the pupil is the same as it was or quietly getting worse. The most consequential neurological sign we have is documented with less precision than we would accept for a blood pressure.

The eye is a poor pupillometer

It would be one thing if the human eye were good at this. It is not, and the literature is not kind. When clinicians' penlight assessments are checked against objective measurement, the agreement collapses at exactly the sizes that matter. Couret and colleagues, in a 2016 Critical Care study of 406 paired measurements, found that half of all anisocoria (a difference in size between the two pupils, often the first warning of a mass effect) was simply missed by eye, with a 39% error rate for small pupils. Olson and colleagues, examining 2,329 assessments the same year, reported a 67% false-negative rate for non-reactivity and an inter-rater agreement (kappa 0.40) that conventional benchmarks rate as fair at best. Kerr's group documented the same systematic pattern: sizes underestimated, asymmetries and reactivity misjudged.
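To make the kappa figure concrete, here is a minimal Python sketch of how Cohen's kappa and a false-negative rate are computed from paired labels. The ten label pairs are invented for illustration and are not data from any of the studies cited; `device` and `penlight` are hypothetical names. The point is only that raw agreement can look respectable while chance-corrected agreement does not.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected if both raters labelled at random with their own base rates.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def false_negative_rate(reference, observed, positive):
    """Share of reference-positive cases that the observer did not label positive."""
    seen = [o for r, o in zip(reference, observed) if r == positive]
    if not seen:
        raise ValueError("no positive cases in the reference labels")
    return sum(o != positive for o in seen) / len(seen)


# Invented example: ten pupils, three of them non-reactive on the reference device.
device = ["fixed"] * 3 + ["reactive"] * 7
penlight = ["fixed", "reactive", "reactive"] + ["reactive"] * 7

raw_agreement = sum(d == p for d, p in zip(device, penlight)) / len(device)
kappa = cohens_kappa(device, penlight)
fnr = false_negative_rate(device, penlight, positive="fixed")

print(f"raw agreement: {raw_agreement:.2f}")   # 0.80
print(f"kappa:         {kappa:.2f}")           # 0.41
print(f"missed fixed:  {fnr:.2f}")             # 0.67
```

In this toy set the two raters agree on eight pupils out of ten, yet two of the three non-reactive pupils are missed and kappa lands near 0.4, because most of the agreement comes from the easy, common label.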

None of this is a comment on the people doing the examining. It is a comment on the task. Estimating the diameter of a three-millimetre disc as it contracts over a third of a second, under variable ambient light, and then comparing it from memory, is not something a human visual system does reliably. We have been asking eyes to do a measuring instrument's job.

Why the imprecision is expensive

The reason this matters is that the signal, when you capture it properly, carries enormous prognostic weight. In severe traumatic brain injury, the combination of a low Glasgow Coma Scale score with bilateral fixed, dilated pupils has been associated with mortality approaching 100%, whereas reactive pupils at the same coma score leave the majority of patients surviving (Tien, J Trauma 2006). Bilateral mydriasis has been reported to carry an odds ratio above eleven for death (Martins, J Trauma 2009). After cardiac arrest, a quantified pupillary reactivity index at or below a threshold value has shown near-total specificity for poor neurological outcome (Oddo, Intensive Care Medicine 2018). The guidelines have caught up with this: recent statements from NINDS on TBI classification, from the AHA on comatose post-arrest survivors, and from the European resuscitation and intensive-care societies all now ask that pupillary reactivity be documented and, increasingly, quantified rather than merely described.

There is a gap, then, between what the signal is worth and how it is captured. Quantified pupillometry, meaning a dedicated infrared device that measures the reflex and returns a number, closes that gap, and where it is deployed it has changed practice. But it is a separate instrument, with a per-unit cost and disposables, and it lives in the units that can afford one per ward, if that. In most emergency rooms, on most night shifts, in most of the world, the pupillometer is the penlight and the four-letter note.

The ninety-five-minute window

I built Pupilux for that majority, and the Indian emergency room is the clearest case. India sees on the order of 2.2 million traumatic brain injuries a year, most of them mild on arrival — which is exactly the population in which a deteriorating pupil is the cheap early warning. Yet the IMPETUS collaborative, looking across 23 centres and more than 2,000 patients, found a median door-to-CT time of 95 minutes against a guideline target of 25. For an hour and a half, the patient most at risk of a "talk and die" trajectory is being watched without a single objective neurological number being recorded. A penlight and a word are, very often, all there is.

That is not a problem you solve by shipping more infrared pupillometers to under-resourced ERs. It is a problem you solve the way I guessed in 2019 — with the camera and torch that are already in the clinician's pocket.

The instrument is already in your hand

Pupilux performs a bilateral pupillary light reflex test from an ordinary iPhone in seven seconds. A voice-guided protocol runs the capture hands-free — a baseline, a calibrated torch flash, a recovery window — while the device locks onto both irises at once. All of the analysis runs on the phone's own neural engine; no image leaves the device. What returns is not a word but a two-page report: a pupillogram for each eye and six measured parameters per side — baseline diameter, constriction percentage, latency, maximum constriction velocity, average dilation velocity, and recovery time. A test costs about ten rupees, twelve American cents. There is no dedicated hardware and there is nothing to consume.
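For readers who want to see what those six parameters mean in practice, here is a minimal Python sketch that derives them from a pupillogram, that is, a series of diameter samples over time. It is not Pupilux's algorithm. The function name, the 10% onset rule for latency, the 75% recovery rule, and the one-second synthetic trace are all assumptions made for illustration, and timestamps are assumed to be strictly increasing.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PupilParameters:
    baseline_mm: float
    constriction_pct: float
    latency_s: Optional[float]
    max_constriction_velocity_mm_s: float
    avg_dilation_velocity_mm_s: float
    recovery_time_s: Optional[float]


def analyse_pupillogram(times_s: Sequence[float], diameters_mm: Sequence[float],
                        flash_s: float, onset_fraction: float = 0.10,
                        recovery_fraction: float = 0.75) -> PupilParameters:
    """Derive six reflex parameters from one eye's trace (illustrative definitions)."""
    pre = [d for t, d in zip(times_s, diameters_mm) if t < flash_s]
    post = [(t, d) for t, d in zip(times_s, diameters_mm) if t >= flash_s]
    if not pre or len(post) < 2:
        raise ValueError("need samples before the flash and at least two after it")

    baseline = sum(pre) / len(pre)                     # mean pre-flash diameter
    i_min = min(range(len(post)), key=lambda i: post[i][1])
    t_min, d_min = post[i_min]                         # point of maximum constriction
    amplitude = baseline - d_min

    latency = recovery = None
    if amplitude > 0:
        # Assumed rule: onset is the first sample 10% of the way down to the minimum.
        onset_level = baseline - onset_fraction * amplitude
        latency = next((t - flash_s for t, d in post[: i_min + 1] if d <= onset_level), None)
        # Assumed rule: recovery is the first sample back to 75% of the lost diameter.
        recovery_level = d_min + recovery_fraction * amplitude
        recovery = next((t - t_min for t, d in post[i_min + 1:] if d >= recovery_level), None)

    falling = post[: i_min + 1]
    slopes = [(d1 - d0) / (t1 - t0) for (t0, d0), (t1, d1) in zip(falling, falling[1:])]
    max_cv = max([0.0] + [-s for s in slopes])         # steepest fall, as a positive speed

    t_end, d_end = post[-1]
    avg_dv = (d_end - d_min) / (t_end - t_min) if t_end > t_min else 0.0

    return PupilParameters(baseline, 100.0 * amplitude / baseline,
                           latency, max_cv, avg_dv, recovery)


# Synthetic trace sampled every 0.1 s; the flash fires at 0.2 s.
times = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
diameters = [4.0, 4.0, 4.0, 4.0, 3.6, 3.0, 2.8, 3.0, 3.3, 3.6, 3.8]
params = analyse_pupillogram(times, diameters, flash_s=0.2)
print(params)
```

On this synthetic trace the sketch reports a 4.0 mm baseline, 30% constriction, and a 0.2 s latency. A real capture would be sampled far more densely and would need smoothing and blink rejection before any of these definitions could be trusted.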

The engineering question I cared about most was the only one that earns a clinician's trust: is the measurement true? Against a blinded reference, absolute pupil diameter came back with a mean error of about half a millimetre — comfortably inside the one-millimetre tolerance the American Association of Neuroscience Nurses sets for clinical pupillometry — and the system agreed with an expert grader on the direction of anisocoria down to asymmetries as small as a tenth of a millimetre. That is the difference between an interesting demonstration and an instrument.
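The two checks in that paragraph can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the diameters are invented, "mean error" is read as mean absolute error, and the 0.1 mm cut-off for calling two pupils unequal simply mirrors the tenth of a millimetre mentioned above. None of it is the validation protocol itself.

```python
def mean_absolute_error(measured_mm, reference_mm):
    """Average absolute difference between device and reference diameters."""
    if len(measured_mm) != len(reference_mm) or not measured_mm:
        raise ValueError("need two equal-length, non-empty lists")
    return sum(abs(m - r) for m, r in zip(measured_mm, reference_mm)) / len(measured_mm)


def anisocoria_direction(left_mm, right_mm, min_diff_mm=0.1):
    """Classify which pupil is larger; differences under min_diff_mm count as equal."""
    # Round to 0.01 mm so floating-point noise cannot flip a borderline case.
    diff = round(left_mm - right_mm, 2)
    if abs(diff) < min_diff_mm:
        return "equal"
    return "left larger" if diff > 0 else "right larger"


def direction_agreement(device_pairs, expert_pairs):
    """Fraction of (left, right) pairs on which device and expert give the same direction."""
    if len(device_pairs) != len(expert_pairs) or not device_pairs:
        raise ValueError("need two equal-length, non-empty lists of pairs")
    matches = sum(anisocoria_direction(*d) == anisocoria_direction(*e)
                  for d, e in zip(device_pairs, expert_pairs))
    return matches / len(device_pairs)


# Invented diameters in millimetres.
device = [3.4, 4.9, 2.6, 6.2]
reference = [3.0, 5.5, 2.0, 5.8]
mae = mean_absolute_error(device, reference)
within_tolerance = mae <= 1.0          # the one-millimetre tolerance quoted above

device_pairs = [(3.4, 3.1), (4.0, 4.0), (2.5, 2.9), (5.0, 5.05)]
expert_pairs = [(3.5, 3.0), (4.1, 4.1), (2.6, 2.8), (5.0, 5.2)]
agreement = direction_agreement(device_pairs, expert_pairs)

print(f"mean absolute error: {mae:.2f} mm, within tolerance: {within_tolerance}")
print(f"direction agreement: {agreement:.2f}")
```

Note that the direction check compares the two eyes within a single capture, so an error that shifts both diameters equally cancels out. That is one plausible reason a system can resolve asymmetries smaller than its absolute error.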

The discipline, not the noise

I want to be precise about what this is and is not, because the first lesson of the first essay in that 2019 series applies directly to me now. Pupilux is a measurement and screening tool. It is not a diagnostic device, it does not replace a CT scan or a clinician's judgement, and it is in alpha. The honest claim is narrow and, I think, sufficient: it turns a guess into a number, a number that can be timestamped, compared against the same patient an hour later, and handed to the next person on shift.
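As a small illustration of what "timestamped and compared" buys, here is a Python sketch that stores two readings and reports how they differ. The `Reading` record, its field names, and the example values are hypothetical and are not Pupilux's report format. The sketch deliberately reports changes only and applies no clinical thresholds, since interpreting them is the clinician's job.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Reading:
    """One bilateral measurement (hypothetical record layout)."""
    taken_at: datetime
    left_mm: float
    right_mm: float
    left_constriction_pct: float
    right_constriction_pct: float

    @property
    def anisocoria_mm(self) -> float:
        return abs(self.left_mm - self.right_mm)


def change_between(earlier: Reading, later: Reading) -> dict:
    """Describe how the second reading differs from the first."""
    if later.taken_at <= earlier.taken_at:
        raise ValueError("'later' must be taken after 'earlier'")
    return {
        "elapsed_min": (later.taken_at - earlier.taken_at).total_seconds() / 60,
        "anisocoria_change_mm": round(later.anisocoria_mm - earlier.anisocoria_mm, 2),
        "left_constriction_change_pct": round(
            later.left_constriction_pct - earlier.left_constriction_pct, 1),
        "right_constriction_change_pct": round(
            later.right_constriction_pct - earlier.right_constriction_pct, 1),
    }


# Invented readings taken an hour apart on the same patient.
first = Reading(datetime(2026, 6, 13, 22, 10), 3.2, 3.1, 32.0, 31.0)
second = Reading(datetime(2026, 6, 13, 23, 10), 4.1, 3.1, 14.0, 30.0)

delta = change_between(first, second)
for name, value in delta.items():
    print(f"{name}: {value}")
```

In this invented pair the left pupil has grown by 0.9 mm relative to the right and constricts 18 percentage points less than it did an hour earlier. A note reading "sluggish" at both times would not have shown that.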

The other lesson from 2019 was that medicine adopts slowly — fifteen to twenty years for a new tool to become routine — and that it adopts on rigour, not on noise. I believe that, and I would rather Pupilux earn its place that way than any other. But the deeper pattern that essay drew out is the one I keep returning to: the advances that change medicine usually come from outside it — from chemistry, from metallurgy, from electronics. The phone in a clinician's pocket is the most widely distributed sensor humanity has ever built. Pointing it at the oldest signal at the bedside, and finally writing down a number instead of four letters, is exactly the kind of borrowing from elsewhere that the last century of medicine was made of.

The penlight has had a good run. It can take its place in the museum, next to the ophthalmoscope.

Explore Pupilux at pupilux.ai.