EmpiriQal is the platform that changes both — giving scientists, policymakers, and anyone who reads a headline the means to understand what the evidence actually supports.
Over 3 million papers are published annually. No researcher can track what is known, let alone connect the dots to anticipate where the next breakthrough is most likely to emerge. Crucial insights lie buried in the scientific literature. Worse, countless informative findings go unpublished entirely — the file drawer problem — leaving the scientific record systematically incomplete and distorted.
EmpiriQal evaluates the reliability of scientific findings and forecasts the outcomes of experiments before they are run — across every field of science, for everyone who needs to know what to trust.
Rather than simply summarizing what a study claims, EmpiriQal constructs a space of alternative possibilities — all the outcomes that could plausibly have been reported given what science already knows. It then scores the likelihood of each, grounding every score in the evidence and references that support it.
The result is not just a prediction but an explanation: users see which possibilities are well-supported, which are surprising, and why — drawing on the broader scientific literature to illuminate the landscape of what is and is not established.
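As a rough illustration of scoring a space of alternatives, the sketch below normalizes raw scores for a few candidate outcomes into probabilities and measures how surprising a reported outcome would be. It is a minimal sketch under stated assumptions, not EmpiriQal's implementation: the outcome labels, the log-scores, and the function names `score_alternatives` and `surprisal_bits` are all hypothetical, and softmax normalization is simply one common way to turn scores into likelihoods.

```python
import math

def score_alternatives(log_scores):
    """Turn raw log-scores for candidate outcomes into probabilities (softmax)."""
    peak = max(log_scores.values())  # subtract the max for numerical stability
    weights = {name: math.exp(value - peak) for name, value in log_scores.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}

def surprisal_bits(probabilities, reported):
    """Surprisal of the reported outcome in bits; higher means more unexpected."""
    return -math.log2(probabilities[reported])

# Hypothetical log-scores for three outcomes a study could have reported.
# In practice these would come from a model; here they are made up.
log_scores = {
    "effect increases": math.log(6.0),
    "no effect": math.log(3.0),
    "effect decreases": math.log(1.0),
}

probs = score_alternatives(log_scores)
ranked = sorted(probs, key=probs.get, reverse=True)  # most to least likely

for name in ranked:
    print(f"{name}: p={probs[name]:.2f}, surprisal={surprisal_bits(probs, name):.2f} bits")
```

In this toy setup, a reported outcome with low probability carries high surprisal, which is one simple way to flag a finding as unexpected relative to the alternatives.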
Core algorithms are open source. We believe scrutiny, debate, and transparency make better science, and we hold ourselves to the same standard.
Knowing what to trust is where everything starts. EmpiriQal builds that foundation with you — mapping the evidence so you can plan what comes next.
To succeed, EmpiriQal requires two things to be true. First, LLMs trained on the vast and noisy scientific literature must be able to extract the patterns connecting findings across papers, fields, and time in ways that exceed human capacity. Second, LLMs' predictions must be calibrated: when the model is more confident, it is more accurate.
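The calibration requirement can be checked in a simple way: group predictions by stated confidence and compare each group's average confidence with its observed accuracy. The sketch below does this on made-up data. The function names, the bin count, and the toy predictions are assumptions for illustration only; they are not taken from the published studies.

```python
def calibration_table(confidences, correct, n_bins=5):
    """Bin predictions by confidence; return (mean confidence, accuracy, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the top bin
        bins[index].append((conf, hit))
    rows = []
    for members in bins:
        if members:
            mean_conf = sum(c for c, _ in members) / len(members)
            accuracy = sum(h for _, h in members) / len(members)
            rows.append((mean_conf, accuracy, len(members)))
    return rows

def expected_calibration_error(rows):
    """Count-weighted average gap between confidence and accuracy across bins."""
    total = sum(n for _, _, n in rows)
    return sum(n * abs(conf - acc) for conf, acc, n in rows) / total

# Toy data (hypothetical): ten predictions made at 0.7 confidence, seven correct,
# and ten made at 0.9 confidence, nine correct.
confidences = [0.7] * 10 + [0.9] * 10
correct = [1] * 7 + [0] * 3 + [1] * 9 + [0] * 1

rows = calibration_table(confidences, correct)
ece = expected_calibration_error(rows)

for mean_conf, accuracy, count in rows:
    print(f"confidence={mean_conf:.2f} accuracy={accuracy:.2f} n={count}")
print(f"expected calibration error={ece:.3f}")
```

A well-calibrated model shows accuracy rising with confidence and a small gap between the two in every bin; a large gap in the high-confidence bins would signal overconfidence.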
Both ingredients hold. Research published in Nature Human Behaviour showed that LLMs outperform human experts at predicting the outcomes of experiments, and that their confidence scores are calibrated. Follow-up work published in Patterns showed that humans and machines working together consistently outperform either alone.
Work funded by the Foresight Institute demonstrated that a space of alternative possibilities can be constructed and scored across a corpus of real scientific papers. The core mechanism works. The task now is to scale it.
Nature Human Behaviour — Luo et al., 2025
Brad Love is a computational neuroscientist and AI researcher whose work sits at the intersection of human cognition and machine intelligence. He was a professor at University College London and the University of Texas at Austin, as well as a senior scientist in AI at Los Alamos National Laboratory. He is a fellow at the European Lab for Learning and Intelligent Systems (ELLIS), an inaugural Fellow of the Alan Turing Institute, and a Royal Society Wolfson Fellow. He has developed models of human learning and decision making, applied them to brain imaging data, and explored how AI systems can be made to think more like people. He is now focused on harnessing AI to accelerate scientific discovery and transform how the world evaluates evidence.
EmpiriQal is seeking investment to build the team and accelerate development of the platform. The founding science is established. The need is clear. The path to scale across every domain of human knowledge is charted.
If you are an investor, a potential partner, or an institution with a stake in the reliability of science, we would like to hear from you.