[Figure: decision tree classifying iris flowers as setosa, versicolor, or virginica, with splits on petal length (cm), petal width (cm), and sepal length (cm)]

Applied AI, in production

We engineer production-grade applications on Anthropic's Claude, bringing data-science discipline to LLM engineering — testing for consistency, calibrating confidence, and auditing the decisions the model makes. We pair this with a deep predictive-modelling and machine-learning toolbox.

Claude & LLM engineering

  • Consistency testing — running prompts repeatedly to expose instability, then refining until the model gives the same answer every time on the same input
  • Structured human-in-the-loop — when a decision is genuinely ambiguous, surfacing alternatives and reasoning for a one-click human review instead of guessing
  • Multi-step workflows driven by explicit state machines, so the framework keeps the model on the rails through long-running sessions
  • Prompt orchestration — system, tool, and per-state prompts designed for reliability under real-world inputs
  • Model selection matched to assessed task complexity — strongest models where reasoning matters, faster models where it doesn't
  • Tool use with validated outputs — Claude calling our functions and returning structured results we can verify, with retries and fallbacks built in
  • Live data pipelines feeding current, structured information into the model
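
As a rough illustration of the consistency testing described in the list above, the sketch below runs one prompt several times and reports how often the answers agree. It is a minimal sketch, not our production harness: `consistency_report`, the `ask` callable, and `fake_model` are hypothetical names, and `ask` stands in for whatever function actually sends the prompt to the model.

```python
from collections import Counter
from typing import Callable


def consistency_report(ask: Callable[[str], str], prompt: str, runs: int = 10) -> dict:
    """Call `ask` repeatedly on one prompt and summarise how often the answers agree.

    `ask` is a hypothetical stand-in for a real model call.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")
    # Normalise lightly so whitespace or case differences do not count as instability.
    answers = [ask(prompt).strip().lower() for _ in range(runs)]
    counts = Counter(answers)
    modal_answer, modal_count = counts.most_common(1)[0]
    return {
        "modal_answer": modal_answer,
        "agreement": modal_count / runs,  # 1.0 means every run gave the same answer
        "distinct_answers": len(counts),
    }


# Stub model for demonstration only: replays a fixed list of replies.
_replies = iter(["Approve", "approve ", "Reject", "approve"])


def fake_model(prompt: str) -> str:
    return next(_replies)


report = consistency_report(fake_model, "Classify this claim", runs=4)
```

A prompt whose `agreement` falls below 1.0 is a candidate for refinement; the pass mark is a project decision and is not fixed by this sketch.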

Predictive modelling & machine learning

  • Machine learning — gradient-boosted models (GBMs) and related supervised methods
  • Generalised linear models (GLMs) and statistical inference
  • Simulation and optimisation algorithms
  • Pricing and price-elasticity modelling
  • Data analytics that go beyond dashboards

High-confidence outputs ship. Low-confidence ones become a structured conversation, not a black box. We use classical ML where the answer is numerical and LLMs where it's linguistic or structural, with a statistical analytics layer over both that keeps everything auditable.
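
The routing rule above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the `Decision` fields, the `route` function, and the 0.9 threshold are all hypothetical, and the confidence score is assumed to be already calibrated to the range 0 to 1.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    """A model output plus the context a human reviewer would need (hypothetical shape)."""
    answer: str
    confidence: float  # assumed to be a calibrated score in [0, 1]
    alternatives: list[str] = field(default_factory=list)
    reasoning: str = ""


def route(decision: Decision, threshold: float = 0.9) -> dict:
    """Ship confident answers; turn the rest into a structured review item.

    The 0.9 default is an illustrative assumption, not a recommended value.
    """
    if not 0.0 <= decision.confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if decision.confidence >= threshold:
        return {"action": "ship", "answer": decision.answer}
    # Low confidence: surface the proposal, alternatives, and reasoning for review.
    return {
        "action": "review",
        "proposed": decision.answer,
        "alternatives": decision.alternatives,
        "reasoning": decision.reasoning,
    }


shipped = route(Decision("versicolor", 0.97))
queued = route(Decision("versicolor", 0.62, ["virginica"], "petal width is near the split"))
```

Keeping the review payload structured, rather than free text, is what lets the reviewer accept the proposal or pick an alternative in one click.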