
Knowunity — 50% LLM Cost Reduction

Replaced frontier model API calls with distilled SLMs, cutting inference costs by 50% without sacrificing quality.

Overview

Company: Knowunity — Europe’s leading learning platform for students.

Challenge: High inference costs from frontier LLM API calls for content classification and routing across millions of daily requests.

Solution: Replaced frontier model API calls with distilled SLMs trained on the distil labs platform.

Results:

  • 50% reduction in inference costs
  • No loss in classification quality
  • Self-service retraining when categories change
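The swap described above — serving classification from a distilled small model behind the same interface that previously called a frontier API — can be sketched roughly as follows. Everything here is illustrative: the class names, category labels, and per-call costs are assumptions for the sketch, not Knowunity's actual implementation or pricing.

```python
from dataclasses import dataclass
from typing import Protocol

class Classifier(Protocol):
    """Any backend that can label a piece of content."""
    cost_per_call: float
    def classify(self, text: str) -> str: ...

def _toy_classify(text: str) -> str:
    """Toy heuristic so the sketch runs; a real model replaces this."""
    lowered = text.lower()
    if any(w in lowered for w in ("equation", "integral", "algebra")):
        return "math"
    if any(w in lowered for w in ("cell", "photosynthesis")):
        return "biology"
    return "other"

@dataclass
class FrontierAPIClassifier:
    """Stand-in for the original frontier-model API call (higher cost per request)."""
    cost_per_call: float = 0.0020  # illustrative figure, not a real quote

    def classify(self, text: str) -> str:
        # In production: a network call to a hosted frontier LLM.
        return _toy_classify(text)

@dataclass
class DistilledSLMClassifier:
    """Stand-in for a distilled small model served in-house (roughly half the cost)."""
    cost_per_call: float = 0.0010  # illustrative figure

    def classify(self, text: str) -> str:
        # In production: inference on a fine-tuned small model.
        return _toy_classify(text)

def route(text: str, model: Classifier) -> str:
    """Call site is identical for both backends, so the swap is a config change."""
    return model.classify(text)
```

Because both backends satisfy the same interface, cutting over to the distilled model is a one-line change at the call site, and retraining the small model when categories change never touches the routing code.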

“Using distil labs, we spun up highly accurate custom small models tailored to our workflows quickly. Those models cut our inference costs by roughly 50% without sacrificing quality.”

Lucas Hild, Co-Founder & CTO at Knowunity