Collinear Research

Explore our research papers, developed in partnership with leading enterprises and universities.

YC-Bench: We gave Claude, Gemini and GPT $250k, and it didn't go as you'd expect...

Spider: A lightweight on/off-policy distillation framework with a single client interface

Figure: LCB Pass@1 of Qwen2.5-7B-Instruct and Llama3.1-8B-Instruct across token counts, with a valley between 1K and 10K tokens.

The Valley of Code Reasoning: Scaling Knowledge Distillation of Large Language Models

Impatient users confuse AI agents: high-fidelity simulations of human traits for testing agents

Cats Confuse Reasoning LLM: Query Agnostic Adversarial Triggers for Reasoning Models

VERITAS: A Unified Approach to Reliability Evaluation

Self-rationalization improves LLM as a fine-grained judge

Get started in minutes

Better simulations. Better data. Better agents.

See what a thousand rollouts can teach your agent in 30 minutes.