
Research

Self-rationalization improves LLM as a fine-grained judge

An iterative process that improves LLM-as-a-judge rationales, letting the judge outperform larger models


VERITAS: A Unified Approach to Reliability Evaluation

A fast, accurate hallucination detection model rivaling GPT-4 Turbo while reducing costs
