The cost of doing evals has gone down substantially...
How to use coding agents to accelerate your evals
May 19, 2026 · 4 min read
Resources to transition to 'Applied AI' role
I have seen quite a few software devs (including me) with little to no background in ML jumping on the genAI train in the last 1-2 years. While this is a good thing, and using LLMs with no prior classical ML knowledge helps in some ways, it is also a de...
Jun 27, 2025 · 3 min read · 248
Curated resources: AI Product x How to eval?
List of resources for building a solid eval pipeline for your AI product
Mar 9, 2025 · 2 min read · 527
Order of fields in Structured output can hurt LLMs output
Evals on how the order of fields in prompts with structured output (JSON) affects LLM response quality
Jan 5, 2025 · 4 min read · 3.3K
Experiments with gpt-4o vision and architecture diagrams
Eval-based experimentation to figure out how well it works with architecture diagrams
Oct 22, 2024 · 4 min read · 867