Building LLM Evals From Scratch