Practical notes on evals, model trade-offs, and prompts that perform outside the demo.
EigenPrompt is a data-driven prompt optimizer that automatically rewrites and tests your prompt to find the best trade-offs between accuracy and cost. Pointed at an LLM-based support-ticket router, it took the prompt from 76% to 92% accuracy at sending tickets to the right desk, turned up a cheaper version that still beat the original, and flagged the mislabeled and ambiguous tickets that were capping the score. Clean those up and accuracy reaches 97%.
The 5x Performance Team
Bank transaction descriptors hide the merchant behind processors, app stores, and payment rails. We point EigenPrompt at a plain merchant-extraction prompt, walk through a full Standard run screen by screen, and compare what the Efficient, Standard, and Advanced modes each buy you. Accuracy rises from 64% to 81%, and the winning prompt is 41% cheaper per call.
The EigenPrompt Team
A clear, up-to-date glossary of prompt optimization and prompt engineering terms, from eval leakage and prompt caching to reasoning tokens, tool calling, and DSPy.
The EigenPrompt Team