How We Cut Cloud LLM Costs by 93% with On-Premise Inference