A comprehensive benchmark evaluated 15 small language models (SLMs) across 9 diverse tasks, including classification, information extraction, and question answering (QA). The study aimed to provide data-driven insights to help developers choose a base model for fine-tuning, a decision complicated by the number of available SLM families such as Qwen3, Llama 3.2, and Gemma 3. By replacing intuition with systematic evaluation, the results offer practical guidance for fine-tuning efforts.
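The core of such a study is a model-by-task evaluation grid aggregated into a per-model ranking. The sketch below is a hypothetical illustration of that structure, not the study's actual harness: the model and task names echo the text, but `score` is a deterministic stub standing in for a real fine-tune-and-evaluate run.

```python
from statistics import mean

# Illustrative subsets; the study covers 15 models and 9 tasks.
MODELS = ["Qwen3", "Llama 3.2", "Gemma 3"]
TASKS = ["classification", "information_extraction", "qa"]

def score(model: str, task: str) -> float:
    """Stub for one fine-tune-and-evaluate run on a (model, task) pair.

    Returns a deterministic pseudo-score in [0, 1]; in a real
    benchmark this would be the task metric (e.g. accuracy or F1).
    """
    return (sum(map(ord, model)) * 7 + sum(map(ord, task)) * 3) % 100 / 100

def leaderboard(models: list[str], tasks: list[str]) -> list[tuple[str, float]]:
    """Average each model's score across all tasks, best first."""
    rows = [(m, mean(score(m, t) for t in tasks)) for m in models]
    return sorted(rows, key=lambda r: r[1], reverse=True)

for model, avg in leaderboard(MODELS, TASKS):
    print(f"{model:12s} {avg:.3f}")
```

A real harness would also report per-task scores, since the best average model is not necessarily the best model for any single task.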