LLM optimization for the "Santa 2024" Kaggle competition

The Challenge

The Santa 2024 Kaggle competition ("The Perplexity Permutation Puzzle") asks participants to reorder the words of scrambled Christmas-themed passages so that the result has the lowest possible perplexity under a fixed scoring model. My goal was not to reach the highest possible leaderboard score, but to build and compare different LLM-based systems, see what score an LLM can reach on this task, and study their differences.

My Solution

I produced seven notebooks in which I compare different LLM systems and models:

  1. Exploring Claude Haiku and Claude Sonnet in single-call and agentic architectures (sketch below)

  2. Comparing perplexity scores of the baseline order vs. a random word order (sketch below)

  3. Optimizing prompts for Claude Haiku with the DSPy framework (sketch below)

  4. Comparing the DeepSeek-R1-Distill-Llama-8B and Llama-8B models

  5. Generating synthetic finetuning data with LLMs (sketch below)

  6. Finetuning DeepSeek-R1-Distill-Llama-8B on Google Colab (sketch below)

  7. Evaluating my finetuned version of DeepSeek-R1-Distill-Llama-8B
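
For the single-call architecture in notebook 1, the core is one chat completion that asks the model to reorder the words; the agentic variants wrap a call like this in a propose-and-retry loop. Below is a minimal sketch using the anthropic SDK; the prompt and the reorder_words helper are illustrative, not the notebook's actual code.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def reorder_words(scrambled: str, model: str = "claude-3-haiku-20240307") -> str:
    """Ask the model for a natural-sounding ordering of the given words."""
    response = client.messages.create(
        model=model,
        max_tokens=200,
        messages=[{
            "role": "user",
            "content": (
                "Rearrange these words into the most natural English text, "
                f"using every word exactly once: {scrambled}\n"
                "Reply with the reordered words only."
            ),
        }],
    )
    return response.content[0].text.strip()

print(reorder_words("sky the flew reindeer the across"))
```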
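
Notebook 2's baseline-vs-random comparison only needs a perplexity function, since Santa 2024 scores submissions by the perplexity of the reordered text under Gemma 2 9B. A sketch with transformers follows; the example word list is illustrative, and a smaller causal LM can stand in for Gemma if the 9B model does not fit in memory.

```python
import math
import random

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-9b"  # the competition's scoring model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model.eval()

@torch.no_grad()
def perplexity(text: str) -> float:
    """exp(mean token cross-entropy) of the text under the scoring model."""
    ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)
    loss = model(ids, labels=ids).loss  # HF shifts labels internally
    return math.exp(loss.item())

baseline = "the reindeer flew across the sky above the snowy village"
words = baseline.split()
random.shuffle(words)

print("baseline:    ", perplexity(baseline))
print("random order:", perplexity(" ".join(words)))
```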
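
For notebook 3, DSPy optimizes the prompt by bootstrapping few-shot demonstrations that pass a metric. The sketch below is hedged: the signature fields, the permutation-check metric, and the tiny trainset are my own illustrations, not the notebook's actual setup.

```python
import dspy

dspy.configure(lm=dspy.LM("anthropic/claude-3-haiku-20240307"))

class ReorderWords(dspy.Signature):
    """Rearrange the scrambled words into the most natural text, using every word exactly once."""
    scrambled: str = dspy.InputField()
    ordered: str = dspy.OutputField()

program = dspy.ChainOfThought(ReorderWords)

def is_permutation(example, prediction, trace=None):
    # Cheap proxy metric: the output must use exactly the input words.
    return sorted(example.scrambled.split()) == sorted(prediction.ordered.split())

trainset = [
    dspy.Example(
        scrambled="sky the flew reindeer the across",
        ordered="the reindeer flew across the sky",
    ).with_inputs("scrambled"),
    # ... more examples
]

optimizer = dspy.BootstrapFewShot(metric=is_permutation)
optimized = optimizer.compile(program, trainset=trainset)
print(optimized(scrambled="mistletoe the under stood elf the").ordered)
```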
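
For notebook 5, one plausible recipe, and the assumption behind this sketch, is to have an LLM write themed sentences and then shuffle their words, yielding (scrambled -> original) pairs for finetuning. The prompt and the output file name are illustrative.

```python
import json
import random

import anthropic

client = anthropic.Anthropic()

def generate_sentences(n: int = 20) -> list[str]:
    """Ask an LLM for short themed sentences, one per line."""
    response = client.messages.create(
        model="claude-3-haiku-20240307",
        max_tokens=1000,
        messages=[{
            "role": "user",
            "content": f"Write {n} short Christmas-themed sentences, one per line.",
        }],
    )
    return [line.strip() for line in response.content[0].text.splitlines() if line.strip()]

with open("finetune_data.jsonl", "w") as f:
    for sentence in generate_sentences():
        words = sentence.split()
        random.shuffle(words)  # the scrambled input; the sentence is the target
        f.write(json.dumps({"prompt": " ".join(words), "completion": sentence}) + "\n")
```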
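
For notebook 6, a LoRA adapter with peft and trl keeps the 8B model trainable on a single Colab GPU. All hyperparameters, the prompt template, and the data file below are illustrative assumptions; the notebook's actual settings may differ.

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("json", data_files="finetune_data.jsonl", split="train")

def to_text(example):
    # Fold each (scrambled -> ordered) pair into one training string.
    return {"text": f"Reorder: {example['prompt']}\nAnswer: {example['completion']}"}

dataset = dataset.map(to_text)

trainer = SFTTrainer(
    model="deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    train_dataset=dataset,
    peft_config=LoraConfig(r=16, lora_alpha=32, target_modules="all-linear"),
    args=SFTConfig(
        output_dir="r1-distill-reorder",
        dataset_text_field="text",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-4,
    ),
)
trainer.train()
trainer.save_model("r1-distill-reorder")
```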

The Outcome

The best-performing system was the finetuned DeepSeek-R1-Distill-Llama-8B, which scored better than a large commercial LLM like Claude Sonnet.

My Kaggle notebook on prompt optimization with DSPy earned a bronze medal.