Completed

NutriLLaVA

Multimodal AI app comparing LLaVA 1.5 vs 1.6 for generating personalized recipes from fridge/pantry photos based on dietary goals.

Started: August 2025
Released: December 2025

About This Project

NutriLLaVA is a Large Multimodal Model application that generates creative, personalized recipes from photos of a user's ingredients while adhering to specific dietary goals. It addresses a real problem in fitness and nutrition: meal prep monotony that leads people to abandon their diets. Users upload a photo of their fridge, pantry, or countertop along with dietary preferences (high-protein, vegan, fat loss, etc.) and receive recipe recommendations from an AI acting as both nutritionist and chef.

The project evolved into a comparative study between LLaVA 1.5 (7B parameters) and LLaVA 1.6 (34B parameters), using the PDER (Plan, Draft, Evaluate, Refine) prompt engineering methodology. Testing across 40 synthetically generated images (created via ChatGPT 5.1) and 4 distinct dietary profiles showed that upgrading to LLaVA 1.6 improved the success rate from 32.5% to 77.5%. The key differentiator was ingredient identification accuracy: both models understood dietary constraints well, but LLaVA 1.5 frequently hallucinated ingredients not present in the images.

Built with a Gradio frontend on Google Colab with an NVIDIA A100 GPU, the application demonstrates the practical tradeoffs between model size, accuracy, and inference latency. LLaVA 1.6 achieved much better qualitative accuracy, but inference time increased from ~30 seconds to 2-4 minutes, highlighting the compute challenges of deploying large multimodal models.
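As a rough illustration of how a PDER (Plan, Draft, Evaluate, Refine) loop could be chained, here is a minimal sketch. It is one plausible reading of the methodology, not the project's actual implementation: the prompt wording, the `run_pder` function, and the `generate` callable (assumed to wrap a LLaVA call with the photo already attached) are all hypothetical.

```python
from typing import Callable

# Hypothetical stage templates: the project's actual PDER prompts are not
# shown in this write-up, so the wording below is illustrative only.
PLAN = (
    "List only the ingredients clearly visible in the photo. "
    "Then outline one recipe idea that fits this dietary goal: {goal}."
)
DRAFT = (
    "Write a full recipe (quantities and steps) for the goal '{goal}', "
    "using only ingredients from this plan.\n\nPlan:\n{plan}"
)
EVALUATE = (
    "Review the recipe against the goal '{goal}'. Flag any ingredient "
    "that is missing from the plan.\n\nPlan:\n{plan}\n\nRecipe:\n{draft}"
)
REFINE = (
    "Rewrite the recipe so that it resolves every issue in the review."
    "\n\nRecipe:\n{draft}\n\nReview:\n{evaluation}"
)


def run_pder(generate: Callable[[str], str], goal: str) -> dict[str, str]:
    """Run Plan, Draft, Evaluate, Refine in order, chaining each output."""
    # Each stage sees the outputs it depends on from the earlier stages.
    plan = generate(PLAN.format(goal=goal))
    draft = generate(DRAFT.format(goal=goal, plan=plan))
    evaluation = generate(EVALUATE.format(goal=goal, plan=plan, draft=draft))
    recipe = generate(REFINE.format(draft=draft, evaluation=evaluation))
    return {"plan": plan, "draft": draft, "evaluation": evaluation, "recipe": recipe}
```

Keeping the model behind a plain `generate(prompt) -> str` callable means the same loop could be run against either LLaVA version for a side-by-side comparison, and it can be exercised with a stub function when no GPU is available.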

Tech Stack

Python, PyTorch, Hugging Face Transformers, LLaVA 1.6, Gradio, Pillow, Google Colab
academic