Tell the AI what you have and what you want — it handles the rest.
Paste a video link to extract frames and audio, and let the vision-language model (VLM) structure the recipe.
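
As a rough illustration of that flow, here is a minimal Python sketch. It assumes the linked video has already been saved to a local file and that `ffmpeg` is available on the machine. The helper names (`frame_cmd`, `audio_cmd`, `parse_recipe`), the `Recipe` fields, and the JSON shape are all hypothetical and chosen only for this example. The VLM call itself is left out because the text above does not say which model or API is used.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field
from pathlib import Path


def frame_cmd(video: Path, out_dir: Path, fps: float = 1.0) -> list[str]:
    """Build an ffmpeg command that samples `fps` frames per second as JPEGs."""
    return ["ffmpeg", "-i", str(video), "-vf", f"fps={fps}",
            str(out_dir / "frame_%04d.jpg")]


def audio_cmd(video: Path, out_file: Path) -> list[str]:
    """Build an ffmpeg command that drops the video stream (-vn) and
    writes mono 16 kHz audio, a common input format for speech models."""
    return ["ffmpeg", "-i", str(video), "-vn", "-ac", "1", "-ar", "16000",
            str(out_file)]


@dataclass
class Recipe:
    # Hypothetical schema: the real product may structure recipes differently.
    title: str
    ingredients: list[str] = field(default_factory=list)
    steps: list[str] = field(default_factory=list)


def parse_recipe(vlm_json: str) -> Recipe:
    """Turn a JSON reply (assumed shape) from the VLM into a Recipe."""
    data = json.loads(vlm_json)
    return Recipe(
        title=data["title"],
        ingredients=list(data.get("ingredients", [])),
        steps=list(data.get("steps", [])),
    )


if __name__ == "__main__":
    # Print the commands rather than running them, so the sketch has no side effects.
    print(frame_cmd(Path("clip.mp4"), Path("frames")))
    print(audio_cmd(Path("clip.mp4"), Path("audio.wav")))
    print(parse_recipe('{"title": "Pancakes", "steps": ["mix", "fry"]}'))
```

The two command lists can be passed to `subprocess.run(..., check=True)` once the output directory exists. The extracted frames and an audio transcript would then go to the VLM, with `parse_recipe` handling its reply.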