What is Reference-to-Video?

Reference-to-video is an AI video generation workflow in which a reference input (such as a product image, URL, or style guide) informs the content, subject, and visual style of the generated video, instead of a text prompt alone. This approach is particularly valuable for e-commerce, where sellers can paste a product page URL and receive a ready-to-publish video ad.

How It Works

Reference-to-video pipelines combine multiple AI capabilities. First, the system extracts information from the reference. For a product URL, this means extracting the product name, description, price, and images using web scraping and vision models. For a style reference image, a vision encoder extracts style features (color palette, composition, mood).
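
The extraction step for a product URL can be sketched roughly as follows. This is a minimal illustration, not how any particular tool implements it: it assumes the product page exposes Open Graph meta tags (og:title, og:description, og:image, product:price:amount), and the names ProductReference, MetaTagCollector, and extract_product_reference are hypothetical. Real pipelines may instead rely on structured data, site-specific scrapers, or vision models, and the sample HTML here is made up.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class ProductReference:
    """Hypothetical container for the fields pulled from a product page."""
    name: str = ""
    description: str = ""
    price: str = ""
    image_urls: list = field(default_factory=list)


class MetaTagCollector(HTMLParser):
    """Collects (property, content) pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop, content = attr_map.get("property"), attr_map.get("content")
        if prop and content:
            self.pairs.append((prop, content))


def extract_product_reference(html):
    """Map Open Graph properties onto a ProductReference (assumed page layout)."""
    collector = MetaTagCollector()
    collector.feed(html)
    ref = ProductReference()
    for prop, content in collector.pairs:
        if prop == "og:title":
            ref.name = content
        elif prop == "og:description":
            ref.description = content
        elif prop == "product:price:amount":
            ref.price = content
        elif prop == "og:image":
            ref.image_urls.append(content)  # a page may list several images
    return ref


# Made-up sample page; a real pipeline would fetch the HTML from the product URL.
SAMPLE_HTML = """
<html><head>
  <meta property="og:title" content="Ceramic Pour-Over Set">
  <meta property="og:description" content="Hand-glazed dripper and carafe.">
  <meta property="og:image" content="https://example.com/img/front.jpg">
  <meta property="og:image" content="https://example.com/img/side.jpg">
  <meta property="product:price:amount" content="48.00">
</head><body></body></html>
"""

reference = extract_product_reference(SAMPLE_HTML)
print(reference.name, reference.price, len(reference.image_urls))
```

Parsing a string rather than fetching a live URL keeps the sketch independent of any network access or site-specific markup.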

Next, an LLM (large language model) synthesizes this extracted information into an optimized video generation prompt. For a product video, the prompt might describe the product rotating on a clean background with lifestyle context. The LLM can also generate an ad script or call-to-action overlay text.
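
As a rough sketch of this synthesis step, the extracted fields can be folded into an instruction that is handed to a language model. Everything here is an assumption made for illustration: the instruction wording, the function names build_llm_request and synthesize_video_prompt, and the stub_llm stand-in, which returns a canned string instead of calling a real model.

```python
from typing import Callable


def build_llm_request(name: str, description: str, price: str) -> str:
    """Assemble the instruction text sent to the language model (hypothetical wording)."""
    return (
        "Write a one-paragraph prompt for a video generation model.\n"
        "Show the product on a clean background with lifestyle context.\n"
        "Also suggest one short call-to-action line.\n"
        f"Product name: {name}\n"
        f"Description: {description}\n"
        f"Price: {price}"
    )


def synthesize_video_prompt(
    name: str, description: str, price: str, llm: Callable[[str], str]
) -> str:
    """Ask the supplied model callable for a video prompt and tidy the result."""
    request = build_llm_request(name, description, price)
    return llm(request).strip()


def stub_llm(request: str) -> str:
    """Stand-in for a real model call; always returns the same canned text."""
    return "  The product rotates slowly on a clean white background.  "


video_prompt = synthesize_video_prompt(
    "Ceramic Pour-Over Set", "Hand-glazed dripper and carafe.", "48.00", stub_llm
)
print(video_prompt)
```

Passing the model in as a callable keeps the sketch independent of any specific LLM provider; a real pipeline would substitute its own client call for stub_llm.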

The generated prompt and reference image are then passed to a video diffusion model. The reference image may be used as a conditioning input (like image-to-video) or as visual context that guides the generation. The specific approach depends on the model and pipeline configuration.
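
The two ways of using the reference image can be illustrated with a small configuration sketch. The ReferenceMode enum, the build_generation_request function, and the request keys (first_frame, reference_images) are all hypothetical and do not correspond to any specific model's API; they only show how a pipeline might switch between the two behaviors.

```python
from enum import Enum


class ReferenceMode(Enum):
    FIRST_FRAME = "first_frame"        # image-to-video style: the reference is the opening frame
    VISUAL_CONTEXT = "visual_context"  # the reference guides subject and style only


def build_generation_request(prompt: str, reference_image: str, mode: ReferenceMode) -> dict:
    """Build a hypothetical request payload for a video diffusion model."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    request = {"prompt": prompt, "mode": mode.value}
    if mode is ReferenceMode.FIRST_FRAME:
        # The video starts from this exact image.
        request["first_frame"] = reference_image
    else:
        # The model composes its own scenes, informed by the reference.
        request["reference_images"] = [reference_image]
    return request


conditioned = build_generation_request(
    "Product rotating on a clean background", "front.jpg", ReferenceMode.FIRST_FRAME
)
guided = build_generation_request(
    "Product rotating on a clean background", "front.jpg", ReferenceMode.VISUAL_CONTEXT
)
print(conditioned)
print(guided)
```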

The result is a video that features the referenced product or style without requiring the user to write a detailed prompt manually. This lowers the barrier to video creation for non-technical users, especially e-commerce sellers who need to produce video content at scale.

Use Cases

  • E-commerce product ads: Paste a Shopify or Amazon product URL and get a video ad showcasing the product with motion, lifestyle context, and branding.
  • Brand-consistent content: Upload a brand style guide image and generate videos that match the brand's visual identity without manual art direction.
  • Real estate marketing: Use a property listing URL to generate a virtual walkthrough video from the listing photos and description.
  • Social media repurposing: Reference a blog post, infographic, or existing image to generate a video version suitable for Instagram Reels or TikTok.

Reference-to-Video on Kensa

Kensa offers a dedicated reference-to-video tool. You can paste a product URL from any major e-commerce platform, and the system automatically extracts product details, generates an optimized prompt, and produces a showcase video. You can also upload a reference image directly.

This workflow is especially popular with Shopify and Amazon sellers who need to produce video ads for every product listing. Visit the reference-to-video tool to try it.

Frequently Asked Questions

How is reference-to-video different from image-to-video?
Image-to-video takes a single image and animates it — the image becomes the first frame. Reference-to-video uses a reference (image, URL, or product page) as a style or content guide but generates a fully new video that may not directly match the reference frame-by-frame. The reference informs the subject, style, or product details, but the AI composes its own scenes.
Can I use a product URL as a video reference?
Yes, on Kensa you can paste a product URL from stores like Shopify or Amazon. The system automatically extracts the product name, images, description, and price, then generates a product showcase video. This is especially useful for e-commerce sellers who need video ads for every product listing.
What types of references work best?
High-quality product photos on white backgrounds work best for e-commerce videos. For style references, images with clear artistic direction (lighting, color palette, composition) transfer most reliably. Avoid references with cluttered backgrounds, heavy text overlays, or multiple subjects — the AI performs best with a clear focal point.

Try Reference-to-Video on Kensa

Paste a product URL and get a video ad in minutes. Free credits, no credit card required.
