Long continuous AI video is here! Free & open-source

Stable Video Infinity Version 2 Pro: Installation & Usage Guide

Key Concepts:

Stable Video Infinity (SVI): An open-source AI tool for generating long, coherent videos.
Alibaba’s Wan 2.2: The underlying open-source video model used by SVI, described in the video as the best and most uncensored open-source video model currently available.
VRAM: Video Random Access Memory – the amount of memory on a graphics card, impacting the ability to run AI models.
ComfyUI: A popular, node-based platform for running open-source image and video models.
LoRA (Low-Rank Adaptation): Small add-on weight files used to fine-tune models for specific styles or effects.
GGUF: A quantized model format that allows large models to run on systems with less VRAM.
FP8: An 8-bit floating point format for model weights, offering a balance between performance and quality.
SageAttention/Triton: Acceleration methods for faster processing; they can cause errors if not configured correctly.

I. Introduction & Overview

Stable Video Infinity (SVI) version 2 Pro is a free, open-source, and uncensored AI tool for generating long, consistent, and coherent videos. Compared with previous versions, version 2 Pro delivers noticeably better quality and smoother transitions. The tool builds on Alibaba’s Wan 2.2, described in the video as the leading open-source video model, by generating multiple 4-5 second clips and stitching them together. Earlier clip-stitching methods often produced visible transitions and quality degradation over longer durations, issues that SVI 2 Pro is designed to address. The tool can run with as little as 6GB of VRAM.

II. How Stable Video Infinity Works

SVI 2 Pro breaks video generation into short, manageable clips of 4-5 seconds. Each clip is generated in its own pass but is designed to flow seamlessly into the next. This approach avoids the quality deterioration seen when generating longer videos directly with models like Wan 2.2, whose output typically degrades after 4-5 seconds. The key improvement in version 2 Pro is visual consistency: characters and scene elements remain consistent even across cuts and pans. The new version produces “more dynamic and natural motion” and exhibits a “memory” of scene elements.
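
The clip-by-clip approach implies some simple planning arithmetic. The sketch below is illustrative only: the function name clips_needed and the 5-second default are assumptions made for this example, not part of SVI or ComfyUI, and it ignores any frames that neighboring clips might share.

```python
import math


def clips_needed(target_seconds: float, clip_seconds: float = 5.0) -> int:
    """Return how many fixed-length clips are needed to cover target_seconds."""
    if target_seconds <= 0 or clip_seconds <= 0:
        raise ValueError("durations must be positive")
    # Round up: a partial clip still requires one more generation block.
    return math.ceil(target_seconds / clip_seconds)


if __name__ == "__main__":
    for target in (10, 30, 60):
        print(f"{target}s video -> {clips_needed(target)} clips of 5s")
```

In workflow terms, the returned count corresponds roughly to the number of repeating generation blocks needed for a video of that length.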

III. Installation & Setup (ComfyUI)

The installation process focuses on utilizing ComfyUI, a popular platform for running open-source AI models. The following steps are required:

Install ComfyUI: (Refer to linked tutorial in video description). The Windows portable version is recommended.
Install KJ Nodes:
- Open Command Prompt in the ComfyUI/custom_nodes folder, where ComfyUI loads custom nodes from (type “cmd” in the File Explorer address bar while in that folder).
- Clone the KJ Nodes repository using the command: git clone [URL from video description].
- Install dependencies using the command provided in the video description.
Update ComfyUI: Within ComfyUI, navigate to “Manager” -> “Update ComfyUI” and restart.
Download SVI Workflow: Download the JSON workflow file (link in description) and drag it into the ComfyUI interface.
Download Required Models:
- Wan 2.2 (Image to Video): Download both the High and Low noise variants (approximately 15GB each). Place in ComfyUI/models/diffusion_models. FP8 versions are available for reduced VRAM usage.
- LightX2V LoRAs: Download both High and Low variants. Place in ComfyUI/models/loras. These accelerate video generation by 4-5x.
- Stable Video Infinity Version 2 Pro Models: Download both High and Low variants (approximately 1.23-2.3GB each). Place in ComfyUI/models/loras.
- UMT5 XXL FP8 (6.7GB): Download and place in ComfyUI/models/text_encoders.
- VAE: Download and place in ComfyUI/models/vae. (The workflow uses the Wan 2.1 VAE rather than the Wan 2.2 VAE.)
Refresh Model List: Press ‘R’ within ComfyUI to update the model list.
Select Models: Select the downloaded models from the dropdown menus within the workflow.
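
Because a misplaced file is a common reason a model does not appear in the dropdowns, a quick folder check can help. The following is a minimal sketch, assuming the folder layout listed above; the helper name list_model_files and the "ComfyUI" path are placeholders for illustration, and nothing here is part of ComfyUI itself.

```python
from pathlib import Path

# Subfolders named in the steps above, relative to the ComfyUI root.
MODEL_DIRS = (
    "models/diffusion_models",
    "models/loras",
    "models/text_encoders",
    "models/vae",
)


def list_model_files(comfy_root):
    """Map each expected model folder to the sorted file names found in it.

    A folder that does not exist maps to None, so it stands out from a
    folder that exists but is empty (an empty list).
    """
    root = Path(comfy_root)
    found = {}
    for rel in MODEL_DIRS:
        folder = root / rel
        if folder.is_dir():
            found[rel] = sorted(p.name for p in folder.iterdir() if p.is_file())
        else:
            found[rel] = None
    return found


if __name__ == "__main__":
    # "ComfyUI" is a placeholder path; point it at your own install.
    for rel, names in list_model_files("ComfyUI").items():
        status = "MISSING FOLDER" if names is None else f"{len(names)} file(s)"
        print(f"{rel}: {status}")
```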

IV. Running the Workflow & Customization

Initial Image: Upload an image to serve as the starting frame.
Resolution & Cropping: Set the desired width and height. Setting the workflow to “crop” will maintain the aspect ratio by cropping the image.
Prompting: Enter a prompt to guide the AI’s generation of subsequent clips (e.g., “look at the camera and smile”).
Workflow Structure: The workflow consists of repeating blocks that generate 4-5 second clips. Additional blocks can be added to create longer videos.
Running the Workflow: Press the “Run” button to initiate video generation.
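
To make the “crop” behavior concrete, here is a small sketch of an aspect-preserving center crop. It shows the general idea only; it is an assumption that the workflow’s resize node behaves this way, and center_crop_box is a hypothetical helper, not a ComfyUI function.

```python
def center_crop_box(src_w, src_h, target_w, target_h):
    """Return (left, top, right, bottom) of the largest centered region of
    the source image that matches the target aspect ratio."""
    if min(src_w, src_h, target_w, target_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # Compare aspect ratios by cross-multiplying to stay in integers.
    if src_w * target_h > src_h * target_w:
        # Source is wider than the target: trim the left and right edges.
        new_w = src_h * target_w // target_h
        left = (src_w - new_w) // 2
        return (left, 0, left + new_w, src_h)
    # Source is taller than (or equal to) the target: trim top and bottom.
    new_h = src_w * target_h // target_w
    top = (src_h - new_h) // 2
    return (0, top, src_w, top + new_h)


if __name__ == "__main__":
    # A landscape photo cropped for a square output keeps its middle section.
    print(center_crop_box(1000, 500, 1, 1))
```

The cropped region would then be scaled to the chosen width and height, so the subject is not stretched.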

V. Troubleshooting Common Errors

SageAttention/Triton Errors: Set the “Sage Attention” setting to “Disabled” on both the High and Low nodes.
Missing SVI Pro Node: Ensure KJ Nodes are installed and updated via ComfyUI’s Custom Nodes Manager.
PyTorch 2.7.1 Error: Set the “FP16 accumulation” setting to “False” for both High and Low nodes.

VI. Using LoRAs & GGUF for Lower VRAM

Adding LoRAs: Duplicate the LoRA loader nodes and connect them to the appropriate inputs to add further style or effect LoRAs.
GGUF Models: For systems with limited VRAM, use GGUF versions of Wan 2.2 (available via link in description). These quantized models require less memory. Replace the standard Wan 2.2 loader nodes with “Unet Loader (GGUF)” nodes (install via ComfyUI’s Custom Nodes Manager if needed) and select the corresponding GGUF model file.
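
A rough way to see why quantization helps is to estimate the size of the weights alone from the number of bits stored per weight. This is a back-of-the-envelope sketch: weight_size_gb is a hypothetical helper, the 14e9 parameter count is an illustrative assumption rather than a figure from this guide, and real GGUF files mix precisions and carry metadata, so actual sizes will differ. It also does not account for the memory used during generation.

```python
def weight_size_gb(num_params, bits_per_weight):
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    if num_params <= 0 or bits_per_weight <= 0:
        raise ValueError("inputs must be positive")
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # 14e9 is an illustrative parameter count, not a figure from this guide.
    params = 14e9
    for label, bits in (("16-bit", 16), ("8-bit", 8), ("4-bit", 4)):
        print(f"{label}: ~{weight_size_gb(params, bits):.1f} GB")
```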

VII. GenSpark Integration

The video highlights GenSpark, an all-in-one AI workspace capable of handling research, reports, slide decks, website builds, and automations. Demonstrations include generating CRM dashboards, marketing assets, and slideshows. GenSpark also offers an AI inbox and team collaboration features. (Link in description).

VIII. Conclusion

Stable Video Infinity version 2 Pro represents a significant advancement in open-source video generation. Its ability to create long, coherent, and visually consistent videos, even on systems with limited VRAM, makes it a powerful tool for content creators. The combination of Alibaba’s Wan 2.2, seamless clip stitching, and the flexibility of ComfyUI provides a robust and accessible platform for AI-powered video production. The availability of GGUF models extends it to a wider range of hardware configurations.