Create Lifelike AI Videos of Yourself - Consistent Characters

By Futurepedia

AI · Technology · Business

Key Concepts

  • Consistent character generation in videos using AI
  • Image to video animation
  • Text to video generation
  • Replicate AI model platform
  • Flux PuLID model (single image animation)
  • Ostris Flux Dev LoRA trainer (multiple image training)
  • Model training with custom images
  • Trigger words for character activation
  • Face bleed issue in multi-character scenes
  • Custom GPTs for marketing
  • Kling video generator
  • Runway video generator
  • AI custom model training on videos (Kling Pro Plan)
  • Lip syncing in videos

1. Single Image Animation with Flux PuLID on Replicate

  • Main Point: Easiest method for animating a character from a single image.
  • Process:
    1. Log into Replicate using a GitHub account.
    2. Add a credit card for usage-based charges (very cheap, e.g., 45 runs for $1).
    3. Search for the "flux-pulid" model.
    4. Upload the character image.
    5. Write a prompt (e.g., "wizard casting a spell").
    6. Set aspect ratio (e.g., 16:9).
    7. Leave settings at default initially, adjust as needed.
    8. Set number of generations (default is 4).
    9. Choose PNG output format.
    10. Run the model (takes ~20 seconds).
  • Example: Using an image of the speaker and prompting for a wizard casting a spell.
  • Technical Terms:
    • Replicate: A platform for running AI models in the cloud, either through its web UI or its API (a minimal API sketch follows at the end of this section).
    • Flux PuLID: A Flux-based model that generates new images of a character from a single reference photo.
  • Data: Cost is very low, around 45 runs for $1.
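
The same step can also be run outside the web UI with Replicate's Python client. The snippet below is a minimal sketch only: the model slug and input field names are assumptions based on the steps above, so copy the exact identifier and parameters from the model's page on Replicate.

```python
# pip install replicate, and set REPLICATE_API_TOKEN before running.
import replicate

# Model slug and input field names are assumptions -- verify them on the
# model's Replicate page before use.
output = replicate.run(
    "zsxkib/flux-pulid",  # assumed slug for the Flux PuLID model
    input={
        "main_face_image": open("my_photo.png", "rb"),  # single reference photo
        "prompt": "a wizard casting a spell",
        "aspect_ratio": "16:9",
        "num_outputs": 4,        # default number of generations
        "output_format": "png",
    },
)

# The client typically returns a list of image URLs / file objects.
for image in output:
    print(image)
```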

2. Multiple Image Training with the Ostris Flux Dev LoRA Trainer on Replicate

  • Main Point: Training a model with multiple images for more accurate character representation.
  • Process:
    1. Search for "ostris" on Replicate and select the "flux-dev-lora-trainer" model.
    2. Name the model for storage (e.g., "Kevin 2").
    3. Keep the model private during training.
    4. Upload 10+ photos of the character in a ZIP file.
    5. Create a trigger word (e.g., "Kevin").
    6. Add autocaptions (optional).
    7. Increase the LoRA rank to 32 for training complex features.
    8. Create training (takes ~20 minutes, costs $3-5).
    9. Run the trained model from the dashboard.
    10. Use the trigger word in prompts (e.g., "Kevin is a wizard casting a spell").
  • Example: Training a model using thumbnails and additional photos of the speaker.
  • Technical Terms:
    • Ostris Flux Dev LoRA trainer: A Replicate trainer that fine-tunes Flux on multiple images of a character (see the training sketch at the end of this section).
    • LoRA: Low-Rank Adaptation, a lightweight fine-tuning technique that trains a small adapter on top of a large model instead of retraining the whole model.
    • Trigger word: A word used in prompts to activate the trained character model.
    • LoRA rank: A parameter that controls how much detail the adapter can learn; higher values capture more complex features.
  • Data: Training costs $3-5 and takes about 20 minutes.
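
Trainings can also be started with the Python client. The sketch below is a rough outline under stated assumptions: the trainer's version hash is a placeholder, the destination model must be created on Replicate first (private, as in the steps above), and the input field names should be checked against the ostris/flux-dev-lora-trainer page.

```python
import replicate

# Rough sketch of starting a LoRA training run. The version hash is a
# placeholder, and the destination model must already exist on Replicate.
training = replicate.trainings.create(
    version="ostris/flux-dev-lora-trainer:<version-id>",  # placeholder version
    input={
        # Assumed field names -- confirm them on the trainer's page.
        "input_images": "https://example.com/kevin_photos.zip",  # ZIP of 10+ photos
        "trigger_word": "Kevin",  # word that activates the character in prompts
        "lora_rank": 32,          # higher rank lets the adapter learn more detail
        "autocaption": True,      # optional automatic captions
    },
    destination="your-username/kevin-2",  # where the trained model is stored
)

print(training.status)  # training takes roughly 20 minutes; poll until it finishes
```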

3. Addressing Face Bleed Issue

  • Main Point: Face bleed occurs when prompting scenes with multiple people, causing characteristics from the trained model to appear on other faces.
  • Example: Prompting a scene of the speaker sipping tea with Queen Elizabeth resulted in the Queen's face being altered with features from the speaker's trained model.
  • Solution: Be mindful of multi-character scenes and adjust prompts or models accordingly.

4. Custom GPTs for Marketing (Sponsored by HubSpot)

  • Main Point: Custom GPTs are underutilized but powerful tools for marketing automation.
  • HubSpot's Offering: Free PDF guide on creating custom GPTs for marketing.
  • Benefits:
    • 10x marketing results.
    • Create AI assistants for various tasks (social media content, ad copy, market research).
  • Process (from the PDF):
    1. Basic setup.
    2. Customizing and personalizing the GPT's knowledge.
    3. Training, testing, and refining.
  • Resources: Free PDF download available in the video description.

5. Training on a Different Character (Alien Example)

  • Main Point: Demonstrating the training process with a non-human character.
  • Process:
    1. Prompt for a character sheet in an image generator.
    2. Describe the character's clothing and style.
    3. Crop the character sheet into different shots (different angles, close-ups).
    4. Flip the canvas horizontally for symmetrical characters to create more images (see the cropping sketch at the end of this section).
    5. Follow the same training process as before (name, ZIP file, trigger word, LoRA rank, create training).
  • Example: Creating a character sheet for a blue alien in a suit, using the trigger word "blute".
  • Observation: Results may be more variable compared to training with human faces.
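
Cropping and flipping the character sheet can be done by hand in any image editor, but here is a minimal Pillow sketch of the same idea. The file names and crop boxes are hypothetical; adjust them to wherever the poses actually sit on your sheet.

```python
from pathlib import Path
from zipfile import ZipFile
from PIL import Image, ImageOps  # pip install pillow

sheet = Image.open("character_sheet.png")  # hypothetical filename
out_dir = Path("training_images")
out_dir.mkdir(exist_ok=True)
w, h = sheet.size

# Crop boxes are (left, upper, right, lower); here the sheet is assumed to
# hold three poses side by side -- adjust to your actual layout.
crops = {
    "front": (0, 0, w // 3, h),
    "side": (w // 3, 0, 2 * w // 3, h),
    "back": (2 * w // 3, 0, w, h),
}

paths = []
for name, box in crops.items():
    crop = sheet.crop(box)
    path = out_dir / f"{name}.png"
    crop.save(path)
    paths.append(path)

    # Mirror each crop to double the image count (only useful for
    # roughly symmetrical characters, as noted above).
    flipped_path = out_dir / f"{name}_flipped.png"
    ImageOps.mirror(crop).save(flipped_path)
    paths.append(flipped_path)

# Bundle everything into the ZIP the trainer expects.
with ZipFile("blute_training_images.zip", "w") as zf:
    for path in paths:
        zf.write(path, arcname=path.name)
```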

6. Turning Images into Videos with Kling

  • Main Point: Using Kling to animate generated images into videos.
  • Process:
    1. Switch to image-to-video mode in Kling.
    2. Drag the image into Kling.
    3. Leave the prompt blank or type what you want to happen (e.g., "Vikings walk into battle").
    4. Customize camera controls (optional).
    5. Generate the video (takes 5-10 minutes).
  • Alternatives: Runway, MiniMax, Luma Labs (a scripted alternative is sketched at the end of this section).
  • Observation: Results can vary; may require rerolls or prompt adjustments.
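
The video does this step in Kling's web UI. If you prefer to script it, image-to-video models are also hosted on Replicate; the sketch below uses a placeholder model slug and assumed input names, so substitute a real image-to-video model and its documented parameters.

```python
import replicate

# Placeholder slug -- replace with a real image-to-video model hosted on
# Replicate; the input field names below are assumptions as well.
IMAGE_TO_VIDEO_MODEL = "some-owner/some-image-to-video-model"

output = replicate.run(
    IMAGE_TO_VIDEO_MODEL,
    input={
        "image": open("vikings.png", "rb"),    # still image generated earlier
        "prompt": "Vikings walk into battle",  # optional: describe the motion
    },
)

print(output)  # typically a URL to the rendered clip
```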

7. Training a Model on Videos of Yourself with Kling (Pro Plan)

  • Main Point: Kling's newest feature allows training a model on videos of yourself for realistic video generation.
  • Requirements: Kling Pro Plan.
  • Process:
    1. Access the AI custom model feature.
    2. Agree to terms of responsible use.
    3. Upload a close-up front view video with a natural expression (good lighting, no blur, no subtitles).
    4. Upload 10-30 additional videos (5-15 seconds each) with different actions and expressions.
    5. Ensure consistent hair and features across videos.
    6. Submit for training (takes ~2 hours).
  • Important Considerations:
    • Avoid HDR videos (the default on iPhones); turn off HDR in camera settings or convert the clips to SDR first (see the conversion sketch at the end of this section).
    • Use content you have the rights to.
  • Usage: Select the trained face reference and write a prompt (e.g., "Kevin is casually walking away from an explosion").
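
One practical note on the HDR warning above: if re-shooting with HDR turned off is not an option, clips can be tone-mapped down to SDR with ffmpeg before uploading. The sketch below wraps a common zscale + tonemap filter chain (it requires an ffmpeg build with zimg support); the file names are placeholders.

```python
import subprocess

def hdr_to_sdr(src: str, dst: str) -> None:
    """Tone-map an HDR clip down to SDR (BT.709) before uploading for training."""
    subprocess.run(
        [
            "ffmpeg", "-i", src,
            # Common HDR-to-SDR recipe: linearize, tone-map, convert to BT.709.
            "-vf",
            "zscale=t=linear:npl=100,tonemap=hable,"
            "zscale=p=bt709:t=bt709:m=bt709,format=yuv420p",
            "-c:a", "copy",  # keep the original audio
            dst,
        ],
        check=True,
    )

hdr_to_sdr("clip_hdr.mov", "clip_sdr.mp4")  # placeholder file names
```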

8. Lip Syncing in Kling

  • Main Point: Kling has a built-in lip syncing feature.
  • Process:
    1. Click "lip sync."
    2. Choose text to speech or upload audio.
    3. Click "lip sync" again.

9. Synthesis/Conclusion

The video provides a comprehensive overview of three methods for creating videos with consistent characters using AI. The first two methods generate images with Replicate's Flux PuLID model (single image) or the Ostris Flux Dev LoRA trainer (multiple images) and then animate them. The third method uses Kling's AI custom model feature to train a model on videos of yourself. The video also covers practical considerations such as handling face bleed, using custom GPTs for marketing, and Kling's built-in lip syncing feature. The speaker emphasizes the ease of use and the potential for creating engaging, personalized video content.
