Wan 2.2 Animate by Alibaba: Character Swaps and Photo Animation

I’ve been testing Wan 2.2 Animate, a free model that can swap characters and animate photos with convincing lip sync and body movement. It’s available on wan.video and released as open source, so you can run it locally or use hosted options.
In this article, I’ll explain what the model does, how it works on the platform, the differences between Character Swap and Photo Animate, what to expect in terms of quality, ethical considerations, and practical tips based on real usage. I’ll also cover pricing, credits, and setup choices so you can get started quickly.
What Is Wan 2.2 Animate?
Wan 2.2 is a text-to-video and avatar model. The “Animate” tools focus on two tasks:
- Character Swap: Replace a person in a video with a new identity using a single reference image.
- Photo Animate: Animate a static image to match a reference performance while keeping the original photo’s background.
Both modes can produce speech-synced faces and body movements controlled by your source video or audio.
The model is open source and hosted on wan.video. You can run it locally if you’re comfortable with larger downloads and more complex setup, or you can use the web platform for a simpler workflow.
Overview of Wan 2.2 Animate
| Item | Description |
| --- | --- |
| Access | wan.video (hosted); open source for local runs |
| Core Modes | Character Swap, Photo Animate, plus speech-to-video options |
| Inputs | Source video or voice; one reference image (for identity) |
| Outputs | Video with swapped identity, or a photo animated with its original background |
| Best For | Controlled lip sync, facial expressions, head and upper-body motion; full-body motion can be strong in some cases |
| Resource Needs | Web platform is easy to start; local runs require large downloads |
| Cost | Free tier available; optional Pro plan and credits |
Key Features of Wan 2.2 Animate
- Character Swap with one image: Replace a subject in your video using a single photo as the target identity.
- Photo Animate with preserved background: Animate a still image while keeping its original background intact.
- Lip sync and body motion: Mouth movement matches speech, with head and body motion guided by your source.
- Web-based workflow: Sign in, upload inputs, and render without installing heavy dependencies.
- Open-source availability: Run it locally if you need full control and have the hardware.
Why I Used the Hosted Platform
Running Wan 2.2 locally involves a large download and many files. To keep things simple, I used wan.video. It’s free to try, and you can purchase credits to move up the queue. In practice, my render times were roughly the same with or without credits, so I suggest starting on the free plan and evaluating before you spend.
Ethics and Legal Notice
This tool can create convincing swaps and animated faces. Do not use it for fraud, impersonation, or to promote products or services without the necessary rights. Using someone’s likeness without permission can create ethical and legal exposure. Treat any realistic persona videos you receive with skepticism, especially those asking for sensitive information.
How Wan 2.2 Animate Works
Wan 2.2 uses a source performance and a target identity to produce a new video:
For Character Swap:
- Input: a driving video and a single image of the target person.
- Output: the original video performance with the target identity replacing the original face or figure. The background comes from the source video.

For Photo Animate:
- Input: a single image and a driving video or audio track.
- Output: the original photo animated to match the performance, preserving the photo’s background.
In both cases, lip sync can match speech, and the model attempts to replicate head movement and body gestures from the driving video.
Interface Guide: Getting Started on wan.video
- Sign in to wan.video.
- Click Generate to open the creation panel.
- Switch the mode from Video (text-to-video) to Avatar.
- Choose a tool:
  - Speech to Video
  - Character Swap
  - Photo Animate
For this article, the focus is Character Swap and Photo Animate.
Character Swap vs. Photo Animate
Character Swap:
- Keeps the video’s original background.
- Replaces the subject’s identity based on your reference image.
- Often produces more stable results across a range of inputs.

Photo Animate:
- Keeps the photo’s original background.
- Animates the photo to match your performance.
- Can produce strong motion fidelity but may show more visual artifacts.
Standard vs. Pro
There’s a Standard and Pro toggle. In my tests, setting Pro did not clearly change the final quality. You can experiment, but don’t expect a consistent improvement from that switch alone.
How to Use Wan 2.2 Animate
Step-by-Step: Character Swap
1. Prepare Inputs
   - Source video: Use a clip with clear lighting and a visible face.
   - Target image: Choose a high-quality, front-facing photo with a neutral expression and even lighting.
2. Open Avatar Mode
   - Click Generate, then switch to Avatar.
   - Select Character Swap.
3. Upload Files
   - Upload the source video.
   - Upload the target image.
4. Settings
   - Choose Standard or Pro.
   - Leave defaults for a first run.
5. Generate
   - Submit the job and wait for the render.
   - Typical wait times are around 10 minutes, depending on the queue.
6. Review and Iterate
   - Check facial alignment, lip sync, and body motion.
   - If detection fails or results look off, try a different source clip or a target image with clearer lighting and framing.
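Before uploading, the input-prep advice above can be reduced to a quick local preflight check. This is a minimal sketch; the accepted formats and the 200 MB size cap are illustrative assumptions, not documented wan.video limits.

```python
from pathlib import Path

# Hypothetical limits for illustration only -- wan.video's actual
# upload constraints may differ.
VIDEO_EXTS = {".mp4", ".mov", ".webm"}
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}
MAX_MB = 200

def preflight(video: str, image: str, max_mb: int = MAX_MB) -> list:
    """Return a list of problems found before submitting a swap job."""
    problems = []
    checks = [
        (Path(video), VIDEO_EXTS, "source video"),
        (Path(image), IMAGE_EXTS, "target image"),
    ]
    for path, exts, label in checks:
        if not path.exists():
            problems.append(f"{label}: file not found ({path})")
            continue
        if path.suffix.lower() not in exts:
            problems.append(f"{label}: unexpected format {path.suffix}")
        if path.stat().st_size > max_mb * 1024 * 1024:
            problems.append(f"{label}: larger than {max_mb} MB")
    return problems
```

An empty list means the pair looks uploadable; anything else tells you which file to fix before you spend a render on it.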
Step-by-Step: Photo Animate
1. Prepare Inputs
   - Base photo: Use a high-resolution image with a clear, unobstructed face.
   - Driving video or audio: Provide a clip with the exact performance you want to replicate.
2. Open Avatar Mode
   - Click Generate, then switch to Avatar.
   - Select Photo Animate.
3. Upload Files
   - Upload the base photo.
   - Upload the driving video or voice clip.
4. Settings
   - Start with defaults.
5. Generate
   - Submit the job and wait for completion.
6. Review and Fix
   - Check for blur bands, ghosting, or hand artifacts.
   - If artifacts appear, try a different base photo, adjust lighting in your driving video, or reduce fast hand gestures.
Quality Observations and Practical Tips
Based on hands-on usage, here’s what to expect and how to improve results.
Detection and Consistency
- Face detection can fail partway through a long clip. If that happens, use another segment or record a shorter take with straighter head pose and clearer lighting.
- Avoid fast head turns, heavy occlusions (hands on face), or strong backlighting.
Visual Artifacts
- Photo Animate sometimes introduces blur streaks or a band at the bottom of the frame. This can make results unusable for polished work.
- Ghosting and hand disappearance can occur, especially during fast gestures.
- Lighting mismatches can appear as odd patterns on the face or body.
Mitigation steps:
- Use stable lighting and a clean background in the driving video.
- Keep gestures moderate and avoid rapid hand movement near the face.
- Choose a target image or base photo with even lighting, no heavy filters, and no extreme angle.
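One way to act on the "keep gestures moderate" advice is to flag high-motion segments of a driving clip before uploading. The sketch below scores motion as the mean absolute pixel difference between consecutive grayscale frames; the frames here are plain flat lists for simplicity (in practice you would decode them with a video library), and the threshold is a guess to tune per clip.

```python
def motion_scores(frames):
    """Mean absolute pixel difference between consecutive grayscale frames.

    `frames` is a sequence of equal-length flat pixel lists (values 0-255).
    """
    scores = []
    for prev, cur in zip(frames, frames[1:]):
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        scores.append(diff)
    return scores

def flag_fast_motion(frames, threshold=25.0):
    """Return indices of frame transitions whose motion exceeds threshold.

    The default threshold is an illustrative assumption, not a value
    from the model or platform; tune it against your own clips.
    """
    return [i for i, s in enumerate(motion_scores(frames)) if s > threshold]
```

If a clip returns many flagged transitions, consider re-recording with slower gestures or trimming those segments before submitting to Photo Animate.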
Character Swap vs. Photo Animate in Practice
- Character Swap tends to be more robust for general use because it keeps the source video’s background and focuses on identity replacement.
- Photo Animate can match body and head motion well, but artifact risk is higher.
- If you need fewer artifacts and faster success, start with Character Swap.
Inputs Matter
- A high-quality, frontal target image improves identity consistency.
- A steady, clear driving video with readable lip movement improves sync and facial realism.
- AI-generated portraits can work, but they may introduce oddities. Real photos usually produce more stable results.
Motion Scope
- Upper-body motion is typically strong. Full-body motion can work as well, especially in clips with clear pose and no occlusion.
- The model can animate illustrated or anime-style characters as the target identity. Expect variability in quality; some styles hold up better than others.
Safety And Misuse Risks
Animated identity content can be misused for social engineering. Combining someone’s public photo with a cloned voice can produce convincing messages. Treat unexpected identity-based videos as unverified, and avoid sharing personal or financial information based on a video alone.
- Always get permission before using someone’s likeness.
- Avoid commercial use of a person’s identity without clear rights and consent.
- Consider adding watermarks, disclosures, or context when sharing AI-generated outputs.
Pricing, Plans, and Credits
- Pro membership: approximately $5/month on an annual plan, or around $6.50 month-to-month.
- Credits: optional packs are available to move up the queue.
My observation:
- Renders took about 10 minutes with or without credits.
- At the time of testing, credits didn’t reliably reduce wait times.
- The free plan is sufficient for trying the tool. If you render many clips, you may be pushed further back in the queue after several generations.
Recommendation:
- Start on the free plan.
- If you consistently face long queues, test a small credit pack and see if your wait times improve.
- Reassess monthly membership only if you use the platform frequently.
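For perspective on the plan prices above, the annual-versus-monthly math works out as follows. The figures are the approximate ones quoted in this article, not official pricing, and may change.

```python
# Approximate prices quoted in the article; actual pricing may differ.
MONTHLY = 6.50          # month-to-month Pro price, USD
ANNUAL_MONTHLY = 5.00   # effective monthly price on the annual plan, USD

yearly_monthly = 12 * MONTHLY         # 12 months paid month-to-month
yearly_annual = 12 * ANNUAL_MONTHLY   # 12 months on the annual plan
savings = yearly_monthly - yearly_annual
savings_pct = 100 * savings / yearly_monthly

print(f"Monthly billing: ${yearly_monthly:.2f}/yr")   # $78.00/yr
print(f"Annual plan:     ${yearly_annual:.2f}/yr")    # $60.00/yr
print(f"Savings:         ${savings:.2f} (~{savings_pct:.0f}%)")  # $18.00 (~23%)
```

Roughly a 23% discount for committing annually, which only matters if you expect to use the platform most months.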
Troubleshooting Guide
Face not detected or swap fails mid-clip:
- Shorten the source video.
- Ensure a frontal face and reduce occlusions.
- Improve lighting and contrast.

Mouth or audio sync feels off:
- Use a source with clear enunciation and minimal background noise.
- Try a different segment with a more neutral head pose.

Blur bands or ghosting with Photo Animate:
- Replace the base photo with a higher-resolution, unfiltered image.
- Reduce fast hand motion in the driving video.
- Consider Character Swap if artifacts persist.

Identity looks inconsistent:
- Upload a clearer, front-facing target image.
- Avoid extreme expressions in the target image.
- Keep hair and facial features unobstructed.
Recommended Workflow
- Start simple:
- Character Swap with a clean, short driving clip and a neutral, high-quality target image.
- Evaluate:
- Watch for detection stability, lip sync, and artifact-free frames.
- Iterate:
- Tweak inputs more than settings; inputs have the biggest impact.
- Scale up:
- Move to longer clips or Photo Animate once you’ve dialed in what works.
FAQs
Is Wan 2.2 open source?
Yes. The model is open source, and you can run it locally if you’re comfortable with setup and resource requirements. The hosted version at wan.video is the easiest way to start.
Can I run it on my own machine?
Yes, but expect large downloads and a more complex setup. If you’re new to this, start on the hosted platform.
What's the difference between Character Swap and Photo Animate?
- Character Swap replaces the identity in a video while keeping the video’s background.
- Photo Animate keeps the photo’s background and animates the subject to match your driving performance.
Does Pro mode produce better quality than Standard?
In my tests, the difference wasn’t clear. Feel free to test both, but don’t rely on Pro alone to fix quality issues.
Do credits speed up rendering?
During testing, render times were similar with or without credits. Queue times can change, so test before purchasing large packs.
How long do renders take?
Plan for around 10 minutes per clip, depending on server load and video length.
How do I reduce artifacts?
- Use clean, front-facing target images.
- Use well-lit, steady driving videos.
- Avoid fast hand gestures near the face.
- Consider Character Swap if Photo Animate shows blur bands or ghosting.
Can it animate illustrated or anime-style characters?
Yes, you can use an illustrated character as the target identity. Quality varies by style and input quality.
Are there legal issues with identity swaps?
Yes. Using a person’s likeness without permission can carry legal and ethical risk. Get consent and avoid misleading content.
What if face detection fails midway?
Use a shorter, clearer segment, improve lighting, and keep the face frontal. Re-upload with a better target image if needed.
Conclusion
Wan 2.2 Animate makes identity swaps and photo animations accessible through a simple web interface and an open-source model. Character Swap is the most reliable starting point: it keeps your original video background and focuses on a stable identity replacement with clear lip sync and body motion. Photo Animate can produce striking results while preserving the photo’s background, but it’s more prone to artifacts like blur bands and ghosting.
Use clear inputs, control lighting, and keep gestures moderate for best results. Begin on the free plan, evaluate render times, and only consider credits or Pro if you consistently need faster turnaround. Above all, respect consent and legality when working with likenesses. This technology is powerful, and it deserves careful, responsible use.