FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations
Hmrishav Bandyopadhyay, Yi-Zhe Song
2024-11-20

Summary
This paper presents FlipSketch, a system that turns a single hand-drawn sketch and a text description of the desired motion into a sketch animation.
What's the problem?
Traditional animation takes a lot of skill and time: teams of artists draw the key frames and then the in-between frames that connect them. Existing automated methods still demand significant artistic effort, because the user has to specify precise motion paths or keyframes. That keeps animation out of reach for casual users.
What's the solution?
FlipSketch reduces the process to two inputs: a drawn sketch and a text description of how it should move. The system reuses the motion priors learned by text-to-video diffusion models and adapts them to sketches in three ways: fine-tuning so that the generated frames are in sketch style, a reference frame mechanism that uses noise refinement to preserve the look of the input sketch, and a dual-attention composition that allows fluid motion without losing visual consistency. Because the output is a sequence of raster frames rather than a constrained vector animation, the sketch can transform freely, much as it would in traditional hand-drawn animation.
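To give a rough feel for the dual-attention idea, the toy sketch below lets every token of a generated frame attend both to its own frame and to a reference frame, so that information from the input sketch can flow into each frame. It is an illustrative assumption, not the paper's implementation: the function names (`attend`, `dual_attention`), the use of raw tokens as queries, keys, and values with no learned projections, and the simple concatenation of the two contexts are all hypothetical simplifications.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    dim = len(values[0])
    outputs = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
        )
    return outputs


def dual_attention(frame_tokens, reference_tokens):
    """Hypothetical composition: frame tokens attend to their own frame
    and to the reference (input sketch) frame at the same time."""
    context = frame_tokens + reference_tokens
    return attend(frame_tokens, context, context)


# Toy data: two tokens for the current frame, one for the reference frame.
frame = [[1.0, 0.0], [0.0, 1.0]]
reference = [[1.0, 1.0]]
mixed = dual_attention(frame, reference)
```

Here `mixed` has one output vector per frame token, and each vector is a weighted average of frame and reference tokens. In a real video diffusion model the tokens would be projected by learned matrices and the composition would sit inside the model's attention layers; none of that is modeled in this sketch.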
Why does it matter?
This research lowers the barrier to animation: anyone who can draw a simple sketch and describe a motion in words can produce an animated result. By making sketch animation as simple as doodling and describing, FlipSketch lets more people tell stories and express their ideas visually.
Abstract
Sketch animations offer a powerful medium for visual storytelling, from simple flip-book doodles to professional studio productions. While traditional animation requires teams of skilled artists to draw key frames and in-between frames, existing automation attempts still demand significant artistic effort through precise motion paths or keyframe specification. We present FlipSketch, a system that brings back the magic of flip-book animation -- just draw your idea and describe how you want it to move! Our approach harnesses motion priors from text-to-video diffusion models, adapting them to generate sketch animations through three key innovations: (i) fine-tuning for sketch-style frame generation, (ii) a reference frame mechanism that preserves visual integrity of input sketch through noise refinement, and (iii) a dual-attention composition that enables fluid motion without losing visual consistency. Unlike constrained vector animations, our raster frames support dynamic sketch transformations, capturing the expressive freedom of traditional animation. The result is an intuitive system that makes sketch animation as simple as doodling and describing, while maintaining the artistic essence of hand-drawn animation.