TC-LoRA: Temporally Modulated Conditional LoRA for Adaptive Diffusion Control
Minkyoung Cho, Ruben Ohana, Christian Jacobsen, Adityan Jothi, Min-Hung Chen, Z. Morley Mao, Ethem Can
2025-10-13
Summary
This paper introduces TC-LoRA, a new way to control diffusion-based image generation models that allows more precise and adaptable control over the final image.
What's the problem?
Current methods for controlling these image generators usually apply instructions in the same way throughout the entire image creation process, from the rough outline to the fine details. This is like giving the same directions to an artist whether they're sketching the basic shape or adding the finishing touches – it's not very effective because the needs change as the image develops. The existing methods aren't flexible enough to adjust how they respond as the image gets more refined.
What's the solution?
The researchers developed a technique called TC-LoRA. Instead of changing how the model *acts* at each step, TC-LoRA subtly changes the model *itself* at each step. It uses a smaller 'helper' network (a hypernetwork) to generate tiny adjustments to the main model's weights (LoRA adapters), and these adjustments differ depending on how far along the image creation is and what the user wants. Think of it like giving the artist different brushes and instructions for each stage of painting.
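The mechanism can be sketched in a few lines of PyTorch. This is a minimal illustration of the general idea, not the paper's implementation: all class names, shapes, and the hypernetwork architecture here are assumptions. A small MLP takes the diffusion timestep and a condition embedding and emits per-sample low-rank factors A and B, which modulate a frozen linear layer.

```python
import torch
import torch.nn as nn

class TimeConditionalLoRA(nn.Module):
    """Hypothetical sketch: a hypernetwork emits rank-r LoRA factors for one
    frozen linear layer, conditioned on the diffusion timestep and a
    condition embedding. Names and shapes are illustrative assumptions."""

    def __init__(self, d_in, d_out, rank=4, cond_dim=32, hidden=64):
        super().__init__()
        self.rank, self.d_in, self.d_out = rank, d_in, d_out
        # Frozen backbone weight W (stands in for a pretrained layer).
        self.weight = nn.Parameter(torch.randn(d_out, d_in), requires_grad=False)
        # Hypernetwork: maps [timestep, condition] -> flattened LoRA factors A, B.
        self.hyper = nn.Sequential(
            nn.Linear(1 + cond_dim, hidden),
            nn.SiLU(),
            nn.Linear(hidden, rank * (d_in + d_out)),
        )

    def forward(self, x, t, cond):
        # x: (batch, d_in); t: (batch, 1) normalized timestep; cond: (batch, cond_dim)
        params = self.hyper(torch.cat([t, cond], dim=-1))
        A = params[:, : self.rank * self.d_in].view(-1, self.rank, self.d_in)
        B = params[:, self.rank * self.d_in :].view(-1, self.d_out, self.rank)
        base = x @ self.weight.T                         # frozen path: x W^T
        delta = torch.einsum("bor,bri,bi->bo", B, A, x)  # per-sample LoRA path: B A x
        return base + delta

layer = TimeConditionalLoRA(d_in=16, d_out=16)
x = torch.randn(2, 16)
out = layer(x, t=torch.rand(2, 1), cond=torch.randn(2, 32))
print(out.shape)  # torch.Size([2, 16])
```

Because only the hypernetwork's parameters require gradients, training updates how the LoRA adjustments depend on time and condition while the backbone stays frozen.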
Why does it matter?
This new approach leads to images that are more faithful to the user's instructions and have better overall quality. It's a significant step forward because it allows for a more dynamic and nuanced control over image generation, meaning we can create images that are closer to what we envision and that better reflect the specific details we request.
Abstract
Current controllable diffusion models typically rely on fixed architectures that modify intermediate activations to inject guidance conditioned on a new modality. This approach uses a static conditioning strategy for a dynamic, multi-stage denoising process, limiting the model's ability to adapt its response as the generation evolves from coarse structure to fine detail. We introduce TC-LoRA (Temporally Modulated Conditional LoRA), a new paradigm that enables dynamic, context-aware control by conditioning the model's weights directly. Our framework uses a hypernetwork to generate LoRA adapters on-the-fly, tailoring weight modifications for the frozen backbone at each diffusion step based on time and the user's condition. This mechanism enables the model to learn and execute an explicit, adaptive strategy for applying conditional guidance throughout the entire generation process. Through experiments on various data domains, we demonstrate that this dynamic, parametric control significantly enhances generative fidelity and adherence to spatial conditions compared to static, activation-based methods. TC-LoRA establishes an alternative approach in which the model's conditioning strategy is modified through a deeper functional adaptation of its weights, allowing control to align with the dynamic demands of the task and generative stage.
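In symbols, with notation chosen here for illustration rather than taken from the paper: for a frozen backbone weight $W$, a hypernetwork with parameters $\phi$ maps the timestep $t$ and the user's condition $c$ to low-rank factors, giving an effective weight at each denoising step:

```latex
W_{\mathrm{eff}}(t, c) = W + \Delta W(t, c),
\qquad
\Delta W(t, c) = B_\phi(t, c)\, A_\phi(t, c),
```

where $A_\phi(t, c) \in \mathbb{R}^{r \times d_{\mathrm{in}}}$ and $B_\phi(t, c) \in \mathbb{R}^{d_{\mathrm{out}} \times r}$ with rank $r \ll \min(d_{\mathrm{in}}, d_{\mathrm{out}})$. Only $\phi$ is trained; $W$ remains frozen, and the conditioning strategy varies with $t$ and $c$ rather than being fixed across the denoising trajectory.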