Institute of Artificial Intelligence, China Telecom (TeleAI)
Content-preserving style transfer—generating stylized outputs based on content and style references—remains a significant challenge for Diffusion Transformers (DiTs) due to the inherent entanglement of content and style features in their internal representations. In this technical report, we present TeleStyle, a lightweight yet effective model for both image and video stylization. Built upon Qwen-Image-Edit, TeleStyle leverages the base model’s robust capabilities in content preservation and style customization. To facilitate effective training, we curated a high-quality dataset of distinct specific styles and further synthesized triplets using thousands of diverse, in-the-wild style categories. We introduce a Curriculum Continual Learning framework to train TeleStyle on this hybrid dataset of clean (curated) and noisy (synthetic) triplets. This approach enables the model to generalize to unseen styles without compromising precise content fidelity. Additionally, we introduce a video-to-video stylization module to enhance temporal consistency and visual quality. TeleStyle achieves state-of-the-art performance across three core evaluation metrics: style similarity, content consistency, and aesthetic quality.
- Jan 30, 2026: We refined the code and updated requirements.txt. We also uploaded a new version of the TeleStyle-Image model with better performance and released a free online demo for TeleStyle-Image. Please star this project if you find the demo useful.
- Jan 28, 2026: We released the technical report, code, and model of TeleStyle.
- Release inference code
- Release models
- Release technical report
pip install -r requirements.txt
This environment is tested with:
- Python 3.11
- PyTorch 2.9.1 + CUDA 12.1
- diffusers 0.36.0
- transformers 4.57.3
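To compare a local environment against the tested versions listed above, a small check along these lines may help. It is a minimal sketch, not part of the TeleStyle code: the helper name `check_versions` is hypothetical, and it assumes the packages are installed under their usual PyPI distribution names (`torch`, `diffusers`, `transformers`).

```python
from importlib import metadata

# Versions copied from the tested environment listed above.
TESTED = {
    "torch": "2.9.1",
    "diffusers": "0.36.0",
    "transformers": "4.57.3",
}


def check_versions(tested=TESTED):
    """Return {package: (tested_version, installed_version_or_None)} for each mismatch."""
    mismatches = {}
    for name, expected in tested.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Some wheels append a local build suffix after "+"; compare the public part only.
        if installed is None or installed.split("+")[0] != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_versions().items():
        print(f"{name}: tested with {expected}, found {installed}")
```

A mismatch does not necessarily mean the code will fail; it only indicates that the environment differs from the one that was tested.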
Download the TeleStyle checkpoints to a local path, for example weights/:
We provide image and video checkpoints:
- Image (reference style image + content image -> stylized image): diffsynth_Qwen-Image-Edit-2509-Lightning-4steps-V1.0-bf16.safetensors; diffsynth_Qwen-Image-Edit-2509-telestyle.safetensors
- Video (stylized first frame + content video -> stylized video): dit.ckpt; prompt_embeds.pth
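Before running inference, it can be useful to confirm that the checkpoint files named above are present. The sketch below assumes all files sit directly inside a single folder such as weights/; that flat layout and the helper name `missing_checkpoints` are assumptions for illustration, and the inference scripts may expect a different directory structure.

```python
from pathlib import Path

# File names copied from the checkpoint list above.
IMAGE_FILES = [
    "diffsynth_Qwen-Image-Edit-2509-Lightning-4steps-V1.0-bf16.safetensors",
    "diffsynth_Qwen-Image-Edit-2509-telestyle.safetensors",
]
VIDEO_FILES = ["dit.ckpt", "prompt_embeds.pth"]


def missing_checkpoints(root, names):
    """Return the names from `names` that are not regular files under `root`."""
    root = Path(root)
    return [name for name in names if not (root / name).is_file()]


if __name__ == "__main__":
    # "weights" is the example path from the text; the flat layout is an assumption.
    for label, names in (("image", IMAGE_FILES), ("video", VIDEO_FILES)):
        missing = missing_checkpoints("weights", names)
        print(f"{label}: {'all files found' if not missing else 'missing ' + ', '.join(missing)}")
```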
We provide inference scripts for running TeleStyle-Image and TeleStyle-Video:
python telestyleimage_inference.py
python telestylevideo_inference.py --video_path assets/example/1.mp4 --style_path assets/example/1-0.png --output_path results/video.mp4
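To stylize several videos in one go, the video command above can be wrapped in a short driver script. This is a sketch under stated assumptions: it uses only the three flags shown above (`--video_path`, `--style_path`, `--output_path`), and it assumes each video `<name>.mp4` has its stylized first frame stored next to it as `<name>-0.png`, a naming convention inferred from the example paths rather than documented. The function names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def build_command(video_path, style_path, output_path):
    """Build the argument list for one call to telestylevideo_inference.py."""
    return [
        sys.executable, "telestylevideo_inference.py",
        "--video_path", str(video_path),
        "--style_path", str(style_path),
        "--output_path", str(output_path),
    ]


def stylize_folder(video_dir, output_dir, dry_run=False):
    """Run inference for every <name>.mp4 with a matching <name>-0.png (assumed naming).

    Returns the list of commands; with dry_run=True nothing is executed.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    commands = []
    for video in sorted(Path(video_dir).glob("*.mp4")):
        style = video.with_name(f"{video.stem}-0.png")
        if not style.is_file():
            continue  # skip videos without a stylized first frame
        cmd = build_command(video, style, output_dir / video.name)
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises if the script exits with an error
    return commands
```

Calling `stylize_folder("assets/example", "results", dry_run=True)` from the repository root returns the commands without running them, which is a cheap way to check the pairing before starting long jobs.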
If you find TeleStyle useful in your research, please star the project and cite our paper:
@article{teleai2026telestyle,
title={TeleStyle: Content-Preserving Style Transfer in Images and Videos},
author={Shiwen Zhang and Xiaoyan Yang and Bojia Zi and Haibin Huang and Chi Zhang and Xuelong Li},
journal={arXiv preprint arXiv:2601.20175},
year={2026}
}