Enhancing DreamBooth with LoRA for generating unlimited characters with stable diffusion

Pascual Casas, Rubén; Maiza Coupin, Adrián Mikel; Sesma Sara, Mikel; Paternain Dallo, Daniel; Galar Idoate, Mikel

Files

Pascual_Enhancing.pdf (13.67 MB)

Date

2024-09-09

Authors

Pascual Casas, Rubén

Maiza Coupin, Adrián Mikel

Sesma Sara, Mikel

Paternain Dallo, Daniel

Galar Idoate, Mikel

Publisher

IEEE

Open access

Conference contribution

Accepted version

Project identifier

AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2022-136627NB-I00/ES/
Gobierno de Navarra//0011-1365-2022-000130/

Abstract

This paper addresses the challenge of generating unlimited new and distinct characters that share the style and visual characteristics of a limited set of human-designed characters. This is a relevant problem in the audiovisual industry, as the ability to rapidly produce original characters that adhere to specific characteristics greatly increases the possibilities in the production of movies, series, or video games. Our solution is built upon DreamBooth, a widely used fine-tuning method for text-to-image models. We propose an adaptation focusing on two main challenges: the impracticality of relying on detailed image prompts for character description and the few-shot learning scenario with a limited set of characters available for training. To solve these issues, we introduce additional character-specific tokens to DreamBooth training and remove its class-specific regularization dataset. To generate an unlimited number of characters, we propose using random tokens and random embeddings. This proposal is tested on two specialized datasets, and the results show our method's capability to produce diverse characters that adhere to a style and visual characteristics. An ablation study analyzing the contributions of the proposed modifications is also presented.

Keywords

Training, Industries, Visualization, Video games, Neural networks, Text to image, Focusing

Department

Estadística, Informática y Matemáticas / Estatistika, Informatika eta Matematika / Institute of Smart Cities - ISC

URI

https://academica-e.unavarra.es/handle/2454/53539

https://doi.org/10.1109/IJCNN60899.2024.10651300

Citation

Pascual, R., Maiza, A., Sesma-Sara, M., Paternain, D., Galar, M. (2024). Enhancing DreamBooth with LoRA for generating unlimited characters with stable diffusion. In Poggio, T., Comminiello, D., Morabito, F. C., Vellasco, M., Uncini, A., Scarpiniti, M., Hammer, B., Chen, B., Gori, M., Dauwels, J., Kuh, A., Tian, Z., Tanaka, T., Grassucci, E., Took, C. C., Ricci, E., Scardapane, S., Mitsufuji, Y., Silvestri, F., Squartini, S., Venayagamoorthy, G. K., Principi, E., Zhou, J., Soda, P., Xu, Z., Ji, H., Liwicki, M., Amerini, I., Roy, A., Príncipe, J. C., Sperduti, A., Duro, R., Tobar, F., Bacciu, D., Qin, K., Guarrasi, V., Ludermir, T. B., Hirose, A., Kasabov, N., Jayne, C. (Eds.), 2024 International Joint Conference on Neural Networks (IJCNN) (pp. 1-8). IEEE. https://doi.org/10.1109/IJCNN60899.2024.10651300

Rights

© 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other work.

Collections

Comunicaciones y ponencias de congresos DEIM - EIMS Biltzarretako komunikazioak eta txostenak
Comunicaciones y ponencias de congresos - Biltzarrak eta Argitalpenak
Comunicaciones y ponencias de congresos ISC - ISC biltzarretako komunikazioak eta txostenak
