CanFuUI: A Canvas-Centric Web User Interface for Iterative Image Generation with Diffusion Models and ControlNet

  • Conference paper
AI-generated Content (AIGC 2023)

Abstract

AI generation tools are emerging in rapid succession, but most existing tools are model-centric in design, which gives them steep learning curves and high usability thresholds. Moreover, current user interfaces lack built-in image editing capabilities, forcing users to rely on external software even for basic editing tasks. Because most image generation is an iterative process, this limitation significantly hampers user experience and creative potential. This paper instead proposes CanFuUI, a novel canvas-centric user interface that integrates editing functionality directly into the UI and streamlines secondary image processing. Within a single canvas, users can crop, modify, and annotate specific regions of generated images. Furthermore, canvas content serves as preprocessed images that feed directly into the ControlNet preprocessing procedure, strengthening the customization of AI-generated outputs.
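As a rough illustration of the canvas-to-ControlNet idea, the sketch below crops a region of a canvas and packages it as a control image for a generation request. It is not CanFuUI's implementation: the `Region` type, the `build_request` helper, every payload field name, and the raw-byte encoding (a stand-in for a real image export) are assumptions made for this example. The `"preprocessor": "none"` setting reflects one possible reading of the abstract, namely that canvas content is already in preprocessed form.

```python
import base64
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical rectangular selection on the canvas, in pixels."""
    left: int
    top: int
    width: int
    height: int


def crop(canvas, region):
    """Return the rows and columns of `canvas` covered by `region`."""
    rows = canvas[region.top:region.top + region.height]
    return [row[region.left:region.left + region.width] for row in rows]


def encode_grayscale(pixels):
    """Pack rows of 0-255 values into base64 text.

    Stand-in for exporting the canvas region as an image file.
    """
    raw = bytes(value for row in pixels for value in row)
    return base64.b64encode(raw).decode("ascii")


def build_request(prompt, canvas, region):
    """Assemble a hypothetical generation request (field names are assumed)."""
    control = crop(canvas, region)
    return {
        "prompt": prompt,
        "control": {
            "image": encode_grayscale(control),
            "width": region.width,
            "height": region.height,
            # Assumption: the canvas already holds the control map,
            # so no further preprocessing is requested.
            "preprocessor": "none",
        },
    }


# Usage: a 4x4 grayscale canvas whose pixel value is row * 4 + column.
canvas = [[row * 4 + col for col in range(4)] for row in range(4)]
request = build_request("a castle at dusk", canvas, Region(1, 1, 2, 2))
```

Keeping the crop and the encoding as separate steps mirrors the iterative workflow the abstract describes: the same canvas can be edited and re-submitted with a different region without changing the rest of the request.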

Q. Hu and Z. Xu—Contributed equally to this work.

Many thanks to Mr. Yihui Shen for his generous funding support.

Acknowledgements

We would like to thank the Zhejiang Provincial Blended First Class Online and Offline Course “Three-dimensional Character Design” (No. Z202Y22513), the Ministry of Education’s Industry School Cooperation Collaborative Education Project “Research on PTA-Based Programming Training and Evaluation Model” (No. 202101151011), and the 17th batch of Educational Reform Projects of Communication University of Zhejiang, “Cultivation and Practice of Computational Thinking in the Age of AI”, for their generous funding support of the work reported in this paper.

Author information

Correspondence to Hao Zeng or Tongqing Ma.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Hu, Q. et al. (2024). CanFuUI: A Canvas-Centric Web User Interface for Iterative Image Generation with Diffusion Models and ControlNet. In: Zhao, F., Miao, D. (eds) AI-generated Content. AIGC 2023. Communications in Computer and Information Science, vol 1946. Springer, Singapore. https://doi.org/10.1007/978-981-99-7587-7_11

  • DOI: https://doi.org/10.1007/978-981-99-7587-7_11

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-7586-0

  • Online ISBN: 978-981-99-7587-7

  • eBook Packages: Computer Science (R0)
