The “DragGAN” AI method promises to revolutionize digital image processing


Imagine being able to try different outfits on your virtual avatar and see how they look from every angle. Or adjusting the direction in which your pet is looking in your favorite photo. Or even changing the perspective of a landscape image. Photo editing of this kind has always been challenging, even for experts. A new AI tool now promises that anyone can make such edits with just a few mouse clicks. The method is being developed by a research team led by the Max Planck Institute for Informatics in Saarbrücken, in particular by the Saarbrücken Research Center for Visual Computing, Interaction and Artificial Intelligence (VIA) located there.

This method has the potential to revolutionize digital image processing. “With ‘DragGAN,’ we are creating a user-friendly tool that allows even non-professionals to perform complex image edits. All you need to do is mark the areas in the photo you want to change and specify the desired edits in the menu. Thanks to the support of AI, anyone can adjust things like pose, facial expression, direction of gaze or angle of view with just a few clicks of the mouse, for example in a photo of a pet,” explains Christian Theobalt, Managing Director of the Max Planck Institute for Informatics, Director of the Saarbrücken Research Center for Visual Computing, Interaction and Artificial Intelligence, and Professor at Saarland University on the Saarland Informatics Campus.

This is made possible by artificial intelligence, specifically a type of model called a “Generative Adversarial Network,” or GAN. “As the name suggests, a GAN is capable of generating new content, such as images. The term ‘adversarial’ refers to the fact that a GAN involves two networks that compete with each other,” explains Xingang Pan, a postdoctoral researcher at the MPI for Informatics and first author of the paper. A GAN consists of a generator, which is responsible for creating images, and a discriminator, which is tasked with determining whether an image is real or was produced by the generator. The two networks are trained against each other until the generator produces images that the discriminator cannot distinguish from real ones.
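The competition between the two networks can be illustrated with the loss values each side tries to minimize. The following is a minimal sketch of the standard GAN training losses, not code from the DragGAN project; the function names and the example probabilities are assumptions chosen for illustration.

```python
import math

# Illustrative sketch only: the standard GAN losses for a single real image
# and a single generated image. Names and numbers are hypothetical.

def discriminator_loss(d_real: float, d_fake: float) -> float:
    """Loss of the discriminator.

    d_real: probability the discriminator assigns to "real" for a real image.
    d_fake: probability it assigns to "real" for a generated image.
    The loss is small when d_real is near 1 and d_fake is near 0.
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake: float) -> float:
    """Loss of the generator (commonly used non-saturating form).

    The loss is small when the discriminator is fooled, i.e. d_fake is near 1.
    """
    return -math.log(d_fake)

# Early in training: the discriminator easily spots generated images.
early = (discriminator_loss(0.9, 0.1), generator_loss(0.1))

# End of training: the discriminator can only guess (probability 0.5).
balanced = (discriminator_loss(0.5, 0.5), generator_loss(0.5))

print("early   :", early)
print("balanced:", balanced)
```

When the discriminator can no longer tell real from generated images, it outputs 0.5 for both, and its loss settles at 2·ln 2. This is the state described above, in which generated images are indistinguishable from real ones.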

There are many uses for GANs. For example, in addition to the obvious image generator use cases, GANs are good at predicting images: they enable what’s called video frame prediction, which can reduce data requirements for video streams by anticipating the next video frame. Or they can upscale low-resolution images, improving image quality by calculating where the extra pixels of the new image should go.

“In our case, this property of GANs proves advantageous when, for example, the direction of a dog’s gaze is to be changed in an image. The GAN essentially recalculates the entire image, anticipating where each pixel should land in the image with the new viewing direction. A side effect is that DragGAN can compute things that were previously hidden, for example by the position of the dog’s head. Or if the user wants to show the dog’s teeth, they can open the dog’s muzzle in the image,” explains Xingang Pan. DragGAN may also find applications in professional settings. For example, fashion designers could use it to adjust the cut of a garment in a photo after it has been taken, and vehicle manufacturers could efficiently explore different design configurations for a planned vehicle. Although DragGAN works on various categories of objects such as animals, cars, people and landscapes, most results so far have been achieved on synthetic images generated by the GAN. “How to apply it to any user input image is still a challenging problem that we are currently researching,” adds Xingang Pan.
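The point-based interaction named in the paper’s title can be pictured as moving a marked point toward a chosen target in small steps. The sketch below shows only this geometric idea and is not the DragGAN algorithm itself: the actual method adjusts the GAN’s internal representation so that the whole image is regenerated at each step. The function names, coordinates and step size are assumptions for illustration.

```python
import math

# Toy illustration only: moves a 2D point toward a target in fixed-size steps.
# It does not generate or edit any image. All names are hypothetical.

def drag_step(handle, target, step=1.0):
    """Return the handle point moved at most `step` pixels toward `target`."""
    dx = target[0] - handle[0]
    dy = target[1] - handle[1]
    dist = math.hypot(dx, dy)
    if dist <= step:
        return target  # close enough: land exactly on the target
    return (handle[0] + step * dx / dist, handle[1] + step * dy / dist)

def drag(handle, target, step=1.0, max_iters=1000):
    """Collect the positions of the handle point until it reaches the target."""
    path = [handle]
    while path[-1] != target and len(path) <= max_iters:
        path.append(drag_step(path[-1], target, step))
    return path

# Example: the user marks a point at (0, 0) and drags it to (3, 4).
path = drag((0.0, 0.0), (3.0, 4.0), step=1.0)
print(path[-1])
```

In an interactive editor, each intermediate position would correspond to one regenerated image, so the user sees the object move smoothly instead of jumping to the final pose.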

Just days after the preprint was released, the new tool from the Saarbrücken-based computer scientists caused a stir in the international technology community, and many consider it the next big step in AI-assisted image processing. While tools like Midjourney can be used to create entirely new images, DragGAN can greatly simplify their post-processing.

The new method is being developed at the Max Planck Institute for Informatics in collaboration with the “Saarbrücken Research Center for Visual Computing, Interaction and Artificial Intelligence (VIA)”, which was opened there in collaboration with Google. The research consortium also includes experts from the Massachusetts Institute of Technology (MIT) and the University of Pennsylvania.

Apart from Professor Christian Theobalt and Xingang Pan, the contributors to the paper, entitled “Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold,” are Thomas Leimkuehler (MPI INF), Lingjie Liu (MPI INF and University of Pennsylvania), Abhimitra Meka (Google), and Ayush Tewari (MIT CSAIL). The paper has been accepted by the ACM SIGGRAPH conference, the world’s largest professional conference on computer graphics and interactive technologies, to be held in Los Angeles, August 6-10, 2023.

Further information:

Original publication (preprint):
Xingang Pan, Ayush Tewari, Thomas Leimkuehler, Lingjie Liu, Abhimitra Meka, and Christian Theobalt. 2023. Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold. In Special Interest Group on Computer Graphics and Interactive Techniques Conference Proceedings (SIGGRAPH ’23 Conference Proceedings), August 6-10, 2023, Los Angeles, CA, USA. ACM, New York, NY, USA, 11 pages. https://doi.org/10.1145/3588432.3591500
https://arxiv.org/pdf/2305.10973.pdf

Project website: https://vcai.mpi-inf.mpg.de/projects/DragGAN/

Questions can be directed to:
Prof Dr Christian Theobalt
Max Planck Institute for Informatics
Tel.: +49 681 9325 4500

Background on the Max Planck Institute for Informatics:
The Max Planck Institute for Informatics in Saarbrücken is one of the world’s leading research institutes in computer science. Since its founding in 1990, the institute has researched the mathematical foundations of information technology in the areas of algorithms and complexity, as well as programming logic. At the same time, researchers at the institute have developed new algorithms for various application fields such as database and information systems, program verification, and bioinformatics. Basic research in visual computing, namely computer graphics and computer vision, at the intersection of artificial intelligence and machine learning, is also an important focus of the institute. With publications at the highest level and the education of excellent young researchers, the MPI for Informatics plays a major role in advancing basic research in computer science.

Background on the “Saarbrücken Research Center for Visual Computing, Interaction and Artificial Intelligence” (VIA):
The “Saarbrücken Research Center for Visual Computing, Interaction and Artificial Intelligence (VIA)” is a strategic research partnership between the MPI for Informatics and Google. It conducts basic research at the frontiers of computer graphics, computer vision and human-machine interaction, at the interface of artificial intelligence and machine learning. The center cooperates with Saarland University and with the many internationally renowned computer science research institutes at the Saarland Informatics Campus.

Background on the Saarland Informatics Campus:
900 scientists (including 400 PhD students) and around 2,500 students from more than 80 countries make the Saarland Informatics Campus (SIC) one of the leading locations for computer science in Germany and Europe. Four world-renowned research institutes, namely the German Research Center for Artificial Intelligence (DFKI), the Max Planck Institute for Informatics, the Max Planck Institute for Software Systems and the Center for Bioinformatics, together with Saarland University and its three connected departments and 24 degree programs, cover the entire spectrum of computer science.

Editor:
Philipp Zapf Schramm
Saarland Informatics Campus
Telephone: +49 681 302-70741



