Hello, Let's Talk AI listeners. Today we have something a little bit different for you. It's a panel discussion on GANs, done to mark the launch of the GAN specialization on Coursera, which I hope you'll take, and which I will be teaching.
This panel discussion convenes Ian Goodfellow, who is the inventor of GANs, and Anima Anandkumar, who's the director of machine learning research at NVIDIA. NVIDIA builds amazing, state-of-the-art GANs. We're joined by Alexei Efros from Berkeley. He's a professor there, and his lab has produced some of the most amazing state-of-the-art GANs as well, especially around style transfer, for example turning a horse into a zebra. And finally, we'll also have Andrew Ng, my advisor, there with me, and I'll be moderating the panel for hopefully some fun discussion, but also for some insightful advice to listeners out there. So Sharon, I'll turn it over to you now.
Thanks so much, Ryan. And if what you saw in the keynotes caught your fancy or went over your head, that's what the specialization is here for. You'll get to learn everything from the very basics of what a GAN is to the state-of-the-art StyleGAN that Anima showed. So with those fantastic keynotes from Ian and Anima, let's start off by letting the new faces in the room, Alexei and Andrew, say hi to our audience. Alexei? Hi.
Well done, Alexei. Maybe a sentence about yourself. Go for it, Alexei. Hi, I'm Alexei. I'm a professor at UC Berkeley doing computer vision and computational photography. And I've been a huge fan of GANs. I don't fall in love with papers often, but this was definitely love at first sight. And so I'm really excited to be part of this GAN story.
- Hey, everyone. Thanks to all of you from all 140 countries and all the time zones joining us, middle of the night, morning, evening. It's really exciting to be here. I've had the pleasure of knowing Ian for a long time. It's great to reconnect with him and with Alexei, whom I think I first met when we were both about to enter PhD programs and ran into each other touring different university campuses. And I'm really grateful as well that Anima
and Sharon are here. GANs are a big movement. It's one of those amazing technologies that frankly wasn't working so well six years ago, when Ian published his first paper, but has now really taken off thanks to the work of NVIDIA and many other groups around the world. And I think it's poised to find a lot of exciting applications. I was really struck by Ian and Anima's comments on the concrete use cases, where GANs are no longer just something that generates cool pictures you look at on the Internet, but things that are really useful and are going into people's mouths in the dentist's office. So I hope that some of you watching online today will take the GAN specialization, learn about these tools, and go help build amazing things that will make life better for a lot of people.
Thank you, Alexei and Andrew. For the next 30 minutes or so, I'll be asking our panelists some questions. And while I'll be directing questions to a specific person, I'd also like to encourage any of you panelists to jump in and offer your input in the conversation. So starting with Alexei, can you tell us a bit about what your lab has been working on recently? Students in the specialization will be spending a large chunk of time learning about work from you and your students.
Sure. As I said, I'm a big fan of GANs, so there is a lot of things that happen in GANs. Thank you, Ian, for giving a shout out to our work with Gladwell on the dental reconstruction. This was done with my wonderful colleague, Stella Yu. Lately, we have been really thinking hard about disentanglement, just like Anima, and
because I'm really focused on unsupervised learning, we have been looking at disentanglement in an unsupervised setting. We have several papers on that, trying to use a Hessian penalty for disentanglement and also using contrastive learning to disentangle texture from structure. We have a couple of very recent papers. One is on replacing the cycles in CycleGAN with contrastive learning; that was in ECCV. And we have a brand-new paper at NeurIPS called Swapping Autoencoders, where we separate style and content, again in an unsupervised way, that we're pretty excited about. Apart from that,
I've been working for a long time in self-supervised learning, again going away from labels. I don't like labels, so I'm trying to stay away from them. And I've also been pushing against data sets. I don't like data sets, even though I love data; I don't like the fact that they're stationary. And so we have been really focusing on online learning from streaming data. We have a paper on test-time training where we basically adapt to the streaming data in an online way. This is just starting, and I think there are a lot of cool directions there. Yeah.
Fantastic. I love your contrastive learning paper. And in our lab with Andrew, we are also looking at disentanglement, so that seems to be a trend here. In a sense, I think GANs are doing a lot of the disentanglement. Even if you just take an off-the-shelf StyleGAN or BigGAN, there is a lot of cool disentanglement in there already, and we just don't quite understand how much. So I think this is one of the exciting directions now, yeah. - Yes, very much so. And a huge trend in research right now. Speaking of trends, Andrew, what trends around GANs are you most excited about?
By the way, not sure if I should say this, but sadly, our submission to NeurIPS on disentanglement did not get accepted. That's okay. It happens to everyone. You live and you move on, and then you try again. Trends I'm excited about: I think there's still so much basic research on GANs going on, and that's great. Clearly, technology innovations and new models are still coming, which is great.
What I see in the deep learning world, over and over, is a pattern: there's a new technology, it works amazingly well in the lab, and this opens up the door for a lot of exciting, creative applications. Maybe 10 or 12 years ago, we started to see this with supervised deep learning, and then we saw a lot of dominoes topple. Deep learning had its first major impact in speech recognition, followed by computer vision with ImageNet, and then I saw the same transformation in NLP, and so on.
And GANs pattern-match, to me, to one of those amazing technologies, and now they work so well that you go, wow, this is so cool. But I'm really excited about exploring all the ways to take these and put them into useful applications, which is actually really difficult. The whole AI world, frankly, is not great at bridging the gap from research and proof of concept to production. But when Ian talked about the gaze tracking, where you're mapping one eye, the computer graphics eye, to another,
It turns out that in order to apply that broadly across many industries and many problems, I think there'll be a lot of important stuff to solve. What happens if the mapping doesn't work? How do you make sure you don't have mode collapse or training problems? All of those are issues, not just when Ian is the one at the keyboard making it work well, but how to make it systematic so that many teams, hundreds of thousands of teams across the industry can make it work.
I'm also excited about the creativity that GANs could unleash, everything from Photoshop 2.0 to all the ways we have to manipulate images. I hadn't realized Apple had such cool work on AR using GANs. But for all that creativity, we need more people to understand these algorithms so that they can be the creative ones who come up with the cool applications.
Thanks, Andrew. And I think Anima perhaps shares that perspective, since she splits her time between NVIDIA and Caltech. Anima, what are some of the unique challenges you see in applying GANs to business, even when you're at NVIDIA and have a seemingly infinite supply of GPUs?
Well, you know, it's really exciting to be working at NVIDIA, but like everyone else, of course, you know, we want to make a good case for why we use GPUs, right? And use it for good benefits to humanity. And that's where I'm
very excited by my colleagues at NVIDIA, who've been pushing so hard to make GANs photorealistic and get to these really high-quality images that have now passed the Turing test, whether it's StyleGAN or GauGAN, which has, I think, now maybe even up to a million downloads,
where you can turn into an artist and you can give like rough broad brushes of the landscape you want to draw and then it turns it into a really impressive image. And then also moving more into the 3D world, how to do generation there, what are challenges and how to do it at scale. That's where there have been a lot of great researchers at NVIDIA looking into these cutting edge topics in generative models.
On the Caltech side, what I'm excited about is interdisciplinary research, especially talking to neuroscientists such as Doris and looking at how we can get inspiration from how our brain does this. Does the brain have a generative model? Apparently that question is still open and unanswered, but we do have some form of feedback. We do have some form of representation of the world around us.
And when we are seeing and perceiving, we are hallucinating to get to the actual image. We are not just taking an external input from the world; we are also building on our internal representations and doing it as feedback. So some of our recent work looks at how to add this kind of feedback to any standard feed-forward neural network architecture, which can make it a lot more robust to all kinds of corruptions it hasn't seen during training,
because that's the other important aspect of bringing it to the real world: these models have to be robust. The current brittleness, where even a small amount of imperceptible noise completely throws off your predictions, is not something that can be brought into the real world. So looking at how we can take inspiration from mechanisms in our brain, like feedback, and bring some of that into neural network architectures is something that's been very exciting.
Very cool and very salient. Going off of the utility of GANs: Ian, can you expand a bit more on some of the topics from your keynote on using GANs for good, particularly pertaining to differential privacy or combating bias? Yeah, a lot of these are areas that have only had little nibbles taken at them in the research literature, and I'm hoping people from this event go on to explore them in more depth.
One of them is differential privacy. The idea behind differential privacy is you can train a machine learning model in a way where it doesn't memorize the individual characteristics of individual examples in the training set. So if that model you're training is a GAN, it can then make you new data without revealing anything about the real data that went into it at the start.
That's super powerful for medicine. We've seen a proof of concept from Casey Greene's lab where they actually trained GANs on medical data and were then able to make new fake medical data that can be released publicly, effectively an infinite amount of it. Especially because it's really hard to pool data from different clinics due to things like HIPAA considerations, differentially private generative models seem like a really good way to get over the data scarcity problem in medicine.
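To make the differential-privacy idea Ian describes a bit more concrete, here is a minimal pure-Python sketch of the core trick in DP-SGD (clip each example's gradient, then add noise before updating), shown on a hypothetical one-parameter toy model rather than a real GAN or any particular library's API:

```python
import random

def dp_sgd_step(w, batch, clip_norm=1.0, noise_mult=1.1, lr=0.1, rng=random):
    """One DP-SGD step on a toy model: minimize mean of (w - x)^2.

    Each example's gradient is clipped to at most `clip_norm`, and
    Gaussian noise scaled by `noise_mult * clip_norm` is added to the
    sum, so no single training example can dominate the update --
    that is the intuition behind not memorizing individuals.
    """
    clipped = []
    for x in batch:
        g = 2.0 * (w - x)                              # per-example gradient
        scale = min(1.0, clip_norm / max(abs(g), 1e-12))
        clipped.append(g * scale)                      # clip to bounded norm
    noisy_sum = sum(clipped) + rng.gauss(0.0, noise_mult * clip_norm)
    return w - lr * noisy_sum / len(batch)

rng = random.Random(0)
w = 0.0
data = [1.0, 1.2, 0.8, 1.1]   # tiny stand-in for a "training set"
for _ in range(200):
    w = dp_sgd_step(w, data, rng=rng)
print(round(w, 2))  # lands near the data mean (~1.0), despite the noise
```

In a real differentially private GAN, the same clip-and-noise step would be applied to the discriminator's gradients during training; the accounting of the actual privacy budget (epsilon, delta) is omitted here.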
For a lot of other topics, there's reasons why you might want to generate more data for a given area with a GAN. If you want to, for example, support a language that isn't spoken very commonly, it might help to generate more data for that language.
Or there's a startup called Vue.ai that allows people to visualize themselves in clothes available from a retailer. Traditionally, you'd have to rely on the retailer having hired a model who looks kind of like you. And now you can use the GAN from Vue.ai to generate somebody who's your race, your skin color, your hair color, your body shape. It makes the whole model photography process a lot more inclusive in that sense.
So I think there are a lot of different things you can do. Those are just a few things starting to scratch the surface. Thanks. I think Anima had touched on this a little bit before: there's still a gap between deploying GANs in real-world, decision-making environments and understanding where they learn and are effective and where they might fail. Ian, did you want to touch a bit on this?
Yeah, it's also pretty similar to what Andrew said about how supervised deep learning went from the lab to the real world. The state of GANs today kind of reminds me of the state of supervised deep learning, maybe like circa 2012, that it used to really take a wizard to train a deep learning system. And to some extent today, GANs are still like that.
Now deep learning is considered relatively reliable, and it's because we found all these nice recipes, like always using ReLUs and always using momentum, plus a few technologies that didn't radically change the paradigm but made it so much more reliable, like Adam and ResNets. I'm hoping we get those kinds of reliability technologies for GANs, so we can apply them in lots of applications without needing a GAN wizard.
One of the interesting things to me, just going off Ian's comment, is that a problem, if I may, with the whole GAN world is that we don't have very good evaluation metrics. We generate something and just go, ooh, this looks great. In fact, one of the pieces of work I found really interesting was some of Sharon's work on HYPE,
showing how challenging and how problematic it is to use some of the automated evaluation metrics. And I think this contributes to the need for GAN wizards, because the right wizard looks at it and their eyeball says, "Oh, it's clearly mode collapse. I'm going to do this," or something. Actually, Sharon, I don't mean to put you on the spot, I know you're the moderator, but since you're a world expert on this, do you want to say anything about this problem?
I think evaluation is a big problem in GANs and you'll get to learn about it in the specialization. But because it is a problem, I think it very much depends on your downstream task and what you want to use your GAN for. So your GAN can help your other AI models, which is cool.
But then you can evaluate your GAN based on how much it helps your downstream AI model, on tasks like classification, segmentation, and the like. Or the evaluation could be around realism, and then I think it's really important that we have humans in the loop, since we humans are perhaps the gold standard for evaluating realism.
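To illustrate what an automated metric actually computes, and why it is only a proxy for realism, here is a hypothetical one-dimensional analogue of the Frechet Inception Distance (the real FID works on Inception-v3 features with full covariance matrices, not raw 1-D values):

```python
import statistics

def fid_1d(real, fake):
    """Simplified 1-D analogue of the Frechet Inception Distance.

    Fit a Gaussian to each sample and compute the Frechet distance
    between the two Gaussians: (mu1 - mu2)^2 + (sigma1 - sigma2)^2.
    Lower is "closer to real" -- but only in mean and spread.
    """
    mu_r, sd_r = statistics.fmean(real), statistics.pstdev(real)
    mu_f, sd_f = statistics.fmean(fake), statistics.pstdev(fake)
    return (mu_r - mu_f) ** 2 + (sd_r - sd_f) ** 2

real = [0.0, 1.0, 2.0, 3.0]        # "real" feature values
good = [0.1, 1.1, 2.1, 3.1]        # slightly shifted: low distance
collapsed = [1.5, 1.5, 1.5, 1.5]   # mode collapse: zero variance
print(fid_1d(real, good))       # small
print(fid_1d(real, collapsed))  # much larger: the variance term catches collapse
```

This toy version shows both the appeal (it detects mode collapse via the variance term) and the limitation (two very different distributions with matching mean and variance would score zero), which is one reason human evaluation such as HYPE remains important.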
And in terms of democratizing GANs, which Ian was talking about, I think some of the really interesting work was by the NVIDIA group, open-sourcing things so the rest of us could use them. I've gone to the NVIDIA website and looked at it. Maybe Anima can say more?
Yeah, and I'm happy to announce that just a few days ago, we announced the release of Imaginaire. All the very sophisticated, cutting-edge GAN models, we've now put them all in one place. Ming-Yu's group has worked hard, holding a hackathon over the past six months to get all of that in great shape so that everyone can use it. So we'd love for more people to check it out throughout this course and maybe adopt some of those models.
Very cool. Thank you. I'm really excited about that and was following the tweet about it pretty closely. Maybe switching gears, let's think about our audience members a bit and what they can do to prepare to be good GAN researchers, practitioners, or students in general. Alexei, to start: what makes a good AI researcher in your lab, and what characteristics do you like to see in your students?
Well, I mean, I'm pretty happy to say that the main characteristic of my students, and me also, is that we're all a little bit crazy. I think it's kind of important to be a little bit crazy if you want to do research, and
I think really it's the same things that, you know, in any discipline you want to, you know, you want to be imaginative. You don't want to focus on short term. You don't want to have, you know, focus on getting papers out every, you know, three weeks. I think it's all about having some kind of thing that you really, really want to do and just trying to go there. What I tell my students is,
don't stress out about all these papers. The faucet is open on arXiv, and there are so many papers coming out every day, so many papers being published. You know, there is this idea in medicine: when you graduate from medical school, they tell you about the concept of the half-life of knowledge. They say, okay, remember, five years from now, half of what we have taught you in medical school will turn out to be false, just because science moves forward. And I think in ML and in GANs, that number is maybe three months. The half-life of knowledge in GANs is about three months. So I would not worry about every single paper, every single thing, every single trick. It's fine. You can wait a little bit and see if it actually works, if it takes off.
I don't read every paper. For example, if the paper only has faces in it, I don't read it because we know that faces are easy. Faces worked 20 years ago using active appearance models. So they need to try something harder. Or if they just have MNIST or something, CIFAR, I wait until they try it on harder data sets. So that kind of cuts out a lot of the chatter. So I would just not worry and not be stressed out because if you have a good idea,
just go for it. It will come up. Definitely. And I will say I have been scooped before. I think I was thinking about doing a semi-supervised GAN at some point, and then saw it from NVIDIA.
Anima, as a fellow woman in AI, I and others definitely look up to you for advice and inspiration. Without a doubt, you've lent incredible support to your students, and you're a role model for a lot of women looking to go into AI. What advice would you give to the women and girls tuning in who are learning AI?
Thank you, Sharon. And yeah, I'd love to see more diversity and inclusion in our field. I think there's been incredible awareness in the last few years and, you know, the majority has been supportive and positive, right? So despite all the trolling and some of the negativity, I think the positivity overwhelms at the end of the day and
That's why I continue to speak out and I continue to make sure that we create a healthy environment for everyone in the community. And so what I would say is to keep fighting. If you are seeing a problem, whether it's a technical problem or it's a societal problem, it's something you care about.
you may not see the immediate returns. You have to not only fight for it, also find the allies, find your support network, and also learn how to communicate well because whether it's a research aspect or everything else, we want to see change in the community. I think at the end of the day, we need that support network and we need to make sure that we create the awareness. Definitely.
And for those tuning in on the business side: Andrew, as a successful CEO, founder, and professor, what's your secret to managing your time and keeping yourself updated on AI? And what advice would you give to folks from the business side who are considering this specialization?
Yeah, actually, before I answer, just a shout-out to Anima for really consistently speaking up on social issues that matter. I feel like we live in a world where there are, you know, lots of ups and downs. Having watched the US presidential debate just last night, I think it's more important than ever that when any of us has a strong opinion on something right or wrong, we speak up, because every voice matters. I think Anima has been a consistent voice in that, and I think that's great. And Sharon released an app on the internet to help anonymize protesters, to protect them. I think it's wonderful that Sharon just woke up and said, I'm a deep learning researcher, I want to build this because I can make the world a better place. Every one of you watching this online, your voice matters. Don't ever think that,
individually, our voices are limited, but collectively, we, the AI community, are incredibly powerful. But only if all of us speak up about the things that matter so that we can shift the whole world toward doing more of the things we want that make people better off. And so to answer Sharon's question, so much is happening.
Honestly, one of my main sources of knowledge is The Batch, which DeepLearning.AI publishes. There's a large team of writers that our editor-in-chief Ted Greenwald organizes to synthesize the most important AI news and share it in a very succinct form with everyone.
The issue of The Batch that's going out later today is a GAN special issue. It covers a bunch of cool research on GANs, some of which I honestly did not know about myself until Ted and his team found those stories and synthesized them. So I actually use that.
I'm fortunate also to have friends. To be completely honest, my personal number one source of knowledge on GANs is actually Sharon. Sharon tells me when she sees something cool. So that's fun. But I find that having friends you can talk to, to read papers together, to brainstorm together,
It has been a very important part of how I keep up to date and so I encourage others to form a community. None of us should have to do this alone. So make friends, form a community and work with them because we're all really much better and much stronger together.
Thank you. That's powerful. The newsletter, The Batch, actually has an interview with Ian as well in it, which I hope you all check out. And I think this would be a question for all of you on how do you stay up to date on the latest machine learning research? Maybe starting with Ian? Yeah. At Apple, we actually have an event called Paper Party.
where once a week we have people get together and share a paper. And I think that's really powerful because only the person presenting it needs to have read it. It's hard to get everyone in a discussion group to read a paper, but one person can explain it succinctly to other people. And then beyond that, just like how Andrew relies on you for updates about GANs, I have friends in different subject matter areas that keep me up to date.
I actually read very few papers. I mostly discuss them with other people, and through events like Paper Party, that's how I stay up to date. Great. I know my favorite kind of paper group was definitely one where there was no expectation of having read the paper going in. We just read while we were there together, which is nice. Anima, how do you stay up to date?
Yeah, I mean, some of the reading groups are great, especially if they involve people from a broader community, coming from different areas, so we can look at different papers, not just in a narrow topic. But mostly, again, I don't read papers. Most of the time, I also don't encourage my students to just read papers on arXiv. Most of the time, it's a thought process. We are asking, why is this,
say, problem hard and what could be a new thing that can be done, right? What is the barrier now? And then maybe look back and see if others have looked at similar ideas. So I encourage them to first come up with ideas on their own and then maybe check back and see if that's already been done. And if somebody has done it, great. I know that means they were thinking in the right direction. If not, then there are new topics.
So I think encouraging more creativity and original thought is really important in this mad rush of papers we see on arXiv. It's very true. When I saw the semi-supervised StyleGAN, I was like, I'm so glad NVIDIA did this; I would never be able to do something that awesome. Alexei?
I also don't read as many papers as I would like to, and I rely on my students to tell me what's cool and what's working. I like to read older papers. I feel a little bit bad that we are reading fewer papers, especially because I put so much blood, sweat, and tears into my papers. With every paper, the introduction, I try to make it all very, you know, Tolstoyan. I read your papers. Yeah.
But it's harder and harder because I know that less and less people read them. So I think it's good to read papers. I would just encourage you to use time as a filter. Read older papers. The older papers are still good. In fact, you will find a lot of really cool things in older papers. And so just wait until you realize that this is actually something that's really working and then go back and read the paper.
That's my advice. And to those older papers, the ones that have stood the test of time, or perhaps your newer ones, did you want to share any of your work in art using GANs?
Well, actually, I do want to mention art, and it's something that really connects with people when we talk about the different things that GANs are used for.
For us, it was really kind of amazing. We came out with Pix2Pix and put the code online, and then within a couple of weeks, a friend of Phillip Isola's, Christopher Hesse, created this very cute little online app to create cats from edges. And then,
what started really quickly is that people just began doing crazy things with this app, things that none of us really expected. People went online and started doing it on faces and all sorts of different things, and
we realized that there was some kind of creative energy there; even Ian did one for us here. And then my wonderful colleague Aaron Hertzmann from Adobe, who is really interested in this connection of art and technology, pointed me to just an avalanche of absolutely amazing artists
who are using GANs to create art. Mario Klingemann is one of the first ones. And this is something that, to me, feels like we are actually doing something good here, giving the artists a new tool, a new brush. In fact, this is an artist I met, Helena Sarin, who is a software engineer and also an artist. When she learned about GANs, she was like, "Wow, I can connect those two things together."
I like what she says. She says, my watercolor teacher used to say, let the medium do it. So my sketch provides a foundation and then the network does its thing. I don't fight it; I constantly tweak the GAN brush. So it's the idea of the GAN not as computers creating art, but as a brush to unleash the creativity of artists. And this is another wonderful artist, Scott Eaton, whom I was happy to visit. What he does is,
he sketches things and then essentially runs Pix2Pix, using data he has photographed himself in his studio, to create these kind of impossible sculptures. And it's amazing how well the GAN is able to capture the shading and the consistency of tone, and yet it is very much
something different and something weird. And finally, I wanted to give a shout-out to this wonderful website by Joel Simon, called Artbreeder. It's a wonderful way to try
to unleash your creativity with a very simple set of rules, where you're basically working in the GAN latent space. I think they use BigGAN, and they combine different latent vectors together. The power of it is that it's a community effort, it's shareable, and some really impressive and amazing works come out of it. So this is,
This is just, I think something that is kind of, to me at least it was unexpected. I think maybe Ian would also not have expected this, but there's the whole community of artists kind of jumping on this and doing things that we technology folks would never think of. And I think that this is just wonderful.
Absolutely. I really like Helena Sarin's book and her illustrations using GANs, as well as Artbreeder. I love Artbreeder. They use StyleGAN as well, and you can actually upload a photo of yourself and adapt it, because they have essentially disentangled it. You can make yourself older or younger and look all sorts of different ways. And regarding Pix2Pix, the model behind Edges2Cats,
you will get to implement that and learn about that in the specialization as well. Cool. Well, we have a couple minutes left. So before we wrap up the panel discussion, I'd like to ask each of you to give kind of a one or two liner for our learners who are building their careers in AI and getting into GANs. Maybe starting with Andrew? Sure. So maybe just one example. You know,
I read the CycleGAN paper a long time ago. Cool result; there's the iconic picture of turning horses into zebras. And when I saw that, I was like, wow, this is really cool. And then I thought, all right, how often do I want to turn a picture of a horse into a zebra? Not that often. So sometimes people do ask, you know, what are GANs useful for?
And then just earlier this week, I came across a paper by the University of Wisconsin and the NIH, the National Institutes of Health, that was taking contrast CT images, basically X-ray images with a dye, and turning them into X-ray images without the dye, to feed to a supervised learning algorithm for medical imaging. I thought, wow, this is really cool. This thing that was originally published as turning horses into zebras, which sounds fun and really cool,
you know, it looks like there's a real application of it to medical imaging. And we wound up covering this paper in The Batch as well, which goes out today. To me, there's so much room for creativity that if someone wants to break into GANs, this is a good time, because there's still so much room for so many applications, be it artistic ones like Artbreeder, or small-data supervised learning ones like the medical imaging one I just described, or probably lots of other stuff that no one today has even thought of or is working on. So I think sometimes, as Alexei said just now, you have to be a little bit stubborn. Chase your dream, or chase your beliefs, and just keep working on
it until it works. Maybe one strange connection: the first GPU server at Stanford was built by Ian and his roommate way back, and that helped spur a lot of the work that NVIDIA now continues. I think it was because Ian, way back, thought, I want to build a GPU server, and that influenced my group back then. And I think GANs offer a package of really cool algorithms that are mature enough to use in many places but also have a lot of room for creativity. So I hope people will do that. Anima next.
What Andrew said is great; I completely agree. I would add: in addition, look into the foundations. At Caltech, I teach the foundations of machine learning, and it's always good to start from first principles. If you're thinking of this as a min-max game, what does game theory tell us? Even in simple cases, like a
bilinear game between two players, what is known, right? So thinking about some of that could give you intuitions on how to tune complicated GANs ultimately, right? So I think having a good foundation and some solid principled approaches will help us understand these very complex GANs, hopefully, better. I mean, we are still not completely there, but I encourage you to think about the foundations.
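As a concrete illustration of Anima's point, here is a toy sketch (an editorial example, not from the panel itself): even the simplest bilinear game, f(x, y) = x·y, where one player minimizes and the other maximizes, already shows why naive GAN-style training can misbehave.

```python
# Simultaneous gradient descent-ascent on the bilinear game f(x, y) = x * y.
# Player x minimizes f, player y maximizes it; the unique equilibrium is (0, 0).
# Yet plain simultaneous updates spiral outward -- a first-principles hint
# at why GAN training can be unstable.

def simultaneous_gda(x, y, lr=0.1, steps=100):
    """Run simultaneous gradient descent-ascent; return distance from (0, 0)."""
    for _ in range(steps):
        gx = y  # df/dx for f(x, y) = x * y
        gy = x  # df/dy
        x, y = x - lr * gx, y + lr * gy  # both players update at the same time
    return (x * x + y * y) ** 0.5

start_norm = (1.0 + 1.0) ** 0.5          # starting at (1, 1)
end_norm = simultaneous_gda(1.0, 1.0)
print(f"distance from equilibrium: {start_norm:.3f} -> {end_norm:.3f}")
# Each step multiplies the squared distance by (1 + lr**2), so the
# iterates provably diverge instead of converging to the equilibrium.
```

A known fix in this toy setting is to alternate updates or add a small gradient penalty; the point, as Anima says, is that first principles give you intuitions you can carry back to real GAN tuning.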
And Ian? I think in research or in industry or anything you want to do, really, you should think about how can I do something that's different from what everyone else is doing? If you go for the most obviously promising thing, you'll get scooped, as Sharon has said. I think one way that you can avoid getting scooped with GANs is to try to break out of the mold of mostly using GANs for images. Think about using GANs for industrial design, medicine,
any number of things that move beyond just making raster grids. And certainly other people have done this so far, but that's a much emptier field than the very crowded image field. And I think venturing out there will pay big dividends for your career. - I agree. And Alexei? - First, let me do a little tiny pushback on using things like CycleGAN for medicine. I think in medicine,
I think we just need to be very, very careful because every month, I get doctors calling me up and telling me, oh, we want to use CycleGAN to create more X-ray data or more MRI data. Fundamentally, one has to remember, GANs are all about hallucinating content.
So you have to be just super, super careful. I think it's possible to do interesting things in medical applications with GANs, but you need to be extra, extra careful, because this is not horses and zebras. This is human life. So I would just
add some caution to using GANs in the medical domain in particular. But I think definitely research should continue in that direction. As far as kind of advice, I think, especially if you're doing research, if you want to be a researcher,
I think it's all about having fun. You need to enjoy what you're doing, because being a researcher is hard. Your papers get scooped, your papers get rejected. It's tough. It's not a nine-to-five job; you're a researcher 24/7. So you need to be enjoying it. You need to find something that you actually love, because otherwise it's just not worth it. So just try to have fun.
I totally agree with that, though on the medical side, I will seek to prove you wrong in a few months. I definitely agree with that, Alyosha, that you should be careful. But I think that's part of why we want machine learning experts working in that area: if someone who only knows medicine sees a paper on GANs and says, you know, give me some of that horse-to-zebra magic for MRIs,
they're not going to understand all the things that can go wrong in the ways that a real machine learning expert can. So it's part of this community's responsibility to be careful and thoughtful about those high stakes applications.
And I think it's really important to be highly collaborative, both on our side and on the medical side. And this brings evaluation back into every medical project that we do in our lab. We very much have doctors in the loop looking at a GAN-generated image and saying, did it hallucinate cancer or remove cancer? It must maintain that label, right, for it to be useful.
So to find those applications, I think it's really important to make sure that there is a lot of collaboration and conversation going on and learning from both sides, of course.
I mean, I completely agree with that, right? So the collaboration has to be completely integrated. It's not like, you know, take this GAN package and go off and do this, right? Because the question is, what are the metrics in medicine that are accepted?
In many cases, even that's not there, right? It's not, again, one quantifiable metric in many cases. Even doctors disagree. And I've been working with Andrew Hung, who's a professor of urology at USC. And we haven't gone to GANs yet. We are doing just the first steps of understanding gestures in robotic surgery and how to do that accurately, and hence ultimately give feedback to surgeons on how they did
different gestures and what the outcomes were. And yeah, that's opened my eyes to the kind of complexity that goes into, you know, deciding what is a gesture, and was this an effective one or not?
Right. So I think my advice would be similar to what Alyosha said: maybe do baby steps, right? You don't have to do GANs to begin with; just do very simple supervised learning and see what accuracies you're getting. And also think about the diversity. For instance, there have been a lot of apps for detecting skin cancer, but they're mostly trained on lighter skin.
And there could also be lots of other spurious things. I think I saw a tweet where the purple marker was the thing it was, you know, overfitting to, right? So be very careful in terms of the data and in terms of what kind of data is being used.
Absolutely. So thank you everyone on the panel. I'll now be sourcing questions from our YouTube chat. So if you are watching and you have some questions, please post there and we'll be addressing them as they come.
So this speaks to kind of what Ian was talking about in terms of using GANs for something else: think about other things that you can use GANs for. And one audience member asked, are there any applications of GANs in NLP, and can we use GANs for NLP data augmentation? Ian, did you want to take that?
Sure, though I'm maybe not fully up to date on this topic; as Alyosha said, the half-life of GAN knowledge can often be like three months. NLP has definitely been one of the areas that's been harder to tackle with GANs. I wrote a paper with lead author William Fedus back in about 2017, called MaskGAN, about filling in the blank, where we thought we had gotten GANs to make
sentences that were more structured and grammatical and meaningful, but lower probability than you would get from traditional maximum likelihood text modeling. Later, William actually did some follow-up work with other people where he showed that the MaskGAN was not really any better than a probabilistic model: you could get the same effect by reducing the entropy of a traditional maximum likelihood model.
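The entropy-reduction trick Ian refers to can be sketched with temperature scaling of a softmax (an editorial toy example, not the MaskGAN setup itself; the logits are made up):

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with a temperature knob; T < 1 sharpens the distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)

logits = [2.0, 1.0, 0.5, 0.1]            # hypothetical next-token scores
h_base = entropy(softmax(logits, temperature=1.0))
h_sharp = entropy(softmax(logits, temperature=0.5))
print(f"entropy at T=1.0: {h_base:.3f} nats, at T=0.5: {h_sharp:.3f} nats")
```

Sampling at a lower temperature concentrates probability on the likeliest tokens, lowering entropy and yielding more fluent-looking text from a plain maximum likelihood model, which is the effect the follow-up work found.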
So generative models, certainly you can use them in NLP. And basically every NLP model that we see these days is a generative model in some way. GANs specifically seem a lot harder to apply in that particular domain. - Cool. Another question is, can GANs be used for style transfer? And yes, we saw they could be from Alexei's amazing presentation. But how is it similar to the original neural style transfer algorithms?
Alexei, do you want to take this one? - So OK, I think there's two separate directions, I would say. One is the kind of cute things that we computer science geeks do, like neural style transfer and different kinds of image-to-image translation, like the stuff that we did.
And separately from that is the use of GANs for artists, by artists. And remember, the artists are going to be using whatever they can find. They can download Pix2Pix or CycleGAN or neural style transfer. But that's not really the important part. The important part is not the technology; the important part is what they do with it. And sometimes they take the most silly old-school stuff and make
amazing art with it. So I think this is kind of a separate thing. So in terms of neural style transfer, the Gatys paper, I really love that paper. That was a beautiful paper, but I would not call that art. That is just kind of a cool, you know, Instagram filter or something, right? Because there is no soul there. And so, yeah,
I think we as technologists can provide all of these kind of little tools, little brushes, but I don't think we can claim anything more than just, you know, kind of
cool filters, and then we have to let the art community see what we offer, pick what they like, and see what they do with it. So I would focus on just letting the artists decide. They are the experts. They are the ones who are actually creating something that
something that should touch your soul. Horses to zebras, it's cute, right? But it doesn't touch your soul. But the stuff that, for example, Helen does is really just, like, I have a book of hers, and yeah, it's something that speaks to me, right? And I think that's kind of what we're trying to focus on: giving the artists that tool.
Yeah, I would agree with that. And this one would perhaps be directed to Anima: with GPU memory and computing power continuing to grow, how do you see GANs being used with video or 3D data in the future? - Yeah, I think it's an exciting time when it comes to the convergence of data, compute, and algorithms, right? That's what I like to call the trinity.
And I mean, GPUs continue to get better. And there's also the convergence of the graphics and the AI side at NVIDIA. You see that with DLSS, which is deep-learning-based super sampling. So we are taking baby steps in terms of how to speed up graphics. But the question again is what traditional graphics can do versus GANs, right? I mean, should you just replace everything with a GAN, or maybe some of it can still be the,
you know, fixed-function computing. I think that's what we will look into in the coming years: how to have really fast, efficient rendering, not just 2D but 3D, and also looking at multimodal data with both images and LiDAR, for instance, in autonomous driving. So in all these cases, how to speed up
the generation of such images, but also look at very high quality photorealistic setup and ultimately be able to create new shapes, new concepts. I think that would be the next round, not just to speed up what's existing, but to create better assets and
I don't think we'll create new art, but at least we'll create, I think, more assets in terms of 3D shapes to then help us with other downstream tasks. - Thank you, Anima. And one last question; I'll direct this to Andrew, since you did touch on this a bit. What are the ethical concerns about pursuing GAN research, or perhaps just applying GANs, as it relates to deepfakes? We already live in a political environment where what's quote-unquote true is vigorously contested.
You know, I think that many, many AI technologies can be used for tremendous benefit to everyone but also have very problematic use cases. So I think that it is up to everyone across society, every single engineer, every company, governments, for-profits, non-profits, to try to
put our thumb on the scale to try to make sure that the applications that are built make people better off. I don't think we're very good yet at making sure organizations have the appropriate ethics committees. I don't think we're very good yet at making sure that individual engineers are fully empowered to raise issues, debate them vigorously, and really refuse to participate in projects that they don't think move the world forward. For myself, I do kill projects if I don't think they move the world forward.
I've killed multiple projects that looked like they could be
profitable, but that I didn't think made the world better, and I killed them just based on that reason. But I'd love to see everyone across society empowered to do that in a more consistent way. I think one thing that we do need is for everyone to just speak up more when these issues arise, so we can have that nuanced debate. And some of these things are difficult: should you build a technology or not? Sometimes it's actually not clear if you're making the world better off or not. But I think having people empowered to debate that and
surface issues will mean that, on average, we make better decisions as a society. So maybe I want to say: do pay attention to this. You know, technology is really powerful. So this ethics stuff is not just something to let someone else worry about. I think it's something that every one of us should think about and worry about, because this is important for all of us building this.
Yes, with great power comes great responsibility. So thank you, panelists, for your insights and discussion here. And thank you, speakers, Anima and Ian, for joining us today. And for our audience, stick around for the last 10 minutes. I'll be giving you a sneak peek into the GAN specialization right now. Thanks a lot, Sharon. You were fantastic. Thanks for organizing all this. Bye.
Thank you so much for tuning in to this special clip. I hope you learned a lot and are inspired and consider taking the GAN specialization on Coursera. I believe you can take it for free, actually, if you audit it. And if you like this episode and the show in general, please leave us a rating and see you next week.