Welcome to the LSE events podcast by the London School of Economics and Political Science. Get ready to hear from some of the most influential international figures in the social sciences. Good evening everyone and welcome to the London School of Economics and Political Science for this hybrid event. My name is Seeta Peña Gangadharan and I am Associate Professor and Deputy Head of Department in Media and Communications at the LSE.
We are here to talk about the power of data and I'm very pleased to welcome Chris Wiggins, Alison Powell and Erin Young to both our online and in-person audience today. This event is co-sponsored by the Data Science Institute and the Department of Media and Communications. Before I have the privilege of describing the format for tonight's event and introducing our speakers, a few housekeeping items. I'll keep them brief. First, the school has a code of practice on free speech, so let's champion that code of practice tonight. Second, in the event of a fire, follow the exit signs; the fire assembly point for this building is outside in the plaza.
And now to very brief introductions. Chris Wiggins is an associate professor of applied mathematics at Columbia University and the chief data scientist at the New York Times. Alison Powell is associate professor in media and communications at LSE and directs the MSc stream in Data and Society. Erin Young is Head of Innovation and Technology Policy at the Institute of Directors. The format of the event will be as follows: we will hear from Chris as he presents, followed by Alison and Erin, before coming back together to discuss as a panel. Then there will be an opportunity for all of you, as an audience in person and online, to ask questions of our speakers. This event is being recorded
and will hopefully be made available as a podcast, subject to no technical difficulties. Please, can I remind you at this moment to put your phones on silent to minimize any disruptions. And now I'm delighted to give the floor to Chris Wiggins.
Thank you very much. It's a real pleasure to be here. Thank you so much to our hosts and for inviting me to be here as well. Tonight, I thought I'd share with you some of my own thinking about data and ethics. My thinking about data and ethics was very much informed by a project I was engaged in from 2017 until present.
namely a class at Columbia and a resulting book. So I thought I would share with you a bit about how we thought about data, power, and ethics in this class that I was teaching at Columbia. It resulted in the book shown on the far right, co-authored with the historian Matthew Jones. I'm not a historian, I'm merely a fan of history.
And these three images sort of capture what we accomplished in those years, 2017 until 2023, when we were teaching the class together, which was to deal head-on with the concerns that students had about the internet, symbolized by young people in the form of a flaming dumpster fire.
And then to unite that with optimism about the future, which for my age group is symbolized by this cartoon character, Buzz Lightyear, a character from a movie from a previous millennium who had irrational exuberance. And we wanted students and also the readers of this book to understand head-on the challenges of a world in which our personal, political and professional realities are being shaped by data-empowered algorithms, and yet to have some optimism and hope for how we can shape that future together.
So I thought I'd share with you a bit about that in maybe 20 minutes. And if you'd like to know more, there's the book, which you can read more about there, and of course a GitHub repository, like any civilized class; the primary materials are available to you online. So, by way of nice to meet you: my own thinking about data has been shaped by this book, as well as a book co-authored with computer scientists, and, as mentioned, my own engagements with the New York Times, leading a team that develops and deploys machine learning, my engagements at Columbia teaching as well as doing research in machine learning, and then a non-profit that I co-founded in 2010 in New York City to try to get students to find out about New York City's startup community. The book itself is organized in three parts, three sorts of stories about data and our world. And as I said, the book itself was really shaped by interacting with Columbia undergraduates, who are a pretty demanding bunch.
We started the class in 2017 thinking we would teach a class on the history of data science. And the students really pushed us to think more about data and its impact on society. So part one really opens up thinking about statistics as you might have encountered it as a field of mathematical analysis. Part two is about how computers change the way we make sense of the world through data. Part three is about our present condition and about power and solutions.
We opened up by talking about the stakes. When we started the class in 2017, it was immediately before a blossoming of literature about challenges involving data-empowered algorithms. Our colleague Cathy O'Neil wrote Weapons of Math Destruction right around the time that the class debuted.
Safiya Noble wrote this great book about search algorithms and, more generally, the way algorithms reinforce societal inequalities. Virginia Eubanks had this great book, Automating Inequality, about how algorithms can profile and punish the poor. There was Shoshana Zuboff's own book about surveillance capitalism and about commercial interests and how they exacerbate some of these concerns. And, with a particular US lens, Ruha Benjamin's great book Race After Technology captured a lot of the societal and political concerns around race.
Another scholar captured it well in the title of her class: Lisa Nakamura teaches a class called The Internet Is a Trash Fire, and as she says, she doesn't have to explain to anyone what that means.
We started out the class talking about a subject for which there's abundant literature, which is mathematical statistics and its own history. Most people learn statistics without learning its history. It's quite a rich and well-covered history, so I won't spend too much time on it other than to advertise that we started out in the last days of the Victorian Empire at the end of the 19th century and leaned heavily on scholarship by others, including Stephen Jay Gould's The Mismeasure of Man.
We introduced students to the ways that people had tried to make sense of humanity by quantifying whatever they could, sometimes very much influenced by their own prior conceptions about how humans themselves are arranged. And we introduced them to the birth of correlation, regression and eugenics, three words brought to us by Sir Francis Galton, all part of the same program.
Stephen Jay Gould does a great job capturing what we wanted to do by having a history class to understand the present day. History is very useful for making the present strange when we find history that's far enough away that we can look at other people's innovations and see them with distance and say, well, clearly those people had a wrong conception of how to make sense of the world, and yet a history that's close enough to present day that it makes our own present world strange. Or as Stephen Jay Gould wrote,
Shall we believe that science is different today? By what right, other than our own biases, can we identify their prejudice and yet hold that science now operates independently of culture and class? So that was part of the history in part one of the class to make the present day strange for our students and readers.
Part two opens up with the birth of digital computation, data at war, because digital computation is really born during World War II. So we introduced students to what an incredible break point it was to be able to make sense of data on an automated machine, so that laborious hand calculations could be made cheap and, as we would say today, done at scale. That process involved the birth of what we now call the military-industrial complex, and also labor.
And as soon as labor becomes part of making sense of data, it is gendered. And so, again leaning on the great scholarship by authors such as Janet Abbate and her book Recoding Gender, we show how immediately, as soon as people saw that data was a project to be managed, people decided which part was going to be men's work and which part was going to be women's work, including in the work on the creation of Colossus at Bletchley Park here in 1943.
Once we've introduced students to how data changed when it became a concern for labor and for computation, we introduced them to the birth of artificial intelligence. You may think that the birth of artificial intelligence was something like chatbots or data-empowered algorithms, which is our present universe. So we want students to see how most of the history of artificial intelligence as a term had no data in it. And in fact, data was the low road.
The term artificial intelligence was coined by the mathematician John McCarthy who, as he put it later, here in London actually (this screenshot is from a debate about artificial intelligence in 1973, when people were already sick and tired of hearing about artificial intelligence), is on record as saying: I invented the term artificial intelligence when we were trying to get money for a summer study. And it's a useful story to remind students of, because they often feel that when they hear artificial intelligence, people are just trying to get money. And it's useful to tell them that it has in fact always been that way.
This is the actual study. These seven white guys hung out in the summer of 1956 at Dartmouth. On the far right is Claude Shannon, now immortalized for young people by the word Claude, the chatbot of Anthropic.
But in there also is the infamous Marvin Minsky, and John McCarthy, and several others from a variety of walks of life. It's clear when you look back on what they proposed in 1955 for the 1956 study that they had no idea what they were talking about. So they wrote a long list of things that they planned to study. One of them, neuron nets, turned out to be the thing that won, but we didn't really know that until 2012 or so. And the other fields went on to create effectively seven countries within the continent of computer science. But at the time, the world was young and they did not know how artificial intelligence would be crafted. Today, artificial intelligence has been crafted as a problem in which machine learning is the interesting technical nugget. But when you experience artificial intelligence in the palm of your hand, the artificial intelligence is only one part of a large technical system and, in fact, of a large socio-technical system of which you are an important element.
So, in order to tell students the story of machine learning, we also often tell them the story about why and how data were captured for several decades before we could get to the point that people would realize that artificial intelligence was actually not about how we think we think, but it was actually about the low road of capturing sufficient data for computers to be able to reproduce to us a simulation of how we think we think.
Okay, and by the time we get to the present day, we talk a bit about the battle for data ethics, which was a great story in technology, I would say 2017 through 2023 or so, when a number of the leading companies were interested in positioning themselves as defenders of ethics. Ethics, of course, is a drift word, right? It's a term that means different things in different communities, in different decades, or at different companies.
Or as the law professor Phil Alston put it in 2018 on the stage of the AI Now Institute in New York City, he said, what I always see in the AI literature these days is ethics. I want to strangle ethics. Now, Professor Alston did not mean that he wanted systems not to be ethical. What he meant was the danger of a drift word is that the word will be captured. And in fact, ethics was largely captured by private companies who defined it in ways that suited their own interests by around 2023.
Ethics, for many people in the academy, already had a prehistory around fairness and privacy. Privacy had already been investigated and interrogated well by scholars such as Latanya Sweeney, who pointed out that simply removing the name from a table was not enough to make a data set anonymous. As she pointed out, if I have one voter list that has everyone's name, zip code, birth date, and sex, and I have another data set which I can get my hands on, say medical records, that has the zip code, birth date, and sex but no name, I can simply take those three fields and make them a composite join key that will allow me to, as it is often said, mail the governor of Massachusetts his own health records.
It's often said that Latanya Sweeney did that when she was a graduate student. I emailed her and she said that it was actually not true, which is too bad, because it's a great story. But the point is that we may think that privacy is merely a matter of removing a name from a data set; the problem is worse than that.
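To make the linkage concrete, here is a minimal sketch in Python, using entirely invented records and column names (not Sweeney's actual data), of the kind of composite-key join described above: the three shared fields are enough to reattach names to the "anonymous" medical rows.

```python
# Minimal sketch of quasi-identifier re-identification; the data is made up.
import pandas as pd

# A public voter roll: names plus the three quasi-identifiers.
voters = pd.DataFrame({
    "name":       ["W. Weld", "A. Smith"],
    "zip":        ["02138", "02139"],
    "birth_date": ["1945-07-31", "1960-01-02"],
    "sex":        ["M", "F"],
})

# "Anonymized" medical records: the name is removed, but the same three fields remain.
medical = pd.DataFrame({
    "zip":        ["02138", "02139"],
    "birth_date": ["1945-07-31", "1960-01-02"],
    "sex":        ["M", "F"],
    "diagnosis":  ["condition A", "condition B"],
})

# The three quasi-identifiers form a composite join key that restores the names.
reidentified = medical.merge(voters, on=["zip", "birth_date", "sex"], how="inner")
print(reidentified[["name", "diagnosis"]])
```

The same join works just as well on millions of rows, which is why quasi-identifiers, and not just names, have to be treated as identifying information.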
Fairness is often the way we think about ethics in industry and academia, and a lot of work was done to advance that, including by Julia Angwin and coworkers at ProPublica, to point out how black-box algorithms were being used in the criminal justice system in the United States by private companies, in ways that are absolutely not available for us to inspect. We don't know the source code, the algorithms, the underlying logics of these systems. She and her coworkers did a great job reverse-engineering them as best as possible, and even more so putting their source code on GitHub, which launched a small technical field of fairness within ethics. But we try to introduce students to the idea that
privacy and fairness are really a small part of a larger conception of ethics. We introduce them to a particular conception of ethics organized around principles that are in tension. If you were raised in the academic tradition in the States, you would have encountered this in a forgotten book called the Belmont Report. You can tell it's a forgotten book because if you go to Amazon and try to buy it, it says on the bottom, Forgotten Books. But this conception of ethics is that ethics is not a checklist, and it's not just about privacy. It's actually about principles that are designed to be in tension, so that we must hold those principles in tension and adjudicate difficult decisions about informed consent, harms, and bias in concert.
Many of the things that we tell students are about why you would want to do ethics. For people who work at these tech companies, there's a concern around moral injury, meaning the feeling that you might do something that gets you promoted, and then after you've done it, even if you got a raise, you might realize that there was something bad about that thing that you have done.
And if you follow the literature by the many people who have graduated from these tech companies and then started nonprofits thereafter, many people do in fact realize that there is a sense of moral injury, a sensation often associated with first responders, doctors, or members of the military. But it's also a phenomenon now among tech workers who, while they are working in tech, form collective action in various ways or might be whistleblowers, and try to defend themselves against that sense of moral injury.
Employees themselves form, in this way, a kind of private regulation. They don't regulate in the sense of laws, but they provide a private ordering of their own companies and of society, in a way that has been analyzed by legal scholars.
Ethicists within companies may be technologists who try to design process. So, for example, there's this great paper by Meg Mitchell and Timnit Gebru, later famous for getting fired from Google over their concerns about large language models, along with Deb Raji, currently a graduate student at Berkeley, and other coworkers, trying to think about how the process of launching a product can be augmented so that, in addition to reviewing QA and technical assessments, we can review its ethical impact.
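As an aside, here is a hypothetical sketch, loosely inspired by the model-card idea, of what a structured launch-review artefact might look like in code; the class name and fields are illustrative assumptions, not the schema from the paper.

```python
# A much-simplified, illustrative structure for documenting a model before launch.
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    model_name: str
    intended_use: str
    out_of_scope_uses: list[str] = field(default_factory=list)
    evaluation_groups: list[str] = field(default_factory=list)   # subgroups the model was tested on
    known_limitations: list[str] = field(default_factory=list)
    ethical_considerations: list[str] = field(default_factory=list)

card = ModelCard(
    model_name="toy-risk-classifier",
    intended_use="Ranking support tickets by urgency",
    out_of_scope_uses=["Any decision affecting employment or credit"],
    evaluation_groups=["by region", "by language of the ticket"],
    known_limitations=["Performance untested on non-English text"],
    ethical_considerations=["Review disparate error rates before each release"],
)
print(card.model_name, "-", card.intended_use)
```

The point of such an artefact is process, not technology: it gives a launch review something concrete to sign off on beyond QA metrics.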
As I said, tech ethics and data ethics was flourishing 2017 through 2023. It'll be a question for historians of technology to ask why by 2023 as AI boomed, tech firms were laying off their ethicists. And that is why it fits well into a history book.
With that, we close the book and the class by trying to give students some form of optimism and some thinking about solutions. Thinking about solutions really means thinking about power. So we try to get students to think about their role not only as students who will go on to work in tech companies, but as consumers who produce the data that is the lifeblood of these companies. This form of power, which I like to call people power, sits in an unstable three-player game alongside corporate power and state power.
Often, particularly in the States, people think, "Oh, well, there ought to be a law," and if we see something that's a concern for consumer protection, we think that some centralized legal entity will somehow take care of it. But the law, and state power more generally, we want to remind people, is only one player within this unstable game. Companies themselves effectively regulate each other, or might advocate for, say, privacy as a human right, as Apple has been saying for 10 years, in part because it's a value add to the consumer.
Consumers provide power not only to their democratically elected officials, but also to the companies, through their money, their data, their talent. Having elucidated that dynamic of power, what do we hope people get out of the book? Well, we followed an arc set by the students. As I said, in 2017 we thought we were going to teach a class on the history of data science. We ended up seeing that there's much more to data, particularly data's relation to truth and power.
Our feeling was that there's important material that at present is not being taught, neither to the future statisticians nor to the future senators and CEOs: how data happened, and also how data sits within a nexus of power, and not only power but truth. Data carries with it a rhetorical valence, such that data is given this extra truthiness because it's perceived as objective and part of the scientific program. And we want to illustrate for students and for readers of the book all of the subjective design choices that professionals who work with data know go into all of their analyses as well as the resulting products. And with that, I'll close and hand over the floor to Professor Powell. Thank you very much. Thank you, Chris. Thank you, everyone, for being here.
So I run a program that's called Data and Society. So I'm picking up where Chris left off, but I'm also going to take our discussion in a little bit of a different direction because my program sits inside a department of media and communication.
So when I'm thinking about data, I'm not only thinking about how data becomes a resource for narrating our social world, but I'm also thinking about how we tell stories about data.
So today I'm going to focus on three questions: What stories are told about data? What influence do these have? And what else is possible? And when I say what stories are told about data, I'm beginning where Chris left off, because Chris told us about how when you collect data, when you perform your statistical analysis, when in the history of statistics
you are collecting information about a people to describe those people, to describe the average of those people perhaps, as if an average were something to represent those people, you are already telling a story. So that is story number one that is being told with data.
we're also telling a second story, which is a story about data's truth or truthiness, as Chris pointed out. So I wrote a book also, which I'm not going to talk about at great length, but I do want to tell you a little bit about it because it described the consequences of a related story, which is the story about governing with technology.
And that story was like the story about data being truthy, that governing with technology, governing with data was better. And it's better because it's cheaper, it moves faster, you can describe more things with data, you can make certain kinds of processes move more quickly or as Chris said, at scale.
But there's a catch here, and that catch is that when you tell that kind of story about governing, for example, a city with technology,
you're also telling a story about which kinds of experiences are worthy and good and which are not. So the spoiler of the book, so now you don't have to read it, is that stories about data being true, objective and easier to action than other forms of evidence used to govern can limit the ways that people are given space to participate in citizenship.
So, in the rest of my talk, I'm going to talk about things that I have done in my own research practice to try to create other kinds of stories and other kinds of narrating about data and sometimes with data.
And in contrast to many talks that I give, this talk is completely full of pictures of people doing stuff. And if you're really watching carefully, you might see yourself, you might see famous scholars, you might see some of my friends, my neighbours, some of my collaborators. And that's really intentional because I think it's very important when we're talking about technology to remember that technology is people.
and technical systems also, as Chris has already pointed out, are in fact socio-technical systems. So when we tell stories with data and about data, we're also doing things together as people. So I'm going to start with something I've been doing in my classes for such a long time that this slide has two pictures of two classrooms that are more than ten years apart.
And this is a project that is called Data Walking. And I started it as an educational project, but Data Walking has been used all around the world as part of public consultation on developing technology in urban regions and as a kind of educational extension project. So if we were going to do this now, I would tell you that we were going to go for a walk.
and you would have to go on a walk with some people that you're sitting next to, and you would each have to play a really specific role on that walk. You would have to either figure out how to navigate, you would have to take notes, you would have to take photographs, or you would have to draw a map. Now, what is our project on this walk? Well, we are looking for data. Oh, no, we have a problem now because we don't know what data is.
So this is where we begin when we're doing things together as people and telling stories about data. First, we have to have a debate about a definition. And sometimes you invent a definition like artificial intelligence. But in fact, debating shared definitions, struggling over meaning is a really big part of decision making.
about what matters in the world. So when we debate in our walks about what is data, that changes what we're looking for on the walk and what kinds of things are actually collected. So if we were doing this in this room, you would all come back and I would make you all
synthesize your data, and I would say: tell me what kind of response you can make to a problem that you found out in the world, based on your data. In 2012 I made people make things out of cardboard, so that's why they're sitting on the floor, but by 2025 I had sort of innovated, and so they were making things out of Lego, which was a lot easier to work with. I got a complaint from a colleague, actually, after I ran one of these workshops, that the classroom was full of glitter. So Lego is certainly tidier. Playing roles, walking around, and then having to pay attention to how the decision of what you decided was data makes data matter. And it makes data matter because you've decided what it applies to.
And so this is a really basic beginning point that starts with defining things that we care about together and also making sure that we have enough space to debate with each other about what those definitions are. So one of the things that I have been thinking about
when I scaled my research up from the classroom to the policy-making space was how to use this struggle over definitions in an area where there seems to be a lot at stake. So from 2019 until 2023, I was the director of the Just AI project, which was hosted at the Ada Lovelace Institute. It was commissioned by the Arts and Humanities Research Council as a strategic intervention
intended to increase interdisciplinary connections in what was then a really nascent field of data and AI ethics. So in 2019 we were in the middle of this moment of data ethics being a thing. We also knew that AI ethics was going to be a thing.
And the AHRC, who was the funder, the Nuffield Foundation, who supported the Ada Lovelace Institute, the Ada Lovelace Institute and my team, composed of Imre Bard and Louise Hickman and myself, were like, okay, so how do we intervene in something that is still emerging?
So we did a very complicated mapping exercise. We tried to figure out who was publishing what in data and AI ethics up until that point. And then we looked at the places where things hadn't settled yet, where the definition wasn't clear.
We started to move the conversation to places where ideas were in flux or contradictory, and we used a lot of incredibly creative methods. We commissioned a film that you'll see a screenshot of; above our heads are all of the words that we were trying to think about in relation to data and AI ethics. We convened hundreds of people in online events. It was COVID, so we had to set up very weird jury-rigged network connections, as you can see here. And we commissioned a fellowship program so that we could focus our resources on investing in people investigating the relationship between AI and racial justice.
And what I think was really interesting about this project is we did not directly shape policy. What we did was introduce new ideas into a conversation and create space for them. The ideas and the themes of our project, which included refusal, sustainability and justice, are now present in conversations in ways that they weren't before.
We also commissioned science fiction writing, which is how I met Erin, as it happens, because Erin's work became the basis of a science fiction short story that we then debated in an online salon, to make sure that people became invested themselves in the questions that have unfolded around data and AI ethics. And it turns out that people have a lot to say about data and AI ethics, not just experts.
If you ask them, they will have many, many things to contribute. These photographs were taken in 2023 at a community festival near where I live in East Walworth in South London. And I had been spending time talking with Michael Little and Richard Galpin who had been working in this community to increase community trust and connectivity as a way of increasing the health and resilience of the community.
And Michael and I started talking about data. And I said, "I wonder, Michael, if the people in the community would like to talk about data as something that they could connect with and connect using." And he said, "I don't know if the community is interested in data. Why don't we find out?" So at the community festival, we made a data stall
And it was just down the street from the pub stall that my friend Matt made, and it was just up the green from the stuffed animal sales stall that our kids made. And we thought for sure no one would come to the data stall because they would probably be at the pub stall. But it turns out that many, many people wanted to talk to us about what data was important in their community. We heard about ideas like
collecting together the information that everyone in a tower block might need about how much electricity is being used, so that they could buy electricity together rather than having prepaid meters. We heard about how much more information they wanted from the council, how they needed granular data about what was happening in their area and not just at the borough or higher government level.
So what we learned here was that if there are mechanisms for people, all kinds of people, what we sometimes refer to as ordinary people, as if some of us are extraordinary people, to participate in structuring the norms and rules around things that matter, like technology and information control, they can and they will use those opportunities.
So, when we were organizing this panel, Seeta said, "I'd really like it if you can talk about what to do now, so that we don't just have the idea that the data world is a dumpster fire." So I've decided to think a little bit about what to do, because there's a lot at stake. It's not only data or data governance that's being struggled over; it's this idea of AI now that's often thrown out as a vague term of power.
And as Chris's talk has shown, the dynamics of financial and cultural investment, in particular ways of thinking about data and technology, can leverage vague language, things like data ethics, for the benefit of certain actors and not others. How often do you see data ethics framed from East Walworth compared to how often you see data ethics framed from Google?
So I don't need to tell you that the current state of things is not a very good way to support flourishing. And so when I give these talks, I'm often aware of the fact I'm talking about small groups of people in particular situations, and they might not therefore seem world-changing. So why focus on defining data together or expanding the field of data ethics or collaborating with people in tower blocks in inner London?
But I think these practices are a bit like digging a new garden. Before you start, it seems impossible to have anything grow there that wasn't there before. But once it's in place, you can't imagine the place looking any other way. So I think we need to pay attention to which stories are told, by whom and when, develop our individual and collective capabilities,
and make some breathing space around technology so that we can grow something new. Thank you. Hello, lovely to be here. Thank you so much for the invitation. I'm going to talk about something quite different and pick up on some different threads from Chris's talk, namely how we govern data and socio-technical systems more broadly.
So, I joined the Institute of Directors as the head of tech policy a few months ago. And if you don't know us, I want to give you some background. We're a non-party-political organization that was founded in 1903. We have about 20,000 members who are board members and directors from across industries, everyone from CEOs of large corporations to startup entrepreneurs.
The IOD was granted a Royal Charter in 1906, which is when we began influencing Parliament in the interests of British business. And the Charter also tasked the Institute with, and I will quote, promoting for the public benefit, high levels of skill, knowledge, professional competence and integrity on the part of directors. And these are our headquarters in Pall Mall.
So as such, we work a lot on corporate governance. There are various definitions of corporate governance, given that the term governance, like ethics, like AI, is contested and malleable. But essentially, it's a system by which companies are directed and controlled properly. So, of course, Musk did not say this. [Laughter]
At the IOD we advocate for meaningful interpretations of good and responsible governance in business. So it's not just about what you might think compliance and risk management. It's about, for example, committing directors to ethical standards. It's about promoting tech and data literacy across board members. It's about diversity on boards, protecting whistleblowers.
So to this end, the IOD launched, before I joined, the Code of Conduct for Directors last year. It's a tool which helps business leaders make better decisions, essentially. It provides them with a framework to help them build and maintain the trust of not just their stakeholders but the wider public in their business activities.
So the code is structured around key principles. Now I know I haven't spoken about data yet. What's the link to data here? So you might notice that some of these principles are often seen in AI and data governance principles which have emerged globally over the last few years. So transparency, accountability, fairness, we see ethical standards again, trust.
When we're talking about corporate governance today, we need to be inherently thinking about digital tech, AI data governance to harness the opportunities as well as manage the risk. And of course, coming back to this point on language, we're always acknowledging that not just governance, but terms like AI and ethics are also contested and malleable, meaning different things in theory and in practice to different people.
So with the launch of the UK's growth-focused AI Opportunities Action Plan earlier this year, and its largely uncritical focus, in part, on driving cross-economy AI adoption in the UK, I wanted to better understand our members' views on AI, broadly defined. So we surveyed members in March of this year,
And we found that many respondents indicated a lack of trust in AI technologies, tools and systems. And this is obviously a huge issue when corporate governance is about building trust.
They shared concerns around the accuracy and reliability of AI systems, ethical, societal, environmental concerns. And most interestingly, there was a massive pushback against the kind of AI hype versus tangible business value and application.
And even when we asked members specifically about the benefits of AI, although some in early adoption mentioned time savings and administrative efficiencies, others took the opportunity to express critical concerns about widespread AI hype. And it got me thinking: it makes complete sense that this kind of sweeping, generalized AI debate, often playing out among the upper echelons of the tech sector, is so different from what's happening on the ground with adoption and governance of AI across both the private and the public sectors in the UK and globally. And I was also thinking that perhaps this is pushing us towards the trough of disillusionment that we see in Gartner's hype cycle. You know, it's not entirely surprising that there's frustration and disillusionment in the sentiment of some mainstream business in the UK, given this kind of sensationalized, dystopian coverage from certain media outlets, not mentioning the Daily Star in particular. These headlines are obviously completely ridiculous, but they're very pernicious, because they take attention away from the actual issues, the big issues that we really need to be thinking about and talking about when we're discussing, developing, governing and adopting AI systems.
Issues like the under-representation of women in data science and AI. There are lots of issues I could have flagged here, but I'm drawing specifically on research, which Alison mentioned already, that I led previously at the Alan Turing Institute, exploring how the under-representation of women in data science and AI, as well as in the financing of these systems, shapes tech innovation and other data-driven systems.
We found evidence of persistent structural inequality in the data science and AI fields with the career trajectories of
data and AI professionals differentiated by gender. So, you'll see here women are much more likely than men to occupy jobs associated with less status and lower pay. So, for example, data preparation. And then men were much more likely to hold the more prestigious frontier roles in engineering and machine learning.
And so this is really about the people making the subjective design choices behind the development of data-driven systems. Decisions which, of course, reflect their preferences, their values, and decisions which shape and are then encoded into technologies well before they are launched into the wild for adoption by, for example, businesses who are members of the IOD.
We also did some research looking at the financing of data-driven systems, namely venture capital, and we found that over the last decade female-founded startups won just 2% of VC funding deals involving AI companies in the UK. We also found that average capital raised per deal by a female-founded AI startup is six times lower than the average deal capital raised by an all-male founder team.
And then within AI enterprise software specifically, where investment is really booming, all female teams raised only half a percent of capital. This isn't just about economic and social equity, but it's really about innovation. It's about creating systems that actually serve populations that will contribute to real growth, not just economic, but otherwise as well.
We've seen the results of these harmful feedback loops: garbage in, garbage out; bias in, bias out. The thing is, it's not necessarily about an intention to harm; it's caused by the fundamental structures of data systems. Which brings me back to the point that it's no wonder businesses in the UK have massive trust issues when it comes to data-driven technologies.
So today, I forgot my poppy unfortunately, marks the 80th anniversary of the end of World War II. So I wanted to mention that, in sharp contrast to the data workforce we see today, at the advent of electronic computing during the Second World War, as Chris discussed, software programming was largely considered women's work, and actually the first "computers", in inverted commas, were young women.
The majority of workers at Bletchley Park, as you saw in Chris's slide, were women. But as computer programming became professionalized, the gender composition of the industry shifted.
marginalizing the work of female technical experts. In other words, as money and influence entered the field, more men began to work in the field. And this pattern seems to have replicated itself again in data science and AI today. So, in contrast,
On the right of the slide, we can see who is in the room today making key choices about the development and the governance of data-driven technologies. Basically, who holds the most power? And it's not the average UK SME, small business. It's not the government and it's not the public.
So bringing it back to our work at the IOD: we make sure in the policy team that mainstream British businesses, companies of all sizes, have a voice in the policy debates around data and AI governance. And for initiatives like the Code of Conduct, which I mentioned at the top of my talk, we support directors and boards in providing effective oversight, including of their own data systems and the data-driven systems they're building,
so that their organizations can operate responsibly, not only sustaining the trust of their stakeholders and their shareholders, but also working towards the benefit of the economy and society at large. Thank you very much. Thank you so much for your presentations and discussion. We will now open the floor to questions from the audience, both here and online. If you're online, could you please type your questions into the Q&A box; we will try to answer as many as possible. Please also include your name and affiliation. For those of you who are here in the room, please raise your hand and I'll take questions in batches of three.
And as you here in person and online think about what your questions might be, I just wanted to get us going and see, Chris, if we could sort of interact with the two presentations that followed you. Alison was talking about participation, fair and equal participation in the discourse on technology and the kind of socio-technical systems that we want. So you, Alison, were focused on participation, and you, Erin, were focused on parity, representational parity in the tech sector, as one way to achieve the type of socio-technical systems we'd like in the face of unequal power. And so, Chris, I'm just wondering if you had an example from recent times that you think is the most compelling example of how this sort of people power challenges the state and private power that you spoke about in the book, and that you would like to see result in solutions beyond just tech solutionism. So yeah, could you share with us just one or two examples? Are they similar to Alison and Erin's or do they differ?
Well, so at Columbia I'm an engineering professor, so I teach young people who go on to careers in technology. So I'm particularly attuned to cases where they've gone into companies and then I talk to them later about their experience as technologists.
So, I'm particularly interested in this internal experience of people who become software developers, data scientists, product managers within companies. So, within the vast sphere of people power, I'm particularly interested in the people who are working in these companies.
So, within that form of private ordering, to quote the article from the legal scholar, there's all sorts of ways in which people talk to each other and perhaps this is what you meant by saying technology is people. I mean, they are people building the technologies and distilling from what they think are their standards down to lines of code and ultimately product decisions. The ways that they push back include collective action, small and large, and
demanding that the launch cycle include not only QA, quality assurance, but some sort of understanding of, or at least thinking through, the potential impact on people. And absent that, and sometimes after that, alerting the wider world to the consequences of these technologies. Sometimes this happens after the people have left the company. There are plenty of great books by people who are alumni of these companies who then try to help us all understand how these companies work.
Sometimes it happens through whistleblowers, leakers, and planters who go to the press. I am a big fan of investigative journalism, so I do think that's one way that people work together, inside and outside: to make it possible for people to discuss with the part of civil society that is the press, to make sure that people have thought through the potential damages, seen and unseen, present and future, of technology. Those are my favorites: the forms of external visibility. Sure, we'll be able to touch upon some of those themes. Can I get questions from the audience? We'll go there, and then behind, and in the middle. Yeah, just developing on this point about the general public as this shepherd for better practices around AI. I was at a
conference at the Bennett Institute in Cambridge yesterday, and similar discussions arose. There's this concern that the general public is caught between either alarmism or AI hype, and this concern around the literacy that informs the ability to hold practices accountable and push back. So in what ways do you think a data-centric view can contribute to increasing literacy and...
Well, I think you make a good point about the dichotomy between irrational exuberance and the trough of despair, to quote the diagram from Dr. Young. And I think that's a very useful diagram. We go into a technology often with a view of what it can do. Sometimes that view is, in fact, argued for by the people who developed or sold that technology to us.
And often as we interact with the piece of technology, we find ourselves in that trough, as she pointed out, of realizing that it actually doesn't do everything we were told it would do. The best place, surely, is the far right-hand side, in which we have rational exuberance about what it can and cannot do, which prevents either inflated hype or doomerism from having too much power. I hope that responds to your question.
I mean, I might also add that this dichotomy is itself a story, right? So it is a story saying there's sort of two main kinds of people, the ones who are getting on board or the ones who are getting in the way. And, you know, I think all three of us are really interested in presenting something that's a little bit more nuanced.
and therefore helping decision makers, policy makers and governance structures understand that it's not that there's a huge gap in people's technical education about AI, but there might be a gap in the social conversation that would permit other things to become apparent, rather than just "on board" or "in the way".
Thank you. It's been an absolutely fantastic talk. If I could just jump wildly ahead to a kind of gigantic question: obviously we find ourselves in a very challenging world politically since the new administration took power in America.
hopes for enlightened regulation from the federal government are kind of shattered, it seems, at the moment. There seems to be increasing rivalry between America and China. You know all this. So, in the context of where we are today,
What would you say is a way, a strategy for attempting to regulate AI in some way so that it benefits as many people as possible rather than the tech elite?
Sorry, I know it's a gigantic question, where do you start, but your thoughts would be greatly appreciated. Yeah, I would say one way to think about it is to expand the definition of regulation. Particularly in the States, people have a conception that there's one form of regulation, which is central federal laws. And of course regulation, even legal regulation, is not just federal laws but state laws and municipal laws. Individual municipalities in the States, for example, have banned facial recognition or something like that. And then of course regulation takes place in other jurisdictions. GDPR is a good example: United States companies that wish to be active in Europe had to decide, do they want to make a separate infrastructure for Europe, or do they want everything to be GDPR compliant? Most companies decided to simply make one,
effectively a better business intelligence infrastructure that made it possible to operate in Europe. Moreover, as the legal scholar Larry Lessig pointed out at the end of the previous millennium, regulation is not just about laws. Regulation is norms, laws, markets, and architecture, as he put it. Laws are just one part of regulation. There are also our own technological solutions, which is architecture.
There are also markets: whether people choose to spend their money, for example, on an allegedly self-driving car. If an entire country decides they're not going to buy that car anymore, it can have an order-one impact on the stock and the future of that car, for example. And our own norms, right? We have to decide: is it okay to use a large language model to write our friend's eulogy? That's sort of up to us and our norms.
We were talking earlier today about the various meanings of norms as well. All of those form a type of regulation, irrespective of whether or not one particular country, at the level of the whole state, is bringing about regulation that enforces consumer protection. Just to add to that, we sometimes see this kind of innovation-versus-regulation framing, as if the two can't coexist. And on top of that, we've historically seen Europe cast as the regulator and the US as the innovator. I think once again it's nuance that we need. It's not that regulation is one thing and innovation is another. I think we're seeing, both in Europe and the UK,
kind of more exploration around this. So for example, with the AI Act in Europe, there are now a lot of conversations around, well, how do we actually implement this in practice? Should it, you know, be a bit more flexible in its approach? Likewise in the UK, generally we're taking a more sector-based approach, compared to Europe, which takes a more principles-based approach. But again, we have, for example, cooperation across regulators through the Digital Regulation Cooperation Forum, where regulators can discuss the best approaches, the most flexible, nuanced approaches, on a case-by-case, sectoral, industry basis. So I think it's really keeping this nuance in the discussion that will make the difference.
Yeah, maybe it sounds almost a bit obvious, but Alison, you mentioned 2019 being in the middle of data ethics being a thing.
And I'm kind of wondering what narratives or understanding you all have for why organizations at that particular time, over the course of those few years, felt the need to concurrently produce these sets of principles, etc. Was it a change in language, or was there a real kind of cultural shift? What explanations might you have for that?
I'll start if that's okay, Chris. Yeah, we were discussing this a little bit earlier. In my view, having watched this unfold, there was a strong social critique beginning to emerge around the observable kinds of harms that were resulting from the design choices made around certain kinds of systems. Those systems included automated algorithms for enforcement, the COMPAS system for sentencing, and facial recognition systems, which were widely understood to misrecognize certain groups of people, which caused disproportionate levels of harm. So this was something that was well documented, well discussed, certainly part of the scholarship of people sitting in this room. And this was starting to have an impact. This was starting to cause people within the technology industry to be concerned about how their products might be received, or whether Chris's outcome, of people ceasing to consume as many of these products because of a perception of widespread harm or of malfunction based on these different kinds of design decisions, might come to pass. So part of what occurred was a sort of attempt by companies
to capture the energy and channel it into something that was less disruptive to their bottom line and their business model. And that of course meant for technology companies like Google to build ethics boards to respond to external and very importantly internal critique of the designs of certain kinds of systems and the recognition of the harm that those systems were causing.
And I actually originally had a slide that said, you know, "data ethics is a problem, question mark," with some news coverage of the time, some of which was about the Google ethics board, which was disbanded almost immediately because it was very clearly performative, and they also hadn't very carefully considered who was actually participating on that ethics board.
So this was, I think, a very dynamic form of what some people might call capture, but you might also call a kind of unstable terrain of struggle. There's a terrain of struggle that's identifying real problems with a socio-technical system, and then power and benefit seek to preempt that critique, to absorb it before it becomes too damaging. And then several years later, you know, it's allowed to fall away, also because the money has moved somewhere else, in this case to massive investment in AI startups. Great, I think that actually segues nicely into an online question from Kito Shillamma, a former MSc Data & Society student
and current Mozilla Foundation Senior Tech Policy Fellow, who asks: "On the topic of power and data, I'm interested in the panelists' perspectives on resistance, particularly how to incorporate it in policy discussion." For Alison, do you have thoughts about how to scale resistance movements to data practices from a local to a national and global level? For Erin, do businesses consider resistance as a form of governance?
which I think actually nicely links back to some of what you were saying about municipalities, for example, bans on facial recognition. So yeah, great question. Yeah, such a fantastic question. And it's really wonderful to hear from Yukito, who's doing spectacular work themselves. So how to scale resistance? Well, first of all, I wonder about scaling, because sometimes when we talk about scaling, we assume that scaling has to work in the way it works in hypercapitalist businesses, where you take one kind of tiny pilot idea and then you just repeat it many times with the same sort of principles and with the same expectations. And I don't think
that this is how these kinds of dynamics operate. So when I was spending time in East Walworth, we were thinking about how to make a kind of
network of people who were interested and active already, but whose interests and activities might be quite different from each other, to allow them to speak with each other and do some good practice sharing. So this is not like scaling as in repeating and trying to do exactly the same thing. It's much more thinking about change as a sort of emergent
collaboration or federation of people with different interests who could potentially share different tactics.
So that is one model if you're thinking about getting from the very, very small to the slightly larger. The other way is to think about different scales of institutions. So when I was also talking about East Walworth, I was talking about the borough. So in London we have a great number of boroughs. I can never remember if it's 27 or 29 or 32. So these are local governments and they are an intermediary state between the very small and the very large.
In other places you have different kinds of municipal governments, state level governments. You often have things like neighborhood associations. These can link together again in this kind of federated manner to share knowledge and practice which also allows for ideas to move across different kinds of institutions.
and allows you to have your change not have to be repetitive or be the same every time, but for your change to be different and therefore specific to whatever context you need it to be working in. Very good question, and it's something I'd like to ask my husband, Isabel. My sense is that we're very early on in adoption and governance in mainstream business,
when we're thinking about particularly generative AI, I would say, and kind of broader data-driven systems. And so for the directors I speak to and the organizations they represent, particularly those which are in sectors and industries which are kind of far from digital-first industries, they're at the very initial stages of
asking questions about these systems: what are the most valuable use cases, then how do we govern that, what are the risks? So my answer is, it's a little bit early to tell, because we're still at that very early, experimental stage. It will be interesting to see how that unfolds. It raises a lot of questions about...
ex post and ex ante, right? That, I think, is probably a topic we could spend a lot of time talking about. Yeah, what I just sort of started thinking about, Erin, while you were talking was: many years ago, if you'd asked me this question, I would have said, well, you could make different stuff. You could, you know, have open-source models for different kinds of development practices, and you could have, you know, design-driven innovation also providing different kinds of alternatives. And I wanted to bring that back in, because I feel like either I got really cynical or we stopped talking about actually building different kinds of technical alternatives, which meant that we stopped, perhaps, having an imagination that we could build technology in a different manner, one that wasn't the manner in which your dumpster fire got started. I don't know if you had thoughts on that.
I mean, I'm in favor of technologists thinking creatively about the consequences of what they design, certainly. I think design is only one element of, again, the socio-technical system. There are many forces at play. There's abundant capital that drives people to design certain things, irrespective of the design hopes of the individual technologist. So that fits nicely with another online question that we have from Aisha Mahal.
What's the panel's view on the term data colonialism, whereby data has become an exploitable resource used to deepen power asymmetries between the global north and the global south? Where do ethics and the public interest sit within this dynamic, and what could be done? Maybe it's not a design solution in that case. Yeah. I mean, my conception of ethics definitely touches on at least two aspects of data colonialism.
One is justice, which is often interpreted as fairness: are the benefits distributed equitably among different people? And the other is respect for persons, which is often interpreted as informed consent, because often the people who are in a position to have their data extracted are not the people who are really consenting to the way those data are eventually going to be used.
So certainly I think the framework of ethics as a set of principles speaks directly to data colonialism under that definition. As far as what is to be done, I'll turn to my panelists, particularly my policy-minded panelists. What is to be done?
So I'm going to wade in, because our online questioners are really erudite, really clever, so we're all struggling away. Okay, how do I address data colonialism? So, listen, the current models
for both design and for the industrial strategy behind data-driven systems depend on a constant increase in input. This is also how current machine learning systems are designed. In order for the thing to scale, in the sense that we were talking about earlier with
scalable business models, you have to keep adding more stuff. So if you have to keep adding more stuff, and there are finite sources on the existing internet, which is what many of the conventional large language models are trained on, then you need new sources of data and information. If you are
creating a business that's trying to expand into new markets, you also need material that's written in different languages. So this certainly starts turning information produced in Global South contexts, or all around the globe, into things that look attractive, provided the cost of acquiring them is low enough. So I'm going to jump straight to the solution to this dynamic, because this is a dynamic that is colonial,
you know, in terms of how it is oriented. But it does not have to reproduce or repeat the colonial violence. And sometimes when I hear the phrase data colonialism, I get a bit worried, because it implies that the structure of an industry that depends on gaining new resources is inevitably going to produce a kind of colonial violence. So one way
you could preempt that would be to make those new sources of material, which in this case is data and information, things people say, whatever's on the internet, expensive. And I think this is one way that you could reverse the dynamic. You could try to create a different sort of market, a market in which this notion of data as abundant,
and therefore cheap, and therefore something you're always looking for more of, a mere raw material, is turned into something that needs to be worked with carefully. I talk about this a little bit at the end of my book. There was a moment there where I was able to write about: what if we used data minimization strategies? What if we assumed that data was going to be expensive and not cheap,
and therefore people's participation in systems of governance would be equally expensive, if they were participating, for example, in a citizens' assembly, or if they were contributing data to a data-driven public consultation? We have a question here in the front.
You mentioned markets as an arena of regulation and public interest, and clearly mass holdings of data grant an immense amount of power in markets. But do you think our competition regulation has actually developed concepts and tools that address that power granted by data in markets? Has regulation created... Has competition regulation helped create...
Sorry, can you... Has competition regulation helped create...? No, I think, basically, holding data gives enormous power in markets. Does our regulation of markets actually have tools and concepts that match that? That's a good question.
So, for example, if you look at people arguing for AI as a differentiator in company development, one of the arguments for AI is that it's a self-perpetuating flywheel: the companies that have more data create more value, they use that value to gather more data, and then
they have a runaway advantage and they win. So I think what you're getting at is: does the existing regulation, in the States you would say antitrust regulation, adequately recognize and defend against the possibility that one company takes market advantage because it has gathered so much data that the data itself is its moat?
I kind of want to give a US-centric answer, which is that it depends on who's running the FTC. Meaning, there's actually a debate that's been going on in the last decade, in particular in the States, around what antitrust should do. Should it be, for example, just about price? There's a conception that the only type of market advantage or monopoly power that matters is whether the company is raising prices unfairly to the market. And then there's a counter-narrative that it's about power.
And so one of the arguments that's happening right now, advanced by a legal scholar named Lina Khan, who to my surprise is still at the FTC, is that the role of antitrust should be about power, and that we should regulate companies that are able to capture the market
not because they're charging people more, but because they have monopoly power, because they have such dominance that they must be broken up. And that debate is being played out in court cases week by week, particularly in the last few weeks, with Google and Meta on the receiving end. So I think it's absolutely possible, but it requires people to think about antitrust as being
not just about price but also about consumer protection and about enforcing competition. But you're right that certainly when I started going to venture capital events in 2010 or so, part of the narrative was that AI and data are going to be their own moat: that the moat will not be a better product, the moat will be that you gather so much data that no other company can compete with you, because you have the most valuable data and no other company can afford to gather it.
Yeah, to add to that, there is an interesting, I think mainly US-based, "little tech" movement, where a lot of VC-backed tech startups and some of the venture capital firms as well are pushing back against this, arguing that no, existing regulation doesn't adequately challenge that power. Something else I was thinking about: I think it really depends on where you're looking in the AI data value chain, or in the tech stack.
So, for example, if we look at the share of the cloud market in Europe and the UK, we are obviously hugely reliant on, I'd say, three or four US companies. So there, perhaps, regulation isn't intervening in the right ways. In other parts of the stack, maybe it's different. I'll just add one quick thing that might be worth considering, which is also
not just the consumers but the workers, right? The extent to which, for example, platforms have a monopoly on the labor market, and whether we've used enough tools in the toolbox to actually address that. We had a question in the back, and then I'll come here, and then... actually, if we could batch these questions: you can pose your question, you can go next, you can go third, and fourth if we can squeeze it in, and we'll try and get a round-up.
I'm afraid I have a very boring question about data cleansing and data governance, so perhaps for Dr. Young as well as the others. The two words that for me were interestingly prominent by their absence in this very, very interesting discussion were fact and truth, and I appreciate how loaded those are.
And I appreciate that at the macro level, you know, big data modeling, data sales and so forth, this may be irrelevant. But thinking about companies using AI, using advanced learning, for more internal decision making, anything larger than a simple local business immediately starts to get lost for lack of
clean, truthful, factual data. And this is
a problem which potentially could break businesses. In many cases it has, to some extent, or it has put very large companies at risk. So I'm wondering: is there an awareness? Are you seeing the question of data governance come up, perhaps in parallel with this very interesting alarm about what AI can really deliver? Because, of course, if we don't have the data quality, then how can we use it to make decisions?
Yeah, it's a really good question. I see this as fitting closely alongside digital, tech and AI literacy. What I'm noticing when I speak with boards, for example, is that many don't have the kind of
basic, whatever that might mean, AI literacy, data literacy, to even know which questions to ask regarding data governance. What does our data look like? How do we begin organizing this data? So again, I think this comes back to, it's almost a layer before thinking about this:
how do we make sure that SMEs, for example, or boards of non-digital-first companies, understand this to the degree that it's not just strategic but also responsible? I'm going to batch the questions now. We could just take all of the questions at once and then I'll give an opportunity for all of you to respond. Hi everyone.
So we've heard a few times tonight about the power of a smallish number of companies heavily influencing the future of AI and controlling lots of data, the data moats, as you were talking about. However, typically when it comes to new technologies, we talk in terms like creative destruction, as having both pros and cons, and
when it comes to the typical value exchange between market efficiencies and equality, do we feel that there's more of a tipping point this time, because so few companies are controlling and leading the way so much that public bodies and typical public authorities are playing catch-up, not wanting to hold the market back, but also having to play catch-up when it comes to regulation after the fact?
Thank you so much. It's been a fascinating session. With all your knowledge and experience directly and through people you've worked with, what's your personal behaviour in terms of protecting your data, in terms of social media? How do you live online?
Hi. Dr Powell and Dr Wiggins, you both used the trash fire metaphor, and I think you were perhaps joking slightly when you said you didn't need to explain that any further, but I'd appreciate it if you did, if you could talk about some of the concrete
harms that you think are happening now from unethical use of data. I know there's been quite a bit of discussion of accountability and bias in decision making, but is there anything else? I'm not left that worried. Maybe I'm being blasé. Great, okay. So: not that worried, can you explain the internet trash fire? Second question, in reverse: your own personal data hygiene habits. And
third question: can you respond to this common problem of regulators having to catch up with creative destruction?
So, again, the U.S. perspective is, yes, they play catch-up, and I think that's the way regulation is often designed. It's often said in the States that we shouldn't regulate technology before a problem occurs. I'm not saying that I personally advocate that, but it's certainly something that you hear. And, in fact, often in the States, it's some extremely, cataclysmically bad thing that
happens before there's any sort of regulatory response, including the financial crisis of 1929 or the deeply unethical research that led to the Belmont Report, which I really can't go into without crying. It's bad. So yes, it's often the case that, at least in the States, regulation comes after a cataclysmically bad event. As far as how I live my social life: not on X.
I have never used Facebook; it always gave me the creeps, same for Snapchat or any of those other things. I use DuckDuckGo in incognito mode, and I prefer to interact with LLMs using Duck.ai, which is a privacy-preserving way of doing that, and I particularly like the open-weight choices rather than sending all of my information via an API to a private company. There are several other ways that I avoid exchanging data with the internet.
Oh, are you avoiding the concrete harms, Chris? Correct. I mean, there are so many trash fires. I would rather lean on the shoulders of scholars. Algorithms of Oppression is a great one. I can still remember meeting Safiya Noble at Data & Society, before the book existed, and I was like, oh, search, what's wrong with search? And she explained to me what happened when you looked for, you know,
pictures of white girls and pictures of black girls. Or when I met Latanya Sweeney and she told me the story about Googling her own name, and then sitting next to her co-author, and the co-author Googled his name, and they came up with radically different results, right? For Latanya Sweeney it was arrest records, right? And at that point Latanya Sweeney was already doing fine as a professor at Harvard, but...
I don't actually know where to begin, other than to encourage you to check out some of the books. Virginia Eubanks's book, Automating Inequality, is a deep ethnography where she interviews people, and as one of them says to her, be careful, because what they're doing to us now, they'll do to you later. She interviews people from low socioeconomic backgrounds who are having their benefits decided algorithmically. Or you can look at the past 100 days in the United States, of people being fired at scale, including people who are in charge of
dealing with viruses, or my colleagues whose grants were terminated because they mentioned trans-regulation of genetic regulatory networks, and "trans" was a bad regular expression. The examples are too numerous; we would be here a long time.
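As a minimal sketch of how that kind of keyword screening goes wrong, assuming a hypothetical filter built with Python's re module rather than the actual system Chris describes, a bare substring pattern for "trans" flags perfectly legitimate molecular-biology language:

```python
import re

# Hypothetical keyword filter of the kind described above (not the actual system).
naive_filter = re.compile(r"trans", re.IGNORECASE)

# Invented grant text using "trans" as a standard term of art in biology.
grant_abstract = ("We study trans-regulation of genetic regulatory networks "
                  "using single-cell expression data.")

# The bare substring match flags the abstract: a false positive.
print(bool(naive_filter.search(grant_abstract)))   # True

# Even a word-boundary pattern still matches the hyphenated scientific usage,
# because intent cannot be recovered from the string alone.
word_filter = re.compile(r"\btrans\b", re.IGNORECASE)
print(bool(word_filter.search(grant_abstract)))    # True
```

The point of the toy example is simply that pattern matching has no notion of context, so a policy enforced by regular expressions inherits all of their false positives.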
But I encourage you to check out the slide with the excellent work by a variety of women scholars from 2017 through 2019 for a good start, or probably many of the papers by my colleagues on the panel. I can work backwards. So actually, in a
presentation I used to give at the Turing, there were a few slides I showed at the end with a lot more examples of gender bias in AI and data systems. And one which always really struck me, which
thankfully is no longer the case, and which actually my colleague in the front row sent to me, was a translation system where you were translating from a language that didn't have gender into a language with gender. So, for example, if you wanted to translate "they clean" and "they make money", it would translate them as "she cleans" and "he makes money".
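As a minimal sketch of the mechanism Erin describes, assuming purely invented co-occurrence counts rather than any real translation model or corpus, a system that simply picks the pronoun most frequently seen with a verb reproduces exactly this pattern:

```python
# Invented verb-to-pronoun co-occurrence counts, purely illustrative;
# no real translation system or corpus is represented here.
cooccurrence = {
    "clean": {"she": 900, "he": 100},
    "make money": {"she": 150, "he": 850},
}

def pick_pronoun(verb: str) -> str:
    """Choose the pronoun most frequently paired with the verb in the toy counts."""
    counts = cooccurrence[verb]
    return max(counts, key=counts.get)

print(pick_pronoun("clean"))       # "she" -> "she cleans"
print(pick_pronoun("make money"))  # "he"  -> "he makes money"
```

The stereotype is not written anywhere in the code; it falls out of choosing the statistically most frequent option from skewed data.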
Now that's been changed, but much more recently we've seen image generation systems replicating this again, if you ask for an image of a doctor, for example, versus an image of a nurse. So yeah, that's just one of the kinds of harms and risks among the plethora that we see. Personal behavior: I grew up
with the advent of YouTube and MySpace, actually. And this was long before we had any conception of what this might mean, the harms of putting data online. So I'm not sure what happened to all my data on MySpace. Hopefully it was deleted. Now I am much more careful, obviously, but I do think about that sometimes. Thirdly, the question of regulators playing catch-up.
I mean, of course regulation and law lag behind tech capabilities. I think it's a question of how far behind, and how agile this can be. Ofcom recently has been loading up on tech talent, hiring a lot more people into the regulator who understand these systems not just from a technical perspective but from a social perspective, and so I think initiatives like this
will up the agility of regulators and perhaps make this not so much of a problem going forward. So, I've noticed that it's 7.58pm, and I know that these events run strictly to time, so I will...
acknowledge that my colleagues have mentioned so many of the harms. I will however just point out that if one is reflecting on the lack of harm, that might mean that one is in a category which is traditionally privileged and therefore less likely to experience harm
at any given time, but that we all move across these categories. And as political situations change, we may find ourselves moving from a category in which things do not feel risky or consequential into categories where they do feel much more risky and consequential, as I think Chris has narrated already from the U.S.
I think Erin has answered the question about monopoly so well, so I will leave you with my extremely incoherent online life, which is an attempt to respond to a sense that things have indeed become more risky than perhaps they were before. And perhaps I did have more information about myself
on the internet prior to learning more about these kinds of systems, the amount of data that's ingested and the very complex kinds of predictions and matchings that occur. And I will make just one final observation because I know Sita a little bit and I also know that she is very careful about her online life and
just leave you with the thought that the people who are thinking really carefully and deeply about data ethics and governance are trying not to leave too many traces online. And with that, it's been a great pleasure to have you. I think all of you have enjoyed this panel; it has left us with a lot to think about. Thank you all very much for coming to this event. And please now join me in a final round of applause.
Thank you for listening. You can subscribe to the LSE Events podcast on your favourite podcast app and help other listeners discover us by leaving a review. Visit lse.ac.uk forward slash events to find out what's on next. We hope you join us at another LSE Events soon.