
The SSN Breach: What Now?

2024/8/18

a16z Podcast

People
Joel de la Garza
Naftali Harris
Host
A podcast host and content creator focused on electric vehicles and energy.
Topics
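Joel de la Garza: This data breach is enormous in scale, involves a large volume of sensitive information such as Social Security numbers, and poses a serious threat to individuals and society. Social Security numbers do not change, so once one leaks it is hard to remedy. Although much of the leaked information may already have been exposed in earlier breaches, this incident gathers it in one place, which makes it easier for attackers to use. Combining personally identifiable information with other information (such as bank accounts or driver's licenses) increases the risk of fraud. Attackers also face data quality and accuracy problems, much like the marketing problems legitimate businesses face. Individuals can take some basic security measures to reduce risk, such as enabling two-factor authentication and using a password manager. Data breaches remain a problem, but appropriate security measures can effectively reduce online risk. Changing incentives can be effective against cybercrime. Existing technology can solve the data breach problem, but implementing these solutions takes political will and action. Public key infrastructure can solve the identity verification problem, but applying it in practice faces challenges. The U.S. government has long worked on improving identity verification systems, but progress has been slow.
Naftali Harris: The breach came from a third-party company that collects personal information. The leaked data includes names, Social Security numbers, addresses, and other details, and affects U.S. and Canadian citizens. The hackers first tried to sell the data on the dark web, found no buyers, and eventually released it for free. Naftali's team verified that the leaked data is authentic; its quality is uneven, with duplicates and errors. Obtaining leaked data on the dark web is relatively easy. SentiLink blocks more than 20,000 identity theft cases every day and provides identity verification services to financial institutions and others.
Host: This episode discusses the recent breach of nearly three billion records, including a large number of Social Security numbers. Security experts Joel de la Garza and Naftali Harris join the discussion. Naftali's team obtained the leaked dataset and verified the claims. Mandatory disclosure of data breaches helps raise consumer awareness and prompts consumers to respond correctly.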

Chapters
The chapter recaps the recent data breach involving billions of records, including social security numbers, and discusses the nature of the breach and the data at risk.
  • A third-party company collecting identity validation data was breached.
  • The data included names, social security numbers, and addresses of U.S. and Canadian citizens.
  • Hackers attempted to sell the data on the dark web but later released it for free.

Transcript

Hello, everyone, welcome back to the a16z Podcast. Today we've got a special episode covering a timely piece of news that, quite frankly, I wish we were not reporting on. In case you missed it, this week there was a reported breach of nearly three billion records, but not just any records.

Headlines included, quote, "billions of Social Security numbers exposed," or even, quote, "Did hackers steal every Social Security number?" Naturally, we wanted to bring in the experts to break down what really happened here and its expected impact. So joining us today are Joel de la Garza and Naftali Harris. Joel is an operating partner at a16z who was previously the chief security officer at Box and, previous to that, the global head of threat management and cyber intelligence for Citigroup. Naftali, on the other hand, is co-founder and CEO of SentiLink, a company that helps block identity theft and fraud for hundreds of financial institutions at a scale that might make you sweat.

We verify over a million people every day.

Incredibly enough, Naftali's team was actually able to get their hands on the breached dataset, and they were actually in the room as we were recording. So you'll hear Naftali reference them as they were poking and prodding to validate the claims. Listen in as we explore the who, the what, the when, the where, the why, but also how a breach like this happens and what we can do about it.

You watch these markets. Like, this has been going on forever.

You can see the fraudsters talking about this on the forums.

Social Security numbers are the kind of things that don't change, right? You get one when you're born and you're stuck with it for a while.

If you're in this breach, you're in the breach. And probably all three of us are, frankly.

As a reminder, the content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any a16z fund. Please note that a16z and its affiliates may also maintain investments in the companies discussed in this podcast.

For more details, including a link to our investments, please see a16z.com/disclosures. So, Joel, Naftali.

This was a pretty crazy week. I got a Slack from one of our coworkers, Joel, that was like, "You seen this Social Security hack?" And I had not at that point.

And that is a pretty terrifying message. So why don't we just take a second to recap what actually happened here? What was this breach and what data was potentially at risk?

Yes. So just when you thought there wasn't any more information to leak out into the world, there's always a surprise that there's still more data to come out. And so this week, we saw there was a third-party company that collects all this information and uses it for things like validating your identity.

And so they have your name, your Social Security number, your address. They also have nicknames, and it seems like they had it for all U.S. citizens as well as all Canadian citizens. So this is larger than just the U.S.

And I'm not safe as a Canadian.

Can't save you this time, unfortunately. And so these hackers somehow came about getting this information, and then they tried to sell it on the dark web. And there weren't any takers.

Nobody wanted to buy it because, like I said, I think all this stuff was already public. And so when they couldn't sell it, they just released it for free. And so now there is this hundreds-of-gigabytes file out there on the internet that encapsulates data about all Americans and most Canadians. So there you go. That's what happened.

When I read the articles yesterday, it seemed like this was alleged reporting. People weren't sure, confidently, that this was billions of data points, including Social Security numbers. How sure are we that this is the data that was hacked?

Well, Steph, I can answer that quite confidently because we actually have it. So we found it ourselves on the dark web. And filling in a little bit of the timing here: National Public Data, which is the company breached, reported that the hack itself happened in December 2023.

And then it got released onto the dark web on a place called BreachForums by some hacker named Fenice, or "finest." If I mispronounced your name, please don't come after me. But that person released it on August 6th, and we got a copy ourselves and so we've looked through it.

And yeah, it's as reported. So there's names, dates of birth, addresses. The data was, I'd say, relatively messy, relative to some other data breaches that you sometimes see. But no, we're confident it's true because we literally have it.

Geez. And so when you say it's messy, so like if there's a name and there's a Social Security number, are those linked, and are those linked to email or any other fields that might be in there?

Yeah, I'll give you an example of the ways in which it's messy. So, for example, the first six records all correspond to the same individual, a woman from Alaska, but they have different variants of her name, including, like, nicknames and stuff like that, as well as two different addresses that she had. That's one level of messiness.

Another way that the data is messy is that about ten percent of the SSNs are obviously fake, like they begin with three zeros or four zeros. So the data is not as clean as it could be, which is obviously a good thing. But there's no question there's a lot of bad stuff in there.
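The "obviously fake" check mentioned here can be sketched in a few lines of Python. This is only an illustration, not the method Naftali's team used: the function names are hypothetical, the sample values are made up, and the rules are the well-known structural ones (an area of 000, 666, or 900 and above, a group of 00, or a serial of 0000 was never issued). Passing the check says nothing about whether a number belongs to a real person.

```python
import re


def is_obviously_invalid_ssn(ssn: str) -> bool:
    """Return True when an SSN fails basic structural rules.

    Hypothetical helper for illustration only.
    """
    digits = re.sub(r"\D", "", ssn)  # drop dashes, spaces, etc.
    if len(digits) != 9:
        return True
    area, group, serial = digits[:3], digits[3:5], digits[5:]
    # Areas 000, 666, and 900-999 were never issued.
    if area == "000" or area == "666" or area >= "900":
        return True
    # Group 00 and serial 0000 were never issued either.
    return group == "00" or serial == "0000"


def invalid_share(ssns) -> float:
    """Fraction of the given SSNs that are obviously invalid."""
    ssns = list(ssns)
    if not ssns:
        return 0.0
    return sum(is_obviously_invalid_ssn(s) for s in ssns) / len(ssns)


# Made-up sample values, not taken from any real dataset.
sample_ssns = ["000-12-3456", "123-45-6789"]
share = invalid_share(sample_ssns)
```

Run over a sample of rows, `invalid_share` gives the kind of "about ten percent" figure quoted in the conversation.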

You've obviously got access to the dataset. How long did it take you to actually get access to it?

To get access?

How long did it take you guys? Oh my god. I can't believe this team. Unbelievable.

Incredible. Great. My team, we only got it, like, the day it was released.

Maybe for the listeners, give us a little insight. When you get access to data like this, what are you looking at, right? I mean, obviously this is not your first rodeo.

Yeah, when we first get data like this, the first thing we're trying to do is just understand, like, what's in it. So we'll take a look at the first couple thousand rows and just understand what fields are present and what that looks like.

Then we look at where it actually came from and how common the different fields are. So, for example, for this particular breach, phone number is mostly missing. It's mostly, like, name, address, social, and date of birth is sometimes there and sometimes not. For example, we looked at the Evolve data breach from about a month or two ago, and that one had information on ACH transactions and balances across different fintechs and stuff like that. And so, you know, that led us down a different sort of path.
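A first pass like the one described, reading the first couple thousand rows and seeing which fields are present and how often they are filled in, might look roughly like this in Python. It is a sketch under assumptions: the column names and the two sample rows are invented, `fill_rates` is a hypothetical helper, and the real file's layout is not described in the conversation beyond being a CSV.

```python
import csv
import io
from itertools import islice


def fill_rates(csv_file, max_rows=2000):
    """Share of non-empty values per column over the first max_rows rows."""
    reader = csv.DictReader(csv_file)
    counts = {name: 0 for name in (reader.fieldnames or [])}
    total = 0
    for row in islice(reader, max_rows):
        total += 1
        for name in counts:
            # Short rows yield None for missing columns; treat as empty.
            if (row.get(name) or "").strip():
                counts[name] += 1
    return {name: (counts[name] / total if total else 0.0) for name in counts}


# Invented two-row sample standing in for a real file handle.
sample = io.StringIO(
    "name,address,ssn,dob,phone\n"
    "Jane Doe,1 Main St,123-45-6789,1980-01-01,\n"
    "J. Doe,2 Oak Ave,123-45-6789,,\n"
)
rates = fill_rates(sample)
```

A column whose rate is near zero (phone, in this invented sample) is "mostly missing" in the sense used above, while a rate around one half corresponds to "sometimes there and sometimes not."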

And just for folks listening who aren't spending time on the dark web, how easy is it really to access the data?

It's pretty straightforward if you know where to look. And look, we're also by far not the only people doing this. I mean, I think as of this morning, there were something like twenty-six thousand views on BreachForums for the thread. So, like, the fraud community is looking at this. And we've seen it there, we've seen it on Telegram, we've seen it on LeakBase. It's just all over the place.

And so if you know where to look, it's not that hard. Obviously folks like the three of us don't do this every day. But for fraudsters or for the professionals, you can find it.

It's not reassuring. But I mean, that's the answer I expected.

I would say that this is probably one of the big wins for sensible regulation around breach disclosures. Like, I think, having worked in this space since before there were breach disclosure requirements, these things were always happening and no one talked about them. And consumers were just oblivious. And I think that knowledge is power, and making consumers aware of what's happened with their data is super important. And this is one of those cases where I think forcing disclosure of breaches makes the world a safer place and makes people respond to them and handle them in a correct way.

We're going to get to how this happens and, obviously, impact, but maybe we could just get a sense for scale. I mean, when I heard this, it felt bigger, but I'm a layman. I hear about breaches all the time. And so how would you actually characterize maybe, like, the magnitude or importance of this particular breach?

In terms of magnitude? So it's two hundred and seventy-seven gigabytes of data, uncompressed, and that's across two different files, which total two point seven billion rows. Now, some of the reporting I've seen in the media is like, oh, it's, you know, three billion people who had their identities stolen, which is fortunately not the case. As I mentioned, a lot of duplicates are there. But there are two point seven billion records. It's really a CSV file.

And so each row is some different piece of information about an individual. Now, we haven't gone through the full file, but based on sampling, we think approximately a third of the records are unique. And so if you run the math on that, it's high hundreds of millions of people. But again, we're not completely sure, since we haven't looked through the whole thing. So I'd say hundreds of millions of individuals, confidently, and two point seven billion records.
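The back-of-the-envelope math here is easy to reproduce. The Python sketch below is a hedged illustration: `estimate_unique_records` is a hypothetical helper, the sample keys are placeholders, and only the row count and the "about a third" figure come from the conversation. It assumes the sample is a contiguous chunk in which duplicate rows for the same person sit next to each other, as in the Alaska example earlier; with a small random sample, duplicates would rarely land together and this naive scaling would overstate the number of unique people.

```python
def estimate_unique_records(total_rows: int, sample_keys) -> int:
    """Scale the unique share observed in a sample up to the full file."""
    keys = list(sample_keys)
    if not keys:
        return 0
    # Integer arithmetic avoids floating-point rounding surprises.
    return total_rows * len(set(keys)) // len(keys)


TOTAL_ROWS = 2_700_000_000  # row count quoted in the conversation

# Placeholder identity keys: three people, each appearing three times,
# i.e. one third of the sampled rows are unique.
sample_keys = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]

estimate = estimate_unique_records(TOTAL_ROWS, sample_keys)
```

With a one-third unique share, the estimate comes out to 900 million, which matches the "high hundreds of millions" characterization.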

You've been working in security for so long. How would you characterize maybe not only the sheer number of records, but maybe the quality of the information, the particular kind of information?

Unfortunately, I'm probably a little desensitized. I'm only partially being sarcastic. Like, I do think a lot of this information has already leaked out there. Like, we've had multiple breaches of credit reporting agencies.

And you have to remember that Social Security numbers are the kind of things that don't change, right? You get one when you're born. You're stuck with it for a while.

And so not through any central repository, but just through the breaches over the last twenty years, a lot of this information has already leaked. And so I don't know how unique it is. What might be interesting is that it gives you sort of maybe a central repository where you can QA the information you already have, or maybe there's some information in there that hasn't already leaked. And so that's probably going to make a little bit of a difference for folks.

I agree, the pieces have all leaked at different points. I think the Equifax breach from, what, five or seven years ago had something like eighty percent of Americans in it or something like this.

But one of the things that I'm sort of thinking about here, and actually you can see the fraudsters talking about this on the forums, is that they're sort of using this as a backbone to other breaches. I think the truth is, frankly, that fraudsters today, folks who commit identity theft, are not limited by PII. Like, PII is out there. It's relatively easy to get identities that you can use as a base to steal. But the place where breaches really get bad is when you connect the sort of core PII information, so name, date of birth, SSN, address, when you connect that to other things.

So if you connect that to a driver's license or a bank account or a VIN or email addresses, like, that's when you can actually start to do something interesting from a fraudster's perspective with the information. And this data that has gotten breached here, we think it'll be used as sort of a backbone to connect to all other sorts of breached information. As I mentioned, like, on BreachForums, the fraudsters are talking about this.

Again, it's funny when you actually read some of this chatter among the attackers, right, because they have a lot of the same problems that, like, legitimate businesses have, specifically, like, marketing companies, right, which is like, how do I make sure that we have the right job?

And how do we know that we've got the right car, and do we have the right identification? Because a lot of times these guys are trying to defeat things that are using personal information about you for authenticating, right? They ask you what school you went to when you were five and stuff like that. And so the more of this demographic information these folks can build up and the more accurate they can make it, the easier it is to subvert a lot of the security controls in place and commit fraud.

Right. And as more of these breaches happen and more data is released, I mean, how much risk is there for me? I would just say, as the average American, should I be really concerned with this new breach? Like, how would you measure that?

I mean, I think the risk is always there. It's ever-present. I think that you should probably have a lock or freeze on your credit, right? That's sort of step one. I think if you do that, you mitigate some of the problems from these sorts of things.

I think the bigger issue is going to be, at least as you look forward and you think about how thieves and scammers are going to use this stuff, you know, you can start to use this demographic information pretty convincingly if you can clone someone's voice using gen AI, or you take this in a new direction in which you get a lot more attributes about a person that let you build a much more believable profile, one that lets you replicate their presence, their, like, kind of identity, in a world where that's a lot more difficult to verify. And what we've heard from folks is that this kind of fraud, this sort of next-level social engineering, is the thing that's been happening more and more.

I can give the advice that I give to my family at Thanksgiving, when they ask me the same question, which is: look, at the end of the day, there is not too much that people can do to prevent fraudsters from stealing their identities.

If you are in this breach, you're in the breach, and probably all three of us are, frankly. But the things that you can do are pretty basic and strong personal security things. Like, for instance, turn on two-factor authentication for all the important services that you have, and probably the ones that are not important as well. Use a password manager.

So don't have a bunch of repeated passwords everywhere, and maybe use your best judgment: if something seems like it's too good to be true, it probably actually is. There's a good point about freezing your credit. That's a great idea. It's also a good idea to just check your accounts on a regular basis to see if there's anything that you don't expect.

And we actually have a helpful blog post that we wrote years ago called 16 Things to Protect Yourself Online that's still as applicable today, even after this data breach.

Yeah, Joel, I think you've probably gotten way more use out of that than anyone would hope.

I wish I could say that things had changed radically, but it's still the same problems.

How does something like this actually happen, right? We know all these companies have various versions of our data, some more than others, some more important than others. Is it a lack of good infrastructure? Or is this just the kind of thing that's bound to happen when you put data all in one central place?

If I was a gambling man, I'd bet that they had some kind of configuration issue on a data store, that they had a cloud database that probably had a guessable password, wasn't using two-factor authentication, and someone stole the credentials, right?

If you look at the Snowflake breach, which impacted, I think, one hundred and thirty-seven different companies, that was all because there wasn't two-factor authentication enabled, and people were able to guess or steal those passwords and usernames. And so, to be quite honest, these breaches are usually lowest common denominator, right? They don't have to pick the lock if you leave the window open, and you'd be surprised how often people leave windows open. And that tends to be how these things happen.

I mean, on that note, I'm a little bit surprised by maybe, like, how unsurprised you are by this breach.

And so where are we in that arc? Is this just really something that we expect to just continue to happen? And if you frame things the way you have, as, like, the hackers basically become more effective as more of this happens and they can piece together different blocks, where does that put us? How does the industry need to shift, if at all? Or should we just expect, like, a rolling cadence of this?

If you go back decades, people could be secured by this data actually not being out there. SSNs were secret, and your possession of one meant that it was probably you. I like to joke that Social Security numbers are both your username and your password. And at this point, they're also public.

So it's kind of the worst possible thing you could have. But so many different data breaches have completely broken that. You know, as I mentioned, frankly, there is so much data out there that PII being secret is no longer a control at all, frankly, to prevent identity theft or other kinds of fraud. No, frankly, the reason why there's not more identity theft or other fraud out there is because institutions that guard against identity theft, so banks or governments or anyone that needs to verify the identities of consumers, like, those institutions have controls for that, and, you know, something like SentiLink is one of those controls. And so actually the reason there's not more fraud out there is because of the controls that institutions put in place, not because there aren't data breaches.

Yeah, I think, like I said, not to be overly cynical, but we've had data in databases for a really long time, and it's relatively recently that there's been a requirement to disclose data breaches, right? California passed the CCPA. Actually, the breach disclosure law in California passed, I think, in 2005, but it wasn't really implemented for quite some time.

And even then, there's still a patchwork of regulations. And with the SEC, that's actually driving a lot of the breach disclosure requirements currently. They require you, I believe, to disclose within forty-eight hours after a material security breach, which is only a year old, right? So these breaches have been happening for years and years, and people just never talked about them.

And so when you work in the security industry, especially if you work on the cyber intelligence or the financial fraud side, and you watch these markets, like, this has been going on forever. And it's only now that companies are being forced to disclose and that consumers are becoming aware. And so I think that's really the thing that changed.

And like all of these different kinds of situations, this is very much a cat-and-mouse game, right, between the attackers and the defenders. And you go back and forth. And to be quite honest, the defenders have gotten really good. We have really excellent technology out there. SentiLink is a great example of that, where a lot of this stuff can be nipped in the bud. Even if the information is out there, you can limit the harm that it causes.

You know, we verify over a million people every day, a million people a day that we help to prove they are who they say they are.

That's amazing.

I think really the bottom line of a lot of this stuff is, like I said, it's easy to be cynical. It's easy to get worked up about this stuff, or whatever the case may be. But in reality, things have actually gotten a lot better.

And if you freeze your credit, if you follow the security best practices, if you use things like a YubiKey, you know, a hardware security key, you can exist online relatively safely, right? Probably more safely than you are walking through a city street at risk of being robbed, right? We've come a long way. You get these headlines in the media that hype this stuff up, and people think it's the end of the world. But in reality, like, things are a lot better, a lot better than people would report them to be.

The other really cool thing about the way the world has evolved is that with the startup ecosystem and the ability for, you know, expert founders to build technology to address these things, like, we've actually shifted a lot of the economics on some of these things, where you can build a successful company fighting this stuff, and in a financially better way than if you were doing this stuff, right? And I think if you look at all these different kinds of situations and you look at any kind of crime, to be quite honest, it's just about where the incentives lie. And if you shift the incentives in a meaningful way, you can actually really start to crack down on a lot of this stuff.

That's a great point, and Naftali, that's what your company does, right? How many cases of identity fraud are you blocking per day?

We stop over twenty thousand a day.

And who is paying for that? Is it the end customer who's paying you to monitor, or how does that work?

No, it's the institution. So we serve over three hundred banks, lenders, financial institutions, telcos, and governments throughout the United States to help them figure out if their customers or users are who they say they are. For example, before someone opens a credit card, that financial institution will ask us, is this a real person or a stolen identity? And we'll give them an answer to that in real time.

On the note of some of the new technologies coming online, they do open up a new vector, both, to your point, Joel, for attacking and defending. Curious if you see any gaps in terms of places that builders should be addressing on this new frontier, as, again, like, the attack vector has also opened up.

Everyone's talking about generative AI and sort of the ability to do deepfakes and that sort of thing. And there's a lot of activity there. We actually have an investment in a company called Pindrop, which is really good at spotting audio deepfakes, and they sell a lot of products to, you can imagine, the financial services companies, because that's typically where you see the threat, but it all rolls downstream.

And so it's not just JPMorgan Chase and Citibank that are getting hit by these generative AI fakes. It's actually becoming grandmas and grandpas and parents, right? They're getting the fake phone calls from grandchildren and children saying that they're being held, that you need to wire the money, and stuff to that effect, the virtual kidnappings, right? Like, these are things that trickle down. And so enterprises are doing a good job of protecting themselves from some of this. And what we need is for some of that technology to start to filter down into protecting consumers at large.

Obviously, we've been using the same PII for ages, right? Like you mentioned, Social Security. I mean, it's also crazy to me that they send you that on a piece of paper.

In any case, is there some world where we have something similar to password managers, like forcing you to update your password every so often, or other forms of, like, biological identification? Should we be rethinking the idea that we use name, email address, phone, et cetera? Or am I thinking about this correctly? And even those have just, like, the same kind of attack vectors?

I would say, like, yes, for sure, we should be thinking about this differently. Is that ever going to happen? Unfortunately, no. But, you know, frankly, public-key cryptography solves, like, quite a bit of this. And I'm not talking about, like, crypto, blockchains, anything like that.

I mean, simply every citizen having a public/private key pair and having the government or some trusted entity go and cryptographically sign it would solve a bunch of identity verification issues. Is that going to happen in the United States? Absolutely not. But, you know, would that be an elegant solution that solves a lot of problems?

It would. That has been a dream for a really long time among a number of die-hard old cryptography people, that one day the U.S. government would get into proving identity. And there has been a NIST working group. The National Institute of Standards and Technology has been trying to set standards for proofing for decades.
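To make the signed-identity idea from this exchange concrete, here is a deliberately toy Python sketch of a trusted issuer signing an identity attestation and anyone checking it with the issuer's public key. Everything in it is an assumption for illustration: the attestation string, the function names, and the use of unpadded textbook RSA built from two small known primes. It is not secure and not how any government or NIST scheme works; a real system would rely on a vetted cryptography library and standardized signature formats.

```python
import hashlib

# Toy issuer key built from two known Mersenne primes.
# Far too small and unpadded for real use; illustration only.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Issuer side: sign the message digest with the private exponent."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Verifier side: check the signature using only public values."""
    return pow(signature, E, N) == digest(message)


# Hypothetical attestation binding a name to a citizen's key identifier.
attestation = b"name=Jane Doe;public_key_id=example-123"
signature = sign(attestation)
is_valid = verify(attestation, signature)
is_tampered_valid = verify(b"name=John Roe;public_key_id=example-123", signature)
```

The point of the design is that a bank would only need the issuer's public values (`N` and `E`) to check an attestation, so knowing someone's name and number would no longer be enough to impersonate them.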

There was a hope that maybe one day the post office would become the place where you go prove your digital identity and get a token or some kind of key. I think we're still as far away from it today as we were ten years ago. But I hold out hope for one day.

One day. I mean, California is rolling out digital driver's licenses, right? I've got a digital license plate for my car. Like, we might get there. It might happen in my lifetime.

I'm hoping. I think, to Naftali's point, like, the technology exists. We know how to stop this.

We just need someone with the political will and desire to make this a thing and maybe go after the real problems that everyday American consumers face. So one day we'll get there. I'm hopeful. Yes.

Hopefully only a few more breaches along the way. All right.

If you made it this far, thank you so much for listening. And if you like us covering these timely topics, you should let us know and rate this podcast at ratethispodcast.com/a16z, or you can email us at podpitches@a16z.com. We'll see you on the flip side.