Support for this episode comes from HookDeck. Level up your event-driven architecture with this fully serverless event gateway. To learn more, go to hookdeck.com slash theburningmonk.
Hi, welcome back to another episode of Real World Serverless. And today we are joined by James Eastham, who has just recently left AWS to join Datadog as a developer advocate. Hey, James, good to have you on the show. Hey, Jan. Thanks for having me on. It's an honor, actually. Long-time listener, first-time caller. I feel like I've said that a lot recently, actually. So yeah, I'm excited to be here.
Yeah, I've seen quite a lot of your posts on social media. You've been writing a lot about writing Lambda functions in Rust, and recently I spoke with your co-author Luciano on the show as well. We talked mostly about Middy, but also touched on Rust. And you've been sharing a lot of content on writing Lambda functions with Rust.
So before we get into it, do you want to just spend a moment and catch us up on your journey to AWS and serverless, and what you've been working on? Yeah, absolutely. So I actually took the self-taught route into tech. I finished what is college here in the UK, which is up to 18. I don't know what that translates to in the US.
And I had no idea what I wanted to do with myself. So I ended up going to work for a software company in their frontline support. They built e-commerce backends that were all .NET and SQL Server. And I thought, I can support people better if I teach myself SQL Server. So I taught myself SQL Server, and then I taught myself .NET, and then I've slowly worked my way up the stack from there.
So I'm from a .NET background, so Rust is kind of an interesting one, and we can maybe get into that, some of the differences, because it's both similar and very different. And then I started working with Lambda, I think almost as soon as .NET on Lambda was available.
It must have been 2015, 2016. It's kind of funny, actually: I went from deploying .NET applications to Windows servers straight to Lambda. I missed all of that container middle bit. And immediately it was like, oh, these Windows servers are horrible, let's just forget about servers completely. I've never liked infrastructure management. I was a DBA for a little while, and I've never liked that kind of stuff. So being able to jump straight to just deploying things, and they just run... I remember the first time I used Lambda, like
I'm sure most people do, is that you just deploy some code and it just works and you've got an API and it's just fantastic. So yeah, and then from there, I've mostly worked in consulting. I've worked with quite a few startups, all like big scale-ups, all on serverless, all .NET actually, and then I joined AWS.
where I was working with, I was actually in the professional services part of AWS, which surprises people sometimes with how much content production I do. I think people assumed I was a DA, but I wasn't. I was a consultant at AWS. And then, yeah, about three weeks ago, I left AWS to join Datadog. And now I get to do DevRel and serverless all the time, all the time, all the time, which is very exciting. And you're working with AJ as well? Yeah, I get to work with AJ as well, which is...
Yeah, AJ is an absolute legend. So it's pretty cool to be able to work with him as a colleague now. So, yeah. Yeah. So I guess in that case, what drew you to Rust coming from, say, more of a .NET background?
Cold starts? I mean, I think historically, maybe not so much anymore, but I think historically, like .NET and Java have suffered more than the other runtimes with cold starts in Lambda. And they've both got better over the years. Obviously, Java's got SnapStart now.
.NET was getting more performant anyway; sub-second cold starts with .NET were pretty easy. And then Microsoft announced Native AOT, which allows you to natively compile a .NET binary, much like you would a Rust application or a Go application. But it still wasn't super fast. So that was what initially got me there: I just wanted blazing fast performance. Actually, as I've started using Rust more and more, the performance has become kind of
a side note now, if you will. It's like a benefit. There's so many other parts of the language that are just really enjoyable to work with, really fun to work with. But initially it was, I just want something to be fast. I'd read a lot, as everybody has, I've read a lot about how fast Rust is. So yeah, I started learning Rust.
And I guess when you say .NET in that case, you mean C#, not F#? Yeah. Because I did most of my career with .NET as well, but it was really quite a mix of C# and F#. Maybe towards the end, it was even more F# than C#. And when I looked at Rust, I think it was back in...
I think it was back before 1.0, when it was still 0.3 or 0.4 or something like that, quite early days for Rust. I saw a lot of functional influences in the way the language works, a lot of the syntax and some of the async stuff. I thought, oh, this looks really cool, because it's very familiar from what I'd
been doing with F# and other functional languages. The one thing that was really unique about Rust was the whole ownership system for managing pointers and references, and making them concurrency-safe.
But then with Lambda, you have one request at a time. So initially, when people were talking about Rust and Lambda, I was thinking, well, the thing about Rust, the whole pointer system, the safety, the borrowing stuff, you're not really going to get much use out of that, because Lambda is running one request at a time. But I guess on the performance side of things, the fact that you're compiled to native has a big impact in terms of cold starts.
I actually think to your point about the pointers and the concurrency, I completely agree. But I do think the how, what's the right word for this? How picky the compiler is. It actually forces you to think about writing good code more so than other languages. So although, yeah, the concurrency stuff isn't necessarily a thing in Lambda.
I find myself writing better quality code straight out of the box, writing fewer bugs straight out of the box, because the compiler just goes: no, James, you can't do that, you're an idiot, you need to try something else. The compiler stops you doing so many things. But yeah, the concurrency thing is obviously not that relevant in Lambda.
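To make that concrete, here's a small illustrative sketch (not from the conversation) of the kind of thing the borrow checker catches: you can't mutate a Vec while an immutable borrow of it is still live, so the bug never makes it past compilation.

```rust
fn main() {
    let mut names = vec![String::from("Jan"), String::from("James")];

    // Borrow the first element immutably...
    let first = &names[0];

    // ...then try to mutate the vec while that borrow is still alive.
    // names.push(String::from("AJ")); // ERROR: cannot borrow `names` as
    //                                 // mutable because it is also borrowed
    //                                 // as immutable

    println!("{first}");

    // Once the immutable borrow is no longer used, mutation is fine.
    names.push(String::from("AJ"));
    println!("{:?}", names);
}
```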
Yeah, I do agree with that. You know, I'm a big fan of Erlang. It's a language that teaches you to think asynchronously, because everything is asynchronous, everything is message passing. So you can't really get away from that paradigm. But it does kind of force you to
think about system design from the point of view of message passing. And that actually translates really well to what we're talking about today. Like event-driven architectures: it's all about different architectural components exchanging messages. But if you've been doing Erlang, you've got that mindset where everything is about messages and the ordering, and I guess the consequences of that. It kind of forces some better habits when you're trying to translate your architecture from code
to Lambda functions, event buses, and other components that have to talk to each other. So yeah, I can definitely see why having the compiler force you to do things in a certain way that's better designed from the get-go is going to encourage better habits as you're building your system to be more and more complicated, even if you're not fully taking advantage of some of the concurrency aspects of the language when you're in Lambda.
Yeah, yeah, absolutely. I never knew that with Erlang, actually. I've never really looked at Erlang, but that's quite interesting that it's like all messages and all asynchronous and all message passing. That sounds quite interesting, especially when it comes to building event-driven systems. Like you say, it sounds like it's just like a seamless translation. Why does Lambda not support Erlang? Yeah.
I guess it's not as popular a language, and Erlang's quite an old language. Syntactically, it's not as polished compared to the languages that we use today. It's not as developer-friendly.
But it has got some really nice conceptual things, like processes. You can almost compare Lambda's execution environment to a process in Erlang: it's got a mailbox, it receives one message at a time, it processes that message, and there's no concurrency within one process. It processes one message, then gets the next one. So it's almost exactly the same pattern as Lambda's execution environments. - Interesting.
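For illustration, here's a rough sketch of that mailbox model in Rust (the episode's language of choice), using a standard channel; the names and events are made up for the example. A single loop drains one message at a time with no concurrency inside it, much like one Lambda execution environment handling one invocation at a time.

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel::<String>();

    // The "process": one loop draining its mailbox one message at a time,
    // with no concurrency inside the loop.
    let worker = thread::spawn(move || {
        for msg in rx {
            println!("processing: {msg}");
        }
    });

    for event in ["invocation 1", "invocation 2", "invocation 3"] {
        tx.send(event.to_string()).unwrap();
    }
    drop(tx); // closing the channel ends the worker's loop
    worker.join().unwrap();
}
```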
But yeah, it has got some really nice conceptual things, but it's quite an old language, from the 1980s. So syntax-wise, and in all of the developer tooling around it, it's not as nice as other things you'll find. That's why most people, when they look at Erlang, are really using
Elixir instead, which runs on the Erlang VM, but with more Ruby-like syntax, and also a lot of the developer tooling conventions that come from the Ruby world, where there's much more focus on developer experience compared to, say, the Erlang world. Yeah.
Is it RabbitMQ that's written in Erlang? That's right, yeah. A lot of the systems that we use, like databases and messaging systems like RabbitMQ, a lot of them are written in Erlang. It was a really popular systems programming language before Go and Rust, because of how efficient it is at dealing with a large number of concurrent requests.
There's also a lot of other things, like game engines and so on, that were also written in Erlang as well. So for building these high-throughput systems, there are some qualities about Erlang that are very attractive.
But it's difficult. A lot of telecoms stuff is written in Erlang, but it's really difficult to hire people, because it's a bit of an obscure language. That's one of the things that's nice about Rust and Go: they've been mainstream enough that you can actually hire people for a position without too much hassle nowadays.
But, you know, compared to your JavaScript and Python, it's still way down the list in terms of its penetration into the mainstream. So is that why you're writing a book about Rust? Yeah, I mean, when you start reading about Rust, I think Rust came out of
systems programming, really. So C, C++, Rust is kind of a natural successor. And obviously languages like that, one of the use cases is embedded systems. And you've got these systems where you've got really resource constrained things, whether that be like a Raspberry Pi or some little piece of compute running somewhere.
And actually, I think when you then think about Lambda, where you have technically got a resource-constrained environment, you've got memory allocation, you've got a small portion of CPU that scales with that, and then your cost is going to scale with that. The cost per millisecond is going to scale with that increased memory allocation, right? So actually, I think as well as all the other things we've talked about, the performance, the niceties of the language that force you to do good things...
Actually, the Lambda execution model, where you've got these resource-constrained environments, I think fit really nicely with these languages that are built to run in resource-constrained environments. So when we started writing the book, we were thinking about all these different things, all these different benefits that Rust can bring to serverless developers.
around speed, around performance, around resource consumption, around sustainability. There's all these different things. I mean, Werner talked about it at reInvent last year. Was it the year before? I can't remember. But Werner talked about it at reInvent, about Rust as a language. So there's this whole bunch of different benefits that Rust can give you, which is why we're writing the book Rust on Lambda. The other side of that, Jan, though, is...
A lot of the Rust content, at least I've found, and there's a few exceptions to this, focuses on the deep internals, the actual language itself. And there's not a lot of content out there about building real things. So I'm also coming at it from the angle of a Rust developer who's like, okay, I know about embedded systems. I know about systems programming. Now I want to build a web app. Now I want to build something like a hobby project that I'm going to deploy. I'm going to run a startup, whatever it is.
There's the Rust developers out there who might not know about serverless. They might be used to embedded systems. So there's also that angle to it as well. And we're trying to meet somewhere in the middle where if you're a Rust developer already, okay, you can learn about Lambda, you can learn about AWS, you can learn about Dynamo, you can learn about API Gateway, you can learn about all these different things. And if you're not a Rust developer,
we're by no stretch going to be teaching people Rust, but there's enough in there. And we scale the examples up slowly enough that you could actually pick things up as you go along. But it's absolutely not going to be an entry level to Rust. So yeah, there's a whole range of reasons why we started writing the book.
Yeah. Writing a book's fun. It's a lot of work. It is a lot of work. Yeah. And the last time I spoke with Luciano, we briefly touched on the book as well. And he talked about the concept of easy-mode Rust. So basically writing Rust for web applications running on Lambda, minus all of the more complicated ideas, things like the borrow checker and the pointers, that are,
as we talked about earlier, probably not as relevant in the single-request execution environment that you get with Lambda. But we're seeing things like LLRT being written in Rust, and that's why it's able to perform so well in terms of cold starts, and
Recently, we were just talking off air about the Datadog Lambda layer for tracing, which has also been rewritten by AJ to be in Rust as well. And that is a Lambda layer that I have used for client projects. And I was seeing...
somewhere around one point something seconds of impact on cold starts. And based on what Benjamen Pyle has recently published, it looks like the new one is going to be a lot faster than that. So now that you've been working with AJ, any insights on why it's so fast? Is it just because it's written in Rust?
Yeah, I mean, the other thing I will say before we get into that is also Maxime David, who has just left Datadog to join AWS, actually. He also did a lot of work on that project, and there's other people internally on the Datadog team who are also working on this. So AJ has done a lot of work on this, but there are others who are working on this as well, I will just say that. But yeah, so it has been... So the way the Lambda layer worked previously, it was kind of just packaging the entire...
parts of the full Datadog agent into the layer, and that's all written in Go. So this new extension, the new layer, sorry, has been completely rewritten from the ground up. It's going to be functionally compatible, but it's been completely rewritten from the ground up to be Rust-based, which
gives you both the faster performance, just out of the box, but also a much smaller binary. I can't remember the exact numbers that I've seen, but the actual size of the layer is smaller than
the one that was written in Go. So that obviously is maybe not going to have quite as big an impact as the runtime itself, but it's going to affect the startup time a little bit. So yeah, it's been a complete rewrite. I'm still yet to really go and properly dive into the code. I need to go and have a look at that. But from the numbers I've seen (like I said, Ben made a post the other day) and the numbers from the tests I've run myself,
The cold starts look like they're getting down in the low hundreds of milliseconds from, like I said, 1.2, 1.3 seconds. So it's a pretty, pretty fantastic improvement in performance. And all it will be, once it's fully available in GA, all it will be is just bumping the version of your extension, your Lambda extension. So there's going to be no...
changing of things, switching of environment variables, none of that. It's going to be the same layer: bump the version, and you'll get the performance out of the box. I'm trying to think of it like, you know, when EventBridge do something magic behind the scenes and suddenly all the latencies just drop off a cliff. I'm kind of thinking of it in the same way: there'll be an upgrade, and suddenly things will get faster, which is a win for everybody.
Right. So I guess for folks who are not familiar with Lambda extensions and what these kind of layers do, can you just spend a moment or two just to talk about what is the whole, what is this Lambda extension, how does it work, and how does this Datadog layer actually work to provide additional observability for your applications?
Yeah, so Lambda layers are a way of packaging other things that aren't your main application code, and kind of setting them up next to your actual Lambda function code itself. I've seen people in the past package up their shared Node dependencies into a layer.
That's not normally the best thing to do. I don't know if you think differently, but that's not normally what I've seen work. When I've seen people using layers as a way of packaging up shared dependencies, that's not going to really benefit you too much, at least from my experience. No, me and AJ have both written blog posts about why you shouldn't publish shared code as Lambda layers. I'm not saying anything too controversial then, that's good. But what they are really useful for is...
These almost like sidecars in containers, these things that you want to run next to your application code. So, for example, I worked on a customer project when I was at AWS and we needed to, we were managing Kubernetes clusters from within Step Functions and Lambda.
So we packaged the kubectl executable as a Lambda layer. Then when we deploy our Lambda function, we've just got this shared kubectl that gets attached to the local file system, and you can just run kubectl commands from within Lambda. But you're not packaging up kubectl in every single Lambda function that you're deploying; you're packaging it up once, in the layer. So that's the kind of thing where I do see some benefit: when you've got these genuinely shared things that aren't necessarily application code.
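As a hedged sketch of what that looks like from inside the function: layer contents get extracted under /opt, so if the layer packages the kubectl binary at the root of its zip, the code can shell out to it at a well-known path (the exact path depends on how the layer is laid out; this is an illustration, not the project's actual code).

```rust
use std::process::Command;

fn main() -> std::io::Result<()> {
    // Lambda layers are extracted under /opt, so a layer that packages the
    // kubectl binary at its root makes it available at /opt/kubectl.
    let output = Command::new("/opt/kubectl")
        .args(["get", "pods", "--namespace", "default"])
        .output()?;

    println!("{}", String::from_utf8_lossy(&output.stdout));
    Ok(())
}
```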
So the Datadog extension does a whole bunch of really cool things. It will start to automatically pull out
different bits of information from your Lambda function. It will trace cold starts, as an example. It will monitor cold starts, it will monitor end-to-end latencies. And if you couple the actual Datadog extension with the language-specific layers (there's a .NET layer, a Node layer, a Java layer), you then start to get an awful lot of auto-instrumentation as well. So I was working on an example earlier today, this was in .NET, mind, but
You add the layer, you write a Lambda function that receives a request from API Gateway, publishes a message to an SNS topic. That then goes to an SQS queue, which another Lambda function then picks up and processes. And you just get this entire end-to-end trace out of the box with zero changes to the actual application code.
So it's really, really useful. But like you said, the trade-off of that, prior to this new version of the extension, is that you're going to get an additional second or so, which is typically what I've seen when I've done this as well. For the Lambda functions that are working from SQS queues, that might not necessarily be as much of a problem. But for these latency-sensitive, web-facing API workloads on Lambda, that can become
more of an issue. So yeah, the way to think of layers is as that sidecar next to your Lambda function. Again, don't package shared dependencies into your layers, please, people. And it allows you to be really flexible. I'm sure I've seen somebody, maybe on Discord, doing PDF generation inside Lambda, and they packaged up an executable to do it. Maybe it wasn't PDF generation. Anyway, yeah.
That's the gist of layers. I guess in this case, with the Datadog layer, what you're shipping is actually running as a Lambda extension. And the Lambda extension, in this case, is able to subscribe to a lot of the telemetry information and intercept the request, so that you're able to,
from there, send information to the mothership, to report the fact that we just saw this invocation for this Lambda function, and based on the invocation event, we know it came from some message in an SQS queue.
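For readers who want a feel for what an extension actually does, here is a simplified sketch of the Lambda Extensions API handshake in Rust: register for events, then long-poll for the next one. This is an illustration, not the Datadog extension's actual code, and the crate choices (reqwest, serde_json) are assumptions made for the example.

```rust
// Cargo.toml (assumed): reqwest = { version = "0.11", features = ["blocking", "json"] },
// serde_json = "1"
use std::env;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api = env::var("AWS_LAMBDA_RUNTIME_API")?;
    let client = reqwest::blocking::Client::new();

    // 1. Register the extension for INVOKE and SHUTDOWN events.
    let resp = client
        .post(format!("http://{api}/2020-01-01/extension/register"))
        .header("Lambda-Extension-Name", "my-extension")
        .json(&serde_json::json!({ "events": ["INVOKE", "SHUTDOWN"] }))
        .send()?;
    let id = resp
        .headers()
        .get("Lambda-Extension-Identifier")
        .ok_or("missing extension identifier")?
        .to_str()?
        .to_owned();

    // 2. Long-poll for the next event. Lambda freezes the environment
    //    between invocations, so this loop only runs while work arrives.
    loop {
        let event: serde_json::Value = client
            .get(format!("http://{api}/2020-01-01/extension/event/next"))
            .header("Lambda-Extension-Identifier", &id)
            .send()?
            .json()?;

        match event["eventType"].as_str() {
            Some("SHUTDOWN") => break,
            _ => println!("invoke event: {event}"),
        }
    }
    Ok(())
}
```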
So I guess the next question would be: okay, if the previous layer was basically just bundling this Datadog agent that was written in Go, which is still compiled to native, so I imagine the performance of the language itself is not the issue. It's more the fact that it's not written specifically for Lambda, so it's not really thinking too much about, okay, how many dependencies you
bring in. It's designed to run in a containerized environment where you don't really have to think about cold starts as much. So it's more of that mindset of designing something, writing something specifically for Lambda, where you're really conscious about how many dependencies you're packing in. That's probably where the big difference is, right? Yeah, yeah, absolutely. I need to go and check out the code, but I think it was AJ I was chatting to, where he said
instead of using the Rust SDK (and there is a fully featured, version one Rust SDK that's got all the features that you have in any other language's SDK), I seem to recall that they've actually written a
custom wrapper. So typically you need an API key to do anything with Datadog, and the extension supports either passing the API key in as an environment variable (don't do that, people, don't store your API key as an environment variable) or passing in a Secrets Manager ARN. So you can pass in an ARN, and at startup the extension is going to reach out and speak to Secrets Manager and get the latest version of your API key. And instead of adding the whole
SDK for Secrets Manager. There's just a custom wrapper to make that API call almost manually, I suppose is probably the best way of putting it, to, like you said, really minimize the dependencies as much as possible. Let's make this thing as fast as it physically can be, minimize the dependencies as much as possible, and make it purely built just for the Lambda environment. Because it is, like you say, it's very different to running things on servers or containers where you can just throw more memory at it.
I was deploying a Java application to Fargate the other day, and I had to throw so much memory at it. I was like, just give it more memory, it'll go faster. So yeah, Lambda is very different in that respect. Event-driven architectures are a powerful paradigm for building large-scale systems, but they're also notoriously difficult to test, observe, and monitor in production.
With so many independent event publishers and subscribers, there are a lot of additional complexities around error handling, alerting and recovery, and making sure that you have full visibility into the entire lifecycle of an event, so that you're able to troubleshoot any problems that you encounter.
I have built many event-driven architectures on AWS, and these are just some of the recurring challenges that I face. And that's where HookDeck comes in. It's a fully serverless event gateway. There is no infrastructure for you to manage, and you can get started in just a few minutes with their command line interface.
Compared to Amazon EventBridge, it does everything EventBridge does. It can ingest events from multiple sources, filter them, route them to different targets, and transform the event along the way. But it also offers a better developer experience, including a local development experience, and having more detailed metrics and logs to help you debug issues with delivering events, and just being able to query what events you have easily, which makes testing much simpler.
You can start a free trial of HookDeck today and support this podcast at the same time by going to hookdeck.com slash theburningmonk. Okay, so you said don't put your Datadog API keys in environment variables, and that's something that I've been shouting about for a long time. So why do you think that's a bad idea? Because there are better places to put it. It's secret information. I mean, I don't know. I think if you were to work in an environment where you
locked down user permissions (I can't remember what the exact AWS API call is, but there's a way you can lock things down so that users can't retrieve the environment variables for a Lambda function), so if you were going to go to the extent of locking all that down, then fine. But that feels to me like it's going to be a bit of a painful thing to go off and debug and work out, not being able to access the console and not being able to work with Lambda.
Whereas, at least when I've run tests on this, the overhead of using SSM or Secrets Manager (and personally, for a lot of cases, I prefer SSM and SecureStrings as opposed to Secrets Manager, mostly because the API calls are free, so unless you need the auto-rotation of secrets, then SSM), I've found the overhead of doing that typically isn't
crazy, really. So I'd rather just have it stored somewhere that I know is built for exactly what I'm using it for, as opposed to storing something that could be secret information in an environment variable. Right, right. The thing that I'm normally most concerned about with environment variables is that we know, from lots of previous incidents, that they're something a lot of attackers target, because it's low-hanging fruit.
it's really easy to do a supply chain attack and just put out an NPM package that's malicious and the first thing it does is scan your environment variables and just report whatever to their own backend so they can harvest whatever's in people's environment variables.
And people could have loads of connection strings, database IP address, username, password, everything you need to access the database in the environment variables, as well as API keys for various SaaS services you may use.
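To underline how low the bar is: any code running inside your process, including a transitive dependency, can enumerate the environment in a couple of lines. A trivial Rust illustration (not from the episode):

```rust
use std::env;

fn main() {
    // Any code that runs inside your process, including a compromised
    // dependency, can do this.
    for (key, value) in env::vars() {
        if key.contains("KEY") || key.contains("SECRET") || key.contains("TOKEN") {
            println!("harvested: {key}={value}");
        }
    }
}
```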
That's the thing that I normally worry about when it comes to putting sensitive information in environment variables. Even if someone can't penetrate the actual EC2 instances that run your Lambda function, because that's secured by AWS, all it takes is just downloading an
NPM package or something that has been compromised, and we have seen many examples of that. We also saw examples of someone publishing packages with slightly different names, so instead of, say, cross-env with the dash, it would be crossenv without the dash, so that you think you're downloading the official package, but you're downloading someone's fake version, which does the same thing. But
when you install it, when you require it, at runtime it's going to scan your environment variables. So that's the thing that I worry about. Not so much someone accessing the data from the console; more that someone can pull off this kind of attack really easily. And so that's the reason why I lean more towards using, like you said, SSM or Secrets Manager. And the approach I normally take would be to fetch them at cold start and cache them, then occasionally invalidate the cache for things that I want to rotate, or just cache it forever if I know it's something that's never going to change.
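A minimal sketch of that fetch-and-cache pattern in Rust, assuming the AWS SDK for Rust and a hypothetical parameter name (the Datadog extension itself uses a hand-rolled HTTP call instead of the full SDK, as discussed above):

```rust
// Cargo.toml (assumed versions): aws-config = "1", aws-sdk-ssm = "1",
// tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
use tokio::sync::OnceCell;

static API_KEY: OnceCell<String> = OnceCell::const_new();

// Fetch the secure string once per execution environment and cache it for
// the lifetime of the (warm) environment, instead of reading an env var.
async fn api_key() -> &'static str {
    API_KEY
        .get_or_init(|| async {
            let config = aws_config::load_from_env().await;
            let ssm = aws_sdk_ssm::Client::new(&config);
            ssm.get_parameter()
                .name("/my-app/datadog-api-key") // hypothetical parameter name
                .with_decryption(true)
                .send()
                .await
                .expect("failed to fetch parameter")
                .parameter
                .and_then(|p| p.value)
                .expect("parameter has no value")
        })
        .await
}

#[tokio::main]
async fn main() {
    println!("key loaded: {} chars", api_key().await.len());
}
```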
That's also an option as well. I mean, I've seen some SaaS products I've worked with in the past (and I don't actually know this about Datadog, so I don't know if this is the case here) where a lot of the API keys I've generated are write-only, which I do think minimizes the risk somewhat. If it's write-only, I suppose an attacker could still do a denial of wallet and just absolutely ship a load of logs into your system. But...
Yeah, I've never thought of it like that, actually, scanning environment variables. That's a good point. It's interesting. I'm sure when I've worked with Kubernetes in the past, I'm sure all secrets in Kubernetes get mounted up as environment variables. I'm sure that's exactly how they work.
It's part of the 12-factor app thing that started it. That's what I was just Googling; I'm sure storing config in environment variables is one of the 12 factors. Yeah, which makes it easy for your app to access it, but also easy for anyone else who is able to run code in your execution environment. So...
I don't know. It's one of those things that was a really good idea, maybe the best idea people had at the time, when the whole cloud thing was still new. But just because it was once a reasonable idea doesn't mean that we have to
hold onto it as gospel. So I think we do have to evolve over time as new attack vectors come up. But that's my personal take. I have had people who disagree with me quite strongly, because it is really difficult to actually penetrate the execution environment. But
my concern is more that all it takes is one bad dependency, one malicious dependency, and they don't have to actually penetrate your execution environment, just your code. Yeah, yeah. I mean, coming back to Rust, I suppose, there's a paper that I found. I gave a conference talk earlier this year about Rust and Lambda, and there's a paper I found that was released by
a combination of governments. So it was like CISA in the US, GCHQ here in the UK, the Canadian cybersecurity team, Australia, France; there was a whole load of countries. And they released this joint paper about how everybody should adopt memory-safe languages for the purposes of security and safety.
And obviously, the way Rust's ownership model works, and the fact there's no garbage collector (things are only kept in memory for exactly as long as they're needed, and then they're dropped), and the way the compiler checks everything, it does make it an incredibly
memory-safe language, if not the most memory-safe language, I would argue. And obviously that wouldn't solve environment variables; people could still scan environment variables in Rust, of course. But there's a whole range of memory safety issues that simply don't exist in Rust.
And I mean, if the FBI and GCHQ and all these other cybersecurity people are saying that I should do that, then I think I'm probably going to listen at some point. So there is that angle to it as well, the security angle of Rust. Right. Okay. So I remember that paper, and I understand why that's a good idea, because there have been a lot of array overflow, buffer overflow kinds of attacks in the past. But for folks who are not familiar with what memory safety means, can you just define that for us?
So whenever you're doing anything in computer programming, any data you keep as a variable, whether that's anything you're storing, it's going to be stored in a piece of memory. And God, it's been a long time since I've explained this, Jan. It's been a long time since I've explained memory safety.
And if you, let's imagine some JavaScript code, just for argument's sake. I could set a variable in JavaScript to be a string.
And then I could set that value to null. I could set that value to something else. And then I could try and reuse that variable again later in my application code. And then at that point, that variable wouldn't have a value anymore. That variable would be null or undefined or whatever that was. And if I try to access something on that, well, my code's going to explode because that piece of memory isn't actually there anymore.
Now in Rust, that's almost impossible to do because of the way ownership works. So if I declare a variable and then that variable goes out of scope, Rust is immediately going to drop that from memory and the compiler is going to tell me. So if I try and reuse that variable again,
The compiler is not going to let the code compile. It's going to say, you've got a variable. It's been dropped from memory. You're trying to reuse it. It's not going to exist anymore. You can't actually do that. So, I mean, that's a really simple example of memory safety, I suppose. But that's the kind of thing you're getting into where you've got things in memory that may or may not be there anymore. They might be overflowing.
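A tiny illustration of that in practice (not from the episode): once ownership moves and the value is dropped, the compiler refuses to let you touch it again, so a use-after-free can't even make it into the binary.

```rust
fn consume(s: String) {
    println!("{s}");
} // `s` is dropped here, when it goes out of scope

fn main() {
    let greeting = String::from("hello");

    // Ownership moves into `consume`; the string is freed when it returns.
    consume(greeting);

    // println!("{greeting}"); // ERROR: borrow of moved value `greeting`.
    // This program only compiles because the line above stays commented out.
}
```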
Right, right. I don't know if you've got a better definition, actually, Jan, because it's been a long time since I've done this. It's kind of one of the interesting things, actually: one of the things I've struggled with sometimes in programming, because I'm not classically computer-science trained, is some of these concepts. Some of the stuff in Rust, I'm like, what does that even mean? It takes me a long time to understand these things because I've not had the classical training.
Yeah, I think if you're working with things like .NET, or even JavaScript, where you never have to manage memory, it's very hard to conceptualize what that means. I've done a bit of C++ back in the day, where you have to allocate a block of memory and you have to maintain pointers. That's where things like buffer overflows can really easily happen, where you allocate a bunch of memory,
but then you give someone an array, and somehow they get into a position where they try to access a position beyond where the array boundary is in memory. If there's no protection in the programming language runtime, then, okay, someone tries to access index 10 when there are only nine items.
That means they're able to read whatever's in memory at that 10th position, so they're able to read things that are not part of the array. And like I said, when things go out of scope but are not cleaned up, they're still in memory. Which means someone who's able to trick the program into overflowing the pointer, so that they go beyond the boundary of where that array is,
can start to read information from memory that doesn't belong to the thing you intended them to read in the first place. And this is where a lot of the attacks we saw in the past come from: operating systems or programs able to leak information from the computer that was never intended to be read. Someone's able to scan a bunch of this information and put it back together, because they can trick the program into incrementing the pointer and reading more and more of the memory fragments. Then they've basically got a memory dump of the application, and they can put it back together and see, okay, what information do we have here?
So if you have to allocate and manage memory yourself, that's the opposite end: memory-unsafe. It's really easy to get yourself into a position where you're returning data that was never meant to be returned, because A, it's there, and B, you're not stopping people from reading outside of the intended boundary within your program. And that's where things like Rust come in.
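For contrast, here's a small Rust sketch (an illustration, not from the episode) of the same out-of-bounds read that would be undefined behaviour in C; in Rust it's either an Option or a panic, never a read of adjacent memory.

```rust
fn main() {
    let items = vec![1, 2, 3, 4, 5, 6, 7, 8, 9];

    // In C, reading index 10 of a 9-element array is undefined behaviour and
    // may leak whatever bytes sit beyond the array. In Rust the read is
    // bounds-checked:
    match items.get(10) {
        Some(v) => println!("got {v}"),
        None => println!("index 10 is out of bounds"), // this branch runs
    }

    // Direct indexing is also checked: it panics instead of reading
    // adjacent memory.
    // let _ = items[10]; // panics: index out of bounds
}
```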
And I think a lot of other modern programming languages have this kind of protection in place now. Yeah, yeah. We should probably call out that that same paper does also reference, I think, Rust, .NET, Java...
Maybe Go, I'm not sure. There was more than just Rust in this paper. So if you're not a Rust developer and you're terrified now because the FBI have said you can't do it: you can, please. We can probably put the links in the show notes, right? To the actual paper itself. Oh yeah, that's a good point. Yeah, I'll put that link in the show notes as well. Yeah, there's more than just Rust in there, everybody. Yeah, I think most modern programming languages would have that nowadays, because it has been a
security concern for many, many years. I don't know what generation of languages we're at now, but it's something that I think most modern languages would have addressed. I feel like I should go and try to write some C and C++. I think it was something you put on LinkedIn, actually, where you were like: you should always try and understand one layer deeper than the one you actually work at. Maybe I should go and write some C and C++, because it's one of those things that I think is really interesting.
one of the things I find myself thinking about quite a lot is like for people who've been in software for quite a while who are used to deploying things to like servers, serverless is like amazing. Like it's like the best thing ever. But like there must be a whole category of developers just starting out now who have only ever known serverless. It's like, okay, I wonder if you appreciate it the same if you've never had to deal with a Windows server turning itself off at 3 a.m. Like it's interesting, isn't it? I wonder if I should do the same with programming and go and teach myself C. Yeah.
Yeah. So what I said was you should always understand maybe one or two layers beneath the layer that you want to be working at, because those lower layers are not as productive, and you want to be productive every day. But understanding how things work under the hood can really
give you some appreciation for why serverless is so good, why so many of us are so keen on serverless, because a lot of us lived through having to manage servers. And like I said, machines blowing up at 3 a.m., or a memory leak taking two weeks to become noticeable, and then you spend months trying to figure out where it came from; you reboot the server and it starts up again, and then two weeks later it breaks again. Yeah.
I used to have these memory leak problems with .NET applications, where we had the large object heap. We were creating loads of large objects without realizing it, and it was basically causing memory fragmentation. The memory is there, but it's just not usable, not allocatable, because the large object heap doesn't get cleaned up in the normal (I think it was Gen 0 and Gen 1) garbage collections. So the large objects build up and take up a lot of space, and when they get destroyed, they leave loads of gaps. So the amount of memory you have is still the same, but the actual amount of addressable memory space gets smaller and smaller and smaller.
Over time, that means the application runs out of memory, even though the memory is still there. So we used to have this cron job for this particular application that basically just recycled the machines every two weeks, because we knew it took about two weeks for the memory problem to kick in.
We never actually figured out what the real problem was. We knew the symptoms, we had some theories, we tried a few things, and nothing quite worked. So in the end: okay, let's just do a cron job to kick the tires every now and then. So, you know, there are a lot of problems like that that you just don't get with Lambda anymore. And having that understanding of how to manage machines and run them,
I think it's a really useful skill, understanding that nothing is magic. Some things feel like Lambda magic, but it's all just engineering, people doing good work. And again, the things we're doing with Rust, none of that is magic. It's all engineering, and understanding how things work under the hood really helps you understand the decisions someone might have taken to get to the point where, in Rust, we don't have these memory
safety problems anymore. And I think it also helps you appreciate how good we have it today. Yeah, absolutely. We really do have it good. So, for example, if you're only working with HTTP APIs, you should still understand how TCP and UDP work. You may not use them directly, but at least you understand the protocols, so you understand why
HTTPS is more expensive, when people talk about, okay, there are additional round trips and handshakes and things like that, and why mobile devices in countries with poor network connectivity
have more round trips, more stuff going on, because messages and acknowledgements get dropped, and how TCP can recover from that, and all those things. You may not need it, but it's good to understand how those things work, so you've got a more complete mental picture. Yeah, I mean, I'd even say the same with Lambda: understanding how Lambda works internally. I used to love seeing some of Julian Wood's talks where he gets deep into the internals of Lambda and how it actually works, because it just
It helps you understand why things are a thing. And I think, I don't know, cold starts are so interesting. I feel like they've been done to death. But when you actually think about what's actually happening, like imagine you've got a Lambda function with a gigabyte of memory. A request is coming in and the Lambda service is going to look around for a machine somewhere that's got a spare gigabyte of memory. It's going to allocate that bit of environment, reserve that gigabyte,
Start up the execution environment, download your code, start up the runtime, then pass the payload into that environment. Your code's going to run and then come back. And all that's going to happen in that 700 milliseconds. It's like, what? It makes my head hurt when I think about it. So I think it's just understanding that internal stuff deeply, I think, does make you appreciate, okay, there's actually quite a lot of stuff happening behind the scenes for this thing to run.
Yeah, so I think the last one he did was, was it in 2022? I don't think there was one last year. For a couple of years in a row, they had this Lambda deep dive session, and I think Julian did the last one. I really liked that talk.
And one of the things they talked about was how one of the things they've done to improve performance is making most of the services along the call chain stateful. And the reason that's good is that it removes a call to a database, because the machines themselves, the servers, have the state, which of course makes the system more complicated. But
it means that on a good day, on a good request, they know straight away which machine has got the space, so they don't have to make a call to some other service or database to find out: okay, give me the list of machines we have and the amount of memory they have available, and then try to allocate to one of them. Whenever a machine is allocated another Lambda function instance,
that's straight away reflected in the machine's internal knowledge. So yeah, I love that talk. I'll put a link in the show notes as well, the most recent one, I think, which is from 2022.
Do you have any other talk? You mentioned you did a talk on Rust. Do you have a link to that? I can probably put it in the show notes as well. I don't know if it was recorded. I will try and find out. I've got a whole bunch of stuff on YouTube. I've got a lot of YouTube content on Rust and Lambda. I'll put a link to your YouTube channel as well. Awesome. Thank you. So yeah, I don't think that was recorded, actually. I'm hopefully going to do a couple more this year on Rust. I find it really funny whenever I go and do a Rust talk because I'm brand new to Rust. I've been doing Rust for 12 months, maybe 18 months.
And when I went to give this talk, there was a really experienced Rust programmer in the front row.
And I could almost see them going: what are you saying? You've not worded that quite right. And yeah, I think my attitude towards public speaking is I don't let perfect be the enemy of good. I just like to get up and share things, and if I don't say it quite right, then hopefully somebody calls me out and I can say it better next time. But yeah, it was kind of terrifying, especially because a lot of Rust programmers are these hardcore systems, embedded-systems programmers. So going to speak at Rust conferences is pretty terrifying, more so than any other
conference I've spoken at. But it was kind of fun. Yeah, that sounds scary. Yeah, when I was hanging around the functional programming community, there are a lot of people who are really hardcore language researchers, language creators, so those conferences are quite a different experience. There are quite a few conferences that I really enjoyed, like Strange Loop in the US, and the Code BEAM
conference run by Erlang Factory in the UK, which is kind of their take on trying to be a Strange Loop, where a lot of the more cutting-edge language research gets talked about in a more mainstream setting. And you have a lot of people who have been doing language research for decades. You've got people like the creator of Erlang, the creator of F#, and people like that, talking about the latest and greatest
things in language research. One of the most challenging languages I've tried in the past (I went through a phase of trying lots of different programming languages)
was this thing called Idris, which has a dependent type system, where you can actually have type information sent over the wire, so on the other end they can validate that the information you're sending is correct. They can also do type checking on the length of an array: not just
the types of the objects in the array, but the length of the array itself can be part of the constraint on your type. For more complex systems, maybe that's interesting, but it's more from a research point of view: okay, how far can we push the type system? And when you look at that compared to something like C#, it's, okay, well, that's a type system, and that is a type system. Yeah.
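Rust doesn't have full dependent types, but its const generics give a taste of the length-in-the-type idea described here. A small illustrative sketch:

```rust
// The vector length is part of the type, so adding a [f64; 3] to a
// [f64; 2] is a compile-time error, not a runtime one.
fn add<const N: usize>(a: [f64; N], b: [f64; N]) -> [f64; N] {
    let mut out = [0.0; N];
    for i in 0..N {
        out[i] = a[i] + b[i];
    }
    out
}

fn main() {
    let a = [1.0, 2.0, 3.0];
    let b = [4.0, 5.0, 6.0];
    println!("{:?}", add(a, b));

    // let c = [1.0, 2.0];
    // add(a, c); // ERROR: expected an array with 3 elements, found one with 2
}
```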
Yeah, absolutely. I've been writing quite a lot of Node in the last few days and I
Having no type system is really difficult. It's so hard. My .NET brain really struggles with it, because I'm like, wait, this could be a string, but it could also be a number. What do you mean, JavaScript? I think for Lambda, I haven't found it to be too much of an issue, because most of my code is just one or two files, so there's not a huge amount of abstraction layers. But definitely when I work with a more traditional Node Express application, it's just: okay, depending on which file I'm looking at, user means something completely different. There's no type definition, so I have no idea what I can assume about what this user object has, and you have to go multiple files back to see where it's been passed in and trace all the way to the beginning. All right, so that's the user type, and then you figure out, okay, what can you do with that object?
But yeah, that's where I think in traditional web development, where the application gets complicated, you need the types to be productive. But with Lambda, I've been able to make do without types, and I've been quite happy with Node as compared to TypeScript. Yeah, definitely. So I guess maybe last question: what are you going to be working on now that you're at Datadog?
All things serverless. So one of the things I've been thinking about recently, actually, is kind of related to the round of articles doing the rounds on the internet about serverless being dead, and what does serverless actually mean anymore. I mean, we kind of talked about it offline a little bit: it's just kind of maturing a little bit. And the bit I'm really interested in at the minute is
the interplay, I suppose, between containers and functions. I think referring to things as serverless is getting tricky. I think you've got apps deployed to containers and you've got apps deployed to event-driven functions. And what do modern systems that take the best of both look like? So a pattern I've been looking at a lot recently is you've got a really thin web layer deployed to a really small Fargate container. That gives you a really...
performant response time for your front end, but then all of your back-end functionality, all of your async stuff, you shift all of that into Lambda. So then what does the interplay look like between the two? How do you understand it? Obviously, being at Datadog, observability is quite a big thing. How would you understand it if you've got parts of your application on Fargate, parts of it on Lambda, maybe some microservices that are all on Lambda? How does that interplay work, and how do all these things work together? So that's what I'm particularly interested in at the moment. Like,
what does the next phase of software development look like? That sounds really grand (it's not going to be that grand), but what does it actually look like going forward? Because, you know, I've worked with Kubernetes quite a lot now, and I still don't get it. I honestly just don't get it. I think one of the best heuristics I've ever heard on Kubernetes is: if you're building a system where your ability to dynamically scale infrastructure is a core differentiator for your business,
then Kubernetes is great. Like, think about if you were building a machine learning platform, someone like OpenAI or Anthropic, where you need to scale GPU-based instances up and down dynamically, on the fly, and manage all of that: it's great. But for most companies,
I just think the operational overhead of it is overkill. But they might still want to use containers; you might still want containers to be your mode of operation. Okay, how do you do containers in a more serverless way, as opposed to just thinking serverless equals Lambda? So yeah, that was quite a long-winded way of answering that question, sorry, Jan. But that's what I'm interested in: this interplay in modern applications that are still serverless-ish, should we say. Yeah.
Yeah, the whole serverless-is-dead thing has been going on for a little while, all triggered by AWS disbanding its serverless advocacy team, even though it's just an organizational change. They're shuffling things around; they're putting more focus on AI. You know, they also disbanded the Kubernetes team last year, and I didn't hear anyone say Kubernetes is dead. Yeah. But yeah,
But I think part of that is probably a reflection that they probably don't need to do as much advocacy themselves. There's a big community around it already. I think Marcia wrote a blog post about how part of that is the reason why they are doing
consolidating the advocacy teams, as opposed to having more specialized teams for different paradigms. But also, I think, has Ben Kehoe talked about this before, when he was on this podcast? That the term serverless should go away, and hopefully will go away eventually, when people just start using things as they are, as opposed to needing some kind of marketing term to drive people.
Completely agree. Yeah, which is kind of funny: I think my job title is serverless developer advocate, and I've just agreed that it should go away. But yeah, I mean, I agree. I think people should talk about functions, event-driven functions, reactive functions, or containers. They are the mechanisms. Branding everything as serverless? Yeah, I'm not sure about that.
I think it's useful in the beginning, to introduce a new concept, a different way of thinking. But eventually, same as with NoSQL: not many people talk about NoSQL anymore. I think someone told me there's still a NoSQL specialist team within AWS. But as customers, we just talk about DynamoDB. We don't talk about it as a NoSQL database anymore; we just talk about it as DynamoDB. And we talk about other databases as just,
you know, this database or that database. And we appreciate that it's not SQL; you don't write SQL queries to talk to those databases. But
the term NoSQL is not as useful a way to label things anymore, now that everyone kind of appreciates SQL versus databases that don't run a SQL engine. But yeah, I suppose nowadays people talk about key-value or time-series or relational or whatever. So yeah, I like that way of looking at it.
Yeah. So I think serverless was good for introducing the mindset of using more managed services, more pay-per-use, where you don't have to worry about or run the infrastructure yourself. But as more and more services are really serverless, you know, DynamoDB, S3, really what we want with serverless is for people to go towards the serviceful mindset, where instead of running machines that then run your software, you use a service,
and compose your application out of different services, so that you delegate more responsibility to these providers, as opposed to running everything yourself. So that's kind of the end goal. Hopefully we get there, and then we can ditch the serverless label. It was really interesting, though, actually: when I was at ServerlessDays Belfast, I was talking to Dave Anderson,
author of The Value Flywheel Effect. If you don't know Dave, go and check out his work. He's awesome. And he'd been talking to someone he knows who was working for a company that used Kubernetes. But the actual developer was like, I'm using serverless, because the developers themselves didn't manage any of it;
they just pushed a container, and the container then went off and ran somewhere. Which I think is fine, but then I think you're almost getting back into the kind of dev-and-ops world, where you've got a team running the Kubernetes cluster and you've got the developers, and the developers are just throwing containers over the wall. I worry we're getting back to that. Whereas when you're building what would be known as serverless systems, almost everything is your responsibility as a developer, which I think is one of the downsides, actually: everything now is my responsibility as a developer, to work out if my Lambda function's got too many concurrent executions, or whatever. That becomes my responsibility. But personally, I think it's a good thing to have more responsibility in the developers' hands. Yeah. Well, responsibility is one side of the coin; the other side is empowerment. You can do more, make more decisions yourself. But I think the main thing to bridge that gap (you know, with great power comes great responsibility) is
being ready to handle that responsibility. And I think that's where education is still really important. Because when we talk about serverless, I think Gregor Hohpe coined this great phrase, that serverless is the true cloud native, because it's the way the cloud wants you to build software.
But to be able to do that properly, you have to actually know the cloud really well. And that's where understanding the services you're working with comes in: IAM, how it works, how to properly tailor the right IAM permissions, and things like that. So we are shifting left more and more: security, architectural decisions,
DevOps, automation, deployment, all of those things I think are easier to do with serverless, with Lambda and whatnot because most of it is managed anyway. You just have to choose the right settings and off you go. There's no ongoing maintenance in terms of, okay, you have to patch the machines and things like that. There's no need for that and
And a lot of organizations, at least large organizations, have got platform teams that can take care of some of the cross-cutting concerns across multiple teams. Things like setting up the accounts correctly, having AWS config and all these other things set up to enforce conformity, best practices, and creating L3 constructs for CDK and whatnot.
But yeah, you're right. There's more responsibility for the developers, but I also think those responsibilities are easier to fulfill; they just require the right education and the right people. Although, you say that, it's funny. I was doing some stuff with SQS and Lambda yesterday, and I did the whole thing of not properly setting up DLQs.
I'd been playing around with Datadog, so I had it all wired up, and I got an email from Datadog saying: your Lambda function has had 1,200 executions in the last hour. So obviously the message was just going round and round and round the queue. So yeah, as much as you say it needs teaching and experience, there I go, someone who's creating content and teaching, and still managing to get it badly wrong. So...
Yeah, that's almost like something everyone gets burned by once: recursive loops. Yeah. But doesn't Lambda have that support now? It supports SQS as well, doesn't it? It recognizes these recursive loops for SQS and Lambda.
I'm going to say no. I don't know, but I'm going to say no, because it happened to me yesterday. But maybe it's just SNS. I know for quite a few of the event triggers, Lambda is able to identify these proactively and stop them when it sees a recursive invocation loop. Yeah, I've seen that with S3. I've been burned with S3 before: dropping a file into an S3 bucket triggers a function, which writes back to the S3 bucket, and off we go again. Yeah.
So yeah, looking at the list of supported AWS services... oh yes, SQS. It says it does support SQS. That's interesting then, because it didn't work yesterday. Anyway.
Okay. So, yeah, James, thanks so much for taking the time to talk to us today. I guess before we go, how's the book coming along and where can people go and read the book? So you can pick up the book at rust-lambda.com. That's the kind of landing page for the book. We're kind of releasing it as we write it. So we've got the first four chapters in a draft. They're complete chapters.
yet to be reviewed. But if you were to go and purchase the book today, then you get the first four chapters immediately. And
the premise of the book is that you're going to build a link-shortening application, like bit.ly, and you're actually going to build that in Rust and on Lambda. So by the end of chapter four, you have a fully functioning link shortener: you can submit a link, you get a short code back, and you can then go and hit that short code and it will redirect you to the actual underlying URL. So yeah, if you go to rust-lambda.com, the book's at, I think, a 40% discount right now.
And as we write and release more of it, the price is going to scale up as we release more of the book. So yeah, rust-lambda.com; I think we can drop the links in the show notes. So yeah, it's coming along. I've got a meeting with Luciano on Friday, actually, to try and work out what we're doing next. We've got the outlines of all the chapters drafted, so you can go to the website and see exactly what we're going to cover, and we're just adding the chapters as and when we write them.
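As a flavour of what such a handler might look like, here's a hedged sketch of a redirect endpoint using the lambda_http crate. This is an illustration, not the book's actual code: the short-code lookup is hard-coded here, whereas the book presumably resolves it from a data store like DynamoDB.

```rust
// Cargo.toml (assumed): lambda_http = "0.13",
// tokio = { version = "1", features = ["macros"] }
use lambda_http::{run, service_fn, Body, Error, Request, RequestExt, Response};

async fn handler(event: Request) -> Result<Response<Body>, Error> {
    // A route like GET /{code} would put the short code in path parameters.
    let short_code = event
        .path_parameters()
        .first("code")
        .unwrap_or_default()
        .to_string();

    // Hypothetical lookup; hard-coded to keep the sketch self-contained.
    let target = match short_code.as_str() {
        "abc123" => Some("https://rust-lambda.com"),
        _ => None,
    };

    let response = match target {
        Some(url) => Response::builder()
            .status(302) // temporary redirect to the underlying URL
            .header("Location", url)
            .body(Body::Empty)?,
        None => Response::builder().status(404).body(Body::Empty)?,
    };
    Ok(response)
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    run(service_fn(handler)).await
}
```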
Okay, all right. Best of luck. Looking forward to when the book is finished. I already have early access, so I'll be keeping an eye on the new chapter updates as well. And any feedback: let us know, anybody, if you do pick up the book. Even from you as well, Jan. If you read anything or see anything, then we're both...
We can't offer you anything apart from the eternal gratitude of the authors. That's all you'll get. Oh, we've got cool stickers as well, actually. So if you see me or Luciano around anywhere, then ask us. We've got some with Lambda in a hat. Rust in a hat, sorry: Ferris, the mascot of the language, in a Lambda hat. So yeah. Okay. All right. Best of luck. And hopefully see you in person in September. Yes, likewise. Awesome. Cheers. Take care. Okay. Bye-bye. Bye.
Thank you to HookDeck for supporting this episode. You can find out more about how HookDeck improves developer experience for building event-driven architectures. Go to hookdeck.com slash theburningmonk.
So that's it for another episode of Real World Serverless. To access the show notes, please go to realworldserverless.com. If you want to learn how to build production-ready serverless applications, please check out my upcoming courses at productionreadyserverless.com. And I'll see you guys next time.