Hi, welcome back to another episode of Real World Serverless. Today we welcome back Waldemar Hummer, who's the CTO of LocalStack. So man, another year, another major version of LocalStack. Yeah, so thanks so much for having me. It's great to be here again. I know we did this approximately a year ago when we had our 3.0 release earlier in 2024. And
And now we have a whole bunch of new cool stuff to present with LocalStack 4.0. So super excited to be here and share some updates today. Yeah, so I saw you guys had the announcement a couple of months ago and saw that you also had quite a busy re:Invent as well. I guess a lot of interesting conversations. But as part of the 4.0 release, I saw Event Studio. That looks really interesting to me. And I also saw you announced the multi-cloud support. So you're not just about...
simulating AWS anymore. Now you're expanding out to other clouds, and also Snowflake as well. So yeah, really interesting to hear about that. But first of all, how are you doing? How have things been?
Yeah, absolutely amazing. So hope you and all the viewers also had a great start into the new year. So yeah, as you mentioned, last year has been pretty busy for us, quite successful. We've been coming out with a lot of exciting new features. As you mentioned, going multi-cloud, among other things, we also closed our Series A funding round, and lots
of really exciting developments there. And I'm happy to share some of the updates, including Event Studio and some cool features today. I'll try to do a couple of live demos, you know, keep it spicy, and see if everything works. But I hope it's going to be interesting for everybody.
Yeah, sounds good. Yeah, I think the demo last year was really interesting, and I think we got some really good feedback from the comments and from the audience. People told me afterwards, oh yeah, I heard about LocalStack a couple of years ago, and it looks nothing like what you showed us. Because I think a lot of people, like me, had been
maybe previous customers when it was just open source, back from, let's say, LocalStack version one or maybe even before that, which was a very different product to what it is today. So yeah, shall we kick off, and you can tell us a bit more about what's going on with version four?
Absolutely. Let me just dive right in. I'm just going to share my screen here real quick. And for anyone who hasn't seen the episode we did last year, I'll put a link in the description below as well so you can check it out and watch the demo that Waldemar gave us about LocalStack version 3. But yeah, today we're going to learn a bit more about version 4.
Wonderful. Awesome. Yeah. So before we dive into demos, just one or two slides to kick things off and give a bit of an overview of how things came to be.
So this is really sort of a new milestone in our journey, and I think I showed a similar slide last year as well, where literally we have been on an almost yearly cadence now with major releases of LocalStack. So 4.0, released last November in 2024, adds a big new milestone to our list of achievements here.
We tend to align the major releases around the re:Invent timeframe now. There's generally always a big push to get new features out and to create great demos and material that we can show at the booth. So that's again been a super exciting experience for the entire team this year.
We also continue to see very nice numbers in terms of adoption and usage, both on the GitHub stars side as well as Docker pulls, and our contributor list continues growing. So we're super excited and thankful for the big community that we have and that's supporting us really on a daily basis.
This is kind of summarizing a bit the year in review. So again, it's been really busy for us. We kicked off last year with our team retreat, where we got the LocalStack team together. It's always a great way, especially in a remote company, to come together and discuss what's happening, what's planned for the year, and the vision.
We then, in March, as you briefly mentioned, put out the very first version of a Snowflake emulator. So we're really expanding the scope of our offering beyond AWS, even though AWS is still like our core bread and butter and the main focus.
But we have actually been working quite a bit in the background on Snowflake and some other features, including ephemeral instances, for example, which is a way to spin up local stack hosted environments as well as Azure. This is something that's not really fully disclosed yet. So stay tuned for some more updates in the upcoming weeks as well.
Then we had some great successes. We've been named to the InfraRed 100, so we actually were on the Nasdaq Tower two times this year: in June, when we were added as part of the Redpoint Ventures InfraRed 100, and again when we closed our Series A round in September. And then version 4.0 in November, and then re:Invent, where again it's super energizing for us to be at this major event where we see all the community and customers coming together at our booth.
So yeah, this is just showcasing the team at the booth. It's been, again, I like to mention this because it's extremely, like big kudos for AWS for putting this all together and giving us an opportunity, even like as a small startup to be there, have a booth and have all these great conversations with AWS heroes, community builders. And it's really one of the most energizing and exciting events of the year where everybody comes together and talks about all things serverless and AWS.
We're also planning to be at some of the summits. So there's, for example, the London summit coming up in a couple of months and a few others in Europe. So hope to meet you and some of you there. Yeah, I think I'm going to be at the London summit, maybe the Amsterdam one as well in April, I believe. Yeah, yeah, exactly. Awesome. Definitely looking forward to it. It's pretty, pretty accessible from traveling within Europe. So we definitely want to be at some of those.
Right, so diving right into some of the news and updates: what's new in LocalStack 4.0 and beyond. This is just trying to summarize all the amazing stuff that has been built by the team over essentially almost the entire year. It's really hard to pinpoint, but we have, for example, an all-new Step Functions provider. So we took a couple of services and completely re-implemented them from scratch,
including Step Functions as well as API Gateway, because we saw that the existing implementation we had was not fully future-proof, and we really took this initiative and this opportunity to reimplement them from scratch.
We continue to have a lot of enhancements and innovation around IAM. I'm going to show this briefly in the demos as well. We're moving more and more towards almost a security advisor tool that allows you to deeply inspect your policies, with enforcement, soft enforcement, the IAM policy stream, and other things.
We're enhancing Kubernetes support. That's something that especially enterprises love to see: really running LocalStack natively in a Kubernetes environment, spinning up pods for Lambdas, for example. So everything that is already seamlessly possible locally in Docker is now also being natively ported to Kubernetes to make it really seamless and easy to use there.
We have a whole bunch of new features in the web application. So our web app, which you'll also see in the demo, is really the entry portal to a lot of functionality from a visual UI point of view. And we've added a lot of new resource browsers, new overviews, and also features around Cloud Pods as well as telemetry.
There are a few new services, for example Apache Flink, which is a service that AWS brought out not too long ago, I think as the successor to Kinesis Data Analytics. So again, one of the services that has been requested quite frequently, and we added this. And generally speaking, a lot of parity enhancements across all services. It's too many to count, of course, but as you mentioned, if you look at LocalStack from a couple of years ago, nowadays it's just a completely different product. It has changed entirely.
And we're also increasingly focusing on advanced developer experience features. So on the left-hand side here, what I've listed is our new LocalStack SDK. I'm going to talk about that in just a bit, as well as Event Studio, which, as you mentioned, is our new debugging tool that allows detailed traces and insights into your serverless applications.
There's a couple of upcoming features. So the AWS Replicator, I'll briefly touch upon that. It's basically a way to enable these hybrid scenarios where part of your stack may be running locally and then connecting your local environment to remote AWS resources. We continue to innovate in that space.
For CloudFormation, there are quite a few improvements we're planning, including update support across the board. So currently it can sometimes be a bit painful to make updates. We have supported them for the main resource types, but we really want to make sure that there's a seamless experience for CloudFormation updates across all resources.
As well as Cloud Pods, our persistence mechanism, which you and some of the listeners may remember. It's basically a way to persist the state of a LocalStack instance and then later on inject the state back in. And we're working on better version compatibility support. So basically, if you store a Cloud Pod with one LocalStack version, it should be compatible with future versions as well; there are currently still some rough edges around that.
Quick question about the AWS Replicator. Is the idea something like: if I've got an API Gateway and it's hooked up to, say, a DynamoDB table, and I'm calling API Gateway and using the service proxy to do GetItem and PutItem directly, is the idea then that you will host the API Gateway on LocalStack, but then have it talk to the real DynamoDB table?
Exactly. So this is one of the modes that we support. I think we have a slide on this later on. There's basically different modes. What you're describing is what we call the proxy mode, which basically, you know, you get the local look and feel, the local interaction. But behind the scenes, we're actually calling out to a real AWS resource. So that's the proxy mode.
We also have the copy mode or replicate mode where ahead of time you can say, I want to, for example, copy contents from some parameter store or from an S3 bucket or from a DynamoDB table. I want to copy that into my local environment first and then have it available in my local machine.
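To make the distinction concrete, the two Replicator modes described here could be sketched roughly like this in Python; all names, endpoints, and structures are illustrative, not LocalStack's actual implementation:

```python
# Illustrative sketch of the two AWS Replicator modes described above.
# These helpers are hypothetical, not LocalStack internals.

LOCAL_ENDPOINT = "http://localhost:4566"
AWS_ENDPOINT = "https://dynamodb.us-east-1.amazonaws.com"

def resolve_endpoint(service: str, proxied_services: set) -> str:
    """Proxy mode: requests for proxied services are forwarded to real AWS;
    everything else is handled by the local emulator."""
    return AWS_ENDPOINT if service in proxied_services else LOCAL_ENDPOINT

def replicate(remote_items: list, local_table: list) -> None:
    """Copy/replicate mode: fetch remote state once, ahead of time,
    and write it into the local store."""
    local_table.extend(remote_items)
```

In proxy mode the routing decision happens per request; in copy mode the data is materialized locally up front, so subsequent reads never leave the machine.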
Right. Okay. There are basically these different approaches that we see. And I think a strong combination of those really makes for a seamless experience where you can easily blur the lines, almost, between the local and the remote environments. Yeah, that's very interesting, because I had some recent conversations with the folks from Restate and also the guys from Hookdeck.
They both have built-in capabilities to have some part of your application run locally while some part of it is still hosted, so that you're able to easily change part of your workflow. For example, if you're using Restate, you can change one of the steps in the workflow,
and you can debug it locally, but everything else is still going to be running in the cloud, in the hosted platform. So it looks like that's something that everyone is kind of doing, which is pretty cool. Yeah, absolutely. So I think this is really where, you know, if you combine the full power of the local emulation, the local execution, together with, of course, the scalability and the power of the cloud, I think this is really where we can unlock even more value for LocalStack.
Of course, as we're sort of exploring more and more complex scenarios with enterprises, we see that there's, you know, there's, of course, physical limitations to how much you can run on a local machine. That's why providing these easy plug points to remote systems is really critical for being successful there. Yeah.
Yeah, especially, I guess, as people do more and more AI stuff, there's one service that you probably don't want to have to simulate locally, which would be Bedrock or something like that, which I imagine has a lot of interesting challenges for you as a local emulator.
Exactly. You're not going to believe it, but I'm going to do a short Bedrock session afterwards. It's very much, of course, not the full power of the models that are available in the real cloud. Yeah. Cool. But yeah, definitely some great directions, and we're exploring a lot of these new features together with our customer base. This is why it's so important and critical for us to get all your feedback on what's working with LocalStack and where the boundaries are that we need to expand on.
Yeah, so with that, let's maybe just dive into a couple of demos. Again, I love to keep these things interactive. And then, of course, I know you also ask all the great questions usually, so we can navigate from here. So one thing I'd like to show is, and we also did prepare this for reInvent, is a small application we put together, which is a serverless quiz app. So basically, it consists of some backend components and also frontend components.
And basically it allows you to put together a quiz application with multiple choice basically. And then all the data is being stored with DynamoDB table. There's some SQS notifications involved with some scoring step functions that are happening in the background and so on. So it's a reasonably sized serverless application that is showcasing LocalStack.
So I can now go ahead and just deploy this whole application. Let me first go here. So I have a couple of terminal sessions here. First thing I'll do is just start up LocalStack. Before I do this, let me export this as an environment variable, and then I'll just go ahead and say localstack start,
enabling debug mode. And once it's up and running, I can do a docker ps here, and we actually see it's running here in our container. The typical port 4566 is the canonical entry point, and now it's up and running and we can start working with this.
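As a side note, the edge port 4566 also exposes a health endpoint at /_localstack/health that reports per-service status, which is handy for checking that everything came up. Here's a small sketch that parses such a response; it uses a canned sample so it runs without a live instance, and the helper function is hypothetical:

```python
import json

# LocalStack listens on port 4566 and exposes /_localstack/health,
# which reports per-service status. The response below is a canned
# example so this sketch runs without a live LocalStack instance.
HEALTH_URL = "http://localhost:4566/_localstack/health"

sample_response = json.dumps({
    "services": {"dynamodb": "running", "sqs": "available", "lambda": "running"}
})

def running_services(payload: str) -> list:
    """Return the names of services that are actively running."""
    services = json.loads(payload)["services"]
    return sorted(name for name, state in services.items() if state == "running")

print(running_services(sample_response))  # ['dynamodb', 'lambda']
```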
And just navigating to the right directory, serverless quiz app, I think is the name. Okay, so let's take a quick look at how this quiz app is defined. So if I go to my IDE here, we'll see that there is a CDK application. So the whole thing is defined as a CDK app called the quiz app stack.
and it has a couple of different components. There's a frontend stack here, which basically gets deployed as a S3 website, the web app bucket, there is a CloudFront distribution that sits on top of it, and a couple of other components.
So I'll just go ahead before we look at this and say make deploy CDK. And this now basically goes ahead and uses our wrapper script CDK local, which some of you may be familiar with. It's first running CDK local bootstrap, which initializes the stack, and then it basically does a CDK local deploy and deploys this application stack locally now.
So that's going to take maybe half a minute or a minute or so. So cdklocal, is that just your CLI wrapper, similar to awslocal?
Exactly, yes. So we have a bunch of these localized client tools, AWS local, as you mentioned, CDK local, TF local, which is a Terraform wrapper. And we generally keep the interface of these CLIs, you know, just the same. It's basically a drop-in replacement for the real thing, just deploying against the local machine.
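Conceptually, what these wrapper scripts do is keep the same CLI surface while pointing the AWS tooling at the emulator. A minimal sketch of that idea, assuming the standard AWS SDK environment variables; the localized_env helper itself is hypothetical, not how the wrappers are actually implemented:

```python
import os

# Illustrative sketch of what the *local wrappers (awslocal, cdklocal,
# tflocal) conceptually do: same interface, but every AWS call is
# routed to the emulator. The env-var names are the standard AWS SDK
# ones; this helper function is a hypothetical stand-in.
def localized_env(base_env: dict) -> dict:
    env = dict(base_env)
    env.update({
        "AWS_ENDPOINT_URL": "http://localhost:4566",  # route SDK calls locally
        "AWS_ACCESS_KEY_ID": "test",                  # dummy credentials
        "AWS_SECRET_ACCESS_KEY": "test",
        "AWS_DEFAULT_REGION": "us-east-1",
    })
    return env

env = localized_env(dict(os.environ))
print(env["AWS_ENDPOINT_URL"])  # http://localhost:4566
```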
Yeah, and what we can see here now is since I have verbose debug logs enabled, you can see all the different deployments happening now, the different Lambda functions, the step function, state machine, and others. And as this is pulling up here, I can then, once it's deployed, we should be getting a CloudFront distribution endpoint, which you can then use to open the front-end application, which enables us to work with this app.
I'm just taking a quick look to see if there's anything particularly interesting. I think the front end is just a Node.js, I believe React, application. And then we have our Lambdas here for getting the quizzes, getting the submissions, doing the scoring, submitting the quiz, and so on. So these are all basically Python Lambdas implemented here with just standard Python.
And all of these are triggered by step functions or is this triggered by API Gateway? So they're triggered by different types of events. So a lot of this goes through API Gateway, which is from the front-end app. And there's a few ones that also go through a queue, the submission queue. And then some also get triggered from a step function state machine that sends out SES emails and so on.
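For a flavor of what one of those Python Lambdas might look like, here is a hedged sketch of a submit handler behind API Gateway that forwards the submission to SQS. The event fields follow the standard API Gateway proxy format, but the handler, field names, and queue client are illustrative; the client is injected so the sketch runs without AWS:

```python
import json

# Hedged sketch of a submit-quiz Lambda like the ones in the demo app.
# The handler parses an API Gateway proxy event and forwards the
# submission to an SQS queue. The queue client is injected so this
# runs without AWS; all names here are illustrative.
def submit_quiz_handler(event: dict, sqs_client, queue_url: str) -> dict:
    submission = json.loads(event["body"])  # proxy events carry a JSON string body
    sqs_client.send_message(QueueUrl=queue_url, MessageBody=json.dumps(submission))
    return {"statusCode": 202, "body": json.dumps({"status": "queued"})}

class FakeSQS:
    """Stand-in for a boto3 SQS client, recording sent messages."""
    def __init__(self):
        self.messages = []
    def send_message(self, QueueUrl, MessageBody):
        self.messages.append((QueueUrl, MessageBody))

fake = FakeSQS()
resp = submit_quiz_handler(
    {"body": json.dumps({"quizId": "abc", "answers": ["B"]})}, fake, "submission-queue"
)
print(resp["statusCode"])  # 202
```

Decoupling the handler from the real client like this is also what makes such functions easy to unit-test before ever deploying them, locally or otherwise.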
Okay. By the way, for your Step Functions provider, are you supporting the variables feature that they introduced a couple of months ago?
Yes, we do. And I also have a slide on that because it's been a really successful partnership for us with the AWS Step Functions team together. So we actually worked on a joint release announcement. So it's a very exciting partnership for us. I have some separate slide on that later on as well. Cool. Is that something that we can kind of expect more now in terms of day zero releases being supported on the local stack? Absolutely.
- Absolutely, so it's very exciting for us. This is one starting point for much closer collaboration with AWS. There are a couple of teams that have already expressed interest, including the Lambda team and EventBridge and others. So yeah, hopefully these day-one launches will be happening more and more frequently in the future. - Cool.
Cool, so now this app has been deployed. So again, all of this is now running locally in my local stack instance. And what we can do here is basically we can create a new quiz here. I'm just going to do some tests here.
You can add some options, three or four of them, select the correct answer, and then add the question. And in the end you can just submit the quiz, right? So this is basically to come up with custom quizzes. Then you can go to the homepage, and now it's actually available here, and we can start running this particular quiz.
So this is just for illustration purposes here. What you can do at this point now is to go to our web application, so that's app.localstack.cloud. As I mentioned before, this is really the entry portal to a lot of the functionality and visual UI for different parts of the offering. And we can go to the resource browser here, for example, and let's go to DynamoDB, for example,
and take a look at the tables. So here we have our quizzes table. And if we go to items, we can, for example, see here our test has been added with some randomly generated quiz ID and it's public. And so this is basically just all stored in our local DynamoDB table now.
Okay, so that's nice. We can of course also go through this and start playing, but I guess it's more for the demo purpose, so not that interesting. So submit the quiz and done.
Now, the maybe more interesting part: let's iterate a bit on this example and look at different aspects of it, right? And the first thing I'd like to showcase is what we already mentioned briefly before, our Event Studio. So Event Studio is really a new product that we're currently developing. It's right now in early preview, so we're working with early adopters to get feedback.
And basically what it allows you to do is to debug and trace your serverless applications in a very detailed manner down to the level of individual components, how they're connected, and really following a trace of execution in a serverless application. So certainly some similarities with X-Ray, for those in the call who are familiar with this.
We tend to believe that we can even go a step further. In LocalStack we have a lot of possibility to really show all the payloads and traces, and surface detailed information, including things like IAM policy warnings or policy violations, that can really help the user pinpoint issues that are happening in their stack.
So we can now go ahead and try out Event Studio with this application. So Event Studio is available currently as an extension, a local stack extension. So if I go back to my browser here and go to app.localstack.cloud where we were before, there is on the left hand side you will find our extensions.
And Event Studio here is one of the extensions you can install into the instance. So just a quick reminder for those not familiar with extensions: extensions are basically a plugin mechanism that we developed some time ago to extend LocalStack with additional functionality. So, for example, we have these Terraform init hooks that allow you to create a handler for Terraform files that you can then use with LocalStack init hooks, or we have this thing called the resource graph, and a couple of other things. And Event Studio is here. And I can now go ahead and say I want to install this into the instance.
It's actually already installed, and now I can go ahead and use this with the application that we have deployed. So the endpoint for Event Studio is this: eventstudio.localhost.localstack.cloud. Again, this is just a local endpoint that we're looking at here; there's no interaction with the cloud system. And we can already see the list of events coming from this deployed application that we were interacting with before, our quiz application.
And I can now go to one of these events, for example, and we can see the detailed traces all the way from the API gateway. We can see what's the request type, the request payload, the API type is an AWS proxy, all the information that's relevant here. And then all the way we can sort of trace the invocation all the way to the Lambda and from there onwards, it connects to any other third party or any other components.
So this is, again, relatively early days for us, but I think we already got some really good feedback when it comes to debugging different types of scenarios. Oftentimes it's a bit of a hit-and-miss game with AWS, where you think: okay, what does the payload actually look like? How can I extract the right payload information from these fields that are coming in? For example, in API Gateway, you can see the detailed headers getting passed in. So it's very detailed, one-click information that you can get directly from here. So this was the invocation. Let's take a look at the put items. So this is a slightly longer trace here. This is the one where we actually put or submit the quiz, which involves some DynamoDB interaction as well. So here we have the API Gateway, the invocation, then we're actually calling the Lambda function for the submit quiz, which then actually sends it. And this is
also interesting where some asynchronous processing comes into play, right? So the Lambda puts an item on the SQS queue, and at this point this Lambda already returns, right? So we are no longer in the synchronous invocation chain, but it also detects that this now asynchronously triggers, via an event source mapping, another Lambda function that does the scoring and then stores the user submission to the DynamoDB table, right?
So that's where it almost goes beyond a synchronous invocation chain and then also follows asynchronous events to their respective targets, in the database, for example, in this case. And do you also follow streams as well? That's one of the things that X-Ray doesn't do. For example, I think X-Ray and Datadog don't support
events when they go over, say, a DynamoDB or Kinesis stream. And that's one of the things that Lumigo does in terms of having that end-to-end traceability. What about Event Studio? If I have a Lambda function that subscribes to the DynamoDB stream here, am I going to see the other function after that?
Yeah, that's a great point. And by the way, Lumigo also comes to mind when I'm showing this, of course. Yes, the short answer is yes. At the current state of implementation, this is still being built out, but certainly this asynchronous following of the trace is something that Event Studio definitely offers. As you can see here with the SQS queue, we basically follow each item to its destination.
So it's almost like the trace ID is sort of attached to the message itself that then gets consumed by some downstream system. In this case, it's Lambda. The next step would be, as you mentioned, the DynamoDB streams, for example, if you have a stream. Yeah.
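The mechanism being described here, attaching the trace ID to the message itself so the asynchronous consumer gets stitched into the same trace, can be sketched like this; the structures are purely illustrative and not Event Studio's internals:

```python
import uuid

# Minimal sketch of trace propagation across an async hop: the trace
# ID rides along with the message (SQS-style message attributes), so
# the consumer's span joins the producer's trace. Illustrative only.
def publish(queue: list, body: str, trace_id: str) -> None:
    queue.append({"body": body, "attributes": {"trace_id": trace_id}})

def consume(queue: list, spans: list) -> None:
    msg = queue.pop(0)
    # The consumer records its span under the producer's trace ID.
    spans.append({"trace_id": msg["attributes"]["trace_id"],
                  "component": "scoring-lambda"})

trace_id = str(uuid.uuid4())
queue = []
spans = [{"trace_id": trace_id, "component": "submit-lambda"}]
publish(queue, "submission", trace_id)
consume(queue, spans)
# Both spans now share one trace ID, giving a single end-to-end trace.
print(len({s["trace_id"] for s in spans}))  # 1
```

The same pattern generalizes to DynamoDB streams or S3 notifications as long as the intermediary preserves the attribute, which is exactly where hosted tracing tools tend to lose the thread.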
There are still a couple of things that X-Ray doesn't support. I think S3 isn't supported either: if you go from, say, Lambda to S3, and then have an S3 notification trigger another Lambda function, you also lose that in the X-Ray trace. There are a couple of cases like that that are still not fully supported end-to-end in X-Ray.
And it's understandable. I can see that it's not easy to implement this across the board. Of course, you need the service buy-in from all the services to make sure that you're actually passing along this trace information. But for us, it's really a way, the way we see Event Studio is kind of opening up the black box, right? So we really want to give as much insight as possible into what is really happening sort of in the internals during the processing.
It's very much a starting point here, but we're already getting some quite exciting feedback on how this can be used. Ultimately, as you can see, this also spans up a kind of dynamic resource graph. Right here we're just seeing this one linear invocation, but if you think about combining all these traces together, what we actually get is a graph of all the interactions in the system. That's sort of the next evolution that we have in mind for Event Studio. Yeah.
Sure. Another piece of feedback would be to add a logging section, so that you can easily see the logs for the function invocations that are part of this transaction as well.
And that's also one of the nice UI things that Lumigo does that I love and use all the time. Yeah, no, that's great feedback. Certainly we'll take that back to the team. And as you mentioned, I think the power of this also comes from having the central entry point and then branching out from there, right? So connecting it to the logs, or even being able to click directly here on the function and have it take you to the resource browser so you can inspect the state. So there's certainly a lot of navigation that we still need to add in here.
But yeah, super exciting times for us to, again, open up the black box and make these interactions more visible, more accessible. And yeah, so all these serverless applications with Lambda, of course, are a prime use case because they already lend themselves really well with event source mappings and so on. So it's really exciting.
get a deeper understanding of what's going on. I mean, you surely have worked with a lot of event-based applications, event-based logic, and sometimes it can be really hard to debug it and understand where something was coming from. The other feature we're
actually implementing, and to be honest, this is a bit of a shot in the dark now as to whether this is going to work, is an event replay functionality. Let me see if that works here. I should probably not have picked this particular one.
Yeah, okay, so this is something that's currently still under development, but the basic idea is that you can also replay events, right? So not only inspect them here, but even just repeat an event and see how it flows through the system again. Right, yeah.
Right. And also, I guess it'd be very useful if, say, you run a test and it fails: you come here, you look at the transaction, you figure out, oh, the error is X, Y, and Z. You go back to your code, you change it, and, assuming you have hot reloading enabled, you fix your problem, run the test again, replay that failed event, and then you can see, oh, right, now it works. Great. Yeah.
Exactly, 100%. Exactly what you just said. The cool thing also is that you can re-trigger this replay at any point of the execution chain. So you don't always have to go all the way back to the beginning of the chain; you can literally go in here and say, I'm curious about this point right now, and try it again from there.
Okay, so that actually worked. This is a different UI feature here. So you can actually really go to each individual step in the processing chain and sort of replay the event from there. Oh, that's very nice. That's something that's quite powerful there.
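The replay-from-any-step idea could be sketched roughly as follows, with the recorded chain and handlers as illustrative stand-ins for Event Studio's captured traces:

```python
# Hedged sketch of replaying a recorded event chain from an arbitrary
# step, as demonstrated above. Chain entries and handlers are
# illustrative stand-ins for captured trace data.
def replay_from(chain: list, step: int, handlers: dict) -> list:
    """Re-run every recorded event from `step` onward through its handler."""
    results = []
    for event in chain[step:]:
        handler = handlers[event["target"]]
        results.append(handler(event["payload"]))
    return results

chain = [
    {"target": "apigw", "payload": {"path": "/submit"}},
    {"target": "lambda", "payload": {"quizId": "abc"}},
    {"target": "dynamodb", "payload": {"Item": {"score": 5}}},
]
handlers = {
    "apigw": lambda p: f"routed {p['path']}",
    "lambda": lambda p: f"scored {p['quizId']}",
    "dynamodb": lambda p: f"stored {p['Item']['score']}",
}
# Replay only from the Lambda step onward, skipping the gateway:
print(replay_from(chain, 1, handlers))  # ['scored abc', 'stored 5']
```

Because each recorded event carries its own payload, replaying from the middle of the chain doesn't require re-driving the earlier steps, which matches the "try it again from this point" behavior shown in the demo.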
Awesome. Yeah, definitely expecting hopefully lots more great feedback on this. Please do check it out, everybody out there, and give us feedback. It's something where you can very much actively influence the development as we add more functionality to Event Studio. Cool. Right, so we looked at that. Basically, this is just a screenshot of what we just looked at.
One thing that's worth mentioning is that the whole application can also be injected as a Cloud Pod. So just as a refresher for everybody on the call who may not have heard about Cloud Pods before: Cloud Pods is our terminology for these point-in-time snapshots of the state in LocalStack.
One thing we sometimes have to explain is that it has nothing to do with Kubernetes pods, so no containers. It's really about persistence. Think of it as a zip file that basically gets pulled out of the LocalStack instance and can later be injected back in.
And we've gone ahead and actually prepared Cloud Pods ahead of time for this application that I was just showing, which actually consists of two parts. One is the infrastructure and application part that we saw, and the other is a data layer with some predefined quizzes that we can just inject into the DynamoDB table. So let me just quickly go ahead and restart LocalStack here as we pull this up.
Okay, localstack start. And then I have this here; I can just copy and paste. So the first step is that we load the quiz app pod. That's basically the application and the infrastructure. One thing you'll notice is, I mean, the CDK deployment before took maybe a minute, a bit more than that, and the Cloud Pod injects within a couple of seconds, right? So this is really fast. Once you've created your stack once, recreating it is really fast.
And now, as a second step, because Cloud Pods can also be layered on top of each other, we are loading the data for this application. So we're loading a second Cloud Pod here, which contains the DynamoDB data for our quizzes, which we can now open here. Let me see if I have the right endpoint. This is now a different endpoint. Distributions.
Just doing another CloudFront distribution list and that's now this endpoint. And now we see that we actually have three quizzes predefined that were part of this CloudPod. And we can select, for example, this AWS quiz. Let's quickly go through it. Jan, I'm sure you're going to be able to answer all the questions. Which AWS service is primarily used for hosting applications and websites?
Well, you could argue for all of them, but let's go with S3. S3, okay. What does S3 in Amazon S3 stand for? Simple Storage Service, so A. A, okay. Which AWS database service is a fully managed relational database? RDS. RDS, okay. Cool. Which AWS service runs code in response to events without requiring server management? That would be Lambda, so B. Yeah. What does IAM stand for? Identity and Access Management, so B. Okay. Perfect. Submit.
And let's see what our score is. Should be pretty good. Nice. I think that's 100%. I'm not sure how we're counting the points here, but I think that's 100%. In any case, this is just to showcase that this whole quiz we prepared here was part of the definition of this Cloud Pod and, again, has been loaded within a couple of seconds into this local instance.
So the idea here is that you've got this static snapshot of data that you can then easily load into your tests, so that instead of having some kind of script to seed the data, you can capture that data once, create a snapshot, and then use that snapshot the next time you want to run the same test cases.
Exactly. Exactly. And so what we see often nowadays is that maybe there's some kind of a CI pipeline that's running whenever you merge some change into your main branch, for example, which then spins up the infrastructure for your application, does all the deployments, and then creates a Cloud Pod and pushes that as the "latest golden pod", a terminology we often hear. In fact, we do this also for our own cloud
platform backend. So basically the backend that we saw with our application, like all the backend APIs that are used for this app, are also tested and defined with a Cloud Pod, right? In our own cloud backend infrastructure. And we use this notion of the latest golden pod to always have the latest version available that you can quickly spin up on the local machine. That is something that is more and more being used.
So in that case, how would you select which resources are part of this pod and which data is part of this pod? That is a great question. So there's a pod save command, which is kind of the counter command to pod load. And if you look at the...
at the help text here, basically we can specify the services that should be contained in this pod. So currently that's the level of granularity: we do it by service. So one thing I'm not sure about, but probably you can't currently say, I want only one table of my DynamoDB within the pod. So it's really on the level of services. But yeah, this is how you can combine them basically. - Okay, okay, gotcha.
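For reference, a pod save could also be scripted from a test helper along these lines. This is only a sketch: the `localstack pod save` command is the one discussed above, but the exact `--services` flag name and the pod name here are assumptions to verify against the CLI help text.

```python
def build_pod_save_command(pod_name, services=None):
    """Build a `localstack pod save` invocation as an argument list.

    The --services flag name is an assumption; check `localstack pod save --help`.
    Granularity is per service, as discussed above, not per individual resource.
    """
    cmd = ["localstack", "pod", "save", pod_name]
    if services:
        cmd += ["--services", ",".join(services)]
    return cmd

if __name__ == "__main__":
    # Snapshot only DynamoDB and Lambda state into a hypothetical "quiz-data" pod.
    print(" ".join(build_pod_save_command("quiz-data", ["dynamodb", "lambda"])))
    # Run it with subprocess.run(...) against a running LocalStack instance.
```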
So it's by service, not by individual resources. Yeah. I don't know if it is something you guys are thinking about. Maybe it would be good to have some kind of UI support in the console, so that you can go to your LocalStack instance, see all the things that you've created, and maybe just tick the ones that you want to include in a pod, and do it that way. Yeah.
Yeah, that's actually great feedback. Yeah, maybe some kind of a tick box where you can just go through and say, I want to include this and this and this, and then, you know, create a Cloud Pod. Yeah, maybe something like that would be easier than to do it entirely through the CLI. Yeah, yeah.
That's true. I guess one thing that we often see is we, I guess maybe, you know, building a tool from developers for developers. I think we're oftentimes quite CLI heavy. Start from CLI first. CLI first. But I think we should sometimes acknowledge more that some UI first functionality sometimes would also be beneficial for the users.
Cool. Yeah. So the other thing I wanted to briefly highlight with this application is IAM. So we also talked about IAM, I think, in the last session. But this is something where we just have ongoing enhancements and innovations. So I just wanted to briefly show a quick refresher here as well.
So the first thing that we have is in the web UI directly, we can now very easily control the IAM enforcement status. So by default, IAM is essentially disabled. So, you know, by default, LocalStack is going to permit everything and you can run all the operations. But we can directly from the UI here, and we already see actually a couple of internal polling messages coming through. We can also disable them here or
or filter them out. And directly from the UI here, I can now say we want to enforce IAM policies, right? So the stack that we've deployed here already has all the IAM permissions in place. So here, when I refresh the page, it calls the Lambda function in the background and it's able to fetch the items from DynamoDB, because all the right permissions are already in place, because we can deploy the same thing to AWS as well.
Now what I've done here in preparation for this workshop is to create a small script which basically disables some of the policies that are required for the Lambda to access the DynamoDB. So basically what we're going to do next is in our architecture diagram here,
We're going to cut the access from some of these Lambda functions to DynamoDB to the quizzes table. We're going to take away the role-based policies that are required for this execution role of the Lambda here and then see how local stack reacts to that. So I have a little script here which I believe is called "bind updatePolicy"
So this is really just a small script I've written before, which basically, don't mind the Python here, is just going through some of the roles here and it's running an update statement, put-role-policy, and basically takes away the DynamoDB permissions from the Lambda, right? I'm just gonna run this here now. Python, and I think we're gonna call it, I think it's disable.
Okay? So that was quick, but I think this should have already cut the access now between the Lambda and the DynamoDB table. And now let's see what happens. Right, if I refresh the page here, that was probably not really true. So let me see if that actually worked. So first of all, let's go back to here. So IAM is still enforced, so that should be fine. Let me see if I did the right thing. Update policy.
So let's do a quick print statement here and see if it actually reaches this here. That's it. Let's call it enable and disable. Let's try one more time. Okay, it's not reaching the point, which is bad. So something's off in my script here. Yeah, I put this together just before coming into this call, so that was obviously a bad idea. So something is off. So we're looking at all the quiz apps stack. If they're not in scope, we'll continue.
And then let me just quickly see if we at least receive this here. Let's just keep this in the monitor so if I'm not able to fix it then we can move on. But I'm getting the roles here. I'm getting the roles, okay. But then they're probably not in scope. Role of role name.
- Ah, of course, sorry, because I was testing it before with the CDK stack, but this is actually the CloudPod that we have deployed. So we have different resource names here. They don't have the prefixes in there. Okay, so let's just do it for basically all the roles. So I'm just commenting out this part here. So we just do like a hard for policy name, hold on, in scope. What is it complaining about now? Policy name. Okay, let's see.
Okay, I think that worked. Let's try again. So now we basically removed the access. No, it did not work. Yeah, that's what happens when you do this live. What if you update the CDK app yourself and redeploy the CDK app? Yeah, exactly. That's what I was gonna do. So let's just do a quick make CDK, shouldn't take that long. I'm gonna let that run and wait until it's started. Cool.
Just going to take a minute or so and then we can run it one more time. So that actually brings up an interesting use case potentially for the pod, the use of pods. So potentially, would you be able to say in that case, what you had there, the problem was that you've got a pod, so things are slightly different. But could you have, say, created a CDK version, well, a version of a CDK app without necessary IAM permissions?
Once you've created that, then you create a pod so that you can use the pod the next time to do a demo like this so that you can say to test a failure case. Because this can be quite useful for some of the sort of more chaos experiment kind of situation whereby instead of having to, say, test different scenarios by writing different scripts that undo certain permissions and having to redo them between different tests, you can then create a pod that
that represent different cases that you want to run so that for each of the cases, this one would be, say, there's no IAM permission for DynamoDB. And then one could be...
you don't have some other permissions, or Lambda function wasn't connected properly to the event bridge, for example, then you can have different pods that represent different failure scenarios. One would be to test what happens if we don't have permission, to test our fallback. Another one could be to test the case where is our
dead letter queue working correctly, for example. That would be a use case where I'll have different pods that represent different failure cases. So you capture all the setup, so that instead of having my test cases written with all these setup steps, I'll have different pods pre-created to represent the different failure conditions. Yeah.
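To make the idea concrete, a test suite built around failure-scenario pods might be organized like this sketch. The pod names and expected error labels are hypothetical; `localstack pod load` is the command shown in the demo earlier.

```python
import subprocess

# Hypothetical pod names, one per failure condition described above.
FAILURE_SCENARIOS = {
    "quiz-app-no-dynamodb-perms": "AccessDeniedException",  # tests the IAM fallback path
    "quiz-app-broken-eventbridge": "EventsNotDelivered",    # tests the dead letter queue path
}

def load_pod(pod_name):
    """Reset the running LocalStack instance to one pre-captured failure state."""
    subprocess.run(["localstack", "pod", "load", pod_name], check=True)

def run_scenario(pod_name, expected_error, exercise_app):
    """Load the scenario pod, drive the app, and check it degrades as expected.

    `exercise_app` is a caller-supplied callable returning the surfaced error label.
    """
    load_pod(pod_name)
    assert exercise_app() == expected_error, pod_name

if __name__ == "__main__":
    for pod, expected in FAILURE_SCENARIOS.items():
        print(f"{pod} -> expect {expected}")
```

The point of the table is that each failure condition lives in a snapshot rather than in imperative setup/teardown code, so adding a new chaos case is just capturing one more pod.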
Yeah, that's actually a super interesting point. Yeah, so you can really have like a base definition of your application, right? And then either you... I mean, there's a couple of ways we could achieve this. Either in the CDK stack itself, we could...
we could just cut some of the definitions where the Lambda gets access to the DynamoDB table, for example. That would be the more crude way of doing it. Or the other way would be to say, maybe we're just not snapshotting all the IAM. - Right, right, exactly, yeah. - Yeah, exactly. And just leave that out basically. And then you have different failure conditions for testing. It's actually a great use case. I'm gonna,
I'm going to dive into that and maybe prepare a demo for next time. That's awesome. Okay, sounds good. Okay, so back to let's see if my little script here now works. Okay, I'm at least getting a couple of updates. So let's see if that works now. Just double checking again one more time that we have IAM enabled. I need to refresh here. Enable the stream. I'm going to enforce...
Disable the internal calls. And now let's see what happens if I refresh our app here. Okay, this is now of course a new distribution ID. So I need to go back and CDK always generates a new CloudFront distribution. So I'm going here.
And cool, yeah, so now it's telling me no public quizzes are available and this is really coming from the fact that the Lambda is basically erroring out in the background. And if I go to our list here, refresh this,
And I think it's probably better to just look at it from the other side. Yeah, so we can see that we already have an error state here. So it was trying to do a scan, and here we see this is the policy that would be required to actually make that call, but it was denied because we just deleted that policy beforehand, right?
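A cleaned-up version of a policy-stripping script like the one in the demo might look like the following sketch. The endpoint, dummy credentials, and role prefix are assumptions for illustration; only the inline-policy rewrite logic is shown as directly runnable.

```python
import json

LOCALSTACK_URL = "http://localhost:4566"  # default LocalStack edge port

def strip_dynamodb_statements(policy_doc):
    """Return a copy of an inline policy document with all dynamodb:* actions removed.

    Statements left with no actions at all are dropped entirely.
    """
    statements = []
    for stmt in policy_doc.get("Statement", []):
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        kept = [a for a in actions if not a.lower().startswith("dynamodb:")]
        if kept:
            statements.append({**stmt, "Action": kept})
    return {**policy_doc, "Statement": statements}

def disable_dynamodb_access(role_prefix="QuizApp"):
    """Rewrite inline policies of matching roles; the prefix is a hypothetical name."""
    import boto3  # requires boto3 and a running LocalStack instance
    iam = boto3.client("iam", endpoint_url=LOCALSTACK_URL, region_name="us-east-1",
                       aws_access_key_id="test", aws_secret_access_key="test")
    for role in iam.list_roles()["Roles"]:
        if not role["RoleName"].startswith(role_prefix):
            continue
        for name in iam.list_role_policies(RoleName=role["RoleName"])["PolicyNames"]:
            doc = iam.get_role_policy(RoleName=role["RoleName"],
                                      PolicyName=name)["PolicyDocument"]
            iam.put_role_policy(RoleName=role["RoleName"], PolicyName=name,
                                PolicyDocument=json.dumps(strip_dynamodb_statements(doc)))

if __name__ == "__main__":
    sample = {"Version": "2012-10-17",
              "Statement": [{"Effect": "Allow",
                             "Action": ["dynamodb:Scan", "logs:PutLogEvents"],
                             "Resource": "*"}]}
    print(strip_dynamodb_statements(sample)["Statement"][0]["Action"])
    # disable_dynamodb_access()  # run against the live demo stack
```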
So this is really how you can see how this works. And one of the new innovations that we now have is soft mode. So I can actually disable the enforcement now. And I can now refresh the application here one more time. And this should now actually give me a... It's just a warning signal, right? So this time, if I refresh here...
Yeah, it's giving me a warning. It should have actually made the call in the background, so I'm actually surprised that I'm not seeing the result here. But it's generally just doing like a soft enforcement, right? So it's not fully blocking it, but it's just informing you that there's a warning, that there's a policy violation happening here, but not sort of blocking fully the access. So those are some of the things. Is it because your table is empty?
Of course, right. Yeah, because the table is empty, because we removed the data pod that we had loaded. Exactly. 100%, you're fully on point. Yeah, exactly. Well spotted. But those are the two options that we have here, right? The enforcement mode or the soft mode, which basically gives you a warning that something is not accessible. Cool. Awesome. So going back to our flow here and
So we looked a bit at-- oops, that was our-- I'll go to my slideshow here again.
So we looked at the Cloud Pods. We talked a bit about IAM, right? So this new soft mode, which again shows us only the policy violations without blocking the request directly. Something that can be extremely helpful and useful if you have something like a large Terraform script, for example, and you don't know a priori which exact permissions you need in order to run all the API requests, right? So it actually shows you this would be required and this would be required and so on.
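As a sketch of driving the same toggle outside the UI: to my knowledge these modes map to the `ENFORCE_IAM` and `IAM_SOFT_MODE` configuration flags when starting LocalStack, but verify the flag names against the current docs before relying on them.

```python
import os

def localstack_env(soft_mode=False):
    """Environment for launching LocalStack with IAM enforcement enabled.

    ENFORCE_IAM / IAM_SOFT_MODE are the flag names as I recall them; verify
    against the current configuration documentation.
    """
    env = dict(os.environ)
    env["ENFORCE_IAM"] = "1"
    if soft_mode:
        # Soft mode: report policy violations as warnings without blocking calls.
        env["IAM_SOFT_MODE"] = "1"
    return env

if __name__ == "__main__":
    # e.g. subprocess.run(["localstack", "start"], env=localstack_env(soft_mode=True))
    print(localstack_env(soft_mode=True)["ENFORCE_IAM"])
```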
Yeah, so just moving on to a couple of other topics. So we already briefly talked about the Step Functions collaboration. So that is something that's very exciting for us. Around re:Invent timeframe, the Step Functions team at AWS came out with two new announcements, JSONata transformations and variables.
So JSONata is basically a way to have more flexibility to transform the data between steps in Step Functions. So I believe previously it was mostly JSONPath-based, which has some limitations. So you basically had to add additional steps to do concatenations, for example. And nowadays you can easily just do an ampersand concatenation of strings and a lot of other nice operations.
And the other option is variables, which is really, you can have an Assign block now in a Step Functions definition and easily use these JSONata expressions to directly assign values to variables. Again, something that cuts down the definition of Step Functions tremendously and makes them much easier to author.
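As a rough illustration of the two features, here is a minimal state-machine definition expressed as a Python dict. The `QueryLanguage`, `Assign`, and `{% ... %}` expression syntax follow the AWS announcement as I understand it; treat the exact field names as something to double-check against the Step Functions docs.

```python
import json

# A minimal state-machine sketch using the two features discussed above:
# JSONata expressions ({% ... %}) and an Assign block for variables.
definition = {
    "QueryLanguage": "JSONata",
    "StartAt": "PrepareGreeting",
    "States": {
        "PrepareGreeting": {
            "Type": "Pass",
            # Assign stores values in variables for later states to reference.
            "Assign": {
                "fullName": "{% $states.input.firstName & ' ' & $states.input.lastName %}"
            },
            "Next": "Done",
        },
        "Done": {
            "Type": "Succeed",
            # The '&' operator concatenates strings directly -- no extra Pass
            # states needed, unlike the older JSONPath-based approach.
            "Output": "{% 'Hello, ' & $fullName %}",
        },
    },
}

if __name__ == "__main__":
    print(json.dumps(definition, indent=2))
```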
And yeah, so there was this exciting announcement from the AWS blog here that I've linked, and it actually mentions LocalStack, so that makes us proud. And I think this is really a strong example where we can really go in lockstep together with AWS, because it's something that their customers are very much asking for, and we were able to come up with the implementation at the same launch date. So...
I'm very excited about that and hope to see many more day zero launch day sort of announcements together with them.
Yeah, another one, speaking about some new services in AWS, sort of in LocalStack 4.0, is Apache Flink. It's formerly known as the Kinesis Data Analytics service, basically a service for large-scale data processing in real time. Apache Flink has been around as an open-source project for quite some time, and it was one of the most requested features for LocalStack.
I'm not sure if we're going to have time because I know we're almost at the top of the hour, but there's another nice piece of documentation here. It's literally a step-by-step walkthrough. So a lot of the documentation that we now provide is literally a
copy paste, so you can literally go through three steps here: clone a sample repository, run Maven package, create some buckets, deploy an application, and then in the end, you'll actually see how the whole data processing framework processes events and stores them to an S3 bucket. So maybe not going to step through this right now, but again, please check out our docs. There's lots of good samples you can literally copy-paste and try out really quickly in LocalStack.
Right, so I guess in other news, there's maybe a few more points worth mentioning around developments in 4.0, but also looking a bit into the future. So the first one is just if you look at how we're thinking about LocalStack and the different layers that it entails.
So in a nutshell, what we provide with LocalStack is this core emulator that's running on your local machine. And first and foremost, it emulates a lot of these cloud service APIs, right? Things like Lambda, Step Functions. We have 100-plus services at this point in time, and they are being accessed by cloud tooling like Terraform or CDK, what we saw before. And this is really one of the main entry points for public consumption of LocalStack APIs.
In addition to that, we're offering features that sit on top of LocalStack and really extend the basic functionality, including the Chaos API. We didn't look at it today, but this is a way to inject errors and latencies into the APIs. I think we also looked at this last time. - Yeah, we talked about that last time as well. - Yeah, and then we have debugging features as well as the Event Studio that we saw today, which are really sitting on top of the emulator functionality and adding developer experience features.
And then the third part here is what we consider the internal local stack APIs. So these are kind of
internal APIs that allow you to control the behavior of LocalStack in certain different ways. And so far we've kept them more internal, but we're now actually opening them up more, and we're creating Swagger documentation and also SDKs for different languages to allow you to do certain automation with LocalStack programmatically.
So for example, if we just open this Swagger doc here real quick in my browser, what you'll see, and this is literally the internal LocalStack REST API from the container that's running here, you'll see a lot of convenience endpoints that we have created throughout the years. So the ones that have underscore AWS as a prefix
are service-specific endpoints, for example, certain Cognito token endpoints or IAM, where you can control the configuration and whatnot. And then everything with _localstack is literally the internal APIs, right? So here we have the Chaos APIs, for example, we have the Cloud Pods APIs where you can configure the state in various ways. We have
actually additional AWS APIs as well. So you can see it's a growing list of endpoints that we're now offering here. And we're making those more and more accessible and also creating SDKs around them, so you can start controlling LocalStack from your test scenarios and from your CI pipelines and so on.
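As a small sketch of calling those internal endpoints programmatically: the `/_localstack/chaos/faults` path matches the Chaos API mentioned above, but the exact request schema here is an assumption, so compare it against the container's Swagger documentation before use.

```python
import json
from urllib import request

LOCALSTACK_URL = "http://localhost:4566"

def build_fault_config(service, region="us-east-1", code="ServiceUnavailable"):
    """One fault rule for the Chaos API; the exact schema is an assumption --
    check the container's Swagger doc for the real shape."""
    return [{"service": service, "region": region,
             "error": {"statusCode": 503, "code": code}}]

def inject_fault(config):
    """POST the fault rules to the internal Chaos endpoint (needs LocalStack running)."""
    req = request.Request(f"{LOCALSTACK_URL}/_localstack/chaos/faults",
                          data=json.dumps(config).encode(),
                          headers={"Content-Type": "application/json"},
                          method="POST")
    return request.urlopen(req)

if __name__ == "__main__":
    print(build_fault_config("dynamodb"))
    # inject_fault(build_fault_config("dynamodb"))  # then run your test scenario
```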
So definitely more on that coming out in the upcoming weeks and months as we're building out the SDKs and APIs here. And also we'll share a lot more examples of why this is useful, right? So you might be asking yourself, why would I use that? And we have a bunch of
advanced testing scenarios where we can showcase why, for example, injecting chaos in your test case can be very beneficial and we give you the tools to do this more easily. Right, okay. Yeah, that's exactly what I was going to ask you. So why would I use this? Yeah.
Exactly, yeah. I mean, to a large degree, a lot of what you see here is actually transparently under the covers used by our dashboard, right? So our web app, including the events tool and everything, like a lot of the functionality you see here calls these internal endpoints kind of transparently, but we also want to open this up more because we believe that there's some use cases where this can be beneficial, but it has to be, as you mentioned, it has to be
more explicitly exposed and clear why to use that. - Okay.
- Right, yeah, just maybe one or two more pointers. So the AWS Replicator is something that we briefly touched upon before, so enabling these hybrid scenarios. And just one point I wanted to get across here is that there are different modes for enabling hybrid scenarios. So we have the copy mode, which really allows you to basically start from an existing AWS account and then kind of fetch the details from there. Let's say an SQS queue, for example, or something else.
and then basically copy these resources into your local stack instance, right? So that's copy mode. The second one... When you say copy the resources, you mean copy the configurations for those resources? So the goal is actually to copy even both. So, like, if it's a database, then it would be schema plus data, or for SQS, it would be... Right, okay. Yeah. Okay, I see. But...
Yeah, I think predominantly, you know, very much configuration, but also contents, right? Because we also see some benefit to actually having the full message content of an SQS queue that you can copy locally. The cool thing is that we can literally create clones in terms of even the same IDs, right? The same message IDs and handles that will then basically become part of the local queue, right?
And then a third mode, which we call auto-create, is something we're also currently building out, where the idea is you make a request to an API. If it already exists, you get the result. If not, it fetches the details and kind of lazily creates the local representation of that. So it's kind of a combination of both in a sense. Yeah, I guess the problem I see with the copy mode, in terms of copying data, is that, A, it can be quite slow depending on, say, the size of a DynamoDB table. You can also cause
problems in terms of, say, if you're pulling from an SQS queue that has got a redrive policy, suddenly you can force the messages in the real queue into going to the dead letter queue. There also may be cases where you just can't pull the data. Things like EventBridge, where the events are transient. And there's other things that may require...
you know, private VPC connections, the right networking setup for accessing databases like RDS and ElastiCache, and there's just, I guess, a lot of edge cases. It just feels, okay, a little bit dangerous sometimes. Yeah, absolutely. Especially the ones where you have side effects, right? For example, consuming an SQS message can have a side effect in terms of the visibility timeout. Yeah, exactly.
as well as the ephemeral nature of EventBridge, where you're probably more interested in copying the rules, right? The rules for forwarding. Yeah. But certainly, again, I would say it's also in the bucket of an area we're currently exploring, right? So we're providing this tooling. And by the way, this is really easily accessible also in the CLI. So we have this LocalStack Replicator CLI now where
you can basically start these replication jobs, and there are different configurations to do that. Right. Again, in the interest of time, not going through all of it, but again, something where we're very much looking for active feedback from the community to further build this out. Because we do see a lot of opportunity in these hybrid scenarios, but the devil's in the details, right? So you really need to figure out how exactly to configure them and everything.
Cool. Yeah, maybe the last point to just quickly point out is, as I also had in our timeline, earlier in 2024 we released the first version, a preview version, of LocalStack for Snowflake. Again, we're expanding beyond AWS as we speak. We're becoming a multi-cloud product. We're working very actively with a number of early adopters for this new product. And please reach out if you're interested in it.
There's a whole bunch of features already available. You can create databases, tables, schemas, stages, UDFs, and a couple of other things. So definitely something that's picking up pace as we go along. And that's very interesting. As Snowflake and AWS are expanding their own partnerships and building out integrations, there's some really cool innovations we can already start to showcase. For example,
connecting a Kinesis stream and streaming data into a Snowflake table, for example. Or, of course, S3 buckets that can be used to store things like so-called Iceberg tables, which is a storage format that is kind of interoperable. So there are some really interesting cross-cloud use cases which we're exploring. We'd love to get your feedback if you're at all interested in the Snowflake emulator.
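For anyone curious what that looks like from client code, a connection sketch might be as follows. The hostname is based on the LocalStack Snowflake docs as I recall them, and the credentials are dummy values; verify both against your local setup.

```python
# Connection parameters for the local Snowflake emulator. The host is taken
# from the LocalStack Snowflake docs as I recall them; credentials are dummies.
CONNECT_PARAMS = {
    "host": "snowflake.localhost.localstack.cloud",
    "user": "test",
    "password": "test",
    "account": "test",
    "database": "test",
}

def query_local_snowflake(sql):
    """Run a query against the emulator; needs `snowflake-connector-python` installed."""
    import snowflake.connector
    with snowflake.connector.connect(**CONNECT_PARAMS) as conn:
        with conn.cursor() as cur:
            cur.execute(sql)
            return cur.fetchall()

if __name__ == "__main__":
    print(CONNECT_PARAMS["host"])
    # rows = query_local_snowflake("SELECT 1")
```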
So I guess in that case, are you also supporting the new S3 tables, which is kind of the S3 managed iceberg tables thing?
Not yet, but it's very much on our roadmap. So if we had known about this a couple of months earlier, then it probably would already be there. But yeah, this is really a killer feature. Kudos to AWS for developing this. I think it's really pretty awesome, like expanding the notion of storage buckets to really make them schema-aware and even enabling queries. It's something we're currently working on. There is...
actually some similarity to how queries work in Snowflake. Yeah. Something we're quite excited about, but not yet available. Okay, cool. Yeah. Awesome. Yeah, cool. So I guess maybe just wrapping up with a short shout-out to the community: it's going to be a very exciting year ahead, 2025. We're super energized and we have a huge roadmap ahead of us.
A lot of features that I've showcased today that are in preview are going to go GA throughout the year, including Event Studio, the AWS Replicator, and a new web experience. For what I've shown today in our web UI, we're looking to have a completely renewed experience, because it has outgrown its own size a bit. So we're working on a redesign of the whole experience.
And there are also more exciting partnerships with AWS on the horizon, right? So kind of replicating that success from Step Functions and working, as I mentioned, with the Lambda and EventBridge teams. Nothing official yet, but there are some very early signals.
We'd love to get your feedback. Please get in touch on the Slack community, the GitHub repository, different ways to get in touch with us. And of course, always follow Yan's webinars and all the good stuff that he's creating. So kudos to you also for creating a lot of super relevant community content. And it's always great to be here. Thank you.
So yeah, I guess thank you very much for showing us what's going on with you guys at LocalStack. And good luck in 2025. And looking forward to what you guys are going to be doing, especially around the multi-cloud stuff, which I think is going to be really interesting.
Certainly, for me, I've been working with AWS for so long, but I have heard people that want to use Azure functions and they have similar problems that we've had in the past with Lambda. I'm interested to see how you see the two clouds, how they're different as you try to implement the APIs.
Yeah, 100%. Super interested in what learnings we're going to have from the similarities, but also the differences, between the different cloud stacks. So I'm super excited to share more in the course of this year, and stay tuned, everybody. Sounds good. It's good to talk to you again. And thank you so much for joining us again to tell us what's been happening with LocalStack 4.0. And yeah, best of luck.
Thanks so much for having me. Take care. I'll see you at the summit, I guess. Yes, looking forward to it. Cool. And see you guys, everyone. And see you next time. Okay, bye-bye. Bye-bye. So that's it for another episode of Real World Serverless. To access the show notes, please go to realworldserverless.com. If you want to learn how to build production-ready serverless applications, please check out my upcoming courses at productionreadyserverless.com. And I'll see you guys next time.