A Decade of AWS Lambda—Leading the Future of Serverless Computing
Join us on a serverless computing journey! Host Keith Townsend is with Amazon Web Services’ Usman Khalid, Director of AWS Lambda, on this episode of Six Five On The Road at AWS re:Invent. They look at a decade of evolution and what’s in store for the future of serverless computing with AWS Lambda.
Tune in for details 👇
– A reflection on a decade of AWS Lambda, highlighting major innovations and milestones that are changing the game for developers
– Recent enhancements in AWS Lambda that elevate the developer experience and streamline application development, particularly leading up to AWS re:Invent
– An overview of the AWS Serverless Vision and its implications for the future of cloud computing
– Insights on the integral role of security in AWS Lambda and its importance in serverless computing
– A look ahead at the future progression of AWS Lambda and serverless technologies, focusing on scalability, developer efficiency, and the continuous simplification of cloud computing infrastructure
Learn more at Amazon Web Services.
Watch the video at Six Five Media at AWS re:Invent, and be sure to subscribe to our YouTube channel, so you never miss an episode.
Transcript
Keith Townsend: All right, you’re a developer and you’re wondering where is the goodness for me? I heard about Bedrock, Amazon Q, all of this business level conversation from the show floor of AWS re:Invent during the keynote. Well, I have with me Usman Khalid, Director of AWS Lambda at AWS, and I’m your host, Keith Townsend, for this special Six Five On The Road. Usman, welcome back to the program.
Usman Khalid: Thank you so much for having me back, Keith.
Keith Townsend: You know what, we’re at AWS re:Invent 2024 here in Las Vegas, with 60,000 of our favorite friends talking about innovation around generative AI. But I’m more interested in the … I talked to another analyst, he said, “Keith, AWS is back, baby. It’s got its mojo back.” Talking about traditional challenges with Gen AI mixed in. Talk to me about your journey with AWS.
Usman Khalid: I started actually 11 years ago on the Auto Scaling team. And funny enough, this re:Invent is also Lambda’s 10-year anniversary, we launched it in 2014. And it’s actually a very public story about our architecture: we started by launching Lambda functions on EC2 instances, which were actually provisioned by an Auto Scaling group. So suddenly I come back from re:Invent 2014, and I see a large spike in launches happening through my service, and I’m like, “What’s going on with this Lambda?” And it was actually super exciting. It was operationally super challenging. Every week, every day, we’d see the load increasing in our service. But it was also a very exciting time to see the product grow that much and that quickly, and to see the hearts and minds it captured with customers so quickly, simply because developers loved the simplicity and they loved the fact that they could get a great experience and only write the code that matters.
And obviously it’s grown a lot in the last 10 years as well. I joined the team just last year, but I’ve always been excited about it. Back in my early Auto Scaling days, when the cloud was still fresh and new, we talked about how you no longer had to guess the exact number of servers you needed to provision in your data centers. You could just let Auto Scaling take care of it and let the best ideas win. But with Lambda, you don’t even have to think about servers whatsoever, or how many you need to provision. So it was just the next level of taking an idea to production the fastest way possible. That’s what we thought about Lambda 10 years ago. That’s what we think about Lambda 10 years later.
Keith Townsend: Yeah. AWS, all about the builder’s journey, helping builders move closer to the business challenge versus focusing on the infrastructure. Lambda, a critical part of that. It is the birthplace of the term “serverless.” What are some of the more exciting announcements from this week’s show?
Usman Khalid: Oh, for sure. Look, think about how old servers are, probably 40 years, I would say, give or take. Serverless is only 10 years old, so it’s not actually a particularly old technology. It’s still very much in its ascendancy right now, and obviously we’re not done with serverless either. Just recently we launched capabilities like SnapStart for Python and .NET. This is unheard-of sci-fi stuff even 10 years ago, where we are able to snapshot the initialization part of a customer’s code, so the next time we need to run a new execution of the function, we are able to start it so much faster. In some of our testing, for example, with generic applications using LangChain, so Python on Lambda, we see cold starts go from five seconds to under a second, almost a seven times improvement in the cold start experience.
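As a sketch of the pattern SnapStart accelerates: Lambda snapshots the execution environment after module-level initialization runs, so new environments resume from the snapshot instead of repeating the slow setup. The `load_index` function below is a hypothetical stand-in for expensive init work (such as building a LangChain pipeline), not anything from the interview.

```python
import json
import time

def load_index():
    """Hypothetical expensive setup (e.g. loading models or building
    a LangChain pipeline). With SnapStart, this cost is paid once at
    snapshot time, not on every cold start."""
    time.sleep(0.1)  # simulate slow initialization
    return {"ready": True}

# Module-level code runs during initialization; SnapStart captures the
# environment after this point, so restored environments skip it.
INDEX = load_index()

def handler(event, context):
    # Only this fast path runs on each invocation.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"hello": name, "warm": INDEX["ready"]}),
    }
```

The key design point is keeping heavy work at module scope (snapshotted) and the per-request path thin.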
And that just means that these exciting new applications customers are building are that much more performant. Their end users are getting a great experience, and all they had to do was turn on a flag on their function. They don’t have to re-architect their application, and it’s seven times faster. We’ve also been really focusing on what I call the inner loop. The inner loop of building an application is testing, debugging, deploying. There’s the outer loop as well, where you have to think about operationalizing it, about metrics and utilization and CI/CD and all these other things, but we’ve been really focused on making the inner loop really, really awesome.
And so we did a couple of launches just before re:Invent. First, we put Visual Studio Code, which is the IDE of choice, the fastest-growing IDE in the market, into our console. So even the first time a developer is trying Lambda, they see a familiar home to start with. And we’ve done a really amazing set of integrations in the IDE itself, running on your laptop or on the developer’s desktop, so they can remotely and locally debug and test their function. They can install Amazon Q Developer and get great suggestions while they’re typing up the code. That further accelerates even the code that they do have to write. And we built a native logging experience there as well, where they can just see what the function logs are, all in their IDE, never having to leave the IDE. So even though you’re programming without a server, you’re getting the experience of developing code the way you’re used to.
Keith Townsend: Yeah, it’s really a cultural shift, from an observability perspective and from an application architecture perspective, to learn not just the technical challenges of coding without worrying about the infrastructure, but troubleshooting, observing and improving application performance. There’s a lot to unpack in your last set of statements, but one of the things I want to focus on is that you’re reducing the friction associated with writing code. I have a business challenge. I want the logic to solve that challenge. I write that code. The easier you make something, the more you’re going to do it. So the more code you write, which is not always a great thing … How does Amazon help reduce some of the code that developers have to write?
Usman Khalid: No, absolutely. I mean, really at the end of the day, going back to our vision and our mission for AWS and for developers, it is really to be the fastest way to take ideas to production. And the more code you write … I hate to use this term, but it is true, code is a liability over time. It might feel exciting when you’re opening a fresh file and typing code into it, but someone has got to maintain it, patch it, and run it for you. So the first way Lambda helps accelerate this is that we have about 220 integrations built in, across AWS services and third-party SaaS providers. A ton of the code that developers are writing today is really about taking different systems and plugging them together. We make that super easy right off the bat. You’re not writing any integration code, you’re not managing any integration code, you’re not scaling any integration code, it just works out of the box.
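To illustrate the point about integrations: with a built-in event source such as SQS, the function receives an already-assembled batch of records, so the only code you write is the business logic. A minimal sketch (the record payload shape with `order_id` is a hypothetical example, not from the interview):

```python
import json

def handler(event, context):
    """Process a batch of SQS records delivered by Lambda's built-in
    event source mapping. The polling, batching, retries, and scaling
    all happen in the integration, not in this code."""
    processed_ids = []
    for record in event.get("Records", []):
        payload = json.loads(record["body"])
        # Business logic only: here, just collect an ID (hypothetical field).
        processed_ids.append(payload.get("order_id"))
    return {"processed": len(processed_ids), "ids": processed_ids}
```

Invoking it with a sample batch of two records returns `{"processed": 2, "ids": [...]}`; there is no queue-management code anywhere in the function.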
And for instance, just three days ago we launched a new enhanced integration with Kafka, where you are now able to run really high-throughput, SLA-bound Kafka workloads with a new provisioned poller, where we are scaling that integration for you 10x faster than the previous model. The integration part is just taken out of the picture because you’re not writing that code. Similarly, anything related to patching or resiliency, the type of code where you have to think about what happens if my server dies … all of these things are actually baked into our programming model. Customers never have to think about patching.
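The provisioned poller Usman mentions is configured on the Kafka event source mapping itself. As a rough sketch of what such a configuration might look like (the `ProvisionedPollerConfig` field names follow the provisioned mode launch for Kafka event source mappings, but the ARN, topic, and poller counts here are purely illustrative assumptions):

```json
{
  "FunctionName": "order-processor",
  "EventSourceArn": "arn:aws:kafka:us-east-1:123456789012:cluster/demo-cluster/11111111-2222-3333-4444-555555555555-1",
  "Topics": ["orders"],
  "ProvisionedPollerConfig": {
    "MinimumPollers": 1,
    "MaximumPollers": 10
  }
}
```

Setting minimum and maximum pollers lets you reserve polling capacity for SLA-bound workloads while still letting the integration scale within those bounds.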
They never think about downtime or architect their code for downtime, because the act of using Lambda’s programming model, or invoke model, just assumes your function can never run more than 15 minutes. There’s no server. Everything is ephemeral. So a lot of the burdens after the code is written, securing the product, patching the product, are all actually taken off your plate as well. Even though I say code is a liability, code written in Lambda is a much, much smaller liability, because your team is not held back. Your developers are not … as they’re building new systems, they’re not burdened by those systems. They keep moving forward, keep creating new code to solve new business problems without slowing down. And that’s one of the awesome things about building your application with Lambda.
Keith Townsend: Yeah. I think you’ve highlighted and talked through a lot of the advantages for the developer, but I’m an Ops guy, and at the end of the day, no matter how well the application is written, environments change, load changes, and I’m all about the observability of the application. Before, I’d measure I/O going to the disk, I’d measure network bandwidth, I’d measure CPU utilization, these kinds of low-level stats, to help me understand application performance and tie that back to transactions. What does observability look like when it comes to a platform like Lambda?
Usman Khalid: Great question. And look, there’s no free lunch in the world. One of the challenges that customers get with a distributed system like Lambda, where they’re composing applications from different parts, is observability, exactly like you called out. But we took this challenge head-on. Yes, it’s the nature of the application you’re building, but what can we do to make it simpler for customers? So we’ve been doing tons of work, and there’s obviously more to do in the future as well, but just recently we launched support for CloudWatch Application Signals with Lambda. Basically, just as you talked about your metrics being low-level infrastructure metrics, we wanted developers and operators of Lambda functions to think about application-level metrics. With Application Signals, right out of the box, they get metrics and dashboards for application latency, application availability, and request count. This is not even code that someone has to write or instrument. Again, going back to the previous question about how it helps customers move faster: now you’re not writing observability code, it comes baked in for your application itself.
So you’re still thinking about observability, but you’re not observing the underlying infrastructure metrics, you’re looking at your application-level metrics. Application Signals was launched a few days ago. We are super excited about it, and we’ve gotten great customer feedback. Beyond that, for example, we launched the capability to Live Tail your Lambda function. This used to take about 20 seconds, now it takes less than five seconds. After executing your functions, you’re immediately seeing live logs coming from the function. You’re not querying anything, you’re not grepping anything. They’re live logs, and you can actually see what’s happening in your live production environment at any time.
And you can run queries against that. You can do all kinds of analytics from it, but more importantly, you can debug things immediately if something is wrong. And it’s all integrated with Application Signals as well. For example, if you see a data point where there’s an unavailability, you click on that graph, it will connect you all the way down to the application logs and show you, “Hey, this is why you have an error here or a fault here,” which is incredible when you think about it. You’re not debating like, “Hey, which server failed,” you have an application-specific view of your observability.
Keith Townsend: Just as the SREs are yelling at me, “Keith, what about my role? What about observability?” I have my security professionals yelling at me, saying, “Look, I understand access control. I understand S3, I understand the tools that AWS has given me to secure my application,” but Lambda is serverless. I can’t say, “You know what? This IP address can’t access this IP address.” How do operators, security professionals and developers come together to think through security when they’re building serverless applications?
Usman Khalid: Again, a great question. Look, security is job zero. When we think about the product and its features, the first thing we think about is security. That’s how we’ve architected it from the ground up. From innovations like Firecracker, these micro virtual machines that Lambda functions run on, which give you the same isolation as a regular virtual machine with none of the overhead and really, really fast performance, to security at every layer. From using customer master keys, where you can encrypt what’s going in and out, including your code, to encryption in transit as your request is going through our system, everything is secured and encrypted all along the way.
And then we have a great integration with IAM as well. If you’re looking at access control, we have things like resource-based policies, where customers can really fine-tune who can access the function and what the function can access itself, all governed and managed through IAM, and actually secured at the organizational level as well. So it’s a totally legitimate question. The role of SREs and security professionals in serverless land is still very much to act as governors, making sure the right permissions and the right practices are being applied. But now developers don’t see it as friction, it’s actually just part and parcel of how they develop and architect their applications. What I find with customers who adopt a serverless model is that their SRE organizations, their security organizations and their programmers all end up having a very collaborative relationship, because everything is just designed to work together in these regards.
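As a concrete sketch of the resource-based policies mentioned here: a Lambda function policy (normally built up with `lambda add-permission` calls) is a standard IAM policy document attached to the function. The statement below would let Amazon S3 invoke the function, but only on behalf of one specific bucket; the function name, account ID, and bucket are illustrative placeholders, not values from the interview.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowS3Invoke",
      "Effect": "Allow",
      "Principal": { "Service": "s3.amazonaws.com" },
      "Action": "lambda:InvokeFunction",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:order-processor",
      "Condition": {
        "ArnLike": { "AWS:SourceArn": "arn:aws:s3:::example-bucket" }
      }
    }
  ]
}
```

The `Condition` on `AWS:SourceArn` is the fine-tuning Usman describes: it scopes the grant so that only events from that bucket, not any S3 resource, can trigger the function.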
Keith Townsend: Let’s wrap up. We’ve talked about SnapStart. We’ve talked about the innovation around observability. Where’s this going? What haven’t we talked about?
Usman Khalid: I think, firstly, Lambda is not general-purpose compute. Lambda is designed for very specific types of workloads: for example, bursty or low-latency web APIs; event-driven architectures, which are highly evolutionary architectures; and data processing or ETL-type workloads, which are just-in-time as well. They’re not always running. You just need to do something quickly, scale out, spin up a supercomputer of Lambdas and spin it down within seconds. That’s what the technology is designed for. When I look ahead at these emerging new applications that use LLMs and generative AI, they very much fit into the three molds I described for Lambda and serverless. They’re usually asynchronous by their very nature.
Many Gen AI applications actually use multiple LLMs. You’re not just using one model, you’re using multiple models to generate a really refined output for your end users. And they’re all asynchronous. The models take a request and they’re going to return or stream the content or values back to you. And then, as a developer, your code has to go reason over it. Asynchronous architectures, event-driven architectures, highly parallel scalable architectures are perfect fits for serverless and Lambda. And where I see this going is I see customers recognizing that this is absolutely the best technology when it comes to Gen AI applications, specifically around inference or actually productionizing your models.
So I definitely see a big aspect there, and I continue to see us focus on that inner and outer loop: how do we continue to reduce that time for developers and give them an even more delightful experience, from when they’re generating the code, using, for example, things like Amazon Q, all the way to when they’re actually productionizing their application. How do we help them get to production faster in terms of metrics, alarms, observability and security, and how do we get them the best recommendations automatically through Gen AI as well. Those are a couple of areas where I really see the next four or five years of serverless evolving.
Keith Townsend: Usman, I’m excited. There’s so much to unpack here. We’re talking about one small part of the AWS portfolio, with so much around Bedrock and Nova. And as we think about these higher-value AWS services, and what we can build with Lambda around them and all of the AI coming out of AWS re:Invent, we’re excited. Where do you go to learn more about this? Well, you’re watching Six Five On The Road, so you’re starting at the right place. Stay tuned for more coverage from AWS re:Invent 2024 here in Las Vegas, Nevada.