Driving Innovation with Generative AI on AWS

Companies of all shapes and sizes want to take advantage of Generative AI technologies. That’s particularly so in some industries, such as sports, healthcare, entertainment, and fintech. Dr. Matt Wood, VP of AI Products at AWS, discusses AWS’s approach to Generative AI and how AWS makes it easy to build and scale Generative AI customized for your data, your use cases and your customers.


Patrick Moorhead:
Hi, this is Pat Moorhead and welcome back to The Six Five Summit 2024. As you can tell, it’s all about AI at this year’s summit. It’s not only about building out the infrastructure and the tools, but also about the benefits that enterprises and even consumers are getting around that.
Now, one of the key things we’re seeing from a research perspective is the dawn of these end-to-end Enterprise AI platforms. And that’s essentially all the way from data ingest to getting your AI application to do what you need to do and everything in between from a generative point of view using AI, but obviously also machine learning.

And I couldn’t even think of a better person to have this conversation with than Matt Wood who runs Artificial Intelligence Products at AWS.

Matt, welcome to the summit.

Matt Wood:
Thank you so much, Pat. It’s a pleasure to be here. I appreciate it.

Patrick Moorhead:
Absolutely. It’s been fun watching what you have been doing from the outset with Bedrock. Well, first of all, I mean AWS is not new to AI. Industry leading cloud AI platforms for developers and enterprises. SageMaker, I believe, was the first end-to-end machine learning platform. We’re getting even more information about Q, but it is exciting stuff.

So I have a real question here, and it’s probably the most important, which is, what distinguishes AWS from all of the noise out here that we’ve heard of agents and co-pilots and assistants?

Matt Wood:
Yeah, sure. I think number one, thank you for having me again, appreciate it. I think number two, this is just an incredibly exciting time for machine learning, for artificial intelligence, for cloud computing, for technologists, but also for business leaders who are looking to seize the opportunity of this technology to completely reinvent their own customer experiences, to kickstart or accelerate their journey through digital transformation. And not just reinvent their own business and their own products and their own processes, but drive that transformation deep into their industry as well.

So it’s an incredibly exciting time. This is probably the single largest shift in technology of how we’re going to interact with data and information and each other probably since the advent of the very earliest internet. And organizations that invested in that early internet 30 years ago, including amazon.com, went on to experience pretty tremendous growth in the intervening decades. And it’s my guess that today customers that are investing in generative AI in a similar way are going to experience multiple Amazons of growth over the next couple of decades as well. So it really, really is an exciting time.

As you say, we’ve been doing a lot of work around machine learning and artificial intelligence at AWS for many, many years. I actually helped launch the very first Amazon machine learning service all the way back in 2016, 2017, I think. That evolved to become SageMaker, which has really become the de facto standard way that most organizations have been working on machine learning. We have hundreds of thousands of customers using SageMaker today, but generative AI is really changing the game again in a number of different ways.

There’s two really remarkable changes that are happening at the same time that in isolation would be interesting and exciting, but happening together I think is really demonstrating why generative AI is different and why a lot of customers are excited about it.

The first is there’s just a step function increase in the complexity of tasks that you can now solve using machine learning through artificial intelligence. These tasks were once very one-dimensional. You could take simple inputs and build a model to predict simple outputs and then more complicated outputs. But today the complexity of the type of problems that we can solve is just getting combinatorially larger as the technology continues to improve.

At the same time, the skills required to be successful with machine learning and artificial intelligence are reducing. And as a result, the technology is democratizing and becoming much more accessible than it ever has done before. And so the result of that is that more and more customers are getting started either training and tuning their own models. The large foundation frontier model providers like Anthropic and Mistral and Amazon and others, there’s a lot of that work happening.

But increasingly, I think what we’ll see is that more customers, because the technology is democratizing, are able to take those foundation models as a starting point and use the underlying technology to customize them, to specialize them in really meaningful ways for their organization.

We also see a huge increase in the usage of people just using generative AI to be able to build net new applications. And there we make available a service called Bedrock. So foundation models and Bedrock, see what we did there. So Bedrock gives you access to the broadest set of foundation models that are available anywhere. It gives you access to the broadest set of additional capabilities for not just prototyping, but also taking those prototypes through to production with guardrails and knowledge grounding and fine-tuning and agentic systems. And in addition to that, is able to deliver all of that not just in a cul-de-sac where all you can do is prototype, but in an environment where any organization can bring their most sensitive data and trust that that data will not be used to improve the underlying models, won’t be reviewed by humans, doesn’t travel over the public internet and all those sorts of things.

And so through Bedrock, we’re really seeing, we have tens of thousands of customers using Bedrock today to build and transform their organization precisely because to do that you need access to data, and to be able to access data with these models, you need to be able to have the right security, confidentiality and privacy controls.

And then finally, we’re making a set of applications available to customers, which we call Amazon Q, which is by far the most capable assistant, which is available to complete tasks at work. So we have tasks that will help you write software more quickly, analyze data more quickly, and manage knowledge inside your organization more effectively as well.

So for all of those reasons, the majority of customers that are getting started on their generative AI applications today are doing so starting with AWS, and those that maybe started prototyping earlier at the beginning of last year are migrating to AWS really as quickly as they can.

Patrick Moorhead:
I appreciate that, all those comments there. And you know, whenever we get a major inflection point in the industry, it comes down to use cases, it comes down to workloads that support those use cases. Then we get into this ever-present horizontal versus vertical and industry matrix. And I’m curious for you at this point right now, are there certain industries and use cases that are bubbling to the top, first mover type of things? Because when I talk to enterprises, some of their biggest questions are where do I start and what do I do? And POC and then beyond POC to really get things going.

Matt Wood:
Yeah, it’s a good question. I think every customer that I speak to, which is several hundred at this point, but every customer I speak to, I really have not seen this level of energy and enthusiasm and momentum. And most organizations are being very deliberate. They’re finding the use cases that they think can be really meaningful. They’re prototyping, some all the way through to production, and being pretty successful with the majority of them.

But there is a group that’s kind of moving faster than the average. And at least to me, it was slightly counterintuitive, but it’s actually the regulated industries that are moving faster than the average in this new world. It is interesting. I think if you’d have even told me a year ago that 160 year old life insurance companies were operating in the vanguard of artificial intelligence, I probably would’ve been pretty surprised to find out that that was the case. But it is what we’re seeing today.

There’s a couple of reasons for this I think. The first is that a lot of the regulations around financial services, life sciences, healthcare, manufacturing, insurance, these regulated industries have had to comply with governance requirements around their data for 20 years or so.
And that compliance may have felt like a bit of a headwind at the time, I’m sure, but it actually forced all the right behaviors inside these organizations to get their data quality up, to put the right governance in place so they understood what data they had, what it could be used for, which tools it could be used with, who could do what with it, and they have the right privacy and security controls around that data. And those are all the things that you need to do to be successful with generative AI. And these groups have already done it. So as a result, the incremental investment to be successful with generative AI is actually relatively small and they’re able to move much faster as a result. So that’s the first reason.

The second is that most of these organizations have absolutely huge volumes of private text data. If you think of market reports or clinical trials or manufacturing results or health records, these are not on the public internet. The foundation and frontier models have never seen them before, but there is so much information and net new knowledge inside them that organizations want to be able to leverage in more meaningful ways. So sometimes that can be just summarizing what’s in there.

Increasingly, it’s connecting the dots and finding similarities and differences, and then using that information for brainstorming to come up with new ideas and new strategies inside the organization that are going to be really, really meaningful. And so having access to that text data in a way which is well governed, and being able to find the right model through Bedrock to be able to handle all of those different use cases has been really transformative.

And then the third area is that I think, charitably, these organizations, in part because of compliance but also just because they are traditionally a little bit more conservative, are just earlier in their digital transformation journey. And I think that they have been looking at the transformation that has happened in hospitality and travel and media and gaming, the Netflixes and Airbnbs and Expedias of the world that have been able to use the cloud to disrupt and transform their entire industry. They’ve been looking at that generously from the sidelines. And today they see generative AI through the cloud as not just a way to catch up with that level of transformation, but actually lead from ahead of it.

And so for all of those reasons, it’s actually those, so counterintuitively, it’s actually those types of industries that are moving the fastest to deliver value in production on generative AI.

Patrick Moorhead:
I’ll admit, as analysts, we’re not supposed to say we’re surprised, but I’m surprised that you brought that up because typically they’re the laggards when it comes to that. But in my conversations with a lot of enterprises, it’s like security and data, right? Data management and security. And it sounds like an interesting shift in who the fast movers normally are, and here we are.

Matt Wood:
I think that that may not be the last surprise that we see.

And here is a really discontinuous flux. You should expect to be surprised along the way. I think a lot of organizations are surprised by how easy it is to prototype with this technology. I think they’re surprised to find the breadth of utility of this technology. I think they’re surprised by the journey they have to go on in some cases to be able to take those prototypes and take them into production.

So a lot of what we are focused on is helping with those sorts of surprises, helping customers through the prototyping, the use case identification, and the delivery into production. Because there is actually a little bit more work than I think most people appreciate there after the initial conviction that you get from a successful prototype.

Patrick Moorhead:
For sure. So this track is about end-to-end generative AI platforms or Enterprise AI platforms. And you have a managed service for generative AI or foundational models called Bedrock. And sometimes we can all dive into the tech and deconstruct it and confuse everybody, but a lot of the times it starts with an elevator pitch. What’s the elevator pitch for Bedrock that is resonating with your customers?

Matt Wood:
Yeah, I think it’s pretty straightforward. Bedrock, customers tell us, is the fastest, most cost-effective, safest way to deliver generative AI applications into production. That’s it. And under the hood, in order to be able to do that, there’s a lot of pieces that we need to be able to deliver, and deliver in ways that are better than anyone else is able to do, with a breadth of capability that is broader than anyone else is able to deliver. And because this area is moving really, really quickly, we have to keep on top of that innovation as well. And you see that across AWS. We’ve actually delivered twice as many net new features for machine learning and artificial intelligence into general availability than all of the other cloud providers combined.

And so it’s not enough just to have the breadth. It’s not enough just to have the models. It’s not enough just to have the best models. You also want, customers tell us, platforms that are able to keep up with this pace. And from that perspective, it doesn’t even seem much of a fair fight to me.

Patrick Moorhead:
I like that. I like it. I like that a lot.

By the way, one thing I really appreciate about Bedrock is its simplicity and the fact that you don’t even have to pick a compute method. It picks the right one for you. And quite frankly, that’s another element of simplification that enterprises and governments are looking for. So congratulations on going GA on Q. We wrote a research paper out there that I thought was pretty good and is getting some pickup out there.

A lot of the questions that I get as an analyst and pundit company out there is, “Hey, how is it different?” We have assistants. We have co-pilots. We have agents. We have all these different terms being thrown around by a bunch of different companies. Let’s talk why Q? What sets it apart?

Matt Wood:
Yeah, for sure. I mean, Q is today the most capable assistant for completing tasks at work. And you can think of Q as a way to interact with an assistant that is an expert in lots of different pieces of your organization and helps with lots of different tasks inside of your organization. And so we have by far the broadest set of those tasks that we can help with.

Q is particularly good at working with software development and accelerating software development. So Q can help you write code, it can help you write tests and understand code and documentation, all those sorts of pieces. That’s a really big accelerator. And we found that today the code generated by Q is not only more secure than other assistants, but is also used and consumed and accepted in more cases than any other assistant that we’ve seen with published results.

So customers have told us that they actually accept about 50% of the code that Q generates, which is higher than anyone else. So just from a software development, code generation, code suggestion task, you can expect to see pretty significant speedups for software development there.

And really what we’re trying to do in all of these tasks is kind of what we did with cloud computing. With cloud computing our original goal was to … our original observation was that about 70% of the work of putting an idea together wasn’t the building and the construction and the idea and the product development. It was just this sort of undifferentiated muck of having to do the work and provision the infrastructure in the terms of the cloud and get the servers and the storage. And nobody wants to do any of that. And what we’ve noticed is that this 70/30 balance is actually pretty common with lots of different areas inside the enterprise and the type of work that we do.

Software development is a really good example of that. If you talk to software developers, only about 30% of the time that they’re spending writing code is on the code. The other 70% is on the undifferentiated muck. And so what Q is trying to do is just remove that undifferentiated muck. And we do code generation, and that’s very useful.
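The 70/30 split described above lends itself to a quick back-of-the-envelope calculation. The Python sketch below is illustrative only: the function names and the `muck_speedup` parameter are hypothetical, the 70/30 figures are the speaker's rough estimates, and the second function is a generic Amdahl's-law style estimate rather than anything AWS has published.

```python
def focus_multiplier(differentiated_share: float) -> float:
    """Growth in differentiated time if it expands to fill 100% of the time."""
    if not 0 < differentiated_share <= 1:
        raise ValueError("differentiated_share must be in (0, 1]")
    return 1.0 / differentiated_share


def overall_speedup(muck_share: float, muck_speedup: float) -> float:
    """Amdahl's-law style estimate: only the muck portion gets faster."""
    if not 0 <= muck_share <= 1:
        raise ValueError("muck_share must be in [0, 1]")
    if muck_speedup <= 0:
        raise ValueError("muck_speedup must be positive")
    # Time left over = untouched differentiated work + accelerated muck.
    remaining = (1.0 - muck_share) + muck_share / muck_speedup
    return 1.0 / remaining


if __name__ == "__main__":
    # Going from 30% to 100% differentiated time.
    print(round(focus_multiplier(0.30), 2))
    # Hypothetical case: the 70% muck becomes 10x faster, the rest is unchanged.
    print(round(overall_speedup(0.70, 10), 2))
```

Under these assumptions, removing the muck entirely roughly triples the time available for differentiated work, which is one way to read the "30% becomes 100%" framing.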

But where Q really shines is we have a set of helper assistant agents, which will not just generate code for you, but will take an objective, work backwards from that objective, create a plan, and then systematically work through that plan to complete higher order tasks on your behalf.

So as an example, Q has a feature development agent which will take in not, “Hey, I want to write a new sort function for this data,” but there you say, “Hey, I want to write a favorites list for my social media application.” And Q understands the code, it understands all the code that you’re working with and everything inside the repository. It looks at all of that code and it comes up with a strategy. It says, “Here’s the code that we need to rewrite. Here’s the additional code that we need to add. Here’s the test that we need to run as we’re going to make sure nothing breaks. Here’s the additional tests we need to add. Here’s the documentation,” and so on and so forth.

And you can review that strategy upfront, but when you’re happy with it, you just hit go and Q works through it systematically. And you can watch it go and it’ll solve problems as it goes and pulls in information as it needs. And it’s pretty fascinating to watch. But at the end of it, you get a completed update, a set of pull requests of new code that you can commit across your application, which develops that feature.

And so this isn’t a 1.5 or 2x speed up. This is more like a 10 to 50x speed up in terms of developer productivity. You’re removing all of that muck associated with developing a new feature and allowing developers to take their idea and get it into the product in an order of magnitude less time.

And we’ve also got them for other just muck work like transformation, code transformation or migration. So moving from one version of Java to another version of Java. That’s just undifferentiated fixed cost work. No engineer thanks you for tasking them to do that work. The best you end up with is you end up right back where you started, but like two months later. It’s just yech.

And so what Q will do is say, “Hey, I see what version of Java you’re running on. I’m going to produce a strategy to migrate you and do all the code optimization and run the tests to safely migrate you from one version to another version.” The developer just clicks a button and reviews the code and you’re done.

And so there you’re really looking at two orders of magnitude speed up in terms of delivery. And when you look at Q from a software developer perspective against all the other benchmarks that are out there today, publicly available today, Q is the king of the hill. So if you look at what’s called the SWE-bench benchmark or the SWE-bench Lite benchmark, Q is the best performing agent for completing software development tasks available anywhere today.

And so just for software development, there’s this one or two order of magnitude improvement by removing all of that muck. And we’re doing the same thing with data analysis, letting you ask questions of your data, get responses and answers and relevant visualizations in seconds, and being able to identify and build applications around the data inside your organization to be able to automatically drive higher levels of automation using that data.

And so again, it’s just a completely different world when you’re operating with Q because it has access to by far the broadest amount of information, it supports the broadest number of tasks and the tasks that it supports are industry leading and defining in many ways. And so again, it’s a totally different world on AWS than anywhere else today.

Patrick Moorhead:
Yeah, and by the way, I saw the figures on the developer scores and not that I was … I mean, AWS was formed for developers to get infrastructure out of the way 15 years ago. And your continued focus on developers is not a surprise, but it’s also nice to get a benchmark win out there for the capability of this specifically for developers. I also liked the connectors you had to enterprise apps. I liked the supply chain stuff that Q was doing, and I’m certain there’s going to be even more to come out later.

So in transitions and in new waves of technology, there’s a lot of debates out there. And one of these is big model, small model, horizontal model, vertical model, SLMs. But I’m curious, what is your approach to foundational models that you think is the best path forward for your customers?

Matt Wood:
Yeah, it’s a good question. I mean, the answer to all of those questions is yes, right? There is not one model. There is not one model architecture. There is not one model family that can fit every use case. There’s not going to be one that meets all the use cases.

It’s kind of instinctual. Honestly, if you go and talk to enterprises, they’ll often start their generative AI journey by looking bottoms up at all the ideas across the organization as to how they might apply this technology. And sometimes customers send them to me as a kind of gut check, and I’ve been sent spreadsheets of 600, 700 ideas of everything across the organization. It’s just impossible to imagine that a single model is the best model to fit all of those different use cases.

Instead, our instinct, and this is turning out to be correct in large part because we see so much success from customers on AWS, but also competitors are copying us as quickly as they can, that you really want access to a broad set of models so that you can match the use case to the special capabilities of that model.

So sometimes, as you were saying, sometimes models have different sweet spots. Sometimes they are very general. Sometimes they’re very specialist. Sometimes they’re good at different tasks like summarization or natural language or code generation and interpretation or analysis or reasoning, whatever it might be. Other times they just have different operational characteristics. Sometimes they’re really fast. Sometimes they’re really low cost. Sometimes they’re really intelligent.

And in every isolated case, you want to be able to take advantage and lean in on the sweet spot of that model to be able to get the best possible results. And that means that those results are at the price point that you’re comfortable with, have an accuracy that is actually useful and are reliable and consistent in a way that you would actually put your company name behind it.

So in isolation, you really want access to the broadest set of models in order to be able to maximize your chances, so you can find the right sweet spot for your use case in isolation. But in aggregate, you want to be able to solve all of those 600, 700 ideas across your organization, and to do that, you need a lot of models.

But the super power of generative AI is in the combination of the models. It’s in the combining of those sweet spots to be able to form an assistive system or intelligent system, which is stronger than the sum of those parts. And that’s really why a lot of customers are flocking to Bedrock. The majority of customers on Bedrock use more than one model in their applications. And the reason for that is because there is a compounding effect of intelligence when you put these models together. It’s a multiplier, not an addition.
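The idea of matching each use case to a model's sweet spot can be sketched as a small selection routine. This is a toy Python illustration, not Bedrock's API: the model names, prices, latencies, quality scores and the `pick_model` helper are all hypothetical values invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelProfile:
    name: str                  # hypothetical model name
    cost_per_1k_tokens: float  # hypothetical price
    latency_ms: int            # hypothetical typical latency
    quality: float             # hypothetical 0..1 score
    strengths: frozenset       # tasks the model is assumed to be good at


CATALOG = [
    ModelProfile("small-fast", 0.0005, 200, 0.60,
                 frozenset({"summarization"})),
    ModelProfile("mid-general", 0.003, 600, 0.78,
                 frozenset({"summarization", "code generation"})),
    ModelProfile("large-reasoner", 0.015, 1500, 0.92,
                 frozenset({"reasoning", "analysis", "code generation"})),
]


def pick_model(task: str, min_quality: float,
               max_latency_ms: Optional[int] = None,
               catalog=CATALOG) -> Optional[ModelProfile]:
    """Return the cheapest model that fits the task and the constraints."""
    candidates = [
        m for m in catalog
        if task in m.strengths
        and m.quality >= min_quality
        and (max_latency_ms is None or m.latency_ms <= max_latency_ms)
    ]
    if not candidates:
        return None  # no single model fits; loosen constraints or add models
    return min(candidates, key=lambda m: m.cost_per_1k_tokens)


if __name__ == "__main__":
    for task, quality in [("summarization", 0.5), ("code generation", 0.9)]:
        chosen = pick_model(task, quality)
        print(task, "->", chosen.name if chosen else "no match")
```

With several hundred use cases, different constraints select different models, which is the practical argument for a broad catalog. Combining models in one application would layer on top of a selector like this.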

And so for those reasons, plus the capabilities, plus we’ve got access to the best models with Claude 3 Opus, got access to the broadest set of guardrails, the best guardrails, agentic systems, privacy, security, it’s pretty clear why customers are choosing Bedrock in their droves.

Patrick Moorhead:
Which by the way, have it your way has always been a hallmark as long as I’ve been covering you, your compute as an example. I mean, you want big, you want little, you want our branded stuff, you want merchant silicon, whichever way you want it. You want to do containers. Here’s the five ways you can do containers. So it really is a … it’s what I expected from you.

Hey, let’s shift to a lot of conversation about responsible AI. And I’ll admit, I try to pay attention every time I’m getting briefed. I know it’s important. Responsibility has a different definition to different people. And I’m curious, what does it mean to you? What does responsible AI mean?

Matt Wood:
Well, to me, it really means that you’re taking seriously the benefits and the shortcomings of every component of your application and the system, and really understanding where the limitations are and doing everything that you can to control and reduce those limitations.

And what that means is that in aggregate, the final agent or the final assistant or whatever it is that you are building, the final helper will be able to operate in a way which is safe, which delivers reliability, which has consistency, that has truthful results. And that doesn’t go off on a tangent and potentially talk about things that you don’t want to talk about.

And so you can kind of think of these foundation models when you take them off the shelf, they’ve got a pretty good set of guardrails built into the models off the shelf. But in every case that we found either internally at Amazon or working with customers in all of the domains that we’ve been talking about, regulated and non-regulated, every customer wants the ability to shroud those models in additional layers of control. And sometimes that control is grounding the model in the reality of your own business.

So off the shelf, these models, they’re trained on the internet most of the time. They’ve got a very, very broad knowledge of a lot of different concepts. That’s really critically important. But most organizations, they operate at depth. And at depth it’s a bit more like Swiss cheese down there. There’s areas of information density and areas of information sparsity. And mistakes are made when these models stumble into these Swiss cheese gaps and are still trying to generate responses even though they don’t have a lot of information to rely on.

And so a lot of what customers spend time on with Bedrock is how do I plug those information gaps? How do I plug those Swiss cheese gaps such that the models have more information to rely on? Or how do I constrain the model so that it behaves the way that I want it to, to adjust the style and the tone, to be able to integrate more safely with our first and third party systems and data?

And then finally, how do I just ensure that it stays on topic, that it only talks about either for the inputs that get to the model or the outputs that come from the model that get back to the system or the user? How do I ensure that it stays on topic? And so we make available today the best performing set of guardrails to enable you to do that.

And so with Bedrock you can say, “Hey, here’s all the rules. Here’s everything that I don’t want you to ever talk about. I never want anything to do with these terms and these rules get anywhere near the model.” But also, we allow you to say things like, “I don’t know why you’d ever want to do this, but I don’t want to talk about anything to do with basketball.” So you can say to the model, “I don’t ever want anything to do with basketball to get near the model,” and if the model for any reason starts talking about basketball, I want to detect it and reject it so it doesn’t get back to the user.

Now, that’s a very simple example, but in addition to just filtering based on keywords, we can filter based on entire topics. And that’s what we do, which is so transformative for organizations, and it gives them the confidence to be able to not just bring their prototypes, but their real production systems that they’re going to give to their internal teams, that they’re going to put their company logo on and make available to their external customers.
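The two-sided check described above, screening what goes into the model and what comes back out, can be sketched in a few lines of Python. This is a toy stand-in and not the Bedrock Guardrails API: the `ToyGuardrail` class, the `guarded_call` helper and the cue-word lists are hypothetical, and a real topic filter would classify meaning rather than match words.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Optional

BLOCKED_MESSAGE = "Sorry, I can't help with that topic."


@dataclass
class ToyGuardrail:
    blocked_terms: set = field(default_factory=set)     # exact words to reject
    denied_topics: dict = field(default_factory=dict)   # topic -> set of cue words

    def violation(self, text: str) -> Optional[str]:
        """Return a short reason if the text trips a rule, else None."""
        words = set(re.findall(r"[a-z']+", text.lower()))
        for term in self.blocked_terms:
            if term in words:
                return "term:" + term
        for topic, cues in self.denied_topics.items():
            # Cue words are a crude proxy for real topic classification.
            if words & cues:
                return "topic:" + topic
        return None


def guarded_call(guardrail: ToyGuardrail,
                 model: Callable[[str], str],
                 prompt: str) -> str:
    """Check the input, call the model, then check the output."""
    if guardrail.violation(prompt):
        return BLOCKED_MESSAGE      # the prompt never reaches the model
    reply = model(prompt)
    if guardrail.violation(reply):
        return BLOCKED_MESSAGE      # the reply never reaches the user
    return reply


if __name__ == "__main__":
    guard = ToyGuardrail(denied_topics={"basketball": {"basketball", "nba", "dunk"}})
    print(guarded_call(guard, lambda p: "Here is your invoice.", "Show my invoice"))
    print(guarded_call(guard, lambda p: "n/a", "Who won the NBA finals?"))
```

The design point is that the same policy object wraps the model on both sides, so a denied topic is caught whether it arrives in the prompt or appears in the reply.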

There’s a huge amount of brand equity and loyalty and trust built into these organizations. And guardrails like this are one of the components that we provide that allows them to, with confidence, deliver those sorts of experiences.

Patrick Moorhead:
Yeah, it’s a very pragmatic approach. I mean, in one side you have a doctor or something in medical. On the other side you have a gaming site where people are throwing across different things. And I saw a ruling out of a different country of an airline who made a commitment on something too. And being responsible could be even being accurate as well and giving accurate results. But I like your definition.

Great conversation. Final comment here. I’m an advisor to the C-suite among others, and there’s a lot of conversations about automation, job displacement, job and then substitution and keeping a human in the loop. And I’m curious, is it as simple as saying keep the human in the loop long enough to make sure it’s accurate, or is it something different than that? How are you advising your customers on this?

Matt Wood:
Yeah, it’s a good question. I think even the most bearish on artificial intelligence would agree that the type of work that we’re going to do over the next couple of decades is going to meaningfully change.

If you accept my stipulated argument earlier that there’s kind of 70% of the work that we do today is kind of undifferentiated, yeah, that’s the stuff that we’re going to get rid of. And as a result, we’re going to free up all of our time to focus more of our attention and our time and our energy and our efforts on what we only get to do 30% of the time today. So that 30% will become 100% in all cases. But the work that we are going to do is going to change.

And that is a very human problem, and it requires us to approach the changes I think with empathy. It requires us to approach these changes through a human lens. It requires us to be really focused on helping everybody whose job is going to change to really upskill, understand how artificial intelligence can play a role in expanding that 30% to the 100% so they can spend more time on the really differentiated work that is where they want to spend their time anyway.

So I wouldn’t assume that in every case that is just going to be automatic. We really do have to be humane, empathetic, emotionally invested in the change that a lot of us are going to have to go through in order to be able to make that change. And there’s going to be some upskilling, there’s going to be a lot of training, there’s going to be a lot of investment, a lot of change management, all those sorts of things. And as we go, and as those roles change, we’ll see faster and faster growth.

And I suspect that, again, another surprising kind of second-order effect is that for organizations that are able to plot that course and move from 30% of the time spent on differentiated work to 100% of the time spent on differentiated work, their businesses are going to grow disproportionately to that focus.

And so as a result, I suspect where that is most effective, could be software development, could be contact centers, whatever, we’ll end up with more people doing that work because we tend to invest in areas where we see the largest growth and the largest return on investment. And so I suspect we’re going to end up with far more people working in software development. The work that we do will be meaningfully different, but I think we’ll end up with far more people working on it as an example.

Patrick Moorhead:
Yeah, Matt, history’s on your side too. If I go back 35 years, desktop publishing was going to kill the creatives. Moving from machine code to COBOL meant we weren’t going to need any programmers anymore. I heard the same thing about IDEs as well. And what happens is we create new, different types of jobs. We create, we allow people to have superpowers to do even more. I remember when PowerPoint was a threat to certain industries.

Now, I like the way you talked about, it has to be planned and we have to do it with empathy and respect. Otherwise, what we didn’t do right as a country, and this isn’t technology, but I grew up in Cleveland, Ohio, in the Midwest, and we decided to get rid of all the jobs and we didn’t retrain anybody. And bad things happen when those things happen. But I’m optimistic just because we’re having these conversations that this is going to happen.

Matt, I got to tell you this. I can’t believe that this interview has been great. I really thank you for spending this time.

Matt Wood:
Thank you. I really enjoyed it as well. Appreciate it.

Patrick Moorhead:
Appreciate that.
So this is Pat Moorhead and Matt Wood from AWS talking about end-to-end Enterprise AI platforms. I’m sure you’ve heard about Bedrock. I’m sure you’ve heard about Q. Dive in and learn more if you don’t. Stick around for more Enterprise AI platform action. And hey, if you’re into infrastructure, we have that, plan computing, enterprise apps. We are covering the gamut here on these, I was going to say the AI Summit, The Six Five Summit 2024. It is kind of an AI summit. So thanks for tuning in and take care.
