Build and Scale Your AI Apps on Azure’s App and Infrastructure Platform

In this new Age of AI, enterprises face unique challenges running AI apps in production. Hosts Patrick Moorhead and Daniel Newman are joined by Microsoft Azure’s Jeremy Winter, Chief Product Officer and CVP, Azure Core, and Rani Borkar, Corporate Vice President, Azure Hardware Systems & Infrastructure at Microsoft, for this episode of Six Five On The Road. 📻 Tune in to learn how Azure is paving the way for businesses to deploy and scale AI applications effectively.

Their discussion covers:

– Innovations across Azure’s app and infrastructure platform to aid businesses in scaling AI applications

– The seamless developer experience provided by the integration of Azure’s app and infrastructure stack

– The role of Azure’s custom silicon portfolio and industry partnerships in offering more choices to customers

– The vital components of AI app scaling beyond managing AI workflows, model selection, and performance

– Insights from Microsoft Ignite about Microsoft’s approach to application and infrastructure for AI

Learn more at Microsoft Azure.

Watch the video below at Six Five Media at Microsoft Ignite and be sure to subscribe to our YouTube channel, so you never miss an episode.

Transcript

Patrick Moorhead: The Six Five is On the Road here in Chicago at Microsoft Ignite 2024. The event has been everything you expect from Microsoft. I mean, imagine this. We’re talking about enterprise AI at multiple levels. At the app level, at the infrastructure level, client computing, and everything in between.

Daniel Newman: The past couple of years have been an incredible run for Microsoft and they’ve really found their place across the entire continuum. I mean, here at Ignite, we’re hearing all about it. We’re into the developer space, Pat. We are focusing on enterprise, finding value in AI. And you and I, of course, have a deep love affair with silicon and infrastructure, and that’s happening here too. So it’s really kind of hitting all of the things that you and I really care about and, more importantly, all the things that people in technology and business are trying to achieve with AI right now.

Patrick Moorhead: Yes. And also, the one thing that’s pretty evident in the enterprise is that it’s one thing to do a one-off kind of science project, kick the tires. You can do a POC. And by the way, all of those are important, but how do you scale that? You really have to have an apps platform and an infrastructure to be able to scale at a planet-scale type of level. And I just happen to have two people here that can talk about it. Rani, Jeremy, welcome to Six Five.

Jeremy Winter: Hey, thanks for having us.

Rani Borkar: Thank you. Great to be here.

Daniel Newman: It’s really good to have both of you here. I mean, as you sort of listened to the preamble, Pat and I spend countless hours on the road, and days, across the vendor continuum, with customers, and we’re hearing about where are we at with AI. And we, of course, know there’s a little bit of a haves and have-nots going on. There’s a little bit of a, we’re moving faster, we’re more cautious. There’s obviously regulation in certain industries and not. There’s so much going on and, of course, at the leading edge, it’s really exciting. Are we going to have AGI that’s going to do our thinking for us? And I think we all would agree, yes, eventually, maybe, but right now, we’re really, to Pat’s point, in that taking a POC and trying to just be able to calculate the value. But you are busy. I mean, over the last couple of years, Microsoft’s had an incredible run. Jeremy, maybe start off. You’re talking to the enterprises. You’re out there helping these companies get these POCs off the ground. What are you seeing as it pertains to the enterprise AI opportunity? How are companies doing? Give us just the rundown.

Patrick Moorhead: Where are we on this one?

Daniel Newman: Where are we?

Jeremy Winter: I think we’re still pretty early on the map. When you think about mainstream generative AI, it’s happening, as you were mentioning, just across the board. Whether you’re a startup or a traditional enterprise, what have you, everyone’s really working with this right now. And we’ve seen some pretty cool scenarios that are popping up. There’s customer support. We’re seeing this really come through. We’re seeing this come through with how I can help with employee productivity. And so, people are just really starting to get their hands into this, and it’s this balance. You’re going to hear this balance of how much should I start with, as you were saying with the POC, and how much do I really need to go modernize fully? But when you think about this entire AI and Copilot stack, if you will, at the heart of it is Azure, and this is the key piece you’re going to hear me really push on: Azure has to give you this infrastructure for both your traditional apps and now this shift to the new, where we’re thinking through these new generative AI apps. And the infrastructure at a physical level as well that really runs through there.

Rani Borkar: Exactly. And on this journey, we’ve learned a lot from our customers also. So first of all, AI, as we know, needs a lot of computing power. And then, on top of the computing power, you need the networks to hook up this massive number of GPUs, and that is very critical for training models. Similarly, we talk about performance when it comes to cost efficiencies. You can’t just keep upping the performance; you have to have a balance of performance with cost efficiency as you’re training the model. Then it’s availability and capacity, as he said. It’s in early stages, but believe it or not, there is a lot of demand for capacity here.

Jeremy Winter: And I think we’ve really pushed out ahead of the industry to make sure that we have that infrastructure available. It’s really rooted in our HPC components. This just didn’t come out of thin air. I mean, we’ve been working on this for a while.

Rani Borkar: Exactly.

Jeremy Winter: And I think it really positioned us well for the industry to take advantage of that at the app layer.

Patrick Moorhead: I’ve been really happy to see that. I mean, there was a lot of discussion, for many years, obviously, about that app layer, but not a lot of conversation on the infrastructure side.

Rani Borkar: Correct.

Patrick Moorhead: So it has been really great to get to know you and the team, and really document, as an analyst, kind of the planet scale capabilities that you’re providing. Jeremy, talk a little bit in the run-up of … we are here on this map. You talked a little bit about where enterprises are. In the end, you’re trying to solve problems for customers, maybe inspire them, as well, to show how others do it. But what are some of the biggest challenges that you’re finding in enterprises scaling, going from the science project to the POC, to putting something into production?

Jeremy Winter: I mean, there’s a huge opportunity here, and it’s going to shift the way that we even think about software and our applications in the future. But, as you mentioned, there are some challenges out there that we’re facing. The big ones that I’m seeing, as we get out with customers, again, at all scales, but for the enterprise specifically: skilling and talent is right up there. I’d say it’s almost number one.

Patrick Moorhead: It’s one of those things that gets forgotten in almost every major industry transition.

Rani Borkar: Exactly.

Jeremy Winter: Yeah. And it’s one of these where how do you go find the talent? How do you give your own employees time to go explore? How do you pull from the universities, and how do you mix in people from places that already have experience? I just think this is an entire mashup that’s occurring with the talent. The second one I’d say is there’s just a ton of models out there right now. How do I go pick? What do I do? How do I compare them as I’m thinking about building something out there from scratch or even integrating it? And then there’s another problem we’ve been facing for years. The third one, in my mind, is just data. Data has been everywhere. And now, with AI, part of the big value we’re seeing is, can I reason over all these different disparate sets of data? So, if you really want to take advantage of this, you have to pull the data sources in, run those LLMs over them, and really have the right knowledge on it. But at the end of the day, it’s all going to come down to infrastructure, too. And it’s just not sitting around in the data centers, is it?

Rani Borkar: No, exactly. But exactly. I mean, you said it. There are different layers if you think about adoption of this technology moving forward and what the challenges are. From my perspective, the infrastructure challenges: if there are customers that have not migrated to the cloud, they are still working with old infrastructure components. And so, they are definitely going to have scaling challenges. That’s point number one. Point number two is AI is expensive. And so, from that perspective, if they want to keep moving forward, they need to make sure, if they are not on the cloud, how are they making those investments? And I’ve been talking a lot about investments in the semiconductor industry, just in general, because this is going to require massive investments. Whether it’s GPUs, CPUs, or memory, it doesn’t matter. And then, the third thing is the rate and pace at which technology’s advancing. Oh, my God, I have been through three and a half decades of platform shifts and never seen advancement like this. And so, if they are not taking advantage of the cloud, how are they going to bring this technology to bear to do the things that AI needs? So a lot of challenges.

Daniel Newman: So we’re at this really interesting inflection, though, and you talk in our language. I mean, look, we understand the whole stack, and we hear you. But, look, there are countless debates going on about AI infrastructure investment challenges. There’s this whole CapEx boom, and it’s probably the most popular topic on business television right now: how much … and, of course, Microsoft’s spending a lot there. Then, of course, there’s sort of the, I won’t use the term, I won’t ask you the term, vertical integration, but there is the idea of building more integration, because ideally it’s both cost, it’s control, it’s optimization.

And then, of course, there are companies that you have great partnerships with, companies like Nvidia that are leading there. And it creates a very strategic set of decisions that have to be made, Rani, about where do we invest, where do we put our energy, what should we build, what should we take off the shelf and utilize? Can you share a little bit about how you’re thinking about addressing all this? Because, ultimately, in the end, you’ve got to be able to bring value back. You’ve got to be able to say, “We spent a hundred billion, but we’re going to create 200 billion of revenue,” or else everyone’s going, “What are you doing?”

Rani Borkar: Absolutely. And I loved the way you said it. No, seriously. Hey, there is innovation happening in the industry. What do I buy? What do I make? At the very simplest level. And at the end of the day, what matters is: are we giving our customers what they need at the best performance, at the best cost, or I should say at a competitive cost, and at the highest efficiency? And I need to be able to provide them … I need to have capacity available when they need it, where they need it. So, really, it starts at the top. The decisions we make from the Microsoft perspective put customers front and center, first and foremost, which means we want to tap into the innovations happening in the industry, so we want to bring the best of the industry, and we want to bring the best of what we do. Now allow me to go a click deeper, because you said we speak the same language.

Daniel Newman: Please, let’s do.

Patrick Moorhead: Please.

Rani Borkar: I grew up in the industry, and I’m guessing you too, you have a love affair going on with silicon. Those of us who had a love affair with silicon, guess what?

Daniel Newman: It’s been said.

Rani Borkar: Moore’s Law. We grew up in the Moore’s Law era, where things doubled every two years. Now look at AI models: their performance is doubling every six months. Now let’s plot it. How is hardware innovation keeping up? It’s not. Plot it. You have to change the scale of the graph to even show hardware making progress. And so, now the lines blur, whether you’re talking silicon, systems, servers, networking, cooling, the data center, you name it. Across the whole entire stack, you have to innovate, otherwise how are you keeping up with those models? And so, that’s why, at Microsoft, when we talk about it, we don’t just talk about the chips we build or we … We say we are doing a systems approach, which is really end to end. My friend here, if he doesn’t help me, how am I going to make one plus one greater than two? Now how are we doing that?

So we have to innovate at every layer of the stack. And that’s why you heard the news at Ignite this year. Last year we talked about our custom silicon. We will share more information on that as we continue to really build the stack. But, in addition, you will hear us doing infrastructure innovations such as our cooling solutions. We talked about the Maia sidekick cooling last year. This year we are going to talk about how we’ve evolved that design. And not only that, that design is going to sit in data centers which, actually, were air-cooled. Now they’re going to be liquid-cooled. Same footprint. I can put a Maia in there from Microsoft, or I can put in third-party silicon from Nvidia or AMD. That is huge. That is huge.

Patrick Moorhead: Well, that is huge. And, by the way, the industry, historically, has gone through this aggregation, disaggregation, how we look at solutions, and we are definitely in this integrated systems approach. And the reason is really simple, because we’re trying to get as much efficiency as we can, do the most amount of compute with the lowest amount of power that we can. And the only way you can do that is in this integrated approach. And I applaud that you’re doing this with a combination of first-party silicon and also merchant silicon, but going all the way to the walls of the data center, and then all the way in.

Jeremy Winter: And this systems approach is key to the entire thing we’re building, whether it’s how we build the facilities, all the way up into the power and cooling. But we also have to think, as you’re mentioning, about these customers, as we’re thinking about how the next silicon comes in or how we’re using the GB200, or what have you. We’ve got to think through the network across it, all the way up to how the apps and everything come back together. So this is an entire system.

Daniel Newman: And your time between, sorry, but your time from generation to generation has gone, like you said, from a couple of years to now months. And I was asking one data center engineer about it: is the cooling solution you’re putting in place today going to work through Rubin? And the idea is they think probably. But it will not work even one generation beyond that, meaning … so there’s so much investment that has a really short half-life.

Rani Borkar: Yeah. If I may add to that point, you made a very critical point here. So the pace of innovation is very fast. At the same time, engineers, folks like us who are really looking at the infrastructure innovations, cannot just look six months out. It takes more time than that. So as we are building this stuff, we have to future-proof as well, and this is where, at Microsoft, we are also focusing on modularity and interoperability. You have to do that, because brick and mortar takes a long time to come up, the equipment takes a long time, then testing it, then the next generation from our merchant vendors. So you have to be prepared for that, so that when those technologies or those GPUs or accelerators are ready, Microsoft is going to be the first one to bring it out. Simple.

Patrick Moorhead: I had discussions with teams at Microsoft probably a decade ago. We talked about infrastructure and leaning into it. But, I mean, you are absolutely leaning into it now. There is no question about that. Now, infrastructure’s cool, I know that, but I’m going to bounce to the app layer. Okay? Infrastructure for infrastructure’s sake doesn’t deliver value, but Jeremy-

Rani Borkar: Yes. You’re right.

Patrick Moorhead: Jeremy, here we go. So in every major transition we’ve seen, developers can go one of two ways, maybe three. They can start over with a new application or they can take their current applications and integrate the cool stuff. We saw this with the web, we saw this with e-commerce, we saw this with social, local, mobile. The best answer is not always just to tear up the old stuff. I’m curious, what’s the tack or what’s the recommendation? So there’s what enterprises might be thinking, then there’s what they’re doing, but what are you recommending that enterprises do here? Why?

Jeremy Winter: Yeah, I mean it’s a balance. And I mentioned this earlier, and I think this is the recommendation, is you have to look at the balance on this. Let’s take your existing apps. This is an actual great opportunity and we’re seeing customers go do this.

Patrick Moorhead: And the typical enterprise might have 5,000, 10,000, 50,000 apps, just to set the stage for everybody out there.

Jeremy Winter: And you have to be smart and intentional about which ones you’re picking. But there are some great scenarios where you could augment an existing application to do so much more without having to rewrite everything. We’re seeing this already in the retail space. Heck, even if you look at Azure Copilot, we’re adding generative AI elements to an existing framework and experience, and that augments and improves it. We’re also seeing areas where you may want to … you’ve got something that’s deep down in the software stack that’s doing a ton of processing, a lot of logic across multiple steps. Leverage AI, again, at this layer, whether it’s agentry or what have you, to really go work through and simplify those key pieces. But what’s key to this is it helps you minimize a bit of the risk. It helps you start with something you know, something where you can start small, and often get those stakeholders on board with it. But then-

Patrick Moorhead: The early wins are so important here to get … I mean, it’s one thing to have the board of directors edict come down, and the CEO edict, and then you’re left with, okay, well, now what do we do? Prioritizing the different applications and then getting those quick wins is beneficial, because it just keeps everybody in the game and also keeps investment coming, which is critical.

Jeremy Winter: Yeah. Now let’s definitely look at the other side of the equation: full-on modernization. This is where it’s just greenfield. You just go. And right there, a lot of this is really around autonomous execution, autonomous decision-making, robotics. Automotive is really heading in that direction. You’re seeing decision-making based off of real-time data, or looking at the data, inferencing, and making decisions on that. Cool scenarios I’ve seen, or even looked at myself, are looking at heart rhythms and understanding the heart rhythms that are coming through, or imaging, and helping make decisions there. Those are two big areas.

But there’s also this entire view of end-to-end automation that’s starting to come into the picture. Manufacturing’s really digging in here. But the key on these is this just opens up that opportunity for us to balance across, picking the right ones among the existing as well as the modern. But again, underneath this, this is where we have to look at it as Azure and Microsoft: we’re not only looking at these trends, and we’ve talked about infra and I’ll get back to that, but we’re also giving you the apps to go build it. Whether it’s what you’re seeing with AI Foundry, what we’re doing with our development tools, or the applications in the AI space, making your choices of which tech you want to use.

But also, we see containers as a big element that people are betting on: they can do it with their new apps, and they can also start to adopt those components in their existing ones. And so Azure Kubernetes Service and Azure Container Apps, depending on which way you want to go, are such great steps for you to take on the app side. But again, if you go back to the beginning, it all sits with Azure at the heart of this, and our goal is to ensure that we have that platform, from the infrastructure level all the way up, that powers this whole modernization, this whole shift to AI, this entire thing.
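As a concrete illustration of the “augment, don’t rewrite” pattern Jeremy describes, here is a minimal sketch of an existing Python service handing one step, summarizing a support ticket, to a generative model through Azure OpenAI. This is not anything the speakers prescribe: it assumes an Azure OpenAI resource and a chat deployment already exist, and the endpoint, environment variable, API version, deployment name, and ticket text below are placeholders.

```python
import os

from openai import AzureOpenAI  # pip install openai

# Hypothetical endpoint, env var, and API version; substitute your own resource.
client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

# An existing workflow (e.g., customer support) can hand off one step to the model
# without rewriting the surrounding application.
ticket_text = "Customer reports intermittent timeouts when uploading files over 2 GB."

response = client.chat.completions.create(
    model="my-chat-deployment",  # hypothetical name of your Azure OpenAI deployment
    messages=[
        {"role": "system", "content": "Summarize the support ticket and suggest a next step."},
        {"role": "user", "content": ticket_text},
    ],
)

print(response.choices[0].message.content)
```

Because the rest of the application is untouched and the call can later be pointed at a different deployed model, this kind of augmentation is one way to keep risk small while still getting an early win.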

Daniel Newman: And there’s still a pretty significant element of land and expand with AI, the way there was, and still is, with cloud. I mean, people are like, “There’s still so much more growth to be had.” And think about, by the way, we didn’t talk about it much here, but the data fabric and platform itself. Pat and I talk about this endlessly on the show. These great apps sitting on top of this great hardware, because you can’t run an app without a little silicon in there, require really well-designed data. And, of course, in the era of AI, it maybe gets simpler to point an app at more data and have it orbit the data a little bit better, but it’s still complicated. We still haven’t come … so a lot of the enterprises, at least the ones that we’re talking to, part of the reason they’re not getting from POC to scale is they can’t quite get that data ecosystem right.

Jeremy Winter: It’s always about that.

Daniel Newman: Right. And so, that’ll be something, I imagine, a lot of people here want to hear about. So maybe we can wrap up, because I’m guessing the four of us could talk for a long time. My chair’s turning in natural time, because I’m enjoying this so much. But let’s end with some advice. So, Rani, I’ll start with you: for those companies that are trying to really move in the direction of AI, develop and implement applications, you can take it through a hardware lens or through a software lens, but what is your advice to these enterprises to get more from these investments, to move faster and create more meaningful outcomes?

Rani Borkar: So I’m going to first start at the very top and say, having been in the industry for so long and seeing the platform changes, we all used to get excited about technology. Technology was the center of everything. I view it differently now. The era of AI is going to be … the innovations are going to be possible because of the infrastructure. If you don’t have infrastructure innovations, you’re not going to move forward the way we talked about. So my guidance … who am I to give advice? My guidance would be for these customers to focus on what their AI use cases are. And as you gentlemen already talked about, start small, do the POCs, but you can’t wait forever. You are going to be left behind. Your businesses are going to be in serious trouble, because this thing is moving fast. And those who are adopting faster and taking the risk are going to get way ahead. And so, I would say do your POCs, start small, do it fast.

Patrick Moorhead: And Rani, just to add on. I mean, look at the phases I talked about, mobile, local, social, e-commerce, and the web, and the companies that don’t even exist anymore because they did not move quickly enough into e-commerce, or they didn’t shift quickly enough from selling widgets to Widgets as a Service. So spot on. And there’s a lot of history around that. A lot of people don’t study history much, but they should.

Jeremy Winter: Yeah, yeah. Yeah. And I think moving fast is a key one. I’ll just chime in with mine, since I think we’re coming up on time: look, no surprise, I think talent’s going to be the one we’ve got to get through. And how do you do that? Helping with the POCs, helping with exploration, but giving your teams the time to actually go build that talent up, and/or really starting to look at how you’re going to get more talent into the company to do that. This is where you’re going to have that shift. You’ll be able to see those explorations and go, “This is the POC that I should execute on.” And that helps you start small. But if you don’t give the time for the talent to actually grow, and if you don’t start to think about getting the talent to really understand this space, you’re going to be left behind.

Daniel Newman: Slightly more sage than “the more you buy, the more you save.”

Rani Borkar: If I may add one more thing here. As Jeremy talks about talent, and we talked about POCs and all that, I think the second thing I’m observing in this AI era is you have to take bold risks. You have to make those big bets. If you’re not making those big bets, and this is why sometimes I worry about the semiconductor industry, because if you look at memory and things like that, it’s so cyclical. Should I invest or should I not? Well, if you don’t invest, if you don’t give me more HBM, I’m going somewhere else. And this is moving very, very fast. So risk taking is going to be … it better get into our DNA bigger than it has been in the past.

Patrick Moorhead: Yeah, you’re even hitting on culture here, which intersects with the people part in a big way. But senior executives and management need to give permission to fail, and there are guardrails that are in there. I always liked to joke, when I used to have a real job, “If you’re not making some mistakes, you’re not pushing it hard enough.” Let’s just give you the boundaries for doing this, and we’re going to celebrate wins, but sometimes we’re going to celebrate when we duffed it but learned a ton about it, and we can share that with people.

Rani Borkar: Absolutely.

Daniel Newman: The annual cycles may finally bring us, in the semiconductor space, to a little less of that sort of crazy boom-bust cycle. Having said that, we will have to find those hundreds of billions to continue to spend CapEx at this rate. But, as analysts, we’re sitting back, we’re grabbing our popcorn, and we’re enjoying the show. Rani, Jeremy, thank you so much. Great to sit down with you here. Let’s have you all back again soon.

Jeremy Winter: Yeah, perfect.

Rani Borkar: Thank you so much.

Daniel Newman: And thank you very much for tuning in. We are The Six Five. We are here On the Road at Microsoft Ignite 2024 in Chicago. Hit subscribe. Join us for all of the other episodes, all the other coverage. We appreciate you being part of our community, but we got to say bye for now. See you all later.
