The Driving Force Behind AI: Accelerated Infrastructure
- AI starts with silicon. To achieve the potential of AI, we need to develop an accelerated infrastructure that delivers the performance, bandwidth, and efficiency required.
- It’s also radically different from a chip perspective, requiring dramatic increases in performance and new classes of devices.
- Marvell is in a unique position to deliver on this vision through its product portfolio and critical expertise across digital, analog, and other technologies.
Transcript
Daniel Newman:
Hey everyone. Welcome back to the Six Five Summit, Daniel Newman here. I’m in beautiful Santa Clara, California, sitting down with Chris Koopmans, COO of Marvell Technology, who delivered a keynote at last year’s event, and he’s joining us once again. We’re going to be talking all about what’s going on in AI: accelerated infrastructure, custom silicon and compute, and so much more. Chris, thanks for joining again. It’s great to have you.
Chris Koopmans:
Absolutely. Glad to be here and thanks for having us on again.
Daniel Newman:
Yeah, it’s been a year since we sat down at the last summit. I think you and I probably sat down once more on camera, and we’ve had somewhat regular interactions as this has all been moving, especially around AI. Marvell came out early, really put a meaningful number and metric behind the growth of AI. I was very impressed by the fact that you were one of the first out there to really get that story and give that clarity to the market. But here we are a year later. Give me a little bit of your perspective on where things are at. How fast is this AI opportunity moving?
Chris Koopmans:
Thanks. Yeah, it’s moving incredibly fast. It’s actually the most exciting time in my career to be in semiconductors, just the rate at which things are moving. You mentioned the number that we put out. I think it was about a year ago that we said we did around $400 million in AI-related revenue last year, and that it would double, or more than double, this year to $800 million. Just a month or so ago, we actually said publicly that it would be over a billion and a half dollars this year, so about double what we thought even just a year ago. So it’s moving incredibly fast, and that goes for pretty much everything, including the connectivity as well as the custom silicon, which I’m sure we’ll talk some more about.
Daniel Newman:
Yeah, absolutely. And by the way, it’s never bad when you double your double, so you should be really pleased with that. What we found early on, Chris, was there was sort of this bifurcation: companies that had a very clear understanding of what the AI opportunity would be, and companies that were trying to AI-wash a lot of things. The ones in the first bucket did a lot better, and I think Marvell definitely fit into that category. But then the second part was that we were all trying to guess how quickly it could move. And what you’re telling me is that it was moving really fast, and that even based on your estimates, you didn’t fully appreciate, no one did, just how fast this was actually going to go.
So congratulations, because that’s a great result. No one is ever sad to say, hey, we have to double that number again. But one of the interesting things I’ve always found about Marvell is that you play in a lot of different spaces. And when it comes to AI, there are two I’d really like to zero in on with you. The first is accelerated infrastructure, and we can come back to the custom compute opportunities, but there are numbers like 20, 25, I’ve heard as high as 30% depending on the data source, for the share of the AI opportunity that will be related to networking. Meaning, every time we hear about a dollar of custom or GPU compute, there’s going to be 25, 30 cents of networking. Huge opportunity. What are you seeing there? How is that materializing and growing for Marvell? What are you hearing on the street in that space?
Chris Koopmans:
Great points. So first of all, I would say that Marvell got here on purpose, not by accident. We didn’t happen into this position. We set our strategy in 2017 for data infrastructure. And so what we focused on is the data and where the data is going. And of course, AI is the most data-hungry application the world has ever seen. So our data infrastructure strategy is what led us toward AI. And you asked about networking specifically. If you think about the way data centers operate, it got pretty boring there for a couple of decades of standard general-purpose computing, where effectively a processor or a server got powerful enough to process pretty much any workload we could come up with. In fact, it got so powerful that we had to virtualize the processors and share out fractions of them in order to make them efficient.
AI turned that on its head. AI takes hundreds or thousands of accelerated processors to perform, and what that means is that they need to be connected together, because they’re running one computation across a whole host of separate processors and they need to communicate with one another during that calculation. That means the network, the connectivity amongst those XPUs, is as important to the computation as the processors themselves. And so it’s just exploded. And of course it’s correlated, by the way. You said it could be guessed. Well, no, it was hard to guess, because the number of GPUs shipping every quarter has continued to surprise to the upside, and ultimately our connectivity, where Marvell is by far the leader for all of these models, is correlated to that. And so it’s been growing off the charts.
Daniel Newman:
And we’re in what I would call a great reset for the whole technology market, Chris, because as compute went generation to generation, it was all iterative. A lot of the same architecture worked for network, for storage, for compute. AI has created this whole new mass-scale opportunity. Every company, whether you’re an ISV, an infrastructure provider, an OEM, or a chipmaker, has had the chance to reset its whole business strategy and attack a whole new market. And it’s really exciting, but at the same time it creates risk. It creates new entrants into the space. And of course it opened a door for you, because you’ve been in the custom silicon business for some time. You’ve been building custom compute, but custom AI chips, this is something everyone out there is asking about right now. We won’t speculate here, and people can do their own reading, but you hear that, whether it’s Meta, whether it’s Google, whether it’s Apple, and so many others, they’re either making their own, they’re going to make their own, or they’re doing some. You shared in a recent filing that about 25% is the expectation. Talk to me about the XPU space, the custom AI compute space, Chris, and how that’s progressing here at Marvell.
Chris Koopmans:
Yeah. And by the way, everything you just described, that’s what I mean by, it’s never been more exciting.
Daniel Newman:
It is exciting.
Chris Koopmans:
It reminds me of the late nineties in terms of servers, where you were always looking to see what’s going to come next. Or even remember when you wanted a new PC, every time a new PC came out, you couldn’t wait to get your hands on it because it would be faster and so much better. And then at some point you couldn’t tell the difference between one and the other, and it all slowed down and you didn’t care anymore. That’s where we are right now with AI. You’re constantly waiting for what’s the next thing. And the reason is because the applications we’re trying to run, run that much better every time something new comes out. And that’s why you see so much differentiation. That’s why you see so many companies building their own now because it’s not sort of one size fits all.
Ultimately, there are so many different types of applications, so many different types of specialization. Whether you’re trying to train on consumer-related data sets or enterprise-related data sets, there are totally different types of information out there. So what you’re seeing is anybody with massive workloads building their own, and specifically it really comes down to the hyperscale data center operators, by the way. It’s not so easy to just build a data center. You have to have the space. You have to have the power. These hyperscale data center operators are contracting power years into the future. So even if you suddenly found yourself with tons of GPUs, where are you going to put them? Right?
Daniel Newman:
Yeah.
Chris Koopmans:
So ultimately we see the hyperscalers as all wanting to build their own custom AI accelerators, and they’re doing that not to replace what they have in terms of GPUs, but to augment and to add and create new value for new types of applications that they can operate even better.
Daniel Newman:
Yeah, well, there are always efficiencies in design, and I’m glad you talked about power. Let’s just use that as a for instance. GPUs are really good at doing AI and giving a lot of flexibility. We’ve also seen that there’s some real value in creating custom chips, ASICs, and maybe they’re pure ASICs, maybe it’s an ASIC with some programmable logic. There are variants of how this will end up being delivered to market, Chris. But in the end, just on power use alone, say it’s a recommendation engine and you’re a company that does something that requires it, by using that particular architecture and type of silicon you can get more performance at lower power. And you said it, I read somewhere recently, Chris, that 99.6 or 99.8% of Northern Virginia’s power grid is committed.
Okay. Basically they can’t stand up one more rack of GPUs. So, I have to imagine part of your vision of Marvell is that part of the opportunity is that finding that very specific compute, so that you can help with the sustainability issue. I know it was really hot to talk about, then it got a little less hot to talk about, and now I think we’ve come full circle because AI is the hottest thing on the planet right now, and it’s also the most power-consuming thing on the planet, and we’re going to have to address that mix. Yeah?
Chris Koopmans:
I think that’s right, and I think that is one of the drivers. Generally speaking, when you talk about building any of these AI accelerated compute devices, the key metric is performance per watt. It’s number one, and performance per dollar is probably number two, because watts drive dollars as well. So performance per watt is one of the most important metrics you can possibly have. And so being able to build an optimized piece of silicon to solve specific use cases is critical going forward.
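To make the two metrics Chris names concrete, here is a minimal sketch comparing a general-purpose accelerator against a workload-tuned ASIC. Every figure is hypothetical, chosen only to illustrate how performance per watt and performance per dollar are computed; none of these are real product specs.

```python
# Hypothetical comparison of two accelerators on one fixed workload.
# All throughput, power, and cost figures are illustrative only.

def perf_per_watt(throughput: float, power_w: float) -> float:
    """Work delivered (e.g. tokens/s) per watt of power drawn."""
    return throughput / power_w

def perf_per_dollar(throughput: float, cost_usd: float) -> float:
    """Work delivered per dollar of hardware cost."""
    return throughput / cost_usd

# General-purpose GPU: flexible, but pays overhead for that flexibility.
gpu = {"throughput": 10_000.0, "power_w": 700.0, "cost_usd": 30_000.0}

# Custom ASIC tuned to one workload (say, a recommendation engine).
asic = {"throughput": 9_000.0, "power_w": 350.0, "cost_usd": 15_000.0}

for name, dev in (("GPU", gpu), ("ASIC", asic)):
    ppw = perf_per_watt(dev["throughput"], dev["power_w"])
    ppd = perf_per_dollar(dev["throughput"], dev["cost_usd"])
    print(f"{name}: {ppw:.1f} per watt, {ppd:.2f} per dollar")
```

With these made-up numbers, the ASIC delivers slightly less raw throughput yet wins on both metrics, which is exactly the trade Chris describes: optimized silicon for a specific use case rather than peak general-purpose performance.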
Daniel Newman:
Yeah. And that’s in your number, where you said 25%, and I think we’ve talked about this, but the speculation is it could be substantially above 25%. And by the way, I want to reiterate, this isn’t to say that the GPUs we’re hearing about, and all the exuberance around them right now, aren’t real. They are real. The point is that the $400 billion TAM we’ve talked about, with accelerated infrastructure, with compute and GPUs together, could go well above a trillion.
Chris Koopmans:
That’s right.
Daniel Newman:
There’s opportunity for everyone.
Chris Koopmans:
Yeah, I mean, just to give you the numbers. At our recent analyst day, we shared analyst estimates, not our estimates per se, showing that they thought data center compute would grow to about $200 billion annually over the next four or five years. Now, some people have said it’s $400 billion.
Daniel Newman:
That’s the range I’ve seen.
Chris Koopmans:
And it’s going to keep growing, so whatever the number is. What we said was that about a quarter of that was going to go custom, so somewhere in the $40 to $50 billion range. You’re right, I think if everybody’s ambitions came true, it would be higher than that. And right now it’s probably in the 15% range; that was last year. So 25% is actually not a big jump, and the ambitions would be to go even larger than that. But ultimately, the bottom line is that whatever the numbers are, they’re huge. The demand is huge. And there are really very few companies in the world that can partner with these hyperscale data center operators to help them realize their dreams and build these types of complex custom silicon.
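The arithmetic behind those figures can be checked in a few lines. The TAM endpoints are the analyst estimates quoted in the conversation ($200 billion, with some saying $400 billion), and the 25% custom share is the figure Chris cites; the code just multiplies them out.

```python
# Analyst-estimated annual data center compute TAM (figures cited above).
tam_low = 200e9    # $200B analyst estimate
tam_high = 400e9   # $400B higher-end estimate some cite

custom_share = 0.25  # roughly a quarter expected to go to custom silicon

low = tam_low * custom_share    # quarter of $200B
high = tam_high * custom_share  # quarter of $400B

print(f"Custom silicon opportunity: ${low / 1e9:.0f}B to ${high / 1e9:.0f}B per year")
```

A quarter of the $200 billion estimate lands at $50 billion, consistent with the $40 to $50 billion range mentioned, and the higher-end TAM would push the custom opportunity toward $100 billion.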
Daniel Newman:
Well, Chris, that’s a great segue, because for our audience here at the Six Five Summit, we focus on tying together these transformational technologies, and Marvell is part of this story. Marvell is one of these companies. Talk a little bit about what enables Marvell and what the differentiation is. Because you’re not the only company; there are companies in all parts of the world, and other companies here in the U.S., that do custom chips and accelerated infrastructure. So why are you winning? Why are you growing? Why are you doubling your double?
Chris Koopmans:
Sure. So there are a couple of things. The first piece is that high-speed connectivity is very hard. It’s not the same thing as digital logic; it’s analog mixed-signal technology, and the number of analog mixed-signal engineers in the world who can do this type of high-speed networking is very low, and they tend to work at a couple of companies. Marvell has been fortunate enough, both organically and through our acquisition of Inphi, which was the leader in high-speed connectivity inside the hyperscale data centers, to build a team that is second to none in the world. And ultimately, to keep up with this, as you said, doubling our double, as the opportunity grows it’s also expanding. You can’t just build one chip. What we’re specifically talking about is the digital signal processor, or DSP, that goes inside the optical cables connecting all of these servers together.
And you can’t just build one. It used to be that you might be able to build one and double the speed every few years. Now we’re having to put out a host of different chips and solutions for every single niche in the marketplace, every single customer, every single speed, every single distance and optimized link, because it’s grown so fast. And so we’ve greatly increased the number of engineers working on these projects, and ultimately the number of chips we’re putting out. Maintaining that market share is critical to us, and we’ve been able to continue to do that.
Daniel Newman:
Yeah, congratulations on all the success. I would love to end on something big picture, Chris, to get the viewpoint of someone leading one of the more important companies in the silicon and AI space, one that’s working very closely with the companies whose technologies all the consumers out there use. There’s a bit of a debate right now, and I’ll call it the iPhone-versus-Android debate about AI, and you’re in the middle of it because you build the infrastructure, you build custom chips, but we’ve also talked about GPUs and analog-digital networking. Where do you think it lands? When I say Android, I mean the mixed ecosystem: we’re going to use different GPUs, different CPUs, different network providers, different open source software. And then there’s the “we have everything, top to bottom” company that looks more like Apple. How does this evolve? Does it end up being just a massive trillion-dollar opportunity, does it split a little bit like Apple and Android, or how do you see this growing in the community of AI?
Chris Koopmans:
Yeah, it’s a great question. So first of all, there are different segments. There are the hyperscale data center operators, who have massive software teams, silicon teams and capabilities, and their own massive internal workloads to drive the need for custom silicon. And then there’s a longer tail of enterprises, for example, that just want something to work. So I do think that ultimately both will be winners. The integrated solution that’s turnkey, where you turn it on and it works, and you can just train your models without having to become an expert on programming things all the way down to the hardware level, great, they’re going to do great. I think ultimately the cloud models are going to get better, too. And one of the ways to think about it now is that this is really a cloud-first world. For the longest time, we were talking about moving workloads from on-prem into the cloud. That’s been the cloud model for the past decade or so. This is cloud first. If you’re about to stand something up in AI, what’s the first thing you’re going to do? Buy a data center and a bunch of servers? Probably not. Just-
Daniel Newman:
You only have a few companies out there that would, right?
Chris Koopmans:
… Yeah, I mean, of course some will. There are some that will do that, but most are going to say, just put it in the cloud. And ultimately, I think they’re going to have great success as well. And that’s one of the reasons you’re seeing the need for custom silicon: they want to differentiate. If you’re simply deploying the same hardware that an enterprise could buy themselves, or that all of your competitors in the cloud world deploy, then your ability to differentiate is fairly small. But if you can build your value proposition around that, where you’re saying, no, my offering is better in the following ways, it has this type of Arm CPU, it has this type of accelerator, hitting this particular price and TCO envelope, that provides a much bigger opportunity for them to differentiate amongst themselves. And so this idea of doing custom, that you asked about a moment ago, is not because they’re just trying to save money. They’re differentiating their services for the next wave of innovation in the world. And so it’s never been more critical for these data center operators to have the most optimized silicon.
Daniel Newman:
Really appreciate you taking the time and taking all my questions, including the most challenging ones, Chris. Chris Koopmans, COO of Marvell Technology. Thanks for joining me again this year at the Six Five Summit.
Chris Koopmans:
Thanks, Dan. Great to see you.
Daniel Newman:
All right, back to you in the studio.