Edge AI Vision with MemryX

MemryX is a provider of edge AI accelerators. In this session we discuss how a differentiated architecture leads not just to great performance per watt, but also to ease of use and fast implementation. We also discuss how MemryX is sampling production-quality silicon and software now (Q2 2024).

Transcript

Dave Nicholson:
Welcome back to Six Five Summit. I’m Dave Nicholson, Chief Research Officer, and I’m here with a very special guest from MemryX, their CEO, Keith Kressin. Keith, how are you doing?

Keith Kressin:
Very good, Dave. How are you?

Dave Nicholson:
Good, good, welcome. I’ve been looking forward to this conversation. This is a very, very interesting area. Maybe some of our viewers have heard of this thing called AI, and there’s a lot of work to be done in this space, and you folks are doing some really good work. Why don’t you tell me about MemryX, who are you, what’s the problem you’re trying to solve? In my Wharton classes that I teach, we like to ask the question of folks, what gives you the right to exist in the market? Fill us in.

Keith Kressin:
Especially in the world of AI, where there are so many different players, it's very hard to stand out or be different, and we are probably one of many, but let me tell you why we're different. So we're a startup founded at the University of Michigan, and we develop AI accelerators, or AI hardware, and the accompanying software that goes with it. And one of the major problems we hear, we heard years ago when we started the company, we hear even today, is how difficult it is to take new models and get them up and running with good power, performance, and accuracy without a lot of heavy engineering work. And so our goal is to provide best-in-class metrics out of the box without modifying the models, without anyone needing to know or understand our architecture. We just provide a set of tools, and customers can use those tools to use existing models or port their own models and, like I said, get them up and running very quickly, with very low NRE and fast time to market.
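
To make that "low-touch tools" idea concrete, here is a toy sketch of the compile-then-deploy workflow Keith describes. Every class, method, and file name below (including the stand-in model file) is an illustrative placeholder, not the MemryX SDK; the point is only the shape of the flow, where an off-the-shelf model is compiled as-is and the whole network then runs on the accelerator.

```python
# Toy mock of the workflow shape only; the class and method names here are
# illustrative placeholders, not the MemryX SDK.
import numpy as np

class Compiler:
    """Stand-in for a one-step model compiler: model file in, device binary out."""
    def compile(self, model_path: str) -> bytes:
        # A real toolchain would map the whole graph onto the accelerator here,
        # with no model surgery or retraining required.
        return model_path.encode()

class Accelerator:
    """Stand-in for a device that runs the entire model on-chip."""
    def __init__(self, artifact: bytes):
        self.artifact = artifact

    def infer(self, frame):
        # The host only moves data in and results out; there is no fallback
        # to the CPU, GPU, or a DSP mid-model.
        return [("person", 0.91, (100, 80, 220, 310))]  # dummy detection

artifact = Compiler().compile("yolov8s.onnx")      # port an existing model as-is
accel = Accelerator(artifact)
frame = np.zeros((640, 640, 3), dtype=np.uint8)    # stand-in camera frame
print(accel.infer(frame))
```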

Dave Nicholson:
Okay, not to be cynical here, but I’ve heard this story before.

Keith Kressin:
Well, bye.

Dave Nicholson:
Literally, because look, I'm dubious about any claim that sounds like "we do everything for everyone, and we make everything more delicious and nutritious at the same time." So put some meat on the bone; prove this to us.

Keith Kressin:
Yes, there's always suspicion. I had suspicion before I joined the company, and I can promise you every investor has the same suspicion. I think whenever you develop hardware, it's easy to say best performance, best performance per watt, simplest to use, and you're done. And it definitely is good to dive a little deeper. So let me explain how we reduce the complexity a little bit on the software side and how it's done. First, let's bound the problem. Our focus is on inference, not training. So that already bounds part of the problem. Our focus right now is also more on classical AI models, as opposed to the latest model that came out yesterday or the day before.

And when we talk about gen AI, we go to conferences and talk about gen AI, generating new content, and what's the latest chatbot, and there's a place for that in the data center, no doubt about it. But when we talk to embedded-platform customers, folks are really just getting to the point now where they're comfortable deploying computer-vision-centric models in their applications. Where they can trust it: they have their own data, their data has been fitted to their sensors, and they have their models. And so applications like security or retail or industrial or manufacturing, these sorts of customers are just now on the verge of deploying computer vision. And that's basically what we're focused on. So it's inference, and it's mostly computer vision, although we're great with any streaming input. Now let me explain maybe a couple of the technical areas, which I'm sure is what you're looking for. So first, we run the entire model on our hardware, so we don't go back and forth to the CPU, back and forth to a GPU, back and forth to a DSP. So that, again, simplifies things.

So we connect to a host. The host just sends the data to be processed, and on our hardware is the entire model that does the processing. So there's no dependency, bus contention, memory burden, anything like that. Also, we did everything from the ground up. We don't use third-party software, we don't use a third-party DSP. We didn't start with RISC-V, nothing against RISC-V, but we didn't start with any third-party IP. We literally built and designed the state machines from the ground up. We did that with every processing block. We have our own dedicated ISA, specifically for AI operations.

We don't have caches, we don't have prefetchers. We have a dataflow architecture with at-memory computing, and it really simplifies the hardware. We try to make the hardware as simple and performant as possible and put the burden more on our own software tools, like at the compiler layer. For example, our hardware doesn't even have a NoC, a network-on-chip. It's a pure dataflow architecture with at-memory computing. And you add all these things together, and it turns into a solution that we can prove is very, very quick and easy to use. And like I said, not at the expense of accuracy and not at the expense of power and performance metrics.
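
As a loose mental model of "pure dataflow" execution, the sketch below chains stages so each piece of data streams straight through the compute steps, with no shared memory hierarchy or scheduler in between. It illustrates the concept only; it says nothing about MemryX's actual hardware.

```python
# Conceptual illustration of dataflow execution: each "layer" is a stage that
# fires as soon as data arrives and streams its result to the next stage.

def stage(fn, upstream):
    """Wrap a compute step as a streaming stage over an input iterator."""
    for item in upstream:
        yield fn(item)  # compute sits with the data; no cache, no scheduler

def camera(n_frames):
    """Stand-in for a streaming sensor."""
    for i in range(n_frames):
        yield f"frame-{i}"

# The stages are wired together once, up front (the compiler's job);
# after that, data simply flows through.
pipeline = stage(lambda x: x + ":conv", camera(3))
pipeline = stage(lambda x: x + ":pool", pipeline)
pipeline = stage(lambda x: x + ":classify", pipeline)

for result in pipeline:
    print(result)  # frame-0:conv:pool:classify, frame-1:..., frame-2:...
```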

Dave Nicholson:
So you alluded to a market segment that you're focused on, vision systems specifically. I want to get to that in a moment, but I'm curious, because when you describe what you do, it reminds me of an era when there was an argument, probably an argument that continues to this day: do we have our general-purpose CPU do everything, or do we create custom ASICs or FPGAs to do specific things very, very efficiently? That was the debate in the past. Fast-forward to 2024, and correct me if I'm wrong, but this problem's even worse today, because when you're talking about 1,000-watt or 2,000-watt GPUs, I imagine those GPUs could do some of the things that you do.

Keith Kressin:
Yeah, they certainly can. Yes.

Dave Nicholson:
And so leading question, do you need 3000 watts to do these things?

Keith Kressin:
Yeah, I think when Moore's Law started to decline many, many years ago, the realization was that you have to do things at the architecture level and not the physical level. So obviously you have multi-core CPUs, and then you have dedicated engines for each application: a dedicated engine for graphics, a dedicated engine for camera. And it makes all the sense in the world to have a dedicated engine for AI. In fact, GPUs are starting to add some dedicated AI engines as part of the GPU, but we're very focused on AI.

We're very focused today on streaming inputs. And because of that, we can have a very optimized solution. So something you can do on a GPU with hundreds of watts, you can do on our system with single-digit watts. And so we have a very clear value proposition. If someone needs to analyze 20 data streams or 50 video streams or something, you could even run it, well, actually you can't run it on a CPU. You could run it on a GPU for the most part. But we can do a side-by-side with the latest GPU, and there's an order-of-magnitude advantage in efficiency and size and thermals to using a dedicated AI accelerator.
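
As a back-of-envelope check, with made-up wattages standing in for "hundreds of watts" and "single-digit watts" (these are assumptions, not figures quoted in the conversation):

```python
# Back-of-envelope only: these wattages are illustrative assumptions,
# not vendor-quoted figures.
gpu_watts = 300        # "hundreds of watts"
accel_watts = 6        # "single-digit watts"
streams = 50           # one edge server fanning in 50 video feeds

print(f"power ratio: {gpu_watts / accel_watts:.0f}x")               # 50x
print(f"per-stream budget: {accel_watts / streams * 1000:.0f} mW")  # 120 mW
```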

Dave Nicholson:
So with that in mind, this idea that you're targeting certain market segments where you can deliver efficiency in a lot of different ways, not the least of which is power consumption: segment out the market for us. I think on one end, you've got data centers with nuclear power plants directly next to them. On the other end, you've got devices we wear on our wrists or carry in our pockets. Where do you fit? And then tell me more about the vision systems you alluded to earlier.

Keith Kressin:
Right. Yeah, so it's important to separate system architecture and design point. We have an architecture that's very efficient in terms of scale, so we can build something for the high end, for the low end, and all parts in between. We chose a design point with our current product that doesn't focus on saving every microwatt, and it doesn't focus on getting the absolute best performance in a data center application. I think there are plenty of mature vendors, and even startups and cloud providers, going after the data center. But on the edge platform, single-digit watts, something in an M.2-type form factor or a USB plug-in, that's really where we're focused. And there are a lot of existing and new applications emerging that want to take advantage of AI: things that people have never thought of before, that embedded systems can now do with the help of AI.

So from a power standpoint, that's what we're focused on. From an application standpoint, what's really a sweet spot for us is that there are hundreds of thousands, possibly millions, of dumb cameras out there in the world, cameras that don't have intelligence built in, and they're monitoring in retail, or safety in a factory, or lots of different street corners, lots of different scenarios. And in the vast majority of cases, that data is basically being saved for offline analysis. What we want to do is enable those cameras that are streaming information to become intelligent. And the way we do that is the cameras stream into a central edge server.

We can provide AI processing on that edge server for just a handful of watts, and process 20, 50, a hundred, 500 data streams just using some of our M.2 cards. And then action can be taken. So AI can be used to figure out what to do at that intersection, or what to do in terms of safety, or if there's a security problem, as opposed to something happening and then pulling up the recording and doing an analysis after the fact. We want to enable real-time capability, and that's a really good sweet-spot market for us. And then in the future we'll be expanding our portfolio, but it makes sense to focus on a niche where you're really good to begin with.
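
A minimal sketch of that edge-server pattern: many camera feeds fan into a shared queue drained by a small pool of accelerator workers. The stream counts and the run_model stub below are placeholders, not a real deployment.

```python
# Minimal sketch of the edge-server pattern: N camera streams fan into a
# shared queue drained by a small pool of accelerator workers (say, one
# per M.2 card). run_model is a stub standing in for on-device inference.
import queue
import threading

N_STREAMS = 50        # 20, 50, 100+ cameras feeding one edge server
N_ACCELERATORS = 2    # a handful of watts each

frames = queue.Queue(maxsize=256)

def camera_reader(cam_id: int):
    """Stand-in for an RTSP reader loop (three frames per camera here)."""
    for frame_no in range(3):
        frames.put((cam_id, f"frame-{frame_no}"))

def run_model(frame) -> str:
    """Stub for inference that would run entirely on the accelerator."""
    return "no-event"

def accelerator_worker():
    while True:
        cam_id, frame = frames.get()
        if run_model(frame) != "no-event":
            print(f"real-time alert from camera {cam_id}")  # act now, not offline
        frames.task_done()

for _ in range(N_ACCELERATORS):
    threading.Thread(target=accelerator_worker, daemon=True).start()

readers = [threading.Thread(target=camera_reader, args=(i,)) for i in range(N_STREAMS)]
for r in readers:
    r.start()
for r in readers:
    r.join()
frames.join()  # every frame analyzed as it streams; nothing saved for later
```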

Dave Nicholson:
So if I have a warehouse and I already have a hundred cameras, let's say I've put the system in within the last five years, so we're not talking about 50-year-old CCTV but a digital camera system, and at this point let's say it's just recording. You can take that data stream, and when we talk about concepts like core and edge and the Internet of Things, there's always a question about how far out you push intelligence to the edge. You're saying that in that scenario, you don't need to push your intelligence out into the camera device; instead, you can be in the server that those devices are feeding, and the intelligence can live there.

Keith Kressin:
That’s exactly right.

Dave Nicholson:
I know that might seem like an obvious point to make or to clarify, but I don’t think that it is when some camera manufacturers are trying to push intelligence to the edge. Wait a minute, we already have cameras.

Keith Kressin:
And many also send that data to the cloud to be analyzed. So it costs a lot of money in terms of communication, there are privacy issues, and you lack control. It's much, much better to do things locally if possible, and we'll enable companies to do that locally. So even retail shops: if you have 20 cameras in each retail shop and you have a lot of retail shops in your city, you might have hundreds or thousands of cameras going to a central location. We can be either at each store or at that central location, running one or more AI models concurrently to analyze those data streams. And like I said, we partner with other software vendors, and then there's a real-time action that can be taken depending on what the customer wants to do.

Dave Nicholson:
So on the subject of partnership, it sounds like that's important to your go-to-market strategy. There's a difference between monitoring activity on a dock where ships are coming in and out and monitoring activity in a retail environment or a traffic scenario. Is that where the rubber meets the road, your collaboration with partners that create those end-user solutions?

Keith Kressin:
Absolutely. And there are a number of partners. You have the company that builds the box that accepts our hardware. Okay, that's probably the easy part. But then there are companies that specialize in AI models for certain industries, vendors that make customer-facing dashboards so users can take action, or software integrations with cities for smart-city applications. And that's why, for us, ease of use is so important: there are so many different applications, and they're changing so fast. You want the ability to upgrade to the latest model seamlessly. We want to support a broad set of customers, but not have engineers that have to talk to each customer.

We'd much rather provide them with a set of tools and get them up and running very quickly, with high accuracy and very, very low touch from us. And so that's the link between the market we're going after and our product: the product has great metrics for streaming inputs, because we're a pure dataflow operation, but also the ease of use that's so extremely important. You'd think other people would have solved it, but it's a very difficult problem to solve. You can make it easier to use, but you normally need to give up on something else, like accuracy.

Dave Nicholson:
Yes. Well, on behalf of a grateful world, I'd like to thank you and the folks at MemryX for not being distracted by the shiny object that is chatbot generative AI at this point, and for really doing the blocking-and-tackling stuff that businesses need.

Keith Kressin:
Yeah, gen AI is certainly exciting, and there are a lot of good applications for it, but it's one thing to focus on the technology where the market is, and another to focus on the application of that technology. We're really focused on the application and the practical ramp of it.

Dave Nicholson:
Keith Kressin, CEO of MemryX, thanks for joining us here at the Six Five Summit.

Keith Kressin:
Thank you.

Dave Nicholson:
Dave Nicholson, for the rest of the crew, stay tuned. More to come.
