Artificial intelligence is a focal point for the business community globally. Every day AI makes the case for cost savings, efficiency, and competitive advantage. But AI transformation is a challenging task. Decision makers and business leaders shouldn’t just act on AI; they should learn what AI is and figure out how to apply specific AI/ML techniques.

Mahmood is a technologist and futurist whose contributions to artificial intelligence, machine learning, deep learning, robotics and data centers are hard to overestimate. We recently spoke with Mahmood about the need for AI inferencing.

Blueshift Research: What are the emerging drivers around the inferencing market?

Mahmood: In the field of artificial intelligence, inference is the component of the system that applies logical rules to the knowledge base to deduce new information. Technology works in layers, and inferencing is a part of the platform. Autonomous vehicles, intelligent machines, supply chain modernization, intelligent services, smart cities and visualization platforms are some of the growth drivers.

AI is one of the core drivers for the data center platform. It involves:

    1. Training (deep learning, which involves teaching an artificial “neural network” how to make inferences from data like humans do)
    2. Applications (machines applying their training to new data).

Blueshift Research: What are the processing requirements for conversational AI? Where does current chip technology stand in meeting them?

Mahmood: Conversational AI is the next iteration of human-to-computer interaction, one that will replace the mouse and keyboard. Conversational AI brings together three separate technologies: artificial intelligence, messaging apps and speech recognition (natural language processing, or NLP). Achieving human-level language capabilities in automated customer support, industrial machinery and robots, or personal digital assistants will require large AI models. Recent developments, however, bode well for the future evolution and proliferation of conversational AI.

With GPUs, we’ll see a twofold reduction in latency and a fivefold improvement in throughput during inference. Google’s BERT is moving conversational AI forward. BERT is many things, for example an open-source NLP model. I believe we are at an inflection point where conversational technologies have the power to evolve and transform industries.

Blueshift Research: How big do you expect the AI inferencing market to be?

Mahmood: The global artificial intelligence (AI) market is projected to reach $200 billion by 2025, registering a CAGR of 40% from 2020 to 2025. GPUs and CPUs lead the market today, but there is an expanding role for FPGAs, ASICs, SoC accelerators, and other emerging chipsets. Inferencing will be the dominant part of the AI market for at least the next two years. I expect the market for AI inferencing hardware in the data center to exceed $10B by 2025.
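As a rough check on the figures quoted above, the short sketch below works out the 2020 market size implied by a $200 billion target in 2025 and a 40% CAGR. The function name `implied_base` is hypothetical, and the calculation assumes five full years of compounding at a constant rate; it illustrates the arithmetic only and is not a figure stated in the interview.

```python
def implied_base(target: float, cagr: float, years: int) -> float:
    """Starting value that grows to `target` after `years` at a constant `cagr`."""
    return target / (1.0 + cagr) ** years


# Figures from the answer above, in billions of US dollars.
target_2025 = 200.0
cagr = 0.40
years = 5  # assumption: 2020 to 2025 counted as five compounding periods

base_2020 = implied_base(target_2025, cagr, years)
print(f"Implied 2020 market size: ${base_2020:.1f}B")
```

Under those assumptions the implied 2020 base comes out at roughly $37 billion.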

Blueshift Research: Do you expect all the demand for inferencing chips to come from hyperscalers, or will there be broader demand in the next 12-18 months?

Mahmood: Data centers mainly use AI chips for training. Hyperscalers are now demonstrating that marketing hype in ML/DL is not going to cut it. I expect to see inferencing AI chips more widely deployed in automotive, robotic and industrial Internet of Things applications, so the demand will initially come from hyperscalers and the edge.

Blueshift Research: Will GPUs be the chip of choice over CPUs and FPGAs for inferencing? Why or why not?

Mahmood: Not yet! The CPU is the brain of the computer, and a GPU acts as a specialized microprocessor. A CPU can handle multiple tasks, and a GPU can handle a few tasks very quickly. CPUs will account for 50% of AI inferencing demand by 2025, with ASICs (which are custom chips designed for specific activities) at 20%, and GPUs and other architectures picking up the rest. GPUs are not as cost-effective today for automating inferencing within mobile, IoT, and other “edge computing” uses. Technologies including CPUs, ASICs, FPGAs, and various neural network processing units have performance, cost, and power-efficiency advantages over GPUs in many edge-based inferencing scenarios, such as autonomous vehicles and robotics.

Artificial intelligence is awakening the chip industry’s animal spirits.  After decades of the big three (Intel, AMD and NVIDIA) ruling the industry, the insatiable demand for computing generated by AI has created an opening for newcomers. And it may even be big enough to allow some startups to establish themselves as big, independent firms.