The Evolution of Machine Intelligence

Mar 23, 2022

The evolution of Machine Intelligence (or Artificial Intelligence, AI) is one of the main trends shaping the future of humanity. It is a critical piece in addressing the aging population, stagnating productivity rates, and the falling efficiency of energy production. It will also be crucial to our expansion beyond Earth. It may turn out to be the most important technology we have ever developed: it holds the promise of wonderful achievements but also poses a grave danger to humankind.

Below is our framework for thinking about AI: what it is, what its current state is, and how it may evolve in the future.

What is AI

The term AI has been around for a while and is an umbrella that covers many computer science fields pursuing the goal of enabling machines to perform tasks commonly associated with intelligent beings. The definition is so broad, however, that we don’t find it practical to use. Instead, we look at machine capabilities as a spectrum, moving from simple, narrow, and straightforward tasks to more complex and fuzzy ones.

The further we move along this spectrum, the more resources the machine needs (compute power, storage, bandwidth), and the more sophisticated the hardware and software must be.

Hardware is the enabling driver of the advances. As an example, the concept of the Artificial Neural Network (ANN) has been around since at least the 1960s, but adoption of the technology was hindered by insufficient computational power.
With advances in semiconductor manufacturing, more sophisticated hardware has been built, and new software has been developed to utilize the available power.

The first software was written in simple imperative languages, like assembler (or, before that, encoded on punched cards), because those were the only instructions the hardware was able to handle. Then came procedural languages, like C, a superstructure on top of assembler that dramatically simplified the creation of programs. The next huge step forward was object-oriented programming (OOP), a technique that needed much more powerful hardware, as well as more sophisticated software (the run-time environment), to run the new programs. As of now, we are in the golden age of Artificial Neural Networks (ANNs). Each of these steps enabled machines to solve more and more complex tasks.

But there is a step-change from OOP and its predecessors to ANN. For the former, you “code” the system actions, precisely describing what the system should do. The output will also be precise, and its quality will be defined by the code you wrote and will not change until you change the code.

An ANN, however, is an empty vessel. When first set up, it can do nothing. You train it to perform the desired action by providing it with input data and giving it feedback on how good its output is (e.g. you “show” it different images and ask it to identify the objects in them, or provide part of a sentence and ask it to estimate what the next word will be). The output changes as the system strives to get more favorable feedback. This approach is called Machine Learning: the ability to learn without being explicitly programmed.

The benefit of the approach is that you don’t have to figure out the precise algorithm of how a specific action should be performed by the system. The system has to figure it out itself based on the received feedback.
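To make the idea of learning from feedback concrete, here is a minimal sketch in Python. It is a toy with two parameters, not a real neural network, and the names (train, samples, lr) are our own illustration. The rule y = 2x + 1 is never written into the model; the model only sees examples and adjusts itself to reduce its error.

```python
def train(samples, steps=2000, lr=0.01):
    """Fit prediction = w * x + b using only the error feedback on each example."""
    w, b = 0.0, 0.0  # the "empty vessel": the model knows nothing yet
    for _ in range(steps):
        for x, target in samples:
            prediction = w * x + b
            error = prediction - target  # feedback: how far off was the output?
            w -= lr * error * x          # nudge each parameter to shrink the error
            b -= lr * error
    return w, b


# Training data generated from a rule the model is never told about.
samples = [(x, 2 * x + 1) for x in range(-5, 6)]
w, b = train(samples)
print(f"learned w={w:.3f}, b={b:.3f}")  # w approaches 2 and b approaches 1
```

The same loop would learn a different rule if given different examples, which is the point: the behavior comes from the data and the feedback, not from the code.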

There are several trade-offs with this approach:

The output of a machine learning system is always imprecise, always an approximation, so it doesn’t make sense to use it to calculate your taxes.
The output is always probabilistic: an image recognition ML system will tell you that there is a 96% probability that there is a cat in this image, but never 100%.
When presented with input data very different from the training set, the system may produce completely nonsensical results.
Sensitivity to training data. Current ML techniques require very large amounts of high-quality data for their training and will perform poorly otherwise. If there are biases in the training data, these biases will be inherited by the system.
The inner workings of the system are opaque. Since we didn’t explicitly program the system but instead let it learn, we don’t know how exactly it does the task and (with complex implementations) can’t explain why it made a certain mistake.
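The probabilistic nature of the output can be illustrated with the softmax function, which many classification networks use as their final step to turn raw scores into probabilities. The sketch below uses made-up scores and labels (they are assumptions for illustration, not the output of a real model). However confident the model is, each probability stays strictly below 100%.

```python
import math


def softmax(scores):
    """Convert raw model scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["cat", "dog", "fox"]   # hypothetical classes
raw_scores = [4.0, 0.5, 0.2]     # hypothetical raw outputs of a classifier
probs = softmax(raw_scores)

for label, p in zip(labels, probs):
    print(f"{label}: {p:.1%}")
```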

It makes sense to think about modern ANN-based ML systems (they are called Deep learning systems or DL systems) as acting like human intuition. Our intuition is also imprecise and probabilistic, it can be wrong when used outside our area of experience, and we can’t explain exactly how it works.

These shortcomings are the price of the DL system’s ability to perform actions that are way too complex to be hard-coded, including those where human experts do not even know the correct algorithm (e.g. there is no known algorithm to reliably differentiate a picture of a dog from a picture of a cat).

The actions we automate with a DL system should have some tolerance for imprecision and errors, but it turns out that most complex real-world actions do. In production, DL systems are complemented by a human in the loop and/or traditional software that filters out the critical errors.
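One way such a safeguard might look is sketched below in Python. The function name, the threshold value, and the set of blocked labels are all hypothetical. A hard-coded rule filters out results that must never be acted on automatically, and low-confidence predictions are handed to a person instead of being accepted.

```python
def route_prediction(label, confidence, threshold=0.9, blocked_labels=frozenset()):
    """Decide what to do with a single model prediction."""
    if label in blocked_labels:
        return "rejected_by_rule"  # traditional software filters critical errors
    if confidence < threshold:
        return "human_review"      # human in the loop handles uncertain cases
    return "auto_accept"           # confident and allowed: act automatically


predictions = [("invoice", 0.97), ("invoice", 0.62), ("wire_transfer", 0.99)]
for label, confidence in predictions:
    decision = route_prediction(label, confidence, blocked_labels={"wire_transfer"})
    print(label, confidence, "->", decision)
```

The threshold is a business decision: lowering it automates more cases at the cost of more errors slipping through.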

Below is a summary of approaches:

To learn more about the “system capabilities’ exploration”, take a look at this note on improving the DL system’s results by including “Let’s think step by step” text in the instructions, or this one on the image generation system’s “secret language.”

Note that in practice most implementations of AI systems include multiple types of ANNs as well as relevant traditional software modules to compensate for the DL shortcomings.

When talking about AI applications below, we mean these combined systems.

AI Applications

AI adoption started from tasks with a narrower scope and higher tolerance for mistakes, like recommenders (showing “people also buy” widgets, or choosing the next post in the social network feed), dictation, OCR, and basic image recognition. Though such tasks used to be performed by hard-coded algorithms, the DL approach demonstrated much better results.

The adoption is now expanding along the following three vectors.

Boosting current software capabilities in every field.

Just like in the examples above, DL techniques are augmenting hard-coded algorithms in every industry from healthcare to military, improving data analysis, simulations, data transformation, and more. Every machine will be powered by the next generation of chips and firmware supporting DL.

Augmenting knowledge workers/co-creation.

Early results show that DL techniques can be used to enable co-creation where humans direct the machine to come up with drafts and edit the output. There are early examples in co-writing, co-coding, co-composing, co-designing, co-science, and so on.

Augmenting physical labor.

The physical world is a challenging environment. While narrower tasks, like keeping a car in its lane, picking a product from a box, or sanding a surface, are solved with single-purpose robots, it is less clear whether current hardware and AI techniques will be enough to deliver autonomous vehicles on public streets and more versatile robots (see the areas of research section below).

That said, even solving a subset of the tasks above will have a significant impact on the economy. The installation of industrial robots already adds the equivalent of about 1.3 million workers annually, and this number has grown 3x over the last ten years [1]. While most of the robots installed today are traditional ones, AI penetration is growing, especially in the segment of collaborative robots. In services, the adoption of robots is just emerging, driven by fulfillment, cleaning, food services, and security.
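The worker-equivalent estimate can be reproduced from the two figures given in footnote [1]. The short Python calculation below uses only those numbers; the variable names are ours.

```python
# Figures taken from footnote [1] of this article.
WORKERS_PER_ROBOT = 3.3
ROBOTS_INSTALLED_PER_YEAR = 400_000

worker_equivalent = WORKERS_PER_ROBOT * ROBOTS_INSTALLED_PER_YEAR
print(f"{worker_equivalent / 1e6:.2f} million worker-equivalents per year")  # about 1.3 million
```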

The following figure lists some of the AI use cases based on their maturity. All of them are still maturing, as the field is relatively young, but some are further ahead.

Market size

Based on our estimates, we believe the overall global market for AI systems to be on the order of $10T, broken down into the four areas below:

Boosting current capabilities: ~$1T.
Physical labor: ~$5T (ground transportation ~$1T; physical labor in mining, construction, manufacturing, wholesale and retail trade, transportation and warehousing ~$4T).
Knowledge workers: ~$4T.
AI infrastructure (hardware, tools, and services to build/run AI systems): ~$1T.
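As a quick sanity check, the four rounded estimates above can be added up in Python. The dictionary keys are shorthand for the areas listed; the values are the approximate figures from the list, in trillions of dollars.

```python
# Approximate market estimates from the list above, in trillions of USD.
market_trillions = {
    "Boosting current capabilities": 1,
    "Physical labor": 1 + 4,  # ground transportation + other physical labor
    "Knowledge workers": 4,
    "AI infrastructure": 1,
}

total = sum(market_trillions.values())
print(f"Total: ~${total}T")
for area, size in market_trillions.items():
    print(f"  {area}: ~${size}T ({size / total:.0%} of total)")
```

The rounded parts sum to roughly $11T, which is consistent with a total on the order of $10T.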

Depending on how powerful the technology will turn out to be, we may see a market of a few trillion dollars over the next decade.

The AI system stack

Below are the main components of the stack.

Talent. The engineers and researchers. But also the headhunters, outsourcing shops, freelance marketplaces, and other services that help connect these experts with the work.
Hardware. TPUs, GPUs, and CPUs are currently the main sources of computing power for ANNs, while novel technologies, like neuromorphic chips, are being developed. Initially, hardware was configured and set up in-house, but developers now increasingly leverage capacity available in the public cloud.
Architectures. Since ANNs are not programmed but trained to perform specific actions, the job of the engineer is to come up with the most efficient architecture of the different types of ANNs (e.g. CNNs, transformers) and traditional software based on the task at hand. The building blocks and reference architectures are developed by the open-source community, academia, and businesses, led by big tech companies (Google, Meta, Microsoft, Amazon, OpenAI, and others).
Data. Can be collected and labeled in-house and/or purchased from third parties. The data can also be artificially generated to cover cases that are not well represented in the existing datasets. While there are open-source datasets, data availability has become one of the most important bottlenecks for open-source/academic researchers (the other one is access to powerful hardware).
Tooling and services. Software that helps build and manage AI applications or robotic systems in production. Tools for data management, deploying, monitoring, analytics, etc. Also in this category, we include services that collect or label data for clients.
Model-as-a-Service (MaaS). While some app developers will design and train the DL systems from scratch, many will use MaaS, where the pre-trained ANN (model) is available and ready to be used with limited fine-tuning. The cloud-based models do all the processing heavy lifting, while the app collects the data, coordinates the models, and presents the output in the form (or a set of actions) needed by the customer. It is especially relevant when the model is very large (e.g. GPT-3 or PaLM) and it would be impractical to deploy it separately for each app/customer. Leveraging MaaS also dramatically reduces time-to-market. The most advanced systems leverage multiple MaaS offerings from different vendors to achieve the best results.
Application. Performs desired actions and delivers end-user value. It is implemented as an app (cloud, desktop, mobile, VR/AR, etc.), as equipment firmware/software, or as a robot.
Embodiment. Sensors, manipulators, locomotion, etc. for the systems interacting with the physical world (EVs, drones, robots).
Integration. Implementation of application in the customer-specific environment, integration with other systems, including other apps. Fine-tuning pre-trained ANNs and MaaS using customer-specific data (e.g. adjusting the general language translation service to the customer-specific terminology), ongoing work with the customer data, and AI system monitoring to make sure it performs as planned.
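The division of labor between an application and MaaS described in the stack above can be sketched as follows. This is a simplified Python illustration under stated assumptions: vendor_a and vendor_b are local stand-in functions, not real services, and the confidence scores they return are invented. A real application would call each vendor's API over the network instead.

```python
from typing import Callable, Dict, Tuple

# A model is anything that takes a prompt and returns (output, confidence).
ModelFn = Callable[[str], Tuple[str, float]]


def vendor_a(prompt: str) -> Tuple[str, float]:
    """Hypothetical stand-in for a cloud-hosted model from one vendor."""
    return (f"A: {prompt.upper()}", 0.72)


def vendor_b(prompt: str) -> Tuple[str, float]:
    """Hypothetical stand-in for a cloud-hosted model from another vendor."""
    return (f"B: {prompt.title()}", 0.88)


def best_response(prompt: str, vendors: Dict[str, ModelFn]) -> Tuple[str, str, float]:
    """Query every vendor and keep the answer with the highest confidence."""
    results = {name: model(prompt) for name, model in vendors.items()}
    winner = max(results, key=lambda name: results[name][1])
    text, score = results[winner]
    return winner, text, score


vendors = {"vendor_a": vendor_a, "vendor_b": vendor_b}
print(best_response("hello world", vendors))
```

The application code stays small: it collects the input, coordinates the models, and presents the chosen output, while the heavy processing happens inside the models.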

Known challenges of AI systems and areas of research

While modern AI systems demonstrate quite impressive results in tasks ranging from image recognition to playing Go, and even explaining jokes, these are still the early days, and there are multiple areas for improvement, including the ability to generalize better, learn from fewer samples, and process multiple types of data, as well as energy efficiency. Many of them are interconnected and will likely require innovation at the levels of systems architecture, scale, and hardware architecture to achieve breakthroughs.

Trends

We expect these multi-year trends to shape the evolution of AI systems.

Bigger models and more powerful hardware enabling more advanced actions performed by AI systems.
New architectures enabling better generalization and reasoning.
Multimodality — the ability to process and relate information from multiple modalities, like text, audio, visual, etc. Models like Dall-E, Imagen, and NUWA are steps in this direction.
Embedding AI into all modern software already used by businesses and consumers (both as an underlying engine and as an interface that allows the software to be controlled with language, gestures, etc.).
Emergence of AI serving as connective tissue between different systems (the next level of RPA).
Embedding AI in every device powered by a chip.
Maturation of DataOps/MLOps allowing for AI applications at scale in enterprise.
An explosion of applications built on top of MaaS, where developers benefit from continuous improvements of MaaS and leverage multiple vendors for optimal results.
Transition from modular to end-to-end architectures.
Expansion of real-time AI.
Co-creation in copywriting, animation, music, computer games, and other creative tasks.
Declining costs of industrial and service robots, including Robots-as-a-Service / Pay-as-you-use models and a growing supply of low-cost robots.
Growing robot density in industry (the current world average is 126 robots per 10,000 employees, while in South Korea the number is 932). Gradual penetration into services.
Power law distribution of vendors in hardware (chips), MaaS, and edge autonomous software (powering drones and robots). A handful of companies will control most of the market.
Global progress in AI led by the US and China. Scientists from the Chinese Academy of Sciences, Peking University, and Tsinghua University are already competitive with those from the oldest and best universities in the world: Oxford, Cambridge, Harvard, and Stanford. Overall, the CCP considers AI a critical area for China’s global leadership. China is also by far the largest destination for industrial robots (48% of global installs in 2020).

To dive deeper, here is a good overview of the AI evolution by Nathan Benaich and Ian Hogarth.

[1] One industrial robot is equivalent to about 3.3 workers, and there are about 400,000 robots installed per year.