China is two or three years behind America in building foundation models of AI. There are three reasons for this underperformance. The first concerns data. A centralized autocracy should be able to marshal lots of it—the government was, for instance, able to hand over troves of surveillance information on Chinese citizens to firms such as SenseTime or Megvii that, with the help of China’s leading computer-vision labs, then used it to develop top-notch facial-recognition systems.
That advantage has proved less formidable in the context of generative AIs, because foundation models are trained on the voluminous unstructured data of the web. American model-builders benefit from the fact that 56% of all websites are in English, whereas just 1.5% are written in Chinese, according to data from W3Techs, an internet-research site. As Yiqin Fu of Stanford University points out, the Chinese interact with the internet primarily through mobile super-apps like WeChat and Weibo. These are “walled gardens”, so much of their content is not indexed on search engines. This makes that content harder for AI models to suck up. Lack of data may explain why Wu Dao 2.0, a model unveiled in 2021 by the Beijing Academy of Artificial Intelligence, a state-backed outfit, failed to make a splash despite its possibly being computationally more complex than GPT-4.
The second reason for China’s lackluster generative achievements has to do with hardware. In 2022 America imposed export controls on technology that might give China a leg-up in AI. These cover the powerful microprocessors used in the cloud-computing data centres where foundation models do their learning, and the chipmaking tools that could enable China to build such semiconductors on its own.
That hurt Chinese model-builders. An analysis of 26 big Chinese models by the Centre for the Governance of AI, a British think-tank, found that more than half depended on Nvidia, an American chip designer, for their processing power. Some reports suggest that SMIC, China’s biggest chipmaker, has produced prototypes just a generation or two behind TSMC, the Taiwanese industry leader that manufactures chips for Nvidia. But SMIC can probably mass-produce only chips which TSMC was churning out by the million three or four years ago.
Chinese AI firms are also having trouble getting their hands on another American export: know-how. America remains a magnet for the world’s tech talent; two-thirds of AI experts in America who present papers at the main AI conference are foreign-born. Chinese engineers made up 27% of that select group in 2019. Many Chinese AI boffins studied or worked in America before bringing expertise back home. The covid-19 pandemic and rising Sino-American tensions are causing their numbers to dwindle. In the first half of 2022 America granted half as many visas to Chinese students as in the same period in 2019.
The triple shortage of data, hardware and expertise has been a hurdle for China. Whether it will hold Chinese AI ambitions back much longer is another matter.
Excerpts from Artificial Intelligence: Model Socialists, Economist, May 13, 2023, at 49