The race to create more powerful artificial intelligence applications has also created a huge demand in China for high-quality training data.
SCOTT DETROW, HOST:
The race to create more powerful artificial intelligence applications has also created a huge demand for high-quality training data and competition over who gets to use that data, and a lot of that demand is in China, as NPR’s Emily Feng and Aowen Cao report.
(SOUNDBITE OF MOUSE CLICKING)
EMILY FENG, BYLINE: In this brand-spanking-new office building in northeastern China, rows and rows of people sit silently clicking at their computer screens. This is the fuel that powers much of generative AI - raw data - and this data processing center is the brainchild of this man.
HENRY CHEN: My name is Henry, Henry Chen.
FENG: He’s the founder of Sapien AI. It hires people around the world to collect data and tag and organize it, so it can be used to train a variety of artificial intelligence applications. China is a big market.
CHEN: Especially after DeepSeek came out.
FENG: DeepSeek, the Chinese chatbot performing on par with American-trained chatbots but trained at a fraction of the cost - that demand for data is why Chen’s company now has about 60 employees in China labeling maps of Chinese streets. This data today is being used to train an autonomous driving program.
AOWEN CAO, BYLINE: Looks very abstract.
FENG: That’s NPR producer Aowen Cao.
CAO: I see people working in front of computers, but on the computer screens, they’re black backgrounds with squares.
FENG: Squares and green dots - it almost looks like, Aowen says laughing, the television show "Severance." The data may look abstract, but it’s a valuable commodity, says Rogier Creemers. He’s a professor at Leiden University in the Netherlands who studies China’s digital technology policies.
ROGIER CREEMERS: They believe that data is an economic input, and in a way, they see it as akin, in that sense, to raw materials.
FENG: Chatbots today, like ChatGPT, need literally trillions of data points to get up to speed, and who owns that data has increasingly been a competition between companies and between countries like the U.S. and China. Each wants an edge over the other in AI, and that means hoarding data. Data is such a choke point that since last year, China’s cyberspace regulators need to approve any bulk export of data out of the country, which is partly why Sapien AI, a Canadian company, is in China to begin with.
CHEN: For the AI models that are trained here, the data needs to be processed in the country and cannot leave the country.
FENG: The race to create and protect data is also because the data AI companies want is getting more complicated. Olga Megorskaya, the founder of an Amsterdam-registered data processing company called Toloka, now focuses on creating datasets for highly technical scientific and engineering fields. She uses an analogy that compares early AI models to human toddlers.
OLGA MEGORSKAYA: The person is like 2 years old. He or she is taught by children’s books with very bright pictures.
FENG: And more advanced AI models are like university students.
MEGORSKAYA: When she goes to the university, there are dozens of textbooks that she needs to read.
FENG: For an AI model, that means gobbling up more and more advanced datasets. The data industry is important enough that local governments in China, once dependent on dying industries like steelmaking and coal mining, are actively recruiting AI data processing companies. Here’s Creemers at Leiden University again.
CREEMERS: China wants to make a large amount of money through creating the industries of the future.
FENG: The rust belt city of Shenyang, where Sapien AI chose to locate one of its offices, is one of seven Chinese cities that says it wants to become an AI data hub. The city offers low interest rates on loans and flexible and affordable office space. Here’s Chen again at Sapien AI. They benefited from this support.
CHEN: So they give us a lot of support as well, so we just find a really good environment to set up the office here.
FENG: Because data processing employs a lot of young people - China’s economy never fully recovered from a global coronavirus pandemic, and youth unemployment has concerned policymakers enough that they briefly stopped publishing that statistic.
(SOUNDBITE OF MOUSE CLICKING)
FENG: One of the young people working at Sapien AI is Huang Rui, age 21. She’s a data quality specialist.
HUANG RUI: (Non-English language spoken).
FENG: She says the work of data processing is suitable for people with obsessive-compulsive tendencies because it requires a high level of attention to detail. Data processing is admittedly not the most exciting work, says Chen, her boss.
CHEN: Just picture yourself sitting at a desk and try to draw bounding boxes around cars for 40 hours a week.
FENG: But sometimes innovation requires someone - actually, a whole lot of people - to do the boring work. Emily Feng, NPR News.
Copyright © 2025 NPR. All rights reserved. Visit our website terms of use and permissions pages at www.npr.org for further information.
Accuracy and availability of NPR transcripts may vary. Transcript text may be revised to correct errors or match updates to audio. Audio on npr.org may be edited after its original broadcast or publication. The authoritative record of NPR’s programming is the audio record.