The Chinese government on Monday issued a set of guidelines to drive high-quality development of the data annotation industry -- the first national-level strategic directives for the emerging sector that is crucial for successful AI applications.
Data annotation, the categorizing and labeling of various data types including text, audio, images, and video, is a foundational process that enables AI systems to deliver accurate and reliable results, powering many advanced fields such as autonomous driving, the low-altitude economy, smart manufacturing, and intelligent healthcare.
The guidelines, jointly released by the National Development and Reform Commission, National Bureau of Statistics, Ministry of Finance, and Ministry of Human Resources and Social Security, set key development goals for the industry by 2027. These include significant advances in specialization, automation, and technological innovation, and a compound annual growth rate of over 20 percent in the industry's scale. The guidelines also highlight the critical role of data annotation in underpinning the innovative development of AI.
As technology evolves, there's a growing demand for professionals with specialized knowledge in sectors such as finance, transportation, energy, and healthcare, said Meng Qinguo, executive director of the Laboratory of Computational Social Science and State Governance under Tsinghua University.
"The shortage of high-quality data has become a bottleneck in developing some large AI models in China. Data annotation involves processes like screening, cleaning and classifying data to produce high-quality datasets that can let machines read fast and learn fast. As the sector becomes increasingly automated and intelligent and more segments emerge, it is also shifting from a labor-intensive to a knowledge-intensive industry," Meng said.
Efforts will be made to build distinctive and effective data annotation hubs to form a comprehensive ecosystem for the industry. Pilot projects will be implemented in seven cities, including Chengdu, Shenyang, Hefei, and Changsha.
The guidelines underscore demand- and innovation-driven development with 13 specific measures.
For the first time, the guidelines encourage the use of public data for annotation, advocating its development and use in key areas such as modern agriculture, smart manufacturing, and information services in accordance with laws and regulations.
"The Central Economic Work Conference has proposed launching an AI Plus initiative, with many localities accelerating the deployment of government AI models. These models will see more applications in areas important to people's lives, such as transportation, meteorology, and healthcare, promoting an increasing demand for public data-based annotation. The efforts to unlock the value of public data have focused on ensuring effective annotation of massive amounts of public data," said Meng.
To boost innovation, the guidelines call for improved data annotation standards and support for the development of original technologies and equipment that integrate hardware and software solutions.
The document also outlines supportive measures to boost the industry, including providing greater fiscal, tax, and financial support, adding data annotation services to the government procurement list, and training more professionals.

China issues guidelines on high-quality development of data annotation industry