Schematic summarizing the time coverage and target populations of major Lee-Campbell Group projects

We have five families of ongoing historical ‘big data’ projects. Here we introduce them briefly, and link them to pages with extended descriptions. The figure summarizes the temporal coverage and target populations of the relevant datasets.

For details on the construction of these datasets, their contents, and the results of their analysis, please see our paper Historical Chinese Microdata. 40 Years of Dataset Construction by the Lee-Campbell Group, published in 2020 in Historical Life Course Studies.

The following table lists the time coverage, numbers of records, and numbers of persons in the datasets in each of the five families, with links to brief descriptions of each.

Time Coverage Records Persons
China Multigenerational Panel Datasets (CMGPD) 中国多代人口数据库 3,013,861 527,325
Liaoning (LN) 辽宁 1774-1909 1,513,352 266,091
Liaoning Supplements (LN-S) 1909-2004 38,650 38,650
Shuangcheng (SC) 双城 1866-1913 1,346,826 107,551
Imperial Lineage (IL) 1633-1933 115,033 115,033
China Government Employee Datasets (CGED) 中国历史官员数据库 5,329,186 562,414
Qing Jinshenlu (Q-JSL) 清-缙绅录 1760-1911 4,433,600 327,618
Qing Examination Records (Q-ER) 1645-1907 102,484 102,484
Qing Fenfa Lists (Q-FF) 清-分发 1788-1903 45,337 45,337
Beiyang Zhiyuanlu (BY-ZYL) 北洋-北洋职员录 1911-1924 658,873 35,814
Beiyang Fenfa (BY-FF) 北洋-分发 1911-1912 7,442 7,442
Republic of China (ROC) 1927-1949 81,450 43,719
China University Student Datasets (CUSD) 中国大学生数据库 564,664 450,714
Qing/Beiyang 1902-1927 129,742 92,285
Republic of China (ROC) 1928-1949 186,843 154,833
Overseas Students (OS) 1847-1948 97,186 52,703
People’s Republic of China (PRC) 1949-2003 150,893 150,893
China Rural Revolution Datasets (CRRD) 中国农村革命数据库 116,244 215,960
Land Reform (LR) 土改 1947-1948 91,263 90,163
Siqing (SQ) 四清 1946-1966 24,981 125,797
China Professional Occupation Datasets (CPOD) 中国专业职业人员数据库 60,278 60,265
Republic of China (ROC)
  Chartered Accountants (CA) 1911-1949 2,021 2,008
  Engineering Professionals (EP) 1911-1949 20,967 20,967
  Legal Professionals (LP) 1911-1949 1,350 1,350
  Health Professionals (HP) 1911-1949 5,024 5,024
  University Employees (UE) 1911-1949 20,340 20,340
People’s Republic of China (PRC)
  University Employees (UE) 1949- 5,500 5,500
Multiple eras
  Academicians and Experts 5,076 5,076
Total 9,084,233 1,816,678
Updated 27 March 2024

China Multigenerational Panel Datasets (CMGPD) 中国多代人口数据库

Kinship, demographic behavior, and inequality in China from the 18th century to the early 20th century

Project Page

For nearly four decades we have conducted research on kinship, inequality, and demographic behavior in China and in comparative perspective, using large multi-generational population databases that we constructed, most notably the China Multigenerational Panel Datasets (CMGPD) 中国多代人口数据库. We have published on a wide variety of topics using these data, including economic, family, and social influences on demographic outcomes such as birth, marriage, migration, and death; fertility limitation in historical China; and the role of kin networks in shaping social mobility. The most recent outputs include a 2018 study of ethnic intermarriage in northeast China that appeared in Demographic Research and a 2015 study in American Sociological Review of how patrilineal kin network characteristics can influence individuals’ life chances generations later. We summarized our earlier work on these data in a 1997 book from Cambridge University Press. The multi-generational databases were also the basis of our contributions to three coauthored books, published in 2004, 2010, and 2014 in the MIT Press Eurasian Population and Family History Series, that compare relations between family organization, demographic behavior, and economic conditions in past times. We have publicly released two CMGPD datasets, Liaoning (CMGPD-LN) and Shuangcheng (CMGPD-SC), which are available at ICPSR.

China Government Employee Datasets (CGED) 中国历史官员数据库

Civil and military officials in Qing and Republican China, 1700-1949

Project Page – CGED-Q 中国历史官员数据库(清代)
Project Page – CGED-BY/ROC 中国历史官员数据库(北洋/民国)

In 2013, we initiated a new project to study Qing educational and political elites and the Qing bureaucracy by constructing and analyzing a longitudinal individual-level database of nearly all civil officials recorded in the surviving editions of the jinshenlu 縉紳錄, as well as military officials recorded in the zhongshu beilan 中樞備覧. The resulting dataset is the core of the China Government Employee Database-Qing (CGED-Q) 中国历史官员数据库(清代). As of November 2021, we have entered 4,443,600 records of 327,618 officials, with nearly complete coverage of the period between 1830 and 1912. This study is distinctive in that previous studies of officials and other government employees have mainly been case studies of specific individuals, groups of individuals, or government offices. In spring 2019, we began making these data public in stages, in collaboration with the Institute of Qing History at Renmin University. We have also created linked datasets of holders of Qing examination degrees and of office purchasers, and we have begun collecting and entering data on civil officials during the Republican era in the China Government Employee Dataset – Beiyang/Republic of China (CGED-BY/ROC) 中国历史官员数据库(北洋/民国).

China University Student Datasets (CUSD) 中国大学生数据库

Social origins of university students in China, 1890-2010

Project Page

The China University Student Database (CUSD) project analyzes the social and geographic family backgrounds of university students in Republican China (ROC) and elite university students in the People’s Republic of China (PRC). It is based on a collection of 400,000 individual university student registration cards from 35+ Chinese universities, together with a variety of published and archival sources on Chinese students studying abroad and domestically. Our analysis of the social origins of students at Peking University and Suzhou University was the basis of a book, 无声的革命: 北京大学、苏州大学的学生社会来源 1952-2002, published in 2013, as well as a 2012 article with the same name that appeared in 中国社会科学 (Chinese Social Science). These publications fed into and influenced ongoing debates about the role of examinations in university admissions. More recently we have been constructing databases of student registration data from major universities for 1911-1949, and we are using these data to complete a book manuscript, a prequel to 无声的革命, on the history and origins of Chinese university students in the first half of the twentieth century.

China Rural Revolution Datasets (CRRD) 中国农村革命数据库

Rural revolution in 20th century China

Project Page – CRRD-LR 中国农村革命数据库——土改
Project Page – CRRD-SQ 中国农村革命数据库——四清

One of the defining features of twentieth-century China was its transformation of the world’s largest agrarian society. We are constructing a new dataset series, the China Rural Revolution Datasets (CRRD), to capture the many stages and movements that make up this century-long transformation. The first two datasets in this series record information about individual and household experiences during the most dramatic stages of this process, between 1946 and 1966, when the Chinese Communist Party (CCP) carried out a nationwide redistribution of land and then gradually organized rural communities into agricultural cooperatives and ultimately People’s Communes. The China Rural Revolution Dataset – Land Reform (CRRD-LR) 中国农村革命数据库——土改 studies one of the largest redistributions of wealth and power in history: the CCP’s nationwide Land Reform Movement from 1946 to 1953. Records of this movement include detailed individual- and household-level registers of property expropriation and reallocation, and of the political struggles involved in this redistribution of wealth. Currently the CRRD-LR contains county-wide data on the land reform experiences of over 80,000 households with approximately 400,000 individuals in Shuangcheng, Heilongjiang between 1946 and 1948.

The China Rural Revolution Dataset – Siqing (CRRD-SQ) 中国农村革命数据库——四清 studies the social and economic transformation of Chinese society over the first half of the twentieth century by analyzing social class registration forms (阶级成份登记表) compiled on the eve of the Cultural Revolution in 1966. The CRRD-SQ currently contains data from approximately 25,000 of these forms, one quarter of them entered in collaboration with the Shanxi University Research Center for Chinese Social History, from four provinces: Shanxi, Hebei, Inner Mongolia, and Guangdong. Each form records two to three pages of information per household, including the household’s property holdings and occupations before and after land reform in the late 1940s, at the time when cooperatives were formed in the mid-1950s, and at the time of compilation c. 1966; the household head’s social relations; a three-generation family history; and social, demographic, and political details on every household member over 15 sui.

China Professional Occupation Datasets (CPOD) 中国专业职业人员数据库

Educated professionals in Republican China

Project Page – CPOD

We have created China Professional Occupation Datasets (CPOD) for six professional occupations – certified accountants, engineers, health professionals, legal professionals, university faculty and staff, and academicians and experts – during the Republic of China (ROC) and the People’s Republic of China (PRC) to understand better the development of professional education and employment in China.

These six datasets include information for some 50,000 working professionals, largely from the Republic of China, many of whom also have student records in the CUSD-ROC and CUSD-OS. These datasets do not include individuals from the CGED-BY and ROC unless they worked in private practice as well as public service. They also do not include the many more individuals in the CUSD-BY, CUSD-ROC, CUSD-OS, and CUSD-PRC who majored in a professional training program but for whom we do not have any record of professional registration or employment.