Tutorial for using R to analyze the CGED-Q JSL Public Releases

Chen Jun, my MA student at Central China Normal University, has shared slides and sample code he produced to help anyone planning to use R to analyze the CGED-Q JSL public releases. The materials are all in Chinese. They introduce how to import the public data into R, create and transform variables, process strings to create variables, and tabulate and graph results. We hope that this will be useful to users of the data.

New paper by others using CMGPD-LN

We were pleased to learn that Yu Bai, Yanjun Li, and Pak Hong Lam have published a paper, “Quantity-quality trade-off in Northeast China during the Qing dynasty,” in the Journal of Population Economics using the public release of the CMGPD-LN! We hope that their paper, along with other recent publications using the dataset, will inspire more researchers to use it.

Here is a link to their paper: https://link.springer.com/article/10.1007/s00148-022-00933-x

We are eternally grateful for the support from NICHD that allowed us to prepare the CMGPD-LN for release, and to ICPSR for hosting the dataset.

CGED-Q JSL receives Best Project Award (最佳项目奖) at China Digital Humanities 2022 Annual Meeting

We are pleased to report that the China Government Employee-Qing (CGED-Q) Jinshenlu (JSL) dataset was one of four to receive the Best Project Award (最佳项目奖) at the China Digital Humanities 2022 Annual Meeting held at Renmin University on November 26 and 27.

For more information about the award, please see the final report of the CDH 2022 meeting.

For more information about the CGED-Q JSL, please see the project page at the Lee-Campbell Group Website.

CGED-Q Jinshenlu 1850-1864 Public Release now available

We have just made available for download the China Government Employee Database-Qing (CGED-Q) Jinshenlu 1850-1864 Public Release. This release consists of 341,092 quarterly records of 37,632 officials (by our linkage) who served between 1850 and 1864. The information is drawn from 26 quarterly editions.

We chose 1850-1864 as the next period for release because it includes the Taiping Rebellion, a major event in 19th-century Qing history.

Each record includes information about the post, and if it was occupied, the holder, including their name, province and county of origin, qualification, and other information.
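For readers who want a feel for working with records of this shape, here is a minimal Python sketch that counts occupied posts by the holder's province of origin. The column names (post, name, province, county, qualification) and the sample rows are made-up placeholders for illustration only, and treating an empty name as a vacant post is an assumption. The actual release uses its own variable names and conventions, which are described in the documentation that accompanies the data.

```python
import csv
import io
from collections import Counter

# Hypothetical column names and made-up placeholder rows; the real release
# has its own variable names, documented alongside the data.
SAMPLE_CSV = """post,name,province,county,qualification
Magistrate,Official 1,Province A,County X,Degree 1
Magistrate,,,,
Prefect,Official 2,Province B,County Y,Degree 2
Clerk,Official 3,Province A,County Z,Degree 1
"""


def tabulate_by_province(csv_text):
    """Count occupied posts by the holder's province of origin."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: a post with no holder name is vacant and is skipped.
        if row["name"].strip():
            counts[row["province"]] += 1
    return counts


if __name__ == "__main__":
    for province, n in tabulate_by_province(SAMPLE_CSV).most_common():
        print(province, n)
```

To try this on a downloaded file, the text passed to tabulate_by_province would come from reading that file, and the column names would need to be changed to match the release.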

Together with our previous release of 686,945 records for the period 1900-1912, we have now released publicly more than 1,000,000 records from the CGED-Q.

The 1850-1864 and 1900-1912 releases may be downloaded at the HKUST Dataspace, the Harvard Dataverse, and the mirror site at Renmin University Institute for Qing History:

HKUST Dataspace


Harvard Dataverse


Renmin University Institute for Qing History


GitHub repository with code for the CMGPD Public Release

We created a repository to share the Stata code that processes the original CMGPD data from the Excel spreadsheets produced by our coders and turns it into the working file that is the basis of our analysis and of the public release available at ICPSR. The repository is intended to help users of the data better understand how the data went from the spreadsheets transcribed by the coders to the datasets available at ICPSR. The code for linking individuals to their kin may be of particular interest.

Major phase of data entry for the China Government Employee Database-Qing Jinshenlu (CGED-Q JSL) completed

In November 2021, our coders completed the entry of virtually all the quarterly editions of the rosters of Qing civil officials 縉紳錄 and military officials 中樞備覧 available to the Lee-Campbell Group, including all the editions from the published Tsinghua University Library collection and other editions from the Columbia University and Harvard University libraries, as well as the National Library and Shanghai Library. We are grateful to the staff of all these libraries, in particular the Columbia University Library, for their cooperation in making their library holdings available. We have also located a number of other editions in the Peking University library and the Palace Museum Library, but do not yet have access to these data. We are not aware of any other readily accessible editions in other collections.

The CGED-Q JSL now consists of 4,433,600 records of 327,618 officials for the period between 1760 and 1912. 3,843,644 are records of civil offices in editions of the jinshenlu and 589,956 are records of military offices in editions of the zhongshubeilan. The data are most complete for the period 1830 to 1912. According to our analysis based on our most recent record linkage, of these officials, 261,451 were civil officials, 58,482 were military officials, and 7,685 made appearances as both civil and military officials. Please note that since these counts of numbers of officials are based on record linkage, they may change as we adjust our nominative linkage procedures.

Figure 1 (below) summarizes the coverage of the entered 縉紳錄 editions by decade (black bar) and compares it to the potential coverage if all the editions in different collections were entered. In the 1840s, and then from 1870 to 1912, we have entered at least one edition per year. In the 1830s, and then from 1850 to 1869, we have at least one edition entered for 9 out of 10 years in each decade. Between 1800 and 1830, the coverage of our entered data is spottier. We have at least one edition in 7 out of 10 years in the 1800s, 4 out of 10 years in the 1810s, and 6 out of 10 years in the 1820s. From 1760 to 1800, our coverage is less complete, with at least one edition entered every 2 to 4 years per decade.

Figure 1. Entered and Available Editions

Based on our review of the catalogs of other collections, it should still be possible to improve coverage of the last half of the 18th century and first half of the 19th century. The heights of the green bars represent the numbers of years for which at least one edition appears to exist in other collections. Most of these are in the Peking University Library and the Palace Museum. We hope very much to gain access to these collections at some point in the future.

Figure 2 presents a more detailed view of the coverage of the editions so far. From about 1865 onward, we have 3 or 4 editions per year entered all the way to 1911. From 1830 to 1865 or so, we have at least one or two editions per year entered, except for one year each in the 1850s and 1860s where we have no editions at all. Before 1830, it is more common to have one or two editions entered, or none at all.

Figure 2. Entered editions by year

For more details about the CGED, please see the project page.

Addendum – 30 April 2022

Since November 2021, we have found five more editions that had been entered but not added to our central work file. This post and the content of related pages have been updated accordingly.

CGED-Q Jinshenlu 1900-1912 Public Release Tabulation and Visualization Platform

Charlie Liu, an undergraduate in the Quantitative Social Analysis program at HKUST, created a platform for producing tabulations and visualizations with the CGED-Q Jinshenlu 1900-1912 Public Release. On the platform, users can explore the contents of the publicly released CGED-Q for the period 1900-1912 without having to download the data and open it in a statistical package such as R or Stata. Among the available variables are province and county of origin, location of current post, Banner status, and exam or purchase degree (出身). Here is the CGED-Q tabulation and visualization platform.

Our CGED-Q project page has more information about the CGED-Q itself, including links to sites where advanced users can download the data to be analyzed in a statistical package like R or Stata. These sites also include documentation.

As a reminder, if you’re looking for a specific official, the entire CGED-Q is searchable via this platform, originally created by Fi Siwei and hosted on a server by the HKUST VisGroup.