DP15825 Human Biographical Record (HBR)
|Author(s):||Arash Nekoei, Fabian Sinn|
|Publication Date:||February 2021|
|Keyword(s):||Big data, economic history, Machine Learning|
|Programme Areas:||Economic History|
|Link to this Page:||cepr.org/active/publications/discussion_papers/dp.php?dpno=15825|
We construct a new dataset of more than seven million notable individuals across recorded human history, the Human Biographical Record (HBR). With Wikidata as the backbone, HBR adds further information from various digital sources, including Wikipedia in all 292 languages. We use machine learning and text analysis to combine the sources and extract information on date and place of birth and death, gender, occupation, education, and family background. This paper discusses HBR's construction, completeness, coverage, and accuracy, as well as its strengths and weaknesses relative to prior datasets. HBR is the first part of a larger project, the human record project, which we briefly introduce.