VoxEU Column Development Industrial organisation Labour Markets

Defining emerging market cities using lights at night

Nearly three billion more people are expected to join the world’s urbanised population over the coming decades, most of them in developing countries. In order to compare the spatial distribution of economic activity in different cities as they evolve, this column begins with a deceptively simple question: what constitutes a city? Using satellite imagery of lights at night to define cities, this approach finds that in Brazil, China, and India, agglomeration appears to be skill-biased, as it is in developed economies.

The World Bank projects that developing economies alone will host more than two billion additional people in cities by 2050. This process of urbanisation in developing economies is important due to both the number of people involved and the opportunity for a substantial transformation of human wellbeing (Bryan et al. 2019). While urbanisation does not necessarily imply growth, the two processes have often been closely intertwined (Desmet and Henderson 2015).

In developed economies, agglomeration is increasingly skill-biased (Moretti 2012, Davis and Dingel 2019). Larger cities have greater relative quantities of skill, as measured by the share of residents with a college degree, and greater relative prices of skill, as measured by the college wage premium. The implied greater relative demand for skill in larger cities suggests that agglomeration complements skill in production. Is agglomeration also skill-biased in developing economies? As billions of people urbanise, will the spatial distribution of economic activity evolve similarly?

The first step in answering these questions is deceptively simple: defining cities in order to study cross-city variation in economic outcomes. What constitutes a city? Research describing cities in the US and other developed economies typically uses spatial units defined by economic integration rather than legal jurisdictions or administrative boundaries. Agglomeration forces, commuting flows, and other economic linkages do not stop at municipal, county, or state borders, so using these boundaries to define the unit of analysis would fragment economically integrated metropolitan areas. Viewing cities as integrated labour markets, statistical agencies in rich countries overwhelmingly define metropolitan areas on the basis of commuting flows (Duranton 2015). Unfortunately, in developing economies, such nationwide commuting flow data are often unavailable. This is the case in China and India. In practice, researchers studying emerging markets' cities have relied upon a variety of geographic units, often using administrative definitions of cities that are not comparable to the metropolitan areas studied in rich countries.

In recent research (Dingel et al. 2019), we construct metropolitan areas for Brazil, China, and India by aggregating finer geographic units on the basis of contiguous areas of light in night-time satellite images. We show that these lights-based metropolitan areas closely mirror commuting-based definitions in the US and Brazil. In China and India, which lack commuting-based definitions, lights-based metropolitan populations follow a power law, while administrative units do not. We use these definitions of metropolitan areas to study how the relative quantities and prices of skill vary across cities in these emerging markets.

Defining metropolitan areas

We propose a method for aggregating spatial units into a ‘metropolitan area’ defined by a contiguous area of lights at night. The two inputs to the algorithm are a satellite (raster) image of the country at night and a shapefile of the administrative units for which socioeconomic characteristics are reported.

Figure 1 illustrates the procedure for a portion of the eastern coast of China along the East China Sea in 2000. The left panel of Figure 1 depicts the light intensity in the raster image as a ‘heatmap’ over the administrative boundaries of Chinese townships. Next, we select contiguous areas of light brighter than a selected threshold to define polygons, as depicted in the middle panel of Figure 1. The largest polygon in that panel corresponds to the city of Shanghai. Finally, we use the intersection of the night-lights-based polygons and the spatial units to construct metropolitan areas. The union of the spatial units assigned to a light polygon constitutes a metropolitan area. The right panel of Figure 1 depicts the metropolitan areas that result from applying our procedure to Chinese townships.

Figure 1 Building metropolitan areas by aggregating smaller units based on lights at night 

Notes: This figure illustrates our procedure for combining satellite imagery of lights at night with administrative spatial units to build metropolitan areas. These panels depict a portion of the eastern coast of China in 2000. The administrative spatial units are townships. The polygons in the middle panel are areas of contiguous light brighter than 30. Aggregating the townships that intersect these polygons produces the metropolitan areas depicted in the right panel.
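The three steps of this procedure can be sketched on a toy grid. The sketch below is an illustration under stated assumptions, not the authors' implementation: the raster and the shapefile are both replaced by small hypothetical grids (`RASTER` holds light intensity per cell, `UNITS` holds the administrative unit each cell belongs to), contiguity is taken to mean 4-connected cells, and a unit that touches more than one light polygon is assigned to the polygon with which it shares the most lit cells. The function names are invented for this example.

```python
from collections import Counter, deque


def light_components(raster, threshold):
    """Label 4-connected groups of cells brighter than `threshold`.

    Returns a grid of the same shape holding a component id (0, 1, ...)
    for lit cells and None for cells at or below the threshold.
    """
    rows, cols = len(raster), len(raster[0])
    labels = [[None] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if raster[r][c] <= threshold or labels[r][c] is not None:
                continue
            # Breadth-first flood fill starting from an unlabelled lit cell.
            labels[r][c] = next_id
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and raster[ny][nx] > threshold
                            and labels[ny][nx] is None):
                        labels[ny][nx] = next_id
                        queue.append((ny, nx))
            next_id += 1
    return labels


def build_metro_areas(raster, units, threshold):
    """Group administrative units by the light component they intersect.

    Assumption (not from the column): a unit touching several components
    is assigned to the one with which it shares the most lit cells.
    """
    labels = light_components(raster, threshold)
    overlap = {}  # unit id -> Counter mapping component id to shared lit cells
    for r, row in enumerate(units):
        for c, unit in enumerate(row):
            if labels[r][c] is not None:
                overlap.setdefault(unit, Counter())[labels[r][c]] += 1
    metros = {}  # component id -> set of unit ids
    for unit, counts in overlap.items():
        component = counts.most_common(1)[0][0]
        metros.setdefault(component, set()).add(unit)
    return sorted(sorted(members) for members in metros.values())


# Hypothetical toy inputs: light intensity per cell and the unit of each cell.
RASTER = [
    [50, 45, 5, 0, 0],
    [40, 35, 0, 0, 33],
    [0, 0, 0, 0, 60],
]
UNITS = [
    ["A", "A", "B", "B", "C"],
    ["A", "B", "B", "C", "C"],
    ["D", "D", "D", "C", "C"],
]

metro_areas = build_metro_areas(RASTER, UNITS, threshold=30)
print(metro_areas)
```

With a threshold of 30, units A and B share one bright patch and form a single metropolitan area, C forms another, and the unlit unit D belongs to none. Calling `build_metro_areas` again with a different `threshold` shows how the grouping responds to that choice.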

The choice of the light-intensity threshold that defines the polygons in the middle panel of Figure 1 is not pinned down by economic theory or prior empirical research. To address this issue, we report results for a variety of light-intensity thresholds and examine whether they are sensitive to this choice. The arbitrary choice of the light-intensity threshold is likely to be most consequential for defining the edges of metropolitan areas. We show that the specific empirical questions we examine are not sensitive to the choice of light-intensity threshold. However, research questions particularly focused on the ‘urban fringe’ of metropolitan areas may be more sensitive to such choices.

Lights-based and commuting-based definitions align

To validate the algorithm depicted in Figure 1, we apply it to the US and Brazil. Since commuting data are available in these countries, we can compare our lights-based approach to the standard commuting-based definitions of metropolitan areas. For the US, the Office of Management and Budget (OMB) defines core-based statistical areas (CBSAs) by aggregating counties with sufficiently strong commuting ties. For Brazil, we apply Duranton’s (2015) algorithm to define metropolitan areas by aggregating municipalities with commuting ties of 10% or greater in 2010 population census data.
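The idea of aggregating spatial units by commuting ties can be illustrated with a simplified iterative merge. This is a stylised sketch, not a reproduction of Duranton's (2015) algorithm or of the OMB's rules: it assumes that the commuting share is measured as the fraction of a cluster's resident workers employed in another cluster, that the strongest tie is merged first, and that shares are recomputed after every merge. The function name and the `FLOWS` numbers are hypothetical.

```python
def commuting_clusters(flows, threshold=0.10):
    """Iteratively merge units whose out-commuting share meets `threshold`.

    flows[i][j] = number of workers who live in unit i and work in unit j.
    Returns a sorted list of clusters, each a sorted list of unit names.
    """
    # Each unit starts as its own cluster, keyed by its name.
    clusters = {name: {name} for name in flows}

    def flow_between(src, dst):
        return sum(flows[i].get(j, 0)
                   for i in clusters[src] for j in clusters[dst])

    def resident_workers(src):
        return sum(sum(flows[i].values()) for i in clusters[src])

    while True:
        best = None  # (share, source cluster, destination cluster)
        for src in clusters:
            total = resident_workers(src)
            if total == 0:
                continue
            for dst in clusters:
                if dst == src:
                    continue
                share = flow_between(src, dst) / total
                if share >= threshold and (best is None or share > best[0]):
                    best = (share, src, dst)
        if best is None:
            break  # no remaining tie reaches the threshold
        _, src, dst = best
        # Absorb the source cluster into its main commuting destination.
        clusters[dst] |= clusters.pop(src)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical commuting matrix (rows: residence, columns: workplace).
FLOWS = {
    "Core":   {"Core": 900, "Suburb": 50, "Exurb": 10, "Town": 5},
    "Suburb": {"Core": 300, "Suburb": 650, "Exurb": 20, "Town": 0},
    "Exurb":  {"Core": 40, "Suburb": 80, "Exurb": 860, "Town": 20},
    "Town":   {"Core": 10, "Suburb": 0, "Exurb": 30, "Town": 960},
}

clusters_10pct = commuting_clusters(FLOWS, threshold=0.10)
print(clusters_10pct)
```

In this toy example the exurb sends only 4% of its workers to the core and 8% to the suburb, so it joins the metropolitan area only after the core and suburb have merged and its combined share reaches 12%. This is why the aggregation is iterated rather than applied once.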

Following Rozenfeld et al. (2011), we compare log population and log land area across agglomeration schemes to show that our lights-based approach aligns with these commuting-based definitions. The left panel of Figure 2 shows that the correlation of log population between US CBSAs and their night-lights-based counterparts is about 0.98 and relatively insensitive to the choice of light-intensity threshold. Similarly, the right panel of Figure 2 shows that in Brazil, this comparison of lights-based and commuting-based metropolitan areas yields correlations for population exceeding 0.97 for all the reported light-intensity thresholds. The correlations for land area are weaker but still quite high, around 0.8. This lower correlation is natural, as these definitions of metropolitan areas likely differ in their inclusion or exclusion of outlying counties or municipalities, which typically have lower population densities and larger physical areas.

Figure 2 Comparing population and land area for lights-based and commuting-based metropolitan areas

Notes: The left panel depicts correlations of log population and log land area between metropolitan areas defined by contiguous areas of lights at night and 377 OMB-defined core-based statistical areas (CBSAs) with populations above 100,000 in the 2010 US Census of Population. The right panel depicts those correlations between Brazilian lights-based metropolitan areas and metropolitan areas defined by commuting flows in 2010 with a 10% threshold (per Duranton 2015). The sample is restricted to metropolitan areas with population above 100,000. The horizontal axes in both panels vary the thresholds for light intensity used to define metropolitan areas in our procedure.
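The comparison reported in Figure 2 amounts to a Pearson correlation of logged values across matched metropolitan areas. A minimal sketch follows, assuming that each commuting-based area has already been matched to a lights-based counterpart (the matching step itself is not shown) and using invented population figures rather than census data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def log_correlation(a, b):
    """Correlate log values over the metropolitan areas present in both maps."""
    common = sorted(set(a) & set(b))
    return pearson([math.log(a[k]) for k in common],
                   [math.log(b[k]) for k in common])


# Hypothetical populations for four matched metropolitan areas.
COMMUTING_BASED = {"X": 5_000_000, "Y": 1_200_000, "Z": 300_000, "W": 150_000}
LIGHTS_BASED = {"X": 4_600_000, "Y": 1_350_000, "Z": 260_000, "W": 170_000}

r_population = log_correlation(COMMUTING_BASED, LIGHTS_BASED)
print(round(r_population, 3))
```

Repeating the calculation for each light-intensity threshold, and for land area instead of population, would trace out curves of the kind plotted in Figure 2.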

Contrast with administrative definitions

Since nationwide commuting data are not available in India and China, prior studies of urbanisation in these economies have employed administrative units. China has a hierarchy of urban administrative units: provincial-level cities, deputy-provincial cities, provincial capitals, prefecture-level cities, and county-level cities (Chan 2010). While studying the units ranked as prefecture-level cities or higher is convenient, this approach suffers from several shortcomings. Provincial-level and prefecture-level cities incorporate both substantial rural areas and distinct urban areas not necessarily economically integrated with the prefecture-level city's urban core (Chan 2007). Further, economically integrated metropolitan areas need not be contained within a prefecture. For example, the prefecture-level cities of Guangzhou and Foshan are only 18 miles apart and share connected subway lines.

Similar issues arise with respect to India. India is divided into states, districts, and sub-districts. Most prior research on urbanisation in India has studied the urban population of districts. Since an Indian district is roughly twice the size of a US county, towns within a district need not be economically integrated. Simultaneously, this approach fragments urban areas that span multiple districts.

We find considerable contrasts between these administrative spatial units and the metropolitan areas produced by applying our lights-based algorithm to China and India. The differences are sufficiently large that they affect conclusions about the distribution of population and economic activity across space. Notably, the largest metropolitan area that our night-lights-based procedure produces for China corresponds to the Pearl River Delta, an administratively fragmented urban area spanning Dongguan, Foshan, Guangzhou, and Shenzhen that has no dominant central city but rather “several original centers that over time merge across boundaries” (World Bank Group 2015). This multi-jurisdictional urban area, which by its nature does not appear in prefecture-level city data, is home to more than 40 million residents.

One way of illustrating the contrast between our lights-based metropolitan areas and the administrative units is to look at the city-size distributions that result. Zipf's law for cities (the number of cities larger than L is proportional to 1/L) is an empirical regularity found to hold in many countries and time periods (Gabaix and Ioannides 2004). China and India appear to be notable exceptions: Chauvin, Glaeser, Ma, and Tobio (2017) document that Chinese prefecture-level cities and Indian districts' urban populations do not conform to Zipf's law. The rank-size plots shown in Figure 3 are log-quadratic, not log-linear. Chauvin et al. (2017) suggest a number of potential explanations for this deviation, including the possibility that "China and India may be better seen as continents rather than standard countries." Another potential explanation is that the finding is simply a statistical artefact of the geographic units used to characterise the Chinese and Indian city-size distributions.
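The distinction between log-linear and log-quadratic rank-size plots can be written out explicitly. The notation below is introduced here for illustration and is not taken from the cited papers: r_i is the rank of city i by population (1 for the largest), L_i is its population, and ζ is the power-law exponent.

```latex
\begin{align}
  \#\{\text{cities with population} > L\} &\propto L^{-\zeta},
      \qquad \zeta = 1 \text{ under Zipf's law}, \\
  \ln r_i &= \alpha - \zeta \ln L_i + \varepsilon_i
      && \text{(log-linear rank-size relationship)}, \\
  \ln r_i &= \alpha + \beta_1 \ln L_i + \beta_2 (\ln L_i)^2 + \varepsilon_i
      && \text{(log-quadratic alternative)}.
\end{align}
```

A power law implies the second line, a straight line in a log-log plot. A clearly non-zero β_2 in the third line is one way to describe the curvature visible in rank-size plots that depart from it.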

Figure 3 City-size distributions with administrative units

Notes: These two panels are taken from Figure 2 in Chauvin et al. (2017).

When measured using night-lights-based metropolitan areas, both China's and India's city-size distributions are well described by a power law, and this fit is not very sensitive to the light-intensity threshold used to construct the metropolitan areas. Figure 4 depicts China's and India's city-size distributions using metropolitan areas defined by our algorithm with a light-intensity threshold of 30. For both China and India, the rank-size relationship fits a log-linear power-law specification very well. The log-quadratic shape found by Chauvin et al. (2017) seems primarily due to their choices of geographic units. When using our night-lights-based approach to build metropolitan areas, these emerging economies' city-size distributions exhibit the same regularities as developed economies' urban systems.

Figure 4 China's and India's city-size distributions with lights-based units


Notes: This figure depicts the rank-size relationship for Chinese (top panel) and Indian (bottom panel) metropolitan areas with populations greater than 100,000. Those metropolitan areas are defined by aggregating townships (China) or subdistricts (India) in areas of contiguous night lights with intensity greater than 30.
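A rank-size fit of this kind can be sketched as an ordinary least squares regression of log rank on log population. The example below is a minimal illustration on synthetic populations that follow Zipf's law exactly, so the fitted slope should equal -1. It is not the estimator or the data used in the paper, and the function names are invented for this example.

```python
import math


def ols_slope_intercept(xs, ys):
    """Slope and intercept of the least-squares line y = a + b * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def rank_size_slope(populations, min_population=100_000):
    """Regress log rank on log size for cities above `min_population`."""
    sizes = sorted((p for p in populations if p > min_population), reverse=True)
    log_size = [math.log(s) for s in sizes]
    log_rank = [math.log(rank) for rank in range(1, len(sizes) + 1)]
    slope, _ = ols_slope_intercept(log_size, log_rank)
    return slope


# Synthetic city sizes: the city of rank r has population 10,000,000 / r.
ZIPF_POPULATIONS = [10_000_000 / rank for rank in range(1, 51)]

slope = rank_size_slope(ZIPF_POPULATIONS)
print(round(slope, 6))
```

A slope close to -1 with a tight linear fit is what a power law with exponent one looks like in these coordinates, whereas systematic curvature in the residuals would point to a log-quadratic shape.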

Agglomeration is skill-biased in emerging markets

After appropriately defining these metropolitan areas, we turn to examining whether agglomeration is skill-biased in Brazil, China, and India. For each country, we characterise the distribution of skill across metropolitan areas using four categories of educational attainment. Following Davis and Dingel (forthcoming), we regress each skill's log population in a city on that city's log total population to estimate skill-specific ‘population elasticities’. The theoretical model in Davis and Dingel (forthcoming) implies that more skilled groups have higher population elasticities.
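The regression described here can be sketched in a few lines: for each skill group, regress the log of that group's population in a city on the log of the city's total population and read off the slope. The data below are synthetic, generated from exact power laws with hypothetical elasticities of 0.9 and 1.2 so that the estimates can be checked. They are not the census data used in the paper, and the data layout is an assumption made for this example.

```python
import math


def ols_slope(xs, ys):
    """Slope of the least-squares regression of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
            / sum((x - mean_x) ** 2 for x in xs))


def population_elasticities(cities):
    """Estimate one population elasticity per skill group.

    `cities` is a list of dicts with a 'total' population and a 'by_skill'
    mapping from skill group to that group's population in the city.
    """
    log_total = [math.log(city["total"]) for city in cities]
    groups = list(cities[0]["by_skill"])
    return {
        group: ols_slope(
            log_total,
            [math.log(city["by_skill"][group]) for city in cities],
        )
        for group in groups
    }


# Synthetic cities: group sizes follow exact power laws in total population.
TOTALS = [100_000, 250_000, 1_000_000, 4_000_000]
CITIES = [
    {
        "total": total,
        "by_skill": {
            "no schooling": 0.5 * total ** 0.9,   # hypothetical elasticity 0.9
            "college": 0.01 * total ** 1.2,       # hypothetical elasticity 1.2
        },
    }
    for total in TOTALS
]

elasticities = population_elasticities(CITIES)
print({group: round(value, 3) for group, value in elasticities.items()})
```

An elasticity above one means the group is over-represented in larger cities, and an elasticity below one means it is under-represented. Skill-biased agglomeration corresponds to elasticities that rise with educational attainment.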

Our empirical estimates of these population elasticities are indeed monotonically increasing in skill. In Brazil, the population elasticities range from less than 0.93 for the least-skilled ‘no schooling’ group to more than 1.16 for college graduates. That is, a Brazilian metropolitan area that is 10% larger has 11.6% more college graduates but only 9.3% more unschooled residents. In China, the population elasticities for township-based metropolitan areas range from less than 0.91 for those with primary school or lower educational attainment to more than 1.31 for those with college or university education. In India, educational attainment is not available for such granular spatial units, but we find population elasticities ranging from 0.96 to 1.03. In short, in all three emerging markets, larger cities are home to more skilled populations.
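The 10% comparison for Brazil follows from the constant-elasticity form of the relationship. In the notation below, which is introduced here for illustration, N is total city population, N_s is the population of skill group s, and β_s is that group's population elasticity. The percentages quoted in the text correspond to the first-order approximation, and the exact values differ only slightly.

```latex
\begin{align}
  \ln N_s &= \alpha_s + \beta_s \ln N
  \quad\Longrightarrow\quad
  \frac{N_s'}{N_s} = \left(\frac{N'}{N}\right)^{\beta_s}, \\
  \text{college graduates } (\beta_s = 1.16):\quad
  & 1.1^{1.16} \approx 1.117
  \qquad (\text{first order: } 1.16 \times 10\% = 11.6\%), \\
  \text{no schooling } (\beta_s = 0.93):\quad
  & 1.1^{0.93} \approx 1.093
  \qquad (\text{first order: } 0.93 \times 10\% = 9.3\%).
\end{align}
```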

Spatial variation in nominal wages also suggests that agglomeration is skill-biased. A recent finding in developed economies is that larger cities exhibit both higher relative quantities and higher relative prices of skill (Baum-Snow and Pavan 2013, Davis and Dingel 2019). We find a similar pattern in Brazil. The population elasticity of the college wage premium in Brazil is between 4% and 5%, exceeding the 3% elasticity estimated for US metropolitan areas in 2000 (Davis and Dingel 2019). The implied greater relative demand for college graduates in larger cities suggests that more skilled workers particularly benefit from the productivity-enhancing effects of agglomeration in both the US and Brazil.


Conclusion

Do urban systems of developing economies exhibit spatial patterns similar to those found in developed economies? Answering this question requires defining metropolitan areas comparable to those in developed economies. To avoid the pitfalls associated with administrative units that do not correspond to integrated economic entities and to circumvent the absence of commuting data in settings like China and India, we use satellite imagery to construct metropolitan areas. Since satellite images cover the entire globe and are becoming available in finer resolutions, our method for defining metropolitan areas should facilitate studies of urbanisation and local labour markets in many different contexts.


References

Baum-Snow, N and R Pavan (2013), “Inequality and City Size”, The Review of Economics and Statistics 95(5): 1535–1548.

Bryan, G, E Glaeser and N Tsivanidis (2019), “Cities in the Developing World”, NBER Working Paper 26390.

Chan, K W (2007), “Misconceptions and Complexities in the Study of China’s Cities: Definitions, Statistics, and Implications”, Eurasian Geography and Economics 48(4): 383–412.

Chan, K W (2010), “Fundamentals of China’s Urbanization and Policy”, China Review 10(1): 63–93.

Chauvin, J P, E Glaeser, Y Ma and K Tobio (2017), “What is different about urbanization in rich and poor countries? Cities in Brazil, China, India and the United States”, Journal of Urban Economics 98: 17–49.

Davis, D R and J I Dingel (2019), “A Spatial Knowledge Economy”, American Economic Review 109(1): 153–170.

Davis, D R and J I Dingel (forthcoming), “The Comparative Advantage of Cities”, Journal of International Economics.

Desmet, K and J Henderson (2015), “The Geography of Development Within Countries”, Handbook of Regional and Urban Economics, Volume 5B.

Dingel, J I, A Miscio and D R Davis (2019), “Cities, Lights, and Skills in Developing Economies”, Journal of Urban Economics.

Duranton, G (2015), “Delineating Metropolitan Areas: Measuring Spatial Labour Market Networks Through Commuting Patterns”, in The Economics of Interfirm Networks, Tokyo: Springer Japan, 107–133.

Gabaix, X and Y Ioannides (2004), “The evolution of city size distributions”, in Handbook of Regional and Urban Economics Vol. 4.1: 2341–2378.

Moretti, E (2012), The New Geography of Jobs.

Rozenfeld, H D, D Rybski, X Gabaix and H A Makse (2011), “The Area and Population of Cities: New Insights from a Different Perspective on Cities”, American Economic Review 101(5): 2205–25.

World Bank Group (2015), East Asia’s Changing Urban Landscape: Measuring a Decade of Spatial Growth, World Bank.
