Discussion paper

DP14450 Building(s and) cities: Delineating urban areas with a machine learning algorithm

This paper proposes a novel methodology for delineating urban areas based on a machine learning algorithm that groups buildings within portions of space of sufficient density. To do so, we use the precise geolocation of all 12 million buildings in Spain. We exploit building heights to create a new dimension for urban areas, namely, the vertical land, which provides a more accurate measure of their size. To better understand their internal structure and to illustrate an additional use for our algorithm, we also identify employment centers within the delineated urban areas. We test the robustness of our method and compare our urban areas to other delineations obtained using administrative borders and commuting-based patterns. We show that: 1) our urban areas are more similar to the commuting-based delineations than to the administrative boundaries, but are more precisely measured; 2) when analyzing the urban areas’ size distribution, Zipf’s law appears to hold for their population, surface and vertical land; and 3) the impact of transportation improvements on the size of the urban areas is not underestimated.
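
To make the idea of grouping buildings "within portions of space of sufficient density" concrete, the following is a minimal sketch of a DBSCAN-style density clustering on building coordinates. The choice of DBSCAN is an assumption made here for illustration; the abstract does not name the algorithm. The function name `density_clusters`, the parameters `eps` and `min_pts`, and the toy coordinates are all hypothetical and are not taken from the paper.

```python
from math import hypot


def density_clusters(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise.

    DBSCAN-style rule (an illustrative assumption, not the paper's stated
    method): a point is "core" if at least min_pts points, itself included,
    lie within distance eps. Clusters grow outward from core points.
    """
    n = len(points)

    def neighbors(i):
        # Brute-force radius search; fine for a toy example only.
        xi, yi = points[i]
        return [j for j in range(n)
                if hypot(points[j][0] - xi, points[j][1] - yi) <= eps]

    labels = [None] * n  # None means "not yet visited"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)  # j is core, so keep expanding
    return labels


# Hypothetical toy data: two tight groups of buildings and one isolated one.
buildings = [(0, 0), (0, 1), (1, 0), (1, 1),
             (10, 10), (10, 11), (11, 10), (11, 11),
             (50, 50)]
labels = density_clusters(buildings, eps=1.5, min_pts=3)
print(labels)
```

In this toy setting each dense group would correspond to one candidate urban area, and the isolated building is left unassigned. At the scale of 12 million buildings, a spatial index would be needed in place of the brute-force neighbor search used here.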


Viladecans-Marsal, E, M Garcia-López and D Arribas-Bel (eds) (2020), “DP14450 Building(s and) cities: Delineating urban areas with a machine learning algorithm”, CEPR Press Discussion Paper No. 14450. https://cepr.org/publications/dp14450