Postdoctoral Publications

2017
Coscia, M. & Neffke, F., 2017. Network Backboning with Noisy Data. 2017 IEEE 33rd International Conference on Data Engineering (ICDE), (May), pp. 425-436.
Networks are powerful instruments to study complex phenomena, but they become hard to analyze in data that contain noise. Network backbones provide a tool to extract the latent structure from noisy networks by pruning non-salient edges. We describe a new approach to extract such backbones. We assume that edge weights are drawn from a binomial distribution, and estimate the error-variance in edge weights using a Bayesian framework. Our approach uses a more realistic null model for the edge weight creation process than prior work. In particular, it simultaneously considers the propensity of nodes to send and receive connections, whereas previous approaches only considered nodes as emitters of edges. We test our model with real world networks of different types (flows, stocks, cooccurrences, directed, undirected) and show that our Noise-Corrected approach returns backbones that outperform other approaches on a number of criteria. Our approach is scalable, able to deal with networks with millions of edges.
Neffke, F., 2017. Coworker complementarity.

How important is working with people who complement one's skills? Using administrative data that record which of 491 educational tracks each worker in Sweden completed, I quantify the educational fit among coworkers along two dimensions: coworker match and coworker substitutability. Complementary coworkers raise wages by a factor comparable to that of a college degree, whereas working with close substitutes is associated with wage penalties. Moreover, this coworker fit not only accounts for large portions of the urban and large-plant wage premiums; the returns to own schooling and the urban wage premium are also almost completely contingent on finding complementary coworkers.

rfwp79_neffke.pdf
2016
O'Clery, N., Gomez-Lievano, A. & Lora, E., 2016. The Path to Labor Formality: Urban Agglomeration and the Emergence of Complex Industries.

Labor informality, associated with low productivity and lack of access to social security services, dogs developing countries around the world. Rates of labor (in)formality, however, vary widely within countries. This paper presents a new stylized fact, namely the systematic positive relationship between the rate of labor formality and the working age population in cities. We hypothesize that this phenomenon occurs through the emergence of complex economic activities: as cities become larger, labor is allocated into increasingly complex industries as firms combine complementary capabilities derived from a more diverse pool of workers. Using data from Colombia, we use a network-based model to show that the technological proximity (derived from worker transitions between industry pairs) of current industries in a city to potential new complex industries governs the growth of the formal sector in the city. The proposed mechanism has strong and robust predictive power and fares better than alternative explanations of (in)formality.

rfwp_78.pdf
Gomez-Lievano, A., Patterson-Lomba, O. & Hausmann, R., 2016. Explaining the prevalence, scaling and variance of urban phenomena. Nature Human Behaviour.

The prevalence of many urban phenomena changes systematically with population size [1]. We propose a theory that unifies models of economic complexity [2,3] and cultural evolution [4] to derive urban scaling. The theory accounts for the difference in scaling exponents and average prevalence across phenomena, as well as the difference in the variance within phenomena across cities of similar size. The central ideas are that a number of necessary complementary factors must be simultaneously present for a phenomenon to occur, and that the diversity of factors is logarithmically related to population size. The model reveals that phenomena that require more factors will be less prevalent, scale more superlinearly and show larger variance across cities of similar size. The theory applies to data on education, employment, innovation, disease and crime, and it entails the ability to predict the prevalence of a phenomenon across cities, given information about the prevalence in a single city.

Related Content: The Urban Theory of Everything

Harvard Magazine: Recipes for Thriving Cities

Gomez-Lievano, A., Patterson-Lomba, O. & Hausmann, R., 2016. Explaining the Prevalence, Scaling and Variance of Urban Phenomena.

The prevalence of many urban phenomena changes systematically with population size [1]. We propose a theory that unifies models of economic complexity [2,3] and cultural evolution [4] to derive urban scaling. The theory accounts for the difference in scaling exponents and average prevalence across phenomena, as well as the difference in the variance within phenomena across cities of similar size. The central ideas are that a number of necessary complementary factors must be simultaneously present for a phenomenon to occur, and that the diversity of factors is logarithmically related to population size. The model reveals that phenomena that require more factors will be less prevalent, scale more superlinearly and show larger variance across cities of similar size. The theory applies to data on education, employment, innovation, disease and crime, and it entails the ability to predict the prevalence of a phenomenon across cities, given information about the prevalence in a single city.

urban_phenomena_cidwp329.pdf

This paper is published in the journal Nature Human Behaviour.

Coscia, M., Hausmann, R. & Neffke, F., 2016. Exploring the Uncharted Export: An Analysis of Tourism-Related Foreign Expenditure with International Spend Data.

Tourism is one of the most important economic activities in the world: for many countries it represents the single largest product in their export basket. However, it is a difficult product to chart: "exporters" of tourism do not ship it abroad, but they welcome importers inside the country. Current research uses social accounting matrices and general equilibrium models, but the standard industry classifications they use make it hard to identify which domestic industries cater to foreign visitors. In this paper, we make use of open source data and of anonymized and aggregated transaction data giving us insights about the spend behavior of foreigners inside two countries, Colombia and the Netherlands, to inform our research. With this data, we are able to describe what constitutes the tourism sector, and to map the most attractive destinations for visitors. In particular, we find that countries might observe different geographical patterns of tourism (concentration versus decentralization); we show the importance of distance, a country's reported wealth and cultural affinity in informing tourism; and we show the potential of combining open source data and anonymized and aggregated transaction data on foreign spend patterns in gaining insight as to the evolution of tourism from one year to another.

tourism_cid_wp_328.pdf
Neffke, F., Otto, A. & Weyh, A., 2016. Inter-industry Labor Flows.

Labor flows across industries reallocate resources and diffuse knowledge among economic activities. However, surprisingly little is known about the structure of such inter-industry flows. How freely do workers switch jobs among industries? Between which pairs of industries do we observe such switches? Do different types of workers have different transition matrices? Do these matrices change over time?

Using German social security data, we generate stylized facts about inter-industry labor mobility and explore its consequences. We find that workers switch industries along tight paths that link industries in a sparse network. This labor-flow network is relatively stable over time, similar for workers in different occupations and wage categories and independent of whether workers move locally or over larger distances. When these networks are used to construct inter-industry relatedness measures, the resulting measures prove better predictors of local industry growth rates than co-location or input-based alternatives. However, because industries that exchange much labor typically do not have correlated growth paths, the sparseness of the labor-flow network does not necessarily prevent a smooth reallocation of workers from shrinking to growing industries. To facilitate future research, the inter-industry relatedness matrices we develop are made available as an online appendix to this paper.

rfwp_72.pdf
Hausmann, R. & Neffke, F., 2016. The Workforce of Pioneer Plants.

Is labor mobility important in technological diffusion? We address this question by asking how plants assemble their workforce if they are industry pioneers in a location. By definition, these plants cannot hire local workers with industry experience. Using German social-security data, we find that such plants recruit workers from related industries from more distant regions and local workers from less-related industries. We also show that pioneers leverage a low-cost advantage in unskilled labor to compete with plants that are located in areas where the industry is more prevalent. Finally, whereas research on German reunification has often focused on the effects of east-west migration, we show that the opposite migration facilitated the industrial diversification of eastern Germany by giving access to experienced workers from western Germany.

pioneerplants_cid_wp_310.pdf
2015
Nedelkoska, L. & Khaw, N., 2015. The Albanian Community in the United States: Statistical Profiling of the Albanian-Americans, Growth Lab at Harvard's Center for International Development.

When the Albanian Communist regime fell in 1991-92, many Albanians saw their future outside the borders of Albania. At that time in history, no one anticipated the scale of migration that would take place in the subsequent two decades. Today, one third of Albania’s 1991 population lives abroad. Most of these migrants live and work in neighboring Greece and Italy. The third most popular destination is however the United States. Besides this new wave of migrants, the US has an old Albanian diaspora–the offspring of migrants who came to the US between the First and the Second World War. This is what mainly gives rise to the second generation Albanian-Americans.

To the best of our knowledge, there is currently no systematic documentation of the socio-demographic and economic characteristics of the Albanian community in the US. To bridge this gap, we use data from the American Community Survey 2012 and analyze these characteristics. The profiling could be of interest for anyone who focuses on the Albanians abroad – the Government’s Programs dealing with diaspora and migration issues, researchers interested in migration questions, the Albanian Community Organizations in the US or the diaspora members themselves.

We find that the first and the second generation Albanian-Americans have distinctive features. The first generation (those who arrived after the fall of Communism) is more educated than the non-Albanian Americans with comparable demographics. This is particularly true of Albanian women. The education of the second generation resembles more closely the US population with comparable demographic characteristics.

Despite the qualification advantage, first generation Albanian-Americans earn much less than non-Albanian Americans with comparable socio-demographic characteristics. We find that this is not associated with being Albanian per se but with being an immigrant more generally. The migrant-native gap narrows down with time spent in the US.

An important channel through which the current gap is maintained is qualification mismatch. We observe that first generation Albanian-Americans are over-represented in occupations requiring few skills and under-represented in occupations requiring medium and high skills, in direct contrast to their being more educated than non-Albanians.

When it comes to the earnings of second generation Albanian-Americans, the situation is more nuanced. The low skilled Albanian-Americans earn significantly more, and the highly skilled Albanian-Americans earn significantly less than the non-Albanian Americans with comparable socio-demographic characteristics. We currently do not have a straightforward explanation for this pattern.

The Albanian population in the US is highly concentrated in a few states: New York, Michigan and Massachusetts account for almost 60% of all Albanian Americans. The community in Massachusetts is the best educated, the best employed and has the highest earnings among the three, but is also the oldest one in terms of demographics.

However, due to its sheer size (over 60,000 Albanian-Americans), New York is the host of most Albanians with a BA degree (about 10,000). New York also hosts the largest number of high earning Albanians (about 1,800 earn at least $100,000 a year).

usadiasporaprofile_final.pdf
Nedelkoska, L. & Kosmo, M., 2015. Albanian-American Diaspora Survey Report, Growth Lab at Harvard's Center for International Development.

This survey studies the ways in which active Albanian-Americans would like to engage in the development of their home countries. Its results will help us define the focus of the upcoming events organized under the Albanian Diaspora Program.

Between March 6th and March 22nd 2015, 1,468 Albanian-Americans took part in the online survey, of whom 869 completed the survey. The results presented in this report are based on the answers of the latter group. The results of this survey do not represent the opinions of the general Albanian-American community, but rather the opinions of those who are more likely to engage in an Albanian Diaspora Program.

The survey was jointly prepared with the following Albanian-American organizations: Massachusetts Albanian American Society (MAAS/BESA), Albanian American Success Stories, Albanian Professionals in Washington D.C., Albanian Professionals and Entrepreneurs Network (APEN), Albanian-American Academy, Albanian American National Organization, and VATRA Washington D.C. Chapter. The survey was sponsored by the Open Society Foundations, as a part of the grant OR2013-10995 Economic Growth in Albania granted to the Center for International Development at Harvard University.

diaspora_survey_results_report_v5.pdf
Coscia, M., Neffke, F. & Lora, E., 2015. Report on the Poblacion Flotante of Bogota.

In this document we describe the size of the Poblacion Flotante of Bogota (D.C.). The Poblacion Flotante is composed of people who live outside Bogota (D.C.) but rely on the city for their jobs. We estimate the impact of the Poblacion Flotante using a new data source provided by telecommunications operators in Colombia, which enables us to estimate how many people commute daily from every municipality of Colombia to a specific area of Bogota (D.C.). We estimate that the size of the Poblacion Flotante could represent a 5.4% increase of Bogota (D.C.)'s population. During weekdays, the commuters tend to visit the city center more.

rf_wp_67.pdf
Gomez-Lievano, A., Tellez, J. & Lora, E., 2015. New Insights About Wage Inequality in Colombia.

This paper presents a descriptive analysis of wage inequality in Colombia by cities and industries and attempts to evaluate the impact of the inequality of industries on the inequality of cities. Using the 2010-2014 Colombian Social Security data, we calculate the Gini coefficient for cities and industries and draw comparisons between their distributions. Our results show that while cities are unequal in similar ways, industries differ widely in their Gini coefficients. Moreover, industrial structure plays a significant role in determining city inequality. The industrial framework proves to be a key element in this area for researchers and policymakers.
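
For readers unfamiliar with the measure, the sketch below shows one common way to compute a Gini coefficient per group (for example, per city). It is a minimal illustration, not the authors' code: the function name `gini`, the sorted-rank formula chosen here, and the wage figures are all assumptions made for this example, and the paper may use a different estimator or weighting.

```python
def gini(values):
    """Gini coefficient of non-negative numbers: 0 means perfect equality."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    # Sorted-rank formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # where i = 1..n indexes the values in ascending order.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Made-up wages for two hypothetical cities (illustration only, not paper data).
wages_by_city = {
    "city_a": [10, 10, 10, 10],  # everyone earns the same
    "city_b": [0, 0, 0, 40],     # one worker earns everything
}
gini_by_city = {city: gini(wages) for city, wages in wages_by_city.items()}
print(gini_by_city)
```

The same function could be applied to wages grouped by industry instead of by city, which is the kind of comparison the abstract describes.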

wage_inequality_colombia_wp_66.pdf
2014
Neffke, F., et al., 2014. Agents of Structural Change: The role of firms and entrepreneurs in regional diversification.

Who introduces structural change in regional economies: Entrepreneurs or existing firms? And do local or non‐local founders of establishments create most novelty in a region? Using matched employer/employee data for the whole Swedish workforce, we determine how unrelated and therefore how novel the activities of different establishments are to a region’s industry mix. Up‐ and downsizing establishments cause large shifts in the local industry structure, but these shifts only occasionally require an expansion of local capabilities because the new activities are often related to existing local activities. Indeed, these incumbents tend to align their production with the local economy, deepening the region’s specialization. In contrast, structural change mostly originates via new establishments, especially those with non‐local roots. Moreover, although entrepreneurs start businesses more often in activities unrelated to the existing regional economy, new establishments founded by existing firms survive in such activities more often, inducing longer‐lasting changes in the region.

cid_rfwp_75.pdf
Hausmann, R., et al., 2014. Implied Comparative Advantage. CID Working Paper, 276.

The comparative advantage of a location shapes its industrial structure. Current theoretical models based on this principle do not take a stance on how comparative advantages in different industries or locations are related to each other, or what such patterns of relatedness might imply about the underlying process that governs the evolution of comparative advantage. We build a simple Ricardian-inspired model and show this hidden information on inter-industry and inter-location relatedness can be captured by simple correlations between the observed patterns of industries across locations or locations across industries. Using the information from related industries or related locations, we calculate the implied comparative advantage and show that this measure explains much of the location’s current industrial structure. We give evidence that these patterns are present in a wide variety of contexts, namely the export of goods (internationally) and the employment, payroll and number of establishments across the industries of subnational regions (in the US, Chile and India). The deviations between the observed and implied comparative advantage measures tend to be highly predictive of future industry growth, especially at horizons of a decade or more; this explanatory power holds at both the intensive and the extensive margin. These results suggest that a component of the long-term evolution of comparative advantage is already implied in today’s patterns of production.

2020-07-cid-wp-276-revised-implied-comparative-advantage.pdf
Revised July 2020.
