DP6789 Microstructure of Collaboration: The 'Social Network' of Open Source Software

Author(s): Chaim Fershtman, Neil Gandal
Publication Date: April 2008
Keyword(s): Microstructure of Collaboration, network, open source
JEL(s): L17
Programme Areas: Industrial Organization
Link to this Page: www.cepr.org/active/publications/discussion_papers/dp.php?dpno=6789

The open source model is a form of software development with source code that is typically made available to all interested parties. At the core of this model is a decentralized production process: open source software development is done by a network of unpaid software developers. Using data from Sourceforge.net, the largest repository of Open Source Software (OSS) projects and contributors on the Internet, we construct two related networks: a Project network and a Contributor network. Knowledge spillovers may be closely related to the structure of such networks, since contributors who work on several projects likely exchange information and knowledge. Defining the number of downloads as output, we find that (i) additional contributors are associated with an increase in output, but additional contributors to projects in the giant component are associated with greater output gains than additional contributors to projects outside of the giant component; (ii) betweenness centrality of the project is positively associated with the number of downloads; (iii) closeness centrality of the project also appears to be positively associated with downloads, but the effect is not statistically significant over all specifications; (iv) controlling for the correlation between these two measures of centrality (betweenness and closeness), the degree is not positively associated with the number of downloads; and (v) the average closeness centrality of the contributors that participated in a project is positively correlated with the success of the project. These results suggest that there are positive spillovers of knowledge for projects occupying critical junctures in the information flow.
When we define projects as connected if and only if they had at least two contributors in common, we again find that additional contributors are associated with an increase in output, and again find that this increase is much higher for projects with strong ties than for other projects in the giant component.
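
As an illustration of the network construction described above, the following is a minimal sketch of how a Project network could be derived from contributor memberships, with the giant component, degree, and closeness centrality computed on it. The toy data, the function names (`project_network`, `distances_from`, `giant_component`, `closeness`), the `min_shared` parameter, and the closeness normalization are all assumptions made for illustration; they are not taken from the paper, whose exact definitions may differ.

```python
from collections import deque
from itertools import combinations


def project_network(memberships, min_shared=1):
    """Link two projects when they share at least `min_shared` contributors.

    `memberships` maps each contributor to the set of projects they work on.
    """
    shared = {}
    projects = set()
    for projs in memberships.values():
        projects.update(projs)
        for a, b in combinations(sorted(projs), 2):
            shared[(a, b)] = shared.get((a, b), 0) + 1
    graph = {p: set() for p in projects}
    for (a, b), count in shared.items():
        if count >= min_shared:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def distances_from(graph, source):
    """Breadth-first search: shortest path length to every reachable project."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def giant_component(graph):
    """Return the largest connected set of projects."""
    seen, best = set(), set()
    for node in graph:
        if node not in seen:
            component = set(distances_from(graph, node))
            seen |= component
            if len(component) > len(best):
                best = component
    return best


def closeness(graph, node):
    """Assumed definition: reachable projects divided by total distance to them."""
    dist = distances_from(graph, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


# Hypothetical toy data: contributor -> projects
memberships = {
    "alice": {"P1", "P2"},
    "bob": {"P1", "P2", "P3"},
    "carol": {"P3", "P4"},
    "dave": {"P5"},
}

weak = project_network(memberships, min_shared=1)    # at least one shared contributor
strong = project_network(memberships, min_shared=2)  # "strong ties": at least two

print(sorted(giant_component(weak)))
print(len(weak["P3"]), closeness(weak, "P3"))
print(sorted(giant_component(strong)))
```

Setting `min_shared=2` corresponds to the stricter definition of connectedness in the final sentence of the abstract. A Contributor network could be built the same way by swapping the roles of contributors and projects.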