DP17934 How important are user-generated data for search result quality? Experimental evidence
Do some search engines produce better search results because their algorithm is better, or because they have access to more data from past searches? We document that the algorithm of a small search engine can produce non-personalized results that are similar in quality to those of the dominant firm (Google) if it has enough data, and that overall differences in the quality of search results are explained by searches for rare queries. This is confirmed by results from an experiment in which we keep the algorithm of the search engine fixed and vary only the amount of data it uses as an input. Because 74% of the traffic in our data comes from rare queries, these queries are the pivotal dimension of competition, where a search engine must perform well to offer users high quality and gain market share. Our results suggest that a small search engine would be able to produce search results of similar quality to Google’s if it had access to the data that users generated by using Google in the past. We also discuss why data sharing may increase innovation in this setting.