It is increasingly common for surveys to collect information on social links and inter-personal flows among individuals, such as friendship, loans and gifts, advice, or referral. In particular, much social network analysis is based on data reported by survey respondents – for example, answers to questions such as “To whom did you lend money?”; “With whom do you exchange information?”; or “Who are your friends?” (Banerjee et al. 2013, Fafchamps and Lund 2003). In principle, the answers given by the two parties to a link should agree. If, for instance, respondent A reports lending money to respondent B, then B should report receiving money from A. Yet it is extremely common for such data to be discordant – that is, there are often considerable discrepancies between the answers given by the two respondents.
Until now, misreporting of this kind has typically been ignored. We show that failing to properly account for misreporting may bias estimation. This is best understood with an example. Imagine we have data on respondents' ethnicity (two groups, A and B) and transfers between them. Assume that respondents from both groups give and receive with equal probability, but members of group A are less likely to report transfers. If the researcher assumes that a transfer took place if and only if at least one respondent reported it, the estimated effect of belonging to ethnic group A on the probability of transfer will be biased downwards. The researcher observes transfers less frequently when givers and receivers are from group A, but the gap is entirely due to the difference in reporting, not to any difference in actual transfers.
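The two-group example can be checked with a small simulation. The sketch below is illustrative only: the function name, the sample size, and all probabilities are assumptions chosen for the illustration, not values from any survey. True transfers are drawn with the same probability for everyone, reports are never false, and group A simply reports less often.

```python
import numpy as np


def simulate_reported_rates(n_dyads=200_000, p_transfer=0.3,
                            p_report_a=0.4, p_report_b=0.8, seed=0):
    """Compare true and observed transfer rates by the giver's group.

    All parameter values are hypothetical. A transfer is 'observed'
    when at least one of the two respondents reports it.
    """
    rng = np.random.default_rng(seed)

    # Group membership of giver and receiver (True = group A).
    giver_is_a = rng.random(n_dyads) < 0.5
    receiver_is_a = rng.random(n_dyads) < 0.5

    # True transfers: same probability regardless of group.
    transfer = rng.random(n_dyads) < p_transfer

    # Reporting probabilities differ by group; no false reports.
    p_giver = np.where(giver_is_a, p_report_a, p_report_b)
    p_receiver = np.where(receiver_is_a, p_report_a, p_report_b)
    giver_reports = transfer & (rng.random(n_dyads) < p_giver)
    receiver_reports = transfer & (rng.random(n_dyads) < p_receiver)

    # The naive rule: a transfer exists if either side reports it.
    observed = giver_reports | receiver_reports

    return {
        "true_rate_giver_a": transfer[giver_is_a].mean(),
        "true_rate_giver_b": transfer[~giver_is_a].mean(),
        "observed_rate_giver_a": observed[giver_is_a].mean(),
        "observed_rate_giver_b": observed[~giver_is_a].mean(),
    }


rates = simulate_reported_rates()
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed values the true rates are the same for both groups, while the observed rate is lower when the giver belongs to group A, so a regression on the observed variable would attribute a negative effect to group A that reflects reporting behaviour only.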
In a recent paper, we propose a novel maximum likelihood estimator that corrects for such misreporting bias in a systematic way (Comola and Fafchamps 2015). This estimator accounts separately for the two respondents’ propensity to report a transfer, which may depend on their observable characteristics. The methodology we propose is of particular interest to researchers studying social networks, but it applies to any situation in which the researcher has two conflicting measurements of the same dependent variable from two different sources. Examples include multiple measurements of schooling levels in twins (Ashenfelter and Krueger 1994), discrepancies over earnings reported by workers and companies (Duncan and Hill 1985), estimates of time spent on housework reported by spouses (Lee and Waite 2005), and bilateral trade flows reported by exporters and importers (Gaulier and Zignago 2010).
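To make the idea concrete, the following is a minimal sketch of the kind of likelihood such an estimator can be built on. It rests on two illustrative assumptions that are not stated above: respondents never report a transfer that did not take place, and the two reports are independent conditional on a transfer. The symbols are also introduced here for illustration only: $P_{ij}$ is the probability that a transfer from $i$ to $j$ took place, $g_{ij}$ and $r_{ij}$ are the probabilities that the giver and the receiver report it, and $G_{ij}$, $R_{ij}$ are the two observed reports. This is not necessarily the exact specification used in Comola and Fafchamps (2015).

```latex
\begin{align*}
\Pr(G_{ij}=1,\, R_{ij}=1) &= P_{ij}\, g_{ij}\, r_{ij} \\
\Pr(G_{ij}=1,\, R_{ij}=0) &= P_{ij}\, g_{ij}\,(1 - r_{ij}) \\
\Pr(G_{ij}=0,\, R_{ij}=1) &= P_{ij}\,(1 - g_{ij})\, r_{ij} \\
\Pr(G_{ij}=0,\, R_{ij}=0) &= 1 - P_{ij}\,\bigl(g_{ij} + r_{ij} - g_{ij}\, r_{ij}\bigr)
\end{align*}
```

The four probabilities sum to one. If $P_{ij}$, $g_{ij}$ and $r_{ij}$ are each modelled as a function of observable characteristics (for instance with a logit link), the log-likelihood is the sum over dyads of the log of the probability of the observed pair of reports, and the three sets of coefficients are estimated jointly.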
Informal transfers are widely recognised as important for development, since they represent a source of favour exchange and insurance against idiosyncratic shocks. We illustrate our methodology using data on inter-household informal transfers (loans and gifts) from the village of Nyakatoke in Tanzania. These data display a high rate of discrepancy in survey responses, which is in line with other existing data sources (see footnote 1) – in only 27% of the cases with a reported transfer was the transfer reported by both the giver and the receiver.
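A back-of-the-envelope calculation shows what a 27% agreement rate can imply. The sketch below is not the estimator described above. It assumes, purely for illustration, that every respondent reports a true transfer with the same probability q, that the two reports are independent, and that nobody reports a transfer that did not happen. Under those assumptions the share of reported transfers confirmed by both sides equals q / (2 - q), which can be inverted for q. The function names are hypothetical.

```python
def implied_report_probability(share_both: float) -> float:
    """Solve q / (2 - q) = share_both for the reporting probability q.

    share_both is the fraction of transfers reported by at least one
    side that are reported by both sides.
    """
    if not 0.0 <= share_both <= 1.0:
        raise ValueError("share_both must lie between 0 and 1")
    return 2.0 * share_both / (1.0 + share_both)


def share_of_transfers_observed(q: float) -> float:
    """Probability that at least one side reports a true transfer."""
    return 1.0 - (1.0 - q) ** 2


q = implied_report_probability(0.27)
coverage = share_of_transfers_observed(q)
print(f"implied reporting probability: {q:.3f}")
print(f"share of true transfers reported by at least one side: {coverage:.3f}")
```

With a homogeneous reporting probability this simple calculation already suggests that a substantial share of transfers is reported by neither side. The estimator relaxes the homogeneity assumption by letting reporting propensities vary with respondents' characteristics.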
We find robust evidence that failing to account for misreporting results in a sizable underestimation of the total amount of transfers between villagers. According to our calculations, applying standard estimation techniques would capture at most two-thirds of the transfers that we estimate to have been made within the village. This finding casts some doubt on the reliability of previous results that rely on transfers reported in household surveys. In particular, many studies have found that reported gifts and loans are insufficient to insulate households against shocks (Rosenzweig 1988). But if actual gifts and loans are much larger than reported, the extent of informal insurance implied by these studies may have to be revised upwards.
Ashenfelter, O and A Krueger (1994) “Estimates of the economic return to schooling from a new sample of twins”, American Economic Review, 84: 1157-73.
Banerjee, A, A G Chandrasekhar, E Duflo and M O Jackson (2013) “The diffusion of microfinance”, Science, 341.
Comola, M and M Fafchamps (2015) “The missing transfers: Estimating misreporting in dyadic data”, forthcoming in Economic Development and Cultural Change.
Duncan, G and D Hill (1985) “An investigation of the extent and consequences of measurement error in labor economic survey data”, Journal of Labor Economics, 3: 508-522.
Fafchamps, M and S Lund (2003) “Risk sharing networks in rural Philippines”, Journal of Development Economics, 71: 261-87.
Gaulier, G and S Zignago (2010) “BACI: International trade database at the product-level: The 1994-2007 version”, CEPII Working Paper 2010-23, CEPII.
Lee, Y S and L J Waite (2005) “Husbands’ and wives’ time spent on housework: A comparison of measures”, Journal of Marriage and Family, 67: 328-336.
Rosenzweig, M R (1988) “Risk, implicit contracts and the family in rural areas of low-income countries”, Economic Journal, 98: 1148-1170.
Vaquera, E and G Kao (2008) “Do you like me as much as I like you? Friendship reciprocity and its effects on school outcomes among adolescents”, Social Science Research, 37 (1): 55-72.
1 See Vaquera and Kao (2008) for the widely-studied Add-Health dataset.