The Nordic Africa Institute (NAI) has released a brief news piece on the opportunities and limitations that “call detail/data records” (CDR), i.e. the metadata generated by the use of mobile phones, might offer for research. Service providers use this big data primarily to bill their customers accurately. According to NAI, such data includes not only information from internet browsing and geolocation, but also (mobile money) payment activities. “Since every phone is like a transmitter, CDR can show where people are and where they are going. This can facilitate studies on labour migration or people’s movement during conflicts,” says NAI researcher Johan Kiessling.
When it comes to the limitations of CDR, the NAI article identifies two main themes:
(1) What is being tracked is not persons but devices (or, more specifically, SIM cards). As we have known at least since James and Versteeg’s (2007) seminal article Mobile phones in Africa: how much do we really know?, the number of mobile phone users and the number of mobile phone owners are not necessarily the same, especially in Africa, because phones are often shared. For research on mobile phone diffusion, this means that the number of actual users tends to be higher than the reported number of registered SIM cards (or lines, as they are called in Kenya, for example). The latter figure, the number of active SIM cards as reported by the phone providers, is nevertheless the basis of most available longitudinal statistics (see ITU as an example).* For research on the movement of people, a related challenge is that it will be difficult to attribute the movement of a “CDR data point” to a specific person, especially in regions where shared access is a dominant mode of phone usage. This affects not only research on urban density or the movement of people, but also other potential uses I see for CDR in social science research, such as social network analysis or crime investigation.
(2) The second drawback of using CDR is that any conclusion drawn from such data is inherently non-representative, i.e. it holds for mobile phone users only. “CDR only tells us something about people with phones. It is likely that rich people have several phones and it is also likely that their behaviour differs from that of poor people. Thus, the information does not reflect the population at large. Particularly in Africa, many people live beyond the range of CDR and therefore it is tricky to draw general conclusions,” Johan Kiessling notes.
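To make the first limitation more concrete, here is a minimal sketch of how movement might be derived from CDR. The record layout (SIM identifier, timestamp, cell tower) and all values are invented for illustration; real CDR formats differ between providers.

```python
from collections import defaultdict

# Hypothetical CDR rows: (sim_id, timestamp, cell_tower).
# Field names and values are made up for this example.
cdr_rows = [
    ("sim-A", "2014-12-01T08:00", "tower-1"),
    ("sim-A", "2014-12-01T18:00", "tower-1"),
    ("sim-A", "2014-12-01T12:00", "tower-2"),
    ("sim-B", "2014-12-01T09:00", "tower-3"),
]

def trajectories(rows):
    """Group rows by SIM and return each SIM's towers in time order."""
    by_sim = defaultdict(list)
    for sim_id, timestamp, tower in rows:
        by_sim[sim_id].append((timestamp, tower))
    # ISO-style timestamps sort chronologically as plain strings.
    return {sim: [tower for _, tower in sorted(events)]
            for sim, events in by_sim.items()}

paths = trajectories(cdr_rows)
print(paths)
```

The key in the result is the SIM card, not a person. If "sim-A" is shared within a household, its path may mix the movements of several people, and nothing in the records tells them apart.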
More generally, another important question is how such extremely sensitive data can be obtained or used at all, which is essentially a matter of privacy, data protection and research ethics. While the availability of “call detail/data records” is problematic in its own right when it comes to privacy and/or security, the situation becomes even trickier when such data is merged with additional publicly available data. An astonishing example of what such a merger might look like was turned into an interactive graphic by the German weekly newspaper DIE ZEIT a while ago. A German Green Party politician had sued the German telecoms giant Deutsche Telekom to hand over “six months of his phone data that he then made available to ZEIT ONLINE. [They] combined this geolocation data with information relating to his life as a politician, such as Twitter feeds, blog entries and websites, all of which is freely available on the internet”.
Click on the graphic below for the interactive map.
Source article: Cellphone data into research, the Nordic Africa Institute (December 2014)
*EDIT: While writing these lines, I double-checked this point and found that the ITU has recently changed its statistical basis, probably to account for this weakness. Among the definitions and standards for its ICT indicators, the ITU lists “Mobile-cellular telephone subscriptions, by postpaid/prepaid” in the 2011 edition of its Handbook on Data Collection (see p. 33), but “Proportion of individuals using a mobile cellular telephone” in the 2014 edition of the Handbook (see p. 60). While the former figure is based on registered SIM cards as reported by service providers, the latter is derived from the survey question “Have you used a mobile telephone in the last three months? Yes/No” (ibid.). This adaptation should lead to a much clearer statistical picture of mobile phone access and usage than an ownership-based model.