Date: 2018-04-11

Facebook recently announced dramatic data access restrictions on its app and website. The company framed the lockdown as an attempt to protect user information, in response to the public outcry following the Cambridge Analytica scandal.

But the decision is in line with growing restrictions imposed on researchers studying Facebook and its photo-sharing app Instagram, which also began immediately restricting access to its data on April 4.

In fact, several limitations were put in place in February this year, before the Cambridge Analytica fiasco – in which data was allegedly harvested from 50m Facebook profiles – erupted publicly. Version 2.5 of Facebook’s API was scheduled to be retired this month, a change that, among other things, prevents access to the IDs of users participating in public forums.

Social networks offer two main entry points for the collection of data: the interfaces built for human users, and the software interfaces designed for consumption by computer programs, known as Application Programming Interfaces (APIs).

While APIs are intended for programmers building apps that add to the growing ecosystem of services offered by social networks, researchers have also leveraged these interfaces to study social behaviour online.

Given the mammoth size of Facebook’s userbase (2.13 billion at the last count), external scrutiny of the content on the social network is extremely important. In recent years, however, researchers have been fighting an uphill battle to get the company to provide access to data. Now its latest decision has made it virtually impossible to carry out large-scale research on Facebook.

The changes render defunct the software and libraries dedicated to academic research on Facebook, including netvizz, NodeXL, SocialMediaLab, fb_scrape_public and Rfacebook, all of which relied on Facebook’s APIs to collect data.

Systematic research on Facebook content is now untenable, turning what was already a worryingly opaque, siloed social network into a black box that is arguably even less accountable to lawmakers and the public – both of whom benefited from academics who monitored developments on the site.

Deen Freelon, the developer of fb_scrape_public, which analyses large, publicly available datasets on Facebook, told us via email that “the decision to restrict access to the Pages API could severely impair content-based Facebook research going forward, depending on how willing Facebook is to approve access. If it doesn’t approve access for most research purposes, that could create incentives for researchers to scrape Facebook directly, which violates its terms of service.” Data scraping, or harvesting, is a method by which a computer program extracts information from web pages.

Bernhard Rieder, an associate professor at the University of Amsterdam who developed netvizz – a tool that extracts data from Facebook for research purposes – believes the move was a consequence of the level of unfettered access given to anyone until 2015 and that “there is a real possibility that these services will increasingly be inscrutable and unobservable”.

Up until three years ago, Facebook allowed third-party apps to have access to data on the friends of app users. It was this function that was used by Aleksandr Kogan, a researcher at the University of Cambridge.

Kogan – through his Global Science Research startup, which was separate from his academic work – allegedly collected profile information from 270,000 Facebook users and tens of millions of their friends using a personality test app called “thisisyourdigitallife”. It’s alleged that Cambridge Analytica used that data in an attempt to target political campaigns including the 2016 US presidential election.

Marc Smith, who led the Microsoft team that created NodeXL, which analyses social network data, told us that there was an opportunity to rethink the social networks people choose to use in light of the data scandal.

Why APIs matter

APIs allow researchers to retrieve large-scale data and curate databases associated with meaningful events. Without them, web interfaces have to be scraped to access the data, which is labour intensive and drastically limits the amount of information that can be collected and processed.
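
To make the contrast concrete, here is a minimal sketch in Python. It does not use Facebook’s real API or page markup: the JSON payload, the HTML snippet, the field names and the function names are all hypothetical, invented purely for illustration. It shows why structured API output is easier to work with than a page designed for human readers.

```python
import json
from html.parser import HTMLParser

# Hypothetical API response: structured JSON with named fields.
API_RESPONSE = (
    '{"posts": [{"id": "1", "message": "Town hall tonight"}, '
    '{"id": "2", "message": "Polls open at 7am"}]}'
)

# Hypothetical web page showing the same posts as markup meant for people.
PAGE_HTML = """
<div class="feed">
  <div class="post" data-id="1"><p class="msg">Town hall tonight</p></div>
  <div class="post" data-id="2"><p class="msg">Polls open at 7am</p></div>
</div>
"""


def messages_from_api(raw):
    """Read the message field of each post from a JSON payload."""
    return [post["message"] for post in json.loads(raw)["posts"]]


class MessageScraper(HTMLParser):
    """Collect the text inside every <p class="msg"> element."""

    def __init__(self):
        super().__init__()
        self.in_message = False
        self.messages = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "p" and ("class", "msg") in attrs:
            self.in_message = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_message = False

    def handle_data(self, data):
        if self.in_message:
            self.messages.append(data)


def messages_from_page(html):
    """Scrape the same messages out of the page markup."""
    scraper = MessageScraper()
    scraper.feed(html)
    return scraper.messages
```

The API route is a single line that asks for a named field. The scraping route depends on the page’s tag names and class attributes, so in this sketch it would stop working as soon as the hypothetical layout changed, which is part of what makes scraping at scale so labour intensive.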
