Despite proportionality being one of the tenets of data protection laws, we
currently lack a robust analytical framework to evaluate the reach of modern
data collections and the network effects at play. Here, we propose a
graph-theoretic model and notions of node- and edge-observability to quantify
the reach of networked data collections. We first prove closed-form expressions
for our metrics and quantify the impact of the graph’s structure on
observability. Second, using our model, we quantify how (1) from 270,000
compromised accounts, Cambridge Analytica collected 68.0M Facebook profiles;
(2) from surveilling 0.01% of the nodes in a mobile phone network, a
law-enforcement agency could observe 18.6% of all communications; and (3) an
app installed on 1% of smartphones could monitor the location of half of the
London population through close proximity tracing. Better quantifying the reach
of data collection mechanisms is essential to evaluate their proportionality.
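
To make the two metrics concrete, here is a minimal Python sketch. It rests on assumptions that the abstract does not spell out: a node counts as observed if it is compromised or adjacent to a compromised node, and an edge counts as observed if at least one of its endpoints is compromised. The function names, the adjacency-dict representation, and the toy star graph are all illustrative and not taken from the paper. The closed-form helper follows from a simple counting argument for uniformly random compromise and is not necessarily the expression the authors prove.

```python
from math import comb


def observability(adj, compromised):
    """Return (node_observability, edge_observability) for an undirected graph.

    adj: dict mapping each node to the set of its neighbours (assumed symmetric).
    compromised: iterable of compromised nodes.
    Assumed definitions (not confirmed by the abstract):
      - a node is observed if it is compromised or has a compromised neighbour;
      - an edge is observed if at least one endpoint is compromised.
    """
    compromised = set(compromised)

    # Observed nodes: the compromised nodes plus all of their neighbours.
    observed_nodes = set(compromised)
    for u in compromised:
        observed_nodes |= adj[u]

    # Undirected edges, stored once each as frozensets of their endpoints.
    edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    if not edges:
        raise ValueError("graph has no edges")
    observed_edges = {e for e in edges if e & compromised}

    return len(observed_nodes) / len(adj), len(observed_edges) / len(edges)


def expected_edge_observability(n, k):
    """Expected edge observability when k of n nodes are compromised
    uniformly at random (assumption). An edge is missed only if both
    endpoints fall among the n - k uncompromised nodes."""
    return 1 - comb(n - k, 2) / comb(n, 2)


# Toy example: a star graph with hub 0 and four leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(observability(star, {0}))  # compromising the hub
print(observability(star, {1}))  # compromising a single leaf
print(expected_edge_observability(n=5, k=1))
```

On the star graph, compromising the hub reveals every node and edge, while compromising a leaf reveals only the leaf, the hub, and the one edge between them. This illustrates how the graph's structure, and not only the number of compromised nodes, drives the reach of a collection.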

Authors of this post: <a href="">Florimond Houssiau</a>, <a href="">Piotr Sapiezynski</a>, <a href="">Laura Radaelli</a>, <a href="">Erez Shmueli</a>, <a href="">Yves-Alexandre de Montjoye</a>

