piotr sapieżyński


  1. Algorithms that "Don't See Color": Comparing Biases in Lookalike and Special Ad Audiences
    Piotr Sapiezynski, Avijit Ghosh, Levi Kaplan, Alan Mislove, Aaron Rieke. arXiv preprint, 2019
    Following accusations of allowing discrimination on its ad platform, Facebook settled with civil rights groups and agreed to introduce a number of changes to the platform. Among them is a tool called Special Ad Audiences, which allows advertisers to reach users "similar" to their customers (or employees) without considering age, gender, race, etc. In this report we show that simply not looking at these protected attributes changes very little: the created audiences have nearly the same level of bias as the source audience.
  2. Interaction data from the Copenhagen Networks Study
    Piotr Sapiezynski, Arkadiusz Stopczynski, David Dreyer Lassen, Sune Lehmann. Nature Scientific Data, 2019
    We released a multi-layer temporal network connecting more than 700 students over a period of four weeks. The dataset was collected via smartphones as part of the Copenhagen Networks Study and includes physical proximity, metadata for calls and text messages, a static Facebook friendship graph, and gender information. My collaborators and I have already published multiple papers based on the data; now we're happy to finally share it with the rest of the scientific community!
  3. Ad Delivery Algorithms: The Hidden Arbiters of Political Messaging
    Muhammad Ali*, Piotr Sapiezynski*, Aleksandra Korolova, Alan Mislove, Aaron Rieke. arXiv preprint, 2019
    Political speech is paid, not free. On Facebook it also costs different amounts to advertise different political opinions to the same people. Showing liberal ads to conservatives (or conservative ads to liberals) can cost three times more than showing an "aligned" ad to the same audience. Further, when a political advertiser tries to show their ad to a broad audience, Facebook will show it predominantly to people who already agree with the message instead.
  4. Discrimination through Optimization: How Facebook's Ad Delivery Can Lead to Biased Outcomes
    Muhammad Ali*, Piotr Sapiezynski*, Miranda Bogen, Aleksandra Korolova, Alan Mislove, Aaron Rieke. CSCW 2019
    Most research on discriminatory advertising has concerned abuse of targeting features: excluding Black and Latino people from seeing housing ads, excluding older workers from job ads, etc. In this work, we showed that even if advertisers want to show their ads to a diverse audience, Facebook will preferentially present them to users whom Facebook predicts to be more interested. As a result, women see different job ads than men (supermarket cashiers and janitors vs. AI specialists and lumberjacks), while white and Black people are presented with different housing opportunities.
  5. Auditing Offline Data Brokers via Facebook's Advertising Platform
    Giridhari Venkatadri, Piotr Sapiezynski, Elissa M Redmiles, Alan Mislove, Oana Goga, Michelle Mazurek, Krishna P Gummadi. WWW 2019
    Facebook knows more than what you share in your profile or reveal through your browsing history. We find that, on top of that, Facebook has been buying information about more than 90% of its US users from data brokers. At least 40% of it (including financial information) is not at all accurate, potentially affecting not just the ads you see but also credit decisions, etc.
  6. Quantifying the Impact of User Attention on Fair Group Representation in Ranked Lists
    Piotr Sapiezynski, Wesley Zeng, Ronald E Robertson, Alan Mislove, Christo Wilson. Companion Proceedings of WWW 2019
    We interact with ranked lists every day through web search results, job postings, or dating services. Arguably, a fair representation of a group (for example, women among job applicants) requires that this group gets enough attention as a whole. That attention depends both on where the group's members are in the ranking and on how much of that ranking is actually seen. In this paper we model the interplay between these two factors.
  7. Investigating sources of PII used in Facebook’s targeted advertising
    Giridhari Venkatadri, Elena Lucherini, Piotr Sapiezynski, Alan Mislove. Proceedings on Privacy Enhancing Technologies, 2019
    Facebook nudged its users to enable two-factor authentication by providing their phone numbers "for additional security". In turn, advertisers can now use these phone numbers to target those users with ads. Even if users didn't enable 2FA but went with the default option of using FB Messenger for text messaging, their phone number is now targetable. Worst of all, even if you never gave your phone number to Facebook for any reason, but any one of your friends had your phone number in their phone book and allowed Facebook access, your phone number is now targetable.


  8. Academic performance and behavioral patterns
    Valentin Kassarnig, Enys Mones, Andreas Bjerre-Nielsen, Piotr Sapiezynski, David Dreyer Lassen, Sune Lehmann. EPJ Data Science, 2018
    Based on data collected from smartphones and Facebook, we find that for a large share of students, the academic performance of their friends is more predictive of their own performance than attendance is (but not for all of them, see our other paper). Showing up for classes consistently and not playing with phones during lectures are still the most predictive individual features.
  9. Evidence for a conserved quantity in human mobility
    Laura Alessandretti, Piotr Sapiezynski, Vedran Sekara, Sune Lehmann, Andrea Baronchelli. Nature Human Behaviour, 2018
    Humans have been shown to have a fixed maximum capacity for the number of people with whom they can maintain active acquaintanceships (because of mental, not time, constraints), known as the "Dunbar number". In this work we show that such a capacity also exists for the number of active physical locations: for example, when you find a new restaurant, you tend to stop going to one of your previous favorites.


  10. Evidence of Complex Contagion of Information in Social Media: An Experiment Using Twitter Bots
    Bjarke Mønsted, Piotr Sapieżyński, Emilio Ferrara, Sune Lehmann. PLOS ONE, 2017
    The spread of diseases follows a simple contagion model: every time you're exposed to a virus or bacterium, there's a certain probability of getting sick. It has been hypothesised that the spread of information and trends follows a complex contagion model, in which you need multiple sources of exposure to pick it up. Using a coordinated group of Twitter bots, we disseminated positive messages to real people and showed that they are more likely to retweet our content if they're exposed to it from multiple sources than if they're exposed multiple times from the same source.
  11. Inferring person-to-person proximity using WiFi signals
    Piotr Sapiezynski, Arkadiusz Stopczynski, David Kofoed Wind, Jure Leskovec, Sune Lehmann. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 2017
    We find it's possible to reliably infer whether two people are in close proximity by comparing which WiFi routers their phones see. At the time of writing, 80% of Android apps had access to the nearby WiFi routers at all times, posing a massive privacy risk.
  12. Academic performance prediction in a gender-imbalanced environment
    Piotr Sapiezynski, Valentin Kassarnig, Christo Wilson, Sune Lehmann, Alan Mislove. FATREC Workshop on Responsible Recommendation at RecSys, 2017
    Our other paper on predicting academic performance from individual behavior and social networks shows that social ties are predictive of one's grades. In this paper we show that this holds mostly for men (the majority in the dataset), but not for women (the minority). Any machine learning algorithm by default optimizes for overall performance, and as a result women get worse predictions than men. We suggest achieving parity by selecting combinations of features that lead to a more balanced performance.
  13. The Role of Gender in Social Network Organization
    Ioanna Psylla, Piotr Sapiezynski, Enys Mones, Sune Lehmann. PLOS ONE, 2017
    We observe population-level differences between men and women in the Copenhagen Networks Study, especially with respect to their social networks: women are much more likely to be friends mostly with other women and, as a minority, are on the periphery of the university network.
  14. Multi-scale spatio-temporal analysis of human mobility
    Laura Alessandretti, Piotr Sapiezynski, Sune Lehmann, Andrea Baronchelli. PLOS ONE, 2017
    We show that the distributions of distances and waiting times between consecutive locations in human mobility traces are best described by log-normal and gamma distributions, respectively, and that natural time-scales emerge from the regularity of human mobility.

    The full publication list is available on my Google Scholar profile.