Big Data & Society (BD&S)

Big Data & Society (BD&S) is an Open Access peer-reviewed scholarly journal that publishes interdisciplinary work principally in the social sciences, humanities and computing and their intersections with the arts and natural sciences about the implications of Big Data for societies. The Journal’s key purpose is to provide a space for connecting debates about the emerging field of Big Data practices …

This journal is a member of the Committee on Publication Ethics (COPE).

  • by Tarleton Gillespie
    Big Data & Society, Volume 11, Issue 2, April-June 2024. Proponents of generative AI tools claim they will supplement, even replace, the work of cultural production. This raises questions about the politics of visibility: what kinds of stories do these tools tend to generate, and what do they generally not? Do these tools match the kind of diversity of representation that marginalized populations and non-normative communities have fought to secure in publishing and broadcast media? I tested three widely available generative AI tools with prompts designed to reveal these normative assumptions; I prompted the tools multiple times with each, to […]
  • by Claire Stravato Emes
    Big Data & Society, Volume 11, Issue 2, April-June 2024. The paper explores the potential of big data analytics for researching anti-immigrant discourse. We emphasize contextualization as an essential element of research and follow a hybrid approach inspired by best practices of computational content analysis, combining human hermeneutic expertise with supervised machine learning to classify a corpus of comments in online news communities in Singapore over 6 months (N = 399,225). The paper highlights how big data analytics can provide a nuanced and critical apprehension of immigrant-related discourse in large social media datasets.
  • by Pauline Gourlet
    Big Data & Society, Volume 11, Issue 2, April-June 2024. How to participate in artificial intelligence otherwise? Put simply, when it comes to technological developments, participation is either understood as public debates with non-expert voices to anticipate risks and potential harms, or as a way to better design technical systems by involving diverse stakeholders in the design process. We advocate for a third path that considers participation as crucial to problematise what is at stake and to get a grip on the situated developments of artificial intelligence technologies. This study addresses how the production of accounts shapes problems that arise with […]
  • by Kenzo Soares Seto
    Big Data & Society, Volume 11, Issue 2, April-June 2024. This commentary explores Brazil's role in Latin American platform capitalism, integrating Ruy Mauro Marini's theoretical framework with contemporary studies of platform capitalism. It examines the connections between Latin American platforms, overexploitation, and data accumulation, leading to the concept of platform sub-imperialism: The emergence of certain Southern countries as platform sub-imperialist powers, acting as regional centers of data and capital accumulation through the expansion of their platforms into neighboring countries. This positioning constitutes an intermediate state between hegemonic nations and “digital colonies” in the international division of platform labor, data accumulation, […]
  • by Dietmar Offenhuber
    Big Data & Society, Volume 11, Issue 2, April-June 2024. Synthetic data are computer-generated data that mimic and substitute empirical observations without directly corresponding to real-world phenomena. Widely used in privacy protection, machine learning, and simulation, synthetic data is an emerging field only just beginning to be explored in the social sciences and critical data studies. However, recent developments, such as the use of synthetic data in the US Census and American Community Survey, make a reflection on the nature and implications of synthetic data urgent. While earlier work focused mostly on training data for machine-learning models, this paper presents […]
  • by Paula Helm
    Big Data & Society, Volume 11, Issue 2, April-June 2024. In addition to tapping data from users’ behavioral surplus, by drawing on generative adversarial networks, data for artificial intelligence is now increasingly being generated through artificial intelligence. With this new method of producing data synthetically, the data economy is not only shifting from “data collection” to “data generation.” Synthetic data is also being employed to address some of the most pressing ethical concerns around artificial intelligence. It thereby comes with the sociotechnical imaginary that social problems can be cut out of artificial intelligence, separating training data from real persons. In […]
  • by Julian Posada
    Big Data & Society, Volume 11, Issue 2, April-June 2024. Many workers worldwide rely on digital platforms for their income. In Venezuela, a nation grappling with extreme inflation and where most of the workforce is self-employed, data production platforms for machine learning have emerged as a viable opportunity for many to earn an income in US dollars. Data workers are deeply interconnected within a vast network of entities that act as intermediaries for wage payments in digital currencies. Past research on embeddedness has noted that being intertwined in multi-tiered socioeconomic networks of companies and individuals can offer significant rewards to […]
  • by Sarah Burkhardt
    Big Data & Society, Volume 11, Issue 2, April-June 2024. A recent innovation in the field of machine learning has been the creation of very large pre-trained models, also referred to as ‘foundation models’, that draw on much larger and broader sets of data than typical deep learning systems and can be applied to a wide variety of tasks. Underpinning text-based systems such as OpenAI's ChatGPT and image generators such as Midjourney, these models have received extraordinary amounts of public attention, in part due to their reliance on prompting as the main technique to direct and apply them. This paper […]
  • by Shelbey R. Call
    Big Data & Society, Volume 11, Issue 2, April-June 2024. Self-trackers collect personal data for many reasons, including generating insight about their bodies, habits, productivity, and wellbeing. Self-tracking may expose intimate facets of daily life, raising important questions about surveillance, privacy, and data ownership. In this study, we investigated an online community of self-trackers and their weekly “show-and-tell” presentations through observations of their meetings and interviews with members. Making sense of their personal data in community with others involved practical and philosophical difficulties that participants navigated by integrating competing priorities for their interactions in specific communication moves and by transcending […]
  • by Chelsea Peterson-Salahuddin
    Big Data & Society, Volume 11, Issue 2, April-June 2024. Content moderation algorithms influence how users understand and engage with social media platforms. However, when identifying hate speech, these automated systems often contain biases that can silence or further harm marginalized users. Recently, scholars have offered both restorative and transformative justice frameworks as alternative approaches to platform governance to mitigate harms caused to marginalized users. As a complement to these recent calls, in this essay, I take up the concept of reparation as one substantive approach social media platforms can use alongside and within these justice frameworks to take actionable […]
  • by Yuner Zhu
    Big Data & Society, Volume 11, Issue 2, April-June 2024. This article examines the public response to mandatory location disclosure (MLD), a new surveillance technology implemented on China's Sina Weibo. Initially introduced to geo-tag posts related to the Ukraine War, the MLD eventually expanded to encompass all posts and comments on the platform. Drawing on a large-scale dataset comprising over 0.6 million posts and 24 million comments, this study uncovers political asymmetry observed during the initial implementation of MLD. Users with different political orientations were subjected to different levels of geo-tagging. Pro-Ukraine users were most frequently geo-tagged, followed by Pro-Russia […]
  • by Cristina Juverdeanu
    Big Data & Society, Volume 11, Issue 2, April-June 2024. Part of an accelerated trend to integrate algorithms in immigration decision-making, the UK's EU Settlement Scheme relies on automated data checks as an essential and mandatory step in the application for UK residence. In this article, I engage with the literature on datafication and algorithmic accuracy to showcase algorithmic inaccuracy within borders in regard to the allocation of residence statuses and rights. I argue that, while the EUSS uses big data to create a data double of the ‘desirable’ migrant, even applicants within this category experience mismatches. Some EU+ Citizens […]
  • by Isak Engdahl
    Big Data & Society, Volume 11, Issue 2, April-June 2024. This article presents an ethnographic case study of a corporate-academic group constructing a benchmark dataset of daily activities for a variety of machine learning and computer vision tasks. Using a socio-technical perspective, the article conceptualizes the dataset as a knowledge object that is stabilized by both practical standards (for daily activities, datafication, annotation and benchmarks) and alignment work – that is, efforts including forging agreements to make these standards effective in practice. By attending to alignment work, the article highlights the informal, communicative and supportive efforts that underlie the success […]
  • by Przemyslaw Matt Lukacz
    Big Data & Society, Volume 11, Issue 2, April-June 2024. The proliferation of environmentally oriented programs within the tech industry, and the industry's coinciding efforts toward data and technology democratization, generate concerns about the status of environmental data within the digital economy. While the accumulation of digital personal data has been a cornerstone of domination of the data analytics industry, many believe environmental data to be a source of “untapped potential.” The potential of environmental data, the argument goes, would benefit equally the digital economy, environmental sciences, and academic data and artificial intelligence experts. This article analyzes the proliferation of the […]
  • by Mihaela Popescu
    Big Data & Society, Volume 11, Issue 2, April-June 2024. This study examines the impact of role-based constraints on privacy cynicism within higher education, a workplace increasingly subjected to surveillance. Using a thematic analysis of 15 in-depth interviews conducted between 2017 and 2023 with data stewards in the California State University System, the research explores the reasons behind data stewards’ privacy cynicism, despite their knowledge of privacy and their own ability to protect it. We investigate how academic data custodians navigate four role-based tensions: the conflict between the institutional and personal definitions of privacy; the mutual reinforcement between their privacy-cynical […]