AI and memory: Are we letting the machines help us to forget?

I was recently chatting over lunch with a researcher. We were discussing AI, and whether large language models like ChatGPT produce anything new, or whether they only reuse what is already there. Generative large language models determine what word comes next using probability, so she leaned towards the view that AI produces nothing new; I disagreed. Which got me thinking about what it is that AI produces.

Zuboff (1985) describes how intelligent systems ‘informate’: a process that translates raw data (if there is such a thing; see Gitelman 2013) into information that can be used by an organization. Through this process, the intelligent system produces something new. This is a visible form of production. The ‘something new’ is a report, a pie chart, a distillation of the raw data into a more palatable, actionable format. However, something else is produced that is also new. By distilling the raw data into information, the AI is also producing an absence. Certain aspects of the data are removed, ignored, forgotten. Stefan Timmermans describes this process dramatically in the context of creating categories and standards: ‘Classifying is a memory practice to both hold on to certain characteristics and send other elements into oblivion’ (Timmermans 2015, p. 6; see also Bowker 2008). So, what is it that is being remembered by AI and what is being forgotten? What are we letting an AI cast into oblivion?

I think it’s important to consider what absence is being created, and whether we can let an AI decide what to forget on our behalf. Consider a literature review conducted using ChatGPT. The AI model will select what it considers to be the most salient aspects related to the research question, which is provided through the prompt. It draws insights from across tens, hundreds, maybe thousands of research papers to give a concise overview of the literature. But how do we understand this overview? It’s much more than a summary, because a summary is, by its nature, the result of jettisoning features that are not considered salient. It’s a forgetting. The features that are not salient are cast into oblivion.

Invisibilization is a theme that’s close to my own research on identity recategorization. When an identity is recategorized, the categorical borders are shifted. People who were once invisible become visible, and some who were visible disappear, which shapes policy decisions in health and welfare provision. For example, take the hypothetical case of an AI-supported systematic literature review to understand health inequities among transgender people. An AI could be prompted to draw insights about the size of the transgender population and its access to healthcare. However, there is a huge amount of context and nuanced meaning-making around the terminology used by and about this community, to say nothing of the privacy concerns and the need of this population to remain hidden for their personal safety. Without this depth of appreciation of nuance, any study conducted by an AI would create an absence, and this absence would persist into the literature and the subsequent policy decisions, where it will linger, potentially unseen but no less powerful. This absence did not exist before the study was conducted. Or, at the very least, it was an absence that was known.

Now, one could argue that such an absence could have been created by a human researcher too. And this is of course true. But a large language model is developed to be general-purpose, to work well with the average. It does not have the training of a professional researcher with domain expertise. When working with quantitative or qualitative data about a specific context, it is vital to understand the nuances of that context. Otherwise, we are producing absences that have material effects.

AI models do produce something new. They produce absences. They make us forget. They decide which characteristics we hold on to, and which we send into oblivion. And these are absences that need to be better understood and theorized as we come to use and rely on these models more.

References

  • Bowker, G. (2008) Memory Practices in the Sciences, MIT Press, Cambridge, MA.
  • Gitelman, L. (Ed.) (2013) ‘Raw Data’ Is an Oxymoron, MIT Press, Cambridge, MA.
  • Timmermans, S. (2015) Introduction: Working with Leigh Star, in Boundary Objects and Beyond: Working with Leigh Star, Eds. Bowker, G., Timmermans, S., Clarke, A., and Balka, E., MIT Press, Cambridge, MA.
  • Zuboff, S. (1985) Automate/informate: The two faces of intelligent technology, Organizational Dynamics, Elsevier.

Written by: katherinewyers

I'm an information systems researcher studying LGBTQ+ identity categorization, information infrastructures, and social justice

January 15, 2025
