Correlating Code & Community (part 2)

But how should we study the serialized traces of a code-based community? Even the relatively small sample of 240 mods (small by the standards of "big data") is still quite large by the standards of a traditional, close reading-based criticism. If we were to open up these files, what would we be looking for anyway? The machine-readable mods hardly reward the close attention of a human reader.

In looking for an alternative approach, it will be helpful to understand what these files contain. The various mods are distributed as patches (.ips files), which have to be applied to a ROM file of the original game; the patches are just instruction files indicating how the computer is to modify the game's code. As such, the patch files can be seen, rather abstractly, as crystallizations of the serialization process: if "repetition + variation" is the formal core of seriality, then these patches are the abstract records of pure variation, waiting to be plugged back into the framework of the game (the repeating element).
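To make the structure of these patch files more concrete, here is a minimal sketch in Python of how an .ips patch might be read and applied. It assumes the commonly documented IPS layout: a "PATCH" header, then records consisting of a 3-byte offset and a 2-byte size followed by the data, a run-length-encoded variant signaled by a size of zero, and a closing "EOF" marker. The function names parse_ips and apply_ips are my own illustrative choices, not part of any existing tool, and the sketch ignores edge cases such as the optional truncation extension that some patches carry after the "EOF" marker.

```python
from typing import List, Tuple

# One record = (offset into the ROM, bytes to write at that offset).
Record = Tuple[int, bytes]


def parse_ips(patch: bytes) -> List[Record]:
    """Parse an IPS patch into a list of (offset, data) records.

    Assumes the commonly documented IPS layout; hypothetical helper,
    not taken from any particular patching tool.
    """
    if patch[:5] != b"PATCH":
        raise ValueError("not an IPS patch: missing PATCH header")
    records: List[Record] = []
    pos = 5
    while True:
        marker = patch[pos:pos + 3]
        if len(marker) < 3:
            raise ValueError("truncated patch: no EOF marker found")
        if marker == b"EOF":
            break
        offset = int.from_bytes(marker, "big")
        size = int.from_bytes(patch[pos + 3:pos + 5], "big")
        pos += 5
        if size == 0:
            # Run-length-encoded record: 2-byte run length, 1-byte value.
            run = int.from_bytes(patch[pos:pos + 2], "big")
            data = patch[pos + 2:pos + 3] * run
            pos += 3
        else:
            # Plain record: `size` literal bytes follow.
            data = patch[pos:pos + size]
            pos += size
        records.append((offset, data))
    return records


def apply_ips(rom: bytes, records: List[Record]) -> bytes:
    """Return a patched copy of `rom`, growing it if a record writes past the end."""
    out = bytearray(rom)
    for offset, data in records:
        end = offset + len(data)
        if end > len(out):
            out.extend(b"\x00" * (end - len(out)))
        out[offset:end] = data
    return bytes(out)
```

Read this way, each patch reduces to a list of (offset, data) pairs: a record of where, and by how much, a given mod departs from the original game.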

But when we do plug them back in, then what? We can play the games in an emulator, and certainly it would be interesting — but extremely time-consuming — to compare them all in terms of visual appearance, gameplay, and interface. Or we can open the modified game file in a hex editor, in which case we might get lucky and find an interesting trace of the serialization process, such as the following:

Here we find an embedded "infratext" in the hexcode of "Millennium Mario," a mod by an unknown hacker reportedly dating back to January 1, 2000. Note, in particular, the reference to a fellow modder, "toma," the self-glorifying "1337" (i.e. "leet" or "elite") comment, and the skewed ASCII art — all signs of a community of serialization operating at a level that is subterranean to gameplay. But this example also demonstrates the need for a more systematic approach — while at the same time exposing the obstacles to systematicity. For at stake here is not just code but also the software we use to access it and other broadly "paratextual" elements, including even the display window size or "view" settings of the hex editor:

In a sense, this might be seen as a first demonstration of the importance of visualization not only in the communication of results but in the constitution of research objects themselves! In any case, it clearly establishes the need to think carefully about what it is, precisely, that we are studying: serialization is not imprinted clearly and legibly in the code, but is distributed in the interfaces of software and hardware, gameplay and modification, code and community.
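The dependence on "view" settings can be illustrated with a small Python sketch that renders the same bytes as the text column of a hex editor would, at different row widths. The function name ascii_view and the sample bytes are hypothetical, invented purely for illustration; they are not taken from "Millennium Mario" or from any particular hex editor.

```python
from typing import List


def ascii_view(data: bytes, width: int) -> List[str]:
    """Render bytes as rows of text, `width` bytes per row.

    Printable ASCII is shown as-is; everything else becomes '.',
    mimicking the text column of a typical hex editor.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    rows = []
    for start in range(0, len(data), width):
        row = data[start:start + width]
        rows.append("".join(chr(b) if 32 <= b < 127 else "." for b in row))
    return rows


# Hypothetical sample: a tiny "ASCII art" box stored as 12 consecutive bytes.
sample = b"+--+|  |+--+"

for width in (4, 6):
    print(f"width = {width}")
    for line in ascii_view(sample, width):
        print(line)
```

At a width of 4 the box lines up as its author presumably intended, while at a width of 6 the same bytes appear skewed: nothing in the data has changed, only the window through which it is read.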

Again, I follow Mark Marino's conception of critical code studies, particularly with respect to his broad understanding of the object of study. He asks:

What can be interpreted?

Everything. The code, the documentation, the comments, the structures — all will be open to interpretation. Greater understanding of (and access to) these elements will help critics build complex readings. [...] code can be written for programs that will never be executed. Within CCS, if code is part of the program or a paratext (understood broadly), it contributes to meaning. I would also include interpretations of markup languages and scripts, as extensions of code. Within the code, there will be the actual symbols but also, more broadly, procedures, structures, and gestures. There will be paradigmatic choices made in the construction of the program, methods chosen over others and connotations.

In addition to symbols and characters in the program files themselves, paratextual features will also be important for informed readers. The history of the program, the author, the programming language, the genre, the funding source for the research and development (be it military, industrial, entertainment, or other), all shape meaning, although any one reading might emphasize just a few of these aspects. The goal need not be code analysis for code's sake, but analyzing code to better understand programs and the networks of other programs and humans they interact with, organize, represent, manipulate, transform, and otherwise engage.

But, especially when we are dealing with a large set of serialized texts and paratexts, this expansion of code and the attendant proliferation of data exacerbate our methodological problems. How are we to conduct a "critical hermeneutics" of the binary files, their accompanying "README" files, the ROMhacking website, and its extensive database, all of which contain information relevant to an assessment of the multi-layered processes of digital seriality? It is here, I suggest, that CCS can profit from being combined with DH methods.