

This Article contains two imaginary stories about the future.  The first attempts to imagine what might happen if intellectual property law no longer prohibited copying and we were to live in a world entirely driven by data, algorithms, and metrics that monitor reading and discussion; in particular, it dwells on how this might affect scientific and scholarly publications.  The second attempts to imagine what would happen if all of the world's scholarly literature were suddenly deposited in an imaginary open access version of Wikipedia called Wikidemia, and how that might affect scientific practice and scholarly communication.  A brief reflection on the significance of each follows.



In response to an invitation to participate in a UCLA workshop on “Law in 2030,” I took seriously the organizers’ command to imagine the future, and produced two speculative fictions of technology focused on the future of algorithms, publishing practices (commercial and scholarly), information retrieval and indexing, sorting, searching, and measuring value in terms of the movement of social media trends and metrics.  I am neither a science fiction writer nor a legal scholar, but an anthropologist and historian of science and technology, interested in how the nature of scientific practice is changing with the rise of new information and communication technologies.

The first fable extrapolates in a broadly pessimistic direction, the second in a more optimistic one, which I will stop just short of advocating.  Both are based on my own experience and research into the politics of open access, piracy, the digital transformation of publishing, and the challenges facing the neoliberal university.

As the reader will quickly recognize, neither of these fables is really about fifteen years from now; both reflect many things happening today.  Their science fiction quality rests on a few specific insertions of deus ex machina.  In the first, I tried to imagine not so much a technological or a legal change as a broadly cultural one, viz., what would it look like if we really did give up on the myth of the Romantic Author, and instead let consumption drive our criteria of good and bad.  Some obvious absurdities emerge, the most obvious of which is the effect that a radical rejection of history and authority would have on science and the scientific record.  But there are also some disturbingly familiar parts of the story, such as the fact that we are increasingly being sold media (books, movies, music) authored as much by the data and metrics of consumption (to say nothing of so-called machine learning) as by any artist.

In the second case, I turned towards a more utopian imagination driven directly by my own observation of the problem of open access in the university.  This story is even less far-fetched, and the lines between what has already taken place and what could happen next are very thin.  Indeed, I have been anxious to get the story into print before the next big event occurs involving piracy, open access, and the scholarly publishers.  Surely, by the time most people read it, it will look more like an alternate past than a vision of the future.

In both cases I have tried to add footnotes (something a science fiction writer would never do, and a legal scholar would do much better than I) to give a sense of some of the sources that capture my own research and experience.

I.  The Discovery Engine (or, Algorithms Everywhere and Nowhere)

Finally, intellectual property law changed.  There was not much choice in the matter, it had to be brought into line with a world that had changed so much.  Otherwise, it was bound to be just a pathological growth on the judiciary, repeatedly requiring resection.  At least as far as copyright was concerned, everyone conceded that copying had become too expensive to prevent and too complicated to litigate.  Plus, it was not worth the stingy awards the juries were giving anyway.  The technological and organizational complexity of the information ecology had grown like kudzu, to such a point that copying, updating, distributing, forwarding, and deleting all happened so fast, so routinely, and so widely, that copying had just become part of the definition of reading, watching, listening, or playing.

It was not an ideological win, though there were periodic ComicCon retrospective panels featuring Cory Doctorow and Lawrence Lessig’s frozen head, which was, strangely, wearing an old Oculus Rift rig, and swaying back and forth in its pressurized cryo-transport container, humming snippets of Jonathan Coulton songs.[1]

It was just that, after all the big media companies had figured out how to Spotify everything,[2] there was not much reason to hemorrhage money into controlling media, when the game was to extract value in other ways: discovering, sorting, searching, repackaging, and repurposing.[3]  Copyright enforcement fell away; with everything open, or so multiply licensed it was impossible to track, it became clear that real, new growth was in repackaging and reselling information rather than investing in the creation of new information.  New information had margins way too low to invest in anymore, even if novelty still drove the market more than ever.  Something about the old argument that supporters of copyright terms used to make—that copyright was an incentive for people to create—no longer seemed to even make sense.  Innovation certainly had not died, nor had the music.

With all these changes to market practice in place, the reforms swiftly followed.  The Trump presidency forced many changes in its fourth term in office: Copyright was reduced to a one-year term, registration was required again, and fair use was broadened to include basically any kind of action short of plagiarism (and even that was weakly enforced).  The “Trump clause” added to the copyright statute a crypto-Lockean might-makes-right principle of ownership whereby property rights were tied to clear evidence of superior exploitation of a particular digital resource in real time; but those rights could just as easily be transferred when someone else started making more money from them, which nowadays seemed to happen overnight.  Rather than registering ownership of bits, people pointed to the acceleration of transactions in the blockchain as evidence of ownership.  Courts loved the process because it created a permanent historical record of who paid for what and when; they were happy to call it property.

Trademark, by contrast, was not so much reformed as proliferated into abandonment; it was not the brand itself that mattered anymore; it was the metrics, and mentions, and likes, and data flows, now directly tied to the flow of money.[4]  Open access was now required of nearly anything funded by governments, foundations, or corporations.  And the moral right of the author?  There were a few aging legal scholars in the EU who defended it abstractly.  Though the laws were on the books in many parts of the world, they were not much more effective than the few remaining laws against onanism.

It was a new day.  Creativity seemed to blossom suddenly and gloriously everywhere, and the content economy boomed for the first time in decades.  Most of the money came initially from savings: Transactional costs involved in negotiating endless licensing deals fell away, legal fees for policing content dropped to nothing, and, as investment shifted from older systems of content production, firms engaged in a whole new range of economic explorations of existing content.[5]  What was once conducted only as a game of controlling the market sale of a bounded product exploded into qualifications of every sort, strategies for channeling metrics and data—the real value—by combing through complex digital landscapes of material, combining, linking, composing in collections, pulling it apart, and packaging it up in new ways.

In the world of books, it all started with the Public Domain Profit Project (PDPP) run by MacMillan.[6]  As more and more work started to fall into the public domain, MacMillan realized that consumers still actually wanted to read classic literature and that old work could be revived by clever branding and marketing strategies.  Jane Austen was easy to sell, and every publisher had a cheap copy in app stores to which they gave no thought.  But the PDPP asked, what about the more obscure stuff people do not know about?  Could you get people to read Sir Philip Sidney?  The Horrors of Oakdale Abbey?  Walter Pater?  Saul Bellow’s lesser novels?  Could you get them to read Areopagitica?  All the way through?  Not buy them, mind you, but read them.[7]

Because not long after the first wave of subscription eBook sellers like Scribd and Oyster had failed, publishers realized that they had at their disposal a nearly direct conduit into the brains of readers.[8]  That first wave of data extraction companies had recognized that they could track exactly what people were reading, but because they relied on subscriptions, the data was not much use on the consumer side.  On the licensing side, this data had told them what to license and what not to, but they had no power over the potential consumer revenue stream they had discovered.

But once the content industries figured out how to make the social media ecology interoperable with payment systems and personal finance software, and once consumers had started using quantified self data to allocate their mutual funding and crowdfunding apps according to their own social media behavior, or mood, or desire, or CV-fashioning—a whole new world opened up.[9]  Now it was not just about knowing what people were reading and when they stopped; it was about linking this information to social media profiles, networks of friends and followers, and ultimately monitoring in real time the reading experience and its effects.  Finally, no one read alone anymore, even if they were lying in bed by themselves.  And no one bought a book anymore either, yet the money flowed like words on a page.

No, PDPP was not about selling the books, it was about repackaging and marketing them, about generating discussions that made it obligatory to be in the know, about using the power of automated social media-driven mutual funding to mine the immense back catalog of culture for new money.  It was like cool hunting for literature, but it was cheap—no authors, no licensing rights, and no sunk costs, since the texts were already out there and easy to package.  The cool hunters worked nearly for free; a few of them could win big if they could manage to get something eddying in the datastream, drive up stats, and suck the money out of consumers bent on being the first to read something people had forgotten about decades or centuries ago.  Everyone had long since stopped underestimating the speed of fads—and started scrambling to find the front of the wave.  Climbing stats and trending metrics meant that people shifted their personal subscription budgets around, and modulated the mix of their mutual funding apps, and presto, money from nothing.

“But why stop at the public domain?” asked the people at MacMillan.  As the academic legal scholars had been yelling for years, there was a whole slew of orphan books that no one seemed to own any more.[10]  A one-page risk analysis later, and MacMillan was the undisputed king of repurposed orphaned literature and their datastream conversion had advertisers rolling in it.  The rest of the big five publishers followed suit—FSG, Amazon, Hachette, and Buzzfeed all had the same model within a year.

A few successes down this road, Hachette realized there was another possibility: not just cool hunting the works into people’s feeds, but, as one employee at Hachette put it, “why make new books, when you can make old books better!”  At first, it was limited to the traditional authorized editions packaged with some essay or annotations, or read by celebrities.  There were a few successful flashes, those zombie Jane Austen books and the Randall Munroe versions of Feeling Good About Yourself and Having Bad Ideas about Others by Jane Austen and Down-Going Force of Light and Rain by Thomas Pynchon.[11]  But then Hachette realized that it could actually make all kinds of changes to these books, mostly through a range of algorithmic and machine learning tools to improve discoverability—that holy alembic of the information ecosystem that matched the desires of consumers to the available pool of content.  Step 3: Profit![12]

In easy cases, readability made a work profitable.  The natural language translation algorithms became so good, having been fed on decades of electronic chatter amongst all manner of people, that they could not only translate between languages, but within one language as well.  The Shakespeare project at Carnegie Mellon had long since proven that it could feed a Shakespeare play to its English-English translation system and get back something that sounded like a mix of Mamet and Hemingway if it wanted.[13]  Or something that sounded like Stephen Ambrose, except not as original.  It was possible to modulate the output on the fly too, so that any particular reader might be given a subtly shifting text keyed to the readability of other English texts she was also reading.  College professors ended up with something close to the original while they were at work in their offices, but something more like a Gawker editorial by bedtime.[14]  There were abridgments, vocabulary and syntactic simplifications, and, in one case, for an edition aimed at an online community of aging Bay Area language poets, grammatical complexification.  Racial and ethnic changes were made by scoring the explicit and hidden bias of users’ Twitter feeds, and altering the names, places, and language in response; whole new populations were drawn into the publishers’ customer bases thanks to the extraordinary trove of policing analytics firms’ data that was leaked in 2028.[15]

The detailed customization of all these works made them accessible in new ways, but it was the insight that the stories themselves could be reworked that really made the system explode; new ways of illustrating, serializing, and reordering the stories emerged (so few older works really understood the power of the cliffhanger and the big reveal).  Then the question became why stop there, when the stories themselves could be improved.  Teams of data scientists and marketers worked with creatives to finely segment and track the global reading population and tailor works to drive up the reading and subscription metrics.  Instant feedback from eReader data gave Hachette a way to experiment with changing the books as people read, inserting cliffhangers and literary versions of clickbait.  As soon as readers slowed their reading, or if they turned off the device, they would get a tweet or a message with what they were going to miss, because the text they were reading was about to change tomorrow.

Before long, every publisher had dozens, if not hundreds of versions of classic texts.  Suddenly, although originality was still the most important distinguishing feature, it became impossible to locate anything like an original version.  It was a fascinating paradox; social media trends and tailored discovery algorithms had created immense demand for original work, but whose original was the most original?  Asking Amazon Echo to find you a copy of Marcel Proust might bring you one of ten or twenty or even a hundred versions, ordered by your profile, each one subtly different, each reminiscing about a different sweet treat, but all sold as the original.  There were “new originals” and works advertised as “more original than ever before” and special Scalia-branded editions for true originalists.  But those mostly consisted of enraged commentary about the decline of Western civilization, regardless of the title.

In a surprising moment of visceral excess, a cluster of textual scholars around the world went on hunger strike in front of the buildings that used to house libraries, demanding that the publishers return control of definitive editions and oeuvres complètes to the scholars.  The publishers responded by digitizing some of the older printed versions the textual scholars had once worked on, and adjusting the algorithms to favor displaying a tiny-font, annotated, heavily footnoted version of the text to anyone who fit the profile of the curmudgeonly bibliophile.  For a while, a group of homeless adjunct literature professors encamped on the banks of the Charles River had converted an old gas station into an archive of originality, filled with crumbling print copies—most of which were actually abridged Reader’s Digest versions from the 1960s.  These were all that was left over after the interior design fads of the 2020s had channeled all the available print books into custom-designed old school libraries,[16] stretching from Soho Loft exclusivity to an eventual, unsustainable demand in the desert housing developments still popping up around the country.  Occasionally the press would do a story on these ragged and angry river-dwellers and there would be brief interest in the idea that there was once a single text that everyone read.  But by then, it was just one fad among others.  Farrar, Straus, and Giroux[17] even capitalized on it by producing a whole slew of versions of Ray Bradbury’s Fahrenheit 451 set in Boston, featuring this odd library.  At the end of one version, Jorge Luis Borges arrives blindly driving a fire truck with the license plate “Aleph” and torches the whole place.  That version went viral in four hours.

It had been easy to see where this was all going with film even as early as the 2000s, with the proliferation of comic book rewrites and reboots and remakes.  But with the innovation of industry-standard openly shared computer graphics assets, it became possible to generate movies on the fly nearly as fast as books.[18]  Diving into the existing pool of material and simply rereleasing it as part of a strategy to drive users from platform to platform, generate ad revenue, and build brands was far more lucrative than the incredibly wasteful system of independent films, studio campaigns for awards, and endless, expensive policing of pirate sites.  Why not just release as fast as possible, repackage everything for anyone who might want it, and improve it all in a race for viewers, swipes, likes, and segmented loyalty?  The beauty of machine learning algorithms was that every consumer virtually wrote their next read based on what they had looked at previously.

The Game of Thrones reboot was the first to rely on user-contributed social-neural feedback to adjust the story as it unfolded for a subsegment of power-watchers and their followers.  It was truly a Golden Age of media production.  Writers were employed by the system to check these texts, but not to write them, just to tweak and improve them, insert clever jokes and Easter eggs, to look for the giveaways that the algorithms could not spot.  It could not be crowdsourced per se because the crowd was the audience.  And it mattered that they could be surprised, pleased, entertained on the first shot; staying a step ahead of them was the art of it.  The New Work departments in the publishers and movie studios were eventually dwarfed by the Repurposing departments; it became increasingly impossible to sell a book or script just because no one had ever written something like it before.  It turned out, actually, that this was always false, and it was far easier and cheaper to update Trollope than to pay Franzen for his novel on the same topic.  Who in their right mind would sit through an entire Cassavetes movie when it was possible to watch it unfold as the crazy mashed-up backstory of an X-Men film?

It was a perverse effect of making originals more palatable to contemporary readers that the original originals came to be so swamped by false claimants to originality that there was nothing to be done but to cast off that burden of history completely.  But you could not give everyone their own original book—that way lay madness—the tailoring and improvement had to be pegged to populations of readers, or else the industry could not rely on the free horizontal marketing and the social media dynamics of discussion-liking, cool hunting, and recommendation swapping that drove the financials.  It made no sense for everyone to have a version of Shakespeare tailored only to themselves.  Who would they discuss it with?  Instead, there were ever-more sophisticated maps of these populations.  Locative media, personal assistant software, and lifestyle apps could be correlated across overlapping networks of people so that the texts they were reading were similar enough to facilitate discussion, but different enough to give every reader the experience they desired.  The filter bubble had popped, and out spilled a whole new way of interacting with writing and through writing with each other.[19]

At the weak links between these populations were the lit-geeks, the ones who fit more than one population profile, the ones that the algorithms choked on.  At first they were like canaries in the coal mine; they would essentially be given gibberish, because the algorithms and discoverability services could not place them in a population.  Or placed them as one segment but were flummoxed when they started to read something far outside the expected range.  Some of them cultivated a kind of connoisseurship around this ability, recognizing that their own interests and reading habits were indirectly affecting the nature of the work being recommended to different circles of friends.  It was not long before they became recognized as valuable—the New New Critics, they were called by the publishers—and those who cultivated this skill did so not only by reading but by exploring social space in new ways, friending unlikely people, and engaging in discussion with them about the texts.  But it was ephemeral stuff—maintaining an edge as a New New Critic required near constant monitoring and self-fashioning.  It was not for the weak of heart, or bank account.

Some of these experts, these lit-geeks, took delight in gaming the system, in trying to find ways to divert the stream of metrics, discovery, and algorithmic data into the coffers of others.  There were arbitrageurs of algorithmic literary fads, betting on the recombination of populations, and trying to model the fads fast enough to predict where the money would flow.  There were collateralized subscription obligations sold by hedge funds doing analysis of variation ranges of a particular book—how many different versions of Tristram Shandy were people reading and how much did they differ?  There were also hackers of the system, finding ways to game the data and metrics being generated using techniques of obfuscation, reader-bots, and proxy shell-company networks devoted to flooding social media in order to create leakage in the system.  But with the turnover speed in the ecology of information consumption, even these new pirates, as the content industry liked to refer to them, were reduced to something that was essentially indistinguishable from competition.  Finally, there were no pirates, because everyone was on the open sea together.


*                      *                      *


Not everyone was thrilled, though.  The scholarly publishers found themselves in a true conundrum.  They, along with the academics, had bought into the new system of value propagation: metrics, mentions, likes, discussions,[20] and the system of mutual funding and individual research allocation budgets, largely negotiated via university libraries, corporate research organizations, and other large entities on behalf of the scholars who now had unrestricted access to the scientific record.  But even before all this happened, the scientific record had been a mess: undiscoverable, poorly indexed, lacking cross-referencing, everything siloed in discipline- and institution-specific repositories, incommensurably indexed by one of dozens of different hand-coded catalogs.  When Google had finally shut down its Google Scholar and Google Library programs, a near revolution had occurred.  Scholars finally realized that one in ten of them had no clue how to find anything relevant to their own work.  There was intense competition to make sense of the mess of research out there, but the thought of actually cataloguing, indexing, and evaluating all that research in any orderly or systematic way was far too expensive to contemplate.  There was not enough money in the world to pay all the specialists needed to index it, and worse, there was no money in it for the publishers.  Algorithmic discoverability was the only answer, and it was Big Content that held the keys.

Already, with the fall of the journal as a container and the rise of the article as the central form of digital object, everyone had turned from keeping track of journal tables of contents to being subscribed to a range of different feeds, streams, and notices all curated in different ways by different publishers, scholarly societies, academics, and private firms.[21]  The forms of curation differed; older scholarly societies insisted on human curation, but it was not clear whether the humans doing it were really experts on one thing or another.  The big publishers all turned to an array of new curation algorithms, some secret and some not.  But no two people were watching the same literature anymore, just like no one was reading the same Shakespeare anymore.

In the space left by Google Scholar, dozens of new commercial search engines filled the void.  Every one of them gave different answers and located different subsets of the literature.  None of them worked on any classical principles of using text strings or keywords, but relied exclusively on a pyramid of recommendation algorithms, look-ahead suggestion algorithms leased by commercial publishers, and behavioral clickbait systems that responded to users’ social media profiles, publicly and privately held data about past reading and viewing habits, occupational information, and partial maps of coauthor and co-citation networks.[22]  Finding something that you knew actually existed became either impossible or so annoyingly common that you had to filter it out.  The algorithms were so good that no two scientists could find the same paper in the literature: Minute differences in networks of citation, reading practices, clicking, library borrowing, shopping, word use in tweets, email, and bureaucratic documents drove results.  And unless SafeSearch was on all the time, almost all people who identified as men were led within a couple of swipes to one porn video site or another, regardless of search term.

A kind of neo-divination emerged in response—second-guessing the second-guessers.  Networks of scholars in a particular area shared new techniques for finding literature in verifiable ways.  A 2024 Review paper on the New Literature Search summed up the range of these practices: techniques of randomization, pseudonymous search validation, H2H searchterm propagation and feedback, and so on.  Others insisted on an embodied interpretive expertise—reading search results for signs and symptoms of some existing literature—and made graduate students, undergrads, and Mechanical Turkers comb through search results looking for particular combinations of words and pictures.  Academic conferences frequently featured daylong Birds-of-a-Feather sessions devoted to correlating the known literature amongst researchers and looking for ways to avoid reinventing the wheel.  Still others with the resources available built bespoke archives on their own servers, cleansed of cloud-based search/sort algorithms, running only classic full text search, building their own personal indexes of the literature.  But claiming authority based on such personal indexes was unverifiable.  They were the new cranks.  The journals themselves started to demand that citation lists include everything a researcher or lab had consulted, not just cited, from beginning to end in an organized and categorized list, in an attempt to keep track of the boundaries of a discipline and keep some semblance of peer review in place.  Academics objected that this was—in addition to all the other aspects of writing, reviewing, editing, and typesetting the papers—really not their job.

Of course, given the increasing competition for the ever scarcer permanent faculty positions, the articles themselves were also changing to game the system—to be easier to find and therefore cite, mention, discuss, and show up in the metrics.

It had to happen that way: Prepublication peer review as a gatekeeping measure was clearly on the way out.[23]  It was simply taking too much time away from writing papers, and after it was proven for the seemingly thousandth time that it was reinforcing a narrow conception of acceptable research in many domains, it ultimately became something practiced only in obscure corners, replaced by the system of likes, mentions, and discusses and networks of post-publication review sites that periodically turned into vigilante mobs bent on destroying some poor person’s reputation.[24]  Occasionally, this was in defense of an apparently new finding or idea, but mostly because these were now the metrics that led to hiring and promotion, not the number of publications or their content but the ability to dominate a curated feed, a cycle of mentions and discussions.  After all, everyone argued, what difference did it make if a paper was out there if no one talked about it or re-mentioned it daily?  If a paper is deposited in a repository and no one is there to read it, does it really exist?

Abstracts and keywords now rarely matched the content of a paper, designed as they were to catch the attention of the imagined recommendation algorithms and to gain valuable citations.  Although everything was open, access was no longer enough to get noticed: Being featured in the feeds designed by the publishers and scholarly societies had become the coin of the realm—but how these feeds and curated streams worked was never apparent.  Because all the algorithms were carefully guarded secrets—the only real value the big publishers had left, now that the literature was fully open access—much energy was expended on understanding how they worked and which articles would appear in the most widely distributed, re-mentioned, and liked lists.[25]  The new impact metrics of these feeds, quickly incorporated into academic practice and merit review, depended on some mix of mentions, likes, re-mentions, and especially discussions of articles on the highest impact feeds.  It was now routine to deny advancement to anyone who did not have some number of likes, mentions, or discusses for each article.  The publishers had capitalized on this by selling packaged systems to the elite universities that plugged into faculty dossiers and beautifully displayed actionable metrics for each scholar.  They sold the administration on metametrics of the corporation’s own accuracy and quality, which would allow review committees to cut through the noise.  As more of the elite universities purchased these systems, it became easier for people in those universities to understand what would work to satisfy the hungry maw of metric generation.  But on the outside, in the third tier universities and throughout the global South, it was a big black box filled with crap in which hopeful scholars wallowed around, trying all manner of techniques to get noticed.

The rate of publication did not slow, though.  Indeed, it grew in multiple dimensions: more publications, smaller publications highlighting single observations and findings,[26] reports, interviews, theoretical ramblings, conversations, tweets about mentions, mentions about tweets, anything at all to try to get into one or another curated list or trending topics list, or highly-mentioned stream.  Given how hard it was to know what was going to get attention, people tried everything.  Their ingenuity was unstoppable.  Once an article was out there, though, it almost instantly became impossible to find if it was not being constantly mentioned.  This had the effect of dramatically increasing the number of nearly identical papers that did not cite each other.  For researchers, it was now impossible to figure out which of eight similar innovations or pathbreaking papers or discoveries was actually correct, could be trusted, or pointed down the path to the future.  The literature balkanized further and further.  People started to form tiny enclaves of research that either repeated nearly exactly what some other enclave was doing, or explored some unproductive corner of research for a decade, producing reams of papers that no one bothered to reject, much less read, only to discover they had been scooped a decade earlier and did not even know it.  Food and Drug Administration applications had multiple disjoint sets of evidence.  The courtroom problem of evidence and expert witnesses was now common in every administrative setting, and there was no clear sense of global progress, only of incremental, local competition.  Every search turned up yet another source, with absolutely no sense of an endpoint.

At what seemed like the breaking point, in 2030 Elsevier-Springer had joined forces with MacMillan, Netflix, and Cloudera to tackle what had clearly become one of the biggest machine learning opportunities in a lifetime.  Hundreds of engineers and computer scientists were set loose on the existing record with the brief to make sense of it, to save it from itself, and to create tools for scholars to navigate the existing morass.  At first, it was a rescue mission.  Optimize the algorithms to index and sort the scientific record.  Unleash all the Otlet-redundant classifiers and Garfield-Hillel citation reduction optimizers to segment the scientific record by reference to its internal citational keyword and content structure.  But the most powerful tools now assumed some correlated set of data about reading patterns, social media traffic and use, and, of course, valuestream optimization.  And then there was all the data in the faculty dossiers about patterns of merit, promotion, award, and competitive hiring in and around different clusters of work.

But then it hit a few key people: It was a short step from stabilizing a scientific record as a source of past authority to recognizing in it a motor of the future of science.  Before too long, the project was producing new science, or at least new papers.  Recurrent neural networks (RNNs) and TED-talk-optimized filters could generate short papers whose content could be tested using the same tools that could determine whether a paper was fraudulent or its data faked.  Some of the stuff being produced by the RNNs was hardly worth the processor cycles, some of it was hilarious, and a lot of it was just wrong on some easy count: unrepeatable observations, citations to irrelevant literature, bombastic claims for the impact and importance of the research, cut-and-pasted diagrams and images.
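The fable's paper-generating networks are, of course, imaginary, but the basic mechanism of text generation from a corpus can be loosely illustrated with a far humbler stand-in than an RNN: a word-level Markov chain.  Everything below (the toy corpus, the function names, the two-word context window) is a hypothetical sketch for illustration, not anything the story's publishers are imagined to use.

```python
import random
from collections import defaultdict

def build_chain(corpus, order=2):
    """Map each word context of length `order` to the words that follow it."""
    words = corpus.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        context = tuple(words[i:i + order])
        chain[context].append(words[i + order])
    return chain

def generate(chain, length=20, seed=0):
    """Walk the chain from a random starting context, emitting one word at a time."""
    rng = random.Random(seed)
    context = rng.choice(list(chain.keys()))
    output = list(context)
    for _ in range(length):
        followers = chain.get(tuple(output[-len(context):]))
        if not followers:  # dead end: this context never continues in the corpus
            break
        output.append(rng.choice(followers))
    return " ".join(output)

# A deliberately bombastic toy corpus in the spirit of the fable.
corpus = ("we report a novel finding with broad impact "
          "we report a surprising result with broad implications "
          "we report a novel result with surprising impact")

chain = build_chain(corpus)
print(generate(chain))
```

The output is locally fluent and globally meaningless, which is roughly the failure mode the story attributes to the early RNN papers: every short run of words is plausible, but nothing constrains the whole.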

The problem, however, was that this process produced fewer such flawed papers than the existing system of human authorship did.  The scientific record was full of so much human-generated misconduct, error, sloppiness, gaming, and outright fraud that when the Federal Agency of Retraction Enforcement, headed by Ivan Oransky,[27] was asked to compare several cases of potential retraction for addition to its centralized blacklist, the staff was unable to tell which were done by humans and which by algorithms.

Human nature being what it is, it was not long before the possibilities for gaming this new system were clear.  First, there was the scourge of “zombie literature”: legitimate articles spawned in multiple versions from a real human manuscript by a clever echo state network tool, each version with correct results and a distinct narrative with slightly different figures and graphs, then submitted to a cascade of increasingly lower-ranked journals in an attempt to fill in a long tail of citation and mention metrics.  Several academics managed to use this technique to rocket to the top of the trending lists with mildly interesting research projects.  Some were puzzled at first; trust in the social media trending lists had grown with the sophistication of the algorithms, but now the system seemed to be rewarding substandard work.

Shortly afterwards came the articles cut from whole cloth, for which scholars trained the RNNs on a mix of their own research papers and a selected set of past papers to produce a wholly new article.  Just as in the world of big content, these results could be tweaked and tailored, though this time less to readers than to current scientific trends and new techniques and approaches.  In a shockingly large number of cases, the papers surprised even their authors, who set about dutifully conducting the experiments or the research reported therein, in order to get the data sets and results that would in fact support the paper.  Some of these publications went on to make a splash; others, it was eventually revealed, were indistinguishable from older research projects that had long since disappeared into the cesspool of past literature.  At the outset, scientists were honest about this practice, clearly labeling the papers with the names of the algorithms alongside the human authors.  But even that practice fell away when it was revealed that the indexing algorithms had themselves developed an implicit bias against certain ethnic-sounding algorithm names (like the Lamar Echo Reduction Network and the Cesar Chavez Gradient Descent Algorithm—so named because its Chinese inventors had been trying to get attention for it on March 31st), and so the human authors decided they should stick to publishing under their original Twitter handles.

But by this point it was too late; administrators, politicians, pundits, and engineers finally realized that having humans in the loop was not only a source of noise, but the single biggest drain on the economy as well.  As the purges began, the first great breakthroughs in climate engineering and one-fourth speed-of-light propulsion were trending.

II.  Wikidemia (or, a Good Index Knows More About Your Problem Than You Do)

The universities finally buckle.  The amount of money sucked out of the university through corruption triples between 2016 and 2020.  Chancellors and Presidents around the country are competing to be on multiple boards, accepting huge payouts from the same companies siphoning money out of the university: publishers, textbook companies, edtech companies, data warehouses, Learning Management Systems,[28] and so on.[29]  The rush to join the board of Trump University after the election was fierce, and it was only decided in the end when Rice University and Vanderbilt University jointly announced that they would be hosting all of the campus operations of Trump University at no cost.  The announcement was quickly drowned out though, when it was revealed that the Chancellor of the University of Michigan had joined the board of the new Daesh University in Saudi Arabia.  She defended her feminist credentials by explaining that it was her demand that she be allowed to attend board meetings unveiled via Google Hangout.  Daesh scholars found an obscure fifteenth century ruling on communicating with women through closed doors that justified the choice.  As a result, Daesh University now allows women to attend, but only if they do so via Google Hangout with the camera turned off.

But the real siphons were the big publishers.  Their margins stayed at 35 percent for a decade—all of which came from university library coffers, which had to be refilled from elsewhere in the university.  Even though the administration had tried to solve the problem by requiring open access and funneling small amounts of money into homegrown research and data repositories, the faculty rebelled when subscriptions were canceled and demanded that, to be a real university, faculty had to have access to all possible research, full stop.  But research overhead could no longer cover the costs of subscribing to all the research produced, or even a fraction of it, and slowly, starting with the humanities and social sciences, the library disappeared from campuses almost completely.

Physical libraries were devoted to increasingly hyper-special collections, and increasingly large consortia relied on shared storage facilities and InterLibrary Loan networks, effectively increasing the wait for a physical book beyond the horizon of promotion and tenure timelines.  At the two hundred richest universities, faculty still had complete access to more or less anything they needed, but the costs were killing them, and the murder by contract security guards of two adjunct professors trying to access the Widener Library stacks had just exacerbated the sense of siege the library had been experiencing.

So, in a coordinated protest, a group of radicalized librarians, archivists, students, professors, and a few disenchanted Silicon Valley engineers-cum-Wikipedians banded together and agreed to turn off access to the entire scientific literature provided by the big publishers at hundreds of universities around the world.[30]  Black Tuesday.  They stopped payment on any contracts that they had control over, and they sent one-line emails to all their sales reps which read: “We will no longer be renewing our subscriptions to your content.”

On that Tuesday, scholars all over the world were met with black screens when they tried to access articles or books from big publishers.  The screen displayed only two things:

Total cost savings to the universities engaged in this action:

1.5 billion dollars.

And a link:

The link was recognizable to most people outside of the richest universities.[31]  It was the only place anyone could access the scientific record and was the biggest scholarly pirate site in the world.[32]  But it had never been associated with Wikipedia before.  Previously, it had been famous for Alexandra Elbakyan’s defense of her blatant piracy of the scholarly literature.  She had responded to a court case by sending a letter to the judge, in which she basically asserted that she was not the pirate—her accusers were.  A classic pirate move.  Injunction granted.  TLD (Top Level Domain) changed.  Business as usual.[33]

But this Tuesday clearly represented a change.  By the afternoon, a “black paper” had appeared explaining what had taken place.  The same group of librarians, students, engineers, and archivists had approached Wikipedia with a plan to liberate all scholarly knowledge for the benefit of everyone on the globe; to take control out of the hands of administrators and publishers and put it back in the hands of volunteers and scientists.[34]  At first, Wikipedians balked at the idea; they were busy replacing the scientific record, and did not want to be bothered by the corruptions of the university.  But it became clear to many that this was a complementary problem to the creation of an encyclopedia.  It nicely solved the problem of the “no original research” rule in Wikipedia.[35]  The existing literature was valuable in a different way than the encyclopedia was—it could be linked, mined, sorted, and indexed in myriad new ways; it was a source for knowledge, not the endpoint.  And Wikipedians saw the beauty in making that source available to anyone—and the technical challenges and opportunities if it happened.

That, and now there was $1.5 billion on the table.  Well, it was spread across hundreds of tables, but even so, money was sloshing around the libraries and the librarians wanted to make sure to commit it before the administration decided to fork it over to some other set of highway robbers.  So they engineered a consortium that would oversee Wikipedia’s involvement in the stewardship of the scientific record.  Wikipedia itself operated on about $40 million a year.  It had only two hundred paid employees, and its expenses were primarily server and bandwidth costs.[36]  Getting $100 million a year out of a $1.5 billion total investment was doable, and everyone was still saving money.  So all the libraries signed up.  Yearly membership fees were something like 10 percent of what a library had previously been spending on subscriptions.  And if they could demonstrate that the entire scientific record could be stored in one place, then it seemed obvious that no university would balk at the win.

And getting it all was actually not a hard thing to do.  The estimates of how many scholarly articles exist varied, but the highest was around two hundred million since the seventeenth century.[37]  The case against Elbakyan had claimed that her site was amassing 58,000 articles per day, and that the site must have already amassed forty million articles by 2015.  It was easily possible that, between her site and the other pirate sites, 50 percent of the existing literature was already in circulation outside of scholarly publishers’ control, in a distributed network of torrent archives seeding from several mirror locations around the world.
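The arithmetic behind the fable's 50 percent claim is easy to check.  The figures (58,000 articles per day, forty million amassed, two hundred million total) come from the text itself; the rest is simple multiplication, sketched here purely as a back-of-envelope check.

```python
# Back-of-envelope arithmetic for the fable's coverage claim.
total_articles = 200_000_000   # highest estimate of articles since the 17th century
per_day = 58_000               # download rate alleged in the Elbakyan case
amassed_2015 = 40_000_000      # articles the pirate site held by 2015

# Share of the whole record held by the single biggest pirate site.
one_site_share = amassed_2015 / total_articles

# Years for that one site alone to reach 50% of the record at the alleged rate.
years_to_half = (total_articles * 0.5 - amassed_2015) / (per_day * 365)

print(f"single-site share: {one_site_share:.0%}")        # 20%
print(f"years to reach 50% alone: {years_to_half:.1f}")  # ~2.8
```

One site holding 20 percent, and needing under three years to reach half the record on its own, makes the story's "50 percent already in circulation across all pirate sites" a plausible round number rather than an exaggeration.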

The remaining 50 percent—of what had been digitized—was obtained by the group through two coordinated methods.  First, HathiTrust opened up its digital archive of everything Google had scanned from their libraries for twenty-four hours, claiming that a maintenance window had improperly left the site security lax.  Twelve million scanned books flowed into the new library.

Second, a worldwide, coordinated phishing attack by unnamed pirates obtained thousands of login credentials from faculty at the ten largest universities, which together were subscribed to only about 60 percent of the existing literature.  Within days, via thousands of accounts downloading one hundred articles a day, the coffers had been filled.  The largest collection of scientific literature in history was now in one place: Wikidemia.

Getting the stuff was only half the battle, though.  It was nothing without the indexing, and that too was owned by the largest publishers.  The major providers, like Thomson Reuters, ProQuest, or EBSCO, were pyramid schemes of indexes that had been developed over the years in countless places, through buyouts, mergers, acquisitions, and other means.  Individual projects, startup companies, and scholarly societies all aiming at indexing some part of the literature had eventually been sucked into this oligopoly of discoverability companies.  They had simply hoovered up these databases, done the low-paid work of federating, cleaning, and cataloguing them, and made a pretty interface that would search across all of them.  But now all the DOIs and URLs were broken.  This turned out to be the last defense of the scholarly publishers.  Wikipedia could not serve as the single site for the scholarly record unless all these indexes pointed to its records.

But the beauty of Wikipedia was the volunteer labor force.  It boasted thirty million users—but those were just accounts.  In reality, it had a (still impressive) workforce of twelve thousand active volunteers.  Even with that many, something like twenty thousand articles per person needed processing.  Fortunately, much of the labor could be automated; bots could take care of the bulk of things that already had quality metadata associated with them.  To assess and incorporate the rest of the works, the volunteers on the new Wikidemia adopted the Wikipedia workflow, but with a new set of criteria befitting this heterogeneous literature.  No one was allowed to claim that something was true or not true, valid or invalid research.  They were only allowed to pose questions on the discuss page.  Anyone could answer the questions, or correct other answers; answers would get voted up or down, and crosslinking across the literature was encouraged.  Volunteers from StackOverflow, PubPeer, and F1000 joined forces to reconfigure their own sites to work with and reference Wikidemia.  Wave after wave of existing repositories connected to and populated Wikidemia with metadata, commentary, peer review, and ratings.  ArXiv and bioRxiv came first; others, like SSRN,[38] Open Humanities Press, and hundreds of struggling open access publications, followed.
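The workload estimate in the paragraph above follows directly from the numbers the fable has already supplied: roughly two hundred million articles divided among twelve thousand genuinely active volunteers.  The sketch below just spells out that division; nothing in it is new data.

```python
# The fable's volunteer-workload estimate, spelled out.
articles = 200_000_000        # rough size of the scholarly record (from the text)
active_volunteers = 12_000    # active Wikidemia volunteers, not mere accounts

per_person = articles / active_volunteers
print(f"articles per volunteer: {per_person:,.0f}")
# roughly 17,000, i.e. "something like twenty thousand articles per person"
```

The gap between thirty million registered accounts and twelve thousand active hands is the point: the plausibility of Wikidemia turns entirely on how much of that per-person load the bots can absorb.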

To organize the literature, Wikidemians explicitly rejected hierarchical tree-like relations among scientific topics in favor of cyclic interlinking and recursive substructures.  Because discuss pages tended towards endless ongoing chatter, article pages were given observation and story pages as well.  Rule: No arguing about observations.  Either they can be reproduced or they cannot.  Observation pages accumulated attempts to replicate an observation with respect to a particular claim, finding, or experiment.  Anthropologists, archaeologists, historians, and climate change scientists argue over whether observations are experimental observations, data observations, or simulation observations, but the rule remains: Either reproduce it, or report failures to reproduce it.  If you have any other issue, it goes on the discuss page, and ultimately on the story page; observations should be replicated, confirmed, or validated, but stories are there to argue savagely and critically.

Story pages, therefore, were where theories and claims about the meaning of observations and results were to be disputed.  This would allow linking to other stories and disputes, rather than pursuing the futile effort to hierarchically taxonomize knowledge.  Statistics, for instance, is not part of a hierarchy, but a possible subbranch of either an observation or a story page in which arguments can turn into debates about statistical techniques, which in turn can link to the story page of an article about the statistical technique and its status.  These cyclic relations, the Wikidemians argue, allow researchers to navigate horizontally across the cutting edges of many fields, not just vertically in one arbitrary discipline or journal-specific field.  A group of aging, little-known science-studies scholars dissented, arguing on their own article story pages that this was a reinstantiation of an artificial distinction between theory and observation they had long since debunked.  But the story pages for these articles only link to each other, so no one notices.  And if you ever ended up on the story page for an article about “The Need for Reductionism in Science,” you would know you had taken a wrong turn somewhere, and should skedaddle.

The whole process was not without fierce internal debate.  But the sense of historic mission was a leveling force; and under the Trump presidency, compromise had emerged as a virtue and an ethical demand everywhere.  Wikipedians work overtime to control sockpuppets, trolls, and hucksters.  Regardless, a range of culture wars around hot issues, ranging from archaeology in the Middle East to contested pharmaceutical company data, periodically threaten to bring the site to its knees.  But the Wikipedia/Wikidemia way prevails.


*                      *                      *


The scholarly publishers had to respond.  A sudden attack on their revenue was hardly going to go unpunished.  But their response came as a surprise, and demonstrated a sophisticated understanding of the situation.  At first, they shocked everyone by simply agreeing to the deal.  “It’s all fine,” they said, “it’s true that it would be better if the entire scientific record were in one place, and Wikipedia is a great neutral party for that.”  So they pointed all the DOIs and stable links to Wikipedia and let them sort it out—the indexing problem was suddenly orders of magnitude easier to deal with.

But the existing record was only part of the story.  Tied up in the big publishers’ pipelines were on the order of 150,000 current articles, all in some stage of review, editing, or publication in the roughly twelve thousand journals owned by the big five.  From first publications by grad students to thousandth publications by Nobel laureates and Nobel hopefuls, this was where the current energy and focus of academia lay, and the publishers knew it.  A week after Black Tuesday, anyone with a publication in process received a note stating that, due to the recent action to save money on the part of the universities, the publishers would be instituting a new requirement to publish in their journals: a $5,000 processing charge per article, $25,000 per book.  Finished articles and books would be deposited in Wikidemia once the fees were paid.  Unpaid fees would mean the return of the manuscript to the author.  No hard feelings.

It was a gamble—the fees covered only half of the losses they would be seeing, but they were a big chunk, and, unlike subscriptions, they were a growable strategy.  If they could just find ways to get academics to publish more, the number could grow and the shareholders could be appeased.

Academics flooded the offices of deans, chairs, and administrators demanding that the university pay for their publications to appear in the big journals.  They brought with them dossiers in process, statements from the chairs and senior faculty confirming that professors who did not have an article in this or that journal, with this or that impact factor, would not be tenured or promoted, and that the result would be catastrophic for the reputation of the departments and the university.  There were detailed charts and graphs of the various publication and citation metrics; there were even threats to occupy offices, chain themselves to desks, or go on ad hoc strikes, refusing to teach students or take any more blasted cybersecurity classes online until the university paid up!  Which it started to do.  Instantly it seemed as if the huge savings achieved by the librarians’ coordinated action would disappear in the manic demands of faculty to have their research “published” in all the finest venues.

There were other voices though—these were the voices of faculty demanding a simple change to the system: Promotion and review criteria should be changed so that 1) committees review no more than five items, chosen by the candidate; and 2) all journal or publisher names, metrics of citation and impact, or other markers be removed from the articles in question.  The logic, they proposed, was that it was only the publishers who benefited from increasing numbers of publications, and it was up to academics to preserve and articulate the quality and authoritativeness of the scientific record—not the publishers.  Wikidemia was like a dream come true from this perspective: one place where all work could be preserved, debated, discussed, and analyzed.  Rather than a fake measure like “impact factor”[39] measuring the number of citations a journal received (which meant nothing for an article that just happened to be in that journal but was never cited), Wikidemia preserved a record of all discussions and all disputes about an article right alongside it on the talk page.  People could watch their publications for activity; committees could look for evidence of people reading, disputing, clarifying, or linking to an article.  Overviews of linkage networks became trivial to generate, and issues of fraud, misconduct, and retraction became easier to identify and trivial to regulate.  There was no longer any need for dozens of outside, confidential letters—though the practice continued, because trust remained an essential if ill-understood part of the process—because anyone who knew anything about the research, or cared about it, had written something on the discuss page.  Work that was taught in classes had whole sections of the discuss pages devoted to student questions and clarifications.  If anything, the most read works were swamped by commentary, which itself became a challenge to organize and navigate.

A few universities agreed to change the promotion system.  But these changes came from universities ranked in the third tier and below, so it was easy for the elite universities to dismiss their system as an assault on scientific quality and impact, which, they argued, could only be measured by publications accepted in the highest ranked journals, by authors at the highest ranked universities.  The publishers played their hand: They rolled out a whole new set of metrics and indicators—now also based on the work of the Wikidemians.  Added to citation impacts came metrics of mentions, discussions, volume of traffic on talk and article pages, graphs, and interactive widgets of interlinkage, all correlated with the journal name owned by the publisher.  Academic administrators fell for it, and at many universities, rather than changing the review criteria to encourage fewer publications and less emphasis on metrics, the reverse happened.  Outside companies created a whole new class of faculty dossier systems designed to work seamlessly with the publishers’ new metrics.  They mined Wikidemia for the metrics and data associated with mentions, likes, discussions, links, forward citation, and back citation—everything but the article itself.  They automated the entire merit and review system to a level never before imagined; but for those trapped inside this system, it only worked if an article went through a big publisher—one whose journal brand and metrics were available to the dossier management system.  Anything else was essentially in a weirdly wide open black hole: available to anyone anywhere on the globe, but if it was not measured by the metrics companies then it must not exist.

Meanwhile, the collection of articles available all in one place for the first time slowly started to reveal something surprising.  If one ignored the measurement systems of the big publishers, and instead focused on the content of the articles, the discussions, the network of linkages, and indexed references produced by volunteers, it became clear that certain kinds of scientific problems—both practically pressing and theoretically challenging—had been ignored for decades.  Whole new areas of possible research opened up to the curious; the topology of the scientific record appeared to be dramatically skewed by the emphasis on journal names, citations, and impact factors, all of which pointed to a tightly linked self-referential, self-citing cluster of scholars at fewer than one hundred universities.  “Exactly,” they said, “you can plainly observe our quality by the new clothes we are wearing.”


In rereading these two fables, and being asked by the editors to reflect on them, I can say only that they represent a desire to articulate a difference.  The first fable is a dystopia of sorts, in which the scramble to monetize everything through metrics, data, and machine learning results in the destruction of past principles and ideals; the second is a utopia of sorts, in which the problems of the present system of scholarly publishing are solved.  The former story is the more general one, most likely to form the starting point for a reflection on what the future of our particular present might look like.  The latter is really quite narrowly directed at those who are deeply involved with and thinking about the problems of open access, the economics of scholarly publishing, and the rise of scholarly piracy.

With regard to the first story, I think one could react in one of two ways.  On the one hand, it represents the loss of something sacred—a sense of authority, authorship, creativity, or truth as it is expressed and contained in the works of the past.  It is clear, I think, to scholars of the history of science and scientific information, or the history of the book (scholars such as Adrian Johns, Carla Hesse, Roger Chartier, and Robert Darnton), just how fragile and surprising an achievement it is that we have books at all, and how unstable the boundaries of those things have been and continue to be.  So the contemporary dissolution of the book by new information technologies represents, at once, a continuation of that fragility and an assault on a hard-won achievement that might easily be destroyed.  From this perspective, what I describe is a version of a contemporary immodesty towards that past, and a vision of culture in ruins.

On the other hand, I think it is also clear that culture has always been monetized, and what we are observing today is merely a variation on that theme, but one with potentially enormous consequences not just for culture, but for our subjectivities as well.  This might be the more Nietzschean reading of the story, in which the challenge to become new kinds of readers—perhaps also new kinds of scholars and scientists—seems to loom on the horizon beneath a swirl of technologies, business schemes and scams, and an increasingly formatted (and performative) world of big and small, public and private data.

The second fable is more nostalgic, but at the same time more utopian.  It rests on an impossible desire, which is quite clear in the world of science and movements for open access: to live in a world where data, metrics, science, and truth exist untainted by money.  It reflects an image of science and scholarship in which the custodianship of science is practiced by those who do the science, and maintained and sustained either aside from or without the need for imagining it as a generator of revenue or profit.

This story is therefore actually very close to the present world; it is a world where we scholars are struggling to maintain an image of science that is no longer possible, in a world where much of what scholars and universities do has already become entangled with the financial interests of various industries, from private firms and startups to laboratory supply companies to, in this case, scholarly publishers.  To the extent that there is a twist in the story, it occurs when the imagined utopia of Wikidemia comes up against the fact that we scholars have already monetized our academic capital.  The image we have of ourselves engaged in a scholarly pursuit separated from economic concerns is given the lie by the fact that we have ourselves developed an exquisite discrimination of the value of certain publications.  The story is intended to lay bare the double bind we already operate under: We want our work to be openly available and widely read, but we want it to be published by the particular journals and presses that will generate the maximum amount of academic capital.  And academic capital is now more than ever directly convertible into grants, salaries, and other rewards.  This double bind is likely to be the thing that ensures that something like Wikidemia—which I think many people would like to see come true—exists only in fantasy.

[1].      Transhumanism seems to be essential to any sci-fi story, far be it from me to dissent.  Compare James Grimmelmann, Copyright for Literate Robots, 101 Iowa L. Rev. 657 (2016), with James Boyle, Endowed by Their Creator? The Future of Constitutional Personhood (Jeffrey Rosen & Benjamin Wittes eds., 2011).  See generally Jonathan Coulton, Wikipedia, [].  Oculus Rift was the darling technology of transhumanists in 2016.  See Oculus Rift, Wikipedia, [].

[2].      See, e.g., Andy, Vkontakte & Universal Sign Anti-piracy & Licensing Deal, TorrentFreak (July 18, 2016), [] (explaining how pirates convert into licensees of stolen content).

[3].      See, e.g., Tarleton Gillespie, Can an Algorithm Be Wrong?, 2 Limn (2012), []; Nick Seaver, Algorithmic Recommendations and Synaptic Functions, 2 Limn (2012), [].

[4].      The eagle-eyed reader will note that patents are not mentioned in this Part.  Science Fiction can only go so far . . .

[5].      See Michel Callon et al., The Economy of Qualities, 31 Econ. & Soc’y 194, 194–217 (2002).

[6].      MacMillan is one of the largest publishers, and owner of Nature Publishing Group.  See Macmillan Publishers (United States), Wikipedia, [].

[7].      See generally Sir Phillip Sidney, The Defense of Poesie (1595); John F. Gilbert, The Horrors of Oakendale Abbey: A Romance (1812); Walter Pater, Marius the Epicurian (1885); Saul Bellow, Dangling Man (1944); John Milton, Areopagitica: A Speech of Mr. John Milton for the Liberty of Unlicensed Printing to the Parliament of England (1644).

[8].      By 2015, Oyster had already failed.  See Andrew Albanese & Jim Milliot, After Oyster, What’s Next for E-Book Subscriptions?, Publisher’s Weekly (Sept. 25, 2015), [].  In the Summer of 2016, Scribd was predicted to fail too.  See Andrew Albanese, Scribd Revises Its Subscription Model, Publisher’s Weekly (Feb. 16 , 2016), [].  Scribd and Oyster’s licensing models both relied on pirated content that each company had amassed and licensed, and the licensing terms were pegged to the amount of a book that a reader read.  See David Streitfeld, As New Services Track Habits, the E-Books Are Reading You, N.Y. Times (Dec. 24, 2013), [].

[9].      See Bill Maurer, How Would You Like to Pay?: How Technology Is Changing the Future of Money 22–34 (2015).

[10].     See generally Matthew Sag, Orphan Works as Grist for the Data Mill, 27 Berkeley Tech. L.J. 1503, 1503–50 (2012); Katharina de la Durantaye, Finding a Home for Orphans: Google Book Search and Orphan Works Law in the United States and Europe, 21 Fordham Intell. Prop. Media & Ent. L.J. 229 (2011); Pamela Samuelson, Google Book Search and the Future of Books in Cyberspace, 94 Minn. L. Rev. 1308 (2010).

[11].     Randall Munroe, Alternative Literature, Xkcd, []; Randall Munroe, Thing Explainer, Xkcd, [].

[12].     ???? Profit!!!, Know Your Meme, [] (explaining the origin of the South Park meme in which a business plan is presented in three steps, the second of which says “?” and which ends with “Step 3: Profit!”).

[13].     But certain corners of academia simply could not give up on the authorship question.  See Michael Barrett, ‘Shakespearenomics’ Applies Big Data Analysis to Literature, Austl. Fin. Rev. (Feb. 10, 2016, 12:15 AM), [].

[14].     I had to delete the comic sub-plot involving a love affair between a student and a teacher desperate to figure out whether they have both read A Midsummer Night’s Dream.  Instead of characters being bewitched and switching lovers, the books do.

[15].     Or maybe this leak occurred in 2016.  See, e.g., Ashitha Nagesh, Wikileaks Published Link to Personal Details of Almost Every Woman in Turkey for No Reason, Metro (July 26, 2016, 10:25 AM), [].

[16].     See, e.g., Penelope Green, Selling a Book by Its Cover, N.Y. Times (Jan. 5, 2011), [].

[17].     See Farrar, Straus and Giroux, Wikipedia, [].

[18].     Could this be the apotheosis of “transmedia”?  See generally Henry Jenkins et al., Spreadable Media: Creating Value and Meaning in a Networked Culture (2013).

[19].     Eli Pariser, The Filter Bubble: What the Internet Is Hiding From You (2012).

[20].     Cf., e.g., Who’s Talking About Your Research?, Altmetric, [] (explaining how Altmetric collates metrics from multiple channels into one number).

[21].     See Claire Creaser, The Role of the Academic Library, in The Future of the Academic Journal 317, 318–19 (Bill Cope & Angus Phillips eds., 2d ed. 2014).

[22].     See generally Bryan Gardiner, You’ll Be Outraged at How Easy It Was to Get You to Click on This Headline, Wired (Dec. 18, 2015, 7:00 AM), [].

[23].     See François Diederich, Are We Refereeing Ourselves to Death? The Peer-Review System at Its Limit, 52 Angewandte Chemie Int’l Edition 13828, 13828–29 (2013).

[24].     See, e.g., PubPeer, []; see also Dana Goodyear, The Stress Test, New Yorker (Feb. 29, 2016), [].

[25].     See Frank Pasquale, The Black Box Society: The Secret Algorithms That Control Money and Information (2015); Mike Ananny, Toward an Ethics of Algorithms: Convening, Observation, Probability, and Timeliness, 41 Sci., Tech., & Hum. Values 93 (2016); Malte Ziewitz, Governing Algorithms: Myth, Mess, and Methods, 41 Sci., Tech., & Hum. Values 3 (2016).

[26].     See, e.g., Articles, Matters, [].

[27].     Ivan Oransky, About Ivan Oransky, Retraction Watch, [].

[28].     See Learning Management System, Wikipedia, [].

[29].     Diana Lambert & Dale Kasler, UC Davis Chancellor Received $420,000 on Book Publisher’s Board, Sacramento Bee (Mar. 3, 2016, 4:32 PM), [] (“UC Davis Chancellor Linda P.B. Katehi received $420,000 in compensation as a board member for John Wiley & Sons, a leading publisher of science, engineering, and math textbooks for universities.”).

[30].     See, e.g., custodians online, [].

[31].     See Ernesto, Elsevier Cracks Down on Pirated Scientific Articles, TorrentFreak (June 9, 2015), [].

[32].     See generally John Bohannon, The Frustrated Science Student Behind Sci-Hub, Sci. (Apr. 28, 2016, 2:00 PM), []; John Bohannon, Who’s Downloading Pirated Papers? Everyone, Sci. (Apr. 28, 2016, 2:00 PM), []; Kate Murphy, Should All Research Papers Be Free?, N.Y. Times (Mar. 12, 2016), [].

[33].     See Letter Brief, Elsevier v. Sci-Hub, No. 1:15-cv-04282-RWS (S.D.N.Y. Sept. 15, 2015); Complaint, Elsevier v. Sci-Hub, No. 1:15-cv-04282-RWS (S.D.N.Y. June 3, 2015).

[34].     From Michael Eisen: “When we started PLOS the only way we had to make money was through APCs, but if I had my druthers we’d all just post papers online in a centralized server funded and run by a coalition of governments and funders, and scientists would use lightweight software to peer review published papers and organize the literature in useful ways.  And no money would be exchanged in the process.  I’m glad that PLOS is stable and has shown the world that the APC model can work, but I hope that we can soon move beyond it to a very different system.”  See Michael Eisen, On Pastrami and the Business of PLOS, It Is NOT Junk (Mar. 20, 2016), [].

[35].     See generally Nathaniel Tkacz, Wikipedia and the Politics of Openness 61–74 (2015).

[36].     See Annual Report, Wikimedia Found., []; see also Wikipedia: Statistics, Wikipedia, [].

[37].     See Carole Tenopir & Donald W. King, The Growth of Journals Publishing, in The Future of the Academic Journal 159 (Bill Cope & Angus Phillips eds., 2d ed. 2014) (discussing the various estimates of the number of scholarly journals and their growth since the seventeenth century).

[38].     In between the first and final draft of this story, SSRN was purchased by Elsevier.  See Ckelty, It’s the Data, Stupid: What Elsevier’s Purchase of SSRN Also Means, Savage Minds (May 18, 2016), []; Richard Van Noorden, Social-Sciences Preprint Server Snapped up by Publishing Giant Elsevier, Nature (May 17, 2016), [].  As a result, several scholars banded together to announce a new SocArXiv as a replacement.

[39].     Randy Schekman has been the most prominent voice of this critique.  See, e.g., Randy Schekman, How Journals Like Nature, Cell and Science Are Damaging Science, Guardian (Dec. 9, 2013, 2:30 PM), [].