The Forest of the SPARC Landscape

Focusing on various trees isn't as useful as realizing some are heading into the woods

The recent “Landscape Analysis” from SPARC, released at the end of March, walks readers through a sober-sided evaluation of the market, with an emphasis on the major publishers — Elsevier, Wiley, and Springer Nature on the journals side, with Pearson and McGraw-Hill (and Cengage, now part of McGraw-Hill) on the education side.

There are some interesting analyses in the document, and some surprising figures to be sure (for example, Elsevier makes less money per article than either Springer Nature or Wiley). There are tables showing the asymmetry of usage (~5-7% of journals account for ~50% or more of usage in multiple fields), but the authors draw what I consider to be an unforgiving conclusion from them (I don’t agree that low usage equates to low value; a lot of good science and scholarship comes out of small disciplines, new ways of thinking, or emerging fields). The authors confirm in passing my finding that subscriptions only cost academic institutions 0.5% of their budgets. There are arguments about productivity gains, and some contradictory and incomplete data. But overall, the analysis seems solid at the detailed level, missing the mark only a few times here and there.

What’s truly interesting about the analysis is the forest it describes — the big picture it asserts — which is alive with customer knowledge and Big Data assumptions. The authors examine how Elsevier and other companies are now pivoting away from content and into the surveillance economy.

The analysis examines the implications of this through a narrow premise: that academic institutions can and should guide the data acquisition and analysis practices of private firms. These firms use information products to generate data exhaust, which they then turn into information and projections about academic practices, research areas, and individuals to sell back to institutions.

What the analysis describes is a fascinating — and totally expected — pivot, one we’ve seen developing for quite some time. The SPARC analysis puts a pin in it, and states it quite explicitly.

But exploring the forest is where the analysis falls down, failing multiple times to answer questions its own premise begs — for instance, it asserts data acquisition and analysis should be guided by academic culture, without testing whether there is actually something we can identify as “academic culture” against which proper data utilization practices can be judged.

More glaring is the fact that the analysis assumes that acquisition and analysis of academic data will be the purview of publishers like Elsevier, McGraw-Hill, and Wiley. I’m fairly confident that Facebook and Google have reams more data — and more years of it — than any of these will ever acquire. Google, in particular, has been smart about gathering data from academic sources, via Google Scholar and now CASA. Facebook’s sign-in options — used by some scholarly publishers — certainly give them a lot of information about academics, students, and researchers. These platforms have so much information to wash everything against, and so much more experience and talent than the little publishers covered here will ever have.

Remember, from a revenue standpoint, Google makes in a few weeks what Elsevier makes in a year. Facebook makes in one quarter (3 months) what the entire publishing industry makes in a year, depending on the figure you use for the size of the STM economy. Beyond that, these mega-businesses (Google, Facebook) are both predicated on data-gathering and data-utilization. They don’t sell any content. They are existentially reliant on doing Big Data well.

I think they’re probably far better at this than scholarly publishers — even the giants — ever will be.

The pivot into data, of course, has been caused by shifts in the content market spurred by OA and exacerbated by piracy and the resulting softening of subscription prices. Publishers who are no longer certain that selling content will result in stable or growing revenues have been driven to this. There are clear benefits. The pivot into data obviates the need to argue with librarians about renewal percentages or funders about OA fees. They can approach other people with checkbooks at institutions. The surveillance economy looks comparatively liberating, and holds potentially greater upside.

The authors note that the trends driving the members of the Billion Dollar Club to pivot to surveillance capitalism are also affecting the Million Dollar Club and Hundred Thousand Dollar Club members, who don’t have the money, scale, or management skills to pivot in the same manner, writing:

“The possibilities are many and the opportunity for smaller publishers to replicate these advantages is almost nil . . .”

This speaks to how OA — directly and indirectly — continues to create advantages for the larger and richer publishers, while putting smaller, less-affluent publishers at distinct short-term and long-term disadvantages. OA has pushed major players in our industry — and the only ones capable of moving in this direction — toward surveillance capitalism. It has left few options for anyone else. It’s a brittle and unforgiving business model based on market dominance and production efficiency.

So this is the forest I see described in the SPARC analysis — OA will continue to make the big bigger, the small smaller, and drive the market away from subscriptions and into surveillance capitalism.

The recommendation that academic institutions “take control of metrics” is made with a straight face, with no acknowledgement that academic institutions have abused the Impact Factor for years, and continue to do so, making it safe to question their ability to manage metrics in sophisticated and disinterested ways.

There are also recommendations that algorithms shouldn’t be “black box.” However, pulling back further, if the algorithms Elsevier or Wiley uses are exposed, but those used by Google, Facebook, and YouTube are not, who will win in the end? It’s easy to see how the larger black boxes would simply factor in the open algorithms and dominate them in unseen ways.

Reading this analysis confirms my belief that smaller publishers need to pivot hard into content at this point, and find novel ways to get users and institutions to pay for quality. They can’t win the quantity game, whether played with content or data. There are plenty of examples emerging of new ways to get users to pay while honoring OA commitments and other business necessities, including profits.

I’d recommend reading this analysis. You may get something else out of it. Just don’t lose the forest for the trees.

