Scholarly publishing round-up: Broader perspectives on AI and industry evolution
This week, we bring you an interesting mix of recommendations that will help you take a broader view of scholarly publishing, its evolution, and where the industry could be headed.
Are AI guidelines enough?
This is not your regular piece on the use of AI in research and writing. Science writer and educator Tim Requarth offers a fresh perspective on the temptations of using AI in academia. He argues that guidelines for AI use fail to address the deeper question of discernment: knowing when and how to use AI without sacrificing the intellectual effort that produces genuine insight. Writing and thinking, he explains, are intertwined processes; the friction of articulating complex ideas clarifies thought and exposes gaps in reasoning. When AI tools take over that struggle by producing fluent, coherent text, they risk short-circuiting the very process that leads to understanding. This overreliance on AI may mirror the “productivity paradox” seen in economics, where technological efficiency increases output but not meaningful progress. By making polished work easier to produce, AI can create an illusion of advancement while diminishing opportunities for deep reflection and learning. Requarth urges students, researchers, and writers to notice the moment they reach for AI and ask whether they are doing so to save time or to think more clearly. True progress depends not on how much faster we can generate content, but on recognizing when the creative and cognitive effort itself is essential to discovery. Read this post here.
What is the potential size of the scholarly publishing market?
Ian Mulvany explores the enormous scale and latent value of content within the publishing ecosystem, especially as it relates to AI training, by examining how much major scholarly publishers contribute and what that implies in terms of data tokens, economic worth, and future growth. Based on CrossRef figures for 80 major publishers, he finds roughly 123 million records. Converted to words and then to tokens, this amounts to roughly 1 to 1.5 trillion tokens. Since leading AI models are likely trained on 10+ trillion tokens, this suggests that the content from these publishers alone might represent billions of dollars’ worth of training data. But Mulvany emphasizes that non-traditional published outputs also need to be accounted for: audio, video, image files, raw research data, and process artifacts such as article drafts and revisions. If AI training interest expands into these areas, the addressable market could easily double or triple in size. He also flags ethical, incentive, and structural concerns: huge financial flows tied to training rights could warp incentives; the direction of major actors is unlikely to be swayed by industry sentiment alone; and although scholarly publishers provide relatively higher-quality and less biased content than much of the open web, powerful LLMs will likely be used broadly regardless of content provenance or quality. Read the full post here.
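Mulvany’s record-to-token conversion can be reproduced with a quick back-of-envelope calculation. The 123 million record count comes from the post; the words-per-record and tokens-per-word figures below are illustrative assumptions of ours, not numbers from the post:

```python
# Back-of-envelope estimate of the token volume implied by Mulvany's CrossRef figures.
# Only RECORDS comes from the post; the other two constants are rough assumptions.

RECORDS = 123_000_000        # CrossRef records across 80 major publishers (from the post)
WORDS_PER_RECORD = 8_000     # assumed average length of a scholarly article
TOKENS_PER_WORD = 1.3        # common rule of thumb for tokenizing English text

total_tokens = RECORDS * WORDS_PER_RECORD * TOKENS_PER_WORD
print(f"~{total_tokens / 1e12:.1f} trillion tokens")  # lands inside the 1-1.5T range
```

Varying the assumed article length between roughly 6,000 and 10,000 words is what produces the 1 to 1.5 trillion token range the post cites.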
How publishing is slowly being redefined
This is a refreshing take on the evolution of scholarly publishing. Iain Hrynaszkiewicz, Director of Open Research Solutions at PLOS, argues that the traditional focus on journal articles as the centerpiece of scholarly communication is misaligned with how research works and how we should reward it. He traces how innovations like Digital Object Identifiers (DOIs), ORCID, CRediT roles, data sharing, and preprints gradually nudged publishing toward more openness and richer metadata. He also notes that the incentive systems around funding, hiring, and recognition still revolve heavily around articles and journal prestige. PLOS is now experimenting with a “knowledge stack” model to treat all research outputs (data, code, methods, negative results, early drafts) as first-class, linked components, and to shift to more sustainable, equitable business models beyond article processing charges (APCs). Hrynaszkiewicz emphasizes that making these diversified outputs visible, creditable, and assessable would enable fairer recognition (especially for early career researchers), reward risk-taking and failure, and better reflect what science really produces. Doing so will require broad adoption across publishers, institutions, and funders, as well as attention to interoperability and integration into existing research workflows. Read this post here.
Three years since ChatGPT: Where are we headed from here?
Phil Jones, co-founder of MoreBrains Consulting Cooperative, reflects on the trajectory and impact of generative AI since ChatGPT’s debut, noting that while the technology has dramatically reshaped how the public interacts with AI, we are still in the early stages of understanding its long-term implications. He acknowledges both the excitement and the skepticism surrounding AI, with concerns about attribution, copyright, hallucinations, and the degradation of content quality, but argues that these risks should be addressed rather than used to stifle innovation. Over the past year, the use cases for AI have evolved: tools have shifted from doing the work for users to more advisory or “coach-like” functions (e.g., writing-critique tools); discovery and summarization services have grown strongly; and AI adoption has increased in editorial and research integrity workflows such as fraud detection, manuscript triage, and reviewer matching. While the future remains opaque, the possibility that we will move beyond traditional interfaces toward agentic systems, or even redefine what scholarly publication means, looms on the horizon. For now, what seems certain is that AI-driven improvements in workflow and efficiency will continue and likely accelerate. Read the full post here.
Read anything interesting lately? Share it with us in the comments section below.