21 September 2023
Takeaways from the Jagged Tech Frontier*
Lautonomy is being built on two claims: (1) that LLMs, properly trained, can radically improve how organizations and individuals access knowledge about rules, and (2) that this unlocks better decision-making.
Given this, it’s perhaps unsurprising that we’d be interested in a paper promising evidence of the impact of AI on the workflow of what the authors term “high human capital professionals”, including consultants and, presumably, lawyers.
Our takeaways from “Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of AI on Knowledge Worker Productivity and Quality”:
1/ It may seem obvious, but LLMs are fundamentally different from earlier forms of automation, to the extent that they represent “an entirely new category of automation”. Where earlier generations of AI focused on the narrow problems a data science team might work on, LLMs have capabilities that overlap with those of the most creative, most educated, and most highly paid workers.[1]
In other words, AI can now be incorporated into the sort of complex, real-world workflows that were long predicted to trigger the radical transformation of “the professions”.[2] The difference with this study is that the authors aren’t theorizing about imagined futures but reporting on an empirically validated present.
One (perhaps nitpicky) thing to pick up on is that LLMs, and the scope of automation they enable, are the outcome of longstanding research into natural language processing. LLMs are a radically different technology not because they allow for higher levels of automation (which they do) but because they are built on techniques designed to parse the kind of data that constitutes the workflow of knowledge-intensive professions, namely unstructured, natural-language text.
2/ We’re only at the beginning of the transformation. We know that LLMs will change workflows in consulting, medicine, law and so on, but the exact nature and depth of this change are difficult to predict. This is in part because the capabilities of generative AI models have been, even for the architects of those models, novel, unexpected, and evolving faster than anyone predicted. Preach.
3/ Explainability (or opacity) matters not only from an AI ethics and safety perspective (where an understandably large amount of the focus has been) but also from an operational, productivity perspective. The risk that LLMs produce wrong-but-plausible results (hallucinations or confabulations) or mistaken answers to relatively trivial tasks (e.g. math, citations) shapes the way AI can be used and integrated into workflows. There are actually two distinct points here. One is that LLMs sometimes get answers wrong or lie. The second is that they get answers wrong or deceive in unexpected or difficult-to-predict ways.
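The operational upshot is that these failure modes call for different mitigations. The trivially checkable slice of an output (arithmetic, citation formats) can be guarded with deterministic validation; plausible-but-wrong prose cannot. A minimal sketch of the first kind of guard, with the sample answer as a hypothetical stand-in for a real model response:

```python
import re

def check_arithmetic_claims(text: str) -> list[str]:
    """Flag simple 'a + b = c' claims in a model's output that don't add up."""
    problems = []
    for a, b, c in re.findall(r"(\d+)\s*\+\s*(\d+)\s*=\s*(\d+)", text):
        if int(a) + int(b) != int(c):
            problems.append(f"{a} + {b} != {c}")
    return problems

# Hypothetical model output standing in for a real LLM response.
answer = "Combining both segments, 1200 + 340 = 1640 units."
print(check_arithmetic_claims(answer))  # -> ['1200 + 340 != 1640']
```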
It’s this confusion about what LLMs will and won’t, and can and can’t, do well (not helped by the hype cycles surrounding generative AI) that is most telling when it comes to the organizational dynamics the authors term the “jagged technological frontier”. Per the paper: “the future of understanding how AI impacts work involves understanding how human interaction with AI changes depending on where tasks are placed on this frontier, and how the frontier will change over time.”
4/ The research methodology could probably be developed into an interesting benchmark for LLM performance “in the wild”. Consultants were given a mix of survey tasks categorized into four types: creativity (e.g., “Propose at least 10 ideas for a new shoe targeting an underserved market or sport.”), analytical thinking (e.g., “Segment the footwear industry market based on users.”), writing proficiency (e.g., “Draft press release marketing copy for your product.”), and persuasiveness (e.g., “Pen an inspirational memo to employees detailing why your product would outshine competitors.”).
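As a thought experiment, here’s a minimal sketch of what that benchmark could look like. Everything beyond the paraphrased task prompts is our own illustration: the `run_benchmark` harness, the 1-10 rubric, and the `model`/`grader` callables are hypothetical stand-ins, not the paper’s instrument.

```python
from dataclasses import dataclass
from enum import Enum

class Category(Enum):
    CREATIVITY = "creativity"
    ANALYTICAL = "analytical thinking"
    WRITING = "writing proficiency"
    PERSUASION = "persuasiveness"

@dataclass
class Task:
    category: Category
    prompt: str

# Task prompts paraphrased from the study; one per category here,
# though a real benchmark would want many per category.
TASKS = [
    Task(Category.CREATIVITY, "Propose at least 10 ideas for a new shoe targeting an underserved market or sport."),
    Task(Category.ANALYTICAL, "Segment the footwear industry market based on users."),
    Task(Category.WRITING, "Draft press release marketing copy for your product."),
    Task(Category.PERSUASION, "Pen an inspirational memo to employees detailing why your product would outshine competitors."),
]

def run_benchmark(model, grader) -> dict[Category, float]:
    """Average a 1-10 rubric score per category. `model` stands in for
    an LLM call; `grader` for a human (or LLM) evaluator."""
    scores: dict[Category, list[float]] = {c: [] for c in Category}
    for task in TASKS:
        answer = model(task.prompt)
        scores[task.category].append(grader(task, answer))
    return {c: sum(v) / len(v) for c, v in scores.items() if v}
```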
They also had a more complex task, which involved combining data from a spreadsheet with interviews with company insiders: participants needed to use the (qualitative) interviews to correctly interpret the (quantitative) spreadsheet data if they were to offer the right strategic advice to a hypothetical CEO.
From our perspective (as developers of a conversational-AI-based legal intelligence workflow) it’s validating to see the second “benchmark” in play. We’re putting a fair bit of effort into combining qualitative and quantitative data sources precisely because we think the most interesting and radical applications of AI to (strategic) decision-making require an ability to fuse different data types and sources. This is what pushes LLMs beyond serving as a content mill (writing emails, creating lists, generating images) into informing or, eventually, governing system-critical decisions.
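To make the fusion point concrete, here’s a rough sketch of one common pattern, assuming hypothetical file and function names throughout (`ask_llm` and the column names are placeholders, not a real API): compute the quantitative figures deterministically, then interleave them with qualitative excerpts in a single prompt so the model reasons over both.

```python
import csv
from statistics import mean

def summarize_spreadsheet(path: str) -> str:
    """Reduce the quantitative data to a few verifiable figures,
    rather than pasting raw rows into the prompt."""
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    revenue = [float(r["revenue"]) for r in rows]
    return (f"{len(rows)} rows; mean revenue {mean(revenue):,.0f}; "
            f"min {min(revenue):,.0f}; max {max(revenue):,.0f}")

def build_prompt(quant_summary: str, interview_excerpts: list[str]) -> str:
    """Interleave computed figures with qualitative excerpts."""
    quotes = "\n".join(f"- {q}" for q in interview_excerpts)
    return (
        "You are advising a CEO on strategy.\n\n"
        f"Quantitative summary (computed, not estimated):\n{quant_summary}\n\n"
        f"Insider interview excerpts:\n{quotes}\n\n"
        "Use the interviews to interpret the numbers, then recommend a course of action."
    )

# Usage (placeholders):
# ask_llm(build_prompt(summarize_spreadsheet("brand_data.csv"), excerpts))
```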
5/ The authors point to a compelling, much-needed research agenda that could consolidate cross-disciplinary work on the social impact of AI. We’d love to see more research explicitly focused on how humans behave on the jagged technological frontier.
— written by humans —
Notes
[1] Eloundou, Tyna, Sam Manning, Pamela Mishkin, and Daniel Rock (2023), “GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models,” https://arxiv.org/abs/2303.10130.
[2] Susskind, Richard, and Daniel Susskind (2015), The Future of the Professions: How Technology Will Transform the Work of Human Experts, Oxford University Press.
* Dell'Acqua, Fabrizio, Edward McFowland III, Ethan Mollick, Hila Lifshitz-Assaf, Katherine C. Kellogg, Saran Rajendran, Lisa Krayer, François Candelon, and Karim R. Lakhani. "Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of AI on Knowledge Worker Productivity and Quality." Harvard Business School Working Paper, No. 24-013, September 2023.
https://dx.doi.org/10.2139/ssrn.4573321
KEYWORDS
Large Language Models | AI and Machine Learning | Human-Machine Alignment | Knowledge Management | Impact of AI