The growing share of American office workers who've experimented with artificial intelligence in their day-to-day work have likely had a few moments of doubt about their long-term job security.
But for all the improvements in AI over the past few years, the technology is still only able to clear low bars in specific workplace tasks, according to recent data published by MIT. Even then, it might still be making some big mistakes.
Workers concerned they may soon be replaced by AI will likely be reassured by new research coming out of MIT, which frames the AI-driven jobs takeover narrative not so much as a fast-paced action movie, but more like a slow-burn think piece.
AI is steadily improving at carrying out a wide range of tasks across numerous professions, according to a study of preliminary findings released on Thursday. But in most cases, the performance of currently available models is similar to that of a disenchanted intern: hitting minimal benchmarks but struggling overall to produce quality work without a human hand to refine its output.
Clearing the bar
MIT researchers used 41 different LLMs (including versions of Claude, Gemini, and ChatGPT) to analyze performance in more than 11,000 primarily text-based tasks for various job roles listed by the Labor Department. Their outputs were then scored by people with actual on-the-job experience in those fields. The goal was to see how often an AI worker substitute could produce an output that a manager would find acceptable without any human edits, and then to evaluate its quality.
The researchers found AI has become more reliable over the years for many types of work, but still falls short whenever the stakes or standards are raised. The MIT study used a 1–9 scoring scale to evaluate AI performance, in which a 7 was defined as “minimally sufficient,” meaning the work is usable as is and requires no edits. As of late 2025, AI models scored a 7 in roughly 65% of tasks.
Most important for companies considering replacing parts of their workforce with AI, the MIT data suggests AI struggles to perform more complicated tasks. Regardless of how much time an AI model had to complete a task, the likelihood of success when graded against a 9 or “superior” quality score never exceeded 50%. In other words, when a task requires multiple steps, creativity, or precision, AI replacements are more likely to fail than succeed.
The research matches some aspects of corporate America’s current AI adoption narrative. Companies that use AI are more likely to automate routine tasks and roles once left for entry-level positions, while some highly technical skills, particularly digital ones, have actually been associated with wage premiums.
That was reflected in MIT’s data, which found average success rates lower for skilled roles in legal and IT jobs, while AI models generally had an easier time tackling the text-based tasks associated with construction and maintenance professions.
Companies that have experimented with fully automating certain parts of their workload have dealt with growing pains. Last year, Deloitte produced two reports for government clients in Australia and Canada that were both found to be riddled with fabrications. Media outlets including CNET and Sports Illustrated have been caught using AI to generate inaccurate stories under made-up bylines. Lawyers have also relied on AI to prepare their briefs, with one law firm publicly apologizing last year after it emerged that fake AI-generated citations had appeared in a bankruptcy filing in one of its cases.
The anecdotal evidence and MIT’s data suggest AI still requires a human hand to maximize its upside, though the technology is rapidly improving. MIT researchers estimated AI’s success rate on the tasks analyzed increased by as much as 11 percentage points each year owing to more capable models.
By 2029, the authors estimate, most AI models will be able to accomplish between 80% and 95% of text-based tasks at the minimally sufficient benchmark.
Whether AI will ever be able to scale toward excellent or even perfect performance remains unknown.
“Widespread automation, particularly in domains with low tolerance for errors, may still be some distance away,” the researchers wrote.
