In November 2024, I came across a preprint on arXiv that purported to demonstrate how AI improved materials science research. It was impressive: a controlled study of over 1,000 scientists at a major R&D firm that showed how AI use resulted in more patents filed, more novel materials discovered, and greater innovation. The benefits of AI accrued primarily to top researchers, however, and most of the researchers reported “reduced satisfaction with their work due to decreased creativity and skill underutilization.” Big, if true, right?? I was particularly interested in the affective results reported: sure, we may get benefits from AI, but for whom, and at what human cost?
Unfortunately, the study was total BS. MIT released a statement yesterday saying it “has no confidence in the provenance, reliability or validity of the data and has no confidence in the veracity of the research contained in the paper.” The paper—while still in peer review—had gotten widespread coverage in scientific journals and the popular press, hence the huge step MIT took in issuing this damning press release.
Ben Shindel, who has a background in materials science, offers a fascinating analysis of the paper’s data problems in his Substack post AI, Materials, and Fraud, Oh my!. It’s worth a read for the specifics of the analysis, as well as for the challenges Shindel points out for interdisciplinarity in AI research. This was an economics paper, submitted to a top econ journal, but its most glaring problems concerned materials science.
Shindel notes that one “red flag should have been how spotless the findings were. In every domain that was explored, there was a fairly unambiguous result. New materials? Up by 44% (p<0.000). New patents? Up by 39% (p<0.000). New prototypes? Up by 17% (p<0.001).”
Now, I’m neither a materials scientist nor an economist, so the red flags in the data didn’t hit me. But as a researcher, I do know how research generally works, and I remember Tim Laquintano and I talking about the paper. Tim wondered: how did this random econ PhD student get this kind of access?? Moreover, where was his advisor? We couldn’t figure out how a paper this big could be solo-authored.
Who knows why this econ PhD student would have attempted a fraud this big, but he’s apparently no longer at MIT. Good on MIT for “assuring an accurate research record,” as they titled their press release.
I’m currently reading AI Snake Oil, by Princeton computer scientists Arvind Narayanan and Sayash Kapoor—which is quite good. They describe their analysis of machine learning papers, and how so many of the preprints or published papers in ML have “textbook errors” with data, such as data leakage. They use these examples of inaccurate ML papers to point out that much of the research showing the benefits of AI is just hype. “The more buzzy the research topic, the worse the quality seems to be” (22), they note.
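Data leakage is worth pausing on, because it’s exactly the kind of error that quietly inflates results. Here’s a minimal sketch (my own toy example, not code from the book) of what one common form looks like in an everyday scikit-learn workflow:

```python
# A toy illustration of data leakage, one of the "textbook errors"
# Narayanan and Kapoor describe. The leaky pipeline fits the scaler
# on the full dataset BEFORE splitting, so test-set statistics leak
# into training and the reported score can't be trusted.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                         # toy features
y = (X[:, 0] + rng.normal(size=200) > 0).astype(int)  # toy labels

# Leaky: the scaler sees every row, train and test alike.
X_leaky = StandardScaler().fit_transform(X)
Xtr, Xte, ytr, yte = train_test_split(X_leaky, y, random_state=0)
leaky_score = LogisticRegression().fit(Xtr, ytr).score(Xte, yte)

# Correct: split first, then fit the scaler on training rows only.
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(Xtr)
clean_score = LogisticRegression().fit(scaler.transform(Xtr), ytr).score(
    scaler.transform(Xte), yte
)

print(f"leaky: {leaky_score:.2f}  clean: {clean_score:.2f}")
```

With simple scaling the inflation here is small, but with target-derived features or deduplication done after the split it can be enormous. The unsettling part is that the leaky version looks perfectly plausible on a quick read, which is how such errors survive peer review.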
And indeed, the materials science and AI paper was too good to be true. The reality of AI implementation is messy. The research Tim and I are doing on AI in the workplace and studies I’m running with students at Pitt all suggest that AI integration into workflows is uneven, with mixed benefits and drawbacks. In the Introduction to TextGenEd, we (and Carly Schnitzler) wrote:
implementation is often a messy process. Implementation is when we learn whether or not tools are useful to us, when we adjust to new and clunky interfaces, and when we suss out exactly how hollow or flush the promises of big tech’s marketing language are.
And in AI and the Everyday Writer, Tim and I argue that any AI “transformations are now surfacing in disparate ways as these new technologies meld with preexisting, embodied, and stubborn writing practices that are deeply entrenched in complex systems of bureaucracy, legal regulation, labor, and power.”
In other words, good AI research is unlikely to produce neat data.
Robert Palgrave, a UCL professor in materials chemistry, was an early skeptic of the fraudulent AI paper. In an update on X/Twitter yesterday, he wrote:
As AI bridges disciplines, and promises to reshape science in so many ways, it is easier than ever to shun scepticism, which is a central plank of the scientific method.
The takeaway for me here is that we need to be especially critical of any AI research at this moment—including our own. Let’s be honest and reflective and keep an eye on what the data says, instead of what we would like it to say.