Useless security reports generated by AI are frustrating open-source maintainers

Facepalm: Generative AI services are neither intelligent nor capable of making a meaningful contribution to open-source development efforts. A security expert who has had enough of “spammy,” hallucinated bug listings is venting his frustration and asking the FOSS community to steer clear of AI-generated reports.

Generative AI models have already proven to be powerful tools in the hands of cybercriminals and fraudsters. However, hucksters can also use them to spam open-source projects with useless bug reports. According to Seth Larson, the number of extremely low-quality, spammy, LLM-hallucinated security reports has recently increased, forcing maintainers to waste their time on worthless submissions.

Larson is a security developer at the Python Software Foundation who also volunteers on the “triage teams” tasked with vetting security reports for popular open-source projects such as CPython, pip, urllib3, and Requests. In a recent blog post, the developer denounces a new and troublesome trend of sloppy security reports created with generative AI systems.

These AI reports are insidious because they appear potentially legitimate and worth checking out. As Curl and other projects have already pointed out, they are just better-sounding crap, but crap nonetheless. Thousands of open-source projects are affected by this issue, yet maintainers are discouraged from sharing their findings because of the sensitive nature of security-related work.

“If this is happening to a handful of projects that I have visibility for, then I suspect that this is happening on a large scale to open source projects,” Larson said.

Hallucinated reports waste volunteer maintainers’ time and cause confusion, stress, and a great deal of frustration. Larson said the community should treat low-quality AI reports as malicious, even if that is not the senders’ original intent.

He offered valuable advice for the platforms, reporters, and maintainers currently dealing with an uptick in AI-hallucinated reports. Platforms should employ CAPTCHAs and other anti-spam measures to prevent the automated creation of security reports, while bug reporters should stop using AI models to hunt for security vulnerabilities in open-source projects.

Large language models don’t understand anything about code, and finding legitimate security flaws requires dealing with “human-level concepts” such as intent, common usage, and context. Maintainers can save themselves a lot of trouble by responding to apparently AI-generated reports with the same effort the original senders put in, which is “near zero.”

Larson acknowledges that many vulnerability reporters act in good faith and usually provide high-quality reports. However, an “increasing majority” of low-effort, low-quality reports ruins things for everyone involved in the development process.
