Distracted by the LLM-generated style of writing. Not sure whether the author truly writes like that (unlikely) or there was heavy AI assistance in drafting this.
It reads that way and Pangram says it's AI. And my experience says that if you see an AI-related headline on HN, there's a 50%+ chance that it is AI-generated and meant mostly for clicks.
Skepticism against AI text detectors on HN is as old as time, and frequently comes from people with some vested interest in filling up the internet with slop when you look at their business ideas / projects / blogs. I've done systematic testing on human and LLM-generated text and I'm confident that the accuracy is higher than 95% (and that <5% is almost exclusively false negatives).
You shouldn't crucify people based on this alone, but if it reads like AI, quacks like AI, and is detected as AI, it's probably AI.
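For anyone who wants to check a claim like this on their own samples, here is a minimal sketch of the bookkeeping. The names (`Counts`, `tally`, `rates`) and the sample labels are made up for illustration; none of this is Pangram's API or the setup used in the testing described above.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    true_pos: int = 0   # AI text flagged as AI
    false_neg: int = 0  # AI text passed as human
    true_neg: int = 0   # human text passed as human
    false_pos: int = 0  # human text flagged as AI


def tally(is_ai, flagged):
    """Count outcomes; both arguments are equal-length lists of bools."""
    if len(is_ai) != len(flagged):
        raise ValueError("need one verdict per sample")
    c = Counts()
    for truth, verdict in zip(is_ai, flagged):
        if truth and verdict:
            c.true_pos += 1
        elif truth:
            c.false_neg += 1
        elif verdict:
            c.false_pos += 1
        else:
            c.true_neg += 1
    return c


def rates(c):
    """Return (accuracy, false_positive_rate, false_negative_rate)."""
    total = c.true_pos + c.false_neg + c.true_neg + c.false_pos
    if total == 0:
        raise ValueError("no samples")
    humans = c.true_neg + c.false_pos
    ais = c.true_pos + c.false_neg
    accuracy = (c.true_pos + c.true_neg) / total
    fpr = c.false_pos / humans if humans else 0.0
    fnr = c.false_neg / ais if ais else 0.0
    return accuracy, fpr, fnr


# Made-up sample: 10 AI texts, 10 human texts, one AI text missed.
is_ai = [True] * 10 + [False] * 10
flagged = [True] * 9 + [False] * 11
accuracy, fpr, fnr = rates(tally(is_ai, flagged))
print(f"accuracy={accuracy:.2f} fpr={fpr:.2f} fnr={fnr:.2f}")
```

The point of splitting the numbers: overall accuracy can look fine while the false positive rate (human text flagged as AI) is the figure the people disputing the tool actually care about, so it seems worth reporting both.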
>Skepticism against AI text detectors on HN is as old as time
Since early 2023 or so, when the detectors were widely reported (off platform) as unreliable?
>and frequently comes from people with some vested interest in filling up the internet with slop
I'm sure sometimes.
At least once, has come from someone who recently philosophized about whether to call out the specifics of AI writing in the first place given the potential for aiding labs in their training missions https://news.ycombinator.com/item?id=48326913
>I've done systematic testing
You in the field?
>but if it reads like AI
Actually didn't to me, and I'd like to think my detector's no worse than the average for a commenter here... perhaps as we'd all :)
I have no idea about the current article, and given that the author is the person with the first commit in the Kubernetes repo (https://joe.dev/about/), he obviously has a lot of credibility.
Just generally though: what we're seeing a ton of these days is people writing something and then passing it to an LLM with a request to improve it somehow, e.g. by fixing grammar, tightening the style, etc. In such cases, the answer to your question is that the "prompt" is (1) a first draft, and (2) an instruction to edit it.
It's clear, though, that the LLMs leave far more imprints on the text than most people realize, and that although they may have asked the LLM to restrict its edits to "just" X or Y, the actual changes to the text will often go beyond that.
How this will evolve over time is anyone's guess, of course.
I do this all the time. I type in a large amount of text, and the only things final in my mind are the para breaks and the idea per paragraph. Then I give it to AI saying "sending to director", "sending to friends on WhatsApp group", "sending to colleagues", and it does an awesome job of bringing the "AI polish". Then you edit or negotiate line by line or para by para on what you want to keep.
Yes, this is certainly common, and opinions and tastes differ about the outcomes—as they should, because we are all still in the early stage of sorting through the best way to use these tools. I think it's also already clear that "best way" means something different in different contexts.
Where some people are getting into trouble, at least in the HN context, is underestimating the impact that this has on their text. There's a big perception gap between the author's view ("fixed up the grammar a bit") and the reader's view ("this sounds entirely like an AI wrote it") in many cases. So many, in fact, that I feel I can say something about it. I'm no authority on any of this and don't want to sound like one, but this is such a common pattern at the moment that I feel confident reporting it. How it will change over time, I have no idea.
(I also don't want to sound anti-LLM - we rely on these tools heavily, they're amazing, they've already improved HN, and they show every sign of high potential to improve it further. The bottleneck isn't the LLMs, it's how quickly we can figure out how to use (and test) them. We just don't use them to process any text that we put on HN itself.)
"Turn this outline/loose idea for an article/4 paragraphs of text into a blog post similar to these previous blog posts, but make sure that this one has a table of contents and a bunch of references"
The article didn't do a good job explaining the 120% attention angle. I kept reading, waiting for that, and it never really came. I definitely had the impression it was heavily using AI in the writing, which I've given up on being against, but it just didn't explain the thesis well.
I guess the idea is that AI gives you back time so you could now do the 20%, but you still really can't because you still have to think about it even if the code is generated? Not even sure, after reading all that text, what the idea is.
I never worked at EA, but they had a "Friday Afternoon Project", or at least someone who worked for EA here in Florida told me so. The unfortunate thing about Friday Afternoon Project is its acronym, "F.A.P."; not sure if that was intentional or just a "happy accident". A coworker found one such Friday Afternoon Project and sent me a YouTube video of it. It was pretty funny looking: think of a really bare-bones game concept, basically. I guess it was a way for EA to let employees have some downtime.
I was always jealous of the 20% time concept, because there are so many jobs and places where I'd use that time to solve things nobody wants to "fund" within my org. Sometimes there's a really dumb bug somewhere, or an easy-to-solve internal tooling need (I'm sure Google has had this resolved many a time internally), that could be met if I could have even two hours on a Friday to work on anything.
Artists and actors could point the LLMs at programmers by developing their own apps, doing their own engineering. Do they? Who's doing it already? Tilly Norwood's creator seems like a decent start.
>Have you tried putting known human writing into pangram? I have. I've gotten 100% AI with multiple samples of my own human writing. It has also given me 50% on things I know were 100% AI written (from my prompts).
https://news.ycombinator.com/item?id=48326698
>Pangram is basically a made-up number. / I've tried it on large docs I've written well before the AI times, and that are nowhere available on the Internet (so it can't be a corpus issue) - and it is happily classifying me as 60%-80% AI.
https://news.ycombinator.com/item?id=48378226
Two of my own thoughts:
Unfortunate there's an incentive to pay to sign up to protect oneself against false accusations.
An earlier claim in this thread stated 100% from the same tool, but another commenter claims 76%, so apparently the tool is even susceptible to that failure mode.
only a few companies like google had that imo. most companies cannot afford that.