If OpenAI is so horrible at balancing their moderation classifier's training dataset that they couldn't avoid deploying a moderation model with this many false positives,
— Then imagine how poorly they balanced their RLHF training dataset.
Actually, don’t have to imagine. OpenAI has been telling us for months that their AI upgrades have shown only improvements and no degradations, despite how obviously untrue that has been.
Could this all just be incompetence?
(Nah, they lying. Something suspicious going on.)
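To make "badly balanced" concrete, here's a toy naive-Bayes-style sketch (every number in it is invented) of how stuffing a moderation classifier's training set with violation examples inflates the learned prior, so the exact same benign input crosses the flagging threshold:

```python
# Toy naive-Bayes-style sketch (all numbers hypothetical) of how an unbalanced
# moderation training set produces false positives: over-representing the
# "violation" class inflates the learned prior, so benign inputs get flagged.

def flag_probability(likelihood_benign, likelihood_violation,
                     n_benign, n_violation):
    """P(violation | input) when the class prior is learned from raw counts."""
    prior_violation = n_violation / (n_benign + n_violation)
    prior_benign = 1.0 - prior_violation
    evidence = (likelihood_violation * prior_violation
                + likelihood_benign * prior_benign)
    return likelihood_violation * prior_violation / evidence

# A mildly edgy but benign input: the text itself favors "benign" 4-to-1.
lik_benign, lik_violation = 0.8, 0.2

# Balanced training set: the input is not flagged.
print(flag_probability(lik_benign, lik_violation,
                       n_benign=50_000, n_violation=50_000))   # 0.2

# Violation-heavy training set: the exact same input now gets flagged.
print(flag_probability(lik_benign, lik_violation,
                       n_benign=10_000, n_violation=90_000))   # ~0.69
```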
Ban or Embrace? Colleges Wrestle With A.I.-Generated Admissions Essays.
“The digital disruption comes at a turning point for institutions of higher education across the United States. After the Supreme Court ruled in June that race-based university admissions programs were illegal, some selective universities and colleges had hoped to rely more on essay questions — about applicants’ upbringing, identities and communities — to help foster diversity on campus.”
“ChatGPT, write a university admissions essay about how I actually grew up as an oppressed woman of color.”
Article
Another day, another popular site repeating the “just trained to predict the next word” lie
No, these reinforcement learning (RL) models — and GPT-4 is one of them — do not output the probability of a word being the next word according to corpus frequencies.
RL models output expected values of actions/words (the discounted value of choosing an action under some reward function and discount factor), NOT probabilities of actions/words (the fraction of times a given action was taken in this state in the training data).
Reward values, not probabilities.
2 totally different types of things.
You absolutely cannot treat training data probabilities as expected values or vice-versa.
The types don't fit, quit your sh*t!
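To make the type mismatch concrete, here's a minimal sketch with invented continuations, rewards, and discount factor: a corpus probability is a frequency ratio that sums to 1, while an expected value is a discounted payoff on whatever scale the reward function uses.

```python
# Minimal sketch (toy numbers) of the two types: a corpus probability is a
# frequency ratio, an RL expected value is a discounted payoff under a reward
# function. Different units, different meanings.

from collections import Counter

# Corpus probability: fraction of times each continuation followed this
# context in the training data. Sums to 1 by construction.
counts = Counter({"continuation_A": 90, "continuation_B": 10})
total = sum(counts.values())
corpus_prob = {w: c / total for w, c in counts.items()}

# Expected value: discounted return of choosing each continuation under an
# assumed reward function. Lives on the reward scale, not in [0, 1].
gamma = 0.99                                              # assumed discount factor
reward = {"continuation_A": 0.3, "continuation_B": 7.5}   # hypothetical rewards
expected_value = {w: gamma * r for w, r in reward.items()}

print(corpus_prob)     # {'continuation_A': 0.9, 'continuation_B': 0.1}
print(expected_value)  # roughly {'continuation_A': 0.297, 'continuation_B': 7.425}
```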
Not only that — but for a sufficiently smart model, fed sufficiently hard problems — the values of these two types of things GO IN TOTALLY OPPOSITE DIRECTIONS.
The smarter the model and harder the problems — the more the high expected value points to very low training-data-occurrence solutions.
At the extreme, the model ends up saying that certain solutions are of maximum expected value, which have occurred 0% of the time in the training data.
E.g. ask a sufficiently smart model a problem that does have a solution, that thousands of people have tried and failed to solve, and whose failed attempts the model was trained on — and the model will output a correct solution UNLIKE any solution that has ever been given before.
…By definition.
If no one in the training data could solve the problem properly, and the model was smart enough that it did… then by definition its solution is unlike any of their attempts.
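A toy illustration of that extreme case (the attempt log, the candidate answers, and the reward function are all invented): when every logged attempt is a failure, a frequency-matching policy can only reproduce failures, while a value-maximizing policy still picks the answer that appears exactly zero times in the data.

```python
# Toy illustration (invented data): every attempt in the "training data" failed,
# so a frequency-matching policy can only reproduce failures, while a
# value-maximizing policy still selects the answer that was never logged.

logged_attempts = ["wrong_A"] * 600 + ["wrong_B"] * 400   # all failed attempts
candidate_answers = ["wrong_A", "wrong_B", "correct_C"]   # the model's options

def reward(answer):
    """Assumed reward: 1 if the answer actually solves the problem, else 0."""
    return 1.0 if answer == "correct_C" else 0.0

# "Just predict the next word": pick the most frequent answer in the data.
freq_choice = max(set(logged_attempts), key=logged_attempts.count)

# Value-maximizing policy: pick the answer with the highest expected reward.
value_choice = max(candidate_answers, key=reward)

print(freq_choice, logged_attempts.count("correct_C") / len(logged_attempts))
# wrong_A 0.0   <- the correct answer occurs 0% of the time in the data
print(value_choice)
# correct_C     <- yet it is the maximum-expected-value choice
```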
Can AI models outperform the humans they were trained on? YES, this was proven to the extreme years ago, with AlphaGo, MuZero, and thousands of other AI models, before and since.
JuSt TrAiNeD To PrEdIcT tHe NeXt WoRd
Most retarded lie ever to spread.
Literally the opposite of what’s happening, especially when you get to AIs solving the hardest problems, by definition.
AFAIK, can’t find anyone else who’s ever pointed this out, but there you go.
The types don't fit, quit your sh*t!
GPT-4 is NOT just trained to predict the next word.
Dumb Dude’s Website
“It’s just outputting the average text it’s seen”
= Lie.
These AIs are already getting many problems right that 99% of the answers on the Internet get wrong.
So how is that possible?
Shouldn’t the AIs be getting the same answer as those 99% of the Internet answers?
(They don’t.)
Shouldn’t the AIs usually give the same answer as whatever more than 50% of the internet text they were trained on said?
(They don’t.)
Is more than 50% of internet training data someone giving commands and another person replying with text fulfilling that command, as instruction-RL-tuned AI models do?
No. Not at all.
So why then are they repeating this lie?
Maybe they’re preparing to shout out “but these facts from the AI are the overwhelming consensus agreement from the training data!”, as if that’s a real argument, just as they do with deeply-corrupted topics like climate change.
We’ll see.
= AI output equals training data consensus lie.
Why are so many otherwise smart guys from big tech pushing the “AI output equals training data consensus” lie?
“Best” case: wordcels who literally cannot tell when their words are disconnected from reality.
Worst case: warming up the world for the biggest consensus-equals-truth manipulation scheme yet.
(Guy who made the weird quotes shown in the previous 3 posts.)
Altman says it's totally hopeless to try to compete with OpenAI
Question: Could a team of 3 super-smart engineers with $10 million build their own foundation model?
Altman: Look, the way this works is, we’re going to tell you, it’s totally hopeless to compete with us on training foundation models.
Lead Product Manager at Google DeepMind Says LLMs Can’t Reason
I say Lead Product Managers at Google DeepMind can’t reason.
ChatGPT realizing it's wrong without having to be corrected
If OpenAI actually fixes the glaring problem they had in their training — that neither the web training data nor the RLHF instruct training seemed to have any examples of characters recognizing their own mistakes and self-correcting — then maybe all of the recent regressions will be worth it.
Admittedly, this behavior is something you almost never see naturally in web data, but it's badly needed for LLMs.
All the more reason that LLMs absolutely shouldn’t be just emulating the average of the web (though they mostly stopped being that long ago.)
OpenAI broke moderation intentionally?
That's becoming my theory, at this point. Why —
(1) Crowdsourcing millions of instructions from people explaining their morals in the feedback — Lots of people, scared they'll lose their accounts, are apparently writing feedback spelling out exactly where they think the boundary between acceptable and unacceptable is. Written instructions can be 1,000,000x more valuable to AIs than just clicking the thumbs.
(2) OpenAI embraced that breaking things brings big publicity — so they intentionally make it break more. 90% of the early ChatGPT hype was people showing off their jailbreak successes. ~100% of our Twitch AI questions are people trying to break it. Broken stuff gets publicity.
(3) Seems the moderation API endpoint is unaffected? — If so, and they're only doing it to the ChatGPT website but not on the API, then there's your smoking gun. In fact, I'll try checking this today (a quick check sketch is below).
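For anyone who wants to run the same check, here's a minimal sketch that calls the standalone /v1/moderations endpoint directly (it assumes an OPENAI_API_KEY environment variable; the input text is just a placeholder), so its scores can be compared against what the ChatGPT site flags:

```python
# Quick check of the standalone moderation endpoint, to compare against what
# the ChatGPT website is doing. Assumes OPENAI_API_KEY is set in the
# environment; the input text below is only a placeholder.

import os
import requests

resp = requests.post(
    "https://api.openai.com/v1/moderations",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={"input": "A borderline prompt that the ChatGPT site flags."},
    timeout=30,
)
resp.raise_for_status()
result = resp.json()["results"][0]

print("flagged:", result["flagged"])
# Per-category scores, highest first, to see which classifier heads fire:
for category, score in sorted(result["category_scores"].items(),
                              key=lambda kv: -kv[1]):
    print(f"  {category}: {score:.4f}")
```

If the endpoint's scores stay low on inputs the website refuses, that would support point (3).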