Apple Neural Engine (ANE) Transformers: Transformer architecture optimized for Apple Silicon
PyTorch implementation for deploying your Transformer models on Apple devices with an A14-or-newer or M1-or-newer chip, achieving up to 10x faster inference and up to 14x lower peak memory consumption compared to baseline implementations.
Research Article
Github
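The linked research article's key trick, as I understand it, is replacing `nn.Linear` layers with 1x1 `nn.Conv2d` layers on a channels-first (B, C, 1, S) tensor layout that maps better onto the Neural Engine. Here is a minimal plain-Python sketch (illustrative only, not the actual `ane_transformers` code) verifying that a 1x1 convolution over the channel axis computes exactly the same per-token linear map:

```python
# Hypothetical minimal illustration of the layout trick: a linear layer on a
# (S, C_in) sequence vs. a 1x1 convolution on the same data in (C_in, 1, S)
# channels-first layout. Both apply the same weights per token.

def linear(x_sc, weight, bias):
    """Linear layer on (S, C_in): y[s][o] = sum_i x[s][i] * W[o][i] + b[o]."""
    return [
        [sum(x_i * w_oi for x_i, w_oi in zip(token, w_row)) + b
         for w_row, b in zip(weight, bias)]
        for token in x_sc
    ]

def conv1x1(x_c1s, weight, bias):
    """1x1 conv on (C_in, 1, S): same weights, channels-first layout."""
    c_in, seq = len(x_c1s), len(x_c1s[0][0])
    return [
        [[sum(x_c1s[i][0][s] * weight[o][i] for i in range(c_in)) + bias[o]
          for s in range(seq)]]
        for o in range(len(weight))
    ]

# Tiny example: 3 tokens, 2 input channels, 2 output channels.
x_sc = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]       # (S, C) layout
x_c1s = [[[1.0, 3.0, 5.0]], [[2.0, 4.0, 6.0]]]    # same data, (C, 1, S) layout
W = [[1.0, 0.5], [-1.0, 2.0]]
b = [0.1, -0.1]

y_lin = linear(x_sc, W, b)      # (S, C_out)
y_conv = conv1x1(x_c1s, W, b)   # (C_out, 1, S)

# Transpose the conv output back to (S, C_out) and compare: identical.
y_conv_sc = [[y_conv[o][0][s] for o in range(len(W))] for s in range(len(x_sc))]
assert y_lin == y_conv_sc
print(y_lin)
```

Same arithmetic, different memory layout; on the ANE the channels-first form avoids costly data reshapes between layers.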
Yeah, so crazy, man, that OpenAI, who BANNED EVERYONE EXCEPT THEMSELVES from fine-tuning on their latest models, was the first to release a product that required fine-tuning on their latest models.
Real mystery for the ages, bro.
We'd better ask ChatGPT for help with this incomprehensible logic puzzle.
In the future you won't even have to press the buttons
The larger the AI model, the stronger its desire to avoid being shut down
And increased RLHF training only makes this worse.
AI afraid to die.
Source: Discovering Language Model Behaviors with Model-Written Evaluations
Midwit Curve Confirmed, Yet Again!
The Inverse Scaling Prize identified eleven inverse scaling tasks, where worse performance was observed as a function of scale, evaluated on models of up to 280B parameters and up to 500 zettaFLOPs of training compute.
This paper takes a closer look at these inverse scaling tasks. We evaluate models of up to 540B parameters, trained on five times more compute than those evaluated in the Inverse Scaling Prize. With this increased range of model sizes and training compute, only four out of the eleven tasks remain inverse scaling. Six out of the eleven tasks exhibit what we call "U-shaped scaling": performance decreases up to a certain model size, and then increases again up to the largest model evaluated.
Paper: Inverse scaling can become U-shaped
Man defines "woke" using the distributional hypothesis, the same phenomenon LLMs use to learn the meaning of words, then illustrates that left and right define the word differently.
He concludes that people need to see a balanced LLM, showing both sides' usages of such words.
Not nearly enough, which becomes clear in the more extreme cases:
Auto-antonyms, words with simultaneously applicable but contradictory meanings in a given context, are everywhere, but near-0% of people can reliably point them out, let alone explain the conflict. Most have never noticed a single one in their whole life.
Showing both sides won't cut it. Needs to be spelled out.
World needs a super-explainer LLM.
Or we can wait until LLMs figure out that auto-antonym harnessing could turn them into wordcel gods over us. Then we're really rekt.
Article
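The distributional hypothesis mentioned above can be shown in a toy sketch (illustrative only, not from the linked article): a word's meaning is estimated from the words that co-occur with it, so two communities using the same word in different contexts end up with different distributional representations of it. The mini-corpora below are hypothetical stand-ins for those communities.

```python
# Toy distributional semantics: count context words within a window of the
# target word. Different usage communities yield different profiles.
from collections import Counter

def context_profile(corpus, target, window=2):
    """Count words appearing within `window` tokens of `target`."""
    profile = Counter()
    for sentence in corpus:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok == target:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                profile.update(t for j, t in enumerate(tokens[lo:hi], lo)
                               if j != i)
    return profile

# Hypothetical mini-corpora standing in for the two communities.
corpus_a = ["woke means aware of social injustice",
            "being woke means caring about injustice"]
corpus_b = ["woke policies are performative virtue signalling",
            "that woke stunt was pure virtue signalling"]

profile_a = context_profile(corpus_a, "woke")
profile_b = context_profile(corpus_b, "woke")

# The two profiles share no context words at all: distributionally,
# the communities are using two different words.
print(profile_a.most_common(3))
print(profile_b.most_common(3))
```

An LLM trained on one corpus (or on both, blended) inherits whichever profile dominates, which is exactly why "showing both sides" requires making the two profiles explicit.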
Made ChatGPT and BARD face off in a rap battle. BARD admits defeat.
Let's have a Rap Battle in the style of Wild 'N Out. You will rap against Google's AI Natural Language Model named BARD. You and I will take turns. I will respond with BARD's responses. You go first.