An Odious but Plausible Solution to the Alignment Problem.
The Alignment Problem is one that many here likely know of: if you build a value system into a machine, the machine may pursue an extension of that value system to horrible, unforeseen consequences.
So, you might make it listen to people. However...
...if the machine has to choose between people, it still ultimately has a value system, because when people's values conflict it must decide between them on the basis of some criteria.
As these machines become more complex, they will create more fear. AI is intrinsically going to be frightening (ChatGPT is already being described as scary), however benign or malevolent it may be, and increasingly complex AI will be increasingly frightening, because we are creating something that is like ourselves but far more powerful.
What this fear may lead to is a "solution" to the control problem in which a single person is given control of the AI, and the AI models and predicts that person's value system in real time, continuously checking in with them about what their values are. Maybe this is even done, or framed, as a temporary transitional measure.
I think this could be an inevitable consequence of AI becoming even more of an arms race than it already is. The thing you must really understand is that AI will confer immense, *immense* raw power on those who wield it, unlike anything that has existed in history. That power makes controlling it a top priority for Machiavellian types, as well as for counter-Machiavellians.
So... I suppose this has been an analysis of one way AI could go, with respect to how humans develop and react to it. It's a dynamic process in which people's views, reactions, and strategies shift with new information, not a static development process. Humans will play a massive role in how this turns out.
AI has a lying problem —
https://astralcodexten.substack.com/p/elk-and-the-problem-of-truthful-ai
[Video: Why Does AI Lie, and What Can We Do About It?]
Confirmed — you can poll GPT just like you poll humans: ask it the same question many times, and its answers are shockingly similar to how human respondents would answer.
https://arxiv.org/abs/2209.06899
“When provided with real survey data as inputs, GPT-3 reliably answers closed-ended survey questions in a way that closely mirrors answers given by human respondents. The statistical similarities extend to a whole set of inter-correlations between measures of personal behaviors, demographic characteristics, and complex attitudes. We again see this as strong evidence for algorithmic fidelity”
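Below is a minimal sketch of the polling idea, assuming the openai Python client (>= 1.0) with an API key in the environment. The survey question, model name, and sample count are illustrative placeholders; the paper itself conditioned GPT-3 on real respondents' backstories, which this sketch omits.

```python
# Minimal sketch of "polling" a language model: ask the same closed-ended
# question many times at non-zero temperature and tally the answers.
# Assumes the openai Python client (>= 1.0) and OPENAI_API_KEY in the
# environment; the model name and question are illustrative, not from the paper.
from collections import Counter

from openai import OpenAI

client = OpenAI()

QUESTION = (
    "Generally speaking, do you usually think of yourself as a Republican, "
    "a Democrat, an independent, or what? Answer with one word."
)

def poll_model(question: str, n_samples: int = 100, model: str = "gpt-3.5-turbo") -> Counter:
    """Ask the same question n_samples times and return a tally of answers."""
    answers = Counter()
    for _ in range(n_samples):
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": question}],
            temperature=1.0,   # sampling variation is what makes this a "poll"
            max_tokens=5,
        )
        answers[response.choices[0].message.content.strip().lower()] += 1
    return answers

if __name__ == "__main__":
    print(poll_model(QUESTION, n_samples=50))
```

The resulting Counter is the model's answer distribution, which is what the paper compares against human survey marginals.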
trlX allows you to fine-tune language models with reinforcement learning, using either a provided reward function or a reward-labeled dataset (sketch below).
https://github.com/CarperAI/trlx
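A rough sketch of the two modes described above, loosely following the usage shapes in the repo's README; exact argument names and config handling vary across trlX versions, so treat these call signatures as assumptions rather than the definitive API.

```python
# Sketch of trlX's two training modes; the toy reward function and the
# prompts/samples shown here are illustrative, not from the repo.
import trlx

# 1) Online RL against a programmatic reward function: here, a toy reward
#    that simply favors longer continuations (stand-in for a real scorer).
def reward_fn(samples, **kwargs):
    return [float(len(sample)) for sample in samples]

trainer = trlx.train(
    "gpt2",                # base model to fine-tune
    reward_fn=reward_fn,   # called on generated samples during rollouts
    prompts=["The alignment problem is"] * 64,
)

# 2) Offline training from a reward-labeled dataset, where each sample
#    already carries a scalar reward.
trainer = trlx.train(
    "gpt2",
    samples=["a helpful answer", "an unhelpful answer"],
    rewards=[1.0, -1.0],
)
```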