What's Wrong with Artificial Intelligence: From the perspective of Prof. Richard Sutton
I hold that AI has gone astray by neglecting its essential objective --- the turning over of responsibility for the decision-making and organization of the AI system to the AI system itself. It has become an accepted, indeed lauded, form of success in the field to exhibit a complex system that works well primarily because of some insight the designers have had into solving a particular problem. This is part of an anti-theoretic, or "engineering stance", that considers itself open to any way of solving a problem. But whatever the merits of this approach as engineering, it is not really addressing the objective of AI. For AI it is not enough merely to achieve a better system; it matters how the system was made. The reason it matters can ultimately be considered a practical one, one of scaling. An AI system too reliant on manual tuning, for example, will not be able to scale past what can be held in the heads of a few programmers. This, it seems to me, is essentially the situation we are in today in AI. Our AI systems are limited because we have failed to turn over responsibility for them to them.
Please forgive me for this which must seem a rather broad and vague criticism of AI. One way to proceed would be to detail the criticism with regard to more specific subfields or subparts of AI. But rather than narrowing the scope, let us first try to go the other way. Let us try to talk in general about the longer-term goals of AI which we can share and agree on. In broadest outlines, I think we all envision systems which can ultimately incorporate large amounts of world knowledge. This means knowing things like how to move around, what a bagel looks like, that people have feet, etc. And knowing these things just means that they can be combined flexibly, in a variety of combinations, to achieve whatever are the goals of the AI. If hungry, for example, perhaps the AI can combine its bagel recognizer with its movement knowledge, in some sense, so as to approach and consume the bagel. This is a cartoon view of AI -- as knowledge plus its flexible combination -- but it suffices as a good place to start. Note that it already places us beyond the goals of a pure performance system. We seek knowledge that can be used flexibly, i.e., in several different ways, and at least somewhat independently of its expected initial use.
With respect to this cartoon view of AI, my concern is simply with ensuring the correctness of the AI's knowledge. There is a lot of knowledge, and inevitably some of it will be incorrrect. Who is responsible for maintaining correctness, people or the machine? I think we would all agree that, as much as possible, we would like the AI system to somehow maintain its own knowledge, thus relieving us of a major burden. But it is hard to see how this might be done; easier to simply fix the knowledge ourselves. This is where we are today.
Date: November 12, 2001
https://incompleteideas.net/IncIdeas/WrongWithAI.html
#artificial_intelligence
I hold that AI has gone astray by neglecting its essential objective --- the turning over of responsibility for the decision-making and organization of the AI system to the AI system itself. It has become an accepted, indeed lauded, form of success in the field to exhibit a complex system that works well primarily because of some insight the designers have had into solving a particular problem. This is part of an anti-theoretic, or "engineering stance", that considers itself open to any way of solving a problem. But whatever the merits of this approach as engineering, it is not really addressing the objective of AI. For AI it is not enough merely to achieve a better system; it matters how the system was made. The reason it matters can ultimately be considered a practical one, one of scaling. An AI system too reliant on manual tuning, for example, will not be able to scale past what can be held in the heads of a few programmers. This, it seems to me, is essentially the situation we are in today in AI. Our AI systems are limited because we have failed to turn over responsibility for them to them.
Please forgive me for this which must seem a rather broad and vague criticism of AI. One way to proceed would be to detail the criticism with regard to more specific subfields or subparts of AI. But rather than narrowing the scope, let us first try to go the other way. Let us try to talk in general about the longer-term goals of AI which we can share and agree on. In broadest outlines, I think we all envision systems which can ultimately incorporate large amounts of world knowledge. This means knowing things like how to move around, what a bagel looks like, that people have feet, etc. And knowing these things just means that they can be combined flexibly, in a variety of combinations, to achieve whatever are the goals of the AI. If hungry, for example, perhaps the AI can combine its bagel recognizer with its movement knowledge, in some sense, so as to approach and consume the bagel. This is a cartoon view of AI -- as knowledge plus its flexible combination -- but it suffices as a good place to start. Note that it already places us beyond the goals of a pure performance system. We seek knowledge that can be used flexibly, i.e., in several different ways, and at least somewhat independently of its expected initial use.
With respect to this cartoon view of AI, my concern is simply with ensuring the correctness of the AI's knowledge. There is a lot of knowledge, and inevitably some of it will be incorrrect. Who is responsible for maintaining correctness, people or the machine? I think we would all agree that, as much as possible, we would like the AI system to somehow maintain its own knowledge, thus relieving us of a major burden. But it is hard to see how this might be done; easier to simply fix the knowledge ourselves. This is where we are today.
Date: November 12, 2001
https://incompleteideas.net/IncIdeas/WrongWithAI.html
#artificial_intelligence
Crafting Papers on Machine Learning
This paper provides some useful hints and advice for preparing machine learning papers. Besides, consider that it is not meant to cover all types of papers.
https://icml.cc/Conferences/2002/craft.html
#machine_learning #writing
This paper provides some useful hints and advice for preparing machine learning papers. Besides, consider that it is not meant to cover all types of papers.
https://icml.cc/Conferences/2002/craft.html
#machine_learning #writing
Model-based evolutionary algorithms: a short survey
Abstract: The evolutionary algorithms (EAs) are a family of nature-inspired algorithms widely used for solving complex optimization problems. Since the operators (e.g. crossover, mutation, selection) in most traditional EAs are developed on the basis of fixed heuristic rules or strategies, they are unable to learn the structures or properties of the problems to be optimized. To equip the EAs with learning abilities, recently, various model-based evolutionary algorithms (MBEAs) have been proposed. This survey briefly reviews some representative MBEAs by considering three different motivations of using models. First, the most commonly seen motivation of using models is to estimate the distribution of the candidate solutions. Second, in evolutionary multi-objective optimization, one motivation of using models is to build the inverse models from the objective space to the decision space. Third, when solving computationally expensive problems, models can be used as surrogates of the fitness functions. Based on the review, some further discussions are also given.
https://link.springer.com/article/10.1007/s40747-018-0080-1
#evolutionary_algorithm #machine_learning
Abstract: The evolutionary algorithms (EAs) are a family of nature-inspired algorithms widely used for solving complex optimization problems. Since the operators (e.g. crossover, mutation, selection) in most traditional EAs are developed on the basis of fixed heuristic rules or strategies, they are unable to learn the structures or properties of the problems to be optimized. To equip the EAs with learning abilities, recently, various model-based evolutionary algorithms (MBEAs) have been proposed. This survey briefly reviews some representative MBEAs by considering three different motivations of using models. First, the most commonly seen motivation of using models is to estimate the distribution of the candidate solutions. Second, in evolutionary multi-objective optimization, one motivation of using models is to build the inverse models from the objective space to the decision space. Third, when solving computationally expensive problems, models can be used as surrogates of the fitness functions. Based on the review, some further discussions are also given.
https://link.springer.com/article/10.1007/s40747-018-0080-1
#evolutionary_algorithm #machine_learning
Complex & Intelligent Systems
Model-based evolutionary algorithms: a short...
Complex & Intelligent Systems - The evolutionary algorithms (EAs) are a family of nature-inspired algorithms widely used for solving complex optimization problems. Since the operators (e.g....
At the Interface of Algebra and Statistics
Abstract: This thesis takes inspiration from quantum physics to investigate mathematical structure that lies at the interface of algebra and statistics. The starting point is a passage from classical probability theory to quantum probability theory. The quantum version of a probability distribution is a density operator, the quantum version of marginalizing is an operation called the partial trace, and the quantum version of a marginal probability distribution is a reduced density operator. Every joint probability distribution on a finite set can be modeled as a rank one density operator. By applying the partial trace, we obtain reduced density operators whose diagonals recover classical marginal probabilities. In general, these reduced densities will have rank higher than one, and their eigenvalues and eigenvectors will contain extra information that encodes subsystem interactions governed by statistics. We decode this information, and show it is akin to conditional probability, and then investigate the extent to which the eigenvectors capture "concepts" inherent in the original joint distribution. The theory is then illustrated with an experiment that exploits these ideas. Turning to a more theoretical application, we also discuss a preliminary framework for modeling entailment and concept hierarchy in natural language, namely, by representing expressions in the language as densities. Finally, initial inspiration for this thesis comes from formal concept analysis, which finds many striking parallels with the linear algebra. The parallels are not coincidental, and a common blueprint is found in category theory. We close with an exposition on free (co)completions and how the free-forgetful adjunctions in which they arise strongly suggest that in certain categorical contexts, the "fixed points" of a morphism with its adjoint encode interesting information.
Introductory Video: https://youtu.be/wiadG3ywJIs
Thesis: https://arxiv.org/abs/2004.05631
#statistics #machine_learning #algebra #quantum_physics
Abstract: This thesis takes inspiration from quantum physics to investigate mathematical structure that lies at the interface of algebra and statistics. The starting point is a passage from classical probability theory to quantum probability theory. The quantum version of a probability distribution is a density operator, the quantum version of marginalizing is an operation called the partial trace, and the quantum version of a marginal probability distribution is a reduced density operator. Every joint probability distribution on a finite set can be modeled as a rank one density operator. By applying the partial trace, we obtain reduced density operators whose diagonals recover classical marginal probabilities. In general, these reduced densities will have rank higher than one, and their eigenvalues and eigenvectors will contain extra information that encodes subsystem interactions governed by statistics. We decode this information, and show it is akin to conditional probability, and then investigate the extent to which the eigenvectors capture "concepts" inherent in the original joint distribution. The theory is then illustrated with an experiment that exploits these ideas. Turning to a more theoretical application, we also discuss a preliminary framework for modeling entailment and concept hierarchy in natural language, namely, by representing expressions in the language as densities. Finally, initial inspiration for this thesis comes from formal concept analysis, which finds many striking parallels with the linear algebra. The parallels are not coincidental, and a common blueprint is found in category theory. We close with an exposition on free (co)completions and how the free-forgetful adjunctions in which they arise strongly suggest that in certain categorical contexts, the "fixed points" of a morphism with its adjoint encode interesting information.
Introductory Video: https://youtu.be/wiadG3ywJIs
Thesis: https://arxiv.org/abs/2004.05631
#statistics #machine_learning #algebra #quantum_physics
YouTube
At the Interface of Algebra and Statistics
This video is a nontechnical introduction to my PhD thesis, which uses basic tools from quantum physics to investigate algebraic and statistical mathematical structure.
"At the Interface of Algebra and Statistics"
available on the arXiv at https://arxi…
"At the Interface of Algebra and Statistics"
available on the arXiv at https://arxi…
Tips for Publishing Research Code
This repository represents several important tips for releasing research code in Machine Learning (with official NeurIPS 2020 recommendations). These recommendations have been gathered based on analysis of more than 200 Machine Learning repositories, these recommendations facilitate reproducibility and correlate with GitHub stars - for more details, see our our blog post.
https://github.com/paperswithcode/releasing-research-code
#research_paper #machine_learning #neurIPS
This repository represents several important tips for releasing research code in Machine Learning (with official NeurIPS 2020 recommendations). These recommendations have been gathered based on analysis of more than 200 Machine Learning repositories, these recommendations facilitate reproducibility and correlate with GitHub stars - for more details, see our our blog post.
https://github.com/paperswithcode/releasing-research-code
#research_paper #machine_learning #neurIPS
GitHub
GitHub - paperswithcode/releasing-research-code: Tips for releasing research code in Machine Learning (with official NeurIPS 2020…
Tips for releasing research code in Machine Learning (with official NeurIPS 2020 recommendations) - paperswithcode/releasing-research-code
A cool 3D representation of the structure of BERT language model
Blogpost: https://peltarion.com/knowledge-center/documentation/modeling-view/build-an-ai-model/blocks/bert-encoder
#NLP #deep_learning
Blogpost: https://peltarion.com/knowledge-center/documentation/modeling-view/build-an-ai-model/blocks/bert-encoder
#NLP #deep_learning
Critique of Honda Prize for Dr. Hinton
Summary: Hinton has made significant contributions to artificial neural networks (NNs) and deep learning, but Honda credits him for fundamental inventions of others whom he did not cite. Science must not allow corporate PR to distort the academic record. Sec. I: Modern backpropagation was created by Linnainmaa (1970), not by Rumelhart & Hinton & Williams (1985). Ivakhnenko's deep feedforward nets (since 1965) learned internal representations long before Hinton's shallower ones (1980s). Sec. II: Hinton's unsupervised pre-training for deep NNs in the 2000s was conceptually a rehash of my unsupervised pre-training for deep NNs in 1991. And it was irrelevant for the deep learning revolution of the early 2010s which was mostly based on supervised learning - twice my lab spearheaded the shift from unsupervised pre-training to pure supervised learning (1991-95 and 2006-11). Sec. III: The first superior end-to-end neural speech recognition was based on two methods from my lab: LSTM (1990s-2005) and CTC (2006). Hinton et al. (2012) still used an old hybrid approach of the 1980s and 90s, and did not compare it to the revolutionary CTC-LSTM (which was soon on most smartphones). Sec. IV: Our group at IDSIA had superior award-winning computer vision through deep learning (2011) before Hinton's (2012). Sec. V: Hanson (1990) had a variant of "dropout" long before Hinton (2012). Sec. VI: In the 2010s, most major AI-based services across the world (speech recognition, language translation, etc.) on billions of devices were mostly based on our deep learning techniques, not on Hinton's. Repeatedly, Hinton omitted references to fundamental prior art (Sec. I & II & III & V). However, as Elvis Presley put it, "Truth is like the sun. You can shut it out for a time, but it ain't goin' away."
https://people.idsia.ch/~juergen/critique-honda-prize-hinton.html
#deep_learning
Summary: Hinton has made significant contributions to artificial neural networks (NNs) and deep learning, but Honda credits him for fundamental inventions of others whom he did not cite. Science must not allow corporate PR to distort the academic record. Sec. I: Modern backpropagation was created by Linnainmaa (1970), not by Rumelhart & Hinton & Williams (1985). Ivakhnenko's deep feedforward nets (since 1965) learned internal representations long before Hinton's shallower ones (1980s). Sec. II: Hinton's unsupervised pre-training for deep NNs in the 2000s was conceptually a rehash of my unsupervised pre-training for deep NNs in 1991. And it was irrelevant for the deep learning revolution of the early 2010s which was mostly based on supervised learning - twice my lab spearheaded the shift from unsupervised pre-training to pure supervised learning (1991-95 and 2006-11). Sec. III: The first superior end-to-end neural speech recognition was based on two methods from my lab: LSTM (1990s-2005) and CTC (2006). Hinton et al. (2012) still used an old hybrid approach of the 1980s and 90s, and did not compare it to the revolutionary CTC-LSTM (which was soon on most smartphones). Sec. IV: Our group at IDSIA had superior award-winning computer vision through deep learning (2011) before Hinton's (2012). Sec. V: Hanson (1990) had a variant of "dropout" long before Hinton (2012). Sec. VI: In the 2010s, most major AI-based services across the world (speech recognition, language translation, etc.) on billions of devices were mostly based on our deep learning techniques, not on Hinton's. Repeatedly, Hinton omitted references to fundamental prior art (Sec. I & II & III & V). However, as Elvis Presley put it, "Truth is like the sun. You can shut it out for a time, but it ain't goin' away."
https://people.idsia.ch/~juergen/critique-honda-prize-hinton.html
#deep_learning
people.idsia.ch
Critique of Honda Prize for Dr. Hinton
Honda credits Hinton for inventions of others whom he did not cite. Science must not allow corporate PR to distort the academic record.
The Cost of Training NLP Models: A Concise Overview
Abstract: We review the cost of training large-scale language models, and the drivers of these costs. The intended audience includes engineers and scientists budgeting their model-training experiments, as well as non-practitioners trying to make sense of the economics of modern-day Natural Language Processing (NLP).
https://arxiv.org/abs/2004.08900
#nlp #deep_learning
Abstract: We review the cost of training large-scale language models, and the drivers of these costs. The intended audience includes engineers and scientists budgeting their model-training experiments, as well as non-practitioners trying to make sense of the economics of modern-day Natural Language Processing (NLP).
https://arxiv.org/abs/2004.08900
#nlp #deep_learning
Grad-CAM++: Generalized Gradient-based Visual Explanations for Deep Convolutional Networks
Abstract: Over the last decade, Convolutional Neural Network (CNN) models have been highly successful in solving complex vision based problems. However, deep models are perceived as "black box" methods considering the lack of understanding of their internal functioning. There has been a significant recent interest to develop explainable deep learning models, and this paper is an effort in this direction. Building on a recently proposed method called Grad-CAM, we propose Grad-CAM++ to provide better visual explanations of CNN model predictions (when compared to Grad-CAM), in terms of better localization of objects as well as explaining occurrences of multiple objects of a class in a single image. We provide a mathematical explanation for the proposed method, Grad-CAM++, which uses a weighted combination of the positive partial derivatives of the last convolutional layer feature maps with respect to a specific class score as weights to generate a visual explanation for the class label under consideration. Our extensive experiments and evaluations, both subjective and objective, on standard datasets showed that Grad-CAM++ indeed provides better visual explanations for a given CNN architecture when compared to Grad-CAM.
https://arxiv.org/pdf/1710.11063.pdf
#deep_learning #computer_vision
Abstract: Over the last decade, Convolutional Neural Network (CNN) models have been highly successful in solving complex vision based problems. However, deep models are perceived as "black box" methods considering the lack of understanding of their internal functioning. There has been a significant recent interest to develop explainable deep learning models, and this paper is an effort in this direction. Building on a recently proposed method called Grad-CAM, we propose Grad-CAM++ to provide better visual explanations of CNN model predictions (when compared to Grad-CAM), in terms of better localization of objects as well as explaining occurrences of multiple objects of a class in a single image. We provide a mathematical explanation for the proposed method, Grad-CAM++, which uses a weighted combination of the positive partial derivatives of the last convolutional layer feature maps with respect to a specific class score as weights to generate a visual explanation for the class label under consideration. Our extensive experiments and evaluations, both subjective and objective, on standard datasets showed that Grad-CAM++ indeed provides better visual explanations for a given CNN architecture when compared to Grad-CAM.
https://arxiv.org/pdf/1710.11063.pdf
#deep_learning #computer_vision
Towards Biologically Plausible Deep Learning
Abstract: Neuroscientists have long criticized deep learning algorithms as incompatible with current knowledge of neurobiology. We explore more biologically plausible versions of deep representation learning, focusing here mostly on unsupervised learning but developing a learning mechanism that could account for supervised, unsupervised and reinforcement learning. The starting point is that the basic learning rule believed to govern synaptic weight updates (Spike-Timing-Dependent Plasticity) arises out of a simple update rule that makes a lot of sense from a machine learning point of view and can be interpreted as gradient descent on some objective function so long as the neuronal dynamics push firing rates towards better values of the objective function (be it supervised, unsupervised, or reward-driven). The second main idea is that this corresponds to a form of the variational EM algorithm, i.e., with approximate rather than exact posteriors, implemented by neural dynamics. Another contribution of this paper is that the gradients required for updating the hidden states in the above variational interpretation can be estimated using an approximation that only requires propagating activations forward and backward, with pairs of layers learning to form a denoising auto-encoder. Finally, we extend the theory about the probabilistic interpretation of auto-encoders to justify improved sampling schemes based on the generative interpretation of denoising auto-encoders, and we validate all these ideas on generative learning tasks.
https://arxiv.org/abs/1502.04156
#deep_learning #neuroscience
Abstract: Neuroscientists have long criticized deep learning algorithms as incompatible with current knowledge of neurobiology. We explore more biologically plausible versions of deep representation learning, focusing here mostly on unsupervised learning but developing a learning mechanism that could account for supervised, unsupervised and reinforcement learning. The starting point is that the basic learning rule believed to govern synaptic weight updates (Spike-Timing-Dependent Plasticity) arises out of a simple update rule that makes a lot of sense from a machine learning point of view and can be interpreted as gradient descent on some objective function so long as the neuronal dynamics push firing rates towards better values of the objective function (be it supervised, unsupervised, or reward-driven). The second main idea is that this corresponds to a form of the variational EM algorithm, i.e., with approximate rather than exact posteriors, implemented by neural dynamics. Another contribution of this paper is that the gradients required for updating the hidden states in the above variational interpretation can be estimated using an approximation that only requires propagating activations forward and backward, with pairs of layers learning to form a denoising auto-encoder. Finally, we extend the theory about the probabilistic interpretation of auto-encoders to justify improved sampling schemes based on the generative interpretation of denoising auto-encoders, and we validate all these ideas on generative learning tasks.
https://arxiv.org/abs/1502.04156
#deep_learning #neuroscience
arXiv.org
Towards Biologically Plausible Deep Learning
Neuroscientists have long criticised deep learning algorithms as incompatible with current knowledge of neurobiology. We explore more biologically plausible versions of deep representation...
The Notorious Difficulty of Comparing Human and Machine Perception
Abstract: With the rise of machines to human-level performance in complex recognition tasks, a growing amount of work is directed towards comparing information processing in humans and machines. These works have the potential to deepen our understanding of the inner mechanisms of human perception and to improve machine learning. Drawing robust conclusions from comparison studies, however, turns out to be difficult. Here, we highlight common shortcomings that can easily lead to fragile conclusions. First, if a model does achieve high performance on a task similar to humans, its decision-making process is not necessarily human-like. Moreover, further analyses can reveal differences. Second, the performance of neural networks is sensitive to training procedures and architectural details. Thus, generalizing conclusions from specific architectures is difficult. Finally, when comparing humans and machines, equivalent experimental settings are crucial in order to identify innate differences. Addressing these shortcomings alters or refines the conclusions of studies. We show that, despite their ability to solve closed-contour tasks, our neural networks use different decision-making strategies than humans. We further show that there is no fundamental difference between same-different and spatial tasks for common feed-forward neural networks and finally, that neural networks do experience a "recognition gap" on minimal recognizable images. All in all, care has to be taken to not impose our human systematic bias when comparing human and machine perception.
https://arxiv.org/abs/2004.09406
#machine_learning
Abstract: With the rise of machines to human-level performance in complex recognition tasks, a growing amount of work is directed towards comparing information processing in humans and machines. These works have the potential to deepen our understanding of the inner mechanisms of human perception and to improve machine learning. Drawing robust conclusions from comparison studies, however, turns out to be difficult. Here, we highlight common shortcomings that can easily lead to fragile conclusions. First, if a model does achieve high performance on a task similar to humans, its decision-making process is not necessarily human-like. Moreover, further analyses can reveal differences. Second, the performance of neural networks is sensitive to training procedures and architectural details. Thus, generalizing conclusions from specific architectures is difficult. Finally, when comparing humans and machines, equivalent experimental settings are crucial in order to identify innate differences. Addressing these shortcomings alters or refines the conclusions of studies. We show that, despite their ability to solve closed-contour tasks, our neural networks use different decision-making strategies than humans. We further show that there is no fundamental difference between same-different and spatial tasks for common feed-forward neural networks and finally, that neural networks do experience a "recognition gap" on minimal recognizable images. All in all, care has to be taken to not impose our human systematic bias when comparing human and machine perception.
https://arxiv.org/abs/2004.09406
#machine_learning
AlphaGo - The Movie | Full Documentary
Summary: with more board configurations than there are atoms in the universe, the ancient Chinese game of Go has long been considered a grand challenge for artificial intelligence. On March 9, 2016, the worlds of Go and artificial intelligence collided in South Korea for an extraordinary best-of-five-game competition, coined The DeepMind Challenge Match. Hundreds of millions of people around the world watched as a legendary Go master took on an unproven AI challenger for the first time in history.
https://www.youtube.com/watch?v=WXuK6gekU1Y
#artificial_intelligence #reinforcement_learning #deep_learning
Summary: with more board configurations than there are atoms in the universe, the ancient Chinese game of Go has long been considered a grand challenge for artificial intelligence. On March 9, 2016, the worlds of Go and artificial intelligence collided in South Korea for an extraordinary best-of-five-game competition, coined The DeepMind Challenge Match. Hundreds of millions of people around the world watched as a legendary Go master took on an unproven AI challenger for the first time in history.
https://www.youtube.com/watch?v=WXuK6gekU1Y
#artificial_intelligence #reinforcement_learning #deep_learning
YouTube
AlphaGo - The Movie | Full award-winning documentary
With more board configurations than there are atoms in the universe, the ancient Chinese game of Go has long been considered a grand challenge for artificial intelligence.
On March 9, 2016, the worlds of Go and artificial intelligence collided in South…
On March 9, 2016, the worlds of Go and artificial intelligence collided in South…
Universal Intelligence: A Definition of Machine Intelligence
Abstract: A fundamental problem in artificial intelligence is that nobody really knows what intelligence is. The problem is especially acute when we need to consider artificial systems which are significantly different to humans. In this paper we approach this problem in the following way: We take a number of well known informal definitions of human intelligence that have been given by experts, and extract their essential features. These are then mathematically formalised to produce a general measure of intelligence for arbitrary machines. We believe that this equation formally captures the concept of machine intelligence in the broadest reasonable sense. We then show how this formal definition is related to the theory of universal optimal learning agents. Finally, we survey the many other tests and definitions of intelligence that have been proposed for machines.
https://arxiv.org/pdf/0712.3329.pdf
#artificial_intelligence
Abstract: A fundamental problem in artificial intelligence is that nobody really knows what intelligence is. The problem is especially acute when we need to consider artificial systems which are significantly different to humans. In this paper we approach this problem in the following way: We take a number of well known informal definitions of human intelligence that have been given by experts, and extract their essential features. These are then mathematically formalised to produce a general measure of intelligence for arbitrary machines. We believe that this equation formally captures the concept of machine intelligence in the broadest reasonable sense. We then show how this formal definition is related to the theory of universal optimal learning agents. Finally, we survey the many other tests and definitions of intelligence that have been proposed for machines.
https://arxiv.org/pdf/0712.3329.pdf
#artificial_intelligence
An Approximation of the Universal Intelligence Measure
Abstract: The Universal Intelligence Measure is a recently proposed formal definition of intelligence. It is mathematically specified, extremely general, and captures the essence of many informal definitions of intelligence. It is based on Hutter’s Universal Artificial Intelligence theory, an extension of Ray Solomonoff’s pioneering work on universal induction. Since the Universal Intelligence Measure is only asymptotically computable, building a practical intelligence test from it is not straightforward. This paper studies the practical issues involved in developing a real-world UIM-based performance metric. Based on our investigation, we develop a prototype implementation which we use to evaluate a number of different artificial agents.
https://arxiv.org/pdf/1109.5951v2.pdf
#artificial_intelligence
Abstract: The Universal Intelligence Measure is a recently proposed formal definition of intelligence. It is mathematically specified, extremely general, and captures the essence of many informal definitions of intelligence. It is based on Hutter’s Universal Artificial Intelligence theory, an extension of Ray Solomonoff’s pioneering work on universal induction. Since the Universal Intelligence Measure is only asymptotically computable, building a practical intelligence test from it is not straightforward. This paper studies the practical issues involved in developing a real-world UIM-based performance metric. Based on our investigation, we develop a prototype implementation which we use to evaluate a number of different artificial agents.
https://arxiv.org/pdf/1109.5951v2.pdf
#artificial_intelligence
Keras Website Has been Updated
Quote from its developer (François Chollet): Keras has a new website, which includes a 100% refreshed list of developer guides and code examples.
https://keras.io/
#deep_learning #programming
Quote from its developer (François Chollet): Keras has a new website, which includes a 100% refreshed list of developer guides and code examples.
https://keras.io/
#deep_learning #programming
keras.io
Keras: Deep Learning for humans
Keras documentation
Ilya Sutskever: Deep Learning
Brief Biography: Ilya Sutskever is the co-founder of OpenAI, is one of the most cited computer scientist in history with over 165,000 citations, and to me, is one of the most brilliant and insightful minds ever in the field of deep learning. There are very few people in this world who I would rather talk to and brainstorm with about deep learning, intelligence, and life than Ilya, on and off the mic.
https://www.youtube.com/watch?v=13CZPWmke6A
#deep_learning #artificial_intelligence #reinforcement_learning
Brief Biography: Ilya Sutskever is the co-founder of OpenAI, is one of the most cited computer scientist in history with over 165,000 citations, and to me, is one of the most brilliant and insightful minds ever in the field of deep learning. There are very few people in this world who I would rather talk to and brainstorm with about deep learning, intelligence, and life than Ilya, on and off the mic.
https://www.youtube.com/watch?v=13CZPWmke6A
#deep_learning #artificial_intelligence #reinforcement_learning
YouTube
Ilya Sutskever: Deep Learning | Lex Fridman Podcast #94
Ilya Sutskever is the co-founder of OpenAI, is one of the most cited computer scientist in history with over 165,000 citations, and to me, is one of the most brilliant and insightful minds ever in the field of deep learning. There are very few people in this…
Unsupervised learning models of primary cortical receptive fields and receptive field plasticity
Abstract: The efficient coding hypothesis holds that neural receptive fields are adapted to the statistics of the environment, but is agnostic to the timescale of this adaptation, which occurs on both evolutionary and developmental timescales. In this work we focus on that component of adaptation which occurs during an organism's lifetime, and show that a number of unsupervised feature learning algorithms can account for features of normal receptive field properties across multiple primary sensory cortices. Furthermore, we show that the same algorithms account for altered receptive field properties in response to experimentally altered environmental statistics. Based on these modeling results we propose these models as phenomenological models of receptive field plasticity during an organism's lifetime. Finally, due to the success of the same models in multiple sensory areas, we suggest that these algorithms may provide a constructive realization of the theory, first proposed by Mountcastle (1978), that a qualitatively similar learning algorithm acts throughout primary sensory cortices.
https://papers.nips.cc/paper/4331-unsupervised-learning-models-of-primary-cortical-receptive-fields-and-receptive-field-plasticity
#neural_network #neuroscience
Abstract: The efficient coding hypothesis holds that neural receptive fields are adapted to the statistics of the environment, but is agnostic to the timescale of this adaptation, which occurs on both evolutionary and developmental timescales. In this work we focus on that component of adaptation which occurs during an organism's lifetime, and show that a number of unsupervised feature learning algorithms can account for features of normal receptive field properties across multiple primary sensory cortices. Furthermore, we show that the same algorithms account for altered receptive field properties in response to experimentally altered environmental statistics. Based on these modeling results we propose these models as phenomenological models of receptive field plasticity during an organism's lifetime. Finally, due to the success of the same models in multiple sensory areas, we suggest that these algorithms may provide a constructive realization of the theory, first proposed by Mountcastle (1978), that a qualitatively similar learning algorithm acts throughout primary sensory cortices.
https://papers.nips.cc/paper/4331-unsupervised-learning-models-of-primary-cortical-receptive-fields-and-receptive-field-plasticity
#neural_network #neuroscience
papers.nips.cc
Unsupervised learning models of primary cortical receptive fields and receptive field plasticity
Electronic Proceedings of Neural Information Processing Systems
Reinforcement learning in the brain
Abstract: A wealth of research focuses on the decision-making processes that animals and humans employ when selecting actions in the face of reward and punishment. Initially such work stemmed from psychological investigations of conditioned behavior, and explanations of these in terms of computational models. Increasingly, analysis at the computational level has drawn on ideas from reinforcement learning, which provide a normative framework within which decision-making can be analyzed. More recently, the fruits of these extensive lines of research have made contact with investigations into the neural basis of decision making. Converging evidence now links reinforcement learning to specific neural substrates, assigning them precise computational roles. Specifically, electrophysiological recordings in behaving animals and functional imaging of human decision-making have revealed in the brain the existence of a key reinforcement learning signal, the temporal difference reward prediction error. Here, we first introduce the formal reinforcement learning framework. We then review the multiple lines of evidence linking reinforcement learning to the function of dopaminergic neurons in the mammalian midbrain and to more recent data from human imaging experiments. We further extend the discussion to aspects of learning not associated with phasic dopamine signals, such as learning of goal-directed responding that may not be dopamine-dependent, and learning about the vigor (or rate) with which actions should be performed that has been linked to tonic aspects of dopaminergic signaling. We end with a brief discussion of some of the limitations of the reinforcement learning framework, highlighting questions for future research.
https://psycnet.apa.org/record/2009-07078-003
#reinforcement_learning #neuroscience
Abstract: A wealth of research focuses on the decision-making processes that animals and humans employ when selecting actions in the face of reward and punishment. Initially such work stemmed from psychological investigations of conditioned behavior, and explanations of these in terms of computational models. Increasingly, analysis at the computational level has drawn on ideas from reinforcement learning, which provide a normative framework within which decision-making can be analyzed. More recently, the fruits of these extensive lines of research have made contact with investigations into the neural basis of decision making. Converging evidence now links reinforcement learning to specific neural substrates, assigning them precise computational roles. Specifically, electrophysiological recordings in behaving animals and functional imaging of human decision-making have revealed in the brain the existence of a key reinforcement learning signal, the temporal difference reward prediction error. Here, we first introduce the formal reinforcement learning framework. We then review the multiple lines of evidence linking reinforcement learning to the function of dopaminergic neurons in the mammalian midbrain and to more recent data from human imaging experiments. We further extend the discussion to aspects of learning not associated with phasic dopamine signals, such as learning of goal-directed responding that may not be dopamine-dependent, and learning about the vigor (or rate) with which actions should be performed that has been linked to tonic aspects of dopaminergic signaling. We end with a brief discussion of some of the limitations of the reinforcement learning framework, highlighting questions for future research.
https://psycnet.apa.org/record/2009-07078-003
#reinforcement_learning #neuroscience
A Deep Reinforcement Learning course presented by Stanford University (Winter 2019)
https://web.stanford.edu/class/cs234/index.html
#deep_reinforcement_learning
https://web.stanford.edu/class/cs234/index.html
#deep_reinforcement_learning
probability_cheatsheet.pdf
789.3 KB
Probability Theory Cheat sheet
#probability_theory
#probability_theory