Feature selection — Correlation and P-value
A basic article explaining two key concepts, #correlation and #p_value, with a code example.
Link: https://towardsdatascience.com/feature-selection-correlation-and-p-value-da8921bfb3cf
#novice
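To make the two concepts concrete, here is a minimal sketch that is not taken from the linked article. It computes the Pearson correlation between each feature and the target and estimates a p-value with a permutation test (a swapped-in, dependency-free technique). The toy data, the feature names, and the 0.05 threshold are assumptions made for illustration.

```python
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-sided p-value: share of shuffles whose |r| is at least the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical toy data: one informative feature and one pure-noise feature.
rng = random.Random(42)
target = [float(i) for i in range(30)]
features = {
    "informative": [t * 2.0 + rng.gauss(0, 3) for t in target],
    "noise": [rng.gauss(0, 1) for _ in target],
}

ALPHA = 0.05  # significance threshold, an assumption for this sketch
selected = []
for name, values in features.items():
    r = pearson_r(values, target)
    p = permutation_p_value(values, target)
    print(f"{name}: r={r:+.3f}, p={p:.4f}")
    if p < ALPHA:
        selected.append(name)

print("selected:", selected)
```

A feature is kept only when its correlation with the target is unlikely to arise by chance; in practice a library routine such as a parametric correlation test would usually replace the hand-written permutation loop.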
What does your Spotify music sound like? Data Science with Spotify (Part 1)
An example of a good approach to research. As noted, though, no data is provided for reproducibility; the author may provide data and sample code in the future.
Link: https://towardsdatascience.com/data-science-and-machine-learning-with-spotify-841225bfb5d0
#spotify
Reproducing Imagenet in 18 minutes
The code to reproduce #ImageNet in 18 minutes is posted in the GitHub repo. It actually becomes "ImageNet in 12 minutes" if you use the 74.9% top-1 accuracy target from Chainer's "ImageNet in 15" paper; the last few bits are the hardest.
Link: https://github.com/diux-dev/imagenet18
Ultimate Machine Learning Cheat Sheet
Notes on top-level topics from Stanford's CS 229 by Shervine Amidi and Afshine Amidi:
* Supervised learning
* Unsupervised learning
* Deep learning
* Tips and tricks
* Probability and stats refresher
* Algebra and calculus refresher
Forward this message to your Saved Messages to make sure you won’t lose it.
Repo link: https://github.com/afshinea/stanford-cs-229-machine-learning
#Stanford #cheatsheet
Most recent version of Andrew Ng’s book Machine Learning Yearning
Link: https://gallery.mailchimp.com/dc3a7ef4d750c0abfc19202a3/files/5dd91615-3b3f-4f5d-bbfb-4ebd8608d330/Ng_MLY01_13.pdf
#andrewng #MLYearning
CACTUs: an unsupervised learning algorithm that learns to learn from tasks constructed from unlabeled data. It leads to significantly more effective downstream learning and enables few-shot learning *without* labeled meta-learning datasets.
ArXiv: https://arxiv.org/abs/1810.02334
#cactus #unsupervised
Hitchhiker’s guide to Exploratory Data Analysis
Exploratory Data Analysis is the stage of finding out the distribution of the data, its volume, the number of missing values, and the other characteristics of the available dataset.
Part 1: https://towardsdatascience.com/hitchhikers-guide-to-exploratory-data-analysis-6e8d896d3f7e
Part 2: https://towardsdatascience.com/hitchhikers-guide-to-exploratory-data-analysis-part-2-36ab72201e1d
#ExploratoryDA #novice #entrylevel
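As a rough illustration of that stage, and not code from the linked articles, the sketch below reports volume, missing values, and basic distribution statistics per column. The toy rows and the `summarize` helper are hypothetical; a real analysis would typically use a dataframe library instead.

```python
import statistics
from collections import Counter

# Hypothetical toy dataset; None marks a missing value.
rows = [
    {"age": 34, "city": "Berlin", "income": 52000.0},
    {"age": 29, "city": "Paris", "income": None},
    {"age": None, "city": "Berlin", "income": 61000.0},
    {"age": 45, "city": None, "income": 75000.0},
    {"age": 29, "city": "Rome", "income": 48000.0},
]


def summarize(rows):
    """Return per-column volume, missing counts, and basic distribution info."""
    report = {}
    for col in rows[0].keys():
        present = [r[col] for r in rows if r[col] is not None]
        info = {"count": len(present), "missing": len(rows) - len(present)}
        if present and all(isinstance(v, (int, float)) for v in present):
            # Numeric column: location and spread.
            info.update(
                min=min(present),
                max=max(present),
                mean=statistics.fmean(present),
                median=statistics.median(present),
            )
        else:
            # Categorical column: the two most frequent values.
            info["top"] = Counter(present).most_common(2)
        report[col] = info
    return report


report = summarize(rows)
for col, info in report.items():
    print(col, info)
```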
The Code for Facial Identity in the Primate Brain
This paper showed that facial images can be reconstructed with a simple linear model from the responses of only ~200 visual neurons recorded from a monkey. The approach relies on "face cells" that encode how much a face differs from the average in particular ways ("eigenface dimensions").
https://www.sciencedirect.com/science/article/pii/S009286741730538X
#cv #dl
Zero-Shot Style Transfer in Text Using Recurrent Neural Networks
This is a paper on text style transfer. Example code is available for checking the results.
Paper: https://arxiv.org/pdf/1711.04731.pdf
Code: https://github.com/keithecarlson/Zero-Shot-Style-Transfer
#NLP #seq2seq #dl #rnn
Data Science by ODS.ai 🦜
3 articles on practical #ExploratoryDA in Spark
These articles might be useful for those who are just starting their Hadoop path, as well as for those who want to learn how to create quick-and-dirty dashboards with #Zeppelin.
Links:
https://blog.madhukaraphatak.com/statistical-data-exploration-spark-part-1
https://blog.madhukaraphatak.com/statistical-data-exploration-spark-part-2
https://blog.madhukaraphatak.com/statistical-data-exploration-spark-part-3
#Spark #Hadoop #production #BigData
Continuing the #FeatureEngineering topic:
Understanding Feature Engineering (Part 1) — Continuous Numeric Data
Link: https://towardsdatascience.com/understanding-feature-engineering-part-1-continuous-numeric-data-da4e47099a7b
#Statistics #novice
A great example of how different approaches to feature encoding can influence the results.
Mean (likelihood) encoding for categorical variables with high cardinality and feature interactions: a comprehensive study with Python
Link: https://www.kaggle.com/vprokopev/mean-likelihood-encodings-a-comprehensive-study
#FeatureEngineering #FeatureEncoding #Kaggle
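For readers new to the idea, here is a minimal sketch of mean (likelihood) encoding with additive smoothing. It is not the code from the linked Kaggle study; the function names, the smoothing value, and the toy data are assumptions. Note that fitting and encoding on the same rows leaks the target, so real pipelines usually compute the encoding out-of-fold.

```python
from collections import defaultdict


def fit_mean_encoding(categories, targets, smoothing=10.0):
    """Map each category to a smoothed mean of the target.

    encoded = (category_sum + smoothing * global_mean) / (n + smoothing)
    Rare categories are pulled toward the global mean.
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    mapping = {
        cat: (sums[cat] + smoothing * global_mean) / (counts[cat] + smoothing)
        for cat in counts
    }
    return mapping, global_mean


def transform(categories, mapping, global_mean):
    """Encode categories; unseen ones fall back to the global mean."""
    return [mapping.get(cat, global_mean) for cat in categories]


# Hypothetical training data: a categorical feature and a binary target.
train_city = ["a", "a", "a", "a", "b", "b", "c"]
train_y = [1, 1, 1, 0, 0, 0, 1]

mapping, prior = fit_mean_encoding(train_city, train_y, smoothing=2.0)
print(mapping)
print(transform(["a", "c", "unseen"], mapping, prior))
```

The single category "c" has a raw mean of 1.0, but smoothing shrinks it toward the global mean, which is the main protection against overfitting on high-cardinality features.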
Neural Network Embeddings Explained
How deep learning can represent War and Peace as a vector
An easy-to-read #novice article about #embeddings: essentially, how to represent anything as a vector.
Link: https://towardsdatascience.com/neural-network-embeddings-explained-4d028e6f0526
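To show what "everything as a vector" buys you, the sketch below looks up the nearest neighbour of an item by cosine similarity. The three-dimensional vectors and item names are made up for illustration; real embeddings are learned by a network and have many more dimensions.

```python
import math

# Hypothetical 3-dimensional embeddings; real ones are learned, not hand-written.
embeddings = {
    "war_and_peace": [0.9, 0.1, 0.3],
    "anna_karenina": [0.8, 0.2, 0.4],
    "python_cookbook": [0.1, 0.9, 0.2],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(name, table):
    """Return the other item whose embedding is closest to `name`."""
    query = table[name]
    scores = {
        other: cosine_similarity(query, vec)
        for other, vec in table.items()
        if other != name
    }
    return max(scores, key=scores.get)


print(most_similar("war_and_peace", embeddings))
```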
Building a Recommendation System Using Neural Network Embeddings
And, putting theory to work: embeddings for a recommendation system.
Link: https://towardsdatascience.com/building-a-recommendation-system-using-neural-network-embeddings-1ef92e5c80c9
Test-Driven Data Analysis
TDD (test-driven development) is an approach to software development that treats tests as an essential part of the process. Over the years, TDD has proven valuable for maintaining a good code base and has become a common requirement for long-lived projects.
A test-driven approach can be applied to data analysis too, through reproducible research or TDDA, which the link below describes.
Link: https://www.tdda.info
#tdda
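One way to picture testing applied to data is a set of declared constraints that every incoming dataset must satisfy. The sketch below is a hand-rolled illustration in plain Python and does not use the API of the library behind the link; the constraint format, field names, and rows are all hypothetical.

```python
# Hypothetical constraints for a toy dataset; field names are illustrative only.
CONSTRAINTS = {
    "user_id": {"type": int, "min": 1},
    "age": {"type": int, "min": 0, "max": 120, "nullable": True},
    "plan": {"type": str, "allowed": {"free", "pro"}},
}


def check_constraints(rows, constraints):
    """Return a list of human-readable violations (empty means the data passes)."""
    violations = []
    for i, row in enumerate(rows):
        for field, rule in constraints.items():
            value = row.get(field)
            if value is None:
                if not rule.get("nullable", False):
                    violations.append(f"row {i}: {field} is missing")
                continue
            if not isinstance(value, rule["type"]):
                violations.append(f"row {i}: {field} has type {type(value).__name__}")
                continue
            if "min" in rule and value < rule["min"]:
                violations.append(f"row {i}: {field}={value} is below {rule['min']}")
            if "max" in rule and value > rule["max"]:
                violations.append(f"row {i}: {field}={value} is above {rule['max']}")
            if "allowed" in rule and value not in rule["allowed"]:
                violations.append(f"row {i}: {field}={value!r} is not allowed")
    return violations


good_rows = [
    {"user_id": 1, "age": 34, "plan": "free"},
    {"user_id": 2, "age": None, "plan": "pro"},
]
bad_rows = [
    {"user_id": 0, "age": 34, "plan": "free"},    # user_id below the minimum
    {"user_id": 3, "age": 150, "plan": "trial"},  # age too high, unknown plan
    {"user_id": None, "age": 20, "plan": "pro"},  # required field missing
]

print(check_constraints(good_rows, CONSTRAINTS))
for problem in check_constraints(bad_rows, CONSTRAINTS):
    print(problem)
```

Running such checks in the same test suite as the analysis code means a silently changed input file fails loudly instead of producing a wrong chart.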
Unsupervised Machine Learning of Open Source Russian Twitter Data Reveals Global Scope and Operational Characteristics
An article on a previously undiscovered Russian troll group on Twitter.
Link: https://www.technologyreview.com/s/612252/data-mining-has-revealed-previously-unknown-russian-twitter-troll-campaigns/
ArXiv: https://arxiv.org/abs/1810.01466
#clustering #nlp #twitter
Digging into Airbnb data: reviews sentiments, superhosts, and prices prediction (part1)
Example of #AirBnB data research
Link: https://towardsdatascience.com/digging-into-airbnb-data-reviews-sentiments-superhosts-and-prices-prediction-part1-6c80ccb26c6a
Getting Started with Markov Decision Processes: Reinforcement Learning
Part 2: Explaining the concepts of the Markov Decision Process, Bellman Equation and Policies
Link: https://towardsdatascience.com/getting-started-with-markov-decision-processes-reinforcement-learning-ada7b4572ffb
#rl #reinforcementlearning #markovprocess
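The three concepts fit together in a few lines of value iteration, which repeatedly applies the Bellman optimality update V(s) = max_a Σ p(s'|s,a) [r + γ V(s')] and then reads off a greedy policy. The sketch below is not from the linked article; the two-state MDP, its rewards, and the discount factor are invented for illustration.

```python
# Hypothetical two-state MDP:
# transitions[state][action] = [(probability, next_state, reward), ...]
transitions = {
    "low": {
        "wait": [(1.0, "low", 0.0)],
        "work": [(0.8, "high", 1.0), (0.2, "low", 0.0)],
    },
    "high": {
        "wait": [(1.0, "high", 2.0)],
        "work": [(1.0, "low", 3.0)],
    },
}
GAMMA = 0.9  # discount factor, an assumption for this sketch


def q_value(state, action, values):
    """Expected return of taking `action` in `state`, then following `values`."""
    return sum(
        p * (reward + GAMMA * values[next_state])
        for p, next_state, reward in transitions[state][action]
    )


def value_iteration(tol=1e-9):
    """Apply the Bellman optimality update until the values stop changing."""
    values = {s: 0.0 for s in transitions}
    while True:
        new_values = {
            s: max(q_value(s, a, values) for a in transitions[s])
            for s in transitions
        }
        if max(abs(new_values[s] - values[s]) for s in transitions) < tol:
            return new_values
        values = new_values


values = value_iteration()
# The greedy policy picks, in each state, the action with the highest Q-value.
policy = {
    s: max(transitions[s], key=lambda a: q_value(s, a, values))
    for s in transitions
}
print(values)
print(policy)
```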