Data Science by ODS.ai 🦜

Feature selection — Correlation and P-value

A basic article explaining two key concepts, #correlation and #p_value, with a code example.

Link: https://towardsdatascience.com/feature-selection-correlation-and-p-value-da8921bfb3cf

#novice
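
As a rough illustration of the idea (not the linked article's own code), the sketch below keeps a feature only if its Pearson correlation with the target is statistically significant. It swaps in a permutation test for the p-value so that it runs on the standard library alone; the function names, the toy data, and the 0.05 threshold are all assumptions made for this example.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Two-sided p-value: share of shuffles with |r| at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


def select_features(features, target, alpha=0.05):
    """Keep the names of features whose correlation with the target has p < alpha."""
    return [name for name, values in features.items()
            if permutation_p_value(values, target) < alpha]


# Toy data (made up): "signal" tracks the target, "noise" does not.
rng = random.Random(42)
target = [float(i) for i in range(30)]
features = {
    "signal": [2.0 * t + rng.gauss(0, 1) for t in target],
    "noise": [rng.gauss(0, 1) for _ in target],
}
selected = select_features(features, target)
print(selected)
```

A significant correlation only says the linear relationship is unlikely to be chance; it says nothing about redundancy between features, so a filter like this is usually just a first pass.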

5.33K views, 10:36

Data Science by ODS.ai 🦜

What does your Spotify music sound like? Data Science with Spotify (Part 1)

A good example of how to approach research. As was noted, though, no data is provided for reproducibility; the author may provide data and sample code in the future.

Link: https://towardsdatascience.com/data-science-and-machine-learning-with-spotify-841225bfb5d0

#spotify

4.77K views, 17:43

Data Science by ODS.ai 🦜

Reproducing Imagenet in 18 minutes

The code to reproduce #ImageNet in 18 minutes is posted in the GitHub repo. It actually becomes «Imagenet in 12 minutes» if the target is the 74.9% top-1 accuracy used in Chainer's "Imagenet in 15" paper; the last few bits are the hardest.

Link: https://github.com/diux-dev/imagenet18

4.98K views, 20:35

Data Science by ODS.ai 🦜

Ultimate Machine Learning Cheat Sheet

Notes on top-level topics from Stanford's CS 229 by Shervine Amidi and Afshine Amidi:

* Supervised learning
* Unsupervised learning
* Deep learning
* Tips and tricks
* Probability and stats refresher
* Algebra and calculus refresher

Forward this message to your Saved Messages to make sure you won’t lose it.

Repo link: https://github.com/afshinea/stanford-cs-229-machine-learning

#Stanford #cheatsheet

6.46K views, 17:08

Data Science by ODS.ai 🦜

Most recent version of Andrew Ng’s book Machine Learning Yearning

Link: https://gallery.mailchimp.com/dc3a7ef4d750c0abfc19202a3/files/5dd91615-3b3f-4f5d-bbfb-4ebd8608d330/Ng_MLY01_13.pdf

#andrewng #MLYearning

7.06K views, 19:14

Data Science by ODS.ai 🦜

CACTUs: an unsupervised learning algorithm that learns to learn tasks constructed from unlabeled data. Leads to significantly more effective downstream learning & enables few-shot learning *without* labeled meta-learning datasets

ArXiv: https://arxiv.org/abs/1810.02334

#cactus #unsupervised

5.39K views, 08:47

Data Science by ODS.ai 🦜

Hitchhiker’s guide to Exploratory Data Analysis

Exploratory Data Analysis is the stage of finding out the distribution of the data, its volume, the number of missing values, and the other characteristics of the available dataset.

Part 1: https://towardsdatascience.com/hitchhikers-guide-to-exploratory-data-analysis-6e8d896d3f7e
Part 2: https://towardsdatascience.com/hitchhikers-guide-to-exploratory-data-analysis-part-2-36ab72201e1d

#ExploratoryDA #novice #entrylevel
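
To make those characteristics concrete, here is a minimal standard-library sketch that reports the row count, missing values per column, and a simple distribution summary. The rows, the column names, and the `summarize` helper are invented for illustration and are not taken from the linked guide; in practice a dataframe library offers the same summaries with less code.

```python
from collections import Counter
import statistics

# Toy rows (made up); None marks a missing value.
rows = [
    {"age": 34, "city": "Riga", "income": 52000.0},
    {"age": 29, "city": "Oslo", "income": None},
    {"age": None, "city": "Riga", "income": 61000.0},
    {"age": 41, "city": None, "income": 48000.0},
    {"age": 29, "city": "Oslo", "income": 75000.0},
]


def summarize(rows):
    """Return per-column counts of missing values plus simple distribution stats."""
    columns = rows[0].keys()
    summary = {}
    for col in columns:
        values = [row[col] for row in rows]
        present = [v for v in values if v is not None]
        info = {"missing": len(values) - len(present)}
        if present and all(isinstance(v, (int, float)) for v in present):
            # Numeric column: location and spread.
            info.update(
                min=min(present),
                max=max(present),
                mean=statistics.mean(present),
                median=statistics.median(present),
            )
        else:
            # Categorical column: frequency of each value.
            info["counts"] = dict(Counter(present))
        summary[col] = info
    return summary


summary = summarize(rows)
print("rows:", len(rows))
for col, info in summary.items():
    print(col, info)
```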

6.23K views, 23:53

Data Science by ODS.ai 🦜

The Code for Facial Identity in the Primate Brain

This paper shows that facial images can be reconstructed with a simple linear model from the responses of only ~200 visual neurons recorded from a monkey. The approach relies on "face cells", which encode how much a face differs from the average in particular ways ("eigenface dimensions").

https://www.sciencedirect.com/science/article/pii/S009286741730538X

#cv #dl

5.28K views, 21:13

Data Science by ODS.ai 🦜

Zero-Shot Style Transfer in Text Using Recurrent Neural Networks

This is an article on text style transfer. Example code is available to check the results.

Paper: https://arxiv.org/pdf/1711.04731.pdf
Code: https://github.com/keithecarlson/Zero-Shot-Style-Transfer

#NLP #seq2seq #dl #rnn

5.5K views, 12:12

Data Science by ODS.ai 🦜

3 articles on practical #ExploratoryDA in Spark

These articles might be useful for those who are just starting their Hadoop path, as well as for those who want to learn how to create quick-and-dirty dashboards with #Zeppelin.

Links:
https://blog.madhukaraphatak.com/statistical-data-exploration-spark-part-1
https://blog.madhukaraphatak.com/statistical-data-exploration-spark-part-2
https://blog.madhukaraphatak.com/statistical-data-exploration-spark-part-3

#Spark #Hadoop #production #BigData

6.06K views, 17:47

Data Science by ODS.ai 🦜

To continue the #FeatureEngineering topic:

Understanding Feature Engineering (Part 1) — Continuous Numeric Data

Link: https://towardsdatascience.com/understanding-feature-engineering-part-1-continuous-numeric-data-da4e47099a7b

#Statistics #novice

6.58K views, 23:50

Data Science by ODS.ai 🦜

A great example of how different approaches to feature encoding can influence the results.

Mean (likelihood) encoding for categorical variables with high cardinality and feature interactions: a comprehensive study with Python

Link: https://www.kaggle.com/vprokopev/mean-likelihood-encodings-a-comprehensive-study

#FeatureEngineering #FeatureEncoding #Kaggle
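
For readers new to the technique, a minimal sketch of mean (likelihood) encoding follows: each category is replaced by a smoothed mean of the target, computed out of fold so that a row never sees its own label. This is not the Kaggle kernel's code; the smoothing formula, the fold assignment by index, and all names and data are assumptions chosen to keep the example short.

```python
from collections import defaultdict


def fit_mean_encoding(categories, targets, smoothing=10.0):
    """Map each category to a smoothed mean of the target.

    Rare categories are pulled toward the global mean:
        (sum_of_targets + smoothing * global_mean) / (count + smoothing)
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    encoding = {
        cat: (sums[cat] + smoothing * global_mean) / (counts[cat] + smoothing)
        for cat in counts
    }
    return encoding, global_mean


def out_of_fold_encode(categories, targets, n_folds=3, smoothing=10.0):
    """Encode each row using statistics from the other folds only.

    This avoids leaking a row's own target into its feature.
    """
    encoded = [0.0] * len(categories)
    for fold in range(n_folds):
        train_idx = [i for i in range(len(categories)) if i % n_folds != fold]
        valid_idx = [i for i in range(len(categories)) if i % n_folds == fold]
        encoding, global_mean = fit_mean_encoding(
            [categories[i] for i in train_idx],
            [targets[i] for i in train_idx],
            smoothing,
        )
        for i in valid_idx:
            # Categories unseen in the training folds fall back to the global mean.
            encoded[i] = encoding.get(categories[i], global_mean)
    return encoded


# Toy data (made up): category "a" is mostly positive, "b" mostly negative.
cats = ["a", "a", "b", "a", "b", "b", "a", "b", "c"]
ys = [1, 1, 0, 1, 0, 1, 0, 0, 1]
encoding, global_mean = fit_mean_encoding(cats, ys, smoothing=2.0)
oof = out_of_fold_encode(cats, ys, n_folds=3, smoothing=2.0)
print(encoding)
print(oof)
```

The smoothing term matters most for high-cardinality variables, where many categories appear only a handful of times and their raw means would be mostly noise.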

5.46K views, 22:59

Data Science by ODS.ai 🦜

Neural Network Embeddings Explained

How deep learning can represent War and Peace as a vector

An easy-to-read #novice article about #embeddings: basically, how to represent anything as a vector.

Link: https://towardsdatascience.com/neural-network-embeddings-explained-4d028e6f0526
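
A tiny sketch of what "represent anything as a vector" buys you: once items are vectors, similarity becomes simple arithmetic. The three-dimensional vectors below are hand-written placeholders, not learned embeddings, and the helper names are made up for this example.

```python
import math

# Hypothetical hand-written 3-dimensional embeddings; a real model would
# learn these values during training rather than have them typed in.
embeddings = {
    "War and Peace": [0.9, 0.1, 0.3],
    "Anna Karenina": [0.8, 0.2, 0.4],
    "A Brief History of Time": [0.1, 0.9, 0.2],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(name, table):
    """Return the other entry whose vector is closest to `name` by cosine similarity."""
    query = table[name]
    others = [key for key in table if key != name]
    return max(others, key=lambda key: cosine_similarity(query, table[key]))


print(most_similar("War and Peace", embeddings))
```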

5.89K views, 18:53

Data Science by ODS.ai 🦜

Building a Recommendation System Using Neural Network Embeddings

And putting theory to work: embeddings for a recommendation system.

Link: https://towardsdatascience.com/building-a-recommendation-system-using-neural-network-embeddings-1ef92e5c80c9

7.22K views, 18:54

Data Science by ODS.ai 🦜

Test-Driven Data Analysis

TDD (test-driven development) is an approach to software development that treats tests as an essential part of the process. Over the years, TDD has proven necessary for maintaining a good code base, and it is one of the most common requirements for a long-lived project.

The test-driven approach can be applied to data analysis too, through reproducible research or TDDA (test-driven data analysis), which is described at the link below.

Link: https://www.tdda.info

#tdda
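
One way to picture testing data rather than code is to write down constraints a dataset should satisfy and verify every new batch against them. The sketch below does this in plain Python; it does not use or imitate the API of the library from the link, and the constraint format, the `verify` helper, and the sample rows are all hypothetical.

```python
# Hypothetical constraints, written by hand for this sketch. Each column maps
# to an expected type, whether missing values are allowed, and optional bounds.
CONSTRAINTS = {
    "age": {"type": int, "nullable": False, "min": 0, "max": 120},
    "country": {"type": str, "nullable": True},
}


def verify(rows, constraints):
    """Return a list of human-readable constraint violations (empty if none)."""
    failures = []
    for index, row in enumerate(rows):
        for column, rule in constraints.items():
            value = row.get(column)
            if value is None:
                if not rule["nullable"]:
                    failures.append(f"row {index}: {column} is missing")
                continue
            if not isinstance(value, rule["type"]):
                failures.append(f"row {index}: {column} has wrong type")
                continue
            if "min" in rule and value < rule["min"]:
                failures.append(f"row {index}: {column} below minimum")
            if "max" in rule and value > rule["max"]:
                failures.append(f"row {index}: {column} above maximum")
    return failures


good = [{"age": 31, "country": "NL"}, {"age": 58, "country": None}]
bad = [{"age": -4, "country": "NL"}, {"age": None, "country": 7}]

assert verify(good, CONSTRAINTS) == []
print(verify(bad, CONSTRAINTS))
```

Running such checks on every data refresh plays the same role as a unit-test suite: a failure points at the exact row and column that broke an assumption the analysis relies on.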

6.26K views, 18:09