Data Science by ODS.ai 🦜
The first Telegram Data Science channel. Covering all technical and popular stuff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of the former. To reach the editors, contact: @malev
What does your Spotify music sound like? Data Science with Spotify (Part 1)

An example of a good approach to research. As has been noted, no data is provided for reproducibility, though the author may provide data and sample code in the future.

Link: https://towardsdatascience.com/data-science-and-machine-learning-with-spotify-841225bfb5d0

#spotify
Reproducing Imagenet in 18 minutes

The code to reproduce #ImageNet in 18 minutes is posted in the GitHub repo. It actually becomes «ImageNet in 12 minutes» if the target is 74.9% top-1 accuracy, the threshold used in Chainer's "Imagenet in 15" paper; the last few bits of accuracy are the hardest.

Link: https://github.com/diux-dev/imagenet18
Ultimate Machine Learning Cheat Sheet

Notes on top-level topics from Stanford's CS 229 by Shervine Amidi and Afshine Amidi:

* Supervised learning
* Unsupervised learning
* Deep learning
* Tips and tricks
* Probability and stats refresher
* Algebra and calculus refresher

Forward this message to your Saved Messages to make sure you won’t lose it.

Repo link: https://github.com/afshinea/stanford-cs-229-machine-learning

#Stanford #cheatsheet
CACTUs: an unsupervised learning algorithm that learns to learn from tasks constructed from unlabeled data. It leads to significantly more effective downstream learning and enables few-shot learning *without* labeled meta-learning datasets.

ArXiv: https://arxiv.org/abs/1810.02334

#cactus #unsupervised
Hitchhiker’s guide to Exploratory Data Analysis

Exploratory Data Analysis is the stage of finding out the distribution of the data, its volume, the number of missing values, and the other characteristics of the available dataset.

Part 1: https://towardsdatascience.com/hitchhikers-guide-to-exploratory-data-analysis-6e8d896d3f7e
Part 2: https://towardsdatascience.com/hitchhikers-guide-to-exploratory-data-analysis-part-2-36ab72201e1d
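
A minimal sketch of what such a first pass can look like, using only the Python standard library. The toy rows, the column names and the summarize helper are hypothetical and are not taken from the linked articles; in practice a library such as pandas would usually do this job.

```python
import statistics
from collections import Counter

# Hypothetical toy dataset; None marks a missing value.
rows = [
    {"age": 34, "city": "Berlin"},
    {"age": None, "city": "Paris"},
    {"age": 29, "city": "Berlin"},
    {"age": 41, "city": None},
    {"age": 29, "city": "Rome"},
]


def summarize(rows):
    """Return per-column volume, missing count and a basic distribution summary."""
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        info = {"rows": len(values), "missing": len(values) - len(present)}
        if present and all(isinstance(v, (int, float)) for v in present):
            # Numeric column: describe the distribution with simple statistics.
            info["min"] = min(present)
            info["max"] = max(present)
            info["mean"] = statistics.mean(present)
            info["median"] = statistics.median(present)
        else:
            # Categorical column: report the most frequent values.
            info["top_values"] = Counter(present).most_common(3)
        report[col] = info
    return report


report = summarize(rows)
for col, info in report.items():
    print(col, info)
```

Each column gets its row count and number of missing values, plus either numeric statistics or the most frequent categories, which is usually enough to decide where to look more closely.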

#ExploratoryDA #novice #entrylevel
The Code for Facial Identity in the Primate Brain

This paper shows that facial images can be reconstructed with a simple linear model from the responses of only ~200 visual neurons recorded from a monkey. The approach relies on "face cells", which encode how much a face differs from the average in particular ways ("eigenface dimensions").

https://www.sciencedirect.com/science/article/pii/S009286741730538X

#cv #dl
A great example of how different approaches to feature encoding can influence the results.

Mean (likelihood) encoding for categorical variables with high cardinality and feature interactions: a comprehensive study with Python

Link: https://www.kaggle.com/vprokopev/mean-likelihood-encodings-a-comprehensive-study
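
To make the idea concrete, here is a minimal sketch of mean (likelihood) encoding in plain Python. It is not the code from the linked study: the mean_encode function, the smoothing parameter and the toy data are assumptions made for illustration. Each category is replaced by the mean of the target for that category, optionally pulled toward the global mean so that rare categories are not trusted too much.

```python
from collections import defaultdict


def mean_encode(categories, targets, smoothing=0.0):
    """Map each category to the (optionally smoothed) mean of its targets.

    With smoothing > 0 the per-category mean is blended with the global mean,
    which matters most for categories that have only a few rows.
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    return {
        cat: (sums[cat] + smoothing * global_mean) / (counts[cat] + smoothing)
        for cat in counts
    }


# Hypothetical toy data: a categorical feature and a binary target.
cities = ["a", "a", "b", "b", "b", "c"]
clicked = [1, 0, 1, 1, 1, 0]

encoding = mean_encode(cities, clicked)
smoothed = mean_encode(cities, clicked, smoothing=2.0)
print(encoding)
print(smoothed)
```

Because the encoding is computed from the target, it should be fitted on training data only (for example inside cross-validation folds); otherwise target information leaks into the features.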

#FeatureEngineering #FeatureEncoding #Kaggle
Neural Network Embeddings Explained

How deep learning can represent War and Peace as a vector

An easy-to-read #novice article about #embeddings. Basically, it explains how to represent anything as a vector.

Link: https://towardsdatascience.com/neural-network-embeddings-explained-4d028e6f0526
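
Once items are vectors, similar items can be found by comparing those vectors. The sketch below shows this with cosine similarity in plain Python. The three-dimensional vectors and the book titles other than War and Peace are made up for illustration; real embeddings are learned by a neural network and usually have tens or hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical hand-written embeddings, not learned values.
embeddings = {
    "War and Peace": [0.9, 0.1, 0.3],
    "Anna Karenina": [0.8, 0.2, 0.4],
    "A Brief History of Time": [0.1, 0.9, 0.2],
}


def nearest(title):
    """Return the other title whose embedding is most similar to the given one."""
    query = embeddings[title]
    others = (t for t in embeddings if t != title)
    return max(others, key=lambda t: cosine_similarity(query, embeddings[t]))


print(nearest("War and Peace"))
```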
Test-Driven Data Analysis

TDD (test-driven development) is an approach to software development in which tests are an essential part of the process. Over the years, TDD has shown itself to be necessary for maintaining a good code base and has become one of the most common requirements for a long-lived project.

A test-driven approach can be applied to data analysis too, through reproducible research or TDDA (test-driven data analysis), which is what the link below suggests.

Link: https://www.tdda.info
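
One way to picture the idea is a test that checks incoming data against declared constraints before any analysis runs. The sketch below is plain Python and is not the API of the tdda library; the check_constraints function, the constraint format and the sample rows are all hypothetical.

```python
def check_constraints(rows, constraints):
    """Return a list of human-readable violations (empty if the data passes)."""
    violations = []
    for i, row in enumerate(rows):
        for field, rule in constraints.items():
            value = row.get(field)
            if value is None:
                if not rule.get("allow_missing", False):
                    violations.append(f"row {i}: {field} is missing")
                continue
            if "min" in rule and value < rule["min"]:
                violations.append(f"row {i}: {field}={value} is below {rule['min']}")
            if "max" in rule and value > rule["max"]:
                violations.append(f"row {i}: {field}={value} is above {rule['max']}")
            if "allowed" in rule and value not in rule["allowed"]:
                violations.append(f"row {i}: {field}={value!r} is not allowed")
    return violations


# Hypothetical constraints, written once and reused on every new batch of data.
constraints = {
    "age": {"min": 0, "max": 120},
    "plan": {"allowed": {"free", "pro"}, "allow_missing": True},
}

good_rows = [{"age": 31, "plan": "free"}, {"age": 58, "plan": None}]
bad_rows = [{"age": -4, "plan": "gold"}, {"age": None, "plan": "pro"}]

print(check_constraints(good_rows, constraints))
print(check_constraints(bad_rows, constraints))
```

Running such checks as part of a test suite makes a change in the input data fail loudly instead of silently changing the results.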

#tdda