Personalized Machine Learning by Julian McAuley
Every day we interact with machine learning systems that personalize their predictions to individual users, whether to recommend movies, find new friends or dating partners, or organize our news feeds. Such systems involve several modalities of data, ranging from sequences of clicks or purchases, to rich modalities involving text, images, or social interactions.
While settings and data modalities vary significantly, in this book we introduce a common set of principles and methods that underpin the design of personalized predictive models.
The book begins by revisiting "traditional" machine learning models, with a special focus on how they should be adapted to settings involving user data. Later, we'll develop techniques based on more advanced principles such as matrix factorization, deep learning, and generative modeling. Finally, we conclude with a detailed study of the consequences and risks of deploying personalized predictive systems.
By understanding the principles behind personalized machine learning, readers will gain the ability to design models and systems for a wide range of applications involving user data. A series of case studies will help readers understand the importance of personalization in domains ranging from e-commerce to personalized health, and hands-on projects and code examples (and an online supplement) will give readers experience working with large-scale real-world datasets.
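To give a flavor of one technique the book covers, here is a minimal illustrative sketch (not from the book or its online supplement; the ratings matrix is made up) of matrix factorization trained with SGD on a toy user-item matrix:
```python
import numpy as np

# Toy user-item ratings; 0 means "not observed". Purely illustrative data.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)

n_users, n_items = R.shape
k = 2                      # number of latent factors
lr, reg = 0.01, 0.1        # learning rate and L2 regularization strength
rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(n_users, k))   # user factors
Q = rng.normal(scale=0.1, size=(n_items, k))   # item factors

observed = [(u, i) for u in range(n_users) for i in range(n_items) if R[u, i] > 0]

for epoch in range(200):
    for u, i in observed:
        err = R[u, i] - P[u] @ Q[i]
        # SGD update of both factor vectors toward the observed rating
        P[u] += lr * (err * Q[i] - reg * P[u])
        Q[i] += lr * (err * P[u] - reg * Q[i])

# Reconstructed matrix: includes predicted scores for the unobserved cells.
print(np.round(P @ Q.T, 2))
```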
Link: Book
Navigational hashtags: #armbooks #armsite
General hashtags: #ml #machinelearning #regression #classification #recommendation #recsys #nlp
@data_science_weekly
Hey! 👋
We have some exciting news! Telegram offers a fantastic feature called auto-translation for posts, which would make our channel accessible to a much wider global audience 🌍✨
But here's the catch: To unlock this feature for our channel, we need Telegram Premium users to boost us! 🔋
How you can help (if you're a Premium user):
1. Tap on the channel name at the top.
2. Select "Boost Channel" (or find it in the channel menu).
3. Choose how many boosts you'd like to contribute (even 1 helps!).
4. Confirm – it's quick and free for Premium users!
Or simply use this link to boost!
Why boosting matters:
- 🌐 Break Language Barriers: Auto-translation will instantly translate our posts into your preferred language, making our content accessible to everyone.
- 💡 Share Knowledge Widely: Reach more people who can benefit from what we share here.
- 🚀 Grow Together: Help our community expand and become even more vibrant!
We're currently at 4 boosts (Level 2). Our goal is Level 4 to unlock auto-translation! Every single boost from a Premium user gets us closer.
To our amazing Premium members: Your boosts are incredibly valuable! If you find this channel useful and want to help us reach more people globally, please consider boosting us. It makes a huge difference! 🙏
To everyone else: Even if you're not Premium, you can still help massively! Please share this message with friends or groups who are Premium users and might be willing to support us. 🤝
Let's unlock the power of translation together! Thank you for being such a fantastic community!
With gratitude,
Artem Ryblov
python-patterns
A collection of design patterns and idioms in Python.
Remember that each pattern has its own trade-offs, and pay more attention to why you're choosing a certain pattern than to how to implement it.
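As a taste of what's inside (this snippet is illustrative, not copied from the repository), here is a minimal Strategy pattern in idiomatic Python, with functions passed around as interchangeable behaviors:
```python
from typing import Callable

def bulk_discount(total: float) -> float:
    """10% off orders over 100."""
    return total * 0.9 if total > 100 else total

def no_discount(total: float) -> float:
    return total

class Order:
    def __init__(self, total: float, pricing: Callable[[float], float] = no_discount):
        self.total = total
        self.pricing = pricing      # the interchangeable "strategy"

    def final_price(self) -> float:
        return self.pricing(self.total)

print(Order(120, bulk_discount).final_price())  # 108.0
print(Order(120).final_price())                 # 120.0
```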
Link: GitHub
Navigational hashtags: #armsite
General hashtags: #python #programming #patterns #development #engineering
@data_science_weekly
How to avoid machine learning pitfalls by Michael A. Lones
Mistakes in machine learning practice are commonplace, and can result in a loss of confidence in the findings and products of machine learning.
This guide outlines common mistakes that occur when using machine learning, and what can be done to avoid them.
Whilst it should be accessible to anyone with a basic understanding of machine learning techniques, it focuses on issues that are of particular concern within academic research, such as the need to do rigorous comparisons and reach valid conclusions.
It covers five stages of the machine learning process:
- What to do before model building
- How to reliably build models
- How to robustly evaluate models
- How to compare models fairly
- How to report results
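As one concrete example of the "robustly evaluate models" stage (a sketch of my own, not code from the guide): fitting the preprocessing inside a cross-validation pipeline so that information from the test folds never leaks into training:
```python
from sklearn.datasets import load_breast_cancer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# The scaler is re-fit on each training fold only, avoiding test-set leakage.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(scores.mean(), scores.std())
```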
Link: arXiv
Navigational hashtags: #armarticles
General hashtags: #ml #machinelearning #mlsystemdesign
@data_science_weekly
Deep Learning Fundamentals by Sebastian Raschka and Lightning AI
Deep Learning Fundamentals is a free course on learning deep learning using a modern open-source stack.
If you found this page, you probably heard that artificial intelligence and deep learning are taking the world by storm. This is correct. In this course, Sebastian Raschka, a best-selling author and professor, will teach you deep learning (machine learning with deep neural networks) from the ground up over 10 units with bite-sized videos, quizzes, and exercises. The entire course is free and uses the most popular open-source tools for deep learning.
What will you learn in this course?
- What machine learning is and when to use it
- The main concepts of deep learning
- How to design deep learning experiments with PyTorch
- How to write efficient deep learning code with PyTorch Lightning
What will you be able to do after this course?
- Build classifiers for various kinds of data like tables, images, and text
- Tune models effectively to optimize predictive and computational performance
How is this course structured?
- The course consists of 10 units, each containing several subsections
- It is centered around informative, succinct videos that are respectful of your time
- In each unit, you will find optional exercises to practice your knowledge
- We also provide additional resources for those who want a deep dive on specific topics
What are the prerequisites?
- Ideally, you should already be familiar with programming in Python
- (Some lectures will involve a tiny bit of math, but a strong math background is not required!)
Are there interactive quizzes or exercises?
- Each section is accompanied by optional multiple-choice quizzes to test your understanding of the material
- Optionally, each unit also features one or more code exercises to practice implementing concepts covered in this class
Is there a course completion badge or certificate?
- At the end of this course, you can take an optional exam featuring 25 multiple-choice questions
- Upon answering 80% of the exam questions correctly (you have 5 attempts), you obtain a course completion badge that can be shared on LinkedIn
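For a taste of the stack the course uses (a minimal sketch of my own, not material from the course), here is a tiny PyTorch Lightning classifier trained on random stand-in data:
```python
import torch
import pytorch_lightning as pl
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

class LitClassifier(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = nn.functional.cross_entropy(self.net(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

# Random data standing in for a real dataset.
X = torch.randn(256, 20)
y = torch.randint(0, 2, (256,))
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

pl.Trainer(max_epochs=3, logger=False, enable_checkpointing=False).fit(LitClassifier(), loader)
```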
Link: Course
Navigational hashtags: #armcourses
General hashtags: #dl #deeplearning #pytorch #lightning
@data_science_weekly
The Prompt Report: A Systematic Survey of Prompt Engineering Techniques
Generative Artificial Intelligence (GenAI) systems are increasingly being deployed across diverse industries and research domains. Developers and end-users interact with these systems through the use of prompting and prompt engineering.
Although prompt engineering is a widely adopted and extensively researched area, it suffers from conflicting terminology and a fragmented ontological understanding of what constitutes an effective prompt due to its relatively recent emergence.
The authors establish a structured understanding of prompt engineering by assembling a taxonomy of prompting techniques and analyzing their applications. They present a detailed vocabulary of 33 terms, a taxonomy of 58 LLM prompting techniques, and 40 techniques for other modalities.
Additionally, the authors provide best practices and guidelines for prompt engineering, including advice for prompting state-of-the-art (SOTA) LLMs such as ChatGPT. They further present a meta-analysis of the entire literature on natural language prefix-prompting. As a culmination of these efforts, this paper presents the most comprehensive survey on prompt engineering to date.
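To make one of the surveyed techniques concrete (a toy illustration of my own, not from the paper), here is few-shot prompting expressed as a simple template builder; the task and examples are made up, and the resulting string would be sent to whatever LLM you use:
```python
def few_shot_prompt(instruction: str, examples: list[tuple[str, str]], query: str) -> str:
    """Compose a few-shot prompt: instruction, worked examples, then the new input."""
    lines = [instruction, ""]
    for inp, out in examples:
        lines += [f"Input: {inp}", f"Output: {out}", ""]
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)

prompt = few_shot_prompt(
    "Classify the sentiment of each sentence as positive or negative.",
    [("I loved this movie.", "positive"), ("The service was awful.", "negative")],
    "The book was a pleasant surprise.",
)
print(prompt)
```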
Link: ArXiv
Navigational hashtags: #armarticles
General hashtags: #promptengineering #prompts #prompt #llm
@data_science_weekly
Linear Algebra for Data Science by Prof. Wanmo Kang and Prof. Kyunghyun Cho
The authors have been discussing over the past few years how they should teach linear algebra to students in this new era of data science and artificial intelligence.
Through these discussions, which also led to some research collaboration, they realized that the central concepts from linear algebra invoked frequently in practice, if not every day, were projection and, consequently, singular value decomposition (SVD), as well as (even less frequently) positive definiteness.
Unfortunately, they noticed that existing courses on linear algebra often focus much more on invertibility (or the lack thereof), to the point that many concepts are introduced not in the order of their practicality or usefulness but in the order most convenient for mathematical derivations.
They began to wonder whether they could introduce concepts and results from linear algebra in a radically different way.
So, here’s a new textbook on linear algebra, where they re-imagined how and in which order linear algebra could be taught.
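As a small illustration of why projection and the SVD sit at the center of this approach (a numpy sketch of my own, not taken from the textbook): projecting a vector onto the column space of a matrix using its thin SVD, and checking that this is exactly the least-squares fit:
```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 3))      # tall matrix; its columns span a 3-dim subspace of R^6
b = rng.normal(size=6)

# Thin SVD: the columns of U form an orthonormal basis for the column space of A.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
proj_b = U @ (U.T @ b)           # orthogonal projection of b onto col(A)

# The residual is orthogonal to the column space, and the projection equals A @ (least-squares solution).
print(np.allclose(A.T @ (b - proj_b), 0))                              # True
print(np.allclose(A @ np.linalg.lstsq(A, b, rcond=None)[0], proj_b))   # True
```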
Links:
- Site
- Book
Navigational hashtags: #armbooks
General hashtags: #math #mathematics #linearalgebra
@data_science_weekly
Problem Solving with Algorithms and Data Structures using Python by Brad Miller and David Ranum, Luther College
This textbook is about computer science. It is also about Python. However, there is much more.
The study of algorithms and data structures is central to understanding what computer science is all about. Learning computer science is not unlike learning any other type of difficult subject matter. The only way to be successful is through deliberate and incremental exposure to the fundamental ideas. A beginning computer scientist needs practice so that there is a thorough understanding before continuing on to the more complex parts of the curriculum. In addition, a beginner needs to be given the opportunity to be successful and gain confidence.
This textbook is designed to serve as a text for a first course on data structures and algorithms, typically taught as the second course in the computer science curriculum. Even though the second course is considered more advanced than the first course, this book assumes you are beginners at this level. You may still be struggling with some of the basic ideas and skills from a first computer science course and yet be ready to further explore the discipline and continue to practice problem solving.
The authors cover abstract data types and data structures, writing algorithms, and solving problems. They look at a number of data structures and solve classic problems that arise. The tools and techniques that you learn here will be applied over and over as you continue your study of computer science.
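A classic example in the spirit of the book (my own version, not copied from it): implementing a stack abstract data type and using it to check balanced parentheses:
```python
class Stack:
    """A minimal stack ADT backed by a Python list."""
    def __init__(self):
        self._items = []
    def push(self, item):
        self._items.append(item)
    def pop(self):
        return self._items.pop()
    def is_empty(self):
        return not self._items

def balanced(expr: str) -> bool:
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = Stack()
    for ch in expr:
        if ch in "([{":
            stack.push(ch)
        elif ch in pairs:
            if stack.is_empty() or stack.pop() != pairs[ch]:
                return False
    return stack.is_empty()

print(balanced("{[()()]}"))  # True
print(balanced("{[(])}"))    # False
```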
Links:
- Site
- Book
Navigational hashtags: #armbooks #armcourses
General hashtags: #python #algorithms #datastructures #programming #cs #computerscience
@data_science_weekly
Deep Learning and Computational Physics by Deep Ray, Orazio Pinti, Assad A. Oberai
These notes were compiled as lecture notes for a course developed and taught at the University of Southern California. They should be accessible to a typical engineering graduate student with a strong background in Applied Mathematics.
The main objective of these notes is to introduce a student who is familiar with concepts in linear algebra and partial differential equations to select topics in deep learning. These lecture notes exploit the strong connections between deep learning algorithms and the more conventional techniques of computational physics to achieve two goals. First, they use concepts from computational physics to develop an understanding of deep learning algorithms. Not surprisingly, many concepts in deep learning can be connected to similar concepts in computational physics, and one can utilize this connection to better understand these algorithms. Second, several novel deep learning algorithms can be used to solve challenging problems in computational physics. Thus, they offer someone who is interested in modeling a physical phenomenon a complementary set of tools.
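To hint at what "deep learning meets computational physics" looks like in practice (a minimal sketch of my own, not an example from the notes): training a small network so that it satisfies the ODE u'(x) = -u(x) with u(0) = 1, whose exact solution is exp(-x):
```python
import torch
from torch import nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

# Collocation points where the differential equation is enforced.
x = torch.linspace(0, 2, 64).reshape(-1, 1).requires_grad_(True)

for step in range(2000):
    u = net(x)
    # du/dx via automatic differentiation
    du = torch.autograd.grad(u, x, grad_outputs=torch.ones_like(u), create_graph=True)[0]
    residual = du + u                              # enforce u' = -u
    bc = net(torch.zeros(1, 1)) - 1.0              # enforce u(0) = 1
    loss = (residual ** 2).mean() + (bc ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print(net(torch.tensor([[1.0]])).item())  # should approach exp(-1) ≈ 0.37
```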
Links:
- ArXiv
- Book
Navigational hashtags: #armbooks
General hashtags: #dl #deeplearning #physics
@data_science_weekly
Feature Selection in Machine Learning by Soledad Galli
Feature selection is the process of selecting a subset of features from the total variables in a data set to train machine learning algorithms. Feature selection is an important aspect of data mining and predictive modelling.
Feature selection is key for developing simpler, faster, and highly performant machine learning models and can help to avoid overfitting. The aim of any feature selection algorithm is to create classifiers or regression models that run faster and whose outputs are easier to understand by their users.
In this book, you will find the most widely used feature selection methods to select the best subsets of predictor variables from your data. You will learn about filter, wrapper, and embedded methods for feature selection. Then, you will discover methods designed by computer science professionals or used in data science competitions that are faster or more scalable.
First, we will discuss the use of statistical and univariate algorithms in the context of artificial intelligence. Next, we will cover methods that select features through optimization of the model performance. We will move on to feature selection algorithms that are baked into the machine learning techniques. And finally, we will discuss additional methods designed by data scientists specifically for applied predictive modeling.
In this book, you will find out how to:
- Remove useless and redundant features by examining variability and correlation.
- Choose features based on statistical tests such as ANOVA, chi-square, and mutual information.
- Select features by using Lasso regularization or decision tree based feature importance, which are embedded in the machine learning modeling process.
- Select features by recursive feature elimination, addition, or value permutation.
Each chapter fleshes out various methods for feature selection that share common characteristics. First, you will learn the fundamentals of the feature selection method, and next you will find a Python implementation.
The book comes with an accompanying GitHub repository with the full source code that you can download, modify, and use in your own data science projects and case studies.
Feature selection methods differ from dimensionality reduction methods in that feature selection techniques do not alter the original representation of the variables, but merely select a reduced number of features from the training data that produce performant machine learning models.
Using the Python libraries Scikit-learn, MLXtend, and Feature-engine, you’ll learn how to select the best numerical and categorical features for regression and classification models in just a few lines of code. You will also learn how to make feature selection part of your machine learning workflow.
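For a sense of how little code this takes with scikit-learn (a sketch of my own; the book's examples also use Feature-engine and MLXtend), here are a filter method, an embedded method, and a wrapper method side by side:
```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif, RFE, SelectFromModel
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# Filter: keep the 10 features with the highest mutual information with the target.
filt = SelectKBest(mutual_info_classif, k=10).fit(X, y)

# Embedded: keep features with non-zero coefficients in an L1-regularized model.
lasso = SelectFromModel(LogisticRegression(penalty="l1", solver="liblinear", C=0.5)).fit(X, y)

# Wrapper: recursive feature elimination down to 10 features.
rfe = RFE(LogisticRegression(max_iter=5000), n_features_to_select=10).fit(X, y)

for name, selector in [("filter", filt), ("embedded", lasso), ("wrapper", rfe)]:
    print(name, selector.get_support().sum(), "features kept")
```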
Link:
- Book
Navigational hashtags: #armbooks
General hashtags: #ml #machinelearning #featureselection #fs
@data_science_weekly
SQL Tutorial
Learn to answer questions with data using SQL. No coding experience necessary.
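If you want to try the idea right away without installing anything (a toy example of my own, unrelated to the site's exercises), Python's built-in sqlite3 module lets you answer a question with SQL in a few lines:
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount REAL);
    INSERT INTO orders VALUES ('alice', 30), ('bob', 55), ('alice', 20);
""")

# Question: how much has each customer spent in total?
for row in conn.execute(
    "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer ORDER BY total DESC"
):
    print(row)   # ('bob', 55.0), ('alice', 50.0)
```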
Link: Site
Navigational hashtags: #armknowledgesharing #armsites #armcourses
General hashtags: #sql
@data_science_weekly
Recommenders
Recommenders' objective is to assist researchers, developers, and enthusiasts in prototyping, experimenting with, and bringing to production a range of classic and state-of-the-art recommendation systems.
Recommenders is a project under the Linux Foundation of AI and Data.
This repository contains examples and best practices for building recommendation systems, provided as Jupyter notebooks. The examples detail our learnings on five key tasks:
- Prepare Data: Preparing and loading data for each recommendation algorithm.
- Model: Building models using various classical and deep learning recommendation algorithms such as Alternating Least Squares (ALS) or eXtreme Deep Factorization Machines (xDeepFM).
- Evaluate: Evaluating algorithms with offline metrics.
- Model Select and Optimize: Tuning and optimizing hyperparameters for recommendation models.
- Operationalize: Operationalizing models in a production environment on Azure.
Several utilities are provided in recommenders to support common tasks such as loading datasets in the format expected by different algorithms, evaluating model outputs, and splitting training/test data. Implementations of several state-of-the-art algorithms are included for self-study and customization in your own applications. See the Recommenders documentation.
For a more detailed overview of the repository, please see the documents on the wiki page.
For some of the practical scenarios where recommendation systems have been applied, see scenarios.
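To make the "Prepare Data → Evaluate" steps tangible without relying on the library's own utilities (a plain-pandas sketch of my own, with made-up interaction data): a leave-last-out split and a precision@k check for a trivial popularity baseline:
```python
import pandas as pd

# Hypothetical implicit-feedback interactions (user, item, timestamp).
df = pd.DataFrame({
    "user": [1, 1, 1, 2, 2, 3, 3],
    "item": ["a", "b", "c", "a", "b", "a", "c"],
    "ts":   [1, 2, 3, 1, 2, 1, 2],
})

# Prepare Data: hold out each user's most recent interaction for testing.
test = df.sort_values("ts").groupby("user").tail(1)
train = df.drop(test.index)

# Model: recommend the globally most popular training items (a common baseline).
top_k = train["item"].value_counts().index[:2].tolist()

# Evaluate: precision@k against the held-out items (one held-out item per user).
hits = sum(item in top_k for item in test["item"])
print("precision@2 =", hits / (len(test) * 2))
```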
Link: Repository
Navigational hashtags: #armknowledgesharing #armrepo
General hashtags: #recsys #recommendersystems #recommenders
@data_science_weekly
CS50’s Introduction to Programming with Python by Harvard
An introduction to programming using a language called Python. Learn how to read and write code as well as how to test and “debug” it. Designed for students with or without prior programming experience who’d like to learn Python specifically.
Learn about functions, arguments, and return values (oh my!); variables and types; conditionals and Boolean expressions; and loops. Learn how to handle exceptions, find and fix bugs, and write unit tests; use third-party libraries; validate and extract data with regular expressions; model real-world entities with classes, objects, methods, and properties; and read and write files.
Hands-on opportunities for lots of practice. Exercises inspired by real-world programming problems.
No software required except for a web browser, or you can write code on your own PC or Mac.
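As a tiny example of the kind of code the course builds up to (mine, not CS50's): a function with exception handling plus a simple unit test for it:
```python
def parse_age(text: str) -> int:
    """Convert user input to a non-negative age, raising ValueError otherwise."""
    age = int(text)          # raises ValueError for non-numeric input
    if age < 0:
        raise ValueError("age cannot be negative")
    return age

def test_parse_age():
    assert parse_age("42") == 42
    try:
        parse_age("-1")
    except ValueError:
        pass
    else:
        assert False, "expected a ValueError"

if __name__ == "__main__":
    test_parse_age()
    print("all tests passed")
```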
Link: Course
Navigational hashtags: #armknowledgesharing #armcourses
General hashtags: #python
@data_science_weekly
Interpreting Machine Learning Models With SHAP. A Guide With Python Examples And Theory On Shapley Values by Christoph Molnar
Machine learning models are transforming fields from healthcare diagnostics to climate change prediction through their predictive performance. However, these complex machine learning models often lack interpretability, which is becoming more essential than ever for debugging, fostering trust, and communicating model insights.
Introducing SHAP, the Swiss army knife of machine learning interpretability:
- SHAP can be used to explain individual predictions.
- By combining explanations for individual predictions, SHAP lets you study the overall model behavior.
- SHAP is model-agnostic – it works with any model, from simple linear regression to deep learning.
- With its flexibility, SHAP can handle various data formats, whether it’s tabular, image, or text.
- The Python package shap makes the application of SHAP for model interpretation easy.
This book will be your comprehensive guide to mastering the theory and application of SHAP. It starts with SHAP's fascinating origins in game theory and explores what splitting taxi costs has to do with explaining machine learning predictions. Beginning with SHAP explanations of a simple linear regression model, the book progressively introduces SHAP for more complex models. You'll learn the ins and outs of the most popular explainable AI method and how to apply it using the shap package.
In a world where interpretability is key, this book is your roadmap to mastering SHAP. For machine learning models that are not only accurate but also interpretable.
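A minimal sketch of the workflow the book walks through (my own example, assuming the shap and scikit-learn packages are installed; the dataset is just scikit-learn's diabetes data):
```python
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Explain individual predictions with SHAP values...
explainer = shap.TreeExplainer(model)
shap_values = explainer(X.iloc[:200])
print(shap_values[0].values)          # per-feature contributions for the first prediction

# ...and aggregate them to study overall model behavior.
shap.plots.beeswarm(shap_values)
```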
Links:
- Paperback
- eBook
Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #ml #machinelearning #shap #interpretability #python #shapley #shapleyvalues
@data_science_weekly
Clean Code: A Handbook of Agile Software Craftsmanship by Robert C. Martin
Even bad code can function. But if code isn’t clean, it can bring a development organization to its knees. Every year, countless hours and significant resources are lost because of poorly written code. But it doesn’t have to be that way.
Noted software expert Robert C. Martin presents a revolutionary paradigm with Clean Code: A Handbook of Agile Software Craftsmanship. Martin, who has helped bring agile principles from a practitioner’s point of view to tens of thousands of programmers, has teamed up with his colleagues from Object Mentor to distill their best agile practice of cleaning code “on the fly” into a book that will instill within you the values of a software craftsman and make you a better programmer―but only if you work at it.
What kind of work will you be doing? You’ll be reading code―lots of code. And you will be challenged to think about what’s right about that code, and what’s wrong with it. More importantly you will be challenged to reassess your professional values and your commitment to your craft.
Clean Code is divided into three parts. The first describes the principles, patterns, and practices of writing clean code. The second part consists of several case studies of increasing complexity. Each case study is an exercise in cleaning up code―of transforming a code base that has some problems into one that is sound and efficient. The third part is the payoff: a single chapter containing a list of heuristics and “smells” gathered while creating the case studies. The result is a knowledge base that describes the way we think when we write, read, and clean code.
Readers will come away from this book understanding:
- How to tell the difference between good and bad code
- How to write good code and how to transform bad code into good code
- How to create good names, good functions, good objects, and good classes
- How to format code for maximum readability
- How to implement complete error handling without obscuring code
- How to unit test and practice test-driven development
- What “smells” and heuristics can help you identify bad code
This book is a must for any developer, software engineer, project manager, team lead, or systems analyst with an interest in producing better code.
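To illustrate the spirit of the book in Python terms (a toy example of my own; the book's own examples are in Java): the same logic before and after a "clean code" pass, with intention-revealing names, one job per function, and no magic numbers:
```python
# Before: cryptic names, a buried constant, two jobs in one function.
def calc(l, t):
    s = 0
    for x in l:
        s += x
    return s * 1.2 if t else s

# After: descriptive names, a named constant, one responsibility each.
TAX_RATE = 0.20

def subtotal(prices: list[float]) -> float:
    return sum(prices)

def total_with_tax(prices: list[float]) -> float:
    return subtotal(prices) * (1 + TAX_RATE)

print(total_with_tax([10.0, 20.0]))  # 36.0
```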
Link: Paperback
Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #development #cleancode
@data_science_weekly
A new perspective on Shapley values, part I: Intro to Shapley and SHAP by Edden Gerber
This post is the first in a series of two posts about explaining statistical models with Shapley values.
There are two main reasons you might want to read it:
1. To learn about Shapley values and the SHAP python library.
This is what this post is about after all. The explanations it provides are far from exhaustive, and contain nothing that cannot be gathered from other online sources, but it should still serve as a good quick intro or bonus reading on this subject.
2. As an introduction or refresher before reading the next post about Naive Shapley values.
The next post is my attempt at a novel contribution to the topic of Shapley values in machine learning. You may be already familiar with SHAP and Shapley and are just glancing over this post to make sure we’re on common ground, or you may be here to clear up something confusing from the next post.
Link: Post
Navigational hashtags: #armknowledgesharing #armsites
General hashtags: #shap #shapley #interpretation #ml #python
@data_science_weekly
Robotics Course by Hugging Face 🤗
This free course will take you on a journey, from classical robotics to modern learning-based approaches, in understanding, implementing, and applying machine learning techniques to real robotic systems.
This course is based on the Robot Learning Tutorial, which is a comprehensive guide to robot learning for researchers and practitioners. Here, we are attempting to distill the tutorial into a more accessible format for the community.
This first unit will help you onboard. You’ll see the course syllabus and learning objectives, understand the structure and prerequisites, meet the team behind the course, learn about LeRobot and the surrounding Hugging Face ecosystem, and explore the community resources that support your journey.
This course bridges theory and practice in Robotics! It's designed for students interested in understanding how machine learning is transforming robotics. Whether you're new to robotics or looking to understand learning-based approaches, this course will guide you step by step.
What to expect from this course?
Across the course you will study classical robotics foundations and modern learning‑based approaches, learn to use LeRobot, work with real robotics datasets, and implement state‑of‑the‑art algorithms. The emphasis is on practical skills you can apply to real robotic systems.
At the end of this course, you'll understand:
- How robots learn from data
- Why learning-based approaches are transforming robotics
- How to implement these techniques using modern tools like LeRobot
What's the syllabus?
Here is the general syllabus for the robotics course. Each unit builds on the previous ones to give you a comprehensive understanding of Robotics.
- Course Introduction. Welcome, prerequisites, and course overview
- Introduction to Robotics. Why Robotics matters and LeRobot ecosystem
- Classical Robotics. Traditional approaches and their limitations
- Reinforcement Learning. How robots learn through trial and error
- Imitation Learning. Learning from demonstrations and behavioral cloning
- Foundation Models. Large-scale models for general robotics
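As a toy illustration of the "learning through trial and error" idea behind the Reinforcement Learning unit (a sketch of my own, unrelated to LeRobot's API): tabular Q-learning on a one-dimensional corridor where the agent must walk to the goal on the right:
```python
import numpy as np

n_states, n_actions = 6, 2          # states 0..5; actions: 0 = left, 1 = right; goal is state 5
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.5, 0.95, 0.1
rng = np.random.default_rng(0)

for episode in range(500):
    s = 0
    while s != n_states - 1:
        a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
        s_next = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
        r = 1.0 if s_next == n_states - 1 else 0.0
        # Q-learning update: move the estimate toward reward + discounted best future value.
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

# Learned policy for the non-terminal states: should be all 1s (always move right).
print(Q.argmax(axis=1)[:-1])
```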
Link: Course
Navigational hashtags: #armknowledgesharing #armcourses
General hashtags: #robotics #rf #reinforcementlearning #foundationalmodels #hf #huggingface
@data_science_weekly
A new perspective on Shapley values, part II: The Naïve Shapley method by Edden Gerber
Why should you read this post?
1. For insight into Shapley values and the SHAP tool.
Most other sources on these topics are explanations based on existing primary sources (e.g. academic papers and the SHAP documentation). This post is an attempt to gain some understanding through an empirical approach.
2. To learn about an alternative approach to computing Shapley values, that under some (limited) circumstances may be preferable to SHAP.
If you are unfamiliar with Shapley values or SHAP, or want a short recap of how the SHAP explainers work, check out the previous post. In a hurry? The author has emphasized the key sentences in bold to assist your speed-reading.
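To make the "naive" approach concrete before reading the post (a brute-force sketch of my own, not the author's code): exact Shapley values computed by enumerating every coalition of a toy value function:
```python
from itertools import combinations
from math import factorial

def shapley(players, value):
    """Exact Shapley values: weighted average marginal contribution over all coalitions."""
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        phi = 0.0
        for size in range(n):
            for coalition in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                phi += weight * (value(set(coalition) | {p}) - value(set(coalition)))
        result[p] = phi
    return result

# Toy value function: A and B are substitutes; C adds a fixed amount on its own.
def v(coalition):
    return (10 if {"A", "B"} & coalition else 0) + (5 if "C" in coalition else 0)

print(shapley(["A", "B", "C"], v))   # {'A': 5.0, 'B': 5.0, 'C': 5.0}
```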
Link: Post
Navigational hashtags: #armknowledgesharing #armsites
General hashtags: #shap #shapley #interpretation #ml #python
@data_science_weekly
Machine Learning Systems. Principles and Practices of Engineering Artificially Intelligent Systems by Vijay Janapa Reddi (Harvard University)
Machine Learning Systems provides a systematic framework for understanding and engineering machine learning (ML) systems.
This textbook bridges the gap between theoretical foundations and practical engineering, emphasizing the systems perspective required to build effective AI solutions.
Unlike resources that focus primarily on algorithms and model architectures, this book highlights the broader context in which ML systems operate, including data engineering, model optimization, hardware-aware training, and inference acceleration.
Readers will develop the ability to reason about ML system architectures and apply enduring engineering principles for building flexible, efficient, and robust machine learning systems.
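One small, concrete instance of the "model optimization / inference acceleration" theme (a sketch of my own using PyTorch's post-training dynamic quantization, not an example from the book):
```python
import torch
from torch import nn

# A plain float32 model...
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).eval()

# ...converted with dynamic quantization: Linear weights are stored as int8,
# and activations are quantized on the fly at inference time.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 512)
# The interface is unchanged; the quantized model is smaller and typically faster on CPU.
print(model(x).shape, quantized(x).shape)
```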
Links:
- Book
- Site
Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #mlsd #machinelearning #machinelearningsystemdesign
@data_science_weekly
CS231n: Deep Learning for Computer Vision by Stanford
Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, localization and detection. Recent developments in neural network (aka “deep learning”) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems.
This course is a deep dive into the details of deep learning architectures with a focus on learning end-to-end models for these tasks, particularly image classification. During the 10-week course, students will learn to implement and train their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. Additionally, the final assignment will give them the opportunity to train and apply multi-million parameter networks on real-world vision problems of their choice.
Through multiple hands-on assignments and the final course project, students will acquire the toolset for setting up deep learning tasks and practical engineering tricks for training and fine-tuning deep neural networks.
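For orientation (a minimal PyTorch sketch of my own; the course assignments build these pieces up from scratch in NumPy before moving to frameworks), here is a tiny convolutional classifier of the kind the course studies:
```python
import torch
from torch import nn

class TinyConvNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)  # for 32x32 inputs (e.g. CIFAR-10)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = TinyConvNet()
images = torch.randn(4, 3, 32, 32)          # a fake batch standing in for CIFAR-10 images
logits = model(images)
loss = nn.functional.cross_entropy(logits, torch.tensor([0, 1, 2, 3]))
print(logits.shape, loss.item())             # torch.Size([4, 10]) and a scalar loss
```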
Links:
- Course Materials
- Useful Notes
- Videos (2025)
Navigational hashtags: #armknowledgesharing #armcourses
General hashtags: #cv #computervision #nn #neuralnetworks
@data_science_weekly
Machine Learning System Design Interview by Ali Aminian and Alex Xu
Machine learning system design interviews are the most difficult of all technical interview questions to tackle. This book provides a reliable strategy and knowledge base for approaching a broad range of ML system design questions. It provides a step-by-step framework for tackling an ML system design question and includes many real-world examples to illustrate the systematic approach, with detailed steps you can follow.
This book is an essential resource for anyone interested in ML system design, whether they are beginners or experienced engineers. And if you need to prepare for an ML interview, this book is written specifically for you.
What’s inside?
- An insider’s take on what interviewers really look for and why.
- A 7-step framework for solving any ML system design interview question.
- 10 real ML system design interview questions with detailed solutions.
- 211 diagrams that visually explain how various systems work.
Table Of Contents
Chapter 1 Introduction and Overview
Chapter 2 Visual Search System
Chapter 3 Google Street View Blurring System
Chapter 4 YouTube Video Search
Chapter 5 Harmful Content Detection
Chapter 6 Video Recommendation System
Chapter 7 Event Recommendation System
Chapter 8 Ad Click Prediction on Social Platforms
Chapter 9 Similar Listings on Vacation Rental Platforms
Chapter 10 Personalized News Feed
Chapter 11 People You May Know
Links:
- Paper version
- Digital version
- Solutions
Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #mlsd #machinelearning #machinelearningsystemdesign
@data_science_weekly