Forwarded from TGStat Bot
Summary of the year for the channel "Artem Ryblov’s Data Science Weekly" from @TGStat
👍7
Tech Interview Cheat Sheet
This list is meant to be both a quick guide and reference for further research into these topics. It's basically a summary of that comp sci course you never took or forgot about, so there's no way it can cover everything in depth.
Link: Site
Navigational hashtags: #armknowledgesharing #armtutorials
General hashtags: #interview #techinterview #interviewprep #interviewpreparation
@data_science_weekly
👍4
A/B Testing & Experimentation Roadmap
This roadmap is for analysts, data scientists, and product folks who want to go from “I know what an A/B test is” to running trustworthy, advanced online experiments (CUPED, sequential testing, quasi-experiments, Bayesian, etc.).
It’s organized by topics. You don’t have to go strictly top-to-bottom, but earlier sections are foundations for later ones.
Link: GitHub
Navigational hashtags: #armknowledgesharing #armtutorials
General hashtags: #statistics #abtesting #ab
@data_science_weekly
👍4
Machine Learning Design Primer
Helpful notes for Machine Learning System Design interview preparation, which the author gathered from various resources while preparing for such interviews.
Link: GitHub
Navigational hashtags: #armknowledgesharing #armtutorials
General hashtags: #interview #techinterview #interviewprep #interviewpreparation #mlsd #mlsystemdesign #mlsysdes #systemdesign
@data_science_weekly
👍7
Engineering Math: Differential Equations and Dynamical Systems by Steve Brunton
This series presents a comprehensive introduction and overview to Differential Equations & Dynamical Systems. Dynamical systems are differential equations that describe any system that changes in time. Applications include fluid dynamics, elasticity and vibrations, weather and climate systems, epidemiology, biomechanics, space mission design, and control theory.
The author assumes that students have taken some calculus (but might not remember it) and are interested in modeling the real world.
Link: YouTube
Navigational hashtags: #armknowledgesharing #armcourse
General hashtags: #math #mathematics
@data_science_weekly
👍7
Build a Large Language Model by Sebastian Raschka
In Build a Large Language Model (from Scratch), bestselling author Sebastian Raschka guides you step by step through creating your own LLM. Each stage is explained with clear text, diagrams, and examples. You’ll go from the initial design and creation, to pretraining on a general corpus, and on to fine-tuning for specific tasks.
Build a Large Language Model (from Scratch) teaches you how to:
• Plan and code all the parts of an LLM
• Prepare a dataset suitable for LLM training
• Fine-tune LLMs for text classification and with your own data
• Use human feedback to ensure your LLM follows instructions
• Load pretrained weights into an LLM
Build a Large Language Model (from Scratch) takes you inside the AI black box to tinker with the internal systems that power generative AI. As you work through each key stage of LLM creation, you’ll develop an in-depth understanding of how LLMs work, their limitations, and their customization methods. Your LLM can be developed on an ordinary laptop, and used as your own personal assistant.
Links:
• Amazon
• GitHub
Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #llm #largelanguagemodels #nlp #naturallanguageprocessing
@data_science_weekly
👍9
Elements of Programming Interviews in Python: The Insiders' Guide by Adnan Aziz, Tsung-Hsien Lee and Amit Prakash
EPI is your comprehensive guide to interviewing for software development roles.
The core of EPI is a collection of over 250 problems with detailed solutions. The problems are representative of interview questions asked at leading software companies. The problems are illustrated with 200 figures, 300 tested programs, and 150 additional variants.
The book begins with a summary of the nontechnical aspects of interviewing, such as strategies for a great interview, common mistakes, perspectives from the other side of the table, tips on negotiating the best offer, and a guide to the best ways to use EPI. We also provide a summary of data structures, algorithms, and problem solving patterns.
Coding problems are presented through a series of chapters on basic and advanced data structures, searching, sorting, algorithm design principles, and concurrency. Each chapter starts with a brief introduction, a case study, top tips, and a review of the most important library methods. This is followed by a broad and thought-provoking set of problems.
Links:
• Amazon
• Free Sample
Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #programming #python #algorithms #datastructures #interviewpreparation #interviewprep #interview
@data_science_weekly
👍6
CS50’s Introduction to Databases with SQL
This is CS50’s introduction to databases using a language called SQL.
• Learn how to create, read, update, and delete data with relational databases, which store data in rows and columns.
• Learn how to model real-world entities and relationships among them using tables with appropriate types, triggers, and constraints.
• Learn how to normalize data to eliminate redundancies and reduce potential for errors.
• Learn how to join tables together using primary and foreign keys.
• Learn how to automate searches with views and expedite searches with indexes.
• Learn how to connect SQL with other languages like Python and Java.
The course begins with SQLite for portability’s sake and ends with introductions to PostgreSQL and MySQL for scalability’s sake as well. Assignments are inspired by real-world datasets.
Whereas CS50x itself focuses on computer science more generally as well as programming with C, Python, SQL, and JavaScript, this course, aka CS50 SQL, is entirely focused on SQL. You can take CS50 SQL before CS50x, during CS50x, or after CS50x. But for an introduction to computer science itself, you should still take CS50x!
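As a rough illustration of the topics listed above (keys, joins, views, indexes, and calling SQL from Python), here is a minimal sketch using Python's built-in sqlite3 module. The authors/books schema, the index and view names, and the sample rows are hypothetical and are not taken from the course materials.

```python
import sqlite3

# In-memory SQLite database; the schema below is a made-up example.
conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE authors (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE books (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authors(id)
    );
    -- An index to speed up lookups by author.
    CREATE INDEX idx_books_author ON books(author_id);
    -- A view that stores the join as a reusable query.
    CREATE VIEW book_list AS
        SELECT books.title, authors.name
        FROM books
        JOIN authors ON authors.id = books.author_id;
""")

conn.execute("INSERT INTO authors (id, name) VALUES (1, 'Ada')")
conn.executemany(
    "INSERT INTO books (title, author_id) VALUES (?, ?)",
    [("Notes", 1), ("Letters", 1)],
)
conn.commit()

# Read through the view; each row is a (title, name) tuple.
rows = conn.execute(
    "SELECT title, name FROM book_list ORDER BY title"
).fetchall()

# A book pointing at a non-existent author violates the foreign key.
try:
    conn.execute("INSERT INTO books (title, author_id) VALUES ('Orphan', 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

The foreign key is what lets the database reject the orphaned row, and the view keeps the join in one place so callers query it like a table.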
Links:
• Site
• Lectures (YouTube Playlist)
Navigational hashtags: #armknowledgesharing #armcourses
General hashtags: #sql
@data_science_weekly
👍8
Clean Machine Learning Code by Moussa Taifi
This book explores the hidden fragility and complexity behind real-world machine learning systems. It examines how highly skilled data scientists and ML practitioners often struggle when their models become part of production software, where fragile code, complex dependencies, and poor engineering practices can lead to instability and failure.
Drawing parallels between today’s machine learning boom and earlier eras of software engineering, the book argues that many challenges in ML systems are not entirely new but echoes of long-standing software problems. It highlights the risks posed by overly complex and opaque ML software—especially in a fast-growing field with many inexperienced practitioners—and emphasizes the real-world consequences of unreliable systems.
Ultimately, the book advocates for applying proven software engineering principles to machine learning, offering a path toward building more robust, maintainable, and trustworthy ML systems.
Link: Book
Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #ml #machinelearning #cleancode
@data_science_weekly
👍8
Practical RL
An open course on reinforcement learning in the wild. Taught on-campus at HSE and YSDA and maintained to be friendly to online students (both English and Russian).
Manifesto:
• Optimize for the curious. For all the materials that aren’t covered in detail there are links to more information and related materials (D.Silver/Sutton/blogs/whatever). Assignments will have bonus sections if you want to dig deeper.
• Practicality first. Everything essential to solving reinforcement learning problems is worth mentioning. We won't shy away from covering tricks and heuristics. For every major idea there should be a lab that makes you “feel” it on a practical problem.
• Git-course. Know a way to make the course better? Noticed a typo in a formula? Found a useful link? Made the code more readable? Made a version for an alternative framework? You're awesome! Pull-request it!
Link: GitHub
Navigational hashtags: #armknowledgesharing #armcourses
General hashtags: #rl #reinforcementlearning
@data_science_weekly
👍6
ML Design Doc by Eugene Yan
A template for design docs for machine learning systems based on this post.
Note: This template is a guideline / checklist and is not meant to be exhaustive. The intent of the design doc is to help you think better (about the problem and design) and get feedback. Adopt whichever sections—and add new sections—to meet this goal. View other templates, examples here.
Link: GitHub
Navigational hashtags: #armknowledgesharing #armrepo
General hashtags: #mlsysdes #mlsystemdesign #mlsd
@data_science_weekly
👍8
Linear Algebra Review and Reference by Zico Kolter (updated by Chuong Do and Tengyu Ma)
Just read the book!
Links:
• Linear Algebra Review and Reference
• Probability Theory Review and Reference
Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #linearalgebra #math #mathematics #algebra
@data_science_weekly
👍3