Artem Ryblov’s Data Science Weekly
@artemfisherman’s Data Science Weekly: Elevate your expertise with a standout data science resource each week, carefully chosen for depth and impact.

Long-form content: https://artemryblov.substack.com
🧠 Awesome ChatGPT Prompts

Welcome to the "Awesome ChatGPT Prompts" repository! This is a collection of prompt examples to be used with the ChatGPT model.

The ChatGPT model is a large language model trained by OpenAI that is capable of generating human-like text. By providing it with a prompt, it can generate responses that continue the conversation or expand on the given prompt.

In this repository, you will find a variety of prompts that can be used with ChatGPT.

To get started, simply clone this repository and use the prompts in the README.md file as input for ChatGPT. You can also use the prompts in this file as inspiration for creating your own.

Link: Direct

Navigational hashtags: #armknowledgesharing #armrepo
General hashtags: #prompts #prompt #promptengineering #chatgpt #gpt

@data_science_weekly
Mathematics Of Machine Learning by MIT

Broadly speaking, Machine Learning refers to the automated identification of patterns in data. As such it has been a fertile ground for new statistical and algorithmic developments. The purpose of this course is to provide a mathematically rigorous introduction to these developments with emphasis on methods and their analysis.

Link: Direct

Navigational hashtags: #armknowledgesharing #armcourses
General hashtags: #math #maths #mathematics #ml

@data_science_weekly
Exceptional Resources for Data Science Interview Preparation. Part 3: Specialized Machine Learning

In the previous article, I shared materials for preparing for the Classical Machine Learning stage of the interview.

In this article, we will look at materials that can be used to prepare for the section on specialized machine learning.

Table of contents
- Resources
- Deep Learning
- Natural Language Processing
- Computer Vision
- Graph Neural Networks
- Reinforcement Learning
- Recommender Systems
- Time Series
- Big Data
- Let’s sum it up
- What’s next?


NB:
I'm the author of the article.
It was initially published in Russian on habr.com, and I later published it in English on medium.com. I recommend the Russian version to Russian speakers and the English version to English speakers. Both will benefit from starring the repository, which will be maintained and updated as new resources become available.

Links:
- Medium (eng)
- Habr (rus)

Navigational hashtags: #armknowledgesharing #armarticles
General hashtags: #interview #interviewpreparation #machinelearning #ml #deeplearning #dl #nlp #cv #rl #gnn #recsys

@data_science_weekly
DevOps for Data Science by Alex K Gold

In this book, you’ll learn about DevOps conventions, tools, and practices that can be useful to you as a data scientist. You’ll also learn how to work better with the IT/Admin team at your organization, and even how to do a little server administration of your own if you’re pressed into service.

Link: Direct

Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #devops #mlops #datascience

@data_science_weekly
Bash Scripting Tutorial for Beginners by Herbert Lindemans

Learn bash scripting in this crash course for beginners. Understanding how to use bash scripting will enhance your productivity by automating tasks, streamlining processes, and making your workflow more efficient.

⌨️ (00:00) Introduction
⌨️ (03:24) Basic commands
⌨️ (06:21) Writing your first bash script
⌨️ (11:29) Variables
⌨️ (14:55) Positional arguments
⌨️ (16:23) Output/Input redirection
⌨️ (23:23) Test operators
⌨️ (25:19) If/Elif/Else
⌨️ (28:37) Case statements
⌨️ (32:16) Arrays
⌨️ (34:12) For loop
⌨️ (36:03) Functions
⌨️ (41:31) Exit codes
⌨️ (42:30) AWK
⌨️ (45:11) SED

Link: Video

Navigational hashtags: #armknowledgesharing #armyoutube
General hashtags: #bash #cmd #terminal

@data_science_weekly
Immersive linear algebra by J. Ström, K. Åström, and T. Akenine-Möller

"A picture says more than a thousand words" is a common expression, and for text books, it is often the case that a figure or an illustration can replace a large number of words as well. However, they believe that an interactive illustration can say even more, and that is why they have decided to build their linear algebra book around such illustrations. They believe that these figures make it easier and faster to digest and to learn linear algebra (which would be the case for many other mathematical books as well, for that matter). In addition, they have added some more features (e.g., popup windows for common linear algebra terms) to their book, and they believe that those features will make it easier and faster to read and understand as well.

After using linear algebra for 20 years times three persons, they were ready to write a linear algebra book that they think will make it substantially easier to learn and to teach linear algebra. In addition, the technology of mobile devices and web browsers has improved beyond a certain threshold, so that this book could be put together in a very novel and innovative way (they think). The idea is to start each chapter with an intuitive, concrete example that shows in practice how the math works, using interactive illustrations. After that, the more formal math is introduced, and the concepts are generalized and sometimes made more abstract. They believe it is easier to understand the entire topic of linear algebra with a simple and concrete example cemented into the reader's mind at the beginning of each chapter.

Link: Book

Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #math #linearalgebra #algebra

@data_science_weekly
Oh Shit, Git!?!

Git is hard: screwing up is easy, and figuring out how to fix your mistakes is fucking impossible. Git documentation has this chicken and egg problem where you can't search for how to get yourself out of a mess, unless you already know the name of the thing you need to know about in order to fix your problem.

- I did something terribly wrong, please tell me git has a magic time machine!?!
- I committed and immediately realized I need to make one small change!
- I need to change the message on my last commit!
- I accidentally committed something to master that should have been on a brand new branch!
- I accidentally committed to the wrong branch!
- I tried to run a diff but nothing happened?!
- I need to undo a commit from like 5 commits ago!
- I need to undo my changes to a file!
- I give up

Link

Navigational hashtags: #armknowledgesharing #armarticles
General hashtags: #git #versioncontrol #github #gitlab

@data_science_weekly
Leetcode for ML

Super neat set of machine learning coding challenges.

It could be useful to prep for an exam or ML interview.

Link

Navigational hashtags: #armknowledgesharing #armsites
General hashtags: #ml #dl #machinelearning #deeplearning

@data_science_weekly
NeetCode: A better way to prepare for coding interviews

The best free resources for Coding Interviews. Period.
- Organized study plans and roadmaps (Blind 75, Neetcode 150).
- Detailed video explanations.
- Public Discord community with over 30,000 members.
- Sign in to save your progress.

Links:
- Roadmap
- Practice (Core Skills, Blind 75, Neetcode 150, Neetcode All)
- Algorithms and Data Structures for Beginners (paid course)
- Advanced Algorithms (paid course)

Navigational hashtags: #armknowledgesharing #armsites #armtutorials
General hashtags: #leetcode #python #algorithms #datastructures #interviewpreparation #technicalinterview

@data_science_weekly
Write faster Python code, and ship your code faster

Faster and more memory efficient data
- Articles: Learn how to speed up your code and reduce memory usage.
- Products: Observability and profiling tools to help you identify bottlenecks in your code.

Docker packaging for Python
- Articles: Learn how to package your Python application for production.
- Products: Educational books and pre-written software templates.
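
As a rough illustration of the kind of memory measurement such articles deal with, the sketch below uses the standard-library `tracemalloc` module to compare peak memory when summing squares through a list versus a generator. The function names `peak_bytes`, `sum_with_list`, and `sum_with_generator` are made up for this example and do not come from the site.

```python
import tracemalloc


def peak_bytes(func, *args):
    """Run func(*args) and return (result, peak traced memory in bytes)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def sum_with_list(n):
    # Materializes all n squares in a list before summing them.
    return sum([i * i for i in range(n)])


def sum_with_generator(n):
    # Yields squares one at a time, so no full list is ever built.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    n = 200_000
    list_result, list_peak = peak_bytes(sum_with_list, n)
    gen_result, gen_peak = peak_bytes(sum_with_generator, n)
    print(f"list:      {list_peak:>12,} bytes at peak")
    print(f"generator: {gen_peak:>12,} bytes at peak")
```

Both functions return the same sum; only the peak memory differs, which is the sort of trade-off a profiler helps to surface.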

Navigational hashtags: #armknowledgesharing #armsites
General hashtags: #python #development #docker

@data_science_weekly
Ace the SQL Interview by Nick Singh

Practice the most common SQL & Data Interview Questions and Learn SQL.

Navigational hashtags: #armknowledgesharing #armsites
General hashtags: #sql

@data_science_weekly
Applied Causal Inference Powered by ML and AI by Victor Chernozhukov, Christian Hansen, Nathan Kallus, Martin Spindler, Vasilis Syrgkanis

An introduction to the emerging fusion of machine learning and causal inference.

The book introduces ideas from classical structural equation models (SEMs) and their modern AI equivalent, directed acyclic graphs (DAGs) and structural causal models (SCMs), and presents Debiased Machine Learning methods to do inference in such models using modern predictive tools.

Links:
- PDF
- Site
- GitHub

Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #statistics #ml #ai #causal #causalinference

@data_science_weekly
Applied Geospatial Data Science with Python: Leverage geospatial data analysis and modeling to find unique solutions to environmental problems by David S. Jordan

Key Features
- Learn how to integrate spatial data and spatial thinking into traditional data science workflows
- Develop a spatial perspective and learn to avoid common pitfalls along the way
- Gain expertise through practical case studies applicable in a variety of industries with code samples that can be reproduced and expanded

Table of Contents
1. Introducing Geographic Information Systems and Geospatial Data Science
2. What Is Geospatial Data and Where Can I Find It?
3. Working with Geographic and Projected Coordinate Systems
4. Exploring Geospatial Data Science Packages
5. Exploratory Data Visualization
6. Hypothesis Testing and Spatial Randomness
7. Spatial Feature Engineering
8. Spatial Clustering and Regionalization
9. Developing Spatial Regression Models
10. Developing Solutions for Spatial Optimization Problems
11. Advanced Topics in Spatial Data Science

Links:
- Amazon
- Packt
- GitHub

Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #datascience #geo #geospatial

@data_science_weekly
Introduction to Machine Learning (I2ML) by LMU Munich

This website offers an open and free introductory course on (supervised) machine learning. The course is designed to be as self-contained as possible and enables self-study through lecture videos, PDF slides, cheatsheets, quizzes, exercises (with solutions), and notebooks.

The quite extensive material can roughly be divided into:
- An introductory undergraduate part (chapters 1-10)
- A more advanced second one on MSc level (chapters 11-19)
- A third course, on MSc level (chapters 20-23).

A key goal of the course is to teach the fundamental building blocks behind ML, instead of introducing “yet another algorithm with yet another name”. We discuss, compare, and contrast risk minimization, statistical parameter estimation, the Bayesian viewpoint, and information theory and demonstrate that all of these are equally valid entry points to ML. Developing the ability to take on and switch between these perspectives is a major goal of this course, and in our opinion not always ideally presented in other courses.

Link:
- Main Course Website

Navigational hashtags: #armknowledgesharing #armcourses
General hashtags: #ml #machinelearning #supervised

@data_science_weekly
Forecasting: Principles and Practice by Rob J Hyndman and George Athanasopoulos

This textbook is intended to provide a comprehensive introduction to forecasting methods and to present enough information about each method for readers to be able to use them sensibly. The authors do not attempt to give a thorough discussion of the theoretical details behind each method, although the references at the end of each chapter fill in many of those details.

The book is written for three audiences:
(1) people finding themselves doing forecasting in business when they may not have had any formal training in the area;
(2) undergraduate students studying business;
(3) MBA students doing a forecasting elective. The authors also use it themselves for master's students and third-year undergraduate students at Monash University, Australia.

For most sections, the authors assume only that readers are familiar with introductory statistics and high-school algebra. A couple of sections also require knowledge of matrices, but these are flagged.

At the end of each chapter, the authors provide a list of “further reading”. In general, these lists comprise suggested textbooks that provide a more advanced or detailed treatment of the subject. Where there is no suitable textbook, they suggest journal articles that provide more information.

Link: Book Website

Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #forecasting #timeseries #ts

@data_science_weekly
The Cartoon Guide to Statistics by Larry Gonick, Woollcott Smith

The Cartoon Guide to Statistics covers all the central ideas of modern statistics: the summary and display of data, probability in gambling and medicine, random variables, Bernoulli Trials, the Central Limit Theorem, hypothesis testing, confidence interval estimation, and much more - all explained in simple, clear, and yes, funny illustrations. Never again will you order the Poisson Distribution in a French restaurant!
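
To make two of those ideas concrete, here is a minimal simulation sketch (not taken from the book) that combines Bernoulli trials with the Central Limit Theorem: sample means of n trials cluster around the success probability p with a spread close to sqrt(p(1-p)/n), and roughly two thirds of them land within one such standard deviation, as a normal approximation suggests. The function names and parameter values are assumptions chosen for illustration.

```python
import random
import statistics


def sample_means(p, n, repeats, seed=0):
    """Return `repeats` sample means, each computed over n Bernoulli(p) trials."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    means = []
    for _ in range(repeats):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        means.append(successes / n)
    return means


def fraction_within(means, center, width):
    """Fraction of values lying within `width` of `center`."""
    return sum(1 for m in means if abs(m - center) <= width) / len(means)


if __name__ == "__main__":
    p, n = 0.3, 400
    means = sample_means(p, n, repeats=2000)
    theoretical_sd = (p * (1 - p) / n) ** 0.5
    print(f"mean of sample means: {statistics.mean(means):.4f} (p = {p})")
    print(f"sd of sample means:   {statistics.stdev(means):.4f} "
          f"(theory: {theoretical_sd:.4f})")
    print(f"share within one sd:  "
          f"{fraction_within(means, p, theoretical_sd):.2f}")
```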

Links:
- Amazon
- Internet Archive

Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #statistics #stats #probability

@data_science_weekly
Practitioners guide to MLOps: A framework for continuous delivery and automation of machine learning by Google Cloud

Across industries, DevOps and DataOps have been widely adopted as methodologies to improve quality and reduce the time to market of software engineering and data engineering initiatives. With the rapid growth in machine learning (ML) systems, similar approaches need to be developed in the context of ML engineering, which handle the unique complexities of the practical applications of ML. This is the domain of MLOps. MLOps is a set of standardized processes and technology capabilities for building, deploying, and operationalizing ML systems rapidly and reliably.

The document is in two parts. The first part, an overview of the MLOps lifecycle, is for all readers. It introduces MLOps processes and capabilities and why they’re important for successful adoption of ML-based systems.

The second part is a deep dive on the MLOps processes and capabilities. This part is for readers who want to understand the concrete details of tasks like running a continuous training pipeline, deploying a model, and monitoring predictive performance of an ML model.

Link: Book

Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #mlops

@data_science_weekly
CS324 - Large Language Models by Stanford University

The field of natural language processing (NLP) has been transformed by massive pre-trained language models. They form the basis of all state-of-the-art systems across a wide range of tasks and have shown an impressive ability to generate fluent text and perform few-shot learning. At the same time, these models are hard to understand and give rise to new ethical and scalability challenges. In this course, students will learn the fundamentals about the modeling, theory, ethics, and systems aspects of large language models, as well as gain hands-on experience working with them.

TABLE OF CONTENTS
- Introduction
- Capabilities
- Harms I
- Harms II
- Data
- Security
- Legality
- Modeling
- Training
- Parallelism
- Scaling laws
- Selective architectures
- Adaptation
- Environmental impact

Link: Course

Navigational hashtags: #armknowledgesharing #armcourses
General hashtags: #nlp #llm #transformer

@data_science_weekly
Deep Learning with Python by François Chollet

Deep Learning with Python, Second Edition introduces the field of deep learning using Python and the powerful Keras library. In this revised and expanded new edition, Keras creator François Chollet offers insights for both novice and experienced machine learning practitioners. As you move through this book, you’ll build your understanding through intuitive explanations, crisp color illustrations, and clear examples. You’ll quickly pick up the skills you need to start developing deep-learning applications.

What's inside:
- Deep learning from first principles
- Image classification and image segmentation
- Time series forecasting
- Text classification and machine translation
- Text generation, neural style transfer, and image generation
- Printed in full color throughout

Link: Book

Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #dl #deeplearning #keras

@data_science_weekly
Competitive Programmer’s Handbook by Antti Laaksonen

The purpose of this book is to give you a thorough introduction to competitive programming. It is assumed that you already know the basics of programming, but no previous background in competitive programming is needed.

The book is especially intended for students who want to learn algorithms and possibly participate in the International Olympiad in Informatics (IOI) or in the International Collegiate Programming Contest (ICPC). Of course, the book is also suitable for anybody else interested in competitive programming.

It takes a long time to become a good competitive programmer, but it is also an opportunity to learn a lot. You can be sure that you will get a good general understanding of algorithms if you spend time reading the book, solving problems and taking part in contests.

Link: Book

Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #leetcode #programming #competitiveprogramming

@data_science_weekly
The Querynomicon. An Introduction to SQL for Weary Data Scientists

Upon first encountering SQL after two decades of Fortran, C, Java, and Python, the author thought he had stumbled into hell. He quickly realized that was optimistic: after all, hell has rules.

The author has since realized that SQL does too, and that they are no more confusing or contradictory than those of most other programming languages. They only appear so because SQL draws on a tradition unfamiliar to those of us raised with derivatives of C. To quote Terry Pratchett, it is not mad, just differently sane.

Welcome, then, to a world in which the strange will become familiar, and the familiar, strange. Welcome, thrice welcome, to SQL.

Table of contents:

1. Introduction
2. Core Features
3. Tools
4. Advanced Features
5. Python
6. R
7. PostgreSQL
8. Conclusion
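
For a taste of how SQL and Python can fit together, the following sketch runs a small aggregate query through Python's standard-library `sqlite3` module. The `reviews` table, its columns, its rows, and the function name `average_score_by_reviewer` are all invented for this example and are not taken from the tutorial.

```python
import sqlite3


def average_score_by_reviewer(rows):
    """Load (reviewer, score) rows into SQLite and average per reviewer."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute("CREATE TABLE reviews (reviewer TEXT, score REAL)")
        # Placeholders (?) let the driver handle quoting and NULLs safely.
        conn.executemany("INSERT INTO reviews VALUES (?, ?)", rows)
        cursor = conn.execute(
            """
            SELECT reviewer, AVG(score) AS avg_score
            FROM reviews
            WHERE score IS NOT NULL      -- skip missing values explicitly
            GROUP BY reviewer
            ORDER BY reviewer
            """
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [("ana", 4.0), ("ana", 5.0), ("bo", 3.0), ("bo", None)]
    for reviewer, avg_score in average_score_by_reviewer(sample):
        print(reviewer, avg_score)
```

Python's `None` is stored as SQL `NULL`, which is why the query filters it out before grouping.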

Link: Tutorial

Navigational hashtags: #armknowledgesharing #armtutorials
General hashtags: #sql

@data_science_weekly