AlexTCH
A bit about programming, a bit about Computer Science and Data Science, and a little coffee. Plus assorted nonsense in lieu of Twitter. :)
I was recommended the book "An Introduction to Statistical Learning":
https://hastie.su.domains/ISLR2/
and the lectures based on it:
https://www.youtube.com/watch?v=5N9V07EIfIg&list=PLOg0ngHtcqbPTlZzRHA2ocQZqB1D_qZ5V

The lectures seem pretty fun. 😊
#statistics #datascience #book #lectures
https://nickchk.com/causalgraphs.html
Causal inference! With animations! 😄

The post explains and illustrates basic notions and methods of causal inference with examples from econometrics. And animated plots, yep.

#causalinference #statistics
https://lost-stats.github.io/
"LOST is a Rosetta Stone for statistical software"

Or "Rosetta Code". Useful reference either way.

#statistics #datascience
https://nickchk.com/robustness.html

A short practical guide to robustness tests in #statistics. It even has a "checklist" to fill in! 😄 And a list of misconceptions too.
https://mlg.eng.cam.ac.uk/zoubin/papers/ijprai.pdf
"An introduction to Hidden Markov Models and Bayesian Networks" Ghahramani, 2001

#statistics #bayes
https://www.markhw.com/blog/aphextwin

#Statistics proves: Aphex Twin has a more diverse discography than his peers. 😁

Very interesting post showcasing clever statistical analysis.
https://statmodeling.stat.columbia.edu/2020/07/02/no-i-dont-believe-that-claim-based-on-regression-discontinuity-analysis-that/

A great post with thorough replication, reanalysis and discussion; we might even say debunking. Also an example of a pretty decent scientific discussion, plus deep technical dives in the comments.

#statistics #rdd
https://statmodeling.stat.columbia.edu/2023/01/03/explanation-and-reproducibility-in-data-driven-science-new-course/

WOW, a great reading list on #statistics and #machinelearning! And an important topic for a course, especially one targeting CS students.
https://dmkpress.com/catalog/computer/statistics/978-5-93700-245-7/
«В поисках эффекта. Планирование экспериментов и причинный вывод в статистике» ("In Search of the Effect: Experimental Design and Causal Inference in Statistics")

DMK Press has translated Nick Huntington-Klein's book into Russian and published it; the original (plus lecture recordings to boot) is available online:
https://www.theeffectbook.net/
(or can be bought on Amazon).

I wholeheartedly approve of and support this. First, I'm a fan of causal inference (in Russian "причинный вывод", or maybe "вывод причин"?). Nobody cares that sales growth correlates with a bigger marketing budget; everyone wants to know whether increasing the marketing budget actually leads to (additional) sales growth or not. Second, I know the author as the initiator and main contributor of https://lost-stats.github.io/ and from his YouTube videos, where he comes across as a knowledgeable and cheerful fellow. Third, in his textbook he emphasizes a conceptual understanding of causality or its absence, experimental design, and causal graphs a la Judea Pearl. But he also provides example implementations of causal inference methods in R, Python and Stata (these and the homework assignments can be grabbed from GitHub via links on the book's site).

The only thing that raises suspicion is the translation of statistical terms. For some reason "estimator" is rendered as "оцениватель", and I don't think our statisticians actually say that... Otherwise, the translation looks quite decent and conveys the author's slightly playful style.

#book #statistics #causalinference
Again on my favorite topic of correlation, causation and control systems.

"When causation does not imply correlation" presents a pretty technical analysis (with a couple of theorems) of the conditions under which a control system "breaks the circuit" and decorrelates variables:
https://arxiv.org/abs/1505.03118

I wrestled with this issue when I was identifying the parameters of a control system from sensor data using machine learning. Judging from the data, some actions had no effect, because they kicked in precisely to counteract another force and keep the readings the same.
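
A toy sketch of the effect, under assumed numbers (this is not code from the paper or from that project). It models an idealized controller that cancels a disturbance exactly, so the logged action looks unrelated to the reading even though it causally drives it. The names `simulate` and `pearson`, the unit-variance Gaussian disturbance and the 0.1 sensor noise are all hypothetical choices made for illustration.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n, controlled, seed=0):
    """Return (actions, readings) for n time steps.

    Assumed model: reading = disturbance + action + sensor noise.
    controlled=True  -> the action exactly counteracts the disturbance.
    controlled=False -> the action is randomized, independent of the disturbance.
    """
    rng = random.Random(seed)
    actions, readings = [], []
    for _ in range(n):
        disturbance = rng.gauss(0, 1)
        if controlled:
            action = -disturbance      # idealized controller (assumption)
        else:
            action = rng.gauss(0, 1)   # randomized intervention
        reading = disturbance + action + rng.gauss(0, 0.1)
        actions.append(action)
        readings.append(reading)
    return actions, readings


if __name__ == "__main__":
    for controlled in (True, False):
        actions, readings = simulate(20_000, controlled)
        print(f"controlled={controlled}: corr(action, reading) = "
              f"{pearson(actions, readings):.3f}")
```

With the controller on, the correlation should come out close to zero; with randomized actions the very same causal link shows up as a clearly positive correlation (roughly 0.7 for these assumed variances).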

Then the "Slime Mold" guys rediscovered this effect, and they provide nice, approachable illustrations:
https://slimemoldtimemold.com/2022/03/15/control-and-correlation/

More comments from Gelman's blog, including the long historical roots of this observation:
https://statmodeling.stat.columbia.edu/2024/01/15/a-feedback-loop-can-destroy-correlation-this-idea-comes-up-in-many-places/

#statistics #machinelearning
https://statmodeling.stat.columbia.edu/2024/08/21/which-books-papers-and-blogs-are-in-the-bayesian-canon/

(An attempt at establishing) A "Bayesian Canon": a list of "must read" books, papers and blogs about #bayesian #statistics

They aren't really must-reads, but there are some extremely interesting specimens.
And another #free #book on #statistics from the list above:

https://tellingstorieswithdata.com/

It covers data search, acquisition, preparation and storage, exploratory data analysis, generalized linear models, causal inference, multilevel regression and post-stratification, visualization and reporting, and making the workflow reproducible.

Examples are in R (employing the Tidyverse), and there are questions and exercises at the end of every chapter.