One of my favorite tricks is adding a constant to each of the independent variables in a regression so as to shift the intercept. Of course, just shifting the data will not change R-squared, the slopes, the overall F-statistic, the P-values of the slopes, etc., so why do it?
Because just about any software package capable of doing regression, even Excel, can give you standard errors and confidence intervals for the Intercept, but it is much harder to get most packages to give you standard errors and confidence intervals around the predicted value of the dependent variable for OTHER combinations of the independent variables. Shifting the intercept is an easy way to get confidence intervals for arbitrary combinations of the independent variables.
This sort of thing becomes especially important at a time when the Statistics community is loudly calling for a move away from P-values. Instead it is recommended that researchers give confidence intervals in clinically meaningful terms.
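A minimal sketch of the trick with a single predictor, in plain Python. The data and the point of interest `x0` are made up for illustration, and `fit_line` is a hypothetical helper, not a function from any package. Subtracting `x0` from the predictor (that is, adding the constant `-x0`) leaves the slope alone and turns the intercept into the predicted value at `x0`, so the intercept's standard error is the standard error of that prediction.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x.

    Returns the intercept a, the slope b, and the standard error of a.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    # Residual standard deviation with n - 2 degrees of freedom
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(rss / (n - 2))
    se_a = s * math.sqrt(1.0 / n + x_bar ** 2 / sxx)
    return a, b, se_a

# Made-up illustrative data
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
x0 = 5.5  # hypothetical predictor value we want an interval for

# Original fit: the intercept describes x = 0
a, b, se_a = fit_line(x, y)

# Shifted fit: the intercept now describes x = x0
x_shifted = [xi - x0 for xi in x]
a0, b0, se_a0 = fit_line(x_shifted, y)

pred_at_x0 = a + b * x0
print("slope, original vs shifted:", b, b0)
print("prediction at x0 vs shifted intercept:", pred_at_x0, a0)
print("standard error of the prediction at x0:", se_a0)
```

A confidence interval for the mean response at `x0` is then the shifted intercept plus or minus a t critical value (n - 2 degrees of freedom) times `se_a0`, which is exactly the intercept interval that regression software already reports. With several predictors the same idea applies: shift each one by the value of interest.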
#data #researchers #statistics #r #excel #regression
✴️ @AI_Python_EN
Important Machine Learning algorithms and their Hyperparameters 
#machinelearning #datascience #statistics #algorithms
✴️ @AI_Python_EN
Why statistics should make you suspicious
Spiegelhalter on algorithm, luck, bias, probabilities, machine learning and AI.
https://lnkd.in/e-X9hXJ
#artificialintelligence #bias #ai #statistics #ai #bigdata
✴️ @AI_Python_EN
Here are some #statistics and research #journals I can recommend:
- Statistical Analysis and Data Mining (ASA)
- Analytics Journal (DMA)
- The American Statistician (ASA)
- Journal of the American Statistical Association (ASA)
- Statistics in Biopharmaceutical Research (ASA)
- Journal of Agricultural, Biological, and Environmental Statistics (ASA)
- Journal of Statistics Education (ASA)
- Statistics and Public Policy (ASA)
- Journal of Survey Statistics and Methodology (AAPOR and ASA)
- Journal of Educational and Behavioral Statistics (ASA)
- British Journal of Mathematical and Statistical Psychology (Wiley)
- Statistics Surveys (IMS)
- Stata Journal (StataCorp)
- The R Journal (R Project)
- Structural Equation Modeling: A Multidisciplinary Journal (Routledge)
- Journal of Business & Economic Statistics (ASA)
- Journal of Marketing Research (AMA)
- Journal of Computational and Graphical Statistics (ASA)
- Journal of Artificial General Intelligence (AGIS)
These are not purely theoretical publications and provide plenty of examples I can adapt for my own work. I try to read them as regularly as I can.
There's so much innovation happening in analytics that it's hard to keep up!
✴️ @AI_Python_EN
Don't stop sharing, done is better than perfect
To people who keep blaming, condemning and complaining online, especially in reaction to simplified content about statistics, programming and machine learning: look for the value in the imperfections of others.
We both know that machine learning models will never be perfect. As George E. P. Box said, "all models are wrong, but some are useful". The same goes for the content mentioned above: details are often left out to aid understanding, actionability and business value, and to spread knowledge more widely.
Not all of us will run into cases on every topic such content covers, but even partial knowledge can give us the chance to work on a better process, and even to help people.
Don't stop sharing, done is better than perfect
#programming #statistics #machinelearning
✴️ @AI_Python_EN
Machine Learning (ML) & Artificial Intelligence (AI): From Black Box to White Box Models in 4 Steps - Resources for Explainable AI & ML Model Interpretability.
✔️STEP 1 - ARTICLES
- (short) KDnuggets article: https://lnkd.in/eRyTXcQ
- (long) O'Reilly article: https://lnkd.in/ehMHYsr
✔️STEP 2 - BOOKS
- Interpretable Machine Learning: A Guide for Making Black Box Models Explainable (free e-book): https://lnkd.in/eUWfa5y
- An Introduction to Machine Learning Interpretability: An Applied Perspective on Fairness, Accountability, Transparency, and Explainable AI (free e-book): https://lnkd.in/dJm595N
✔️STEP 3 - COLLABORATE
- Join Explainable AI (XAI) Group: https://lnkd.in/dQjmhZQ
✔️STEP 4 - PRACTICE
- Hands-On Practice: Open-Source Tools & Tutorials for ML Interpretability (Python/R): https://lnkd.in/d5bXgV7
- Python Jupyter Notebooks: https://lnkd.in/dETegUH
#machinelearning #datascience #analytics #bigdata #statistics #artificialintelligence #ai #datamining #deeplearning #neuralnetworks #interpretability #science #research #technology #business #healthcare
✴️ @AI_Python_EN
#Statistics has many uses but, fundamentally, it's a systematic way of dealing with uncertainty. When something is certain, there is no need to bring in a statistician or ask anyone for their counsel.
Since we're concerned with uncertainty, statisticians approach questions probabilistically. To conclude that something is likely to be true does not mean we're claiming it IS true, only that it's more likely to be true than not.
We may estimate this probability as being very high but, again, this is not saying the #probability is perfect (1.0).
Statisticians also think in terms of conditional probabilities, which means we've estimated the probability after having taken other information into account.
For instance, we might estimate the probability of a person buying a certain type of product within the next three months as 0.7 because he is a 25-year-old male. This estimate may have been made with a statistical model and data from thousands or millions of other consumers. For a 55-year-old woman our estimate might be 0.15.
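As a rough illustration of what "conditional" means here, the sketch below estimates purchase probabilities from simple counts. The counts are invented so that they reproduce the 0.7 and 0.15 figures above, and the segment labels are hypothetical. A real estimate would more likely come from a statistical model than from a raw table.

```python
# Hypothetical counts: (number who bought, number observed) per segment.
# These numbers are invented to match the 0.7 and 0.15 in the text.
segments = {
    ("male", 25): (700, 1000),
    ("female", 55): (150, 1000),
}

# Conditional probability: P(buy | segment) = buyers in segment / segment size
conditional = {
    segment: bought / total for segment, (bought, total) in segments.items()
}

# Marginal probability: P(buy), ignoring what we know about the person
total_bought = sum(bought for bought, _ in segments.values())
total_people = sum(total for _, total in segments.values())
marginal = total_bought / total_people

for segment, p in conditional.items():
    print(segment, "P(buy | segment) =", p)
print("P(buy) with no other information =", marginal)
```

The marginal figure sits between the two conditional ones, which is the point: taking other information into account changes the estimate.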
Part of the challenge of being a statistician is that decision-makers often come to us for definitive yes-or-no answers. They can become irritated when we ask for more information or give them very qualified recommendations.
It ain't just math and programming!
Tip: If someone says, for example, that A is not the only possible explanation for something and that B, C, or D are other possibilities, a common reaction is for the other party to conclude that the first person is saying A is NOT a possible explanation. Humans are funny people.
 
✴️ @AI_Python_EN
As the author states: "work in process and even in an early dirty phase"
But still very cool 🙂
Book: Predictive Models: Visual Exploration, Explanation and Debugging With examples in R and Python By Przemyslaw Biecek
#book #datascience #machinelearning #statistics #programming_language
🌎 Book
✴️ @AI_Python_EN
Sharing knowledge on one of the basic topics in Statistics and Machine Learning.
"Assumptions of Linear Regression"
Understanding the assumptions is very important for anybody to build a robust model and improve the performance.
#machinelearning #AIML #statistics #artificialintelligence
https://lnkd.in/eJupcDZ
✴️ @AI_Python_EN
There are now many methods we can use when our dependent variable is not continuous. SVM, XGBoost and Random Forests are some popular ones.
There are also "traditional" methods, such as Logistic Regression. These usually scale well and, when used properly, are competitive in terms of predictive accuracy.
They are probabilistic models, which gives them additional flexibility. They also are often easier to interpret, critical when the goal is explanation, not just prediction.
They can be more work, however, and are probably easier to misuse than newer methods such as Random Forests. Here are some excellent books on these methods that may be of interest:
- Categorical Data Analysis (Agresti)
- Analyzing Categorical Data (Simonoff)
- Regression Models for Categorical Dependent Variables (Long and Freese)
- Generalized Linear Models and Extensions (Hardin and Hilbe)
- Regression Modeling Strategies (Harrell)
- Applied Logistic Regression (Hosmer and Lemeshow)
- Logistic Regression Models (Hilbe)
- Analysis of Ordinal Categorical Data (Agresti)
- Applied Ordinal Logistic Regression (Liu)
- Modeling Count Data (Hilbe)
- Negative Binomial Regression (Hilbe)
- Handbook of Survival Analysis (Klein et al.)
- Survival Analysis: A Self-Learning Text (Kleinbaum and Klein)
#statistics #book #Machinelearning
✴️ @AI_Python
#Statistics such as correlation, mean and standard deviation (variance) create strong visual images and meaning.  Two different #datasets with the same correlation would sort of look the same.  Right?
Not so much.
Each of these very different-looking graphs plots a dataset with the same correlation, mean and SD. This is why plotting data is so important, though oddly it is so rarely done (in my experience).
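The same point can be checked numerically. The sketch below uses Anscombe's quartet, a classic published example, as a stand-in; it is not necessarily the dataset behind the linked graphs. The helper functions are written out by hand so that nothing beyond the standard library is assumed.

```python
import math

def mean(v):
    return sum(v) / len(v)

def sd(v):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = mean(v)
    return math.sqrt(sum((a - m) ** 2 for a in v) / (len(v) - 1))

def corr(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Anscombe's quartet: sets I-III share the same x values
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# Summary statistics per dataset: mean of x, mean of y, SD of y, correlation
summary = {
    name: (mean(x), mean(y), sd(y), corr(x, y))
    for name, (x, y) in quartet.items()
}

for name, (mx, my, sy, r) in summary.items():
    print(f"{name:>3}: mean x={mx:.2f}  mean y={my:.2f}  SD y={sy:.2f}  r={r:.2f}")
```

The four summary rows come out nearly identical, yet a scatter plot of each set shows a straight-line cloud, a curve, a line with one outlier, and a vertical stack with one extreme point.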
https://bit.ly/2oZ29MP
✴️ @AI_Python_EN
The field of statistics has a very long history, dating back to ancient times.
Much of marketing data science can be traced to the origins of actuarial science, demography, sociology and psychology, with early statisticians playing major roles in all of these fields.
Big is relative, and statisticians have been working with "big data" all along. "Machine learners" such as SVM and random forests originated in statistics, and neural nets were inspired as much by regression as by theories of the human brain.
Statisticians are involved in a diverse range of fields, including marketing, psychology, pharmacology, economics, meteorology, political science and ecology, and have helped develop research methods and analytics for nearly any kind of data.
The history and richness of #statistics is not always appreciated, though. For example, this morning I was asked "How's your #machinelearning?" :-)
✴️ @AI_Python_EN
Sampling is a deceptively complex subject, and some academic statisticians have devoted the bulk of their careers to it.
It's not a subject that thrills everyone but is a very important one, and one which seems underappreciated in marketing research and #data science.
Here are some books on or related to sampling I've found helpful:
- Survey Sampling (Kish)
- Sampling Techniques (Cochran)
- Model Assisted Survey Sampling (Särndal et al.)
- Sampling: Design and Analysis (Lohr)
- Practical Tools for Designing and Weighting Survey Samples (Valliant et al.)
- Survey Weights: A Step-by-step Guide to Calculation (Valliant and Dever)
- Complex Surveys (Lumley)
- Hard-to-Survey Populations (Tourangeau et al.)
- Small Area Estimation (Rao and Molina)
The first three are regarded as classics (though still relevant). Sharon Lohr's book is the friendliest introduction I know of on this subject. Standard marketing research textbooks also give simple overviews of sampling but do not go into depth.
There are also academic journals that feature articles on sampling, such as the Public Opinion Quarterly (AAPOR) and the Journal of Survey #Statistics and Methodology (AAPOR and ASA).
✴️ @AI_Python_EN
This is Your Brain on Code 🧠💻🔢 Computer programming is often associated with math, but researchers used functional MRI scans to show the role of the brain's language processing centers: https://lnkd.in/eN_-3RA
#datascience #machinelearning #ai #bigdata #analytics #statistics #artificialintelligence #datamining #computing #programmers #neuroscience
✴️ @AI_Python_EN
Uncertainty in big data analytics: survey, opportunities, and challenges
https://journalofbigdata.springeropen.com/articles/10.1186/s40537-019-0206-3
#BigData #statistics #NLP
✴️ @AI_Python_EN
#AI/ #DataScience/ #MachineLearning/ #ML:
7 Steps for Data Preparation Using #Python
Link => https://bit.ly/PyDataPrep
#datamining #statistics #bigdata #artificialintelligence
✴️ @AI_Python_EN
Our article, "Data in the Life: Authorship Attribution in Lennon-McCartney Songs", was just published in the first issue of the Harvard Data Science Review (HDSR), the inaugural publication of the Harvard Data Science Initiative, published by the MIT Press. Combining features of a premier research journal, a leading educational publication, and a popular magazine, HDSR leverages digital technologies and data visualizations to facilitate author-reader interactions globally. Besides our article, the first issue features articles on topics ranging from machine learning models for predicting drug approvals to artificial intelligence. Read it now:
https://bit.ly/2Kuze2q.
#datascience #bigdata #machinelearning #statistics #AI
✴️ @AI_Python_EN
