Data Analytics & AI | SQL Interviews | Power BI Resources
🔓 Explore the fascinating world of Data Analytics & Artificial Intelligence

💻 Best AI tools, free resources, and expert advice to land your dream tech job.

Admin: @coderfun
Bayesian Data Analysis
Artificial Intelligence for Robotics.epub
24 MB
Artificial Intelligence for Robotics
Francis X. Govers, 2018
Ultimate ChatGPT Handbook for Enterprises.pdf
18.3 MB
Ultimate ChatGPT Handbook for Enterprises
Harald Gunia, 2024
Complete Syllabus for Data Analytics interview:

SQL:
1. Basic   
- SELECT statements with WHERE, ORDER BY, GROUP BY, HAVING   
- Basic JOINS (INNER, LEFT, RIGHT, FULL)   
- Creating and using simple databases and tables

2. Intermediate   
- Aggregate functions (COUNT, SUM, AVG, MAX, MIN)   
- Subqueries and nested queries
- Common Table Expressions (WITH clause)   
- CASE statements for conditional logic in queries

3. Advanced
- Advanced JOIN techniques (self-join, non-equi join)
- Window functions (OVER, PARTITION BY, ROW_NUMBER, RANK, DENSE_RANK, LEAD, LAG)
- Query optimization with indexing
- Data manipulation (INSERT, UPDATE, DELETE)
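
To make the intermediate and advanced SQL topics above a bit more concrete, here is a minimal sketch that combines a CTE with a window function, run through Python's built-in sqlite3 module. The sales table, its columns, and the sample rows are made up for illustration, and window functions assume SQLite 3.25 or newer.

```python
import sqlite3

# In-memory database with a hypothetical "sales" table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, rep TEXT, amount INTEGER);
INSERT INTO sales VALUES
  ('North', 'Asha', 500), ('North', 'Ben', 300), ('North', 'Cara', 500),
  ('South', 'Dev', 200), ('South', 'Eli', 400);
""")

# The CTE aggregates per rep; RANK() then ranks reps inside each region.
query = """
WITH regional AS (
    SELECT region, rep, SUM(amount) AS total
    FROM sales
    GROUP BY region, rep
)
SELECT region, rep, total,
       RANK() OVER (PARTITION BY region ORDER BY total DESC) AS rnk
FROM regional
ORDER BY region, rnk, rep;
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note how RANK leaves a gap after ties (1, 1, 3), which is the usual interview follow-up when comparing it with DENSE_RANK and ROW_NUMBER.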

Python:
1. Basic   
- Syntax, variables, data types (integers, floats, strings, booleans)   
- Control structures (if-else, for and while loops)   
- Basic data structures (lists, dictionaries, sets, tuples)   
- Functions, lambda functions, error handling (try-except)   
- Modules and packages

2. Pandas & Numpy   
- Creating and manipulating DataFrames and Series   
- Indexing, selecting, and filtering data   
- Handling missing data (fillna, dropna)   
- Data aggregation with groupby, summarizing data   
- Merging, joining, and concatenating datasets

3. Basic Visualization   
- Basic plotting with Matplotlib (line plots, bar plots, histograms)   
- Visualization with Seaborn (scatter plots, box plots, pair plots)   
- Customizing plots (sizes, labels, legends, color palettes)   
- Introduction to interactive visualizations (e.g., Plotly)

Excel:
1. Basic   
- Cell operations, basic formulas (SUMIFS, COUNTIFS, AVERAGEIFS, IF, AND, OR, NOT, nested functions, etc.)
- Introduction to charts and basic data visualization   
- Data sorting and filtering   
- Conditional formatting

2. Intermediate   
- Advanced formulas (VLOOKUP/XLOOKUP, INDEX-MATCH, nested IF)
- PivotTables and PivotCharts for summarizing data   
- Data validation tools   
- What-if analysis tools (Data Tables, Goal Seek)

3. Advanced   
- Array formulas and advanced functions   
- Data Model & Power Pivot
- Advanced Filter
- Slicers and Timelines in Pivot Tables   
- Dynamic charts and interactive dashboards

Power BI:
1. Data Modeling   
- Importing data from various sources   
- Creating and managing relationships between different datasets   
- Data modeling basics (star schema, snowflake schema)

2. Data Transformation   
- Using Power Query for data cleaning and transformation   
- Advanced data shaping techniques   
- Calculated columns and measures using DAX

3. Data Visualization and Reporting
- Creating interactive reports and dashboards
- Visualizations (bar, line, pie charts, maps)   
- Publishing and sharing reports, scheduling data refreshes

Statistics Fundamentals: Mean, Median, Mode, Standard Deviation, Variance, Probability Distributions, Hypothesis Testing, P-values, Confidence Intervals, Correlation, Simple Linear Regression, Normal Distribution, Binomial Distribution, Poisson Distribution.
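
A quick way to practice the descriptive statistics listed above is Python's built-in statistics module. This is only a small sketch on a made-up sample; the data list and the variable names are assumptions for illustration.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample for illustration

mean = statistics.mean(data)            # arithmetic mean
median = statistics.median(data)        # middle value (average of the two middle ones here)
mode = statistics.mode(data)            # most frequent value
pop_sd = statistics.pstdev(data)        # population standard deviation (divide by n)
sample_var = statistics.variance(data)  # sample variance (divide by n - 1)

print(mean, median, mode, pop_sd, sample_var)
```

The difference between the population and the sample versions (n versus n - 1 in the denominator) is a common interview question.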

Like for more 😄❤️
CHATGPT Ultimate Guide
Becoming a data analyst is a great way to start your career. As you progress, you might find new areas that pique your interest:

• Data Science: If you enjoy diving deep into statistics, predictive modeling, and machine learning, this could be your next challenge.

• Data Engineering: If building and optimizing data pipelines excites you, this might be the path for you.

• Business Analysis: If you're passionate about translating data into strategic business insights, consider transitioning to a business analyst role.

But remember, even if you stick with data analysis, there's always room for growth, especially with the evolving landscape of AI.

No matter where your path leads, the key is to start now.
Let's start with the topics we're going to cover in this 30 Days of Data Science series.

We will primarily focus on learning Data Science and Machine Learning algorithms.

Day 1: Linear Regression
- Concept: Predict continuous values.
- Implementation: Ordinary Least Squares.
- Evaluation: R-squared, RMSE.
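
As a rough sketch of Day 1, here is ordinary least squares for a single feature in plain Python, followed by R-squared and RMSE. The x and y values are made up for illustration, and the closed-form slope and intercept formulas are the standard ones for simple linear regression.

```python
import math

x = [1, 2, 3, 4, 5]  # made-up feature values
y = [2, 4, 5, 4, 5]  # made-up targets

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Closed-form OLS for one feature: slope = Sxy / Sxx, intercept = mean_y - slope * mean_x
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

pred = [intercept + slope * xi for xi in x]
sse = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))  # residual sum of squares
sst = sum((yi - mean_y) ** 2 for yi in y)             # total sum of squares

r_squared = 1 - sse / sst
rmse = math.sqrt(sse / len(y))
print(slope, intercept, r_squared, rmse)
```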

Day 2: Logistic Regression
- Concept: Binary classification.
- Implementation: Sigmoid function.
- Evaluation: Confusion matrix, ROC-AUC.
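
To illustrate Day 2, the sketch below applies the sigmoid function to a linear score and builds a confusion matrix at a 0.5 threshold. The weight, bias, inputs, and labels are hypothetical values chosen by hand; no training happens here.

```python
import math

def sigmoid(z):
    """Map any real-valued score to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-z))

# Hypothetical, hand-picked parameters (not fitted to anything).
w, b = 1.0, -3.5
xs = [1, 2, 3, 4, 5, 6]
labels = [0, 0, 1, 0, 1, 1]

probs = [sigmoid(w * x + b) for x in xs]
preds = [1 if p >= 0.5 else 0 for p in probs]

# Confusion matrix counts for the positive class (label 1).
tp = sum(p == 1 and t == 1 for p, t in zip(preds, labels))
fp = sum(p == 1 and t == 0 for p, t in zip(preds, labels))
fn = sum(p == 0 and t == 1 for p, t in zip(preds, labels))
tn = sum(p == 0 and t == 0 for p, t in zip(preds, labels))

precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(tp, fp, fn, tn, precision, recall)
```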

Day 3: Decision Trees
- Concept: Tree-based model for classification/regression.
- Implementation: Recursive splitting.
- Evaluation: Accuracy, Gini impurity.
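
For Day 3, a small sketch of how Gini impurity can drive a single split: try each midpoint of one numeric feature and keep the threshold with the lowest weighted impurity. A full tree would repeat this recursively on each side. The function names and the toy data are assumptions for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(xs, ys):
    """Return (threshold, weighted impurity) of the best split on one feature."""
    best = (None, float("inf"))
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2  # candidate threshold between neighbouring values
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best

print(gini([0, 0, 1, 1]))
print(best_split([1, 2, 3, 4], [0, 0, 1, 1]))
```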

Day 4: Random Forest
- Concept: Ensemble of decision trees.
- Implementation: Bagging.
- Evaluation: Out-of-bag error, feature importance.

Day 5: Gradient Boosting
- Concept: Sequential ensemble method.
- Implementation: Boosting.
- Evaluation: Learning rate, number of estimators.

Day 6: Support Vector Machines (SVM)
- Concept: Classification using hyperplanes.
- Implementation: Kernel trick.
- Evaluation: Margin maximization, support vectors.

Day 7: k-Nearest Neighbors (k-NN)
- Concept: Instance-based learning.
- Implementation: Distance metrics.
- Evaluation: k-value tuning, distance functions.
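
A minimal k-NN sketch for Day 7: sort the training points by Euclidean distance to the query and take a majority vote among the k closest. The training points and the knn_predict helper are made up for illustration; math.dist assumes Python 3.8 or newer.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train is a list of (point, label) pairs; returns the majority label."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Two made-up clusters of 2-D points.
train = [
    ((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
    ((6, 6), "b"), ((7, 6), "b"), ((6, 7), "b"),
]

print(knn_predict(train, (2, 2)))
print(knn_predict(train, (6, 5)))
```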

Day 8: Naive Bayes
- Concept: Probabilistic classifier.
- Implementation: Bayes' theorem.
- Evaluation: Prior probabilities, likelihood.

Day 9: k-Means Clustering
- Concept: Partitioning data into k clusters.
- Implementation: Centroid initialization.
- Evaluation: Inertia, silhouette score.

Day 10: Hierarchical Clustering
- Concept: Nested clusters.
- Implementation: Agglomerative method.
- Evaluation: Dendrograms, linkage methods.

Day 11: Principal Component Analysis (PCA)
- Concept: Dimensionality reduction.
- Implementation: Eigenvectors, eigenvalues.
- Evaluation: Explained variance.

Day 12: Association Rule Learning
- Concept: Discover relationships between variables.
- Implementation: Apriori algorithm.
- Evaluation: Support, confidence, lift.
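
The three metrics from Day 12 can be computed directly from a list of transactions. This sketch only scores one rule; it is not the Apriori algorithm itself, which additionally prunes the candidate itemsets. The baskets and helper names are made up for illustration.

```python
transactions = [  # made-up shopping baskets
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent)."""
    return support(antecedent | consequent) / support(antecedent)

def lift(antecedent, consequent):
    """Confidence relative to the consequent's baseline support."""
    return confidence(antecedent, consequent) / support(consequent)

rule = ({"bread"}, {"milk"})
print(support(rule[0] | rule[1]), confidence(*rule), lift(*rule))
```

A lift below 1, as in this toy data, suggests the two items appear together slightly less often than they would if they were independent.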

Day 13: DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
- Concept: Density-based clustering.
- Implementation: Epsilon, min samples.
- Evaluation: Core points, noise points.

Day 14: Linear Discriminant Analysis (LDA)
- Concept: Linear combination for classification.
- Implementation: Fisher's criterion.
- Evaluation: Class separability.

Day 15: XGBoost
- Concept: Extreme Gradient Boosting.
- Implementation: Tree boosting.
- Evaluation: Regularization, parallel processing.

Day 16: LightGBM
- Concept: Gradient boosting framework.
- Implementation: Leaf-wise growth.
- Evaluation: Speed, accuracy.

Day 17: CatBoost
- Concept: Gradient boosting with categorical features.
- Implementation: Ordered boosting.
- Evaluation: Handling of categorical data.

Day 18: Neural Networks
- Concept: Layers of neurons for learning.
- Implementation: Backpropagation.
- Evaluation: Activation functions, epochs.

Day 19: Convolutional Neural Networks (CNNs)
- Concept: Image processing.
- Implementation: Convolutions, pooling.
- Evaluation: Feature maps, filters.

Day 20: Recurrent Neural Networks (RNNs)
- Concept: Sequential data processing.
- Implementation: Hidden states.
- Evaluation: Long-term dependencies.

Day 21: Long Short-Term Memory (LSTM)
- Concept: Improved RNN.
- Implementation: Memory cells.
- Evaluation: Forget gates, output gates.

Day 22: Gated Recurrent Units (GRU)
- Concept: Simplified LSTM.
- Implementation: Update gate.
- Evaluation: Performance, complexity.

Day 23: Autoencoders
- Concept: Data compression.
- Implementation: Encoder, decoder.
- Evaluation: Reconstruction error.

Day 24: Generative Adversarial Networks (GANs)
- Concept: Generative models.
- Implementation: Generator, discriminator.
- Evaluation: Adversarial loss.

Day 25: Transfer Learning
- Concept: Pre-trained models.
- Implementation: Fine-tuning.
- Evaluation: Domain adaptation.

Day 26: Reinforcement Learning
- Concept: Learning through interaction.
- Implementation: Q-learning.
- Evaluation: Reward function, policy.

Day 27: Bayesian Networks
- Concept: Probabilistic graphical models.
- Implementation: Conditional dependencies.
- Evaluation: Inference, learning.

Day 28: Hidden Markov Models (HMM)
- Concept: Time series analysis.
- Implementation: Transition probabilities.
- Evaluation: Viterbi algorithm.
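
For Day 28, here is a compact sketch of the Viterbi algorithm, which finds the most likely hidden-state sequence for a series of observations. The states, observations, and all probabilities below are a toy example chosen for illustration, not values from this series.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (most likely state path, its probability)."""
    # best[t][s] = (probability of the best path ending in s at step t, previous state)
    best = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for o in obs[1:]:
        prev = best[-1]
        layer = {}
        for s in states:
            p, back = max((prev[r][0] * trans_p[r][s], r) for r in states)
            layer[s] = (p * emit_p[s][o], back)
        best.append(layer)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][state][0]
    path = [state]
    for layer in reversed(best[1:]):
        state = layer[state][1]
        path.append(state)
    path.reverse()
    return path, prob

states = ["Healthy", "Fever"]
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {"Healthy": {"Healthy": 0.7, "Fever": 0.3},
           "Fever": {"Healthy": 0.4, "Fever": 0.6}}
emit_p = {"Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
          "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6}}

path, prob = viterbi(["normal", "cold", "dizzy"], states, start_p, trans_p, emit_p)
print(path, prob)
```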

Day 29: Feature Selection Techniques
- Concept: Improving model performance.
- Implementation: Filter, wrapper methods.
- Evaluation: Feature importance.

Day 30: Hyperparameter Optimization
- Concept: Model tuning.
- Implementation: Grid search, random search.
- Evaluation: Cross-validation.
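
To tie Day 30 together, the sketch below runs a tiny grid search over k for a 1-D k-NN classifier and scores each candidate with leave-one-out cross-validation. The data, the grid, and the helper names are made up for illustration; real projects would normally use a library routine instead.

```python
from collections import Counter

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (1-D distance)."""
    nearest = sorted(train, key=lambda item: abs(item[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def loo_accuracy(data, k):
    """Leave-one-out CV: hold out each point once and predict it from the rest."""
    hits = 0
    for i, (x, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        hits += knn_predict(train, x, k) == label
    return hits / len(data)

data = [(1, "a"), (2, "a"), (3, "a"), (4, "a"),
        (10, "b"), (11, "b"), (12, "b"), (13, "b")]

grid = [1, 3, 7]  # candidate values for the hyperparameter k
scores = {k: loo_accuracy(data, k) for k in grid}
best_k = max(grid, key=scores.get)  # first k with the highest CV accuracy
print(scores, best_k)
```

With k = 7 every held-out point is outvoted by the other class, which shows why the hyperparameter is tuned on held-out data rather than guessed.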

Share this channel with your real friends: https://t.iss.one/datasciencefun

Like if you want me to continue this series 😄❤️

ENJOY LEARNING 👍👍
Important Topics to become a data scientist
[Advanced Level]
👇👇

1. Mathematics

Linear Algebra
Analytic Geometry
Matrix
Vector Calculus
Optimization
Regression
Dimensionality Reduction
Density Estimation
Classification

2. Probability

Introduction to Probability
1D Random Variable
Function of One Random Variable
Joint Probability Distribution
Discrete Distribution
Normal Distribution

3. Statistics

Introduction to Statistics
Data Description
Random Samples
Sampling Distribution
Parameter Estimation
Hypothesis Testing
Regression

4. Programming

Python:

Python Basics
List
Set
Tuples
Dictionary
Function
NumPy
Pandas
Matplotlib/Seaborn

R Programming:

R Basics
Vector
List
Data Frame
Matrix
Array
Function
dplyr
ggplot2
Tidyr
Shiny

Database:
SQL
MongoDB

Data Structures

Web scraping

Linux

Git

5. Machine Learning

How Model Works
Basic Data Exploration
First ML Model
Model Validation
Underfitting & Overfitting
Random Forest
Handling Missing Values
Handling Categorical Variables
Pipelines
Cross-Validation (R)
XGBoost (Python | R)
Data Leakage

6. Deep Learning

Artificial Neural Network
Convolutional Neural Network
Recurrent Neural Network
TensorFlow
Keras
PyTorch
A Single Neuron
Deep Neural Network
Stochastic Gradient Descent
Overfitting and Underfitting
Dropout, Batch Normalization
Binary Classification

7. Feature Engineering

Baseline Model
Categorical Encodings
Feature Generation
Feature Selection

8. Natural Language Processing

Text Classification
Word Vectors

9. Data Visualization Tools

BI (Business Intelligence):
Tableau
Power BI
QlikView
Qlik Sense

10. Deployment

Microsoft Azure
Heroku
Google Cloud Platform
Flask
Django

Join @datasciencefun to learn important data science and machine learning concepts

ENJOY LEARNING 👍👍
Forecasting vs. Predictive Analytics: The Obama Example
Analytics can influence elections, not just predict them. This article explores how the Obama campaign used predictive analytics to outmaneuver traditional forecasting.

Forecasting vs. Predictive Analytics
Nate Silver's forecasting predicted state outcomes, while Obama's team used predictive analytics to score individual voters, targeting those most likely to be persuaded.

Impact of Predictive Analytics
The Obama campaign optimized interactions, avoiding "do-not-disturb" voters and improving ad spending effectiveness by 18%.

Conclusion
Predictive analytics enables organizations to shape outcomes through personalized insights, distinguishing it from forecasting's broad predictions.
The 'bias machine': How Google tells you what you want to hear

"We're at the mercy of Google." Undecided voters in the US who turn to Google may see dramatically different views of the world – even when they're asking the exact same question.

Type in "Is Kamala Harris a good Democratic candidate", and Google paints a rosy picture. Search results are constantly changing, but last week, the first link was a Pew Research Center poll showing that "Harris energises Democrats". Next is an Associated Press article titled "Majority of Democrats think Kamala Harris would make a good president", and the following links were similar. But if you've been hearing negative things about Harris, you might ask if she's a "bad" Democratic candidate instead. Fundamentally, that's an identical question, but Google's results are far more pessimistic.

"It's been easy to forget how bad Kamala Harris is," said an article from Reason Magazine in the top spot.


Source-Link: BBC
Characteristics of a Data whisperer
7 best GitHub repositories to break into data analytics and data science:


1. 100-Days-Of-ML-Code
- Link: (https://lnkd.in/dcftdA57)
- Stars: ~42k

2. awesome-datascience
- Link: (https://lnkd.in/dcFYYwx9)
- Stars: ~22.7k

3. Data-Science-For-Beginners
- Link: (https://lnkd.in/d_zZBadF)
- Stars: ~14.5k

4. data-science-interviews
- Link: (https://lnkd.in/dkN4RZjH)
- Stars: ~5.8k

5. Coding and ML System Design
- Link: (https://lnkd.in/gXFaaaQR)
- Stars: ~3.5k

6. Machine Learning Interviews from MAANG
- Link: (https://lnkd.in/gq_huuZD)
- Stars: 8.1k

7. data-science-ipython-notebooks
- Link: (https://lnkd.in/dPmQuPB9)
- Stars: ~27.2k


These repositories are maintained by various individuals and organizations, each offering valuable resources for learning and practicing data analytics and data science.
7 best Telegram Channels to break into data analytics and data science:


1. Data Science & Machine Learning
- Link: (https://t.iss.one/datasciencefun)
- Subscribers: ~48k

2. Python for Data Analysts
- Link: (https://t.iss.one/pythonanalyst)
- Subscribers: ~34.8k

3. SQL For Data Analytics
- Link: (https://t.iss.one/sqlanalyst)
- Subscribers: ~58.9k

4. Power BI & Tableau
- Link: (t.iss.one/PowerBI_analyst)
- Subscribers: ~36.1k

5. Artificial Intelligence
- Link: (https://t.iss.one/machinelearning_deeplearning)
- Subscribers: ~28.7k

6. Coding Interviews
- Link: (https://t.iss.one/crackingthecodinginterview)
- Subscribers: 38.6k

7. Data Science Interviews
- Link: (https://t.iss.one/DataScienceInterviews)
- Subscribers: ~12.5k


These channels are maintained by various individuals and organizations, each offering valuable resources for learning and practicing data analytics and data science.
The GPT-4 model outperformed GPT-3 and GPT-3.5 language models
Oil bosses have big hopes for the AI boom

Data centres are fuelling demand for natural gas, for now
This week 180,000 people descended on Abu Dhabi to attend ADIPEC, the global oil-and-gas industry's biggest annual gathering. This year's focus, perhaps unsurprisingly, was the nexus of artificial intelligence (AI) and energy. On the eve of the jamboree Sultan Al Jaber, chief executive of ADNOC, the Emirati national oil giant, convened a private meeting of big tech and big energy bosses. A survey of some 400 energy, tech and finance bigwigs released in conjunction with the event concluded that AI is set to transform the energy business by boosting efficiency and cutting greenhouse-gas emissions.
Decagon and OpenAI deliver high-performance, fully automated customer support at scale

Launched in 2023, Decagon has quickly become a key player in automating customer support for companies like Curology, BILT, Duolingo, Eventbrite, Notion, and Substack. OpenAI's models are crucial in their ability to deliver fast, reliable responses without human intervention.

From enterprises to tech-forward startups, Decagon helps businesses globally handle millions of support conversations without sacrificing quality or speed. The company uses a combination of OpenAI's models (including GPT-3.5, GPT-4, GPT-4o, GPT-4 Turbo, and OpenAI o1-mini) to deliver agentic bots that go beyond response generation and service the entire customer lifecycle.