Data Science Machine Learning Data Analysis
38.9K subscribers
3.69K photos
31 videos
39 files
1.28K links
ads: @HusseinSheikho

This channel is for Programmers, Coders, Software Engineers.

1- Data Science
2- Machine Learning
3- Data Visualization
4- Artificial Intelligence
5- Data Analysis
6- Statistics
7- Deep Learning
🔥 Trending Repository: data-engineer-handbook

📝 Description: This is a repo with links to everything you'd ever want to learn about data engineering

🔗 Repository URL: https://github.com/DataExpert-io/data-engineer-handbook

📖 Readme: https://github.com/DataExpert-io/data-engineer-handbook#readme

📊 Statistics:
🌟 Stars: 36.3K stars
👀 Watchers: 429
🍴 Forks: 7K forks

💻 Programming Languages: Jupyter Notebook - Python - Makefile - Dockerfile - Shell

🏷️ Related Topics:
#data #awesome #sql #bigdata #dataengineering #apachespark


==================================
🧠 By: https://t.iss.one/DataScienceM
🔥 Trending Repository: leantime

📝 Description: Leantime is a goals focused project management system for non-project managers. Building with ADHD, Autism, and dyslexia in mind.

🔗 Repository URL: https://github.com/Leantime/leantime

🌐 Website: https://leantime.io

📖 Readme: https://github.com/Leantime/leantime#readme

📊 Statistics:
🌟 Stars: 5.8K stars
👀 Watchers: 69
🍴 Forks: 671 forks

💻 Programming Languages: PHP - JavaScript - CSS - Blade - Twig - HTML

🏷️ Related Topics:
#php #trello #jira #sql #agile #calendar #projects #project_management #kanban #scrum #lean #strategy #timesheets #asana #gantt #hacktoberfest #notion #retrospective #clickup #leantime


==================================
🧠 By: https://t.iss.one/DataScienceM
🔥 Trending Repository: budibase

📝 Description: Create business apps and automate workflows in minutes. Supports PostgreSQL, MySQL, MariaDB, MSSQL, MongoDB, Rest API, Docker, K8s, and more 🚀 No code / Low code platform..

🔗 Repository URL: https://github.com/Budibase/budibase

🌐 Website: https://budibase.com

📖 Readme: https://github.com/Budibase/budibase#readme

📊 Statistics:
🌟 Stars: 25.5K stars
👀 Watchers: 218
🍴 Forks: 1.8K forks

💻 Programming Languages: TypeScript - Svelte - JavaScript - CSS - Shell - Handlebars

🏷️ Related Topics:
#open_source #internal_tools #workflow_engine #crud_application #workflow_automation #low_code #no_code #rest_api_framework #crud_app #no_code_platform #data_apps #low_code_platform #ai_applications #data_application #workflow_apps #low_code_no_code #sql_gui #ai_app_builder #it_workflows


==================================
🧠 By: https://t.iss.one/DataScienceM
🔥 Trending Repository: budibase

📝 Description: Create business apps and automate workflows in minutes. Supports PostgreSQL, MySQL, MariaDB, MSSQL, MongoDB, Rest API, Docker, K8s, and more 🚀 No code / Low code platform..

🔗 Repository URL: https://github.com/Budibase/budibase

🌐 Website: https://budibase.com

📖 Readme: https://github.com/Budibase/budibase#readme

📊 Statistics:
🌟 Stars: 25.9K stars
👀 Watchers: 218
🍴 Forks: 1.9K forks

💻 Programming Languages: TypeScript - Svelte - JavaScript - CSS - Shell - Handlebars

🏷️ Related Topics:
#open_source #internal_tools #workflow_engine #crud_application #workflow_automation #low_code #no_code #rest_api_framework #crud_app #no_code_platform #data_apps #low_code_platform #ai_applications #data_application #workflow_apps #low_code_no_code #sql_gui #ai_app_builder #it_workflows


==================================
🧠 By: https://t.iss.one/DataScienceM
🔥 Trending Repository: leantime

📝 Description: Leantime is a goals focused project management system for non-project managers. Building with ADHD, Autism, and dyslexia in mind.

🔗 Repository URL: https://github.com/Leantime/leantime

🌐 Website: https://leantime.io

📖 Readme: https://github.com/Leantime/leantime#readme

📊 Statistics:
🌟 Stars: 6.8K stars
👀 Watchers: 74
🍴 Forks: 715 forks

💻 Programming Languages: PHP - JavaScript - CSS - Blade - Twig - HTML

🏷️ Related Topics:
#php #trello #jira #sql #agile #calendar #projects #project_management #kanban #scrum #lean #strategy #timesheets #asana #gantt #hacktoberfest #notion #retrospective #clickup #leantime


==================================
🧠 By: https://t.iss.one/DataScienceM
🔥 Trending Repository: chartdb

📝 Description: Database diagrams editor that allows you to visualize and design your DB with a single query.

🔗 Repository URL: https://github.com/chartdb/chartdb

🌐 Website: https://chartdb.io

📖 Readme: https://github.com/chartdb/chartdb#readme

📊 Statistics:
🌟 Stars: 18.1K stars
👀 Watchers: 61
🍴 Forks: 968 forks

💻 Programming Languages: TypeScript

🏷️ Related Topics:
#react #visualization #mysql #editor #schema_migrations #typescript #sql #database #sqlite #postgresql #mariadb #db #mssql #erd #db_migration #react_flow #xyflow


==================================
🧠 By: https://t.iss.one/DataScienceM
🔥 Trending Repository: WrenAI

📝 Description: ⚡️ GenBI (Generative BI) queries any database in natural language, generates accurate SQL (Text-to-SQL), charts (Text-to-Chart), and AI-powered insights in seconds.

🔗 Repository URL: https://github.com/Canner/WrenAI

🌐 Website: https://getwren.ai/oss

📖 Readme: https://github.com/Canner/WrenAI#readme

📊 Statistics:
🌟 Stars: 10.1K stars
👀 Watchers: 70
🍴 Forks: 1K forks

💻 Programming Languages: TypeScript - Python - Go - JavaScript - Less - Dockerfile

🏷️ Related Topics:
#agent #bigquery #charts #sql #postgresql #bedrock #business_intelligence #openai #spreadsheets #vertex #genbi #text_to_sql #rag #text2sql #duckdb #llm #anthropic #sqlai #text_to_chart


==================================
🧠 By: https://t.iss.one/DataScienceM
Top 100 Data Analyst Interview Questions & Answers

#DataAnalysis #InterviewQuestions #SQL #Python #Statistics #CaseStudy #DataScience

Part 1: SQL Questions (Q1-30)

#1. What is the difference between DELETE, TRUNCATE, and DROP?
A:
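• DELETE is a DML command that removes rows from a table, optionally filtered by a WHERE clause. It is slower because it logs each row deletion, and it can be rolled back.
• TRUNCATE is a DDL command that quickly removes all rows from a table. It is faster because it does not log individual row deletions. Whether it can be rolled back, and whether it resets identity counters, depends on the database: MySQL and Oracle commit it implicitly, while SQL Server and PostgreSQL allow it inside a transaction that can still be rolled back.
• DROP is a DDL command that removes the entire table, including its structure, data, and indexes.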
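
A minimal sketch of the DELETE and DROP behaviour, run through Python's built-in sqlite3 module. The table and rows are hypothetical, and SQLite has no TRUNCATE statement, so only the other two commands are shown.

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO employees (name) VALUES (?)", [("Ann",), ("Bob",), ("Cy",)]
)
conn.commit()

# DELETE is DML: it removes matching rows and can be undone before commit.
conn.execute("DELETE FROM employees WHERE name = 'Bob'")
rows_after_delete = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
conn.rollback()
rows_after_rollback = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]

# DROP is DDL: the table definition itself disappears.
conn.execute("DROP TABLE employees")
remaining_tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' AND name = 'employees'"
).fetchall()

print(rows_after_delete, rows_after_rollback, remaining_tables)
```

The row count drops after the DELETE and returns to its original value after the rollback, while the DROP leaves no table to query at all.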

#2. Select all unique departments from the employees table.
A: Use the DISTINCT keyword.

SELECT DISTINCT department
FROM employees;


#3. Find the top 5 highest-paid employees.
A: Use ORDER BY and LIMIT (MySQL, PostgreSQL, and SQLite syntax; SQL Server uses SELECT TOP 5 instead).

SELECT name, salary
FROM employees
ORDER BY salary DESC
LIMIT 5;


#4. What is the difference between WHERE and HAVING?
A:
• WHERE is used to filter records before any groupings are made (i.e., it operates on individual rows).
• HAVING is used to filter groups after aggregations (GROUP BY) have been performed.

-- Find departments with more than 10 employees
SELECT department, COUNT(employee_id)
FROM employees
GROUP BY department
HAVING COUNT(employee_id) > 10;
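
The two clauses can also appear in the same query. The sketch below assumes a small hypothetical employees table in an in-memory SQLite database: WHERE drops low-salary rows first, then HAVING keeps only departments that still have at least two rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ann", "Sales", 60000),
        ("Bob", "Sales", 70000),
        ("Cy", "Sales", 40000),
        ("Di", "IT", 80000),
        ("Ed", "IT", 45000),
        ("Flo", "HR", 90000),
    ],
)

# WHERE filters individual rows; HAVING filters the groups built from what is left.
result = conn.execute(
    """
    SELECT department, COUNT(*) AS well_paid
    FROM employees
    WHERE salary > 50000
    GROUP BY department
    HAVING COUNT(*) >= 2
    ORDER BY department
    """
).fetchall()

print(result)
```

Only Sales survives: IT and HR each have a single employee above the salary threshold, so HAVING removes them.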


#5. What are the different types of SQL joins?
A:
• (INNER) JOIN: Returns records that have matching values in both tables.
• LEFT (OUTER) JOIN: Returns all records from the left table, and the matched records from the right table (NULLs where there is no match).
• RIGHT (OUTER) JOIN: Returns all records from the right table, and the matched records from the left table (NULLs where there is no match).
• FULL (OUTER) JOIN: Returns all records from both tables, pairing rows where they match and filling the unmatched side with NULLs.
• SELF JOIN: A regular join, but the table is joined with itself.
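
A short sketch of three of these joins, using hypothetical employees and departments tables in an in-memory SQLite database. RIGHT and FULL joins are left out because older SQLite builds do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE employees ("
    " id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER, manager_id INTEGER)"
)
conn.executemany("INSERT INTO departments VALUES (?, ?)", [(1, "Sales"), (2, "IT")])
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [(1, "Ann", 1, None), (2, "Bob", 2, 1), (3, "Cy", None, 1)],
)

# INNER JOIN: only employees whose dept_id matches a department.
inner_rows = conn.execute(
    "SELECT e.name, d.name FROM employees e "
    "JOIN departments d ON e.dept_id = d.id ORDER BY e.id"
).fetchall()

# LEFT JOIN: every employee, with NULL (None) where no department matches.
left_rows = conn.execute(
    "SELECT e.name, d.name FROM employees e "
    "LEFT JOIN departments d ON e.dept_id = d.id ORDER BY e.id"
).fetchall()

# SELF JOIN: the employees table joined to itself to look up each manager.
self_rows = conn.execute(
    "SELECT e.name, m.name FROM employees e "
    "JOIN employees m ON e.manager_id = m.id ORDER BY e.id"
).fetchall()

print(inner_rows, left_rows, self_rows)
```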

#6. Write a query to find the second-highest salary.
A: Use OFFSET or a subquery.

-- Method 1: Using OFFSET (DISTINCT skips ties for the top salary)
SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
LIMIT 1 OFFSET 1;

-- Method 2: Using a Subquery
SELECT MAX(salary)
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
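
One detail worth checking is how ties for the top salary are handled. The sketch below uses a hypothetical table in which two employees share the highest salary: an OFFSET query without DISTINCT returns the top salary again, while the DISTINCT and subquery forms return the true second-highest value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ann", 100), ("Bob", 100), ("Cy", 90), ("Di", 80)],
)

def scalar(sql):
    """Run a query and return the first column of its first row."""
    return conn.execute(sql).fetchone()[0]

# Without DISTINCT, the second row is the tied top salary.
offset_plain = scalar(
    "SELECT salary FROM employees ORDER BY salary DESC LIMIT 1 OFFSET 1"
)
# DISTINCT collapses the tie before OFFSET is applied.
offset_distinct = scalar(
    "SELECT DISTINCT salary FROM employees ORDER BY salary DESC LIMIT 1 OFFSET 1"
)
# The subquery form is tie-safe by construction.
subquery = scalar(
    "SELECT MAX(salary) FROM employees "
    "WHERE salary < (SELECT MAX(salary) FROM employees)"
)

print(offset_plain, offset_distinct, subquery)
```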


#7. Find duplicate emails in a customers table.
A: Group by the email column and use HAVING to find groups with a count greater than 1.

SELECT email, COUNT(email)
FROM customers
GROUP BY email
HAVING COUNT(email) > 1;
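
The same query can be tried against a few hypothetical rows in an in-memory SQLite database. The example.com addresses are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")
emails = [
    "a@example.com", "b@example.com", "a@example.com",
    "c@example.com", "b@example.com", "a@example.com",
]
conn.executemany("INSERT INTO customers (email) VALUES (?)", [(e,) for e in emails])

# Group identical emails and keep only groups that occur more than once.
duplicates = conn.execute(
    """
    SELECT email, COUNT(email) AS occurrences
    FROM customers
    GROUP BY email
    HAVING COUNT(email) > 1
    ORDER BY email
    """
).fetchall()

print(duplicates)
```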


#8. What is a primary key vs. a foreign key?
A:
• A Primary Key is a constraint that uniquely identifies each record in a table. It must contain unique values and cannot contain NULL values.
• A Foreign Key is a key used to link two tables together. It is a field (or collection of fields) in one table that refers to the Primary Key (or another unique key) in another table.
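
A small sketch of both constraints in action, assuming hypothetical departments and employees tables in SQLite. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; other databases enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement
conn.execute("CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE employees ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " dept_id INTEGER REFERENCES departments(id))"
)
conn.execute("INSERT INTO departments (id, name) VALUES (1, 'Sales')")

def is_rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# References an existing department, so it is accepted.
valid_insert_rejected = is_rejected(
    "INSERT INTO employees (name, dept_id) VALUES ('Ann', 1)"
)
# Reuses primary key 1, so the uniqueness rule rejects it.
duplicate_pk_rejected = is_rejected(
    "INSERT INTO departments (id, name) VALUES (1, 'IT')"
)
# Points at department 99, which does not exist.
orphan_fk_rejected = is_rejected(
    "INSERT INTO employees (name, dept_id) VALUES ('Bob', 99)"
)

print(valid_insert_rejected, duplicate_pk_rejected, orphan_fk_rejected)
```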

#9. Explain Window Functions. Give an example.
A: Window functions perform a calculation across a set of table rows that are somehow related to the current row. Unlike aggregate functions, they do not collapse rows.

-- Rank employees by salary within each department
SELECT
    name,
    department,
    salary,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank
FROM employees;
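
To see that the rows are not collapsed, the query can be run on a few hypothetical rows. This sketch assumes the SQLite library bundled with Python is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ann", "Sales", 300),
        ("Bob", "Sales", 300),
        ("Cy", "Sales", 200),
        ("Di", "IT", 500),
        ("Ed", "IT", 400),
    ],
)

# Every input row is returned, each with its rank inside its own department.
rows = conn.execute(
    """
    SELECT
        name,
        department,
        salary,
        RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank
    FROM employees
    ORDER BY department, dept_rank, name
    """
).fetchall()

for row in rows:
    print(row)
```

Ann and Bob tie for rank 1 in Sales, so RANK() skips to 3 for Cy. DENSE_RANK() would assign 2 instead.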


#10. What is a CTE (Common Table Expression)?
A: A CTE is a temporary, named result set that you can reference within a SELECT, INSERT, UPDATE, or DELETE statement. It helps improve readability and break down complex queries.
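
As an illustration, the sketch below uses a CTE named dept_avg (a hypothetical name, as are the table and rows) to find employees who earn more than their department's average salary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ann", "Sales", 300),
        ("Bob", "Sales", 300),
        ("Cy", "Sales", 200),
        ("Di", "IT", 500),
        ("Ed", "IT", 400),
    ],
)

# The WITH clause names an intermediate result that the main query then joins to.
above_average = conn.execute(
    """
    WITH dept_avg AS (
        SELECT department, AVG(salary) AS avg_salary
        FROM employees
        GROUP BY department
    )
    SELECT e.name, e.department
    FROM employees e
    JOIN dept_avg d ON e.department = d.department
    WHERE e.salary > d.avg_salary
    ORDER BY e.name
    """
).fetchall()

print(above_average)
```
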
• (Time: 90s) Simpson's Paradox occurs when:
a) A model performs well on training data but poorly on test data.
b) Two variables appear to be correlated, but the correlation is caused by a third variable.
c) A trend appears in several different groups of data but disappears or reverses when these groups are combined.
d) The mean, median, and mode of a distribution are all the same.
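
A small numeric sketch can make Simpson's Paradox concrete. The success counts below are illustrative figures for two hypothetical treatments, A and B, split into two subgroups. A has the higher success rate inside each subgroup, yet B has the higher rate once the subgroups are pooled, because the two treatments were applied to very different mixes of cases.

```python
# (successes, trials) per treatment and subgroup; illustrative numbers only.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

# Success rate within each subgroup.
per_group = {
    treatment: {group: s / n for group, (s, n) in groups.items()}
    for treatment, groups in counts.items()
}

# Success rate after pooling the subgroups together.
overall = {
    treatment: sum(s for s, _ in groups.values()) / sum(n for _, n in groups.values())
    for treatment, groups in counts.items()
}

a_wins_each_group = all(
    per_group["A"][g] > per_group["B"][g] for g in ("small", "large")
)
b_wins_overall = overall["B"] > overall["A"]

print(per_group)
print(overall)
print(a_wins_each_group, b_wins_overall)
```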

• (Time: 75s) When presenting your findings to non-technical stakeholders, you should focus on:
a) The complexity of your statistical models and the p-values.
b) The story the data tells, the business implications, and actionable recommendations.
c) The exact Python code and SQL queries you used.
d) Every single chart and table you produced during EDA.

• (Time: 75s) A survey about job satisfaction is only sent out via a corporate email newsletter. The results may suffer from what kind of bias?
a) Survivorship bias
b) Selection bias
c) Recall bias
d) Observer bias

• (Time: 90s) For which of the following machine learning algorithms is feature scaling (e.g., normalization or standardization) most critical?
a) Decision Trees and Random Forests.
b) K-Nearest Neighbors (KNN) and Support Vector Machines (SVM).
c) Naive Bayes.
d) All algorithms require feature scaling to the same degree.
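
Why scaling matters for distance-based methods can be sketched in plain Python. The customers below are hypothetical (age, income) pairs. With raw values, income differences swamp age differences, so the nearest neighbour changes once both features are min-max scaled to [0, 1].

```python
import math

# Hypothetical customers as (age in years, income in dollars).
customers = [(26, 52000), (60, 50100), (40, 90000), (22, 30000)]
query = (25, 50000)

def nearest(points, target):
    """Return the index of the point closest to target (Euclidean distance)."""
    return min(range(len(points)), key=lambda i: math.dist(points[i], target))

def min_max_scaler(points):
    """Build a function that rescales each feature using the ranges in points."""
    lows = [min(col) for col in zip(*points)]
    highs = [max(col) for col in zip(*points)]
    return lambda p: tuple(
        (v - lo) / (hi - lo) for v, lo, hi in zip(p, lows, highs)
    )

scale = min_max_scaler(customers)

# Raw distances are dominated by income, the feature with the larger numbers.
nearest_raw = nearest(customers, query)
# After scaling, age and income contribute on comparable terms.
nearest_scaled = nearest([scale(p) for p in customers], scale(query))

print(customers[nearest_raw], customers[nearest_scaled])
```

On raw values the 60-year-old with an almost identical income is "closest"; after scaling, the 26-year-old is. Tree-based models split on one feature at a time, so they are largely unaffected by this.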

• (Time: 90s) A Root Cause Analysis for a business problem primarily aims to:
a) Identify all correlations related to the problem.
b) Assign blame to the responsible team.
c) Build a model to predict when the problem will happen again.
d) Move beyond symptoms to find the fundamental underlying cause of the problem.

• (Time: 75s) A "funnel analysis" is typically used to:
a) Segment customers into different value tiers.
b) Understand and optimize a multi-step user journey, identifying where users drop off.
c) Forecast future sales.
d) Perform A/B tests on a website homepage.
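
A funnel calculation can be sketched in a few lines. The step names and user counts below are hypothetical; the code computes the conversion rate between consecutive steps and picks the step with the largest drop-off.

```python
# Hypothetical funnel: (step name, users who reached it).
funnel = [("visit", 1000), ("signup", 400), ("add_to_cart", 200), ("purchase", 50)]

# Conversion rate from each step to the next one.
steps = []
for (prev_name, prev_users), (name, users) in zip(funnel, funnel[1:]):
    steps.append((prev_name, name, users / prev_users))

# The transition with the lowest conversion is where most users drop off.
worst_step = min(steps, key=lambda s: s[2])
overall_conversion = funnel[-1][1] / funnel[0][1]

for prev_name, name, rate in steps:
    print(f"{prev_name} -> {name}: {rate:.0%}")
print("worst:", worst_step, "overall:", overall_conversion)
```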

• (Time: 75s) Tracking the engagement metrics of users grouped by their sign-up month is an example of:
a) Funnel Analysis
b) Regression Analysis
c) Cohort Analysis
d) Time-Series Forecasting
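
Grouping users by sign-up month and counting how many are active in later months can be sketched as follows. The user IDs, months, and activity log are hypothetical.

```python
from collections import defaultdict

# Hypothetical sign-up months and activity log of (user, active month).
signup_month = {"u1": "2024-01", "u2": "2024-01", "u3": "2024-02"}
activity = [
    ("u1", "2024-01"), ("u1", "2024-02"), ("u1", "2024-03"),
    ("u2", "2024-01"),
    ("u3", "2024-02"), ("u3", "2024-03"),
]

def month_index(ym):
    """Convert 'YYYY-MM' to a running month number so months can be subtracted."""
    year, month = ym.split("-")
    return int(year) * 12 + int(month)

# (cohort, months since sign-up) -> set of distinct active users.
active_users = defaultdict(set)
for user, month in activity:
    cohort = signup_month[user]
    offset = month_index(month) - month_index(cohort)
    active_users[(cohort, offset)].add(user)

# Cohort table: one row per sign-up month, one column per month offset.
cohort_table = {}
for (cohort, offset), users in sorted(active_users.items()):
    cohort_table.setdefault(cohort, {})[offset] = len(users)

print(cohort_table)
```

Each row reads left to right as retention over time: the 2024-01 cohort starts with two active users and keeps one of them in the following two months.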

• (Time: 90s) A retail company wants to increase customer lifetime value (CLV). A data-driven first step would be to:
a) Redesign the company logo.
b) Increase the price of all products.
c) Perform customer segmentation (e.g., using RFM analysis) to understand the behavior of different customer groups and tailor strategies accordingly.
d) Switch to a new database provider.

#DataAnalysis #Certification #Exam #Advanced #SQL #Pandas #Statistics #MachineLearning

━━━━━━━━━━━━━━━
By: @DataScienceM ✨
📌 Multi-Agent SQL Assistant, Part 2: Building a RAG Manager

🗂 Category: AI APPLICATIONS

🕒 Date: 2025-11-06 | ⏱️ Read time: 21 min read

Explore building a multi-agent SQL assistant in this hands-on guide to creating a RAG Manager. Part 2 of this series provides a practical comparison of multiple Retrieval-Augmented Generation strategies, weighing traditional keyword search against modern vector-based approaches using FAISS and Chroma. Learn how to select and implement the most effective retrieval method to enhance your AI assistant's performance and accuracy when interacting with databases.

#RAG #SQL #AI #VectorSearch #LLM