Python Data Science Jobs & Interviews
20.3K subscribers
187 photos
4 videos
25 files
325 links
Your go-to hub for Python and Data Science—featuring questions, answers, quizzes, and interview tips to sharpen your skills and boost your career in the data-driven world.

Admin: @Hussein_Sheikho
Download Telegram
Question 4 (Intermediate):
When working with Pandas in Python, what does the inplace=True parameter do in DataFrame operations?

A) Creates a copy of the DataFrame before applying changes
B) Modifies the original DataFrame directly
C) Saves the results to a CSV file automatically
D) Enables parallel processing for faster execution

#Python #Pandas #DataAnalysis #DataManipulation
🚀 Comprehensive Guide: How to Prepare for a Data Analyst Python Interview – 350 Most Common Interview Questions

Are you ready: https://hackmd.io/@husseinsheikho/pandas-interview

#DataAnalysis #PythonInterview #DataAnalyst #Pandas #NumPy #Matplotlib #Seaborn #SQL #DataCleaning #Visualization #MachineLearning #Statistics #InterviewPrep


✉️ Our Telegram channels: https://t.iss.one/addlist/0f6vfFbEMdAwODBk

📱 Our WhatsApp channel: https://whatsapp.com/channel/0029VaC7Weq29753hpcggW2A
Please open Telegram to view this post
VIEW IN TELEGRAM
3
1. What is the primary data structure in pandas?
2. How do you create a DataFrame from a dictionary?
3. Which method is used to read a CSV file in pandas?
4. What does the head() function do in pandas?
5. How can you check the data types of columns in a DataFrame?
6. Which function drops rows with missing values in pandas?
7. What is the purpose of the merge() function in pandas?
8. How do you filter rows based on a condition in pandas?
9. What does the groupby() method do?
10. How can you sort a DataFrame by a specific column?
11. Which method is used to rename columns in pandas?
12. What is the difference between loc and iloc in pandas?
13. How do you handle duplicate rows in pandas?
14. What function converts a column to datetime format?
15. How do you apply a custom function to a DataFrame?
16. What is the use of the apply() method in pandas?
17. How can you concatenate two DataFrames?
18. What does the pivot_table() function do?
19. How do you calculate summary statistics in pandas?
20. Which method is used to export a DataFrame to a CSV file?

#️⃣ #pandas #dataanalysis #python #dataframe #coding #programming #datascience

By: t.iss.one/DataScienceQ 🚀
1. What is the output of the following code?
import numpy as np
a = np.array([[1, 2], [3, 4]])
b = a.T
b[0, 0] = 99
print(a)

2. Which of the following functions is used to create an array with values spaced at regular intervals?
A) np.linspace()
B) np.arange()
C) np.logspace()
D) All of the above

3. Write a function that takes a 1D NumPy array and returns a new array where each element is squared, but only if it’s greater than 5.

4. What will be printed by this code?
import numpy as np
x = np.array([1, 2, 3])
y = x.copy()
y[0] = 5
print(x[0])

5. Explain the difference between np.meshgrid() and np.mgrid in generating coordinate matrices.

6. How would you efficiently compute the outer product of two vectors using NumPy?

7. What is the result of np.sum(np.eye(3), axis=1)?

8. Write a program to generate a 5x5 matrix filled with random integers from 1 to 100, then find the maximum value in each row.

9. What happens when you use np.resize() on an array with shape (3,) to resize it to (5,)?

10. Which method can be used to flatten a multi-dimensional array into a 1D array without copying data?

11. What is the output of this code?
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
result = arr[[0, 1], [1, 2]]
print(result)

12. Describe how np.take() works and provide an example using a 2D array.

13. Write a function that calculates the Euclidean distance between all pairs of points in a 2D array of coordinates.

14. What is the purpose of np.frombuffer() and when might it be useful?

15. How do you perform matrix multiplication using np.matmul() and @ operator? Are they always equivalent?

16. Write a program to filter out all elements in a 2D array that are outside the range [10, 90].

17. What does np.nan_to_num() do and why is it important in numerical computations?

18. How can you efficiently transpose a large 3D array of shape (100, 100, 100) using np.transpose() or swapaxes()?

19. Explain the concept of "views" vs "copies" in NumPy and give an example where a view leads to unexpected behavior.

20. Write a function that computes the covariance matrix of a dataset represented as a 2D NumPy array.

#NumPy #AdvancedPython #DataScience #InterviewPrep #PythonLibrary #ScientificComputing #MachineLearning #CodingChallenge #HighLevelNumPy #PythonDeveloper #TechnicalInterview #DataAnalysis

By: @DataScienceQ 🚀
In Python, NumPy is the cornerstone of scientific computing, offering high-performance multidimensional arrays and tools for working with them—critical for data science interviews and real-world applications! 📊

import numpy as np

# Array Creation - The foundation of NumPy
arr = np.array([1, 2, 3])
zeros = np.zeros((2, 3)) # 2x3 matrix of zeros
ones = np.ones((2, 2), dtype=int) # Integer matrix
arange = np.arange(0, 10, 2) # [0 2 4 6 8]
linspace = np.linspace(0, 1, 5) # [0. 0.25 0.5 0.75 1. ]
print(linspace)


# Array Attributes - Master your data's structure
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.shape) # Output: (2, 3)
print(matrix.ndim) # Output: 2
print(matrix.dtype) # Output: int64
print(matrix.size) # Output: 6


# Indexing & Slicing - Precision data access
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(data[1, 2]) # Output: 6 (row 1, col 2)
print(data[0:2, 1:3]) # Output: [[2 3], [5 6]]
print(data[:, -1]) # Output: [3 6 9] (last column)


# Reshaping Arrays - Transform dimensions effortlessly
flat = np.arange(6)
reshaped = flat.reshape(2, 3)
raveled = reshaped.ravel()
print(reshaped)
# Output: [[0 1 2], [3 4 5]]
print(raveled) # Output: [0 1 2 3 4 5]


# Stacking Arrays - Combine datasets vertically/horizontally
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(np.vstack((a, b))) # Vertical stack
# Output: [[1 2 3], [4 5 6]]
print(np.hstack((a, b))) # Horizontal stack
# Output: [1 2 3 4 5 6]


# Mathematical Operations - Vectorized calculations
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
print(x + y) # Output: [5 7 9]
print(x * 2) # Output: [2 4 6]
print(np.dot(x, y)) # Output: 32 (1*4 + 2*5 + 3*6)


# Broadcasting Magic - Operate on mismatched shapes
matrix = np.array([[1, 2, 3], [4, 5, 6]])
scalar = 10
print(matrix + scalar)
# Output: [[11 12 13], [14 15 16]]


# Aggregation Functions - Statistical power in one line
values = np.array([1, 5, 3, 9, 7])
print(np.sum(values)) # Output: 25
print(np.mean(values)) # Output: 5.0
print(np.max(values)) # Output: 9
print(np.std(values)) # Output: 2.8284271247461903


# Boolean Masking - Filter data like a pro
temperatures = np.array([18, 25, 12, 30, 22])
hot_days = temperatures > 24
print(temperatures[hot_days]) # Output: [25 30]


# Random Number Generation - Simulate real-world data
print(np.random.rand(2, 2)) # Uniform distribution
print(np.random.randn(3)) # Normal distribution
print(np.random.randint(0, 10, (2, 3))) # Random integers


# Linear Algebra Essentials - Solve equations like a physicist
A = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])
x = np.linalg.solve(A, b)
print(x) # Output: [2. 3.] (Solution to 3x+y=9 and x+2y=8)

# Matrix inverse and determinant
print(np.linalg.inv(A)) # Output: [[ 0.4 -0.2], [-0.2 0.6]]
print(np.linalg.det(A)) # Output: 5.0


# File Operations - Save/load your computational work
data = np.array([[1, 2], [3, 4]])
np.save('array.npy', data)
loaded = np.load('array.npy')
print(np.array_equal(data, loaded)) # Output: True


# Interview Power Move: Vectorization vs Loops
# 10x faster than native Python loops!
def square_sum(n):
arr = np.arange(n)
return np.sum(arr ** 2)

print(square_sum(5)) # Output: 30 (0²+1²+2²+3²+4²)


# Pro Tip: Memory-efficient data processing
# Process 1GB array without loading entire dataset
large_array = np.memmap('large_data.bin', dtype='float32', mode='r', shape=(1000000, 100))
print(large_array[0:5, 0:3]) # Process small slice


By: @DataScienceQ 🚀

#Python #NumPy #DataScience #CodingInterview #MachineLearning #ScientificComputing #DataAnalysis #Programming #TechJobs #DeveloperTips
🧠 Quiz: What is one of the most critical first steps when starting a new data analysis project?

A) Select the most complex predictive model.
B) Immediately remove all outliers from the dataset.
C) Perform Exploratory Data Analysis (EDA) to understand the data's main characteristics.
D) Normalize all numerical features.

Correct answer: C

Explanation: EDA is crucial because it helps you summarize the data's main features, identify patterns, spot anomalies, and check assumptions before you proceed with more formal modeling. Steps like modeling or removing outliers should be informed by the initial understanding gained from EDA.

#DataAnalysis #DataScience #Statistics

━━━━━━━━━━━━━━━
By: @DataScienceQ
4