Python Data Science Jobs & Interviews
Your go-to hub for Python and Data Science—featuring questions, answers, quizzes, and interview tips to sharpen your skills and boost your career in the data-driven world.

Admin: @Hussein_Sheikho
#Python #InterviewQuestion #Concurrency #Threading #Multithreading #Programming #IntermediateLevel

Question: How can you use threading in Python to speed up I/O-bound tasks, such as fetching data from multiple URLs simultaneously, and what are the key considerations when using threads?

Answer:

To speed up I/O-bound tasks like fetching data from multiple URLs, you can use Python's threading module to perform concurrent operations. This is effective because threads can wait for I/O (like network requests) without blocking the entire program.

Here’s a detailed example using threading and requests:

import threading
import requests
from time import time

# List of URLs to fetch
urls = [
    'https://httpbin.org/json',
    'https://api.github.com/users/octocat',
    'https://jsonplaceholder.typicode.com/posts/1',
    'https://www.google.com',
]

# Shared list to store results
results = []
lock = threading.Lock()  # To safely append to the shared list

def fetch_url(url: str):
    """Fetches a URL and stores the response metadata."""
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        with lock:
            results.append({
                'url': url,
                'status': response.status_code,
                'length': len(response.text)
            })
    except Exception as e:
        with lock:
            results.append({
                'url': url,
                'status': 'Error',
                'error': str(e)
            })

def fetch_urls_concurrently():
    """Fetches all URLs using multiple threads."""
    start_time = time()

    # Create and start a thread for each URL
    threads = []
    for url in urls:
        thread = threading.Thread(target=fetch_url, args=(url,))
        threads.append(thread)
        thread.start()

    # Wait for all threads to complete
    for thread in threads:
        thread.join()

    end_time = time()
    print(f"Time taken: {end_time - start_time:.2f} seconds")
    print("Results:")
    for result in results:
        print(result)

if __name__ == "__main__":
    fetch_urls_concurrently()

### Explanation:
- **threading.Thread**: Creates a new thread for each URL.
- **target**: The function to run in the thread (fetch_url).
- **args**: Arguments passed to the target function.
- **start()**: Begins execution of the thread.
- **join()**: Waits for the thread to finish before continuing.
- **Lock**: Ensures safe access to shared resources (like results) to avoid race conditions.
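To see why the lock matters, here is a minimal, self-contained sketch (separate from the fetch example above) in which several threads update a shared counter. The `increment` function and thread counts are illustrative choices, not part of the original example; the point is that the read-modify-write in `counter += 1` is not atomic, so the lock makes the final total deterministic.

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n: int) -> None:
    """Add to the shared counter n times, holding the lock per update."""
    global counter
    for _ in range(n):
        with lock:  # protects the non-atomic read-modify-write
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 80000 with the lock; without it the total can fall short
```

Dropping the `with lock:` line can silently lose updates under contention, which is exactly the race condition the fetch example guards against when appending to `results`.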

### Key Considerations:
- **GIL (Global Interpreter Lock)**: Python’s GIL limits true parallelism for CPU-bound tasks, but threads work well for I/O-bound ones.
- **Thread Safety**: Use locks or queues when sharing data between threads.
- **Overhead**: Creating too many threads can degrade performance.
- **Timeouts**: Always set timeouts to avoid hanging on slow responses.
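To address the overhead concern, the standard library's `concurrent.futures.ThreadPoolExecutor` caps the number of threads instead of spawning one per URL. A minimal sketch, using a simulated I/O call (`fake_fetch`, a hypothetical stand-in that sleeps instead of hitting the network) so it runs without connectivity:

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def fake_fetch(url: str) -> dict:
    """Hypothetical stand-in for requests.get: sleeps to simulate I/O wait."""
    time.sleep(0.2)
    return {"url": url, "status": 200}

urls = [f"https://example.com/page/{i}" for i in range(8)]

start = time.time()
# max_workers bounds the thread count, avoiding per-URL thread overhead
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(fake_fetch, u) for u in urls]
    results = [f.result() for f in as_completed(futures)]
elapsed = time.time() - start

print(len(results), f"{elapsed:.1f}s")  # 8 tasks in two waves of 4, ~0.4s
```

The pool also collects return values through futures, so no shared list or lock is needed, which sidesteps the thread-safety issue entirely.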

This pattern is commonly used in web scraping, API clients, and backend services handling multiple external calls efficiently.

By: @DataScienceQ 🚀