Python Data Science Jobs & Interviews
Your go-to hub for Python and Data Science—featuring questions, answers, quizzes, and interview tips to sharpen your skills and boost your career in the data-driven world.

Admin: @Hussein_Sheikho
Question:
How can you use Python’s asyncio and concurrent.futures to efficiently handle both I/O-bound and CPU-bound tasks in a single application, and what are the best practices for structuring such a system?

Answer:
To handle both I/O-bound work (e.g., network requests, file I/O) and CPU-bound work (e.g., data processing, math operations) efficiently in Python, combine asyncio for the I/O-bound parts with concurrent.futures executors: a ThreadPoolExecutor for blocking calls and a ProcessPoolExecutor for CPU-bound tasks. This keeps the event loop unblocked and makes full use of the available cores.

Here’s an example:

import asyncio
from concurrent.futures import ProcessPoolExecutor
import aiohttp

# Simulated I/O-bound task (e.g., API call)
async def fetch_url(session, url):
    try:
        async with session.get(url) as response:
            return await response.text()
    except Exception as e:
        return f"Error: {e}"

# Simulated CPU-bound task (e.g., heavy computation)
def cpu_intensive_task(n):
    return sum(i * i for i in range(n))

# Main function using asyncio + a process pool
async def main():
    # I/O-bound tasks with asyncio
    urls = [
        "https://httpbin.org/json",
        "https://httpbin.org/headers",
        "https://httpbin.org/status/200"
    ]

    # Use aiohttp for concurrent HTTP requests
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_url(session, url) for url in urls]
        results = await asyncio.gather(*tasks)

    print("I/O-bound results:", results)

    # CPU-bound tasks with ProcessPoolExecutor
    with ProcessPoolExecutor() as executor:
        # Run CPU-intensive work in separate processes
        futures = [executor.submit(cpu_intensive_task, 1000000) for _ in range(3)]
        cpu_results = [future.result() for future in futures]

    print("CPU-bound results:", cpu_results)

# Run the async main function
if __name__ == "__main__":
    asyncio.run(main())

Explanation:
- asyncio handles I/O-bound tasks asynchronously without blocking the main thread.
- aiohttp is used for efficient HTTP requests.
- ProcessPoolExecutor runs CPU-heavy functions in separate processes, bypassing the GIL.
- Mixing both gives optimal resource usage: asyncio for I/O, multiprocessing for CPU. To keep the event loop responsive while the CPU work runs, hand it off with run_in_executor, as sketched below.
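In the example above, future.result() blocks the coroutine while it waits. In a fully asynchronous application you would usually await the pool through loop.run_in_executor (or asyncio.to_thread for blocking calls) so other coroutines keep running. A minimal sketch of that pattern, reusing the cpu_intensive_task function from above (the run_cpu_tasks name is just for illustration):

import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_intensive_task(n):
    # Same CPU-bound helper as above: pure Python arithmetic
    return sum(i * i for i in range(n))

async def run_cpu_tasks():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as executor:
        # run_in_executor returns awaitable futures, so the event loop
        # stays free to serve other coroutines while the processes work
        jobs = [
            loop.run_in_executor(executor, cpu_intensive_task, 1000000)
            for _ in range(3)
        ]
        return await asyncio.gather(*jobs)

if __name__ == "__main__":
    print(asyncio.run(run_cpu_tasks()))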

Best practices:
- Use ThreadPoolExecutor for lightweight blocking calls, such as libraries without async support (see the sketch after this list).
- Use ProcessPoolExecutor for CPU-intensive work.
- Never run blocking or CPU-heavy code directly inside a coroutine; offload it to an executor instead.
- Use asyncio.gather() to run multiple coroutines concurrently.
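For blocking I/O libraries that have no async interface (for example, requests), the same run_in_executor pattern works with a thread pool. A minimal sketch, assuming a fetch_blocking helper of our own (the names here are illustrative, not part of the example above):

import asyncio
from concurrent.futures import ThreadPoolExecutor
import requests

def fetch_blocking(url):
    # Blocking HTTP call; acceptable inside a worker thread
    return requests.get(url, timeout=10).status_code

async def fetch_all(urls):
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=5) as pool:
        # Each blocking call runs in its own thread; the coroutine just awaits
        jobs = [loop.run_in_executor(pool, fetch_blocking, u) for u in urls]
        return await asyncio.gather(*jobs)

if __name__ == "__main__":
    codes = asyncio.run(fetch_all(["https://httpbin.org/status/200"] * 3))
    print(codes)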

#Python #AsyncIO #Concurrency #Multithreading #Multiprocessing #AdvancedPython #Programming #WebDevelopment #Performance

By: @DataScienceQ 🚀