Python Full-Stack Interview Questions 46–50 (Concurrency, Awaitables, Buffer Protocol, itertools/functools, CPython Internals)

This lesson covers five focused Python interview topics: choosing concurrency strategies for I/O-bound vs CPU-bound work, what awaitable objects are and how to run sync code asynchronously, the buffer protocol and memoryview, practical itertools and functools patterns, and CPython internals that affect performance. Each question has a clear, beginner-friendly explanation, relatable analogies, and runnable code samples with expected output.

46. How do you handle concurrency in Python for I/O-bound vs CPU-bound workloads?

Concisely: choose async I/O or threads for I/O-bound tasks; choose multiprocessing (separate processes) or native extensions for CPU-bound tasks. The reason is the Global Interpreter Lock (GIL) in CPython: only one thread at a time can execute Python bytecode in a process, so threads do not provide CPU parallelism for pure-Python CPU-heavy code.

  • I/O-bound (network, disk, database): use async/await (asyncio) or threads.
  • CPU-bound (heavy computation, number crunching): use multiprocessing or move compute into C/NumPy or other native libs that release the GIL.
  • Hybrid approaches: combine async for I/O orchestration with worker processes for CPU (see the sketch after the multiprocessing example below).

Real-world analogy: think of the program as a single cook. For I/O work (waiting for an oven or for a vendor delivery), the cook can prepare other dishes while waiting — that is asynchronous I/O. For CPU-bound tasks (kneading dough by hand), you need more cooks (processes) to do kneading in parallel.

Example: async for concurrent I/O (many network calls) versus multiprocessing for CPU.

# Async I/O example: many simulated network requests (I/O-bound)
import asyncio
import random
import time

async def fetch(i):
    # simulate network latency
    await asyncio.sleep(random.random() * 0.2)
    return f"result {i}"

async def main():
    start = time.time()
    tasks = [asyncio.create_task(fetch(i)) for i in range(10)]
    results = await asyncio.gather(*tasks)
    print("results:", results)
    print("elapsed:", time.time() - start)

# Run with:
# asyncio.run(main())

Expected behavior: the tasks run concurrently, so the total elapsed time is close to the longest single sleep rather than the sum of all ten.

# Output (example):
# results: ['result 0', 'result 1', ..., 'result 9']
# elapsed: 0.16

For CPU-bound work, use multiprocessing:

# Multiprocessing example: run CPU-bound work in parallel across processes
from multiprocessing import Pool
import time

def heavy(x):
    # simulate CPU work
    s = 0
    for i in range(1_000_000):
        s += (i * x) % 7
    return s

if __name__ == "__main__":
    start = time.time()
    with Pool(4) as p:
        results = p.map(heavy, range(8))
    print("results length:", len(results))
    print("elapsed:", time.time() - start)

Expected: the work spreads across processes and uses multiple CPU cores; elapsed time drops roughly in proportion to the number of cores, minus process startup and IPC overhead.
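The hybrid approach mentioned above combines both worlds: let asyncio orchestrate I/O while a process pool handles CPU-heavy calls. A minimal sketch, assuming a heavy() function like the one in the multiprocessing example:

# Hybrid sketch: asyncio orchestrates, a process pool handles CPU work
import asyncio
from concurrent.futures import ProcessPoolExecutor

def heavy(x):
    # same shape of CPU-bound loop as the multiprocessing example above
    s = 0
    for i in range(1_000_000):
        s += (i * x) % 7
    return s

async def main():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        # run_in_executor returns awaitable futures, so the event loop stays free
        results = await asyncio.gather(
            *(loop.run_in_executor(pool, heavy, x) for x in range(4))
        )
    print("hybrid results:", results)

# Run with (use the usual if __name__ == "__main__" guard on spawn platforms):
# asyncio.run(main())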

Interview tip: explain the trade-offs — threads and async are lightweight for I/O, but processes have higher memory/IPC cost and startup overhead. Mention profiling first: measure whether the workload is I/O or CPU bound before choosing.

47. Explain awaitable objects, and how to make synchronous code run asynchronously.

Awaitable objects are values you can await inside async functions. There are three core kinds:

  • Coroutines (declared with async def) — you await them directly.
  • Tasks/Futures — wrappers that schedule coroutines or represent eventual results.
  • Objects implementing __await__ — custom awaitables.

To run synchronous (blocking) code without blocking the event loop, offload it to a thread or process pool. asyncio provides run_in_executor to wrap blocking calls into awaitable futures (and, since Python 3.9, asyncio.to_thread as a shorthand, sketched after the first example below). This keeps your async event loop responsive.

# Example: making a blocking function awaitable using run_in_executor
import asyncio
import time

def blocking_io(x):
    time.sleep(0.5)  # blocking
    return f"done {x}"

async def main():
    loop = asyncio.get_running_loop()
    # Use default thread pool executor
    results = await asyncio.gather(
        loop.run_in_executor(None, blocking_io, 1),
        loop.run_in_executor(None, blocking_io, 2),
    )
    print(results)

# Run with: asyncio.run(main())

Expected output:

# Output: ['done 1', 'done 2']  
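Since Python 3.9, asyncio.to_thread wraps the same thread-offloading pattern in a one-liner. A minimal sketch reusing the blocking_io function from above:

# asyncio.to_thread (Python 3.9+): shorthand for offloading a blocking call
import asyncio
import time

def blocking_io(x):
    time.sleep(0.5)  # blocking
    return f"done {x}"

async def main():
    # each call runs in the default thread pool; await keeps the loop responsive
    results = await asyncio.gather(
        asyncio.to_thread(blocking_io, 1),
        asyncio.to_thread(blocking_io, 2),
    )
    print(results)

# Run with: asyncio.run(main())
# Output: ['done 1', 'done 2']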

You can also create custom awaitables by implementing __await__:

# Custom awaitable example
class SimpleAwaitable:
    def __init__(self, value):
        self.value = value

    def __await__(self):
        # __await__ must return an iterator; a generator works:
        # yield nothing, then return the value that await will produce
        yield from ()
        return self.value

async def main():
    v = await SimpleAwaitable(123)
    print("value:", v)

# Run with: asyncio.run(main())

Expected output:

# Output:
# value: 123

48. What is the buffer protocol and memoryview in Python? When is it useful?

The buffer protocol exposes a raw byte-oriented view of object memory so different objects (bytes, bytearray, array.array, NumPy arrays, and others) can share memory without copying. memoryview is a built-in wrapper that gives a view into the buffer, allowing slicing, casting, and in-place updates without copying data.

  • Use it when you need zero-copy access to binary data across APIs.
  • memoryview supports slicing with no copy and can change the underlying bytes.
  • Useful in high-performance code (I/O buffers, image processing, networking).

Example: modify a subsection of a bytearray without allocating a new bytes object.

b = bytearray(b"Hello, world!")
m = memoryview(b)
m[7:12] = b"Earth"  # modify in-place through memoryview
print(bytes(b))  

Expected output:

# Output:
# b'Hello, Earth!'

You can also cast a memoryview to a different element format (for example, from integers to raw bytes) when the underlying buffer supports it. This is handy with binary protocols and numeric data because no copies are made.
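A minimal sketch of casting with the standard array module; the byte count assumes 4-byte ints, which is typical but platform-dependent:

# Casting a memoryview to another element format (zero-copy)
import array

a = array.array('i', [1, 2, 3, 4])   # four signed ints
m = memoryview(a)
mb = m.cast('B')                     # the same memory viewed as unsigned bytes
print(len(mb))     # 16 if each int is 4 bytes (platform-dependent)
print(m.tolist())  # [1, 2, 3, 4]

# Output (on a typical platform with 4-byte ints):
# 16
# [1, 2, 3, 4]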

49. Explain itertools and functools — give examples where they improve performance or clarity.

itertools contains efficient iterator-building blocks that operate lazily (no intermediate collections). functools contains higher-order functions and helpers like lru_cache, partial, and reduce. Both modules help write concise and often faster code by avoiding unnecessary allocations and by caching results.

  • itertools: chain, islice, groupby, accumulate — use for streaming pipelines (groupby and accumulate are sketched after the examples below).
  • functools: lru_cache for memoization, partial to pre-fill function args.
  • Using lazy iterators often reduces memory usage and can be faster when you only need part of the results.

Examples: chain to flatten, islice to limit, and lru_cache to speed repeated calls.

from itertools import chain, islice
from functools import lru_cache, partial

# itertools.chain: flatten multiple iterables lazily

a = [1, 2]
b = (x for x in range(3, 6))
flat = chain(a, b)
print(list(flat))  # [1, 2, 3, 4, 5]

# itertools.islice: take first N items of a generator

def naturals():
    n = 0
    while True:
        yield n
        n += 1

print(list(islice(naturals(), 5)))  # [0, 1, 2, 3, 4]

# functools.lru_cache: memoize expensive function

@lru_cache(maxsize=128)
def fib(n):
    if n < 2:
        return n
    return fib(n-1) + fib(n-2)

print(fib(30))  # fast due to caching

# functools.partial: create a function with some arguments fixed

def multiply(a, b):
    return a * b

double = partial(multiply, 2)
print(double(8))  # 16

Expected output (example):

# Output:
# [1, 2, 3, 4, 5]
# [0, 1, 2, 3, 4]
# 832040
# 16
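The bullet list above also mentions groupby and accumulate; a short sketch of both (note that groupby only groups consecutive items, so sort by the key first if needed):

# itertools.groupby and accumulate
from itertools import accumulate, groupby

# groupby: group consecutive items by a key function
words = ["apple", "ant", "bear", "bat", "cat"]
for letter, group in groupby(words, key=lambda w: w[0]):
    print(letter, list(group))

# accumulate: running totals, computed lazily without intermediate lists
print(list(accumulate([1, 2, 3, 4])))

# Output:
# a ['apple', 'ant']
# b ['bear', 'bat']
# c ['cat']
# [1, 3, 6, 10]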

Interview tip: when asked for performance, emphasize lazy iteration and caching where appropriate. For clarity questions, show how small, well-chosen itertools expressions can replace verbose loops.

50. What are CPython internals that affect performance (reference counting, caching, small integer caching, interning)?

Several CPython implementation details are commonly discussed in interviews because they explain surprising behaviors and performance implications.

  • Reference counting: CPython uses reference counts to deallocate objects promptly. Frequent creation and destruction of many small objects costs CPU time and causes memory churn.
  • Garbage collector: a cyclic GC runs to collect reference cycles that reference counting alone cannot free.
  • Small integer caching: CPython pre-allocates small integers (typically -5 to 256), so identity comparisons (is) between equal small integers return True even though the values look like independent objects.
  • String interning: short, identifier-like strings may be interned (shared) to save memory and speed up some comparisons, but never rely on is for equality checks.
  • Attribute lookup: attribute access goes through dictionaries and the descriptor protocol, and local variable access is faster than repeated global or attribute lookups, so cache hot references in local variables.

Small examples that illustrate behavior:
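First, reference counting itself can be observed with sys.getrefcount; the reported count is one higher than you might expect because the argument passed to the call holds a temporary reference:

# Reference counting example
import sys

x = []
print(sys.getrefcount(x))  # 2: x itself plus the temporary argument reference
y = x
print(sys.getrefcount(x))  # 3: y adds one more reference

# Output:
# 2
# 3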

# Small-int caching example
a = 256
b = 256
print(a is b)  # often True

x = 1000
y = 1000
print(x is y)  # often False (no cache)
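
String interning can be observed the same way, and forced with sys.intern; identity results for plain literals are an implementation detail, so treat them as illustrative only:

# String interning example (implementation detail; never use "is" for equality)
import sys

s1 = "hello_world"
s2 = "hello_world"
print(s1 is s2)  # often True: identifier-like literals are typically interned

s3 = sys.intern("a longer string with spaces, not auto-interned")
s4 = sys.intern("a longer string with spaces, not auto-interned")
print(s3 is s4)  # True: sys.intern guarantees a single shared object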

# Attribute lookup cost example

class C:
    def method(self):
        return 1

c = C()

# method attribute lookup every iteration (slower)

import time
start = time.time()
for _ in range(1_000_000):
    c.method()
print("with attr lookup:", time.time() - start)

# cache the method in a local variable (faster)

start = time.time()
m = c.method
for _ in range(1_000_000):
    m()
print("with local cache:", time.time() - start)

Expected output: identity results may vary by Python build, and the timing lines show the local cache faster than repeated attribute lookup on hot paths.

# Output (example):
# True
# False
# with attr lookup: 0.35
# with local cache: 0.12

Practical interview takeaways:

  • Explain which behaviors are CPython implementation details (caching, interning) and why you should not rely on them for program correctness.
  • Show that you know where to optimize: reduce allocations, cache frequently used lookups, use efficient libraries (NumPy) for numeric work, and profile before optimizing.
  • Mention tools: timeit, cProfile, tracemalloc for diagnosing performance and memory issues.
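
As a concrete example of the tools listed above, here is a quick timeit comparison of the attribute-lookup pattern from earlier; the numbers are illustrative and vary by machine:

# Micro-benchmarking attribute lookup with timeit
import timeit

setup = """
class C:
    def method(self):
        return 1
c = C()
m = c.method
"""

print(timeit.timeit("c.method()", setup=setup, number=1_000_000))
print(timeit.timeit("m()", setup=setup, number=1_000_000))

# Output (example; the second number is typically smaller):
# 0.06
# 0.04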

Closing note: for interviews, give short, concrete examples (like the ones above), explain the trade-offs, and show that you would measure and reason about the choice (I/O vs CPU, copy vs view, lazy iteration vs eager list). Those show both practical skill and thoughtful engineering judgement.
