From 6d7783f0cfda658872685303b4eaaa481fcc03ca Mon Sep 17 00:00:00 2001
From: Jing Lan <jlan25@cs544-jlan25.cs.wisc.edu>
Date: Thu, 20 Feb 2025 22:31:13 -0600
Subject: [PATCH] p3 draft update

---
 p3/README.md | 41 ++++++++++++++++++++++------------------
 1 file changed, 22 insertions(+), 19 deletions(-)

diff --git a/p3/README.md b/p3/README.md
index 972bb76..719dc3b 100644
--- a/p3/README.md
+++ b/p3/README.md
@@ -141,35 +141,38 @@ This method facilitates testing and subsequent benchmarking.
 
 The method should:
 1. Remove all local file previously uploaded by method `Upload()`
 2. Reset all associated server state (e.g., counters, paths, etc)
 
-## Part 3: Multi-threading Client
+## Part 3: Multi-threading Server/Client
 
+With the Global Interpreter Lock (GIL), the commonly used CPython interpreter does not support parallel multi-threaded execution. However, multi-threading can still boost the performance of our small system (why?). In Part 3, you are required to add threading support to `client.py`, then to `server.py`.
 
+### Client
 
+More specifically, you will need to manually create *N* threads for `client.py` (using the thread-management primitives that come with the `threading` module) to concurrently process the provided `workload`. For example, each worker thread may repeatedly fetch one command line from `workload` and process it. You can load all command strings into a list, then provide thread-safe access to all launched threads (how?).
 
-## Part 4: Benchmarking the System
-
-You don't need to explicitly create threads using Python calls because
-gRPC will do it for you. Set `max_workers` to 8 so that gRPC will
-create 8 threads:
+**HINT:** Before moving on to the `server`, test your multi-threading client by running it with a single thread:
+```bash
+python3 client.py workload 1
 ```
+
+### Server
+
+Now, with concurrent requests sent from `client.py`, you must correspondingly protect your server from data races with `threading.Lock()`. Make sure only one thread at a time can modify the server state (e.g., names, paths, counters...). Note that you don't need to explicitly create threads for `server.py`, as gRPC can do that for you. The following example code creates a thread pool with 8 threads:
+
+```python
 grpc.server(
-    futures.ThreadPoolExecutor(max_workers=????),
-    options=[("grpc.so_reuseport", 0)]
+    futures.ThreadPoolExecutor(max_workers=8),
+    options=[("grpc.so_reuseport", 0)]
 )
 ```
-Now that your server has multiple threads, your code should hold a
-lock (https://docs.python.org/3/library/threading.html#threading.Lock)
-whenever accessing any shared data structures, including the list(s)
-of files (or whatever data structure you used). Use a single global
-lock for everything. Ensure the lock is released properly, even when
-there is an exception. Even if your chosen data structures provide any
-guarantees related to thread-safe access, you must still hold the lock
-when accessing them to gain practice protecting shared data.
-
-**Requirement:** reading and writing files is a slow operation, so
-your code must NOT hold the lock when doing file I/O.
+**Requirement 1:** The server should properly acquire and then release the lock. A single global lock is sufficient. The lock must be released even when an exception is raised.
+
+**Requirement 2:** The server *MUST NOT* hold the lock while reading or writing files. A thread should release the lock as soon as it is done accessing the shared data structure. How could this behavior affect performance?
+
+## Part 4: Benchmarking the System
+
 
 ## Grading
-- 
GitLab
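One possible answer to the "(how?)" hint in the Client section of the patched README is sketched below. It uses `queue.Queue`, whose `put`/`get` operations are internally synchronized, so workers can pull commands without an explicit lock. The names `worker`, `main`, and `process` are illustrative, not part of the assignment; a real `process` would issue the gRPC call for one workload line.

```python
# Sketch of one way to run a multi-threaded client: N worker threads
# drain a shared thread-safe queue of workload commands. Assumes a
# hypothetical process() that would normally perform the gRPC call.
import threading
import queue

results = []                       # stand-in for real side effects
results_lock = threading.Lock()    # protects the shared results list

def process(command):
    # Placeholder for the real per-command gRPC call: just record it.
    with results_lock:
        results.append(command)

def worker(q):
    while True:
        try:
            command = q.get_nowait()  # thread-safe pop; raises when empty
        except queue.Empty:
            return                    # workload exhausted, thread exits
        process(command)

def main(workload_path, n_threads):
    # Load every command line into the queue before starting workers.
    q = queue.Queue()
    with open(workload_path) as f:
        for line in f:
            q.put(line.strip())
    threads = [threading.Thread(target=worker, args=(q,))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until all commands have been processed
```

A plain list plus a `threading.Lock` guarding an index would work equally well; the queue just bundles the lock and the "next item" bookkeeping into one object.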
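The lock discipline demanded by Requirements 1 and 2 in the Server section can be illustrated as follows. This is a minimal sketch, not the assignment's server: `uploaded` and `upload` are hypothetical names standing in for whatever shared state and RPC handler the real `server.py` uses. The `with lock:` statement guarantees release even if an exception is raised inside the block, and the file write deliberately happens after the block ends.

```python
# Sketch of the required lock discipline: one global lock, held only
# while touching shared state, released before any slow file I/O.
import threading

lock = threading.Lock()   # single global lock (Requirement 1)
uploaded = {}             # hypothetical shared state: file name -> path

def upload(name, path, data):
    with lock:                 # released automatically, even on exceptions
        uploaded[name] = path  # mutate shared server state under the lock
    # Lock is already released here, so slow file I/O does not block
    # other threads waiting on the shared state (Requirement 2).
    with open(path, "wb") as f:
        f.write(data)
```

Releasing the lock before I/O means threads serialize only on the short bookkeeping step, so disk-bound requests can overlap instead of queuing behind one writer.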