Commit 1b732592 authored by Jing Lan

p3 draft update

parent 621405c2
......@@ -91,6 +91,8 @@ This method should:
1. Recover the uploaded CSV table from *binary* bytes carried by the RPC request message.
2. Write the table to a CSV file and write the same table to another file in Parquet format.
**Requirement:** Write two files to disk per upload. We will test your server with a 512MB memory limit. Do *NOT* keep the table data in memory.
**HINT 1:** You are free to decide the names and locations of the stored files. However, the server must keep these records to process future queries (for instance, you can add paths to a data structure like a list or dictionary).
**HINT 2:** Both `pandas` and `pyarrow` provide interfaces to write a table to file.
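
The steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the reference solution: the helper names (`write_csv_bytes`, `csv_to_parquet`, `record_paths`), the `stored_tables` dictionary, and the file naming scheme are all hypothetical. It uses `pyarrow`'s streaming CSV reader so the Parquet file is written one record batch at a time rather than from a fully loaded table. Note that the streaming reader infers column types from the first block of the file, so a later block with incompatible values can raise an error.

```python
import os
import threading
import uuid

# Hypothetical registry of stored files, guarded by a lock because the
# gRPC server handles requests on several worker threads.
stored_tables = {}
stored_tables_lock = threading.Lock()


def write_csv_bytes(payload: bytes, directory: str) -> str:
    """Write the raw CSV bytes from the request to a new file; return its path."""
    os.makedirs(directory, exist_ok=True)
    csv_path = os.path.join(directory, f"{uuid.uuid4().hex}.csv")
    with open(csv_path, "wb") as f:
        f.write(payload)
    return csv_path


def csv_to_parquet(csv_path: str, parquet_path: str) -> None:
    """Convert a CSV file to Parquet one record batch at a time."""
    # Imported lazily so the CSV helper above works without pyarrow installed.
    import pyarrow as pa
    import pyarrow.csv as pv
    import pyarrow.parquet as pq

    reader = pv.open_csv(csv_path)  # streaming reader, yields record batches
    with pq.ParquetWriter(parquet_path, reader.schema) as writer:
        for batch in reader:
            writer.write_table(pa.Table.from_batches([batch]))


def record_paths(csv_path: str, parquet_path: str) -> str:
    """Remember where the two files live so later queries can find them."""
    table_id = uuid.uuid4().hex
    with stored_tables_lock:
        stored_tables[table_id] = {"csv": csv_path, "parquet": parquet_path}
    return table_id
```

Inside the upload handler, one would call `write_csv_bytes` on the bytes field of the request, then `csv_to_parquet`, then `record_paths`, and keep only the paths (not the table) after the call returns.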
......@@ -166,7 +168,7 @@ grpc.server(
## Part 4: Benchmarking the System
Congratulations, you have implemented a minimal parallel data system! Let's write a small script to finally benchmark it with different scales (i.e., number of worker threads). Overall, the script is expected to perform the following tasks:
Congratulations, you have implemented a minimal multi-threading data system! Let's write a small script to finally benchmark it with different scales (i.e., number of worker threads). Overall, the script is expected to perform the following tasks:
1. Run `client.py` multiple times with different threading parameters and record the execution time of each run.
2. Plot the data to visualize the performance trend.
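
The two tasks above could be sketched as below. This is only an illustration: the function names are hypothetical, and the command-line arguments passed to `client.py` in `client_cmd` are an assumption, since the actual interface is defined elsewhere in the project. Adjust the command to match your own client.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run a command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def run_benchmark(thread_counts, make_cmd):
    """Time one run per thread count; returns {thread_count: seconds}."""
    return {n: time_command(make_cmd(n)) for n in thread_counts}


def client_cmd(num_threads):
    # ASSUMPTION: client.py accepts a workload file and a thread count as
    # positional arguments. Replace with the real interface of your client.
    return [sys.executable, "client.py", "workload", str(num_threads)]


def plot_times(times, out_path="plot.png"):
    """Draw a simple line graph of execution time against thread count."""
    # Imported lazily; "Agg" renders to a file without needing a display,
    # which suits running inside a container.
    import matplotlib
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt

    xs = sorted(times)
    plt.plot(xs, [times[x] for x in xs], marker="o")
    plt.xlabel("Number of worker threads")
    plt.ylabel("Execution time (s)")
    plt.savefig(out_path)
    plt.close()
```

For example, `plot_times(run_benchmark([1, 2, 4, 8], client_cmd))` would time four runs and save the graph to `plot.png`. The thread counts and output file name here are placeholders.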
......@@ -198,7 +200,7 @@ Plot a simple line graph with the execution time acquired by the previous step.
## Submission
Deliverable should work with the `docker-compose.yaml` we provide:
Deliverable should work with `docker-compose.yaml` we provide:
1. `Dockerfile.client` must launch `benchmark.py` **(NOT `client.py`)**. To achieve this, you need to copy both `client.py` and the driver `benchmark.py` to the image, as well as `workload`, `purge`, and the input CSV files. It is sufficient to submit a minimal working set as we may test your code with different datasets and workloads.
2. `Dockerfile.server` must launch `server.py`.
......