You can also find an example from the [lecture notes](https://git.doit.wisc.edu/cdis/cs/courses/cs544/s25/main/-/tree/main/lec/14-file-formats?ref_type=heads).
**Requirement:** when the server is asked to sum over a column of a
Parquet file, it should only read the data from that column, not other
columns.
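
For example, a Parquet reader such as pyarrow can be asked to load only the requested column (a minimal sketch, assuming pyarrow is available and the column is numeric; the path and column name are placeholders):

```python
import pyarrow.compute as pc
import pyarrow.parquet as pq

def sum_parquet_column(path: str, column: str):
    # columns=[...] makes pyarrow read only that column's data from disk,
    # skipping the pages belonging to every other column
    table = pq.read_table(path, columns=[column])
    return pc.sum(table.column(column)).as_py()
```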
**Note 1:** we will run your server with a 512-MB limit on RAM. Any
individual file we upload will fit within that limit, but the total
size of the files uploaded will exceed it. That's why your server
will have to do sums by reading the files (instead of just keeping
all table data in memory).
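
One way to keep memory use bounded is to stream a file in small batches instead of materializing the whole table. A rough sketch with pyarrow (illustrative only, not required code):

```python
import pyarrow.compute as pc
import pyarrow.parquet as pq

def sum_column_batched(path: str, column: str):
    total = 0
    pf = pq.ParquetFile(path)
    # only one batch of the requested column is held in memory at a time
    for batch in pf.iter_batches(columns=[column]):
        s = pc.sum(batch.column(0)).as_py()  # the only column is index 0
        total += s if s is not None else 0   # an empty batch sums to null
    return total
```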
**Note 2:** instead of uploading files, the `bigdata.py` client
randomly generates a large volume of CSV-formatted data and uploads
it directly via gRPC. You are *required* to test your upload
implementation with this script, and it will be used