@@ -96,9 +96,10 @@ In this part, your task is to implement the `DbToHdfs` gRPC call (you can find t
...
4. Upload the generated table to `/hdma-wi-2021.parquet` in HDFS, with **2x** replication and a **1-MB** block size, using PyArrow (https://arrow.apache.org/docs/python/generated/pyarrow.fs.HadoopFileSystem.html); see the PyArrow sketch after this hunk.
To check whether the upload was correct, you can use `docker exec -it` to enter the gRPC server's container and use the HDFS command `hdfs dfs -du -h <path>` to see the file size. The expected result is:
```
-15.3 M 30.5 M hdfs://nn:9000/hdma-wi-2021.parquet
+14.4 M 28.9 M hdfs://nn:9000/hdma-wi-2021.parquet
```
**Hint 1:** We used similar tables in lecture: https://git.doit.wisc.edu/cdis/cs/courses/cs544/s25/main/-/tree/main/lec/15-sql
...
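As a reference for step 4, here is a minimal upload sketch. It assumes the NameNode is reachable at `nn:9000` (taken from the path in the expected output) and uses a placeholder for the table built in the earlier steps; `replication` and `default_block_size` are constructor options of `pyarrow.fs.HadoopFileSystem`, so no extra HDFS configuration is needed for them.

```python
import pyarrow as pa
import pyarrow.fs
import pyarrow.parquet as pq

# Placeholder standing in for the table produced by the earlier DbToHdfs steps.
table = pa.table({"example": [1, 2, 3]})

# 2x replication and a 1-MB block size are set when the filesystem is created;
# the host/port are assumed from the expected output (hdfs://nn:9000).
hdfs = pa.fs.HadoopFileSystem(
    "nn", 9000, replication=2, default_block_size=1024 * 1024
)

# Stream the Parquet file directly into HDFS.
with hdfs.open_output_stream("/hdma-wi-2021.parquet") as out:
    pq.write_table(table, out)
```

If the write succeeded, the `hdfs dfs -du -h` check above should report the logical file size in the first column and roughly twice that in the second column, because of the 2x replication.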
@@ -117,7 +118,7 @@ In this part, your task is to implement the `BlockLocations` gRPC call (you can
...
For example, running `docker exec -it p4-server-1 python3 /client.py BlockLocations -f /hdma-wi-2021.parquet` should show something like this:
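One possible way for the server to gather per-DataNode block counts is the NameNode's WebHDFS REST API. The sketch below is only an illustration under assumptions not stated in this excerpt: it assumes the WebHDFS endpoint is `http://nn:9870` (the Hadoop 3 default port) and that `GETFILEBLOCKLOCATIONS` is an acceptable way to look up where blocks live; check the full README for the prescribed method.

```python
from collections import Counter

import requests

# Assumptions: WebHDFS is reachable at http://nn:9870 (Hadoop 3 default) and the
# GETFILEBLOCKLOCATIONS operation may be used to query block placement.
resp = requests.get(
    "http://nn:9870/webhdfs/v1/hdma-wi-2021.parquet",
    params={"op": "GETFILEBLOCKLOCATIONS"},
)
resp.raise_for_status()
blocks = resp.json()["BlockLocations"]["BlockLocation"]

# Count every replica of every block, grouped by the DataNode host that stores it.
counts = Counter(host for block in blocks for host in block["hosts"])
print(dict(counts))
```

With 2x replication, the per-DataNode counts should sum to twice the number of blocks in the file.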