Commit 677df249 authored by WEICHU YANG

Fix the wrong expected file size in Part 1 and sum of blocks in Part 2.

parent 05a7ac90
@@ -96,9 +96,10 @@ In this part, your task is to implement the `DbToHdfs` gRPC call (you can find t
4. Upload the generated table to `/hdma-wi-2021.parquet` in the HDFS, with **2x** replication and a **1-MB** block size, using PyArrow (https://arrow.apache.org/docs/python/generated/pyarrow.fs.HadoopFileSystem.html).
To check whether the upload was correct, you can use `docker exec -it` to enter the gRPC server's container and use HDFS command `hdfs dfs -du -h <path>`to see the file size. The expected result is:
```
-15.3 M 30.5 M hdfs://nn:9000/hdma-wi-2021.parquet
+14.4 M 28.9 M hdfs://nn:9000/hdma-wi-2021.parquet
```
**Hint 1:** We used similar tables in lecture: https://git.doit.wisc.edu/cdis/cs/courses/cs544/s25/main/-/tree/main/lec/15-sql
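As a rough illustration of step 4 in this hunk (not the course's reference solution), the upload could look like the sketch below. The host `nn` and port `9000` are taken from the `hdfs://nn:9000` URI in the expected output. The helper names `hdfs_options` and `upload_table` are hypothetical, and `table` is assumed to be a `pyarrow.Table` built in the earlier steps. PyArrow is imported inside the function because `HadoopFileSystem` needs a working Hadoop client environment to connect.

```python
# Settings required by the README: 2x replication and a 1-MB block size.
REPLICATION = 2
BLOCK_SIZE = 1024 * 1024  # 1 MB in bytes


def hdfs_options(host="nn", port=9000):
    """Keyword arguments for pyarrow.fs.HadoopFileSystem (host/port are assumptions)."""
    return {
        "host": host,
        "port": port,
        "replication": REPLICATION,
        "default_block_size": BLOCK_SIZE,
    }


def upload_table(table, path="/hdma-wi-2021.parquet"):
    """Write a pyarrow.Table to HDFS as a Parquet file (hypothetical helper)."""
    # Imported lazily so this module loads even where PyArrow/HDFS is unavailable.
    import pyarrow.fs as pafs
    import pyarrow.parquet as pq

    hdfs = pafs.HadoopFileSystem(**hdfs_options())
    pq.write_table(table, path, filesystem=hdfs)
```

Passing the replication factor and block size when the filesystem object is created means every file written through it inherits those settings, which is what `hdfs dfs -du -h` then reflects in its second column.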
@@ -117,7 +118,7 @@ In this part, your task is to implement the `BlockLocations` gRPC call (you can
For example, running `docker exec -it p4-server-1 python3 /client.py BlockLocations -f /hdma-wi-2021.parquet` should show something like this:
```
-{'7eb74ce67e75': 15, 'f7747b42d254': 6, '39750756065d': 11}
+{'7eb74ce67e75': 15, 'f7747b42d254': 7, '39750756065d': 8}
```
Note: DataNode location is the randomly generated container ID for the
...
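The two corrected numbers in this commit can be cross-checked against each other. The sketch below assumes the file is about 14.4 MiB (the rounded figure reported by `hdfs dfs -du -h`), that every block except the last is a full 1 MiB, and that each block has exactly 2 replicas. The function name `expected_block_replicas` is hypothetical.

```python
import math


def expected_block_replicas(file_size_bytes, block_size_bytes=1024 * 1024, replication=2):
    """Return (number of blocks, total block replicas across all DataNodes)."""
    blocks = math.ceil(file_size_bytes / block_size_bytes)
    return blocks, blocks * replication


# 14.4 M is a rounded value, so this size is only an approximation.
approx_size = int(14.4 * 1024 * 1024)
blocks, replicas = expected_block_replicas(approx_size)

# Per-DataNode counts from the corrected example output.
example = {"7eb74ce67e75": 15, "f7747b42d254": 7, "39750756065d": 8}

print(blocks, replicas, sum(example.values()))  # 15 30 30
```

Under these assumptions the per-DataNode counts in the example should sum to 30, which the corrected dictionary does. How the replicas are spread across individual DataNodes will vary from run to run.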