Commit d0fe98ea authored by TYLER CARAZA-HARTER
parents 28ae67b1 935a941b
8 files added
......@@ -14,7 +14,9 @@ Before starting, please review the [general project directions](../projects.md).
## Corrections/Clarifications
* Mar 5: A hint about HDFS environment variables added; a dataflow diagram added; some minor typos fixed.
- Mar 5: A hint about HDFS environment variables added; a dataflow diagram added; some minor typos fixed.
- Mar 5: Fix the wrong expected file size in Part 1 and sum of blocks in Part 2.
## Introduction
......@@ -96,9 +98,10 @@ In this part, your task is to implement the `DbToHdfs` gRPC call (you can find t
4. Upload the generated table to `/hdma-wi-2021.parquet` in the HDFS, with **2x** replication and a **1-MB** block size, using PyArrow (https://arrow.apache.org/docs/python/generated/pyarrow.fs.HadoopFileSystem.html).
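The upload in step 4 could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the reference solution: the helper names `hdfs_options` and `upload_table` are hypothetical, the NameNode address `nn:9000` is inferred from the `hdfs://nn:9000/...` path used elsewhere in this document, and the HDFS client environment variables are assumed to be set already inside the container.

```python
MB = 1024 * 1024  # bytes in one mebibyte


def hdfs_options(replication=2, block_size=MB):
    """Keyword arguments for pyarrow.fs.HadoopFileSystem (hypothetical helper).

    Host and port are assumptions taken from the hdfs://nn:9000 URI.
    """
    return {
        "host": "nn",
        "port": 9000,
        "replication": replication,        # 2x replication
        "default_block_size": block_size,  # 1-MB blocks
    }


def upload_table(table, path="/hdma-wi-2021.parquet"):
    """Write a pyarrow.Table to HDFS as a Parquet file (hypothetical helper)."""
    # Imported lazily so this module still loads where PyArrow is not installed.
    import pyarrow.fs as pafs
    import pyarrow.parquet as pq

    hdfs = pafs.HadoopFileSystem(**hdfs_options())
    pq.write_table(table, path, filesystem=hdfs)
```

Setting `replication` and `default_block_size` on the filesystem object means every file written through it picks up those values, so the write call itself needs no HDFS-specific arguments.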
To check whether the upload was correct, you can use `docker exec -it` to enter the gRPC server's container and run the HDFS command `hdfs dfs -du -h <path>` to see the file size. The expected result is:
```
15.3 M 30.5 M hdfs://nn:9000/hdma-wi-2021.parquet
```
```
14.4 M 28.9 M hdfs://nn:9000/hdma-wi-2021.parquet
```
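One way to sanity-check that output is to compare its two size columns. The sketch below assumes the first column is the file's logical size and the second is the space consumed across all replicas, so with 2x replication the ratio should be close to 2 (not exactly 2, because `-h` rounds each value). The function name `parse_du_line` is hypothetical.

```python
def parse_du_line(line):
    """Split one `hdfs dfs -du -h` output line into its five fields."""
    size, size_unit, used, used_unit, path = line.split()
    return float(size), size_unit, float(used), used_unit, path


line = "14.4 M 28.9 M hdfs://nn:9000/hdma-wi-2021.parquet"
size, size_unit, used, used_unit, path = parse_du_line(line)

# The ratio is only meaningful when both columns use the same unit.
assert size_unit == used_unit
ratio = used / size
print(f"{path}: consumed/size ratio is about {ratio:.2f}")
```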
**Hint 1:** We used similar tables in lecture: https://git.doit.wisc.edu/cdis/cs/courses/cs544/s25/main/-/tree/main/lec/15-sql
......@@ -117,7 +120,7 @@ In this part, your task is to implement the `BlockLocations` gRPC call (you can
For example, running `docker exec -it p4-server-1 python3 /client.py BlockLocations -f /hdma-wi-2021.parquet` should show something like this:
```
{'7eb74ce67e75': 15, 'f7747b42d254': 6, '39750756065d': 11}
{'7eb74ce67e75': 15, 'f7747b42d254': 7, '39750756065d': 8}
```
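The per-DataNode counts can be cross-checked against the file size from Part 1. The sketch below assumes each value in the dictionary counts the block replicas stored on that DataNode, that the `M` in the `-du -h` output means mebibytes, and that the file is roughly 14.4 MB; the function name `expected_replicas` is hypothetical.

```python
import math

MB = 1024 * 1024  # bytes in one mebibyte


def expected_replicas(file_size_bytes, block_size_bytes, replication):
    """Total block replicas in the cluster for a single file."""
    blocks = math.ceil(file_size_bytes / block_size_bytes)
    return blocks * replication


locations = {'7eb74ce67e75': 15, 'f7747b42d254': 7, '39750756065d': 8}

# Approximate size taken from the rounded `hdfs dfs -du -h` output.
approx_size = int(14.4 * MB)

total = sum(locations.values())
expected = expected_replicas(approx_size, 1 * MB, replication=2)
print(total, expected)
```

Any file between 14 and 15 MB occupies 15 one-megabyte blocks, so the rounding in the `-du -h` output does not affect the expected total.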
Note: DataNode location is the randomly generated container ID for the
......