Commit 49650d32 authored by gsingh58
Lec19 update

parent e9ba06f0
%% Cell type:markdown id:1d888215 tags:
# JSON
* Download ALL files for today's lecture
* Quiz 5 due tonight at 11:59:00 pm
* Read [Sweigart Ch 16](https://automatetheboringstuff.com/2e/chapter16/)
* [Exam 2 Conflict Form](https://docs.google.com/forms/d/e/1FAIpQLSegJSzTsDHEnygijU3-HQvZDUbTCkHFPKccDkqMt1dGzC67_w/viewform)
* [Regrade Request](https://piazza.com/class/ld8bqui1lgeas/post/105)
* Regrade requests take a long time to verify
* It is faster to:
- Post on Piazza
- Go to [office hours](https://sites.google.com/wisc.edu/cs220-oh-sp23/home?pli=1&authuser=2)
* Spring Break
- Have an amazing break!
- The entire team is also taking a break
- Office hours end today at 4:30 pm and resume Monday, March 20
- No one will monitor Piazza during Spring Break
* [Thank You](https://docs.google.com/forms/d/e/1FAIpQLSe0Zi6JFbxPIEVr7u1DJykuel-qSi7U0sBp2iR0gi6R_CArgw/viewform)
%% Cell type:markdown id:7ccf358c tags:
### Learning Objectives:
- Interpret JSON formatted data and recognize differences between JSON and Python
- Deserialize data from JSON for use in Python programs (read)
- Serialize data into JSON for long term storage (write)
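
The objectives above can be tried out without any files, using `json.dumps` and `json.loads` (the string-based cousins of `json.dump` and `json.load`). This is only a small sketch; the `record` dictionary is made up for illustration and is not one of the lecture files.

```python
import json

# A Python value mixing the types that JSON can represent.
# (Hypothetical sample data, not taken from a lecture file.)
record = {"name": "Viyan", "scores": (40, 50, 70), "active": True, "coach": None}

# Serialize: Python object -> JSON text (a str).
text = json.dumps(record)
print(text)  # {"name": "Viyan", "scores": [40, 50, 70], "active": true, "coach": null}

# Deserialize: JSON text -> Python object.
restored = json.loads(text)
print(restored)

# The round trip is not always exact: the tuple comes back as a list.
print(type(record["scores"]).__name__, type(restored["scores"]).__name__)
```

Notice that `True` became `true` and `None` became `null` in the JSON text, and that JSON has no tuple type, so the tuple was written as a JSON array.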
%% Cell type:code id:5990f3d6 tags:
``` python
import csv
# TODO: import json module
import json
```
%% Cell type:code id:5662ace1 tags:
``` python
# Deserialize
def read_json(path):
    with open(path, encoding = "utf-8") as f: # f is a variable that refers to the open file
        return json.load(f) # returns the data stored in the JSON file (dict, list, etc)
# Serialize
def write_json(path, data):
    with open(path, 'w', encoding = "utf-8") as f:
        json.dump(data, f, indent = 2)
```
%% Cell type:markdown id:a35312c9 tags:
### Example 1: Sum of numbers (simple JSON)
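
The snippets that this example asks you to try in `numsB.json` and `simple.json` can also be checked directly as strings. The sketch below assumes a small helper, `try_parse` (a hypothetical name, not part of the lecture code), that reports whether a piece of text is valid JSON. It only tests parsing; whether `sum(...)` works on the parsed value is a separate question.

```python
import json

def try_parse(text):
    """Return (True, value) if text is valid JSON, else (False, error message)."""
    try:
        return True, json.loads(text)
    except json.JSONDecodeError as err:
        return False, str(err)

# The same candidates that appear in the TODOs of this example.
candidates = ['[-1, 10, 4,]', '[-1, 10, 4]', '3.14', 'True', 'true', "'hello'", '"hello"']
for text in candidates:
    ok, result = try_parse(text)
    if ok:
        print(text, "-> valid JSON, Python value:", repr(result))
    else:
        print(text, "-> invalid JSON")
```

Three candidates fail to parse: the list with a trailing comma, the capitalized `True`, and the single-quoted string. All three are legal Python but not legal JSON.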
%% Cell type:code id:d241dc5c tags:
``` python
# TODO 1: Create a new "numsA.json".
# Add the list [1, 2, 3, 4] to "numsA.json" file.
# Use jupyter notebook to create and edit the new file
# TODO 2: Use input built-in function to get JSON file name from user
# Read the JSON file using read_json; capture return value into a variable
data = read_json(input("Enter the JSON file name: "))
# TODO 3: Print type of data returned by function that reads JSON file.
print(type(data))
# TODO 4: Using Python built-in function sum(...), calculate total of numbers in numsA.json, print the total.
print(sum(data))
# TODO 5: Create a new JSON file "numsB.json" and try out the following data:
# [-1, 10, 4,]
# Does that work?
# Change the data to [-1, 10, 4] and try to run the program by providing input as numsB.json
# TODO 6: Create a new JSON file "simple.json" and try out the following data.
# What kind of error do you get with this?
# Fix the error by commenting the line of code that causes the error!
# 3.14
# True
# true
# 'hello'
# "hello"
```
%% Output
Enter the JSON file name: numsA.json
<class 'list'>
10
%% Cell type:markdown id:e830f23f tags:
### Example 2: Score Tracker
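
The score tracker reads `score_history.json` at the start and writes it back at the end, which is why the file must exist (containing `{}`) before the first run. As an alternative sketch, the helper below falls back to an empty dictionary when the file is missing. The names `load_scores` and `record_score` are assumptions made for this illustration, and the demo writes to a temporary directory so no real file is touched.

```python
import json
import os
import tempfile

def load_scores(path):
    """Return the scores dict stored at path, or {} if the file does not exist yet."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def record_score(path, player_name, player_score):
    """Append one score for a player and write the updated dict back to path."""
    scores = load_scores(path)
    if player_name not in scores:
        scores[player_name] = []
    scores[player_name].append(player_score)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(scores, f, indent=2)
    return scores

# Try it in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "score_history.json")
    record_score(demo_path, "Viyan", 40)              # first run: file is created
    demo_scores = record_score(demo_path, "Viyan", 50)  # second run: file is reused
    demo_average = sum(demo_scores["Viyan"]) / len(demo_scores["Viyan"])
    print(demo_scores)   # {'Viyan': [40, 50]}
    print(demo_average)  # 45.0
```

Because the scores live in a file rather than only in a variable, the second call sees the score saved by the first one.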
%% Cell type:code id:10e00558 tags:
``` python
player_details = input("Enter player name and score: ")
# TODO 1: extract player name and score into variables
player_name, player_score = player_details.split(" ")
player_score = int(player_score)
# TODO 2: Define an empty "scores" dictionary to keep track of players'
# scores.
# KEY: player name VALUE: player scores list
input_file = "score_history.json"
scores = read_json(input_file) # updated code after TODO 6
# TODO 3: Check if player name is a key in the scores dictionary.
# If not, create a new key for player name and value as empty list
# to keep track of that player's scores.
if player_name not in scores:
    scores[player_name] = []
# TODO 4: Add player's score to the player's list in scores dictionary
scores[player_name].append(player_score)
print(scores)
# TODO 5: Create a "score_history.json" file and populate that file with
# empty dictionary {}
# TODO 6: Read "score_history.json" to populate initial "scores" dict,
# instead of the empty dict created in TODO 2.
# TODO 7: Calculate average score for that player
print("Average score for {} is {}.".format(player_name, sum(scores[player_name]) / len(scores[player_name])))
# TODO 8: At the end of the program, write the updated scores from dict
# into the "score_history.json" file
write_json(input_file, scores)
# That's it, now you have a program that helps you keep track
# of player scores permanently.
```
%% Output
Enter player name and score: Viyan 70
{'Meena': [10, 20, 10], 'Viyan': [40, 50, 70, 70], 'Rogers': [10, 40]}
Average score for Viyan is 57.5.
%% Cell type:markdown id:b920d6e0 tags:
### Example 3: Kiva.com Micro-lending site
Many websites have APIs that allow you to get their data.
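
Data from an API does not always arrive in the type you expect. In `kiva.json`, for instance, each `loanAmount` is stored as a string such as `"1325.00"`, so it needs a conversion before any arithmetic. The sketch below uses a trimmed-down copy of the five loans (only `name` and `loanAmount` are kept); the variable names are chosen for this illustration.

```python
# Trimmed-down copy of the loan records in kiva.json (name and loanAmount only).
sample_loans = [
    {"name": "Polikseni", "loanAmount": "1325.00"},
    {"name": "Safarmo", "loanAmount": "1075.00"},
    {"name": "Elizabeth", "loanAmount": "800.00"},
    {"name": "Ester", "loanAmount": "275.00"},
    {"name": "Cherifa", "loanAmount": "875.00"},
]

total_amount = 0.0
for loan in sample_loans:
    # loanAmount is a str in the JSON, so convert it before adding
    total_amount += float(loan["loanAmount"])

print("Total requested: $", total_amount, sep="")  # Total requested: $4350.0
```

Without the `float(...)` call, the `+=` would raise a `TypeError`, because a `float` and a `str` cannot be added.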
%% Cell type:code id:dc891f5a tags:
``` python
# TODO: read "kiva.json"
kiva_data = read_json('kiva.json')
# TODO: explore the type of the data structure returned by read_json
print(type(kiva_data))
# kiva_data # uncomment to see the whole JSON
```
%% Output
<class 'dict'>
%% Cell type:markdown id:e29310b6 tags:
How to explore an unknown JSON?
- If you run into a `dict`, use the `.keys()` method to see its keys, then look up each key to explore further
- If you run into a `list`, iterate over the list and print each item
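
The two rules above can be combined into one recursive helper. This is a sketch rather than part of the lecture code: `describe` is a hypothetical name, and the sample is a cut-down version of the nesting found in `kiva.json`. For lists it only peeks at the first item, on the assumption that all items have the same shape.

```python
def describe(data, depth=0, lines=None):
    """Collect one line per dict key or list, indented by nesting depth."""
    if lines is None:
        lines = []
    indent = "  " * depth
    if isinstance(data, dict):
        # dict: list each key with the type of its value, then go deeper
        for key in data:
            lines.append(indent + str(key) + ": " + type(data[key]).__name__)
            describe(data[key], depth + 1, lines)
    elif isinstance(data, list):
        # list: report its length, then peek at the first item only
        lines.append(indent + "list of " + str(len(data)) + " item(s)")
        if len(data) > 0:
            describe(data[0], depth + 1, lines)
    return lines

# Cut-down sample with the same nesting as kiva.json.
sample = {"data": {"lend": {"loans": {"values": [
    {"name": "Polikseni", "loanAmount": "1325.00"}
]}}}}

structure_lines = describe(sample)
for line in structure_lines:
    print(line)
```

Each level of indentation in the printed result corresponds to one more lookup, for example `sample["data"]["lend"]["loans"]["values"]`.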
%% Cell type:code id:31fe630f tags:
``` python
print(list(kiva_data.keys()))
# TODO: lookup the value corresponding to the key
kiva_data["data"]
# TODO: you know what to do next ... explore type again
print(type(kiva_data["data"]))
```
%% Output
['data']
<class 'dict'>
%% Cell type:code id:e09b064f tags:
``` python
print(list(kiva_data["data"].keys()))
print(list(kiva_data["data"]["lend"].keys()))
print(list(kiva_data["data"]["lend"]["loans"].keys()))
loans_list = kiva_data["data"]["lend"]["loans"]["values"] # actual information: list of loan dictionaries
```
%% Output
['lend']
['loans']
['values']
%% Cell type:code id:b6cb9c28 tags:
``` python
# TODO: iterate over loans_list and print every borrower's name, loan amount and country details
for loan_dict in loans_list:
    borrower_name = loan_dict["name"]
    print("Borrower name:", borrower_name)
    loan_amount = loan_dict["loanAmount"]
    print("Loan amount: $", loan_amount, sep = "")
    country_details = loan_dict["geocode"]["country"]
    print("Country details:", country_details)
    print("------------------------------------------------------------------------------------------------")
```
%% Output
Borrower name: Polikseni
Loan amount: $1325.00
Country details: {'name': 'Albania', 'region': 'Eastern Europe', 'fundsLentInCountry': 9051250}
------------------------------------------------------------------------------------------------
Borrower name: Safarmo
Loan amount: $1075.00
Country details: {'name': 'Tajikistan', 'region': 'Asia', 'fundsLentInCountry': 64243075}
------------------------------------------------------------------------------------------------
Borrower name: Elizabeth
Loan amount: $800.00
Country details: {'name': 'Kenya', 'region': 'Africa', 'fundsLentInCountry': 120841775}
------------------------------------------------------------------------------------------------
Borrower name: Ester
Loan amount: $275.00
Country details: {'name': 'Kenya', 'region': 'Africa', 'fundsLentInCountry': 120841775}
------------------------------------------------------------------------------------------------
Borrower name: Cherifa
Loan amount: $875.00
Country details: {'name': 'Togo', 'region': 'Africa', 'fundsLentInCountry': 13719125}
------------------------------------------------------------------------------------------------
%% Cell type:markdown id:26b4c70c tags:
### Let's write student information dataset into various JSON files
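
One thing to watch for when writing dictionaries like these buckets to JSON: JSON object keys are always strings. Python's `json` module converts `None` and `int` keys for you, but they do not come back as their original types. The sketch below uses a made-up `demo_buckets` dictionary (not the survey data) to show the effect.

```python
import json

# Hypothetical buckets with three kinds of keys: None, int, and str.
demo_buckets = {
    None: [["row with a missing value"]],
    19: [["row for age 19"]],
    "LEC001": [["row for LEC001"]],
}

json_text = json.dumps(demo_buckets)
read_back = json.loads(json_text)

print(list(demo_buckets.keys()))  # [None, 19, 'LEC001']
print(list(read_back.keys()))     # ['null', '19', 'LEC001']
```

So a bucket created for a missing value (key `None`) shows up under the key `"null"` in the JSON file and stays the string `'null'` when the file is read back.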
%% Cell type:code id:c120d0eb tags:
``` python
# inspired by https://automatetheboringstuff.com/2e/chapter16/
def process_csv(filename):
    exampleFile = open(filename, encoding="utf-8")
    exampleReader = csv.reader(exampleFile)
    exampleData = list(exampleReader)
    exampleFile.close()
    return exampleData
survey_data = process_csv('cs220_survey_data.csv')
cs220_header = survey_data[0]
cs220_data = survey_data[1:]
```
%% Cell type:code id:47cb92e9 tags:
``` python
def cell(row_idx, col_name):
    """
    Returns the data value (cell) corresponding to the row index and
    the column name of a CSV file.
    """
    col_idx = cs220_header.index(col_name)
    val = cs220_data[row_idx][col_idx]
    # handle missing values, by returning None
    if val == '':
        return None
    # handle type conversions
    if col_name in ["Age",]:
        return int(val)
    return val
```
%% Cell type:code id:9a84cfb8 tags:
``` python
def bucketize(bucket_column):
    """
    generates and returns bucketized data based on bucket_column
    """
    # Key: unique bucketize column value; Value: list of lists (rows having that unique column value)
    buckets = dict()
    for row_idx in range(len(cs220_data)):
        row = cs220_data[row_idx]
        col_value = cell(row_idx, bucket_column)
        if col_value not in buckets:
            # create a new bucket when there is no existing bucket
            buckets[col_value] = []
        buckets[col_value].append(row)
    return buckets
# TODO: create lecture based buckets and store result into lec_buckets
lec_buckets = bucketize("Lecture")
# TODO: What is the type of lec_buckets? A __dict____ of ___list of lists______
# TODO: write lec_buckets into a JSON file called "lecture_cs220_data.json"
write_json("lecture_cs220_data.json", lec_buckets)
# TODO: create major based buckets and store result into major_buckets
major_buckets = bucketize("Major")
# TODO: write major_buckets into a JSON file called "major_cs220_data.json"
write_json("major_cs220_data.json", major_buckets)
```
%% Cell type:code id:f36a584a tags:
``` python
def transform(header, data):
    """
    Transform data into a list of dictionaries
    """
    transformed_data = [] # should be defined outside the for loop, because it stores the entire data
    for row in data:
        # should be defined inside the for loop, because it represents one row as a dictionary
        row_dict = {} # Key: header; Value: data
        for idx in range(len(row)):
            row_dict[header[idx]] = row[idx]
        transformed_data.append(row_dict)
    return transformed_data
transformed_data = transform(cs220_header, cs220_data)
# TODO: What is the type of transformed_data? A __list____ of ___dictionaries______
# TODO: write transformed_data into a JSON file called "cs220_survey_data.json"
write_json("cs220_survey_data.json", transformed_data)
```
%% Cell type:markdown id:834df269 tags:
### Self-practice: explore real-world JSON
### Weather for UW-Madison campus
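
The `startTime` and `endTime` values in `weather.json` are strings in ISO 8601 format, so they can be turned into `datetime` objects when you need to compute with them. The sketch below uses one hand-written period: the `startTime`, `temperature`, and `temperatureUnit` match the first period of the forecast, while the `endTime` is an assumed value for illustration.

```python
from datetime import datetime

# One forecast period; endTime is an assumed value, not copied from weather.json.
sample_period = {
    "startTime": "2022-10-21T06:00:00-05:00",
    "endTime": "2022-10-21T18:00:00-05:00",
    "temperature": 72,
    "temperatureUnit": "F",
}

# Convert the ISO 8601 strings into datetime objects (UTC offset included).
start = datetime.fromisoformat(sample_period["startTime"])
end = datetime.fromisoformat(sample_period["endTime"])

# Subtracting two datetimes gives a timedelta.
duration_hours = (end - start).total_seconds() / 3600
print("Period length in hours:", duration_hours)  # Period length in hours: 12.0
```

JSON itself has no date type, which is why timestamps travel as strings and have to be converted after reading.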
%% Cell type:code id:a7daffd0 tags:
``` python
# TODO: read "weather.json"
weather_data = read_json('weather.json')
# TODO: explore the type of the data structure returned by read_json
print(type(weather_data))
# display the data from "weather.json"
# weather_data # uncomment to see the whole JSON
```
%% Output
<class 'dict'>
%% Cell type:code id:35c85f25 tags:
``` python
# TODO: display the keys of the weather.json dict
print(list(weather_data.keys()))
# TODO: lookup the value corresponding to the 'properties'
weather_data["properties"]
# TODO: you know what to do next ... explore type again
print(type(weather_data["properties"]))
```
%% Output
['@context', 'type', 'geometry', 'properties']
<class 'dict'>
%% Cell type:code id:1ebf2a09 tags:
``` python
# TODO: display the keys of the properties dict
print(list(weather_data["properties"].keys()))
# TODO: lookup the value corresponding to the 'periods'
# weather_data["properties"]["periods"] # uncomment to see the output
# TODO: you know what to do next ... explore type again
print(type(weather_data["properties"]["periods"]))
```
%% Output
['updated', 'units', 'forecastGenerator', 'generatedAt', 'updateTime', 'validTimes', 'elevation', 'periods']
<class 'list'>
%% Cell type:code id:5eb0ec1e tags:
``` python
# TODO: extract periods list into a variable
periods_list = weather_data["properties"]["periods"]
# TODO: iterate over periods_list and print every period's startTime, endTime, temperature, and temperatureUnit
for period_dict in periods_list:
    start_time = period_dict["startTime"]
    print("Start time:", start_time)
    # read the "endTime" key; the recorded output below came from a run that read "startTime" here
    end_time = period_dict["endTime"]
    print("End time:", end_time)
    temperature = period_dict["temperature"]
    temperature_unit = period_dict["temperatureUnit"]
    print("Temperature: {} degree {}".format(temperature, temperature_unit))
    print("------------------------------------------------------------------------------------------------")
```
%% Output
Start time: 2022-10-21T06:00:00-05:00
End time: 2022-10-21T06:00:00-05:00
Temperature: 72 degree F
------------------------------------------------------------------------------------------------
Start time: 2022-10-21T18:00:00-05:00
End time: 2022-10-21T18:00:00-05:00
Temperature: 47 degree F
------------------------------------------------------------------------------------------------
Start time: 2022-10-22T06:00:00-05:00
End time: 2022-10-22T06:00:00-05:00
Temperature: 74 degree F
------------------------------------------------------------------------------------------------
Start time: 2022-10-22T18:00:00-05:00
End time: 2022-10-22T18:00:00-05:00
Temperature: 58 degree F
------------------------------------------------------------------------------------------------
Start time: 2022-10-23T06:00:00-05:00
End time: 2022-10-23T06:00:00-05:00
Temperature: 74 degree F
------------------------------------------------------------------------------------------------
Start time: 2022-10-23T18:00:00-05:00
End time: 2022-10-23T18:00:00-05:00
Temperature: 63 degree F
------------------------------------------------------------------------------------------------
Start time: 2022-10-24T06:00:00-05:00
End time: 2022-10-24T06:00:00-05:00
Temperature: 73 degree F
------------------------------------------------------------------------------------------------
Start time: 2022-10-24T18:00:00-05:00
End time: 2022-10-24T18:00:00-05:00
Temperature: 52 degree F
------------------------------------------------------------------------------------------------
Start time: 2022-10-25T06:00:00-05:00
End time: 2022-10-25T06:00:00-05:00
Temperature: 59 degree F
------------------------------------------------------------------------------------------------
Start time: 2022-10-25T18:00:00-05:00
End time: 2022-10-25T18:00:00-05:00
Temperature: 45 degree F
------------------------------------------------------------------------------------------------
Start time: 2022-10-26T06:00:00-05:00
End time: 2022-10-26T06:00:00-05:00
Temperature: 54 degree F
------------------------------------------------------------------------------------------------
Start time: 2022-10-26T18:00:00-05:00
End time: 2022-10-26T18:00:00-05:00
Temperature: 39 degree F
------------------------------------------------------------------------------------------------
Start time: 2022-10-27T06:00:00-05:00
End time: 2022-10-27T06:00:00-05:00
Temperature: 53 degree F
------------------------------------------------------------------------------------------------
Start time: 2022-10-27T18:00:00-05:00
End time: 2022-10-27T18:00:00-05:00
Temperature: 39 degree F
------------------------------------------------------------------------------------------------
......
%% Cell type:markdown id:1d888215 tags:
# JSON
* Download ALL files for today's lecture
* Quiz 5 due tonight at 11:59:00 pm
* Read [Sweigart Ch 16](https://automatetheboringstuff.com/2e/chapter16/)
* [Exam 2 Conflict Form](https://docs.google.com/forms/d/e/1FAIpQLSegJSzTsDHEnygijU3-HQvZDUbTCkHFPKccDkqMt1dGzC67_w/viewform)
* [Regrade Request](https://piazza.com/class/ld8bqui1lgeas/post/105)
* Regrade requests take a long time to verify
* It is faster to:
- Post on Piazza
- Go to [office hours](https://sites.google.com/wisc.edu/cs220-oh-sp23/home?pli=1&authuser=2)
* Spring Break
- Have an amazing break!
- The entire team is also taking a break
- Office hours end today at 4:30 pm and resume Monday, March 20
- No one will monitor Piazza during Spring Break
* [Thank You](https://docs.google.com/forms/d/e/1FAIpQLSe0Zi6JFbxPIEVr7u1DJykuel-qSi7U0sBp2iR0gi6R_CArgw/viewform)
%% Cell type:markdown id:7ccf358c tags:
### Learning Objectives:
- Interpret JSON formatted data and recognize differences between JSON and Python
- Deserialize data from JSON for use in Python programs (read)
- Serialize data into JSON for long term storage (write)
%% Cell type:code id:5990f3d6 tags:
``` python
import csv
# TODO: import json module
```
%% Cell type:code id:5662ace1 tags:
``` python
# Deserialize
def read_json(path):
    with open(path, encoding = "utf-8") as f: # f is a variable that refers to the open file
        return json.load(f) # returns the data stored in the JSON file (dict, list, etc)
# Serialize
def write_json(path, data):
    with open(path, 'w', encoding = "utf-8") as f:
        json.dump(data, f, indent = 2)
```
%% Cell type:markdown id:a35312c9 tags:
### Example 1: Sum of numbers (simple JSON)
%% Cell type:code id:d241dc5c tags:
``` python
# TODO 1: Create a new "numsA.json".
# Add the list [1, 2, 3, 4] to "numsA.json" file.
# Use jupyter notebook to create and edit the new file
# TODO 2: Use input built-in function to get JSON file name from user
# Read the JSON file using read_json; capture return value into a variable
# TODO 3: Print type of data returned by function that reads JSON file.
# TODO 4: Using Python built-in function sum(...), calculate total of numbers in numsA.json, print the total.
# TODO 5: Create a new JSON file "numsB.json" and try out the following data:
# [-1, 10, 4,]
# Does that work?
# Change the data to [-1, 10, 4] and try to run the program by providing input as numsB.json
# TODO 6: Create a new JSON file "simple.json" and try out the following data.
# What kind of error do you get with this?
# Fix the error by commenting the line of code that causes the error!
# 3.14
# True
# true
# 'hello'
# "hello"
```
%% Output
Enter the JSON file name: numsA.json
<class 'list'>
10
%% Cell type:markdown id:e830f23f tags:
### Example 2: Score Tracker
%% Cell type:code id:10e00558 tags:
``` python
player_details = input("Enter player name and score: ")
# TODO 1: extract player name and score into variables
# TODO 2: Define an empty "scores" dictionary to keep track of players'
# scores.
# KEY: player name VALUE: player scores list
input_file = "score_history.json"
# TODO 3: Check if player name is a key in the scores dictionary.
# If not, create a new key for player name and value as empty list
# to keep track of that player's scores.
# TODO 4: Add player's score to the player's list in scores dictionary
# TODO 5: Create a "score_history.json" file and populate that file with
# empty dictionary {}
# TODO 6: Read "score_history.json" to populate initial "scores" dict,
# instead of the empty dict created in TODO 2.
# TODO 7: Calculate average score for that player
# print("Average score for {} is {}."???)
# TODO 8: At the end of the program, write the updated scores from dict
# into the "score_history.json" file
# That's it, now you have a program that helps you keep track
# of player scores permanently.
```
%% Output
Enter player name and score: Viyan 40
{'Meena': [20, 10, 10], 'Viyan': [40, 50, 70, 40, 50, 40], 'Rogers': [10, 40]}
Average score for Viyan is 48.333333333333336.
%% Cell type:markdown id:b920d6e0 tags:
### Example 3: Kiva.com Micro-lending site
Many websites have APIs that allow you to get their data.
%% Cell type:code id:dc891f5a tags:
``` python
# TODO: read "kiva.json"
# TODO: explore the type of the data structure returned by read_json
```
%% Output
<class 'dict'>
{'data': {'lend': {'loans': {'values': [{'name': 'Polikseni',
'description': "Polikseni is 70 years old and married. She and her husband are both retired and their main income is a retirement pension of $106 a month for Polikseni and disability income for her husband of $289 a month. <br /><br />Polikseni's husband, even though disabled, works in a very small shop as a watchmaker on short hours, just to provide additional income for his family and to feel useful. Polikseni's husband needs constant medical treatment due to his health problems. She requested another loan, which she will use to continue paying for the therapy her husband needs. With a part of the loan, she is going to pay the remainder of the previous loan.",
'loanAmount': '1325.00',
'geocode': {'city': 'Korce',
'country': {'name': 'Albania',
'region': 'Eastern Europe',
'fundsLentInCountry': 9051250}}},
{'name': 'Safarmo',
'description': "Safarmo is 47 years old. She lives with her husband and her children in Khuroson district. <br /><br />Safarmo is a seamstress. She has been engaged in sewing for 10 years. She learned this activity with help of her mother and elder sister. <br /><br />Safarmo's sewing machine is old and she cannot work well. Her difficulty is lack of money. That’s why she applied for a loan to buy a new modern sewing machine. <br /><br />Safarmo needs your support.",
'loanAmount': '1075.00',
'geocode': {'city': 'Khuroson',
'country': {'name': 'Tajikistan',
'region': 'Asia',
'fundsLentInCountry': 64243075}}},
{'name': 'Elizabeth',
'description': 'Elizabeth is a mom blessed with five lovely children, who are her greatest motivation in life. She lives in the Natuu area of Kenya. Elizabeth is one of the most hardworking women in sub-Saharan Africa. Being a mother and living in a poor country has never been an excuse for Elizabeth, who has practiced mixed farming for the past few years.<br /><br />The cultural expectations in her area contribute to the notion that men should support their families. However, Elizabeth works independently for the success of her children. She perseveres because she wants to provide a better future for them.<br /><br />Elizabeth has always loved farming. She is a very proud farmer and enjoys milking her dairy cows. Elizabeth keeps poultry and grows crops, but she has not been making a good profit because of poor farming inputs. <br /><br />Elizabeth will use this loan to buy farm inputs and purchase high-quality seeds and good fertilizer to improve her crop production. Modern farming requires the use of modern techniques, and, therefore, using high-quality seeds will assure her of a bumper harvest and increased profit levels.<br /><br />Elizabeth is very visionary. Her goal for the season is to boost her crop production over the previous year.',
'loanAmount': '800.00',
'geocode': {'city': 'Matuu',
'country': {'name': 'Kenya',
'region': 'Africa',
'fundsLentInCountry': 120841775}}},
{'name': 'Ester',
'description': 'Ester believes that this year is her year of prosperity. Ester is a hardworking, progressive and honest farmer from a very remote village in the Kitale area of Kenya. This area is very fertile, with favorable weather patterns that support farming activities. Ester is happily married and the proud mother of lovely children. Together, they live on a small piece of land that she really treasures. Her primary sources of income are eggs and milk.<br /><br />Although this humble and industrious mother makes a profit, she faces the challenge of not being able to produce enough to meet the readily available market. Therefore, she is seeking funds from Kiva lenders to buy farm inputs such as good fertilizer and good-quality seeds. Through this loan, Ester should double her production, and this will translate into increased income. She then intends to save more money in the future so that she can develop her farming.<br /><br />One objective that Juhudi Kilimo aims at fulfilling is increasing the ease of accessing farm inputs and income-generating assets for farmers. Through the intervention of Juhudi Kilimo and Kiva, inputs such as fertilizers and pesticides have become more accessible to its members than buying a bottle of water. Ester is very optimistic and believes this loan will change her life completely.',
'loanAmount': '275.00',
'geocode': {'city': 'Kitale',
'country': {'name': 'Kenya',
'region': 'Africa',
'fundsLentInCountry': 120841775}}},
{'name': 'Cherifa',
'description': 'Cherifa is married, 57 years old with two children. She caters and also sells the local drink. She asks for credit to buy the necessities, in particular bags of anchovies, bags of maize and bundles of firewood. She wants to have enough income to run the house well.',
'loanAmount': '875.00',
'geocode': {'city': 'Agoe',
'country': {'name': 'Togo',
'region': 'Africa',
'fundsLentInCountry': 13719125}}}]}}}}
%% Cell type:markdown id:e29310b6 tags:
How to explore an unknown JSON?
- If you run into a `dict`, use the `.keys()` method to see its keys, then look up each key to explore further
- If you run into a `list`, iterate over the list and print each item
%% Cell type:code id:31fe630f tags:
``` python
# TODO: lookup the value corresponding to the key data
# TODO: you know what to do next ... explore type again
```
%% Output
['data']
<class 'dict'>
%% Cell type:code id:e09b064f tags:
``` python
```
%% Output
['lend']
['loans']
['values']
%% Cell type:code id:b6cb9c28 tags:
``` python
# TODO: iterate over loans_list and print every borrower's name, loan amount and country details
???
print("Borrower name:", borrower_name)
print("Loan amount: $", loan_amount, sep = "")
print("Country details:", country_details)
print("------------------------------------------------------------------------------------------------")
```
%% Output
Borrower name: Polikseni
Loan amount: $1325.00
Country details: {'name': 'Albania', 'region': 'Eastern Europe', 'fundsLentInCountry': 9051250}
------------------------------------------------------------------------------------------------
Borrower name: Safarmo
Loan amount: $1075.00
Country details: {'name': 'Tajikistan', 'region': 'Asia', 'fundsLentInCountry': 64243075}
------------------------------------------------------------------------------------------------
Borrower name: Elizabeth
Loan amount: $800.00
Country details: {'name': 'Kenya', 'region': 'Africa', 'fundsLentInCountry': 120841775}
------------------------------------------------------------------------------------------------
Borrower name: Ester
Loan amount: $275.00
Country details: {'name': 'Kenya', 'region': 'Africa', 'fundsLentInCountry': 120841775}
------------------------------------------------------------------------------------------------
Borrower name: Cherifa
Loan amount: $875.00
Country details: {'name': 'Togo', 'region': 'Africa', 'fundsLentInCountry': 13719125}
------------------------------------------------------------------------------------------------
%% Cell type:markdown id:26b4c70c tags:
### Let's write student information dataset into various JSON files
%% Cell type:code id:c120d0eb tags:
``` python
# inspired by https://automatetheboringstuff.com/2e/chapter16/
def process_csv(filename):
    exampleFile = open(filename, encoding="utf-8")
    exampleReader = csv.reader(exampleFile)
    exampleData = list(exampleReader)
    exampleFile.close()
    return exampleData
survey_data = process_csv('cs220_survey_data.csv')
cs220_header = survey_data[0]
cs220_data = survey_data[1:]
```
%% Cell type:code id:47cb92e9 tags:
``` python
def cell(row_idx, col_name):
    """
    Returns the data value (cell) corresponding to the row index and
    the column name of a CSV file.
    """
    col_idx = cs220_header.index(col_name)
    val = cs220_data[row_idx][col_idx]
    # handle missing values, by returning None
    if val == '':
        return None
    # handle type conversions
    if col_name in ["Age",]:
        return int(val)
    return val
```
%% Cell type:code id:9a84cfb8 tags:
``` python
def bucketize(bucket_column):
    """
    generates and returns bucketized data based on bucket_column
    """
    # Key: unique bucketize column value; Value: list of lists (rows having that unique column value)
    buckets = dict()
    for row_idx in range(len(cs220_data)):
        row = cs220_data[row_idx]
        col_value = cell(row_idx, bucket_column)
        if col_value not in buckets:
            # create a new bucket when there is no existing bucket
            buckets[col_value] = []
        buckets[col_value].append(row)
    return buckets
# TODO: create lecture based buckets and store result into lec_buckets
# TODO: What is the type of lec_buckets? A ______ of _________
# TODO: write lec_buckets into a JSON file called "lecture_cs220_data.json"
# TODO: create major based buckets and store result into major_buckets
# TODO: write major_buckets into a JSON file called "major_cs220_data.json"
```
%% Cell type:code id:f36a584a tags:
``` python
def transform(header, data):
    """
    Transform data into a list of dictionaries
    """
    transformed_data = [] # should be defined outside the for loop, because it stores the entire data
    for row in data:
        # should be defined inside the for loop, because it represents one row as a dictionary
        row_dict = {} # Key: header; Value: data
        for idx in range(len(row)):
            row_dict[header[idx]] = row[idx]
        transformed_data.append(row_dict)
    return transformed_data
transformed_data = transform(cs220_header, cs220_data)
# TODO: What is the type of transformed_data? A ______ of _________
# TODO: write transformed_data into a JSON file called "cs220_survey_data.json"
```
%% Cell type:markdown id:5279cdab tags:
### Self-practice: explore real-world JSON
### Weather for UW-Madison campus
%% Cell type:code id:7a4efc8c tags:
``` python
# TODO: read "weather.json"
# TODO: explore the type of the data structure returned by read_json
# display the data from "weather.json"
```
%% Cell type:code id:c1b98641 tags:
``` python
# TODO: display the keys of the weather.json dict
# TODO: lookup the value corresponding to the 'properties'
# TODO: you know what to do next ... explore type again
```
%% Cell type:code id:ca457052 tags:
``` python
# TODO: display the keys of the properties dict
# TODO: lookup the value corresponding to the 'periods'
# TODO: you know what to do next ... explore type again
```
%% Cell type:code id:34f6dfa4 tags:
``` python
# TODO: extract periods list into a variable
# TODO: iterate over periods_list and print every period's startTime, endTime, temperature, and temperatureUnit
for period_dict in periods_list:
    print("Start time:", start_time)
    print("End time:", end_time)
    print("Temperature: {} degree {}".format(temperature, temperature_unit))
    print("------------------------------------------------------------------------------------------------")
```
......
%% Cell type:markdown id:1d888215 tags:
# JSON
* Download ALL files for today's lecture
* Quiz 5 due tonight at 11:59:00 pm
* Read [Sweigart Ch 16](https://automatetheboringstuff.com/2e/chapter16/)
* [Exam 2 Conflict Form](https://docs.google.com/forms/d/e/1FAIpQLSegJSzTsDHEnygijU3-HQvZDUbTCkHFPKccDkqMt1dGzC67_w/viewform)
* [Regrade Request](https://piazza.com/class/ld8bqui1lgeas/post/105)
* Regrade requests take a long time to verify
* It is faster to:
- Post on Piazza
- Go to [office hours](https://sites.google.com/wisc.edu/cs220-oh-sp23/home?pli=1&authuser=2)
* Spring Break
- Have an amazing break!
- The entire team is also taking a break
- Office hours end today at 4:30 pm and resume Monday, March 20
- No one will monitor Piazza during Spring Break
* [Thank You](https://docs.google.com/forms/d/e/1FAIpQLSe0Zi6JFbxPIEVr7u1DJykuel-qSi7U0sBp2iR0gi6R_CArgw/viewform)
%% Cell type:markdown id:7ccf358c tags:
### Learning Objectives:
- Interpret JSON formatted data and recognize differences between JSON and Python
- Deserialize data from JSON for use in Python programs (read)
- Serialize data into JSON for long term storage (write)
%% Cell type:code id:5990f3d6 tags:
``` python
import csv
# TODO: import json module
```
%% Cell type:code id:5662ace1 tags:
``` python
# Deserialize
def read_json(path):
    with open(path, encoding = "utf-8") as f: # f is a variable that refers to the open file
        return json.load(f) # returns the data stored in the JSON file (dict, list, etc)
# Serialize
def write_json(path, data):
    with open(path, 'w', encoding = "utf-8") as f:
        json.dump(data, f, indent = 2)
```
%% Cell type:markdown id:a35312c9 tags:
### Example 1: Sum of numbers (simple JSON)
%% Cell type:code id:d241dc5c tags:
``` python
# TODO 1: Create a new "numsA.json".
# Add the list [1, 2, 3, 4] to "numsA.json" file.
# Use jupyter notebook to create and edit the new file
# TODO 2: Use input built-in function to get JSON file name from user
# Read the JSON file using read_json; capture return value into a variable
# TODO 3: Print type of data returned by function that reads JSON file.
# TODO 4: Using Python built-in function sum(...), calculate total of numbers in numsA.json, print the total.
# TODO 5: Create a new JSON file "numsB.json" and try out the following data:
# [-1, 10, 4,]
# Does that work?
# Change the data to [-1, 10, 4] and try to run the program by providing input as numsB.json
# TODO 6: Create a new JSON file "simple.json" and try out the following data.
# What kind of error do you get with this?
# Fix the error by commenting the line of code that causes the error!
# 3.14
# True
# true
# 'hello'
# "hello"
```
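One way to check the candidates from TODO 5 and TODO 6 without creating a file for each is to pass them to `json.loads` as strings. This is only a sketch: it treats every line as its own JSON document, whereas the exercise places them in a file.

``` python
import json

candidates = ['[-1, 10, 4,]', '[-1, 10, 4]', '3.14', 'True', 'true', "'hello'", '"hello"']

results = {}
for text in candidates:
    try:
        results[text] = json.loads(text)
    except json.JSONDecodeError:
        # Raised when the text does not follow JSON syntax.
        results[text] = "invalid JSON"
    print(text, "->", results[text])
```

Python's `json` module follows the JSON syntax strictly, so Python-only spellings such as a trailing comma, `True`, or single-quoted strings are rejected.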
%% Cell type:markdown id:e830f23f tags:
### Example 2: Score Tracker
%% Cell type:code id:10e00558 tags:
``` python
player_details = input("Enter player name and score: ")
# TODO 1: extract player name and score into variables
# TODO 2: Define an empty "scores" dictionary to keep track of players'
# scores.
# KEY: player name VALUE: player scores list
input_file = "score_history.json"
# TODO 3: Check if player name is a key in the scores dictionary.
# If not, create a new key for player name and value as empty list
# to keep track of that player's scores.
# TODO 4: Add player's score to the player's list in scores dictionary
# TODO 5: Create a "score_history.json" file and populate that file with
# an empty dictionary {}
# TODO 6: Read "score_history.json" to populate initial "scores" dict,
# instead of the empty dict created in TODO 2.
# TODO 7: Calculate average score for that player
# print("Average score for {} is {}."???)
# TODO 8: At the end of the program, write the updated scores from dict
# into the "score_history.json" file
# That's it, now you have a program that helps you keep track
# of player scores permanently.
```
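The read, update, write cycle from the TODOs above can be sketched as a single function. This is one possible arrangement, not the lecture's solution: the function name `record_score`, the temporary file location, and the sample player are assumptions, and `input` is left out so the cell runs without prompting.

``` python
import json
import os
import tempfile

def record_score(path, player_name, score):
    """Add score to the player's history stored at path; return the player's average."""
    # Start from the saved history if the file exists, else from an empty dict.
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            scores = json.load(f)
    else:
        scores = {}

    # KEY: player name, VALUE: list of that player's scores
    if player_name not in scores:
        scores[player_name] = []
    scores[player_name].append(score)

    # Write the updated history back so it survives between runs.
    with open(path, 'w', encoding="utf-8") as f:
        json.dump(scores, f, indent=2)

    return sum(scores[player_name]) / len(scores[player_name])

history_path = os.path.join(tempfile.mkdtemp(), "score_history.json")
record_score(history_path, "alice", 10.0)
average = record_score(history_path, "alice", 20.0)
print("Average score for {} is {}.".format("alice", average))
```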
%% Cell type:markdown id:b920d6e0 tags:
### Example 3: Kiva.com Micro-lending site
Many websites have APIs that allow you to get their data.
%% Cell type:code id:dc891f5a tags:
``` python
# TODO: read "kiva.json"
# TODO: explore the type of the data structure returned by read_json
```
%% Cell type:markdown id:e29310b6 tags:
How to explore an unknown JSON?
- If you run into a `dict`, try the `.keys()` method to look at the keys of the dictionary, then look up individual keys to explore further
- If you run into a `list`, iterate over the list and print each item
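The two rules above can be combined into a small recursive helper. This is a sketch under assumptions: the name `outline`, the `max_items` limit, and the sample data are made up for illustration and do not reflect the real contents of "kiva.json".

``` python
def outline(data, depth=0, max_items=3):
    """Return a list of lines describing the nesting of dicts and lists in data."""
    pad = "  " * depth
    if isinstance(data, dict):
        # For a dict: show its keys, then explore each value.
        lines = [pad + "dict with keys " + str(list(data.keys()))]
        for key in data:
            lines.extend(outline(data[key], depth + 1, max_items))
    elif isinstance(data, list):
        # For a list: show its length, then explore the first few items.
        lines = [pad + "list of length " + str(len(data))]
        for item in data[:max_items]:
            lines.extend(outline(item, depth + 1, max_items))
    else:
        # Anything else is a leaf value; report only its type.
        lines = [pad + type(data).__name__]
    return lines

# Made-up sample with a nested shape, for illustration only.
sample = {"data": {"loans": [{"name": "Ana", "amount": 100}]}}
for line in outline(sample):
    print(line)
```

Limiting lists to the first few items keeps the output readable when a file holds many similar records.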
%% Cell type:code id:31fe630f tags:
``` python
# TODO: look up the value corresponding to the key "data"
# TODO: you know what to do next ... explore type again
```
%% Cell type:code id:e09b064f tags:
``` python
```
%% Cell type:code id:b6cb9c28 tags:
``` python
# TODO: iterate over loans_list and print every borrower's name, loan amount and country details
???
print("Borrower name:", borrower_name)
print("Loan amount: $", loan_amount, sep = "")
print("Country details:", country_details)
print("------------------------------------------------------------------------------------------------")
```
%% Cell type:markdown id:26b4c70c tags:
### Let's write student information dataset into various JSON files
%% Cell type:code id:c120d0eb tags:
``` python
# inspired by https://automatetheboringstuff.com/2e/chapter16/
def process_csv(filename):
    exampleFile = open(filename, encoding="utf-8")
    exampleReader = csv.reader(exampleFile)
    exampleData = list(exampleReader)
    exampleFile.close()
    return exampleData
survey_data = process_csv('cs220_survey_data.csv')
cs220_header = survey_data[0]
cs220_data = survey_data[1:]
```
%% Cell type:code id:47cb92e9 tags:
``` python
def cell(row_idx, col_name):
    """
    Returns the data value (cell) corresponding to the row index and
    the column name of a CSV file.
    """
    col_idx = cs220_header.index(col_name)
    val = cs220_data[row_idx][col_idx]
    # handle missing values, by returning None
    if val == '':
        return None
    # handle type conversions
    if col_name in ["Age",]:
        return int(val)
    return val
```
%% Cell type:code id:9a84cfb8 tags:
``` python
def bucketize(bucket_column):
    """
    generates and returns bucketized data based on bucket_column
    """
    # Key: unique bucketize column value; Value: list of lists (rows having that unique column value)
    buckets = dict()
    for row_idx in range(len(cs220_data)):
        row = cs220_data[row_idx]
        col_value = cell(row_idx, bucket_column)
        if col_value not in buckets:
            # create a new bucket when there is no existing bucket
            buckets[col_value] = []
        buckets[col_value].append(row)
    return buckets
# TODO: create lecture based buckets and store result into lecture_buckets
# TODO: What is the type of lecture_buckets? A ______ of _________
# TODO: write lecture_buckets into a JSON file called "lecture_cs220_data.json"
# TODO: create major based buckets and store result into major_buckets
# TODO: write major_buckets into a JSON file called "major_cs220_data.json"
```
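One detail worth knowing before writing buckets to a file: JSON object keys are always strings. Because `cell` returns `None` for missing values and an `int` for "Age", a bucket dictionary can have non-string keys, and `json` converts them on the way out. The sketch below uses a made-up bucket dictionary to show the effect.

``` python
import json

# Made-up buckets with a None key, an int key, and a str key.
buckets = {None: [["row1"]], 19: [["row2"]], "LEC001": [["row3"]]}

text = json.dumps(buckets)   # serialize to a JSON string
restored = json.loads(text)  # deserialize back into Python

print(list(buckets.keys()))
print(list(restored.keys()))  # every key is now a str
```

After the round trip, the keys come back as `"null"`, `"19"`, and `"LEC001"`, so code that reads the file later has to look up buckets by string.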
%% Cell type:code id:f36a584a tags:
``` python
def transform(header, data):
    """
    Transform data into a list of dictionaries
    """
    transformed_data = [] # should be defined outside the for loop, because it stores the entire data
    for row in data:
        # should be defined inside the for loop, because it represents one row as a dictionary
        row_dict = {} # Key: header; Value: data
        for idx in range(len(row)):
            row_dict[header[idx]] = row[idx]
        transformed_data.append(row_dict)
    return transformed_data
transformed_data = transform(cs220_header, cs220_data)
# TODO: What is the type of transformed_data? A ______ of _________
# TODO: write transformed_data into a JSON file called "cs220_survey_data.json"
```
%% Cell type:markdown id:5279cdab tags:
### Self-practice: explore real-world JSON
### Weather for UW-Madison campus
%% Cell type:code id:7a4efc8c tags:
``` python
# TODO: read "weather.json"
# TODO: explore the type of the data structure returned by read_json
# display the data from "weather.json"
```
%% Cell type:code id:c1b98641 tags:
``` python
# TODO: display the keys of the weather.json dict
# TODO: look up the value corresponding to the 'properties' key
# TODO: you know what to do next ... explore type again
```
%% Cell type:code id:ca457052 tags:
``` python
# TODO: display the keys of the properties dict
# TODO: look up the value corresponding to the 'periods' key
# TODO: you know what to do next ... explore type again
```
%% Cell type:code id:34f6dfa4 tags:
``` python
# TODO: extract periods list into a variable
# TODO: iterate over periods_list and print every period's startTime, endTime, temperature, and temperatureUnit
for period_dict in periods_list:
    print("Start time:", start_time)
    print("End time:", end_time)
    print("Temperature: {} degree {}".format(temperature, temperature_unit))
    print("------------------------------------------------------------------------------------------------")
```
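For reference, here is a sketch of the same pattern on a tiny made-up dictionary. It assumes the shape suggested by the TODOs above (a 'properties' dict holding a 'periods' list of dicts); the values are invented and are not taken from "weather.json".

``` python
# Made-up data shaped like the structure described in the TODOs above.
weather = {
    "properties": {
        "periods": [
            {"startTime": "2023-03-10T06:00:00", "endTime": "2023-03-10T18:00:00",
             "temperature": 30, "temperatureUnit": "F"},
            {"startTime": "2023-03-10T18:00:00", "endTime": "2023-03-11T06:00:00",
             "temperature": 35, "temperatureUnit": "F"},
        ]
    }
}

periods_list = weather["properties"]["periods"]  # dict -> dict -> list

summaries = []
for period_dict in periods_list:
    # Each period is a dict, so look up each field by its key.
    start_time = period_dict["startTime"]
    end_time = period_dict["endTime"]
    temperature = period_dict["temperature"]
    temperature_unit = period_dict["temperatureUnit"]
    summaries.append("Temperature: {} degree {}".format(temperature, temperature_unit))
    print("Start time:", start_time)
    print("End time:", end_time)
    print(summaries[-1])
```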