Commit 90906987 authored by msyamkumar

Lecture 16 notebooks

parent 929e161f
%% Cell type:markdown id:72348536 tags:
# List Practice
%% Cell type:code id:ba562f5e tags:
``` python
import csv
```
%% Cell type:code id:9d936c1c tags:
``` python
# inspired by https://automatetheboringstuff.com/2e/chapter16/
def process_csv(filename):
    # open the file; it's a UTF-8 text file
    example_file = open(filename, encoding="utf-8")
    # prepare it for reading as a CSV object
    example_reader = csv.reader(example_file)
    # use the built-in list function to convert this into a list of lists
    example_data = list(example_reader)
    # close the file to tidy up our workspace
    example_file.close()
    # return the list of lists
    return example_data
```
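A possible variant of `process_csv`, offered as a sketch rather than the lecture's version: a `with` block closes the file even if reading fails, and `newline=""` follows the recommendation in the `csv` module documentation. The name `process_csv_with` and the temporary demo file are hypothetical.

```python
import csv
import os
import tempfile

def process_csv_with(filename):
    """Read a CSV file and return its rows as a list of lists."""
    # the with block closes the file automatically, even on errors
    with open(filename, encoding="utf-8", newline="") as f:
        return list(csv.reader(f))

# demo on a small temporary file (hypothetical data)
demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv")
with open(demo_path, "w", encoding="utf-8", newline="") as f:
    csv.writer(f).writerows([["Lecture", "Age"], ["LEC001", "19"]])

demo_rows = process_csv_with(demo_path)
print(demo_rows)
```

Every value comes back as a string, which is why later cells convert `Age` with `int`.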
%% Cell type:markdown id:89621c98 tags:
### Student Information Survey data
%% Cell type:code id:d3c252b4 tags:
``` python
# TODO: call the process_csv function and store the list of lists in cs220_csv
cs220_csv = process_csv("cs220_survey_data.csv")
```
%% Cell type:code id:5838ae5f tags:
``` python
# Store the header row into cs220_header, using indexing
cs220_header = cs220_csv[0]
cs220_header
```
%% Output
['Lecture',
'Age',
'Primary major',
'Other majors',
'Zip Code',
'Pizza topping',
'Pet owner',
'Runner',
'Sleep habit',
'Procrastinator']
%% Cell type:code id:66fda88d tags:
``` python
# TODO: Store all of the data rows into cs220_data, using slicing
cs220_data = cs220_csv[1:]
# TODO: use slicing to display top 3 rows data
cs220_data[:3]
```
%% Output
[['LEC002',
'19',
'Engineering: Mechanical',
'',
'53711',
'pepperoni',
'Yes',
'No',
'night owl',
'Maybe'],
['LEC002',
'20',
'Science: Physics',
'Astronomy-Physics, History',
'53726',
'pineapple',
'Yes',
'Yes',
'night owl',
'Yes'],
['LEC001',
'20',
'Science: Chemistry',
'',
'53703',
'pepperoni',
'Yes',
'No',
'early bird',
'No']]
%% Cell type:markdown id:4267fe3e tags:
### What is the Sleep habit for the 2nd student?
%% Cell type:code id:4b8dbe8b tags:
``` python
cs220_data[1][8] # bad example: we hard-coded the column index
```
%% Output
'night owl'
%% Cell type:markdown id:4f125240 tags:
What if we decided to add a new column before `Sleep habit`? The hard-coded index would then point at the wrong column.
Instead of hard-coding the column index, use the `index` method to look up the column index from the header variable. This also makes your code much more readable.
%% Cell type:code id:f2e52e06 tags:
``` python
cs220_data[1][cs220_header.index("Sleep habit")]
```
%% Output
'night owl'
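To see why the lookup is more robust, here is a minimal sketch with hypothetical rows (not the survey file) in which a new column is inserted before `Sleep habit`:

```python
header = ["Lecture", "Age", "Sleep habit"]
row = ["LEC002", "20", "night owl"]

# both approaches agree on the original layout
hard_coded = row[2]
looked_up = row[header.index("Sleep habit")]

# hypothetical new layout: "Zip Code" is inserted before "Sleep habit"
header2 = ["Lecture", "Age", "Zip Code", "Sleep habit"]
row2 = ["LEC002", "20", "53726", "night owl"]

stale = row2[2]                                # silently returns the zip code
robust = row2[header2.index("Sleep habit")]    # still finds the right column
print(stale, robust)
```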
%% Cell type:markdown id:5d298a4c tags:
### What is the Lecture of the 4th student?
%% Cell type:code id:3617b3de tags:
``` python
cs220_data[3][cs220_header.index("Lecture")]
```
%% Output
'LEC004'
%% Cell type:markdown id:059de363 tags:
### Create a list containing Age of all students 10 years from now
%% Cell type:code id:45909f22 tags:
``` python
ages_in_ten_years = []
for row in cs220_data:
    age = row[cs220_header.index("Age")]
    if age == '':
        continue
    age = int(age)
    ages_in_ten_years.append(age + 10)
ages_in_ten_years[:3]
```
%% Output
[29, 30, 30]
%% Cell type:markdown id:8e18663d tags:
### cell function
- It would be very helpful to define a cell function, which can handle missing data and type conversions
%% Cell type:code id:bba90038 tags:
``` python
def cell(row_idx, col_name):
    """
    Returns the data value (cell) corresponding to the row index and
    the column name of a CSV file.
    """
    # TODO: get the index of col_name
    col_idx = cs220_header.index(col_name)
    # TODO: get the value of cs220_data at the specified cell
    val = cs220_data[row_idx][col_idx]
    # TODO: handle missing values, by returning None
    if val == '':
        return None
    # TODO: handle type conversions
    if col_name in ["Age"]:
        return int(val)
    return val
```
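The behaviour of `cell` can be checked on a tiny table. This sketch assumes hypothetical stand-ins `demo_header` and `demo_data` for `cs220_header` and `cs220_data`, and uses the name `demo_cell` to avoid clashing with the notebook's function:

```python
demo_header = ["Lecture", "Age", "Zip Code"]
demo_data = [["LEC001", "19", "53703"],
             ["LEC002", "", "53711"]]   # second student left Age blank

def demo_cell(row_idx, col_name):
    """Same logic as cell, but reading the demo table."""
    col_idx = demo_header.index(col_name)
    val = demo_data[row_idx][col_idx]
    if val == "":
        return None          # missing value
    if col_name == "Age":
        return int(val)      # type conversion
    return val

print(demo_cell(0, "Age"))       # an int, not a string
print(demo_cell(1, "Age"))       # None for the blank cell
print(demo_cell(1, "Zip Code"))  # zip codes stay strings
```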
%% Cell type:markdown id:b7c8e726 tags:
### Find average age per lecture.
%% Cell type:code id:f0a05e42 tags:
``` python
# TODO: initialize 4 lists for the 4 lectures
lec1_ages = []
lec2_ages = []
lec3_ages = []
lec4_ages = []
# Iterate over the data and populate the lists
for row_idx in range(len(cs220_data)):
    age = cell(row_idx, "Age")
    if age is not None:
        lecture = cell(row_idx, "Lecture")
        if lecture == "LEC001":
            lec1_ages.append(age)
        elif lecture == "LEC002":
            lec2_ages.append(age)
        elif lecture == "LEC003":
            lec3_ages.append(age)
        elif lecture == "LEC004":
            lec4_ages.append(age)
# TODO: compute average age of each lecture
print("LEC001 average student age:", round(sum(lec1_ages) / len(lec1_ages), 2))
print("LEC002 average student age:", round(sum(lec2_ages) / len(lec2_ages), 2))
print("LEC003 average student age:", round(sum(lec3_ages) / len(lec3_ages), 2))
print("LEC004 average student age:", round(sum(lec4_ages) / len(lec4_ages), 2))
```
%% Output
LEC001 average student age: 19.93
LEC002 average student age: 19.8
LEC003 average student age: 19.38
LEC004 average student age: 19.27
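The four near-identical lists can be folded into one helper that takes the lecture as a parameter. The sketch below runs on hypothetical `(lecture, age)` pairs rather than the survey data, and the name `avg_age` is an assumption. Returning `None` for an empty lecture also avoids the `ZeroDivisionError` that `sum(ages) / len(ages)` would raise on an empty list.

```python
# hypothetical (lecture, age) pairs; None marks a missing age
demo_rows = [("LEC001", 19), ("LEC001", 21), ("LEC002", 20), ("LEC002", None)]

def avg_age(lecture):
    """Average age in the given lecture, or None if no ages are known."""
    ages = []
    for lec, age in demo_rows:
        if lec == lecture and age is not None:
            ages.append(age)
    if len(ages) == 0:
        return None
    return round(sum(ages) / len(ages), 2)

print(avg_age("LEC001"))
print(avg_age("LEC007"))  # no such lecture in the demo data
```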
%% Cell type:markdown id:94548bf4 tags:
### `sort` method versus `sorted` function
- `sort` method sorts the original list in place (it modifies the list, like other mutating list methods such as `append`)
- `sorted` function returns a new sorted list and leaves the original unchanged
- default sorting order is ascending / alphanumeric
- `reverse` parameter is applicable for both `sort` method and `sorted` function:
    - enables you to specify descending order by passing argument as `True`
%% Cell type:code id:c1e555f9 tags:
``` python
some_list = [10, 4, 25, 2, -10] # TODO: Initialize some_list with a list of un-ordered integers
```
%% Cell type:code id:152297bb tags:
``` python
# TODO: Invoke sort method
rv = some_list.sort()
print(some_list)
# What does the sort method return?
# TODO: Capture return value into a variable rv and print the return value.
print(rv)
```
%% Output
[-10, 2, 4, 10, 25]
None
%% Cell type:markdown id:3c0d5e7d tags:
`sort` method returns `None` because it sorts the values in the original list
%% Cell type:code id:c06d8976 tags:
``` python
# TODO: invoke sorted function and pass some_list as argument
# TODO: capture return value into sorted_some_list
sorted_some_list = sorted(some_list)
# What does the sorted function return? It returns a brand new list with the values in sorted order
print(sorted_some_list)
```
%% Output
[-10, 2, 4, 10, 25]
%% Cell type:markdown id:ded0304c tags:
TODO: go back to `sort` method call and `sorted` function call and pass keyword argument `reverse = True`.
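A small sketch of the `reverse = True` exercise, using a hypothetical list `nums` so the notebook's `some_list` is left alone:

```python
nums = [10, 4, 25, 2, -10]

# sorted returns a new list in descending order; nums is untouched
desc_copy = sorted(nums, reverse=True)
unchanged = list(nums)

# sort reorders nums itself and returns None
rv = nums.sort(reverse=True)

print(desc_copy)
print(unchanged)
print(nums, rv)
```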
%% Cell type:markdown id:99803f1e tags:
### set data structure
- **not a sequence**: values have no ordering, so you cannot index or slice a `set`
- stores only unique values, which makes it very helpful for finding the unique values stored in a `list`
- easy to convert a `list` to a `set` and vice versa
- the original ordering is not guaranteed to survive the conversion to a `set`
%% Cell type:code id:928abc2e tags:
``` python
some_set = {10, 20, 30, 30, 40, 50, 10} # use a pair of curly braces to define it
some_set
```
%% Output
{10, 20, 30, 40, 50}
%% Cell type:code id:2aa9bc02 tags:
``` python
some_list = [10, 20, 30, 30, 40, 50, 10] # Initialize a list containing duplicate numbers
# TODO: to find unique values, convert it into a set
print(set(some_list))
# TODO: convert the set back into a list
print(list(set(some_list)))
```
%% Output
{40, 10, 50, 20, 30}
[40, 10, 50, 20, 30]
%% Cell type:markdown id:2a561420 tags:
Can you call `sort` method on a set?
%% Cell type:code id:0d616535 tags:
``` python
# some_set.sort()
# doesn't work: no method named sort associated with type set
# you cannot sort a set because of the lack of ordering
```
%% Cell type:markdown id:0349560e tags:
Can you pass a `set` as argument to `sorted` function? Python is intelligent :)
%% Cell type:code id:1db6f699 tags:
``` python
sorted(some_set) # works because sorted accepts any iterable and returns a new sorted list
```
%% Output
[10, 20, 30, 40, 50]
%% Cell type:markdown id:7389953d tags:
Can you index / slice into a `set`?
%% Cell type:code id:8b819251 tags:
``` python
# some_set[1] # doesn't work - remember set has no order
```
%% Cell type:code id:d0a48520 tags:
``` python
# some_set[1:] # doesn't work - remember set has no order
```
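The three set questions above can be checked in one runnable cell by catching the exceptions instead of commenting the lines out. The names `demo_set`, `errors` and `ordered` are hypothetical.

```python
demo_set = {10, 20, 30, 30, 40, 50, 10}
errors = []

try:
    demo_set.sort()          # sets have no sort method
except AttributeError:
    errors.append("AttributeError")

try:
    demo_set[1]              # sets do not support indexing
except TypeError:
    errors.append("TypeError")

ordered = sorted(demo_set)   # sorted builds a new list from the set
print(errors)
print(ordered)
```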
%% Cell type:markdown id:64fd0945 tags:
### Find all unique zip codes. Arrange them based on ascending order.
%% Cell type:code id:c28e77ce tags:
``` python
# TODO: initialize list to keep track of zip codes
zip_codes = []
for row_idx in range(len(cs220_data)):
    zip_code = cell(row_idx, "Zip Code")
    if zip_code is not None:
        zip_codes.append(zip_code)
zip_codes = list(set(zip_codes))
zip_codes.sort()
zip_codes
```
%% Output
['10306',
'19002',
'43706',
'5 3706',
'52706',
'52816',
'53076',
'53089',
'53175',
'53562',
'53575',
'53590',
'53597',
'53701',
'53703',
'53703-1104',
'53704',
'53705',
'53706',
'53706-1127',
'53706-1188',
'53706-1203',
'53706-1406',
'53708',
'53711',
'53713',
'53715',
'53717',
'53719',
'53726',
'54636',
'55416',
'57305',
'59301',
'83001',
'92376',
'internation student']
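The zip codes are strings, so they are compared character by character. That explains why `'5 3706'` lands before `'52706'` and why `'53703-1104'` follows `'53703'`. A short sketch with a hypothetical subset of values; the `key=int` call at the end is an extra illustration of numeric ordering, not something the lecture uses:

```python
codes = ["53706", "5 3706", "53703-1104", "53703", "10306"]

# string comparison: a space sorts before any digit,
# and a shorter string sorts before a longer one it is a prefix of
as_text = sorted(codes)
print(as_text)

# the same effect with plain number strings
number_strings = ["9", "10", "100"]
text_order = sorted(number_strings)              # compares "1" with "9" first
numeric_order = sorted(number_strings, key=int)  # compares the integer values
print(text_order, numeric_order)
```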
%% Cell type:markdown id:e354b781 tags:
### Arrange unique zip codes based on descending order.
%% Cell type:code id:ca887135 tags:
``` python
sorted(zip_codes, reverse = True)
```
%% Output
['internation student',
'92376',
'83001',
'59301',
'57305',
'55416',
'54636',
'53726',
'53719',
'53717',
'53715',
'53713',
'53711',
'53708',
'53706-1406',
'53706-1203',
'53706-1188',
'53706-1127',
'53706',
'53705',
'53704',
'53703-1104',
'53703',
'53701',
'53597',
'53590',
'53575',
'53562',
'53175',
'53089',
'53076',
'52816',
'52706',
'5 3706',
'43706',
'19002',
'10306']
%% Cell type:markdown id:31a381fe tags:
## Self-practice
%% Cell type:markdown id:8ac26620 tags:
### How many students are both a procrastinator and a pet owner?
%% Cell type:markdown id:172141ea tags:
### What percentage of 18-year-olds have their major declared as "Other"?
%% Cell type:markdown id:d9a7a2b1 tags:
### How old is the oldest basil/spinach-loving Business major?
%% Cell type:code id:5fcc04f2 tags:
``` python
```
%% Cell type:markdown id:72348536 tags:
# List Practice
%% Cell type:code id:ba562f5e tags:
``` python
import csv
```
%% Cell type:code id:9d936c1c tags:
``` python
# inspired by https://automatetheboringstuff.com/2e/chapter16/
def process_csv(filename):
    # open the file; it's a UTF-8 text file
    example_file = open(filename, encoding="utf-8")
    # prepare it for reading as a CSV object
    example_reader = csv.reader(example_file)
    # use the built-in list function to convert this into a list of lists
    example_data = list(example_reader)
    # close the file to tidy up our workspace
    example_file.close()
    # return the list of lists
    return example_data
```
%% Cell type:markdown id:89621c98 tags:
### Student Information Survey data
%% Cell type:code id:d3c252b4 tags:
``` python
# TODO: call the process_csv function and store the list of lists in cs220_csv
```
%% Cell type:code id:5838ae5f tags:
``` python
# Store the header row into cs220_header, using indexing
cs220_header = ???
cs220_header
```
%% Cell type:code id:66fda88d tags:
``` python
# TODO: Store all of the data rows into cs220_data, using slicing
cs220_data = ???
# TODO: use slicing to display top 3 rows data
cs220_data[???]
```
%% Cell type:markdown id:4267fe3e tags:
### What is the Sleep habit for the 2nd student?
%% Cell type:code id:4b8dbe8b tags:
``` python
# bad example: we hard-coded the column index
```
%% Cell type:markdown id:4f125240 tags:
What if we decided to add a new column before `Sleep habit`? The hard-coded index would then point at the wrong column.
Instead of hard-coding the column index, use the `index` method to look up the column index from the header variable. This also makes your code much more readable.
%% Cell type:code id:f2e52e06 tags:
``` python
```
%% Cell type:markdown id:5d298a4c tags:
### What is the Lecture of the 4th student?
%% Cell type:code id:3617b3de tags:
``` python
```
%% Cell type:markdown id:059de363 tags:
### Create a list containing Age of all students 10 years from now
%% Cell type:code id:45909f22 tags:
``` python
```
%% Cell type:markdown id:8e18663d tags:
### cell function
- It would be very helpful to define a cell function, which can handle missing data and type conversions
%% Cell type:code id:bba90038 tags:
``` python
def cell(row_idx, col_name):
    """
    Returns the data value (cell) corresponding to the row index and
    the column name of a CSV file.
    """
    # TODO: get the index of col_name
    # TODO: get the value of cs220_data at the specified cell
    # TODO: handle missing values, by returning None
    # TODO: handle type conversions
```
%% Cell type:markdown id:b7c8e726 tags:
### Find average age per lecture.
%% Cell type:code id:f0a05e42 tags:
``` python
# TODO: initialize 4 lists for the 4 lectures
# Iterate over the data and populate the lists
# TODO: compute average age of each lecture
print("LEC001 average student age:", round(sum(lec1_ages) / len(lec1_ages), 2))
print("LEC002 average student age:", round(sum(lec2_ages) / len(lec2_ages), 2))
print("LEC003 average student age:", round(sum(lec3_ages) / len(lec3_ages), 2))
print("LEC004 average student age:", round(sum(lec4_ages) / len(lec4_ages), 2))
```
%% Cell type:markdown id:6aeaec34 tags:
### `sort` method versus `sorted` function
- `sort` method sorts the original list in place (it modifies the list, like other mutating list methods such as `append`)
- `sorted` function returns a new sorted list and leaves the original unchanged
- default sorting order is ascending / alphanumeric
- `reverse` parameter is applicable for both `sort` method and `sorted` function:
    - enables you to specify descending order by passing argument as `True`
%% Cell type:code id:69335f13 tags:
``` python
some_list = # TODO: Initialize some_list with a list of un-ordered integers
```
%% Cell type:code id:62b2f81a tags:
``` python
# TODO: Invoke sort method
print(some_list)
# What does the sort method return?
# TODO: Capture return value into a variable rv and print the return value.
print(rv)
```
%% Cell type:markdown id:b5738543 tags:
`sort` method returns `None` because it sorts the values in the original list
%% Cell type:code id:bb93809e tags:
``` python
# TODO: invoke sorted function and pass some_list as argument
# TODO: capture return value into sorted_some_list
sorted_some_list = sorted(some_list)
# What does the sorted function return? It returns a brand new list with the values in sorted order
print(sorted_some_list)
```
%% Cell type:markdown id:0b7fceb6 tags:
TODO: go back to `sort` method call and `sorted` function call and pass keyword argument `reverse = True`.
%% Cell type:markdown id:5c33901b tags:
### set data structure
- **not a sequence**: values have no ordering, so you cannot index or slice a `set`
- stores only unique values, which makes it very helpful for finding the unique values stored in a `list`
- easy to convert a `list` to a `set` and vice versa
- the original ordering is not guaranteed to survive the conversion to a `set`
%% Cell type:code id:96914953 tags:
``` python
some_set = {10, 20, 30, 30, 40, 50, 10} # use a pair of curly braces to define it
some_set
```
%% Cell type:code id:5249e026 tags:
``` python
some_list = [10, 20, 30, 30, 40, 50, 10] # Initialize a list containing duplicate numbers
# TODO: to find unique values, convert it into a set
print(some_list)
# TODO: convert the set back into a list
print(some_list)
```
%% Cell type:markdown id:245d1dff tags:
Can you call `sort` method on a set?
%% Cell type:code id:321105b7 tags:
``` python
```
%% Cell type:markdown id:5abca57e tags:
Can you pass a `set` as argument to `sorted` function? Python is intelligent :)
%% Cell type:code id:14a1a216 tags:
``` python
```
%% Cell type:markdown id:c656510b tags:
Can you index / slice into a `set`?
%% Cell type:code id:a64ec5bd tags:
``` python
```
%% Cell type:code id:4d61e84d tags:
``` python
```
%% Cell type:markdown id:64fd0945 tags:
### Find all unique zip codes. Arrange them based on ascending order.
%% Cell type:code id:c28e77ce tags:
``` python
# TODO: initialize list to keep track of zip codes
zip_codes = []
for row_idx in range(len(cs220_data)):
    zip_code = cell(row_idx, "Zip Code")
    if zip_code is not None:
        zip_codes.append(zip_code)
zip_codes # How do we get the unique values?
```
%% Cell type:markdown id:d4edf965 tags:
### Arrange unique zip codes based on descending order.
%% Cell type:code id:83926b35 tags:
``` python
```
%% Cell type:markdown id:31a381fe tags:
## Self-practice
%% Cell type:markdown id:8ac26620 tags:
### How many students are both a procrastinator and a pet owner?
%% Cell type:markdown id:172141ea tags:
### What percentage of 18-year-olds have their major declared as "Other"?
%% Cell type:markdown id:d9a7a2b1 tags:
### How old is the oldest basil/spinach-loving Business major?
%% Cell type:code id:5fcc04f2 tags:
``` python
```
%% Cell type:markdown id:72348536 tags:
# List Practice
%% Cell type:code id:d21a94b5 tags:
``` python
import csv
```
%% Cell type:markdown id:cd8a434c tags:
### Warmup 1: min / max
%% Cell type:code id:baa730ba tags:
``` python
some_list = [45, -4, 66, 220, 10]
min_val = None
for val in some_list:
    if min_val is None or val < min_val:
        min_val = val
print(min_val)
max_val = None
for val in some_list:
    if max_val is None or val > max_val:
        max_val = val
print(max_val)
```
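The warmup loop can be wrapped in a function and compared against the built-in `min`. This is only a sketch and the name `manual_min` is hypothetical. One difference worth noting: the built-in `min` raises `ValueError` on an empty list, while the loop version simply leaves the result as `None`.

```python
def manual_min(items):
    """Return the smallest item, or None for an empty list."""
    smallest = None
    for val in items:
        if smallest is None or val < smallest:
            smallest = val
    return smallest

vals = [45, -4, 66, 220, 10]
print(manual_min(vals), min(vals))
print(manual_min([]))
```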
%% Cell type:markdown id:3502c700 tags:
### Warmup 2: median
%% Cell type:code id:414ae09e tags:
``` python
def median(some_items):
    """
    Returns median of a list passed as argument
    """
    pass

nums = [5, 4, 3, 2, 1]
print("Median of", nums, "is", median(nums))
nums = [6, 5, 4, 3, 2, 1]
print("Median of", nums, "is", median(nums))
```
%% Cell type:code id:73fa337e tags:
``` python
vals = ["A", "C", "B"]
print("Median of", vals, "is", median(vals))
vals = ["A", "C", "B", "D"]
# print("Median of", vals, "is", median(vals)) # does not work due to TypeError
```
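One way the median warmup could be completed, shown as a sketch under the name `median_sketch` (hypothetical, not the lecture's solution). It sorts a copy, returns the middle item for an odd length, and averages the two middle items for an even length. The averaging step is what fails for an even-length list of strings: `"B" + "C"` is allowed, but dividing the resulting string by 2 raises `TypeError`.

```python
def median_sketch(items):
    """Median of a list; averages the two middle items for even lengths."""
    ordered = sorted(items)      # sorted copy, the argument is not modified
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_sketch([5, 4, 3, 2, 1]))
print(median_sketch([6, 5, 4, 3, 2, 1]))
print(median_sketch(["A", "C", "B"]))

try:
    median_sketch(["A", "C", "B", "D"])
    even_strings_failed = False
except TypeError:
    even_strings_failed = True
print(even_strings_failed)
```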
%% Cell type:markdown id:050fd57c tags:
### set data structure
- **not a sequence**: values have no ordering, so you cannot index or slice a `set`
- stores only unique values, which makes it very helpful for finding the unique values stored in a `list`
- easy to convert a `list` to a `set` and vice versa
- the original ordering is not guaranteed to survive the conversion to a `set`
%% Cell type:code id:7d4a693f tags:
``` python
some_set = {10, 20, 30, 30, 40, 50, 10} # use a pair of curly braces to define it
some_set
```
%% Cell type:code id:baef596c tags:
``` python
some_list = [10, 20, 30, 30, 40, 50, 10] # Initialize a list containing duplicate numbers
# TODO: to find unique values, convert it into a set
print(set(some_list))
# TODO: convert the set back into a list
print(list(set(some_list)))
```
%% Cell type:markdown id:2be52d13 tags:
Can you index / slice into a `set`?
%% Cell type:code id:f622a5eb tags:
``` python
some_set[1] # doesn't work - remember set has no order
```
%% Cell type:code id:e679d3a7 tags:
``` python
some_set[1:] # doesn't work - remember set has no order
```
%% Cell type:code id:9d936c1c tags:
``` python
# inspired by https://automatetheboringstuff.com/2e/chapter16/
def process_csv(filename):
    # open the file; it's a UTF-8 text file
    example_file = open(filename, encoding="utf-8")
    # prepare it for reading as a CSV object
    example_reader = csv.reader(example_file)
    # use the built-in list function to convert this into a list of lists
    example_data = list(example_reader)
    # close the file to tidy up our workspace
    example_file.close()
    # return the list of lists
    return example_data
```
%% Cell type:markdown id:89621c98 tags:
### Student Information Survey data
%% Cell type:code id:d3c252b4 tags:
``` python
# TODO: call the process_csv function and store the list of lists in cs220_csv
cs220_csv = process_csv(???)
```
%% Cell type:code id:5838ae5f tags:
``` python
# Store the header row into cs220_header, using indexing
cs220_header = ???
cs220_header
```
%% Cell type:code id:66fda88d tags:
``` python
# TODO: Store all of the data rows into cs220_data, using slicing
cs220_data = ???
# TODO: use slicing to display top 3 rows data
cs220_data???
```
%% Cell type:markdown id:4267fe3e tags:
### What `Pizza topping` does the 13th student prefer?
%% Cell type:code id:4b8dbe8b tags:
``` python
# bad example: we hard-coded the column index
```
%% Cell type:markdown id:4f125240 tags:
What if we decided to add a new column before `Sleep habit`? The hard-coded index would then point at the wrong column.
Instead of hard-coding the column index, use the `index` method to look up the column index from the header variable. This also makes your code much more readable.
%% Cell type:code id:f2e52e06 tags:
``` python
```
%% Cell type:markdown id:5d298a4c tags:
### What is the Lecture of the 4th student?
%% Cell type:code id:3617b3de tags:
``` python
```
%% Cell type:markdown id:059de363 tags:
### What **unique** `age` values are included in the dataset?
%% Cell type:code id:45909f22 tags:
``` python
```
%% Cell type:markdown id:8e18663d tags:
### cell function
- It would be very helpful to define a cell function, which can handle missing data and type conversions
%% Cell type:code id:bba90038 tags:
``` python
def cell(row_idx, col_name):
    """
    Returns the data value (cell) corresponding to the row index and
    the column name of a CSV file.
    """
    # TODO: get the index of col_name
    # TODO: get the value of cs220_data at the specified cell
    # TODO: handle missing values, by returning None
    # TODO: handle type conversions
    return val
```
%% Cell type:markdown id:b7c8e726 tags:
### Function `avg_age_per_lecture(lecture)`
%% Cell type:code id:fa5598e0 tags:
``` python
def avg_age_per_lecture(lecture):
    '''
    avg_age_per_lecture(lecture) returns the average age of
    the students in the given `lecture`; if there are no
    students in the given `lecture`, it returns `None`
    '''
    # To compute average you don't need to actually populate a list.
    # But here a list will come in handy. It will help you with the None return requirement.
    pass
```
%% Cell type:code id:f0a05e42 tags:
``` python
avg_age_per_lecture("LEC002")
```
%% Cell type:code id:9f2c7e6e tags:
``` python
print(avg_age_per_lecture("LEC007"))
```
%% Cell type:markdown id:94548bf4 tags:
### `sort` method versus `sorted` function
- `sort` method sorts the original list in place (it modifies the list, like other mutating list methods such as `append`)
- `sorted` function returns a new sorted list and leaves the original unchanged
- default sorting order is ascending / alphanumeric
- `reverse` parameter is applicable for both `sort` method and `sorted` function:
    - enables you to specify descending order by passing argument as `True`
%% Cell type:code id:c1e555f9 tags:
``` python
some_list = [10, 4, 25, 2, -10]
```
%% Cell type:code id:152297bb tags:
``` python
# TODO: Invoke sort method
rv = ???
print(some_list)
# What does the sort method return?
# TODO: Capture return value into a variable rv and print the return value.
print(rv)
```
%% Cell type:markdown id:3c0d5e7d tags:
`sort` method returns `None` because it sorts the values in the original list
%% Cell type:code id:c06d8976 tags:
``` python
# TODO: invoke sorted function and pass some_list as argument
# TODO: capture return value into sorted_some_list
???
# What does the sorted function return?
# It returns a brand new list with the values in sorted order
print(sorted_some_list)
```
%% Cell type:markdown id:ded0304c tags:
TODO: go back to `sort` method call and `sorted` function call and pass keyword argument `reverse = True`.
%% Cell type:markdown id:35894ef5 tags:
Can you call `sort` method on a set?
%% Cell type:code id:fc08879e tags:
``` python
some_set.sort()
# doesn't work: no method named sort associated with type set
# you cannot sort a set because of the lack of ordering
```
%% Cell type:markdown id:99161c42 tags:
Can you pass a `set` as argument to `sorted` function? Python is intelligent :)
%% Cell type:code id:2549df29 tags:
``` python
# works because sorted accepts any iterable and returns a new sorted list
sorted(some_set)
```
%% Cell type:markdown id:5c7f3489 tags:
### Function: `find_majors(phrase)`
%% Cell type:code id:b6adbfe0 tags:
``` python
def find_majors(phrase):
    """
    find_majors(phrase) returns a list of all the majors that contain the
    substring (case insensitive match) `phrase`.
    """
    # TODO: initialize the target list here
    # TODO: iterate over row indices
    for row_idx in range(len(cs220_data)):
        major = cell(row_idx, "Major")
        # TODO: write the actual logic here
    return majors
```
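The case-insensitive match that the docstring asks for can be sketched by lowercasing both sides before using `in`. The list `demo_majors` and the name `find_majors_sketch` are hypothetical stand-ins, so this is an illustration of the idea rather than the intended solution.

```python
# hypothetical major values; None stands for a missing cell
demo_majors = ["Computer Science", "Data Science", "Engineering: Mechanical", None]

def find_majors_sketch(phrase):
    """Return the demo majors that contain phrase, ignoring case."""
    matches = []
    for major in demo_majors:
        # skip missing values, then compare lowercase versions
        if major is not None and phrase.lower() in major.lower():
            matches.append(major)
    return matches

print(find_majors_sketch("science"))
print(find_majors_sketch("COMPUTER"))
```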
%% Cell type:markdown id:1b7f671f tags:
### Find all `major` that contain **either** `"Computer"` **or** `"Science"`.
Your output **must** be a *list*. The order **does not** matter, but if a `major` contains **both** `"Computer"` and `"Science"`, then the major must be included **only once** in your list.
%% Cell type:code id:ed895a3b tags:
``` python
computer_majors = ???
science_majors = ???
computer_and_science_majors = ???
# TODO: Now find just the unique values
computer_and_science_majors = ???
computer_and_science_majors
```
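A sketch of the "only once" requirement, using two hypothetical result lists in place of the `find_majors` calls: concatenate the lists, then pass the result through `set` and back to `list` to drop the repeated major.

```python
# hypothetical results of two searches; "Computer Science" appears in both
computer = ["Computer Science", "Engineering: Computer"]
science = ["Computer Science", "Data Science"]

combined = computer + science        # still contains the duplicate
unique = list(set(combined))         # a list again, in no guaranteed order

print(len(combined), len(unique))
print(sorted(unique))
```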
%% Cell type:markdown id:64fd0945 tags:
### Order the `major` that contain **either** `"Computer"` **or** `"Science"` using ascending order.
%% Cell type:code id:efcdf514 tags:
``` python
# VERSION 1
# Be very careful: if you use sorted, make sure your return value
# variable matches with the variable for that project question
sorted_computer_and_science_majors = sorted(computer_and_science_majors)
sorted_computer_and_science_majors
```
%% Cell type:code id:c28e77ce tags:
``` python
# VERSION 2
computer_and_science_majors.sort()
computer_and_science_majors
```
%% Cell type:markdown id:e354b781 tags:
### Order the `major` that contain **either** `"Computer"` **or** `"Science"` using descending order.
%% Cell type:code id:ca887135 tags:
``` python
# VERSION 1
# Be very careful: if you use sorted, make sure your return value
# variable matches with the variable for that project question
reverse_sorted_computer_and_science_majors = sorted(computer_and_science_majors, reverse = ???)
reverse_sorted_computer_and_science_majors
```
%% Cell type:code id:b6c61532 tags:
``` python
# VERSION 2
computer_and_science_majors.sort(reverse = ???)
computer_and_science_majors
```
%% Cell type:markdown id:2862160c tags:
### For `major` containing `"other"`, extract the details that come after `"|"`.
%% Cell type:code id:600fae6c tags:
``` python
other_majors = find_majors("other")
other_major_details = []
for other in other_majors:
    print(other)
    # TODO: complete the rest of the logic
other_major_details
```
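The remaining logic might use `str.split`. The sketch below assumes that "other" majors look like `"Other|<details>"`, which is inferred from the heading rather than confirmed by the data, and it runs on a hypothetical list `demo_other`.

```python
# hypothetical values; the "Other|details" shape is an assumption
demo_other = ["Other|Engineering: Biomedical", "Other|Real Estate", "Other"]

details = []
for other in demo_other:
    parts = other.split("|")         # ["Other", "details"] when "|" is present
    if len(parts) > 1:
        details.append(parts[1].strip())

print(details)
```

Entries without a `"|"` produce a single-item list from `split`, so the length check skips them instead of raising `IndexError`.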
%% Cell type:markdown id:31a381fe tags:
## Self-practice
%% Cell type:markdown id:8ac26620 tags:
### How many students are both a procrastinator and a pet owner?
%% Cell type:markdown id:172141ea tags:
### What percentage of 18-year-olds have their major declared as "Other"?
%% Cell type:markdown id:d9a7a2b1 tags:
### How old is the oldest basil/spinach-loving Business major?
%% Cell type:code id:5fcc04f2 tags:
``` python
```