Commit 5ca95af2 authored by gsingh58
lec9 solution updated

parent eb9f52e5
%% Cell type:markdown id:d684d88e-e96d-4392-b4d6-92d3f1669b32 tags:
# Binary Search Trees
- Recursive `add()` method
- Recursive `height()` method
%% Cell type:code id:66ba808a tags:
``` python
from graphviz import Graph, Digraph
import random
import math
```
%% Cell type:markdown id:3c372194 tags:
## Binary Search Tree
- special case of *Binary trees*
- **BST rule**: any node's value is bigger than every value in its left subtree, and smaller than every value in its right subtree
- TODO: write an efficient search for a BST (better complexity than O(N))
- TODO: write a method to add values to a BST, while preserving the BST rule
%% Cell type:code id:894a39d2-5e3b-4178-bc1b-dedf0b5a86c0 tags:
``` python
class BSTNode:
    def __init__(self, label):
        self.label = label
        self.left = None
        self.right = None

    # Category 2: functions that do some action
    def dump(self, prefix="", suffix=""):
        """
        prints out name of every node in the tree with some basic formatting
        """
        print(prefix, self.label, suffix)
        if self.left != None:
            self.left.dump(prefix+"\t", "(LEFT)")
        if self.right != None:
            self.right.dump(prefix+"\t", "(RIGHT)")

    # Category 1: functions that return some computation
    def search(self, target):
        """
        returns True/False, if target is somewhere in the tree
        """
        if target == self.label:
            return True
        elif target < self.label:
            if self.left != None:
                if self.left.search(target):
                    return True
        elif target > self.label:
            if self.right != None:
                if self.right.search(target):
                    return True
        return False

    def add(self, label):
        """
        Finds the correct spot for label and adds a new node with it.
        Assumes that tree already contains at least one node -> TODO: discuss why?
        Raises ValueError if label is already on the tree.
        """
        if label < self.label:
            # go left
            if self.left == None:
                self.left = BSTNode(label)
            else:
                # recurse left
                self.left.add(label)
        elif label > self.label:
            # go right
            if self.right == None:
                self.right = BSTNode(label)
            else:
                # recurse right
                self.right.add(label)
        else:
            raise ValueError(f"{label} is already a node on the tree!")

    def height(self):
        """
        Calculates height of the BST.
        Height: the number of nodes on the longest root-to-leaf path (including the root)
        """
        if self.left == None:
            l = 0
        else:
            # recurse left
            l = self.left.height()
        if self.right == None:
            r = 0
        else:
            # recurse right
            r = self.right.height()
        return max(l, r)+1
```
%% Cell type:markdown id:d22c3684 tags:
### Code folding nbextension
- Go to "jupyterlab" > "Settings" > "Advanced Settings Editor" > "Notebook" > "Rulers" > enable "Code Folding" (there should be three such settings).
%% Cell type:markdown id:1d6935d8 tags:
### Recursive `add` method
- Manually creating a tree is cumbersome and subject to mistakes (violations of BST rule)
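
To make the "subject to mistakes" point concrete, here is a minimal sketch of a checker for the BST rule. `SketchNode` and `is_bst` are hypothetical names introduced only for this illustration (they are not part of the lecture code); `SketchNode` simply mirrors the `label` / `left` / `right` attributes of `BSTNode` so the snippet runs on its own.

```python
class SketchNode:
    """Hypothetical stand-in with the same three attributes as BSTNode."""
    def __init__(self, label):
        self.label = label
        self.left = None
        self.right = None

def is_bst(node, low=None, high=None):
    """Return True if every label in this subtree lies strictly between low and high."""
    if node is None:
        return True
    if low is not None and node.label <= low:
        return False
    if high is not None and node.label >= high:
        return False
    # left subtree must stay below node.label, right subtree above it
    return is_bst(node.left, low, node.label) and is_bst(node.right, node.label, high)

good = SketchNode(10)
good.left = SketchNode(2)
good.right = SketchNode(15)

bad = SketchNode(10)
bad.left = SketchNode(2)
bad.left.right = SketchNode(12)  # 12 is in the left subtree of 10: violates the rule

print(is_bst(good))  # True
print(is_bst(bad))   # False
```

Note that the mistake in `bad` is not visible by comparing a node only with its direct parent (12 > 2 is fine locally), which is why the check carries the `low` / `high` bounds down the tree.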
%% Cell type:code id:7047d184 tags:
``` python
root = BSTNode(10)
root.left = BSTNode(2)
root.left.left = BSTNode(1)
root.left.right = BSTNode(4)
root.left.right.right = BSTNode(8)
root.left.right.left = BSTNode(3)
root.right = BSTNode(15)
root.right.left = BSTNode(12)
root.right.right = BSTNode(19)
root.dump("", "(ROOT)")
```
%% Output
10 (ROOT)
    2 (LEFT)
        1 (LEFT)
        4 (RIGHT)
            3 (LEFT)
            8 (RIGHT)
    15 (RIGHT)
        12 (LEFT)
        19 (RIGHT)
%% Cell type:code id:0cd51cf2 tags:
``` python
values = [10, 2, 1, 4, 8, 3, 15, 12, 19]
root = BSTNode(values[0])
for val in values[1:]:
    root.add(val)
root.dump("", "(ROOT)")
```
%% Output
10 (ROOT)
    2 (LEFT)
        1 (LEFT)
        4 (RIGHT)
            3 (LEFT)
            8 (RIGHT)
    15 (RIGHT)
        12 (LEFT)
        19 (RIGHT)
%% Cell type:markdown id:f9324526 tags:
### Recursive `height` method
- **Height**: the number of nodes on the longest root-to-leaf path (including the root)
- left subtree has height 4, right subtree has height 6, my height = 7
- left subtree has height 4, right subtree has height 4, my height = 5
- left subtree has height 10, right subtree has height 0, my height = 11
- left subtree has height of l, right subtree has height of r, my height = max(l, r)+1
- What is the simplest case for height calculation? Tree containing just root node
- What are the values of l and r in that case? l = 0 and r = 0
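
The recurrence above can be tried out without the full class. The sketch below assumes a hypothetical, simplified representation in which a tree is either `None` (empty) or a `(left, right)` pair of subtrees; labels are left out because they do not affect the height.

```python
def height(tree):
    """tree is None (empty subtree) or a (left, right) pair of subtrees."""
    if tree is None:
        return 0          # an empty subtree contributes height 0
    left, right = tree
    return max(height(left), height(right)) + 1

leaf = (None, None)

print(height(leaf))                   # 1  (just the root: l = 0, r = 0)
print(height((leaf, None)))           # 2
print(height(((leaf, None), leaf)))   # 3  (max(2, 1) + 1)
```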
%% Cell type:code id:18d8de1d tags:
``` python
# TODO: Let's implement and invoke the height method
root.height()
```
%% Output
4
%% Cell type:markdown id:bb2057f2 tags:
### Tree containing 100 values
- let's use range(...) to produce a sequence of 100 integers
- recall that range(...) returns a sequence in increasing order
- what will be the height of this tree? **100**
%% Cell type:code id:820f3596 tags:
``` python
values = list(range(100))
# Q: Is this tree balanced?
# A: No, it is the worst possible BST for these numbers, that is
# it is a linked list!
root = BSTNode(values[0])
for val in values[1:]:
    root.add(val)
print(root.height())
# root.dump("", "(ROOT)")
```
%% Output
100
%% Cell type:markdown id:af9dd1b3 tags:
#### Let's use `random` module `shuffle` function to randomly order the sequence of 100 numbers.
- in-place re-ordering of numbers (just like `sort` method)
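
A small sketch of what "in-place" means here: `random.shuffle` reorders the list it is given and returns `None`, so its result should not be assigned back to the variable. The name `result` is used only for this illustration.

```python
import random

values = list(range(10))
result = random.shuffle(values)   # reorders `values` itself

print(result)                              # None
print(sorted(values) == list(range(10)))   # True: same items, new order
```

Calling `random.seed(...)` with a fixed value before shuffling is one way to get the same ordering on every run, which can help when comparing tree heights.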
%% Cell type:code id:c07664be tags:
``` python
values = list(range(100))
random.shuffle(values)
# Q: Is this tree balanced?
# A: depends on the shuffling, you can check using math.log2(N)
root = BSTNode(values[0])
for val in values[1:]:
    root.add(val)
print(root.height())
root.dump("", "(ROOT)")
```
%% Output
15
[output of root.dump("", "(ROOT)") omitted: one line per node, 100 lines in total; the layout and the height differ on every run (another recorded run printed a height of 17)]
%% Cell type:code id:4d87a7e7 tags:
``` python
math.log2(100)
```
%% Output
6.643856189774724
%% Cell type:markdown id:cf919d84 tags:
### Balanced BSTs / Self-balancing BSTs
- not a covered topic for the purpose of this course
- you can explore the below recursive function definition if you are interested
- you are **not required** to know how to do this
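
For context on what "balanced" buys us: a binary tree of height h (counted in nodes, as in this notebook) can hold at most 2^h - 1 nodes, so the smallest possible height for N nodes is ceil(log2(N + 1)). The helper name `min_height` below is an assumption made for this sketch.

```python
import math

def min_height(n):
    """Smallest possible height (counted in nodes) of a binary tree with n nodes."""
    return math.ceil(math.log2(n + 1))

print(min_height(1))    # 1
print(min_height(5))    # 3
print(min_height(100))  # 7
```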
%% Cell type:code id:bd5aa50f tags:
``` python
# Recursive function that computes an insertion order for a balanced BST
def sorted_array_to_bst(nums, bst_nums):
    """
    Produces the best ordering of nums (a sorted list of numbers)
    for the purpose of creating a balanced BST.
    Appends the new ordering of numbers to bst_nums.
    """
    if len(nums) == 0:
        return None
    elif len(nums) == 1:
        bst_nums.append(nums[0])
    else:
        mid_index = len(nums)//2
        bst_nums.append(nums[mid_index])
        # recurse left
        left_val = sorted_array_to_bst(nums[:mid_index], bst_nums)
        if left_val != None:
            bst_nums.append(left_val)
        # recurse right
        right_val = sorted_array_to_bst(nums[mid_index+1:], bst_nums)
        if right_val != None:
            bst_nums.append(right_val)
```
%% Cell type:code id:98b9148d tags:
``` python
bst_nums = []
sorted_array_to_bst(list(range(5)), bst_nums)
bst_nums
```
%% Output
[2, 1, 0, 4, 3]
%% Cell type:code id:f1288713 tags:
``` python
bst_nums = []
sorted_array_to_bst(list(range(100)), bst_nums)
root = BSTNode(bst_nums[0])
for val in bst_nums[1:]:
    root.add(val)
print(root.height())
```
%% Output
7
%% Cell type:code id:399fe31a tags:
``` python
bst_nums = []
sorted_array_to_bst(list(range(5)), bst_nums)
root = BSTNode(bst_nums[0])
for val in bst_nums[1:]:
    root.add(val)
print(root.height())
root.dump("", "(ROOT)")
```
%% Output
3
2 (ROOT)
    1 (LEFT)
        0 (LEFT)
    4 (RIGHT)
        3 (LEFT)
%% Cell type:markdown id:e042962c tags:
### Depth First Search (DFS)
- Last lecture: BST search with complexity **O(logN)** (when the tree is balanced)
- Finds a path from one node to another -- works on any directed graph
%% Cell type:code id:c00e99eb tags:
``` python
def example(num):
    g = Graph()
    if num == 1:
        g.node("A")
        g.edge("B", "C")
        g.edge("C", "D")
        g.edge("D", "B")
    elif num == 2:
        g.edge("A", "B")
        g.edge("B", "C")
        g.edge("C", "D")
        g.edge("D", "E")
        g.edge("A", "E")
    elif num == 3:
        g.edge("A", "B")
        g.edge("A", "C")
        g.edge("B", "D")
        g.edge("B", "E")
        g.edge("C", "F")
        g.edge("C", "G")
    elif num == 4:
        g.edge("A", "B")
        g.edge("A", "C")
        g.edge("B", "D")
        g.edge("B", "E")
        g.edge("C", "F")
        g.edge("C", "G")
        g.edge("E", "Z")
        g.edge("C", "Z")
        g.edge("B", "A")
    elif num == 5:
        width = 8
        height = 4
        for L1 in range(height-1):
            L2 = L1 + 1
            for i in range(width-(height-L1-1)):
                for j in range(width-(height-L2-1)):
                    node1 = str(L1)+"-"+str(i)
                    node2 = str(L2)+"-"+str(j)
                    g.edge(node1, node2)
    else:
        raise Exception("no such example")
    return g
```
%% Cell type:markdown id:6690b3be tags:
### For a regular graph, you need a new class `Graph` to keep track of the whole graph.
- Why? Remember graphs need not have a "root" node, which means there is no one origin point
%% Cell type:code id:8f5e8b06 tags:
``` python
class Graph:
    def __init__(self):
        # name => Node
        self.nodes = {}
        # to keep track which nodes have already been visited
        self.visited = set()

    def node(self, name):
        node = Node(name)
        self.nodes[name] = node
        node.graph = self

    def edge(self, src, dst):
        """
        Automatically adds missing nodes.
        """
        for name in [src, dst]:
            if not name in self.nodes:
                self.node(name)
        self.nodes[src].children.append(self.nodes[dst])

    def _repr_svg_(self):
        """
        Draws the graph nodes and edges iteratively.
        """
        g = Digraph()
        for n in self.nodes:
            g.node(n)
            for child in self.nodes[n].children:
                g.edge(n, child.name)
        return g._repr_image_svg_xml()

    def dfs_search(self, src_name, dst_name):
        """
        Clears the visited set and invokes dfs_search using Node object instance
        with name src_name.
        """
        # Q: is this method recursive?
        # A: no, it is just invoking dfs_search method for Node object instance
        # dfs_search method in Node class is recursive
        # These methods in two different classes just happen to share the same name
        self.visited.clear()
        return self.nodes[src_name].dfs_search(self.nodes[dst_name])

class Node:
    def __init__(self, name):
        self.name = name
        self.children = []
        self.graph = None # back reference
        self.finder = None # who found me during BFS

    def __repr__(self):
        return self.name

    def dfs_search(self, dst):
        """
        Returns True / False when path to dst is found / not found.
        """
        # TODO: what is the simplest case? current node is the dst
        if self in self.graph.visited:
            return False
        self.graph.visited.add(self)
        if self == dst:
            return True
        for child in self.children:
            # recurse using this same method (the name must match the method defined here)
            if child.dfs_search(dst):
                return True
        return False

g = example(1)
g
```
%% Output
<__main__.Graph at 0x7f58e025b490>
%% Cell type:markdown id:c83e9993-765c-42a0-97f6-6277627acf95 tags:
### Testcases for DFS with True or False
%% Cell type:code id:15edd0d2 tags:
``` python
print(g.dfs_search("B", "A")) # should return False
print(g.dfs_search("B", "D")) # should return True
```
%% Output
False
True
%% Cell type:code id:0330e0b6 tags:
``` python
# DFS search
# TODO: give the actual path, not just True/False
# TODO: use a different algorithm to find the shortest path
```
%% Cell type:markdown id:ac4d6c30 tags:
#### **IMPORTANT**: redefining the same `class` in a later cell is not recommended. It is done here only for example purposes. You should always go back to the original cell and update the definition there.
%% Cell type:code id:44fda5f1 tags:
``` python
class Graph:
    def __init__(self):
        # name => Node
        self.nodes = {}
        # to keep track which nodes have already been visited
        self.visited = set()

    def node(self, name):
        node = Node(name)
        self.nodes[name] = node
        node.graph = self

    def edge(self, src, dst):
        """
        Automatically adds missing nodes.
        """
        for name in [src, dst]:
            if not name in self.nodes:
                self.node(name)
        self.nodes[src].children.append(self.nodes[dst])

    def _repr_svg_(self):
        """
        Draws the graph nodes and edges iteratively.
        """
        g = Digraph()
        for n in self.nodes:
            g.node(n)
            for child in self.nodes[n].children:
                g.edge(n, child.name)
        return g._repr_image_svg_xml()

    def dfs_search(self, src_name, dst_name):
        """
        Clears the visited set and invokes dfs_search using Node object instance
        with name src_name.
        """
        # Q: is this method recursive?
        # A: no, it is just invoking dfs_search method for Node object instance
        # dfs_search method in Node class is recursive
        # These methods in two different classes just happen to share the same name
        self.visited.clear()
        return self.nodes[src_name].dfs_search(self.nodes[dst_name])

class Node:
    def __init__(self, name):
        self.name = name
        self.children = []
        self.graph = None # back reference
        self.finder = None # who found me during BFS

    def __repr__(self):
        return self.name

    def dfs_search_v1(self, dst):
        """
        Returns True / False when path to dst is found / not found.
        Try using this method by commenting out the dfs_search method below.
        """
        # TODO: what is the simplest case? current node is the dst
        if self in self.graph.visited:
            return False
        self.graph.visited.add(self)
        if self == dst:
            return True
        for child in self.children:
            if child.dfs_search_v1(dst):
                return True
        return False

    def dfs_search(self, dst):
        """
        Returns the actual path to the dst as a tuple or None otherwise
        """
        # TODO: what is the simplest case? current node is the dst
        if self in self.graph.visited:
            return None
        self.graph.visited.add(self)
        if self == dst:
            return (self,)
        for child in self.children:
            child_path = child.dfs_search(dst)
            if child_path != None:
                return (self,) + child_path
        return None

g = example(1)
g
```
%% Output
<__main__.Graph at 0x7f58d164b430>
%% Cell type:markdown id:59aee028 tags:
### Testcases for DFS with path
%% Cell type:code id:6de4c8b1 tags:
``` python
print(g.dfs_search("B", "A")) # should return None
print(g.dfs_search("B", "D")) # should return (B, C, D)
```
%% Output
None
(B, C, D)
%% Cell type:markdown id:43c566a5 tags:
### DFS search
- return the actual path rather than just returning True / False
- for example, path between B and D should be (B, C, D)
%% Cell type:markdown id:e7cb5fc1 tags:
### Why is it called "*Depth* First Search"?
- we start at the starting node and go as deep as possible because recursion always goes as deep as possible before coming back to the other children in the previous level
- we need a `Stack` data structure:
    - Last-In-First-Out (LIFO)
    - recursion naturally uses `Stack`, which is why we don't have to explicitly use a `Stack` data structure
- might not give us the shortest possible path
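
To show the stack that recursion normally hides, here is a minimal sketch of DFS with an explicit stack. It is not the lecture's implementation: `dfs_path` is a hypothetical helper, and the graph is assumed to be a plain `dict` mapping each node name to a list of child names rather than the `Graph` / `Node` classes.

```python
def dfs_path(graph, src, dst):
    """graph: dict of node name -> list of child names (assumed representation)."""
    stack = [(src, (src,))]       # each entry: (node, path from src to that node)
    visited = set()
    while stack:
        node, path = stack.pop()  # LIFO: the most recently pushed entry comes off first
        if node in visited:
            continue
        visited.add(node)
        if node == dst:
            return path
        # push children in reverse so the first child is explored first
        for child in reversed(graph.get(node, [])):
            stack.append((child, path + (child,)))
    return None

graph = {"A": ["B", "E"], "B": ["C"], "C": ["D"], "D": ["E"], "E": []}

print(dfs_path(graph, "A", "E"))  # ('A', 'B', 'C', 'D', 'E')
print(dfs_path(graph, "E", "A"))  # None
```

Even though A has a direct edge to E, the search dives through B first and returns the longer path, which illustrates why DFS might not give the shortest path.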
%% Cell type:code id:4480be3c tags:
``` python
g = example(2)
g
```
%% Output
<__main__.Graph at 0x7f58d16a22c0>
%% Cell type:code id:9467f6cf tags:
``` python
print(g.dfs_search("A", "E")) # should return (A, B, C, D, E)
print(g.dfs_search("E", "A")) # should return None
```
%% Output
(A, B, C, D, E)
None
%% Cell type:markdown id:a54b6599 tags:
### `tuple` review
- similar to lists, but immutable
- defined using `()`
- `*` operator represents replication and not multiplication for lists and tuples
- `+` operator represents concatenation and not addition for lists and tuples
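
A short sketch of the points above, using illustrative variable names (`path`, `longer`) that are not from the lecture code. It shows that `+` builds a new tuple, leaving the original untouched, and that item assignment on a tuple raises `TypeError`.

```python
path = ("B", "C")
longer = ("A",) + path    # concatenation creates a new tuple

print(longer)             # ('A', 'B', 'C')
print(path)               # ('B', 'C')  (the original is unchanged)

try:
    path[0] = "Z"         # tuples are immutable: item assignment is not allowed
except TypeError as err:
    print("TypeError:", err)
```

This is the same pattern the path-returning DFS relies on when it evaluates `(self,) + child_path`.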
%% Cell type:code id:7da7bccc tags:
``` python
(3+2,) # this is a tuple containing 5
```
%% Output
(5,)
%% Cell type:code id:b05a563a tags:
``` python
(3+2) # order precedence
```
%% Output
5
%% Cell type:code id:778a76d7 tags:
``` python
# replicates item 5 three times and returns a new tuple
(3+2,) * 3
```
%% Output
(5, 5, 5)
%% Cell type:code id:b8cc1b36 tags:
``` python
(3+2) * 3 # gives us 15
```
%% Output
15
%% Cell type:code id:a9e31d22 tags:
``` python
# returns a new tuple containing all items in the first tuple and
# the second tuple
(3, ) + (5, )
```
%% Output
(3, 5)