# Read-only Access
We've opened up the `autobadger` tool in an attempt to make things more visible to you and less of a "black box". We've done this by making the repository *read-only*, meaning you should be able to `git clone` the repo but not `git push` to it.
To start, navigate to a directory outside of any class project. I'd recommend cloning to the same directory as your projects.
```bash
git clone https://oauth2:glpat-CSTX_tgpf38eJHyUW213@git.doit.wisc.edu/cdis/cs/courses/cs544/s25/tools/autobadger.git
```
> **NOTE**: if you want to use this method throughout the semester, you'll need to `git pull` to get up-to-date code for each project.
Your folder structure should look something like
```
some-directory/
    autobadger/
    p1/
    p2/
    ... # other projects
```
# Making Changes
You can change the code inside of `autobadger`. The only files that will be of interest to you are inside the `projects/` directory (i.e. `projects/*.py`). Your changes will be for debugging, i.e. `print()` or `breakpoint()` statements.
#### Using `pip`
For whatever project you're working on, you will need to *apply* any changes you make using `pip`.
For example, assuming
- I'm working on `p2`
- in my `p2` directory
- and have my `venv` activated
I would do something like:
```bash
pip3 install ../autobadger/.
```
This would install and replace my local version of `autobadger`. Now when I run
```
autobadger --project=p2
```
I will see my changes in effect.
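Because `pip3 install` copies the package into the active environment, edits made in the clone do not take effect until the package is reinstalled. A minimal sketch for checking which copy Python would import, assuming the tool is importable under the module name `autobadger` (an assumption); `module_location` is a hypothetical helper, not part of the tool:

```python
import importlib.util


def module_location(name):
    """Return the file a top-level module would be imported from, or None if it is not installed."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Assumption: the tool is importable as "autobadger".
# Prints None if it is not installed in the active environment.
print(module_location("autobadger"))
```

If the printed path points into the `venv` rather than into the cloned `autobadger/` folder, Python is running the installed copy, which is why the `pip3 install` step has to be repeated after each edit.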
# Breakpoints
Since `breakpoint()` is less well known and less straightforward than `print()`, I will explain it here.
> **NOTE**: It is not required to use `breakpoint()`. You are also welcome to use `print()` instead. `breakpoint()` has a **steeper learning curve**, but may **help you iterate more quickly and save you time** once the basic concepts are well-understood.
### What is a breakpoint?
`breakpoint()` is a built-in function in Python and starts the **debugger** at the point where it is called. It allows developers to inspect variables, step through code, and debug interactively.
#### Simple Example:
```python
# Inside of /path/to/file.py
def calculate_sum(a, b):
    breakpoint()  # Debugger starts here
    return a + b

calculate_sum(3, 5)  # execute function
```
Adding a `breakpoint()` will pause execution, allowing you to inspect `a` and `b` before proceeding. I would see something like:
```
> /path/to/file.py(3)calculate_sum()
-> return a + b
```
in the terminal, which displays
1. the next line to be executed (`return a + b`)
2. the line number and the function name, if applicable (`(3)calculate_sum()`)
3. the current file (`/path/to/file.py`)
### Navigating the debugger
While the Python debugger is active, you can use several commands to navigate through your program and investigate.
- `Variable name`: I can type the name of any variable that is in scope to get its value.
  - Ex: Typing `address` in the walkthrough later in this document prints the *value* of `address`.
  - **NOTE**: if a variable name coincides with a debugger command, use `print(<variable_name>)` instead. In the previous example both names collide: `a` is short for `args` and `b` is short for `break`, so to print their values to the terminal I would need to do `print(a)` and `print(b)`.
- `Evaluation`: I can also evaluate expressions (e.g. add two numbers):
```
In [3]: calculate_sum(3, 5)
> <ipython-input-2-443b6e8e0b0a>(3)calculate_sum()
-> return a + b
(Pdb) print(a)
3
(Pdb) print(b)
5
(Pdb) print(a + b)
8
```
- `n`: Steps to the next line of my program
- `c`: Continues execution of the program until the next breakpoint, or until the program ends.
- `s`: Steps *into* a function or method call
- `exit`: kills the debugger and ends the program
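
The commands above can also be replayed from a script, which is a quick way to see what each one prints. A minimal sketch, assuming the standard-library `pdb.Pdb` class is fed its commands from a string instead of the keyboard; this is only an illustration and not how `autobadger` starts the debugger. The `total` variable is added here so that `n` has a line to step to, and `p` is the debugger's own print command.

```python
import io
import pdb


def calculate_sum(a, b):
    total = a + b
    return total


# Commands the debugger will read, one per line, as if typed at the (Pdb) prompt.
commands = io.StringIO("p a\np b\nn\np total\nc\n")
output = io.StringIO()

debugger = pdb.Pdb(stdin=commands, stdout=output, readrc=False)
# runcall() pauses on the first line of the function, then reads the commands above.
result = debugger.runcall(calculate_sum, 3, 5)

transcript = output.getvalue()
print(transcript)
print("result:", result)
```

The transcript shows the values of `a` and `b`, the stop after `n`, and the value of `total`; `c` then lets the function finish and return normally.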
# An example
### Using breakpoints
Suppose I want to investigate `Q4` for `p2`. I can add `breakpoint()` statements to the Q4 test method for the `ProjectTwoTest` class.
Navigating to `projects/p2.py` inside of `autobadger`, I find:
```python
@graded(Q=4, points=10)
def test_simple_http(self) -> int | TestError:
    address = self._test_cache_server("-cache-1")
    if isinstance(address, TestError):
        return address
    r = requests.get(f"{address}/lookup/53706")
    r.raise_for_status()
    result = r.json()
    if "addrs" not in result or "source" not in result:
        return TestError(
            message=f"Result body should be JSON with 'addrs' and 'source' fields, but got {result}.",
            earned=5,
        )
    return 10
```
> Note: This is Q4 since I have `Q=4` in the decorator.
**I can edit this method by adding *breakpoints*!**
```python
@graded(Q=4, points=10)
def test_simple_http(self) -> int | TestError:
    breakpoint()
    address = self._test_cache_server("-cache-1")
    if isinstance(address, TestError):
        return address
    r = requests.get(f"{address}/lookup/53706")
    breakpoint()
    r.raise_for_status()
    result = r.json()
    if "addrs" not in result or "source" not in result:
        return TestError(
            message=f"Result body should be JSON with 'addrs' and 'source' fields, but got {result}.",
            earned=5,
        )
    return 10
```
Now, after I update with `pip` as mentioned above, I can run `autobadger --project=p2` and get:
```
> /Users/.../p2.py(103)test_simple_http()
-> address = self._test_cache_server("-cache-1")
```
Note that in this situation, typing `address` would give me an error because it is **not yet defined**:
```
(Pdb) address
*** NameError: name 'address' is not defined
```
###### Using `n` (next line)
`address` is defined on the *next line*. So, I use the `n` command to step!
```
(Pdb) n
> /Users/.../p2.py(104)test_simple_http()
-> if isinstance(address, TestError):
(Pdb) address
'http://localhost:64879'
```
###### Using `s` (step into)
I could have also used `s` to *step into* `self._test_cache_server(...)` if I had wanted to investigate further:
```
> /Users/.../p2.py(103)test_simple_http()
-> address = self._test_cache_server("-cache-1")
(Pdb) s
--Call--
> /Users/.../p2.py(118)_test_cache_server()
-> def _test_cache_server(self, server_suffix: str) -> str | TestError:
# Now in a new method — _test_cache_server
(Pdb) n
> /Users/.../p2.py(119)_test_cache_server()
-> cache_server = [c for c in self.containers if c["Name"].endswith(server_suffix)]
```
###### Using `c` (continue)
I can also *continue* till the next breakpoint, which is quite convenient if you don't need to step over every line of code:
```
> /Users/.../p2.py(103)test_simple_http()
-> address = self._test_cache_server("-cache-1")
(Pdb) c
> /Users/.../p2.py(108)test_simple_http()
-> r.raise_for_status()
(Pdb) print(r.json())
{'addrs': [...], 'error': None, 'source': '...'}
```
Using `c` jumped from line `103` to line `108`, the lines immediately after my two `breakpoint()` calls.
> **NOTE**: using `c` again would continue the Python program till the end of its execution since I have no other `breakpoint()` statements
FROM ubuntu:24.04
RUN apt-get update && apt-get install -y python3 python3-pip curl iproute2
COPY requirements.txt /tmp/requirements.txt
RUN pip3 install -r /tmp/requirements.txt --break-system-packages
CMD ["python3", "-m", "jupyterlab", "--no-browser", "--ip=0.0.0.0", "--port=600", "--allow-root", "--NotebookApp.token=''"]
anyio==4.4.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asttokens==2.4.1
async-lru==2.0.4
attrs==24.2.0
babel==2.16.0
beautifulsoup4==4.12.3
bleach==6.1.0
certifi==2024.8.30
cffi==1.17.1
charset-normalizer==3.3.2
comm==0.2.2
debugpy==1.8.5
decorator==5.1.1
defusedxml==0.7.1
executing==2.1.0
fastjsonschema==2.20.0
fqdn==1.5.1
h11==0.14.0
httpcore==1.0.5
httpx==0.27.2
idna==3.10
ipykernel==6.29.5
ipython==8.27.0
isoduration==20.11.0
jedi==0.19.1
Jinja2==3.1.4
json5==0.9.25
jsonpointer==3.0.0
jsonschema==4.23.0
jsonschema-specifications==2023.12.1
jupyter-events==0.10.0
jupyter-lsp==2.2.5
jupyter_client==8.6.3
jupyter_core==5.7.2
jupyter_server==2.14.2
jupyter_server_terminals==0.5.3
jupyterlab==4.2.5
jupyterlab_pygments==0.3.0
jupyterlab_server==2.27.3
MarkupSafe==2.1.5
matplotlib-inline==0.1.7
mistune==3.0.2
nbclient==0.10.0
nbconvert==7.16.4
nbformat==5.10.4
nest-asyncio==1.6.0
notebook_shim==0.2.4
overrides==7.7.0
packaging==24.1
pandocfilters==1.5.1
parso==0.8.4
pexpect==4.9.0
platformdirs==4.3.3
prometheus_client==0.20.0
prompt_toolkit==3.0.47
psutil==6.0.0
ptyprocess==0.7.0
pure_eval==0.2.3
pycparser==2.22
Pygments==2.18.0
python-dateutil==2.9.0.post0
python-json-logger==2.0.7
PyYAML==6.0.2
pyzmq==26.2.0
referencing==0.35.1
requests==2.32.3
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rpds-py==0.20.0
Send2Trash==1.8.3
setuptools==68.1.2
six==1.16.0
sniffio==1.3.1
soupsieve==2.6
stack-data==0.6.3
terminado==0.18.1
tinycss2==1.3.0
tornado==6.4.1
traitlets==5.14.3
types-python-dateutil==2.9.0.20240906
uri-template==1.3.0
urllib3==2.2.3
wcwidth==0.2.13
webcolors==24.8.0
webencodings==0.5.1
websocket-client==1.8.0
wheel==0.42.0
FROM ubuntu:24.04
RUN apt-get update && apt-get install -y python3 python3-pip iproute2
COPY requirements.txt /requirements.txt
RUN pip3 install -r /requirements.txt --break-system-packages
COPY *.py /
CMD ["python3", "/server.py"]
import sys
import grpc
import count_pb2, count_pb2_grpc
channel = grpc.insecure_channel("127.0.0.1:" + sys.argv[1])
stub = count_pb2_grpc.CounterStub(channel)
print(stub.Count(count_pb2.Req()))
syntax = "proto3";
message Req{}
message Resp{
  int32 total = 1;
}
service Counter {
  rpc Count(Req) returns (Resp);
}
# -*- coding: utf-8 -*-
# Generated by the protocol buffer compiler. DO NOT EDIT!
# NO CHECKED-IN PROTOBUF GENCODE
# source: count.proto
# Protobuf Python Version: 5.27.2
"""Generated protocol buffer code."""
from google.protobuf import descriptor as _descriptor
from google.protobuf import descriptor_pool as _descriptor_pool
from google.protobuf import runtime_version as _runtime_version
from google.protobuf import symbol_database as _symbol_database
from google.protobuf.internal import builder as _builder
_runtime_version.ValidateProtobufRuntimeVersion(
    _runtime_version.Domain.PUBLIC,
    5,
    27,
    2,
    '',
    'count.proto'
)
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()
DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x0b\x63ount.proto\"\x05\n\x03Req\"\x15\n\x04Resp\x12\r\n\x05total\x18\x01 \x01(\x05\x32\x1f\n\x07\x43ounter\x12\x14\n\x05\x43ount\x12\x04.Req\x1a\x05.Respb\x06proto3')
_globals = globals()
_builder.BuildMessageAndEnumDescriptors(DESCRIPTOR, _globals)
_builder.BuildTopDescriptorsAndMessages(DESCRIPTOR, 'count_pb2', _globals)
if not _descriptor._USE_C_DESCRIPTORS:
  DESCRIPTOR._loaded_options = None
  _globals['_REQ']._serialized_start=15
  _globals['_REQ']._serialized_end=20
  _globals['_RESP']._serialized_start=22
  _globals['_RESP']._serialized_end=43
  _globals['_COUNTER']._serialized_start=45
  _globals['_COUNTER']._serialized_end=76
# @@protoc_insertion_point(module_scope)
# Generated by the gRPC Python protocol compiler plugin. DO NOT EDIT!
"""Client and server classes corresponding to protobuf-defined services."""
import grpc
import warnings
import count_pb2 as count__pb2
GRPC_GENERATED_VERSION = '1.66.1'
GRPC_VERSION = grpc.__version__
_version_not_supported = False
try:
    from grpc._utilities import first_version_is_lower
    _version_not_supported = first_version_is_lower(GRPC_VERSION, GRPC_GENERATED_VERSION)
except ImportError:
    _version_not_supported = True

if _version_not_supported:
    raise RuntimeError(
        f'The grpc package installed is at version {GRPC_VERSION},'
        + f' but the generated code in count_pb2_grpc.py depends on'
        + f' grpcio>={GRPC_GENERATED_VERSION}.'
        + f' Please upgrade your grpc module to grpcio>={GRPC_GENERATED_VERSION}'
        + f' or downgrade your generated code using grpcio-tools<={GRPC_VERSION}.'
    )


class CounterStub(object):
    """Missing associated documentation comment in .proto file."""

    def __init__(self, channel):
        """Constructor.

        Args:
            channel: A grpc.Channel.
        """
        self.Count = channel.unary_unary(
                '/Counter/Count',
                request_serializer=count__pb2.Req.SerializeToString,
                response_deserializer=count__pb2.Resp.FromString,
                _registered_method=True)


class CounterServicer(object):
    """Missing associated documentation comment in .proto file."""

    def Count(self, request, context):
        """Missing associated documentation comment in .proto file."""
        context.set_code(grpc.StatusCode.UNIMPLEMENTED)
        context.set_details('Method not implemented!')
        raise NotImplementedError('Method not implemented!')


def add_CounterServicer_to_server(servicer, server):
    rpc_method_handlers = {
            'Count': grpc.unary_unary_rpc_method_handler(
                    servicer.Count,
                    request_deserializer=count__pb2.Req.FromString,
                    response_serializer=count__pb2.Resp.SerializeToString,
            ),
    }
    generic_handler = grpc.method_handlers_generic_handler(
            'Counter', rpc_method_handlers)
    server.add_generic_rpc_handlers((generic_handler,))
    server.add_registered_method_handlers('Counter', rpc_method_handlers)


# This class is part of an EXPERIMENTAL API.
class Counter(object):
    """Missing associated documentation comment in .proto file."""

    @staticmethod
    def Count(request,
            target,
            options=(),
            channel_credentials=None,
            call_credentials=None,
            insecure=False,
            compression=None,
            wait_for_ready=None,
            timeout=None,
            metadata=None):
        return grpc.experimental.unary_unary(
            request,
            target,
            '/Counter/Count',
            count__pb2.Req.SerializeToString,
            count__pb2.Resp.FromString,
            options,
            channel_credentials,
            insecure,
            call_credentials,
            compression,
            wait_for_ready,
            timeout,
            metadata,
            _registered_method=True)
grpcio==1.70.0
grpcio-tools==1.70.0
protobuf==5.29.3
setuptools==75.8.0
import grpc
import count_pb2, count_pb2_grpc
from concurrent import futures
total = 0

class MyCounter(count_pb2_grpc.CounterServicer):
    def Count(self, request, context):
        global total
        total += 1
        return count_pb2.Resp(total=total)

server = grpc.server(futures.ThreadPoolExecutor(max_workers=1), options=[("grpc.so_reuseport", 0)])
count_pb2_grpc.add_CounterServicer_to_server(MyCounter(), server)
server.add_insecure_port("0.0.0.0:5440")
server.start()
print("started")
server.wait_for_termination()
%% Cell type:code id:34b260b6-a504-4cce-a9b3-a3566a8dbeb5 tags:
``` python
import random
import requests
import pandas as pd
r = requests.get("https://pages.cs.wisc.edu/~harter/cs544/data/wi-stations/stations.txt")
r.raise_for_status()
stations = r.text.strip().split("\n")
stations = random.sample(stations, k=10)
workload = random.choices(stations, k=100, weights=[0.3, 0.2] + [0.5/8]*8)
```
%% Cell type:code id:c65f907c-049f-4110-a5ca-75772e76fdcf tags:
``` python
import numpy as np
np.quantile([1,2,4,5], 0.5)
```
%% Output
np.float64(3.0)
%% Cell type:code id:e913b0fa-138c-4fb1-99df-738e79b7b5ad tags:
``` python
" ".join(workload)
```
%% Output
'US1WIIW0014 US1WIPC0020 USC00478805 US1WIBR0019 US1WIPC0020 US1WIBR0019 USC00474391 US1WIPC0020 USC00470062 USC00478805 US1WIBR0019 US1WIBR0019 US1WIIW0014 USC00478329 US1WIIW0014 US1WIBR0019 USC00478329 USC00470062 US1WIBR0019 USC00474391 US1WIMM0001 US1WIIW0014 US1WIBR0019 US1WIBR0019 US1WIMM0001 US1WIIW0014 US1WIBR0019 US1WIBR0019 USC00478805 US1WIIW0014 US1WIBR0019 US1WIIW0014 US1WIDG0011 US1WIIW0014 US1WIIW0014 US1WIDG0011 US1WIDG0011 USC00478805 US1WIIW0014 USC00478329 US1WIDG0011 US1WIBR0019 US1WIIW0014 US1WIIW0014 US1WIDG0011 USC00478805 USC00474391 US1WIBR0019 US1WIMM0001 USC00478329 US1WIBR0019 US1WIIW0014 US1WIPC0020 US1WIIW0014 US1WIBR0019 US1WIBR0019 USC00474391 US1WIBR0019 US1WIDG0011 US1WIVL0014 US1WIIW0014 US1WIIW0014 USC00478805 USC00470062 US1WIIW0014 USC00474391 US1WIBR0019 US1WIVL0014 US1WIIW0014 US1WIBR0019 USC00470062 US1WIBR0019 US1WIBR0019 US1WIPC0020 USC00478805 US1WIPC0020 USC00470062 USC00478329 US1WIPC0020 US1WIMM0001 USC00478329 USC00478329 USC00474391 US1WIBR0019 US1WIBR0019 USC00478805 USC00470062 US1WIIW0014 USC00470062 US1WIMM0001 US1WIBR0019 USC00474391 US1WIIW0014 US1WIPC0020 US1WIIW0014 US1WIIW0014 US1WIBR0019 US1WIBR0019 US1WIIW0014 US1WIPC0020'
%% Cell type:code id:3a66f04d-2361-4ca9-8556-d781d9e5b96c tags:
``` python
import time
time.time() # seconds since Jan 1, 1970
```
%% Output
1739200561.2493458
%% Cell type:code id:ed3e0c8a-6831-411c-b26c-2d14555127f0 tags:
``` python
start = time.time()
time.sleep(2)
end = time.time()
(end-start) * 1000 # milliseconds
```
%% Output
2000.2596378326416
%% Cell type:code id:508955f2-0071-4683-aae8-a8ddf8601570 tags:
``` python
# Example 1: FIFO Policy
cache_size = 3
cache = {} # key=station name, value=DataFrame with weather data for that station
evict_order = [] # evict from the left, try to keep whatever is on the right
# stats
hits = [] # 1 is a hit, 0 is a miss
latency_ms = [] # latency of get_station in milliseconds
def get_station(station):
    start = time.time()
    if station in cache:
        hits.append(1)
        print("hit", end=" ")
        df = cache[station]
    else:
        hits.append(0)
        print("miss", end=" ")
        df = pd.read_csv(f"https://pages.cs.wisc.edu/~harter/cs544/data/wi-stations/{station}.csv.gz",
                         names=["station", "date", "element", "value", "m", "q", "s", "obs"], low_memory=False)
        cache[station] = df
        evict_order.append(station)
        if len(cache) > cache_size:
            #print("evict!")
            victim = evict_order.pop(0)
            cache.pop(victim)
    #print("CACHE:", evict_order)
    end = time.time()
    latency_ms.append((end-start)*1000)
    return df

for station in workload:
    get_station(station)
```
%% Output
miss miss miss miss hit hit miss miss miss miss miss hit miss miss hit hit hit miss miss miss miss miss miss hit hit hit hit hit miss hit hit hit miss miss hit hit hit hit hit miss hit miss hit hit miss miss miss miss miss miss hit miss miss hit miss hit miss hit miss miss miss hit miss miss hit miss miss miss miss hit miss miss hit miss miss hit miss miss miss miss hit hit miss miss hit miss miss miss hit miss miss miss miss miss hit hit miss hit hit hit
%% Cell type:code id:01d83645-241b-478c-9142-dde57d778b63 tags:
``` python
print()
print("Hits:", sum(hits))
print("Hit Rate:", sum(hits) / len(hits))
print("Avg Latency:", sum(latency_ms) / len(latency_ms))
print("Median Latency:", np.quantile(latency_ms, 0.5))
print("p99 Latency:", np.quantile(latency_ms, 0.99))
```
%% Output
Hits: 40
Hit Rate: 0.4
Avg Latency: 16.57418727874756
Median Latency: 15.306830406188965
p99 Latency: 93.56653213500977
%% Cell type:code id:2f4e837c-da54-48f3-bd04-0465064c2df6 tags:
``` python
# Example 2: LRU Policy
cache_size = 3
cache = {} # key=station name, value=DataFrame with weather data for that station
evict_order = [] # evict from the left, try to keep whatever is on the right
# stats
hits = [] # 1 is a hit, 0 is a miss
latency_ms = [] # latency of get_station in milliseconds
def get_station(station):
    start = time.time()
    if station in cache:
        hits.append(1)
        print("hit", end=" ")
        df = cache[station]
        evict_order.remove(station)
        evict_order.append(station)
    else:
        hits.append(0)
        print("miss", end=" ")
        df = pd.read_csv(f"https://pages.cs.wisc.edu/~harter/cs544/data/wi-stations/{station}.csv.gz",
                         names=["station", "date", "element", "value", "m", "q", "s", "obs"], low_memory=False)
        cache[station] = df
        evict_order.append(station)
        if len(cache) > cache_size:
            #print("evict!")
            victim = evict_order.pop(0)
            cache.pop(victim)
    #print("CACHE:", evict_order)
    end = time.time()
    latency_ms.append((end-start)*1000)
    return df

for station in workload:
    get_station(station)
```
%% Output
miss miss miss miss hit hit miss hit miss miss miss hit miss miss hit hit hit miss hit miss miss miss miss hit hit hit hit hit miss hit hit hit miss hit hit hit hit miss hit miss miss miss miss hit hit miss miss miss miss miss hit miss miss hit hit hit miss hit miss miss miss hit miss miss hit miss miss miss miss hit miss hit hit miss miss hit miss miss hit miss hit hit miss miss hit miss miss miss hit miss miss miss miss miss hit hit miss hit hit hit
%% Cell type:code id:70dfb612-f83b-4d78-899d-afea29ea24de tags:
``` python
print()
print("Hits:", sum(hits))
print("Hit Rate:", sum(hits) / len(hits))
print("Avg Latency:", sum(latency_ms) / len(latency_ms))
print("Median Latency:", np.quantile(latency_ms, 0.5))
print("p99 Latency:", np.quantile(latency_ms, 0.99))
```
%% Output
Hits: 44
Hit Rate: 0.44
Avg Latency: 16.295619010925293
Median Latency: 15.197038650512695
p99 Latency: 95.38349866867067
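The only difference between the two examples above is what happens on a hit: LRU moves the station to the right end of `evict_order`, while FIFO leaves the order alone. A minimal sketch that isolates this difference without any downloads; `count_hits` and the toy workload are made up for illustration and are not part of the notebook:

```python
def count_hits(workload, cache_size, policy):
    """Replay a workload and count cache hits under a "fifo" or "lru" eviction policy."""
    cache = set()
    evict_order = []  # evict from the left, try to keep whatever is on the right
    hits = 0
    for key in workload:
        if key in cache:
            hits += 1
            if policy == "lru":
                # A hit makes the key the most recently used entry.
                evict_order.remove(key)
                evict_order.append(key)
        else:
            cache.add(key)
            evict_order.append(key)
            if len(cache) > cache_size:
                victim = evict_order.pop(0)
                cache.remove(victim)
    return hits


toy_workload = ["A", "B", "A", "C", "A", "D", "A"]
fifo_hits = count_hits(toy_workload, cache_size=2, policy="fifo")
lru_hits = count_hits(toy_workload, cache_size=2, policy="lru")
print("FIFO hits:", fifo_hits)  # 2
print("LRU hits:", lru_hits)    # 3
```

In this toy workload `A` is requested repeatedly, so LRU keeps it cached while FIFO evicts it once it becomes the oldest entry. Other workloads can favor FIFO, so it is worth trying a few orderings.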
%% Cell type:code id:1afdfed8-d89b-4bc1-a0a7-688d751bc9c6 tags:
``` python
```
FROM ubuntu:24.04
RUN apt-get update && apt-get install -y python3 python3-pip curl iproute2 wget unzip
COPY requirements.txt /tmp/requirements.txt
RUN pip3 install -r /tmp/requirements.txt --break-system-packages
RUN wget https://pages.cs.wisc.edu/~harter/cs544/data/hdma-wi-2021.zip && unzip hdma-wi-2021.zip
CMD ["python3", "-m", "jupyterlab", "--no-browser", "--ip=0.0.0.0", "--port=300", "--allow-root", "--NotebookApp.token=''"]