Commit b4c88179 authored by gsingh58

lec12 updated

parent d418d2ca
%% Cell type:markdown id:cf313adf tags:
# Web 2: Flask
%% Cell type:code id:d55e4bb4-9f29-4f4f-bba6-05054718259b tags:
``` python
import requests
import time
import urllib.robotparser
```
%% Cell type:markdown id:527600aa tags:
### Rate-limited webpage parsing
- `requests` module:
    - `resp = requests.get(<URL>)`: sends an HTTP GET request and returns the response
    - `resp.status_code`: status code of the response
    - `resp.text`: `str` text content of the response
    - `resp.headers`: `dict`-like mapping of the response headers
- `@` placed on the line before a function definition applies a "decorator" to that function
- `flask.Response`: enables us to create a response object instance
    - Arguments: `str` representing the response body, `headers` dict representing metadata, `status` representing the status code.
    - ex:
```python
flask.Response("<b>go away</b>",
               status=429,
               headers={"Retry-After": "3"})
```
```python
flask.Response("""User-Agent: *
Disallow: /never
""", headers={"Content-Type": "text/plain"})
```
- `flask.request.remote_addr`: enables us to take action based on the IP address from which we receive the request
- 429 Too Many Requests: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429
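
On the client side, the `robots.txt` rules served above can be checked with `urllib.robotparser` before a page is requested. The following is a minimal sketch, not the lecture's own code: it parses the same rules from a string instead of fetching them from the server, and `allowed` is a hypothetical helper name introduced only for illustration.

```python
import urllib.robotparser

# Same rules as the robots.txt response shown above.
robots_txt = """User-Agent: *
Disallow: /never
"""

base_url = "http://35.226.223.87:5000/"

rp = urllib.robotparser.RobotFileParser()
# Parse the rules from text; rp.set_url(...) followed by rp.read() would fetch them instead.
rp.parse(robots_txt.splitlines())

def allowed(path):
    """Hypothetical helper: may any user agent ("*") fetch base_url + path?"""
    return rp.can_fetch("*", base_url + path)

print(allowed("slow"))   # True: no rule matches /slow
print(allowed("never"))  # False: matches "Disallow: /never"
```

A polite crawler would call a check like this before each request and skip the disallowed paths.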
%% Cell type:code id:8241e51c tags:
``` python
base_url = "http://35.226.223.87:5000/"
```
%% Cell type:code id:6cc81b85 tags:
``` python
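def friendly_get(url):
    """GET url, sleeping and retrying whenever the server answers 429."""
    while True:
        resp = requests.get(url)
        if resp.status_code == 429:
            # the server asks us to slow down; wait as long as Retry-After says (default 1 second)
            seconds = int(resp.headers.get("Retry-After", 1))
            print(f"sleep {seconds}")
            time.sleep(seconds)
            continue
        resp.raise_for_status()  # raise HTTPError for any other 4xx/5xx response
        return resp

friendly_get(base_url + "slow").text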
```
%% Output
'welcome!'
%% Cell type:code id:0d951114-6eee-4c67-b781-547b387a8b10 tags:
``` python
```
......
%% Cell type:markdown id:cf313adf tags:
# Web 2: Flask
%% Cell type:code id:d55e4bb4-9f29-4f4f-bba6-05054718259b tags:
``` python
import requests
import time
```
%% Cell type:markdown id:527600aa tags:
### Rate-limited webpage parsing
- `requests` module:
    - `resp = requests.get(<URL>)`: sends an HTTP GET request and returns the response
    - `resp.status_code`: status code of the response
    - `resp.text`: `str` text content of the response
    - `resp.headers`: `dict`-like mapping of the response headers
- `@` placed on the line before a function definition applies a "decorator" to that function
- `flask.Response`: enables us to create a response object instance
    - Arguments: `str` representing the response body, `headers` dict representing metadata, `status` representing the status code.
    - ex:
```python
flask.Response("<b>go away</b>",
               status=429,
               headers={"Retry-After": "3"})
```
```python
flask.Response("""User-Agent: *
Disallow: /never
""", headers={"Content-Type": "text/plain"})
```
- `flask.request.remote_addr`: enables us to take action based on the IP address from which we receive the request
- 429 Too Many Requests: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429
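
The pieces listed above can be combined into a small rate-limited Flask app. This is a sketch under stated assumptions rather than the actual code behind the course server: the `last_visit` dict, the `MIN_INTERVAL` value of 3 seconds, and the handler names are all hypothetical, and the route `/slow` is borrowed from the client code in this notebook.

```python
import math
import time

import flask

app = flask.Flask(__name__)

MIN_INTERVAL = 3   # assumed: minimum seconds between visits from one IP address
last_visit = {}    # assumed bookkeeping: IP address -> time of last accepted visit

@app.route("/slow")
def slow():
    client = flask.request.remote_addr  # IP address the request came from
    now = time.time()
    previous = last_visit.get(client)
    if previous is not None and now - previous < MIN_INTERVAL:
        # Too soon: reject and tell the client how long to wait.
        wait = math.ceil(MIN_INTERVAL - (now - previous))
        return flask.Response("<b>go away</b>",
                              status=429,
                              headers={"Retry-After": str(wait)})
    last_visit[client] = now
    return "welcome!"

@app.route("/robots.txt")
def robots():
    # Plain-text rules asking crawlers to stay away from /never.
    return flask.Response("User-Agent: *\nDisallow: /never\n",
                          headers={"Content-Type": "text/plain"})
```

To try it locally, one option is to add `app.run(host="0.0.0.0", port=5000)` at the end of the file and run it with `python`; a second request to `/slow` within 3 seconds should then get the 429 response.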
%% Cell type:code id:8241e51c tags:
``` python
base_url = "http://35.226.223.87:5000/"
```
%% Cell type:code id:6cc81b85 tags:
``` python
def friendly_get(url):
    while True:
        resp = requests.get(url)
        resp.raise_for_status()  # raise HTTPError for a 4xx/5xx response
        return resp

friendly_get(base_url + "/slow").text
```