Lecture 10
Cookies
-
Don’t forget that we can import a function from CS50’s Python library called
eprint
, which helps us debug our program by printing the filename and line number from where it was called. -
There are many websites like Facebook where we log in when we first visit, and not have to log in every time we move around that website after. Somehow, the website must remember who we are. And they store information on the server, like in CSV files or local variables like we saw last time. The server can’t necessarily rely on the IP address, since one household or network might have one public IP address that many devices share.
-
A cookie (in this context) is a small piece of information unique to our login that a web server gives to our browser, that our browser can send back in the future, to verify who we are. We can think of it like a digital handstamp that we might get at an amusement park, so we can get in or out easily.
-
Our browser stores cookies for us and sends it back as part of an HTTP request in the headers. When we first log in, some of the headers in the reply from the server might look like this:
HTTP/1.1 200 OK Content-Type: text/html Set-Cookie: session=value ...
-
session
is the key for the cookie, like a variable name, andvalue
will be our unique handstamp, which might be a large random number that’s hard for others to guess.
-
-
Now our browser, in future requests, will include that cookie in the headers:
GET / HTTP/1.1 Host: www.facebook.com Cookie: session=value ...
-
Facebook’s servers stores the values of each account’s cookies, so it can know who we are from that cookie.
-
-
As individual users, we might want to delete our cookies. If someone has access to the cookies our browser is storing, they could pretend to be us by including them in requests.
-
HTTPS encrypts the messages between our browser and the receiving server, so someone along the network won’t be able to steal our cookies. A few years ago, a security researcher created a program that anyone can run, to capture all wireless traffic in the room, and show all the unencrypted cookies that were sent. That motivated most websites to use only HTTPS for sending information like this. And we ourselves might clear the cookies our browser has stored, so there’s no trace of which websites we’ve visited in the past.
-
Let’s take a look at
application.py
from ourstore
demo:from flask import Flask, redirect, render_template, request, session from flask_session import Session # Configure app app = Flask(__name__) # Configure sessions app.config["SESSION_PERMANENT"] = False app.config["SESSION_TYPE"] = "filesystem" Session(app) @app.route("/") def index(): return render_template("index.html") @app.route("/update", methods=["POST"]) def update(): for item in request.form: session[item] = int(request.form.get(item)) return redirect("/cart") @app.route("/cart") def cart(): return render_template("cart.html", cart=session)
-
If we had a web application like a store, where we wanted to add items to a virtual cart, we could store that information on our server, linked to each user’s cookie.
-
We create a Flask application called
app
, and set up one of its built-in features, Session, which allows us to set information about a user’s session. -
We set the default route to be
index.html
as usual. -
The
/cart
route leads to a template where we pass in thesession
variable. Andsession
is tracked automatically by our server’s Flask application. Incart.html
, we see how that variable is used:{% extends "layout.html" %} {% block body %} <h1>Cart</h1> {% for item in cart %} {{ item }} : {{ cart[item] }} <br/> {% endfor %} <p> <a href="/">Continue shopping</a>. </p> {% endblock %}
-
We include the standard HTML from
layout.html
, and we add a heading ofCart
to our page. Then, for eachitem
in the variablecart
(which wassession
when we passed it in), we print the key ofitem
and the value of it incart
.cart
is a dictionary, or hash table, in which we can store values for keys. Here we are iterating over all the keys, like"foo"
,"bar"
, and"baz"
, so this loop would translate that tocart["foo"]
,cart["bar"]
, andcart["baz"]
to print each one’s value.
-
-
-
If we visit our store, we see this page:
-
"Foo", "bar", and "baz" are generic names in Computer Science that we can use to refer to some objects that don’t need a specific name.
-
We have a form where we can set the quantity of each item that we want, and if we go our cart, we’ll see the quantities for each item as we input it.
-
-
To see how, we can open Chrome’s Developer Tools, and use the Network tab to see that our server is indeed responding with some values for setting a cookie with a
session
in the Response Headers: -
And if we go back to the relevant section of our server’s code, we can see how we update the quantities stored every time the form is submitted, with the
update
route:@app.route("/update", methods=["POST"]) def update(): for item in request.form: session[item] = int(request.form.get(item)) return redirect("/cart")
-
For each
item
in the form, we set the value in oursession
dictionary to theint
value (like casting in C) of the value sent through the form.
-
-
And if we looked in the directory where our
store
application is, we would see a directory calledflask_session
, in which there are files for eachsession
our server has created.
Databases
-
Many web applications, including the ones we’ve demonstrated in lectures, follow a certain paradigm called MVC, Model-View-Controller. We have been writing Python code that act as controllers, the logic that controls where the user goes and what our application actually does with their inputs, etc. The HTML templates, in which we placed our data to display to the user, are views. This separates the logic of our application from the aesthetics and formatting. And finally, today we will see how models, or data, are stored and accessed.
-
SQL, Structured Query Language, is a programming language that we use to talk to a database, a program that stores data and responds to requests for data, like a web server. And it has advanced features like searching and filtering data quickly.
-
Spreadsheet programs like Excel or Google Sheets allow us to store data in rows and columns, and this layout is shared with relational databases, which we will be using. We commonly use the columns in the top row for headers like "Name" and "Dorm", and each row after will be an entry:
-
But with spreadsheet programs, we run into limitations with how many rows we can store, and we can only search linearly.
-
-
The database program we will use is SQLite, which stores our data in binary format and provides advanced, efficient features.
-
We can use yet another program, called phpLiteAdmin, to manage our databases, or use a command-line program,
sqlite3
. -
And we can connect and talk to our database with Python code.
-
In SQL, we’ll use a few basic operations:
-
CREATE …
-
INSERT …
-
SELECT …
-
UPDATE …
-
DELETE …
-
…
-
-
With those verbs, we can write statements like these, and run them one at a time:
-
CREATE TABLE 'registrants' ('id' INTEGER PRIMARY KEY, 'name' TEXT, 'dorm' TEXT)
-
This is the command we’d use to create a table named
registrants
, and we specify the columns we’d like to have in our table. More interestingly, we specify the type of data that each column will be, so our database can optimize how it stores and accesses data. Theid
column is an integer we can use to uniquely identify each row, andname
anddorm
will be text fields.
-
-
INSERT INTO "registrants" ("id", "name", "dorm") VALUES(1, 'David', 'Matthews')
-
Now we specify the columns of data we want to provide for a new row, and the values for each column.
-
-
SELECT * FROM "registrants"
-
*
allows us to select all rows, since*
matches all values.
-
-
UPDATE "registrants" SET "name" = 'David Malan' where id = 1
-
We can also change values for rows, but ensure that we only set it for the row we want to change with
where id = 1
.
-
-
DELETE FROM "registrants" WHERE id = 1
-
And we can delete rows, too.
-
-
-
And our database, SQLite, can have other data types, too:
-
BLOB
stands for "binary large object", or binary data. -
INTEGER
-
NULL
is something we can use to specify that there is no value for a particular column, like in C. -
REAL
is a real number, or floating-point value. -
TEXT
-
DATETIME
stores dates and times. -
NUMERIC
stores numbers, whether they are integers or floating-point.
-
-
SQL has functions to help us figure out the time, like
date
,time
, anddatetime
. -
For phone numbers and zip codes, we might want to store the data as text, since leading zeroes are usually removed from numbers.
-
We also have a few options we can apply to storing data like names and addresses. We might want separate first name and last name columns, as well as street, city, and state, so we can search or filter on each of those pieces individually.
-
And if we have lots of people sharing the same address, we’ll have redundant information in our database. With just the zip code, for example, we should be able to figure out the city and state.
-
So if we were using a spreadsheet program, we can have another sheet, "cities", and store that data separately:
-
Now, our "customers" sheet has less redundancy:
-
And we can assign an ID to each row in our cities sheet, and only refer to each one in our customers sheet by the ID, in case any of the other values change.
-
Now, to find the actual city and state of someone, we need to take the added step of looking up their zip code in the cities sheet, but we gain the advantage of using less space and not having to update all the rows if something changes.
-
In today’s lecture code, we have a file called
lecture.db
. If we upload that to our IDE, and double-click it, it opens in a program called phpLiteAdmin: -
This shows us the content of that database, and allows us to visualize it. Now we’ll see all the tables, and all the rows in each of the tables. We can click the SQL tab, and run commands:
-
In this particular command, we are selecting everything in the
Album
table, but we are joining theArtist
table, based on theArtistId
stored in each row of theAlbum
table. So instead of seeing some ID for theArtistId
column, we’ll see all the fields stored in theArtist
table.
-
-
And we can have more complex relations. A many-to-many relationship might be between our
Playlist
table and ourSong
table, where a playlist has many songs, and a particular song might be on many playlists. ThePlaylistTrack
table represents this, by indicating thePlaylistId
andTrackID
for every song on a playlist: -
We can also assign other attributes to columns:
-
PRIMARY KEY
, where this column will be used to uniquely identify rows. -
UNIQUE
means that the field will be unique for every row, but not used to identify rows in joins. -
INDEX
means that we want the database to store the field in some index to speed up searches in the future, if we anticipate searching on that field frequently. -
NOT NULL
means that the field has to have some value, and can’t be blank. -
FOREIGN KEY
we’ll come back to again later, but means that it is referring to a row in some other table.
-
-
Let’s run a command to create an empty file,
touch customers.db
. Now if we double-click it, phpLiteAdmin will open with a blank database, and we can create a new table calledusers
: -
We can specify the columns in our table like so:
-
Autoincrement tells our database to automatically increase the
id
field for us, and we can also specify the default value for a column if we want it to not be null.
-
-
After we click the create button, phpLiteAdmin also tells us the SQL command of what it did:
-
We can do the same in the terminal by running the
sqlite3
command. We’ll runsqlite3 alternative.db
to use a different database file, and run the sameCREATE
command: -
By running
.schema
, a built-in SQLite command, we can see what our tables look like. We can do the same for ourlecture.db
file:-
We see our
Album
table, and each of the columns inside. AFOREIGN KEY
is a reference to a row in another table, and we can indicate how that reference is made.
-
-
Now that we have an idea of how databases work, let’s see how we can access them from our web applications.
-
Let’s write a simple program to test out accessing databases:
from cs50 import SQL db = SQL("sqlite:///lecture.db") # Query database for all albums rows = db.execute("SELECT * FROM Album") # For each album in database for row in rows: # Print title of album print(row["Title"])
-
By using the
SQL
class from the CS50 Library, we can easily connect to a local database file with the SQLite program. Then we can usedb.execute
to run commands on that database, that we pass in as strings. Androws
will be a list of dictionaries, with fields we can access. -
Now if we run this program, we’ll see all the album titles printed out. And we’ll also see a debugging line printed out by the CS50 Library, that shows what query was actually executed.
-
-
We can also pass variables into the command we want to execute, which will then be substituted:
rows = db.execute("SELECT * FROM Album WHERE Title = :t", t=sys.argv[1])
-
sys.argv[1]
is the command-line argument our program gets.
-
-
We pass in the variables individually so the CS50 Library can sanitize, or remove dangerous characters from the user’s input, for us. If we had passed in a formatted string like so:
rows = db.execute(f"SELECT * FROM Album WHERE Title = '{sys.argv[1]}'")
-
Someone could pass in code that would then be added into the query. They could run
haha ; DELETE FROM Album
and the semicolon would end theSELECT
command, and then run anotherDELETE
command.
-
-
The name for this class of issues is an SQL injection attack. For example, when we log in to some website we might fill out the username and password fields of some form, and the back-end might try to select from the database a row that has a matching username and password.
-
But if the server uses SQL, then what we pass in to the form might be directly included in the query. For example, suppose we pass in
me@examplemailprovider.com
as the username and' OR '1' = '1
as our password: -
SQL happens to have the keyword
OR
, and it looks like this password value will change the meaning of a SQL query, if it’s included directly with single quotes. -
Suppose the back-end code looks like this:
username = request.form.get("username") password = request.form.get("password") db.execute(f"SELECT * FROM users WHERE username = '{username}' AND password = '{password}'"
-
So if
username
andpassword
is substituted in, our query will actually end up being:db.execute("SELECT * FROM users WHERE username = 'me@examplemailprovider.com' AND password = '' OR '1' = '1'")
-
And that will select the row with
username = 'me@examplemailprovider.com'
even ifpassword
doesn’t match, since1
is always equal to1
.
-
-
If we sanitize our input by passing in variables to the CS50 Library separately, the query after the input is substituted will look like this, with dangerous characters escaped with backslashes:
db.execute("SELECT * FROM users WHERE username = 'me@examplemailprovider.com' AND password = '\' OR \'1\' = \'1'")
-
Let’s combine this into a web application, with lecture0 in the source directory:
from flask import Flask, render_template, request from cs50 import eprint, SQL app = Flask(__name__) db = SQL("sqlite:///lecture.db") @app.route("/") def index(): rows = db.execute("SELECT * FROM Album") return render_template("index.html", albums=rows)
-
We connect to our local database, and pass all the rows we get from the
Album
table into our template, which looks like:{% extends "layout.html" %} {% block body %} <ul> {% for album in albums %} <li>{{ album["Title"] }}</li> {% endfor %} </ul> {% endblock %}
-
Now if we run our webserver, we’ll see an unordered, bulleted list of our album titles.
-
-
We can change the query to be from the URL:
@app.route("/") def index(): query = request.args.get("q") eprint(query) rows = db.execute("SELECT * FROM Album WHERE Title = :q", q=query) return render_template("index.html", albums=rows)
-
Now, if we add
?q=Something
to the URL, we’ll see only the album titles that match that query.
-
-
With these building blocks, we’ll build a web application in Problem Set 7 that simulates buying and selling stocks, with real prices!