Week 4
Last Time
-
Last time we looked at numbers and how we might search them and sort them, with algorithms like:
-
linear search
-
binary search
-
bubble sort
-
selection sort
-
insertion sort
-
merge sort
-
-
We started using basic terms to describe running time (in units of steps taken), like:
-
n2
-
n log n
-
n
-
log n
-
1
-
…
-
-
The notation for running time includes:
-
O, worst-case running time
-
Ω, best-case running time
-
Θ, if both of those are the same
-
Strings
-
Now we’ll take a closer look at strings and how they are actually stored by the computer.
-
Let’s look at
compare0.c
:#include <cs50.h> #include <stdio.h> int main(void) { printf("s: "); string s = get_string(); printf("t: "); string t = get_string(); if (s == t) { printf("same\n"); } else { printf("different\n"); } }
-
It looks like this program takes two strings from the user and compares them.
-
But it doesn’t work, when we put in two strings that look the same.
-
-
Hm, mysterious. Let’s try to copy the string:
#include <cs50.h> #include <ctype.h> #include <stdio.h> #include <string.h> int main(void) { printf("s: "); string s = get_string(); if (s == NULL) { return 1; } string t = s; if (strlen(t) > 0) { t[0] = toupper(t[0]); } printf("s: %s\n", s); printf("t: %s\n", t); return 0; }
-
Now we’re getting a string
s
from the user, copying it to a string calledt
, and then making the first letter oft
uppercase. -
But when we run the program, it again doesn’t behave like we might expect. Both
s
andt
are capitalized!
-
-
Another example we can look at:
#include <stdio.h> void swap(int a, int b); int main(void) { int x = 1; int y = 2; printf("x is %i\n", x); printf("y is %i\n", y); printf("Swapping...\n"); swap(x, y); printf("Swapped.\n"); printf("x is %i\n", x); printf("y is %i\n", y); } void swap(int a, int b) { int tmp = a; a = b; b = tmp; }
-
We have a function called
swap
that’s supposed to take two values,a
andb
, and swaps them. It takesa
, puts the value into a temporary variable calledtmp
, and then stores the value ofb
intoa
. Then the value oftmp
, which is the originala
, is stored intob
. -
But when we run this program, too, it doesn’t swap the values of
x
andy
inmain
.
-
-
So we open our debugger, and step over each line of our program:
-
Stepping into the
swap
function, we see thata
andb
are indeed the right values. But when we get back tomain
,x
andy
are still the same.
Memory
-
It turns out that programs are given memory by the operating system, and areas of memory are set aside in a fairly standard way:
-
If we think about memory as a rectangle, a grid of bytes, each area (comprised of many many bytes) can be labeled as above.
-
At the top is a chunk called "text," and that’s actually where the machine code for your program is put in memory.
-
Below that is the data, or variables, your program is using.
-
-
Then we have something we call the stack. The "bottom" of our computer’s memory, or the area with high addresses, is used for functions. In fact, for our C programs, the very bottom of the stack contains a chunk of memory for our
main
function, with any local variables or arguments:-
Then on top, the next function called, such as
swap
, will have its own chunk of memory.
-
-
And we can realize that each block, or byte, is individually addressed and stores some value, which explains what we saw earlier:
-
swap
has its arguments passed in as copies.
-
-
And once
swap
returns, its part of the stack is marked as usable (since it’s returned), somain
still sees the samex
andy
. -
And when we were comparing
s
andt
earlier, we were actually comparing two memory addresses. When we callget_string()
, we’re actually storing the characters of the string somewhere else in memory (since we don’t know how big the string will be). For example, if we calledget_string
and the user typed inZamyla
, the characters might be stored in memory starting at address123
. (Recall that a string is just an array of characters, each one in a byte in a consecutive set of bytes.) So ours
will have the value123
. -
And when we call
get_string
again for another string,t
, whatever the user types in will be stored somewhere else in memory, regardless of its contents. Sot
might have the value234
if the second string was stored starting at byte234
. (And this address is "dynamically allocated" by a C library, since we don’t necessarily know ahead of time how big the string will be.) -
When we tried to capitalize just one string, too, we were just setting
t
to the address of the strings
was pointing to: -
In fact, we can think of both
s
andt
as "pointers" to values that we care about. So in the end, what we knew as astring
type was really just a pointer to a character (the start of a "string"). (And recall that we recognize the end of a string by the\0
character, so we don’t need to store the length or the ending address.) -
So how might we compare a string?
#include <cs50.h> #include <stdio.h> #include <string.h> int main(void) { printf("s: "); char *s = get_string(); printf("t: "); char *t = get_string(); if (s != NULL && t != NULL) { if (strcmp(s, t) == 0) { printf("same\n"); } else { printf("different\n"); } } }
-
Now that we know what
get_string
actually returns, we can set the type of our variables
tochar *
, or a pointer to a character. (And indeed the CS50 Library has just been mapping all mentions ofstring
tochar *
this whole time!) -
Turns out, there exists a library function called
strcmp
that compares strings, and returns0
if they’re the same. Andstrcmp
probably does that with a loop looking at thei
th character in each string, comparing them one at a time.
-
-
To make a copy of a string, we do something a little fancier:
#include <cs50.h> #include <ctype.h> #include <stdio.h> #include <string.h> int main(void) { printf("s: "); char *s = get_string(); if (s == NULL) { return 1; } char *t = malloc((strlen(s) + 1) * sizeof(char)); if (t == NULL) { return 1; } for (int i = 0, n = strlen(s); i <= n; i++) { t[i] = s[i]; } if (strlen(t) > 0) { t[0] = toupper(t[0]); } printf("s: %s\n", s); printf("t: %s\n", t); free(t); return 0; }
-
We get
s
as usual, but then fort
we use another C library function calledmalloc
, which allocates some memory for us to use. The amount of memory we ask for is the length ofs
(plus 1 for\0
to end the string), times the size of a single character. And ifmalloc
returnsNULL
fort
, that means something went wrong (perhaps we ran out of memory), so our program too needs to check for that and return an error if so. -
Now we can deliberately go through the entire string, and one past the end of the string, to copy the
\0
character. Then we’ll have a copy ofs
int
, and changing something int
will no longer changes
. -
Finally, at the end of our program, we should make the habit of calling
free
on our manually allocated memory, which marks it as usable again.
-