Problem: Find
tl;dr
Implement a program that finds a number among numbers, per the below.
$ ./generate 1000 | ./find 42
Didn't find needle in haystack.
Academic Honesty
This course’s philosophy on academic honesty is best stated as "be reasonable." The course recognizes that interactions with classmates and others can facilitate mastery of the course’s material. However, there remains a line between enlisting the help of another and submitting the work of another. This policy characterizes both sides of that line.
The essence of all work that you submit to this course must be your own. Collaboration on problems is not permitted (unless explicitly stated otherwise) except to the extent that you may ask classmates and others for help so long as that help does not reduce to another doing your work for you. Generally speaking, when asking for help, you may show your code or writing to others, but you may not view theirs, so long as you and they respect this policy’s other constraints. Collaboration on quizzes and tests is not permitted at all. Collaboration on the final project is permitted to the extent prescribed by its specification.
Below are rules of thumb that (inexhaustively) characterize acts that the course considers reasonable and not reasonable. If in doubt as to whether some act is reasonable, do not commit it until you solicit and receive approval in writing from your instructor. If a violation of this policy is suspected and confirmed, your instructor reserves the right to impose local sanctions on top of any disciplinary outcome that may include an unsatisfactory or failing grade for work submitted or for the course itself.
Reasonable
-
Communicating with classmates about problems in English (or some other spoken language).
-
Discussing the course’s material with others in order to understand it better.
-
Helping a classmate identify a bug in his or her code, such as by viewing, compiling, or running his or her code, even on your own computer.
-
Incorporating snippets of code that you find online or elsewhere into your own code, provided that those snippets are not themselves solutions to assigned problems and that you cite the snippets' origins.
-
Reviewing past years' quizzes, tests, and solutions thereto.
-
Sending or showing code that you’ve written to someone, possibly a classmate, so that he or she might help you identify and fix a bug.
-
Sharing snippets of your own solutions to problems online so that others might help you identify and fix a bug or other issue.
-
Turning to the web or elsewhere for instruction beyond the course’s own, for references, and for solutions to technical difficulties, but not for outright solutions to problems or your own final project.
-
Whiteboarding solutions to problems with others using diagrams or pseudocode but not actual code.
-
Working with (and even paying) a tutor to help you with the course, provided the tutor does not do your work for you.
Not Reasonable
-
Accessing a solution to some problem prior to (re-)submitting your own.
-
Asking a classmate to see his or her solution to a problem before (re-)submitting your own.
-
Decompiling, deobfuscating, or disassembling the staff’s solutions to problems.
-
Failing to cite (as with comments) the origins of code, writing, or techniques that you discover outside of the course’s own lessons and integrate into your own work, even while respecting this policy’s other constraints.
-
Giving or showing to a classmate a solution to a problem when it is he or she, and not you, who is struggling to solve it.
-
Looking at another individual’s work during a quiz or test.
-
Paying or offering to pay an individual for work that you may submit as (part of) your own.
-
Providing or making available solutions to problems to individuals who might take this course in the future.
-
Searching for, soliciting, or viewing a quiz’s questions or answers prior to taking the quiz.
-
Searching for or soliciting outright solutions to problems online or elsewhere.
-
Splitting a problem’s workload with another individual and combining your work (unless explicitly authorized by the problem itself).
-
Submitting (after possibly modifying) the work of another individual beyond allowed snippets.
-
Submitting the same or similar work to this course that you have submitted or will submit to another.
-
Using resources during a quiz beyond those explicitly allowed in the quiz’s instructions.
-
Viewing another’s solution to a problem and basing your own solution on it.
Assessment
Your work on this problem set will be evaluated along three axes primarily.
- Correctness
-
To what extent is your code consistent with our specifications and free of bugs?
- Design
-
To what extent is your code written well (i.e., clearly, efficiently, elegantly, and/or logically)?
- Style
-
To what extent is your code readable (i.e., commented and indented with variables aptly named)?
To obtain a passing grade in this course, all students must ordinarily submit all assigned problems unless granted an exception in writing by the instructor.
Getting Ready
Lots of recommended videos for you for this problem, although a few of them you’ve seen before and should feel free to skip over. Chances are at least a few of these will come in handy.
First up, sorting algorithms with Jackson and Tommy:
And then a discussion of linear and binary search with Patrick (don’t worry too much about when Patrick turns the discussion toward binary search trees in the second half of the binary search video… we’ll get there soon enough, though!):
Getting Started
Log into cs50.io and execute
update50
within a terminal window to make sure your workspace is up-to-date.
Then, execute
cd ~/workspace/unit3
at your prompt to ensure that you’re inside of unit3
(which is inside of workspace
which is inside of your home directory). Then execute
wget https://cdn.cs50.net/ap/2018/problems/find/find.zip
to download a ZIP of this problem’s distro into your workspace (with a command-line program called wget
). You should see a bunch of output followed by:
'find.zip' saved
Confirm that you’ve indeed downloaded find.zip
by executing
ls
and then run
unzip find.zip
to unzip the file. If you then run ls
again, you should see that you have a newly unzipped directory called find
as well. You can now delete the ZIP, with:
rm find.zip
confirming your intent to delete that file, then proceed to execute
cd find
followed by
ls
and you should see that the directory contains five files:
Makefile generate.c helpers.c helpers.h find.c
Understanding
Implemented in generate.c
is a program that uses a "pseudorandom-number generator" (via a function called drand48
) to generate a whole bunch of random (well, pseudorandom, since computers can’t actually generate truly random) numbers, one per line, each of which is in [0, LIMIT
), where LIMIT
is a constant defined within the file, so to speak. That is, each is greater than or equal to 0 and less than LIMIT
.
Go ahead and compile this program by executing the command below.
make generate
Now run the program you just compiled by executing the command below.
./generate
You should be informed of the program’s proper usage, per the below.
Usage: generate n [s]
As this output suggests, this program expects one or two command-line arguments. The first, n
, is required; it indicates how many pseudorandom numbers you’d like to generate. The second, s
, is optional, as square brackets imply; if supplied, it represents the value that the pseudorandom-number generator should use as its "seed." A seed is simply an input to a pseudorandom-number generator that influences its outputs. For instance, if you seed drand48
by first calling srand48
(another function whose purpose is to "seed" drand48
) with an argument of, say, 0
, and then call drand48
itself three times, drand48
might return 0.170828
, then 0.749902
, then 0.096372
. But if you instead seed drand48
by first calling srand48
with an argument of, say, 1
, and then call drand48
itself three times, drand48
might instead return 0.041630
, then 0.454492
, then 0.834817
. But if you re-seed drand48
by calling srand48
again with an argument of 0
, the next three times you call drand48
, you’ll again get 0.170828
, then 0.749902
, then 0.096372
! See, not so random.
Go ahead and run this program again, this time with a value of, say, 10
for n
, as in the below; you should see a list of 10 pseudorandom numbers.
./generate 10
Run the program a third time using that same value for n
; you should see a different list of 10 numbers. Now try running the program with a value for s
too (e.g., 0
), as in the below.
./generate 10 0
Now run that same command again:
./generate 10 0
Bet you saw the same "random" sequence of ten numbers again? Yup, that’s what happens if you don’t vary a pseudorandom number generator’s initial seed.
Now take a look at generate.c
itself. (Remember how?) Comments atop that file explain the program’s overall functionality. But it looks like we forgot to comment the code itself. Read over the code carefully until you understand each line and then comment our code for us, replacing each TODO
with a phrase that describes the purpose or functionality of the corresponding line(s) of code. (Know that an unsigned int
is just an int
that cannot be negative.) And for more details on drand48
and srand48
, recall that you can execute:
man drand48
and:
man srand48
Once done commenting generate.c
, re-compile the program to be sure you didn’t break anything by re-executing the command below.
make generate
If generate
no longer compiles properly, take a moment to fix what you broke!
Now, recall that make
automates compilation of your code so that you don’t have to execute clang
manually along with a whole bunch of switches. Notice, in fact, how make
just executed a pretty long command for you, per the tool’s output. However, as your programs grow in size, make
won’t be able to infer from context anymore how to compile your code; you’ll need to start telling make
how to compile your program, particularly when they involve multiple source (i.e., .c
) files. And so we’ll start relying on "Makefiles," configuration files that tell make
exactly what to do.
How did make
know how to compile generate in this case? It actually used a configuration file that we wrote. Go ahead and look at the file called Makefile
that’s in the same directory as generate.c
. This Makefile
is essentially a list of rules that we wrote for you that tells make how to build generate from generate.c
for you. The relevant lines appear below.
generate: generate.c
clang -ggdb3 -O0 -std=c11 -Wall -Werror -o generate generate.c
The first line tells make
that the "target" called generate
should be built by invoking the second line’s command. Know that the leading whitespace on that second line is not a sequence of spaces but, rather, a tab. Unfortunately, make
requires that commands be preceded by tabs, so be careful not to change them to spaces, else you may encounter strange errors! The -Werror
flag, recall, tells clang
to treat warnings (bad) as though they’re errors (worse) so that you’re forced (in a good, instructive way!) to fix them.
Now take a look at find.c
. Notice that this program expects a single command-line argument: a "needle" to search for in a "haystack" of values. Once done looking over the code, go ahead and compile the program by executing the command below.
make find
Notice, per that command’s output, that make
actually executed the below for you.
clang -ggdb3 -O0 -std=c11 -Wall -Werror -o find find.c helpers.c -lcs50 -lm
Notice further that you just compiled a program comprising not one but two .c
files: helpers.c
and find.c
. How did make
know what to do? Well, again, open up Makefile
to see the man behind the curtain. The relevant lines appear below.
find: find.c helpers.c helpers.h
clang -ggdb3 -O0 -std=c11 -Wall -Werror -o find find.c helpers.c -lcs50 -lm
Per the dependencies implied above (after the colon), any changes to find.c
, helpers.c
, or helpers.h
will compel make
to rebuild find the next time it’s invoked for this target.
Go ahead and run this program by executing, say, the below.
./find 13
You’ll be prompted to provide some hay (i.e., some integers), one "straw" at a time. As soon as you tire of providing integers, hit ctrl-d to send the program an EOF
(end-of-file) character. That character will compel get_int
from the CS50 Library to return INT_MAX
, a constant that, per find.c
, will compel find
to stop prompting for hay. The program will then look for that needle in the hay you provided, ultimately reporting whether the former was found in the latter. In short, this program searches an array for some value. At least, it should, but it won’t find anything yet! That’s where you come in. More on your role in a bit.
In turns out you can automate this process of providing hay, though, by "piping" the output of generate
into find
as input. For instance, the command below passes 1,000 pseudorandom numbers to find
, which then searches those values for 42
.
./generate 1000 | ./find 42
Note that, when piping output from generate
into find
in this manner, you won’t actually see generate
's numbers, but you will see find
's prompts.
Alternatively, you can "redirect" generate
's output to a file with a command like the below.
./generate 1000 > numbers.txt
You can then redirect that file’s contents as input to find
with the command below.
./find 42 < numbers.txt
Let’s finish looking at that Makefile
. Notice the line below.
all: find generate
This target implies that you can build both generate
and find
simply by executing the below.
make all
Even better, the below is equivalent (because make
builds a Makefile
's first target by default).
make
If only you could whittle this whole problem set down to a single command! Finally, notice these last lines in Makefile
:
clean:
rm -f *.o a.out core find generate
This target allows you to delete all files ending in .o
or called core
(more on that soon!), find
, or generate
simply by executing the command below.
make clean
Be careful not to add, say, *.c
to that last line in Makefile
! (Why?)
Notice now that, in find.c
, main
calls search
, a function declared in helpers.h
. Unfortunately, we forgot to implement that function fully in helpers.c
! Indeed, take a peek at helpers.c
, and you’ll see that search
always returns false
, whether or not value
is in values
. To be sure, we could have put the contents of helpers.h
and helpers.c
in find.c
itself. But it’s sometimes better to organize programs into multiple files, especially when some functions are essentially "utility functions" that might later prove useful to other programs as well, much like those in the CS50 Library.
Notice too, per helpers.h
, that the prototype for search
is:
bool search(int value, int values[], int n);
And the prototype for sort
is:
void sort(int values[], int n);
Both functions take an array, values
, as one of their arguments as well as an integer, n
, the size of that array. That’s because, when passing an array to a function, you have to pass in its size separately; you can’t infer an array’s size from the array itself.
Specification
Complete the implementation of find
by completing the implementation of search
and sort
in helpers.c
.
search
-
Your implementation must return
false
immediately ifn
is non-positive. -
Your implementation must return
true
ifvalue
is invalues
andfalse
ifvalue
is not invalues
. -
The running time of your implementation must be in O(log n).
-
You may not alter the function’s declaration. Its prototype must remain:
bool search(int value, int values[], int n);
sort
-
Your implemenation must sort, from smallest to largest, the array of numbers that it’s passed.
-
The running time of your implementation must be in O(n2), where n is the array’s size.
-
You may not alter the function’s declaration. Its prototype must remain:
void sort(int values[], int n);
Usage
Your program should behave per the examples below. Assumed that the underlined text is what some user has typed. (^d
represents the ctrl-d character described above)
$ ./find 42
50
43
^d
Didn't find needle in haystack.
$ ./find 42
50
42
^d
Found needle in haystack!
Testing
When ready to check the correctness of your program, try running the command below.
./generate 1000 50 | ./find 127
Because one of the numbers outputted by generate
, when seeded with 50
, is 127
, your code should find that "needle"! By contrast, try running the command below as well.
./generate 1000 50 | ./find 128
Because 128
is not among the numbers outputted by generate
, when seeded with 50
, your code shouldn’t find that needle. Best to try some other tests as well, as by running generate
with some seed, taking a look at its output, then piping that same output to find
, looking for a "needle" you know to be among the "hay".
Incidentally, note that main
in find.c
is written in such a way that find
returns 0
if the needle is found, else it returns 1
. You can check the so-called "exit code" with which main
returns by executing
echo $?
after running some other command. For instance, assuming your implementation of search
is correct, if you run
./generate 1000 50 | ./find 127
echo $?
you should see 0
, since 127
is, again, among the 1,000 numbers outputted by generate
when seeded with 50
, and so search
(written by you) should return true
, in which case main
(written by us) should return (i.e., exit with) 0
. By contrast, assuming your implementation of search
is correct, if you run
./generate 1000 50 | ./find 128
echo $?
you should see 1
, since 128
is, again, not among the 1,000 numbers outputted by generate
when seeded with 50
, and so search
(written by you) should return false
, in which case main
(written by us) should return (i.e., exit with) 1
. Make sense?
Testing
Correctness
If you’d like to check the correctness of your program with check50
, you may execute the below.
check50 cs50/2018/ap/find/less
Style
style50 helpers.c
Staff Solution
If you’d like to play with the staff’s own implementation of calc
, you may execute the below.
~cs50/unit3/find/less
Hints
Before you implement search
in O(log n) time, you might want to implement it temporarily in O(n) time, as with linear search, if only because it’s a bit easier to get right. That way, you can move on to sort
, knowing that search
already works. And once sort
works, you can go back and re-implement search
in O(log n) time, as with binary search. Just remember to!
Ultimately, you are welcome to implement search
iteratively (with a loop) or recursively (wherein a function calls itself). If you pursue the latter, though, know that you may not change our declaration of search
, but you may write a new, recursive function (that perhaps takes different parameters) that search
itself calls.
For sort
, odds are you’ll want to implement bubble sort, selection sort, or insertion sort! Just realize that there’s no one "right" way to implement any of those algorithms; variations abound. In fact, you’re welcome to improve upon them as you see fit, so long as your implementation remains in O(n2). Although you may not alter our declaration of sort
, you’re welcome to define your own function(s) in helpers.c
that sort
itself may then call.
We leave it to you to determine how best to test your implementation of search
and sort
. But don’t forget that eprintf
is your friend while debugging! And don’t forget that you can generate the same sequence of pseudorandom numbers again and again by explicitly specifying generate
's seed.
How to Submit
Step 1 of 2
Ensure that find.c
is in ~/workspace/unit3/find
, as with:
cd ~/workspace/unit3/find
ls
If find.c
is not in ~/workspace/unit3/find
, move it into that directory, as via mv
(or via CS50 IDE’s lefthand file browser).
Step 2 of 2
-
To submit
find
, executecd ~/workspace/unit3/find submit50 cs50/2018/ap/find/less
If you run into any trouble, email sysadmins@cs50.harvard.edu!
You may resubmit any problem as many times as you’d like before the deadline.
Your submission should be graded for correctness within 2 minutes, at which point your score will appear at cs50.me
This was Find.