# Problem: Sort Race

Questions? Feel free to head to CS50 on Reddit, CS50 on StackExchange, the `#cs50ap` channel on CS50x Slack (after signing up), or the CS50 Facebook group.

## tl;dr

Implement several O(n2) sorting algorithms so that they can be compared under various conditions.

``````$./race -r 30000 bubble sort benchmark: 11.820 seconds selection sort benchmark: 2.696 seconds insertion sort benchmark: 2.496 seconds`````` ## Academic Honesty This course’s philosophy on academic honesty is best stated as "be reasonable." The course recognizes that interactions with classmates and others can facilitate mastery of the course’s material. However, there remains a line between enlisting the help of another and submitting the work of another. This policy characterizes both sides of that line. The essence of all work that you submit to this course must be your own. Collaboration on problems is not permitted (unless explicitly stated otherwise) except to the extent that you may ask classmates and others for help so long as that help does not reduce to another doing your work for you. Generally speaking, when asking for help, you may show your code or writing to others, but you may not view theirs, so long as you and they respect this policy’s other constraints. Collaboration on quizzes and tests is not permitted at all. Collaboration on the final project is permitted to the extent prescribed by its specification. Below are rules of thumb that (inexhaustively) characterize acts that the course considers reasonable and not reasonable. If in doubt as to whether some act is reasonable, do not commit it until you solicit and receive approval in writing from your instructor. If a violation of this policy is suspected and confirmed, your instructor reserves the right to impose local sanctions on top of any disciplinary outcome that may include an unsatisfactory or failing grade for work submitted or for the course itself. ### Reasonable • Communicating with classmates about problems in English (or some other spoken language). • Discussing the course’s material with others in order to understand it better. • Helping a classmate identify a bug in his or her code, such as by viewing, compiling, or running his or her code, even on your own computer. • Incorporating snippets of code that you find online or elsewhere into your own code, provided that those snippets are not themselves solutions to assigned problems and that you cite the snippets' origins. • Reviewing past years' quizzes, tests, and solutions thereto. • Sending or showing code that you’ve written to someone, possibly a classmate, so that he or she might help you identify and fix a bug. • Sharing snippets of your own solutions to problems online so that others might help you identify and fix a bug or other issue. • Turning to the web or elsewhere for instruction beyond the course’s own, for references, and for solutions to technical difficulties, but not for outright solutions to problems or your own final project. • Whiteboarding solutions to problems with others using diagrams or pseudocode but not actual code. • Working with (and even paying) a tutor to help you with the course, provided the tutor does not do your work for you. ### Not Reasonable • Accessing a solution to some problem prior to (re-)submitting your own. • Asking a classmate to see his or her solution to a problem before (re-)submitting your own. • Decompiling, deobfuscating, or disassembling the staff’s solutions to problems. • Failing to cite (as with comments) the origins of code, writing, or techniques that you discover outside of the course’s own lessons and integrate into your own work, even while respecting this policy’s other constraints. • Giving or showing to a classmate a solution to a problem when it is he or she, and not you, who is struggling to solve it. • Looking at another individual’s work during a quiz or test. • Paying or offering to pay an individual for work that you may submit as (part of) your own. • Providing or making available solutions to problems to individuals who might take this course in the future. • Searching for, soliciting, or viewing a quiz’s questions or answers prior to taking the quiz. • Searching for or soliciting outright solutions to problems online or elsewhere. • Splitting a problem’s workload with another individual and combining your work (unless explicitly authorized by the problem itself). • Submitting (after possibly modifying) the work of another individual beyond allowed snippets. • Submitting the same or similar work to this course that you have submitted or will submit to another. • Using resources during a quiz beyond those explicitly allowed in the quiz’s instructions. • Viewing another’s solution to a problem and basing your own solution on it. ## Getting Ready First, refresh your memory on the O(n2) sorting algorithms we’ve learned about so far in the course. Here are Jackson and Tommy to explain: Before moving on, be sure you’re comfortable answering the following questions: • Why is bubble sort in O(n2)? • Why is insertion sort in Ω(n)? • How does selection sort work? ## Getting Started As always, first log into your CS50 IDE at cs50.io and execute ``update50`` within a terminal window to make sure your workspace is up-to-date. Next, execute ``cd ~/workspace/chapter3`` at your prompt to ensure that you’re inside of the `chapter3` directory within your `workspace` directory. Then execute ``wget http://docs.cs50.net/2017/ap/problems/race/race.zip`` to download a ZIP of this problem’s distro into your workspace. You should see a bunch of output followed by: ``'race.zip' saved`` Confirm that you’ve indeed downloaded `race.zip` by executing ``ls`` and then run ``unzip race.zip`` to unzip the file. If you then run `ls` again, you should see that you have a newly unzipped directory called `race` as well. You can now delete the ZIP, with: ``rm -f race.zip`` Careful! We’ve added the `-f` flag this time, so `rm` will not confirm that you want to delete the file. If you like the comfort of having the system double-check with you, just omit `-f` from your command. Lastly, execute ``cd race`` followed by ``ls`` and you should see that the directory contains four files: ``Makefile helpers.c helpers.h race.o`` Off we go! ## Object Orienting In this problem, you’ll be racing the three O(n2) sorting algorithms we’ve seen under a few different test conditions, to see how they perform against one another. Those test conditions will be: • arrays that are almost sorted, with two elements out of place, • arrays in reverse order, • arrays in completely random order, and • arrays that are already sorted. The good news is that you don’t have to implement anything involving populating the arrays! You only have to do a tiny amount of command-line validation and implement the three sorting algorithms themselves. But we’re getting a bit ahead of ourselves. First we need to deal with the contents of the directory you just unzipped. Have a peek at the `Makefile` we’ve prepared for you. In particular, focus on this portion. ``````HDRS = helpers.h SRCS = race.o helpers.c`````` Per the dependencies implied above, any changes to either `race.o`, `helpers.c`, or `helpers.h` will compel `make` to rebuild `race` the next time it’s invoked for this target. In other words, this means that `race` is not simply comprised of a single source file, but rather of three separate files. `helpers.c` and `helpers.h` you probably can figure out. But what the heck is `race.o`? A refresher on the compilation process might be in order here, first. Take it away, Rob: Interesting…​ so it turns out that `race.o` contains the object code that was generated from a source file (presumably called `race.c`) that we wrote and partially compiled, but then chose not to include in the distro. We made this choice on the one hand because the code therein is a bit complicated at this stage of the course (thus allowing us to abstract some of the complex detail), but further because it provides you with an opportunity to write code that simply must conform to a precise specification, or it won’t work! It turns out that `race.o` contains the object code for, among other things, `main`. And if you can’t change `main` then you can’t change the way that `main` calls any functions, including the functions you’ll be tasked with writing in this problem. Bummer! If this seems unfair, know that it’s also a really good indicator of what you’ll experience in the real world if you continue with programming as a career. Often times large groups of people collaborate on a single project, and there are standards and specifications that must be adhered to so that everyone’s components interoperate smoothly. Veering from those standards will mean your code is incompatible with the project at large, which will mean you will have wasted some of your valuable time (and probably irritated your colleagues, too)! Incidentally, because it is not a so-called "target" specified in the `Makefile`, if when working on this problem you (inadvertently or intentionally) try to ``make helpers`` you’ll actually default to using the standard `Makefile` included with CS50 IDE[1] which will just try to compile the `helpers.c` file alone into its own program. Problem is, if you open up `helpers.c`, there’s no `main` function, so you’ll probably get a whole bunch of cryptic error messages concluding with one along these lines: ``````/usr/bin/../lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu/crt1.o: In function `_start': (.text+0x20): undefined reference to `main'`````` So do be sure that whenever you try to compile this program, you do so with ``make race`` or, in fact, because `race` is the first target listed in `Makefile` and absent any other command-line arguments supplied to `make` it will simply default to compiling the first target listed in the `Makefile`, you can even say just ``make`` Convenient, eh? ## The Race Begins All of the work you’ll be doing in this problem will be confined to `helpers.c` and possibly `helpers.h`. In particular, you have to implement the four functions prototyped therein: `check_flag`, `bubble`, `selection`, and `insertion`. ### `check_flag` If you try to compile `race` from the distro and run it without any command line arguments, you’re immediately notified of the proper usage of the program. ``Usage: ./race array-type size`` And then, if you supply it with three command line arguments, regardless of what those arguments are, you’ll see the following: ``Invalid array-type. Must be -a, -b, -r, or -s`` Why? Because right now if you have a look at `check_flag` in `helpers.c`, you’ll see that it always returns `false`. But eventually what `check_flag` should do, per the comment atop its prototype, is return `true` if the argument passed in (which happens to be `argv[1]`) is `-a`, `-b`, `-r`, or `-s`, and return `false` if the argument passed in is anything other than that. Does the format of those strings remind you of anything you’ve seen recently? Recall the command we recommended you use to get rid of `race.zip` above: ``rm -f race.zip`` In this case, we would term `-f` a flag, which is just another way of describing a command-line argument to a particular program or Linux command (in this case, `rm`) that modifies the behavior of that program. In the case of `-f` and `rm`, that flag tells `rm` to not confirm with you whether you intend to delete the file(s) in question; it just deletes them right away. So all `check_flag` is doing is confirming whether `argv[1]` is one of those four things. If doesn’t report out which one it is, just that it’s one of them. Odds are there’s a function that might help with checking that. Incidentally, what do these four flags represent? They determine what type of array will be the test case for the various sorting algorithms: • `-a` for almost sorted arrays. These arrays are already sorted except for two elements which have been randomly switched. • `-b` for backwards arrays. These arrays are sorted, but in reverse order: left-to-right, largest-to-smallest. • `-r` for random arrays. These arrays have no particular order. • `-s` for sorted arrays. These arrays are already properly sorted in order from left-to-right, smallest-to-largest. Again, you needn’t worry about implementing the functionality of populating the arrays. That was done by us, and the object code resulting from that implementation lives in `race.o`. ### `bubble`, `selection`, and `insertion` In the functions `bubble`, `selection`, and `insertion` you will be implementing, respectively, bubble sort, selection sort, and insertion sort. Remember that you are not allowed to modify the prototypes of `bubble`, `selection`, or `insertion`, but you are welcome to create any additional "helper" functions that you wish, placing their prototypes in `helpers.h` and their definitions in `helpers.c`. We’re not going to give you much more than that! But do make sure to implement all three algorithms which, per the shorts atop this specification, behave quite differently even though all three have the same ultimate result. As a tip, you may want to start with fairly small `size` (aka `argv[2]`) arguments, and you may want to do some debugging to ensure that your sorting algorithms are actually sorting the array properly. ## Showcase Once you’ve implemented `check_flag`, if you try to run your program you’ll see that when it runs you get output like the following: ``````bubble sort benchmark: 0.000 seconds selection sort benchmark: 0.000 seconds insertion sort benchmark: 0.000 seconds`````` So this is where the "race" really happens. Time to see which of these algorithms is the fastest. But…​ wait? Aren’t they all O(n2) algorithms? Shouldn’t they all run at exactly the same speed? Well, not exactly. Theoretically, as n gets larger and larger, yes, these three algorithms will tend to run at closer and closer speeds. But theoretical runtime is not the same as real-world runtime, and so under average and varying test conditions, the performance of these three algorithms will differ, sometimes substantially. For example, see the below wherein underlined text represents user input to the program. ``````~/workspace/chapter3/race/$ ./race -b 2000
bubble sort benchmark:         0.056 seconds
selection sort benchmark:      0.012 seconds
insertion sort benchmark:      0.020 seconds``````

Not too much of a difference. But what about

``````~/workspace/chapter3/race/ $./race -r 100000 bubble sort benchmark: 129.968 seconds selection sort benchmark: 31.312 seconds insertion sort benchmark: 27.468 seconds`````` Ouch! Or lastly ``````~/workspace/chapter3/race/$ ./race -s 20000
bubble sort benchmark:         0.000 seconds
selection sort benchmark:      1.420 seconds
insertion sort benchmark:      0.000 seconds``````

Hmm…​ selection sort still took that much time to "sort" an already-sorted array? Is the difference between Ω(n2) and Ω(n) now a bit more clear?

And know that because of varying processor performance and system load, under otherwise-identical conditions from run-to-run the running time of these algorithms may vary somewhat.

`check50` is not capable of detecting whether you are implementing bubble, selection, or insertion sort correctly. It is only capable of determining whether your output is indeed sorted. Because the crux of this problem lies in implementing these sorts correctly, we leave it to you (and GDB!) to ensure that your three functions are implemented properly.

To run the staff solution, simply execute:

``~cs50/chapter3/race``

passing in appropriate command-line arguments.

In no part of this problem are you expected to optimize your runtimes for any of these algorithms (beyond, of course, implementing them correctly). Rather, after you get them implemented you should test different arrays of different sizes and different configurations to see under which circumstances each algorithm "shines". So you can see actual differences between these algorithms, we recommend that your `size` argument be at least `1000`, as that way they’ll tend to take at least a few thousandths of a second to sort.

## How to Submit

### Step 1 of 3

Execute `update50` again to ensure that your IDE is up-to-date.

### Step 2 of 3

• Recall that you were asked to implement the `sort race`.

• Be sure that the following files exist in `~/workspace/chapter3/race/`:

• `Makefile`

• `helpers.c`

• `helpers.h`

• `race.o` as with:

``````cd ~/workspace/chapter3/race/
ls``````

Since there is no way to automatically check for correctness here this project is graded on completion.

### Step 3 of 3

Submit `Sort Race`:

``````cd ~/workspace/chapter3/race/
submit50 cs50/2017/ap/race``````

If you run into any trouble, email sysadmins@cs50.harvard.edu!

You may resubmit any problem as many times as you’d like before the deadline.

Your submission should be graded for correctness within 2 minutes, at which point your score will appear at cs50.me!

This was Sort Race.

1. The existence of this file is why your previous programs compiled with no trouble despite not having a `Makefile` in the directory.