Problem: Bleep

tl;dr

Implement a program that censors messages that contain words that appear on a list of supplied "banned words."

$ python bleep.py banned.txt
What message would you like to censor?
What the heck
What the ****
$ python bleep.py banned.txt
What message would you like to censor?
gosh darn it
**** **** it

Academic Honesty

This course’s philosophy on academic honesty is best stated as "be reasonable." The course recognizes that interactions with classmates and others can facilitate mastery of the course’s material. However, there remains a line between enlisting the help of another and submitting the work of another. This policy characterizes both sides of that line.

The essence of all work that you submit to this course must be your own. Collaboration on problems is not permitted (unless explicitly stated otherwise) except to the extent that you may ask classmates and others for help so long as that help does not reduce to another doing your work for you. Generally speaking, when asking for help, you may show your code or writing to others, but you may not view theirs, so long as you and they respect this policy’s other constraints. Collaboration on quizzes and tests is not permitted at all. Collaboration on the final project is permitted to the extent prescribed by its specification.

Below are rules of thumb that (inexhaustively) characterize acts that the course considers reasonable and not reasonable. If in doubt as to whether some act is reasonable, do not commit it until you solicit and receive approval in writing from your instructor. If a violation of this policy is suspected and confirmed, your instructor reserves the right to impose local sanctions on top of any disciplinary outcome that may include an unsatisfactory or failing grade for work submitted or for the course itself.

Reasonable

  • Communicating with classmates about problems in English (or some other spoken language).

  • Discussing the course’s material with others in order to understand it better.

  • Helping a classmate identify a bug in his or her code, such as by viewing, compiling, or running his or her code, even on your own computer.

  • Incorporating snippets of code that you find online or elsewhere into your own code, provided that those snippets are not themselves solutions to assigned problems and that you cite the snippets' origins.

  • Reviewing past years' quizzes, tests, and solutions thereto.

  • Sending or showing code that you’ve written to someone, possibly a classmate, so that he or she might help you identify and fix a bug.

  • Sharing snippets of your own solutions to problems online so that others might help you identify and fix a bug or other issue.

  • Turning to the web or elsewhere for instruction beyond the course’s own, for references, and for solutions to technical difficulties, but not for outright solutions to problems or your own final project.

  • Whiteboarding solutions to problems with others using diagrams or pseudocode but not actual code.

  • Working with (and even paying) a tutor to help you with the course, provided the tutor does not do your work for you.

Not Reasonable

  • Accessing a solution to some problem prior to (re-)submitting your own.

  • Asking a classmate to see his or her solution to a problem before (re-)submitting your own.

  • Decompiling, deobfuscating, or disassembling the staff’s solutions to problems.

  • Failing to cite (as with comments) the origins of code, writing, or techniques that you discover outside of the course’s own lessons and integrate into your own work, even while respecting this policy’s other constraints.

  • Giving or showing to a classmate a solution to a problem when it is he or she, and not you, who is struggling to solve it.

  • Looking at another individual’s work during a quiz or test.

  • Paying or offering to pay an individual for work that you may submit as (part of) your own.

  • Providing or making available solutions to problems to individuals who might take this course in the future.

  • Searching for, soliciting, or viewing a quiz’s questions or answers prior to taking the quiz.

  • Searching for or soliciting outright solutions to problems online or elsewhere.

  • Splitting a problem’s workload with another individual and combining your work (unless explicitly authorized by the problem itself).

  • Submitting (after possibly modifying) the work of another individual beyond allowed snippets.

  • Submitting the same or similar work to this course that you have submitted or will submit to another.

  • Using resources during a quiz beyond those explicitly allowed in the quiz’s instructions.

  • Viewing another’s solution to a problem and basing your own solution on it.

Getting Started

Here’s how to download this problem’s "distribution code" (i.e., starter code) into your own CS50 IDE. Log into CS50 IDE and then, in a terminal window, execute each of the below.

  1. Execute cd to ensure that you’re in ~/ (i.e., your home directory).

  2. Execute mkdir chapter6 to make (i.e., create) a directory called chapter6 in your home directory, if you haven’t already done so.

  3. Execute cd chapter6 to change into (i.e., open) that directory.

  4. Execute wget http://cdn.cs50.net/ap/2019/problems/bleep/bleep.zip to download a (compressed) ZIP file with this problem’s distribution.

  5. Execute unzip bleep.zip to uncompress that file.

  6. Execute rm bleep.zip followed by yes or y to delete that ZIP file.

  7. Execute ls. You should see a directory called bleep, which was inside of that ZIP file.

  8. Execute cd bleep to change into that directory.

  9. Execute ls. You should see this problem’s distribution code, including bleep.py and banned.txt.

Understanding

This program defines only one function, main, which gets called per the file’s last line. Within main …​ ugh, looks like that’s just a big TODO!

Specification

Complete the implementation of bleep.py in such a way that it:

  • Accepts as its sole command-line argument the name (or path) of a dictionary of banned words (i.e., text file).

  • Opens and reads from that file the list of words stored therein, one per line, and stores each in a Python data structure for later access. While a Python list will work well for this, you may also find a set useful here.

  • If no command-line argument (e.g., banned.txt) is provided, your program should exit with a status code of 1.

  • You may assume that any text files the staff tests with will have one word per line (each line terminated with a \n), and any alphabetic characters in those words will be lowercase.

  • Prompts the user to provide a message.

  • Tokenizes that message into its individual component words by calling the split method on the provided string, and then iterates over the returned list of "tokens" (words), checking whether each token matches, case-insensitively, any of the words in the banned words list.

  • Prints back the message that the user provided, except that each banned word in the message has every one of its characters replaced by a *.

  • For example, gosh should be replaced with four * characters, while fudge should be replaced with five.

  • You should not censor words that merely contain a banned word as a substring. For example, if bar is a banned word in the provided list, then none of barns, crowbar, or wheelbarrow should be censored.

  • You explicitly do not need to support input strings that contain punctuation marks. You may assume that we will test only inputs in which words are separated by whitespace alone.
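
As a rough illustration of the tokenizing and censoring steps described above, here is a minimal sketch. The function name censor and the sample banned set are hypothetical and not part of the distribution code, and joining the tokens with single spaces is an assumption about the desired output format.

```python
def censor(message, banned):
    """Return message with every banned token replaced by asterisks."""
    censored = []
    for token in message.split():
        # Compare in lowercase so that "HECK" matches the banned word "heck"
        if token.lower() in banned:
            # One * per character of the original token
            censored.append("*" * len(token))
        else:
            censored.append(token)
    # Assumption: tokens are rejoined with single spaces
    return " ".join(censored)


# Hypothetical banned words, for illustration only
banned = {"heck", "darn", "gosh", "bar"}
print(censor("What the HECK", banned))      # What the ****
print(censor("crowbar barns bar", banned))  # crowbar barns ***
```

Because each whole token is looked up in the set, words like crowbar that merely contain a banned word are left alone.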

Usage

Your program should behave per the examples below. Assume that each command after a $, and each line that follows a prompt, is what some user has typed.

$ python bleep.py
Usage: python bleep.py dictionary
$ python bleep.py list1.txt list2.txt list3.txt
Usage: python bleep.py dictionary
$ python bleep.py banned.txt
What message would you like to censor?
hello world
hello world
$ python bleep.py banned.txt
What message would you like to censor?
what the heck
what the ****
$ python bleep.py banned.txt
What message would you like to censor?
gosh darn it
**** **** it
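
One possible way to get the usage behavior shown above is sketched below. The helper name get_dictionary_path is hypothetical, and the sketch assumes that too many arguments should be handled the same way as none, as the examples suggest.

```python
import sys


def get_dictionary_path(argv):
    """Return the dictionary path from argv, or exit with status 1."""
    # argv[0] is the script's name, so exactly one argument means len(argv) == 2
    if len(argv) != 2:
        print("Usage: python bleep.py dictionary")
        sys.exit(1)
    return argv[1]
```

Within main, this might be called as get_dictionary_path(sys.argv) before the file is opened.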

Testing

Correctness

check50 cs50/problems/2019/ap/bleep

Style

style50 bleep.py

Staff’s Solution

If you’d like to play with the staff’s own implementation of bleep, you may execute the below.

~cs50/2019/ap/chapter6/bleep

How to Submit

Step 1 of 2

Head back to CS50 IDE (ide.cs50.io) and ensure that bleep.py is in ~/chapter6/bleep, as with:

cd ~/chapter6/bleep
ls

If bleep.py is not in ~/chapter6/bleep, move it into that directory, as via mv (or via CS50 IDE’s lefthand file browser).

Step 2 of 2

  • To submit bleep, execute

    cd ~/chapter6/bleep
    submit50 cs50/problems/2019/ap/bleep

    inputting your GitHub username and GitHub password as prompted.

If you run into any trouble, email sysadmins@cs50.harvard.edu!

You may resubmit any problem as many times as you’d like before the deadline.

Your submission should be graded for correctness within 2 minutes, at which point your score will appear at submit.cs50.io!

Hints

  • Be sure to test with different banned words lists than the one provided by default — we will!

  • When independently researching how to do things on this problem (which is indeed part of the expectation, as you grow in your comfort with programming overall!), be sure your Google searches and the like include "Python 3" in them, and not just "Python", lest you get code examples written in an earlier version of Python!

  • Odds are you’ll find str.split of interest.

  • Odds are you’ll find str.lower of interest.

  • Odds are you’ll find str.strip of interest, to chomp off any trailing newlines that may be attached to words on your "banned words" list.
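
To see what the three string methods hinted at above do, consider this small sketch. The helper name words_from_lines is hypothetical; it accepts any iterable of lines, which is an assumption made so that the example runs without a file on disk.

```python
def words_from_lines(lines):
    """Collect one word per line into a set, ignoring blank lines."""
    banned = set()
    for line in lines:
        # strip removes the trailing "\n" and any other surrounding whitespace
        word = line.strip()
        if word:
            banned.add(word)
    return banned


print("gosh darn it".split())  # ['gosh', 'darn', 'it']
print("HECK".lower())          # heck
print(sorted(words_from_lines(["gosh\n", "darn\n", "\n"])))  # ['darn', 'gosh']
```

Since an open file yields its lines when iterated over, the same helper could be handed a file opened with open.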

This was Bleep.