Today’s lab will focus on open data, using main(), and using Python from the command line.
Software tools needed: web browser and Python IDLE programming environment with the pandas and matplotlib packages installed.
During lab, there is a quiz. The password to access the quiz will be given during lab. To complete the quiz, log on to Blackboard (see Lab 1 for details on using Blackboard).
See Lab 1 for details on using Python, Gradescope, and Blackboard.
Much of the data collected by city agencies is publicly available at NYC Open Data. Let’s use pandas to plot some data from NYC Open Data.
(Totals for 2015-2016)
We’ll start with data that has the daily number of families and individuals residing in the Department of Homeless Services (DHS) shelter system:
Click on the “Explore Data/View Data” button. To keep the data set from being very large (and to avoid some missing values in 2014), we are going to filter the data to include only counts after January 1, 2017. To do this:
To download the file,
Move your CSV file to the directory where you save your programs. Open it with Excel (or your favorite spreadsheet program) to make sure it downloaded correctly. Look at the names of the columns, since those will correspond to the series we can plot.
Now, we can write a (short) program to display daily counts:
import pandas as pd
import matplotlib.pyplot as plt
homeless = pd.read_csv("DHS_Daily_Report.csv")
homeless.plot(x = "Date of Census", y = "Total Individuals in Shelter")
plt.show()
Save your program and try it on your dataset.
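If the plot fails with an error about a column, the column names in your file may not match the ones used in the program. The sketch below shows one way to list the column names with pandas instead of a spreadsheet. The in-memory data is a hypothetical stand-in for the downloaded file: the column names are taken from the program above, but the two rows of numbers are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical stand-in for the downloaded CSV file. The column names come
# from the lab's program; the two rows of numbers are made up.
sample_csv = io.StringIO(
    "Date of Census,Total Individuals in Shelter\n"
    "01/01/2017,100\n"
    "01/02/2017,105\n"
)

# With the real file, pass its name instead:
#   pd.read_csv("DHS_Daily_Report.csv")
sample = pd.read_csv(sample_csv)

# Each column name is a series that can be used as x or y in plot()
for name in sample.columns:
    print(name)
```

The names must be typed in the program exactly as they are printed, including capitalization and spaces.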
See the Programming Problem List.
main()
Python allows you to write programs as scripts: basically, a list of commands that are executed one after the other. You can also organize programs into functions, which group commands together so that they can be reused. Many programming languages (like C++ or Java) require that your programs be organized in functions.
To define a function in Python, we use the def keyword, which has the basic form:
def myFunction(input1, input2, ...):
    command1
    command2
    ...
Note that everything indented below the def line is considered part of the function. When you type the function name (followed by parentheses), it calls (or “invokes”) the function, which means it executes all the commands that are part of the function, one after another.
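As a small illustration of the form above, here is a sketch of a function with two inputs. The name greet and its inputs are hypothetical, chosen only for this example.

```python
# greet, name, and greeting are hypothetical names chosen for this example
def greet(name, greeting):
    # Every indented line below the def line belongs to the function
    message = greeting + ", " + name + "!"
    print(message)
    # Hand the message back to whoever called the function
    return message

# These lines are not indented, so they are not part of the function.
# Each one calls (invokes) greet with different inputs.
greet("World", "Hello")
greet("Python", "Welcome")
```

Defining the function does not run its commands. They run only when the function is called, once for each call.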
Let’s rewrite our first program using functions. By tradition (and since it matches the naming convention of C and C++), we will call our function main() (see Section 6.7: Using a Main Function):
#Name: your name here
#Date: October 2017
#This program uses functions to say hello to the world!
def main():
    print("Hello, World!")
if __name__ == "__main__":
    main()
In Python, we have the option of running a program as a standalone program or including it as a module in another program. Since it’s common to do either, we include the last two lines of the file. They say that if the program is being run directly (which we can test by checking whether the variable __name__, which is set by Python, is "__main__"), then we call main(). If it’s not, then the file is being included in something else, and it is left to that program to call main().
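To see what the test on __name__ does, the sketch below adds a helper that reports how a file is being used. The helper describe_run_mode is a hypothetical name introduced only for this illustration; it is not part of the lab's program.

```python
def main():
    print("Hello, World!")

# describe_run_mode is a hypothetical helper, used only for illustration
def describe_run_mode(module_name):
    # Python sets __name__ to "__main__" when a file is run directly,
    # and to the module's own name when the file is imported
    if module_name == "__main__":
        return "run directly"
    return "imported as a module named " + module_name

if __name__ == "__main__":
    print("This file is being", describe_run_mode(__name__))
    main()
```

If this were saved in a file and run directly, the last three lines would execute. If another program imported the file, nothing would be printed until that program called main() itself.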
Save your program and try running it in IDLE.
Now, at the prompt (the window with the lines beginning with >>>), type main(). This calls the function directly. Note that calling the function either way results in the same actions: the commands inside main() are executed.
When you have a running version, see the Programming Problem List.
In addition to IDLE (and other development environments with graphical interfaces), Python can also be used directly from the command line. In fact, this is what the grading scripts do to evaluate your programs, since Gradescope uses a server in the cloud and does not have a graphics window.
To start, we need a command line interface (also known as a terminal window). To launch the terminal, click on the terminal window icon in the left menu, or go to the search option in the upper left corner, type terminal, and then open it.
In Lab 1, we launched IDLE from the terminal by typing:
idle3
We can use Python in a similar fashion. In a terminal window, change directories to where you stored your hello program above (see Lab 4 for changing directories at the command line).
Let’s run your hello program from the command line. If your program is called hello.py, you would type at the command line:
python3 hello.py
Notice that the output goes directly to the terminal window. Try running other programs you have written from the command line.
If you finish the lab early, now is a great time to get a head start on the programming problems due early next week. There are instructors to help you, and you already have Python up and running. The Programming Problem List has problem descriptions, suggested reading, and due dates next to each problem.