Week 3: Introduction to Python
In this session, we will learn some fundamentals of working with Python, so make sure you have a working copy of Python running on your machine.
We will concentrate on the basic functionality you need to read and modify the example scripts provided in the tutorials underlying the course. We will look at some basic commands in Python, how to write and run Python scripts from the command line, flow control in scripts, and the definition and loading of functions.
The examples given in the session will largely follow those given in two excellent introductory tutorials.
- Swaroop Chitlur: A Byte of Python.
- Al Sweigart (2015): Automate the Boring Stuff with Python: Practical Programming for Total Beginners. No Starch Press.
For a more comprehensive introduction, make sure to check out The Quick Python Book by Naomi Ceder:
- Naomi Ceder (2018) The Quick Python Book. 3rd ed. Manning Publications.
We will only have time to discuss a small selection of the content covered in these tutorials, so make sure to spend some time after the course working through them. This will help you greatly in becoming more self-sufficient in your use of Python and will ultimately give you much more flexibility in collecting and analyzing digital trace data.
Another option for teaching yourself the basics of Python is a free interactive introductory course to Python offered by codecademy.
For a guide to further readings in working with Python from a social scientist’s perspective make sure to check out Nick Eubank’s Data Analysis in Python.
As you have probably gathered by now, this session will only offer you the most preliminary of introductions to the use of Python. But do not worry. If you caught the bug, there are excellent guides available to you taking you further down the rabbit hole.
For a broader view of collecting digital trace data beyond Twitter see:
- Russell, M. A. (2018). Mining the social web (3rd ed.). Sebastopol, CA: O’Reilly Media.
For a very helpful introduction to data analysis with Python see:
- McKinney, W. (2017). Python for data analysis: Data wrangling with pandas, numpy, and ipython (2nd ed.). Sebastopol, CA: O’Reilly Media.
For a handy introduction to machine learning with Python see:
- Raschka, S. & Mirjalili, V. (2017). Python machine learning: Machine learning and deep learning with python, scikit-learn, and tensorflow (2nd ed.). Birmingham, UK: PACKT Publishing.
Remember, you have access to O’Reilly and PACKT publications on Safari Books Online through VPN access to your university network.
Alternatively, the following two introductory online courses are also very helpful places to start your journey with Python:
- Codecademy. Learn Python.
- Schouwenaars, F. Intro to Python for Data Science. DataCamp.
Code Examples:
Basics
Ok, let’s begin by starting Python from your command line. If you are unsure of how to use the command line, don’t worry, but have a look at the quick intro by Nick Eubank.
For the purposes of this course, we will predominantly interact with Python through the program iPython. This is no big deal but should make our life slightly easier. If you have chosen to install Python through the anaconda distribution you are set and ready to go. If not, well, why didn’t you?
So open your command line and at the prompt type
ipython
Once the program has started you are able to directly work with Python. Try typing:
print("Hello World!")
As an alternative to letting Python execute commands directly, you can also read in Python scripts from the command line and let Python execute them automatically.
To do this, first close iPython by typing
exit
Now, as an example open a new file in the text editor of your choice and type the following lines:
# Python Script: Hello World!
print("Hello World!")
Now save the file in the working directory you use for the scripts of this course under the name “hello.py”.
Now point your command line to your working directory by:
cd /(...)/Scripts
and type
python hello.py
Now your little script will be executed in the command line. This will come in very handy once you try to execute large programs that would be prohibitively complicated to execute interactively through Python or iPython. Still, for the purposes of getting to know Python and its functionality, let’s stick for the time being with interacting with Python directly through iPython.
For more basic information about Python and conventions in writing commands and programs see the chapter Basics in Swaroop Chitlur’s A Byte of Python.
Also make sure to check out his chapter Operators and Expressions.
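To give you a first taste of operators and expressions, here is a minimal sketch you could type into iPython line by line. The variable names (length, breadth, and so on) are made up for illustration and are not part of the course scripts.

```python
# Arithmetic operators work on numbers stored in variables
length = 5
breadth = 2
area = length * breadth               # multiplication
perimeter = 2 * (length + breadth)    # brackets control the order

# Division: / always gives a float, // drops the fractional part
half = length / breadth
whole = length // breadth
remainder = length % breadth          # modulo: what is left over

# Comparison operators produce True or False
is_larger = area > perimeter

print('Area is', area)
print('Perimeter is', perimeter)
print(half, whole, remainder, is_larger)
```

Comparisons like the last one become important as soon as we get to flow control.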
Basic Python functionality
Al Sweigart offers a slightly more complicated example in his book Automate the Boring Stuff with Python illustrating Python’s basic functionality.
Start by opening your text editor and writing the following lines:
# This program says hello and asks for my name.
print('Hello world!')
print('What is your name?')
myName = input()
print('It is good to meet you, ' + myName)
print('The length of your name is:')
print(len(myName))
print('What is your age?')
myAge = input()
print('You will be ' + str(int(myAge) + 1) + ' in a year.')
Now save the new script in your working directory under the name name.py.
Call it from the command line by typing
python name.py
Now it’s time to answer the questions.
This example illustrates a series of fundamental features of Python. For us, the most important ones are:
- Python executes commands line by line
- Python is able to build and use variables specified by interactive input.
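One detail of the script is easy to miss: input() always hands back text, even when the user types digits. That is why the script wraps myAge in int() before adding 1 and in str() before joining it to the sentence. The following sketch illustrates this; to keep it non-interactive, the variable raw_age is a hypothetical stand-in for what input() would return.

```python
# raw_age stands in for the text a user would type at the prompt
raw_age = '29'                        # hypothetical answer, stored as a string

age = int(raw_age)                    # convert the text to an integer
next_year = age + 1                   # now arithmetic works
message = 'You will be ' + str(next_year) + ' in a year.'
print(message)

# Without int(), + glues strings together instead of adding numbers
glued = raw_age + '1'
print(glued)
```

The second print shows 291 rather than 30, which is exactly the kind of surprise the conversions in the script avoid.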
Make sure to check out Sweigart’s detailed discussion of the program and his other examples.
Flow Control
The previous script offered a nice illustration of some of Python’s elementary functionality. Still, for most of your tasks you will want a little more control. In the example, the program went through the script line by line. Most programs need more flexibility, and for this you have to control the flow of the program. Let’s focus on three statements that allow you to do just that:
- if, for, and while.
The “if” statement
Let’s turn to Swaroop Chitlur’s A Byte of Python: Control Flow for examples of how these statements work. The code for these examples and handy explanations of how it works in detail are available at http://python.swaroopch.com/control_flow.html
Open your text editor of choice and enter the following lines:
# if.py: This program illustrates the workings of the "if" statement
number = 23
guess = int(input('Enter an integer : '))
if guess == number:
    # New block starts here
    print('Congratulations, you guessed it.')
    print('(but you do not win any prizes!)')
    # New block ends here
elif guess < number:
    # Another block
    print('No, it is a little higher than that')
    # You can do whatever you want in a block ...
else:
    print('No, it is a little lower than that')
    # you must have guessed > number to reach here
print('Done')
The final print('Done') statement is always executed once the if statement has finished.
Now save the file as if.py in your working directory for the course and execute it from your command line:
python if.py
Now pick a number, any number…
If you missed the correct one run the program again…
… and again
… and again
… and … well, there should be an easier way to do this.
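Before moving on, a short aside on how Python reads a chain of if, elif, and else: the conditions are checked from top to bottom, and only the block under the first true condition runs. The sketch below is a hypothetical variation on the guessing game (the variables difference and hint, and the fixed guess of 21, are made up for illustration) that gives graded hints instead of asking for input.

```python
number = 23
guess = 21                            # hypothetical fixed guess instead of input()

# How far off is the guess, regardless of direction?
difference = abs(guess - number)

if difference == 0:
    hint = 'correct'
elif difference <= 3:
    hint = 'very close'               # first true condition wins
elif difference <= 10:
    hint = 'close'                    # also true for 21, but never reached
else:
    hint = 'far off'

print(hint)
```

Even though a difference of 2 also satisfies the later condition, only the first matching block is executed.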
The “while” statement
Open your text editor and enter the following lines:
# while.py: This program illustrates the workings of the "while" statement
number = 23
running = True
while running:
    guess = int(input('Enter an integer : '))
    if guess == number:
        print('Congratulations, you guessed it.')
        # this causes the while loop to stop
        running = False
    elif guess < number:
        print('No, it is a little higher than that.')
    else:
        print('No, it is a little lower than that.')
else:
    print('The while loop is over.')
    # Do anything else you want to do here
print('Done')
Now save the file as while.py in your working directory for the course and execute it from your command line:
python while.py
Now pick a number, any number…
Looky here: the program no longer stops once you’ve blown your one shot at guessing the right number. Instead, it runs in a loop, executing the “if” statement time and again and allowing your guesses to slowly converge on the right number.
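If you want to watch the loop at work without typing guesses yourself, here is a minimal sketch of the same logic. The list guesses is a made-up stand-in for what a user might type, and the counter attempts is an addition for illustration that is not part of the original script.

```python
number = 23
guesses = [50, 10, 23, 99]            # hypothetical answers, standing in for input()
attempts = 0
running = True

# Keep looping while we are still running and guesses are left
while running and attempts < len(guesses):
    guess = guesses[attempts]         # take the next simulated guess
    attempts = attempts + 1
    if guess == number:
        print('Congratulations, you guessed it.')
        running = False               # this causes the while loop to stop
    elif guess < number:
        print('No, it is a little higher than that.')
    else:
        print('No, it is a little lower than that.')

print('Needed', attempts, 'attempts')
```

The loop ends after the third guess, so the last entry in the list is never looked at.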
Loops are crucial for developing programs that continuously perform automated tasks and have to account for specific conditions, such as the rate limits of APIs during specific time windows. Another handy statement in this context is the “for” loop.
The “for” loop
Open your text editor of choice and enter the following lines:
# for.py: This program illustrates the workings of the "for" statement
for i in range(1, 5):
    print(i)
else:
    print('The for loop is over')
Now save the file as for.py in your working directory for the course and execute it from your command line:
python for.py
The “for” statement allows you to iterate a command over a sequence of objects. This is also very handy, and you will see us use it a lot in our sample scripts.
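A sequence does not have to be a range of numbers. As a small sketch, the following lines loop over a list of strings; the list terms and its contents are invented for illustration. The second loop shows that range(1, 5) stops before 5.

```python
# Hypothetical list of search terms; any sequence works the same way
terms = ['python', 'data', 'twitter']

lengths = []
for term in terms:
    print(term, 'has', len(term), 'characters')
    lengths.append(len(term))         # collect the results in a new list

# range(1, 5) yields 1, 2, 3, 4: the end value is not included
total = 0
for i in range(1, 5):
    total = total + i
print('Sum:', total)
```

Collecting results in a list while looping, as done with lengths here, is a pattern you are likely to meet again when working with collected data.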
Be sure to also examine the other examples given by Swaroop Chitlur on flow control.
Alternatively, have a look at Chapter 2 in Al Sweigart’s “Automate the Boring Stuff”.
Functions
Finally, let’s have a look at defining and using functions. Functions allow you to predefine commands and load them into your active workspace. Most of the example scripts provided by Pascal and me in the tutorials are predefined functions allowing you to access and use Twitter data.
To get a running start with functions let’s have a look at Swaroop Chitlur’s Python tutorial.
Let’s start with an easy example. Open your text editor and type the following lines:
def say_hello():
    # block belonging to the function
    print('hello world')
# End of function
say_hello()  # call the function
say_hello()  # call the function again
Now save the file as function1.py.
Ok, now let’s call the file from the terminal:
python function1.py
Now, let’s turn to something a little more complicated. Open a new file in your text editor and type the following lines:
def print_max(a, b):
    if a > b:
        print(a, 'is maximum')
    elif a == b:
        print(a, 'is equal to', b)
    else:
        print(b, 'is maximum')
# directly pass literal values
print_max(3, 4)
x = 5
y = 7
# pass variables as arguments
print_max(x, y)
Now save the file as function_param.py and call it:
python function_param.py
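The function print_max only prints its result. Often you will want a function to hand a value back so that the rest of your program can keep working with it. The following sketch shows one way to do this with the return statement; the function name get_max is made up for illustration and does not appear in the tutorials.

```python
def get_max(a, b):
    """Return the larger of two values instead of printing it."""
    if a > b:
        return a
    else:
        return b

# Store the returned value in a variable and reuse it
largest = get_max(3, 4)
print(largest, 'is maximum')

# A returned value can go straight into another expression
doubled = get_max(largest, 10) * 2
print(doubled)
```

Because get_max returns its result, you can store it, compare it, or pass it on to another function, which is not possible with a value that was only printed.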
Alternatively, have a look at Chapter 3 in Al Sweigart’s “Automate the Boring Stuff”.
Further Study
The examples provided above might serve as a first introduction to working with Python and the example scripts used in the tutorial. Still, this can only be a first and very cursory glance at the language. I therefore recommend you spend some time familiarizing yourself with further aspects of Python. The following chapters are a good start:
Lists and other data structures:
- Chapter 4, Automate the Boring Stuff.
- Chapter 5, Automate the Boring Stuff.
- Swaroop Chitlur’s Python tutorial on structures.
Manipulating strings:
- Chapter 6, Automate the Boring Stuff.
Modules:
- Swaroop Chitlur’s Python tutorial on modules.
Big Picture:
For a more structured approach, you could always turn to one of the available introductory books on Python.
Here, for example, I would recommend The Quick Python Book by Naomi Ceder. For our purposes here, Part 2 The Essentials should provide you with a running start.
For other alternatives, make sure to have a look at the respective links on the official Python site:
- Help for non-programmers;
- Help for programmers.
Another option is the free Python course by codecademy.
After you have mastered these early stages, a host of texts is ready to assist you in more demanding tasks. The following two might prove useful to you in collecting and using digital trace data:
- McKinney, W. (2017). Python for data analysis: Data wrangling with pandas, numpy, and ipython (2nd ed.). Sebastopol, CA: O’Reilly Media.
- Russell, M. A. (2018). Mining the social web (3rd ed.). Sebastopol, CA: O’Reilly Media.
If you really want to get a head start on what you can do with Python and large data sets collected online have a look at:
- Raschka, S. & Mirjalili, V. (2017). Python machine learning: Machine learning and deep learning with python, scikit-learn, and tensorflow (2nd ed.). Birmingham, UK: PACKT Publishing.
Required Readings:
- Eubank, N. (2015). Data analysis in python.
Background Readings:
- Ceder, N. R. (2018). The quick python book (3rd ed.). Greenwich, CT: Manning Publications Co.
- Chitlur, S. (2016). A byte of python.
- Lutz, M. (2013). Learning python (5th ed.). Sebastopol, CA: O’Reilly Media.
- McKinney, W. (2017). Python for data analysis: Data wrangling with pandas, numpy, and ipython (2nd ed.). Sebastopol, CA: O’Reilly Media.
- Raschka, S. & Mirjalili, V. (2017). Python machine learning: Machine learning and deep learning with python, scikit-learn, and tensorflow (2nd ed.). Birmingham, UK: PACKT Publishing.
- Russell, M. A. (2018). Mining the social web (3rd ed.). Sebastopol, CA: O’Reilly Media.
- Sweigart, A. (2015). Automate the boring stuff with python: Practical programming for total beginners. San Francisco, CA: No Starch Press.
Additional Courses:
Python:
- Codecademy. Learn Python.
- Schouwenaars, F. Intro to Python for Data Science. DataCamp.