Data Structures
Please make sure you have downloaded the workshop data. See Setup for instructions.
For a more detailed tutorial, please refer to the Python documentation.
Numbers
Additional information available here
Input
x = 2.
y = 4.
Input
x + y # sum
Input
z = 1
z += 1 # increment by one
z
Input
y**x # power
Input
y / x # division
Be careful: dividing by zero raises an error.
Input
y / 0 # division by 0
Output
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: float division by zero
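One way to guard against this error is to catch it with try/except. The following is only a minimal sketch: safe_divide is a hypothetical helper name introduced for illustration, and returning None for a zero denominator is an assumption about what a caller might want.

```python
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # Division by zero is undefined, so signal that with None (an assumption).
        return None


y = 4.
print(safe_divide(y, 2.))  # 2.0
print(safe_divide(y, 0))   # None
```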
Strings
Additional information available here
You can declare a variable to hold text (a.k.a. a string).
Input
greetings = "Hello"
print(greetings)
You can combine or concatenate text
Input
tutorial = "Python "
offered_by = "Research Commons"
print(tutorial + offered_by)
In a string, each character is stored at an index, starting from 0. You can obtain a substring by indicating which indexes you want: [start index:end index]. The character at the end index is not included.
A blank before the colon, e.g. [:8], means from the beginning up to, but not including, index 8.
A blank after the colon, e.g. [5:], means from index 5 up until the end.
Input
offered_by[:8] # check substring
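The three slice forms can be compared side by side. This is a small sketch that reuses the offered_by string from above; the variable names first_word, rest and middle are only illustrative.

```python
offered_by = "Research Commons"

first_word = offered_by[:8]   # indexes 0 through 7 (index 8 is excluded)
rest = offered_by[5:]         # index 5 through the end
middle = offered_by[9:12]     # indexes 9, 10 and 11

print(first_word)  # Research
print(rest)        # rch Commons
print(middle)      # Com
```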
We don’t need to count each character: index(some character) will return the index of its first occurrence for us.
Input
offered_by[:offered_by.index(" ")] # we can get indexes programmatically
Input
idx = offered_by.index(" ")
offered_by[:idx] # we can get indexes programmatically
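It may help to know what happens when the character is not present. As a brief sketch: index() raises a ValueError, while the related string method find() returns -1 instead. The variable names space_idx and missing are only illustrative.

```python
offered_by = "Research Commons"

space_idx = offered_by.index(" ")  # position of the first space
print(space_idx)                   # 8

missing = offered_by.find("z")     # find() returns -1 when nothing matches
print(missing)                     # -1

try:
    offered_by.index("z")          # index() raises an error instead
except ValueError:
    print("'z' is not in the string")
```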
Strings are case sensitive
Input
"Research" in offered_by # check if some string contains another
Input
"RESEARCH" in offered_by
Input
"Research".upper() # convert to upper
Input
"Research".lower() # convert to lowercase
Why do I need to know about Strings? Introducing our practical problem
Suppose we want to find all the sentences of a text file that explicitly mention global and warming.
First we need to open a file
Input
with open('G20-2019.txt', 'r') as f:
    data = f.read()
print(data)
Then we need to split the file into lines. Python has a special character, \n, that represents a new line. Many programming languages have escape characters like this.
Input
with open('G20-2019.txt', 'r') as f:
    data = f.read().split('\n')
print(data)
Output
[
'Good morning, this G20 meeting takes place in a moment of high tension, high political tension.',
'We have global warming, but we have also global political warming, and this can be seen in relation to trade and technology conflicts, it can be seen in relation to situations in several parts of the world, namely the Gulf.'
...
]
data is no longer a string. It is a list! We will get to lists very soon.
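The change of type can be checked without the workshop file. The sketch below uses a short made-up string (not the contents of G20-2019.txt) to show what split('\n') returns.

```python
# A made-up three-line string, standing in for the contents of a file.
text = "First line\nSecond line\nThird line"

lines = text.split('\n')  # break the string at every newline character

print(type(text))   # <class 'str'>
print(type(lines))  # <class 'list'>
print(len(lines))   # 3
print(lines[0])     # First line
```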