
Data Structures

Please make sure you have downloaded the workshop data. See Setup for instructions.

For a more detailed tutorial, please refer to the Python documentation.

Numbers

Additional information available here

Input

x = 2.
y = 4.

Input

x + y # sum

Input

z = 1
z += 1 # increment by one
z 

Input

y**x # power

Input

y / x # division

Be careful: dividing by zero raises an error.

Input

y / 0 # division by 0

Output

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: float division by zero
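
If you would rather handle the error than let the program stop, one option is to catch it. This is only a minimal sketch: `safe_divide` is a hypothetical helper name, not something built into Python, and returning `None` is just one possible choice.

```python
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or None if the denominator is zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # Dividing by zero is not allowed, so signal "no result" instead
        return None


x = 2.
y = 4.

print(safe_divide(y, x))  # 2.0
print(safe_divide(y, 0))  # None
```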

Strings

Additional information available here

You can declare a variable to hold text (a.k.a. a string).

Input

greetings = "Hello"
print(greetings)

You can combine or concatenate text

Input

tutorial = "Python "
offered_by = "Research Commons"

print(tutorial + offered_by)

In a string, each character is stored at an index, and indexes start at 0. You can obtain a substring by indicating which indexes you want: [start index:end index]. The character at the start index is included; the character at the end index is not.

A blank before the colon, e.g. [:8], means from the beginning up to, but not including, index 8.

A blank after the colon, e.g. [5:], means from index 5 up until the end.

Input

offered_by[:8] # check substring
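
To make the start and end rules concrete, here is a small sketch using the same `offered_by` string. The variable names `first_eight`, `rest`, and `middle` are only illustrative.

```python
offered_by = "Research Commons"

first_eight = offered_by[:8]  # indexes 0 through 7
rest = offered_by[9:]         # index 9 through the end
middle = offered_by[2:6]      # indexes 2, 3, 4 and 5 (6 is excluded)

print(first_eight)  # Research
print(rest)         # Commons
print(middle)       # sear
```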

We don’t need to count each character. index(some character) returns the index of the first occurrence of that character for us.

Input

offered_by[:offered_by.index(" ")]  # we can get indexes programmatically

Input

idx = offered_by.index(" ")
offered_by[:idx]  # reuse the stored index to slice
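
One thing to keep in mind: `index` raises a `ValueError` when the character is not in the string. The sketch below shows that behaviour next to the string method `find`, which returns -1 instead. The names `space_idx` and `missing` are only illustrative.

```python
offered_by = "Research Commons"

space_idx = offered_by.index(" ")  # position of the first space
print(space_idx)                   # 8

missing = offered_by.find("z")     # find() returns -1 when nothing matches
print(missing)                     # -1

try:
    offered_by.index("z")          # index() raises an error instead
except ValueError as error:
    print("index() failed:", error)
```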

Strings are case sensitive

Input

"Research" in offered_by # check if some string contains another

Input

"RESEARCH" in offered_by

Input

"Research".upper() # convert to upper

Input

"Research".lower() # convert to lowercase
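
Combining these ideas, one common way to check for a word regardless of case is to convert both sides to the same case before comparing. A minimal sketch (the names `exact_match` and `any_case_match` are illustrative):

```python
offered_by = "Research Commons"

# The direct check fails because the cases differ
exact_match = "RESEARCH" in offered_by
print(exact_match)  # False

# Lowercasing both strings first makes the check case-insensitive
any_case_match = "RESEARCH".lower() in offered_by.lower()
print(any_case_match)  # True
```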

Why do I need to know about Strings? Introducing our practical problem

Suppose we want to find all the sentences of a text file that explicitly mention both "global" and "warming".

First we need to open a file.

Input

with open('G20-2019.txt', 'r') as f:
    data = f.read()
    print(data)

Then we need to split lines in the file.

Python has a special character, \n, that represents a new line. Many programming languages have escape sequences like this one.

Input

with open('G20-2019.txt', 'r') as f:
    data = f.read().split('\n')
    print(data)

Output

[
    'Good morning, this G20 meeting takes place in a moment of high tension, high political tension.', 
    'We have global warming, but we have also global political warming, and this can be seen in relation to trade and technology conflicts, it can be seen in relation to situations in several parts of the world, namely the Gulf.',
    ... 
]

data is no longer a string. It is a list! We will get to lists very soon.
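
As a rough preview of where this is heading, the sketch below splits a string into lines and keeps only the lines that mention both words. To stay self-contained it uses the two sentences shown in the output above instead of reading the file, and the names `text`, `lines`, and `matches` are only illustrative. Lowercasing each line first is an assumption about how we want to match, not a requirement.

```python
# Two sentences from the file, joined by a newline character
text = (
    "Good morning, this G20 meeting takes place in a moment of high tension, "
    "high political tension.\n"
    "We have global warming, but we have also global political warming, and "
    "this can be seen in relation to trade and technology conflicts, it can be "
    "seen in relation to situations in several parts of the world, namely the Gulf."
)

lines = text.split('\n')
print(type(lines))  # <class 'list'>

matches = []
for line in lines:
    lowered = line.lower()  # ignore case when searching
    if "global" in lowered and "warming" in lowered:
        matches.append(line)

print(len(matches))  # 1
```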