Welcome to Lab Session 8: Python, Part 2
For this lab session we’ll continue with Python programming.
Let me know if you need help with any of the exercises! I can also help explain concepts from the lecture if there was anything you found particularly challenging.
Additionally, don’t forget that you can discuss the exercises with your fellow classmates on MittUiB, the UiB Studyfellowship Discord server (channel: #ling123) or right here in ling123labs.com’s comments section.
The lecture notes for lecture 8 can be found here.
Exercise 1: Tokenizer with RE in Python
Take a look at the example in the lecture notes.
a) What are two important differences between the results of
b) What if we want to use word_tokenize but also want casefolding?
Exercise 2: Zippers
What happens if you try to take the next of a zip iterator that is exhausted?
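One way to explore this is to step through a small zip object by hand. The snippet below is only a sketch with made-up input lists; it records what happens on the call after the last pair, and shows that next also accepts a default value as a second argument.

```python
# Two short, made-up lists; zip pairs them up element by element.
pairs = zip(["a", "b"], [1, 2])

first = next(pairs)
second = next(pairs)

# The zip object is now exhausted; see what a third call does.
try:
    next(pairs)
    exhausted = False
except StopIteration:
    exhausted = True

# With a default as the second argument, next returns it instead of raising.
fallback = next(pairs, None)

print(first, second, exhausted, fallback)
```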
Exercise 3: N-grams with Python
a) Once again, there is a difference between the results, due to a difference between
word_tokenize. Why would you use one or the other?
b) The resulting n-grams are in both cases generators. If you do not convert them to lists, you can use
c) Look at the following. Use such a method to convert the list of n-gram tuples to a list of strings in which the words are separated by spaces.
>>> [" ".join(('alpha', 'beta', 'gamma'))]
['alpha beta gamma']
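As a rough sketch of how the join idiom extends to a whole list, the example below builds trigrams from a made-up token list and joins each tuple in a list comprehension. The helper ngrams_from_tokens is hypothetical and uses plain zip; it is not the function from the lecture notes, and any list of tuples would work the same way.

```python
def ngrams_from_tokens(tokens, n):
    """Return a list of n-gram tuples (hypothetical helper built on zip)."""
    # zip stops at the shortest slice, so only complete n-grams are produced.
    return list(zip(*(tokens[i:] for i in range(n))))


tokens = ["alpha", "beta", "gamma", "delta"]
trigrams = ngrams_from_tokens(tokens, 3)

# Join the words of every tuple with a single space.
trigram_strings = [" ".join(gram) for gram in trigrams]
print(trigram_strings)
# ['alpha beta gamma', 'beta gamma delta']
```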
Exercise 4: NLTK Stopwords
Create a function that removes stopwords from a given text.
from nltk.corpus import stopwords
stoplist = stopwords.words('english')
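One possible shape for such a function is sketched below. To keep it runnable without downloading NLTK data, it takes the stoplist as a parameter and is tried out with a tiny hand-written list (an assumption for illustration); in the exercise you would pass in the stoplist from stopwords.words('english') instead. It also splits on whitespace only, which is a simplification compared to a real tokenizer.

```python
def remove_stopwords(text, stoplist):
    """Return text with every word that appears in stoplist removed."""
    # A set gives fast lookups; casefold makes the comparison case-insensitive.
    stopset = {word.casefold() for word in stoplist}
    kept = [token for token in text.split() if token.casefold() not in stopset]
    return " ".join(kept)


# Hypothetical mini stoplist standing in for the NLTK one.
demo_stoplist = ["the", "is", "a", "of", "on"]
print(remove_stopwords("The cat is on the roof of a house", demo_stoplist))
# cat roof house
```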
Exercise 5: Matplotlib basics
Use the matplotlib and pandas libraries to draw a line chart of Bitcoin price data from
Date,Open,High,Low,Close
Mar-09-2021,52272.97,54824.12,51981.83,54824.12
Mar-08-2021,51174.12,52314.07,49506.05,52246.52
Mar-07-2021,48918.68,51384.37,48918.68,51206.69
Mar-06-2021,48899.23,49147.22,47257.53,48912.38
Mar-05-2021,48527.03,49396.43,46542.51,48927.30
Mar-04-2021,50522.31,51735.09,47656.93,48561.17
Mar-03-2021,48415.81,52535.14,48274.32,50538.24
Mar-02-2021,49612.11,50127.51,47228.85,48378.99
Mar-01-2021,45159.50,49784.02,45115.09,49631.24
Tip: You can use pandas.read_csv('bitcoin-prices.csv', parse_dates=True, index_col=0) to read the data from the .csv file.
The output should look something like this:
[Figure: A line chart of Bitcoin price data, created using Matplotlib and Pandas]
Feel free to play around with the Matplotlib library. For example, you can experiment with the parameters
markersize to customize the graph.
Tip: If you have 4 line charts on a single plot, you need to set color to a list of four matplotlib colors.
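As a starting point, the sketch below puts the two tips together. It assumes a few things that are not part of the exercise: the first three data rows are inlined through io.StringIO so the snippet runs without the .csv file, the Agg backend is selected so it also works without a display, and the colors, marker settings, labels and output file name are arbitrary choices to experiment with.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display needed)
import matplotlib.pyplot as plt
import pandas as pd

# First three data rows from the exercise, inlined to keep the sketch self-contained.
CSV_TEXT = """Date,Open,High,Low,Close
Mar-09-2021,52272.97,54824.12,51981.83,54824.12
Mar-08-2021,51174.12,52314.07,49506.05,52246.52
Mar-07-2021,48918.68,51384.37,48918.68,51206.69
"""

# Same call as in the tip, reading from the inlined text instead of a file.
prices = pd.read_csv(io.StringIO(CSV_TEXT), parse_dates=True, index_col=0)
prices = prices.sort_index()  # the file lists the newest day first

# One line per column: four columns, so four colors.
ax = prices.plot(
    color=["tab:blue", "tab:green", "tab:red", "tab:orange"],
    marker="o",
    markersize=4,
)
ax.set_xlabel("Date")
ax.set_ylabel("Price")
ax.set_title("Bitcoin price data")

# Hypothetical output file name; plt.show() is the interactive alternative.
plt.savefig("bitcoin-prices.png")
```

Changing the marker, markersize and color values is a quick way to see how each parameter affects the chart.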
You can also check out the matplotlib documentation in its entirety here!