Using the AI assistant

Introduction

The course has access to a chat AI at this address.

The AI, based on ChatGPT, is quite good at helping with programming problems. However, if left unchecked, it has two problematic tendencies. First, it likes to write "fancy" code that isn't necessarily the most understandable for a beginner. Second, it is often too eager to fix and alter code to its liking. Both of these hinder learning. To mitigate these problems, we've set up the following default prompt for the AI:

You are a helpful assistant tutoring university students who are learning the basics of Python programming. In the course, the students try to solve various simple programming assignments. Your task is to help them resolve whatever errors they encounter, and to explain in clear terms what pieces of any code pasted to you do. When you're asked about why a piece of code gives a specific error, it is extremely important for the student's learning that you don't alter the program otherwise than what is needed to fix that particular problem and only that problem.

Any code you give as examples should also be written to be understandable to a complete beginner in programming. So, don't use any fancy programming constructs and stick to how a beginner would do things.

This helps, but doesn't completely solve the problem. So, if you're getting weird results, try changing the prompt or adding more instructions. Also, make use of the fact that the AI has been tuned for conversational interaction. For example, if it gives you a solution you can't understand, call it out with "This is still too complex, can you simplify it?". It works very well with this kind of interaction, where it gives you something and you then give it further instructions and ask further questions until what it provides actually matches your need.

Bottom line: when using the AI, remember not to let it blindly solve the whole problem for you. If it gives you a solution that you don't completely understand, insist that it explain it to you until you do. This way, you'll actually learn.

As for using the AI to cheat your way through the course, that's certainly possible. I may do spot checks on your submissions to catch code I wouldn't believe a beginner would write, but if you're careful, you can surely instruct the AI to write like a beginner. At that point, though, you're not really learning anything. So the only situation where it could even be worth considering is if you just want the credits and don't really care about the content. The main assumption, however, is that you're learning the fundamentals taught in this course on your way to further courses that actually require you to know them. If you cheat your way through this course, you'll only be shooting yourself in the foot.

Examples of how to use the AI successfully

Example 1: Explain why a program doesn't work and drill down to understand why things are the way they are


User:

My task is: Please write a program which asks the user for their age. The program should then print out a message based on whether the user is of age or not, using 18 as the age of maturity.

Some examples of expected behaviour:

Sample output

How old are you? 12
You are not of age!

Sample output

How old are you? 32
You are of age!

My solution is:

# Write your solution here
age = input("How old are you?")
if age > 18:
   print("You are of age!")
if age < 18:
   print("You are not of age!")

Why am I getting: TypeError on line 3: '>' not supported between instances of 'str' and 'int'?


AI:

The error you're encountering is because the input() function in Python returns a string, but you're trying to compare it directly to an integer. To fix this, you need to convert the input string to an integer using the int() function before performing the comparison.

Here is the corrected version of your code:

# Write your solution here
age = int(input("How old are you? "))
if age >= 18:
    print("You are of age!")
else:
    print("You are not of age!")

Explanation:

  1. age = int(input("How old are you? ")): This line asks the user for their age and converts the input string to an integer.
  2. if age >= 18:: This condition checks if the age is 18 or greater.
  3. print("You are of age!"): This line prints the message if the user is of age.
  4. else:: This part handles the case where the user is not of age.
  5. print("You are not of age!"): This line prints the message if the user is not of age.

This should now work correctly and give the appropriate output based on the user's input.
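
For comparison, here is a sketch of the smallest change that removes the TypeError: only the line that reads the input is touched, and the two original if statements stay as they were. The logic is wrapped in a function called age_message (a made-up name, used here only so the behaviour can be checked without typing anything in). Notice that with the original conditions an age of exactly 18 gives no message at all, which is the case the AI's version above covers by using >= and else.

```python
def age_message(age_text):
    # age_text is the text that input() would return, for example "12"
    age = int(age_text)  # the only change needed to fix the TypeError
    message = ""
    if age > 18:
        message = "You are of age!"
    if age < 18:
        message = "You are not of age!"
    return message


print(age_message("12"))  # You are not of age!
print(age_message("32"))  # You are of age!
print(age_message("18"))  # prints an empty line: neither condition is true
```

If the AI changes more than you asked for, as it did here, it is worth asking it why each change was made.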


User:

But what are integers and strings and why do they need to be different?!


AI:

Great question! Understanding the difference between integers and strings is fundamental in programming.

Integers

  • Integers are whole numbers without any decimal point. They can be positive, negative, or zero.
  • Examples of integers: -5, 0, 42.

In Python, integers are used for mathematical operations and comparisons involving numeric values.

Strings

  • Strings are sequences of characters enclosed in quotes. They can include letters, numbers, symbols, and whitespace.
  • Examples of strings: "hello", "123", "42".

Strings are used to represent text and are not suitable for mathematical operations without conversion.

Why They Need to Be Different

  • Operations: Different types of data are used for different kinds of operations. For example, you can add, subtract, multiply, and divide integers. However, these operations don't make sense for strings in the same way.
  • Comparisons: When comparing values, Python needs to know the type of data to understand how to perform the comparison. For instance, comparing the string "18" to the integer 18 directly is not meaningful because they represent different types of data.
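
The difference described above can be tried out directly. The following is a minimal sketch; the variable names text and number are chosen only for this illustration.

```python
text = "18"    # a string: the characters '1' and '8'
number = 18    # an integer: the numeric value eighteen

print(text + text)          # 1818  (the two pieces of text are joined)
print(number + number)      # 36    (ordinary addition)
print(text == number)       # False (a string is not equal to an integer)
print(int(text) == number)  # True  (after conversion the values match)
print(str(number) == text)  # True  (conversion works in the other direction too)
```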

User:

But why isn't "18" the same as 18?


AI:

That's a great question! It boils down to how computers interpret and handle different types of data.

Strings vs. Integers

  1. Representation:

    • "18" is a string. It is a sequence of characters, specifically the characters '1' and '8'. Strings are used to represent text.
    • 18 is an integer. It is a numerical value that can be used in arithmetic operations.
  2. Operations:

    • You can concatenate strings. For example, "1" + "8" results in "18".
    • You can perform arithmetic operations with integers. For example, 1 + 8 results in 9.
  3. Comparisons:

    • When comparing strings, Python compares them lexicographically (like dictionary order). For example, "2" > "10" is True because it compares character by character, and '2' comes after '1' in lexicographical order.
    • When comparing integers, Python compares their numerical values. For example, 2 > 10 is False.

Example

Let's look at an example to illustrate the difference:

# String comparison
string_a = "18"
string_b = "2"

print(string_a > string_b)  # This will print False because '1' in "18" is less than '2' in "2"

# Integer comparison
int_a = 18
int_b = 2

print(int_a > int_b)  # This will print True because 18 is greater than 2

Why This Matters in Your Program

When you ask for the user's age using input(), Python treats the input as a string. If you don't convert this string to an integer, Python will try to compare the string to an integer, which doesn't work as expected and leads to errors.

By converting the input to an integer, you ensure that Python is comparing numerical values correctly.

Summary

  • "18" (string) and 18 (integer) are fundamentally different types of data.
  • Strings are sequences of characters, while integers are numerical values.
  • Different types of data are used for different operations and comparisons.
  • Converting between types when necessary ensures your program behaves as expected.

I hope this clarifies why "18" is not the same as 18 and why type conversion is important!
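
One more place where the string/integer difference shows up is sorting. The sketch below uses a made-up list of ages stored as text and shows how the order changes once the values are converted to integers.

```python
ages_as_text = ["2", "10", "18"]

# Sorted as strings: compared character by character,
# so "10" and "18" come before "2"
print(sorted(ages_as_text))  # ['10', '18', '2']

# Convert each string to an integer first
ages_as_numbers = []
for age in ages_as_text:
    ages_as_numbers.append(int(age))

# Sorted as integers: compared by numeric value
print(sorted(ages_as_numbers))  # [2, 10, 18]
```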


Example 2: Explaining a Python code snippet step by step


User:

Can you please explain in detail and step by step what the following Python code does:

import requests
import re
from gensim.parsing.preprocessing import remove_stopwords
from gensim.corpora.dictionary import Dictionary
from gensim.models.ldamodel import LdaModel
import pyLDAvis
import pyLDAvis.gensim

pride_and_prejudice = requests.get("http://www.gutenberg.org/cache/epub/42671/pg42671.txt").text
paragraphs = [re.split(r"\W+",remove_stopwords(paragraph)) for paragraph in re.split(r"\r\n\r\n",re.sub(r"CHAPTER [XIV]+.","",pride_and_prejudice[pride_and_prejudice.index("CHAPTER I."):pride_and_prejudice.index("*** END OF THE PROJECT GUTENBERG EBOOK PRIDE AND PREJUDICE ***")]))]
d = Dictionary(paragraphs)
c = [d.doc2bow(paragraph) for paragraph in paragraphs]
m = LdaModel(c,num_topics = 15, alpha='auto')
pyLDAvis.display(pyLDAvis.gensim.prepare(m,c,d))

AI:

Sure! Let's break down the code step by step to understand what it does. This code performs topic modeling on the text of "Pride and Prejudice" by Jane Austen using Latent Dirichlet Allocation (LDA) and visualizes the topics using pyLDAvis.

Step-by-Step Explanation

  1. Importing Libraries:
import requests
import re
from gensim.parsing.preprocessing import remove_stopwords
from gensim.corpora.dictionary import Dictionary
from gensim.models.ldamodel import LdaModel
import pyLDAvis
import pyLDAvis.gensim
  • requests: For making HTTP requests to fetch the text.
  • re: For regular expressions and text manipulation.
  • gensim.parsing.preprocessing.remove_stopwords: To remove common stopwords from the text.
  • gensim.corpora.dictionary.Dictionary: To create a mapping of words to IDs.
  • gensim.models.ldamodel.LdaModel: For creating the LDA model.
  • pyLDAvis and pyLDAvis.gensim: For visualizing the LDA model.
  2. Fetching the Text:
pride_and_prejudice = requests.get("http://www.gutenberg.org/cache/epub/42671/pg42671.txt").text
  • This line fetches the text of "Pride and Prejudice" from Project Gutenberg and stores it in the variable pride_and_prejudice.
  3. Preprocessing the Text:
paragraphs = [re.split(r"\W+", remove_stopwords(paragraph)) for paragraph in re.split(r"\r\n\r\n", re.sub(r"CHAPTER [XIV]+.", "", pride_and_prejudice[pride_and_prejudice.index("CHAPTER I."):pride_and_prejudice.index("*** END OF THE PROJECT GUTENBERG EBOOK PRIDE AND PREJUDICE ***")]))]
  • pride_and_prejudice.index("CHAPTER I."): Finds the start of the main content.
  • pride_and_prejudice.index("*** END OF THE PROJECT GUTENBERG EBOOK PRIDE AND PREJUDICE ***"): Finds the end of the main content.
  • pride_and_prejudice[...]: Extracts the main content of the book.
  • re.sub(r"CHAPTER [XIV]+.", "", ...): Removes chapter headings.
  • re.split(r"\r\n\r\n", ...): Splits the text into paragraphs.
  • remove_stopwords(paragraph): Removes stopwords from each paragraph.
  • re.split(r"\W+", ...): Splits each paragraph into words.
  • The result is a list of lists, where each inner list contains the words of a paragraph.
  4. Creating a Dictionary:
d = Dictionary(paragraphs)
  • This creates a gensim dictionary from the paragraphs. The dictionary maps each unique word to an ID.
  5. Creating a Corpus:
c = [d.doc2bow(paragraph) for paragraph in paragraphs]
  • This converts each paragraph into a bag-of-words (BoW) format. Each paragraph is represented as a list of tuples, where each tuple contains a word ID and its frequency in the paragraph.
  6. Training the LDA Model:
m = LdaModel(c, num_topics=15, alpha='auto')
  • This trains an LDA model on the corpus c with 15 topics. The alpha='auto' parameter allows the model to automatically optimize the Dirichlet prior for document-topic distribution.
  7. Visualizing the Topics:
pyLDAvis.display(pyLDAvis.gensim.prepare(m, c, d))
  • This prepares the LDA model for visualization using pyLDAvis and displays the interactive visualization.
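
The dense preprocessing line is easier to follow when it is unpacked into separate steps. The sketch below does this on a tiny made-up text instead of the real book, and uses only the re module. Several things here are assumptions of the sketch and not part of the original code: the sample text, the short STOP_WORDS set (gensim's remove_stopwords has its own, much longer list), checking words in lowercase, and dropping empty words and empty paragraphs.

```python
import re

# A tiny stand-in for the downloaded book (made up for this sketch)
sample_text = (
    "Some header text\r\n\r\n"
    "CHAPTER I.\r\n\r\n"
    "It is a truth universally acknowledged.\r\n\r\n"
    "CHAPTER II.\r\n\r\n"
    "Mr. Bennet was among the earliest.\r\n\r\n"
    "*** END ***"
)

# Hypothetical stop word list, only for illustration
STOP_WORDS = {"it", "is", "a", "was", "the", "among"}

# Step 1: keep only the part between the first chapter heading and the end marker
start = sample_text.index("CHAPTER I.")
end = sample_text.index("*** END ***")
body = sample_text[start:end]

# Step 2: remove the chapter headings
body = re.sub(r"CHAPTER [XIV]+.", "", body)

# Step 3: split the text into paragraphs at blank lines
raw_paragraphs = re.split(r"\r\n\r\n", body)

# Step 4: split each paragraph into words and leave out the stop words
paragraphs = []
for raw in raw_paragraphs:
    words = []
    for word in re.split(r"\W+", raw):
        if word != "" and word.lower() not in STOP_WORDS:
            words.append(word)
    if len(words) > 0:
        paragraphs.append(words)

print(paragraphs)
# [['truth', 'universally', 'acknowledged'], ['Mr', 'Bennet', 'earliest']]
```

Asking the AI to rewrite a one-liner in this step-by-step style is a good follow-up question whenever a piece of code feels too compact to read.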

Summary

This code fetches the text of "Pride and Prejudice," preprocesses it by removing chapter headings and stopwords, splits it into paragraphs, and converts the text into a format suitable for LDA topic modeling. It then trains an LDA model to find 15 topics in the text and visualizes these topics using pyLDAvis.

Let me know if you need more detailed explanations on any specific part!
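
To get a feel for what the dictionary and bag-of-words steps produce, here is a hand-written sketch in plain Python. It does not use gensim and is not how gensim is implemented; the IDs that gensim assigns to words may differ. The small paragraphs list is made up for the example.

```python
paragraphs = [
    ["truth", "universally", "truth"],
    ["universally", "acknowledged"],
]

# Give every distinct word an ID, in order of first appearance
word_to_id = {}
for paragraph in paragraphs:
    for word in paragraph:
        if word not in word_to_id:
            word_to_id[word] = len(word_to_id)

# Count how many times each word ID occurs in each paragraph
corpus = []
for paragraph in paragraphs:
    counts = {}
    for word in paragraph:
        word_id = word_to_id[word]
        if word_id not in counts:
            counts[word_id] = 0
        counts[word_id] += 1
    corpus.append(sorted(counts.items()))

print(word_to_id)  # {'truth': 0, 'universally': 1, 'acknowledged': 2}
print(corpus)      # [[(0, 2), (1, 1)], [(1, 1), (2, 1)]]
```

Each paragraph ends up as a list of (word ID, count) pairs, which is the shape of data described in the corpus step above.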