Lesson Plan: Introduction to Sentiment Analysis
Duration: 110 minutes
Course: SOCI 280
Learning Objectives
By the end of this lesson, students will:
- Understand and carry out sentiment analysis to detect emotional tone (positive, negative, neutral), and toxicity analysis to identify harmful or aggressive language (e.g., insults, threats)
- Be introduced to concepts in statistical testing for comparing patterns between tweet types and languages (e.g., English vs. Russian)
- Learn how to work with pretrained LLMs, interpret model predictions, and use basic statistical methods to answer questions like:
  - Are propagandist tweets more emotionally charged or toxic than normal political tweets?
  - Do they use different rhetorical strategies in different languages?
  - Can we identify signals that indicate a tweet is part of a disinformation campaign?
Through this analysis, we’ll explore various dimensions of AI applications, critically examining how AI can help us better understand and detect patterns of disinformation when working with large amounts of social data.
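For instructors who want a quick preview of the kind of comparison students will make, the following is a minimal sketch of a two-sided permutation test on toxicity scores from two groups of tweets. It is not taken from the Notebook: the function name `permutation_test`, the variable names, and the scores themselves are all made up for illustration, and the Notebook may use a different test or library.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Hypothetical helper for illustration only; not the Notebook's code.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the group labels by shuffling the pooled scores.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add one to numerator and denominator so the estimate is never exactly 0.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up toxicity scores between 0 and 1, standing in for the output of a
# pretrained classifier. These are NOT real data.
propaganda_scores = [0.62, 0.71, 0.55, 0.80, 0.67, 0.74, 0.59, 0.69]
control_scores = [0.31, 0.42, 0.28, 0.50, 0.36, 0.45, 0.33, 0.39]

observed_diff, p_value = permutation_test(propaganda_scores, control_scores)
print(f"Difference in mean toxicity: {observed_diff:.3f}")
print(f"Permutation p-value: {p_value:.4f}")
```

A small p-value here only says that a difference this large would be unlikely if the group labels were interchangeable. It says nothing about why the groups differ, which is a useful point to raise in discussion.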
Materials and Technical Requirements:
The lesson will be administered through a Jupyter Notebook, hosted on the prAxIs UBC website. Students will need access to a device connected to the internet, preferably a laptop. No previous coding experience is required, though some familiarity with Python is an asset. If students do not have access to a capable device, it is acceptable to pair up into groups of two or three. Group work is encouraged throughout the lesson, and students will be asked to compare their findings with other students.
Pre-lesson Checklist:
- Students have previously completed the reading up to Section 0 (roughly 5-10 minutes).
- The instructor has recently loaded the Notebook and can access and project it if needed.
- Students were reminded to bring a device to class.
Agenda
Pre-discussion and brief lecture based on Notebook reading (5-10 minutes)
Briefly discuss misinformation, disinformation, and propaganda, emphasizing the role of digital platforms in enabling their spread. Briefly explain how researchers can identify disinformation online, and consider how new technology that draws on personal data might make disinformation more influential.
Example: Researchers secretly experimented on Reddit users with AI-generated comments
Load the Notebook (5 minutes)
Have students open the Jupyter Notebook, pairing any students who are unable to access the materials with a student who can. Once all students are able to access the materials, instruct them to complete Section 0.
Section 0 (20 minutes)
Instruct the students to begin working through the code and activities in Notebook Section 0. Once students get to the Screen Time activity, pause for a brief discussion to compare results.
Section 1 (15 minutes)
Before instructing students to complete Notebook Section 1, give a brief (2-3 minute) explanation of classification, connecting it to course materials and previous lessons. Students will complete Notebook Section 1 at their own pace, taking roughly 15 minutes.
Sections 3-4 (20 minutes)
After pausing for a moment to discuss the findings of Section 1 in a large group, students will complete the rest of the Notebook.
Takeaways and Activity (5-10 minutes)
The last section of this lesson is time-dependent. It is reserved for questions about the Notebook methods, a brief lecture outlining the key takeaways from the Notebook, and time for students to get started on any participation activities such as discussion posts or worksheets (samples below).
Activity Materials
Discussion Post
Respond to the following questions in 100-250 words each:
- How can sentiment analysis be useful in answering research questions? Can you think of any tasks it would be well suited to?
- Do you think the methods in the notebook were useful in understanding disinformation campaigns? What might be some of the limitations to these approaches?
- Discuss the role AI plays in disinformation, both the detection and analysis of it, and the production of it.
Short Response:
On a social media platform of your choice, try to identify a post you believe to be disinformation. You can search for specific topics that you think will likely produce disinformation or wait for something to come across in your feed. Link to the content here: ____________________________________________________________________
Then, respond to the following questions:
Why do you think this is disinformation? What data/information are you using or pulling out from the content and its features to come to this conclusion? Explain why you think this is disinformation in 150-300 words.
Think back to the classifier in the notebook. What data was it using to classify text as disinformation? Is that process similar to or different from how you classified your post as disinformation? Do you think you are more likely than the classifier to be correct in identifying disinformation? If so, why? Respond in 200-350 words.