6.8.3 Project: Text Sentiment Analysis
Section Overview
Sentiment analysis is a good first NLP project because the hard parts are visible: label boundaries, tokenization, negation, sarcasm, mixed sentiment, and error analysis.
Learning Objectives
- Define sentiment labels before choosing a model.
- Build an interpretable keyword baseline.
- Improve one known error type with a simple negation rule.
- Turn wrong predictions into error buckets.
- Package a small NLP project as a reproducible deliverable.
See the Project Loop First

label boundary -> baseline -> predictions -> error buckets -> targeted upgrade
Start with binary labels:
- positive: clearly recommends, praises, or expresses satisfaction.
- negative: clearly complains, rejects, or expresses dissatisfaction.
Do not begin with extra labels such as neutral, mixed, irony, or unclear until the basic loop is stable.
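One way to make the boundary concrete is to write the label rules down as data next to the code, so annotators and the pipeline share one definition. A minimal sketch (the LABEL_RULES name and example phrases are illustrative, not a real annotation guide):

# Label rules kept as data, so the definition lives in one place.
LABEL_RULES = {
    "positive": {
        "rule": "clearly recommends, praises, or expresses satisfaction",
        "examples": ["recommended and systematic course"],
    },
    "negative": {
        "rule": "clearly complains, rejects, or expresses dissatisfaction",
        "examples": ["messy confusing and too fast"],
    },
}

for label, spec in LABEL_RULES.items():
    print(label, "->", spec["rule"])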
Lab: Keyword Baseline and Negation Fix
Create sentiment_project_baseline.py:
from collections import Counter

def tokenize(text):
    # Lowercase and strip basic punctuation before splitting on whitespace.
    text = text.lower()
    for ch in ",.!?":
        text = text.replace(ch, "")
    return text.split()

train = [
    ("clear examples and practical pace", "positive"),
    ("recommended and systematic course", "positive"),
    ("messy confusing and too fast", "negative"),
    ("unclear examples and weak structure", "negative"),
]

val = [
    ("clear and practical course", "positive"),
    ("messy and confusing pace", "negative"),
    ("not recommended", "negative"),
]

positive_words = Counter()
negative_words = Counter()
for text, label in train:
    if label == "positive":
        positive_words.update(tokenize(text))
    else:
        negative_words.update(tokenize(text))

# Manually boost two strong cue words so the toy counts separate cleanly.
positive_words.update(["recommended"] * 2)
negative_words.update(["messy"] * 2)

def predict(text):
    # Sum per-token evidence; ties default to positive.
    score = sum(positive_words[t] - negative_words[t] for t in tokenize(text))
    return ("positive" if score >= 0 else "negative"), score

def predict_with_negation(text):
    # Flip the sign of the next sentiment-bearing token after a negator.
    score = 0
    flip = False
    for token in tokenize(text):
        if token in {"not", "no", "never"}:
            flip = True
            continue
        token_score = positive_words[token] - negative_words[token]
        if flip and token_score != 0:
            token_score *= -1
            flip = False
        score += token_score
    return ("positive" if score >= 0 else "negative"), score

print("sentiment_baseline")
for text, gold in val:
    pred, score = predict(text)
    print({"gold": gold, "pred": pred, "score": score, "text": text})

print("with_negation")
for text, gold in val:
    pred, score = predict_with_negation(text)
    print({"gold": gold, "pred": pred, "score": score, "text": text})
Run it:
python sentiment_project_baseline.py
Expected output:
sentiment_baseline
{'gold': 'positive', 'pred': 'positive', 'score': 3, 'text': 'clear and practical course'}
{'gold': 'negative', 'pred': 'negative', 'score': -3, 'text': 'messy and confusing pace'}
{'gold': 'negative', 'pred': 'positive', 'score': 3, 'text': 'not recommended'}
with_negation
{'gold': 'positive', 'pred': 'positive', 'score': 3, 'text': 'clear and practical course'}
{'gold': 'negative', 'pred': 'negative', 'score': -3, 'text': 'messy and confusing pace'}
{'gold': 'negative', 'pred': 'negative', 'score': -3, 'text': 'not recommended'}
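The third row is worth tracing by hand: in "not recommended", the token recommended has a positive count of 3 (one training occurrence plus the manual boost of 2) and a negative count of 0. The plain baseline therefore scores the sentence +3 and predicts positive; the negation rule flips that single token score to -3 and recovers the correct label.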
What this teaches:
- the baseline is explainable because every token changes the score;
- "not recommended" fails before the negation rule;
- a targeted rule fixes one error type without pretending to solve all language understanding.
Error Buckets
Wrong cases should be grouped by type, not hidden.
error_buckets = {
    "negation": [],
    "sarcasm": [],
    "mixed_sentiment": [],
    "other": [],
}

# (text, gold label, wrong prediction) triples from the baseline.
examples = [
    ("Not recommended for this course", "negative", "positive"),
    ("Great, it got stuck again", "negative", "positive"),
    ("The content is great, but the pace is too fast", "negative", "positive"),
]

for text, gold, pred in examples:
    lower = text.lower()
    # Coarse substring checks are enough to route these toy examples.
    if "not" in lower:
        error_buckets["negation"].append(text)
    elif "great" in lower and "again" in lower:
        error_buckets["sarcasm"].append(text)
    elif "but" in lower:
        error_buckets["mixed_sentiment"].append(text)
    else:
        error_buckets["other"].append(text)

for name, rows in error_buckets.items():
    print(name, len(rows), rows)
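Running the loop prints one line per bucket, in the order the dict defines them:
negation 1 ['Not recommended for this course']
sarcasm 1 ['Great, it got stuck again']
mixed_sentiment 1 ['The content is great, but the pace is too fast']
other 0 []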
This is project evidence. It shows what the model fails at and what you would improve next.
Upgrade Path
| Version | What to add | Why |
|---|---|---|
| rule baseline | keyword counts and negation rule | explainable starting point |
| traditional ML | TF-IDF + LogisticRegression (see the sketch after this table) | stronger baseline with low cost |
| neural baseline | embedding + pooling or small Transformer | learn representation features |
| portfolio version | error buckets, comparison table, demo command | shows engineering judgment |
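The traditional ML row is a small step up from the rules. A minimal sketch, assuming scikit-learn is available (the four training texts reuse the lab's toy data, so any score on them is meaningless; swap in a real dataset before comparing versions):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training texts from the lab; replace with a real labeled dataset.
train_texts = [
    "clear examples and practical pace",
    "recommended and systematic course",
    "messy confusing and too fast",
    "unclear examples and weak structure",
]
train_labels = ["positive", "positive", "negative", "negative"]

# TF-IDF features feeding a linear classifier; coefficients stay inspectable.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)
print(model.predict(["clear and practical course"]))

The pipeline keeps the same interface as the rule baseline, text in and label out, so the comparison table and error buckets carry over unchanged.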
What to Show in the README
Keep the README concrete:
- label definitions;
- dataset source and split;
- run command;
- baseline comparison table;
- error buckets;
- examples the model gets right and wrong;
- next-step plan.
Common Mistakes
| Mistake | Fix |
|---|---|
| labels are vague | write label rules before training |
| only reporting accuracy | include error buckets and examples |
| ignoring negation | test not, never, and no cases |
| adding a deep model too early | keep a rule or TF-IDF baseline |
| hiding sarcasm/mixed sentiment errors | document them as known limitations |
Exercises
- Add "not clear" and "never useful" to the validation examples.
- Add an other bucket example that your rules cannot classify.
- Replace keyword counts with TF-IDF in your project plan.
- Write a label rule for neutral, but do not add it to the model yet.
- Create a README outline for this project.
Key Takeaways
- Sentiment projects live or die by label boundaries and error analysis.
- Simple baselines are useful because they are explainable.
- Negation is a classic first failure type.
- Error buckets make the project stronger than a single accuracy score.