
NLTK Sentiment Analyzer

This project is a small sentiment analysis tool built with Python and the Natural Language Toolkit (NLTK). It is trained on NLTK's movie review corpus and classifies a piece of text as either "Positive" or "Negative".

Project Overview

The script performs the following steps:

Loads Data: It uses the movie_reviews corpus included with NLTK.
Text Preprocessing: Cleans text by tokenizing, converting to lowercase, and removing stopwords and punctuation.
Feature Extraction: Uses a "Bag of Words" approach based on the 3,000 most frequent words.
Model Training: Trains a Naive Bayes classifier on 95% of the dataset.
Evaluation: Measures the model's accuracy on the held-out 5%.
Prediction: A function is provided to predict the sentiment of any new sentence.
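
The steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of sentiment_analyzer.py: the function names (preprocess, build_vocabulary, extract_features, predict_sentiment) are hypothetical, and to stay runnable without downloading any corpora it assumes a tiny hand-written stopword set and six toy reviews in place of NLTK's stopwords list and the movie_reviews corpus. The 95/5 split and the accuracy check are omitted because they are not meaningful on six examples.

```python
import re
from collections import Counter

from nltk import NaiveBayesClassifier

# Assumption: a tiny stand-in for NLTK's English stopwords list.
STOPWORDS = {"the", "a", "an", "was", "it", "is", "and", "i", "this", "of"}


def preprocess(text):
    """Lowercase, keep only runs of letters (drops punctuation), remove stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_vocabulary(documents, size=3000):
    """Return up to `size` of the most frequent tokens across all documents."""
    counts = Counter(token for tokens, _ in documents for token in tokens)
    return [word for word, _ in counts.most_common(size)]


def extract_features(tokens, vocabulary):
    """Bag of words: one boolean feature per vocabulary word."""
    present = set(tokens)
    return {f"contains({word})": word in present for word in vocabulary}


# Assumption: toy labelled reviews standing in for the movie_reviews corpus.
raw_reviews = [
    ("A fantastic film, the acting was superb.", "Positive"),
    ("Superb story and fantastic cast.", "Positive"),
    ("Wonderful, moving and superb.", "Positive"),
    ("It was boring and slow.", "Negative"),
    ("A boring, awful mess.", "Negative"),
    ("Slow, awful and dull.", "Negative"),
]

documents = [(preprocess(text), label) for text, label in raw_reviews]
vocabulary = build_vocabulary(documents)
featuresets = [
    (extract_features(tokens, vocabulary), label) for tokens, label in documents
]

# Train NLTK's Naive Bayes classifier on the labelled feature dictionaries.
classifier = NaiveBayesClassifier.train(featuresets)


def predict_sentiment(sentence):
    """Classify a new sentence using the same preprocessing and vocabulary."""
    return classifier.classify(extract_features(preprocess(sentence), vocabulary))


print(predict_sentiment("This movie was absolutely fantastic! The acting was superb."))
print(predict_sentiment("I did not like the film. It was boring and slow."))
```

Reusing the training vocabulary at prediction time matters: the classifier only knows the features it saw during training, so words outside the vocabulary are simply ignored.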

Technologies Used

Python 3
NLTK (Natural Language Toolkit)
Pandas
Scikit-learn

How to Run

1. Navigate to the project directory:

cd NLTK-Sentiment-Analyzer

2. Install dependencies:

pip install -r requirements.txt

3. Run the script:

python sentiment_analyzer.py

The first time you run the script, it will automatically download the necessary NLTK data packages.
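
A first-run download of this kind is often written as a check-then-fetch helper. The sketch below is one way it might look, not the script's actual code: the helper name ensure_nltk_data is hypothetical, and the resource list (movie_reviews, stopwords, punkt) is an assumption inferred from the steps described above. Newer NLTK releases may look for punkt_tab instead of punkt.

```python
import nltk

# Assumption: the README does not list the packages; these match the steps it describes.
REQUIRED = {
    "movie_reviews": "corpora/movie_reviews",
    "stopwords": "corpora/stopwords",
    "punkt": "tokenizers/punkt",
}


def ensure_nltk_data(required=REQUIRED):
    """Download each resource only if it is not already on disk.

    Returns the names of the packages that were missing.
    """
    missing = []
    for package, path in required.items():
        try:
            nltk.data.find(path)  # raises LookupError when the resource is absent
        except LookupError:
            missing.append(package)
            nltk.download(package, quiet=True)
    return missing
```

Calling ensure_nltk_data() once at the top of the script keeps later runs fast, since nothing is fetched when the data is already present.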

Sample Output


Accuracy: 85.0%

---
Sentence: "This movie was absolutely fantastic! The acting was superb."
Sentiment: Positive
---
Sentence: "I did not like the film. It was boring and slow."
Sentiment: Negative