06 - Pre-Process Text Data for a Naive Bayes Classifier to Filter Spam Emails Part 1:
00:00:00 __ 001 How to Translate a Business Problem into a Machine Learning Problem
00:07:26 __ 002 Gathering Email Data and Working with Archives & Text Editors
00:18:39 __ 003 How to Add the Lesson Resources to the Project
00:22:20 __ 004 The Naive Bayes Algorithm and the Decision Boundary for a Classifier
00:27:22 __ 005 Basic Probability
00:31:57 __ 006 Joint & Conditional Probability
00:48:34 __ 007 Bayes Theorem
01:00:50 __ 008 Reading Files (Part 1) Absolute Paths and Relative Paths
01:10:22 __ 009 Reading Files (Part 2) Stream Objects and Email Structure
01:22:45 __ 010 Extracting the Text in the Email Body
01:28:12 __ 011 Python - Generator Functions & the yield Keyword
01:48:15 __ 012 Create a Pandas DataFrame of Email Bodies
01:54:25 __ 013 Cleaning Data (Part 1) Check for Empty Emails & Null Entries
02:10:08 __ 014 Cleaning Data (Part 2) Working with a DataFrame Index
02:18:24 __ 015 Saving a JSON File with Pandas
02:24:30 __ 016 Data Visualisation (Part 1) Pie Charts
02:38:23 __ 017 Data Visualisation (Part 2) Donut Charts
02:46:27 __ 018 Introduction to Natural Language Processing (NLP)
02:54:13 __ 019 Tokenizing Removing Stop Words and the Python Set Data Structure
03:10:04 __ 020 Word Stemming & Removing Punctuation
03:19:34 __ 021 Removing HTML tags with BeautifulSoup
03:29:14 __ 022 Creating a Function for Text Processing
03:36:54 __ 024 Advanced Subsetting on DataFrames the apply() Function
03:49:10 __ 025 Python - Logical Operators to Create Subsets and Indices
04:01:57 __ 026 Word Clouds & How to install Additional Python Packages
04:11:11 __ 027 Creating your First Word Cloud
04:23:03 __ 028 Styling the Word Cloud with a Mask
04:38:29 __ 029 Solving the Hamlet Challenge
04:45:14 __ 030 Styling Word Clouds with Custom Fonts
04:58:07 __ 031 Create the Vocabulary for the Spam Classifier
05:13:48 __ 032 Coding Challenge Check for Membership in a Collection
05:19:03 __ 033 Coding Challenge Find the Longest Email
05:26:20 __ 034 Sparse Matrix (Part 1) Split the Training and Testing Data
05:39:37 __ 035 Sparse Matrix (Part 2) Data Munging with Nested Loops
06:00:57 __ 036 Sparse Matrix (Part 3) Using groupby() and Saving .txt Files
06:11:50 __ 037 Coding Challenge Solution Preparing the Test Data
06:16:30 __ 038 Checkpoint Understanding the Data