Project Based Text Mining in Python

Use of Natural Language Processing, Machine Learning and Sentiment Analysis towards Data Science


What will students learn in your course?

In this course, students learn the basics of text mining and build on them to perform document categorization (classification), document grouping (clustering) and subjectivity (sentiment) analysis.

The code is written in Python, and Natural Language Processing (NLP) is used to pre-process the textual data.

We will learn about structuring textual data using different representation schemes and tuning their parameters.

Starting from a very small dummy dataset, we move on to existing datasets to build models and then validate and evaluate them.

We will learn about scraping data from the web and converting it into a dataset.

Sentiment analysis of user hotel reviews

Information extraction from raw documents
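As a rough illustration of what "structuring textual data using different representation schemes and tuning their parameters" can look like, here is a minimal sketch using only the Python standard library. It is not taken from the course material: the function names, the three sample sentences and the `min_df` parameter are assumptions made for this example, and the weighting is a plain, unsmoothed TF-IDF (term count times log of the inverse document frequency).

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(docs, min_df=1):
    """Return the sorted terms that occur in at least min_df documents,
    together with the document frequency of every term."""
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))  # count each term once per document
    terms = sorted(t for t, n in df.items() if n >= min_df)
    return terms, df


def tfidf_vectors(docs, min_df=1):
    """Turn each document into a vector of unsmoothed TF-IDF weights."""
    terms, df = build_vocabulary(docs, min_df)
    n_docs = len(docs)
    vectors = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        vectors.append([counts[t] * math.log(n_docs / df[t]) for t in terms])
    return terms, vectors


# Hypothetical toy corpus, standing in for the "very small dummy dataset".
docs = [
    "The hotel room was clean",
    "The hotel staff was rude",
    "Clean room and friendly staff",
]

# min_df is the tunable parameter here: terms seen in fewer than
# two documents are dropped from the vocabulary.
terms, vectors = tfidf_vectors(docs, min_df=2)
print(terms)
for vec in vectors:
    print([round(w, 3) for w in vec])
```

Raising `min_df` shrinks the vocabulary and so the length of every vector, which is the kind of parameter tuning the outline refers to. A library vectorizer would normally replace this hand-written version in practice.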

Are there any course requirements or prerequisites?

Basics of programming (any language; Python is a bonus)

Basic understanding of Machine Learning

Ability to write code with lists, loops and conditionals, and a basic understanding of how models learn patterns from data

Who are your target students?

Beginners in Python who are curious about data science

Those who know Python programming and the basic concepts of data science but cannot yet connect the two in practice

Course Description

In this course, we study the basics of text mining.

  1. We cover the basic operations for structuring unstructured data into vectors and for reading different types of data from public archives.
  2. Building on this, we use Natural Language Processing to pre-process our dataset.
  3. Machine Learning techniques are used for document classification and clustering, and the resulting models are evaluated.
  4. Information extraction is covered with the help of topic modeling.
  5. Sentiment analysis is covered with both a classifier-based and a dictionary-based approach.
  6. Almost all modules come with assignments for practice.
  7. Two projects are given that combine most of the topics covered separately in these modules.
  8. Finally, a list of possible project suggestions is given for students to choose from and build their own project.
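To give a feel for the dictionary-based approach to sentiment analysis mentioned in the list above, here is a minimal sketch. It is an illustration only, not the course's implementation: the word lists are a hypothetical toy lexicon, the function names are made up for this example, and negation handling is deliberately simplistic (a negation word flips only the word immediately after it).

```python
import re

# Hypothetical toy lexicon; a real project would use a much larger dictionary.
POSITIVE = {"clean", "friendly", "great", "comfortable", "helpful"}
NEGATIVE = {"dirty", "rude", "noisy", "broken", "slow"}
NEGATIONS = {"not", "no", "never"}


def sentiment_score(review):
    """Sum +1 for each positive word and -1 for each negative word.
    A negation word flips the polarity of the very next token only."""
    tokens = re.findall(r"[a-z']+", review.lower())
    score = 0
    negate = False
    for tok in tokens:
        if tok in NEGATIONS:
            negate = True
            continue
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)  # 1, -1 or 0
        if polarity:
            score += -polarity if negate else polarity
        negate = False  # the negation applies to one following token only
    return score


def label(review):
    """Map the numeric score to a coarse sentiment label."""
    score = sentiment_score(review)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for review in [
    "Clean room and friendly staff",
    "The staff was rude and the room was dirty",
    "The room was not clean",
]:
    print(review, "->", label(review))
```

A classifier-based approach would instead learn word weights from labeled reviews, which avoids maintaining the lexicon by hand but requires training data.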

Your Instructor

Taimoor Kahn, PhD

I have been a researcher and academic since 2011 and have around 3 years of professional software development experience. As an Assistant Professor in the Computer Science faculty, I have taught various courses to undergraduate and graduate students. I am particularly interested in courses related to software design and development, databases, artificial intelligence, machine learning and data mining.

My PhD research is in data science and computational linguistics. I have worked with large-scale textual data to build knowledge-based systems that adapt and evolve with growing needs without having to be explicitly trained for a specific scenario. I have published papers in internationally recognized journals and conferences, where we proposed solutions to real-world data analysis issues. I have supervised tens of projects that offered software-based solutions for social content analytics, recommendations and tracking evolving public interests.

Course Curriculum

Frequently Asked Questions

When does the course start and finish?
The course starts now and never ends! It is a completely self-paced online course - you decide when you start and when you finish.
How long do I have access to the course?
How does lifetime access sound? After enrolling, you have unlimited access to this course for as long as you like - across any and all devices you own.
What if I am unhappy with the course?
We would never want you to be unhappy! If you are unsatisfied with your purchase, contact us in the first 30 days and we will give you a full refund.
