Text Mining in R - Virtual Classroom

Date: Tuesday 02 July 2024 1.00PM - Wednesday 03 July 2024 5.00PM
Location: Online
CPD: 6.0 hours
RSS Training



Level: Intermediate (I)


Want to learn how to get the most out of text data? Much of the data produced today contains unstructured text, which can be difficult to transform and analyse without the right knowledge and tools. In this course you will learn the basics of manipulating and transforming text data, as well as how to extract meaning and sentiment in R using packages such as {stringr} and {tidytext}.

Please note: Bookings will close 4 working days before the course start date or when the course has reached its maximum capacity.
 


 

Topics Covered

  • Appreciating the benefits of text data
  • Cleaning and extracting text with {stringr} and regular expressions
  • Transforming and mining text with {tidytext}
  • Analysing the sentiment of text
  • Understanding the content of a text with TF-IDF

Learning Outcomes


By the end of the course, participants will be able to:
  • Clean, manipulate, and transform text data using the {stringr} package.
  • Use basic regular expressions to extract and remove patterns in text.
  • Convert unstructured text data into a tidy format suitable for analysis with {tidytext}.
  • Understand basic text mining concepts, such as tokenization and stop words.
  • Create beautiful plots of text data, including word clouds.
  • Analyse the sentiment of a piece of text and compare sentiment across texts and over time.
  • Extract representative words from a text to classify its content.
  • Understand and perform lemmatization and stemming using {textstem}.

Knowledge Assumed

This course assumes basic familiarity with R and the {tidyverse}. We recommend first attending our Introduction to R course if you want to get up to speed for this course!

 

Theo Roe

Theo holds a 1st Class Honours MMathStat in Mathematics & Statistics from Newcastle University. He is the author of many of the Jumping Rivers courses and works with a range of clients.

 

Fees

                                      Registration before    Registration on/after
                                      02 June 2024           02 June 2024

Non Member                            £445.00+VAT            £494.00+VAT
RSS Fellow                            £377.00+VAT            £420.00+VAT
RSS CStat/Gradstat/Data Analyst       £356.00+VAT            £396.00+VAT
(also MIS & FIS)

Group discounts are also available*:

  • 3-5 people: 10% discount
  • 6-8 people: 15% discount
  • 9+ people: 20% discount

*Discount only applies to non-member price

 
 