Introduction to R with Time Series Analysis Training Course

Duration

21 hours (usually 3 days including breaks)

Overview

R is a free, open-source programming language for statistical computing, data analysis, and graphics. It is used by a growing number of managers and data analysts in corporations and academia, and it offers a wide variety of packages for data mining.

Course Outline

Introduction and preliminaries

• Making R more friendly, R and available GUIs
• RStudio
• Related software and documentation
• R and statistics
• Using R interactively
• An introductory session
• Getting help with functions and features
• R commands, case sensitivity, etc.
• Recall and correction of previous commands
• Executing commands from or diverting output to a file
• Data permanency and removing objects

Simple manipulations; numbers and vectors

• Vectors and assignment
• Vector arithmetic
• Generating regular sequences
• Logical vectors
• Missing values
• Character vectors
• Index vectors; selecting and modifying subsets of a data set
• Other types of objects
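
As a small illustration of the vector topics above, here is a minimal sketch in base R. The values and variable names are made up for this example and are not taken from the course material.

```r
# Illustrative values only; not taken from the course material.
x <- c(10.4, 5.6, NA, 6.4, 21.7)   # numeric vector with one missing value

y <- x * 2 + 1                     # vector arithmetic is element-wise; NA propagates
idx <- seq(1, 5, by = 2)           # regular sequence: 1, 3, 5
big <- x > 6                       # logical vector (NA where x is NA)

x_clean <- x[!is.na(x)]            # a logical index vector drops the missing value
x_clean[x_clean > 20] <- 20        # modify a subset in place (cap values at 20)

mean_x <- mean(x, na.rm = TRUE)    # ignore NA when averaging
```

Indexing with a logical vector, as in `x[!is.na(x)]`, is one of several index-vector forms; positive integers, negative integers, and names work in a similar way.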

Objects, their modes and attributes

• Intrinsic attributes: mode and length
• Changing the length of an object
• Getting and setting attributes
• The class of an object

Arrays and matrices

• Arrays
• Array indexing. Subsections of an array
• Index matrices
• The array() function
• The outer product of two arrays
• Generalized transpose of an array
• Matrix facilities
• Matrix multiplication
• Linear equations and inversion
• Eigenvalues and eigenvectors
• Singular value decomposition and determinants
• Least squares fitting and the QR decomposition
• Forming partitioned matrices, cbind() and rbind()
• The concatenation function, c(), with arrays
• Frequency tables from factors
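
Several of the matrix facilities listed above can be sketched in a few lines of base R. The matrix `A`, the vector `b`, and the factor `grp` below are arbitrary values chosen for illustration.

```r
# Arbitrary example data; names are chosen for this sketch only.
A <- matrix(c(2, 1, 1, 3), nrow = 2)   # 2 x 2 matrix, filled column by column
b <- c(1, 2)

x <- solve(A, b)        # solve the linear system A %*% x = b
A_inv <- solve(A)       # matrix inverse
prod_check <- A %*% x   # matrix multiplication; should reproduce b

ev <- eigen(A)          # eigenvalues and eigenvectors
sv <- svd(A)            # singular value decomposition
d <- det(A)             # determinant

op <- outer(1:3, 1:2)   # outer product: a 3 x 2 array
Ab <- cbind(A, b)       # partitioned matrix: b appended as a third column

grp <- factor(c("a", "b", "a", "c", "a"))
freq <- table(grp)      # frequency table from a factor
```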

Lists and data frames

• Lists
• Constructing and modifying lists
• Concatenating lists
• Data frames
• Making data frames
• attach() and detach()
• Working with data frames
• Attaching arbitrary lists
• Managing the search path

Data manipulation

• Selecting and subsetting observations and variables
• Filtering, grouping
• Recoding, transformations
• Aggregation, combining data sets
• Character manipulation, stringr package
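
The data manipulation steps above might look roughly like this in base R. The `sales` and `managers` data frames are hypothetical, and base functions such as `toupper()` stand in here for the stringr package named in the outline, so that the sketch runs without extra packages.

```r
# Hypothetical data frames, invented for this sketch.
sales <- data.frame(
  region = c("north", "south", "north", "south", "east"),
  units  = c(12, 7, 30, 15, 9),
  stringsAsFactors = FALSE
)

# Selecting and filtering: rows with more than 10 units, two columns
large <- subset(sales, units > 10, select = c(region, units))

# Recoding: derive a categorical variable from a numeric one
sales$size <- ifelse(sales$units >= 15, "high", "low")

# Grouping and aggregation: total units per region
totals <- aggregate(units ~ region, data = sales, FUN = sum)

# Combining data sets: left join on region
managers <- data.frame(region = c("north", "south"),
                       manager = c("A", "B"),
                       stringsAsFactors = FALSE)
merged <- merge(sales, managers, by = "region", all.x = TRUE)

# Character manipulation with base R
sales$label <- toupper(substr(sales$region, 1, 1))
```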

Data import and export

• TXT files
• CSV files
• XLS, XLSX files
• Data in SPSS, SAS, Stata, and other formats
• Exporting data to TXT, CSV, and other formats
• Accessing data in databases using SQL

Probability distributions

• R as a set of statistical tables
• Examining the distribution of a set of data
• One- and two-sample tests
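
To give a flavour of these topics, the sketch below uses R's built-in distribution functions in place of printed statistical tables and runs one- and two-sample t-tests on simulated data. The samples `a` and `b` are random draws generated for this example only.

```r
# Distribution functions replace printed statistical tables
p <- pnorm(1.96)     # P(Z <= 1.96) for a standard normal, about 0.975
q <- qnorm(0.975)    # the matching quantile, about 1.96

# Simulated samples, used only for illustration
set.seed(1)
a <- rnorm(30, mean = 5)
b <- rnorm(30, mean = 5.5)

# Examining the distribution of a sample
five_num <- fivenum(a)       # minimum, hinges, median, maximum
normality <- shapiro.test(a) # Shapiro-Wilk test of normality

# One- and two-sample tests
one_sample <- t.test(a, mu = 5)
two_sample <- t.test(a, b)   # Welch two-sample t-test by default
```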

Grouping, loops and conditional execution

• Grouped expressions
• Control statements
• Conditional execution: if statements
• Repetitive execution: for loops, repeat and while
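
A minimal sketch of the control statements listed above, with arbitrary input values:

```r
x <- c(3, -1, 4, -1, 5)   # arbitrary example values

# for loop with conditional execution: sum only the non-negative entries
total <- 0
for (v in x) {
  if (v < 0) {
    next                  # skip negative values
  }
  total <- total + v
}

# while loop: smallest n with 2^n >= 100
n <- 0
while (2^n < 100) {
  n <- n + 1
}

# repeat loop with an explicit break: halve k until it reaches 1
k <- 10
repeat {
  k <- k %/% 2
  if (k <= 1) break
}
```

In practice many such loops can be replaced by vectorised expressions, for example `sum(x[x >= 0])` for the first loop.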

Writing your own functions

• Simple examples
• Defining new binary operators
• Named arguments and defaults
• The ‘…’ argument
• Assignments within functions
• Efficiency factors in block designs
• Dropping all names in a printed array
• Recursive numerical integration
• Scope
• Customizing the environment
• Classes, generic functions and object orientation
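
The following sketch touches on several of the function topics above. All function names (`%+%`, `scale_by`, `bump`, `describe`) are hypothetical and exist only for this illustration.

```r
# Defining a new binary operator
"%+%" <- function(a, b) paste(a, b)
joined <- "time" %+% "series"

# Named arguments, a default value, and the '...' argument,
# which is passed on unchanged to round()
scale_by <- function(x, factor = 2, ...) {
  round(x * factor, ...)
}
s1 <- scale_by(1.234)                          # uses the default factor
s2 <- scale_by(1.234, factor = 3, digits = 1)  # digits travels through '...'

# Assignments within functions are local
counter <- 0
bump <- function() {
  counter <- counter + 1   # creates a local 'counter'; the outer one is untouched
  counter
}
bumped <- bump()

# A generic function with S3 methods
describe <- function(obj, ...) UseMethod("describe")
describe.default <- function(obj, ...) "generic object"
describe.numeric <- function(obj, ...) {
  paste("numeric vector of length", length(obj))
}
d_num <- describe(c(1, 2, 3))
d_chr <- describe("a")
```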

Graphical procedures

• High-level plotting commands
• The plot() function
• Displaying multivariate data
• Display graphics
• Arguments to high-level plotting functions
• Basic visualisation graphs
• Multivariate relations with the lattice and ggplot2 packages
• Using graphics parameters
• Graphics parameters list

Time series forecasting

• Moving average
• Exponential smoothing
• Extrapolation
• Linear prediction
• Trend estimation
• Stationarity and ARIMA modelling
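
As a rough sketch of how these forecasting topics map onto base R, the example below uses the built-in `AirPassengers` data set. The choice of data set, the 12-month window, and the seasonal ARIMA orders are assumptions made for illustration, not the course's own worked example.

```r
y <- AirPassengers   # monthly airline passengers, shipped with R

# Moving average: trailing 12-month mean (the first 11 values are NA)
ma12 <- stats::filter(y, rep(1 / 12, 12), sides = 1)

# Simple exponential smoothing: Holt-Winters without trend or seasonality
ses <- HoltWinters(y, beta = FALSE, gamma = FALSE)

# Trend estimation: linear trend on the log scale
trend <- lm(log(y) ~ time(y))

# Differencing the log series removes the trend, a common step towards stationarity
d <- diff(log(y))

# Seasonal ARIMA on the log scale (orders assumed for this sketch)
fit <- arima(log(y), order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))

# Forecast 12 months ahead and transform back to the original scale
fc <- predict(fit, n.ahead = 12)
forecast_12 <- exp(fc$pred)
```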

Econometric methods (causal methods)

• Regression analysis
• Multiple linear regression
• Multiple non-linear regression
• Regression validation
• Forecasting from regression
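
The regression topics above could be sketched as follows with the built-in `mtcars` data set. The choice of variables, the exponential model form, and the train/test split are assumptions made for this example; the non-linear fit uses a single predictor to keep it short.

```r
# Multiple linear regression: fuel economy from weight and horsepower
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Non-linear regression: exponential decay in weight, with starting
# values taken from a log-linear fit
log_fit <- lm(log(mpg) ~ wt, data = mtcars)
nl_fit <- nls(mpg ~ a * exp(b * wt), data = mtcars,
              start = list(a = exp(coef(log_fit)[[1]]),
                           b = coef(log_fit)[[2]]))

# Regression validation: fit on the first 24 rows, score on the remaining 8
train <- mtcars[1:24, ]
test <- mtcars[25:32, ]
fit_train <- lm(mpg ~ wt + hp, data = train)
rmse <- sqrt(mean((test$mpg - predict(fit_train, newdata = test))^2))

# Forecasting from regression: point prediction with a prediction interval
new_car <- data.frame(wt = 3, hp = 110)   # hypothetical new observation
pred <- predict(fit, newdata = new_car, interval = "prediction")
```

Calling `summary(fit)` and `plot(fit)` would give the usual coefficient table and residual diagnostics for a fuller validation.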