Posts

Showing posts with the label R

ABC Photo stories term frequency Analysis

Download the ABC dataset ABC Local Online Photo Stories 2009-2014 from the data.gov.au site, where it is available as localphotostories20092014csv.csv. The encoding of this file is reported as unknown-8bit, so open it in Numbers and save it again with UTF-8 encoding (ps.csv in my case). In the Mac terminal, the following command shows the charset of the CSV file:

> file -I localphotostories20092014csv.csv
localphotostories20092014csv.csv: text/plain; charset=unknown-8bit

In RStudio:

> library(tm)
Loading required package: NLP
> ps <- read.csv("data/ps.csv", stringsAsFactors = FALSE)
> vs <- VectorSource(ps$Keywords)
> corpus <- Corpus(vs)

The tm package is well suited to text mining. Install the package if it is not already installed, then load the tm library. VectorSource accepts only character vectors. Now create the corpus from the vector source (vs) created f...
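As a rough illustration of what a term frequency count amounts to, here is a minimal base R sketch. It does not use the tm pipeline from the post; it swaps in strsplit and table instead. The keywords vector is a hypothetical stand-in for ps$Keywords, and the comma separator is an assumption about how that column is formatted.

```r
# Hypothetical stand-in for ps$Keywords; the real values come from ps.csv.
keywords <- c("flood, rain, river", "rain, storm", "river, flood, rain")

# Assumption: terms are comma separated. Split each entry, trim the
# surrounding whitespace, and lower-case every term.
terms <- tolower(trimws(unlist(strsplit(keywords, ","))))

# Count how often each term occurs, most frequent first.
freq <- sort(table(terms), decreasing = TRUE)
print(freq)
```

With the real data, replacing keywords with ps$Keywords should give a comparable table, provided the separator assumption holds.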

R Data manipulation with dplyr

First install and load the package:

> install.packages("dplyr")
> library(dplyr)

dplyr provides five functions: filter, select, mutate, summarise and arrange.

The dataset of the ABC local stations is in the CSV file abc-local-radio.csv, downloaded from the data.gov.au web site. Another ABC dataset from the same site is ABC Local Online Photo Stories 2009-2014, which is available as localphotostories20092014csv.csv. Here are the headings of the first file, radio:

> radio <- read.csv("abc-local-radio.csv")
> names(radio)
 [1] "State"            "Website.URL"      "Station"
 [4] "Town"             "Latitude"         "Longitude"
 [7] "Talkback.number"  "Enquiries.number" "Fax.number"
[10] "Sms.number"       "Street.number"    "Street.suburb"
[13] "Street.postcode"  "PO.box...
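The five functions listed above can be sketched on a small data frame. The rows in radio_demo are hypothetical; only the column names State, Station, Latitude and Longitude come from the names(radio) output. group_by is not one of the five, but it is used here so that summarise has groups to work on.

```r
library(dplyr)

# Hypothetical rows; only the column names are taken from names(radio).
radio_demo <- data.frame(
  State     = c("NSW", "NSW", "VIC", "QLD"),
  Station   = c("Station A", "Station B", "Station C", "Station D"),
  Latitude  = c(-33.9, -32.9, -37.8, -27.5),
  Longitude = c(151.2, 151.8, 145.0, 153.0),
  stringsAsFactors = FALSE
)

nsw <- radio_demo %>%
  filter(State == "NSW") %>%              # keep only the NSW rows
  select(State, Station, Latitude) %>%    # keep three columns
  mutate(AbsLatitude = abs(Latitude)) %>% # add a derived column
  arrange(Latitude)                       # sort by latitude, ascending

# Count the stations in each state.
per_state <- radio_demo %>%
  group_by(State) %>%
  summarise(Stations = n())

print(nsw)
print(per_state)
```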

R Language Basics

CONTENTS
Structures and Types
Matrix basics
Inner and outer products
Arrays
List
Factors
Data Frames
Data Types
Tidy Data
Melt operation
dcast operation

Structures and Types
There are three types of basic structure in R: vector, matrix and array.

Matrix basics
Scalar multiplication is the simplest operation. In R:

> a <- c(1, 2, 3)
> 3 * a
[1] 3 6 9

Inner and outer products
To take the inner product of two vectors, first define them in R:

> a <- c(1, 2, 3)
> b <- c(4, 5, 6)

Then compute the inner product:

> a %*% b
     [,1]
[1,]   32

Here is the vector outer product in R:

> a %o% b
     [,1] [,2] [,3]
[1,]    4    5    6
[2,]    8   10   12
[3,]   12   15   18

In the inner product example, the first two lines define the vectors and the last line performs the inner product operation.

Arrays
Arrays are multidimensional. Here is an array in R:

> c <- array(c(1, 2, 3, 4, 5, 6...
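The inner and outer products described in this post can be checked with a short self-contained sketch. The variable names inner, inner_scalar and outer_ab are introduced here for illustration only.

```r
a <- c(1, 2, 3)
b <- c(4, 5, 6)

# Inner product: for two plain vectors of equal length, %*% returns a
# 1 x 1 matrix holding 1*4 + 2*5 + 3*6 = 32.
inner <- a %*% b

# The same value as a plain number, computed element by element.
inner_scalar <- sum(a * b)

# Outer product: a 3 x 3 matrix whose [i, j] entry is a[i] * b[j].
outer_ab <- a %o% b

print(inner)
print(inner_scalar)
print(outer_ab)
```

Comparing inner with sum(a * b) is a quick way to confirm the result by hand.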