Hi folks! Welcome to the third part of the series, where we extract the reviewer's name, date, rating and review text for the Samsung Galaxy S4. Below is the code that extracts these from the amazon.com and amazon.in review pages and combines them into a new data file.

Code:

    library(RCurl)
    library(XML)
    library(rvest)

    # first review page, the marker that identifies pagination links, and the site root
    init <- "http://www.amazon.in/Samsung-Galaxy-S4-GT-I9500-White/product-reviews/B00CL4HXQC"
    crawlCandicate = "ref=cm_cr_pr_btm_link_"
    base <- "http://www.amazon.in"

    num <- 3                  # number of review pages to fetch
    doclist <- list()         # raw HTML of each fetched page
    anchorlist <- vector()    # links to the following pages
    j <- 0

    while (j < num) {
      if (j == 0) {
        # the first page comes from the initial URL
        doclist[j+1] <- getURL(init)
      } else {
        # later pages come from the links collected so far
        doclist[j+1] <- getURL(paste(base, anchorlist[j+1], sep = ""))
      }
      doc <- htmlParse(doclist[[j+1]])
      anchor <- getNodeSet(doc, "//a")  # capture all the 'a' tags
      anchor <- sapply(anchor, function(x) xmlGetAttr(x, "href"))
      ...
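The link-harvesting step in the loop above can be tried offline before pointing it at a live site. The sketch below is only an illustration under stated assumptions: the HTML snippet and its hrefs are made up (not real Amazon markup), and filtering the hrefs by `crawlCandicate` is a guess at how the pagination links get picked out, since the rest of the loop is not shown here.

```r
library(XML)

# Hypothetical stand-in for one downloaded review page (not real Amazon markup).
html <- '<html><body>
  <a href="/product-reviews/B00CL4HXQC/ref=cm_cr_pr_btm_link_2?pageNumber=2">2</a>
  <a href="/gp/help/customer/display.html">Help</a>
  <a>anchor without an href</a>
</body></html>'

# asText = TRUE tells htmlParse the argument is HTML content, not a file name or URL.
doc <- htmlParse(html, asText = TRUE)

# Select only <a> nodes that carry an href, so xmlGetAttr never returns NULL
# and sapply yields a plain character vector.
hrefs <- sapply(getNodeSet(doc, "//a[@href]"),
                function(x) xmlGetAttr(x, "href"))

# Assumption: pagination links are the hrefs containing this marker string.
crawlCandicate <- "ref=cm_cr_pr_btm_link_"
pageLinks <- hrefs[grepl(crawlCandicate, hrefs, fixed = TRUE)]

# Build absolute URLs the same way the loop does, by prefixing the site root.
base <- "http://www.amazon.in"
nextUrls <- paste(base, pageLinks, sep = "")
print(nextUrls)
```

Using the XPath `//a[@href]` instead of `//a` is a small design choice: anchors without an href would otherwise make `xmlGetAttr` return NULL, and `sapply` would then hand back a list rather than a character vector.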