Subsetting and Combining Datasets
In this post we look at some programs that subset data and that combine several datasets into a single one.
Subsetting in a SAS DATA step means selecting observations from a dataset by defining selection criteria with a WHERE statement, an IF/ELSE statement or a SELECT statement, among others.
Let's look at some questions to get a better understanding.
Q.
Code:
Code explanation:
1) We create a new dataset A35_a in library A15035.
2) We read the observations from the blood dataset in library A15035.
3) We subset the data with a WHERE statement, keeping the observations where gender is female and bloodtype is AB.
4) We create the new variable combined, computing its value as specified in the question.
5) We then use PROC PRINT to print the observations from dataset A35_a.
6) In the next part we create a new dataset A35_b in library A15035.
7) We again read the observations from the blood dataset in library A15035.
8) We create the new variable combined in the same way.
9) We subset the data with an IF statement, keeping the observations where gender is female, bloodtype is AB and the combined value is 14. IF is needed here because WHERE can only test variables that already exist in the input dataset, and combined is created inside this DATA step.
10) We then use PROC PRINT to print the observations from dataset A35_b.
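The steps above can be sketched roughly as follows. This is a minimal illustration and not the original program: the inline blood data, the variable names (Subject, Gender, BloodType, WBC, RBC), the formula for Combined and the "ge 14" comparison are all assumptions, and the datasets are written to the WORK library instead of A15035 so the sketch runs on its own.

```sas
/* Hypothetical stand-in for A15035.blood; names and values are made up */
data blood;
   input Subject Gender :$6. BloodType :$2. WBC RBC;
   datalines;
1 Female AB 7710 7.40
2 Male AB 6560 4.70
3 Female A 5690 7.53
4 Female AB 5210 5.90
;
run;

/* WHERE filters observations as they are read, before Combined exists */
data A35_a;
   set blood;
   where Gender = 'Female' and BloodType = 'AB';
   Combined = 0.001*WBC + RBC;   /* assumed formula */
run;

/* The IF statement runs after Combined is computed, so it can test it */
data A35_b;
   set blood;
   Combined = 0.001*WBC + RBC;   /* assumed formula */
   if Gender = 'Female' and BloodType = 'AB' and Combined ge 14;   /* assumed comparison */
run;

proc print data=A35_a noobs;
run;

proc print data=A35_b noobs;
run;
```

With this made-up data, A35_a keeps subjects 1 and 4, while A35_b keeps only subject 1, because subject 4 has a Combined value below 14.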
There are two outputs, one for A35_a and the other for A35_b.
Output:
Q.
Code explanation:
1) We create a new dataset A35_Monday2002 in library A15035.
2) We read the observations from the hosp dataset in library A15035.
3) We use WHERE to check two conditions built on date functions. The WEEKDAY function returns 1 for Sunday through 7 for Saturday, so Monday corresponds to 2. The YEAR function extracts the year, which must be 2002. An observation is kept only if both conditions are true.
4) YRDIF computes the difference in years between DOB and the admit date, because we want the age as of the admit date.
The third argument, 'Actual', tells SAS to use the actual number of days in each month and year, so leap years are taken into account. If required, a different basis such as '30/360', which treats every month as 30 days, can be specified instead.
5) PROC PRINT prints the observations. The NOOBS option omits the Obs column.
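A rough sketch of these steps is shown below. The inline hosp data and the variable names (Subject, DOB, AdmitDate, Age) are assumptions made for illustration, and the datasets live in WORK instead of A15035.

```sas
/* Hypothetical stand-in for A15035.hosp; names and values are made up */
data hosp;
   input Subject DOB :mmddyy10. AdmitDate :mmddyy10.;
   format DOB AdmitDate mmddyy10.;
   datalines;
1 03/15/1960 01/07/2002
2 11/02/1975 01/08/2002
3 06/30/1948 01/06/2003
;
run;

data A35_Monday2002;
   set hosp;
   /* WEEKDAY returns 2 for Monday; YEAR extracts the four-digit year */
   where weekday(AdmitDate) = 2 and year(AdmitDate) = 2002;
   /* Age in years as of the admit date, using actual day counts */
   Age = yrdif(DOB, AdmitDate, 'Actual');
run;

proc print data=A35_Monday2002 noobs;
run;
```

Only subject 1 is kept here: subject 2 was admitted on a Tuesday, and subject 3 was admitted on a Monday but in 2003.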
Q.
1) We create two datasets, A35_Mountain_USA and A35_Road_France.
2) We read the observations from the bicycles dataset.
3) We use IF/ELSE to route the observations: if the country is USA and the model is Mountain, the observation is written to A35_Mountain_USA.
4) Otherwise, if the country is France and the model is Road Bike, the observation is written to A35_Road_France.
5) Finally, we print both datasets with PROC PRINT.
We get two outputs: A35_Mountain_USA, where the country is USA and the model is Mountain, and A35_Road_France, where the country is France and the model is Road Bike. I have shown the output of the latter for ready reference.
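Writing to two datasets from one DATA step could look roughly like this. The inline bicycles data, the variable names (Country, Model, Units) and the exact Model spellings are assumptions, and the datasets are created in WORK instead of A15035.

```sas
/* Hypothetical stand-in for the bicycles dataset; values are made up */
data bicycles;
   infile datalines dlm=',';
   input Country :$6. Model :$10. Units;
   datalines;
USA,Mountain,120
France,Road Bike,80
USA,Road Bike,60
France,Mountain,45
;
run;

/* Naming two datasets on the DATA statement lets OUTPUT route each row */
data A35_Mountain_USA A35_Road_France;
   set bicycles;
   if Country = 'USA' and Model = 'Mountain' then output A35_Mountain_USA;
   else if Country = 'France' and Model = 'Road Bike' then output A35_Road_France;
run;

proc print data=A35_Mountain_USA noobs;
run;

proc print data=A35_Road_France noobs;
run;
```

Observations that match neither condition (the third and fourth rows here) are not written to either dataset, because an explicit OUTPUT statement turns off the automatic output at the end of the DATA step.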
Q.
Let's understand the code:
1) We print the observations of the inventory dataset in library A15035.
2) We print the observations of the newproducts dataset in library A15035.
3) We create a new dataset, updated, with a SET statement that names both datasets. SAS appends the observations of the second dataset to those of the first to form a single dataset. This is called concatenation. The datasets do not need to share a variable: a variable that is present in only one of them is set to missing for the observations that come from the other.
4) Finally, we print the observations, showing the price for each model.
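As a minimal sketch of the concatenation step, assuming both datasets hold a Model and a Price variable (the inline data and variable names are made up, and the datasets are created in WORK instead of A15035):

```sas
/* Hypothetical stand-ins for inventory and newproducts */
data inventory;
   input Model :$8. Price;
   datalines;
M567 23.50
S888 12.99
;
run;

data newproducts;
   input Model :$8. Price;
   datalines;
L939 10.99
M135 0.75
;
run;

/* Listing both datasets on SET stacks newproducts below inventory */
data updated;
   set inventory newproducts;
run;

proc print data=updated noobs;
   var Model Price;
run;
```

The resulting dataset has four observations, with the two inventory rows first and the two newproducts rows after them.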
Learning:
We learnt how to combine several datasets into one single dataset.
Subsetting a SAS data set involves selecting observations from one data set by defining selection criteria, usually in a WHERE or subsetting IF statement.