SASnR Home

Create Sample Data


  • This post shows how to create a small sample dataset inline, within the program itself, using SAS and the R Tidyverse.
SAS
data CLASS;
infile datalines dlm='|' dsd missover;
input Name : $8. Sex : $1. Age : best32. Height : best32. Weight : best32.;
datalines;
Alfred|M|14|69|112.5
Alice|F|13|56.5|84
Barbara|F|13|65.3|98
;
run;

SAS code description :

  • The SAS code snippet creates a dataset named "CLASS" with five variables: Name, Sex, Age, Height, and Weight. The data for these variables is provided in an inline datalines section.

  • The infile statement tells SAS to read from the in-stream datalines section, using the pipe symbol (|) as the delimiter between values. The dsd option treats two consecutive delimiters as a missing value and strips quotation marks from quoted values. The missover option keeps SAS from moving on to the next line when a record ends before all variables have been read; the remaining variables are set to missing instead.

  • The input statement names the variables and the informats used to read them. The colon (:) modifier makes SAS read each value up to the next delimiter rather than a fixed number of columns. Name is read as a character variable with the $8. informat (length 8), and Sex as a character variable with the $1. informat (length 1). Age, Height, and Weight are read as numeric variables with the best32. informat.

  • The actual data values are provided in the subsequent lines of the datalines section. Each line represents a separate observation, and the values for each variable are separated by the pipe symbol (|).

  • Finally, the run; statement ends the data step and executes the creation of the "CLASS" dataset.

 

R Tidyverse

library(tidyverse)

class<-tribble(
  ~Name,~Sex,~Age,~Height,~Weight,
  "Alfred","M",14,69,112.5,
  "Alice","F",13,56.5,84,
  "Barbara","F",13,65.3,98,
)

R code description :

  • The R Tidyverse code snippet uses the tribble function (from the tibble package, which is loaded as part of the tidyverse) to create a tibble, the tidyverse's data frame, named "class" with the variables Name, Sex, Age, Height, and Weight.

  • The library(tidyverse) statement loads the tidyverse package, which provides a collection of packages for data manipulation and visualization.

  • The tribble function is used to create the data frame. Column names are specified using the tilde (~) symbol before each variable name. The variables Name, Sex, Age, Height, and Weight are defined.

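  • The rows of the data frame are built by listing the values for each variable, separated by commas, in the same order as the column names. The whole call is enclosed in a single pair of parentheses; individual rows are not. Writing one row per line is for readability only: tribble fills rows according to the number of columns declared, not the line breaks.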

  • The resulting data frame, assigned to the variable "class," consists of three rows, each representing an observation, and the corresponding values for each variable.
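
For readers who want an R counterpart that stays closer to the SAS datalines approach, the same pipe-delimited text can be parsed directly. The following is a minimal sketch, not part of the original post: it assumes readr 2.0 or later (for I() literal input), and the names raw and class_from_text are made up for illustration. Column types are set explicitly so they are not left to readr's type guessing.

```r
library(tidyverse)

# Inline pipe-delimited text, mirroring the SAS datalines section.
# (raw and class_from_text are hypothetical names used only in this sketch.)
raw <- "Alfred|M|14|69|112.5
Alice|F|13|56.5|84
Barbara|F|13|65.3|98
"

class_from_text <- read_delim(
  I(raw),              # I() marks the string as literal data, not a file path
  delim = "|",         # same delimiter as dlm='|' in the SAS infile statement
  col_names = c("Name", "Sex", "Age", "Height", "Weight"),
  col_types = "ccddd"  # two character columns, then three numeric (double) columns
)

# Inspect the column types and values
glimpse(class_from_text)
```

tribble is usually easier to read for a handful of hand-typed rows; parsing delimited text may be more convenient when the sample rows are pasted from an existing file.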

 

 

 

