3 Legendary SAS Data Step Basics: Read, Write, and Manipulate Your First Dataset
Whether you’re reading raw input, cleaning datasets, or creating new variables, the DATA step is where it all begins. It’s the foundation of SAS programming, allowing you to read, write, and manipulate datasets with precision and control.
In this post, I’ll break down the SAS DATA step basics, beginning with how the DATA step works, so you can start coding with confidence.
SAS Data Step Basics
In SAS, the DATA step is a fundamental building block used to create, modify, and manipulate datasets. It allows you to read raw data into SAS, apply transformations, generate new variables, filter observations, and much more, all within a flexible programming structure.
A DATA step always begins with the DATA statement, where you define the name of the dataset you’re creating, and ends with a RUN; statement to execute the code.
Between those two lines, you can include a variety of instructions such as SET to read existing data, IF statements for conditional logic, and assignment statements to compute new values. The DATA step is incredibly powerful because it gives you complete control over how your data is structured and prepared for analysis. Whether you’re importing data for the first time or building a clean dataset for modeling, mastering the DATA step is essential for effective SAS programming.
Here’s a basic structure:
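A minimal sketch of that structure, assuming the SASHELP.CLASS sample dataset that ships with SAS as input. The output name new_dataset and the variable Height_cm are hypothetical, chosen only for illustration:

```sas
data work.new_dataset;          /* DATA statement: names the dataset to create */
    set sashelp.class;          /* SET: reads an existing dataset as input */
    if Age >= 13;               /* subsetting IF: keeps only rows where the condition is true */
    Height_cm = Height * 2.54;  /* assignment: computes a new variable (Height is in inches) */
run;                            /* RUN: ends the step so SAS executes it */
```

Everything between the DATA and RUN; statements runs once for each observation read by SET.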

Step 1: Reading Data Into SAS
Let’s start with one of the most essential SAS data step basics: reading in data. SAS can read data from various sources (CSV, Excel, other datasets), but for simplicity, we’ll manually enter a small dataset using datalines.
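A minimal sketch of what such a step might look like, using the dataset name Customer_Info from this section. The variable names (Name, Age, City) and the three data rows are assumptions made up for illustration:

```sas
/* Hypothetical customer data: variables and rows are illustrative only */
data Customer_Info;
    input Name $ Age City $;   /* $ marks Name and City as character; Age is numeric */
    datalines;
Alice 34 Boston
Bob 45 Denver
Carla 29 Austin
;
run;
```

With this simple list-style input, values are separated by spaces and character variables default to a length of 8, so longer values would need a LENGTH statement or an informat.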

Explanation:
- Customer_Info: The name of the new dataset.
- Input: Defines the variables and their types ($ means character).
- Datalines: Indicates inline data input.
- Run: Signals to SAS that the data step is ready for execution.
More information on manually entering data into SAS can be found in my previous blog post on datalines.
Step 2: Writing or Creating a New Dataset
You can also create new datasets by copying from existing ones using the SET statement.
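A short sketch of this kind of copy, using the Class_Save and SASHELP.CLASS names that this section refers to:

```sas
/* Copy the SASHELP.CLASS sample dataset into the WORK library */
data Class_Save;          /* a one-level name is stored in WORK by default */
    set sashelp.class;    /* read every observation from the input dataset */
run;
```

Because no other statements appear between SET and RUN;, the new dataset is an exact copy of the input.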

Explanation:
- Data: Creates a new dataset named Class_Save in the WORK library.
- Set: Reads the Class dataset from the SASHELP library as input.
If you’re looking for more information on SAS Libraries, check out my Beginner’s Guide to SAS Libraries.

Step 3: Manipulating Data in the DATA Step
To truly understand the SAS data step basics, we need to discuss the bread and butter of the Data Step: manipulating data. With a data step you can add new variables, recode values, or apply conditional logic. The code snippet below shows an example of some basic data step techniques.
Creating a New Variable
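A minimal sketch of a DATA step that behaves the way this section describes. The output dataset name Class_Gender is hypothetical, and the LENGTH statement is an added safeguard rather than something the description requires:

```sas
data Class_Gender;                            /* hypothetical output dataset name */
    set sashelp.class;                        /* read the Class sample dataset */
    length Gender $6;                         /* long enough to hold "Female" */
    if Sex eq "F" then Gender = "Female";
    else if Sex eq "M" then Gender = "Male";
    else Gender = "Unk";                      /* any value other than F or M */
run;
```

Declaring the length up front avoids truncation if the branches are later reordered so that a shorter value is assigned first.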

Explanation:
After reading in the Class dataset from the SASHELP library, this DATA step creates a new variable named Gender based on the values of Sex. When Sex equals “F”, the row is assigned a Gender value of “Female”. When Sex equals “M”, Gender is assigned “Male”. Any other Sex value is assigned a Gender value of “Unk”.
Common DATA Step Statements
Statement | Purpose | Example |
---|---|---|
IF...THEN | Conditional logic | if Age > 16 then output; |
ELSE | Alternative logic branch | else Grade = Grade + 2; |
LENGTH | Set character variable length | length Name $20; |
DROP/KEEP | Control which variables to save | drop Grade; or keep Name Age; |
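To show how these statements can work together, here is a small sketch built on the SASHELP.CLASS sample dataset. The output name class_groups, the variable Age_Group, and the age cutoff of 13 are all assumptions chosen for illustration:

```sas
data work.class_groups;                     /* hypothetical output dataset */
    length Age_Group $8;                    /* LENGTH: declare before the first assignment */
    set sashelp.class;
    if Age > 13 then Age_Group = "Older";   /* IF...THEN: conditional logic */
    else Age_Group = "Younger";             /* ELSE: alternative branch */
    keep Name Age Age_Group;                /* KEEP: only these variables are saved */
run;
```

KEEP and DROP take effect when the dataset is written, so variables such as Height and Weight are still available for calculations inside the step.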
Best Practices for SAS Data Step Basics
- Always use RUN; to complete your DATA step.
- Use comments (/* This is a comment */) to describe your code.
- Print datasets or run frequencies often to verify changes.
- Start small and build your logic step by step to ensure it is working as intended.
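As one way to put the verification tip into practice, the sketch below prints a few rows and runs a frequency table. The dataset name class_check is hypothetical, and SASHELP.CLASS is used only as convenient sample data:

```sas
/* Create a small dataset to verify */
data work.class_check;
    set sashelp.class;
run;

/* Print the first five observations to eyeball the result */
proc print data=work.class_check(obs=5);
run;

/* Count observations for each value of Sex */
proc freq data=work.class_check;
    tables Sex;
run;
```

Running checks like these after each change makes it easier to spot the step where something went wrong.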
🚀 Final Thoughts
The DATA step is at the heart of SAS programming. Once you understand how to read, write, and manipulate data with it, you can handle most data preparation tasks with ease. Whether you’re cleaning raw data, creating subsets, or engineering new features, the DATA step is your go-to tool.
Ready to try it yourself? Open up SAS and start experimenting with your own datasets. You’ll be surprised at how quickly it starts to click!