Controlling Input and Output - Exercises
Exercises
These exercises use SAS data sets stored in a permanent SAS data library.
If you are using SAS on your own computer, you might need to change the location
specified for the SAS data library.
Example: libname orion 'C:\SAS_Education\LWPRG2';
Check the log to confirm that the SAS data library was assigned.
Use the EXPLORER window to view the contents of the orion library.
Level 1
The orion.prices data set contains price information for Orion Star products.
Partial Listing of orion.prices (50 Total Observations)
Product_ID     Unit_Price   Factor
210200100009       $34.70     1.01
210200100017       $40.00     1.01
210200200023       $19.80     1.01
210200600067       $67.00     1.01
210200600085       $39.40     1.01
210200600112       $21.80     1.01
210200900033       $14.20     1.01
210200900038       $20.30     1.01
210201000050       $19.60     1.01
210201000126        $6.50     1.01
a. Write a DATA step to create a new data set that forecasts unit prices for the next three years. This
data set will contain three observations for each input observation read from orion.prices.
Open file p202e01. It reads orion.prices and creates a new data set named
work.price_increase.
Use explicit OUTPUT statements to forecast unit prices for the next three years, using
Factor as the annual rate of increase.
Obs   Product_ID     Unit_Price   Year
  1   210200100009       $35.05      1
  2   210200100009       $35.40      2
  3   210200100009       $35.75      3
  4   210200100017       $40.40      1
  5   210200100017       $40.80      2
  6   210200100017       $41.21      3
  7   210200200023       $20.00      1
  8   210200200023       $20.20      2
  9   210200200023       $20.40      3
 10   210200600067       $67.67      1
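One possible way to approach this exercise is sketched below. It is not the contents of p202e01, which is not reproduced here. It assumes that the orion libref is already assigned and that Unit_Price and Factor are numeric. Each explicit OUTPUT statement writes one observation, so every input observation produces three output observations.

```sas
/* Sketch only: assumes libref orion is assigned and that
   Unit_Price and Factor are numeric variables. */
data work.price_increase;
   set orion.prices;
   Year=1;
   Unit_Price=Unit_Price*Factor;   /* price after one year */
   output;
   Year=2;
   Unit_Price=Unit_Price*Factor;   /* price after two years */
   output;
   Year=3;
   Unit_Price=Unit_Price*Factor;   /* price after three years */
   output;
run;

/* Print the first 10 observations to compare with the listing. */
proc print data=work.price_increase(obs=10);
   var Product_ID Unit_Price Year;
run;
```

Because an explicit OUTPUT statement appears in the step, SAS does not perform its implicit output at the bottom of the DATA step.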
Level 2
2. Outputting Multiple Observations
The data set orion.discount contains information about various discounts that Orion Star
runs on its products.
Product_ID     Start_Date   End_Date    Unit_Sales_Price   Discount
210100100027   01MAY2007    31MAY2007             $17.99        70%
210100100030   01AUG2007    31AUG2007             $32.99        70%
210100100033   01AUG2007    31AUG2007            $161.99        70%
210100100034   01AUG2007    31AUG2007            $187.99        70%
210100100035   01MAY2007    31MAY2007            $172.99        70%
210100100038   01JUL2007    31JUL2007             $59.99        60%
210100100039   01JUN2007    31AUG2007             $21.99        70%
210100100048   01AUG2007    31AUG2007             $13.99        70%
a. Due to excellent sales, all discounts from December 2007 will be repeated in July 2008. Both the
December 2007 and the July 2008 discounts will be called the Happy Holidays promotion.
Create a new data set named work.extended that contains all discounts for the Happy
Holidays promotion.
Use a WHERE statement to read only observations with a start date of 01Dec2007.
Create a new variable, Promotion, which has the value Happy Holidays for each
observation.
Create another new variable, Season, that has a value of Winter for the December
observations and Summer for the July observations.
July 2008 discounts should have a start date of 01Jul2008 and an end date of 31Jul2008.
Use explicit OUTPUT to write two observations for each observation read.
Obs   Product_ID     Start_Date   End_Date    Discount   Promotion        Season
  1   210200100007   01DEC2007    31DEC2007        50%   Happy Holidays   Winter
  2   210200100007   01JUL2008    31JUL2008        50%   Happy Holidays   Summer
  3   210200300013   01DEC2007    31DEC2007        50%   Happy Holidays   Winter
  4   210200300013   01JUL2008    31JUL2008        50%   Happy Holidays   Summer
  5   210200300025   01DEC2007    31DEC2007        50%   Happy Holidays   Winter
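The steps for this exercise could be sketched as follows. This is one possible approach, not an official solution. It assumes that Start_Date and End_Date are numeric SAS date values. Dropping Unit_Sales_Price is also an assumption, inferred only from the fact that the listing does not show that variable.

```sas
/* Sketch only: assumes Start_Date and End_Date are SAS date values. */
data work.extended;
   set orion.discount;
   where Start_Date='01DEC2007'd;   /* read December 2007 discounts only */
   drop Unit_Sales_Price;           /* assumption: not shown in the listing */
   Promotion='Happy Holidays';
   Season='Winter';
   output;                          /* December 2007 observation */
   Start_Date='01JUL2008'd;
   End_Date='31JUL2008'd;
   Season='Summer';
   output;                          /* repeated July 2008 observation */
run;

proc print data=work.extended(obs=5);
run;
```

Winter and Summer are both six characters long, so the length that SAS assigns to Season from the first assignment is enough for both values.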
Level 3
The data set orion.country contains information on country names as well as various lookup
codes.
Listing of orion.country
Country   Country_Name    Population    Country_ID   Continent_ID   Country_FormerName
AU        Australia       20,000,000           160             96
CA        Canada                   .           260             91
DE        Germany         80,000,000           394             93   East/West Germany
IL        Israel           5,000,000           475             95
TR        Turkey          70,000,000           905             95
US        United States   280,000,000          926             91
ZA        South Africa    43,000,000           801             94
a. Create a new data set that contains one observation for each current country name as
well as one observation for each former country name.
Use conditional logic and explicit OUTPUT statements to create a data set named
work.lookup.
If a country has a former country name, write two observations: one with the current name
in the Country_Name variable and another with the former country name in the
Country_Name variable.
Drop the variables Country_FormerName and Population.
Create a new variable named Outdated with values of either Y or N to indicate whether the
observation represents the current country name.
Obs   Country   Country_Name        Country_ID   Continent_ID   Outdated
  1   AU        Australia                  160             96   N
  2   CA        Canada                     260             91   N
  3   DE        Germany                    394             93   N
  4   DE        East/West Germany          394             93   Y
  5   IL        Israel                     475             95   N
  6   TR        Turkey                     905             95   N
  7   US        United States              926             91   N
  8   ZA        South Africa               801             94   N
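A minimal sketch of the conditional OUTPUT logic for this exercise follows. It assumes that Country_FormerName is a character variable that is blank when a country has no former name, and that Country_Name is long enough to hold the former name without truncation.

```sas
/* Sketch only: assumes Country_FormerName is character and blank
   when there is no former name. */
data work.lookup;
   set orion.country;
   drop Country_FormerName Population;
   Outdated='N';
   output;                               /* current country name */
   if Country_FormerName ne ' ' then do;
      Country_Name=Country_FormerName;
      Outdated='Y';
      output;                            /* former country name */
   end;
run;

proc print data=work.lookup;
run;
```

The DROP statement takes effect only when observations are written, so Country_FormerName is still available to the IF statement inside the step.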
Exercises
Level 1
4. Creating Multiple SAS Data Sets
The data set orion.employee_organization contains information about employee job titles,
departments, and managers.
Partial Listing of orion.employee_organization (424 Total Observations)
Employee_ID   Job_Title                Department         Manager_ID
120101        Director                 Sales Management       120261
120102        Sales Manager            Sales Management       120101
120103        Sales Manager            Sales Management       120101
120104        Administration Manager   Administration         120101
120105        Secretary I              Administration         120101
120106        Office Assistant II      Administration         120104
120107        Office Assistant III     Administration         120104
Use conditional logic and explicit OUTPUT statements to write to these data sets depending on
whether the value of Department is Administration, Stock & Shipping, or
Purchasing, respectively. Ignore all other Department values.
Hint: Be careful with capitalization and the spelling of the Department values.
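The general pattern for directing observations to different data sets can be sketched as shown below. The output data set names work.admin, work.stock, and work.purchasing are hypothetical placeholders, because the names are not given in the text above. The Department values are compared exactly as spelled, and character comparisons in SAS are case sensitive.

```sas
/* Sketch only: the three output data set names are hypothetical. */
data work.admin work.stock work.purchasing;
   set orion.employee_organization;
   if Department='Administration' then output work.admin;
   else if Department='Stock & Shipping' then output work.stock;
   else if Department='Purchasing' then output work.purchasing;
   /* observations with any other Department value are not written */
run;
```

An OUTPUT statement that names a data set writes only to that data set, so each observation goes to at most one of the three.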
Obs   Employee_ID   Job_Title                 Department       Manager_ID
  1   120104        Administration Manager    Administration       120101
  2   120105        Secretary I               Administration       120101
  3   120106        Office Assistant II       Administration       120104
  4   120107        Office Assistant III      Administration       120104
  5   120108        Warehouse Assistant II    Administration       120104
  6   120109        Warehouse Assistant I     Administration       120104
  7   120110        Warehouse Assistant III   Administration       120104
  8   120111        Security Guard II         Administration       120104
Obs   Employee_ID   Job_Title            Department         Manager_ID
  1   120670        Shipping Manager     Stock & Shipping       120659
  2   120671        Shipping Agent III   Stock & Shipping       120670
  3   120672        Shipping Manager     Stock & Shipping       120659
  4   120673        Shipping Agent II    Stock & Shipping       120672
  5   120677        Shipping Manager     Stock & Shipping       120659
  6   120678        Shipping Agent III   Stock & Shipping       120677
  7   120679        Shipping Manager     Stock & Shipping       120659
  8   120680        Shipping Agent I     Stock & Shipping       120679
  9   120681        Shipping Agent II    Stock & Shipping       120679
 10   120682        Shipping Agent I     Stock & Shipping       120679

Obs   Employee_ID   Job_Title              Department   Manager_ID
  1   120728        Purchasing Agent II    Purchasing       120735
  2   120729        Purchasing Agent I     Purchasing       120735
  3   120730        Purchasing Agent I     Purchasing       120735
  4   120731        Purchasing Agent II    Purchasing       120735
  5   120732        Purchasing Agent III   Purchasing       120736
  6   120733        Purchasing Agent I     Purchasing       120736
  7   120734        Purchasing Agent III   Purchasing       120736
  8   120735        Purchasing Manager     Purchasing       120261
  9   120736        Purchasing Manager     Purchasing       120261
 10   120737        Purchasing Manager     Purchasing       120261
Level 2
Order_ID     Order_Type   Employee_ID   Customer_ID   Order_Date   Delivery_Date
1230058123            1        121039            63   11JAN2003    11JAN2003
1230080101            2      99999999             5   15JAN2003    19JAN2003
1230106883            2      99999999            45   20JAN2003    22JAN2003
1230147441            1        120174            41   28JAN2003    28JAN2003
1230315085            1        120134           183   27FEB2003    27FEB2003
a. Orion Star wants to study catalog and Internet orders that were delivered quickly, as well
as those that went slowly.
Create three data sets named work.fast, work.slow, and work.veryslow.
Write a WHERE statement to read only the observations with Order_Type equal
to 2 (catalog) or 3 (Internet).
Create a variable named ShipDays that is the number of days between when the order
was placed and when the order was delivered.
Handle the output as follows:
Output to work.fast when the value of ShipDays is less than 3.
Of the 490 observations in orion.orders, only 230 are read due to the WHERE
statement.
Obs   Order_ID     Order_Type   Customer_ID   Order_Date   Delivery_Date   ShipDays
  1   1231305521            2            16   27AUG2003    04SEP2003              8
  2   1236483576            2         70108   22JUL2005    02AUG2005             11
  3   1236965430            3         70165   08SEP2005    18SEP2005             10
  4   1237165927            3            79   27SEP2005    08OCT2005             11
  5   1241298131            2          2806   29JAN2007    08FEB2007             10
Level 3
Write a solution to the previous exercise using SELECT logic instead of IF-THEN/ELSE logic.
Refer to SAS documentation to explore the use of a compound expression in a SELECT statement.
Print the data set work.veryslow.
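A SELECT group without a select-expression evaluates each WHEN condition in order, which allows compound expressions such as a range test. The sketch below illustrates the form. Only the cutoff for work.fast (ShipDays less than 3) comes from the text above. The ranges used for work.slow and work.veryslow are placeholders chosen for illustration and should be replaced with the cutoffs given in the exercise. Dropping Employee_ID is likewise an assumption based on the listing.

```sas
/* Sketch only: the slow and veryslow cutoffs are placeholders. */
data work.fast work.slow work.veryslow;
   set orion.orders;
   where Order_Type in (2,3);              /* catalog or Internet */
   drop Employee_ID;                       /* assumption: not in listing */
   ShipDays=Delivery_Date-Order_Date;
   select;
      when (ShipDays<3)     output work.fast;
      when (5<=ShipDays<=7) output work.slow;      /* placeholder range */
      when (ShipDays>7)     output work.veryslow;  /* placeholder cutoff */
      otherwise;                           /* write nothing */
   end;
run;

proc print data=work.veryslow;
run;
```

The empty OTHERWISE statement matters: without it, SAS stops with an error when an observation satisfies none of the WHEN conditions.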
Exercises
Level 1
7. Specifying Variables and Observations
Employee_ID   Job_Title                Department         Manager_ID
120101        Director                 Sales Management       120261
120102        Sales Manager            Sales Management       120101
120103        Sales Manager            Sales Management       120101
120104        Administration Manager   Administration         120101
120105        Secretary I              Administration         120101
120106        Office Assistant II      Administration         120104
120107        Office Assistant III     Administration         120104
a. Create two data sets: one for the Sales department and another for the Executive department.
Name the data sets work.sales and work.exec.
Output to these data sets depending on whether the value of Department is Sales or
Executives, respectively. Ignore all other Department values.
The work.sales data set should contain three variables (Employee_ID, Job_Title,
and Manager_ID) and have 201 observations.
The work.exec data set should contain two variables (Employee_ID and Job_Title)
and have four observations.
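One way to control which variables each output data set receives is the KEEP= data set option, as sketched below. This is an illustration rather than an official solution. It assumes that the Department values are spelled exactly Sales and Executives.

```sas
/* Sketch only: KEEP= on each output data set selects its variables. */
data work.sales(keep=Employee_ID Job_Title Manager_ID)
     work.exec(keep=Employee_ID Job_Title);
   set orion.employee_organization;
   if Department='Sales' then output work.sales;
   else if Department='Executives' then output work.exec;
run;

proc print data=work.sales(obs=6);
run;

proc print data=work.exec;
run;
```

Department is not kept in either data set, but it can still be tested in the IF statements because KEEP= on an output data set applies only when observations are written.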
Obs   Employee_ID   Job_Title       Manager_ID
  1   120121        Sales Rep. II       120102
  2   120122        Sales Rep. II       120102
  3   120123        Sales Rep. I        120102
  4   120124        Sales Rep. I        120102
  5   120125        Sales Rep. IV       120102
  6   120126        Sales Rep. II       120102
Executives
Obs   Employee_ID   Job_Title
  2   120260
  3   120261
Level 2
8. Specifying Variables and Observations
The data set orion.orders contains information on in-store, catalog, and Internet orders
as well as delivery dates.
Partial Listing of orion.orders (490 Total Observations)
Order_ID     Order_Type   Employee_ID   Customer_ID   Order_Date   Delivery_Date
1230058123            1        121039            63   11JAN2006    11JAN2006
1230080101            2      99999999             5   15JAN2006    19JAN2006
1230088186            1        120455            11   17JAN2006    17JAN2006
1230106883            2      99999999            45   20JAN2006    22JAN2006
1230147441            1        120174            41   28JAN2006    28JAN2006
a. Create two data sets, work.instore and work.delivery, to analyze in-store sales.
Use a WHERE statement to read only observations with Order_Type equal to 1.
Create a variable ShipDays that is the number of days between when the order was placed
and when the order was delivered.
Output to work.instore when ShipDays is equal to 0.
The work.instore data set should contain three variables (Order_ID, Customer_ID,
and Order_Date).
The work.delivery data set should contain four variables (Order_ID, Customer_ID,
Order_Date, and ShipDays).
Test this program by reading the first 30 observations that satisfy the WHERE statement.
Check the SAS log to verify that no warnings or errors were reported.
Of the 490 observations in orion.orders, only 260 are read due to the WHERE
statement.
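The test run described above could be sketched as follows. When a WHERE statement and the OBS= data set option are used together, SAS applies the WHERE condition first, so OBS=30 stops the step after 30 qualifying observations. Sending every order with a nonzero ShipDays value to work.delivery is an assumption, because the condition for work.delivery is not stated in the text above.

```sas
/* Sketch only: OBS=30 limits the test run to the first 30
   observations that satisfy the WHERE statement. */
data work.instore(keep=Order_ID Customer_ID Order_Date)
     work.delivery(keep=Order_ID Customer_ID Order_Date ShipDays);
   set orion.orders(obs=30);
   where Order_Type=1;                    /* in-store orders only */
   ShipDays=Delivery_Date-Order_Date;
   if ShipDays=0 then output work.instore;
   else output work.delivery;             /* assumption: all other orders */
run;
```

After the log shows no warnings or errors, removing the OBS= option reruns the step against all qualifying observations.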
Obs   Order_ID     Customer_ID   Order_Date   ShipDays
  1   1231468750            52   25SEP2003           5
  2   1231657078            63   29OCT2003           4
  3   1232648239            49   07APR2004           8
d. Use PROC FREQ to display the number of orders per year in work.instore. Add an
appropriate title.
Hint: Format the variable Order_Date with a YEAR. format. Restrict the analysis to the
variable Order_Date with a TABLES statement.
PROC FREQ Output
                                      Cumulative   Cumulative
Order_Date   Frequency   Percent      Frequency      Percent
2003                43     17.20             43        17.20
2004                50     20.00             93        37.20
2005                27     10.80            120        48.00
2006                67     26.80            187        74.80
2007                63     25.20            250       100.00
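A report of this kind could be produced with a step like the one below. The title text is a hypothetical example. The step assumes that work.instore has been created with all in-store orders and that Order_Date is a SAS date value.

```sas
/* Sketch only: the title text is a hypothetical example. */
title 'In-Store Orders by Year';
proc freq data=work.instore;
   tables Order_Date;            /* restrict the analysis to one variable */
   format Order_Date year4.;     /* group the dates by calendar year */
run;
title;                           /* clear the title afterward */
```

PROC FREQ groups observations by formatted value, so applying the YEAR. format collapses the individual dates into one row per year.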
Chapter Review