Monotonic (part duex)

Kevin Myers posted a comment on our monotonic post, which provided a lot more detail on the monotonic function (thanks Kevin) so I thought I would post them as an article.

Kevin says….

Here are some very preliminary unpublished docs that were put together when MONOTONIC was added in SAS 8.2. This has not received any official testing/verification/approval by SAS Institute personnel:

MONOTONIC

Returns a series of monotonically increasing values

Category: Special (Is this the right category?)
Syntax
Arguments
Details
Examples
Example 1: Creating Row Identifiers for a Simple SQL Query
Example 2: Generating Distinct Streams of Values
Example 3: Influencing MONOTONIC Evaluation with Arguments
See Also

Syntax

MONOTONIC(>>)

Arguments

argument
is any valid numeric or character expression. No arguments are required, but any number of arguments may be provided.

Details

The MONOTONIC function returns the series of monotonically increasing integers (1, 2, 3, …). Each time the function is evaluated, it returns the next integer from the series. It is particularly useful for returning a series of values that may be used for purposes such as dynamically generating row identifiers or counters in PROC SQL. See Generating Row Identifiers for a Simple SQL Query.

To generate a series of values that does not start with 1 or does not increment by 1, you may wish to create an expression that applies an offset and/or a multiplier to the values that are returned from the MONOTONIC function.

Each separate use of the MONTONIC function within a data step or SQL query will return a separate series of values. If values from the same MONOTONIC series are required at multiple points within the same data step or SQL query, then a variable should be used to store the result of a single call to the MONOTONIC function, and that variable can then be used to provide the additional values where they are required. See Generating Distinct Streams of Values.

Arguments that are passed to the MONOTONIC function have no direct impact on the series of values that is returned. Instead, these arguments may be optionally provided in order to serve as hints for PROC SQL to help influence when the MONOTONIC function will be evaluated during query processing. One must be cautious when using MONOTONIC in a complex PROC SQL query, because join processing, subsetting, and query optimization may have results that are difficult to predict. See Influencing MONOTONIC Evaluation with Arguments.

Examples

Example 1: Generating Row Identifiers for a Simple SQL Query

The following program illustrates generating a unique identifier for each row that is returned as the result of a simple SQL query:

data test;
do x=1 to 5;
output;
end;
run;

proc sql;
select *, monotonic() as rowid from test
having x ne 3;
quit;

The SAS System 10:00 Thursday,
August 5, 1999 10

x rowid
ffffffffffffffffff
1 1
2 2
4 4
5 5

Note in the output from the above example that the value of rowid skips from 2 to 4. This is because the data row for which the rowid value of 3 was generated by the MONOTONIC function was deleted by the HAVING clause after the function had already been evaluated for that row. Using a simple WHERE clause such as WHERE X NE 3 would not produce the same result, because the third data row would have been skipped entirely, and the MONOTONIC function would not have been evaluated at all for that row. This shows that when it is important for values resulting from use of the MONOTONIC function in an SQL query to maintain a constant increment with no skipped values, one must carefully consider the effects that any joins or subsetting criteria may have on the results.

Example 2: Generating distinct streams of values

The following program illustrates that independent streams of values are returned by each separate use of the MONOTONIC function. A data step is used here to illustrate this point for simplicity, but similar situations can also occur with multiple uses of MONOTONIC within a single SQL statement.

data _null_;
do x=1 to 2;
a=monotonic();
do y=1 to 2;
b=monotonic();
put x= y= a= b=;
end;
end;
run;

x=1 y=1 a=1 b=1
x=1 y=2 a=1 b=2
x=2 y=1 a=2 b=3
x=2 y=2 a=2 b=4
NOTE: DATA statement used:
real time 0.02 seconds
cpu time 0.02 seconds

Note the differing values for variables a and b which were generated by two different calls to the MONOTONIC function, one in the outer DO loop for the data step, and the other in the inner DO loop.

Example 3: Influencing MONOTONIC Evaluation with Arguments

The following example shows that it is possible to influence when evaluation of the MONOTONIC function occurs in an SQL query by providing arguments.

data test1;
do x=1 to 5;
output;
end;
run;

data test2;
do x=0 to 6 by 2;
output;
end;
run;

proc sql;
select t1.x,
monotonic() as m,
monotonic(t1.x) as m1,
monotonic(t2.x) as m2,
monotonic(t1.x,t2.x) as m12
from test1 t1, test2 t2
where t2.x=t1.x;
quit;

The SAS System 10:00 Thursday,
August 5, 1999 14

x m m1 m2 m12
ffffffffffffffffffffffffffffffffffffffffffffffff
2 1 2 2 1
4 2 4 3 2

Note in the output from the above example that providing different arguments to the MONOTONIC function calls caused them to be evaluated at different points during the query processing according to the table dependencies of the arguments, thereby resulting in different values for each call in the resulting joined data.

CAUTION: Even though it is often possible to successfully influence the evaluation order of the MONOTONIC function in a predictable manner as shown above, one must be very careful when attempting to do so. In complex queries, the SQL query optimizer may alter the apparent evaluation order to improve performance. Care should be taken when developing any SQL queries that use the MONTONIC function which also involve joins and/or subsetting criteria, to be certain that the results from using the MONOTONIC function are as intended. Furthermore, differences in query optimization between different releases of the SAS System could produce different results from use of the montonic function in the same SQL query.

  • Share/Bookmark

Dealing with Oracle Dates in SAS

Working with a customer where we have decided to utilise the current Oracle Data Warehouse as much as possible as the source of data for reporting, given the significant investment in the data warehouse over the last few years.

Rather than extract the Oracle data into SAS we were testing using SAS Information Maps and Web Report Studio over the top of the Oracle Star Schema’s. I have to say I was impressed with the performance of the Web Report Studio reports given the number of layerswe were traversing, i.e

Web Report Studio > Information Map > SAS Workspace Server > Proc Sql >SAS Access to Oracle > Oracle (and back again).

Anyway one issue we found was that we couldn’t apply SAS Date Formats against the Oracle date fields, if we did it would just show ******* instead of a date.

We could use a datepart function to substring the date field and that worked ok, but that could only be applied in the Information Map, not against the Metadata definition which means other tools such as SAS OLAP Cubes and Enterprise Guide etc wouldn’t pick it up.

We tried changing the Metadata definition for datefields from DATETIME20. to say DATE9. , but then our friends ****** would appear. Even worse if we reapplied the DATETIME20. format to the metadata definition it set all the dates to 1960. We had to delete the metadata definition and reimport it to get the correct dates again.

The solution was in the Usage Note 6413: SASDATEFMT option can be defined globally in the config file or as an environment variable with SAS/ACCESS Interface to Oracle” here:

http://support.sas.com/kb/6/413.html

Although it talks about SAS 8 we applied the following statement to the sasv9.cfg file:

-set SASDATEFMT date

Rebooted the workspace server, deleted and re-imported the Oracle Table Metadata and wallah the Oracle date fields were know registered as Date. and aplpying date formats in the Metadata definition, Information Maps or Enterprise Guide worked the way they should!

As an aside in working through the options we worked out that thwre is a magic number that if you take it away from the numeric Oracle data value you get the numeric SAS date value, whihc makes sense I suppose given they both juts start counting days to manage dates, they just start counting from different points.

  • Share/Bookmark

Cleaning work files on Windows

Found an interesting paper by John M. Wildenthal of JPMorgan Chase called An Effective CLEANWORK.SAS® for Windows , outlining a way of automatically deleting work files for any SAS processes that have been terminated abnormally.

Abstract of the paper is:

SAS provides an effective utility for deleting orphaned work directories on Unix boxes. But their offering for Windows
merely deletes everything older than a particular time (http://support.sas.com/techsup/unotes/SN/008/008786.html).

This is unworkable for a server environment. This paper discusses a short program for Windows that parses the root
WORK directory and matches directory names to SAS v9 running processes. Directories that are not matched are
then deleted.

  • Share/Bookmark

Monitoring SAS processes and SAS work space in Unix

We are currently trying to monitor some SAS queries against our new Star Schemas to see if we can make them go faster.

So I have been doing a bit of research to see whats out there to help us do the monitoring.

In my search I stumbled across this article:

 

Monitoring SAS® through the Web

ABSTRACT
Monitoring SAS processes and SAS work space in a mutli-user environment, such as UNIX, is key in maintaining optimal performance. This paper will present the SAS Monitor, a tool which captures SAS resources and uses SAS to analyze and display this information via a Web browser. SAS Monitor is useful for SAS/UNIX administrators and informative for end users since the information can be widely distributed through an intranet. Even though SAS Monitor has been tested on Solaris, it can be adapted to other flavors of UNIX. This paper will explore and explain some of the techniques used in capturing and publishing SAS resources on the web.

They also provide example code for this at SAS Monitor – Source Code

 

Unfortunately the code is based upon Unix and we are running windows. So the search continues….

As an aside we are still building the Star Schema’s using SAS Datasets but we plan to test SAS SPDS vs Oracle as a repository for the Star Schemas at the end of the year to see which performs better. If one results in any noticable improvement we may move to it, as it will also solve any file locking issues we currently encounter with SAS Datasets. More on that later.

  • Share/Bookmark

Making SAS Zippy’d doh da

I often see people on the SAS forums asking how to read a zip file in SAS.

I came across a great set of tips over at the Computer Measurement Group

And one of the tips was the following on how to access zip files via SAS code.


Using the ZIP engine to read zip filesThere is a currently undocumented filename engine available in SAS 9 that can be used to read from compressed ZIP files directly. You simply specify the engine “SASZIPAM” on a filename statement, and when referring to it you must specify the file within it that you wish to read.

In the example below “tomcat.zip” contains a number of files. I want to read “tomat.log” and therefore specify “in(tomcat.log)”, where “in” is the libref and “tomcat.log” is the file I will read from the zip file.

Sample SAS Program

filename in saszipam ‘c:\tomcat.zip’;

data _null_;

infile in(tomcat.log);

input ;

put _infile_;

if _n_>10 then

stop ;

run;


Easy peasy!

  • Share/Bookmark

The Dataset is there but no data is at home

I use Enterprise Guide a lot to model data before I expose it to the SAS Metadata Server to create prototype reports using Information Maps and Web Report Studio etc.

Its often said it is ok to make a mistake, but not to make it twice as you haven’t learnt from your mistakes, well here is one mistake that I constantly make when doing the above process (I blame my failing memory due to age).

Enterprise Guide allows you to create new datasets where the variable (column) name has a space in it. The SAS Metadata Server does not like these at all. The problem is it imports the metadata for the dataset fine, its just when you go to view the data all you see is the number of rows, but no columns and no data.

It doesn’t matter if you access the dataset in Enterprise Guide ( via the Metadata Server of course) or use view data in the SAS Management Console, all you see is blank rows.

But drag the exact same table directly into Enterprise Guide and its all hunky dory.

Of course I then spend ages going through metadata security assuming I have read metadata rights but not read data rights and that is whats causing the problem.

Finally I will work out that I had a space in the column name, fix it and wallah. So I thought i’d blog about it just in case anybody else strikes the problem (or I at least remember to check my blog next time).

Oh and if you try and view data in Data Integration Studio you will get this error:

Warning: No rows returned by columns query for table DIM_ORG , no columns registered.
Warning: No rows returned by columns query for table DIM_ORG_OPERATIONS , no columns registered.

(at least one of the SAS interfaces reports that their is actually a problem, yah DI development team!)

  • Share/Bookmark

SAS Monotonic (would you like Gin and a slice of Lemon with that)

I was using Enterprise Guide the other day to merge some data into a single table and I needed to generate a unique sequence for a derived column.

I asked around for the best option and was told that using ‘_n_’in a datastep would do i, but of course this would have to be a custom code node, and I was using the Enterprise Guide filter and query wizard, which meant I would have to create a second step within my process flow.

No biggy, but within Oracle I would have just called the Sequence function and all would have been good,

Then our resident SAS Geek (note the removal of Uber) mentioned that I could use ‘Monotonic()’. So I created a new computed column in the query, added Monotonic() as the expression and Bob’s you uncle (or even Aunt these days) EG produced a unique sequence in the new column.

The Monotonic() function is apparently experimental in SAS 9.1. But was mentioned in a SUGI paper : Helpful Undocumented Features in SAS

And according to Wikipedia it is:

“In mathematics, a monotonic function (or monotone function) is a function which preserves the given order.”

  • Share/Bookmark

Turbocharge your SAS ETL Code

If you are looking to make your SAS ETL code go faster (and who doesn’t) then with a hat tip to our local SAS office there is a great (but whopping – 18 MB) paper from SAS called ‘ ETL Performance Tuning Tips’.

My favorite part is the bit about ‘MAGIC=102′, classic.

  • Share/Bookmark

SAS Functions

SAS functions enable you to add powerful logic to your SAS code.

According to the SAS Online Documentation, SAS functions are defined as:

“A SAS function performs a computation or system manipulation on arguments and returns a value. Most functions use arguments supplied by the user, but a few obtain their arguments from the operating environment.”

You can see details on what SAS functions are and how they can be used at SAS Functions and Call Routines overview online documentation page.

You can also see a complete list of the functions available at the SAS Functions and Call Routines Dictionary online documentation page.

  • Share/Bookmark