| Intelligent SAS® Log Manager |
The SAS log generated during program execution is one of the
richest sources of application health, especially when dealing
with macros which build and invoke code dynamically.
Explicitly sending messages to the Log Manager at critical steps
throughout the application one may disable all low-level SAS
feedback and present only relevant, highly specialized information
to the SAS log. One is led immediately to the source of both
compile-time and run-time errors. Resulting is a clean,
well-structured and specific trace of application flow throughout
the entire job submitted.
presentation article macro code ( SAS ) examples Presented at the Twenty-sixth International SAS User Group Conference, Long Beach, California. Presented at the Thirteenth Regional Western SAS User Group Conference, San Jose, California. |
| Creating Efficient SQL |
Schema Query Language is a very flexible, general grammar for
fetching and processing large amounts of data directly at the
database level. Given this flexibility, there are many ways of
achieving the same result. Some approaches yield terribly poor
database performance especially when simple processing steps
are overlooked. Most notable are incorrect uses of the ORDER
BY and DISTINCT statements, whose often desired results
almost always lead to sub-optimal database behavior. It is
instructive to think about any “table”, which are named in the
FROM clause, as just another type of “object” in an object-oriented
sense. Tables consist of one or more fields or columns,
each having well defined types, and when related or joined to
each other hold references between themselves. An SQL
statement is thus nothing more than an object model diagram of
sorts. In that regard, I recommend the following sequence of four
simple steps when building any SQL query statement.
presentation article Presented at the Twenty-seventh International SAS User Group Conference, Orlando, Florida. Presented at the Twelfth Regional Western SAS User Group Conference, Pasadena, California. |
| Demeter in the Database |
Relational database normal form is nothing more than variable
dereference. Used properly by following the two most
important rules of Object-Oriented design, separation and
encapsulation, one's database model will always work well.
Violating the Law of Demeter enforces a concrete implementation
on an otherwise abstract model, effectively multiplying in
cartesian fashion weakly related variables and needlessly
increasing table object size with redundant information.
Instead, refer to table objects no more than one level away
from each other. Empower each level in the model heirarchy
with the privaledge and responsibility it deserves, giving
each table suitably named verbs (methods), behavior and
characteristics (fields). In the following example, an idea
of Consumption & Replenish, we will create a dynamically
adapting database model, helping Farmer Jack predict when he
needs to order more fertilizer to feed each of his many stalks.
presentation article Presented at the Twenty-eighth International SAS User Group Conference, Seattle, Washington. |
| Dynamic Macro Invocation Utility |
Often times it is necessary to repeatedly call a macro with
only minor differences in its argument list. Most common is
the application of producing multiple graphs panelized into
an aggregate. The following simple macro %invoke() builds an
argument list with a collection of index number suffixed
positional parameters and calls a desired macro with a subset
of these parameters as needed until all parameters have been
accomodated. Resulting is a set of output having identical
structure with customized content, all for the price of a
single macro call.
presentation article Presented at the Twenty-ninth International SAS User Group Conference, Montreal, Canada. |
without the |
Following good object oriented practice of building query
models we address the issue of poor query performance during
use of the UNION clause. Although column-wise joins are rather
well understood, the notion of the UNION or row-wise join is
often perceived incorrectly as a gluing together of independent
fullselect result sets rather than via model relations through
higher levels of abstraction. This perception is excusable
because there often does not exist suitable abstraction tables
or normalization key values, and we must think creatively how
a necessary abstraction can be derived. In the example to follow
we help farmer Jack analyze taste and laboratory test data to
determine the quality of his bean stalks.
presentation article handout ( SQL DDL ) Presented at the Twenty-ninth International SAS User Group Conference, Montreal, Canada. |
| The |
Normalization and warehousing are battle
cries of two waring factions. Although vertical or row-wise
designs ultimately minimize storage access and retrieval
time, the law of the relational model precludes
simultaneous access from more than one table row. Quite
often, we desire knowledge of two variables for the same
aggregate observation, when performing high level
arithmetic or reporting spreadsheet like tablature. This
article presents an efficient query method for turning
rows into columns at SQL run time, which farmer Jack uses
for his bean quality audit policy, and discusses some
finer points which make the transposition process database
system independent.
presentation article handout Presented at the 2005 Regional Pharmaceutical SAS User Group Conference, Phoenix, Arizona. |
Improving clock time and running smaller are goals for any
programmer. In the SAS® Data step, it is well known
that certain filtering expressions placed in a WHERE
statement often lead to better processing efficiency. These
expressions are simple "static" equalities, those without
arithmetic operations or function calls. The Data step
engine processes these statements at a very low level – as
observations are being read in from the input dataset –
so that later and more complicated Data step operations have
as few rows as possible to deal with.
Therefore a SAS programmer learns early on to do their easy
work – first things, first – in the WHERE statement.
This is unfortunate. Folks in the RDBMS camp know that the
in SAS, the data step WHERE is a statement, and processed first but in SQL, the query WHERE is a clause, and processed last. This paper suggests, and justifies, that SAS SQL programmers can often improve performance by adopting a few simple rules and coding style. Many annotated demonstrations, and a few humerous examples, are included as well as on the companion web site. presentation article Presented at the Thirteenth Regional Western SAS User Group Conference, San Jose, California. |