ÿþ<html> <head> <meta http-equiv="Content-Language" content="en"> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> <title>Paul Dylan Sherman</title> </head> <body> <center><b>Publications</b></center> <hr> <div name="p108-26"><a name="p108-26"><table border="0" cellspacing="5" cellpadding="0"><tr> <td valign="top"><b>Intelligent<br>SAS<sup>&reg;</sup><br>Log<br>Manager</b></td><td> The SAS log generated during program execution is one of the richest sources of application health, especially when dealing with macros which build and invoke code dynamically. Explicitly sending messages to the Log Manager at critical steps throughout the application one may disable all low-level SAS feedback and present only relevant, highly specialized information to the SAS log. One is led immediately to the source of both compile-time and run-time errors. Resulting is a clean, well-structured and specific trace of application flow throughout the entire job submitted. <br><a href="p108-26/slide1.html" title="presentation slides"><small><i>presentation</i></small></a> &nbsp;<a href="p108-26/p108-26.pdf" title="article (PDF format)"><small><i>article</i></small></a> &nbsp;&nbsp;&nbsp;&nbsp;<a href="p108-26/bloknote/v1_2/index.html" title="%bloknote() macro, v1.2"><small><i>macro code</i></small></a> &nbsp;( &nbsp;<a href="p108-26/bloknote/v1_2/bloknote.sas" title="SAS statements, v1.2"><small><i>SAS</i></small></a> &nbsp;) &nbsp;<a href="p108-26/examples.html" title="%bloknote() examples"><small><i>examples</i></small></a> <br><small>Presented at the Twenty-sixth International SAS User Group Conference, Long Beach, California.</small> <br><small>Presented at the Thirteenth Regional Western SAS User Group Conference, San Jose, California.</small> </td> </tr></table></a></div> <hr> <div name="p073-27"><a name="p073-27"><table border="0" cellspacing="5" cellpadding="0"><tr> <td valign="top"><b>Creating<br>Efficient<br>SQL</b></td><td> Schema Query Language is a very flexible, general grammar for fetching and processing large amounts of data directly at the database level. Given this flexibility, there are many ways of achieving the same result. Some approaches yield terribly poor database performance especially when simple processing steps are overlooked. Most notable are incorrect uses of the ORDER BY and DISTINCT statements, whose often desired results almost always lead to sub-optimal database behavior. It is instructive to think about any  table , which are named in the FROM clause, as just another type of  object in an object-oriented sense. Tables consist of one or more fields or columns, each having well defined types, and when related or joined to each other hold references between themselves. An SQL statement is thus nothing more than an object model diagram of sorts. In that regard, I recommend the following sequence of four simple steps when building any SQL query statement. <br><a href="p073-27/slide1.html" title="presentation slides"><small><i>presentation</i></small></a> &nbsp;<a href="p073-27/p073-27.pdf" title="article (PDF format)"><small><i>article</i></small></a> <br><small>Presented at the Twenty-seventh International SAS User Group Conference, Orlando, Florida.</small> <br><small>Presented at the Twelfth Regional Western SAS User Group Conference, Pasadena, California.</small> </td> </tr></table></a></div> <hr> <div name="p172-28"><a name="p172-28"><table border="0" cellspacing="5" cellpadding="0"><tr> <td valign="top"><b>Demeter<br>in<br>the<br>Database</b></td><td> Relational database normal form is nothing more than variable dereference. Used properly by following the two most important rules of Object-Oriented design, separation and encapsulation, one's database model will always work well. Violating the Law of Demeter enforces a concrete implementation on an otherwise abstract model, effectively multiplying in cartesian fashion weakly related variables and needlessly increasing table object size with redundant information. Instead, refer to table objects no more than one level away from each other. Empower each level in the model heirarchy with the privaledge and responsibility it deserves, giving each table suitably named verbs (methods), behavior and characteristics (fields). In the following example, an idea of Consumption &amp; Replenish, we will create a dynamically adapting database model, helping Farmer Jack predict when he needs to order more fertilizer to feed each of his many stalks. <br> <a href="p172-28/slide1.html" title="presentation slides"><small><i>presentation</i></small></a> &nbsp;<a href="p172-28/p172-28.pdf" title="article (PDF format)"><small><i>article</i></small></a> <!-- &nbsp;<font color="blue"><small><i>article</i></small></font></a> --> <br><small>Presented at the Twenty-eighth International SAS User Group Conference, Seattle, Washington.</small> </td> </tr></table></a></div> <hr> <div name="2833-29"><a name="2833-29"><table border="0" cellspacing="5" cellpadding="0"><tr> <td valign="top"><b>Dynamic<br>Macro<br>Invocation<br>Utility</b></td><td> Often times it is necessary to repeatedly call a macro with only minor differences in its argument list. Most common is the application of producing multiple graphs panelized into an aggregate. The following simple macro <code>%invoke()</code> builds an argument list with a collection of index number suffixed positional parameters and calls a desired macro with a subset of these parameters as needed until all parameters have been accomodated. Resulting is a set of output having identical structure with customized content, all for the price of a single macro call. <br> <!-- <font color="blue"><small><i>presentation</i></small></font> &nbsp;<font color="blue"><small><i>article</i></small></font></a> --> <a href="p062-29/Slide1.html" title="presentation slides"><small><i>presentation</i></small></a> &nbsp;<a href="p062-29/p062-29.pdf" title="article (PDF format)"><small><i>article</i></small></a> <br><small>Presented at the Twenty-ninth International SAS User Group Conference, Montreal, Canada.</small> </td> </tr></table></a></div> <hr> <div name="2745-29"><a name="2745-29"><table border="0" cellspacing="5" cellpadding="0"><tr> <td valign="top"><b><nobr>Union Join</nobr><br>without<br>the<br><nobr>Union Clause</nobr></b></td><td> Following good object oriented practice of building query models we address the issue of poor query performance during use of the UNION clause. Although column-wise joins are rather well understood, the notion of the UNION or row-wise join is often perceived incorrectly as a gluing together of independent fullselect result sets rather than via model relations through higher levels of abstraction. This perception is excusable because there often does not exist suitable abstraction tables or normalization key values, and we must think creatively how a necessary abstraction can be derived. In the example to follow we help farmer Jack analyze taste and laboratory test data to determine the quality of his bean stalks. <br> <!-- <font color="blue"><small><i>presentation</i></small></font> &nbsp;<font color="blue"><small><i>article</i></small></font></a> --> <a href="p064-29/Slide1.html" title="presentation slides"><small><i>presentation</i></small></a> &nbsp;<a href="p064-29/p064-29.pdf" title="article (PDF format)"><small><i>article</i></small></a> &nbsp;&nbsp;&nbsp;&nbsp;<a href="p064-29/cc064-29_handout.pdf" title="handout (PDF format)"><small><i>handout</i></small></a> &nbsp;( &nbsp;<a href="p064-29/cc064-29_handout.html" title="SQL statements"><small><i>SQL</i></small></a> &nbsp;<a href="p064-29/cc064-29_handout.sql" title="data definition language (SQL statements)"><small><i>DDL</i></small></a> &nbsp;) <br><small>Presented at the Twenty-ninth International SAS User Group Conference, Montreal, Canada.</small> </td> </tr></table></a></div> <hr> <div name="po042-PH2005"><a name="po042-PH2005"><table border="0" cellspacing="5" cellpadding="0"><tr> <td valign="top"><b>The<br><nobr>De-normalize</nobr><br><nobr>Transpose</nobr></b></td><td> Normalization and warehousing are battle cries of two waring factions. Although vertical or row-wise designs ultimately minimize storage access and retrieval time, the law of the relational model precludes simultaneous access from more than one table row. Quite often, we desire knowledge of two variables for the same aggregate observation, when performing high level arithmetic or reporting spreadsheet like tablature. This article presents an efficient query method for turning rows into columns at SQL run time, which farmer Jack uses for his bean quality audit policy, and discusses some finer points which make the transposition process database system independent. <br> <a href="po042-PH2005/Slide1.html" title="presentation slides"><small><i>presentation</i></small></a> &nbsp;<a href="po042-PH2005/PO42.pdf" title="article (PDF format)"><small><i>article</i></small></a> &nbsp;&nbsp;&nbsp;&nbsp;<a href="po042-PH2005/PO42_handout.pdf" title="handout (PDF format)"><small><i>handout</i></small></a> <!-- &nbsp;( &nbsp;<a href="po042-PH2005/PO42_handout.html" title="SQL statements"><small><i>SQL</i></small></a> &nbsp;<a href="po042-PH2005/PO42_handout.sql" title="data definition language (SQL statements)"><small><i>DDL</i></small></a> &nbsp;) --> <br><small>Presented at the 2005 Regional Pharmaceutical SAS User Group Conference, Phoenix, Arizona.</small> </td> </tr></table></a></div> <hr> <div name="dwdb_where-WU2005"><a name="dwdb_where-WU2005"><table border="0" cellspacing="5" cellpadding="0"><tr> <td valign="top"><b><nobr>WHERE</nobr><br><nobr>The Statement</nobr><br><nobr>Is Not</nobr><br><nobr>The Clause</nobr></b></td><td> Improving clock time and running smaller are goals for any programmer. In the SAS<sup>&reg;</sup> Data step, it is well known that certain filtering expressions placed in a <code>WHERE</code> statement often lead to better processing efficiency. These expressions are simple "static" equalities, those without arithmetic operations or function calls. The Data step engine processes these statements at a very low level  as observations are being read in from the input dataset  so that later and more complicated Data step operations have as few rows as possible to deal with. Therefore a SAS programmer learns early on to do their easy work  first things, first  in the <code>WHERE</code> statement. </p> <p> This is unfortunate. Folks in the RDBMS camp know that the <code>FROM</code> clause is the first point of contact to one's data model. Only after processing all specifications and expressions in the <code>FROM</code> clause will the database system go on to the <code>WHERE</code> clause. In SAS Proc SQL code, position of the data filters (e.g., <code>WHERE gndr eq "M";</code>) can affect how fast the query runs and how much disk space it takes while running. Proper coding of the data filter allows Proc SQL to "delegate" the filtering to a "lower level" internal SAS subroutine called the Data Engine. </p> We learn in this paper that <br>&nbsp;&nbsp;&nbsp;in <b>SAS</b>, the data step WHERE is a <i>statement</i>, and processed <u>first</u> <br>&nbsp;&nbsp;&nbsp;but in <b>SQL</b>, the query WHERE is a <i>clause</i>, and processed <u>last</u>. <br> <br> This paper suggests, and justifies, that SAS SQL programmers can often improve performance by adopting a few simple rules and coding style. Many annotated demonstrations, and a few humerous examples, are included as well as on the companion web site. <br> <a href="dwdb_where-WU2005/Slide1.html" title="presentation slides"><small><i>presentation</i></small></a> &nbsp;<a href="dwdb_where-WU2005/DWDB_WHERE_The_Statement.pdf" title="article (PDF format)"><small><i>article</i></small></a> <!-- &nbsp;&nbsp;&nbsp;&nbsp;<a href="po042-PH2005/PO42_handout.pdf" title="handout (PDF format)"><small><i>handout</i></small></a> &nbsp;( &nbsp;<a href="dwdb_where-WU2005/PO42_handout.html" title="SQL statements"><small><i>SQL</i></small></a> &nbsp;<a href="dwdb_where-WU2005/PO42_handout.sql" title="data definition language (SQL statements)"><small><i>DDL</i></small></a> &nbsp;) --> <br><small>Presented at the Thirteenth Regional Western SAS User Group Conference, San Jose, California.</small> </td> </tr></table></a></div> <hr> <div name="presentations"><a name="presentations"><table border="0" cellspacing="0" cellpadding="0"> <tr><td valign="top"><b>Presentations</b></td><td> &nbsp;<a href="p108-26/slide1.html" title="Intelligent SAS Log Manager"><nobr><small><i>Intelligent SAS Log Manager</i></small></nobr></a> &nbsp;<a href="p073-27/slide1.html" title="Creating Efficient SQL: Four Steps to a Quick Query"><nobr><small><i>Creating Efficient SQL</i></small></nobr></a> &nbsp;<a href="p172-28/slide1.html" title="Demeter in the Database"><nobr><small><i>Demeter in the Database</i></small></nobr></a> &nbsp;<a href="p062-29/Slide1.html" title="Dynamic Macro Invocation Utility"><nobr><small><i>Dynamic Macro Invocation Utility</i></small></nobr></a> &nbsp;<a href="p064-29/Slide1.html" title="Creating Efficient SQL: Union Join without the UNION Clause"><nobr><small><i>Union Join without the UNION Clause</i></small></nobr></a> &nbsp;<a href="po042-PH2005/Slide1.html" title="Creating Efficient SQL: The De-normalize Transpose"><nobr><small><i>The De-normalize Transpose</i></small></nobr></a> &nbsp;<a href="dwdb_where-WU2005/Slide1.html" title="The Optimizer Project: WHERE The Statement Is Not The Clause"><nobr><small><i>WHERE The Statement Is Not The Clause</i></small></nobr></a> </td></tr> </table></a></div> </body> </html>