Tuesday, May 24, 2011

Using SAS92HFADD behind a “tough” firewall using CURL


As I have mentioned previously in Using SAS92HFADD behind a firewall, the SAS 9.2 Hot Fix Analysis, Download and Deployment tool is a great utility for the SAS Admin to help keep their SAS environment up to date with all the latest hot fixes from SAS.  I routinely implement this as part of deploying SAS, and in most cases the tweaks I mention in my previous blog post are all that is needed.  Unfortunately during one deployment I was faced with an outbound proxy/firewall that required Windows authentication…

The Problem

The organisation I was working with required all outbound internet traffic to pass through an authenticated proxy/firewall server.  For normal internet access from a PC this all happens seamlessly via their Windows network: a user logs into their PC using their Windows credentials, and any network resource requiring authentication then negotiates using those credentials “behind the scenes” – probably using NTLM, SSPI or Kerberos.  The SAS92HFADD tool uses the File Transfer Protocol (FTP), and the command line FTP client provided with Windows neither supports proxy servers nor (as far as I know) can negotiate using NTLM, SSPI or Kerberos.  The solution that I chose was CURL – an open source command line tool for transferring files via a number of protocols (FTP, HTTP, etc.) that also supports proxies and authentication.

I have only done this on Windows servers, but I expect the same basic approach would work on Unix/Linux servers to get through a Windows based proxy/firewall as well.

Windows

1. Download the SAS92HFADD package

The download is a self extracting archive which when run will provide three files:
  • SAS92HFADD.exe
  • SAS92_hot_fix_data_ftp_download.bat
  • SAS92_hot_fix_data_ftp_download_script.txt

2. Download CURL

CURL can be downloaded from http://curl.haxx.se/download.html. The CURL package that you download should have the SSPI option compiled into it.  The package that I used was the Win32 2000/XP 7.21.6 binary by Günter Knauf (1.32 MB).
image

3. Extract the CURLy bits

The CURL ZIP file that you downloaded has four files that you need to extract into the SAS92HFADD directory:
  • curl.exe
  • libcurl.dll
  • libeay32.dll
  • libssl32.dll
image
Your SAS92HFADD directory should look like this:
image

4. Modify SAS92_hot_fix_data_ftp_download.bat

Replace the entire contents of this file with the following single line:
curl -U : -x internetproxy:8080 --proxy-ntlm -o SAS92_hot_fix_data.xml http://ftp.sas.com/techsup/download/hotfix/HF2/util01/SASHotFixDLM/data/SAS92_hot_fix_data.xml
Replace internetproxy:8080 with the name and port of your proxy server.
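For anyone who prefers to drive the download from a script, here is a minimal Python sketch that rebuilds the same command as an argument list, with each flag annotated. The function name build_curl_command is made up for illustration and is not part of SAS92HFADD or CURL. It assumes an SSPI-enabled CURL build, as described in step 2.

```python
from typing import List


def build_curl_command(proxy: str, url: str, output: str) -> List[str]:
    """Return the curl argument list for a download through an NTLM proxy.

    Hypothetical helper: it only assembles the arguments, it does not run curl.
    """
    return [
        "curl",
        "-U", ":",        # empty proxy user:password, so an SSPI build uses the logged-in Windows credentials
        "-x", proxy,      # proxy server as host:port
        "--proxy-ntlm",   # authenticate to the proxy using NTLM
        "-o", output,     # local file to write
        url,              # remote file to fetch (HTTP rather than FTP)
    ]
```

Passing the returned list to subprocess.run(..., check=True) from the SAS92HFADD directory should behave like the one-line batch file, provided curl.exe can be found on the PATH.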

5. Run SAS92HFADD.EXE

After adding the DeploymentRegistry.txt file into the sas92hfadd directory, the next thing I did was open a command prompt (cmd.exe), change into the sas92hfadd directory and execute the sas92hfadd program:
image
This will connect to SAS and do its magic, resulting in a time-stamped subdirectory containing the AnalysisReport, DownloadTools, DeploymentTools and Logs directories.

6. Modify ftp_script.txt

Navigate into the DownloadTools directory and you should find the following files:
image
Open ftp_Script.txt in your favourite text editor and get ready for some search + replace fun!
Here is my file before any changes:
image
The first string we want to find is:
get techsup
This should be replaced with the following (changing internetproxy:8080 to your proxy and port):
curl -U : -x internetproxy:8080 --proxy-ntlm http://ftp.sas.com/techsup
The above string is all one line.
The second string to change is:
..\
This should be replaced with:
-o ..\
There is a SPACE between “-o” and “..\”.
The next step is not a simple search and replace, but changing the first four lines of the file from:
open ftp.sas.com
anonymous
SAS92HFADD@sas.com
binary
to:
@echo off
set PATH=..\..;%PATH%
Finally, remove the last line, which contains “quit”.
The final file looks like:
image
Now we can save this file as “curl_script.bat”
image
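The search and replace in step 6 can also be scripted. The following Python sketch applies the same edits to the text of ftp_script.txt; it is not part of the SAS tool. The function name is hypothetical, and it assumes that every download line starts with "get techsup" and contains a single "..\" local path, so only the first occurrence on each line is replaced.

```python
def ftp_script_to_curl_bat(ftp_script: str, proxy: str) -> str:
    """Convert the text of a generated ftp_script.txt into curl_script.bat text.

    Hypothetical helper mirroring the manual edits described in step 6.
    """
    curl_prefix = f"curl -U : -x {proxy} --proxy-ntlm http://ftp.sas.com/techsup"
    lines = ftp_script.splitlines()
    if not lines or not lines[0].lower().startswith("open "):
        raise ValueError("expected the script to start with an 'open' line")

    # Drop the first four lines: open, user name, password, binary.
    body = lines[4:]

    # Drop trailing blank lines, then the final "quit".
    while body and not body[-1].strip():
        body.pop()
    if body and body[-1].strip().lower() == "quit":
        body.pop()

    converted = []
    for line in body:
        # "get techsup/..." becomes "curl ... http://ftp.sas.com/techsup/..."
        line = line.replace("get techsup", curl_prefix, 1)
        # Prefix the local target path with curl's output option.
        line = line.replace("..\\", "-o ..\\", 1)
        converted.append(line)

    # New header: quiet batch output and put curl.exe (two levels up) on the PATH.
    header = ["@echo off", "set PATH=..\\..;%PATH%"]
    return "\n".join(header + converted) + "\n"
```

Read ftp_script.txt, pass its text and your proxy (for example internetproxy:8080) to the function, and write the result to curl_script.bat in the DownloadTools directory. It is worth comparing the output with a hand-edited copy before trusting it.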

7. Download the Hot Fixes

Now we can execute the curl_script.bat file (instead of the ftp_script.bat file) to download our hot fixes.

image

8. BAU

From here on it is Business As Usual ;-)

The above is all “experimental” as I have only had to go to these lengths on one customer site – your mileage may vary, but I hope if you do come across this situation these steps will help you develop your own solution.

Monday, April 25, 2011

Platform Suite for SAS – Dev/Test/Prod Tips

Here are some of my tips for using the Platform Suite for SAS when you have Development, Test and Production environments to support your SDLC.

Install and configure one instance of Platform LSF and Platform Process Manager

If you have separate machines for your development, test and production environments then this tip won’t have much value for you.

SAS manages to nicely separate itself into development, test and production environments on the same machine by using different port numbers, and these port numbers usually end in the same number as the Level it is associated with.

Environment    Level    Metadata Server Port
Development    Lev3     8563
Test           Lev2     8562
Production     Lev1     8561

The Platform Suite for SAS, which at its core consists of Platform LSF and Platform Process Manager, does not lend itself to the same nice, easy separation.  While it is possible to install and configure multiple instances of the Platform software on the same machine to use different ports for different environments, in my view this is overly complicated and unnecessary.  I find that a more productive approach is to have a single instance and achieve environment separation through the use of “Batch User Ids”. In this configuration, each SAS environment is configured to use a different LSF Batch user and, although there is only a single Platform Suite instance, the separation of environments is maintained. An example view of this using the Platform Flow Manager is shown below:

image

Using the Platform Flow Manager, there will be one Tree of Job Flows for each environment (user).

Allowing Developers to Schedule Flows

Part of SAS ETL development should also be the creation and testing of the associated job schedules. The usual approach I have seen is that either the Developer logs into SAS Management Console (SMC) as the SAS Administrator, or the Admin/Lead Developer does so, in order to create and deploy Job Flows using the Schedule Manager plug-in.  In some cases I have seen customers with a “Scheduling Admin” user in SMC which has the “lsfadmin” account associated with it. I think that these scenarios are usually a result of not fully understanding the relationship between the SAS products (SMC and DI Studio) and the Platform products (LSF and Process Manager), the security/account requirements of each, and how SAS and Platform interact with each other.  It is easier to “just get it to work”, which usually results in using an account with more power than is actually required.

Disclaimer: The steps I describe below are one method of many to achieve the desired result. If you are unsure of anything please consult a SAS specialist – do not make any changes to the SAS Metadata or Configuration unless you understand why you are making those changes and have a working backup.

Preparing the Platform Process Manager

One of the first things to check is the Authentication Domain of the Platform Process Manager in SMC.  Log into SMC as the SAS Administrator and locate the Platform Process Manager in the Server Manager plug-in. Right click on the Connection: Platform Process Manager object on the right and select Properties.

image

On the Options tab, check the Authentication Domain. If this specifies DefaultAuth or (None) then create a new Authentication Domain called LSFAuth by clicking the New… button. This will be the “Tag” used by SAS when trying to find a valid logon for the Platform Process Manager, and is the key for the “single signon” effect when giving users/developers the ability to schedule jobs from SMC.

image

Preparing a Scheduling Role

Previously access to the SMC was usually an “all or nothing” deal. The introduction of Roles within the SAS Metadata has made it easier to grant access to specific tasks within SMC. Now we can create a Role that will allow users to access the SMC and have access to the Schedule Manager. To create a new Role, log into SMC as the SAS Administrator or another user with the Metadata Server: User Administration Role or equivalent, and select New->Role from the User Manager plug-in.

image

image

Once this Role has been defined it can be given directly to individual Users or Groups.  From an Administration perspective, it makes life easier if Roles are assigned to Group(s). Users come and go, but Groups usually stick around!

Create a Scheduling Group

Next create a Scheduling Users Group in SMC that will provide members access to the above Role, as well as provide access to the designated LSF Batch User Id.

image

On the Members tab, add each User or Group that you would like to give Scheduling capabilities to.  In a Development environment this may be the SAS ETL Developers group or similar.

image

On the Groups and Roles tab, select the Management Console: Scheduling Role we defined previously.

image

On the Accounts tab, click the New button and then enter the LSF Batch User Id you want to use for this environment along with the password.  This LSF Batch User Id must exist as a Host/Domain account – this is a normal requirement of LSF.  As this will be used for “Outbound” authentication it is important that you enter the password. The Authentication Domain of the Account should match the Authentication Domain of the Platform Process Manager server in the Server Manager plug-in – LSFAuth if you are following along with these steps.

image

What a Scheduling User Experiences

I am not going to go through the core ETL development process, but we will assume that Job(s) have been created and Deployed from DI Studio with the next step being to create and schedule the Job Flow.  The Scheduling process is done from SMC, so the developer/scheduler will log into SMC using their normal developer account. In this example I am logging in using IWA (Integrated Windows Authentication) so that SAS automatically picks up my user id. Because I am a member of the SAS ETL Developers group, which is a member of the Scheduling Users group, which has the Management Console: Scheduling Role, I have access to the Schedule Manager plug-in in SMC.

image

I can now create a Job Flow and add my deployed DI Studio Job. In this example I have created a “TestFlow” Job Flow and added a “Test_Job” deployed job.

image

Choosing Schedule Flow… on my new Job Flow in SMC results in the following events happening behind the scenes:

  1. The associated Scheduling Server (Platform Process Manager) is checked in SAS Metadata to see what the Authentication Domain is. In this case the Authentication Domain is LSFAuth.
  2. SAS then checks who is currently logged in (Michael Dixon) and if their SAS Metadata Identity has an account for LSFAuth (No).
  3. As there is no LSFAuth account directly on the logged in Metadata Identity, the group memberships are checked until it finds the LSFAuth account on the Scheduling Users group.
  4. The user id and password for this LSFAuth account are then used by SAS to communicate with the scheduling server (this is why the password needs to be stored in Metadata).

Because of this there will be no password prompt and it should appear to “magically” work. One less username and password for people to remember!
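The four-step lookup above can be pictured with a small Python model. This is only a toy illustration of the behaviour described here, not the SAS Metadata API or its actual search order. The Identity class, the find_login function and the example password are all invented for the sketch.

```python
from collections import deque


class Identity:
    """A user or group in a toy metadata model (not the real SAS API)."""

    def __init__(self, name, accounts=None, member_of=None):
        self.name = name
        self.accounts = accounts or {}    # authentication domain -> (user id, password)
        self.member_of = member_of or []  # groups this identity belongs to


def find_login(identity, auth_domain):
    """Check the identity itself first, then its groups, nearest groups first."""
    queue = deque([identity])
    seen = set()
    while queue:
        current = queue.popleft()
        if current.name in seen:
            continue  # guard against circular group membership
        seen.add(current.name)
        if auth_domain in current.accounts:
            return current.accounts[auth_domain]
        queue.extend(current.member_of)
    return None  # no stored login: the user would be prompted instead


# Example mirroring the text; the password is a placeholder.
scheduling_users = Identity(
    "Scheduling Users", accounts={"LSFAuth": ("SCORPIO\\lsfdev", "example-password")}
)
etl_developers = Identity("SAS ETL Developers", member_of=[scheduling_users])
developer = Identity("Michael Dixon", member_of=[etl_developers])
login = find_login(developer, "LSFAuth")
```

In this model the developer has no LSFAuth account of their own, so the search walks up through SAS ETL Developers to Scheduling Users and returns the account stored there. An identity outside those groups gets None, which corresponds to the password prompt.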

If the above setup has not been done or a user that is not a member of the Scheduling Users group attempts to schedule a Job Flow then the following prompt, which you may already be familiar with, will appear:

image

Configuring the Platform Process Manager in this way allows for a more streamlined development process, with DI Studio developers able to schedule and test their own jobs under a standard LSF Batch User Id without compromising security.

Test and Production

Test and Production have the same configuration settings as above, except the LSFAuth account stored against the Scheduling Users group is changed appropriately. On our server we have the following Active Directory accounts for each environment:

Environment    Level    Account
Development    Lev3     SCORPIO\lsfdev
Test           Lev2     SCORPIO\lsftest
Production     Lev1     SCORPIO\lsfprod

Since there should not be any development work happening in Test (no SAS ETL Developers group), another appropriate group or individual user(s) would be added as Members of the Scheduling Users group in Test.

In Production there may be only one user with the role of maintaining schedules added as a Member of the Scheduling Users group, or another method, such as creating an internal account, may be utilised. There are many options/combinations that can be implemented based on individual site security requirements.

Thursday, April 7, 2011

Using SAS92HFADD behind a firewall


The SAS 9.2 Hot Fix Analysis, Download and Deployment tool available at http://ftp.sas.com/techsup/download/hotfix/HF2/SAS92HFADD.html is a fantastic step forward by SAS in helping SAS Administrators keep their system up to date.  Paul Homes from Metacoda has previously blogged about this tool and how (much fun it is) to get it working under Windows 2008 R2 with UAC enabled - http://platformadmin.com/blogs/paul/2011/01/sas92hfadd-viewregistry-windows-server-2008-r2/

I have used this tool on quite a few sites (on Windows, Unix and Linux) and once it is working it is great!  Unfortunately I estimate it has only worked “out of the box” for 1 in 10 installs, and the reason is usually a firewall. This blog post describes my workarounds to this situation…

The Problem

The SAS92HFADD tool uses the File Transfer Protocol (FTP) to download various files from the SAS FTP site. If the machine running this tool is behind a firewall (which should be the standard situation in a corporate environment) then "normal" FTP may not work. One method implemented within the file transfer protocol to allow communication through some firewall implementations is "passive" mode. There are some tweaks we can make to the files provided by SAS as part of this package to get it to use passive mode, but unfortunately the FTP client provided with the Windows operating system does not support passive mode (see http://support.microsoft.com/kb/271078). The Unix/Linux FTP clients I’ve seen will support passive mode without any problems.

Unix/Linux

Downloading the Unix version of the tool and untarring it will result in two files:
  • SAS92HFADD.pl
  • SAS92_hot_fix_data_ftp_download.sh
A common way to invoke FTP in passive mode on Unix/Linux is to use the "-p" switch. We can tweak the two files provided above to add in this “-p” switch and have the FTP transfers done in passive mode.

SAS92_hot_fix_data_ftp_download.sh

The 5th line of this file reads:
ftp -n $HOST <<-EOF
Add in the “-p” switch and save the file:
ftp -p -n $HOST <<-EOF

SAS92HFADD.pl

The 646th line of this file reads:
$ftpscript .= "ftp -n \$HOST <<-EOF\n";
Change this to:
$ftpscript .= "ftp -p -n \$HOST <<-EOF\n";
Under Unix/Linux these are all the changes that are required.  Following the provided instructions from SAS should now work.
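If you have to repeat this on several servers, both edits can be applied by a script rather than by hand. Below is a minimal Python sketch. The function name is made up, and it assumes that the only "ftp -n " occurrences in the two files are the invocations shown above.

```python
def add_passive_switch(source: str) -> str:
    """Insert -p into every 'ftp -n ' invocation in the given file text.

    Running it a second time changes nothing, because 'ftp -p -n '
    no longer contains the text 'ftp -n '.
    """
    return source.replace("ftp -n ", "ftp -p -n ")
```

Read each of SAS92_hot_fix_data_ftp_download.sh and SAS92HFADD.pl, pass the text through the function and write it back, keeping a copy of the originals in case a later version of the tool changes these lines.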

Windows

The Windows version of the tool requires a bit more work since the Perl code that we could have edited (as above with the Unix version) is packaged into an executable file along with a Perl runtime. The download is a self extracting archive which when run will provide three files:
  • SAS92HFADD.exe
  • SAS92_hot_fix_data_ftp_download.bat
  • SAS92_hot_fix_data_ftp_download_script.txt
Unfortunately we can't just start tweaking things because, as mentioned above, the FTP client provided with Windows does not support passive mode. There are a number of FTP-like command line tools that you can get for Windows (and in a later post I will explain how I used CURL to get the Windows tool to work through a "tough" firewall configuration where passive mode was not enough), but I highly recommend MOVEit Freely by Ipswitch File Transfer. This is a "drop-in" replacement for the Windows FTP client which supports passive mode. You can download MOVEit Freely from http://www.ipswitchft.com/Products/MoveitFreely/

Once you have downloaded MOVEit Freely, copy the "ftps.exe" file into the SAS92HFADD directory created by the tool and rename it to "ftp.exe".


Step 1 - Edit SAS92_hot_fix_data_ftp_download_script.txt

The fourth line of this file has the "binary" command and line five has the "get" command. Insert a line after line four with the "passive" command. I like to see that something is actually happening, so I also insert the "hash" command, which prints a hash mark (#) each time a certain number of bytes has been downloaded, giving you a rough progress indicator.
Here is my edited file:

open ftp.sas.com
anonymous
SAS92HFADD
binary
passive
hash
get [long path removed to fit in blog post]
quit
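The same insertion is needed again for the generated ftp_script.txt in Step 3, so it may be worth scripting. Here is a minimal Python sketch; the function name is invented for illustration, and it assumes the script contains a line consisting only of the "binary" command.

```python
def insert_after_binary(script: str, extra=("passive", "hash")) -> str:
    """Return the FTP script text with the extra commands added after 'binary'."""
    out = []
    inserted = False
    for line in script.splitlines():
        out.append(line)
        # Only act on the first "binary" line so a re-run of the loop
        # over later lines cannot insert the commands twice.
        if not inserted and line.strip().lower() == "binary":
            out.extend(extra)
            inserted = True
    return "\n".join(out) + "\n"
```

Pass extra=("passive",) if you do not want the hash marks printed during the download.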

Step 2 - Run the SAS92HFADD.exe file

Running this file will call the batch file, which runs the script we just edited and downloads an XML file from SAS listing all the latest hot fixes. The tool then parses this file, compares it to the DeploymentRegistry.txt file you created (as part of the standard steps of this tool) and generates the Analysis Report, Download scripts and Deployment scripts.


Step 3 - Modify the generated download batch and script files

The tool dynamically creates a DownloadTools subdirectory under a time stamped parent directory within the SAS92HFADD directory. Depending on whether you want all hot fixes or just Alert hot fixes you then need to edit the appropriate batch and script files. I will edit the "all hot fixes" files, but the changes are the same for the Alert only files.

ftp_script.bat

This file needs to use the MOVEit Freely FTP client, so you can either copy our new "ftp.exe" into this directory or edit the batch file to use the one we already have in the SAS92HFADD directory. I edited the batch file to use my existing copy (i.e. prefix the ftp command with "..\..\"):
..\..\ftp -s:ftp_script.txt

ftp_script.txt

As with the previous text file edit, we need to insert the "passive" command, plus the "hash" command if you like to see the progress.
Here is the top of my ftp_Script.txt file after the edits:
open ftp.sas.com
anonymous
SAS92HFADD
binary
passive
hash
get [long path removed to fit in blog post]


Step 4 - Run the ftp_script.bat file

From here it should be back to the "normal" instructions.