[sas-users] Comparing the actual dataset layout to the prescribed dataset layout

suzanne.marie.dorinski at census.gov
Thu Aug 25 16:23:18 EDT 2011


You could use PROC SQL to get the information from SASHELP.VCOLUMN, which
might be more efficient than PROC CONTENTS.  If the position of the
variables in the data set is important, you could also use the VARNUM
column to check whether the variables in your data set are in the same
relative order as in the layout.

If the earthquake hadn't so rudely interrupted me, I was going to present
an example that grabs information from SASHELP.VCOLUMN instead of PROC
CONTENTS.
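
A minimal sketch of such a query, assuming the data set lives in the WORK
library (the thread only calls it "dataset", so the library name is an
assumption). LIBNAME and MEMNAME values in SASHELP.VCOLUMN are stored in
uppercase, so the literals in the WHERE clause are uppercase too:

```sas
* Assumption: the data set to check is WORK.DATASET;
proc sql;
   create table contents as
   select name, varnum
   from sashelp.vcolumn
   where libname = 'WORK' and memname = 'DATASET'
   order by name;
quit;
```

The resulting table is sorted by name, so it could stand in for contents2 in
the merge step, and it keeps VARNUM in case the variable order needs to be
checked as well.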

You might want to look at
http://www2.sas.com/proceedings/sugi29/237-29.pdf, which is "Dictionary
Tables and Views:  Essential Tools for Serious Applications", a SUGI paper
by Frank DiIorio and Jeff Abolafia from 2004.

Suzanne






                                                                                                                                       
From:     b.nathaniel.miller at census.gov
To:       sas-users list for SAS Users Group <sas-users at lists.census.gov>
Date:     08/25/2011 02:31 PM
Subject:  Re: [sas-users] Comparing the actual dataset layout to the prescribed dataset layout
Sent by:  sas-users-bounces at lists.census.gov






You can put the (keep=name) data set option right after out=contents in your
PROC CONTENTS step, so you don't need an additional DATA step for that:
proc contents data=dataset NODETAILS out=contents (keep=name);
run;

Also, the values you specified for GETNAMES, MIXED, SCANTEXT, USEDATE, and
SCANTIME are all the default values. At this point, it wouldn't do a lot
for your efficiency, but if you type up similar programs in the future,
omitting them would save you some typing.
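
As a sketch, the import step with those statements left out would look like
the following. It relies on the note above that the omitted values are the
defaults; if any of them is not the default in your SAS release, keep that
statement. The file path is elided exactly as in the original post.

```sas
* Same import as in the original post, minus GETNAMES, MIXED, SCANTEXT,
  USEDATE, and SCANTIME;
proc import out=spec_layout
            datafile="C:\...\spec.xls"
            dbms=excel replace;
     range="Sheet1$";
run;
```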

Have fun!
Nate


                                                                           
From:     josue.delarosa at census.gov
To:       sas-users at lists.census.gov
Date:     08/25/2011 01:44 PM
Subject:  [sas-users] Comparing the actual dataset layout to the prescribed dataset layout
Sent by:  sas-users-bounces at lists.census.gov
                                                                           






When verifying files I used to print out a proc contents report and compare
it to the layout of a file which was prescribed in a specification.
Instead of reviewing the variables by pencil and paper I’ve been
experimenting with the following program.  Any feedback on how to make this
program more efficient would be appreciated.

Thanks,

Josh

*output contents to dataset;
proc contents data=dataset NODETAILS out=contents;
run;

*keep only var names;
data contents2 (keep=name);
set contents;
run;

*import layout of file from specification, assumes layout is already in
excel form;
PROC IMPORT OUT=spec_layout
            DATAFILE= "C:\...\spec.xls"
            DBMS=EXCEL REPLACE;
     RANGE="Sheet1$";
     GETNAMES=YES;
     MIXED=NO;
     SCANTEXT=YES;
     USEDATE=YES;
     SCANTIME=YES;
RUN;


proc sort data=spec_layout;
by name;
run;

*merge files and look for vars in layout and not in dataset & vice versa;
data not_in_spec not_in_contents ;
merge spec_layout  (in=s) contents2 (in=c);
by name;
if s=1 and c=0 then output not_in_contents;
else if s=0 and c=1 then output not_in_spec;
run;

_______________________________________________
sas-users mailing list
sas-users at lists.census.gov
http://lists.census.gov/mailman/listinfo/sas-users




