.major
.clear/window=1
.title/start=1,center/bold Applications of VAX SCAN

.text/start=4,2
VAX SCAN is a language for writing programs that process text or text symbols.

Because of this emphasis, SCAN is often called the ~b tool building language~.
SCAN is an excellent choice for building tools that:

        o remove unwanted text from files
        o translate text from one form to another
        o extract and analyze text
        o parse text

SCAN is an excellent choice because it leads to efficient solutions.  This
efficiency is twofold.  First, the SCAN language is very high level, so
the time to develop a tool is dramatically reduced.  Second, SCAN programs
run efficiently.  The SCAN compiler produces the same high-quality code as
VAX Ada, VAX PL/I and VAX C.

In this demo, you will examine SCAN solutions to several problems.
.end_text
.end_frame

.clear/window=1
.major
.title/start=1,center/bold Filter Application
.text/start=4,2
Problem:
 You have a series of tests you run to test one of your software products.
 The result of each test is a log file that you compare against the correct
 log file to test if the software is still functioning properly.  This test
 system is run on different machines, which results in different disks,
 directories and versions appearing in file specs in the log files from run
 to run.  These cause differences in the comparison to the correct results
 even when the test run is correct.

Solution:
 Create a SCAN application that will look for file specs in a log file and
 transform the disk, directory and version to a standard form.  For example:

        input:     DISK$USER01:[JOHN.WORK.TEST]TEST1.SCN;5

        output:    disk:[directory]TEST1.SCN;*
.end_text
.end_frame

.text/start=4,2
        SET digit     ( '0'..'9' );
        SET spec_char ( 'a'..'z' OR 'A'..'Z' OR digit OR '$' OR '_' OR '*' );

        TOKEN version           { { digit | '*' }... };
        TOKEN spec_field        { spec_char... };
        TOKEN colon ALIAS ':'   { ':' };
        TOKEN lb    ALIAS '['   { '[' };  TOKEN rb    ALIAS ']'   { ']' };
        TOKEN dot   ALIAS '.'   { '.' };  TOKEN semi  ALIAS ';'   { ';' };
.end_text
.clear/start=12,1
.text/start=13,2
These tokens describe the components of a VMS file spec.  The next step is
to arrange them in a macro picture to match a file spec.
.end_text
.end_frame

.text/start=4,2
        MACRO find_file_spec TRIGGER
            {   [ spec_field ':' ]                  ! disk is optional
                [ '[' [spec_field / '.'] ']' ]      ! directory is optional
              f:{ spec_field '.' spec_field }       ! file is required
                [ ';' version ]};                   ! version is optional

            ANSWER 'disk:[directory]', f, ';*';
        END MACRO;
.end_text
.clear/start=12,1
.text/start=13,2
This macro will locate file specs.  The body of the macro then replaces the
matched file spec with the standard form.

.end_text
.end_frame

.text/start=4,2
        PROCEDURE main_routine MAIN;

            START SCAN
                INPUT FILE  'log$file'            ! use a logical name to
                OUTPUT FILE 'log$file';           !   define the log file

        END PROCEDURE;
.end_text
.clear/start=11,1
.text/start=13,2
This main procedure completes the program.  The START SCAN statement begins
the picture-matching process.  A logical name ~u log$file~ is used to tell
the program which log file to process.
.end_text
.end_frame

.clear/start=4,1
.box/start=5,2/end=11,39/bold
.box/start=5,41/end=11,78/bold
.label/start=4,14 Input Stream
.label/start=4,53 Output Stream

.label/start=6,3 $RUN VALIDATE
.label/start=7,3 SCAN$DISK:[RESULT]MAIN.DAT OKAY
.label/start=8,3 SCAN$DISK:[RESULT]A.DAT OKAY
.label/start=9,3 SCAN$DISK:[RESULT]B.DAT OKAY
.label/start=10,3 SCAN$DISK:[RESULT]C.DAT OKAY
.end_frame
.label/start=6,42 $RUN VALIDATE
.label/start=7,42 ~b disk:[directory]~MAIN.DAT;~b *~ OKAY
.label/start=8,42 ~b disk:[directory]~A.DAT;~b *~ OKAY
.label/start=9,42 ~b disk:[directory]~B.DAT;~b *~ OKAY
.label/start=10,42 ~b disk:[directory]~C.DAT;~b *~ OKAY
.end_frame

.text/start=13,2
Examining this program you find 17 statements including the MODULE and END
MODULE statements.  It took less than 15 minutes to create, compile, link
and debug.  Written in PASCAL you'd still be designing or in the editor.

The program is extensible.  Adding another macro to filter times and another
to filter dates does not require major restructuring, just a few more tokens
and macros.

.end_text
.end_frame

.text/start=13,2
Finally, the program runs with good speed.  A 1000-line log file with a
transformation on each line ran in 22 seconds elapsed time, 16 seconds CPU
time.  The program was run on a VAX 780 with 8 people logged on at the time.

SCAN programs are often quickly designed and implemented.  At the same time,
you still have the power and performance of a modern compiled language.

.end_text
.clear/start=19,1
.end_frame

.major
.clear/window=1
.title/start=1,center/bold Translator Application

.text/start=4,2
Problem:
 The syntax of VAX SCAN changed between an internal release and V1.0.  The
 changes included:

    o Using { } rather than ( ) for delimiters in macro pictures and tokens
    o Requiring that files be declared explicitly
    o Changing the form of TREEPTR declarations
    o Changing the form of I/O statements
    o Requiring either SYNTAX or TRIGGER on a macro

 It was decided that a conversion aid for existing internal users was needed.

Solution:
 Write a SCAN application that accepts old programs and converts them to the
 V1.0 format.

 The first two bullets are the more interesting problems.  Discussion of
 their solution follows.
.end_text
.end_frame

.clear/start=4,1
.text/start=4,2
    TOKEN id CASELESS           { alpha [ alpha | other ]... };
    TOKEN str                   { quote [ non_quote |
                                  quote quote ]... quote };
    TOKEN spaces IGNORE         { {' ' | s'ht'}... };
    TOKEN com1 IGNORE           { '!' non_eol... };
    TOKEN com2 IGNORE           { '/*' [ non_star... | '*' non_slash ]...
                                  { s'eol' | '*/' } };
.end_text
.text/start=13,2
The first step in the solution is to define tokens for the lexical elements
of the SCAN language.  This is necessary so that you look at the source the
way the SCAN compiler does.  In particular, you define what a string and a
comment look like.  You don't want the translator changing the contents of
either of these tokens.

These are not all the tokens in the translator.  There are also tokens for
some of the important keywords and punctuation marks, such as ( = and ;.
.end_text
.end_frame

.text/start=13,2
Changing parentheses to braces is an interesting problem because you only want
to do it in macro pictures and token definitions, not in sets, groups,
expressions, etc.  The solution chosen is to match each parenthesis, but only
replace it with a brace if you are in a TOKEN or MACRO statement.  This
requires some extra macros to detect when you are in one of these statements
and set some global state variable.

.end_text
.clear/start=19,1
.end_frame
.text/start=4,2
MACRO macro_stmt TRIGGER { x:{ macro_key id } };  ! id important to avoid
    depth = depth + 1;                            ! matching END MACRO;
    ANSWER x;                                     ! replace matched text
END MACRO /* macro_stmt */;

MACRO token_stmt TRIGGER { x:{ token_key id } };  ! start of TOKEN
    depth = depth + 1;                            ! increment global variable
    ANSWER x;
END MACRO /* token_stmt */;

MACRO semicolon TRIGGER { ';' };                  ! end of MACRO and TOKEN
    IF depth > 0                                  ! decrement global variable
    THEN                                          !   if in a MACRO or TOKEN
        depth = depth - 1;                        !   statement
    END IF;
    ANSWER ';';
END MACRO /* semicolon */;
.end_text
.clear/start=21,1
.end_frame

.text/start=4,2
MACRO left_paren TRIGGER { '(' };
    IF depth > 0                         ! replace ( with { only if global
    THEN                                 !   variable indicates you are in a
        ANSWER '{';                      !   MACRO or TOKEN statement
    ELSE
        ANSWER '(';
    END IF;
END MACRO /* left_paren */;

MACRO right_paren TRIGGER { ')' };
    IF depth > 0
    THEN
        ANSWER '}';
    ELSE
        ANSWER ')';
    END IF;
END MACRO /* right_paren */;
.end_text
.clear/start=21,1
.end_frame

.clear/start=4,2
.text/start=10,2
Adding declarations for files is an interesting problem because SCAN requires
that the declaration of a file precede its first use.  Thus, you need to
insert a file declaration earlier in the input stream than the point where
you discover that you need it.

The solution is to use two passes.  Macros are created to match the statements
that implicitly create file declarations in the internal release (READ, WRITE,
OPEN, CLOSE, and ENDFILE).  These macros save the names of the implicitly
declared files in a tree.  If the tree is not empty at the end of the scan,
another scan is performed to insert the file declarations following the
MODULE statement.

.end_text
.end_frame

.text/start=4,2
DECLARE files : TREE( STRING ) OF BOOLEAN;        ! tree to hold the files

MACRO open_statement TRIGGER { k:open_key i: id };
    ANSWER k, ' FILE( ', i, ' )';                 ! new syntax for OPEN
    files( lower( i ) ) = true;                   ! add file to tree
END MACRO;

START SCAN                                        ! part of main procedure
    INPUT FILE full_file_name  OUTPUT FILE full_file_name;

IF first( files ) <> NIL                          ! second scan if tree
THEN                                              !   not empty
    START SCAN
        INPUT FILE full_file_name
        OUTPUT FILE full_file_name;
    PRUNE files;                                  ! empty the tree
END IF;

.end_text
.end_frame

.clear/start=4,1
.box/start=5,2/end=11,39/bold
.box/start=5,41/end=11,78/bold
.label/start=4,14 Input Stream
.label/start=4,53 Output Stream

.label/start=6,3 MODULE test;
.label/start=7,5 TOKEN a (( 'A' | 'a' )...);
.label/start=8,5 CLOSE x;
.label/start=9,3 END MODULE;
.end_frame
.label/start=6,42 MODULE test;
.label/start=7,42/bold  DECLARE x: COMMON FILE;
.label/start=8,44 TOKEN a ~b{{~ 'A' | 'a' ~b}~...~b}~;
.label/start=9,44 CLOSE ~bFILE(~ x ~b)~;
.label/start=10,42 END MODULE;
.end_frame
.text/start=13,2
This program is 244 lines of SCAN that is not very densely packed (over 60
lines are not code).  It took an afternoon to write and debug.

The program performs all the translations covered in the initial problem
description plus a few more.  In addition, it prompts for the file spec to
be processed, which can contain wildcards.  All files matching the spec are
processed.
.end_text
.end_frame

.major
.clear/window=1
.title/start=1,center/bold Extractor Application
.text/start=4,2
Problem:
 You have many applications written in VAX BASIC.  You just bought the VAX
 ~b C~ommon ~b D~ata ~b D~ictionary and you would like to store many of the records in
 the BASIC applications in the CDD.  Your problem is that the form for
 entering records into the CDD is CDDL, not BASIC.  Converting 200 records
 by hand will be a time-consuming and error-prone task.

Solution:
 Write a SCAN application that will find record definitions in BASIC programs,
 extract the meaning of the BASIC record, and write out the equivalent CDD
 record.

.end_text
.end_frame
.text/start=17,2
 The actual solution has one trigger macro that fires when it encounters a
 RECORD statement.  A half dozen syntax macros are used to describe the
 syntax of a record declaration, gather the meaning, and write the equivalent
 CDD record back out.
.end_text
.end_frame

.clear/start=4,1
.text/start=4,2
    TOKEN comment IGNORE    { '!' non_eol... };
    TOKEN line_no IGNORE    { sol digit... };
    TOKEN continue IGNORE   { '&' sol };
    TOKEN space IGNORE      { { ' ' | s'ht' }... | s'vt' | s'eol' };
    TOKEN integer           { digit... };
    TOKEN str1              { '"' [ non_double | '""' ]... '"' };
    TOKEN str2              { quote [ non_quote | quote quote ]... quote };
    TOKEN id                { alpha [ alpha | digit | other ]... };
    TOKEN lp ALIAS '('      { '(' };
    TOKEN rp ALIAS ')'      { ')' };
    TOKEN comma ALIAS ','   { ',' };
    TOKEN equals ALIAS '='  { '=' };
.end_text
.text/start=16,2

You start as you did in the translator application by creating tokens for the
rudimentary patterns in a BASIC program.  This ensures that you see the program
as the BASIC compiler would and avoid finding records inside comments and
strings.
.end_text
.end_frame

.clear/start=4,1
.text/start=4,2
    MACRO record_statement TRIGGER
        { record_key r: id component... end_key record_key [ id ]};

    MACRO component SYNTAX { { element \ ',' }
                             | group_dcl };

    MACRO element SYNTAX   { [ data_type ] item };

    MACRO item SYNTAX      { n:id subs_dcl [ '=' l:integer ] };

    MACRO subs_dcl SYNTAX  { [ '(' { s:integer \ ',' } ')' ] };
.end_text
.text/start=16,2
The first macro triggers when a record declaration is encountered.  The syntax
of the record declaration is laid out much like the syntax diagrams you find
in Digital manuals.

             RECORD id component... END RECORD [ id ]
.end_text
.end_frame
.text/start=16,2
~u Component~ is described in more detail in a separate syntax macro.  Here you
see four syntax macros that describe much of the syntax of the record
declaration.  Several others such as ~u data_type~ are omitted due to the limited
amount of space on the screen.  However, the concept of subdividing a complex
pattern into several simpler ones is illustrated.

.end_text
.clear/start=21,1
.end_frame

.text/start=16,2
The macro pictures contain picture variables, such as ~u r~ in ~u record_statement~
and ~u s~ in ~u subs_dcl~, to gather the information needed to construct the equivalent
CDD record.

.end_text
.clear/start=19,1
.end_frame

.clear/start=4,1
.text/start=4,2
An extractor application is ~u not~ usually interested in the output stream.
This application is an example, since you are not interested in the balance
of the BASIC program, just the record declarations.  Thus, you assign the
output stream to the file NL: (the null device) and use SCAN's standard
I/O statements (OPEN, CLOSE, and WRITE) to create a file containing the CDD
record.

The solution uses a linked list of records to represent the CDD record.  Each
record in the list represents a line of text that you wish to write to the file
holding the CDD record definition.  Some of the syntax macros gather
information about the record declaration in global variables; other macros
use this information to construct a line for a part of the declaration and
append it to the linked list.  Once the entire BASIC record is encountered,
the contents of the linked list are written to the file.
.end_text
.end_frame

.clear/start=4,1
.text/start=4,2
   MACRO record_statement TRIGGER
       { record_key r: id ~b init_record~ component... end_key record_key [ id ]};

       MACRO init_record SYNTAX ~b{ }~;

           avail_ptr = first_ptr;              ! variables to keep track
           last_ptr = NIL;                     !   of the records in use and
           first_ptr = NIL;                    !   available for use
           CALL allocate_line( lower( r ) & ' STRUCTURE.' );

       END MACRO;
.end_text
.text/start=16,2
One situation you run into with applications that recognize complex patterns
is that the body of a macro is executed only after its entire pattern is
matched.  You need to perform initialization actions before the bodies of the
syntax macros run.  A syntax macro with a null pattern, such as
~u init_record~, is one solution.  This macro always succeeds in matching the
input stream since it has nothing to match, so the statements in its body can
perform the initialization.
.end_text
.end_frame

.clear/window=1/start=4,1
.text/start=4,2
When you take a BASIC program and run it through this application...
.end_text
.box/start=7,2/end=20,38/bold
.box/start=7,42/end=20,78/bold
.label/start=6,3 A BASIC program segment

.label/start=8,3 ...
.label/start=9,3 10     RECORD employee
.label/start=10,12 GROUP employee_name
.label/start=11,14 STRING last,first = 20

.label/start=13,12 END GROUP
.label/start=14,12 BYTE age
.label/start=15,12 DOUBLE salary
.label/start=16,12 STRING children( 10 ) = 20

.label/start=18,10 END RECORD employee
.label/start=19,3 ...
.end_frame

.text/start=4,2
The equivalent CDD record emerges in a file.
.end_text
.label/start=6,43 Created CDD equivalent record

.label/start=9,43/bold employee STRUCTURE.
.label/start=10,45/bold employee_name STRUCTURE.
.label/start=11,47/bold last DATATYPE TEXT SIZE 20.
.label/start=12,47/bold first DATATYPE TEXT SIZE 20.
.label/start=13,45/bold END employee_name STRUCTURE.
.label/start=14,45/bold age DATATYPE SIGNED BYTE.
.label/start=15,45/bold salary DATATYPE D_FLOATING.
.label/start=16,45/bold children DATATYPE TEXT SIZE 20
.label/start=17,54/bold ARRAY 0:10.
.label/start=18,43/bold END employee STRUCTURE.
.end_frame

.clear/start=4,1
.text/start=4,2
How large is this SCAN application?

The application is 270 lines in length.  Approximately 25% of the lines are
blank or commentary.  A rough breakdown of the lines is as follows:

         25 token definitions
         9 macros
         2 procedures

The application took less than a day to write and debug.
.end_text

.label/start=20,center/reverse End of Applications
.end_frame
.end_script