.\" ====================================================================
.\"  @Troff-man-file{
.\"     author          = "Nelson H. F. Beebe",
.\"     version         = "2.05",
.\"     date            = "17 November 1992",
.\"     time            = "14:36:48 MST",
.\"     filename        = "bibclean.man",
.\"     address         = "Center for Scientific Computing
.\"                        Department of Mathematics
.\"                        University of Utah
.\"                        Salt Lake City, UT 84112
.\"                        USA
.\"                        Tel: +1 801 581 5254
.\"                        FAX: +1 801 581 4148",
.\"     checksum        = "44077 1133 4708 31441",
.\"     email           = "beebe@math.utah.edu (Internet)",
.\"     codetable       = "ISO/ASCII",
.\"     keywords        = "bibliography, BibTeX, prettyprint",
.\"     supported       = "yes",
.\"     docstring       = "This file is the UNIX nroff/troff manual page
.\"                        documentation for bibclean, a prettyprinter
.\"                        and syntax checker for BibTeX bibliography
.\"                        data base files.
.\"
.\"                        The checksum field above contains a CRC-16
.\"                        checksum as the first value, followed by the
.\"                        equivalent of the standard UNIX wc (word
.\"                        count) utility output of lines, words, and
.\"                        characters.  This is produced by Robert
.\"                        Solovay's checksum utility.",
.\"  }
.\" ====================================================================
.if t .ds Bi B\s-2IB\s+2T\\h'-0.1667m'\\v'0.20v'E\\v'-0.20v'\\h'-0.125m'X
.if n .ds Bi BibTeX
.\"
.if t .ds Sc S\s-2CRIBE\s+2
.if n .ds Sc Scribe
.\"
.if t .ds Te T\\h'-0.1667m'\\v'0.20v'E\\v'-0.20v'\\h'-0.125m'X
.if n .ds Te TeX
.\"
.\"=====================================================================
.TH BIBCLEAN 1 "17 November 1992" "Version 2.05"
.\"=====================================================================
.SH NAME
bibclean \- prettyprint and syntax check BibTeX and Scribe bibliography \
data base files
.\"=====================================================================
.SH SYNOPSIS
.B bibclean
[
.B \-author
]
[
.BI \-error-log " filename"
]
.if t .ti +.5i
.if n .ti +9n
[
.B \-help
]
[
.B '\-?'
]
[
.BI \-init-file " filename"
]
.if n .ti +9n
[
.B \-[no-]check-values
]
.if n .ti +9n
[
.B \-[no-]delete-empty-fields
]
.if t .ti +.5i
.if n .ti +9n
[
.B \-[no-]file-position
]
.if n .ti +9n
[
.B \-[no-]fix-initials
]
[
.B \-[no-]fix-names
]
.if n .ti +9n
.if t .ti +.5i
[
.B \-[no-]par-breaks
]
[
.B \-[no-]print-patterns
]
.if n .ti +9n
[
.B \-[no-]read-init-files
]
.if t .ti +.5i
.if n .ti +9n
[
.B \-[no-]remove-OPT-prefixes
]
.if n .ti +9n
[
.B \-[no-]scribe
]
[
.B \-[no-]trace-file-opening
]
.if n .ti +9n
.if t .ti +.5i
[
.B \-[no-]warnings
]
[
.B \-version
]
.if n .ti +9n
.if t .ti +.5i
.IR "<infile" " or " " bibfile1 bibfile2 bibfile3 .\|.\|."
.if n .ti +9n
.if t .ti +.5i
.I ">outfile"
.PP
All options can be abbreviated to a unique leading
prefix.
.PP
An explicit file name of ``\-'' represents
standard input; it is assumed if no input files
are specified.
.PP
On VAX VMS and IBM PC DOS, the leading ``\-'' on
option names may be replaced by a slash, ``/'';
however, the ``\-'' option prefix is always
recognized.
.\"=====================================================================
.SH DESCRIPTION
.B bibclean
prettyprints input \*(Bi\& files to
.IR stdout ,
and checks the brace balance and bibliography
entry syntax as well.  It can be used to detect
problems in \*(Bi\& files that sometimes confuse
even \*(Bi\& itself, and importantly, can be used
to normalize the appearance of collections
of \*(Bi\& files.
.PP
Here is a summary of the formatting actions:
.TP \w'\(bu'u+2n
\(bu
\*(Bi\& items are formatted into a consistent
structure with one \fIkey = "value"\fP pair per
line, and the initial @ and trailing right brace
in column 1.
.TP
\(bu
Tabs are expanded into blank strings; their use is
discouraged because they inhibit portability, and
can suffer corruption in electronic mail.
.TP
\(bu
Long string values are split at a blank and
continued onto the next line with leading
indentation.
.TP
\(bu
A single blank line separates adjacent
bibliography entries.
.TP
\(bu
Text outside \*(Bi\& entries is passed through
verbatim.
.TP
\(bu
Outer parentheses around entries are
converted to braces.
.TP
\(bu
Personal names in
.I author
and
.I editor
field values are normalized to the form ``P. D. Q.
Bach'', from ``P.D.Q. Bach'' and ``Bach, P.D.Q.''.
.TP
\(bu
Hyphen sequences in page numbers are converted to
en-dashes.
.TP
\(bu
Month values are converted to standard \*(Bi\&
string abbreviations.
.TP
\(bu
In titles, sequences of upper-case characters at
brace level zero are braced to protect them from
being converted to lower-case letters by some
bibliography styles.
.TP
\(bu
ISBN (International Standard Book Number) and ISSN
(International Standard Serial Number) entry
values are examined to verify the checksums of
each listed number.
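.PP
As an illustration of the checksum test in the
last item, here is a minimal Python sketch of the
standard modulo-11 check for ten-digit ISBNs and
eight-digit ISSNs.  It is not taken from the
.B bibclean
source code; the function names are hypothetical,
and hyphens and spaces are assumed to be the only
separators.
.PP
.nf \fC
```python
DIGITS = "0123456789"


def _weighted_sum_ok(chars, length):
    """Return True if chars pass the modulo-11 weighted checksum."""
    if len(chars) != length:
        return False
    total = 0
    for position, ch in enumerate(chars):
        if ch in DIGITS:
            value = int(ch)
        elif ch in "Xx" and position == length - 1:
            value = 10      # X stands for 10, only as the check digit
        else:
            return False
        total += (length - position) * value    # weights length..1
    return total % 11 == 0


def isbn10_ok(text):
    """Check a ten-digit ISBN such as 0-201-13448-9."""
    return _weighted_sum_ok([c for c in text if c not in "- "], 10)


def issn_ok(text):
    """Check an eight-digit ISSN such as 0378-5955."""
    return _weighted_sum_ok([c for c in text if c not in "- "], 8)
```
.fi \fP
.PP
Because 11 is prime and every weight is nonzero
modulo 11, changing any single digit of a valid
number makes this check fail.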
.PP
The standardized format of the output of
.B bibclean
facilitates the later application of simple
filters, such as
.BR bibextract (1),
.BR bibindex (1),
.BR biblook (1),
.BR bibsort (1),
.BR citefind (1),
and
.BR citetags (1),
to process the text, and also is the one expected
by the GNU Emacs \*(Bi\& support functions.
.\"=====================================================================
.SH OPTIONS
Command-line switches may be abbreviated to a
unique leading prefix, and letter case is
.I not
significant.  All options are parsed before any
input bibliography files are read, no matter what
their order on the command line.  Options that
correspond to a yes/no setting of a flag have a
form with a prefix "no-" to set the flag to
.IR no .
For such options, the last setting determines the
flag value used.  This is significant when options
are also specified in initialization files (see
the
.B "INITIALIZATION FILES"
manual section).
.\"-----------------------------------------------
.TP \w'\-[no-]remove-OPT-prefixes'u+3n
.B \-author
Display an author credit on
.IR stderr .
Sometimes an executable program is separated from
its documentation and source code; this option
provides a way to recover from that.
.\"-----------------------------------------------
.TP
.BI \-error-log " filename"
Redirect
.I stderr
to the indicated file, which will then contain all
of the error and warning messages.  This option is
provided for those systems that have difficulty
redirecting
.IR stderr .
.\"-----------------------------------------------
.TP
.BR \-help " or " \-?
Display a help message on
.IR stderr ,
giving a usage description, similar to this
section of the manual pages.
.\"-----------------------------------------------
.TP
.BI \-init-file " filename"
Provide an explicit value pattern initialization
file.  It will be processed
.I after
any system-wide and job-wide initialization files
found on the
.B PATH
(for VAX VMS,
.BR SYS$SYSTEM )
and
.B BIBINPUTS
search paths, respectively, and may override them.
It in turn may be overridden by a subsequent
file-specific initialization file.  The
initialization file name can be changed at compile
time, but defaults to
.I .bibcleanrc
on UNIX, and to
.I bibclean.ini
elsewhere.
For further details, see the
.B "INITIALIZATION FILES"
manual section.
.\"-----------------------------------------------
.TP
.B \-[no-]check-values
With the positive form, apply heuristic pattern
matching to value fields in order to detect
possible errors (e.g. ``\fIyear = "192"\fP''
instead of ``\fIyear = "1992"\fP''), and issue
warnings when unexpected patterns are found.
.IP
This checking is usually beneficial, but if it
produces too many bogus warnings for a particular
bibliography file, you can disable it with the
negative form of this option.
.RI "Default: " yes .
.\"-----------------------------------------------
.TP
.B \-[no-]delete-empty-fields
With the positive form, remove all key/value pairs
for which the value is an empty string.  This is
helpful in cleaning up bibliographies generated
from text editor templates.  Compare this option
with
.B \-[no-]remove-OPT-prefixes
described below.
.RI "Default: " no .
.\"-----------------------------------------------
.TP
.B \-[no-]file-position
With the positive form, give detailed file
position information in warning and error
messages.
.RI "Default: " no .
.\"-----------------------------------------------
.TP
.B \-[no-]fix-initials
With the positive form, insert a space after a
period following author initials.
.RI "Default: " yes .
.\"-----------------------------------------------
.TP
.B \-[no-]fix-names
With the positive form, reorder
.I author
and
.I editor
name lists to remove commas at brace level zero,
placing first names or initials before last names.
.RI "Default: " yes .
.\"-----------------------------------------------
.TP
.B \-[no-]par-breaks
With the negative form, a paragraph break (either
a formfeed, or a line containing only spaces) is
not permitted in value strings, or between
key/value pairs.  This may be useful to quickly
trap runaway strings arising from mismatched
delimiters.
.RI "Default: " yes .
.\"-----------------------------------------------
.TP
.B \-[no-]print-patterns
With the positive form, print the value patterns
read from initialization files as they are added
to internal tables.  Use this option to check
newly-added patterns, or to see what patterns are
being used.
.IP
When
.B bibclean
is compiled with native pattern-matching code (the
default), these patterns are the ones that will be
used in checking value strings for valid syntax,
and all of them are specified in initialization
files, rather than hard-coded into the program.
For further details, see the
.B "INITIALIZATION FILES"
manual section.
.RI "Default: " no .
.\"-----------------------------------------------
.TP
.B \-[no-]read-init-files
With the negative form, suppress loading of
system-, user-, and file-specific initialization
files.  Initializations will come
.I only
from those files explicitly given by
.BI \-init-file " filename"
options.
.RI "Default: " yes .
.\"-----------------------------------------------
.TP
.B \-[no-]remove-OPT-prefixes
With the positive form, remove the ``OPT'' prefix
from each key name where the corresponding value
is
.I not
an empty string.  The prefix ``OPT'' must be
entirely in upper-case to be recognized.  This
option is for bibliographies generated with the
help of the GNU Emacs \*(Bi\& editing support,
which generates templates with optional fields
identified by the ``OPT'' prefix.  Although the
function
.I M-x bibtex-remove-OPT
normally bound to the keystrokes
.I C-c C-o
does the job, users often forget, with the result
that \*(Bi\& does not recognize the key name, and
ignores the value string.  Compare this option
with
.B \-[no-]delete-empty-fields
described above.
.RI "Default: " no .
.\"-----------------------------------------------
.TP
.B \-[no-]scribe
With the positive form, accept input syntax
conforming to the \*(Sc\& document system.  The
output will be converted to conform to \*(Bi\&
syntax.  See the
.B "SCRIBE BIBLIOGRAPHY FORMAT"
manual section for further details.
.RI "Default: " no .
.\"-----------------------------------------------
.TP
.B \-[no-]trace-file-opening
With the positive form, log in the error log file
the names of all files which
.B bibclean
attempts to open.  Use this option to identify
where initialization files are located.
.\"-----------------------------------------------
.TP
.B \-[no-]warnings
With the positive form, allow all warning
messages.  The negative form is
.I not
recommended since it may mask problems that should
be repaired.
.RI "Default: " yes .
.\"-----------------------------------------------
.TP
.B \-version
Display the program version number on
.IR stderr .
This will also include an indication of who
compiled the program, the host name on which it
was compiled, the time of compilation, and the
type of string value matching code selected, when
that information is available to the compiler.
.\"=====================================================================
.SH "ERROR RECOVERY AND WARNINGS"
When
.B bibclean
detects an error, it issues an error message to
both
.I stderr
and
.IR stdout .
That way, the user is clearly notified, and the
output bibliography also contains the message at
the point of error.
.PP
Error messages begin with a distinctive pair of
queries, ??, beginning in column 1, followed by
the input file name and line number.  If the
.B \-file-position
option was specified, they also contain the input
and output positions of the current file, entry,
and value.  Each position includes the file byte
number, the line number, and the column number.
In the event of a runaway string argument, the
entry and value positions should precisely
pinpoint the erroneous bibliography entry, and the
file positions will indicate where it was
detected, which may be rather later in the files.
.PP
Warning messages identify possible problems, and
are therefore sent only to
.IR stderr ,
and not to
.IR stdout ,
so they never appear in the output file.  They are
identified by a distinctive pair of percents, %%,
beginning in column 1, and as with error messages,
may be followed by file position messages if the
.B \-file-position
option was specified.
.PP
For convenience, the first line of each error and
warning message sent to
.I stderr
is formatted according to the expectations of the
GNU Emacs
.I next-error
command.  You can invoke
.B bibclean
with the Emacs
.I "M-x compile<RET>bibclean filename.bib >filename.new"
command, then use the
.I next-error
command, normally bound to
.I "C-x `"
(that's a grave, or back, accent), to move to the
location of the error in the input file.
.PP
If error messages are ignored, and left in the
output bibliography file, they will precipitate an
error when the bibliography is next processed
with \*(Bi\&.
.PP
After issuing an error message,
.B bibclean
then resynchronizes its input by copying it
verbatim to
.I stdout
until a new bibliography entry is recognized on a
line in which the first non-blank character is an
at-sign (@).  This ensures that nothing is lost
from the input file(s), allowing corrections to be
made in either the input or the output files.
However, if
.B bibclean
detects an internal error in its data structures,
it will terminate abruptly without further input
or output processing; this kind of error should
never happen, and if it does, it should be
reported immediately to the author of the program.
Errors in initialization files, and running out of
dynamic memory, will also immediately terminate
.BR bibclean .
.\"=====================================================================
.SH "INITIALIZATION FILES"
.B bibclean
can be compiled with one of three different types
of pattern matching; the choice is made by the
installer at compile time:
.RS
.TP \w'\(bu'u+2n
\(bu
The original version uses explicit hand-coded
tests of value-string syntax.
.TP
\(bu
The second version uses regular-expression
pattern-matching host library routines together
with regular-expression patterns that come
entirely from initialization files.
.TP
\(bu
The third version uses special patterns that come
entirely from initialization files.
.RE
.PP
The second and third versions are the ones of most
interest here, because they allow the user to
control what values are considered acceptable.
However, command-line options can also be
specified in initialization files, no matter which
pattern matching choice was selected.
.PP
When
.B bibclean
starts, it searches for initialization files,
finding the first one in the system executable
program search path (on UNIX and IBM PC DOS,
.BR PATH )
and the first one in the
.B BIBINPUTS
search path, and processes them in turn.  Then,
when command-line arguments are processed, any
additional files specified by
.BI \-init-file " filename"
options are also processed.  Finally, immediately
before each
.I named
bibliography file is processed, an attempt is made
to process an initialization file with the same
name, but with the extension changed to
.IR .ini .
This scheme permits system-wide, user-wide,
session-wide, and file-specific initialization
files to be supported.
.PP
When input is taken from
.IR stdin ,
there is no file-specific initialization.
.PP
For precise control, the
.B \-no-read-init-files
option suppresses all initialization files except
those explicitly named by
.BI \-init-file " filename"
options, either on the command line, or in
requested initialization files.
.PP
Recursive execution of initialization files with
nested
.B \-init-file
options is permitted; if the recursion is
circular,
.B bibclean
will eventually get a non-fatal initialization file
open failure after opening too many files.  This
terminates further initialization file processing.
As the recursion unwinds, the files are all
closed, then execution proceeds normally.
.PP
An initialization file may contain empty lines,
comments from percent to end of line (just like
\*(Te\&), option switches, and key/pattern or
key/pattern/message assignments.  Leading and
trailing spaces are ignored.  This is best
illustrated by a short example:
.PP
.nf \fC
% This is a small bibclean initialization file

-init-file /u/math/bib/.bibcleanrc %% departmental patterns

chapter = "\e"D\e""                 %% 23

pages   = "\e"D--D\e""              %% 23--27

volume  = "\e"D \e\ean\e\ed D\e""       %% 11 and 12

year    = \e
   "\e"dddd, dddd, dddd\e"" \e
   "Multiple years specified."      %% 1989, 1990, 1991

-no-fix-names   %% do not modify author/editor lists
.fi \fP
.PP
Long logical lines can be split into multiple
physical lines by breaking at a backslash-newline
pair; the backslash-newline pair is discarded.
This processing happens while characters are being
read, before any further interpretation of the
input stream.
.PP
Each logical line must contain a complete option
(and its value, if any), or a complete key/pattern
pair, or a key/pattern/message triple.
.PP
Comments are stripped during the parsing of the
key, pattern, and message values.  The comment
start symbol is not recognized inside quoted
strings, so it can be freely used in such strings.
.PP
Comments on logical lines that were input as
multiple physical lines via the backslash-newline
convention must appear on the
.I last
physical line; otherwise, the remaining physical
lines will become part of the comment.
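.PP
The line-joining and comment rules above can be
sketched in Python as follows.  This is only an
illustration under stated assumptions, not the
code used by
.BR bibclean :
the function names are hypothetical, and a
backslash is assumed to protect the character
after it both inside and outside quoted strings.
.PP
.nf \fC
```python
BACKSLASH = chr(92)     # spelled this way to avoid escapes in the listing
NEWLINE = chr(10)


def logical_lines(text):
    """Join physical lines that end in a backslash-newline pair."""
    joined = text.replace(BACKSLASH + NEWLINE, "")
    return joined.split(NEWLINE)


def strip_comment(line):
    """Drop a percent comment unless the percent is quoted or escaped."""
    in_string = False
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == BACKSLASH:
            i += 2              # skip the escaped character as well
            continue
        if ch == '"':
            in_string = not in_string
        elif ch == "%" and not in_string:
            return line[:i].strip()
        i += 1
    return line.strip()
```
.fi \fP
.PP
Joining happens first, so a comment placed on an
early physical line swallows the rest of the
logical line, which is the behavior described
above.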
.PP
Pattern strings must be enclosed in quotation
marks; within such strings, a backslash starts an
escape mechanism that is commonly used in UNIX
software.  The recognized escape sequences are:
.RS
.TP
.B "\ea"
alarm bell (octal 007)
.TP
.B "\eb"
backspace (octal 010)
.TP
.B "\ef"
formfeed (octal 014)
.TP
.B "\en"
newline (octal 012)
.TP
.B "\er"
carriage return (octal 015)
.TP
.B "\et"
horizontal tab (octal 011)
.TP
.B "\ev"
vertical tab (octal 013)
.TP
.B "\eooo"
character number octal
.I ooo
(e.g.
.B "\e012"
is linefeed)
.TP
.B "\e0xhh"
character number hexadecimal
.I hh
(e.g.
.B "\e0x0a"
is linefeed)
.TP
.B "\e0Xhh"
character number hexadecimal
.I hh
(e.g.
.B "\e0X0A"
is linefeed)
.RE
.PP
Backslash followed by any other character produces
just that character.  Thus, \e% gets a literal
percent into a string (preventing its
interpretation as a comment), \e" produces a
quotation mark, and \e\e produces a single
backslash.
.PP
Use of an ASCII NUL
.I "(\e0)"
in a string will terminate it; this is a feature
of the C programming language in which
.B bibclean
is implemented.
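.PP
The escape rules can be illustrated by the
following Python sketch.  It is not the decoder in
.BR bibclean ;
the function name is hypothetical, and it assumes
that an octal escape takes at most three digits
and a hexadecimal escape at most two.
.PP
.nf \fC
```python
BACKSLASH = chr(92)     # spelled this way to avoid escapes in the listing
SIMPLE = {"a": 7, "b": 8, "f": 12, "n": 10, "r": 13, "t": 9, "v": 11}
OCTAL = "01234567"
HEX = "0123456789abcdefABCDEF"


def unescape(text):
    """Decode backslash escapes; the result stops at the first NUL."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != BACKSLASH or i + 1 == len(text):
            out.append(ch)
            i += 1
            continue
        nxt = text[i + 1]
        if text[i + 1:i + 3] in ("0x", "0X"):
            j = i + 3
            while j < len(text) and j < i + 5 and text[j] in HEX:
                j += 1
            if j > i + 3:       # at least one hexadecimal digit
                out.append(chr(int(text[i + 3:j], 16)))
                i = j
                continue
        if nxt in OCTAL:
            j = i + 1
            while j < len(text) and j < i + 4 and text[j] in OCTAL:
                j += 1
            out.append(chr(int(text[i + 1:j], 8)))
            i = j
            continue
        if nxt in SIMPLE:
            out.append(chr(SIMPLE[nxt]))
        else:
            out.append(nxt)     # any other character stands for itself
        i += 2
    return "".join(out).split(chr(0))[0]
```
.fi \fP
.PP
The final step mimics the NUL behavior described
above by discarding everything from the first NUL
onward.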
.PP
Key/pattern pairs can be separated by arbitrary
space, and optionally, either an equals or colon
functioning as an assignment operator.  Thus, the
following are equivalent:
.PP
.nf \fC
pages="\e"D--D\e""
pages:"\e"D--D\e""
pages "\e"D--D\e""
  pages = "\e"D--D\e""
  pages : "\e"D--D\e""
pages   "\e"D--D\e""
.fi \fP
.PP
Each key name can have an arbitrary number of
patterns associated with it; however, they must
be specified in separate key/pattern assignments.
.PP
An empty pattern string causes previously-loaded
patterns for that key name to be forgotten.  This
feature permits an initialization file to
completely discard patterns from earlier
initialization files.
.PP
Patterns for value strings are represented in a
tiny special-purpose language that is both
convenient and suitable for bibliography value
string syntax checking.  While not as powerful as
the language of regular-expression patterns, its
parsing can be portably implemented in less than
3% of the code in a widely-used regular-expression
parser (the GNU
.B regexp
package).
.PP
The patterns are represented by the following
special characters:
.RS
.TP \w'<space>'u+2n
.B <space>
one or more spaces
.TP
.B a
exactly one letter
.TP
.B A
one or more letters
.TP
.B d
exactly one digit
.TP
.B D
one or more digits
.TP
.B w
exactly one word (one or more letters and digits)
.TP
.B W
one or more space-separated words, beginning and
ending with a word
.TP
.B .
one `special' character, one of the characters
<space>\|!\|#\|(\|)\|*\|+\|,\|-\|.\|/\|:\|;\|?\|[\|]\|~,
a subset of punctuation characters that are
typically used in string values
.TP
.B :
one or more `special' characters
.TP
.B X
one or more `special'-separated words, beginning
and ending with a word
.TP
.B \ex
exactly one x (x is any character), possibly with
an escape sequence interpretation given earlier
.TP
.B x
exactly the character x (x is anything but
one of these pattern characters:
a\|A\|d\|D\|w\|W\|X\|.\|:\|<space>\|\e\|)
.RE
.PP
The
.B X
pattern character is very powerful, but generally
inadvisable, since it will match almost anything
likely to be found in a \*(Bi\& value string.
The reason for providing pattern matching on the
value strings is to uncover possible errors, not
mask them.
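.PP
To make the pattern characters concrete, here is a
Python sketch that translates a pattern into a
regular expression and tests a value against it.
This swaps in regular expressions for the native
matcher, so it is an illustration rather than the
algorithm used by
.BR bibclean .
The names are hypothetical, a backslashed
character is assumed to be taken literally, the
pattern is assumed to have had its string escapes
decoded already, and the brace and control
sequence skipping done for \*(Te\& input is not
modeled.
.PP
.nf \fC
```python
import re

BACKSLASH = chr(92)     # spelled this way to avoid escapes in the listing
LETTER = "[A-Za-z]"
DIGIT = "[0-9]"
WORD = "[A-Za-z0-9]+"
SPECIAL = "[" + re.escape(" !#()*+,-./:;?[]~") + "]"

FRAGMENTS = {
    " ": " +",
    "a": LETTER,
    "A": LETTER + "+",
    "d": DIGIT,
    "D": DIGIT + "+",
    "w": WORD,
    "W": WORD + "(?: +" + WORD + ")*",
    ".": SPECIAL,
    ":": SPECIAL + "+",
    "X": WORD + "(?:" + SPECIAL + "+" + WORD + ")*",
}


def to_regex(pattern):
    """Translate a bibclean-style pattern into a regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == BACKSLASH and i + 1 < len(pattern):
            i += 1
            parts.append(re.escape(pattern[i]))     # escaped: literal
        elif ch in FRAGMENTS:
            parts.append(FRAGMENTS[ch])
        else:
            parts.append(re.escape(ch))             # ordinary literal
        i += 1
    return "".join(parts)


def matches(pattern, value):
    """Return True if value begins with text that fits the pattern."""
    return re.match(to_regex(pattern), value) is not None
```
.fi \fP
.PP
The match is anchored only at the start of the
value, so a pattern that omits the closing
quotation mark accepts any trailing text.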
.PP
There is no provision for specifying ranges or
repetitions of characters, but this can usually be
done with separate patterns.  It is a good idea to
accompany the pattern with a comment showing the
kind of thing it is expected to match.  Here is a
portion of an initialization file giving a few of
the patterns used to match
.I number
value strings:
.PP
.nf \fC
number  =       "\e"D\e""         %% 23
number  =       "\e"A AD\e""      %% PN LPS5001
number  =       "\e"A D(D)\e""    %% RJ 34(49)
number  =       "\e"A D\e""       %% XNSS 288811
number  =       "\e"A D\e\e.D\e""   %% Version 3.20
number  =       "\e"A-A-D-D\e""   %% UMIAC-TR-89-11
number  =       "\e"A-A-D\e""     %% CS-TR-2189
number  =       "\e"A-A-D\e\e.D\e"" %% CS-TR-21.7
.fi \fP
.PP
For a bibliography that contains only
.I article
entries, this list should probably be reduced to
just the first pattern, so that anything other
than a digit string fails the pattern-match test.
This is easily done by keeping
bibliography-specific patterns in a corresponding
file with extension
.IR .ini ,
since that file is read automatically.
.PP
You should be sure to use empty pattern strings in
this pattern file to discard patterns from earlier
initialization files.
.PP
The value strings passed to the pattern matcher
contain surrounding quotes, so the patterns should
also.  However, you could use a pattern
specification like "\e"D" to match an initial
digit string followed by anything else; the
omission of the final quotation mark \e" in the
pattern allows the match to succeed without
checking that the next character in the value
string is a quotation mark.
.PP
Because the value strings are intended to be
processed by \*(Te\&, the pattern matching ignores
braces, and \*(Te\& control sequences, together
with any space following those control sequences.
Spaces around braces are preserved.  This
convention allows the pattern fragment
.I A-AD-D
to match the value string
.IR TN-K\eslash\ 27-70 ,
because the value is implicitly collapsed to
.I TN-K27-70
during the matching operation.
.PP
.BR bibclean 's
normal action when a string value fails to match
any of the corresponding patterns is to issue a
.I warning
message something like this:
\fIUnexpected value in ``year = "192"''\fP.
In most cases, that is sufficient to alert the
user to a problem.  In some cases, however, it may
be desirable to associate a different message with
a particular pattern.  This can be done by
supplying a message string following the pattern
string.  Format items
.I %%
(single percent),
.I %e
(entry name),
.I %k
(key name),
.I %t
(citation tag),
and
.I %v
(string value)
are available to get current values expanded in
the messages.  Here is an example:
.PP
.nf \fC
chapter = "\e"D:D\e"" "Colon found in ``%k = %v''" %% 23:2
.fi \fP
.PP
To be consistent with other messages output by
.BR bibclean ,
the message string should
.I not
end with punctuation.
.PP
If you wish to make the message an error, rather
than just a warning, begin it with a query (?),
like this:
.PP
.nf \fC
chapter = "\e"D:D\e"" "?Colon found in ``%k = %v''" %% 23:2
.fi \fP
.PP
The query will not be included in the output message.
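.PP
The handling of format items and of the leading
query can be sketched in Python as follows.  This
is an illustration only; the function name and its
arguments are hypothetical, and unrecognized
format items are assumed to be copied unchanged.
.PP
.nf \fC
```python
def expand_message(template, entry, key, tag, value):
    """Return (severity, text) for an initialization file message."""
    severity = "warning"
    if template.startswith("?"):
        severity = "error"          # a leading query promotes the message
        template = template[1:]     # and is dropped from the output text
    substitutions = {"%": "%", "e": entry, "k": key, "t": tag, "v": value}
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        nxt = template[i + 1:i + 2]     # empty at the end of the template
        if ch == "%" and nxt in substitutions:
            out.append(substitutions[nxt])
            i += 2
        else:
            out.append(ch)
            i += 1
    return severity, "".join(out)
```
.fi \fP
.PP
Given the message from the example above, the key
.IR chapter ,
and the value "23:2", the sketch reports an error
with the text
\fIColon found in ``chapter = "23:2"''\fP.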
.PP
Escape sequences are supported in message strings,
just as they are in pattern strings.  You can use
this to advantage for fancy things, such as
terminal display mode control.  If you rewrite the
previous example as
.PP
.nf \fC
chapter = "\e"D:D\e"" \e
          "?\e033[7mColon found in ``%k = %v''\e033[0m" %% 23:2
.fi \fP
.PP
the error message will appear in inverse video on
display screens that support ANSI terminal control
sequences.  Such practice is not normally
recommended, since it may have undesirable
effects on some output devices.  Nevertheless, you
may find it useful for restricted applications.
.PP
For some types of bibliography keys,
.B bibclean
contains special-purpose code to supplement or
replace the pattern matching:
.RS
.TP \w'\(bu'u+2n
\(bu
.I ISBN
and
.I ISSN
fields are handled this way because their
validation requires evaluation of checksums that
cannot be expressed by simple patterns; no
patterns are even used in these two cases.
.TP
\(bu
When
.B bibclean
is compiled with pattern-matching code support,
.IR chapter ,
.IR number ,
.IR pages ,
and
.I volume
values are checked only by pattern matching.
.TP
\(bu
.I month
values are first checked against the standard
\*(Bi\& month abbreviations, and only if no match
is found are patterns then used.
.TP
\(bu
.I year
values are first checked against patterns, then if
no match is found, the year numbers are found and
converted to integer values for testing against
reasonable bounds.
.RE
.PP
Values for other keywords are checked only against
patterns.   You can provide patterns for
.I any
keyword you like, even ones
.B bibclean
does not already know about.  New ones are simply
added to an internal table that is searched for
each string to be validated.
.PP
The special keyword,
.IR tag ,
represents the bibliographic citation tag.  It can
be given patterns, like any other keyword.  Here
is an initialization file pattern assignment that
will match an author name, a colon (or other
`special' characters, since
.B :
is itself a pattern character), an alphabetic
string, and a two-digit year:
.PP
.nf \fC
tag = "A:Add"                     %% Knuth:TB86
.fi \fP
.PP
Notice that no quotation marks are included in the
pattern, because the citation tags are not quoted.
You can use such patterns to help enforce uniform
naming conventions for citation tags, which is
increasingly important as your bibliography data
base grows.
.\"=====================================================================
.SH "SCRIBE BIBLIOGRAPHY FORMAT"
.BR bibclean 's
support for the \*(Sc\& bibliography format is
based on the syntax description in the \*(Sc\&
Introductory User's Manual, 3rd Edition, May 1980.
\*(Sc\& was originally developed by Brian Reid at
Carnegie-Mellon University, and is now marketed by
Unilogic, Ltd.
.PP
The \*(Bi\& bibliography format was strongly
influenced by \*(Sc\&, and indeed, with care, it
is possible to share bibliography files between
the two systems.  Nevertheless, there are some
differences, so here is a summary of features of
the \*(Sc\& bibliography file format:
.TP \w'(10)'u+2n
(1)
Letter case is not significant in keywords and
entry names, but case is preserved in value
strings.
.TP
(2)
In key/value pairs, the key and value may be
separated by one of three characters: =, /, or
space.  Space may optionally surround these
separators.
.TP
(3)
Value delimiters are any of these seven
pairs: { }   [ ]   ( )   < >   ' '   " "   ` `
.TP
(4)
Value delimiters may not be nested, even though with
the first four delimiter pairs, nested balanced
delimiters would be unambiguous.
.TP
(5)
Delimiters can be omitted around values that
contain only letters, digits, sharp (#), ampersand
(&), period (.), and percent (%).
.TP
(6)
Outside of delimited values, a literal at-sign
(@) is represented by doubled at-signs (@@).
.TP
(7)
Bibliography entries begin with @name, as
for \*(Bi\&, but any of the seven \*(Sc\& value
delimiter pairs may be used to surround the values
in key/value pairs.  As in (4), nested delimiters
are forbidden.
.TP
(8)
Arbitrary space may separate entry names from the
following delimiters.
.TP
(9)
@Comment is a special command whose delimited
value is discarded.  As in (4), nested delimiters
are forbidden.
.TP
(10)
The special form
.IP
.nf
@Begin{comment}
 .\|.\|.
@End{comment}
.fi
.IP
permits encapsulating arbitrary text containing
any characters or delimiters, other than
``@End{comment}''.  Any of the seven delimiter
pairs may be used around the word ``comment''
following the ``@Begin'' or ``@End''; the
delimiters in the two cases need not be the same,
and consequently,
``@Begin{comment}''/``@End{comment}'' pairs may
.I not
be nested.
.TP
(11)
The
.I key
keyword is required in each bibliography entry.
.TP
(12)
A backslashed quote in a string will be assumed to
be a \*(Te\& accent, and braced appropriately.
While such accents do not conform to \*(Sc\&
syntax, \*(Sc\&-format bibliographies have been
found that appear to be intended for \*(Te\&
processing.
.PP
Because of this loose syntax,
.BR bibclean 's
normal error detection heuristics are less
effective, and consequently, \*(Sc\& mode input is
not the default; it must be explicitly requested.
.\"=====================================================================
.SH "SEE ALSO"
bibextract(1), bibindex(1), biblook(1),
bibsort(1), bibtex(1), citefind(1), citetags(1),
latex(1), scribe(1), tex(1)
.\"=====================================================================
.SH FILES
.TP \w'\fIbibclean.ini\fP'u+3n
.I *.bib
\*(Bi\& and \*(Sc\& bibliography data base files.
.TP
.I *.ini
File-specific initialization files.
.TP
.I .bibcleanrc
UNIX system and user initialization files.
.TP
.I bibclean.ini
Non-UNIX system and user initialization files.
.TP
.B BIBINPUTS
Search path for user initialization files.
.TP
.B PATH
Search path for system initialization files on
UNIX and IBM PC DOS.
.TP
.B SYS$SYSTEM
Search path for system initialization files on
VAX VMS.
.\"=====================================================================
.SH AUTHOR
.nf
Nelson H. F. Beebe
Center for Scientific Computing
Department of Mathematics
University of Utah
Salt Lake City, UT 84112
USA
Tel: +1 801 581 5254
FAX: +1 801 581 4148
Email: <beebe@math.utah.edu>
.fi
.\"=====================================================================
.\" This is for GNU Emacs file-specific customization:
.\" Local Variables:
.\" fill-column: 50
.\" End:
