%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% bibtex2html - A BibTeX to HTML translator %
% Copyright (C) 1997-2014 Jean-Christophe Filliâtre and Claude Marché %
% %
% This software is free software; you can redistribute it and/or %
% modify it under the terms of the GNU General Public %
% License version 2, as published by the Free Software Foundation. %
% %
% This software is distributed in the hope that it will be useful, %
% but WITHOUT ANY WARRANTY; without even the implied warranty of %
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. %
% %
% See the GNU General Public License version 2 for more details %
% (enclosed in the file GPL). %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\documentclass[11pt,a4paper]{article}
\usepackage[T1]{fontenc}
\usepackage[latin1]{inputenc}
\usepackage{fullpage}
\usepackage{hyperref}
%BEGIN LATEX
\newcommand{\link}[2]{#2}
%END LATEX
%HEVEA\newcommand{\link}[2]{\ahref{#1}{#2}}
\newcommand{\monurl}[1]{\link{#1}{#1}}
\newcommand{\mm}{\symbol{45}\symbol{45}}
\input{version.tex}
\title{{\Huge\bfseries BibTeX2HTML} \\
A translator of BibTeX bibliographies into HTML\\
~\\
Version \version{} --- \today}
\author{Jean-Christophe Filli\^{a}tre and Claude March\'e \\
\normalsize\monurl{\url{http://www.lri.fr/~filliatr/bibtex2html}}}
\date{}
\begin{document}
\sloppy
\maketitle
\tableofcontents
\section{Introduction}
BibTeX2HTML is a collection of tools for automatically producing HTML
documents from bibliographies written in the BibTeX format. It
consists of three command line tools:
\begin{itemize}
\item \texttt{bib2bib} is a filter tool that reads one or several
bibliography files, filters the entries with respect to a given
criterion, and outputs the list of selected keys together with a new
bibliography file containing only the selected entries;
\item \texttt{bibtex2html} is a translator that reads a bibliography
file and outputs two HTML documents that contain, respectively, the
cited bibliography in a nice presentation, and the original BibTeX
file augmented with several transparent HTML links to allow easy
navigation. \texttt{bibtex2html} can handle \emph{any} BibTeX style
file, including those producing multiple bibliographies.
\item \texttt{aux2bib} reads a \texttt{.aux} file as produced by
\LaTeX\ and writes to standard output a BibTeX file containing exactly the
BibTeX entries referred to in the \texttt{.aux} file.
\end{itemize}
\section{The \texttt{bibtex2html} command line tool}
\texttt{bibtex2html} is a BibTeX to HTML translator. It is invoked as
\begin{flushleft}
\texttt{bibtex2html} [options] [\textit{file.bib}]
\end{flushleft}
where the possible \link{\#options}{options} are described below and
where \textit{file.bib} is the name of the BibTeX file, which must
have a \textit{.bib} suffix. If this file is not given, then entries
are input from standard input.
Then two HTML documents are created (unless option
\verb|-nobibsource| is selected or input is standard input, see below) :
\begin{itemize}
\item \textit{file.html} which is the bibliography in HTML format
%HEVEA, like \link{examples/biblio-these.html}{this}
;
\item \textit{file\_bib.html} which contains all the entries in ASCII
format.
%HEVEA ,like \link{examples/biblio-these_bib.html}{this}
\end{itemize}
\texttt{bibtex} is called on \textit{file.bib} in order to produce a
LaTeX document, and then this LaTeX document is translated into an
HTML document. The BibTeX file \textit{file.bib} is also parsed in
order to collect additional fields (abstract, url, ps, http, etc.)
that will be used in the translation.
If input is standard input and option \verb|--output| is not given,
the first file is output to standard output, and the second is not
created.
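
For instance, assuming a bibliography file named \texttt{biblio.bib}
(a hypothetical name used only for illustration), the two modes
described above could be invoked as follows:
\begin{verbatim}
# creates biblio.html and biblio_bib.html
bibtex2html biblio.bib

# reads the entries from standard input; the HTML bibliography goes
# to standard output and no _bib.html file is created
bibtex2html < biblio.bib > biblio.html
\end{verbatim}
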
%HEVEA\begin{rawhtml}<a name=fields>\end{rawhtml}
\subsection{Additional fields and automatic web links}
The main interest of \texttt{bibtex2html} with respect to a
traditional LaTeX to HTML translator is the use of additional fields
in the BibTeX database and the automatic insertion of web
links.
A link is inserted:
\begin{itemize}
\item at each cross-reference inside the bibliography entries;
\item when the \verb|\url| LaTeX macro is used in the text;
\item for each BibTeX field whose name is \texttt{ftp}, \texttt{http},
\texttt{url}, \texttt{ps}, \texttt{dvi}, \texttt{rtf}, \texttt{pdf},
\texttt{documenturl}, \texttt{urlps} or \texttt{urldvi}. The name of
this link depends on the nature of the link:
\begin{itemize}
\item it is the file suffix, whenever this suffix is \texttt{.dvi},
\texttt{.ps}, \texttt{.pdf}, \texttt{.rtf}, \texttt{.txt} or
\texttt{.html}, possibly followed by a compression suffix,
\texttt{.gz}, \texttt{.Z} or \texttt{.zip};
\item otherwise the name of the link is either \texttt{http} or
\texttt{ftp} depending on the protocol.
\end{itemize}
You can insert web links for other fields and/or specify alternative
names for the links using the options
\texttt{-f} and \texttt{-nf} (see below).
\end{itemize}
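
As an illustration, consider the following hypothetical entry (the
key, the URLs and the \texttt{springer} field are invented for this
sketch):
\begin{verbatim}
@article{doe99,
  author   = {John Doe},
  title    = {An example article},
  journal  = {Some Journal},
  year     = 1999,
  pdf      = {http://www.example.org/doe99.pdf},
  url      = {http://www.example.org/doe99/},
  springer = {http://www.example.org/springer/doe99}
}
\end{verbatim}
Following the rules above, the \texttt{pdf} field should yield a link
named after the \texttt{.pdf} suffix, and the \texttt{url} field a
link named \texttt{http}. The \texttt{springer} field is not in the
default list, so it only becomes a link when requested explicitly,
for example with
\begin{verbatim}
bibtex2html -nf springer "At Springer's" biblio.bib
\end{verbatim}
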
%HEVEA See the result on \link{examples/biblio-these.html}{this example}.
\subsubsection{Abstracts}
If a BibTeX entry contains a field \texttt{abstract} then its content
is quoted right after the bibliography entry
%HEVEA , like \link{examples/publis-abstracts.html}{this}
.
This behavior may be suppressed with the option \texttt{\mm{}no-abstract}.
If you want both versions with and without abstracts, use the option
\texttt{\mm{}both}. In that case, links named ``Abstract'' will be
inserted from the page without abstracts to the page with abstracts
%HEVEA , like \link{examples/publis.html}{this}
.
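
For example, assuming a file \texttt{publis.bib} (a hypothetical
name), the two variants could be requested as follows:
\begin{verbatim}
# one page, abstracts omitted
bibtex2html --no-abstract publis.bib

# two pages: publis.html and publis_abstracts.html
bibtex2html --both publis.bib
\end{verbatim}
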
\subsubsection{Keywords}
If a BibTeX entry contains a field \texttt{keywords} then its content
is displayed after the bibliography entry (and after the abstract if any).
This behavior may be suppressed with the option \texttt{\mm{}no-keywords}.
%HEVEA\begin{rawhtml}<a name=options>\end{rawhtml}
\subsection{Command line options}
Most of the command line options have a short version of one character
(e.g. \texttt{-r}) and an easy-to-remember/understand long version
(e.g. \texttt{\mm{}reverse-sort}).
\subsubsection{General aspect of the HTML document}
\begin{description}
\item[\texttt{-t} \textit{string}, \texttt{\mm{}title} \textit{string}] ~
specify the title of the HTML file (default is the file name).
\item[\texttt{\mm{}header} \textit{string}] ~
give an additional header for the HTML document.
\item[\texttt{\mm{}footer} \textit{string}] ~
give an additional footer for the HTML document.
\item[\texttt{-s} \textit{string}, \texttt{\mm{}style} \textit{string}] ~
use BibTeX style \textit{string} (plain, alpha, etc.). Default
style is plain.
\item[\texttt{-noabstract}, \texttt{\mm{}no-abstract}] ~
do not print the abstracts (if any).
\item[\texttt{-nokeywords}, \texttt{\mm{}no-keywords}] ~
do not print the keywords (if any).
\item[\texttt{-both}, \texttt{\mm{}both}] ~
produce both pages with and without abstracts. If the BibTeX file is
foo.bib then the two pages will be respectively foo.html and
foo\_abstracts.html (The suffix may be different, see option
\texttt{\mm{}suffix}). Links are inserted from the page without
abstracts to the page with abstracts.
\item[\texttt{-nokeys}, \texttt{\mm{}no-keys}] ~
do not print the cite keys. Note: this option implicitly suppresses
the use of HTML tables to format the entries; to enforce the use of
tables, use option \texttt{-use-table} (passing it \emph{after}
option \texttt{-nokeys} on the command line).
\item[\texttt{-use-keys}, \texttt{\mm{}use-keys}] ~
use the cite keys from the BibTeX input file (and not the ones generated
by the BibTeX style file).
\item[\texttt{-rawurl}, \texttt{\mm{}raw-url}] ~
print URLs instead of files' types.
\item[\texttt{-heveaurl}, \texttt{\mm{}hevea-url}] ~
interpret the macro \verb!\url! as HeVeA's one, i.e. with two
arguments, the first one being the url and the second one the text
to print. The default behavior is to interpret the macro \verb!\url!
as the one from the package \texttt{url}, which has only one
argument (the url itself).
\item[\texttt{-f} \textit{field}, \texttt{\mm{}field} \textit{field}] ~
add a web link for that BibTeX field.
\item[\texttt{-nf} \textit{field} \textit{string},
\texttt{\mm{}named-field} \textit{field} \textit{string}] ~
similar to \texttt{-f} but specifies the way to display the link
(e.g. \texttt{-nf springer "At Springer's"}).
\item[\texttt{-note} \textit{field},
\texttt{\mm{}note} \textit{field}] ~
declare that a field must be treated like the \texttt{abstract}
field, i.e. is an annotation to be displayed as a text paragraph
below the entry.
\item[\texttt{-multiple}, \texttt{\mm{}multiple}] ~
make a separate web page for each entry. \textit{Beware: this
option produces as many HTML files as BibTeX entries!}
\item[\texttt{-single}, \texttt{\mm{}single}] ~
produce a single document, inserting each BibTeX entry (the input)
right after its BibTeX output.
\item[\texttt{-bg} \textit{color}, \texttt{\mm{}background} \textit{color}] ~
set the background color of the HTML file (default is none).
\item[\texttt{-css} \textit{file}, \texttt{\mm{}style-sheet} \textit{file}] ~
set a style sheet file for the HTML document (default is none).
\item[\texttt{-dl}, \texttt{\mm{}dl}] ~
use HTML \texttt{DL} lists instead of HTML tables to format entries.
\item[\texttt{-unicode}, \texttt{\mm{}unicode}] ~
use Unicode entities for the following macros :
\begin{verbatim}
\models \curlyvee \curlywedge \bigcirc \varepsilon
\not{\models}
\end{verbatim}
\item[\texttt{-html-entities}, \texttt{\mm{}html-entities}] ~
use HTML entities for the following macros :
\begin{verbatim}
\= \Im \Leftarrow \Re \Rightarrow \aleph \ang \angle \approx
\ast \cdot \cdots \cong \copyright \cup \dagger \diamond \emptyset
\equiv \exists \forall \ge \geq \in \infty \int \land \lang
\lceil \le \leftarrow \leftrightarrow \leq \lfloor \longleftarrow
\longrightarrow \lor \lozenge \nabla \ne \neg \neq \ni \notin
\oplus \otimes \partial \perp \pm \prod \propto \rang \rceil
\rfloor \rightarrow \sim \simeq \sqrt \subset \subseteq
\sum \supset \supseteq \therefore \times \tm \to \vartheta
\vee \wedge \wp
\end{verbatim}
\end{description}
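
As a sketch combining several of the options above (the title, the
style sheet \texttt{style.css} and the file \texttt{publis.bib} are
placeholders, not files shipped with the tool):
\begin{verbatim}
bibtex2html -t "My publications" -s alpha -css style.css \
            -nokeys -use-table publis.bib
\end{verbatim}
Here \texttt{-use-table} is passed after \texttt{-nokeys}, which the
description of \texttt{-nokeys} requires in order to keep the table
layout when cite keys are hidden.
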
\subsubsection{Controlling the translation}
\begin{description}
\item[\texttt{-m} \textit{file}, \texttt{\mm{}macros-from} \textit{file}] ~
read the \LaTeX\ macros in the given file.
Note: \texttt{bibtex2html} does not handle macro arguments;
arguments are simply discarded.
\item[\texttt{-noexpand}, \texttt{\mm{}no-expand}] ~
do not expand the abbreviation strings, leave them in the output file.
\end{description}
\subsubsection{Selecting the entries}
\begin{description}
\item[\texttt{-citefile} \textit{filename}, \texttt{\mm{}citefile}
\textit{filename}] ~
Select only keys appearing in \textit{filename}. To be used manually
or in conjunction with \verb|bib2bib|.
\item[\texttt{-e} \textit{key}, \texttt{\mm{}exclude} \textit{key}] ~
exclude a particular entry.
\end{description}
\subsubsection{Sorting the entries}
\begin{description}
\item[\texttt{-d}, \texttt{\mm{}sort-by-date}] ~
sort by date.
\item[\texttt{-a}, \texttt{\mm{}sort-as-bibtex}] ~
sort as BibTeX (usually by author).
\item[\texttt{-u}, \texttt{\mm{}unsorted}] ~
unsorted i.e. same order as in .bib file (default).
\item[\texttt{-r}, \texttt{\mm{}reverse-sort}] ~
reverse the sort.
\item[\texttt{\mm{}revkeys}] ~
number entries in reverse order (i.e. from $n$ to 1 in plain style).
\end{description}
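
For instance, the following invocation (with a hypothetical
\texttt{publis.bib}) sorts by date and then reverses that order:
\begin{verbatim}
bibtex2html -d -r publis.bib
\end{verbatim}
Since \texttt{-d} sorts from oldest to newest (as stated in the
description of option \texttt{-s} of \texttt{bib2bib}), adding
\texttt{-r} should list the most recent entries first.
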
\subsubsection{Miscellaneous options}
\begin{description}
\item[\texttt{-nodoc}, \texttt{\mm{}nodoc}] ~
do not produce a full HTML document but only its body (useful to
merge the HTML bibliography in a bigger HTML document).
\item[\texttt{-nobibsource}, \texttt{\mm{}nobibsource}] ~
do not produce the \verb|_bib.html| file. In that case, no ``BibTeX
entry'' links are inserted in the HTML file.
\item[\texttt{-suffix} \textit{string}, \texttt{\mm{}suffix} \textit{string}] ~
give an alternate suffix \textit{string} for both
HTML files and links (default is \texttt{.html}).
\item[\texttt{-fsuffix} \textit{string},
\texttt{\mm{}file-suffix} \textit{string}] ~
give an alternate suffix \textit{string} for
HTML files (default is \texttt{.html}).
\item[\texttt{-lsuffix} \textit{string},
\texttt{\mm{}link-suffix} \textit{string}] ~
give an alternate suffix \textit{string} for
HTML links (default is \texttt{.html}).
\item[\texttt{-o} \textit{file}, \texttt{\mm{}output} \textit{file}] ~
specifies the output file. If \textit{file} is \verb!-!, then the
standard output is selected.
\item[\texttt{-c} \textit{command}, \texttt{\mm{}command} \textit{command}] ~
specify the BibTeX command (default is \texttt{bibtex
-min-crossrefs=1000}). May be useful for example if
you need to specify the full path of the \texttt{bibtex} command.
\item[\texttt{\mm{}print-keys}] ~
print the keys of the BibTeX entries on the standard output (one per line), as
selected and sorted by \texttt{bibtex2html}. This is useful if you
want to use the selection and sorting facilities of
\texttt{bibtex2html} in another program. Note: you may also need to
set the \texttt{-q} option (quiet) to suppress the usual output.
\item[\texttt{-i}, \texttt{\mm{}ignore-errors}] ~
ignore BibTeX errors.
\item[\texttt{-q}, \texttt{\mm{}quiet}] ~
be quiet.
\item[\texttt{-w}, \texttt{\mm{}warn-error}] ~
stop at the first warning.
\item[\texttt{-h}, \texttt{\mm{}help}] ~
print a short usage and exit.
\item[\texttt{-v}, \texttt{\mm{}version}] ~
print the version and exit.
\item[\texttt{-noheader}, \texttt{\mm{}no-header}] ~
do not insert the \texttt{bibtex2html} command in the HTML document
(default is to insert it as a comment at the beginning).
\end{description}
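
A few sketches of typical uses, where \texttt{publis.bib} and the
path \texttt{/usr/local/bin/bibtex} are assumptions made only for
illustration:
\begin{verbatim}
# only the HTML body, without the _bib.html companion file
bibtex2html -nodoc -nobibsource publis.bib

# print the selection, sorted by date, for use in another program
bibtex2html -q --print-keys -d publis.bib

# use a specific bibtex binary, keeping the default bibtex option
bibtex2html -c "/usr/local/bin/bibtex -min-crossrefs=1000" publis.bib
\end{verbatim}
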
\section{The \texttt{bib2bib} command line tool}
\texttt{bib2bib} is a tool for extracting some entries from a list of
bibliography files. It is invoked as
\begin{flushleft}
\texttt{bib2bib} [options] \textit{file1.bib} $\cdots$ \textit{filen.bib}
\end{flushleft}
where the possible options are described below and where
\textit{file1.bib} $\cdots$ \textit{filen.bib} are the names of the
BibTeX files, which must have a \textit{.bib} suffix. If no files at
all are given on the command line, then input is taken from standard
input.
The options allow you to specify a filter condition that is tested
against each reference read from the bib files. The result is a new
BibTeX file containing all the entries of the input files that satisfy
the condition. Notice that this output file contains all the necessary
information: each string and each cross-reference needed is also
included in that file.
Additionally, \texttt{bib2bib} may output a file containing all the
keys of entries that satisfy the condition. This second file is
suitable for input as option \verb|-citefile| to \verb|bibtex2html|.
\subsection{Command line options}
\begin{description}
\item[\texttt{-c} \textit{condition}] ~
specify a condition for selecting the entries. The output will
retain only the entries that satisfy this condition. If several such
conditions are given, then only the entries that satisfy all the
conditions are selected. The syntax of conditions is given below.
Since conditions usually contain characters that the shell would
otherwise expand, you should write them between quotes.
\item[\texttt{-ob} \textit{filename}] ~
specify the filename where the selected entries are output. If not
given, it defaults to standard output.
\item[\texttt{-oc} \textit{filename}] ~
specify the filename where the list of selected keys is output. If
not given, this file is not created.
Notice that the two output files above are suitable for use with
bibtex2html. A typical use would be
\begin{flushleft}
\texttt{bib2bib -oc $citefile$ -ob $bibfile.bib$ -c $condition$
file1.bib file2.bib ... } \\
\texttt{bibtex2html -citefile $citefile$ bibfile.bib}
\end{flushleft}
which will produce exactly the HTML file for the selected
references.
\item[\texttt{\mm{}expand}] ~
expand all abbreviations in the output file.
\item[\texttt{\mm{}expand-xrefs}] ~
expand all crossrefs in the output file. Notice that the meaning of
such an expansion is not completely obvious: it's better to let
bibtex (via bibtex2html) handle the cross-references itself,
depending on the style considered.
Notice that \texttt{bibtex2html} itself will expand the strings (by
default, unless you specify the \verb|-noexpand| option) but not the
cross-references.
\item[\texttt{\mm{}no-comment}] ~
prevent generation of extra comments at beginning of output bib file.
\item[\texttt{\mm{}remove} \textit{f}] ~
remove all occurrences of field \textit{f}. This option can be
used several times to remove several fields.
\item[\texttt{\mm{}rename} \textit{f1} \textit{f2}] ~
rename all occurrences of field \textit{f1} into \textit{f2}. This
option can be used several times to rename several fields. Beware
that if an entry already has both fields \textit{f1} and
\textit{f2}, this will result in two fields \textit{f2}, and
BibTeX styles usually take only the first occurrence into account.
Example:
\begin{flushleft}
\texttt{bib2bib \mm{}remove abstract \mm{}remove copyright \mm{}rename x-pdf url $bibfile.bib$}
\end{flushleft}
removes all \texttt{abstract} and \texttt{copyright} fields and
renames all \texttt{x-pdf} fields to \texttt{url}.
\item[\texttt{-s} \textit{f}] ~
sorts the entries of the bibliography with respect to the given
field \textit{f}, which may also be \texttt{\$key} or \texttt{\$type}
to refer to the key or to the entry type, as for filter
conditions. It may also be \texttt{\$date}, to ask for sorting from
oldest to newest, as for option \texttt{-d} of bibtex2html.
This option may be used several times to specify a lexicographic
order, such as by author, then by type, then by date:
\begin{flushleft}
\texttt{bib2bib -s 'author' -s '\$type' -s '\$date' $bibfile.bib$}
\end{flushleft}
When sorting, the resulting bibliography will always contain the
comments first, then the preambles, then the abbreviations, and
finally the regular entries. Be warned that such a sort may put
cross-references before entries that refer to them, so be cautious.
\item[\texttt{-r}] ~
reverses the sort order.
\item[\texttt{-q}, \texttt{\mm{}quiet}] ~
be quiet.
\item[\texttt{-w}, \texttt{\mm{}warn-error}] ~
stop at the first warning.
\item[\texttt{\mm{}php-output} \textit{file}] ~
outputs the bib file as a two-dimensional array in PHP syntax, in
\textit{file}.
\end{description}
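
As a sketch of how these options combine (all file names are
placeholders), the first command below sorts the entries of
\texttt{biblio.bib} from newest to oldest, and the second writes a
copy with all abbreviations expanded and without the extra leading
comments:
\begin{verbatim}
bib2bib -s '$date' -r -ob sorted.bib biblio.bib

bib2bib --expand --no-comment -ob expanded.bib biblio.bib
\end{verbatim}
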
\subsection{Filter conditions}
\label{sec:conditions}
A filter condition is a boolean expression that is evaluated against a
BibTeX entry to decide whether this entry should be selected. A
condition is either:
\begin{itemize}
\item a \emph{comparison} between two \emph{expressions}, written as
$e_1~op~e_2$ ;
\item a matching of a field name with respect to a \emph{regular
expression}, written as $field : regexp$ ;
\item a conjunction of two conditions, written as $c_1 \verb| and |
c_2$ (or $c_1 \verb| & | c_2$) ;
\item a disjunction of two conditions, written as $c_1 \verb| or |
c_2$ (or $c_1 \verb+ | + c_2$);
\item a negation of a condition, written as $\verb|not | c$ (or
$\verb|! | c$) ;
\item a test of existence of a field, written as $\verb|exists | f$
(or $\verb|? | f$) where $f$ is a field name ;
\end{itemize}
where an expression is either:
\begin{itemize}
\item a string constant, written either between double quotes or single
quotes ;
\item an integer constant ;
\item a field name ;
\item the special ident \verb|$key| which corresponds to the key of
an entry.
\item the special ident \verb|$type| which corresponds to the type
of an entry (ARTICLE, INPROCEEDINGS, etc.). Notice that an entry
type is always written in uppercase letters.
\end{itemize}
Comparison operators are the usual ones: \texttt{=}, \texttt{<},
\texttt{>}, \texttt{<=}, \texttt{>=} and \texttt{<>}.
The field names are any sequences of lowercase or uppercase letters (but no
distinction is made between lowercase and uppercase letters).
Be careful when writing conditions in a shell command: the shell must
not interpret anything in a condition, such as \verb|$key|. So usually
you need to put conditions inside quote characters that forbid shell
interpretation: single quotes under Unix shell, or double quotes under
Microsoft Windows shell. This is why strings in conditions may be
written between either single or double quotes: use whichever kind you
are not already using to forbid shell interpretation. In the
examples below, we use the Unix convention, so under Windows you have
to swap the single and double quotes.
Note that within \texttt{Makefile}s you have to escape the \verb|$|
character in \verb|$key| or \verb|$type| (using \verb|$$key| and
\verb|$$type| instead, at least with GNU make).
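
A minimal sketch of such a rule for GNU make, with hypothetical file
names (in an actual \texttt{Makefile} the command line must start
with a tab character):
\begin{verbatim}
inproc.bib: biblio.bib
        bib2bib -ob inproc.bib -c '$$type = "INPROCEEDINGS"' biblio.bib
\end{verbatim}
Here make turns \verb|$$type| into \verb|$type|, and the single
quotes then prevent the shell from interpreting it.
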
Regular expressions must be put between single or double quotes, and must
follow the GNU syntax of regular expressions, as for example in GNU Emacs. Any
character other than \verb|$^.*+?[]| matches itself, see
Table~\ref{table:regexp} for the meaning of the special characters.
\begin{table}[t]
\begin{center}
\begin{tabular}{|l|p{100mm}|}
\hline
\verb|.| & matches any character except newline \\\hline
\verb|[..]| & character set; ranges are denoted with \verb|-|, as in
\verb|[a-z]|; an initial \verb|^|, as in \verb|[^0-9]|, complements
the set \\\hline
\verb|^| & matches the beginning of the string matched \\\hline
\verb|$| & matches the end of the string matched \\\hline
\verb|\r| & matches a carriage return \\\hline
\verb|\n| & matches a linefeed \\\hline
\verb|\t| & matches a tabulation \\\hline
\verb|\b| & matches word boundaries \\\hline
\verb|\|\textsl{ddd} & matches character with ASCII code \textsl{ddd} in decimal \\\hline
\verb|\|\textsl{char} & quotes special character \textsl{char} among
\verb#$^.*+?[]\# \\\hline
\textsl{regexp}\verb|*| & matches \textsl{regexp} zero, one or several
times \\\hline
\textsl{regexp}\verb|+| & matches \textsl{regexp} one or several times
\\\hline
\textsl{regexp}\verb|?| & matches \textsl{regexp} once or not at all \\\hline
\textsl{regexp1} \verb+\|+ \textsl{regexp2} & alternative between two
regular expressions, this operator has low priority against
\verb|*|, \verb|+| and \verb|?| \\\hline
\verb|\(| \textsl{regexp} \verb|\)| & grouping regular expression \\\hline
\end{tabular}
\end{center}
\caption{Syntax of regular expressions}
\label{table:regexp}
\end{table}
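
For illustration, here are two conditions built from the constructs of
Table~\ref{table:regexp} (file names are placeholders):
\begin{verbatim}
# author field containing Knuth or Lamport
bib2bib -ob kl.bib -c 'author : "Knuth\|Lamport"' biblio.bib

# year field of the form 199x
bib2bib -ob nineties.bib -c 'year : "^199[0-9]$"' biblio.bib
\end{verbatim}
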
Notice that if several conditions are given with option \verb|-c| on
the command line, then they are understood as the conjunction of them,
in other words
\begin{flushleft}
\texttt{bib2bib -c '$c_1$' $\cdots$ -c '$c_n$'}
\end{flushleft}
is equivalent to
\begin{flushleft}
\texttt{bib2bib -c '$c_1$ and $\cdots$ and $c_n$'}
\end{flushleft}
Table~\ref{table:syntax} shows a formal grammar for conditions.
\begin{table}[t]
\begin{eqnarray*}
Cond & \rightarrow & Cond \verb| and | Cond \mid Cond \verb| or | Cond
\mid \verb|not | Cond \mid \verb|exists | Id \\
Cond & \rightarrow & Cond \verb| & | Cond \mid Cond \verb+ | + Cond
\mid \verb|! | Cond \mid \verb|? | Id\\
&& \mid Expr ~ Comp ~ Expr \mid Expr \verb|:| String
\mid \verb|( | Cond \verb| )|\\
Comp & \rightarrow & \verb|=| \mid \verb|>| \mid \verb|<| \mid
\verb|>=| \mid \verb|<=| \mid \verb|<>| \\
Expr & \rightarrow & Id \mid String \mid Int \mid \verb|$key|
\mid \verb|$type| \\
Id & \rightarrow & [\verb|a|-\verb|z|\verb|A|-\verb|Z|]^+ \\
String & \rightarrow & \verb|"| ([\verb|^"\|] \mid \verb|\"| \mid \verb|\\| )^*
\verb|"| \mid \verb|'| ([\verb|^'\|] \mid \verb|\'| \mid \verb|\\| )^* \verb|'| \\
Int & \rightarrow & [\verb|0|-\verb|9|]^+
\end{eqnarray*}
\caption{Syntax of conditions}
\label{table:syntax}
\end{table}
\subsubsection*{Remarks on evaluation of conditions}
\begin{itemize}
\item According to BibTeX conventions, entry types, keys and field
names have to be considered case insensitive, that is no distinction
is made between uppercase and lowercase letters. Inside
\verb|bib2bib|, these are always converted to uppercase, so you should
take this into account when writing conditions (see below).
\item On the other hand, case matters when comparing strings, or
matching them against regular expressions. For example,
\verb|title : "Computer"| may return \verb|true| if the title
contains the word \verb|Computer| with a capital letter, whereas
\verb|title : "computer"| would return \verb|false|.
\item A consequence of the two previous remarks is that if you want,
for example, to check equality of the entry type and a string value, you
must put the value in uppercase, as in \verb|$type = "INPROCEEDINGS"|;
otherwise the condition would always be false.
\item When performing a comparison with a non-existent field, the result is
always false; beware that this means that, for example,
\verb|not (f = "value")| and \verb|f <> "value"| are not equivalent: for an
entry that does not have a field \verb|f|, the first condition is true
whereas the second is false.
\item As usual, \verb|not| has higher priority than \verb|and|, which itself
has higher priority than \verb|or|. \verb|and| and \verb|or| associate to
the left.
\item Comparison using \verb|>|, \verb|<|, \verb|>=| and \verb|<=| may only be
used between integer values. In any other case, a warning is displayed and
the result is false.
\item There is a special handling for strings containing LaTeX
accented letters (or for backward compatibility, ISO-Latin1 accented
characters): all variants of writing such letters are considered the
same, and equivalent to their HTML entity form. For example, the strings
\verb|"Filli\^atre"|, \verb|"Filli{\^a}tre"|, \verb|"Filli\^{a}tre"|
and \verb|"Filliâtre"| are considered identical and indeed equal to
\verb|"Filli&acirc;tre"|. Note that when using such a string as a
regular expression, there is no need to escape the backslash, since
interpretation of LaTeX accenting commands is made before
interpretation into a regexp. Using HTML entities for matching
accented names is thus considered the safest method.
\end{itemize}
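
The remark on non-existent fields can be illustrated by the following
sketch (file names are placeholders):
\begin{verbatim}
# keeps entries whose year differs from 1999 and entries with no year
bib2bib -ob a.bib -c 'not (year = 1999)' biblio.bib

# keeps only entries that have a year field different from 1999
bib2bib -ob b.bib -c 'year <> 1999' biblio.bib

# keeps entries without any abstract field
bib2bib -ob c.bib -c 'not exists abstract' biblio.bib
\end{verbatim}
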
\subsection{Examples}
Here are some examples to help you write the filter conditions you
are interested in.
\subsubsection{Selecting entries of a given year}
The following command reads the input files \verb|biblio1.bib| and
\verb|biblio2.bib|, and selects only the entries that appeared in 1999:
\begin{verbatim}
bib2bib -oc cite1999 -ob 1999.bib -c 'year=1999' biblio1.bib biblio2.bib
\end{verbatim}
The resulting file \verb|cite1999| contains the list of keys
selected. You can then produce the HTML file by
\begin{verbatim}
bibtex2html -citefile cite1999 1999.bib
\end{verbatim}
You may also select references that appeared after and/or before a given
year. For example, references after 1997:
\begin{verbatim}
bib2bib -oc citeaft1997 -ob aft1997.bib -c 'year>1997' biblio.bib
\end{verbatim}
or between 1990 and 1995:
\begin{verbatim}
bib2bib -oc cite90-95 -ob 90-95.bib -c 'year>=1990 and year<=1995' biblio.bib
\end{verbatim}
\subsubsection{Selecting references of a given author}
The following command reads the input file \verb|biblio.bib| and selects
only entries whose (co)author is Donald Knuth:
\begin{verbatim}
bib2bib -oc knuth-citations -ob knuth.bib -c 'author : "Knuth"' biblio.bib
\end{verbatim}
As a more complicated example, if you would like to have only the
references whose sole author is Knuth, you may try
\begin{verbatim}
bib2bib -oc knuth-citations -ob knuth.bib \
  -c 'author : "^\(Donald \(E. \)?Knuth\|Knuth, Donald\( E.\)?\)$"' biblio.bib
\end{verbatim}
or equivalently but missing the possible ``\verb|E.|'':
\begin{verbatim}
bib2bib -oc knuth-citations -ob knuth.bib -c 'author = "Donald Knuth"
or author = "Knuth, Donald"' biblio.bib
\end{verbatim}
\subsubsection{Other examples}
Any boolean combination of comparisons and/or matchings is
possible. For example, the following command extracts the references
that appeared since 1995 and have lambda-calculus in their title, with
anything between ``lambda'' and ``calculus'':
\begin{verbatim}
bib2bib -oc lambda -c 'year >= 1995 and title : "lambda.*calculus"' biblio.bib
\end{verbatim}
In particular, it will select a title containing
\verb|$\lambda$-calculus|.
\subsection{Note on duplicate entries}
\verb|bib2bib| has the effect of merging several bib files into a
single one. This process may result in duplicate entries in the
resulting file, which \verb|bibtex| considers an error.
Of course, this is not really a bug in \verb|bib2bib|, since it is
your responsibility to avoid having entries with the same key.
However, there are two particular cases when this occurs naturally:
when two bib files share common abbreviations, or when they share
common cross-references.
In order to make \verb|bib2bib| behave correctly in such a case,
it is designed as follows: for repeated abbrevs, the first abbrev is kept and
the others are ignored, and for repeated regular entries, the last entry
is kept and the others are ignored. With this behaviour, everything
works well as long as repeated abbrevs are really duplicate abbrevs of
the same sentence, and repeated keys are really duplicate
entries.
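
As a sketch, and assuming (as in the \texttt{\mm{}remove} example
above) that \texttt{bib2bib} keeps every entry when no \texttt{-c}
condition is given, two files could be merged with
\begin{verbatim}
bib2bib -ob merged.bib biblio1.bib biblio2.bib
\end{verbatim}
where \texttt{merged.bib} is a hypothetical output name. If both input
files define the same abbreviation, the definition from the first
occurrence is the one kept; if both contain an entry with the same
key, the last occurrence is the one kept.
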
\section{The \texttt{aux2bib} command line tool}
\texttt{aux2bib} is a tool extracting the BibTeX references from a
\texttt{.aux} file (as produced by \LaTeX) and building the
corresponding BibTeX file. It is invoked as
\begin{flushleft}
\texttt{aux2bib} \textit{file.aux}
\end{flushleft}
The BibTeX file is written on the standard output.
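
For example, with a hypothetical document \texttt{paper.tex} that has
already been processed by \LaTeX, the output can be redirected to a
file:
\begin{verbatim}
aux2bib paper.aux > paper-refs.bib
\end{verbatim}
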
\section{Frequently Asked Questions}
\begin{enumerate}
\item \textbf{How may I tell bibtex2html to expand cross-references?} ~
By default, all entries of the input BibTeX file are translated into
HTML, including cross-references. Since the latter are there, bibtex
will never expand them. If you want them to be expanded, you have to
tell bibtex2html that crossref entries need not be in the resulting
file. To do that you have to use the option \verb|-citefile| to give
the exact list of entries you want to see. If a cross-reference is
not in that list, then its fields will be expanded into all entries
that cross-refer to it. (Technically, this works because bibtex2html
calls bibtex with option \verb|-min-crossrefs=1000| by default.)
\item \textbf{When running}
\begin{verbatim}
bib2bib -oc knuth-citations -ob knuth.bib -c 'author : "Knuth"' biblio.bib
\end{verbatim}
\textbf{I get ``Lexical error in condition: Unterminated
string''. What's going wrong?}
You are probably running \verb|bib2bib| under Microsoft Windows, hence
you should permute the use of single quotes and double quotes, as
explained in Section~\ref{sec:conditions}:
\begin{verbatim}
bib2bib -oc knuth-citations -ob knuth.bib -c "author : 'Knuth'" biblio.bib
\end{verbatim}
\end{enumerate}
%HEVEA\begin{rawhtml}<hr><img src="http://www.lri.fr/~filliatr/icons/mail.gif" ALIGN=middle><em><a href="mailto:Jean-Christophe.Filliatre[at]lri.fr, Claude.Marche[at]lri.fr">mail to authors</a> , Fri Feb 12 17:46:23 1999\end{rawhtml}
\end{document}
%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End: