Root/trunk/package/bin/po4a/lib/Locale/Po4a/Man.pm

#!/usr/bin/perl -w

=encoding UTF-8

=head1 NAME

Locale::Po4a::Man - convert manual pages from/to PO files

=head1 DESCRIPTION

The po4a (PO for anything) project goal is to ease translations (and more
interestingly, the maintenance of translations) using gettext tools on
areas where they were not expected like documentation.

Locale::Po4a::Man is a module to help the translation of documentation in
the nroff format (the language of manual pages) into other [human]
languages.

=head1 TRANSLATING WITH PO4A::MAN

This module tries pretty hard to make translators' lives easier. To that end,
the text presented to translators isn't a verbatim copy of the text found
in the man page. Indeed, the cruder parts of the nroff format are hidden, so
that translators can't mess them up.

=head2 Text wrapping

Unindented paragraphs are automatically rewrapped for the translator. This
can lead to some minor differences in the generated output, since the
rewrapping rules used by groff aren't very clear. For example, two spaces
after a parenthesis are sometimes preserved.

Anyway, the difference will only be about the position of the extra spaces
in wrapped paragraphs, and I think it's worth it.

=head2 Font specification

The first change is about font change specifications. In nroff, there are
several ways to specify whether a given word should be written in small,
bold or italic type. In the text to translate, there is only one way,
borrowed from the POD (Perl online documentation) format:

=over

=item IE<lt>textE<gt> -- italic text

equivalent to \fItext\fP or ".I text"

=item BE<lt>textE<gt> -- bold text

equivalent to \fBtext\fP or ".B text"

=item RE<lt>textE<gt> -- roman text

equivalent to \fRtext\fP

=item CWE<lt>textE<gt> -- constant width text

equivalent to \f(CWtext\fP or ".CW text"

=back

Remark: The CW face is not available for all groff devices. Using it is
not recommended; it is provided for your convenience.

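The mapping above can be sketched in a few lines of Perl. This is only an
illustration under simplifying assumptions: C<simplify_fonts> and
C<%pod_tag> are hypothetical names, nesting is not handled, and the real
module processes fonts with considerably more care.

```perl
use strict;
use warnings;

# Hypothetical lookup table (not part of Locale::Po4a::Man): roff font
# selector => POD-like tag shown to translators.
my %pod_tag = ('I' => 'I', 'B' => 'B', 'R' => 'R', '(CW' => 'CW');

# Hypothetical helper: rewrite simple, non-nested "\fX text \fP" spans.
sub simplify_fonts {
    my ($text) = @_;
    # \fI, \fB, \fR or \f(CW, then the shortest text up to the next \fP.
    $text =~ s/\\f(I|B|R|\(CW)(.*?)\\fP/$pod_tag{$1}<$2>/g;
    return $text;
}

print simplify_fonts('Run \fBls\fP on \fIfile\fP.'), "\n";
```

With this sketch, the translator would see C<< Run B<ls> on I<file>. >>
instead of the raw escapes.
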
=head2 Automatic characters transliteration

Po4a automatically transliterates some characters to ease the translation
or the review of the translation.
Here is the list of the transliterations:

=over

=item hyphens

Hyphens (-) and minus signs (\-) in man pages are all transliterated
as simple dashes (-) in the PO file. Then all dashes are transliterated into
roff minus signs (\-) when the translation is inserted into the output
document.

Translators can force a hyphen by using the roff glyph '\[hy]' in their
translations.

=item non-breaking spaces

Translators can use non-breaking spaces in their translations. These
non-breaking spaces (0xA0 in latin1) will be transliterated into a roff
non-breaking space ('\ ').

=item quotes transliterations

`` and '' are respectively transliterated into \*(lq and \*(rq.

To avoid these transliterations, translators can insert a zero width roff
character (i.e., using `\&` or '\&' respectively).

=back

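The round trip described in this list can be sketched as follows. This is
a simplified illustration, not the module's implementation: C<roff_to_po>
and C<po_to_roff> are hypothetical names, and special cases such as font
size modifiers or the '\&' escape are deliberately left out.

```perl
use strict;
use warnings;

# Hypothetical helper: what the translator sees in the PO file.
sub roff_to_po {
    my ($s) = @_;
    $s =~ s/\\-/-/g;          # roff minus sign \- becomes a plain dash
    $s =~ s/\\\*\(lq/``/g;    # \*(lq becomes ``
    $s =~ s/\\\*\(rq/''/g;    # \*(rq becomes ''
    return $s;
}

# Hypothetical helper: what is written back to the translated page.
sub po_to_roff {
    my ($s) = @_;
    $s =~ s/\\-/-/g;          # normalise minus signs typed by the translator
    $s =~ s/-/\\-/g;          # every dash becomes a roff minus sign
    $s =~ s/``/\\*(lq/g;
    $s =~ s/''/\\*(rq/g;
    $s =~ s/\x{A0}/\\ /g;     # non-breaking space (0xA0) becomes "\ "
    return $s;
}

print roff_to_po('\*(lqfoo\*(rq \-\-quiet'), "\n";
print po_to_roff("``foo'' --quiet"), "\n";
```

Note that the glyph '\[hy]' contains no dash, so a hyphen forced this way
passes through C<po_to_roff> unchanged.
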
=head2 Putting 'E<lt>' and 'E<gt>' in translations

Since these characters are used to delimit parts under font modification,
you can't use them verbatim. Use EE<lt>ltE<gt> and EE<lt>gtE<gt> instead
(as in POD, one more time).

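For illustration, the escaping applied to text taken from the original
page can be sketched like this. The three substitutions mirror the ones
found in this module's pre_trans subroutine; the wrapper name
C<escape_angles> is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the three substitutions used in pre_trans.
sub escape_angles {
    my ($str) = @_;
    $str =~ s/>/E<gt>/sg;          # every > becomes E<gt>
    $str =~ s/</E<lt>/sg;          # every < becomes E<lt>, including the
                                   # one just introduced by E<gt>
    $str =~ s/EE<lt>gt>/E<gt>/g;   # repair the E<gt> damaged by the line above
    return $str;
}

print escape_angles('a < b > c'), "\n";
```
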
=head1 OPTIONS ACCEPTED BY THIS MODULE

These are this module's particular options:

=over

=item B<debug>

Activate debugging for some internal mechanisms of this module.
Use the source to see which parts can be debugged.

=item B<verbose>

Increase verbosity.

=item B<groff_code>

This option changes the behavior of the module when it encounters
a .de, .ie or .if section. It can take the following values:

=over

=item I<fail>

This is the default value.
The module will fail when a .de, .ie or .if section is encountered.

=item I<verbatim>

Indicates that the .de, .ie or .if sections must be copied as is
from the original to the translated document.

=item I<translate>

Indicates that the .de, .ie or .if sections will be proposed for
translation.
You should only use this option if a translatable string is
contained in one of these sections. Otherwise, I<verbatim>
should be preferred.

=back

=item B<generated>

This option specifies that the file was generated, and that po4a should not
try to detect whether the man page was generated from another format.
This allows po4a to be used on generated man pages.
This option does not take any argument.

=item B<mdoc>

This option is only useful for mdoc pages.

It selects a stricter support of the mdoc format by telling po4a not to
translate the 'NAME' section.
mdoc pages whose 'NAME' section is translated won't generate any header or
footer.

According to the groff_mdoc page, the NAME, SYNOPSIS and DESCRIPTION
sections are mandatory.
There are no known issues with translated SYNOPSIS or DESCRIPTION sections,
but you can also specify these sections this way:

 -o mdoc=NAME,SYNOPSIS,DESCRIPTION

This mdoc issue can also be solved with an addendum like this one:

 PO4A-HEADER:mode=before;position=^.Dd
 .TH DOCUMENT_TITLE 1 "Month day, year" OS "Section Name"

=back

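As a rough sketch of how the B<mdoc> value is interpreted, the following
stand-alone function follows the logic of this module's initialize
subroutine. The function name C<parse_mdoc_option> is hypothetical, and it
assumes that an option given without argument arrives as the value 1.

```perl
use strict;
use warnings;

# Hypothetical helper: return the set of section names that should be
# left untranslated in mdoc mode.
sub parse_mdoc_option {
    my ($value) = @_;
    my %untranslated_sections;
    return \%untranslated_sections unless defined $value;
    if ($value eq '1') {
        # "-o mdoc" alone: only protect the NAME section.
        $untranslated_sections{'NAME'} = 1;
    } else {
        # "-o mdoc=NAME,SYNOPSIS,...": protect every listed section.
        $untranslated_sections{$_} = 1 foreach split(/,/, $value);
    }
    return \%untranslated_sections;
}

my $sections = parse_mdoc_option('NAME,SYNOPSIS,DESCRIPTION');
print join(' ', sort keys %$sections), "\n";
```
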
The following options specify the behavior of a new macro
(defined with a .de request), or of a macro not supported by po4a.
They take as argument a comma-separated list of macros.
For example:

 -o noarg=FO,OB,AR -o translate_joined=BA,ZQ,UX

Note: if a macro is not supported by po4a and if you consider that it is a
standard roff macro, you should submit it to the po4a development team.

=over

=item B<untranslated>

B<untranslated> indicates that this macro (and its arguments) does not have
to be translated.

=item B<noarg>

B<noarg> is like B<untranslated>, except that po4a will verify that no
argument is added to this macro.

=item B<translate_joined>

B<translate_joined> indicates that po4a must propose to translate the
arguments of the macro.

=item B<translate_each>

With B<translate_each>, the arguments will also be proposed for
translation, except that each one will be translated separately.

=item B<no_wrap>

This option takes as argument a list of comma-separated couples
I<begin>:I<end>, where I<begin> and I<end> are commands that delimit
the beginning and end of a section that should not be rewrapped.

Note: no test is done to ensure that an I<end> command matches its
I<begin> command; any ending command stops the no_wrap mode.
If you have a I<begin> (respectively I<end>) macro that has no I<end>
(respectively I<begin>), you can specify an existing I<end> (like fi) or
I<begin> (like nf) as a counterpart.
These macros (and their arguments) won't be translated.

=item B<inline>

This option specifies a list of comma-separated macros that must
not split the current paragraph. The string to translate will then contain
I<foo E<lt>.bar baz quxE<gt> quux>, where I<bar> is the command that
should be inlined, and I<baz qux> its arguments.

=item B<unknown_macros>

This option indicates how po4a should behave when an unknown macro is found.
By default, po4a fails with a warning.
It can take the following values: B<failed> (the default value),
B<untranslated>, B<noarg>, B<translate_joined>, or B<translate_each> (see above
for an explanation of these values).

=back

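The I<begin>:I<end> syntax of B<no_wrap> can be illustrated with a small
stand-alone parser. It follows the same split-and-match approach as this
module's initialize subroutine, but C<parse_no_wrap> is a hypothetical
name and the macro names in the example (Vb, Ve) are only sample input.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "Vb:Ve,nf:fi" into two lookup tables.
sub parse_no_wrap {
    my ($value) = @_;
    my (%begin, %end);
    foreach my $pair (split(/,/, $value)) {
        my ($from, $to) = $pair =~ m/^(.*):(.*)$/
            or die "no_wrap expects comma-separated begin:end couples\n";
        $begin{$from} = 1;
        $end{$to}     = 1;
    }
    # Begin and end macros are stored independently: any end macro
    # closes a section opened by any begin macro.
    return (\%begin, \%end);
}

my ($begin, $end) = parse_no_wrap('Vb:Ve,nf:fi');
print "begin: ", join(' ', sort keys %$begin), "\n";
print "end:   ", join(' ', sort keys %$end), "\n";
```

Keeping the two tables separate is what makes the note above true: the
parser never remembers which I<begin> opened the current section.
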
=head1 AUTHORING MAN PAGES COMPLIANT WITH PO4A::MAN

This module is still very limited, and will always be, because it's not a
real nroff interpreter. It would be possible to do a real nroff
interpreter, to allow authors to use all the existing macros, or even to
define new ones in their pages, but we didn't want to. It would be too
difficult, and we thought it wasn't necessary. We do think that if
man page authors want to see their productions translated, they may have to
adapt to ease the work of translators.

So, the man parser implemented in po4a has some known limitations we are
not really inclined to correct, and which will constitute some pitfalls
you'll have to avoid if you want to see translators taking care of your
documentation.

=head2 Don't program in nroff

nroff is a complete programming language, with macro definition,
conditionals and so on. Since this parser isn't a fully featured nroff
interpreter, it will fail on pages using these facilities (there are about
200 such pages on my box).

=head2 Use the plain macro set

There are still some macros which are not supported by po4a::man. This is
only because I failed to find any documentation about them. Here is the
list of unsupported macros used on my box. Note that this list isn't
exhaustive since the program fails on the first encountered unsupported
macro. If you have any information about some of these macros, I'll
happily add support for them. Because of these macros, about 250 pages on
my box are inaccessible to po4a::man.

 .. ." .AT .b .bank
 .BE ..br .Bu .BUGS .BY
 .ce .dbmmanage .do .En
 .EP .EX .Fi .hw .i
 .Id .l .LO .mf
 .N .na .NF .nh .nl
 .Nm .ns .NXR .OPTIONS .PB
 .pp .PR .PRE .PU .REq
 .RH .rn .S< .sh .SI
 .splitfont .Sx .T .TF .The
 .TT .UC .ul .Vb .zZ

=head2 Hiding text from po4a

Sometimes, the author knows that some parts are not translatable, and
should not be extracted by po4a. For example, an option may accept an
I<other> argument, and I<other> may also appear as the last item of a
list. In the first case, I<other> should not be translatable. And in
the second case, I<other> should be translated.

In such cases, the author can prevent po4a from extracting some strings,
using some special groff constructs:

 .if !'po4a'hide' .B other

(this will require the B<-o groff_code=verbatim> option)

A new macro can also be defined to automate this:

 .de IR_untranslated
 . IR \\$@
 ..

 .IR_untranslated \-q ", " \-\-quiet

(this will require the options B<-o groff_code=verbatim> and
B<-o untranslated=IR_untranslated>; with this construct, the B<.if
!'po4a'hide'> conditional is not strictly needed since po4a will not parse
the internals of the macro definition)

or using an alias:

 .als IR_untranslated IR

 .IR_untranslated \-q ", " \-\-quiet

(this will require the B<-o untranslated=als,IR_untranslated> option)

=head2 Conclusion

To summarise this section, keep it simple, and don't try to be clever while
authoring your man pages. A lot of things are possible in nroff, and not
supported by this parser. For example, don't try to mess with \c to
interrupt the text processing (like 40 pages on my box do). Or, be sure to
put the macro arguments on the same line as the macro itself. I know that
splitting them is valid in nroff, but handling it would complicate the
parser too much.

Of course, another possibility is to use another format, more translator
friendly (like POD using po4a::pod, or one of the XML family like SGML),
but thanks to po4a::man it isn't needed anymore. That being said, if the
source format of your documentation is POD, or XML, it may be clever to
translate the source format and not this generated one. In most cases,
po4a::man will detect generated pages and issue a warning. It will even
refuse to process POD generated pages, because those pages are perfectly
handled by po4a::pod, and because their nroff counterpart defines a lot of
new macros I didn't want to write support for. On my box, 1432 of the 4323
pages are generated from POD and will be ignored by po4a::man.

In most cases, po4a::man will detect the problem and refuse to process the
page, issuing an adapted message. In some rare cases, the program will
complete without warning, but the output will be wrong. Such cases are
called "bugs" ;) If you encounter such a case, be sure to report it, along
with a fix when possible...

=head1 STATUS OF THIS MODULE

This module can be used for most of the existing man pages.

Some tests are regularly run on Linux boxes:

=over 4

=item *

one third of the pages are refused because they were generated from
another format supported by po4a (e.g. POD or SGML).

=item *

10% of the remaining pages are rejected with an error (e.g. a
groff macro is not supported).

=item *

Then, less than 1% of the pages are accepted silently by po4a, but with
significant issues (i.e. missing words, or new words inserted).

=item *

The other pages are usually handled with no differences beyond spacing
or rewrapped lines (font issues in less than 10% of the processed pages).

=back

=head1 SEE ALSO

L<Locale::Po4a::Pod(3pm)>,
L<Locale::Po4a::TransTractor(3pm)>,
L<po4a(7)|po4a.7>

=head1 AUTHORS

 Denis Barbier <barbier@linuxfr.org>
 Nicolas François <nicolas.francois@centraliens.net>
 Martin Quinson (mquinson#debian.org)

=head1 COPYRIGHT AND LICENSE

Copyright 2002-2008 by SPI, inc.

This program is free software; you may redistribute it and/or modify it
under the terms of GPL (see the COPYING file).

=cut

package Locale::Po4a::Man;
use DynaLoader;

use 5.006;
use strict;
use warnings;

require Exporter;
use vars qw(@ISA @EXPORT);
@ISA = qw(Locale::Po4a::TransTractor DynaLoader);
@EXPORT = qw();# new initialize);

# Try to use a C extension if present.
eval('bootstrap Locale::Po4a::Man "0.30"');

use Locale::Po4a::TransTractor;
use Locale::Po4a::Common;

use File::Spec;
use Getopt::Std;

my %macro; # hash of known macros, with parsing sub. See end of this file
my %default_macro; # The default known macros, when no options are used.

# A font starts with \f and is followed by one of:
#   [.*] - a font name within brackets (e.g. [P], [A_USER_FONT])
#   (..  - a parenthesis followed by two chars (e.g. "(CW")
#   .    - a single char (e.g. B, I, R, P, 1, 2, 3, 4, etc.)
my $FONT_RE = "\\\\f(?:\\[[^\\]]*\\]|\\(..|[^\\(\\[])";

# Variable used to identify non-breaking spaces.
# These non-breaking spaces are used to ease the parsing, and a
# translator can use them in her translation (they will be translated
# into the groff non-breaking space).
my $nbs;

# Indicate if the page uses the mdoc macros
my $mdoc_mode = 0;

my $unknown_macros = undef;

#########################
#### DEBUGGING STUFF ####
#########################
my %debug;
# The following debug options can be set with '-o debug=...':
#  * splitargs see how macro args are separated
#  * pretrans  see pre-conditioning of translation
#  * postrans  see post-conditioning of translation
#  * fonts     see font modifier handling


######## CONFIG #########
# This variable indicates the behavior of the module when a .de, .if or
# .ie is encountered.
my $groff_code;
# %no_wrap_begin and %no_wrap_end are lists of macros that respectively
# begin and end a no_wrap paragraph.
# Any ending macro will end the no_wrap paragraph started by any beginning
# macro.
my %no_wrap_begin;
my %no_wrap_end;
# List of macros that should be inlined (with E<.xx ...>)
my %inline;
# The default list of inlined macros (when no options are used)
my %default_inline;
# This variable indicates whether po4a should try to detect the generated
# files.
my $allow_generated;
# This hash indicates the section names that should not be translated in
# mdoc mode.
# The groff's mdoc processor requires the NAME section, otherwise headers
# and footers of the pages are not generated.
# The groff_mdoc man page indicates that NAME, SYNOPSIS and DESCRIPTION
# are mandatory.
my %mdoc;

sub initialize {
    my $self = shift;
    my %options = @_;

    $self->{options}{'debug'}='';
    $self->{options}{'verbose'}='';
    $self->{options}{'groff_code'}='';
    $self->{options}{'untranslated'}='';
    $self->{options}{'noarg'}='';
    $self->{options}{'translate_joined'}='';
    $self->{options}{'translate_each'}='';
    $self->{options}{'no_wrap'}='';
    $self->{options}{'inline'}='';
    $self->{options}{'generated'}='';
    $self->{options}{'mdoc'}='';
    $self->{options}{'unknown_macros'}='';

    foreach my $opt (keys %options) {
        if (defined $options{$opt}) {
            die wrap_mod("po4a::man",
                         dgettext("po4a", "Unknown option: %s"), $opt)
                unless exists $self->{options}{$opt};
            $self->{options}{$opt} = $options{$opt};
        }
    }

    %debug = ();
    if (defined $options{'debug'}) {
        foreach ($options{'debug'}) {
            $debug{$_} = 1;
        }
    }

    $groff_code = "fail";
    if (defined $options{'groff_code'}) {
        unless ($options{'groff_code'} =~ m/fail|verbatim|translate/) {
            die wrap_mod("po4a::man", dgettext("po4a",
                "Invalid 'groff_code' value. Must be one of 'fail', 'verbatim', 'translate'."));
        }
        $groff_code = $options{'groff_code'};
    }

    if (%default_macro) {
        %macro = %default_macro;
    } else {
        %default_macro = %macro
    }
    if (defined $options{'untranslated'}) {
        foreach (split(/,/, $options{'untranslated'})) {
            $macro{$_} = \&untranslated;
        }
    }
    if (defined $options{'noarg'}) {
        foreach (split(/,/, $options{'noarg'})) {
            $macro{$_} = \&noarg;
        }
    }
    if (defined $options{'translate_joined'}) {
        foreach (split(/,/, $options{'translate_joined'})) {
            $macro{$_} = \&translate_joined;
        }
    }
    if (defined $options{'translate_each'}) {
        foreach (split(/,/, $options{'translate_each'})) {
            $macro{$_} = \&translate_each;
        }
    }

    %no_wrap_begin = (
        'nf' => 1,
        'EX' => 1,
        'EQ' => 1
    );
    %no_wrap_end = (
        'fi' => 1,
        'EE' => 1,
        'EN' => 1
    );
    if (defined $options{'no_wrap'}) {
        foreach (split(/,/, $options{'no_wrap'})) {
            if ($_ =~ m/^(.*):(.*)$/) {
                $no_wrap_begin{$1} = 1;
                $no_wrap_end{$2} = 1;
            } else {
                die wrap_mod("po4a::man", dgettext("po4a","The no_wrap parameters must be a set of comma-separated begin:end couples.\n"));
            }
        }
    }

    if (%default_inline) {
        %inline = %default_inline;
    } else {
        %default_inline = %inline
    }
    if (defined $options{'inline'}) {
        foreach (split(/,/, $options{'inline'})) {
            $inline{$_} = 1;
        }
    }

    $allow_generated = 0;
    if (defined $options{'generated'}) {
        $allow_generated = 1;
    }

    %mdoc = ();
    if (defined $options{'mdoc'}) {
        if ($options{'mdoc'} eq 1) {
            $mdoc{"NAME"} = 1;
        } else {
            foreach (split(/,/, $options{'mdoc'})) {
                $mdoc{$_} = 1;
            }
        }
    }

    $unknown_macros = undef;
    if (defined $options{'unknown_macros'}) {
        if ($options{'unknown_macros'} eq "failed") {
            $unknown_macros = undef;
        } elsif ($options{'unknown_macros'} eq "untranslated") {
            $unknown_macros = \&untranslated;
        } elsif ($options{'unknown_macros'} eq "noarg") {
            $unknown_macros = \&noarg;
        } elsif ($options{'unknown_macros'} eq "translate_joined") {
            $unknown_macros = \&translate_joined;
        } elsif ($options{'unknown_macros'} eq "translate_each") {
            $unknown_macros = \&translate_each;
        } else {
            die wrap_mod("po4a::man", dgettext("po4a",
                "Invalid 'unknown_macros' value. Must be one of:\n").
                "failed untranslated noarg translate_joined translate_each\n");
        }
    }
}

my @comments = ();
my @next_comments = ();
# This function returns the next line of the document being parsed
# (and its reference).
# It overloads the Transtractor shiftline to handle:
#   - font requests (.B, .I, .BR, .BI, ...)
#     because these requests can be present in a paragraph (handled
#     in the parse subroutine), or in argument (on the next line)
#     of some other request (for example .TP)
#   - font size requests (.SM,.SB) (not done yet)
#   - input escape (\ at the end of a line)
sub shiftline {
    my $self = shift;
    # call Transtractor's shiftline
NEW_LINE:
    my ($line,$ref) = $self->SUPER::shiftline();

    if (!defined $line) {
        # end of file
        return ($line,$ref);
    }

    # Do as few treatments as possible with the .de, .ie and .if sections
    if ($line =~ /^\.\s*(if|ie|de)/) {
        chomp $line;
        return ($line,$ref);
    }

    # Handle some escapes
    #   * reduce the number of \ in macros
    if ($line =~ /^\\?[.']/) {
        # The first backslash is consumed while the macro is read.
        $line =~ s/\\\\/\\/g;
    }
    #   * \\ is equivalent to \e, which is less error prone for the rest
    #     of the module (e.g. when searching for a font : \f, we don't
    #     want to match \\f)
    $line =~ s/\\\\/\\e/g;
    #   * \. is just a dot (this can even be used to introduce a macro)
    $line =~ s/\\\././g;

    chomp $line;
    if ($line =~ m/^(.*?)(?:(?<!\\)\\(["#])(.*))$/) {
        my ($l, $t, $c) = ($1, $2, $3);
        $line = $l;
        unless ($allow_generated) {
            # Check for comments indicating that the file was generated.
            if ($c =~ /Pod::Man/) {
                warn wrap_ref_mod($ref, "po4a::man", dgettext("po4a", "This file was generated with Pod::Man. Translate the POD file with the pod module of po4a."));
                exit 254;
            } elsif ($c =~ /generated by help2man/) {
                warn wrap_ref_mod($ref, "po4a::man", dgettext("po4a", "This file was generated with help2man. Translate the source file with the regular gettext."));
            } elsif ($c =~ /with docbook-to-man/) {
                warn wrap_ref_mod($ref, "po4a::man", dgettext("po4a", "This file was generated with docbook-to-man. Translate the source file with the sgml module of po4a."));
                exit 254;
            } elsif ($c =~ /generated by docbook2man/) {
                warn wrap_ref_mod($ref, "po4a::man", dgettext("po4a", "This file was generated with docbook2man. Translate the source file with the sgml module of po4a."));
                exit 254;
            } elsif ($c =~ /created with latex2man/) {
                warn wrap_ref_mod($ref, "po4a::man", dgettext("po4a",
                    "This file was generated with %s. ".
                    "You should translate the source file, but continuing anyway."
                    ),"latex2man");
            } elsif ($c =~ /Generated by db2man.xsl/) {
                warn wrap_ref_mod($ref, "po4a::man", dgettext("po4a","This file was generated with db2man.xsl. Translate the source file with the xml module of po4a."));
                exit 254;
            } elsif ($c =~ /generated automatically by mtex2man/) {
                warn wrap_ref_mod($ref, "po4a::man", dgettext("po4a",
                    "This file was generated with %s. ".
                    "You should translate the source file, but continuing anyway."
                    ),"mtex2man");
            } elsif ($c =~ /THIS FILE HAS BEEN AUTOMATICALLY GENERATED. DO NOT EDIT./ ||
                     $c =~ /DO NOT EDIT/i || $c =~ /generated/i) {
                warn wrap_ref_mod($ref, "po4a::man", dgettext("po4a",
                    "This file contains the line '%s'. ".
                    "You should translate the source file, but continuing anyway."
                    ),$l."\\\"".$c);
            }
        }

        if ($line =~ m/^[.']*$/) {
            if ($c !~ m/^\s+$/) {
                # This commented line may be a comment for the next paragraph
                push @next_comments, $c;
            }
            if ($line =~ m/^[.']+$/) {
                # those lines are ignored
                # (empty lines are a little bit different)
                goto NEW_LINE;
            }
            if ($line =~ m/^\s*$/ and $t eq "#") {
                # Groff comments
                goto NEW_LINE;
            }
        } else {
            push @comments, $c;
        }
    } else {
        # finally, we did not reach the end of the paragraph. The comments
        # belong to the current paragraph.
        push @comments, @next_comments;
        @next_comments = ();
    }

    # A .I or .B request changes the current font
    # and on exit, switches the font to Roman.
    # When one of these requests doesn't have its argument on its line
    # (and when we support this usage), we must keep this font request to
    # insert it later.
    # It is a stack of fonts to be inserted (in case a .I is followed by
    # a .B and then followed by some text; note that in this case,
    # only one \fR must be inserted at the end of the text)
    my $insert_font = "";
    while ($line =~ /\\$/ || $line =~ /^(\.[BI])\s*$/) {
        my ($l2,$r2)=$self->SUPER::shiftline();
        chomp($l2);
        if ($line =~ /^(\.[BI])\s*$/) {
            if ($l2 =~ /^[.'][\t ]*([BI]|BI|BR|IB|IR|RB|RI)(?:[\t ]|\s*$)/) {
                my $font = $line;
                $font =~ s/^\.([BI])\s*$/$1/;
                $insert_font = "\\f$font$insert_font";
                $line = $l2;
                $ref = $r2;
            } elsif ($l2 =~ /^[.'][\t ]*(SH|TP|TQ|P|PP|LP)(?:[\t ]|\s*$)/) {
                $line =~ s/^\.([BI])\s*$/$insert_font\\f$1/;
                $self->SUPER::unshiftline($l2,$r2);
            } elsif ($l2 =~ /^([.'][\t ]*(?:IP)[\t ]+"?)(.*)$/) {
                # Install the font modifier into the next line
                # after a possible quote (")
                my $macro = $1;
                my $arg = $2;
                $line =~ /^\.([BI])\s*$/;
                $line = $macro."$insert_font\\f$1".$arg;
                $ref = $r2;
            } elsif ($l2 =~ /^[.']/) {
                warn wrap_ref_mod($ref, "po4a::man", dgettext("po4a",
                    "Font modifiers followed by a command may disturb ".
                    "po4a. You should either remove the font modifier ".
                    "'%s', or integrate a \\f font modifier in the ".
                    "following command ('%s'), but continuing anyway."
                    ), $line, $l2);
                $line = "PO4A-INLINE:$line:PO4A-INLINE";
                $self->SUPER::unshiftline($l2,$r2);
            } else {
                # convert " to the groff's double quote glyph; it will be
                # converted back to " in pre_trans. It is needed because
                # otherwise, these quotes will be taken as arguments
                # delimiters.
                $l2 =~ s/"/\\(dq/g;
                # append this line to the macro, with surrounding quotes, so
                # that the line appears as a unique argument.
                $line .= ' "'.$l2.'"';
            }
        } else {
            $line =~ s/\\$//;
            $line .= $l2;
        }
    }
    # Detect non-wrapped paragraphs
    # This must be done before handling the .B, .RI ... font requests
    $line =~ s/^($FONT_RE)(\s+)/$2$1/;

    $line .= "\n";

    # Handle font requests here
    if ($line =~ /^[.'][\t ]*([BI]|BI|BR|IB|IR|RB|RI)(?:(?: +|\t)(.*)|)$/) {
        my $macro = $1;
        my $arguments = $2;
        my @args = splitargs($ref,$arguments);
        if ($macro eq 'B' || $macro eq 'I') {
            # To keep the space(s), we must introduce some \&
            @args = map { $_ =~ s/^(\s*)$/\\&$1\\&/s; $_ } @args;
            my $arg=join(" ",@args);
            $arg =~ s/^ +//;
            this_macro_needs_args($macro,$ref,$arg);
            $line = "$insert_font\\f$macro".$arg."\\fR\n";
            $insert_font = "";
        }
        # .BI bold alternating with italic
        # .BR bold/roman
        # .IB italic/bold
        # .IR italic/roman
        # .RB roman/bold
        # .RI roman/italic
        if ($macro eq 'BI' || $macro eq 'BR' || $macro eq 'IB' ||
            $macro eq 'IR' || $macro eq 'RB' || $macro eq 'RI' ) {
            # num of seen args, first letter of macro name, second one
            my ($i,$a,$b)=(0,substr($macro,0,1),substr($macro,1));
            $line = join("", map { $i++ % 2 ?
                                    "\\f$b$_" :
                                    "\\f$a$_"
                                 } @args)."\\fR\n";
            if ($i eq 0) {
                # If a .BI is used without argument, we must insert a
                # \fI\fR. The \fR was inserted previously.
                $line = "\\f$b$line";
            }
        }

        if (length $insert_font) {
            $line =~ s/\n$//;
            $line = "$insert_font$line\\fR\n";
        }

        if ($line =~ /^(.*)\\c(\\f.)?\s*\\fR\n/) {
            my $begin = $1;

            my ($l2,$r2)=$self->SUPER::shiftline();
            if ($l2 =~ /^[.']/) {
                $self->SUPER::unshiftline($l2,$r2);
            } else {
                $l2 =~ s/\s*$//s;
                $line = "$begin\\fR$l2\n";
            }
        }
    }

    return ($line,$ref);
}

# Overload Transtractor's pushline.
# This pushline first pushes comments (if there are comments for the
# current line, and the line is not empty), and then pushes the line.
sub pushline {
    my ($self, $line) = (shift, shift);
    if ($line !~ m/^\s*$/) {
        # add comments
        foreach my $c (@comments) {
            # comments are pushed (maybe at the wrong place).
            $self->SUPER::pushline($self->r(".\\\"$c\n"));
        }
        @comments = ();
    }

    $self->SUPER::pushline($line);
}

# The default unshiftline from Transtractor may fail because shiftline
# is overloaded
sub unshiftline {
    die wrap_mod("po4a::man", dgettext("po4a",
        "The unshiftline is not supported for the man module. ".
        "Please send a bug report with the groff page that generated ".
        "this error."));
}

###############################################
#### FUNCTION TO TRANSLATE OR NOT THE TEXT ####
###############################################
sub pushmacro {
    my $self=shift;
    if (scalar @_) {
        # Do quote the arguments containing spaces, as it should.

        # but do not do so if they already contain quotes and escaped spaces
        # For example, cdrdao(1) uses:
        # .IP CATALOG\ "ddddddddddddd" (Here, the quote have to be displayed)
        # Adding extra quotes as in:
        # .IP "CATALOG\ "ddddddddddddd""
        # results in two args: 'CATALOG\ ' and 'ddddddddddddd""'
        $self->pushline(join(" ",map {
            # Replace double quotes by \(dq (double quotes could be
            # taken as an argument delimiter).
            # Only quotes not preceded by \ are taken into account
            # (\" introduces a comment).
            s/(?<!\\)"/\\\(dq/g if (defined $_);

            defined $_ ? (
                length($_)?
                    (m/([^\\] |^ )/ ? "\"$_\"" : "$_")
                    # Quote arguments that contain a space.
                    # (not needed for non-breaking spaces, i.e.
                    # spaces preceded by '\')
                    :'""' # empty argument
            ) : '' # no argument
        } @_)."\n");
    } else {
        $self->pushline("\n");
    }
}

sub this_macro_needs_args {
    my ($macroname,$ref,$args)=@_;
    unless (length($args)) {
        die wrap_ref_mod($ref, "po4a::man", dgettext("po4a",
            "macro %s called without arguments. ".
            "Even if placing the macro arguments on the next line is authorized ".
            "by man(7), handling this would make the po4a parser too complicate. ".
            "Please simply put the macro args on the same line."
            ), $macroname);
    }
}

899sub pre_trans {
900 my ($self,$str,$ref,$type)=@_;
901 # Preformatting, so that translators don't see
902 # strange chars
903 my $origstr=$str;
904 print STDERR "pre_trans($str)="
905if ($debug{'pretrans'});
906
907 # Do as few treatments as possible with the .de, .ie and .if sections
908 if (defined $self->{type} && $self->{type} =~ m/^(ie|if|de)$/) {
909 return $str;
910 }
911
912 # Note: if you want to implement \c support, the gdb man page is your playground
913 if ( not defined $self->{type}) {
914 $str =~ s/(\G|^(?:.*?)\n|^) # Last position, or begin of a line
915 ([ \t]*[^.'][^\n]*(?<!\\)(?:\\\\)*) # the new line, which
916 \\c[ \t]*\n # ends by \c and followed by a line
917 (?![ \t]*[.'])/$1$2/sgx;# not followed by a command (.')
918 }
919 die wrap_ref_mod($ref, "po4a::man", dgettext("po4a","Escape sequence \\c encountered. This is not completely handled yet."))
920if ($str =~ /\\c/);
921
922 $str =~ s/>/E<gt>/sg;
923 $str =~ s/</E<lt>/sg;
924 $str =~ s/EE<lt>gt>/E<gt>/g; # could be done in a smarter way?
925
926 while ($str =~ m/^(.*)PO4A-INLINE:(.*?):PO4A-INLINE(.*)$/s) {
927 my ($t1,$t2, $t3) = ($1, $2, $3);
928 $str = "$1E<$2>";
929 if ($mdoc_mode) {
930 # When a punctuation sign must be joined to an argument, mdoc
931 # permits to use such a construct:
932 # .Ar file1 , file2 , file3 ) .
933 # Here, we move the punctuation out of the E<...> tag.
934 # This is reverted in post_trans.
935 # FIXME: To be checked with the French punctuation
936 while ($str =~ m/(?<!\\) +([.,;:\)\]]) *>/s) {
937 $str =~ s/(?<!\\) +([.,;:\)\]]) *>/>$1/s;
938 }
939 }
940 if (defined $t3 and length $t3) {
941 $t3 =~ s/^\n//s;
942 $str .= "\n$t3";
943 }
944 }
945
946 # simplify the fonts for the translators
947 if (defined $self->{type} && $self->{type} =~ m/^(SH|SS)$/) {
948 set_regular("B");
949 }
950 $str = do_fonts($str, $ref);
951 if (defined $self->{type} && $self->{type} =~ m/^(SH|SS)$/) {
952 set_regular("R");
953 }
954
955 # After the simplification, the first char can be a \n.
956 # Simply push these newlines before the translation, but make sure the
957 # resulting string is not empty (or an additional line will be
958 # added).
959 if ($str =~ /^(\n+)(.+)$/s) {
960 $self->pushline($1);
961 $str = $2;
962 }
963
964 unless ($mdoc_mode) {
        # Kill minus sign/hyphen difference.
        # Aesthetics of printed man pages may suffer, but:
        #  * they are translator-unfriendly
        #  * they break when using utf8 (for obscure reasons)
        #  * they prevent searches, since keyboards don't have a hyphen key
        #  * they prevent copy/paste, since options need a minus sign, not a hyphen
971 $str =~ s|\\-|-|sg;
972
973 # Groff bestiary
974 $str =~ s/\\\*\(lq/``/sg;
975 $str =~ s/\\\*\(rq/''/sg;
976 $str =~ s/\\\(dq/"/sg;
977 }
978
979 # non-breaking spaces
980 # some non-breaking spaces may have been added during the parsing
981 $str =~ s/\Q$nbs/\\ /sg;
982
983 print STDERR "$str\n" if ($debug{'pretrans'});
984 return $str;
985}
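
=pod

A minimal sketch of the angle-bracket escaping done near the top of
pre_trans, runnable on its own. The helper name escape_angle_brackets is an
assumption made for this illustration and does not exist in the module; only
the three substitutions and their order are taken from the code above. It
shows why the third substitution is needed: the second one also rewrites the
bracket that the first one has just introduced.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Locale::Po4a::Man): applies the same
# three substitutions as pre_trans, in the same order.
sub escape_angle_brackets {
    my ($str) = @_;
    $str =~ s/>/E<gt>/sg;          # every '>' becomes E<gt>
    $str =~ s/</E<lt>/sg;          # every '<' becomes E<lt>, including the
                                   # one just introduced inside E<gt>
    $str =~ s/EE<lt>gt>/E<gt>/g;   # repair the E<gt> damaged by step two
    return $str;
}

print escape_angle_brackets("a < b > c"), "\n";
```

=cut
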
986
987sub post_trans {
988 my ($self,$str,$ref,$type,$wrap)=@_;
989 my $transstr=$str;
990
991 print STDERR "post_trans($str)="
992if ($debug{'postrans'});
993
994 # Do as few treatments as possible with the .de, .ie and .if sections
995 if (defined $self->{type} && $self->{type} =~ m/^(ie|if|de)$/) {
996 return $str;
997 }
998
999 unless ($mdoc_mode) {
        # Post formatting, so that groff sees the strange chars
        $str =~ s|\\-|-|sg; # in case the translator added some of them manually
        # change hyphens to minus signs
        # (this shouldn't be done for \s-<number> font size modifiers
        # nor on .so/.mso args)
1005 unless (defined $self->{type} && $self->{type} =~ m/^m?so$/) {
1006 my $tmp = "";
1007 while ($str =~ m/^(.*?)-(.*)$/s) {
1008 my $begin = $1;
1009 $str = $2;
1010 my $tmp2 = $tmp.$begin;
1011 if ( ($begin =~ m/(?<!\\)(\\\\)*\\s$/s)
1012 or ($begin =~ m/(?<!\\)(\\\\)*\\\((.|E<[gl]t>)?$/s)
1013 or ($tmp2 =~ m/(?<!\\)(\\\\)*\\[ZHhCv]'([^']|(?<!\\)(\\\\)*\\')*$/)
1014 or ($tmp2 =~ m/(?<!\\)(\\\\)*\\(\*)?\[([^\]]|(?<!\\)(\\\\)*\\\[)*$/)
1015 or ($tmp2 =~ m/(?<!\\)(\\\\)*\\\*\(.?$/)) {
1016 # Do not change - to \- for
1017 # * \s-n (reduce font size)
1018 # * \(.. (e.g. '<-', '-D')
1019 # * inside a \h'...'
1020 # * inside a \C'...'
1021 # * inside a \[...]
1022 # * inside a \*(..
1023 # * inside a \*[...]
1024 # * inside a \v'...'
1025 # * inside a \H'...'
1026 # * inside a \Z'...'
1027 $tmp = $tmp2."-";
1028 } else {
1029 $tmp = $tmp2."\\-";
1030 }
1031 }
1032 $str = $tmp.$str;
1033 }
1034 }
1035
1036 # There must not be an end of line inside an inline macro
1037 $str =~ s/(E<\.[^>]*)\n([^>]*>)/$1 $2/gs;
1038
1039 # No . or ' on first char, or nroff will think it's a macro
1040 # * at the beginning of a paragraph, add \& (zero width space) at
1041 # the beginning of the line
1042 if (not defined $self->{type}) {
        # Only do it on regular text, because
        # this doesn't work after a TS (this macro shifts
        # lines, which may contain macros)
        # or for the .ta arguments (e.g. .ta .5i 3i)
1047 $str =~ s/^((?:
1048 (?:CW|[RBI])<
1049 |$FONT_RE
1050 )?
1051 [.']
1052 )/\\&$1/mgx;
1053 } elsif ($self->{type} =~ m/^(TP|TQ)$/) {
        # But it is also needed for some types (e.g. TP, if followed by a
        # font macro)
        # This regular expression is the same as above
1057 $str =~ s/^((?:(?:CW|[RBI])<|$FONT_RE)?[.'])/\\&$1/mg;
1058 }
1059 # * degraded mode, doesn't work for the first line of a paragraph
1060 $str =~ s/\n([.'])/ $1/mg;
1061
1062 # Change ascii non-breaking space to groff one
1063 my $nbs_out = get_out_nbs($self->get_out_charset);
1064 $str =~ s/\Q$nbs_out/\\ /sg if defined $nbs_out;
    # No nbsp (written "\ " in groff) on the last position of the line, or
    # groff adds an extra space
1067 $str =~ s/\\ \n(?=.)/\\ /sg;
1068
1069 # Make sure we compute internal sequences right.
1070 # think about: B<AZE E<lt> EZA E<gt>>
1071 while ($str =~ m/^(.*)(CW|[RBI])<(.*)$/s) {
1072my ($done,$rest)=($1."\\f$2",$3);
1073$done =~ s/CW$/\(CW/;
1074my $lvl=1;
1075while (length $rest && $lvl > 0) {
1076 my $first=substr($rest,0,1);
1077 if ($first eq '<') {
1078$lvl++;
1079 } elsif ($first eq '>') {
1080$lvl--;
1081 }
1082 $done .= $first if ($lvl > 0);
1083 $rest=substr($rest,1);
1084}
1085die wrap_ref_mod($ref||$self->{ref}, "po4a::man", dgettext("po4a","Unbalanced '<' and '>' in font modifier. Faulty message: %s"),$str)
1086 if ($lvl > 0);
1087# Return to the regular font
1088$done .= "\\fP$rest";
1089$str=$done;
1090 }
1091
1092 while ($str =~ m/^(.*?)E<([.'][\t ]*.*?(?<!E<[gl]t))>(.*)$/s) {
1093 my ($t1, $t2, $t3) = ($1,$2,$3);
1094 $t1 =~ s/ +$//s;
1095 $t2 =~ s/\n/ /gs;
1096 if ($mdoc_mode) {
1097 # restore the punctuation inside the line (see pre_trans)
1098 if ($t3 =~ s/^([.,;:\)\]]+)//s) {
1099 my $punctuation = $1;
1100 $punctuation =~ s/([.,;:\)\]])/$1 /;
1101 $t2 .= " $punctuation";
1102 }
1103 }
1104 $t3 =~ s/^ +//s;
1105 if ($wrap) {
1106 # The no-wrap case should be checked
1107 $t1 =~ s/\n$//s;
1108 }
1109 $str = $t1;
1110 if (length $t1) {
1111 $t1 =~ s/\n$//s;
1112 $str = "$t1\n";
1113 }
1114 $str .= $t2;
1115 if (defined $t3 and length $t3) {
1116 $t3 =~ s/^\n//s;
1117 $str.= "\n$t3";
1118 }
1119 }
1120 my $str2 = $str;
1121 $str2 =~ s/E<[gl]t>//g;
1122 die wrap_ref_mod($ref||$self->{ref}, "po4a::man",
1123 dgettext("po4a","Unknown '<' or '>' sequence. ".
1124 "Faulty message: %s"),$str)
1125 if $str2 =~ /[<>]/;
1126 $str =~ s/E<gt>/>/mg;
1127 $str =~ s/E<lt>/</mg;
    # Don't do that, because we'll get into trouble if the previous line was .TP
1129 # $str =~ s/^\\f([BI])(.*?)\\f[RP]$/\.$1 $2/mg;
1130
1131 unless ($mdoc_mode) {
1132 my $tmp = "";
1133 while ($str =~ m/^(.*?)(``|'')(.*)$/s) {
1134 $tmp .= $1;
1135 my $q = $2;
1136 $str = $3;
1137 # There are probably many more exceptions, here are those I could
1138 # detect in my manpages.
1139 # \*(.' \*(.`
1140 # \*' \*`
1141 # \N'xxx'
1142 if ($tmp =~ m/(?<!\\)(?:\\\\)*\\\*\($/s) {
1143 $tmp .= $q;
1144 } elsif ( $tmp =~ m/(?<!\\)(?:\\\\)*\\\*\(.$/s
1145 or $tmp =~ m/(?<!\\)(?:\\\\)*\\\*$/s
1146 or ($tmp =~ m/(?<!\\)(?:\\\\)*\\N'[0-9]*$/s
1147 and $q eq "''")) {
1148 $q =~ m/(.)(.)/;
1149 $tmp .= $1;
1150 $str = $2.$str;
1151 } else {
1152 $q =~ s/``/\\\*\(lq/;
1153 $q =~ s/''/\\\*\(rq/;
1154 $tmp .= $q;
1155 }
1156 }
1157 $str = $tmp.$str;
1158 }
1159 if (not defined $self->{type}) {
1160 $str =~ s/(?<!\\) $//mg;
1161 }
1162
1163 print STDERR "$str\n" if ($debug{'postrans'});
1164 return $str;
1165}
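
=pod

The hyphen handling in post_trans can be hard to follow because of the
number of exceptions. The following is a reduced sketch, not the module's
code: the helper name hyphens_to_minus is hypothetical, and only the
font size exception (a hyphen right after a backslash and the letter s) is
kept. All the other exceptions listed in post_trans are deliberately left
out.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Locale::Po4a::Man). Only the \s-N
# exception is kept; the other exceptions of post_trans are left out.
sub hyphens_to_minus {
    my ($str) = @_;
    my $out = "";
    while ($str =~ m/^(.*?)-(.*)$/s) {
        my ($begin, $rest) = ($1, $2);
        $str = $rest;
        # An odd number of backslashes followed by "s" just before the
        # hyphen means a font size change such as \s-1: keep it as is.
        if ($begin =~ m/(?<!\\)(?:\\\\)*\\s$/s) {
            $out .= $begin . "-";
        } else {
            $out .= $begin . "\\-";
        }
    }
    return $out . $str;
}

print hyphens_to_minus('ls -l --color'), "\n";
print hyphens_to_minus('\s-1SMALL\s0 caps'), "\n";
```

Consuming the string one hyphen at a time, as post_trans does, keeps the
context check simple: only the text before the current hyphen has to be
inspected.

=cut
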
1166sub translate {
1167 my ($self,$str,$ref,$type) = (shift, shift, shift,shift);
1168 my (%options)=@_;
1169 my $origstr=$str;
1170
1171 return $str unless (defined $str) && length($str);
1172 return $str if ($str eq "\n");
1173 # Do not translate the strings that only consist of fonts, spaces and
1174 # \&. This is useful because we introduced \& in shiftline.
1175 if ($str =~ m/^($FONT_RE|\s|\\&)*$/s) {
1176 do_fonts($str, $ref||$self->{ref});
1177 return $str;
1178 }
1179
1180 # If a string is quoted, only translate the argument between the
1181 # quotes.
1182 if ($options{'wrap'} or $str !~ m/\n/s) {
1183 if ($str =~ m/^\"(.*)\"$/s and $1 !~ m/(?<!\\)\"/) {
1184 $str = '"'.$self->translate($1, $ref, $type, %options).'"';
1185 $str =~ s/\n"$/"\n/s;
1186 return $str;
1187 }
1188 }
1189
1190 $str=pre_trans($self,$str,$ref||$self->{ref},$type);
1191 $options{'comment'} .= join('\n', @comments);
1192 # Translate this
1193 $str = $self->SUPER::translate($str,
1194 $ref||$self->{ref},
1195 $type || $self->{type},
1196 %options);
1197 if ($options{'wrap'}) {
1198my (@paragraph);
1199@paragraph=split (/\n/,$str);
1200if (defined ($paragraph[0]) && $paragraph[0] eq '') {
1201 shift @paragraph;
1202}
1203$str = join("\n",@paragraph)."\n";
1204 }
1205 $str=post_trans($self,$str,$ref||$self->{ref},$type, $options{'wrap'});
1206 return $str;
1207}
1208
1209# shortcut
1210sub t {
1211 return $_[0]->translate($_[1]);
1212}
1213
1214# shortcut.
1215# As a rule of thumb, I do not recode macro names, unless they may be
1216# followed by other characters.
1217sub r {
1218 my $self = shift;
1219 my $str = shift;
1220
1221 # non-breaking spaces
1222 # some non-breaking spaces may have been added during the parsing
1223 $str =~ s/\Q$nbs/\\ /sg;
1224
1225 return $self->recode_skipped_text($str);
1226}
1227
1228
1229sub do_paragraph {
1230 my ($self,$paragraph,$wrapped_mode) = (shift,shift,shift);
1231
1232 # Following needed because of 'ft' (at least, see ft macro below)
1233 unless ($paragraph =~ m/\n$/s) {
1234my @paragraph = split(/\n/,$paragraph);
1235
1236$paragraph .= "\n"
1237 unless scalar (@paragraph) == 1;
1238 }
1239
1240 $self->pushline( $self->translate($paragraph,$self->{ref},"Plain text",
1241 "wrap" => ($wrapped_mode eq 'YES') ) );
1242}
1243
1244#############################
1245#### MAIN PARSE FUNCTION ####
1246#############################
1247sub parse{
1248 my $self = shift;
1249 my ($line,$ref);
1250 my ($paragraph)=""; # Buffer where we put the paragraph while building
    my $wrapped_mode='YES'; # Should we wrap the paragraph? Three possible values:
                            # YES: do wrap
                            # NO: don't wrap because this paragraph contains indented lines;
                            #     this status disappears after the end of the paragraph
                            # MACRONO: don't wrap because we saw the nf macro. It stays so
                            #          until the next fi macro.
1257
1258
1259 # We want to change the non-breaking space according to the input
1260 # document charset
1261 $nbs = get_in_nbs($self->{TT}{'file_in_charset'});
1262
1263 LINE:
1264 undef $self->{type};
1265 ($line,$ref)=$self->shiftline();
1266
1267 while (defined($line)) {
1268#print STDERR "line=$line;ref=$ref";
1269chomp($line);
1270$self->{ref}="$ref";
1271#print STDERR "LINE=$line<<\n";
1272
1273
1274if ($line =~ /^[.']/) {
1275 die wrap_mod("po4a::man", dgettext("po4a", "Unparsable line: %s"), $line)
1276unless ($line =~ /^([.']+\\*?)(\\["#])(.*)/ ||
1277$line =~ /^([.'])(\S*)(.*)/);
1278 my $arg1=$1;
1279 $arg1 .= $2;
1280 my $macro=$2;
1281 my $arguments=$3;
1282
1283 if ($inline{$macro}) {
1284$paragraph .= "PO4A-INLINE:".$line.":PO4A-INLINE\n";
1285goto LINE;
1286 }
1287
1288 # Split on spaces for arguments, but not spaces within double quotes
1289 my @args=();
1290 push @args,$arg1;
1291 if ($macro =~ /^(?:ta|TP|ie|if|de)$/) {
1292# The number of spaces may be critical for the 'ta' macro,
1293# and there is no need to split the arguments.
1294push @args, $arguments;
1295 } else {
1296push @args, splitargs($ref,$arguments);
1297 }
1298
1299
1300 if (length($paragraph)) {
1301do_paragraph($self,$paragraph,$wrapped_mode);
1302$paragraph="";
1303$wrapped_mode = $wrapped_mode eq 'NO' ? 'YES' : $wrapped_mode;
1304 }
1305
1306 # Special case: Don't change these lines
1307 # .\" => comments
1308 # .\# => comments
1309 # ." => comments
1310 # . => empty point on the line
1311 # .tr abcd...
1312 # => substitution like Perl's tr/ac/bd/ on output.
1313 if ($macro eq '\\"' || $macro eq '' || $macro eq 'tr' ||
1314 $macro eq '"' || $macro eq '\\#') {
1315$self->pushline($self->r($line)."\n");
1316goto LINE;
1317 }
1318 # Special case:
1319 # .nf => stop wrapped mode
1320 # .fi => wrap again
1321 if ($no_wrap_begin{$macro} or $no_wrap_end{$macro}) {
1322if ($no_wrap_end{$macro}) {
1323 $wrapped_mode='YES';
1324} else {
1325 $wrapped_mode='MACRONO';
1326}
1327$self->pushline($self->r($line)."\n");
1328goto LINE;
1329 }
1330
1331 # SH resets the wrapping (in addition to starting a section)
1332 if ($macro eq 'SH') {
1333$wrapped_mode='YES';
1334 }
1335
1336 unshift @args,$self;
1337 # Apply macro
1338 $self->{type}=$macro;
1339
1340 if (defined ($macro{$macro})) {
1341&{$macro{$macro}}(@args);
1342 } else {
1343if (defined $unknown_macros) {
1344 &{$unknown_macros}(@args);
1345} else {
1346$self->pushline($self->r($line)."\n");
1347die wrap_ref_mod($ref, "po4a::man", dgettext("po4a",
1348 "Unknown macro '%s'. Remove it from the document, or refer to the Locale::Po4a::Man manpage to see how po4a can handle new macros."), $line);
1349}
1350 }
1351
1352} elsif ($line =~ /^ +[^. ]/) {
1353 # (Lines containing only spaces are handled as empty lines)
1354 # Not a macro, but not a wrapped paragraph either
1355 $wrapped_mode = $wrapped_mode eq 'YES' ? 'NO' : $wrapped_mode;
1356 $paragraph .= $line."\n";
1357} elsif ($line =~ /^[^.].*/ && $line !~ /^ *$/) {
            # (Lines containing only spaces are handled later as empty lines)
1359 if ($line =~ /^\\"/) {
1360# special case: the line is entirely a comment, keep the
1361# comment.
1362# NOTE: comment could also be found in the middle of a line.
1363# From info groff:
1364# Escape: \": Start a comment. Everything to the end of the
1365# input line is ignored.
1366$self->pushline($self->r($line)."\n");
1367goto LINE;
1368 } elsif ($line =~ /^\\#/) {
1369# Special groff comment. Do not keep the new line
1370goto LINE;
1371 } else {
1372# Not a macro
1373# * first, try to handle some "output line continuation" (\c)
1374$paragraph =~ s/\\c *(($FONT_RE)?)\n?$/$1/s;
1375# * append the line to the current paragraph
1376$paragraph .= $line."\n";
1377 }
1378} else { #empty line, or line containing only spaces
1379 if (length($paragraph)) {
1380 do_paragraph($self,$paragraph,$wrapped_mode);
1381 $paragraph="";
1382 }
1383 $wrapped_mode = $wrapped_mode eq 'NO' ? 'YES' : $wrapped_mode;
1384 $self->pushline($line."\n");
1385}
1386
1387# finally, we did not reach the end of the paragraph. The comments
1388# belong to the current paragraph.
1389push @comments, @next_comments;
1390@next_comments = ();
1391
1392# Reinit the loop
1393($line,$ref)=$self->shiftline();
1394undef $self->{type};
1395 }
1396
1397 if (length($paragraph)) {
1398do_paragraph($self,$paragraph,$wrapped_mode);
1399$wrapped_mode = $wrapped_mode eq 'NO' ? 'YES' : $wrapped_mode;
1400$paragraph="";
1401 }
1402
1403 # flush the last comments
1404 push @comments, @next_comments;
1405 @next_comments = @comments;
1406 @comments = ();
1407 for my $c (@next_comments) {
1408$self->pushline($self->r(".\\\"$c\n"));
1409 }
1410
1411 # reinitialize the module
1412 @next_comments = ();
1413 set_regular("R");
    # Repeated calls leave both the current and the previous font set to "R".
    set_font("R");
    set_font("R");
1416 $mdoc_mode = 0;
1417} # end of main
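
=pod

The parse loop above sorts every input line into one of four branches. A
small sketch of that classification follows, assuming a hypothetical helper
named classify_line that is not part of the module. The regular expressions
are the ones used in the loop; the comment handling inside the text branch
and the macro dispatch are left out.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Locale::Po4a::Man): returns the branch of
# the parse loop that a line would take.
sub classify_line {
    my ($line) = @_;
    return 'macro'    if $line =~ /^[.']/;          # request or macro call
    return 'indented' if $line =~ /^ +[^. ]/;       # disables wrapping
    return 'text'     if $line =~ /^[^.].*/ && $line !~ /^ *$/;
    return 'empty';                                 # ends the paragraph
}

for my $line ('.SH NAME', '  indented example', 'plain text', '   ', '') {
    printf "%-22s => %s\n", "'$line'", classify_line($line);
}
```

=cut
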
1418
1419# Cache the results of get_in_nbs and get_out_nbs
1420{
1421 my $last_in_charset;
1422 my $last_in_nbs;
1423
    # get_in_nbs(charset)
    # Return the representation of a non-breaking space in the input charset
    # (given as argument),
    # or PO4A:VERY_IMPROBABLE_STRING_USEDFOR_NON-BREAKING-SPACES if this
    # character doesn't exist in this charset.
    sub get_in_nbs {
1430 my $charset = shift;
1431
1432 return $last_in_nbs
1433 if ( defined $charset
1434 and defined $last_in_charset
1435 and $charset eq $last_in_charset);
1436
1437 my $nbs = "\xA0";
1438 my $length;
1439 if (defined $charset and length $charset)
1440 {
1441 eval ("\$length = Encode::from_to(\$nbs, \"latin-1\",
1442 \$charset,
1443 1)");
1444 }
1445 # fall back solution
1446 $nbs = "PO4A:VERY_IMPROBABLE_STRING_USEDFOR_NON-BREAKING-SPACES"
1447 unless defined $length;
1448 $last_in_charset = $charset;
1449 $last_in_nbs = $nbs;
1450
1451 return $last_in_nbs;
1452 }
1453
1454 my $last_out_charset;
1455 my $last_out_nbs;
    # get_out_nbs(charset)
    # Return the representation of a non-breaking space in the output charset
    # (given as argument),
    # or undef if this character doesn't exist in this charset.
    sub get_out_nbs {
1461 my $charset = shift;
1462
1463 return $last_out_nbs
1464 if ( defined $charset
1465 and defined $last_out_charset
1466 and $charset eq $last_out_charset);
1467
1468 my $nbs = "\xA0";
1469 my $length;
1470 if (defined $charset and length $charset)
1471 {
1472 eval ("\$length = Encode::from_to(\$nbs, \"latin-1\",
1473 \$charset,
1474 1)");
1475 }
1476 # fall back solution
1477 undef $nbs
1478 unless defined $length;
1479 $last_out_charset = $charset;
1480 $last_out_nbs = $nbs;
1481
1482 return $last_out_nbs;
1483 }
1484
1485}
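
=pod

Both subs above probe whether a charset can represent the non-breaking
space by converting it from latin-1 and checking whether the conversion
succeeds. A sketch of that probe, assuming a hypothetical helper named
nbs_for_charset and leaving out the one-entry cache: it uses an eval block
instead of the string eval of the module, which is a substitution made for
this illustration only.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of Locale::Po4a::Man): returns the bytes of a
# non-breaking space in $charset, or undef when the charset cannot encode it.
sub nbs_for_charset {
    my ($charset) = @_;
    return undef unless defined $charset and length $charset;

    my $nbs = "\xA0";    # non-breaking space in latin-1
    # from_to converts $nbs in place; with a true last argument it dies on
    # unmappable characters, hence the eval.
    my $length = eval { Encode::from_to($nbs, "latin-1", $charset, 1) };
    return defined $length ? $nbs : undef;
}

for my $charset ("UTF-8", "ISO-8859-15", "ascii") {
    my $nbs = nbs_for_charset($charset);
    printf "%-12s %s\n", $charset,
        defined $nbs ? join(" ", map { sprintf "%02X", ord } split //, $nbs)
                     : "(none)";
}
```

=cut
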
1486
# We can't push the header in the first line of the document, as in the
# other modules, because the first line may contain indications on how the
# man page must be processed.
1490sub docheader {
1491 return "";
1492}
1493
# The header is pushed just before the .TH macro (this macro is mandatory
# and must be specified at the beginning; there may be macro definitions
# before it).
1497sub push_docheader {
1498 my $self = shift;
1499 $self->pushline(
1500".\\\"*******************************************************************\n".
1501".\\\"\n".
1502".\\\" This file was generated with po4a. Translate the source file.\n".
1503".\\\"\n".
1504".\\\"*******************************************************************\n"
1505 );
1506}
1507
1508# Split request's arguments.
1509# see:
1510# info groff --index-search "Request Arguments"
1511sub splitargs {
1512 my ($ref,$arguments) = ($_[0],$_[1]);
1513 my @args=();
1514 my $buffer="";
1515 my $escaped=0;
1516 if (! defined $arguments) {
1517 return @args;
1518 }
    # Change non-breaking spaces first, to ensure that split does what we want.
    # We change them back before pushing into the arguments. The ones which
    # will be translated will have the same change again (in pre_trans and
    # post_trans), but the ones which won't get translated are not changed
    # anymore. Let's play safe.
1524 $arguments =~ s/\\ /$nbs/g;
1525 $arguments =~ s/^ +//;
1526 $arguments =~ s/\\&"/\\(dq/g;
1527 $arguments =~ s/^ *//;
1528 while (length $arguments) {
1529 if ($arguments =~ s/^"((?:[^"]|"")*)"(?!") *//) {
1530 my $a = $1;
1531 $a =~ s/""/"/g if defined $a;
1532 push @args,$a;
1533 } elsif ($arguments =~ s/^"((?:[^"]|"")*) *$//) {
1534 # Unterminated quote, but this seems to be handled by removing
1535 # the trailing spaces and closing the quotes.
1536 my $a = $1;
1537 $a =~ s/""/"/g if defined $a;
1538 push @args,$a;
1539 } elsif ($arguments =~ s/^([^ ]+) *//) {
1540 push @args,$1;
1541 } else {
1542 die wrap_ref_mod($ref, "po4a::man", dgettext("po4a",
1543 "Cannot parse command arguments: %s"),
1544 $arguments)
1545 }
1546 }
    if ($debug{'splitargs'}) {
        print STDERR "ARGS=";
        print STDERR "$_^" for @args;
        print STDERR "\n";
    }
1552
1553 return @args;
1554}
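
=pod

The quoting rules applied by splitargs are easier to see on a reduced
version. The sketch below assumes a hypothetical helper named
split_roff_args, which is not part of the module: it keeps the three
matching branches of splitargs but drops the non-breaking space handling,
the debug output and the po4a error wrappers.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Locale::Po4a::Man): a reduced splitargs
# without the non-breaking space handling and with a plain die.
sub split_roff_args {
    my ($arguments) = @_;
    my @args;
    return @args unless defined $arguments;
    $arguments =~ s/^ +//;
    while (length $arguments) {
        if ($arguments =~ s/^"((?:[^"]|"")*)"(?!") *//) {
            # Quoted argument; a doubled quote stands for a literal quote.
            (my $arg = $1) =~ s/""/"/g;
            push @args, $arg;
        } elsif ($arguments =~ s/^"((?:[^"]|"")*) *$//) {
            # Unterminated quote: take the rest of the line.
            (my $arg = $1) =~ s/""/"/g;
            push @args, $arg;
        } elsif ($arguments =~ s/^([^ ]+) *//) {
            push @args, $1;    # bare word
        } else {
            die "Cannot parse command arguments: $arguments\n";
        }
    }
    return @args;
}

print join("|", split_roff_args('LS "1" "October 2002" "ls (coreutils) 4.5.2"')), "\n";
```

=cut
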
1555
1556{
1557 #static variables
1558 # font stack.
1559 # Keep track of the current font (because a font modifier can
1560 # stay open at the end of a paragraph), and the previous font (to
1561 # handle \fP)
1562 my $current_font = "R";
1563 my $previous_font = "R";
    # $regular_font describes the "Regular" font, which is the font used
    # when there is no font modifier.
    # For example, .SS uses a Bold font, and thus in
    #     .SS This is a \fRsubsection\fB header
    # the \fR and \fB font modifiers have to be kept.
1569 my $regular_font = "R";
1570
1571 # Set the regular font
1572 # It takes the regular font in argument (when no argument is provided,
1573 # it uses "R").
1574 sub set_regular {
1575 print STDERR "set_regular('@_')\n"
1576 if ($debug{'fonts'});
1577 set_font(@_);
1578 $regular_font = $current_font;
1579 }
1580
1581 sub set_font {
1582 print STDERR "set_font('@_')\n"
1583 if ($debug{'fonts'});
1584 my $saved_previous = $previous_font;
1585 $previous_font = $current_font;
1586
1587 if (! defined $_[0]) {
1588 $current_font = "R";
1589 } elsif ($_[0] =~ /^(P|\[\]|\[P\])/) {
1590 $current_font = $saved_previous;
1591 } elsif (length($_[0]) == 1) {
1592 $current_font = $_[0];
1593 } elsif (length($_[0]) == 2) {
1594 $current_font = "($_[0]";
1595 } else {
1596 $current_font = "[$_[0]]";
1597 }
1598 print STDERR "r:'$regular_font', p:'$previous_font', c:'$current_font'\n"
1599 if ($debug{'fonts'});
1600 }
1601
1602 sub do_fonts {
        # two arguments: a string and its reference (used in error messages)
1604 my ($str, $ref) = (shift, shift);
1605 print STDERR "do_fonts('$str', '$ref')="
1606 if ($debug{'fonts'});
1607
1608 # restore the font stack
1609 $str = "\\f$previous_font\\f$current_font".$str;
        # In order to be able to split on /\\f/ without problems with
        # \\foo, groff backslashes (\\) are changed to the (equivalent)
        # form \e (this should be done in shiftline).
1613 my @array1=split(/\\f/, $str);
1614
1615 $str = shift @array1; # The first element is always empty because
1616 # the $current_font was put at the beginning
        # $last_font indicates the last font that was appended to the buffer.
        # It differs from $current_font because consecutive identical fonts
        # are not written in the buffer.
1620 my $last_font=$regular_font;
1621
1622 foreach my $elem (@array1) {
1623 # Do not touch the fonts in the inline macros
1624 # These inline macros may have their argument in bold or italic,
1625 # we can't know.
1626 if ($str =~ m/E<\.([^>]|E<gt>|E<lt>)*$/s) {
1627 # We can't use \\f here, otherwise the font simplifier regexp
1628 # will use the fonts of the inline macros.
1629 $str .= "PO4A-FAKE-FONT".$elem;
1630 next;
1631 }
1632
1633 # Replace \fP by the exact font (because some font modifiers will
1634 # be removed or added, which will break groff's font stack)
1635 $elem =~ s/^(P|\[\]|\[P\])/$previous_font/s;
1636 # change \f1 to \fR, etc.
1637 # Those fonts are defined in the DESC file, which
1638 # may depend on the groff device.
1639 # fonts 1 to 4 are usually mapped to R, I, B, BI
1640 # TODO: use an array for the font positions. This
1641 # array should be updated by .fp requests.
1642 $elem =~ s/^1/R/;
1643 $elem =~ s/^2/I/;
1644 $elem =~ s/^3/B/;
1645 $elem =~ s/^4/(BI/;
1646
1647 if ($elem =~ /^([1-4]|B|I|R|\(..|\[[^]]*\]|L)(.*)$/s) {
                # Each element should now start with a recognized font modifier
1649 my $new_font = $1;
1650 my $arg = $2;
1651 # Update the font stack
1652 $previous_font = $current_font;
1653 $current_font = $new_font;
1654
1655 if ($new_font eq $last_font) {
1656 # continue with the same font.
1657 $str.=$arg;
1658 } else {
1659 # A new font is used, update $last_font
1660 $last_font = $new_font;
1661 $str .= "\\f".$elem;
1662 }
1663 } else {
1664 die wrap_ref_mod($ref,
1665 "po4a::man",
1666 dgettext("po4a",
1667 "Unsupported font in: '%s'."),
1668 "\\f".$elem);
1669 }
1670 }
1671 # Do some simplification (they don't change the font stack)
1672 # Remove empty font modifiers at the end
1673 $str =~ s/($FONT_RE)*$//s;
1674
1675 # close any font modifier
1676 if ($str =~ /.*($FONT_RE)(.*?)$/s && $1 ne "\\f$regular_font") {
1677 $str =~ s/(\n?)$/\\f$regular_font$1/;
1678 }
1679
1680 # remove fonts with empty argument
1681 while ($str =~ /($FONT_RE){2}/) {
1682 # while $str has two consecutive font modifiers
1683 # only keep the second one.
1684 $str =~ s/($FONT_RE)($FONT_RE)/$2/s;
1685 }
1686
1687 # when there are two consecutive switches to the regular font,
1688 # remove the last one.
1689 while ($str =~ /^(.*)\\f$regular_font # anything followed by a
1690 # regular font
1691 ((?:\\(?!f)|[^\\])*) # the text concerned by
1692 # this font (i.e. without any
1693 # font modifier, i.e. it
1694 # contains no '\' followed by
1695 # an 'f')
1696 \\f$regular_font # another regular font
1697 (.*)$/sx) {
1698 $str = "$1\\f$regular_font$2$3";
1699 }
1700
        # the regular font modifier at the beginning of the string is not
        # needed (the do_fonts subroutine ensures that every paragraph ends
        # with the regular font).
1704 if ($str =~ /^(.*?)\\f$regular_font(.*)$/s && $1 !~ /$FONT_RE/) {
1705 $str = "$1$2";
1706 }
1707
1708 # Use special markup for common fonts, so that translators don't see
1709 # groff's font modifiers
1710 my $PO_FONTS = "B|I|R|\\(CW";
1711 # remove the regular font from this list
1712 $PO_FONTS =~ s/^$regular_font\|//;
1713 $PO_FONTS =~ s/\|$regular_font\|/|/;
1714 $PO_FONTS =~ s/\|$regular_font$//;
1715 while ($str =~ /^(.*?) # $1: anything (non greedy: as
1716 # few as possible)
1717 \\f($PO_FONTS) # followed by a common font
1718 # modifier ($2)
1719 ((?:\\[^f]|[^\\])*) # $3: the text concerned by
1720 # this font (i.e. without any
1721 # font modifier, i.e. it
1722 # contains no '\' followed by
1723 # an 'f')
1724 \\f # the next font modifier
1725 (.*)$/sx) { # $4: anything up to the end
1726 my ($begin, $font, $arg, $end) = ($1,$2,$3,$4);
1727 if ($end =~ /^$regular_font(.*)$/s) {
1728 # no need to add a switch to $regular_font
1729 $str = $begin."$font<$arg>$1";
1730 } else {
1731 $str = $begin."$font<$arg>\\f$end";
1732 }
1733 }
1734 $str =~ s/\(CW</CW</sg;
1735 $str =~ s/PO4A-FAKE-FONT/\\f/sg;
1736
1737 print STDERR "'$str'\n" if ($debug{'fonts'});
1738 return $str;
1739 }
1740}
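
=pod

The font bookkeeping above relies on two variables only: the current font
and the previous one, so that a switch back to the previous font can be
resolved. A stand-alone sketch of set_font follows. The name set_font_sketch
and the returned pair are assumptions made for this illustration; the
branches mirror the ones of set_font.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-alone copy of the two state variables.
    my $current_font  = "R";
    my $previous_font = "R";

    # Hypothetical helper mirroring set_font; returns the new
    # (previous, current) pair so that the effect can be observed.
    sub set_font_sketch {
        my ($font) = @_;
        my $saved_previous = $previous_font;
        $previous_font = $current_font;

        if (!defined $font) {
            $current_font = "R";
        } elsif ($font =~ /^(P|\[\]|\[P\])/) {
            $current_font = $saved_previous;    # \fP: back to the previous font
        } elsif (length($font) == 1) {
            $current_font = $font;              # \fB
        } elsif (length($font) == 2) {
            $current_font = "($font";           # \f(CW
        } else {
            $current_font = "[$font]";          # \f[name]
        }
        return ($previous_font, $current_font);
    }
}

for my $font ("B", "I", "P", "CW") {
    my ($previous, $current) = set_font_sketch($font);
    print "after $font: previous=$previous current=$current\n";
}
```

Because only one previous font is remembered, two switches back in a row
alternate between the last two fonts instead of unwinding a deeper history.

=cut
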
1741
1742##########################################
1743#### DEFINITION OF THE MACROS WE KNOW ####
1744##########################################
# Each sub is passed self as first arg,
# plus the args present on the roff line
# i.e., <<.TH LS "1" "October 2002" "ls (coreutils) 4.5.2" "User Commands">>
# is passed (".TH","LS","1","October 2002","ls (coreutils) 4.5.2","User Commands")
# Macro name is also passed, because .B (bold) will be encoded in pod format (and mangled).
# They should return a list, which will be join'ed(' ',..)
# or undef when they don't want to add anything
1752
1753# Some well known macro handling
1754
# For macros taking only one argument, where people may forget the quotes.
# Example: >>.SH Another Section<< which should be >>.SH "Another Section"<<
sub translate_joined {
    my ($self,$macroname,$macroarg)=(shift,shift,join(" ",@_));
    #section#      .S[HS] name

    $self->pushmacro($macroname,
                     $self->t($macroarg));
}

# For macros taking several arguments, which have to be translated separately
sub translate_each {
    my ($self,$first)= (shift,0);
    $self->pushmacro( map { $first++ ?$self->t($_):$_ } @_);
}

# For macros which shouldn't be given any arg
sub noarg {
    my $self = shift;
    warn "Macro $_[0] does not accept any argument\n"
        if (defined ($_[1]));
    $self->pushmacro(@_);
}

# For macros whose arguments shouldn't be translated
sub untranslated {
    my ($self,$first)= (shift,0);
    $self->pushmacro( map { $first++ ?$self->r($_):$_ } @_);
}
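
=pod

translate_each and untranslated share one idiom: a counter inside map lets
the first element (the macro name) through unchanged and processes all the
others. A sketch of that idiom on its own, assuming a hypothetical helper
named map_all_but_first, with uc standing in for the translation step:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Locale::Po4a::Man): applies $code to every
# element except the first one, which stands for the macro name.
sub map_all_but_first {
    my ($code, @items) = @_;
    my $first = 0;
    return map { $first++ ? $code->($_) : $_ } @items;
}

# uc stands in for the translation step.
print join(" ", map_all_but_first(sub { uc $_[0] }, ".IX", "index", "term")), "\n";
```

=cut
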
1784
1785###
1786### man 7 man
1787###
1788
1789$macro{'TH'}= sub {
1790 my $self=shift;
1791 my ($th,$title,$section,$date,$source,$manual)=@_;
1792 #Preamble#.TH title section date source manual
1793# print STDERR "TH=$th;titre=$title;sec=$section;date=$date;source=$source;manual=$manual\n";
1794
1795 # Reset the memories
1796 $self->push_docheader();
1797
1798 $self->pushmacro($th,
1799 $self->t($title),
1800 $section,
1801 $self->t($date),
1802 $self->t($source),
1803 $self->t($manual));
1804};
1805
1806# .SS t Subheading t (like .SH, but used for a subsection inside a section).
1807$macro{'SS'}=$macro{'SH'}=sub {
1808 if (!defined $_[2]) {
1809 # The argument is on the next line.
1810 my ($self,$macroname) = (shift,shift);
1811 my ($l2,$ref2) = $self->shiftline();
1812 if ($l2 =~/^\./) {
1813 $self->SUPER::unshiftline($l2,$ref2);
1814 } else {
1815 chomp($l2);
1816 $self->pushmacro($macroname,
1817 $self->t($l2));
1818 }
1819 return;
1820 } else {
1821 return translate_joined(@_);
1822 }
1823};
1824
1825# Macro: .SM [text]
1826# Set the text on the same line or the text on the next line in a
1827# font that is one point size smaller than the default font.
1828# FIXME: Maybe we should find a better way to represent this (inline is
1829# not really nice in the PO).
1830$inline{'SM'}=1;
1831
1832# .SP n Skip n lines (I think)
1833$macro{'SP'}=\&untranslated;
1834
1835#Normal Paragraphs
1836# .LP Same as .PP (begin a new paragraph).
1837# .P Same as .PP (begin a new paragraph).
1838# .PP Begin a new paragraph and reset prevailing indent.
1839#Relative Margin Indent
1840# .RS i Start relative margin indent - moves the left margin i to the right
1841# As a result, all following paragraph(s) will be indented until
1842# the corresponding .RE.
1843# .RE End relative margin indent.
1844$macro{'LP'}=$macro{'P'}=$macro{'PP'}=sub {
1845 noarg(@_);
1846
1847 # From info groff:
1848 # The font size and shape are reset to the default value (10pt roman if no
1849 # `-rS' option is given on the command line).
1850 set_font("R");
1851};
1852$macro{'RE'}=\&noarg;
1853$macro{'RS'}=\&untranslated;
1854
1855sub parse_tp_tq {
1856 my $self=shift;
1857 my ($line,$l2,$ref2);
1858 $line .= $_[0] if defined($_[0]);
1859 $line .= ' '.$_[1] if defined($_[1]);
1860 $self->pushline($self->r($line)."\n");
1861
1862 ($l2,$ref2) = $self->shiftline();
1863 chomp($l2);
1864 while ($l2 =~ /^\.PD/) {
1865$self->pushline($self->r($l2)."\n");
1866($l2,$ref2) = $self->shiftline();
1867chomp($l2);
1868 }
1869 if ($l2 =~/^([.'][\t ]*([^\t ]*))(?:([\t ]+)(.*)$|$)/) {
1870 if ($inline{$2}) {
1871 my $tmp = "";
1872 if (defined $4 and length $4) {
1873 $tmp = $3.$self->t($4, "wrap" => 0);
1874 }
1875 $self->pushline($1.$tmp."\n");
1876 } else {
            # If the line after a .TP is a macro,
            # let the parser do its job.
            # Note: use Transtractor unshiftline for now. The man module may
            # need its own implementation.
            # This may be a problem if, for example, the line resulted
            # from a line continuation.
1883 $self->SUPER::unshiftline($l2,$ref2);
1884 }
1885 } else {
1886$self->pushline($self->t($l2, "wrap" => 0)."\n");
1887 }
1888}
1889
1890#Indented Paragraph Macros
1891# .TP i Begin paragraph with hanging tag. The tag is given on the next line,
1892# but its results are like those of the .IP command.
1893$macro{'TP'}=sub {
1894 parse_tp_tq(@_);
1895
1896 # From info groff:
1897 # Note that neither font shape nor font size of the label [i.e. argument
1898 # or first line] is set to a default value; on the other hand, the rest of
1899 # the text has default font settings.
1900 set_font("R");
1901};
1902
1903# Indented Paragraph Macros
1904# .TQ Indicates continuation of the .TP labels that precede the indented
1905# paragraph.
1906$macro{'TQ'}=sub {
1907 warn "Macro $_[1] does not accept any argument\n"
1908if (defined ($_[2]));
1909
1910 parse_tp_tq(@_);
1911};
1912
1913# Indented Paragraph Macros
1914# .HP i Begin paragraph with a hanging indent (the first line of the paragraph
1915# is at the left margin of normal paragraphs, and the rest of the para-
1916# graph's lines are indented).
1917#
1918$macro{'HP'}=sub {
1919 untranslated(@_);
1920
1921 # From info groff:
1922 # Font size and face are reset to their default values.
1923 set_font("R");
1924};
1925
1926# Indented Paragraph Macros
1927# .IP [designator] [nnn]
1928# Sets up an indented paragraph, using designator as a tag to mark
1929# its beginning. The indentation is set to nnn if that argument is
1930# supplied (default unit is `n'), otherwise the default indentation
1931# value is used. Font size and face of the paragraph (but not the
1932# designator) are reset to its default values. To start an indented
1933# paragraph with a particular indentation but without a designator,
1934# use `""' (two doublequotes) as the second argument.
1935
# Note that the above is the groff_man(7) version, which of course differs radically
# from man(7). In one case, the designator is optional and the nnn is not, and the
# contrary in the other. This implies that when sticking to groff_man(7), we should
# mark a single argument as translatable.
1940
1941$macro{'IP'}=sub {
1942 my $self=shift;
1943 if (defined $_[2]) {
1944$self->pushmacro($_[0],$self->t($_[1]),$_[2]);
1945 } elsif (defined $_[1]) {
1946$self->pushmacro($_[0],$self->t($_[1]));
1947 } else {
1948$self->pushmacro(@_);
1949 }
1950
1951 # From info groff:
1952 # Font size and face of the paragraph (but not the designator) are reset
1953 # to their default values.
1954 set_font("R");
1955};
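
# Illustration only (hypothetical input lines, not taken from a real page):
# how the IP handler above treats its arguments.
#   .IP             no argument: the macro is pushed unchanged
#   .IP "\(bu"      one argument: it is passed to the translation
#   .IP "\(bu" 4    two arguments: the designator is passed to the
#                   translation, the indentation (4) is kept as is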

# Hypertext Link Macros
# .UR u  Begins a hypertext link to the URI (URL) u; it will end with
#        the corresponding UE command. When generating HTML this should
#        translate into the HTML command <A HREF="u">.
#        There is an exception: if u is the special value ":", then no
#        hypertext link of any kind will be generated until after the
#        closing UE (this permits disabling hypertext links in
#        phrases like LALR(1) when linking is not appropriate).
# .UE    Ends the corresponding UR command; when generating HTML this
#        should translate into </A>.
# .UN u  Creates a named hypertext location named u; do not include a
#        corresponding UE command.
#        When generating HTML this should translate into the HTML command
#        <A NAME="u" id="u">&nbsp;</A>
$macro{'UR'}=sub {
    return untranslated(@_)
        if (defined($_[2]) && $_[2] eq ':');
    return translate_joined(@_);
};
$macro{'UE'}=\&noarg;
$macro{'UN'}=\&translate_joined;

# Miscellaneous Macros
# .DT    Reset tabs to default tab values (every 0.5 inches); does not
#        cause a break.
# .PD d  Set inter-paragraph vertical distance to d (if omitted, d=0.4v);
#        does not cause a break.
$macro{'DT'}=\&noarg;
$macro{'PD'}=\&untranslated;

# Indexing term (printed on standard error).
# (ms macros)
$macro{'IX'}=\&translate_each;

###
### groff macros
###
# .br
$macro{'br'}=\&noarg;
# .bp N     Eject current page and begin new page.
$macro{'bp'}=\&untranslated;
# .ad       Begin line adjustment for output lines in current adjust mode.
# .ad c     Start line adjustment in mode c (c=l,r,b,n).
$macro{'ad'}=\&untranslated;
# .de macro Define or redefine macro until .. is encountered.
$macro{'de'}=sub {
    my $self = shift;
    if ($groff_code ne "fail") {
        my $paragraph = "@_";
        my $end = ".";
        # An optional second argument names the macro that ends the
        # definition; trailing blanks after it are optional.
        if ($paragraph=~/^[.'][\t ]*de[\t ]+([^\t ]+)[\t ]+([^\t ]+)[\t ]*$/) {
            $end = $2;
        }
        my ($line, $ref) = $self->SUPER::shiftline();
        chomp $line;
        $paragraph .= "\n".$line;
        while (defined($line) and $line ne ".$end") {
            ($line, $ref) = $self->SUPER::shiftline();
            if (defined $line) {
                chomp $line;
                $paragraph .= "\n".$line;
            }
        }
        $paragraph .= "\n";
        if ($groff_code eq "verbatim") {
            $self->pushline( $self->r($paragraph) );
        } else {
            $self->pushline( $self->translate($paragraph,
                                              $self->{ref},
                                              "groff code",
                                              "wrap" => 0) );
        }
    } else {
        die wrap_ref_mod($self->{ref}, "po4a::man", dgettext("po4a", "This page defines a new macro with '.de'. Since po4a is not a real groff parser, this is not supported."));
    }
};
# .ds stringvar anything
#           Set stringvar to anything.
$macro{'ds'}=sub {
    my ($self, $m) = (shift,shift);
    my $name = shift;
    my $string = "@_";
    # indicate to which variable this corresponds. The translator can
    # find references to this string in the translation "\*(name" or
    # "\*[name]"
    $self->{type} = "ds $name";
    $self->pushline($m." ".$self->r($name)." ".$self->translate($string)."\n");
};
# .fam      Return to previous font family.
# .fam name Set the current font family to name.
$macro{'fam'}=\&untranslated;
# .fc a b   Set field delimiter to a and pad character to b.
$macro{'fc'}=\&untranslated;
# .ft font  Change to font name or number font;
$macro{'ft'}=sub {
    if (defined $_[2]) {
        set_font($_[2]);
    } else {
        set_font("P");
    }
};
# .hc c     Set up additional hyphenation indicator character c.
$macro{'hc'}=\&untranslated;
# .hy       Enable hyphenation (see nh)
# .hy N     Switch to hyphenation mode N.
# .hym n    Set the hyphenation margin to n (default scaling indicator m).
# .hys n    Set the hyphenation space to n.
$macro{'hy'}=$macro{'hym'}=$macro{'hys'}=\&untranslated;

# .ie cond anything  If cond then anything else goto .el.
# .if cond anything  If cond then anything; otherwise do nothing.
$macro{'ie'}=$macro{'if'}=sub {
    my $self = shift;
    if ($groff_code ne "fail") {
        my $m = $_[0];
        my $paragraph = "@_";
        my ($line,$ref);
        my $count = 0;
        $count = 1 if ($paragraph =~ m/(?<!\\)\\\{/s);
        while (   ($paragraph =~ m/(?<!\\)\\$/s)
               or ($count > 0)) {
            ($line,$ref)=$self->SUPER::shiftline();
            chomp $line;
            $paragraph .= "\n".$line;
            $count += 1 if ($line =~ m/(?<!\\)\\\{/s);
            $count -= 1 if ($line =~ m/(?<!\\)\\\}/s);
        }
        if ($m eq '.ie') {
            # The .el line may be preceded by comments
            ($line,$ref)=$self->SUPER::shiftline();
            chomp $line;
            while ($line =~ m/^[.']\\"/) {
                $paragraph .= "\n".$line;
                ($line,$ref)=$self->SUPER::shiftline();
                chomp $line;
            }

            if ($line !~ m/^[.'][ \t]*el(\s|\\\{)/) {
                die wrap_ref_mod($self->{ref}, "po4a::man", dgettext("po4a",
                    "The .ie macro must be followed by a .el macro."));
            }
            my $paragraph2 = $line;
            $count = 0;
            $count = 1 if ($line =~ m/(?<!\\)\\\{/s);
            while (   ($paragraph2 =~ m/(?<!\\)\\$/s)
                   or ($count > 0)) {
                ($line,$ref)=$self->SUPER::shiftline();
                chomp $line;
                $paragraph2 .= "\n".$line;
                $count += 1 if ($line =~ m/(?<!\\)\\\{/s);
                $count -= 1 if ($line =~ m/(?<!\\)\\\}/s);
            }
            $paragraph .= "\n".$paragraph2;
        }
        $paragraph .= "\n";
        if ($groff_code eq "verbatim") {
            $self->pushline( $self->r($paragraph) );
        } else {
            $self->pushline( $self->translate($paragraph,
                                              $self->{ref},
                                              "groff code",
                                              "wrap" => 0) );
        }
    } else {
        die wrap_ref_mod($self->{ref}, "po4a::man", dgettext("po4a",
            "This page uses conditionals with '%s'. Since po4a is not a real groff parser, this is not supported."), $_[0]);
    }
};
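
# Illustration only (hypothetical input): how the ie/if handler above
# gathers a brace-delimited conditional when $groff_code is not "fail".
#   .if n \{\
#   .ti +4
#   some text
#   .\}
# The first line contains an unescaped \{, so $count starts at 1 and lines
# keep being read until the line containing \} brings it back to 0. The four
# lines are then pushed as one unit: verbatim if $groff_code is "verbatim",
# otherwise as a single non-wrapped "groff code" string to translate.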
# .in N     Change indent according to N (default scaling indicator m).
$macro{'in'}=\&untranslated;

# .ig end   Ignore text until .end.
$macro{'ig'}=sub {
    my $self = shift;
    $self->pushmacro(@_);
    my ($name,$end) = (shift,shift||'');
    $end='' if ($end =~ m/^\\\"/);
    my ($line,$ref)=$self->shiftline();
    while (defined($line)) {
        $self->pushline($self->r($line));
        last if ($line =~ /^\.$end\./);
        ($line,$ref)=$self->shiftline();
    }
};


# .lf N file  Set input line number to N and filename to file.
$macro{'lf'}=\&untranslated;
# .ll N       Set line length according to N
$macro{'ll'}=\&untranslated;

# .nh         disable hyphenation (see hy)
$macro{'nh'}=\&untranslated;
# .na         No Adjusting (see ad)
$macro{'na'}=\&untranslated;
# .ne N       Need N vertical space
$macro{'ne'}=\&untranslated;
# .nr register N M
#             Define or modify register
$macro{'nr'}=\&untranslated;
# .ps N       Point size; same as \s[N]
$macro{'ps'}=\&untranslated;
# .so filename Include source file.
# .mso groff variant of .so (other search path)
$macro{'so'}= $macro{'mso'} = sub {
    warn wrap_mod("po4a::man", dgettext("po4a",
        "This page includes another file with '%s'. Do not forget to translate this file ('%s')."), $_[1], $_[2]);
    my $self = shift;
    $self->pushmacro(@_);
};
# .sp         Skip one line vertically.
# .sp N       Space vertical distance N
$macro{'sp'}=\&untranslated;
# .vs [space]
# .vs +space
# .vs -space
#             Change (increase, decrease) the vertical spacing by SPACE. The
#             default scaling indicator is `p'.
$macro{'vs'}=\&untranslated;
# .ta T N     Set tabs after every position that is a multiple of N.
# .ta n1 n2 ... nn T r1 r2 ... rn
#             Set tabs at positions n1, n2, ..., nn, [...]
$macro{'ta'}=sub {
    # In some cases, a ta request can contain a translatable argument.
    # FIXME: detect those cases (something like 5i does not need to be
    #        translated)
    my ($self,$m)=(shift,shift);
    my $line = "@_";
    $line =~ s/^ +//;
    $self->pushline($m." ".$self->translate($line,$self->{ref},'ta')."\n");
};
# .ti +N      Temporary indent next line (default scaling indicator m).
$macro{'ti'}=\&untranslated;


###
### tbl macros
###
$macro{'TS'}=sub {
    my $self=shift;
    my ($in_headers,$buffer)=(1,"");
    my ($line,$ref)=$self->shiftline();

    # Push table start
    $self->pushmacro(@_);
    while (defined($line)) {
        if ($line =~ /^\.TE/) {
            # Table end
            $self->pushline($self->r($line));
            return;
        }
        if ($in_headers) {
            if ($line =~ /\.$/) {
                $in_headers = 0;
            }
            $self->pushline($self->r($line));
        } elsif ($line =~ /\\$/) {
            # Lines are continued on \ at the end of line
            $buffer .= $line;
        } else {
            $buffer .= $line;
            # Arguments to translate are separated by \t
            $self->pushline(join("\t",
                                 map { $self->translate($buffer,
                                                        $ref,
                                                        'tbl table')
                                     } split (/\\t/,$line)));
            $buffer = "";
        }
        ($line,$ref)=$self->shiftline();
    }
};

###
### info groff
###

## Builtin registers; of course they do not need to be translated

$macro{'F'}=$macro{'H'}=$macro{'V'}=$macro{'A'}=$macro{'T'}=\&untranslated;

## ms package
##
#
# Displays and keeps. None of these macros accepts a translatable argument
# (they create blocks of text that cannot be broken across pages)

$macro{'DS'}=$macro{'LD'}=$macro{'DE'}=\&untranslated;
$macro{'ID'}=$macro{'BD'}=$macro{'CD'}=\&untranslated;
$macro{'RD'}=$macro{'KS'}=$macro{'KE'}=\&untranslated;
$macro{'KF'}=$macro{'B1'}=$macro{'B2'}=\&untranslated;
$macro{'DA'}=\&translate_joined;

# .pc c  Change page number character
$macro{'pc'}=\&translate_joined;

# .ns    Disable .sp and such
# .rs    Enable them again
$macro{'ns'}=$macro{'rs'}=\&untranslated;

# .cs font [width [em-size]]
# Switch to and from "constant glyph space mode".
$macro{'cs'}=\&untranslated;

# .ss word_space_size [sentence_space_size]
# Change the minimum size of a space between filled words.
$macro{'ss'}=\&untranslated;

# .ce     Center one line horizontally
# .ce N   Center N lines
# .ul N   Underline N lines (but not the spaces)
# .cu N   Underline N lines (even the spaces)
$macro{'ce'}=$macro{'ul'}=$macro{'cu'}=sub {
    my $self=shift;
    if (defined $_[1]) {
        if ($_[1] <= 0) {
            # disable centering, underlining, ...
            $self->pushmacro($_[0]);
        } else {
            # None of these is handled yet because the number of lines may
            # change during the translation
            die wrap_mod("po4a::man", dgettext("po4a",
                "This page uses the '%s' request with the number of lines in argument. This is not supported yet."), $_[0]);
        }
    } else {
        $self->pushmacro($_[0]);
    }
};

# .ec [c]
# Set the escape character to C. With no argument the default
# escape character `\' is restored. It can be also used to
# re-enable the escape mechanism after an `eo' request.
$macro{'ec'}=sub {
    my $self=shift;
    if (defined $_[1]) {
        die wrap_mod("po4a::man", dgettext("po4a",
            "This page uses the '%s' request. This request is only supported when no argument is provided."), $_[0]);
    } else {
        $self->pushmacro($_[0]);
    }
};


###
### BSD compatibility macros: .AT and .UC
### (define the version of Berkeley used)
### FIXME: the header ("3rd Berkeley Distribution" or such) declared
###        by this macro isn't translatable; we may want to remove
###        this from the generated manpage, and declare our own header
###
$macro{'UC'}=$macro{'AT'}=\&untranslated;

# Request: .hw word1 word2 ...
# Define how WORD1, WORD2, etc. are to be hyphenated. The words
# must be given with hyphens at the hyphenation points.
#
# If the English page needs to specify how a word must be hyphenated, the
# translated page may also have this need.
$macro{'hw'}=\&translate_each;


#############################################################################
#
# mdoc macros
#
# The macros are defined in mdoc(7) and groff_mdoc(7)
#
# TBC: Should the font processing be disabled in the mdoc mode?
#############################################################################
# FIXME: Maybe we should verify that the page is an mdoc page
#        (add a flag in Dd, and always check that this flag is set in the
#        other mdoc macros)
sub translate_mdoc {
    my ($self,$macroname)=(shift,shift);
    my $macroarg = "";
    foreach (@_) {
        $macroarg.=" " if (length $macroarg);
        if ($_ =~ m/((?<!\\) |\t|^$)/) {
            $macroarg.="\"$_\"";
        } else {
            $macroarg.=$_;
        }
    }

    $self->pushline("$macroname ".$self->t($macroarg)."\n");
}
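
# Illustration only (hypothetical arguments): translate_mdoc above re-quotes
# an argument when it is empty or contains a tab or an unescaped space.
#   FOO, 1            ->  FOO 1
#   User Commands     ->  "User Commands"
#   a\ b              ->  a\ b    (escaped space, left unquoted)
#   (empty string)    ->  ""
# The rebuilt argument string is then passed to the translation as a whole.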
sub translate_mdoc_no_quotes {
    my ($self,$macroname, $macroarg)=(shift,shift, join(" ", @_));

    $self->pushline("$macroname ".$self->t($macroarg)."\n");
}
#
# Title Macros
# ============
# .Dd Month day, year   Document date.
$macro{'Dd'}=sub {
    my ($self,$macroname,$macroarg)=(shift,shift,join(" ",@_));

    $mdoc_mode = 1;
    $self->push_docheader();

# FIXME: It would be nice if we could switch from one set of macros to the
#        other.
#
#        This does not work at this time. If we erase the current set of macros,
#        po4a fails when a configuration file uses both mdoc and groff pages.
#
#    # Erase the current macro definitions
#    %macro=();
#    %inline=();
#    %no_wrap_begin=();
#    %no_wrap_end=();
    # Use the mdoc macros
    define_mdoc_macros();

    $self->translate_mdoc_no_quotes($macroname,$macroarg);
};

sub define_mdoc_macros {
    # .Dt DOCUMENT_TITLE [section] [volume]   Title, in upper case.
    $macro{'Dt'}=\&translate_mdoc;
    # .Os OPERATING_SYSTEM [version/release]  Operating system (BSD).
    $macro{'Os'}=\&translate_each;
    # Keep the quotes  e.g. finger.1
    # Don't add quotes e.g. logger.1

    # Page Layout Macros
    # ==================
    # .Sh Section Headers.
    # (man mdoc indicates only a limited set of valid headers,
    # but it should be OK to translate the header)
    $macro{'Sh'}= sub {
        my ($self,$macroname)=(shift,shift);
        my $macroarg = "";
        foreach (@_) {
            $macroarg.=" " if (length $macroarg);
            if ($_ =~ m/((?<!\\) |\t|^$)/) {
                $macroarg.="\"$_\"";
            } else {
                $macroarg.=$_;
            }
        }
        if ($mdoc{$macroarg}) {
            $self->pushline("$macroname ".$self->r($macroarg)."\n");
        } else {
            $self->pushline("$macroname ".$self->t($macroarg)."\n");
        }
    };
    # .Ss Subsection Headers.
    $macro{'Ss'}=\&translate_mdoc;
    # .Pp Paragraph Break. Vertical space (one line).
    $macro{'Pp'}=\&noarg;
    # .Lp Same as .Pp
    $macro{'Lp'}=\&noarg;
    # .D1 (D-one) Display-one Indent and display one text line.
    $macro{'D1'}=\&translate_mdoc;
    # .Dl (D-ell) Display-one literal.
    #     Indent and display one line of literal text
    $macro{'Dl'}=\&translate_mdoc;
    # .Bd Begin-display block.
    # FIXME: Note: there are some options, and some of the option arguments
    #        may be translatable (-file <name>, -offset <string>)
    $no_wrap_begin{'Bd'} = 1;
    # .Ed End-display (matches .Bd).
    $no_wrap_end{'Ed'} = 1;
    # .Bl Begin-list. Create lists or columns.
    # FIXME: As for .Bd, there are some options
    $macro{'Bl'}=\&untranslated;
    # .El End-list.
    $macro{'El'}=\&noarg;
    # .It List item.
    # FIXME: Maybe we could extract other modifiers
    #        as in .It Fl l Ar num
    $macro{'It'}=\&translate_mdoc;
    # .Lk html link
    $macro{'Lk'}=\&untranslated;

    # Manual Domain Macros
    # ====================
    # FIXME: I think most Manual and General text domain macros are in the
    #        inline category
    foreach (qw(Ad An Ar Cd Cm Dv Er Ev Fa Fd Fn Ic Li Nm Op Ot Pa St Va Vt Xr)) {
        $inline{$_} = 1;
    }
    # FIXME: some of these macros introduce a line in bold.
    #        Using \fP in these lines is not supported.
    #        do_fonts should be called for every inline line

    # General Text Domain
    # ===================
    foreach (qw(%A %B %C %D %I %J %N %O %P %Q %R %T %U %V
                Ac Ao Ap Aq At Bc Bf Bo Bq Brc Bro Brq Bx Db Dc Do Dq Ec Ef Em Eo Eq Fx No Ns
                Pc Pf Po Pq Qc Ql Qo Qq Re Rs Rv Sc So Sq Sm Sx Sy Tn Ux Xc Xo)) {
        $inline{$_} = 1;
    }

    # FIXME: Maybe it should be joined with the preceding .Nm
    $macro{'Nd'}=\&translate_mdoc;

    # Command line flags
    $inline{'Fl'} = 1;
    # Exit status
    $inline{'Ex'} = 1;
    # Opening option bracket
    $inline{'Oo'} = 1;
    # Closing option bracket
    $inline{'Oc'} = 1;
    # Begin keep (keep words in the same line)
    $inline{'Bk'} = 1;
    # End keep
    $inline{'Ek'} = 1;
    # Library Names
    $inline{'Lb'} = 1;
    # Function Types
    $inline{'Ft'} = 1;
    # Function open (for functions with many arguments)
    $inline{'Fo'} = 1;
    # Function close
    $inline{'Fc'} = 1;
    # OpenBSD macro
    $inline{'Ox'} = 1;
    # BSD/OS Macro
    $inline{'Bsx'} = 1;
    # #include statements
    $macro{'In'} = \&translate_mdoc;
    # NetBSD Macro
    $inline{'Nx'} = 1;
    # Curly brackets
    $inline{'Brq'} = 1;
    # Corporate name
    $inline{'%Q'} = 1;
    # Math symbol
    $inline{'Ms'} = 1;
    # Prints 'under development'
    $inline{'Ud'} = 1;

    # This macro is a groff macro. I don't know if it is valid in an mdoc page.
    # But this is used in some pages and seems to work
    $macro{'br'}=\&noarg;

} # end of define_mdoc_macros

Revision: 2380