perl section

NAME

perl - Practical Extraction and Report Language

SYNOPSIS

perl [ -sTuU ] [ -hv ] [ -V[:*configvar*] ]
    [ -cw ] [ -d[:*debugger*] ] [ -D[*number/list*] ]
    [ -pna ] [ -F*pattern* ] [ -l[*octal*] ] [ -0[*octal*] ]
    [ -I*dir* ] [ -m[-]*module* ] [ -M[-]*'module...'* ]
    [ -P ] [ -S ] [ -x[*dir*] ]
    [ -i[*extension*] ] [ -e *'command'* ] [ -- ] [ *programfile* ] [ *argument* ]...

For ease of access, the Perl manual has been split up into a number of sections:

    perl            Perl overview (this section)
    perldelta       Perl changes since previous version
    perl5004delta   Perl changes in version 5.004
    perlfaq         Perl frequently asked questions
    perltoc         Perl documentation table of contents

    perldata        Perl data structures
    perlsyn         Perl syntax
    perlop          Perl operators and precedence
    perlre          Perl regular expressions
    perlrun         Perl execution and options
    perlfunc        Perl builtin functions
    perlopentut     Perl open() tutorial
    perlvar         Perl predefined variables
    perlsub         Perl subroutines
    perlmod         Perl modules: how they work
    perlmodlib      Perl modules: how to write and use
    perlmodinstall  Perl modules: how to install from CPAN
    perlform        Perl formats
    perllocale      Perl locale support

    perlref         Perl references
    perlreftut      Perl references short introduction
    perldsc         Perl data structures intro
    perllol         Perl data structures: lists of lists
    perltoot        Perl OO tutorial
    perlobj         Perl objects
    perltie         Perl objects hidden behind simple variables
    perlbot         Perl OO tricks and examples
    perlipc         Perl interprocess communication
    perlthrtut      Perl threads tutorial

    perldebug       Perl debugging
    perldiag        Perl diagnostic messages
    perlsec         Perl security
    perltrap        Perl traps for the unwary
    perlport        Perl portability guide
    perlstyle       Perl style guide

    perlpod         Perl plain old documentation
    perlbook        Perl book information

    perlembed       Perl ways to embed perl in your C or C++ application
    perlapio        Perl internal IO abstraction interface
    perlxs          Perl XS application programming interface
    perlxstut       Perl XS tutorial
    perlguts        Perl internal functions for those doing extensions
    perlcall        Perl calling conventions from C

    perlhist        Perl history records

(If you're intending to read these straight through for the first time, the suggested order will tend to reduce the number of forward references.)

By default, all of the above manpages are installed in the /usr/local/man/ directory.

Extensive additional documentation for Perl modules is available. The default configuration for perl will place this additional documentation in the /usr/local/lib/perl5/man directory (or else in the man subdirectory of the Perl library directory). Some of this additional documentation is distributed standard with Perl, but you'll also find documentation for third-party modules there.

You should be able to view Perl's documentation with your man(1) program by including the proper directories in the appropriate start-up files, or in the MANPATH environment variable. To find out where the configuration has installed the manpages, type:

    perl -V:man.dir

If the directories have a common stem, such as /usr/local/man/man1 and /usr/local/man/man3, you need only to add that stem (/usr/local/man) to your man(1) configuration files or your MANPATH environment variable. If they do not share a stem, you'll have to add both stems.

If that doesn't work for some reason, you can still use the supplied perldoc script to view module information. You might also look into getting a replacement man program.

If something strange has gone wrong with your program and you're not sure where you should look for help, try the -w switch first. It will often point out exactly where the trouble is.

DESCRIPTION

Perl is a language optimized for scanning arbitrary text files, extracting information from those text files, and printing reports based on that information. It's also a good language for many system management tasks. The language is intended to be practical (easy to use, efficient, complete) rather than beautiful (tiny, elegant, minimal).
Perl combines (in the author's opinion, anyway) some of the best features of C, sed, awk, and sh, so people familiar with those languages should have little difficulty with it. (Language historians will also note some vestiges of csh, Pascal, and even BASIC-PLUS.) Expression syntax corresponds quite closely to C expression syntax. Unlike most Unix utilities, Perl does not arbitrarily limit the size of your data-- if you've got the memory, Perl can slurp in your whole file as a single string. Recursion is of unlimited depth. And the tables used by hashes (sometimes called "associative arrays") grow as necessary to prevent degraded performance. Perl can use sophisticated pattern matching techniques to scan large amounts of data very quickly. Although optimized for scanning text, Perl can also deal with binary data, and can make dbm files look like hashes. Setuid Perl scripts are safer than C programs through a dataflow tracing mechanism which prevents many stupid security holes. If you have a problem that would ordinarily use sed or awk or sh, but it exceeds their capabilities or must run a little faster, and you don't want to write the silly thing in C, then Perl may be for you. There are also translators to turn your sed and awk scripts into Perl scripts. But wait, there's more... Perl version 5 is nearly a complete rewrite, and provides the following additional benefits: * Many usability enhancements It is now possible to write much more readable Perl code (even within regular expressions). Formerly cryptic variable names can be replaced by mnemonic identifiers. Error messages are more informative, and the optional warnings will catch many of the mistakes a novice might make. This cannot be stressed enough. Whenever you get mysterious behavior, try the -w switch!!! Whenever you don't get mysterious behavior, try using -w anyway. * Simplified grammar The new yacc grammar is one half the size of the old one. Many of the arbitrary grammar rules have been regularized. 
The number of reserved words has been cut by 2/3. Despite this, nearly all old Perl scripts will continue to work unchanged. * Lexical scoping Perl variables may now be declared within a lexical scope, like "auto" variables in C. Not only is this more efficient, but it contributes to better privacy for "programming in the large". Anonymous subroutines exhibit deep binding of lexical variables (closures). * Arbitrarily nested data structures Any scalar value, including any array element, may now contain a reference to any other variable or subroutine. You can easily create anonymous variables and subroutines. Perl manages your reference counts for you. * Modularity and reusability The Perl library is now defined in terms of modules which can be easily shared among various packages. A package may choose to import all or a portion of a module's published interface. Pragmas (that is, compiler directives) are defined and used by the same mechanism. * Object-oriented programming A package can function as a class. Dynamic multiple inheritance and virtual methods are supported in a straightforward manner and with very little new syntax. Filehandles may now be treated as objects. * Embeddable and Extensible Perl may now be embedded easily in your C or C++ application, and can either call or be called by your routines through a documented interface. The XS preprocessor is provided to make it easy to glue your C or C++ routines into Perl. Dynamic loading of modules is supported, and Perl itself can be made into a dynamic library. * POSIX compliant A major new module is the POSIX module, which provides access to all available POSIX routines and definitions, via object classes where appropriate. * Package constructors and destructors The new BEGIN and END blocks provide means to capture control as a package is being compiled, and after the program exits. As a degenerate case they work just like awk's BEGIN and END when you use the -p or -n switches. 
* Multiple simultaneous DBM implementations A Perl program may now access DBM, NDBM, SDBM, GDBM, and Berkeley DB files from the same script simultaneously. In fact, the old dbmopen interface has been generalized to allow any variable to be tied to an object class which defines its access methods. * Subroutine definitions may now be autoloaded In fact, the AUTOLOAD mechanism also allows you to define any arbitrary semantics for undefined subroutine calls. It's not for just autoloading. * Regular expression enhancements You can now specify nongreedy quantifiers. You can now do grouping without creating a backreference. You can now write regular expressions with embedded whitespace and comments for readability. A consistent extensibility mechanism has been added that is upwardly compatible with all old regular expressions. * Innumerable Unbundled Modules The Comprehensive Perl Archive Network described in the perlmodlib manpage contains hundreds of plug-and-play modules full of reusable code. See http://www.perl.com/CPAN for a site near you. * Compilability While not yet in full production mode, a working perl-to-C compiler does exist. It can generate portable byte code, simple C, or optimized C code. Okay, that's *definitely* enough hype. AVAILABILITY Perl is available for the vast majority of operating system platforms, including most Unix-like platforms. The following situation is as of February 1999 and Perl 5.005_03. 
The following platforms are able to build Perl from the standard source code distribution available at http://www.perl.com/CPAN/src/index.html

    AIX             Linux           SCO ODT/OSR
    A/UX            MachTen         Solaris
    BeOS            MPE/iX          SunOS
    BSD/OS          NetBSD          SVR4
    DG/UX           NextSTEP        Tru64 UNIX      3)
    DomainOS        OpenBSD         Ultrix
    DOS DJGPP 1)    OpenSTEP        UNICOS
    DYNIX/ptx       OS/2            VMS
    FreeBSD         OS390     2)    VOS
    HP-UX           PowerMAX        Windows 3.1     1)
    Hurd            QNX             Windows 95      1) 4)
    IRIX                            Windows 98      1) 4)
                                    Windows NT      1) 4)

    1) in DOS mode either the DOS or OS/2 ports can be used
    2) formerly known as MVS
    3) formerly known as Digital UNIX and before that DEC OSF/1
    4) compilers: Borland, Cygwin32, Mingw32 EGCS/GCC, VC++

The following platforms have been known to build Perl from the source but for the Perl release 5.005_03 we haven't been able to verify them, either because the hardware/software platforms are rather rare or because we don't have an active champion on these platforms, or both.

    3b1             FPS             Plan 9
    AmigaOS         GENIX           PowerUX
    ConvexOS        Greenhills      RISC/os
    CX/UX           ISC             Stellar
    DC/OSx          MachTen 68k     SVR2
    DDE SMES        MiNT            TI1500
    DOS EMX         MPC             TitanOS
    Dynix           NEWS-OS         UNICOS/mk
    EP/IX           Opus            Unisys Dynix
    ESIX                            Unixware

The following platforms are planned to be supported in the standard source code distribution of the Perl release 5.006 but are not supported in the Perl release 5.005_03:

    BS2000
    Netware
    Rhapsody
    VM/ESA

The following platforms have their own source code distributions and binaries available via http://www.perl.com/CPAN/ports/index.html.

                        Perl release

    AS/400              5.003
    MacOS               5.004
    Netware             5.003_07
    Tandem Guardian     5.004

The following platforms have only binaries available via http://www.perl.com/CPAN/ports/index.html.

                        Perl release

    Acorn RISCOS        5.005_02
    AOS                 5.002
    LynxOS              5.004_02

ENVIRONMENT

See the perlrun manpage.

AUTHOR

Larry Wall, with the help of oodles of other folks.
If your Perl success stories and testimonials may be of help to others who wish to advocate the use of Perl in their applications, or if you wish to simply express your gratitude to Larry and the Perl developers, please write to . FILES "@INC" locations of perl libraries SEE ALSO a2p awk to perl translator s2p sed to perl translator DIAGNOSTICS The -w switch produces some lovely diagnostics. See the perldiag manpage for explanations of all Perl's diagnostics. The `use diagnostics' pragma automatically turns Perl's normally terse warnings and errors into these longer forms. Compilation errors will tell you the line number of the error, with an indication of the next token or token type that was to be examined. (In the case of a script passed to Perl via -e switches, each -e is counted as one line.) Setuid scripts have additional constraints that can produce error messages such as "Insecure dependency". See the perlsec manpage. Did we mention that you should definitely consider using the -w switch? BUGS The -w switch is not mandatory. Perl is at the mercy of your machine's definitions of various operations such as type casting, atof(), and floating-point output with sprintf(). If your stdio requires a seek or eof between reads and writes on a particular stream, so does Perl. (This doesn't apply to sysread() and syswrite().) While none of the built-in data types have any arbitrary size limits (apart from memory size), there are still a few arbitrary limits: a given variable name may not be longer than 251 characters. Line numbers displayed by diagnostics are internally stored as short integers, so they are limited to a maximum of 65535 (higher numbers usually being affected by wraparound). You may mail your bug reports (be sure to include full configuration information as output by the myconfig program in the perl source tree, or by `perl -V') to . 
If you've succeeded in compiling perl, the perlbug script in the utils/ subdirectory can be used to help mail in a bug report. Perl actually stands for Pathologically Eclectic Rubbish Lister, but don't tell anyone I said that. NOTES The Perl motto is "There's more than one way to do it." Divining how many more is left as an exercise to the reader. The three principal virtues of a programmer are Laziness, Impatience, and Hubris. See the Camel Book for why. perl5004delta section NAME perldelta - what's new for perl5.004 DESCRIPTION This document describes differences between the 5.003 release (as documented in *Programming Perl*, second edition--the Camel Book) and this one. Supported Environments Perl5.004 builds out of the box on Unix, Plan 9, LynxOS, VMS, OS/2, QNX, AmigaOS, and Windows NT. Perl runs on Windows 95 as well, but it cannot be built there, for lack of a reasonable command interpreter. Core Changes Most importantly, many bugs were fixed, including several security problems. See the Changes file in the distribution for details. List assignment to %ENV works `%ENV = ()' and `%ENV = @list' now work as expected (except on VMS where it generates a fatal error). "Can't locate Foo.pm in @INC" error now lists @INC Compilation option: Binary compatibility with 5.003 There is a new Configure question that asks if you want to maintain binary compatibility with Perl 5.003. If you choose binary compatibility, you do not have to recompile your extensions, but you might have symbol conflicts if you embed Perl in another application, just as in the 5.003 release. By default, binary compatibility is preserved at the expense of symbol table pollution. $PERL5OPT environment variable You may now put Perl options in the $PERL5OPT environment variable. Unless Perl is running with taint checks, it will interpret this variable as if its contents had appeared on a "#!perl" line at the beginning of your script, except that hyphens are optional. 
PERL5OPT may only be used to set the following switches: -[DIMUdmw].

Limitations on -M, -m, and -T options

The `-M' and `-m' options are no longer allowed on the `#!' line of a script. If a script needs a module, it should invoke it with the `use' pragma.

The -T option is also forbidden on the `#!' line of a script, unless it was present on the Perl command line. Due to the way `#!' works, this usually means that -T must be in the first argument. Thus:

    #!/usr/bin/perl -T -w

will probably work for an executable script invoked as `scriptname', while:

    #!/usr/bin/perl -w -T

will probably fail under the same conditions. (Non-Unix systems will probably not follow this rule.) But `perl scriptname' is guaranteed to fail, since then there is no chance of -T being found on the command line before it is found on the `#!' line.

More precise warnings

If you removed the -w option from your Perl 5.003 scripts because it made Perl too verbose, we recommend that you try putting it back when you upgrade to Perl 5.004. Each new perl version tends to remove some undesirable warnings, while adding new warnings that may catch bugs in your scripts.

Deprecated: Inherited `AUTOLOAD' for non-methods

Before Perl 5.004, `AUTOLOAD' functions were looked up as methods (using the `@ISA' hierarchy), even when the function to be autoloaded was called as a plain function (e.g. `Foo::bar()'), not a method (e.g. `Foo->bar()' or `$obj->bar()').

Perl 5.005 will use method lookup only for methods' `AUTOLOAD's. However, there is a significant base of existing code that may be using the old behavior. So, as an interim step, Perl 5.004 issues an optional warning when a non-method uses an inherited `AUTOLOAD'.

The simple rule is: Inheritance will not work when autoloading non-methods. The simple fix for old code is: In any module that used to depend on inheriting `AUTOLOAD' for non-methods from a base class named `BaseClass', execute `*AUTOLOAD = \&BaseClass::AUTOLOAD' during startup.
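
As a rough illustration of the fix just described, the following sketch aliases a base class's `AUTOLOAD' into a derived package so that plain function calls no longer depend on inheritance. The package name `Derived' and the function `frobnicate' are hypothetical, chosen only for this example; the body of `AUTOLOAD' is likewise an assumption, not code from the Perl distribution.

```perl
use strict;
use warnings;

# Hypothetical base class whose AUTOLOAD handles calls to undefined subs.
package BaseClass;
our $AUTOLOAD;
sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;               # strip the package qualifier
    return if $name eq 'DESTROY';    # never autoload destructors
    return "autoloaded $name(@_)";
}

# Hypothetical module that used to rely on inheriting AUTOLOAD
# for plain (non-method) function calls.
package Derived;
our @ISA = ('BaseClass');
# The fix: alias the base class's AUTOLOAD into this package at startup.
*AUTOLOAD = \&BaseClass::AUTOLOAD;

package main;
# A plain function call, not a method call; no inheritance is involved.
print Derived::frobnicate(1, 2), "\n";   # prints "autoloaded frobnicate(1 2)"
```

Because the alias puts an `AUTOLOAD' entry directly into the derived package, the lookup succeeds without consulting `@ISA'.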
Previously deprecated %OVERLOAD is no longer usable

Using %OVERLOAD to define overloading was deprecated in 5.003. Overloading is now defined using the overload pragma. %OVERLOAD is still used internally but should not be used by Perl scripts. See the overload manpage for more details.

Subroutine arguments created only when they're modified

In Perl 5.004, nonexistent array and hash elements used as subroutine parameters are brought into existence only if they are actually assigned to (via `@_').

Earlier versions of Perl vary in their handling of such arguments. Perl versions 5.002 and 5.003 always brought them into existence. Perl versions 5.000 and 5.001 brought them into existence only if they were not the first argument (which was almost certainly a bug). Earlier versions of Perl never brought them into existence.

For example, given this code:

    undef @a;
    undef %a;
    sub show { print $_[0] };
    sub change { $_[0]++ };
    show($a[2]);
    change($a{b});

After this code executes in Perl 5.004, $a{b} exists but $a[2] does not. In Perl 5.002 and 5.003, both $a{b} and $a[2] would have existed (but $a[2]'s value would have been undefined).

Group vector changeable with `$)'

The `$)' special variable has always (well, in Perl 5, at least) reflected not only the current effective group, but also the group list as returned by the `getgroups()' C function (if there is one). However, until this release, there has not been a way to call the `setgroups()' C function from Perl.

In Perl 5.004, assigning to `$)' is exactly symmetrical with examining it: The first number in its string value is used as the effective gid; if there are any numbers after the first one, they are passed to the `setgroups()' C function (if there is one).

Fixed parsing of $$<digit>, &$<digit>, etc.

Perl versions before 5.004 misinterpreted any type marker followed by "$" and a digit. For example, "$$0" was incorrectly taken to mean "${$}0" instead of "${$0}". This bug is (mostly) fixed in Perl 5.004.
However, the developers of Perl 5.004 could not fix this bug completely, because at least two widely-used modules depend on the old meaning of "$$0" in a string. So Perl 5.004 still interprets "$$<digit>" in the old (broken) way inside strings, but it generates a warning when it does so. And in Perl 5.005, this special treatment will cease.

Fixed localization of $<digit>, $&, etc.

Perl versions before 5.004 did not always properly localize the regex-related special variables. Perl 5.004 does localize them, as the documentation has always said it should. This may result in $1, $2, etc. no longer being set where existing programs use them.

No resetting of $. on implicit close

The documentation for Perl 5.0 has always stated that `$.' is *not* reset when an already-open file handle is reopened with no intervening call to `close'. Due to a bug, perl versions 5.000 through 5.003 *did* reset `$.' under that circumstance; Perl 5.004 does not.

`wantarray' may return undef

The `wantarray' operator returns true if a subroutine is expected to return a list, and false otherwise. In Perl 5.004, `wantarray' can also return the undefined value if a subroutine's return value will not be used at all, which allows subroutines to avoid a time-consuming calculation of a return value if it isn't going to be used.

`eval EXPR' determines value of EXPR in scalar context

Perl (version 5) used to determine the value of EXPR inconsistently, sometimes incorrectly using the surrounding context for the determination. Now, the value of EXPR (before being parsed by eval) is always determined in a scalar context. Once parsed, it is executed as before, by providing the context that the scope surrounding the eval provided. This change makes the behavior Perl4 compatible, besides fixing bugs resulting from the inconsistent behavior. This program:

    @a = qw(time now is time);
    print eval @a;
    print '|', scalar eval @a;

used to print something like "timenowis881399109|4", but now (and in perl4) prints "4|4".
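
The three possible results of `wantarray' described above (true, false but defined, and undefined) can be sketched as follows. The helper name `note_context' and the variable names are assumptions made for illustration only.

```perl
use strict;
use warnings;

my $last_context;

# Hypothetical helper: record the context in which it was called.
sub note_context {
    $last_context = !defined(wantarray) ? 'void'     # result is discarded
                  : wantarray           ? 'list'     # caller wants a list
                  :                       'scalar';  # caller wants one value
    return;
}

my @list   = note_context();  my $in_list   = $last_context;
my $scalar = note_context();  my $in_scalar = $last_context;
note_context();               my $in_void   = $last_context;

print "$in_list $in_scalar $in_void\n";   # prints "list scalar void"
```

A subroutine that detects the void case this way can return early and skip building a result nobody will read.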
Changes to tainting checks

A bug in previous versions may have failed to detect some insecure conditions when taint checks are turned on. (Taint checks are used in setuid or setgid scripts, or when explicitly turned on with the `-T' invocation option.) Although it's unlikely, this may cause a previously-working script to now fail -- which should be construed as a blessing, since that indicates a potentially-serious security hole was just plugged.

The new restrictions when tainting include:

No glob() or <*>
    These operators may spawn the C shell (csh), which cannot be made safe. This restriction will be lifted in a future version of Perl when globbing is implemented without the use of an external program.

No spawning if tainted $CDPATH, $ENV, $BASH_ENV
    These environment variables may alter the behavior of spawned programs (especially shells) in ways that subvert security. So now they are treated as dangerous, in the manner of $IFS and $PATH.

No spawning if tainted $TERM doesn't look like a terminal name
    Some termcap libraries do unsafe things with $TERM. However, it would be unnecessarily harsh to treat all $TERM values as unsafe, since only shell metacharacters can cause trouble in $TERM. So a tainted $TERM is considered to be safe if it contains only alphanumerics, underscores, dashes, and colons, and unsafe if it contains other characters (including whitespace).

New Opcode module and revised Safe module

A new Opcode module supports the creation, manipulation and application of opcode masks. The revised Safe module has a new API and is implemented using the new Opcode module. Please read the new Opcode and Safe documentation.

Embedding improvements

In older versions of Perl it was not possible to create more than one Perl interpreter instance inside a single process without leaking like a sieve and/or crashing. The bugs that caused this behavior have all been fixed. However, you still must take care when embedding Perl in a C program.
See the updated perlembed manpage for tips on how to manage your interpreters.

Internal change: FileHandle class based on IO::* classes

File handles are now stored internally as type IO::Handle. The FileHandle module is still supported for backwards compatibility, but it is now merely a front end to the IO::* modules -- specifically, IO::Handle, IO::Seekable, and IO::File. We suggest, but do not require, that you use the IO::* modules in new code.

In harmony with this change, `*GLOB{FILEHANDLE}' is now just a backward-compatible synonym for `*GLOB{IO}'.

Internal change: PerlIO abstraction interface

It is now possible to build Perl with AT&T's sfio IO package instead of stdio. See the perlapio manpage for more details, and the INSTALL file for how to use it.

New and changed syntax

$coderef->(PARAMS)
    A subroutine reference may now be suffixed with an arrow and a (possibly empty) parameter list. This syntax denotes a call of the referenced subroutine, with the given parameters (if any).

    This new syntax follows the pattern of `$hashref->{FOO}' and `$aryref->[$foo]': You may now write `&$subref($foo)' as `$subref->($foo)'. All of these arrow terms may be chained; thus, `&{$table->{FOO}}($bar)' may now be written `$table->{FOO}->($bar)'.

New and changed builtin constants

__PACKAGE__
    The current package name at compile time, or the undefined value if there is no current package (due to a `package;' directive). Like `__FILE__' and `__LINE__', `__PACKAGE__' does *not* interpolate into strings.

New and changed builtin variables

$^E
    Extended error message on some platforms. (Also known as $EXTENDED_OS_ERROR if you `use English').

$^H
    The current set of syntax checks enabled by `use strict'. See the documentation of `strict' for more details. Not actually new, but newly documented. Because it is intended for internal use by Perl core components, there is no `use English' long name for this variable.

$^M
    By default, running out of memory is not trappable.
    However, if compiled for this, Perl may use the contents of `$^M' as an emergency pool after die()ing with this message. Suppose that your Perl were compiled with -DPERL_EMERGENCY_SBRK and used Perl's malloc. Then

        $^M = 'a' x (1<<16);

    would allocate a 64K buffer for use when in emergency. See the INSTALL file for information on how to enable this option. As a disincentive to casual use of this advanced feature, there is no `use English' long name for this variable.

New and changed builtin functions

delete on slices
    This now works. (e.g. `delete @ENV{'PATH', 'MANPATH'}')

flock
    is now supported on more platforms, prefers fcntl to lockf when emulating, and always flushes before (un)locking.

printf and sprintf
    Perl now implements these functions itself; it doesn't use the C library function sprintf() any more, except for floating-point numbers, and even then only known flags are allowed. As a result, it is now possible to know which conversions and flags will work, and what they will do.

    The new conversions in Perl's sprintf() are:

        %i   a synonym for %d
        %p   a pointer (the address of the Perl value, in hexadecimal)
        %n   special: *stores* the number of characters output so far
             into the next variable in the parameter list

    The new flags that go between the `%' and the conversion are:

        #    prefix octal with "0", hex with "0x"
        h    interpret integer as C type "short" or "unsigned short"
        V    interpret integer as Perl's standard integer type

    Also, where a number would appear in the flags, an asterisk ("*") may be used instead, in which case Perl uses the next item in the parameter list as the given number (that is, as the field width or precision). If a field width obtained through "*" is negative, it has the same effect as the '-' flag: left-justification.

    See the "sprintf" entry in the perlfunc manpage for a complete list of conversions and flags.

keys as an lvalue
    As an lvalue, `keys' allows you to increase the number of hash buckets allocated for the given hash.
    This can gain you a measure of efficiency if you know the hash is going to get big. (This is similar to pre-extending an array by assigning a larger number to $#array.) If you say

        keys %hash = 200;

    then `%hash' will have at least 200 buckets allocated for it. These buckets will be retained even if you do `%hash = ()'; use `undef %hash' if you want to free the storage while `%hash' is still in scope. You can't shrink the number of buckets allocated for the hash using `keys' in this way (but you needn't worry about doing this by accident, as trying has no effect).

my() in Control Structures
    You can now use my() (with or without the parentheses) in the control expressions of control structures such as:

        while (defined(my $line = <>)) {
            $line = lc $line;
        } continue {
            print $line;
        }

        if ((my $answer = <STDIN>) =~ /^y(es)?$/i) {
            user_agrees();
        } elsif ($answer =~ /^n(o)?$/i) {
            user_disagrees();
        } else {
            chomp $answer;
            die "`$answer' is neither `yes' nor `no'";
        }

    Also, you can declare a foreach loop control variable as lexical by preceding it with the word "my". For example, in:

        foreach my $i (1, 2, 3) {
            some_function();
        }

    $i is a lexical variable, and the scope of $i extends to the end of the loop, but not beyond it.

    Note that you still cannot use my() on global punctuation variables such as $_ and the like.

pack() and unpack()
    A new format 'w' represents a BER compressed integer (as defined in ASN.1). Its format is a sequence of one or more bytes, each of which provides seven bits of the total value, with the most significant first. Bit eight of each byte is set, except for the last byte, in which bit eight is clear.

    If 'p' or 'P' are given undef as values, they now generate a NULL pointer.

    Both pack() and unpack() now fail when their templates contain invalid types. (Invalid types used to be ignored.)

sysseek()
    The new sysseek() operator is a variant of seek() that sets and gets the file's system read/write position, using the lseek(2) system call.
    It is the only reliable way to seek before using sysread() or syswrite(). Its return value is the new position, or the undefined value on failure.

use VERSION
    If the first argument to `use' is a number, it is treated as a version number instead of a module name. If the version of the Perl interpreter is less than VERSION, then an error message is printed and Perl exits immediately. Because `use' occurs at compile time, this check happens immediately during the compilation process, unlike `require VERSION', which waits until runtime for the check. This is often useful if you need to check the current Perl version before `use'ing library modules which have changed in incompatible ways from older versions of Perl. (We try not to do this more than we have to.)

use Module VERSION LIST
    If the VERSION argument is present between Module and LIST, then the `use' will call the VERSION method in class Module with the given version as an argument. The default VERSION method, inherited from the UNIVERSAL class, croaks if the given version is larger than the value of the variable $Module::VERSION. (Note that there is not a comma after VERSION!)

    This version-checking mechanism is similar to the one currently used in the Exporter module, but it is faster and can be used with modules that don't use the Exporter. It is the recommended method for new code.

prototype(FUNCTION)
    Returns the prototype of a function as a string (or `undef' if the function has no prototype). FUNCTION is a reference to or the name of the function whose prototype you want to retrieve. (Not actually new; just never documented before.)

srand
    The default seed for `srand', which used to be `time', has been changed. Now it's a heady mix of difficult-to-predict system-dependent values, which should be sufficient for most everyday purposes.

    Previous to version 5.004, calling `rand' without first calling `srand' would yield the same sequence of random numbers on most or all machines.
    Now, when perl sees that you're calling `rand' and haven't yet called `srand', it calls `srand' with the default seed. You should still call `srand' manually if your code might ever be run on a pre-5.004 system, of course, or if you want a seed other than the default.

$_ as Default
    Functions documented in the Camel to default to $_ now in fact do, and all those that do are so documented in the perlfunc manpage.

`m//gc' does not reset search position on failure
    The `m//g' match iteration construct has always reset its target string's search position (which is visible through the `pos' operator) when a match fails; as a result, the next `m//g' match after a failure starts again at the beginning of the string. With Perl 5.004, this reset may be disabled by adding the "c" (for "continue") modifier, i.e. `m//gc'. This feature, in conjunction with the `\G' zero-width assertion, makes it possible to chain matches together. See the perlop manpage and the perlre manpage.

`m//x' ignores whitespace before ?*+{}
    The `m//x' construct has always been intended to ignore all unescaped whitespace. However, before Perl 5.004, whitespace had the effect of escaping repeat modifiers like "*" or "?"; for example, `/a *b/x' was (mis)interpreted as `/a\*b/x'. This bug has been fixed in 5.004.

nested `sub{}' closures work now
    Prior to the 5.004 release, nested anonymous functions didn't work right. They do now.

formats work right on changing lexicals
    Just like anonymous functions that contain lexical variables that change (like a lexical index variable for a `foreach' loop), formats now work properly. For example, this silently failed before (printed only zeros), but is fine now:

        my $i;
        foreach $i ( 1 .. 10 ) {
            write;
        }
        format =
            my i is @#
            $i
        .

    However, it still fails (without a warning) if the foreach is within a subroutine:

        my $i;
        sub foo {
          foreach $i ( 1 .. 10 ) {
            write;
          }
        }
        foo;
        format =
            my i is @#
            $i
        .
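
To illustrate how `m//gc' and `\G' can chain matches together, as described above, here is a minimal tokenizer sketch. The `tokenize' function and its token labels (NUM, WORD, OP) are hypothetical and exist only for this example; each alternative is tried at the current `pos', and a failed attempt leaves that position untouched because of the "c" modifier.

```perl
use strict;
use warnings;

# Hypothetical tokenizer: split a string into numbers, words, and operators.
sub tokenize {
    my ($text) = @_;
    my @tokens;
    pos($text) = 0;
    while (pos($text) < length $text) {
        if    ($text =~ /\G\s+/gc)            { next }                    # skip whitespace
        elsif ($text =~ /\G(\d+)/gc)          { push @tokens, "NUM:$1" }
        elsif ($text =~ /\G([A-Za-z_]\w*)/gc) { push @tokens, "WORD:$1" }
        elsif ($text =~ /\G(.)/gcs)           { push @tokens, "OP:$1" }   # any other character
    }
    return @tokens;
}

print join(' ', tokenize('x1 = 42 + y')), "\n";
# prints "WORD:x1 OP:= NUM:42 OP:+ WORD:y"
```

Without the "c" modifier, the first failed alternative would reset the search position and the later alternatives would start over at the beginning of the string.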
New builtin methods The `UNIVERSAL' package automatically contains the following methods that are inherited by all other classes: isa(CLASS) `isa' returns *true* if its object is blessed into a subclass of `CLASS'. `isa' is also exportable and can be called as a sub with two arguments. This makes it possible to check what a reference points to. Example: use UNIVERSAL qw(isa); if(isa($ref, 'ARRAY')) { ... } can(METHOD) `can' checks to see if its object has a method called `METHOD'; if it does, then a reference to the sub is returned; if it does not, then *undef* is returned. VERSION( [NEED] ) `VERSION' returns the version number of the class (package). If the NEED argument is given, then it will check that the current version (as defined by the $VERSION variable in the given package) is not less than NEED; it will die if this is not the case. This method is normally called as a class method. This method is called automatically by the `VERSION' form of `use'. use A 1.2 qw(some imported subs); # implies: A->VERSION(1.2); NOTE: `can' directly uses Perl's internal code for method lookup, and `isa' uses a very similar method and caching strategy. This may cause strange effects if the Perl code dynamically changes @ISA in any package. You may add other methods to the UNIVERSAL class via Perl or XS code. You do not need to `use UNIVERSAL' in order to make these methods available to your program. This is necessary only if you wish to have `isa' available as a plain subroutine in the current package. TIEHANDLE now supported See the perltie manpage for other kinds of tie()s. TIEHANDLE classname, LIST This is the constructor for the class. That means it is expected to return an object of some sort. The reference can be used to hold some internal information. sub TIEHANDLE { print "\n"; my $i; return bless \$i, shift; } PRINT this, LIST This method will be triggered every time the tied handle is printed to.
Beyond its self reference it also expects the list that was passed to the print function. sub PRINT { $r = shift; $$r++; return print join( $, => map {uc} @_), $\; } PRINTF this, LIST This method will be triggered every time the tied handle is printed to with the `printf()' function. Beyond its self reference it also expects the format and list that were passed to the printf function. sub PRINTF { shift; my $fmt = shift; print sprintf($fmt, @_)."\n"; } READ this, LIST This method will be called when the handle is read from via the `read' or `sysread' functions. sub READ { $r = shift; my($buf,$len,$offset) = @_; print "READ called, \$buf=$buf, \$len=$len, \$offset=$offset"; } READLINE this This method will be called when the handle is read from. The method should return undef when there is no more data. sub READLINE { $r = shift; return "PRINT called $$r times\n" } GETC this This method will be called when the `getc' function is called. sub GETC { print "Don't GETC, Get Perl"; return "a"; } DESTROY this As with the other types of ties, this method will be called when the tied handle is about to be destroyed. This is useful for debugging and possibly for cleaning up. sub DESTROY { print "\n"; } Malloc enhancements If perl is compiled with the malloc included with the perl distribution (that is, if `perl -V:d_mymalloc' is 'define') then you can print memory statistics at runtime by running Perl thusly: env PERL_DEBUG_MSTATS=2 perl your_script_here The value of 2 means to print statistics after compilation and on exit; with a value of 1, the statistics are printed only on exit. (If you want the statistics at an arbitrary time, you'll need to install the optional module Devel::Peek.) Three new compilation flags are recognized by malloc.c. (They have no effect if perl is compiled with system malloc().) -DPERL_EMERGENCY_SBRK If this macro is defined, running out of memory need not be a fatal error: a memory pool can be allocated by assigning to the special variable `$^M'.
See the section on "$^M". -DPACK_MALLOC Perl memory allocation is by bucket with sizes close to powers of two. Because of this, malloc overhead may be big, especially for data of size exactly a power of two. If `PACK_MALLOC' is defined, perl uses a slightly different algorithm for small allocations (up to 64 bytes long), which makes it possible to have overhead down to 1 byte for allocations which are powers of two (and appear quite often). Expected memory savings (with 8-byte alignment in `alignbytes') are about 20% for typical Perl usage. Expected slowdown due to additional malloc overhead is in fractions of a percent (hard to measure, because of the effect of saved memory on speed). -DTWO_POT_OPTIMIZE Similarly to `PACK_MALLOC', this macro improves allocations of data with size close to a power of two; but this works for big allocations (starting with 16K by default). Such allocations are typical for big hashes and special-purpose scripts, especially image processing. On recent systems, the fact that perl requires 2M from the system for a 1M allocation will not affect speed of execution, since the tail of such a chunk is not going to be touched (and thus will not require real memory). However, it may result in a premature out-of-memory error. So if you will be manipulating very large blocks with sizes close to powers of two, it would be wise to define this macro. Expected saving of memory is 0-100% (100% in applications which require most memory in such 2**n chunks); expected slowdown is negligible. Miscellaneous efficiency enhancements Functions that have an empty prototype and that do nothing but return a fixed value are now inlined (e.g. `sub PI () { 3.14159 }'). Each unique hash key is only allocated once, no matter how many hashes have an entry with that key. So even if you have 100 copies of the same hash, the hash keys never have to be reallocated. Support for More Operating Systems Support for the following operating systems is new in Perl 5.004.
Win32 Perl 5.004 now includes support for building a "native" perl under Windows NT, using the Microsoft Visual C++ compiler (versions 2.0 and above) or the Borland C++ compiler (versions 5.02 and above). The resulting perl can be used under Windows 95 (if it is installed in the same directory locations as it was under Windows NT). This port includes support for perl extension building tools like the MakeMaker manpage and the h2xs manpage, so that many extensions available on the Comprehensive Perl Archive Network (CPAN) can now be readily built under Windows NT. See http://www.perl.com/ for more information on CPAN and README.win32 in the perl distribution for more details on how to get started with building this port. There is also support for building perl under the Cygwin32 environment. Cygwin32 is a set of GNU tools that make it possible to compile and run many UNIX programs under Windows NT by providing a mostly UNIX-like interface for compilation and execution. See README.cygwin32 in the perl distribution for more details on this port and how to obtain the Cygwin32 toolkit. Plan 9 See README.plan9 in the perl distribution. QNX See README.qnx in the perl distribution. AmigaOS See README.amigaos in the perl distribution. Pragmata Six new pragmatic modules exist: use autouse MODULE => qw(sub1 sub2 sub3) Defers `require MODULE' until someone calls one of the specified subroutines (which must be exported by MODULE). This pragma should be used with caution, and only when necessary. use blib use blib 'dir' Looks for a MakeMaker-like *'blib'* directory structure starting in *dir* (or current directory) and working back up to five levels of parent directories. Intended for use on the command line with the -M option as a way of testing arbitrary scripts against an uninstalled version of a package. use constant NAME => VALUE Provides a convenient interface for creating compile-time constants. See the section on "Constant Functions" in the perlsub manpage.
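A minimal sketch of the `use constant' pragma just described. The constant name PI echoes the example used elsewhere in this document; the helper circle_area is hypothetical. The last statement relies on the documented `prototype' function to show that the constant is an ordinary subroutine with an empty prototype.

```perl
use strict;
use warnings;

use constant PI => 3.14159;    # defines a sub PI with an empty prototype

# Hypothetical helper, for illustration only.
sub circle_area {
    my ($radius) = @_;
    return PI * $radius * $radius;    # PI is used like a bareword value
}

print "area: ", circle_area(2), "\n";

# An empty prototype is reported as the empty string, not as undef.
my $proto = prototype(\&PI);
print "PI has an empty prototype\n" if defined $proto && $proto eq "";
```

Because the value is fixed when the program is compiled, a constant cannot be interpolated into a double-quoted string the way a variable can; it has to be joined in, as in the `print' call above.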
use locale Tells the compiler to enable (or disable) the use of POSIX locales for builtin operations. When `use locale' is in effect, the current LC_CTYPE locale is used for regular expressions and case mapping; LC_COLLATE for string ordering; and LC_NUMERIC for numeric formatting in printf and sprintf (but not in print). LC_NUMERIC is always used in write, since lexical scoping of formats is problematic at best. Each `use locale' or `no locale' affects statements to the end of the enclosing BLOCK or, if not inside a BLOCK, to the end of the current file. Locales can be switched and queried with POSIX::setlocale(). See the perllocale manpage for more information. use ops Disable unsafe opcodes, or any named opcodes, when compiling Perl code. use vmsish Enable VMS-specific language features. Currently, there are three VMS-specific features available: 'status', which makes `$?' and `system' return genuine VMS status values instead of emulating POSIX; 'exit', which makes `exit' take a genuine VMS status value instead of assuming that `exit 1' is an error; and 'time', which makes all times relative to the local time zone, in the VMS tradition. Modules Required Updates Though Perl 5.004 is compatible with almost all modules that work with Perl 5.003, there are a few exceptions:

    Module   Required Version for Perl 5.004
    ------   -------------------------------
    Filter   Filter-1.12
    LWP      libwww-perl-5.08
    Tk       Tk400.202 (-w makes noise)

Also, the majordomo mailing list program, version 1.94.1, doesn't work with Perl 5.004 (nor with perl 4), because it executes an invalid regular expression. This bug is fixed in majordomo version 1.94.2. Installation directories The *installperl* script now places the Perl source files for extensions in the architecture-specific library directory, which is where the shared libraries for extensions have always been.
This change is intended to allow administrators to keep the Perl 5.004 library directory unchanged from a previous version, without running the risk of binary incompatibility between extensions' Perl source and shared libraries. Module information summary Brand new modules, arranged by topic rather than strictly alphabetically: CGI.pm Web server interface ("Common Gateway Interface") CGI/Apache.pm Support for Apache's Perl module CGI/Carp.pm Log server errors with helpful context CGI/Fast.pm Support for FastCGI (persistent server process) CGI/Push.pm Support for server push CGI/Switch.pm Simple interface for multiple server types CPAN Interface to Comprehensive Perl Archive Network CPAN::FirstTime Utility for creating CPAN configuration file CPAN::Nox Runs CPAN while avoiding compiled extensions IO.pm Top-level interface to IO::* classes IO/File.pm IO::File extension Perl module IO/Handle.pm IO::Handle extension Perl module IO/Pipe.pm IO::Pipe extension Perl module IO/Seekable.pm IO::Seekable extension Perl module IO/Select.pm IO::Select extension Perl module IO/Socket.pm IO::Socket extension Perl module Opcode.pm Disable named opcodes when compiling Perl code ExtUtils/Embed.pm Utilities for embedding Perl in C programs ExtUtils/testlib.pm Fixes up @INC to use just-built extension FindBin.pm Find path of currently executing program Class/Struct.pm Declare struct-like datatypes as Perl classes File/stat.pm By-name interface to Perl's builtin stat Net/hostent.pm By-name interface to Perl's builtin gethost* Net/netent.pm By-name interface to Perl's builtin getnet* Net/protoent.pm By-name interface to Perl's builtin getproto* Net/servent.pm By-name interface to Perl's builtin getserv* Time/gmtime.pm By-name interface to Perl's builtin gmtime Time/localtime.pm By-name interface to Perl's builtin localtime Time/tm.pm Internal object for Time::{gm,local}time User/grent.pm By-name interface to Perl's builtin getgr* User/pwent.pm By-name interface to Perl's builtin getpw* 
Tie/RefHash.pm Base class for tied hashes with references as keys UNIVERSAL.pm Base class for *ALL* classes Fcntl New constants in the existing Fcntl modules are now supported, provided that your operating system happens to support them: F_GETOWN F_SETOWN O_ASYNC O_DEFER O_DSYNC O_FSYNC O_SYNC O_EXLOCK O_SHLOCK These constants are intended for use with the Perl operators sysopen() and fcntl() and the basic database modules like SDBM_File. For the exact meaning of these and other Fcntl constants please refer to your operating system's documentation for fcntl() and open(). In addition, the Fcntl module now provides these constants for use with the Perl operator flock(): LOCK_SH LOCK_EX LOCK_NB LOCK_UN These constants are defined in all environments (because where there is no flock() system call, Perl emulates it). However, for historical reasons, these constants are not exported unless they are explicitly requested with the ":flock" tag (e.g. `use Fcntl ':flock''). IO The IO module provides a simple mechanism to load all of the IO modules at one go. Currently this includes: IO::Handle IO::Seekable IO::File IO::Pipe IO::Socket For more information on any of these modules, please see its respective documentation. Math::Complex The Math::Complex module has been totally rewritten, and now supports more operations. These are overloaded: + - * / ** <=> neg ~ abs sqrt exp log sin cos atan2 "" (stringify) And these functions are now exported: pi i Re Im arg log10 logn ln cbrt root tan csc sec cot asin acos atan acsc asec acot sinh cosh tanh csch sech coth asinh acosh atanh acsch asech acoth cplx cplxe Math::Trig This new module provides a simpler interface to parts of Math::Complex for those who need trigonometric functions only for real numbers. DB_File There have been quite a few changes made to DB_File. Here are a few of the highlights: * Fixed a handful of bugs. * By public demand, added support for the standard hash function exists(). 
* Made it compatible with Berkeley DB 1.86. * Made negative subscripts work with RECNO interface. * Changed the default flags from O_RDWR to O_CREAT|O_RDWR and the default mode from 0640 to 0666. * Made DB_File automatically import the open() constants (O_RDWR, O_CREAT etc.) from Fcntl, if available. * Updated documentation. Refer to the HISTORY section in DB_File.pm for a complete list of changes. Everything after DB_File 1.01 has been added since 5.003. Net::Ping Major rewrite - support added for both udp echo and real icmp pings. Object-oriented overrides for builtin operators Many of the Perl builtins returning lists now have object-oriented overrides. These are: File::stat Net::hostent Net::netent Net::protoent Net::servent Time::gmtime Time::localtime User::grent User::pwent For example, you can now say use File::stat; use User::pwent; $his = (stat($filename)->uid == getpwnam($whoever)->uid); Utility Changes pod2html Sends converted HTML to standard output The *pod2html* utility included with Perl 5.004 is entirely new. By default, it sends the converted HTML to its standard output, instead of writing it to a file like Perl 5.003's *pod2html* did. Use the --outfile=FILENAME option to write to a file. xsubpp `void' XSUBs now default to returning nothing Due to a documentation/implementation bug in previous versions of Perl, XSUBs with a return type of `void' have actually been returning one value. Usually that value was the GV for the XSUB, but sometimes it was some already freed or reused value, which would sometimes lead to program failure. In Perl 5.004, if an XSUB is declared as returning `void', it actually returns no value, i.e. an empty list (though there is a backward-compatibility exception; see below). If your XSUB really does return an SV, you should give it a return type of `SV *'. For backward compatibility, *xsubpp* tries to guess whether a `void' XSUB is really `void' or if it wants to return an `SV *'.
It does so by examining the text of the XSUB: if *xsubpp* finds what looks like an assignment to `ST(0)', it assumes that the XSUB's return type is really `SV *'. C Language API Changes `gv_fetchmethod' and `perl_call_sv' The `gv_fetchmethod' function finds a method for an object, just like in Perl 5.003. The GV it returns may be a method cache entry. However, in Perl 5.004, method cache entries are not visible to users; therefore, they can no longer be passed directly to `perl_call_sv'. Instead, you should use the `GvCV' macro on the GV to extract its CV, and pass the CV to `perl_call_sv'. The most likely symptom of passing the result of `gv_fetchmethod' to `perl_call_sv' is Perl's producing an "Undefined subroutine called" error on the *second* call to a given method (since there is no cache on the first call). `perl_eval_pv' A new function handy for eval'ing strings of Perl code inside C code. This function returns the value from the eval statement, which can be used instead of fetching globals from the symbol table. See the perlguts manpage, the perlembed manpage and the perlcall manpage for details and examples. Extended API for manipulating hashes Internal handling of hash keys has changed. The old hashtable API is still fully supported, and will likely remain so. The additions to the API allow passing keys as `SV*'s, so that `tied' hashes can be given real scalars as keys rather than plain strings (nontied hashes still can only use strings as keys). New extensions must use the new hash access functions and macros if they wish to use `SV*' keys. These additions also make it feasible to manipulate `HE*'s (hash entries), which can be more efficient. See the perlguts manpage for details. Documentation Changes Many of the base and library pods were updated. These new pods are included in section 1: the perldelta manpage This document. the perlfaq manpage Frequently asked questions. the perllocale manpage Locale support (internationalization and localization). 
the perltoot manpage Tutorial on Perl OO programming. the perlapio manpage Perl internal IO abstraction interface. the perlmodlib manpage Perl module library and recommended practice for module creation. Extracted from the perlmod manpage (which is much smaller as a result). the perldebug manpage Although not new, this has been massively updated. the perlsec manpage Although not new, this has been massively updated. New Diagnostics Several new conditions will trigger warnings that were silent before. Some only affect certain platforms. The following new warnings and errors outline these. These messages are classified as follows (listed in increasing order of desperation): (W) A warning (optional). (D) A deprecation (optional). (S) A severe warning (mandatory). (F) A fatal error (trappable). (P) An internal error you should never see (trappable). (X) A very fatal error (nontrappable). (A) An alien error message (not generated by Perl). "my" variable %s masks earlier declaration in same scope (W) A lexical variable has been redeclared in the same scope, effectively eliminating all access to the previous instance. This is almost always a typographical error. Note that the earlier variable will still exist until the end of the scope or until all closure referents to it are destroyed. %s argument is not a HASH element or slice (F) The argument to delete() must be either a hash element, such as $foo{$bar} $ref->[12]->{"susie"} or a hash slice, such as @foo{$bar, $baz, $xyzzy} @{$ref->[12]}{"susie", "queue"} Allocation too large: %lx (X) You can't allocate more than 64K on an MS-DOS machine. Allocation too large (F) You can't allocate more than 2^31+"small amount" bytes. Applying %s to %s will act on scalar(%s) (W) The pattern match (//), substitution (s///), and transliteration (tr///) operators work on scalar values. 
If you apply one of them to an array or a hash, it will convert the array or hash to a scalar value -- the length of an array, or the population info of a hash -- and then work on that scalar value. This is probably not what you meant to do. See the "grep" entry in the perlfunc manpage and the "map" entry in the perlfunc manpage for alternatives. Attempt to free nonexistent shared string (P) Perl maintains a reference counted internal table of strings to optimize the storage and access of hash keys and other strings. This indicates someone tried to decrement the reference count of a string that can no longer be found in the table. Attempt to use reference as lvalue in substr (W) You supplied a reference as the first argument to substr() used as an lvalue, which is pretty strange. Perhaps you forgot to dereference it first. See the "substr" entry in the perlfunc manpage. Bareword "%s" refers to nonexistent package (W) You used a qualified bareword of the form `Foo::', but the compiler saw no other uses of that namespace before that point. Perhaps you need to predeclare a package? Can't redefine active sort subroutine %s (F) Perl optimizes the internal handling of sort subroutines and keeps pointers into them. You tried to redefine one such sort subroutine when it was currently active, which is not allowed. If you really want to do this, you should write `sort { &func } @x' instead of `sort func @x'. Can't use bareword ("%s") as %s ref while "strict refs" in use (F) Only hard references are allowed by "strict refs". Symbolic references are disallowed. See the perlref manpage. Cannot resolve method `%s' overloading `%s' in package `%s' (P) Internal error trying to resolve overloading specified by a method name (as opposed to a subroutine reference). Constant subroutine %s redefined (S) You redefined a subroutine which had previously been eligible for inlining. See the section on "Constant Functions" in the perlsub manpage for commentary and workarounds. 
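A minimal sketch of why redefining an inlinable subroutine deserves a warning, assuming the inlining behaviour described in the perlsub manpage: code compiled before the redefinition keeps the old value. The names LIMIT and check are hypothetical.

```perl
use strict;
use warnings;

sub LIMIT () { 10 }           # empty prototype, fixed value: eligible for inlining
sub check { return LIMIT }    # the value is inlined here when check is compiled

{
    no warnings 'redefine';     # ask perl not to report the redefinition here
    *LIMIT = sub () { 20 };     # redefine LIMIT at run time
}

my $compiled_in = check();     # code compiled earlier still uses the old value
my $looked_up   = &LIMIT();    # calls made with & are never inlined

print "check() gives $compiled_in, &LIMIT() gives $looked_up\n";
```

The two results differ, which is the surprise the diagnostic is meant to flag.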
Constant subroutine %s undefined (S) You undefined a subroutine which had previously been eligible for inlining. See the section on "Constant Functions" in the perlsub manpage for commentary and workarounds. Copy method did not return a reference (F) The method which overloads "=" is buggy. See the section on "Copy Constructor" in the overload manpage. Died (F) You passed die() an empty string (the equivalent of `die ""') or you called it with no args and both `$@' and `$_' were empty. Exiting pseudo-block via %s (W) You are exiting a rather special block construct (like a sort block or subroutine) by unconventional means, such as a goto, or a loop control statement. See the "sort" entry in the perlfunc manpage. Identifier too long (F) Perl limits identifiers (names for variables, functions, etc.) to 252 characters for simple names, somewhat more for compound names (like `$A::B'). You've exceeded Perl's limits. Future versions of Perl are likely to eliminate these arbitrary limitations. Illegal character %s (carriage return) (F) A carriage return character was found in the input. This is an error, and not a warning, because carriage return characters can break multi-line strings, including here documents (e.g., `print <'. This may mean that your csh (C shell) is broken. If so, you should change all of the csh-related variables in config.sh: If you have tcsh, make the variables refer to it as if it were csh (e.g. `full_csh='/usr/bin/tcsh''); otherwise, make them all empty (except that `d_csh' should be `'undef'') so that Perl will think csh is missing. In either case, after editing config.sh, run `./Configure -S' and rebuild Perl. Invalid conversion in %s: "%s" (W) Perl does not understand the given format conversion. See the "sprintf" entry in the perlfunc manpage. Invalid type in pack: '%s' (F) The given character is not a valid pack type. See the "pack" entry in the perlfunc manpage. Invalid type in unpack: '%s' (F) The given character is not a valid unpack type. 
See the "unpack" entry in the perlfunc manpage. Name "%s::%s" used only once: possible typo (W) Typographical errors often show up as unique variable names. If you had a good reason for having a unique name, then just mention it again somehow to suppress the message (the `use vars' pragma is provided for just this purpose). Null picture in formline (F) The first argument to formline must be a valid format picture specification. It was found to be empty, which probably means you supplied it an uninitialized value. See the perlform manpage. Offset outside string (F) You tried to do a read/write/send/recv operation with an offset pointing outside the buffer. This is difficult to imagine. The sole exception to this is that `sysread()'ing past the buffer will extend the buffer and zero pad the new area. Out of memory! (X|F) The malloc() function returned 0, indicating there was insufficient remaining memory (or virtual memory) to satisfy the request. The request was judged to be small, so the possibility to trap it depends on the way Perl was compiled. By default it is not trappable. However, if compiled for this, Perl may use the contents of `$^M' as an emergency pool after die()ing with this message. In this case the error is trappable *once*. Out of memory during request for %s (F) The malloc() function returned 0, indicating there was insufficient remaining memory (or virtual memory) to satisfy the request. However, the request was judged large enough (compile-time default is 64K), so a possibility to shut down by trapping this error is granted. panic: frexp (P) The library function frexp() failed, making printf("%f") impossible. Possible attempt to put comments in qw() list (W) qw() lists contain items separated by whitespace; as with literal strings, comment characters are not ignored, but are instead treated as literal data. (You may have used different delimiters than the parentheses shown here; braces are also frequently used.) 
You probably wrote something like this: @list = qw( a # a comment b # another comment ); when you should have written this: @list = qw( a b ); If you really want comments, build your list the old-fashioned way, with quotes and commas: @list = ( 'a', # a comment 'b', # another comment ); Possible attempt to separate words with commas (W) qw() lists contain items separated by whitespace; therefore commas aren't needed to separate the items. (You may have used different delimiters than the parentheses shown here; braces are also frequently used.) You probably wrote something like this: qw! a, b, c !; which puts literal commas into some of the list items. Write it without commas if you don't want them to appear in your data: qw! a b c !; Scalar value @%s{%s} better written as $%s{%s} (W) You've used a hash slice (indicated by @) to select a single element of a hash. Generally it's better to ask for a scalar value (indicated by $). The difference is that `$foo{&bar}' always behaves like a scalar, both when assigning to it and when evaluating its argument, while `@foo{&bar}' behaves like a list when you assign to it, and provides a list context to its subscript, which can do weird things if you're expecting only one subscript. Stub found while resolving method `%s' overloading `%s' in package `%s' (P) Overloading resolution over @ISA tree may be broken by importing stubs. Stubs should never be implicitly created, but explicit calls to `can' may break this. Too late for "-T" option (X) The #! line (or local equivalent) in a Perl script contains the -T option, but Perl was not invoked with -T in its argument list. This is an error because, by the time Perl discovers a -T in a script, it's too late to properly taint everything from the environment. So Perl gives up. untie attempted while %d inner references still exist (W) A copy of the object returned from `tie' (or `tied') was still valid when `untie' was called. 
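A minimal sketch of how the `untie' warning above can arise, assuming warnings are enabled. The class Counter and the variable names are hypothetical. The warning is captured with a `__WARN__' handler so that it can be inspected instead of being printed to standard error.

```perl
use strict;
use warnings;

# Hypothetical tie class, for illustration only.
package Counter;
sub TIESCALAR { my ($class) = @_; my $n = 0; return bless \$n, $class }
sub FETCH     { my ($self) = @_; return ++$$self }
sub STORE     { }

package main;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $obj = tie my $count, 'Counter';    # keep a copy of the object returned by tie
untie $count;                          # $obj still refers to the object here
undef $obj;                            # too late: untie has already been attempted

print "captured: $_" for @warnings;
```

Dropping the extra reference before calling `untie' (for example, by moving the `undef $obj' line up) avoids the warning.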
Unrecognized character %s (F) The Perl parser has no idea what to do with the specified character in your Perl script (or eval). Perhaps you tried to run a compressed script, a binary program, or a directory as a Perl program. Unsupported function fork (F) Your version of the executable does not support forking. Note that under some systems, like OS/2, there may be different flavors of Perl executables, some of which may support fork, some not. Try changing the name you call Perl by to `perl_', `perl__', and so on. Use of "$$" to mean "${$}" is deprecated (D) Perl versions before 5.004 misinterpreted any type marker followed by "$" and a digit. For example, "$$0" was incorrectly taken to mean "${$}0" instead of "${$0}". This bug is (mostly) fixed in Perl 5.004. However, the developers of Perl 5.004 could not fix this bug completely, because at least two widely-used modules depend on the old meaning of "$$0" in a string. So Perl 5.004 still interprets "$$" in the old (broken) way inside strings; but it generates this message as a warning. And in Perl 5.005, this special treatment will cease. Value of %s can be "0"; test with defined() (W) In a conditional expression, you used <HANDLE>, <*> (glob), `each()', or `readdir()' as a boolean value. Each of these constructs can return a value of "0"; that would make the conditional expression false, which is probably not what you intended. When using these constructs in conditional expressions, test their values with the `defined' operator. Variable "%s" may be unavailable (W) An inner (nested) *anonymous* subroutine is inside a *named* subroutine, and outside that is another subroutine; and the anonymous (innermost) subroutine is referencing a lexical variable defined in the outermost subroutine. For example: sub outermost { my $a; sub middle { sub { $a } } } If the anonymous subroutine is called or referenced (directly or indirectly) from the outermost subroutine, it will share the variable as you would expect.
But if the anonymous subroutine is called or referenced when the outermost subroutine is not active, it will see the value of the shared variable as it was before and during the *first* call to the outermost subroutine, which is probably not what you want. In these circumstances, it is usually best to make the middle subroutine anonymous, using the `sub {}' syntax. Perl has specific support for shared variables in nested anonymous subroutines; a named subroutine in between interferes with this feature. Variable "%s" will not stay shared (W) An inner (nested) *named* subroutine is referencing a lexical variable defined in an outer subroutine. When the inner subroutine is called, it will probably see the value of the outer subroutine's variable as it was before and during the *first* call to the outer subroutine; in this case, after the first call to the outer subroutine is complete, the inner and outer subroutines will no longer share a common value for the variable. In other words, the variable will no longer be shared. Furthermore, if the outer subroutine is anonymous and references a lexical variable outside itself, then the outer and inner subroutines will *never* share the given variable. This problem can usually be solved by making the inner subroutine anonymous, using the `sub {}' syntax. When inner anonymous subs that reference variables in outer subroutines are called or referenced, they are automatically rebound to the current values of such variables. Warning: something's wrong (W) You passed warn() an empty string (the equivalent of `warn ""') or you called it with no args and `$_' was empty. Ill-formed logical name |%s| in prime_env_iter (W) A warning peculiar to VMS. A logical name was encountered when preparing to iterate over %ENV which violates the syntactic rules governing logical names. Since it cannot be translated normally, it is skipped, and will not appear in %ENV. 
This may be a benign occurrence, as some software packages might directly modify logical name tables and introduce nonstandard names, or it may indicate that a logical name table has been corrupted. Got an error from DosAllocMem (P) An error peculiar to OS/2. Most probably you're using an obsolete version of Perl, and this should not happen anyway. Malformed PERLLIB_PREFIX (F) An error peculiar to OS/2. PERLLIB_PREFIX should be of the form prefix1;prefix2 or prefix1 prefix2 with nonempty prefix1 and prefix2. If `prefix1' is indeed a prefix of a builtin library search path, prefix2 is substituted. The error may appear if components are not found, or are too long. See "PERLLIB_PREFIX" in README.os2. PERL_SH_DIR too long (F) An error peculiar to OS/2. PERL_SH_DIR is the directory to find the `sh'-shell in. See "PERL_SH_DIR" in README.os2. Process terminated by SIG%s (W) This is a standard message issued by OS/2 applications, while *nix applications die in silence. It is considered a feature of the OS/2 port. One can easily disable this by appropriate sighandlers; see the section on "Signals" in the perlipc manpage. See also "Process terminated by SIGTERM/SIGINT" in README.os2. BUGS If you find what you think is a bug, you might check the headers of recently posted articles in the comp.lang.perl.misc newsgroup. There may also be information at http://www.perl.com/perl/, the Perl Home Page. If you believe you have an unreported bug, please run the perlbug program included with your release. Make sure you trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of `perl -V', will be sent off to be analysed by the Perl porting team. SEE ALSO The Changes file for exhaustive details on what changed. The INSTALL file for how to build Perl. This file has been significantly updated for 5.004, so even veteran users should look through it. The README file for general stuff. The Copying file for copyright information.
HISTORY
Constructed by Tom Christiansen, grabbing material with permission from innumerable contributors, with kibitzing by more than a few Perl porters.

Last update: Wed May 14 11:14:09 EDT 1997

perlapio section

NAME
perlapio - perl's IO abstraction interface.

SYNOPSIS
    PerlIO *PerlIO_stdin(void);
    PerlIO *PerlIO_stdout(void);
    PerlIO *PerlIO_stderr(void);

    PerlIO *PerlIO_open(const char *,const char *);
    int     PerlIO_close(PerlIO *);

    int     PerlIO_stdoutf(const char *,...);
    int     PerlIO_puts(PerlIO *,const char *);
    int     PerlIO_putc(PerlIO *,int);
    int     PerlIO_write(PerlIO *,const void *,size_t);
    int     PerlIO_printf(PerlIO *, const char *,...);
    int     PerlIO_vprintf(PerlIO *, const char *, va_list);
    int     PerlIO_flush(PerlIO *);

    int     PerlIO_eof(PerlIO *);
    int     PerlIO_error(PerlIO *);
    void    PerlIO_clearerr(PerlIO *);

    int     PerlIO_getc(PerlIO *);
    int     PerlIO_ungetc(PerlIO *,int);
    int     PerlIO_read(PerlIO *,void *,size_t);

    int     PerlIO_fileno(PerlIO *);
    PerlIO *PerlIO_fdopen(int, const char *);
    PerlIO *PerlIO_importFILE(FILE *, int flags);
    FILE   *PerlIO_exportFILE(PerlIO *, int flags);
    FILE   *PerlIO_findFILE(PerlIO *);
    void    PerlIO_releaseFILE(PerlIO *,FILE *);

    void    PerlIO_setlinebuf(PerlIO *);

    long    PerlIO_tell(PerlIO *);
    int     PerlIO_seek(PerlIO *,off_t,int);
    int     PerlIO_getpos(PerlIO *,Fpos_t *);
    int     PerlIO_setpos(PerlIO *,Fpos_t *);
    void    PerlIO_rewind(PerlIO *);

    int     PerlIO_has_base(PerlIO *);
    int     PerlIO_has_cntptr(PerlIO *);
    int     PerlIO_fast_gets(PerlIO *);
    int     PerlIO_canset_cnt(PerlIO *);

    char   *PerlIO_get_ptr(PerlIO *);
    int     PerlIO_get_cnt(PerlIO *);
    void    PerlIO_set_cnt(PerlIO *,int);
    void    PerlIO_set_ptrcnt(PerlIO *,char *,int);
    char   *PerlIO_get_base(PerlIO *);
    int     PerlIO_get_bufsiz(PerlIO *);

DESCRIPTION
Perl's source code should use the above functions instead of those defined in ANSI C's *stdio.h*. The perl headers will `#define' them to the I/O mechanism selected at Configure time.

The functions are modeled on those in *stdio.h*, but parameter order has been "tidied up a little".
PerlIO *
This takes the place of FILE *. Like FILE * it should be treated as opaque (it is probably safe to assume it is a pointer to something).

PerlIO_stdin(), PerlIO_stdout(), PerlIO_stderr()
Use these rather than `stdin', `stdout', `stderr'. They are written to look like "function calls" rather than variables because this makes it easier to *make them* function calls if a platform cannot export data to loaded modules, or if (say) different "threads" might have different values.

PerlIO_open(path, mode), PerlIO_fdopen(fd,mode)
These correspond to fopen()/fdopen(); the arguments are the same.

PerlIO_printf(f,fmt,...), PerlIO_vprintf(f,fmt,a)
These are fprintf()/vfprintf() equivalents.

PerlIO_stdoutf(fmt,...)
This is the printf() equivalent. printf is #defined to this function, so it is (currently) legal to use `printf(fmt,...)' in perl sources.

PerlIO_read(f,buf,count), PerlIO_write(f,buf,count)
These correspond to fread() and fwrite(). Note that the arguments are different: there is only one "count", and the "file" argument comes first.

PerlIO_close(f)

PerlIO_puts(f,s), PerlIO_putc(f,c)
These correspond to fputs() and fputc(). Note that arguments have been revised to have "file" first.

PerlIO_ungetc(f,c)
This corresponds to ungetc(). Note that arguments have been revised to have "file" first.

PerlIO_getc(f)
This corresponds to getc().

PerlIO_eof(f)
This corresponds to feof().

PerlIO_error(f)
This corresponds to ferror().

PerlIO_fileno(f)
This corresponds to fileno(). Note that on some platforms the meaning of "fileno" may not match Unix.

PerlIO_clearerr(f)
This corresponds to clearerr(), i.e., clears 'eof' and 'error' flags for the "stream".

PerlIO_flush(f)
This corresponds to fflush().

PerlIO_tell(f)
This corresponds to ftell().

PerlIO_seek(f,o,w)
This corresponds to fseek().

PerlIO_getpos(f,p), PerlIO_setpos(f,p)
These correspond to fgetpos() and fsetpos(). If the platform does not have the stdio calls then they are implemented in terms of PerlIO_tell() and PerlIO_seek().
PerlIO_rewind(f)
This corresponds to rewind(). Note that it may be redefined in terms of PerlIO_seek() at some point.

PerlIO_tmpfile()
This corresponds to tmpfile(), i.e., returns an anonymous PerlIO which will automatically be deleted when closed.

Co-existence with stdio
There is outline support for co-existence of PerlIO with stdio. Obviously if PerlIO is implemented in terms of stdio there is no problem. However, if perlio is implemented on top of (say) sfio, then mechanisms must exist to create a FILE * which can be passed to library code which is going to use stdio calls.

PerlIO_importFILE(f,flags)
Used to get a PerlIO * from a FILE *. May need additional arguments, interface under review.

PerlIO_exportFILE(f,flags)
Given a PerlIO *, return a 'native' FILE * suitable for passing to code expecting to be compiled and linked with ANSI C *stdio.h*.

The fact that such a FILE * has been 'exported' is recorded, and may affect future PerlIO operations on the original PerlIO *.

PerlIO_findFILE(f)
Returns previously 'exported' FILE * (if any). Place holder until interface is fully defined.

PerlIO_releaseFILE(p,f)
Calling PerlIO_releaseFILE informs PerlIO that all use of FILE * is complete. It is removed from the list of 'exported' FILE *s, and the associated PerlIO * should revert to its original behaviour.

PerlIO_setlinebuf(f)
This corresponds to setlinebuf(). Use is deprecated pending further discussion. (Perl core uses it *only* when "dumping"; it has nothing to do with $| auto-flush.)

In addition to the user API above there is an "implementation" interface which allows perl to get at the internals of PerlIO. The following calls correspond to the various FILE_xxx macros determined by Configure. This section is really of interest only to those concerned with detailed perl-core behaviour or implementing a PerlIO mapping.

PerlIO_has_cntptr(f)
Implementation can return pointer to current position in the "buffer" and a count of bytes available in the buffer.
PerlIO_get_ptr(f) Return pointer to next readable byte in buffer. PerlIO_get_cnt(f) Return count of readable bytes in the buffer. PerlIO_canset_cnt(f) Implementation can adjust its idea of number of bytes in the buffer. PerlIO_fast_gets(f) Implementation has all the interfaces required to allow perl's fast code to handle mechanism. PerlIO_fast_gets(f) = PerlIO_has_cntptr(f) && \ PerlIO_canset_cnt(f) && \ `Can set pointer into buffer' PerlIO_set_ptrcnt(f,p,c) Set pointer into buffer, and a count of bytes still in the buffer. Should be used only to set pointer to within range implied by previous calls to `PerlIO_get_ptr' and `PerlIO_get_cnt'. PerlIO_set_cnt(f,c) Obscure - set count of bytes in the buffer. Deprecated. Currently used in only doio.c to force count < -1 to -1. Perhaps should be PerlIO_set_empty or similar. This call may actually do nothing if "count" is deduced from pointer and a "limit". PerlIO_has_base(f) Implementation has a buffer, and can return pointer to whole buffer and its size. Used by perl for -T / -B tests. Other uses would be very obscure... PerlIO_get_base(f) Return *start* of buffer. PerlIO_get_bufsiz(f) Return *total size* of buffer. perlbook section NAME perlbook - Perl book information DESCRIPTION The Camel Book, officially known as *Programming Perl, Second Edition*, by Larry Wall et al, is the definitive reference work covering nearly all of Perl. You can order it and other Perl books from O'Reilly & Associates, 1-800-998-9938. Local/overseas is +1 707 829 0515. If you can locate an O'Reilly order form, you can also fax to +1 707 829 0104. If you're web-connected, you can even mosey on over to http://www.ora.com/ for an online order form. Other Perl books from various publishers and authors can be found listed in the perlfaq3 manpage. 
perlbot section NAME perlbot - Bag'o Object Tricks (the BOT) DESCRIPTION The following collection of tricks and hints is intended to whet curious appetites about such things as the use of instance variables and the mechanics of object and class relationships. The reader is encouraged to consult relevant textbooks for discussion of Object Oriented definitions and methodology. This is not intended as a tutorial for object-oriented programming or as a comprehensive guide to Perl's object oriented features, nor should it be construed as a style guide. The Perl motto still holds: There's more than one way to do it. OO SCALING TIPS 1 Do not attempt to verify the type of $self. That'll break if the class is inherited, when the type of $self is valid but its package isn't what you expect. See rule 5. 2 If an object-oriented (OO) or indirect-object (IO) syntax was used, then the object is probably the correct type and there's no need to become paranoid about it. Perl isn't a paranoid language anyway. If people subvert the OO or IO syntax then they probably know what they're doing and you should let them do it. See rule 1. 3 Use the two-argument form of bless(). Let a subclass use your constructor. See the section on "INHERITING A CONSTRUCTOR". 4 The subclass is allowed to know things about its immediate superclass, the superclass is allowed to know nothing about a subclass. 5 Don't be trigger happy with inheritance. A "using", "containing", or "delegation" relationship (some sort of aggregation, at least) is often more appropriate. See the section on "OBJECT RELATIONSHIPS", the section on "USING RELATIONSHIP WITH SDBM", and the section on "DELEGATION". 6 The object is the namespace. Make package globals accessible via the object. This will remove the guess work about the symbol's home package. See the section on "CLASS CONTEXT AND THE OBJECT". 7 IO syntax is certainly less noisy, but it is also prone to ambiguities that can cause difficult-to-find bugs. 
Allow people to use the sure-thing OO syntax, even if you don't like it.

8 Do not use function-call syntax on a method. You're going to be bitten someday. Someone might move that method into a superclass and your code will be broken. On top of that you're feeding the paranoia in rule 2.

9 Don't assume you know the home package of a method. You're making it difficult for someone to override that method. See the section on "THINKING OF CODE REUSE".

INSTANCE VARIABLES
An anonymous array or anonymous hash can be used to hold instance variables. Named parameters are also demonstrated.

    package Foo;

    sub new {
        my $type = shift;
        my %params = @_;
        my $self = {};
        $self->{'High'} = $params{'High'};
        $self->{'Low'}  = $params{'Low'};
        bless $self, $type;
    }

    package Bar;

    sub new {
        my $type = shift;
        my %params = @_;
        my $self = [];
        $self->[0] = $params{'Left'};
        $self->[1] = $params{'Right'};
        bless $self, $type;
    }

    package main;

    $a = Foo->new( 'High' => 42, 'Low' => 11 );
    print "High=$a->{'High'}\n";
    print "Low=$a->{'Low'}\n";

    $b = Bar->new( 'Left' => 78, 'Right' => 40 );
    print "Left=$b->[0]\n";
    print "Right=$b->[1]\n";

SCALAR INSTANCE VARIABLES
An anonymous scalar can be used when only one instance variable is needed.

    package Foo;

    sub new {
        my $type = shift;
        my $self;
        $self = shift;
        bless \$self, $type;
    }

    package main;

    $a = Foo->new( 42 );
    print "a=$$a\n";

INSTANCE VARIABLE INHERITANCE
This example demonstrates how one might inherit instance variables from a superclass for inclusion in the new class. This requires calling the superclass's constructor and adding one's own instance variables to the new object.
    package Bar;

    sub new {
        my $type = shift;
        my $self = {};
        $self->{'buz'} = 42;
        bless $self, $type;
    }

    package Foo;
    @ISA = qw( Bar );

    sub new {
        my $type = shift;
        my $self = Bar->new;
        $self->{'biz'} = 11;
        bless $self, $type;
    }

    package main;

    $a = Foo->new;
    print "buz = ", $a->{'buz'}, "\n";
    print "biz = ", $a->{'biz'}, "\n";

OBJECT RELATIONSHIPS
The following demonstrates how one might implement "containing" and "using" relationships between objects.

    package Bar;

    sub new {
        my $type = shift;
        my $self = {};
        $self->{'buz'} = 42;
        bless $self, $type;
    }

    package Foo;

    sub new {
        my $type = shift;
        my $self = {};
        $self->{'Bar'} = Bar->new;
        $self->{'biz'} = 11;
        bless $self, $type;
    }

    package main;

    $a = Foo->new;
    print "buz = ", $a->{'Bar'}->{'buz'}, "\n";
    print "biz = ", $a->{'biz'}, "\n";

OVERRIDING SUPERCLASS METHODS
The following example demonstrates how to override a superclass method and then call the overridden method. The SUPER pseudo-class allows the programmer to call an overridden superclass method without actually knowing where that method is defined.

    package Buz;
    sub goo { print "here's the goo\n" }

    package Bar;
    @ISA = qw( Buz );
    sub google { print "google here\n" }

    package Baz;
    sub mumble { print "mumbling\n" }

    package Foo;
    @ISA = qw( Bar Baz );

    sub new {
        my $type = shift;
        bless [], $type;
    }
    sub grr { print "grumble\n" }
    sub goo {
        my $self = shift;
        $self->SUPER::goo();
    }
    sub mumble {
        my $self = shift;
        $self->SUPER::mumble();
    }
    sub google {
        my $self = shift;
        $self->SUPER::google();
    }

    package main;

    $foo = Foo->new;
    $foo->mumble;
    $foo->grr;
    $foo->goo;
    $foo->google;

USING RELATIONSHIP WITH SDBM
This example demonstrates an interface for the SDBM class. This creates a "using" relationship between the SDBM class and the new class Mydbm.
    package Mydbm;

    require SDBM_File;
    require Tie::Hash;
    @ISA = qw( Tie::Hash );

    sub TIEHASH {
        my $type = shift;
        my $ref  = SDBM_File->new(@_);
        bless {'dbm' => $ref}, $type;
    }
    sub FETCH {
        my $self = shift;
        my $ref  = $self->{'dbm'};
        $ref->FETCH(@_);
    }
    sub STORE {
        my $self = shift;
        if (defined $_[0]){
            my $ref = $self->{'dbm'};
            $ref->STORE(@_);
        } else {
            die "Cannot STORE an undefined key in Mydbm\n";
        }
    }

    package main;
    use Fcntl qw( O_RDWR O_CREAT );

    tie %foo, "Mydbm", "Sdbm", O_RDWR|O_CREAT, 0640;
    $foo{'bar'} = 123;
    print "foo-bar = $foo{'bar'}\n";

    tie %bar, "Mydbm", "Sdbm2", O_RDWR|O_CREAT, 0640;
    $bar{'Cathy'} = 456;
    print "bar-Cathy = $bar{'Cathy'}\n";

THINKING OF CODE REUSE
One strength of Object-Oriented languages is the ease with which old code can use new code. The following examples will demonstrate first how one can hinder code reuse and then how one can promote code reuse.

This first example illustrates a class which uses a fully-qualified method call to access the "private" method BAZ(). The second example will show that it is impossible to override the BAZ() method.

    package FOO;

    sub new {
        my $type = shift;
        bless {}, $type;
    }
    sub bar {
        my $self = shift;
        $self->FOO::private::BAZ;
    }

    package FOO::private;

    sub BAZ {
        print "in BAZ\n";
    }

    package main;

    $a = FOO->new;
    $a->bar;

Now we try to override the BAZ() method. We would like FOO::bar() to call GOOP::BAZ(), but this cannot happen because FOO::bar() explicitly calls FOO::private::BAZ().

    package FOO;

    sub new {
        my $type = shift;
        bless {}, $type;
    }
    sub bar {
        my $self = shift;
        $self->FOO::private::BAZ;
    }

    package FOO::private;

    sub BAZ {
        print "in BAZ\n";
    }

    package GOOP;
    @ISA = qw( FOO );

    sub new {
        my $type = shift;
        bless {}, $type;
    }
    sub BAZ {
        print "in GOOP::BAZ\n";
    }

    package main;

    $a = GOOP->new;
    $a->bar;

To create reusable code we must modify class FOO, flattening class FOO::private. The next example shows a reusable class FOO which allows the method GOOP::BAZ() to be used in place of FOO::BAZ().
    package FOO;

    sub new {
        my $type = shift;
        bless {}, $type;
    }
    sub bar {
        my $self = shift;
        $self->BAZ;
    }
    sub BAZ {
        print "in BAZ\n";
    }

    package GOOP;
    @ISA = qw( FOO );

    sub new {
        my $type = shift;
        bless {}, $type;
    }
    sub BAZ {
        print "in GOOP::BAZ\n";
    }

    package main;

    $a = GOOP->new;
    $a->bar;

CLASS CONTEXT AND THE OBJECT
Use the object to solve package and class context problems. Everything a method needs should be available via the object or should be passed as a parameter to the method.

A class will sometimes have static or global data to be used by the methods. A subclass may want to override that data and replace it with new data. When this happens the superclass may not know how to find the new copy of the data.

This problem can be solved by using the object to define the context of the method. Let the method look in the object for a reference to the data. The alternative is to force the method to go hunting for the data ("Is it in my class, or in a subclass? Which subclass?"), and this can be inconvenient and will lead to hackery. It is better just to let the object tell the method where that data is located.

    package Bar;

    %fizzle = ( 'Password' => 'XYZZY' );

    sub new {
        my $type = shift;
        my $self = {};
        $self->{'fizzle'} = \%fizzle;
        bless $self, $type;
    }

    sub enter {
        my $self = shift;

        # Don't try to guess if we should use %Bar::fizzle
        # or %Foo::fizzle.  The object already knows which
        # we should use, so just ask it.
        #
        my $fizzle = $self->{'fizzle'};

        print "The word is ", $fizzle->{'Password'}, "\n";
    }

    package Foo;
    @ISA = qw( Bar );

    %fizzle = ( 'Password' => 'Rumple' );

    sub new {
        my $type = shift;
        my $self = Bar->new;
        $self->{'fizzle'} = \%fizzle;
        bless $self, $type;
    }

    package main;

    $a = Bar->new;
    $b = Foo->new;
    $a->enter;
    $b->enter;

INHERITING A CONSTRUCTOR
An inheritable constructor should use the second form of bless() which allows blessing directly into a specified class.
Notice in this example that the object will be a BAR not a FOO, even though the constructor is in class FOO.

    package FOO;

    sub new {
        my $type = shift;
        my $self = {};
        bless $self, $type;
    }

    sub baz {
        print "in FOO::baz()\n";
    }

    package BAR;
    @ISA = qw(FOO);

    sub baz {
        print "in BAR::baz()\n";
    }

    package main;

    $a = BAR->new;
    $a->baz;

DELEGATION
Some classes, such as SDBM_File, cannot be effectively subclassed because they create foreign objects. Such a class can be extended with some sort of aggregation technique such as the "using" relationship mentioned earlier or by delegation.

The following example demonstrates delegation using an AUTOLOAD() function to perform message-forwarding. This will allow the Mydbm object to behave exactly like an SDBM_File object. The Mydbm class could now extend the behavior by adding custom FETCH() and STORE() methods, if this is desired.

    package Mydbm;

    require SDBM_File;
    require Tie::Hash;
    @ISA = qw(Tie::Hash);

    sub TIEHASH {
        my $type = shift;
        my $ref  = SDBM_File->new(@_);
        # Two-argument bless, so that a subclass can reuse this constructor.
        bless {'delegate' => $ref}, $type;
    }

    sub AUTOLOAD {
        my $self = shift;

        # The Perl interpreter places the name of the
        # message in a variable called $AUTOLOAD.

        # DESTROY messages should never be propagated.
        return if $AUTOLOAD =~ /::DESTROY$/;

        # Remove the package name.
        $AUTOLOAD =~ s/^Mydbm:://;

        # Pass the message to the delegate.
        $self->{'delegate'}->$AUTOLOAD(@_);
    }

    package main;
    use Fcntl qw( O_RDWR O_CREAT );

    tie %foo, "Mydbm", "adbm", O_RDWR|O_CREAT, 0640;
    $foo{'bar'} = 123;
    print "foo-bar = $foo{'bar'}\n";

perlcall section

NAME
perlcall - Perl calling conventions from C

DESCRIPTION
The purpose of this document is to show you how to call Perl subroutines directly from C, i.e., how to write *callbacks*.

Apart from discussing the C interface provided by Perl for writing callbacks, the document uses a series of examples to show how the interface actually works in practice. In addition, some techniques for coding callbacks are covered.
Examples where callbacks are necessary include * An Error Handler You have created an XSUB interface to an application's C API. A fairly common feature in applications is to allow you to define a C function that will be called whenever something nasty occurs. What we would like is to be able to specify a Perl subroutine that will be called instead. * An Event Driven Program The classic example of where callbacks are used is when writing an event driven program like for an X windows application. In this case you register functions to be called whenever specific events occur, e.g., a mouse button is pressed, the cursor moves into a window or a menu item is selected. Although the techniques described here are applicable when embedding Perl in a C program, this is not the primary goal of this document. There are other details that must be considered and are specific to embedding Perl. For details on embedding Perl in C refer to the perlembed manpage. Before you launch yourself head first into the rest of this document, it would be a good idea to have read the following two documents - the perlxs manpage and the perlguts manpage. THE PERL_CALL FUNCTIONS Although this stuff is easier to explain using examples, you first need be aware of a few important definitions. Perl has a number of C functions that allow you to call Perl subroutines. They are I32 perl_call_sv(SV* sv, I32 flags) ; I32 perl_call_pv(char *subname, I32 flags) ; I32 perl_call_method(char *methname, I32 flags) ; I32 perl_call_argv(char *subname, I32 flags, register char **argv) ; The key function is *perl_call_sv*. All the other functions are fairly simple wrappers which make it easier to call Perl subroutines in special cases. At the end of the day they will all call *perl_call_sv* to invoke the Perl subroutine. All the *perl_call_** functions have a `flags' parameter which is used to pass a bit mask of options to Perl. This bit mask operates identically for each of the functions. 
The settings available in the bit mask are discussed in the section on "FLAG VALUES". Each of the functions will now be discussed in turn. perl_call_sv *perl_call_sv* takes two parameters, the first, `sv', is an SV*. This allows you to specify the Perl subroutine to be called either as a C string (which has first been converted to an SV) or a reference to a subroutine. The section, *Using perl_call_sv*, shows how you can make use of *perl_call_sv*. perl_call_pv The function, *perl_call_pv*, is similar to *perl_call_sv* except it expects its first parameter to be a C char* which identifies the Perl subroutine you want to call, e.g., `perl_call_pv("fred", 0)'. If the subroutine you want to call is in another package, just include the package name in the string, e.g., `"pkg::fred"'. perl_call_method The function *perl_call_method* is used to call a method from a Perl class. The parameter `methname' corresponds to the name of the method to be called. Note that the class that the method belongs to is passed on the Perl stack rather than in the parameter list. This class can be either the name of the class (for a static method) or a reference to an object (for a virtual method). See the perlobj manpage for more information on static and virtual methods and the section on "Using perl_call_method" for an example of using *perl_call_method*. perl_call_argv *perl_call_argv* calls the Perl subroutine specified by the C string stored in the `subname' parameter. It also takes the usual `flags' parameter. The final parameter, `argv', consists of a NULL terminated list of C strings to be passed as parameters to the Perl subroutine. See *Using perl_call_argv*. All the functions return an integer. This is a count of the number of items returned by the Perl subroutine. The actual items returned by the subroutine are stored on the Perl stack. As a general rule you should *always* check the return value from these functions. 
Even if you are expecting only a particular number of values to be returned from the Perl subroutine, there is nothing to stop someone from doing something unexpected - don't say you haven't been warned. FLAG VALUES The `flags' parameter in all the *perl_call_** functions is a bit mask which can consist of any combination of the symbols defined below, OR'ed together. G_VOID Calls the Perl subroutine in a void context. This flag has 2 effects: 1. It indicates to the subroutine being called that it is executing in a void context (if it executes *wantarray* the result will be the undefined value). 2. It ensures that nothing is actually returned from the subroutine. The value returned by the *perl_call_** function indicates how many items have been returned by the Perl subroutine - in this case it will be 0. G_SCALAR Calls the Perl subroutine in a scalar context. This is the default context flag setting for all the *perl_call_** functions. This flag has 2 effects: 1. It indicates to the subroutine being called that it is executing in a scalar context (if it executes *wantarray* the result will be false). 2. It ensures that only a scalar is actually returned from the subroutine. The subroutine can, of course, ignore the *wantarray* and return a list anyway. If so, then only the last element of the list will be returned. The value returned by the *perl_call_** function indicates how many items have been returned by the Perl subroutine - in this case it will be either 0 or 1. If 0, then you have specified the G_DISCARD flag. If 1, then the item actually returned by the Perl subroutine will be stored on the Perl stack - the section *Returning a Scalar* shows how to access this value on the stack. Remember that regardless of how many items the Perl subroutine returns, only the last one will be accessible from the stack - think of the case where only one value is returned as being a list with only one element. 
Any other items that were returned will not exist by the time control returns from the *perl_call_** function. The section *Returning a list in a scalar context* shows an example of this behavior. G_ARRAY Calls the Perl subroutine in a list context. As with G_SCALAR, this flag has 2 effects: 1. It indicates to the subroutine being called that it is executing in an array context (if it executes *wantarray* the result will be true). 2. It ensures that all items returned from the subroutine will be accessible when control returns from the *perl_call_** function. The value returned by the *perl_call_** function indicates how many items have been returned by the Perl subroutine. If 0, then you have specified the G_DISCARD flag. If not 0, then it will be a count of the number of items returned by the subroutine. These items will be stored on the Perl stack. The section *Returning a list of values* gives an example of using the G_ARRAY flag and the mechanics of accessing the returned items from the Perl stack. G_DISCARD By default, the *perl_call_** functions place the items returned from by the Perl subroutine on the stack. If you are not interested in these items, then setting this flag will make Perl get rid of them automatically for you. Note that it is still possible to indicate a context to the Perl subroutine by using either G_SCALAR or G_ARRAY. If you do not set this flag then it is *very* important that you make sure that any temporaries (i.e., parameters passed to the Perl subroutine and values returned from the subroutine) are disposed of yourself. The section *Returning a Scalar* gives details of how to dispose of these temporaries explicitly and the section *Using Perl to dispose of temporaries* discusses the specific circumstances where you can ignore the problem and let Perl deal with it for you. 
G_NOARGS
Whenever a Perl subroutine is called using one of the *perl_call_** functions, it is assumed by default that parameters are to be passed to the subroutine. If you are not passing any parameters to the Perl subroutine, you can save a bit of time by setting this flag. It has the effect of not creating the `@_' array for the Perl subroutine.

Although the functionality provided by this flag may seem straightforward, it should be used only if there is a good reason to do so. The reason for being cautious is that even if you have specified the G_NOARGS flag, it is still possible for the Perl subroutine that has been called to think that you have passed it parameters.

In fact, what can happen is that the Perl subroutine you have called can access the `@_' array from a previous Perl subroutine. This will occur when the code that is executing the *perl_call_** function has itself been called from another Perl subroutine. The code below illustrates this

    sub fred
      { print "@_\n"  }

    sub joe
      { &fred }

    &joe(1,2,3) ;

This will print

    1 2 3

What has happened is that `fred' accesses the `@_' array which belongs to `joe'.

G_EVAL
It is possible for the Perl subroutine you are calling to terminate abnormally, e.g., by calling *die* explicitly or by not actually existing. By default, when either of these events occurs, the process will terminate immediately. If you want to trap this type of event, specify the G_EVAL flag. It will put an *eval { }* around the subroutine call.

Whenever control returns from the *perl_call_** function you need to check the `$@' variable as you would in a normal Perl script.

The value returned from the *perl_call_** function is dependent on what other flags have been specified and whether an error has occurred. Here are all the different cases that can occur:

* If the *perl_call_** function returns normally, then the value returned is as specified in the previous sections.

* If G_DISCARD is specified, the return value will always be 0.
* If G_ARRAY is specified *and* an error has occurred, the return value will always be 0. * If G_SCALAR is specified *and* an error has occurred, the return value will be 1 and the value on the top of the stack will be *undef*. This means that if you have already detected the error by checking `$@' and you want the program to continue, you must remember to pop the *undef* from the stack. See *Using G_EVAL* for details on using G_EVAL. G_KEEPERR You may have noticed that using the G_EVAL flag described above will always clear the `$@' variable and set it to a string describing the error iff there was an error in the called code. This unqualified resetting of `$@' can be problematic in the reliable identification of errors using the `eval {}' mechanism, because the possibility exists that perl will call other code (end of block processing code, for example) between the time the error causes `$@' to be set within `eval {}', and the subsequent statement which checks for the value of `$@' gets executed in the user's script. This scenario will mostly be applicable to code that is meant to be called from within destructors, asynchronous callbacks, signal handlers, `__DIE__' or `__WARN__' hooks, and `tie' functions. In such situations, you will not want to clear `$@' at all, but simply to append any new errors to any existing value of `$@'. The G_KEEPERR flag is meant to be used in conjunction with G_EVAL in *perl_call_** functions that are used to implement such code. This flag has no effect when G_EVAL is not used. When G_KEEPERR is used, any errors in the called code will be prefixed with the string "\t(in cleanup)", and appended to the current value of `$@'. The G_KEEPERR flag was introduced in Perl version 5.002. See *Using G_KEEPERR* for an example of a situation that warrants the use of this flag. Determining the Context As mentioned above, you can determine the context of the currently executing subroutine in Perl with *wantarray*. 
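The three results of *wantarray* described in the section on "FLAG VALUES" (undefined, false, true) can be observed from plain Perl. What follows is a minimal sketch, not code from the Perl distribution: the names `context_name', `probe', and `three' are made up for illustration. It also shows the rule that a list returned into a scalar context yields only its last element.

```perl
use strict;
use warnings;

# Hypothetical helper: map a wantarray result to a readable name.
sub context_name {
    my ($w) = @_;
    return 'void'   unless defined $w;   # undef means void context
    return $w ? 'list' : 'scalar';       # true means list, false means scalar
}

my @seen;

# Hypothetical probe: records the context it was called in.
sub probe {
    push @seen, context_name(wantarray);
    return;
}

probe();               # result unused: void context
my $scalar = probe();  # scalar context
my @list   = probe();  # list context

print "@seen\n";       # prints "void scalar list"

# A sub that ignores wantarray and returns a list anyway.
sub three { return (10, 20, 30); }

my $last = three();    # scalar context keeps only the last element
print "$last\n";       # prints "30"
```

The same three-way distinction is what a called subroutine sees when C code selects G_VOID, G_SCALAR, or G_ARRAY.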
The equivalent test can be made in C by using the `GIMME_V' macro, which returns `G_ARRAY' if you have been called in an array context, `G_SCALAR' if in a scalar context, or `G_VOID' if in a void context (i.e. the return value will not be used). An older version of this macro is called `GIMME'; in a void context it returns `G_SCALAR' instead of `G_VOID'. An example of using the `GIMME_V' macro is shown in section *Using GIMME_V*.

KNOWN PROBLEMS
This section outlines all known problems that exist in the *perl_call_** functions.

1. If you are intending to make use of both the G_EVAL and G_SCALAR flags in your code, use a version of Perl greater than 5.000. There is a bug in version 5.000 of Perl which means that the combination of these two flags will not work as described in the section *FLAG VALUES*.

Specifically, if the two flags are used when calling a subroutine and that subroutine does not call *die*, the value returned by *perl_call_** will be wrong.

2. In Perl 5.000 and 5.001 there is a problem with using *perl_call_** if the Perl sub you are calling attempts to trap a *die*.

The symptom of this problem is that the called Perl sub will continue to completion, but whenever it attempts to pass control back to the XSUB, the program will immediately terminate.

For example, say you want to call this Perl sub

    sub fred
    {
        eval { die "Fatal Error" ; } ;
        print "Trapped error: $@\n"
            if $@ ;
    }

via this XSUB

    void
    Call_fred()
        CODE:
        PUSHMARK(SP) ;
        perl_call_pv("fred", G_DISCARD|G_NOARGS) ;
        fprintf(stderr, "back in Call_fred\n") ;

When `Call_fred' is executed it will print

    Trapped error: Fatal Error

As control never returns to `Call_fred', the `"back in Call_fred"' string will not get printed.
To work around this problem, you can either upgrade to Perl 5.002 or higher, or use the G_EVAL flag with *perl_call_** as shown below void Call_fred() CODE: PUSHMARK(SP) ; perl_call_pv("fred", G_EVAL|G_DISCARD|G_NOARGS) ; fprintf(stderr, "back in Call_fred\n") ; EXAMPLES Enough of the definition talk, let's have a few examples. Perl provides many macros to assist in accessing the Perl stack. Wherever possible, these macros should always be used when interfacing to Perl internals. We hope this should make the code less vulnerable to any changes made to Perl in the future. Another point worth noting is that in the first series of examples I have made use of only the *perl_call_pv* function. This has been done to keep the code simpler and ease you into the topic. Wherever possible, if the choice is between using *perl_call_pv* and *perl_call_sv*, you should always try to use *perl_call_sv*. See *Using perl_call_sv* for details. No Parameters, Nothing returned This first trivial example will call a Perl subroutine, *PrintUID*, to print out the UID of the process. sub PrintUID { print "UID is $<\n" ; } and here is a C function to call it static void call_PrintUID() { dSP ; PUSHMARK(SP) ; perl_call_pv("PrintUID", G_DISCARD|G_NOARGS) ; } Simple, eh. A few points to note about this example. 1. Ignore `dSP' and `PUSHMARK(SP)' for now. They will be discussed in the next example. 2. We aren't passing any parameters to *PrintUID* so G_NOARGS can be specified. 3. We aren't interested in anything returned from *PrintUID*, so G_DISCARD is specified. Even if *PrintUID* was changed to return some value(s), having specified G_DISCARD will mean that they will be wiped by the time control returns from *perl_call_pv*. 4. As *perl_call_pv* is being used, the Perl subroutine is specified as a C string. In this case the subroutine name has been 'hard-wired' into the code. 5. Because we specified G_DISCARD, it is not necessary to check the value returned from *perl_call_pv*. 
It will always be 0. Passing Parameters Now let's make a slightly more complex example. This time we want to call a Perl subroutine, `LeftString', which will take 2 parameters - a string (`$s') and an integer (`$n'). The subroutine will simply print the first `$n' characters of the string. So the Perl subroutine would look like this sub LeftString { my($s, $n) = @_ ; print substr($s, 0, $n), "\n" ; } The C function required to call *LeftString* would look like this. static void call_LeftString(a, b) char * a ; int b ; { dSP ; ENTER ; SAVETMPS ; PUSHMARK(SP) ; XPUSHs(sv_2mortal(newSVpv(a, 0))); XPUSHs(sv_2mortal(newSViv(b))); PUTBACK ; perl_call_pv("LeftString", G_DISCARD); FREETMPS ; LEAVE ; } Here are a few notes on the C function *call_LeftString*. 1. Parameters are passed to the Perl subroutine using the Perl stack. This is the purpose of the code beginning with the line `dSP' and ending with the line `PUTBACK'. The `dSP' declares a local copy of the stack pointer. This local copy should always be accessed as `SP'. 2. If you are going to put something onto the Perl stack, you need to know where to put it. This is the purpose of the macro `dSP' - it declares and initializes a *local* copy of the Perl stack pointer. All the other macros which will be used in this example require you to have used this macro. The exception to this rule is if you are calling a Perl subroutine directly from an XSUB function. In this case it is not necessary to use the `dSP' macro explicitly - it will be declared for you automatically. 3. Any parameters to be pushed onto the stack should be bracketed by the `PUSHMARK' and `PUTBACK' macros. The purpose of these two macros, in this context, is to count the number of parameters you are pushing automatically. Then whenever Perl is creating the `@_' array for the subroutine, it knows how big to make it. The `PUSHMARK' macro tells Perl to make a mental note of the current stack pointer. 
Even if you aren't passing any parameters (like the example shown in the section *No Parameters, Nothing returned*) you must still call the `PUSHMARK' macro before you can call any of the *perl_call_** functions - Perl still needs to know that there are no parameters. The `PUTBACK' macro sets the global copy of the stack pointer to be the same as our local copy. If we didn't do this *perl_call_pv* wouldn't know where the two parameters we pushed were - remember that up to now all the stack pointer manipulation we have done is with our local copy, *not* the global copy. 4. The only flag specified this time is G_DISCARD. Because we are passing 2 parameters to the Perl subroutine this time, we have not specified G_NOARGS. 5. Next, we come to XPUSHs. This is where the parameters actually get pushed onto the stack. In this case we are pushing a string and an integer. See the section on "XSUBs and the Argument Stack" in the perlguts manpage for details on how the XPUSH macros work. 6. Because we created temporary values (by means of sv_2mortal() calls) we will have to tidy up the Perl stack and dispose of mortal SVs. This is the purpose of ENTER ; SAVETMPS ; at the start of the function, and FREETMPS ; LEAVE ; at the end. The `ENTER'/`SAVETMPS' pair creates a boundary for any temporaries we create. This means that the temporaries we get rid of will be limited to those which were created after these calls. The `FREETMPS'/`LEAVE' pair will get rid of any values returned by the Perl subroutine (see next example), plus it will also dump the mortal SVs we have created. Having `ENTER'/`SAVETMPS' at the beginning of the code makes sure that no other mortals are destroyed. Think of these macros as working a bit like using `{' and `}' in Perl to limit the scope of local variables. See the section *Using Perl to dispose of temporaries* for details of an alternative to using these macros. 7. Finally, *LeftString* can now be called via the *perl_call_pv* function. 
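The relationship between `PUSHMARK', the local stack pointer and `PUTBACK' described in the notes above can be sketched with a toy model in plain C. This is only an illustration under stated assumptions: every `toy_' name is hypothetical, the stack holds plain integers, and none of this is the Perl API or how Perl implements its stack. It shows only why a callee that counts its arguments from the mark to the global stack pointer sees nothing until the local copy has been published.

```c
#include <assert.h>

/* Toy model only: these names are hypothetical and are NOT the Perl API. */
#define TOY_STACK_MAX 32

static int  toy_stack[TOY_STACK_MAX];   /* the shared value stack             */
static int *toy_global_sp = toy_stack;  /* "global" stack pointer (next slot) */
static int *toy_mark      = toy_stack;  /* position noted by the mark step    */

/* Callee side: the arguments are whatever lies between the mark and the
   global stack pointer. */
static int toy_count_args(void)
{
    return (int)(toy_global_sp - toy_mark);
}

/* Caller side: push two values through a local stack pointer and return
   how many arguments the callee would see. */
static int toy_call_with_two_args(int a, int b, int do_putback)
{
    int *sp    = toy_global_sp;   /* like dSP: local copy of the pointer   */
    int *saved = toy_global_sp;   /* remembered so the toy stack can reset */
    int  seen;

    toy_mark = sp;                /* like PUSHMARK(SP): note the start     */
    assert(sp + 2 <= toy_stack + TOY_STACK_MAX);
    *sp++ = a;                    /* like XPUSHs: only the local copy moves */
    *sp++ = b;

    if (do_putback)
        toy_global_sp = sp;       /* like PUTBACK: publish the local copy  */

    seen = toy_count_args();
    toy_global_sp = saved;        /* reset the toy stack for the next call */
    return seen;
}
```

In this sketch, toy_call_with_two_args(7, 4, 1) reports two arguments, while passing 0 as the last argument (skipping the publish step) leaves the callee with none.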
Returning a Scalar Now for an example of dealing with the items returned from a Perl subroutine. Here is a Perl subroutine, *Adder*, that takes 2 integer parameters and simply returns their sum. sub Adder { my($a, $b) = @_ ; $a + $b ; } Because we are now concerned with the return value from *Adder*, the C function required to call it is now a bit more complex. static void call_Adder(a, b) int a ; int b ; { dSP ; int count ; ENTER ; SAVETMPS; PUSHMARK(SP) ; XPUSHs(sv_2mortal(newSViv(a))); XPUSHs(sv_2mortal(newSViv(b))); PUTBACK ; count = perl_call_pv("Adder", G_SCALAR); SPAGAIN ; if (count != 1) croak("Big trouble\n") ; printf ("The sum of %d and %d is %d\n", a, b, POPi) ; PUTBACK ; FREETMPS ; LEAVE ; } Points to note this time are 1. The only flag specified this time was G_SCALAR. That means the `@_' array will be created and that the value returned by *Adder* will still exist after the call to *perl_call_pv*. 2. The purpose of the macro `SPAGAIN' is to refresh the local copy of the stack pointer. This is necessary because it is possible that the memory allocated to the Perl stack has been reallocated whilst in the *perl_call_pv* call. If you are making use of the Perl stack pointer in your code you must always refresh the local copy using SPAGAIN whenever you make use of the *perl_call_** functions or any other Perl internal function. 3. Although only a single value was expected to be returned from *Adder*, it is still good practice to check the return code from *perl_call_pv* anyway. Expecting a single value is not quite the same as knowing that there will be one. If someone modified *Adder* to return a list and we didn't check for that possibility and take appropriate action the Perl stack would end up in an inconsistent state. That is something you *really* don't want to happen ever. 4. The `POPi' macro is used here to pop the return value from the stack. In this case we wanted an integer, so `POPi' was used. 
Here is the complete list of POP macros available, along with the types they return.

        POPs    SV
        POPp    pointer
        POPn    double
        POPi    integer
        POPl    long

5.  The final `PUTBACK' is used to leave the Perl stack in a consistent state before exiting the function. This is necessary because when we popped the return value from the stack with `POPi' it updated only our local copy of the stack pointer. Remember, `PUTBACK' sets the global stack pointer to be the same as our local copy.

Returning a list of values

Now, let's extend the previous example to return both the sum of the parameters and the difference.

Here is the Perl subroutine

    sub AddSubtract
    {
       my($a, $b) = @_ ;
       ($a+$b, $a-$b) ;
    }

and this is the C function

    static void
    call_AddSubtract(a, b)
    int a ;
    int b ;
    {
        dSP ;
        int count ;

        ENTER ;
        SAVETMPS;

        PUSHMARK(SP) ;
        XPUSHs(sv_2mortal(newSViv(a)));
        XPUSHs(sv_2mortal(newSViv(b)));
        PUTBACK ;

        count = perl_call_pv("AddSubtract", G_ARRAY);

        SPAGAIN ;

        if (count != 2)
            croak("Big trouble\n") ;

        printf ("%d - %d = %d\n", a, b, POPi) ;
        printf ("%d + %d = %d\n", a, b, POPi) ;

        PUTBACK ;
        FREETMPS ;
        LEAVE ;
    }

If *call_AddSubtract* is called like this

    call_AddSubtract(7, 4) ;

then here is the output

    7 - 4 = 3
    7 + 4 = 11

Notes

1.  We wanted array context, so G_ARRAY was used.

2.  Not surprisingly `POPi' is used twice this time because we were retrieving 2 values from the stack. The important thing to note is that when using the `POP*' macros they come off the stack in *reverse* order.
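The reverse ordering noted above follows from the stack being last-in, first-out. The sketch below models that in plain C. It is a toy under stated assumptions: all `toy_' names are hypothetical, the values are plain integers, and nothing here is the Perl API. The stand-in for *AddSubtract* pushes the sum and then the difference, so the first pop yields the difference.

```c
#include <assert.h>

/* Toy model only: hypothetical names, not the Perl API. */
#define TOY_RET_MAX 8

struct toy_ret_stack {
    int items[TOY_RET_MAX];
    int top;                  /* number of items currently on the stack */
};

static void toy_push(struct toy_ret_stack *s, int v)
{
    assert(s->top < TOY_RET_MAX);
    s->items[s->top++] = v;
}

/* Like the POP* macros: removes and returns the most recently pushed item. */
static int toy_pop(struct toy_ret_stack *s)
{
    assert(s->top > 0);
    return s->items[--s->top];
}

/* Stand-in for AddSubtract: "returns" (a+b, a-b) by pushing them in that
   order. The return value plays the role of the count from perl_call_pv. */
static int toy_add_subtract(struct toy_ret_stack *s, int a, int b)
{
    toy_push(s, a + b);
    toy_push(s, a - b);
    return 2;
}

/* Caller: check the count first, then pop. The first pop yields the LAST
   value that was returned. */
static void toy_call_add_subtract(int a, int b, int *sum, int *diff)
{
    struct toy_ret_stack s = { {0}, 0 };
    int count = toy_add_subtract(&s, a, b);

    assert(count == 2);
    *diff = toy_pop(&s);      /* a - b comes off first  */
    *sum  = toy_pop(&s);      /* a + b comes off second */
}
```

With the arguments 7 and 4, the first value popped is 3 and the second is 11, matching the order of the two output lines shown for *call_AddSubtract*.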
Returning a list in a scalar context

Say the Perl subroutine in the previous section was called in a scalar context, like this

    static void
    call_AddSubScalar(a, b)
    int a ;
    int b ;
    {
        dSP ;
        int count ;
        int i ;

        ENTER ;
        SAVETMPS;

        PUSHMARK(SP) ;
        XPUSHs(sv_2mortal(newSViv(a)));
        XPUSHs(sv_2mortal(newSViv(b)));
        PUTBACK ;

        count = perl_call_pv("AddSubtract", G_SCALAR);

        SPAGAIN ;

        printf ("Items Returned = %d\n", count) ;

        for (i = 1 ; i <= count ; ++i)
            printf ("Value %d = %d\n", i, POPi) ;

        PUTBACK ;
        FREETMPS ;
        LEAVE ;
    }

The other modification made is that *call_AddSubScalar* will print the number of items returned from the Perl subroutine and their values (for simplicity it assumes that they are integers). So if *call_AddSubScalar* is called

    call_AddSubScalar(7, 4) ;

then the output will be

    Items Returned = 1
    Value 1 = 3

In this case the main point to note is that only the last item in the list returned by the subroutine *AddSubtract* actually made it back to *call_AddSubScalar*.

Returning Data from Perl via the parameter list

It is also possible to return values directly via the parameter list - whether it is actually desirable to do it is another matter entirely.

The Perl subroutine, *Inc*, below takes 2 parameters and increments each directly.

    sub Inc
    {
        ++ $_[0] ;
        ++ $_[1] ;
    }

and here is a C function to call it.

    static void
    call_Inc(a, b)
    int a ;
    int b ;
    {
        dSP ;
        int count ;
        SV * sva ;
        SV * svb ;

        ENTER ;
        SAVETMPS;

        sva = sv_2mortal(newSViv(a)) ;
        svb = sv_2mortal(newSViv(b)) ;

        PUSHMARK(SP) ;
        XPUSHs(sva);
        XPUSHs(svb);
        PUTBACK ;

        count = perl_call_pv("Inc", G_DISCARD);

        if (count != 0)
            croak ("call_Inc: expected 0 values from 'Inc', got %d\n",
                   count) ;

        printf ("%d + 1 = %d\n", a, SvIV(sva)) ;
        printf ("%d + 1 = %d\n", b, SvIV(svb)) ;

        FREETMPS ;
        LEAVE ;
    }

To be able to access the two parameters that were pushed onto the stack after they return from *perl_call_pv* it is necessary to make a note of their addresses - thus the two variables `sva' and `svb'.
The reason this is necessary is that the area of the Perl stack which held them will very likely have been overwritten by something else by the time control returns from *perl_call_pv*. Using G_EVAL Now an example using G_EVAL. Below is a Perl subroutine which computes the difference of its 2 parameters. If this would result in a negative result, the subroutine calls *die*. sub Subtract { my ($a, $b) = @_ ; die "death can be fatal\n" if $a < $b ; $a - $b ; } and some C to call it static void call_Subtract(a, b) int a ; int b ; { dSP ; int count ; ENTER ; SAVETMPS; PUSHMARK(SP) ; XPUSHs(sv_2mortal(newSViv(a))); XPUSHs(sv_2mortal(newSViv(b))); PUTBACK ; count = perl_call_pv("Subtract", G_EVAL|G_SCALAR); SPAGAIN ; /* Check the eval first */ if (SvTRUE(ERRSV)) { STRLEN n_a; printf ("Uh oh - %s\n", SvPV(ERRSV, n_a)) ; POPs ; } else { if (count != 1) croak("call_Subtract: wanted 1 value from 'Subtract', got %d\n", count) ; printf ("%d - %d = %d\n", a, b, POPi) ; } PUTBACK ; FREETMPS ; LEAVE ; } If *call_Subtract* is called thus call_Subtract(4, 5) the following will be printed Uh oh - death can be fatal Notes 1. We want to be able to catch the *die* so we have used the G_EVAL flag. Not specifying this flag would mean that the program would terminate immediately at the *die* statement in the subroutine *Subtract*. 2. The code if (SvTRUE(ERRSV)) { STRLEN n_a; printf ("Uh oh - %s\n", SvPV(ERRSV, n_a)) ; POPs ; } is the direct equivalent of this bit of Perl print "Uh oh - $@\n" if $@ ; `PL_errgv' is a perl global of type `GV *' that points to the symbol table entry containing the error. `ERRSV' therefore refers to the C equivalent of `$@'. 3. Note that the stack is popped using `POPs' in the block where `SvTRUE(ERRSV)' is true. This is necessary because whenever a *perl_call_** function invoked with G_EVAL|G_SCALAR returns an error, the top of the stack holds the value *undef*. 
Because we want the program to continue after detecting this error, it is essential that the stack is tidied up by removing the *undef*. Using G_KEEPERR Consider this rather facetious example, where we have used an XS version of the call_Subtract example above inside a destructor: package Foo; sub new { bless {}, $_[0] } sub Subtract { my($a,$b) = @_; die "death can be fatal" if $a < $b ; $a - $b; } sub DESTROY { call_Subtract(5, 4); } sub foo { die "foo dies"; } package main; eval { Foo->new->foo }; print "Saw: $@" if $@; # should be, but isn't This example will fail to recognize that an error occurred inside the `eval {}'. Here's why: the call_Subtract code got executed while perl was cleaning up temporaries when exiting the eval block, and because call_Subtract is implemented with *perl_call_pv* using the G_EVAL flag, it promptly reset `$@'. This results in the failure of the outermost test for `$@', and thereby the failure of the error trap. Appending the G_KEEPERR flag, so that the *perl_call_pv* call in call_Subtract reads: count = perl_call_pv("Subtract", G_EVAL|G_SCALAR|G_KEEPERR); will preserve the error and restore reliable error handling. Using perl_call_sv In all the previous examples I have 'hard-wired' the name of the Perl subroutine to be called from C. Most of the time though, it is more convenient to be able to specify the name of the Perl subroutine from within the Perl script. Consider the Perl code below sub fred { print "Hello there\n" ; } CallSubPV("fred") ; Here is a snippet of XSUB which defines *CallSubPV*. void CallSubPV(name) char * name CODE: PUSHMARK(SP) ; perl_call_pv(name, G_DISCARD|G_NOARGS) ; That is fine as far as it goes. The thing is, the Perl subroutine can be specified as only a string. For Perl 4 this was adequate, but Perl 5 allows references to subroutines and anonymous subroutines. This is where *perl_call_sv* is useful. 
The code below for *CallSubSV* is identical to *CallSubPV* except that the `name' parameter is now defined as an SV* and we use *perl_call_sv* instead of *perl_call_pv*.

    void
    CallSubSV(name)
        SV *    name
        CODE:
        PUSHMARK(SP) ;
        perl_call_sv(name, G_DISCARD|G_NOARGS) ;

Because we are using an SV to call *fred* the following can all be used

    CallSubSV("fred") ;
    CallSubSV(\&fred) ;
    $ref = \&fred ;
    CallSubSV($ref) ;
    CallSubSV( sub { print "Hello there\n" } ) ;

As you can see, *perl_call_sv* gives you much greater flexibility in how you can specify the Perl subroutine.

You should note that if it is necessary to store the SV (`name' in the example above) which corresponds to the Perl subroutine so that it can be used later in the program, it is not enough just to store a copy of the pointer to the SV. Say the code above had been like this

    static SV * rememberSub ;

    void
    SaveSub1(name)
        SV *    name
        CODE:
        rememberSub = name ;

    void
    CallSavedSub1()
        CODE:
        PUSHMARK(SP) ;
        perl_call_sv(rememberSub, G_DISCARD|G_NOARGS) ;

The reason this is wrong is that by the time you come to use the pointer `rememberSub' in `CallSavedSub1', it may or may not still refer to the Perl subroutine that was recorded in `SaveSub1'. This is particularly true for these cases

    SaveSub1(\&fred) ;
    CallSavedSub1() ;

    SaveSub1( sub { print "Hello there\n" } ) ;
    CallSavedSub1() ;

By the time each of the `SaveSub1' statements above has been executed, the SV*s which corresponded to the parameters will no longer exist. Expect an error message from Perl of the form

    Can't use an undefined value as a subroutine reference at ...

for each of the `CallSavedSub1' lines.

Similarly, with this code

    $ref = \&fred ;
    SaveSub1($ref) ;
    $ref = 47 ;
    CallSavedSub1() ;

you can expect one of these messages (which one you actually get depends on the version of Perl you are using)

    Not a CODE reference at ...
    Undefined subroutine &main::47 called ...
The variable `$ref' may have referred to the subroutine `fred' whenever the call to `SaveSub1' was made but by the time `CallSavedSub1' gets called it now holds the number `47'. Because we saved only a pointer to the original SV in `SaveSub1', any changes to `$ref' will be tracked by the pointer `rememberSub'. This means that whenever `CallSavedSub1' gets called, it will attempt to execute the code which is referenced by the SV* `rememberSub'. In this case though, it now refers to the integer `47', so expect Perl to complain loudly.

A similar but more subtle problem is illustrated with this code

    $ref = \&fred ;
    SaveSub1($ref) ;
    $ref = \&joe ;
    CallSavedSub1() ;

This time whenever `CallSavedSub1' gets called it will execute the Perl subroutine `joe' (assuming it exists) rather than `fred' as was originally requested in the call to `SaveSub1'.

To get around these problems it is necessary to take a full copy of the SV. The code below shows `SaveSub2' modified to do that

    static SV * keepSub = (SV*)NULL ;

    void
    SaveSub2(name)
        SV *    name
        CODE:
        /* Take a copy of the callback */
        if (keepSub == (SV*)NULL)
            /* First time, so create a new SV */
            keepSub = newSVsv(name) ;
        else
            /* Been here before, so overwrite */
            SvSetSV(keepSub, name) ;

    void
    CallSavedSub2()
        CODE:
        PUSHMARK(SP) ;
        perl_call_sv(keepSub, G_DISCARD|G_NOARGS) ;

To avoid creating a new SV every time `SaveSub2' is called, the function first checks to see if it has been called before. If not, then space for a new SV is allocated and the reference to the Perl subroutine, `name', is copied to the variable `keepSub' in one operation using `newSVsv'. Thereafter, whenever `SaveSub2' is called the existing SV, `keepSub', is overwritten with the new value using `SvSetSV'.

Using perl_call_argv

Here is a Perl subroutine which prints whatever parameters are passed to it.

    sub PrintList
    {
        my(@list) = @_ ;

        foreach (@list) { print "$_\n" }
    }

and here is an example of *perl_call_argv* which will call *PrintList*.
    static char * words[] = {"alpha", "beta", "gamma", "delta", NULL} ;

    static void
    call_PrintList()
    {
        dSP ;

        perl_call_argv("PrintList", G_DISCARD, words) ;
    }

Note that it is not necessary to call `PUSHMARK' in this instance. This is because *perl_call_argv* will do it for you.

Using perl_call_method

Consider the following Perl code

    {
        package Mine ;

        sub new
        {
            my($type) = shift ;
            bless [@_]
        }

        sub Display
        {
            my ($self, $index) = @_ ;
            print "$index: $$self[$index]\n" ;
        }

        sub PrintID
        {
            my($class) = @_ ;
            print "This is Class $class version 1.0\n" ;
        }
    }

It implements just a very simple class to manage an array. Apart from the constructor, `new', it declares two methods, one static and one virtual. The static method, `PrintID', simply prints out the class name and a version number. The virtual method, `Display', prints out a single element of the array. Here is an all Perl example of using it.

    $a = new Mine ('red', 'green', 'blue') ;
    $a->Display(1) ;
    PrintID Mine;

will print

    1: green
    This is Class Mine version 1.0

Calling a Perl method from C is fairly straightforward. The following things are required

*   a reference to the object for a virtual method or the name of the class for a static method.

*   the name of the method.

*   any other parameters specific to the method.

Here is a simple XSUB which illustrates the mechanics of calling both the `PrintID' and `Display' methods from C.
void call_Method(ref, method, index) SV * ref char * method int index CODE: PUSHMARK(SP); XPUSHs(ref); XPUSHs(sv_2mortal(newSViv(index))) ; PUTBACK; perl_call_method(method, G_DISCARD) ; void call_PrintID(class, method) char * class char * method CODE: PUSHMARK(SP); XPUSHs(sv_2mortal(newSVpv(class, 0))) ; PUTBACK; perl_call_method(method, G_DISCARD) ; So the methods `PrintID' and `Display' can be invoked like this $a = new Mine ('red', 'green', 'blue') ; call_Method($a, 'Display', 1) ; call_PrintID('Mine', 'PrintID') ; The only thing to note is that in both the static and virtual methods, the method name is not passed via the stack - it is used as the first parameter to *perl_call_method*. Using GIMME_V Here is a trivial XSUB which prints the context in which it is currently executing. void PrintContext() CODE: I32 gimme = GIMME_V; if (gimme == G_VOID) printf ("Context is Void\n") ; else if (gimme == G_SCALAR) printf ("Context is Scalar\n") ; else printf ("Context is Array\n") ; and here is some Perl to test it PrintContext ; $a = PrintContext ; @a = PrintContext ; The output from that will be Context is Void Context is Scalar Context is Array Using Perl to dispose of temporaries In the examples given to date, any temporaries created in the callback (i.e., parameters passed on the stack to the *perl_call_** function or values returned via the stack) have been freed by one of these methods * specifying the G_DISCARD flag with *perl_call_**. * explicitly disposed of using the `ENTER'/`SAVETMPS' - `FREETMPS'/`LEAVE' pairing. There is another method which can be used, namely letting Perl do it for you automatically whenever it regains control after the callback has terminated. This is done by simply not using the ENTER ; SAVETMPS ; ... FREETMPS ; LEAVE ; sequence in the callback (and not, of course, specifying the G_DISCARD flag). If you are going to use this method you have to be aware of a possible memory leak which can arise under very specific circumstances. 
To explain these circumstances you need to know a bit about the flow of control between Perl and the callback routine. The examples given at the start of the document (an error handler and an event driven program) are typical of the two main sorts of flow control that you are likely to encounter with callbacks. There is a very important distinction between them, so pay attention. In the first example, an error handler, the flow of control could be as follows. You have created an interface to an external library. Control can reach the external library like this perl --> XSUB --> external library Whilst control is in the library, an error condition occurs. You have previously set up a Perl callback to handle this situation, so it will get executed. Once the callback has finished, control will drop back to Perl again. Here is what the flow of control will be like in that situation perl --> XSUB --> external library ... error occurs ... external library --> perl_call --> perl | perl <-- XSUB <-- external library <-- perl_call <----+ After processing of the error using *perl_call_** is completed, control reverts back to Perl more or less immediately. In the diagram, the further right you go the more deeply nested the scope is. It is only when control is back with perl on the extreme left of the diagram that you will have dropped back to the enclosing scope and any temporaries you have left hanging around will be freed. In the second example, an event driven program, the flow of control will be more like this perl --> XSUB --> event handler ... event handler --> perl_call --> perl | event handler <-- perl_call <----+ ... event handler --> perl_call --> perl | event handler <-- perl_call <----+ ... event handler --> perl_call --> perl | event handler <-- perl_call <----+ In this case the flow of control can consist of only the repeated sequence event handler --> perl_call --> perl for practically the complete duration of the program. 
This means that control may *never* drop back to the surrounding scope in Perl at the extreme left. So what is the big problem? Well, if you are expecting Perl to tidy up those temporaries for you, you might be in for a long wait. For Perl to dispose of your temporaries, control must drop back to the enclosing scope at some stage. In the event driven scenario that may never happen. This means that as time goes on, your program will create more and more temporaries, none of which will ever be freed. As each of these temporaries consumes some memory your program will eventually consume all the available memory in your system - kapow! So here is the bottom line - if you are sure that control will revert back to the enclosing Perl scope fairly quickly after the end of your callback, then it isn't absolutely necessary to dispose explicitly of any temporaries you may have created. Mind you, if you are at all uncertain about what to do, it doesn't do any harm to tidy up anyway. Strategies for storing Callback Context Information Potentially one of the trickiest problems to overcome when designing a callback interface can be figuring out how to store the mapping between the C callback function and the Perl equivalent. To help understand why this can be a real problem first consider how a callback is set up in an all C environment. Typically a C API will provide a function to register a callback. This will expect a pointer to a function as one of its parameters. Below is a call to a hypothetical function `register_fatal' which registers the C function to get called when a fatal error occurs. 
register_fatal(cb1) ; The single parameter `cb1' is a pointer to a function, so you must have defined `cb1' in your code, say something like this static void cb1() { printf ("Fatal Error\n") ; exit(1) ; } Now change that to call a Perl subroutine instead static SV * callback = (SV*)NULL; static void cb1() { dSP ; PUSHMARK(SP) ; /* Call the Perl sub to process the callback */ perl_call_sv(callback, G_DISCARD) ; } void register_fatal(fn) SV * fn CODE: /* Remember the Perl sub */ if (callback == (SV*)NULL) callback = newSVsv(fn) ; else SvSetSV(callback, fn) ; /* register the callback with the external library */ register_fatal(cb1) ; where the Perl equivalent of `register_fatal' and the callback it registers, `pcb1', might look like this # Register the sub pcb1 register_fatal(\&pcb1) ; sub pcb1 { die "I'm dying...\n" ; } The mapping between the C callback and the Perl equivalent is stored in the global variable `callback'. This will be adequate if you ever need to have only one callback registered at any time. An example could be an error handler like the code sketched out above. Remember though, repeated calls to `register_fatal' will replace the previously registered callback function with the new one. Say for example you want to interface to a library which allows asynchronous file i/o. In this case you may be able to register a callback whenever a read operation has completed. To be of any use we want to be able to call separate Perl subroutines for each file that is opened. As it stands, the error handler example above would not be adequate as it allows only a single callback to be defined at any time. What we require is a means of storing the mapping between the opened file and the Perl subroutine we want to be called for that file. Say the i/o library has a function `asynch_read' which associates a C function `ProcessRead' with a file handle `fh' - this assumes that it has also provided some routine to open the file and so obtain the file handle. 
asynch_read(fh, ProcessRead) This may expect the C *ProcessRead* function of this form void ProcessRead(fh, buffer) int fh ; char * buffer ; { ... } To provide a Perl interface to this library we need to be able to map between the `fh' parameter and the Perl subroutine we want called. A hash is a convenient mechanism for storing this mapping. The code below shows a possible implementation static HV * Mapping = (HV*)NULL ; void asynch_read(fh, callback) int fh SV * callback CODE: /* If the hash doesn't already exist, create it */ if (Mapping == (HV*)NULL) Mapping = newHV() ; /* Save the fh -> callback mapping */ hv_store(Mapping, (char*)&fh, sizeof(fh), newSVsv(callback), 0) ; /* Register with the C Library */ asynch_read(fh, asynch_read_if) ; and `asynch_read_if' could look like this static void asynch_read_if(fh, buffer) int fh ; char * buffer ; { dSP ; SV ** sv ; /* Get the callback associated with fh */ sv = hv_fetch(Mapping, (char*)&fh , sizeof(fh), FALSE) ; if (sv == (SV**)NULL) croak("Internal error...\n") ; PUSHMARK(SP) ; XPUSHs(sv_2mortal(newSViv(fh))) ; XPUSHs(sv_2mortal(newSVpv(buffer, 0))) ; PUTBACK ; /* Call the Perl sub */ perl_call_sv(*sv, G_DISCARD) ; } For completeness, here is `asynch_close'. This shows how to remove the entry from the hash `Mapping'. void asynch_close(fh) int fh CODE: /* Remove the entry from the hash */ (void) hv_delete(Mapping, (char*)&fh, sizeof(fh), G_DISCARD) ; /* Now call the real asynch_close */ asynch_close(fh) ; So the Perl interface would look like this sub callback1 { my($handle, $buffer) = @_ ; } # Register the Perl callback asynch_read($fh, \&callback1) ; asynch_close($fh) ; The mapping between the C callback and Perl is stored in the global hash `Mapping' this time. Using a hash has the distinct advantage that it allows an unlimited number of callbacks to be registered. What if the interface provided by the C callback doesn't contain a parameter which allows the file handle to Perl subroutine mapping? 
Say in the asynchronous i/o package, the callback function gets passed only the `buffer' parameter like this void ProcessRead(buffer) char * buffer ; { ... } Without the file handle there is no straightforward way to map from the C callback to the Perl subroutine. In this case a possible way around this problem is to predefine a series of C functions to act as the interface to Perl, thus #define MAX_CB 3 #define NULL_HANDLE -1 typedef void (*FnMap)() ; struct MapStruct { FnMap Function ; SV * PerlSub ; int Handle ; } ; static void fn1() ; static void fn2() ; static void fn3() ; static struct MapStruct Map [MAX_CB] = { { fn1, NULL, NULL_HANDLE }, { fn2, NULL, NULL_HANDLE }, { fn3, NULL, NULL_HANDLE } } ; static void Pcb(index, buffer) int index ; char * buffer ; { dSP ; PUSHMARK(SP) ; XPUSHs(sv_2mortal(newSVpv(buffer, 0))) ; PUTBACK ; /* Call the Perl sub */ perl_call_sv(Map[index].PerlSub, G_DISCARD) ; } static void fn1(buffer) char * buffer ; { Pcb(0, buffer) ; } static void fn2(buffer) char * buffer ; { Pcb(1, buffer) ; } static void fn3(buffer) char * buffer ; { Pcb(2, buffer) ; } void array_asynch_read(fh, callback) int fh SV * callback CODE: int index ; int null_index = MAX_CB ; /* Find the same handle or an empty entry */ for (index = 0 ; index < MAX_CB ; ++index) { if (Map[index].Handle == fh) break ; if (Map[index].Handle == NULL_HANDLE) null_index = index ; } if (index == MAX_CB && null_index == MAX_CB) croak ("Too many callback functions registered\n") ; if (index == MAX_CB) index = null_index ; /* Save the file handle */ Map[index].Handle = fh ; /* Remember the Perl sub */ if (Map[index].PerlSub == (SV*)NULL) Map[index].PerlSub = newSVsv(callback) ; else SvSetSV(Map[index].PerlSub, callback) ; asynch_read(fh, Map[index].Function) ; void array_asynch_close(fh) int fh CODE: int index ; /* Find the file handle */ for (index = 0; index < MAX_CB ; ++ index) if (Map[index].Handle == fh) break ; if (index == MAX_CB) croak ("could not close fh %d\n", fh) ; 
        Map[index].Handle = NULL_HANDLE ;
        SvREFCNT_dec(Map[index].PerlSub) ;
        Map[index].PerlSub = (SV*)NULL ;

        asynch_close(fh) ;

In this case the functions `fn1', `fn2', and `fn3' are used to remember the Perl subroutine to be called. Each of the functions holds a separate hard-wired index which is used in the function `Pcb' to access the `Map' array and actually call the Perl subroutine.

There are some obvious disadvantages with this technique.

Firstly, the code is considerably more complex than with the previous example.

Secondly, there is a hard-wired limit (in this case 3) to the number of callbacks that can exist simultaneously. The only way to increase the limit is by modifying the code to add more functions and then recompiling. Nonetheless, as long as the number of functions is chosen with some care, it is still a workable solution and in some cases is the only one available.

To summarize, here are a number of possible methods for you to consider for storing the mapping between C and the Perl callback

1.  Ignore the problem - Allow only 1 callback

    For a lot of situations, like interfacing to an error handler, this may be a perfectly adequate solution.

2.  Create a sequence of callbacks - hard wired limit

    If it is impossible to tell from the parameters passed back from the C callback what the context is, then you may need to create a sequence of C callback interface functions, and store pointers to each in an array.

3.  Use a parameter to map to the Perl callback

    A hash is an ideal mechanism to store the mapping between C and Perl.

Alternate Stack Manipulation

Although I have made use of only the `POP*' macros to access values returned from Perl subroutines, it is also possible to bypass these macros and read the stack using the `ST' macro (see the perlxs manpage for a full description of the `ST' macro).

Most of the time the `POP*' macros should be adequate. The main problem with them is that they force you to process the returned values in sequence.
This may not be the most suitable way to process the values in some cases. What we want is to be able to access the stack in a random order. The `ST' macro as used when coding an XSUB is ideal for this purpose. The code below is the example given in the section *Returning a list of values* recoded to use `ST' instead of `POP*'. static void call_AddSubtract2(a, b) int a ; int b ; { dSP ; I32 ax ; int count ; ENTER ; SAVETMPS; PUSHMARK(SP) ; XPUSHs(sv_2mortal(newSViv(a))); XPUSHs(sv_2mortal(newSViv(b))); PUTBACK ; count = perl_call_pv("AddSubtract", G_ARRAY); SPAGAIN ; SP -= count ; ax = (SP - PL_stack_base) + 1 ; if (count != 2) croak("Big trouble\n") ; printf ("%d + %d = %d\n", a, b, SvIV(ST(0))) ; printf ("%d - %d = %d\n", a, b, SvIV(ST(1))) ; PUTBACK ; FREETMPS ; LEAVE ; } Notes 1. Notice that it was necessary to define the variable `ax'. This is because the `ST' macro expects it to exist. If we were in an XSUB it would not be necessary to define `ax' as it is already defined for you. 2. The code SPAGAIN ; SP -= count ; ax = (SP - PL_stack_base) + 1 ; sets the stack up so that we can use the `ST' macro. 3. Unlike the original coding of this example, the returned values are not accessed in reverse order. So `ST(0)' refers to the first value returned by the Perl subroutine and `ST(count-1)' refers to the last. Creating and calling an anonymous subroutine in C As we've already shown, `perl_call_sv' can be used to invoke an anonymous subroutine. However, our example showed a Perl script invoking an XSUB to perform this operation. Let's see how it can be done inside our C code: ... SV *cvrv = perl_eval_pv("sub { print 'You will not find me cluttering any namespace!' }", TRUE); ... perl_call_sv(cvrv, G_VOID|G_NOARGS); `perl_eval_pv' is used to compile the anonymous subroutine, which will be the return value as well (read more about `perl_eval_pv' in the "perl_eval_pv" entry in the perlguts manpage). 
Once this code reference is in hand, it can be mixed in with all the previous examples we've shown. SEE ALSO the perlxs manpage, the perlguts manpage, the perlembed manpage AUTHOR Paul Marquess Special thanks to the following people who assisted in the creation of the document. Jeff Okamoto, Tim Bunce, Nick Gianniotis, Steve Kelem, Gurusamy Sarathy and Larry Wall. DATE Version 1.3, 14th Apr 1997 perldata section NAME perldata - Perl data types DESCRIPTION Variable names Perl has three data structures: scalars, arrays of scalars, and associative arrays of scalars, known as "hashes". Normal arrays are indexed by number, starting with 0. (Negative subscripts count from the end.) Hash arrays are indexed by string. Values are usually referred to by name (or through a named reference). The first character of the name tells you to what sort of data structure it refers. The rest of the name tells you the particular value to which it refers. Most often, it consists of a single *identifier*, that is, a string beginning with a letter or underscore, and containing letters, underscores, and digits. In some cases, it may be a chain of identifiers, separated by `::' (or by `'', but that's deprecated); all but the last are interpreted as names of packages, to locate the namespace in which to look up the final identifier (see the "Packages" entry in the perlmod manpage for details). It's possible to substitute for a simple identifier an expression that produces a reference to the value at runtime; this is described in more detail below, and in the perlref manpage. There are also special variables whose names don't follow these rules, so that they don't accidentally collide with one of your normal variables. Strings that match parenthesized parts of a regular expression are saved under names containing only digits after the `$' (see the perlop manpage and the perlre manpage). 
In addition, several special variables that provide windows into the inner working of Perl have names containing punctuation characters (see the perlvar manpage). Scalar values are always named with '$', even when referring to a scalar that is part of an array. It works like the English word "the". Thus we have: $days # the simple scalar value "days" $days[28] # the 29th element of array @days $days{'Feb'} # the 'Feb' value from hash %days $#days # the last index of array @days but entire arrays or array slices are denoted by '@', which works much like the word "these" or "those": @days # ($days[0], $days[1],... $days[n]) @days[3,4,5] # same as @days[3..5] @days{'a','c'} # same as ($days{'a'},$days{'c'}) and entire hashes are denoted by '%': %days # (key1, val1, key2, val2 ...) In addition, subroutines are named with an initial '&', though this is optional when it's otherwise unambiguous (just as "do" is often redundant in English). Symbol table entries can be named with an initial '*', but you don't really care about that yet. Every variable type has its own namespace. You can, without fear of conflict, use the same name for a scalar variable, an array, or a hash (or, for that matter, a filehandle, a subroutine name, or a label). This means that $foo and @foo are two different variables. It also means that `$foo[1]' is a part of @foo, not a part of $foo. This may seem a bit weird, but that's okay, because it is weird. Because variable and array references always start with '$', '@', or '%', the "reserved" words aren't in fact reserved with respect to variable names. (They ARE reserved with respect to labels and filehandles, however, which don't have an initial special character. You can't have a filehandle named "log", for instance. Hint: you could say `open(LOG,'logfile')' rather than `open(log,'logfile')'. Using uppercase filehandles also improves readability and protects you from conflict with future reserved words.) 
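The separate namespaces described above can be seen in a short script. This is only an illustrative sketch; the variable name and the values stored in it are made up for the example.

```perl
use strict;
use warnings;

# One identifier, three independent variables.
my $foo = 'a scalar';
my @foo = ('zero', 'one', 'two');
my %foo = (Feb => 28);

print $foo, "\n";        # the scalar $foo
print $foo[1], "\n";     # element 1 of @foo; $foo is not involved
print $foo{Feb}, "\n";   # the 'Feb' value of %foo
print $#foo, "\n";       # the last index of @foo
```

Each access uses `$' because a single value is being fetched; the brackets or braces that follow decide whether the array or the hash is consulted.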
Case *IS* significant--"FOO", "Foo", and "foo" are all different names. Names that start with a letter or underscore may also contain digits and underscores. It is possible to replace such an alphanumeric name with an expression that returns a reference to an object of that type. For a description of this, see the perlref manpage. Names that start with a digit may contain only more digits. Names that do not start with a letter, underscore, or digit are limited to one character, e.g., `$%' or `$$'. (Most of these one-character names have a predefined significance to Perl. For instance, `$$' is the current process id.) Context The interpretation of operations and values in Perl sometimes depends on the requirements of the context around the operation or value. There are two major contexts: scalar and list. Certain operations return list values in contexts wanting a list, and scalar values otherwise. (If this is true of an operation it will be mentioned in the documentation for that operation.) In other words, Perl overloads certain operations based on whether the expected return value is singular or plural. (Some words in English work this way, like "fish" and "sheep".) In a reciprocal fashion, an operation provides either a scalar or a list context to each of its arguments. For example, if you say int( <STDIN> ) the integer operation provides a scalar context for the <STDIN> operator, which responds by reading one line from STDIN and passing it back to the integer operation, which will then find the integer value of that line and return that. If, on the other hand, you say sort( <STDIN> ) then the sort operation provides a list context for <STDIN>, which will proceed to read every line available up to the end of file, and pass that list of lines back to the sort routine, which will then sort those lines and return them as a list to whatever the context of the sort was. Assignment is a little bit special in that it uses its left argument to determine the context for the right argument.
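A small sketch may make the effect of the left-hand side concrete. The array contents and variable names here are invented for illustration.

```perl
use strict;
use warnings;

my @lines = ('pear', 'apple', 'fig');

my $count   = @lines;        # scalar on the left: the array yields its length
my ($first) = @lines;        # list on the left: elements are copied in order
my @sorted  = sort @lines;   # sort gives its arguments a list context

print "count:  $count\n";    # count:  3
print "first:  $first\n";    # first:  pear
print "sorted: @sorted\n";   # sorted: apple fig pear
```

The only difference between the first two assignments is the pair of parentheses, which turns the left side into a list.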
Assignment to a scalar evaluates the righthand side in a scalar context, while assignment to an array or array slice evaluates the righthand side in a list context. Assignment to a list also evaluates the righthand side in a list context. User defined subroutines may choose to care whether they are being called in a scalar or list context, but most subroutines do not need to care, because scalars are automatically interpolated into lists. See the "wantarray" entry in the perlfunc manpage. Scalar values All data in Perl is a scalar or an array of scalars or a hash of scalars. Scalar variables may contain various kinds of singular data, such as numbers, strings, and references. In general, conversion from one form to another is transparent. (A scalar may not contain multiple values, but may contain a reference to an array or hash containing multiple values.) Because of the automatic conversion of scalars, operations, and functions that return scalars don't need to care (and, in fact, can't care) whether the context is looking for a string or a number. Scalars aren't necessarily one thing or another. There's no place to declare a scalar variable to be of type "string", or of type "number", or type "filehandle", or anything else. Perl is a contextually polymorphic language whose scalars can be strings, numbers, or references (which includes objects). While strings and numbers are considered pretty much the same thing for nearly all purposes, references are strongly-typed uncastable pointers with builtin reference-counting and destructor invocation. A scalar value is interpreted as TRUE in the Boolean sense if it is not the null string or the number 0 (or its string equivalent, "0"). The Boolean context is just a special kind of scalar context. There are actually two varieties of null scalars: defined and undefined. 
Undefined null scalars are returned when there is no real value for something, such as when there was an error, or at end of file, or when you refer to an uninitialized variable or element of an array. An undefined null scalar may become defined the first time you use it as if it were defined, but prior to that you can use the defined() operator to determine whether the value is defined or not. To find out whether a given string is a valid nonzero number, it's usually enough to test it against both numeric 0 and also lexical "0" (although this will cause -w noises). That's because strings that aren't numbers count as 0, just as they do in awk: if ($str == 0 && $str ne "0") { warn "That doesn't look like a number"; } That's usually preferable because otherwise you won't treat IEEE notations like `NaN' or `Infinity' properly. At other times you might prefer to use the POSIX::strtod function or a regular expression to check whether data is numeric. See the perlre manpage for details on regular expressions. warn "has nondigits" if /\D/; warn "not a natural number" unless /^\d+$/; # rejects -3 warn "not an integer" unless /^-?\d+$/; # rejects +3 warn "not an integer" unless /^[+-]?\d+$/; warn "not a decimal number" unless /^-?\d+\.?\d*$/; # rejects .2 warn "not a decimal number" unless /^-?(?:\d+(?:\.\d*)?|\.\d+)$/; warn "not a C float" unless /^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/; The length of an array is a scalar value. You may find the length of array @days by evaluating `$#days', as in csh. (Actually, it's not the length of the array, it's the subscript of the last element, because there is (ordinarily) a 0th element.) Assigning to `$#days' changes the length of the array. Shortening an array by this method destroys intervening values. Lengthening an array that was previously shortened *NO LONGER* recovers the values that were in those elements. (It used to in Perl 4, but we had to break this to make sure destructors were called when expected.) 
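The behaviour of `$#days' described above can be checked with a few lines. This is a minimal sketch; the array contents are arbitrary.

```perl
use strict;
use warnings;

my @days = qw(Mon Tue Wed Thu Fri);
print $#days, ' ', scalar(@days), "\n";   # 4 5: last subscript, then length

$#days = 1;    # shorten: Wed, Thu and Fri are destroyed
$#days = 3;    # lengthen again: slots 2 and 3 exist but are undefined
print defined($days[2]) ? "recovered\n" : "not recovered\n";   # not recovered
print scalar(@days), "\n";                # 4

@days = ();    # truncate to nothing
print $#days, "\n";                       # -1
```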
You can also gain some minuscule measure of efficiency by pre-extending an array that is going to get big. (You can also extend an array by assigning to an element that is off the end of the array.) You can truncate an array down to nothing by assigning the null list () to it. The following are equivalent: @whatever = (); $#whatever = -1; If you evaluate a named array in a scalar context, it returns the length of the array. (Note that this is not true of lists, which return the last value, like the C comma operator, nor of built-in functions, which return whatever they feel like returning.) The following is always true: scalar(@whatever) == $#whatever - $[ + 1; Version 5 of Perl changed the semantics of `$[': files that don't set the value of `$[' no longer need to worry about whether another file changed its value. (In other words, use of `$[' is deprecated.) So in general you can assume that scalar(@whatever) == $#whatever + 1; Some programmers choose to use an explicit conversion so nothing's left to doubt: $element_count = scalar(@whatever); If you evaluate a hash in a scalar context, it returns a value that is true if and only if the hash contains any key/value pairs. (If there are any key/value pairs, the value returned is a string consisting of the number of used buckets and the number of allocated buckets, separated by a slash. This is pretty much useful only to find out whether Perl's (compiled in) hashing algorithm is performing poorly on your data set. For example, you stick 10,000 things in a hash, but evaluating %HASH in scalar context reveals "1/16", which means only one out of sixteen buckets has been touched, and presumably contains all 10,000 of your items. This isn't supposed to happen.) You can preallocate space for a hash by assigning to the keys() function.
This rounds up the number of allocated buckets to the next power of two: keys(%users) = 1000; # allocate 1024 buckets Scalar value constructors Numeric literals are specified in any of the customary floating point or integer formats: 12345 12345.67 .23E-10 0xffff # hex 0377 # octal 4_294_967_296 # underline for legibility String literals are usually delimited by either single or double quotes. They work much like shell quotes: double-quoted string literals are subject to backslash and variable substitution; single-quoted strings are not (except for "`\''" and "`\\'"). The usual Unix backslash rules apply for making characters such as newline, tab, etc., as well as some more exotic forms. See the section on "Quote and Quotelike Operators" in the perlop manpage for a list. Octal or hex representations in string literals (e.g. '0xffff') are not automatically converted to their integer representation. The hex() and oct() functions make these conversions for you. See the "hex" entry in the perlfunc manpage and the "oct" entry in the perlfunc manpage for more details. You can also embed newlines directly in your strings, i.e., they can end on a different line than they begin. This is nice, but if you forget your trailing quote, the error will not be reported until Perl finds another line containing the quote character, which may be much further on in the script. Variable substitution inside strings is limited to scalar variables, arrays, and array slices. (In other words, names beginning with $ or @, followed by an optional bracketed expression as a subscript.) The following code segment prints out "The price is $100." $Price = '$100'; # not interpreted print "The price is $Price.\n"; # interpreted As in some shells, you can put curly brackets around the name to delimit it from following alphanumerics. In fact, an identifier within such curlies is forced to be a string, as is any single identifier within a hash subscript.
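The use of curly brackets inside a double-quoted string can be sketched as follows. The variable names and strings are invented for the example.

```perl
use strict;
use warnings;

my $fruit = 'apple';
my %days  = (Feb => 28);

print "Two ${fruit}s, please.\n";          # braces end the name before the "s"
print "February has $days{Feb} days.\n";   # bareword subscript is taken as 'Feb'
print 'The price is $100.', "\n";          # single quotes: nothing is interpolated
```

Without the braces the first string would refer to a different variable, $fruits, which does not exist here.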
Our earlier example, $days{'Feb'} can be written as $days{Feb} and the quotes will be assumed automatically. But anything more complicated in the subscript will be interpreted as an expression. Note that a single-quoted string must be separated from a preceding word by a space, because single quote is a valid (though deprecated) character in a variable name (see the "Packages" entry in the perlmod manpage). Three special literals are __FILE__, __LINE__, and __PACKAGE__, which represent the current filename, line number, and package name at that point in your program. They may be used only as separate tokens; they will not be interpolated into strings. If there is no current package (due to an empty `package;' directive), __PACKAGE__ is the undefined value. The tokens __END__ and __DATA__ may be used to indicate the logical end of the script before the actual end of file. Any following text is ignored, but may be read via a DATA filehandle: main::DATA for __END__, or PACKNAME::DATA (where PACKNAME is the current package) for __DATA__. The two control characters ^D and ^Z are synonyms for __END__ (or __DATA__ in a module). See the SelfLoader manpage for more description of __DATA__, and an example of its use. Note that you cannot read from the DATA filehandle in a BEGIN block: the BEGIN block is executed as soon as it is seen (during compilation), at which point the corresponding __DATA__ (or __END__) token has not yet been seen. A word that has no other interpretation in the grammar will be treated as if it were a quoted string. These are known as "barewords". As with filehandles and labels, a bareword that consists entirely of lowercase letters risks conflict with future reserved words, and if you use the -w switch, Perl will warn you about any such words. Some people may wish to outlaw barewords entirely. If you say use strict 'subs'; then any bareword that would NOT be interpreted as a subroutine call produces a compile-time error instead. 
The restriction lasts to the end of the enclosing block. An inner block may countermand this by saying `no strict 'subs''. Array variables are interpolated into double-quoted strings by joining all the elements of the array with the delimiter specified in the `$"' variable (`$LIST_SEPARATOR' in English), space by default. The following are equivalent: $temp = join($",@ARGV); system "echo $temp"; system "echo @ARGV"; Within search patterns (which also undergo double-quotish substitution) there is a bad ambiguity: Is `/$foo[bar]/' to be interpreted as `/${foo}[bar]/' (where `[bar]' is a character class for the regular expression) or as `/${foo[bar]}/' (where `[bar]' is the subscript to array @foo)? If @foo doesn't otherwise exist, then it's obviously a character class. If @foo exists, Perl takes a good guess about `[bar]', and is almost always right. If it does guess wrong, or if you're just plain paranoid, you can force the correct interpretation with curly brackets as above. A line-oriented form of quoting is based on the shell "here-doc" syntax. Following a `<<' you specify a string to terminate the quoted material, and all lines following the current line down to the terminating string are the value of the item. The terminating string may be either an identifier (a word), or some quoted text. If quoted, the type of quotes you use determines the treatment of the text, just as in regular quoting. An unquoted identifier works like double quotes. There must be no space between the `<<' and the identifier. (If you put a space it will be treated as a null identifier, which is valid, and matches the first empty line.) The terminating string must appear by itself (unquoted and with no surrounding whitespace) on the terminating line. It is often more readable to use the `=>' operator between key/value pairs.
The `=>' operator is mostly just a more visually distinctive synonym for a comma, but it also arranges for its left-hand operand to be interpreted as a string--if it's a bareword that would be a legal identifier. This makes it nice for initializing hashes: %map = ( red => 0x00f, blue => 0x0f0, green => 0xf00, ); or for initializing hash references to be used as records: $rec = { witch => 'Mable the Merciless', cat => 'Fluffy the Ferocious', date => '10/31/1776', }; or for using call-by-named-parameter to complicated functions: $field = $query->radio_group( name => 'group_name', values => ['eenie','meenie','minie'], default => 'meenie', linebreak => 'true', labels => \%labels ); Note that just because a hash is initialized in that order doesn't mean that it comes out in that order. See the "sort" entry in the perlfunc manpage for examples of how to arrange for an output ordering. Typeglobs and Filehandles Perl uses an internal type called a *typeglob* to hold an entire symbol table entry. The type prefix of a typeglob is a `*', because it represents all types. This used to be the preferred way to pass arrays and hashes by reference into a function, but now that we have real references, this is seldom needed. The main use of typeglobs in modern Perl is to create symbol table aliases. This assignment: *this = *that; makes $this an alias for $that, @this an alias for @that, %this an alias for %that, &this an alias for &that, etc. Much safer is to use a reference. This: local *Here::blue = \$There::green; temporarily makes $Here::blue an alias for $There::green, but doesn't make @Here::blue an alias for @There::green, or %Here::blue an alias for %There::green, etc. See the section on "Symbol Tables" in the perlmod manpage for more examples of this. Strange though this may seem, this is the basis for the whole module import/export system. Another use for typeglobs is to pass filehandles into a function or to create new filehandles.
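Before turning to filehandles, the two kinds of alias described above can be compared in a short sketch. The variable names follow the text; the values, and the use of `our' to satisfy `use strict', are assumptions made for the example.

```perl
use strict;
use warnings;

# Typeglobs alias package variables only, so these are declared with our.
our ($that, @that, $this, @this);
our ($green, @green, $blue, @blue);

$that = 'scalar';
@that = (1, 2, 3);
*this = *that;               # alias every slot: $this, @this, %this, &this
$this = 'changed';
push @this, 4;
print "$that (@that)\n";     # changed (1 2 3 4)

$green = 'green';
@green = ('leaf');
*blue = \$green;             # alias the scalar slot only
$blue = 'teal';
print "$green, ", scalar(@blue), "\n";   # teal, 0
```

Assigning a reference to the typeglob leaves @blue untouched, which is why that form is the safer of the two.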
If you need to use a typeglob to save away a filehandle, do it this way: $fh = *STDOUT; or perhaps as a real reference, like this: $fh = \*STDOUT; See the perlsub manpage for examples of using these as indirect filehandles in functions. Typeglobs are also a way to create a local filehandle using the local() operator. These last until their block is exited, but may be passed back. For example: sub newopen { my $path = shift; local *FH; # not my! open (FH, $path) or return undef; return *FH; } $fh = newopen('/etc/passwd'); Now that we have the *foo{THING} notation, typeglobs aren't used as much for filehandle manipulations, although they're still needed to pass brand new file and directory handles into or out of functions. That's because *HANDLE{IO} only works if HANDLE has already been used as a handle. In other words, *FH can be used to create new symbol table entries, but *foo{THING} cannot. Another way to create anonymous filehandles is with the IO::Handle module and its ilk. These modules have the advantage of not hiding different types of the same name during the local(). See the bottom of the "open()" entry in the perlfunc manpage for an example. See the perlref manpage, the perlsub manpage, and the section on "Symbol Tables" in the perlmod manpage for more discussion on typeglobs and the *foo{THING} syntax. perldebug section NAME perldebug - Perl debugging DESCRIPTION First of all, have you tried using the -w switch? The Perl Debugger "As soon as we started programming, we found to our surprise that it wasn't as easy to get programs right as we had thought. Debugging had to be discovered. I can remember the exact instant when I realized that a large part of my life from then on was going to be spent in finding mistakes in my own programs." * --Maurice Wilkes, 1949* If you invoke Perl with the -d switch, your script runs under the Perl source debugger. 
This works like an interactive Perl environment, prompting for debugger commands that let you examine source code, set breakpoints, get stack backtraces, change the values of variables, etc. This is so convenient that you often fire up the debugger all by itself just to test out Perl constructs interactively to see what they do. For example: perl -d -e 42 In Perl, the debugger is not a separate program as it usually is in the typical compiled environment. Instead, the -d flag tells the compiler to insert source information into the parse trees it's about to hand off to the interpreter. That means your code must first compile correctly for the debugger to work on it. Then when the interpreter starts up, it preloads a Perl library file containing the debugger itself. The program will halt *right before* the first run-time executable statement (but see below regarding compile-time statements) and ask you to enter a debugger command. Contrary to popular expectations, whenever the debugger halts and shows you a line of code, it always displays the line it's *about* to execute, rather than the one it has just executed. Any command not recognized by the debugger is directly executed (`eval''d) as Perl code in the current package. (The debugger uses the DB package for its own state information.) Leading white space before a command causes the debugger to treat it as Perl code rather than as a debugger command, so be careful not to do that. Debugger Commands The debugger understands the following commands: h [command] Prints out a help message. If you supply another debugger command as an argument to the `h' command, it prints out the description for just that command. The special argument of `h h' produces a more compact help listing, designed to fit together on one screen.
If the output of the `h' command (or any command, for that matter) scrolls past your screen, precede the command with a leading pipe symbol so it's run through your pager, as in DB> |h You may change the pager which is used via the `O pager=...' command. p expr Same as `print {$DB::OUT} expr' in the current package. In particular, because this is just Perl's own print function, this means that nested data structures and objects are not dumped, unlike with the `x' command. The `DB::OUT' filehandle is opened to /dev/tty, regardless of where STDOUT may be redirected to. x expr Evaluates its expression in list context and dumps out the result in a pretty-printed fashion. Nested data structures are printed out recursively, unlike the `print' function. The details of printout are governed by multiple `O'ptions. V [pkg [vars]] Display all (or some) variables in package (defaulting to the `main' package) using a data pretty-printer (hashes show their keys and values so you see what's what, control characters are made printable, etc.). Make sure you don't put the type specifier (like `$') there, just the symbol names, like this: V DB filename line Use `~pattern' and `!pattern' for positive and negative regexps. Nested data structures are printed out in a legible fashion, unlike the `print' function. The details of printout are governed by multiple `O'ptions. X [vars] Same as `V currentpackage [vars]'. T Produce a stack backtrace. See below for details on its output. s [expr] Single step. Executes until it reaches the beginning of another statement, descending into subroutine calls. If an expression is supplied that includes function calls, it too will be single-stepped. n [expr] Next. Executes over subroutine calls, until it reaches the beginning of the next statement. If an expression is supplied that includes function calls, those functions will be executed with stops before each statement. <CR> Repeat last `n' or `s' command.
c [line|sub] Continue, optionally inserting a one-time-only breakpoint at the specified line or subroutine. l List next window of lines. l min+incr List `incr+1' lines starting at `min'. l min-max List lines `min' through `max'. `l -' is synonymous with `-'. l line List a single line. l subname List first window of lines from subroutine. - List previous window of lines. w [line] List window (a few lines) around the current line. . Return debugger pointer to the last-executed line and print it out. f filename Switch to viewing a different file or eval statement. If `filename' is not a full filename as found in values of %INC, it is considered as a regexp. /pattern/ Search forwards for pattern; final / is optional. ?pattern? Search backwards for pattern; final ? is optional. L List all breakpoints and actions. S [[!]pattern] List subroutine names [not] matching pattern. t Toggle trace mode (see also `AutoTrace' `O'ption). t expr Trace through execution of expr. For example: $ perl -de 42 Stack dump during die enabled outside of evals. Loading DB routines from perl5db.pl patch level 0.94 Emacs support available. Enter h or `h h' for help. main::(-e:1): 0 DB<1> sub foo { 14 } DB<2> sub bar { 3 } DB<3> t print foo() * bar() main::((eval 172):3): print foo() * bar(); main::foo((eval 168):2): main::bar((eval 170):2): 42 or, with the `O'ption `frame=2' set, DB<4> O f=2 frame = '2' DB<5> t print foo() * bar() 3: foo() * bar() entering main::foo 2: sub foo { 14 }; exited main::foo entering main::bar 2: sub bar { 3 }; exited main::bar 42 b [line] [condition] Set a breakpoint. If line is omitted, sets a breakpoint on the line that is about to be executed. If a condition is specified, it's evaluated each time the statement is reached and a breakpoint is taken only if the condition is true. Breakpoints may be set only on lines that begin an executable statement.
Conditions don't use if: b 237 $x > 30 b 237 ++$count237 < 11 b 33 /pattern/i b subname [condition] Set a breakpoint at the first line of the named subroutine. b postpone subname [condition] Set breakpoint at first line of subroutine after it is compiled. b load filename Set breakpoint at the first executed line of the file. Filename should be a full name as found in values of %INC. b compile subname Sets breakpoint at the first statement executed after the subroutine is compiled. d [line] Delete a breakpoint at the specified line. If line is omitted, deletes the breakpoint on the line that is about to be executed. D Delete all installed breakpoints. a [line] command Set an action to be done before the line is executed. The sequence of steps taken by the debugger is 1. check for a breakpoint at this line 2. print the line if necessary (tracing) 3. do any actions associated with that line 4. prompt user if at a breakpoint or in single-step 5. evaluate line For example, this will print out $foo every time line 53 is passed: a 53 print "DB FOUND $foo\n" A Delete all installed actions. W [expr] Add a global watch-expression. W Delete all watch-expressions. O [opt[=val]] [opt"val"] [opt?]... Set or query values of options. val defaults to 1. opt can be abbreviated. Several options can be listed. `recallCommand', `ShellBang' The characters used to recall command or spawn shell. By default, these are both set to `!'. `pager' Program to use for output of pager-piped commands (those beginning with a `|' character.) By default, `$ENV{PAGER}' will be used. `tkRunning' Run Tk while prompting (with ReadLine). `signalLevel', `warnLevel', `dieLevel' Level of verbosity. By default the debugger is in a sane verbose mode, thus it will print backtraces on all the warnings and die-messages which are going to be printed out, and will print a message when interesting uncaught signals arrive. To disable this behaviour, set these values to 0. 
If `dieLevel' is 2, then the messages which will be caught by surrounding `eval' are also printed. `AutoTrace' Trace mode (similar to `t' command, but can be put into `PERLDB_OPTS'). `LineInfo' File or pipe to print line number info to. If it is a pipe (say, `|visual_perl_db'), then a short, "emacs like" message is used. `inhibit_exit' If 0, allows *stepping off* the end of the script. `PrintRet' affects printing of return value after `r' command. `ornaments' affects screen appearance of the command line (see the Term::ReadLine manpage). `frame' affects printing messages on entry and exit from subroutines. If `frame & 2' is false, messages are printed on entry only. (Printing on exit may be useful if inter(di)spersed with other messages.) If `frame & 4', arguments to functions are printed as well as the context and caller info. If `frame & 8', overloaded `stringify' and `tie'd `FETCH' are enabled on the printed arguments. If `frame & 16', the return value from the subroutine is printed as well. The length at which the argument list is truncated is governed by the next option: `maxTraceLen' length at which the argument list is truncated when `frame' option's bit 4 is set. The following options affect what happens with `V', `X', and `x' commands: `arrayDepth', `hashDepth' Print only first N elements ('' for all). `compactDump', `veryCompact' Change style of array and hash dump. If `compactDump', short array may be printed on one line. `globPrint' Whether to print contents of globs. `DumpDBFiles' Dump arrays holding debugged files. `DumpPackages' Dump symbol tables of packages. `DumpReused' Dump contents of "reused" addresses. `quote', `HighBit', `undefPrint' Change style of string dump. Default value of `quote' is `auto', one can enable either double-quotish dump, or single-quotish by setting it to `"' or `''. By default, characters with high bit set are printed *as is*. `UsageOnly' *very* rudimentary per-package memory usage dump.
Calculates total size of strings in variables in the package.

During startup options are initialized from `$ENV{PERLDB_OPTS}'. You can put additional initialization options `TTY', `noTTY', `ReadLine', and `NonStop' there.

Example rc file:

    &parse_options("NonStop=1 LineInfo=db.out AutoTrace");

The script will run without human intervention, putting trace information into the file *db.out*. (If you interrupt it, you had better reset `LineInfo' to something "interactive"!)

`TTY'
    The TTY to use for debugging I/O.

`noTTY'
    If set, goes in `NonStop' mode, and would not connect to a TTY. If interrupted (or if control goes to debugger via explicit setting of $DB::signal or $DB::single from the Perl script), connects to a TTY specified by the `TTY' option at startup, or to a TTY found at runtime using `Term::Rendezvous' module of your choice.

    This module should implement a method `new' which returns an object with two methods: `IN' and `OUT', returning two filehandles to use for debugging input and output correspondingly. Method `new' may inspect an argument which is a value of `$ENV{PERLDB_NOTTY}' at startup, or is `"/tmp/perldbtty$$"' otherwise.

`ReadLine'
    If false, readline support in debugger is disabled, so you can debug ReadLine applications.

`NonStop'
    If set, debugger goes into noninteractive mode until interrupted, or programmatically by setting $DB::signal or $DB::single.

Here's an example of using the `$ENV{PERLDB_OPTS}' variable:

    $ PERLDB_OPTS="N f=2" perl -d myprogram

will run the script `myprogram' without human intervention, printing out the call tree with entry and exit points. Note that `N f=2' is equivalent to `NonStop=1 frame=2'. Note also that at the moment when this documentation was written all the options to the debugger could be uniquely abbreviated by the first letter (with exception of `Dump*' options).
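The abbreviation rule just described (`N f=2' standing for `NonStop=1 frame=2') can be illustrated with a small stand-alone sketch. This is not the debugger's own parser: the subroutine names `expand_option' and `parse_opts' are hypothetical, the option list is only a subset of the names documented above, and case-sensitive prefix matching is an assumption made for the illustration.

```perl
use strict;
use warnings;

# Subset of the option names documented above (illustration only).
my @OPTIONS = qw(
    recallCommand ShellBang pager tkRunning signalLevel warnLevel dieLevel
    AutoTrace LineInfo inhibit_exit PrintRet ornaments frame maxTraceLen
    TTY noTTY ReadLine NonStop DumpDBFiles DumpPackages DumpReused
);

# Hypothetical helper: expand an abbreviation to the one option it prefixes.
sub expand_option {
    my ($abbr) = @_;
    return $abbr if grep { $_ eq $abbr } @OPTIONS;        # exact name wins
    my @hits = grep { index($_, $abbr) == 0 } @OPTIONS;   # case-sensitive prefix
    die "unknown option '$abbr'\n" unless @hits;
    die "ambiguous option '$abbr' (@hits)\n" if @hits > 1;
    return $hits[0];
}

# Hypothetical helper: parse a PERLDB_OPTS-style string; a missing value means 1.
sub parse_opts {
    my ($string) = @_;
    my %set;
    for my $word (split ' ', $string) {
        my ($name, $value) = split /=/, $word, 2;
        $set{ expand_option($name) } = defined $value ? $value : 1;
    }
    return %set;
}

my %opts = parse_opts('N f=2');
print "$_=$opts{$_}\n" for sort keys %opts;
```

With this subset, `D' alone is rejected as ambiguous, which matches the remark that the `Dump*' options cannot be abbreviated to one letter.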
Other examples may include

    $ PERLDB_OPTS="N f A L=listing" perl -d myprogram

- runs script noninteractively, printing info on each entry into a subroutine and each executed line into the file listing. (If you interrupt it, you had better reset `LineInfo' to something "interactive"!)

    $ env "PERLDB_OPTS=R=0 TTY=/dev/ttyc" perl -d myprogram

may be useful for debugging a program which uses `Term::ReadLine' itself. Do not forget to detach the shell from the TTY in the window which corresponds to /dev/ttyc, say, by issuing a command like

    $ sleep 1000000

See the section on "Debugger Internals" below for more details.

< [ command ]
    Set an action (Perl command) to happen before every debugger prompt. A multi-line command may be entered by backslashing the newlines. If `command' is missing, resets the list of actions.

<< command
    Add an action (Perl command) to happen before every debugger prompt. A multi-line command may be entered by backslashing the newlines.

> command
    Set an action (Perl command) to happen after the prompt when you've just given a command to return to executing the script. A multi-line command may be entered by backslashing the newlines. If `command' is missing, resets the list of actions.

>> command
    Add an action (Perl command) to happen after the prompt when you've just given a command to return to executing the script. A multi-line command may be entered by backslashing the newlines.

{ [ command ]
    Set an action (debugger command) to happen before every debugger prompt. A multi-line command may be entered by backslashing the newlines. If `command' is missing, resets the list of actions.

{{ command
    Add an action (debugger command) to happen before every debugger prompt. A multi-line command may be entered by backslashing the newlines.

! number
    Redo a previous command (default previous command).

! -number
    Redo number'th-to-last command.

! pattern
    Redo last command that started with pattern. See `O recallCommand', too.

!!
cmd Run cmd in a subprocess (reads from DB::IN, writes to DB::OUT). See `O shellBang' too.

H -number
    Display the last `number' commands. Only commands longer than one character are listed. If number is omitted, lists them all.

q or ^D
    Quit. ("quit" doesn't work for this.) This is the only supported way to exit the debugger, though typing `exit' twice may do it too.

    Set an `O'ption `inhibit_exit' to 0 if you want to be able to *step off* the end of the script. You may also need to set `$finished' to 0 at some moment if you want to step through global destruction.

R
    Restart the debugger by execing a new session. It tries to maintain your history across this, but internal settings and command line options may be lost.

    Currently the following settings are preserved: history, breakpoints, actions, debugger `O'ptions, and the following command line options: -w, -I, and -e.

|dbcmd
    Run debugger command, piping DB::OUT to current pager.

||dbcmd
    Same as `|dbcmd' but DB::OUT is temporarily selected as well. Often used with commands that would otherwise produce long output, such as

        |V main

= [alias value]
    Define a command alias, like

        = quit q

    or list current aliases.

command
    Execute command as a Perl statement. A missing semicolon will be supplied.

m expr
    The expression is evaluated, and the methods which may be applied to the result are listed.

m package
    The methods which may be applied to objects in the `package' are listed.

Debugger input/output

Prompt
    The debugger prompt is something like

        DB<8>

    or even

        DB<<17>>

    where that number is the command number, which you'd use to access it with the builtin csh-like history mechanism, e.g., `!17' would repeat command number 17. The number of angle brackets indicates the depth of the debugger. You could get more than one set of brackets, for example, if you were already at a breakpoint and then printed out the result of a function call that itself also has a breakpoint, or you step into an expression via `s/n/t expression' command.
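To make the relationship between the prompt's command number and the `!' recall forms concrete, here is a minimal stand-alone sketch. It is not the debugger's implementation: the subroutine `resolve_recall' and the sample history are hypothetical, command numbers are assumed to start at 1 as in the `DB<1>' prompt, and the `!!' shell escape is deliberately left out.

```perl
use strict;
use warnings;

# Hypothetical helper: resolve "!", "!17", "! -2" or "!pattern" against a
# history list whose element 0 is command number 1.
sub resolve_recall {
    my ($history, $cmd) = @_;
    return undef unless $cmd =~ /^!\s*(.*)$/;
    my $arg = $1;

    return $history->[-1] if $arg eq '';              # "!": previous command
    if ($arg =~ /^(\d+)$/) {                          # "! number"
        return $1 >= 1 ? $history->[$1 - 1] : undef;
    }
    if ($arg =~ /^-(\d+)$/) {                         # "! -number"
        return ($1 >= 1 && $1 <= @$history) ? $history->[-$1] : undef;
    }
    for my $old (reverse @$history) {                 # "! pattern": newest first
        return $old if index($old, $arg) == 0;
    }
    return undef;
}

my @history = ('b 237', 'c', 'p $x', 'x \@list');
for my $recall ('!2', '! -2', '!b') {
    my $found = resolve_recall(\@history, $recall);
    print "$recall => ", (defined $found ? $found : '(no match)'), "\n";
}
```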
Multiline commands
    If you want to enter a multi-line command, such as a subroutine definition with several statements, or a format, you may escape the newline that would normally end the debugger command with a backslash. Here's an example:

        DB<1> for (1..4) {         \
        cont:     print "ok\n";    \
        cont: }
        ok
        ok
        ok
        ok

    Note that this business of escaping a newline is specific to interactive commands typed into the debugger.

Stack backtrace
    Here's an example of what a stack backtrace via `T' command might look like:

        $ = main::infested called from file `Ambulation.pm' line 10
        @ = Ambulation::legs(1, 2, 3, 4) called from file `camel_flea' line 7
        $ = main::pests('bactrian', 4) called from file `camel_flea' line 4

    The left-hand character up there tells whether the function was called in a scalar or list context (we bet you can tell which is which). What that says is that you were in the function `main::infested' when you ran the stack dump, and that it was called in a scalar context from line 10 of the file *Ambulation.pm*, but without any arguments at all, meaning it was called as `&infested'. The next stack frame shows that the function `Ambulation::legs' was called in a list context from the *camel_flea* file with four arguments. The last stack frame shows that `main::pests' was called in a scalar context, also from *camel_flea*, but from line 4.

    Note that if you execute `T' command from inside an active `use' statement, the backtrace will contain both a `require' frame and an `eval' frame.

Listing
    Listing given via different flavors of `l' command looks like this:

        DB<<13>> l
        101:                @i{@i} = ();
        102:b               @isa{@i,$pack} = ()
        103                     if(exists $i{$prevpack} || exists $isa{$pack});
        104             }
        105
        106             next
        107==>              if(exists $isa{$pack});
        108
        109:a           if ($extra-- > 0) {
        110:                %isa = ($pack,1);

    Note that the breakable lines are marked with `:', lines with breakpoints are marked by `b', with actions by `a', and the next executed line is marked by `==>'.
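The backtrace format shown earlier in this section can be approximated outside the debugger with the builtin `caller' function. The sketch below is only an illustration, not the code behind the `T' command: `backtrace_lines' is a hypothetical helper, the three sample subroutines merely mimic the names in the example, and the argument quoting rule (bare numbers, single-quoted strings) is an assumption. It relies on `caller' filling in `@DB::args' when it is called from code compiled in package `DB'.

```perl
use strict;
use warnings;

our @trace;

# Hypothetical helper: build T-style lines for the subroutines that are
# active when it is called (frame 0, the helper itself, is skipped).
sub backtrace_lines {
    my @lines;
    for (my $i = 1; ; $i++) {
        my (@frame, @args);
        {
            package DB;              # caller() sets @DB::args only from package DB
            no warnings 'once';
            @frame = caller($i);
            @args  = @DB::args if @frame && $frame[4];
        }
        last unless @frame;
        my ($pkg, $file, $line, $sub, $hasargs, $wantarray) = @frame;
        my $context = $wantarray ? '@' : '$';    # list context or not
        my $call    = $sub;
        if ($hasargs) {
            my @shown = map {
                !defined($_)           ? 'undef'
              : /^-?\d+(?:\.\d+)?$/    ? $_
              :                          "'$_'"
            } @args;
            $call .= '(' . join(', ', @shown) . ')';
        }
        push @lines, "$context = $call called from file `${file}' line $line";
    }
    return @lines;
}

sub infested { @trace = backtrace_lines(); return scalar @trace }
sub legs     { my $n = &infested; return ($n) }          # &infested: no argument list
sub pests    { my @l = legs(1, 2, 3, 4); return scalar @l }

my $count = pests('bactrian', 4);
print "$_\n" for @trace;
```

Calling `&infested' without parentheses is what makes its frame appear without an argument list, as described for `main::infested' above.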
Frame listing When `frame' option is set, debugger would print entered (and optionally exited) subroutines in different styles. What follows is the start of the listing of env "PERLDB_OPTS=f=n N" perl -d -V for different values of `n': 1 entering main::BEGIN entering Config::BEGIN Package lib/Exporter.pm. Package lib/Carp.pm. Package lib/Config.pm. entering Config::TIEHASH entering Exporter::import entering Exporter::export entering Config::myconfig entering Config::FETCH entering Config::FETCH entering Config::FETCH entering Config::FETCH 2 entering main::BEGIN entering Config::BEGIN Package lib/Exporter.pm. Package lib/Carp.pm. exited Config::BEGIN Package lib/Config.pm. entering Config::TIEHASH exited Config::TIEHASH entering Exporter::import entering Exporter::export exited Exporter::export exited Exporter::import exited main::BEGIN entering Config::myconfig entering Config::FETCH exited Config::FETCH entering Config::FETCH exited Config::FETCH entering Config::FETCH 4 in $=main::BEGIN() from /dev/nul:0 in $=Config::BEGIN() from lib/Config.pm:2 Package lib/Exporter.pm. Package lib/Carp.pm. Package lib/Config.pm. in $=Config::TIEHASH('Config') from lib/Config.pm:644 in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/nul:0 in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from li in @=Config::myconfig() from /dev/nul:0 in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574 in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574 in $=Config::FETCH(ref(Config), 'PATCHLEVEL') from lib/Config.pm:574 in $=Config::FETCH(ref(Config), 'SUBVERSION') from lib/Config.pm:574 in $=Config::FETCH(ref(Config), 'osname') from lib/Config.pm:574 in $=Config::FETCH(ref(Config), 'osvers') from lib/Config.pm:574 6 in $=main::BEGIN() from /dev/nul:0 in $=Config::BEGIN() from lib/Config.pm:2 Package lib/Exporter.pm. Package lib/Carp.pm. out $=Config::BEGIN() from lib/Config.pm:0 Package lib/Config.pm. 
in $=Config::TIEHASH('Config') from lib/Config.pm:644 out $=Config::TIEHASH('Config') from lib/Config.pm:644 in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/nul:0 in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/ out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/ out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/nul:0 out $=main::BEGIN() from /dev/nul:0 in @=Config::myconfig() from /dev/nul:0 in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574 out $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574 in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574 out $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574 in $=Config::FETCH(ref(Config), 'PATCHLEVEL') from lib/Config.pm:574 out $=Config::FETCH(ref(Config), 'PATCHLEVEL') from lib/Config.pm:574 in $=Config::FETCH(ref(Config), 'SUBVERSION') from lib/Config.pm:574 14 in $=main::BEGIN() from /dev/nul:0 in $=Config::BEGIN() from lib/Config.pm:2 Package lib/Exporter.pm. Package lib/Carp.pm. out $=Config::BEGIN() from lib/Config.pm:0 Package lib/Config.pm. 
in $=Config::TIEHASH('Config') from lib/Config.pm:644 out $=Config::TIEHASH('Config') from lib/Config.pm:644 in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/nul:0 in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/nul:0 out $=main::BEGIN() from /dev/nul:0 in @=Config::myconfig() from /dev/nul:0 in $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574 out $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574 in $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574 out $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574 30 in $=CODE(0x15eca4)() from /dev/null:0 in $=CODE(0x182528)() from lib/Config.pm:2 Package lib/Exporter.pm. out $=CODE(0x182528)() from lib/Config.pm:0 scalar context return from CODE(0x182528): undef Package lib/Config.pm. in $=Config::TIEHASH('Config') from lib/Config.pm:628 out $=Config::TIEHASH('Config') from lib/Config.pm:628 scalar context return from Config::TIEHASH: empty hash in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171 out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171 scalar context return from Exporter::export: '' out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 scalar context return from Exporter::import: '' In all the cases indentation of lines shows the call tree, if bit 2 of `frame' is set, then a line is printed on exit from a subroutine as well, if bit 4 is set, then the arguments are printed as well as the caller info, if bit 8 is set, the arguments are printed even if they are tied or references, if bit 16 is set, the return value is printed as well. 
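The bit meanings summarized above can be restated as a small decoder. This is a reading aid only, under the assumption that any nonzero `frame' value prints entry messages; `describe_frame' is a hypothetical name and is not part of the debugger.

```perl
use strict;
use warnings;

# Hypothetical helper: list what a given `frame' value asks the debugger to print.
sub describe_frame {
    my ($frame) = @_;
    return () unless $frame;                       # 0: no frame messages
    my @features = ('message on subroutine entry');
    push @features, 'message on subroutine exit'                 if $frame & 2;
    push @features, 'arguments, context and caller info'         if $frame & 4;
    push @features, 'overloaded stringify and tied FETCH on arguments'
                                                                 if $frame & 8;
    push @features, 'return value'                               if $frame & 16;
    return @features;
}

# The same values of n used in the listings above.
for my $n (1, 2, 4, 6, 14, 30) {
    printf "frame=%-2d %s\n", $n, join('; ', describe_frame($n));
}
```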
When a package is compiled, a line like this

    Package lib/Carp.pm.

is printed with proper indentation.

Debugging compile-time statements

If you have any compile-time executable statements (code within a BEGIN block or a `use' statement), these will *not* be stopped by the debugger, although `require's will (and compile-time statements can be traced with `AutoTrace' option set in `PERLDB_OPTS'). From your own Perl code, however, you can transfer control back to the debugger using the following statement, which is harmless if the debugger is not running:

    $DB::single = 1;

If you set `$DB::single' to the value 2, it's equivalent to having just typed the `n' command, whereas a value of 1 means the `s' command. The `$DB::trace' variable should be set to 1 to simulate having typed the `t' command.

Another way to debug compile-time code is to start the debugger, set a breakpoint on *load* of some module:

    DB<7> b load f:/perllib/lib/Carp.pm
    Will stop on load of `f:/perllib/lib/Carp.pm'.

and restart the debugger with the `R' command (if possible). One can use `b compile subname' for the same purpose.

Debugger Customization

Most probably you do not want to modify the debugger; it contains enough hooks to satisfy most needs. You may change the behaviour of the debugger from the debugger itself, using `O'ptions, from the command line via `PERLDB_OPTS' environment variable, and from *customization files*.

You can do some customization by setting up a .perldb file which contains initialization code. For instance, you could make aliases like these (the last one is one people expect to be there):

    $DB::alias{'len'}  = 's/^len(.*)/p length($1)/';
    $DB::alias{'stop'} = 's/^stop (at|in)/b/';
    $DB::alias{'ps'}   = 's/^ps\b/p scalar /';
    $DB::alias{'quit'} = 's/^quit(\s*)/exit\$/';

One changes options from the .perldb file via calls like this one:

    parse_options("NonStop=1 LineInfo=db.out AutoTrace=1 frame=2");

(the code is executed in the package `DB').
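Each alias value in the .perldb example above is the text of a substitution that rewrites the command line. The following stand-alone sketch shows the effect of three of those aliases on sample commands. It assumes an alias is applied by evaluating the stored substitution against the command text; `expand_alias' and the local `%alias' hash are hypothetical stand-ins and do not touch the real `%DB::alias'.

```perl
use strict;
use warnings;

# Copies of three alias definitions from the .perldb example above.
my %alias = (
    len  => 's/^len(.*)/p length($1)/',
    stop => 's/^stop (at|in)/b/',
    ps   => 's/^ps\b/p scalar /',
);

# Hypothetical helper: rewrite a command if its first word names an alias.
sub expand_alias {
    my ($cmd) = @_;
    my ($word) = $cmd =~ /^\s*(\S+)/;
    return $cmd unless defined $word && exists $alias{$word};
    # The alias text is Perl source for a substitution; run it against $cmd.
    eval "\$cmd =~ $alias{$word}; 1" or die "bad alias '$word': $@";
    return $cmd;
}

for my $typed ('stop at 42', 'len $name', 'x 1') {
    print "$typed  ->  ", expand_alias($typed), "\n";
}
```

Here `stop at 42' becomes the ordinary breakpoint command `b 42', while `x 1' is returned unchanged because no alias is defined for `x'.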
Note that .perldb is processed before processing `PERLDB_OPTS'. If .perldb defines the subroutine `afterinit', it is called after all the debugger initialization ends. .perldb may be contained in the current directory, or in the `LOGDIR'/`HOME' directory.

If you want to modify the debugger, copy perl5db.pl from the Perl library to another name and modify it as necessary. You'll also want to set your `PERL5DB' environment variable to say something like this:

    BEGIN { require "myperl5db.pl" }

As a last resort, one can use `PERL5DB' to customize the debugger by directly setting internal variables or calling debugger functions.

Readline Support

As shipped, the only command line history supplied is a simplistic one that checks for leading exclamation points. However, if you install the Term::ReadKey and Term::ReadLine modules from CPAN, you will have full editing capabilities much like GNU *readline*(3) provides. Look for these in the modules/by-module/Term directory on CPAN.

A rudimentary command line completion is also available. Unfortunately, the names of lexical variables are not available for completion.

Editor Support for Debugging

If you have GNU emacs installed on your system, it can interact with the Perl debugger to provide an integrated software development environment reminiscent of its interactions with C debuggers.

Perl is also delivered with a start file for making emacs act like a syntax-directed editor that understands (some of) Perl's syntax. Look in the *emacs* directory of the Perl source distribution.

(Historically, a similar setup for interacting with vi and the X11 window system had also been available, but at the time of this writing, no debugger support for vi exists.)

The Perl Profiler

If you wish to supply an alternative debugger for Perl to run, just invoke your script with a colon and a package argument given to the -d flag. One of the most popular alternative debuggers for Perl is DProf, the Perl profiler.
As of this writing, DProf is not included with the standard Perl distribution, but it is expected to be included soon, for certain values of "soon".

Meanwhile, you can fetch the Devel::DProf module from CPAN. Assuming it's properly installed on your system, to profile your Perl program in the file mycode.pl, just type:

    perl -d:DProf mycode.pl

When the script terminates the profiler will dump the profile information to a file called tmon.out. A tool like dprofpp (also supplied with the Devel::DProf package) can be used to interpret the information which is in that profile.

Debugger support in perl

When you call the caller function (see the "caller" entry in the perlfunc manpage) from the package DB, Perl sets the array @DB::args to contain the arguments the corresponding stack frame was called with.

If perl is run with -d option, the following additional features are enabled (cf. the section on "$^P" in the perlvar manpage):

* Perl inserts the contents of `$ENV{PERL5DB}' (or `BEGIN {require 'perl5db.pl'}' if not present) before the first line of the application.

* The array `@{"_<$filename"}' is the line-by-line contents of $filename for all the compiled files. Same for `eval'ed strings which contain subroutines, or which are currently executed. The `$filename' for `eval'ed strings looks like `(eval 34)'.

* The hash `%{"_<$filename"}' contains breakpoints and actions (it is keyed by line number), and individual entries are settable (as opposed to the whole hash). Only true/false is important to Perl, though the values used by perl5db.pl have the form `"$break_condition\0$action"'. Values are magical in numeric context: they are zeros if the line is not breakable.

  Same for evaluated strings which contain subroutines, or which are currently executed. The $filename for `eval'ed strings looks like `(eval 34)'.

* The scalar `${"_<$filename"}' contains `"_<$filename"'. Same for evaluated strings which contain subroutines, or which are currently executed.
The $filename for `eval'ed strings looks like `(eval 34)'.

* After each `require'd file is compiled, but before it is executed, `DB::postponed(*{"_<$filename"})' is called (if subroutine `DB::postponed' exists). Here the $filename is the expanded name of the `require'd file (as found in values of %INC).

* After each subroutine `subname' is compiled, existence of `$DB::postponed{subname}' is checked. If this key exists, `DB::postponed(subname)' is called (if subroutine `DB::postponed' exists).

* A hash `%DB::sub' is maintained, with keys being subroutine names, values having the form `filename:startline-endline'. `filename' has the form `(eval 31)' for subroutines defined inside `eval's.

* When execution of the application reaches a place that can have a breakpoint, a call to `DB::DB()' is performed if any one of variables $DB::trace, $DB::single, or $DB::signal is true. (Note that these variables are not `local'izable.) This feature is disabled when the control is inside `DB::DB()' or functions called from it (unless `$^D & (1<<30)').

* When execution of the application reaches a subroutine call, a call to `&DB::sub'(*args*) is performed instead, with `$DB::sub' being the name of the called subroutine. (Unless the subroutine is compiled in the package `DB'.)

Note that if `&DB::sub' needs some external data to be set up for it to work, no subroutine call is possible until this is done. For the standard debugger `$DB::deep' (how many levels of recursion deep into the debugger you can go before a mandatory break) gives an example of such a dependency.

The minimal working debugger consists of one line

    sub DB::DB {}

which is quite handy as contents of `PERL5DB' environment variable:

    env "PERL5DB=sub DB::DB {}" perl -d your-script

Another (a little bit more useful) minimal debugger can be created with the only line being

    sub DB::DB {print ++$i; scalar <STDIN>}

This debugger would print the sequential number of each encountered statement, and would wait for your `CR' to continue.
The following debugger is quite functional: { package DB; sub DB {} sub sub {print ++$i, " $sub\n"; &$sub} } It prints the sequential number of subroutine call and the name of the called subroutine. Note that `&DB::sub' should be compiled into the package `DB'. Debugger Internals At the start, the debugger reads your rc file (./.perldb or ~/.perldb under Unix), which can set important options. This file may define a subroutine `&afterinit' to be executed after the debugger is initialized. After the rc file is read, the debugger reads environment variable PERLDB_OPTS and parses it as a rest of `O ...' line in debugger prompt. It also maintains magical internal variables, such as `@DB::dbline', `%DB::dbline', which are aliases for `@{"::_ 1), and before termination of the script (if `$ENV{PERL_DEBUG_MSTATS}' >= 1). The report format is similar to one in the following example: env PERL_DEBUG_MSTATS=2 perl -e "require Carp" Memory allocation statistics after compilation: (buckets 4(4)..8188(8192) 14216 free: 130 117 28 7 9 0 2 2 1 0 0 437 61 36 0 5 60924 used: 125 137 161 55 7 8 6 16 2 0 1 74 109 304 84 20 Total sbrk(): 77824/21:119. Odd ends: pad+heads+chain+tail: 0+636+0+2048. Memory allocation statistics after execution: (buckets 4(4)..8188(8192) 30888 free: 245 78 85 13 6 2 1 3 2 0 1 315 162 39 42 11 175816 used: 265 176 1112 111 26 22 11 27 2 1 1 196 178 1066 798 39 Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144. It is possible to ask for such a statistic at arbitrary moment by using Devel::Peek::mstats() (module Devel::Peek is available on CPAN). Here is the explanation of different parts of the format: `buckets SMALLEST(APPROX)..GREATEST(APPROX)' Perl's malloc() uses bucketed allocations. Every request is rounded up to the closest bucket size available, and a bucket of these size is taken from the pool of the buckets of this size. The above line describes limits of buckets currently in use. 
Each bucket has two sizes: memory footprint, and the maximal size of user data which may be put into this bucket. Say, in the above example the smallest bucket has both sizes equal to 4. The biggest bucket has usable size 8188, and the memory footprint 8192.

    With debugging Perl some buckets may have negative usable size. This means that these buckets cannot (and will not) be used. For greater buckets the memory footprint may be one page greater than a power of 2. In such a case the corresponding power of two is printed instead in the `APPROX' field above.

Free/Used
    The following 1 or 2 rows of numbers correspond to the number of buckets of each size between `SMALLEST' and `GREATEST'. In the first row the sizes (memory footprints) of buckets are powers of two (or possibly one page greater). In the second row (if present) the memory footprints of the buckets are between memory footprints of two buckets "above".

    Say, with the above example the memory footprints are (with current algorithm)

         free:    8     16    32    64    128   256 512 1024 2048 4096 8192
                  4     12    24    48    80

    With non-`DEBUGGING' perl the buckets starting from `128'-long ones have 4-byte overhead, thus an 8192-long bucket may take up to 8188-byte-long allocations.

`Total sbrk(): SBRKed/SBRKs:CONTINUOUS'
    The first two fields give the total amount of memory perl sbrk()ed, and number of sbrk()s used. The third number is what perl thinks about continuity of returned chunks. As long as this number is positive, malloc() will assume that it is probable that sbrk() will provide continuous memory.

    Amounts sbrk()ed by external libraries are not counted.

`pad: 0'
    The amount of sbrk()ed memory needed to keep buckets aligned.

`heads: 2192'
    While memory overhead of bigger buckets is kept inside the bucket, for smaller buckets it is kept in separate areas. This field gives the total size of these areas.

`chain: 0' malloc() may want to subdivide a bigger bucket into smaller buckets.
If only a part of the deceased-bucket is left non-subdivided, the rest is kept as an element of a linked list. This field gives the total size of these chunks.

`tail: 6144'
    To minimize amount of sbrk()s malloc() asks for more memory. This field gives the size of the yet-unused part, which is sbrk()ed, but never touched.

Example of using -DL switch

Below we show how to analyse memory usage by

    do 'lib/auto/POSIX/autosplit.ix';

The file in question contains a header and 146 lines similar to

    sub getcwd ;

Note: *the discussion below supposes 32-bit architecture. In the newer versions of perl the memory usage of the constructs discussed here is much improved, but the story discussed below is a real-life story. This story is very terse, and assumes more than cursory knowledge of Perl internals.*

Here is the itemized list of Perl allocations performed during parsing of this file:

    !!! "after" at test.pl line 3.
       Id  subtot   4   8  12  16  20  24  28  32  36  40  48  56  64  72  80 80+
     0 02   13752   .   .   .   . 294   .   .   .   .   .   .   .   .   .   .   4
     0 54    5545   .   .   8 124  16   .   .   .   1   1   .   .   .   .   .   3
     5 05      32   .   .   .   .   .   .   .   1   .   .   .   .   .   .   .   .
     6 02    7152   .   .   .   .   .   .   .   .   .   . 149   .   .   .   .   .
     7 02    3600   .   .   .   .   . 150   .   .   .   .   .   .   .   .   .   .
     7 03      64   .  -1   .   1   .   .   2   .   .   .   .   .   .   .   .   .
     7 04    7056   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   7
     7 17   38404   .   .   .   .   .   .   .   1   .   . 442 149   .   . 147   .
     9 03    2078  17 249  32   .   .   .   .   2   .   .   .   .   .   .   .   .

To see this list insert two `warn('!...')' statements around the call:

    warn('!');
    do 'lib/auto/POSIX/autosplit.ix';
    warn('!!! "after"');

and run it with -DL option. The first warn() will print memory allocation info before the parsing of the file, and will memorize the statistics at this point (we ignore what it prints). The second warn() will print increments w.r.t. these memorized statistics. This is the above printout.

Different *Id*s on the left correspond to different subsystems of the perl interpreter; they are just the first argument given to perl memory allocation API New().
To find what `9 03' means, `grep' the perl source for `903'. You will see that it is util.c, function savepvn(). This function is used to store a copy of an existing chunk of memory. Using a C debugger, one can see that it is called either directly from gv_init(), or via sv_magic(), and gv_init() is called from gv_fetchpv() - which is called from newSUB().

Note: to reach this place in the debugger and skip all the calls to savepvn during the compilation of the main script, set a C breakpoint in Perl_warn(), `continue' until this point is reached, *then* set a breakpoint in Perl_savepvn(). Note that you may need to skip a handful of Perl_savepvn() calls which do not correspond to mass production of CVs (there are more `903' allocations than 146 similar lines of lib/auto/POSIX/autosplit.ix). Note also that `Perl_' prefixes are added by macroization code in perl header files to avoid conflicts with external libraries.

Anyway, we see that `903' ids correspond to creation of globs, twice per glob - for glob name, and glob stringification magic.

Here are explanations for other *Id*s above:

`717'
    is for creation of bigger `XPV*' structures. In the above case it creates 3 `AV' per subroutine, one for a list of lexical variable names, one for a scratchpad (which contains lexical variables and `targets'), and one for the array of scratchpads needed for recursion.

    It also creates a `GV' and a `CV' per subroutine (all called from start_subparse()).

`002'
    Creates C array corresponding to the `AV' of scratchpads, and the scratchpad itself (the first fake entry of this scratchpad is created though the subroutine itself is not defined yet).

    It also creates C arrays to keep data for the stash (this is one HV, but it grows, thus there are 4 big allocations: the big chunks are not freed, but are kept as additional arenas for `SV' allocations).

`054' creates a `HEK' for the name of the glob for the subroutine (this name is a key in a *stash*).
Big allocations with this *Id* correspond to allocations of new arenas to keep `HE'.

`602'
    creates a `GP' for the glob for the subroutine.

`702'
    creates the `MAGIC' for the glob for the subroutine.

`704'
    creates *arenas* which keep SVs.

-DL details

If Perl is run with -DL option, then warn()s which start with `!' behave specially. They print a list of *categories* of memory allocations, and statistics of allocations of different sizes for these categories.

If warn() string starts with

`!!!'
    print changed categories only, print the differences in counts of allocations;

`!!'
    print grown categories only; print the absolute values of counts, and totals;

`!'
    print nonempty categories, print the absolute values of counts and totals.

Limitations of -DL statistic

If an extension or an external library does not use Perl API to allocate memory, these allocations are not counted.

Debugging regular expressions

There are two ways to enable debugging output for regular expressions.

If your perl is compiled with `-DDEBUGGING', you may use the -Dr flag on the command line.

Otherwise, one can `use re 'debug'', which has effects both at compile time, and at run time (and is *not* lexically scoped).

Compile-time output

The debugging output for the compile time looks like this:

    compiling RE `[bc]d(ef*g)+h[ij]k$'
    size 43 first at 1
       1: ANYOF(11)
      11: EXACT <d>(13)
      13: CURLYX {1,32767}(27)
      15:   OPEN1(17)
      17:     EXACT <e>(19)
      19:     STAR(22)
      20:       EXACT <f>(0)
      22:     EXACT <g>(24)
      24:   CLOSE1(26)
      26:   WHILEM(0)
      27: NOTHING(28)
      28: EXACT <h>(30)
      30: ANYOF(40)
      40: EXACT <k>(42)
      42: EOL(43)
      43: END(0)
    anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
                                      stclass `ANYOF' minlen 7

The first line shows the pre-compiled form of the regexp, and the second shows the size of the compiled form (in arbitrary units, usually 4-byte words) and the label *id* of the first node which does a match.

The last line (split into two lines in the above) contains the optimizer info.
In the example shown, the optimizer found that the match should contain a substring `de' at the offset 1, and substring `gh' at some offset between 3 and infinity. Moreover, when checking for these substrings (to abandon impossible matches quickly) it will check for the substring `gh' before checking for the substring `de'. The optimizer may also use the knowledge that the match starts (at the `first' *id*) with a character class, and the match cannot be shorter than 7 chars.

The fields of interest which may appear in the last line are

`anchored' *STRING* `at' *POS*
`floating' *STRING* `at' *POS1..POS2*
    see above;

`matching floating/anchored'
    which substring to check first;

`minlen'
    the minimal length of the match;

`stclass' *TYPE*
    the type of the first matching node;

`noscan'
    which advises to not scan for the found substrings;

`isall'
    which says that the optimizer info is in fact all that the regular expression contains (thus one does not need to enter the RE engine at all);

`GPOS'
    if the pattern contains `\G';

`plus'
    if the pattern starts with a repeated char (as in `x+y');

`implicit'
    if the pattern starts with `.*';

`with eval'
    if the pattern contains eval-groups (see the section on "(?{ code })" in the perlre manpage);

`anchored(TYPE)'
    if the pattern may match only at a handful of places (with `TYPE' being `BOL', `MBOL', or `GPOS', see the table below).

If a substring is known to match at end-of-line only, it may be followed by `$', as in `floating `k'$'.

The optimizer-specific info is used to avoid entering (a slow) RE engine on strings which will definitely not match. If `isall' flag is set, a call to the RE engine may be avoided even when optimizer found an appropriate place for the match.

The rest of the output contains the list of *nodes* of the compiled form of the RE.
Each line has format ` '*id*: *TYPE* *OPTIONAL-INFO* (*next-id*) Types of nodes Here is the list of possible types with short descriptions: # TYPE arg-description [num-args] [longjump-len] DESCRIPTION # Exit points END no End of program. SUCCEED no Return from a subroutine, basically. # Anchors: BOL no Match "" at beginning of line. MBOL no Same, assuming multiline. SBOL no Same, assuming singleline. EOS no Match "" at end of string. EOL no Match "" at end of line. MEOL no Same, assuming multiline. SEOL no Same, assuming singleline. BOUND no Match "" at any word boundary BOUNDL no Match "" at any word boundary NBOUND no Match "" at any word non-boundary NBOUNDL no Match "" at any word non-boundary GPOS no Matches where last m//g left off. # [Special] alternatives ANY no Match any one character (except newline). SANY no Match any one character. ANYOF sv Match character in (or not in) this class. ALNUM no Match any alphanumeric character ALNUML no Match any alphanumeric char in locale NALNUM no Match any non-alphanumeric character NALNUML no Match any non-alphanumeric char in locale SPACE no Match any whitespace character SPACEL no Match any whitespace char in locale NSPACE no Match any non-whitespace character NSPACEL no Match any non-whitespace char in locale DIGIT no Match any numeric character NDIGIT no Match any non-numeric character # BRANCH The set of branches constituting a single choice are hooked # together with their "next" pointers, since precedence prevents # anything being concatenated to any individual branch. The # "next" pointer of the last BRANCH in a choice points to the # thing following the whole choice. This is also where the # final "next" pointer of each individual branch points; each # branch starts with the operand node of a BRANCH node. # BRANCH node Match this alternative, or the next... # BACK Normal "next" pointers all implicitly point forward; BACK # exists to make loop structures possible. 
# not used BACK no Match "", "next" ptr points backward. # Literals EXACT sv Match this string (preceded by length). EXACTF sv Match this string, folded (prec. by length). EXACTFL sv Match this string, folded in locale (w/len). # Do nothing NOTHING no Match empty string. # A variant of above which delimits a group, thus stops optimizations TAIL no Match empty string. Can jump here from outside. # STAR,PLUS '?', and complex '*' and '+', are implemented as circular # BRANCH structures using BACK. Simple cases (one character # per match) are implemented with STAR and PLUS for speed # and to minimize recursive plunges. # STAR node Match this (simple) thing 0 or more times. PLUS node Match this (simple) thing 1 or more times. CURLY sv 2 Match this simple thing {n,m} times. CURLYN no 2 Match next-after-this simple thing # {n,m} times, set parenths. CURLYM no 2 Match this medium-complex thing {n,m} times. CURLYX sv 2 Match this complex thing {n,m} times. # This terminator creates a loop structure for CURLYX WHILEM no Do curly processing and see if rest matches. # OPEN,CLOSE,GROUPP ...are numbered at compile time. OPEN num 1 Mark this point in input as start of #n. CLOSE num 1 Analogous to OPEN. REF num 1 Match some already matched string REFF num 1 Match already matched string, folded REFFL num 1 Match already matched string, folded in loc. # grouping assertions IFMATCH off 1 2 Succeeds if the following matches. UNLESSM off 1 2 Fails if the following matches. SUSPEND off 1 1 "Independent" sub-RE. IFTHEN off 1 1 Switch, should be preceeded by switcher . GROUPP num 1 Whether the group matched. # Support for long RE LONGJMP off 1 1 Jump far away. BRANCHJ off 1 1 BRANCH with long offset. # The heavy worker EVAL evl 1 Execute some Perl code. # Modifiers MINMOD no Next operator is not greedy. LOGICAL no Next opcode should set the flag only. # This is not used yet RENUM off 1 1 Group with independently numbered parens. 
# This is not really a node, but an optimized away piece of a "long" node. # To simplify debugging output, we mark it as if it were a node OPTIMIZED off Placeholder for dump. Run-time output First of all, when doing a match, one may get no run-time output even if debugging is enabled. this means that the RE engine was never entered, all of the job was done by the optimizer. If RE engine was entered, the output may look like this: Matching `[bc]d(ef*g)+h[ij]k$' against `abcdefg__gh__' Setting an EVAL scope, savestack=3 2 | 1: ANYOF 3 | 11: EXACT 4 | 13: CURLYX {1,32767} 4 | 26: WHILEM 0 out of 1..32767 cc=effff31c 4 | 15: OPEN1 4 | 17: EXACT 5 | 19: STAR EXACT can match 1 times out of 32767... Setting an EVAL scope, savestack=3 6 | 22: EXACT 7 <__gh__> | 24: CLOSE1 7 <__gh__> | 26: WHILEM 1 out of 1..32767 cc=effff31c Setting an EVAL scope, savestack=12 7 <__gh__> | 15: OPEN1 7 <__gh__> | 17: EXACT restoring \1 to 4(4)..7 failed, try continuation... 7 <__gh__> | 27: NOTHING 7 <__gh__> | 28: EXACT failed... failed... The most significant information in the output is about the particular *node* of the compiled RE which is currently being tested against the target string. The format of these lines is ` '*STRING-OFFSET* <*PRE-STRING*> <*POST-STRING*> |*ID*: *TYPE* The *TYPE* info is indented with respect to the backtracking level. Other incidental information appears interspersed within. perldelta section NAME perldelta - what's new for perl5.005 DESCRIPTION This document describes differences between the 5.004 release and this one. About the new versioning system Perl is now developed on two tracks: a maintenance track that makes small, safe updates to released production versions with emphasis on compatibility; and a development track that pursues more aggressive evolution. 
Maintenance releases (which should be considered production quality) have subversion numbers that run from `1' to `49', and development releases (which should be considered "alpha" quality) run from `50' to `99'. Perl 5.005 is the combined product of the new dual-track development scheme. Incompatible Changes WARNING: This version is not binary compatible with Perl 5.004. Starting with Perl 5.004_50 there were many deep and far-reaching changes to the language internals. If you have dynamically loaded extensions that you built under perl 5.003 or 5.004, you can continue to use them with 5.004, but you will need to rebuild and reinstall those extensions to use them with 5.005. See the INSTALL manpage for detailed instructions on how to upgrade. Default installation structure has changed The new Configure defaults are designed to allow a smooth upgrade from 5.004 to 5.005, but you should read the INSTALL manpage for a detailed discussion of the changes in order to adapt them to your system. Perl Source Compatibility When none of the experimental features are enabled, there should be very few user-visible Perl source compatibility issues. If threads are enabled, then some caveats apply. `@_' and `$_' become lexical variables. The effect of this should be largely transparent to the user, but there are some boundary conditions under which the user will need to be aware of the issues. For example, `local(@_)' results in a "Can't localize lexical variable @_ ..." message. This may be enabled in a future version. Some new keywords have been introduced. These are generally expected to have very little impact on compatibility. See the section on "New `INIT' keyword", the section on "New `lock' keyword", and the section on "New `qr//' operator". Certain barewords are now reserved. Use of these will provoke a warning if you have asked for them with the `-w' switch. See the section on "`our' is now a reserved word".
C Source Compatibility There have been a large number of changes in the internals to support the new features in this release. Core sources now require ANSI C compiler An ANSI C compiler is now required to build perl. See INSTALL. All Perl global variables must now be referenced with an explicit prefix All Perl global variables that are visible for use by extensions now have a `PL_' prefix. New extensions should *not* refer to perl globals by their unqualified names. To preserve sanity, we provide limited backward compatibility for globals that are being widely used like `sv_undef' and `na' (which should now be written as `PL_sv_undef', `PL_na' etc.) If you find that your XS extension does not compile anymore because a perl global is not visible, try adding a `PL_' prefix to the global and rebuild. It is strongly recommended that all functions in the Perl API that don't begin with `perl' be referenced with a `Perl_' prefix. The bare function names without the `Perl_' prefix are supported with macros, but this support may cease in a future release. See the section on "API LISTING" in the perlguts manpage. Enabling threads has source compatibility issues Perl built with threading enabled requires extensions to use the new `dTHR' macro to initialize the handle to access per-thread data. If you see a compiler error that talks about the variable `thr' not being declared (when building a module that has XS code), you need to add `dTHR;' at the beginning of the block that elicited the error. The API function `perl_get_sv("@",FALSE)' should be used instead of directly accessing perl globals as `GvSV(errgv)'. The API call is backward compatible with existing perls and provides source compatibility when threading is enabled. See the section on "C Source Compatibility" for more information. Binary Compatibility This version is NOT binary compatible with older versions. All extensions will need to be recompiled.
Further, binaries built with threads enabled are incompatible with binaries built without. This should largely be transparent to the user, as all binary incompatible configurations have their own unique architecture name, and extension binaries get installed at unique locations. This allows coexistence of several configurations in the same directory hierarchy. See INSTALL. Security fixes may affect compatibility A few taint leaks and taint omissions have been corrected. This may lead to "failure" of scripts that used to work with older versions. Compiling with -DINCOMPLETE_TAINTS provides a perl with minimal amounts of changes to the tainting behavior. But note that the resulting perl will have known insecurities. Oneliners with the `-e' switch do not create temporary files anymore. Relaxed new mandatory warnings introduced in 5.004 Many new warnings that were introduced in 5.004 have been made optional. Some of these warnings are still present, but perl's new features make them less often a problem. See the section on "New Diagnostics". Licensing Perl has a new Social Contract for contributors. See Porting/Contract. The license included in much of the Perl documentation has changed. Most of the Perl documentation was previously under the implicit GNU General Public License or the Artistic License (at the user's choice). Now much of the documentation unambiguously states the terms under which it may be distributed. Those terms are in general much less restrictive than the GNU GPL. See the perl manpage and the individual perl man pages listed therein. Core Changes Threads WARNING: Threading is considered an experimental feature. Details of the implementation may change without notice. There are known limitations and some bugs. These are expected to be fixed in future versions. See the README.threads manpage. Mach cthreads (NEXTSTEP, OPENSTEP, Rhapsody) are now supported by the Thread extension.
Compiler WARNING: The Compiler and related tools are considered experimental. Features may change without notice, and there are known limitations and bugs. Since the compiler is fully external to perl, the default configuration will build and install it. The Compiler produces three different types of transformations of a perl program. The C backend generates C code that captures perl's state just before execution begins. It eliminates the compile-time overheads of the regular perl interpreter, but the run-time performance remains comparatively the same. The CC backend generates optimized C code equivalent to the code path at run-time. The CC backend has greater potential for big optimizations, but only a few optimizations are implemented currently. The Bytecode backend generates a platform independent bytecode representation of the interpreter's state just before execution. Thus, the Bytecode back end also eliminates much of the compilation overhead of the interpreter. The compiler comes with several valuable utilities. `B::Lint' is an experimental module to detect and warn about suspicious code, especially the cases that the `-w' switch does not detect. `B::Deparse' can be used to demystify perl code, and understand how perl optimizes certain constructs. `B::Xref' generates cross reference reports of all definitions and uses of variables, subroutines and formats in a program. `B::Showlex' shows the lexical variables used by a subroutine or file at a glance. `perlcc' is a simple frontend for compiling perl. See `ext/B/README', the section on "B", and the respective compiler modules. Regular Expressions Perl's regular expression engine has been seriously overhauled, and many new constructs are supported. Several bugs have been fixed.
Here is an itemized summary: Many new and improved optimizations Changes in the RE engine: Unneeded nodes removed; Substrings merged together; New types of nodes to process (SUBEXPR)* and similar expressions quickly, used if the SUBEXPR has no side effects and matches strings of the same length; Better optimizations by lookup for constant substrings; Better search for constant substrings anchored by $ ; Changes in Perl code using RE engine: More optimizations to s/longer/short/; study() was not working; /blah/ may be optimized to an analogue of index() if $& $` $' not seen; Unneeded copying of matched-against string removed; Only matched part of the string is copied if $` $' were not seen; Many bug fixes Note that only the major bug fixes are listed here. See Changes for others. Backtracking might not restore start of $3. No feedback if max count for * or + on "complex" subexpression was reached, similarly (but at compile time) for {3,34567} Primitive restrictions on max count introduced to decrease a possibility of a segfault; (ZERO-LENGTH)* could segfault; (ZERO-LENGTH)* was prohibited; Long REs were not allowed; /RE/g could skip matches at the same position after a zero-length match; New regular expression constructs The following new syntax elements are supported: (?<=RE) (?<!RE) (?{ CODE }) (?i-x) (?i:RE) (?(COND)YES_RE|NO_RE) (?>RE) \z New operator for precompiled regular expressions See the section on "New `qr//' operator". Other improvements Better debugging output (possibly with colors), even from non-debugging Perl; RE engine code now looks like C, not like assembler; Behaviour of RE modifiable by `use re' directive; Improved documentation; Test suite significantly extended; Syntax [:^upper:] etc., reserved inside character classes; Incompatible changes (?i) localized inside enclosing group; $( is not interpolated into RE any more; /RE/g may match at the same position (with non-zero length) after a zero-length match (bug fix). See the perlre manpage and the perlop manpage.
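A few of the regular expression changes listed above can be tried with a short script. This is only an illustrative sketch: the sample strings and variable names are made up for the example, and it exercises just the `(?<=RE)' lookbehind, the `\z' anchor, and the group-local scope of `(?i)'.

```perl
use strict;
use warnings;

# (?<=RE): zero-width lookbehind; capture digits only when preceded by "$".
my ($price) = 'cost: $42' =~ /(?<=\$)(\d+)/;

# \z matches only at the very end of the string, whereas $ also
# matches just before a trailing newline.
my $dollar_ok = ("line\n" =~ /line$/)  ? 1 : 0;
my $z_ok      = ("line\n" =~ /line\z/) ? 1 : 0;

# (?i) is local to its enclosing group, so the "B" outside the group
# is still matched case-sensitively and this match fails.
my $scoped = ('ab' =~ /((?i)A)B/) ? 1 : 0;

print "price=$price dollar=$dollar_ok z=$z_ok scoped=$scoped\n";
```

The last case is the one most likely to affect older code: a pattern that relied on `(?i)' leaking out of a group needs the modifier moved outside the parentheses.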
Improved malloc() See banner at the beginning of `malloc.c' for details. Quicksort is internally implemented Perl now contains its own highly optimized qsort() routine. The new qsort() is resistant to inconsistent comparison functions, so Perl's `sort()' will not provoke coredumps any more when given poorly written sort subroutines. (Some C library `qsort()'s that were being used before used to have this problem.) In our testing, the new `qsort()' required the minimal number of pair-wise compares on average, among all known `qsort()' implementations. See `perlfunc/sort'. Reliable signals Perl's signal handling is susceptible to random crashes, because signals arrive asynchronously, and the Perl runtime is not reentrant at arbitrary times. However, one experimental implementation of reliable signals is available when threads are enabled. See `Thread::Signal'. Also see INSTALL for how to build a Perl capable of threads. Reliable stack pointers The internals now reallocate the perl stack only at predictable times. In particular, magic calls never trigger reallocations of the stack, because all reentrancy of the runtime is handled using a "stack of stacks". This should improve reliability of cached stack pointers in the internals and in XSUBs. More generous treatment of carriage returns Perl used to complain if it encountered literal carriage returns in scripts. Now they are mostly treated like whitespace within program text. Inside string literals and here documents, literal carriage returns are ignored if they occur paired with linefeeds, or get interpreted as whitespace if they stand alone. This behavior means that literal carriage returns in files should be avoided. You can get the older, more compatible (but less generous) behavior by defining the preprocessor symbol `PERL_STRICT_CR' when building perl. Of course, all this has nothing whatever to do with how escapes like `\r' are handled within strings. 
Note that this doesn't somehow magically allow you to keep all text files in DOS format. The generous treatment only applies to files that perl itself parses. If your C compiler doesn't allow carriage returns in files, you may still be unable to build modules that need a C compiler. Memory leaks `substr', `pos' and `vec' don't leak memory anymore when used in lvalue context. Many small leaks that impacted applications that embed multiple interpreters have been fixed. Better support for multiple interpreters The build-time option `-DMULTIPLICITY' has had many of the details reworked. Some previously global variables that should have been per-interpreter now are. With care, this allows interpreters to call each other. See the `PerlInterp' extension on CPAN. Behavior of local() on array and hash elements is now well-defined See the section on "Temporary Values via local()" in the perlsub manpage. `%!' is transparently tied to the Errno manpage module See the perlvar manpage, and the Errno manpage. Pseudo-hashes are supported See the perlref manpage. `EXPR foreach EXPR' is supported See the perlsyn manpage. Keywords can be globally overridden See the perlsub manpage. `$^E' is meaningful on Win32 See the perlvar manpage. `foreach (1..1000000)' optimized `foreach (1..1000000)' is now optimized into a counting loop. It does not try to allocate a 1000000-size list anymore. `Foo::' can be used as implicitly quoted package name Barewords caused unintuitive behavior when a subroutine with the same name as a package happened to be defined. Thus, `new Foo @args' would use the result of the call to `Foo()' instead of `Foo' being treated as a literal. The recommended way to write barewords in the indirect object slot is `new Foo:: @args'. Note that the method `new()' is called with a first argument of `Foo', not `Foo::' when you do that.
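The `Foo::' form can be seen in a small sketch. The package `Foo' and its constructor below are hypothetical, written only to show which class name `new()' receives.

```perl
use strict;
use warnings;

package Foo;

# Constructor records the class name it was invoked with.
sub new {
    my ($class, @args) = @_;
    return bless { class => $class, args => [@args] }, $class;
}

package main;

my @args = (1, 2, 3);

# The trailing "::" marks Foo as a package name, so this is a method
# call on package Foo rather than a call to a subroutine named Foo().
my $obj = new Foo:: @args;

print "class=$obj->{class} args=@{ $obj->{args} }\n";
```

As the text says, the constructor sees the plain name `Foo' as its first argument, so existing classes need no change to be called this way.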
`exists $Foo::{Bar::}' tests existence of a package It was impossible to test for the existence of a package without actually creating it before. Now `exists $Foo::{Bar::}' can be used to test if the `Foo::Bar' namespace has been created. Better locale support See the perllocale manpage. Experimental support for 64-bit platforms Perl5 has always had 64-bit support on systems with 64-bit longs. Starting with 5.005, the beginnings of experimental support for systems with 32-bit long and 64-bit 'long long' integers have been added. If you add -DUSE_LONG_LONG to your ccflags in config.sh (or manually define it in perl.h) then perl will be built with 'long long' support. There will be many compiler warnings, and the resultant perl may not work on all systems. There are many other issues related to third-party extensions and libraries. This option exists to allow people to work on those issues. prototype() returns useful results on builtins See the "prototype" entry in the perlfunc manpage. Extended support for exception handling `die()' now accepts a reference value, and `$@' gets set to that value in exception traps. This makes it possible to propagate exception objects. This is an undocumented experimental feature. Re-blessing in DESTROY() supported for chaining DESTROY() methods See the "Destructors" entry in the perlobj manpage. All `printf' format conversions are handled internally See the "printf" entry in the perlfunc manpage. New `INIT' keyword `INIT' subs are like `BEGIN' and `END', but they get run just before the perl runtime begins execution. For example, the Perl Compiler makes use of `INIT' blocks to initialize and resolve pointers to XSUBs. New `lock' keyword The `lock' keyword is the fundamental synchronization primitive in threaded perl. When threads are not enabled, it is currently a noop. To minimize impact on source compatibility this keyword is "weak", i.e., any user-defined subroutine of the same name overrides it, unless a `use Thread' has been seen.
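Two of the items above, the package existence test and `die()' with a reference, can be sketched together. The class `My::Error' and its fields are hypothetical, and the stash key is written as an explicitly quoted string (`'Error::'') rather than the bareword form shown in the heading.

```perl
use strict;
use warnings;

package My::Error;    # hypothetical exception class for this sketch

sub new     { my ($class, %f) = @_; return bless {%f}, $class; }
sub message { return $_[0]{message}; }

package main;

# die() with a reference: $@ holds the object itself, not a string.
eval { die My::Error->new(message => 'disk full', code => 28) };
my $err    = $@;
my $caught = (ref($err) && $err->isa('My::Error')) ? $err->message : 'unknown';
print "caught: $caught\n";

# Test whether a namespace exists without creating it as a side effect.
my $has_error_pkg = exists $My::{'Error::'} ? 1 : 0;
my $has_other_pkg = exists $My::{'Nope::'}  ? 1 : 0;
print "My::Error=$has_error_pkg My::Nope=$has_other_pkg\n";
```

Copying `$@' into a lexical straight after the `eval' is a defensive habit, since later code may reset it before the handler runs.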
New `qr//' operator The `qr//' operator, which is syntactically similar to the other quote-like operators, is used to create precompiled regular expressions. This compiled form can now be explicitly passed around in variables, and interpolated in other regular expressions. See the perlop manpage. `our' is now a reserved word Calling a subroutine with the name `our' will now provoke a warning when using the `-w' switch. Tied arrays are now fully supported See the Tie::Array manpage. Tied handles support is better Several missing hooks have been added. There is also a new base class for TIEARRAY implementations. See the Tie::Array manpage. 4th argument to substr substr() can now both return and replace in one operation. The optional 4th argument is the replacement string. See the "substr" entry in the perlfunc manpage. Negative LENGTH argument to splice splice() with a negative LENGTH argument now works similarly to what the LENGTH did for substr(). Previously a negative LENGTH was treated as 0. See the "splice" entry in the perlfunc manpage. Magic lvalues are now more magical When you say something like `substr($x, 5) = "hi"', the scalar returned by substr() is special, in that any modifications to it affect $x. (This is called a 'magic lvalue' because an 'lvalue' is something on the left side of an assignment.) Normally, this is exactly what you would expect to happen, but Perl uses the same magic if you use substr(), pos(), or vec() in a context where they might be modified, like taking a reference with `\' or as an argument to a sub that modifies `@_'. In previous versions, this 'magic' only went one way, but now changes to the scalar the magic refers to ($x in the above example) affect the magic lvalue too. For instance, this code now acts differently: $x = "hello"; sub printit { $x = "g'bye"; print $_[0], "\n"; } printit(substr($x, 0, 5)); In previous versions, this would print "hello", but it now prints "g'bye".
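The `qr//' operator, the 4-argument substr() and the negative LENGTH for splice() described in this part of the list can be tried in a few lines. The strings and variable names are invented for illustration; this is a sketch, not an excerpt from the Perl test suite.

```perl
use strict;
use warnings;

# qr// builds a precompiled pattern that can be stored in a variable
# and interpolated into a larger pattern.
my $word = qr/[a-z]+/i;
my $pair = qr/($word)=($word)/;
my ($k, $v) = 'Colour=Blue' =~ $pair;

# 4-argument substr: returns the old text and replaces it in one step.
my $s   = 'Hello, world';
my $old = substr($s, 0, 5, 'Howdy');

# Negative LENGTH in splice: remove from the offset onward, but leave
# that many elements at the end of the array.
my @a       = (1 .. 6);
my @removed = splice(@a, 1, -2);

print "$k=$v | $old -> $s | removed=@removed left=@a\n";
```

Here `splice(@a, 1, -2)' removes the elements from index 1 up to, but not including, the last two, which mirrors how a negative LENGTH behaves in substr().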
<> now reads in records If `$/' is a reference to an integer, or a scalar that holds an integer, <> will read in records instead of lines. For more info, see the section on "$/" in the perlvar manpage. pack() format 'Z' supported The new format type 'Z' is useful for packing and unpacking null-terminated strings. See the section on "pack" in the perlfunc manpage. Significant bug fixes on empty files With `$/' set to `undef', slurping an empty file returns a string of zero length (instead of `undef', as it used to) for the first time the HANDLE is read. Subsequent reads yield `undef'. This means that the following will append "foo" to an empty file (it used to not do anything before): perl -0777 -pi -e 's/^/foo/' empty_file Note that the behavior of: perl -pi -e 's/^/foo/' empty_file is unchanged (it continues to leave the file empty). Supported Platforms Configure has many incremental improvements. Site-wide policy for building perl can now be made persistent, via Policy.sh. Configure also records the command-line arguments used in config.sh. New Platforms BeOS is now supported. See the README.beos manpage. DOS is now supported under the DJGPP tools. See the README.dos manpage. GNU/Hurd is now supported. MiNT is now supported. See the README.mint manpage. MPE/iX is now supported. See the README.mpeix manpage. MVS (aka OS390, aka Open Edition) is now supported. See the README.os390 manpage. Stratus VOS is now supported. See the README.vos manpage. Changes in existing support Win32 support has been vastly enhanced. Support for Perl Object, a C++ encapsulation of Perl. GCC and EGCS are now supported on Win32. See README.win32, aka the perlwin32 manpage. VMS configuration system has been rewritten. See the README.vms manpage. The hints files for most Unix platforms have seen incremental improvements. Modules and Pragmata New Modules B Perl compiler and tools. See the section on "B". Data::Dumper A module to pretty print Perl data. See the Data::Dumper manpage.
Dumpvalue A module to dump perl values to the screen. See the Dumpvalue manpage. Errno A module to look up errors more conveniently. See the Errno manpage. File::Spec A portable API for file operations. ExtUtils::Installed Query and manage installed modules. ExtUtils::Packlist Manipulate .packlist files. Fatal Make functions/builtins succeed or die. IPC::SysV Constants and other support infrastructure for System V IPC operations in perl. Test A framework for writing testsuites. Tie::Array Base class for tied arrays. Tie::Handle Base class for tied handles. Thread Perl thread creation, manipulation, and support. attrs Set subroutine attributes. fields Compile-time class fields. re Various pragmata to control behavior of regular expressions. Changes in existing modules Benchmark You can now run tests for *n* seconds instead of guessing the right number of tests to run: e.g. timethese(-5, ...) will run each of the codes for at least 5 CPU seconds. Zero as the "number of repetitions" means "for at least 3 CPU seconds". The output format has also changed. For example: use Benchmark;$x=3;timethese(-5,{a=>sub{$x*$x},b=>sub{$x**2}}) will now output something like this: Benchmark: running a, b, each for at least 5 CPU seconds... a: 5 wallclock secs ( 5.77 usr + 0.00 sys = 5.77 CPU) @ 200551.91/s (n=1156516) b: 4 wallclock secs ( 5.00 usr + 0.02 sys = 5.02 CPU) @ 159605.18/s (n=800686) New features: "each for at least N CPU seconds...", "wallclock secs", and the "@ operations/CPU second (n=operations)". Carp Carp has a new function cluck(). cluck() warns, like carp(), but also adds a stack backtrace to the error message, like confess(). CGI CGI has been updated to version 2.42. 
Fcntl More Fcntl constants added: F_SETLK64, F_SETLKW64, O_LARGEFILE for large (more than 4G) file access (the 64-bit support is not yet working, though, so no need to get overly excited), Free/Net/OpenBSD locking behaviour flags F_FLOCK, F_POSIX, Linux F_SHLCK, and O_ACCMODE: the mask of O_RDONLY, O_WRONLY, and O_RDWR. Math::Complex The accessor methods Re, Im, arg, abs, rho, and theta, can now also act as mutators (accessor $z->Re(), mutator $z->Re(3)). Math::Trig A little bit of radial trigonometry (cylindrical and spherical) added: radial coordinate conversions and the great circle distance. POSIX POSIX now has its own platform-specific hints files. DB_File DB_File supports version 2.x of Berkeley DB. See `ext/DB_File/Changes'. MakeMaker MakeMaker now supports writing empty makefiles, provides a way to specify that site umask() policy should be honored. There is also better support for manipulation of .packlist files, and getting information about installed modules. Extensions that have both architecture-dependent and architecture- independent files are now always installed completely in the architecture-dependent locations. Previously, the shareable parts were shared both across architectures and across perl versions and were therefore liable to be overwritten with newer versions that might have subtle incompatibilities. CPAN See and the CPAN manpage. Cwd Cwd::cwd is faster on most platforms. Benchmark Keeps better time. Utility Changes `h2ph' and related utilities have been vastly overhauled. `perlcc', a new experimental front end for the compiler is available. The crude GNU `configure' emulator is now called `configure.gnu' to avoid trampling on `Configure' under case-insensitive filesystems. `perldoc' used to be rather slow. The slower features are now optional. In particular, case-insensitive searches need the `-i' switch, and recursive searches need `-r'. You can set these switches in the `PERLDOC' environment variable to get the old behavior. 
Documentation Changes Config.pm now has a glossary of variables. Porting/patching.pod has detailed instructions on how to create and submit patches for perl. the perlport manpage specifies guidelines on how to write portably. the perlmodinstall manpage describes how to fetch and install modules from `CPAN' sites. Some more Perl traps are documented now. See the perltrap manpage. the perlopentut manpage gives a tutorial on using open(). the perlreftut manpage gives a tutorial on references. the perlthrtut manpage gives a tutorial on threads. New Diagnostics Ambiguous call resolved as CORE::%s(), qualify as such or use & (W) A subroutine you have declared has the same name as a Perl keyword, and you have used the name without qualification for calling one or the other. Perl decided to call the builtin because the subroutine is not imported. To force interpretation as a subroutine call, either put an ampersand before the subroutine name, or qualify the name with its package. Alternatively, you can import the subroutine (or pretend that it's imported with the `use subs' pragma). To silently interpret it as the Perl operator, use the `CORE::' prefix on the operator (e.g. `CORE::log($x)') or by declaring the subroutine to be an object method (see the attrs manpage). Bad index while coercing array into hash (F) The index looked up in the hash found as the 0'th element of a pseudo-hash is not legal. Index values must be at 1 or greater. See the perlref manpage. Bareword "%s" refers to nonexistent package (W) You used a qualified bareword of the form `Foo::', but the compiler saw no other uses of that namespace before that point. Perhaps you need to predeclare a package? Can't call method "%s" on an undefined value (F) You used the syntax of a method call, but the slot filled by the object reference or package name contains an undefined value. 
Something like this will reproduce the error: $BADREF = 42; process $BADREF 1,2,3; $BADREF->process(1,2,3); Can't check filesystem of script "%s" for nosuid (P) For some reason you can't check the filesystem of the script for nosuid. Can't coerce array into hash (F) You used an array where a hash was expected, but the array has no information on how to map from keys to array indices. You can do that only with arrays that have a hash reference at index 0. Can't goto subroutine from an eval-string (F) The "goto subroutine" call can't be used to jump out of an eval "string". (You can use it to jump out of an eval {BLOCK}, but you probably don't want to.) Can't localize pseudo-hash element (F) You said something like `local $ar->{'key'}', where $ar is a reference to a pseudo-hash. That hasn't been implemented yet, but you can get a similar effect by localizing the corresponding array element directly -- `local $ar->[$ar->[0]{'key'}]'. Can't use %%! because Errno.pm is not available (F) The first time the %! hash is used, perl automatically loads the Errno.pm module. The Errno module is expected to tie the %! hash to provide symbolic names for `$!' errno values. Cannot find an opnumber for "%s" (F) A string of a form `CORE::word' was given to prototype(), but there is no builtin with the name `word'. Character class syntax [. .] is reserved for future extensions (W) Within regular expression character classes ([]) the syntax beginning with "[." and ending with ".]" is reserved for future extensions. If you need to represent those character sequences inside a regular expression character class, just quote the square brackets with the backslash: "\[." and ".\]". Character class syntax [: :] is reserved for future extensions (W) Within regular expression character classes ([]) the syntax beginning with "[:" and ending with ":]" is reserved for future extensions. 
If you need to represent those character sequences inside a regular expression character class, just quote the square brackets with the backslash: "\[:" and ":\]". Character class syntax [= =] is reserved for future extensions (W) Within regular expression character classes ([]) the syntax beginning with "[=" and ending with "=]" is reserved for future extensions. If you need to represent those character sequences inside a regular expression character class, just quote the square brackets with the backslash: "\[=" and "=\]". %s: Eval-group in insecure regular expression (F) Perl detected tainted data when trying to compile a regular expression that contains the `(?{ ... })' zero-width assertion, which is unsafe. See the section on "(?{ code })" in the perlre manpage, and the perlsec manpage. %s: Eval-group not allowed, use re 'eval' (F) A regular expression contained the `(?{ ... })' zero-width assertion, but that construct is only allowed when the `use re 'eval'' pragma is in effect. See the section on "(?{ code })" in the perlre manpage. %s: Eval-group not allowed at run time (F) Perl tried to compile a regular expression containing the `(?{ ... })' zero-width assertion at run time, as it would when the pattern contains interpolated values. Since that is a security risk, it is not allowed. If you insist, you may still do this by explicitly building the pattern from an interpolated string at run time and using that in an eval(). See the section on "(?{ code })" in the perlre manpage. Explicit blessing to '' (assuming package main) (W) You are blessing a reference to a zero length string. This has the effect of blessing the reference into the package main. This is usually not what you want. Consider providing a default target package, e.g. bless($ref, $p || 'MyPackage'); Illegal hex digit ignored (W) You may have tried to use a character other than 0 - 9 or A - F in a hexadecimal number. Interpretation of the hexadecimal number stopped before the illegal character. 
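The "Illegal hex digit ignored" entry above says that interpretation stops before the offending character. The following is a minimal sketch of what that looks like in practice, and of one way to reject bad input up front instead. The helper name parse_hex is hypothetical, and the exact wording of the warning differs between Perl versions, so the sketch only stores whatever text was emitted.

```perl
use strict;
use warnings;

# Collect warnings in an array instead of letting them go to STDERR.
my @warnings;
my $partial;
{
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    $partial = hex('1G');    # 'G' is not a hex digit; only the "1" is used
}

# Hypothetical helper: return the value of a hex string, or undef if the
# string contains anything other than hex digits (optional 0x prefix).
sub parse_hex {
    my ($text) = @_;
    return undef
        unless defined $text && $text =~ /\A(?:0[xX])?([0-9a-fA-F]+)\z/;
    return hex($1);
}

my $good = parse_hex('0x1F');
my $bad  = parse_hex('1G');

print "partial value: $partial\n";
print "warnings seen: ", scalar(@warnings), "\n";
print "good: ", (defined $good ? $good : 'rejected'), "\n";
print "bad: ",  (defined $bad  ? $bad  : 'rejected'), "\n";
```

Validating first keeps a typo in the input from silently turning into a smaller number.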
No such array field (F) You tried to access an array as a hash, but the field name used is not defined. The hash at index 0 should map all valid field names to array indices for that to work. No such field "%s" in variable %s of type %s (F) You tried to access a field of a typed variable where the type does not know about the field name. The field names are looked up in the %FIELDS hash in the type package at compile time. The %FIELDS hash is usually set up with the 'fields' pragma. Out of memory during ridiculously large request (F) You can't allocate more than 2^31+"small amount" bytes. This error is most likely to be caused by a typo in the Perl program. e.g., `$arr[time]' instead of `$arr[$time]'. Range iterator outside integer range (F) One (or both) of the numeric arguments to the range operator ".." are outside the range which can be represented by integers internally. One possible workaround is to force Perl to use magical string increment by prepending "0" to your numbers. Recursive inheritance detected while looking for method '%s' in package '%s' (F) More than 100 levels of inheritance were encountered while invoking a method. Probably indicates an unintended loop in your inheritance hierarchy. Reference found where even-sized list expected (W) You gave a single reference where Perl was expecting a list with an even number of elements (for assignment to a hash). This usually means that you used the anon hash constructor when you meant to use parens. In any case, a hash requires key/value pairs. %hash = { one => 1, two => 2, }; # WRONG %hash = [ qw/ an anon array / ]; # WRONG %hash = ( one => 1, two => 2, ); # right %hash = qw( one 1 two 2 ); # also fine Undefined value assigned to typeglob (W) An undefined value was assigned to a typeglob, a la `*foo = undef'. This does nothing. It's possible that you really mean `undef *foo'. Use of reserved word "%s" is deprecated (D) The indicated bareword is a reserved word. 
Future versions of perl may use it as a keyword, so you're better off either explicitly quoting the word in a manner appropriate for its context of use, or using a different name altogether. The warning can be suppressed for subroutine names by either adding a `&' prefix, or using a package qualifier, e.g. `&our()', or `Foo::our()'. perl: warning: Setting locale failed. (S) The whole warning message will look something like: perl: warning: Setting locale failed. perl: warning: Please check that your locale settings: LC_ALL = "En_US", LANG = (unset) are supported and installed on your system. perl: warning: Falling back to the standard locale ("C"). Exactly which locale settings failed varies. In the above the settings were that the LC_ALL was "En_US" and the LANG had no value. This error means that Perl detected that you and/or your system administrator have set up the so-called locale system but Perl could not use those settings. This was not dead serious, fortunately: there is a "default locale" called "C" that Perl can and will use, so the script will still be run. Until you really fix the problem, however, you will get the same error message each time you run Perl. How to really fix the problem can be found in the section on "LOCALE PROBLEMS" in the perllocale manpage. Obsolete Diagnostics Can't mktemp() (F) The mktemp() routine failed for some reason while trying to process a -e switch. Maybe your /tmp partition is full, or clobbered. Removed because -e doesn't use temporary files any more. Can't write to temp file for -e: %s (F) The write routine failed for some reason while trying to process a -e switch. Maybe your /tmp partition is full, or clobbered. Removed because -e doesn't use temporary files any more. Cannot open temporary file (F) The create routine failed for some reason while trying to process a -e switch. Maybe your /tmp partition is full, or clobbered. Removed because -e doesn't use temporary files any more.
regexp too big (F) The current implementation of regular expressions uses shorts as address offsets within a string. Unfortunately this means that if the regular expression compiles to longer than 32767, it'll blow up. Usually when you want a regular expression this big, there is a better way to do it with multiple statements. See the perlre manpage. Configuration Changes You can use "Configure -Uinstallusrbinperl" which causes installperl to skip installing perl also as /usr/bin/perl. This is useful if you prefer not to modify /usr/bin for some reason or another but harmful because many scripts expect to find Perl in /usr/bin/perl. BUGS If you find what you think is a bug, you might check the headers of recently posted articles in the comp.lang.perl.misc newsgroup. There may also be information at http://www.perl.com/perl/, the Perl Home Page. If you believe you have an unreported bug, please run the perlbug program included with your release. Make sure you trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of `perl -V', will be sent off to be analysed by the Perl porting team. SEE ALSO The Changes file for exhaustive details on what changed. The INSTALL file for how to build Perl. The README file for general stuff. The Artistic and Copying files for copyright information. HISTORY Written by Gurusamy Sarathy, with many contributions from The Perl Porters. Send omissions or corrections to . perldiag section NAME perldiag - various Perl diagnostics DESCRIPTION These messages are classified as follows (listed in increasing order of desperation): (W) A warning (optional). (D) A deprecation (optional). (S) A severe warning (mandatory). (F) A fatal error (trappable). (P) An internal error you should never see (trappable). (X) A very fatal error (nontrappable). (A) An alien error message (not generated by Perl). Optional warnings are enabled by using the -w switch.
Warnings may be captured by setting `$SIG{__WARN__}' to a reference to a routine that will be called on each warning instead of printing it. See the perlvar manpage. Trappable errors may be trapped using the eval operator. See the "eval" entry in the perlfunc manpage. Some of these messages are generic. Spots that vary are denoted with a %s, just as in a printf format. Note that some messages start with a %s! The symbols `"%(-?@' sort before the letters, while `[' and `\' sort after. "my" variable %s can't be in a package (F) Lexically scoped variables aren't in a package, so it doesn't make sense to try to declare one with a package qualifier on the front. Use local() if you want to localize a package variable. "my" variable %s masks earlier declaration in same %s (W) A lexical variable has been redeclared in the current scope or statement, effectively eliminating all access to the previous instance. This is almost always a typographical error. Note that the earlier variable will still exist until the end of the scope or until all closure referents to it are destroyed. "no" not allowed in expression (F) The "no" keyword is recognized and executed at compile time, and returns no useful value. See the perlmod manpage. "use" not allowed in expression (F) The "use" keyword is recognized and executed at compile time, and returns no useful value. See the perlmod manpage. % may only be used in unpack (F) You can't pack a string by supplying a checksum, because the checksumming process loses information, and you can't go the other way. See the "unpack" entry in the perlfunc manpage. %s (...) interpreted as function (W) You've run afoul of the rule that says that any list operator followed by parentheses turns into a function, with all the list operators arguments found inside the parentheses. See the section on "Terms and List Operators (Leftward)" in the perlop manpage. 
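The rule behind "%s (...) interpreted as function" is easiest to see by comparing two spellings of the same call. This is a small sketch using join(); the variable names are only for illustration. In the first line the space before the parenthesis does not stop Perl from treating the parenthesized part as the complete argument list.

```perl
use strict;
use warnings;

# join followed by parentheses becomes a function call, so it sees only
# ('-', 1, 2); the trailing 3 is a separate element of the outer list.
my @split_args = (join ('-', 1, 2), 3);

# Wrapping every argument in the call's own parentheses gives one string.
my @one_call = (join('-', (1, 2), 3));

print "split: @split_args\n";
print "one call: @one_call\n";
```

The same thing happens with print: `print (1+2)*3' prints 3, and the multiplication is applied to print's return value.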
%s argument is not a HASH element (F) The argument to exists() must be a hash element, such as $foo{$bar} $ref->[12]->{"susie"} %s argument is not a HASH element or slice (F) The argument to delete() must be either a hash element, such as $foo{$bar} $ref->[12]->{"susie"} or a hash slice, such as @foo{$bar, $baz, $xyzzy} @{$ref->[12]}{"susie", "queue"} %s did not return a true value (F) A required (or used) file must return a true value to indicate that it compiled correctly and ran its initialization code correctly. It's traditional to end such a file with a "1;", though any true value would do. See the "require" entry in the perlfunc manpage. %s found where operator expected (S) The Perl lexer knows whether to expect a term or an operator. If it sees what it knows to be a term when it was expecting to see an operator, it gives you this warning. Usually it indicates that an operator or delimiter was omitted, such as a semicolon. %s had compilation errors (F) The final summary message when a `perl -c' fails. %s has too many errors (F) The parser has given up trying to parse the program after 10 errors. Further error messages would likely be uninformative. %s matches null string many times (W) The pattern you've specified would be an infinite loop if the regular expression engine didn't specifically check for that. See the perlre manpage. %s never introduced (S) The symbol in question was declared but somehow went out of scope before it could possibly have been used. %s syntax OK (F) The final summary message when a `perl -c' succeeds. %s: Command not found (A) You've accidentally run your script through csh instead of Perl. Check the #! line, or manually feed your script into Perl yourself. %s: Expression syntax (A) You've accidentally run your script through csh instead of Perl. Check the #! line, or manually feed your script into Perl yourself. %s: Undefined variable (A) You've accidentally run your script through csh instead of Perl. Check the #! 
line, or manually feed your script into Perl yourself. %s: not found (A) You've accidentally run your script through the Bourne shell instead of Perl. Check the #! line, or manually feed your script into Perl yourself. (in cleanup) %s (W) This prefix usually indicates that a DESTROY() method raised the indicated exception. Since destructors are usually called by the system at arbitrary points during execution, and often a vast number of times, the warning is issued only once for any number of failures that would otherwise result in the same message being repeated. Failure of user callbacks dispatched using the `G_KEEPERR' flag could also result in this warning. See the "G_KEEPERR" entry in the perlcall manpage. (Missing semicolon on previous line?) (S) This is an educated guess made in conjunction with the message "%s found where operator expected". Don't automatically put a semicolon on the previous line just because you saw this message. -P not allowed for setuid/setgid script (F) The script would have to be opened by the C preprocessor by name, which provides a race condition that breaks security. `-T' and `-B' not implemented on filehandles (F) Perl can't peek at the stdio buffer of filehandles when it doesn't know about your kind of stdio. You'll have to use a filename instead. `-p' destination: %s (F) An error occurred during the implicit output invoked by the `-p' command-line switch. (This output goes to STDOUT unless you've redirected it with select().) 500 Server error See Server error. ?+* follows nothing in regexp (F) You started a regular expression with a quantifier. Backslash it if you meant it literally. See the perlre manpage. @ outside of string (F) You had a pack template that specified an absolute position outside the string being unpacked. See the "pack" entry in the perlfunc manpage. accept() on closed fd (W) You tried to do an accept on a closed socket. Did you forget to check the return value of your socket() call? 
See the "accept" entry in the perlfunc manpage. Allocation too large: %lx (X) You can't allocate more than 64K on an MS-DOS machine. Applying %s to %s will act on scalar(%s) (W) The pattern match (//), substitution (s///), and transliteration (tr///) operators work on scalar values. If you apply one of them to an array or a hash, it will convert the array or hash to a scalar value -- the length of an array, or the population info of a hash -- and then work on that scalar value. This is probably not what you meant to do. See the "grep" entry in the perlfunc manpage and the "map" entry in the perlfunc manpage for alternatives. Arg too short for msgsnd (F) msgsnd() requires a string at least as long as sizeof(long). Ambiguous use of %s resolved as %s (W)(S) You said something that may not be interpreted the way you thought. Normally it's pretty easy to disambiguate it by supplying a missing quote, operator, parenthesis pair or declaration. Ambiguous call resolved as CORE::%s(), qualify as such or use & (W) A subroutine you have declared has the same name as a Perl keyword, and you have used the name without qualification for calling one or the other. Perl decided to call the builtin because the subroutine is not imported. To force interpretation as a subroutine call, either put an ampersand before the subroutine name, or qualify the name with its package. Alternatively, you can import the subroutine (or pretend that it's imported with the `use subs' pragma). To silently interpret it as the Perl operator, use the `CORE::' prefix on the operator (e.g. `CORE::log($x)') or by declaring the subroutine to be an object method (see the attrs manpage). Args must match #! line (F) The setuid emulator requires that the arguments Perl was invoked with match the arguments specified on the #! line. Since some systems impose a one-argument limit on the #! line, try combining switches; for example, turn `-w -U' into `-wU'. 
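The three ways of resolving the ambiguity described under "Ambiguous call resolved as CORE::%s()" above can be sketched as follows. The user-defined log() here is hypothetical and exists only to clash with the builtin of the same name.

```perl
use strict;
use warnings;

# Hypothetical user subroutine that shares its name with the builtin log().
sub log { return "logged: @_" }

my $builtin = CORE::log(1);          # always the Perl operator
my $mine    = &log('started');       # the ampersand forces the subroutine
my $also    = main::log('done');     # so does qualifying with the package

print "builtin: $builtin\n";
print "$mine\n";
print "$also\n";
```

Each call states which log() is meant, so none of them relies on Perl's choice for an unqualified call.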
Argument "%s" isn't numeric%s (W) The indicated string was fed as an argument to an operator that expected a numeric value instead. If you're fortunate the message will identify which operator was so unfortunate. Array @%s missing the @ in argument %d of %s() (D) Really old Perl let you omit the @ on array names in some spots. This is now heavily deprecated. assertion botched: %s (P) The malloc package that comes with Perl had an internal failure. Assertion failed: file "%s" (P) A general assertion failed. The file in question must be examined. Assignment to both a list and a scalar (F) If you assign to a conditional operator, the 2nd and 3rd arguments must either both be scalars or both be lists. Otherwise Perl won't know which context to supply to the right side. Attempt to free non-arena SV: 0x%lx (P) All SV objects are supposed to be allocated from arenas that will be garbage collected on exit. An SV was discovered to be outside any of those arenas. Attempt to free nonexistent shared string (P) Perl maintains a reference counted internal table of strings to optimize the storage and access of hash keys and other strings. This indicates someone tried to decrement the reference count of a string that can no longer be found in the table. Attempt to free temp prematurely (W) Mortalized values are supposed to be freed by the free_tmps() routine. This indicates that something else is freeing the SV before the free_tmps() routine gets a chance, which means that the free_tmps() routine will be freeing an unreferenced scalar when it does try to free it. Attempt to free unreferenced glob pointers (P) The reference counts got screwed up on symbol aliases. Attempt to free unreferenced scalar (W) Perl went to decrement the reference count of a scalar to see if it would go to 0, and discovered that it had already gone to 0 earlier, and should have been freed, and in fact, probably was freed. 
This could indicate that SvREFCNT_dec() was called too many times, or that SvREFCNT_inc() was called too few times, or that the SV was mortalized when it shouldn't have been, or that memory has been corrupted. Attempt to pack pointer to temporary value (W) You tried to pass a temporary value (like the result of a function, or a computed expression) to the "p" pack() template. This means the result contains a pointer to a location that could become invalid anytime, even before the end of the current statement. Use literals or global values as arguments to the "p" pack() template to avoid this warning. Attempt to use reference as lvalue in substr (W) You supplied a reference as the first argument to substr() used as an lvalue, which is pretty strange. Perhaps you forgot to dereference it first. See the "substr" entry in the perlfunc manpage. Bad arg length for %s, is %d, should be %d (F) You passed a buffer of the wrong size to one of msgctl(), semctl() or shmctl(). In C parlance, the correct sizes are, respectively, sizeof(struct msqid_ds *), sizeof(struct semid_ds *), and sizeof(struct shmid_ds *). Bad filehandle: %s (F) A symbol was passed to something wanting a filehandle, but the symbol has no filehandle associated with it. Perhaps you didn't do an open(), or did it in another package. Bad free() ignored (S) An internal routine called free() on something that had never been malloc()ed in the first place. Mandatory, but can be disabled by setting environment variable `PERL_BADFREE' to 1. This message can be quite often seen with DB_File on systems with "hard" dynamic linking, like `AIX' and `OS/2'. It is a bug of `Berkeley DB' which is left unnoticed if `DB' uses *forgiving* system malloc(). Bad hash (P) One of the internal hash routines was passed a null HV pointer. Bad index while coercing array into hash (F) The index looked up in the hash found as the 0'th element of a pseudo-hash is not legal. Index values must be at 1 or greater. See the perlref manpage. 
Bad name after %s:: (F) You started to name a symbol by using a package prefix, and then didn't finish the symbol. In particular, you can't interpolate outside of quotes, so $var = 'myvar'; $sym = mypack::$var; is not the same as $var = 'myvar'; $sym = "mypack::$var"; Bad symbol for array (P) An internal request asked to add an array entry to something that wasn't a symbol table entry. Bad symbol for filehandle (P) An internal request asked to add a filehandle entry to something that wasn't a symbol table entry. Bad symbol for hash (P) An internal request asked to add a hash entry to something that wasn't a symbol table entry. Badly placed ()'s (A) You've accidentally run your script through csh instead of Perl. Check the #! line, or manually feed your script into Perl yourself. Bareword "%s" not allowed while "strict subs" in use (F) With "strict subs" in use, a bareword is only allowed as a subroutine identifier, in curly brackets or to the left of the "=>" symbol. Perhaps you need to predeclare a subroutine? Bareword "%s" refers to nonexistent package (W) You used a qualified bareword of the form `Foo::', but the compiler saw no other uses of that namespace before that point. Perhaps you need to predeclare a package? BEGIN failed--compilation aborted (F) An untrapped exception was raised while executing a BEGIN subroutine. Compilation stops immediately and the interpreter is exited. BEGIN not safe after errors--compilation aborted (F) Perl found a `BEGIN {}' subroutine (or a `use' directive, which implies a `BEGIN {}') after one or more compilation errors had already occurred. Since the intended environment for the `BEGIN {}' could not be guaranteed (due to the errors), and since subsequent code likely depends on its correct operation, Perl just gave up. bind() on closed fd (W) You tried to do a bind on a closed socket. Did you forget to check the return value of your socket() call? See the "bind" entry in the perlfunc manpage. 
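As the "Bad name after %s::" entry above notes, a symbol name has to be built inside quotes. The following sketch shows one way to then use such a name; the package mypack and its variable are hypothetical. Looking a variable up by name is a symbolic reference, which "strict refs" forbids, so the sketch relaxes that check for the lookup only.

```perl
use strict;
use warnings;

# Hypothetical package variable to look up by name.
{
    package mypack;
    our $myvar = 'hello from mypack';
}

my $var = 'myvar';
my $sym = "mypack::$var";    # interpolation happens inside the quotes

# Dereference the name as a symbolic reference, with "strict refs"
# switched off just for this expression.
my $value = do { no strict 'refs'; ${$sym} };

print "$sym holds: $value\n";
```

A hash keyed by name is often a simpler choice than symbolic references when the set of names is under your control.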
Bizarre copy of %s in %s (P) Perl detected an attempt to copy an internal value that is not copiable. Callback called exit (F) A subroutine invoked from an external package via perl_call_sv() exited by calling exit. Can't "goto" outside a block (F) A "goto" statement was executed to jump out of what might look like a block, except that it isn't a proper block. This usually occurs if you tried to jump out of a sort() block or subroutine, which is a no-no. See the "goto" entry in the perlfunc manpage. Can't "goto" into the middle of a foreach loop (F) A "goto" statement was executed to jump into the middle of a foreach loop. You can't get there from here. See the "goto" entry in the perlfunc manpage. Can't "last" outside a block (F) A "last" statement was executed to break out of the current block, except that there's this itty bitty problem called there isn't a current block. Note that an "if" or "else" block doesn't count as a "loopish" block, as doesn't a block given to sort(). You can usually double the curlies to get the same effect though, because the inner curlies will be considered a block that loops once. See the "last" entry in the perlfunc manpage. Can't "next" outside a block (F) A "next" statement was executed to reiterate the current block, but there isn't a current block. Note that an "if" or "else" block doesn't count as a "loopish" block, as doesn't a block given to sort(). You can usually double the curlies to get the same effect though, because the inner curlies will be considered a block that loops once. See the "next" entry in the perlfunc manpage. Can't "redo" outside a block (F) A "redo" statement was executed to restart the current block, but there isn't a current block. Note that an "if" or "else" block doesn't count as a "loopish" block, as doesn't a block given to sort(). You can usually double the curlies to get the same effect though, because the inner curlies will be considered a block that loops once. 
See the "redo" entry in the perlfunc manpage. Can't bless non-reference value (F) Only hard references may be blessed. This is how Perl "enforces" encapsulation of objects. See the perlobj manpage. Can't break at that line (S) A warning intended to only be printed while running within the debugger, indicating the line number specified wasn't the location of a statement that could be stopped at. Can't call method "%s" in empty package "%s" (F) You called a method correctly, and it correctly indicated a package functioning as a class, but that package doesn't have ANYTHING defined in it, let alone methods. See the perlobj manpage. Can't call method "%s" on unblessed reference (F) A method call must know in what package it's supposed to run. It ordinarily finds this out from the object reference you supply, but you didn't supply an object reference in this case. A reference isn't an object reference until it has been blessed. See the perlobj manpage. Can't call method "%s" without a package or object reference (F) You used the syntax of a method call, but the slot filled by the object reference or package name contains an expression that returns a defined value which is neither an object reference nor a package name. Something like this will reproduce the error: $BADREF = 42; process $BADREF 1,2,3; $BADREF->process(1,2,3); Can't call method "%s" on an undefined value (F) You used the syntax of a method call, but the slot filled by the object reference or package name contains an undefined value. Something like this will reproduce the error: $BADREF = undef; process $BADREF 1,2,3; $BADREF->process(1,2,3); Can't chdir to %s (F) You called `perl -x/foo/bar', but `/foo/bar' is not a directory that you can chdir to, possibly because it doesn't exist. Can't check filesystem of script "%s" for nosuid (P) For some reason you can't check the filesystem of the script for nosuid. 
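The method-call failures listed above are fatal but trappable, so they can be caught with eval. This is a minimal sketch; the class Worker and the helper try_process are hypothetical names used only to trigger the errors.

```perl
use strict;
use warnings;

# Hypothetical class with a single method.
{
    package Worker;
    sub new     { return bless {}, shift }
    sub process { my ($self, @args) = @_; return scalar @args }
}

# Attempt the method call inside eval; return the error text, or ''
# if the call succeeded.
sub try_process {
    my ($target) = @_;
    my $ok = eval { $target->process(1, 2, 3); 1 };
    return $ok ? '' : $@;
}

my $undef_error     = try_process(undef);          # undefined value
my $unblessed_error = try_process({});             # reference, never blessed
my $object_error    = try_process(Worker->new);    # a real object

print "undef:     $undef_error";
print "unblessed: $unblessed_error";
print "object:    ", ($object_error eq '' ? "ok\n" : $object_error);
```

Checking the value with defined() and Scalar::Util's blessed() before the call is an alternative when an exception is not wanted.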
Can't coerce %s to integer in %s (F) Certain types of SVs, in particular real symbol table entries (typeglobs), can't be forced to stop being what they are. So you can't say things like: *foo += 1; You CAN say $foo = *foo; $foo += 1; but then $foo no longer contains a glob. Can't coerce %s to number in %s (F) Certain types of SVs, in particular real symbol table entries (typeglobs), can't be forced to stop being what they are. Can't coerce %s to string in %s (F) Certain types of SVs, in particular real symbol table entries (typeglobs), can't be forced to stop being what they are. Can't coerce array into hash (F) You used an array where a hash was expected, but the array has no information on how to map from keys to array indices. You can do that only with arrays that have a hash reference at index 0. Can't create pipe mailbox (P) An error peculiar to VMS. The process is suffering from exhausted quotas or other plumbing problems. Can't declare %s in my (F) Only scalar, array, and hash variables may be declared as lexical variables. They must have ordinary identifiers as names. Can't do inplace edit on %s: %s (S) The creation of the new file failed for the indicated reason. Can't do inplace edit without backup (F) You're on a system such as MS-DOS that gets confused if you try reading from a deleted (but still opened) file. You have to say `-i.bak', or some such. Can't do inplace edit: %s > 14 characters (S) There isn't enough room in the filename to make a backup name for the file. Can't do inplace edit: %s is not a regular file (S) You tried to use the -i switch on a special file, such as a file in /dev, or a FIFO. The file was ignored. Can't do setegid! (P) The setegid() call failed for some reason in the setuid emulator of suidperl. Can't do seteuid! (P) The setuid emulator of suidperl failed for some reason. Can't do setuid (F) This typically means that ordinary perl tried to exec suidperl to do setuid emulation, but couldn't exec it.
It looks for a name of the form sperl5.000 in the same directory that the perl executable resides under the name perl5.000, typically /usr/local/bin on Unix machines. If the file is there, check the execute permissions. If it isn't, ask your sysadmin why he and/or she removed it. Can't do waitpid with flags (F) This machine doesn't have either waitpid() or wait4(), so only waitpid() without flags is emulated. Can't do {n,m} with n > m (F) Minima must be less than or equal to maxima. If you really want your regexp to match something 0 times, just put {0}. See the perlre manpage. Can't emulate -%s on #! line (F) The #! line specifies a switch that doesn't make sense at this point. For example, it'd be kind of silly to put a -x on the #! line. Can't exec "%s": %s (W) A system(), exec(), or piped open call could not execute the named program for the indicated reason. Typical reasons include: the permissions were wrong on the file, the file wasn't found in `$ENV{PATH}', the executable in question was compiled for another architecture, or the #! line in a script points to an interpreter that can't be run for similar reasons. (Or maybe your system doesn't support #! at all.) Can't exec %s (F) Perl was trying to execute the indicated program for you because that's what the #! line said. If that's not what you wanted, you may need to mention "perl" on the #! line somewhere. Can't execute %s (F) You used the -S switch, but the copies of the script to execute found in the PATH did not have correct permissions. Can't find %s on PATH, '.' not in PATH (F) You used the -S switch, but the script to execute could not be found in the PATH, or at least not with the correct permissions. The script exists in the current directory, but PATH prohibits running it. Can't find %s on PATH (F) You used the -S switch, but the script to execute could not be found in the PATH. Can't find label %s (F) You said to goto a label that isn't mentioned anywhere that it's possible for us to go to.
See the "goto" entry in the perlfunc manpage. Can't find string terminator %s anywhere before EOF (F) Perl strings can stretch over multiple lines. This message means that the closing delimiter was omitted. Because bracketed quotes count nesting levels, the following is missing its final parenthesis: print q(The character '(' starts a side comment.); If you're getting this error from a here-document, you may have included unseen whitespace before or after your closing tag. A good programmer's editor will have a way to help you find these characters. Can't fork (F) A fatal error occurred while trying to fork while opening a pipeline. Can't get filespec - stale stat buffer? (S) A warning peculiar to VMS. This arises because of the difference between access checks under VMS and under the Unix model Perl assumes. Under VMS, access checks are done by filename, rather than by bits in the stat buffer, so that ACLs and other protections can be taken into account. Unfortunately, Perl assumes that the stat buffer contains all the necessary information, and passes it, instead of the filespec, to the access checking routine. It will try to retrieve the filespec using the device name and FID present in the stat buffer, but this works only if you haven't made a subsequent call to the CRTL stat() routine, because the device name is overwritten with each call. If this warning appears, the name lookup failed, and the access checking routine gave up and returned FALSE, just to be conservative. (Note: The access checking routine knows about the Perl `stat' operator and file tests, so you shouldn't ever see this warning in response to a Perl command; it arises only if some internal code takes stat buffers lightly.) Can't get pipe mailbox device name (P) An error peculiar to VMS. After creating a mailbox to act as a pipe, Perl can't retrieve its name for later use. Can't get SYSGEN parameter value for MAXBUF (P) An error peculiar to VMS. 
Perl asked $GETSYI how big you want your mailbox buffers to be, and didn't get an answer. Can't goto subroutine outside a subroutine (F) The deeply magical "goto subroutine" call can only replace one subroutine call for another. It can't manufacture one out of whole cloth. In general you should be calling it out of only an AUTOLOAD routine anyway. See the "goto" entry in the perlfunc manpage. Can't goto subroutine from an eval-string (F) The "goto subroutine" call can't be used to jump out of an eval "string". (You can use it to jump out of an eval {BLOCK}, but you probably don't want to.) Can't localize through a reference (F) You said something like `local $$ref', which Perl can't currently handle, because when it goes to restore the old value of whatever $ref pointed to after the scope of the local() is finished, it can't be sure that $ref will still be a reference. Can't localize lexical variable %s (F) You used local on a variable name that was previously declared as a lexical variable using "my". This is not allowed. If you want to localize a package variable of the same name, qualify it with the package name. Can't localize pseudo-hash element (F) You said something like `local $ar->{'key'}', where $ar is a reference to a pseudo-hash. That hasn't been implemented yet, but you can get a similar effect by localizing the corresponding array element directly -- `local $ar->[$ar->[0]{'key'}]'. Can't locate auto/%s.al in @INC (F) A function (or method) was called in a package which allows autoload, but there is no function to autoload. Most probable causes are a misprint in a function/method name or a failure to `AutoSplit' the file, say, by doing `make install'. Can't locate %s in @INC (F) You said to do (or require, or use) a file that couldn't be found in any of the libraries mentioned in @INC. Perhaps you need to set the PERL5LIB or PERL5OPT environment variable to say where the extra library is, or maybe the script needs to add the library name to @INC. 
Or maybe you just misspelled the name of the file. See the "require" entry in the perlfunc manpage. Can't locate object method "%s" via package "%s" (F) You called a method correctly, and it correctly indicated a package functioning as a class, but that package doesn't define that particular method, nor does any of its base classes. See the perlobj manpage. Can't locate package %s for @%s::ISA (W) The @ISA array contained the name of another package that doesn't seem to exist. Can't make list assignment to \%ENV on this system (F) List assignment to %ENV is not supported on some systems, notably VMS. Can't modify %s in %s (F) You aren't allowed to assign to the item indicated, or otherwise try to change it, such as with an auto-increment. Can't modify nonexistent substring (P) The internal routine that does assignment to a substr() was handed a NULL. Can't msgrcv to read-only var (F) The target of a msgrcv must be modifiable to be used as a receive buffer. Can't open %s: %s (S) The implicit opening of a file through use of the `<>' filehandle, either implicitly under the `-n' or `-p' command-line switches, or explicitly, failed for the indicated reason. Usually this is because you don't have read permission for a file which you named on the command line. Can't open bidirectional pipe (W) You tried to say `open(CMD, "|cmd|")', which is not supported. You can try any of several modules in the Perl library to do this, such as IPC::Open2. Alternately, direct the pipe's output to a file using ">", and then read it in under a different file handle. Can't open error file %s as stderr (F) An error peculiar to VMS. Perl does its own command line redirection, and couldn't open the file specified after '2>' or '2>>' on the command line for writing. Can't open input file %s as stdin (F) An error peculiar to VMS. Perl does its own command line redirection, and couldn't open the file specified after '<' on the command line for reading. 
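As a small sketch of the "Can't locate object method" entry above, the classes below (Animal and Dog, both invented for the example) show an inherited method being found and a missing one being trapped with eval. The can() method offers a way to test for a method before calling it.

```perl
use strict;
use warnings;

package Animal;                      # hypothetical base class
sub new   { my ($class) = @_; return bless {}, $class }
sub speak { return 'generic noise' }

package Dog;                         # hypothetical subclass
our @ISA = ('Animal');

package main;

my $dog       = Dog->new;
my $inherited = $dog->speak;         # found in Animal through @ISA

# Neither Dog nor Animal defines fetch(), so this call dies.
my $ok    = eval { $dog->fetch; 1 };
my $error = $@;

# can() returns false for a method no class in the hierarchy defines.
my $can_fetch = $dog->can('fetch') ? 1 : 0;
my $can_speak = $dog->can('speak') ? 1 : 0;
```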
Can't open output file %s as stdout (F) An error peculiar to VMS. Perl does its own command line redirection, and couldn't open the file specified after '>' or '>>' on the command line for writing. Can't open output pipe (name: %s) (P) An error peculiar to VMS. Perl does its own command line redirection, and couldn't open the pipe into which to send data destined for stdout. Can't open perl script "%s": %s (F) The script you specified can't be opened for the indicated reason. Can't redefine active sort subroutine %s (F) Perl optimizes the internal handling of sort subroutines and keeps pointers into them. You tried to redefine one such sort subroutine when it was currently active, which is not allowed. If you really want to do this, you should write `sort { &func } @x' instead of `sort func @x'. Can't rename %s to %s: %s, skipping file (S) The rename done by the -i switch failed for some reason, probably because you don't have write permission to the directory. Can't reopen input pipe (name: %s) in binary mode (P) An error peculiar to VMS. Perl thought stdin was a pipe, and tried to reopen it to accept binary data. Alas, it failed. Can't reswap uid and euid (P) The setreuid() call failed for some reason in the setuid emulator of suidperl. Can't return outside a subroutine (F) The return statement was executed in mainline code, that is, where there was no subroutine call to return out of. See the perlsub manpage. Can't stat script "%s" (P) For some reason you can't fstat() the script even though you have it open already. Bizarre. Can't swap uid and euid (P) The setreuid() call failed for some reason in the setuid emulator of suidperl. Can't take log of %g (F) For ordinary real numbers, you can't take the logarithm of a negative number or zero. There's a Math::Complex package that comes standard with Perl, though, if you really want to do that for the negative numbers. 
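The "Can't take log of %g" entry above describes a fatal error, so it can be trapped with eval or avoided with a guard. The helper name safe_log is an assumption made for this sketch.

```perl
use strict;
use warnings;

# Trap the fatal error for each input and record what happened.
my %outcome;
for my $x (10, 0, -1) {
    my $value = eval { log($x) };
    $outcome{$x} = defined $value ? 'ok' : $@;
}

# Hypothetical guard: return undef instead of dying for zero or
# negative input.
sub safe_log {
    my ($x) = @_;
    return undef if $x <= 0;
    return log($x);
}

my $guarded_zero = safe_log(0);     # undef, no exception
my $guarded_ten  = safe_log(10);    # natural logarithm of 10
```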
Can't take sqrt of %g (F) For ordinary real numbers, you can't take the square root of a negative number. There's a Math::Complex package that comes standard with Perl, though, if you really want to do that. Can't undef active subroutine (F) You can't undefine a routine that's currently running. You can, however, redefine it while it's running, and you can even undef the redefined subroutine while the old routine is running. Go figure. Can't unshift (F) You tried to unshift an "unreal" array that can't be unshifted, such as the main Perl stack. Can't upgrade that kind of scalar (P) The internal sv_upgrade routine adds "members" to an SV, making it into a more specialized kind of SV. The top several SV types are so specialized, however, that they cannot be interconverted. This message indicates that such a conversion was attempted. Can't upgrade to undef (P) The undefined SV is the bottom of the totem pole, in the scheme of upgradability. Upgrading to undef indicates an error in the code calling sv_upgrade. Can't use %%! because Errno.pm is not available (F) The first time the %! hash is used, perl automatically loads the Errno.pm module. The Errno module is expected to tie the %! hash to provide symbolic names for `$!' errno values. Can't use "my %s" in sort comparison (F) The global variables $a and $b are reserved for sort comparisons. You mentioned $a or $b in the same line as the <=> or cmp operator, and the variable had earlier been declared as a lexical variable. Either qualify the sort variable with the package name, or rename the lexical variable. Can't use %s for loop variable (F) Only a simple scalar variable may be used as a loop variable on a foreach. Can't use %s ref as %s ref (F) You've mixed up your reference types. You have to dereference a reference of the type needed. You can use the ref() function to test the type of the reference, if need be. 
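The "Can't use %s ref as %s ref" entry above suggests testing a reference with ref() before dereferencing it. A minimal sketch of that check follows; the subroutine name describe and its return strings are invented for illustration.

```perl
use strict;
use warnings;

# Inspect the reference type first, then dereference in the matching way.
sub describe {
    my ($thing) = @_;
    my $type = ref($thing);
    return 'plain scalar' if $type eq '';
    return 'array of ' . scalar(@$thing) . ' element(s)'   if $type eq 'ARRAY';
    return 'hash with ' . scalar(keys %$thing) . ' key(s)' if $type eq 'HASH';
    return 'subroutine' if $type eq 'CODE';
    return "other reference ($type)";
}

my $from_array  = describe([1, 2, 3]);
my $from_hash   = describe({ a => 1, b => 2 });
my $from_code   = describe(sub { 1 });
my $from_scalar = describe('hello');
```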
Can't use \1 to mean $1 in expression (W) In an ordinary expression, backslash is a unary operator that creates a reference to its argument. The use of backslash to indicate a backreference to a matched substring is valid only as part of a regular expression pattern. Trying to do this in ordinary Perl code produces a value that prints out looking like SCALAR(0xdecaf). Use the $1 form instead. Can't use bareword ("%s") as %s ref while "strict refs" in use (F) Only hard references are allowed by "strict refs". Symbolic references are disallowed. See the perlref manpage. Can't use string ("%s") as %s ref while "strict refs" in use (F) Only hard references are allowed by "strict refs". Symbolic references are disallowed. See the perlref manpage. Can't use an undefined value as %s reference (F) A value used as either a hard reference or a symbolic reference must be a defined value. This helps to delurk some insidious errors. Can't use global %s in "my" (F) You tried to declare a magical variable as a lexical variable. This is not allowed, because the magic can be tied to only one location (namely the global variable) and it would be incredibly confusing to have variables in your program that looked like magical variables but weren't. Can't use subscript on %s (F) The compiler tried to interpret a bracketed expression as a subscript. But to the left of the brackets was an expression that didn't look like an array reference, or anything else subscriptable. Can't x= to read-only value (F) You tried to repeat a constant value (often the undefined value) with an assignment operator, which implies modifying the value itself. Perhaps you need to copy the value to a temporary, and repeat that. Cannot find an opnumber for "%s" (F) A string of a form `CORE::word' was given to prototype(), but there is no builtin with the name `word'. 
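For the "Can't use string ("%s") as %s ref while "strict refs" in use" entry above, the sketch below contrasts a symbolic reference (a variable name held in a string) with a hard reference. The array @colors is invented for the example.

```perl
use strict;
use warnings;

our @colors = ('red', 'green');   # package variable, i.e. @main::colors

# Symbolic reference: the string 'colors' is used as if it were a reference.
# Under "strict refs" this is a fatal run-time error.
my $name  = 'colors';
my $ok    = eval { my @copy = @$name; 1 };
my $error = $@;

# Hard reference: take a real reference with backslash and dereference that.
my $ref  = \@colors;
my @copy = @$ref;
```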
Cannot resolve method `%s' overloading `%s' in package `%s' (F|P) Error resolving overloading specified by a method name (as opposed to a subroutine reference): no such method callable via the package. If method name is `???', this is an internal error. Character class syntax [. .] is reserved for future extensions (W) Within regular expression character classes ([]) the syntax beginning with "[." and ending with ".]" is reserved for future extensions. If you need to represent those character sequences inside a regular expression character class, just quote the square brackets with the backslash: "\[." and ".\]". Character class syntax [: :] is reserved for future extensions (W) Within regular expression character classes ([]) the syntax beginning with "[:" and ending with ":]" is reserved for future extensions. If you need to represent those character sequences inside a regular expression character class, just quote the square brackets with the backslash: "\[:" and ":\]". Character class syntax [= =] is reserved for future extensions (W) Within regular expression character classes ([]) the syntax beginning with "[=" and ending with "=]" is reserved for future extensions. If you need to represent those character sequences inside a regular expression character class, just quote the square brackets with the backslash: "\[=" and "=\]". chmod: mode argument is missing initial 0 (W) A novice will sometimes say chmod 777, $filename not realizing that 777 will be interpreted as a decimal number, equivalent to 01411. Octal constants are introduced with a leading 0 in Perl, as in C. Close on unopened file <%s> (W) You tried to close a filehandle that was never opened. Compilation failed in require (F) Perl could not compile a file specified in a `require' statement. Perl uses this generic message when none of the errors that it encountered were severe enough to halt compilation immediately. 
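The arithmetic behind the "chmod: mode argument is missing initial 0" entry above can be checked directly. This sketch only converts numbers and does not change any file's permissions.

```perl
use strict;
use warnings;

my $decimal = 777;     # what a novice writes: decimal seven hundred seventy-seven
my $octal   = 0777;    # leading 0 makes it an octal constant

# Show decimal 777 in octal notation, and octal 0777 as a decimal number.
my $decimal_as_octal = sprintf('%o', $decimal);   # "1411"
my $octal_as_decimal = $octal + 0;                # 511

# A mode that arrives as a string (say, from a config file) needs oct().
my $from_string = oct('777');                     # 511
```

Passing $octal or $from_string to chmod gives the intended rwxrwxrwx mode; passing $decimal gives mode 01411 instead.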
Complex regular subexpression recursion limit (%d) exceeded (W) The regular expression engine uses recursion in complex situations where back-tracking is required. Recursion depth is limited to 32766, or perhaps less in architectures where the stack cannot grow arbitrarily. ("Simple" and "medium" situations are handled without recursion and are not subject to a limit.) Try shortening the string under examination; looping in Perl code (e.g. with `while') rather than in the regular expression engine; or rewriting the regular expression so that it is simpler or backtracks less. (See the perlbook manpage for information on *Mastering Regular Expressions*.) connect() on closed fd (W) You tried to do a connect on a closed socket. Did you forget to check the return value of your socket() call? See the "connect" entry in the perlfunc manpage. Constant is not %s reference (F) A constant value (perhaps declared using the `use constant' pragma) is being dereferenced, but it amounts to the wrong type of reference. The message indicates the type of reference that was expected. This usually indicates a syntax error in dereferencing the constant value. See the section on "Constant Functions" in the perlsub manpage and the constant manpage. Constant subroutine %s redefined (S) You redefined a subroutine which had previously been eligible for inlining. See the section on "Constant Functions" in the perlsub manpage for commentary and workarounds. Constant subroutine %s undefined (S) You undefined a subroutine which had previously been eligible for inlining. See the section on "Constant Functions" in the perlsub manpage for commentary and workarounds. Copy method did not return a reference (F) The method which overloads "=" is buggy. See the section on "Copy Constructor" in the overload manpage. Corrupt malloc ptr 0x%lx at 0x%lx (P) The malloc package that comes with Perl had an internal failure. 
corrupted regexp pointers (P) The regular expression engine got confused by what the regular expression compiler gave it. corrupted regexp program (P) The regular expression engine got passed a regexp program without a valid magic number. Deep recursion on subroutine "%s" (W) This subroutine has called itself (directly or indirectly) 100 times more than it has returned. This probably indicates an infinite recursion, unless you're writing strange benchmark programs, in which case it indicates something else. Delimiter for here document is too long (F) In a here document construct like `<<FOO', the label `FOO' is too long for Perl to handle. Excessively long <> operator (F) The contents of a <> operator may not exceed the maximum size of a Perl identifier. If you're just trying to glob a long list of filenames, try using the glob() operator, or put the filenames into a variable and glob that. Execution of %s aborted due to compilation errors (F) The final summary message when a Perl compilation fails. Exiting eval via %s (W) You are exiting an eval by unconventional means, such as a goto, or a loop control statement. Exiting pseudo-block via %s (W) You are exiting a rather special block construct (like a sort block or subroutine) by unconventional means, such as a goto, or a loop control statement. See the "sort" entry in the perlfunc manpage. Exiting subroutine via %s (W) You are exiting a subroutine by unconventional means, such as a goto, or a loop control statement. Exiting substitution via %s (W) You are exiting a substitution by unconventional means, such as a return, a goto, or a loop control statement. Explicit blessing to '' (assuming package main) (W) You are blessing a reference to a zero length string. This has the effect of blessing the reference into the package main. This is usually not what you want. Consider providing a default target package, e.g. bless($ref, $p || 'MyPackage'); Fatal VMS error at %s, line %d (P) An error peculiar to VMS. 
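The default-package idiom suggested in the "Explicit blessing to ''" entry above can be sketched as a small constructor. The subroutine name make_object is hypothetical; MyPackage is the placeholder name used in the entry itself.

```perl
use strict;
use warnings;

# Fall back to a default class when the caller passes an empty or
# undefined class name, instead of silently blessing into package main.
sub make_object {
    my ($class) = @_;
    return bless {}, ($class || 'MyPackage');
}

my $named   = make_object('Other::Class');   # blessed into Other::Class
my $empty   = make_object('');               # blessed into MyPackage
my $missing = make_object(undef);            # blessed into MyPackage
```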
Something untoward happened in a VMS system service or RTL routine; Perl's exit status should provide more details. The filename in "at %s" and the line number in "line %d" tell you which section of the Perl source code is distressed. fcntl is not implemented (F) Your machine apparently doesn't implement fcntl(). What is this, a PDP-11 or something? Filehandle %s never opened (W) An I/O operation was attempted on a filehandle that was never initialized. You need to do an open() or a socket() call, or call a constructor from the FileHandle package. Filehandle %s opened for only input (W) You tried to write on a read-only filehandle. If you intended it to be a read-write filehandle, you needed to open it with "+<" or "+>" or "+>>" instead of with "<" or nothing. If you intended only to write the file, use ">" or ">>". See the "open" entry in the perlfunc manpage. Filehandle opened for only input (W) You tried to write on a read-only filehandle. If you intended it to be a read-write filehandle, you needed to open it with "+<" or "+>" or "+>>" instead of with "<" or nothing. If you intended only to write the file, use ">" or ">>". See the "open" entry in the perlfunc manpage. Final $ should be \$ or $name (F) You must now decide whether the final $ in a string was meant to be a literal dollar sign, or was meant to introduce a variable name that happens to be missing. So you have to put either the backslash or the name. Final @ should be \@ or @name (F) You must now decide whether the final @ in a string was meant to be a literal "at" sign, or was meant to introduce a variable name that happens to be missing. So you have to put either the backslash or the name. Format %s redefined (W) You redefined a format. To suppress this warning, say { local $^W = 0; eval "format NAME =..."; } Format not terminated (F) A format must be terminated by a line with a solitary dot. Perl got to the end of your file without finding such a line. 
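The open modes named in the "Filehandle %s opened for only input" entry above can be tried out on a temporary file. This sketch uses the three-argument form of open and lexical filehandles, which are later additions to Perl than the two-argument form the entry describes; the mode strings ("+<", "<") mean the same thing in both forms.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Fcntl qw(:seek);

# Create a scratch file containing one line.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "first line\n";
close($tmp_fh) or die "Can't close $path: $!";

# "+<" opens an existing file for both reading and writing.
open(my $rw, '+<', $path) or die "Can't open $path: $!";
my $line = <$rw>;                               # reading works
seek($rw, 0, SEEK_END) or die "Can't seek: $!"; # reposition before writing
print {$rw} "second line\n";                    # writing works too
close($rw) or die "Can't close $path: $!";

# "<" opens for input only; read everything back to check the result.
open(my $in, '<', $path) or die "Can't open $path: $!";
my @lines = <$in>;
close($in);
```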
Found = in conditional, should be == (W) You said if ($foo = 123) when you meant if ($foo == 123) (or something like that). gdbm store returned %d, errno %d, key "%s" (S) A warning from the GDBM_File extension that a store failed. gethostent not implemented (F) Your C library apparently doesn't implement gethostent(), probably because if it did, it'd feel morally obligated to return every hostname on the Internet. get{sock,peer}name() on closed fd (W) You tried to get a socket or peer socket name on a closed socket. Did you forget to check the return value of your socket() call? getpwnam returned invalid UIC %#o for user "%s" (S) A warning peculiar to VMS. The call to `sys$getuai' underlying the `getpwnam' operator returned an invalid UIC. Glob not terminated (F) The lexer saw a left angle bracket in a place where it was expecting a term, so it's looking for the corresponding right angle bracket, and not finding it. Chances are you left some needed parentheses out earlier in the line, and you really meant a "less than". Global symbol "%s" requires explicit package name (F) You've said "use strict vars", which indicates that all variables must either be lexically scoped (using "my"), or explicitly qualified to say which package the global variable is in (using "::"). goto must have label (F) Unlike with "next" or "last", you're not allowed to goto an unspecified destination. See the "goto" entry in the perlfunc manpage. Had to create %s unexpectedly (S) A routine asked for a symbol from a symbol table that ought to have existed already, but for some reason it didn't, and had to be created on an emergency basis to prevent a core dump. Hash %%s missing the % in argument %d of %s() (D) Really old Perl let you omit the % on hash names in some spots. This is now heavily deprecated. Identifier too long (F) Perl limits identifiers (names for variables, functions, etc.) to about 250 characters for simple names, and somewhat more for compound names (like `$A::B'). 
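The "Global symbol "%s" requires explicit package name" entry above is a compile-time error, so this sketch compiles the offending source with a string eval and inspects the message. Newer Perl versions append a hint to the message, so only the stable prefix is matched; the variable name $total is illustrative.

```perl
use strict;
use warnings;

# Source that uses a variable without declaring it under "use strict".
my $undeclared = <<'END_SOURCE';
use strict;
$total = 1;
END_SOURCE

my $ok    = eval $undeclared;   # fails to compile
my $error = $@;

# The same code with the variable lexically scoped using "my".
my $declared = <<'END_SOURCE';
use strict;
my $total = 1;
$total + 1;
END_SOURCE

my $value = eval $declared;     # compiles and returns the last expression
```

Qualifying the name with its package, as in `$main::total`, is the other fix the entry mentions.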
You've exceeded Perl's limits. Future versions of Perl are likely to eliminate these arbitrary limitations. Ill-formed logical name |%s| in prime_env_iter (W) A warning peculiar to VMS. A logical name was encountered when preparing to iterate over %ENV which violates the syntactic rules governing logical names. Because it cannot be translated normally, it is skipped, and will not appear in %ENV. This may be a benign occurrence, as some software packages might directly modify logical name tables and introduce nonstandard names, or it may indicate that a logical name table has been corrupted. Illegal character %s (carriage return) (F) A carriage return character was found in the input. This is an error, and not a warning, because carriage return characters can break multi-line strings, including here documents (e.g., `print <<EOF;'). Something went wrong with the external program(s) used for `glob' and `<*.c>'. Usually, this means that you supplied a `glob' pattern that caused the external program to fail and exit with a nonzero status. If the message indicates that the abnormal exit resulted in a coredump, this may also mean that your csh (C shell) is broken. If so, you should change all of the csh-related variables in config.sh: If you have tcsh, make the variables refer to it as if it were csh (e.g. `full_csh='/usr/bin/tcsh''); otherwise, make them all empty (except that `d_csh' should be `'undef'') so that Perl will think csh is missing. In either case, after editing config.sh, run `./Configure -S' and rebuild Perl. internal urp in regexp at /%s/ (P) Something went badly awry in the regular expression parser. invalid [] range in regexp (F) The range specified in a character class had a minimum character greater than the maximum character. See the perlre manpage. Invalid conversion in %s: "%s" (W) Perl does not understand the given format conversion. See the "sprintf" entry in the perlfunc manpage. Invalid type in pack: '%s' (F) The given character is not a valid pack type. See the "pack" entry in the perlfunc manpage. 
(W) The given character is not a valid pack type but used to be silently ignored. Invalid type in unpack: '%s' (F) The given character is not a valid unpack type. See the "unpack" entry in the perlfunc manpage. (W) The given character is not a valid unpack type but used to be silently ignored. ioctl is not implemented (F) Your machine apparently doesn't implement ioctl(), which is pretty strange for a machine that supports C. junk on end of regexp (P) The regular expression parser is confused. Label not found for "last %s" (F) You named a loop to break out of, but you're not currently in a loop of that name, not even if you count where you were called from. See the "last" entry in the perlfunc manpage. Label not found for "next %s" (F) You named a loop to continue, but you're not currently in a loop of that name, not even if you count where you were called from. See the "next" entry in the perlfunc manpage. Label not found for "redo %s" (F) You named a loop to restart, but you're not currently in a loop of that name, not even if you count where you were called from. See the "redo" entry in the perlfunc manpage. listen() on closed fd (W) You tried to do a listen on a closed socket. Did you forget to check the return value of your socket() call? See the "listen" entry in the perlfunc manpage. Method for operation %s not found in package %s during blessing (F) An attempt was made to specify an entry in an overloading table that doesn't resolve to a valid subroutine. See the overload manpage. Might be a runaway multi-line %s string starting on line %d (S) An advisory indicating that the previous error may have been caused by a missing delimiter on a string or pattern, because it eventually ended earlier on the current line. Misplaced _ in number (W) An underline in a decimal constant wasn't on a 3-digit boundary. Missing $ on loop variable (F) Apparently you've been programming in csh too much. 
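The "Label not found" entries above concern loop labels that do not name an enclosing loop. For contrast, here is a sketch of labels used correctly with next and last; the labels ROW and SEARCH and the sample data are invented for the example.

```perl
use strict;
use warnings;

my @rows = ([1, 2, 3], [4, -1, 6], [7, 8, 9]);

# "next ROW" abandons the current row from inside the inner loop.
my @kept;
ROW: for my $row (@rows) {
    for my $cell (@$row) {
        next ROW if $cell < 0;      # skip any row containing a negative cell
    }
    push @kept, $row->[0];          # remember the first cell of each kept row
}

# "last SEARCH" leaves both loops as soon as a match is found.
my $first_big;
SEARCH: for my $row (@rows) {
    for my $cell (@$row) {
        if ($cell > 5) {
            $first_big = $cell;
            last SEARCH;
        }
    }
}
```

The label must be attached to a loop that encloses the next, last, or redo statement (or a loop in a caller); otherwise the fatal errors described in the entries are raised.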
Variables are always mentioned with the $ in Perl, unlike in the shells, where it can vary from one line to the next. Missing comma after first argument to %s function (F) While certain functions allow you to specify a filehandle or an "indirect object" before the argument list, this ain't one of them. Missing operator before %s? (S) This is an educated guess made in conjunction with the message "%s found where operator expected". Often the missing operator is a comma. Missing right bracket (F) The lexer counted more opening curly brackets (braces) than closing ones. As a general rule, you'll find it's missing near the place you were last editing. Modification of a read-only value attempted (F) You tried, directly or indirectly, to change the value of a constant. You didn't, of course, try "2 = 1", because the compiler catches that. But an easy way to do the same thing is: sub mod { $_[0] = 1 } mod(2); Another way is to assign to a substr() that's off the end of the string. Modification of non-creatable array value attempted, subscript %d (F) You tried to make an array value spring into existence, and the subscript was probably negative, even counting from end of the array backwards. Modification of non-creatable hash value attempted, subscript "%s" (P) You tried to make a hash value spring into existence, and it couldn't be created for some peculiar reason. Module name must be constant (F) Only a bare module name is allowed as the first argument to a "use". msg%s not implemented (F) You don't have System V message IPC on your system. Multidimensional syntax %s not supported (W) Multidimensional arrays aren't written like `$foo[1,2,3]'. They're written like `$foo[1][2][3]', as in C. Name "%s::%s" used only once: possible typo (W) Typographical errors often show up as unique variable names. If you had a good reason for having a unique name, then just mention it again somehow to suppress the message. The `use vars' pragma is provided for just this purpose. 
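The example given in the "Modification of a read-only value attempted" entry above can be run under eval to see the error without ending the program. The subroutine mod comes from the entry; the remaining names are illustrative.

```perl
use strict;
use warnings;

# Elements of @_ are aliases for the caller's arguments, so assigning to
# $_[0] tries to modify whatever was passed in.
sub mod { $_[0] = 1 }

# Passing a literal constant: the assignment hits a read-only value.
my $ok    = eval { mod(2); 1 };
my $error = $@;

# Passing a variable: the assignment changes the caller's variable.
my $value = 2;
mod($value);
```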
Negative length (F) You tried to do a read/write/send/recv operation with a buffer length that is less than 0. This is difficult to imagine. nested *?+ in regexp (F) You can't quantify a quantifier without intervening parentheses. So things like ** or +* or ?* are illegal. Note, however, that the minimal matching quantifiers, `*?', `+?', and `??' appear to be nested quantifiers, but aren't. See the perlre manpage. No #! line (F) The setuid emulator requires that scripts have a well-formed #! line even on machines that don't support the #! construct. No %s allowed while running setuid (F) Certain operations are deemed to be too insecure for a setuid or setgid script to even be allowed to attempt. Generally speaking there will be another way to do what you want that is, if not secure, at least securable. See the perlsec manpage. No -e allowed in setuid scripts (F) A setuid script can't be specified by the user. No comma allowed after %s (F) A list operator that has a filehandle or "indirect object" is not allowed to have a comma between that and the following arguments. Otherwise it'd be just another one of the arguments. One possible cause is that you expected to have imported a constant into your name space with use or import, but no such importing took place; for example, your operating system may not support that particular constant. Ideally you used an explicit import list for the constants you expect to see; see the "use" entry in the perlfunc manpage and the "import" entry in the perlfunc manpage. An explicit import list would probably have caught this error earlier, but it naturally does not remedy the fact that your operating system still does not support that constant. Alternatively, you may have a typo in the constants of the symbol import list of use or import, or in the constant name at the line where this error was triggered. No command into which to pipe on command line (F) An error peculiar to VMS. 
Perl handles its own command line redirection, and found a '|' at the end of the command line, so it doesn't know where you want to pipe the output from this command. No DB::DB routine defined (F) The currently executing code was compiled with the -d switch, but for some reason the perl5db.pl file (or some facsimile thereof) didn't define a routine to be called at the beginning of each statement. Which is odd, because the file should have been required automatically, and should have blown up the require if it didn't parse right. No dbm on this machine (P) This is counted as an internal error, because every machine should supply dbm nowadays, because Perl comes with SDBM. See the SDBM_File manpage. No DBsub routine (F) The currently executing code was compiled with the -d switch, but for some reason the perl5db.pl file (or some facsimile thereof) didn't define a DB::sub routine to be called at the beginning of each ordinary subroutine call. No error file after 2> or 2>> on command line (F) An error peculiar to VMS. Perl handles its own command line redirection, and found a '2>' or a '2>>' on the command line, but can't find the name of the file to which to write data destined for stderr. No input file after < on command line (F) An error peculiar to VMS. Perl handles its own command line redirection, and found a '<' on the command line, but can't find the name of the file from which to read data for stdin. No output file after > on command line (F) An error peculiar to VMS. Perl handles its own command line redirection, and found a lone '>' at the end of the command line, so it doesn't know where you wanted to redirect stdout. No output file after > or >> on command line (F) An error peculiar to VMS. Perl handles its own command line redirection, and found a '>' or a '>>' on the command line, but can't find the name of the file to which to write data destined for stdout. 
No Perl script found in input (F) You called `perl -x', but no line was found in the file beginning with #! and containing the word "perl". No setregid available (F) Configure didn't find anything resembling the setregid() call for your system. No setreuid available (F) Configure didn't find anything resembling the setreuid() call for your system. No space allowed after -I (F) The argument to -I must follow the -I immediately with no intervening space. No such array field (F) You tried to access an array as a hash, but the field name used is not defined. The hash at index 0 should map all valid field names to array indices for that to work. No such field "%s" in variable %s of type %s (F) You tried to access a field of a typed variable where the type does not know about the field name. The field names are looked up in the %FIELDS hash in the type package at compile time. The %FIELDS hash is usually set up with the 'fields' pragma. No such pipe open (P) An error peculiar to VMS. The internal routine my_pclose() tried to close a pipe which hadn't been opened. This should have been caught earlier as an attempt to close an unopened filehandle. No such signal: SIG%s (W) You specified a signal name as a subscript to %SIG that was not recognized. Say `kill -l' in your shell to see the valid signal names on your system. Not a CODE reference (F) Perl was trying to evaluate a reference to a code value (that is, a subroutine), but found a reference to something else instead. You can use the ref() function to find out what kind of ref it really was. See also the perlref manpage. Not a format reference (F) I'm not sure how you managed to generate a reference to an anonymous format, but this indicates you did, and that it didn't exist. Not a GLOB reference (F) Perl was trying to evaluate a reference to a "typeglob" (that is, a symbol table entry that looks like `*foo'), but found a reference to something else instead. 
You can use the ref() function to find out what kind of ref it really was. See the perlref manpage. Not a HASH reference (F) Perl was trying to evaluate a reference to a hash value, but found a reference to something else instead. You can use the ref() function to find out what kind of ref it really was. See the perlref manpage. Not a perl script (F) The setuid emulator requires that scripts have a well-formed #! line even on machines that don't support the #! construct. The line must mention perl. Not a SCALAR reference (F) Perl was trying to evaluate a reference to a scalar value, but found a reference to something else instead. You can use the ref() function to find out what kind of ref it really was. See the perlref manpage. Not a subroutine reference (F) Perl was trying to evaluate a reference to a code value (that is, a subroutine), but found a reference to something else instead. You can use the ref() function to find out what kind of ref it really was. See also the perlref manpage. Not a subroutine reference in overload table (F) An attempt was made to specify an entry in an overloading table that doesn't somehow point to a valid subroutine. See the overload manpage. Not an ARRAY reference (F) Perl was trying to evaluate a reference to an array value, but found a reference to something else instead. You can use the ref() function to find out what kind of ref it really was. See the perlref manpage. Not enough arguments for %s (F) The function requires more arguments than you specified. Not enough format arguments (W) A format specified more picture fields than the next line supplied. See the perlform manpage. Null filename used (F) You can't require the null filename, especially because on many machines that means the current directory! See the "require" entry in the perlfunc manpage. Null picture in formline (F) The first argument to formline must be a valid format picture specification. 
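The "Not a HASH reference", "Not an ARRAY reference" and "Not a CODE reference" entries above are all raised when a reference is dereferenced as the wrong kind of thing. This sketch triggers each one under eval and then shows a ref() check that avoids the error; the variable names are illustrative.

```perl
use strict;
use warnings;

my $list = [1, 2, 3];      # an array reference
my $map  = { a => 1 };     # a hash reference

# Each block dereferences a reference as the wrong type and records the error.
my %error;
eval { my $v = $list->{key}; 1 } or $error{hash}  = $@;
eval { my @v = @$map;        1 } or $error{array} = $@;
eval { $list->();            1 } or $error{code}  = $@;

# Checking ref() first lets the program choose the matching dereference.
my $first = ref($list) eq 'ARRAY' ? $list->[0]
          : ref($list) eq 'HASH'  ? $list->{key}
          :                         undef;
```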
It was found to be empty, which probably means you supplied it an uninitialized value. See the perlform manpage. NULL OP IN RUN (P) Some internal routine called run() with a null opcode pointer. Null realloc (P) An attempt was made to realloc NULL. NULL regexp argument (P) The internal pattern matching routines blew it big time. NULL regexp parameter (P) The internal pattern matching routines are out of their gourd. Number too long (F) Perl limits the representation of decimal numbers in programs to about 250 characters. You've exceeded that length. Future versions of Perl are likely to eliminate this arbitrary limitation. In the meantime, try using scientific notation (e.g. "1e6" instead of "1_000_000"). Odd number of elements in hash assignment (S) You specified an odd number of elements to initialize a hash, which is odd, because hashes come in key/value pairs. Offset outside string (F) You tried to do a read/write/send/recv operation with an offset pointing outside the buffer. This is difficult to imagine. The sole exception to this is that `sysread()'ing past the buffer will extend the buffer and zero pad the new area. oops: oopsAV (S) An internal warning that the grammar is screwed up. oops: oopsHV (S) An internal warning that the grammar is screwed up. Operation `%s': no method found, %s (F) An attempt was made to perform an overloaded operation for which no handler was defined. While some handlers can be autogenerated in terms of other handlers, there is no default handler for any operation, unless `fallback' overloading key is specified to be true. See the overload manpage. Operator or semicolon missing before %s (S) You used a variable or subroutine call where the parser was expecting an operator. The parser has assumed you really meant to use an operator, but this is highly likely to be incorrect. For example, if you say "*foo *foo" it will be interpreted as if you said "*foo * 'foo'". 
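The "Odd number of elements in hash assignment" entry above is a warning rather than a fatal error, so it has to be captured with a __WARN__ handler instead of eval. The sketch below does that and then shows a simple length check; the list contents are invented for the example.

```perl
use strict;
use warnings;

my @pairs = ('apple', 1, 'pear');   # three elements: 'pear' has no value

# Collect warnings raised while the hash is being initialized.
my @warnings;
my %stock;
{
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    %stock = @pairs;
}

# A list destined for a hash should have an even number of elements.
my $is_balanced = (@pairs % 2 == 0) ? 1 : 0;
```

The key without a partner still ends up in the hash, with an undefined value.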
Out of memory for yacc stack (F) The yacc parser wanted to grow its stack so it could continue parsing, but realloc() wouldn't give it more memory, virtual or otherwise. Out of memory during request for %s (X|F) The malloc() function returned 0, indicating there was insufficient remaining memory (or virtual memory) to satisfy the request. The request was judged to be small, so the possibility to trap it depends on the way perl was compiled. By default it is not trappable. However, if compiled for this, Perl may use the contents of `$^M' as an emergency pool after die()ing with this message. In this case the error is trappable *once*. Out of memory during "large" request for %s (F) The malloc() function returned 0, indicating there was insufficient remaining memory (or virtual memory) to satisfy the request. However, the request was judged large enough (compile-time default is 64K), so a possibility to shut down by trapping this error is granted. Out of memory during ridiculously large request (F) You can't allocate more than 2^31+"small amount" bytes. This error is most likely to be caused by a typo in the Perl program. e.g., `$arr[time]' instead of `$arr[$time]'. page overflow (W) A single call to write() produced more lines than can fit on a page. See the perlform manpage. panic: ck_grep (P) Failed an internal consistency check trying to compile a grep. panic: ck_split (P) Failed an internal consistency check trying to compile a split. panic: corrupt saved stack index (P) The savestack was requested to restore more localized values than there are in the savestack. panic: die %s (P) We popped the context stack to an eval context, and then discovered it wasn't an eval context. panic: do_match (P) The internal pp_match() routine was called with invalid operational data. panic: do_split (P) Something terrible went wrong in setting up for the split. panic: do_subst (P) The internal pp_subst() routine was called with invalid operational data. 
panic: do_trans (P) The internal do_trans() routine was called with invalid operational data. panic: frexp (P) The library function frexp() failed, making printf("%f") impossible. panic: goto (P) We popped the context stack to a context with the specified label, and then discovered it wasn't a context we know how to do a goto in. panic: INTERPCASEMOD (P) The lexer got into a bad state at a case modifier. panic: INTERPCONCAT (P) The lexer got into a bad state parsing a string with brackets. panic: last (P) We popped the context stack to a block context, and then discovered it wasn't a block context. panic: leave_scope clearsv (P) A writable lexical variable became read-only somehow within the scope. panic: leave_scope inconsistency (P) The savestack probably got out of sync. At least, there was an invalid enum on the top of it. panic: malloc (P) Something requested a negative number of bytes of malloc. panic: mapstart (P) The compiler is screwed up with respect to the map() function. panic: null array (P) One of the internal array routines was passed a null AV pointer. panic: pad_alloc (P) The compiler got confused about which scratch pad it was allocating and freeing temporaries and lexicals from. panic: pad_free curpad (P) The compiler got confused about which scratch pad it was allocating and freeing temporaries and lexicals from. panic: pad_free po (P) An invalid scratch pad offset was detected internally. panic: pad_reset curpad (P) The compiler got confused about which scratch pad it was allocating and freeing temporaries and lexicals from. panic: pad_sv po (P) An invalid scratch pad offset was detected internally. panic: pad_swipe curpad (P) The compiler got confused about which scratch pad it was allocating and freeing temporaries and lexicals from. panic: pad_swipe po (P) An invalid scratch pad offset was detected internally. panic: pp_iter (P) The foreach iterator got called in a non-loop context frame. 
panic: realloc (P) Something requested a negative number of bytes of realloc. panic: restartop (P) Some internal routine requested a goto (or something like it), and didn't supply the destination. panic: return (P) We popped the context stack to a subroutine or eval context, and then discovered it wasn't a subroutine or eval context. panic: scan_num (P) scan_num() got called on something that wasn't a number. panic: sv_insert (P) The sv_insert() routine was told to remove more string than there was string. panic: top_env (P) The compiler attempted to do a goto, or something weird like that. panic: yylex (P) The lexer got into a bad state while processing a case modifier. Parentheses missing around "%s" list (W) You said something like my $foo, $bar = @_; when you meant my ($foo, $bar) = @_; Remember that "my" and "local" bind closer than comma. Perl %3.3f required--this is only version %s, stopped (F) The module in question uses features of a version of Perl more recent than the currently running version. How long has it been since you upgraded, anyway? See the "require" entry in the perlfunc manpage. Permission denied (F) The setuid emulator in suidperl decided you were up to no good. pid %d not a child (W) A warning peculiar to VMS. Waitpid() was asked to wait for a process which isn't a subprocess of the current process. While this is fine from VMS' perspective, it's probably not what you intended. POSIX getpgrp can't take an argument (F) Your C compiler uses POSIX getpgrp(), which takes no argument, unlike the BSD version, which takes a pid. Possible attempt to put comments in qw() list (W) qw() lists contain items separated by whitespace; as with literal strings, comment characters are not ignored, but are instead treated as literal data. (You may have used different delimiters than the parentheses shown here; braces are also frequently used.) 
You probably wrote something like this: @list = qw( a # a comment b # another comment ); when you should have written this: @list = qw( a b ); If you really want comments, build your list the old-fashioned way, with quotes and commas: @list = ( 'a', # a comment 'b', # another comment ); Possible attempt to separate words with commas (W) qw() lists contain items separated by whitespace; therefore commas aren't needed to separate the items. (You may have used different delimiters than the parentheses shown here; braces are also frequently used.) You probably wrote something like this: qw! a, b, c !; which puts literal commas into some of the list items. Write it without commas if you don't want them to appear in your data: qw! a b c !; Possible memory corruption: %s overflowed 3rd argument (F) An ioctl() or fcntl() returned more than Perl was bargaining for. Perl guesses a reasonable buffer size, but puts a sentinel byte at the end of the buffer just in case. This sentinel byte got clobbered, and Perl assumes that memory is now corrupted. See the "ioctl" entry in the perlfunc manpage. Precedence problem: open %s should be open(%s) (S) The old irregular construct open FOO || die; is now misinterpreted as open(FOO || die); because of the strict regularization of Perl 5's grammar into unary and list operators. (The old open was a little of both.) You must put parentheses around the filehandle, or use the new "or" operator instead of "||". print on closed filehandle %s (W) The filehandle you're printing on got itself closed sometime before now. Check your logic flow. printf on closed filehandle %s (W) The filehandle you're writing to got itself closed sometime before now. Check your logic flow. 
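The "Precedence problem" entry above names two fixes: parenthesize open's arguments, or use the low-precedence "or" operator. A small sketch of both forms follows; the path is hypothetical and is assumed not to exist, so each open fails and the die is trapped with eval to show that it fires.

```perl
use strict;

# Hypothetical path, assumed not to exist, so both opens fail.
my $file = "/no/such/dir/example.txt";

# Form 1: parentheses around open's arguments, then ||.
eval { open(FOO, $file) || die "can't open $file\n"; };
my $err1 = $@;

# Form 2: no parentheses, but the low-precedence "or" operator.
eval { open FOO, $file or die "can't open $file\n"; };
my $err2 = $@;

print "form 1: $err1";
print "form 2: $err2";
```

In both forms the die applies to the result of open rather than being absorbed into its argument list.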
Probable precedence problem on %s (W) The compiler found a bareword where it expected a conditional, which often indicates that an || or && was parsed as part of the last argument of the previous construct, for example: open FOO || die; Prototype mismatch: %s vs %s (S) The subroutine being declared or defined had previously been declared or defined with a different function prototype. Range iterator outside integer range (F) One (or both) of the numeric arguments to the range operator ".." are outside the range which can be represented by integers internally. One possible workaround is to force Perl to use magical string increment by prepending "0" to your numbers. Read on closed filehandle <%s> (W) The filehandle you're reading from got itself closed sometime before now. Check your logic flow. Reallocation too large: %lx (F) You can't allocate more than 64K on an MS-DOS machine. Recompile perl with -DDEBUGGING to use -D switch (F) You can't use the -D option unless the code to produce the desired output is compiled into Perl, which entails some overhead, which is why it's currently left out of your copy. Recursive inheritance detected in package '%s' (F) More than 100 levels of inheritance were used. Probably indicates an unintended loop in your inheritance hierarchy. Recursive inheritance detected while looking for method '%s' in package '%s' (F) More than 100 levels of inheritance were encountered while invoking a method. Probably indicates an unintended loop in your inheritance hierarchy. Reference found where even-sized list expected (W) You gave a single reference where Perl was expecting a list with an even number of elements (for assignment to a hash). This usually means that you used the anon hash constructor when you meant to use parens. In any case, a hash requires key/value pairs. 
%hash = { one => 1, two => 2, }; # WRONG %hash = [ qw/ an anon array / ]; # WRONG %hash = ( one => 1, two => 2, ); # right %hash = qw( one 1 two 2 ); # also fine Reference miscount in sv_replace() (W) The internal sv_replace() function was handed a new SV with a reference count of other than 1. regexp *+ operand could be empty (F) The part of the regexp subject to either the * or + quantifier could match an empty string. regexp memory corruption (P) The regular expression engine got confused by what the regular expression compiler gave it. regexp out of space (P) A "can't happen" error, because safemalloc() should have caught it earlier. regexp too big (F) The current implementation of regular expressions uses shorts as address offsets within a string. Unfortunately this means that if the regular expression compiles to longer than 32767, it'll blow up. Usually when you want a regular expression this big, there is a better way to do it with multiple statements. See the perlre manpage. Reversed %s= operator (W) You wrote your assignment operator backwards. The = must always come last, to avoid ambiguity with subsequent unary operators. Runaway format (F) Your format contained the ~~ repeat-until-blank sequence, but it produced 200 lines at once, and the 200th line looked exactly like the 199th line. Apparently you didn't arrange for the arguments to exhaust themselves, either by using ^ instead of @ (for scalar variables), or by shifting or popping (for array variables). See the perlform manpage. Scalar value @%s[%s] better written as $%s[%s] (W) You've used an array slice (indicated by @) to select a single element of an array. Generally it's better to ask for a scalar value (indicated by $).
The difference is that `$foo[&bar]' always behaves like a scalar, both when assigning to it and when evaluating its argument, while `@foo[&bar]' behaves like a list when you assign to it, and provides a list context to its subscript, which can do weird things if you're expecting only one subscript. On the other hand, if you were actually hoping to treat the array element as a list, you need to look into how references work, because Perl will not magically convert between scalars and lists for you. See the perlref manpage. Scalar value @%s{%s} better written as $%s{%s} (W) You've used a hash slice (indicated by @) to select a single element of a hash. Generally it's better to ask for a scalar value (indicated by $). The difference is that `$foo{&bar}' always behaves like a scalar, both when assigning to it and when evaluating its argument, while `@foo{&bar}' behaves like a list when you assign to it, and provides a list context to its subscript, which can do weird things if you're expecting only one subscript. On the other hand, if you were actually hoping to treat the hash element as a list, you need to look into how references work, because Perl will not magically convert between scalars and lists for you. See the perlref manpage. Script is not setuid/setgid in suidperl (F) Oddly, the suidperl program was invoked on a script without a setuid or setgid bit set. This doesn't make much sense. Search pattern not terminated (F) The lexer couldn't find the final delimiter of a // or m{} construct. Remember that bracketing delimiters count nesting level. Missing the leading `$' from a variable `$m' may cause this error. %sseek() on unopened file (W) You tried to use the seek() or sysseek() function on a filehandle that was either never opened or has since been closed. select not implemented (F) This machine doesn't implement the select() system call. sem%s not implemented (F) You don't have System V semaphore IPC on your system. 
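The two "Scalar value ... better written as" entries above turn on the context the subscript is evaluated in. The sketch below makes that visible with wantarray(); the sub name bar echoes the entries' own `&bar', while the array contents and the other variable names are made up for illustration.

```perl
use strict;

my @foo = (10, 20, 30);

# Records the context it was called in, then returns index 1.
my $context = '';
sub bar { $context = wantarray ? 'list' : 'scalar'; return 1 }

my $elem = $foo[bar()];      # element: subscript evaluated in scalar context
my $ctx_elem = $context;

my ($one) = @foo[bar()];     # slice: subscript evaluated in list context
my $ctx_slice = $context;

print "element subscript context: $ctx_elem\n";
print "slice subscript context:   $ctx_slice\n";
```

Both forms fetch the same element here; the difference only matters when the subscript expression behaves differently in list context.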
semi-panic: attempt to dup freed string (S) The internal newSVsv() routine was called to duplicate a scalar that had previously been marked as free. Semicolon seems to be missing (W) A nearby syntax error was probably caused by a missing semicolon, or possibly some other missing operator, such as a comma. Send on closed socket (W) The filehandle you're sending to got itself closed sometime before now. Check your logic flow. Sequence (? incomplete (F) A regular expression ended with an incomplete extension (?. See the perlre manpage. Sequence (?#... not terminated (F) A regular expression comment must be terminated by a closing parenthesis. Embedded parentheses aren't allowed. See the perlre manpage. Sequence (?%s...) not implemented (F) A proposed regular expression extension has the character reserved but has not yet been written. See the perlre manpage. Sequence (?%s...) not recognized (F) You used a regular expression extension that doesn't make sense. See the perlre manpage. Server error Also known as "500 Server error". This is a CGI error, not a Perl error. You need to make sure your script is executable, is accessible by the user CGI is running the script under (which is probably not the user account you tested it under), does not rely on any environment variables (like PATH) from the user it isn't running under, and isn't in a location where the CGI server can't find it, basically, more or less. Please see the following for more information: http://www.perl.com/CPAN/doc/FAQs/cgi/idiots-guide.html http://www.perl.com/CPAN/doc/FAQs/cgi/perl-cgi-faq.html ftp://rtfm.mit.edu/pub/usenet/news.answers/www/cgi-faq http://hoohoo.ncsa.uiuc.edu/cgi/interface.html http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html You should also look at the perlfaq9 manpage. setegid() not implemented (F) You tried to assign to `$)', and your operating system doesn't support the setegid() system call (or equivalent), or at least Configure didn't think so. 
seteuid() not implemented (F) You tried to assign to `$>', and your operating system doesn't support the seteuid() system call (or equivalent), or at least Configure didn't think so. setrgid() not implemented (F) You tried to assign to `$(', and your operating system doesn't support the setrgid() system call (or equivalent), or at least Configure didn't think so. setruid() not implemented (F) You tried to assign to `$<', and your operating system doesn't support the setruid() system call (or equivalent), or at least Configure didn't think so. Setuid/gid script is writable by world (F) The setuid emulator won't run a script that is writable by the world, because the world might have written on it already. shm%s not implemented (F) You don't have System V shared memory IPC on your system. shutdown() on closed fd (W) You tried to do a shutdown on a closed socket. Seems a bit superfluous. SIG%s handler "%s" not defined (W) The signal handler named in %SIG doesn't, in fact, exist. Perhaps you put it into the wrong package? sort is now a reserved word (F) An ancient error message that almost nobody ever runs into anymore. But before sort was a keyword, people sometimes used it as a filehandle. Sort subroutine didn't return a numeric value (F) A sort comparison routine must return a number. You probably blew it by not using `<=>' or `cmp', or by not using them correctly. See the "sort" entry in the perlfunc manpage. Sort subroutine didn't return single value (F) A sort comparison subroutine may not return a list value with more or less than one element. See the "sort" entry in the perlfunc manpage. Split loop (P) The split was looping infinitely. (Obviously, a split shouldn't iterate more times than there are characters of input, which is what happened.) See the "split" entry in the perlfunc manpage. Stat on unopened file <%s> (W) You tried to use the stat() function (or an equivalent file test) on a filehandle that was either never opened or has since been closed. 
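The "Sort subroutine didn't return a numeric value" entry above points at `<=>' and `cmp'. A short sketch of comparison blocks that return a number, as sort requires; the sample lists are made up for illustration.

```perl
use strict;

my @nums  = (10, 9, 100, 1);
my @words = qw(pear Apple banana);

# Numeric comparison: <=> returns -1, 0, or 1.
my @by_number = sort { $a <=> $b } @nums;

# String comparison: cmp returns -1, 0, or 1 (case folded here with lc).
my @by_string = sort { lc($a) cmp lc($b) } @words;

print "@by_number\n";
print "@by_string\n";
```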
Statement unlikely to be reached (W) You did an exec() with some statement after it other than a die(). This is almost always an error, because exec() never returns unless there was a failure. You probably wanted to use system() instead, which does return. To suppress this warning, put the exec() in a block by itself. Strange *+?{} on zero-length expression (W) You applied a regular expression quantifier in a place where it makes no sense, such as on a zero-width assertion. Try putting the quantifier inside the assertion instead. For example, the way to match "abc" provided that it is followed by three repetitions of "xyz" is `/abc(?=(?:xyz){3})/', not `/abc(?=xyz){3}/'. Stub found while resolving method `%s' overloading `%s' in package `%s' (P) Overloading resolution over @ISA tree may be broken by importation stubs. Stubs should never be implicitly created, but explicit calls to `can' may break this. Subroutine %s redefined (W) You redefined a subroutine. To suppress this warning, say { local $^W = 0; eval "sub name { ... }"; } Substitution loop (P) The substitution was looping infinitely. (Obviously, a substitution shouldn't iterate more times than there are characters of input, which is what happened.) See the discussion of substitution in the section on "Quote and Quote-like Operators" in the perlop manpage. Substitution pattern not terminated (F) The lexer couldn't find the interior delimiter of a s/// or s{}{} construct. Remember that bracketing delimiters count nesting level. Missing the leading `$' from variable `$s' may cause this error. Substitution replacement not terminated (F) The lexer couldn't find the final delimiter of a s/// or s{}{} construct. Remember that bracketing delimiters count nesting level. Missing the leading `$' from variable `$s' may cause this error. substr outside of string (S),(W) You tried to reference a substr() that pointed outside of a string. That is, the absolute value of the offset was larger than the length of the string.
See the "substr" entry in the perlfunc manpage. This warning is mandatory if substr is used in an lvalue context (as the left hand side of an assignment or as a subroutine argument for example). suidperl is no longer needed since %s (F) Your Perl was compiled with -DSETUID_SCRIPTS_ARE_SECURE_NOW, but a version of the setuid emulator somehow got run anyway. syntax error (F) Probably means you had a syntax error. Common reasons include: A keyword is misspelled. A semicolon is missing. A comma is missing. An opening or closing parenthesis is missing. An opening or closing brace is missing. A closing quote is missing. Often there will be another error message associated with the syntax error giving more information. (Sometimes it helps to turn on -w.) The error message itself often tells you where it was in the line when it decided to give up. Sometimes the actual error is several tokens before this, because Perl is good at understanding random input. Occasionally the line number may be misleading, and once in a blue moon the only way to figure out what's triggering the error is to call `perl -c' repeatedly, chopping away half the program each time to see if the error went away. Sort of the cybernetic version of 20 questions. syntax error at line %d: `%s' unexpected (A) You've accidentally run your script through the Bourne shell instead of Perl. Check the #! line, or manually feed your script into Perl yourself. System V %s is not implemented on this machine (F) You tried to do something with a function beginning with "sem", "shm", or "msg" but that System V IPC is not implemented in your machine. In some machines the functionality can exist but be unconfigured. Consult your system support. Syswrite on closed filehandle (W) The filehandle you're writing to got itself closed sometime before now. Check your logic flow. Target of goto is too deeply nested (F) You tried to use `goto' to reach a label that was too deeply nested for Perl to reach. 
Perl is doing you a favor by refusing. tell() on unopened file (W) You tried to use the tell() function on a filehandle that was either never opened or has since been closed. Test on unopened file <%s> (W) You tried to invoke a file test operator on a filehandle that isn't open. Check your logic. See also the section on "-X" in the perlfunc manpage. That use of $[ is unsupported (F) Assignment to `$[' is now strictly circumscribed, and interpreted as a compiler directive. You may say only one of $[ = 0; $[ = 1; ... local $[ = 0; local $[ = 1; ... This is to prevent the problem of one module changing the array base out from under another module inadvertently. See the section on "$[" in the perlvar manpage. The %s function is unimplemented The function indicated isn't implemented on this architecture, according to the probings of Configure. The crypt() function is unimplemented due to excessive paranoia (F) Configure couldn't find the crypt() function on your machine, probably because your vendor didn't supply it, probably because they think the U.S. Government thinks it's a secret, or at least that they will continue to pretend that it is. And if you quote me on that, I will deny it. The stat preceding `-l _' wasn't an lstat (F) It makes no sense to test the current stat buffer for symbolic linkhood if the last stat that wrote to the stat buffer already went past the symlink to get to the real file. Use an actual filename instead. times not implemented (F) Your version of the C library apparently doesn't do times(). I suspect you're not running on Unix. Too few args to syscall (F) There has to be at least one argument to syscall() to specify the system call to call, silly dilly. Too late for "-T" option (X) The #! line (or local equivalent) in a Perl script contains the -T option, but Perl was not invoked with -T in its command line. 
This is an error because, by the time Perl discovers a -T in a script, it's too late to properly taint everything from the environment. So Perl gives up. If the Perl script is being executed as a command using the #! mechanism (or its local equivalent), this error can usually be fixed by editing the #! line so that the -T option is a part of Perl's first argument: e.g. change `perl -n -T' to `perl -T -n'. If the Perl script is being executed as `perl scriptname', then the -T option must appear on the command line: `perl -T scriptname'. Too late for "-%s" option (X) The #! line (or local equivalent) in a Perl script contains the -M or -m option. This is an error because -M and -m options are not intended for use inside scripts. Use the `use' pragma instead. Too many ('s Too many )'s (A) You've accidentally run your script through csh instead of Perl. Check the #! line, or manually feed your script into Perl yourself. Too many args to syscall (F) Perl supports a maximum of only 14 args to syscall(). Too many arguments for %s (F) The function requires fewer arguments than you specified. trailing \ in regexp (F) The regular expression ends with an unbackslashed backslash. Backslash it. See the perlre manpage. Transliteration pattern not terminated (F) The lexer couldn't find the interior delimiter of a tr/// or tr[][] or y/// or y[][] construct. Missing the leading `$' from variables `$tr' or `$y' may cause this error. Transliteration replacement not terminated (F) The lexer couldn't find the final delimiter of a tr/// or tr[][] construct. truncate not implemented (F) Your machine doesn't implement a file truncation mechanism that Configure knows about. Type of arg %d to %s must be %s (not %s) (F) This function requires the argument in that position to be of a certain type. Arrays must be @NAME or `@{EXPR}'. Hashes must be %NAME or `%{EXPR}'. No implicit dereferencing is allowed--use the {EXPR} forms as an explicit dereference. See the perlref manpage. 
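The "Type of arg %d to %s must be %s" entry above asks for the `@{EXPR}' and `%{EXPR}' forms as explicit dereferences. A minimal sketch, with made-up variable names, of passing referenced data to functions that want a real array or hash:

```perl
use strict;

my $list_ref = [ 1, 2 ];
my $map_ref  = { a => 1, b => 2 };

# push wants an array and keys wants a hash, so dereference explicitly.
push @{$list_ref}, 3;
my @names = sort keys %{$map_ref};

print scalar(@{$list_ref}), " elements; keys: @names\n";
```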
umask: argument is missing initial 0 (W) A umask of 222 is incorrect. It should be 0222, because octal literals always start with 0 in Perl, as in C. umask not implemented (F) Your machine doesn't implement the umask function and you tried to use it to restrict permissions for yourself (EXPR & 0700). Unable to create sub named "%s" (F) You attempted to create or access a subroutine with an illegal name. Unbalanced context: %d more PUSHes than POPs (W) The exit code detected an internal inconsistency in how many execution contexts were entered and left. Unbalanced saves: %d more saves than restores (W) The exit code detected an internal inconsistency in how many values were temporarily localized. Unbalanced scopes: %d more ENTERs than LEAVEs (W) The exit code detected an internal inconsistency in how many blocks were entered and left. Unbalanced tmps: %d more allocs than frees (W) The exit code detected an internal inconsistency in how many mortal scalars were allocated and freed. Undefined format "%s" called (F) The format indicated doesn't seem to exist. Perhaps it's really in another package? See the perlform manpage. Undefined sort subroutine "%s" called (F) The sort comparison routine specified doesn't seem to exist. Perhaps it's in a different package? See the "sort" entry in the perlfunc manpage. Undefined subroutine &%s called (F) The subroutine indicated hasn't been defined, or if it was, it has since been undefined. Undefined subroutine called (F) The anonymous subroutine you're trying to call hasn't been defined, or if it was, it has since been undefined. Undefined subroutine in sort (F) The sort comparison routine specified is declared but doesn't seem to have been defined yet. See the "sort" entry in the perlfunc manpage. Undefined top format "%s" called (F) The format indicated doesn't seem to exist. Perhaps it's really in another package? See the perlform manpage. 
Undefined value assigned to typeglob (W) An undefined value was assigned to a typeglob, a la `*foo = undef'. This does nothing. It's possible that you really mean `undef *foo'. unexec of %s into %s failed! (F) The unexec() routine failed for some reason. See your local FSF representative, who probably put it there in the first place. Unknown BYTEORDER (F) There are no byte-swapping functions for a machine with this byte order. unmatched () in regexp (F) Unbackslashed parentheses must always be balanced in regular expressions. If you're a vi user, the % key is valuable for finding the matching parenthesis. See the perlre manpage. Unmatched right bracket (F) The lexer counted more closing curly brackets (braces) than opening ones, so you're probably missing an opening bracket. As a general rule, you'll find the missing one (so to speak) near the place you were last editing. unmatched [] in regexp (F) The brackets around a character class must match. If you wish to include a closing bracket in a character class, backslash it or put it first. See the perlre manpage. Unquoted string "%s" may clash with future reserved word (W) You used a bareword that might someday be claimed as a reserved word. It's best to put such a word in quotes, or capitalize it somehow, or insert an underbar into it. You might also declare it as a subroutine. Unrecognized character %s (F) The Perl parser has no idea what to do with the specified character in your Perl script (or eval). Perhaps you tried to run a compressed script, a binary program, or a directory as a Perl program. Unrecognized signal name "%s" (F) You specified a signal name to the kill() function that was not recognized. Say `kill -l' in your shell to see the valid signal names on your system. Unrecognized switch: -%s (-h will show valid options) (F) You specified an illegal option to Perl. Don't do that. (If you think you didn't do that, check the #! line to see if it's supplying the bad switch on your behalf.) 
Unsuccessful %s on filename containing newline (W) A file operation was attempted on a filename, and that operation failed, PROBABLY because the filename contained a newline, PROBABLY because you forgot to chop() or chomp() it off. See the "chomp" entry in the perlfunc manpage. Unsupported directory function "%s" called (F) Your machine doesn't support opendir() and readdir(). Unsupported function fork (F) Your version of executable does not support forking. Note that under some systems, like OS/2, there may be different flavors of Perl executables, some of which may support fork, some not. Try changing the name you call Perl by to `perl_', `perl__', and so on. Unsupported function %s (F) This machine doesn't implement the indicated function, apparently. At least, Configure doesn't think so. Unsupported socket function "%s" called (F) Your machine doesn't support the Berkeley socket mechanism, or at least that's what Configure thought. Unterminated <> operator (F) The lexer saw a left angle bracket in a place where it was expecting a term, so it's looking for the corresponding right angle bracket, and not finding it. Chances are you left some needed parentheses out earlier in the line, and you really meant a "less than". Use of "$$" to mean "${$}" is deprecated (D) Perl versions before 5.004 misinterpreted any type marker followed by "$" and a digit. For example, "$$0" was incorrectly taken to mean "${$}0" instead of "${$0}". This bug is (mostly) fixed in Perl 5.004. However, the developers of Perl 5.004 could not fix this bug completely, because at least two widely-used modules depend on the old meaning of "$$0" in a string. So Perl 5.004 still interprets "$$" in the old (broken) way inside strings; but it generates this message as a warning. And in Perl 5.005, this special treatment will cease. Use of $# is deprecated (D) This was an ill-advised attempt to emulate a poorly defined awk feature. Use an explicit printf() or sprintf() instead. 
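The "Use of $# is deprecated" entry above recommends an explicit printf() or sprintf(). A small sketch of formatting a number at the point of use instead of through a global output format; the value and the format strings are arbitrary examples.

```perl
use strict;

my $value = 3.14159;

# Format where the number is used rather than setting a global format.
my $text = sprintf("%.2f", $value);
print "$text\n";

# printf formats and prints in one step.
printf("%.3e\n", $value);
```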
Use of $* is deprecated (D) This variable magically turned on multi-line pattern matching, both for you and for any luckless subroutine that you happen to call. You should use the new `//m' and `//s' modifiers now to do that without the dangerous action-at-a-distance effects of `$*'. Use of %s in printf format not supported (F) You attempted to use a feature of printf that is accessible from only C. This usually means there's a better way to do it in Perl. Use of bare << to mean <<"" is deprecated (D) You are now encouraged to use the explicitly quoted form if you wish to use an empty line as the terminator of the here-document. Use of implicit split to @_ is deprecated (D) It makes a lot of work for the compiler when you clobber a subroutine's argument list, so it's better if you assign the results of a split() explicitly to an array (or list). Use of inherited AUTOLOAD for non-method %s() is deprecated (D) As an (ahem) accidental feature, `AUTOLOAD' subroutines are looked up as methods (using the `@ISA' hierarchy) even when the subroutines to be autoloaded were called as plain functions (e.g. `Foo::bar()'), not as methods (e.g. `Foo->bar()' or `$obj->bar()'). This bug will be rectified in Perl 5.005, which will use method lookup only for methods' `AUTOLOAD's. However, there is a significant base of existing code that may be using the old behavior. So, as an interim step, Perl 5.004 issues an optional warning when non-methods use inherited `AUTOLOAD's. The simple rule is: Inheritance will not work when autoloading non-methods. The simple fix for old code is: In any module that used to depend on inheriting `AUTOLOAD' for non-methods from a base class named `BaseClass', execute `*AUTOLOAD = \&BaseClass::AUTOLOAD' during startup. In code that currently says `use AutoLoader; @ISA = qw(AutoLoader);' you should remove AutoLoader from @ISA and change `use AutoLoader;' to `use AutoLoader 'AUTOLOAD';'.
Use of reserved word "%s" is deprecated (D) The indicated bareword is a reserved word. Future versions of perl may use it as a keyword, so you're better off either explicitly quoting the word in a manner appropriate for its context of use, or using a different name altogether. The warning can be suppressed for subroutine names by either adding a `&' prefix, or using a package qualifier, e.g. `&our()', or `Foo::our()'. Use of %s is deprecated (D) The construct indicated is no longer recommended for use, generally because there's a better way to do it, and also because the old way has bad side effects. Use of uninitialized value (W) An undefined value was used as if it were already defined. It was interpreted as a "" or a 0, but maybe it was a mistake. To suppress this warning assign an initial value to your variables. Useless use of "re" pragma (W) You did `use re;' without any arguments. That isn't very useful. Useless use of %s in void context (W) You did something without a side effect in a context that does nothing with the return value, such as a statement that doesn't return a value from a block, or the left side of a scalar comma operator. Very often this points not to stupidity on your part, but a failure of Perl to parse your program the way you thought it would. For example, you'd get this if you mixed up your C precedence with Python precedence and said $one, $two = 1, 2; when you meant to say ($one, $two) = (1, 2); Another common error is to use ordinary parentheses to construct a list reference when you should be using square or curly brackets, for example, if you say $array = (1,2); when you should have said $array = [1,2]; The square brackets explicitly turn a list value into a scalar value, while parentheses do not. So when a parenthesized list is evaluated in a scalar context, the comma is treated like C's comma operator, which throws away the left argument, which is not what you want. See the perlref manpage for more on this. 
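The two corrected forms from the "void context" entry above can be tried together in a short sketch. The variable names follow the entry; the print statements are added only to show the results.

```perl
#!/usr/bin/perl -w
use strict;

# List assignment: parentheses on both sides.
my ($one, $two) = (1, 2);

# Square brackets build an anonymous array and return a reference to it.
my $array = [1, 2];

print "one=$one two=$two\n";
print "elements: @$array\n";
print "count: ", scalar(@$array), "\n";
```

Run under -w, this version produces no "Useless use" warning, because neither statement discards a value.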
untie attempted while %d inner references still exist
    (W) A copy of the object returned from `tie' (or `tied') was still valid when `untie' was called.

Value of %s can be "0"; test with defined()
    (W) In a conditional expression, you used <HANDLE>, <*> (glob), `each()', or `readdir()' as a boolean value. Each of these constructs can return a value of "0"; that would make the conditional expression false, which is probably not what you intended. When using these constructs in conditional expressions, test their values with the `defined' operator.

Variable "%s" is not imported%s
    (F) While "use strict" was in effect, you referred to a global variable that you apparently thought was imported from another module, because something else of the same name (usually a subroutine) is exported by that module. It usually means you put the wrong funny character on the front of your variable.

Variable "%s" may be unavailable
    (W) An inner (nested) *anonymous* subroutine is inside a *named* subroutine, and outside that is another subroutine; and the anonymous (innermost) subroutine is referencing a lexical variable defined in the outermost subroutine. For example:

        sub outermost { my $a; sub middle { sub { $a } } }

    If the anonymous subroutine is called or referenced (directly or indirectly) from the outermost subroutine, it will share the variable as you would expect. But if the anonymous subroutine is called or referenced when the outermost subroutine is not active, it will see the value of the shared variable as it was before and during the *first* call to the outermost subroutine, which is probably not what you want.

    In these circumstances, it is usually best to make the middle subroutine anonymous, using the `sub {}' syntax. Perl has specific support for shared variables in nested anonymous subroutines; a named subroutine in between interferes with this feature.
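To make the recommended fix concrete, here is a minimal sketch in which the middle subroutine is anonymous, so each call to the outer subroutine gets its own copy of the lexical. The variable names and the counter behaviour are invented for the illustration.

```perl
#!/usr/bin/perl -w
use strict;

sub outermost {
    my $start = shift;

    # The middle subroutine is anonymous, so every call to outermost()
    # creates a fresh closure over that call's own $start.
    my $middle = sub {
        return sub { return $start++ };
    };
    return $middle->();
}

my $from_five = outermost(5);
my $from_ten  = outermost(10);

my @seen = ($from_five->(), $from_five->(), $from_ten->());
print "@seen\n";    # prints "5 6 10"
```

The two counters advance independently, which is the sharing behaviour the entry describes for nested anonymous subroutines.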
Variable "%s" will not stay shared (W) An inner (nested) *named* subroutine is referencing a lexical variable defined in an outer subroutine. When the inner subroutine is called, it will probably see the value of the outer subroutine's variable as it was before and during the *first* call to the outer subroutine; in this case, after the first call to the outer subroutine is complete, the inner and outer subroutines will no longer share a common value for the variable. In other words, the variable will no longer be shared. Furthermore, if the outer subroutine is anonymous and references a lexical variable outside itself, then the outer and inner subroutines will *never* share the given variable. This problem can usually be solved by making the inner subroutine anonymous, using the `sub {}' syntax. When inner anonymous subs that reference variables in outer subroutines are called or referenced, they are automatically rebound to the current values of such variables. Variable syntax (A) You've accidentally run your script through csh instead of Perl. Check the #! line, or manually feed your script into Perl yourself. perl: warning: Setting locale failed. (S) The whole warning message will look something like: perl: warning: Setting locale failed. perl: warning: Please check that your locale settings: LC_ALL = "En_US", LANG = (unset) are supported and installed on your system. perl: warning: Falling back to the standard locale ("C"). Exactly what were the failed locale settings varies. In the above the settings were that the LC_ALL was "En_US" and the LANG had no value. This error means that Perl detected that you and/or your system administrator have set up the so-called variable system but Perl could not use those settings. This was not dead serious, fortunately: there is a "default locale" called "C" that Perl can and will use, the script will be run. Before you really fix the problem, however, you will get the same error message each time you run Perl. 
How to really fix the problem can be found in the perllocale manpage section LOCALE PROBLEMS. Warning: something's wrong (W) You passed warn() an empty string (the equivalent of `warn ""') or you called it with no args and `$_' was empty. Warning: unable to close filehandle %s properly (S) The implicit close() done by an open() got an error indication on the close(). This usually indicates your file system ran out of disk space. Warning: Use of "%s" without parentheses is ambiguous (S) You wrote a unary operator followed by something that looks like a binary operator that could also have been interpreted as a term or unary operator. For instance, if you know that the rand function has a default argument of 1.0, and you write rand + 5; you may THINK you wrote the same thing as rand() + 5; but in actual fact, you got rand(+5); So put in parentheses to say what you really mean. Write on closed filehandle (W) The filehandle you're writing to got itself closed sometime before now. Check your logic flow. X outside of string (F) You had a pack template that specified a relative position before the beginning of the string being unpacked. See the "pack" entry in the perlfunc manpage. x outside of string (F) You had a pack template that specified a relative position after the end of the string being unpacked. See the "pack" entry in the perlfunc manpage. Xsub "%s" called in sort (F) The use of an external subroutine as a sort comparison is not yet supported. Xsub called in sort (F) The use of an external subroutine as a sort comparison is not yet supported. You can't use `-l' on a filehandle (F) A filehandle represents an opened file, and when you opened the file it already went past any symlink you are presumably trying to look for. Use a filename instead. YOU HAVEN'T DISABLED SET-ID SCRIPTS IN THE KERNEL YET! (F) And you probably never will, because you probably don't have the sources to your kernel, and your vendor probably doesn't give a rip about what you want. 
    Your best bet is to use the wrapsuid script in the eg directory to put a setuid C wrapper around your script.

You need to quote "%s"
    (W) You assigned a bareword as a signal handler name. Unfortunately, you already have a subroutine of that name declared, which means that Perl 5 will try to call the subroutine when the assignment is executed, which is probably not what you want. (If it IS what you want, put an & in front.)

[gs]etsockopt() on closed fd
    (W) You tried to get or set a socket option on a closed socket. Did you forget to check the return value of your socket() call? See the "getsockopt" entry in the perlfunc manpage.

\1 better written as $1
    (W) Outside of patterns, backreferences live on as variables. The use of backslashes is grandfathered on the right-hand side of a substitution, but stylistically it's better to use the variable form because other Perl programmers will expect it, and it works better if there are more than 9 backreferences.

'|' and '<' may not both be specified on command line
    (F) An error peculiar to VMS. Perl does its own command line redirection, and found that STDIN was a pipe, and that you also tried to redirect STDIN using '<'. Only one STDIN stream to a customer, please.

'|' and '>' may not both be specified on command line
    (F) An error peculiar to VMS. Perl does its own command line redirection, and thinks you tried to redirect stdout both to a file and into a pipe to another command. You need to choose one or the other, though nothing's stopping you from piping into a program or Perl script which 'splits' output into two streams, such as

        open(OUT,">$ARGV[0]") or die "Can't write to $ARGV[0]: $!";
        while (<STDIN>) {
            print;
            print OUT;
        }
        close OUT;

Got an error from DosAllocMem
    (P) An error peculiar to OS/2. Most probably you're using an obsolete version of Perl, and this should not happen anyway.

Malformed PERLLIB_PREFIX
    (F) An error peculiar to OS/2.
    PERLLIB_PREFIX should be of the form

        prefix1;prefix2

    or

        prefix1 prefix2

    with nonempty prefix1 and prefix2. If `prefix1' is indeed a prefix of a builtin library search path, prefix2 is substituted. The error may appear if components are not found, or are too long. See "PERLLIB_PREFIX" in README.os2.

PERL_SH_DIR too long
    (F) An error peculiar to OS/2. PERL_SH_DIR is the directory to find the `sh'-shell in. See "PERL_SH_DIR" in README.os2.

Process terminated by SIG%s
    (W) This is a standard message issued by OS/2 applications, while *nix applications die in silence. It is considered a feature of the OS/2 port. One can easily disable this by appropriate sighandlers; see the section on "Signals" in the perlipc manpage. See also "Process terminated by SIGTERM/SIGINT" in README.os2.

perldsc section

NAME

perldsc - Perl Data Structures Cookbook

DESCRIPTION

The single feature most sorely lacking in the Perl programming language prior to its 5.0 release was complex data structures. Even without direct language support, some valiant programmers did manage to emulate them, but it was hard work and not for the faint of heart. You could occasionally get away with the `$m{$LoL,$b}' notation borrowed from *awk* in which the keys are actually more like a single concatenated string `"$LoL$b"', but traversal and sorting were difficult. More desperate programmers even hacked Perl's internal symbol table directly, a strategy that proved hard to develop and maintain--to put it mildly.

The 5.0 release of Perl let us have complex data structures. You may now write something like this and all of a sudden, you'd have an array with three dimensions!

    for $x (1 .. 10) {
        for $y (1 .. 10) {
            for $z (1 .. 10) {
                $LoL[$x][$y][$z] = $x ** $y + $z;
            }
        }
    }

Alas, however simple this may appear, underneath it's a much more elaborate construct than meets the eye!

How do you print it out? Why can't you say just `print @LoL'? How do you sort it?
How can you pass it to a function or get one of these back from a function? Is it an object? Can you save it to disk to read back later? How do you access whole rows or columns of that matrix? Do all the values have to be numeric?

As you see, it's quite easy to become confused. While some small portion of the blame for this can be attributed to the reference-based implementation, it's really more due to a lack of existing documentation with examples designed for the beginner.

This document is meant to be a detailed but understandable treatment of the many different sorts of data structures you might want to develop. It should also serve as a cookbook of examples. That way, when you need to create one of these complex data structures, you can just pinch, pilfer, or purloin a drop-in example from here.

Let's look at each of these possible constructs in detail. There are separate sections on each of the following:

    * arrays of arrays
    * hashes of arrays
    * arrays of hashes
    * hashes of hashes
    * more elaborate constructs

But for now, let's look at general issues common to all these types of data structures.

REFERENCES

The most important thing to understand about all data structures in Perl--including multidimensional arrays--is that even though they might appear otherwise, Perl `@ARRAY's and `%HASH'es are all internally one-dimensional. They can hold only scalar values (meaning a string, number, or a reference). They cannot directly contain other arrays or hashes, but instead contain *references* to other arrays or hashes.

You can't use a reference to an array or hash in quite the same way that you would a real array or hash. For C or C++ programmers unused to distinguishing between arrays and pointers to the same, this can be confusing. If so, just think of it as the difference between a structure and a pointer to a structure.

You can (and should) read more about references in the perlref(1) man page.
Briefly, references are rather like pointers that know what they point to. (Objects are also a kind of reference, but we won't be needing them right away--if ever.) This means that when you have something which looks to you like an access to a two-or-more-dimensional array and/or hash, what's really going on is that the base type is merely a one-dimensional entity that contains references to the next level. It's just that you can *use* it as though it were a two-dimensional one. This is actually the way almost all C multidimensional arrays work as well.

    $list[7][12]                        # array of arrays
    $list[7]{string}                    # array of hashes
    $hash{string}[7]                    # hash of arrays
    $hash{string}{'another string'}     # hash of hashes

Now, because the top level contains only references, if you try to print out your array with a simple print() function, you'll get something that doesn't look very nice, like this:

    @LoL = ( [2, 3], [4, 5, 7], [0] );
    print $LoL[1][2];
  7
    print @LoL;
  ARRAY(0x83c38)ARRAY(0x8b194)ARRAY(0x8b1d0)

That's because Perl doesn't (ever) implicitly dereference your variables. If you want to get at the thing a reference is referring to, then you have to do this yourself using either prefix typing indicators, like `${$blah}', `@{$blah}', `@{$blah[$i]}', or else postfix pointer arrows, like `$a->[3]', `$h->{fred}', or even `$ob->method()->[3]'.

COMMON MISTAKES

The two most common mistakes made in constructing something like an array of arrays are either accidentally counting the number of elements or else taking a reference to the same memory location repeatedly. Here's the case where you just get the count instead of a nested array:

    for $i (1..10) {
        @list = somefunc($i);
        $LoL[$i] = @list;       # WRONG!
    }

That's just the simple case of assigning an array to a scalar and getting its element count.
If that's what you really and truly want, then you might do well to consider being a tad more explicit about it, like this:

    for $i (1..10) {
        @list = somefunc($i);
        $counts[$i] = scalar @list;
    }

Here's the case of taking a reference to the same memory location again and again:

    for $i (1..10) {
        @list = somefunc($i);
        $LoL[$i] = \@list;      # WRONG!
    }

So, what's the big problem with that? It looks right, doesn't it? After all, I just told you that you need an array of references, so by golly, you've made me one!

Unfortunately, while this is true, it's still broken. All the references in @LoL refer to the *very same place*, and they will therefore all hold whatever was last in @list! It's similar to the problem demonstrated in the following C program:

    #include <stdio.h>
    #include <pwd.h>

    int main(void) {
        struct passwd *rp, *dp;

        rp = getpwnam("root");
        dp = getpwnam("daemon");

        printf("daemon name is %s\nroot name is %s\n",
                dp->pw_name, rp->pw_name);
        return 0;
    }

Which will print

    daemon name is daemon
    root name is daemon

The problem is that both `rp' and `dp' are pointers to the same location in memory! In C, you'd have to remember to malloc() yourself some new memory. In Perl, you'll want to use the array constructor `[]' or the hash constructor `{}' instead. Here's the right way to do the preceding broken code fragments:

    for $i (1..10) {
        @list = somefunc($i);
        $LoL[$i] = [ @list ];
    }

The square brackets make a reference to a new array with a *copy* of what's in @list at the time of the assignment. This is what you want.

Note that this will produce something similar, but it's much harder to read:

    for $i (1..10) {
        @list = 0 .. $i;
        @{$LoL[$i]} = @list;
    }

Is it the same? Well, maybe so--and maybe not. The subtle difference is that when you assign something in square brackets, you know for sure it's always a brand new reference with a new *copy* of the data. Something else could be going on in this new case with the `@{$LoL[$i]}' dereference on the left-hand-side of the assignment.
It all depends on whether `$LoL[$i]' had been undefined to start with, or whether it already contained a reference. If you had already populated @LoL with references, as in $LoL[3] = \@another_list; Then the assignment with the indirection on the left-hand-side would use the existing reference that was already there: @{$LoL[3]} = @list; Of course, this *would* have the "interesting" effect of clobbering @another_list. (Have you ever noticed how when a programmer says something is "interesting", that rather than meaning "intriguing", they're disturbingly more apt to mean that it's "annoying", "difficult", or both? :-) So just remember always to use the array or hash constructors with `[]' or `{}', and you'll be fine, although it's not always optimally efficient. Surprisingly, the following dangerous-looking construct will actually work out fine: for $i (1..10) { my @list = somefunc($i); $LoL[$i] = \@list; } That's because my() is more of a run-time statement than it is a compile-time declaration *per se*. This means that the my() variable is remade afresh each time through the loop. So even though it *looks* as though you stored the same variable reference each time, you actually did not! This is a subtle distinction that can produce more efficient code at the risk of misleading all but the most experienced of programmers. So I usually advise against teaching it to beginners. In fact, except for passing arguments to functions, I seldom like to see the gimme-a-reference operator (backslash) used much at all in code. Instead, I advise beginners that they (and most of the rest of us) should try to use the much more easily understood constructors `[]' and `{}' instead of relying upon lexical (or dynamic) scoping and hidden reference-counting to do the right thing behind the scenes. In summary: $LoL[$i] = [ @list ]; # usually best $LoL[$i] = \@list; # perilous; just how my() was that list? 
    @{ $LoL[$i] } = @list;      # way too tricky for most programmers

CAVEAT ON PRECEDENCE

Speaking of things like `@{$LoL[$i]}', the following are actually the same thing:

    $listref->[2][2]    # clear
    $$listref[2][2]     # confusing

That's because Perl's precedence rules on its five prefix dereferencers (which look like someone swearing: `$ @ * % &') make them bind more tightly than the postfix subscripting brackets or braces! This will no doubt come as a great shock to the C or C++ programmer, who is quite accustomed to using `*a[i]' to mean what's pointed to by the *i'th* element of `a'. That is, they first take the subscript, and only then dereference the thing at that subscript. That's fine in C, but this isn't C.

The seemingly equivalent construct in Perl, `$$listref[$i]', first does the deref of `$listref', making it take $listref as a reference to an array, and then dereference that, and finally tell you the *i'th* value of the array pointed to by $listref. If you wanted the C notion, you'd have to write `${$LoL[$i]}' to force the `$LoL[$i]' to get evaluated first before the leading `$' dereferencer.

WHY YOU SHOULD ALWAYS `use strict'

If this is starting to sound scarier than it's worth, relax. Perl has some features to help you avoid its most common pitfalls. The best way to avoid getting confused is to start every program like this:

    #!/usr/bin/perl -w
    use strict;

This way, you'll be forced to declare all your variables with my(), and accidental "symbolic dereferencing" will be disallowed.
Therefore if you'd done this: my $listref = [ [ "fred", "barney", "pebbles", "bambam", "dino", ], [ "homer", "bart", "marge", "maggie", ], [ "george", "jane", "elroy", "judy", ], ]; print $listref[2][2]; The compiler would immediately flag that as an error *at compile time*, because you were accidentally accessing `@listref', an undeclared variable, and it would thereby remind you to write instead: print $listref->[2][2] DEBUGGING Before version 5.002, the standard Perl debugger didn't do a very nice job of printing out complex data structures. With 5.002 or above, the debugger includes several new features, including command line editing as well as the `x' command to dump out complex data structures. For example, given the assignment to $LoL above, here's the debugger output: DB<1> x $LoL $LoL = ARRAY(0x13b5a0) 0 ARRAY(0x1f0a24) 0 'fred' 1 'barney' 2 'pebbles' 3 'bambam' 4 'dino' 1 ARRAY(0x13b558) 0 'homer' 1 'bart' 2 'marge' 3 'maggie' 2 ARRAY(0x13b540) 0 'george' 1 'jane' 2 'elroy' 3 'judy' CODE EXAMPLES Presented with little comment (these will get their own manpages someday) here are short code examples illustrating access of various types of data structures. LISTS OF LISTS Declaration of a LIST OF LISTS @LoL = ( [ "fred", "barney" ], [ "george", "jane", "elroy" ], [ "homer", "marge", "bart" ], ); Generation of a LIST OF LISTS # reading from file while ( <> ) { push @LoL, [ split ]; } # calling a function for $i ( 1 .. 10 ) { $LoL[$i] = [ somefunc($i) ]; } # using temp vars for $i ( 1 .. 10 ) { @tmp = somefunc($i); $LoL[$i] = [ @tmp ]; } # add to an existing row push @{ $LoL[0] }, "wilma", "betty"; Access and Printing of a LIST OF LISTS # one element $LoL[0][0] = "Fred"; # another element $LoL[1][1] =~ s/(\w)/\u$1/; # print the whole thing with refs for $aref ( @LoL ) { print "\t [ @$aref ],\n"; } # print the whole thing with indices for $i ( 0 .. $#LoL ) { print "\t [ @{$LoL[$i]} ],\n"; } # print the whole thing one at a time for $i ( 0 .. 
$#LoL ) { for $j ( 0 .. $#{ $LoL[$i] } ) { print "elt $i $j is $LoL[$i][$j]\n"; } } HASHES OF LISTS Declaration of a HASH OF LISTS %HoL = ( flintstones => [ "fred", "barney" ], jetsons => [ "george", "jane", "elroy" ], simpsons => [ "homer", "marge", "bart" ], ); Generation of a HASH OF LISTS # reading from file # flintstones: fred barney wilma dino while ( <> ) { next unless s/^(.*?):\s*//; $HoL{$1} = [ split ]; } # reading from file; more temps # flintstones: fred barney wilma dino while ( $line = <> ) { ($who, $rest) = split /:\s*/, $line, 2; @fields = split ' ', $rest; $HoL{$who} = [ @fields ]; } # calling a function that returns a list for $group ( "simpsons", "jetsons", "flintstones" ) { $HoL{$group} = [ get_family($group) ]; } # likewise, but using temps for $group ( "simpsons", "jetsons", "flintstones" ) { @members = get_family($group); $HoL{$group} = [ @members ]; } # append new members to an existing family push @{ $HoL{"flintstones"} }, "wilma", "betty"; Access and Printing of a HASH OF LISTS # one element $HoL{flintstones}[0] = "Fred"; # another element $HoL{simpsons}[1] =~ s/(\w)/\u$1/; # print the whole thing foreach $family ( keys %HoL ) { print "$family: @{ $HoL{$family} }\n" } # print the whole thing with indices foreach $family ( keys %HoL ) { print "family: "; foreach $i ( 0 .. 
$#{ $HoL{$family} } ) { print " $i = $HoL{$family}[$i]"; } print "\n"; } # print the whole thing sorted by number of members foreach $family ( sort { @{$HoL{$b}} <=> @{$HoL{$a}} } keys %HoL ) { print "$family: @{ $HoL{$family} }\n" } # print the whole thing sorted by number of members and name foreach $family ( sort { @{$HoL{$b}} <=> @{$HoL{$a}} || $a cmp $b } keys %HoL ) { print "$family: ", join(", ", sort @{ $HoL{$family} }), "\n"; } LISTS OF HASHES Declaration of a LIST OF HASHES @LoH = ( { Lead => "fred", Friend => "barney", }, { Lead => "george", Wife => "jane", Son => "elroy", }, { Lead => "homer", Wife => "marge", Son => "bart", } ); Generation of a LIST OF HASHES # reading from file # format: LEAD=fred FRIEND=barney while ( <> ) { $rec = {}; for $field ( split ) { ($key, $value) = split /=/, $field; $rec->{$key} = $value; } push @LoH, $rec; } # reading from file # format: LEAD=fred FRIEND=barney # no temp while ( <> ) { push @LoH, { split /[\s+=]/ }; } # calling a function that returns a key,value list, like # "lead","fred","daughter","pebbles" while ( %fields = getnextpairset() ) { push @LoH, { %fields }; } # likewise, but using no temp vars while (<>) { push @LoH, { parsepairs($_) }; } # add key/value to an element $LoH[0]{pet} = "dino"; $LoH[2]{pet} = "santa's little helper"; Access and Printing of a LIST OF HASHES # one element $LoH[0]{lead} = "fred"; # another element $LoH[1]{lead} =~ s/(\w)/\u$1/; # print the whole thing with refs for $href ( @LoH ) { print "{ "; for $role ( keys %$href ) { print "$role=$href->{$role} "; } print "}\n"; } # print the whole thing with indices for $i ( 0 .. $#LoH ) { print "$i is { "; for $role ( keys %{ $LoH[$i] } ) { print "$role=$LoH[$i]{$role} "; } print "}\n"; } # print the whole thing one at a time for $i ( 0 .. 
$#LoH ) { for $role ( keys %{ $LoH[$i] } ) { print "elt $i $role is $LoH[$i]{$role}\n"; } } HASHES OF HASHES Declaration of a HASH OF HASHES %HoH = ( flintstones => { lead => "fred", pal => "barney", }, jetsons => { lead => "george", wife => "jane", "his boy" => "elroy", }, simpsons => { lead => "homer", wife => "marge", kid => "bart", }, ); Generation of a HASH OF HASHES # reading from file # flintstones: lead=fred pal=barney wife=wilma pet=dino while ( <> ) { next unless s/^(.*?):\s*//; $who = $1; for $field ( split ) { ($key, $value) = split /=/, $field; $HoH{$who}{$key} = $value; } } # reading from file; more temps while ( <> ) { next unless s/^(.*?):\s*//; $who = $1; $rec = {}; $HoH{$who} = $rec; for $field ( split ) { ($key, $value) = split /=/, $field; $rec->{$key} = $value; } } # calling a function that returns a key,value hash for $group ( "simpsons", "jetsons", "flintstones" ) { $HoH{$group} = { get_family($group) }; } # likewise, but using temps for $group ( "simpsons", "jetsons", "flintstones" ) { %members = get_family($group); $HoH{$group} = { %members }; } # append new members to an existing family %new_folks = ( wife => "wilma", pet => "dino", ); for $what (keys %new_folks) { $HoH{flintstones}{$what} = $new_folks{$what}; } Access and Printing of a HASH OF HASHES # one element $HoH{flintstones}{wife} = "wilma"; # another element $HoH{simpsons}{lead} =~ s/(\w)/\u$1/; # print the whole thing foreach $family ( keys %HoH ) { print "$family: { "; for $role ( keys %{ $HoH{$family} } ) { print "$role=$HoH{$family}{$role} "; } print "}\n"; } # print the whole thing somewhat sorted foreach $family ( sort keys %HoH ) { print "$family: { "; for $role ( sort keys %{ $HoH{$family} } ) { print "$role=$HoH{$family}{$role} "; } print "}\n"; } # print the whole thing sorted by number of members foreach $family ( sort { keys %{$HoH{$b}} <=> keys %{$HoH{$a}} } keys %HoH ) { print "$family: { "; for $role ( sort keys %{ $HoH{$family} } ) { print "$role=$HoH{$family}{$role} 
"; } print "}\n"; } # establish a sort order (rank) for each role $i = 0; for ( qw(lead wife son daughter pal pet) ) { $rank{$_} = ++$i } # now print the whole thing sorted by number of members foreach $family ( sort { keys %{ $HoH{$b} } <=> keys %{ $HoH{$a} } } keys %HoH ) { print "$family: { "; # and print these according to rank order for $role ( sort { $rank{$a} <=> $rank{$b} } keys %{ $HoH{$family} } ) { print "$role=$HoH{$family}{$role} "; } print "}\n"; } MORE ELABORATE RECORDS Declaration of MORE ELABORATE RECORDS Here's a sample showing how to create and use a record whose fields are of many different sorts: $rec = { TEXT => $string, SEQUENCE => [ @old_values ], LOOKUP => { %some_table }, THATCODE => \&some_function, THISCODE => sub { $_[0] ** $_[1] }, HANDLE => \*STDOUT, }; print $rec->{TEXT}; print $rec->{SEQUENCE}[0]; $last = pop @ { $rec->{SEQUENCE} }; print $rec->{LOOKUP}{"key"}; ($first_k, $first_v) = each %{ $rec->{LOOKUP} }; $answer = $rec->{THATCODE}->($arg); $answer = $rec->{THISCODE}->($arg1, $arg2); # careful of extra block braces on fh ref print { $rec->{HANDLE} } "a string\n"; use FileHandle; $rec->{HANDLE}->autoflush(1); $rec->{HANDLE}->print(" a string\n"); Declaration of a HASH OF COMPLEX RECORDS %TV = ( flintstones => { series => "flintstones", nights => [ qw(monday thursday friday) ], members => [ { name => "fred", role => "lead", age => 36, }, { name => "wilma", role => "wife", age => 31, }, { name => "pebbles", role => "kid", age => 4, }, ], }, jetsons => { series => "jetsons", nights => [ qw(wednesday saturday) ], members => [ { name => "george", role => "lead", age => 41, }, { name => "jane", role => "wife", age => 39, }, { name => "elroy", role => "kid", age => 9, }, ], }, simpsons => { series => "simpsons", nights => [ qw(monday) ], members => [ { name => "homer", role => "lead", age => 34, }, { name => "marge", role => "wife", age => 37, }, { name => "bart", role => "kid", age => 11, }, ], }, ); Generation of a HASH OF COMPLEX 
RECORDS # reading from file # this is most easily done by having the file itself be # in the raw data format as shown above. perl is happy # to parse complex data structures if declared as data, so # sometimes it's easiest to do that # here's a piece by piece build up $rec = {}; $rec->{series} = "flintstones"; $rec->{nights} = [ find_days() ]; @members = (); # assume this file in field=value syntax while (<>) { %fields = split /[\s=]+/; push @members, { %fields }; } $rec->{members} = [ @members ]; # now remember the whole thing $TV{ $rec->{series} } = $rec; ########################################################### # now, you might want to make interesting extra fields that # include pointers back into the same data structure so if # change one piece, it changes everywhere, like for examples # if you wanted a {kids} field that was an array reference # to a list of the kids' records without having duplicate # records and thus update problems. ########################################################### foreach $family (keys %TV) { $rec = $TV{$family}; # temp pointer @kids = (); for $person ( @{ $rec->{members} } ) { if ($person->{role} =~ /kid|son|daughter/) { push @kids, $person; } } # REMEMBER: $rec and $TV{$family} point to same data!! $rec->{kids} = [ @kids ]; } # you copied the list, but the list itself contains pointers # to uncopied objects. 
this means that if you make bart get # older via $TV{simpsons}{kids}[0]{age}++; # then this would also change in print $TV{simpsons}{members}[2]{age}; # because $TV{simpsons}{kids}[0] and $TV{simpsons}{members}[2] # both point to the same underlying anonymous hash table # print the whole thing foreach $family ( keys %TV ) { print "the $family"; print " is on during @{ $TV{$family}{nights} }\n"; print "its members are:\n"; for $who ( @{ $TV{$family}{members} } ) { print " $who->{name} ($who->{role}), age $who->{age}\n"; } print "it turns out that $TV{$family}{lead} has "; print scalar ( @{ $TV{$family}{kids} } ), " kids named "; print join (", ", map { $_->{name} } @{ $TV{$family}{kids} } ); print "\n"; } Database Ties You cannot easily tie a multilevel data structure (such as a hash of hashes) to a dbm file. The first problem is that all but GDBM and Berkeley DB have size limitations, but beyond that, you also have problems with how references are to be represented on disk. One experimental module that does partially attempt to address this need is the MLDBM module. Check your nearest CPAN site as described in the perlmodlib manpage for source code to MLDBM. SEE ALSO perlref(1), perllol(1), perldata(1), perlobj(1) AUTHOR Tom Christiansen Last update: Wed Oct 23 04:57:50 MET DST 1996 perlembed section NAME perlembed - how to embed perl in your C program DESCRIPTION PREAMBLE Do you want to: Use C from Perl? Read the perlxstut manpage, the perlxs manpage, the h2xs manpage, and the perlguts manpage. Use a Unix program from Perl? Read about back-quotes and about `system' and `exec' in the perlfunc manpage. Use Perl from Perl? Read about the "do" entry in the perlfunc manpage and the "eval" entry in the perlfunc manpage and the "require" entry in the perlfunc manpage and the "use" entry in the perlfunc manpage. Use C from C? Rethink your design. Use Perl from C? Read on... 
ROADMAP the section on "Compiling your C program" the section on "Adding a Perl interpreter to your C program" the section on "Calling a Perl subroutine from your C program" the section on "Evaluating a Perl statement from your C program" the section on "Performing Perl pattern matches and substitutions from your C program" the section on "Fiddling with the Perl stack from your C program" the section on "Maintaining a persistent interpreter" the section on "Maintaining multiple interpreter instances" the section on "Using Perl modules, which themselves use C libraries, from your C program" the section on "Embedding Perl under Win32" Compiling your C program If you have trouble compiling the scripts in this documentation, you're not alone. The cardinal rule: COMPILE THE PROGRAMS IN EXACTLY THE SAME WAY THAT YOUR PERL WAS COMPILED. (Sorry for yelling.) Also, every C program that uses Perl must link in the *perl library*. What's that, you ask? Perl is itself written in C; the perl library is the collection of compiled C programs that were used to create your perl executable (*/usr/bin/perl* or equivalent). (Corollary: you can't use Perl from your C program unless Perl has been compiled on your machine, or installed properly--that's why you shouldn't blithely copy Perl executables from machine to machine without also copying the *lib* directory.) When you use Perl from C, your C program will--usually--allocate, "run", and deallocate a *PerlInterpreter* object, which is defined by the perl library. 
If your copy of Perl is recent enough to contain this documentation (version 5.002 or later), then the perl library (and *EXTERN.h* and *perl.h*, which you'll also need) will reside in a directory that looks like this: /usr/local/lib/perl5/your_architecture_here/CORE or perhaps just /usr/local/lib/perl5/CORE or maybe something like /usr/opt/perl5/CORE Execute this statement for a hint about where to find CORE: perl -MConfig -e 'print $Config{archlib}' Here's how you'd compile the example in the next section, the section on "Adding a Perl interpreter to your C program", on my Linux box: % gcc -O2 -Dbool=char -DHAS_BOOL -I/usr/local/include -I/usr/local/lib/perl5/i586-linux/5.003/CORE -L/usr/local/lib/perl5/i586-linux/5.003/CORE -o interp interp.c -lperl -lm (That's all one line.) On my DEC Alpha running old 5.003_05, the incantation is a bit different: % cc -O2 -Olimit 2900 -DSTANDARD_C -I/usr/local/include -I/usr/local/lib/perl5/alpha-dec_osf/5.00305/CORE -L/usr/local/lib/perl5/alpha-dec_osf/5.00305/CORE -L/usr/local/lib -D__LANGUAGE_C__ -D_NO_PROTO -o interp interp.c -lperl -lm How can you figure out what to add? Assuming your Perl is post-5.001, execute a `perl -V' command and pay special attention to the "cc" and "ccflags" information. You'll have to choose the appropriate compiler (*cc*, *gcc*, et al.) for your machine: `perl -MConfig -e 'print $Config{cc}'' will tell you what to use. You'll also have to choose the appropriate library directory (*/usr/local/lib/...*) for your machine. If your compiler complains that certain functions are undefined, or that it can't locate *-lperl*, then you need to change the path following the `-L'. If it complains that it can't find *EXTERN.h* and *perl.h*, you need to change the path following the `-I'. You may have to add extra libraries as well. Which ones? 
Perhaps those printed by

    perl -MConfig -e 'print $Config{libs}'

Provided your perl binary was properly configured and installed, the ExtUtils::Embed module will determine all of this information for you:

    % cc -o interp interp.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

If the ExtUtils::Embed module isn't part of your Perl distribution, you can retrieve it from http://www.perl.com/perl/CPAN/modules/by-module/ExtUtils/. (If this documentation came from your Perl distribution, then you're running 5.004 or better and you already have it.)

The ExtUtils::Embed kit on CPAN also contains all source code for the examples in this document, tests, additional examples and other information you may find useful.

Adding a Perl interpreter to your C program

In a sense, perl (the C program) is a good example of embedding Perl (the language), so I'll demonstrate embedding with *miniperlmain.c*, included in the source distribution. Here's a bastardized, nonportable version of *miniperlmain.c* containing the essentials of embedding:

    #include <EXTERN.h>               /* from the Perl distribution     */
    #include <perl.h>                 /* from the Perl distribution     */

    static PerlInterpreter *my_perl;  /***    The Perl interpreter    ***/

    int main(int argc, char **argv, char **env)
    {
        my_perl = perl_alloc();
        perl_construct(my_perl);
        perl_parse(my_perl, NULL, argc, argv, (char **)NULL);
        perl_run(my_perl);
        perl_destruct(my_perl);
        perl_free(my_perl);
    }

Notice that we don't use the `env' pointer. Normally handed to `perl_parse' as its final argument, `env' here is replaced by `NULL', which means that the current environment will be used.
Now compile this program (I'll call it *interp.c*) into an executable:

    % cc -o interp interp.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

After a successful compilation, you'll be able to use *interp* just like perl itself:

    % interp
    print "Pretty Good Perl \n";
    print "10890 - 9801 is ", 10890 - 9801;
    <CTRL-D>
    Pretty Good Perl
    10890 - 9801 is 1089

or

    % interp -e 'printf("%x", 3735928559)'
    deadbeef

You can also read and execute Perl statements from a file while in the midst of your C program, by placing the filename in *argv[1]* before calling *perl_run*.

Calling a Perl subroutine from your C program

To call individual Perl subroutines, you can use any of the perl_call_* functions documented in the perlcall manpage. In this example we'll use `perl_call_argv'.

That's shown below, in a program I'll call *showtime.c*.

    #include <EXTERN.h>
    #include <perl.h>

    static PerlInterpreter *my_perl;

    int main(int argc, char **argv, char **env)
    {
        char *args[] = { NULL };
        my_perl = perl_alloc();
        perl_construct(my_perl);

        perl_parse(my_perl, NULL, argc, argv, NULL);

        /*** skipping perl_run() ***/

        perl_call_argv("showtime", G_DISCARD | G_NOARGS, args);

        perl_destruct(my_perl);
        perl_free(my_perl);
    }

where *showtime* is a Perl subroutine that takes no arguments (that's the *G_NOARGS*) and for which I'll ignore the return value (that's the *G_DISCARD*). Those flags, and others, are discussed in the perlcall manpage.

I'll define the *showtime* subroutine in a file called *showtime.pl*:

    print "I shan't be printed.";

    sub showtime {
        print time;
    }

Simple enough. Now compile and run:

    % cc -o showtime showtime.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

    % showtime showtime.pl
    818284590

yielding the number of seconds that elapsed between January 1, 1970 (the beginning of the Unix epoch), and the moment I began writing this sentence.
In this particular case we don't have to call *perl_run*, but in general it's considered good practice to ensure proper initialization of library code, including execution of all object `DESTROY' methods and package `END {}' blocks. If you want to pass arguments to the Perl subroutine, you can add strings to the `NULL'-terminated `args' list passed to *perl_call_argv*. For other data types, or to examine return values, you'll need to manipulate the Perl stack. That's demonstrated in the last section of this document: the section on "Fiddling with the Perl stack from your C program". Evaluating a Perl statement from your C program Perl provides two API functions to evaluate pieces of Perl code. These are the "perl_eval_sv" entry in the perlguts manpage and the "perl_eval_pv" entry in the perlguts manpage. Arguably, these are the only routines you'll ever need to execute snippets of Perl code from within your C program. Your code can be as long as you wish; it can contain multiple statements; it can employ the "use" entry in the perlfunc manpage, the "require" entry in the perlfunc manpage, and the "do" entry in the perlfunc manpage to include external Perl files. *perl_eval_pv* lets us evaluate individual Perl strings, and then extract variables for coercion into C types. The following program, *string.c*, executes three Perl strings, extracting an `int' from the first, a `float' from the second, and a `char *' from the third. 
    #include <EXTERN.h>
    #include <perl.h>

    static PerlInterpreter *my_perl;

    int main (int argc, char **argv, char **env)
    {
        STRLEN n_a;
        char *embedding[] = { "", "-e", "0" };

        my_perl = perl_alloc();
        perl_construct( my_perl );

        perl_parse(my_perl, NULL, 3, embedding, NULL);
        perl_run(my_perl);

        /** Treat $a as an integer **/
        perl_eval_pv("$a = 3; $a **= 2", TRUE);
        printf("a = %d\n", (int)SvIV(perl_get_sv("a", FALSE)));

        /** Treat $a as a float **/
        perl_eval_pv("$a = 3.14; $a **= 2", TRUE);
        printf("a = %f\n", SvNV(perl_get_sv("a", FALSE)));

        /** Treat $a as a string **/
        perl_eval_pv("$a = 'rekcaH lreP rehtonA tsuJ'; $a = reverse($a);", TRUE);
        printf("a = %s\n", SvPV(perl_get_sv("a", FALSE), n_a));

        perl_destruct(my_perl);
        perl_free(my_perl);
    }

All of those strange functions with *sv* in their names help convert Perl scalars to C types. They're described in the perlguts manpage.

If you compile and run *string.c*, you'll see the results of using *SvIV()* to create an `int', *SvNV()* to create a `float', and *SvPV()* to create a string:

    a = 9
    a = 9.859600
    a = Just Another Perl Hacker

In the example above, we've created a global variable to temporarily store the computed value of our eval'd expression. It is also possible, and in most cases a better strategy, to fetch the return value from *perl_eval_pv()* instead. Example:

    ...
    STRLEN n_a;
    SV *val = perl_eval_pv("reverse 'rekcaH lreP rehtonA tsuJ'", TRUE);
    printf("%s\n", SvPV(val,n_a));
    ...

This way, we avoid namespace pollution by not creating global variables and we've simplified our code as well.

Performing Perl pattern matches and substitutions from your C program

The *perl_eval_sv()* function lets us evaluate strings of Perl code, so we can define some functions that use it to "specialize" in matches and substitutions: *match()*, *substitute()*, and *matches()*.
    I32 match(SV *string, char *pattern);

Given a string and a pattern (e.g., `m/clasp/' or `/\b\w*\b/', which in your C program might appear as "/\\b\\w*\\b/"), match() returns 1 if the string matches the pattern and 0 otherwise.

    I32 substitute(SV **string, char *pattern);

Given a pointer to an `SV' and an `=~' operation (e.g., `s/bob/robert/g' or `tr[A-Z][a-z]'), substitute() modifies the string within the `SV' according to the operation, returning the number of substitutions made.

    I32 matches(SV *string, char *pattern, AV **matches);

Given an `SV', a pattern, and a pointer to an empty `AV', matches() evaluates `$string =~ $pattern' in an array context, and fills in *matches* with the array elements, returning the number of matches found.

Here's a sample program, *match.c*, that uses all three (long lines have been wrapped here):

    #include <EXTERN.h>
    #include <perl.h>

    /** my_perl_eval_sv(code, error_check)
    ** kinda like perl_eval_sv(),
    ** but we pop the return value off the stack
    **/
    SV* my_perl_eval_sv(SV *sv, I32 croak_on_error)
    {
        dSP;
        SV* retval;
        STRLEN n_a;

        PUSHMARK(SP);
        perl_eval_sv(sv, G_SCALAR);

        SPAGAIN;
        retval = POPs;
        PUTBACK;

        if (croak_on_error && SvTRUE(ERRSV))
            croak(SvPVx(ERRSV, n_a));

        return retval;
    }

    /** match(string, pattern)
    **
    ** Used for matches in a scalar context.
    **
    ** Returns 1 if the match was successful; 0 otherwise.
    **/
    I32 match(SV *string, char *pattern)
    {
        SV *command = NEWSV(1099, 0), *retval;
        STRLEN n_a;

        sv_setpvf(command, "my $string = '%s'; $string =~ %s",
                  SvPV(string,n_a), pattern);

        retval = my_perl_eval_sv(command, TRUE);
        SvREFCNT_dec(command);

        return SvIV(retval);
    }

    /** substitute(string, pattern)
    **
    ** Used for =~ operations that modify their left-hand side (s/// and tr///)
    **
    ** Returns the number of successful matches, and
    ** modifies the input string if there were any.
**/ I32 substitute(SV **string, char *pattern) { SV *command = NEWSV(1099, 0), *retval; STRLEN n_a; sv_setpvf(command, "$string = '%s'; ($string =~ %s)", SvPV(*string,n_a), pattern); retval = my_perl_eval_sv(command, TRUE); SvREFCNT_dec(command); *string = perl_get_sv("string", FALSE); return SvIV(retval); } /** matches(string, pattern, matches) ** ** Used for matches in an array context. ** ** Returns the number of matches, ** and fills in **matches with the matching substrings **/ I32 matches(SV *string, char *pattern, AV **match_list) { SV *command = NEWSV(1099, 0); I32 num_matches; STRLEN n_a; sv_setpvf(command, "my $string = '%s'; @array = ($string =~ %s)", SvPV(string,n_a), pattern); my_perl_eval_sv(command, TRUE); SvREFCNT_dec(command); *match_list = perl_get_av("array", FALSE); num_matches = av_len(*match_list) + 1; /** assume $[ is 0 **/ return num_matches; } main (int argc, char **argv, char **env) { PerlInterpreter *my_perl = perl_alloc(); char *embedding[] = { "", "-e", "0" }; AV *match_list; I32 num_matches, i; SV *text = NEWSV(1099,0); STRLEN n_a; perl_construct(my_perl); perl_parse(my_perl, NULL, 3, embedding, NULL); sv_setpv(text, "When he is at a convenience store and the bill comes to some amount like 76 cents, Maynard is aware that there is something he *should* do, something that will enable him to get back a quarter, but he has no idea *what*. He fumbles through his red squeezey changepurse and gives the boy three extra pennies with his dollar, hoping that he might luck into the correct amount. The boy gives him back two of his own pennies and then the big shiny quarter that is his prize. -RICHH"); if (match(text, "m/quarter/")) /** Does text contain 'quarter'? **/ printf("match: Text contains the word 'quarter'.\n\n"); else printf("match: Text doesn't contain the word 'quarter'.\n\n"); if (match(text, "m/eighth/")) /** Does text contain 'eighth'? 
**/ printf("match: Text contains the word 'eighth'.\n\n"); else printf("match: Text doesn't contain the word 'eighth'.\n\n"); /** Match all occurrences of /wi../ **/ num_matches = matches(text, "m/(wi..)/g", &match_list); printf("matches: m/(wi..)/g found %d matches...\n", num_matches); for (i = 0; i < num_matches; i++) printf("match: %s\n", SvPV(*av_fetch(match_list, i, FALSE),n_a)); printf("\n"); /** Remove all vowels from text **/ num_matches = substitute(&text, "s/[aeiou]//gi"); if (num_matches) { printf("substitute: s/[aeiou]//gi...%d substitutions made.\n", num_matches); printf("Now text is: %s\n\n", SvPV(text,n_a)); } /** Attempt a substitution **/ if (!substitute(&text, "s/Perl/C/")) { printf("substitute: s/Perl/C...No substitution made.\n\n"); } SvREFCNT_dec(text); PL_perl_destruct_level = 1; perl_destruct(my_perl); perl_free(my_perl); } which produces the output (again, long lines have been wrapped here) match: Text contains the word 'quarter'. match: Text doesn't contain the word 'eighth'. matches: m/(wi..)/g found 2 matches... match: will match: with substitute: s/[aeiou]//gi...139 substitutions made. Now text is: Whn h s t cnvnnc str nd th bll cms t sm mnt lk 76 cnts, Mynrd s wr tht thr s smthng h *shld* d, smthng tht wll nbl hm t gt bck qrtr, bt h hs n d *wht*. H fmbls thrgh hs rd sqzy chngprs nd gvs th by thr xtr pnns wth hs dllr, hpng tht h mght lck nt th crrct mnt. Th by gvs hm bck tw f hs wn pnns nd thn th bg shny qrtr tht s hs prz. -RCHH substitute: s/Perl/C...No substitution made. Fiddling with the Perl stack from your C program When trying to explain stacks, most computer science textbooks mumble something about spring-loaded columns of cafeteria plates: the last thing you pushed on the stack is the first thing you pop off. That'll do for our purposes: your C program will push some arguments onto "the Perl stack", shut its eyes while some magic happens, and then pop the results--the return value of your Perl subroutine--off the stack. 
First you'll need to know how to convert between C types and Perl types, with newSViv() and sv_setnv() and newAV() and all their friends. They're described in the perlguts manpage.

Then you'll need to know how to manipulate the Perl stack. That's described in the perlcall manpage.

Once you've understood those, embedding Perl in C is easy.

Because C has no builtin function for integer exponentiation, let's make Perl's ** operator available to it (this is less useful than it sounds, because Perl implements ** with C's *pow()* function). First I'll create a stub exponentiation function in *power.pl*:

    sub expo {
        my ($a, $b) = @_;
        return $a ** $b;
    }

Now I'll create a C program, *power.c*, with a function *PerlPower()* that contains all the perlguts necessary to push the two arguments into *expo()* and to pop the return value out. Take a deep breath...

    #include <EXTERN.h>
    #include <perl.h>

    static PerlInterpreter *my_perl;

    static void
    PerlPower(int a, int b)
    {
      dSP;                            /* initialize stack pointer      */
      ENTER;                          /* everything created after here */
      SAVETMPS;                       /* ...is a temporary variable.   */
      PUSHMARK(SP);                   /* remember the stack pointer    */
      XPUSHs(sv_2mortal(newSViv(a))); /* push the base onto the stack  */
      XPUSHs(sv_2mortal(newSViv(b))); /* push the exponent onto stack  */
      PUTBACK;                        /* make local stack pointer global */
      perl_call_pv("expo", G_SCALAR); /* call the function             */
      SPAGAIN;                        /* refresh stack pointer         */
                                      /* pop the return value from stack */
      printf ("%d to the %dth power is %d.\n", a, b, POPi);
      PUTBACK;
      FREETMPS;                       /* free that return value        */
      LEAVE;                          /* ...and the XPUSHed "mortal" args.*/
    }

    int main (int argc, char **argv, char **env)
    {
      char *my_argv[] = { "", "power.pl" };

      my_perl = perl_alloc();
      perl_construct( my_perl );

      perl_parse(my_perl, NULL, 2, my_argv, (char **)NULL);
      perl_run(my_perl);

      PerlPower(3, 4);                /*** Compute 3 ** 4 ***/

      perl_destruct(my_perl);
      perl_free(my_perl);
    }

Compile and run:

    % cc -o power power.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

    % power
    3 to the 4th power is 81.
Maintaining a persistent interpreter

When developing interactive and/or potentially long-running applications, it's a good idea to maintain a persistent interpreter rather than allocating and constructing a new interpreter multiple times. The major reason is speed, since Perl will be loaded into memory only once.

However, you have to be more cautious with namespace and variable scoping when using a persistent interpreter. In previous examples we've been using global variables in the default package `main'. We knew exactly what code would be run, and assumed we could avoid variable collisions and outrageous symbol table growth.

Let's say your application is a server that will occasionally run Perl code from some arbitrary file. Your server has no way of knowing what code it's going to run. Very dangerous.

If the file is pulled in by `perl_parse()', compiled into a newly constructed interpreter, and subsequently cleaned out with `perl_destruct()', you're shielded from most namespace troubles.

One way to avoid namespace collisions in this scenario is to translate the filename into a guaranteed-unique package name, and then compile the code into that package using the "eval" entry in the perlfunc manpage. In the example below, each file will only be compiled once. Or, the application might choose to clean out the symbol table associated with the file after it's no longer needed. Using the "perl_call_argv" entry in the perlcall manpage, we'll call the subroutine `Embed::Persistent::eval_file' which lives in the file `persistent.pl' and pass the filename and boolean cleanup/cache flag as arguments.

Note that the process will continue to grow for each file that it uses. In addition, there might be `AUTOLOAD'ed subroutines and other conditions that cause Perl's symbol table to grow. You might want to add some logic that keeps track of the process size, or restarts itself after a certain number of requests, to ensure that memory consumption is minimized.
You'll also want to scope your variables with the "my" entry in the perlfunc manpage whenever possible.

    package Embed::Persistent;
    #persistent.pl

    use strict;
    use vars '%Cache';
    use Symbol qw(delete_package);

    sub valid_package_name {
        my($string) = @_;
        $string =~ s/([^A-Za-z0-9\/])/sprintf("_%2x",unpack("C",$1))/eg;
        # second pass only for words starting with a digit
        $string =~ s|/(\d)|sprintf("/_%2x",unpack("C",$1))|eg;

        # Dress it up as a real package name
        $string =~ s|/|::|g;
        return "Embed" . $string;
    }

    sub eval_file {
        my($filename, $delete) = @_;
        my $package = valid_package_name($filename);
        my $mtime = -M $filename;
        if(defined $Cache{$package}{mtime}
           &&
           $Cache{$package}{mtime} <= $mtime)
        {
            # we have compiled this subroutine already,
            # it has not been updated on disk, nothing left to do
            print STDERR "already compiled $package->handler\n";
        }
        else {
            local *FH;
            open FH, $filename or die "open '$filename' $!";
            local($/) = undef;
            my $sub = <FH>;          # slurp the whole file
            close FH;

            #wrap the code into a subroutine inside our unique package
            my $eval = qq{package $package; sub handler { $sub; }};
            {
                # hide our variables within this block
                my($filename,$mtime,$package,$sub);
                eval $eval;
            }
            die $@ if $@;

            #cache it unless we're cleaning out each time
            $Cache{$package}{mtime} = $mtime unless $delete;
        }

        eval {$package->handler;};
        die $@ if $@;

        delete_package($package) if $delete;

        #take a look if you want
        #print Devel::Symdump->rnew($package)->as_string, $/;
    }

    1;

    __END__

    /* persistent.c */
    #include <EXTERN.h>
    #include <perl.h>

    /* 1 = clean out filename's symbol table after each request, 0 = don't */
    #ifndef DO_CLEAN
    #define DO_CLEAN 0
    #endif

    /* args[] holds strings, so pass the flag as "0" or "1" */
    #define STRINGIFY_(x) #x
    #define STRINGIFY(x) STRINGIFY_(x)

    static PerlInterpreter *perl = NULL;

    int
    main(int argc, char **argv, char **env)
    {
        char *embedding[] = { "", "persistent.pl" };
        char *args[] = { "", STRINGIFY(DO_CLEAN), NULL };
        char filename [1024];
        int exitstatus = 0;
        STRLEN n_a;

        if((perl = perl_alloc()) == NULL) {
            fprintf(stderr, "no memory!");
            exit(1);
        }
        perl_construct(perl);

        exitstatus = perl_parse(perl, NULL, 2, embedding, NULL);

        if(!exitstatus) {
            exitstatus = perl_run(perl);

            /* fgets() instead of gets(): it cannot overrun filename */
            while(printf("Enter file name: ") &&
                  fgets(filename, sizeof filename, stdin)) {

                /* strip the trailing newline that fgets() keeps */
                filename[strcspn(filename, "\n")] = '\0';

                /* call the subroutine, passing it the filename as an argument */
                args[0] = filename;
                perl_call_argv("Embed::Persistent::eval_file",
                               G_DISCARD | G_EVAL, args);

                /* check $@ */
                if(SvTRUE(ERRSV))
                    fprintf(stderr, "eval error: %s\n", SvPV(ERRSV,n_a));
            }
        }

        PL_perl_destruct_level = 0;
        perl_destruct(perl);
        perl_free(perl);
        exit(exitstatus);
    }

Now compile:

    % cc -o persistent persistent.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

Here's an example script file:

    #test.pl
    my $string = "hello";
    foo($string);

    sub foo {
        print "foo says: @_\n";
    }

Now run:

    % persistent
    Enter file name: test.pl
    foo says: hello
    Enter file name: test.pl
    already compiled Embed::test_2epl->handler
    foo says: hello
    Enter file name: ^C

Maintaining multiple interpreter instances

Some rare applications will need to create more than one interpreter during a session. Such an application might sporadically decide to release any resources associated with the interpreter.

The program must take care to ensure that this takes place *before* the next interpreter is constructed. By default, the global variable `PL_perl_destruct_level' is set to `0', since extra cleaning isn't needed when a program has only one interpreter.

Setting `PL_perl_destruct_level' to `1' makes everything squeaky clean:

    PL_perl_destruct_level = 1;

    while(1) {
        ...
        /* reset global variables here with PL_perl_destruct_level = 1 */
        perl_construct(my_perl);
        ...
        /* clean and reset _everything_ during perl_destruct */
        perl_destruct(my_perl);
        perl_free(my_perl);
        ...
        /* let's go do it again! */
    }

When *perl_destruct()* is called, the interpreter's syntax parse tree and symbol tables are cleaned up, and global variables are reset.

Now suppose we have more than one interpreter instance running at the same time. This is feasible, but only if you used the `-DMULTIPLICITY' flag when building Perl. By default, that sets `PL_perl_destruct_level' to `1'.
Let's give it a try:

    #include <EXTERN.h>
    #include <perl.h>

    /* we're going to embed two interpreters */

    #define SAY_HELLO "-e", "print qq(Hi, I'm $^X\n)"

    int main(int argc, char **argv, char **env)
    {
        PerlInterpreter
            *one_perl = perl_alloc(),
            *two_perl = perl_alloc();
        char *one_args[] = { "one_perl", SAY_HELLO };
        char *two_args[] = { "two_perl", SAY_HELLO };

        perl_construct(one_perl);
        perl_construct(two_perl);

        perl_parse(one_perl, NULL, 3, one_args, (char **)NULL);
        perl_parse(two_perl, NULL, 3, two_args, (char **)NULL);

        perl_run(one_perl);
        perl_run(two_perl);

        perl_destruct(one_perl);
        perl_destruct(two_perl);

        perl_free(one_perl);
        perl_free(two_perl);
    }

Compile as usual:

    % cc -o multiplicity multiplicity.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

Run it:

    % multiplicity
    Hi, I'm one_perl
    Hi, I'm two_perl

Using Perl modules, which themselves use C libraries, from your C program

If you've played with the examples above and tried to embed a script that *use()*s a Perl module (such as *Socket*) which itself uses a C or C++ library, this probably happened:

    Can't load module Socket, dynamic loading not available in this perl.
      (You may need to build a new perl executable which either supports
      dynamic loading or has the Socket module statically linked into it.)

What's wrong?

Your interpreter doesn't know how to communicate with these extensions on its own. A little glue will help. Up until now you've been calling *perl_parse()*, handing it NULL for the second argument:

    perl_parse(my_perl, NULL, argc, my_argv, NULL);

That's where the glue code can be inserted to create the initial contact between Perl and linked C/C++ routines.
Let's take a look at some pieces of *perlmain.c* to see how Perl does this:

    #ifdef __cplusplus
    #  define EXTERN_C extern "C"
    #else
    #  define EXTERN_C extern
    #endif

    static void xs_init _((void));

    EXTERN_C void boot_DynaLoader _((CV* cv));
    EXTERN_C void boot_Socket _((CV* cv));

    EXTERN_C void
    xs_init()
    {
        char *file = __FILE__;
        /* DynaLoader is a special case */
        newXS("DynaLoader::boot_DynaLoader", boot_DynaLoader, file);
        newXS("Socket::bootstrap", boot_Socket, file);
    }

Simply put: for each extension linked with your Perl executable (determined during its initial configuration on your computer or when adding a new extension), a Perl subroutine is created to incorporate the extension's routines. Normally, that subroutine is named *Module::bootstrap()* and is invoked when you say *use Module*. In turn, this hooks into an XSUB, *boot_Module*, which creates a Perl counterpart for each of the extension's XSUBs. Don't worry about this part; leave that to the *xsubpp* and extension authors. If your extension is dynamically loaded, DynaLoader creates *Module::bootstrap()* for you on the fly. In fact, if you have a working DynaLoader then there is rarely any need to link in any other extensions statically.

Once you have this code, slap it into the second argument of *perl_parse()*:

    perl_parse(my_perl, xs_init, argc, my_argv, NULL);

Then compile:

    % cc -o interp interp.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

    % interp
      use Socket;
      use SomeDynamicallyLoadedModule;

      print "Now I can use extensions!\n"

ExtUtils::Embed can also automate writing the *xs_init* glue code.

    % perl -MExtUtils::Embed -e xsinit -- -o perlxsi.c
    % cc -c perlxsi.c `perl -MExtUtils::Embed -e ccopts`
    % cc -c interp.c  `perl -MExtUtils::Embed -e ccopts`
    % cc -o interp perlxsi.o interp.o `perl -MExtUtils::Embed -e ldopts`

Consult the perlxs manpage and the perlguts manpage for more details.

Embedding Perl under Win32

At the time of this writing (5.004), there are two versions of Perl which run under Win32.
(The two versions are merging in 5.005.) Interfacing to ActiveState's Perl library is quite different from the examples in this documentation, as significant changes were made to the internal Perl API. However, it is possible to embed ActiveState's Perl runtime. For details, see the Perl for Win32 FAQ at http://www.perl.com/CPAN/doc/FAQs/win32/perlwin32faq.html. With the "official" Perl version 5.004 or higher, all the examples within this documentation will compile and run untouched, although the build process is slightly different between Unix and Win32. For starters, backticks don't work under the Win32 native command shell. The ExtUtils::Embed kit on CPAN ships with a script called genmake, which generates a simple makefile to build a program from a single C source file. It can be used like this: C:\ExtUtils-Embed\eg> perl genmake interp.c C:\ExtUtils-Embed\eg> nmake C:\ExtUtils-Embed\eg> interp -e "print qq{I'm embedded in Win32!\n}" You may wish to use a more robust environment such as the Microsoft Developer Studio. In this case, run this to generate perlxsi.c: perl -MExtUtils::Embed -e xsinit Create a new project and Insert -> Files into Project: perlxsi.c, perl.lib, and your own source files, e.g. interp.c. Typically you'll find perl.lib in C:\perl\lib\CORE, if not, you should see the CORE directory relative to `perl -V:archlib'. The studio will also need this path so it knows where to find Perl include files. This path can be added via the Tools -> Options -> Directories menu. Finally, select Build -> Build interp.exe and you're ready to go. MORAL You can sometimes *write faster code* in C, but you can always *write code faster* in Perl. Because you can use each from the other, combine them as you wish. AUTHOR Jon Orwant and Doug MacEachern , with small contributions from Tim Bunce, Tom Christiansen, Guy Decoux, Hallvard Furuseth, Dov Grobgeld, and Ilya Zakharevich. 
Doug MacEachern has an article on embedding in Volume 1, Issue 4 of The Perl Journal (http://tpj.com). Doug is also the developer of the most widely-used Perl embedding: the mod_perl system (perl.apache.org), which embeds Perl in the Apache web server. Oracle, Binary Evolution, ActiveState, and Ben Sugars's nsapi_perl have used this model for Oracle, Netscape and Internet Information Server Perl plugins. July 22, 1998 COPYRIGHT Copyright (C) 1995, 1996, 1997, 1998 Doug MacEachern and Jon Orwant. All Rights Reserved. Permission is granted to make and distribute verbatim copies of this documentation provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this documentation under the conditions for verbatim copying, provided also that they are marked clearly as modified versions, that the authors' names and title are unchanged (though subtitles and additional authors' names may be added), and that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this documentation into another language, under the above conditions for modified versions. perlfaq section NAME perlfaq - frequently asked questions about Perl ($Date: 1999/01/08 05:54:52 $) DESCRIPTION This document is structured into the following sections: perlfaq: Structural overview of the FAQ. This document. the perlfaq1 manpage: General Questions About Perl Very general, high-level information about Perl. * What is Perl? * Who supports Perl? Who develops it? Why is it free? * Which version of Perl should I use? * What are perl4 and perl5? * What is perl6? * How stable is Perl? * Is Perl difficult to learn? * How does Perl compare with other languages like Java, Python, REXX, Scheme, or Tcl? * Can I do [task] in Perl? * When shouldn't I program in Perl? * What's the difference between "perl" and "Perl"? 
* Is it a Perl program or a Perl script? * What is a JAPH? * Where can I get a list of Larry Wall witticisms? * How can I convince my sysadmin/supervisor/employees to use version (5/5.005/Perl instead of some other language)? the perlfaq2 manpage: Obtaining and Learning about Perl Where to find source and documentation to Perl, support, and related matters. * What machines support Perl? Where do I get it? * How can I get a binary version of Perl? * I don't have a C compiler on my system. How can I compile perl? * I copied the Perl binary from one machine to another, but scripts don't work. * I grabbed the sources and tried to compile but gdbm/dynamic loading/malloc/linking/... failed. How do I make it work? * What modules and extensions are available for Perl? What is CPAN? What does CPAN/src/... mean? * Is there an ISO or ANSI certified version of Perl? * Where can I get information on Perl? * What are the Perl newsgroups on USENET? Where do I post questions? * Where should I post source code? * Perl Books * Perl in Magazines * Perl on the Net: FTP and WWW Access * What mailing lists are there for perl? * Archives of comp.lang.perl.misc * Where can I buy a commercial version of Perl? * Where do I send bug reports? * What is perl.com? the perlfaq3 manpage: Programming Tools Programmer tools and programming support. * How do I do (anything)? * How can I use Perl interactively? * Is there a Perl shell? * How do I debug my Perl programs? * How do I profile my Perl programs? * How do I cross-reference my Perl programs? * Is there a pretty-printer (formatter) for Perl? * Is there a ctags for Perl? * Is there an IDE or Windows Perl Editor? * Where can I get Perl macros for vi? * Where can I get perl-mode for emacs? * How can I use curses with Perl? * How can I use X or Tk with Perl? * How can I generate simple menus without using CGI or Tk? * What is undump? * How can I make my Perl program run faster? * How can I make my Perl program take less memory? 
* Is it unsafe to return a pointer to local data? * How can I free an array or hash so my program shrinks? * How can I make my CGI script more efficient? * How can I hide the source for my Perl program? * How can I compile my Perl program into byte code or C? * How can I compile Perl into Java? * How can I get `#!perl' to work on [MS-DOS,NT,...]? * Can I write useful perl programs on the command line? * Why don't perl one-liners work on my DOS/Mac/VMS system? * Where can I learn about CGI or Web programming in Perl? * Where can I learn about object-oriented Perl programming? * Where can I learn about linking C with Perl? [h2xs, xsubpp] * I've read perlembed, perlguts, etc., but I can't embed perl in my C program, what am I doing wrong? * When I tried to run my script, I got this message. What does it mean? * What's MakeMaker? the perlfaq4 manpage: Data Manipulation Manipulating numbers, dates, strings, arrays, hashes, and miscellaneous data issues. * Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)? * Why isn't my octal data interpreted correctly? * Does Perl have a round() function? What about ceil() and floor()? Trig functions? * How do I convert bits into ints? * Why doesn't & work the way I want it to? * How do I multiply matrices? * How do I perform an operation on a series of integers? * How can I output Roman numerals? * Why aren't my random numbers random? * How do I find the week-of-the-year/day-of-the-year? * How can I compare two dates and find the difference? * How can I take a string and turn it into epoch seconds? * How can I find the Julian Day? * How do I find yesterday's date? * Does Perl have a year 2000 problem? Is Perl Y2K compliant? * How do I validate input? * How do I unescape a string? * How do I remove consecutive pairs of characters? * How do I expand function calls in a string? * How do I find matching/nesting anything? * How do I reverse a string? 
* How do I expand tabs in a string? * How do I reformat a paragraph? * How can I access/change the first N letters of a string? * How do I change the Nth occurrence of something? * How can I count the number of occurrences of a substring within a string? * How do I capitalize all the words on one line? * How can I split a [character] delimited string except when inside [character]? (Comma-separated files) * How do I strip blank space from the beginning/end of a string? * How do I pad a string with blanks or pad a number with zeroes? * How do I extract selected columns from a string? * How do I find the soundex value of a string? * How can I expand variables in text strings? * What's wrong with always quoting "$vars"? * Why don't my <? * Is there a leak/bug in glob()? * How can I open a file with a leading ">" or trailing blanks? * How can I reliably rename a file? * How can I lock a file? * Why can't I just open(FH, ">file.lock")? * I still don't get locking. I just want to increment the number in the file. How can I do this? * How do I randomly update a binary file? * How do I get a file's timestamp in perl? * How do I set a file's timestamp in perl? * How do I print to more than one file at once? * How can I read in a file by paragraphs? * How can I read a single character from a file? From the keyboard? * How can I tell whether there's a character waiting on a filehandle? * How do I do a `tail -f' in perl? * How do I dup() a filehandle in Perl? * How do I close a file descriptor by number? * Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work? * Why doesn't glob("*.*") get all the files? * Why does Perl let me delete read-only files? Why does `-i' clobber protected files? Isn't this a bug in Perl? * How do I select a random line from a file? * Why do I get weird spaces when I print an array of lines? the perlfaq6 manpage: Regexps Pattern matching and regular expressions.
* How can I hope to use regular expressions without creating illegible and unmaintainable code? * I'm having trouble matching over more than one line. What's wrong? * How can I pull out lines between two patterns that are themselves on different lines? * I put a regular expression into $/ but it didn't work. What's wrong? * How do I substitute case insensitively on the LHS, but preserving case on the RHS? * How can I make `\w' match national character sets? * How can I match a locale-smart version of `/[a-zA-Z]/'? * How can I quote a variable to use in a regexp? * What is `/o' really for? * How do I use a regular expression to strip C style comments from a file? * Can I use Perl regular expressions to match balanced text? * What does it mean that regexps are greedy? How can I get around it? * How do I process each word on each line? * How can I print out a word-frequency or line-frequency summary? * How can I do approximate matching? * How do I efficiently match many regular expressions at once? * Why don't word-boundary searches with `\b' work for me? * Why does using $&, $`, or $' slow my program down? * What good is `\G' in a regular expression? * Are Perl regexps DFAs or NFAs? Are they POSIX compliant? * What's wrong with using grep or map in a void context? * How can I match strings with multibyte characters? * How do I match a pattern that is supplied by the user? the perlfaq7 manpage: General Perl Language Issues General Perl language issues that don't clearly fit into any of the other sections. * Can I get a BNF/yacc/RE for the Perl language? * What are all these $@%* punctuation signs, and how do I know when to use them? * Do I always/never have to quote my strings or use semicolons and commas? * How do I skip some return values? * How do I temporarily block warnings? * What's an extension? * Why do Perl operators have different precedence than C operators? * How do I declare/create a structure? * How do I create a module? * How do I create a class? 
* How can I tell if a variable is tainted? * What's a closure? * What is variable suicide and how can I prevent it? * How can I pass/return a {Function, FileHandle, Array, Hash, Method, Regexp}? * How do I create a static variable? * What's the difference between dynamic and lexical (static) scoping? Between local() and my()? * How can I access a dynamic variable while a similarly named lexical is in scope? * What's the difference between deep and shallow binding? * Why doesn't "my($foo) = ;" work right? * How do I redefine a builtin function, operator, or method? * What's the difference between calling a function as &foo and foo()? * How do I create a switch or case statement? * How can I catch accesses to undefined variables/functions/methods? * Why can't a method included in this same file be found? * How can I find out my current package? * How can I comment out a large block of perl code? * How do I clear a package? the perlfaq8 manpage: System Interaction Interprocess communication (IPC), control over the user-interface (keyboard, screen and pointing devices). * How do I find out which operating system I'm running under? * How come exec() doesn't return? * How do I do fancy stuff with the keyboard/screen/mouse? * How do I print something out in color? * How do I read just one key without waiting for a return key? * How do I check whether input is ready on the keyboard? * How do I clear the screen? * How do I get the screen size? * How do I ask the user for a password? * How do I read and write the serial port? * How do I decode encrypted password files? * How do I start a process in the background? * How do I trap control characters/signals? * How do I modify the shadow password file on a Unix system? * How do I set the time and date? * How can I sleep() or alarm() for under a second? * How can I measure time under a second? * How can I do an atexit() or setjmp()/longjmp()? (Exception handling) * Why doesn't my sockets program work under System V (Solaris)? 
What does the error message "Protocol not supported" mean? * How can I call my system's unique C functions from Perl? * Where do I get the include files to do ioctl() or syscall()? * Why do setuid perl scripts complain about kernel problems? * How can I open a pipe both to and from a command? * Why can't I get the output of a command with system()? * How can I capture STDERR from an external command? * Why doesn't open() return an error when a pipe open fails? * What's wrong with using backticks in a void context? * How can I call backticks without shell processing? * Why can't my script read from STDIN after I gave it EOF (^D on Unix, ^Z on MS-DOS)? * How can I convert my shell script to perl? * Can I use perl to run a telnet or ftp session? * How can I write expect in Perl? * Is there a way to hide perl's command line from programs such as "ps"? * I {changed directory, modified my environment} in a perl script. How come the change disappeared when I exited the script? How do I get my changes to be visible? * How do I close a process's filehandle without waiting for it to complete? * How do I fork a daemon process? * How do I make my program run with sh and csh? * How do I find out if I'm running interactively or not? * How do I timeout a slow event? * How do I set CPU limits? * How do I avoid zombies on a Unix system? * How do I use an SQL database? * How do I make a system() exit on control-C? * How do I open a file without blocking? * How do I install a CPAN module? * What's the difference between require and use? * How do I keep my own module/library directory? * How do I add the directory my program lives in to the module/library search path? * How do I add a directory to my include path at runtime? * What is socket.ph and where do I get it? the perlfaq9 manpage: Networking Networking, the Internet, and a few on the web. * My CGI script runs from the command line but not the browser. (500 Server Error) * How can I get better error messages from a CGI program? 
* How do I remove HTML from a string? * How do I extract URLs? * How do I download a file from the user's machine? How do I open a file on another machine? * How do I make a pop-up menu in HTML? * How do I fetch an HTML file? * How do I automate an HTML form submission? * How do I decode or create those %-encodings on the web? * How do I redirect to another page? * How do I put a password on my web pages? * How do I edit my .htpasswd and .htgroup files with Perl? * How do I make sure users can't enter values into a form that cause my CGI script to do bad things? * How do I parse a mail header? * How do I decode a CGI form? * How do I check a valid mail address? * How do I decode a MIME/BASE64 string? * How do I return the user's mail address? * How do I send mail? * How do I read mail? * How do I find out my hostname/domainname/IP address? * How do I fetch a news article or the active newsgroups? * How do I fetch/put an FTP file? * How can I do RPC in Perl? Where to get this document This document is posted regularly to comp.lang.perl.announce and several other related newsgroups. It is available in a variety of formats from CPAN in the /CPAN/doc/FAQs/FAQ/ directory, or on the web at http://www.perl.com/perl/faq/ . How to contribute to this document You may mail corrections, additions, and suggestions to perlfaq-suggestions@perl.com . This alias should not be used to *ask* FAQs. It's for fixing the current FAQ. Send questions to the comp.lang.perl.misc newsgroup. What will happen if you mail your Perl programming problems to the authors Your questions will probably go unread, unless they're suggestions of new questions to add to the FAQ, in which case they should have gone to the perlfaq-suggestions@perl.com instead. You should have read section 2 of this faq. There you would have learned that comp.lang.perl.misc is the appropriate place to go for free advice.
If your question is really important and you require a prompt and correct answer, you should hire a consultant. Credits When I first began the Perl FAQ in the late 80s, I never realized it would have grown to over a hundred pages, nor that Perl would ever become so popular and widespread. This document could not have been written without the tremendous help provided by Larry Wall and the rest of the Perl Porters. Author and Copyright Information Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington. All rights reserved. Bundled Distributions When included as part of the Standard Version of Perl, or as part of its complete documentation whether printed or otherwise, this work may be distributed only under the terms of Perl's Artistic License. Any distribution of this file or derivatives thereof *outside* of that package require that special arrangements be made with copyright holder. Irrespective of its distribution, all code examples in these files are hereby placed into the public domain. You are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit would be courteous but is not required. Disclaimer This information is offered in good faith and in the hope that it may be of use, but is not guaranteed to be correct, up to date, or suitable for any particular purpose whatsoever. The authors accept no liability in respect of this information or its use. Changes 7/January/99 Small touchups here and there. Added all questions in this document as a sort of table of contents. 22/June/98 Significant changes throughout in preparation for the 5.005 release. 24/April/97 Style and whitespace changes from Chip, new question on reading one character at a time from a terminal using POSIX from Tom. 23/April/97 Added http://www.oasis.leo.org/perl/ to the perlfaq2 manpage. Style fix to the perlfaq3 manpage. 
Added floating point precision, fixed complex number arithmetic, cross-references, caveat for Text::Wrap, alternative answer for initial capitalizing, fixed incorrect regexp, added example of Tie::IxHash to the perlfaq4 manpage. Added example of passing and storing filehandles, added commify to the perlfaq5 manpage. Restored variable suicide, and added mass commenting to the perlfaq7 manpage. Added Net::Telnet, fixed backticks, added reader/writer pair to telnet question, added FindBin, grouped module questions together in the perlfaq8 manpage. Expanded caveats for the simple URL extractor, gave LWP example, added CGI security question, expanded on the mail address answer in the perlfaq9 manpage. 25/March/97 Added more info to the binary distribution section of the perlfaq2 manpage. Added Net::Telnet to the perlfaq6 manpage. Fixed typos in the perlfaq8 manpage. Added mail sending example to the perlfaq9 manpage. Added Merlyn's columns to the perlfaq2 manpage. 18/March/97 Added the DATE to the NAME section, indicating which sections have changed. Mentioned SIGPIPE and the perlipc manpage in the forking open answer in the perlfaq8 manpage. Fixed description of a regular expression in the perlfaq4 manpage. 17/March/97 Version Various typos fixed throughout. Added new question on Perl BNF on the perlfaq7 manpage. Initial Release: 11/March/97 This is the initial release of version 3 of the FAQ; consequently there have been no changes since its initial release. perlfaq1 section NAME perlfaq1 - General Questions About Perl ($Revision: 1.20 $, $Date: 1999/01/08 04:22:09 $) DESCRIPTION This section of the FAQ answers very general, high-level questions about Perl. What is Perl? Perl is a high-level programming language with an eclectic heritage written by Larry Wall and a cast of thousands. It derives from the ubiquitous C programming language and to a lesser extent from sed, awk, the Unix shell, and at least a dozen other tools and languages. 
Perl's process, file, and text manipulation facilities make it particularly well-suited for tasks involving quick prototyping, system utilities, software tools, system management tasks, database access, graphical programming, networking, and world wide web programming. These strengths make it especially popular with system administrators and CGI script authors, but mathematicians, geneticists, journalists, and even managers also use Perl. Maybe you should, too. Who supports Perl? Who develops it? Why is it free? The original culture of the pre-populist Internet and the deeply-held beliefs of Perl's author, Larry Wall, gave rise to the free and open distribution policy of perl. Perl is supported by its users. The core, the standard Perl library, the optional modules, and the documentation you're reading now were all written by volunteers. See the personal note at the end of the README file in the perl source distribution for more details. See the perlhist manpage (new as of 5.005) for Perl's milestone releases. In particular, the core development team (known as the Perl Porters) are a rag-tag band of highly altruistic individuals committed to producing better software for free than you could hope to purchase for money. You may snoop on pending developments via nntp://news.perl.com/perl.porters-gw/ and the Deja News archive at http://www.dejanews.com/ using the perl.porters-gw newsgroup, or you can subscribe to the mailing list by sending perl5-porters-request@perl.org a subscription request. While the GNU project includes Perl in its distributions, there's no such thing as "GNU Perl". Perl is not produced nor maintained by the Free Software Foundation. Perl's licensing terms are also more open than GNU software's tend to be. You can get commercial support of Perl if you wish, although for most users the informal support will more than suffice. See the answer to "Where can I buy a commercial version of perl?" for more information. Which version of Perl should I use?
You should definitely use version 5. Version 4 is old, limited, and no longer maintained; its last patch (4.036) was in 1992, long ago and far away. Sure, it's stable, but so is anything that's dead; in fact, perl4 had been called a dead, flea-bitten camel carcass. The most recent production release is 5.005_02 (although 5.004_04 is still supported). The most cutting-edge development release is 5.005_54. Further references to the Perl language in this document refer to the production release unless otherwise specified. There may be one or more official bug fixes for 5.005_02 by the time you read this, and also perhaps some experimental versions on the way to the next release. All releases prior to 5.004 were subject to buffer overruns, a grave security issue. What are perl4 and perl5? Perl4 and perl5 are informal names for different versions of the Perl programming language. It's easier to say "perl5" than it is to say "the 5(.004) release of Perl", but some people have interpreted this to mean there's a language called "perl5", which isn't the case. Perl5 is merely the popular name for the fifth major release (October 1994), while perl4 was the fourth major release (March 1991). There was also a perl1 (in January 1988), a perl2 (June 1988), and a perl3 (October 1989). The 5.0 release is, essentially, a ground-up rewrite of the original perl source code from releases 1 through 4. It has been modularized, object-oriented, tweaked, trimmed, and optimized until it almost doesn't look like the old code. However, the interface is mostly the same, and compatibility with previous releases is very high. See the section on "Perl4 to Perl5 Traps" in the perltrap manpage. To avoid the "what language is perl5?" confusion, some people prefer to simply use "perl" to refer to the latest version of perl and avoid using "perl5" altogether. It's not really that big a deal, though. See the perlhist manpage for a history of Perl revisions. What is perl6? 
Perl6 is a semi-jocular reference to the Topaz project. Headed by Chip Salzenberg, Topaz is yet-another ground-up rewrite of the current release of Perl, one whose major goal is to create a more maintainable core than found in release 5. Written in nominally portable C++, Topaz hopes to maintain 100% source-compatibility with previous releases of Perl but to run significantly faster and smaller. The Topaz team hopes to provide an XS compatibility interface to allow most XS modules to work unchanged, albeit perhaps without the efficiency that the new interface would allow. New features in Topaz are as yet undetermined, and will be addressed once compatibility and performance goals are met. If you are a hard-working C++ wizard with a firm command of Perl's internals, and you would like to work on the project, send a request to perl6-porters-request@perl.org to subscribe to the Topaz mailing list. There is no ETA for Topaz. It is expected to be several years before it achieves enough robustness, compatibility, portability, and performance to replace perl5 for ordinary use by mere mortals. How stable is Perl? Production releases, which incorporate bug fixes and new functionality, are widely tested before release. Since the 5.000 release, we have averaged only about one production release per year. Larry and the Perl development team occasionally make changes to the internal core of the language, but all possible efforts are made toward backward compatibility. While not quite all perl4 scripts run flawlessly under perl5, an update to perl should nearly never invalidate a program written for an earlier version of perl (barring accidental bug fixes and the rare new keyword). Is Perl difficult to learn? No, Perl is easy to start learning -- and easy to keep learning. It looks like most programming languages you're likely to have experience with, so if you've ever written a C program, an awk script, a shell script, or even a BASIC program, you're already part way there.
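As a rough sketch of that familiarity (the sample records and variable names below are made up for illustration and are not taken from any real program), this short program splits fields much as awk would and formats its output much as C's printf would:

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical sample data, invented purely for illustration.
my @lines = (
    "alice 3 apples",
    "bob 5 pears",
    "alice 2 plums",
);

my %total;    # hash mapping each name to its running sum
foreach my $line (@lines) {
    # awk-style field splitting on whitespace; the third field is ignored
    my ($name, $count) = split ' ', $line;
    $total{$name} += $count;
}

# C-style formatted output, one line per name in sorted order
my $report = '';
foreach my $name (sort keys %total) {
    $report .= sprintf("%-6s %2d\n", $name, $total{$name});
}
print $report;
```

Saved to a file, it can be run directly with perl, with no separate compilation step.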
Most tasks only require a small subset of the Perl language. One of the guiding mottos for Perl development is "there's more than one way to do it" (TMTOWTDI, sometimes pronounced "tim toady"). Perl's learning curve is therefore shallow (easy to learn) and long (there's a whole lot you can do if you really want). Finally, because Perl is frequently (but not always, and certainly not by definition) an interpreted language, you can write your programs and test them without an intermediate compilation step, allowing you to experiment and test/debug quickly and easily. This ease of experimentation flattens the learning curve even more. Things that make Perl easier to learn: Unix experience, almost any kind of programming experience, an understanding of regular expressions, and the ability to understand other people's code. If there's something you need to do, then it's probably already been done, and a working example is usually available for free. Don't forget the new perl modules, either. They're discussed in Part 3 of this FAQ, along with CPAN, which is discussed in Part 2. How does Perl compare with other languages like Java, Python, REXX, Scheme, or Tcl? Favorably in some areas, unfavorably in others. Precisely which areas are good and bad is often a personal choice, so asking this question on Usenet runs a strong risk of starting an unproductive Holy War. Probably the best thing to do is try to write equivalent code to do a set of tasks. These languages have their own newsgroups in which you can learn about (but hopefully not argue about) them. Some comparison documents can be found at http://language.perl.com/versus/ if you really can't stop yourself. Can I do [task] in Perl? Perl is flexible and extensible enough for you to use on virtually any task, from one-line file-processing tasks to large, elaborate systems. For many people, Perl serves as a great replacement for shell scripting. 
For others, it serves as a convenient, high-level replacement for most of what they'd program in low-level languages like C or C++. It's ultimately up to you (and possibly your management) which tasks you'll use Perl for and which you won't. If you have a library that provides an API, you can make any component of it available as just another Perl function or variable using a Perl extension written in C or C++ and dynamically linked into your main perl interpreter. You can also go the other direction, and write your main program in C or C++, and then link in some Perl code on the fly, to create a powerful application. See the perlembed manpage. That said, there will always be small, focused, special-purpose languages dedicated to a specific problem domain that are simply more convenient for certain kinds of problems. Perl tries to be all things to all people, but nothing special to anyone. Examples of specialized languages that come to mind include prolog and matlab. When shouldn't I program in Perl? When your manager forbids it -- but do consider replacing them :-). Actually, one good reason is when you already have an existing application written in another language that's all done (and done well), or you have an application language specifically designed for a certain task (e.g. prolog, make). For various reasons, Perl is probably not well-suited for real-time embedded systems, low-level operating systems development work like device drivers or context-switching code, complex multi-threaded shared-memory applications, or extremely large applications. You'll notice that perl is not itself written in Perl. The new, native-code compiler for Perl may eventually reduce the limitations given in the previous statement to some degree, but understand that Perl remains fundamentally a dynamically typed language, not a statically typed one. You certainly won't be chastised if you don't trust nuclear-plant or brain-surgery monitoring code to it.
And Larry will sleep easier, too -- Wall Street programs notwithstanding. :-) What's the difference between "perl" and "Perl"? One bit. Oh, you weren't talking ASCII? :-) Larry now uses "Perl" to signify the language proper and "perl" the implementation of it, i.e. the current interpreter. Hence Tom's quip that "Nothing but perl can parse Perl." You may or may not choose to follow this usage. For example, parallelism means "awk and perl" and "Python and Perl" look ok, while "awk and Perl" and "Python and perl" do not. But never write "PERL", because perl isn't really an acronym, apocryphal folklore and post-facto expansions notwithstanding. Is it a Perl program or a Perl script? Larry doesn't really care. He says (half in jest) that "a script is what you give the actors. A program is what you give the audience." Originally, a script was a canned sequence of normally interactive commands, that is, a chat script. Something like a uucp or ppp chat script or an expect script fits the bill nicely, as do configuration scripts run by a program at its start up, such as .cshrc or .ircrc, for example. Chat scripts were just drivers for existing programs, not stand-alone programs in their own right. A computer scientist will correctly explain that all programs are interpreted, and that the only question is at what level. But if you ask this question of someone who isn't a computer scientist, they might tell you that a *program* has been compiled to physical machine code once, and can then be run multiple times, whereas a *script* must be translated by a program each time it's used. Perl programs are (usually) neither strictly compiled nor strictly interpreted. They can be compiled to a byte-code form (something of a Perl virtual machine) or to completely different languages, like C or assembly language.
You can't tell just by looking at it whether the source is destined for a pure interpreter, a parse-tree interpreter, a byte-code interpreter, or a native-code compiler, so it's hard to give a definitive answer here. Now that "script" and "scripting" are terms that have been seized by unscrupulous or unknowing marketeers for their own nefarious purposes, they have begun to take on strange and often pejorative meanings, like "non serious" or "not real programming". Consequently, some perl programmers prefer to avoid them altogether. What is a JAPH? These are the "just another perl hacker" signatures that some people sign their postings with. Randal Schwartz made these famous. About 100 of the earlier ones are available from http://www.perl.com/CPAN/misc/japh . Where can I get a list of Larry Wall witticisms? Over a hundred quips by Larry, from postings of his or source code, can be found at http://www.perl.com/CPAN/misc/lwall-quotes.txt.gz . Newer examples can be found by perusing Larry's postings: http://x1.dejanews.com/dnquery.xp?QRY=*&DBS=2&ST=PS&defaultOp=AND&LNG=ALL&format=terse&showsort=date&maxhits=100&subjects=&groups=&authors=larry@*wall.org&fromdate=&todate= How can I convince my sysadmin/supervisor/employees to use version (5/5.005/Perl instead of some other language)? If your manager or employees are wary of unsupported software, or software which doesn't officially ship with your Operating System, you might try to appeal to their self-interest. If programmers can be more productive using and utilizing Perl constructs, functionality, simplicity, and power, then the typical manager/supervisor/employee may be persuaded. Regarding using Perl in general, it's also sometimes helpful to point out that delivery times may be reduced using Perl, as compared to other languages. If you have a project which has a bottleneck, especially in terms of translation or testing, Perl almost certainly will provide a viable, and quick solution. 
In conjunction with any persuasion effort, you should not fail to point out that Perl is used, quite extensively, and with extremely reliable and valuable results, at many large computer software and/or hardware companies throughout the world. In fact, many Unix vendors now ship Perl by default, and support is usually just a news-posting away, if you can't find the answer in the *comprehensive* documentation, including this FAQ. See http://www.perl.org/advocacy/ for more information. If you face reluctance to upgrade from an older version of perl, then point out that version 4 is utterly unmaintained and unsupported by the Perl Development Team. Another big sell for Perl5 is the large number of modules and extensions which greatly reduce development time for any given task. Also mention that the difference between version 4 and version 5 of Perl is like the difference between awk and C++. (Well, ok, maybe not quite that distinct, but you get the idea.) If you want support and a reasonable guarantee that what you're developing will continue to work in the future, then you have to run the supported version. That probably means running the 5.005 release, although 5.004 isn't that bad. Several important bugs were fixed from the 5.000 through 5.003 versions, though, so try upgrading past them if possible. Of particular note is the massive bughunt for buffer overflow problems that went into the 5.004 release. All releases prior to that, including perl4, are considered insecure and should be upgraded as soon as possible. AUTHOR AND COPYRIGHT Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington. All rights reserved. When included as an integrated part of the Standard Distribution of Perl or of its documentation (printed or otherwise), this work is covered under Perl's Artistic Licence. For separate distributions of all or part of this FAQ outside of that, see the perlfaq manpage. Irrespective of its distribution, all code examples here are public domain.
You are permitted and encouraged to use this code and any derivatives thereof in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit to the FAQ would be courteous but is not required. perlfaq2 section NAME perlfaq2 - Obtaining and Learning about Perl ($Revision: 1.30 $, $Date: 1998/12/29 19:43:32 $) DESCRIPTION This section of the FAQ answers questions about where to find source and documentation for Perl, support, and related matters. What machines support Perl? Where do I get it? The standard release of Perl (the one maintained by the perl development team) is distributed only in source code form. You can find this at http://www.perl.com/CPAN/src/latest.tar.gz , which is in standard Internet format (a gzipped archive in POSIX tar format). Perl builds and runs on a bewildering number of platforms. Virtually all known and current Unix derivatives are supported (Perl's native platform), as are proprietary systems like VMS, DOS, OS/2, Windows, QNX, BeOS, and the Amiga. There are also the beginnings of support for MPE/iX. Binary distributions for some proprietary platforms, including Apple systems, can be found in the http://www.perl.com/CPAN/ports/ directory. Because these are not part of the standard distribution, they may and in fact do differ from the base Perl port in a variety of ways. You'll have to check their respective release notes to see just what the differences are. These differences can be either positive (e.g. extensions for the features of the particular platform that are not supported in the source release of perl) or negative (e.g. might be based upon a less current source release of perl). How can I get a binary version of Perl? If you don't have a C compiler because your vendor for whatever reasons did not include one with your system, the best thing to do is grab a binary version of gcc from the net and use that to compile perl with.
CPAN only has binaries for systems that are terribly hard to get free compilers for, not for Unix systems.

If you want information on proprietary systems, some URLs that might help you are:

    http://language.perl.com/info/software.html
    http://www.perl.com/latest/
    http://www.perl.com/CPAN/ports/

A simple installation guide for MS-DOS is available at http://www.cs.ruu.nl/~piet/perl5dos.html and similarly for Windows 3.1 at http://www.cs.ruu.nl/~piet/perlwin3.html .

I don't have a C compiler on my system. How can I compile perl?

Since you don't have a C compiler, you're doomed and your vendor should be sacrificed to the Sun gods. But that doesn't help you.

What you need to do is get a binary version of gcc for your system first. Consult the Usenet FAQs for your operating system for information on where to get such a binary version.

I copied the Perl binary from one machine to another, but scripts don't work.

That's probably because you forgot libraries, or library paths differ. You really should build the whole distribution on the machine it will eventually live on, and then type `make install'. Most other approaches are doomed to failure.

One simple way to check that things are in the right place is to print out the hard-coded @INC which perl is looking for.

    % perl -e 'print join("\n",@INC)'

If this command lists any paths which don't exist on your system, then you may need to move the appropriate libraries to these locations, or create symlinks, aliases, or shortcuts appropriately. @INC is also printed as part of the output of

    % perl -V

You might also want to check out the section on "How do I keep my own module/library directory?" in the perlfaq8 manpage.

I grabbed the sources and tried to compile but gdbm/dynamic loading/malloc/linking/... failed. How do I make it work?

Read the INSTALL file, which is part of the source distribution.
It describes in detail how to cope with most idiosyncrasies that the Configure script can't work around for any given system or architecture.

What modules and extensions are available for Perl? What is CPAN? What does CPAN/src/... mean?

CPAN stands for Comprehensive Perl Archive Network, a huge archive replicated on dozens of machines all over the world. CPAN contains source code, non-native ports, documentation, scripts, and many third-party modules and extensions, designed for everything from commercial database interfaces to keyboard/screen control to web walking and CGI scripts. The master machine for CPAN is ftp://ftp.funet.fi/pub/languages/perl/CPAN/, but you can use the address http://www.perl.com/CPAN/CPAN.html to fetch a copy from a "site near you". See http://www.perl.com/CPAN (without a slash at the end) for how this process works.

CPAN/path/... is a naming convention for files available on CPAN sites. CPAN indicates the base directory of a CPAN mirror, and the rest of the path is the path from that directory to the file. For instance, if you're using ftp://ftp.funet.fi/pub/languages/perl/CPAN as your CPAN site, the file CPAN/misc/japh is downloadable as ftp://ftp.funet.fi/pub/languages/perl/CPAN/misc/japh .

Considering that there are hundreds of existing modules in the archive, one probably exists to do nearly anything you can think of. Current categories under CPAN/modules/by-category/ include perl core modules; development support; operating system interfaces; networking, devices, and interprocess communication; data type utilities; database interfaces; user interfaces; interfaces to other languages; filenames, file systems, and file locking; internationalization and locale; world wide web support; server and daemon utilities; archiving and compression; image manipulation; mail and news; control flow utilities; filehandle and I/O; Microsoft Windows modules; and miscellaneous modules.

Is there an ISO or ANSI certified version of Perl?
Certainly not. Larry expects that he'll be certified before Perl is.

Where can I get information on Perl?

The complete Perl documentation is available with the perl distribution. If you have perl installed locally, you probably have the documentation installed as well: type `man perl' if you're on a system resembling Unix. This will lead you to other important man pages, including how to set your $MANPATH. If you're not on a Unix system, access to the documentation will be different; for example, it might be only in HTML format. But all proper perl installations have fully-accessible documentation.

You might also try `perldoc perl' in case your system doesn't have a proper man command, or it's been misinstalled. If that doesn't work, try looking in /usr/local/lib/perl5/pod for documentation.

If all else fails, consult the CPAN/doc directory, which contains the complete documentation in various formats, including native pod, troff, html, and plain text. There's also a web page at http://www.perl.com/perl/info/documentation.html that might help.

Many good books have been written about Perl -- see the section below for more details.

Tutorial documents included in current or upcoming Perl releases include the perltoot manpage for objects, the perlopentut manpage for file opening semantics, the perlreftut manpage for managing references, and the perlxstut manpage for linking C and Perl together. There may be more by the time you read this. The following URLs might also be of assistance:

    http://language.perl.com/info/documentation.html
    http://reference.perl.com/query.cgi?tutorials

What are the Perl newsgroups on USENET? Where do I post questions?
The now defunct comp.lang.perl newsgroup has been superseded by the following groups:

    comp.lang.perl.announce              Moderated announcement group
    comp.lang.perl.misc                  Very busy group about Perl in general
    comp.lang.perl.moderated             Moderated discussion group
    comp.lang.perl.modules               Use and development of Perl modules
    comp.lang.perl.tk                    Using Tk (and X) from Perl
    comp.infosystems.www.authoring.cgi   Writing CGI scripts for the Web.

There is also a USENET gateway to the mailing list used by the crack Perl development team (perl5-porters) at news://news.perl.com/perl.porters-gw/ .

Where should I post source code?

You should post source code to whichever group is most appropriate, but feel free to cross-post to comp.lang.perl.misc. If you want to cross-post to alt.sources, please make sure it follows their posting standards, including setting the Followup-To header line to NOT include alt.sources; see their FAQ (http://www.faqs.org/faqs/alt-sources-intro/) for details.

If you're just looking for software, first use Alta Vista, Deja News, and search CPAN. This is faster and more productive than just posting a request.

Perl Books

A number of books on Perl and/or CGI programming are available. A few of these are good, some are ok, but many aren't worth your money. Tom Christiansen maintains a list of these books, some with extensive reviews, at http://www.perl.com/perl/critiques/index.html.
The incontestably definitive reference book on Perl, written by the creator of Perl, is now in its second edition: Programming Perl (the "Camel Book"): by Larry Wall, Tom Christiansen, and Randal Schwartz ISBN 1-56592-149-6 (English) ISBN 4-89052-384-7 (Japanese) URL: http://www.oreilly.com/catalog/pperl2/ (French, German, Italian, and Hungarian translations also available) The companion volume to the Camel containing thousands of real-world examples, mini-tutorials, and complete programs (first premiering at the 1998 Perl Conference), is: The Perl Cookbook (the "Ram Book"): by Tom Christiansen and Nathan Torkington, with Foreword by Larry Wall ISBN: 1-56592-243-3 URL: http://perl.oreilly.com/cookbook/ If you're already a hard-core systems programmer, then the Camel Book might suffice for you to learn Perl from. But if you're not, check out: Learning Perl (the "Llama Book"): by Randal Schwartz and Tom Christiansen with Foreword by Larry Wall ISBN: 1-56592-284-0 URL: http://www.oreilly.com/catalog/lperl2/ Despite the picture at the URL above, the second edition of "Llama Book" really has a blue cover, and is updated for the 5.004 release of Perl. Various foreign language editions are available, including *Learning Perl on Win32 Systems* (the Gecko Book). If you're not an accidental programmer, but a more serious and possibly even degreed computer scientist who doesn't need as much hand-holding as we try to provide in the Llama or its defurred cousin the Gecko, please check out the delightful book, *Perl: The Programmer's Companion*, written by Nigel Chapman. You can order O'Reilly books directly from O'Reilly & Associates, 1-800- 998-9938. Local/overseas is 1-707-829-0515. If you can locate an O'Reilly order form, you can also fax to 1-707-829-0104. See http://www.ora.com/ on the Web. What follows is a list of the books that the FAQ authors found personally useful. Your mileage may (but, we hope, probably won't) vary. 
Recommended books on (or mostly on) Perl follow; those marked with a star may be ordered from O'Reilly. References *Programming Perl by Larry Wall, Tom Christiansen, and Randal L. Schwartz *Perl 5 Desktop Reference By Johan Vromans Tutorials *Learning Perl [2nd edition] by Randal L. Schwartz and Tom Christiansen with foreword by Larry Wall *Learning Perl on Win32 Systems by Randal L. Schwartz, Erik Olson, and Tom Christiansen, with foreword by Larry Wall Perl: The Programmer's Companion by Nigel Chapman Cross-Platform Perl by Eric F. Johnson MacPerl: Power and Ease by Vicki Brown and Chris Nandor, foreword by Matthias Neeracher Task-Oriented *The Perl Cookbook by Tom Christiansen and Nathan Torkington with foreword by Larry Wall Perl5 Interactive Course [2nd edition] by Jon Orwant *Advanced Perl Programming by Sriram Srinivasan Effective Perl Programming by Joseph Hall Special Topics *Mastering Regular Expressions by Jeffrey Friedl How to Set up and Maintain a World Wide Web Site [2nd edition] by Lincoln Stein Perl in Magazines The first and only periodical devoted to All Things Perl, *The Perl Journal* contains tutorials, demonstrations, case studies, announcements, contests, and much more. TPJ has columns on web development, databases, Win32 Perl, graphical programming, regular expressions, and networking, and sponsors the Obfuscated Perl Contest. It is published quarterly under the gentle hand of its editor, Jon Orwant. See http://www.tpj.com/ or send mail to subscriptions@tpj.com . Beyond this, magazines that frequently carry high-quality articles on Perl are *Web Techniques* (see http://www.webtechniques.com/), *Performance Computing* (http://www.performance-computing.com/), and Usenix's newsletter/magazine to its members, *login:*, at http://www.usenix.org/. Randal's Web Technique's columns are available on the web at http://www.stonehenge.com/merlyn/WebTechniques/. 
Perl on the Net: FTP and WWW Access

To get the best (and possibly cheapest) performance, pick a site from the list below and use it to grab the complete list of mirror sites. From there you can find the quickest site for you. Remember, the following list is *not* the complete list of CPAN mirrors.

    http://www.perl.com/CPAN-local
    http://www.perl.com/CPAN (redirects to an ftp mirror)
    http://www.perl.org/CPAN
    ftp://ftp.funet.fi/pub/languages/perl/CPAN/
    http://www.cs.ruu.nl/pub/PERL/CPAN/
    ftp://ftp.cs.colorado.edu/pub/perl/CPAN/

What mailing lists are there for perl?

Most of the major modules (tk, CGI, libwww-perl) have their own mailing lists. Consult the documentation that came with the module for subscription information. The Perl Institute attempts to maintain a list of mailing lists at:

    http://www.perl.org/maillist.html

Archives of comp.lang.perl.misc

Have you tried Deja News or Alta Vista? Those are the best archives. Just look up "*perl*" as a newsgroup.

    http://www.dejanews.com/dnquery.xp?QRY=&DBS=2&ST=PS&defaultOp=AND&LNG=ALL&format=terse&showsort=date&maxhits=25&subjects=&groups=*perl*&authors=&fromdate=&todate=

You'll probably want to trim that down a bit, though.

ftp.cis.ufl.edu:/pub/perl/comp.lang.perl.*/monthly has an almost complete collection dating back to 12/89 (missing 08/91 through 12/93). They are kept as one large file for each month.

You'll probably want a more sophisticated query and retrieval mechanism than a file listing, preferably one that allows you to retrieve articles using fast-access indices, keyed on at least author, date, subject, thread (as in "trn") and probably keywords. The best solution the FAQ authors know of is the MH pick command, but it is very slow to select on 18000 articles.

If you have, or know where can be found, the missing sections, please let perlfaq-suggestions@perl.com know.

Where can I buy a commercial version of Perl?
In a real sense, Perl already *is* commercial software: It has a licence that you can grab and carefully read to your manager. It is distributed in releases and comes in well-defined packages. There is a very large user community and an extensive literature. The comp.lang.perl.* newsgroups and several of the mailing lists provide free answers to your questions in near real-time. Perl has traditionally been supported by Larry, scores of software designers and developers, and myriads of programmers, all working for free to create a useful thing to make life better for everyone. However, these answers may not suffice for managers who require a purchase order from a company whom they can sue should anything go awry. Or maybe they need very serious hand-holding and contractual obligations. Shrink-wrapped CDs with perl on them are available from several sources if that will help. For example, many perl books carry a perl distribution on them, as do the O'Reilly Perl Resource Kits (in both the Unix flavor and in the proprietary Microsoft flavor); the free Unix distributions also all come with Perl. Or you can purchase a real support contract. Although Cygnus historically provided this service, they no longer sell support contracts for Perl. Instead, the Paul Ingram Group will be taking up the slack through The Perl Clinic. The following is a commercial from them: "Do you need professional support for Perl and/or Oraperl? Do you need a support contract with defined levels of service? Do you want to pay only for what you need? "The Paul Ingram Group has provided quality software development and support services to some of the world's largest corporations for ten years. We are now offering the same quality support services for Perl at The Perl Clinic. This service is led by Tim Bunce, an active perl porter since 1994 and well known as the author and maintainer of the DBI, DBD::Oracle, and Oraperl modules and author/co-maintainer of The Perl 5 Module List.
We also offer Oracle users support for Perl5 Oraperl and related modules (which Oracle is planning to ship as part of Oracle Web Server 3). 20% of the profit from our Perl support work will be donated to The Perl Institute." For more information, contact The Perl Clinic: Tel: +44 1483 424424 Fax: +44 1483 419419 Web: http://www.perl.co.uk/ Email: perl-support-info@perl.co.uk or Tim.Bunce@ig.co.uk See also www.perl.com for updates on tutorials, training, and support. Where do I send bug reports? If you are reporting a bug in the perl interpreter or the modules shipped with perl, use the *perlbug* program in the perl distribution or mail your report to perlbug@perl.com . If you are posting a bug with a non-standard port (see the answer to "What platforms is Perl available for?"), a binary distribution, or a non-standard module (such as Tk, CGI, etc), then please see the documentation that came with it to determine the correct place to post bugs. Read the perlbug(1) man page (perl5.004 or later) for more information. What is perl.com? The perl.com domain is owned by Tom Christiansen, who created it as a public service long before perl.org came about. Despite the name, it's a pretty non-commercial site meant to be a clearinghouse for information about all things Perlian, accepting no paid advertisements, bouncy happy gifs, or silly java applets on its pages. The Perl Home Page at http://www.perl.com/ is currently hosted on a T3 line courtesy of Songline Systems, a software-oriented subsidiary of O'Reilly and Associates. Other starting points include http://language.perl.com/ http://conference.perl.com/ http://reference.perl.com/ AUTHOR AND COPYRIGHT Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington. All rights reserved. When included as an integrated part of the Standard Distribution of Perl or of its documentation (printed or otherwise), this work is covered under Perl's Artistic Licence. 
For separate distributions of all or part of this FAQ outside of that, see the perlfaq manpage. Irrespective of its distribution, all code examples here are public domain. You are permitted and encouraged to use this code and any derivatives thereof in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit to the FAQ would be courteous but is not required. perlfaq3 section NAME perlfaq3 - Programming Tools ($Revision: 1.33 $, $Date: 1998/12/29 20:12:12 $) DESCRIPTION This section of the FAQ answers questions related to programmer tools and programming support. How do I do (anything)? Have you looked at CPAN (see the perlfaq2 manpage)? The chances are that someone has already written a module that can solve your problem. Have you read the appropriate man pages? Here's a brief index: Basics perldata, perlvar, perlsyn, perlop, perlsub Execution perlrun, perldebug Functions perlfunc Objects perlref, perlmod, perlobj, perltie Data Structures perlref, perllol, perldsc Modules perlmod, perlmodlib, perlsub Regexps perlre, perlfunc, perlop, perllocale Moving to perl5 perltrap, perl Linking w/C perlxstut, perlxs, perlcall, perlguts, perlembed Various http://www.perl.com/CPAN/doc/FMTEYEWTK/index.html (not a man-page but still useful) the perltoc manpage provides a crude table of contents for the perl man page set. How can I use Perl interactively? The typical approach uses the Perl debugger, described in the perldebug(1) man page, on an ``empty'' program, like this: perl -de 42 Now just type in any legal Perl code, and it will be immediately evaluated. You can also examine the symbol table, get stack backtraces, check variable values, set breakpoints, and other operations typically found in symbolic debuggers. Is there a Perl shell? In general, no. The Shell.pm module (distributed with perl) makes perl try commands which aren't part of the Perl language as shell commands. 
perlsh from the source distribution is simplistic and uninteresting, but may still be what you want.

How do I debug my Perl programs?

Have you used `-w'? It enables warnings for dubious practices.

Have you tried `use strict'? It prevents you from using symbolic references, makes you predeclare any subroutines that you call as bare words, and (probably most importantly) forces you to predeclare your variables with `my' or `use vars'.

Did you check the returns of each and every system call? The operating system (and thus Perl) tells you whether they worked or not, and if not why.

    open(FH, "> /etc/cantwrite")
        or die "Couldn't write to /etc/cantwrite: $!\n";

Did you read the perltrap manpage? It's full of gotchas for old and new Perl programmers, and even has sections for those of you who are upgrading from languages like *awk* and *C*.

Have you tried the Perl debugger, described in the perldebug manpage? You can step through your program and see what it's doing and thus work out why what it's doing isn't what it should be doing.

How do I profile my Perl programs?

You should get the Devel::DProf module from CPAN, and also use Benchmark.pm from the standard distribution. Benchmark lets you time specific portions of your code, while Devel::DProf gives detailed breakdowns of where your code spends its time.

Here's a sample use of Benchmark:

    use Benchmark;

    @junk = `cat /etc/motd`;
    $count = 10_000;

    timethese($count, {
        'map' => sub { my @a = @junk;
                       map { s/a/b/ } @a;
                       return @a
                     },
        'for' => sub { my @a = @junk;
                       local $_;
                       for (@a) { s/a/b/ };
                       return @a },
    });

This is what it prints (on one machine--your results will be dependent on your hardware, operating system, and the load on your machine):

    Benchmark: timing 10000 iterations of for, map...
           for:  4 secs ( 3.97 usr  0.01 sys =  3.98 cpu)
           map:  6 secs ( 4.97 usr  0.00 sys =  4.97 cpu)

Be aware that a good benchmark is very hard to write.
It only tests the data you give it, and really proves little about differing complexities of contrasting algorithms.

How do I cross-reference my Perl programs?

The B::Xref module, shipped with the new, alpha-release Perl compiler (not the general distribution prior to the 5.005 release), can be used to generate cross-reference reports for Perl programs.

    perl -MO=Xref[,OPTIONS] scriptname.plx

Is there a pretty-printer (formatter) for Perl?

There is no program that will reformat Perl as much as indent(1) does for C. The complex feedback between the scanner and the parser (this feedback is what confuses the vgrind and emacs programs) makes it challenging at best to write a stand-alone Perl parser.

Of course, if you simply follow the guidelines in the perlstyle manpage, you shouldn't need to reformat. The habit of formatting your code as you write it will help prevent bugs. Your editor can and should help you with this. The perl-mode for emacs can provide a remarkable amount of help with most (but not all) code, and even less programmable editors can provide significant assistance. Tom swears by the following settings in vi and its clones:

    set ai sw=4
    map ^O {^M}^[O^T

Now put that in your .exrc file (replacing the caret characters with control characters) and away you go. In insert mode, ^T is for indenting, ^D is for undenting, and ^O is for blockdenting -- as it were. If you haven't used the last one, you're missing a lot. A more complete example, with comments, can be found at http://www.perl.com/CPAN-local/authors/id/TOMC/scripts/toms.exrc.gz

If you are used to using the *vgrind* program for printing out nice code to a laser printer, you can take a stab at this using http://www.perl.com/CPAN/doc/misc/tips/working.vgrind.entry, but the results are not particularly satisfying for sophisticated code.

The a2ps at http://www.infres.enst.fr/~demaille/a2ps/ does lots of things related to generating nicely printed output of documents.

Is there a etags/ctags for perl?
With respect to the source code for the Perl interpreter, yes. There has been support for etags in the source for a long time. Ctags was introduced in v5.005_54 (and probably 5.005_03). After building perl, type 'make etags' or 'make ctags' and both sets of tag files will be built.

Now, if you're looking to build a tag file for perl code, then there's a simple one at http://www.perl.com/CPAN/authors/id/TOMC/scripts/ptags.gz which may do the trick. And if not, it's easy to hack into what you want.

Is there an IDE or Windows Perl Editor?

If you're on Unix, you already have an IDE -- Unix itself. You just have to learn the toolbox. If you're not, then you probably don't have a toolbox, so may need something else.

PerlBuilder (XXX URL to follow) is an integrated development environment for Windows that supports Perl development. Perl programs are just plain text, though, so you could download emacs for Windows (XXX) or vim for win32 (http://www.cs.vu.nl/~tmgil/vi.html). If you're transferring Windows files to Unix, be sure to transfer in ASCII mode so the ends of lines are appropriately converted.

Where can I get Perl macros for vi?

For a complete version of Tom Christiansen's vi configuration file, see http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/toms.exrc.gz, the standard benchmark file for vi emulators. This runs best with nvi, the current version of vi out of Berkeley, which incidentally can be built with an embedded Perl interpreter -- see http://www.perl.com/CPAN/src/misc.

Where can I get perl-mode for emacs?

Since Emacs version 19 patchlevel 22 or so, there have been both a perl-mode.el and support for the perl debugger built in. These should come with the standard Emacs 19 distribution.

In the perl source directory, you'll find a directory called "emacs", which contains a cperl-mode that color-codes keywords, provides context-sensitive help, and other nifty things.
Note that the perl-mode of emacs will have fits with `"main'foo"' (single quote), and mess up the indentation and highlighting. You are probably using `"main::foo"' in new Perl code anyway, so this shouldn't be an issue.

How can I use curses with Perl?

The Curses module from CPAN provides a dynamically loadable object module interface to a curses library. A small demo can be found at the directory http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/rep; this program repeats a command and updates the screen as needed, rendering `rep ps axu' similar to top.

How can I use X or Tk with Perl?

Tk is a completely Perl-based, object-oriented interface to the Tk toolkit that doesn't force you to use Tcl just to get at Tk. Sx is an interface to the Athena Widget set. Both are available from CPAN. See the directory http://www.perl.com/CPAN/modules/by-category/08_User_Interfaces/

Invaluable for Perl/Tk programming are: the Perl/Tk FAQ at http://w4.lns.cornell.edu/~pvhp/ptk/ptkTOC.html , the Perl/Tk Reference Guide available at http://www.perl.com/CPAN-local/authors/Stephen_O_Lidie/ , and the online manpages at http://www-users.cs.umn.edu/~amundson/perl/perltk/toc.html .

How can I generate simple menus without using CGI or Tk?

The http://www.perl.com/CPAN/authors/id/SKUNZ/perlmenu.v4.0.tar.gz module, which is curses-based, can help with this.

What is undump?

See the next question.

How can I make my Perl program run faster?

The best way to do this is to come up with a better algorithm. This can often make a dramatic difference. Chapter 8 in the Camel has some efficiency tips in it you might want to look at. Jon Bentley's book ``Programming Pearls'' (that's not a misspelling!) has some good tips on optimization, too. Advice on benchmarking boils down to: benchmark and profile to make sure you're optimizing the right part, look for better algorithms instead of microtuning your code, and when all else fails consider just buying faster hardware.
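As a rough illustration of the "better algorithm" advice, the sketch below uses the standard Benchmark module to compare two ways of answering the same membership question: a linear scan with grep and a hash lookup. The word list, the target string, and the subroutine names in_list and in_hash are all invented for this example, and the timings will depend entirely on your machine, so treat it as a starting point rather than a measurement.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Hypothetical data set, invented for illustration only.
my @words  = map { "word$_" } 1 .. 5_000;
my %seen   = map { $_ => 1 } @words;
my $target = 'word4999';

# Linear scan: compares the target against every element on each call.
sub in_list { return scalar grep { $_ eq $target } @words }

# Hash lookup: a single probe, regardless of how long the list is.
sub in_hash { return exists $seen{$target} ? 1 : 0 }

# A negative count asks Benchmark to run each sub for at least
# that many CPU seconds instead of a fixed number of iterations.
timethese(-1, {
    'grep' => \&in_list,
    'hash' => \&in_hash,
});
```

Both subroutines return the same answer; only the amount of work per call differs, which is the kind of change that usually matters more than tuning individual statements.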
A different approach is to autoload seldom-used Perl code. See the AutoSplit and AutoLoader modules in the standard distribution for that. Or you could locate the bottleneck and think about writing just that part in C, the way we used to take bottlenecks in C code and write them in assembler. Similar to rewriting in C is the use of modules that have critical sections written in C (for instance, the PDL module from CPAN). In some cases, it may be worth it to use the backend compiler to produce byte code (saving compilation time) or compile into C, which will certainly save compilation time and sometimes a small amount (but not much) execution time. See the question about compiling your Perl programs for more on the compiler--the wins aren't as obvious as you'd hope. If you're currently linking your perl executable to a shared *libc.so*, you can often gain a 10-25% performance benefit by rebuilding it to link with a static libc.a instead. This will make a bigger perl executable, but your Perl programs (and programmers) may thank you for it. See the INSTALL file in the source distribution for more information. Unsubstantiated reports allege that Perl interpreters that use sfio outperform those that don't (for IO intensive applications). To try this, see the INSTALL file in the source distribution, especially the ``Selecting File IO mechanisms'' section. The undump program was an old attempt to speed up your Perl program by storing the already-compiled form to disk. This is no longer a viable option, as it only worked on a few architectures, and wasn't a good solution anyway. How can I make my Perl program take less memory? When it comes to time-space tradeoffs, Perl nearly always prefers to throw memory at a problem. Scalars in Perl use more memory than strings in C, arrays take more than that, and hashes use even more. While there's still a lot to be done, recent releases have been addressing these issues. 
For example, as of 5.004, duplicate hash keys are shared amongst all hashes using them, so require no reallocation.

In some cases, using substr() or vec() to simulate arrays can be highly beneficial. For example, an array of a thousand booleans will take at least 20,000 bytes of space, but it can be turned into one 125-byte bit vector for a considerable memory savings. The standard Tie::SubstrHash module can also help for certain types of data structure. If you're working with specialist data structures (matrices, for instance) modules that implement these in C may use less memory than equivalent Perl modules.

Another thing to try is learning whether your Perl was compiled with the system malloc or with Perl's builtin malloc. Whichever one it is, try using the other one and see whether this makes a difference. Information about malloc is in the INSTALL file in the source distribution. You can find out whether you are using perl's malloc by typing `perl -V:usemymalloc'.

Is it unsafe to return a pointer to local data?

No, Perl's garbage collection system takes care of this.

    sub makeone {
        my @a = ( 1 .. 10 );
        return \@a;
    }

    for $i ( 1 .. 10 ) {
        push @many, makeone();
    }

    print $many[4][5], "\n";

    print "@many\n";

How can I free an array or hash so my program shrinks?

You can't. On most operating systems, memory allocated to a program can never be returned to the system. That's why long-running programs sometimes re-exec themselves. Some operating systems (notably, FreeBSD and Linux) allegedly reclaim large chunks of memory that are no longer used, but it doesn't appear to happen with Perl (yet). The Mac appears to be the only platform that will reliably (albeit, slowly) return memory to the OS.

We've had reports that on Linux (Redhat 5.1) on Intel, `undef $scalar' will return memory to the system, while on Solaris 2.6 it won't. In general, try it yourself and see.
However, judicious use of my() on your variables will help make sure that they go out of scope so that Perl can free up their storage for use in other parts of your program. A global variable, of course, never goes out of scope, so you can't get its space automatically reclaimed, although undef()ing and/or delete()ing it will achieve the same effect. In general, memory allocation and de-allocation isn't something you can or should be worrying about much in Perl, but even this capability (preallocation of data types) is in the works.

How can I make my CGI script more efficient?

Beyond the normal measures described to make general Perl programs faster or smaller, a CGI program has additional issues. It may be run several times per second. Given that each time it runs it will need to be re-compiled and will often allocate a megabyte or more of system memory, this can be a killer. Compiling into C isn't going to help you because the process start-up overhead is where the bottleneck is.

There are two popular ways to avoid this overhead. One solution involves running the Apache HTTP server (available from http://www.apache.org/) with either of the mod_perl or mod_fastcgi plugin modules.

With mod_perl and the Apache::Registry module (distributed with mod_perl), httpd will run with an embedded Perl interpreter which pre-compiles your script and then executes it within the same address space without forking. The Apache extension also gives Perl access to the internal server API, so modules written in Perl can do just about anything a module written in C can. For more on mod_perl, see http://perl.apache.org/

With the FCGI module (from CPAN) and the mod_fastcgi module (available from http://www.fastcgi.com/) each of your perl scripts becomes a permanent CGI daemon process.

Both of these solutions can have far-reaching effects on your system and on the way you write your CGI scripts, so investigate them with care.
See http://www.perl.com/CPAN/modules/by-category/15_World_Wide_Web_HTML_HTTP_CGI/ .

A non-free, commercial product, ``The Velocity Engine for Perl'' (http://www.binevolve.com/), might also be worth looking at. It will allow you to increase the performance of your perl scripts, up to 25 times faster than normal CGI perl by running in persistent perl mode, or 4 to 5 times faster without any modification to your existing CGI scripts. Fully functional evaluation copies are available from the web site.

How can I hide the source for my Perl program?

Delete it. :-) Seriously, there are a number of (mostly unsatisfactory) solutions with varying levels of ``security''.

First of all, however, you *can't* take away read permission, because the source code has to be readable in order to be compiled and interpreted. (That doesn't mean that a CGI script's source is readable by people on the web, though, only by people with access to the filesystem.) So you have to leave the permissions at the socially friendly 0755 level.

Some people regard this as a security problem. If your program does insecure things, and relies on people not knowing how to exploit those insecurities, it is not secure. It is often possible for someone to determine the insecure things and exploit them without viewing the source. Security through obscurity, the name for hiding your bugs instead of fixing them, is little security indeed.

You can try using encryption via source filters (Filter::* from CPAN), but any decent programmer will be able to decrypt it. You can try using the byte code compiler and interpreter described below, but the curious might still be able to de-compile it. You can try using the native-code compiler described below, but crackers might be able to disassemble it. These pose varying degrees of difficulty to people wanting to get at your code, but none can definitively conceal it (this is true of every language, not just Perl).
If you're concerned about people profiting from your code, then the bottom line is that nothing but a restrictive licence will give you legal security. License your software and pepper it with threatening statements like ``This is unpublished proprietary software of XYZ Corp. Your access to it does not give you permission to use it blah blah blah.'' We are not lawyers, of course, so you should see a lawyer if you want to be sure your licence's wording will stand up in court. How can I compile my Perl program into byte code or C? Malcolm Beattie has written a multifunction backend compiler, available from CPAN, that can do both these things. It is included in the perl5.005 release, but is still considered experimental. This means it's fun to play with if you're a programmer but not really for people looking for turn-key solutions. Merely compiling into C does not in and of itself guarantee that your code will run very much faster. That's because except for lucky cases where a lot of native type inferencing is possible, the normal Perl run time system is still present and so your program will take just as long to run and be just as big. Most programs save little more than compilation time, leaving execution no more than 10-30% faster. A few rare programs actually benefit significantly (like several times faster), but this takes some tweaking of your code. You'll probably be astonished to learn that the current version of the compiler generates a compiled form of your script whose executable is just as big as the original perl executable, and then some. That's because as currently written, all programs are prepared for a full eval() statement. You can tremendously reduce this cost by building a shared *libperl.so* library and linking against that. See the INSTALL podfile in the perl source distribution for details. If you link your main perl binary with this, it will make it miniscule. For example, on one author's system, /usr/bin/perl is only 11k in size! 
In general, the compiler will do nothing to make a Perl program smaller, faster, more portable, or more secure. In fact, it will usually hurt all of those. The executable will be bigger, your VM system may take longer to load the whole thing, the binary is fragile and hard to fix, and compilation never stopped software piracy in the form of crackers, viruses, or bootleggers. The real advantage of the compiler is merely packaging, and once you see the size of what it makes (well, unless you use a shared *libperl.so*), you'll probably want a complete Perl install anyway. How can I compile Perl into Java? You can't. Not yet, anyway. You can integrate Java and Perl with the Perl Resource Kit from O'Reilly and Associates. See http://www.oreilly.com/catalog/prkunix/ for more information. The Java interface will be supported in the core 5.006 release of Perl. How can I get `#!perl' to work on [MS-DOS,NT,...]? For OS/2 just use extproc perl -S -your_switches as the first line in `*.cmd' file (`-S' due to a bug in cmd.exe's `extproc' handling). For DOS one should first invent a corresponding batch file, and codify it in `ALTERNATIVE_SHEBANG' (see the INSTALL file in the source distribution for more information). The Win95/NT installation, when using the ActiveState port of Perl, will modify the Registry to associate the `.pl' extension with the perl interpreter. If you install another port (Gurusamy Sarathy's is the recommended Win95/NT port), or (eventually) build your own Win95/NT Perl using a Windows port of gcc (e.g., with cygwin32 or mingw32), then you'll have to modify the Registry yourself. In addition to associating `.pl' with the interpreter, NT people can use: `SET PATHEXT=%PATHEXT%;.PL' to let them run the program `install-linux.pl' merely by typing `install-linux'. Macintosh perl scripts will have the appropriate Creator and Type, so that double-clicking them will invoke the perl application. 
*IMPORTANT!*: Whatever you do, PLEASE don't get frustrated, and just throw the perl interpreter into your cgi-bin directory, in order to get your scripts working for a web server. This is an EXTREMELY big security risk. Take the time to figure out how to do it correctly. Can I write useful perl programs on the command line? Yes. Read the perlrun manpage for more information. Some examples follow. (These assume standard Unix shell quoting rules.) # sum first and last fields perl -lane 'print $F[0] + $F[-1]' * # identify text files perl -le 'for(@ARGV) {print if -f && -T _}' * # remove (most) comments from C program perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c # make file a month younger than today, defeating reaper daemons perl -e '$X=24*60*60; utime(time(),time() + 30 * $X,@ARGV)' * # find first unused uid perl -le '$i++ while getpwuid($i); print $i' # display reasonable manpath echo $PATH | perl -nl -072 -e ' s![^/+]*$!man!&&-d&&!$s{$_}++&&push@m,$_;END{print"@m"}' Ok, the last one was actually an obfuscated perl entry. :-) Why don't perl one-liners work on my DOS/Mac/VMS system? The problem is usually that the command interpreters on those systems have rather different ideas about quoting than the Unix shells under which the one-liners were created. On some systems, you may have to change single-quotes to double ones, which you must *NOT* do on Unix or Plan9 systems. You might also have to change a single % to a %%. For example: # Unix perl -e 'print "Hello world\n"' # DOS, etc. perl -e "print \"Hello world\n\"" # Mac print "Hello world\n" (then Run "Myscript" or Shift-Command-R) # VMS perl -e "print ""Hello world\n""" The problem is that none of this is reliable: it depends on the command interpreter. Under Unix, the first two often work. Under DOS, it's entirely possible neither works. If 4DOS was the command shell, you'd probably have better luck like this: perl -e "print "Hello world\n"" Under the Mac, it depends which environment you are using. 
The MacPerl shell, or MPW, is much like Unix shells in its support for several quoting variants, except that it makes free use of the Mac's non-ASCII characters as control characters. Using qq(), q(), and qx(), instead of "double quotes", 'single quotes', and `backticks`, may make one-liners easier to write. There is no general solution to all of this. It is a mess, pure and simple. Sucks to be away from Unix, huh? :-) [Some of this answer was contributed by Kenneth Albanowski.] Where can I learn about CGI or Web programming in Perl? For modules, get the CGI or LWP modules from CPAN. For textbooks, see the two especially dedicated to web stuff in the question on books. For problems and questions related to the web, like ``Why do I get 500 Errors'' or ``Why doesn't it run from the browser right when it runs fine on the command line'', see these sources: WWW Security FAQ http://www.w3.org/Security/Faq/ Web FAQ http://www.boutell.com/faq/ CGI FAQ http://www.webthing.com/tutorials/cgifaq.html HTTP Spec http://www.w3.org/pub/WWW/Protocols/HTTP/ HTML Spec http://www.w3.org/TR/REC-html40/ http://www.w3.org/pub/WWW/MarkUp/ CGI Spec http://www.w3.org/CGI/ CGI Security FAQ http://www.go2net.com/people/paulp/cgi-security/safe-cgi.txt Also take a look at the perlfaq9 manpage Where can I learn about object-oriented Perl programming? the perltoot manpage is a good place to start, and you can use the perlobj manpage and the perlbot manpage for reference. Perltoot didn't come out until the 5.004 release, but you can get a copy (in pod, html, or postscript) from http://www.perl.com/CPAN/doc/FMTEYEWTK/ . Where can I learn about linking C with Perl? [h2xs, xsubpp] If you want to call C from Perl, start with the perlxstut manpage, moving on to the perlxs manpage, the xsubpp manpage, and the perlguts manpage. If you want to call Perl from C, then read the perlembed manpage, the perlcall manpage, and the perlguts manpage. 
Don't forget that you can learn a lot from looking at how the authors of existing extension modules wrote their code and solved their problems. I've read perlembed, perlguts, etc., but I can't embed perl in my C program, what am I doing wrong? Download the ExtUtils::Embed kit from CPAN and run `make test'. If the tests pass, read the pods again and again and again. If they fail, see the perlbug manpage and send a bugreport with the output of `make test TEST_VERBOSE=1' along with `perl -V'. When I tried to run my script, I got this message. What does it mean? the perldiag manpage has a complete list of perl's error messages and warnings, with explanatory text. You can also use the splain program (distributed with perl) to explain the error messages: perl program 2>diag.out splain [-v] [-p] diag.out or change your program to explain the messages for you: use diagnostics; or use diagnostics -verbose; What's MakeMaker? This module (part of the standard perl distribution) is designed to write a Makefile for an extension module from a Makefile.PL. For more information, see the ExtUtils::MakeMaker manpage. AUTHOR AND COPYRIGHT Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington. All rights reserved. When included as an integrated part of the Standard Distribution of Perl or of its documentation (printed or otherwise), this work is covered under Perl's Artistic Licence. For separate distributions of all or part of this FAQ outside of that, see the perlfaq manpage. Irrespective of its distribution, all code examples here are public domain. You are permitted and encouraged to use this code and any derivatives thereof in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit to the FAQ would be courteous but is not required. 
perlfaq4 section

NAME

perlfaq4 - Data Manipulation ($Revision: 1.40 $, $Date: 1999/01/08 04:26:39 $)

DESCRIPTION

This section of the FAQ answers questions related to the manipulation of data as numbers, dates, strings, arrays, hashes, and miscellaneous data issues.

Data: Numbers

Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)?

The infinite set that a mathematician thinks of as the real numbers can only be approximated on a computer, since the computer only has a finite number of bits to store an infinite number of, um, numbers.

Internally, your computer represents floating-point numbers in binary. Floating-point numbers read in from a file or appearing as literals in your program are converted from their decimal floating-point representation (eg, 19.95) to the internal binary representation.

However, 19.95 can't be precisely represented as a binary floating-point number, just like 1/3 can't be exactly represented as a decimal floating-point number. The computer's binary representation of 19.95, therefore, isn't exactly 19.95.

When a floating-point number gets printed, the binary floating-point representation is converted back to decimal. These decimal numbers are displayed in either the format you specify with printf(), or the current output format for numbers (see the section on "$#" in the perlvar manpage) if you use print. `$#' has a different default value in Perl5 than it did in Perl4. Changing `$#' yourself is deprecated.

This affects all computer languages that represent decimal floating-point numbers in binary, not just Perl. Perl provides arbitrary-precision decimal numbers with the Math::BigFloat module (part of the standard Perl distribution), but mathematical operations are consequently slower.

To get rid of the superfluous digits, just use a format (eg, `printf("%.2f", 19.95)') to get the required precision. See the section on "Floating-point Arithmetic" in the perlop manpage.
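
As a small illustration of the answer above, the sketch below prints 19.95 once with many more digits than it really has and once with a two-digit format. The variable names are ours, not part of the FAQ, and the exact long digit string depends on the platform's floating-point format.

```perl
use strict;
use warnings;

my $price = 19.95;

# Ask for far more digits than a double can hold exactly. The
# trailing digits expose the binary approximation and may differ
# from one platform to another.
my $raw = sprintf("%.20f", $price);

# Round for display only. The stored value is unchanged.
my $shown = sprintf("%.2f", $price);

print "stored as: $raw\n";
print "shown as:  $shown\n";
```

The format only affects what is printed. Arithmetic on `$price` still uses the inexact binary value.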
Why isn't my octal data interpreted correctly? Perl only understands octal and hex numbers as such when they occur as literals in your program. If they are read in from somewhere and assigned, no automatic conversion takes place. You must explicitly use oct() or hex() if you want the values converted. oct() interprets both hex ("0x350") numbers and octal ones ("0350" or even without the leading "0", like "377"), while hex() only converts hexadecimal ones, with or without a leading "0x", like "0x255", "3A", "ff", or "deadbeef". This problem shows up most often when people try using chmod(), mkdir(), umask(), or sysopen(), which all want permissions in octal. chmod(644, $file); # WRONG -- perl -w catches this chmod(0644, $file); # right Does Perl have a round() function? What about ceil() and floor()? Trig functions? Remember that int() merely truncates toward 0. For rounding to a certain number of digits, sprintf() or printf() is usually the easiest route. printf("%.3f", 3.1415926535); # prints 3.142 The POSIX module (part of the standard perl distribution) implements ceil(), floor(), and a number of other mathematical and trigonometric functions. use POSIX; $ceil = ceil(3.5); # 4 $floor = floor(3.5); # 3 In 5.000 to 5.003 Perls, trigonometry was done in the Math::Complex module. With 5.004, the Math::Trig module (part of the standard perl distribution) implements the trigonometric functions. Internally it uses the Math::Complex module and some functions can break out from the real axis into the complex plane, for example the inverse sine of 2. Rounding in financial applications can have serious implications, and the rounding method used should be specified precisely. In these cases, it probably pays not to trust whichever system rounding is being used by Perl, but to instead implement the rounding function you need yourself. 
To see why, notice how you'll still have an issue on half-way-point alternation:

    for ($i = 0; $i < 1.01; $i += 0.05) { printf "%.1f ",$i}

    0.0 0.1 0.1 0.2 0.2 0.2 0.3 0.3 0.4 0.4 0.5 0.5 0.6 0.7 0.7
    0.8 0.8 0.9 0.9 1.0 1.0

Don't blame Perl. It's the same as in C. IEEE says we have to do this. Perl numbers whose absolute values are integers under 2**31 (on 32 bit machines) will work pretty much like mathematical integers. Other numbers are not guaranteed.

How do I convert bits into ints?

To turn a string of 1s and 0s like `10110110' into a scalar containing its binary value, use the pack() function (documented in the section on "pack" in the perlfunc manpage). pack() returns a one-byte string, so take its ord() to get the number:

    $decimal = ord(pack('B8', '10110110'));

Here's an example of going the other way:

    $binary_string = join('', unpack('B*', "\x29"));

Why doesn't & work the way I want it to?

The behavior of binary arithmetic operators depends on whether they're used on numbers or strings. The operators treat a string as a series of bits and work with that (the string `"3"' is the bit pattern `00110011'). The operators work with the binary form of a number (the number `3' is treated as the bit pattern `00000011').

So, saying `11 & 3' performs the "and" operation on numbers (yielding `3'). Saying `"11" & "3"' performs the "and" operation on strings (yielding `"1"').

Most problems with `&' and `|' arise because the programmer thinks they have a number but really it's a string. The rest arise because the programmer says:

    if ("\020\020" & "\101\101") {
        # ...
    }

but a string consisting of two null bytes (the result of `"\020\020" & "\101\101"') is not a false value in Perl. You need:

    if ( ("\020\020" & "\101\101") !~ /[^\000]/) {
        # ...
    }

How do I multiply matrices?

Use the Math::Matrix or Math::MatrixReal modules (available from CPAN) or the PDL extension (also available from CPAN).

How do I perform an operation on a series of integers?
To call a function on each element in an array, and collect the results, use:

    @results = map { my_func($_) } @array;

For example:

    @triple = map { 3 * $_ } @single;

To call a function on each element of an array, but ignore the results:

    foreach $iterator (@array) {
        some_func($iterator);
    }

To call a function on each integer in a (small) range, you can use:

    @results = map { some_func($_) } (5 .. 25);

but you should be aware that the `..' operator creates an array of all integers in the range. This can take a lot of memory for large ranges. Instead use:

    @results = ();
    for ($i=5; $i < 500_005; $i++) {
        push(@results, some_func($i));
    }

How can I output Roman numerals?

Get the http://www.perl.com/CPAN/modules/by-module/Roman module.

Why aren't my random numbers random?

If you're using a version of Perl before 5.004, you must call `srand' once at the start of your program to seed the random number generator. 5.004 and later automatically call `srand' at the beginning. Don't call `srand' more than once--you make your numbers less random, rather than more.

Computers are good at being predictable and bad at being random (despite appearances caused by bugs in your programs :-). http://www.perl.com/CPAN/doc/FMTEYEWTK/random, courtesy of Tom Phoenix, talks more about this. John von Neumann said, ``Anyone who attempts to generate random numbers by deterministic means is, of course, living in a state of sin.''

If you want numbers that are more random than `rand' with `srand' provides, you should also check out the Math::TrulyRandom module from CPAN. It uses the imperfections in your system's timer to generate random numbers, but this takes quite a while. If you want a better pseudorandom generator than comes with your operating system, look at ``Numerical Recipes in C'' at http://www.nr.com/ .

Data: Dates

How do I find the week-of-the-year/day-of-the-year?
The day of the year is in the array returned by localtime() (see the section on "localtime" in the perlfunc manpage): $day_of_year = (localtime(time()))[7]; or more legibly (in 5.004 or higher): use Time::localtime; $day_of_year = localtime(time())->yday; You can find the week of the year by dividing this by 7: $week_of_year = int($day_of_year / 7); Of course, this believes that weeks start at zero. The Date::Calc module from CPAN has a lot of date calculation functions, including day of the year, week of the year, and so on. Note that not all businesses consider ``week 1'' to be the same; for example, American businesses often consider the first week with a Monday in it to be Work Week #1, despite ISO 8601, which considers WW1 to be the first week with a Thursday in it. How can I compare two dates and find the difference? If you're storing your dates as epoch seconds then simply subtract one from the other. If you've got a structured date (distinct year, day, month, hour, minute, seconds values) then use one of the Date::Manip and Date::Calc modules from CPAN. How can I take a string and turn it into epoch seconds? If it's a regular enough string that it always has the same format, you can split it up and pass the parts to `timelocal' in the standard Time::Local module. Otherwise, you should look into the Date::Calc and Date::Manip modules from CPAN. How can I find the Julian Day? Neither Date::Manip nor Date::Calc deal with Julian days. Instead, there is an example of Julian date calculation that should help you in Time::JulianDay (part of the Time-modules bundle) which can be found at http://www.perl.com/CPAN/modules/by-module/Time/. How do I find yesterday's date? The `time()' function returns the current time in seconds since the epoch. Take one day off that: $yesterday = time() - ( 24 * 60 * 60 ); Then you can pass this to `localtime()' and get the individual year, month, day, hour, minute, seconds values. Does Perl have a year 2000 problem? 
Is Perl Y2K compliant? Short answer: No, Perl does not have a Year 2000 problem. Yes, Perl is Y2K compliant (whatever that means). The programmers you've hired to use it, however, probably are not. Long answer: The question belies a true understanding of the issue. Perl is just as Y2K compliant as your pencil--no more, and no less. Can you use your pencil to write a non-Y2K-compliant memo? Of course you can. Is that the pencil's fault? Of course it isn't. The date and time functions supplied with perl (gmtime and localtime) supply adequate information to determine the year well beyond 2000 (2038 is when trouble strikes for 32-bit machines). The year returned by these functions when used in an array context is the year minus 1900. For years between 1910 and 1999 this *happens* to be a 2-digit decimal number. To avoid the year 2000 problem simply do not treat the year as a 2-digit number. It isn't. When gmtime() and localtime() are used in scalar context they return a timestamp string that contains a fully-expanded year. For example, `$timestamp = gmtime(1005613200)' sets $timestamp to "Tue Nov 13 01:00:00 2001". There's no year 2000 problem here. That doesn't mean that Perl can't be used to create non-Y2K compliant programs. It can. But so can your pencil. It's the fault of the user, not the language. At the risk of inflaming the NRA: ``Perl doesn't break Y2K, people do.'' See http://language.perl.com/news/y2k.html for a longer exposition. Data: Strings How do I validate input? The answer to this question is usually a regular expression, perhaps with auxiliary logic. See the more specific questions (numbers, mail addresses, etc.) for details. How do I unescape a string? It depends just what you mean by ``escape''. URL escapes are dealt with in the perlfaq9 manpage. Shell escapes with the backslash (`\') character are removed with: s/\\(.)/$1/g; This won't expand `"\n"' or `"\t"' or any other special escapes. How do I remove consecutive pairs of characters? 
To turn `"abbcccd"' into `"abccd"':

    s/(.)\1/$1/g;

How do I expand function calls in a string?

This is documented in the perlref manpage. In general, this is fraught with quoting and readability problems, but it is possible. To interpolate a subroutine call (in list context) into a string:

    print "My sub returned @{[mysub(1,2,3)]} that time.\n";

If you prefer scalar context, similar chicanery is also useful for arbitrary expressions:

    print "That yields ${\($n + 5)} widgets\n";

Version 5.004 of Perl had a bug that gave list context to the expression in `${...}', but this is fixed in version 5.005.

See also ``How can I expand variables in text strings?'' in this section of the FAQ.

How do I find matching/nesting anything?

This isn't something that can be done in one regular expression, no matter how complicated. To find something between two single characters, a pattern like `/x([^x]*)x/' will get the intervening bits in $1. For multiple-character delimiters, something more like `/alpha(.*?)omega/' would be needed. But none of these deals with nested patterns, nor can they. For that you'll have to write a parser.

If you are serious about writing a parser, there are a number of modules or oddities that will make your life a lot easier. There is the CPAN module Parse::RecDescent, the standard module Text::Balanced, the byacc program, the CPAN module Parse::Yapp, and Mark-Jason Dominus's excellent *py* tool at http://www.plover.com/~mjd/perl/py/ .

One simple destructive, inside-out approach that you might try is to pull out the smallest nesting parts one at a time:

    while (s/BEGIN((?:(?!BEGIN)(?!END).)*)END//gs) {
        # do something with $1
    }

A more complicated and sneaky approach is to make Perl's regular expression engine do it for you. This is courtesy Dean Inada, and rather has the nature of an Obfuscated Perl Contest entry, but it really does work:

    # $_ contains the string to parse
    # BEGIN and END are the opening and closing markers for the
    # nested text.
@( = ('(',''); @) = (')',''); ($re=$_)=~s/((BEGIN)|(END)|.)/$)[!$3]\Q$1\E$([!$2]/gs; @$ = (eval{/$re/},$@!~/unmatched/); print join("\n",@$[0..$#$]) if( $$[-1] ); How do I reverse a string? Use reverse() in scalar context, as documented in the "reverse" entry in the perlfunc manpage. $reversed = reverse $string; How do I expand tabs in a string? You can do it yourself: 1 while $string =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e; Or you can just use the Text::Tabs module (part of the standard perl distribution). use Text::Tabs; @expanded_lines = expand(@lines_with_tabs); How do I reformat a paragraph? Use Text::Wrap (part of the standard perl distribution): use Text::Wrap; print wrap("\t", ' ', @paragraphs); The paragraphs you give to Text::Wrap should not contain embedded newlines. Text::Wrap doesn't justify the lines (flush-right). How can I access/change the first N letters of a string? There are many ways. If you just want to grab a copy, use substr(): $first_byte = substr($a, 0, 1); If you want to modify part of a string, the simplest way is often to use substr() as an lvalue: substr($a, 0, 3) = "Tom"; Although those with a pattern matching kind of thought process will likely prefer: $a =~ s/^.../Tom/; How do I change the Nth occurrence of something? You have to keep track of N yourself. For example, let's say you want to change the fifth occurrence of `"whoever"' or `"whomever"' into `"whosoever"' or `"whomsoever"', case insensitively. $count = 0; s{((whom?)ever)}{ ++$count == 5 # is it the 5th? ? "${2}soever" # yes, swap : $1 # renege and leave it there }igex; In the more general case, you can use the `/g' modifier in a `while' loop, keeping count of matches. 
    $_ = "One fish two fish red fish blue fish";   # sample input
    $WANT = 3;
    $count = 0;
    while (/(\w+)\s+fish\b/gi) {
        if (++$count == $WANT) {
            print "The third fish is a $1 one.\n";
            # Warning: don't `last' out of this loop
        }
    }

That prints out: `"The third fish is a red one."' You can also use a repetition count and repeated pattern like this:

    /(?:\w+\s+fish\s+){2}(\w+)\s+fish/i;

How can I count the number of occurrences of a substring within a string?

There are a number of ways, with varying efficiency. If you want a count of a certain single character (X) within a string, you can use the `tr///' function like so:

    $string = "ThisXlineXhasXsomeXx'sXinXit";
    $count = ($string =~ tr/X//);
    print "There are $count X characters in the string";

This is fine if you are just looking for a single character. However, if you are trying to count multiple character substrings within a larger string, `tr///' won't work. What you can do is wrap a while() loop around a global pattern match. For example, let's count negative integers:

    $string = "-9 55 48 -2 23 -76 4 14 -44";
    while ($string =~ /-\d+/g) { $count++ }
    print "There are $count negative numbers in the string";

How do I capitalize all the words on one line?

To make the first letter of each word upper case:

    $line =~ s/\b(\w)/\U$1/g;

This has the strange effect of turning "`don't do it'" into "`Don'T Do It'". Sometimes you might want this, instead (Suggested by Brian Foy):

    $string =~ s/ (
                     (^\w)    #at the beginning of the line
                       |      # or
                     (\s\w)   #preceded by whitespace
                   )
                /\U$1/xg;
    $string =~ s/([\w']+)/\u\L$1/g;

To make the whole line upper case:

    $line = uc($line);

To force each word to be lower case, with the first letter upper case:

    $line =~ s/(\w+)/\u\L$1/g;

You can (and probably should) enable locale awareness of those characters by placing a `use locale' pragma in your program. See the perllocale manpage for endless details on locales.

This is sometimes referred to as putting something into "title case", but that's not quite accurate. Consider the proper capitalization of the movie *Dr.
Strangelove or: How I Learned to Stop Worrying and Love the Bomb*, for example.

How can I split a [character] delimited string except when inside [character]? (Comma-separated files)

Take the example case of trying to split a string that is comma-separated into its different fields. (We'll pretend you said comma-separated, not comma-delimited, which is different and almost never what you mean.) You can't use `split(/,/)' because you shouldn't split if the comma is inside quotes. For example, take a data line like this:

    SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"

Due to the restriction of the quotes, this is a fairly complex problem. Thankfully, we have Jeffrey Friedl, author of a highly recommended book on regular expressions, to handle these for us. He suggests (assuming your string is contained in $text):

    @new = ();
    push(@new, $+) while $text =~ m{
        "([^\"\\]*(?:\\.[^\"\\]*)*)",?  # groups the phrase inside the quotes
      | ([^,]+),?
      | ,
    }gx;
    push(@new, undef) if substr($text,-1,1) eq ',';

If you want to represent quotation marks inside a quotation-mark-delimited field, escape them with backslashes (eg, `"like \"this\""'). Unescaping them is a task addressed earlier in this section.

Alternatively, the Text::ParseWords module (part of the standard perl distribution) lets you say:

    use Text::ParseWords;
    @new = quotewords(",", 0, $text);

There's also a Text::CSV module on CPAN.

How do I strip blank space from the beginning/end of a string?

Although the simplest approach would seem to be:

    $string =~ s/^\s*(.*?)\s*$/$1/;

this is unnecessarily slow, destructive, and fails with embedded newlines. It is much faster to do this in two steps:

    $string =~ s/^\s+//;
    $string =~ s/\s+$//;

Or more nicely written as:

    for ($string) {
        s/^\s+//;
        s/\s+$//;
    }

This idiom takes advantage of the `foreach' loop's aliasing behavior to factor out common code.
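
If you need the two-step trim in many places, one option is to wrap it in a small helper that works on a copy. This is only a sketch: the name `trim' is our own choice, not a Perl builtin, and the sample string is made up to show that an embedded newline survives.

```perl
use strict;
use warnings;

# Hypothetical helper (the name "trim" is ours): returns a trimmed
# copy and leaves the caller's string untouched.
sub trim {
    my ($string) = @_;
    for ($string) {     # alias $_ to our private copy
        s/^\s+//;       # strip leading whitespace
        s/\s+$//;       # strip trailing whitespace
    }
    return $string;
}

my $trimmed = trim("  \t hello\nworld  \n");
print "[$trimmed]\n";
```

Because the sub copies its argument first, the aliasing inside the `for' block changes only the copy.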
You can do this on several strings at once, or arrays, or even the values of a hash if you use a slice:

    # trim whitespace in the scalar, the array,
    # and all the values in the hash
    foreach ($scalar, @array, @hash{keys %hash}) {
        s/^\s+//;
        s/\s+$//;
    }

How do I pad a string with blanks or pad a number with zeroes?

(This answer contributed by Uri Guttman)

In the following examples, `$pad_len' is the length to which you wish to pad the string, `$text' or `$num' contains the string to be padded, and `$pad_char' contains the padding character. You can use a single character string constant instead of the `$pad_char' variable if you know what it is in advance.

The simplest method uses the `sprintf' function. It can pad on the left or right with blanks and on the left with zeroes.

    # Left padding with blank:
    $padded = sprintf( "%${pad_len}s", $text ) ;

    # Right padding with blank:
    $padded = sprintf( "%-${pad_len}s", $text ) ;

    # Left padding with 0:
    $padded = sprintf( "%0${pad_len}d", $num ) ;

If you need to pad with a character other than blank or zero you can use one of the following methods. These methods generate a pad string with the `x' operator and concatenate that with the original text.

Left and right padding with any character:

    $padded = $pad_char x ( $pad_len - length( $text ) ) . $text ;
    $padded = $text . $pad_char x ( $pad_len - length( $text ) ) ;

Or you can right or left pad $text directly:

    $text .= $pad_char x ( $pad_len - length( $text ) ) ;
    substr( $text, 0, 0 ) = $pad_char x ( $pad_len - length( $text ) ) ;

How do I extract selected columns from a string?

Use substr() or unpack(), both documented in the perlfunc manpage.
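
For instance, here is a minimal sketch of both functions on a fixed-width record. The record layout (a 10-character name, a 10-character city and a 2-character code) and the variable names are invented for the example.

```perl
use strict;
use warnings;

# Build a made-up fixed-width record: name in columns 1-10,
# city in columns 11-20, code in columns 21-22.
my $record = sprintf("%-10s%-10s%2s", "Larry", "Portland", "42");

# substr() takes a zero-based offset and a width, and returns the
# field exactly as stored, padding included.
my $city_padded = substr($record, 10, 10);

# unpack() with "A" fields takes a list of widths and strips
# trailing spaces from each field.
my ($name, $city, $code) = unpack("A10 A10 A2", $record);

print "$name lives in $city (code $code)\n";
```

Note that both calls are driven by field widths, not by ending column numbers.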
If you prefer thinking in terms of columns instead of widths, you can use this kind of thing: # determine the unpack format needed to split Linux ps output # arguments are cut columns my $fmt = cut2fmt(8, 14, 20, 26, 30, 34, 41, 47, 59, 63, 67, 72); sub cut2fmt { my(@positions) = @_; my $template = ''; my $lastpos = 1; for my $place (@positions) { $template .= "A" . ($place - $lastpos) . " "; $lastpos = $place; } $template .= "A*"; return $template; } How do I find the soundex value of a string? Use the standard Text::Soundex module distributed with perl. How can I expand variables in text strings? Let's assume that you have a string like: $text = 'this has a $foo in it and a $bar'; If those were both global variables, then this would suffice: $text =~ s/\$(\w+)/${$1}/g; # no /e needed But since they are probably lexicals, or at least, they could be, you'd have to do this: $text =~ s/(\$\w+)/$1/eeg; die if $@; # needed /ee, not /e It's probably better in the general case to treat those variables as entries in some special hash. For example: %user_defs = ( foo => 23, bar => 19, ); $text =~ s/\$(\w+)/$user_defs{$1}/g; See also ``How do I expand function calls in a string?'' in this section of the FAQ. What's wrong with always quoting "$vars"? The problem is that those double-quotes force stringification, coercing numbers and references into strings, even when you don't want them to be. Think of it this way: double-quote expansion is used to produce new strings. If you already have a string, why do you need more? If you get used to writing odd things like these: print "$var"; # BAD $new = "$old"; # BAD somefunc("$var"); # BAD You'll be in trouble. 
Those should (in 99.8% of the cases) be the simpler and more direct: print $var; $new = $old; somefunc($var); Otherwise, besides slowing you down, you're going to break code when the thing in the scalar is actually neither a string nor a number, but a reference: func(\@array); sub func { my $aref = shift; my $oref = "$aref"; # WRONG } You can also get into subtle problems on those few operations in Perl that actually do care about the difference between a string and a number, such as the magical `++' autoincrement operator or the syscall() function. Stringification also destroys arrays. @lines = `command`; print "@lines"; # WRONG - extra blanks print @lines; # right Why don't my <op_ppaddr)() ) ; @@@ TAINT_NOT; @@@ return 0; @@@ } MAIN_INTERPRETER_LOOP Or with a fixed amount of leading white space, with remaining indentation correctly preserved: $poem = fix< 1 ? \@intersection : \@difference }, $element; } How do I test whether two arrays or hashes are equal? The following code works for single-level arrays. It uses a stringwise comparison, and does not distinguish defined versus undefined empty strings. Modify if you have other needs. $are_equal = compare_arrays(\@frogs, \@toads); sub compare_arrays { my ($first, $second) = @_; local $^W = 0; # silence spurious -w undef complaints return 0 unless @$first == @$second; for (my $i = 0; $i < @$first; $i++) { return 0 if $first->[$i] ne $second->[$i]; } return 1; } For multilevel structures, you may wish to use an approach more like this one. It uses the CPAN module FreezeThaw: use FreezeThaw qw(cmpStr); @a = @b = ( "this", "that", [ "more", "stuff" ] ); printf "a and b contain %s arrays\n", cmpStr(\@a, \@b) == 0 ? "the same" : "different"; This approach also works for comparing hashes. 
Here we'll demonstrate two different answers: use FreezeThaw qw(cmpStr cmpStrHard); %a = %b = ( "this" => "that", "extra" => [ "more", "stuff" ] ); $a{EXTRA} = \%b; $b{EXTRA} = \%a; printf "a and b contain %s hashes\n", cmpStr(\%a, \%b) == 0 ? "the same" : "different"; printf "a and b contain %s hashes\n", cmpStrHard(\%a, \%b) == 0 ? "the same" : "different"; The first reports that both those the hashes contain the same data, while the second reports that they do not. Which you prefer is left as an exercise to the reader. How do I find the first array element for which a condition is true? You can use this if you care about the index: for ($i= 0; $i < @array; $i++) { if ($array[$i] eq "Waldo") { $found_index = $i; last; } } Now `$found_index' has what you want. How do I handle linked lists? In general, you usually don't need a linked list in Perl, since with regular arrays, you can push and pop or shift and unshift at either end, or you can use splice to add and/or remove arbitrary number of elements at arbitrary points. Both pop and shift are both O(1) operations on perl's dynamic arrays. In the absence of shifts and pops, push in general needs to reallocate on the order every log(N) times, and unshift will need to copy pointers each time. If you really, really wanted, you could use structures as described in the perldsc manpage or the perltoot manpage and do just what the algorithm book tells you to do. For example, imagine a list node like this: $node = { VALUE => 42, LINK => undef, }; You could walk the list this way: print "List: "; for ($node = $head; $node; $node = $node->{LINK}) { print $node->{VALUE}, " "; } print "\n"; You could grow the list this way: my ($head, $tail); $tail = append($head, 1); # grow a new head for $value ( 2 .. 
10 ) { $tail = append($tail, $value); } sub append { my($list, $value) = @_; my $node = { VALUE => $value }; if ($list) { $node->{LINK} = $list->{LINK}; $list->{LINK} = $node; } else { $_[0] = $node; # replace caller's version } return $node; } But again, Perl's built-in are virtually always good enough. How do I handle circular lists? Circular lists could be handled in the traditional fashion with linked lists, or you could just do something like this with an array: unshift(@array, pop(@array)); # the last shall be first push(@array, shift(@array)); # and vice versa How do I shuffle an array randomly? Use this: # fisher_yates_shuffle( \@array ) : # generate a random permutation of @array in place sub fisher_yates_shuffle { my $array = shift; my $i; for ($i = @$array; --$i; ) { my $j = int rand ($i+1); next if $i == $j; @$array[$i,$j] = @$array[$j,$i]; } } fisher_yates_shuffle( \@array ); # permutes @array in place You've probably seen shuffling algorithms that works using splice, randomly picking another element to swap the current element with: srand; @new = (); @old = 1 .. 10; # just a demo while (@old) { push(@new, splice(@old, rand @old, 1)); } This is bad because splice is already O(N), and since you do it N times, you just invented a quadratic algorithm; that is, O(N**2). This does not scale, although Perl is so efficient that you probably won't notice this until you have rather largish arrays. How do I process/modify each element of an array? Use `for'/`foreach': for (@lines) { s/foo/bar/; # change that word y/XZ/ZX/; # swap those letters } Here's another; let's compute spherical volumes: for (@volumes = @radii) { # @volumes has changed parts $_ **= 3; $_ *= (4/3) * 3.14159; # this will be constant folded } If you want to do the same thing to modify the values of the hash, you may not use the `values' function, oddly enough. 
You need a slice: for $orbit ( @orbits{keys %orbits} ) { ($orbit **= 3) *= (4/3) * 3.14159; } How do I select a random element from an array? Use the rand() function (see the "rand" entry in the perlfunc manpage): # at the top of the program: srand; # not needed for 5.004 and later # then later on $index = rand @array; $element = $array[$index]; Make sure you *only call srand once per program, if then*. If you are calling it more than once (such as before each call to rand), you're almost certainly doing something wrong. How do I permute N elements of a list? Here's a little program that generates all permutations of all the words on each line of input. The algorithm embodied in the permute() function should work on any list: #!/usr/bin/perl -n # tsc-permute: permute each word of input permute([split], []); sub permute { my @items = @{ $_[0] }; my @perms = @{ $_[1] }; unless (@items) { print "@perms\n"; } else { my(@newitems,@newperms,$i); foreach $i (0 .. $#items) { @newitems = @items; @newperms = @perms; unshift(@newperms, splice(@newitems, $i, 1)); permute([@newitems], [@newperms]); } } } How do I sort an array by (anything)? Supply a comparison function to sort() (described in the "sort" entry in the perlfunc manpage): @list = sort { $a <=> $b } @list; The default sort function is cmp, string comparison, which would sort `(1, 2, 10)' into `(1, 10, 2)'. `<=>', used above, is the numerical comparison operator. If you have a complicated function needed to pull out the part you want to sort on, then don't do it inside the sort function. Pull it out first, because the sort BLOCK can be called many times for the same element. Here's an example of how to pull out the first word after the first number on each item, and then sort those words case-insensitively. @idx = (); for (@data) { ($item) = /\d+\s*(\S+)/; push @idx, uc($item); } @sorted = @data[ sort { $idx[$a] cmp $idx[$b] } 0 .. 
    $#idx ];

Which could also be written this way, using a trick that's come to be known as the Schwartzian Transform:

    @sorted = map  { $_->[0] }
              sort { $a->[1] cmp $b->[1] }
              map  { [ $_, uc( (/\d+\s*(\S+)/)[0] ) ] } @data;

If you need to sort on several fields, the following paradigm is useful.

    @sorted = sort { field1($a) <=> field1($b) ||
                     field2($a) cmp field2($b) ||
                     field3($a) cmp field3($b)
                   } @data;

This can be conveniently combined with precalculation of keys as given above.

See http://www.perl.com/CPAN/doc/FMTEYEWTK/sort.html for more about this approach.

See also the question below on sorting hashes.

  How do I manipulate arrays of bits?

Use pack() and unpack(), or else vec() and the bitwise operations.

For example, this sets bit N of $vec for each N listed in @ints:

    $vec = '';
    foreach(@ints) { vec($vec,$_,1) = 1 }

And here's how, given a vector in $vec, you can get those bits into your @ints array:

    sub bitvec_to_list {
        my $vec = shift;
        my @ints;
        # Find null-byte density then select best algorithm
        if ($vec =~ tr/\0// / length $vec > 0.95) {
            use integer;
            my $i;
            # This method is faster with mostly null-bytes
            while($vec =~ /[^\0]/g ) {
                $i = -9 + 8 * pos $vec;
                push @ints, $i if vec($vec, ++$i, 1);
                push @ints, $i if vec($vec, ++$i, 1);
                push @ints, $i if vec($vec, ++$i, 1);
                push @ints, $i if vec($vec, ++$i, 1);
                push @ints, $i if vec($vec, ++$i, 1);
                push @ints, $i if vec($vec, ++$i, 1);
                push @ints, $i if vec($vec, ++$i, 1);
                push @ints, $i if vec($vec, ++$i, 1);
            }
        } else {
            # This method is a fast general algorithm
            use integer;
            my $bits = unpack "b*", $vec;
            push @ints, 0 if $bits =~ s/^(\d)// && $1;
            push @ints, pos $bits while($bits =~ /1/g);
        }
        return \@ints;
    }

This method gets faster the more sparse the bit vector is. (Courtesy of Tim Bunce and Winfried Koenig.)
Here's a demo on how to use vec(): # vec demo $vector = "\xff\x0f\xef\xfe"; print "Ilya's string \\xff\\x0f\\xef\\xfe represents the number ", unpack("N", $vector), "\n"; $is_set = vec($vector, 23, 1); print "Its 23rd bit is ", $is_set ? "set" : "clear", ".\n"; pvec($vector); set_vec(1,1,1); set_vec(3,1,1); set_vec(23,1,1); set_vec(3,1,3); set_vec(3,2,3); set_vec(3,4,3); set_vec(3,4,7); set_vec(3,8,3); set_vec(3,8,7); set_vec(0,32,17); set_vec(1,32,17); sub set_vec { my ($offset, $width, $value) = @_; my $vector = ''; vec($vector, $offset, $width) = $value; print "offset=$offset width=$width value=$value\n"; pvec($vector); } sub pvec { my $vector = shift; my $bits = unpack("b*", $vector); my $i = 0; my $BASE = 8; print "vector length in bytes: ", length($vector), "\n"; @bytes = unpack("A8" x length($vector), $bits); print "bits are: @bytes\n\n"; } Why does defined() return true on empty arrays and hashes? The short story is that you should probably only use defined on scalars or functions, not on aggregates (arrays and hashes). See the "defined" entry in the perlfunc manpage in the 5.004 release or later of Perl for more detail. Data: Hashes (Associative Arrays) How do I process an entire hash? Use the each() function (see the "each" entry in the perlfunc manpage) if you don't care whether it's sorted: while ( ($key, $value) = each %hash) { print "$key = $value\n"; } If you want it sorted, you'll have to use foreach() on the result of sorting the keys as shown in an earlier question. What happens if I add or remove keys from a hash while iterating over it? Don't do that. How do I look up a hash element by value? Create a reverse hash: %by_value = reverse %by_key; $key = $by_value{$value}; That's not particularly efficient. It would be more space-efficient to use: while (($key, $value) = each %by_key) { $by_value{$value} = $key; } If your hash could have repeated values, the methods above will only find one of the associated keys. This may or may not worry you. 
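If the repeated values do worry you, one option is to keep a list of keys for each value instead of a single key. The following is only a sketch; the sample data and the name `%keys_for' are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical data in which two keys share the value 'red'.
my %by_key = (apple => 'red', cherry => 'red', banana => 'yellow');

# Map each value to an anonymous array holding every key that has it.
my %keys_for;
while (my ($key, $value) = each %by_key) {
    push @{ $keys_for{$value} }, $key;
}

# Hash order is unpredictable, so sort before relying on the order.
my @red = sort @{ $keys_for{red} };
print "red: @red\n";
```

This costs one small array per distinct value, but no key is silently dropped.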
  How can I know how many entries are in a hash?

If you mean how many keys, then all you have to do is take the scalar sense of the keys() function:

    $num_keys = scalar keys %hash;

In void context, keys() just resets the iterator, which is faster for tied hashes.

  How do I sort a hash (optionally by value instead of key)?

Internally, hashes are stored in a way that prevents you from imposing an order on key-value pairs. Instead, you have to sort a list of the keys or values:

    @keys = sort keys %hash;    # sorted by key
    @keys = sort {
                    $hash{$a} cmp $hash{$b}
            } keys %hash;       # and by value

Here we'll do a reverse numeric sort by value, and if two values are identical, sort by length of key, and if that fails, by straight ASCII comparison of the keys (well, possibly modified by your locale -- see the perllocale manpage).

    @keys = sort {
                $hash{$b} <=> $hash{$a}
                          ||
                length($b) <=> length($a)
                          ||
                      $a cmp $b
    } keys %hash;

  How can I always keep my hash sorted?

You can look into using the DB_File module and tie() using the $DB_BTREE hash bindings as documented in the section on "In Memory Databases" in the DB_File manpage. The Tie::IxHash module from CPAN might also be instructive.

  What's the difference between "delete" and "undef" with hashes?

Hashes are pairs of scalars: the first is the key, the second is the value. The key will be coerced to a string, although the value can be any kind of scalar: string, number, or reference. If a key `$key' is present in the hash, `exists($array{$key})' will return true. The value for a given key can be `undef', in which case `$array{$key}' will be `undef' while `exists($array{$key})' will return true. This corresponds to (`$key', `undef') being in the hash.

Pictures help...
here's the `%ary' table: keys values +------+------+ | a | 3 | | x | 7 | | d | 0 | | e | 2 | +------+------+ And these conditions hold $ary{'a'} is true $ary{'d'} is false defined $ary{'d'} is true defined $ary{'a'} is true exists $ary{'a'} is true (perl5 only) grep ($_ eq 'a', keys %ary) is true If you now say undef $ary{'a'} your table now reads: keys values +------+------+ | a | undef| | x | 7 | | d | 0 | | e | 2 | +------+------+ and these conditions now hold; changes in caps: $ary{'a'} is FALSE $ary{'d'} is false defined $ary{'d'} is true defined $ary{'a'} is FALSE exists $ary{'a'} is true (perl5 only) grep ($_ eq 'a', keys %ary) is true Notice the last two: you have an undef value, but a defined key! Now, consider this: delete $ary{'a'} your table now reads: keys values +------+------+ | x | 7 | | d | 0 | | e | 2 | +------+------+ and these conditions now hold; changes in caps: $ary{'a'} is false $ary{'d'} is false defined $ary{'d'} is true defined $ary{'a'} is false exists $ary{'a'} is FALSE (perl5 only) grep ($_ eq 'a', keys %ary) is FALSE See, the whole entry is gone! Why don't my tied hashes make the defined/exists distinction? They may or may not implement the EXISTS() and DEFINED() methods differently. For example, there isn't the concept of undef with hashes that are tied to DBM* files. This means the true/false tables above will give different results when used on such a hash. It also means that exists and defined do the same thing with a DBM* file, and what they end up doing is not what they do with ordinary hashes. How do I reset an each() operation part-way through? Using `keys %hash' in scalar context returns the number of keys in the hash *and* resets the iterator associated with the hash. You may need to do this if you use `last' to exit a loop early so that when you re-enter it, the hash iterator has been reset. How can I get the unique keys from two hashes? 
First you extract the keys from the hashes into arrays, and then solve the uniquifying the array problem described above. For example: %seen = (); for $element (keys(%foo), keys(%bar)) { $seen{$element}++; } @uniq = keys %seen; Or more succinctly: @uniq = keys %{{%foo,%bar}}; Or if you really want to save space: %seen = (); while (defined ($key = each %foo)) { $seen{$key}++; } while (defined ($key = each %bar)) { $seen{$key}++; } @uniq = keys %seen; How can I store a multidimensional array in a DBM file? Either stringify the structure yourself (no fun), or else get the MLDBM (which uses Data::Dumper) module from CPAN and layer it on top of either DB_File or GDBM_File. How can I make my hash remember the order I put elements into it? Use the Tie::IxHash from CPAN. use Tie::IxHash; tie(%myhash, Tie::IxHash); for ($i=0; $i<20; $i++) { $myhash{$i} = 2*$i; } @keys = keys %myhash; # @keys = (0,1,2,3,...) Why does passing a subroutine an undefined element in a hash create it? If you say something like: somefunc($hash{"nonesuch key here"}); Then that element "autovivifies"; that is, it springs into existence whether you store something there or not. That's because functions get scalars passed in by reference. If somefunc() modifies `$_[0]', it has to be ready to write it back into the caller's version. This has been fixed as of perl5.004. Normally, merely accessing a key's value for a nonexistent key does *not* cause that key to be forever there. This is different than awk's behavior. How can I make the Perl equivalent of a C structure/C++ class/hash or array of hashes or arrays? Usually a hash ref, perhaps like this: $record = { NAME => "Jason", EMPNO => 132, TITLE => "deputy peon", AGE => 23, SALARY => 37_000, PALS => [ "Norbert", "Rhys", "Phineas"], }; References are documented in the perlref manpage and the upcoming the perlreftut manpage. Examples of complex data structures are given in the perldsc manpage and the perllol manpage. 
Examples of structures and object-oriented classes are in the perltoot manpage.

  How can I use a reference as a hash key?

You can't do this directly, but you could use the standard Tie::RefHash module distributed with perl.

Data: Misc

  How do I handle binary data correctly?

Perl is binary clean, so this shouldn't be a problem. For example, this works fine (assuming the files are found):

    if (`cat /vmunix` =~ /gzip/) {
        print "Your kernel is GNU-zip enabled!\n";
    }

On some legacy systems, however, you have to play tedious games with "text" versus "binary" files. See the section on "binmode" in the perlfunc manpage, or the upcoming perlopentut manpage.

If you're concerned about 8-bit ASCII data, then see the perllocale manpage.

If you want to deal with multibyte characters, however, there are some gotchas. See the section on Regular Expressions.

  How do I determine whether a scalar is a number/whole/integer/float?

Assuming that you don't care about IEEE notations like "NaN" or "Infinity", you probably just want to use a regular expression.

    if (/\D/)            { print "has nondigits\n" }
    if (/^\d+$/)         { print "is a whole number\n" }
    if (/^-?\d+$/)       { print "is an integer\n" }
    if (/^[+-]?\d+$/)    { print "is a +/- integer\n" }
    if (/^-?\d+\.?\d*$/) { print "is a real number\n" }
    if (/^-?(?:\d+(?:\.\d*)?|\.\d+)$/) { print "is a decimal number" }
    if (/^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/)
                         { print "a C float" }

If you're on a POSIX system, Perl supports the `POSIX::strtod' function. Its semantics are somewhat cumbersome, so here's a `getnum' wrapper function for more convenient access. This function takes a string and returns the number it found, or `undef' for input that isn't a C float. The `is_numeric' function is a front end to `getnum' if you just want to say, ``Is this a float?''

    sub getnum {
        use POSIX qw(strtod);
        my $str = shift;
        $str =~ s/^\s+//;
        $str =~ s/\s+$//;
        $! = 0;
        my($num, $unparsed) = strtod($str);
        if (($str eq '') || ($unparsed != 0) || $!)
        {
            return undef;
        } else {
            return $num;
        }
    }

    # call getnum() on our argument; a bare "defined &getnum" would only
    # test whether the subroutine itself exists
    sub is_numeric { defined getnum($_[0]) }

Or you could check out String::Scanf which can be found at http://www.perl.com/CPAN/modules/by-module/String/. The POSIX module (part of the standard Perl distribution) provides the `strtod' and `strtol' functions for converting strings to doubles and longs, respectively.

  How do I keep persistent data across program calls?

For some specific applications, you can use one of the DBM modules. See the AnyDBM_File manpage. More generically, you should consult the FreezeThaw, Storable, or Class::Eroot modules from CPAN. Here's one example using Storable's `store' and `retrieve' functions:

    use Storable;
    store(\%hash, "filename");

    # later on...
    $href = retrieve("filename");        # by ref
    %hash = %{ retrieve("filename") };   # direct to hash

  How do I print out or copy a recursive data structure?

The Data::Dumper module on CPAN (or the 5.005 release of Perl) is great for printing out data structures. The Storable module, found on CPAN, provides a function called `dclone' that recursively copies its argument.

    use Storable qw(dclone);
    $r2 = dclone($r1);

Where $r1 can be a reference to any kind of data structure you'd like. It will be deeply copied. Because `dclone' takes and returns references, you'd have to add extra punctuation if you had a hash of arrays that you wanted to copy.

    %newhash = %{ dclone(\%oldhash) };

  How do I define methods for every class/object?

Use the UNIVERSAL class (see the UNIVERSAL manpage).

  How do I verify a credit card checksum?

Get the Business::CreditCard module from CPAN.

  How do I pack arrays of doubles or floats for XS code?

The kgbpack.c code in the PGPLOT module on CPAN does just this. If you're doing a lot of float or double processing, consider using the PDL module from CPAN instead--it makes number-crunching easy.

AUTHOR AND COPYRIGHT

Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington. All rights reserved.
When included as part of the Standard Version of Perl, or as part of its complete documentation whether printed or otherwise, this work may be distributed only under the terms of Perl's Artistic Licence. Any distribution of this file or derivatives thereof *outside* of that package require that special arrangements be made with copyright holder. Irrespective of its distribution, all code examples in this file are hereby placed into the public domain. You are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit would be courteous but is not required. perlfaq5 section NAME perlfaq5 - Files and Formats ($Revision: 1.34 $, $Date: 1999/01/08 05:46:13 $) DESCRIPTION This section deals with I/O and the "f" issues: filehandles, flushing, formats, and footers. How do I flush/unbuffer an output filehandle? Why must I do this? The C standard I/O library (stdio) normally buffers characters sent to devices. This is done for efficiency reasons, so that there isn't a system call for each byte. Any time you use print() or write() in Perl, you go though this buffering. syswrite() circumvents stdio and buffering. In most stdio implementations, the type of output buffering and the size of the buffer varies according to the type of device. Disk files are block buffered, often with a buffer size of more than 2k. Pipes and sockets are often buffered with a buffer size between 1/2 and 2k. Serial devices (e.g. modems, terminals) are normally line-buffered, and stdio sends the entire line when it gets the newline. Perl does not support truly unbuffered output (except insofar as you can `syswrite(OUT, $char, 1)'). What it does instead support is "command buffering", in which a physical write is performed after every output command. This isn't as hard on your system as unbuffering, but does get the output where you want it when you want it. 
If you expect characters to get to your device when you print them there, you'll want to autoflush its handle. Use select() and the `$|' variable to control autoflushing (see the section on "$|" in the perlvar manpage and the "select" entry in the perlfunc manpage): $old_fh = select(OUTPUT_HANDLE); $| = 1; select($old_fh); Or using the traditional idiom: select((select(OUTPUT_HANDLE), $| = 1)[0]); Or if don't mind slowly loading several thousand lines of module code just because you're afraid of the `$|' variable: use FileHandle; open(DEV, "+autoflush(1); or the newer IO::* modules: use IO::Handle; open(DEV, ">/dev/printer"); # but is this? DEV->autoflush(1); or even this: use IO::Socket; # this one is kinda a pipe? $sock = IO::Socket::INET->new(PeerAddr => 'www.perl.com', PeerPort => 'http(80)', Proto => 'tcp'); die "$!" unless $sock; $sock->autoflush(); print $sock "GET / HTTP/1.0" . "\015\012" x 2; $document = join('', <$sock>); print "DOC IS: $document\n"; Note the bizarrely hardcoded carriage return and newline in their octal equivalents. This is the ONLY way (currently) to assure a proper flush on all platforms, including Macintosh. That the way things work in network programming: you really should specify the exact bit pattern on the network line terminator. In practice, `"\n\n"' often works, but this is not portable. See the perlfaq9 manpage for other examples of fetching URLs over the web. How do I change one line in a file/delete a line in a file/insert a line in the middle of a file/append to the beginning of a file? Those are operations of a text editor. Perl is not a text editor. Perl is a programming language. You have to decompose the problem into low- level calls to read, write, open, close, and seek. Although humans have an easy time thinking of a text file as being a sequence of lines that operates much like a stack of playing cards -- or punch cards -- computers usually see the text file as a sequence of bytes. 
In general, there's no direct way for Perl to seek to a particular line of a file, insert text into a file, or remove text from a file.

(There are exceptions in special circumstances. You can add or remove at the very end of the file. Another is replacing a sequence of bytes with another sequence of the same length. Another is using the `$DB_RECNO' array bindings as documented in the DB_File manpage. Yet another is manipulating files with all lines the same length.)

The general solution is to create a temporary copy of the text file with the changes you want, then copy that over the original. This assumes no locking.

    $old = $file;
    $new = "$file.tmp.$$";
    $bak = "$file.orig";

    open(OLD, "< $old")         or die "can't open $old: $!";
    open(NEW, "> $new")         or die "can't open $new: $!";

    # Correct typos, preserving case
    while (<OLD>) {
        s/\b(p)earl\b/${1}erl/i;
        (print NEW $_)          or die "can't write to $new: $!";
    }

    close(OLD)                  or die "can't close $old: $!";
    close(NEW)                  or die "can't close $new: $!";

    rename($old, $bak)          or die "can't rename $old to $bak: $!";
    rename($new, $old)          or die "can't rename $new to $old: $!";

Perl can do this sort of thing for you automatically with the `-i' command-line switch or the closely-related `$^I' variable (see the perlrun manpage for more details). Note that `-i' may require a suffix on some non-Unix systems; see the platform-specific documentation that came with your port.

    # Renumber a series of tests from the command line
    perl -pi -e 's/(^\s+test\s+)\d+/ $1 . ++$count /e' t/op/taint.t

    # from a script
    local($^I, @ARGV) = ('.orig', glob("*.c"));
    while (<>) {
        if ($. == 1) {
            print "This line should appear at the top of each file\n";
        }
        s/\b(p)earl\b/${1}erl/i;        # Correct typos, preserving case
        print;
        close ARGV if eof;              # Reset $.
    }

If you need to seek to an arbitrary line of a file that changes infrequently, you could build up an index of byte positions of where the line ends are in the file.
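A minimal sketch of such an index follows. The names `build_index' and `read_line' are hypothetical, and the demonstration reads from an in-memory string, which assumes perl 5.8 or later; an ordinary file opened for reading should behave the same way.

```perl
use strict;
use warnings;

# Return a reference to an array whose element N-1 holds the byte
# offset at which line N starts.
sub build_index {
    my ($fh) = @_;
    seek($fh, 0, 0) or die "can't seek: $!";
    my @offsets = (0);
    while (defined(my $line = <$fh>)) {
        push @offsets, tell($fh);
    }
    pop @offsets;    # the last offset is end-of-file, not a line start
    return \@offsets;
}

# Fetch line $n (counting from 1) by seeking straight to its offset.
sub read_line {
    my ($fh, $index, $n) = @_;
    return if $n < 1 || $n > @$index;
    seek($fh, $index->[$n - 1], 0) or die "can't seek: $!";
    return scalar <$fh>;
}

# Demonstration on an in-memory "file" (an assumption for the example).
my $text = "first\nsecond\nthird\n";
open(my $fh, '<', \$text) or die "can't open in-memory file: $!";
my $index = build_index($fh);
my $line  = read_line($fh, $index, 2);
print "line 2 is: $line";
```

The index has to be rebuilt whenever the file changes, which is why this only pays off for files that change infrequently.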
If the file is large, an index of every tenth or hundredth line end would allow you to seek and read fairly efficiently. If the file is sorted, try the look.pl library (part of the standard perl distribution).

In the unique case of deleting lines at the end of a file, you can use tell() and truncate(). The following code snippet deletes the last line of a file without making a copy or reading the whole file into memory:

    open (FH, "+< $file");
    while ( <FH> ) { $addr = tell(FH) unless eof(FH) }
    truncate(FH, $addr);

Error checking is left as an exercise for the reader.

  How do I count the number of lines in a file?

One fairly efficient way is to count newlines in the file. The following program uses a feature of tr///, as documented in the perlop manpage. If your text file doesn't end with a newline, then it's not really a proper text file, so this may report one fewer line than you expect.

    $lines = 0;
    open(FILE, $filename) or die "Can't open `$filename': $!";
    while (sysread FILE, $buffer, 4096) {
        $lines += ($buffer =~ tr/\n//);
    }
    close FILE;

This assumes no funny games with newline translations.

  How do I make a temporary file name?

Use the `new_tmpfile' class method from the IO::File module to get a filehandle opened for reading and writing. Use this if you don't need to know the file's name.

    use IO::File;
    $fh = IO::File->new_tmpfile()
        or die "Unable to make new temporary file: $!";

Or you can use the `tmpnam' function from the POSIX module to get a filename that you then open yourself. Use this if you do need to know the file's name.

    use Fcntl;
    use POSIX qw(tmpnam);

    # try new temporary filenames until we get one that didn't already
    # exist;  the check should be unnecessary, but you can't be too careful
    do { $name = tmpnam() }
        until sysopen(FH, $name, O_RDWR|O_CREAT|O_EXCL);

    # install atexit-style handler so that when we exit or die,
    # we automatically delete this temporary file
    END { unlink($name) or die "Couldn't unlink $name : $!" }

    # now go on to use the file ...
If you're committed to doing this by hand, use the process ID and/or the current time-value. If you need to have many temporary files in one process, use a counter:

    BEGIN {
        use Fcntl;
        my $temp_dir = -d '/tmp' ? '/tmp' : $ENV{TMP} || $ENV{TEMP};
        my $base_name = sprintf("%s/%d-%d-0000", $temp_dir, $$, time());
        sub temp_file {
            local *FH;
            my $count = 0;
            until (defined(fileno(FH)) || $count++ > 100) {
                $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e;
                sysopen(FH, $base_name, O_WRONLY|O_EXCL|O_CREAT);
            }
            if (defined(fileno(FH))) {
                return (*FH, $base_name);
            } else {
                return ();
            }
        }
    }

  How can I manipulate fixed-record-length files?

The most efficient way is using pack() and unpack(). This is faster than using substr() when taking many, many strings. It is slower for just a few.

Here is a sample chunk of code to break up and put back together again some fixed-format input lines, in this case from the output of a normal, Berkeley-style ps:

    # sample input line:
    #   15158 p5  T      0:00 perl /home/tchrist/scripts/now-what
    $PS_T = 'A6 A4 A7 A5 A*';
    open(PS, "ps|");
    print scalar <PS>;
    while (<PS>) {
        ($pid, $tt, $stat, $time, $command) = unpack($PS_T, $_);
        for $var (qw!pid tt stat time command!) {
            print "$var: <$$var>\n";
        }
        print 'line=', pack($PS_T, $pid, $tt, $stat, $time, $command),
                "\n";
    }

We've used `$$var' in a way that is forbidden by `use strict 'refs''. That is, we've promoted a string to a scalar variable reference using symbolic references. This is ok in small programs, but doesn't scale well. It also only works on global variables, not lexicals.

  How can I make a filehandle local to a subroutine? How do I pass filehandles between subroutines? How do I make an array of filehandles?

The fastest, simplest, and most direct way is to localize the typeglob of the filehandle in question:

    local *TmpHandle;

Typeglobs are fast (especially compared with the alternatives) and reasonably easy to use, but they also have one subtle drawback.
If you had, for example, a function named TmpHandle(), or a variable named %TmpHandle, you just hid it from yourself.

    sub findme {
        local *HostFile;
        open(HostFile, "</etc/hosts") or die "no /etc/hosts: $!";
        local $_;               # <- VERY IMPORTANT
        while (<HostFile>) {
            print if /\b127\.(0\.0\.)?1\b/;
        }
        # *HostFile automatically closes/disappears here
    }

Here's how to use this in a loop to open and store a bunch of filehandles. We'll use as values of the hash an ordered pair to make it easy to sort the hash in insertion order.

    @names = qw(motd termcap passwd hosts);
    my $i = 0;
    foreach $filename (@names) {
        local *FH;
        open(FH, "/etc/$filename") || die "$filename: $!";
        $file{$filename} = [ $i++, *FH ];
    }

    # Using the filehandles stored in the hash
    foreach $name (sort { $file{$a}[0] <=> $file{$b}[0] } keys %file) {
        my $fh = $file{$name}[1];
        my $line = <$fh>;
        print "$name $. $line";
    }

For passing filehandles to functions, the easiest way is to preface them with a star, as in func(*STDIN). See the section on "Passing Filehandles" in the perlfaq7 manpage for details.

If you want to create many anonymous handles, you should check out the Symbol, FileHandle, or IO::Handle (etc.) modules. Here's the equivalent code with Symbol::gensym, which is reasonably light-weight:

    foreach $filename (@names) {
        use Symbol;
        my $fh = gensym();
        open($fh, "/etc/$filename") || die "open /etc/$filename: $!";
        $file{$filename} = [ $i++, $fh ];
    }

Or here using the semi-object-oriented FileHandle module, which certainly isn't light-weight:

    use FileHandle;

    foreach $filename (@names) {
        my $fh = FileHandle->new("/etc/$filename") or die "$filename: $!";
        $file{$filename} = [ $i++, $fh ];
    }

Please understand that whether the filehandle happens to be a (probably localized) typeglob or an anonymous handle from one of the modules in no way affects the bizarre rules for managing indirect handles. See the next question.

How can I use a filehandle indirectly?

An indirect filehandle is using something other than a symbol in a place that a filehandle is expected.
Here are ways to get those:

    $fh =   SOME_FH;       # bareword is strict-subs hostile
    $fh =  "SOME_FH";      # strict-refs hostile; same package only
    $fh =  *SOME_FH;       # typeglob
    $fh = \*SOME_FH;       # ref to typeglob (bless-able)
    $fh =  *SOME_FH{IO};   # blessed IO::Handle from *SOME_FH typeglob

Or you can use the `new' method from the FileHandle or IO modules to create an anonymous filehandle, store that in a scalar variable, and use it as though it were a normal filehandle.

    use FileHandle;
    $fh = FileHandle->new();

    use IO::Handle;                     # 5.004 or higher
    $fh = IO::Handle->new();

Then use any of those as you would a normal filehandle. Anywhere that Perl is expecting a filehandle, an indirect filehandle may be used instead. An indirect filehandle is just a scalar variable that contains a filehandle. Functions like `print', `open', `seek', or the `<FH>' diamond operator will accept either a real filehandle or a scalar variable containing one:

    ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
    print $ofh "Type it: ";
    $got = <$ifh>;
    print $efh "What was that: $got";

If you're passing a filehandle to a function, you can write the function in two ways:

    sub accept_fh {
        my $fh = shift;
        print $fh "Sending to indirect filehandle\n";
    }

Or it can localize a typeglob and use the filehandle directly:

    sub accept_fh {
        local *FH = shift;
        print FH "Sending to localized filehandle\n";
    }

Both styles work with either objects or typeglobs of real filehandles. (They might also work with strings under some circumstances, but this is risky.)

    accept_fh(*STDOUT);
    accept_fh($handle);

In the examples above, we assigned the filehandle to a scalar variable before using it. That is because only simple scalar variables, not expressions or subscripts into hashes or arrays, can be used with built-ins like `print', `printf', or the diamond operator.
These are illegal and won't even compile:

    @fd = (*STDIN, *STDOUT, *STDERR);
    print $fd[1] "Type it: ";                           # WRONG
    $got = <$fd[0]>;                                    # WRONG
    print $fd[2] "What was that: $got";                 # WRONG

With `print' and `printf', you get around this by using a block and an expression where you would place the filehandle:

    print  { $fd[1] } "funny stuff\n";
    printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559;
    # Pity the poor deadbeef.

That block is a proper block like any other, so you can put more complicated code there. This sends the message out to one of two places:

    $ok = -x "/bin/cat";
    print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n";
    print { $fd[ 1+ ($ok || 0) ]  } "cat stat $ok\n";

This approach of treating `print' and `printf' like object method calls doesn't work for the diamond operator. That's because it's a real operator, not just a function with a comma-less argument. Assuming you've been storing typeglobs in your structure as we did above, you can use the built-in function named `readline' to read a record just as `<>' does. Given the initialization shown above for @fd, this would work, but only because readline() requires a typeglob. It doesn't work with objects or strings, which might be a bug we haven't fixed yet.

    $got = readline($fd[0]);

Let it be noted that the flakiness of indirect filehandles is not related to whether they're strings, typeglobs, objects, or anything else. It's the syntax of the fundamental operators. Playing the object game doesn't help you at all here.

How can I set up a footer format to be used with write()?

There's no builtin way to do this, but the perlform manpage has a couple of techniques to make it possible for the intrepid hacker.

How can I write() into a string?

See the section on "Accessing Formatting Internals" in the perlform manpage for an swrite() function.

How can I output my numbers with commas added?
This one will do it for you:

    sub commify {
        local $_ = shift;
        1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
        return $_;
    }

    $n = 23659019423.2331;
    print "GOT: ", commify($n), "\n";

    GOT: 23,659,019,423.2331

You can't just:

    s/^([-+]?\d+)(\d{3})/$1,$2/g;

because you have to put the comma in and then recalculate your position.

Alternatively, this commifies all numbers in a line regardless of whether they have decimal portions, are preceded by + or -, or whatever:

    # from Andrew Johnson
    sub commify {
        my $input = shift;
        $input = reverse $input;
        $input =~ s<(\d\d\d)(?=\d)(?!\d*\.)><$1,>g;
        return scalar reverse $input;
    }

How can I translate tildes (~) in a filename?

Use the <> (glob()) operator, documented in the perlfunc manpage. This requires that you have a shell installed that groks tildes, meaning csh or tcsh or (some versions of) ksh, and thus may have portability problems. The Glob::KGlob module (available from CPAN) gives more portable glob functionality.

Within Perl, you may use this directly:

    $filename =~ s{
      ^ ~             # find a leading tilde
      (               # save this in $1
          [^/]        # a non-slash character
                *     # repeated 0 or more times (0 means me)
      )
    }{
      $1
          ? (getpwnam($1))[7]
          : ( $ENV{HOME} || $ENV{LOGDIR} )
    }ex;

How come when I open a file read-write it wipes it out?

Because you're using something like this, which truncates the file and *then* gives you read-write access:

    open(FH, "+> /path/name");          # WRONG (almost always)

Whoops. You should instead use this, which will fail if the file doesn't exist.

    open(FH, "+< /path/name");          # open for update

Using ">" always clobbers or creates. Using "<" never does either. The "+" doesn't change this.

Here are examples of many kinds of file opens.
Those using sysopen() all assume

    use Fcntl;

To open file for reading:

    open(FH, "< $path")                                 || die $!;
    sysopen(FH, $path, O_RDONLY)                        || die $!;

To open file for writing, create new file if needed or else truncate old file:

    open(FH, "> $path")                                 || die $!;
    sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT)        || die $!;
    sysopen(FH, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666)  || die $!;

To open file for writing, create new file, file must not exist:

    sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT)         || die $!;
    sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT, 0666)   || die $!;

To open file for appending, create if necessary:

    open(FH, ">> $path")                                || die $!;
    sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT)       || die $!;
    sysopen(FH, $path, O_WRONLY|O_APPEND|O_CREAT, 0666) || die $!;

To open file for appending, file must exist:

    sysopen(FH, $path, O_WRONLY|O_APPEND)               || die $!;

To open file for update, file must exist:

    open(FH, "+< $path")                                || die $!;
    sysopen(FH, $path, O_RDWR)                          || die $!;

To open file for update, create file if necessary:

    sysopen(FH, $path, O_RDWR|O_CREAT)                  || die $!;
    sysopen(FH, $path, O_RDWR|O_CREAT, 0666)            || die $!;

To open file for update, file must not exist:

    sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT)           || die $!;
    sysopen(FH, $path, O_RDWR|O_EXCL|O_CREAT, 0666)     || die $!;

To open a file without blocking, creating if necessary:

    sysopen(FH, "/tmp/somefile", O_WRONLY|O_NDELAY|O_CREAT)
        or die "can't open /tmp/somefile: $!";

Be warned that neither creation nor deletion of files is guaranteed to be an atomic operation over NFS. That is, two processes might both successfully create or unlink the same file! Therefore O_EXCL isn't so exclusive as you might wish.

See also the new perlopentut manpage if you have it (new for 5.006).

Why do I sometimes get an "Argument list too long" when I use <*>?

The `<>' operator performs a globbing operation (see above).
By default glob() forks csh(1) to do the actual glob expansion, but csh can't handle more than 127 items and so gives the error message `Argument list too long'. People who installed tcsh as csh won't have this problem, but their users may be surprised by it. To get around this, either do the glob yourself with readdir() and patterns, or use a module like Glob::KGlob, one that doesn't use the shell to do globbing. This is expected to be fixed soon. Is there a leak/bug in glob()? Due to the current implementation on some operating systems, when you use the glob() function or its angle-bracket alias in a scalar context, you may cause a leak and/or unpredictable behavior. It's best therefore to use glob() only in list context. How can I open a file with a leading ">" or trailing blanks? Normally perl ignores trailing blanks in filenames, and interprets certain leading characters (or a trailing "|") to mean something special. To avoid this, you might want to use a routine like this. It makes incomplete pathnames into explicit relative ones, and tacks a trailing null byte on the name to make perl leave it alone: sub safe_filename { local $_ = shift; s#^([^./])#./$1#; $_ .= "\0"; return $_; } $badpath = "<< $fn") or "couldn't open $badpath: $!"; This assumes that you are using POSIX (portable operating systems interface) paths. If you are on a closed, non-portable, proprietary system, you may have to adjust the `"./"' above. It would be a lot clearer to use sysopen(), though: use Fcntl; $badpath = "<<file.lock")? A common bit of code NOT TO USE is this: sleep(3) while -e "file.lock"; # PLEASE DO NOT USE open(LCK, "> file.lock"); # THIS BROKEN CODE This is a classic race condition: you take two steps to do something which must be done in one. That's why computer hardware provides an atomic test-and-set instruction. 
In theory, this "ought" to work:

    sysopen(FH, "file.lock", O_WRONLY|O_EXCL|O_CREAT)
        or die "can't open file.lock: $!";

except that lamentably, file creation (and deletion) is not atomic over NFS, so this won't work (at least, not every time) over the net. Various schemes involving link() have been suggested, but these tend to involve busy-wait, which is also subdesirable.

I still don't get locking. I just want to increment the number in the file. How can I do this?

Didn't anyone ever tell you web-page hit counters were useless? They don't count number of hits, they're a waste of time, and they serve only to stroke the writer's vanity. Better to pick a random number. It's more realistic.

Anyway, this is what you can do if you can't help yourself.

    use Fcntl qw(:DEFAULT :flock);      # O_* and LOCK_* constants
    sysopen(FH, "numfile", O_RDWR|O_CREAT)       or die "can't open numfile: $!";
    flock(FH, LOCK_EX)                           or die "can't flock numfile: $!";
    $num = <FH> || 0;
    seek(FH, 0, 0)                               or die "can't rewind numfile: $!";
    truncate(FH, 0)                              or die "can't truncate numfile: $!";
    (print FH $num+1, "\n")                      or die "can't write numfile: $!";
    # Perl as of 5.004 automatically flushes before unlocking
    flock(FH, LOCK_UN)                           or die "can't flock numfile: $!";
    close FH                                     or die "can't close numfile: $!";

Here's a much better web-page hit counter:

    $hits = int( (time() - 850_000_000) / rand(1_000) );

If the count doesn't impress your friends, then the code might. :-)

How do I randomly update a binary file?

If you're just trying to patch a binary, in many cases something as simple as this works:

    perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs

However, if you have fixed sized records, then you might do something more like this:

    $RECSIZE = 220; # size of record, in bytes
    $recno   = 37;  # which record to update
    open(FH, "+mtime); print "file $file updated at $date_string\n";

The POSIX::strftime() approach has the benefit of being, in theory, independent of the current locale. See the perllocale manpage for details.
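As a hedged sketch of the POSIX::strftime() approach mentioned above, the following reads a file's modification time with stat() and formats it. The helper name format_mtime(), the scratch file name, and the date format string are all assumptions made for illustration; only stat(), localtime(), and POSIX::strftime() are taken as given.

```perl
use strict;
use POSIX qw(strftime);

# format_mtime() is a hypothetical helper, not part of any module.
# It returns a file's modification time as "YYYY-MM-DD HH:MM:SS"
# (local time), or undef if the file can't be stat()ed.
sub format_mtime {
    my ($file) = @_;
    my $mtime = (stat($file))[9];       # element 9 of stat() is the mtime
    return undef unless defined $mtime;
    return strftime("%Y-%m-%d %H:%M:%S", localtime($mtime));
}

# Create a scratch file so the example has something to examine.
my $file = "mtime_demo.$$";
open(DEMO, "> $file")   or die "can't create $file: $!";
print DEMO "hello\n";
close(DEMO)             or die "can't close $file: $!";

my $date_string = format_mtime($file);
defined($date_string)   or die "can't stat $file: $!";
print "file $file updated at $date_string\n";

unlink($file)           or die "can't unlink $file: $!";
```

An all-numeric format is used here so that the output does not depend on locale-specific month or weekday names.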
How do I set a file's timestamp in perl? You use the utime() function documented in the "utime" entry in the perlfunc manpage. By way of example, here's a little program that copies the read and write times from its first argument to all the rest of them. if (@ARGV < 2) { die "usage: cptimes timestamp_file other_files ...\n"; } $timestamp = shift; ($atime, $mtime) = (stat($timestamp))[8,9]; utime $atime, $mtime, @ARGV; Error checking is, as usual, left as an exercise for the reader. Note that utime() currently doesn't work correctly with Win95/NT ports. A bug has been reported. Check it carefully before using it on those platforms. How do I print to more than one file at once? If you only have to do this once, you can do this: for $fh (FH1, FH2, FH3) { print $fh "whatever\n" } To connect up to one filehandle to several output filehandles, it's easiest to use the tee(1) program if you have it, and let it take care of the multiplexing: open (FH, "| tee file1 file2 file3"); Or even: # make STDOUT go to three files, plus original STDOUT open (STDOUT, "| tee file1 file2 file3") or die "Teeing off: $!\n"; print "whatever\n" or die "Writing: $!\n"; close(STDOUT) or die "Closing: $!\n"; Otherwise you'll have to write your own multiplexing print function -- or your own tee program -- or use Tom Christiansen's, at http://www.perl.com/CPAN/authors/id/TOMC/scripts/tct.gz, which is written in Perl and offers much greater functionality than the stock version. How can I read in a file by paragraphs? Use the `$/' variable (see the perlvar manpage for details). You can either set it to `""' to eliminate empty paragraphs (`"abc\n\n\n\ndef"', for instance, gets treated as two paragraphs and not three), or `"\n\n"' to accept empty paragraphs. Note that a blank line must have no blanks in it. Thus `"fred\n \nstuff\n\n"' is one paragraph, but `"fred\n\nstuff\n\n"' is two. How can I read a single character from a file? From the keyboard? 
You can use the builtin `getc()' function for most filehandles, but it won't (easily) work on a terminal device. For STDIN, either use the Term::ReadKey module from CPAN, or use the sample code in the "getc" entry in the perlfunc manpage. If your system supports the portable operating system programming interface (POSIX), you can use the following code, which you'll note turns off echo processing as well. #!/usr/bin/perl -w use strict; $| = 1; for (1..4) { my $got; print "gimme: "; $got = getone(); print "--> $got\n"; } exit; BEGIN { use POSIX qw(:termios_h); my ($term, $oterm, $echo, $noecho, $fd_stdin); $fd_stdin = fileno(STDIN); $term = POSIX::Termios->new(); $term->getattr($fd_stdin); $oterm = $term->getlflag(); $echo = ECHO | ECHOK | ICANON; $noecho = $oterm & ~$echo; sub cbreak { $term->setlflag($noecho); $term->setcc(VTIME, 1); $term->setattr($fd_stdin, TCSANOW); } sub cooked { $term->setlflag($oterm); $term->setcc(VTIME, 0); $term->setattr($fd_stdin, TCSANOW); } sub getone { my $key = ''; cbreak(); sysread(STDIN, $key, 1); cooked(); return $key; } } END { cooked() } The Term::ReadKey module from CPAN may be easier to use. Recent version include also support for non-portable systems as well. use Term::ReadKey; open(TTY, " reports the following: To put the PC in "raw" mode, use ioctl with some magic numbers gleaned from msdos.c (Perl source file) and Ralf Brown's interrupt list (comes across the net every so often): $old_ioctl = ioctl(STDIN,0,0); # Gets device info $old_ioctl &= 0xff; ioctl(STDIN,1,$old_ioctl | 32); # Writes it back, setting bit 5 Then to read a single character: sysread(STDIN,$c,1); # Read a single character And to put the PC back to "cooked" mode: ioctl(STDIN,1,$old_ioctl); # Sets it back to cooked mode. So now you have $c. If `ord($c) == 0', you have a two byte code, which means you hit a special key. 
Read another byte with `sysread(STDIN,$c,1)', and that value tells you what combination it was according to this table: # PC 2-byte keycodes = ^@ + the following: # HEX KEYS # --- ---- # 0F SHF TAB # 10-19 ALT QWERTYUIOP # 1E-26 ALT ASDFGHJKL # 2C-32 ALT ZXCVBNM # 3B-44 F1-F10 # 47-49 HOME,UP,PgUp # 4B LEFT # 4D RIGHT # 4F-53 END,DOWN,PgDn,Ins,Del # 54-5D SHF F1-F10 # 5E-67 CTR F1-F10 # 68-71 ALT F1-F10 # 73-77 CTR LEFT,RIGHT,END,PgDn,HOME # 78-83 ALT 1234567890-= # 84 CTR PgUp This is all trial and error I did a long time ago, I hope I'm reading the file that worked. How can I tell whether there's a character waiting on a filehandle? The very first thing you should do is look into getting the Term::ReadKey extension from CPAN. As we mentioned earlier, it now even has limited support for non-portable (read: not open systems, closed, proprietary, not POSIX, not Unix, etc) systems. You should also check out the Frequently Asked Questions list in comp.unix.* for things like this: the answer is essentially the same. It's very system dependent. Here's one solution that works on BSD systems: sub key_ready { my($rin, $nfd); vec($rin, fileno(STDIN), 1) = 1; return $nfd = select($rin,undef,undef,0); } If you want to find out how many characters are waiting, there's also the FIONREAD ioctl call to be looked at. The *h2ph* tool that comes with Perl tries to convert C include files to Perl code, which can be `require'd. 
FIONREAD ends up defined as a function in the *sys/ioctl.ph* file:

    require 'sys/ioctl.ph';

    $size = pack("L", 0);
    ioctl(FH, FIONREAD(), $size)    or die "Couldn't call ioctl: $!\n";
    $size = unpack("L", $size);

If *h2ph* wasn't installed or doesn't work for you, you can *grep* the include files by hand:

    % grep FIONREAD /usr/include/*/*
    /usr/include/asm/ioctls.h:#define FIONREAD      0x541B

Or write a small C program using the editor of champions:

    % cat > fionread.c
    #include <sys/ioctl.h>
    main() {
        printf("%#08x\n", FIONREAD);
    }
    ^D
    % cc -o fionread fionread.c
    % ./fionread
    0x4004667f

And then hard-code it, leaving porting as an exercise to your successor.

    $FIONREAD = 0x4004667f;         # XXX: opsys dependent

    $size = pack("L", 0);
    ioctl(FH, $FIONREAD, $size)     or die "Couldn't call ioctl: $!\n";
    $size = unpack("L", $size);

FIONREAD requires a filehandle connected to a stream, meaning sockets, pipes, and tty devices work, but *not* files.

How do I do a `tail -f' in perl?

First try

    seek(GWFILE, 0, 1);

The statement `seek(GWFILE, 0, 1)' doesn't change the current position, but it does clear the end-of-file condition on the handle, so that the next `<GWFILE>' makes Perl try again to read something.

If that doesn't work (it relies on features of your stdio implementation), then you need something more like this:

    for (;;) {
        for ($curpos = tell(GWFILE); <GWFILE>; $curpos = tell(GWFILE)) {
            # search for some stuff and put it into files
        }
        # sleep for a while
        seek(GWFILE, $curpos, 0);  # seek to where we had been
    }

If this still doesn't work, look into the POSIX module. POSIX defines the clearerr() method, which can remove the end of file condition on a filehandle. The method: read until end of file, clearerr(), read some more. Lather, rinse, repeat.

There's also a File::Tail module from CPAN.

How do I dup() a filehandle in Perl?

If you check the "open" entry in the perlfunc manpage, you'll see that several of the ways to call open() should do the trick.
For example:

    open(LOG, ">>/tmp/logfile");
    open(STDERR, ">&LOG");

Or even with a literal numeric descriptor:

    $fd = $ENV{MHCONTEXTFD};
    open(MHCONTEXT, "<&=$fd");     # like fdopen(3S)

Note that "<&STDIN" makes a copy, but "<&=STDIN" makes an alias. That means if you close an aliased handle, all aliases become inaccessible. This is not true with a copied one.

Error checking, as always, has been left as an exercise for the reader.

How do I close a file descriptor by number?

This should rarely be necessary, as the Perl close() function is to be used for things that Perl opened itself, even if it was a dup of a numeric descriptor, as with MHCONTEXT above. But if you really have to, you may be able to do this:

    require 'sys/syscall.ph';
    $rc = syscall(&SYS_close, $fd + 0);  # must force numeric
    die "can't sysclose $fd: $!" if $rc == -1;   # syscall returns -1 on failure

Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work?

Whoops! You just put a tab and a formfeed into that filename! Remember that within double quoted strings ("like\this"), the backslash is an escape character. The full list of these is in the section on "Quote and Quote-like Operators" in the perlop manpage. Unsurprisingly, you don't have a file called "c:(tab)emp(formfeed)oo" or "c:(tab)emp(formfeed)oo.exe" on your legacy DOS filesystem.

Either single-quote your strings, or (preferably) use forward slashes. Since all DOS and Windows versions since something like MS-DOS 2.0 or so have treated `/' and `\' the same in a path, you might as well use the one that doesn't clash with Perl -- or the POSIX shell, ANSI C and C++, awk, Tcl, Java, or Python, just to mention a few. POSIX paths are more portable, too.

Why doesn't glob("*.*") get all the files?

Because even on non-Unix ports, Perl's glob function follows standard Unix globbing semantics. You'll need `glob("*")' to get all (non-hidden) files. This makes glob() portable even to legacy systems. Your port may include proprietary globbing functions as well.
Check its documentation for details.

Why does Perl let me delete read-only files? Why does `-i' clobber protected files? Isn't this a bug in Perl?

This is elaborately and painstakingly described in the "Far More Than You Ever Wanted To Know" in http://www.perl.com/CPAN/doc/FMTEYEWTK/file-dir-perms .

The executive summary: learn how your filesystem works. The permissions on a file say what can happen to the data in that file. The permissions on a directory say what can happen to the list of files in that directory. If you delete a file, you're removing its name from the directory (so the operation depends on the permissions of the directory, not of the file). If you try to write to the file, the permissions of the file govern whether you're allowed to.

How do I select a random line from a file?

Here's an algorithm from the Camel Book:

    srand;
    rand($.) < 1 && ($line = $_) while <>;

This has a significant advantage in space over reading the whole file in. A simple proof by induction is available upon request if you doubt its correctness.

Why do I get weird spaces when I print an array of lines?

Saying

    print "@lines\n";

joins together the elements of `@lines' with a space between them. If `@lines' were `("little", "fluffy", "clouds")' then the above statement would print:

    little fluffy clouds

but if each element of `@lines' was a line of text, ending in a newline character `("little\n", "fluffy\n", "clouds\n")', then it would print:

    little
     fluffy
     clouds

If your array contains lines, just print them:

    print @lines;

AUTHOR AND COPYRIGHT

Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington. All rights reserved.

When included as an integrated part of the Standard Distribution of Perl or of its documentation (printed or otherwise), this work is covered under Perl's Artistic Licence. For separate distributions of all or part of this FAQ outside of that, see the perlfaq manpage.

Irrespective of its distribution, all code examples here are public domain.
You are permitted and encouraged to use this code and any derivatives thereof in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit to the FAQ would be courteous but is not required. perlfaq6 section NAME perlfaq6 - Regexps ($Revision: 1.25 $, $Date: 1999/01/08 04:50:47 $) DESCRIPTION This section is surprisingly small because the rest of the FAQ is littered with answers involving regular expressions. For example, decoding a URL and checking whether something is a number are handled with regular expressions, but those answers are found elsewhere in this document (in the section on Data and the Networking one on networking, to be precise). How can I hope to use regular expressions without creating illegible and unmaintainable code? Three techniques can make regular expressions maintainable and understandable. Comments Outside the Regexp Describe what you're doing and how you're doing it, using normal Perl comments. # turn the line into the first word, a colon, and the # number of characters on the rest of the line s/^(\w+)(.*)/ lc($1) . ":" . length($2) /meg; Comments Inside the Regexp The `/x' modifier causes whitespace to be ignored in a regexp pattern (except in a character class), and also allows you to use normal comments there, too. As you can imagine, whitespace and comments help a lot. `/x' lets you turn this: s{<(?:[^>'"]*|".*?"|'.*?')+>}{}gs; into this: s{ < # opening angle bracket (?: # Non-backreffing grouping paren [^>'"] * # 0 or more things that are neither > nor ' nor " | # or else ".*?" # a section between double quotes (stingy match) | # or else '.*?' # a section between single quotes (stingy match) ) + # all occurring one or more times > # closing angle bracket }{}gsx; # replace with nothing, i.e. delete It's still not quite so clear as prose, but it is very useful for describing the meaning of each part of the pattern. 
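To make the point about `/x' concrete, here is a small sketch that runs the compact and the commented forms of the tag-stripping substitution shown above on the same input and checks that they agree. The sample string in $html and the variable names are made up for illustration; like the original pattern, this is a demonstration of layout, not a robust HTML parser.

```perl
use strict;

# Made-up sample input; note the ">" hiding inside a quoted attribute.
my $html = qq{<p class="a>b">Hello, <b>world</b>!</p>\n};

# Compact form, as in the text above.
(my $compact = $html) =~ s{<(?:[^>'"]*|".*?"|'.*?')+>}{}gs;

# The same pattern spread out with /x, one idea per line.
(my $commented = $html) =~ s{
    <                   # opening angle bracket
    (?:                 # non-backreffing grouping paren
        [^>'"] *        # 0 or more things that are neither > nor ' nor "
      |                 # or else
        ".*?"           # a section between double quotes (stingy match)
      |                 # or else
        '.*?'           # a section between single quotes (stingy match)
    ) +                 # all occurring one or more times
    >                   # closing angle bracket
}{}gsx;                 # replace with nothing, i.e. delete

print $compact;
print "both forms agree\n" if $compact eq $commented;
```

The two substitutions differ only in layout; the whitespace and comments in the second one are ignored by the regexp engine because of the `/x' modifier.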
Different Delimiters While we normally think of patterns as being delimited with `/' characters, they can be delimited by almost any character. the perlre manpage describes this. For example, the `s///' above uses braces as delimiters. Selecting another delimiter can avoid quoting the delimiter within the pattern: s/\/usr\/local/\/usr\/share/g; # bad delimiter choice s#/usr/local#/usr/share#g; # better I'm having trouble matching over more than one line. What's wrong? Either you don't have more than one line in the string you're looking at (probably), or else you aren't using the correct modifier(s) on your pattern (possibly). There are many ways to get multiline data into a string. If you want it to happen automatically while reading input, you'll want to set $/ (probably to '' for paragraphs or `undef' for the whole file) to allow you to read more than one line at a time. Read the perlre manpage to help you decide which of `/s' and `/m' (or both) you might want to use: `/s' allows dot to include newline, and `/m' allows caret and dollar to match next to a newline, not just at the end of the string. You do need to make sure that you've actually got a multiline string in there. For example, this program detects duplicate words, even when they span line breaks (but not paragraph ones). For this example, we don't need `/s' because we aren't using dot in a regular expression that we want to cross line boundaries. Neither do we need `/m' because we aren't wanting caret or dollar to match at any point inside the record next to newlines. But it's imperative that $/ be set to something other than the default, or else we won't actually ever have a multiline record read in. 
    $/ = '';                # read in whole paragraph, not just one line
    while ( <> ) {
        while ( /\b([\w'-]+)(\s+\1)+\b/gi ) {   # word starts alpha
            print "Duplicate $1 at paragraph $.\n";
        }
    }

Here's code that finds sentences that begin with "From " (which would be mangled by many mailers):

    $/ = '';                # read in whole paragraph, not just one line
    while ( <> ) {
        while ( /^From /gm ) {  # /m makes ^ match next to \n
            print "leading from in paragraph $.\n";
        }
    }

Here's code that finds everything between START and END in a paragraph:

    undef $/;               # read in whole file, not just one line or paragraph
    while ( <> ) {
        while ( /START(.*?)END/sgm ) {  # /s makes . cross line boundaries
            print "$1\n";
        }
    }

How can I pull out lines between two patterns that are themselves on different lines?

You can use Perl's somewhat exotic `..' operator (documented in the perlop manpage):

    perl -ne 'print if /START/ .. /END/' file1 file2 ...

If you wanted text and not lines, you would use

    perl -0777 -ne 'print "$1\n" while /START(.*?)END/gs' file1 file2 ...

But if you want nested occurrences of `START' through `END', you'll run up against the problem described in the question in this section on matching balanced text.

Here's another example of using `..':

    while (<>) {
        $in_header =   1  .. /^$/;
        $in_body   = /^$/ .. eof();
        # now choose between them
    } continue {
        close ARGV if eof;  # reset $. at the end of each file
    }

I put a regular expression into $/ but it didn't work. What's wrong?

$/ must be a string, not a regular expression. Awk has to be better for something. :-)

Actually, you could do this if you don't mind reading the whole file into memory:

    undef $/;
    @records = split /your_pattern/, <FH>;

The Net::Telnet module (available from CPAN) has the capability to wait for a pattern in the input stream, or timeout if it doesn't appear within a certain time.

    ## Create a file with three lines.
    open FH, ">file";
    print FH "The first line\nThe second line\nThe third line\n";
    close FH;

    ## Get a read/write filehandle to it.
$fh = new FileHandle "+ $fh); ## Search for the second line and print out the third. $file->waitfor('/second line\n/'); print $file->getline; How do I substitute case insensitively on the LHS, but preserving case on the RHS? It depends on what you mean by "preserving case". The following script makes the substitution have the same case, letter by letter, as the original. If the substitution has more characters than the string being substituted, the case of the last character is used for the rest of the substitution. # Original by Nathan Torkington, massaged by Jeffrey Friedl # sub preserve_case($$) { my ($old, $new) = @_; my ($state) = 0; # 0 = no change; 1 = lc; 2 = uc my ($i, $oldlen, $newlen, $c) = (0, length($old), length($new)); my ($len) = $oldlen < $newlen ? $oldlen : $newlen; for ($i = 0; $i < $len; $i++) { if ($c = substr($old, $i, 1), $c =~ /[\W\d_]/) { $state = 0; } elsif (lc $c eq $c) { substr($new, $i, 1) = lc(substr($new, $i, 1)); $state = 1; } else { substr($new, $i, 1) = uc(substr($new, $i, 1)); $state = 2; } } # finish up with any remaining new (for when new is longer than old) if ($newlen > $oldlen) { if ($state == 1) { substr($new, $oldlen) = lc(substr($new, $oldlen)); } elsif ($state == 2) { substr($new, $oldlen) = uc(substr($new, $oldlen)); } } return $new; } $a = "this is a TEsT case"; $a =~ s/(test)/preserve_case($1, "success")/gie; print "$a\n"; This prints: this is a SUcCESS case How can I make `\w' match national character sets? See the perllocale manpage. How can I match a locale-smart version of `/[a-zA-Z]/'? One alphabetic character would be `/[^\W\d_]/', no matter what locale you're in. Non-alphabetics would be `/[\W\d_]/' (assuming you don't consider an underscore a letter). How can I quote a variable to use in a regexp? The Perl parser will expand $variable and @variable references in regular expressions unless the delimiter is a single quote. 
Remember, too, that the right-hand side of a `s///' substitution is considered a double-quoted string (see the perlop manpage for more details). Remember also that any regexp special characters will be acted on unless you precede the substitution with \Q. Here's an example: $string = "to die?"; $lhs = "die?"; $rhs = "sleep no more"; $string =~ s/\Q$lhs/$rhs/; # $string is now "to sleep no more" Without the \Q, the regexp would also spuriously match "di". What is `/o' really for? Using a variable in a regular expression match forces a re-evaluation (and perhaps recompilation) each time through. The `/o' modifier locks in the regexp the first time it's used. This always happens in a constant regular expression, and in fact, the pattern was compiled into the internal format at the same time your entire program was. Use of `/o' is irrelevant unless variable interpolation is used in the pattern, and if so, the regexp engine will neither know nor care whether the variables change after the pattern is evaluated the *very first* time. `/o' is often used to gain an extra measure of efficiency by not performing subsequent evaluations when you know it won't matter (because you know the variables won't change), or more rarely, when you don't want the regexp to notice if they do. For example, here's a "paragrep" program: $/ = ''; # paragraph mode $pat = shift; while (<>) { print if /$pat/o; } How do I use a regular expression to strip C style comments from a file? While this actually can be done, it's much harder than you'd think. For example, this one-liner perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c will work in many but not all cases. You see, it's too simple-minded for certain kinds of C programs, in particular, those with what appear to be comments in quoted strings. 
For that, you'd need something like this, created by Jeffrey Friedl: $/ = undef; $_ = <>; s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|\n+|.[^/"'\\]*)#$2#g; print; This could, of course, be more legibly written with the `/x' modifier, adding whitespace and comments. Can I use Perl regular expressions to match balanced text? Although Perl regular expressions are more powerful than "mathematical" regular expressions, because they feature conveniences like backreferences (`\1' and its ilk), they still aren't powerful enough. You still need to use non-regexp techniques to parse balanced text, such as the text enclosed between matching parentheses or braces, for example. An elaborate subroutine (for 7-bit ASCII only) to pull out balanced and possibly nested single chars, like ``' and `'', `{' and `}', or `(' and `)' can be found in http://www.perl.com/CPAN/authors/id/TOMC/scripts/pull_quotes.gz . The C::Scan module from CPAN contains such subs for internal usage, but they are undocumented. What does it mean that regexps are greedy? How can I get around it? Most people mean that greedy regexps match as much as they can. Technically speaking, it's actually the quantifiers (`?', `*', `+', `{}') that are greedy rather than the whole pattern; Perl prefers local greed and immediate gratification to overall greed. To get non-greedy versions of the same quantifiers, use (`??', `*?', `+?', `{}?'). An example: $s1 = $s2 = "I am very very cold"; $s1 =~ s/ve.*y //; # I am cold $s2 =~ s/ve.*?y //; # I am very cold Notice how the second substitution stopped matching as soon as it encountered "y ". The `*?' quantifier effectively tells the regular expression engine to find a match as quickly as possible and pass control on to whatever is next in line, like you would if you were playing hot potato. How do I process each word on each line? 
Use the split function: while (<>) { foreach $word ( split ) { # do something with $word here } } Note that this isn't really a word in the English sense; it's just chunks of consecutive non-whitespace characters. To work with only alphanumeric sequences, you might consider while (<>) { foreach $word (m/(\w+)/g) { # do something with $word here } } How can I print out a word-frequency or line-frequency summary? To do this, you have to parse out each word in the input stream. We'll pretend that by word you mean chunk of alphabetics, hyphens, or apostrophes, rather than the non-whitespace chunk idea of a word given in the previous question: while (<>) { while ( /(\b[^\W_\d][\w'-]+\b)/g ) { # misses "`sheep'" $seen{$1}++; } } while ( ($word, $count) = each %seen ) { print "$count $word\n"; } If you wanted to do the same thing for lines, you wouldn't need a regular expression: while (<>) { $seen{$_}++; } while ( ($line, $count) = each %seen ) { print "$count $line"; } If you want these output in a sorted order, see the section on Hashes. How can I do approximate matching? See the module String::Approx available from CPAN. How do I efficiently match many regular expressions at once? The following is extremely inefficient: # slow but obvious way @popstates = qw(CO ON MI WI MN); while (defined($line = <>)) { for $state (@popstates) { if ($line =~ /\b$state\b/i) { print $line; last; } } } That's because Perl has to recompile all those patterns for each of the lines of the file. As of the 5.005 release, there's a much better approach, one which makes use of the new `qr//' operator: # use spiffy new qr// operator, with /i flag even use 5.005; @popstates = qw(CO ON MI WI MN); @poppats = map { qr/\b$_\b/i } @popstates; while (defined($line = <>)) { for $patobj (@poppats) { print $line if $line =~ /$patobj/; } } Why don't word-boundary searches with `\b' work for me? 
Two common misconceptions are that `\b' is a synonym for `\s+', and that it's the edge between whitespace characters and non-whitespace characters. Neither is correct. `\b' is the place between a `\w' character and a `\W' character (that is, `\b' is the edge of a "word"). It's a zero-width assertion, just like `^', `$', and all the other anchors, so it doesn't consume any characters. The perlre manpage describes the behaviour of all the regexp metacharacters.

Here are examples of the incorrect application of `\b', with fixes:

    "two words" =~ /(\w+)\b(\w+)/;          # WRONG
    "two words" =~ /(\w+)\s+(\w+)/;         # right

    " =matchless= text" =~ /\b=(\w+)=\b/;   # WRONG
    " =matchless= text" =~ /=(\w+)=/;       # right

Although they may not do what you thought they did, `\b' and `\B' can still be quite useful. For an example of the correct use of `\b', see the example of matching duplicate words over multiple lines.

An example of using `\B' is the pattern `\Bis\B'. This will find occurrences of "is" on the insides of words only, as in "thistle", but not "this" or "island".

Why does using $&, $`, or $' slow my program down?

Because once Perl sees that you need one of these variables anywhere in the program, it has to provide them on each and every pattern match. The same mechanism that handles these provides for the use of $1, $2, etc., so you pay the same price for each regexp that contains capturing parentheses. But if you never use $&, etc., in your script, then regexps *without* capturing parentheses won't be penalized. So avoid $&, $', and $` if you can, but if you can't, once you've used them at all, use them at will because you've already paid the price. Remember that some algorithms really appreciate them. As of the 5.005 release, the $& variable is no longer "expensive" the way the other two are.

What good is `\G' in a regular expression?
The notation `\G' is used in a match or substitution in conjunction with the `/g' modifier (and ignored if there's no `/g') to anchor the regular expression to the point just past where the last match occurred, i.e. the pos() point. A failed match resets the position of `\G' unless the `/c' modifier is in effect.

For example, suppose you had a line of text quoted in standard mail and Usenet notation (that is, with leading `>' characters), and you want to change each leading `>' into a corresponding `:'. You could do so in this way:

    s/^(>+)/':' x length($1)/gem;

Or, using `\G', the much simpler (and faster):

    s/\G>/:/g;

A more sophisticated use might involve a tokenizer. The following lex-like example is courtesy of Jeffrey Friedl. It did not work in 5.003 due to bugs in that release, but does work in 5.004 or better. (Note the use of `/c', which prevents a failed match with `/g' from resetting the search position back to the beginning of the string.)

    while (<>) {
      chomp;
      PARSER: {
           m/ \G( \d+\b    )/gcx    && do { print "number: $1\n";  redo; };
           m/ \G( \w+      )/gcx    && do { print "word:   $1\n";  redo; };
           m/ \G( \s+      )/gcx    && do { print "space:  $1\n";  redo; };
           m/ \G( [^\w\d]+ )/gcx    && do { print "other:  $1\n";  redo; };
      }
    }

Of course, that could have been written as

    while (<>) {
      chomp;
      PARSER: {
           if ( /\G( \d+\b    )/gcx ) {
                print "number: $1\n";
                redo PARSER;
           }
           if ( /\G( \w+      )/gcx ) {
                print "word: $1\n";
                redo PARSER;
           }
           if ( /\G( \s+      )/gcx ) {
                print "space: $1\n";
                redo PARSER;
           }
           if ( /\G( [^\w\d]+ )/gcx ) {
                print "other: $1\n";
                redo PARSER;
           }
      }
    }

But then you lose the vertical alignment of the regular expressions.

Are Perl regexps DFAs or NFAs? Are they POSIX compliant?

While it's true that Perl's regular expressions resemble the DFAs (deterministic finite automata) of the egrep(1) program, they are in fact implemented as NFAs (non-deterministic finite automata) to allow backtracking and backreferencing. And they aren't POSIX-style either, because those guarantee worst-case behavior for all cases.
(It seems that some people prefer guarantees of consistency, even when what's guaranteed is slowness.) See the book "Mastering Regular Expressions" (from O'Reilly) by Jeffrey Friedl for all the details you could ever hope to know on these matters (a full citation appears in the perlfaq2 manpage). What's wrong with using grep or map in a void context? Both grep and map build a return list, regardless of their context. This means you're making Perl go to the trouble of building up a return list that you then just ignore. That's no way to treat a programming language, you insensitive scoundrel! How can I match strings with multibyte characters? This is hard, and there's no good way. Perl does not directly support wide characters. It pretends that a byte and a character are synonymous. The following set of approaches was offered by Jeffrey Friedl, whose article in issue #5 of The Perl Journal talks about this very matter. Let's suppose you have some weird Martian encoding where pairs of ASCII uppercase letters encode single Martian letters (i.e. the two bytes "CV" make a single Martian letter, as do the two bytes "SG", "VS", "XX", etc.). Other bytes represent single characters, just like ASCII. So, the string of Martian "I am CVSGXX!" uses 12 bytes to encode the nine characters 'I', ' ', 'a', 'm', ' ', 'CV', 'SG', 'XX', '!'. Now, say you want to search for the single character `/GX/'. Perl doesn't know about Martian, so it'll find the two bytes "GX" in the "I am CVSGXX!" string, even though that character isn't there: it just looks like it is because "SG" is next to "XX", but there's no real "GX". This is a big problem. Here are a few ways, all painful, to deal with it: $martian =~ s/([A-Z][A-Z])/ $1 /g; # Make sure adjacent ``martian'' bytes # are no longer adjacent. 
    print "found GX!\n" if $martian =~ /GX/;

Or like this:

    @chars = $martian =~ m/([A-Z][A-Z]|[^A-Z])/g;
    # above is conceptually similar to:     @chars = $text =~ m/(.)/g;
    #
    foreach $char (@chars) {
        print "found GX!\n", last if $char eq 'GX';
    }

Or like this:

    while ($martian =~ m/\G([A-Z][A-Z]|.)/gs) {  # \G probably unneeded
        print "found GX!\n", last if $1 eq 'GX';
    }

Or like this:

    die "sorry, Perl doesn't (yet) have Martian support )-:\n";

There are many double- (and multi-) byte encodings commonly used these days. Some versions of these have 1-, 2-, 3-, and 4-byte characters, all mixed.

How do I match a pattern that is supplied by the user?

Well, if it's really a pattern, then just use

    chomp($pattern = <STDIN>);
    if ($line =~ /$pattern/) { }

Or, since you have no guarantee that your user entered a valid regular expression, trap the exception this way:

    if (eval { $line =~ /$pattern/ }) { }

But if all you really want is to search for a string, not a pattern, then you should either use the index() function, which is made for string searching, or if you can't be disabused of using a pattern match on a non-pattern, then be sure to use `\Q'...`\E', documented in the perlre manpage.

    chomp($pattern = <STDIN>);    # drop the trailing newline before matching

    open (FILE, $input) or die "Couldn't open input $input: $!; aborting";
    while (<FILE>) {
        print if /\Q$pattern\E/;
    }
    close FILE;

AUTHOR AND COPYRIGHT

Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington. All rights reserved.

When included as part of the Standard Version of Perl, or as part of its complete documentation whether printed or otherwise, this work may be distributed only under the terms of Perl's Artistic Licence. Any distribution of this file or derivatives thereof *outside* of that package require that special arrangements be made with copyright holder.

Irrespective of its distribution, all code examples in this file are hereby placed into the public domain. You are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit.
A simple comment in the code giving credit would be courteous but is not required.

perlfaq7 section

NAME

perlfaq7 - Perl Language Issues ($Revision: 1.24 $, $Date: 1999/01/08 05:32:11 $)

DESCRIPTION

This section deals with general Perl language issues that don't clearly fit into any of the other sections.

Can I get a BNF/yacc/RE for the Perl language?

There is no BNF, but you can paw your way through the yacc grammar in perly.y in the source distribution if you're particularly brave. The grammar relies on very smart tokenizing code, so be prepared to venture into toke.c as well.

In the words of Chaim Frenkel: "Perl's grammar can not be reduced to BNF. The work of parsing perl is distributed between yacc, the lexer, smoke and mirrors."

What are all these $@%* punctuation signs, and how do I know when to use them?

They are type specifiers, as detailed in the perldata manpage:

    $ for scalar values (number, string or reference)
    @ for arrays
    % for hashes (associative arrays)
    * for all types of that symbol name. In version 4 you used them like
      pointers, but in modern perls you can just use references.

While there are a few places where you don't actually need these type specifiers, you should always use them.

A couple of others that you're likely to encounter that aren't really type specifiers are:

    <> are used for inputting a record from a filehandle.
    \  takes a reference to something.

Note that <FILE> is *neither* the type specifier for files nor the name of the handle. It is the `<>' operator applied to the handle FILE. It reads one line (well, record - see the section on "$/" in the perlvar manpage) from the handle FILE in scalar context, or *all* lines in list context. When performing open, close, or any other operation besides `<>' on files, or even talking about the handle, do *not* use the brackets. These are correct: `eof(FH)', `seek(FH, 0, 2)' and "copying from STDIN to FILE".

Do I always/never have to quote my strings or use semicolons and commas?
Normally, a bareword doesn't need to be quoted, but in most cases probably should be (and must be under `use strict'). But a hash key consisting of a simple word (that isn't the name of a defined subroutine) and the left-hand operand to the `=>' operator both count as though they were quoted: This is like this ------------ --------------- $foo{line} $foo{"line"} bar => stuff "bar" => stuff The final semicolon in a block is optional, as is the final comma in a list. Good style (see the perlstyle manpage) says to put them in except for one-liners: if ($whoops) { exit 1 } @nums = (1, 2, 3); if ($whoops) { exit 1; } @lines = ( "There Beren came from mountains cold", "And lost he wandered under leaves", ); How do I skip some return values? One way is to treat the return values as a list and index into it: $dir = (getpwnam($user))[7]; Another way is to use undef as an element on the left-hand-side: ($dev, $ino, undef, undef, $uid, $gid) = stat($file); How do I temporarily block warnings? The `$^W' variable (documented in the perlvar manpage) controls runtime warnings for a block: { local $^W = 0; # temporarily turn off warnings $a = $b + $c; # I know these might be undef } Note that like all the punctuation variables, you cannot currently use my() on `$^W', only local(). A new `use warnings' pragma is in the works to provide finer control over all this. The curious should check the perl5-porters mailing list archives for details. What's an extension? A way of calling compiled C code from Perl. Reading the perlxstut manpage is a good place to learn more about extensions. Why do Perl operators have different precedence than C operators? Actually, they don't. All C operators that Perl copies have the same precedence in Perl as they do in C. The problem is with operators that C doesn't have, especially functions that give a list context to everything on their right, eg print, chmod, exec, and so on. 
Such functions are called "list operators" and appear as such in the precedence table in the perlop manpage.

A common mistake is to write:

    unlink $file || die "snafu";

This gets interpreted as:

    unlink ($file || die "snafu");

To avoid this problem, either put in extra parentheses or use the super low precedence `or' operator:

    (unlink $file) || die "snafu";
    unlink $file or die "snafu";

The "English" operators (`and', `or', `xor', and `not') deliberately have precedence lower than that of list operators for just such situations as the one above.

Another operator with surprising precedence is exponentiation. It binds more tightly even than unary minus, making `-2**2' produce a negative not a positive four. It is also right-associating, meaning that `2**3**2' is two raised to the ninth power, not eight squared.

Although it has the same precedence as in C, Perl's `?:' operator produces an lvalue. This assigns $x to either $a or $b, depending on the trueness of $maybe:

    ($maybe ? $a : $b) = $x;

How do I declare/create a structure?

In general, you don't "declare" a structure. Just use a (probably anonymous) hash reference. See the perlref manpage and the perldsc manpage for details. Here's an example:

    $person = {};                   # new anonymous hash
    $person->{AGE}  = 24;           # set field AGE to 24
    $person->{NAME} = "Nat";        # set field NAME to "Nat"

If you're looking for something a bit more rigorous, try the perltoot manpage.

How do I create a module?

A module is a package that lives in a file of the same name. For example, the Hello::There module would live in Hello/There.pm. For details, read the perlmod manpage. You'll also find the Exporter manpage helpful. If you're writing a C or mixed-language module with both C and Perl, then you should study the perlxstut manpage.

Here's a convenient template you might wish to use when starting your own module. Make sure to change the names appropriately.
package Some::Module; # assumes Some/Module.pm use strict; BEGIN { use Exporter (); use vars qw($VERSION @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS); ## set the version for version checking; uncomment to use ## $VERSION = 1.00; # if using RCS/CVS, this next line may be preferred, # but beware two-digit versions. $VERSION = do{my@r=q$Revision: 1.24 $=~/\d+/g;sprintf '%d.'.'%02d'x$#r,@r}; @ISA = qw(Exporter); @EXPORT = qw(&func1 &func2 &func3); %EXPORT_TAGS = ( ); # eg: TAG => [ qw!name1 name2! ], # your exported package globals go here, # as well as any optionally exported functions @EXPORT_OK = qw($Var1 %Hashit); } use vars @EXPORT_OK; # non-exported package globals go here use vars qw( @more $stuff ); # initialize package globals, first exported ones $Var1 = ''; %Hashit = (); # then the others (which are still accessible as $Some::Module::stuff) $stuff = ''; @more = (); # all file-scoped lexicals must be created before # the functions below that use them. # file-private lexicals go here my $priv_var = ''; my %secret_hash = (); # here's a file-private function as a closure, # callable as &$priv_func; it cannot be prototyped. my $priv_func = sub { # stuff goes here. }; # make all your functions, whether exported or not; # remember to put something interesting in the {} stubs sub func1 {} # no prototype sub func2() {} # proto'd void sub func3($$) {} # proto'd to 2 scalars # this one isn't exported, but could be called! sub func4(\%) {} # proto'd to 1 hash ref END { } # module clean-up code here (global destructor) 1; # modules must return true The h2xs program will create stubs for all the important stuff for you: % h2xs -XA -n My::Module How do I create a class? See the perltoot manpage for an introduction to classes and objects, as well as the perlobj manpage and the perlbot manpage. How can I tell if a variable is tainted? See the section on "Laundering and Detecting Tainted Data" in the perlsec manpage. 
Here's an example (which doesn't use any system calls, because the kill() is given no processes to signal): sub is_tainted { return ! eval { join('',@_), kill 0; 1; }; } This is not `-w' clean, however. There is no `-w' clean way to detect taintedness - take this as a hint that you should untaint all possibly- tainted data. What's a closure? Closures are documented in the perlref manpage. *Closure* is a computer science term with a precise but hard-to-explain meaning. Closures are implemented in Perl as anonymous subroutines with lasting references to lexical variables outside their own scopes. These lexicals magically refer to the variables that were around when the subroutine was defined (deep binding). Closures make sense in any programming language where you can have the return value of a function be itself a function, as you can in Perl. Note that some languages provide anonymous functions but are not capable of providing proper closures; the Python language, for example. For more information on closures, check out any textbook on functional programming. Scheme is a language that not only supports but encourages closures. Here's a classic function-generating function: sub add_function_generator { return sub { shift + shift }; } $add_sub = add_function_generator(); $sum = $add_sub->(4,5); # $sum is 9 now. The closure works as a *function template* with some customization slots left out to be filled later. The anonymous subroutine returned by add_function_generator() isn't technically a closure because it refers to no lexicals outside its own scope. Contrast this with the following make_adder() function, in which the returned anonymous function contains a reference to a lexical variable outside the scope of that function itself. Such a reference requires that Perl return a proper closure, thus locking in for all time the value that the lexical had when the function was created. 
    sub make_adder {
        my $addpiece = shift;
        return sub { shift + $addpiece };
    }

    $f1 = make_adder(20);
    $f2 = make_adder(555);

Now `&$f1($n)' is always 20 plus whatever $n you pass in, whereas `&$f2($n)' is always 555 plus whatever $n you pass in. The $addpiece in the closure sticks around.

Closures are often used for less esoteric purposes. For example, when you want to pass in a bit of code into a function:

    my $line;
    timeout( 30, sub { $line = <STDIN> } );

If the code to execute had been passed in as a string, `'$line = <STDIN>'', there would have been no way for the hypothetical timeout() function to access the lexical variable $line back in its caller's scope.

What is variable suicide and how can I prevent it?

Variable suicide is when you (temporarily or permanently) lose the value of a variable. It is caused by scoping through my() and local() interacting with either closures or aliased foreach() iterator variables and subroutine arguments. It used to be easy to inadvertently lose a variable's value this way, but now it's much harder. Take this code:

    my $f = "foo";
    sub T {
      while ($i++ < 3) { my $f = $f; $f .= "bar"; print $f, "\n" }
    }
    T;
    print "Finally $f\n";

The $f that has "bar" added to it three times should be a new `$f' (`my $f' should create a new local variable each time through the loop). It isn't, however. This is a bug, and will be fixed.

How can I pass/return a {Function, FileHandle, Array, Hash, Method, Regexp}?

With the exception of regexps, you need to pass references to these objects. See the section on "Pass by Reference" in the perlsub manpage for this particular question, and the perlref manpage for information on references.

Passing Variables and Functions

Regular variables and functions are quite easy: just pass in a reference to an existing or anonymous variable or function:

    func( \$some_scalar );

    func( \@some_array  );
    func( [ 1 .. 10 ]   );

    func( \%some_hash   );
    func( { this => 10, that => 20 }   );

    func( \&some_func   );
    func( sub { $_[0] ** $_[1] }   );

Passing Filehandles

To pass filehandles to subroutines, use the `*FH' or `\*FH' notations. These are "typeglobs" - see the section on "Typeglobs and Filehandles" in the perldata manpage and especially the section on "Pass by Reference" in the perlsub manpage for more information.

Here's an excerpt:

If you're passing around filehandles, you could usually just use the bare typeglob, like *STDOUT, but typeglob references would be better because they'll still work properly under `use strict 'refs''. For example:

    splutter(\*STDOUT);
    sub splutter {
        my $fh = shift;
        print $fh "her um well a hmmm\n";
    }

    $rec = get_rec(\*STDIN);
    sub get_rec {
        my $fh = shift;
        return scalar <$fh>;
    }

If you're planning on generating new filehandles, you could do this:

    sub openit {
        my $path = shift;       # e.g. '< /etc/motd'
        local *FH;
        return open (FH, $path) ? *FH : undef;
    }
    $fh = openit('< /etc/motd');
    print <$fh>;

Passing Regexps

To pass regexps around, you'll need to either use one of the highly experimental regular expression modules from CPAN (Nick Ing-Simmons's Regexp or Ilya Zakharevich's Devel::Regexp), pass around strings and use an exception-trapping eval, or else be very, very clever. Here's an example of how to pass in a string to be regexp compared:

    sub compare($$) {
        my ($val1, $regexp) = @_;
        my $retval = eval { $val1 =~ /$regexp/ };
        die if $@;
        return $retval;
    }

    $match = compare("old McDonald", q/d.*D/);

Make sure you never say something like this:

    return eval "\$val =~ /$regexp/";   # WRONG

or someone can sneak shell escapes into the regexp due to the double interpolation of the eval and the double-quoted string. For example:

    $pattern_of_evil = 'danger ${ system("rm -rf * &") } danger';

    eval "\$string =~ /$pattern_of_evil/";

Those preferring to be very, very clever might see the O'Reilly book, *Mastering Regular Expressions*, by Jeffrey Friedl.
Page 273's Build_MatchMany_Function() is particularly interesting. A complete citation of this book is given in the perlfaq2 manpage.

Passing Methods

To pass an object method into a subroutine, you can do this:

    call_a_lot(10, $some_obj, "methname");
    sub call_a_lot {
        my ($count, $widget, $trick) = @_;
        for (my $i = 0; $i < $count; $i++) {
            $widget->$trick();
        }
    }

Or you can use a closure to bundle up the object and its method call and arguments:

    my $whatnot = sub { $some_obj->obfuscate(@args) };
    func($whatnot);
    sub func {
        my $code = shift;
        &$code();
    }

You could also investigate the can() method in the UNIVERSAL class (part of the standard perl distribution).

How do I create a static variable?

As with most things in Perl, TMTOWTDI. What is a "static variable" in other languages could be either a function-private variable (visible only within a single function, retaining its value between calls to that function), or a file-private variable (visible only to functions within the file it was declared in) in Perl.

Here's code to implement a function-private variable:

    BEGIN {
        my $counter = 42;
        sub prev_counter { return --$counter }
        sub next_counter { return $counter++ }
    }

Now prev_counter() and next_counter() share a private variable $counter that was initialized at compile time.

To declare a file-private variable, you'll still use a my(), putting it at the outer scope level at the top of the file. Assume this is in file Pax.pm:

    package Pax;
    my $started = scalar(localtime(time()));

    sub begun { return $started }

When `use Pax' or `require Pax' loads this module, the variable will be initialized. It won't get garbage-collected the way most variables going out of scope do, because the begun() function cares about it, but no one else can get it. It is not called $Pax::started because its scope is unrelated to the package. It's scoped to the file.
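As a rough illustration of the file-private variable described above, the following sketch puts the Pax package and a main program into a single file. The single-file layout and the printed labels are assumptions made for the demonstration; in the FAQ's scenario Pax would live in its own Pax.pm.

```perl
use strict;
use warnings;
no warnings 'once';    # $Pax::started is deliberately mentioned only once

package Pax;

# File-scoped lexical: it is never entered in any package's symbol table.
my $started = scalar(localtime(time()));

sub begun { return $started }

package main;

# The accessor still sees the lexical it was compiled against.
print "started at:          ", Pax::begun(), "\n";

# No package variable of that name was ever created.
print "\$Pax::started is:    ",
      (defined $Pax::started ? "defined" : "undefined"), "\n";

# The lexical's scope is the file, not the package, so it is visible here too.
print "same value in main:  ", ($started eq Pax::begun() ? "yes" : "no"), "\n";
```

Because the sketch keeps everything in one file, the main package can still see $started; move Pax into its own file and only begun() can reach it.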
You could conceivably have several packages in that same file all accessing the same private variable, but another file with the same package couldn't get to it.

See the section on "Persistent Private Variables" in the perlsub manpage for details.

What's the difference between dynamic and lexical (static) scoping? Between local() and my()?

`local($x)' saves away the old value of the global variable `$x', and assigns a new value for the duration of the subroutine, *which is visible in other functions called from that subroutine*. This is done at run-time, so is called dynamic scoping. local() always affects global variables, also called package variables or dynamic variables.

`my($x)' creates a new variable that is only visible in the current subroutine. This is done at compile-time, so is called lexical or static scoping. my() always affects private variables, also called lexical variables or (improperly) static(ly scoped) variables.

For instance:

    sub visible {
        print "var has value $var\n";
    }

    sub dynamic {
        local $var = 'local';   # new temporary value for the still-global
        visible();              #   variable called $var
    }

    sub lexical {
        my $var = 'private';    # new private variable, $var
        visible();              # (invisible outside of sub scope)
    }

    $var = 'global';

    visible();                  # prints global
    dynamic();                  # prints local
    lexical();                  # prints global

Notice how at no point does the value "private" get printed. That's because $var only has that value within the block of the lexical() function, and it is hidden from the called subroutine.

In summary, local() doesn't make what you think of as private, local variables. It gives a global variable a temporary value. my() is what you're looking for if you want private variables.

See the section on "Private Variables via my()" in the perlsub manpage and the section on "Temporary Values via local()" in the perlsub manpage for excruciating details.

How can I access a dynamic variable while a similarly named lexical is in scope?
You can do this via symbolic references, provided you haven't set `use strict "refs"'. So instead of $var, use `${'var'}'.

    local $var = "global";
    my    $var = "lexical";

    print "lexical is $var\n";

    no strict 'refs';
    print "global  is ${'var'}\n";

If you know your package, you can just mention it explicitly, as in $Some_Pack::var. Note that the notation $::var is *not* the dynamic $var in the current package, but rather the one in the `main' package, as though you had written $main::var. Specifying the package directly makes you hard-code its name, but it executes faster and avoids running afoul of `use strict "refs"'.

What's the difference between deep and shallow binding?

In deep binding, lexical variables mentioned in anonymous subroutines are the same ones that were in scope when the subroutine was created. In shallow binding, they are whichever variables with the same names happen to be in scope when the subroutine is called. Perl always uses deep binding of lexical variables (i.e., those created with my()). However, dynamic variables (aka global, local, or package variables) are effectively shallowly bound. Consider this just one more reason not to use them. See the answer to the section on "What's a closure?".

Why doesn't "my($foo) = <FILE>;" work right?

`my()' and `local()' give list context to the right hand side of `='. The <FH> read operation, like so many of Perl's functions and operators, can tell which context it was called in and behaves appropriately. In general, the scalar() function can help. This function does nothing to the data itself (contrary to popular myth) but rather tells its argument to behave in whatever its scalar fashion is. If that function doesn't have a defined scalar behavior, this of course doesn't help you (such as with sort()).
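The effect of context described above can be seen without a filehandle. The sketch below is only an illustration: lines_or_count() is a hypothetical helper, not something from the FAQ, and it uses the builtin wantarray() to behave differently in list and scalar context, much as a filehandle read does.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list in list context and a count in
# scalar context, detected with the builtin wantarray().
sub lines_or_count {
    my @lines = ("first\n", "second\n", "third\n");
    return wantarray ? @lines : scalar(@lines);
}

my ($in_list) = lines_or_count();          # parentheses give list context
my $in_scalar = lines_or_count();          # no parentheses: scalar context
my ($forced)  = scalar(lines_or_count());  # scalar() overrides the list context

print "list:   $in_list";       # first element of the returned list
print "scalar: $in_scalar\n";   # the count
print "forced: $forced\n";      # the count again
```

The parenthesized assignment receives the first list element, while the other two receive the scalar result.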
To enforce scalar context in this particular case, however, you need merely omit the parentheses:

    local($foo) = <FILE>;           # WRONG
    local($foo) = scalar(<FILE>);   # ok
    local $foo  = <FILE>;           # right

You should probably be using lexical variables anyway, although the issue is the same here:

    my($foo) = <FILE>;      # WRONG
    my $foo  = <FILE>;      # right

How do I redefine a builtin function, operator, or method?

Why do you want to do that? :-)

If you want to override a predefined function, such as open(), then you'll have to import the new definition from a different module. See the section on "Overriding Builtin Functions" in the perlsub manpage. There's also an example in the section on "Class::Template" in the perltoot manpage.

If you want to overload a Perl operator, such as `+' or `**', then you'll want to use the `use overload' pragma, documented in the overload manpage.

If you're talking about obscuring method calls in parent classes, see the section on "Overridden Methods" in the perltoot manpage.

What's the difference between calling a function as &foo and foo()?

When you call a function as `&foo', you allow that function access to your current @_ values, and you by-pass prototypes. That means that the function doesn't get an empty @_, it gets yours! While not strictly speaking a bug (it's documented that way in the perlsub manpage), it would be hard to consider this a feature in most cases.

When you call your function as `&foo()', then you *do* get a new @_, but prototyping is still circumvented.

Normally, you want to call a function using `foo()'. You may only omit the parentheses if the function is already known to the compiler because it already saw the definition (`use' but not `require'), or via a forward reference or `use subs' declaration. Even in this case, you get a clean @_ without any of the old values leaking through where they don't belong.

How do I create a switch or case statement?

This is explained in more depth in the perlsyn manpage.
Briefly, there's no official case statement, because of the variety of tests possible in Perl (numeric comparison, string comparison, glob comparison, regexp matching, overloaded comparisons, ...). Larry couldn't decide how best to do this, so he left it out, even though it's been on the wish list since perl1.

The general answer is to write a construct like this:

    for ($variable_to_test) {
        if    (/pat1/)  { }     # do something
        elsif (/pat2/)  { }     # do something else
        elsif (/pat3/)  { }     # do something else
        else            { }     # default
    }

Here's a simple example of a switch based on pattern matching, this time lined up in a way to make it look more like a switch statement. We'll do a multi-way conditional based on the type of reference stored in $whatchamacallit:

    SWITCH: for (ref $whatchamacallit) {

        /^$/      && die "not a reference";

        /SCALAR/  && do {
                         print_scalar($$ref);
                         last SWITCH;
                     };

        /ARRAY/   && do {
                         print_array(@$ref);
                         last SWITCH;
                     };

        /HASH/    && do {
                         print_hash(%$ref);
                         last SWITCH;
                     };

        /CODE/    && do {
                         warn "can't print function ref";
                         last SWITCH;
                     };

        # DEFAULT

        warn "User defined type skipped";

    }

See `perlsyn/"Basic BLOCKs and Switch Statements"' for many other examples in this style.

Sometimes you should change the positions of the constant and the variable. For example, let's say you wanted to test which of many answers you were given, but in a case-insensitive way that also allows abbreviations.
You can use the following technique if the strings all start with different characters, or if you want to arrange the matches so that one takes precedence over another, as `"SEND"' has precedence over `"STOP"' here:

    chomp($answer = <>);
    if    ("SEND"  =~ /^\Q$answer/i) { print "Action is send\n"  }
    elsif ("STOP"  =~ /^\Q$answer/i) { print "Action is stop\n"  }
    elsif ("ABORT" =~ /^\Q$answer/i) { print "Action is abort\n" }
    elsif ("LIST"  =~ /^\Q$answer/i) { print "Action is list\n"  }
    elsif ("EDIT"  =~ /^\Q$answer/i) { print "Action is edit\n"  }

A totally different approach is to create a hash of function references.

    my %commands = (
        "happy" => \&joy,
        "sad"   => \&sullen,
        "done"  => sub { die "See ya!" },
        "mad"   => \&angry,
    );

    print "How are you? ";
    chomp($string = <STDIN>);
    if ($commands{$string}) {
        $commands{$string}->();
    } else {
        print "No such command: $string\n";
    }

How can I catch accesses to undefined variables/functions/methods?

The AUTOLOAD method, discussed in the section on "Autoloading" in the perlsub manpage and the section on "AUTOLOAD: Proxy Methods" in the perltoot manpage, lets you capture calls to undefined functions and methods.

When it comes to undefined variables that would trigger a warning under `-w', you can use a handler to trap the pseudo-signal `__WARN__' like this:

    $SIG{__WARN__} = sub {

        for ( $_[0] ) {         # voici un switch statement

            /Use of uninitialized value/  && do {
                # promote warning to a fatal
                die $_;
            };

            # other warning cases to catch could go here;

            warn $_;
        }

    };

Why can't a method included in this same file be found?

Some possible reasons: your inheritance is getting confused, you've misspelled the method name, or the object is of the wrong type. Check out the perltoot manpage for details on these. You may also use `print ref($object)' to find out the class `$object' was blessed into.
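As a rough illustration of the checks mentioned above, the following sketch asks an object which class it was blessed into and whether a given method can be found at all. The `Guru' class, its `find' method, and the `seek' name are hypothetical, made up for this example; ref() and the universal can() method are standard Perl.

```perl
use strict;
use warnings;

# Guru is a hypothetical class, defined before it is used.
package Guru;

sub new {
    my ($class, $name) = @_;
    return bless { name => $name }, $class;
}

sub find {
    my $self = shift;
    return "found $self->{name}";
}

package main;

my $object = Guru->new("Samy");                # arrow notation, no indirect object

my $class    = ref($object);                   # class the object was blessed into
my $has_find = $object->can('find') ? 1 : 0;   # 1: method lookup succeeds
my $has_seek = $object->can('seek') ? 1 : 0;   # 0: no such method in Guru or its parents

print "class=$class find=$has_find seek=$has_seek\n";
```

If can() reports a method as missing, the usual suspects are the ones listed above: a misspelled name, a broken @ISA, or an object of a different class than expected.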
Another possible reason for problems is because you've used the indirect object syntax (eg, `find Guru "Samy"') on a class name before Perl has seen that such a package exists. It's wisest to make sure your packages are all defined before you start using them, which will be taken care of if you use the `use' statement instead of `require'. If not, make sure to use arrow notation (eg, `Guru->find("Samy")') instead. Object notation is explained in the perlobj manpage.

Make sure to read about creating modules in the perlmod manpage and the perils of indirect objects in the section on "WARNING" in the perlobj manpage.

How can I find out my current package?

If you're just a random program, you can do this to find out what the currently compiled package is:

    my $packname = __PACKAGE__;

But if you're a method and you want to print an error message that includes the kind of object you were called on (which is not necessarily the same as the one in which you were compiled):

    sub amethod {
        my $self  = shift;
        my $class = ref($self) || $self;
        warn "called me from a $class object";
    }

How can I comment out a large block of perl code?

Use embedded POD to discard it:

    # program is here

    =for nobody
    This paragraph is commented out

    # program continues

    =begin comment text

    all of this stuff

    here will be ignored
    by everyone

    =end comment text

    =cut

This can't go just anywhere. You have to put a pod directive where the parser is expecting a new statement, not just in the middle of an expression or some other arbitrary yacc grammar production.

How do I clear a package?

Use this code, provided by Mark-Jason Dominus:

    sub scrub_package {
        no strict 'refs';
        my $pack = shift;
        die "Shouldn't delete main package"
            if $pack eq "" || $pack eq "main";
        my $stash = *{$pack . '::'}{HASH};
        my $name;
        foreach $name (keys %$stash) {
            my $fullname = $pack . '::' . $name;
            # Get rid of everything with that name.
            undef $$fullname;
            undef @$fullname;
            undef %$fullname;
            undef &$fullname;
            undef *$fullname;
        }
    }

Or, if you're using a recent release of Perl, you can just use the Symbol::delete_package() function instead.

AUTHOR AND COPYRIGHT

Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington. All rights reserved.

When included as part of the Standard Version of Perl, or as part of its complete documentation whether printed or otherwise, this work may be distributed only under the terms of Perl's Artistic Licence. Any distribution of this file or derivatives thereof *outside* of that package require that special arrangements be made with copyright holder.

Irrespective of its distribution, all code examples in this file are hereby placed into the public domain. You are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit would be courteous but is not required.

perlfaq8 section

NAME

perlfaq8 - System Interaction ($Revision: 1.36 $, $Date: 1999/01/08 05:36:34 $)

DESCRIPTION

This section of the Perl FAQ covers questions involving operating system interaction. This involves interprocess communication (IPC), control over the user-interface (keyboard, screen and pointing devices), and most anything else not related to data manipulation.

Read the FAQs and documentation specific to the port of perl to your operating system (eg, the perlvms manpage, the perlplan9 manpage, ...). These should contain more detailed information on the vagaries of your perl.

How do I find out which operating system I'm running under?

The $^O variable ($OSNAME if you use English) contains the operating system that your perl binary was built for.

How come exec() doesn't return?

Because that's what it does: it replaces your currently running program with a different one. If you want to keep going (as is probably the case if you're asking this question) use system() instead.
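To make the contrast concrete, here is a minimal sketch of system() handing control back to the caller. It assumes only that the running perl (found through `$^X') can be started as a child; the one-liner child and its exit value of 3 are made up for the example.

```perl
use strict;
use warnings;

# $^X is the path of the running perl; the one-liner child just exits with 3.
my $status     = system($^X, '-e', 'exit 3');
my $exit_value = $status >> 8;        # the high 8 bits hold the exit value

print "child exited with $exit_value, parent still running\n";

# exec() would replace this process instead. Code after it runs only
# if the command could not be started at all:
#     exec($^X, '-e', 'exit 3') or die "exec failed: $!";
```

The commented-out exec() line shows the usual idiom: since a successful exec() never returns, the only thing worth writing after it is the error handling.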
How do I do fancy stuff with the keyboard/screen/mouse?

How you access/control keyboards, screens, and pointing devices ("mice") is system-dependent. Try the following modules:

    Keyboard
        Term::Cap                   Standard perl distribution
        Term::ReadKey               CPAN
        Term::ReadLine::Gnu         CPAN
        Term::ReadLine::Perl        CPAN
        Term::Screen                CPAN

    Screen
        Term::Cap                   Standard perl distribution
        Curses                      CPAN
        Term::ANSIColor             CPAN

    Mouse
        Tk                          CPAN

Some of these specific cases are shown below.

How do I print something out in color?

In general, you don't, because you don't know whether the recipient has a color-aware display device. If you know that they have an ANSI terminal that understands color, you can use the Term::ANSIColor module from CPAN:

    use Term::ANSIColor;
    print color("red"), "Stop!\n", color("reset");
    print color("green"), "Go!\n", color("reset");

Or like this:

    use Term::ANSIColor qw(:constants);
    print RED, "Stop!\n", RESET;
    print GREEN, "Go!\n", RESET;

How do I read just one key without waiting for a return key?

Controlling input buffering is a remarkably system-dependent matter. On most systems, you can just use the stty command as shown in the "getc" entry in the perlfunc manpage, but as you see, that's already getting you into portability snags.

    open(TTY, "+</dev/tty") or die "no tty: $!";
    system "stty  cbreak </dev/tty >/dev/tty 2>&1";
    $key = getc(TTY);           # perhaps this works
    # OR ELSE
    sysread(TTY, $key, 1);      # probably this does
    system "stty -cbreak </dev/tty >/dev/tty 2>&1";

The Term::ReadKey module from CPAN offers an easy-to-use interface that should be more efficient than shelling out to stty for each key. It even includes limited support for Windows.

    use Term::ReadKey;
    ReadMode('cbreak');
    $key = ReadKey(0);
    ReadMode('normal');

However, that requires that you have a working C compiler and can use it to build and install a CPAN module. Here's a solution using the standard POSIX module, which is already on your system (assuming your system supports POSIX).
    use HotKey;
    $key = readkey();

And here's the HotKey module, which hides the somewhat mystifying calls to manipulate the POSIX termios structures.

    # HotKey.pm
    package HotKey;

    require Exporter;
    @ISA = qw(Exporter);
    @EXPORT = qw(cbreak cooked readkey);

    use strict;
    use POSIX qw(:termios_h);
    my ($term, $oterm, $echo, $noecho, $fd_stdin);

    $fd_stdin = fileno(STDIN);
    $term     = POSIX::Termios->new();
    $term->getattr($fd_stdin);
    $oterm    = $term->getlflag();

    $echo     = ECHO | ECHOK | ICANON;
    $noecho   = $oterm & ~$echo;

    sub cbreak {
        $term->setlflag($noecho);  # ok, so i don't want echo either
        $term->setcc(VTIME, 1);
        $term->setattr($fd_stdin, TCSANOW);
    }

    sub cooked {
        $term->setlflag($oterm);
        $term->setcc(VTIME, 0);
        $term->setattr($fd_stdin, TCSANOW);
    }

    sub readkey {
        my $key = '';
        cbreak();
        sysread(STDIN, $key, 1);
        cooked();
        return $key;
    }

    END { cooked() }

    1;

How do I check whether input is ready on the keyboard?

The easiest way to do this is to read a key in nonblocking mode with the Term::ReadKey module from CPAN, passing it an argument of -1 to indicate not to block:

    use Term::ReadKey;

    ReadMode('cbreak');

    if (defined ($char = ReadKey(-1)) ) {
        # input was waiting and it was $char
    } else {
        # no input was waiting
    }

    ReadMode('normal');                  # restore normal tty settings

How do I clear the screen?

If you only have to do so infrequently, use `system':

    system("clear");

If you have to do this a lot, save the clear string so you can print it 100 times without calling a program 100 times:

    $clear_string = `clear`;
    print $clear_string;

If you're planning on doing other screen manipulations, like cursor positions, etc, you might wish to use the Term::Cap module:

    use Term::Cap;
    $terminal = Term::Cap->Tgetent( {OSPEED => 9600} );
    $clear_string = $terminal->Tputs('cl');

How do I get the screen size?
If you have the Term::ReadKey module installed from CPAN, you can use it to fetch the width and height in characters and in pixels:

    use Term::ReadKey;
    ($wchar, $hchar, $wpixels, $hpixels) = GetTerminalSize();

This is more portable than the raw `ioctl', but not as illustrative:

    require 'sys/ioctl.ph';
    die "no TIOCGWINSZ " unless defined &TIOCGWINSZ;
    open(TTY, "+autoflush(1);

As mentioned in the previous item, this still doesn't work when using socket I/O between Unix and Macintosh. You'll need to hardcode your line terminators, in that case.

non-blocking input

If you are doing a blocking read() or sysread(), you'll have to arrange for an alarm handler to provide a timeout (see the "alarm" entry in the perlfunc manpage). If you have a non-blocking open, you'll likely have a non-blocking read, which means you may have to use a 4-arg select() to determine whether I/O is ready on that device (see the section on "select" in the perlfunc manpage).

While trying to read from his caller-id box, the notorious Jamie Zawinski, after much gnashing of teeth and fighting with sysread, sysopen, POSIX's tcgetattr business, and various other functions that go bump in the night, finally came up with this:

    sub open_modem {
        use IPC::Open2;
        my $stty = `/bin/stty -g`;
        open2( \*MODEM_IN, \*MODEM_OUT, "cu -l$modem_device -s2400 2>&1");
        # starting cu hoses /dev/tty's stty settings, even when it has
        # been opened on a pipe...
        system("/bin/stty $stty");
        $_ = <MODEM_IN>;
        chop;
        if ( !m/^Connected/ ) {
            print STDERR "$0: cu printed `$_' instead of `Connected'\n";
        }
    }

How do I decode encrypted password files?

You spend lots and lots of money on dedicated hardware, but this is bound to get you talked about.

Seriously, you can't if they are Unix password files - the Unix password system employs one-way encryption. It's more like hashing than encryption. The best you can check is whether something else hashes to the same string. You can't turn a hash back into the original string.
Programs like Crack can forcibly (and intelligently) try to guess passwords, but don't (can't) guarantee quick success.

If you're worried about users selecting bad passwords, you should proactively check when they try to change their password (by modifying passwd(1), for example).

How do I start a process in the background?

You could use

    system("cmd &")

or you could use fork as documented in the section on "fork" in the perlfunc manpage, with further examples in the perlipc manpage. Some things to be aware of, if you're on a Unix-like system:

STDIN, STDOUT, and STDERR are shared

    Both the main process and the backgrounded one (the "child" process) share the same STDIN, STDOUT and STDERR filehandles. If both try to access them at once, strange things can happen. You may want to close or reopen these for the child. You can get around this with `open'ing a pipe (see the section on "open" in the perlfunc manpage) but on some systems this means that the child process cannot outlive the parent.

Signals

    You'll have to catch the SIGCHLD signal, and possibly SIGPIPE too. SIGCHLD is sent when the backgrounded process finishes. SIGPIPE is sent when you write to a filehandle whose child process has closed (an untrapped SIGPIPE can cause your program to silently die). This is not an issue with `system("cmd&")'.

Zombies

    You have to be prepared to "reap" the child process when it finishes:

        $SIG{CHLD} = sub { wait };

    See the section on "Signals" in the perlipc manpage for other examples of code to do this. Zombies are not an issue with `system("prog &")'.

How do I trap control characters/signals?

You don't actually "trap" a control character. Instead, that character generates a signal which is sent to your terminal's currently foregrounded process group, which you then trap in your process. Signals are documented in the section on "Signals" in the perlipc manpage and chapter 6 of the Camel.

Be warned that very few C libraries are re-entrant.
Therefore, if you attempt to print() in a handler that got invoked during another stdio operation your internal structures will likely be in an inconsistent state, and your program will dump core. You can sometimes avoid this by using syswrite() instead of print().

Unless you're exceedingly careful, the only safe things to do inside a signal handler are: set a variable and exit. And in the first case, you should only set a variable in such a way that malloc() is not called (eg, by setting a variable that already has a value).

For example:

    $Interrupted = 0;   # to ensure it has a value
    $SIG{INT} = sub {
        $Interrupted++;
        syswrite(STDERR, "ouch\n", 5);
    };

However, because syscalls restart by default, you'll find that if you're in a "slow" call, such as <FH>, read(), connect(), or wait(), the only way to terminate them is by "longjumping" out; that is, by raising an exception. See the time-out handler for a blocking flock() in the section on "Signals" in the perlipc manpage or chapter 6 of the Camel.

How do I modify the shadow password file on a Unix system?

If perl was installed correctly, and your shadow library was written properly, the getpw*() functions described in the perlfunc manpage should in theory provide (read-only) access to entries in the shadow password file. To change the file, make a new shadow password file (the format varies from system to system - see the passwd(5) manpage for specifics) and use pwd_mkdb(8) to install it (see the pwd_mkdb(5) manpage for more details).

How do I set the time and date?

Assuming you're running under sufficient permissions, you should be able to set the system-wide date and time by running the date(1) program. (There is no way to set the time and date on a per-process basis.) This mechanism will work for Unix, MS-DOS, Windows, and NT; the VMS equivalent is `set time'.
However, if all you want to do is change your timezone, you can probably get away with setting an environment variable:

    $ENV{TZ} = "MST7MDT";                       # unixish
    $ENV{'SYS$TIMEZONE_DIFFERENTIAL'} = "-5";   # vms
    system "trn comp.lang.perl.misc";

How can I sleep() or alarm() for under a second?

If you want finer granularity than the 1 second that the sleep() function provides, the easiest way is to use the select() function as documented in the section on "select" in the perlfunc manpage. If your system has itimers and syscall() support, you can check out the old example in http://www.perl.com/CPAN/doc/misc/ancient/tutorial/eg/itimers.pl .

How can I measure time under a second?

In general, you may not be able to. The Time::HiRes module (available from CPAN) provides this functionality for some systems.

If your system supports both the syscall() function in Perl as well as a system call like gettimeofday(2), then you may be able to do something like this:

    require 'sys/syscall.ph';

    $TIMEVAL_T = "LL";

    $done = $start = pack($TIMEVAL_T, ());

    syscall( &SYS_gettimeofday, $start, 0) != -1
        or die "gettimeofday: $!";

    ##########################
    # DO YOUR OPERATION HERE #
    ##########################

    syscall( &SYS_gettimeofday, $done, 0) != -1
        or die "gettimeofday: $!";

    @start = unpack($TIMEVAL_T, $start);
    @done  = unpack($TIMEVAL_T, $done);

    # fix microseconds
    for ($done[1], $start[1]) { $_ /= 1_000_000 }

    $delta_time = sprintf "%.4f", ($done[0]  + $done[1]  )
                                  -
                                  ($start[0] + $start[1] );

How can I do an atexit() or setjmp()/longjmp()? (Exception handling)

Release 5 of Perl added the END block, which can be used to simulate atexit(). Each package's END block is called when the program or thread ends (see the perlmod manpage for more details).
For example, you can use this to make sure your filter program managed to finish its output without filling up the disk:

    END {
        close(STDOUT) || die "stdout close failed: $!";
    }

The END block isn't called when untrapped signals kill the program, though, so if you use END blocks you should also use

    use sigtrap qw(die normal-signals);

Perl's exception-handling mechanism is its eval() operator. You can use eval() as setjmp and die() as longjmp. For details of this, see the section on signals, especially the time-out handler for a blocking flock() in the section on "Signals" in the perlipc manpage and chapter 6 of the Camel.

If exception handling is all you're interested in, try the exceptions.pl library (part of the standard perl distribution).

If you want the atexit() syntax (and an rmexit() as well), try the AtExit module available from CPAN.

Why doesn't my sockets program work under System V (Solaris)? What does the error message "Protocol not supported" mean?

Some Sys-V based systems, notably Solaris 2.X, redefined some of the standard socket constants. Since these were constant across all architectures, they were often hardwired into perl code. The proper way to deal with this is to "use Socket" to get the correct values.

Note that even though SunOS and Solaris are binary compatible, these values are different. Go figure.

How can I call my system's unique C functions from Perl?

In most cases, you write an external module to do it - see the answer to "Where can I learn about linking C with Perl? [h2xs, xsubpp]". However, if the function is a system call, and your system supports syscall(), you can use the syscall function (documented in the perlfunc manpage).

Remember to check the modules that came with your distribution, and CPAN as well - someone may already have written a module to do it.

Where do I get the include files to do ioctl() or syscall()?

Historically, these would be generated by the h2ph tool, part of the standard perl distribution.
This program converts cpp(1) directives in C header files to files containing subroutine definitions, like &SYS_getitimer, which you can use as arguments to your functions. It doesn't work perfectly, but it usually gets most of the job done. Simple files like errno.h, syscall.h, and socket.h were fine, but the hard ones like ioctl.h nearly always need to be hand-edited. Here's how to install the *.ph files:

    1.  become super-user
    2.  cd /usr/include
    3.  h2ph *.h */*.h

If your system supports dynamic loading, for reasons of portability and sanity you probably ought to use h2xs (also part of the standard perl distribution). This tool converts C header files to Perl extensions. See the perlxstut manpage for how to get started with h2xs.

If your system doesn't support dynamic loading, you still probably ought to use h2xs. See the perlxstut manpage and the ExtUtils::MakeMaker manpage for more information (in brief, just use make perl instead of a plain make to rebuild perl with a new static extension).

Why do setuid perl scripts complain about kernel problems?

Some operating systems have bugs in the kernel that make setuid scripts inherently insecure. Perl gives you a number of options (described in the perlsec manpage) to work around such systems.

How can I open a pipe both to and from a command?

The IPC::Open2 module (part of the standard perl distribution) is an easy-to-use approach that internally uses pipe(), fork(), and exec() to do the job. Make sure you read the deadlock warnings in its documentation, though (see the IPC::Open2 manpage). See the section on "Bidirectional Communication with Another Process" in the perlipc manpage and the section on "Bidirectional Communication with Yourself" in the perlipc manpage.

You may also use the IPC::Open3 module (part of the standard perl distribution), but be warned that it has a different order of arguments from IPC::Open2 (see the IPC::Open3 manpage).

Why can't I get the output of a command with system()?
You're confusing the purpose of system() and backticks (``). system() runs a command and returns exit status information (as a 16 bit value: the low 7 bits are the signal the process died from, if any, and the high 8 bits are the actual exit value). Backticks (``) run a command and return what it sent to STDOUT.

    $exit_status   = system("mail-users");
    $output_string = `ls`;

How can I capture STDERR from an external command?

There are three basic ways of running external commands:

    system $cmd;                # using system()
    $output = `$cmd`;           # using backticks (``)
    open (PIPE, "cmd |");       # using open()

With system(), both STDOUT and STDERR will go to the same place as the script's versions of these, unless the command redirects them. Backticks and open() read only the STDOUT of your command.

With any of these, you can change file descriptors before the call:

    open(STDOUT, ">logfile");
    system("ls");

or you can use Bourne shell file-descriptor redirection:

    $output = `$cmd 2>some_file`;
    open (PIPE, "cmd 2>some_file |");

You can also use file-descriptor redirection to make STDERR a duplicate of STDOUT:

    $output = `$cmd 2>&1`;
    open (PIPE, "cmd 2>&1 |");

Note that you *cannot* simply open STDERR to be a dup of STDOUT in your Perl program and avoid calling the shell to do the redirection. This doesn't work:

    open(STDERR, ">&STDOUT");
    $alloutput = `cmd args`;  # stderr still escapes

This fails because the open() makes STDERR go to where STDOUT was going at the time of the open(). The backticks then make STDOUT go to a string, but don't change STDERR (which still goes to the old STDOUT).

Note that you *must* use Bourne shell (sh(1)) redirection syntax in backticks, not csh(1)! Details on why Perl's system() and backtick and pipe opens all use the Bourne shell are in http://www.perl.com/CPAN/doc/FMTEYEWTK/versus/csh.whynot .
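The effect of the shell redirections described above can be checked with a small stand-in command. This is only a sketch: it assumes a Bourne-style `sh' is on the path, and the command and its two marker strings are invented for the demonstration.

```perl
use strict;
use warnings;

# A stand-in command that writes one line to each stream
# (assumes a Bourne-style sh is available).
my $cmd = q{sh -c 'echo to-stdout; echo to-stderr 1>&2'};

my $stdout_only = `$cmd 2>/dev/null`;   # STDERR discarded by the shell
my $both        = `$cmd 2>&1`;          # STDERR folded into STDOUT by the shell

print "stdout only: $stdout_only";
print "both:        $both";
```

In both cases it is the shell, not Perl, that rearranges the descriptors before the command runs, which is why the redirection has to be written in sh syntax inside the backticks.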
To capture a command's STDERR and STDOUT together:

    $output = `cmd 2>&1`;                       # either with backticks
    $pid = open(PH, "cmd 2>&1 |");              # or with an open pipe
    while (<PH>) { }                            #    plus a read

To capture a command's STDOUT but discard its STDERR:

    $output = `cmd 2>/dev/null`;                # either with backticks
    $pid = open(PH, "cmd 2>/dev/null |");       # or with an open pipe
    while (<PH>) { }                            #    plus a read

To capture a command's STDERR but discard its STDOUT:

    $output = `cmd 2>&1 1>/dev/null`;           # either with backticks
    $pid = open(PH, "cmd 2>&1 1>/dev/null |");  # or with an open pipe
    while (<PH>) { }                            #    plus a read

To exchange a command's STDOUT and STDERR in order to capture the STDERR but leave its STDOUT to come out our old STDERR:

    $output = `cmd 3>&1 1>&2 2>&3 3>&-`;        # either with backticks
    $pid = open(PH, "cmd 3>&1 1>&2 2>&3 3>&-|");# or with an open pipe
    while (<PH>) { }                            #    plus a read

To read both a command's STDOUT and its STDERR separately, it's easiest and safest to redirect them separately to files, and then read from those files when the program is done:

    system("program args 1>/tmp/program.stdout 2>/tmp/program.stderr");

Ordering is important in all these examples. That's because the shell processes file descriptor redirections in strictly left to right order.

    system("prog args 1>tmpfile 2>&1");
    system("prog args 2>&1 1>tmpfile");

The first command sends both standard out and standard error to the temporary file. The second command sends only the old standard output there, and the old standard error shows up on the old standard out.

Why doesn't open() return an error when a pipe open fails?

Because the pipe open takes place in two steps: first Perl calls fork() to start a new process, then this new process calls exec() to run the program you really wanted to open. The first step reports success or failure to your process, so open() can only tell you whether the fork() succeeded or not.

To find out if the exec() step succeeded, you have to catch SIGCHLD and wait() to get the exit status.
You should also catch SIGPIPE if you're writing to the child--you may not have found out the exec() failed by the time you write. This is documented in the perlipc manpage.

In some cases, even this won't work. If the second argument to a piped open() contains shell metacharacters, perl fork()s, then exec()s a shell to decode the metacharacters and eventually run the desired program. Now when you call wait(), you only learn whether or not the *shell* could be successfully started. Best to avoid shell metacharacters.

On systems that follow the spawn() paradigm, open() *might* do what you expect--unless perl uses a shell to start your command. In this case the fork()/exec() description still applies.

What's wrong with using backticks in a void context?

Strictly speaking, nothing. Stylistically speaking, it's not a good way to write maintainable code because backticks have a (potentially humungous) return value, and you're ignoring it. It may also not be very efficient, because you have to read in all the lines of output, allocate memory for them, and then throw it away. Too often people are lulled into writing:

    `cp file file.bak`;

And now they think "Hey, I'll just always use backticks to run programs." Bad idea: backticks are for capturing a program's output; the system() function is for running programs.

Consider this line:

    `cat /etc/termcap`;

You haven't assigned the output anywhere, so it just wastes memory (for a little while). Plus you forgot to check `$?' to see whether the program even ran correctly. Even if you wrote

    print `cat /etc/termcap`;

in most cases, this could and probably should be written as

    system("cat /etc/termcap") == 0
        or die "cat program failed!";

which will get the output quickly (as it's generated, instead of only at the end) and also check the return value.

system() also provides direct control over whether shell wildcard processing may take place, whereas backticks do not.

How can I call backticks without shell processing?
This is a bit tricky. Instead of writing

    @ok = `grep @opts '$search_string' @filenames`;

You have to do this:

    my @ok = ();
    if (open(GREP, "-|")) {
        while (<GREP>) {
            chomp;
            push(@ok, $_);
        }
        close GREP;
    } else {
        exec 'grep', @opts, $search_string, @filenames;
    }

Just as with system(), no shell escapes happen when you exec() a list.

There are more examples of this in the section on "Safe Pipe Opens" in the perlipc manpage.

Why can't my script read from STDIN after I gave it EOF (^D on Unix, ^Z on MS-DOS)?

Because some stdio's set error and eof flags that need clearing. The POSIX module defines clearerr() that you can use. That is the technically correct way to do it. Here are some less reliable workarounds:

1   Try keeping around the seekpointer and go there, like this:

        $where = tell(LOG);
        seek(LOG, $where, 0);

2   If that doesn't work, try seeking to a different part of the file and then back.

3   If that doesn't work, try seeking to a different part of the file, reading something, and then seeking back.

4   If that doesn't work, give up on your stdio package and use sysread.

How can I convert my shell script to perl?

Learn Perl and rewrite it. Seriously, there's no simple converter. Things that are awkward to do in the shell are easy to do in Perl, and this very awkwardness is what would make a shell->perl converter nigh-on impossible to write. By rewriting it, you'll think about what you're really trying to do, and hopefully will escape the shell's pipeline datastream paradigm, which while convenient for some matters, causes many inefficiencies.

Can I use perl to run a telnet or ftp session?

Try the Net::FTP, TCP::Client, and Net::Telnet modules (available from CPAN). http://www.perl.com/CPAN/scripts/netstuff/telnet.emul.shar will also help for emulating the telnet protocol, but Net::Telnet is quite probably easier to use.
If all you want to do is pretend to be telnet but don't need the initial telnet handshaking, then the standard dual-process approach will suffice:

    use IO::Socket;             # new in 5.004
    $handle = IO::Socket::INET->new('www.perl.com:80')
            || die "can't connect to port 80 on www.perl.com: $!";
    $handle->autoflush(1);
    if (fork()) {               # XXX: undef means failure
        select($handle);
        print while <STDIN>;    # everything from stdin to socket
    } else {
        print while <$handle>;  # everything from socket to stdout
    }
    close $handle;
    exit;

How can I write expect in Perl?

Once upon a time, there was a library called chat2.pl (part of the standard perl distribution), which never really got finished. If you find it somewhere, *don't use it*. These days, your best bet is to look at the Expect module available from CPAN, which also requires two other modules from CPAN, IO::Pty and IO::Stty.

Is there a way to hide perl's command line from programs such as "ps"?

First of all note that if you're doing this for security reasons (to avoid people seeing passwords, for example) then you should rewrite your program so that critical information is never given as an argument. Hiding the arguments won't make your program completely secure.

To actually alter the visible command line, you can assign to the variable $0 as documented in the perlvar manpage. This won't work on all operating systems, though. Daemon programs like sendmail place their state there, as in:

    $0 = "orcus [accepting connections]";

I {changed directory, modified my environment} in a perl script. How come the change disappeared when I exited the script? How do I get my changes to be visible?

Unix

    In the strictest sense, it can't be done -- the script executes as a different process from the shell it was started from. Changes to a process are not reflected in its parent, only in its own children created after the change.
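The direction in which such changes do propagate can be seen in a short sketch: a child started after the change sees it, while the shell that launched the script never will. The variable name DEMO_SETTING is made up for this example, and the child is simply another copy of the running perl (`$^X'), started with a list-form pipe open as found in reasonably modern perls.

```perl
use strict;
use warnings;

$ENV{DEMO_SETTING} = 'set by the script';   # hypothetical variable name

# A child started after the change inherits the new environment.
open(my $child, '-|', $^X, '-e', 'print $ENV{DEMO_SETTING}')
    or die "can't start child: $!";
my $seen_by_child = <$child>;
close($child);

print "child saw: $seen_by_child\n";

# The shell that launched this script is the parent, and it is unaffected:
# once the script exits, DEMO_SETTING is gone as far as that shell can tell.
```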
There is shell magic that may allow you to fake it by eval()ing the script's output in your shell; check out the comp.unix.questions FAQ for details.

How do I close a process's filehandle without waiting for it to complete?

Assuming your system supports such things, just send an appropriate signal to the process (see the section on "kill" in the perlfunc manpage). It's common to first send a TERM signal, wait a little bit, and then send a KILL signal to finish it off.

How do I fork a daemon process?

If by daemon process you mean one that's detached (disassociated from its tty), then the following process is reported to work on most Unixish systems. Non-Unix users should check their Your_OS::Process module for other solutions.

*   Open /dev/tty and use the TIOCNOTTY ioctl on it. See the tty(4) manpage for details. Or better yet, you can just use the POSIX::setsid() function, so you don't have to worry about process groups.

*   Change directory to /

*   Reopen STDIN, STDOUT, and STDERR so they're not connected to the old tty.

*   Background yourself like this:

        fork && exit;

The Proc::Daemon module, available from CPAN, provides a function to perform these actions for you.

How do I make my program run with sh and csh?

See the eg/nih script (part of the perl source distribution).

How do I find out if I'm running interactively or not?

Good question. Sometimes `-t STDIN' and `-t STDOUT' can give clues, sometimes not.

    if (-t STDIN && -t STDOUT) {
        print "Now what? ";
    }

On POSIX systems, you can test whether your own process group matches the current process group of your controlling terminal as follows:

    use POSIX qw/getpgrp tcgetpgrp/;
    open(TTY, "/dev/tty") or die $!;
    $tpgrp = tcgetpgrp(fileno(*TTY));
    $pgrp = getpgrp();
    if ($tpgrp == $pgrp) {
        print "foreground\n";
    } else {
        print "background\n";
    }

How do I timeout a slow event?
Use the alarm() function, probably in conjunction with a signal handler, as documented in the section on "Signals" in the perlipc manpage and chapter 6 of the Camel. You may instead use the more flexible Sys::AlarmCall module available from CPAN.

How do I set CPU limits?

Use the BSD::Resource module from CPAN.

How do I avoid zombies on a Unix system?

Use the reaper code from the section on "Signals" in the perlipc manpage to call wait() when a SIGCHLD is received, or else use the double-fork technique described in the "fork" entry in the perlfunc manpage.

How do I use an SQL database?

There are a number of excellent interfaces to SQL databases. See the DBD::* modules available from http://www.perl.com/CPAN/modules/dbperl/DBD . A lot of information on this can be found at http://www.hermetica.com/technologia/perl/DBI/index.html .

How do I make a system() exit on control-C?

You can't. You need to imitate the system() call (see the perlipc manpage for sample code) and then have a signal handler for the INT signal that passes the signal on to the subprocess. Or you can check for it:

    $rc = system($cmd);
    if ($rc & 127) { die "signal death" }

How do I open a file without blocking?

If you're lucky enough to be using a system that supports non-blocking reads (most Unixish systems do), you need only to use the O_NDELAY or O_NONBLOCK flag from the Fcntl module in conjunction with sysopen():

    use Fcntl;
    sysopen(FH, "/tmp/somefile", O_WRONLY|O_NDELAY|O_CREAT, 0644)
        or die "can't open /tmp/somefile: $!";

How do I install a CPAN module?

The easiest way is to have the CPAN module do it for you. This module comes with perl version 5.004 and later. To manually install the CPAN module, or any well-behaved CPAN module for that matter, follow these steps:

1   Unpack the source into a temporary area.

2   perl Makefile.PL

3   make

4   make test

5   make install

If your version of perl is compiled without dynamic loading, then you just need to replace step 3 (make) with make perl and you will get a new perl binary with your extension linked in.

See the ExtUtils::MakeMaker manpage for more details on building extensions. See also the next question.

What's the difference between require and use?

Perl offers several different ways to include code from one file into another. Here are the deltas between the various inclusion constructs:

    1)  do $file is like eval `cat $file`, except the former:
        1.1: searches @INC and updates %INC.
        1.2: bequeaths an *unrelated* lexical scope on the eval'ed code.

    2)  require $file is like do $file, except the former:
        2.1: checks for redundant loading, skipping already loaded files.
        2.2: raises an exception on failure to find, compile, or execute $file.

    3)  require Module is like require "Module.pm", except the former:
        3.1: translates each "::" into your system's directory separator.
        3.2: primes the parser to disambiguate class Module as an indirect object.

    4)  use Module is like require Module, except the former:
        4.1: loads the module at compile time, not run-time.
        4.2: imports symbols and semantics from that package to the current one.

In general, you usually want `use' and a proper Perl module.

How do I keep my own module/library directory?

When you build modules, use the PREFIX option when generating Makefiles:

    perl Makefile.PL PREFIX=/u/mydir/perl

then either set the PERL5LIB environment variable before you run scripts that use the modules/libraries (see the perlrun manpage) or say

    use lib '/u/mydir/perl';

This is almost the same as:

    BEGIN {
        unshift(@INC, '/u/mydir/perl');
    }

except that the lib module checks for machine-dependent subdirectories. See the lib manpage for more information.

How do I add the directory my program lives in to the module/library search path?
    use FindBin;
    use lib "$FindBin::Bin";
    use your_own_modules;

How do I add a directory to my include path at runtime?

Here are the suggested ways of modifying your include path:

    the PERLLIB environment variable
    the PERL5LIB environment variable
    the perl -Idir command line flag
    the use lib pragma, as in
        use lib "$ENV{HOME}/myown_perllib";

The latter is particularly useful because it knows about machine-dependent architectures. The lib.pm pragmatic module was first included with the 5.002 release of Perl.

What is socket.ph and where do I get it?

It's a perl4-style file defining values for system networking constants. Sometimes it is built using h2ph when Perl is installed, but other times it is not. Modern programs `use Socket;' instead.

AUTHOR AND COPYRIGHT

Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington. All rights reserved.

When included as part of the Standard Version of Perl, or as part of its complete documentation whether printed or otherwise, this work may be distributed only under the terms of Perl's Artistic Licence. Any distribution of this file or derivatives thereof *outside* of that package require that special arrangements be made with copyright holder.

Irrespective of its distribution, all code examples in this file are hereby placed into the public domain. You are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit would be courteous but is not required.

perlfaq9 section

NAME

perlfaq9 - Networking ($Revision: 1.24 $, $Date: 1999/01/08 05:39:48 $)

DESCRIPTION

This section deals with questions related to networking, the internet, and a few on the web.

My CGI script runs from the command line but not the browser. (500 Server Error)

If you can demonstrate that you've read the following FAQs and that your problem isn't something simple that can be easily answered, you'll probably receive a courteous and useful reply to your question if you post it on comp.infosystems.www.authoring.cgi (if it's something to do with HTTP, HTML, or the CGI protocols). Questions that appear to be Perl questions but are really CGI ones that are posted to comp.lang.perl.misc may not be so well received.

The useful FAQs and related documents are:

    CGI FAQ
        http://www.webthing.com/tutorials/cgifaq.html

    Web FAQ
        http://www.boutell.com/faq/

    WWW Security FAQ
        http://www.w3.org/Security/Faq/

    HTTP Spec
        http://www.w3.org/pub/WWW/Protocols/HTTP/

    HTML Spec
        http://www.w3.org/TR/REC-html40/
        http://www.w3.org/pub/WWW/MarkUp/

    CGI Spec
        http://www.w3.org/CGI/

    CGI Security FAQ
        http://www.go2net.com/people/paulp/cgi-security/safe-cgi.txt

How can I get better error messages from a CGI program?

Use the CGI::Carp module. It replaces `warn' and `die', plus the normal Carp module's `carp', `croak', and `confess' functions, with more verbose and safer versions. It still sends them to the normal server error log.

    use CGI::Carp;
    warn "This is a complaint";
    die "But this one is serious";

The following use of CGI::Carp also redirects errors to a file of your choice, placed in a BEGIN block to catch compile-time warnings as well:

    BEGIN {
        use CGI::Carp qw(carpout);
        open(LOG, ">>/var/local/cgi-logs/mycgi-log")
            or die "Unable to append to mycgi-log: $!\n";
        carpout(*LOG);
    }

You can even arrange for fatal errors to go back to the client browser, which is nice for your own debugging, but might confuse the end user.

    use CGI::Carp qw(fatalsToBrowser);
    die "Bad error here";

Even if the error happens before you get the HTTP header out, the module will try to take care of this to avoid the dreaded server 500 errors.
Normal warnings still go out to the server error log (or wherever you've sent them with `carpout') with the application name and date stamp prepended.

How do I remove HTML from a string?

The most correct way (albeit not the fastest) is to use HTML::Parse from CPAN (part of the HTML-Tree package on CPAN).

Many folks attempt a simple-minded regular expression approach, like `s/<.*?>//g', but that fails in many cases because the tags may continue over line breaks, they may contain quoted angle-brackets, or HTML comments may be present. Plus folks forget to convert entities, like `&lt;' for example.

Here's one "simple-minded" approach, that works for most files:

    #!/usr/bin/perl -p0777
    s/<(?:[^>'"]*|(['"]).*?\1)*>//gs

If you want a more complete solution, see the 3-stage striphtml program in http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/striphtml.gz .

Here are some tricky cases that you should think about when picking a solution:

    <IMG SRC = "foo.gif" ALT = "A > B">

    <IMG SRC = "foo.gif"
         ALT = "A > B">

    <!-- <A comment> -->

    <script>if (a<b && a>c)</script>

    <# Just data #>

    <![INCLUDE CDATA [ >>>>>>>>>>> ]]>

If HTML comments include other tags, those solutions would also break on text like this:

    <!-- This section commented out.
        <B>You can't see me!</B>
    -->

How do I extract URLs?

A quick but imperfect approach is

    #!/usr/bin/perl -n00
    # qxurl - tchrist@perl.com
    print "$2\n" while m{
        < \s*
          A \s+ HREF \s* = \s* (["']) (.*?) \1
        \s* >
    }gsix;

This version does not adjust relative URLs, understand alternate bases, deal with HTML comments, deal with HREF and NAME attributes in the same tag, or accept URLs themselves as arguments. It also runs about 100x faster than a more "complete" solution using the LWP suite of modules, such as the http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/xurl.gz program.

How do I download a file from the user's machine? How do I open a file on another machine?

In the context of an HTML form, you can use what's known as multipart/form-data encoding. The CGI.pm module (available from CPAN) supports this in the start_multipart_form() method, which isn't the same as the startform() method.

How do I make a pop-up menu in HTML?

Use the