NAME

perl5200delta - what is new for perl v5.20.0

DESCRIPTION

This document describes differences between the 5.18.0 release and the 5.20.0 release.

If you are upgrading from an earlier release such as 5.16.0, first read perl5180delta, which describes differences between 5.16.0 and 5.18.0.

Core Enhancements

Experimental Subroutine signatures

Perl now has a declarative syntax for unwrapping the argument list into lexical variables. sub foo ($a,$b) {...} checks the number of arguments and puts the arguments into lexical variables. Signatures are not equivalent to the existing idiom of sub foo { my($a,$b) = @_; ... }. Signatures are only available by enabling a non-default feature, and generate warnings about being experimental. The syntactic clash with prototypes is managed by disabling the short prototype syntax when signatures are enabled.

See "Signatures" in perlsub for details.
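As a minimal sketch of the behaviour described above (the sub name add and its parameters are made up for illustration), a signature both names its parameters and rejects calls with the wrong number of arguments:

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Hypothetical example sub: two mandatory positional parameters.
sub add ($x, $y) {
    return $x + $y;
}

my $sum = add(2, 3);

# Calling with too few arguments dies at run time; capture the message.
my $error = eval { add(1); 1 } ? '' : $@;

print "sum: $sum\n";
print "error: $error";
```

The exact wording of the error message is not specified here, so code should not depend on more than its general meaning.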

subs now take a prototype attribute

When declaring or defining a sub, the prototype can now be specified inside of a prototype attribute instead of in parens following the name.

For example, sub foo($$){} could be rewritten as sub foo : prototype($$){}.
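A short sketch comparing the two spellings; pair_old and pair_new are hypothetical names, and signatures are assumed not to be enabled in this scope:

```perl
use strict;
use warnings;

# Traditional form: prototype in parens after the name.
sub pair_old ($$) { return "$_[0]-$_[1]" }

# Attribute form: the same prototype given via the prototype attribute.
sub pair_new : prototype($$) { return "$_[0]-$_[1]" }

# Both subs report the same prototype string.
my $old_proto = prototype \&pair_old;
my $new_proto = prototype \&pair_new;

print "old: $old_proto, new: $new_proto\n";
```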

More consistent prototype parsing

Multiple semicolons in subroutine prototypes have long been tolerated and treated as a single semicolon. There was one case where this did not happen. A subroutine whose prototype begins with "*" or ";*" can affect whether a bareword is considered a method name or sub call. This now applies also to ";;;*".

Whitespace has long been allowed inside subroutine prototypes, so sub( $ $ ) is equivalent to sub($$), but until now it was stripped when the subroutine was parsed. Hence, whitespace was not allowed in prototypes set by Scalar::Util::set_prototype. Now it is permitted, and the parser no longer strips whitespace. This means prototype &mysub returns the original prototype, whitespace and all.

rand now uses a consistent random number generator

Previously perl would use a platform specific random number generator, varying between the libc rand(), random() or drand48().

This meant that the quality of perl's random numbers would vary from platform to platform, from the 15 bits of rand() on Windows to 48 bits on POSIX platforms such as Linux with drand48().

Perl now uses its own internal drand48() implementation on all platforms. This does not make perl's rand cryptographically secure. [perl #115928]

New slice syntax

The new %hash{...} and %array[...] syntax returns a list of key/value (or index/value) pairs. See "Key/Value Hash Slices" in perldata.
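For illustration, a small sketch with made-up data showing both forms of the new slice:

```perl
use strict;
use warnings;

my %colour = (red => 1, green => 2, blue => 3);
my @letter = qw(a b c d);

# Key/value hash slice: a list of key/value pairs for the chosen keys.
my %subset = %colour{'red', 'blue'};

# Index/value array slice: a list of index/value pairs.
my @pairs = %letter[1, 3];

print join(',', map { "$_=$subset{$_}" } sort keys %subset), "\n";
print "@pairs\n";
```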

Experimental Postfix Dereferencing

When the postderef feature is in effect, the following syntactical equivalencies are set up:

  $sref->$*;  # same as ${ $sref }  # interpolates
  $aref->@*;  # same as @{ $aref }  # interpolates
  $href->%*;  # same as %{ $href }
  $cref->&*;  # same as &{ $cref }
  $gref->**;  # same as *{ $gref }

  $aref->$#*; # same as $#{ $aref }

  $gref->*{ $slot }; # same as *{ $gref }{ $slot }

  $aref->@[ ... ];  # same as @$aref[ ... ]  # interpolates
  $href->@{ ... };  # same as @$href{ ... }  # interpolates
  $aref->%[ ... ];  # same as %$aref[ ... ]
  $href->%{ ... };  # same as %$href{ ... }

Those marked as interpolating only interpolate if the associated postderef_qq feature is also enabled. This feature is experimental and will trigger experimental::postderef-category warnings when used, unless they are suppressed.

For more information, consult the Postfix Dereference Syntax section of perlref.
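A brief sketch of a few of these equivalences in use, with made-up data; the experimental warning category is silenced explicitly:

```perl
use strict;
use warnings;
use feature 'postderef';
no warnings 'experimental::postderef';

my $aref = [10, 20, 30];
my $href = { a => 1, b => 2 };

my @all    = $aref->@*;            # same as @{ $aref }
my $last   = $aref->$#*;           # same as $#{ $aref }
my @first2 = $aref->@[0, 1];       # same as @$aref[0, 1]
my @values = $href->@{qw(a b)};    # same as @$href{qw(a b)}
my %pair   = $href->%{'a'};        # same as %$href{'a'}

print "all: @all; last index: $last; first two: @first2; values: @values\n";
```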

Unicode 6.3 now supported

Perl now supports and is shipped with Unicode 6.3 (though Perl may be recompiled with any previous Unicode release as well). A detailed list of Unicode 6.3 changes is at http://www.unicode.org/versions/Unicode6.3.0/.

New \p{Unicode} regular expression pattern property

This is a synonym for \p{Any} and matches the set of Unicode-defined code points 0 - 0x10FFFF.

Better 64-bit support

On 64-bit platforms, the internal array functions now use 64-bit offsets, allowing Perl arrays to hold more than 2**31 elements, if you have the memory available.

The regular expression engine now supports strings longer than 2**31 characters. [perl #112790, #116907]

The functions PerlIO_get_bufsiz, PerlIO_get_cnt, PerlIO_set_cnt and PerlIO_set_ptrcnt now have SSize_t, rather than int, return values and parameters.

use locale now works on UTF-8 locales

Until this release, only single-byte locales, such as the ISO 8859 series, were supported. Now, the increasingly common multi-byte UTF-8 locales are also supported. A UTF-8 locale is one in which the character set is Unicode and the encoding is UTF-8. The POSIX LC_CTYPE category operations (case changing, like lc() and "\U", and character classification, like \w, \D and qr/[[:punct:]]/) under such a locale work just as if not under locale, but instead as if under use feature 'unicode_strings', except taint rules are followed. Sorting remains by code point order in this release. [perl #56820]

use locale now compiles on systems without locale ability

Previously doing this caused the program to not compile. Within its scope the program behaves as if in the "C" locale. Thus programs written for platforms that support locales can run on locale-less platforms without change. Attempts to change the locale away from the "C" locale will, of course, fail.

More locale initialization fallback options

If there was an error with locales during Perl start-up, it immediately gave up and tried to use the "C" locale. Now it first tries using other locales given by the environment variables, as detailed in "ENVIRONMENT" in perllocale. For example, if LC_ALL and LANG are both set, and using the LC_ALL locale fails, Perl will now try the LANG locale, and only if that fails, will it fall back to "C". On Windows machines, Perl will try, ahead of using "C", the system default locale if all the locales given by environment variables fail.

-DL runtime option now added for tracing locale setting

This is designed for Perl core developers to aid in field debugging bugs regarding locales.

-F now implies -a and -a implies -n

Previously -F without -a was a no-op, and -a without -n or -p was a no-op. With this change, if you supply -F then both -a and -n are implied, and if you supply -a then -n is implied.

You can still use -p for its extra behaviour. [perl #116190]
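As a rough sketch of what a one-liner such as perl -F: -e 'print "$F[0]\n"' now amounts to, the following spells out the implied loop and split by hand. The sample input and the in-memory filehandle are assumptions made so the example is self-contained; the real switches read from files or standard input and use the global @F.

```perl
use strict;
use warnings;

# Hypothetical colon-separated input standing in for a real file.
my $input = "root:x:0\ndaemon:x:1\n";
open my $fh, '<', \$input or die "cannot open in-memory file: $!";

my @first_fields;
while (my $line = <$fh>) {        # the loop that -n supplies
    chomp $line;
    my @F = split /:/, $line;     # the split that -a supplies, using the -F pattern
    push @first_fields, $F[0];    # stands in for the body given to -e
}
close $fh;

print "$_\n" for @first_fields;
```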

$a and $b warnings exemption

The special variables $a and $b, used in sort, are now exempt from "used once" warnings, even where sort is not used. This makes it easier for CPAN modules to provide functions using $a and $b for similar purposes. [perl #120462]
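A simplified sketch of the kind of function this is meant to help with. reduce_with is a hypothetical name, and the sketch assumes the helper and its caller live in the same package; a real module would have to alias $a and $b in the caller's package.

```perl
use strict;
use warnings;

# Hypothetical helper: fold a list with a block that sees its
# operands in $a and $b, in the style of sort's comparison block.
sub reduce_with (&@) {
    my ($code, @list) = @_;
    return unless @list;
    my $acc = shift @list;
    for my $item (@list) {
        local ($a, $b) = ($acc, $item);   # package variables, as with sort
        $acc = $code->();
    }
    return $acc;
}

my $sum     = reduce_with { $a + $b } 1 .. 4;
my $longest = reduce_with { length($a) >= length($b) ? $a : $b } qw(perl camel onion);

print "sum: $sum, longest: $longest\n";
```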

Security

Avoid possible read of free()d memory during parsing

It was possible that free()d memory could be read during parsing in the unusual circumstance of the Perl program ending with a heredoc and the last line of the file on disk having no terminating newline character. This has now been fixed.

Incompatible Changes

do can no longer be used to call subroutines

The do SUBROUTINE(LIST) form has resulted in a deprecation warning since Perl v5.0.0, and is now a syntax error.

Quote-like escape changes

The character after \c in a double-quoted string ("..." or qq(...)) or regular expression must now be a printable character and may not be {.

A literal { after \B or \b is now fatal.

These were deprecated in perl v5.14.0.

Tainting happens under more circumstances; now conforms to documentation

This affects regular expression matching and changing the case of a string (lc, "\U", etc.) within the scope of use locale. The result is now tainted based on the operation, no matter what the contents of the string were, as the documentation (perlsec, "SECURITY" in perllocale) indicates it should.

Previously, for the case change operation, if the string contained no characters whose case change could be affected by the locale, the result would not be tainted. For example, the result of uc() on an empty string or one containing only above-Latin1 code points is now tainted, and wasn't before. This leads to more consistent tainting results.

Regular expression patterns taint their non-binary results (like $&, $2) if and only if the pattern contains elements whose matching depends on the current (potentially tainted) locale. As with the case changing functions, the actual contents of the string being matched now do not matter, whereas formerly they did. For example, if the pattern contains a \w, the results will be tainted even if the match did not have to use that portion of the pattern to succeed or fail, because what a \w matches depends on locale. However, a . in a pattern, for example, will not enable tainting, because the dot matches any single character, and the current locale does not change what matches and what doesn't.

\p{}, \P{} matching has changed for non-Unicode code points.

\p{} and \P{} are defined by Unicode only on Unicode-defined code points (U+0000 through U+10FFFF). Their behavior on matching these legal Unicode code points is unchanged, but there are changes for code points 0x110000 and above.

Previously, Perl treated the result of matching \p{} and \P{} against these as undef, which translates into "false". For \P{}, this was then complemented into "true". A warning was supposed to be raised when this happened. However, various optimizations could prevent the warning, and the results were often counter-intuitive, with both a match and its seeming complement being false.

Now all non-Unicode code points are treated as typical unassigned Unicode code points. This generally is more Do-What-I-Mean. A warning is raised only if the results are arguably different from a strict Unicode approach, and from what Perl used to do. Code that needs to be strictly Unicode compliant can make this warning fatal, and then Perl always raises the warning.

Details are in "Beyond Unicode code points" in perlunicode.

\p{All} has been expanded to match all possible code points

The Perl-defined regular expression pattern element \p{All}, unused on CPAN, used to match just the Unicode code points; now it matches all possible code points; that is, it is equivalent to qr/./s. Thus \p{All} is no longer synonymous with \p{Any}, which continues to match just the Unicode code points, as Unicode says it should.
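A small sketch of the distinction, matching a single above-Unicode code point against each property. The non_unicode warning category is switched off here as a precaution; whether a warning would otherwise be raised for these particular properties is not something this sketch relies on.

```perl
use strict;
use warnings;
no warnings 'non_unicode';

# 0x110000 is the first code point above the Unicode range.
my $above = chr(0x110000);

# \p{All} is described above as matching every possible code point,
# \p{Any} and its new synonym \p{Unicode} only 0 through 0x10FFFF.
my $in_all     = $above =~ /\p{All}/     ? 1 : 0;
my $in_any     = $above =~ /\p{Any}/     ? 1 : 0;
my $in_unicode = $above =~ /\p{Unicode}/ ? 1 : 0;

print "All: $in_all, Any: $in_any, Unicode: $in_unicode\n";
```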

Data::Dumper's output may change

Depending on the data structures dumped and the settings set for Data::Dumper, the dumped output may have changed from previous versions.

If you have tests that depend on the exact output of Data::Dumper, they may fail.

To avoid this problem in your code, test against the data structure from evaluating the dumped structure, instead of the dump itself.
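One way this advice might look in a test file is sketched below, with a made-up data structure. The dump is evaluated back into a structure and compared with is_deeply from Test::More, so the test does not depend on key order, indentation or quoting in the dumped text.

```perl
use strict;
use warnings;
use Data::Dumper;
use Test::More;

# Hypothetical structure under test.
my $data = { name => 'camel', legs => 4, tags => [ 'dromedary', 'bactrian' ] };

my $dumped = Dumper($data);

# The dump assigns to $VAR1, which is not declared, so relax strict
# for the string eval only; the assignment's value is the structure.
my $restored = do { no strict 'vars'; eval $dumped };
die "could not evaluate dump: $@" if $@;

# Compare structures, not dump text.
is_deeply($restored, $data, 'dump round-trips to the same structure');
done_testing();
```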

Locale decimal point character no longer leaks outside of use locale scope

This is actually a bug fix, but some code has come to rely on the bug being present, so this change is listed here. The current locale that the program is running under is not supposed to be visible to Perl code except within the scope of a use locale. However, until now under certain circumstances, the character used for a decimal point (often a comma) leaked outside the scope. If your code is affected by this change, simply add a use locale.

Assignments of Windows sockets error codes to $! now prefer errno.h values over WSAGetLastError() values

In previous versions of Perl, Windows sockets error codes as returned by WSAGetLastError() were assigned to $!, and some constants such as ECONNABORTED, not in errno.h in VC++ (or the various Windows ports of gcc) were defined to corresponding WSAE* values to allow $! to be tested against the E* constants exported by Errno and POSIX.

This worked well until VC++ 2010 and later, which introduced new E* constants with values > 100 into errno.h, including some being (re)defined by perl to WSAE* values. That caused problems when linking XS code against other libraries which used the original definitions of errno.h constants.

To avoid this incompatibility, perl now maps WSAE* error codes to E* values where possible, and assigns those values to $!. The E* constants exported by Errno and POSIX are updated to match so that testing $! against them, wherever previously possible, will continue to work as expected, and all E* constants found in errno.h are now exported from those modules with their original errno.h values.

In order to avoid breakage in existing Perl code which assigns WSAE* values to $!, perl now intercepts the assignment and performs the same mapping to E* values as it uses internally when assigning to $! itself.

However, one backwards-incompatibility remains: existing Perl code which compares $! against the numeric values of the WSAE* error codes that were previously assigned to $! will now be broken in those cases where a corresponding E* value has been assigned instead. This is only an issue for those E* values < 100, which were always exported from Errno and POSIX with their original errno.h values, and therefore could not be used for WSAE* error code tests (e.g. WSAEINVAL is 10022, but the corresponding EINVAL is 22). (E* values > 100, if present, were redefined to WSAE* values anyway, so compatibility can be achieved by using the E* constants, which will work both before and after this change, albeit using different numeric values under the hood.)
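A minimal sketch of the recommended style, comparing against the symbolic constants exported by Errno rather than hard-coded numbers. describe_errno is a hypothetical helper, and the assignment to $! merely simulates a failed socket call.

```perl
use strict;
use warnings;
use Errno qw(EINVAL ECONNREFUSED);

# Hypothetical helper: classify a numeric errno value by constant,
# never by a literal such as 22 or 10022.
sub describe_errno {
    my ($errno) = @_;
    return 'invalid argument'   if $errno == EINVAL;
    return 'connection refused' if $errno == ECONNREFUSED;
    return 'other';
}

# Simulate a failure; a real program would inspect $! immediately
# after the failing operation instead.
$! = ECONNREFUSED;
my $label = describe_errno($! + 0);

print "$label\n";
```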

Functions PerlIO_vsprintf and PerlIO_sprintf have been removed

These two functions, undocumented, unused in CPAN, and problematic, have been removed.

Deprecations

The /\C/ character class

The /\C/ regular expression character class is deprecated. From perl 5.22 onwards it will generate a warning, and from perl 5.24 onwards it will be a regular expression compiler error. If you need to examine the individual bytes that make up a UTF8-encoded character, then use utf8::encode() on the string (or a copy) first.
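A short sketch of the suggested alternative, working on a copy so that the original character string is left untouched (the sample character is arbitrary):

```perl
use strict;
use warnings;

# One character, U+263A, held as a Perl character string.
my $char = "\x{263A}";

# Encode a copy in place; the copy then holds the UTF-8 bytes.
my $bytes = $char;
utf8::encode($bytes);

# Examine the individual bytes.
my @octets = map { sprintf '%02X', $_ } unpack 'C*', $bytes;

print "characters: ", length($char), ", bytes: ", length($bytes), " (@octets)\n";
```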

Literal control characters in variable names

This deprecation affects things like $\cT, where \cT is a literal control (such as a NAK or NEGATIVE ACKNOWLEDGE character) in the source code. Surprisingly, it appears that originally this was intended as the canonical way of accessing variables like $^T, with the caret form only being added as an alternative.

The literal control form is being deprecated for two main reasons. It has what are likely unfixable bugs, such as $\cI not working as an alias for $^I, and its usage is not portable to non-ASCII platforms: while $^T will work everywhere, \cT is whitespace in EBCDIC. [perl #119123]

References to non-integers and non-positive integers in $/

Setting $/ to a reference to zero or a reference to a negative integer is now deprecated, and will behave exactly as though it was set to undef. If you want slurp behavior, set $/ to undef explicitly.

Setting $/ to a reference to a non-integer is now forbidden and will throw an error. Perl has never documented what would happen in this context, and while it used to behave the same as setting $/ to the address of the reference, it may behave differently in future, so we have forbidden this usage.
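For contrast, a sketch of the two usages that remain supported: a reference to a positive integer for fixed-size records, and an explicit undef for slurping. The data and the in-memory filehandle are made up for the example.

```perl
use strict;
use warnings;

my $data = "abcdefghij";

# Fixed-size records: a reference to a positive integer.
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my @records;
{
    local $/ = \4;
    @records = <$fh>;
}
close $fh;

# Slurp mode: set $/ to undef explicitly.
open $fh, '<', \$data or die "cannot open in-memory file: $!";
my $all;
{
    local $/ = undef;
    $all = <$fh>;
}
close $fh;

print "records: @records\n";
print "slurped: $all\n";
```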

Character matching routines in POSIX

Use of any of these functions in the POSIX module is now deprecated: isalnum, isalpha, iscntrl, isdigit, isgraph, islower, isprint, ispunct, isspace, isupper, and isxdigit. The functions are buggy and don't work on UTF-8 encoded strings. See their entries in POSIX for more information.

A warning is raised on the first call to any of them from each place in the code that they are called. (Hence a repeated statement in a loop will raise just the one warning.)
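One possible replacement, sketched with POSIX bracket classes in a regular expression. The helper names are hypothetical, and this is not claimed to be a drop-in equivalent: for instance, these helpers deliberately return false for the empty string.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the string is non-empty and consists
# solely of alphabetic characters.
sub is_alpha {
    my ($string) = @_;
    return defined $string && $string =~ /\A[[:alpha:]]+\z/ ? 1 : 0;
}

# Hypothetical helper: true if the string is non-empty and consists
# solely of whitespace characters.
sub is_space {
    my ($string) = @_;
    return defined $string && $string =~ /\A[[:space:]]+\z/ ? 1 : 0;
}

print is_alpha('Perl'), is_alpha('Perl5'), is_space(" \t"), "\n";
```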

Interpreter-based threads are now discouraged

The "interpreter-based threads" provided by Perl are not the fast, lightweight system for multitasking that one might expect or hope for. Threads are implemented in a way that makes them easy to misuse. Few people know how to use them correctly or will be able to provide help.

The use of interpreter-based threads in perl is officially discouraged.

Module removals

The following modules will be removed from the core distribution in a future release, and will at that time need to be installed from CPAN. Distributions on CPAN which require these modules will need to list them as prerequisites.

The core versions of these modules will now issue "deprecated"-category warnings to alert you to this fact. To silence these deprecation warnings, install the modules in question from CPAN.

Note that the planned removal of these modules from core does not reflect a judgement about the quality of the code and should not be taken as a suggestion that their use be halted. Their disinclusion from core primarily hinges on their necessity to bootstrapping a fully functional, CPAN-capable Perl installation, not on concerns over their design.

CGI and its associated CGI:: packages
inc::latest
Package::Constants
Module::Build and its associated Module::Build:: packages

Utility removals

The following utilities will be removed from the core distribution in a future release, and will at that time need to be installed from CPAN.

find2perl
s2p
a2p

Performance Enhancements

Modules and Pragmata

New Modules and Pragmata

Updated Modules and Pragmata

Documentation

New Documentation

perlrepository

This document was removed (actually, renamed perlgit and given a major overhaul) in Perl v5.14, causing Perl documentation websites to show the now out of date version in Perl v5.12 as the latest version. It has now been restored in stub form, directing readers to current information.

Changes to Existing Documentation

perldata

perldebguts

perlexperiment

perlfunc

perlguts

perlhack

perlhacktips

perllexwarn

perllocale

perlop

perlopentut

perlre

perlreguts

perlsub

perltrap

perlunicode

perlvar

perlxs

Diagnostics

The following additions or changes have been made to diagnostic output, including warnings and fatal error messages. For the complete list of diagnostic messages, see perldiag.

New Diagnostics

New Errors

New Warnings

Changes to Existing Diagnostics

Utility Changes

a2p

bisect.pl

The git bisection tool Porting/bisect.pl has had many enhancements.

It is provided as part of the source distribution but not installed because it is not self-contained as it relies on being run from within a git checkout. Note also that it makes no attempt to fix tests, correct runtime bugs or make something useful to install - its purpose is to make minimal changes to get any historical revision of interest to build and run as close as possible to "as-was", and thereby make git bisect easy to use.

find2perl

perlbug

Configuration and Compilation

Testing

Platform Support

New Platforms

Android

Perl can now be built for Android, either natively or through cross-compilation, for all three currently available architectures (ARM, MIPS, and x86), on a wide range of versions.

Bitrig

Compile support has been added for Bitrig, a fork of OpenBSD.

FreeMiNT

Support has been added for FreeMiNT, a free open-source OS for the Atari ST system and its successors, based on the original MiNT that was officially adopted by Atari.

Synology

Synology ships its NAS boxes with a lean Linux distribution (DSM) on relatively cheap CPUs (like the Marvell Kirkwood mv6282 - ARMv5tel or Freescale QorIQ P1022 ppc - e500v2) not meant for workstations or development. These boxes should build now. The basic problem is the non-standard location of tools.

Discontinued Platforms

sfio

Code related to supporting the sfio I/O system has been removed.

Perl 5.004 added support to use the native API of sfio, AT&T's Safe/Fast I/O library. This code still built with v5.8.0, albeit with many regression tests failing, but was inadvertently broken before the v5.8.1 release, meaning that it has not worked on any version of Perl released since then. In over a decade we have received no bug reports about this, hence it is clear that no-one is using this functionality on any version of Perl that is still supported to any degree.

AT&T 3b1

Configure support for the 3b1, also known as the AT&T Unix PC (and the similar AT&T 7300), has been removed.

DG/UX

DG/UX was a Unix sold by Data General. The last release was in April 2001. It only runs on Data General's own hardware.

EBCDIC

In the absence of a regular source of smoke reports, code intended to support native EBCDIC platforms will be removed from perl before 5.22.0.

Platform-Specific Notes

Cygwin
GNU/Hurd

The BSD compatibility library libbsd is no longer required for builds.

Linux

The hints file now looks for libgdbm_compat only if libgdbm itself is also wanted. The former is never useful without the latter, and in some circumstances, including it could actually prevent building.

Mac OS

The build system now honors an ld setting supplied by the user running Configure.

MidnightBSD

objformat was removed from version 0.4-RELEASE of MidnightBSD and had been deprecated on earlier versions. This caused the build environment to be erroneously configured for a.out rather than elf. This has now been corrected.

Mixed-endian platforms

The code supporting pack and unpack operations on mixed endian platforms has been removed. We believe that Perl has long been unable to build on mixed endian architectures (such as PDP-11s), so we don't think that this change will affect any platforms which were able to build v5.18.0.

VMS
Win32
WinCE

Internal Changes

Selected Bug Fixes

Regular Expressions

Perl 5 Debugger and -d

Lexical Subroutines

Everything Else

Known Problems

Obituary

Diana Rosa, 27, of Rio de Janeiro, went to her long rest on May 10, 2014, along with the plush camel she kept hanging on her computer screen all the time. She was a passionate Perl hacker who loved the language and its community, and who never missed a Rio.pm event. She was a true artist, an enthusiast about writing code, singing arias and graffiting walls. We'll never forget you.

Greg McCarroll died on August 28, 2013.

Greg was well known for many good reasons. He was one of the organisers of the first YAPC::Europe, which concluded with an unscheduled auction where he frantically tried to raise extra money to avoid the conference making a loss. It was Greg who mistakenly arrived for a london.pm meeting a week late; some years later he was the one who sold the choice of official meeting date at a YAPC::Europe auction, and eventually as glorious leader of london.pm he got to inherit the irreverent confusion that he had created.

Always helpful, friendly and cheerfully optimistic, you will be missed, but never forgotten.

Acknowledgements

Perl 5.20.0 represents approximately 12 months of development since Perl 5.18.0 and contains approximately 470,000 lines of changes across 2,900 files from 124 authors.

Excluding auto-generated files, documentation and release tools, there were approximately 280,000 lines of changes to 1,800 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.20.0:

Aaron Crane, Abhijit Menon-Sen, Abigail, Abir Viqar, Alan Haggai Alavi, Alan Hourihane, Alexander Voronov, Alexandr Ciornii, Andy Dougherty, Anno Siegel, Aristotle Pagaltzis, Arthur Axel 'fREW' Schmidt, Brad Gilbert, Brendan Byrd, Brian Childs, Brian Fraser, Brian Gottreu, Chris 'BinGOs' Williams, Christian Millour, Colin Kuskie, Craig A. Berry, Dabrien 'Dabe' Murphy, Dagfinn Ilmari Mannsåker, Daniel Dragan, Darin McBride, David Golden, David Leadbeater, David Mitchell, David Nicol, David Steinbrunner, Dennis Kaarsemaker, Dominic Hargreaves, Ed Avis, Eric Brine, Evan Zacks, Father Chrysostomos, Florian Ragwitz, François Perrad, Gavin Shelley, Gideon Israel Dsouza, Gisle Aas, Graham Knop, H.Merijn Brand, Hauke D, Heiko Eissfeldt, Hiroo Hayashi, Hojung Youn, James E Keenan, Jarkko Hietaniemi, Jerry D. Hedden, Jess Robinson, Jesse Luehrs, Johan Vromans, John Gardiner Myers, John Goodyear, John P. Linderman, John Peacock, kafka, Kang-min Liu, Karen Etheridge, Karl Williamson, Keedi Kim, Kent Fredric, kevin dawson, Kevin Falcone, Kevin Ryde, Leon Timmermans, Lukas Mai, Marc Simpson, Marcel Grünauer, Marco Peereboom, Marcus Holland-Moritz, Mark Jason Dominus, Martin McGrath, Matthew Horsfall, Max Maischein, Mike Doherty, Moritz Lenz, Nathan Glenn, Nathan Trapuzzano, Neil Bowers, Neil Williams, Nicholas Clark, Niels Thykier, Niko Tyni, Olivier Mengué, Owain G. Ainsworth, Paul Green, Paul Johnson, Peter John Acklam, Peter Martini, Peter Rabbitson, Petr Písař, Philip Boulain, Philip Guenther, Piotr Roszatycki, Rafael Garcia-Suarez, Reini Urban, Reuben Thomas, Ricardo Signes, Ruslan Zakirov, Sergey Alekseev, Shirakata Kentaro, Shlomi Fish, Slaven Rezic, Smylers, Steffen Müller, Steve Hay, Sullivan Beck, Thomas Sibley, Tobias Leich, Toby Inkster, Tokuhiro Matsuno, Tom Christiansen, Tom Hukins, Tony Cook, Victor Efimov, Viktor Turskyi, Vladimir Timofeev, YAMASHINA Hio, Yves Orton, Zefram, Zsbán Ambrus, Ævar Arnfjörð Bjarmason.

The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker.

Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see the AUTHORS file in the Perl source distribution.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V, will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.