NAME

perlintern - autogenerated documentation of purely internal Perl functions

DESCRIPTION

This file is the autogenerated documentation of functions in the Perl interpreter that are documented using Perl's internal documentation format but are not marked as part of the Perl API. In other words, they are not for use in extensions!

Exceptions are the public ck_ check functions and pp_ push/pop opcodes, which are part of the public API available to extensions, but still documented only here, until they can be moved to the perlapi pod.

Array Manipulation Functions

av_reify

The elements of the @_ argarray, marked as !AvREAL && AvREIFY, become real refcounted SVs.

Bumps the refcount for every non-empty value, and clears all the invalid values.

Sets AvREIFY_off and AvREAL_on.

AvREAL arrays handle refcounts of the elements, !AvREAL arrays ignore them.

        void    av_reify(AV *av)
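
The effect of reifying can be sketched with a toy model. This is not the perl implementation: ToySV, ToyAV and toy_av_reify are hypothetical stand-ins, and treating a "freed" flag as the marker for an invalid value is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for SV and AV; not the real perl structs. */
typedef struct { int refcnt; bool freed; } ToySV;
typedef struct {
    ToySV **elems;   /* element pointers; NULL marks an empty slot */
    size_t  len;
    bool    real;    /* models AvREAL  */
    bool    reify;   /* models AvREIFY */
} ToyAV;

/* Make a !real && reify array own its elements. */
static void toy_av_reify(ToyAV *av)
{
    assert(av != NULL);
    if (av->real)
        return;                      /* already owns its elements */
    for (size_t i = 0; i < av->len; i++) {
        ToySV *sv = av->elems[i];
        if (sv == NULL)
            continue;                /* empty slot, nothing to count */
        if (sv->freed)
            av->elems[i] = NULL;     /* assumed "invalid" value: clear the slot */
        else
            sv->refcnt++;            /* live value: the array now holds a reference */
    }
    av->reify = false;               /* models AvREIFY_off */
    av->real  = true;                /* models AvREAL_on   */
}
```

After the call, every surviving element carries one extra reference, so the array can later release its elements like any other AvREAL array.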

Check routines

A check routine is called at the end of the "newOP" creation routines. So at the point that a ck_ routine fires, we have no idea what the context is, either upward in the syntax tree, or either forward or backward in the execution order.

Lexical slots (op_targ) are also not yet known; they are assigned at the end of a check function in op_std_init(o). See the comments at the top of op.c for details.

See regen/opcodes for which opcode calls which check function. Not all ops have a specific check function.

"ck_fun" is a generic arity type checker, "ck_type" a generic type checker for un- and binops.

fold_constants(op_integerize(op_std_init(o))) is the default treatment, i.e. fold constants, apply use integer optimizations and initialize the op_targ for uninitialized pads.

Prototypes are generated by regen/embed_lib.pl by scanning regen/opcodes; check functions are not in embed.fnc.

aassign_padcheck

Helper function for S_aassign_scan().

Check a PAD-related op for commonality and/or set its generation number. Returns a boolean indicating whether it's shared.

        bool    aassign_padcheck(OP* o, bool rhs)

aassign_scan

Helper function for OPpASSIGN_COMMON* detection in rpeep(). It scans the left or right hand subtree of the aassign op, and returns a set of flags indicating what sorts of things it found there. 'rhs' indicates whether we're scanning the LHS or RHS. If the former, we set PL_generation on lexical vars; if the latter, we see if PL_generation matches.

'top' indicates whether we're recursing or at the top level. 'scalars_p' is a pointer to a counter of the number of scalar SVs seen. This fn will increment it by the number seen. It's not intended to be an accurate count (especially as many ops can push a variable number of SVs onto the stack); rather it's used to test whether there can be at most 1 SV pushed; so its only meanings are "0, 1, many".

        int     aassign_scan(OP* o, bool rhs, bool top,
                             int *scalars_p)

arg_check_type

Check if the declared static type of the argument from pn can be fulfilled by the dynamic type of the arg in OP* o (padsv, const, any return type). If possible, add a typecast to o to fulfill it. Contravariant.

Signatures are new, hence much stricter than return-types and assignments.

                arg_check_type;

arg_check_type_sv

Check if the declared static type of the argument from pn can be fulfilled by the dynamic type of the arg in SV* sv. The run-time variant of arg_check_type, contravariant.

Signatures are new, hence much stricter than return-types and assignments.

NOTE: the perl_ form of this function is deprecated.

        void    arg_check_type_sv(const PADNAME* pn, SV* sv,
                                  GV *cvname)
arg_type_sv

Return the type for the sv. Optionally sets usertype and u8 (if usertype is utf8), when usertype is not NULL, and the SV is blessed.

                arg_type_sv;
can_class_typecheck

Returns 1 if this class has a compile-time @ISA or we are already at the run-time phase. This is not called for coretypes, coretypes would always return 1.

Check for class or package types. Does the class have a compile-time ISA to allow compile-time checks? #249 HvCLASS: Is it a cperl class? Does it use base or fields? If not, this check cannot be done before run-time.

(Essentially cperl classes are just syntactic and performance optimized sugar over base/fields with roles and multi-dispatch support. We don't invent anything new, we just fix what p5p broke.)

                can_class_typecheck;
check_for_bool_cxt

See if the ops following o are such that o will always be executed in boolean context: that is, the SV which o pushes onto the stack will only ever be consumed by later ops via SvTRUE(sv) or similar. If so, set a suitable private flag on o. Normally this will be bool_flag; but see below why maybe_flag is needed too.

Typically the two flags you pass will be the generic OPpTRUEBOOL and OPpMAYBE_TRUEBOOL, but it's possible that for some ops those bits may already be taken, so you'll have to give that op two different flags.

More explanation of 'maybe_flag' and 'safe_and' parameters. The binary logical ops &&, ||, // (plus 'if' and 'unless' which use those underlying ops) short-circuit, which means that rather than necessarily returning a truth value, they may return the LH argument, which may not be boolean. For example in $x = (keys %h || -1), keys should return a key count rather than a boolean, even though it's sort of being used in boolean context.

So we only consider such logical ops to provide boolean context to their LH argument if they themselves are in void or boolean context. However, sometimes the context isn't known until run-time. In this case the op is marked with maybe_flag.

Consider the following.

    sub f { ....;  if (%h) { .... } }

This is actually compiled as

    sub f { ....;  %h && do { .... } }

Here we won't know until runtime whether the final statement (and hence the &&) is in void context and so is safe to return a boolean value. So mark o with maybe_flag rather than the bool_flag. Note that there is a cost associated with determining context at runtime (e.g. a call to block_gimme()), so it may not be worth setting (at compile time) and testing (at runtime) maybe_flag if the scalar versus boolean cost savings are marginal.

However, we can do slightly better with && (compared to || and //): this op only returns its LH argument when that argument is false. In this case, as long as the op promises to return a false value which is valid in both boolean and scalar contexts, we can mark an op consumed by && with bool_flag rather than maybe_flag. For example as long as pp_padhv and pp_rv2hv return SV_ZERO rather than SV_NO for a false result in boolean context, then it's safe. An op which promises to handle this case is indicated by setting safe_and to true.

        void    check_for_bool_cxt(OP* o, bool safe_and,
                                   U8 bool_flag, U8 maybe_flag)
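
The flag choice described above can be summarised as a small decision function. This is a simplified sketch, not the code in op.c: the enum names and toy_pick_flag are hypothetical, and the model looks at a single logical op instead of following the whole op_next chain.

```c
#include <stdbool.h>

/* Context of the logical op that consumes o (hypothetical names). */
typedef enum { CXT_VOID, CXT_BOOL, CXT_SCALAR, CXT_RUNTIME } ToyCxt;
typedef enum { LOGOP_AND, LOGOP_OR, LOGOP_DOR } ToyLogop;
typedef enum { FLAG_NONE, FLAG_BOOL, FLAG_MAYBE } ToyFlag;

/* Decide which private flag the LH operand o would get. */
static ToyFlag toy_pick_flag(ToyLogop op, ToyCxt cxt, bool safe_and)
{
    if (cxt == CXT_VOID || cxt == CXT_BOOL)
        return FLAG_BOOL;    /* the result is only ever tested for truth */
    if (op == LOGOP_AND && safe_and)
        return FLAG_BOOL;    /* && only returns a false LH value, which o promises is safe */
    if (cxt == CXT_RUNTIME)
        return FLAG_MAYBE;   /* context is not known until run-time */
    return FLAG_NONE;        /* known non-boolean context: leave o alone */
}
```

For the sub f example above, the && is in run-time context, so %h would get maybe_flag unless the op promises the safe_and behaviour.
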
ck_aassign

CHECK callback for aassign (t2 L L "(:List,:List):List")

Checks types and adds OPpMAP_PAIR to %hash = map.

TODO: constant folding with OpSPECIAL

TODO: fill lhs AvFILLp with gh210-computedsizearydecl

        OP *    ck_aassign(OP *o)

ck_aelem

Check for typed and shaped arrays, and promote ops.

With constant indices throws compile-time "Array index out of bounds" and "Too many elements" errors.

No natively typed arrays yet.

        OP *    ck_aelem(OP *o)

ck_anoncode

CHECK callback for anoncode (s$ S)

Creates an anon pad.

NOTE: the perl_ form of this function is deprecated.

        OP *    ck_anoncode(OP *o)
ck_backtick

CHECK callback for `` and qx (tu% S?)

Handle readpipe overrides, the missing default argument and apply $^H{open_IN} or $^H{open_OUT} io hints.

TODO: Handle cperl macro `` unquote syntax here later.

        OP *    ck_backtick(OP *o)

ck_bitop

CHECK callback for all bitops, whether generic, integer or string variants.

Integerize the results (as if under use integer), and handle some warnings.

        OP *    ck_bitop(OP *o)
ck_cmp

CHECK callback for numeric comparisons (all but *cmp).

Optimize index() == -1 or < 0 into OPpINDEX_BOOLNEG, ditto != -1 or >= 0 into OPpTRUEBOOL.

Warn on $[ (did you mean $] ?)

        OP *    ck_cmp(OP *o)
ck_concat

CHECK callback for concat

Handles STACKED. Leaves out op_integerize, as concat is for strings only.

        OP *    ck_concat(OP *o)

ck_defined

CHECK callback for defined (isu% S? "(:Scalar):Bool")

Errors now on @array and %hash arguments.

Also calls "ck_rfun", turning the argument into a reference, which is still useful for defined &sub, not calling sub, just checking if &sub has a body.

        OP *    ck_defined(OP *o)

ck_delete

CHECK callback for delete (% S "(:Str):Void")

Handle array and hash elements and slices.

        OP *    ck_delete(OP *o)
ck_each

CHECK callback for each, values and keys and their array variants.

Optimizes into the array specific variants, checks for type errors, and dies on the old 5.14 experimental feature which allowed each, keys, push, pop, shift, splice, unshift, and values to be called with a scalar argument. See "Syntactical Enhancements" in perl5140delta. This experiment is considered unsuccessful, and has been removed.

        OP *    ck_each(OP *o)
ck_eof

CHECK callback for getc and eof (is% F?)

In particular, sets the missing default argument to *ARGV.

        OP *    ck_eof(OP *o)

ck_eval

CHECK callback for entereval (du% S?) and entertry (d|)

...

        OP *    ck_eval(OP *o)

ck_exec

CHECK callback for system and exec (imsT@ S? L)

If as list or string.

        OP *    ck_exec(OP *o)
ck_exists

CHECK callback for exists (is% S "(:Str):Bool")

Handle hash or array elements, and ref subs.

        OP *    ck_exists(OP *o)
ck_ftst

CHECK callback for stat, lstat (u- F?) and the -X file tests (isu- F-+)

Handle _ and a missing optional arg.

        OP *    ck_ftst(OP *o)
ck_fun

CHECK callback for the rest

Checks and fixes arguments of internal op calls, but not entersub user-level signatured or prototyped calls. Throws arity errors and unifies the arg list, e.g. adds a scalar cast, adds $_ ...

        OP *    ck_fun(OP *o)

ck_glob

CHECK callback for glob (t@ S?)

glob defaults its first arg to $_

Also handles initializing an optional external File::Glob hook on certain platforms.

        OP *    ck_glob(OP *o)

ck_grep

CHECK callback for grepstart and mapstart (m@ C L)

Handles BLOCK and ordinary comma style, throwing an error if the comma-less version is not on a BLOCK.

Applies lexical $_ optimization or handles the default $_.

        OP *    ck_grep(OP *o)
ck_index

CHECK callback for index, rindex (sT@ S S S?)

Does compile-time fbm (Boyer-Moore) compilation on a constant string.

        OP *    ck_index(OP *o)

ck_length

CHECK callback for length, only needed to throw compile-time warnings when length is mixed up with scalar.

        OP *    ck_length(OP *o)
ck_lfun

CHECK callback for {i_,}{pre,post}{inc,dec} (dIs1 S) and sprintf.

Turns on MOD on all kids, setting it to an lvalue function. See "modkids".

        OP *    ck_lfun(OP *o)

ck_listiob

CHECK callback for prtf,print,say (ims@ F? L)

Checks for a bareword filehandle as the first argument when it is not followed by a comma. Also checks whether a list argument was provided, and adds $_ if not.

        OP *    ck_listiob(OP *o)

ck_match

CHECK callback for match,qr,subst,trans,transr

Sets TARGET_MY and the targ offset on my $_ (not with qr), which avoids runtime lookup of the global $_.

Note: This optimization was removed in perl5 with 5.24. In perl5 you have to fight with other dynamic default topics in blocks, overwriting each other.

        OP *    ck_match(OP *o)

ck_method

CHECK callback for method (d.)

Creates one of the 4 METHOP ops. Checks for static SUPER:: calls. See also "ck_subr".

        OP *    ck_method(OP *o)

ck_negate

Check the ! op, negate and turn off OPpCONST_STRICT of the argument.

        OP *    ck_negate(OP *o)

ck_nomg

For tie and bless

Check if the first argument is not a typed coretype. We guarantee coretyped variables to have no magic.

For bless we also require a ref. Check for the most common mistakes as first argument, which cannot be a ref.

For bless we can predict the result type if the 2nd arg is a constant. This allows typing the result of the new method.

    sub D3::new {bless[],"D3"};
    my B2 $obj1 = D3->new;

We disallow the blessing to coretypes. This needs to be done via normal compile-time declarations, not dynamic blessing.

        OP *    ck_nomg(OP *o)

ck_pad

Check for const and types. When called from newOP/newPADOP this is too early, as the target is attached later. But we also call it from constant folding. Having an explicit CONST op allows constant optimizations on it.

        OP *    ck_pad(OP *o)

ck_readline

CHECK callback for readline, the <> op. (t% F? "(:Scalar?):Any")

Adds *ARGV if missing.

        OP *    ck_readline(OP *o)

ck_rfun

CHECK callback for lock (s% R)

Calls "refkids" to turn the argument into a reference.

Remember that lock can be called on everything: scalar, ref, array, hash or sub. Internally it is better to work with a scalar reference.

        OP *    ck_rfun(OP *o)

ck_rvconst

CHECK callback for rv2[gsc]const (ds1 R "(:Ref):Scalar")

Error on bareword constants, initialize the symbol.

        OP *    ck_rvconst(OP *o)
ck_sassign

CHECK callback for sassign (s2 S S "(:Scalar,:Scalar):Scalar")

Esp. handles state var initialization and tries to optimize away the assignment for a lexical $_ via "maybe_targlex".

Checks types.

TODO: constant folding with OpSPECIAL

        OP *    ck_sassign(OP *o)

ck_smartmatch

CHECK callback for smartmatch (s2)

Rearranges the kids to refs if not SPECIAL, and optimizes the runtime MATCH to a compile-time QR.

        OP *    ck_smartmatch(OP *o)

ck_spair

CHECK callback for chop, chomp and refgen with optional lists

Transforms single-element lists into the single argument variant op srefgen, schop, schomp.

        OP *    ck_spair(OP *o)
ck_subr

CHECK callback for entersub, enterxssub, enterffi. All (dm1 L). See also "ck_method".

        OP *    ck_subr(OP *o)

ck_substr

CHECK callback for substr (st@ S S S? S?), turning the 4 arg variant into an lvalue sub.

        OP *    ck_substr(OP *o)

ck_svconst

CHECK callback for const (ps$ "():Scalar") and hintseval (s$)

Turns on COW and READONLY for the scalar.

        OP *    ck_svconst(OP *o)

ck_tell

CHECK callback for tell and seek.

        OP *    ck_tell(OP *o)

ck_trunc

CHECK callback for truncate (is@ S S).

truncate really behaves as if it had both "S S" and "F S", i.e. with a bare handle argument it turns on SPECIAL and turns off CONST_STRICT.

        OP *    ck_trunc(OP *o)

ck_type

Check unop and binops for typed args, find specialized match and promote. Forget about native types (escape analysis) here, use the boxed variants. We can only unbox them later in rpeep sequences, by adding unbox...box ops. Set the OpRETTYPE of unops and binops.

        OP *    ck_type(OP *o)

cv_type_set

NOTE: this function is experimental and may change or be removed without notice.

Set the return type of a sub.

When the padnames are still the default compile-time comppad_name, we have to clone it to be able to set private [0] types for each subroutine, so that they do not overwrite each other. If it's a private CvPADLIST already, don't clone it.

        void    cv_type_set(CV *cv, HV *stash)
inplace_aassign

Check for in place reverse and sort assignments like "@a = reverse @a" and modify the optree to make them work inplace.

                inplace_aassign;
io_hints

Apply $^H{open_IN} or $^H{open_OUT} io hints, by setting op_private bits for raw or crlf.

        void    io_hints(OP* o)
is_types_strict

Check if the current lexical block has use types 'strict' enabled.

                is_types_strict;
match_type

Match a coretype from arg or op (atyp) to the declared stash of a variable (dtyp). Searches stash in @aname::ISA (contravariant, for arguments).

The castable parameter allows inserting a type cast (numify), e.g. Scalar => Bool/Numeric.

Currently castable is only: Scalar/Ref/Sub/Regexp => Bool/Numeric

Maybe allow casting from Scalar/Numeric to Int => int() and Scalar to Str => stringify()

On atyp == type_Object check the name and its ISA instead.

        int     match_type(const HV* stash, core_types_t atyp,
                           const char* aname, bool au8,
                           int *castable)

match_type1

Match a UNOP type with the given arg.

        int     match_type1(const U32 sig, core_types_t arg1)
match_type2

Match a BINOP type with the given args.

        int     match_type2(const U32 sig, core_types_t arg1,
                            core_types_t arg2)
match_user_type

Match a usertype from argument (aname+au8) to the declared usertype name of a variable (dstash). Searches dstash in @aname::ISA (contravariant, for arguments).

For return-type checks the arguments are passed in reversed order (covariant).

Note that old-style package ISA's are created dynamically. Only classes with compile-time known ISA's can be checked at compile-time. These are currently: use base/fields using Internals::HvCLASS, and later the perl6 syntax class Name is Parent {}

                match_user_type;

maybe_multideref

Given an op_next chain of ops beginning at 'start' that potentially represent a series of one or more aggregate derefs (such as $a->[1]{$key}), examine the chain, and if appropriate, convert the whole chain to a single OP_MULTIDEREF op (maybe with a few additional ops left in too).

The caller will have already verified that the first few ops in the chain following 'start' indicate a multideref candidate, and will have set 'orig_o' to the point further on in the chain where the first index expression (if any) begins. 'orig_action' specifies what type of beginning has already been determined by the ops between start..orig_o (e.g. $lex_ary[], $pkg_ary->{}, expr->[], etc).

'hints' contains any hints flags that need adding (currently just OPpHINT_STRICT_REFS) as found in any rv2av/hv skipped by the caller.

        void    maybe_multideref(OP *start, OP *orig_o,
                                 UV orig_action, U8 hints)

maybe_targlex

Sets the possible lexical $_ TARGET_MY optimization, skipping a scalar assignment.

        OP*     maybe_targlex(OP* o)

mderef_uoob_gvsv

Check the key index sv of the first INDEX_gvsv of a MDEREF_AV, compare it with the given key, and set INDEX_uoob.

Only available without threads. Threaded perls use "mderef_uoob_targ" instead.

        bool    mderef_uoob_gvsv(OP* o, SV* idx)

mderef_uoob_targ

Check the targ of the first INDEX_padsv of a MDEREF_AV, compare it with the given targ, and set INDEX_uoob.

        bool    mderef_uoob_targ(OP* o, PADOFFSET targ)

op_typed_user

Return the type as core_types_t enum of the op. User-defined types are only returned as type_Object, get the name of those with S_typename().

TODO: add defined return types of all ops, and user-defined CV types for entersub.

u8 returns 0 or 1 (HEKf_UTF8), not SVf_UTF8

        core_types_t op_typed_user(OP* o, char** usertype,
                                   int* u8)
_op_check_type

Check if the declared static type of the op (i.e. assignment) from the lhs pn can be fulfilled by the dynamic type of the rhs in OP* o (padsv, const, any return type). If possible, add a typecast to o to fulfill it.

Unlike arg_check_type, a type violation is not fatal; it only throws a compile-time warning when no applicable type-conversion can be applied. Return-types and assignments are passed through the type inferencer and applied to old constructs, not signatures, hence not so strict.

Contravariant: Enables you to use a more generic (less derived) type than originally specified.

But note this special implicit perl case:

    scalar = list; # (array|hash) <=> scalar = shift list;

                _op_check_type;

peep

        void    peep(OP* o)
peep_leaveloop

Check loop bounds and possibly turn aelem/mderef/aelemfast_lex into an unchecked faster aelem_u.

1) If the index is bound to size/arylen, optimize to unchecked aelem_u variants, even without parametrized types. We need to check the right array, and whether the loop index is used as is or within an expression.

2) with static bounds check unrolling.

3) with static ranges and shaped arrays, can possibly optimize to aelem_u

Returns TRUE when some op was changed.

        bool    peep_leaveloop(BINOP* leave, OP* from, OP* to)
ret_check_type

Check if the declared static type of the return type from the lhs pn can be fulfilled by the dynamic type of the rhs in OP* o (padsv, const, any return type). If possible, add a typecast to o to fulfill it.

Unlike arg_check_type, a type violation is not fatal; it only throws a compile-time warning when no applicable type-conversion can be applied. Return-types and assignments are passed through the type inferencer and applied to old constructs, not signatures, hence not so strict.

Covariant: Enables you to use a more derived type than originally specified.

                ret_check_type;

rpeep

The peephole optimizer. We visit the ops in the order they're to execute. See the comments at the top of this file for more details about when peep() is called.

Warning: rpeep is not a real peephole optimizer as other compilers implement it due to historic ballast. It started more as a glorified op nullifier. It sets op_opt when done, and does not run it again when it sees this flag at the op. When it's set it turns the op into NULL.

More important, it sets op_opt to 1 by default, even if it has no intention to nullify ("optimize away") the current op. Any optimization which wants to keep the op needs to unset op_opt.

        void    rpeep(OP* o)
stash_to_coretype

stash_to_coretype(HV* stash) converts the name of the padname type to the core_types_t enum.

For native types we still return the non-native counterpart. PERL_NATIVE_TYPES is implemented in the native type branch, with escape analysis, upgrading long-enough sequences to native ops in rpeep.

        core_types_t stash_to_coretype(const HV* stash)

Compile-time scope hooks

BhkENTRY

NOTE: this function is experimental and may change or be removed without notice.

Return an entry from the BHK structure. The which argument is a preprocessor token indicating which entry to return. If the appropriate flag is not set this will return NULL. The type of the return value depends on which entry you ask for.

        void *  BhkENTRY(BHK *hk, which)
BhkFLAGS

NOTE: this function is experimental and may change or be removed without notice.

Return the BHK's flags.

        U32     BhkFLAGS(BHK *hk)
CALL_BLOCK_HOOKS

NOTE: this function is experimental and may change or be removed without notice.

Call all the registered block hooks for type which. The which argument is a preprocessing token; the type of arg depends on which.

        void    CALL_BLOCK_HOOKS(which, arg)
fold_constants

Apply constant folding to a scalar at compile-time, via a fake eval. Returns a new op_folded op which replaces the old constant expression, or the old unfolded op.

        OP*     fold_constants(OP * const o)
gen_constant_list

Compile-time expansion of a range list.

  e.g. 0..4 => 0,1,2,3,4

        OP*     gen_constant_list(OP* o)
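
As an illustration of the expansion only, the following sketch turns an inclusive integer range into a flat list. It is not how the op tree is rewritten: toy_expand_range is a hypothetical helper, it works on plain longs instead of ops, and it assumes small ranges where hi - lo cannot overflow.

```c
#include <stddef.h>
#include <stdlib.h>

/* Expand the inclusive range lo..hi into a heap-allocated array.
 * Stores the element count in *n_out; returns NULL for an empty
 * range or on allocation failure. The caller frees the result. */
static long *toy_expand_range(long lo, long hi, size_t *n_out)
{
    *n_out = 0;
    if (lo > hi)
        return NULL;                     /* empty range, like 4..0 */
    size_t n = (size_t)(hi - lo) + 1;
    long *list = malloc(n * sizeof *list);
    if (list == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        list[i] = lo + (long)i;          /* 0..4 yields 0,1,2,3,4 */
    *n_out = n;
    return list;
}
```
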
hasterm

Adds the field name, padoffset and the field index to the current class.

        OP*     hasterm(OP *o)
jmaybe

Join list by $;, \034. Adds $;, the $SUBSCRIPT_SEPARATOR before the op list, if there is a list.

If you refer to a hash element as $foo{$x,$y,$z} it really means $foo{join($;, $x, $y, $z)}

        OP*     jmaybe(OP *o)
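
The resulting hash key can be pictured with a plain string join. This is a minimal sketch assuming $; still has its default value "\034"; toy_join_subscripts and TOY_SUBSEP are hypothetical names, and the real function builds a join op at compile time instead of joining strings itself.

```c
#include <stddef.h>
#include <string.h>

#define TOY_SUBSEP '\034'   /* default value of $; */

/* Join n keys with the subscript separator into out.
 * Returns 0 on success, -1 if out is too small. */
static int toy_join_subscripts(const char *const *keys, size_t n,
                               char *out, size_t outsz)
{
    size_t used = 0;
    if (outsz == 0)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        size_t klen = strlen(keys[i]);
        size_t need = klen + (i ? 1 : 0);    /* key plus separator */
        if (used + need + 1 > outsz)         /* +1 for the trailing NUL */
            return -1;
        if (i)
            out[used++] = TOY_SUBSEP;
        memcpy(out + used, keys[i], klen);
        used += klen;
        out[used] = '\0';
    }
    return 0;
}
```

With keys "x", "y" and "z" the buffer ends up holding the same bytes as join($;, "x", "y", "z").
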
localize

lex:

    0 local
    1 my|our|state
    2 has

        OP*     localize(OP *o, I32 lex)

op_integerize

Change the optype to the integer variant, when use integer is in scope.

        OP*     op_integerize(OP *o)
op_std_init

Fixup all temp. pads: apply scalar context, and allocate missing targs.

        OP*     op_std_init(OP *o)

Custom Operators

core_prototype

This function assigns the prototype of the named core function to sv, or to a new mortal SV if sv is NULL. It returns the modified sv, or NULL if the core function has no prototype. code is a code as returned by keyword(). It must not be equal to 0.

        SV *    core_prototype(SV *sv, const char *name,
                               const int code,
                               int * const opnum)
coresub_op

Provide the coreargs arguments for &CORE::* subroutines, usually with matching ops. coreargssv is either the opnum (as UV) or the name (as PV) if no such op exists. code is the result of keyword(), and may be negative. See gv.c: S_maybe_add_coresub().

        OP *    coresub_op(SV *const coreargssv, const int code,
                           const int opnum)
report_redefined_cv

If a CV is overwritten, warn by whom when use warnings 'redefine' is in effect.

        void    report_redefined_cv(const SV *name,
                                    const CV *old_cv,
                                    SV * const *new_const_svp)

CV Manipulation Functions

check_type_and_open

Return NULL if the file doesn't exist or isn't a file; else return PerlIO_openn().

        PerlIO * check_type_and_open(SV *name)
create_eval_scope

NOTE: this function is experimental and may change or be removed without notice.

Common-ish code salvaged from Perl_call_sv and pp_entertry, because it was also needed by Perl_fold_constants.

        void    create_eval_scope(OP *retop, U32 flags)

delete_eval_scope

NOTE: this function is experimental and may change or be removed without notice.

Common code for Perl_call_sv and Perl_fold_constants, put here to keep it close to the related Perl_create_eval_scope.

        void    delete_eval_scope()

docatch

Check for the cases 0 or 3 of cur_env.je_ret, only used inside an eval context.

0 is used as continue inside eval,

3 is used for a die caught by an inner eval - continue inner loop

See cop.h: je_mustcatch, when set at any runlevel to TRUE, means eval ops must establish a local jmpenv to handle exception traps.

        OP*     docatch(Perl_ppaddr_t firstpp)
doeval_compile

Compile a require/do or an eval ''.

outside is the lexically enclosing CV (if any) that invoked us. seq is the current COP scope value. hh is the saved hints hash, if any.

Returns a bool indicating whether the compile was successful; if so, PL_eval_start contains the first op of the compiled code; otherwise, pushes undef.

This function is called from two places: pp_require and pp_entereval. These can be distinguished by whether PL_op is entereval.

        bool    doeval_compile(U8 gimme, CV* outside, U32 seq,
                               HV* hh)
doopen_pm

doopen_pm(): return the equivalent of PerlIO_openn() on the given name, but first check for bad names (\0) and non-files. Also if the filename ends in .pm and unless PERL_DISABLE_PMC, try loading Foo.pmc first.

        PerlIO * doopen_pm(SV *name, bool do_pmc)
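
The .pmc lookup amounts to deriving a second candidate file name. The sketch below covers only that step, under the assumption that the candidate is simply the name with a trailing "c"; toy_pmc_candidate is a hypothetical helper and does none of the bad-name or file-type checks.

```c
#include <stdbool.h>
#include <string.h>

/* If name ends in ".pm" and PMC loading is enabled, write the
 * candidate "<name>c" into out and return true; otherwise return
 * false and leave out untouched. */
static bool toy_pmc_candidate(const char *name, bool do_pmc,
                              char *out, size_t outsz)
{
    size_t len = strlen(name);
    if (!do_pmc || len < 3 || strcmp(name + len - 3, ".pm") != 0)
        return false;
    if (len + 2 > outsz)
        return false;          /* no room for the extra 'c' and the NUL */
    memcpy(out, name, len);
    out[len] = 'c';
    out[len + 1] = '\0';
    return true;
}
```
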
path_is_searchable

require doesn't search in @INC for absolute names, or when the name is explicitly relative to the current directory, i.e. ./ or ../

        bool    path_is_searchable(const char *name)
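
The rule can be written as a short predicate. This is a sketch for POSIX-style paths only; toy_path_is_searchable is a hypothetical name, and platform-specific notions of an absolute path (drive letters, backslashes) are deliberately left out.

```c
#include <stdbool.h>

/* Return true if require would search @INC for this name. */
static bool toy_path_is_searchable(const char *name)
{
    if (name[0] == '/')
        return false;                      /* absolute path */
    if (name[0] == '.' &&
        (name[1] == '/' || (name[1] == '.' && name[2] == '/')))
        return false;                      /* explicit ./ or ../ prefix */
    return true;                           /* e.g. Foo/Bar.pm */
}
```

Note that a leading dot alone does not disable the search: a name such as .hidden/Foo.pm is still looked up in @INC under this rule.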

require_file

Handle require Foo::Bar, require "Foo/Bar.pm" and do "Foo.pm".

The first form will have already been converted at compile time to the second form.

        OP *    require_file(SV *sv)

require_version

Implements 'require 5.010001', the version check part of require.

        OP *    require_version(SV *sv)

try_yyparse

Run yyparse() in a setjmp wrapper. Returns:

    0: yyparse() successful
    1: yyparse() failed
    3: yyparse() died

        int     try_yyparse(int gramtype)
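
The three return values can be reproduced with plain setjmp/longjmp. This is a toy model, not the JMPENV machinery perl uses: toy_yyparse, ToyOutcome and toy_env are hypothetical, and the parser is faked by a function that is told which outcome to produce.

```c
#include <setjmp.h>

static jmp_buf toy_env;

typedef enum { TOY_OK, TOY_SYNTAX_ERROR, TOY_DIE } ToyOutcome;

/* Hypothetical parser: returns 0 on success, nonzero on failure,
 * and models a die by jumping out with the value 3. */
static int toy_yyparse(ToyOutcome outcome)
{
    if (outcome == TOY_DIE)
        longjmp(toy_env, 3);
    return outcome == TOY_OK ? 0 : 1;
}

/* Run the parser inside a setjmp wrapper and map the result to 0, 1 or 3. */
static int toy_try_yyparse(ToyOutcome outcome)
{
    switch (setjmp(toy_env)) {
    case 0:                                  /* first pass: run the parser */
        return toy_yyparse(outcome) ? 1 : 0;
    case 3:                                  /* arrived here via longjmp: parser died */
        return 3;
    default:                                 /* not expected in this toy model */
        return -1;
    }
}
```

The point of the wrapper is that a die inside the parser unwinds to a known place, so the caller sees an ordinary return value in all three cases.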

CV reference counts and CvOUTSIDE

CvWEAKOUTSIDE

Each CV has a pointer, CvOUTSIDE(), to its lexically enclosing CV (if any). Because pointers to anonymous sub prototypes are stored in & pad slots, it is possible to get a circular reference, with the parent pointing to the child and vice-versa. To avoid the ensuing memory leak, we do not increment the reference count of the CV pointed to by CvOUTSIDE in the one specific instance that the parent has a & pad slot pointing back to us. In this case, we set the CvWEAKOUTSIDE flag in the child. This allows us to determine under what circumstances we should decrement the refcount of the parent when freeing the child.

There is a further complication with non-closure anonymous subs (i.e. those that do not refer to any lexicals outside that sub). In this case, the anonymous prototype is shared rather than being cloned. This has the consequence that the parent may be freed while there are still active children, e.g.,

    BEGIN { $a = sub { eval '$x' } }

In this case, the BEGIN is freed immediately after execution since there are no active references to it: the anon sub prototype has CvWEAKOUTSIDE set since it's not a closure, and $a points to the same CV, so it doesn't contribute to BEGIN's refcount either. When $a is executed, the eval '$x' causes the chain of CvOUTSIDEs to be followed, and the freed BEGIN is accessed.

To avoid this, whenever a CV and its associated pad is freed, any & entries in the pad are explicitly removed from the pad, and if the refcount of the pointed-to anon sub is still positive, then that child's CvOUTSIDE is set to point to its grandparent. This will only occur in the single specific case of a non-closure anon prototype having one or more active references (such as $a above).

One other thing to consider is that a CV may be merely undefined rather than freed, e.g. undef &foo. In this case, its refcount may not have reached zero, but we still delete its pad and its CvROOT etc. Since various children may still have their CvOUTSIDE pointing at this undefined CV, we keep its own CvOUTSIDE for the time being, so that the chain of lexical scopes is unbroken. For example, the following should print 123:

    my $x = 123;
    sub tmp { sub { eval '$x' } }
    my $a = tmp();
    undef &tmp;
    print  $a->();

        bool    CvWEAKOUTSIDE(CV *cv)
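
The basic bookkeeping rule (count the CvOUTSIDE reference unless the parent already holds the child in a & pad slot) can be sketched as follows. ToyCV and both helpers are hypothetical; the model ignores pads, cloning and the re-pointing to the grandparent described above.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyCV {
    int           refcnt;
    struct ToyCV *outside;       /* models CvOUTSIDE     */
    bool          weakoutside;   /* models CvWEAKOUTSIDE */
} ToyCV;

/* Point child at its enclosing CV. The reference is counted unless
 * the parent's pad already points back at the child. */
static void toy_set_outside(ToyCV *child, ToyCV *parent,
                            bool parent_pad_holds_child)
{
    child->outside     = parent;
    child->weakoutside = parent_pad_holds_child;
    if (parent != NULL && !child->weakoutside)
        parent->refcnt++;
}

/* Drop the link when the child is freed; only a counted (non-weak)
 * link gives its reference back. */
static void toy_clear_outside(ToyCV *child)
{
    if (child->outside != NULL && !child->weakoutside)
        child->outside->refcnt--;
    child->outside     = NULL;
    child->weakoutside = false;
}
```

Because the weak link never touched the parent's count, freeing the anonymous child leaves the count unchanged, which is what breaks the parent/child cycle.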

Debugging Utilities

append_gv_name

Append to the out SV the name of the gv.

        void    append_gv_name(GV *gv, SV *out)
append_padvar

Append to the out SV, the names of the n lexicals starting at offset off in the CV * cv.

        void    append_padvar(PADOFFSET off, CV *cv, SV *out,
                              int n, bool paren,
                              char force_sigil)
av_dump

Dump all the av values. sv_dump dumps only a limited number of elements.

Only available with -DDEBUGGING.

        void    av_dump(SV* av)
cop_dump

Dumps a COP, even when it is deleted. Esp. useful for lexical hints in PL_curcop.

With DEBUGGING only.

        void    cop_dump(const OP *o)

deb_hechain

NOTE: this function is experimental and may change or be removed without notice.

Print the HE chain.

Only available with -DDEBUGGING.

        void    deb_hechain(HE* entry)
deb_hek

NOTE: this function is experimental and may change or be removed without notice.

Print the HEK key and value, along with the hash and flags.

Only available with -DDEBUGGING.

        void    deb_hek(HEK* hek, SV* val)
deb_padvar

Print the names of the n lexical vars starting at pad offset off.

        void    deb_padvar(PADOFFSET off, int n, bool paren)
hv_dump

Dump all the hv keys and optionally values. sv_dump dumps only a limited number of keys.

Only available with -DDEBUGGING.

        void    hv_dump(SV* sv, bool with_values)
multiconcat_stringify

Return a temporary SV containing a stringified representation of the op_aux field of a MULTICONCAT op. Note that if the aux contains both plain and utf8 versions of the const string and indices, only the first is displayed.

        SV*     multiconcat_stringify(const OP* o)
multideref_stringify

Return a temporary SV containing a stringified representation of the op_aux field of a MULTIDEREF op, associated with CV cv

        SV*     multideref_stringify(const OP* o, CV *cv)
sequence_num

Return a unique integer to represent the address of op o. If it already exists in PL_op_sequence, just return it; otherwise add it.

 *** Note that this isn't thread-safe.
        UV      sequence_num(const OP *o)
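
A lookup-or-add table is enough to show the idea. This sketch uses a fixed-size array where the description names PL_op_sequence; toy_seen, toy_count and TOY_MAX_OPS are hypothetical, and returning 0 for a NULL pointer or a full table is an assumption of the model.

```c
#include <stddef.h>

#define TOY_MAX_OPS 64

static const void *toy_seen[TOY_MAX_OPS];   /* stands in for PL_op_sequence */
static size_t      toy_count;

/* Return a stable 1-based number for addr, adding it on first sight. */
static unsigned long toy_sequence_num(const void *addr)
{
    if (addr == NULL)
        return 0;
    for (size_t i = 0; i < toy_count; i++)
        if (toy_seen[i] == addr)
            return (unsigned long)i + 1;     /* already known */
    if (toy_count == TOY_MAX_OPS)
        return 0;                            /* table full in this toy */
    toy_seen[toy_count] = addr;
    return (unsigned long)++toy_count;       /* newly assigned number */
}
```

The shared static table is also why such a scheme is not thread-safe: two threads adding entries at once could hand out the same number.
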
signature_stringify

Return a temporary SV containing a stringified representation of the op_aux field of a SIGNATURE op, associated with CV cv.

        SV*     signature_stringify(const OP* o, CV *cv)

Embedding Functions

cv_clone_padname0

NOTE: this function is experimental and may change or be removed without notice.

Clones the first PADNAME slot, just references the other. Called by "cv_type_set" and "newATTRSUB_x" for an extern sub, to set its private return type.

        PADNAMELIST * cv_clone_padname0(CV *cv,
                                        PADNAMELIST *pnl)
cv_dump

Dump the contents of a CV.

        void    cv_dump(CV *cv, const char *title)
cv_forget_slab

When a CV has a reference count on its slab (CvSLABBED), it is responsible for making sure it is freed. (Hence, no two CVs should ever have a reference count on the same slab.) The CV only needs to reference the slab during compilation. Once it is compiled and CvROOT attached, it has finished its job, so it can forget the slab.

        void    cv_forget_slab(CV *cv)
do_dump_pad

Dump the contents of a padlist.

        void    do_dump_pad(I32 level, PerlIO *file,
                            PADLIST *padlist, int full)
open_script
        PerlIO * open_script(const char *scriptname,
                             bool dosearch, bool *suidscript)
pad_alloc_name

Allocates a place in the currently-compiling pad (via "pad_alloc" in perlapi) and then stores a name for that entry. name is adopted and becomes the name entry; it must already contain the name string. typestash and ourstash and the padadd_STATE flag get added to name. None of the other processing of "pad_add_name_pvn" in perlapi is done. Returns the offset of the allocated pad slot.

        PADOFFSET pad_alloc_name(PADNAME *name, U32 flags,
                                 HV *typestash, HV *ourstash)
pad_block_start

Update the pad compilation state variables on entry to a new block.

        void    pad_block_start(int full)
pad_check_dup

Check for shadow warnings, duplicate declarations. Report any of:

     * a 'my' in the current scope with the same name;
     * an 'our' (anywhere in the pad) with the same name and the
       same stash as 'ourstash'

is_our indicates that the name to check is an "our" declaration.

        void    pad_check_dup(PADNAME *name, U32 flags,
                              const HV *ourstash)
pad_findlex

Find a named lexical anywhere in a chain of nested pads. Add fake entries in the inner pads if it's found in an outer one.

Returns the offset in the bottom pad of the lex or the fake lex. cv is the CV in which to start the search, and seq is the current cop_seq to match against. If warn is true, print appropriate warnings. The out_* vars return values, and so are pointers to where the returned values should be stored. out_capture, if non-null, requests that the innermost instance of the lexical is captured; out_name is set to the innermost matched pad name or fake pad name; out_flags returns the flags normally associated with the PARENT_FAKELEX_FLAGS field of a fake pad name.

Note that pad_findlex() is recursive; it recurses up the chain of CVs, then comes back down, adding fake entries as it goes. It has to be this way because fake names in anon prototypes have to store in xpadn_low the index into the parent pad.

With cperl all PADs are UTF8 so the flags argument must be either 0 or padadd_STALEOK.

        PADOFFSET pad_findlex(const char *namepv,
                              STRLEN namelen, U32 flags,
                              const CV* cv, U32 seq, int warn,
                              SV** out_capture,
                              PADNAME** out_name,
                              int *out_flags)
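
The recursion described above (search upwards, then add fake entries on the way back down) can be sketched with a much simplified data structure. Everything here is hypothetical: struct toy_pad, toy_findlex and the fixed-size arrays are illustrative only, and the sketch ignores cop_seq ranges, captures, warnings and flags.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAD_MAX   8
#define TOY_NOT_FOUND (-1)

/* Hypothetical, simplified pad: a list of names plus a link to the
   enclosing scope. A fake entry records the index in the parent pad. */
struct toy_pad {
    struct toy_pad *outer;                  /* enclosing scope, or NULL */
    const char     *names[TOY_PAD_MAX];     /* not copied; caller owns them */
    int             is_fake[TOY_PAD_MAX];
    int             parent_index[TOY_PAD_MAX];
    int             count;
};

/* Look name up in pad, then recursively in the outer pads. On the way back
   down, add a fake entry to each inner pad so that later lookups succeed
   locally. Returns the index in pad, or TOY_NOT_FOUND. */
int toy_findlex(struct toy_pad *pad, const char *name)
{
    int i, outer_index;

    assert(name != NULL);
    if (pad == NULL)
        return TOY_NOT_FOUND;
    for (i = 0; i < pad->count; i++) {
        if (strcmp(pad->names[i], name) == 0)
            return i;                       /* real or existing fake entry */
    }
    outer_index = toy_findlex(pad->outer, name);    /* recurse upwards */
    if (outer_index == TOY_NOT_FOUND || pad->count == TOY_PAD_MAX)
        return TOY_NOT_FOUND;               /* sketch: a full pad also fails */
    i = pad->count++;                       /* coming back down: add a fake */
    pad->names[i]        = name;
    pad->is_fake[i]      = 1;
    pad->parent_index[i] = outer_index;
    return i;
}
```

Because each fake entry stores the index into its immediate parent, the chain of fakes can only be built after the outer lookup has returned, which is why the real routine is recursive rather than a simple loop.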
pad_find_outeroffset

Search for the real pad offset in one of the outer CVs for the fake pad entry in the current CV, usually the PL_compcv. See "pad_findmy_real" in perlapi.

        PADOFFSET pad_find_outeroffset(PADNAME *pn, CV* cv)
pad_fixup_inner_anons

For any anon CVs in the pad, change CvOUTSIDE of that CV from old_cv to new_cv if necessary. Needed when a newly-compiled CV has to be moved to a pre-existing CV struct.

        void    pad_fixup_inner_anons(PADLIST *padlist,
                                      CV *old_cv, CV *new_cv)
pad_free

Free the SV at offset po in the current pad.

        void    pad_free(PADOFFSET po)
pad_leavemy

Cleanup at end of scope during compilation: set the max seq number for lexicals in this scope and warn of any lexicals that never got introduced.

        void    pad_leavemy()
padlist_dup

Duplicates a pad.

        PADLIST * padlist_dup(PADLIST *srcpad,
                              CLONE_PARAMS *param)
padname_dup

Duplicates a pad name.

        PADNAME * padname_dup(PADNAME *src, CLONE_PARAMS *param)
padnamelist_dup

Duplicates a pad name list.

        PADNAMELIST * padnamelist_dup(PADNAMELIST *srcpad,
                                      CLONE_PARAMS *param)
pad_push

Push a new pad frame onto the padlist, unless there's already a pad at this depth, in which case don't bother creating a new one. Then give the new pad an @_ in slot zero.

        void    pad_push(PADLIST *padlist, int depth)
pad_reset

Mark all the current temporaries for reuse.

        void    pad_reset()
pad_swipe

Abandon the tmp in the current pad at offset po and replace with a new one.

        void    pad_swipe(PADOFFSET po, bool refadjust)

Functions in file inline.h

strip_spaces

CvPROTO returns the prototype as stored, which is not necessarily what the interpreter should be using. Specifically, the interpreter assumes that spaces have been stripped, which has been the case if the prototype was added by toke.c, but is generally not the case if it was added elsewhere.

Since we can't enforce the spacelessness at assignment time, this routine provides a temporary copy of the string at parse time with spaces removed. orig is the start of the original string, len is the length of the string and will be updated when this returns.

        char*   strip_spaces(const char * orig,
                             STRLEN * const len)
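
A minimal standalone sketch of the same idea follows. It assumes plain malloc() in place of the interpreter's temporary SV, so the caller must free() the result; the name toy_strip_spaces is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly malloc()ed, NUL-terminated copy of the first *len bytes of
   orig with all whitespace removed, and store the new length in *len.
   Returns NULL (leaving *len untouched) if allocation fails.
   Unlike the interpreter's routine, the caller must free() the result. */
char *toy_strip_spaces(const char *orig, size_t *len)
{
    char  *copy, *out;
    size_t i;

    assert(orig != NULL && len != NULL);
    copy = malloc(*len + 1);
    if (copy == NULL)
        return NULL;
    out = copy;
    for (i = 0; i < *len; i++) {
        if (!isspace((unsigned char)orig[i]))
            *out++ = orig[i];       /* keep every non-space byte, in order */
    }
    *out = '\0';
    *len = (size_t)(out - copy);    /* report the stripped length */
    return copy;
}
```

The original string is left untouched, matching the description above: only the copy and the length seen by the caller change.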

Functions in file op.c

new_slab

Creates a new memory region, a slab, for ops, with room for sz pointers. sz starts with PERL_SLAB_SIZE (=64) and is then extended by a factor of two in Slab_Alloc().

                new_slab;
no_fh_allowed

Throws the parser error "Missing comma after first argument to %s function" for an op which does not take an optional comma-less filehandle argument, i.e. not print $fh arg, but rather call $fh, $arg.

                no_fh_allowed;
op_next_nn

Returns the next non-NULL op, skipping all NULL ops in the chain.

        OP*     op_next_nn
op_prevstart_nn

Returns the previous op, pointing via OpNEXT to us. Walks down the CvSTART until it finds us.

        OP*     op_prevstart_nn
op_prev_nn

Returns the previous sibling or parent op, pointing via OpSIBLING or OpFIRST to us. Walks the siblings up to the parent, and then descends again to the kids until it finds us.

        OP*     op_prev_nn
opslab_force_free

Forcefully frees the slab area, even if there are still live OPs in it. Frees all the containing OPs. Like opslab_free(), but first calls op_free() on any ops in the slab not marked as OP_FREED.

NOTE: the perl_ form of this function is deprecated.

        void    opslab_force_free(OPSLAB *slab)
opslab_free

Free a chain of OP slabs. Should only be called after all ops contained in it have been freed. At this point, its reference count should be 1, because OpslabREFCNT_dec() skips doing rc-- when it detects that rc == 1, and just directly calls opslab_free(). (Note that the reference count which PL_compcv held on the slab should have been removed once compilation of the sub was complete).

NOTE: the perl_ form of this function is deprecated.

        void    opslab_free(OPSLAB *slab)
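
The reference-count convention described above (the last reference is never decremented; the slab is freed while its count is still 1) can be sketched as follows. The struct and function names are hypothetical, and the counter toy_slabs_freed exists only so the behaviour can be observed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, minimal slab header: only the reference count matters. */
struct toy_slab {
    size_t refcnt;
};

static int toy_slabs_freed = 0;     /* bookkeeping for the example only */

/* Allocate a slab that starts with a single reference. */
struct toy_slab *toy_slab_new(void)
{
    struct toy_slab *slab = malloc(sizeof *slab);

    if (slab != NULL)
        slab->refcnt = 1;
    return slab;
}

/* Free the slab. By the time this runs the count is expected to be 1. */
void toy_slab_free(struct toy_slab *slab)
{
    assert(slab != NULL && slab->refcnt == 1);
    free(slab);
    toy_slabs_freed++;
}

/* Drop one reference. When the caller holds the last reference, skip the
   decrement and free directly, so the slab is freed with refcnt == 1. */
void toy_slab_refcnt_dec(struct toy_slab *slab)
{
    assert(slab != NULL && slab->refcnt >= 1);
    if (slab->refcnt == 1)
        toy_slab_free(slab);
    else
        slab->refcnt--;
}
```

With this convention a count of 1 inside the free routine is the normal case, not a sign of a leaked reference.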
opslab_free_nopad

Frees the slab area, with PL_comppad temporarily disabled.

NOTE: the perl_ form of this function is deprecated.

        void    opslab_free_nopad(OPSLAB *slab)
prune_chain_head

Remove any leading "empty" ops from the op_next chain whose first node's address is stored in op_p. Store the updated address of the first node in op_p.

                prune_chain_head;
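
The pointer-to-pointer update can be sketched on a toy op chain. The types and names below are hypothetical, and which op types count as "empty" is an assumption of the sketch (here only TOY_OP_NULL).

```c
#include <assert.h>
#include <stddef.h>

enum toy_optype { TOY_OP_NULL = 0, TOY_OP_NEXTSTATE, TOY_OP_CONST };

/* Hypothetical, minimal op: a type and an execution-order link. */
struct toy_op {
    enum toy_optype type;
    struct toy_op  *next;
};

/* Which ops count as "empty" is an assumption of this sketch. */
static int toy_op_is_empty(const struct toy_op *o)
{
    return o->type == TOY_OP_NULL;
}

/* Advance *op_p past any leading empty ops, storing the new head back
   through the pointer. The skipped ops are neither freed nor unlinked. */
void toy_prune_chain_head(struct toy_op **op_p)
{
    assert(op_p != NULL);
    while (*op_p != NULL && toy_op_is_empty(*op_p))
        *op_p = (*op_p)->next;
}
```

Taking the address of the head pointer lets the caller's own variable (or struct field) be updated in place, so no return value is needed.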
Slab_Alloc

Creates a new memory region, a slab, for some ops, with room for sz pointers. sz starts with PERL_SLAB_SIZE (=64) and is then extended by a factor of two. If PL_compcv isn't compiling, malloc() instead.

NOTE: the perl_ form of this function is deprecated.

        void*   Slab_Alloc(size_t sz)
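
The doubling strategy can be sketched with a toy slab chain. This is an illustration under stated assumptions, not the allocator in op.c: the names are hypothetical, the cap TOY_MAX_SLAB_SIZE is invented for the sketch, and freed-op reuse and the malloc() fallback are left out.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_SLAB_SIZE     64    /* initial size in pointers, as in the text */
#define TOY_MAX_SLAB_SIZE 2048  /* hypothetical cap, not taken from the text */

struct toy_slab {
    struct toy_slab *next;      /* older, smaller slab in the chain */
    size_t           size;      /* capacity in pointer-sized slots */
    size_t           used;      /* slots handed out so far */
    void           **slots;
};

/* Create a slab with room for sz pointers and link it in front of prev. */
struct toy_slab *toy_new_slab(size_t sz, struct toy_slab *prev)
{
    struct toy_slab *slab = malloc(sizeof *slab);

    if (slab == NULL)
        return NULL;
    slab->slots = calloc(sz, sizeof *slab->slots);
    if (slab->slots == NULL) {
        free(slab);
        return NULL;
    }
    slab->next = prev;
    slab->size = sz;
    slab->used = 0;
    return slab;
}

/* Hand out n slots from the newest slab. If they do not fit, add a slab
   twice as large as the newest one (capped) and allocate from that. */
void **toy_slab_alloc(struct toy_slab **head, size_t n)
{
    struct toy_slab *slab;

    assert(head != NULL);
    slab = *head;
    if (slab == NULL || slab->size - slab->used < n) {
        size_t sz = slab ? slab->size * 2 : TOY_SLAB_SIZE;

        if (sz > TOY_MAX_SLAB_SIZE)
            sz = TOY_MAX_SLAB_SIZE;
        if (n > sz)
            return NULL;        /* request larger than any slab */
        slab = toy_new_slab(sz, *head);
        if (slab == NULL)
            return NULL;
        *head = slab;
    }
    slab->used += n;
    return slab->slots + (slab->used - n);
}

/* Release every slab in the chain. */
void toy_slab_free_all(struct toy_slab *head)
{
    while (head != NULL) {
        struct toy_slab *next = head->next;

        free(head->slots);
        free(head);
        head = next;
    }
}
```

Doubling keeps the number of slabs logarithmic in the number of ops, so small subs pay for one small slab while large ones do not end up with a long chain.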
Slab_Free

Free memory for the slabbed op.

NOTE: the perl_ form of this function is deprecated.

        void    Slab_Free(void *op)
typename

Returns the sanitized typename of the stash of the padname type, without main:: prefix.

                typename;

Functions in file pp.c

softref2xv

Helper function for pp_rv2sv and pp_rv2av. Interns the string to a symbol.

Optionally also consumes the local stackptr pointing to the sv (*spp[0] == sv) to use a subsequent LVAL NULL kid in the calling rv2sv.

Note: sv might change. Get it with GvSV of the result afterwards.

        GV*     softref2xv(SV *sv, const char *const what,
                           const svtype type, SV ***spp)

Functions in file regcomp.c

regprop

Printable representation of opcode, with run-time support.

        void    regprop(const regexp *prog, SV* sv,
                        const regnode* o,
                        const regmatch_info *reginfo,
                        const RExC_state_t *pRExC_state)

reg_temp_copy

Copy ssv to dsv, both of which should be of type SVt_REGEXP or SVt_PVLV, except that dsv will be created if NULL.

This function is used in two main ways. First to implement

    $r = qr/.../; $s = $$r;

Secondly, it is used as a hacky workaround to the structural issue of match results being stored in the regexp structure which is in turn stored in PL_curpm/PL_reg_curpm. The problem is that due to qr// the pattern could be PL_curpm in multiple contexts, and could require multiple result sets being associated with the pattern simultaneously, such as when doing a recursive match with (??{$qr}).

The solution is to make a lightweight copy of the regexp structure when a qr// is returned from the code executed by (??{$qr}). This lightweight copy doesn't actually own any of its data except for the start/end offsets and the actual regexp structure itself.

NOTE: the perl_ form of this function is deprecated.

        REGEXP* reg_temp_copy(REGEXP* dsv, REGEXP* ssv)

GV Functions

gv_magicalize

gv_magicalize() is called by gv_fetchpvn_flags when creating a new GV; gv is NN. Note that it does not insert the GV into the stash prior to magicalization, which some variables need in order to work (like %+, %-, %!), so callers must take care of that.

It returns true if the gv did turn out to be a magical one; i.e., if gv_magicalize actually did something.

        bool    gv_magicalize(GV *gv, HV *stash,
                              const char *name, STRLEN len,
                              const svtype sv_type)

PERL_STATIC_INLINE bool
S_gv_magicalize(pTHX_ GV *gv, HV *stash, const char *name, STRLEN len,
                      const svtype sv_type)
{
    I32 paren;

    PERL_ARGS_ASSERT_GV_MAGICALIZE;
    
    if (stash != PL_defstash) { /* not the main stash */
        /* We only have to check for a few names here: a, b, EXPORT*, ISA
           and VERSION. All the others apply only to the main stash or to
           CORE (which is checked right after this). */
        if (len) {
            switch (*name) {
            case 'E':
                if (
                    len >= 6 && name[1] == 'X' &&
                    (memEQs(name, len, "EXPORT")
                    ||memEQs(name, len, "EXPORT_OK")
                    ||memEQs(name, len, "EXPORT_FAIL")
                    ||memEQs(name, len, "EXPORT_TAGS"))
                )
                    GvMULTI_on(gv);
                break;
            case 'I':
                if (memEQs(name, len, "ISA"))
                    gv_magicalize_isa(gv);
                break;
            case 'V':
                if (memEQs(name, len, "VERSION"))
                    GvMULTI_on(gv);
                break;
            case 'a':
                if (stash == PL_debstash && len==4 && memEQc(name,"args")) {
                    GvMULTI_on(gv_AVadd(gv));
                    break;
                }
                /* FALLTHROUGH for a */
            case 'b':
                if (len == 1 && sv_type == SVt_PV)
                    GvMULTI_on(gv);
                /* FALLTHROUGH */
            default:
                goto try_core;
            }
            goto ret;
        }
      try_core:
        if (len > 1 /* shortest is uc */ && HvNAMELEN_get(stash) == 4) {
            /* Avoid null warning: */
            const char * const stashname = HvNAME(stash); assert(stashname);
            if (memEQc(stashname, "CORE"))
                S_maybe_add_coresub(aTHX_ 0, gv, name, len);
        }
    }
    else if (len > 1) {
#ifndef EBCDIC
        if (*name > 'V' ) {
            NOOP;
            /* Nothing else to do.
               The compiler will probably turn the switch statement into a
               branch table. Make sure we avoid even that small overhead for
               the common case of lower case variable names.  (On EBCDIC
               platforms, we can't just do:
                 if (NATIVE_TO_ASCII(*name) > NATIVE_TO_ASCII('V') ) {
               because cases like '\027' in the switch statement below are
               C1 (non-ASCII) controls on those platforms, so the remapping
               would make them larger than 'V')
             */
        } else
#endif
        {
            switch (*name) {
            case 'A':
                if (strEQc(name, "ARGV"))
                    IoFLAGS(GvIOn_NN(gv)) |= IOf_ARGV|IOf_START;
                else if (strEQc(name, "ARGVOUT"))
                    GvMULTI_on(gv);
                break;
            case 'E':
                if (
                    len >= 6 && name[1] == 'X' &&
                    (memEQs(name, len, "EXPORT")
                    ||memEQs(name, len, "EXPORT_OK")
                    ||memEQs(name, len, "EXPORT_FAIL")
                    ||memEQs(name, len, "EXPORT_TAGS"))
                )
                    GvMULTI_on(gv);
                break;
            case 'I':
                if (strEQc(name, "ISA")) {
                    gv_magicalize_isa(gv);
                }
                break;
            case 'S':
                if (strEQc(name, "SIG")) {
                    HV *hv;
                    I32 i;
                    if (!PL_psig_name) {
                        Newxz(PL_psig_name, 2 * SIG_SIZE, SV*);
                        Newxz(PL_psig_pend, SIG_SIZE, int);
                        PL_psig_ptr = PL_psig_name + SIG_SIZE;
                    } else {
                        /* I think that the only way to get here is to re-use an
                           embedded perl interpreter, where the previous
                           use didn't clean up fully because
                           PL_perl_destruct_level was 0. I'm not sure that we
                           "support" that, in that I suspect in that scenario
                           there are sufficient other garbage values left in the
                           interpreter structure that something else will crash
                           before we get here. I suspect that this is one of
                           those "doctor, it hurts when I do this" bugs.  */
                        Zero(PL_psig_name, 2 * SIG_SIZE, SV*);
                        Zero(PL_psig_pend, SIG_SIZE, int);
                    }
                    GvMULTI_on(gv);
                    hv = GvHVn(gv);
                    hv_magic(hv, NULL, PERL_MAGIC_sig);
                    for (i = 1; i < SIG_SIZE; i++) {
                        SV * const * const init
                            = hv_fetch_ifexists(hv, PL_sig_name[i], strlen(PL_sig_name[i]), 1);
                        if (init)
                            sv_setsv(*init, UNDEF);
                    }
                }
                break;
            case 'V':
                if (strEQc(name, "VERSION"))
                    GvMULTI_on(gv);
                break;
            case '\003':        /* $^CHILD_ERROR_NATIVE */
                if (strEQc(name, "\003HILD_ERROR_NATIVE"))
                    goto magicalize;
                                /* @{^CAPTURE} %{^CAPTURE} */
                if (memEQs(name, len, "\003APTURE")) {
                    AV* const av = GvAVn(gv);
                    const Size_t n = *name;

                    sv_magic(MUTABLE_SV(av), (SV*)n, PERL_MAGIC_regdata, NULL, 0);
                    SvREADONLY_on(av);

                    if (sv_type == SVt_PVHV || sv_type == SVt_PVGV)
                        require_tie_mod_s(gv, '-', "Tie::Hash::NamedCapture",0);

                } else          /* %{^CAPTURE_ALL} */
                if (memEQs(name, len, "\003APTURE_ALL")) {
                    if (sv_type == SVt_PVHV || sv_type == SVt_PVGV)
                        require_tie_mod_s(gv, '+', "Tie::Hash::NamedCapture",0);
                }
                break;
            case '\005':        /* $^ENCODING */
                if (strEQc(name, "\005NCODING"))
                    goto magicalize;
                if (strEQc(name, "\005_NCODING"))
                    goto magicalize;
                break;
            case '\007':        /* $^GLOBAL_PHASE */
                if (strEQc(name, "\007LOBAL_PHASE"))
                    goto ro_magicalize;
                break;
            case '\014':        /* $^LAST_FH */
                if (strEQc(name, "\014AST_FH"))
                    goto ro_magicalize;
                break;
            case '\015':        /* $^MATCH */
                if (strEQc(name, "\015ATCH")) {
                    paren = RX_BUFF_IDX_CARET_FULLMATCH;
                    goto storeparen;
                }
                break;
            case '\017':        /* $^OPEN */
                if (strEQc(name, "\017PEN"))
                    goto magicalize;
                break;
            case '\020':        /* $^PREMATCH  $^POSTMATCH */
                if (strEQc(name, "\020REMATCH")) {
                    paren = RX_BUFF_IDX_CARET_PREMATCH;
                    goto storeparen;
                }
                if (strEQc(name, "\020OSTMATCH")) {
                    paren = RX_BUFF_IDX_CARET_POSTMATCH;
                    goto storeparen;
                }
                break;
            case '\023':
                if (memEQs(name, len, "\023AFE_LOCALES"))
                    goto ro_magicalize;
                break;
            case '\024':        /* ${^TAINT} */
                if (strEQc(name, "\024AINT"))
                    goto ro_magicalize;
                break;
            case '\025':        /* ${^UNICODE}, ${^UTF8LOCALE} */
                if (strEQc(name, "\025NICODE"))
                    goto ro_magicalize;
                if (strEQc(name, "\025TF8LOCALE"))
                    goto ro_magicalize;
                if (strEQc(name, "\025TF8CACHE"))
                    goto magicalize;
                break;
            case '\027':        /* $^WARNING_BITS */
                if (strEQc(name, "\027ARNING_BITS"))
                    goto magicalize;
#ifdef WIN32
                else if (strEQc(name, "\027IN32_SLOPPY_STAT"))
                    goto magicalize;
#endif
                break;
            case '1':
            case '2':
            case '3':
            case '4':
            case '5':
            case '6':
            case '7':
            case '8':
            case '9':
                {
                    /* Ensures that we have an all-digit variable, ${"1foo"} fails
                       this test  */
                    UV uv;
                    if (!grok_atoUV(name, &uv, NULL))
                        goto ret;
                    if (UNLIKELY(uv > I32_MAX))
                        Perl_croak(aTHX_ "panic: gv name too long (%" UVuf ")", uv);
                    paren = (I32)uv;
                    goto storeparen;
                }
            }
        }
    } else {
        /* Names of length 1.  (Or 0. But name is NUL terminated, so that will
           be case '\0' in this switch statement (ie a default case)  */
        switch (*name) {
        case '&':               /* $& */
            paren = RX_BUFF_IDX_FULLMATCH;
            goto sawampersand;
        case '`':               /* $` */
            paren = RX_BUFF_IDX_PREMATCH;
            goto sawampersand;
        case '\'':              /* $' */
            paren = RX_BUFF_IDX_POSTMATCH;
        sawampersand:
#ifdef PERL_SAWAMPERSAND
            if (!(
                sv_type == SVt_PVAV ||
                sv_type == SVt_PVHV ||
                sv_type == SVt_PVCV ||
                sv_type == SVt_PVFM ||
                sv_type == SVt_PVIO
                )) { PL_sawampersand |=
                        (*name == '`')
                            ? SAWAMPERSAND_LEFT
                            : (*name == '&')
                                ? SAWAMPERSAND_MIDDLE
                                : SAWAMPERSAND_RIGHT;
                }
#endif
            goto storeparen;
        case '1':               /* $1 */
        case '2':               /* $2 */
        case '3':               /* $3 */
        case '4':               /* $4 */
        case '5':               /* $5 */
        case '6':               /* $6 */
        case '7':               /* $7 */
        case '8':               /* $8 */
        case '9':               /* $9 */
            paren = *name - '0';

        storeparen:
            /* Flag the capture variables with a NULL mg_ptr
               Use mg_len for the array index to lookup.  */
            sv_magic(GvSVn(gv), MUTABLE_SV(gv), PERL_MAGIC_sv, NULL, paren);
            break;

        case ':':               /* $: */
            sv_setpv(GvSVn(gv),PL_chopset);
            goto magicalize;

        case '?':               /* $? */
#ifdef COMPLEX_STATUS
            SvUPGRADE(GvSVn(gv), SVt_PVLV);
#endif
            goto magicalize;

        case '!':               /* $! */
            GvMULTI_on(gv);
            /* If %! has been used, automatically load Errno.pm. */

            sv_magic(GvSVn(gv), MUTABLE_SV(gv), PERL_MAGIC_sv, name, len);

            /* magicalization must be done before require_tie_mod_s is called */
            if (sv_type == SVt_PVHV || sv_type == SVt_PVGV)
                require_tie_mod_s(gv, '!', "Errno", 1);

            break;
        case '-':               /* $-, %-, @- */
        case '+':               /* $+, %+, @+ */
            GvMULTI_on(gv); /* no used once warnings here */
            {   /* $- $+ */
                sv_magic(GvSVn(gv), MUTABLE_SV(gv), PERL_MAGIC_sv, name, len);
                if (*name == '+')
                    SvREADONLY_on(GvSVn(gv));
            }
            {   /* %- %+ */
                if (sv_type == SVt_PVHV || sv_type == SVt_PVGV)
                    require_tie_mod_s(gv, *name, "Tie::Hash::NamedCapture",0);
            }
            {   /* @- @+ */
                AV* const av = GvAVn(gv);
                const Size_t n = *name;

                sv_magic(MUTABLE_SV(av), (SV*)n, PERL_MAGIC_regdata, NULL, 0);
                SvREADONLY_on(av);
            }
            break;
        case '*':               /* $* */
        case '#':               /* $# */
            if (sv_type == SVt_PV)
                /* diag_listed_as: $* is no longer supported as of Perl 5.30 */
                Perl_croak(aTHX_ "$%c is no longer supported as of Perl 5.30", *name);
            break;
        case '\010':    /* $^H */
            {
                HV *const hv = GvHVn(gv);
                hv_magic(hv, NULL, PERL_MAGIC_hints);
            }
            goto magicalize;
        case '\023':    /* $^S */
        ro_magicalize:
            SvREADONLY_on(GvSVn(gv));
            /* FALLTHROUGH */
        case '0':               /* $0 */
        case '^':               /* $^ */
        case '~':               /* $~ */
        case '=':               /* $= */
        case '%':               /* $% */
        case '.':               /* $. */
        case '(':               /* $( */
        case ')':               /* $) */
        case '<':               /* $< */
        case '>':               /* $> */
        case '\\':              /* $\ */
        case '/':               /* $/ */
        case '|':               /* $| */
        case '$':               /* $$ */
        case '[':               /* $[ */
        case '\001':    /* $^A */
        case '\003':    /* $^C */
        case '\004':    /* $^D */
        case '\005':    /* $^E */
        case '\006':    /* $^F */
        case '\011':    /* $^I, NOT \t in EBCDIC */
        case '\016':    /* $^N */
        case '\017':    /* $^O */
        case '\020':    /* $^P */
        case '\024':    /* $^T */
        case '\027':    /* $^W */
        magicalize:
            sv_magic(GvSVn(gv), MUTABLE_SV(gv), PERL_MAGIC_sv, name, len);
            break;

        case '\014':    /* $^L */
            sv_setpvs(GvSVn(gv),"\f");
            break;
        case ';':               /* $; */
            sv_setpvs(GvSVn(gv),"\034");
            break;
        case ']':               /* $] */
        {
            SV * const sv = GvSV(gv);
            if (!sv_derived_from(PL_patchlevel, "version"))
                upg_version(PL_patchlevel, TRUE);
            GvSV(gv) = vnumify(PL_patchlevel);
            SvREADONLY_on(GvSV(gv));
            SvREFCNT_dec(sv);
        }
        break;
        case '\026':    /* $^V */
        {
            SV * const sv = GvSV(gv);
            GvSV(gv) = new_version(PL_patchlevel);
            SvREADONLY_on(GvSV(gv));
            SvREFCNT_dec(sv);
        }
        break;
        case 'a':
        case 'b':
            if (sv_type == SVt_PV)
                GvMULTI_on(gv);
        }
    }

   ret:
    /* Return true if we actually did something.  */
    {
        const GP* const gp = GvGP(gv);
        return gp->gp_av || gp->gp_hv || gp->gp_io || gp->gp_cv
            || ( gp->gp_sv && (SvOK(gp->gp_sv) || SvMAGICAL(gp->gp_sv)));
    }
}
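
The all-digit check used for multi-character names such as ${12} in the function above can be sketched in isolation. This is a simplified illustration, not grok_atoUV: the function name is hypothetical, INT_MAX stands in for I32_MAX, and an out-of-range value is reported as a plain failure, whereas the listing croaks.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Parse name as a capture-group number. Succeeds only if name starts with
   '1'..'9' (mirroring the case labels), consists solely of decimal digits,
   and fits in an int. On success stores the value in *out and returns 1;
   otherwise returns 0 and leaves *out untouched. */
int toy_parse_capture_name(const char *name, int *out)
{
    unsigned long value = 0;
    const char   *p;

    assert(name != NULL && out != NULL);
    if (*name < '1' || *name > '9')
        return 0;                       /* empty, or not a capture name */
    for (p = name; *p != '\0'; p++) {
        unsigned long digit;

        if (*p < '0' || *p > '9')
            return 0;                   /* e.g. "1foo" */
        digit = (unsigned long)(*p - '0');
        if (value > ((unsigned long)INT_MAX - digit) / 10)
            return 0;                   /* would exceed INT_MAX */
        value = value * 10 + digit;
    }
    *out = (int)value;
    return 1;
}
```

Checking for overflow before multiplying keeps the intermediate value within range, so the test itself can never wrap around.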

/* If we do ever start using this later on in the file, we need to make
   sure we don’t accidentally use the wrong definition.  */
#undef SvREADONLY_on

/* This function is called when the stash already holds the GV of the magic
 * variable we're looking for, but we need to check that it has the correct
 * kind of magic.  For example, if someone first uses $! and then %!, the
 * latter would end up here, and we add the Errno tie to the HASH slot of
 * the *! glob.
 */
PERL_STATIC_INLINE void
S_maybe_multimagic_gv(pTHX_ GV *gv, const char *name, const svtype sv_type)
{
    PERL_ARGS_ASSERT_MAYBE_MULTIMAGIC_GV;

    if (sv_type == SVt_PVHV || sv_type == SVt_PVGV) {
        if (*name == '!')
            require_tie_mod_s(gv, '!', "Errno", 1);
        else if (*name == '-' || *name == '+')
            require_tie_mod_s(gv, *name, "Tie::Hash::NamedCapture", 0);
    } else if (sv_type == SVt_PV) {
        if (*name == '*' || *name == '#') {
            /* diag_listed_as: $* is no longer supported as of Perl 5.30 */
            Perl_croak(aTHX_ "$%c is no longer supported as of Perl 5.30", *name);
        }
    }
    if (sv_type==SVt_PV || sv_type==SVt_PVGV) {
      switch (*name) {
#ifdef PERL_SAWAMPERSAND
      case '`':
          PL_sawampersand |= SAWAMPERSAND_LEFT;
          (void)GvSVn(gv);
          break;
      case '&':
          PL_sawampersand |= SAWAMPERSAND_MIDDLE;
          (void)GvSVn(gv);
          break;
      case '\'':
          PL_sawampersand |= SAWAMPERSAND_RIGHT;
          (void)GvSVn(gv);
          break;
#endif
      }
    }
}

GV *
Perl_gv_fetchpvn_flags(pTHX_ const char *nambeg, STRLEN full_len, I32 flags,
                       const svtype sv_type)
{
    const char *name = nambeg;
    const char *const name_end = nambeg + full_len;
    GV *gv = NULL;
    GV**gvp;
    HV *stash = NULL;
    char *hvname;
    STRLEN len;
    const I32 no_init = flags & (GV_NOADD_NOINIT | GV_NOINIT);
    const I32 no_expand = flags & GV_NOEXPAND;
    const I32 add = flags & ~GV_NOADD_MASK;
    const U32 is_utf8 = flags & SVf_UTF8;
    U32 faking_it;
    bool addmg = cBOOL(flags & GV_ADDMG);

    PERL_ARGS_ASSERT_GV_FETCHPVN_FLAGS;
    PERL_DTRACE_PROBE_GLOB_ENTRY(PERL_DTRACE_GLOB_MODE_FETCH, name);

     /* If we have GV_NOTQUAL, the caller promised that
      * there is no stash, so we can skip the check.
      * Similarly if full_len is 0, since then we're
      * dealing with something like *{""} or ""->foo()
      */
    if ((flags & GV_NOTQUAL) || !full_len)
        len = full_len;
    else if (parse_gv_stash_name(&stash, &gv, &name, &len, nambeg, full_len, is_utf8, add)) {
        if (name == name_end) return gv;
    }
    else
        return NULL;

    if (!stash && !find_default_stash(&stash, name, len, is_utf8, add, sv_type))
        return NULL;
    if (SvTYPE(stash) != SVt_PVHV)
        return NULL;
    
    /* By this point we should have a stash and a name */
    /* On protected stashes and !add we might need to try exists first */
    /*
    if (SvREADONLY(stash) && !add && !hv_exists(stash, name, is_utf8 ? -(I32)len : (I32)len)) {
        if (addmg) gv = (GV *)newSV(0);
        else return NULL;
    }
    */
    gvp = (GV**)hv_fetch(stash, name, is_utf8 ? -(I32)len : (I32)len, add);
    if (!gvp || *gvp == (const GV *)UNDEF) {
        if (addmg) gv = (GV *)newSV(0);
        else return NULL;
    }
    else gv = *gvp, addmg = 0;
    /* From this point on, addmg means gv has not been inserted in the
       symtab yet. */

    /* Check if a sub shadows a package. Skip UNIVERSAL::isa.
       TODO: Only if a method is called in this package.
             Or if the package contains a method.
     */
    if (sv_type == SVt_PVCV
        && add
        && ckWARN(WARN_SHADOW)
        && stash != PL_defstash         /* allow &main */
        && stash != PL_debstash         /* and &DB */
        && (hvname = HvNAME(stash))
        && strNEc(hvname, "UNIVERSAL")) /* and &UNIVERSAL::* */
    {
        char *stashname;
        char tmpbuf[1024];
        int stashname_is_dyn = 0;
        if (UNLIKELY(len >= sizeof(tmpbuf)-3)) {
            stashname_is_dyn = 1;
            Newx(stashname, len+3, char);
            assert(stashname);
        } else {
            stashname = tmpbuf;
        }
        Copy(name, stashname, len, char);
        stashname[len]   = ':';
        stashname[len+1] = ':';
        stashname[len+2] = 0;
        if (hv_fetch(stash, stashname, is_utf8 ? -(I32)(len+2) : (I32)len+2, 0)) {
            /* diag_listed_as: Subroutine &%s::%s masks existing package %s */
            Perl_warner(aTHX_ packWARN(WARN_SHADOW),
                        "Subroutine &%s::%s masks existing package %s::%s",
                        hvname, name, hvname, name);
        }
        if (UNLIKELY(stashname_is_dyn))
            Safefree(stashname);
    }

    if (SvTYPE(gv) == SVt_PVGV) {
        /* The GV already exists, so return it, but check if we need to do
         * anything else with it before that.
         */
        if (add) {
            /* This is the heuristic that handles if a variable triggers the
             * 'used only once' warning.  If there's already a GV in the stash
             * with this name, then we assume that the variable has been used
             * before and turn its MULTI flag on.
             * It's a heuristic because it can easily be "tricked", like with
             * BEGIN { $a = 1; $::{foo} = *a }; () = $foo
             * not warning about $main::foo being used just once
             */
            GvMULTI_on(gv);
            gv_init_svtype(gv, sv_type);
            /* You reach this path once the typeglob has already been created,
               either by the same or a different sigil.  If this path didn't
               exist, then (say) referencing $! first, and %! second would
               mean that %! was not handled correctly.  */
            if (len == 1 && stash == PL_defstash) {
                maybe_multimagic_gv(gv, name, sv_type);
            }
            else if (sv_type == SVt_PVAV
                  && memEQs(name, len, "ISA")
                  && (!GvAV(gv) || !SvSMAGICAL(GvAV(gv))))
                gv_magicalize_isa(gv);
        }
        return gv;
    } else if (no_init) {
        assert(!addmg);
        return gv;
    }
    /* If GV_NOEXPAND is true and what we got off the stash is a ref,
     * don't expand it to a glob. This is an optimization so that things
     * copying constants over, like Exporter, don't have to be rewritten
     * to take into account that you can store more than just globs in
     * stashes.
     */
    else if (no_expand && SvROK(gv)) {
        assert(!addmg);
        return gv;
    }

    /* Adding a new symbol.
       Unless of course there was already something non-GV here, in which case
       we want to behave as if there was always a GV here, containing some sort
       of subroutine.
       Otherwise we run the risk of creating things like GvIO, which can cause
       subtle bugs. eg the one that tripped up SQL::Translator  */

    faking_it = SvOK(gv);

    if (add & GV_ADDWARN)
        Perl_ck_warner_d(aTHX_ packWARN(WARN_INTERNAL),
                "Had to create %" UTF8f " unexpectedly",
                 UTF8fARG(is_utf8, name_end-nambeg, nambeg));
    gv_init_pvn(gv, stash, name, len, (add & GV_ADDMULTI)|is_utf8);

    if (   full_len != 0
        && isIDFIRST_lazy_if_safe(name, name + full_len, is_utf8)
        && !ckWARN(WARN_ONCE) )
    {
        GvMULTI_on(gv) ;
    }

    /* set up magic where warranted */
    if ( gv_magicalize(gv, stash, name, len, sv_type) ) {
        /* See 23496c6 */
        if (addmg) {
            /* gv_magicalize magicalised this gv, so we want it
             * stored in the symtab.
             * Effectively the caller is asking, ‘Does this gv exist?’ 
             * And we respond, ‘Er, *now* it does!’
             */
            (void)hv_store(stash,name,len,(SV *)gv,0);
        }
    }
    else if (addmg) {
        /* The temporary GV created above */
        SvREFCNT_dec_NN(gv);
        gv = NULL;
    }
    
    if (gv) gv_init_svtype(gv, faking_it ? SVt_PVCV : sv_type);
    PERL_DTRACE_PROBE_GLOB_RETURN(PERL_DTRACE_GLOB_MODE_FETCH, name);
    return gv;
}

void
Perl_gv_fullname4(pTHX_ SV *sv, const GV *gv, const char *prefix, bool keepmain)
{
    const char *name;
    const HV * const hv = GvSTASH(gv);

    PERL_ARGS_ASSERT_GV_FULLNAME4;

    sv_setpv(sv, prefix ? prefix : "");

    if (hv && (name = HvNAME(hv))) {
        const STRLEN len = HvNAMELEN(hv);
        if (keepmain || ! memBEGINs(name, len, "main")) {
            sv_catpvn_flags(sv,name,len,HvNAMEUTF8(hv)?SV_CATUTF8:SV_CATBYTES);
            sv_catpvs(sv,"::");
        }
    }
    else sv_catpvs(sv,"__ANON__::");
    sv_catsv(sv,sv_2mortal(newSVhek(GvNAME_HEK(gv))));
}

void
Perl_gv_efullname4(pTHX_ SV *sv, const GV *gv, const char *prefix, bool keepmain)
{
    const GV * const egv = GvEGVx(gv);

    PERL_ARGS_ASSERT_GV_EFULLNAME4;

    gv_fullname4(sv, egv ? egv : gv, prefix, keepmain);
}

/* gv_magicalize;

gv_try_downgrade

NOTE: this function is experimental and may change or be removed without notice.

If the typeglob gv can be expressed more succinctly, by having something other than a real GV in its place in the stash, replace it with the optimised form. Basic requirements for this are that gv is a real typeglob, is sufficiently ordinary, and is only referenced from its package. This function is meant to be used when a GV has been looked up in part to see what was there, causing upgrading, but based on what was found it turns out that the real GV isn't required after all.

If gv is a completely empty typeglob, it is deleted from the stash.

If gv is a typeglob containing only a sufficiently-ordinary constant sub, the typeglob is replaced with a scalar-reference placeholder that more compactly represents the same thing.

        void    gv_try_downgrade(GV* gv)

Hash Entries

refcounted_he_fetch_pvs

Like "refcounted_he_fetch_pvn", but takes a literal string instead of a string/length pair, and no precomputed hash.

        SV *    refcounted_he_fetch_pvs(
                    const struct refcounted_he *chain,
                    "literal string" key, U32 flags
                )
refcounted_he_new_pvs

Like "refcounted_he_new_pvn", but takes a literal string instead of a string/length pair, and no precomputed hash.

        struct refcounted_he * refcounted_he_new_pvs(
                                   struct refcounted_he *parent,
                                   "literal string" key,
                                   SV *value, U32 flags
                               )

Hash Manipulation Functions

hv_backreferences_p

NOTE: this function is experimental and may change or be removed without notice.

Returns the modifiable pointer to the field holding the AV* of backreferences. See also "sv_get_backrefs" in perlapi.

        AV**    hv_backreferences_p(HV *hv)
hv_ename_add

Adds a name to a stash's internal list of effective names. See "hv_ename_delete".

This is called when a stash is assigned to a new location in the symbol table.

        void    hv_ename_add(HV *hv, const char *name, U32 len,
                             U32 flags)
hv_ename_delete

Removes a name from a stash's internal list of effective names. If this is the name returned by HvENAME, then another name in the list will take its place (HvENAME will use it).

This is called when a stash is deleted from the symbol table.

        void    hv_ename_delete(HV *hv, const char *name,
                                U32 len, U32 flags)
hv_kill_backrefs

NOTE: this function is experimental and may change or be removed without notice.

Calls "sv_kill_backrefs" on the hash backreferences, and frees it.

        void    hv_kill_backrefs(HV *hv)
hv_placeholders_p

Returns a modifiable pointer to the field holding the count of hash placeholders (the deleted elements). Used as HvPLACEHOLDERS(hv)++.

        SSize_t* hv_placeholders_p(HV *hv)
refcounted_he_chain_2hv

Generates and returns a HV * representing the content of a refcounted_he chain. flags is currently unused and must be zero.

        HV *    refcounted_he_chain_2hv(
                    const struct refcounted_he *c, U32 flags
                )
refcounted_he_fetch_pv

Like "refcounted_he_fetch_pvn", but takes a nul-terminated string instead of a string/length pair.

        SV *    refcounted_he_fetch_pv(
                    const struct refcounted_he *chain,
                    const char *key, U32 hash, U32 flags
                )
refcounted_he_fetch_pvn

Search along a refcounted_he chain for an entry with the key specified by keypv and keylen. If flags has the REFCOUNTED_HE_KEY_UTF8 bit set, the key octets are interpreted as UTF-8, otherwise they are interpreted as Latin-1. hash is a precomputed hash of the key string, or zero if it has not been precomputed. Returns a mortal scalar representing the value associated with the key, or PLACEHOLDER if there is no value associated with the key.

        SV *    refcounted_he_fetch_pvn(
                    const struct refcounted_he *chain,
                    const char *keypv, STRLEN keylen, U32 hash,
                    U32 flags
                )
refcounted_he_fetch_sv

Like "refcounted_he_fetch_pvn", but takes a Perl scalar instead of a string/length pair.

        SV *    refcounted_he_fetch_sv(
                    const struct refcounted_he *chain, SV *key,
                    U32 hash, U32 flags
                )
refcounted_he_free

Decrements the reference count of a refcounted_he by one. If the reference count reaches zero the structure's memory is freed, which (recursively) causes a reduction of its parent refcounted_he's reference count. It is safe to pass a null pointer to this function: no action occurs in this case.

        void    refcounted_he_free(struct refcounted_he *he)
refcounted_he_inc

Increment the reference count of a refcounted_he. The pointer to the refcounted_he is also returned. It is safe to pass a null pointer to this function: no action occurs and a null pointer is returned.

        struct refcounted_he * refcounted_he_inc(
                                   struct refcounted_he *he
                               )
refcounted_he_new_pv

Like "refcounted_he_new_pvn", but takes a nul-terminated string instead of a string/length pair.

        struct refcounted_he * refcounted_he_new_pv(
                                   struct refcounted_he *parent,
                                   const char *key, U32 hash,
                                   SV *value, U32 flags
                               )
refcounted_he_new_pvn

Creates a new refcounted_he. This consists of a single key/value pair and a reference to an existing refcounted_he chain (which may be empty), and thus forms a longer chain. When using the longer chain, the new key/value pair takes precedence over any entry for the same key further along the chain.

The new key is specified by keypv and keylen. If flags has the REFCOUNTED_HE_KEY_UTF8 bit set, the key octets are interpreted as UTF-8, otherwise they are interpreted as Latin-1. hash is a precomputed hash of the key string, or zero if it has not been precomputed.

value is the scalar value to store for this key. value is copied by this function, which thus does not take ownership of any reference to it, and later changes to the scalar will not be reflected in the value visible in the refcounted_he. Complex types of scalar will not be stored with referential integrity, but will be coerced to strings. value may be either null or PLACEHOLDER to indicate that no value is to be associated with the key; this, as with any non-null value, takes precedence over the existence of a value for the key further along the chain.

parent points to the rest of the refcounted_he chain to be attached to the new refcounted_he. This function takes ownership of one reference to parent, and returns one reference to the new refcounted_he.

        struct refcounted_he * refcounted_he_new_pvn(
                                   struct refcounted_he *parent,
                                   const char *keypv,
                                   STRLEN keylen, U32 hash,
                                   SV *value, U32 flags
                               )
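The chain semantics described above (a new entry shadows older entries for the same key, the constructor takes over one reference to the parent, and freeing an entry releases its reference on the parent) can be illustrated with a small stand-alone model. This is only a sketch: struct toy_he and the toy_he_* functions are hypothetical names, values are plain C strings rather than SVs, and the hash and flags arguments are left out entirely.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct refcounted_he: one key/value pair plus
 * a counted link to the rest of the chain. Not the real layout. */
struct toy_he {
    struct toy_he *next;   /* rest of the chain, may be NULL */
    char *key;
    char *value;           /* NULL models "no value" (PLACEHOLDER) */
    unsigned refcnt;
};

/* Copy a string, aborting on allocation failure to keep the sketch short. */
static char *toy_copy(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (!copy) abort();
    memcpy(copy, s, n);
    return copy;
}

/* Takes ownership of one reference to parent and returns one reference to
 * the new entry. The value is copied, so later changes are not visible. */
struct toy_he *toy_he_new(struct toy_he *parent, const char *key,
                          const char *value)
{
    struct toy_he *he = malloc(sizeof *he);
    if (!he) abort();
    he->next = parent;
    he->key = toy_copy(key);
    he->value = value ? toy_copy(value) : NULL;
    he->refcnt = 1;
    return he;
}

/* The first match wins, so newer entries shadow older ones. */
const char *toy_he_fetch(const struct toy_he *chain, const char *key)
{
    for (; chain; chain = chain->next)
        if (strcmp(chain->key, key) == 0)
            return chain->value;
    return NULL;
}

/* Null-safe increment that hands the pointer back. */
struct toy_he *toy_he_inc(struct toy_he *he)
{
    if (he) he->refcnt++;
    return he;
}

/* Drops one reference. Freeing an entry releases its reference on the
 * parent, which may cascade along the chain. Null-safe. */
void toy_he_free(struct toy_he *he)
{
    while (he && --he->refcnt == 0) {
        struct toy_he *parent = he->next;
        free(he->key);
        free(he->value);
        free(he);
        he = parent;
    }
}
```

A caller that wants to keep using the parent after extending it passes toy_he_inc(parent) to toy_he_new, since the constructor consumes one reference.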
refcounted_he_new_sv

Like "refcounted_he_new_pvn", but takes a Perl scalar instead of a string/length pair.

        struct refcounted_he * refcounted_he_new_sv(
                                   struct refcounted_he *parent,
                                   SV *key, U32 hash, SV *value,
                                   U32 flags
                               )

Hook manipulation

add_does_methods

Copy all methods that do not yet exist from the parent roles to the class/role. Fix up ISA for type checks. Fix up changed oelem{,fast} indices.

Duplicates are fatal: "Method %s from %s already exists in %s during role composition"

        void    add_does_methods(HV* klass, AV* does)
add_isa_fields

Copy all fields that do not yet exist from parent classes or roles to the class klass. Duplicates are fatal with roles, ignored with classes.

        void    add_isa_fields(HV* klass, AV* isa)
class_isamagic

Set closed ISA magic to the array in pkg, either @ISA or @DOES.

        void    class_isamagic(OP* o, SV* pkg, const char* what,
                               int len)
const_av_xsub

Efficient sub that returns a constant array value.

        void    const_av_xsub(CV* cv)
const_sv_xsub

Efficient sub that returns a constant scalar value.

        void    const_sv_xsub(CV* cv)
do_method_finalize

A field may start as a lexical or an access call in the class block and method pad, and can be converted to oelemfast ops, which are basically aelemfast_lex_u (lexical typed self, const ix < 256).

  PADxV targ     -> OELEMFAST(self)[targ]

  $field         -> $self->field[i] (same as above)
  $self->{field} ->     -"- (do not use)
  $self->field   ->     -"-

  exists $self->field    -> compile-time const if exists
  exists $self->{field}  -> compile-time const (do not use)
  exists $self->{$field} -> exists oelem

If the field is computed, convert to a new 'oelem' op, which does the field lookup at run-time.

        void    do_method_finalize(const HV* klass,
                                   const CV* cv, OP* o,
                                   const PADOFFSET self)
method_finalize

Resolve internal lexicals or field helem's or field accessors to fields in the class method or sub.

Field helem's might get deleted, as they don't work outside of classes. Only subs and methods inside the class are processed, not outside!

        void    method_finalize(const HV* klass, const CV* cv)

Mu_av_xsub

XS template to set or return object array values from its compile-time field offset.

    class MY {
      has @a;
    }
    my $c = new MY;
    $c->a = (0..2); # (0,1,2)
    print scalar $c->a; # 3
    $c->a = 1;     # (1)
    $c->a = 0..2;  # (0,1,2)
    $c->a = 1,2;   # (1,2)

        void    Mu_av_xsub(CV* cv)
Mu_sv_xsub

XS template to return an object scalar value from its compile-time field offset.

        void    Mu_sv_xsub(CV* cv)
padnamelist_type_fixup

Changes all types in the padnames from the old klass to a new class. Needed for cloned roles.

        void    padnamelist_type_fixup(PADNAMELIST *pnl,
                                       HV* oldklass,
                                       HV* newklass)

IO Functions

start_glob

NOTE: this function is experimental and may change or be removed without notice.

Function called by do_readline to spawn a glob (or do the glob inside perl on VMS). This code used to be inline, but now that perl uses File::Glob, this glob starter is only used by miniperl during the build process, or when PERL_EXTERNAL_GLOB is defined. Moving it away shrinks pp_hot.c; shrinking pp_hot.c helps speed perl up.

        PerlIO* start_glob(SV *tmpglob, IO *io)

Lexer interface

ao

ao(toketype) looks for an '=' immediately following the operator that has just been parsed, with no intervening whitespace, and turns it into an ASSIGNOP if it finds one.

        int     ao(int toketype)
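The lookahead that ao performs can be modelled in a few lines. This is a simplified sketch, not the code in toke.c: the token codes and the name ao_sketch are hypothetical, and the cursor is passed explicitly instead of living in the parser state.

```c
/* Hypothetical token codes; the real values come from the generated parser. */
enum { TOK_PLUS = 1, TOK_ASSIGNOP = 2 };

/* If the character at *cursor is '=', consume it and report an assignment
 * operator; otherwise leave the cursor alone and return the original token.
 * No whitespace is skipped, so "+ =" is not treated as "+=". */
int ao_sketch(int toketype, const char **cursor)
{
    if (**cursor == '=') {
        (*cursor)++;
        return TOK_ASSIGNOP;
    }
    return toketype;
}
```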
check_uni

Check the unary operators to ensure there's no ambiguity in how they're used, and warn if there is. An ambiguous piece of code would be:

    rand + 5

This doesn't mean rand() + 5. Because rand() is a unary operator, the +5 is its argument.

        void    check_uni()
find_in_coretypes

NOTE: this function is experimental and may change or be removed without notice.

Check for and autocreate coretypes. Some of them are inherited, which sets their ISA. Returns NULL if the name is not a coretype.

NOTE: the perl_ form of this function is deprecated.

        HV *    find_in_coretypes(const char *pkgname,
                                  STRLEN len)
force_ident

Called when the lexer wants $foo *foo &foo etc, but the program text only contains the "foo" portion. The first argument is a pointer to the "foo", and the second argument is the type symbol to prefix. Forces the next token to be a "BAREWORD". Creates the symbol if it didn't already exist (via gv_fetchpv()).

        void    force_ident(const char *s, int kind)
force_next

When the lexer realizes it knows the next token (for instance, it is reordering tokens for the parser) then it can call S_force_next to record which token to return the next time the lexer is called. Caller will need to set PL_nextval[] and possibly PL_expect to ensure the lexer handles the token correctly.

        void    force_next(I32 type)
force_version

Forces the next token to be a version number. If the next token appears to be an invalid version number, (e.g. "v2b"), and if "guessing" is TRUE, then no new token is created (and the caller must use an alternative parsing method).

        char*   force_version(char *s, int guessing)
force_word

When the lexer knows the next thing is a word (for instance, it has just seen -> and it knows that the next char is a word char), it calls S_force_word to stick the next word into the PL_nexttoke/val lookahead.

Arguments:

  char *start       : buffer position (must be within PL_linestr)
  int token         : PL_next* will be this type of bare word
                      (e.g., METHOD,BAREWORD)
  int check_keyword : if true, Perl checks to make sure the word isn't
                      a keyword (do this if the word is a label,
                      e.g. goto FOO)
  int allow_pack    : if true, : characters will also be allowed
                      (require, use, etc. do this)

        char*   force_word(char *start, int token,
                           int check_keyword, int allow_pack)
incline

This subroutine name is short for "increment line". It has nothing to do with tilting, whether at windmills or pinball tables. It increments the current line number in CopLINE(PL_curcop) and checks to see whether the line starts with a comment of the form

    # line 500 "foo.pm"

If so, it sets the current line number and file to the values in the comment.

        void    incline(const char *s, const char *end)
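Recognising a line directive of the form # line 500 "foo.pm" can be sketched as below. The helper name parse_line_directive and its interface are hypothetical, and the sketch accepts only a simplified form of the comment (an optional double-quoted filename, no trailing-text checks); the real incline is stricter and also updates CopLINE and the current file.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: recognise '# line N "file"' at the start of s.
 * On success returns 1, stores N in *line, and copies the quoted file name
 * (or an empty string if none is given) into file. Returns 0 otherwise and
 * leaves *line untouched. */
int parse_line_directive(const char *s, long *line, char *file, size_t filesz)
{
    char *end;
    long n;

    if (filesz == 0 || *s++ != '#')
        return 0;
    while (*s == ' ' || *s == '\t')
        s++;
    if (strncmp(s, "line", 4) != 0)
        return 0;
    s += 4;
    if (*s != ' ' && *s != '\t')
        return 0;
    while (*s == ' ' || *s == '\t')
        s++;
    if (!isdigit((unsigned char)*s))
        return 0;
    n = strtol(s, &end, 10);
    s = end;
    while (*s == ' ' || *s == '\t')
        s++;

    file[0] = '\0';
    if (*s == '"') {
        const char *close = strchr(s + 1, '"');
        size_t len;
        if (!close)
            return 0;
        len = (size_t)(close - (s + 1));
        if (len >= filesz)
            return 0;
        memcpy(file, s + 1, len);
        file[len] = '\0';
    }
    *line = n;
    return 1;
}
```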
lop

Build a list operator (or something that might be one). The rules:

  - if we have a next token, then it's a list operator (no parens) for
    which the next token has already been parsed; e.g.,
        sort foo @args
        sort foo (@args)
  - if the next thing is an opening paren, then it's a function
  - else it's a list operator

        I32     lop(I32 f, expectation x, char *s)
missingterm

Complain about missing quote/regexp/heredoc terminator. If it's called with NULL then it cauterizes the line buffer. If we're in a delimited string and the delimiter is a control character, it's reformatted into a two-char sequence like ^C. This is fatal.

        void    missingterm(char *s, STRLEN len)
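The reformatting of a control-character delimiter into a two-character sequence such as ^C can be sketched as follows. The function name format_delim is hypothetical and the code assumes an ASCII platform; it only illustrates the caret notation, not the error reporting itself.

```c
/* Hypothetical helper: write a printable form of delim into buf, which must
 * hold at least 3 bytes. ASCII control characters become caret notation,
 * e.g. 0x03 becomes "^C"; anything else is copied unchanged. */
void format_delim(char delim, char buf[3])
{
    unsigned char c = (unsigned char)delim;

    if (c < 0x20 || c == 0x7f) {
        buf[0] = '^';
        buf[1] = (char)(c ^ 0x40);   /* flip bit 6: 0x03 -> 'C', 0x7f -> '?' */
        buf[2] = '\0';
    } else {
        buf[0] = delim;
        buf[1] = '\0';
    }
}
```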

new_constant

Do any overload::constant lookup.

Either returns sv, or mortalizes/frees sv and returns a new SV*. Best used as sv=new_constant(..., sv, ...).

  If s, pv are NULL, calls subroutine with one argument,
  and <type> is used with error messages only.
  <type> is assumed to be well formed UTF-8

  If error_msg is not NULL, *error_msg will be set to any error encountered.
  otherwise yyerror() will be used to output it

        SV*     new_constant(const char *s, STRLEN len,
                             const char *key, STRLEN keylen,
                             SV *sv, SV *pv, const char *type,
                             STRLEN typelen,
                             const char ** error_msg)
no_op

When Perl expects an operator and finds something else, no_op prints the warning. It always prints "<something> found where operator expected". It prints "Missing semicolon on previous line?" if the surprise occurs at the start of the line. "do you need to predeclare ..." is printed out for code like "sub bar; foo bar $x" where the compiler doesn't know if foo is a method call or a function. It prints "Missing operator before end of line" if there's nothing after the missing operator, or "... before <...>" if there is something after the missing operator.

PL_bufptr is expected to point to the start of the thing that was found, and s after the next token or partial token.

        void    no_op(const char *const what, char *s)
notify_parser_that_changed_to_utf8

Called when $^H is changed to indicate that HINT_UTF8 has changed from off to on. At compile time, this has the effect of entering a 'use utf8' section. This means that any input was not previously checked for UTF-8 (because it was off), but now we do need to check it, or our assumptions about the input being sane could be wrong, and we could segfault. This routine just sets a flag so that the next time we look at the input we do the well-formed UTF-8 check. If we aren't in the proper phase, there may not be a parser object, but if there is, setting the flag is harmless.

        void    notify_parser_that_changed_to_utf8()
num_constlistexpr

Returns the number of const list elements. depth starts at 0.

NOTE: the perl_ form of this function is deprecated.

        SSize_t num_constlistexpr(OP* o, int depth)
parse_subsignature

Parse a sequence of zero or more Perl signature arguments, everything between the () parentheses, separated by ',', with optional '=' or '?' default values and ending slurpy params ('@' or '%').

    sub f ($a, $b = 1) {...}

Return an OP_LINESEQ op, which has as its children, an OP_SIGNATURE, plus 0 or more (sassign, nextstate) pairs for each default arg expression that can't be optimised into the OP_SIGNATURE. Returns NULL on error.

It gives the OP_SIGNATURE op an op_aux array, which contains collections of actions and args; the args being things like what pad ranges to introduce, and simple default args such as an integer constant, an SV constant, or a simple lex or package var.

Note that we attach this data to CV via an OP_SIGNATURE rather than directly attaching it to the CV, so that it doesn't need copying each time a new thread is cloned.

Done:

  - perl6-like optional args: ($opt?), i.e. ($opt=undef)
  - types in leading position: (int $i)
  - attributes (:const, types): ($i :int :const)
  - no double copies into @_
  - scalar references compiled to direct access, not just copies:
    (\$a) => my $a = $_[0]
  - call-by-value and call-by-ref supported. call-by-ref could be
    improved so that we don't change constants to be called-by-ref,
    but rather copy them.

Todo:

  - error in ck_subr when @_/$_[] in signatured bodies is used

NOTE: the perl_ form of this function is deprecated.

        OP *    parse_subsignature()
pending_ident

Looks up an identifier in the pad or in a package.

Returns:

  PRIVATEREF if this is a lexical name.
  BAREWORD   if this belongs to a package.

Structure:

  if we're in a my declaration
      croak if they tried to say my($foo::bar)
      build the ops for a my() declaration
  if it's an access to a my() variable
      build ops for access to a my() variable
  if in a dq string, and they've said @foo and we can't find @foo
      warn
  build ops for a bareword

        int     pending_ident()
scan_const

Extracts the next constant part of a pattern, double-quoted string, or transliteration. This is terrifying code.

For example, in parsing the double-quoted string "ab\x63$d", it would stop at the '$' and return an OP_CONST containing 'abc'.

It looks at PL_lex_inwhat and PL_lex_inpat to find out whether it's processing a pattern (PL_lex_inpat is true), a transliteration (PL_lex_inwhat == OP_TRANS is true), or a double-quoted string.

Returns a pointer to the character scanned up to. If this is advanced from the start pointer supplied (i.e. if anything was successfully parsed), will leave an OP_CONST for the substring scanned in pl_yylval. Caller must intuit reason for not parsing further by looking at the next characters herself.

In patterns:

  expand:
      \N{FOO}  => \N{U+hex_for_character_FOO}
      (if FOO expands to multiple characters, expands to \N{U+xx.XX.yy ...})

  pass through:
      all other \-char, including \N and \N{ apart from \N{ABC}

  stops on:
      @ and $ where it appears to be a var, but not for $ as tail anchor
      \l \L \u \U \Q \E
      (?{  or  (??{

In transliterations: characters are VERY literal, except for - not at the start or end of the string, which indicates a range. If the range is in bytes, scan_const expands the range to the full set of intermediate characters. If the range is in utf8, the hyphen is replaced with a certain range mark which will be handled by pmtrans() in op.c.

In double-quoted strings:

  backslashes:
      double-quoted style: \r and \n
      constants: \x31, etc.
      deprecated backrefs: \1 (in substitution replacements)
      case and quoting: \U \Q \E
  stops on @ and $

scan_const does *not* construct ops to handle interpolated strings. It stops processing as soon as it finds an embedded $ or @ variable and leaves it to the caller to work out what's going on.

embedded arrays (whether in pattern or not) could be: @foo, @::foo, @'foo, @{foo}, @$foo, @+, @-.

$ in double-quoted strings must be the symbol of an embedded scalar.

$ in pattern could be $foo or could be tail anchor. Assumption: it's a tail anchor if $ is the last thing in the string, or if it's followed by one of "()| \r\n\t"
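The tail-anchor assumption stated above is simple enough to write down directly. This is a sketch of that one rule only; the function name dollar_is_tail_anchor is hypothetical, and the real scanner makes the decision inline while it walks the buffer.

```c
#include <string.h>

/* p points at a '$' inside a pattern and end is one past the last character.
 * Returns 1 if the '$' should be read as a tail anchor: it is the last thing
 * in the string, or it is followed by one of "()| \r\n\t". Returns 0 if it
 * looks like the start of a variable. */
int dollar_is_tail_anchor(const char *p, const char *end)
{
    if (p + 1 >= end)
        return 1;
    /* The explicit NUL test keeps strchr from matching its own terminator. */
    return p[1] != '\0' && strchr("()| \r\n\t", p[1]) != NULL;
}
```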

\1 (backreferences) are turned into $1 in substitutions

The structure of the code is

  while (there's a character to process) {
      handle transliteration ranges
      skip regexp comments /(?#comment)/ and codes /(?{code})/
      skip #-initiated comments in //x patterns
      check for embedded arrays
      check for embedded scalars
      if (backslash) {
          deprecate \1 in substitution replacements
          handle string-changing backslashes \l \U \Q \E, etc.
          switch (what was escaped) {
              handle \- in a transliteration (becomes a literal -)
              if a pattern and not \N{, go treat as regular character
              handle \132 (octal characters)
              handle \x15 and \x{1234} (hex characters)
              handle \N{name} (named characters, also \N{3,5} in a pattern)
              handle \cV (control characters)
              handle printf-style backslashes (\f, \r, \n, etc)
          } (end switch)
          continue
      } (end if backslash)
      handle regular character
  } (end while character to read)

        char*   scan_const(char *start)
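For the transliteration case, scan_const is described as expanding a byte range to the full set of intermediate characters, with a '-' at the start or end of the string taken literally. A minimal model of that expansion is sketched below. The name expand_tr_ranges is hypothetical, only byte ranges are handled (no UTF-8 range mark), and backslash escapes such as \- are not modelled.

```c
#include <stddef.h>

/* Expand "a-e" style byte ranges from src into dst. A '-' that is the first
 * or last character is copied literally. Returns the number of bytes
 * written (excluding the terminating NUL), or -1 if dst is too small or a
 * range runs backwards. */
int expand_tr_ranges(const char *src, char *dst, size_t dstsz)
{
    size_t n = 0;
    size_t i;

    if (dstsz == 0)
        return -1;
    for (i = 0; src[i] != '\0'; i++) {
        unsigned char lo = (unsigned char)src[i];

        if (src[i + 1] == '-' && src[i + 2] != '\0') {
            unsigned char hi = (unsigned char)src[i + 2];
            unsigned c;

            if (hi < lo)
                return -1;
            for (c = lo; c <= hi; c++) {
                if (n + 1 >= dstsz)
                    return -1;
                dst[n++] = (char)c;
            }
            i += 2;   /* skip the '-' and the range end */
        } else {
            if (n + 1 >= dstsz)
                return -1;
            dst[n++] = (char)lo;
        }
    }
    dst[n] = '\0';
    return (int)n;
}
```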
scan_heredoc

Takes a pointer to the first < in <<FOO. Returns a pointer to the byte following <<FOO.

This function scans a heredoc, which involves different methods depending on whether we are in a string eval, quoted construct, etc. This is because PL_linestr could contain a single line of input, or a whole string being eval'ed, or the contents of the current quote-like operator.

The two basic methods are:

  - Steal lines from the input stream
  - Scan the heredoc in PL_linestr and remove it therefrom

In a file scope or filtered eval, the first method is used; in a string eval, the second.

In a quote-like operator, we have to choose between the two, depending on where we can find a newline. We peek into outer lexing scopes until we find one with a newline in it. If we reach the outermost lexing scope and it is a file, we use the stream method. Otherwise it is treated as an eval.

        char*   scan_heredoc(char *s)
scan_inputsymbol

takes: position of first '<' in input buffer

returns: position of first char following the last '>' in input buffer.

side-effects: pl_yylval and lex_op are set.

This code handles:

   <>           read from ARGV
   <<>>         read from ARGV without magic open
   <FH>         read from filehandle
   <pkg::FH>    read from package qualified filehandle
   <pkg'FH>     read from package qualified filehandle
   <$fh>        read from filehandle in $fh
   <*.h>        filename glob

        char*   scan_inputsymbol(char *start)
scan_str

NOTE: this function is experimental and may change or be removed without notice.

takes:

  start                  position in buffer
  keep_bracketed_quoted  preserve \ quoting of embedded delimiters, but
                         only if they are of the open/close form
  keep_delims            preserve the delimiters around the string
  re_reparse             compiling a run-time /(?{})/:
                         collapse // to /, and skip encoding src
  delimp                 if non-null, this is set to the position of
                         the closing delimiter, or just after it if
                         the closing and opening delimiters differ
                         (i.e., the opening delimiter of a substitution
                         replacement)

returns: position to continue reading from buffer

side-effects: multi_start, multi_close, lex_repl or lex_stuff, and updates the read buffer.

This subroutine pulls a string out of the input. It is called for:

        q               single quotes           q(literal text)
        '               single quotes           'literal text'
        qq              double quotes           qq(interpolate $here please)
        "               double quotes           "interpolate $here please"
        qx              backticks               qx(/bin/ls -l)
        `               backticks               `/bin/ls -l`
        qw              quote words             @EXPORT_OK = qw( func() $spam )
        m//             regexp match            m/this/
        s///            regexp substitute       s/this/that/
        tr///           string transliterate    tr/this/that/
        y///            string transliterate    y/this/that/
        ($*@)           sub prototypes          sub foo ($)
        (stuff)         sub attr parameters     sub foo : attr(stuff)
        <>              readline or globs       <FOO>, <>, <$fh>, or <*.c>

In most of these cases (all but <>, patterns and transliterate) yylex() calls scan_str(). m// makes yylex() call scan_pat() which calls scan_str(). s/// makes yylex() call scan_subst() which calls scan_str(). tr/// and y/// make yylex() call scan_trans() which calls scan_str().

It skips whitespace before the string starts, and treats the first character as the delimiter. If the delimiter is one of ([{< then the corresponding "close" character )]}> is used as the closing delimiter. It allows quoting of delimiters, and if the string has balanced delimiters ([{<>}]) it allows nesting.

On success, the SV with the resulting string is put into lex_stuff or, if that is already non-NULL, into lex_repl. The second case occurs only when parsing the RHS of the special constructs s/// and tr/// (y///). For convenience, the terminating delimiter character is stuffed into SvIVX of the SV.

        char*   scan_str(char *start, int keep_quoted,
                         int keep_delims, int re_reparse,
                         char **delimp)
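The delimiter rules described above (bracket-style openers pair with their closers and may nest, any other character closes itself, and a backslash quotes the next character) can be sketched independently of the lexer. The names closing_delim and find_closing_delim are hypothetical. The sketch does not skip leading whitespace, does not build the resulting SV, and ignores the keep_* and re_reparse options.

```c
#include <stddef.h>

/* Closing delimiter for a given opener: the four bracket types pair up,
 * and every other character closes itself. */
char closing_delim(char open)
{
    switch (open) {
    case '(': return ')';
    case '[': return ']';
    case '{': return '}';
    case '<': return '>';
    default:  return open;
    }
}

/* s[0] is the opening delimiter. Returns a pointer to the matching closing
 * delimiter, or NULL if the string is unterminated. A backslash quotes the
 * following character, and bracket-style delimiters nest. */
const char *find_closing_delim(const char *s)
{
    const char open = *s;
    char close;
    int depth = 1;
    const char *p;

    if (open == '\0')
        return NULL;
    close = closing_delim(open);
    for (p = s + 1; *p != '\0'; p++) {
        if (*p == '\\' && p[1] != '\0') {
            p++;                      /* skip the quoted character */
        } else if (*p == close) {
            if (--depth == 0)
                return p;
        } else if (*p == open) {
            depth++;                  /* only reachable when open != close */
        }
    }
    return NULL;
}
```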
scan_word

NOTE: this function is experimental and may change or be removed without notice.

Returns a NUL terminated string, with the length of the string written to *slp. Note that the perl5 API misses the normalize argument.

        char*   scan_word(char *s, char *dest, STRLEN destlen,
                          int allow_package, STRLEN *slp,
                          int *normalize)
validate_proto

NOTE: this function is experimental and may change or be removed without notice.

This function performs syntax checking on a prototype, proto. If dowarn is true, any illegal characters or mismatched brackets will trigger illegalproto warnings, declaring that they were detected in the prototype for name.

The return value is true if this is a valid prototype, and false if it is not, regardless of whether dowarn was true or false.

Note that NULL is a valid proto and will always return true.

In cperl, with maybe_sig TRUE, this also detects whether proto is a signature and returns FALSE in that case. Thus the illegalproto warnings are relaxed.

NOTE: the perl_ form of this function is deprecated.

        bool    validate_proto(SV *name, SV *proto, bool dowarn,
                               bool curstash, bool maybe_sig)
yylex

Works out what to call the token just pulled out of the input stream. The yacc parser takes care of taking the ops we return and stitching them into a tree.

Returns: The type of the next token.

Structure:

  Check if we have already built the token; if so, use it.
  Switch based on the current state:
      - if we have a case modifier in a string, deal with that
      - handle other cases of interpolation inside a string
      - scan the next line if we are inside a format
  In the normal state, switch on the next character:
      - default:
        if alphabetic, go to key lookup
        unrecognized character - croak
      - 0/4/26: handle end-of-line or EOF
      - cases for whitespace
      - \n and #: handle comments and line numbers
      - various operators, brackets and sigils
      - numbers
      - quotes
      - 'v': vstrings (or go to key lookup)
      - 'x' repetition operator (or go to key lookup)
      - other ASCII alphanumerics (key lookup begins here):
          word before => ?
          keyword plugin
          scan built-in keyword (but do nothing with it yet)
          check for statement label
          check for lexical subs
              goto just_a_word if there is one
          see whether built-in keyword is overridden
          switch on keyword number:
              - default: just_a_word:
                  not a built-in keyword; handle bareword lookup
                  disambiguate between method and sub call
                  fall back to bareword
              - cases for built-in keywords

        int     yylex()

Magical Functions

magic_clearhint

Triggered by a delete from %^H, records the key to PL_compiling.cop_hints_hash.

        int     magic_clearhint(SV* sv, MAGIC* mg)
magic_clearhints

Triggered by clearing %^H, resets PL_compiling.cop_hints_hash.

        int     magic_clearhints(SV* sv, MAGIC* mg)
magic_getffi_encoded

        int     magic_getffi_encoded(SV* sv, MAGIC* mg)
magic_methcall

Invoke a magic method (like FETCH).

sv and mg are the tied thingy and the tie magic.

meth is the name of the method to call.

argc is the number of args (in addition to $self) to pass to the method.

The flags can be:

    G_DISCARD     invoke method with G_DISCARD flag and don't
                  return a value
    G_UNDEF_FILL  fill the stack with argc pointers to
                  PL_sv_undef

The arguments themselves are any values following the flags argument.

Returns the SV (if any) returned by the method, or NULL on failure.

        SV*     magic_methcall(SV *sv, const MAGIC *mg,
                               SV *meth, U32 flags, U32 argc,
                               ...)
magic_setffi_encoded

Get and set the name of the FFI string argument :encoded() attribute.

        int     magic_setffi_encoded(SV* sv, MAGIC* mg)
magic_sethint

Triggered by a store to %^H, records the key/value pair to PL_compiling.cop_hints_hash. It is assumed that hints aren't storing anything that would need a deep copy. Maybe we should warn if we find a reference.

        int     magic_sethint(SV* sv, MAGIC* mg)
mg_localize

Copy some of the magic from an existing SV to a new localized version of that SV. Container magic (e.g., %ENV, $1, tie) gets copied; value magic (e.g., taint, pos) doesn't.

If setmagic is false then no set magic will be called on the new (empty) SV. This typically means that assignment will soon follow (e.g. 'local $x = $y'), and that will handle the magic.

        void    mg_localize(SV* sv, SV* nsv, bool setmagic)

Miscellaneous Functions

closest_cop

Look for curop starting from o. cop is the last COP we've seen. opnext means that curop is actually the ->op_next of the op we are seeking.

NOTE: the perl_ form of this function is deprecated.

        const COP* closest_cop(const COP *cop, const OP *o,
                               const OP *curop, bool opnext)
free_c_backtrace

Deallocates a backtrace received from get_c_backtrace.

        void    free_c_backtrace(Perl_c_backtrace* bt)
get_c_backtrace

Collects the backtrace (aka "stacktrace") into a single linear malloced buffer, which the caller must free with Perl_free_c_backtrace().

Scans the frames back by max_depth + skip, then drops the skip innermost, returning at most max_depth frames.

        Perl_c_backtrace* get_c_backtrace(int max_depth,
                                          int skip)
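
The depth and skip arithmetic described above can be sketched without any Perl headers. This is only an illustration of the frame window, assuming the frames are already available innermost first; select_frames and its parameters are hypothetical names, not part of the Perl sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: 'stack' lists frames innermost first, as a stack
 * walker would report them. Copies the selected frames into 'out' and
 * returns how many were kept. */
static size_t select_frames(const char *const *stack, size_t stack_len,
                            size_t max_depth, size_t skip,
                            const char **out)
{
    size_t scanned = max_depth + skip;  /* how far back the scan goes */
    size_t kept = 0;
    size_t i;

    assert(stack != NULL && out != NULL);

    if (scanned > stack_len)
        scanned = stack_len;            /* the stack may be shallower */

    for (i = skip; i < scanned; i++)    /* drop the 'skip' innermost */
        out[kept++] = stack[i];

    return kept;                        /* never more than max_depth */
}
```

Skipping the innermost frames is what lets a caller hide the backtrace machinery itself from the result.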
get_db_sub

Stores the called cv in $DB::sub, either as name or as CV ptr (with NONAME, an anon sub).

sv contains the entersub argument from the stack, which is either a CVREF or a GV, or NULL if called via goto. It is not really needed.

In the debugger entersub does not call the function, but &DB::sub which then calls the cv.

        void    get_db_sub(SV *sv, CV *cv)
prep_cif

Prepare the compile-time argument and return types and arity for an extern sub for ffi_prep_cif().

See man ffi_prep_cif.

        void    prep_cif(CV* cv, const char *nativeconv,
                         const char *encoded)

MRO Functions

mro_get_linear_isa_dfs

Returns the Depth-First Search linearization of @ISA for the given stash. The return value is a read-only AV*. level should be 0 (it is used internally in this function's recursion).

You are responsible for SvREFCNT_inc() on the return value if you plan to store it anywhere semi-permanently (otherwise it might be deleted out from under you the next time the cache is invalidated).

        AV*     mro_get_linear_isa_dfs(HV* stash, U32 level)
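
As a rough illustration of what a depth-first linearization produces, here is a toy model in plain C. It assumes an acyclic hierarchy and uses a hypothetical struct toy_class in place of a stash; the caching, read-only AV and recursion-level handling of the real function are not modelled.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARENTS 4

/* Toy stand-in for a stash: a name plus its @ISA entries. */
struct toy_class {
    const char *name;
    const struct toy_class *isa[MAX_PARENTS];
    size_t n_isa;
};

/* Append cls and then, recursively, each parent in @ISA order, skipping
 * classes already present. Returns the new length of out. */
static size_t toy_dfs(const struct toy_class *cls,
                      const struct toy_class **out, size_t len, size_t cap)
{
    size_t i;

    assert(cls != NULL && out != NULL);

    for (i = 0; i < len; i++)
        if (out[i] == cls)
            return len;              /* keep only the first occurrence */
    if (len == cap)
        return len;                  /* toy bound: silently truncate */

    out[len++] = cls;
    for (i = 0; i < cls->n_isa; i++)
        len = toy_dfs(cls->isa[i], out, len, cap);
    return len;
}
```

For a diamond where D inherits from B and C, and both inherit from A, this yields D, B, A, C: the shared base appears where the first branch reaches it.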
mro_package_moved

Call this function to signal to a stash that it has been assigned to another spot in the stash hierarchy. stash is the stash that has been assigned. oldstash is the stash it replaces, if any. gv is the glob that is actually being assigned to.

This can also be called with a null first argument to indicate that oldstash has been deleted.

This function invalidates isa caches on the old stash, on all subpackages nested inside it, and on the subclasses of all those, including non-existent packages that have corresponding entries in stash.

It also sets the effective names (HvENAME) on all the stashes as appropriate.

If the gv is present and is not in the symbol table, then this function simply returns. This check will be skipped if flags & 1.

        void    mro_package_moved(HV * const stash,
                                  HV * const oldstash,
                                  const GV * const gv,
                                  U32 flags)

Numeric functions

grok_atoUV

Parse a string, looking for a decimal unsigned integer.

On entry, pv points to the beginning of the string; valptr points to a UV that will receive the converted value, if found; endptr is either NULL or points to a variable that points to one byte beyond the point in pv that this routine should examine. If endptr is NULL, pv is assumed to be NUL-terminated.

Returns FALSE if pv doesn't represent a valid unsigned integer value (with no leading zeros). Otherwise it returns TRUE, and sets *valptr to that value.

If you constrain the portion of pv that is looked at by this function (by passing a non-NULL endptr), and if the initial bytes of that portion form a valid value, it will return TRUE, setting *endptr to the byte following the final digit of the value. But if there is no constraint on what's looked at, all of pv must be valid in order for TRUE to be returned.

The only characters this accepts are the decimal digits '0'..'9'.

As opposed to atoi(3) or strtol(3), grok_atoUV does NOT allow optional leading whitespace, nor negative inputs. If such features are required, the calling code needs to explicitly implement those.

Note that this function returns FALSE for inputs that would overflow a UV, or have leading zeros. Thus a single 0 is accepted, but not 00 nor 01, 002, etc.

Background: atoi has severe problems with illegal inputs, it cannot be used for incremental parsing, and therefore should be avoided. atoi and strtol are also affected by locale settings, which can also be seen as a bug (global state controlled by user environment).

        bool    grok_atoUV(const char* pv, UV* valptr,
                           const char** endptr)
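
The rules above can be made concrete with a stand-alone sketch. This is not the code from the Perl sources: sketch_atouv is a hypothetical name and uintmax_t stands in for UV, but the accept/reject behaviour follows the description (digits only, no leading zeros, no overflow, optional endptr constraint).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for grok_atoUV(); uintmax_t plays the role of UV. */
static bool sketch_atouv(const char *pv, uintmax_t *valptr,
                         const char **endptr)
{
    const char *s = pv;
    const char *end = endptr ? *endptr : NULL; /* NULL: rely on the NUL */
    uintmax_t val = 0;

    assert(pv != NULL && valptr != NULL);

    /* At least one digit is required; no whitespace or sign is skipped. */
    if ((end && s >= end) || *s < '0' || *s > '9')
        return false;

    if (*s == '0') {
        /* A leading zero is accepted only as the whole number "0". */
        s++;
        if ((!end || s < end) && *s >= '0' && *s <= '9')
            return false;
    } else {
        while ((!end || s < end) && *s >= '0' && *s <= '9') {
            unsigned digit = (unsigned)(*s - '0');
            if (val > (UINTMAX_MAX - digit) / 10)
                return false;         /* would overflow */
            val = val * 10 + digit;
            s++;
        }
    }

    if (endptr)
        *endptr = s;          /* constrained: report where parsing stopped */
    else if (*s != '\0')
        return false;         /* unconstrained: whole string must match */

    *valptr = val;
    return true;
}
```

With a NULL endptr the string "12abc" is rejected, while with endptr pointing at the end of the buffer the same input yields 12 and leaves *endptr at the 'a'.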

Optree construction

force_list

Promote o and any siblings to be a list if it's not already; i.e.

 o - A - B

becomes

 list
   |
 pushmark - o - A - B

If nullit is true, the list op is nulled.

        OP*     force_list(OP* arg, bool nullit)
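
The diagram can be mimicked with a toy tree in plain C. The struct toy_op type and the toy_ helpers are hypothetical and carry only first-child and sibling links; real ops have more fields and are built with the newOP family of functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

enum toy_type { TOY_NULL, TOY_LIST, TOY_PUSHMARK, TOY_OTHER };

struct toy_op {
    enum toy_type type;
    struct toy_op *first;    /* first child, if any */
    struct toy_op *sibling;  /* next sibling, if any */
};

static struct toy_op *toy_new(enum toy_type type)
{
    struct toy_op *o = calloc(1, sizeof *o);
    assert(o != NULL);       /* sketch-level error handling only */
    o->type = type;
    return o;
}

/* Wrap o and its siblings as  list -> pushmark - o - A - B  unless o is
 * already a list. If nullit is true, the list op is nulled (here: its
 * type becomes TOY_NULL while its children stay attached). */
static struct toy_op *toy_force_list(struct toy_op *o, bool nullit)
{
    struct toy_op *list = o;

    if (!o || o->type != TOY_LIST) {
        struct toy_op *mark = toy_new(TOY_PUSHMARK);
        list = toy_new(TOY_LIST);
        mark->sibling = o;   /* pushmark goes in front of o's chain */
        list->first = mark;
    }
    if (nullit)
        list->type = TOY_NULL;
    return list;
}
```

Calling the helper on something that is already a list returns it unchanged, which is the "if it's not already" part of the description.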

package

Implements the package keyword, used in perly.y. Saves the old current package, and sets the new current package and package name (for __PACKAGE__).

On cperl also checks for a shadow method overriding method access to this new package. Note that "class_role" in perlapi inlines most of this function also.

NOTE: the perl_ form of this function is deprecated.

        void    package(OP* o)

Optree Manipulation Functions

alloc_LOGOP

NOTE: this function is experimental and may change or be removed without notice.

lowest-level newLOGOP-style function - just allocates and populates the struct. Higher-level stuff should be done by S_new_logop() / newLOGOP(). This function exists mainly to avoid op_first assignment being spread throughout this file.

NOTE: the perl_ form of this function is deprecated.

        LOGOP*  alloc_LOGOP(I32 type, OP *first, OP *other)
apply_attrs

Calls the attribute importer with the target and a list of attributes. As manually done via BEGIN{ require; attributes->import($pkg, $rv, @attrs)}.

See "apply_attrs_my" for the variant which defers the import call to run-time, enabling run-time attribute arguments, i.e. variables, not only constant barewords, and see "attrs_runtime" which extracts the run-time part of attrs.

        void    apply_attrs(HV *stash, SV *target, OP *attrs)
apply_attrs_my

Similar to "apply_attrs", calls the attribute importer with the target, which must be a lexical, and a list of attributes. As manually done via use attributes $pkg, $rv, @attrs. But contrary to "apply_attrs" this defers attributes->import() to run-time.

Returns the list of attributes in the **imopsp argument.

Used in cperl with non-constant attrs arguments to defer the import to run-time. [cperl #291] perl5 cannot handle run-time args like :native($lib). threaded cperl cannot handle those variables yet.

        void    apply_attrs_my(HV *stash, OP *target, OP *attrs,
                               OP **imopsp)
attrs_has_const

NOTE: this function is experimental and may change or be removed without notice.

Checks the attrs list if ":const" is in it. But not ("const", my $x).

Returns the number of attributes found when const is among them: 1 if const is the only attribute, 0 if no const was found, and >1 if there are other attributes besides const.

If from_assign is TRUE, the attrs are already expanded to a full ENTERSUB import call. If not it's a list, not attrs. If from_assign is FALSE, it is from an unexpanded attrlist our VAR :ATTR declaration, without ENTERSUB.

  TRUE:  my $s :const = 1;  LIST-PUSHMARK-ENTERSUB
  TRUE:  my @a :const = 1;  LIST-PUSHMARK-PADAV-ENTERSUB
  TRUE:  our $s :const = 1; LIST-PUSHMARK-RV2SV(gv)-ENTERSUB
  FALSE: our $s :const = 1; CONST
  TRUE:  ("const",my $s) = 1; LIST-PUSHMARK-CONST

        int     attrs_has_const(OP* o, bool from_assign)
attrs_runtime

NOTE: this function is experimental and may change or be removed without notice.

Extract the run-time part of sub attributes with arguments, i.e. variables, not just constant barewords or strings. Might be extended to other lexical args, not just subs.

Returns NULL on none or only constant attribute arguments, otherwise returns the run-time attributes->import code.

        OP *    attrs_runtime(CV *cv, OP *attrs)
bind_match

        OP*     bind_match(I32 type, OP *left, OP *right)
cant_declare

        void    cant_declare(OP* o)
check_hash_fields_and_hekify

For a helem/hslice/kvslice, if it's a fixed hash, croak on invalid const fields. Also, convert CONST keys to HEK-in-SVs.

rop is the op that retrieves the hash; key_op is the first key; if real is false, only check (and possibly croak) and don't update the op.

                check_hash_fields_and_hekify;
cv_check_inline

NOTE: this function is experimental and may change or be removed without notice.

Examine an optree to determine whether it's in-lineable. In contrast to op_const_sv allow short op sequences which are not constant folded. Max 10 ops, no new pad (?), no intermediate return, no recursion, ... no call-by-ref: $_[i] aelemfast(*_) or aelem rv2av or multideref($_[$x]) TODO later: call-by-ref, new lexicals. walk by sib not next (skipping other).

cv_inline needs to translate the args, change return to jumps.

$lhs = call(...); => $lhs = do {...inlined...};

        bool    cv_check_inline(const OP *o, CV *compcv)

cv_do_inline

NOTE: this function is experimental and may change or be removed without notice.

Needs to translate the args to local pads. o: entersub cvop: leavesub Splice inlined leavesub block, replacing pushmark .. entersub. METHOD should not arrive here, neither $obj->method.

handle args: shift, = @_ or just accept SIGNATURED subs with PERL_FAKE_SIGNATURE. with a OP_SIGNATURE it is easier. without need to populate @_. if arg is call-by-value make a copy. adjust or add targs, with local or eval{} or caller, entersub, ... need to add ENTER/LEAVE, skip ENTER/LEAVE if certain ops are absent.

$lhs = call(...); => $lhs = do {...inlined...};

Converted to a simpler ck step, without linked op_next ptrs. Not in rpeep anymore. Only activated with PERL_INLINE_SUBS.

        OP*     cv_do_inline(OP *parent, OP *o, OP *cvop,
                             CV *cv)

dup_attrlist

Return a copy of an attribute list, i.e. a CONST or LIST with a list of CONST or PADSV/RV2SV-GV values.

        OP *    dup_attrlist(OP *o)
finalize_op

Calls several op-specific finalizers, warnings and fixups.

        void    finalize_op(OP* o)
finalize_optree

This function finalizes the optree. Should be called directly after the complete optree is built. It does some additional checking which can't be done in the normal ck_xxx functions and makes the tree thread-safe.

        void    finalize_optree(OP* o)
invert

Add a unary NOT op in front, inverting the op.

        OP*     invert(OP* cmd)
list

Sets list context for the op.

        OP*     list(OP* o)
listkids

Sets list context for all kids.

        OP*     listkids(OP* o)
maybe_multiconcat

NOTE: this function is experimental and may change or be removed without notice.

Given an OP_STRINGIFY, OP_SASSIGN, OP_CONCAT or OP_SPRINTF op, possibly convert it (and its children) into an OP_MULTICONCAT. See the code comments just before pp_multiconcat() for the full details of what OP_MULTICONCAT supports.

Basically we're looking for an optree with a chain of OP_CONCATS down the LHS (or an OP_SPRINTF), with possibly an OP_SASSIGN, and/or OP_STRINGIFY, and/or OP_CONCAT acting as '.=' at its head, e.g.

     $x = "$a$b-$c"

 looks like

     SASSIGN
        |
     STRINGIFY   -- PADSV[$x]
        |
        |
     ex-PUSHMARK -- CONCAT/S
                       |
                    CONCAT/S  -- PADSV[$c]
                       |
                    CONCAT    -- CONST["-"]
                       |
                    PADSV[$a] -- PADSV[$b]

Note that at this stage the OP_SASSIGN may have already been optimised away with OPpTARGET_MY set on the OP_STRINGIFY or OP_CONCAT.

        void    maybe_multiconcat(OP *o)
maybe_op_signature

Does fake_signatures. If the sub starts with 'my (...) = @_', replace those ops with an OP_SIGNATURE. Here we don't have to add the default $self invocant.

Cannot handle shift as this leaves leftover args.

                maybe_op_signature;
modkids

Sets lvalue context for all kids.

        OP*     modkids(OP *o, I32 type)
move_proto_attr

Move a run-time attribute to a compile-time prototype handling, as with :prototype(...)

Set CV prototype in name from :prototype() attribute.

        void    move_proto_attr(OP **proto, OP **attrs,
                                const GV *name, bool curstash)
my_attrs

Prepend the lexical variable with the attribute->import call.

        OP *    my_attrs(OP *o, OP *attrs)
my_kid

        OP *    my_kid(OP *o, OP *attrs, OP **imopsp)
newASSIGNOP_maybe_const

Checks whether the attrs of the left side include const. If so, dissect my_attrs() and check whether there is another attr. If there is, defer attribute->import to run-time. If not, just const the left side.

OpSPECIAL on the assign op denotes :const. Undo temp. READONLY-ness via a private OPpASSIGN_CONSTINIT bit during assignment at run-time.

Do various compile-time assignments on const rhs values, to enable constant folding. my @a[] = (...) comes also here, setting the computed lhs AvSHAPED size.

Return the newASSIGNOP, or the folded assigned value.

        OP*     newASSIGNOP_maybe_const(OP* left, I32 optype,
                                        OP* right)
newATTRSUB_x

Construct a Perl subroutine, also performing some surrounding jobs.

This function is expected to be called in a Perl compilation context, and some aspects of the subroutine are taken from global variables associated with compilation. In particular, PL_compcv represents the subroutine that is currently being compiled. It must be non-null when this function is called, and some aspects of the subroutine being constructed are taken from it. The constructed subroutine may actually be a reuse of the PL_compcv object, but will not necessarily be so.

If block is null then the subroutine will have no body, and for the time being it will be an error to call it. This represents a forward subroutine declaration such as sub foo ($$);. If block is non-null then it provides the Perl code of the subroutine body, which will be executed when the subroutine is called. This body includes any argument unwrapping code resulting from a subroutine signature or similar. The pad use of the code must correspond to the pad attached to PL_compcv. The code is not expected to include a leavesub or leavesublv op; this function will add such an op. block is consumed by this function and will become part of the constructed subroutine.

proto specifies the subroutine's prototype, unless one is supplied as an attribute (see below). If proto is null, then the subroutine will not have a prototype. If proto is non-null, it must point to a const op whose value is a string, and the subroutine will have that string as its prototype. If a prototype is supplied as an attribute, the attribute takes precedence over proto, but in that case proto should preferably be null. In any case, proto is consumed by this function.

attrs supplies attributes to be applied to the subroutine. A handful of attributes take effect by built-in means, being applied to PL_compcv immediately when seen. Other attributes are collected up and attached to the subroutine by this route. attrs may be null to supply no attributes, or point to a const op for a single attribute, or point to a list op whose children apart from the pushmark are const ops for one or more attributes. Each const op must be a string, giving the attribute name optionally followed by parenthesised arguments, in the manner in which attributes appear in Perl source. The attributes will be applied to the sub by this function. attrs is consumed by this function.

If o_is_gv is false and o is null, then the subroutine will be anonymous. If o_is_gv is false and o is non-null, then o must point to a const op, which will be consumed by this function, and its string value supplies a name for the subroutine. The name may be qualified or unqualified, and if it is unqualified then a default stash will be selected in some manner. If o_is_gv is true, then o doesn't point to an OP at all, but is instead a cast pointer to a GV by which the subroutine will be named.

If there is already a subroutine of the specified name, then the new sub will either replace the existing one in the glob or be merged with the existing one. A warning may be generated about redefinition. Likewise if a package with the same name exists already, a shadow warning is generated about the inaccessibility of the package.

If the subroutine has one of a few special names, such as BEGIN or END, then it will be claimed by the appropriate queue for automatic running of phase-related subroutines. In this case the relevant glob will be left not containing any subroutine, even if it did contain one before. In the case of BEGIN, the subroutine will be executed and the reference to it disposed of before this function returns.

The function returns a pointer to the constructed subroutine. If the sub is anonymous then ownership of one counted reference to the subroutine is transferred to the caller. If the sub is named then the caller does not get ownership of a reference. In most such cases, where the sub has a non-phase name, the sub will be alive at the point it is returned by virtue of being contained in the glob that names it. A phase-named subroutine will usually be alive by virtue of the reference owned by the phase's automatic run queue. But a BEGIN subroutine, having already been executed, will quite likely have been destroyed already by the time this function returns, making it erroneous for the caller to make any use of the returned pointer. It is the caller's responsibility to ensure that it knows which of these situations applies.

        CV *    newATTRSUB_x(I32 floor, OP *o, OP *proto,
                             OP *attrs, OP *block, bool o_is_gv)
newXS_len_flags

Construct an XS subroutine, also performing some surrounding jobs.

The subroutine will have the entry point subaddr. It will have the prototype specified by the nul-terminated string proto, or no prototype if proto is null. The prototype string is copied; the caller can mutate the supplied string afterwards. If filename is non-null, it must be a nul-terminated filename, and the subroutine will have its CvFILE set accordingly. By default CvFILE is set to point directly to the supplied string, which must be static. If flags has the XS_DYNAMIC_FILENAME bit set, then a copy of the string will be taken instead.

Other aspects of the subroutine will be left in their default state. If anything else needs to be done to the subroutine for it to function correctly, it is the caller's responsibility to do that after this function has constructed it. However, beware of the subroutine potentially being destroyed before this function returns, as described below.

If name is null then the subroutine will be anonymous, with its CvGV referring to an __ANON__ glob. If name is non-null then the subroutine will be named accordingly, referenced by the appropriate glob. name is a string of length len bytes giving a sigilless symbol name, in UTF-8 if flags has the SVf_UTF8 bit set and in Latin-1 otherwise. The name may be either qualified or unqualified, with the stash defaulting in the same manner as for gv_fetchpvn_flags. flags may contain flag bits understood by gv_fetchpvn_flags with the same meaning as they have there, such as GV_ADDWARN. The symbol is always added to the stash if necessary, with GV_ADDMULTI semantics. CvFLAGS are not valid flags, only GV_ flags.

If there is already a subroutine of the specified name, then the new sub will replace the existing one in the glob. A warning may be generated about the redefinition. If the old subroutine was CvCONST then the decision about whether to warn is influenced by an expectation about whether the new subroutine will become a constant of similar value. That expectation is determined by const_svp. (Note that the call to this function doesn't make the new subroutine CvCONST in any case; that is left to the caller.) If const_svp is null then it indicates that the new subroutine will not become a constant. If const_svp is non-null then it indicates that the new subroutine will become a constant, and it points to an SV* that provides the constant value that the subroutine will have.

If the subroutine has one of a few special names, such as BEGIN or END, then it will be claimed by the appropriate queue for automatic running of phase-related subroutines. In this case the relevant glob will be left not containing any subroutine, even if it did contain one before. In the case of BEGIN, the subroutine will be executed and the reference to it disposed of before this function returns, and also before its prototype is set. If a BEGIN subroutine would not be sufficiently constructed by this function to be ready for execution then the caller must prevent this happening by giving the subroutine a different name.

The function returns a pointer to the constructed subroutine. If the sub is anonymous then ownership of one counted reference to the subroutine is transferred to the caller. If the sub is named then the caller does not get ownership of a reference. In most such cases, where the sub has a non-phase name, the sub will be alive at the point it is returned by virtue of being contained in the glob that names it. A phase-named subroutine will usually be alive by virtue of the reference owned by the phase's automatic run queue. But a BEGIN subroutine, having already been executed, will quite likely have been destroyed already by the time this function returns, making it erroneous for the caller to make any use of the returned pointer. It is the caller's responsibility to ensure that it knows which of these situations applies.

        CV *    newXS_len_flags(const char *name, STRLEN len,
                                XSUBADDR_t subaddr,
                                const char *const filename,
                                NULLOK const char *const proto,
                                NULLOK SV **const_svp,
                                U32 flags)
op_clear

free all the SVs (gv, pad, ...) attached to the op.

NOTE: the perl_ form of this function is deprecated.

        void    op_clear(OP* o)
op_clear_gv

Free a GV attached to an OP

        void    op_clear_gv(OP* o, PADOFFSET *ixp)
op_const_sv

op_const_sv: examine an optree to determine whether it's in-lineable into a single CONST op. It walks the tree in exec order (next), not in tree order (sibling, first).

Can be called in 2 ways:

!allow_lex

        look for a single OP_CONST with attached value: return the value

allow_lex && !CvCONST(cv)

        examine the clone prototype, and if it contains only a single
        OP_CONST, return the value; or if it contains a single PADSV
        referencing an outer lexical, turn on CvCONST to indicate the CV
        is a candidate for "constizing" at clone time, and return NULL.

                op_const_sv;
op_gv_set

Set the gv as the op_sv. With threads also relocate a gv to the pad for thread safety. cperl-only

        void    op_gv_set(OP* o, GV* gv)
op_relocate_sv

Relocate sv to the pad for thread safety. Despite being a "constant", the SV is written to, for reference counts, sv_upgrade() etc.

        void    op_relocate_sv(SV** svp, PADOFFSET* targp)
op_sibling_newUNOP

replace the sibling following start with a new UNOP, which becomes the parent of the original sibling; e.g.

   op_sibling_newUNOP(P, A, unop-args...)
  
   P              P
   |      becomes |
   A-B-C          A-U-C
                    |
                    B

where U is the new UNOP.

parent and start args are the same as for op_sibling_splice(); type and flags args are as newUNOP().

Returns the new UNOP.

                op_sibling_newUNOP;
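
The before/after picture can be reproduced on a toy tree. In this sketch the caller supplies an already-built unop instead of type and flags, start == NULL is taken to mean "the first child of parent" (as with op_sibling_splice()), and struct toy_op and toy_sibling_new_unop are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct toy_op {
    char name;
    struct toy_op *first;    /* first child */
    struct toy_op *sibling;  /* next sibling */
};

/* Replace the sibling after start (or the first child of parent when
 * start is NULL) with unop, which adopts that op as its only child.
 * Returns unop, or NULL if there was no sibling to wrap. */
static struct toy_op *toy_sibling_new_unop(struct toy_op *parent,
                                           struct toy_op *start,
                                           struct toy_op *unop)
{
    struct toy_op **link;
    struct toy_op *kid;

    assert(parent != NULL && unop != NULL);

    link = start ? &start->sibling : &parent->first;
    kid = *link;
    if (!kid)
        return NULL;

    unop->sibling = kid->sibling;  /* U takes B's place in the chain */
    kid->sibling = NULL;           /* B becomes an only child */
    unop->first = kid;
    *link = unop;
    return unop;
}
```

Starting from P with children A-B-C and start = A, the chain becomes A-U-C with B hanging below U.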
optimize_op

Helper for optimize_optree() which optimises a single op then recurses to optimise any children.

        void    optimize_op(OP* o)
optimize_optree

This function applies some optimisations to the optree in top-down order. It is called before the peephole optimizer, which processes ops in execution order. Note that finalize_optree() also does a top-down scan, but is called *after* the peephole optimizer.

        void    optimize_optree(OP* o)
op_unscope

NOTE: this function is experimental and may change or be removed without notice.

Nullify all state ops in the kids of a lineseq.

        OP*     op_unscope(OP* o)
process_optree

NOTE: this function is experimental and may change or be removed without notice.

Do the post-compilation processing of an op_tree with specified root and start

  * attach it to cv (if non-null)
  * set refcnt
  * run pre-peep optimizer, peep, finalize, prune an empty head, etc
  * tidy pad

                process_optree;
refkids

Sets ref context for all kids.

        OP*     refkids(OP* o, I32 type)
sawparens

        OP*     sawparens(OP* o)
scalarboolean

Checks boolean context for the op, merely for syntax warnings.

Note: We cannot "set_boolean" context here, as some ops still require the non-boolified stackvalue. See "check_for_bool_cxt".

                scalarboolean;
scalarkids

Sets scalar context for all kids.

                scalarkids;
scalarseq

Sets scalar void context for scalar sequences: lineseq, scope, leave and leavetry.

        OP*     scalarseq(OP* o)
scalarvoid

Assigns scalar void context to the optree, i.e. it takes only a scalar argument, no list and returns nothing.

        OP*     scalarvoid(OP* o)
set_boolean

Force the op to be in boolean context, similar to "scalar" and "scalarboolean". This just abstracts away the various private TRUEBOOL flag values.

        OP*     set_boolean(OP *o)
traverse_op_tree

Return the next op in a depth-first traversal of the op tree, returning NULL when the traversal is complete.

The initial call must supply the root of the tree as both top and o.

For now it's static, but it may be exposed to the API in the future.

                traverse_op_tree;
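
The calling convention (pass the root as both top and o for the first call, stop on NULL) is easiest to see on a toy tree. The sketch below is a pre-order walk over a hypothetical struct toy_op with explicit parent links; it is an assumption about the traversal order, not a copy of the static function.

```c
#include <assert.h>
#include <stddef.h>

struct toy_op {
    int id;
    struct toy_op *first;    /* first child */
    struct toy_op *sibling;  /* next sibling */
    struct toy_op *parent;   /* NULL for the root */
};

/* Return the op visited after o in a pre-order walk of the tree rooted
 * at top, or NULL once the walk is complete. */
static struct toy_op *toy_traverse(struct toy_op *top, struct toy_op *o)
{
    assert(top != NULL && o != NULL);

    if (o->first)
        return o->first;              /* descend first */
    while (o != top) {
        if (o->sibling)
            return o->sibling;        /* then move sideways */
        o = o->parent;                /* otherwise climb and retry */
        assert(o != NULL);            /* o must lie inside top's tree */
    }
    return NULL;                      /* back at the root: done */
}
```

A typical loop processes the current op and then advances: do { visit(o); } while ((o = toy_traverse(top, o)) != NULL); where visit is whatever per-op work the caller needs.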

Pad Data Structures

CX_CURPAD_SAVE

Save the current pad in the given context block_loop structure. With threads only.

        void    CX_CURPAD_SAVE(struct context)
CX_CURPAD_SV

Access the SV at offset po in the saved current pad in the given context block_loop structure (can be used as an lvalue). With threads only.

        SV *    CX_CURPAD_SV(struct context, PADOFFSET po)
PAD_BASE_SV

Get the value from slot po in the base (DEPTH=1) pad of a padlist

        SV *    PAD_BASE_SV(PADLIST padlist, PADOFFSET po)
PAD_CLONE_VARS

Clone the state variables associated with running and compiling pads.

        void    PAD_CLONE_VARS(PerlInterpreter *proto_perl,
                               CLONE_PARAMS* param)
PAD_COMPNAME_FLAGS

Return the flags for the current compiling pad name at offset po. Assumes a valid slot entry.

        U32     PAD_COMPNAME_FLAGS(PADOFFSET po)
PAD_COMPNAME_GEN

The generation number of the name at offset po in the current compiling pad (lvalue).

        STRLEN  PAD_COMPNAME_GEN(PADOFFSET po)
PAD_COMPNAME_GEN_set

Sets the generation number of the name at offset po in the current compiling pad (lvalue) to gen.

        STRLEN  PAD_COMPNAME_GEN_set(PADOFFSET po, int gen)

PAD_COMPNAME_OURSTASH

Return the stash associated with an our variable. Assumes the slot entry is a valid our lexical.

        HV *    PAD_COMPNAME_OURSTASH(PADOFFSET po)
PAD_COMPNAME_PV

Return the name of the current compiling pad name at offset po. Assumes a valid slot entry.

        char *  PAD_COMPNAME_PV(PADOFFSET po)
PAD_COMPNAME_TYPE

Return the type (stash) of the current compiling pad name at offset po. Must be a valid name. Returns null if not typed.

        HV *    PAD_COMPNAME_TYPE(PADOFFSET po)
PadnameIsOUR

Whether this is an "our" variable.

        bool    PadnameIsOUR(PADNAME pn)
PadnameIsSTATE

Whether this is a "state" variable.

        bool    PadnameIsSTATE(PADNAME pn)
PadnameOURSTASH

The stash in which this "our" variable was declared.

        HV *    PadnameOURSTASH()
PadnameOUTER

Whether this entry belongs to an outer pad. Entries for which this is true are often referred to as 'fake'.

        bool    PadnameOUTER(PADNAME pn)
PadnameTYPE

The stash associated with a typed lexical. This returns the %Foo:: hash for my Foo $bar.

        HV *    PadnameTYPE(PADNAME pn)
PAD_RESTORE_LOCAL

Restore the old pad saved into the local variable opad by PAD_SAVE_LOCAL()

        void    PAD_RESTORE_LOCAL(PAD *opad)
PAD_SAVE_LOCAL

Save the current pad to the local variable opad, then make the current pad equal to npad

        void    PAD_SAVE_LOCAL(PAD *opad, PAD *npad)
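
PAD_SAVE_LOCAL and PAD_RESTORE_LOCAL are meant to be used as a pair around code that needs a different current pad. The following is a toy model of that pattern only: TOY_PAD, toy_comppad and the TOY_ macros are hypothetical stand-ins, and any additional state the real macros maintain is not reproduced.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_pad { int id; } TOY_PAD;

static TOY_PAD *toy_comppad = NULL;   /* stand-in for the current pad */

/* Save the current pad into the local 'opad', then switch to 'npad'. */
#define TOY_PAD_SAVE_LOCAL(opad, npad) \
    do { (opad) = toy_comppad; toy_comppad = (npad); } while (0)

/* Switch back to the pad remembered in 'opad'. */
#define TOY_PAD_RESTORE_LOCAL(opad) \
    do { toy_comppad = (opad); } while (0)

static int toy_current_pad_id(void)
{
    return toy_comppad ? toy_comppad->id : -1;
}

/* Run fn with 'pad' as the current pad, restoring the old one afterwards. */
static int toy_with_pad(TOY_PAD *pad, int (*fn)(void))
{
    TOY_PAD *opad;
    int result;

    assert(fn != NULL);

    TOY_PAD_SAVE_LOCAL(opad, pad);
    result = fn();
    TOY_PAD_RESTORE_LOCAL(opad);
    return result;
}
```

Because the saved pad lives in a C local rather than on a save stack, the restore has to happen explicitly on every exit path of the enclosing function.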
PAD_SAVE_SETNULLPAD

Save the current pad then set it to null.

        void    PAD_SAVE_SETNULLPAD()
PAD_SETSV

Set the slot at offset po in the current pad to sv

        SV *    PAD_SETSV(PADOFFSET po, SV* sv)
PAD_SET_CUR

Set the current pad to be pad n in the padlist, saving the previous current pad. NB currently this macro expands to a string too long for some compilers, so it's best to replace it with

    SAVECOMPPAD();
    PAD_SET_CUR_NOSAVE(padlist,n);


        void    PAD_SET_CUR(PADLIST padlist, I32 n)
PAD_SET_CUR_NOSAVE

like PAD_SET_CUR, but without the save

        void    PAD_SET_CUR_NOSAVE(PADLIST padlist, I32 n)
PAD_SV

Get the value at offset po in the current pad

        SV *    PAD_SV(PADOFFSET po)
PAD_SVl

Lightweight and lvalue version of PAD_SV. Get or set the value at offset po in the current pad. Unlike PAD_SV, does not print diagnostics with -DX. For internal use only.

        SV *    PAD_SVl(PADOFFSET po)
SAVECLEARSV

Clear the pointed to pad value on scope exit. (i.e. the runtime action of my)

        void    SAVECLEARSV(SV **svp)
SAVECOMPPAD

save PL_comppad and PL_curpad

        void    SAVECOMPPAD()
SAVEPADSV

Save a pad slot (used to restore after an iteration)

XXX DAPM it would make more sense to make the arg a PADOFFSET

        void    SAVEPADSV(PADOFFSET po)

Per-Interpreter Variables

PL_DBsingle

When Perl is run in debugging mode, with the -d switch, this SV is a boolean which indicates whether subs are being single-stepped. Single-stepping is automatically turned on after every step. This is the C variable which corresponds to Perl's $DB::single variable. See "PL_DBsub".

        SV *    PL_DBsingle
PL_DBsub

When Perl is run in debugging mode, with the -d switch, this GV contains the SV which holds the name of the sub being debugged. This is the C variable which corresponds to Perl's $DB::sub variable. See "PL_DBsingle".

        GV *    PL_DBsub
PL_DBtrace

Trace variable used when Perl is run in debugging mode, with the -d switch. This is the C variable which corresponds to Perl's $DB::trace variable. See "PL_DBsingle".

        SV *    PL_DBtrace
PL_dowarn

The C variable that roughly corresponds to Perl's $^W warning variable. However, $^W is treated as a boolean, whereas PL_dowarn is a collection of flag bits.

        U8      PL_dowarn
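
The difference between a boolean view and a set of flag bits can be sketched in standalone C. This is only an illustration: the bit names and values below (WARN_ON, WARN_ALL_ON, WARN_ALL_OFF) and both helper functions are made up for the example and are not the definitions Perl uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
enum {
    WARN_ON      = 1 << 0,   /* the bit a boolean view would report */
    WARN_ALL_ON  = 1 << 1,   /* some other, independent setting */
    WARN_ALL_OFF = 1 << 2    /* another independent setting */
};

/* Boolean view: collapses the flag byte to a single truth value. */
int caret_w_value(uint8_t dowarn)
{
    return (dowarn & WARN_ON) != 0;
}

/* Assigning through the boolean view changes one bit and keeps the rest. */
uint8_t set_caret_w(uint8_t dowarn, int on)
{
    assert(on == 0 || on == 1);
    return on ? (uint8_t)(dowarn | WARN_ON)
              : (uint8_t)(dowarn & ~WARN_ON);
}
```

In this model, several distinct flag bytes map to the same boolean value, which is why the correspondence is only rough.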
PL_last_in_gv

The GV which was last used for a filehandle input operation. (<FH>)

        GV*     PL_last_in_gv
PL_ofsgv

The glob containing the output field separator: *, in Perl space (the glob that holds $,).

        GV*     PL_ofsgv
PL_rs

The input record separator - $/ in Perl space.

        SV*     PL_rs

Stack Manipulation Macros

djSP

Declare Just SP. This is actually identical to dSP, and declares a local copy of perl's stack pointer, available via the SP macro. See "SP" in perlapi. (Available for backward source code compatibility with the old (Perl 5.005) thread model.)

                djSP;
LVRET

True if this op will be the return value of an lvalue subroutine.

SV Manipulation Functions

An SV (or AV, HV, etc.) is allocated in two parts: the head (struct sv, av, hv...) contains type and reference count information, and for many types, a pointer to the body (struct xrv, xpv, xpviv...), which contains fields specific to each type. Some types store all they need in the head, so don't have a body.

In all but the most memory-paranoid configurations (e.g. PURIFY), heads and bodies are allocated out of arenas, which by default are approximately 4K chunks of memory parcelled up into N heads or bodies. Sv-bodies are allocated by their sv-type, guaranteeing size consistency needed to allocate safely from arrays.

For SV-heads, the first slot in each arena is reserved, and holds a link to the next arena, some flags, and a note of the number of slots. Snaked through each arena chain is a linked list of free items; when this becomes empty, an extra arena is allocated and divided up into N items which are threaded into the free list.
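
The head-arena layout just described can be sketched in standalone C. This is a minimal model under stated assumptions, not Perl's implementation: the types slot and arena_pool, the pool_* functions and the SLOTS_PER_ARENA size are all hypothetical names chosen for the example, and freed slots are marked with an all-ones type field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SLOTS_PER_ARENA 8      /* hypothetical; real arenas are sized in bytes */
#define TYPE_FREED (~0u)       /* "all ones" marks a slot on the free list */

/* Hypothetical stand-in for an SV head. */
typedef struct slot {
    struct slot *next;         /* free-list link; in slot 0, the next arena */
    unsigned     type;
    unsigned     refcnt;       /* in slot 0, reused as the slot count */
} slot;

typedef struct {
    slot *arenas;              /* chain of arenas, newest first */
    slot *free_list;           /* free slots threaded through the arenas */
} arena_pool;

/* Allocate one arena, reserve slot 0 for bookkeeping, and thread the
 * remaining slots onto the free list. Returns 0 if malloc fails. */
int pool_add_arena(arena_pool *p)
{
    slot *a = malloc(SLOTS_PER_ARENA * sizeof *a);
    if (a == NULL)
        return 0;
    a[0].next   = p->arenas;
    a[0].type   = TYPE_FREED;
    a[0].refcnt = SLOTS_PER_ARENA;
    p->arenas   = a;
    for (size_t i = 1; i < SLOTS_PER_ARENA; i++) {
        a[i].type    = TYPE_FREED;
        a[i].refcnt  = 0;
        a[i].next    = p->free_list;
        p->free_list = &a[i];
    }
    return 1;
}

/* Take a slot off the free list, adding an arena first if it is empty. */
slot *pool_new(arena_pool *p)
{
    if (p->free_list == NULL && !pool_add_arena(p))
        return NULL;
    slot *s = p->free_list;
    p->free_list = s->next;
    s->next   = NULL;
    s->type   = 0;
    s->refcnt = 1;
    return s;
}

/* Return a slot to the free list. */
void pool_del(arena_pool *p, slot *s)
{
    assert(s->type != TYPE_FREED);   /* catches a double free */
    s->type   = TYPE_FREED;
    s->refcnt = 0;
    s->next   = p->free_list;
    p->free_list = s;
}

/* Count arenas by following the link kept in each slot 0. */
size_t pool_arena_count(const arena_pool *p)
{
    size_t n = 0;
    for (const slot *a = p->arenas; a != NULL; a = a->next)
        n++;
    return n;
}

/* Free every arena; all slots become invalid afterwards. */
void pool_free_arenas(arena_pool *p)
{
    while (p->arenas != NULL) {
        slot *next = p->arenas[0].next;
        free(p->arenas);
        p->arenas = next;
    }
    p->free_list = NULL;
}
```

With eight slots per arena, seven are usable; the eighth request finds the free list empty and triggers a second arena.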

SV-bodies are similar, but they use arena-sets by default, which separate the link and info from the arena itself, and reclaim the 1st slot in the arena. SV-bodies are further described later.

The following global variables are associated with arenas:

 PL_sv_arenaroot     pointer to list of SV arenas
 PL_sv_root          pointer to list of free SV structures

 PL_body_arenas      head of linked-list of body arenas
 PL_body_roots[]     array of pointers to list of free bodies of svtype
                     arrays are indexed by the svtype needed

A few special SV heads are not allocated from an arena, but are instead directly created in the interpreter structure, eg PL_sv_undef. The size of arenas can be changed from the default by setting PERL_ARENA_SIZE appropriately at compile time.

The SV arena serves the secondary purpose of allowing still-live SVs to be located and destroyed during final cleanup.

At the lowest level, the macros new_SV() and del_SV() grab and free an SV head. (If debugging with -DD, del_SV() calls the function S_del_sv() to return the SV to the free list with error checking.) new_SV() calls more_sv() / sv_add_arena() to add an extra arena if the free list is empty. SVs in the free list have their SvTYPE field set to all ones.

At the time of very final cleanup, sv_free_arenas() is called from perl_destruct() to physically free all the arenas allocated since the start of the interpreter.

The function visit() scans the SV arenas list, and calls a specified function for each SV it finds which is still live, i.e. which has an SvTYPE other than all 1's and a non-zero SvREFCNT. visit() is used by the following functions (specified as [function that calls visit()] / [function called by visit() for each SV]):

    sv_report_used() / do_report_used()
                        dump all remaining SVs (debugging aid)

    sv_clean_objs() / do_clean_objs(),do_clean_named_objs(),
                      do_clean_named_io_objs(),do_curse()
                        Attempt to free all objects pointed to by RVs,
                        try to do the same for all objects indirectly
                        referenced by typeglobs too, and then do a
                        final sweep, cursing any objects that remain.
                        Called once from perl_destruct(), prior to
                        calling sv_clean_all() below.

    sv_clean_all() / do_clean_all()
                        SvREFCNT_dec(sv) each remaining SV, possibly
                        triggering an sv_free(). It also sets the
                        SVf_BREAK flag on the SV to indicate that the
                        refcnt has been artificially lowered, and thus
                        stopping sv_free() from giving spurious warnings
                        about SVs which unexpectedly have a refcnt
                        of zero.  Called repeatedly from perl_destruct()
                        until there are no SVs left.
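
The visit-and-sweep pattern above can be sketched in standalone C. This is a simplified model, not the code in sv.c: the node type, node_dec(), the array-based arena and this version of clean_all() are hypothetical. A slot is treated as live when its type is not all ones and its refcnt is non-zero.

```c
#include <assert.h>
#include <stddef.h>

#define TYPE_FREED (~0u)

/* Hypothetical stand-in for an SV that may hold one reference. */
typedef struct node {
    unsigned     type;
    unsigned     refcnt;
    struct node *ref;      /* the node this one refers to, or NULL */
} node;

typedef void (*visit_fn)(node *);

/* Drop one reference. When the count reaches zero the node is freed and
 * releases the reference it holds. Already-freed nodes are ignored, so
 * reference cycles do not recurse forever. */
void node_dec(node *n)
{
    if (n->type == TYPE_FREED || n->refcnt == 0)
        return;
    if (--n->refcnt == 0) {
        node *held = n->ref;
        n->ref  = NULL;
        n->type = TYPE_FREED;
        if (held != NULL)
            node_dec(held);
    }
}

/* Call fn on every live slot; return how many slots were visited. */
size_t visit(node *arena, size_t n, visit_fn fn)
{
    assert(fn != NULL && (arena != NULL || n == 0));
    size_t visited = 0;
    for (size_t i = 0; i < n; i++) {
        if (arena[i].type != TYPE_FREED && arena[i].refcnt != 0) {
            fn(&arena[i]);
            visited++;
        }
    }
    return visited;
}

/* One sweep: decrement every live node. A caller repeats this until it
 * returns 0. */
size_t clean_all(node *arena, size_t n)
{
    return visit(arena, n, node_dec);
}
```

A caller would loop with `while (clean_all(arena, n) != 0) ;`. A node whose count is above one survives the first sweep, which is why a single pass is not enough.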
anonymise_cv_maybe

We're about to free a GV which has a CV that refers back to us. If that CV will outlive us, make it anonymous (i.e. fix up its CvGV field).

        void    anonymise_cv_maybe(GV *gv, CV *cv)
sv_2num

NOTE: this function is experimental and may change or be removed without notice.

Return an SV with the numeric value of the source SV, doing any necessary reference or overload conversion. The caller is expected to have handled get-magic already.

        SV*     sv_2num(SV *const sv)
sv_add_arena

Given a chunk of memory, link it to the head of the list of arenas, and split it into a list of free SVs.

        void    sv_add_arena(char *const ptr, const U32 size,
                             const U32 flags)
sv_add_backref

Give tsv backref magic if it hasn't already got it, then push a back-reference to sv onto the array associated with the backref magic.

As an optimisation, if there's only one backref and it's not an AV, store it directly in the HvAUX or mg_obj slot, avoiding the need to allocate an AV. (Whether the slot holds an AV tells us whether this is active.)

A discussion about the backreferences array and its refcount:

The AV holding the backreferences is pointed to either as the mg_obj of PERL_MAGIC_backref, or in the specific case of a HV, from the xhv_backreferences field. The array is created with a refcount of 2. This means that if during global destruction the array gets picked on before its parent to have its refcount decremented by the random zapper, it won't actually be freed, meaning it's still there for when its parent gets freed.

When the parent SV is freed, the extra ref is killed by "sv_kill_backrefs" in perlintern. The other ref is killed, in the case of magic, by "mg_free" in perlapi / MGf_REFCOUNTED, or for a hash, by "hv_kill_backrefs" in perlintern.

When a single backref SV is stored directly, it is not reference counted.

        void    sv_add_backref(SV *const tsv, SV *const sv)
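
The single-backref optimisation can be sketched in standalone C. This is a rough model only: the backrefs struct and the backrefs_* functions are hypothetical, plain pointers stand in for SVs, and the extra rule about the stored item not itself being an AV is not modelled. As in the description, the stored pointers are not reference counted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical container: one pointer stored inline, or a heap array. */
typedef struct {
    void  *single;     /* valid while list == NULL and count == 1 */
    void **list;       /* allocated once a second backref arrives */
    size_t count;
    size_t cap;
} backrefs;

/* Record a back-reference. Returns 0 on allocation failure. */
int backrefs_add(backrefs *b, void *ref)
{
    assert(b != NULL && ref != NULL);
    if (b->list == NULL && b->count == 0) {
        b->single = ref;               /* first backref: no allocation */
        b->count  = 1;
        return 1;
    }
    if (b->list == NULL) {             /* second backref: upgrade to array */
        void **l = malloc(4 * sizeof *l);
        if (l == NULL)
            return 0;
        l[0]      = b->single;
        b->single = NULL;
        b->list   = l;
        b->cap    = 4;
    } else if (b->count == b->cap) {   /* array full: double it */
        void **l = realloc(b->list, 2 * b->cap * sizeof *l);
        if (l == NULL)
            return 0;
        b->list = l;
        b->cap *= 2;
    }
    b->list[b->count++] = ref;
    return 1;
}

/* Remove one back-reference. Returns 1 if it was found. */
int backrefs_del(backrefs *b, void *ref)
{
    assert(b != NULL);
    if (b->list == NULL) {
        if (b->count == 1 && b->single == ref) {
            b->single = NULL;
            b->count  = 0;
            return 1;
        }
        return 0;
    }
    for (size_t i = 0; i < b->count; i++) {
        if (b->list[i] == ref) {
            b->list[i] = b->list[--b->count];   /* order is not preserved */
            return 1;
        }
    }
    return 0;
}

/* Drop all back-references and release the array, if any. */
void backrefs_free(backrefs *b)
{
    free(b->list);
    b->list   = NULL;
    b->single = NULL;
    b->count  = 0;
    b->cap    = 0;
}
```

Whether list is NULL plays the role of "whether the slot holds an AV": it tells the code which representation is active.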
sv_clean_all

Decrement the refcnt of each remaining SV, possibly triggering a cleanup. This function may have to be called multiple times to free SVs which are in complex self-referential hierarchies.

        Size_t  sv_clean_all()
sv_del_backref

Delete a back-reference to ourselves from the backref magic associated with the SV we point to.

        void    sv_del_backref(SV *const tsv, SV *const sv)
sv_free2

NOTE: this function is experimental and may change or be removed without notice.

Private helper function for SvREFCNT_dec(). Called with refcnt set to the original SvREFCNT(sv), where refcnt is 0 or 1.

        void    sv_free2(SV *const sv, const U32 refcnt)
sv_free_arenas

Deallocate the memory used by all arenas. Note that all the individual SV heads and bodies within the arenas must already have been freed.

        void    sv_free_arenas()
sv_kill_backrefs

NOTE: this function is experimental and may change or be removed without notice.

Delete all back-references to ourselves from the backreferences array.

        void    sv_kill_backrefs(SV *const sv, AV *const av)
sv_len_utf8_nomg

Returns the number of characters in the string in an SV, counting each multi-byte UTF-8 sequence as a single character. Ignores get magic.

        STRLEN  sv_len_utf8_nomg(SV *const sv)
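
Counting characters rather than bytes can be sketched in standalone C. This is not how the function above is implemented: utf8_char_count is a hypothetical helper that assumes an ASCII platform and well-formed UTF-8 input, and it counts every byte that is not a continuation byte.

```c
#include <assert.h>
#include <stddef.h>

/* Count characters in a buffer assumed to hold well-formed UTF-8.
 * Every byte that is not a continuation byte (10xxxxxx) starts a
 * new character. */
size_t utf8_char_count(const unsigned char *s, size_t len)
{
    assert(s != NULL || len == 0);
    size_t chars = 0;
    for (size_t i = 0; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}
```

For the six bytes of "h\xC3\xA9llo" this yields five characters, because the two-byte sequence counts once.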
SvTHINKFIRST

A quick flag check to see whether an sv should be passed to sv_force_normal to be "downgraded" before SvIVX or SvPVX can be modified directly.

For example, if your scalar is a reference and you want to modify the SvIVX slot, you can't just do SvROK_off, as that will leak the referent.

This is used internally by various sv-modifying functions, such as sv_setsv, sv_setiv and sv_pvn_force.

One case that this does not handle is a gv without SvFAKE set. After

    if (SvTHINKFIRST(gv)) sv_force_normal(gv);

it will still be a gv.

SvTHINKFIRST sometimes produces false positives. In those cases sv_force_normal does nothing.

        U32     SvTHINKFIRST(SV *sv)

Unicode Support

find_uninit_var

NOTE: this function is experimental and may change or be removed without notice.

Find the name of the undefined variable (if any) that caused the operator to issue a "Use of uninitialized value" warning. If match is true, only return a name if its value matches uninit_sv. So roughly speaking, if a unary operator (such as OP_COS) generates a warning, then following the direct child of the op may yield an OP_PADSV or OP_GV that gives the name of the undefined variable. On the other hand, with OP_ADD there are two branches to follow, so we only print the variable name if we get an exact match. desc_p points to a string pointer holding the description of the op. This may be updated if needed.

The name is returned as a mortal SV.

Assumes that PL_op is the OP that originally triggered the error, and that PL_comppad/PL_curpad points to the currently executing pad.

        SV*     find_uninit_var(const OP *const obase,
                                const SV *const uninit_sv,
                                bool match, const char **desc_p)
isSCRIPT_RUN

Returns a bool as to whether or not the sequence of bytes from s up to but not including send form a "script run". utf8_target is TRUE iff the sequence starting at s is to be treated as UTF-8. To be precise, except for two degenerate cases given below, this function returns TRUE iff all code points in it come from any combination of three "scripts" given by the Unicode "Script Extensions" property: Common, Inherited, and possibly one other. Additionally all decimal digits must come from the same consecutive sequence of 10.

For example, if all the characters in the sequence are Greek, or Common, or Inherited, this function will return TRUE, provided any decimal digits in it are from the same block of digits in Common. (These are the ASCII digits "0".."9" and additionally a block for full width forms of these, and several others used in mathematical notation.) For scripts (unlike Greek) that have their own digits defined this will accept either digits from that set or from one of the Common digit sets, but not a combination of the two. Some scripts, such as Arabic, have more than one set of digits. All digits must come from the same set for this function to return TRUE.

*ret_script, if ret_script is not NULL, will on return of TRUE contain the script found, using the SCX_enum typedef. Its value will be SCX_INVALID if the function returns FALSE.

If the sequence is empty, TRUE is returned, but *ret_script (if asked for) will be SCX_INVALID.

If the sequence contains a single code point which is unassigned to a character in the version of Unicode being used, the function will return TRUE, and the script will be SCX_Unknown. Any other combination of unassigned code points in the input sequence will result in the function treating the input as not being a script run.

The returned script will be SCX_Inherited iff all the code points in it are from the Inherited script.

Otherwise, the returned script will be SCX_Common iff all the code points in it are from the Inherited or Common scripts.

        bool    isSCRIPT_RUN(const U8 *s, const U8 *send,
                             const bool utf8_target)
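
The digit rule alone (all decimal digits must come from the same consecutive sequence of 10) can be sketched in standalone C. This sketch covers only that rule, not script detection, and it works on already-decoded code points. The function names are hypothetical, and the table lists just a handful of digit runs for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Code point of the zero digit in a few decimal-digit runs.
 * Illustrative subset only; Unicode defines many more. */
static const uint32_t digit_zeros[] = {
    0x0030,   /* ASCII */
    0x0660,   /* Arabic-Indic */
    0x06F0,   /* Extended Arabic-Indic */
    0x0966,   /* Devanagari */
    0xFF10    /* Fullwidth */
};

/* Return the zero of the run containing cp, or 0 if cp is not a digit
 * in the table. */
uint32_t digit_run_zero(uint32_t cp)
{
    for (size_t i = 0; i < sizeof digit_zeros / sizeof digit_zeros[0]; i++) {
        if (cp >= digit_zeros[i] && cp <= digit_zeros[i] + 9)
            return digit_zeros[i];
    }
    return 0;
}

/* Return 1 if every digit in cps[0..n-1] belongs to the same run of 10. */
int digits_from_one_run(const uint32_t *cps, size_t n)
{
    assert(cps != NULL || n == 0);
    uint32_t seen = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t zero = digit_run_zero(cps[i]);
        if (zero == 0)
            continue;              /* not a digit: ignored by this rule */
        if (seen == 0)
            seen = zero;           /* first digit fixes the run */
        else if (seen != zero)
            return 0;              /* digit from a different run */
    }
    return 1;
}
```

Mixing the two Arabic digit sets fails this check, matching the note that all digits must come from the same set.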
is_utf8_non_invariant_string

Returns TRUE if "is_utf8_invariant_string" in perlapi returns FALSE for the first len bytes of the string s, but they are, nonetheless, legal Perl-extended UTF-8; otherwise returns FALSE.

A TRUE return means that at least one code point represented by the sequence either is a wide character not representable as a single byte, or the representation differs depending on whether the sequence is encoded in UTF-8 or not.

See also "is_utf8_invariant_string" in perlapi, "is_utf8_string" in perlapi

        bool    is_utf8_non_invariant_string(const U8* const s,
                                             STRLEN len)
report_uninit

Print appropriate "Use of uninitialized value" warning.

        void    report_uninit(const SV *uninit_sv)
utf8_add_script

Adds the given ASCIIZ script to %utf8::SCRIPTS, and initializes it lazily.

        void    utf8_add_script(const char* script)
utf8_check_script

NOTE: this function is experimental and may change or be removed without notice.

Check whether the script property of the Unicode character was declared via use utf8 'Script'. If this character is the first from a valid script that is not excluded, add the script to the list of allowed scripts; otherwise raise an error.

Note that the argument is guaranteed to be not of the Common or Latin script property.

        void    utf8_check_script(const U8 *s)
utf8_error_script

If this character is the first non-Latin or non-Common character, and no other scripts were declared, and the script is either a member of %VALID_SCRIPTS or is not a member of %utf8::EXCLUDED_SCRIPTS, then add the script to the list of allowed scripts; otherwise raise an error.

%utf8::EXCLUDED_SCRIPTS maps the Moderately Restrictive Level for identifiers, i.e. it allows Recommended scripts except Cyrillic and Greek.

Also allow Latin + :Japanese, Latin + :Hanb and Latin + :Korean, but always only the first encounter of such a combination.

Use an extra error message for %utf8::LIMITED_SCRIPTS errors, as this is a new restriction since v5.29.2c.

Note that the argument is guaranteed to be not of the Common or Latin script property.

        void    utf8_error_script(const U8 *s,
                                  const char* script, UV uv)
uvuni_get_script

Returns the script property of the Unicode character as a string.

        char*   uvuni_get_script(const UV uv)
variant_under_utf8_count

This function looks at the sequence of bytes between s and e, which are assumed to be encoded in ASCII/Latin1, and returns how many of them would change should the string be translated into UTF-8. Due to the nature of UTF-8, each of these would occupy two bytes instead of the single one in the input string. Thus, this function returns the precise number of bytes the string would expand by when translated to UTF-8.

Unlike most of the other functions that have utf8 in their name, the input to this function is NOT a UTF-8-encoded string. The function name is slightly odd to emphasize this.

This function is internal to Perl because khw thinks that any XS code that would want this is probably operating too close to the internals. Presenting a valid use case could change that.

See also "is_utf8_invariant_string" in perlapi and "is_utf8_invariant_string_loc" in perlapi.

        Size_t  variant_under_utf8_count(const U8* const s,
                                         const U8* const e)
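
The counting described for variant_under_utf8_count can be sketched in standalone C. This is a plain byte loop under the assumption of an ASCII platform, where the bytes that change under translation to UTF-8 are exactly those with the high bit set. The names variant_count and utf8_length_of_latin1 are hypothetical, and the real function may be implemented quite differently.

```c
#include <assert.h>
#include <stddef.h>

/* Count the bytes in [s, e) that would need two bytes in UTF-8.
 * Assumes the input is ASCII/Latin1 on an ASCII platform. */
size_t variant_count(const unsigned char *s, const unsigned char *e)
{
    assert(s != NULL && e != NULL && s <= e);
    size_t count = 0;
    for (; s < e; s++) {
        if (*s >= 0x80)
            count++;
    }
    return count;
}

/* Length the same string would have once translated to UTF-8:
 * each variant byte grows by exactly one byte. */
size_t utf8_length_of_latin1(const unsigned char *s, const unsigned char *e)
{
    return (size_t)(e - s) + variant_count(s, e);
}
```

A caller could use the second helper to size a destination buffer before converting.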

Warning and Dieing

find_script

Searches for the executable script.

If dosearch is true (i.e. the -S switch was given) and scriptname does not contain path delimiters, search the PATH for scriptname.

If SEARCH_EXTS is also defined, will look for each scriptname{SEARCH_EXTS} whenever scriptname is not found while searching the PATH.

Assuming SEARCH_EXTS is ".foo",".bar",NULL, PATH search proceeds as follows:

  If DOSISH or VMSISH:
    + look for ./scriptname{,.foo,.bar}
    + search the PATH for scriptname{,.foo,.bar}

  If !DOSISH:
    + look *only* in the PATH for scriptname{,.foo,.bar} (note
      this will not look in '.' if it's not in the PATH)

This is called by "open_script" when -e was not specified.

        char*   find_script(
                    const char *scriptname, bool dosearch,
                    const char *const *const search_ext,
                    I32 flags
                )
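
The order in which the names scriptname{,.foo,.bar} are tried can be sketched in standalone C. This covers only how candidate names are built from a NULL-terminated extension list. The directory walk over PATH and the DOSISH/VMSISH distinction are left out, and script_candidate is a hypothetical helper, not part of Perl.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write the i-th candidate name for scriptname into buf.
 * i == 0 is the bare name; i >= 1 appends exts[i-1].
 * exts must be a NULL-terminated array.
 * Returns 1 on success, 0 when the candidates are exhausted or buf is
 * too small. */
int script_candidate(const char *scriptname, const char *const *exts,
                     size_t i, char *buf, size_t buflen)
{
    assert(scriptname != NULL && exts != NULL && buf != NULL);
    const char *ext = "";
    if (i > 0) {
        /* Walk to exts[i-1] without reading past the terminator. */
        for (size_t k = 0; k < i; k++) {
            if (exts[k] == NULL)
                return 0;
        }
        ext = exts[i - 1];
    }
    size_t nlen = strlen(scriptname);
    size_t elen = strlen(ext);
    if (nlen + elen + 1 > buflen)
        return 0;
    memcpy(buf, scriptname, nlen);
    memcpy(buf + nlen, ext, elen + 1);   /* also copies the final NUL */
    return 1;
}
```

A search loop would call this with i = 0, 1, 2, ... for each directory and stop at the first candidate that exists.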
vwarner_security

The vwarner variant which adds a security-specific prefix and suffixes, and ignores any $SIG{__WARN__} hooks.

        void    vwarner_security(U32 err, const char* pat,
                                 va_list* args)

Undocumented functions

The following functions are currently undocumented. If you use one of them, you may wish to consider creating and submitting documentation for it.

PerlIO_restore_errno
PerlIO_save_errno
PerlLIO_dup2_cloexec
PerlLIO_dup_cloexec
PerlLIO_open3_cloexec
PerlLIO_open_cloexec
PerlProc_pipe_cloexec
PerlSock_accept_cloexec
PerlSock_socket_cloexec
PerlSock_socketpair_cloexec
Slab_to_ro
Slab_to_rw
_add_range_to_invlist
_byte_dump_string
_get_encoding
_get_regclass_nonbitmap_data
_inverse_folds
_invlistEQ
_invlist_array_init
_invlist_contains_cp
_invlist_dump
_invlist_intersection
_invlist_intersection_maybe_complement_2nd
_invlist_invert
_invlist_len
_invlist_subtract
_invlist_union
_invlist_union_maybe_complement_2nd
_is_grapheme
_is_in_locale_category
_mem_collxfrm
_new_invlist
_new_invlist_C_array
_setup_canned_invlist
_to_fold_latin1
_to_upper_title_latin1
_warn_problematic_locale
abort_execution
add_cp_to_invlist
allocmy
amagic_is_enabled
append_utf8_from_native_byte
apply
av_extend_guts
av_nonelem
boot_core_PerlIO
boot_core_UNIVERSAL
boot_core_mro
boot_core_xsutils
cando
check_utf8_print
ck_entersub_args_core
ck_join
ck_null
ck_open
ck_prototype
ck_refassign
ck_repeat
ck_require
ck_return
ck_select
ck_shift
ck_sort
ck_split
ck_stringify
compute_EXACTish
core_type_name
croak_caller
croak_no_mem
croak_popstack
croak_shaped_array
ctz
current_re_engine
custom_op_get_field
cv_ckproto_len_flags
cv_clone_into
cv_const_sv_or_av
cv_undef_flags
cvgv_from_hek
cvgv_set
cvstash_set
deb_stack_all
defelem_target
delimcpy_no_escape
die_unwind
do_aexec
do_aexec5
do_eof
do_exec
do_exec3
do_ipcctl
do_ipcget
do_msgrcv
do_msgsnd
do_ncmp
do_open6
do_open_raw
do_print
do_readline
do_seek
do_semop
do_shmio
do_sysseek
do_tell
do_trans
do_vecget
do_vecset
do_vop
does_utf8_overflow
dofile
drand48_init_r
drand48_r
dtrace_probe_call
dtrace_probe_glob
dtrace_probe_hash
dtrace_probe_load
dtrace_probe_op
dtrace_probe_phase
dump_sv_child
dup_warnings
emulate_cop_io
feature_is_enabled
fields_padoffset
find_lexical_cv
find_runcv_where
find_rundefsv2
foldEQ_latin1_s2_folded
form_short_octal_warning
free_tied_hv_pool
get_and_check_backslash_N_name
get_debug_opts
get_hash_seed
get_invlist_iter_addr
get_invlist_offset_addr
get_invlist_previous_index_addr
get_no_modify
get_opargs
get_re_arg
getenv_len
grok_bslash_c
grok_bslash_o
grok_bslash_x
gv_fetchmeth_internal
gv_override
gv_setref
gv_stashpvn_internal
gv_stashsvpvn_cached
handle_named_backref
handle_user_defined_property
hfree_next_entry
hv_pushkv
init_argv_symbols
init_constants
init_dbargs
init_debugger
init_named_cv
init_uniprops
invlist_array
invlist_clear
invlist_clone
invlist_highest
invlist_is_iterating
invlist_iterfinish
invlist_iterinit
invlist_max
invlist_previous_index
invlist_set_len
invlist_set_previous_index
invlist_trim
io_close
isFF_OVERLONG
isFOO_lc
is_invlist
is_utf8_common
is_utf8_common_with_len
is_utf8_overlong_given_start_byte_ok
isinfnansv
keyword
keyword_plugin_standard
magic_clear_all_env
magic_cleararylen_p
magic_clearenv
magic_clearisa
magic_clearpack
magic_clearsig
magic_copycallchecker
magic_existspack
magic_freearylen_p
magic_freeovrld
magic_get
magic_getarylen
magic_getdebugvar
magic_getdefelem
magic_getnkeys
magic_getpack
magic_getpos
magic_getsig
magic_getsubstr
magic_gettaint
magic_getuvar
magic_getvec
magic_killbackrefs
magic_nextpack
magic_regdata_cnt
magic_regdatum_get
magic_regdatum_set
magic_scalarpack
magic_set
magic_set_all_env
magic_setarylen
magic_setcollxfrm
magic_setdbline
magic_setdebugvar
magic_setdefelem
magic_setenv
magic_setisa
magic_setlvref
magic_setmglob
magic_setnkeys
magic_setnonelem
magic_setpack
magic_setpos
magic_setregexp
magic_setsig
magic_setsubstr
magic_settaint
magic_setutf8
magic_setuvar
magic_setvec
magic_sizepack
magic_wipepack
malloc_good_size
malloced_size
mem_collxfrm
mem_log_alloc
mem_log_free
mem_log_realloc
mg_find_mglob
mode_from_discipline
more_bodies
mro_meta_dup
mro_meta_init
munge_qwlist_to_paren_list
my_clearenv
my_lstat_flags
my_memrchr
my_mkostemp
my_mkstemp
my_mkstemp_cloexec
my_stat_flags
my_strerror
my_unexec
newGP
newMETHOP_internal
newSTUB
newSVavdefelem
newXS_deffile
new_entersubop
new_warnings_bitfield
nextargv
noperl_die
oopsAV
oopsHV
op_destroy
op_refcnt_dec
op_refcnt_inc
op_typed
opmethod_stash
package_version
pad_add_weakref
padlist_dump
padlist_store
padname_free
padnamelist_free
parse_unicode_opts
parse_uniprop_string
parser_free
parser_free_nexttoke_ops
pmruntime
pn_peek
pnl_dump
populate_isa
ptr_hash
qerror
re_exec_indentf
re_indentf
re_printf
reg_named_buff
reg_named_buff_iter
reg_numbered_buff_fetch
reg_numbered_buff_length
reg_numbered_buff_store
reg_qr_package
reg_skipcomment
regcurly
report_evil_fh
report_wrongway_fh
rsignal_restore
rsignal_save
rxres_save
same_dirent
save_strlen
save_to_buffer
scalar
set_caret_X
set_numeric_standard
set_numeric_underlying
set_padlist
setfd_cloexec
setfd_cloexec_for_nonsysfd
setfd_cloexec_or_inhexec_by_sysfdness
setfd_inhexec
setfd_inhexec_for_sysfd
should_warn_nl
sighandler
skipspace_flags
ssc_add_range
ssc_clear_locale
ssc_cp_and
ssc_intersection
ssc_union
sub_crush_depth
sv_buf_to_ro
sv_magicext_mglob
sv_mortalcopy_flags
sv_only_taint_gmagic
sv_or_pv_pos_u2b
sv_resetpvn
sv_sethek
sv_setsv_cow
sv_unglob
swash_fetch
swash_init
tied_method
tmps_grow_p
translate_substr_offsets
try_amagic_bin
try_amagic_un
uiv_2buf
utilize
varname
vivify_defelem
vivify_ref
wait4pid
was_lvalue_sub
watch
win32_croak_not_implemented
write_to_stderr
xs_boot_epilog
yyerror
yyerror_pv
yyerror_pvn
yyparse
yyquit
yyunlex

AUTHORS

The autodocumentation system was originally added to the Perl core by Benjamin Stuhl. Documentation is by whoever was kind enough to document their functions.

SEE ALSO

perlguts, perlapi, perlapio