Chapter 20 PHP and Zend Engine Internals
• opcode 1: Here the ZEND_ASSIGN handler assigns the value hello to Register 0 (the
pointer to $hi). Register 1 is also assigned to, but it is never used. Register 1
would be utilized if the assignment were being used in an expression like this:
if($hi = 'hello'){}
• opcode 2: Here you re-fetch the value of $hi, now into Register 2. You use the
op ZEND_FETCH_R because the variable is used in a read-only context.
• opcode 3: ZEND_ECHO prints the value of Register 2 (or, more accurately, sends it
to the output buffering system). echo (and the closely related print) are operations
that are built in to PHP itself, as opposed to functions that need to be called.
• opcode 4: ZEND_RETURN is called, setting the return value of the script to 1. Even
though return is not explicitly called in the script, every script contains an
implicit return 1, which is executed if the script completes without return being
explicitly called.
Here is a more complex example:
<?php
$hi = 'hello';
echo strtoupper($hi);
?>
The intermediate code dump looks similar:
opnum  line  opcode         op1           op2      result
0      2     ZEND_FETCH_W   "hi"                   '0
1      2     ZEND_ASSIGN    '0            "hello"  '0
2      3     ZEND_FETCH_R   "hi"                   '2
3      3     ZEND_SEND_VAR  '2
4      3     ZEND_DO_FCALL  "strtoupper"           '3
5      3     ZEND_ECHO      '3
6      5     ZEND_RETURN    1
Notice the differences between these two scripts.
• opcode 3: The ZEND_SEND_VAR op pushes a pointer to Register 2 (the variable
$hi) onto the argument stack. This argument stack is how the called function
receives its arguments. Because the function called here is an internal function
(implemented in C and not in PHP), its operation is completely hidden from PHP.
Later you will see how a userspace function receives arguments.
• opcode 4: The ZEND_DO_FCALL op calls the function strtoupper and indicates
that Register 3 is where its return value should be set.
Here is an example of a trivial PHP script that implements conditional flow control:
<?php
$i = 0;
while($i < 5) {
    $i++;
}
?>
opnum  line  opcode           op1   op2  result
0      2     ZEND_FETCH_W     "i"        '0
1      2     ZEND_ASSIGN      '0    0    '0
2      3     ZEND_FETCH_R     "i"        '2
3      3     ZEND_IS_SMALLER  '2    5    '2
4      3     ZEND_JMPZ        $3
5      4     ZEND_FETCH_RW    "i"        '4
6      4     ZEND_POST_INC    '4         '4
7      4     ZEND_FREE        $5
8      5     ZEND_JMP
9      7     ZEND_RETURN      1
Note here that a ZEND_JMPZ op sets a conditional branch point (jumping past the end of
the loop once $i is greater than or equal to 5) and a ZEND_JMP op brings you back to the
top of the loop to reevaluate the condition at the end of each iteration.
Observe the following in these examples:
• Six registers are allocated and used in this code, even though only two registers are
ever used at any one time. Register reuse is not implemented in PHP. For large
scripts, thousands of registers may be allocated.
• No real optimization is performed on the code. This postincrement:
$i++;
could be optimized to a pre-increment:
++$i;
because it is used in a void context (that is, it is not used in an expression where
the former value of $i needs to be stored). This would save you having to stash its
value in a register.
• The jump oplines are not displayed in the debugger. This is really the fault of the
assembly dumper. The Zend Engine leaves ops used for some internal purposes
marked as unused.
Before we move on, there is one last important example to look at. The example showing
function calls earlier in this chapter uses strtoupper, which is a built-in function.
Calling a function written in PHP looks similar to calling a built-in function:
<?php
function hello($name) {
    echo "hello\n";
}
hello("George");
?>
opnum  line  opcode         op1       op2  result
0      2     ZEND_NOP
1      5     ZEND_SEND_VAL  "George"
2      5     ZEND_DO_FCALL  "hello"        '0
3      7     ZEND_RETURN    1
But where is the function code? This code simply sets the argument stack (via
ZEND_SEND_VAL) and calls hello, but you don’t see the code for hello anywhere. This is
because functions in PHP are op arrays as well, as if they were miniature scripts. For
example, here is the op array for the function hello:
FUNCTION: hello
opnum  line  opcode        op1         op2  result
0      2     ZEND_FETCH_W  "name"           '0
1      2     ZEND_RECV     1                '0
2      3     ZEND_ECHO     "hello%0A"
3      4     ZEND_RETURN   NULL
This looks pretty similar to the inline code you’ve seen before. The only difference is
ZEND_RECV, which reads off the argument stack. As with standalone scripts, even though
you don’t explicitly return at the end, a ZEND_RETURN op is implicitly added, and it
returns null.
Includes work similarly to function calls:
<?php
include("file.inc");
?>
opnum  line  opcode                op1         op2  result
0      2     ZEND_INCLUDE_OR_EVAL  "file.inc"       '0
1      4     ZEND_RETURN           1
This illustrates an important aspect of the PHP language: All includes and requires
happen at runtime. So when a script is initially parsed, the op array for that script is
generated, and any functions and classes defined in its top-level file (the one that is
actually run) are inserted into the symbol table; but no potentially included scripts are
parsed yet. When the script is executed, if an include statement is encountered, the
include is then parsed and executed on the spot. Figure 20.1 illustrates the flow of a
normal PHP script.
How the Zend Engine Works: Opcodes and Op Arrays
Figure 20.1 The execution path of a PHP script.
This design choice has a number of repercussions:
• Flexibility: It is an oft-vaunted fact that PHP is a runtime language. One of the
important things that being a runtime language means for PHP is that it supports
conditional inclusion of files and conditional declaration of functions and classes.
Here’s an example:
if($condition) {
    include("file1.inc");
}
else {
    include("file2.inc");
}
In this example, the runtime parsing and execution of included files makes this
operation more efficient (because files are included only when needed), and it
eliminates the potential hassles of symbol conflicts if two files contain different
implementations of the same function or class.
• Speed: Having to actually compile includes on the fly means that a significant
portion of a script’s execution time is spent simply compiling its dependent
includes. If a file is included twice, it must be parsed and executed twice.
include_once and require_once partially solve that problem, but it is further
exacerbated by the fact that PHP resets its compiler state completely between
script executions. (We’ll talk about that more in a minute, as well as some ways to
minimize that effect.)
Variables
Programming languages come in two basic flavors when it comes to how variables are
declared:
• Statically typed: Statically typed languages include languages such as C++ or
Java, where a variable is assigned a type (for example, int or String) and that type
is fixed at compile time.
• Dynamically typed: Dynamically typed languages include languages such as
PHP, Perl, Python, and VBScript, where types are automatically inferred at
runtime. If you use this:
$variable = 0;
PHP will automatically create it as an integer type.
Furthermore, there are two additional criteria for how types are enforced or converted
between:
• Strongly typed: In a strongly typed language, if an expression receives an
argument of the wrong type, an error is generated. Statically typed languages are
generally strongly typed (although many allow one type to be cast, or forced
to be interpreted, as another type). Some dynamically typed languages, such as
Python and Ruby, have strong typing; in them, exceptions are thrown if variables
are used in an incorrect context.
• Weakly typed: A weakly typed language does not necessarily enforce types. This
is usually accompanied by autoconversion of variables to appropriate types. For
instance, in this:
$string = "The value of \$variable is $variable.";
$variable (which was autocast into an integer when it was first set) is now
autoconverted into a string type so that it can be used to create $string.
All these typing strategies have their relative benefits and drawbacks. Static typing allows
you to enforce a certain level of data validation at compile time. Because dynamically
typed languages must defer that checking to runtime, they tend to be slower than
statically typed languages. Dynamic typing is, of course, more flexible. Most interpreted
languages choose to go with dynamic typing because it fits their flexibility.
Strong typing similarly allows you a good amount of built-in data validation, in this
case at runtime. Weak typing provides additional flexibility by allowing variables to
autoconvert between types as necessary. The interpreted languages are pretty well split on
strong typing versus weak typing. Python and Ruby (both of which bill themselves as
general-purpose “enterprise” languages) implement strong typing, whereas Perl, PHP, and
JavaScript implement weak typing.
PHP is both dynamically typed and weakly typed. One slight exception is the optional
type checking for argument types in functions. For example, this:
function foo(User $array) { }
and this:
function bar(Exception $array) { }
enforce being passed a User or an Exception object (or one of its descendants or
implementers), respectively.
To fully understand types in PHP, you need to look under the hood at the data
structures used in the engine. In PHP, all variables are zvals, represented by the
following C structure:
struct _zval_struct {
    /* Variable information */
    zvalue_value value; /* value */
    zend_uint refcount;
    zend_uchar type; /* active type */
    zend_uchar is_ref;
};
and its complementary data container:
typedef union _zvalue_value {
    long lval; /* long value */
    double dval; /* double value */
    struct {
        char *val;
        int len;
    } str; /* string value */
    HashTable *ht; /* hashtable value */
    zend_object_value obj; /* handle to an object */
} zvalue_value;
The zval consists of its own value (which we’ll get to in a moment), a refcount, a type,
and the flag is_ref.
A zval’s refcount is the reference counter for the value associated with that variable.
When you instantiate a new variable, like this, it is created with a reference count of 1:
$variable = 'foo';
If you create a copy of $variable, the zval for its value has its reference count
incremented. So after you perform the following, the zval for 'foo' has a reference
count of 2:
$variable_copy = $variable;
If you then change $variable, it will be associated with a new zval with a reference
count of 1, and the original string 'foo' will have its reference count decremented to 1,
as follows:
$variable = 'bar';
When a variable falls out of scope (say it’s defined in a function and that function is
returned from), or when the variable is destroyed, its zval’s reference count is
decremented by one. When a zval’s refcount reaches 0, it is picked up by the
garbage-collection system and its contents will be freed.
The zval type is especially interesting. The fact that PHP is a weakly typed language
does not mean that variables do not have types. The type attribute of the zval specifies
what the current type of the zval is; this indicates which part of the zvalue_value
union should be looked at for its value.
Finally, is_ref indicates whether this zval actually holds data or is simply a reference
to another zval that holds data.
The zvalue_value value is where the data for a zval is actually stored. This is a
union of all the possible base types for a variable in PHP: long integers, doubles, strings,
hashtables (arrays), and object handles. A union in C is a composite data type that uses a
minimal amount of space to store different possible types at different times. Practically,
this means that the data stored for a zval is either a numeric representation, a string
representation, an array representation, or an object representation, but never more than
one at a time. This is in contrast to a language such as Perl, where all these potential
representations can coexist (this is how in Perl you can have a variable that has entirely
different representations when accessed as a string than when accessed as a number).
When you switch types in PHP (which is almost never done explicitly and almost
always implicitly, when a usage demands that a zval be in a different representation than it
currently is), zvalue_value is converted into the required format. This is why you get
behavior like this:
$a = "00";
$a += 0;
echo $a;
which prints 0 and not 00 because the extra characters are silently discarded when $a is
converted to an integer on the second line.
Variable types are also important in comparison. When you compare two variables
with the identical operator (===), like this, the active types for the zvals are compared,
and if they are different, the comparison fails outright:
$a = 0;
$b = '0';
echo ($a === $b)?"Match":"Doesn't Match";
For that reason, the comparison in this example fails and the script prints Doesn’t Match.
With the is equal operator (==), the comparison that is performed is based on the
active types of the operands. If the operands are strings or nulls, they are compared as
strings; if either is a Boolean, they are converted to Boolean values and compared; and
otherwise they are converted to numbers and compared. Although this results in the ==
operator being symmetrical (that is, $a == $b is the same as $b == $a), it actually
is not transitive. The following example of this was kindly provided by Dan Cowgill:
$a = "0";
$b = 0;
$c = "";
echo ($a == $b)?"True":"False"; // True
echo ($b == $c)?"True":"False"; // True
echo ($a == $c)?"True":"False"; // False
Although transitivity may seem like a basic feature of an operator algebra, understanding
how == works makes it clear why transitivity does not hold. Here are some examples:
• "0" == 0 because both variables end up being converted to integers and compared.
• $b == $c because both $b and $c are converted to integers and compared.
• However, $a != $c because both $a and $c are strings, and when they are compared
as strings, they are decidedly different.
In his commentary on this example, Dan compared this to the == and eq operators in
Perl, which are both transitive. They are both transitive, though, because they are both
typed comparisons. == in Perl coerces both operands into numbers before performing the
comparison, whereas eq coerces both operands into strings. The PHP == is not a typed
comparator, though, and it coerces variables only if they are not of the same active type.
Thus the lack of transitivity.
Functions
You’ve seen that when a piece of code calls a function, it populates the argument stack
via ZEND_SEND_VAL and uses a ZEND_DO_FCALL op to execute the function. But what
does that really do? To really understand how these things work, you need to go back to
even before compilation. When PHP starts up, it looks through all its registered
extensions (both the ones that were compiled statically and any that were registered in the
php.ini file) and registers all the functions that they define. These functions look like
this:
typedef struct _zend_internal_function {
    /* Common elements */
    zend_uchar type;
    zend_uchar *arg_types;
    char *function_name;
    zend_class_entry *scope;
    zend_uint fn_flags;
    union _zend_function *prototype;
    /* END of common elements */
    void (*handler)(INTERNAL_FUNCTION_PARAMETERS);
} zend_internal_function;
The important things to note here are the type (which is always
ZEND_INTERNAL_FUNCTION, meaning that it is an extension function written in C), the
function name, and the handler, which is a C function pointer to the function itself and is
part of the extension code.
Registering one of these functions basically amounts to its being inserted into the
global function table (a hashtable in which functions are stored).
User-defined functions are, of course, inserted by the compiler. When the compiler
(by which I still mean the lexer, parser, and code generator all together) encounters a
piece of code like this:
function say_hello($name)
{
    echo "Hello $name\n";
}
it compiles the code inside the function’s block as a new op array, creates a
zend_function with that op array, and inserts that zend_function into the global function
table with its type set to ZEND_USER_FUNCTION. A zend_function looks like this:
typedef union _zend_function {
    zend_uchar type;
    struct {
        zend_uchar type; /* never used */
        zend_uchar *arg_types;
        char *function_name;
        zend_class_entry *scope;
        zend_uint fn_flags;
        union _zend_function *prototype;
    } common;
    zend_op_array op_array;
    zend_internal_function internal_function;
} zend_function;
This definition can be rather confusing if you don’t recognize one of the design goals:
For the most part, zend_functions, zend_internal_functions, and op arrays are
interchangeable. They are not identical structs, but all the elements that are in “common”
they hold in common. Thus they can safely be cast to each other.
In practice, this means that when a ZEND_DO_FCALL op is executed, it stashes away the
current scope, populates the argument stack, and looks up the requested function by
name (actually by the lowercase version of the name because PHP implements
case-insensitive function names), returning a pointer to a zend_function. If the function’s
type is ZEND_INTERNAL_FUNCTION, it can be recast to a zend_internal_function and
executed via zend_execute_internal, which executes internal functions. Otherwise, it
will be executed via zend_execute, the same function that is called to execute scripts
and includes. This works because user functions are completely identical to op
arrays.
As you can likely infer from the way that PHP functions work, ZEND_SEND_VAL does
not push an argument’s zval onto the argument stack; instead, it copies it and pushes the
copy onto the stack. This has the consequence that unless a variable is passed by
reference (with the exception of objects), changing its value in a function does not change
the argument passed; it changes only the copy. To change a passed argument in a
function, pass it by reference.
Classes
Classes are similar to functions in that, like functions, they are stashed in their own global
symbol table; but they are more complex than functions. Whereas functions are similar to
scripts (possessing the same instruction set), classes are like a miniature version of the
entire execution scope.
A class is represented by a zend_class_entry, like this:
struct _zend_class_entry {
    char type;
    char *name;
    zend_uint name_length;
    struct _zend_class_entry *parent;
    int refcount;
    zend_bool constants_updated;
    zend_uint ce_flags;
The PHP Request Life Cycle
Figure 20.2 The architecture of PHP.
Below the SAPI layer lies the PHP engine itself. The core PHP code handles [...] the
architecture of PHP starts with a diagram such as Figure 20.2, which shows the
application layers in PHP.
The outermost layer, where PHP interacts with