Tài liệu hạn chế xem trước, để xem đầy đủ mời bạn chọn Tải xuống
1
/ 50 trang
THÔNG TIN TÀI LIỆU
Thông tin cơ bản
Định dạng
Số trang
50
Dung lượng
494,91 KB
Nội dung
278
Chapter 10 Data Component Caching
<?php
function generate_navigation($tag) {
list($topic, $subtopic) = explode(‘-’, $tag, 2);
if(function_exists(“generate_navigation_$topic”)) {
return call_user_func(“generate_navigation_$topic”, $subtopic);
}
else {
return ‘unknown’;
}
}
?>
A generation function for a project summary looks like this:
<?php
require_once ‘Project.inc’;
function generate_navigation_project($name) {
try {
if(!$name) {
throw new Exception();
}
$project = new Project($name);
}
catch (Exception $e){
return ‘unknown project’;
}
?>
<table>
<tr>
<td>Author:</td><td><?= $project->author ?>
</tr>
<tr>
<td>Summary:</td><td><?= $project->short_description ?>
</tr>
<tr>
<td>Availability:</td>
<td><a href=”<?= $project->file_url ?>”>click here</a></td>
</tr>
<tr>
<td><?= $project->long_description ?></td>
</tr>
</table>
<?php
}
?>
This looks almost exactly like your first attempt for caching the entire project page, and
in fact you can use the same caching strategy you applied there.The only change you
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
279
Integrating Caching into Application Code
should make is to alter the get_cachefile function in order to avoid colliding with
cache files from the full page:
<?php
require_once ‘Project.inc’;
function generate_navigation_project($name) {
try {
if(!$name) {
throw new Exception;
}
$cache = new Cache_File(Project::get_cachefile_nav($name));
if($text = $cache->get()) {
print $text;
return;
}
$project = new Project($name);
$cache->begin();
}
catch (Exception $e){
return ‘unkonwn project’;
}
?>
<table>
<tr>
<td>Author:</td><td><?= $project->author ? >
</tr>
<tr>
<td>Summary:</td><td><?= $project->short_description ?>
</tr>
<tr>
<td>Availability:</td><td><a href=”<?= $project->file_url ?>”>click
here</a></td>
</tr>
<tr>
<td><?= $project->long_description ?></td>
</tr>
</table>
<?php
$cache->end();
}
And in Project.inc you add this:
public function get_cachefile_nav($name) {
global $CACHEBASE;
return “$CACHEBASE/projects/nav/$name.cache”;
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
280
Chapter 10 Data Component Caching
}
?>
It’s as simple as that!
Implementing a Query Cache
Now you need to tackle the weather element of the navigation bar you’ve been working
with.You can use the Simple Object Application Protocol (SOAP) interface at xmeth-
ods.net to retrieve real-time weather statistics by ZIP code. Don’t worry if you have not
seen SOAP requests in PHP before; we’ll discuss them in depth in Chapter 16,“RPC:
Interacting with Remote Services.” generate_navigation_weather() creates a Weather
object for the specified ZIP code and then invokes some SOAP magic to return the
temperature in that location:
<?php
include_once ‘SOAP/Client.php’;
class Weather {
public $temp;
public $zipcode;
private $wsdl;
private $soapclient;
public function _ _construct($zipcode) {
$this->zipcode = $zipcode;
$this->_get_temp($zipcode);
}
private function _get_temp($zipcode) {
if(!$this->soapclient) {
$query = “http://www.xmethods.net/sd/2001/TemperatureService.wsdl”;
$wsdl = new SOAP_WSDL($query);
$this->soapclient = $wsdl->getProxy();
}
$this->temp = $this->soapclient->getTemp($zipcode);
}
}
function generate_navigation_weather($zip) {
$weather = new Weather($zip);
?>
The current temp in <?= $weather->zipcode ?>
is <?= $weather->temp ?> degrees Farenheit\n”;
<?php
}
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
281
Further Reading
RPCs of any kind tend to be slow, so you would like to cache the weather report for a
while before invoking the call again.You could simply apply the techniques used in
Project and cache the output of generate_navigation_weather() in a flat file.That
method would work fine, but it would allocate only one tiny file per ZIP code.
An alternative is to use a DBM cache and store a record for each ZIP code.To insert
the logic to use the Cache_DBM class that you implemented earlier in this chapter
requires only a few lines in _get_temp:
private function _get_temp($zipcode) {
$dbm = new Cache_DBM(Weather::get_cachefile(), 3600);
if($temp = $dbm->get($zipcode)) {
$this->temp = $temp;
return;
}
else {
if(!$this->soapclient) {
$url = “ http://www.xmethods.net/sd/2001/TemperatureService.wsdl”;
$wsdl = new SOAP_WSDL($url);
$this->soapclient = $wsdl->getProxy();
}
$this->temp = $this->soapclient->getTemp($zipcode);
$dbm->put($zipcode, $this->temp);
}
}
function get_cachefile() {
global $CACHEBASE;
return “$CACHEBASE/Weather.dbm”;
}
Now when you construct a Weather object, you first look in the DBM file to see
whether you have a valid cached temperature value.You initialize the wrapper with an
expiration time of 3,600 seconds (1 hour) to ensure that the temperature data does not
get too old.Then you perform the standard logic “if it’s cached, return it; if not, generate
it, cache it, and return it.”
Further Reading
A number of relational database systems implement query caches or integrate them into
external appliances. As of version 4.0.1, MySQL has an integrated query cache.You can
read more at
www.mysql.com.
mod_rewrite is detailed on the Apache site, http://httpd.apache.org.
Web services, SOAP, and WSDL are covered in Chapter 16.The end of that chapter
contains a long list of additional resources.
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
11
Computational Reuse
C
OMPUTATIONAL REUSE IS A TECHNIQUE BY which intermediate data (that is, data that
is not the final output of a function) is remembered and used to make other calculations
more efficient. Computational reuse has a long history in computer science, particularly
in computer graphics and computational mathematics. Don’t let these highly technical
applications scare you, though; reuse is really just another form of caching.
In the past two chapters we investigated a multitude of caching strategies. At their
core, all involve the same premise:You take a piece of data that is expensive to compute
and save its value.The next time you need to perform that calculation, you look to see
whether you have stored the result already. If so, you return that value.
Computational reuse is a form of caching that focuses on very small pieces of data.
Instead of caching entire components of an application, computational reuse focuses on
how to cache individual objects or data created in the course of executing a function.
Often these small elements can also be reused. Every complex operation is the combined
result of many smaller ones. If one particular small operation constitutes a large part of
your runtime, optimizing it through caching can give significant payout.
Introduction by Example: Fibonacci Sequences
An easy example that illustrates the value of computational reuse has to do with com-
puting recursive functions. Let’s consider the Fibonacci Sequence, which provides a solu-
tion to the following mathematical puzzle:
If a pair of rabbits are put into a pen, breed such that they produce a new pair of rab-
bits every month, and new-born rabbits begin breeding after two months, how many
rabbits are there after n months? (No rabbits ever die, and no rabbits ever leave the
pen or become infertile.)
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
284
Chapter 11 Computational Reuse
Leonardo Fibonacci
Fibonacci was a 13th-century Italian mathematician who made a number of important contributions to
mathematics and is often credited as signaling the rebirth of mathematics after the fall of Western science
during the Dark Ages.
The answer to this riddle is what is now known as the Fibonacci Sequence.The
number of rabbit pairs at month n is equal to the number of rabbit pairs the previous
month (because no rabbits ever die), plus the number of rabbit pairs two months ago
(because each of those is of breeding age and thus has produced a pair of baby rabbits).
Mathematically, the Fibonacci Sequence is defined by these identities:
Fib(0) = 1
Fib(1) = 1
Fib(n) = Fib(n-1) + Fib(n-2)
If you expand this for say, n = 5, you get this:
Fib(5) = Fib(4) + Fib(3)
Now you know this:
Fib(4) = Fib(3) + Fib(2)
and this:
Fib(3) = Fib(2) + Fib(1)
So you expand the preceding to this:
Fib(5) = Fib(3) + Fib(2) + Fib(2) + Fib(1)
Similarly, you get this:
Fib(2) = Fib(1) + Fib(1)
Therefore, the value of Fib(5) is derived as follows:
Fib(5) = Fib(2) + Fib(1) + Fib(1) + Fib(0) + Fib(1) + Fib(0) + Fib(1)
= Fib(1) + Fib(0) + Fib(1) + Fib(1) + Fib(0) + Fib(1) + Fib(0) + Fib(1)
= 8
Thus, if you calculate Fib(5) with the straightforward recursive function:
function Fib($n) {
if($n == 0 || $n == 1) {
return 1;
}
else {
return Fib($n – 2) + Fib($n – 1);
}
}
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
285
Introduction by Example: Fibonacci Sequences
you see that you end up computing Fib(4) once but Fib(3) twice and Fib(2) three
times. In fact, by using mathematical techniques beyond the scope of this book, you can
show that calculating Fibonacci numbers has exponential complexity (O(1.6^n)).This
means that calculating F(n) takes at least 1.6^n steps. Figure 11.1 provides a glimpse into
why this is a bad thing.
Figure 11.1 Comparing complexities.
Complexity Calculations
When computer scientists talk about the speed of an algorithm, they often refer to its “Big O” speed, writ-
ten as O(n) or O(n
2
) or O(2
n
). What do these terms mean?
When comparing algorithms, you are often concerned about how their performance changes as the data set
they are acting on grows. The O( ) estimates are growth estimates and represent a worst-case bound on the
number of “steps” that need to be taken by the algorithm on a data set that has n elements.
For example, an algorithm for finding the largest element in an array goes as follows: Start at the head of
the array, and say the first element is the maximum. Compare that element to the next element in the array.
If that element is larger, make it the max. This requires visiting every element in the array once, so this
method takes n steps (where n is the number of elements in the array). We call this O(n), or linear time. This
means that the runtime of the algorithm is directly proportional to the size of the data set.
Another example would be finding an element in an associative array. This involves finding the hash value
of the key and then looking it up by that hash value. This is an O(1), or constant time, operation. This means
that as the array grows, the cost of accessing a particular element does not change.
1
23456789
O (1)
O (N)
O (N^2)
O (1.6^N)
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
286
Chapter 11 Computational Reuse
On the other side of the fence are super-linear algorithms. With these algorithms, as the data set size
grows, the number of steps needed to apply the algorithm grows faster than the size of the set. Sorting
algorithms are an example of this. One of the simplest (and on average slowest) sorting algorithms is
bubblesort. bubblesort works as follows: Starting with the first element in the array, compare each
element with its neighbor. If the elements are out of order, swap them. Repeat until the array is sorted.
bubblesort works by “bubbling” an element forward until it is sorted relative to its neighbors and then
applying the bubbling to the next element. The following is a simple bubblesort implementation in PHP:
function bubblesort(&$array) {
$n = count($array);
for($I = $n; $I >= 0; $I ) {
// for every position in the array
for($j=0; $j < $I; $j++) {
// walk forward through the array to that spot
if($array[$j] > $array[$j+1]) {
// if elements are out of order then swap position j and j+1
list($array[$j], $array[$j+1]) =
array($array[$j+1], $array[$j]);
}
}
}
}
In the worst-case scenario (that the array is reverse sorted), you must perform all possible swaps, which is
(n
2
+ n)/2. In the long term, the n
2
term dominates all others, so this is an O(n
2
) operation.
Figure 11.1 shows a graphical comparison of a few different complexities.
Anything you can do to reduce the number of operations would have great long-term
benefits.The answer, though, is right under your nose:You have just seen that the prob-
lem in the manual calculation of
Fib(5) is that you end up recalculating smaller
Fibonacci values multiple times. Instead of recalculating the smaller values repeatedly, you
should insert them into an associative array for later retrieval. Retrieval from an associa-
tive array is an O(1) operation, so you can use this technique to improve your algorithm
to be linear (that is, O(n)) complexity.This is a dramatic efficiency improvement.
Note
You might have figured out that you can also reduce the complexity of the Fibonacci generator to O(n) by
converting the tree recursive function (meaning that Fib(n) requires two recursive calls internally) to a
tail recursive one (which has only a single recursive call and thus is linear in time). It turns out that caching
with a static accumulator gives you superior performance to a noncaching tail-recursive algorithm, and the
technique itself more easily expands to common Web reuse problems.
Before you start tinkering with your generation function, you should add a test to ensure
that you do not break the function’s functionality:
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
287
Introduction by Example: Fibonacci Sequences
<?
require_once ‘PHPUnit/Framework/TestCase.php’;
require_once ‘PHPUnit/Framework/TestSuite.php’;
require_once ‘PHPUnit/TextUI/TestRunner.php’;
require_once “Fibonacci.inc”;
class FibonacciTest extends PHPUnit_Framework_TestCase {
private $known_values = array( 0 => 1,
1 => 1,
2 => 2,
3 => 3,
4 => 5,
5 => 8,
6 => 13,
7 => 21,
8 => 34,
9 => 55);
public function testKnownValues() {
foreach ($this->known_values as $n => $value) {
$this->assertEquals($value, Fib($n),
“Fib($n) == “.Fib($n).” != $value”);
}
}
public function testBadInput() {
$this->assertEquals(0, Fib(‘hello’), ‘bad input’);
}
public function testNegativeInput() {
$this->assertEquals(0, Fib(-1));
}
}
$suite = new PHPUnit_Framework_TestSuite(new Reflection_Class(‘FibonacciTest’));
PHPUnit_TextUI_TestRunner::run($suite);
?>
Now you add caching.The idea is to use a static array to store sequence values that you
have calculated. Because you will add to this array every time you derive a new value,
this sort of variable is known as an accumulator array. Here is the Fib() function with a
static accumulator:
function Fib($n) {
static $fibonacciValues = array( 0 => 1, 1 => 1);
if(!is_int($n) || $n < 0) {
return 0;
}
If(!$fibonacciValues[$n]) {
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
[...]... (strlen, strcmp, and so on, many of which have direct correspondents in PHP) know that a string ends when they encounter a null character Binary data, on the other hand, can consist of completely arbitrary characters, including nulls PHP does not have a separate type for binary data, so strings in PHP must know their own length so that the PHP versions of strlen and strcmp can skip past null characters embedded... benefit of having to perform the calculations only once, but you do not have to take the hit of precalculating all that might interest you In PHP 4 there are ways to hack your factory directly into the class constructor: // php4 syntax – not forward-compatible to php5 $wordcache = array(); function Word($name) { global $wordcache; if(array_key_exists($name, $wordcache)) { $this = $wordcache[$name]; }... the runtime once and then executing a page the way you might execute a function call in a loop (just calling the compiled copy) As we will discuss in Chapter 20, PHP and Zend Engine Internals,” PHP does not implement this sort of strategy PHP keeps a persistent interpreter, but it completely tears down the context at request shutdown This means that if in a page you create any sort of variable, like... Testing,” to calculate Flesch readability scores For every word in the document, you created a Word object to find its number of syllables In a document of any reasonable size, you expect to see some repeated words Caching the Word object for a given word, as well as the number of syllables for the word, should greatly reduce the amount of per-document parsing that needs to be performed Caching the number... involve significant resources is doing interprocess sharing of small data worthwhile It is difficult to overcome the cost of interprocess communication on such a small scale Computational Reuse Inside PHPPHP itself employs computational reuse in a number of places PCREs Perl Compatible Regular Expressions (PCREs) consist of preg_match(), preg_replace(), preg_split(), preg_grep(), and others.The PCRE... in the PCRE C library) is called to actually make the matches PHP hides this effort from you.The preg_match() function internally performs pcre_compile() and caches the result to avoid recompiling it on subsequent executions PCREs are implemented inside an extension and thus have greater control of their own memory than does user-space PHP code.This allows PCREs to not only cache compiled regular expressions... overhead of regular expression compilation entirely.This implementation strategy is very close to the PHP 4 method we looked at earlier in this chapter for caching Text_Word objects without a factory class 295 296 Chapter 11 Computational Reuse Array Counts and Lengths When you do something like this, PHP does not actually iterate through $array and count the number of elements it has: $array = array(‘a‘,‘b‘,‘c‘,1,2,3);... variable is assigned to a string (or cast to a string), PHP also calculates and stores the length of that string in an internal register in that variable If strlen() is called on that variable, its precalculated length value is returned.This caching is actually also critical to handling binary data because the underlying C library function strlen() (which PHP s strlen() is designed to mimic) is not binary... is rather large: mysql> select data from ObjectCache where keyname = ‘the’; + -+ data + -+ Computational Reuse Inside PHP O:4:”word”:2:{s:4:”word”;s:3:”the”;s:13:”_numSyllables”;i:0;} + -+ 1 row in set (0.01 sec) That amounts to 61 bytes of data, much of which is class structure In PHP 4 this is even worse because static class variables are not supported, and each serialization can include the syllable... patterns define the way you will interact with an RDBMS in PHP code At a simplistic level, this involves determining how and where SQL will appear in the code base.The span of philosophies on this is pretty wide On one hand is a camp of people who believe that data access is such a fundamental part of an application that SQL should be freely mixed with PHP code whenever a query needs to be performed On the . Sequences
<?
require_once ‘PHPUnit/Framework/TestCase .php ;
require_once ‘PHPUnit/Framework/TestSuite .php ;
require_once ‘PHPUnit/TextUI/TestRunner .php ;
require_once. interest you. In PHP 4 there are ways to hack your
factory directly into the class constructor:
// php4 syntax – not forward-compatible to php5
$wordcache