Search

2/03/2010

HipHop for PHP: Move Fast

HipHop for PHP: Move Fast

HipHop for PHP isn't technically a compiler itself. Rather it is a source code transformer. HipHop programmatically transforms your PHP source code into highly optimized C++ and then uses g++ to compile it. HipHop executes the source code in a semantically equivalent manner and sacrifices some rarely used features — such as eval() — in exchange for improved performance. HipHop includes a code transformer, a reimplementation of PHP's runtime system, and a rewrite of many common PHP Extensions to take advantage of these performance optimizations.

Finding new ways to improve PHP performance isn't a new concept. At run time the Zend Engine turns your PHP source into opcodes which are then run through the Zend Virtual Machine. Open source projects such as APC and eAccelerator cache this output and are used by the majority of PHP powered websites. There's also Zend Server, a commercial product which makes PHP faster via opcode optimization and caching. Instead, we were thinking about transforming PHP source directly into C++ which can then be turned into native machine code. Even compiling PHP isn't a new idea, open source projects like Roadsend and phc compile PHP to C, Quercus compiles PHP to Java, and Phalanger compiled PHP to .Net.

The main challenge of the project was bridging the gap between PHP and C++. PHP is a scripting language with dynamic, weak typing. C++ is a compiled language with static typing. While PHP allows you to write magical dynamic features, most PHP is relatively straightforward. It's more likely that you see if (...) {...} else {..} than it is to see function foo($x) { include $x; }. This is where we gain in performance. Whenever possible our generated code uses static binding for functions and variables. We also use type inference to pick the most specific type possible for our variables and thus save memory.

The transformation process includes three main steps:

1. Static analysis where we collect information on who declares what and dependencies,
2. Type inference where we choose the most specific type between C++ scalars, String, Array, classes, Object, and Variant, and
3. Code generation which for the most part is a direct correspondence from PHP statements and expressions to C++ statements and expressions.

沒有留言: