Legacy Codebase: A Love Story

After some years of working with a legacy PHP codebase that is more than 10 years old, I can truly say: you can escape the legacy codebase and introduce whatever is helpful in a well-maintained system.

Here are 5 important steps that I have taken:

  • Custom error handling: report notices to developers, report bad “assert” calls in the dev container, report bad indexes, report wrong code usage, …
  • Autocompletion for everything: classes, properties, SQL queries, CSS, HTML, JavaScript in PHP (e.g. via /* @lang JavaScript */ in PhpStorm), …
  • Static-Code Analysis: Preventing bugs is even better than fixing bugs, so just stop stupid bugs and use types in your code.
  • Automate the refactoring: With tools like PHP-CS-Fixer or Rector you can not only fix your code once, you can also fix any future wrong usage of the code.
  • Do not use strings for code: Just use constants, classes, properties, … use something that can be processed by your static-code analysis and that gives you autocompletion.
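
To make the last point a bit more concrete, here is a minimal sketch. The class, constant and function names are made up for this example and are not taken from our codebase; it only shows why a constant is easier for the IDE and the static analysis to follow than a plain string.

```php
<?php

declare(strict_types=1);

// Hypothetical example class: the column names live in constants instead of
// being scattered through the code as plain strings.
final class UserColumns
{
    public const ID = 'id';
    public const CREATED_DATE = 'created_date';
}

/**
 * @param array<string, mixed> $row
 */
function getCreatedDate(array $row): string
{
    // A typo like UserColumns::CREATED_DATA is reported by the IDE and by
    // static analysis; a typo in the string 'created_data' is not.
    return (string) $row[UserColumns::CREATED_DATE];
}

$row = [UserColumns::ID => 1, UserColumns::CREATED_DATE => '2023-01-15'];
echo getCreatedDate($row), "\n";
```

With the constant in place, "find usages" and "rename" in the IDE work for the column name as well.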

Here are 5 additional steps that I have already introduced:

  • Sentry: External error collecting (aggregating) tool + custom handler to see e.g. IDs of every Active Record object.
  • Generics: via PHPDocs + autocompletion via PhpStorm
  • No “mixed” types: Now we use something like, e.g. “array<int, string>” instead of “array”.
  • PSR standards: e.g. PSR-15 request handler, PSR-11 container, PSR-3 logger, …
  • Code Style: One code style to rule them all, we use PHP-CS-Fixer and PHP_CodeSniffer to check / fix our code style for all ~ 10,000 PHP classes.
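
As an illustration of the “Generics” and “No mixed types” points, here is a small sketch of what such PHPDoc annotations can look like. The two functions are hypothetical examples; the “@template” and “array<int, string>” annotations are the PHPDoc syntax that PHPStan, Psalm and PhpStorm understand.

```php
<?php

declare(strict_types=1);

/**
 * Generic function via PHPDoc: the return type follows the item type.
 *
 * @template T
 *
 * @param array<int, T> $items
 *
 * @return T|null
 */
function firstItem(array $items)
{
    foreach ($items as $item) {
        return $item;
    }

    return null;
}

/**
 * "array<int, string>" instead of a plain "array".
 *
 * @param array<int, string> $names
 *
 * @return array<int, string>
 */
function upperCaseNames(array $names): array
{
    return array_map('strtoupper', $names);
}
```

The static analysis can now tell that firstItem() called with a list of strings returns a string (or null), without any runtime cost.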

Here is what helped me most while working with old existing code.

First rule first: 🥇 think and / or ask someone in the team

Analyzing: Here are some things that helped me analyze software problems in our codebase.

  • Errors: Better error handling / reporting with a custom error handler, with all relevant information.
  • Understandable logging: Hint, you can just use syslog for medium-sized applications.
  • Grouping errors: Displaying and grouping all the stuff (PHP / JS errors + our own custom reports) in Sentry (https://sentry.io/), so you can easily see e.g. how many customers are affected by an error.
  • git history: New bugs were often introduced with the latest changes (at least in frequently used components), so good commit messages are really helpful for finding these changes. (https://github.com/voku/dotfiles/wiki/git-commit-messages)
  • Local containers: If you can just download the application with a database dump from yesterday, you can analyze many problems without touching any external server.
  • Linux tools: mytop, strace, htop, iotop, lsof, …
  • Database tools: EXPLAIN [SQL], IDE integration / autocompletion, …
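
For the “Errors” point, a custom error handler can start very small. The following is only a sketch of the general idea, not our real handler: it converts PHP notices / warnings into exceptions, so that e.g. a bad index cannot be overlooked in the dev container. The function readIndex() is a hypothetical helper for the demonstration.

```php
<?php

declare(strict_types=1);

error_reporting(E_ALL);

// Turn PHP notices / warnings into exceptions.
set_error_handler(
    static function (int $severity, string $message, string $file, int $line): bool {
        // Respect the current error_reporting level (and the "@" operator).
        if (!(error_reporting() & $severity)) {
            return false;
        }

        throw new ErrorException($message, 0, $severity, $file, $line);
    }
);

/**
 * @param array<string, mixed> $data
 *
 * @return mixed
 */
function readIndex(array $data, string $key)
{
    // An undefined index triggers a notice / warning, which the handler
    // above converts into an ErrorException.
    return $data[$key];
}
```

In a real project the handler would rather collect the relevant information and report it (e.g. to Sentry) instead of only throwing.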

Fixing: Here are some tricks for fixing existing code more easily.

  • IDE: PhpStorm with auto-completion and suggestions (including results from static analysis)
  • auto-code-style formatter: (as pre-commit hook) is also helpful because I do not need to think about this anymore while fixing code 
  • git stuff: sometimes it can also be helpful to understand git and how to revert or cherry-pick some changes

Preventing: Here are some hints how you can prevent some bugs.

  • custom static analysis rules: http://suckup.de/2022/07/php-code-quality-with-custom-tooling-extensions/
  • root cause: fix the root cause of a problem; sometimes this is very hard because you need to fully understand the problem first, but spending this time is mostly a good investment
  • testing: writing a test is always a good idea, at least to prevent the same problem

Job: If you now would like to work with this codebase (PHP 8 | MySQL 8 | Ubuntu), please contact me and take a look at this job offer: https://meerx.de/karriere/softwareentwickler-softwareentwicklerin/

What have I learned so far in my job?

I will start a new job next month (02-2023), so time to recap, I’m going to describe what I’ve learned so far.

me: Lars Moelleken |
> Assistant for business IT
> IT specialist for system integration
> IT specialist for application development

What did I learn as IT specialist for system integration?

– You only learn as much as you want to learn.

In contrast to school / technical college, I could and had to teach myself many things and work on them on my own during the training. And you quickly realize that you only learn as much as you want. Once you’ve understood this, you’re happy to sit down and learn whatever you want: for example, take local or online courses, or go to meetups or conferences. Care about your skills because if you do something, you should do it right.

“An investment in knowledge still pays the best interest.” – Benjamin Franklin

– No panic!

What you learn pretty quickly as a sysadmin is “keep calm” and think first, then act. Hasty action does not help and usually even causes damage. Before you act, you should first gather some information yourself (log files, hardware status, system status, …) so that you really know how to fix the error.

– Unix & command line <3

If you haven’t worked with a Unix operating system before, you unfortunately don’t know what you’re missing out on. If you want or have to use Windows for whatever reason, you can nowadays still use some of the advantages of Linux via WSL (Windows Subsystem for Linux). Leaving your own comfort zone and trying out new operating systems helps you to understand your computer better overall. Back then, as a sysadmin, I would have recommended “Arch Linux”, but today I would choose something that needs less maintenance.

You should also become familiar with the command line if you want to increase your productivity rapidly. For example, take a closer look at the following commands: find / grep / lsof / strace

– Read the official documentation.

It is often better to read the official documentation of the thing (hardware || software) you are currently using. When we start programming or learn a new programming language / framework, we often use stackoverflow.com and quickly find answers and code examples, but the “why” and “how” are usually neglected. If you look at the specification / documentation first, you not only solve this problem, you also understand the problem, and maybe you will learn how to solve similar problems.

– Always make a backup (never use “sudo” while drunk).

Some things you have to learn the hard way, apparently installing “safe-rm” was part of it for me!

apt-get install safe-rm

“Man has three ways of acting wisely. First, on meditation; that is the noblest. Secondly, on imitation; that is the easiest. Thirdly, on experience; that is the bitterest.” – (Confucius)

– Be honest with customers, employees and yourself.

Be honest with customers, especially when things go wrong. If the customer’s product (e.g. server) fails, then offer solutions instead of excuses and solve the problem, not the question of blame. No one is helped by pointing the finger: not at colleagues, not at the server, not at the customer and ultimately not at yourself.

– Ask questions if you don’t understand something.

Don’t talk to customers about something you don’t understand. Not knowing something (especially in training) is fine, but then ask a colleague before you talk to a customer about it!

– Think about what you are doing (not only at work).

If you question things and think about your work and what you do, then you can develop personally. Ask critically, for example, whether you should really order from Amazon or rather order the book directly from the publisher. What is the advantage for the author and do I have any disadvantages? Should we use Nagios or rather Icinga directly? Question your work and critically evaluate whether this is really a good / safe / future-oriented solution.

If you are not sure yourself or your perspective is too limited (because you only know this one solution, for example), then you should acquire new knowledge, research other solutions or “best practices” and discuss the topic with others.

– Use Google correctly …

1. When searching an issue, look for the error message in quotes: “Lars Moelleken”

2. You can limit the result to special URLs: inurl:moelleken.org

3. Sometimes it’s helpful to find only specific files: filetype:txt inurl:suckup.de

> There are even more tricks that can help you in your daily work, just Google for it:

Full example: intitle:index.of mp3 “Männer Sind Schweine” -html -htm -php

What did I learn as IT specialist for application development?

– You never stop learning…

When you start to deal with web programming (HTML, CSS, JS, PHP, …), you don’t even know where to start. There is so much to learn, and this feeling accompanies you for some time (years) until you recognize recurring concepts. However, the advantage in web programming is that many different things have common APIs or at least can be combined. I can write a class in PHP that creates data attributes for my HTML, which I can read out with JavaScript to design them accordingly using CSS classes. But it stays the same, you never stop learning, and that’s also what’s so exciting about this job.

– Try to write code every day. (but set yourself a LIMIT)

Even if the boss isn’t in the office today, and you don’t have any more tasks, then write some code, if you’re sitting on the couch at home (and no new series is running on Netflix), then write or read some code and if you’re on vacation, you’re on vacation!

Here is an interesting link from “John Resig” (jQuery):
http://ejohn.org/blog/write-code-every-day/

– Think in modules and packages…

If you write software every day, you don’t want to code the same functionality (e.g. database connection, send e-mail or error logging …) multiple times for different projects (and remember, mostly you don’t want to code standard functions yourself). Therefore, every programmer should think in modules and design an application in different, preferably interchangeable parts. Often, the system can then also be expanded better, since there is already an interface for modules that can be used. The independence and decoupling of program parts also has the advantage that side effects from different places in the source code are further avoided. In addition, one should minimize the coupling of the corresponding modules, otherwise one gains nothing from modules.

There are already package managers for almost everything in web development. e.g.:
– Frontend (css, js): npm
– Backend (php): composer

– Open-Source-Software

If you are already working with modules and packages, you can publish them as an OSS project + run tests via GitHub actions + code coverage + at least a small amount of documentation. For these reasons alone, publishing as open source is worthwhile. The code quality also increases (in my experience), since the source code is released to the public and therefore more or less conscious attention is paid to the code quality.

The moment when you get your first pull request for your software or code together with someone from Israel, Turkey and the USA is priceless.

At some point, you would like to have the same advantages of OSS in your daily work because often, there is no package (code is directly added into the existing code), no tests (not even for bugs) and no documentation. So, possibly, you have now collected enough arguments to convince your boss to publish some packages from your code at work.

– Git & Good

I don’t know how people can work with the PC without version control. I even use “git” for smaller private projects or for configuration files. The following are some advantages, but the list can certainly be extended with a few more points:

– changes can be traced (git log, git blame)
– changes can be undone (git revert, git reset)
– changes can be reviewed by other employees
– employees can work on a project at the same time (git commit, git pull, git push)
– development branches (forks) can be developed simultaneously (git checkout -b, git branch)

– Use GitHub and learn from the best.

GitHub itself is not open-source, but there has been an unofficial agreement to use the platform for open-source projects. You can therefore find many good developers and projects there, and you can learn a lot just by reading the source code / changes. Especially because you can understand and follow the development of the projects: How do others structure their code? How do others write their “commit” messages? How much code should a method contain? How much code should a class contain? Which variable names are better to avoid? How to use specific libraries / tools? How do others test their code? …

– try { tests() }

Especially when you write tests for your own code, you catch yourself testing exactly the cases that you have already considered, so you should test the corresponding functionality with different (not yet considered) input. Here are some inputs for testing: https://github.com/minimaxir/big-list-of-naughty-strings

Hint: We should add another test whenever an error occurs, so that an already fixed error does not come back to us.
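
A tiny sketch of this idea, with a made-up helper function and plain “assert()” calls instead of a real test framework: first the case you had in mind anyway, then the inputs nobody considered, each of which would become a regression test after the first bug report.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, used only for this illustration.
 */
function parseQuantity(string $input): ?int
{
    $input = trim($input);

    // Only plain ASCII digits are accepted.
    if (preg_match('/^[0-9]+$/D', $input) !== 1) {
        return null;
    }

    return (int) $input;
}

// The case you had in mind while writing the function ...
assert(parseQuantity('10') === 10);

// ... and the inputs that show up later in bug reports.
assert(parseQuantity('') === null);
assert(parseQuantity(' 7 ') === 7);
assert(parseQuantity('-1') === null);
assert(parseQuantity('1e3') === null);
```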

– Automate your tests.

Unit tests, integration tests and front-end tests only help if they are also executed, so you should deal with automated tests at an early stage and also run them automatically when the source code changes. Where and when these tests are run also determines how effective these tests ultimately are. As soon as you have written a few tests, you will understand why it is better not to use additional parameters for methods and functions, since the number of tests increases exponentially.

– Deployment is also important.

As soon as you work with more than one developer on a project, or the project becomes more complex, you want to use some kind of deployment. As in application development, the simplest solution is often a good starting point here as well: e.g. just pull the given changes into a directory and change the symlink from the existing source directory so that you can switch or roll back all files easily. PS: And you probably never need to write database-migration rollbacks, I never used them.
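
The symlink switch could be sketched like this. It is only an illustration under some assumptions: a Linux server, one directory per release, and a “current” symlink that the web server uses as document root. The function name and the directory layout are made up.

```php
<?php

declare(strict_types=1);

/**
 * Point the "current" symlink to a (new or old) release directory.
 */
function switchRelease(string $releaseDir, string $currentLink): void
{
    if (!is_dir($releaseDir)) {
        throw new RuntimeException('Release directory not found: ' . $releaseDir);
    }

    // Create the new link under a temporary name first and rename it
    // afterwards; on Linux the rename replaces the old link in one step.
    $tmpLink = $currentLink . '.tmp';
    if (is_link($tmpLink)) {
        unlink($tmpLink);
    }

    if (!symlink($releaseDir, $tmpLink) || !rename($tmpLink, $currentLink)) {
        throw new RuntimeException('Could not switch to release: ' . $releaseDir);
    }
}
```

A rollback is then just another call with the previous release directory, e.g. switchRelease('/var/www/releases/41', '/var/www/current') (paths are examples).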

– Understanding concepts is more important than implementing them.

Understanding design patterns (programming concepts) not only helps in the current programming language, but can mostly be applied to other programming languages as well.

Basic concepts (classes, objects, OOP, functions, ORM, MVC, DDD, unit tests, data binding, router, hooks, template engine, …) can be found in many frameworks / programming languages and once you have understood the terms and concepts, it is no longer that difficult to use new / different frameworks. And you can see different strengths and weaknesses of these frameworks and tools: “If you only have a hammer as a tool, you see a nail in every problem.”

– Solving problems also means understanding customers.

Design patterns are part of the basic equipment, but you should always ask yourself: Which problem is actually supposed to be solved with the given solution? If necessary, you can find an even more elegant / simpler solution. And sometimes the customer actually wants something thoroughly different, he just doesn’t know it yet or someone has misunderstood the customer.

– Solving problems also means understanding processes.

But it is just as important to understand why a certain feature is implemented; otherwise you are programming something that is either not needed or not used at all. You should therefore understand the corresponding task in its overall context before implementation and even before planning.

[Image: project-swing-tree]

– Spread code across multiple files.

Use one file for a class, use one file for CSS properties of a module or a specific page, use a new file for each new view. Dividing the source code into different files / directories offers many advantages, so the next developer knows where new source code should be stored and you can find your way around the source code faster. Many frameworks already provide a predefined directory structure.

– Readability comes first!

The readability of source code should always come first, since you or your work colleagues will have to maintain or expand this code in the future.

YouTube’s videos about “Clean Code”: https://www.youtube.com/results?search_query=%22Clean+Code%22&search_sort=video_view_count

Best Practices: http://code.tutsplus.com/tutorials/top-15-best-practices-for-writing-super-readable-code-net-8118

– Good naming is one of the most difficult tasks in programming.

It starts with the domain name / project name, goes through file names, to directory names, class names, method names, variable names, CSS class names. Always realize that others will read this and need to understand it. Therefore, you should also avoid unnecessary abbreviations and write what you want to describe.

We want to describe what the function does and not how it is implemented.

⇾ Incorrect: sendMailViaSwiftmailer(), sendHttpcallViaCurl(), …
⇾ Correct: mail->send(), http->send(), …

Variables should describe what they contain and not how they are stored.

⇾ Incorrect: $array2use, $personsArray, …
⇾ Correct: $pages, $persons, …

Summary: Describe what the variable/method/function/class is, not how it is implemented: https://github.com/kettanaito/naming-cheatsheet

Programming less bad PHP

– Save on comments (at least inline)…

Good comments explain “why” and not “what” the code is doing, and should offer the reader added value that is not already described in the source code.

Sometimes it makes sense to add some “what” comments anyway, e.g. for complicated regex or some other not optimal code that needs some hints.

Examples of inline comments:

bad code:

// Check if the user is already logged in
if (isset($_SESSION['user_loggedin']) && $_SESSION['user_loggedin'] > 1) { ... }

slightly better code:

// check if the user is already logged-in
if (session('user_loggedin') > 1) { ... }

better code:

if ($user->isLoggedin === true) { ... }

… and another example …

bad code:

// regex: email
if (!preg_match('/^(.*<?)(.*)@(.*)(>?)$/', $email)) { ... }

better code:

define('EMAIL_REGEX_SIMPLE', '/^(.*<?)(.*)@(.*)(>?)$/');

if (!preg_match(EMAIL_REGEX_SIMPLE, $email)) { ... }

– Consistency in a project is more important than personal preferences!

Use the existing code and use given functions. If it brings benefits, then change / refactor the corresponding code, but then refactor all places in the project which are implemented in this way.

Example: If you have formerly worked without a template system and would like to use one for “reasons”, then use this for all templates in the project and not just for your current use case; otherwise, inconsistencies will arise in the project. For example, if you create a new “Email→isValid()” method, then you should also replace all previous RegEx attempts in the current project with the “Email” class; otherwise inconsistencies will arise again.

Read more about the topic:

– “Be consistent [and try to automate this process, please]!” http://suckup.de/2020/01/do-not-fear-the-white-space-in-your-code/

– “Why do we write unreadable code?”
http://suckup.de/2020/01/do-not-fear-the-white-space-in-your-code/

– A uniform code style has a positive effect on quality!

As in real life, if there is already rubbish somewhere, the inhibition threshold to dump rubbish there drops extremely. But if everything looks nice and tidy, then nobody just throws a “randumInt() { return 4; }” function on the floor.

It also helps to automate some refactoring: because the code looks the same everywhere, you can apply the same tools (e.g. PHP-CS-Fixer) and do not need to think about every different coding style.

– Use functional principles & object-oriented concepts.

A pure function only depends on its parameters, and with the same parameters it always returns the same result (and it has no side effects). These principles can also be applied in OOP, e.g. by creating so-called immutable classes.

https://en.wikipedia.org/wiki/Pure_function
https://de.wikipedia.org/wiki/Object-oriented_programming
https://en.wikipedia.org/wiki/Immutable_object
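
A short sketch of both ideas in PHP. The function and the class are hypothetical examples: slugify() is pure because its result depends only on its parameter, and Money is immutable because there are no setters and a “change” returns a new object.

```php
<?php

declare(strict_types=1);

// Pure function: same input, same output, no side effects.
function slugify(string $title): string
{
    $slug = (string) preg_replace('/[^a-z0-9]+/i', '-', $title);

    return strtolower(trim($slug, '-'));
}

// Immutable class: the state is set once in the constructor.
final class Money
{
    private int $cents;
    private string $currency;

    public function __construct(int $cents, string $currency)
    {
        $this->cents = $cents;
        $this->currency = $currency;
    }

    public function add(Money $other): self
    {
        if ($other->currency !== $this->currency) {
            throw new InvalidArgumentException('Currency mismatch.');
        }

        // Return a new object instead of modifying the existing one.
        return new self($this->cents + $other->cents, $this->currency);
    }

    public function getCents(): int
    {
        return $this->cents;
    }
}
```

Both are easy to test because nothing outside the parameters and the constructor arguments can influence the result.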

– Please do not use global variables!

Global variables make testing difficult because they can cause side effects. Also, it’s difficult to refactor code with global variables because you don’t know what effects these changes will have on other parts of the system.

In some programming languages (e.g. JavaScript, Shell) variables are global by default and only become local with a certain keyword (e.g. in the scope of a function or a class).
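
A minimal PHP sketch of the difference (the function names and the tax example are made up): the first function has a hidden dependency on global state, the second one gets everything it needs as parameters.

```php
<?php

declare(strict_types=1);

// Bad: the result silently depends on a global variable that any other
// part of the system can change.
$GLOBALS['taxPercent'] = 19;

function grossPriceWithGlobal(int $netCents): int
{
    return intdiv($netCents * (100 + $GLOBALS['taxPercent']), 100);
}

// Better: the dependency is an explicit parameter, so a test can simply
// pass the value it needs.
function grossPrice(int $netCents, int $taxPercent): int
{
    return intdiv($netCents * (100 + $taxPercent), 100);
}
```

If some other code changes the global value, grossPriceWithGlobal() returns a different result for the same argument, which is exactly the side effect that makes testing and refactoring hard.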

– Learn to use your tools properly!

For example, if you’re writing code with Notepad, you might as well dig a hole with a spoon, which is just as efficient. Learn keyboard shortcuts for your programs and operating system! Use an IDE, e.g. from JetBrains (https://www.jetbrains.com/products.html) and use additional plugins and settings.

Modern IDEs also give hints/suggestions on how to improve your code. For example, for PHP, you can use PhpStorm + PHPStan and share the global IDE Inspections settings in the team.

– Performance?

In nearly every situation you don’t have to worry too much about performance, as modern programming languages / frameworks support us with common solutions; otherwise the motto is “profiling, profiling… profiling”!

– Exceptions === Exceptions

You should not use exceptions to handle normal errors. Exceptions are exceptions, and regular code handles the regular cases! “Use exceptions only in exceptional circumstances” (Pragmatic Programmers). And under almost no circumstances should you “choke off” (swallow) exceptions, e.g. by trivially catching several exceptions and ignoring them.
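
Here is a small sketch of what this can look like in PHP; both functions are hypothetical examples. A user that does not exist is a regular case and is handled with a normal return value, while broken JSON in a config is exceptional and the exception is passed on with more context instead of being swallowed.

```php
<?php

declare(strict_types=1);

/**
 * Regular case: "not found" is expected, so it is a return value.
 *
 * @param array<int, string> $userNames
 */
function findUserName(array $userNames, int $id): ?string
{
    return $userNames[$id] ?? null;
}

/**
 * Exceptional case: a broken config file should never happen.
 *
 * @return array<string, mixed>
 */
function loadConfig(string $json): array
{
    try {
        $config = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    } catch (JsonException $e) {
        // Do not swallow the exception: add context and pass it on.
        throw new RuntimeException('Invalid config: ' . $e->getMessage(), 0, $e);
    }

    if (!is_array($config)) {
        throw new RuntimeException('Invalid config: expected a JSON object.');
    }

    return $config;
}
```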

– Finish your work

You should finish what you started. For example, if you need to use “fopen()” you should also use “fclose()” in the same code block. That way, nobody in the team needs to clean up after you when they use your function.
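
One way to make sure of this in PHP is “try / finally”. The following function is a hypothetical example; the point is only that “fclose()” sits in the same block as “fopen()” and is executed even if something goes wrong in between.

```php
<?php

declare(strict_types=1);

function countLines(string $path): int
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException('Could not open file: ' . $path);
    }

    try {
        $lines = 0;
        while (fgets($handle) !== false) {
            ++$lines;
        }

        return $lines;
    } finally {
        // Always executed, even if an exception is thrown in the "try" block.
        fclose($handle);
    }
}
```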

– Source code should be searchable [Ctrl + F] …

The source code should be easy to search through, so you should avoid using string nesting + “&” with Sass, for example, and also avoid using PHP functions such as “extract()”. Whenever variables are not declared, but created as if by magic (e.g. using magic methods in PHP), it is no longer so easy to change the source code afterward.

Example in PHP: (bad)

extract(array('bar' => 'bar', 'lall' => 1));
var_dump($bar); // string 'bar' (length=3)

Example in Sass: (bad)

.teaser {
  font-size: 12px;

  &__link {
    color: rgb(210, 210, 22);
  }
}
  

Sass Nesting (code style): https://github.com/grvcoelho/css#nesting

– Program for your use case!

A big problem in programming is that you have to try to think and program in a generalized way so that you can (easily) expand the source code if new requirements are added or you can (easily) change it.

What does a project sometimes look like? → A customer orders 10,000 green apples from a farm, changes his order to 10,000 red apples the morning before delivery and when these are delivered, the customer would prefer 10,000 pears and would like to pay for them in 6 months.

And precisely for this reason you should only write the source code that is really required for the current use case, because you can’t map all eventualities anyway and the source code would only become unnecessarily complicated.

– KISS – Keep it simple, stupid.

One should always keep in mind that the source code itself is not that valuable. The value only arises when other developers understand it and can adapt / configure / use it for the customer or themselves. This should be kept in mind during programming so that a solution can be implemented as comprehensibly and “simply” as possible. And if I don’t need to use a new class or nice design pattern for the current problem, I probably shouldn’t. However, this does not mean that you should throw all good intentions overboard and use global variables / singletons everywhere. However, if a simple solution already does the job, go for that one.

A good example of what not to do is the JavaScript DOM Selector API. Not exactly nice to read or write…

Bad: (DOM Selector via JS)

document.getElementsByTagName("div")
document.getElementById("foo")
document.getElementsByClassName("bar")
document.querySelector(".foo")
document.querySelectorAll("div.bar")

Better: (DOM Selector via jQuery)

$("div")
$("#foo")
$(".bar")
$(".foo")
$("div.bar")

– DRY – Don’t Repeat Yourself.

Repetitions / redundancies in the source code or in recurring work arise relatively quickly if people do not communicate with each other. They also arise unintentionally due to errors in the software design, because you don’t have a better idea or don’t think you have time for it.

To avoid repetition, make your solution easy to find and easy to use, so that other devs will use it instead of re-creating a solution.

– The will to learn and understand something new is more important than previous knowledge.

If you can already program ActionScript (Flash), for example, but are not willing to learn something new, then previous knowledge is of no use because “The only constant in the universe is change.” – Heraclitus of Ephesus (about 540 – 480 BC).

– Read good books and magazines.

Books I have read: https://www.goodreads.com/user/show/3949219-lars-moelleken
Free Books: https://github.com/vhf/free-programming-books/blob/master/free-programming-books.md
books for programmers: http://stackoverflow.com/questions/1711/what-is-the-single-most-influential-book-every-programmer-should-read

– Follow other programmers on Twitter / GitHub / dev.to / YouTube / Medium / …

It sometimes helps to motivate yourself to write e.g. a blog post or to test some new stuff if you know some people who have the same interests. So just follow some of them online: there are many excellent developers out there, and they share their knowledge and tricks mostly for free. :)

– Listen to podcasts & subscribe to RSS feeds / newsletters & watch videos, for example from web conferences

To find out about new technologies, techniques, standards, patterns, etc., it is best to use different media, which can be consumed in different situations. An interesting podcast on “Frontend Architecture” before falling asleep or a video on “DevOps” while preparing lunch, reading a book on the tram in the morning entitled “Programming less badly” … to name just a few examples.

Podcasts: https://github.com/voku/awesome-web/blob/master/README.md#-audio-podcast
github is awesome: https://github.com/sindresorhus/awesome
and there is more: https://github.com/jnv/lists

– Attend meetups & web conferences and talk to other developers.

Meetups are groups of people who meet regularly and talk about things like Python, Scala, PHP, etc. Usually, someone gives a lecture on a previously agreed topic.

⇉ http://www.meetup.com/de-DE/members/136733532/

Web conferences are fun. Period. And every developer / admin should visit them because you get new impressions and meet wonderful people. Some conferences are expensive, but you can ask your employer; the company may cover the costs. And there are also really cheap conferences.

– Post answers at quora.com || stackoverflow.com || in forums || your blog…

To deal with a certain topic yourself and to really understand it, it is worth doing research and writing a text (possibly even a lecture) that others can read and criticize and thus improve.

– Don’t stay at work for so long every day; otherwise nobody will be waiting for you at home!

With all the enthusiasm for the “job” (even if it’s fun), you shouldn’t lose sight of the essential things. Again, something I had to learn the hard way. :-/

PHP: Code Quality with Custom Tooling Extensions

After many years of using PHPStan, PHP-CS-Fixer, PHP_CodeSniffer, … I will give you one piece of advice: add your own custom code to extend your Code-Quality-Tooling.

Nearly every project has custom code that provides the real value for the product / project, but this custom code itself is often not really improved by PHP-CS-Fixer, PHPStan, Psalm, and other tools. The tools do not know how this custom code works, so we need to write some extensions ourselves.

Example: At work, we have some Html-Form-Element (HFE) classes that use some properties from our Active Record classes, and back in the day we used strings to connect both classes. :-/

Hint: Strings are very flexible, but also awful to use programmatically in the future. I would recommend avoiding plain strings as much as possible.

1. Custom PHP-CS-Fixer

So, I wrote a quick script that will replace the strings with some metadata. The big advantage is that this custom PHP-CS-Fixer will also automatically fix code that will be created in the future, and you can apply / check this in the CI pipeline or e.g. in a pre-commit hook or directly in PhpStorm.

<?php

declare(strict_types=1);

use PhpCsFixer\Tokenizer\Analyzer\ArgumentsAnalyzer;
use PhpCsFixer\Tokenizer\Analyzer\FunctionsAnalyzer;
use PhpCsFixer\Tokenizer\Token;
use PhpCsFixer\Tokenizer\Tokens;

final class MeerxUseMetaFromActiveRowForHFECallsFixer extends AbstractMeerxFixerHelper
{
    /**
     * {@inheritdoc}
     */
    public function getDocumentation(): string
    {
        return 'Use ActiveRow->m() for "HFE_"-calls, if it is possible.';
    }

    /**
     * {@inheritdoc}
     */
    public function getSampleCode(): string
    {
        return <<<'PHP'
<?php

$element = UserFactory::singleton()->fetchEmpty();

$foo = HFE_Date::Gen($element, 'created_date');
PHP;
    }

    public function isRisky(): bool
    {
        return true;
    }

    /**
     * {@inheritdoc}
     */
    public function isCandidate(Tokens $tokens): bool
    {
        return $tokens->isTokenKindFound(\T_STRING);
    }

    public function getPriority(): int
    {
        // must be run after NoAliasFunctionsFixer
        // must be run before MethodArgumentSpaceFixer
        return -1;
    }

    protected function applyFix(SplFileInfo $file, Tokens $tokens): void
    {
        if (v_str_contains($file->getFilename(), 'HFE_')) {
            return;
        }

        $functionsAnalyzer = new FunctionsAnalyzer();

        // fix for "HFE_*::Gen()"
        foreach ($tokens as $index => $token) {
            $index = (int)$index;

            // only for "Gen()"-calls
            if (!$token->equals([\T_STRING, 'Gen'], false)) {
                continue;
            }

            // only for "HFE_*"-classes
            $object = (string)$tokens[$index - 2]->getContent();
            if (!v_str_starts_with($object, 'HFE_')) {
                continue;
            }

            if ($functionsAnalyzer->isGlobalFunctionCall($tokens, $index)) {
                continue;
            }

            $argumentsIndices = $this->getArgumentIndices($tokens, $index);

            if (\count($argumentsIndices) >= 2) {
                [
                    $firstArgumentIndex,
                    $secondArgumentIndex
                ] = array_keys($argumentsIndices);

                // If the second argument is not a string, we cannot make a swap.
                if (!$tokens[$secondArgumentIndex]->isGivenKind(\T_CONSTANT_ENCAPSED_STRING)) {
                    continue;
                }

                $content = trim($tokens[$secondArgumentIndex]->getContent(), '\'"');
                if (!$content) {
                    continue;
                }

                $newContent = $tokens[$firstArgumentIndex]->getContent() . '->m()->' . $content;

                $tokens[$secondArgumentIndex] = new Token([\T_CONSTANT_ENCAPSED_STRING, $newContent]);
            }
        }
    }

    /**
     * @param Token[]|Tokens $tokens <phpdoctor-ignore-this-line/>
     * @param int $functionNameIndex
     *
     * @return array<int, int> In the format: startIndex => endIndex
     */
    private function getArgumentIndices(Tokens $tokens, $functionNameIndex): array
    {
        $argumentsAnalyzer = new ArgumentsAnalyzer();

        $openParenthesis = $tokens->getNextTokenOfKind($functionNameIndex, ['(']);
        $closeParenthesis = $tokens->findBlockEnd(Tokens::BLOCK_TYPE_PARENTHESIS_BRACE, $openParenthesis);

        // init
        $indices = [];

        foreach ($argumentsAnalyzer->getArguments($tokens, $openParenthesis, $closeParenthesis) as $startIndexCandidate => $endIndex) {
            $indices[$tokens->getNextMeaningfulToken($startIndexCandidate - 1)] = $tokens->getPrevMeaningfulToken($endIndex + 1);
        }

        return $indices;
    }
}

To use your custom fixes, you can register and enable them: https://cs.symfony.com/doc/custom_rules.html  
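
As a rough idea of what the registration can look like, here is a minimal config sketch. The class name `HfeGenMetaFixer`, its namespace and the rule name `App/hfe_gen_meta` are assumptions made up for this example, and the `src` directory is just a placeholder:

```php
<?php
// .php-cs-fixer.dist.php (sketch)

declare(strict_types=1);

use PhpCsFixer\Config;
use PhpCsFixer\Finder;

// Hypothetical class name: it stands in for the custom fixer shown above.
use App\CodeStyle\HfeGenMetaFixer;

$finder = Finder::create()->in(__DIR__ . '/src');

return (new Config())
    // make the custom fixer known to PHP-CS-Fixer ...
    ->registerCustomFixers([new HfeGenMetaFixer()])
    ->setRules([
        '@PSR12' => true,
        // ... and enable it; the key must be the string returned by getName()
        'App/hfe_gen_meta' => true,
    ])
    ->setFinder($finder);
```

The rule key has to match the name that the fixer returns from getName(); PHP-CS-Fixer expects custom fixer names in the Vendor/rule_name format.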

Example-Result:


$fieldGroup->addElement(HFE_Customer::Gen($element, 'customer_id'));

// <- will be replaced with ->

$fieldGroup->addElement(HFE_Customer::Gen($element, $element->m()->customer_id));

Hint: There are many examples of PHP_CodeSniffer sniffs and PHP-CS-Fixer rules on GitHub; you can often pick one that covers 50-70% of your use case and then adapt it to your needs.

The “m()” method looks like this and will call the simple “ActiveRowMeta”-class. This class will return the property name itself instead of the real value.

/**
* (M)ETA
*
* @return ActiveRowMeta|mixed|static
* <p>
* We fake the return "static" here because we want auto-completion for the current properties in the IDE.
* <br><br>
* But here each property contains only its own name.
* </p>
*
* @psalm-return object{string,string}
*/
final public function m()
{
return (new ActiveRowMeta())->create($this);
}

<?php

final class ActiveRowMeta
{
/**
* @return static
*/
public function create(ActiveRow $obj): self
{
/** @var static[] $STATIC_CACHE */
static $STATIC_CACHE = [];

// DEBUG
// var_dump($STATIC_CACHE);

$cacheKey = \get_class($obj);
if (!empty($STATIC_CACHE[$cacheKey])) {
return $STATIC_CACHE[$cacheKey];
}

foreach ($obj->getObjectVars() as $propertyName => $propertyValue) {
$this->{$propertyName} = $propertyName;
}

$STATIC_CACHE[$cacheKey] = $this;

return $this;
}

}
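
To make the idea easier to try out, here is a stripped-down, self-contained sketch of the same trick. `RowMetaSketch` and `BillSketch` are simplified stand-ins invented for this example (no static cache, no real Active Record base class), not the original classes:

```php
<?php

declare(strict_types=1);

#[\AllowDynamicProperties] // dynamic properties are deprecated since PHP 8.2
final class RowMetaSketch
{
    /** Builds an object where every property holds its own name. */
    public static function fromRow(object $row): self
    {
        $meta = new self();
        foreach (array_keys(get_object_vars($row)) as $propertyName) {
            $meta->{$propertyName} = $propertyName;
        }

        return $meta;
    }
}

class BillSketch
{
    public int $bill_id = 42;

    public string $date = '2020-04-01';

    /**
     * @return static (faked, so that the IDE completes the property names)
     */
    public function m()
    {
        return RowMetaSketch::fromRow($this);
    }
}

$bill = new BillSketch();

echo $bill->date, "\n";      // the real value
echo $bill->m()->date, "\n"; // the property name: "date"
```

Because `$bill->m()->date` is real code and not a string, a rename refactoring in the IDE or a typo check in the static analysis also covers the property name.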

2. Custom PHPStan Extension

In the next step, I added a DynamicMethodReturnTypeExtension for PHPStan, so that the static code analysis knows the type of the metadata + I still have auto-completion in the IDE via phpdocs.

Note: Here I’ve also made the metadata read-only, so we can’t misuse the metadata.

<?php

declare(strict_types=1);

namespace meerx\App\scripts\githooks\StandardMeerx\PHPStanHelper;

use PhpParser\Node\Expr\MethodCall;
use PHPStan\Analyser\Scope;
use PHPStan\Reflection\MethodReflection;
use PHPStan\Type\Type;

final class MeerxMetaDynamicReturnTypeExtension implements \PHPStan\Type\DynamicMethodReturnTypeExtension
{

public function getClass(): string
{
return \ActiveRow::class;
}

public function isMethodSupported(MethodReflection $methodReflection): bool
{
return $methodReflection->getName() === 'm';
}

/**
* @var \PHPStan\Reflection\ReflectionProvider
*/
private $reflectionProvider;

public function __construct(\PHPStan\Reflection\ReflectionProvider $reflectionProvider)
{
$this->reflectionProvider = $reflectionProvider;
}

public function getTypeFromMethodCall(
MethodReflection $methodReflection,
MethodCall $methodCall,
Scope $scope
): Type
{
$exprType = $scope->getType($methodCall->var);

$staticClassName = $exprType->getReferencedClasses()[0];
$classReflection = $this->reflectionProvider->getClass($staticClassName);

return new MeerxMetaType($staticClassName, null, $classReflection);
}
}

<?php

declare(strict_types=1);

namespace meerx\App\scripts\githooks\StandardMeerx\PHPStanHelper;

use PHPStan\Reflection\ClassMemberAccessAnswerer;
use PHPStan\Type\ObjectType;

final class MeerxMetaType extends ObjectType
{

public function getProperty(string $propertyName, ClassMemberAccessAnswerer $scope): \PHPStan\Reflection\PropertyReflection
{
return new MeerxMetaProperty($this->getClassReflection());
}

}

<?php

declare(strict_types=1);

namespace meerx\App\scripts\githooks\StandardMeerx\PHPStanHelper;

use PHPStan\Reflection\ClassReflection;
use PHPStan\TrinaryLogic;
use PHPStan\Type\NeverType;
use PHPStan\Type\StringType;

final class MeerxMetaProperty implements \PHPStan\Reflection\PropertyReflection
{

private ClassReflection $classReflection;

public function __construct(ClassReflection $classReflection)
{
$this->classReflection = $classReflection;
}

public function getReadableType(): \PHPStan\Type\Type
{
return new StringType();
}

public function getWritableType(): \PHPStan\Type\Type
{
return new NeverType();
}

public function isWritable(): bool
{
return false;
}

public function getDeclaringClass(): \PHPStan\Reflection\ClassReflection
{
return $this->classReflection;
}

public function isStatic(): bool
{
return false;
}

public function isPrivate(): bool
{
return false;
}

public function isPublic(): bool
{
return true;
}

public function getDocComment(): ?string
{
return null;
}

public function canChangeTypeAfterAssignment(): bool
{
return false;
}

public function isReadable(): bool
{
return true;
}

public function isDeprecated(): \PHPStan\TrinaryLogic
{
return TrinaryLogic::createFromBoolean(false);
}

public function getDeprecatedDescription(): ?string
{
return null;
}

public function isInternal(): \PHPStan\TrinaryLogic
{
return TrinaryLogic::createFromBoolean(false);
}
}

Summary

Think about your custom code and how you can improve it: take the tools you already use and extend them so that they understand your code. Sometimes it’s easy and you only need to add some modern PHPDocs; sometimes you need to go down the rabbit hole and implement some custom stuff. But in the end it will help your software, your team and your customers.

Timeout Problems: Web Server + PHP

What?

First there is an HTTP request that hits your Web server. The Web server passes the request via FastCGI (over a TCP or Unix socket) to your PHP-FPM daemon, where a PHP process handles the request; in this process we connect e.g. to the database and run some queries.

PHP-Request

The Problem!

There are different timeout problems here because we connect different pieces together and these parts need to communicate. But what if one of the pieces does not respond in a given time or, even worse, if one process runs forever, like a bad SQL query?

Understand your Timeouts.

Timeouts limit the time that a request can run; without them an attacker could cause a denial of service with a single long-running request. But there are many configurations in several layers: Web server, PHP, application, database, curl, …

– Web server

Mostly you will use Apache or Nginx as Web server, and in the end it does not really make a difference: the timeout settings have different names, but the idea is almost the same. The Web server stops the execution and kills the PHP process, so you get a 504 HTTP error (Gateway Timeout) and you lose your stack trace and error tracking because the application was killed in the middle of its work. So, we should keep the Web server waiting as long as needed.

`grep -Ri timeout /etc/apache2/`

/etc/apache2/conf-enabled/timeout.conf:Timeout 60
/etc/apache2/mods-available/reqtimeout.conf:<IfModule reqtimeout_module>
/etc/apache2/mods-available/reqtimeout.conf: # mod_reqtimeout limits the time waiting on the client to prevent an
/etc/apache2/mods-available/reqtimeout.conf: # configuration, but it may be necessary to tune the timeout values to
/etc/apache2/mods-available/reqtimeout.conf: # mod_reqtimeout per virtual host.
/etc/apache2/mods-available/reqtimeout.conf: # Note: Lower timeouts may make sense on non-ssl virtual hosts but can
/etc/apache2/mods-available/reqtimeout.conf: # cause problem with ssl enabled virtual hosts: This timeout includes
/etc/apache2/mods-available/reqtimeout.conf: RequestReadTimeout header=20-40,minrate=500
/etc/apache2/mods-available/reqtimeout.conf: RequestReadTimeout body=10,minrate=500
/etc/apache2/mods-available/reqtimeout.load:LoadModule reqtimeout_module /usr/lib/apache2/modules/mod_reqtimeout.so
/etc/apache2/mods-available/ssl.conf: # to use and second the expiring timeout (in seconds).
/etc/apache2/mods-available/ssl.conf: SSLSessionCacheTimeout 300
/etc/apache2/conf-available/timeout.conf:Timeout 60
/etc/apache2/apache2.conf:# Timeout: The number of seconds before receives and sends time out.
/etc/apache2/apache2.conf:Timeout 60
/etc/apache2/apache2.conf:# KeepAliveTimeout: Number of seconds to wait for the next request from the
/etc/apache2/apache2.conf:KeepAliveTimeout 5
/etc/apache2/mods-enabled/reqtimeout.conf:<IfModule reqtimeout_module>
/etc/apache2/mods-enabled/reqtimeout.conf: # mod_reqtimeout limits the time waiting on the client to prevent an
/etc/apache2/mods-enabled/reqtimeout.conf: # configuration, but it may be necessary to tune the timeout values to
/etc/apache2/mods-enabled/reqtimeout.conf: # mod_reqtimeout per virtual host.
/etc/apache2/mods-enabled/reqtimeout.conf: # Note: Lower timeouts may make sense on non-ssl virtual hosts but can
/etc/apache2/mods-enabled/reqtimeout.conf: # cause problem with ssl enabled virtual hosts: This timeout includes
/etc/apache2/mods-enabled/reqtimeout.conf: RequestReadTimeout header=20-40,minrate=500
/etc/apache2/mods-enabled/reqtimeout.conf: RequestReadTimeout body=10,minrate=500
/etc/apache2/mods-enabled/reqtimeout.load:LoadModule reqtimeout_module /usr/lib/apache2/modules/mod_reqtimeout.so
/etc/apache2/mods-enabled/ssl.conf: # to use and second the expiring timeout (in seconds).
/etc/apache2/mods-enabled/ssl.conf: SSLSessionCacheTimeout 300

Here you can see all configurations for Apache2 timeouts, but we only need to change `/etc/apache2/conf-enabled/timeout.conf` because it overrides the value from `/etc/apache2/apache2.conf` anyway.

PS: Remember to reload / restart your Web server after you change the configurations.

If we want to show the user at least a custom error page, we could add something like:

ErrorDocument 503 /error.php?errorcode=503
ErrorDocument 504 /error.php?errorcode=504

… into our Apache configuration or a .htaccess file, so that we can still use PHP to show an error page, even if the requested PHP call was killed. The problem is that we lose the error message / stack trace / request etc. of the original error, and we can’t send it to our error logging system. (Take a look at Sentry, it’s really helpful.)

– PHP-FPM

Our PHP-FPM (FastCGI Process Manager) pool can be configured with a timeout (request_terminate_timeout), but just like the Web server setting, this will kill the PHP worker in the middle of the process, and we can’t handle the error in PHP itself. There is also a setting (process_control_timeout) that tells the child processes how long to wait before acting on a signal received from the parent process, but I am not sure whether it helps here. So, our error handling in PHP can’t catch / log / show the error, and we will get a 503 HTTP error (Service Unavailable) in case of a timeout.

Shutdown functions will not be executed if the process is killed with a SIGTERM or SIGKILL signal. :-/

Source: register_shutdown_function

PS: Remember to reload / restart your PHP-FPM Daemon after you change the configurations.

– PHP

The first idea for most of us would probably be to limit the PHP execution time itself and be done, but that sounds easier than it is because `max_execution_time` ignores time spent on I/O: system calls (e.g. `sleep()`) and database queries (e.g. `SELECT SLEEP(100)`). But these are the bottlenecks of nearly all PHP applications; PHP itself is fast, the externally called stuff isn’t.

The set_time_limit() function and the configuration directive max_execution_time only affect the execution time of the script itself. Any time spent on activity that happens outside the execution of the script such as system calls using system(), stream operations, database queries, etc. is not included when determining the maximum time that the script has been running. This is not true on Windows where the measured time is real.

Source: set_time_limit
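
To see why a waiting script does not hit max_execution_time, it helps to compare the wall-clock time with the CPU time that the process actually used. A minimal sketch, assuming a Linux-like system where getrusage() is available; cpuSeconds() is a helper invented for this example:

```php
<?php

declare(strict_types=1);

/** CPU seconds (user + system) consumed by the current process so far. */
function cpuSeconds(): float
{
    $usage = getrusage();

    return $usage['ru_utime.tv_sec'] + $usage['ru_utime.tv_usec'] / 1e6
        + $usage['ru_stime.tv_sec'] + $usage['ru_stime.tv_usec'] / 1e6;
}

$wallStart = microtime(true);
$cpuStart = cpuSeconds();

usleep(300000); // 0.3 s of waiting, standing in for a slow query or system call

$wallElapsed = microtime(true) - $wallStart;
$cpuElapsed = cpuSeconds() - $cpuStart;

printf("wall clock: %.3f s, CPU: %.3f s\n", $wallElapsed, $cpuElapsed);
```

The wall-clock time includes the whole waiting period, while the CPU time stays far below it, which is why, on Linux, the waiting barely counts toward max_execution_time.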

– Database (MySQLi)

Many PHP applications spend most of their time waiting for bad SQL queries where the developer forgot to add the correct indexes. And because we learned that the PHP max execution time does not cover database queries, we need one more timeout setting here.

There are the MYSQLI_OPT_CONNECT_TIMEOUT and MYSQLI_OPT_READ_TIMEOUT (Command execution result timeout in seconds. Available as of PHP 7.2.0. – mysqli.options) settings, and we can use them to limit the time for our queries.

In the end you will see an “Errno: 2006 | Error: MySQL server has gone away” error in your PHP application, but this error can be caught / reported, and the SQL query can be fixed. Otherwise Apache or PHP-FPM would kill the process, and we would not see the error because our error handler never gets a chance to run.
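
A minimal sketch of how this can look with mysqli. The function name, host, credentials and timeout values are placeholders invented for this example, and it assumes the mysqlnd driver, so treat it as a starting point and not as a finished database layer:

```php
<?php

declare(strict_types=1);

/**
 * Runs one query with a connect timeout and a read timeout.
 * Host, user, password and database name are placeholders.
 */
function queryWithTimeout(string $sql, int $readTimeoutSeconds = 240): ?mysqli_result
{
    // Throw mysqli_sql_exception instead of returning false on errors.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $link = mysqli_init();
    $link->options(MYSQLI_OPT_CONNECT_TIMEOUT, 5);
    $link->options(MYSQLI_OPT_READ_TIMEOUT, $readTimeoutSeconds);
    $link->real_connect('127.0.0.1', 'app_user', 'secret', 'app_db');

    try {
        $result = $link->query($sql);
    } catch (mysqli_sql_exception $e) {
        // A read timeout typically shows up as errno 2006 ("MySQL server has gone away").
        error_log(sprintf('Query failed (%d): %s | SQL: %s', $e->getCode(), $e->getMessage(), $sql));

        throw $e;
    }

    // query() returns true for statements without a result set.
    return $result instanceof mysqli_result ? $result : null;
}
```

The important part is that the exception is thrown inside PHP, so the normal error handler (and e.g. Sentry) still sees the failing SQL query.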

Summary:

It’s complicated. PHP is not designed for long executions and that is fine, but if you need to increase the timeout, it is more complicated than I first thought. For example, you need different “timeout” test code to trigger the different settings:

// DEBUG: long-running sql-call
// Query('SELECT SLEEP(600);');

// DEBUG: long-running system-call
// sleep(600);

// DEBUG: long-running php-call
// while (1) { } // infinite loop

Solution:

We can combine different timeouts, but keep in mind that the timeouts of the called commands (e.g. database, curl, etc.) add up with the timeout of PHP itself (max_execution_time). The timeouts of the Web server (e.g. Apache2: Timeout) and of PHP-FPM (request_terminate_timeout) need to be longer than the combined timeout of the application, so that we can still use our PHP error handler.

e.g.: ~ 5 min. timeout

  1. MySQL read timeout: 240s ⇾ 4 min.
    $link->options(MYSQLI_OPT_READ_TIMEOUT, 240);
  2. PHP timeout: 300s ⇾ 5 min.
    max_execution_time = 300
  3. Apache timeout: 360s ⇾ 6 min.
    Timeout 360
  4. PHP-FPM: 420s ⇾ 7 min.
    request_terminate_timeout = 420
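
The ordering of the example values above can be checked with a small helper, e.g. in a deployment test. This is only a sketch: the function name and the array keys are made up for illustration, and it checks the ordering only, not whether the margins are large enough for your application.

```php
<?php

declare(strict_types=1);

/**
 * Returns a list of problems; an empty list means the layers are ordered as in the example.
 *
 * @param array{db_read: int, php: int, web_server: int, php_fpm: int} $timeouts in seconds
 *
 * @return list<string>
 */
function checkTimeoutOrder(array $timeouts): array
{
    $problems = [];

    if ($timeouts['db_read'] >= $timeouts['php']) {
        $problems[] = 'The database read timeout should be shorter than max_execution_time.';
    }

    if ($timeouts['php'] >= $timeouts['web_server']) {
        $problems[] = 'max_execution_time should be shorter than the Web server timeout.';
    }

    if ($timeouts['php'] >= $timeouts['php_fpm']) {
        $problems[] = 'max_execution_time should be shorter than request_terminate_timeout.';
    }

    return $problems;
}

$problems = checkTimeoutOrder([
    'db_read'    => 240,
    'php'        => 300,
    'web_server' => 360,
    'php_fpm'    => 420,
]);

var_dump($problems === []); // bool(true)
```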

 


Prepare your PHP Code for Static Analysis

Three years ago I got a new job as a PHP developer; before that I called myself a web developer because I built ReactJS, jQuery, CSS, HTML, … and PHP stuff for a web agency. So now I am a full-time PHP developer, and I converted a non-typed (no PHPDoc + no native types) project with ~ 10,000 classes into a project with ~ 90% type coverage. Here is what I have learned.

1. Write code with IDE autocompletion support.

If you have autocompletion in the IDE most likely the Static Analysis can understand the code as well. 

Example:

bad:

->get('DB_Connection', true, false);

still bad:

->get(DB_Connection::class);

good:

getDbConnection(): DB_Connection
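
A small sketch of the difference. The `Registry` class and the `ping()` method are invented for this example; only the idea (a typed accessor instead of a string lookup) comes from the list above:

```php
<?php

declare(strict_types=1);

final class DB_Connection
{
    public function ping(): bool
    {
        return true;
    }
}

final class Registry
{
    /** @var array<string, object> */
    private array $services = [];

    public function set(string $name, object $service): void
    {
        $this->services[$name] = $service;
    }

    /** String based lookup: the declared return type is just "object". */
    public function get(string $name): object
    {
        return $this->services[$name];
    }

    /** Typed accessor: IDE and static analysis both see "DB_Connection". */
    public function getDbConnection(): DB_Connection
    {
        $service = $this->get(DB_Connection::class);
        if (!$service instanceof DB_Connection) {
            throw new LogicException('Unexpected service type for ' . DB_Connection::class);
        }

        return $service;
    }
}

$registry = new Registry();
$registry->set(DB_Connection::class, new DB_Connection());

var_dump($registry->getDbConnection()->ping()); // bool(true)
```

With `get()` the caller has to know (or guess) what comes back; with `getDbConnection()` the IDE can complete `ping()` and the static analysis can check the call.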

2. Magic in Code is bad for the long run!

Magic methods (__get, __set, …), for example, can help to implement new stuff very fast, but the problem is that nobody will understand it: you will have no autocompletion and no refactoring options, other developers will need more time to read and navigate the code, and in the end it will cost you much more time than it saved.

3. Break the spell on Magic Code …

… by explaining to everyone (Devs > IDE > Tools) what it does.

Example 1:

We use a simple Active Record Pattern, but we put all SQL stuff into the Factory classes, so that the Active Record class can be simple. (Example) But because of missing support for Generics we had no autocompletion without adding many dummy methods into the classes. So one of my first steps was to introduce a “PhpCsFixer” that automatically adds the missing methods of the parent class with the correct types via “@method”-comments into these classes. 

Example 2:

Sometimes you can use more modern PHPDocs to explain the function. Take a look at the “array_first” function in the linked Post.

Example 3:

/**
* Return an array which has the Property-Values of the given Objects as Values.
*
* @param object[] $ObjArray
* @param string $PropertyName
* @param null|string $KeyPropertyName if given uses this Property as key for the returned Array otherwise the keys from the
* given array are used
*
* @throws Exception if no property with the given name was found
*
* @return array
*/
function propertyArray($ObjArray, $PropertyName, $KeyPropertyName = null): array {
// init
$PropertyArray = [];

foreach ($ObjArray as $key => $Obj) {
if (!\property_exists($Obj, $PropertyName)) {
throw new Exception('No Property with Name ' . $PropertyName . ' in Object Found - Value');
}

$usedKey = $key;
if ($KeyPropertyName) {
if (!\property_exists($Obj, $KeyPropertyName)) {
throw new Exception('No Property with Name ' . $KeyPropertyName . ' in Object Found - Key');
}
$usedKey = $Obj->{$KeyPropertyName};
}

$PropertyArray[$usedKey] = $Obj->{$PropertyName};
}

return $PropertyArray;
}

Sometimes it’s hard to describe the specific output types, so here you need to extend your static analysis tool so that it knows what you are doing. ⇾ for example here you can find a solution for PHPStan ⇽ But there is still no support in the IDE, so maybe it’s not the best idea to use magic like that at all. And I am sure it’s simpler to use specific and simple methods instead, e.g. BillCollection->getBillDates()
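
A sketch of such a specific method. The `Bill` and `BillCollection` classes below are simplified assumptions for this example, not the real classes from the project:

```php
<?php

declare(strict_types=1);

final class Bill
{
    public int $bill_id;

    public string $date;

    public function __construct(int $billId, string $date)
    {
        $this->bill_id = $billId;
        $this->date = $date;
    }
}

final class BillCollection
{
    /** @var array<int, Bill> */
    private array $bills;

    public function __construct(Bill ...$bills)
    {
        $this->bills = $bills;
    }

    /**
     * @return array<int, string> in the format: bill_id => date
     */
    public function getBillDates(): array
    {
        $dates = [];
        foreach ($this->bills as $bill) {
            $dates[$bill->bill_id] = $bill->date;
        }

        return $dates;
    }
}

$collection = new BillCollection(new Bill(1, '2020-01-31'), new Bill(2, '2020-02-29'));

print_r($collection->getBillDates());
```

No property names are passed as strings here, so the return type can be described with a plain PHPDoc and no extension for the static analysis is needed.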

4. Try to not use strings for the code.

Strings are simple and flexible, but they are also bad for the long run. Mostly strings are used because it looks like a simple solution, but often they are redundant, you will have typos everywhere and the IDE and/or Static Analysis can’t analyze them because it’s just text.

Example:

bad: 

AjaxEditDate::generator($bill->bill_id, $bill->date, 'Bill', 'date');
  • “Bill” ⇾ is not needed here, we can call e.g. get_class($bill) in the “generator” method
  • “date” ⇾ is not needed here, we can fetch the property name from the class
  • “$bill->bill_id” ⇾ is not needed here, we can get the primary id value from the class

good:

AjaxEditDate::generator($bill, $bill->m()->date);

5. Automate stuff via git hook and check it via CI server.

Fixing bad code only pays off if you also disallow the same bad code in the future. With a pre-commit hook for git it’s simple to run these checks, but you need to run them again on the CI server because developers can simply skip the local hooks.

Example:

I introduced a check for disallowing global variables (global $foo and $GLOBALS['foo']) via “PHP_CodeSniffer”.


6. Use array shapes (or collections) if you need to return an array, please.

Array shapes are like arrays but with fixed keys, so that you can define the types of each key in the PHPDocs.

You will have autocompletion in the IDE + all other devs can see what the method will return + Static Analysis can use and check the types + you will notice when you’re better off returning an object, because it gets very hard to describe the output for complex data structures.

Example:

/**
* @return array{
* missing: array<ShoppingCartLib::TAB_*,string>,
* disabled: array<ShoppingCartLib::TAB_*,string>
* }
*/
private static function getShoppingCartTabStatus(): array {
...
}
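
Here is a filled-in variant of the example that can be executed. The two constants, the method name and the messages are invented for this sketch:

```php
<?php

declare(strict_types=1);

final class ShoppingCartLib
{
    public const TAB_ADDRESS = 'address';

    public const TAB_PAYMENT = 'payment';

    /**
     * @return array{
     *     missing: array<self::TAB_*, string>,
     *     disabled: array<self::TAB_*, string>
     * }
     */
    public static function getTabStatus(): array
    {
        return [
            'missing'  => [self::TAB_ADDRESS => 'Please enter a delivery address.'],
            'disabled' => [self::TAB_PAYMENT => 'Payment is available after the address step.'],
        ];
    }
}

$status = ShoppingCartLib::getTabStatus();

echo $status['missing'][ShoppingCartLib::TAB_ADDRESS], "\n";

// A typo like $status['mising'] is something that the static analysis can report,
// because the shape only knows the keys "missing" and "disabled".
```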

Generics in PHP via PHPDocs

If you did not know that you can use Generics in PHP or you do not exactly know how to use it or why you should use it, then the next examples are for you.

Type variables via @template

The @template tag allows classes and functions to declare a generic type parameter. The next examples start with simple functions, so that we understand how it works, and then we will see the power of this in classes.


A dummy function that will return the input.

https://phpstan.org/r/1922279b-9786-4523-939d-dddcfd4ebb86

    <?php    

    /**
     * @param \Exception $param
     * @return \Exception
     *
     * @template T of \Exception
     * @psalm-param T $param
     * @psalm-return T
     */
    function foo($param) { ... }

    foo(new \InvalidArgumentException()); // The static-analysis-tool knows that 
                                          // the type is still "\InvalidArgumentException" 
                                          // because of the type variable.

@template T of \Exception // here we create a new type variable, and we force that it must be an instance of \Exception

@phpstan-param T $param // here we say that the static-analysis-tool need to remember the type that this variable had before (you can use @psalm-* or @phpstan-* both works with both tools)

@phpstan-return T // and that the return type is the same as the input type 


A simple function that gets the first element of an array or a fallback. 

In the @param PHPDocs we write “mixed” because this function can handle different types. But this information is not very helpful if you want to understand programmatically what the code does, so we need to give the static-analysis-tools some more information. 

https://phpstan.org/r/1900a2af-f5c1-4942-939c-409928a5ac4a

    <?php
     
    /**
     * @param array<mixed> $array
     * @param mixed        $fallback <p>This fallback will be used, if the array is empty.</p>
     *
     * @return mixed|null
     *
     * @template TFirst
     * @template TFirstFallback
     * @psalm-param TFirst[] $array
     * @psalm-param TFirstFallback $fallback
     * @psalm-return TFirst|TFirstFallback
     */
    function array_first(array $array, $fallback)
    {
        $key_first = array_key_first($array);
        if ($key_first === null) {
            return $fallback;
        }

        return $array[$key_first];
    }

    $a = array_first([1, 2, 3], null);

    if ($a === 'foo') { // The static-analysis-tool knows that 
                        // === between int|null and 'foo' will always evaluate to false.
        // ...
    }

@template TFirst // we again define our type variables

@template TFirstFallback // and one more because we have two inputs where we want to keep track of the types

@psalm-param TFirst[] $array // here we define that $array is an array of TFirst types

@psalm-param TFirstFallback $fallback // and that $fallback is some other type that comes into this function

@psalm-return TFirst|TFirstFallback // now we define the return type as an element of  the $array or the $fallback type 


 Very basic Active Record + Generics

The IDE support for generics is currently not there, :-/ so that we still need some hacks (see @method) for e.g. PhpStorm to have autocompletion.

https://phpstan.org/r/f88f5cd4-1bb9-4a09-baae-069fddb10b12

https://github.com/voku/phpstorm_issue_53352/tree/master/src/Framework/ActiveRecord

<?php

class ActiveRow
{
    /**
     * @var ManagedFactory<static>
     */
    public $factory;

    /**
     * @param Factory<ActiveRow>|ManagedFactory<static> $factory
     * @param null|array                                $row
     */
    public function __construct(Factory $factory, array $row = null) {
        $this->factory = &$factory;
    }
}

/**
 * @template T
 */
abstract class Factory
{
    /**
     * @var string
     *
     * @internal
     *
     * @psalm-var class-string<T>
     */
    protected $classname;

    /**
     * @return static
     */
    public static function create() {
        return new static();
    }
}

/**
 * @template  T
 * @extends   Factory<T>
 */
class ManagedFactory extends Factory
{
    /**
     * @param string $classname
     *
     * @return void
     *
     * @psalm-param class-string<T> $classname
     */
    protected function setClass(string $classname): void
    {
        if (\class_exists($classname) === false) {
            /** @noinspection ThrowRawExceptionInspection */
            throw new Exception('TODO');
        }

        if (\is_subclass_of($classname, ActiveRow::class) === false) {
            /** @noinspection ThrowRawExceptionInspection */
            throw new Exception('TODO');
        }

        $this->classname = $classname;
    }

    // ...
}

final class Foo extends ActiveRow {

    public int $foo_id;

    public int $user_id;

    // --------------------------------------
    // add more logic here ...
    // --------------------------------------
}

/**
 * @method Foo[] fetchAll(...)
 *
 * @see Foo
 *
 * // warning -> do not edit this comment by hand, it's auto-generated and the @method phpdocs are for IDE support       
 * //         -> https://gist.github.com/voku/3aba12eb898dfa209a787c398a331f9c
 *
 * @extends ManagedFactory<Foo>
 */
final class FooFactory extends ManagedFactory
{
    // -----------------------------------------------
    // add sql stuff here ...
    // -----------------------------------------------
}

A more complex collection example.

In the end we can extend the “AbstractCollection” and the static-analysis-tools know the types of all the methods.

https://github.com/voku/Arrayy/tree/master/src/Type

/**
 * @template TKey of array-key
 * @template T
 * @template-extends \ArrayObject<TKey,T>
 * @template-implements \IteratorAggregate<TKey,T>
 * @template-implements \ArrayAccess<TKey|null,T>
 */
class Arrayy extends \ArrayObject implements \IteratorAggregate, \ArrayAccess, \Serializable, \JsonSerializable, \Countable
{ ... }

/**
 * @template TKey of array-key
 * @template T
 * @template-extends \IteratorAggregate<TKey,T>
 * @template-extends \ArrayAccess<TKey|null,T>
 */
interface CollectionInterface extends \IteratorAggregate, \ArrayAccess, \Serializable, \JsonSerializable, \Countable
{ ... }

/**
 * @template   TKey of array-key
 * @template   T
 * @extends    Arrayy<TKey,T>
 * @implements CollectionInterface<TKey,T>
 */
abstract class AbstractCollection extends Arrayy implements CollectionInterface
{ ... }

/**
 * @template TKey of array-key
 * @template T
 * @extends  AbstractCollection<TKey,T>
 */
class Collection extends AbstractCollection
{ ... }

Links:

https://phpstan.org/blog/generics-in-php-using-phpdocs

https://psalm.dev/docs/annotating_code/templated_annotations/

https://stitcher.io/blog/php-generics-and-why-we-need-them

❤️ Simple PHP Code Parser

It is based on code from “JetBrains/phpstorm-stubs”, but instead of PHP reflection we now use nikic/PHP-Parser, BetterReflection, phpDocumentor and PHPStan/phpdoc-parser internally. So, you can get even more information about the code, for example psalm- / phpstan-phpdoc annotations or inheritdoc from methods.

Install:

composer require voku/simple-php-code-parser

Link:

voku/Simple-PHP-Code-Parser



Example: get value from define in “\foo\bar” namespace

$code = '
  <?php
  namespace foo\bar;
  define("FOO_BAR", "Lall");
';

$phpCode = PhpCodeParser::getFromString($code);
$phpConstants = $phpCode->getConstants();

$phpConstants['\foo\bar\FOO_BAR']->value; // 'Lall'

Example: get information about @property phpdoc from a class

$code = '
  <?php
  /** 
   * @property int[] $foo 
   */
  abstract class Foo { }
';

$phpCode = PhpCodeParser::getFromString($code);
$phpClass = $phpCode->getClass('Foo');

$phpClass->properties['foo']->typeFromPhpDoc; // int

Example: get classes from a string (or from a class-name or from a file or from a directory)

$code = '
<?php
namespace voku\tests;
class SimpleClass {}
$obja = new class() {};
$objb = new class {};
class AnotherClass {}
';

$phpCode = \voku\SimplePhpParser\Parsers\PhpCodeParser::getFromString($code);
$phpClasses = $phpCode->getClasses();

var_dump($phpClasses['voku\tests\SimpleClass']); // "PHPClass"-object

Arrayy: A Quick Overview of map(), filter(), and reduce()

Arrayy: A PHP array manipulation library. Compatible with PHP 7+

The next examples are using the php array manipulation library “Arrayy” which is using generators internally for many operations.

https://github.com/voku/Arrayy

StringCollection::create(['Array', 'Array'])->unique()->append('y')->implode(); // Arrayy


map: transform all values in the collection

StringCollection::create(['foo', 'Foo'])->map('mb_strtoupper'); 

// StringCollection['FOO', 'FOO']

filter: pass all values to the truth test

$closure = function ($value) {
    return $value % 2 !== 0;
};
IntCollection::create([1, 2, 3, 4])->filter($closure); 

// IntCollection[0 => 1, 2 => 3]

reduce: transform all values into a new result

IntCollection::create([1, 2, 3, 4])->reduce(
    function ($carry, $item) {
        return $carry * $item;
    },
    1
); 

// IntCollection[24]
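
For comparison, here are the same three operations with PHP's native array functions (no library involved; `strtoupper` is used instead of `mb_strtoupper` only to avoid depending on the mbstring extension in this sketch):

```php
<?php

declare(strict_types=1);

// map: transform all values
$upper = array_map('strtoupper', ['foo', 'Foo']);
// ['FOO', 'FOO']

// filter: keep only the values that pass the truth test (keys are preserved)
$odd = array_filter([1, 2, 3, 4], static fn (int $value): bool => $value % 2 !== 0);
// [0 => 1, 2 => 3]

// reduce: fold all values into one result
$product = array_reduce([1, 2, 3, 4], static fn (int $carry, int $item): int => $carry * $item, 1);
// 24

var_dump($upper, $odd, $product);
```

The native functions work fine, but the argument order differs between them (callback first for array_map, array first for array_filter and array_reduce) and the calls cannot be chained, which is one reason to wrap them in a collection class.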

How to write readable code?

Why do we write unreadable code?

I think we are mostly lazy, and so we use names from underlying structures like APIs or the databases, instead of using names that describe the current situation.

For example, the method “Cart->getUser(): User”: There is a “user_id” in the underlying database table, so the developer chooses the name “getUser()”, which returns the User. The problem is that “getX” and “setX” are terrible names for methods because they do not tell you anything about the context of these methods. Maybe something like “Cart->orderUser(): OrderUser” and at some point “Cart->orderApprovalUser(): OrderApprovalUser” are better names. But keep in mind that this depends on your use cases. And please also rename e.g. your “user_id” in the cart database table, so that you keep the code consistent and readable.

In Eric Evans’ book Domain-Driven Design he introduces the concept of a “Ubiquitous Language”: a shared, common vocabulary that the entire team uses when discussing or writing software. This “entire team” is made up of designers, developers, the product owner and any other domain experts that exist at the organization. And note that it is important that your team has a domain expert! Mostly we are not the experts in the field we are writing software for, so we need some professional input. If you build an invoice system, you will need an expert in financial questions and laws.


if ()


# BAD:
if ($userExists == 1) { }

# BETTER: use "==="
if ($userExists === 1) {}

# BETTER: use the correct type
if ($userExists === true) {}

# BETTER: use something readable
if ($this->userExists()) {}

# BAD:
$foo = 3;
if ($lall === 2) { $foo = 2; }

# BETTER: use if-else
if ($lall === 2) { 
    $foo = 2; 
} else {
    $foo = 3;
}

# BETTER: only for simple if-else
$foo = ($lall === 2) ? 2 : 3;

# BETTER: use something readable
$foo = $this->foo($lall);

# BAD:
if ($foo && $foo < 20 && $foo >= 10 || $foo === 100) {}

# BETTER: use separate lines
if (
    ($foo && $foo < 20 && $foo >= 10) 
    || 
    ($foo && $foo === 100)
) {}

# BETTER: avoid duplications
if (
    $foo
    &&
    (
        ($foo >= 10 && $foo < 20) 
        || 
        $foo === 100
    )
) {}

# BETTER: use something readable
if ($this->isFooCorrect($foo)) {}
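
What such a method could look like: a small sketch where the class name `FooValidator` and the local variable names are invented for this example.

```php
<?php

declare(strict_types=1);

final class FooValidator
{
    /**
     * Valid values: 10 to 19 (inclusive) or exactly 100.
     */
    public function isFooCorrect(int $foo): bool
    {
        $isInRange = $foo >= 10 && $foo < 20;
        $isSpecialValue = $foo === 100;

        return $isInRange || $isSpecialValue;
    }
}

$validator = new FooValidator();

var_dump($validator->isFooCorrect(15));  // bool(true)
var_dump($validator->isFooCorrect(100)); // bool(true)
var_dump($validator->isFooCorrect(20));  // bool(false)
```

With the `int` parameter type, the leading `$foo &&` check from the versions above is no longer needed, because every value that passes one of the two conditions is truthy anyway.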

loop ()


# BAD
foreach ($cartArticles as $key => $value)

# BETTER: use correct names
foreach ($cartArticles as $cartIndex => $cartArticle)

# BETTER: move the loop into a method
cart->getArticles(): Generator<int, Article>

# BAD
foreach ($cartArticles as $cartArticle) {
    if ($cartArticle->notActive) {
        $this->removeArticle($cartArticle);
    }
}

# BETTER: use continue
foreach ($cartArticles as $cartArticle) {
    if ($cartArticle->active) {
        continue;
    }

    $this->removeArticle($cartArticle);
}

# BETTER: use something readable
cart->removeNonActiveArticles(): int;
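
A possible implementation of that method, as a sketch: the `Article` and `Cart` classes are simplified assumptions, and the returned int is interpreted here as the number of removed articles.

```php
<?php

declare(strict_types=1);

final class Article
{
    public string $name;

    public bool $active;

    public function __construct(string $name, bool $active)
    {
        $this->name = $name;
        $this->active = $active;
    }
}

final class Cart
{
    /** @var array<int, Article> */
    private array $articles;

    public function __construct(Article ...$articles)
    {
        $this->articles = $articles;
    }

    /**
     * @return int the number of removed articles
     */
    public function removeNonActiveArticles(): int
    {
        $countBefore = \count($this->articles);

        $this->articles = array_filter(
            $this->articles,
            static fn (Article $article): bool => $article->active
        );

        return $countBefore - \count($this->articles);
    }
}

$cart = new Cart(new Article('Book', true), new Article('Old pen', false));

var_dump($cart->removeNonActiveArticles()); // int(1)
```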

method ( )


# BAD
thisMethodNameIsTooLongOurEyesCanOnlyReadAboutFourCharsAtOnce(bool $pricePerCategory = false): float|float[]

# BETTER: use shorter names if possible (long names usually indicate that the method does more than one thing)
cartNetPriceSum(bool $pricePerCategory = false): float|float[]

# BETTER: use less parameter and use one return type
cartNetPriceSum(): float
cartNetPriceSumPerCategory(): float[]

class ()


# BAD: 
class \shop\InvoicePdfTemplate;

class \shop\InvoicePdfTemplateNew extends \shop\InvoicePdfTemplate;

class \shop\InvoicePdfTemplateSpecial extends \shop\InvoicePdfTemplateNew;


# BETTER: use non-generic class names for non-generic classes
class \shop\InvoicePdfTemplate;

class \shop\PdfTemplateCustomerX extends \shop\InvoicePdfTemplate;

class \shop\PdfTemplateActionEaster2020 extends \shop\PdfTemplateCustomerX;


# BETTER: do not use multi-level inheritance, because of side-effects
class \shop\invoice\PdfTemplateGeneric;

class \shop\invoice\PdfTemplate extends \shop\invoice\PdfTemplateGeneric;

class \shop\invoice\PdfTemplateCustomerX extends \shop\invoice\PdfTemplateGeneric;

class \shop\invoice\PdfTemplateActionEaster2020 extends \shop\invoice\PdfTemplateGeneric;


# BETTER: use composition via interfaces
interface \shop\invoice\PdfTemplateInterface;

class \shop\invoice\PdfTemplateGeneric implements \shop\invoice\PdfTemplateInterface;

class \shop\invoice\PdfTemplate extends \shop\invoice\PdfTemplateGeneric;

class \shop\invoice\PdfTemplateCustomerX extends \shop\invoice\PdfTemplateGeneric;

class \shop\invoice\PdfTemplateActionEaster2020 implements \shop\invoice\PdfTemplateInterface;


# BETTER: use abstract or final
interface \shop\invoice\PdfTemplateInterface;

abstract class \shop\invoice\PdfTemplateGeneric implements \shop\invoice\PdfTemplateInterface;

final class \shop\invoice\PdfTemplate extends \shop\invoice\PdfTemplateGeneric;

final class \shop\invoice\PdfTemplateCustomerX extends \shop\invoice\PdfTemplateGeneric;

final class \shop\invoice\PdfTemplateActionEaster2020 implements \shop\invoice\PdfTemplateInterface;
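The class list above only shows the declarations, so here is a small runnable sketch of the last step. The method names "render()" and "header()" are made up for this example, and I left out the namespace to keep it short.

```php
<?php
declare(strict_types=1);

interface PdfTemplateInterface
{
    public function render(string $invoiceNumber): string;
}

// abstract: shared logic lives here, but nobody can instantiate it directly
abstract class PdfTemplateGeneric implements PdfTemplateInterface
{
    public function render(string $invoiceNumber): string
    {
        return $this->header() . ' | Invoice ' . $invoiceNumber;
    }

    abstract protected function header(): string;
}

// final: nobody can extend it, so there is no hidden third inheritance level
final class PdfTemplate extends PdfTemplateGeneric
{
    protected function header(): string
    {
        return 'Shop';
    }
}

final class PdfTemplateCustomerX extends PdfTemplateGeneric
{
    protected function header(): string
    {
        return 'Customer X';
    }
}

// composition: wraps any other template instead of extending one
final class PdfTemplateActionEaster2020 implements PdfTemplateInterface
{
    private PdfTemplateInterface $inner;

    public function __construct(PdfTemplateInterface $inner)
    {
        $this->inner = $inner;
    }

    public function render(string $invoiceNumber): string
    {
        return $this->inner->render($invoiceNumber) . ' | Easter 2020';
    }
}

$template = new PdfTemplateActionEaster2020(new PdfTemplateCustomerX());
echo $template->render('R-1') . PHP_EOL;
```

The Easter template can now be combined with every customer template, without a new subclass for each combination.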

Real-World-Example

https://github.com/woocommerce/woocommerce/blob/master/includes/class-wc-coupon.php#L505


// BAD:
/**
 * Set amount.
 *
 * @since 3.0.0
 * @param float $amount Amount.
 */
public function set_amount( $amount ) {
    $amount = wc_format_decimal( $amount );

    if ( ! is_numeric( $amount ) ) {
        $amount = 0;
    }

    if ( $amount < 0 ) {
        $this->error( 'coupon_invalid_amount', __( 'Invalid discount amount', 'woocommerce' ) );
    }

    if ( 'percent' === $this->get_discount_type() && $amount > 100 ) {
        $this->error( 'coupon_invalid_amount', __( 'Invalid discount amount', 'woocommerce' ) );
    }

    $this->set_prop( 'amount', $amount );
}

// BETTER: fix phpdocs
/**
 * @param float|string $amount Expects either a float or a string with a decimal separator only (no thousands).
 *
 * @since 3.0.0
 */
public function set_amount( $amount ) {
    $amount = wc_format_decimal( $amount );

    if ( ! is_numeric( $amount ) ) {
        $amount = 0;
    }

    if ( $amount < 0 ) {
        $this->error( 'coupon_invalid_amount', __( 'Invalid discount amount', 'woocommerce' ) );
    }

    if ( 'percent' === $this->get_discount_type() && $amount > 100 ) {
        $this->error( 'coupon_invalid_amount', __( 'Invalid discount amount', 'woocommerce' ) );
    }

    $this->set_prop( 'amount', $amount );
}

// BETTER: use php types (WARNING: this is a breaking change because we do not allow string as input anymore)
/**
 * @since 3.0.0
 */
public function set_amount( float $amount ) {
    if ( $amount < 0 ) {
        throw new WC_Data_Exception( 'coupon_invalid_amount', __( 'Invalid discount amount', 'woocommerce' ) );
    }

    if (
        $amount > 100
        &&
        'percent' === $this->get_discount_type()
    ) {
        throw new WC_Data_Exception( 'coupon_invalid_amount', __( 'Invalid discount amount', 'woocommerce' ) );
    }

    $this->set_prop( 'amount', $amount );
}

// BETTER: merge the logic
/**
 * @since 3.0.0
 */
public function set_amount( float $amount ) {
    if (
        $amount < 0
        ||
        (
            $amount > 100
            &&
            'percent' === $this->get_discount_type()
        )
    ) {
        throw new WC_Data_Exception( 'coupon_invalid_amount', __( 'Invalid discount amount', 'woocommerce' ) );
    }

    $this->set_prop( 'amount', $amount );
}

// BETTER: make the logic readable
/**
 * @since 3.0.0
 */
public function set_amount( float $amount ) {
    if (! $this->is_valid_amount($amount)) {
        throw new WC_Data_Exception( 'coupon_invalid_amount', __( 'Invalid discount amount', 'woocommerce' ) );
    }

    $this->set_prop( 'amount', $amount );
}

private function is_valid_amount( float $amount ): bool {
    if ($amount < 0) {
        return false;
    }

    if (
        $amount > 100
        &&
        'percent' === $this->get_discount_type()
    ) {
        return false;
    }

    return true;
}

Should we write code that everybody can read?

I don’t think that non-programmers need to read our code, so we do not have to write it for them. But we should remember that we could write our code in a way that even a non-programmer could read it.

Summary

There are only two hard things in Computer Science: cache invalidation and naming things.

― Phil Karlton

“Indeed, the ratio of time spent reading versus writing is well over 10 to 1. We are constantly reading old code as part of the effort to write new code. …[Therefore,] making it easy to read makes it easier to write.”

― Robert C. Martin, Clean Code: A Handbook of Agile Software Craftsmanship

Modern PHPDoc Annotations

We will start very simple with PhpStorm and default PHPDoc, then we will increase the complexity step by step until we have auto-completion for array keys directly from the database with generics, immutable and type safety support.

1.0 PhpStorm & auto-generate PHPDoc blocks

„For documentation comments, PhpStorm provides completion that is enabled by default. PhpStorm creates stubs of „PHPDoc blocks“ when you type the /** opening tag and press Enter, or press Alt+Insert and appoint the code construct (a class, a method, a function, and so on) to document. Depending on your choice, PhpStorm will create the required tags or add an empty documentation stub.“

https://www.jetbrains.com/help/phpstorm/phpdoc-comments.html

Code:

/**
 * @param array $row
 *
 * @return array
 */
abstract function formatRow(array $row): array;

1.1 Return $this|static|self

It‘s quite annoying that PHP itself currently only has „self“ as a return type for the current class (https://wiki.php.net/rfc/static_return_type). Because of „late static binding“ you can use „static“ in your code to refer to the class a method was actually called on, even if the method is inherited. But in PHPDoc you can already use:

  • @return $this: if you really return $this (e.g. for fluent interface)
  • @return static: refer to the class a method was actually called on
  • @return self: refer to the class a method was written in

Code:

/**
 * @return static
 */
abstract function getFoo(): self;
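To see the difference at runtime, here is a small sketch. The classes "Model" and "User" are hypothetical and only used for this illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical base class, only for illustration.
class Model
{
    /**
     * @return static
     */
    public static function create(): self
    {
        // late static binding: the class the method was called on
        return new static();
    }

    /**
     * @return self
     */
    public static function createBase(): self
    {
        // always the class the method was written in
        return new self();
    }

    /**
     * @return $this
     */
    public function touch(): self
    {
        // fluent interface: the very same object
        return $this;
    }
}

final class User extends Model
{
}

$user = User::create();     // User
$base = User::createBase(); // Model

echo get_class($user) . ' / ' . get_class($base) . PHP_EOL;
```

The native return type is „self“ in all three cases, only the PHPDoc tells the IDE and the static analysis that "User::create()" really returns a "User".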

1.2 New (and not that new) Array Syntax

PhpStorm, PHPStan and Psalm support some new (and some not that new) array syntax for PHPDoc types, but for now PhpStorm will not auto-generate these types.

Examples:

  • int[]: an array with only INT values – [1, 4, 6, 8, 9, …]
  • array<int, int>: an array with INT as key and INT as value – [4 => 1, 8 => 4, 12 => 6, …]
  • string[]: an array with only STRING values – [„foo“, „bar“, …]
  • array<int, string>: an array with INT as key and STRING as value – [4 => „foo“, 8 => „bar“, …]
  • Order[]: an array with only „Order“-Object values – [Order, Order, …]
  • array<int|string, Order>: an array with INT or STRING as key and „Order“-Object values – [4 => Order, ‘foo‘ => Order, …]
  • array<int|string, mixed>: an array with INT or STRING as key and mixed as values – [1 => 1, 4 => „foo“, 6 => \stdClass, …]
  • array<int, array<int, string>>: an array with INT as key and an array (with INT as key and STRING as value) as value – [1 => [1 => „foo“], 4 => [1 => „bar“], …]
  • array<int, string[]>: an array with INT as key and an array with only STRING values as value – [1 => [„foo“, „lall“], 4 => [„öäü“, „bar“], …]
  • array{output: string, debug: string}: an array with the keys “output” and “debug”, both values are STRING values – [‘output’ => ‘foo’, ‘debug’ => ‘bar’]
  • array<int, array{output: string, debug: string}>: an array with INT as key and, as value, an array with the keys “output” and “debug” (both STRING values) – [1 => [‘output’ => ‘foo’, ‘debug’ => ‘bar’], 3 => [‘output’ => ‘foo’, ‘debug’ => ‘bar’], …]

Examples (@psalm-* || @phpstan-*): PHPStan can also use “psalm-*” prefixed annotations and Psalm understands “phpstan-*” annotations.

  • list<array{output: string, debug: string}>: a list (keys 0, 1, 2, …) where every value is an array with the keys “output” and “debug” (both STRING values) – [0 => [‘output’ => ‘foo’, ‘debug’ => ‘bar’], 1 => [‘output’ => ‘foo’, ‘debug’ => ‘bar’], …]

list: represents continuous, integer-indexed arrays (always start from index zero) like: [“red”, “yellow”, “blue”] 
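A short sketch of how these types look on a real function: the input uses arbitrary INT keys, the output is a list of array shapes. The function "runAll()" is made up for this example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, only for illustration.
 *
 * @param array<int, string> $commands
 *
 * @return array<int, array<string, string>>
 *
 * @psalm-return list<array{output: string, debug: string}>
 */
function runAll(array $commands): array
{
    $results = [];
    foreach ($commands as $commandId => $command) {
        // "[]" appends, so the result keys are always 0, 1, 2, ...
        $results[] = [
            'output' => strtoupper($command),
            'debug'  => 'command #' . $commandId,
        ];
    }

    return $results;
}

$results = runAll([4 => 'foo', 8 => 'bar']);
```

With the array shape in place, a typo like "$results[0]['ouput']" can be reported by the static analysis instead of failing at runtime.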

Live-Examples:

Psalm: https://psalm.dev/r/922d4ba5b1

PHPStan: https://phpstan.org/r/ce657ef4-9f18-46a1-b21a-e51e3a0e6d2d

Code:

/**
 * @param array<int|string, mixed> $row
 *
 * @return array<int|string, mixed>
 */
abstract function formatRow(array $row): array;

PhpStorm support?: Sadly, PhpStorm does not have good support for these types, so you often have to add „@psalm-*“ PHPDoc comments. For example, PhpStorm will accept “array<int, Order>” without really understanding it, so you need to add both “@param Order[] $order” and “@psalm-param array<int, Order> $order”.

Examples for PhpStorm + PHPStan || Psalm:

/**
 * @param Order[] $order
 * @psalm-param array<int, Order> $order
 *
 * @return void
 */
public function fooOrder($order): void { ... }

// you could also use "..." here

/**
 * @param Order ...$order
 *
 * @return void
 */
public function fooOrder(Order ...$order): void { ... }

/**
 * @param int $foo_id
 *
 * @return Foo[]|Generator
 * @psalm-return Generator&iterable<Foo>
 */
abstract function fetchYieldByFoo($foo_id): Generator;

1.3 Dynamic Autocompletion (+ data from your database) via deep-assoc-completion

If you have a method such as “formatRow($row)”, you can combine “getFieldArray()[0]” (data from the database: you have to connect the IDE to your database, and your queries need to be analyzable by PhpStorm) with static data from “getHeaderFieldArray()”, so that you get auto-completion from different sources.

Code:

/**
 * @param array<int|string, mixed> $row = $this->getFieldArray()[0] + $this->getHeaderFieldArray()
 *
 * @return array<int|string, mixed>
 */
abstract function formatRow(array $row): array;

more information + examples: https://github.com/klesun/deep-assoc-completion

1.4 Immutability Check via Static Code Analysis (via Psalm)

And there is even more. :) You can add PHPDoc annotation that will check if you really use immutable classes or at least methods. Please read more here: https://psalm.dev/articles/immutability-and-beyond

Code:

/**
 * @param array<int|string, mixed> $row = $this->getFieldArray()[0] + $this->getHeaderFieldArray()
 *
 * @return array<int|string, mixed>
 *
 * @psalm-mutation-free
 */
abstract function formatRow(array $row): array;

Live-Example:

– Psalm: https://psalm.dev/r/5bac0a9a07
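For whole classes Psalm also knows the „@psalm-immutable“ annotation. Here is a minimal sketch with a hypothetical "Money" value object: every change returns a new object, and Psalm can report code that tries to modify the properties after construction.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical value object, only for illustration.
 *
 * @psalm-immutable
 */
final class Money
{
    public int $cents;

    public function __construct(int $cents)
    {
        $this->cents = $cents;
    }

    /**
     * Does not change $this, it returns a new instance instead.
     */
    public function add(Money $other): self
    {
        return new self($this->cents + $other->cents);
    }
}

$price = new Money(100);
$total = $price->add(new Money(50));

echo $price->cents . ' / ' . $total->cents . PHP_EOL;
```

PHP itself will not stop you from writing "$price->cents = 0;" here, the check only happens in the static analysis.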

1.5 Generics in PHP via Static Code Analysis

We can also use Generics via code annotations. PHPStan & Psalm both support it, but Psalm’s support is more feature complete and both tools can use the „@psalm-“-syntax. Here are some simple examples.

array_last: Will return the last array element from the $array (type: TLast) or the $fallback (type: TLastFallback). We tell the function that the types come from the input parameters: $array is an array of TLast and $fallback is a TLastFallback.

/**
 * @param array<mixed> $array
 * @param mixed             $fallback <p>This fallback will be used, if the array is empty.</p>
 *
 * @return mixed|null
 *
 * @template TLast
 * @template TLastFallback
 * @psalm-param TLast[] $array
 * @psalm-param TLastFallback $fallback
 * @psalm-return TLast|TLastFallback
 */
function array_last(array $array, $fallback = null)
{
    $key_last = \array_key_last($array);
    if ($key_last === null) {
        return $fallback;
    }
    return $array[$key_last];
}

array_first: Will return the first array element from the $array (type: TFirst) or the $fallback (type: TFirstFallback). We tell the function that the types come from the input parameters: $array is an array of TFirst and $fallback is a TFirstFallback.

/**
 * @param array<mixed> $array
 * @param mixed             $fallback <p>This fallback will be used, if the array is empty.</p>
 *
 * @return mixed|null
 *
 * @template TFirst
 * @template TFirstFallback
 * @psalm-param TFirst[] $array
 * @psalm-param TFirstFallback $fallback
 * @psalm-return TFirst|TFirstFallback
 */
function array_first(array $array, $fallback = null)
{
    $key_first = array_key_first($array);
    if ($key_first === null) {
        return $fallback;
    }
    return $array[$key_first];
}

So we can define „Templates“ and map input arguments onto those types. This can get even more complex if you use it in a class context and map the „Templates“ onto class properties, but the logic stays the same.

Here is a more complex example: https://github.com/voku/Arrayy/blob/master/src/Collection/CollectionInterface.php
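And here is a much smaller sketch of the class context. The class "TypedCollection" is made up for this example (it is not the Arrayy API): the template „T“ is bound once via the constructor and then reused for the property, the parameter and the return type.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical collection, only for illustration.
 *
 * @template T of object
 */
final class TypedCollection
{
    /**
     * @var string
     * @psalm-var class-string<T>
     */
    private string $type;

    /**
     * @var object[]
     * @psalm-var list<T>
     */
    private array $items = [];

    /**
     * @param string $type
     * @psalm-param class-string<T> $type
     */
    public function __construct(string $type)
    {
        $this->type = $type;
    }

    /**
     * @param object $item
     * @psalm-param T $item
     */
    public function add($item): void
    {
        // runtime check, so the PHPDoc promise also holds without static analysis
        if (!($item instanceof $this->type)) {
            throw new InvalidArgumentException('Expected an instance of ' . $this->type);
        }

        $this->items[] = $item;
    }

    /**
     * @return object|null
     * @psalm-return T|null
     */
    public function first()
    {
        return $this->items[0] ?? null;
    }
}

$dates = new TypedCollection(DateTimeImmutable::class);
$dates->add(new DateTimeImmutable('2020-01-01'));
```

For the static analysis "$dates->first()" is now "DateTimeImmutable|null" and not just "object|null".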

PhpStorm support?: Nope, sadly we need to hack this via „PHPSTORM_META“, so here is an example:

  • override(\array_filter(0), type(0)); // suppose first parameter type is MyClass[] then return type of array_filter will be MyClass[]
  • override(\array_reduce(0), elementType(0)); // suppose first parameter type is MyClass[] then return type of array_reduce will be MyClass

Read more here:

2.0 Summary

It‘s not perfect, and type checks and auto-completion only via PHPDoc are not really what I expected for the year 2020. But it‘s working and I hope PhpStorm will bring more support for the new type annotations in the future.

More Links: