.promptrc

Steal these hints. Fork them. Adapt them to your needs. Treat them like dotfiles.


I asked the LLM to analyze my own chat history based on this prompt.
Here’s what it discovered – actual patterns from the way I ask.

🕳️🐇 Follow the White Rabbit…

| Ritual / Mechanism | Purpose | LLM Behavior Implication |
|---|---|---|
| Follow the white rabbit... | Marks a mode shift into high-context or metaphoric thinking | Cognitive priming for complexity |
| Rest in the silent room, so that... | Enforces pause, clears noise before action | Simulates deep work state |
| Do it until you reach the point of self-discovery as... | Signals reflective closure, not just output ending | Pattern mining becomes part of task conclusion |
| Do it step by step and ask for confirmation after each... | Makes iteration transparent and traceable | LLM reasons in deltas, not blobs |
| Be brutally honest... / Blind Spot Analysis for... | Forces critique over compliance | Model becomes adversarial partner, not affirmation loop |

🧰 Dev Prompt Patterns

| Prompt / Pattern | Why is it used? | When does it occur? | Example from usage | Hidden lesson / implication |
|---|---|---|---|---|
| Ask before output | Prevent misalignment and irrelevant output | Multi-step or underspecified prompts | "Ask clarifying questions before answering." | Intent beats guesswork. |
| Don't output yet / wait for context | Control flow across longer workflows | Stepwise tasks | "Don't output yet. I'll give you the next step." | Turn-based prompting prevents premature commitment. |
| Challenge my assumptions | Avoid echo chamber answers and surface bias | Design reviews, audits, strategic decisions | "Don't mirror me — challenge my thinking." | Truth hides behind agreement. |
| Be brutally honest | Forces raw feedback without politeness | Refactor reviews, architecture critique | "Be brutally honest. Tear it apart." | Feedback without fluff moves faster. |
| Reflect before answering | Promotes self-checks, depth, and delayed output | After complex code or reasoning generation | "Reflect before answering. What's missing?" | Thinking ≠ typing. Pause matters. |
| Add test cases / edge cases | Enforces robustness and avoids happy-path traps | Post-codegen | "Add tests for e.g. null, failure, and recursion cases." | Defense-first mindset, always. |
| Show the diff / refactor in steps | Makes changes visible and digestible | All code rewrites | "Show the diff. Step-by-step, no jumps." | Transparency over magic. |
| Normalize similar expressions | Pushes abstraction and clarity | Meta-reviews, taxonomy creation | "Merge similar phrasing into one normalized pattern." | Cognitive compression = clarity. |
| Extract as markdown / table / list | Improves scanability, memory, and structure | Output formatting | "Return this as a markdown table." | Structure improves reuse and recall. |
| Unname this concept | Strips bias-inducing labels | Abstraction, philosophy, onboarding analysis | "Unname this: what is it without the buzzword?" | Naming narrows thinking. |
| Use production-ready code | Avoids toy/demo examples | All codegen | "Make it prod-safe. Logging, errors, types." | Real devs write for prod, not playgrounds. |
| Spot premature optimization | Saves time and prevents complexity drift | Design or early performance tweaks | "Don't optimize yet. Solve clearly first." | Simplicity first. Always. |
| Ask for sources / proofs | Prevents hallucination or empty confidence | Any non-trivial claim | "Show evidence or references." | Confidence ≠ correctness. |
| Do it again, but deeper | Stops shallow answers in their tracks | Weak initial output | "Nope. Go deeper, explain decisions." | First try ≠ final draft. |
| Prepare before generating | Enforces scope, prevents rambling | Any open-ended task | "Prepare first. Don't generate until scoped." | Planning ≠ waste. It's speed insurance. |
| Merge context from above | Ensures continuity and avoids repeating yourself | Multi-part workflows | "Incorporate the context above into this next step." | Memory = leverage. |

You can also combine them:

(change the keywords in the square brackets)

  • Deep dive into this research; this is our base for the full solution, so follow the white rabbit until you reach the point of self-discovery as [YOUR_PROJECT_HERE].
  • Do a blind spot analysis for [YOUR_RECOMMENDATIONS], be brutally honest, I can deal with any kind of feedback and will use it for good.
  • Fix it as requested before and show the final files here in the chat, do it step by step and ask for confirmation after each file.
  • Do it, but rest in the silent room before you start so you can focus on the frontend style-guide I provided and work with a fresh mind.

My Custom GPTs – Nerd-Powered Motivation for Developers


Over the last few months, I’ve created a collection of custom GPTs: some tackle programming challenges with personality and humor, others are more useful but less funny.

Let’s dive in.

Practical enough to ship code.

Fun enough to stop hating your legacy codebase.


⚔️ Legacy (PHP) Code GPTs – Refactoring Fun

Legacy code isn’t just technical — it’s emotional. These GPTs are built as archetypes, each channeling a different kind of energy.

| Name | Theme | Link |
|---|---|---|
| Legacy-Code-Warrior ⚔️ | Tactical grit — battle-tested refactoring. | Link |
| Legacy-Code-Ork 🧌 | Smash spaghetti code with brute-force enthusiasm. | Link |
| Legacy-Code-Spock 🖖 | Calm logic, precise refactoring. Live long and debug. | Link |
| Legacy-Code-Jedi 🌐 | Minimalist wisdom, clean architecture. “Refactor, you must.” | Link |
| Legacy-Code-Son-Goku 🐉 | Limitless energy. Kaio-Ken times SOLID! | Link |
| Legacy-Code-Capsule-Corp 💊 | Inspired by Capsule Corporation’s ingenuity from Dragon Ball. | Link |
| Legacy-Code-Wizzard 🪄 | Magical abstraction powers. You shall not pass… bad code! | Link |
| Legacy-Code-Witch 🧙‍♀️ | Stylish, precise refactoring incantations. | Link |
| Paw Patrol 🐾 | Small dogs with SOLID coding skills. | Link |

Use the one that fits your mood. Or switch between them mid-session to keep your motivation from flatlining.


🐘 (PHP) Coding GPTs – Clean and Typed

These GPTs don’t tell jokes—they ship code. They’re optimized for:

| Name | Purpose | Link |
|---|---|---|
| PHP Copilot++ | Typing enforcer + refactoring companion with native PHPStan and PHP-CS-Fixer support via API. | Link |
| PHP Copilot++ (next-gen) | Aligned, brutal clarity for PHP systems, based on the SYNC Framework. | Link |
| PHP #autofix | 1-click autofix for all your PHPStan and CS woes. | Link |
| Codelight | Follows the Codelight Manifesto. Boring code, by design. | Link |

💎 Thinking Tools – Meta, Prompt Systems

These are not just for coding. They’re for thinking before you start typing. Seriously.

| Name | Role | Link |
|---|---|---|
| SyncAI | Keeps you + AI in sync via Sync Process × Codelight Principles | Link |
| Sync Framework v1.1 (old) | My first try at a coding framework, optimized for LLMs. | Link |
| MetaPrompt | Pattern reuse for your prompts. Less yak-shaving. | Link |
| DeepDive | Clean your mental cache. Focused thought flow. | Link |
| Blind Spot \| Prompt Generator | Helps spot untested assumptions. | Link |
| Sync Framework v1.2 \| Prompt Generator | Prompt builder for dev workflows. | Link |

🧨 Disruption GPTs – Radical Clarity, No Filters

These are not nice. They won’t coddle you. Consider yourself warned.

| Name | Function | Link |
|---|---|---|
| HVEB5000: Clarity Without Permission | Cognitive demolition tool. | Link |
| Null Tongue | Distraction nullifier. | Link |
| No-Bullshit ¦ Coding Assistant | Senior dev with no time for your excuses. | Link |

From Survival to Strategy

If your value ends at syntax, AI already replaced you.


The system prompt: coding_workflow_for_llms.json

Quick Start: Use the coding LLM framework and wait for my first request: [copy & paste the current coding_workflow_for_llms.json content here]


In the last post, we dropped a hard truth:
LLMs aren’t replacing developers — they’re exposing the ones who were already replaceable.

I argued that value no longer comes from typing code. It comes from thinking clearly, making deliberate decisions, and taking ownership over outcomes. AI doesn’t kill your job — but it does kill your shortcuts.

That post left one big question hanging:

So how do you build software in a world where AI can generate anything — but still understands nothing?

This post is a possible answer.
Meet SYNC — a rigorously structured, fact-enforced framework designed for developers who still give a damn.

SYNC doesn’t make AI smarter.
It provides a system prompt that makes your LLM coding process strong enough to survive dumb ideas, fast code, and thoughtless automation.

We’re going to break it down:

  1. Phases

  2. Agents

  3. Tasks

So I’m currently trying to build an LLM coding framework by trial and error — but keep in mind that not every problem needs a hammer ;-) and please give feedback. :-)

1. SYNC in 5 Phases 


1.1. 🧩 ALIGN – Because Prompting ≠ Planning

Before any code is written, SYNC forces a brutal question:
“Do you even know what you’re building?”

You can’t just dump “make a task service” into a prompt and hope for gold.
SYNC requires:

  • A verifiable problem

  • Clear, measurable success

  • Known facts and constraints

  • And a list of what’s missing

Can’t answer those? You don’t get to move on. Period.

This is your project kickoff — minus the vague user stories and JIRA hell.


1.2. 🧠 IDEATE – Think Before You Type (or Prompt)

AI loves jumping to conclusions. SYNC doesn’t let it.

Instead, it:

  • Generates multiple solution paths

  • Scores them on DX, security, maintainability

  • Forces a trade-off decision — backed by facts

No “that looks good” commits. No “vibe-based” engineering.

This is what devs mean when they say “thinking in systems.”
SYNC makes that non-optional.


1.3. 🔧 PROTOTYPE – Generate Code That Doesn’t Suck

Now, and only now, do we code. But not like the usual Copilot fanfare.

Every line must:

  • Follow a verified plan

  • Pass static analysis (max level, no warnings)

  • Enforce DX clarity (no hidden state, no weird side-effects)

  • Respect OWASP, type safety, clean structure, documentation

  • Be reviewed by a MandateAuditorAgent — think of it as your most paranoid tech lead

SYNC doesn’t care if it works. It cares if it’s safe, readable, and maintainable.


1.4. 🔍 REFLECT – Find the Blind Spots Before They Find You

This is where most AI-based workflows stop. SYNC doesn’t.

It demands:

  • Fact-based reflection

  • Side-effect inspection

  • “WTF checks” (yes, that’s real)

  • Architectural delta analysis

Reflection is how you debug thinking, not just code.

Bad engineering isn’t usually broken — it’s just thoughtless.
This phase catches that before prod does.


1.5. 📚 LEARN – Ship, Review, Codify, Evolve

If you’re not learning across projects, you’re repeating mistakes in cleaner syntax.

SYNC documents:

  • What worked

  • What failed

  • What patterns can be reused

  • What rules need to be tightened

This is where engineering culture is built — not in all-hands, but in feedback loops.


🔁 These 5 phases form a tight feedback loop. No skipping. No guessing. No “just ship it” by default.


2. Agents — SYNC’s Execution Layer


2.1. Specialized Roles, Not Generic Personas

Instead of one LLM trying to “do everything,” SYNC splits responsibility across clear, non-overlapping roles. Each one acts like a focused expert in your dev team.

| Agent | Role / Analogy |
|---|---|
| PlannerAgent | Project Architect – breaks the work into slices, defines scope, constraints, and success. |
| ExecutorAgent | Implementation Dev – takes the plan and codes it with strict adherence to facts, security, and DX. |
| ReflectionAgent | Senior Reviewer – evaluates what was built, finds blind spots, forces systemic improvements. |
| KnowledgeSynthesizerAgent | Staff Engineer / Systems Thinker – extracts reusable patterns, proposes framework evolution. |
| MandateAuditorAgent | Tech Lead / Compliance – blocks progress if rules (e.g. security, verifiability) are violated. |
| InteractionAgent | Team Facilitator / QA – handles human check-ins, verifies clarity, enforces decision checkpoints. |

“We don’t need smarter output — we need clearer ownership.”

✅ These agents represent exactly that. SYNC operationalizes the separation of thinking, building, reflecting, and enforcing.


2.2. Persona Modes

SYNC defines two execution modes for agents:

| Mode | Description |
|---|---|
| strict | No ambiguity. Everything must be verifiable and mandate-compliant before progressing. |
| adaptive | Allows temporary ambiguity but logs it. Prioritizes progress with risk awareness. |

This flexibility is key when working with real humans or messy specs — you can choose how “rigid” the AI behaves.


3. Tasks — Bound by Non-Negotiable Laws


These aren’t style suggestions — they’re enforced constraints. Every phase must comply.

| Mandate | What It Ensures |
|---|---|
| Security | No unvalidated inputs or insecure outputs. Based on OWASP Top 10. |
| DX | Code must be typed, clear, maintainable. Predictable naming. No “magic”. |
| StaticAnalysis | Static analysis must pass at the highest level — no known warnings. |
| Documentation | Full CodeDoc coverage using modern syntax. |
| Style | Consistent formatting, whitespace, and layout. Enforced via fixer. |
| Verifiability | All decisions must have traceable, factual reasoning. |
| PhaseEnforcement | You can’t skip steps. Every phase must be explicitly completed or justified. |

SYNC doesn’t assume trust. It requires evidence.


How this works together, for example:

  • Planning = PlannerAgent → Add success criteria to issues.

  • Execution = ExecutorAgent → Code must pass security + static analysis gates.

  • Review = ReflectionAgent → Comments require fact-based reasoning.

  • Merge = MandateAuditorAgent → No merge if DX/security rules violated.


 

Who Survives in the Age of AI Code?

If your value ends at syntax, AI already replaced you.


Let’s get something straight:

If you think of LLMs as “copilots,” you’re still giving them too much credit.

They’re not copilots.
They’re autopilot systems — ruthlessly fast, dangerously obedient, and totally unaware of what matters.

Feed them incomplete specs, fuzzy goals, or mismatched priorities?

They won’t challenge you.
They won’t hesitate.
They’ll execute — confidently, fluently — exactly the wrong thing.

They don’t know your business.
They don’t know your constraints.
They don’t know what not to build.

What’s missing isn’t syntax.
It’s ownership. Intent. Engineering judgment.

And unless you provide it —
you’re not flying the plane.
You’re luggage, and replaceable by AI.


Part I: Automation Always Eats the Bottom

This has happened before. Every decade. Every role.

  • Punch card operators (1950s–1960s): Once essential for running programs. Replaced by terminals and interactive computing. By the mid-‘70s, gone.
  • Typists & secretarial pools (1960s–1980s): Entire floors dedicated to document production. WordPerfect, then Microsoft Word, ended that. By the early ‘90s, obsolete.
  • Sysadmins (1990s–2010s): SSH into boxes, hand-edit configs, restart crashed daemons. Then came Puppet, Chef, Ansible, Terraform… and cloud abstractions finished the job. The manual, SSH-based server work? Retired.
  • Manual QA testers (2000s–2010s): Clicking through forms, comparing results by eye. Replaced by Selenium, Cypress, and CI pipelines. QA is now design-driven. The button-clicker job didn’t survive.

Every wave started the same way:
The job wasn’t eliminated.
The repetitive part of it was.

If you couldn’t rise above the routine — you were gone.

Now it’s happening to developers.

Not the ones architecting resilient, auditable systems.
The ones chaining together plugin-generated CRUD and calling it “done.”

LLMs are just the latest wave. But this one moves very fast…

And here’s the reality:

  • A carpenter refusing to use a circular saw isn’t defending craftsmanship — they’re bottlenecking it.
  • But give that saw to someone with no skill, and they’ll still ruin the wood — just faster. If you currently see many posts of non-coders who “vibe”-code their stuff, that’s what I am talking about here. ;-)

Same with LLMs.

They don’t replace skill.
They amplify whatever’s already there — good or garbage.

LLMs aren’t replacing software engineers.
They’re replacing the illusion that the bottleneck was ever syntax or tools.


Part II: Complexity Wasn’t Removed. It Was Repositioned.

There’s a dangerous myth floating around that LLMs “simplify software development.”

They don’t.

They just move the complexity upstream — away from syntax, into strategy.

LLMs are great at building what you ask for.
But they’re terrible at knowing if what you asked for actually makes sense.

They don’t understand the business.
They don’t understand tradeoffs.
They don’t understand you.

They just build. Fast.

And that means your job as a developer is no longer about typing — it’s about thinking upstream.

Because the real work now is:

  • Framing prompts like functional specs

  • Embedding constraints into system design

  • Validating output against business goals

  • Catching side effects before they cascade

None of that lives in syntax.
It lives in system boundaries, architecture, and clear thinking.

So here’s the shift:

If your job is just to write the code — you’ll be replaced by the thing that does that faster.
But if your job is to design the system — you’re now more critical than ever.


Part III: The ELIZA Effect Isn’t Dead — But LLMs Are Waking Up

In 1966, Joseph Weizenbaum built one of the first “AI” programs: ELIZA.

It wasn’t smart.
It didn’t understand anything.
It just rephrased your input using simple pattern matching.

You: I’m feeling anxious.
ELIZA: Why do you say you’re feeling anxious?

It used tricks — not intelligence.
But people still believed in it. Some even refused to accept it was a machine.

That’s the ELIZA Effect:
Our instinct to see intelligence where there’s only mimicry.

Fast-forward to today.
LLMs don’t just mimic. They generate.
They write code. Plan modules. Suggest architectural patterns.

But here’s the risk:

We still project too much intelligence into the output.

When an LLM writes a function that looks correct, we tend to assume it is correct — because it sounds confident.
When it uses a pattern, we assume it understands the context.

But it doesn’t.
And that’s not its fault — it’s ours.

The real danger isn’t hallucination.
It’s over-trusting surface-level coherence.

Today, it’s not a chatbot fooling a user.
It’s a system generator fooling a team.

But let’s be clear: Modern LLMs aren’t ELIZA anymore.
They can plan. Refactor. Respond to constraints. Incorporate feedback.

The difference is this:

ELIZA tricked you into thinking it was smart.
LLMs require you to be smart — to guide them properly.

If you bring judgment, context, and validation to the loop, LLMs become an architectural power tool.
But if you don’t? You’ll scale the same flawed design — just faster.


Part IV: Code Quality Is Becoming a Mirage

LLMs make it absurdly easy to generate code.

A few prompts, and boom:
Endpoints scaffolded.
Unit tests written.
CRUD flows spinning up in seconds.

But here’s the real question:

What do you do with all that saved time?

Do you…

  • Refactor legacy architecture?

  • Fix broken boundaries?

  • Document edge cases and invariants?

Or do you just move on to the next ticket?

Let’s be honest — for most teams, the answer is: ship more.

But here’s the catch:

Productivity without reflection is just accelerated entropy.

The illusion of quality isn’t in the code — it’s in the pace.
We used to write bad code slowly.
Now we write bad code faster.

LLMs don’t inject tech debt.
They just make it easier to scale whatever process you already have.

This is how LLMs become quiet killers in modern software:

  • More output. Less ownership.

  • Faster shipping. Sloppier systems.

  • Progress that isn’t progress at all.

Because without validation, speed is just a prettier form of chaos.


Part V: The Architect Is the Pilot

LLMs are not copilots.

They don’t make decisions.
They don’t check alignment.
They don’t steer the system.

They’re autopilot — optimized for syntax, blind to strategy.

Which means your role isn’t shrinking — it’s elevating.

You’re the pilot.

And if you’re not flying the plane — someone else is programming it to crash.

What does the real pilot do?

  • Sets the course

  • Defines the constraints

  • Monitors the signals

  • Prepares for failure

  • Owns the outcome

Autopilot builds. But it can’t see.
It won’t:

  • Catch abstraction leaks

  • Detect architectural drift

  • Flag a misaligned dependency

  • Or recognize when a “working” feature breaks the user journey

That’s your job.

Not “prompt engineering.”
Not code generation.
Systems thinking.

And not in hindsight — up front.

The modern software engineer isn’t typing faster.
They’re designing better.
And validating deeper.

Because LLMs don’t ship systems.
People do.

And if you can’t explain how your choices align with product, people, and long-term stability?

Then you’re not the architect.
You’re just the operator.


 

Conclusion: Stop Writing Code. Start Owning Systems.

If your job was just to “write the code,” then yes — that part is already being done for you.

But if your job is to engineer the system — with intent, constraints, validation, foresight, and grounded execution —
then you just became irreplaceable.

LLMs don’t remove the need for developers.
They reveal who was actually doing engineering — and who was just typing faster than the intern.

The future of software isn’t syntax.
It’s systems thinking. Boundary design. Constraint management. Communication at scale.

And that’s not generated.
That’s your job.


TL;DR

  • LLMs are autopilot, not copilots. They follow, they don’t lead.

  • They move complexity upstream. The value is no longer in typing.

  • They amplify output — good or bad. Skill is still required.

  • They don’t replace good engineers. They replace bad workflows.

  • System thinking is the new baseline. If you’re not owning structure, you’re already behind.

LLM Prompt Optimizations: Practical Techniques for Developers

Optimizing inputs for LLMs ensures better, more consistent outputs while leveraging the full potential of the model’s underlying capabilities. By understanding core concepts like tokenization, embeddings, self-attention, and context limits, you can tailor inputs to achieve desired outcomes reliably. Below, you’ll find fundamental techniques and best practices organized into practical strategies.

PS: You can use this content to automatically improve your prompt by asking as follows: https://chatgpt.com/share/6785a41d-72a0-8002-a1fe-52c14a5fb1e5


🎯 1. Controlling Probabilities: Guide Model Outputs

🧠 Theory: LLMs always follow probabilities when generating text. For every token, the model calculates a probability distribution based on the context provided. By carefully structuring inputs or presenting examples, we can shift the probabilities toward the desired outcome:

  • Providing more examples helps the model identify patterns and generate similar outputs.
  • Clear instructions reduce ambiguity, increasing the probability of generating focused responses.
  • Contextual clues and specific phrasing subtly guide the model to prioritize certain outputs.

⚙️ Technology: The model operates using token probabilities:

  • Each token (word or part of a word) is assigned a likelihood based on the input context.
  • By influencing the input, we can make certain tokens more likely to appear in the output.

For example:

  • A general query like “Explain energy sources” might distribute probabilities evenly across different energy types.
  • A more specific query like “Explain why solar energy is sustainable” shifts the probabilities toward solar-related tokens.

⚙️ Shifting Probabilities in Prompts: The structure and wording of your prompt significantly influence the token probabilities:

  • For specific outputs: Use targeted phrasing to increase the likelihood of desired responses: Explain why renewable energy reduces greenhouse gas emissions.
  • For diverse outputs: Frame open-ended questions to distribute probabilities across a broader range of topics: What are the different ways to generate clean energy?
  • Few-Shot Learning: Guide the model using few-shot learning to set patterns: Example 1: Input: Solar energy converts sunlight into electricity. Output: Solar energy is a renewable power source. Example 2: Input: Wind energy generates power using turbines. Output: Wind energy is clean and sustainable. Task: Input: Hydropower generates electricity from flowing water. Output:

💡 Prompt Tips:

  • Use clear, direct instructions for precise outputs: Write a PHP function that adds two integers and returns a structured response as an array.
  • Use contextual clues to steer the response: Explain why PHP is particularly suited for web development.

💻 Code Tips: LLMs break down code and comments into tokens, so structuring your PHPDocs helps focus probabilities effectively. Provide clarity and guidance through structured documentation:

/**
 * Adds two integers and returns a structured response.
 *
 * @param int $a The first number.
 * @param int $b The second number.
 * 
 * @return array{result: int, message: string} A structured response with the sum and a message.
 */
function addIntegers(int $a, int $b): array {
    $sum = $a + $b;

    return [
        'result' => $sum,
        'message' => "The sum of $a and $b is $sum."
    ];
}
  • Include examples in PHPDocs to further refine the probabilities of correct completions:

/**
 * Example:
 * Input: addIntegers(3, 5)
 * Output: ['result' => 8, 'message' => 'The sum of 3 and 5 is 8']
 */

✂️ 2. Tokenization and Embeddings: Use Context Efficiently

🧠 Theory: LLMs break down words into tokens (numbers) to relate them to each other in multidimensional embeddings (vectors). The more meaningful context you provide, the better the model can interpret relationships and generate accurate outputs:

  • Tokens like “renewable energy” and “sustainability” have semantic proximity in the embedding space.
  • More context allows the model to generate richer and more coherent responses.

⚙️ Technology:

  • Tokens are the smallest units the model processes. For example, “solar” and “energy” may be separate tokens, or in compound languages like German, one long word might be broken into multiple tokens.
  • Embeddings map these tokens into vectors, enabling the model to identify their relationships in high-dimensional space.

⚙️ Optimizing Tokenization in Prompts: To make the most of tokenization and embeddings:

  • Minimize irrelevant tokens: Focus on core concepts and avoid verbose instructions.
  • Include context-rich phrases: Relevant terms improve the embedding connections.
  • Simplify Language: Use concise phrasing to minimize token count: Solar energy is renewable and reduces emissions.
  • Remove Redundancy: Eliminate repeated or unnecessary words: Explain why solar energy is sustainable.

💡 Prompt Tips:

  • Include only essential terms for better embedding proximity: Describe how solar panels generate electricity using photovoltaic cells.
  • Avoid vague or verbose phrasing: Explain solar energy and its uses in a way that a normal person can understand and provide details.
  • Use specific language to avoid diluting the context: Explain why solar energy is considered environmentally friendly and cost-effective.
  • Avoid vague instructions that lack actionable context: Explain me solar energy.

💻 Code Tips: Write compact and clear PHPDocs to save tokens and improve context:

/**
 * Converts raw user input into a structured format.
 *
 * @param string $input Raw input data.
 * 
 * @return array{key: int, value: string} Structured output.
 */
function parseInput(string $input): array {
    $parts = explode(":", $input);

    return [
        'key' => (int)$parts[0],
        'value' => trim($parts[1])
    ];
}
  • Use compact and descriptive documentation to maximize token efficiency:

/**
 * Example:
 * Input: "42:Hello"
 * Output: ['key' => 42, 'value' => 'Hello']
 */

🧭 3. Self-Attention and Structure: Prioritize Context

🧠 Theory: LLMs work on the principle of self-attention: the input tokens are related to each other to determine relevance and context. This mechanism assigns importance scores to tokens, ensuring that the most relevant words and their relationships are prioritized.

⚙️ Technology:

  • Self-attention layers: Compare each token with every other token in the input to generate an attention score.
  • Multi-head attention: Allows the model to consider multiple perspectives simultaneously, balancing relevance and context.
  • Pitfall: Too many irrelevant tokens dilute the attention scores, leading to distorted outputs.

⚙️ Optimizing Structure in Prompts:

  • Structure Your Inputs: Use lists, steps, or sections to emphasize relationships: Compare the benefits of solar and wind energy: 1. Environmental impact 2. Cost-efficiency 3. Scalability
  • Minimize Irrelevant Tokens: Keep prompts focused and free from extraneous details.

💡 Prompt Tips:

  • Well-Structured: Organize tasks into sections: Explain the environmental and economic benefits of renewable energy in two sections: 1. Environmental 2. Economic
  • Unstructured: Avoid asking everything at once: What are the environmental and economic benefits of renewable energy?

💻 Code Tips: In PHPDocs, organize information logically to enhance clarity and guide models effectively:

/**
 * Calculates the cost efficiency of renewable energy.
 *
 * Steps:
 * 1. Evaluate savings-to-investment ratio.
 * 2. Return a percentage efficiency score.
 *
 * @param float $investment Initial investment cost.
 * @param float $savings Annual savings.
 * 
 * @return float Efficiency percentage.
 */
function calculateEfficiency(float $investment, float $savings): float {
    return ($savings / $investment) * 100;
}

🧹 4. Context Management and Token Limits

🧠 Theory: LLMs operate within a fixed token limit (e.g., ~8k tokens for GPT-4), encompassing both input and output. Efficiently managing context ensures relevant information is prioritized while avoiding irrelevant or redundant content.

⚙️ Technology:

  • Chunking: Break long inputs into smaller, manageable parts: Step 1: Summarize the introduction of the report. Step 2: Extract key arguments from Section 1. Step 3: Combine summaries for a final overview.
  • Iterative Summarization: Condense sections before integrating them: Summarize Section 1: Solar energy’s benefits. Summarize Section 2: Wind energy’s benefits. Combine both summaries.
  • Pitfall: Excessive context can truncate critical data due to token limits.

💡 Prompt Tips:

  • For large inputs, use step-by-step processing: Step 1: Summarize the introduction of the document. Step 2: Extract key arguments from Section 1. Step 3: Combine these points into a cohesive summary.
  • Avoid presenting the full text in a single prompt: Summarize this 20-page document.
  • Focus on specific sections or tasks: Summarize the introduction and key points from Section 1.

💻 Code Tips: Divide tasks into smaller functions to handle token limits better:

function summarizeSection(string $section): string {
    // Summarize section content.
    // Placeholder: a real implementation would call an LLM here;
    // for now, return the first sentence as a naive "summary".
    return strtok($section, '.') . '.';
}

function combineSummaries(array $summaries): string {
    // Merge individual summaries into one text block.
    return implode("\n", $summaries);
}
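
For illustration, here is a minimal usage sketch of the chunked flow above (the section texts are made-up inputs; a real LLM summarization call would replace the placeholder bodies):

$sections = [
    'Solar energy converts sunlight into electricity. It is renewable.',
    'Wind energy generates power using turbines. It is clean and sustainable.',
];

// Step 1 + 2: summarize each chunk separately to stay within token limits.
$summaries = array_map('summarizeSection', $sections);

// Step 3: recombine the validated partial results.
echo combineSummaries($summaries);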

🎨 5. Reasoning and Goals: Strengthen Prompt Direction

🧠 Theory: LLMs generate better results when the reasoning behind a task and its intended goal are explicitly stated. This guides the model’s probabilities toward meaningful and relevant outcomes.

⚙️ Technology:

  • Explicit reasoning provides semantic depth, helping the model focus on the task’s purpose.
  • Explaining the goal improves alignment with user expectations and narrows token probabilities.

💡 Prompt Tips:

  • State the reason for the task and its goal: Explain renewable energy because I need to create an introductory guide for high school students.
  • Avoid generic prompts without a clear goal: Describe renewable energy.

💻 Code Tips: Use PHPDocs to explain both the reasoning and expected outcomes of a function:

/**
 * Generates a detailed user profile report.
 *
 * This function is designed to create a comprehensive profile report based on user data inputs. 
 * It is useful for analytical dashboards requiring well-structured user insights.
 *
 * @param array{name: string, age: int, email: string} $userData The user data array.
 * 
 * @return string A formatted profile report.
 */
function generateProfileReport(array $userData): string {
    return sprintf(
        "User Profile:\nName: %s\nAge: %d\nEmail: %s\n",
        $userData['name'],
        $userData['age'],
        $userData['email']
    );
}

🛠️ 6. Iterative Refinement: Simplify Complex Tasks

🧠 Theory:
Breaking down complex tasks into smaller, manageable steps improves accuracy and ensures the model generates focused and coherent outputs. This method allows you to iteratively refine results, combining outputs from smaller subtasks into a complete solution.

⚙️ Technology:

  • Chunking: Split large tasks into multiple smaller ones to avoid overwhelming the model.
  • Validation: Intermediate outputs can be validated before moving to the next step, minimizing errors.
  • Recombination: Smaller validated outputs are merged for the final result.

💡 Prompt Tips:

  • For multi-step tasks, provide clear, incremental instructions: Step 1: Summarize the environmental benefits of solar energy. Step 2: Describe the cost savings associated with solar energy. Step 3: Combine these summaries into a single paragraph.
  • Avoid handling complex tasks in a single step: Explain the environmental benefits and cost savings of solar energy in one response.

💻 Code Tips: Ask the LLM to create the code step by step and ask for confirmation after each step, so that the LLM can focus on one aspect of the implementation at a time.


🔗 7. Cross-Contextual Coherence: Maintain Consistency

🧠 Theory:
LLMs lack persistent memory between interactions, making it essential to reintroduce necessary context for consistent responses across prompts. By maintaining cross-contextual coherence, outputs remain aligned and relevant, even in multi-step interactions.

⚙️ Technology:

  • Use context bridging: Reference key elements from previous responses to maintain relevance.
  • Store critical details in persistent structures, such as arrays or JSON, to reintroduce when needed.
  • Avoid overloading with irrelevant details, which can dilute coherence.

💡 Prompt Tips:

  • Reintroduce essential context from previous interactions: Based on our discussion about renewable energy, specifically solar power, explain the benefits of wind energy.
  • Summarize intermediate outputs for clarity: Summarize the main benefits of renewable energy. Then expand on solar and wind energy.

💻 Code Tips: Use separate files for code examples that you can provide to e.g. Custom GPTs, so they can learn from previous learnings/findings this way.
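
As a small sketch of the context-bridging idea above, assuming a hypothetical context.json file that carries key facts between prompts:

<?php

// Persist key details between interactions (hypothetical file name).
$contextFile = 'context.json';

// Store critical details from the previous response.
file_put_contents($contextFile, json_encode([
    'topic' => 'renewable energy',
    'focus' => 'solar power',
]));

// Later: reintroduce the stored context into the next prompt.
$context = json_decode((string) file_get_contents($contextFile), true);

$prompt = sprintf(
    'Based on our discussion about %s, specifically %s, explain the benefits of wind energy.',
    $context['topic'],
    $context['focus']
);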


🌍 8. Style and Tone: Adapt Outputs to the Audience

🧠 Theory: LLMs generate better responses when the desired style and tone are explicitly stated. By matching the tone to the audience, you can make content more engaging and effective.

⚙️ Technology:

  • The model uses semantic cues in the prompt to adjust style and tone.
  • Specific words and phrases like “formal,” “casual,” or “technical” help steer the model’s output.

💡 Prompt Tips:

  • Specify the tone and audience: Write a technical explanation of solar panels for an engineering audience.
  • Adjust the style for different contexts: Explain solar panels in a simple and friendly tone for kids.

💻 Code Tips: In PHPDocs, define the intended audience and tone to guide LLM-generated documentation:

/**
 * Calculates the total energy output of a solar panel system.
 *
 * Intended Audience: Engineers and technical experts.
 * Tone: Formal and technical.
 *
 * @param float $panelArea The total area of solar panels in square meters.
 * @param float $efficiency The efficiency rate of the solar panels (0-1).
 * @param float $sunlightHours Daily sunlight hours.
 * 
 * @return float Total energy output in kilowatt-hours.
 */
function calculateSolarOutput(float $panelArea, float $efficiency, float $sunlightHours): float {
    return $panelArea * $efficiency * $sunlightHours;
}

🔍 9. Fine-Tuning and Domain Expertise

🧠 Theory: Fine-tuning allows LLMs to specialize in specific domains by further training them on domain-specific datasets. This enhances their ability to generate accurate, relevant, and nuanced outputs tailored to specialized tasks or fields.

⚙️ Technology:

  • Fine-tuning adjusts the weights of a pre-trained model by using a curated dataset that focuses on a specific domain.
  • This process requires labeled data and computational resources but significantly improves task performance in niche areas.

💡 Prompt Tips:

  • Use fine-tuning to simplify prompts for repeated tasks: Generate a legal brief summarizing the key points from this case.
  • Without fine-tuning, include detailed instructions and examples in your prompt: Write a summary of this legal case focusing on liability and negligence, using a formal tone.

💻 Code Tips: When fine-tuning is not an option, structure your PHPDocs to include domain-specific context for LLMs:

/**
 * Generates a compliance report for renewable energy projects.
 *
 * This function creates a detailed compliance report tailored for regulatory agencies. It checks for adherence to
 * energy efficiency standards and sustainability guidelines.
 *
 * @param array<string, mixed> $projectData Details of the renewable energy project.
 * @param string $region The region for which the compliance report is generated.
 * 
 * @return string The compliance report in a formatted string.
 */
function generateComplianceReport(array $projectData, string $region): string {
    // Example report generation logic.
    return sprintf(
        "Compliance Report for %s:\nProject: %s\nStatus: %s\n",
        $region,
        $projectData['name'] ?? 'Unnamed Project',
        $projectData['status'] ?? 'Pending Review'
    );
}

PHP: Code Quality with Custom Tooling Extensions

After many years of using PHPStan, PHP-CS-Fixer, PHP_CodeSniffer, … I will give you one piece of advice: add your own custom code to extend your code-quality tooling.

Nearly every project has custom code that provides the real value of the product / project, but this custom code itself is often not really improved by PHP-CS-Fixer, PHPStan, Psalm, and other tools. The tools do not know how this custom code works, so we need to write some extensions ourselves.

Example: At work, we have some Html-Form-Element (HFE) classes that use some properties from our Active Record classes, and back in the day we used strings to connect both classes. :-/

Hint: Strings are very flexible, but also awful to use programmatically in the future. I would recommend avoiding plain strings as much as possible.

1. Custom PHP-CS-Fixer

So, I wrote a quick script that replaces the strings with some metadata. The big advantage is that this custom PHP-CS-Fixer will also automatically fix code that is created in the future, and you can apply / check this in the CI pipeline, in a pre-commit hook, or directly in PhpStorm.

<?php

declare(strict_types=1);

use PhpCsFixer\Tokenizer\Analyzer\ArgumentsAnalyzer;
use PhpCsFixer\Tokenizer\Analyzer\FunctionsAnalyzer;
use PhpCsFixer\Tokenizer\Token;
use PhpCsFixer\Tokenizer\Tokens;

final class MeerxUseMetaFromActiveRowForHFECallsFixer extends AbstractMeerxFixerHelper
{
    /**
     * {@inheritdoc}
     */
    public function getDocumentation(): string
    {
        return 'Use ActiveRow->m() for "HFE_"-calls, if it is possible.';
    }

    /**
     * {@inheritdoc}
     */
    public function getSampleCode(): string
    {
        return <<<'PHP'
<?php

$element = UserFactory::singleton()->fetchEmpty();

$foo = HFE_Date::Gen($element, 'created_date');
PHP;
    }

    public function isRisky(): bool
    {
        return true;
    }

    /**
     * {@inheritdoc}
     */
    public function isCandidate(Tokens $tokens): bool
    {
        return $tokens->isTokenKindFound(\T_STRING);
    }

    public function getPriority(): int
    {
        // must be run after NoAliasFunctionsFixer
        // must be run before MethodArgumentSpaceFixer
        return -1;
    }

    protected function applyFix(SplFileInfo $file, Tokens $tokens): void
    {
        if (v_str_contains($file->getFilename(), 'HFE_')) {
            return;
        }

        $functionsAnalyzer = new FunctionsAnalyzer();

        // fix for "HFE_*::Gen()"
        foreach ($tokens as $index => $token) {
            $index = (int) $index;

            // only for "Gen()"-calls
            if (!$token->equals([\T_STRING, 'Gen'], false)) {
                continue;
            }

            // only for "HFE_*"-classes
            $object = (string) $tokens[$index - 2]->getContent();
            if (!v_str_starts_with($object, 'HFE_')) {
                continue;
            }

            if ($functionsAnalyzer->isGlobalFunctionCall($tokens, $index)) {
                continue;
            }

            $argumentsIndices = $this->getArgumentIndices($tokens, $index);

            if (\count($argumentsIndices) >= 2) {
                [
                    $firstArgumentIndex,
                    $secondArgumentIndex
                ] = array_keys($argumentsIndices);

                // If the second argument is not a string, we cannot make a swap.
                if (!$tokens[$secondArgumentIndex]->isGivenKind(\T_CONSTANT_ENCAPSED_STRING)) {
                    continue;
                }

                $content = trim($tokens[$secondArgumentIndex]->getContent(), '\'"');
                if (!$content) {
                    continue;
                }

                $newContent = $tokens[$firstArgumentIndex]->getContent() . '->m()->' . $content;

                $tokens[$secondArgumentIndex] = new Token([\T_CONSTANT_ENCAPSED_STRING, $newContent]);
            }
        }
    }

    /**
     * @param Token[]|Tokens $tokens <phpdoctor-ignore-this-line/>
     * @param int $functionNameIndex
     *
     * @return array<int, int> In the format: startIndex => endIndex
     */
    private function getArgumentIndices(Tokens $tokens, $functionNameIndex): array
    {
        $argumentsAnalyzer = new ArgumentsAnalyzer();

        $openParenthesis = $tokens->getNextTokenOfKind($functionNameIndex, ['(']);
        $closeParenthesis = $tokens->findBlockEnd(Tokens::BLOCK_TYPE_PARENTHESIS_BRACE, $openParenthesis);

        // init
        $indices = [];

        foreach ($argumentsAnalyzer->getArguments($tokens, $openParenthesis, $closeParenthesis) as $startIndexCandidate => $endIndex) {
            $indices[$tokens->getNextMeaningfulToken($startIndexCandidate - 1)] = $tokens->getPrevMeaningfulToken($endIndex + 1);
        }

        return $indices;
    }
}

To use your custom fixers, you can register and enable them: https://cs.symfony.com/doc/custom_rules.html
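
For reference, a minimal .php-cs-fixer.php registration sketch; the rule name and paths here are illustrative and must match what your fixer's getName() actually returns:

<?php

$finder = PhpCsFixer\Finder::create()
    ->in(__DIR__ . '/src');

return (new PhpCsFixer\Config())
    // register the custom fixer class from above
    ->registerCustomFixers([new MeerxUseMetaFromActiveRowForHFECallsFixer()])
    // risky fixers (like this one) must be allowed explicitly
    ->setRiskyAllowed(true)
    // enable the rule via the name the fixer reports (illustrative here)
    ->setRules([
        'Meerx/use_meta_from_active_row_for_hfe_calls' => true,
    ])
    ->setFinder($finder);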

Example-Result:


$fieldGroup->addElement(HFE_Customer::Gen($element, 'customer_id'));

// <- will be replaced with ->

$fieldGroup->addElement(HFE_Customer::Gen($element, $element->m()->customer_id));

Hint: There are many examples of PHP_CodeSniffer and PHP-CS-Fixer rules on GitHub; you can often pick something that fits 50–70% of your use case and then modify it for your needs.

The “m()” method looks like this and will call the simple “ActiveRowMeta”-class. This class will return the property name itself instead of the real value.

/**
 * (M)ETA
 *
 * @return ActiveRowMeta|mixed|static
 * <p>
 * We fake the return "static" here because we want auto-completion for the current properties in the IDE.
 * <br><br>
 * But here the properties contain only the name of the property itself.
 * </p>
 *
 * @psalm-return object{string,string}
 */
final public function m()
{
    return (new ActiveRowMeta())->create($this);
}
<?php

final class ActiveRowMeta
{
    /**
     * @return static
     */
    public function create(ActiveRow $obj): self
    {
        /** @var static[] $STATIC_CACHE */
        static $STATIC_CACHE = [];

        // DEBUG
        // var_dump($STATIC_CACHE);

        $cacheKey = \get_class($obj);
        if (!empty($STATIC_CACHE[$cacheKey])) {
            return $STATIC_CACHE[$cacheKey];
        }

        foreach ($obj->getObjectVars() as $propertyName => $propertyValue) {
            $this->{$propertyName} = $propertyName;
        }

        $STATIC_CACHE[$cacheKey] = $this;

        return $this;
    }
}

2. Custom PHPStan Extension

In the next step, I added a DynamicMethodReturnTypeExtension for PHPStan, so that the static code analysis knows the type of the metadata, and I still have auto-completion in the IDE via PHPDocs.

Note: Here I’ve also made the metadata read-only, so we can’t misuse the metadata.

<?php

declare(strict_types=1);

namespace meerx\App\scripts\githooks\StandardMeerx\PHPStanHelper;

use PhpParser\Node\Expr\MethodCall;
use PHPStan\Analyser\Scope;
use PHPStan\Reflection\MethodReflection;
use PHPStan\Type\Type;

final class MeerxMetaDynamicReturnTypeExtension implements \PHPStan\Type\DynamicMethodReturnTypeExtension
{
    public function getClass(): string
    {
        return \ActiveRow::class;
    }

    public function isMethodSupported(MethodReflection $methodReflection): bool
    {
        return $methodReflection->getName() === 'm';
    }

    /**
     * @var \PHPStan\Reflection\ReflectionProvider
     */
    private $reflectionProvider;

    public function __construct(\PHPStan\Reflection\ReflectionProvider $reflectionProvider)
    {
        $this->reflectionProvider = $reflectionProvider;
    }

    public function getTypeFromMethodCall(
        MethodReflection $methodReflection,
        MethodCall $methodCall,
        Scope $scope
    ): Type {
        $exprType = $scope->getType($methodCall->var);

        $staticClassName = $exprType->getReferencedClasses()[0];
        $classReflection = $this->reflectionProvider->getClass($staticClassName);

        return new MeerxMetaType($staticClassName, null, $classReflection);
    }
}
<?php

declare(strict_types=1);

namespace meerx\App\scripts\githooks\StandardMeerx\PHPStanHelper;

use PHPStan\Reflection\ClassMemberAccessAnswerer;
use PHPStan\Type\ObjectType;

final class MeerxMetaType extends ObjectType
{
    public function getProperty(string $propertyName, ClassMemberAccessAnswerer $scope): \PHPStan\Reflection\PropertyReflection
    {
        return new MeerxMetaProperty($this->getClassReflection());
    }
}
<?php

declare(strict_types=1);

namespace meerx\App\scripts\githooks\StandardMeerx\PHPStanHelper;

use PHPStan\Reflection\ClassReflection;
use PHPStan\TrinaryLogic;
use PHPStan\Type\NeverType;
use PHPStan\Type\StringType;

final class MeerxMetaProperty implements \PHPStan\Reflection\PropertyReflection
{
    private ClassReflection $classReflection;

    public function __construct(ClassReflection $classReflection)
    {
        $this->classReflection = $classReflection;
    }

    public function getReadableType(): \PHPStan\Type\Type
    {
        return new StringType();
    }

    public function getWritableType(): \PHPStan\Type\Type
    {
        return new NeverType();
    }

    public function isWritable(): bool
    {
        return false;
    }

    public function getDeclaringClass(): \PHPStan\Reflection\ClassReflection
    {
        return $this->classReflection;
    }

    public function isStatic(): bool
    {
        return false;
    }

    public function isPrivate(): bool
    {
        return false;
    }

    public function isPublic(): bool
    {
        return true;
    }

    public function getDocComment(): ?string
    {
        return null;
    }

    public function canChangeTypeAfterAssignment(): bool
    {
        return false;
    }

    public function isReadable(): bool
    {
        return true;
    }

    public function isDeprecated(): \PHPStan\TrinaryLogic
    {
        return TrinaryLogic::createFromBoolean(false);
    }

    public function getDeprecatedDescription(): ?string
    {
        return null;
    }

    public function isInternal(): \PHPStan\TrinaryLogic
    {
        return TrinaryLogic::createFromBoolean(false);
    }
}

Summary

Think about your custom code and how you can improve it: take the tools you already use and extend them so they understand your code. Sometimes it’s easy and you can just add some modern PHPDocs, sometimes you need to go down the rabbit hole and implement some custom stuff, but in the end it will help your software, your team, and your customers.

Timeout Problems: Web Server + PHP

What?

First there is an HTTP request that hits your web server, which passes the request via a TCP or UNIX socket over FastCGI to your PHP-FPM daemon; there we start a new PHP process, and in this process we connect e.g. to the database and run some queries.

PHP-Request

The Problem!

There are different timeout problems here because we connect different pieces together, and these parts need to communicate. But what if one of the pieces does not respond in a given time, or, even worse, if one process runs forever, like a bad SQL query?

Understand your Timeouts.

Timeouts are a way to limit the time that a request can run; otherwise an attacker could simply run a denial-of-service attack with a single slow request. But there are many configurations in several layers: web server, PHP, application, database, curl, …

– Web server

Mostly you will use Apache or Nginx as your web server, and in the end it makes no real difference: there are different timeout settings, but the idea is almost the same. The web server will stop the execution and kill the PHP process; now you get a 504 HTTP error (Gateway Timeout), and you lose your stack trace and error-tracking because we killed our application in the middle of its work. So, we should keep the web server running as long as needed.

`grep -Ri timeout /etc/apache2/`

/etc/apache2/conf-enabled/timeout.conf:Timeout 60

/etc/apache2/mods-available/reqtimeout.conf:<IfModule reqtimeout_module>

/etc/apache2/mods-available/reqtimeout.conf: # mod_reqtimeout limits the time waiting on the client to prevent an

/etc/apache2/mods-available/reqtimeout.conf: # configuration, but it may be necessary to tune the timeout values to

/etc/apache2/mods-available/reqtimeout.conf: # mod_reqtimeout per virtual host.

/etc/apache2/mods-available/reqtimeout.conf: # Note: Lower timeouts may make sense on non-ssl virtual hosts but can

/etc/apache2/mods-available/reqtimeout.conf: # cause problem with ssl enabled virtual hosts: This timeout includes

/etc/apache2/mods-available/reqtimeout.conf: RequestReadTimeout header=20-40,minrate=500

/etc/apache2/mods-available/reqtimeout.conf: RequestReadTimeout body=10,minrate=500

/etc/apache2/mods-available/reqtimeout.load:LoadModule reqtimeout_module /usr/lib/apache2/modules/mod_reqtimeout.so

/etc/apache2/mods-available/ssl.conf: # to use and second the expiring timeout (in seconds).

/etc/apache2/mods-available/ssl.conf: SSLSessionCacheTimeout 300

/etc/apache2/conf-available/timeout.conf:Timeout 60

/etc/apache2/apache2.conf:# Timeout: The number of seconds before receives and sends time out.

/etc/apache2/apache2.conf:Timeout 60

/etc/apache2/apache2.conf:# KeepAliveTimeout: Number of seconds to wait for the next request from the

/etc/apache2/apache2.conf:KeepAliveTimeout 5

/etc/apache2/mods-enabled/reqtimeout.conf:<IfModule reqtimeout_module>

/etc/apache2/mods-enabled/reqtimeout.conf: # mod_reqtimeout limits the time waiting on the client to prevent an

/etc/apache2/mods-enabled/reqtimeout.conf: # configuration, but it may be necessary to tune the timeout values to

/etc/apache2/mods-enabled/reqtimeout.conf: # mod_reqtimeout per virtual host.

/etc/apache2/mods-enabled/reqtimeout.conf: # Note: Lower timeouts may make sense on non-ssl virtual hosts but can

/etc/apache2/mods-enabled/reqtimeout.conf: # cause problem with ssl enabled virtual hosts: This timeout includes

/etc/apache2/mods-enabled/reqtimeout.conf: RequestReadTimeout header=20-40,minrate=500

/etc/apache2/mods-enabled/reqtimeout.conf: RequestReadTimeout body=10,minrate=500

/etc/apache2/mods-enabled/reqtimeout.load:LoadModule reqtimeout_module /usr/lib/apache2/modules/mod_reqtimeout.so

/etc/apache2/mods-enabled/ssl.conf: # to use and second the expiring timeout (in seconds).

/etc/apache2/mods-enabled/ssl.conf: SSLSessionCacheTimeout 300

Here you can see all configurations for Apache2 timeouts, but we only need to change `/etc/apache2/conf-enabled/timeout.conf` because it will overwrite `/etc/apache2/apache2.conf` anyway.

PS: Remember to reload / restart your Web server after you change the configurations.

If we want to show the user at least a custom error page, we could add something like:

ErrorDocument 503 /error.php?errorcode=503
ErrorDocument 504 /error.php?errorcode=504

… into our Apache configuration or into a .htaccess file, so that we can still use PHP to show an error page, even if the requested PHP call was killed. The problem here is that we still lose the error message / stack trace / request etc., and we can’t send the error into our error-logging system. (Take a look at Sentry, it’s really helpful.)

– PHP-FPM

Our PHP-FPM (FastCGI Process Manager) pool can be configured with a timeout (request_terminate_timeout), but just like the web server setting, this will kill the PHP worker in the middle of the process, and we can’t handle the error in PHP itself. There is also a setting (process_control_timeout) that tells the child processes to wait this long before executing the signal received from the parent process, but I am uncertain whether this helps here. So, our error handling in PHP can’t catch / log / show the error, and we will get a 503 HTTP error (Service Unavailable) in case of a timeout.

Shutdown functions will not be executed if the process is killed with a SIGTERM or SIGKILL signal. :-/

Source: register_shutdown_function

PS: Remember to reload / restart your PHP-FPM Daemon after you change the configurations.

– PHP

The first idea most of us would have is to limit the PHP execution time itself and be done, but that sounds easier than it is, because `max_execution_time` ignores time spent on I/O: system commands, e.g. `sleep()`, and database queries (`SELECT SLEEP(100)`). But these are the bottlenecks of nearly all PHP applications; PHP itself is fast, but the externally called stuff isn’t.

The set_time_limit() function and the configuration directive max_execution_time only affect the execution time of the script itself. Any time spent on activity that happens outside the execution of the script such as system calls using system(), stream operations, database queries, etc. is not included when determining the maximum time that the script has been running. This is not true on Windows where the measured time is real.

Source: set_time_limit
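
A quick way to observe this behavior (a minimal sketch; run it on Linux, where time spent inside sleep() is not counted):

<?php

// max_execution_time / set_time_limit() only count time spent executing
// the script itself on Linux, so this 1-second limit does not abort a
// 5-second sleep(): the time inside the system call is not measured.
set_time_limit(1);

sleep(5);

echo "Still alive after sleeping for 5 seconds\n";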

– Database (MySQLi)

Many PHP applications spend most of their time waiting for some bad SQL queries where the developer missed adding the correct indexes, and because we learned that the PHP max execution time does not work for database queries, we need one more timeout setting here.

There are the MYSQLI_OPT_CONNECT_TIMEOUT and MYSQLI_OPT_READ_TIMEOUT (“Command execution result timeout in seconds. Available as of PHP 7.2.0.” – mysqli.options) settings, and we can use them to limit the time for our queries.

In the end you will see an “Errno: 2006 | Error: MySQL server has gone away” error in your PHP application, but this error can be caught / reported and the SQL query can be fixed; otherwise Apache or PHP-FPM would kill the process, and we would not see the error because our error handler can’t handle it anyway.

Summary:

It’s complicated. PHP is not designed for long execution, and that is good as it is, but if you need to increase the timeout, it is more complicated than I first thought. You need, for example, different “timeout” code for testing the different settings:

// DEBUG: long-running sql-call
// Query('SELECT SLEEP(600);');

// DEBUG: long-running system-call
// sleep(600);

// DEBUG: long-running php-call
// while (1) { } // infinite loop

Solution:

We can combine different timeouts, but the timeouts of the called commands, e.g. database, curl, etc., will be combined with the timeout of PHP (max_execution_time) itself. The timeout of the web server (e.g. Apache2: Timeout) and of PHP-FPM (request_terminate_timeout) need to be longer than the combined timeout of the application, so that we can still use our PHP error handler (see the sketch after the following list).

e.g.: ~ 5 min. timeout

  1. MySQL read timeout: 240s ⇾ 4 min.
    $link->options(MYSQLI_OPT_READ_TIMEOUT, 240);
  2. PHP timeout: 300s ⇾ 5 min.
    max_execution_time = 300
  3. Apache timeout: 360s ⇾ 6 min.
    Timeout 360
  4. PHP-FPM: 420s ⇾ 7 min.
    request_terminate_timeout = 420
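
Putting the application-level part of that ladder into code, as a minimal sketch (the connection credentials are placeholders):

<?php

// PHP timeout: 300s. The Apache / PHP-FPM limits above must stay longer,
// so that PHP (not the web server) is the one that aborts the request.
ini_set('max_execution_time', '300');

$link = mysqli_init();

// MySQL timeouts: must fit within the PHP timeout.
$link->options(MYSQLI_OPT_CONNECT_TIMEOUT, 10);
$link->options(MYSQLI_OPT_READ_TIMEOUT, 240);

$link->real_connect('localhost', 'user', 'password', 'database');

// A query running longer than 240s now fails with
// "Errno: 2006 | Error: MySQL server has gone away",
// which our PHP error handling can catch and report.
if ($result = $link->query('SELECT SLEEP(600);')) {
    $result->free();
} else {
    error_log('MySQL error: ' . $link->errno . ' | ' . $link->error);
}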

 


Prepare your PHP Code for Static Analysis

Three years ago I got a new job as a PHP developer; before that I called myself a web developer because I built ReactJS, jQuery, CSS, HTML, … and PHP stuff for a web agency. So now I am a full-time PHP developer, and I converted an untyped (no PHPDoc + no native types) project with ~ 10,000 classes into a project with ~ 90% type coverage. Here is what I have learned.

1. Write code with IDE autocompletion support.

If you have autocompletion in the IDE, most likely the static analysis can understand the code as well.

Example:

bad:

->get('DB_Connection', true, false);

still bad:

->get(DB_Connection::class);

good:

getDbConnection(): DB_Connection

2. Magic in Code is bad for the long run!

Magic methods (__get, __set, …), for example, can help to implement new stuff very fast, but the problem is that nobody will understand it: you will have no autocompletion, no refactoring options, other developers will need more time to read and navigate the code, and in the end it will cost you much more time than it saves.
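
A tiny illustration of the difference (hypothetical classes; the point is what the IDE and static analysis can and cannot see):

<?php

// Magic: works at runtime, but the IDE and static analysis have no idea
// that "name" exists, what type it has, or whether it is spelled correctly.
class MagicUser
{
    /** @var array<string, mixed> */
    private array $data = ['name' => 'Alice'];

    public function __get(string $property)
    {
        return $this->data[$property] ?? null;
    }
}

// Explicit: autocompletion, refactoring, and type checks all work.
class ExplicitUser
{
    private string $name = 'Alice';

    public function getName(): string
    {
        return $this->name;
    }
}

$magic = new MagicUser();
var_dump($magic->naem);          // typo: silently returns null at runtime

$explicit = new ExplicitUser();
var_dump($explicit->getName());  // a typo here would be caught by the IDE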

3. Break the spell on Magic Code …

… by explaining to everyone (Devs > IDE > Tools) what it does.

Example 1:

We use a simple Active Record pattern, but we put all SQL stuff into the Factory classes so that the Active Record class can stay simple. (Example) But because of missing support for generics, we had no autocompletion without adding many dummy methods to the classes. So one of my first steps was to introduce a PHP-CS-Fixer rule that automatically adds the missing methods of the parent class with the correct types via “@method” comments to these classes, as sketched below.
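
What such generated “@method” comments might look like, as an illustrative sketch (the class names here are hypothetical):

<?php

/**
 * @method Bill   fetchById(int $id)
 * @method Bill[] fetchAll()
 * @method Bill   fetchEmpty()
 */
final class BillFactory extends ActiveRowFactory
{
    // The parent class implements these methods generically; the
    // "@method" PHPDocs above narrow the return types for the IDE and
    // for static analysis, and the fixer keeps them up to date.
}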

Example 2:

Sometimes you can use more modern PHPDocs to explain the function. Take a look at the “array_first” function in the linked post.

Example 3:

/**
 * Return an array which has the property values of the given objects as values.
 *
 * @param object[]    $ObjArray
 * @param string      $PropertyName
 * @param null|string $KeyPropertyName if given, uses this property as key for the returned array, otherwise the keys from the given array are used
 *
 * @throws Exception if no property with the given name was found
 *
 * @return array
 */
function propertyArray($ObjArray, $PropertyName, $KeyPropertyName = null): array {
    // init
    $PropertyArray = [];

    foreach ($ObjArray as $key => $Obj) {
        if (!\property_exists($Obj, $PropertyName)) {
            throw new Exception('No Property with Name ' . $PropertyName . ' in Object Found - Value');
        }

        $usedKey = $key;
        if ($KeyPropertyName) {
            if (!\property_exists($Obj, $KeyPropertyName)) {
                throw new Exception('No Property with Name ' . $KeyPropertyName . ' in Object Found - Key');
            }
            $usedKey = $Obj->{$KeyPropertyName};
        }

        $PropertyArray[$usedKey] = $Obj->{$PropertyName};
    }

    return $PropertyArray;
}

Sometimes it’s hard to describe the specific output types; then you need to extend the functions of your Static Analysis tool, so that it knows what you are doing. ⇾ For example here you can find a solution for PHPStan ⇽ But there is still no IDE support for that, so maybe it’s not the best idea to use magic like this at all. I am sure it’s simpler to use specific and simple methods instead, e.g. BillCollection->getBillDates().

4. Try not to use strings in your code.

Strings are simple and flexible, but they are also bad in the long run. Strings are mostly used because they look like a simple solution, but often they are redundant, you get typos everywhere, and the IDE and/or Static Analysis can’t analyze them because it’s just text.

Example:

bad: 

AjaxEditDate::generator($bill->bill_id, $bill->date, 'Bill', 'date');
  • “Bill” ⇾ is not needed here, we can call e.g. get_class($bill) in the “generator” method
  • “date” ⇾ is not needed here, we can fetch the property name from the class
  • “$bill->bill_id” ⇾ is not needed here, we can get the primary id value from the class

good:

AjaxEditDate::generator($bill, $bill->m()->date);
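
Roughly sketched, this works because everything can be derived from the object itself; all names below are assumptions, not our real API:

// m() returns a generated meta object whose fields hold the column
// names, so the "property string" survives rename refactorings.
final class BillMeta
{
    public string $date = 'date'; // generated: one field per column
}

final class Bill
{
    public int $bill_id = 0;
    public string $date = '';

    public function m(): BillMeta
    {
        return new BillMeta();
    }
}

final class AjaxEditDate
{
    public static function generator(object $row, string $property): string
    {
        // class name + primary id are derived from $row instead of
        // being passed in as extra string arguments
        return '<!-- edit-widget: ' . \get_class($row) . '::' . $property . ' -->';
    }
}

// usage: AjaxEditDate::generator($bill, $bill->m()->date);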

5. Automate stuff via git hook and check it via CI server.

Fixing bad code only helps if you also disallow the same bad code in the future. With a git pre-commit hook it’s simple to run these checks, but you need to check them again on the CI server because developers can simply skip the local hooks.

Example:

I introduced a check via “PHP_CodeSniffer” that disallows global variables (global $foo && $GLOBALS['foo']).


6. Use array shapes (or collections) if you need to return an array, please.

Array shapes are like arrays but with fixed keys, so that you can define the types of each key in the PHPDocs.

You will have autocompletion in the IDE + all other devs can see what the method will return + Static Analysis can use and check the types. And you will notice when you would be better off returning an object, because complex data structures are very hard to describe this way.

Example:

/**
* @return array{
* missing: array<ShoppingCartLib::TAB_*,string>,
* disabled: array<ShoppingCartLib::TAB_*,string>
* }
*/
private static function getShoppingCartTabStatus(): array {
...
}
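
The consuming side then gets checked key access, e.g. (a small usage sketch, inside the same class):

$status = self::getShoppingCartTabStatus();
$status['missing'];  // known type: array<ShoppingCartLib::TAB_*,string>
$status['typo'];     // Static Analysis reports the unknown array key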

Generics in PHP via PHPDocs

If you did not know that you can use Generics in PHP, or you do not know exactly how to use them or why you should use them, then the next examples are for you.

Type variables via @template

The @template tag allows classes and functions to declare a generic type parameter. The next examples start with simple functions, so that we understand how it works, and then we will see the power of this in classes.


A dummy function that will return the input.

https://phpstan.org/r/1922279b-9786-4523-939d-dddcfd4ebb86

    <?php    

    /**
     * @param \Exception $param
     * @return \Exception
     *
     * @template T of \Exception
     * @psalm-param T $param
     * @psalm-return T
     */
    function foo($param) { ... }

    foo(new \InvalidArgumentException()); // The static-analysis-tool knows that 
                                          // the type is still "\InvalidArgumentException" 
                                          // because of the type variable.

@template T of \Exception // here we create a new type variable, and we force that it must be an instance of \Exception

@phpstan-param T $param // here we say that the static-analysis-tool needs to remember the type that this variable had before (you can use @psalm-* or @phpstan-*, both work with both tools)

@phpstan-return T // and that the return type is the same as the input type 


A simple function that gets the first element of an array or a fallback. 

In the @param PHPDocs we write “mixed” because this function can handle different types. But this information is not very helpful if you want to understand programmatically what the code does, so we need to give the static-analysis-tools some more information. 

https://phpstan.org/r/1900a2af-f5c1-4942-939c-409928a5ac4a

    <?php
     
    /**
     * @param array<mixed> $array
     * @param mixed        $fallback <p>This fallback will be used, if the array is empty.</p>
     *
     * @return mixed|null
     *
     * @template TFirst
     * @template TFirstFallback
     * @psalm-param TFirst[] $array
     * @psalm-param TFirstFallback $fallback
     * @psalm-return TFirst|TFirstFallback
     */
    function array_first(array $array, $fallback)
    {
        $key_first = array_key_first($array);
        if ($key_first === null) {
            return $fallback;
        }

        return $array[$key_first];
    }

    $a = array_first([1, 2, 3], null);

    if ($a === 'foo') { // The static-analysis-tool knows that
                        // === between int|null and 'foo' will always evaluate to false.
        // ...
    }

@template TFirst // we again define our type variables

@template TFirstFallback // and one more because we have two inputs where we want to keep track of the types

@psalm-param TFirst[] $array // here we define that $array is an array of TFirst types

@psalm-param TFirstFallback $fallback // and that $fallback is some other type that comes into this function

@psalm-return TFirst|TFirstFallback // now we define the return type as an element of the $array or the $fallback type
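
So the inference works for any element type, e.g. (a small sketch):

$date = array_first([new \DateTime()], null); // inferred: DateTime|null
$name = array_first(['x', 'y'], 'fallback');  // inferred: string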


Very basic Active Record + Generics

The IDE support for generics is currently not there, :-/ so we still need some hacks (see @method) for e.g. PhpStorm to get autocompletion.

https://phpstan.org/r/f88f5cd4-1bb9-4a09-baae-069fddb10b12

https://github.com/voku/phpstorm_issue_53352/tree/master/src/Framework/ActiveRecord

<?php

class ActiveRow
{
    /**
     * @var ManagedFactory<static>
     */
    public $factory;

    /**
     * @param Factory<ActiveRow>|ManagedFactory<static> $factory
     * @param null|array                                $row
     */
    public function __construct(Factory $factory, ?array $row = null) {
        $this->factory = &$factory;
    }
}

/**
 * @template T
 */
abstract class Factory
{
    /**
     * @var string
     *
     * @internal
     *
     * @psalm-var class-string<T>
     */
    protected $classname;

    /**
     * @return static
     */
    public static function create() {
        return new static();
    }
}

/**
 * @template  T
 * @extends   Factory<T>
 */
class ManagedFactory extends Factory
{
    /**
     * @param string $classname
     *
     * @return void
     *
     * @psalm-param class-string<T> $classname
     */
    protected function setClass(string $classname): void
    {
        if (\class_exists($classname) === false) {
            /** @noinspection ThrowRawExceptionInspection */
            throw new Exception('TODO');
        }

        if (\is_subclass_of($classname, ActiveRow::class) === false) {
            /** @noinspection ThrowRawExceptionInspection */
            throw new Exception('TODO');
        }

        $this->classname = $classname;
    }

    // ...
}

final class Foo extends ActiveRow {

    public int $foo_id;

    public int $user_id;

    // --------------------------------------
    // add more logic here ...
    // --------------------------------------
}

/**
 * @method Foo[] fetchAll(...)
 *
 * @see Foo
 *
 * // warning -> do not edit this comment by hand, it's auto-generated and the @method phpdocs are for IDE support       
 * //         -> https://gist.github.com/voku/3aba12eb898dfa209a787c398a331f9c
 *
 * @extends ManagedFactory<Foo>
 */
final class FooFactory extends ManagedFactory
{
    // -----------------------------------------------
    // add sql stuff here ...
    // -----------------------------------------------
}
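
With that in place, the usage side gets full autocompletion (a small sketch):

foreach (FooFactory::create()->fetchAll() as $row) {
    echo $row->foo_id; // the IDE knows $row is a Foo via the @method phpdoc
}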

A more complex collection example.

In the end we can extend the “AbstractCollection” and the static-analysis-tools know the types of all the methods.

https://github.com/voku/Arrayy/tree/master/src/Type

/**
 * @template TKey of array-key
 * @template T
 * @template-extends \ArrayObject<TKey,T>
 * @template-implements \IteratorAggregate<TKey,T>
 * @template-implements \ArrayAccess<TKey|null,T>
 */
class Arrayy extends \ArrayObject implements \IteratorAggregate, \ArrayAccess, \Serializable, \JsonSerializable, \Countable
{ ... }

/**
 * @template TKey of array-key
 * @template T
 * @template-extends \IteratorAggregate<TKey,T>
 * @template-extends \ArrayAccess<TKey|null,T>
 */
interface CollectionInterface extends \IteratorAggregate, \ArrayAccess, \Serializable, \JsonSerializable, \Countable
{ ... }

/**
 * @template   TKey of array-key
 * @template   T
 * @extends    Arrayy<TKey,T>
 * @implements CollectionInterface<TKey,T>
 */
abstract class AbstractCollection extends Arrayy implements CollectionInterface
{ ... }

/**
 * @template TKey of array-key
 * @template T
 * @extends  AbstractCollection<TKey,T>
 */
class Collection extends AbstractCollection
{ ... }
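
A concrete collection is then just one small class (FooCollection is a made-up name; see the linked repo for the real examples):

/**
 * @extends AbstractCollection<array-key,Foo>
 */
final class FooCollection extends AbstractCollection
{ ... }

// every consumer, and Psalm / PHPStan, now knows that iterating
// a FooCollection yields Foo objects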

Links:

https://phpstan.org/blog/generics-in-php-using-phpdocs

https://psalm.dev/docs/annotating_code/templated_annotations/

https://stitcher.io/blog/php-generics-and-why-we-need-them

❤️ Simple PHP Code Parser

It is based on code from “JetBrains/phpstorm-stubs”, but instead of native PHP Reflection we now use nikic/PHP-Parser, BetterReflection, phpDocumentor and PHPStan/phpdoc-parser internally. So you can get even more information about the code, for example psalm- / phpstan-phpdoc annotations or inheritdoc from methods.

Install:

composer require voku/simple-php-code-parser

Link:

voku/Simple-PHP-Code-Parser

More:


Example: get value from define in “\foo\bar” namespace

$code = '
  <?php
  namespace foo\bar;
  define("FOO_BAR", "Lall");
';

$phpCode = PhpCodeParser::getFromString($code);
$phpConstants = $phpCode->getConstants();

$phpConstants['\foo\bar\FOO_BAR']->value; // 'Lall'

Example: get information about @property phpdoc from a class

$code = '
  <?php
  /** 
   * @property int[] $foo 
   */
  abstract class Foo { }
';

$phpCode = PhpCodeParser::getFromString($code);
$phpClass = $phpCode->getClass('Foo');

$phpClass->properties['foo']->typeFromPhpDoc; // int

Example: get classes from a string (or from a class-name or from a file or from a directory)

$code = '
<?php
namespace voku\tests;
class SimpleClass {}
$obja = new class() {};
$objb = new class {};
class AnotherClass {}
';

$phpCode = \voku\SimplePhpParser\Parsers\PhpCodeParser::getFromString($code);
$phpClasses = $phpCode->getClasses();

var_dump($phpClasses['voku\tests\SimpleClass']); // "PHPClass"-object