Skip to content

Conversation

@splitbrain
Copy link
Owner

This is a major refactoring of the plugin's code base with the goal to make it easier to maintain and extend. The monolithic architecture has been broken into several classes with clearly separated concerns.

All PDF generation is now decoupled from the environment. This is in preparation to add a a command line or remote API component in the future.

MPDF has been updated to version 8.2.

Service classes:

  • PdfExportService.php: PDF generation and delivery service
    • Extracted PDF generation logic from action plugin
    • Handles PDF output, caching, and HTTP response
    • Clean separation between event handling and business logic
  • Config.php: Centralized configuration management
    • Encapsulates plugin configuration and input variables
    • Generates mPDF configuration arrays
    • Provides clean API for accessing settings
  • DokuPdf.php: mPDF wrapper (replaces DokuPDF.class.php)
    • Thin wrapper around mPDF with DokuWiki-specific defaults
    • Custom HttpClient and LocalContentLoader integration
    • has a few todo markers for future improvements in language handling
  • Writer.php: Main PDF composition orchestrator
    • Composites Config, Template, DokuPdf, and Styles
    • Handles TOC generation, internal link rewriting, and debug output
    • Manages the overall PDF writing workflow
  • Template.php: Template file handling
    • Loads header/footer/cover templates
    • Applies placeholder substitution
  • Styles.php: CSS and styling management
    • Loads styles from plugins and templates
    • Handles CSS compilation and processing
  • Cache.php: Caching layer
    • Implements DokuWiki cache integration
    • Manages cache validation and retrieval

New collector architecture is responsible to collect the pages to be added to the PDF.

  • AbstractCollector.php: Base class for all collectors
    • Common functionality for page collection
    • Checks ACLs
    • extracts title and language info
  • PageCollector.php: Single page export
    • No longer relies on global $ID
    • Clean dependency injection via Config
  • NamespaceCollector.php: Namespace export
    • Recursive page collection from namespaces
    • provides new event DW2PDF_NAMESPACEEXPORT_SORT where plugins could influence the sorting (or even filter out pages)
  • BookCreatorLiveSelectionCollector.php: Live BookCreator selections
    • Handles temporary book selections
  • BookCreatorSavedSelectionCollector.php: Saved BookCreator selections
    • Handles saved book collections
  • CollectorFactory.php: Factory for instantiating collectors
    • Determines correct collector based on request parameters

Media Handling #527

  • MediaLinkResolver.php: Media link resolution (replaces old ImageDecorator)
    • Resolves DokuWiki media links to filesystem paths
    • Integration point for custom image processing
  • HttpClient.php: Custom HTTP client for mPDF
    • Implements mPDF HttpClient interface
    • Integrates MediaLinkResolver for proper media handling
  • LocalContentLoader.php: Custom local content loader
    • Implements mPDF LocalContentLoaderInterface
    • Integrates MediaLinkResolver for proper media handling

Link rewriting is now done outside the (cached) rendering. The plugin requires the DOM and XML extension for this.

A bunch of tests have been added but they are far from comprehensive. Help in manually testing this branch would be very welcome

In theory, bookcreator integration should remain unchanged, but has not been tested. I did however change a lot of the Exception handling regarding the bookcreator integration so this part might behave different.

Issues addressed in this PR: #528 #527 #526 #508 #455 #140 (could be implemented via the new event now)

This refactoring has been largely financed by the DokuWiki Business Plugin Partner Program. Please consider joining the program to support future work like this.

We now have a few classes that separate the concerns

DokuPDF  - inherits from mPDF and doesn't do much anymore except for a
           few internal defaults and overwrites
Config   - encapsulate the configuration (file and input vars) and created
           an mpdf config array
Template - handles loading the template files and applies placeholders
Writer   - composites the above classed and gives access to write
           operations
We now have a Styles class that takes care of loading the styles from
plugins etc.
The basic handling of page collecting, caching and rendering is now
defined.

There is still some ugly dependency in the renderer and the
ImageProcessor loading is not reimplemented yet.

After the above a first attempt to actually run the code can be made.
This starts to approach #528 and #526
This should make tests and a future command line interface easier to
implement.
As suggested in mpdf/mpdf#2145 the previous decorator functionality was
implemented via custom HTTPClient and LocalContent loaders.
We want the ID to be passed via Config as well.
These tests have been initially LLM written, I went through each of them
to make sure they make sense and fixed what I thought needed fixing.

However coverage is still not great and should be improved in the
future.
This is now handled within the renderer itself
Now only event handling happens in the action handler
@splitbrain splitbrain requested a review from Klap-in November 26, 2025 19:54
@Klap-in
Copy link
Collaborator

Klap-in commented Nov 26, 2025

Impressive refactoring! Thanks for the work!
I start reviewing today, but it will take some time as I have limit time per day/week.

Just a first small observation. The tpl url-parameter seems now not supported anymore? Is that intended? I have reports that people seem to use it. I was still considering to add a dropdown or something like that for it in the Bookcreator.

@Klap-in
Copy link
Collaborator

Klap-in commented Nov 26, 2025

Doc change: seems debug output can now directly be requested with debug url parameter, it is not needed to enable $conf['allowdebug'] first before using &debughtml=text|html

Copy link
Collaborator

@Klap-in Klap-in left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some first comments/questions. Some are also a bit note to self as I have not tested anything.


if ($header) {
$this->mpdf->SetHTMLHeader($header, 'O');
$this->mpdf->SetHTMLHeader($header, 'E');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not expect both O and E need to be set for a first page. Does this mean also a second page in the document is always affected? (not tested yet)
(the old code used named html with pseudo css, difficult to see quickly what happened before)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good question. I guess the first page is always odd? OTOH the first page might be pushed to an even page if there's a TOC before? I think it's okay to set both - mpdf will pick what it needs.

Copy link
Collaborator

@Klap-in Klap-in Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intend to test some variants of the template. I suppose the cover would be added before ToC, while wiki page(s) after the ToC? Not yet checked in the code again.

action.php Outdated
* @throws MpdfException
*/
protected function generatePDF($cachefile, $event)
protected function generatePDF(Config $config, AbstractCollector $collector, $cachefile, $event)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it logical to use an interface here for the collector?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... good question. Currently I need collectors to be children of AbstractCollector, because I use methods from it on the Collector. Using an Interface would make that more difficult, but I might be wrong.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not fluent with this kind of patterns, so for in depth ideas others better step in ;-)

$this->pages = array_filter(
array_map('cleanID', $this->collect()),
fn($page) => auth_quickaclcheck($page) >= AUTH_READ
);
Copy link
Collaborator

@Klap-in Klap-in Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

book_skipforbiddenpages is not used anymore? Needs doc update and bookcreator fix.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe book_skipforbiddenpages is part of the mentioned changed exception handling...

Collectors will always obey ACLs thanks to above code. I think before a book export that contained ACL protected pages did return to the book creator page with an error unless skipforbiddenpages was set. Now it basically behaves like skipforbiddenpages is always set, because above code will skip those protected pages by default.

}

/**
* Note: we do not set up any dependencies to the plugin source code itself. On normal installation,
Copy link
Collaborator

@Klap-in Klap-in Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check or update docs for development.

src/Config.php Outdated
$this->useStyles = array_map('trim', $this->useStyles);
$this->useStyles = array_filter($this->useStyles);
}
if (isset($conf['watermark'])) $this->watermark = $conf['watermark']; // FIXME currently not in default config
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was only url parameter, assumption from my side. E.g. might have a use for marking it as ‘Draft’, which is not useful to set permanently as a config setting.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I guess you rarely would set a default. But I could imagine setting it to "CONFIDENTIAL" in some wikis to make sure people are careful with printouts...

@splitbrain
Copy link
Owner Author

it is not needed to enable $conf['allowdebug']

Hmm you're right. Do you think this is a security issue? Might be a SEO issue at least if debug links end up on the web (duplicate content).

BTW. I am not 100% happy with how debugging currently works anyway (just suddenly dumping HTML and exiting). I might work on it a bit more.

@Klap-in
Copy link
Collaborator

Klap-in commented Nov 27, 2025

it is not needed to enable $conf['allowdebug']

Hmm you're right. Do you think this is a security issue? Might be a SEO issue at least if debug links end up on the web (duplicate content).

I do not expect it will expose css or html people could otherwise not find
It is also less processing, so load on server is probably not that heavy. So for me it does not matter that much.

BTW. I am not 100% happy with how debugging currently works anyway (just suddenly dumping HTML and exiting). I might work on it a bit more.

Its purpose is to have a way to grab the entire input that is given to mPDF, to enable debugging pdf creation itself. Since we add page for page to mPDF this workaround with buffering html is added. Of course, it is not the literal input to mPDF.

splitbrain and others added 7 commits November 27, 2025 12:29
This should make it easier to add new configuration and see with a
glance how it is initialized.
This refactors the service so that tests can easily access the produced
debug HTML
The test passes, but I am not 100% sure the behaviour is correct yet.
Static variables in the renderer might be a problem. Not tested with
multiple documents yet.

Currently render options need to be set via global conf and are not
passed from the Config object.
Using static variables makes it impossible to automatically test this
correctly.

The PDF renderer is now a singleton. The Writer class initializes it
anew for each export, but during the export, the same renderer is
reused.
This streamlines the configuration to a single point of entry. The
renderer was the last bit not adhering to that.
Again mostly LLM generated but manually checked and corrected.
@splitbrain
Copy link
Owner Author

Merged the attributes PR, added a whole bunch of tests, fixed issues with numbered headers in the renderer.

$this->debug = $config->isDebugEnabled();

// initialize a new renderer instance (singleton instance will be reused in later p_* calls)
$renderer = plugin_load('renderer', 'dw2pdf', true);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this have influence on rendering included page by Include plugin? Or other render specialities?

Copy link
Owner Author

@splitbrain splitbrain Nov 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. This should not be too different from the combination of a reference to the (singleton) action plugin and class static variables that were used before.

mpdf v8.2.6 => v8.2.7
log 3.0.2 => 1.1.4

Using the old log version avoids conflicts with the smtp plugin for now.

However we need a proper solution to this problem in the future
(probably in core).

A PR to be able to use any log version with the smtp plugin has been
submitted: txthinking/Mailer#36

We probably should introduce a log/psr:3 dependency in core when
upgrading to PHP8 (and provide a PSR compatible wrapper around our
logger)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants