Skip to content

Conversation

@vincentarelbundock
Copy link
Collaborator

Some challenges we face in tinyplot

  1. There many (read: MANY) different graphical elements, parameters, measurements, settings, and user inputs.
  2. Function calls tend to be extremely long, because we need to pass all those parameters around back and forth all the time.
  3. It is not always clear what "state" variables are in. This partly the fault of list2env, but also because overwrite a lot of variables, which makes user input unavailable, and forces us into tricks like xlim_user.
  4. Some information is not available to the type_*() functions, which complicates things like labelling (see recent discussion with @zeileis on type-specific labels).
  5. We don't have a lot of enforced checks / assertions on user input. It would be nice to raise informative errors more quickly and more often.

Potential solution

  1. Create a "container" object called settings that holds all user input, state variables, and computed parameters.
  2. Instead of passing a gazillion parameters to every function, we just pass settings.
  3. Whenever a function gets settings as input, it calls settings_to_environment() at the top.
    • That makes it easy access all internal information in processing / helper functions.
  4. Whenever a function gets settings as input and wants to modify something, it assigns explicitly the new values just before return(settings).
    • That makes it easy to inspect exactly which parameters/variables have been change by a function: just look before the return() statement.
  5. If we use a simple S4 class as "container", we get type validation on user inputs for "free", which is nice. We can also use initialize() to perform some sanity checks and assertions on user input automatically, without polluting the main code.
    • FWIW, I do something similar in tinytable and `marginaleffects, and it has worked out super nicely. The code is much cleaner now.

Proof of concept

This is just a proof of concept PR, with tiny subset of parameters. We have a simple tinyplot_settings object, that we initialize with some user input. Then, we pass those settings off to sanitize_type, which modifies appropriate internal slots. Finally, we call settings_to_environment() when we need to access those slots (or we use standard accessors like @type_draw.

@zeileis
Copy link
Collaborator

zeileis commented Sep 9, 2025

I have to admit I really like the freedom and flexibility of the functions with the long argument lists where we just can shove anything in and out.

But I also agree that containerization will probably make the code more readable.

I'm not convinced, though, that using S4 classes is a good idea. First, it necessitates loading one more package (even if it is a base package that is loaded by default in most R configs). Also the automatic sanity checking is "free" but comes at a computational cost, especially if the checks are triggered often. So I wonder whether there is maybe some middle ground between the current freedom and the strict S4 approach.

@vincentarelbundock
Copy link
Collaborator Author

Oh yeah, I'm absolutely not wedded to S4. Happy to use any container structure.

I do see the appeal of long lists of arguments, but IMHO this already has a big scope and we may introduce more things for other types, so maintainability and clarity are pretty big priorities at this stage

@zeileis
Copy link
Collaborator

zeileis commented Sep 10, 2025

Yes, I do see potential in making things more explicit and wrapped up in one place.

One thing I'm not sure about is how the container structure helps with tracking the evolution from defaults plus user settings to sanitized settings to type-specific settings. Would you keep multiple copies of the entire list rather than copies of individual variables?

@vincentarelbundock
Copy link
Collaborator Author

Good question, and I'll have to think about that.

One option is that a lot of checks we want could simply be is.null() on a stored match.call(), but I'm not sure that's the cleanest approach.

This is the kind of thing that is likely to become clearer for me when I give this a "real" try.

@zeileis
Copy link
Collaborator

zeileis commented Sep 10, 2025

I'm not sure whether the match.call() is a good strategy because you then have to be careful about when and where you evaluate it etc.

Off the top of my head, I would probably collect the user input in a plain list first and at that point run some sanity checks with rather explicit warning/error messages etc. because that might be helpful for the user. And that list might then get stored along with some subsequent version(s) of it.

But I haven't played any of this through...

@grantmcdermott
Copy link
Owner

Just catching up here after work. I like the idea and can see how a more structured OOP system like S4 would gives us a lot of control... but am also weary of introducing this as a dependency and have yet to use it for any of my other packages. Honestly, it sounds like something that we should discuss on a call if you are both up for it? I'll send an email to propose.

@vincentarelbundock
Copy link
Collaborator Author

but am also weary of introducing this as a dependency and have yet to use it for any of my other packages.

Really, the vault could just be a named list...

Happy to talk when I come back home from this conference.

@vincentarelbundock
Copy link
Collaborator Author

Closing in favor of the more fleshed out example here:

#473

Note that I have not done this other people to commit anyone to this. Still very much up for debate whether we eant to do this. I just thought it would be useful to actually see what the core function might look like after simplification/modularization/centralization.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants