-
Notifications
You must be signed in to change notification settings - Fork 212
Disallow anonymous Data from being dumped #766
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
@nevans @eregon Here's the PR I promised. I believe my justifications for removing this feature are sound. While Psych does allow dumping and loading anonymous classes and structs, Marshal does not. I believe Psych should not allow serializing things that Marshal disallows: I also don't think the feature is actually worth keeping, since it produces Data objects that no longer reflect their original class: And a large stream with many anonymous Data objects in it will create as many different anonymous Data types, all potentially identical in structure. None of these are what people want or expect. |
This feature leaked out without much discussion, but there's many reasons why it should not be supported: * Each instance of an anonymous Data in the stream will cause a new Data subtype to be defined. * Multiple objects of the same anonymous Data cannot be round- tripped, since they will no longer be of the same type after loading. As with Marshal's behavior for anonymous classes and structs, Psych should reject such objects rather than attempt to fake a Data class that appears to fit the original structure.
ce160ce to
7a37482
Compare
| raise TypeError, "can't dump anonymous Data: #{o}" unless o.class.name | ||
| ivars = o.instance_variables | ||
| if ivars.empty? | ||
| tag = ['!ruby/data', o.class.name].compact.join(':') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| tag = ['!ruby/data', o.class.name].compact.join(':') | |
| tag = "!ruby/data:#{o.class.name}" |
and same below.
|
That makes sense to me, especially the last case where one Data subclass becomes 100_000 Data subclasses after "roundtripping". I think we should also disallow loading anonymous Data for consistency & simplicity, but that's probably best done after #765 is merged to avoid conflicts. |
|
I'm still against this. My long term objection to it is a weak objection. But I am strongly against it without at least giving a deprecation period (until at least ruby 4.1). If someone wants to load anonymous data (or struct), that's their prerogative, IMO. To me, YAML is not only (or even primarily) about round-tripping, and the inability to fully round-trip something shouldn't be the only justification for being able to load (or dump) it. My projects over the years have loaded orders of magnitude more YAML than they've serialized. And even when I do dump YAML, sometimes that's more about printing a quick and dirty debug file that I can easily read through, and I have no intention of ever loading it into anything but my own brain (or maybe a quick one-off And LOTS of data can't be round-tripped safely. If that is the justification for disabling this, that justifies breaking many other cases, but in particular all of the other anonymous class round tripping, right? IMO, that would break many things. irb(main):022> oclass = Class.new do attr :ivar; def initialize(...) super; @ivar = "hello obj" end end
=> #<Class:0x00007a986f106558>
irb(main):023> oclass.new.to_yaml
=> "--- !ruby/object\nivar: hello obj\n"
irb(main):024> YAML.unsafe_load oclass.new.to_yaml
=> #<Object:0x00007a986ecf6788 @ivar="hello obj">
irb(main):025> _.class
ObjectAnonymous Data/Struct loading is far more useful than this, because the only way to use the anonymous object I've just loaded is manually modifying the singleton class! At which point, you're creating another class per object again. Of course, anonymous Data (and Struct) do have the extra issue beyond other anonymous classes: they create a brand new class per object. But I still don't think that should be a necessary restriction. They also create a custom object per object, so some sort of linear memory use is already baked in. If someone wants to do that (and bust a bunch of their caches), let them. (Adding a flag to enable/disable it would be fine, of course.) On the other hand, changing how anonymous data (and structs and objects) are dumped to be more like how anonymous Array subclasses and Hash subclasses are dumped would be much more acceptable to me. It would still be a breaking change, and a conservative approach would still give it a deprecation period or at least be careful about timing which release changes the default behavior. But it would force people to confront any round-tripping problems head-on, rather than hiding them. irb(main):001> aclass = Class.new(Array) do def initialize(...) super; @ivar = "hello array" end end
=> #<Class:0x00007448bdd45bf0>
irb(main):002> aclass.new(10) { rand }.to_yaml
=> "--- !ruby/array:%23%3CClass:0x00007448bdd45bf0%3E\ninternal:\n- 0.8122580707133161\n- 0.9037981601701021\n- 0.04788691095832298\n- 0.5915898581951498\n- 0.01001794374303544\n- 0.3938141475427078\n- 0.9484047343467139\n- 0.6925541802973968\n- 0.29676774864966593\n- 0.6565246077519113\nivars:\n :@ivar: hello array\n"See the array.class.inspect in there: irb(main):003> YAML.unsafe_load aclass.new(10) { rand }.to_yaml
/home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/class_loader.rb:57:in `path2class': can't retrieve anonymous class #<Class:0x00007448bdd45bf0> (ArgumentError)
path2class(name)
^^^^
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/class_loader.rb:57:in `resolve'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/class_loader.rb:49:in `find'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/class_loader.rb:29:in `load'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/visitors/to_ruby.rb:465:in `resolve_class'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/visitors/to_ruby.rb:266:in `visit_Psych_Nodes_Mapping'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/visitors/visitor.rb:30:in `visit'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/visitors/visitor.rb:6:in `accept'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/visitors/to_ruby.rb:35:in `accept'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/visitors/to_ruby.rb:345:in `visit_Psych_Nodes_Document'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/visitors/visitor.rb:30:in `visit'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/visitors/visitor.rb:6:in `accept'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/visitors/to_ruby.rb:35:in `accept'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/nodes/node.rb:49:in `to_ruby'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych.rb:275:in `unsafe_load'
from (irb):3:in `<main>'
from <internal:kernel>:187:in `loop'
... 3 levels...
/home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/class_loader.rb:57:in `path2class': undefined class/module Struct::#<Class (ArgumentError)
path2class(name)
^^^^
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/class_loader.rb:57:in `resolve'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/class_loader.rb:49:in `find'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/class_loader.rb:29:in `load'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/visitors/to_ruby.rb:465:in `resolve_class'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/visitors/to_ruby.rb:266:in `visit_Psych_Nodes_Mapping'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/visitors/visitor.rb:30:in `visit'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/visitors/visitor.rb:6:in `accept'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/visitors/to_ruby.rb:35:in `accept'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/visitors/to_ruby.rb:345:in `visit_Psych_Nodes_Document'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/visitors/visitor.rb:30:in `visit'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/visitors/visitor.rb:6:in `accept'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/visitors/to_ruby.rb:35:in `accept'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych/nodes/node.rb:49:in `to_ruby'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/psych-5.2.6/lib/psych.rb:275:in `unsafe_load'
from (irb):3:in `<main>'
from /home/user/.local/share/rbenv/versions/3.3.10/lib/ruby/gems/3.3.0/gems/irb-1.15.3/lib/irb/workspace.rb:101:in `eval'
... 17 levels...So, my counter-proposal is this:
|
|
Since the vast majority of production code that is loading YAML is (or should be) using But sometimes people may use So, I'll add the following details to my counter-proposal:
|
This feature leaked out without much discussion, but there's many reasons why it should not be supported:
As with Marshal's behavior for anonymous classes and structs, Psych should reject such objects rather than attempt to fake a Data class that appears to fit the original structure.