-
Notifications
You must be signed in to change notification settings - Fork 39
Implement Radix Tree #217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Implement Radix Tree #217
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #217 +/- ##
==========================================
+ Coverage 97.58% 97.60% +0.02%
==========================================
Files 56 59 +3
Lines 5176 5353 +177
==========================================
+ Hits 5051 5225 +174
- Misses 125 128 +3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
1d245b8 to
02a6420
Compare
paths/paths.go
Outdated
| var pathTreeCache sync.Map | ||
|
|
||
| // getOrBuildPathTree returns a cached radix tree for the document, or builds one if not cached. | ||
| func getOrBuildPathTree(document *v3.Document) *radix.PathTree { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only needed to make the change backwards compatible. I didn't want to add yet another arg to the FindPath function so we need a way to reuse the radix tree rather than having to make one for each request.
If we'd rather update the function call and have this generated as a ValidationOption, we can take that approach as well.
| } | ||
| } | ||
|
|
||
| func TestFindPath_TieBreaker_DefinitionOrder(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THIS IS A BREAKING CHANGE
Technically the radix tree doesn't support the case where we have 2 paths with identical hierarchy and different parameters. The parameters will override the previous in the radix tree which means you would only be able to match against getPetByName here.
I went over to the spec as I wanted to see if this was even a valid case, and it doesn't sound like it is from what I'm reading.
https://spec.openapis.org/oas/v3.1.0.html#paths-object
A relative path to an individual endpoint. The field name MUST begin with a forward slash (/). The path is appended (no relative URL resolution) to the expanded URL from the Server Object’s url field in order to construct the full URL. Path templating is allowed. When matching URLs, concrete (non-templated) paths would be matched before their templated counterparts. Templated paths with the same hierarchy but different templated names MUST NOT exist as they are identical. In case of ambiguous matching, it’s up to the tooling to decide which one to use.
...
The following paths are considered identical and invalid:
/pets/{petId}
/pets/{name}
I decided to drop support for this, but let me know if you have issues with that.
d2bfaf7 to
f7a0b49
Compare
|
|
||
| // create a new parameter validator | ||
| v.paramValidator = parameters.NewParameterValidator(m, opts...) | ||
| v.paramValidator = parameters.NewParameterValidator(m, config.WithExistingOpts(options)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was actually a bug with the schema cache I added as well. We were using opts... which is the options that were passed in from the user rather than the local options that we created here. This means that each validator section is getting its own schema cache if the user didn't specify their own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix for this is here: #221
f7a0b49 to
c4c00c4
Compare
|
|
||
| // warm the schema caches by pre-compiling all schemas in the document | ||
| // (warmSchemaCaches checks for nil cache and skips if disabled) | ||
| warmSchemaCaches(m, options) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was just moved from below.
| return options.PathLookup | ||
| } | ||
| if document != nil && document.Paths != nil { | ||
| return radix.BuildPathTree(document) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implicitly will make it so that if someone is calling FindPaths directly they will create a new radix tree on every call.
An alternative here is that we can set the tree we made here onto the options, but if the options are used in a goroutine there may be issues with that. We could potentially lock it with a mutex, but not sure if it's worth the complexity. I imagine most people aren't calling FindPaths directly.
The other alternative is we just don't use the radix tree if you didn't set it. In other words, using the Validator constructor will support it, but if you try and access these methods directly then you won't have it.
NOTICE
This PR has a subtle breaking change when it comes to identical paths. Refer to the comment below for more info:
https://github.com/pb33f/libopenapi-validator/pull/217/changes#r2700417695
Summary
I noticed some performance issues with the current path matching approach as it recomputes regexes on every validation call. I unfortunately didn't notice the regex cache before making this change, but either way according to my benchmarks this approach is faster and more memory efficient.
The idea is to essentially use a Radix tree for matching the incoming request path against those in the OpenAPI spec to find the appropriate PathItem. The RadixTree will bring the runtime down to O(k) where k is the number of path segments; however, it does not work for the more complex paths (OData, matrix, etc.). So I've essentially made it so we try and run the RadixTree lookup first as I imagine most HTTP urls are simple paths and if that fails we then fall back to the previous approach.
Note that most of the changes in this PR are due to me updating the
FindPathfunction to take in the ValidatorOptions rather than RegexCache directly. I either had to add the Radix tree as a new arg or swap to options so I figured swapping to options would reduce changes in the future if we needed more things later.Testing
AI wrote all of the tests, but I had it leave all of the previous unit tests except for a couple of outliers:
path_parameter_test.go- The tests that were removed in this file were testing that the RegexCache was appropriately updated for the simple path test case; however, we never hit the cache anymore since the radix handles that case for us. Instead, a new test was added with a supported (simple) and unsupported (OData) path and we test that the unsupported path does have a cache hit.paths_test.go- Same story here honestly. The only thing that changes is if it was testing the regex cache then we have to move it to an unsupported path. If it's a simple path, none of the tests should have to change, it's backwards compatible. No one should really notice anything is different (except for performance 😄)