Add regression tests for get_latest and get_oldest functions #110
Code was calling `os.path.join()` too many times and causing the path to be repeated unnecessarily. Signed-off-by: Wei Ji <23487320+weiji14@users.noreply.github.com>
3d2364c to d0919fa
-    ],
-    key=lambda path: int(path.split("/")[-1].split("_")[1]),
+    [x for x in os.listdir(targdir) if qualifier(os.path.join(targdir, x))],
+    key=lambda path: int(path.split("_")[1]),
probably the same issue below with get_oldest ?
maybe we could add a unit test?
looks better, though now the ctime key fails in get_oldest
Ok, have pushed a fix, can you run the tests again to see if it works?
Signed-off-by: Wei Ji <23487320+weiji14@users.noreply.github.com>
When a list of 3 files is passed into the get_oldest and get_latest functions, ensure that the files created first and last are returned, respectively. Signed-off-by: Wei Ji <23487320+weiji14@users.noreply.github.com>
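A regression test along these lines might look like the sketch below. It is only an illustration: the `get_latest` and `get_oldest` defined here are hypothetical stand-ins (assumed to sort by `os.path.getctime` by default and to accept an optional `key`), not the project's actual implementations, and the file names and the 0.05 s delay between writes are arbitrary choices made so that the timestamps differ.

```python
import os
import tempfile
import time


def _sorted_candidates(targdir, qualifier, key):
    # Build full paths once, filter them, then sort with the supplied key.
    full_paths = [os.path.join(targdir, x) for x in os.listdir(targdir)]
    return sorted((p for p in full_paths if qualifier(p)), key=key)


def get_latest(targdir, qualifier=lambda path: True, key=os.path.getctime):
    # Hypothetical stand-in: returns the full path ranked last by `key`.
    candidates = _sorted_candidates(targdir, qualifier, key)
    return candidates[-1] if candidates else None


def get_oldest(targdir, qualifier=lambda path: True, key=os.path.getctime):
    # Hypothetical stand-in: returns the full path ranked first by `key`.
    candidates = _sorted_candidates(targdir, qualifier, key)
    return candidates[0] if candidates else None


def make_files(targdir, names, delay=0.05):
    # Write files one at a time, pausing so their timestamps are distinct.
    for name in names:
        with open(os.path.join(targdir, name), "w") as f:
            f.write(name)
        time.sleep(delay)


def test_get_latest_and_oldest():
    with tempfile.TemporaryDirectory() as tmp:
        # Deliberately not in alphabetical order, so a name-based sort
        # cannot pass the test by accident.
        names = ["b.txt", "c.txt", "a.txt"]
        make_files(tmp, names)
        assert get_oldest(tmp) == os.path.join(tmp, "b.txt")
        assert get_latest(tmp) == os.path.join(tmp, "a.txt")
```

Comparing against `os.path.join(tmp, name)` also guards against the repeated-prefix bug, since a doubled directory prefix would not match the expected path.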
20c95a4 to f082164
Signed-off-by: Wei Ji <23487320+weiji14@users.noreply.github.com>
2c843ba to cf39fc7
Updates `get_latest` and `get_oldest` to use the same sorting function, and allows the dataloader ckp handler to pass in its custom sort manually. Removes the bug where excessive path joins lead to repeated path prefixes in dataloader ckp loading. Fixes GPTBigCode signatures used for speculator training, to match superclass signatures (currently preventing other PRs from landing). Includes and subsumes #110 and #96. Full credit to @weiji14 and @Akash-Nayak respectively
Edit: Bug fix was applied in #119. This PR thus only adds two unit tests to prevent bug regression.
Code was calling `os.path.join()` too many times and causing the path to be repeated unnecessarily, resulting in errors like:

This patch removes one of the `os.path.join()` calls, and also the `path.split("/")[-1]` part, which was needed because a path with "/" dividers was used in the list comprehension.
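To make the failure mode concrete, here is a minimal sketch of how a second `os.path.join()` can repeat the directory prefix. The function names, the `step_<N>_ckp` file naming, and the `ckpts` directory are assumptions made for illustration and are not taken from the repository; the example also assumes POSIX-style "/" separators and a relative `targdir`.

```python
import os


def pick_latest_buggy(targdir, names):
    # The list already holds joined paths, so the key has to strip the
    # directory part with split("/")[-1] before reading the step number.
    full_paths = [os.path.join(targdir, x) for x in names]
    latest = max(full_paths, key=lambda path: int(path.split("/")[-1].split("_")[1]))
    # Joining again repeats the prefix when targdir is a relative path.
    return os.path.join(targdir, latest)


def pick_latest_fixed(targdir, names):
    # Sort bare file names, so the key can read the step number directly,
    # and join with the directory exactly once at the end.
    latest = max(names, key=lambda path: int(path.split("_")[1]))
    return os.path.join(targdir, latest)


names = ["step_100_ckp", "step_20_ckp", "step_3_ckp"]
buggy = pick_latest_buggy("ckpts", names)
fixed = pick_latest_fixed("ckpts", names)
```

With these inputs, `buggy` carries the `ckpts` prefix twice while `fixed` carries it once. Sorting bare file names also means an underscore in the directory name cannot interfere with `split("_")`, which is why the `split("/")[-1]` step is no longer needed.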