### Core Principles

1. **Rename once, when files are first attached to a topic**
2. **Never rename on topic updates** - maintains parity with Azure
Let's assume that S3 has the same document filenames as Azure from the beginning.
S3 remains our provider for Active Storage uploads, so Azure content is not managed automatically.
Now imagine that we update a topic's document_prefix (or the provider's file_name_prefix) and then archive the topic.
We then need to move its documents from one Azure folder to another. How do we find the documents for the updated topic?
I don't think we want to move the files in Azure.
Changing the document_prefix or file_name_prefix should only affect files uploaded after that change. Files we have already uploaded will keep their old names.
So if we have a topic that we imported, along with its one document, and then later we added another document, we would have:
- this_is_an_imported_name.jpg (in Azure: this_is_an_imported_name.jpg)
- [skillrx_internal_upload]_ep34_06_a_later_file.mp3 (in Azure: 206206_myprovider_2025_08_ep34_06_a_later_file.mp3)
After we add the renaming to our lifecycle and also run a one-time fix of the filenames we are storing, we will have:
- this_is_an_imported_name.jpg (in Azure: the same thing)
- 206206_myprovider_2025_08_ep34_06_a_later_file.mp3 (in Azure: the same thing)
If we then update the provider prefix, then upload a new file, we will have:
- this_is_an_imported_name.jpg (in Azure: the same thing)
- 206206_myprovider_2025_08_ep34_06_a_later_file.mp3 (in Azure: the same thing)
- 206206_mynewproviderprefix_2025_10_06_yet_another_file (in Azure: the same thing)
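To make the rename-once rule concrete, here is a minimal Python sketch. It is an illustration only, not code from the app. The `Topic` and `Provider` classes, the `attach` method, and the `code` field are hypothetical, and the name pattern `<code>_<file_name_prefix>_<year>_<month>_<original name>` is only inferred from the example filenames above, not taken from the spec.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    file_name_prefix: str


@dataclass
class Topic:
    code: str            # hypothetical leading identifier, e.g. "206206"
    provider: Provider
    year: int
    month: int
    stored_names: list[str] = field(default_factory=list)

    def attach(self, original_name: str, imported: bool = False) -> str:
        """Compute the stored name once, at attach time, and freeze it."""
        if imported:
            # Imported files already match Azure, so they keep their name.
            name = original_name
        else:
            # Assumed pattern, inferred from the examples in this thread.
            name = (
                f"{self.code}_{self.provider.file_name_prefix}_"
                f"{self.year}_{self.month:02d}_{original_name}"
            )
        self.stored_names.append(name)
        return name


topic = Topic(code="206206", provider=Provider("myprovider"), year=2025, month=8)
topic.attach("this_is_an_imported_name.jpg", imported=True)
topic.attach("ep34_06_a_later_file.mp3")

# A later prefix change affects only files attached after it.
topic.provider.file_name_prefix = "mynewproviderprefix"
topic.month = 10
topic.attach("06_yet_another_file")
```

Because each name is computed once and stored, changing the prefix never touches `stored_names` entries that already exist, which is what keeps them in line with Azure.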
If they decide they want to do any sort of bulk renaming of files after this is all in place, including moving those around in Azure, that will be a separate effort.
We don't move files because of file_name_prefix, for example. We move them because of deletion or archiving, right?
When a topic is archived, we should move its files. Is it possible not to find them by name?
Actually, we do not allow editing document_prefix for topics, and we just disabled editing file_name_prefix for providers. If we enable the latter again, we may get into trouble.
The stakeholders want to update file_name_prefix for all providers. And they want to use that to name the provider-specific files. That's what's motivating me to want to get our file names in line with what we upload to Azure.
Yes, I remember. I just don't understand how this sync will solve our problem.
I will try to go through this doc carefully.
Let me know if you want to talk it through. Basically, it solves the problem because we will always be managing our file sync with Azure using the names stored in our app. If they change one of the fields involved in generating those names, those changes will only apply to future files. If at any point they want to rename the files in Azure, that will be a separate issue with a separate solution.
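As a rough sketch of what "managing the sync using the names stored in our app" could mean for archiving, the Python snippet below builds the list of moves from the stored names alone. The function name `archive_moves` and the folder names `active` and `archived` are hypothetical; the real Azure layout may differ.

```python
def archive_moves(stored_names, active_dir="active", archive_dir="archived"):
    """Return (source, destination) paths for each stored filename.

    The current document_prefix and file_name_prefix are never consulted,
    so a prefix change cannot make an already-uploaded file unfindable.
    """
    return [
        (f"{active_dir}/{name}", f"{archive_dir}/{name}")
        for name in stored_names
    ]


moves = archive_moves([
    "this_is_an_imported_name.jpg",
    "206206_myprovider_2025_08_ep34_06_a_later_file.mp3",
])
```

The lookup depends only on what was recorded at upload time, so it keeps working even if file_name_prefix editing is enabled again later.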
I think I understand the benefits now.
See FILE_RENAMING_SPECIFICATION.md.
This specs out a possible approach to resolving our challenges with filenames. The goal is to have parity between SkillRx filenames and the names in Azure. We already have this with the imported files; the file names in SkillRx and Azure are the same for those files. The files uploaded via SkillRx, however, have different names.