Decoupled mode is not being honored in cacheless mode after redirect #2146
Comments
It appears that the problem is caused by the L1desc entry not following a redirect. The data id is used for decoupled mode and it is taken from the L1desc table. |
This is not an issue with archive mode as the data object id is read directly from the irods::file_object class since the object has already been written to cache and therefore the database. This is not yet available in cacheless mode so we had to use the L1desc table. |
Hi, is there any update regarding this issue? Is there any plan to pick it up in a coming release? TIA for your prompt response. |
The fix will be in the next release of the plugin. We should be able to fix it in the next few weeks. |
Hi, thanks for your response. Would that fix work with iRODS 4.2.12? Or only 4.3.x? |
The fix would apply to 4.2 and 4.3 assuming it doesn't require any changes to the iRODS server. If changes to the server are needed, then only 4.3 will receive the fix. |
The commits have been merged to both main and 4-2-stable. Closing this issue. |
Reopening this issue because the fix did not work on 4.2.12. When I tested it, I accidentally had the resource set to detached mode, so the redirect did not happen as intended. It appears that, unlike in 4.3.1, in 4.2.12 the entry has not yet been written to the r_data_main table at the time s3_file_create() is called. We rely on that entry for the S3 resource to determine the object id, which is used for the S3 object key in decoupled mode. |
I have unchecked both the main and 4-2-stable checkmarks, as the solution here does not work for files that use parallel writes. Here is a brief description of how this currently works. Let's say the client is connected to server A and the S3 resource is attached to server B.
For this to work correctly, the physical path needs to be repaved within the agent memory on server A before step 8 is run. In the case of large files, the only plugin operation run on server A is the initial At this point I don't see a solution that only involves changes to the S3 plugin. Some options to move forward:
|
I like option 3 at first glance. |
Is all of this because decoupled mode needs the data id to create a valid physical path / prefix? |
Right... because we decided to use 'reversed data_id' as the prefix? What if... we did something else? Design goals include... 'stable', and ... what else? |
Exactly what I was getting at. There doesn't appear to be a reason to use the reversed data id. Perhaps the scheme can be controlled by the admin (with a sane default if the admin doesn't care). |
The thing is that the object key has to be predictable (so that both servers agree on the key) and guaranteed to be unique. On the first point, if it is not predictable then the key would have to somehow be shared between the two computers. The obvious solution would be to use a hash of the original iRODS path. I thought that I looked into that and found that it wasn't possible but don't remember why. When I get a chance I will look into it again. |
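To make the two requirements above concrete, here is a minimal sketch comparing the two key schemes mentioned in this thread. It is not the plugin's code: the function names and the `<prefix>/<object name>` layout are assumptions for illustration only. The reversed data id scheme needs the catalog's data id, which is exactly what is not available at create time in the failing case. The path-hash scheme needs only the logical path, so any two servers would compute the same key independently. Note that a hash is collision-resistant rather than strictly guaranteed unique.

```python
import hashlib


def reversed_data_id_key(data_id: int, object_name: str) -> str:
    """Hypothetical layout: decimal data id written backwards, then the object name.

    Requires the data id, so the catalog entry must already exist.
    """
    return f"{str(data_id)[::-1]}/{object_name}"


def path_hash_key(logical_path: str, object_name: str) -> str:
    """Hypothetical alternative: SHA-256 of the logical path as the prefix.

    Depends only on the logical path, so it is predictable on every server
    without a catalog lookup.
    """
    digest = hashlib.sha256(logical_path.encode("utf-8")).hexdigest()
    return f"{digest}/{object_name}"


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(reversed_data_id_key(10234, "f.dat"))
    print(path_hash_key("/tempZone/home/rods/f.dat", "f.dat"))
```

Whether a path-derived key is actually workable inside the plugin is the open question in the comment above; the sketch only shows that such a key would satisfy the "predictable" requirement.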
There are several pieces of information included in the API requests. Perhaps there's a combination of things that are always available that, when used together in some capacity, lead to a unique value. |
Hello. Are there any more planned activities targeting this issue? |
Not at the moment. It is a difficult problem to solve because the plugin isn't given enough information so it might require a server code change. I'll circle back around to this in the next few days to see if there is something else I can come up with. |
When a client is connected to one server and an S3 resource in decoupled mode is attached to another server, an iput does not honor decoupled mode. Instead, it uses the iRODS path as the key.
The object can still be read from any of the servers since the key that is used in S3 is stored in the catalog.
I tested the following scenarios: