Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warning about Optimizer information not properly decoded during JobWrapper execution #7481

Open
michmx opened this issue Feb 22, 2024 · 2 comments

Comments

@michmx
Copy link

michmx commented Feb 22, 2024

We are using DIRAC 8.0.37 + our Belle extension. At the pilot logs, we are observing the following error message:

[263926]JobWrapper VERBOSE: Job input data requirement is some_path/file.root
[263926]JobWrapper INFO: Site has the following local SEs: SIGNET-DATA-SE, SIGNET-TMP-SE
[263926]JobWrapper VERBOSE: Job input data resolution policy module is DIRAC.WorkloadManagementSystem.Client.InputDataResolution
[263926]JobWrapper WARN: This should never happen
[263926]JobWrapper WARN: Optimizer information could not be converted to a dictionary will call catalog directly
...

The warning comes from an exception at DIRAC/Core/Utilities/DEncode.py#L525 :

def decode(data):
    """Generic decoding function"""
    if not data:
        return data
    if not isinstance(data, bytes):
        raise NotImplementedError("This should never happen")
    return g_dDecodeFunctions[data[0]](data, 0)

Then This should never happen... happens

The issue seems to be that decode() gets a str from self.optArgs["InputData"], instead of bytes during the JobWrapper execution:

try:
    optDict, _length = DEncode.decode(self.optArgs["InputData"])
    optReplicas = optDict["Value"]
    self.log.info("Found optimizer catalog result")
except Exception as x:
    self.log.warn(str(x))
    self.log.warn("Optimizer information could not be converted to a dictionary will call catalog directly")

We have confirmed that the information is properly stored inside InputData in the table OptimizerParameters at the JobDB, and seems to be encoded:

mysql> select jobID, Name, Value from OptimizerParameters where jobID = 263926;
| 263926 | InputData      | ds5:Valueds2:OKb1s5:Valueds6:Faileddes10:Successfulds142:/belle/MC/release-06-00-08/DB00002100/MC15ri_b/prod00024790/s00/e1003/4S/r00000/ccbar/mdst/sub00/mdst_000999_prod00024790_task10020001003.rootds8:Checksums8:73c26afas12:ChecksumTypes2:ADs12:CreationDatezati2022ei3ei5ei23ei9ei56ei0enes4:GUIDs36:bd4fe732-075f-563a-e2ef-d6002d0cbfa1s4:Modei436es16:ModificationDatezati2023ei3ei8ei15ei32ei35ei0enes13:NumberOfLinksi1es4:Sizei2088918091es6:Statuss1:-s16:KEK-DISK-DATA-SEs207:davs://belle2-webdav-analysis-data.cc.kek.jp:8443/disk/belle/DATA/belle/MC/release-06-00-08/DB00002100/MC15ri_b/prod00024790/s00/e1003/4S/r00000/ccbar/mdst/sub00/mdst_000999_prod00024790_task10020001003.roots14:SIGNET-DATA-SEs190:davs://dcache.ijs.si:2880/pnfs/ijs.si/belle/DATA/belle/MC/release-06-00-08/DB00002100/MC15ri_b/prod00024790/s00/e1003/4S/r00000/ccbar/mdst/sub00/mdst_000999_prod00024790_task10020001003.rooteeees14:SiteCandidatesds13:LCG.SIGNET.sids4:diski1es4:tapei0ees10:LCG.KEK.jpds4:diski1es4:tapei0ees13:ARC.SIGNET.sids4:diski1es4:tapei0ees17:LCG.KIT-TARDIS.deds4:diski1es4:tapei0ees15:EuroHPC.Vega.sids4:diski1es4:tapei0eeee |

And we can reproduce the return of a string using the JobDB:

>>> from DIRAC.WorkloadManagementSystem.DB.JobDB import JobDB
>>> jobDB = JobDB()

>>> a = jobDB.getJobOptParameters(263926)
>>> type(a['Value']['InputData'])
<class 'str'>

But according to what the decode() function expects at DEncode.py, this should be bytes.

Have you experienced a similar issue recently? We found that you had the same warnings at #2421, which resulted in introducing the decode() at the JobWrapper.


Additional information on MySQLdb version:

>>> MySQLdb.version_info
(2, 0, 3, 'final', 0)
@fstagni
Copy link
Contributor

fstagni commented Feb 23, 2024

Thank you for the info. This actually happens also in LHCb production instance (also v8.0.x):

2024-02-23 08:48:08 UTC None/[849602392]JobWrapper INFO: Site has the following local SEs: Glasgow-DST, Glasgow_MC-DST
2024-02-23 08:48:08 UTC None/[849602392]JobWrapper WARN: This should never happen
2024-02-23 08:48:08 UTC None/[849602392]JobWrapper WARN: Optimizer information could not be converted to a dictionary will call catalog directly

The good news is that it is not dangerous.

I suspect an encode is needed at some point, we had in more recent past a bunch of cases like this, some fixed e.g. in #5854

Maybe simply this line in JobWrapper:

optDict, _length = DEncode.decode(self.optArgs["InputData"])

should become

optDict, _length = DEncode.decode(self.optArgs["InputData"].encode())

@michmx
Copy link
Author

michmx commented Mar 4, 2024

@fstagni Thanks a lot for the check and the confirmation that also happens on LHCb.

We have not been able to test the addition of .encode() yet as we got distracted with another problem, but after testing we will report the outcome.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants