Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-59705: Add _thread.set_name() function #127338

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

vstinner
Copy link
Member

@vstinner vstinner commented Nov 27, 2024

On Linux, threading.Thread now sets the thread name to the operating system.

configure now checks if pthread_setname_np() function is available.

On Linux, threading.Thread now sets the thread name to the operating
system.

configure now checks if pthread_setname_np() function is available.
@vstinner
Copy link
Member Author

This implementation is very basic on purpose. I plan to add support for more platform in follow-up PRs.

  • On Linux, set_name() does nothing if the name is longer than 15 bytes. Should the function truncate silently to 15 bytes instead? I don't think that raising an exception is very convenient here.
  • Setting Thread.name after Thread.start() doesn't call again set_name(). set_name() is called only once per thread, at startup.
  • I didn't add automated tests since I don't want to add a get_name() function (use Thread.name to get a thread name).

Demo 1 (main thread):

$ ./python
>>> import os
>>> pid=os.getpid()
>>> with open(f"/proc/{pid}/task/{pid}/comm") as fp: print(f"comm = {fp.read()!r}")
... 
comm = 'python\n'

>>> import _thread; _thread.set_name("demo")
>>> with open(f"/proc/{pid}/task/{pid}/comm") as fp: print(f"comm = {fp.read()!r}")
... 
comm = 'demo\n'

Demo 2 (thread):

$ ./python
>>> import threading, os, time
>>> os.getpid()
81921
>>> t=threading.Thread(target=time.sleep, args=(60,), name="sleeper")
>>> t.start()
^Z

$ cat /proc/81921/task/81927/comm 
sleeper

See also a previous attempt to implement the feature: #14578

@vstinner
Copy link
Member Author

I didn't add automated tests since I don't want to add a get_name() function (use Thread.name to get a thread name).

I changed my mind and added a private _thread._get_name() function for tests.

@vstinner
Copy link
Member Author

@pitrou @encukou @serhiy-storchaka: Would you mind to review this change? It's to set the thread name in threading.Thread to the operating system.

@vstinner
Copy link
Member Author

On Linux, set_name() does nothing if the name is longer than 15 bytes. Should the function truncate silently to 15 bytes instead? I don't think that raising an exception is very convenient here.

I modified _thread.set_name(name) to truncate name to 15 bytes on Linux.

Truncating the string in threading.Thread would be more complicated since it requires to encode the string the filesystem encoding, detect the operating system (Linux), and hardcode the 15 bytes limit there. IMO it's more convenient to truncate in _thread.set_name().

Copy link
Member

@encukou encukou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thank you!

The truncation is not pretty with non-ASCII names. I guess codepoint-preserving truncation is not worth the effort, and Linux tools need to deal with thread names being arbitrary bytes.

But, we can test the edge cases, to ensure this quality-of-life enhancement doesn't start raising exceptions in working code.

Lib/test/test_threading.py Outdated Show resolved Hide resolved
Modules/_threadmodule.c Outdated Show resolved Hide resolved
Lib/threading.py Outdated Show resolved Hide resolved
@vstinner
Copy link
Member Author

@encukou: I addressed your reviews. Please review the updated PR.

I added tests on long names and non-ASCII names.

Lib/test/test_threading.py Outdated Show resolved Hide resolved
@vstinner
Copy link
Member Author

@encukou: Maybe the "replace" error handler can be used, instead of not setting the name if the name cannot be encoded to the filesystem encoding. What do you think?

@serhiy-storchaka
Copy link
Member

You can use FS_NONASCII. You can also use TESTFN_UNDECODABLE to test that it works with arbitrary bytes and TESTFN_UNENCODABLE to test for encoding error.

Is it a hard limit for the size? Is it the same on other platforms? I would prefer to use a named constant instead of magic numbers 15, 16, 17.

@vstinner
Copy link
Member Author

@serhiy-storchaka:

Is it a hard limit for the size?

Yes. Using a longer name fails with ERANGE.

Is it the same on other platforms?

It's 16 bytes on Linux and 64 bytes on macOS, so no, it's not the same.

I would prefer to use a named constant instead of magic numbers 15, 16, 17.

I failed to find a public constant for these limits. For example, Darwin MAXTHREADNAMESIZE constant is private (I'm not 100% sure, but I don't have macOS so I cannot check manually, I only read the code).

@vstinner
Copy link
Member Author

You can use FS_NONASCII. You can also use TESTFN_UNDECODABLE to test that it works with arbitrary bytes and TESTFN_UNENCODABLE to test for encoding error.

Ok, I added tests using FS_NONASCII and TESTFN_UNENCODABLE.

@vstinner
Copy link
Member Author

I modified my PR to use the "replace" error handler to not fail if the name cannot be encoded to the filesystem encoding. IMO it's better to replace a few characters in the name rather than not copying any character (all or nothing).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants