gh-127541: Update os.walk example #127765

Open: wants to merge 11 commits into base: main
16 changes: 8 additions & 8 deletions Doc/library/os.rst
@@ -3665,16 +3665,16 @@ features:

This example displays the number of bytes taken by non-directory files in each
directory under the starting directory, except that it doesn't look under any
CVS subdirectory::
``__pycache__`` subdirectory::

import os
from os.path import join, getsize
for root, dirs, files in os.walk('python/Lib/email'):
for root, dirs, files in os.walk('python/Lib/xml'):
print(root, "consumes", end=" ")
print(sum(getsize(join(root, name)) for name in files), end=" ")
print("bytes in", len(files), "non-directory files")
if 'CVS' in dirs:
dirs.remove('CVS') # don't visit CVS directories
if '__pycache__' in dirs:
Member

You might want to update which directory is being walked over (currently it's python/Lib/email)

Contributor Author

Thank you for the suggestion! I’ll update the example to use os.getcwd() for better compatibility.

Member @picnixz, Dec 9, 2024

Depending on where the command is actually run, the walk might take a very long time (say the user runs it from $HOME). We can walk over 'python/Lib' instead.

Contributor Author

Thank you for the feedback! I’ve updated the example to use python/Lib as the base directory.

Member

What is the reason for this change?

Contributor Author

Since __pycache__ directories are more common throughout python/Lib than within python/Lib/email, I updated the directory to make the example more representative and practical.

Member @picnixz, Dec 10, 2024

In addition, I only have a single __pycache__ in my email package (even though email.mime exists), so I think you wouldn't see the "recursion" in this example if you compared it with: find Lib/email -not -ipath '*__pycache__*'

Member

Starting from Lib you will walk into .git, which may be undesirable. Lib/xml contains several subdirectories with their own __pycache__ directories.

Member

Ah maybe, Lib/xml is better then.

dirs.remove('__pycache__') # don't visit __pycache__ directories

In the next example (simple implementation of :func:`shutil.rmtree`),
walking the tree bottom-up is essential, :func:`rmdir` doesn't allow
@@ -3727,16 +3727,16 @@ features:

This example displays the number of bytes taken by non-directory files in each
directory under the starting directory, except that it doesn't look under any
CVS subdirectory::
``__pycache__`` subdirectory::

import os
for root, dirs, files, rootfd in os.fwalk('python/Lib/email'):
for root, dirs, files, rootfd in os.fwalk('python/Lib/xml'):
print(root, "consumes", end="")
print(sum([os.stat(name, dir_fd=rootfd).st_size for name in files]),
end="")
print("bytes in", len(files), "non-directory files")
if 'CVS' in dirs:
dirs.remove('CVS') # don't visit CVS directories
if '__pycache__' in dirs:
dirs.remove('__pycache__') # don't visit __pycache__ directories

In the next example, walking the tree bottom-up is essential:
:func:`rmdir` doesn't allow deleting a directory before the directory is
12 changes: 6 additions & 6 deletions Lib/os.py
@@ -345,12 +345,12 @@ def walk(top, topdown=True, onerror=None, followlinks=False):

import os
from os.path import join, getsize
for root, dirs, files in os.walk('python/Lib/email'):
for root, dirs, files in os.walk('python/Lib/xml'):
print(root, "consumes ")
print(sum(getsize(join(root, name)) for name in files), end=" ")
print("bytes in", len(files), "non-directory files")
if 'CVS' in dirs:
dirs.remove('CVS') # don't visit CVS directories
if '__pycache__' in dirs:
dirs.remove('__pycache__') # don't visit __pycache__ directories

"""
sys.audit("os.walk", top, topdown, onerror, followlinks)
@@ -460,13 +460,13 @@ def fwalk(top=".", topdown=True, onerror=None, *, follow_symlinks=False, dir_fd=
Example:

import os
for root, dirs, files, rootfd in os.fwalk('python/Lib/email'):
for root, dirs, files, rootfd in os.fwalk('python/Lib/xml'):
print(root, "consumes", end="")
print(sum(os.stat(name, dir_fd=rootfd).st_size for name in files),
end="")
print("bytes in", len(files), "non-directory files")
if 'CVS' in dirs:
dirs.remove('CVS') # don't visit CVS directories
if '__pycache__' in dirs:
dirs.remove('__pycache__') # don't visit __pycache__ directories
"""
sys.audit("os.fwalk", top, topdown, onerror, follow_symlinks, dir_fd)
top = fspath(top)
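
For anyone who wants to try the pruning behaviour discussed in this PR without a CPython checkout, here is a minimal, self-contained sketch. It is not part of the diff: the helper names directory_sizes and build_demo_tree and the throwaway pkg/ layout are made up for illustration, and the tree is built in a temporary directory rather than under python/Lib/xml. It relies on the documented behaviour that, when topdown is true, removing a name from dirs in place stops os.walk from descending into it.

```python
import os
import tempfile
from os.path import getsize, join


def directory_sizes(top, skip="__pycache__"):
    """Map each visited directory to (total_bytes, file_count), pruning `skip`.

    Hypothetical helper for illustration; it is not part of os.py.
    """
    sizes = {}
    for root, dirs, files in os.walk(top):  # topdown=True is the default
        # Prune in place: os.walk reuses this list to decide where to recurse.
        if skip in dirs:
            dirs.remove(skip)
        total = sum(getsize(join(root, name)) for name in files)
        sizes[root] = (total, len(files))
    return sizes


def build_demo_tree(base):
    """Create a small throwaway tree (assumed layout, not the real Lib/xml)."""
    contents = {
        join("pkg", "a.py"): b"12345",
        join("pkg", "__pycache__", "a.pyc"): b"x" * 100,
        join("pkg", "sub", "b.py"): b"123",
        join("pkg", "sub", "__pycache__", "b.pyc"): b"x" * 50,
    }
    for rel, data in contents.items():
        path = join(base, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as fh:
            fh.write(data)


with tempfile.TemporaryDirectory() as tmp:
    build_demo_tree(tmp)
    result = directory_sizes(join(tmp, "pkg"))
    for root, (nbytes, count) in result.items():
        print(os.path.relpath(root, tmp), "consumes", nbytes,
              "bytes in", count, "non-directory files")
```

Only pkg and pkg/sub should be reported; the .pyc files under both __pycache__ directories are never visited, which is the "recursion" effect the review thread wanted the example to show.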