Directory traversal with os.walk

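Every month there are a few days when I feel like slacking off, and today is one of them, so I'm sharing the methods I just used while working on directory traversal.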

os.walk

The signature of os.walk is:

os.walk(top, topdown=True, onerror=None, followlinks=False)

where:

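  • top is the directory to traverse.
  • topdown selects whether the tree is traversed top-down or bottom-up.
  • onerror can be set to a function that handles errors raised during the traversal (it receives an OSError instance as its argument); if it is left as None, errors are ignored.
  • followlinks controls whether links to directories are followed. Note that os.walk does not keep track of the directories it has already visited, so following links can recurse forever if a link points back to one of its own parent directories.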
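To make the onerror parameter more concrete, here is a minimal sketch that collects the errors instead of silently skipping them. The helper name walk_with_errors and the path '/no/such/directory' are made up for illustration; only os.walk and its onerror callback come from the standard library.

```python
import os


def walk_with_errors(top):
    """Walk `top` and return the OSError instances reported by os.walk."""
    errors = []  # hypothetical collector, not part of os.walk itself

    # os.walk calls onerror with the OSError instance whenever it fails
    # to list a directory; here every error is simply appended to the list.
    for root, dirs, files in os.walk(top, onerror=errors.append):
        pass  # real processing of root/dirs/files would go here

    return errors


if __name__ == '__main__':
    # '/no/such/directory' is an assumed non-existent path.
    for err in walk_with_errors('/no/such/directory'):
        print('could not list %s: %s' % (err.filename, err.strerror))
```

With the default onerror=None the same walk would just produce nothing, which is easy to mistake for an empty directory.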

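os.walk is a generator: for each directory it visits it yields a 3-tuple (root, dirs, files), holding the path of the directory being visited, the list of subdirectory names in it, and the list of file names in it. Note that the two lists contain bare names, not paths; if you need a path that starts from root, use os.path.join(root, dir) for a directory and os.path.join(root, file) for a file.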
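As a rough illustration of the point about names versus paths, the sketch below joins every entry onto root. The function name list_full_paths and the throwaway tree built with tempfile are assumptions made for the example.

```python
import os
import tempfile


def list_full_paths(top):
    """Return (directory_paths, file_paths) for everything under `top`."""
    dir_paths, file_paths = [], []
    for root, dirs, files in os.walk(top):
        # dirs and files hold bare names, so join each one onto root
        dir_paths.extend(os.path.join(root, d) for d in dirs)
        file_paths.extend(os.path.join(root, f) for f in files)
    return dir_paths, file_paths


if __name__ == '__main__':
    # Build a tiny temporary tree so the sketch can run anywhere.
    top = tempfile.mkdtemp()
    os.mkdir(os.path.join(top, 'dir1'))
    open(os.path.join(top, 'dir1', 'e.py'), 'w').close()

    dir_paths, file_paths = list_full_paths(top)
    print(dir_paths)
    print(file_paths)
```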

Example

Suppose the following file and directory structure exists:

➜ test_os_walk git:(master) ✗ tree .
├── a.py
├── b.py
├── c.py
├── dir1
│   ├── dir4
│   │   ├── g.py
│   │   └── h.py
│   ├── dirx
│   │   ├── diry
│   │   │   └── k.py
│   │   └── z.py
│   ├── e.py
│   ├── f.py
│   └── g.py
├── dir2
│   ├── dira
│   │   └── dirb
│   │       └── dirc
│   │           └── aha.py
│   ├── k.py
│   ├── l.py
│   └── m.py
└── dir3
    ├── dir5
    │   └── z.py
    ├── x.py
    └── y.py

10 directories, 17 files

Testing topdown

When I traverse this directory with os.walk, the program and its output look like this:

import os

path = '/Users/nisen/Projects/python_advanced_class/test/test_os_walk'

# topdown=True (the default): each directory is reported before its subdirectories
for root, dirs, files in os.walk(path, topdown=True):
    print('root: %s' % root)
    print('dirs: %s' % dirs)
    print('files: %s' % files)
    print('')

The result is below; the root paths show that the traversal runs top-down:

➜  test git:(master) ✗ python test11.py
root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk
dirs: ['dir1', 'dir2', 'dir3']
files: ['a.py', 'b.py', 'c.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1
dirs: ['dir4', 'dirx']
files: ['e.py', 'f.py', 'g.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dir4
dirs: []
files: ['g.py', 'h.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx
dirs: ['diry']
files: ['z.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx/diry
dirs: []
files: ['k.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2
dirs: ['dira']
files: ['k.py', 'l.py', 'm.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira
dirs: ['dirb']
files: []

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb
dirs: ['dirc']
files: []

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb/dirc
dirs: []
files: ['aha.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3
dirs: ['dir5']
files: ['x.py', 'y.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3/dir5
dirs: []
files: ['z.py']

When topdown is set to False, the result is as follows. This time the traversal runs bottom-up: every subdirectory is reported before its parent.

➜  test git:(master) ✗ python test11.py
root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dir4
dirs: []
files: ['g.py', 'h.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx/diry
dirs: []
files: ['k.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx
dirs: ['diry']
files: ['z.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1
dirs: ['dir4', 'dirx']
files: ['e.py', 'f.py', 'g.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb/dirc
dirs: []
files: ['aha.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb
dirs: ['dirc']
files: []

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira
dirs: ['dirb']
files: []

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2
dirs: ['dira']
files: ['k.py', 'l.py', 'm.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3/dir5
dirs: []
files: ['z.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3
dirs: ['dir5']
files: ['x.py', 'y.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk
dirs: ['dir1', 'dir2', 'dir3']
files: ['a.py', 'b.py', 'c.py']
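One case where the bottom-up order helps is when a parent's result depends on its children. The sketch below is only an illustration of that idea, not something from the original test script: the name cumulative_file_counts is made up, and it counts the files in each whole subtree by reusing the totals already computed for the subdirectories.

```python
import os
import tempfile


def cumulative_file_counts(top):
    """Map each directory under `top` to the number of files in its whole subtree."""
    counts = {}
    # With topdown=False every subdirectory is yielded before its parent,
    # so the children's totals are already in `counts` when the parent arrives.
    for root, dirs, files in os.walk(top, topdown=False):
        counts[root] = len(files) + sum(
            # .get(..., 0) covers entries that are listed but never walked,
            # such as links to directories when followlinks is False
            counts.get(os.path.join(root, d), 0) for d in dirs
        )
    return counts


if __name__ == '__main__':
    # Throwaway tree: top/a.py and top/d1/b.py
    top = tempfile.mkdtemp()
    os.mkdir(os.path.join(top, 'd1'))
    open(os.path.join(top, 'a.py'), 'w').close()
    open(os.path.join(top, 'd1', 'b.py'), 'w').close()
    print(cumulative_file_counts(top)[top])
```

With a top-down walk the same loop would reach the parent first and find no totals for its children yet.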

Modifying the traversed directories at run time

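When topdown is True, you can modify the returned dirs list in place while processing it, and the walk will then descend only into the directories that remain in dirs. The example below, for instance, leaves every "CVS" directory out of the traversal: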

import os
from os.path import join, getsize

for root, dirs, files in os.walk('python/Lib/email'):
    # total size of the files directly inside root
    total = sum(getsize(join(root, name)) for name in files)
    print('%s consumes %d bytes in %d non-directory files' % (root, total, len(files)))
    if 'CVS' in dirs:
        dirs.remove('CVS')  # don't visit CVS directories
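The pruning only works if the very list object handed out by os.walk is changed. A minimal sketch of a more general filter follows; the helper name walk_skipping and the directory names in the demo are assumptions for illustration.

```python
import os
import tempfile


def walk_skipping(top, skip_names):
    """Return the directories visited under `top`, pruning names in `skip_names`."""
    visited = []
    for root, dirs, files in os.walk(top, topdown=True):
        visited.append(root)
        # Slice assignment edits the list that os.walk keeps using.
        # A plain `dirs = [...]` would only rebind the local name and prune nothing.
        dirs[:] = [d for d in dirs if d not in skip_names]
    return visited


if __name__ == '__main__':
    # Throwaway tree: top/CVS/inner and top/src
    top = tempfile.mkdtemp()
    os.makedirs(os.path.join(top, 'CVS', 'inner'))
    os.mkdir(os.path.join(top, 'src'))
    for path in walk_skipping(top, {'CVS'}):
        print(path)
```

Both dirs.remove(...) and the slice assignment modify the list in place; the slice form is handy when several names have to go at once. With topdown=False the list is produced after the subdirectories have already been walked, so editing it has no effect on the traversal.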