Files live in directories and subdirectories, so iterating over the files in a particular directory is a very common task in Python. In this tutorial, I show you how to use Python to loop through files in a directory, both recursively and non-recursively.
Python provides several built-in functions and modules for file iteration, after which you can perform whatever operations you need on each file. Below is the list of methods and modules you can use in Python to loop through files and folders in a directory:
- os.scandir() method
- os.listdir() method
- os.walk() method
- glob.iglob() method
- pathlib module
Let’s understand these methods one by one with examples.
Python loop through files in a directory using os.scandir() method
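If you are using Python 3.5 or later, scandir() is usually the fastest way to iterate over a directory, especially when you also need to know whether each entry is a file or a folder. It returns an iterator of “DirEntry” objects, and each one holds the file name (name) and the full path (path) as strings. It can be called in two ways: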
With Parameter – list files from given folder/directory.
No Parameter – list files from current folder/directory.
The output of the scandir() method looks like this:
<DirEntry 'text.txt'>
<DirEntry 'sample.xlsx'>
scandir() example:
import os

x = os.scandir()
for i in x:
    print(i)
If you don’t pass a directory path, scandir() reads the current working directory by default. The loop prints every file and subdirectory it finds to the console. If you want to fetch only files and ignore directories, add a file type check to your script like below:
import os

directory = r'C:\testfolder'
for strfile in os.scandir(directory):
    # keep only regular files that have one of the wanted extensions
    if strfile.is_file() and strfile.name.endswith((".xlsx", ".docx")):
        print(strfile.path)
I only needed the .xlsx and .docx files from the directory, so I added the extension check alongside is_file().
Note: The scandir() method is not recursive. If you need to iterate over nested folders, use the walk() method, which I show below.
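The scandir() steps above can be wrapped in a small helper. This is only a minimal sketch: the function name list_files and its extensions parameter are my own illustrative choices, and the with statement assumes Python 3.6 or later, where scandir() can be used as a context manager.

```python
import os


def list_files(directory=".", extensions=None):
    """Return sorted paths of the regular files directly inside `directory`.

    `extensions` is an optional tuple such as (".xlsx", ".docx");
    when it is omitted, every regular file is returned.
    """
    matches = []
    # The context manager closes the directory iterator as soon as we are done.
    with os.scandir(directory) as entries:
        for entry in entries:
            # entry.name is the bare file name, entry.path includes the directory
            if not entry.is_file():
                continue  # skip subdirectories and other non-file entries
            # lower() makes the extension check case-insensitive
            if extensions is None or entry.name.lower().endswith(extensions):
                matches.append(entry.path)
    return sorted(matches)
```

Calling list_files(r'C:\testfolder', (".xlsx", ".docx")) should return the same paths that the loop above prints, as a sorted list.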
Iterate file over directory using os.listdir() method –
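The listdir() method works in both Python 2 and Python 3, so you can use it even if you are still on Python 2, which is an old but popular version of Python. It returns a list with the names of the entries in a directory. Calling it without an argument, as below, requires Python 3.2 or later; on Python 2, pass the path explicitly, for example os.listdir('.'):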
import os

myfiles = os.listdir()
print(myfiles)
It returns the names of all files and folders in the current directory because I didn’t pass a path to the listdir() method. Now let’s pass a folder path and iterate over the files in that folder.
import os

directory = r'C:\testfolder'
myfiles = [x for x in os.listdir(directory) if x.endswith(".jpg")]
for name in myfiles:
    # listdir() returns bare names, so join each one with the directory
    print(os.path.join(directory, name))
It prints the full path of every .jpg file in the “testfolder” directory.
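Because listdir() returns names only, a folder that happens to be called something like “photos.jpg” would also pass the extension check. Here is a minimal sketch of one way to guard against that, assuming a helper name (list_by_extension) that I made up for illustration:

```python
import os


def list_by_extension(directory, extension):
    """Return full paths of regular files in `directory` ending with `extension`."""
    paths = []
    for name in sorted(os.listdir(directory)):
        # listdir() yields bare names, so build the full path before testing it
        full_path = os.path.join(directory, name)
        if name.endswith(extension) and os.path.isfile(full_path):
            paths.append(full_path)
    return paths
```

The extra os.path.isfile() call costs one file system lookup per name, which is the kind of overhead scandir() is designed to avoid.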
Iterate file from given directory using os.walk() method –
The os.scandir() and os.listdir() methods share one limitation: they only iterate over the files and folders in the immediate directory, which means they are not recursive. If you need to iterate through nested directories or folders, use the os.walk() method:
import os

root_dir = r'C:\testfolder'
for subdir, dirs, files in os.walk(root_dir):
    for filename in files:
        filepath = os.path.join(subdir, filename)
        # check whether the file extension is .png
        if filepath.endswith(".png"):
            print(filepath)
It prints the “.png” files from all the folders under the given path.
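Sometimes you want to walk a tree but leave certain folders out. With the default top-down traversal, os.walk() lets you do that by editing the dirs list in place. The sketch below shows the idea; the function name find_files and the skip_dirs parameter are hypothetical names I chose for the example.

```python
import os


def find_files(root, extension, skip_dirs=()):
    """Yield paths under `root` whose file names end with `extension`.

    Folders whose names appear in `skip_dirs` are not visited.
    """
    for subdir, dirs, files in os.walk(root):
        # Editing `dirs` in place tells os.walk() not to descend into those folders
        dirs[:] = [d for d in dirs if d not in skip_dirs]
        for filename in files:
            if filename.endswith(extension):
                yield os.path.join(subdir, filename)
```

For example, list(find_files(r'C:\testfolder', ".png", skip_dirs=("cache",))) would collect the .png files while ignoring any folder named “cache”.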
Iterate file over directory using glob.iglob() method
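The glob.iglob() and glob.glob() methods retrieve the paths that match a wildcard pattern, and they can optionally search nested folders/directories recursively. glob.glob() returns a list, while glob.iglob() returns an iterator that yields the same paths one at a time.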
glob.glob(pathname, *, recursive=False)
glob.iglob(pathname, *, recursive=False)
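By default recursive is False, which means “**” has no special meaning and a pattern such as *.txt only fetches files from the immediate directory, not from nested folders. If you set recursive to True, the “**” pattern matches any files and zero or more directories and subdirectories, so a pattern that contains “**” lists files from nested folders recursively.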
Let’s say we have a directory and need to retrieve all the text files directly inside it. Use the script below:
import glob

# fetch all .txt files from the given path
for filepath in glob.iglob(r'C:\testfolder\files\*.txt'):
    print(filepath)
The above code only retrieves the .txt files in the given folder itself. If you want to fetch them recursively from nested folders as well, use the code below:
Python loop through files in directory recursively
import glob

# Recursively fetch all .txt files from the given path;
# the "**" segment is what lets recursive=True descend into subfolders
for filepath in glob.iglob(r'C:\testfolder\files\**\*.txt', recursive=True):
    print(filepath)
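To see the two glob modes side by side, here is a minimal sketch that builds the pattern for you. The helper name find_txt is my own, and glob.escape() is used only so that a folder name containing characters like “[” is not treated as part of the pattern.

```python
import glob
import os


def find_txt(folder, recursive=False):
    """Return sorted .txt paths in `folder`, optionally including subfolders."""
    base = glob.escape(folder)  # treat the folder name literally
    if recursive:
        # "**" matches the folder itself and every nested folder
        pattern = os.path.join(base, "**", "*.txt")
    else:
        pattern = os.path.join(base, "*.txt")
    return sorted(glob.glob(pattern, recursive=recursive))
```

Note that recursive=True on its own changes nothing unless the pattern contains “**”, which is why the helper switches both together.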
Iterate file over directory using pathlib module
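The pathlib module (Python 3.4 or later) offers the same kind of pattern matching as the iglob() method through the glob() method of its Path class, and it provides various classes to handle file system paths. Below is an example that iterates over the files in a particular directory using pathlib.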
from pathlib import Path

# raw string, so \t and \f are not read as escape sequences
directory = r'C:\testfolder\files'
paths = Path(directory).glob('**/*.txt')
for path in paths:
    # convert the Path object into a string
    pathstr = str(path)
    # print the .txt file path
    print(pathstr)
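pathlib can also cover the non-recursive case with Path.iterdir(), and Path.rglob() is a shorthand for a glob() pattern that starts with “**/”. The sketch below combines both; iter_files and its parameters are hypothetical names used only for this example.

```python
from pathlib import Path


def iter_files(directory, suffix=".txt", recursive=False):
    """Yield Path objects for files in `directory` that have the given suffix."""
    root = Path(directory)
    # rglob("*") walks every nested folder, iterdir() stays in the top folder
    candidates = root.rglob("*") if recursive else root.iterdir()
    for path in candidates:
        # path.suffix is the final extension, including the leading dot
        if path.is_file() and path.suffix == suffix:
            yield path
```

Each result is a Path object, so you can read path.name or path.parent directly, or call str(path) when a plain string is needed.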
I hope you now have a basic understanding of how to iterate over files. All of the approaches above can be used in Python to loop through the files in a directory, and os.walk(), glob with recursive=True, and pathlib can also do so recursively.