#How to Bulk Rename Files to Numeric File Names in Linux – CloudSavvy IT


Want to rename a whole set of files to a numeric sequence (1.pdf, 2.pdf, 3.pdf, …) in Linux? This can be done with some light scripting and this article will show you how to do exactly that.
Numeric File Names
Usually, when we scan a document to PDF using some hardware (a mobile phone or a dedicated PDF scanner), the file name will read something like 2020_11_28_13_43_00.pdf. Many other semi-automated systems produce similar date- and time-based file names.
Sometimes the file name may also contain the name of the application being used, or some other information such as the applicable DPI (dots per inch) or the scanned paper size.
When collecting PDF files together from different sources, file naming conventions may differ significantly and it may be good to standardize on a numeric (or part numeric) file name.
This also applies to other domains and sets of files: for example, your recipes or photo collection, data samples generated by automated monitoring systems, log files ready for archiving, a set of SQL files for the database engineer, and generally any data collected from different sources with different naming schemes.
Bulk Rename Files to Numeric File Names
In Linux, it is easy to quickly rename a whole set of files with completely different file names to a numerical sequence. “Easy” here means “easy to execute”: the problem of bulk renaming files to numeric names is complex to code in itself, and the one-liner script below took 3-4 hours to research, create and test. The many other commands I tried all had limitations which I wanted to avoid.
Please note that no warranties are given or provided, and this code is provided ‘as is’. Please do your own research before running it. That said, I did test it successfully against files with various special characters, and also against more than 50k files without any file being lost. I also checked a file named 'a'$'\n''a.pdf', which contains a newline.
if [ ! -r _e -a ! -r _c ]; then echo 'pdf' > _e; echo 1 > _c ;find . -name "*.$(cat _e)" -print0 | xargs -0 -I{} bash -c 'mv -n "{}" $(cat _c).$(cat _e);echo $[ $(cat _c) + 1 ] > _c'; rm -f _e _c; fi
Let’s first look at how this works, and then analyze the command. We have created a directory with eight files, all named quite differently, except that they share the same .pdf extension. We then run the command above.
The outcome is that the eight files have been renamed to 1.pdf, 2.pdf, 3.pdf, etc., even though their names were quite different before.
The command assumes you do not yet have any files named 1.pdf to x.pdf. If you do, you can move those files into a separate directory, change the echo 1 to a higher number so that the renaming of the remaining files starts at a given offset, and then merge the two directories together again.
Please always take care not to overwrite any files, and it is always a good idea to take a quick backup before updating anything.
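The offset workflow described above can be sketched as follows. This is only an illustration run in a throwaway directory created with mktemp. The directory names numbered and incoming and the sample file names are assumptions made for the example, and the start value of 3 simply follows from two files already being numbered.

```shell
# Work in a scratch directory so no real files are touched.
work=$(mktemp -d)
cd "$work"
mkdir numbered incoming                           # hypothetical directory names
touch numbered/1.pdf numbered/2.pdf               # files that already use numeric names
touch incoming/scan_a.pdf 'incoming/scan b.pdf'   # files still to be renamed

# Run the one-liner inside incoming, with "echo 1" changed to "echo 3".
cd incoming
if [ ! -r _e -a ! -r _c ]; then echo 'pdf' > _e; echo 3 > _c ;find . -name "*.$(cat _e)" -print0 | xargs -0 -I{} bash -c 'mv -n "{}" $(cat _c).$(cat _e);echo $[ $(cat _c) + 1 ] > _c'; rm -f _e _c; fi
cd ..

# Merge the two sets again; -n refuses to overwrite an existing file.
mv -n incoming/*.pdf numbered/
ls numbered
```

Which of the two incoming files ends up as 3.pdf and which as 4.pdf depends on the order in which find returns them.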
Let’s look at the command in detail. It can help to add the -t option to xargs, which lets us see what is going on behind the scenes.
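As a small, harmless illustration of -t, the sketch below runs xargs with echo instead of mv in a scratch directory. The sample file names and the output.log and trace.log files are assumptions for the example. With -t, xargs writes each command line to standard error before running it, so the trace is captured separately from the normal output here.

```shell
# Work in a scratch directory so no real files are touched.
work=$(mktemp -d)
cd "$work"
touch 'first scan.pdf' second.pdf

# -t makes xargs print every command it is about to run to standard error.
# echo stands in for mv here, so nothing is actually renamed.
find . -name '*.pdf' -print0 | xargs -0 -t -I{} echo "found: {}" > output.log 2> trace.log

echo "--- commands traced by -t ---"
cat trace.log
echo "--- normal output ---"
cat output.log
```

The exact quoting shown in the trace can differ between xargs versions.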
To start, the command uses two small files (named _e and _c) as temporary storage. At the start of the one-liner, it does a safety check using an if statement to ensure that neither _e nor _c is present. If a file with either name exists, the script will not proceed.
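The safety check can be tried in isolation. The sketch below wraps the same test in a small function, here called guard (a hypothetical name used only for this example), and calls it once before and once after a stray _c file exists.

```shell
# Work in a scratch directory so no real files are touched.
work=$(mktemp -d)
cd "$work"

# guard is a hypothetical helper that repeats the test from the one-liner.
guard() {
  if [ ! -r _e -a ! -r _c ]; then
    echo "safe to proceed"
  else
    echo "refusing: _e or _c already exists"
  fi
}

first=$(guard)    # neither file exists yet
touch _c          # simulate a leftover counter file
second=$(guard)   # the check now fails, so the rename would be skipped

echo "$first"
echo "$second"
```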
On the topic of using small temporary files versus variables, I can say that whereas using variables would have been ideal (saves some disk I/O), there were two issues I was running into.
The first one is that if you export a variable at the start of the one-liner and then use that same variable later, and another script uses the same variable (including this script being run more than once simultaneously on the same machine), then that script, or this one, may be affected. Such interference is best avoided when it comes to renaming many files!
The second one was that xargs in combination with bash -c seems to have a limitation in variable handling inside the bash -c command line. Even extensive research online did not provide a workable solution for this. Thus, I ended up using a small file, _c, which keeps track of progress.
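A likely explanation for this behavior, offered here as an assumption rather than something established by the research described above, is that xargs starts a brand-new bash process for every file, so a variable changed in one invocation is gone by the next. The sketch below illustrates that effect in a scratch directory. The variable name rename_counter, the sample files and the result.log file are made up for the example.

```shell
# Work in a scratch directory so no real files are touched.
work=$(mktemp -d)
cd "$work"
touch a.pdf b.pdf c.pdf

# Each file name is handed to a separate bash process as $1.
# rename_counter starts out unset in every one of them,
# so the increment never carries over to the next file.
find . -name '*.pdf' -print0 |
  xargs -0 -n1 bash -c 'rename_counter=$(( ${rename_counter:-0} + 1 )); echo "$1 counter=$rename_counter"' _ > result.log

cat result.log
```

Every line reports the same counter value, which is why some state outside the bash -c processes, such as the _c file, is needed to keep count.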
_e is the extension we will be searching for and using, and _c is a counter which is automatically increased on each rename. The echo $[ $(cat _c) + 1 ] > _c code takes care of this by reading the file with cat, adding one to the number, and writing the result back to the file.
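The counter mechanism can be tried on its own, without renaming anything. The sketch below uses a scratch directory and three made-up names in place of real files. It uses the $(( … )) arithmetic form instead of $[ … ]; both produce the same result in bash, but $[ … ] is an older syntax that the bash documentation marks as deprecated.

```shell
# Work in a scratch directory so no real files are touched.
work=$(mktemp -d)
cd "$work"

echo 1 > _c                          # the counter starts at 1
for name in first second third; do   # hypothetical stand-ins for file names
  echo "$name gets number $(cat _c)"
  # Read the current value, add one, and write it back to the same file.
  echo $(( $(cat _c) + 1 )) > _c
done

cat _c                               # the next free number after three items
```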
The command also uses the best possible method of handling special file name characters by using null-termination instead of the standard newline termination, i.e. the