NAME
----

**tren** - Advanced File Renaming


HOW TO USE THIS DOCUMENT
------------------------

**tren** is a powerful command line file/directory renaming tool.  It implements a variety of sophisticated renaming features that can be a bit complex to learn.  For this reason, this document is split into two general sections: `REFERENCE`_ and `TUTORIAL AND DESCRIPTION`_.

If you are new to **tren**, start by studying the latter section first.  It will take you from very simple- to highly complex **tren** renaming operations.  Once you've got a sense of what **tren** can do, the reference section will be handy to look up options and their arguments.

.. WARNING:: **tren** is very powerful and can easily and automatically rename things in ways you didn't intend.  It is **strongly** recommended that you try out new **tren** operations with the ``-t`` option on the command line.  This turns on the "test mode" and will show you what the program *would* do without actually doing it.  It goes without saying that you should be even more careful when using this program as the system root or administrator.  It's quite easy to accidentally rename system files and thereby clobber your OS.  You have been warned!!!


REFERENCE
---------


SYNOPSIS
--------

::

    tren.py [-aCcdfhqtvXx] [-A alphabet] [-I file] [-i range] [-P esc] [-R sep] [-r old=new] [-S suffix] [-w width] file|dir ...


SPECIFYING OPTIONS
------------------

You may specify **tren** options in one of three ways:

1) On the command line
2) In an "include" file specified with ``-I filename`` on the command line
3) Via the ``$TREN`` environment variable

Options specified on the command line are evaluated from left to right and supersede any options specified in the environment variable.  Think of any options set in ``$TREN`` as the "leftmost command line options".

All options must precede the list of files and/or directories being renamed.
If one of your rename targets starts with the ``-`` character, most command shells recognize the double dash as an explicit "end of options" delimiter::

    tren.py -opt -opt -opt -- -this_file_starts_with_a_dash

Most shells aren't too fussy about space between an option that takes an argument, and that argument::

    -i 1
    -i1

Use whichever form you prefer.  Just be aware that there are places where spaces matter.  For example, you can quote spaces on your command line to create renaming requests that, say, replace spaces with dashes.

Some options below are "global" - they change the state of the entire program permanently and cannot be undone by subsequent options.  Some options are "toggles" - they can be turned on- and off as you move from left- to right on the command line.  In this way, certain options (like case sensitivity, regular expression handling, and so on) can be set differently for each individual renaming request (``-r``).

(If you're very brave, you can select the ``-d`` option to do a debug dump.  Among many other things, the **tren** debugger dumps the state of each renaming request, and what options are in effect for that request.)


OPTIONS
=======

-A alphabet
    Install a user-defined "alphabet" to be used by sequence renaming tokens.

    (*Default*: Built-in alphabets only)

    The alphabet is specified in the form::

        name:characterset

    Both the name and the characterset are case- and whitespace-sensitive (if your shell permits passing spaces on the command line).

    The "0th" element of the alphabet is the leftmost character.  The counting base is the length of ``characterset``.  So, for instance, the following alphabet is named ``Foo`` and counts in base 5 in the sequence ``a, b, c, d, e, ba, bb, ...``::

        -A Foo:abcde

-a
    Ask interactively before renaming each selected file or directory.

    (*Default*: off)

    If you invoke this option, **tren** will prompt you before renaming each file.  The default (if you just hit ``Enter``) is to *not* rename the file.
    Otherwise, you have the following options::

        n - Don't rename the current file
        y - Rename the current file
        ! - Rename all the remaining files without further prompting
        q - Quit the program

    These options are all insensitive to case.

    If you're doing forced renaming (``-f``), this option will interactively ask you first about making any necessary backups and then renaming the original target.  *If you decline to do the backup renaming, but accept the renaming of the original target, the file or directory that already exists with that name will be lost!*

-C
    Do case-sensitive renaming.

    (*Default*: This is the program default)

    This option is provided so you can toggle the program back to its default behavior after a previous ``-c`` on the command line.

    This option is observed both for literal and regular expression-based renaming (``-x``).

-c
    Collapse case when doing string substitution.

    (*Default*: Search for string to replace is case sensitive)

    When looking for a match on the old string to replace, **tren** will ignore the case of the characters found in the filename.  For example::

        tren.py -c -r Old=NEW Cold.txt fOlD.txt

    This renames the two files to ``CNEW.txt`` and ``fNEW.txt`` respectively.  Notice that the new (replacement) string's case is preserved.

    This option is observed both for literal and regular expression-based renaming (``-x``).

-d
    Dump debugging information.

    (*Default*: Off)

    Dumps all manner of information about **tren** internals - of interest only to program developers and maintainers.

    This option provides internal program state *at the time it is encountered on the command line*.  For maximum debug output, place this as the last (rightmost) option on the command line, right before the list of files and directories to rename.  You can also place multiple ``-d`` options on the command line to see how the internal tables of the program change as various options are parsed.

-f
    Force renaming even if target file or directory name already exists.
    (*Default*: Skip renaming if a file or directory already exists by the same name as the target.)

    By default, **tren** will not rename something to a name that is already in use by another file or directory.  This option forces the renaming to take place.  However, the old file or directory is not lost.  It is merely renamed itself first, by appending a suffix to the original file name.  (*Default*: ``.backup``, but you can change it via the ``-S`` option.)  This way even forced renames don't clobber existing files or directories.

-h
    Print help information.

-I file
    "Include" command line arguments from ``file``.

    It is possible to perform multiple renaming operations in one step using more than one ``-r`` option on the **tren** command line.  However, this can make the command line very long and hard to read.  This is especially true if the renaming strings are complex, contain regular expressions or Renaming Tokens, or if you make heavy use of command line toggles.

    The ``-I`` option allows you to place any command line arguments in a separate *file* in place of- or in addition to the **tren** command line and/or the ``$TREN`` environment variable.  This file is read one line at a time and the contents appended to any existing command line.  You can even name the files you want renamed in the file, but they must appear as the last lines of that file (because they must appear last on the command line).  Whitespace is ignored as is anything from a ``#`` to the end of a line::

        # Example replacement string file
        # Each line appended sequentially
        # to the command line

        -xr t[ext]+=txt   # Appended first
        -X -r =/MYEAR/
        -r foo=bar
        my.file
        your.file         # Appended last

    You may "nest" includes.  That is, you can include file ``x``, that includes file ``y``, that includes file ``z`` and so on.  However, it's easy to introduce a "circular reference" when you do this.  Suppose file ``z`` tried to include file ``x`` in this example?  You'd be specifying an infinite inclusion loop.
    To avoid this, **tren** limits the total number of inclusions to 1000.  If you exceed this, you'll get an error message and the program will terminate.

    Note that wildcard metacharacters like ``*`` and ``?`` that are embedded in filenames included this way are expanded as they would be from the command shell.

-i instances
    Specifies which "instances" of matching strings should be replaced.

    (*Default*: 0 or leftmost)

    A file may have multiple instances of the ``old`` renaming string in it.  The ``-i`` option lets you specify which of these (one, several, all) you'd like to have replaced.

    Suppose you have a file called ``foo1-foo2-foo3.foo4``.  The leftmost ``foo`` is instance 0.  The rightmost ``foo`` is instance 3.  You can also refer to instances relative to the right.  So the -1 instance is the last (rightmost), -2 is second from the last, and so forth.

    Often, you just want to replace a specific instance::

        -i 3 -r foo=boo
        -i -1 -r foo=boo

    Both of these refer to the last instance of old string ``foo`` (found at ``foo4`` in our example name).

    Sometimes, you'd like to replace a whole *range* of instances.  An "instance range" is specified using the ``:`` separator in the form::

        -i first-to-replace:stop-here

    Notice that the "stop-here" instance is NOT replaced.  In our string above, the option::

        -i 1:-1 -r foo=boo

    would change the file name to::

        foo1-boo2-boo3.foo4

    You can also provide partial ranges::

        -i 1:    # From instance 1 to end of name
        -i :-2   # All instances up to (not including) next-to-last
        -i :     # All instances

-P char
    Use ``char`` as the escape symbol.

    (*Default*: ``\``)

-q
    Quiet mode, do not show progress.

    (*Default*: Display progress)

    Ordinarily, **tren** displays what it is doing as it processes each file.  If you prefer to not see this "noisy" output, use the ``-q`` option.  Note that this does not suppress warning and error messages.

    It doesn't make much sense to use this option in test mode (``-t``), although you can.
    The whole point of test mode is to see what would happen.  Using the quiet mode suppresses that output.

-R char
    Use ``char`` as the separator symbol in renaming specifications.

    (*Default*: ``=``)

-r <old=new>
    Replace ``old`` with ``new`` in file or directory names.

    Use this option to specify which strings you want to replace in each file name.  These strings are treated literally unless you also invoke the ``-x`` option.  In that case, ``old`` is treated as a Python style regular expression.

    Both ``old`` and ``new`` may optionally contain *renaming tokens* described later in this document.

    If you need to use the ``=`` symbol *within* either the old or new string, simply escape it: ``\=``

    If it is convenient, you can change the separator character to something other than ``=`` via the ``-R`` option.  Similarly, you can change the escape character via the ``-P`` option.

    You can have multiple instances of this option on your **tren** command line::

        tren.py -r old=new -r txt=doc old-old.txt

    This renames the file to::

        new-old.doc

    Remember that, by default, **tren** only replaces the first (leftmost) instance of the old string with the new.

    Each rename specification on the command line "remembers" the current state of all the program options and acts accordingly.  For example::

        tren.py -cr A=bb -Cr B=cc ...

    The ``A=bb`` replacement would be done without regard to case (both ``A`` and ``a`` would match), whereas the ``B=cc`` request would only replace ``B``.

-S suffix
    Suffix to append when making backup copies of existing targets.

    (*Default*: ``.backup``)

    If you choose to force renaming of files when the new name already exists (``-f``), **tren** simply renames the existing file or directory by appending a suffix to it.  By default, this suffix is ``.backup``, but you can change it to any string you like with the ``-S`` option.

-t
    Test mode, don't rename, just show what the program *would* do.
    **tren** is very powerful and capable of doing nasty things to your file and directory names.  For this reason, it is helpful to test your **tren** commands before actually using them.  With this option enabled, **tren** will print out diagnostic information about what your command *would* do, *without actually doing it*.

    If your renaming requests contain random renaming tokens, test mode will only show you an approximation of the renaming to take place (because new random name strings are generated each time the program runs).

-v
    Print detailed program version information and keep running.

    This is handy if you're capturing **tren** output into a log and you want a record of what version of the program was used.

-w length
    Set the length of diagnostic and error output.

    (*Default*: 80)

    **tren** limits output to this length when dumping debug information, errors, warnings, and general information as it runs.  This option is especially useful when you're capturing **tren** output into a log and don't want lines wrapped::

        tren.py -w999 ..... >tren.log 2>&1

    **tren** makes sure you don't set this to some unreasonably small value such that output formatting would be impossible.

-X
    Treat the renaming strings literally.

    (*Default*: This is the program default)

    This option is provided so you can toggle the program back to its default behavior after a previous ``-x`` on the command line.

-x
    Treat the old string in a ``-r`` replacement as a Python style regular expression for matching purposes.

    (*Default*: Treat the old string as literal text)


TUTORIAL AND DESCRIPTION
------------------------

.. WARNING:: ONE MORE TIME: **tren** is a powerful file and directory renaming tool.  Be **sure** you know what you're about to do.  If you're not, run the program in test mode (invoke with the ``-t`` option) to see what would happen.  You have been warned!

The following sections are designed for the new- or occasional **tren** user.
They begin with the simplest of **tren** operations and incrementally build more- and more complex examples, eventually describing all of **tren**'s capabilities.


Overview
========

**tren** is a general purpose file and directory renaming tool.  Unlike commands like ``mv``, **tren** is particularly well suited for renaming *batches* of files and/or directories with a single command line invocation.  **tren** eliminates the tedium of having to script simpler tools to provide higher-level renaming capabilities.

**tren** is also adept at renaming only *part of an existing file or directory name* either based on a literal string or a regular expression pattern.  You can replace any single, group, or all instances of a given string in a file or directory name.

**tren** implements the idea of a "renaming token".  These are special names you can embed in your renaming requests that represent things like the file's original name, its length, date of creation, and so on.  There are even renaming tokens that will substitute the content of any environment variable or the results of running a program from a shell back into the new file name.

**tren** can automatically generate *sequences* of file names based on their dates, lengths, times within a given date, and so on.  In fact, sequences can be generated on the basis of any of the file's ``stat`` information.  Sequence "numbers" can be ascending or descending and the count can start at any initial value.  Counting can take place in one of several internally defined counting "alphabets" (decimal, hex, octal, alpha, etc.) OR you can define your own counting alphabet.  This allows you to create sequences in any base (2 or higher please :) using any symbol set for the count.

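
To make the idea of a counting "alphabet" concrete, here is a minimal Python sketch.  It is not taken from the **tren** source; the function name ``to_alphabet`` is hypothetical, and the sketch assumes only what the ``-A`` reference entry states: the leftmost character is the "0th" element and the counting base is the length of the character set.

```python
def to_alphabet(n, characterset):
    """Render the non-negative integer n using characterset as the digits.

    Hypothetical helper for illustration only; not part of tren.
    """
    base = len(characterset)
    if base < 2:
        raise ValueError("a counting alphabet needs at least 2 symbols")
    if n == 0:
        return characterset[0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)
        digits.append(characterset[remainder])
    # Digits were collected least-significant first, so reverse them.
    return "".join(reversed(digits))


# The "Foo:abcde" alphabet from the -A option counts in base 5.
print([to_alphabet(i, "abcde") for i in range(7)])
# ['a', 'b', 'c', 'd', 'e', 'ba', 'bb']
```

The same function reproduces ordinary hex counting when given ``0123456789abcdef``, which is one way to see that a user-defined alphabet is just a positional number system with custom digit symbols.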

A Word About Program Defaults
=============================

**tren** has many options, but its defaults are designed to do two things:

1) Simplify the most common operations by making them the default (no options required on the command line), and

2) Reduce the risk of accidentally modifying more of the file name than you intended.

So, by default:

**tren** treats renaming requests *literally*.  That is, the "old string" you specify for replacement is treated as literal text.  It requires a command line option (``-x``) to treat it as a regular expression.  *However*, any renaming tokens found in either the old- or new strings of a renaming request *are* interpreted before the renaming takes place.

**tren** renaming is *case sensitive*.  If you want to ignore case, use the ``-c`` option.

**tren** will only replace the *first (leftmost) instance* of "old string" with "new string".  If you want more- or different instances replaced, use the ``-i`` option.

**tren** will not allow you to rename a file or directory *if one with the new name already exists*.  Such attempts will cause no change to the file or directory being processed and an error message will be displayed.  This is intentional to force you to manually rename or remove the file or directory that would have been clobbered by a rename.  You can override this default and *force* a renaming via the ``-f`` option.  This will cause the existing file or directory of that name to be renamed first with a ``.backup`` suffix.  You can change this suffix via the ``-S`` option.


Getting Help
============

There are three command line options that can give you some measure of help and information about using **tren**:

-d
    Dumps debug information out to stderr.  You can insert multiple instances of this option on the command line to see how the program has parsed everything *to the left* of it.
    This is primarily intended as a debugging tool for people maintaining **tren** but it does provide considerable information on the internal state of the program that advanced users may find useful.

-h
    Prints a summary of the program invocation syntax and all the available options and then exits.

-v
    Prints the program version number and keeps running.


Controlling Program Output
==========================

As **tren** runs, it produces a variety of diagnostic and status information.  There are a number of options you can use to control how this works:

-q
    Sets "quiet" mode and suppresses everything except warning and error messages.

-w #
    Tells **tren** to wrap lines after ``#`` characters have been printed.  If you're capturing output to a log, set this to a very high number like 999 to inhibit line wrapping.

Error and debug messages are sent to ``stderr``.  Normal informational messages are sent to ``stdout``.  If you want to capture them both in a log, try something like this (depending on your OS and/or shell)::

    tren.py ..... >tren.log 2>&1


Managing Complexity
===================

As you learn more of the program features, the **tren** command line can get long, complex, and easy to goof up.  It's also hard to remember all the various options, how they work exactly, and which specific one you need.  For this reason, it is *highly* recommended that - once you have a renaming request working the way you like - if you plan to use it again, save it as an "include" file.  That way you can reuse it easily without having to keep track of the details over and over.

Instead of this::

    tren.py -c -i -1 -r .jpeg=.jpg file ...

Do this::

    tren.py -I jpeg-to-jpg.tren file...

What's in the ``jpeg-to-jpg.tren`` file?
Just this::

    # tren Command Line
    # Converts '.jpeg' (in any case mixture) file name suffix to '.jpg'

    # Make the replacement case-insensitive
    -c      # Reset this later on the command line with -C

    # Only replace the rightmost instance
    -i -1

    # The actual replacement request
    -r .jpeg=.jpg

Notice that you can stick comments in the file anywhere you like and that they begin with ``#``.  Notice also that the various options can be entered on separate lines so it's simpler to read the include file.

If you find it useful, you can even include other include files *in* an include file::

    # Get the jpeg -> jpg suffix renaming
    -I jpeg-to-jpg.tren

    # Let's make it fancy
    -i -1 -r .jpg=.fancy.jpg

If you do this, take care not to create a circular include.  This can happen when an include file tries to include itself, either directly, or via another include file.  **tren** limits the total number of includes to a very large number.  If it sees that the number has been exceeded, it suspects a circular include and will issue an error message to that effect and exit.

You can insert include options anywhere you like on the command line and you can have as many as you like (up to a VERY large number you'll never hit in practice).  Each include reference will be replaced with the contents of that file *at the position it appears on the command line*.

If you find yourself using certain options most- or every time you use the program, you can put them in the ``$TREN`` environment variable.  **tren** picks this up every time it starts.  This minimizes errors and reduces typing tedium.

Just keep in mind that some options can be overridden later on a command line, and some cannot.  For instance, suppose you do this::

    export TREN="-f -c"

The ``-c`` option to ignore case can be undone on the command line with a ``-C`` option.  However, the ``-f`` option cannot be undone.  So ... choose the options you want to make permanent in the environment variable wisely.

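
The include mechanism described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not **tren**'s actual code: the names ``expand_includes`` and ``read_file`` and the file name ``fancy.tren`` are hypothetical, comment stripping is deliberately naive, and only the separated ``-I file`` form is handled.  It follows the description that each include reference is replaced by the file's contents at the position where it appears, and it uses the limit of 1000 quoted in the ``-I`` reference entry to catch circular includes.

```python
import shlex

MAX_INCLUDES = 1000  # limit quoted in the -I reference entry


def expand_includes(args, read_file, limit=MAX_INCLUDES):
    """Return args with every '-I file' pair replaced by that file's arguments.

    read_file is a hypothetical callable mapping a file name to its text.
    """
    args = list(args)
    seen = 0
    i = 0
    while i < len(args):
        if args[i] == "-I" and i + 1 < len(args):
            seen += 1
            if seen > limit:
                raise RuntimeError("include limit exceeded; circular include suspected")
            included = []
            for line in read_file(args[i + 1]).splitlines():
                line = line.split("#", 1)[0]  # naive: drop '#' to end of line
                included.extend(shlex.split(line))
            # Splice in place and do not advance, so a nested -I that now
            # sits at this position is expanded on the next pass.
            args[i:i + 2] = included
        else:
            i += 1
    return args


# In-memory stand-ins for the two include files shown above.
files = {
    "jpeg-to-jpg.tren": "-c  # ignore case\n-i -1\n-r .jpeg=.jpg\n",
    "fancy.tren": "-I jpeg-to-jpg.tren\n-i -1 -r .jpg=.fancy.jpg\n",
    "loop.tren": "-I loop.tren\n",  # includes itself
}

print(expand_includes(["-I", "fancy.tren", "a.jpeg"], files.__getitem__))
# ['-c', '-i', '-1', '-r', '.jpeg=.jpg', '-i', '-1', '-r', '.jpg=.fancy.jpg', 'a.jpeg']
```

Counting inclusions rather than tracking which files have been seen is the simpler of the two obvious designs, and it matches the documented behavior: a self-including file such as ``loop.tren`` is not detected immediately, but it trips the limit and produces an error.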

Renaming Basics
===============

**tren** supports a variety of renaming mechanisms.  The one thing they have in common is that they're built with one or more *renaming requests* that will be applied to one or more file- or directory names.  Renaming requests look like this on the **tren** command line::

    tren.py ... -r old=new ... -r old=new ... list of files/directories

No matter how complicated they look, the basic logic of the renaming request stays the same: "When you find the string ``old`` in the file- or directory name, change it to the string ``new``."

The ``old`` and ``new`` renaming strings are built using a variety of building blocks:

=============================    =============================
*Old Strings Are Built With:*    *New Strings Are Built With:*
-----------------------------    -----------------------------
Literal Text                     Literal Text
Regular Expressions              Renaming Tokens
Renaming Tokens
=============================    =============================

You can use any of these building blocks alone or combine them to create expressive and powerful renaming schemes.


Literal String Substitution
===========================

Literal String Substitution is just that - it replaces one literal string with another to rename the target file or directory.  This is the most common, and simplest way to use **tren**.  This is handy when you have files and directories that have a common set of characters in them you'd like to change.  For instance::

    tren.py -r .Jpeg=.jpg *.Jpeg

This would rename all files (or directories) whose names contained the string ``.Jpeg`` and replace it with ``.jpg``.  Well ... that's not quite right.  Unless you specify otherwise with the ``-i`` option, *only the first (leftmost) instance of* ``old`` *is replaced with* ``new``.
So, for example, if you started out with the file, ``My.Jpeg.Jpeg`` and ran the command above, you'd end up with a new file name of ``My.jpg.Jpeg``

You can omit either ``old`` or ``new`` strings in a renaming specification, but never both.

If you omit the ``old`` string, you're telling **tren** to *change the whole file name*::

    tren.py -r =MyNewFilename foo    # New Name: MyNewFilename

Be careful with this one.  If you apply it to a list of files or directories, it's going to try and name them all to the *same* name.  By default, **tren** will refuse to overwrite an existing filename, so it will stop you from doing this.  If you absolutely insist on this via the ``-f`` option, you'll get a bunch of files ending with ``.backup``.  Say you have files ``a``, ``b``, and ``c``::

    tren.py -fr =NewName a b c

When the command completes, the files will have been renamed in this fashion::

    a -> NewName.backup.backup
    b -> NewName.backup
    c -> NewName

If you omit the ``new`` string, you're telling **tren** to *remove* the leftmost instance of ``old`` string (or other instances via the ``-i`` option described below) from the file- or directory name.  For example::

    tren.py -rfoo= foo1-foo2-foo3.foo4    # New name: 1-foo2-foo3.foo4

If you try to omit *both* ``old`` and ``new`` strings, you're effectively telling **tren** to change the existing file name to ... nothing (a null string).  This is impossible because file names must be at least one character long.

**tren** enforces both this minimum length AND the maximum legal length of new file names.  It will print an error and exit if your renaming attempt would violate either of these limits.  (As of this writing, the maximum file- or directory name length allowed by the operating systems on which **tren** runs is 255 characters.)


Substitution Instances
======================

As we just saw above, sometimes the ``old`` string appears in several places in a file- or directory name.
By default, **tren** only replaces the first, or leftmost "instance" of an ``old`` string.  However, using the ``-i`` option you can specify *any* instance you'd like to replace.  In fact, you can even specify a *range* of instances to replace.

Instances are nothing more than *numbers* that tell **tren** just where in the name you'd like the replacement to take place.  Positive numbers mean we're counting instances from the *left* end of the name.  The leftmost instance is 0 (not 1!!!).

You can also count *backwards* from the right end of the string using negative numbers.  -1 means the last instance, -2 means next-to-last, and so on.  In summary, counting from the left starts at zero and counting from the right starts at -1.

Suppose you have a file called::

    foo1-foo2-foo3.foo4

The leftmost ``foo1`` is instance 0 of old string ``foo``.  It is also instance -4.  The rightmost ``foo4`` is instance 3 of old string ``foo``, and also instance -1.

You can specify a *single instance* to replace::

    tren.py -i 1 -r f=b foo1-foo2-foo3.foo4    # New Name: foo1-boo2-foo3.foo4
    tren.py -i -1 -r f=b foo1-foo2-foo3.foo4   # New Name: foo1-foo2-foo3.boo4

You can also specify a *range of instances* to replace using the notation::

    -i first-to-replace:stop-here

All instances from the "first-to-replace" up to, *but NOT including* the "stop-here" are replaced::

    tren.py -i 1:3 -r f=b foo1-foo2-foo3.foo4     # New Name: foo1-boo2-boo3.foo4
    tren.py -i -4:-2 -r f=b foo1-foo2-foo3.foo4   # New Name: boo1-boo2-foo3.foo4

``-i :`` means "replace *all* instances"::

    tren.py -i : -r f=b foo1-foo2-foo3.foo4       # New Name: boo1-boo2-boo3.boo4

You can also use *partial range specifications*::

    tren.py -i 1: -r f=b foo1-foo2-foo3.foo4      # New Name: foo1-boo2-boo3.boo4
    tren.py -i :-2 -r f=b foo1-foo2-foo3.foo4     # New Name: boo1-boo2-foo3.foo4

Note that you cannot specify individual, non-adjacent instances.
There is no way to use a single **tren** renaming request to replace, say, only the 2nd and the 4th instance of an ``old`` string.  Doing that requires two renaming requests.  The good news is that we can do them both on a single **tren** invocation.


Multiple Substitutions
======================

You can put as many renaming requests on a **tren** command line as you like (.... well, up to the length limit imposed by your operating system and shell, anyway).  As we just saw, this can be handy when a single renaming request can't quite do everything we want.

BUT ... there's a catch.  In designing your renaming requests, you have to keep in mind that **tren** processes the command line *from left to right*, incrementally constructing the new name as it goes.  For instance::

    tren.py -r foo=bar -r foo=baz foo1-foo2-foo3.foo4

Produces ... wait a second ... why on earth are there two renaming requests with identical ``old`` strings on the same command line?  Shouldn't this produce a final name of ``baz1-foo2-foo3.foo4``?

Nope.  After the leftmost renaming request has been processed, the new name is ``bar1-foo2-foo3.foo4``.  Remember that, by default, **tren** only replaces the *leftmost* or 0th instance of an ``old`` string.  So, when the second renaming request is processed, the instance 0 of ``foo`` is now found in the string ``foo2``.  So, the final name will be, ``bar1-baz2-foo3.foo4``.

The lesson to learn from this is that multiple renaming requests on the command line will work fine, but you have to do one of two things (or both):

1) Make sure you're tracking what the "intermediate" names will look like as the new file name is being constructed, renaming request by renaming request.

2) Make sure the renaming requests operate on completely disjoint parts of the file name.

.. NOTE:: Similarly, **tren** remembers the last state of each option as you move from left to right on the command line.
   For instance::

       tren.py -i1 -r f=F -r o=O foo1-foo2-foo3.foo4

   You might be tempted to believe that this would produce, ``fOo1-Foo2-foo3.foo4``, but it doesn't.  It produces, ``foO1-Foo2-foo3.foo4`` instead because the ``-i 1`` appears prior to *both* renaming requests and thus applies to each of them.  If you want the first instance of "o" to be replaced, you need a command line like this::

       tren.py -i1 -r f=F -i0 -r o=O foo1-foo2-foo3.foo4

   This sort of thing is generally true for *all* options, so be sure they're set the way you want them to the left of a renaming request.

   As a practical matter, this can get really complicated to track.  If in doubt, it's always better to run two separate **tren** commands in, say, a shell script to make the renaming explicit, rather than to obscure things with clever command line trickery.

So, let's go back to our example from the previous section.  We want to replace the 2nd and 4th instances of the string "foo" in our file name.  We do this with two renaming requests on the same command line, considering what each one does to the name as it is encountered::

    tren.py -i1 -r foo=bar -i2 -r foo=bar foo1-foo2-foo3.foo4


More About Command Line Pitfalls
================================

As we just saw, you can get surprising results as **tren** works its way through the command line from left to right.  There are other potential pitfalls here, so it's helpful to understand just *how* **tren** processes your command line, step-by-step:

1) Prepend the contents of ``$TREN`` to the user-provided command line.  This allows you to configure your own default set of options so you don't have to type them in every time.

2) Resolve all references to include files.  This has to be done before anything that follows, because include files add options to the command line.

3) Build a table of every file name to be renamed.  We'll need this information if any of the renaming requests use the file- or sequence renaming tokens (discussed later in this document).
4) Build a table containing each renaming request, storing the current state of every program option at that point on the command line.  This allows **tren** to apply options differently to different renaming requests on the same command line.  This came in handy in our example of the previous section.

5) Resolve any renaming tokens found in either the ``old`` or ``new`` portions of the renaming request.  At this point, both ``old`` and ``new`` are nothing more than simple strings, although ``old`` may be interpreted as a regular expression rather than literally if the option to do so is in effect.

6) Process each file found on the command line in left to right order, applying each renaming request, in the order it appeared from left to right on the command line.

Simple eh?  Well, mostly it is ... until it isn't.  As we just saw, incrementally building up a new name with multiple renaming requests can produce unexpected results and we have to plan for them.

Similarly, you can inadvertently give a file the *wrong name entirely* ... this is usually a Bad Thing.  Say you have two files, ``x`` and ``y``.  You want to rename ``x`` to ``y`` and ``y`` to ``z1``.  Well, order matters here.  Say you do this::

    tren.py -fr x=y -r y=z1 x y

Let's see what happens in order:

1) File ``x`` renaming::

       x -> y
       y -> z1

   So, file ``x`` is renamed ``z1`` (!)

2) File ``y`` renaming::

       y -> z1           .... oops, z1 exists, we need a backup
       z1 -> z1.backup
       y -> z1

Um ... not quite what we wanted.  However, if we shuffle around the order of renaming arguments AND the order in which to process the files, we can get what we want::

    tren.py -r y=z1 -r x=y y x

Notice that we can drop the ``-f`` option because there is no longer a naming conflict (see the next section for more about forced renaming).
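
The left-to-right behavior described above is easy to experiment with outside of **tren**.  The following Python sketch is an approximation under stated assumptions, not the program's implementation: the hypothetical ``apply_requests`` function models only literal, case-sensitive requests that replace the default leftmost instance, and it ignores renaming tokens, ``-i``, and the filesystem entirely.

```python
def apply_requests(name, requests):
    """Apply (old, new) pairs left to right, replacing only the leftmost match.

    Hypothetical model of default literal renaming; not tren's own code.
    """
    for old, new in requests:
        # Each request sees the name produced by the previous request.
        name = name.replace(old, new, 1)
    return name


# Two requests with the same old string act on successive instances.
print(apply_requests("foo1-foo2-foo3.foo4", [("foo", "bar"), ("foo", "baz")]))
# bar1-baz2-foo3.foo4

# -r x=y -r y=z1 turns file x into z1, not y.
print(apply_requests("x", [("x", "y"), ("y", "z1")]))
# z1

# Reordering the requests gives the intended names.
print(apply_requests("y", [("y", "z1"), ("x", "y")]))
# z1
print(apply_requests("x", [("y", "z1"), ("x", "y")]))
# y
```

Tracing a planned command line through a model like this is one way to see the "intermediate" names before running the real thing in test mode with ``-t``.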
The point here, as we've said already, is that you have to be very careful when constructing command lines, keeping track of options, and *what order* you specify both renaming requests *and* the files and directories to be renamed. As always, the simple way around this is to run multiple, separate **tren** commands, each with its own single renaming request.

Forcing Renaming
================

Ignoring Case
=============

The Strange Case Of Mac OS X And Windows
========================================

Mac OS X and Windows have an "interesting" property that makes case handling a bit tricky. Both of these operating systems *preserve* case in file and directory names, but they do not *observe* it. (It is possible to change this behavior in OS X when you first prepare a drive, and make the filesystem case sensitive. This is rarely done in practice, however.)

These OSs show upper- and lower- case in file names as you request, but they do not *distinguish* names on the basis of case. For instance, the files ``Foo``, ``foo``, and ``FOO``, are all the same name in these operating systems, and only one of these can exist in a given directory.

This can cause **tren** to do the unexpected when your renaming command is doing nothing more than changing case. Suppose you start with a file called ``Aa.txt`` and run this command::

    tren.py -rA=a Aa.txt

**tren** will immediately complain and tell you that the file ``aa.txt`` already exists and it is skipping the renaming. Why? Because from the point-of-view of OS X or Windows, ``aa.txt`` (your new file name) is the same as ``Aa.txt`` (your original file name). You can attempt to force the renaming::

    tren.py -frA=a Aa.txt

Guess what happens? Since **tren** thinks the new file name already exists, it backs it up to ``aa.txt.backup``. But now, when it goes to rename the original file ... the file is *gone* (thanks to the backup renaming operation)! **tren** declares an error and terminates.
This is not a limitation of **tren** but a consequence of a silly design decision in these two operating systems. As a practical matter, the way to avoid this issue is to never do a renaming operation in OS X or Windows *that only converts case*. Try to include some other change to the filename to keep the distinction between "old name" and "new name" clear to the OS. In the worst case, you'll have to resort to something like::

    tren.py -rA=X Aa.txt
    tren.py -rX=a Xa.txt

Using Regular Expressions
=========================

Ordinarily **tren** treats the old string you specify with the ``-r`` option *literally*. However, it is sometimes handy to be able to write a regular expression to specify what you want replaced. If you specify the ``-x`` option, **tren** will treat your old string as a Python style regex, compile it (or try to anyway!) and use it to select which strings to replace. This makes it much easier to rename files that have repeated characters or patterns, and groups of files that have similar, but not identical, strings in their names you'd like to replace.

Say you have a set of files that are similar, but not identical in name, and you want to rename them all::

    sbbs-1.txt
    sbbbs-2.txt
    sbbbbbbbbs-3.txt

Suppose you want to rename them, replacing two or more instances of ``b`` with ``X``. It is tedious to have to write a separate literal ``-r old=new`` string substitution for each instance above. This is where regular expressions can come in handy. When you invoke the ``-x`` option, **tren** understands this to mean that the ``old`` portion of the replacement option is to be treated as a *Python style regular expression*. That way, a single string can be used to match many cases::

    tren.py -x -r bb+=X *.txt

This renames the files to::

    sXs-1.txt
    sXs-2.txt
    sXs-3.txt

Keep in mind that a literal string is a subset of a regular expression.
This effectively means that with ``-x`` processing enabled you can include *both* regular expressions and literal text in your "old string" specification. The only requirement is that the string taken as a whole must be a valid Python regular expression. If it is not, **tren** will display an error message to that effect.

Because Python regular expressions can make use of the ``=`` symbol, you need a way to distinguish between an ``=`` used in a regular expression and the same symbol used to separate the old and new operands for the ``-r`` option. Where this symbol needs to appear in a regular expression, it has to be escaped like this: ``\=``.

Regular expression processing is unaffected by the ``-g / -1`` (global rename) and ``-c / -C`` (ignore case) options. That's because there are regular expression mechanisms for achieving the same thing. More importantly, if you've selected regular expression matching, it's probably because you want very fine grained control of the renaming defined by the regex. In short, regular expression matching always takes place on the *original characters* of the target portion of the name and does replacement as called for in the regex itself.

Changing The Renaming Separator & Escape Characters
===================================================

There may be times when the default renaming separator (``=``) and/or escape character (``\``) make it clumsy to construct a renaming request. This can happen if, say, either the old- or new string in a literal renaming needs to use the ``=`` symbol many times. Another case where this may be helpful is when constructing complex regular expressions that need to make use of these characters.

The ``-R`` and ``-P`` options can be used to change the character used for renaming separator and escape character respectively. You can use any character you like (these must be a single character each), but bear in mind that the underlying operating system understands certain characters as being special.
Trying to use them here will undoubtedly deeply confuse your command shell, and possibly your file system. For example, the ``/`` character is used as a path separator in Unix-derived systems. It's therefore a Really Bad Idea to try and use it as a renaming separator or escape character.

Interactive Renaming
====================

An Overview Of Renaming Tokens
==============================

**tren** implements the notion of *renaming tokens*. These can appear in either the ``old`` or ``new`` string components of a ``-r`` renaming argument.

It is sometimes useful to be able to take a group of files or directories whose names have nothing in common and impose a common naming scheme on them. Another use for renaming tokens is to do the renaming based on some property the file or directory possesses like its creation date, size, owner's name, and so on.

In their simplest form, renaming tokens are nothing more than "canned" information **tren** knows about a particular file or directory. For instance, if you insert the ``/D/`` token into an old- or new string definition, **tren** will replace it with *the creation date of the file or directory being renamed* and use that string in the renaming process.

There are also tokens that allow you to use system information in your renaming strings. Finally, there are tokens that can be used to automatically renumber or sequence (order) a set of files or directories being renamed.

For example, suppose you and your friends pool your vacation photos but each of your cameras uses a slightly different naming scheme. You might want to just reorder them by the date and time each picture was taken, for example. That way you end up with one coherent set of named and numbered files.
You might start with something like this::

    DSC002.jpg    # Bob's camera,  taken 1-5-2010 at noon
    dc0234.Jpg    # Mary's camera, taken 1-5-2010 at 8am
    032344.jpeg   # Sid's camera,  taken 1-3-2010 at 4pm

It would be nice to get these in order somehow::

    tren.py -r =/D/-MyVacation-/+T0001/.jpeg *.jp*

This would rename all the files in the current directory ending with ``.jp*``. The ``/D/`` would be replaced with the *date* the picture was taken. The ``/+T0001/`` refers to a *starting sequence number* to uniquely identify pictures taken on the same date. The other strings, ``-MyVacation-`` and ``.jpeg``, are inserted *literally* in the final file names. After we ran this command, the files above would end up with these names::

    20100103-MyVacation-0001.jpeg   # Sid's
    20100105-MyVacation-0001.jpeg   # Mary's
    20100105-MyVacation-0002.jpeg   # Bob's

Notice that the files taken on the same date have been sequenced by the time-of-day they were taken because we included the ``/+T0001/`` renaming token in our pattern. The ``+`` here means to construct the sequence in *ascending* order. A ``-`` would specify *descending* order.

.. Note::

   Notice that there is *no old string* in our example above. That is, there is nothing to the left of the ``=`` symbol in the ``-r`` option. This effectively means "replace everything" in the existing file or directory name. You can do the same thing using a regular expression that matches the whole name::

       tren.py -x -r .*=/D/-MyVacation-/+T0001/.jpeg *.jp*

   If you use the ``-b`` or ``-e`` flags, you limit just what portion of the filename is considered "everything".

Of course, you don't *have* to replace the entire filename when using tokens. It's perfectly legitimate to replace only a portion of the existing name::

    tren.py -r file=/D/-file file-1 file.2

This would rename our files to ``20100101-file-1`` and ``20100101-file.2``. Notice that we combined literal text and a renaming token to do this.

You can even use renaming tokens in your *old string* specification.
For instance, suppose you manage a number of different systems and you set their system name in an environment variable called ``SYSNAME``. You might then do something like this::

    tren.py -x -r /$SYSNAME/*.bku=/$SYSNAME/*.bku.old

If your system name was "matrix", then the command above would only rename files whose names began with ``matrix`` and ended with ``.bku``. If your system name were "morton", then the command above would only rename files whose names began with ``morton`` and ended with ``.bku``.

There are a couple of things to keep in mind when doing things like this:

1) The ``/$SYSNAME/`` in the old string is used to *find the text to rename*, whereas the same renaming token in the new string means *insert the contents of that environment variable here*.

2) Renaming tokens are always evaluated *before* any regular expression processing takes place. It's up to you to make sure that when the two are combined (as we have in the example above), *that the final result is still a valid Python regular expression*. This may involve explicit quoting of the renaming tokens used in the old string specification.

**tren** has many other kinds of renaming tokens. Their structure and use is described in some detail in the section below entitled `Renaming Tokens: The Gory Details`_.

Renaming Token Pitfalls
=======================

As we saw in earlier sections, **tren** command line option and file name interaction can be tricky. It can depend on order and on whether the various renaming requests "collide" with each other as a new file name is computed.

A similar potential collision exists between renaming tokens and renaming requests. Recall from `More About Command Line Pitfalls`_ that renaming tokens are resolved *before* a renaming request is processed. This means that the string substitution (literal or regular expression) of the renaming operation can *conflict with the characters returned when the renaming token was resolved*.
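The "tokens first, substitution second" ordering can be modelled in a few lines of Python. This is a simplified sketch and not how **tren** itself is written: ``resolve_tokens`` and ``apply_request`` are hypothetical helpers, the token table is a plain dictionary, and the owner name ``bob`` is made up.

```python
import re


def resolve_tokens(text, values):
    """Replace each /NAME/ token with values[NAME]; unknown tokens are left alone."""
    def lookup(match):
        return values.get(match.group(1), match.group(0))
    return re.sub(r"/([^/]+)/", lookup, text)


def apply_request(name, old, new):
    """Apply one literal old=new request; an empty old string replaces the whole name."""
    if old == "":
        return new
    return name.replace(old, new, 1)


values = {"U": "bob"}  # assumption: the file's owner is "bob"
name = "notes"

# Tokens are resolved up front, so each request only ever sees plain strings.
requests = [("", resolve_tokens("/U/-notes", values)), ("bo", "BO")]
for old, new in requests:
    name = apply_request(name, old, new)

print(name)  # BOb-notes
```

The second request rewrites text that came from the token, not from the original file name. Nothing in the model can tell the two apart, which is the source of the collisions this section describes.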
For example, suppose we do this::

    tren.py -r =New-/FNAME/ -r My=Your MyFile.txt

The first renaming request computes the name ``New-MyFile.txt``. However, the second renaming request further modifies this to ``New-YourFile.txt``. In effect, the second renaming request is *overwriting part of the string produced by the renaming token reference*.

This is an intentional feature of **tren** to allow maximum renaming flexibility. However, you need to understand how it works so you don't get unexpected and strange results. For example, look what happens when you reverse the order of the renaming requests in this case::

    tren.py -r My=Your -r =New-/FNAME/ MyFile.txt

``My`` gets replaced with ``Your``, but as soon as the second renaming request is processed, the whole string is thrown away and replaced with the final name ``New-MyFile.txt``.

Renaming Tokens: The Gory Details
=================================

As we've just seen, a *renaming token* is nothing more than a string representing something **tren** knows about. These fit in one of three categories:

- An attribute of the file or directory being renamed
- An attribute of the underlying operating system environment
- A sequence that reflects some ordering principle

Renaming tokens are delimited by the ``/`` character. **tren** replaces these tokens with the corresponding information (see descriptions below) wherever you indicated in either the ``old`` or ``new`` strings of a ``-r`` rename command.

Currently, **tren** defines a number of renaming tokens. Future releases of **tren** may add more of these, so it's good to periodically reread this material.

File Attribute renaming tokens
==============================

These tokens are derived from information about the file or directory being renamed.

``/D/    File or directory creation date``

    This token is replaced with the date of creation of the file or directory being renamed. It is in ``yyyymmdd`` format.
``/dd/   File or directory day of creation``

    This token is replaced with the day of the month the file was created in ``dd`` format.

``/dy/   File or directory day of creation``

    This token is replaced with the name of the day the file was created in ``Ddd`` format.

``/E/    Original File Extension``

    This token is replaced with the "extension" portion of the file or directory before renaming. This does not include the extension separator string.

``/F/    Original File Name``

    This token is replaced with the "name" portion of the file or directory before renaming.

    .. NOTE::

       Notice that there is no token for the *whole* filename because you can always synthesize it with ``/F/./E/``

``/G/    File or directory primary group name``

    This token is replaced with the name of the primary group to which the file belongs.

``/hh/   File or directory hour of creation``

    This token is replaced with the hour the file was created in ``hh`` format.

``/I/    File or directory creation date in ISO format``

    This token is replaced with the date of creation of the file or directory being renamed. It is similar to ``/D/`` except it is in ISO format, ``YYYY-MM-DD``.

``/L/    File or directory length``

    This token is replaced with a numeric string that indicates the length of the file or directory in bytes.

``/mm/   File or directory minutes of creation``

    This token is replaced with the minutes the file was created in ``mm`` format.

``/mo/   File or directory month of creation``

    This token is replaced with the numeric month the file was created in ``mm`` format.

``/my/   File or directory month of creation``

    This token is replaced with the abbreviated name of the month the file was created in ``Mmm`` format.

``/ss/   File or directory seconds of creation``

    This token is replaced with the seconds the file was created in ``ss`` format.

``/T/    File or directory creation time``

    This token is replaced with the time of creation of the file or directory being renamed. It is in ``hh:mm:ss`` format.
    This is equivalent to ``/hh/:/mm/:/ss/``.

``/U/    File or directory owner name``

    This token is replaced with the name of the file or directory's owner.

``/yyyy/ File or directory year of creation``

    This token is replaced with the year the file was created in ``yyyy`` format.

System Related renaming tokens
==============================

These tokens are derived from the underlying operating system and runtime environment.

``/$ENV/ Environment variable``

    This token is replaced with the value of the environment variable ``$ENV``. If that variable does not exist, the token is replaced with an empty string::

        tren.py -r =/$ORGANZATION/-/F/./E/ *

    This prepends the organization's name to everything in the current directory.

``/`cmd`/ Arbitrary command execution``

    This token is replaced with the string returned by executing the ``cmd`` command. For instance, you might want to prepend the name of the system to all your shell scripts::

        tren.py -r =/`uname -n`/-/F/./E/ *.sh

    This construct is more generally a way to synthesize renaming tokens that are not built into **tren**. For instance, the built-in tokens only provide information about file and directory *creation* dates. You might want to use the date of *last access*. You do this by writing the appropriate script or program and then executing it within the ``/`cmd`/`` construct. This effectively provides **tren** an unlimited number of renaming tokens.

    .. WARNING::

       Be *very* careful using this. It's possible to construct bizarre, overly long, and just plain chowder-headed strings that make no sense in a renaming context using this construct.

Sequence Renaming Tokens
========================

Sometimes it's useful to rename files or directories based on some *property they possess* like the date or time of creation, the size of the file, who owns it, and so on. That's the idea behind the ``/D/``, ``/L/``, and ``/T/`` renaming tokens described in the previous section.
An extension of this idea is to *order all the files being renamed* based on one of these parameters. For instance, instead of actually embedding the date and time of creation in a file or directory name, you might want to order the files from oldest to newest with a naming convention like::

    file-1.txt
    file-2.txt
    file-3.txt

This guarantees uniqueness in the final name and also sees to it that a sorted directory listing will show you the files or directories in the order you care about.

This is the purpose of *sequence renaming tokens*. They give you various ways to create sequences that can be embedded in the final file or directory name.

General Format Of Sequence Renaming Tokens
==========================================

Sequence renaming tokens consist of three descriptive components and have the following general format::

    /<ordering flag><type><counting pattern>/

    where,
          ordering flag:  +  ascending
                          -  descending

          type:           D  sequence on file creation date & time
                          L  sequence on file length
                          R  sequence on the command line file order
                          T  sequence on file creation time within a given day

Count Pattern Format
====================

The counting pattern is used to specify two things: the width of the sequence string, and the starting value for the sequence. Examples::

    0001  ->  0001, 0002, 0003, ...
    0000  ->  0000, 0001, 0002, ...
    03    ->  03, 04, 05, ...

You do not have to use a ``0`` to indicate the sequence width. You can use *any* padding characters you like. **tren** only cares about the width of the field and will "consume" your padding characters as the count increases::

    xxx3  ->  xxx3, xxx4, xxx5, ... 9999, xxx3, xxx4, ...
    -+8   ->  -+8, -+9, -10, -11, ... 999, -+8, -+9, ...

You are not restricted to numbers in a counting pattern. Letters may also be used. **tren** will preserve the case you specify in the token when creating sequences like this::

    000a  ->  000a, 000b, 000c, ... zzzz, 000a, ...
    ---Y  ->  ---Y, ---Z, --AA, ... ZZZZ, ---Y, ---Z, ...
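The way padding is "consumed" in the numeric examples above can be sketched as a small Python generator. This is only an illustration of the documented behavior, not **tren**'s implementation: ``count_sequence`` is a hypothetical name, it handles decimal patterns only (letter sequences are out of scope), and it assumes the starting value is the trailing run of digits in the pattern.

```python
import re
from itertools import islice


def count_sequence(pattern):
    """Yield successive values for a numeric counting pattern such as 'xxx3'."""
    width = len(pattern)
    digits = re.search(r"\d+$", pattern)
    if digits is None:
        raise ValueError("this sketch handles numeric patterns only")
    start = int(digits.group())
    limit = 10 ** width - 1          # largest value that fits in the field
    n = start
    while True:
        text = str(n)
        # Keep as many leading pattern characters as still fit in the field.
        yield pattern[:width - len(text)] + text
        # Roll over to the starting value once the field is full.
        n = start if n >= limit else n + 1


print(list(islice(count_sequence("-+8"), 4)))   # ['-+8', '-+9', '-10', '-11']
print(list(islice(count_sequence("0001"), 3)))  # ['0001', '0002', '0003']
```

Slicing the original pattern, rather than padding with a fixed character, is what lets arbitrary padding such as ``xxx`` or ``-+`` disappear one character at a time as the number grows.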
Notice that when a sequence "rolls over", the next value is the *initial sequence value you specified*.

Types Of Sequence Renaming Tokens
=================================

Sequence renaming tokens are thus a way to generate an ordering *based on some property common to everything being renamed*. Keep in mind that for purposes of sequencing, **tren** *makes no distinction between a file and directory*. It merely sequences based on the property you requested.

**tren** currently supports the following kinds of sequencing:

``/+D0001/ Sequence based on the creation date/time``

    This produces a sequence from oldest to newest (or the reverse) of the renamed objects.

    ``tren.py -b -r =/+D0002/ *.txt``

    This would rename all the files in the current directory into the form, ``0002.txt``, ``0003.txt``, ... ``9999.txt`` with ``0002.txt`` being the oldest file and ``9999.txt`` being the newest. If you used the token ``/-D0002/``, you'd get the same thing, but in reverse order.

``/+L0001/ Sequence based on the size of the files being renamed``

    This produces a sequence from shortest to longest (or the reverse) of the renamed objects.

    ``tren.py -r =/+L0002/ *.txt``

    This would rename all the files in the current directory into the form, ``0002.txt``, ``0003.txt``, ... ``9999.txt`` with ``0002.txt`` being the shortest file and ``9999.txt`` being the longest. If you used the token ``/-L0002/``, you'd get the same thing, but in reverse order.

``/+R0001/ Sequence based on the file order on the command line``

    This produces a sequence based on the order (or the reverse) of renaming - i.e., the order of the names on the command line.

    ``tren.py -e -r =/+R0000/ MyFile.txt AFile.jpg me.log``

    This would rename the files to ``MyFile.0000``, ``AFile.0001``, and ``me.0002``. If you used ``/-R0000/``, you'd get ``MyFile.0002``, ``AFile.0001``, and ``me.0000``.
``/+T0001/ Sequence based on creation time within date``

    This produces a sequence based on the creation date and time similar to the ``/+D.../`` sequence renaming token above. However, the sequence *resets* at the beginning of each new date. This allows you to create unique sequences *within a date* like our example of renaming photo files from different cameras. (See: `An Overview Of Renaming Tokens`_)::

        tren.py -b -r =/D/-/+T0100/ *.txt

    This would rename all the ``.txt`` files in the current directory into the form::

        200103010-0100.txt
        200103010-0101.txt
        200103010-0102.txt
        200104010-0100.txt
        200104010-0101.txt
        200104010-0102.txt
        200104011-0100.txt
        200104011-0101.txt
        200104011-0102.txt
        ...

    In other words, instead of sequencing just on the creation date, this allows us to sequence *within* the date. As always, the ``-`` flag will reverse this order within the date.

    Notice that you can get something similar using just file attribute renaming tokens::

        tren.py -b -r =/D/-/T/ *.txt

    This would produce names in the form::

        200103010-03:01:23.txt
        200103010-03:01:24.txt
        200103010-03:01:25.txt
        ...

    For most purposes, though, the order, rather than the absolute time, is both more useful and more readable.

BUGS, MISFEATURES, OTHER
------------------------

You must be running Python 2.6.x or later. **tren** makes use of features not supported in releases prior to this.

As a general matter, **tren** should run on any POSIX-compliant OS that has this version (or later) of Python on it. It will also run on many Microsoft Windows systems. If the Windows system has the ``win32all`` Python extensions installed, **tren** will take advantage of them for purposes of deriving the names of the user and group that own the file or directory being renamed.

As of this writing, **tren** will not run in the **cygwin** environment because their version of Python is still backleveled to 2.5.x. When and if the **cygwin** team upgrades to 2.6.x, **tren** is expected to work there as well.
This program is **EXPERIMENTAL** (see the license). This means it's had some testing but is certainly not guaranteed to be perfect. As of this writing, it has been run on FreeBSD, Linux, Windows XP, and Mac OS X. It has not, however, been run on 64-bit versions of those OSs.

If you have experience, positive or negative, using **tren** on other OS/bitsize systems, please contact us at the email address below.

HOW COME THERE'S NO GUI?
------------------------

**tren** is primarily intended for use by power users, sys admins, and advanced users who (mostly) find GUIs more of a nuisance than a help. There are times, however, when it would be handy to be able to select the files to be renamed graphically. TundraWare has a freely available, macro-programmable file browser. It will work nicely in such applications:

    http://www.tundraware.com/Software/twander/

COPYRIGHT AND LICENSING
-----------------------

**tren** is Copyright (c) 2010 TundraWare Inc.

For terms of use, see the ``tren-license.txt`` file in the program distribution. If you install **tren** on a FreeBSD system using the 'ports' mechanism, you will also find this file in ``/usr/local/share/doc/tren``.

AUTHOR
------

::

    Tim Daneliuk
    tren@tundraware.com

DOCUMENT REVISION INFORMATION
-----------------------------

::

    $Id: tren.rst,v 1.154 2010/03/26 20:09:32 tundra Exp $

You can find the latest version of this program at:

    http://www.tundraware.com/Software/tren

This document was produced using reStructuredText:

    http://docutils.sourceforge.net/rst.html