NAME
----

**tren** - Advanced File Renaming


HOW TO USE THIS DOCUMENT
------------------------

**tren** is a powerful command line file/directory renaming tool.  It
implements a variety of sophisticated renaming features that can be a bit
complex to learn.  For this reason, this document is split into two
general sections: `REFERENCE`_ and `TUTORIAL AND DESCRIPTION`_.  If
you are new to **tren**, start by studying the latter section.
It will take you from very simple to highly complex **tren** renaming
operations.  Once you've got a sense of what
**tren** can do, the reference section will be handy for looking up
options and their arguments.

.. WARNING:: **tren** is very powerful and can easily and
             automatically rename things in ways you didn't intend.
             It is **strongly** recommended that you try out new
             **tren** operations with the ``-t`` option on the command
             line.  This turns on the "test mode" and will show you
             what the program *would* do without actually doing it.
             It goes without saying that you should be even more
             careful when using this program as the system root or
             administrator.  It's quite easy to accidentally rename
             system files and thereby clobber your OS.  You have been
             warned!!!

REFERENCE
---------


SYNOPSIS
--------

::

    tren.py [-aCcdfhqtvXx] [-A alphabet] [-I file] [-i range]  [-P esc] [-R sep] [-r old=new] [-S suffix] [-w width] file|dir ...


SPECIFYING OPTIONS
------------------

You may specify **tren** options in one of three ways:

  1) On the command line
  2) In an "include" file specified with ``-I filename`` on the command line
  3) Via the ``$TREN`` environment variable

Options specified on the command line are evaluated from left to right
and supersede any options specified in the environment variable.
Think of any options set in ``$TREN`` as the "leftmost command line
options".
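
One way to picture this precedence rule is as plain list concatenation.
The following is only a sketch, not **tren**'s actual source: the
function name ``effective_args`` and the use of Python's ``shlex``
module to split the variable into words are assumptions made for
illustration.

```python
import shlex


def effective_args(env_value, cli_args):
    """Return the option list with the $TREN contents placed leftmost.

    Hypothetical helper for illustration; tren's real parsing may differ.
    """
    return shlex.split(env_value or "") + list(cli_args)


# $TREN asks for case-insensitive matching; the command line turns it off.
args = effective_args("-c", ["-C", "-r", "old=new", "file.txt"])
print(args)
```

Because ``-C`` ends up to the right of ``-c`` in the combined list, a
left-to-right reading leaves case sensitivity switched back on for the
``-r`` request.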

All options must precede the list of files and/or directories being
renamed.  If one of your rename targets starts with the ``-``
character, most command shells recognize the double dash as an
explicit "end of options" delimiter::

  tren.py -opt -opt -opt -- -this_file_starts_with_a_dash

Most shells aren't too fussy about space between an option
that takes an argument, and that argument::

  -i 1
  -i1

Use whichever form you prefer.  Just be aware that there are places
where spaces matter.  For example, you can quote spaces on your
command line to create renaming requests that, say, replace spaces
with dashes.

Some options below are "global": they change the state of the entire
program permanently and cannot be undone by subsequent options.  Other
options are "toggles": they can be turned on and off as you move from
left to right on the command line.  In this way, certain options
(like case sensitivity, regular expression handling, and so on) can be
set differently for each individual renaming request (``-r``).  (If
you're very brave, you can select the ``-d`` option to do a debug
dump.  Among many other things, the **tren** debugger dumps the state
of each renaming request, and what options are in effect for that
request.)


OPTIONS
=======

  -A alphabet  Install a user-defined "alphabet" to be used by
               sequence renaming tokens.

               (*Default*: Built-in alphabets only)

               The alphabet is specified in the form::

                 name:characterset

               Both the name and the characterset are case- and
               whitespace-sensitive (if your shell permits passing
               spaces on the command line). The "0th" element
               of the alphabet is the leftmost character.  The
               counting base is the length of ``characterset``.
               So, for instance, the following alphabet is
               named ``Foo``, counts in base 5 in the
               sequence, ``a, b, c, d, e, ba, bb, ...``::

                 -A Foo:abcde

  -a   Ask interactively before renaming each selected file or
       directory.

        (*Default*: off)

       If you invoke this option, **tren** will prompt you before
       renaming each file.  The default (if you just hit ``Enter``)
       is to *not* rename the file.  Otherwise, you have the following 
       options::

         n - Don't rename the current file
         y - Rename the current file
         ! - Rename all the remaining files without further prompting
         q - Quit the program

       These options are all insensitive to case.

       If you're doing forced renaming (``-f``), this option will
       interactively ask you first about making any necessary backups
       and then renaming the original target.  *If you decline to
       do the backup renaming, but accept the renaming of the original
       target, the file or directory that already exists with that
       name will be lost!*

  -C   Do case-sensitive renaming

         (*Default*: This is the program default)

       This option is provided so you can toggle the program back to
       its default behavior after a previous ``-c`` on the command
       line.

       This option is observed both for literal and regular
       expression-based renaming (``-x``).

  -c   Collapse case when doing string substitution.

        (*Default*: Search for string to replace is case sensitive)

       When looking for a match on the old string to replace,
       **tren** will ignore the case of the characters found
       in the filename.  For example::

         tren.py -c -r Old=NEW Cold.txt fOlD.txt

       This renames both files to ``CNEW.txt`` and ``fNEW.txt``
       respectively.  Notice that the new (replacement) string's case
       is preserved.

       This option is observed both for literal and regular
       expression-based renaming (``-x``). 


  -d   Dump debugging information

        (*Default*: Off)

       Dumps all manner of information about **tren** internals - of
       interest only to program developers and maintainers.  This
       option provides internal program state *at the time it is
       encountered on the command line*.  For maximum debug output,
       place this as the last (rightmost) option on the command line,
       right before the list of files and directories to rename.  You
       can also place multiple ``-d`` options on the command line to
       see how the internal tables of the program change as various
       options are parsed.

  -f   Force renaming even if target file or directory name already
       exists.

       (*Default*: Skip renaming if a file or directory already
       exists by the same name as the target.)

       By default, **tren** will not rename something to a name that
       is already in use by another file or directory.  This option
       forces the renaming to take place.  However, the old file or
       directory is not lost.  It is merely renamed itself first, by
       appending a suffix to the original file name. (*Default*:
       .backup, but you can change it via the ``-S`` option.)  This
       way even forced renames don't clobber existing files or
       directories.

  -h   Print help information.


  -I file  "Include"  command line arguments from ``file`` 

       It is possible to perform multiple renaming operations in one
       step using more than one ``-r`` option on the **tren** command
       line.  However, this can make the command line very long and
       hard to read.  This is especially true if the renaming strings
       are complex, contain regular expressions or Renaming
       Tokens, or if you make heavy use of command line toggles.

       The ``-I`` option allows you to place any command line
       arguments in a separate *file* in place of, or in addition to,
       the **tren** command line and/or the ``$TREN`` environment
       variable.  This file is read one line at a time and the
       contents appended to any existing command line.  You can even
       name the files you want renamed in the file, but they must
       appear as the last lines of that file (because they must appear
       last on the command line).

       Whitespace is ignored as is anything from a ``#`` to the end of
       a line::

         # Example replacement string file
         # Each line appended sequentially
         # to the command line

         -xr t[ext]+=txt     # Appended first
         -X
         -r =/MYEAR/ -r foo=bar 
         my.file
         your.file          # Appended last
 

       You may "nest" includes.  That is, you can include file ``x``,
       that includes file ``y``, that includes file ``z`` and so on.
       However, it's easy to introduce a "circular reference" when you
       do this.  Suppose file ``z`` tried to include file ``x`` in
       this example: you'd be specifying an infinite inclusion loop.
       To avoid this, **tren** limits the total number of inclusions
       to 1000.  If you exceed this, you'll get an error message and
       the program will terminate.

       Note that wildcard metacharacters like ``*`` and ``?`` that are
       embedded in filenames included this way are expanded as they
       would be from the command shell.

  -i instances  Specifies which "instances" of matching strings should
                be replaced.
                (*Default*: 0 or leftmost)

                A file may have multiple instances of the "old"
                renaming string in it.  The ``-i`` option lets you
                specify which of these (one, several, all) you'd
                like to have replaced.

                Suppose you have a file called
                ``foo1-foo2-foo3.foo4``.  The leftmost ``foo`` is
                instance 0.  The rightmost ``foo`` is instance 3.
                You can also refer to instances relative to the
                right.  So the -1 instance is the last (rightmost),
                -2, second from the last, and so forth.

                Often, you just want to replace a specific instance::

                  -i 3 -r foo=boo
                  -i -1 -r foo=boo

                Both of these refer to the last instance of old string
                ``foo`` (found at ``foo4`` in our example name). 

                Sometimes, you'd like to replace a whole *range* of
                instances.  An "instance range" is specified using the
                ``:`` separator in the form::

                  -i first-to-replace:stop-here


                Notice that the "stop-here" instance is NOT replaced.
                In our string above, the option::

                  -i 1:-1 -r foo=boo

                Would change the file name to::

                  foo1-boo2-boo3.foo4

                You can also provide partial ranges::

                  -i 1: # From instance 1 to end of name

                  -i :-2 # All instances up to (not including) next-to-last

                  -i :   # All instances
 
  -P char   Use ``char`` as the escape symbol.
            (*Default*: ``\``)

  -q   Quiet mode, do not show progress.

        (*Default*: Display progress)

       Ordinarily, **tren** displays what it is doing as it processes
       each file.  If you prefer to not see this "noisy" output, use
       the ``-q`` option.  Note that this does not suppress warning
       and error messages.  

       It doesn't make much sense to use this option in test mode
       (``-t``), although you can.  The whole point of test mode is
       to see what would happen.  Using the quiet mode suppresses that
       output.


  -R char  Use ``char`` as the separator symbol in renaming
           specifications.
           (*Default*: ``=``)
            

  -r <old=new>   Replace ``old`` with ``new`` in file or directory
                 names.

                 Use this option to specify which strings you want to
                 replace in each file name. These strings are treated
                 literally unless you also invoke the ``-x`` option.  In
                 that case, ``old`` is treated as a Python style
                 regular expression.

                 Both ``old`` and ``new`` may optionally contain
                 *renaming tokens* described later in this document.

                 If you need to use the ``=`` symbol *within* either
                 the old or new string, simply escape it: ``\=``

                 If it is convenient, you can change the separator
                 character to something other than ``=`` via the
                 ``-R`` option.  Similarly, you can change the
                 escape character via the ``-P`` option.

                 You can have multiple instances of this option on
                 your **tren** command line::

                   tren.py -r old=new -r txt=doc old-old.txt

                 This renames the file to::

                   new-old.doc
               
                 Remember that, by default, **tren** only replaces the first
                 (leftmost) instance of the old string with the new.

                 Each rename specification on the command line
                 "remembers" the current state of all the program
                 options and acts accordingly.  For example::

                   tren.py -cr A=bb -Cr B=cc ...

                 The ``A=bb`` replacement would be done without
                 regard to case (both ``A`` and ``a`` would match),
                 whereas the ``B=cc`` request would only replace
                 ``B``.

  -S suffix   Suffix to append when making backup copies of existing 
              targets.

              (*Default*: .backup)

              If you choose to force renaming of files when the new
              name already exists (``-f``), **tren** simply renames
              the existing file or directory by appending a suffix to
              it.  By default, this suffix is ``.backup``, but you
              can change it to any string you like with the ``-S``
              option.

  -t   Test mode, don't rename, just show what the program *would* do.

       **tren** is very powerful and capable of doing nasty things to
       your file and directory names.  For this reason, it is helpful
       to test your **tren** commands before actually using them.
       With this option enabled, **tren** will print out diagnostic
       information about what your command *would* do, *without
       actually doing it*.

       If your renaming requests contain random renaming tokens,
       test mode will only show you an approximation of the renaming
       to take place (because new random name strings are generated
       each time the program runs).

  -v   Print detailed program version information and keep running.

       This is handy if you're capturing **tren** output into a log
       and you want a record of what version of the program was used.

  -w length  Set the length of diagnostic and error output.

             (*Default*: 80)

             **tren** limits output to this length when dumping
             debug information, errors, warnings, and general
             information as it runs.  This option is especially
             useful when you're capturing **tren** output into
             a log and don't want lines wrapped::

               tren.py -w999 ..... > tren.log 2>&1

             **tren** makes sure you don't set this to some
             unreasonably small value such that output formatting
             would be impossible.
             

  -X   Treat the renaming strings literally

         (*Default*: This is the program default)

       This option is provided so you can toggle the program back to
       its default behavior after a previous ``-x`` on the command
       line.

  -x   Treat the old string in a ``-r`` replacement as a Python
       style regular expression for matching purposes.

        (*Default*: Treat the old string as literal text)



TUTORIAL AND DESCRIPTION
------------------------

.. WARNING:: ONE MORE TIME: **tren** is a powerful file and directory
             renaming tool.  Be **sure** you know what you're about to
             do.  If you're not, run the program in test mode (invoke
             with the ``-t`` option) to see what would happen.  You
             have been warned!

The following sections are designed for the new or occasional
**tren** user.  They begin with the simplest of **tren** operations
and incrementally build more and more complex examples, eventually
describing all of **tren**'s capabilities.


Overview
========

**tren** is a general purpose file and directory renaming tool. Unlike
commands like ``mv``, **tren** is particularly well suited for
renaming *batches* of files and/or directories with a single command
line invocation.  **tren** eliminates the tedium of having to script
simpler tools to provide higher-level renaming capabilities. 

**tren** is also adept at renaming only *part of an existing file
or directory name* either based on a literal string or a regular
expression pattern.  You can replace any single, group, or all
instances of a given string in a file or directory name.

**tren** implements the idea of a "renaming token".  These are special
names you can embed in your renaming requests that represent things
like the file's original name, its length, date of creation, and so
on.  There are even renaming tokens that will substitute the content
of any environment variable or the results of running a program from a
shell back into the new file name.

**tren** can automatically generate *sequences* of file names based on
their dates, lengths, times within a given date, and so on.  In fact,
sequences can be generated on the basis of any of the file's
``stat`` information.  Sequence "numbers" can be ascending or
descending and the count can start at any initial value.  Counting can
take place in one of several internally defined counting "alphabets"
(decimal, hex, octal, alpha, etc.) OR you can define your own counting
alphabet.  This allows you to create sequences in any base (2 or
higher please :) using any symbol set for the count.
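
To make the idea of a counting alphabet concrete, here is a small
sketch of positional counting with an arbitrary symbol set.  It is not
taken from **tren** itself; the function name ``to_alphabet`` is made
up for this example.  It uses the ``abcde`` character set from the
``-A`` option in the reference section, where the leftmost character
is the "0th" digit and the base is the length of the set.

```python
def to_alphabet(n, characterset):
    """Render non-negative integer n using characterset as the digits."""
    if n < 0:
        raise ValueError("n must be non-negative")
    base = len(characterset)
    if base < 2:
        raise ValueError("an alphabet needs at least two characters")
    digits = []
    while True:
        n, remainder = divmod(n, base)
        digits.append(characterset[remainder])  # least significant first
        if n == 0:
            break
    return "".join(reversed(digits))


# Counting in base 5 with the symbols a..e
print([to_alphabet(i, "abcde") for i in range(7)])
# ['a', 'b', 'c', 'd', 'e', 'ba', 'bb']
```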


A Word About Program Defaults
=============================

**tren** has many options, but its defaults are designed to do 
two things:  a) Simplify the most common operations by making
them the default (no options required on the command line), and
b) Reduce the risk of accidentally modifying more of the file name
than you intended.  So, by default:

  **tren** treats renaming requests *literally*.  That is, the "old
  string" you specify for replacement is treated as literal text.  It
  requires a command line option (``-x``) to treat it as a regular
  expression.  *However*, any renaming tokens found in either the old-
  or new strings of a renaming request *are* interpreted before the
  renaming takes place.

  **tren** renaming is *case sensitive*. If you want to ignore case,
  use the ``-c`` option.

  **tren** will only replace the *first (leftmost) instance* of "old
  string" with "new string".  If you want more or different instances
  replaced, use the ``-i`` option.

  **tren** will not allow you to rename a file or directory *if one
  with the new name already exists*. Such attempts will cause no
  change to the file or directory being processed and an error message
  will be displayed.  This is intentional to force you to manually
  rename or remove the file or directory that would have been
  clobbered by a rename.  You can override this default and *force* a
  renaming via the ``-f`` option.  This will cause the existing file or
  directory that already has the new name to be renamed first, with a
  ``.backup`` suffix.  You can change this suffix via the ``-S`` option.
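
The last of these defaults, and its ``-f`` override, can be sketched in
a few lines of Python.  This is an illustration of the behavior
described above, not **tren**'s implementation; the function name
``rename_with_backup`` and its parameters are hypothetical.

```python
import os
import tempfile


def rename_with_backup(old_path, new_path, force=False, suffix=".backup"):
    """Rename old_path to new_path; return True if the rename happened.

    Sketch only: if new_path exists, skip unless force is set, in which
    case the existing entry is first moved aside to new_path + suffix.
    """
    if os.path.exists(new_path):
        if not force:
            return False          # default: refuse to clobber
        os.rename(new_path, new_path + suffix)
    os.rename(old_path, new_path)
    return True


with tempfile.TemporaryDirectory() as d:
    x, y = os.path.join(d, "x"), os.path.join(d, "y")
    for path in (x, y):
        with open(path, "w") as handle:
            handle.write(os.path.basename(path))
    print(rename_with_backup(x, y))              # False: "y" already exists
    print(rename_with_backup(x, y, force=True))  # True
    print(sorted(os.listdir(d)))                 # ['y', 'y.backup']
```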


Getting Help
============

There are three command line options that can give you some
measure of help and information about using **tren**:

  -d   Dumps debug information out to stderr.  You can
       insert multiple instances of this option on the
       command line to see how the program has parsed
       everything *to the left* of it.  This is primarily
       intended as a debugging tool for people maintaining
       **tren** but it does provide considerable information
       on the internal state of the program that advanced
       users may find useful.

  -h   Prints a summary of the program invocation syntax
       and all the available options and then exits.

  -v   Prints the program version number and keeps running.


Controlling Program Output
==========================

As **tren** runs, it produces a variety of diagnostic and
status information.  There are a number of options you can
use to control how this works:

  -q    Sets "quiet" mode and suppresses everything except
        warning and error messages.

  -w  #  Tells **tren** to wrap lines after ``#`` characters have been
         printed.  If you're capturing output to a log, set this to a
         very high number like 999 to inhibit line wrapping.

Error and debug messages are sent to ``stderr``.  Normal informational
messages are sent to ``stdout``.  If you want to capture them both in
a log, try something like this (depending on your OS and/or shell)::

  tren.py ..... >tren.log 2>&1

Managing Complexity
===================

As you learn more of the program features, the **tren** command line
can get long, complex, and easy to goof up.  It's also hard to
remember all the various options, how they work exactly, and which
specific one you need.  For this reason, it is *highly* recommended
that - once you have a renaming request working the way you like - if
you plan to use it again, save it as an "include" file.  That
way you can reuse it easily without having to keep track of the
details over and over.  Instead of this::

  tren.py -c -i -1 -r .jpeg=.jpg file ...

Do this::

  tren.py -I jpeg-to-jpg.tren file...

What's in the ``jpeg-to-jpg.tren`` file?  Just this::

  # tren Command Line
  # Converts '.jpeg' (in any case mixture) file name suffix to '.jpg'

  # Make the replacement case-insensitive
  -c   # Reset this later on the command line with -C

  # Only replace the rightmost instance
  -i -1

  # The actual replacement request
  -r  .jpeg=.jpg


Notice that you can stick comments in the file anywhere you like and
that they begin with ``#``.  Notice also that the various options
can be entered on separate lines so it's simpler to read the include
file.  If you find it useful, you can even include other include
files *in* an include file::

  # Get the jpeg -> jpg suffix renaming
  -I jpeg-to-jpg.tren

  # Let's make it fancy

  -i -1 -r .jpg=.fancy.jpg

If you do this, take care not to create a circular include.  This can
happen when an include file tries to include itself, either directly,
or via another include file.  **tren** limits the total number of
includes to a very large number.  If it sees that the number has been
exceeded, it suspects a circular include and will issue an error
message to that effect and exit.

You can insert include options anywhere you like on the command line
and you can have as many as you like (up to a VERY large number you'll
never hit in practice).  Each include reference will be replaced with
the contents of that file *at the position it appears on the command
line*.
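
The splicing behavior described here can be sketched as follows.  This
is a simplified model under several assumptions, not **tren**'s own
code: the function name ``expand_includes`` and the ``read_file``
callable are hypothetical, only the separated form ``-I name`` is
handled, and ``shlex`` is used to drop ``#`` comments (it also applies
shell-style quoting and backslash rules that **tren** may not share).
The limit of 1000 is the figure quoted in the reference section.

```python
import shlex

MAX_INCLUDES = 1000   # limit quoted in the reference section


def expand_includes(args, read_file):
    """Replace each "-I name" pair with the words found in that file.

    read_file is a hypothetical callable mapping a file name to its text,
    so the sketch can run without touching the disk.
    """
    result, pending, count = [], list(args), 0
    while pending:
        word = pending.pop(0)
        if word == "-I" and pending:
            count += 1
            if count > MAX_INCLUDES:
                raise RuntimeError("too many includes; circular reference?")
            text = read_file(pending.pop(0))
            # comments=True drops everything from '#' to the end of a line.
            included = shlex.split(text, comments=True)
            pending = included + pending   # splice in at the same position
        else:
            result.append(word)
    return result


files = {
    "jpeg-to-jpg.tren": "-c   # ignore case\n-i -1\n-r .jpeg=.jpg\n",
    "loop.tren": "-I loop.tren\n",     # includes itself
}
print(expand_includes(["-I", "jpeg-to-jpg.tren", "a.JPEG"], files.__getitem__))
```

Passing ``["-I", "loop.tren"]`` to the same function raises the
``RuntimeError`` once the include counter passes the limit, which is
the kind of safeguard the text describes.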

If you find yourself using certain options most or every time you use
the program, you can put them in the **$TREN** environment variable.
**tren** picks this up every time it starts.  This minimizes errors
and reduces typing tedium.  Just keep in mind that some options can be
overridden later on a command line, and some cannot.  For instance,
suppose you do this::

  export TREN="-f -c"

The ``-c`` option to ignore case can be undone on the command line
with a ``-C`` option.  However, the ``-f`` option cannot be undone.

So ... choose the options you want to make permanent in the
environment variable wisely.


Renaming Basics
===============

**tren** supports a variety of renaming mechanisms.  The one thing
they have in common is that they're built with one or more *renaming
requests* that will be applied to one or more file- or directory
names.  Renaming requests look like this on the **tren** command
line::

  tren.py ... -r old=new ... -r old=new ... list of files/directories

No matter how complicated they look, the basic logic of the
renaming request stays the same: "When you find the string
``old`` in the file- or directory name, change it to the string
``new``."

The ``old`` and ``new`` renaming strings are built using a variety of
building blocks:

  =============================   =============================
  *Old Strings Are Built With:*   *New Strings Are Built With:*
  =============================   =============================
  Literal Text                    Literal Text
  Regular Expressions             Renaming Tokens
  Renaming Tokens
  =============================   =============================


You can use any of these building blocks alone or combine them
to create expressive and powerful renaming schemes.
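
Before any of these building blocks come into play, a renaming request
has to be split into its ``old`` and ``new`` halves.  The reference
section says the separator defaults to ``=`` (changeable with ``-R``)
and can be escaped with ``\`` (changeable with ``-P``).  A minimal
sketch of that split follows; the function name ``split_request`` is
hypothetical, and how **tren** treats other escapes or a second
unescaped separator is not covered by this document, so the sketch
simply keeps a later unescaped separator as literal text.

```python
def split_request(spec, sep="=", esc="\\"):
    """Split "old<sep>new" at the first unescaped separator.

    An escaped separator (esc followed by sep) is kept as a literal sep.
    Hypothetical helper; tren's own handling of other escapes may differ.
    """
    old, i = [], 0
    while i < len(spec):
        if spec.startswith(esc + sep, i):
            old.append(sep)          # escaped separator becomes literal
            i += len(esc) + len(sep)
        elif spec.startswith(sep, i):
            new = spec[i + len(sep):].replace(esc + sep, sep)
            return "".join(old), new
        else:
            old.append(spec[i])
            i += 1
    raise ValueError("no separator found in %r" % spec)


print(split_request("old=new"))            # ('old', 'new')
print(split_request(r"a\=b=c"))            # ('a=b', 'c')
print(split_request("txt:doc", sep=":"))   # ('txt', 'doc')
```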


Literal String Substitution
===========================

Literal String Substitution is just that - it replaces one literal
string with another to rename the target file or directory.  This is
the most common, and simplest way to use **tren**.  This is handy when
you have files and directories that have a common set of characters in
them you'd like to change.  For instance::

  tren.py -r .Jpeg=.jpg *.Jpeg

This would rename all files (or directories) whose names contained the
string ``.Jpeg``, replacing it with ``.jpg``.  Well ... that's
not quite right.  Unless you specify otherwise with the ``-i``
option, *only the first (leftmost) instance* of ``old`` is replaced
with ``new``.  So, for example, if you started out with the file
``My.Jpeg.Jpeg`` and ran the command above, you'd end up with a
new file name of ``My.jpg.Jpeg``.
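
If you think in Python terms, the default behavior resembles a literal
replacement limited to one occurrence.  The sketch below is an analogy
rather than **tren**'s code; the function name ``replace_first`` and
its ``ignore_case`` parameter (standing in for the ``-c`` option) are
made up for this example.

```python
import re


def replace_first(name, old, new, ignore_case=False):
    """Replace only the leftmost literal occurrence of old in name."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape makes old a literal; the lambda keeps new literal too.
    return re.sub(re.escape(old), lambda m: new, name, count=1, flags=flags)


print(replace_first("My.Jpeg.Jpeg", ".Jpeg", ".jpg"))               # My.jpg.Jpeg
print(replace_first("Cold.txt", "Old", "NEW", ignore_case=True))    # CNEW.txt
```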


Substitution Instances
======================

As we just saw above, sometimes the "old" string appears in several
places in a file- or directory name.  By default, **tren** only
replaces the first, or leftmost "instance" of an "old" string.
However, using the ``-i`` option you can specify *any* instance you'd
like to replace.  In fact, you can even specify a *range* of instances
to replace.

Instances are nothing more than *numbers* that tell **tren** just
where in the name you'd like the replacement to take place.
Non-negative numbers mean we're counting instances from the *left* end
of the name.  The leftmost instance is 0 (not 1!!!).

You can also count *backwards* from the right end of the string using
negative numbers.  -1 means the last instance, -2 means next-to-last,
and so on.  In summary, counting from the left starts at zero and
counting from the right starts at -1.

Suppose you have a file called::

  foo1-foo2-foo3.foo4

The leftmost ``foo1`` is instance 0 of old string ``foo``.  It is also
instance -4.  The rightmost ``foo4`` is instance 3 of old string
``foo``, and also instance -1.  


You can specify a *single instance* (other than the default leftmost)
to replace::

  tren.py -i 1 -r f=b foo1-foo2-foo3.foo4    # New name:  foo1-boo2-foo3.foo4

  tren.py -i -1 -r f=b foo1-foo2-foo3.foo4   # New Name: foo1-foo2-foo3.boo4


You can also specify a *range of instances* to replace using the 
notation::

   -i first-to-replace:stop-here

All instances from the "first-to-replace" up to, *but NOT including*,
the "stop-here" instance are replaced::

  tren.py -i 1:3 -r f=b foo1-foo2-foo3.foo4   # New Name: foo1-boo2-boo3.foo4

  tren.py -i -4:-2 -r f=b foo1-foo2-foo3.foo4 # New Name: boo1-boo2-foo3.foo4

``-i :`` means "replace *all* instances"::

  tren.py -i : -r f=b foo1-foo2-foo3.foo4     # New Name: boo1-boo2-boo3.boo4

You can also use *partial range specifications*::

  tren.py -i 1: -r f=b foo1-foo2-foo3.foo4   # New Name: foo1-boo2-boo3.boo4

  tren.py -i :-2 -r f=b foo1-foo2-foo3.foo4 # New Name: boo1-boo2-foo3.foo4


Note that you cannot specify individual, non-adjacent instances.
There is no way to use a single renaming request to replace, say,
only the 2nd and the 4th instances of an "old" string.  Doing that
requires two renaming requests.  The good news is that we can do them
both on a single **tren** invocation.
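
The instance notation behaves much like Python list indexing and
slicing applied to the list of places where the old string occurs.
The sketch below models that idea; it is not **tren**'s source, the
function name ``replace_instances`` is hypothetical, and it assumes
occurrences are counted without overlap.

```python
def replace_instances(name, old, new, instances="0"):
    """Replace selected occurrences of old in name.

    instances mirrors the -i argument: "N" for one instance or
    "first:stop" for a range; negative numbers count from the right.
    """
    if not old:
        raise ValueError("old string must not be empty")
    # Start offsets of every non-overlapping occurrence, left to right.
    starts, pos = [], name.find(old)
    while pos != -1:
        starts.append(pos)
        pos = name.find(old, pos + len(old))

    if ":" in instances:
        first, stop = instances.split(":")
        chosen = starts[slice(int(first) if first else None,
                              int(stop) if stop else None)]
    else:
        try:
            chosen = [starts[int(instances)]]
        except IndexError:
            chosen = []               # no such instance: change nothing
    # Replace right to left so earlier offsets stay valid.
    for start in reversed(chosen):
        name = name[:start] + new + name[start + len(old):]
    return name


for spec in ("1", "-1", "1:3", "-4:-2", ":", "1:", ":-2"):
    print(spec, replace_instances("foo1-foo2-foo3.foo4", "f", "b", spec))
```

Running the loop reproduces the new names shown in the examples of
this section.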


Multiple Substitutions
======================

You can put as many renaming requests on a **tren** command line as
you like .... well, up to the length limit imposed by your operating
system and shell, anyway.  As we just saw, this can be handy when
a single renaming request can't quite do everything we want.

BUT ... there's a catch.  In designing your renaming requests,
you have to keep in mind that **tren** processes the command
line *from left to right*, incrementally constructing the new name
as it goes.  For instance::

  tren.py -r foo=bar -r foo=baz  foo1-foo2-foo3.foo4

Produces ... wait a second ... why on earth are there two renaming
requests with identical "old" strings on the same command line?
Shouldn't this produce a final name of ``baz1-foo2-foo3.foo4``?

Nope.  After the leftmost renaming request has been processed,
the new name is ``bar1-foo2-foo3.foo4``.  Remember that, by
default, **tren** only replaces the *leftmost* or 0th instance
of an "old" string.  So, when the second renaming request is
processed, the instance 0 of ``foo`` is now found in the
string ``foo2``.  So, the final name will be, ``bar1-baz2-foo3.foo4``.

The lesson to learn from this is that multiple renaming requests
on the command line will work fine, but you have to do one of
two things (or both):

  1) Make sure you're tracking what the "intermediate" names
     will look like as the new file name is being constructed,
     renaming request by renaming request.

  2) Make sure the renaming requests operate on completely 
     disjoint parts of the file name.

.. NOTE:: Similarly, **tren** remembers the last state of each option
          as you move from left to right on the command line. For instance::

            tren.py -i1 -r f=F -r o=O foo1-foo2-foo3.foo4

          You might be tempted to believe that this would produce,
          ``fOo1-Foo2-foo3.foo4``, but it doesn't.  It produces,
          ``foO1-Foo2-foo3.foo4`` instead because the ``-i 1`` appears
          prior to *both* renaming requests and thus applies to each
          of them.  If you want the first instance of "o" to be
          replaced, you need a command line like this::

            tren.py -i1 -r f=F -i0 -r o=O foo1-foo2-foo3.foo4

          This sort of thing is generally true for *all* options, so
          be sure they're set the way you want them to the left of a
          renaming request.

As a practical matter, this can get really complicated to track.  If
in doubt, it's always better to run two separate **tren** commands in,
say, a shell script to make the renaming explicit, rather than to
obscure things with clever command line trickery.

So, let's go back to our example from the previous section.  We
want to replace the 2nd and 4th instances of the string "foo"
in our file name.  We do this with two renaming requests on the
same command line, considering what each one does to the name
as it is encountered::

  tren.py -i1 -r foo=bar -i2 -r foo=bar foo1-foo2-foo3.foo4
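
To see why instance 2 (and not 3) is the right choice for the second
request, it helps to trace the intermediate name.  The short sketch
below simulates the two requests one after the other; the helper
``replace_nth`` is hypothetical and only models literal, left-counted
instances.

```python
def replace_nth(name, old, new, n=0):
    """Replace the n-th (0-based) occurrence of old; no-op if absent."""
    start = 0
    for _ in range(n + 1):
        pos = name.find(old, start)
        if pos == -1:
            return name
        start = pos + len(old)
    return name[:pos] + new + name[pos + len(old):]


name = "foo1-foo2-foo3.foo4"
for instance in (1, 2):                  # -i1 -r foo=bar -i2 -r foo=bar
    name = replace_nth(name, "foo", "bar", instance)
    print(name)
# foo1-bar2-foo3.foo4
# foo1-bar2-foo3.bar4
```

After the first request, only three instances of ``foo`` remain, so
the original 4th instance has become instance 2 of the intermediate
name.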


More About Command Lines
========================

As we just saw, you can get surprising results as you process
the command line from left to right, as **tren** works its way 
through the various renaming requests.  There are other potential
pitfalls here, so it's helpful to understand just *how* **tren**
processes your command line, step-by-step:

  1) Prepend the contents of ``$TREN`` to the user-provided command line.

       This allows you to configure your own default set of options
       so you don't have to type them in every time.

  2) Resolve all references to include files.

       This has to be done before anything that follows, because
       include files add options to the command line.

  3) Build a table of every file name to be renamed.

       We'll need this information if any of the renaming
       requests use renaming tokens.

  4) Build a table containing each renaming request, storing
     the current state of every program option at that
     point on the command line.

       This allows **tren** to apply options differently
       to different renaming requests on the same command
       line.  This came in handy in our example of the
       previous section.

  5) Process each file found on the command line in
     left to right order, applying each renaming
     request, again as it appeared from left to right on
     the command line.
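
Step 4 is the one that makes per-request options possible.  A minimal
sketch of the idea follows.  It is not **tren**'s actual
implementation: the function name, the pre-parsed ``(flag, value)``
input, and the single ``instance`` option are all assumptions made for
illustration.

```python
def build_request_table(parsed_args):
    """Pair every -r request with a snapshot of the options in force.

    parsed_args is a list of (flag, value) tuples, already split out of
    the command line (a simplification for this sketch).
    """
    options = {"instance": 0}  # hypothetical default state
    table = []
    for flag, value in parsed_args:
        if flag == "-i":
            options["instance"] = int(value)
        elif flag == "-r":
            old, new = value.split("=", 1)
            # Copy the options so later changes do not affect this request.
            table.append({"old": old, "new": new, "options": dict(options)})
    return table


table = build_request_table(
    [("-i", "1"), ("-r", "f=F"), ("-i", "0"), ("-r", "o=O")]
)
```

Because each entry carries its own copy of the option state, the
``-i0`` that appears later on the command line changes only the
second request.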

Simple eh?  Well, mostly it is ... until it isn't.  As we
just saw, incrementally building up a new name with multiple
renaming requests can produce unexpected results and we have
to plan for them.  

Similarly, you can inadvertently rename *the wrong file* ... this is
usually a Bad Thing.  Say you have two files, ``x`` and ``y``.
You want to rename ``x`` to ``y`` and ``y`` to ``Y``.  Well, 
order matters here.  Say you do this::

  tren.py -frx=y -ry=Y x y

**tren** tries to rename ``x`` to ``y`` and spots the fact that ``y`` already
exists, so it makes a backup, ``y.backup``, and renames ``x`` to ``y`` as
requested.  TO BE CONTINUED



Forcing Renaming
================

Ignoring Case
=============

The Strange Case Of Mac OS X And Windows
========================================

Mac OS X and Windows have an "interesting" property that makes case
handling a bit tricky.  Both of these operating systems *preserve*
case in file and directory names, but they do not *observe* it.  (It
is possible to change this behavior in OS X when you first prepare a
drive, and make the filesystem case sensitive.  This is rarely done in
practice, however.)

These OSs show upper- and lowercase in file names as you request,
but they do not *distinguish* names on the basis of case.  For
instance, the files ``Foo``, ``foo``, and ``FOO``, are all the
same name in these operating systems, and only one of these can exist
in a given directory.  This can cause **tren** to do the unexpected
when your renaming command is doing nothing more than changing case.
Suppose you start with a file called ``Aa.txt`` and run this
command::

  tren.py -rA=a Aa.txt

**tren** will immediately complain and tell you that the file
``aa.txt`` already exists and it is skipping the renaming.  Why?
Because from the point-of-view of OS X or Windows, ``aa.txt`` (your
new file name) is the same as ``Aa.txt`` (your original file name).
You can attempt to force the renaming::

  tren.py -frA=a Aa.txt

Guess what happens?  Since **tren** thinks the new file name already
exists, it backs it up to ``aa.txt.backup``.  But now, when it goes
to rename the original file ... the file is *gone* (thanks to the
backup renaming operation)!  **tren** declares an error and
terminates.

This is not a limitation of **tren** but a consequence of a silly
design decision in these two operating systems.  As a practical
matter, the way to avoid this issue is to never do a renaming
operation in OS X or Windows *that only converts case*.  Try
to include some other change to the filename to keep the
distinction between "old name" and "new name" clear to the
OS.  In the worst case, you'll have to resort to something like::

 tren.py -rA=X Aa.txt
 tren.py -rX=a Xa.txt
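
The same two-step idea can be expressed directly in Python.  This is
a sketch only, unrelated to **tren**'s internals: the function name
and the ``marker`` suffix are made up for the example, and it assumes
no file with the intermediate name already exists.

```python
import os
import tempfile


def rename_via_intermediate(directory, old_name, new_name, marker=".tmp-rename"):
    """Rename in two steps so a case-only change never collides with itself."""
    source = os.path.join(directory, old_name)
    temporary = os.path.join(directory, old_name + marker)
    os.rename(source, temporary)  # step 1: move to a clearly different name
    os.rename(temporary, os.path.join(directory, new_name))  # step 2: final name


# Demonstration in a scratch directory.
workdir = tempfile.mkdtemp()
open(os.path.join(workdir, "Aa.txt"), "w").close()
rename_via_intermediate(workdir, "Aa.txt", "aa.txt")
listing = os.listdir(workdir)
```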



Using Regular Expressions
=========================

Ordinarily **tren** treats the old string you specify with the
``-r`` option *literally*.  However, it is sometimes handy to be able
to write a regular expression to specify what you want replaced.  If
you specify the ``-x`` option, **tren** will treat your old string as
a Python style regex, compile it (or try to anyway!) and use it to
select which strings to replace.  This makes it much easier to rename
files that have repeated characters or patterns, and groups of files
that have similar, but not identical, strings in their names you'd like
to replace.

Say you have a set of files that are similar, but not identical, in
name, and you want to rename them all::

  sbbs-1.txt
  sbbbs-2.txt
  sbbbbbbbbs-3.txt

Suppose you want to rename them, replacing two or more instances of
``b`` with ``X``. It is tedious to have to write a separate literal
``-r old=new`` string substitution for each instance above.  This is
where regular expressions can come in handy.  When you invoke the
``-x`` option, **tren** understands this to mean that the ``old``
portion of the replacement option is to be treated as a *Python style
regular expression*.  That way, a single string can be used to match
many cases::
 
  tren.py -x -r bb+=X *.txt

This renames the files to::

  sXs-1.txt
  sXs-2.txt
  sXs-3.txt
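
Since the old string is a Python style regular expression, the effect
can be previewed with Python's own ``re`` module.  This sketch only
mirrors the substitution shown above; it says nothing about how
**tren** applies the pattern internally.

```python
import re

names = ["sbbs-1.txt", "sbbbs-2.txt", "sbbbbbbbbs-3.txt"]

# "bb+" matches a run of two or more "b" characters.
renamed = [re.sub(r"bb+", "X", name) for name in names]
```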

Keep in mind that a literal string is a subset of a regular
expression.  This effectively means that with ``-x`` processing
enabled you can include *both* regular expressions and literal text in
your "old string" specification.  The only requirement is that the
string taken as a whole must be a valid Python regular expression.  If
it is not, **tren** will display an error message to that effect.

Because Python regular expressions can make use of the ``=`` symbol,
you need a way to distinguish between an ``=`` used in a regular
expression and the same symbol used to separate the old and new
operands for the ``-r`` option.  Where this symbol needs to appear in
a regular expression, it has to be escaped like this: ``\=``.
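
One way to picture the escaping rule is a splitter that stops at the
first *unescaped* separator.  The sketch below is an assumption about
how such parsing could work, not a description of **tren**'s parser;
``split_request`` and its parameters are hypothetical, and the new
string is returned unprocessed.

```python
def split_request(request, separator="=", escape="\\"):
    """Split 'old=new' at the first separator not preceded by the escape."""
    old = []
    i = 0
    while i < len(request):
        char = request[i]
        if char == escape and request[i + 1:i + 2] == separator:
            old.append(separator)  # escaped separator is literal text
            i += 2
        elif char == separator:
            return "".join(old), request[i + 1:]
        else:
            old.append(char)
            i += 1
    raise ValueError("no unescaped separator found")


plain = split_request("bb+=X")
escaped = split_request(r"a\=b=c")
```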

Regular expression processing is unaffected by the ``-g / -1`` (global
rename) and ``-c / -C`` (ignore case) options.  That's because there
are regular expression mechanisms for achieving the same thing.  More
importantly, if you've selected regular expression matching, it's
probably because you want very fine grained control of the renaming
defined by the regex.  In short, regular expression matching always
takes place on the *original characters* of the target portion of the
name and does replacement as called for in the regex itself.


Changing The Renaming Separator & Escape Characters
===================================================

There may be times when the default renaming separator (``=``)
and/or escape character (``\``) make it clumsy to construct a
renaming request.  This can happen if, say, either the old- or new
string in a literal renaming needs to use the ``=`` symbol many
times.  Another case where this may be helpful is when constructing
complex regular expressions that need to make use of these characters.

The ``-R`` and ``-P`` options can be used to change the characters
used for the renaming separator and escape character, respectively.  You
can use any character you like (these must be a single character
each), but bear in mind that the underlying operating system
understands certain characters as being special.  Trying to use them
here will undoubtedly deeply confuse your command shell, and possibly
your file system.  For example, the ``/`` character is used as a
path separator in Unix-derived systems.  It's therefore a Really Bad
Idea to try and use it as a renaming separator or escape character.


Interactive Renaming
====================

An Overview Of Renaming Tokens
==============================

**tren** implements the notion of *renaming tokens*.  These can
appear in either the ``old`` or ``new`` string components of a ``-r``
renaming argument.

It is sometimes useful to be able to take a group of files or
directories whose names have nothing in common and impose a common
naming scheme on them. Another use for renaming tokens is to do the
renaming based on some property the file or directory possesses like
its creation date, size, owner's name, and so on.

In their simplest form, renaming tokens are nothing more than
"canned" information **tren** knows about a particular file or
directory.  For instance, if you insert the ``/D/`` token into an
old- or new string definition, **tren** will replace it with *the
creation date of the file or directory being renamed* and use that
string in the renaming process.

There are also tokens that allow you to use system information in your
renaming strings.  Finally, there are tokens that can be used to
automatically renumber or sequence (order) a set of files or
directories being renamed.

For example, suppose you and your friends pool your vacation photos
but each of your cameras uses a slightly different naming scheme.  You
might want to just reorder them by the date and time each picture was
taken, for example.  That way you end up with one coherent set of
named and numbered files.  You might start with something like this::

  DSC002.jpg      # Bob's camera,  taken 1-5-2010 at noon
  dc0234.Jpg      # Mary's camera, taken 1-5-2010 at 8am
  032344.jpeg     # Sid's camera,  taken 1-3-2010 at 4pm

It would be nice to get these in order somehow::

  tren.py -r =/D/-MyVacation-/+T0001/.jpeg *.jp*

This would rename all the files in the current directory ending with
``.jp*``.  The ``/D/`` would be replaced with the *date* the picture
was taken. The ``/+T0001/`` refers to a *starting sequence number* to
uniquely identify pictures taken on the same date.  The other strings,
``-MyVacation-`` and ``.jpeg``, are inserted *literally* in the final
file names.  After we ran this command, the files above would end up
with these names::

  20100103-MyVacation-0001.jpeg       # Sid's
  20100105-MyVacation-0001.jpeg       # Mary's
  20100105-MyVacation-0002.jpeg       # Bob's

Notice that the files taken on the same date have been sequenced by
the time-of-day they were taken because we included the ``/+T0001/``
renaming token in our pattern.  The ``+`` here means to construct
the sequence in *ascending* order.  A ``-`` would specify
*descending* order.  
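
The ordering behind this example can be sketched in Python.  This is
an illustration only: the function ``sequence_within_date`` is
hypothetical, the timestamps are typed in by hand rather than read
from the files, and the dates are read as month-day-year to match the
result shown above.

```python
from datetime import datetime
from itertools import groupby


def sequence_within_date(photos, start=1):
    """Map each original name to 'yyyymmdd-MyVacation-NNNN.jpeg'.

    photos is a list of (name, datetime) pairs.  The counter restarts
    for every calendar day, in ascending time order.
    """
    ordered = sorted(photos, key=lambda item: item[1])
    renamed = {}
    # groupby() needs sorted input; each group holds one calendar day.
    for day, group in groupby(ordered, key=lambda item: item[1].date()):
        for offset, (name, _taken) in enumerate(group):
            renamed[name] = "{:%Y%m%d}-MyVacation-{:04d}.jpeg".format(
                day, start + offset
            )
    return renamed


photos = [
    ("DSC002.jpg", datetime(2010, 1, 5, 12, 0)),   # Bob
    ("dc0234.Jpg", datetime(2010, 1, 5, 8, 0)),    # Mary
    ("032344.jpeg", datetime(2010, 1, 3, 16, 0)),  # Sid
]
new_names = sequence_within_date(photos)
```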

.. Note:: Notice that there is *no old string* in our example above.
          That is, there is nothing to the left of the ``=`` symbol in
          the ``-r`` option.  This effectively means "replace
          everything" in the existing file or directory name.  You can
          do the same thing using a regular expression::

               tren.py -x -r .*=/D/-MyVacation-/+T0001/.jpeg *.jp*

          Of course, if you use the ``-b`` or ``-e`` flags, you limit
          just what portion of the filename is considered
          "everything".

Of course, you don't *have* to replace the entire filename when
using tokens.  It's perfectly legitimate to replace only
a portion of the existing name::

   tren.py -r file=/D/-file  file-1 file.2

This would rename our files to ``20100101-file-1`` and
``20100101-file.2`` (assuming both were created on January 1, 2010).
Notice that we combined literal text and a renaming token to do this.

You can even use renaming tokens in your *old string* specification.
For instance, suppose you manage a number of different systems and you
set their system name in an environment variable called SYSNAME.  You
might then do something like this::

  tren.py -x -r /$SYSNAME/*.bku=/$SYSNAME/*.bku.old

If your system name was "matrix", then the command above would only
rename files whose names began with ``matrix`` and ended with ``.bku``.
If your system name were "morton", then the command above would only
rename files whose names began with ``morton`` and ended with ``.bku``.

There are a couple of things to keep in mind when doing things like
this:

  1) The ``/$SYSNAME/`` in the old string is used to *find the text to
     rename*, whereas the same renaming token in the new string means
     *insert the contents of that environment variable here*.

  2) Renaming tokens are always evaluated *before* any regular
     expression processing takes place.  It's up to you to make sure
     that when the two are combined (as we have in the example above),
     *that the final result is still a valid Python regular
     expression*.  This may involve explicit quoting of the renaming
     tokens used in the old string specification.


**tren** has many other kinds of renaming tokens.  Their
structure and use is described in some detail in the
section below entitled `RENAMING TOKENS: THE GORY DETAILS`_.


COMMAND LINE TOGGLES
====================

**tren** defaults to a specific set of behaviors:

  - ``old`` and ``new`` renaming text is treated *literally*
  - Renaming takes place within *the entire filename*
  - *Only the first instance* of ``old`` is replaced with ``new``
  - Renaming is *case sensitive*

There are command line "switches" to override each of these defaults
(``-x``, ``-b``, ``-e``, ``-g``, and ``-c``).

There are additional "switches" to return the program to its
default behavior (``-X``, ``-a``, ``-1``, and ``-C``).

The idea is that you can specify what kind of replacement
behavior you want *for each different renaming operation*.
For instance::

  tren.py -e -r txt=TXT -g -a -c -r M=0 -C -x -r [ss]+=S filelist

This would rename the files as follows:

  - The first instance of ``txt`` would be replaced with
    ``TXT`` in each of the file extensions.

  - All instances of ``m`` or ``M`` would be replaced with ``0``
    anywhere they were found in the filename.

  - Every run of one or more ``s`` characters would be replaced
    with ``S``.


OTHER PROGRAM SEMANTICS
=======================

It's important to understand some subtleties of just how **tren**
works, particularly if you intend to create complex, multi-replacement
command lines:

  - Command line processing is from left to right.  As we saw in
    `COMMAND LINE TOGGLES`_ above, this means the options can be
    different for each renaming operation you specify.

  - Regular expression processing is unaffected by the ``-g / -1``
    (global rename) and ``-c / -C`` (ignore case) options.

  - Filenames may be absolute, relative, or implicit (to the current
    working directory).  **tren** keeps track of this and can do
    renaming in directories other than the current one.

  - **tren** processes each renaming string in the following
    manner:

    1) Select the target portion of the filename for
       renaming (all, name only, extension only).

    2) Replace all renaming tokens with their equivalent
       text in both the ``old`` and ``new`` renaming strings.

    3) If doing literal string replacement:

        - If ``-c`` is in effect, collapse the target and the ``old``
          renaming string to *lower case* before checking for a match.

        - Replace the first (default and ``-1``) or all (``-g``)
          instances of ``old`` with ``new``.

    4) If doing regular expression processing, replace any regex
       matches with the corresponding ``new`` string.  Keep in mind
       that if ``-x`` is selected the *entire* ``old`` string is
       treated as a Python regular expression.  Pay particular
       attention to this if you're combining literal text and/or
       renaming tokens with regular expression metacharacters.

  - When all the renaming operations are complete - and thus a new filename
    has been constructed - **tren** checks to see if a file or directory
    by that name already exists.  Unless the ``-f`` flag is in force,
    **tren** will refuse to do a renaming *over an existing filename*.
    If the new filename does not exist, **tren** will attempt the
    renaming.  If the rename fails for some reason - say you don't
    have permission to rename a particular file or directory - you'll
    see an error message to that effect.

  - By default, **tren** will stop processing on any error.  You
    can override this with the ``-E`` option.  In that case,
    an error message will be displayed.  No matter what caused
    the error,  **tren** will skip the file currently being processed
    and go on to the next one.
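
The literal-replacement step described above can be sketched as
follows.  This is a reading of the documented behavior, not
**tren**'s source: ``literal_replace`` and its keyword arguments are
hypothetical stand-ins for the ``-c`` and ``-g`` options, and the
lower-casing trick assumes plain ASCII names.

```python
def literal_replace(target, old, new, ignore_case=False, replace_all=False):
    """Replace the first (or every) literal occurrence of old in target."""
    if old == "":
        return new  # an empty old string means "replace everything"
    # With case ignored, search lower-cased copies but cut the original text.
    haystack = target.lower() if ignore_case else target
    needle = old.lower() if ignore_case else old
    pieces = []
    position = 0
    while True:
        found = haystack.find(needle, position)
        if found == -1:
            break
        pieces.append(target[position:found])
        pieces.append(new)
        position = found + len(needle)
        if not replace_all:
            break
    pieces.append(target[position:])
    return "".join(pieces)


first_only = literal_replace("a.txt.txt", "txt", "TXT")
all_any_case = literal_replace(
    "MyMemo.txt", "M", "0", ignore_case=True, replace_all=True
)
```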


RENAMING TOKENS: THE GORY DETAILS
=================================

As we've just seen, a *renaming token* is nothing more than 
a string representing something **tren** knows about.  These
fit in one of three categories:

  - An attribute of the file or directory being renamed
  - An attribute of the underlying operating system environment
  - A sequence that reflects some ordering principle

Renaming tokens are delimited by the ``/`` character.  **tren**
replaces these tokens with the corresponding information (see
descriptions below) wherever you indicated in either the ``old`` or
``new`` strings of a ``-r`` rename command.

Currently, **tren** defines a number of renaming tokens.  Future
releases of  **tren** may add more of these, so it's good to
periodically reread this material.

File Attribute renaming tokens
==============================

These tokens are derived from information about the file or
directory being renamed.



``/D/       File or directory creation date``

              This token is replaced with the date of creation
              of the file or directory being renamed.  It is
              in ``yyyymmdd`` format.

``/dd/      File or directory day of creation``

              This token is replaced with the day of the month the
              file was created in ``dd`` format.

``/dy/      File or directory day of creation``

              This token is replaced with the name of the day the
              file was created in ``Ddd`` format.

``/E/       Original File Extension``

              This token is replaced with the "extension" portion of the file
              or directory before renaming.  This does not include the
              extension separator string.

``/F/       Original File Name``

              This token is replaced with the "name" portion of the file or
              directory before renaming.

.. NOTE:: Notice that there is no token for the *whole* filename
          because you can always synthesize it with ``/F/./E/``.


``/G/       File or directory primary group name``

              This token is replaced with the name of the
              primary group to which the file belongs.

``/hh/      File or directory hour of creation``

              This token is replaced with the hour the file was
              created in ``hh`` format.

``/I/       File or directory creation date in ISO format``

              This token is replaced with the date of creation of the
              file or directory being renamed.  It is similar to ``/D/``
              except it is in ISO format, ``YYYY-MM-DD``.

``/L/       File or directory length``

              This token is replaced with a numeric string
              that indicates the length of the file or directory
              in bytes.

``/mm/      File or directory minutes of creation``

              This token is replaced with the minutes the file was
              created in ``mm`` format.

``/mo/      File or directory month of creation``

              This token is replaced with the numeric month the file was
              created in ``mm`` format.

``/my/      File or directory month of creation``

              This token is replaced with the abbreviated name of the
              month the file was created in ``Mmm`` format.

``/ss/      File or directory seconds of creation``

              This token is replaced with the seconds the file was
              created in ``ss`` format.

``/T/       File or directory creation time``

              This token is replaced with the time of creation of the
              file or directory being renamed.  It is in ``hh:mm:ss``
              format.  This is equivalent to ``/hh/:/mm/:/ss/``.

``/U/       File or directory owner name``

              This token is replaced with the name of the
              file or directory's owner.

``/yyyy/    File or directory year of creation``

              This token is replaced with the year the file was
              created in ``yyyy`` format.




System Related renaming tokens
==============================

These tokens are derived from the underlying operating system
and runtime environment.

``/$ENV/     Environment variable``

               This token is replaced with the value of
               the environment variable ``$ENV``.  If
               that variable does not exist, the token
               is replaced with an empty string::

                 tren.py -r =/$ORGANIZATION/-/F/./E/ *

               This prepends the organization's name to everything in
               the current directory.
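
The substitution rule for this token, including the empty-string
fallback, can be sketched in Python.  The function
``expand_env_tokens`` and its regular expression are assumptions made
for illustration; they are not taken from **tren**'s source.

```python
import os
import re


def expand_env_tokens(text, environ=None):
    """Replace every /$NAME/ token with the value of environment variable NAME.

    Unset variables expand to an empty string, as described above.
    """
    environ = os.environ if environ is None else environ
    return re.sub(
        r"/\$([A-Za-z_][A-Za-z0-9_]*)/",
        lambda match: environ.get(match.group(1), ""),
        text,
    )


# A hand-built environment keeps the example independent of the real one.
expanded = expand_env_tokens("/$ORGANIZATION/-/F/./E/", {"ORGANIZATION": "Acme"})
missing = expand_env_tokens("/$NOT_SET/-/F/./E/", {})
```

Other tokens, such as ``/F/`` and ``/E/``, are left untouched by this
sketch.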


``/`cmd`/     Arbitrary command execution``

               This token is replaced with the string
               returned by executing the ``cmd`` command.

               For instance, you might want to prepend the name
               of the system to all your shell scripts::

                 tren.py -r =/`uname -n`/-/F/./E/ *.sh

               This construct is more generally a way to synthesize
               renaming tokens that are not built into **tren**.  For
               instance, the built-in tokens only provide information
               about file and directory *creation* dates.  You might
               want to use the date of *last access*.  You do this by
               writing the appropriate script or program and then
               executing it within the /\`cmd\`/ construct.  This
               effectively provides **tren** an unlimited number of
               renaming tokens.

.. WARNING:: Be *very* careful using this.  It's possible to
              construct bizarre, overly long, and just plain
              chowder-headed strings that make no sense in a renaming
              context using this construct.


Sequence Renaming Tokens
========================

Sometimes it's useful to rename files or directories based on some
*property they possess* like the date or time of creation, the size of
the file, who owns it, and so on.  That's the idea behind the ``/D/``,
``/L/``, and ``/T/`` renaming tokens described in the previous section.

An extension of this idea is to *order all the files being renamed*
based on one of these parameters.  For instance, instead of actually
embedding the date and time of creation in a file or directory name,
you might want to order the files from oldest to newest with a naming
convention like::

  file-1.txt
  file-2.txt
  file-3.txt

This guarantees uniqueness in the final name and also sees to it that
a sorted directory listing will show you the files or directories in
the order you care about.

This is the purpose of *sequence renaming tokens*.  They give you
various ways to create sequences that can be embedded in the final
file or directory name.

General Format Of Sequence Renaming Tokens
==========================================

Sequence renaming tokens consist of three descriptive components and
have the following general format::

    /<ordering flag><type><counting pattern>/
  
      where,
             ordering flag: 
  
                   +  ascending
                   -  descending
  
             type:
  
                   D  sequence on file creation date & time
                   L  sequence on file length
                   R  sequence on the command line file order
                   T  sequence on file creation time within a given day


Count Pattern Format
====================

The counting pattern is used to specify two things: The width of the
sequence string, and the starting value for the sequence.  Examples::

  0001    ->   0001, 0002, 0003, ...
  0000    ->   0000, 0001, 0002, ...
  03      ->   03, 04, 05, ...

You do not have to use a ``0`` to indicate the sequence width.  You
can use *any* padding characters you like.  **tren** only cares about
the width of the field and will "consume" your padding characters as
the count increases::

  xxx3    ->   xxx3, xxx4, xxx5, ... 9999, xxx3, xxx4, ...
   -+8    ->   -+8, -+9, -10, -11, ... 999, -+8, -+9, ...

You are not restricted to numbers in a counting pattern.  Letters may
also be used.  **tren** will preserve the case you specify in the
token when creating sequences like this::

  000a    ->   000a, 000b, 000c, ... zzzz, 000a, ...
  ---Y    ->   ---Y, ---Z, --AA, ... ZZZZ, ---Y, ---Z, ...

Notice that when a sequence "rolls over", the next value is the
*initial sequence value you specified*.
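
For numeric patterns, the rules above (width taken from the pattern,
padding consumed as the count grows, rollover to the initial value)
can be sketched as a Python generator.  This is an interpretation of
the examples in this section, not **tren**'s implementation:
``numeric_sequence`` is a hypothetical name, and the sketch assumes
the starting value is the run of digits at the end of the pattern.

```python
import itertools
import re


def numeric_sequence(pattern):
    """Yield successive values for a numeric counting pattern such as 'xxx3'."""
    width = len(pattern)
    match = re.search(r"[0-9]+$", pattern)
    if match is None:
        raise ValueError("pattern must end in at least one digit")
    start = int(match.group())
    value = start
    while True:
        text = str(value)
        if len(text) > width:
            # Rolled over: begin again at the initial value.
            value = start
            text = str(value)
        # Keep as many leading pattern characters as the digits leave room for.
        yield pattern[:width - len(text)] + text
        value += 1


padded = list(itertools.islice(numeric_sequence("-+8"), 4))
zeros = list(itertools.islice(numeric_sequence("0001"), 3))
```

Alphabetic patterns such as ``000a`` are not handled by this sketch.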


Types Of Sequence Renaming Tokens
=================================

Sequence renaming tokens are thus a way to generate an ordering *based
on some property common to everything being renamed*.  Keep in mind
that for purposes of sequencing, **tren** *makes no distinction
between a file and directory*.  It merely sequences based on the
property you requested.

**tren** currently supports the following kinds of sequencing:


  ``/+D0001/   Sequence based on the creation date/time``

               This produces a sequence from oldest to newest
               (or the reverse) of the renamed objects.

                 ``tren.py -b -r =/+D0002/  *.txt``

               This would rename all the files in the current
               directory into the form, ``0002.txt``, ``0003.txt``,
               ... ``9999.txt`` with ``0002.txt`` being the oldest
               file and ``9999.txt`` being the newest.  If you
               used the token ``/-D0002/``, you'd get the same
               thing, but in reverse order.

  ``/+L0001/   Sequence based on the size of the files being renamed``

               This produces a sequence from shortest to longest
               (or the reverse) of the renamed objects.

                 ``tren.py -b -r =/+L0002/  *.txt``

               This would rename all the files in the current
               directory into the form, ``0002.txt``, ``0003.txt``,
               ... ``9999.txt`` with ``0002.txt`` being the shortest
               file and ``9999.txt`` being the longest.  If you
               used the token ``/-L0002/``, you'd get the same
               thing, but in reverse order.

  ``/+R0001/   Sequence based on the file order on the command line``

               This produces a sequence based on the order (or the
               reverse) of renaming - i.e., The order of the names
               on the command line.

                 ``tren.py -e -r =/+R0000/  MyFile.txt AFile.jpg me.log``

               This would rename the files to ``MyFile.0000``,
               ``AFile.0001``, and ``me.0002``.  If you used ``/-R0000/``,
               you'd get ``MyFile.0002``, ``AFile.0001``, and ``me.0000``.

  ``/+T0001/   Sequence based on creation time within date``

               This produces a sequence based on the creation date
               and time similar to the ``/+D.../`` sequence renaming
               token above.  However, the sequence *resets* at the
               beginning of each new date. This allows you to
               create unique sequences *within a date* like our
               example of renaming photo files from different
               cameras.  (See: `An Overview Of Renaming Tokens`_)::
  
                
                 tren.py -b -r =/D/-/+T0100/ *.txt

               This would rename all the ``.txt`` files in the current
               directory into the form::

                 20010310-0100.txt
                 20010310-0101.txt
                 20010310-0102.txt
                 20010410-0100.txt
                 20010410-0101.txt
                 20010410-0102.txt
                 20010411-0100.txt
                 20010411-0101.txt
                 20010411-0102.txt
                 ...

               In other words, instead of sequencing just on the creation date,
               this allows us to sequence *within* the date. As always, the
               ``-`` flag will reverse this order within the date.

               Notice that you can get something similar using just
               file attribute renaming tokens::

                 tren.py -b -r =/D/-/T/ *.txt

               This would produce names in the form::

                 20010310-03:01:23.txt
                 20010310-03:01:24.txt
                 20010310-03:01:25.txt
                 ...

               For most purposes, though, the order, rather than the
               absolute time, is both more useful and more readable.


BUGS, MISFEATURES, OTHER
------------------------

You must be running Python 2.6.x or later.  **tren** makes use of
features not supported in releases prior to this.

As a general matter, **tren** should run on any POSIX-compliant OS
that has this version (or later) of Python on it.  It will also run on
many Microsoft Windows systems.  If the Windows system has the
``win32all`` Python extensions installed, **tren** will take advantage
of them for purposes of deriving the names of the user and group that
own the file or directory being renamed.

As of this writing, **tren** will not run in the **cygwin** environment
because their version of Python is still backleveled to 2.5.x.  When
and if the **cygwin** team upgrades to 2.6.x, **tren** is expected to
work there as well.

This program is **EXPERIMENTAL** (see the license).  This means it's
had some testing but is certainly not guaranteed to be perfect.  As of
this writing, it has been run on FreeBSD, Linux, Windows XP, and Mac
OS X.  It has not, however, been run on 64-bit versions of those OSs.

If you have experience, positive or negative, using **tren** on other
OS/bitsize systems, please contact us at the email address below.


HOW COME THERE'S NO GUI?
------------------------

**tren** is primarily intended for use by power users, sys admins, and
advanced users that (mostly) find GUIs more of a nuisance than a help.
There are times, however, when it would be handy to be able to select
the files to be renamed graphically.  TundraWare has a freely
available file browser that is macro programmed.  It will work nicely
in such applications:

  http://www.tundraware.com/Software/twander/


COPYRIGHT AND LICENSING
-----------------------

**tren** is Copyright (c) 2010 TundraWare Inc.

For terms of use, see the ``tren-license.txt`` file in the
program distribution.  If you install **tren** on a FreeBSD
system using the 'ports' mechanism, you will also find this file in
``/usr/local/share/doc/tren``.


AUTHOR
------

::

   Tim Daneliuk
   tren@tundraware.com



DOCUMENT REVISION INFORMATION
-----------------------------

::

  $Id: tren.rst,v 1.151 2010/03/26 00:57:04 tundra Exp $

You can find the latest version of this program at:

  http://www.tundraware.com/Software/tren


This document was produced using reStructuredText:

  http://docutils.sourceforge.net/rst.html