This manual is a general guide to the Subversion revision control system.
How to get started with Subversion.
Subversion is a free/open-source revision control system.
That is, Subversion manages files over time. The files are placed into a central repository. The repository is much like an ordinary file server, except that it remembers every change ever made to your files. This allows you to recover older versions of your files, or browse the history of how your files changed. Many people think of a revision control system as a sort of "time machine."
Some revision control systems are also software configuration management (SCM) systems. These systems are specifically tailored to manage trees of source code, and have many features that are specific to software development (such as natively understanding programming languages). Subversion, however, is not one of these systems; it is a general system that can be used to manage any sort of collection of files, including source code.
Subversion aims to be the successor to CVS (http://www.cvshome.org/).
At the time of writing, CVS is the standard Free revision control system used by the open-source community. It has a hard-earned, well-deserved reputation as stable and useful software, and has a design that makes it perfect for open-source development. However, it also has a number of problems that are difficult to fix.
Subversion's original designers settled on a few simple goals. First, it was decided that Subversion should be a functional replacement for CVS. That is, it should do everything that CVS does, preserving the same development model while fixing the most obvious flaws. Secondly, with existing CVS users as the first target audience, Subversion should be written such that any CVS user should be able to start using it with little effort.
Collabnet (http://www.collab.net/) provided the initial funding in 2000 to begin development work, and the effort has now blossomed into a large, open-source project backed by a community of free software developers.
The intended audience of this book is anyone who has used a revision control system before, although perhaps not Subversion or CVS. It assumes that the reader is computer-literate, and reasonably comfortable at a Unix command-line.
People familiar with CVS may want to skip some of the introductory sections that describe Subversion's concurrent versioning model. There is also a quick guide for CVS users attached as an appendix (see SVN for CVS users).
What sort of things does Subversion do better than CVS? Here's a short list to whet your appetite:
Subversion has a modular design; it's implemented as a collection of C libraries. Each layer has a well-defined purpose and interface.
If you aren't interested in how Subversion works under the hood, feel free to skip this section and move on to Installation and Basics.
Here's a helpful diagram of Subversion's layers. Program flow begins at
the top of the diagram (initiated by the user) and flows "downward."
                        +--------------------+
                        | commandline or GUI |
                        |    client app      |
             +----------+--------------------+----------+ <=== Client interface
             |              Client Library              |
             |                                          |
             |        +----+                            |
             |        |    |                            |
     +-------+--------+    +--------------+--+----------+ <=== Network interface
     | Working Copy   |    |    Remote    |  | Local    |
     | Management lib |    | Repos Access |  | Repos    |
     +----------------+    +--------------+  | Access   |
                           |     neon     |  |          |
                           +--------------+  |          |
                              ^              |          |
                             /               |          |
                       DAV  /                |          |
                           /                 |          |
                          v                  |          |
                  +---------+                |          |
                  |         |                |          |
                  | Apache  |                |          |
                  |         |                |          |
                  +---------+                |          |
                  | mod_DAV |                |          |
                +-------------+              |          |
                | mod_DAV_SVN |              |          |
     +----------+-------------+--------------+----------+ <=== Filesystem interface
     |                                                  |
     |               Subversion Filesystem              |
     |                                                  |
     +--------------------------------------------------+
The Subversion Filesystem is not a kernel-level filesystem that one would install in an operating system (like the Linux ext2 or NTFS), but a virtual filesystem. It is implemented as a library which exports a C API that simulates a filesystem - specifically, a "versioned" filesystem - but uses a database system for the back-end storage mechanism.
This means that writing a program to access the repository is like writing against other filesystem APIs: you can open files and directories for reading and writing as usual. The main difference is that this particular filesystem never loses data when written to; old versions of files and directories are always saved as historical artifacts.
Currently, the database system in use is BerkeleyDB, which provides other nice features that Subversion needs: data integrity, atomic writes, recoverability, and hot backups.
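The idea of a filesystem that never loses data when written to can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class VersionedFS and its methods are hypothetical names invented for illustration, the storage is an in-memory list rather than a database, and none of it reflects the real C API of the Subversion Filesystem library.

```python
class VersionedFS:
    """Toy in-memory stand-in for a versioned filesystem (hypothetical, not the Subversion API)."""

    def __init__(self):
        # Start with a single empty tree.
        self._trees = [{}]

    def write(self, path, data):
        """Store new contents for path and return the number of the new tree."""
        tree = dict(self._trees[-1])   # copy the latest tree; older trees stay untouched
        tree[path] = data
        self._trees.append(tree)
        return len(self._trees) - 1

    def read(self, path, version=None):
        """Read path from the given tree (default: the latest one)."""
        if version is None:
            version = len(self._trees) - 1
        return self._trees[version][path]


fs = VersionedFS()
v1 = fs.write("/trunk/README", "first draft")
v2 = fs.write("/trunk/README", "second draft")
print(fs.read("/trunk/README"))        # latest contents
print(fs.read("/trunk/README", v1))    # the older contents are still available
```

Copying the whole tree on every write is wasteful and is done here only to keep the sketch short; a real back end would store changes far more compactly.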
Subversion has the mark of Apache all over it. At its very core, the client uses the Apache Portable Runtime (APR) library. This means that the Subversion client compiles and runs anywhere Apache httpd does - right now, this list includes all flavors of Unix, Win32, BeOS, OS/2, Mac OS X, and possibly Netware.
However, Subversion depends on more than just APR - the Subversion "server" is Apache httpd itself. Apache httpd is a time-tested, extensible open-source server process that is ready for serious use. It can sustain a high network load, runs on many platforms, and can operate through firewalls. It can use a number of different authentication protocols and do network pipelining and caching. By using Apache as a server, Subversion gets all these features for free.
Subversion uses WebDAV as its network protocol. DAV (Distributed Authoring and Versioning) is a whole discussion in itself (see http://www.webdav.org/) - but in short, it's an extension to HTTP that allows reads/writes and "versioning" of files over the web. The Subversion project is hoping to ride a slowly rising tide of support for this protocol: all of the latest file-browsers for Win32, MacOS, and GNOME speak this protocol already. Interoperability will (hopefully) become more and more of a boon over time.
For users who simply wish to access Subversion repositories on their local disk, the client can do this too; no network is required. The "Repository Access" layer (RA) is an abstract API implemented by both the DAV and local-access RA libraries. This is a specific benefit of writing a "librarized" revision control system - if you feel like writing a new network protocol for Subversion, just write a new library that implements the RA API.
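As a rough illustration of the "one abstract API, several implementations" design, here is a minimal Python sketch. The class and function names (RepositoryAccess, RaDav, RaLocal, ra_for_url) are hypothetical and chosen for this example; the real RA layer is a C API whose functions are not shown here. The only details taken from Subversion itself are the module names ra_dav and ra_local and the 'http' and 'file' URL schemes they handle.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlparse


class RepositoryAccess(ABC):
    """Hypothetical stand-in for the abstract RA interface."""

    @abstractmethod
    def describe(self, url):
        """Say how this module would reach the repository at url."""


class RaDav(RepositoryAccess):
    def describe(self, url):
        return f"ra_dav: WebDAV request to {url}"


class RaLocal(RepositoryAccess):
    def describe(self, url):
        return f"ra_local: direct disk access to {urlparse(url).path}"


# Each module registers the URL scheme it handles.
RA_MODULES = {"http": RaDav(), "file": RaLocal()}


def ra_for_url(url):
    """Pick the RA implementation from the URL scheme."""
    scheme = urlparse(url).scheme
    try:
        return RA_MODULES[scheme]
    except KeyError:
        raise ValueError(f"no RA module handles scheme {scheme!r}") from None


print(ra_for_url("http://svn.collab.net/repos/svn/trunk").describe("http://svn.collab.net/repos/svn/trunk"))
print(ra_for_url("file:///usr/local/svn/repos").describe("file:///usr/local/svn/repos"))
```

In this model, supporting a new protocol means adding one more class and one more entry in RA_MODULES; the calling code does not change.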
On the client side, the Subversion "working copy" library maintains administrative information within special .svn subdirectories, similar in purpose to the CVS administrative directories found in CVS working copies.
A glance inside the typical .svn directory turns up a bit more than what CVS maintains in its administrative directories, however. The `entries' file contains XML which describes the current state of the working copy directory (and which basically serves the purposes of CVS's Entries, Root, and Repository files combined). But other items present (and not found in CVS) include storage locations for the versioned "properties" (the metadata mentioned in "Subversion Features" above) and private caches of pristine versions of each file. This latter feature provides the ability to report local modifications - and do reversions - without network access. Authentication data is also stored within .svn/, rather than in a single .cvspass-like file.
The Subversion "client" library has the broadest responsibility; its job is to mingle the functionality of the working-copy library with that of the repository-access library, and then to provide a highest-level API to any application that wishes to perform general revision control actions.
The client library is designed to be used by any application. While the Subversion source code includes a standard command-line client, it should be very easy to write any number of GUI clients on top of the client library.
### Somebody please write this. It should describe how to fetch various binary packages of Subversion for different platforms. Maybe this will flesh out once RPMs, .debs, and BSD ports are widely available from standard locations?
To build from source code, see Compiling and installing.
If you're an existing CVS user, then the first section, The Subversion Development Model, should already be familiar. You may just want to skim it quickly, noting the special definition of "Revision" in the second subsection. At some point, you should probably also read the appendix which describes fundamental differences between CVS and SVN (see SVN for CVS users).
Suppose you are using Subversion to manage a software project. There are two things you will interact with: your working directory, and the repository.
Your working directory is an ordinary directory tree, on your local system, containing your project's sources. You can edit these files and compile your program from them in the usual way. Your working directory is your own private work area: Subversion never changes the files in your working directory, or publishes the changes you make there, until you explicitly tell it to do so.
After you've made some changes to the files in your working directory, and verified that they work properly, Subversion provides commands to publish your changes to the other people working with you on your project. If they publish their own changes, Subversion provides commands to incorporate those changes into your working directory.
A working directory contains some extra files, created and maintained by Subversion, to help it carry out these commands. In particular, these files help Subversion recognize which files contain unpublished changes, and which files are out-of-date with respect to others' work.
While your working directory is for your use alone, the repository is the common public record you share with everyone else working on the project. To publish your changes, you use Subversion to put them in the repository. (What this means, exactly, we explain below.) Once your changes are in the repository, others can tell Subversion to incorporate your changes into their working directories. In a collaborative environment like this, each user will typically have their own working directory (or perhaps more than one), and all the working directories will be backed by a single repository, shared amongst all the users.
A Subversion repository holds a single directory tree, and records the history of changes to that tree. The repository retains enough information to recreate any prior state of the tree, compute the differences between any two prior trees, and report the relations between files in the tree -- which files are derived from which other files.
A Subversion repository can hold the source code for several projects; usually, each project is a subdirectory in the tree. In this arrangement, a working directory will usually correspond to a particular subtree of the repository.
For example, suppose you have a repository laid out like this:
    /trunk/paint/Makefile
                 canvas.c
                 brush.c
           write/Makefile
                 document.c
                 search.c
In other words, the repository's root directory has a single subdirectory named trunk, which itself contains two subdirectories: paint and write.
To get a working directory, you must check out some subtree of the repository. If you check out /trunk/write, you will get a working directory like this:
    write/Makefile
          document.c
          search.c
          .svn/
This working directory is a copy of the repository's /trunk/write directory, with one additional entry -- .svn -- which holds the extra information needed by Subversion, as mentioned above.
Suppose you make changes to search.c. Since the .svn directory remembers the file's modification date and original contents, Subversion can tell that you've changed the file. However, Subversion does not make your changes public until you explicitly tell it to.
To publish your changes, you can use Subversion's commit command:
    $ pwd
    /home/jimb/write
    $ ls -a
    .svn/  Makefile  document.c  search.c
    $ svn commit search.c
    $
Now your changes to search.c have been committed to the repository; if another user checks out a working copy of /trunk/write, they will see your text.
Suppose you have a collaborator, Felix, who checked out a working directory of /trunk/write at the same time you did. When you commit your change to search.c, Felix's working copy is left unchanged; Subversion only modifies working directories at the user's request.
To bring his working directory up to date, Felix can use the Subversion update command. This will incorporate your changes into his working directory, as well as any others that have been committed since he checked it out.
    $ pwd
    /home/felix/write
    $ ls -a
    .svn/  Makefile  document.c  search.c
    $ svn update
    U search.c
    $
The output from the svn update command indicates that Subversion updated the contents of search.c. Note that Felix didn't need to specify which files to update; Subversion uses the information in the .svn directory, and further information in the repository, to decide which files need to be brought up to date.
We explain below what happens when both you and Felix make changes to the same file.
A Subversion commit operation can publish changes to any number of files and directories as a single atomic transaction. In your working directory, you can change files' contents, create, delete, rename and copy files and directories, and then commit the completed set of changes as a unit.
In the repository, each commit is treated as an atomic transaction: either all the commit's changes take place, or none of them take place. Subversion tries to retain this atomicity in the face of program crashes, system crashes, network problems, and other users' actions. We may call a commit a transaction when we want to emphasize its indivisible nature.
Each time the repository accepts a transaction, this creates a new state of the tree, called a revision. Each revision is assigned a unique natural number, one greater than the number of the previous revision. The initial revision of a freshly created repository is numbered zero, and consists of an empty root directory.
Unlike those of many other systems, Subversion's revision numbers apply to an entire tree, not individual files. Each revision number selects an entire tree.
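The two properties described above, all-or-nothing commits and revision numbers that name whole trees, can be modelled in a short Python sketch. It is an illustration under stated assumptions, not Subversion's storage format: the Repository class and its commit method are hypothetical, and a commit is represented simply as a mapping from paths to new contents (or to None for a deletion).

```python
class Repository:
    """Toy model of whole-tree revisions (hypothetical; not Subversion's real storage)."""

    def __init__(self):
        self.revisions = [{}]          # revision 0: an empty root directory

    @property
    def latest(self):
        return len(self.revisions) - 1

    def commit(self, changes):
        """Apply every change as one new revision, or none of them.

        changes maps a path to its new contents, or to None to delete the path.
        """
        tree = dict(self.revisions[-1])        # work on a private copy of the latest tree
        for path, contents in changes.items():
            if contents is None:
                if path not in tree:
                    raise KeyError(f"cannot delete missing path: {path}")
                del tree[path]
            else:
                tree[path] = contents
        self.revisions.append(tree)            # publish only after every change succeeded
        return self.latest


repo = Repository()
rev = repo.commit({"write/search.c": "v1", "write/document.c": "v1"})

# This commit fails halfway through, so none of it becomes visible.
try:
    repo.commit({"write/search.c": "v2", "write/missing.c": None})
except KeyError:
    pass

print(rev, repo.latest, repo.revisions[repo.latest]["write/search.c"])
```

Because the failed commit only ever touched its private copy of the tree, the repository still ends at revision 1 with the original contents of write/search.c.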
It's important to note that working directories do not always correspond to any single revision in the repository; they may contain files from several different revisions. For example, suppose you check out a working directory from a repository whose most recent revision is 4:
    write/Makefile:4
          document.c:4
          search.c:4
At the moment, this working directory corresponds exactly to revision 4 in the repository. However, suppose you make a change to search.c, and commit that change. Assuming no other commits have taken place, your commit will create revision 5 of the repository, and your working directory will look like this:
    write/Makefile:4
          document.c:4
          search.c:5
Suppose that, at this point, Felix commits a change to document.c, creating revision 6. If you use svn update to bring your working directory up to date, then it will look like this:
    write/Makefile:6
          document.c:6
          search.c:6
Felix's changes to document.c will appear in your working copy of that file, and your change will still be present in search.c. In this example, the text of Makefile is identical in revisions 4, 5, and 6, but Subversion will mark your working copy with revision 6 to indicate that it is still current. So, after you do a clean update at the root of your working directory, your working directory will generally correspond exactly to some revision in the repository.
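The Makefile, document.c and search.c example can be replayed with a small Python sketch. It is a bookkeeping model only: the working copy is a plain dictionary from file name to working revision, and the commit and update functions are hypothetical helpers written for this illustration, not part of any Subversion library.

```python
def commit(working_copy, path, latest_revision):
    """Commit one file: only that file's working revision advances."""
    new_revision = latest_revision + 1
    working_copy[path] = new_revision
    return new_revision


def update(working_copy, latest_revision):
    """Update the whole working copy: every file is marked with the latest revision."""
    for path in working_copy:
        working_copy[path] = latest_revision


wc = {"Makefile": 4, "document.c": 4, "search.c": 4}

latest = commit(wc, "search.c", 4)     # your commit creates revision 5
after_commit = dict(wc)
print(after_commit)                    # mixed revisions: 4, 4 and 5

latest += 1                            # Felix's commit to document.c creates revision 6
update(wc, latest)
print(wc)                              # everything is now at revision 6
```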
For each file in a working directory, Subversion records two essential pieces of information:
Given this information, by talking to the repository, Subversion can tell which of the following four states a file is in:
Subversion's status command will show you the state of any item in your working copy. See Basic Work Cycle, in particular the subsection "Examine your changes".
Subversion does not prevent two users from making changes to the same file at the same time. For example, if both you and Felix have checked out working directories of /trunk/write, Subversion will allow both of you to change write/search.c in your working directories. Then, the following sequence of events will occur:
1. Suppose Felix commits his changes to search.c first. His commit will succeed, and his text will appear in the latest revision in the repository.
2. When you attempt to commit your changes to search.c, Subversion will reject your commit, and tell you that you must update search.c before you can commit it.
3. When you update search.c, Subversion will try to merge Felix's changes from the repository with your local changes. By default, Subversion merges as if it were applying a patch: if your local changes do not overlap textually with Felix's, then all is well; otherwise, Subversion leaves it to you to resolve the overlapping changes. In either case, Subversion carefully preserves a copy of the original pre-merge text.
4. Once you have verified that the merge is correct, you can commit the new version of search.c, which now contains everyone's changes.
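The sequence above hinges on one check: a commit is refused when the file has changed in the repository since the revision your copy is based on. The Python sketch below models just that check. ToyRepository, OutOfDateError and the base_revision argument are hypothetical names for this illustration; the merge itself is not modelled, and the starting revision number 4 is an arbitrary choice.

```python
class OutOfDateError(Exception):
    """Raised when a commit is based on a stale revision of the file."""


class ToyRepository:
    def __init__(self, revision):
        self.latest = revision
        self.last_changed = {}     # path -> revision in which it last changed

    def commit(self, path, base_revision):
        """Accept the commit only if path is unchanged since base_revision."""
        if self.last_changed.get(path, 0) > base_revision:
            raise OutOfDateError(f"{path} is out of date; update first")
        self.latest += 1
        self.last_changed[path] = self.latest
        return self.latest


repo = ToyRepository(revision=4)

# Step 1: Felix commits first, and succeeds.
felix_rev = repo.commit("search.c", base_revision=4)

# Step 2: your commit is still based on revision 4, so it is rejected.
try:
    repo.commit("search.c", base_revision=4)
    rejected = False
except OutOfDateError:
    rejected = True

# Steps 3 and 4: after updating (and merging), your base is Felix's revision.
your_rev = repo.commit("search.c", base_revision=felix_rev)
print(felix_rev, rejected, your_rev)
```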
Some revision control systems provide "locks", which prevent others from changing a file once one person has begun working on it. In our experience, merging is preferable to locks, because:
Of course, the merge process needs to be under the users' control. Contextual, line-by-line patching is not appropriate for files with rigid formats, like images or executables. Subversion attempts to notice when a file is in a binary format, or is of any mime-type other than text/*. For these rigid-format files, Subversion simply presents you with the two original texts to choose from. See Basic Work Cycle, in particular the subsection "Merge others' changes".
The previous section gave an abstract overview of the Subversion development model. Here's an opportunity to play with Subversion in some hands-on examples. The Subversion commands demoed here are just small examples of what Subversion can do; see Chapter 2 for full explanations of each.
The Subversion client has an abstract interface for accessing a repository. Two "Repository Access" (RA) implementations currently exist as libraries. You can see which methods are available to your svn client like so:
    $ svn --version
    Subversion Client, version N
       compiled Jan 26 2002, 16:43:58
    Copyright (C) 2000-2002 CollabNet.
    Subversion is open source software, see http://subversion.tigris.org/
    The following repository access (RA) modules are available:
    * ra_dav : Module for accessing a repository via WebDAV (DeltaV) protocol.
         - handles 'http' schema
    * ra_local : Module for accessing a repository on local disk.
         - handles 'file' schema
If you don't see ra_local, it probably means that Berkeley DB (or relevant database back-end) wasn't found when compiling your client binary. To continue with these examples, you'll need to have ra_local available.
Start by creating a new, empty repository using the svnadmin tool:
    $ svnadmin create myrepos
Let's assume you have a directory someproject which contains files that you wish to place under version control:
    someproject/foo
                bar
                baz/
                baz/gloo
                baz/bloo
Once the repository exists, you can initially import your data into it, using the ra_local access method (invoked by using a "file" URL):
    $ svn import file:///absolute/path/to/myrepos someproject myproj
    ...
    Committed revision 1.
The example above creates a new directory myproj in the root of the repository's filesystem, and copies all the data from someproject into it.
Now check out a fresh "working copy" of your project. To do this, we specify a URL to the exact directory within the repository that we want. The parameter after the URL allows us to name the working copy we check out.
    $ svn co file:///usr/local/svn/repos/myproj wc
    A wc/foo
    A wc/bar
    A wc/baz
    A wc/baz/gloo
    A wc/baz/bloo
Now we have a working copy in a local directory called wc, which represents the location /myproj in the repository (assuming the repository's root is file:///usr/local/svn/repos).
For the sake of example, let's duplicate the working copy, and pretend
it belongs to someone else:
$ cp -R wc wc2
From here, let's make some changes within our original working copy:
    $ cd wc
    $ echo "new text" >> bar       # change bar's text
    $ svn propset color green foo  # add a metadata property to foo
    $ svn rm baz                   # schedule baz directory for deletion
    $ touch newfile
    $ svn add newfile              # schedule newfile for addition
That's a lot of changes! If we were to leave and come back tomorrow, how could we remember what changes we'd made? Easy. The status command will show us all of the "local modifications" in our working copy:
    $ svn status                   # See what's locally modified
    M      ./bar
    _M     ./foo
    A      ./newfile
    D      ./baz
    D      ./baz/gloo
    D      ./baz/bloo
According to this output, three items are scheduled to be (D)eleted from the repository, one item is scheduled to be (A)dded to the repository, and two items have had their contents (M)odified in some way. For more details, be sure to read about svn status in Chapter 2.
Now we decide to commit our changes, creating Revision 2 in the repository:
    $ svn commit -m "fixed bug #233"
    Sending    bar
    Sending    foo
    Adding     newfile
    Deleting   baz
    Transmitting data...
    Committed revision 2.
The -m argument is a way of specifying a log message: that is, a specific description of your change-set sent to the repository. The log message is now attached to Revision 2. A future user might peruse repository log messages, and now will know what your Revision 2 changes were for.
Finally, pretend that you are now Felix, or some other collaborator. If you go to wc2 (that other working copy you made), it will need the svn update command to receive the Revision 2 changes:
    $ cd ../wc2            # change to the back-up working copy
    $ svn update           # get changes from repository
    U   ./bar
    _U  ./foo
    A   ./newfile
    D   ./baz
The output of the svn update command tells Felix that baz was (D)eleted from his working copy, newfile was (A)dded to his working copy, and that bar and foo had their contents (U)pdated.
If for some reason bar contained some local changes made by Felix, then the server changes would be merged into bar: that is, bar would now contain both sets of changes. Whenever server changes are merged into a locally-modified file, two possible things can happen:
* The two sets of changes do not overlap, and the merge succeeds. In this case, svn update prints a G ("mer(G)ed").
* The two sets of changes overlap, and a C for (C)onflict is printed. See section ??? for information about how conflict resolution works.
How to make a lovely gumbo with your Subversion client, in 11 easy steps.
This chapter goes into more of the gritty details of client commands. For a first overview of the client's CVS-like "copy-modify-merge" model of development, see Basics.
Before reading on, here is the most important piece of information you'll ever need when using Subversion: svn help. The Subversion command-line client tries to be self-documenting; at any time, a quick svn help <subcommand> will describe the syntax, switches, and behavior of subcommand.
This chapter by no means covers every option to every client subcommand. Instead, it's a conversational introduction to the most common tasks you'll encounter. When in doubt, run svn help.
Most of the time, you will start using a Subversion repository by doing a checkout of your project. "Checking out" will provide you with a local copy of the HEAD (latest revision) of the Subversion repository that you checked out.
    $ svn co http://svn.collab.net/repos/svn/trunk
    A  trunk/subversion.dsw
    A  trunk/svn_check.dsp
    A  trunk/COMMITTERS
    A  trunk/configure.in
    A  trunk/IDEAS
    ...
    Checked out revision 2499.
Although the above example checks out the trunk directory, you can just as easily check out any deep subdirectory of a repository by specifying the subdirectory in the checkout URL:
    $ svn co http://svn.collab.net/repos/svn/trunk/doc/handbook
    A  handbook/svn-handbook.texi
    A  handbook/getting_started.texi
    A  handbook/outline.txt
    A  handbook/license.texi
    A  handbook/repos_admin.texi
    A  handbook/client.texi
    Checked out revision 2499.
Since Subversion uses a "copy-modify-merge" model instead of "lock-modify-unlock," you're now ready to start making changes to the files that you've checked out, known collectively as your working copy. You can even delete the entire working copy and forget about it entirely - there's no need to notify the Subversion server unless you're ready to check in changes, a new file, or even a directory.
Every directory in a working copy contains an administrative area, a subdirectory named .svn. Normal ls commands won't show this subdirectory, but it's vital. Whatever you do, don't delete or change anything in the administrative area! Subversion depends on it to manage your working copy.
You can run svn help checkout for command line options to checkout, although one option is very common and worth mentioning: you can specify a directory after your repository URL. This places your working copy into a new directory that you name. For example:
    $ svn co http://svn.collab.net/repos/svn/trunk subv
    A  subv/subversion.dsw
    A  subv/svn_check.dsp
    A  subv/COMMITTERS
    A  subv/configure.in
    A  subv/IDEAS
    ...
    Checked out revision 2499.
Subversion has numerous features, options, bells and whistles, but on a day-to-day basis, odds are that you will only use a few of them. In this section we'll run through the most common things that you might find yourself doing with Subversion in the course of a day's work.
The typical work cycle looks like this:
Update your working copy (svn up)
When working on a project with a team, you'll want to update your working copy: that is, receive any changes from other developers on the project. svn update brings your working copy in sync with the latest revision in the repository.
    $ svn up
    U ./foo.c
    U ./bar.c
    Updated to revision 2.
In this case, someone else checked in modifications to both foo.c and bar.c since the last time you updated, and Subversion has updated your working copy to include those changes.
Let's examine the output of svn update a bit more. When the server sends changes to your working copy, a letter code is displayed next to each item:
U foo
    foo was (U)pdated (received changes from the server).
A foo
    foo was (A)dded to your working copy.
D foo
    foo was (D)eleted from your working copy.
R foo
    foo was (R)eplaced in your working copy; that is, foo was deleted, and a new item with the same name was added. While they may have the same name, the repository considers them to be distinct objects with distinct histories.
G foo
    foo received new changes, but also had changes of your own to begin with. The changes did not intersect, however, so Subversion has mer(G)ed the repository's changes into the file without a problem.
C foo
    foo received (C)onflicting changes from the server. The changes from the server directly overlap your own changes to the file. No need to panic, though. This overlap needs to be resolved by a human (you); we discuss this situation further down.
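For a file that already exists in your working copy and receives changes from the server, the choice between the U, G and C codes described above comes down to two questions. The following Python sketch spells that out; update_code is a hypothetical helper written for this illustration, and it deliberately ignores the A, D and R codes, which concern items being added, deleted or replaced rather than modified.

```python
def update_code(locally_modified, changes_overlap):
    """Return the letter an update would report for an existing file that received changes.

    locally_modified: the working file has changes of your own.
    changes_overlap:  the incoming changes touch the same text as yours.
    """
    if not locally_modified:
        return "U"                 # nothing local to merge with: plain (U)pdate
    if changes_overlap:
        return "C"                 # overlapping edits: (C)onflict for a human to resolve
    return "G"                     # both sets of changes coexist: mer(G)ed


for modified, overlap in [(False, False), (True, False), (True, True)]:
    print(modified, overlap, update_code(modified, overlap))
```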
Make changes (svn add, svn rm, svn cp, svn mv)
Now you can get to work and make changes in your working copy. It's usually most convenient to create a "task" for yourself, such as writing a new feature, fixing a bug, etc.
What kinds of changes can you make to your working copy tree?
To make file changes, just use your normal editor, word processor, or whatever. A file needn't be in text-format; binary files work just fine.
There are at least four Subversion subcommands for making tree changes. Detailed help can be found with svn help, but here is an overview:
svn add foo
    Schedule foo to be added to the repository. When you next commit, foo will become a permanent child of its parent directory. Note that if foo is a directory, only the directory itself will be scheduled for addition. If you want to add its contents as well, pass the --recursive switch.
svn rm foo
    Schedule foo to be removed from the repository. If foo is a file, it immediately vanishes from the working copy - but it can be recovered with svn revert (see below). If foo is a directory, it is merely scheduled for deletion. After you commit, foo will no longer exist in the working copy or repository.
svn cp foo bar
    Create a new item bar as a duplicate of foo. bar is automatically scheduled for addition. When bar is added to the repository on the next commit, its copy-history is recorded (as having originally come from foo).
svn mv foo bar
    This command is exactly the same as running svn cp foo bar; svn rm foo. That is, bar is scheduled for addition as a copy of foo, and foo is scheduled for removal.
Let's amend our original statement: there are some use-cases that immediately commit tree changes to the repository. This usually happens when a subcommand is operating directly on a URL, rather than on a working-copy path. (In particular, specific uses of svn mkdir, svn cp, svn mv, and svn rm can work with URLs. See svn help on these commands for more details.)
Examine your changes (svn status, svn diff, svn revert)
So now you've finished your changes... or so you think. But what exactly did you change? How can you review them?
Subversion has been optimized to help you with this task, and is able to do many things without talking to the repository or network at all. In particular, your working copy contains a secret cached "pristine" copy of each file within the .svn area. Because of this, it can quickly show you how your working files have changed, or even allow you to undo your changes.
The svn status command is your friend; become intimate with it. You'll probably use svn status more than any other command.
If you run svn status at the top of your working copy with no arguments, it will detect all file and tree changes you've made. This example is designed to show all the different status codes that svn status can return. The text in [] is not printed by svn status.
    $ svn status
    _ L    ./abc.c               [svn has a lock in its .svn directory for abc.c]
    M      ./bar.c               [the content in bar.c has local modifications]
    _M     ./baz.c               [baz.c has property but no content modifications]
    ?      ./foo.o               [svn doesn't manage foo.o]
    !      ./foo.c               [svn knows foo.c but a non-svn program deleted it]
    ~      ./qux                 [versioned as dir, but is file, or vice versa]
    A  +   ./moved_dir           [added with history of where it came from]
    M  +   ./moved_dir/README    [added with history and has local modifications]
    D      ./stuff/fish.c        [this file is scheduled for deletion]
    A      ./stuff/things/bloo.h [this file is scheduled for addition]
In this output format svn status prints four columns of characters followed by several whitespace characters followed by a file or directory name. The first column tells the status of a file or directory and/or its contents. The codes printed here are:
_ file_or_dir
    file_or_dir has not been scheduled for addition or deletion, nor have its contents been modified if it is a file.
A file_or_dir
    file_or_dir has been scheduled for addition into the repository.
M file
    The contents of file have been modified.
D file_or_dir
    file_or_dir has been scheduled for deletion from the repository.
? file_or_dir
    file_or_dir is not under revision control. You can silence the question marks by either passing the --quiet (-q) option to svn status, or by setting the svn:ignore property on the parent directory (see Properties).
! file_or_dir
    file_or_dir is under revision control but the working copy is missing. This happens if the file or directory is removed using a non-Subversion command. A quick svn up or svn revert file_or_dir will restore the missing file from its cached pristine copy.
~ file_or_dir
    file_or_dir is under revision control as one kind of object, but what's actually on disk is some other kind. For example, Subversion might be expecting a file, but the user has removed the file and created a directory in its place, without using the svn rm or svn add commands.
The second column tells the status of a file's or directory's properties (see Properties). If an M appears in the second column, then the properties have been modified; otherwise a space is printed. If only the properties of a file or directory are modified, then you will get _M printed in the first and second columns. The leading _ is printed only to make it clear to the eye that the properties are modified and not the contents.
The third column shows either a space or an L, which means that svn has locked the item in the .svn working area. You will see L if you run svn status in a directory where svn commit is currently running, for example while you are editing the log message. If no svn process is running, then presumably svn was forcibly quit or died, and the lock needs to be cleaned up by running svn cleanup. Locks typically appear if a Subversion command is interrupted before completion.
The fourth column will only show whitespace or a +, which means that the file or directory is scheduled to be added or modified with additional attached history. This typically happens when you svn mv or svn cp a file or directory. If you see A  +, this means the item is scheduled for addition-with-history. It could be a file, or the root of a copied directory. _  + means the item is part of a subtree scheduled for addition-with-history, i.e. some parent got copied, and it's just coming along for the ride. M  + means the item is part of a subtree scheduled for addition-with-history, and it has local mods. When you commit, first some parent will be added-with-history (copied), which means this file will automatically exist in the copy. Then the local mods will be uploaded into the copy.
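To make the four-column layout concrete, here is a minimal Python sketch that splits one line of svn status output into its columns. It assumes each column is exactly one character wide, in the order described above, and that the path starts after the fourth character. The function name parse_status_line is hypothetical and not part of Subversion.

```python
def parse_status_line(line):
    """Split one line of `svn status` output into its four status columns.

    Assumption: columns are one character each (content, properties,
    lock, history) and the path follows after whitespace.
    """
    flags = line[:4].ljust(4)          # pad short lines so indexing is safe
    return {
        "content": flags[0],           # one of _ M A D ? ! ~
        "props_modified": flags[1] == "M",
        "locked": flags[2] == "L",
        "with_history": flags[3] == "+",
        "path": line[4:].strip(),
    }


if __name__ == "__main__":
    for sample in ("_M      ./baz.c", "A  +    ./moved_dir", "_ L     ./abc.c"):
        print(parse_status_line(sample))
```

A script built on this would still need to cope with paths that begin with whitespace, which the sketch deliberately ignores.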
By default, svn status ignores files matching the patterns *.o, *.lo, *.la, #*#, *.rej, *~, and .#*. If you want additional files ignored, set the svn:ignore property on the parent directory. If you want to see the status of all the files in your working copy, irrespective of the default patterns and of svn:ignore, then use the --no-ignore command line option.
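As an illustration of what the default ignore list covers, the following Python sketch checks file names against the patterns listed above. It assumes the patterns behave like shell globs matched against the bare file name, which is how *.o and *~ read. The helper is_ignored and its extra_patterns parameter are hypothetical; this is not Subversion's own matching code.

```python
from fnmatch import fnmatchcase

# Default patterns as listed in the text above.
DEFAULT_IGNORES = ["*.o", "*.lo", "*.la", "#*#", "*.rej", "*~", ".#*"]


def is_ignored(filename, extra_patterns=()):
    """Return True if filename matches a default or extra ignore pattern.

    Assumption: shell-glob matching against the bare file name.
    extra_patterns stands in for what an svn:ignore property might add.
    """
    patterns = list(DEFAULT_IGNORES) + list(extra_patterns)
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


if __name__ == "__main__":
    for name in ("foo.o", "foo.c~", "foo.c", "#foo.c#", ".#foo.c"):
        print(name, is_ignored(name))
```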
If a single path is passed to the command, it will tell you about it:
    $ svn status stuff/fish.c
    D       stuff/fish.c
This command also has a --verbose (-v) mode, which will show you the status of every item in your working copy:
    $ svn status -v
    M               44        23    joe       ./README
    _               44        30    frank     ./INSTALL
    M               44        20    frank     ./bar.c
    _               44        18    joe       ./stuff
    _               44        35    mary      ./stuff/trout.c
    D               44        19    frank     ./stuff/fish.c
    _               44        21    mary      ./stuff/things
    A                0         ?     ?        ./stuff/things/bloo.h
    _               44        36    joe       ./stuff/things/gloo.c
This is the "long form" output of svn status. The first column is still the same. The second column shows the working revision of the item. The third and fourth columns show the revision in which the item last changed, and who changed it.
Finally, there is a --show-updates (-u) switch, which contacts the repository and adds information about things that are out-of-date:
    $ svn status -u -v
    M      *        44        23    joe       ./README
    M               44        20    frank     ./bar.c
    _      *        44        35    mary      ./stuff/trout.c
    D               44        19    frank     ./stuff/fish.c
    A                0         ?     ?        ./stuff/things/bloo.h
Notice the two asterisks: if you were to run svn up at this point, you would receive changes to README and trout.c. Hmmm, better be careful. You'll need to absorb those server-changes on README before you commit, lest the repository reject your commit for being out-of-date. (More on this subject below.)
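If you want to check for out-of-date items in a script, a sketch along the following lines may help. It assumes the verbose layout shown above, where the last four whitespace-separated fields are working revision, last-changed revision, author and path, and it therefore does not handle paths containing spaces. The function name out_of_date_paths is hypothetical.

```python
def out_of_date_paths(status_lines):
    """Return paths flagged with '*' in `svn status -u -v` style output.

    Assumptions: the last four whitespace-separated fields are working
    revision, last-changed revision, author and path (so paths contain
    no spaces); anything before them is status flags.
    """
    stale = []
    for line in status_lines:
        fields = line.split()
        if len(fields) < 5:
            continue                    # not a verbose status line
        flags, path = fields[:-4], fields[-1]
        if "*" in flags:
            stale.append(path)
    return stale


if __name__ == "__main__":
    sample = [
        "M      *        44        23    joe       ./README",
        "M               44        20    frank     ./bar.c",
        "_      *        44        35    mary      ./stuff/trout.c",
    ]
    print(out_of_date_paths(sample))
```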
Another way to examine your changes is with the svn diff command. You can find out exactly how you've modified things by running svn diff with no arguments, which prints out file changes in unified diff format:
    $ svn diff
    Index: ./bar.c
    ===================================================================
    --- ./bar.c
    +++ ./bar.c     Mon Jul 15 17:58:18 2002
    @@ -1,7 +1,12 @@
    +#include <sys/types.h>
    +#include <sys/stat.h>
    +#include <unistd.h>
    +
    +#include <stdio.h>
     int main(void) {
    -  printf("Sixty-four slices of American Cheese...\n");
    +  printf("Sixty-five slices of American Cheese...\n");
     return 0;
     }
    Index: ./README
    ===================================================================
    --- ./README
    +++ ./README    Mon Jul 15 17:58:18 2002
    @@ -193,3 +193,4 @@
    +Note to self: pick up laundry.
    Index: ./stuff/fish.c
    ===================================================================
    --- ./stuff/fish.c
    +++ ./stuff/fish.c      Mon Jul 15 17:58:18 2002
    -Welcome to the file known as 'fish'.
    -Information on fish will be here soon.
    Index: ./stuff/things/bloo.h
    ===================================================================
    --- ./stuff/things/bloo.h
    +++ ./stuff/things/bloo.h       Mon Jul 15 17:58:18 2002
    +Here is a new file to describe
    +things about bloo.
The svn diff command produces this output by comparing your working files against the cached "pristine" copies within the .svn area. Files scheduled for addition are displayed as all added text, and files scheduled for deletion are displayed as all deleted text.
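The comparison against a pristine copy can be imitated with Python's standard difflib module. This is only a sketch of the idea: local_diff is a hypothetical helper, the two texts are passed in as strings rather than read from the .svn area, and the output omits the Index: header and timestamp that svn diff prints.

```python
import difflib


def local_diff(pristine_text, working_text, path):
    """Return unified-diff lines between a pristine copy and a working file."""
    return list(
        difflib.unified_diff(
            pristine_text.splitlines(),
            working_text.splitlines(),
            fromfile=path,
            tofile=path,
            lineterm="",
        )
    )


if __name__ == "__main__":
    pristine = "int main(void) {\n  return 64;\n}\n"
    working = pristine.replace("64", "65")
    print("\n".join(local_diff(pristine, working, "./bar.c")))
```

Passing an empty string as the pristine text yields a diff made up only of added lines, which mirrors how a file scheduled for addition is displayed.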
Now suppose you see this output, and realize that your changes to README are a mistake; perhaps you accidentally typed that text into the wrong file in your editor.
The svn revert command is exactly for this purpose. It throws away all changes to your file:
    $ svn revert README
    Reverted ./README
The file is reverted to its pre-modified state by overwriting it with the cached "pristine" copy. But also note that svn revert can undo any scheduled operations, in case you decide that you don't want to add a new file after all, or that you don't want to remove something.
A final reminder: all three of these commands (svn status, svn diff, svn revert) can be used without any network access (except for the -u switch to status). This makes it easy to manage your changes-in-progress while traveling on an airplane, etc.
We've already seen how svn status -u can predict conflicts. Suppose you run svn update and some interesting things occur:
    $ svn up
    U ./INSTALL
    G ./README
    C ./bar.c
The U and G codes are nothing to sweat about; those files cleanly absorbed changes from the repository. The G stands for mer(G)ed, which means that the file had local changes to begin with, but the repository changes didn't overlap in any way.
But the C stands for conflict. This means that the server's changes overlapped with your own, and now you have to manually choose between them.
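A wrapper script could pick the conflicted files out of the update output with something like the sketch below. The code table simply restates the meanings given above, and conflicted_paths is a hypothetical helper that assumes each output line is a one-letter code followed by a path.

```python
UPDATE_CODES = {
    "U": "updated cleanly from the repository",
    "G": "merged: local changes kept, no overlap with repository changes",
    "C": "conflict: repository changes overlap with local changes",
}


def conflicted_paths(update_lines):
    """Return the paths that an `svn up` run reported with code C."""
    conflicts = []
    for line in update_lines:
        parts = line.split(None, 1)    # code, then the rest of the line
        if len(parts) == 2 and parts[0] == "C":
            conflicts.append(parts[1].strip())
    return conflicts


if __name__ == "__main__":
    print(conflicted_paths(["U ./INSTALL", "G ./README", "C ./bar.c"]))
```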
Whenever a conflict occurs:
- A C is printed during the update, and Subversion remembers that the file is "conflicted."
- Three new fulltext files appear next to the conflicted file. For a conflicted file named ORIG_NAME, the three new fulltext files have filenames of the form ORIG_NAME.*.mine, ORIG_NAME.*.rOLD_REV and ORIG_NAME.*.rNEW_REV. Here * represents some random digits that Subversion chooses. ORIG_NAME.*.mine is a copy of the file as it existed in your local working copy before the merge, without any conflict markers. ORIG_NAME.*.rOLD_REV is the original version of ORIG_NAME at the revision that your working copy is based on, with OLD_REV replaced by the specific revision number. ORIG_NAME.*.rNEW_REV is the version of ORIG_NAME that the file is being merged to, again with NEW_REV replaced by the specific revision number.
At this point, Subversion will not allow you to commit the file until the three temporary files are removed.
If you get a conflict, you need to either (1) hand-merge the conflicted text (by examining and editing the conflict markers within the file), (2) copy one of the tmpfiles on top of your working file, or (3) run svn revert to toss all of your changes.
Once you've resolved the conflict, you need to let Subversion know by removing the three tmpfiles. (The svn resolve command, by the way, is a shortcut that does nothing but automatically remove the three tmpfiles for you.) When the tmpfiles are gone, Subversion no longer considers the file to be in a state of conflict.
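The naming scheme for the tmpfiles lends itself to a simple check. The Python sketch below works on a plain list of file names rather than a real working copy, and assumes names of the form ORIG_NAME.<digits>.mine and ORIG_NAME.<digits>.r<REV> as described above. Both helper names are hypothetical, as are the digits 23451 used in the example.

```python
import re


def conflict_tmpfiles(orig_name, filenames):
    """Return the names that look like conflict tmpfiles for orig_name.

    Assumption: names follow ORIG_NAME.<digits>.mine and
    ORIG_NAME.<digits>.r<REV>, as described in the text above.
    """
    pattern = re.compile(
        r"^" + re.escape(orig_name) + r"\.\d+\.(mine|r\d+)$"
    )
    return sorted(name for name in filenames if pattern.match(name))


def still_conflicted(orig_name, filenames):
    """A file counts as conflicted while any of its tmpfiles remain."""
    return bool(conflict_tmpfiles(orig_name, filenames))


if __name__ == "__main__":
    listing = ["bar.c", "bar.c.23451.mine", "bar.c.23451.r2", "bar.c.23451.r3"]
    print(conflict_tmpfiles("bar.c", listing))
```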
Finally! Your edits are finished, you've merged all updates from the server, and you're ready to commit your changes.
The svn commit command sends all (or some) of your changes to the repository. When you commit a change, you need to supply a log message, describing your change. Your log message will be permanently attached to the new revision you create.
    $ svn commit -m "Added include lines and corrected # of cheese slices."
    Sending     bar.c
    Transmitting file data .
    Committed revision 3.
    $
Another way to specify a log message is to place it in a file, and pass the filename with the -F switch. If you fail to specify either the -m or -F switch, then Subversion will automatically launch your favorite $EDITOR for composing a log message.
The repository doesn't know or care if your changes make any sense as a whole; it only checks to make sure that nobody else has changed any of the same files that you did when you weren't looking. If somebody has done that, the entire commit will fail with a message informing you that one or more of your files is out-of-date. At this point, you need to run svn update again, deal with any merges or conflicts that result, and attempt your commit again.
That covers the most basic work cycle for using Subversion. Run svn help commandname for help on any of the commands covered in this section.
As we mentioned earlier, the repository is like a time machine. It remembers every revision ever committed, and allows you to explore this history.
There are two commands that mine historical data from the repository. svn log shows you broad information: log messages attached to revisions, and which paths changed in each revision. svn diff, on the other hand, can show you the specific details of how a file changed over time.
svn log
To find out information about the history of a file or directory, you use the svn log command. svn log will tell you who made changes to a file and at what revision, the time and date of that revision, and the log message that accompanied the commit.
    $ svn log
    ------------------------------------------------------------------------
    rev 3:  fitz | Mon, 15 Jul 2002 18:03:46 -0500 | 1 line
    Added include lines and corrected # of cheese slices.
    ------------------------------------------------------------------------
    rev 2:  someguy | Mon, 15 Jul 2002 17:47:57 -0500 | 1 line
    Added main() methods.
    ------------------------------------------------------------------------
    rev 1:  fitz | Mon, 15 Jul 2002 17:40:08 -0500 | 2 lines
    Initial import
    ------------------------------------------------------------------------
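If you need the header fields in a script, a regular expression is usually enough. The following Python sketch assumes the header layout shown in the sample output above ("rev N: author | date | N line(s)"); other Subversion versions may print a different layout, and parse_log_header is a hypothetical helper.

```python
import re

# Header format as shown in the sample output above.
LOG_HEADER = re.compile(
    r"^rev (?P<rev>\d+):\s+(?P<author>\S+) \| (?P<date>[^|]+?) \| (?P<lines>\d+) lines?$"
)


def parse_log_header(line):
    """Parse one `svn log` header line into a dict, or return None."""
    match = LOG_HEADER.match(line.strip())
    if match is None:
        return None
    return {
        "revision": int(match.group("rev")),
        "author": match.group("author"),
        "date": match.group("date").strip(),
        "message_lines": int(match.group("lines")),
    }


if __name__ == "__main__":
    print(parse_log_header("rev 3:  fitz | Mon, 15 Jul 2002 18:03:46 -0500 | 1 line"))
```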
Note that the log messages are printed in reverse chronological order by default. If you wish to see a different range of revisions in a particular order, or just a single revision, pass the --revision (-r) switch:
    $ svn log -r 5:19
    ...             # shows logs 5 through 19 in chronological order
    $ svn log -r 19:5
    ...             # shows logs 5 through 19 in reverse order
    $ svn log -r 8
    ...
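The direction of a revision range follows directly from the order of its two numbers. As a small sketch of that rule, the hypothetical Python helper below expands a -r argument into the order in which revisions would be shown; it handles only plain revision numbers.

```python
def revisions_in_order(spec):
    """Expand a -r argument such as '5:19', '19:5' or '8' into a list.

    Only plain revision numbers are handled here; the order of the two
    numbers decides the direction, as in the examples above.
    """
    if ":" not in spec:
        return [int(spec)]
    start, end = (int(part) for part in spec.split(":", 1))
    step = 1 if start <= end else -1
    return list(range(start, end + step, step))


if __name__ == "__main__":
    print(revisions_in_order("5:8"))
    print(revisions_in_order("8:5"))
```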
You can also examine the log history on a single file or directory. The commands
    $ svn log foo.c
    ...
    $ svn log http://foo.com/svn/trunk/code/foo.c
    ...
will display log messages only for those revisions in which the working file (or URL) changed.
And while we're on the subject, svn log also takes a --verbose (-v) option; it includes a list of changed paths in each revision:
    $ svn log -r 8 -v
    ------------------------------------------------------------------------
    rev 8:  jrandom | 2002-07-14 08:15:29 -0500 | 1 line
    Changed paths:
       U /trunk/code/foo.c
       U /trunk/code/bar.h
       A /trunk/code/doc/README
    Frozzled the sub-space winch.
    ------------------------------------------------------------------------
svn diff
We've already seen svn diff in a previous section; it displays file differences in unified diff format. Earlier, it was used to show the local modifications made to our working copy.
In fact, it turns out that there are three distinct uses of svn diff:
Invoking svn diff with no switches will compare your working files to the cached "pristine" copies in the .svn area:
    $ svn diff foo
    Index: ./foo
    ===================================================================
    --- ./foo
    +++ ./foo       Tue Jul 16 15:19:53 2002
    @@ -1 +1,2 @@
     An early version of the file
    +...extra edits
If a single --revision (-r) number is passed, then your working files are compared to a particular revision in the repository.
    $ svn diff -r 3 foo
    Index: ./foo
    ===================================================================
    --- ./foo
    +++ ./foo       Tue Jul 16 15:19:53 2002
    @@ -1,2 +1,2 @@
     An early version of the file
    -Second version of the file
    +...extra edits
If two revision numbers are passed via -r, then the two revisions are directly compared.
    $ svn diff -r 2:3 foo
    Index: ./foo
    ===================================================================
    --- ./foo
    +++ tmp.280.00001       Tue Jul 16 15:22:19 2002
    @@ -1 +1,2 @@
     An early version of the file
    +Second version of the file
If you read the help for svn diff, you'll discover that you can supply URLs instead of working copy paths as well. This is especially useful if you wish to inspect changes when you have no working copy available:
    $ svn diff -r 23:24 http://foo.com/some/project
    ...
Branches and tags are general concepts common to almost all revision control systems. If you're not familiar with these ideas, you can find a good introductory explanation in Karl Fogel's free CVS book: http://cvsbook.red-bean.com/cvsbook.html#Branching_Basics
At this point, you should understand how each commit creates an entire new filesystem tree in the repository. (If not, read about revisions: see Transactions and Revision Numbers, or Revision numbers are different now.)
As you may have suspected, the filesystem doesn't grow 652 new inodes each time a new revision is created. Instead, each new tree is mostly made of pointers to already-existing nodes; new nodes are created only for changed items, and all the rest of the revision tree is "shared storage" with other revision trees. This technique demonstrates how the filesystem is able to make "cheap copies" of things. These cheap copies are nothing more than directory entries that point to existing nodes. And this is the basis of tags and branches.
svn cp
Suppose we have a repository whose head tree is revision 82. In this repository is a subdirectory mooIRC that contains a software project that is ready to be tagged. How do we tag it? Very simple: make a "cheap" copy of this directory. In other words, create a new directory entry (somewhere else in the filesystem) that points to the specific node that represents directory mooIRC in revision 82. Of course, you can name the new directory entry whatever you want, probably a tag-name like mooIRC-beta.
The easiest way to make this copy is with svn cp, which, incidentally, can operate entirely on URLs, so that the copy happens only on the server-side:
    $ svn cp http://foo.com/repos/mooIRC \
             http://foo.com/repos/mooIRC-beta
    Committed revision 83.
Now, as long as you never touch the contents of the directory mooIRC-beta, that entry will forever point to a node that looks the way mooIRC did at a specific moment in time (however it looked in revision 82). And that's exactly what a tag is.
But suppose mooIRC-beta isn't sacred, and instead you decide to start making commits to it. And suppose you also continue to make commits in the original mooIRC directory. Then you have two directories that started out looking identical (their common ancestor was mooIRC in revision 82) but whose contents have diverged over time. In other words, they represent different branches of the project.
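The idea of a cheap copy that later diverges can be modelled in a few lines of Python. This is a toy model only: plain dictionaries stand in for directory nodes, the file name main.c and the helpers cheap_copy and commit_file are hypothetical, and nothing here reflects how the Subversion filesystem is actually implemented.

```python
def cheap_copy(root, src, dst):
    """Return a new root whose entry dst points at the same node as src."""
    new_root = dict(root)          # new top-level directory node
    new_root[dst] = root[src]      # same node object, nothing is duplicated
    return new_root


def commit_file(root, dirname, filename, text):
    """Return a new root in which dirname/filename holds the new text."""
    new_dir = dict(root[dirname])  # only the changed directory gets a new node
    new_dir[filename] = text
    new_root = dict(root)
    new_root[dirname] = new_dir
    return new_root


if __name__ == "__main__":
    r82 = {"mooIRC": {"main.c": "version 1"}}
    r83 = cheap_copy(r82, "mooIRC", "mooIRC-beta")        # the "tag"
    print(r83["mooIRC-beta"] is r82["mooIRC"])
    r84 = commit_file(r83, "mooIRC-beta", "main.c", "version 2")
    print(r84["mooIRC"] is r82["mooIRC"])
```

Both checks print True: the tag shares its node with the original directory, and a later commit on the copy leaves the original node untouched, which is the point at which the copy starts behaving like a branch.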
It's very important to note that the Subversion filesystem is not aware of "tags" or "branches." It's only aware of directories, and all directories are equal. The tag and branch concepts are purely human meanings attached to particular directories.
For this reason, it's up to users (and the Subversion repository administrator) to choose sane policies that help elucidate these labels. For example, here's a good way to lay out your repository:
    /
    /projectA
    /projectA/trunk/
    /projectA/branches/
    /projectA/tags/
    /projectB
    /projectB/trunk/
    /projectB/branches/
    /projectB/tags/
Each time /projectA/trunk reaches a taggable state, make a copy of the directory somewhere in /projectA/tags/, and set the copy to read-only. Use the same procedure to create a branch in /projectA/branches/.
An alternate way to lay out a repository:
    /
    /trunk
    /trunk/projectA
    /trunk/projectB
    /branches
    /branches/projectA
    /branches/projectB
    /tags
    /tags/projectA
    /tags/projectB
Or, of course, you could just place each project into a dedicated repository. It's up to you. For examples of how to create a repository with one of these structures, see Creating a repository.
svn switch
The svn switch command allows you to "move" some or all of your working copy to a branch or tag. For example, suppose I have a working copy of mooIRC, and I'd like to work on some subsystem as it appears in a subdirectory of mooIRC-beta. At the same time, I want the rest of my working copy to remain on the original mooIRC branch. To do this, I switch the appropriate subdir to the new branch location:
    $ svn switch http://foo.com/repos/mooIRC-beta/subsystems/renderer \
                 mooIRC/subsystems/renderer
    U mooIRC/subsystems/renderer/foo.c
    U mooIRC/subsystems/renderer/bar.h
    U mooIRC/subsystems/renderer/baz.c
Now my working copy of the renderer subdirectory represents a different location on the server.
Really, svn switch is just a fancier version of svn update. Whereas svn update has the ability to move your working copy through time (either by updating to the latest revision, or by updating to a specific revision given with -r), svn switch is able to move your working copy through time and space.
If your working copy contains a number of "switched" subtrees from different repository locations, it continues to function as normal. When you update, you'll receive patches to each subtree as appropriate. When you commit, your local changes will still be applied as a single, atomic change to the repository.
svn merge
Suppose a team of programmers working on the mooIRC-beta branch have fixed a critical bug, and the team working on the original mooIRC branch would like to apply that change as well.
The svn merge command is the answer. You can think of svn merge as a special kind of svn diff; only instead of displaying unified diffs to the screen, it applies the differences to your working copy as if they were local changes.
For example, suppose the bug fix happened in a commit to the mooIRC-beta branch in revision 102.
    $ svn diff -r 101:102 http://foo.com/repos/mooIRC-beta
    ...   # diffs sent to screen
    $ svn merge -r 101:102 http://foo.com/repos/mooIRC-beta mooIRC
    U mooIRC/glorb.c
    U mooIRC/src/floo.h
While the output of svn merge looks similar to that of svn update or svn switch, it is in fact only applying temporary changes to the working files. Once the differences are applied as local changes, you can examine them as usual with svn diff and svn status, or undo them with svn revert. If the changes are acceptable, you can commit them.
svn merge
Another common use for svn merge is for rolling back a change that has been committed. Say you commit some changes in revision 10, and later decide that they were a mistake. You can easily revert the tree to the state it was in at revision 9 with an svn merge command.
    $ svn commit -m "change some stuff"
    Sending     bar.c
    Sending     foo.c
    Transmitting file data ..
    Committed revision 10.
    $ ...  # developer continues on and realizes he made a mistake
    $ svn merge -r 10:9 .
    U ./bar.c
    U ./foo.c
    $ svn commit -m "oops, reverting revision 10"
    Sending     bar.c
    Sending     foo.c
    Transmitting file data ..
    Committed revision 11.
If you aren't rolling back the changes to your current directory (say you want to roll back one specific file, or all the files in one specific subdirectory), then the syntax is slightly different, as you have to tell svn merge where it should merge the changes into.
    $ svn merge -r 10:9 baz/ baz/
    U ./baz/bar.c
    U ./baz/foo.c
    $ svn commit -m "reverting revision 10's changes in baz/"
    Sending     baz/bar.c
    Sending     baz/foo.c
    Transmitting file data ..
    Committed revision 12.
    $ ...  # developer continues on and later makes another mistake
    $ svn merge -r 13:12 baz/foo.c baz/foo.c
    U ./baz/foo.c
    $ svn commit -m "reverting revision 13's change to foo.c"
    Sending     baz/foo.c
    Transmitting file data .
    Committed revision 15.
Keep in mind that rolling back a change like this is just like any other svn merge operation, so you should use svn status and svn diff to confirm that your work is in the state you want it to be in, and then use svn commit to send the final version to the repository.
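The pattern behind a rollback is always the same: merge the range N:N-1. The hypothetical Python helper below only builds the argument list for such a command, following the two invocation forms shown above; it does not run svn, and you would still review the result with svn status and svn diff before committing.

```python
def rollback_command(revision, target="."):
    """Build the argv for reverting the change committed in `revision`.

    The range N:N-1 asks for the differences that lead from revision N
    back to N-1. A target other than "." is passed twice, mirroring the
    examples above (source of the diff, then where to merge it).
    """
    if revision < 1:
        raise ValueError("revision must be at least 1")
    argv = ["svn", "merge", "-r", f"{revision}:{revision - 1}"]
    argv += [target] if target == "." else [target, target]
    return argv


if __name__ == "__main__":
    print(" ".join(rollback_command(10)))
    print(" ".join(rollback_command(10, "baz/")))
```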
Sometimes you want to manage modified third-party source code inside your Subversion repository, while still tracking upstream releases. In CVS this would have been called a "vendor branch". Subversion doesn't have a formal "vendor branch", but it is sufficiently flexible that you can still do much the same thing.
The general procedure goes like this. You create a top level directory (we'll use /vendor) to hold the vendor branches. Then you import the third party code into a subdirectory of /vendor, and copy it into /trunk where you make your local changes. With each new release of the code you are tracking, you bring it into the vendor branch and merge the changes into /trunk, resolving whatever conflicts occur between your local changes and the upstream changes.
Let's try and make this a bit clearer with an example.
First, the initial import.
    $ svn mkdir http://svnhost/repos/vendor/foobar
    $ svn import http://svnhost/repos/vendor/foobar ~/foobar-1.0 current
Now we've got the current version of the foobar project in /vendor/foobar/current. We make another copy of it so that we can always refer to that version, and then copy it into the trunk so that we can work on it.
    $ svn copy http://svnhost/repos/vendor/foobar/current \
               http://svnhost/repos/vendor/foobar/foobar-1.0 \
               -m 'tagging foobar-1.0'
    $ svn copy http://svnhost/repos/vendor/foobar/foobar-1.0 \
               http://svnhost/repos/trunk/foobar \
               -m 'bringing foobar-1.0 into trunk'
Now you just check out a copy of /trunk/foobar and get to work! Later on, the developers at FooBar Widgets, Inc release a new version of their code, so you want to update the version of the code you're using. First, you check out the /vendor/foobar/current directory, then copy the new release over that working copy, handle any renames, additions or removals manually, and then commit.
    $ svn co http://svnhost/repos/vendor/foobar/current ~/current
    $ cd ~/foobar-1.1
    $ tar -cf - . | (cd ~/current ; tar -xf -)
    $ cd ~/current
    $ mv foobar.c main.c
    $ svn mv main.c foobar.c
    $ svn rm dead.c
    $ svn add doc
    $ svn add doc/*
    $ svn commit -m 'importing foobar 1.1 on vendor branch'
Whoa, that was complicated. Don't worry, most cases are far simpler.
What happened? foobar 1.0 had a file called main.c. This file was renamed to foobar.c in 1.1. So your working copy had the old main.c, which svn knew about, and the new foobar.c, which svn did not know about. You rename foobar.c to main.c and svn mv it back to the new name. This way, svn will know that foobar.c is a descendant of main.c. dead.c has vanished in 1.1, and they have finally written some documentation, so you add that.
Next you copy /vendor/foobar/current to /vendor/foobar/foobar-1.1 so you can always refer back to version 1.1, like this.
    $ svn copy http://svnhost/repos/vendor/foobar/current \
               http://svnhost/repos/vendor/foobar/foobar-1.1 \
               -m 'tagging foobar-1.1'
Now that you have a pristine copy of foobar 1.1 in /vendor, you just have to merge their changes into /trunk and you're done. That looks like this.
    $ svn co http://svnhost/repos/trunk/foobar ~/foobar
    $ cd ~/foobar
    $ svn merge http://svnhost/repos/vendor/foobar/foobar-1.0 \
                http://svnhost/repos/vendor/foobar/foobar-1.1
    $ ...  # resolve all the conflicts between their changes and your changes
    $ svn commit -m 'merging foobar 1.1 into trunk'
There, you're done. You now have a copy of foobar 1.1 with all your local changes merged into it in your tree.
Vendor branches that have more than a few deletes, additions and moves can use the svn_load_dirs.pl script that comes with the Subversion distribution. This script automates the above importing steps to make sure that mistakes are minimized. You still need to use the merge commands to merge the new versions of foobar into your own local copy containing your local modifications.
This script has the following enhancements over svn import: it runs svn add, svn rm and, optionally, any svn mv commands as necessary.
This script takes care of complications where Subversion requires a commit before renaming a file or directory twice, such as if you had a vendor branch that renamed foobar-1.1/docs/doc.ps to foobar-1.2/documents/doc-1.2.ps. Here, you would rename docs to documents, perform a commit, then rename doc.ps to doc-1.2.ps. You could not do the two renames without the commit, because doc.ps was already moved once from docs/doc.ps to documents/doc.ps.
This script always compares the directory being imported to what currently exists in the Subversion repository and takes the necessary steps to add, delete and rename files and directories to make the Subversion repository match the imported directory. As such, it can be used on an empty Subversion directory for the first import or for any following imports to upgrade a vendor branch.
For the first foobar-1.0 release located in ~/foobar-1.0:
    $ svn_load_dirs.pl -t foobar-1.0 \
                       http://svnhost/repos/vendor/foobar \
                       current \
                       ~/foobar-1.0
svn_load_dirs.pl takes three mandatory arguments. The first argument, <http://svnhost/repos/vendor/foobar>, is the URL to the base Subversion directory to work in. In this case, we're working in the vendor/foobar part of the Subversion repository. The next argument, current, is relative to the first and is the directory where the current import will take place, in this case <http://svnhost/repos/vendor/foobar/current>. The last argument, ~/foobar-1.0, is the directory to import. Finally, the optional -t command line option is also relative to <http://svnhost/repos/vendor/foobar> and tells svn_load_dirs.pl to create a tag of the imported directory in <http://svnhost/repos/vendor/foobar/foobar-1.0>.
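Since both the import directory and the tag are relative to the base URL, the locations the script will touch can be worked out ahead of time. The Python sketch below does just that string composition; load_dirs_urls is a hypothetical helper and does not call svn_load_dirs.pl.

```python
def load_dirs_urls(base_url, import_dir, tag=None):
    """Return the repository URLs an import would touch.

    Both import_dir and the optional tag are treated as relative to
    base_url, as described above.
    """
    base = base_url.rstrip("/")
    urls = {"import": f"{base}/{import_dir}"}
    if tag is not None:
        urls["tag"] = f"{base}/{tag}"
    return urls


if __name__ == "__main__":
    print(load_dirs_urls("http://svnhost/repos/vendor/foobar", "current", "foobar-1.0"))
```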
The import of foobar-1.1 would be taken care of in the same way:
    $ svn_load_dirs.pl -t foobar-1.1 \
                       http://svnhost/repos/vendor/foobar \
                       current \
                       ~/foobar-1.1
The script looks in your current <http://svnhost/repos/vendor/foobar/current> directory and sees what changes need to take place for it to match ~/foobar-1.1. The script is kind enough to notice that there are files and directories that exist in 1.0 and not in 1.1 and asks if you want to perform any renames. At this point, you can indicate that main.c was renamed to foobar.c and then indicate that no further renames have taken place.
The script will then delete dead.c and add doc and doc/* to the Subversion repository and finally create a tag foobar-1.1 in <http://svnhost/repos/vendor/foobar/foobar-1.1>.
The script also accepts a separate configuration file for applying properties to newly added files and directories whose names match a regular expression. The script will not modify properties of files or directories that already exist in the repository. This configuration file is specified to svn_load_dirs.pl using the -p command line option. The format of the file is either two or four columns.
    regular_expression control property_name property_value
The regular_expression is a Perl style regular expression. The control column must either be set to break or cont. It is used to tell svn_load_dirs.pl if the following lines in the configuration file should be examined for a match or if all matching should stop. If control is set to break, then no more lines from the configuration file will be matched. If control is set to cont, which is short for continue, then more comparisons will be made. Multiple properties can be set for one file or directory this way. The last two columns, property_name and property_value, are optional and are applied to matching files and directories.
If you have whitespace in any of the regular_expression, property_name or property_value columns, you must surround the value with either a single or double quote. You can protect single or double quotes with a \ character. The \ character is removed by this script only for whitespace or quote characters, so you do not need to protect any other characters, beyond what you would normally protect for the regular expression.
This sample configuration file was used to load on a Unix box a number of Zip files containing Windows files with CRLF end of lines.
    \.doc$              break   svn:mime-type   application/msword
    \.ds(p|w)$          break   svn:eol-style   CRLF
    \.ilk$              break   svn:eol-style   CRLF
    \.ncb$              break   svn:eol-style   CRLF
    \.opt$              break   svn:eol-style   CRLF
    \.exe$              break   svn:mime-type   application/octet-stream
    dos2unix-eol\.sh$   break
    .*                  break   svn:eol-style   native
In this example, all the files should be converted to the native end of line style, which the last line of the configuration handles. The exception is dos2unix-eol.sh, which contains embedded CR's used to find and replace Windows CRLF end of line characters with Unix's LF characters. Since svn and svn_load_dirs.pl convert all CR, CRLF and LF's to the native end of line style when svn:eol-style is set to native, this file should be left untouched. Hence, the break with no property settings.
The Windows Visual C++ and Visual Studio files (*.dsp, *.dsw, etc.) should retain their CRLF line endings on any operating system and any *.doc files are always treated as binary files, hence the svn:mime-type setting of application/msword.
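To see how break and cont interact, the rule matching can be sketched in Python. This is an illustration of the behaviour described above, not the script's actual code: the rules are written out as tuples instead of being parsed from a file, the patterns are applied with re.search to the path (an assumption about how the script matches), and properties_for is a hypothetical name.

```python
import re

# (regular_expression, control, property_name, property_value),
# transcribed from the sample configuration above.
RULES = [
    (r"\.doc$", "break", "svn:mime-type", "application/msword"),
    (r"\.ds(p|w)$", "break", "svn:eol-style", "CRLF"),
    (r"\.ilk$", "break", "svn:eol-style", "CRLF"),
    (r"\.ncb$", "break", "svn:eol-style", "CRLF"),
    (r"\.opt$", "break", "svn:eol-style", "CRLF"),
    (r"\.exe$", "break", "svn:mime-type", "application/octet-stream"),
    (r"dos2unix-eol\.sh$", "break", None, None),
    (r".*", "break", "svn:eol-style", "native"),
]


def properties_for(path, rules=RULES):
    """Collect the properties the rules would apply to a newly added path."""
    props = {}
    for pattern, control, name, value in rules:
        if not re.search(pattern, path):
            continue                   # rule does not match, try the next one
        if name is not None:
            props[name] = value
        if control == "break":
            break                      # stop matching further rules
    return props


if __name__ == "__main__":
    for path in ("report.doc", "app.dsw", "dos2unix-eol.sh", "main.c"):
        print(path, properties_for(path))
```

With these rules, dos2unix-eol.sh ends up with no properties at all because its rule breaks before the catch-all .* line is reached.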
svn rm
The svn rm command can operate on URLs. A file or directory can be "remotely" deleted from the repository, with no working copy present:
    $ svn rm http://foo.com/repos/tags/mooIRC-bad-tag -m "deleting bad tag"
    Committed revision 1023.
Of course, this is still a form of immediate commit, so some kind of log message is still required.
Enough said!
Subversion allows you to attach arbitrary "metadata" to files and directories. We refer to this data as properties, and they can be thought of as collections of name/value pairs (hash-tables) attached to each item in your working copy.
To set or get a property on a file or directory, use the svn propset and svn propget commands. To list all properties attached to an item, use svn proplist. To delete a property, use svn propdel.
    $ svn propset color green foo.c
    property `color' set on 'foo.c'
    $ svn propget color foo.c
    green
    $ svn propset height "5 feet" foo.c
    property `height' set on 'foo.c'
    $ svn proplist foo.c
    Properties on 'foo.c':
      height
      color
    $ svn proplist foo.c --verbose
    Properties on 'foo.c':
      height : 5 feet
      color : green
    $ svn propdel color foo.c
    property `color' deleted from 'foo.c'
Properties are versioned, just like file contents. This means that new properties can be merged into your working files, and can sometimes come into conflict too. Property values need not be text, either. For example, you could attach a binary property-value by using the -F switch:
    $ svn propset x-face -F joeface.jpg foo.c
    property `x-face' set on 'foo.c'
Subversion also provides a great convenience method for editing existing properties: svn propedit. When you invoke it, Subversion will open the value of the property in question in your favorite editor (or at least the editor that you've defined as $EDITOR in your shell), and you can edit the value just as you would edit any text file. This is exceptionally convenient for properties that are a newline-separated array of values. (See below.)
Property changes are still considered "local modifications", and aren't permanent until you commit. Like textual changes, property changes can be seen by svn diff and svn status, and reverted altogether with svn revert:
    $ svn diff
    Property changes on: foo.c
    ___________________________________________________________________
    Name: color
       + green
    $ svn status
    _M   foo.c
Notice that a second column has appeared in the status output; the leading underscore indicates that you've not made any textual changes, but the M means you've modified the properties. svn status tries to hide the second "property" column when an item has no properties at all; this was a design choice, to ease new users into the concept. When properties are created, edited, or updated on an item, that second column appears forever after.
Also: don't worry about the non-standard way that Subversion currently displays property differences. You can still run svn diff and redirect the output to create a usable patch file. The patch program will ignore property patches; as a rule, it ignores any noise it can't understand. (In future versions of Subversion, though, we may start using a new patch format that describes property changes and file copies/renames.)
Subversion has no particular policy regarding properties; they can be
used for any purpose. The only restriction is that Subversion has
reserved the svn: name prefix for itself. A number of special
"magic" properties begin with this prefix. We'll cover these
features here.
svn:executable
This is a file-only property, and can be set to any value. Its mere existence causes a file's permissions to be executable.
svn:mime-type
At the present time, Subversion examines the svn:mime-type property
to decide if a file is text or binary. If the file has no
svn:mime-type property, or if the property's value matches
text/*, then Subversion assumes it is a text file. If the file
has the svn:mime-type property set to anything other than
text/*, it assumes the file is binary.
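The text-or-binary rule above can be restated as a small shell sketch. The function name classify_mime is hypothetical; it does not call Subversion and only applies the rule to a property value passed as its argument.

```shell
# Hypothetical helper: restates the text/binary rule described above.
# It does not call Subversion; it only classifies a property value.
classify_mime() {
    case "$1" in
        ""|text/*) echo text ;;    # no property, or any text/* value
        *)         echo binary ;;  # anything else is treated as binary
    esac
}

classify_mime ""                          # prints "text"
classify_mime "text/html"                 # prints "text"
classify_mime "application/octet-stream"  # prints "binary"
```

An empty argument stands in for "no svn:mime-type property at all".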
If Subversion believes that the file is binary, it will not attempt to
perform contextual merges during updates. Instead, Subversion creates
two files side-by-side in your working copy; the one containing your
local modifications is renamed with an .orig extension.
Subversion also helps users by running a binary-detection algorithm in
the svn import and svn add subcommands. These subcommands try to
make a good guess at a file's binary-ness, and then (possibly) set an
svn:mime-type property of application/octet-stream on the file
being added. (If Subversion guesses wrong, you can always remove or
hand-edit the property.)
Finally, if the svn:mime-type property is set, then mod_dav_svn will
use it to fill in the Content-type: header when responding to an
HTTP GET request. This makes files display more nicely when perusing
a repository with a web browser.
svn:ignore
If you attach this property to a directory, it causes certain file
patterns within the directory to be ignored by svn status.
For example, suppose I don't want to see object files or backup files
in my status listing:

    $ svn status
    M      ./foo.c
    ?      ./foo.o
    ?      ./foo.c~

Using svn propedit, I would set the value of svn:ignore
to a newline-delimited list of patterns:

    $ svn propget svn:ignore .
    *.o
    *~
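To get a feel for how such a pattern list selects files, here is a rough sketch using the shell's own glob matching. It only approximates what the client does: the function name is_ignored is hypothetical, and patterns containing spaces are not handled.

```shell
# Hypothetical helper: succeeds if NAME matches any of the
# newline-separated glob patterns in PATTERNS (like an svn:ignore value).
is_ignored() {
    name=$1
    patterns=$2
    set -f                          # stop the shell expanding the globs
    for pat in $patterns; do
        case "$name" in
            $pat) set +f; return 0 ;;   # unquoted: treated as a pattern
        esac
    done
    set +f
    return 1
}

patterns='*.o
*~'
for f in foo.c foo.o 'foo.c~'; do
    if is_ignored "$f" "$patterns"; then
        echo "$f: ignored"
    else
        echo "$f: shown"
    fi
done
```

With the two patterns from the example, only foo.c is reported as shown.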
svn:keywords
Subversion has the ability to substitute useful strings into special
"keywords" within text files. For example, if I placed this text
into a file:

    Here is the latest report from the front lines.
    $LastChangedDate$
    Cumulus clouds are appearing more frequently as summer approaches.

Subversion is able to substitute the $LastChangedDate$ string with
the actual date on which this file last changed. The keyword string is
not removed in the replacement; the specific information is simply
placed after the keyword string:

    Here is the latest report from the front lines.
    $LastChangedDate: 2002-07-22 21:42:37 -0700 (Mon, 22 Jul 2002) $
    Cumulus clouds are appearing more frequently as summer approaches.
Subversion substitutes five keywords:
LastChangedDate
Can be abbreviated as Date. The keyword substitution of
$LastChangedDate$ will look something like
$LastChangedDate: 2002-07-22 21:42:37 -0700 (Mon, 22 Jul 2002) $.
LastChangedRevision
Can be abbreviated as Rev. The keyword substitution of
$LastChangedRevision$ will look something like
$LastChangedRevision: 144 $.
LastChangedBy
Can be abbreviated as Author. The keyword substitution of
$LastChangedBy$ will look something like $LastChangedBy: joe $.
HeadURL
Can be abbreviated as URL. The keyword substitution of $HeadURL$
will look something like
$HeadURL: http://svn.collab.net/repos/trunk/README $.
Id
The keyword substitution of $Id$ will look something like
$Id: bar 148 2002-07-28 21:30:43 epg $. This means the file bar
was last changed in revision 148 by committer epg,
at 2002-07-28 21:30:43.
To activate a keyword, or set of keywords, you merely need to set the
svn:keywords property to a list of keywords you want replaced.
Keywords not listed in svn:keywords will not be replaced.

    $ svn propset svn:keywords "Date Author" foo.c
    property `svn:keywords' set on 'foo.c'

And when you commit this property change, you'll discover that all
occurrences of $Date$, $LastChangedDate$, $Author$, and
$LastChangedBy$ will have substituted values within foo.c.
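The effect of a substitution can be imitated with sed, which may help when reasoning about what a file will look like after a commit. This is only an illustration of the text transformation, not how Subversion implements it. The function name expand_keyword is hypothetical, and the value is assumed not to contain the characters |, & or \.

```shell
# Hypothetical helper: rewrite "$KEYWORD$" (or an already expanded
# "$KEYWORD: old value $") as "$KEYWORD: VALUE $" on standard input.
# Assumes VALUE contains none of the characters | & \
expand_keyword() {
    keyword=$1
    value=$2
    sed 's|\$'"$keyword"'\(: [^$]*\)\{0,1\}\$|$'"$keyword"': '"$value"' $|g'
}

printf '%s\n' 'Here is the latest report. $LastChangedDate$' |
    expand_keyword LastChangedDate '2002-07-22 21:42:37 -0700 (Mon, 22 Jul 2002)'
```

Because the pattern also matches an already expanded keyword, running the function again with a newer value replaces the old one rather than nesting it.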
svn:eol-style
By default, Subversion doesn't pay any attention to line endings. If a text file has either LF, CR, or CRLF endings, then those are the line endings that will exist on the file in both the repository and working copy.
But if developers are working on different platforms, line endings can sometimes become troublesome. For example, if a Win32 developer and Unix developer took turns modifying a file, its line endings might flip-flop back and forth from revision to revision in the repository. This makes examining or merging differences very difficult, as every line appears to be changed in each version of the file.
The solution here is to set the svn:eol-style property to
"native". This makes the file always appear with the "native"
line endings of each developer's operating system. Note, however,
that the file will always contain LF endings in the repository. This
prevents the line-ending "churn" from revision to revision.
Alternatively, you can force files to always retain a fixed, specific
line ending: set a file's svn:eol-style property to one of
LF, CR, or CRLF. A Win32 .dsp file, for
example, which is used by Microsoft development tools, should always
have CRLF endings.
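The kind of translation that "native" implies can be pictured with two small filters. These are illustrative stand-ins, not Subversion's code, and the names to_lf and to_crlf are hypothetical.

```shell
# Illustrative filters only; Subversion does this translation itself
# when svn:eol-style is set.

# CRLF -> LF. Deletes every CR, so it suits CRLF input, not CR-only files.
to_lf() {
    tr -d '\r'
}

# LF -> CRLF. Assumes the input uses LF endings only.
to_crlf() {
    awk '{ printf "%s\r\n", $0 }'
}

# Show the bytes before and after a round trip.
printf 'one\ntwo\n' | to_crlf | od -c
printf 'one\ntwo\n' | to_crlf | to_lf | od -c
```

The od -c output makes the otherwise invisible \r characters visible.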
svn:externals
See Modules.
Sometimes it's useful to construct a working copy that is made out of a number of different checkouts. For example, you may want different sub-directories to come from different locations in a repository.
On the one hand, you could begin by checking out a working copy, and
then run svn switch on various subdirectories. But this is
a bit of work. Wouldn't it be nice to define - in a single place -
exactly how you want the final working copy to be?
This is known as a module. You can define a module by attaching
another special "magic" svn: property to a directory: the
svn:externals property.
The value of this property is a list of subdirectories and
their corresponding URLs:

    $ svn propget svn:externals projectdir
    subdir1/foo       http://url.for.external.source/foo
    subdir1/bar       http://blah.blah.blah/repositories/theirproj
    subdir1/bar/baz   http://blorg.blorg.blorg/basement/code
Assuming that this property is attached to the directory
projectdir, then when we check it out, we'll get everything
else defined by the property.

    $ svn checkout http://foo.com/repos/projectdir
    A  projectdir/blah.c
    A  projectdir/gloo.c
    A  projectdir/trout.h
    Checked out revision 128.

    Fetching external item into projectdir/subdir1/foo
    A  projectdir/subdir1/foo/rho.txt
    A  projectdir/subdir1/foo/pi.txt
    A  projectdir/subdir1/foo/tau.doc
    Checked out revision 128.
    ...
By tweaking the value of the svn:externals property, the
definition of the module can change over time, and subsequent calls to
svn update will update working copies appropriately.
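To see which extra checkouts a given module definition implies, the property value can be read line by line. The sketch below only prints the equivalent commands, since the client performs the real checkouts itself. The function name list_externals is hypothetical, and it assumes the "subdirectory URL" line format shown above.

```shell
# Hypothetical helper: read an svn:externals value (one "subdir URL"
# pair per line) on standard input and print the checkout each implies.
list_externals() {
    base=$1
    while read -r subdir url; do
        [ -n "$subdir" ] || continue        # skip blank lines
        echo "svn checkout $url $base/$subdir"
    done
}

# Possible use against a real working copy:
#   svn propget svn:externals projectdir | list_externals projectdir
printf '%s\n' \
    'subdir1/foo http://url.for.external.source/foo' \
    'subdir1/bar http://blah.blah.blah/repositories/theirproj' |
    list_externals projectdir
```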
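As you may have noticed, many Subversion commands are able to process
the -r switch. Here we describe some special ways to specify
revisions.
The Subversion client understands a number of revision keywords.
These keywords can be used instead of integer arguments to the
-r switch, and are resolved into specific revision numbers:
HEAD
The latest revision in the repository.
BASE
The "pristine" revision of an item in a working copy.
COMMITTED
The last revision in which an item changed before (or at) BASE.
PREV
The revision just before the last revision in which an item changed.
(Technically, COMMITTED - 1.)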
Here are some examples of revision keywords in action:

    $ svn diff -r PREV:COMMITTED foo.c
    # shows the last change committed to foo.c

    $ svn log -r HEAD
    # shows log message for the latest repository commit

    $ svn diff -r HEAD
    # compares your working file (with local mods) to the latest version
    # in the repository.

    $ svn diff -r BASE:HEAD foo.c
    # compares your "pristine" foo.c (no local mods) with the latest version
    # in the repository

    $ svn log -r BASE:HEAD
    # shows all commit logs since you last updated

    $ svn update -r PREV foo.c
    # rewinds the last change on foo.c.
    # (foo.c's working revision is decreased.)
svn cleanup
When Subversion modifies your working copy (or any information within
.svn), it tries to do so as safely as possible. Before
changing anything, it writes its intentions to a logfile, then
executes the commands in the logfile. It's similar in design to a
journaled filesystem; if the user hits Control-C or if the machine
crashes, the logfiles are left lying around. By re-executing the
logfiles, the work can complete, and your working copy can get itself
back into a consistent state.
And this is exactly what svn cleanup does: it searches your
working copy and re-runs any leftover logs, removing locks in the
process. Use this command if Subversion ever tells you that some part
of your working copy is "locked". Also, svn status will
display an L next to locked items:

    $ svn st
      L    ./somedir
    M      ./somedir/foo.c

    $ svn cleanup
    $ svn st
    M      ./somedir/foo.c
svn info
In general, we try to discourage users from directly reading the
.svn/entries file used to track items. Instead, curiosity can
be quelled by using svn info to display most of the
tracked information:

    $ svn info client.texi
    Path: client.texi
    Name: client.texi
    Url: http://svn.collab.net/repos/svn/trunk/doc/handbook/client.texi
    Revision: 2548
    Node Kind: file
    Schedule: normal
    Last Changed Author: fitz
    Last Changed Rev: 2545
    Last Changed Date: 2002-07-15 23:03:54 -0500 (Mon, 15 Jul 2002)
    Text Last Updated: 2002-07-16 08:48:04 -0500 (Tue, 16 Jul 2002)
    Properties Last Updated: 2002-07-16 08:48:03 -0500 (Tue, 16 Jul 2002)
    Checksum: 8sfaU+5dqyOgkhuSdyxGrQ==
svn import
The import command is a quick way to move an unversioned tree of files into a repository.
There are two ways to use this command:
    $ svnadmin create /usr/local/svn/newrepos
    $ svn import file:///usr/local/svn/newrepos mytree
    Adding  mytree/foo.c
    Adding  mytree/bar.c
    Adding  mytree/subdir
    Adding  mytree/subdir/quux.h
    Transmitting file data....
    Committed revision 1.

The above example places the contents of directory mytree
directly into the root of the repository:

    /foo.c
    /bar.c
    /subdir
    /subdir/quux.h
If you give svn import a third argument, it will use the
argument as the name of a new subdirectory to create within the URL.

    $ svnadmin create /usr/local/svn/newrepos
    $ svn import file:///usr/local/svn/newrepos mytree fooproject
    Adding  mytree/foo.c
    Adding  mytree/bar.c
    Adding  mytree/subdir
    Adding  mytree/subdir/quux.h
    Transmitting file data....
    Committed revision 1.

The repository would now look like

    /fooproject/foo.c
    /fooproject/bar.c
    /fooproject/subdir
    /fooproject/subdir/quux.h
svn export
The export command is a quick way to create an unversioned tree of
files from a repository directory.

    $ svn export file:///usr/local/svn/newrepos/fooproject
    A  fooproject/foo.c
    A  fooproject/bar.c
    A  fooproject/subdir
    A  fooproject/subdir/quux.h
    Checked out revision 3.

The resulting directory will not contain any .svn
administrative areas, and all property metadata will be lost. (Hint:
don't use this technique for backing up; it's probably better for
rolling source distributions.)
svn ls
The ls command lets you find what files are in a repository directory.

    $ svn ls http://svn.collab.net/repos/svn
    README
    branches/
    clients/
    tags/
    trunk/

If you want a more detailed listing, pass the -v flag and you
will get output like this.

    $ svn ls -v http://svn.collab.net/repos/svn
    _    2755    kfogel     1331 Jul 28 02:07 README
    _    2773   sussman        0 Jul 29 15:07 branches/
    _    2769  cmpilato        0 Jul 29 12:07 clients/
    _    2698    rooneg        0 Jul 24 18:07 tags/
    _    2785     brane        0 Jul 29 19:07 trunk/

The columns tell you whether the file has any properties ("P" if it does, "_" if it doesn't), the revision it was last updated at, the user who last updated it, its size, the date it was last updated, and the filename.
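Because the verbose listing is column-oriented, it is easy to filter with standard tools. As a sketch, the hypothetical function below prints the names of entries last changed by a given user. It assumes the column order described above and names without spaces.

```shell
# Hypothetical helper: read "svn ls -v" style lines on standard input
# and print the names of entries last changed by AUTHOR.
# Assumes the author is the third column and the name is the last one.
entries_by_author() {
    awk -v author="$1" '$3 == author { print $NF }'
}

# Possible use against a live repository:
#   svn ls -v http://svn.collab.net/repos/svn | entries_by_author kfogel
printf '%s\n' \
    '_ 2755 kfogel 1331 Jul 28 02:07 README' \
    '_ 2773 sussman 0 Jul 29 15:07 branches/' |
    entries_by_author kfogel
```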
svn mkdir
This is another convenience command, and it has two uses.
First, it can be used to simultaneously create a new working copy
directory and schedule it for addition:

    $ svn mkdir new-dir
    A     new-dir

Or, it can be used to instantly create a directory in a repository (no
working copy needed):

    $ svn mkdir file:///usr/local/svn/newrepos/branches -m "made new dir"
    Committed revision 1123.
Again, this is a form of immediate commit, so some sort of log message is required.
When you first run the svn command-line client, it creates a
per-user configuration area. On Unix-like systems, a
.subversion/ directory is created in the user's home
directory. On Win32 systems, a Subversion folder is created
wherever it's appropriate to do so (typically somewhere within
Documents and Settings\username, although it depends on the
system).
At the time of writing, the configuration area only contains one item:
a proxies file. By setting values in this file, your Subversion
client can operate through an http proxy. (Read the file itself for
details; it should be self-documenting.)
Soon - very soon - a config file will exist in this area for
defining general user preferences. For example, the preferred
$EDITOR to use, options to pass through to svn diff,
preferences for date/time formats, and so on. See issue #668 for
details (http://subversion.tigris.org/issues/show_bug.cgi?id=668).
On Unix, an administrator can create "global" Subversion preferences
by creating and populating an /etc/subversion/ area. The
per-user ~/.subversion/ configuration will still override these
defaults, however.
On Win32, an administrator has the option of creating three other
locations: a global Subversion folder in the "All Users"
area, a collection of global registry settings, or a collection of
per-user registry settings. The registry settings are set in:

    HKCU\Software\Tigris.org\Subversion\Proxies
    HKCU\Software\Tigris.org\Subversion\Config
    etc.
To clarify, here is the order Subversion searches for run-time settings on Win32. Each subsequent location overrides the previous one:
Subversion
folder
Subversion
folder
How to administer a Subversion repository.
In this section, we'll mainly focus on how to use the
svnadmin and svnlook programs to work with repositories.
Creating a repository is incredibly simple:

    $ svnadmin create path/to/myrepos

This creates a new repository in a subdirectory myrepos.
(Note that the svnadmin and svnlook programs
operate directly on a repository, by linking to libsvn_fs.so.
So these tools expect ordinary, local paths to the repositories. This
is in contrast with the svn client program, which always
accesses a repository via some URL, whether it be via the <http://>
or <file:///> scheme.)
A new repository always begins life at revision 0, which is defined to
be nothing but the root (/) directory.
As mentioned earlier, repository revisions can have unversioned
properties attached to them. In particular, every revision is created
with an svn:date timestamp property. (Other common properties
include svn:author and svn:log.)
For a newly created repository, revision 0 has nothing but an
svn:date property attached.
Here is a quick run-down of the anatomy of a repository:
    $ ls myrepos
    conf/  dav/  db/  hooks/  locks/
conf
dav
db
hooks
locks
Once the repository has been created, it's very likely that you'll
want to use the svn client to import an initial tree. (Try
svn help import, or see Other Commands.)
You may want to give your repository an initial directory structure
that reflects the trunk, branches, and tags of your project(s).
(See Branches and Tags.) You can do this via svn mkdir:

    $ svnadmin create /path/to/repos
    $ svn mkdir file:///path/to/repos/projectA -m 'Base dir for A'
    Committed revision 1.
    $ svn mkdir file:///path/to/repos/projectA/trunk -m 'Main dir for A'
    Committed revision 2.
    $ svn mkdir file:///path/to/repos/projectA/branches -m 'Branches for A'
    Committed revision 3.
    $ svn mkdir file:///path/to/repos/projectA/tags -m 'Tags for A'
    Committed revision 4.
    $ svn co file:///path/to/repos/projectA/trunk projectA
    Checked out revision 4.
    # ... now work on projectA ...
With svn import, you can create the structure with a single
commit:

    $ svnadmin create /path/to/repos
    $ mkdir projectA
    $ mkdir projectA/trunk
    $ mkdir projectA/branches
    $ mkdir projectA/tags
    $ svn import file:///path/to/repos projectA projectA -m 'Dir layout for A'
    Adding  projectA/trunk
    Adding  projectA/branches
    Adding  projectA/tags
    Committed revision 1.
    $ rm -rf projectA/
    $ svn co file:///path/to/repos/projectA/trunk projectA
    Checked out revision 1.
    # ... now work on projectA ...
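If you set up several projects this way, the local skeleton can be generated by a small helper before importing. The function name make_layout is hypothetical. It only creates the empty local directories; the import itself is still done with the svn import command shown above.

```shell
# Hypothetical helper: create the trunk/branches/tags skeleton for each
# named project under the current directory, ready to be imported.
make_layout() {
    for project in "$@"; do
        mkdir -p "$project/trunk" "$project/branches" "$project/tags"
    done
}

# Possible use, following the example above:
#   make_layout projectA
#   svn import file:///path/to/repos projectA projectA -m 'Dir layout for A'
#   rm -rf projectA/
```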
A Subversion repository is essentially a sequence of trees; each tree is called a revision. (If this is news to you, it might be good for you to read Transactions and Revision Numbers.)
Every revision begins life as a transaction tree. When doing a commit, a client builds a transaction that mirrors their local changes, and when the commit succeeds, the transaction is effectively "promoted" into a new revision tree, and is assigned a new revision number.
At the moment, updates work in a similar way: the client builds a transaction tree that is a "mirror" of their working copy. The repository then compares the transaction tree with some revision tree, and sends back a tree-delta. After the update completes, the transaction is deleted.
Transaction trees are the only way to "write" to the repository's versioned filesystem; all users of libsvn_fs will do this. However, it's important to understand that the lifetime of a transaction is completely flexible. In the case of updates, transactions are temporary trees that are immediately destroyed. In the case of commits, transactions are transformed into permanent revisions (or aborted if the commit fails.) In the case of an error or bug, it's possible that a transaction can be accidentally left lying around - the libsvn_fs caller might die before deleting it. And in theory, someday whole workflow applications might revolve around the creation of transactions; they might be examined in turn by different managers before being deleted or promoted to revisions.
The point is: if you're administering a Subversion repository, you're going to have to examine revisions and transactions. It's part of monitoring the health of the repository.
svnlook
svnlook is a read-only tool that can be
used to examine the revision and transaction trees within a repository.
It's useful for system administrators, and can be used by the
pre-commit and post-commit hook scripts as well.
The simplest usage is
$ svnlook repos
This will print information about the HEAD revision in the repository "repos." In particular, it will show the log message, author, date, and a diagram of the tree.
To look at a particular revision or transaction:

    $ svnlook repos rev 522
    $ svnlook repos txn 340

Or, if you only want to see certain types of information,
svnlook accepts a number of subcommands. For example,

    $ svnlook repos rev 522 log
    $ svnlook repos rev 559 diff
Available subcommands are:
log
author
date
dirs-changed
changed
diff
The svnadmin tool has a toy "shell" mode as well. It doesn't
do much, but it allows you to poke around the repository as if it were
an imaginary mounted filesystem. The basic commands cd, ls, exit,
and help are available, as well as the very special command
cr - "change revision." The last
command allows you to move between revision trees.

    $ svnadmin shell repos
    <609: />$
    <609: />$ ls
      < 1.0.2i7>  [   601]  1         0    trunk/
      <nh.0.2i9>  [   588]  0         0    branches/
      <jz.0.18c>  [   596]  0         0    tags/

    <609: />$ cd trunk
    <609: /trunk>$ cr 500
    <500: /trunk>$ ls
      <   2.0.1>  [     1]  0      3462    svn_config.dsp
      <  4.0.dj>  [   487]  0      3856    PORTING
      <  3.0.cr>  [   459]  0      7886    Makefile.in
      <  d.0.ds>  [   496]  0      9736    build.conf
      <  5.0.d9>  [   477]  1         0    ac-helpers/
      <   y.0.1>  [     1]  0      1805    subversion.dsp
    ...
    <500: />$ exit
The output of ls has only a few columns:

      NODE-ID  CREATED-REV HAS_PROPS?  SIZE  NAME

    < 1.0.2i7>  [   601]        1         0    trunk/
    <nh.0.2i9>  [   588]        0         0    branches/
    <jz.0.18c>  [   596]        0         0    tags/
A hook is a program triggered by a repository read or write access. The hook is handed enough information to tell what the action is, what target(s) it's operating on, and who is doing it. Depending on the hook's output or return status, the hook program may continue the action, stop it, or suspend it in some way.
Subversion's hooks are programs that live in the repository's hooks
directory:

    $ ls repos/hooks/
    post-commit.tmpl*  read-sentinels.tmpl  write-sentinels.tmpl
    pre-commit.tmpl*   start-commit.tmpl*
This is how the hooks directory appears after a repository is first
created. It doesn't contain any hook programs - just templates.
The actual hooks need to be named start-commit, pre-commit and
post-commit. The template (.tmpl) files are example shell scripts to
get you started; read them for details about how each hook works. To
make your own hook, just copy foo.tmpl to foo and edit.
(The read-sentinels and write-sentinels are not yet implemented.
They are intended to be more like daemons than hooks. A sentinel is
started up at the beginning of a user operation. The Subversion
server communicates with the sentinel using a protocol yet to be
defined. Depending on the sentinel's responses, Subversion may stop
or otherwise modify the operation.)
Here is a description of the hook programs:
start-commit
pre-commit
The Subversion distribution includes a
tools/hook-scripts/commit-access-control.pl script that can be
called from pre-commit to implement fine-grained access control.
post-commit
The Subversion distribution includes a
tools/hook-scripts/commit-email.pl script that can be used to
send out the differences applied in the commit to any number of email
addresses. Also included is tools/backup/hot-backup.py, which is
a script that performs hot backups of your Subversion repository after
every commit.
Note that the hooks must be executable by the user who will invoke them (commonly the user httpd runs as), and that same user needs to be able to access the repository.
The pre-commit and post-commit hooks need to know things
about the change about to be committed (or that has just been
committed). The solution is a standalone program, svnlook
(See Examining a repository.), which is installed in the same place
as the svn binary. Have the script use svnlook to
examine a transaction or revision tree. It produces output that is both
human- and machine-readable, so hook scripts can easily parse it. Note
that svnlook is read-only - it can only inspect, not change,
the repository.
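As an illustration of a hook that parses svnlook output, the sketch below rejects commits whose log message is empty. It is an example policy, not something shipped with Subversion. The function name check_log_message is hypothetical, and the commented-out wiring assumes that the hook receives the repository path and transaction name as its first two arguments and that svnlook takes the argument order shown in Examining a repository; check the pre-commit.tmpl template for the details that apply to your version.

```shell
# Hypothetical policy check: succeed only if the log message read from
# standard input contains at least one non-whitespace character.
check_log_message() {
    if grep -q '[^[:space:]]'; then
        return 0
    fi
    echo "Commit rejected: empty log message." >&2
    return 1
}

# In a pre-commit hook this might be wired up as follows (assumptions:
# $1 is the repository path, $2 is the transaction name):
#
#   svnlook "$1" txn "$2" log | check_log_message || exit 1
```

A non-zero exit status from the hook is what stops the commit; the message written to standard error explains why.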
At the time of writing, the Subversion repository has only one
database back-end: Berkeley DB. All of your filesystem's structure
and data live in a set of tables within repos/db/.
Berkeley DB comes with a number of tools for managing these files, and they have their own excellent documentation. (See http://www.sleepycat.com/, or just read man pages.) We won't cover all of these tools here; rather, we'll mention just a few of the more common procedures that repository administrators might need.
First, remember that Berkeley DB has genuine transactions. Every attempt to change the DB is first logged. If anything ever goes wrong, the DB can back itself up to a previous `checkpoint' and replay transactions to get the data back into a sane state.
In our experience, we have seen situations where a bug in Subversion (which causes a crash) can sometimes have a side-effect of leaving the DB environment in a `locked' state. Any further attempts to read or write to the repository just sit there, waiting on the lock.
To `unwedge' the repository, run db_recover -v -h repos/db, where
repos is the repository's directory name. You should see
output like this:

    db_recover: Finding last valid log LSN: file: 40 offset 4080873
    db_recover: Checkpoint at: [40][4080333]
    db_recover: Checkpoint LSN: [40][4080333]
    db_recover: Previous checkpoint: [40][4079793]
    db_recover: Checkpoint at: [40][4079793]
    db_recover: Checkpoint LSN: [40][4079793]
    db_recover: Previous checkpoint: [40][4078761]
    db_recover: Recovery complete at Sun Jul 14 07:15:42 2002
    db_recover: Maximum transaction id 80000000 Recovery checkpoint [40][4080333]

Make sure that the db_recover program you invoke is the one
distributed with the same version of Berkeley DB you're using in your
Subversion server.
Make sure you run this command as the user that owns and manages the
database -- typically your Apache process -- and not as root.
Running db_recover as root leaves files owned by root in the
db directory, which the non-root user that manages the database
cannot open. If you do this, you'll get "permission denied" error
messages when you try to access the repository.
Second, a repository administrator may need to manage the growth of logfiles. At any given time, the DB environment is using at least one logfile to log transactions; when the `current' logfile grows to 10 megabytes, a new logfile is started, and the old one continues to exist.
Thus, after a while, you may see a whole group of 10MB logfiles lying
around the environment. At this point, you can make a choice: if you
leave every single logfile behind, it's guaranteed that
db_recover will always be able to replay every single DB
transaction, all the way back to the first commit. (This is the
`safe', or perhaps paranoid, route.) On the other hand, you can ask
Berkeley DB to tell you which logfiles are no longer being actively
written to:

    $ db_archive -a -h repos/db
    log.0000000023
    log.0000000024
    log.0000000029
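If you would rather set the excess logfiles aside than delete them outright, a small filter over that list is enough. The following is a cautious sketch, not the method used by hot-backup.py. The function name archive_logs and the archive directory are hypothetical, and names that are not absolute paths are taken to be relative to the db directory.

```shell
# Hypothetical helper: move the logfiles named on standard input (one
# per line) into ARCHIVE_DIR. Relative names are resolved against DB_DIR.
archive_logs() {
    db_dir=$1
    archive_dir=$2
    mkdir -p "$archive_dir" || return 1
    while read -r logfile; do
        case "$logfile" in
            /*) path=$logfile ;;            # already an absolute path
            *)  path=$db_dir/$logfile ;;
        esac
        if [ -f "$path" ]; then
            mv "$path" "$archive_dir/"
        fi
    done
    return 0
}

# Possible use (the destination directory is an assumption):
#   db_archive -a -h repos/db | archive_logs repos/db /var/backups/svn-logs
```

Moving rather than removing keeps the option of copying the files back if a later db_recover run turns out to need them.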
Subversion's own repository uses a post-commit hook script, which,
after performing a `hot-backup' of the repository, removes these
excess logfiles. (In the Subversion source tree, see
tools/backup/hot-backup.py.)
This script also illustrates the safe way to perform a backup of the
repository while it's still up and running: recursively copy the
entire repository directory, then re-copy the logfiles listed by
db_archive -l.
To start using a repository backup that you've restored, be sure to
run the db_recover -v command in the db area first.
This guarantees that any unfinished log transactions are fully played
before the repository goes live again. (The hot-backup.py
script does that for you during backup, so you can skip this step
if you decide to use it.)
Finally, note that Berkeley DB has a whole locking subsystem; in
extremely intensive svn operations, we have seen situations where the
DB environment runs out of locks. The maximum number of locks can be
adjusted by changing the values in the repos/db/DB_CONFIG
file. Don't change the default values unless you know what you're
doing; be sure to read
http://www.sleepycat.com/docs/ref/lock/max.html first.
The svnadmin tool has some subcommands that are specifically
useful to repository administrators. Be careful with
svnadmin! Unlike svnlook, which is read-only,
svnadmin has the ability to modify the repository.
The most-used feature is probably svnadmin setlog. A
commit's log message is an unversioned property directly attached to
the revision object; there's only one log message per revision.
Sometimes a user screws up the message, and it needs to be replaced:

    $ echo "Here is the new, correct log message" > newlog.txt
    $ svnadmin setlog myrepos 388 newlog.txt

There's a nice CGI script in tools/cgi/ that allows people
(with commit-access passwords) to tweak existing log messages via web
browser.
Another common use of svnadmin is to inspect and clean up
old, dead transactions. Commits and updates both create transaction
trees, but occasionally a bug or crash can leave them lying around.
By inspecting the datestamp on a transaction, an administrator can
make a judgment call and remove it:

    $ svnadmin lstxns myrepos
    319
    321
    $ svnadmin lstxns --long myrepos
    Transaction 319
    Created: 2002-07-14T12:57:22.748388Z
    ...
    $ svnadmin rmtxns myrepos 319 321
Another useful subcommand: svnadmin undeltify. Remember
that the latest version of each file is stored as fulltext in the
repository, but that earlier revisions of files are stored as "deltas"
against the next-most-recent revision. When a user attempts to
access an earlier revision, the repository must apply a sequence of
backwards-deltas to the newest fulltexts in order to derive the older
data.
If a particular revision tree is extremely popular, the administrator
can speed up the access time to this tree by "undeltifying" any path
within the revision - that is, by converting every file to fulltext:

    $ svnadmin undeltify myrepos 230 /project/tags/release-1.3
    Undeltifying `/project/tags/release-1.3' in revision 230...done.
Okay, so now you have a repository, and you want to make it available over a network.
Subversion's primary network server is Apache httpd speaking the WebDAV/DeltaV protocol, which is a set of extension methods to HTTP. (For more information on DAV, see http://www.webdav.org/.)
To network your repository, you'll need to
httpd.conf
file to export the repository
You can accomplish the first two items by either building httpd and
Subversion from source code, or by installing binary packages on
your system. The second appendix of this document contains more
detailed instructions on doing this. (See Compiling and installing.)
Instructions are also available in the INSTALL
file in Subversion's source tree.
In this section, we focus on configuring your httpd.conf.
Somewhere near the bottom of your configuration file, define a new
Location block:

    <Location /repos/myrepo>
       DAV svn
       SVNPath /absolute/path/to/myrepo
    </Location>

This now makes your myrepo repository available at the URL
<http://hostname/repos/myrepo>.
Alternatively, you can use the SVNParentPath directive to
indicate a "parent" directory whose immediate subdirectories
are assumed to be independent repositories:

    <Location /repos>
       DAV svn
       SVNParentPath /absolute/path/to/parent/dir
    </Location>

If you were to run svnadmin create foorepo within this parent
directory, then the URL <http://hostname/repos/foorepo> would
automatically be accessible without having to change httpd.conf
or restart httpd.
Note that this simple <Location> setup starts life with no
access restrictions at all:
If you want to restrict either read or write access to a repository as a whole, you can use Apache's built-in access control features.
First, create an empty file that will hold httpd usernames and
passwords. Place names and crypted passwords into this file like so:

    joe:Msr3lKOsYMkpc
    frank:Ety6rZX6P.Cqo
    mary:kV4/mQbu0iq82

You can generate the crypted passwords by using the standard
crypt(3) function, or by using the htpasswd tool
supplied in Apache's bin directory:

    $ /usr/local/apache2/bin/htpasswd -n sussman
    New password:
    Re-type new password:
    sussman:kUqncD/TBbdC6
Next, add lines within your <Location> block that point to the
user file:

    AuthType Basic
    AuthName "Subversion repository"
    AuthUserFile /path/to/users/file

If you want to restrict all access to the repository, add one
more line:

    Require valid-user

This line makes Apache require user authentication for every single type of HTTP request to your repository.
To restrict write-access only, you need to require a valid user for
all request methods except those that are read-only:

    <LimitExcept GET PROPFIND OPTIONS REPORT>
       Require valid-user
    </LimitExcept>

Or, if you want to get fancy, you can define two separate user groups
in a group file, one for readers, and one for writers:

    AuthGroupFile /my/svn/group/file

    <LimitExcept GET PROPFIND OPTIONS REPORT>
       Require group svn_committers
    </LimitExcept>

    <Limit GET PROPFIND OPTIONS REPORT>
       Require group svn_committers
       Require group svn_readers
    </Limit>
These are only a few simple examples. For a complete tutorial on Apache access control, please consider taking a look at the "Security" tutorials found at http://httpd.apache.org/docs-2.0/misc/tutorials.html.
Another note: in order for svn cp to work (which is actually
implemented as a DAV COPY request), mod_dav needs to be
able to determine the hostname of the server. A standard way of doing
this is to use Apache's ServerName directive to set the server's
hostname. Edit your httpd.conf to include:

    ServerName svn.myserver.org
If you are using virtual hosting through Apache's NameVirtualHost
directive, you may need to use the ServerAlias directive to specify
additional names that your server is known by.
(If you are unfamiliar with an Apache directive, or not exactly sure about what it does, don't hesitate to look it up in the documentation: http://httpd.apache.org/docs-2.0/mod/directives.html.)
You can test your exported repository by firing up httpd:

    $ /usr/local/apache2/bin/apachectl stop
    $ /usr/local/apache2/bin/apachectl start

Check /usr/local/apache2/logs/error_log to make sure it started up
okay. Try doing a network checkout from the repository:

    $ svn co http://localhost/repos wc
The most common reason this might fail is permission problems reading the repository db files. Make sure that the user "nobody" (or whatever UID the httpd process runs as) has permission to read and write the Berkeley DB files! This is a very common problem.
You can see all of mod_dav_svn's complaints in the Apache error logfile, /usr/local/apache2/logs/error_log (or wherever you installed Apache). For more information about tracing problems, see "Debugging the server" in the HACKING file.
Sometimes special situations arise where you need to move all of your filesystem data from one repository to another. Perhaps the internal fs database schema has changed in some way in a new release of Subversion, or perhaps you'd like to start using a different database "back end".
Either way, your data needs to be migrated to a new repository. To do this, we have the svnadmin dump and svnadmin load commands.
svnadmin dump writes a stream of your repository's data to stdout:
    $ svnadmin dump myrepos > dumpfile
    * Dumped revision 0.
    * Dumped revision 1.
    * Dumped revision 2.
    ...
This stream describes every revision in your repository as a list of changes to nodes. It's mostly human-readable text; but when a file's contents change, the entire fulltext is dumped into the stream. If you have binary files or binary property-values in your repository, those parts of the stream may be unfriendly to human readers.
After dumping your data, you would then move the file to a different system (or somehow alter the environment to use a different version of svnadmin and/or libsvn_fs.so), and create a "new"-style repository that has a new schema or DB back-end:
    $ svnadmin create newrepos
The svnadmin load command attempts to read a dumpstream from stdin, and effectively replays each commit:
    $ svnadmin load newrepos < dumpfile
    <<< Started new txn, based on original revision 1
         * adding path : A ... done.
         * adding path : A/B ... done.
         ...
    ------- Committed new rev 1 (loaded from original rev 1) >>>
    <<< Started new txn, based on original revision 2
         * editing path : A/mu ... done.
         * editing path : A/D/G/rho ... done.
    ------- Committed new rev 2 (loaded from original rev 2) >>>
Voila, your revisions have been recommitted into the new repository. And because svnadmin uses standard input and output streams for the repository dump and load process, people who are feeling saucy with Unix can try things like this:
    $ svnadmin create newrepos
    $ svnadmin dump myrepos | svnadmin load newrepos
You can also create a dumpfile that represents a specific range of revisions. svnadmin dump takes optional starting and ending revisions to accomplish just that task.
    $ svnadmin dump myrepos 23 > rev-23.dumpfile
    $ svnadmin dump myrepos 100 200 > revs-100-200.dumpfile
Now, regardless of the range of revisions used when dumping the repository, the default behavior is for the first revision dumped to always be compared against revision 0, which is just the empty root directory /. This means that the first revision in any dumpfile will always look like a gigantic list of "added" nodes. We do this so that a file like revs-100-200.dumpfile can be directly loaded into an empty repository.
However, if you add the --incremental option when you dump your repository, this tells svnadmin to compare the first dumped revision against the previous revision in the repository, the same way it treats every other revision that gets dumped. The benefit of this is that you can create several small dumpfiles that can be loaded in succession, instead of one large one, like so:
    $ svnadmin dump myrepos 0 1000 > dumpfile1
    $ svnadmin dump myrepos 1001 2000 --incremental > dumpfile2
    $ svnadmin dump myrepos 2001 3000 --incremental > dumpfile3
These dumpfiles could be loaded into a new repository with the following command sequence:
    $ svnadmin load newrepos < dumpfile1
    $ svnadmin load newrepos < dumpfile2
    $ svnadmin load newrepos < dumpfile3
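If you split a large repository this way, the revision ranges are easy to get wrong by one. As a minimal sketch, the following Python function only builds the command strings for you to review; it does not run svnadmin. The function name plan_incremental_dumps and the chunk parameter are hypothetical, and the command syntax is simply the one shown in the examples above.

```python
def plan_incremental_dumps(repos, youngest, chunk=1000):
    """Return shell command strings that dump `repos` in chunks.

    The first dump starts at revision 0 and is a full dump; every later
    one adds --incremental so it can be loaded on top of the previous one.
    """
    commands = []
    start = 0
    while start <= youngest:
        # End each chunk on a multiple of `chunk`, or at the youngest revision.
        end = min((start // chunk + 1) * chunk, youngest)
        flag = "" if start == 0 else " --incremental"
        number = len(commands) + 1
        commands.append(
            f"svnadmin dump {repos} {start} {end}{flag} > dumpfile{number}"
        )
        start = end + 1
    return commands


if __name__ == "__main__":
    for command in plan_incremental_dumps("myrepos", 3000):
        print(command)
```

The resulting dumpfiles must be loaded in the same order in which they were produced.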
Another neat trick you can perform with this --incremental option involves appending to an existing dumpfile a new range of revisions. For example, you might have a post-commit hook that simply appends the repository dump of the single revision that triggered the hook. Or you might have a script like the following that runs nightly to append dumpfile data for all the revisions that were added to the repository since the last time the script ran.
    #!/usr/bin/perl

    $repos_path  = '/path/to/repos';
    $dumpfile    = '/usr/backup/svn-dumpfile';
    $last_dumped = '/var/log/svn-last-dumped';

    # Figure out the starting revision (0 if we cannot read the last-dumped file,
    # else use the revision in that file incremented by 1).
    if (open LASTDUMPED, "$last_dumped")
    {
        $new_start = <LASTDUMPED>;
        chomp $new_start;
        $new_start++;
        close LASTDUMPED;
    }
    else
    {
        $new_start = 0;
    }

    # Query the youngest revision in the repos.
    $youngest = `svnadmin youngest $repos_path`;
    chomp $youngest;

    # Nothing to do if no revisions were added since the last run.
    exit 0 if $new_start > $youngest;

    # Do the backup.
    `svnadmin dump $repos_path $new_start $youngest --incremental >> $dumpfile`;

    # Store a new last-dumped revision
    open LASTDUMPED, "> $last_dumped" or die;
    print LASTDUMPED "$youngest\n";
    close LASTDUMPED;

    # All done!
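The bookkeeping at the heart of such a script is small enough to isolate and test on its own. Here is a minimal Python sketch of just that step: given the contents of the last-dumped file and the youngest revision, it decides which range to dump next. The function name next_dump_range is hypothetical, and returning None when there is nothing new is a design choice of this sketch.

```python
def next_dump_range(last_dumped_text, youngest):
    """Return (start, end) of the revisions still to dump, or None.

    `last_dumped_text` is the content of the last-dumped file, or None
    when that file could not be read (in which case we start from 0).
    """
    if last_dumped_text is None:
        start = 0
    else:
        start = int(last_dumped_text.strip()) + 1
    if start > youngest:
        return None  # nothing was committed since the last run
    return (start, youngest)


if __name__ == "__main__":
    print(next_dump_range(None, 42))
    print(next_dump_range("41\n", 42))
    print(next_dump_range("42\n", 42))
```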
As you can see, the Subversion repository dumpfile format, and specifically svnadmin's use of that format, can be a valuable means by which to back up changes to your repository over time in case of a system crash or some other catastrophic event.
Subversion uses WebDAV (Distributed Authoring and Versioning) as its primary network protocol, and here we discuss what this means to you, both present and future.
WebDAV was designed to make the web into a read/write medium, instead of a read-only medium (as it mainly exists today.) The theory is that directories and files can be shared over the web, using standardized extensions to HTTP. RFC 2518 describes the WebDAV extensions to HTTP, and is available (along with a lot of other useful information) at http://www.webdav.org/.
Already, a number of operating system file-browsers are able to mount networked directories using WebDAV. On Win32, the Windows Explorer can browse what it calls "WebFolders", just like any other share. Mac OS X also has this capability, as does the Nautilus browser for GNOME.
However, RFC 2518 doesn't fully implement the "versioning" aspect of WebDAV. A separate committee has created RFC 3253, known as the DeltaV extensions to WebDAV, available at http://www.webdav.org/deltav/. These extensions add version-control concepts to HTTP, and this is what Subversion uses.
It's important to understand that while Subversion uses DeltaV for communication, the Subversion client is not a general-purpose DeltaV client. In fact, it expects some custom features from the server. Further, the Subversion server is not a general-purpose DeltaV server. It implements a strict subset of the DeltaV specification. A WebDAV or DeltaV client may very well be able to interoperate with it, but only if that client operates within the narrow confines of those features the server has implemented. Future versions of Subversion will address more complete WebDAV interoperability.
At the moment, most DAV browsers and clients do not yet support DeltaV; this means that a Subversion repository can be viewed or mounted only as a read-only resource. (An HTTP "PUT" request is valid when sent to a WebDAV-only server, but a DeltaV server such as mod_dav_svn will not allow it. The client must use special version-control methods to write to the server.) And on the flip side, a Subversion client cannot check out a working copy from a generic WebDAV server; it expects a specific subset of DeltaV features.
For a detailed description of Subversion's WebDAV implementation, see http://svn.collab.net/repos/svn-repos/trunk/www/webdav-usage.html.
Tips to use Subversion more effectively.
In this chapter, we'll focus on how to avoid some pitfalls of version control systems in general and Subversion specifically.
Subversion's diffs and merges of text files work on a line-by-line basis. They don't understand the syntax of programming languages or even know when you've just reflowed text to a different line width.
Given this design, it's important to avoid unnecessary reformatting. It creates unnecessary conflicts when merging branches, updating working copies, and applying patches. It also can drown you in noise when viewing differences between revisions.
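To see how much noise a harmless reflow produces, consider this small sketch. It uses Python's standard difflib module as a stand-in for a line-based diff; it is not Subversion's own diff code, and the sample text is made up for the illustration.

```python
import difflib

original = [
    "Subversion manages files over time.\n",
    "The files are placed into a central repository.\n",
]
# The same words, reflowed to a narrower line width.
reflowed = [
    "Subversion manages files\n",
    "over time. The files are\n",
    "placed into a central\n",
    "repository.\n",
]

# Word for word, nothing changed...
same_words = "".join(original).split() == "".join(reflowed).split()
# ...yet a line-based diff reports every line as removed or added.
diff = list(difflib.unified_diff(original, reflowed, "before", "after"))

if __name__ == "__main__":
    print("same words:", same_words)
    print("".join(diff), end="")
```

Every original line shows up as deleted and every reflowed line as added, which is exactly the kind of change that collides with somebody else's real edit to the same paragraph.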
You can avoid these problems by following clearly-defined formatting rules. The Subversion project's own HACKING document and the Code Conventions for the Java Programming Language are good examples.
Tabs are particularly important. Some projects, like Subversion, do not use tabs at all in the source tree. Others always use them and define a particular tab size.
It can be very helpful to have an editor smart enough to help adhere to these rules. For example, vim can do this on a per-project basis with .vimrc commands like the following:
    autocmd BufRead,BufNewFile */rapidsvn/*.{cpp,h} setlocal ts=4 noexpandtab
    autocmd BufRead,BufNewFile */subversion/*.[ch] setlocal sw=2 expandtab cinoptions=>2sn-s{s^-s:s
Check your favorite editor's documentation for more information.
In the real world, we're not always so perfect. Formatting preferences may change over time, or we may just make mistakes. There are things you can do to minimize the problems of reformatting.
These are good guidelines to follow:
Here's an example of a sweeping reformat:
    $ svn co file:///repo/path/trunk indent_wc
    $ indent -gnu indent_wc/src/*.[ch]
    $ svn commit -m 'Ran indent -gnu src/*.[ch]' indent_wc
This follows all rules: there were no semantic changes mixed in (no files were changed other than through indent). The indent command line was given, so the changes can be very easily duplicated. All the reformatting was done in a single revision.
Let's say these changes occurred to the trunk at revision 26. The head revision is now 42. You created a branch at revision 13 and now want to merge it back into the trunk. Ordinarily you'd do this:
    $ svn co file:///repo/path/trunk merge_wc
    $ svn merge -r 13:head file:///repo/path/branches/mybranch merge_wc
    ... # resolve conflicts
    $ svn commit -m 'Merged branch' merge_wc
But with the reformatting changes, there will be many, many conflicts. If you follow these rules, you can merge more easily:
    $ svn co -r 25 file:///repo/path/trunk merge_wc
    $ svn merge -r 13:head file:///repo/path/branches/mybranch merge_wc
    ... # resolve conflicts
    $ indent -gnu merge_wc/src/*.[ch]
    $ svn up merge_wc
    ... # resolve conflicts
    $ svn commit -m 'Merged branch' merge_wc
In English, the procedure is:
When viewing differences between revisions, you can customize svn diff output to hide whitespace changes. The -x argument passes arguments through to GNU diff. Here are some useful arguments:
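    -b    ignore changes in the amount of whitespace
    -B    ignore changes that only insert or delete blank lines
    -i    ignore changes in case
    -t    expand tabs to spaces in the output
    -T    start each output line with a tab so that tabs line up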
The commit emails always show whitespace-only changes. commit-email.pl uses svnlook diff to get differences, which doesn't support the -x option.
Different platforms (Unix, Windows, MacOS) have different conventions for marking the line endings of text files. Simple editors may rewrite line endings, causing problems with diff and merge. This is a subset of the formatting problems.
Subversion has built-in support for normalizing line endings. To enable it, set the svn:eol-style property to "native". See Properties.
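To illustrate why normalization helps, here is a minimal Python sketch. It is not how Subversion implements svn:eol-style; it only shows that two files which differ in every line at the byte level become identical once their line endings are converted to a single convention. The function name normalize_eol is hypothetical.

```python
def normalize_eol(data: bytes) -> bytes:
    """Convert CRLF (Windows) and lone CR (classic MacOS) endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


unix_text = b"line one\nline two\n"
windows_text = b"line one\r\nline two\r\n"

if __name__ == "__main__":
    print("raw bytes equal:       ", unix_text == windows_text)
    print("normalized bytes equal:", normalize_eol(windows_text) == unix_text)
```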
It pays to take some time before you commit to review your changes and create an appropriate log message. You are publishing the newly changed project anew every time you commit. This is true in two senses:
cvs admin -o
.) If you might not want something to be in the
repository, make sure it is not included in your commit. Check for
sensitive information, autogenerated files, and unnecessary large files.
If you later don't like your log message, it is possible to change it. The svnadmin setlog command will do this locally. You can set up the tweak-log.cgi script to allow the same thing remotely. All the same, creating a good log message beforehand helps clarify your thoughts and avoid committing a mistake.
You should run svn diff before each commit and ask yourself:
Defining a log entry policy is also helpful -- the Subversion HACKING document is a good model. If you always embed filenames, function names, etc. then you can easily search through the logs with search-svnlog.pl.
You may want to write the log entry as you go. It's common to create a file named changes with your log entry in progress. When you commit, use svn ci -F changes.
If you do not write log entries as you go, you can generate an initial log entry file from the output of svn status, which contains a list of all modified files and directories, and then write a comment for each one.
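Turning svn status output into a log skeleton is easy to script. The following Python sketch assumes the short output format described later in this manual (a status code, whitespace, then the path); the function name log_template and the layout of the skeleton are hypothetical, so adapt them to your project's log entry policy.

```python
def log_template(status_output):
    """Build a log-entry skeleton from short-format `svn status` output."""
    entries = []
    for line in status_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or unrecognized lines
        code, path = parts
        entries.append(f"* {path.strip()} ({code}):")
    if not entries:
        return ""
    return "\n".join(entries) + "\n"


if __name__ == "__main__":
    print(log_template("M ./foo.c\nM ./bar/baz.c\n"), end="")
```

You might save the result as your changes file, fill in a comment after each colon, and then commit with svn ci -F changes.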
Subversion does not have any way to merge or view differences of binary files, so it's critical that these have accurate log messages. Since you can't review your changes with svn diff immediately before committing, it's a particularly good idea to write the log entry as you go.
A number of other useful documents relevant to Subversion.
This document is meant to be a quick-start guide for CVS users new to Subversion. It's not a substitute for real documentation and manuals; but it should give you a quick conceptual "diff" when switching over.
The goal of Subversion is to take over the current and future CVS user base. Subversion not only includes new features, but attempts to fix certain "broken" behaviors that CVS had. This means that you may be encouraged to break certain habits - ones that you forgot were odd to begin with.
In CVS, revision numbers are per-file. This is because CVS uses RCS as a backend; each file has a corresponding RCS file in the repository, and the repository is roughly laid out according to the structure of your project tree.
In Subversion, the repository looks like a single filesystem. Each commit results in an entirely new filesystem tree; in essence, the repository is an array of trees. Each of these trees is labeled with a single revision number. When someone talks about "revision 54," they're talking about a particular tree (and indirectly, the way the filesystem looked after the 54th commit).
Technically, it's not valid to talk about "revision 5 of foo.c". Instead, one would say "foo.c as it appears in revision 5." Also, be careful when making assumptions about the evolution of a file. In CVS, revisions 5 and 6 of foo.c are always different. In Subversion, it's most likely that foo.c did *not* change between revisions 5 and 6.
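The "array of trees" idea can be made concrete with a toy model. This Python sketch is only a mental picture, not how the Subversion filesystem is stored; the class ToyRepository and its methods are hypothetical, and a real repository shares unchanged data between trees rather than copying it.

```python
class ToyRepository:
    """A toy repository: a list of whole trees, indexed by revision number."""

    def __init__(self):
        self.trees = [{}]  # revision 0 is the empty root directory

    def commit(self, changes):
        """Apply {path: contents} to a copy of the youngest tree.

        Returns the new revision number.
        """
        tree = dict(self.trees[-1])
        tree.update(changes)
        self.trees.append(tree)
        return len(self.trees) - 1

    def cat(self, path, revision):
        """Return `path` as it appears in `revision`."""
        return self.trees[revision][path]


repo = ToyRepository()
repo.commit({"foo.c": "int main() {}\n"})   # revision 1
for n in range(2, 7):                       # revisions 2..6 touch other files
    repo.commit({f"other{n}.c": f"/* {n} */\n"})

if __name__ == "__main__":
    same = repo.cat("foo.c", 5) == repo.cat("foo.c", 6)
    print("foo.c identical in revisions 5 and 6:", same)
```

Revisions 5 and 6 are different trees, yet foo.c is the same in both, which is why "foo.c as it appears in revision 5" is the accurate way to put it.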
In recent years, disk space has become outrageously cheap and abundant, but network bandwidth has not. Therefore, the Subversion working copy has been optimized around the scarcer resource.
The .svn administrative directory serves the same purpose as the CVS directory, except that it also stores "pristine" copies of files. This allows you to do many things off-line:
    svn status    shows you local modifications (see below)
    svn diff      shows you the details of your modifications
    svn ci        sends differences to the repository (CVS only sends fulltexts!)
    svn revert    removes your modifications
This last subcommand is new; it will not only remove local mods, but it will un-schedule operations such as adds and deletes. It's the preferred way to revert a file; running rm file; svn up will still work, but it blurs the purpose of updating. And, while we're on this subject...
In Subversion, we've tried to erase a lot of the confusion between the status and update subcommands.
The status command has two purposes: (1) to show the user any local modifications in the working copy, and (2) to show the user which files are out-of-date. Unfortunately, because of CVS's hard-to-read output, many CVS users don't take advantage of this command at all. Instead, they've developed a habit of running cvs up to quickly see their mods. Of course, this has the side effect of merging repository changes that you may not be ready to deal with!
With Subversion, we've tried to remove this muddle by making the output of svn status easy to read for humans and parsers. Also, svn update only prints information about files that are updated, not local modifications.
Here's a quick guide to svn status. We encourage all new Subversion users to use it early and often:
svn status prints all files that have local modifications; the network is not accessed by default.
    -u switch    add out-of-dateness information from repository
    -v switch    show all entries under version control
    -n switch    nonrecursive
The status command has two output formats. In the default "short" format, local modifications look like this:
    % svn status
    M     ./foo.c
    M     ./bar/baz.c
If you specify either the -u or -v switch, a "long" format is used:
    % svn status -u
    M           1047   ./foo.c
    _    *      1045   ./faces.html
    _    *         -   ./bloo.png
    M           1050   ./bar/baz.c
    Head revision:   1066
In this case, two new columns appear. The second column contains an asterisk if the file or directory is out-of-date. The third column shows the working copy's revision number of the item. In the example above, the asterisk indicates that faces.html would be patched if we updated, and that bloo.png is a newly added file in the repository. (The - next to bloo.png means that it doesn't yet exist in the working copy.)
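Since the output is meant to be easy for parsers, here is a minimal Python sketch that reads one long-format line. It assumes the columns are separated by whitespace exactly as in the example above and that paths contain no spaces; the function name parse_long_status and the dictionary keys are hypothetical, and the exact column layout of your svn version may differ.

```python
def parse_long_status(line):
    """Parse one whitespace-separated long-format status line.

    Returns a dict with the status code, whether the item is out of date,
    the working-copy revision (None when the item is not yet in the
    working copy), and the path.
    """
    fields = line.split()
    code = fields[0]
    out_of_date = fields[1] == "*"
    rev_field, path = fields[-2], fields[-1]
    revision = None if rev_field == "-" else int(rev_field)
    return {
        "code": code,
        "out_of_date": out_of_date,
        "revision": revision,
        "path": path,
    }


if __name__ == "__main__":
    for sample in ("M 1047 ./foo.c", "_ * 1045 ./faces.html", "_ * - ./bloo.png"):
        print(parse_long_status(sample))
```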
Lastly, here's a quick summary of status codes that you may see:
    A    Add
    D    Delete
    R    Replace (delete, then re-add)
    M    local Modification
    U    Updated
    G    merGed
    C    Conflict
Subversion has combined the CVS P and U codes into just U. When a merge or conflict occurs, Subversion simply prints G or C, rather than a whole sentence about it.
A new feature of Subversion is that you can attach arbitrary metadata to files and directories. We refer to this data as properties, and they can be thought of as collections of name/value pairs (hashtables) attached to each item in your working copy.
To set or get a property, use the svn propset and svn propget subcommands. To list all properties on an object, use svn proplist.
For more information, see Properties.
Subversion tracks tree structures, not just file contents. It's one of the biggest reasons Subversion was written to replace CVS.
Here's what this means to you:
The svn add and svn rm commands work on directories now, just as they work on files. So do svn cp and svn mv. However, these commands do *not* cause any kind of immediate change in the repository. Instead, the working directory is recursively "scheduled" for addition or deletion. No repository changes happen until you commit.
foo/
in revision 5".)
Let's talk more about that last point. Directory versioning is a Hard Problem. Because we want to allow mixed-revision working copies, there are some limitations on how far we can abuse this model.
From a theoretical point of view, we define "revision 5 of directory foo" to mean a specific collection of directory-entries and properties. Now suppose we start adding and removing files from foo, and then commit. It would be a lie to say that we still have revision 5 of foo. However, if we bumped foo's revision number after the commit, that would be a lie too; there may be other changes to foo we haven't yet received, because we haven't updated yet.
Subversion deals with this problem by quietly tracking committed adds and deletes in the .svn area. When you eventually run svn update, all accounts are settled with the repository, and the directory's new revision number is set correctly. Therefore, only after an update is it truly safe to say that you have a "perfect" revision of a directory. Most of the time, your working copy will contain "imperfect" directory revisions.
Similarly, a problem arises if you attempt to commit property changes on a directory. Normally, the commit would bump the working directory's local revision number. But again, that would be a lie, because there may be adds or deletes that the directory doesn't yet have, because no update has happened. Therefore, you are not allowed to commit property-changes on a directory unless the directory is up-to-date.
For more specific examples and discussion: See Directory versioning.
CVS marks conflicts with in-line "conflict markers", and prints a C during an update. Historically, this has caused problems. Many users forget about (or don't see) the C after it whizzes by on their terminal. They often forget that the conflict-markers are even present, and then accidentally commit garbled files.
Subversion solves this problem by making conflicts more tangible. Read about it: See Basic Work Cycle. In particular, read the section about "Merging others' changes".
CVS users have to mark binary files with -kb flags, to prevent data from being munged (due to keyword expansion and line-ending translations). They sometimes forget to do this.
Subversion examines the svn:mime-type property to decide if a file is text or binary. If the file has no svn:mime-type property, Subversion assumes it is text. If the file has the svn:mime-type property set to anything other than text/*, it assumes the file is binary.
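The rule is simple enough to state as a few lines of code. This Python sketch merely restates the rule described above; it is not taken from the Subversion sources, and the function name is_binary is hypothetical.

```python
def is_binary(mime_type):
    """Apply the text/binary rule to an svn:mime-type value.

    `mime_type` is the property value, or None if the property is not set.
    """
    if mime_type is None:
        return False  # no property: assumed to be text
    # Anything outside text/* is treated as binary.
    return not mime_type.startswith("text/")


if __name__ == "__main__":
    for value in (None, "text/plain", "image/png", "application/octet-stream"):
        print(value, "->", "binary" if is_binary(value) else "text")
```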
Subversion also helps users by running a binary-detection algorithm in the svn import and svn add subcommands. These subcommands will make a good guess and then (possibly) set a binary svn:mime-type property on the file being added. (If Subversion guesses wrong, you can always remove or hand-edit the property.)
As in CVS, binary files are not subject to keyword expansion or line-ending conversions. Also, when a binary file is "merged" during update, no real merge occurs. Instead, Subversion creates two files side-by-side in your working copy; the one containing your local modifications is renamed with an .orig extension.
Unlike CVS, Subversion can handle anonymous and authenticated users in the same repository. There is no need for an anonymous user or a separate repository. If the Subversion server requests authentication when you commit, the client should prompt you for your credentials (password).
Unlike CVS, a Subversion working copy is aware that it has checked out a module. That means that if somebody changes the definition of a module, then a call to svn up will update the working copy appropriately.
Subversion defines modules as a list of directories within a directory property. See Modules.
Subversion doesn't distinguish between filesystem space and "branch" space; branches and tags are ordinary directories within the filesystem. This is probably the single biggest mental hurdle a CVS user will need to climb. Read all about it: See Branches and Tags.
"The three cardinal virtues of a master technologist are: laziness, impatience, and hubris." - Larry Wall
This appendix describes some of the theoretical pitfalls around the (possibly arrogant) notion that one can simply version directories just as one versions files.
To begin, recall that the Subversion repository is an array of trees. Each tree represents the application of a new atomic commit, and is called a revision. This is very different from a CVS repository, which stores file histories in a collection of RCS files (and doesn't track tree-structure.)
So when we refer to "revision 4 of foo.c" (written foo.c:4) in CVS, this means the fourth distinct version of foo.c - but in Subversion this means "the version of foo.c in the fourth revision (tree)". It's quite possible that foo.c has never changed at all since revision 1! In other words, in Subversion, different revision numbers of the same versioned item do not imply different contents.
Nevertheless, the contents of foo.c:4 are still well-defined. The file foo.c in revision 4 has a specific text and properties.
Suppose, now, that we extend this concept to directories. If we have a directory DIR, define DIR:N to be "the directory DIR in the Nth revision." The contents are defined to be a particular set of directory entries (dirents) and properties.
So far, so good. The concept of versioning directories seems fine in the repository - the repository is very theoretically pure anyway. However, because working copies allow mixed revisions, it's easy to create problematic use-cases.
Suppose our working copy has directory DIR:1 containing file foo:1, along with some other files. We remove foo and commit.
Already, we have a problem: our working copy still claims to have DIR:1. But on the repository, revision 1 of DIR is defined to contain foo - and our working copy DIR clearly does not have it anymore. How can we truthfully say that we still have DIR:1?
One answer is to force DIR to be updated when we commit foo's deletion. Assuming that our commit created revision 2, we would immediately update our working copy to DIR:2. Then the client and server would both agree that DIR:2 does not contain foo, and that DIR:2 is indeed exactly what is in the working copy.
This solution has nasty, un-user-friendly side effects, though. It's likely that other people may have committed before us, possibly adding new properties to DIR, or adding a new file bar. Now pretend our committed deletion creates revision 5 in the repository. If we instantly update our local DIR to 5, that means unexpectedly receiving a copy of bar and some new propchanges. This clearly violates a UI principle: "the client will never change your working copy until you ask it to." Committing changes to the repository is a server-write operation only; it should not modify your working data!
Another solution is to do the naive thing: after committing the deletion of foo, simply stop tracking the file in the .svn administrative directory. The client then loses all knowledge of the file.
But this doesn't work either: if we now update our working copy, the communication between client and server is incorrect. The client still believes that it has DIR:1 - which is false, since a "true" DIR:1 contains foo. The client gives this incorrect report to the repository, and the repository decides that in order to update to revision 2, foo must be deleted. Thus the repository sends a bogus (or at least unnecessary) deletion command.
After deleting foo and committing, the file is not totally forgotten by the .svn directory. While the file is no longer considered to be under revision control, it is still secretly remembered as having been "deleted".
When the user updates the working copy, the client correctly informs the server that the file is already missing from its local DIR:1; therefore the repository doesn't try to re-delete it when patching the client up to revision 2.
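The difference between the naive approach and the tracking approach can be sketched with a toy model of the update exchange. This Python sketch is not the real client/server protocol; the function server_update_commands and the set-based trees are hypothetical and cover only added and deleted entries.

```python
def server_update_commands(reported_missing, old_tree, new_tree):
    """Commands a toy server sends to move a client from old_tree to new_tree.

    `reported_missing` holds the entries the client says it no longer has,
    even though its directory revision would imply otherwise.
    """
    commands = []
    for name in sorted(old_tree - new_tree):
        # Skip deletions the client has already told us about.
        if name not in reported_missing:
            commands.append(("delete", name))
    for name in sorted(new_tree - old_tree):
        commands.append(("add", name))
    return commands


dir_rev1 = {"foo", "alpha", "beta"}   # DIR:1 in the repository
dir_rev2 = {"alpha", "beta"}          # DIR:2, after foo's deletion was committed

# Naive client: claims a plain DIR:1, so the server re-sends the deletion.
naive = server_update_commands(set(), dir_rev1, dir_rev2)
# Tracking client: reports foo as already deleted, so nothing is sent.
tracked = server_update_commands({"foo"}, dir_rev1, dir_rev2)

if __name__ == "__main__":
    print("naive:  ", naive)
    print("tracked:", tracked)
```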
Again, suppose our working copy has directory DIR:1 containing file foo:1, along with some other files.
Now, unbeknownst to us, somebody else adds a new file bar to this directory, creating revision 2 (and DIR:2).
Now we add a property to DIR and commit, which creates revision 3. Our working-copy DIR is now marked as being at revision 3.
Of course, this is false; our working copy does not have DIR:3, because the "true" DIR:3 on the repository contains the new file bar. Our working copy has no knowledge of bar at all.
Again, we can't follow our commit of DIR with an automatic update (and addition of bar). As mentioned previously, commits are a one-way write operation; they must not change working copy data.
Let's enumerate exactly those times when a directory's local revision number changes:
In this light, it's clear that our "overeager directory" problem only happens in the second situation - those times when we're committing directory propchanges.
Thus the answer is simply not to allow property-commits on directories that are out-of-date. It sounds a bit restrictive, but there's no other way to keep directory revisions accurate.
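Stated as code, the restriction is a one-line check. This Python sketch is a simplification under stated assumptions: it treats a directory as up-to-date when the repository has not changed it since the revision recorded in the working copy. The function name and both parameters are hypothetical; the real client has to ask the server.

```python
def may_commit_dir_propchange(wc_revision, last_changed_revision):
    """Allow a property commit on a directory only if it is up-to-date.

    `wc_revision` is the directory's revision in the working copy;
    `last_changed_revision` is the latest repository revision in which
    the directory's entries or properties changed.
    """
    return wc_revision >= last_changed_revision


if __name__ == "__main__":
    # The scenario above: we hold DIR:1, but bar was added in revision 2.
    print(may_commit_dir_propchange(1, 2))
    # After an update to revision 2 the commit would be allowed.
    print(may_commit_dir_propchange(2, 2))
```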
Really, the Subversion client seems to have two difficult--almost contradictory--goals.
First, it needs to make the user experience friendly, which generally means being a bit "sloppy" about deciding what a user can or cannot do. This is why it allows mixed-revision working copies, and why it tries to let users execute local tree-changing operations (delete, add, move, copy) in situations that aren't always perfectly, theoretically "safe" or pure.
Second, the client tries to keep the working copy correctly in sync with the repository using as little communication as possible. Of course, this is made much harder by the first goal!
So in the end, there's a tension here, and the resolutions to problems can vary. In one case (the "lagging directory"), the problem can be solved through a bit of clever entry tracking in the client. In the other case ("the overeager directory"), the only solution is to restrict some of the theoretical laxness allowed by the client.
The latest instructions for compiling and installing Subversion (and httpd-2.0) are maintained in the INSTALL file at the top of the Subversion source tree.
In general, you should also be able to find the latest version of this file by grabbing it directly from Subversion's own repository: http://svn.collab.net/repos/svn/trunk/INSTALL
A LaTeX quick-reference sheet is available for download on Subversion's website; it is compiled from the source file doc/user/svn-ref.tex. Any volunteers to rewrite it here in texinfo?
The main FAQ for the project can be viewed directly in Subversion's repository: http://svn.collab.net/repos/svn/trunk/www/project_faq.html
For a full description of how to contribute to Subversion, read the HACKING file at the top of Subversion's source tree. It's also available at http://svn.collab.net/repos/svn/trunk/HACKING.
In a nutshell: Subversion behaves like many open-source projects. One begins by participating in discussion on mailing lists, then by submitting patches for review. Eventually, one is granted direct commit access to the repository.
Copyright © 2002 Collab.Net. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
This software consists of voluntary contributions made by many individuals on behalf of CollabNet.
For example: the C routine `svn_client_checkout()' takes a URL as an argument. It passes this URL to the repository-access library and opens an authenticated session with a particular repository. It then asks the repository for a certain tree, and sends this tree into the working-copy library, which then writes a full working copy to disk (.svn directories and all.)
Why read-only? Because if a pre-commit hook script changed the transaction before commit, the working copy would have no way of knowing what happened, and would therefore be out of sync and not know it. Subversion currently has no way to handle this situation, and maybe never will.
At this time, this is the only method by which users can implement finer-grained access control beyond what httpd.conf offers. In a future version of Subversion, we plan to implement ACLs directly in the filesystem.