From cd976f5c52694acb4b23c3f2425ed4f0a47ec799 Mon Sep 17 00:00:00 2001 From: "J. Bruce Fields" Date: Wed, 6 Dec 2006 23:18:05 -0500 Subject: [PATCH 1/4] Documentation: reorganize cvs-migration.txt Modify cvs-migration.txt so it explains first how to develop against a shared repository, then how to set up a shared repository, then how to import a repository from cvs. Though this seems chronologically backwards, it's still readable in this order, and it puts the more commonly needed material closer to the front. Remove the annotate/pickaxe section; perhaps it can find a place elsewhere in the future. Remove most of the "why git is better than cvs" stuff from the introduction. Add some minor clarifications, including two that have come up several times on the mailing list: 1. Recommend committing any changes before running pull. 2. Note that changes must be committed before they can be pushed. Update the clone discussion to reflect the new --use-separate-remote default, and add a brief mention of git-cvsserver. Signed-off-by: J. Bruce Fields Signed-off-by: Junio C Hamano --- Documentation/cvs-migration.txt | 371 ++++++++++---------------------- 1 file changed, 115 insertions(+), 256 deletions(-) diff --git a/Documentation/cvs-migration.txt b/Documentation/cvs-migration.txt index a436180dd4..47846bdab2 100644 --- a/Documentation/cvs-migration.txt +++ b/Documentation/cvs-migration.txt @@ -1,33 +1,106 @@ git for CVS users ================= -So you're a CVS user. That's OK, it's a treatable condition. The job of -this document is to put you on the road to recovery, by helping you -convert an existing cvs repository to git, and by showing you how to use a -git repository in a cvs-like fashion. +Git differs from CVS in that every working tree contains a repository with +a full copy of the project history, and no repository is inherently more +important than any other.
However, you can emulate the CVS model by +designating a single shared repository which people can synchronize with; +this document explains how to do that. Some basic familiarity with git is required. This link:tutorial.html[tutorial introduction to git] should be sufficient. -First, note some ways that git differs from CVS: +Developing against a shared repository +-------------------------------------- - * Commits are atomic and project-wide, not per-file as in CVS. +Suppose a shared repository is set up in /pub/repo.git on the host +foo.com. Then as an individual committer you can clone the shared +repository over ssh with: - * Offline work is supported: you can make multiple commits locally, - then submit them when you're ready. +------------------------------------------------ +$ git clone foo.com:/pub/repo.git/ my-project +$ cd my-project +------------------------------------------------ - * Branching is fast and easy. +and hack away. The equivalent of `cvs update` is - * Every working tree contains a repository with a full copy of the - project history, and no repository is inherently more important than - any other. However, you can emulate the CVS model by designating a - single shared repository which people can synchronize with; see below - for details. +------------------------------------------------ +$ git pull origin +------------------------------------------------ - * Since every working tree contains a repository, a commit in your - private repository will not publish your changes; it will only create - a revision. You have to "push" your changes to a public repository to - make them visible to others. +which merges in any work that others might have done since the clone +operation. If there are uncommitted changes in your working tree, commit +them first before running git pull. 
+ +[NOTE] +================================ +The first `git clone` places the following in the +`my-project/.git/remotes/origin` file, and that's why the previous step +and the next step both work. +------------ +URL: foo.com:/pub/project.git/ +Pull: refs/heads/master:refs/remotes/origin/master +------------ +================================ + +You can update the shared repository with your changes by first commiting +your changes, and then using: + +------------------------------------------------ +$ git push origin master +------------------------------------------------ + +to "push" those commits to the shared repository. If someone else has +updated the repository more recently, `git push`, like `cvs commit`, will +complain, in which case you must pull any changes before attempting the +push again. + +In the `git push` command above we specify the name of the remote branch +to update (`master`). If we leave that out, `git push` tries to update +any branches in the remote repository that have the same name as a branch +in the local repository. So the last `push` can be done with either of: + +------------ +$ git push origin +$ git push foo.com:/pub/project.git/ +------------ + +as long as the shared repository does not have any branches +other than `master`. + +Setting Up a Shared Repository +------------------------------ + +We assume you have already created a git repository for your project, +possibly created from scratch or from a tarball (see the +link:tutorial.html[tutorial]), or imported from an already existing CVS +repository (see the next section). + +If your project's working directory is /home/alice/myproject, you can +create a shared repository at /pub/repo.git with: + +------------------------------------------------ +$ git clone -bare /home/alice/myproject /pub/repo.git +------------------------------------------------ + +Next, give every team member read/write access to this repository. 
One +easy way to do this is to give all the team members ssh access to the +machine where the repository is hosted. If you don't want to give them a +full shell on the machine, there is a restricted shell which only allows +users to do git pushes and pulls; see gitlink:git-shell[1]. + +Put all the committers in the same group, and make the repository +writable by that group: + +------------------------------------------------ +$ cd /pub +$ chgrp -R $group repo.git +$ find repo.git -mindepth 1 -type d |xargs chmod ug+rwx,g+s +$ GIT_DIR=repo.git git repo-config core.sharedrepository true +------------------------------------------------ + +Make sure committers have a umask of at most 027, so that the directories +they create are writable and searchable by other group members. Importing a CVS archive ----------------------- @@ -60,14 +133,32 @@ work, you must not modify the imported branches; instead, create new branches for your own changes, and merge in the imported branches as necessary. -Development Models ------------------- +Advanced Shared Repository Management +------------------------------------- + +Git allows you to specify scripts called "hooks" to be run at certain +points. You can use these, for example, to send all commits to the shared +repository to a mailing list. See link:hooks.html[Hooks used by git]. + +You can enforce finer grained permissions using update hooks. See +link:howto/update-hook-example.txt[Controlling access to branches using +update hooks]. + +Providing CVS Access to a git Repository +---------------------------------------- + +It is also possible to provide true CVS access to a git repository, so +that developers can still use CVS; see gitlink:git-cvsserver[1] for +details. + +Alternative Development Models +------------------------------ CVS users are accustomed to giving a group of developers commit access to -a common repository. In the next section we'll explain how to do this -with git. 
However, the distributed nature of git allows other development -models, and you may want to first consider whether one of them might be a -better fit for your project. +a common repository. As we've seen, this is also possible with git. +However, the distributed nature of git allows other development models, +and you may want to first consider whether one of them might be a better +fit for your project. For example, you can choose a single person to maintain the project's primary public repository. Other developers then clone this repository @@ -80,235 +171,3 @@ variants of this model. With a small group, developers may just pull changes from each other's repositories without the need for a central maintainer. - -Creating a Shared Repository ----------------------------- - -Start with an ordinary git working directory containing the project, and -remove the checked-out files, keeping just the bare .git directory: - ------------------------------------------------- -$ mv project/.git /pub/repo.git -$ rm -r project/ ------------------------------------------------- - -Next, give every team member read/write access to this repository. One -easy way to do this is to give all the team members ssh access to the -machine where the repository is hosted. If you don't want to give them a -full shell on the machine, there is a restricted shell which only allows -users to do git pushes and pulls; see gitlink:git-shell[1]. - -Put all the committers in the same group, and make the repository -writable by that group: - ------------------------------------------------- -$ chgrp -R $group repo.git -$ find repo.git -mindepth 1 -type d |xargs chmod ug+rwx,g+s -$ GIT_DIR=repo.git git repo-config core.sharedrepository true ------------------------------------------------- - -Make sure committers have a umask of at most 027, so that the directories -they create are writable and searchable by other group members. 
- -Performing Development on a Shared Repository ---------------------------------------------- - -Suppose a repository is now set up in /pub/repo.git on the host -foo.com. Then as an individual committer you can clone the shared -repository: - ------------------------------------------------- -$ git clone foo.com:/pub/repo.git/ my-project -$ cd my-project ------------------------------------------------- - -and hack away. The equivalent of `cvs update` is - ------------------------------------------------- -$ git pull origin ------------------------------------------------- - -which merges in any work that others might have done since the clone -operation. - -[NOTE] -================================ -The first `git clone` places the following in the -`my-project/.git/remotes/origin` file, and that's why the previous step -and the next step both work. ------------- -URL: foo.com:/pub/project.git/ my-project -Pull: master:origin ------------- -================================ - -You can update the shared repository with your changes by first commiting -your changes, and then using: - ------------------------------------------------- -$ git push origin master ------------------------------------------------- - -to "push" those commits to the shared repository. If someone else has -updated the repository more recently, `git push`, like `cvs commit`, will -complain, in which case you must pull any changes before attempting the -push again. - -In the `git push` command above we specify the name of the remote branch -to update (`master`). If we leave that out, `git push` tries to update -any branches in the remote repository that have the same name as a branch -in the local repository. So the last `push` can be done with either of: - ------------- -$ git push origin -$ git push repo.shared.xz:/pub/scm/project.git/ ------------- - -as long as the shared repository does not have any branches -other than `master`. 
- -[NOTE] -============ -Because of this behavior, if the shared repository and the developer's -repository both have branches named `origin`, then a push like the above -attempts to update the `origin` branch in the shared repository from the -developer's `origin` branch. The results may be unexpected, so it's -usually best to remove any branch named `origin` from the shared -repository. -============ - -Advanced Shared Repository Management -------------------------------------- - -Git allows you to specify scripts called "hooks" to be run at certain -points. You can use these, for example, to send all commits to the shared -repository to a mailing list. See link:hooks.html[Hooks used by git]. - -You can enforce finer grained permissions using update hooks. See -link:howto/update-hook-example.txt[Controlling access to branches using -update hooks]. - -CVS annotate ------------- - -So, something has gone wrong, and you don't know whom to blame, and -you're an ex-CVS user and used to do "cvs annotate" to see who caused -the breakage. You're looking for the "git annotate", and it's just -claiming not to find such a script. You're annoyed. - -Yes, that's right. Core git doesn't do "annotate", although it's -technically possible, and there are at least two specialized scripts out -there that can be used to get equivalent information (see the git -mailing list archives for details). - -git has a couple of alternatives, though, that you may find sufficient -or even superior depending on your use. One is called "git-whatchanged" -(for obvious reasons) and the other one is called "pickaxe" ("a tool for -the software archaeologist"). - -The "git-whatchanged" script is a truly trivial script that can give you -a good overview of what has changed in a file or a directory (or an -arbitrary list of files or directories). 
The "pickaxe" support is an -additional layer that can be used to further specify exactly what you're -looking for, if you already know the specific area that changed. - -Let's step back a bit and think about the reason why you would -want to do "cvs annotate a-file.c" to begin with. - -You would use "cvs annotate" on a file when you have trouble -with a function (or even a single "if" statement in a function) -that happens to be defined in the file, which does not do what -you want it to do. And you would want to find out why it was -written that way, because you are about to modify it to suit -your needs, and at the same time you do not want to break its -current callers. For that, you are trying to find out why the -original author did things that way in the original context. - -Many times, it may be enough to see the commit log messages of -commits that touch the file in question, possibly along with the -patches themselves, like this: - - $ git-whatchanged -p a-file.c - -This will show log messages and patches for each commit that -touches a-file. - -This, however, may not be very useful when this file has many -modifications that are not related to the piece of code you are -interested in. You would see many log messages and patches that -do not have anything to do with the piece of code you are -interested in. As an example, assuming that you have this piece -of code that you are interested in in the HEAD version: - - if (frotz) { - nitfol(); - } - -you would use git-rev-list and git-diff-tree like this: - - $ git-rev-list HEAD | - git-diff-tree --stdin -v -p -S'if (frotz) { - nitfol(); - }' - -We have already talked about the "\--stdin" form of git-diff-tree -command that reads the list of commits and compares each commit -with its parents (otherwise you should go back and read the tutorial). 
-The git-whatchanged command internally runs -the equivalent of the above command, and can be used like this: - - $ git-whatchanged -p -S'if (frotz) { - nitfol(); - }' - -When the -S option is used, git-diff-tree command outputs -differences between two commits only if one tree has the -specified string in a file and the corresponding file in the -other tree does not. The above example looks for a commit that -has the "if" statement in it in a file, but its parent commit -does not have it in the same shape in the corresponding file (or -the other way around, where the parent has it and the commit -does not), and the differences between them are shown, along -with the commit message (thanks to the -v flag). It does not -show anything for commits that do not touch this "if" statement. - -Also, in the original context, the same statement might have -appeared at first in a different file and later the file was -renamed to "a-file.c". CVS annotate would not help you to go -back across such a rename, but git would still help you in such -a situation. For that, you can give the -C flag to -git-diff-tree, like this: - - $ git-whatchanged -p -C -S'if (frotz) { - nitfol(); - }' - -When the -C flag is used, file renames and copies are followed. -So if the "if" statement in question happens to be in "a-file.c" -in the current HEAD commit, even if the file was originally -called "o-file.c" and then renamed in an earlier commit, or if -the file was created by copying an existing "o-file.c" in an -earlier commit, you will not lose track. If the "if" statement -did not change across such a rename or copy, then the commit that -does rename or copy would not show in the output, and if the -"if" statement was modified while the file was still called -"o-file.c", it would find the commit that changed the statement -when it was in "o-file.c". 
- -NOTE: The current version of "git-diff-tree -C" is not eager - enough to find copies, and it will miss the fact that a-file.c - was created by copying o-file.c unless o-file.c was somehow - changed in the same commit. - -You can use the --pickaxe-all flag in addition to the -S flag. -This causes the differences from all the files contained in -those two commits, not just the differences between the files -that contain this changed "if" statement: - - $ git-whatchanged -p -C -S'if (frotz) { - nitfol(); - }' --pickaxe-all - -NOTE: This option is called "--pickaxe-all" because -S - option is internally called "pickaxe", a tool for software - archaeologists. From 46732fae3d049254f4f12b8a716cf56159277eda Mon Sep 17 00:00:00 2001 From: Nicolas Pitre Date: Wed, 6 Dec 2006 23:01:00 -0500 Subject: [PATCH 2/4] change the unpack limit treshold to a saner value Currently the threshold is 5000. The likelihood of this value to ever be crossed for a single push is really small making it not really useful. The optimal threshold for a pure space saving on a filesystem with 4kb blocks is 3. However this is likely to create many small packs concentrating a large number of files in a single directory compared to the same objects which are spread over 256 directories when loose. This means we would need 512 objects per pack on average to approximate the same directory cost (a pack has 2 files because of the index). But 512 is a really high value just like 5000 since most pushes are unlikely to have that many objects. So let's try with a value of 100 which should have a good balance between small pushes going to be exploded into loose objects and large pushes kept as whole packs. This is not a replacement for periodic repacks of course.
Signed-off-by: Nicolas Pitre Signed-off-by: Junio C Hamano --- receive-pack.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/receive-pack.c b/receive-pack.c index a20bc924d6..e76d9aea31 100644 --- a/receive-pack.c +++ b/receive-pack.c @@ -11,7 +11,7 @@ static const char receive_pack_usage[] = "git-receive-pack "; static int deny_non_fast_forwards = 0; -static int unpack_limit = 5000; +static int unpack_limit = 100; static int report_status; static char capabilities[] = " report-status delete-refs "; From 4f88d3e0cbf443cd309c2c881209f3366f14023d Mon Sep 17 00:00:00 2001 From: Martin Langhoff Date: Thu, 7 Dec 2006 16:38:50 +1300 Subject: [PATCH 3/4] cvsserver: Avoid miscounting bytes in Perl v5.8.x At some point between v5.6 and 5.8 Perl started to assume its input, output and filehandles are UTF-8. This breaks the counting of bytes for the CVS protocol, resulting in the client expecting less data than we actually send, and storing truncated files. Signed-off-by: Martin Langhoff Signed-off-by: Junio C Hamano --- git-cvsserver.perl | 1 + 1 file changed, 1 insertion(+) diff --git a/git-cvsserver.perl b/git-cvsserver.perl index ca519b7e49..197014d9e6 100755 --- a/git-cvsserver.perl +++ b/git-cvsserver.perl @@ -17,6 +17,7 @@ use strict; use warnings; +use bytes; use Fcntl; use File::Temp qw/tempdir tempfile/; From db9819a40a56b4747931e637c1c22a104dcab902 Mon Sep 17 00:00:00 2001 From: "J. Bruce Fields" Date: Fri, 8 Dec 2006 01:27:21 -0500 Subject: [PATCH 4/4] Documentation: update git-clone man page with new behavior Update git-clone man page to reflect recent changes (--use-separate-remote default and use of .git/config instead of remotes files), and rewrite introduction. Signed-off-by: J. 
Bruce Fields Signed-off-by: Junio C Hamano --- Documentation/git-clone.txt | 25 ++++++++++++------------- 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt index d5efa00dea..985043faca 100644 --- a/Documentation/git-clone.txt +++ b/Documentation/git-clone.txt @@ -16,22 +16,21 @@ SYNOPSIS DESCRIPTION ----------- -Clones a repository into a newly created directory. All remote -branch heads are copied under `$GIT_DIR/refs/heads/`, except -that the remote `master` is also copied to `origin` branch. -In addition, `$GIT_DIR/remotes/origin` file is set up to have -this line: +Clones a repository into a newly created directory, creates +remote-tracking branches for each branch in the cloned repository +(visible using `git branch -r`), and creates and checks out a master +branch equal to the cloned repository's master branch. - Pull: master:origin - -This is to help the typical workflow of working off of the -remote `master` branch. Every time `git pull` without argument -is run, the progress on the remote `master` branch is tracked by -copying it into the local `origin` branch, and merged into the -branch you are currently working on. Remote branches other than -`master` are also added there to be tracked. +After the clone, a plain `git fetch` without arguments will update +all the remote-tracking branches, and a `git pull` without +arguments will in addition merge the remote master branch into the +current branch. +This default configuration is achieved by creating references to +the remote branch heads under `$GIT_DIR/refs/remotes/origin` and +by initializing `remote.origin.url` and `remote.origin.fetch` +configuration variables. OPTIONS -------