Each window in a subversion delta (svndiff0-format file) starts with a
window header, consisting of five integers with variable-length
representation:
source view offset
source view length
output length
instructions length
auxiliary data length
Parse it. The result is not usable for deltas with nonempty postimage
yet; in fact, this only adds support for deltas without any
instructions or auxiliary data. This is a good place to stop, though,
since that little support lets us add some simple passing tests
concerning error handling to the test suite.
Improved-by: Ramkumar Ramachandra <artagnon@gmail.com>
Improved-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
A delta in the subversion delta (svndiff0) format consists of the
magic bytes SVN\0 followed by a sequence of windows of a certain well
specified format (starting with five integers).
Add an svndiff0_apply function and test-svn-fe -d commandline tool to
parse such a delta in the special case of not including any windows.
Later patches will add features to turn this into a fully functional
delta applier for svn-fe to use to parse the streams produced by
"svnrdump dump" and "svnadmin dump --deltas".
The content of symlinks starts with the word "link " in Subversion's
worldview, so we need to be able to prepend that text to input for the
sake of delta application. So initialization of the input state of
the delta preimage is left to the calling program, giving callers a
chance to seed the buffer with text of their choice.
Improved-by: Ramkumar Ramachandra <artagnon@gmail.com>
Improved-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
buffer_read_binary is a thin wrapper around fread, but its signature
is wrong:
- fread can fill an arbitrary in-memory buffer. buffer_read_binary
is limited to buffers whose size is representable by a 32-bit
integer.
- The result from fread is the number of bytes actually read.
buffer_read_binary only reports the number of bytes read by
incrementing sb->len by that amount and returns void.
Fix both: let buffer_read_binary accept a size_t instead of uint32_t
for the number of bytes to read and as a convenience return the number
of bytes actually read.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Each section of a Subversion-format delta only requires examining (and
keeping in random-access memory) a small portion of the preimage. At
any moment, this portion starts at a certain file offset and has a
well-defined length, and as the delta is applied, the portion advances
from the beginning to the end of the preimage. Add a move_window
function to keep track of this view into the preimage.
You can use it like this:
buffer_init(f, NULL);
struct sliding_view window = SLIDING_VIEW_INIT(f);
move_window(&window, 3, 7); /* (1) */
move_window(&window, 5, 5); /* (2) */
move_window(&window, 12, 2); /* (3) */
strbuf_release(&window.buf);
buffer_deinit(f);
The data structure is called sliding_view instead of _window to
prevent confusion with svndiff0 Windows.
In this example, (1) reads 10 bytes and discards the first 3;
(2) discards the first 2, which are not needed any more; and (3) skips
2 bytes and reads 2 new bytes to work with.
When move_window returns, the file position indicator is at position
window->off + window->width and the data from positions window->off to
the current file position are stored in window->buf.
This function performs only sequential access from the input file and
never seeks, so it can be safely used on pipes and sockets.
On end-of-file, move_window silently reads less than the caller
requested. On other errors, it prints a message and returns -1.
Helped-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
As the svn import infrastructure evolves, it's getting to be a pain to
tell by eye what files were added or removed from a dependency line
like
VCSSVN_OBJS = vcs-svn/string_pool.o vcs-svn/line_buffer.o \
vcs-svn/repo_tree.o vcs-svn/fast_export.o vcs-svn/svndump.o
So use a style with one entry per line instead, like the existing
BUILTIN_OBJS:
# protect against environment
VCSSVN_OBJS =
...
VCSSVN_OBJS += vcs-svn/string_pool.o
VCSSVN_OBJS += vcs-svn/line_buffer.o
...
which is readable on its own and produces nice, clear diffs.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
gcc -m32 correctly warns:
vcs-svn/fast_export.c: In function 'fast_export_commit':
vcs-svn/fast_export.c:54:2: warning: format '%llu' expects
argument of type 'long long unsigned int', but argument 2
has type 'unsigned int' [-Wformat]
Fix it.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
* mg/rev-list-n-parents:
tests: avoid nonportable {foo,bar} glob
rev-list --min-parents,--max-parents: doc, test and completion
revision.c: introduce --min-parents and --max-parents options
t6009: use test_commit() from test-lib.sh
* jc/fetch-progressive-stride:
fetch-pack: use smaller handshake window for initial request
fetch-pack: progressively use larger handshake windows
fetch-pack: factor out hardcoded handshake window size
Conflicts:
builtin/fetch-pack.c
* 'svn-fe' of git://repo.or.cz/git/jrn:
vcs-svn: handle log message with embedded NUL
vcs-svn: avoid unnecessary copying of log message and author
vcs-svn: remove buffer_read_string
vcs-svn: make reading of properties binary-safe
Currently there are two functions to retrieve the mode and content
at a path:
const char *repo_read_path(const uint32_t *path);
uint32_t repo_read_mode(const uint32_t *path)
Replace them with a single function with two return values. This
means we can use one round-trip to get the same information from
fast-import that previously took two.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Pass the log message by strbuf instead of as a C-style string and use
fwrite instead of printf to write it to fast-import so embedded '\0'
bytes can be preserved.
Currently "git log" doesn't show the embedded NULs but "git cat-file
commit" can.
While at it, stop including system headers from repo_tree.h. git
source files need to include git-compat-util.h (or cache.h or
builtin.h) sooner to ensure the appropriate feature test macros are
defined.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Use strbuf_swap when storing the svn:log and svn:author properties, so
pointers to rather than the contents of buffers get copied. The main
effect should be to make the code a little easier to read.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
All previous users of buffer_read_string have already been converted
to use the more intuitive buffer_read_binary, so remove the old API to
avoid some confusion.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
svn-fe errors out on revision 59151 of the ASF repository:
fatal: invalid dump: unexpected end of file
The proximate cause is a property with an embedded NUL character.
Previously such anomalies were ignored but commit c9d1c8ba
(2010-12-28) introduced a check strlen(val) == len to avoid reading
uninitialized data when a property list ends early and unfortunately
this test does not distinguish between "foo" followed by EOF and the
string "foo\0bar\0baz".
Fix it by using buffer_read_binary to read to a strbuf and checking
the actual length read. Most consumers of properties still use
C-style strings, so in practice an author or log message with embedded
NULs will be truncated, but a least this way svn-fe won't error out
(fixing the regression).
Reported-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
For a pull into an unborn branch, we do not use "git merge"
at all. Instead, we call read-tree directly. However, we
used the --reset parameter instead of "-m", which turns off
the safety features.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we merge into an unborn branch, there are basically two
steps:
1. Write the sha1 of the new commit into the ref pointed
to by HEAD.
2. Update the index with the new content, and check it out
to the working tree.
We currently do them in this order. However, (2) is the step
that is much more likely to fail, since it can be blocked by
things like untracked working tree files. When it does, the
merge fails and we are left with an empty index but an
updated HEAD.
This patch switches the order, so that a failure in updating
the index leaves us unchanged. Of course, a failure in
updating the ref now leaves us with an updated index and
mis-matched HEAD. That is arguably not much better, but it
is probably less likely to actually happen.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This file ends up conflicting with the test just after it
(causing the "git merge" to fail). Neither test is to blame
for the bug, though. It looks like the merge in 1a9fe45
(Merge branch 'tr/merge-unborn-clobber', 2011-02-09) is what
caused the conflict.
We didn't notice because the follow-on test is already
marked as expect_failure (even though it has since been
fixed, and now succeeds once the untracked file is moved out
of the way).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This was fixed by 1d718a51 (do not overwrite untracked
symlinks, 2011-02-20).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fractional timezones, like -0330 (NST used in Canada) or +0430
(Afghanistan, Iran DST), were not handled properly in parse_date; this
means values such as 'minute_local' and 'iso-tz' were not generated
correctly.
This was caused by two mistakes:
* sign of timezone was applied only to hour part of offset, and not
as it should be also to minutes part (this affected only negative
fractional timezones).
* 'int $h + $m/60' is 'int($h + $m/60)' and not 'int($h) + $m/60',
so fractional part was discarded altogether ($h is hours, $m is
minutes, which is always less than 60).
Note that positive fractional timezones +0430, +0530 and +1030 can be
found as authortime in git.git repository itself.
For example http://repo.or.cz/w/git.git/commit/88d50e7 had authortime
of "Fri, 8 Jan 2010 18:48:07 +0000 (23:48 +0530)", which is not marked
with 'atnight', when "git show 88d50e7" gives correct author date of
"Sat Jan 9 00:18:07 2010 +0530".
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t_e_i() can return -1 or 2 to early shortcut a search. Current code
may use up to two variables to handle it. One for saving return value
from t_e_i temporarily, one for saving return code 2.
The second variable is not needed. If we make sure the first variable
does not change until the next t_e_i() call, then we can do something
like this:
int ret = 0;
while (...) {
if (ret != 2) {
ret = t_e_i();
if (ret < 0) /* no longer interesting */
break;
if (ret == 0) /* skip this round */
continue;
}
/* ret > 0, interesting */
}
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch changes behavior of the two functions. Previously it does
prefix matching only. Now it can also do wildcard matching.
All callers are updated. Some gain wildcard matching (archive,
checkout), others reset pathspec_item.has_wildcard to retain old
behavior (ls-files, ls-tree as they are plumbing).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
read_tree_recursive() uses a very similar function, match_tree_entry, to
tree_entry_interesting() to do its path matching. This patch kills
match_tree_entry() in favor of tree_entry_interesting().
match_tree_entry(), like older version of tree_entry_interesting(), does
not support wildcard matching. New read_tree_recursive() retains this
behavior by forcing all pathspecs literal.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is to improve the process_tree() function defined in list-objects.c
* en/object-list-with-pathspec:
Add testcases showing how pathspecs are handled with rev-list --objects
Make rev-list --objects work together with pathspecs
The Tcl msgcat package doesn't detect the use of a multi-lingual language
pack on Windows 7. This means that a user may have their display language
set to Japanese but the system installed langauge was English.
This patch reads the relevent registry key to fix this before loading in
the locale specific parts of git-gui.
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
Unlike bash and ksh, dash and busybox ash do not support brace
expansion (as in 'echo {hello,world}'). So when dash is sh,
t6009.13 (set up dodecapus) ends up pass a string beginning with
"root{1,2," to "git merge" verbatim and the test fails.
Fix it by introducing a variable to hold the list of parents for
the dodecapus and populating it in a more low-tech way.
While at it, simplify a little by combining this setup code with the
test it sets up for.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git merge" without specifying any commit is a no-op by default.
A new option merge.defaultupstream can be set to true to cause such an
invocation of the command to merge the upstream branches configured for
the current branch by using their last observed values stored in their
remote tracking branches.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We used to be very casual in terminology and used <branch>, <ref> and
<rev> more or less interchangeably with <commit>. Match the help text
given by "git merge -h" with that of the documentation.
Signed-off-by: Jared Hance <jaredhance@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since e9c8409 (diff-index --cached --raw: show tree entry on the LHS for
unmerged entries., 2007-01-05), an unmerged entry should be detected by
using DIFF_PAIR_UNMERGED(p), not by noticing both one and two sides of
the filepair records mode=0 entries. However, it forgot to update some
parts of the rename detection logic.
This only makes difference in the "diff --cached" codepath where an
unmerged filepair carries information on the entries that came from the
tree. It probably hasn't been noticed for a long time because nobody
would run "diff -M" during a conflict resolution, but "git status" uses
rename detection when it internally runs "diff-index" and "diff-files"
and gives nonsense results.
In an unmerged pair, "one" side can have a valid filespec to record the
tree entry (e.g. what's in HEAD) when running "diff --cached". This can
be used as a rename source to other paths in the index that are not
unmerged. The path that is unmerged by definition does not have the
final content yet (i.e. "two" side cannot have a valid filespec), so it
can never be a rename destination.
Use the DIFF_PAIR_UNMERGED() to detect unmerged filepair correctly, and
allow the valid "one" side of an unmerged filepair to be considered a
potential rename source, but never to be considered a rename destination.
Commit message and first two test cases by Junio, the rest by Martin.
Signed-off-by: Martin von Zweigbergk <martin.von.zweigbergk@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-new-workdir script in contrib/ makes a new work tree by sharing
many subdirectories of the .git directory with the original repository.
When rerere.enabled is set in the original repository, but the user has
not encountered any conflicts yet, the original repository may not yet
have .git/rr-cache directory.
When rerere wants to run in a new work tree created from such a young
original repository, it fails to mkdir(2) .git/rr-cache that is a symlink
to a yet-to-be-created directory.
There are three possible approaches to this:
- A naive solution is not to create a symlink in the git-new-workdir
script to a directory the original does not have (yet). This is not a
solution, as we tend to lazily create subdirectories of .git/, and
having rerere.enabled configuration set is a strong indication that the
user _wants_ to have this lazy creation to happen;
- We could always create .git/rr-cache upon repository creation. This is
tempting but will not help people with existing repositories.
- Detect this case by seeing that mkdir(2) failed with EEXIST, checking
that the path is a symlink, and try running mkdir(2) on the link
target.
This patch solves the issue by doing the third one.
Strictly speaking, this is incomplete. It does not attempt to handle
relative symbolic link that points into the original repository, but this
is good enough to help people who use contrib/workdir/git-new-workdir
script.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Clarify "string of unsigned bytes";
* Blob has two variants (regular file vs symlink), not (blob vs symlink);
* Clarify permission mode bits;
* Clarify ce_namelen() "too long to fit in the length field" case;
* Clarify "." etc are forbidden as path components;
* Match the description with the internal wording "cache-tree";
* All types of extension begin with signature and length as explained in
the first part. Don't repeat the "length" part in the description of
each extension (can be mistaken as if there is a separate 32-bit size
field inside the extension), but state what the signature for each
extension is.
* Don't say "Extension tag", as we have said "Extension signature" in the
first part---be consistent;
* Clarify the invalidation of cache-tree entries;
* Correct description on subtree_nr field in the cache-tree;
* Clarify the order of entries in cache-tree;
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This also adds test for "--merges" and "--no-merges" which we did not
have so far.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce --min-parents and --max-parents options which limit the
revisions to those commits which have at least (or at most) that many
commits, where negative arguments for --max-parents= denote infinity
(i.e. no upper limit).
In particular:
--max-parents=1 is the same as --no-merges;
--min-parents=2 is the same as --merges;
--max-parents=0 shows only roots; and
--min-parents=3 shows only octopus merges
Using --min-parents=n and --max-parents=m with n>m gives you what you ask
for (i.e. nothing) for obvious reasons, just like when you give --merges
(show only merge commits) and --no-merges (show only non-merge commits) at
the same time.
Also, introduce --no-min-parents and --no-max-parents to do the obvious
thing for convenience.
We compute the number of parents only when we limit by that, so there
is no performance impact when there are no limiters.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>