Certain modifications showing up inconsistently #1000

Closed
1 task done
ceball opened this issue Dec 16, 2016 · 7 comments
Comments

@ceball

ceball commented Dec 16, 2016

  • I was not able to find an open or closed issue matching what I'm seeing

(I tried - sorry if I failed...)

Setup

  • Which version of Git for Windows are you using? Is it 32-bit or 64-bit?
$ git --version --build-options
git version 2.9.2.windows.1
sizeof-long: 4
  • Which version of Windows are you running? Vista, 7, 8, 10? Is it 32-bit or 64-bit?
$ cmd.exe /c ver

Microsoft Windows [Version 10.0.14393]

  • What options did you set as part of the installation? Or did you choose the
    defaults?
# One of the following:
> type "C:\Program Files\Git\etc\install-options.txt"
> type "C:\Program Files (x86)\Git\etc\install-options.txt"
> type "%USERPROFILE%\AppData\Local\Programs\Git\etc\install-options.txt"
$ cat /etc/install-options.txt
Path Option: Cmd
SSH Option: OpenSSH
CRLF Option: CRLFAlways
Bash Terminal Option: MinTTY
Performance Tweaks FSCache: Enabled
  • Any other interesting things about your environment that might be related
    to the issue you're seeing?
$ cat ~/.gitconfig
...
[core]
        eol = lf
        autocrlf = false
        ...

Details

  • Which terminal/shell are you running Git from? e.g Bash/CMD/PowerShell/other

git bash

If I make a small modification to a binary Excel .xls file, that modification is not always showing up in git - seemingly dependent on whether or not I run git status before modifying the file.

Create a repository, mark .xls as binary, add an xls file:

$ mkdir test && cd test

$ git init
Initialized empty Git repository in C:/Users/chris/AppData/Roaming/test/.git/

$ echo '*.xls -text -diff' > .gitattributes

$ git add .gitattributes

$ git commit -m "init"
[master (root-commit) 66a5c6a] init
 1 file changed, 1 insertion(+)
 create mode 100644 .gitattributes

$ sha256sum test.xls
78b56ecec9f3ead3d230dad6df7da5416e661a155962cdf3aa3d064d09e6f710 *test.xls

$ git add test.xls

$ git commit -m "test file"
[master 0e7a594] test file
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 test.xls

(1) Run git status after checkout - modification to file does not show up:

$ rm test.xls

$ git checkout test.xls

$ sha256sum test.xls
78b56ecec9f3ead3d230dad6df7da5416e661a155962cdf3aa3d064d09e6f710 *test.xls

$ git status
On branch master
nothing to commit, working tree clean

# open test.xls in excel and then close it (don't save)

$ sha256sum test.xls
3eba30b2aa34e2325adb86e0c33a8d6a933f5c6bcab7e61e6d1da9cd0fa80fde *test.xls

$ git status
On branch master
nothing to commit, working tree clean

(2) Don't run git status after checkout - modification to file does show up:

$ rm test.xls

$ git checkout test.xls

$ sha256sum test.xls
78b56ecec9f3ead3d230dad6df7da5416e661a155962cdf3aa3d064d09e6f710 *test.xls

# open test.xls in excel and then close it (don't save)

$ sha256sum test.xls
d9836a97d06a39a29b7a4d84ad82f80075f35a64e5a4d85a016ec9e9c20d9b03 *test.xls

$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   test.xls

no changes added to commit (use "git add" and/or "git commit -a")

  • What did you expect to occur after running these commands?

In both case (1) and case (2), I expected test.xls to show up as modified.

  • What actually happened instead?

Only in case (2) did test.xls show up as modified.
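To check the working-tree bytes independently of what `git status` reports, one option is to compute the blob ID by hand and compare it with the ID recorded in the index (as printed by `git ls-files -s test.xls`). This is a minimal sketch, assuming a default SHA-1 repository and no content conversion, which holds here because `.gitattributes` marks `*.xls` as `-text`. The helper names `git_blob_id` and `file_blob_id` are made up for illustration.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Compute the SHA-1 object ID Git assigns to a blob with these bytes."""
    # Git hashes a small header ("blob <size>\0") followed by the raw content.
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def file_blob_id(path: str) -> str:
    """Hash a file's current on-disk bytes the way Git would (no filters applied)."""
    with open(path, "rb") as f:
        return git_blob_id(f.read())
```

If `file_blob_id("test.xls")` differs from the ID in the index while `git status` still says the tree is clean, the on-disk content has changed without Git noticing, which is what case (1) looks like.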

I happen to have Docker running on this machine. Running the same test (on the same file) in a Linux container, with the repository mounted from the Windows host, I get the result I expected: the file shows up as modified in both cases. However, the container has neither the same version of Git nor the same global Git configuration as the Windows host.

$ git --version
git version 2.1.4

$ cat ~/.gitconfig
...
[core]
        ...
  • If the problem was occurring with a specific repository, can you provide the
    URL to that repository to help us with testing?

Repository can be created using commands above. I've attached test.xls and test2.xls in a zip file - test2.xls is the result of opening test.xls in Excel and then closing it (without explicitly saving).

test.zip

@dscho
Member

dscho commented May 15, 2017

Sorry for the long silence, this ticket simply slipped under my radar.

I just tried this, and can confirm. Upon closer inspection using stat test.xls in Git Bash, it would appear that Excel modifies the change time along with the bytes on disk, but not the modified time.

I fear that the described problem is related to the fact that Git for Windows has to take a couple of shortcuts when trying to emulate Linux semantics. In particular, when the so-called "stat" data (essentially, all the metadata for a given file) is emulated, we use the FindFirstFile()/FindNextFile() API, which gives us the time of the last access, the time of the last modification and the creation time. Sadly, that differs slightly from the POSIX semantics that Git wants to see: the first two times mean the same thing, but the ctime refers to the change time, not the creation time.

But we do not have a fast way to get at the change time, only the access time, modified time and creation time. We could get the change time via the ChangeTime field in the FILE_BASIC_INFO data structure filled in by the GetFileInformationByHandleEx() function, but that requires a HANDLE, which we can only obtain using CreateFile() (which is orders of magnitude slower than FindFirstFile()/FindNextFile()).

So what Git for Windows does is rely on applications to update the modified time when changing any file contents. But that is not the case with Excel.

I fear there is not really anything we can do here, not unless we want to slow down Git for Windows dramatically (in most cases, for no good reason)...
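To make the failure mode concrete, here is a simplified sketch of stat-based change detection. It is not Git's actual code: Git's index records more fields than this, and the `(size, mtime)` pair below is only a stand-in for the cached stat data. The helper names are invented for the example. It shows that a tool which trusts size and modified time alone cannot see a same-size rewrite when the writer leaves the modified time untouched.

```python
import hashlib
import os
import tempfile


def stat_signature(path):
    """Return (size, mtime_ns), a stand-in for the stat data a tool might cache."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def content_hash(path):
    """Hash the actual bytes on disk."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def rewrite_preserving_mtime(path, new_bytes):
    """Overwrite the file, then restore atime/mtime, like an app that never bumps mtime."""
    st = os.stat(path)
    with open(path, "wb") as f:
        f.write(new_bytes)
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns))


def demo():
    """Return (signature_unchanged, content_changed) for a same-size rewrite."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "test.bin")
        with open(path, "wb") as f:
            f.write(b"AAAA")
        sig_before, hash_before = stat_signature(path), content_hash(path)
        rewrite_preserving_mtime(path, b"BBBB")  # same length, different bytes
        sig_after, hash_after = stat_signature(path), content_hash(path)
        return sig_before == sig_after, hash_before != hash_after
```

In this sketch the stat signature is identical before and after while the content hash differs. By the same logic, anything that does update the modified time (for example `os.utime(path)` with no times given) would make a stat-based check look at the content again.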

@PhilipOakley

PhilipOakley commented May 20, 2017 via email

@PhilipOakley

PhilipOakley commented May 21, 2017 via email

@dscho
Member

dscho commented May 22, 2017

Is this issue (Excel!) something that should get a note in the Known Issues section?

I think that would be really nice. Could you open a PR?

@dscho
Member

dscho commented Apr 4, 2018

I'll just mark this "up for grabs". Maybe somebody else will be motivated enough to add this valuable information (while I am busy with the non-fun aspects of Git for Windows).

@E3V3A

E3V3A commented Nov 22, 2018

Just FWIW.
It seems that it depends on the FS used (and the underlying OS drivers for that FS).
My findings are that:

#--------------------------------------
# [a/c/m]time
#--------------------------------------
# On Windows (via Cygwin & Python3):
#   The creation time is:       aTime           (.CreationTime === .LastAccessTime in PowerShell, but known as "access" time in Linux)
#   The modification time is:   mTime == cTime  (.LastWriteTime in PowerShell)
# 
# On Linux:
#   The creation time is:       cTime
#   The modification time is:   mTime
#   The access time is:         aTime           (normally not used)
# 
# ==> For seeing last modification time, use "cTime" on Windows FS's, and "mTime" on *linux FS's
#--------------------------------------

IDK why an Excel file would behave differently from any other "Windows" generated file, in this respect.
Does it?
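For anyone who wants to repeat this kind of check, here is a small sketch using Python's `os.stat`. One caveat to keep in mind: on Unix, `st_ctime` is the time of the last metadata change, whereas on Windows Python has traditionally reported the creation time in that field, so the same field name means different things on the two platforms. The function names below are made up for illustration.

```python
import os
import tempfile


def timestamps(path):
    """Return the three classic stat timestamps in nanoseconds."""
    st = os.stat(path)
    return {"atime": st.st_atime_ns, "mtime": st.st_mtime_ns, "ctime": st.st_ctime_ns}


def chmod_effect():
    """Change only metadata (permissions) and return timestamps before and after."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "f.txt")
        with open(path, "w") as f:
            f.write("x")
        before = timestamps(path)
        os.chmod(path, 0o600)  # metadata-only change; file content is untouched
        after = timestamps(path)
        return before, after
```

A metadata-only change such as `chmod` leaves `mtime` alone; on a POSIX filesystem it is `ctime` that records it. That is the distinction between "modified time" and "change time" discussed earlier in this thread.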

@PhilipOakley

I have a memory that Excel does things a little differently (but I could be wrong).

I found https://blogs.technet.microsoft.com/the_microsoft_excel_support_team_blog/2012/08/10/date-modified-time-stamps-on-shared-excel-files-stored-on-a-network-drive-may-not-update-correctly-after-saving-and-closing-the-file/

Also, there are many links on how to determine the last saved date of an Excel worksheet via its internal properties.

The other option is some form of virus scan (is this on a corporate network?) that has 'checked' the file.
