week11-12

For the ‘git diff –check’ test, my mentor highlighted some mistakes and areas for improvement:

The ‘git config’ in its current form doesn’t apply the configuration to any of the test repositories; instead, it applies the config to the parent directory. To apply the config to the test repositories, the correct command should be:

test_all_match git config core.whitespace -trailing-space,-space-before-tab 

Add ‘folder1/a’ to the index then “re-sparsifying” ‘folder1/‘ with ‘git sparse-checkout reapply’.

Use the ‘–cached’ option to ‘git diff’ to compare “index vs. HEAD” rather than “working tree vs. index”.

However, with these modifications to the test, it failed for the ‘sparse-index’ case. My mentor believes that the reason for the test failure isn’t an issue with my test setup. It doesn’t seem to be a problem stemming from my patch, but rather a bug in ‘diff’. This bug can be replicated with the following test, which uses the infrastructure in t1092:

test_expect_failure 'diff --cached shows differences in sparse directory' '
       init_repos &&

       test_all_match git reset --soft update-folder1 &&
    test_all_match git diff --cached -- folder1/a
'

The issue is in ‘oneway_diff()’. The pathspec of ‘folder1/a’ doesn’t match ‘folder1/‘ according to ‘ce_path_match()’, so ‘do_oneway_diff()’ isn’t called on ‘folder1/‘ like it should be. Specifically, in the sparse index mode, ‘folder1/‘ can’t be expanded, leading to ‘folder1/a’ not being properly recognized. I’m considering your suggestion to utilize ‘pathspec_needs_expanded_index()’. If the path doesn’t need to expand the index, I’ll just stick to what ‘ce_path_match’ originally did (return match_pathspec(istate, pathspec, ce->name, ce_namelen(ce), 0, seen, S_ISDIR(ce->ce_mode) || S_ISGITLINK(ce->ce_mode)). But if the path does need to expand the index, I plan to find the corresponding tree for the path, using ‘tree_entry_interesting()’ to check if my path exists within the tree. Finding the tree corresponding to the pathspec is a tricky issue, and my current idea is to use the ‘git_fnmatch()’ function. I plan to start from the root directory, traversing each tree, recursively exploring the top-level tree, and looking for all trees matching a specific pattern. But I’m concerned it might really slow things down.

After discussing with Victoria, she believes that it’s not feasible to find the trees corresponding to the pathspec without incurring a significant tree traversal cost. She proposed a potential solution for me to try: || (pathspec_needs_expanded_index(…) && S_ISSPARSEDIR(tree)). Unfortunately, this approach still couldn’t successfully address the issue. As a result, I’ve temporarily marked this problem as NEEDWORK and submitted the ‘check-attr’ patch. I hope to explore new methods to rectify this issue in the coming period.