
integration: fix TestV3HashKV hash mismatch #8323

Merged

Conversation

fanminshi
Member

TestV3HashKV returns a hash mismatch because UnsafeForEach traverses the mvcc backend in a different order depending on whether there are uncommitted items.
Before uncommitted items are committed, UnsafeForEach traverses them first, then the items in boltdb.
After the uncommitted items have been committed, UnsafeForEach traverses boltdb first, then the recently committed items.
Hence, hashing the same items in a different order results in a hash mismatch.
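
For intuition on why the order matters: a running hash over a key-value stream is order-sensitive, so feeding the same items in a different order yields a different sum. A minimal standalone sketch (crc32 here is only illustrative; it stands in for whatever hash HashKV actually computes):

package main

import (
	"fmt"
	"hash/crc32"
)

func main() {
	// hash the same two key-value pairs in two different orders
	h1 := crc32.NewIEEE()
	for _, kv := range []string{"a=1", "b=2"} {
		h1.Write([]byte(kv))
	}
	h2 := crc32.NewIEEE()
	for _, kv := range []string{"b=2", "a=1"} {
		h2.Write([]byte(kv))
	}
	fmt.Println(h1.Sum32() == h2.Sum32()) // false: same items, different order
}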

dups[string(k)] = struct{}{}
return nil
}
f1 := func(k, v []byte) error {
Contributor

f1 doesn't do anything; it can be replaced with visitor to simplify. The extra unlock can also be dropped, since lock/unlock should be paired where convenient, and the names can be more descriptive now that the function is more complicated:

func (rt *readTx) UnsafeForEach(bucketName []byte, visitor func(k, v []byte) error) error {
	dups := make(map[string]struct{})
	getDups := func(k, v []byte) error {
		dups[string(k)] = struct{}{}
		return nil
	}
	visitNoDup := func(k, v []byte) error {
		if _, ok := dups[string(k)]; ok {
			return nil
		}
		return visitor(k, v)
	}
	// record the keys held in the write-back buffer
	if err := rt.buf.ForEach(bucketName, getDups); err != nil {
		return err
	}
	// visit boltdb first, skipping keys the buffer will supply
	rt.txmu.Lock()
	err := unsafeForEach(rt.tx, bucketName, visitNoDup)
	rt.txmu.Unlock()
	if err != nil {
		return err
	}
	// then visit the buffered (uncommitted) items
	return rt.buf.ForEach(bucketName, visitor)
}
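
The unsafeForEach helper called above is not shown in this thread; a plausible sketch, assuming it simply delegates to bolt's per-bucket iteration (hypothetical, for context only):

// hypothetical sketch of the unsafeForEach helper referenced above;
// assumes it wraps bolt's (*Bucket).ForEach and is not taken from this diff
func unsafeForEach(tx *bolt.Tx, bucket []byte, visitor func(k, v []byte) error) error {
	if b := tx.Bucket(bucket); b != nil {
		return b.ForEach(visitor)
	}
	return nil
}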

@@ -156,12 +156,11 @@ func TestV3HashKV(t *testing.T) {
 	kvc := toGRPC(clus.RandClient()).KV
 	mvc := toGRPC(clus.RandClient()).Maintenance

-	for i := 0; i < 10; i++ {
+	for i := 0; i < 100; i++ {
Contributor

There should be a backend test that exercises the bug; triggering it at this level is too indirect.

Here's the test I wrote to confirm this patch works as expected:

// TestBackendWritebackForEach checks that partially written / buffered
// data is visited in the same order as fully committed data.
func TestBackendWritebackForEach(t *testing.T) {
	b, tmpPath := NewTmpBackend(time.Hour, 10000)
	defer cleanup(b, tmpPath)

	tx := b.BatchTx()
	tx.Lock()
	tx.UnsafeCreateBucket([]byte("key"))
	for i := 0; i < 5; i++ {
		k := []byte(fmt.Sprintf("%04d", i))
		tx.UnsafePut([]byte("key"), k, []byte("bar"))
	}
	tx.Unlock()

	// writeback
	b.ForceCommit()

	tx.Lock()
	tx.UnsafeCreateBucket([]byte("key"))
	for i := 5; i < 20; i++ {
		k := []byte(fmt.Sprintf("%04d", i))
		tx.UnsafePut([]byte("key"), k, []byte("bar"))
	}
	tx.Unlock()

	seq := ""
	getSeq := func(k, v []byte) error {
		seq += string(k)
		return nil
	}
	rtx := b.ReadTx()
	rtx.Lock()
	rtx.UnsafeForEach([]byte("key"), getSeq)
	rtx.Unlock()

	partialSeq := seq

	seq = ""
	b.ForceCommit()

	tx.Lock()
	tx.UnsafeForEach([]byte("key"), getSeq)
	tx.Unlock()

	if seq != partialSeq {
		t.Fatalf("expected %q, got %q", seq, partialSeq)
	}
}
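
To run just this backend test (assuming it is added under etcd's mvcc/backend package):

go test -v -run TestBackendWritebackForEach ./mvcc/backend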

This PR changes UnsafeForEach to traverse boltdb before the buffer.
This ordering guarantees that UnsafeForEach traverses in the same order
before and after the buffer is committed.
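
A small standalone sketch of why the new ordering is stable (assuming buffered writes use keys that sort after everything already in boltdb, as in the test above; mvcc writes revision-encoded keys that grow monotonically, so this assumption matches etcd's usage):

package main

import (
	"fmt"
	"sort"
)

func main() {
	committed := []string{"0000", "0001", "0002"} // already in boltdb
	buffered := []string{"0003", "0004"}          // still in the write-back buffer

	// pre-fix traversal: buffer first, then boltdb
	oldOrder := fmt.Sprint(append(append([]string{}, buffered...), committed...))
	// post-fix traversal: boltdb first, then buffer
	newOrder := fmt.Sprint(append(append([]string{}, committed...), buffered...))

	// after the buffer commits, everything is in boltdb in sorted key order
	all := append(append([]string{}, committed...), buffered...)
	sort.Strings(all)
	afterCommit := fmt.Sprint(all)

	fmt.Println(oldOrder == afterCommit) // false: old order changes across a commit
	fmt.Println(newOrder == afterCommit) // true: new order is the same before and after
}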
@fanminshi fanminshi force-pushed the fix_TestV3HashKV_Hash_MisMatch branch from c6f791e to eef5923 Compare July 28, 2017 16:31
@@ -173,7 +172,7 @@ func TestV3HashKV(t *testing.T) {

 	prevHash := hresp.Hash
 	prevCompactRev := hresp.CompactRevision
-	for i := 0; i < 10; i++ {
+	for i := 0; i < 100; i++ {
Contributor

please keep this integration test as-is; the backend unit test is enough

Member Author

sounds good.

@fanminshi fanminshi force-pushed the fix_TestV3HashKV_Hash_MisMatch branch from eef5923 to 451b062 Compare July 28, 2017 16:39
@fanminshi
Member Author

all fixed. PTAL.

@heyitsanthony left a comment (Contributor)

lgtm once CI passes

@fanminshi
Member Author

via https://jenkins-etcd-public.prod.coreos.systems/job/etcd-coverage/1851/console

# github.com/coreos/etcd
/usr/local/go/pkg/tool/linux_amd64/link: running gcc failed: exit status 1
collect2: error: ld returned 1 exit status

this failure doesn't seem to be related to this PR.

@heyitsanthony
Contributor

it's probably from the CI machine being full

@fanminshi
Member Author

both CI failures are unrelated; the proxy-ci failure is #8330.

@heyitsanthony heyitsanthony merged commit ca58614 into etcd-io:master Jul 28, 2017