Skip to content

*: Update comment on Ingest() on callers needing to Sync() #835

Merged
itsbilal merged 1 commit intocockroachdb:masterfrom
itsbilal:sync-on-ingest
Jul 23, 2020
Merged

*: Update comment on Ingest() on callers needing to Sync() #835
itsbilal merged 1 commit intocockroachdb:masterfrom
itsbilal:sync-on-ingest

Conversation

@itsbilal
Copy link
Copy Markdown
Contributor

@itsbilal itsbilal commented Jul 22, 2020

This change updates the comment at the function signature of
db.Ingest to highlight that all callers to that method need to
ensure all writes to sstables are synced to disk, to guarantee
durability of ingested data.

Fixes #787.

@itsbilal itsbilal requested a review from petermattis July 22, 2020 15:54
@itsbilal itsbilal self-assigned this Jul 22, 2020
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@itsbilal itsbilal requested a review from jbowens July 22, 2020 15:57
Copy link
Copy Markdown
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 files reviewed, 2 unresolved discussions (waiting on @itsbilal, @jbowens, and @petermattis)


ingest.go, line 508 at r1 (raw file):

	// files before calling Ingest, but do it again just in case.
	for _, path := range paths {
		f, err := d.opts.FS.Open(path)

Is it kosher to call Sync on a file opened read-only? Does the Sync actually do anything?


ingest.go, line 513 at r1 (raw file):

		}
		if err := f.Sync(); err != nil {
			return err

This leaks an open file descriptor on error. You might want to use firstError here:

err := f.Sync()
err = firstError(err, f.Close())
if err != nil { ... }

Copy link
Copy Markdown
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 files reviewed, 2 unresolved discussions (waiting on @itsbilal, @jbowens, and @petermattis)


ingest.go, line 508 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Is it kosher to call Sync on a file opened read-only? Does the Sync actually do anything?

The internets are not giving me a definitive answer here. Apparently this works for directories under linux. Not clear about files. Not clear about whether this works on Windows. Sigh.

@itsbilal
Copy link
Copy Markdown
Contributor Author

Did some research on the interwebs, and it seems like fsync on some versions of linux often means "flush all bytes since last successful fsync on this file handle". Behaviour with Linux also seems FS dependent. No idea about Windows. The standard seems to advise write-sync-close as opposed to write-close-reopen-sync, as the former is a better durability guarantee.

I'm leaning towards just relying on the callers to call Sync. What do you think?

Postgres' mailing list thread on this: https://www.postgresql.org/message-id/flat/CAMsr%2BYHh%2B5Oq4xziwwoEfhoTZgr07vdGG%2Bhu%3D1adXx59aTeaoQ%40mail.gmail.com#CAMsr+YHh+5Oq4xziwwoEfhoTZgr07vdGG+hu=1adXx59aTeaoQ@mail.gmail.com
Postgres wiki entry on errored fsyncs and retrying them: https://wiki.postgresql.org/wiki/Fsync_Errors
Stack Overflow question: https://stackoverflow.com/questions/37288453/calling-fsync2-after-close2
Linux-fsdevel mailing list: https://marc.info/?l=linux-fsdevel&m=152535409207496&w=2

@petermattis
Copy link
Copy Markdown
Collaborator

I'm leaning towards just relying on the callers to call Sync. What do you think?

Based on these docs, I think that sounds reasonable. I think we should enhance the comment on DB.Ingest to make this requirement clear. Include these discussion links so that our future self can easily find them.

This change updates the comment at the function signature of
db.Ingest to highlight that all callers to that method need to
ensure all writes to sstables are synced to disk, to guarantee
durability of ingested data.

Fixes cockroachdb#787.
@itsbilal
Copy link
Copy Markdown
Contributor Author

Done. This is now a comment-only PR.

@itsbilal itsbilal changed the title *: Call Sync() when ingesting files *: Update comment on Ingest() on callers needing to Sync() Jul 23, 2020
Copy link
Copy Markdown
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 0 of 1 files reviewed, 2 unresolved discussions (waiting on @itsbilal and @jbowens)

@itsbilal
Copy link
Copy Markdown
Contributor Author

TFTR!

@itsbilal itsbilal merged commit a9d18a2 into cockroachdb:master Jul 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

db: sync ingested sstables

3 participants