-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
os: File.Sync() for Darwin should use the F_FULLFSYNC fcntl #26650
Comments
Since this issue threatens data loss/corruption, please consider fixing it in 1.11.x and 1.10.x |
Thank you for reporting this issue @networkimprov! It is triaged for 1.12 as 1.11 is scheduled to be released in just a few days. It has also been like this since forever. However we'll definitely consider it as an early candidate for Go1.12 when the tree opens up in the next few weeks. If you are interested in submitting a CL, please feel free, it'd be great to have your contribution in the Go project and community, otherwise we'll get it in for Go1.12. |
I'm not proposing 1.11.0 as a target; rather 1.11.1 & 1.10.4 or the minor releases that appear after a fix lands on tip. Do you have guidance on where fcntl should be called for Darwin File.Sync()? |
The question is whether we should do it in |
A backport I believe warrants something that has plagued previous versions or results for memory corruption, but I'll defer to others on whether this qualifies too -- in this case potential file corruption might qualify :)
@ianlancetaylor I too was leaning towards it going in to // As per /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.13.sdk/usr/include/sys/fcntl.h
const F_FULLFSYNC = 51
func Fsync(fd int) error {
_, err := fcntl(fd, F_FULLFSYNC, 1)
return err
} so in full +// As per /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.13.sdk/usr/include/sys/fcntl.h
+const F_FULLFSYNC = 51
+
+func Fsync(fd int) error {
+ _, err := fcntl(fd, F_FULLFSYNC, 0)
+ return err
+}
+
// SockaddrDatalink implements the Sockaddr interface for AF_LINK type sockets.
type SockaddrDatalink struct {
Len uint8
@@ -332,7 +340,6 @@ func Uname(uname *Utsname) error {
//sys Fstat(fd int, stat *Stat_t) (err error) = SYS_FSTAT64
//sys Fstatat(fd int, path string, stat *Stat_t, flags int) (err error) = SYS_FSTATAT64
//sys Fstatfs(fd int, stat *Statfs_t) (err error) = SYS_FSTATFS64
-//sys Fsync(fd int) (err error)
//sys Ftruncate(fd int, length int64) (err error)
//sys Getdirentries(fd int, buf []byte, basep *uintptr) (n int, err error) = SYS_GETDIRENTRIES64
//sys Getdtablesize() (size int) Also happy to hear what comments others have. |
On the other hand, thinking about it further, most of the syscall functions simply implement that operation on the system, rather than trying to present a unified cross-system API. So that suggests that maybe |
Note that this can hurt end-users without the Go developer ever realizing it: User does xyz in app, OS crashes (or hw fails), user sees odd behavior in app after restart and restores from backup (losing today's work), user assumes problem was due to hw or os.
It's already defined in tree: Agreed you should leave MacOS fsync available in case anyone needs it. |
Perhaps surface MacOS fsync as syscall.FsyncDarwin and implement syscall.Fsync with fcntl? |
Cool, thanks for the discussion @networkimprov and @ianlancetaylor! |
@odeke-em the 1.12 tree is open; can we get this CL posted? |
@networkimprov thank you for the ping and reminder! For sure, let's get the ball rolling. I am in the process of submitting a CL. |
Change https://golang.org/cl/130676 mentions this issue: |
In regards to backporting this fix to Go1.11 and earlier, what do y'all think @ianlancetaylor @andybons? |
This is not a new problem. I don't see a need to backport. |
Roger that! The rationale that @networkimprov proposed, which I thought was reasonable was that because it is an underlying problem that has existed since Go1.0, it'd be nice to have it in the latest Go release since it can lead to disk memory corruption and affects many databases, folks that need flushes to permanent storage. However, as we've all noticed, no one had frantically reported this bug during previous Go cycles hence doesn't meet the bar for a backport. |
This can hurt end-users without the Go developer ever realizing it:
I plan to release two apps near-future which rely on os.File.Sync() for crash protection, and MacOS is widely used by my audience. Not Go-related, but my Git repo was corrupted last week due to a hw failure. |
@ianlancetaylor could you consider cherry-picking this for 1.11.1, given the report from a bbolt db user on MacOS? |
I don't see a report at that link. We've also seen a bug report on this due to the changed performance: #27415. I'm pretty reluctant to make this kind of change in a point release. As I said earlier, it's not a new problem. |
Couldn't bbolt handle this by making the syscall directly? From Go 1.12 onwards they can go back to a plain |
@mark-rushakoff, probably they could. Bleve is also affected and they don't use the bbolt fork yet; hopefully they'll switch if bbolt applies the workaround. Wish I knew who else depends on os.File.Sync :-( @ianlancetaylor, yes that was the report I meant. I hope you won't fault me for asking. |
Regarding the 1.12 release notes:
I'd suggest adding that File.Sync() prior to 1.12 is broken on MacOS, and may cause data loss/corruption. |
That would imply that calling I think the current release notes text is clear enough. |
I was looking at the release notes for this as well, wondering to myself: "negative performance impact" relative to ... the previous implementation of Sync? Not calling Sync but just relying on Close? I think it would be more clear if it mentioned that calling File.Sync is expected to block for a longer time in 1.12 than in 1.11 and earlier, because now Sync forces writes to disk rather than just OS buffers, and as a result Sync is now more resilient to crashes. "Negative performance impact" is a bit vague in my opinion. |
Change https://golang.org/cl/155038 mentions this issue: |
What do you think of https://golang.org/cl/155038? (I really think we are spending too much time on this. It's a bug fix.) |
LGTM @ianlancetaylor. Thanks for rewording it. |
True. Maybe "File.Sync() prior to 1.12 is ineffective on MacOS, and cannot prevent data loss/corruption on system crash/power-off." It's a bug fix with broad impact among database packages, including etcd bbolt, syndtr goleveldb, dgraph badger, and bleve. |
Thanks, I'm going with what I've got. |
Updates #26650 Change-Id: I0ec070127dcacc7fc68dd5baf125eb762e1ea846 Reviewed-on: https://go-review.googlesource.com/c/155038 Reviewed-by: Bryan C. Mills <bcmills@google.com>
I'm definitely late to the conversation, but I wanted to document my frustration with the decision of not backporting this fix, and how it's affecting https://github.com/dgraph-io/badger. We now need to replace all the calls to Once is very straight forward:
// +build !darwin go1.12
package y
import "os"
// FileSync calls os.File.Sync with the right parameters.
// This function can be removed once we stop supporting Go 1.11
// on MacOS.
func FileSync(f *os.File) error {return f.Sync()} The second one for now simply makes the corresponding syscall.
// +build darwin,!go1.12
package y
import (
"os"
"syscall"
)
func FileSync(f *os.File) error {
_, _, err := syscall.Syscall(syscall.SYS_FCNTL, f.Fd(), syscall.F_FULLFSYNC, 0)
if err == 0 {
return nil
}
return err
} Making this function do the right thing seems to require interacting with Could you, at least, provide a clear way of how to implement this function correctly? Thanks UPDATE: I already found a way my code was wrong, added the check to see whether errno is zero. |
@campoy I understand your frustration, but there was no good solution here. Making this change has led to multiple bug reports about significant performance changes, changes due to now using the correct behavior. It would have been inappropriate to make a change that led to multiple bug reports in a minor release. The code you show above looks correct and fine for a file on local disk. I'm not sure there is a fully correct to call fsync for an NFS file in Go 1.11 on Darwin. The |
Thanks for the confirmation, @ianlancetaylor! I think that just keeping my code in this issue should be enough to help others wondering how to fix it on their own codebases. The PR fixing this issue for badger is dgraph-io/badger#871. |
os.File.Sync() was inneffective on mac in go prior to 1.12. In 1.12 onwards it is effective, but slow: golang/go#26650 This change disables Sync for CAS uploads on mac, because we can (potentially) verify this later. Let's see if this helps performance on mac: buchgr#67 (while being less risky than buchgr#68).
os.File.Sync() was inneffective on mac in go prior to 1.12. In 1.12 onwards it is effective, but slow: golang/go#26650 This change disables Sync for CAS uploads on mac, because we can (potentially) verify this later. Let's see if this helps performance on mac: buchgr#67 (while being less risky than buchgr#68).
os.File.Sync() was inneffective on mac in go prior to 1.12. In 1.12 onwards it is effective, but slow: golang/go#26650 This change disables Sync for CAS uploads on mac, because we can (potentially) verify this later. Let's see if this helps performance on mac: buchgr#67 (while being less risky than buchgr#68).
What version of Go?
1.10 or tip
What operating system and processor architecture?
darwin, *
What is the matter?
According to this and other sites,
fcntl(fd, F_FULLFSYNC)
is necessary for proper fsync on MacOS (OS X): https://danluu.com/file-consistency/Github search finds no uses of "F_FULLFSYNC" in the src tree.
If there is a rationale for the present implementation, it should be documented in File.Sync() with directions on how to invoke the fcntl variant.
The text was updated successfully, but these errors were encountered: