Too Many Open Files


(Jakob Borg) #41

It should have been fs though. :expressionless:


#42

This has now completed, with no hits for that search term.

Thanks,

Pants.


#43

Hi Jakob:

I’ve been running the build you suggested (with STTRACE=filesystem) - and it’s not looking happy:

2017/12/13 09:51:19.475114 logfs.go:73: DEBUG: main.go:1135 basic /usr/local/syncthing/var/ MkdirAll . 448 <nil>
2017/12/13 09:51:19.475283 logfs.go:115: DEBUG: main.go:1140 basic /usr/local/syncthing/var/ Stat . {0xc4201d44e0} <nil>
[OVPYX] 2017/12/13 09:51:19.495248 main.go:642: INFO: syncthing v0.14.42-rc.1+9-g12deb37 "Dysprosium Dragonfly" (go1.9.2 linux-amd64) teamcity@build.syncthing.net 2017-12-12 10:54:35 UTC
[OVPYX] 2017/12/13 09:51:19.495332 main.go:643: INFO: My ID: OVPYXBH-******-******-******-******-******-******-******
[OVPYX] 2017/12/13 09:51:20.559941 sha256.go:92: INFO: Single thread SHA256 performance is 93 MB/s using minio/sha256-simd (68 MB/s using crypto/sha256).
[OVPYX] 2017/12/13 09:51:20.560202 verboseservice.go:48: VERBOSE: Starting up (/usr/local/syncthing/var)
[OVPYX] 2017/12/13 09:51:21.651980 main.go:700: INFO: Hashing performance with weak hash is 80.75 MB/s
[OVPYX] 2017/12/13 09:51:22.662685 main.go:702: INFO: Hashing performance without weak hash is 91.35 MB/s
[OVPYX] 2017/12/13 09:51:22.662734 main.go:708: INFO: Weak hash enabled, as it has an acceptable performance impact.
[OVPYX] 2017/12/13 09:52:47.443152 model.go:166: INFO: Starting deadlock detector with 20m0s timeout
panic: deadlock detected at fmut
2017/12/13 10:17:48.586153 logfs.go:73: DEBUG: main.go:1135 basic /usr/local/syncthing/var/ MkdirAll . 448 <nil>
2017/12/13 10:17:48.586621 logfs.go:115: DEBUG: main.go:1140 basic /usr/local/syncthing/var/ Stat . {0xc4201d65b0} <nil>
[OVPYX] 2017/12/13 10:17:48.607104 main.go:642: INFO: syncthing v0.14.42-rc.1+9-g12deb37 "Dysprosium Dragonfly" (go1.9.2 linux-amd64) teamcity@build.syncthing.net 2017-12-12 10:54:35 UTC
[OVPYX] 2017/12/13 10:17:48.607494 main.go:643: INFO: My ID: OVPYXBH-******-******-******-******-******-******-******
[OVPYX] 2017/12/13 10:17:49.673951 sha256.go:92: INFO: Single thread SHA256 performance is 93 MB/s using minio/sha256-simd (68 MB/s using crypto/sha256).
[OVPYX] 2017/12/13 10:17:49.674392 verboseservice.go:48: VERBOSE: Starting up (/usr/local/syncthing/var)
[OVPYX] 2017/12/13 10:17:50.764223 main.go:700: INFO: Hashing performance with weak hash is 80.82 MB/s
[OVPYX] 2017/12/13 10:17:51.775533 main.go:702: INFO: Hashing performance without weak hash is 91.14 MB/s
[OVPYX] 2017/12/13 10:17:51.775587 main.go:708: INFO: Weak hash enabled, as it has an acceptable performance impact.
[OVPYX] 2017/12/13 10:18:31.576336 model.go:166: INFO: Starting deadlock detector with 20m0s timeout
panic: deadlock detected at fmut
2017/12/13 10:43:32.897960 logfs.go:73: DEBUG: main.go:1135 basic /usr/local/syncthing/var/ MkdirAll . 448 <nil>
2017/12/13 10:43:32.898111 logfs.go:115: DEBUG: main.go:1140 basic /usr/local/syncthing/var/ Stat . {0xc42005f930} <nil>
[OVPYX] 2017/12/13 10:43:32.916996 main.go:642: INFO: syncthing v0.14.42-rc.1+9-g12deb37 "Dysprosium Dragonfly" (go1.9.2 linux-amd64) teamcity@build.syncthing.net 2017-12-12 10:54:35 UTC
etc

Thanks,

Pants.


(Jakob Borg) #44

Wat. What does the actual panic trace look like? Should be a file called panic-something in the config dir…

Maybe skip the trace stuff.


#45

Hi Jakob:

Sure - here’s a representative panic log:

2017/12/13 16:39:21.219351 logfs.go:73: DEBUG: main.go:1135 basic /usr/local/syncthing/var/ MkdirAll . 448 <nil>
2017/12/13 16:39:21.219507 logfs.go:115: DEBUG: main.go:1140 basic /usr/local/syncthing/var/ Stat . {0xc4201d04e0} <nil>
[OVPYX] 2017/12/13 16:39:21.238378 main.go:642: INFO: syncthing v0.14.42-rc.1+9-g12deb37 "Dysprosium Dragonfly" (go1.9.2 linux-amd64) teamcity@build.syncthing.net 2017-12-12 10:54:35 UTC
[OVPYX] 2017/12/13 16:39:21.238472 main.go:643: INFO: My ID: OVPYXBH-*******-*******-*******-*******-*******-*******-*******
[OVPYX] 2017/12/13 16:39:22.302158 sha256.go:92: INFO: Single thread SHA256 performance is 93 MB/s using minio/sha256-simd (68 MB/s using crypto/sha256).
[OVPYX] 2017/12/13 16:39:22.302440 verboseservice.go:48: VERBOSE: Starting up (/usr/local/syncthing/var)
[OVPYX] 2017/12/13 16:39:23.383159 main.go:700: INFO: Hashing performance with weak hash is 81.08 MB/s
[OVPYX] 2017/12/13 16:39:24.390276 main.go:702: INFO: Hashing performance without weak hash is 91.52 MB/s
[OVPYX] 2017/12/13 16:39:24.390333 main.go:708: INFO: Weak hash enabled, as it has an acceptable performance impact.
[OVPYX] 2017/12/13 16:39:44.410866 model.go:166: INFO: Starting deadlock detector with 20m0s timeout
...
Panic at 2017-12-13T17:04:44Z
panic: deadlock detected at fmut

goroutine 12 [running]:
github.com/syncthing/syncthing/lib/model.(*deadlockDetector).Watch.func1(0xc420176660, 0x7fbc8830a590, 0xc42027a0c0, 0xb902db, 0x4)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/model/util.go:57 +0x3b3
created by github.com/syncthing/syncthing/lib/model.(*deadlockDetector).Watch
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/model/util.go:33 +0xb7

goroutine 1 [runnable]:
sync.(*RWMutex).RUnlock(0xc42023e178)
	/usr/local/go/src/sync/rwmutex.go:62 +0x91
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/util.(*BufferPool).Put(0xc42023e0e0, 0xc4343c2800, 0x3ea, 0x3fd)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/util/buffer_pool.go:178 +0xf5
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/table.(*Reader).readRawBlock(0xc420072f70, 0x5f3ed, 0x3e5, 0xb5ee01, 0x1, 0xc4346e8a80, 0xc42024c3e8, 0x8aead8, 0xc425150180)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/table/reader.go:589 +0x611
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/table.(*Reader).readBlock(0xc420072f70, 0x5f3ed, 0x3e5, 0xc422808001, 0xc420cc77a0, 0x8b00bf, 0xc42024c3f8)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/table/reader.go:603 +0x5b
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/table.(*Reader).readBlockCached.func1(0xc4346e8aa0, 0xc42025e400, 0xc422eae6c0)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/table/reader.go:627 +0x51
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/cache.(*Cache).Get(0xc42025e400, 0x27eba6, 0x5f3ed, 0xc42024c4f8, 0x0)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/cache/cache.go:389 +0x1f6
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/cache.(*NamespaceGetter).Get(0xc4224b00a0, 0x5f3ed, 0xc42024c4f8, 0x10)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/cache/cache.go:58 +0x4b
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/table.(*Reader).readBlockCached(0xc420072f70, 0x5f3ed, 0x3e5, 0xc420240101, 0x42a4f9, 0xc400000008, 0xc4223c6080, 0xc42250b553, 0x8ba5fb)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/table/reader.go:625 +0xea
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/table.(*Reader).getDataIter(0xc420072f70, 0x5f3ed, 0x3e5, 0x0, 0x101, 0x741e, 0x3e5)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/table/reader.go:765 +0x5e
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/table.(*Reader).getDataIterErr(0xc420072f70, 0x5f3ed, 0x3e5, 0x0, 0x101, 0x0, 0x0)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/table/reader.go:780 +0x14d
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/table.(*indexIter).Get(0xc4200d42c0, 0xc4346e6690, 0xc42120bb00)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/table/reader.go:507 +0x2ed
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/iterator.(*indexedIterator).setData(0xc421e9a180)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/iterator/indexed_iter.go:39 +0x4e
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/iterator.(*indexedIterator).Next(0xc421e9a180, 0xcf2a9d9e7e745216)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/iterator/indexed_iter.go:160 +0xbd
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/iterator.(*indexedIterator).Next(0xc421e9b0e0, 0xcf2a9d9e7e745216)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/iterator/indexed_iter.go:149 +0x114
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/iterator.(*mergedIterator).Next(0xc420172a80, 0xc42a37fe60)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/iterator/merged_iter.go:169 +0xf1
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb.(*dbIter).Next(0xc42017bef0, 0x20)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/db_iter.go:240 +0xa3
github.com/syncthing/syncthing/lib/db.(*Instance).withAllFolderTruncated(0xc420cb3c20, 0xc4202737a0, 0xb, 0x10, 0xc42024c9c8)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/db/leveldb_dbinstance.go:175 +0x1d0
github.com/syncthing/syncthing/lib/db.NewFileSet(0xc420272c80, 0xb, 0x10d8a60, 0xc420177270, 0xc420cb3c20, 0xc420272c93)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/db/set.go:136 +0x2f1
github.com/syncthing/syncthing/lib/model.(*Model).addFolderLocked(0xc4201f0480, 0xc420272c80, 0xb, 0xc420272c90, 0x3, 0x0, 0xc420272c93, 0xd, 0x0, 0xc4202a6600, ...)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/model/model.go:323 +0x12c
github.com/syncthing/syncthing/lib/model.(*Model).AddFolder(0xc4201f0480, 0xc420272c80, 0xb, 0xc420272c90, 0x3, 0x0, 0xc420272c93, 0xd, 0x0, 0xc4202a6600, ...)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/model/model.go:316 +0xb2
main.syncthingMain(0x7fffa8329db6, 0x18, 0x0, 0x0, 0x0, 0x0, 0x7fffa8329d8b, 0x1b, 0x0, 0x0, ...)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/cmd/syncthing/main.go:792 +0x13b1
main.main()
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/cmd/syncthing/main.go:438 +0x3c9

goroutine 6 [chan receive, 25 minutes]:
github.com/syncthing/syncthing/vendor/github.com/zillode/notify.(*nonrecursiveTree).dispatch(0xc42000e360, 0xc42000e2a0)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/zillode/notify/tree_nonrecursive.go:36 +0xc0
created by github.com/syncthing/syncthing/vendor/github.com/zillode/notify.newNonrecursiveTree
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/zillode/notify/tree_nonrecursive.go:29 +0x149

goroutine 7 [chan receive, 25 minutes]:
github.com/syncthing/syncthing/vendor/github.com/zillode/notify.(*nonrecursiveTree).internal(0xc42000e360, 0xc42000e300)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/zillode/notify/tree_nonrecursive.go:81 +0x88
created by github.com/syncthing/syncthing/vendor/github.com/zillode/notify.newNonrecursiveTree
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/zillode/notify/tree_nonrecursive.go:30 +0x178

goroutine 49 [select]:
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/util.(*BufferPool).drain(0xc42023e0e0)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/util/buffer_pool.go:206 +0x172
created by github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/util.NewBufferPool
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/util/buffer_pool.go:237 +0x195

goroutine 9 [select, 25 minutes]:
github.com/syncthing/syncthing/vendor/github.com/AudriusButkevicius/kcp-go.(*updateHeap).updateTask(0x1135020)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/AudriusButkevicius/kcp-go/updater.go:81 +0x36f
created by github.com/syncthing/syncthing/vendor/github.com/AudriusButkevicius/kcp-go.init.2
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/AudriusButkevicius/kcp-go/updater.go:13 +0x74

goroutine 10 [syscall, 25 minutes]:
os/signal.signal_recv(0x0)
	/usr/local/go/src/runtime/sigqueue.go:131 +0xa6
os/signal.loop()
	/usr/local/go/src/os/signal/signal_unix.go:22 +0x22
created by os/signal.init.0
	/usr/local/go/src/os/signal/signal_unix.go:28 +0x41

goroutine 50 [select, 25 minutes]:
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb.(*DB).compactionError(0xc4201b7200)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/db_compaction.go:90 +0x119
created by github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb.openDB
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/db.go:139 +0x4bd

goroutine 17 [select, 25 minutes, locked to thread]:
runtime.gopark(0xd3ecf8, 0x0, 0xb92265, 0x6, 0x18, 0x1)
	/usr/local/go/src/runtime/proc.go:287 +0x12c
runtime.selectgo(0xc42003cf50, 0xc4201ce3c0)
	/usr/local/go/src/runtime/select.go:395 +0x1149
runtime.ensureSigM.func1()
	/usr/local/go/src/runtime/signal_unix.go:511 +0x220
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:2337 +0x1

goroutine 18 [chan receive, 25 minutes]:
main.setupSignalHandling.func1(0xc4201d8240)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/cmd/syncthing/main.go:937 +0x34
created by main.setupSignalHandling
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/cmd/syncthing/main.go:936 +0xbc

goroutine 19 [chan receive, 25 minutes]:
main.setupSignalHandling.func2(0xc4201d8300)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/cmd/syncthing/main.go:947 +0x34
created by main.setupSignalHandling
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/cmd/syncthing/main.go:946 +0x181

goroutine 20 [select, 25 minutes]:
github.com/syncthing/syncthing/vendor/github.com/thejerf/suture.(*Supervisor).Serve(0xc42023e000)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/thejerf/suture/supervisor.go:338 +0x304
created by github.com/syncthing/syncthing/vendor/github.com/thejerf/suture.(*Supervisor).ServeBackground
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/thejerf/suture/supervisor.go:297 +0x3f

goroutine 21 [select, 25 minutes]:
main.(*verboseService).Serve(0xc420176520)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/cmd/syncthing/verboseservice.go:44 +0x1a3
github.com/syncthing/syncthing/vendor/github.com/thejerf/suture.(*Supervisor).runService.func1(0xc42023e000, 0xc400000000, 0x10cdbe0, 0xc420176520)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/thejerf/suture/supervisor.go:454 +0x63
created by github.com/syncthing/syncthing/vendor/github.com/thejerf/suture.(*Supervisor).runService
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/thejerf/suture/supervisor.go:444 +0x5b

goroutine 22 [chan receive, 25 minutes]:
github.com/syncthing/syncthing/lib/events.(*bufferedSubscription).pollingLoop(0xc4201e0190)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/events/events.go:392 +0x8c
created by github.com/syncthing/syncthing/lib/events.NewBufferedSubscription
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/events/events.go:387 +0x19b

goroutine 33 [chan receive, 25 minutes]:
github.com/syncthing/syncthing/lib/events.(*bufferedSubscription).pollingLoop(0xc42028c000)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/events/events.go:392 +0x8c
created by github.com/syncthing/syncthing/lib/events.NewBufferedSubscription
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/events/events.go:387 +0x19b

goroutine 51 [select]:
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb.(*DB).mpoolDrain(0xc4201b7200)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/db_state.go:101 +0x13b
created by github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb.openDB
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/db.go:140 +0x4df

goroutine 52 [select, 25 minutes]:
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb.(*DB).tCompaction(0xc4201b7200)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/db_compaction.go:804 +0x2c8
created by github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb.openDB
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/db.go:146 +0x65e

goroutine 53 [select, 25 minutes]:
github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb.(*DB).mCompaction(0xc4201b7200)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/db_compaction.go:751 +0x192
created by github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb.openDB
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/vendor/github.com/syndtr/goleveldb/leveldb/db.go:147 +0x680

goroutine 13 [sleep]:
time.Sleep(0x45d964b800)
	/usr/local/go/src/runtime/time.go:65 +0x130
github.com/syncthing/syncthing/lib/model.(*deadlockDetector).Watch.func1(0xc420176660, 0x7fbc8830a590, 0xc42027a120, 0xb904ff, 0x4)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/model/util.go:35 +0x58
created by github.com/syncthing/syncthing/lib/model.(*deadlockDetector).Watch
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/model/util.go:33 +0xb7

goroutine 23 [semacquire, 20 minutes]:
sync.runtime_SemacquireMutex(0xc42027a0c4, 0x81c6f241ca14fe00)
	/usr/local/go/src/runtime/sema.go:71 +0x3d
sync.(*Mutex).Lock(0xc42027a0c0)
	/usr/local/go/src/sync/mutex.go:134 +0xee
sync.(*RWMutex).Lock(0xc42027a0c0)
	/usr/local/go/src/sync/rwmutex.go:93 +0x2d
github.com/syncthing/syncthing/lib/model.(*deadlockDetector).Watch.func1.1(0x7fbc8830a590, 0xc42027a0c0, 0xc4206244d0)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/model/util.go:39 +0x31
created by github.com/syncthing/syncthing/lib/model.(*deadlockDetector).Watch.func1
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/model/util.go:38 +0xb5

goroutine 424 [sleep]:
time.Sleep(0x1176592e000)
	/usr/local/go/src/runtime/time.go:65 +0x130
github.com/syncthing/syncthing/lib/model.(*deadlockDetector).Watch.func1.2(0xc420176660, 0xc422f460e0)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/model/util.go:46 +0x2e
created by github.com/syncthing/syncthing/lib/model.(*deadlockDetector).Watch.func1
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/model/util.go:45 +0xe7

goroutine 26 [sleep, 15 minutes]:
time.Sleep(0x1176592e000)
	/usr/local/go/src/runtime/time.go:65 +0x130
github.com/syncthing/syncthing/lib/model.(*deadlockDetector).Watch.func1.2(0xc420176660, 0xc422f46070)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/model/util.go:46 +0x2e
created by github.com/syncthing/syncthing/lib/model.(*deadlockDetector).Watch.func1
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/model/util.go:45 +0xe7

goroutine 187 [sleep, 10 minutes]:
time.Sleep(0x1176592e000)
	/usr/local/go/src/runtime/time.go:65 +0x130
github.com/syncthing/syncthing/lib/model.(*deadlockDetector).Watch.func1.2(0xc420176660, 0xc420624540)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/model/util.go:46 +0x2e
created by github.com/syncthing/syncthing/lib/model.(*deadlockDetector).Watch.func1
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/model/util.go:45 +0xe7

goroutine 30 [sleep, 5 minutes]:
time.Sleep(0x1176592e000)
	/usr/local/go/src/runtime/time.go:65 +0x130
github.com/syncthing/syncthing/lib/model.(*deadlockDetector).Watch.func1.2(0xc420176660, 0xc4206245b0)
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/model/util.go:46 +0x2e
created by github.com/syncthing/syncthing/lib/model.(*deadlockDetector).Watch.func1
	/tmp/syncthing-e02d0de3/src/github.com/syncthing/syncthing/lib/model/util.go:45 +0xe7

I’ll try it without the trace, and see how that fares.

Thanks,

Pants.


(Jakob Borg) #46

I think that deadlock detection is a false positive because database operations are taking unusually long, because the system is slow. Ignore. (Deadlock detection is run when tracing, as a debug measure.)


(Audrius Butkevicius) #47

I think deadlock detection runs on all non-release builds, and has to be explicitly disabled.


(Jakob Borg) #48

Erh, yeah, so it is.

@Moisie: export STDEADLOCKTIMEOUT=7200

Also, help text is out of sync with reality on this point. I’ll correct…


#49

Hello both:

Thanks again! Am now running with the increased deadlock timeout - will report back…

Pants.


#50

Hello again:

Unfortunately the system panicked again under the suggested build, reporting too many open files.

Here’s the panic log: https://pastebin.com/Vn66wxk4

I’ll re-run it again with the fs/filesystem trace on, and report back.

Thanks,

Pants.


(Jakob Borg) #51

Yeah, I have no idea. I think the interesting thing could be a full trace (from start to panic) with STTRACE=model, I don’t think the fs trace will give much but maybe Audrius has a plan there.


(Audrius Butkevicius) #52

Well the fs will show if file closes are happening and what if any fs calls returns errors.


#53

Hello again:

Ok - I’ve found some relevant log records - but I’m not sure what they show.

Here’s an open file record from /proc/{PID}/fd whilst there were 2800+ files open:

lrwx------+ 1 syncthing users 64 Dec 17 22:11 1001 -> /volume1/NAS/Computer 1.sparsebundle/bands/.syncthing.455fc.tmp

Searching for log entries relating to this file, I have found the following:

[OVPYX] 2017/12/17 16:40:13.774399 logfs.go:79: DEBUG: <autogenerated>:1 basic /volume1/NAS/ Open Computer 1.sparsebundle/bands/.syncthing.455fc.tmp <nil> open /volume1/NAS/Computer 1.sparsebundle/bands/.syncthing.455fc.tmp: no such file or directory
[OVPYX] 2017/12/17 16:40:13.774578 verboseservice.go:48: VERBOSE: Started syncing "bvkmk-*****" / "Computer 1.sparsebundle/bands/455fc" (update file)
[OVPYX] 2017/12/17 16:40:13.912749 logfs.go:115: DEBUG: <autogenerated>:1 basic /volume1/NAS/ Stat Computer 1.sparsebundle/bands {0xc4256eb1e0} <nil>
[OVPYX] 2017/12/17 16:40:13.916280 logfs.go:85: DEBUG: <autogenerated>:1 basic /volume1/NAS/ OpenFile Computer 1.sparsebundle/bands/.syncthing.455fc.tmp 194 438 {0xc4421f0640 Computer 1.sparsebundle/bands/.syncthing.455fc.tmp} <nil>
[OVPYX] 2017/12/17 16:40:13.916331 logfs.go:145: DEBUG: <autogenerated>:1 basic /volume1/NAS/ Hide Computer 1.sparsebundle/bands/.syncthing.455fc.tmp <nil>
[OVPYX] 2017/12/17 16:40:13.926180 logfs.go:79: DEBUG: <autogenerated>:1 basic /volume1/NAS/ Open Computer 1.sparsebundle/bands/455fc <nil> open /volume1/NAS/Computer 1.sparsebundle/bands/455fc: no such file or directory

I’ve searched through for a couple of files that are hanging open, and so far this pattern appears to be repeated.

Is that helpful at all?

Best wishes,

Pants.


#54

Hello:

Sorry to pester - but any thoughts on this?!

Thanks,

Pants.


(Jakob Borg) #55

This is what I need to be able to gain any insight on what’s happening here. The log extract above tells me nothing.


#56

Hi Jakob:

Thanks for this. I’m happy (in principle) to provide this - apart from three factors:

  • There are confidential file names in the log file, so I can’t publish this publicly;
  • The log file I’ve got is about 200GB in size;
  • I don’t yet have a matching list of open files to correspond with a full trace. I presume you would ideally need this to sift through the mass of data?

If there’s a way to send you the log file privately, and you’re happy to digest it without a list of open files, I can send it to you.

If not, please let me know what would be useful for you within these confines!

Many thanks,

Pants.


(Jakob Borg) #57

So clearly 200 GB is excessive and impractical, and in the end I’ll need just a representative kilobyte of it or so… I think from the model trace it should be clear what errors are happening, and from there we should be able to conclude why the files are not closed. So there should be no need to correlate with the list of closed files.

I’ll give you an account where you can SCP/SFTP the file, and we can start with an initial 100 MB or so of that log - from startup until it’s accumulated some data. The next 199.9 GB until the crash should just be more of same. But it’s important that it is from startup and unmodified; I’ll make sure the data never leaves my system/eyes.


#58

Hi Jakob:

Thanks - I really appreciate that. I’ll send it over once I receive the details from you.

Best wishes,

Pants.


(Jakob Borg) #59

In private message now. :+1:


#60

Hi Jakob:

Sorry for the delay - I’d forgotten to extend the deadlock timeout for a while…

However, I have now uploaded a trace file, together with a list of currently-open file descriptors, which currently numbers around 1400.

Many thanks for looking at this - and the compliments of the season to you and all who have worked on syncthing!

Pants.