IRCaBot 2.1.0
GPLv3 © acetone, 2021-2022
#ls2
/2021/12/16
zzz zlatinb, would really appreciate it if you could allocate some time to review the pumper MR
zzz not sure it's solid enough yet for the next release but if we don't start working on it soon it won't make it in
zlatinb zzz: commented on the MR. There is an architectural flaw I'm afraid - all interest changes must happen on the pumper thread
zlatinb so one needs to introduce a queue with interest operation changes and wakeup() the selector after enqueing
zzz hmm. I looked at what jetty does and this is similar
zzz take a look at the locking before you declare it hopeless
zlatinb this is not hopeless per se because we have a time limit to how long we wait in select(..) but still as is this patch will add latency in the case where writes need to be done with interest ops
zlatinb you can write a small test program with two threads - one blocked in select() w/o a timeout, and the other trying to change interest ops
zlatinb actually let me double-check..
zlatinb yes, SELECTOR_LOOP_DELAY = 200
zlatinb so it may not be visible with naked eye
zlatinb I used the exact same design in one of my HFT jobs where microseconds mattered
zlatinb since they don't and the selector is guaranteed to wake up within 200ms it's probably not fatal
zzz zlatinb, the interest ops lock is on the individual SelectionKey, not on the selector; the write lock per-connection, not on the selector
zzz there's no possibility for a 200ms block
zzz like I said, take your time to review the diff
zlatinb I did and added other comments too
zlatinb but the 200ms wait is real
zzz where's the lock? line 101?
zlatinb yes, which is an abstract method in SelectorImpl implemented in the subclasses
zzz still no sign of a lock
zlatinb EPollSelectorImpl.java
zzz so that holds updateLock which is also held in processUpdateQueue() called from doSelect()
zzz processUpdateQueue() is called at line 113; the timeout wait is at line 120
zzz so the lock's not held in the wait
zzz it would be very bad design if it did hold a lock in the wait()
zlatinb it was at least in earlier JDKs. Let me write a quick program
zzz how early? java 4 or 5? those were known to be bad
zlatinb around that early yes
zzz this jdk 11 code we're looking at is pretty clear
zlatinb it's not clear to me the interest ops will be changed until the next iteration of doSelect
zzz sure but most writes are happening directly, not in the selector thread
zzz if they do get thrown onto the selector, it will be woken up as usual
zzz 99.9% of writes bypass the selector
zlatinb that depends, maybe not in a bulk transfer. That will need to be tested in a testnet
zzz sure, not claiming it's perfect yet. just that it's been through several live net testing / bug fix iterations already to get to this pint
zlatinb ok, wrote a small program - the changing of the interestOps does not block the changing thread, but it does not take effect until the next iteration of select
zlatinb waiting for pastebins to load
zzz thats fine. if the write doesn't complete, it will get thrown onto the selector which will get woken up, just like before
zlatinb ok. In the code I did for work I would set a state that the channel is full and any further writes to the channel would need to go through the selector. I don't see such state kept in this MR
zzz there's no state. every call to NTCPCon.write() will try to write directly, and then if it fails, throw it to the selector
zlatinb hmm that might be a problem because there is no guarantee that data will be written in order
zzz true, my statement above was a simplification
zzz NTCPCon.write() -> pumper.processWrite() tries to clear the write buf queue
zlatinb yes but there is no guarantee that the pumper thread will grab the cpu before the writing thread writes to a writeable socketchannel
zzz that's why there's now a separate write lock
zzz all writes happen in pumper.processWrite(), whether via the selector thread or another thread
zlatinb that's not sufficient, consider this sequence:
zlatinb 1. writer thread writes to socketchannel, fills it up, registers for interest with selector, releases all locks goes to sleep
zlatinb 2. some time passes, socketchannel becomes writeable again, wakes up selector thread
zlatinb 3. before the selector thread gets the cpu, writer thread comes back, acquires all locks and writes more
zlatinb = data out of order
zzz no, because everything to be written gets put in the write buf queue (no matter what thread), and then pumper.processWrite() takes it off the queue
zzz in your sequence above, if there's still data in the write buf queue after 3), then 4) the selector will write the rest of it (or not)
zzz so when I say a thread 'writes directly', it is in the thread, but it's first putting the data at the tail of the write buf queue, then writing everything it can starting at the head of the write buf queue
zzz that keeps it all in-order
zlatinb ok that looks fine then
zlatinb I'll run it through the testnet and see how it behaves
zlatinb not tonight though
zzz great, glad to finally get your eyeballs on it