Read/write concurrency questions

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Read/write concurrency questions

D. J. Hagberg (Sun)
So I think I have my asynch. write setup working correctly once I
started trying to ensure that only a single task was able to do SSL
writes to a single SelectionKey at any one point in time.  I've stepped
through the SelectorHandler/ProtocolChain/SSLReadFilter code enough now
that I'm convinced it's only possible for it to do a single read on a
SelectionKey at any one point in time, too.

But now I'm puzzling over a few places in the Grizzly code that are
starting to look strange to me.  The main one is this:

In the main loop on Controller.doSelect is this chunk:

     if (key.isValid()) {
         if ((key.readyOps() & SelectionKey.OP_ACCEPT)
                 == SelectionKey.OP_ACCEPT){
             // ... accept-handling stuff
         } else if ((key.readyOps() & SelectionKey.OP_READ)
                 == SelectionKey.OP_READ) {
             delegateToWorkerThread = selectorHandler.
                     onReadInterest(key,serverCtx);
         } else if ((key.readyOps() & SelectionKey.OP_WRITE)
                 == SelectionKey.OP_WRITE) {
             delegateToWorkerThread = selectorHandler.
                     onWriteInterest(key,serverCtx);
         } else if ((key.readyOps() & SelectionKey.OP_CONNECT)
                 == SelectionKey.OP_CONNECT) {
             delegateToWorkerThread = selectorHandler.
                     onConnectInterest(key,serverCtx);
         }

         if (delegateToWorkerThread){
             Context context = pollContext(key);
             context.setProtocol(selectorHandler.protocol());
             context.setPipeline(selectorHandler.pipeline());
             context.execute();
         }
     } else {
         selectionKeyHandler.cancel(key);
     }

It looks to me like the (probably unlikely) case where a SelectionKey
has *both* a read-ready and a write-ready state is not handled here.  It
looks like OP_READ will *always* win and SelectorHandler.onWriteInterest
will not be called until 1000ms later or the next time the Selector is
being woken up and, only then, if there is not an outstanding OP_READ on
the socket.

I noticed this because when I put my application under significant load
I started building up a significant queue of outstanding write's.

Is this a correct analysis?

Thanks,

                        -=- D. J.

PS, For details, I implemented my async write handling by having my own
class that overrides TCPSelectorHandler with its own:

     public boolean onWriteInterest(SelectionKey key, Context ctx) {
         // thread-safe call to set up a latch that prevents calls to
         // SelectorHandler.register(key,OP_WRITE) until write task
         // has completed.
         myTaskHandler.flagWriteReady(key);

         // disable OP_WRITE on key before submitting to task queue
         key.interestOps(key.interestOps() & (~SelectionKey.OP_WRITE));

         // submit write task to its own Pipeline
         myTaskHandler.submitWriteTask(key);
     }

Then in any background tasks I have that post messages to the write
queue, I:

- make a synchronized call into the task handler that sets a
"needsWrite" flag.  That call will ONLY be successful if there is
currently a task waiting to start (based on the call to flagWriteReady
above) or if a write task is currently running.

- if the call to set needsWrite was unsuccessful, it makes a call to
SelectorHandler.register(key, OP_WRITE) to make sure the Selector is
woken up and tries to look for OP_WRITE on the key.

- an async write task runs in its own pipeline that does
message-to-bytes encoding and then bytes-to-SSL encoding.  At the end of
that task in a finally block, it does an atomic check/reset of the
needsWrite flag and, if so, calls back to SelectorHandler.register(key,
OP_WRITE) to start the process over again.

I haven't yet been able to streamline the logic more than this.  It
seems a bit convoluted, but does test correctly now.  The unfortunate
side-effect is that it occasionally means that an async write task is
created and started even though the last one actually emptied the write
queue...  It's a "safe" operation, but adds unnecessary overhead.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Read/write concurrency questions

Jeanfrancois Arcand-2
Hi,

D. J. Hagberg (Sun) wrote:

> So I think I have my asynch. write setup working correctly once I
> started trying to ensure that only a single task was able to do SSL
> writes to a single SelectionKey at any one point in time.  I've stepped
> through the SelectorHandler/ProtocolChain/SSLReadFilter code enough now
> that I'm convinced it's only possible for it to do a single read on a
> SelectionKey at any one point in time, too.
>
> But now I'm puzzling over a few places in the Grizzly code that are
> starting to look strange to me.  The main one is this:
>
> In the main loop on Controller.doSelect is this chunk:
>
>     if (key.isValid()) {
>         if ((key.readyOps() & SelectionKey.OP_ACCEPT)
>                 == SelectionKey.OP_ACCEPT){
>             // ... accept-handling stuff
>         } else if ((key.readyOps() & SelectionKey.OP_READ)
>                 == SelectionKey.OP_READ) {
>             delegateToWorkerThread = selectorHandler.
>                     onReadInterest(key,serverCtx);
>         } else if ((key.readyOps() & SelectionKey.OP_WRITE)
>                 == SelectionKey.OP_WRITE) {
>             delegateToWorkerThread = selectorHandler.
>                     onWriteInterest(key,serverCtx);
>         } else if ((key.readyOps() & SelectionKey.OP_CONNECT)
>                 == SelectionKey.OP_CONNECT) {
>             delegateToWorkerThread = selectorHandler.
>                     onConnectInterest(key,serverCtx);
>         }
>
>         if (delegateToWorkerThread){
>             Context context = pollContext(key);
>             context.setProtocol(selectorHandler.protocol());
>             context.setPipeline(selectorHandler.pipeline());
>             context.execute();
>         }
>     } else {
>         selectionKeyHandler.cancel(key);
>     }
>
> It looks to me like the (probably unlikely) case where a SelectionKey
> has *both* a read-ready and a write-ready state is not handled here.  It
> looks like OP_READ will *always* win and SelectorHandler.onWriteInterest
> will not be called until 1000ms later or the next time the Selector is
> being woken up and, only then, if there is not an outstanding OP_READ on
> the socket.
>
> I noticed this because when I put my application under significant load
> I started building up a significant queue of outstanding write's.
>
> Is this a correct analysis?

I'm not sure it is a bug as if you spawn two threads (let say there is
no if/else in the code above), it means the OP_READ and OP_WRITE might
be concurrently proceeded. Can it cause problems? I've no object to
change the Controller code, but I just want to make sure it will not
cause more problems.

Thanks

-- Jeanfrancois


>
> Thanks,
>
>             -=- D. J.
>
> PS, For details, I implemented my async write handling by having my own
> class that overrides TCPSelectorHandler with its own:
>
>     public boolean onWriteInterest(SelectionKey key, Context ctx) {
>         // thread-safe call to set up a latch that prevents calls to
>         // SelectorHandler.register(key,OP_WRITE) until write task
>         // has completed.
>         myTaskHandler.flagWriteReady(key);
>
>         // disable OP_WRITE on key before submitting to task queue
>         key.interestOps(key.interestOps() & (~SelectionKey.OP_WRITE));
>
>         // submit write task to its own Pipeline
>         myTaskHandler.submitWriteTask(key);
>     }
>
> Then in any background tasks I have that post messages to the write
> queue, I:
>
> - make a synchronized call into the task handler that sets a
> "needsWrite" flag.  That call will ONLY be successful if there is
> currently a task waiting to start (based on the call to flagWriteReady
> above) or if a write task is currently running.
>
> - if the call to set needsWrite was unsuccessful, it makes a call to
> SelectorHandler.register(key, OP_WRITE) to make sure the Selector is
> woken up and tries to look for OP_WRITE on the key.
>
> - an async write task runs in its own pipeline that does
> message-to-bytes encoding and then bytes-to-SSL encoding.  At the end of
> that task in a finally block, it does an atomic check/reset of the
> needsWrite flag and, if so, calls back to SelectorHandler.register(key,
> OP_WRITE) to start the process over again.
>
> I haven't yet been able to streamline the logic more than this.  It
> seems a bit convoluted, but does test correctly now.  The unfortunate
> side-effect is that it occasionally means that an async write task is
> created and started even though the last one actually emptied the write
> queue...  It's a "safe" operation, but adds unnecessary overhead.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Read/write concurrency questions

charlie hunt-3
Jeanfrancois Arcand wrote:

> Hi,
>
> D. J. Hagberg (Sun) wrote:
>> So I think I have my asynch. write setup working correctly once I
>> started trying to ensure that only a single task was able to do SSL
>> writes to a single SelectionKey at any one point in time.  I've
>> stepped through the SelectorHandler/ProtocolChain/SSLReadFilter code
>> enough now that I'm convinced it's only possible for it to do a
>> single read on a SelectionKey at any one point in time, too.
>>
>> But now I'm puzzling over a few places in the Grizzly code that are
>> starting to look strange to me.  The main one is this:
>>
>> In the main loop on Controller.doSelect is this chunk:
>>
>>     if (key.isValid()) {
>>         if ((key.readyOps() & SelectionKey.OP_ACCEPT)
>>                 == SelectionKey.OP_ACCEPT){
>>             // ... accept-handling stuff
>>         } else if ((key.readyOps() & SelectionKey.OP_READ)
>>                 == SelectionKey.OP_READ) {
>>             delegateToWorkerThread = selectorHandler.
>>                     onReadInterest(key,serverCtx);
>>         } else if ((key.readyOps() & SelectionKey.OP_WRITE)
>>                 == SelectionKey.OP_WRITE) {
>>             delegateToWorkerThread = selectorHandler.
>>                     onWriteInterest(key,serverCtx);
>>         } else if ((key.readyOps() & SelectionKey.OP_CONNECT)
>>                 == SelectionKey.OP_CONNECT) {
>>             delegateToWorkerThread = selectorHandler.
>>                     onConnectInterest(key,serverCtx);
>>         }
>>
>>         if (delegateToWorkerThread){
>>             Context context = pollContext(key);
>>             context.setProtocol(selectorHandler.protocol());
>>             context.setPipeline(selectorHandler.pipeline());
>>             context.execute();
>>         }
>>     } else {
>>         selectionKeyHandler.cancel(key);
>>     }
>>
>> It looks to me like the (probably unlikely) case where a SelectionKey
>> has *both* a read-ready and a write-ready state is not handled here.  
>> It looks like OP_READ will *always* win and
>> SelectorHandler.onWriteInterest will not be called until 1000ms later
>> or the next time the Selector is being woken up and, only then, if
>> there is not an outstanding OP_READ on the socket.
>>
>> I noticed this because when I put my application under significant
>> load I started building up a significant queue of outstanding write's.
>>
>> Is this a correct analysis?
>
> I'm not sure it is a bug as if you spawn two threads (let say there is
> no if/else in the code above), it means the OP_READ and OP_WRITE might
> be concurrently proceeded. Can it cause problems? I've no object to
> change the Controller code, but I just want to make sure it will not
> cause more problems.

If we consider the (basic) two configurations of handle OP events,
handled by a worker thread or by a selector thread ....

If you have a no if / else when delegating to a worker thread for read /
write OP events, the multiple OP events would be handled (almost)
concurrently since the OP_READ will be delegated to a worker thread
immediately and within a few cpu instructions the OP_WRITE will be
delegated to a different worker thread.  This likely means that you may
be reading and writing to the same SocketChannel at the same time.  For
non-SSL communications this is ok.  But, I don't know if that will
present any issues when using SSL.  It shouldn't, but I thought it is
worth asking.

Btw, the delegating OP_READ and OP_WRITE to worker threads is the
default configuration.

If the configuration is such that OP_READ / OP_WRITE are not delegated,
having a no if / else means that the OP_READ will finish first and then
the OP_WRITE will execute.  Although I can't think of a situation where
one would run such a configuration, it is possible to configure.

At any rate, with the current if / else structure, the OP_WRITE will
either have to wait for the next cycle of Selector.select() which will
fall through right away since OP_WRITE hasn't been disabled.   But, with
just an if structure the OP_WRITE would be executed without having to
incur the additional Selector.select() cycle.  From a performance
perspective, if there's no additional OP events on that next
Selector.select() cycle, we could be spending some precious system /
kernel cpu time in that Selector.select() that could be avoided with the
non if / else structure.

If you make the change to using the non if / else structure, you might
wanna do a quick Grizzly / Faban performance test, but I think more
importantly think about any potential issues of using that structure
given the alternative "delegate to worker thread" versus "execute in the
Selector thread" configuration for each of the OP events.

At the moment, I'm thinking the non if / else structure may be a more
"fair" approach and may be slightly better performing.  But, I'm also
thinking I might have overlooked something in the "delegate to worker
thread" versus "execute in the Selector thread" configuration using that
structure.

charlie ...

>
> Thanks
>
> -- Jeanfrancois
>
>
>>
>> Thanks,
>>
>>             -=- D. J.
>>
>> PS, For details, I implemented my async write handling by having my
>> own class that overrides TCPSelectorHandler with its own:
>>
>>     public boolean onWriteInterest(SelectionKey key, Context ctx) {
>>         // thread-safe call to set up a latch that prevents calls to
>>         // SelectorHandler.register(key,OP_WRITE) until write task
>>         // has completed.
>>         myTaskHandler.flagWriteReady(key);
>>
>>         // disable OP_WRITE on key before submitting to task queue
>>         key.interestOps(key.interestOps() & (~SelectionKey.OP_WRITE));
>>
>>         // submit write task to its own Pipeline
>>         myTaskHandler.submitWriteTask(key);
>>     }
>>
>> Then in any background tasks I have that post messages to the write
>> queue, I:
>>
>> - make a synchronized call into the task handler that sets a
>> "needsWrite" flag.  That call will ONLY be successful if there is
>> currently a task waiting to start (based on the call to
>> flagWriteReady above) or if a write task is currently running.
>>
>> - if the call to set needsWrite was unsuccessful, it makes a call to
>> SelectorHandler.register(key, OP_WRITE) to make sure the Selector is
>> woken up and tries to look for OP_WRITE on the key.
>>
>> - an async write task runs in its own pipeline that does
>> message-to-bytes encoding and then bytes-to-SSL encoding.  At the end
>> of that task in a finally block, it does an atomic check/reset of the
>> needsWrite flag and, if so, calls back to
>> SelectorHandler.register(key, OP_WRITE) to start the process over again.
>>
>> I haven't yet been able to streamline the logic more than this.  It
>> seems a bit convoluted, but does test correctly now.  The unfortunate
>> side-effect is that it occasionally means that an async write task is
>> created and started even though the last one actually emptied the
>> write queue...  It's a "safe" operation, but adds unnecessary overhead.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [hidden email]
>> For additional commands, e-mail: [hidden email]
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>


--

Charlie Hunt
Java Performance Engineer
630.285.7708 x47708 (Internal)

<http://java.sun.com/docs/performance/>

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Read/write concurrency questions

Jeanfrancois Arcand-2

Salut,

charlie hunt wrote:

> Jeanfrancois Arcand wrote:
>> Hi,
>>
>> D. J. Hagberg (Sun) wrote:
>>> So I think I have my asynch. write setup working correctly once I
>>> started trying to ensure that only a single task was able to do SSL
>>> writes to a single SelectionKey at any one point in time.  I've
>>> stepped through the SelectorHandler/ProtocolChain/SSLReadFilter code
>>> enough now that I'm convinced it's only possible for it to do a
>>> single read on a SelectionKey at any one point in time, too.
>>>
>>> But now I'm puzzling over a few places in the Grizzly code that are
>>> starting to look strange to me.  The main one is this:
>>>
>>> In the main loop on Controller.doSelect is this chunk:
>>>
>>>     if (key.isValid()) {
>>>         if ((key.readyOps() & SelectionKey.OP_ACCEPT)
>>>                 == SelectionKey.OP_ACCEPT){
>>>             // ... accept-handling stuff
>>>         } else if ((key.readyOps() & SelectionKey.OP_READ)
>>>                 == SelectionKey.OP_READ) {
>>>             delegateToWorkerThread = selectorHandler.
>>>                     onReadInterest(key,serverCtx);
>>>         } else if ((key.readyOps() & SelectionKey.OP_WRITE)
>>>                 == SelectionKey.OP_WRITE) {
>>>             delegateToWorkerThread = selectorHandler.
>>>                     onWriteInterest(key,serverCtx);
>>>         } else if ((key.readyOps() & SelectionKey.OP_CONNECT)
>>>                 == SelectionKey.OP_CONNECT) {
>>>             delegateToWorkerThread = selectorHandler.
>>>                     onConnectInterest(key,serverCtx);
>>>         }
>>>
>>>         if (delegateToWorkerThread){
>>>             Context context = pollContext(key);
>>>             context.setProtocol(selectorHandler.protocol());
>>>             context.setPipeline(selectorHandler.pipeline());
>>>             context.execute();
>>>         }
>>>     } else {
>>>         selectionKeyHandler.cancel(key);
>>>     }
>>>
>>> It looks to me like the (probably unlikely) case where a SelectionKey
>>> has *both* a read-ready and a write-ready state is not handled here.  
>>> It looks like OP_READ will *always* win and
>>> SelectorHandler.onWriteInterest will not be called until 1000ms later
>>> or the next time the Selector is being woken up and, only then, if
>>> there is not an outstanding OP_READ on the socket.
>>>
>>> I noticed this because when I put my application under significant
>>> load I started building up a significant queue of outstanding write's.
>>>
>>> Is this a correct analysis?
>>
>> I'm not sure it is a bug as if you spawn two threads (let say there is
>> no if/else in the code above), it means the OP_READ and OP_WRITE might
>> be concurrently proceeded. Can it cause problems? I've no object to
>> change the Controller code, but I just want to make sure it will not
>> cause more problems.
>
> If we consider the (basic) two configurations of handle OP events,
> handled by a worker thread or by a selector thread ....
>
> If you have a no if / else when delegating to a worker thread for read /
> write OP events, the multiple OP events would be handled (almost)
> concurrently since the OP_READ will be delegated to a worker thread
> immediately and within a few cpu instructions the OP_WRITE will be
> delegated to a different worker thread.  This likely means that you may
> be reading and writing to the same SocketChannel at the same time.  For
> non-SSL communications this is ok.  But, I don't know if that will
> present any issues when using SSL.  It shouldn't, but I thought it is
> worth asking.

I need to double check but I suspect the SSLEngine is not thread safe
(but I might be wrong here).


>
> Btw, the delegating OP_READ and OP_WRITE to worker threads is the
> default configuration.
>
> If the configuration is such that OP_READ / OP_WRITE are not delegated,
> having a no if / else means that the OP_READ will finish first and then
> the OP_WRITE will execute.  Although I can't think of a situation where
> one would run such a configuration, it is possible to configure.

Right but I would not recommend doing the OP_READ/OP_WRITE on the same
thread as the Selector.select() as a ProtocolFilter can block for an
operation, and the select() will be delayed.

>
> At any rate, with the current if / else structure, the OP_WRITE will
> either have to wait for the next cycle of Selector.select() which will
> fall through right away since OP_WRITE hasn't been disabled.   But, with
> just an if structure the OP_WRITE would be executed without having to
> incur the additional Selector.select() cycle.  From a performance
> perspective, if there's no additional OP events on that next
> Selector.select() cycle, we could be spending some precious system /
> kernel cpu time in that Selector.select() that could be avoided with the
> non if / else structure.

Make sense. If we change the current logic, we need a way to support the
old behavior as some application might rely on the current logic.


>
> If you make the change to using the non if / else structure, you might
> wanna do a quick Grizzly / Faban performance test, but I think more
> importantly think about any potential issues of using that structure
> given the alternative "delegate to worker thread" versus "execute in the
> Selector thread" configuration for each of the OP events.

The Grizzly Http Server isn't using the OP_WRITE. But the SIP
Container/SailFin uses it so I will make sure we don't break them. I
suspect it may improve their performance.

>
> At the moment, I'm thinking the non if / else structure may be a more
> "fair" approach and may be slightly better performing.  But, I'm also
> thinking I might have overlooked something in the "delegate to worker
> thread" versus "execute in the Selector thread" configuration using that
> structure.

I think you are right, but it will make the development more complex as
you will need to be aware that you can have simultaneous OP_READ/WRITE
using the same SelectionKey.

A+

-- Jeanfrancois


>
> charlie ...
>
>>
>> Thanks
>>
>> -- Jeanfrancois
>>
>>
>>>
>>> Thanks,
>>>
>>>             -=- D. J.
>>>
>>> PS, For details, I implemented my async write handling by having my
>>> own class that overrides TCPSelectorHandler with its own:
>>>
>>>     public boolean onWriteInterest(SelectionKey key, Context ctx) {
>>>         // thread-safe call to set up a latch that prevents calls to
>>>         // SelectorHandler.register(key,OP_WRITE) until write task
>>>         // has completed.
>>>         myTaskHandler.flagWriteReady(key);
>>>
>>>         // disable OP_WRITE on key before submitting to task queue
>>>         key.interestOps(key.interestOps() & (~SelectionKey.OP_WRITE));
>>>
>>>         // submit write task to its own Pipeline
>>>         myTaskHandler.submitWriteTask(key);
>>>     }
>>>
>>> Then in any background tasks I have that post messages to the write
>>> queue, I:
>>>
>>> - make a synchronized call into the task handler that sets a
>>> "needsWrite" flag.  That call will ONLY be successful if there is
>>> currently a task waiting to start (based on the call to
>>> flagWriteReady above) or if a write task is currently running.
>>>
>>> - if the call to set needsWrite was unsuccessful, it makes a call to
>>> SelectorHandler.register(key, OP_WRITE) to make sure the Selector is
>>> woken up and tries to look for OP_WRITE on the key.
>>>
>>> - an async write task runs in its own pipeline that does
>>> message-to-bytes encoding and then bytes-to-SSL encoding.  At the end
>>> of that task in a finally block, it does an atomic check/reset of the
>>> needsWrite flag and, if so, calls back to
>>> SelectorHandler.register(key, OP_WRITE) to start the process over again.
>>>
>>> I haven't yet been able to streamline the logic more than this.  It
>>> seems a bit convoluted, but does test correctly now.  The unfortunate
>>> side-effect is that it occasionally means that an async write task is
>>> created and started even though the last one actually emptied the
>>> write queue...  It's a "safe" operation, but adds unnecessary overhead.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [hidden email]
>>> For additional commands, e-mail: [hidden email]
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [hidden email]
>> For additional commands, e-mail: [hidden email]
>>
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Read/write concurrency questions

D. J. Hagberg (Sun)
Jeanfrancois Arcand wrote:

> charlie hunt wrote:
>> If you have a no if / else when delegating to a worker thread for read
>> / write OP events, the multiple OP events would be handled (almost)
>> concurrently since the OP_READ will be delegated to a worker thread
>> immediately and within a few cpu instructions the OP_WRITE will be
>> delegated to a different worker thread.  This likely means that you
>> may be reading and writing to the same SocketChannel at the same
>> time.  For non-SSL communications this is ok.  But, I don't know if
>> that will present any issues when using SSL.  It shouldn't, but I
>> thought it is worth asking.
>
> I need to double check but I suspect the SSLEngine is not thread safe
> (but I might be wrong here).

The SSLEngine documentation includes the following comment under
Concurrency Notes in the 1.5 javadocs:

   -------
   1. The wrap() and unwrap() methods may execute concurrently of
   each other."

   2. The SSL/TLS protocols employ ordered packets. Applications must
   take care to ensure that generated packets are delivered in sequence.
   If packets arrive out-of-order, unexpected or fatal results may occur.

   For example:

                 synchronized (outboundLock) {
                     sslEngine.wrap(src, dst);
                     outboundQueue.put(dst);
                 }


   As a corollary, two threads must not attempt to call the same method
   (either wrap() or unwrap()) concurrently, because there is no way to
   guarantee the eventual packet ordering.
   -------

So my reading is that it's fine to have OP_WRITE threads executing
concurrently with OP_READ, even using SSLEngine.  The current
ProtocolChain code prevents 2 simultaneous SSLEngine.unwrap calls, by
its careful disabling and re-enabling OP_READ on the SelectionKey.

It appears that one just has to be as careful with OP_WRITE handling, or
even more careful in the case where external/asynchronous events in
other threads initiate a WRITE...

After running a long SSL load test last night, I'm not convinced of the
correct operation of my code :-(, still getting a "bad record MAC" after
running for 6-8 hrs.  More digging needed...

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Read/write concurrency questions

charlie hunt-3
In reply to this post by Jeanfrancois Arcand-2
Couple comments embedded below.

charlie ...

Jeanfrancois Arcand wrote:

> Salut,
>
> charlie hunt wrote:
>> Jeanfrancois Arcand wrote:
>>> Hi,
>>>
>>> D. J. Hagberg (Sun) wrote:
>>>> So I think I have my asynch. write setup working correctly once I
>>>> started trying to ensure that only a single task was able to do SSL
>>>> writes to a single SelectionKey at any one point in time.  I've
>>>> stepped through the SelectorHandler/ProtocolChain/SSLReadFilter
>>>> code enough now that I'm convinced it's only possible for it to do
>>>> a single read on a SelectionKey at any one point in time, too.
>>>>
>>>> But now I'm puzzling over a few places in the Grizzly code that are
>>>> starting to look strange to me.  The main one is this:
>>>>
>>>> In the main loop on Controller.doSelect is this chunk:
>>>>
>>>>     if (key.isValid()) {
>>>>         if ((key.readyOps() & SelectionKey.OP_ACCEPT)
>>>>                 == SelectionKey.OP_ACCEPT){
>>>>             // ... accept-handling stuff
>>>>         } else if ((key.readyOps() & SelectionKey.OP_READ)
>>>>                 == SelectionKey.OP_READ) {
>>>>             delegateToWorkerThread = selectorHandler.
>>>>                     onReadInterest(key,serverCtx);
>>>>         } else if ((key.readyOps() & SelectionKey.OP_WRITE)
>>>>                 == SelectionKey.OP_WRITE) {
>>>>             delegateToWorkerThread = selectorHandler.
>>>>                     onWriteInterest(key,serverCtx);
>>>>         } else if ((key.readyOps() & SelectionKey.OP_CONNECT)
>>>>                 == SelectionKey.OP_CONNECT) {
>>>>             delegateToWorkerThread = selectorHandler.
>>>>                     onConnectInterest(key,serverCtx);
>>>>         }
>>>>
>>>>         if (delegateToWorkerThread){
>>>>             Context context = pollContext(key);
>>>>             context.setProtocol(selectorHandler.protocol());
>>>>             context.setPipeline(selectorHandler.pipeline());
>>>>             context.execute();
>>>>         }
>>>>     } else {
>>>>         selectionKeyHandler.cancel(key);
>>>>     }
>>>>
>>>> It looks to me like the (probably unlikely) case where a
>>>> SelectionKey has *both* a read-ready and a write-ready state is not
>>>> handled here.  It looks like OP_READ will *always* win and
>>>> SelectorHandler.onWriteInterest will not be called until 1000ms
>>>> later or the next time the Selector is being woken up and, only
>>>> then, if there is not an outstanding OP_READ on the socket.
>>>>
>>>> I noticed this because when I put my application under significant
>>>> load I started building up a significant queue of outstanding write's.
>>>>
>>>> Is this a correct analysis?
>>>
>>> I'm not sure it is a bug as if you spawn two threads (let say there
>>> is no if/else in the code above), it means the OP_READ and OP_WRITE
>>> might be concurrently proceeded. Can it cause problems? I've no
>>> object to change the Controller code, but I just want to make sure
>>> it will not cause more problems.
>>
>> If we consider the (basic) two configurations of handle OP events,
>> handled by a worker thread or by a selector thread ....
>>
>> If you have a no if / else when delegating to a worker thread for
>> read / write OP events, the multiple OP events would be handled
>> (almost) concurrently since the OP_READ will be delegated to a worker
>> thread immediately and within a few cpu instructions the OP_WRITE
>> will be delegated to a different worker thread.  This likely means
>> that you may be reading and writing to the same SocketChannel at the
>> same time.  For non-SSL communications this is ok.  But, I don't know
>> if that will present any issues when using SSL.  It shouldn't, but I
>> thought it is worth asking.
>
> I need to double check but I suspect the SSLEngine is not thread safe
> (but I might be wrong here).
>
I kinda had thought that might be an issue too.   It's certainly needs
investigation.

>
>>
>> Btw, the delegating OP_READ and OP_WRITE to worker threads is the
>> default configuration.
>>
>> If the configuration is such that OP_READ / OP_WRITE are not
>> delegated, having a no if / else means that the OP_READ will finish
>> first and then the OP_WRITE will execute.  Although I can't think of
>> a situation where one would run such a configuration, it is possible
>> to configure.
>
> Right but I would not recommend doing the OP_READ/OP_WRITE on the same
> thread as the Selector.select() as a ProtocolFilter can block for an
> operation, and the select() will be delayed.
I completely agree.

I wouldn't recommend it either :-)   I brought it up since it is a
possible configuration.

>
>>
>> At any rate, with the current if / else structure, the OP_WRITE will
>> either have to wait for the next cycle of Selector.select() which
>> will fall through right away since OP_WRITE hasn't been disabled.  
>> But, with just an if structure the OP_WRITE would be executed without
>> having to incur the additional Selector.select() cycle.  From a
>> performance perspective, if there's no additional OP events on that
>> next Selector.select() cycle, we could be spending some precious
>> system / kernel cpu time in that Selector.select() that could be
>> avoided with the non if / else structure.
>
> Make sense. If we change the current logic, we need a way to support
> the old behavior as some application might rely on the current logic.
I think the logic we have to be careful about in the current logic it
that the handling of a SelectionKey who's OP_READ & OP_WRITE are both
set will likely delay the handling of an OP_WRITE a little longer than
with the changed logic.   This is of course assuming the "delegate to
worker thread" configuration.

That kinda indicates to me that with the changed logic we may be more
likely to run into the scenario of the same SocketChannel performing
reads and writes simultaneously.

Fwiw, I know we do this in the current implementation of GlassFish IIOP
/ ORB.   But, that does not include SSL or UDP communications.   The
GlassFish IIOP / ORB SSL communications are currently done over
java.net.Socket(s).

>
>
>>
>> If you make the change to using the non if / else structure, you
>> might wanna do a quick Grizzly / Faban performance test, but I think
>> more importantly think about any potential issues of using that
>> structure given the alternative "delegate to worker thread" versus
>> "execute in the Selector thread" configuration for each of the OP
>> events.
>
> The Grizzly Http Server isn't using the OP_WRITE. But the SIP
> Container/SailFin uses it so I will make sure we don't break them. I
> suspect it may improve their performance.
You're right ... if grizzly http server isn't using OP_WRITE, then
you'll see no difference in performance with the different / new logic.

>
>>
>> At the moment, I'm thinking the non if / else structure may be a more
>> "fair" approach and may be slightly better performing.  But, I'm also
>> thinking I might have overlooked something in the "delegate to worker
>> thread" versus "execute in the Selector thread" configuration using
>> that structure.
>
> I think you are right, but it will make the development more complex
> as you will need to be aware that you can have simultaneous
> OP_READ/WRITE using the same SelectionKey.
This should be ok with TCP SocketChannels.   I don't know about UDP or SSL ?

I haven't done enough with either to confirm yes or no.

charlie ...

>
> A+
>
> -- Jeanfrancois
>
>
>>
>> charlie ...
>>
>>>
>>> Thanks
>>>
>>> -- Jeanfrancois
>>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>>             -=- D. J.
>>>>
>>>> PS, For details, I implemented my async write handling by having my
>>>> own class that overrides TCPSelectorHandler with its own:
>>>>
>>>>     public boolean onWriteInterest(SelectionKey key, Context ctx) {
>>>>         // thread-safe call to set up a latch that prevents calls to
>>>>         // SelectorHandler.register(key,OP_WRITE) until write task
>>>>         // has completed.
>>>>         myTaskHandler.flagWriteReady(key);
>>>>
>>>>         // disable OP_WRITE on key before submitting to task queue
>>>>         key.interestOps(key.interestOps() & (~SelectionKey.OP_WRITE));
>>>>
>>>>         // submit write task to its own Pipeline
>>>>         myTaskHandler.submitWriteTask(key);
>>>>     }
>>>>
>>>> Then in any background tasks I have that post messages to the write
>>>> queue, I:
>>>>
>>>> - make a synchronized call into the task handler that sets a
>>>> "needsWrite" flag.  That call will ONLY be successful if there is
>>>> currently a task waiting to start (based on the call to
>>>> flagWriteReady above) or if a write task is currently running.
>>>>
>>>> - if the call to set needsWrite was unsuccessful, it makes a call
>>>> to SelectorHandler.register(key, OP_WRITE) to make sure the
>>>> Selector is woken up and tries to look for OP_WRITE on the key.
>>>>
>>>> - an async write task runs in its own pipeline that does
>>>> message-to-bytes encoding and then bytes-to-SSL encoding.  At the
>>>> end of that task in a finally block, it does an atomic check/reset
>>>> of the needsWrite flag and, if so, calls back to
>>>> SelectorHandler.register(key, OP_WRITE) to start the process over
>>>> again.
>>>>
>>>> I haven't yet been able to streamline the logic more than this.  It
>>>> seems a bit convoluted, but does test correctly now.  The
>>>> unfortunate side-effect is that it occasionally means that an async
>>>> write task is created and started even though the last one actually
>>>> emptied the write queue...  It's a "safe" operation, but adds
>>>> unnecessary overhead.
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: [hidden email]
>>>> For additional commands, e-mail: [hidden email]
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [hidden email]
>>> For additional commands, e-mail: [hidden email]
>>>
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
--

Charlie Hunt
Java Performance Engineer
630.285.7708 x47708 (Internal)

<http://java.sun.com/docs/performance/>


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

charlie.hunt.vcf (167 bytes) Download Attachment