SelectorRunner uses 100% CPU

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

SelectorRunner uses 100% CPU

Marc Arens
Moin,

we are seeing system outages at one deployment where one repeating point of the
outages is a

"Grizzly(1) SelectorRunner" daemon prio=10 tid=0x00007fec31068800 nid=0x23e6
runnable [0x00007fec6e3aa000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000002c9c4e4a8> (a sun.nio.ch.Util$2)
        - locked <0x00000002ca5e08e0> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000002c9bdc060> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at
org.glassfish.grizzly.nio.DefaultSelectorHandler.select(DefaultSelectorHandler.java:112)
        at org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:325)
        at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:264)
        at
org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:551)
        at
org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:531)
        at java.lang.Thread.run(Thread.java:745)

which keeps a CPU busy at 100% for at least 15 seconds and probably even longer
but those 15s is all our debug data covers at the moment.

This is a Grizzly 2.2.x on java version "1.7.0_79", OpenJDK Runtime Environment
(rhel-2.5.5.3.el6_6-x86_64 u79-b14), OpenJDK 64-Bit Server VM (build 24.79-b02,
mixed mode).

I found a 2.3.x related bug at https://java.net/jira/browse/GRIZZLY-1712 could
it be this one we are hitting although it's 2.2.x? If yes, would it be possible
to backport it to 2.2.x? Do you know about any other issues we might be hitting
here?

Best regards
Marc

--
...it was evening and it was morning and there were already two ways to store
Unicode...
Reply | Threaded
Open this post in threaded view
|

Re: SelectorRunner uses 100% CPU

Ryan Lubke-2
It's hard to say if you're hitting the issue or not based on the provided information.

We can try back-porting the fix as well as changes to force selector spin detection.

How often is this occurring?

Thanks,
-rl

September 22, 2015 at 10:51
Moin,

we are seeing system outages at one deployment where one repeating point of the
outages is a

"Grizzly(1) SelectorRunner" daemon prio=10 tid=0x00007fec31068800 nid=0x23e6
runnable [0x00007fec6e3aa000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
- locked <0x00000002c9c4e4a8> (a sun.nio.ch.Util$2)
- locked <0x00000002ca5e08e0> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000002c9bdc060> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
at
org.glassfish.grizzly.nio.DefaultSelectorHandler.select(DefaultSelectorHandler.java:112)
at org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:325)
at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:264)
at
org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:551)
at
org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:531)
at java.lang.Thread.run(Thread.java:745)

which keeps a CPU busy at 100% for at least 15 seconds and probably even longer
but those 15s is all our debug data covers at the moment.

This is a Grizzly 2.2.x on java version "1.7.0_79", OpenJDK Runtime Environment
(rhel-2.5.5.3.el6_6-x86_64 u79-b14), OpenJDK 64-Bit Server VM (build 24.79-b02,
mixed mode).

I found a 2.3.x related bug at https://java.net/jira/browse/GRIZZLY-1712 could
it be this one we are hitting although it's 2.2.x? If yes, would it be possible
to backport it to 2.2.x? Do you know about any other issues we might be hitting
here?

Best regards
Marc

--
...it was evening and it was morning and there were already two ways to store
Unicode...

Reply | Threaded
Open this post in threaded view
|

Re: SelectorRunner uses 100% CPU

Marc Arens
Hey Ryan,

thanks for the fast reply. Complete system outage(OOM not able to create new
threads) happens ~15 minutes after server start in the worst case. But i can't
even say when exactly the SelectorRunner starts using so much CPU within that
time frame. I'd be very thankful for backports of those fixes!

> On 22 September 2015 at 18:21 Ryan Lubke <[hidden email]> wrote:
>
>
> It's hard to say if you're hitting the issue or not based on the
> provided information.
>
> We can try back-porting the fix as well as changes to force selector
> spin detection.
>
> How often is this occurring?
>
> Thanks,
> -rl
>
> > Marc Arens <mailto:[hidden email]>
> > September 22, 2015 at 10:51
> > Moin,
> >
> > we are seeing system outages at one deployment where one repeating
> > point of the
> > outages is a
> >
> > "Grizzly(1) SelectorRunner" daemon prio=10 tid=0x00007fec31068800
> > nid=0x23e6
> > runnable [0x00007fec6e3aa000]
> > java.lang.Thread.State: RUNNABLE
> > at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> > at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
> > at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
> > at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> > - locked <0x00000002c9c4e4a8> (a sun.nio.ch.Util$2)
> > - locked <0x00000002ca5e08e0> (a java.util.Collections$UnmodifiableSet)
> > - locked <0x00000002c9bdc060> (a sun.nio.ch.EPollSelectorImpl)
> > at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> > at
> > org.glassfish.grizzly.nio.DefaultSelectorHandler.select(DefaultSelectorHandler.java:112)
> > at
> > org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:325)
> > at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:264)
> > at
> > org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:551)
> > at
> > org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:531)
> > at java.lang.Thread.run(Thread.java:745)
> >
> > which keeps a CPU busy at 100% for at least 15 seconds and probably
> > even longer
> > but those 15s is all our debug data covers at the moment.
> >
> > This is a Grizzly 2.2.x on java version "1.7.0_79", OpenJDK Runtime
> > Environment
> > (rhel-2.5.5.3.el6_6-x86_64 u79-b14), OpenJDK 64-Bit Server VM (build
> > 24.79-b02,
> > mixed mode).
> >
> > I found a 2.3.x related bug at
> > https://java.net/jira/browse/GRIZZLY-1712 could
> > it be this one we are hitting although it's 2.2.x? If yes, would it be
> > possible
> > to backport it to 2.2.x? Do you know about any other issues we might
> > be hitting
> > here?
> >
> > Best regards
> > Marc
> >
> > --
> > ...it was evening and it was morning and there were already two ways
> > to store
> > Unicode...
>
Reply | Threaded
Open this post in threaded view
|

Re: SelectorRunner uses 100% CPU

oleksiys
Administrator
Hi Marc,

I just backported [1] and [2] to 2.2.x branch and published
2.2.23-SNAPSHOT, can you pls try it?
You can enable a system property as mentioned in [2] and see if it helps
to prevent 100% CPU consumption, but it could be also a consequence of OOM.
Once you hit OOM, could you take a memory dump, so we can investigate if
it's a leak, or we just hit machine limit.

Thanks.

WBR,
Alexey.

[1] https://java.net/jira/browse/GRIZZLY-1712
[2] https://java.net/jira/browse/GRIZZLY-1801



On 22.09.15 09:58, Marc Arens wrote:

> Hey Ryan,
>
> thanks for the fast reply. Complete system outage(OOM not able to create new
> threads) happens ~15 minutes after server start in the worst case. But i can't
> even say when exactly the SelectorRunner starts using so much CPU within that
> time frame. I'd be very thankful for backports of those fixes!
>
>> On 22 September 2015 at 18:21 Ryan Lubke <[hidden email]> wrote:
>>
>>
>> It's hard to say if you're hitting the issue or not based on the
>> provided information.
>>
>> We can try back-porting the fix as well as changes to force selector
>> spin detection.
>>
>> How often is this occurring?
>>
>> Thanks,
>> -rl
>>
>>> Marc Arens <mailto:[hidden email]>
>>> September 22, 2015 at 10:51
>>> Moin,
>>>
>>> we are seeing system outages at one deployment where one repeating
>>> point of the
>>> outages is a
>>>
>>> "Grizzly(1) SelectorRunner" daemon prio=10 tid=0x00007fec31068800
>>> nid=0x23e6
>>> runnable [0x00007fec6e3aa000]
>>> java.lang.Thread.State: RUNNABLE
>>> at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>>> at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>>> at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
>>> at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
>>> - locked <0x00000002c9c4e4a8> (a sun.nio.ch.Util$2)
>>> - locked <0x00000002ca5e08e0> (a java.util.Collections$UnmodifiableSet)
>>> - locked <0x00000002c9bdc060> (a sun.nio.ch.EPollSelectorImpl)
>>> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
>>> at
>>> org.glassfish.grizzly.nio.DefaultSelectorHandler.select(DefaultSelectorHandler.java:112)
>>> at
>>> org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:325)
>>> at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:264)
>>> at
>>> org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:551)
>>> at
>>> org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:531)
>>> at java.lang.Thread.run(Thread.java:745)
>>>
>>> which keeps a CPU busy at 100% for at least 15 seconds and probably
>>> even longer
>>> but those 15s is all our debug data covers at the moment.
>>>
>>> This is a Grizzly 2.2.x on java version "1.7.0_79", OpenJDK Runtime
>>> Environment
>>> (rhel-2.5.5.3.el6_6-x86_64 u79-b14), OpenJDK 64-Bit Server VM (build
>>> 24.79-b02,
>>> mixed mode).
>>>
>>> I found a 2.3.x related bug at
>>> https://java.net/jira/browse/GRIZZLY-1712 could
>>> it be this one we are hitting although it's 2.2.x? If yes, would it be
>>> possible
>>> to backport it to 2.2.x? Do you know about any other issues we might
>>> be hitting
>>> here?
>>>
>>> Best regards
>>> Marc
>>>
>>> --
>>> ...it was evening and it was morning and there were already two ways
>>> to store
>>> Unicode...

Reply | Threaded
Open this post in threaded view
|

Re: SelectorRunner uses 100% CPU

Marc Arens
Hello Alexey,

thanks a lot for the backport. 1801 doesn't look like it applies here because
the used jvm is 1.7.0_79, still useful for other scenarios though.
We are still collecting debug infos ourself to analyze what exactly is going
wrong. Your Fix will probably help us to narrow it down some more. We can't
share memory dumps until we managed to reproduce it in a testing environment
without any private data.
 

> On 23 September 2015 at 02:38 Oleksiy Stashok <[hidden email]>
> wrote:
>
>
> Hi Marc,
>
> I just backported [1] and [2] to 2.2.x branch and published
> 2.2.23-SNAPSHOT, can you pls try it?
> You can enable a system property as mentioned in [2] and see if it helps
> to prevent 100% CPU consumption, but it could be also a consequence of OOM.
> Once you hit OOM, could you take a memory dump, so we can investigate if
> it's a leak, or we just hit machine limit.
>
> Thanks.
>
> WBR,
> Alexey.
>
> [1] https://java.net/jira/browse/GRIZZLY-1712
> [2] https://java.net/jira/browse/GRIZZLY-1801
>
>
>
> On 22.09.15 09:58, Marc Arens wrote:
> > Hey Ryan,
> >
> > thanks for the fast reply. Complete system outage(OOM not able to create new
> > threads) happens ~15 minutes after server start in the worst case. But i
> > can't
> > even say when exactly the SelectorRunner starts using so much CPU within
> > that
> > time frame. I'd be very thankful for backports of those fixes!
> >
> >> On 22 September 2015 at 18:21 Ryan Lubke <[hidden email]> wrote:
> >>
> >>
> >> It's hard to say if you're hitting the issue or not based on the
> >> provided information.
> >>
> >> We can try back-porting the fix as well as changes to force selector
> >> spin detection.
> >>
> >> How often is this occurring?
> >>
> >> Thanks,
> >> -rl
> >>
> >>> Marc Arens <mailto:[hidden email]>
> >>> September 22, 2015 at 10:51
> >>> Moin,
> >>>
> >>> we are seeing system outages at one deployment where one repeating
> >>> point of the
> >>> outages is a
> >>>
> >>> "Grizzly(1) SelectorRunner" daemon prio=10 tid=0x00007fec31068800
> >>> nid=0x23e6
> >>> runnable [0x00007fec6e3aa000]
> >>> java.lang.Thread.State: RUNNABLE
> >>> at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> >>> at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
> >>> at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
> >>> at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> >>> - locked <0x00000002c9c4e4a8> (a sun.nio.ch.Util$2)
> >>> - locked <0x00000002ca5e08e0> (a java.util.Collections$UnmodifiableSet)
> >>> - locked <0x00000002c9bdc060> (a sun.nio.ch.EPollSelectorImpl)
> >>> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> >>> at
> >>> org.glassfish.grizzly.nio.DefaultSelectorHandler.select(DefaultSelectorHandler.java:112)
> >>> at
> >>> org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:325)
> >>> at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:264)
> >>> at
> >>> org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:551)
> >>> at
> >>> org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:531)
> >>> at java.lang.Thread.run(Thread.java:745)
> >>>
> >>> which keeps a CPU busy at 100% for at least 15 seconds and probably
> >>> even longer
> >>> but those 15s is all our debug data covers at the moment.
> >>>
> >>> This is a Grizzly 2.2.x on java version "1.7.0_79", OpenJDK Runtime
> >>> Environment
> >>> (rhel-2.5.5.3.el6_6-x86_64 u79-b14), OpenJDK 64-Bit Server VM (build
> >>> 24.79-b02,
> >>> mixed mode).
> >>>
> >>> I found a 2.3.x related bug at
> >>> https://java.net/jira/browse/GRIZZLY-1712 could
> >>> it be this one we are hitting although it's 2.2.x? If yes, would it be
> >>> possible
> >>> to backport it to 2.2.x? Do you know about any other issues we might
> >>> be hitting
> >>> here?
> >>>
> >>> Best regards
> >>> Marc
> >>>
> >>> --
> >>> ...it was evening and it was morning and there were already two ways
> >>> to store
> >>> Unicode...
>
Reply | Threaded
Open this post in threaded view
|

Re: SelectorRunner uses 100% CPU

oleksiys
Administrator
Hi Marc,


On 23.09.15 02:13, Marc Arens wrote:
> We are still collecting debug infos ourself to analyze what exactly is going
> wrong. Your Fix will probably help us to narrow it down some more. We can't
> share memory dumps until we managed to reproduce it in a testing environment
> without any private data.
I recently found that JProfiler can put together nice GC roots summary
based on java memory dump, maybe other profilers also support that
(unfortunately NB/VisualVM doesn't support this feature). So instead of
sending the heap dump itself you can send the report.

Thanks.

WBR,
Alexey.

>> On 23 September 2015 at 02:38 Oleksiy Stashok <[hidden email]>
>> wrote:
>>
>>
>> Hi Marc,
>>
>> I just backported [1] and [2] to 2.2.x branch and published
>> 2.2.23-SNAPSHOT, can you pls try it?
>> You can enable a system property as mentioned in [2] and see if it helps
>> to prevent 100% CPU consumption, but it could be also a consequence of OOM.
>> Once you hit OOM, could you take a memory dump, so we can investigate if
>> it's a leak, or we just hit machine limit.
>>
>> Thanks.
>>
>> WBR,
>> Alexey.
>>
>> [1] https://java.net/jira/browse/GRIZZLY-1712
>> [2] https://java.net/jira/browse/GRIZZLY-1801
>>
>>
>>
>> On 22.09.15 09:58, Marc Arens wrote:
>>> Hey Ryan,
>>>
>>> thanks for the fast reply. Complete system outage(OOM not able to create new
>>> threads) happens ~15 minutes after server start in the worst case. But i
>>> can't
>>> even say when exactly the SelectorRunner starts using so much CPU within
>>> that
>>> time frame. I'd be very thankful for backports of those fixes!
>>>
>>>> On 22 September 2015 at 18:21 Ryan Lubke <[hidden email]> wrote:
>>>>
>>>>
>>>> It's hard to say if you're hitting the issue or not based on the
>>>> provided information.
>>>>
>>>> We can try back-porting the fix as well as changes to force selector
>>>> spin detection.
>>>>
>>>> How often is this occurring?
>>>>
>>>> Thanks,
>>>> -rl
>>>>
>>>>> Marc Arens <mailto:[hidden email]>
>>>>> September 22, 2015 at 10:51
>>>>> Moin,
>>>>>
>>>>> we are seeing system outages at one deployment where one repeating
>>>>> point of the
>>>>> outages is a
>>>>>
>>>>> "Grizzly(1) SelectorRunner" daemon prio=10 tid=0x00007fec31068800
>>>>> nid=0x23e6
>>>>> runnable [0x00007fec6e3aa000]
>>>>> java.lang.Thread.State: RUNNABLE
>>>>> at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>>>>> at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>>>>> at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
>>>>> at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
>>>>> - locked <0x00000002c9c4e4a8> (a sun.nio.ch.Util$2)
>>>>> - locked <0x00000002ca5e08e0> (a java.util.Collections$UnmodifiableSet)
>>>>> - locked <0x00000002c9bdc060> (a sun.nio.ch.EPollSelectorImpl)
>>>>> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
>>>>> at
>>>>> org.glassfish.grizzly.nio.DefaultSelectorHandler.select(DefaultSelectorHandler.java:112)
>>>>> at
>>>>> org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:325)
>>>>> at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:264)
>>>>> at
>>>>> org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:551)
>>>>> at
>>>>> org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:531)
>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>>
>>>>> which keeps a CPU busy at 100% for at least 15 seconds and probably
>>>>> even longer
>>>>> but those 15s is all our debug data covers at the moment.
>>>>>
>>>>> This is a Grizzly 2.2.x on java version "1.7.0_79", OpenJDK Runtime
>>>>> Environment
>>>>> (rhel-2.5.5.3.el6_6-x86_64 u79-b14), OpenJDK 64-Bit Server VM (build
>>>>> 24.79-b02,
>>>>> mixed mode).
>>>>>
>>>>> I found a 2.3.x related bug at
>>>>> https://java.net/jira/browse/GRIZZLY-1712 could
>>>>> it be this one we are hitting although it's 2.2.x? If yes, would it be
>>>>> possible
>>>>> to backport it to 2.2.x? Do you know about any other issues we might
>>>>> be hitting
>>>>> here?
>>>>>
>>>>> Best regards
>>>>> Marc
>>>>>
>>>>> --
>>>>> ...it was evening and it was morning and there were already two ways
>>>>> to store
>>>>> Unicode...

Reply | Threaded
Open this post in threaded view
|

Re: SelectorRunner uses 100% CPU

Marc Arens
Just a little feedback from my side: "... but it could be also a consequence of
OOM..." seems to have been the case. After several days we finally got the
confirmation that the admins didn't manage to set proper ulimits for the
user/server process which led to the OOM. As Grizzly performs pretty solid in
many many other deployments i was already wondering what strange edge case we
have been hitting here.

Thanks again for your help.

> On 28 September 2015 at 07:12 Oleksiy Stashok <[hidden email]>
> wrote:
>
>
> Hi Marc,
>
>
> On 23.09.15 02:13, Marc Arens wrote:
> > We are still collecting debug infos ourself to analyze what exactly is going
> > wrong. Your Fix will probably help us to narrow it down some more. We can't
> > share memory dumps until we managed to reproduce it in a testing environment
> > without any private data.
> I recently found that JProfiler can put together nice GC roots summary
> based on java memory dump, maybe other profilers also support that
> (unfortunately NB/VisualVM doesn't support this feature). So instead of
> sending the heap dump itself you can send the report.
>
> Thanks.
>
> WBR,
> Alexey.
>
> >> On 23 September 2015 at 02:38 Oleksiy Stashok <[hidden email]>
> >> wrote:
> >>
> >>
> >> Hi Marc,
> >>
> >> I just backported [1] and [2] to 2.2.x branch and published
> >> 2.2.23-SNAPSHOT, can you pls try it?
> >> You can enable a system property as mentioned in [2] and see if it helps
> >> to prevent 100% CPU consumption, but it could be also a consequence of OOM.
> >> Once you hit OOM, could you take a memory dump, so we can investigate if
> >> it's a leak, or we just hit machine limit.
> >>
> >> Thanks.
> >>
> >> WBR,
> >> Alexey.
> >>
> >> [1] https://java.net/jira/browse/GRIZZLY-1712
> >> [2] https://java.net/jira/browse/GRIZZLY-1801
> >>
> >>
> >>
> >> On 22.09.15 09:58, Marc Arens wrote:
> >>> Hey Ryan,
> >>>
> >>> thanks for the fast reply. Complete system outage(OOM not able to create
> >>> new
> >>> threads) happens ~15 minutes after server start in the worst case. But i
> >>> can't
> >>> even say when exactly the SelectorRunner starts using so much CPU within
> >>> that
> >>> time frame. I'd be very thankful for backports of those fixes!
> >>>
> >>>> On 22 September 2015 at 18:21 Ryan Lubke <[hidden email]> wrote:
> >>>>
> >>>>
> >>>> It's hard to say if you're hitting the issue or not based on the
> >>>> provided information.
> >>>>
> >>>> We can try back-porting the fix as well as changes to force selector
> >>>> spin detection.
> >>>>
> >>>> How often is this occurring?
> >>>>
> >>>> Thanks,
> >>>> -rl
> >>>>
> >>>>> Marc Arens <mailto:[hidden email]>
> >>>>> September 22, 2015 at 10:51
> >>>>> Moin,
> >>>>>
> >>>>> we are seeing system outages at one deployment where one repeating
> >>>>> point of the
> >>>>> outages is a
> >>>>>
> >>>>> "Grizzly(1) SelectorRunner" daemon prio=10 tid=0x00007fec31068800
> >>>>> nid=0x23e6
> >>>>> runnable [0x00007fec6e3aa000]
> >>>>> java.lang.Thread.State: RUNNABLE
> >>>>> at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> >>>>> at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
> >>>>> at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
> >>>>> at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> >>>>> - locked <0x00000002c9c4e4a8> (a sun.nio.ch.Util$2)
> >>>>> - locked <0x00000002ca5e08e0> (a java.util.Collections$UnmodifiableSet)
> >>>>> - locked <0x00000002c9bdc060> (a sun.nio.ch.EPollSelectorImpl)
> >>>>> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> >>>>> at
> >>>>> org.glassfish.grizzly.nio.DefaultSelectorHandler.select(DefaultSelectorHandler.java:112)
> >>>>> at
> >>>>> org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:325)
> >>>>> at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:264)
> >>>>> at
> >>>>> org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:551)
> >>>>> at
> >>>>> org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:531)
> >>>>> at java.lang.Thread.run(Thread.java:745)
> >>>>>
> >>>>> which keeps a CPU busy at 100% for at least 15 seconds and probably
> >>>>> even longer
> >>>>> but those 15s is all our debug data covers at the moment.
> >>>>>
> >>>>> This is a Grizzly 2.2.x on java version "1.7.0_79", OpenJDK Runtime
> >>>>> Environment
> >>>>> (rhel-2.5.5.3.el6_6-x86_64 u79-b14), OpenJDK 64-Bit Server VM (build
> >>>>> 24.79-b02,
> >>>>> mixed mode).
> >>>>>
> >>>>> I found a 2.3.x related bug at
> >>>>> https://java.net/jira/browse/GRIZZLY-1712 could
> >>>>> it be this one we are hitting although it's 2.2.x? If yes, would it be
> >>>>> possible
> >>>>> to backport it to 2.2.x? Do you know about any other issues we might
> >>>>> be hitting
> >>>>> here?
> >>>>>
> >>>>> Best regards
> >>>>> Marc
> >>>>>
> >>>>> --
> >>>>> ...it was evening and it was morning and there were already two ways
> >>>>> to store
> >>>>> Unicode...
>