During our upgrade from Elasticsearch 7.6.1 (oss) to Elasticsearch 7.8.0 (oss) we encountered this error in the Elasticsearch logs:
[2020-08-20T07:00:12.095] ERROR tech.beshu.ror.es.services.EsIndexJsonContentService [scala-execution-context-global-22] [spicedpassion] Cannot get source of document [.readonlyrest ID=1]
java.lang.NullPointerException: Cannot invoke "org.elasticsearch.cluster.ClusterState.nodes()" because "clusterState" is null
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction.<init>(TransportSingleShardAction.java:151) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction.<init>(TransportSingleShardAction.java:136) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction.doExecute(TransportSingleShardAction.java:103) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction.doExecute(TransportSingleShardAction.java:62) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:179) ~[elasticsearch-7.8.0.jar:7.8.0]
at tech.beshu.ror.es.IndexLevelActionFilter.$anonfun$apply$1(IndexLevelActionFilter.scala:95) ~[?:?]
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) ~[?:?]
at tech.beshu.ror.utils.AccessControllerHelper$$anon$1.run(AccessControllerHelper.scala:25) ~[?:?]
at java.security.AccessController.doPrivileged(AccessController.java:312) ~[?:?]
at tech.beshu.ror.utils.AccessControllerHelper$.doPrivileged(AccessControllerHelper.scala:24) ~[?:?]
at tech.beshu.ror.es.IndexLevelActionFilter.apply(IndexLevelActionFilter.scala:93) ~[?:?]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:177) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:155) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:83) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:83) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:72) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:399) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:388) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:492) ~[elasticsearch-7.8.0.jar:7.8.0]
at tech.beshu.ror.es.services.EsIndexJsonContentService.$anonfun$sourceOf$1(EsIndexJsonContentService.scala:55) ~[?:?]
at monix.eval.internal.TaskRunLoop$.startFull(TaskRunLoop.scala:81) ~[?:?]
at monix.eval.internal.TaskRunLoop$.$anonfun$restartAsync$1(TaskRunLoop.scala:222) ~[?:?]
at monix.execution.internal.InterceptRunnable.run(InterceptRunnable.scala:27) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) ~[?:?]
at java.lang.Thread.run(Thread.java:832) ~[?:?]
So there is nothing to worry about. The first try failed, but the next one succeeded; ES was simply not ready yet. As I said, we will improve this in the future and the error won't show up anymore.
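If you want to double-check it yourself, once the node is up you can verify that the cluster has formed and that the ROR settings document is readable again. A rough sketch, assuming the node listens on localhost:9200 and the request is made with credentials that are allowed to read the .readonlyrest index:

curl -s 'http://localhost:9200/_cluster/health?wait_for_status=yellow&timeout=30s&pretty'
curl -s 'http://localhost:9200/.readonlyrest/_doc/1?pretty'

If the second call returns the document with "found": true, the in-index settings were loaded on the retry.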
We see a lot of other errors of this kind while the Elasticsearch cluster is running.
"The first try failed, but the next one succeeded."
What do you mean by "the next one succeeded"?
Is a cluster restart needed?
Error log below:
[2020-08-24T07:10:30.366] ERROR tech.beshu.ror.es.services.EsIndexJsonContentService [scala-execution-context-global-60] [cavarzere] Cannot get source of document [.readonlyrest ID=1]
org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [indices:data/read/get[s]] would be [4236005706/3.9gb], which is larger than the limit of [4080218931/3.7gb], real usage: [4236005552/3.9gb], new bytes reserved: [154/154b], usages [request=0/0b, fielddata=0/0b, in_flight_requests=154/154b, accounting=15434632/14.7mb]
at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.checkParentLimit(HierarchyCircuitBreakerService.java:347) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.common.breaker.ChildMemoryCircuitBreaker.addEstimateBytesAndMaybeBreak(ChildMemoryCircuitBreaker.java:128) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.transport.InboundAggregator.checkBreaker(InboundAggregator.java:210) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.transport.InboundAggregator.finishAggregation(InboundAggregator.java:119) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:140) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:117) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:82) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:73) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[?:?]
at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:271) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[?:?]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[?:?]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578) ~[?:?]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) ~[?:?]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[?:?]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
at java.lang.Thread.run(Thread.java:832) ~[?:?]
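If it helps, the limit in that message (4080218931 bytes, ~3.7gb) corresponds, as far as I understand, to the default parent circuit breaker limit of 95% of the JVM heap (here a 4 GB heap), so the node was already close to heap exhaustion when this tiny 154-byte get request arrived. A rough way to inspect the breaker and heap state, assuming default ports and suitable credentials:

curl -s 'http://localhost:9200/_nodes/stats/breaker?pretty'
curl -s 'http://localhost:9200/_cat/nodes?v&h=name,heap.percent,heap.max'

The first call shows per-breaker limits and estimated usage; the second shows how full each node's heap is.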
I tried removing the .readonlyrest index and restarting the ES cluster, but we still see these errors.
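For reference, the removal was just a plain index delete before the full cluster restart, roughly (assuming localhost:9200 and an admin user allowed to touch that index):

curl -s -X DELETE 'http://localhost:9200/.readonlyrest'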
Can you explain to us why the class tech.beshu.ror.es.services.EsIndexJsonContentService generates all these errors?
This is very annoying for our production monitoring, because this class generates dozens of stack traces:
ERROR tech.beshu.ror.es.services.EsIndexJsonContentService [scala-execution-context-global-22] [spicedpassion] Cannot get source of document [.readonlyrest ID=1]
java.lang.NullPointerException: Cannot invoke "org.elasticsearch.cluster.ClusterState.nodes()" because "clusterState" is null
…
ERROR tech.beshu.ror.es.services.EsIndexJsonContentService [scala-execution-context-global-108] [campofelice] Cannot get source of document [.readonlyrest ID=1]
org.elasticsearch.action.NoShardAvailableActionException: No shard available for [get [.readonlyrest][_doc][1]: routing [null]]
…
ERROR tech.beshu.ror.es.services.EsIndexJsonContentService [scala-execution-context-global-84] [campofelice] Cannot get source of document [.readonlyrest ID=1]
org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [indices:data/read/get[s]] would be [4165110002/3.8gb], which is larger than the limit of [4080218931/3.7gb], real usage: [4165109848/3.8gb], new bytes reserved: [154/154b], usages [request=0/0b, fielddata=0/0b, in_flight_requests=154/154b, accounting=15419396/14.7mb]
For now we have rolled back to ES 7.6.1 and ROR 1.19.4, where the cluster is running properly.
Along with the ROR errors, we also learned of a Lucene memory leak affecting Elasticsearch versions 7.8.0 and 7.9.0 …
In the coming days I will be testing ES 7.6.1 with ROR 1.22.1, because I need to use the Kibana API to save all the Kibana assets of the different tenants.
I will see then whether the above errors still appear.
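For the tenant export itself, the plan is to use the Kibana saved objects export API, roughly like this for each tenant (a sketch, assuming Kibana on localhost:5601, with tenant_user:password standing in for real credentials of a user belonging to the tenant being exported):

curl -s -X POST 'http://localhost:5601/api/saved_objects/_export' \
  -H 'kbn-xsrf: true' -H 'Content-Type: application/json' \
  -u tenant_user:password \
  -d '{"type": ["index-pattern", "search", "visualization", "dashboard"], "includeReferencesDeep": true}' \
  > tenant_export.ndjson

The response is an NDJSON file that can later be re-imported through the corresponding _import endpoint.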