13 Replies Latest reply on Jul 18, 2012 3:43 AM by thomas.diesler

    TCCL used by EJBNamingContext is wrong when callstack passes through multiple OSGi modules

    steffenwollscheid

      Hi all,

       

      we face the following problem in JBoss 7.1.0-Final (including the fix for AS7-3830):

       

      We would like to be able to trigger a chain of events, say from a JMX bean in an OSGi bundle [A].

      [A] then calls up an OSGi Service/Class located in bundle [B] using an interface exported by [B].

      Now [B] tries to make a remote EJB lookup into an ear [C] on an interface it imported from another OSGi Bundle [D].

       

      This fails with the following stacktrace:

      [Server:server-one] 15:46:32,245 ERROR [stderr] (RMI TCP Connection(4)-10.0.103.110) javax.naming.NamingException: Could not load ejb proxy class steffen.experimental.remote.ejb.RemoteCalculator [Root exception is java.lang.ClassNotFoundException: steffen.experimental.remote.ejb.RemoteCalculator from [Module "deployment.steffen.experimental.ejb-remote.twice-removed:0.0.1.SNAPSHOT" from Service Module Loader]]

      [Server:server-one] 15:46:32,245 ERROR [stderr] (RMI TCP Connection(4)-10.0.103.110)    at org.jboss.ejb.client.naming.ejb.EjbNamingContext.createEjbProxy(EjbNamingContext.java:108)

      [Server:server-one] 15:46:32,246 ERROR [stderr] (RMI TCP Connection(4)-10.0.103.110) at org.jboss.ejb.client.naming.ejb.EjbNamingContext.lookup(EjbNamingContext.java:96)

      [Server:server-one] 15:46:32,246 ERROR [stderr] (RMI TCP Connection(4)-10.0.103.110)    at org.jboss.ejb.client.naming.ejb.EjbNamingContext.lookup(EjbNamingContext.java:76)

      [Server:server-one] 15:46:32,246 ERROR [stderr] (RMI TCP Connection(4)-10.0.103.110)    at org.jboss.as.naming.InitialContext.lookup(InitialContext.java:100)

      [Server:server-one] 15:46:32,246 ERROR [stderr] (RMI TCP Connection(4)-10.0.103.110) at org.jboss.as.naming.NamingContext.lookup(NamingContext.java:213)

      [Server:server-one] 15:46:32,246 ERROR [stderr] (RMI TCP Connection(4)-10.0.103.110)    at org.apache.aries.jndi.DelegateContext.lookup(DelegateContext.java:161)

      [Server:server-one] 15:46:32,246 ERROR [stderr] (RMI TCP Connection(4)-10.0.103.110)    at steffen.experimental.client.jmx.service.LookupImpl.internal_InitialContextService(LookupImpl.java:63)

      [Server:server-one] 15:46:32,247 ERROR [stderr] (RMI TCP Connection(4)-10.0.103.110) at steffen.experimental.client.jmx.service.TriggerLookup.doAddition_InitialContextService(TriggerLookup.java:85)

      [Server:server-one] 15:46:32,247 ERROR [stderr] (RMI TCP Connection(4)-10.0.103.110)    at steffen.experimental.indirect.jmx.ServiceCallerWrapper.doAddition_InitialContextService(ServiceCallerWrapper.java:30)

      [Server:server-one] 15:46:32,247 ERROR [stderr] (RMI TCP Connection(4)-10.0.103.110)    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

       

      Where “twice-removed” is [A] and the class TriggerLookup does reside in [B].

       

      (We have aries.jndi running in our jboss, but the behavior described here occurs also without aries.jndi – in fact we had hoped that aries.jndi would solve our problems)

       

      It is important to note that the same code in [B] works fine when the initiating JMX bean resides in [B] instead of [A]. In that case the TCCL is the bundle class loader of [B], whereas in the failing case it is the bundle class loader of [A], which of course has no knowledge of the interface class.

       

      Furthermore it is important to note that this behavior occurs even though the flow of control from [A] to [B] is done using:

       

      private TriggerLookupMBean getService()

      {

              ServiceReference sRef = TwiceRemovedActivator.getBundleContext().getServiceReference(TriggerLookupMBean.class.getName());

              if( sRef != null ){

                 return (TriggerLookupMBean) TwiceRemovedActivator.getBundleContext().getService(sRef);

              } else {

                 throw new IllegalStateException("Service TriggerLookupMBean was not found!");

              }

          }

       

      public String doAddition_InitialContextService()

      {

           return getService().doAddition_InitialContextService();

      }

       

      So that the OSGi framework would have a chance to change the TCCL using an interceptor hooked into the service which is returned by getService.

      But from what I see, simply an instance of the implementation class from bundle [B] is returned, without any interceptor around it.

       

      Am I doing something wrong here?

       

      With aries.jndi installed, I can do a successful JNDI lookup for an OSGi service regardless of which bundle initiates the flow of control, while the same kind of lookup fails when done with an “ejb:” prefix.

       

      This works:

      AnOSGiService otherSvc = null;

      ServiceReference sRef = Activator.getBundleContext()

            .getServiceReference(JNDIContextManager.class.getName());

      if (sRef != null)

      {

           JNDIContextManager contextMgr = (JNDIContextManager) Activator.getBundleContext().getService(sRef);

       

           try

           {

              Properties props = new Properties();

              props.put("osgi.service.jndi.bundleContext", Activator.getBundleContext());

              Context ctx = contextMgr.newInitialContext(props);

              System.out.println("doing JNDI lookup");

              otherSvc = (AnOSGiService) ctx.lookup("osgi:service/" + AnOSGiService.class.getName());

              System.out.println("lookup succeeded, calling service");

              return "result:" + otherSvc.foo();

           }

        //...

        

      This fails:

       

      RemoteCalculator calc = null;

      ServiceReference sRef = Activator.getBundleContext()

                      .getServiceReference(JNDIContextManager.class.getName());

        if (sRef != null)

      {

            JNDIContextManager contextMgr = (JNDIContextManager) Activator.getBundleContext().getService(sRef);

       

                 try

            {

               Properties props = new Properties();

               props.put("osgi.service.jndi.bundleContext", Activator.getBundleContext());

               props.put(Context.URL_PKG_PREFIXES, "org.jboss.ejb.client.naming");

               Context ctx = contextMgr.newInitialContext(props);

               System.out.println("doing lookup");

               calc = (RemoteCalculator)ctx.lookup("ejb:application-ear-0.0.1-SNAPSHOT/ejb-definition-0.0.1-SNAPSHOT//CalculatorBean!steffen.experimental.remote.ejb.RemoteCalculator");

               System.out.println("lookup succeeded, calling remote bean");

                  return "result:" + calc.add(1, 1);

            }

      //...


       

       

      As I mentioned before, both lookups work when called from a JMX bean in the same bundle!

      What am I missing?

       

      Our current workaround is an aspect that changes the TCCL in exported public methods if required – but I believe this should not be necessary.
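A workaround of this kind can also be sketched without an aspect framework. The following is a minimal sketch, assuming a JDK dynamic proxy is acceptable for the exported service interface; TcclProxy and wrap are hypothetical names and are not taken from the aspect mentioned above, from JBoss, or from OSGi. It switches the TCCL to the class loader of the service implementation for the duration of each call and restores the previous value afterwards.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

/** Hypothetical helper: wraps a service so calls run with the implementation's loader as TCCL. */
final class TcclProxy {
    private TcclProxy() {}

    static <T> T wrap(Class<T> iface, T target) {
        // Class loader of the implementation, i.e. of the bundle that owns the service.
        ClassLoader owner = target.getClass().getClassLoader();
        InvocationHandler handler = (proxy, method, args) -> {
            Thread thread = Thread.currentThread();
            ClassLoader previous = thread.getContextClassLoader();
            thread.setContextClassLoader(owner);
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                // Rethrow the exception thrown by the service itself.
                throw e.getCause();
            } finally {
                // Always restore the caller's TCCL, even on failure.
                thread.setContextClassLoader(previous);
            }
        };
        return iface.cast(Proxy.newProxyInstance(owner, new Class<?>[] {iface}, handler));
    }
}
```

Under these assumptions, bundle [B] could register TcclProxy.wrap(TriggerLookupMBean.class, impl) as its service object instead of impl itself, so callers in [A] would not need to know about the TCCL at all.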

       


       


        • 1. Re: TCCL used by EJBNamingContext is wrong when callstack passes through multiple OSGi modules
          thomas.diesler

          > So that the OSGi framework would have a chance to change the TCCL using an interceptor hooked into the service which is returned by getService

           

          The OSGi layer stays out of changing the TCCL. Generally, the notion of TCCL likely breaks modularity.

          I'd say the code that relies on TCCL should be wrapped by something that sets it correctly. Clients would call the wrapper.

          Perhaps this is what you already do in your aspect.
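The wrapper idea can be sketched as a small helper. This is a minimal sketch under assumptions: TcclSwitch, Action, and callWith are hypothetical names, not an existing JBoss or OSGi API. It sets the given loader as TCCL, runs the action, and restores the previous TCCL in a finally block.

```java
/** Hypothetical helper that runs an action under a specific thread context class loader. */
final class TcclSwitch {
    private TcclSwitch() {}

    /** An action that returns a value and may throw a checked exception of type E. */
    @FunctionalInterface
    interface Action<T, E extends Exception> {
        T run() throws E;
    }

    static <T, E extends Exception> T callWith(ClassLoader loader, Action<T, E> action) throws E {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(loader);
        try {
            return action.run();
        } finally {
            // Restore the caller's TCCL regardless of the outcome.
            thread.setContextClassLoader(previous);
        }
    }
}
```

Code that relies on the TCCL would then be called roughly as TcclSwitch.callWith(LookupImpl.class.getClassLoader(), () -> ctx.lookup(name)), where ctx and name stand for the context and JNDI name already used by the client.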

          • 2. Re: TCCL used by EJBNamingContext is wrong when callstack passes through multiple OSGi modules
            steffenwollscheid

            Hello Thomas,

             

            thanks for your answer. I agree that OSGi is not the one to fiddle with the TCCL, and yes, our aspect does just that.

             

             

            But the code relying on the TCCL is the EJBClientContext of JBoss itself. Take a look at

             

            org.jboss.ejb.client.naming.ejb.EjbNamingContext (modules/org/jboss/ejb-client/main/jboss-ejb-client-1.0.5.Final.jar)

            Line 106:  viewClass = Class.forName(identifier.getViewName(), false, SecurityActions.getContextClassLoader());
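The effect of that line can be reproduced outside JBoss. The following is a minimal sketch, assuming a plain JVM; ContextLoaderLookup and its nested Calculator interface are hypothetical stand-ins for the naming context and the remote view interface. It resolves a class name against the TCCL in the same way and reports whether the resolution succeeds for a given context loader.

```java
/** Hypothetical stand-in that resolves a view class against the TCCL, like the quoted line. */
final class ContextLoaderLookup {
    private ContextLoaderLookup() {}

    /** Stand-in for a remote view interface such as RemoteCalculator. */
    interface Calculator {
        int add(int a, int b);
    }

    /** Same shape as the quoted lookup: Class.forName against the thread context class loader. */
    static Class<?> loadViaTccl(String className) throws ClassNotFoundException {
        return Class.forName(className, false, Thread.currentThread().getContextClassLoader());
    }

    /** Returns true if className can be resolved while tccl is the context class loader. */
    static boolean resolvableWith(ClassLoader tccl, String className) {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(tccl);
        try {
            loadViaTccl(className);
            return true;
        } catch (ClassNotFoundException e) {
            // The context loader cannot see the class, as in the stack trace of the first post.
            return false;
        } finally {
            thread.setContextClassLoader(previous);
        }
    }
}
```

With the loader that defines Calculator as TCCL the lookup succeeds; with an unrelated loader that cannot see the class, the same call ends in a ClassNotFoundException, which corresponds to the situation where the TCCL belongs to bundle [A].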

             

            Do you really expect everyone using remote EJBs from an OSGi bundle in JBoss to take care of the TCCL themselves, every time they look up and access a remote EJB?

             

            Would it really break modularity if bundle[B] sets the TCCL for the duration of method executions within bundle [B] to the module class loader of [B]?

             

            In the current scenario, code in bundle [B] has access to classes from bundle [A] through the TCCL - as you said, it breaks modularity.

             

            Best Regards

            Steffen

            • 3. Re: TCCL used by EJBNamingContext is wrong when callstack passes through multiple OSGi modules
              thomas.diesler

              To nail this down, you are saying that a component (e.g. an OSGi service) cannot invoke a remote EJB if the TCCL is not set correctly by this component. This may be a general issue that does not (just) apply to the context of OSGi components.

               

              Let's work on a concise description of this issue and create a jira for it. I could supply an isolated test case for an OSGi client. If the above is correct, I can create the jira for it; you can do that too if you like.

              • 4. Re: TCCL used by EJBNamingContext is wrong when callstack passes through multiple OSGi modules
                steffenwollscheid

                The remote lookup, as it is currently implemented, fails if the TCCL is set to the loader of a module other than the one doing the lookup.

                 

                In my scenario, where JMX beans are used to trigger the chain of events, the TCCL is set to the loader of the module originating the thread; but I do not see why this should be different when a web app or anything else starts the thread.

                 

                Thank you for looking into this!

                • 5. Re: TCCL used by EJBNamingContext is wrong when callstack passes through multiple OSGi modules
                  thomas.diesler

                  Who is setting the TCCL in the first place? Is this your code? What happens if the TCCL is null (i.e. not set)?

                  • 6. Re: TCCL used by EJBNamingContext is wrong when callstack passes through multiple OSGi modules
                    steffenwollscheid

                    When I tried to find out who sets the TCCL, I got lost in the JBoss sources, so I cannot say who does, only that it is not our code.

                    When the TCCL is null, which it is during calls to the bundle activator, we experienced NPEs.

                     

                    Furthermore, even if the TCCL is set correctly in the OSGi bundle before looking up and calling the remote EJB, the search for the EJB client context ends up in the org.jboss.as.ejb3.remote.TCCLEJBClientContextSelector, which does not find the EJB client context from the jboss-ejb-client.xml in the META-INF of the bundle, although according to the log the file gets processed during startup.

                     

                    Here again this seems to be a problem with the set TCCL as the TCCL is used as the key into the map of registered EJB client contexts.
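To illustrate why a registry keyed by class loader behaves this way, here is a minimal sketch. It is an assumption-laden illustration, not the actual TCCLEJBClientContextSelector; LoaderKeyedRegistry and its methods are hypothetical names. Contexts are registered per deployment class loader, and the lookup uses the calling thread's TCCL as the key, falling back to a default when that loader was never registered.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry keyed by class loader; not the JBoss implementation. */
final class LoaderKeyedRegistry<C> {
    private final Map<ClassLoader, C> byLoader = new ConcurrentHashMap<>();
    private final C fallback;

    LoaderKeyedRegistry(C fallback) {
        this.fallback = fallback;
    }

    /** Associates a context with the class loader of the deployment that declared it. */
    void register(ClassLoader deploymentLoader, C context) {
        byLoader.put(deploymentLoader, context);
    }

    /** Returns the context registered for the current TCCL, or the fallback if there is none. */
    C current() {
        ClassLoader tccl = Thread.currentThread().getContextClassLoader();
        // ConcurrentHashMap rejects null keys, so a null TCCL goes straight to the fallback.
        C found = (tccl == null) ? null : byLoader.get(tccl);
        return (found != null) ? found : fallback;
    }
}
```

In such a design, a call that arrives in [B] with the TCCL of [A] never finds the context registered for [B]; it silently gets the fallback instead, which matches the behavior described above.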

                     

                    Since the fix for AS7-3830 (part of 7.1.1-Final), TCCLEJBClientContextSelector at least returns the default org.jboss.ejb.client.EJBClientContext; before this fix, looking up and calling a remote EJB from an OSGi bundle did not work at all. With this fix it works, but I fear not reliably. I did not look into what happens if more than one remote-outbound-connection is configured, but I fear selecting a specific one using the jboss-ejb-client.xml is not possible in the current implementation. I opened a discussion in the general JBoss7 AS forum on this (https://community.jboss.org/thread/196802), but didn't get a reaction so far.

                     

                    I cannot say if it is a general issue, but it severely impairs the use of EJB from OSGi bundles.

                     

                    From my point of view this concerns the deep JBoss internals, so I would rather not create the jira ticket myself.

                    • 7. Re: TCCL used by EJBNamingContext is wrong when callstack passes through multiple OSGi modules
                      thomas.diesler

                      Ok, here it is: https://issues.jboss.org/browse/AS7-4253

                       

                      Next, we need to create an isolated test case. Do you already have something?

                      • 8. Re: TCCL used by EJBNamingContext is wrong when callstack passes through multiple OSGi modules
                        steffenwollscheid

                        I have attached a complete scenario (in the form of 5 maven projects) to AS7-4253, although I don't know if it is what you meant by an isolated test case. But it should be quick and easy to use, and it demonstrates the problem.

                        • 9. Re: TCCL used by EJBNamingContext is wrong when callstack passes through multiple OSGi modules
                          thomas.diesler

                          I actually meant a test case similar to: StatelessBeanIntegrationTestCase

                          If you have the time to translate your scenario into such a test case, that would be great. I'll help you along if needed.

                          The easiest way is to fork the jboss-as project in github, make the changes in the testsuite and send a pull request to me.

                          • 10. Re: TCCL used by EJBNamingContext is wrong when callstack passes through multiple OSGi modules
                            steffenwollscheid

                            Ok, I attached a project with an arquillian test case to https://issues.jboss.org/browse/AS7-4253, see my comment there for details. Since I am not proficient with git, I'd be grateful if you could add the sources to git. Thanks for your help!

                            • 11. Re: TCCL used by EJBNamingContext is wrong when callstack passes through multiple OSGi modules
                              thomas.diesler

                              This issue is resolved.

                               

                              I saw this CNFE

                               

                              09:18:24,662 ERROR [stderr] (pool-3-thread-2) javax.naming.NamingException: EJBCLIENT000037: Could not load ejb proxy class demo.experimental.interfaces.RemoteCalculator [Root exception is java.lang.ClassNotFoundException: demo.experimental.interfaces.RemoteCalculator from [Module "deployment.cascaded-accessor-osgi-0.0.1-SNAPSHOT.jar:main" from Service Module Loader]]
                              09:18:24,662 ERROR [stderr] (pool-3-thread-2)     at org.jboss.ejb.client.naming.ejb.EjbNamingContext.createEjbProxy(EjbNamingContext.java:108)
                              09:18:24,663 ERROR [stderr] (pool-3-thread-2)     at org.jboss.ejb.client.naming.ejb.EjbNamingContext.lookup(EjbNamingContext.java:96)
                              09:18:24,663 ERROR [stderr] (pool-3-thread-2)     at org.jboss.ejb.client.naming.ejb.EjbNamingContext.lookup(EjbNamingContext.java:76)
                              09:18:24,663 ERROR [stderr] (pool-3-thread-2)     at org.jboss.as.naming.InitialContext.lookup(InitialContext.java:101)
                              09:18:24,663 ERROR [stderr] (pool-3-thread-2)     at org.jboss.as.naming.NamingContext.lookup(NamingContext.java:215)
                              09:18:24,663 ERROR [stderr] (pool-3-thread-2)     at javax.naming.InitialContext.lookup(InitialContext.java:392)
                              09:18:24,663 ERROR [stderr] (pool-3-thread-2)     at demo.experimental.remotecaller.LookupImpl.doLookup(LookupImpl.java:32)
                              09:18:24,664 ERROR [stderr] (pool-3-thread-2)     at demo.experimental.remotecaller.LookupImpl.useInitialContextFromOSGiService(LookupImpl.java:73)
                              09:18:24,664 ERROR [stderr] (pool-3-thread-2)     at demo.experimental.jmx.LookupWrapper.useInitialContextFromOSGiService(LookupWrapper.java:20)
                              09:18:24,664 ERROR [stderr] (pool-3-thread-2)     at demo.experimental.cascaded.jmx.CascadedCaller.useInitialContextFromOSGiService(CascadedCaller.java:26)
                              

                               

                              which I fixed by adding this import

                               

                              builder.addImportPackages(RemoteCalculator.class, ...
                              

                               

                              All tests pass against the latest EJB3/OSGi integration (which is not yet in master). If you can't wait for the AS7 release, you can try it on the branch that is referenced from #2596

                              • 12. Re: TCCL used by EJBNamingContext is wrong when callstack passes through multiple OSGi modules
                                steffenwollscheid

                                I'm afraid the CNFE is precisely what the testcase was about in the first place.

                                 

                                The cascaded-accessor-bundle did not import the RemoteCalculator.class on purpose, because it does not use it.

                                RemoteCalculator.class is an interface between the ejb-accessor-bundle (which imports it) and the ejb-definition ear/jar; as such it is an implementation detail of the ejb-accessor-bundle.

                                 

                                The CNFE arises from the fact that the TCCL is set to the HostBundleClassLoader of cascaded-accessor-bundle when the ejb-accessor-bundle calls the ejb on behalf of the cascaded-accessor-bundle.

                                • 13. Re: TCCL used by EJBNamingContext is wrong when callstack passes through multiple OSGi modules
                                  thomas.diesler

                                  ok, I reopened