16 Replies Latest reply on Oct 22, 2015 3:43 AM by mkouba

    Weld 2.2.10 efficiency vs webbeans

    katheris

      I recently switched from webbeans to Weld 2.2.10. I have an application that repeatedly pings a servlet, and I have found that the overhead on the application when running Weld is 24% higher than when running webbeans. Comparing the time spent in different methods, a lot of the extra time appears to come from HttpContextLifecycle.requestInitialized(). More specifically, the child methods HttpRequestContextImpl.associate() and AbstractBoundContext.activate() both call AttributeBeanStore.attach(), which seems to account for the majority of the time taken in those methods. The other child methods of HttpContextLifecycle.requestInitialized() that take the most time are HttpSessionContextImpl.associate() and ConversationContextActivator.associateConversationContext(). I realise that Weld does a lot more in this call than the equivalent call in webbeans; however, a 24% difference is quite significant. Are there currently any plans to improve the speed of Weld in this area?

       

       

      Many thanks,

      Katherine Stanley

        • 1. Re: Weld 2.2.10 efficiency vs webbeans
          mkouba

          Hi Katherine,

          I've created a tracking issue for this - WELD-1931.

          • 2. Re: Weld 2.2.10 efficiency vs webbeans
            mkouba

            I suppose that your application is not publicly available, is it? It would be great if we could analyze the entire HTTP request lifecycle and the application setup.

            • 3. Re: Weld 2.2.10 efficiency vs webbeans
              katheris

              Hi, the app I have is quite complicated, so I'm in the process of making a simpler app that still demonstrates the problem. I will get it to you as soon as I can. Thanks for opening the tracking issue. Katherine

              • 4. Re: Weld 2.2.10 efficiency vs webbeans
                katheris

                Hi Martin,

                 

                Sorry for the delay, I have been quite busy. Here are two applications: one is a simple servlet (PingServlet.war), and the other is the same simple servlet plus a second servlet that injects a class using CDI (PingServletCDI.war). I then "pinged" the simple non-CDI servlet in both applications, with the webbeans CDI feature enabled on the server in one case and the Weld CDI feature in the other. These are the results:

                PingServlet.war:                33328 req/sec
                PingServletCDI.war (webbeans):  30711 req/sec (-8%)
                PingServletCDI.war (weld):      25566 req/sec (-24%)

                 

                I hope that is sufficient for you to do your own investigations, let me know if you need any more information.

                 

                Kind regards,

                Katherine
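For anyone who wants to recompute the percentages from the raw req/sec figures above, here is a minimal sketch. The class and method names are hypothetical and not part of the posted applications. With these inputs the relative changes against PingServlet.war come out at roughly -7.9% (webbeans) and -23.3% (weld), within about a point of the rounded values quoted.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of PingServlet.war or PingServletCDI.war.
final class ThroughputDelta {

    /** Relative change of `measured` against `baseline`, in percent (negative means slower). */
    static double percentChange(double baseline, double measured) {
        return (measured - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        double plain = 33328;    // PingServlet.war, req/sec
        double webbeans = 30711; // PingServletCDI.war with webbeans, req/sec
        double weld = 25566;     // PingServletCDI.war with weld, req/sec

        System.out.printf(Locale.ROOT, "webbeans vs plain:    %.1f%%%n", percentChange(plain, webbeans));
        System.out.printf(Locale.ROOT, "weld vs plain:        %.1f%%%n", percentChange(plain, weld));
        System.out.printf(Locale.ROOT, "weld vs webbeans:     %.1f%%%n", percentChange(webbeans, weld));
    }
}
```

The third line compares the two CDI implementations directly, which is the comparison the original question is about.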

                • 5. Re: Weld 2.2.10 efficiency vs webbeans
                  mkouba

                  Hi Katherine,

                  What container do you use? We've recently made some improvements in this area, but they did not make a big difference compared to the overall overhead of HTTP request processing (our test container was WildFly).

                  • 6. Re: Weld 2.2.10 efficiency vs webbeans
                    katheris

                    Hi,

                    I'm using WebSphere Application Server Liberty Profile. The webbeans CDI is the cdi-1.0 feature, and Weld is the cdi-1.2 feature.

                    • 7. Re: Weld 2.2.10 efficiency vs webbeans
                      emilyj

                      Hi Martin/Jozef,

                      You might be interested in some performance testing across different application servers, in case the performance problem we reported has to do with the Liberty Profile application server.

                      We downloaded eebench (https://github.com/struberg/eebench) and compared Glassfish 4.1 and WildFly 8.2 (Weld) to TomEE (OWB). As a matter of fact, Liberty Profile performs a lot better than Glassfish and WildFly, but it is not as fast as TomEE.

                       

                      (with Oracle JDK)

                      TomEE 1.7.2      1079
                      Glassfish 4.1     151  (-86%)
                      WildFly 8.2       167  (-84%)

                       

                      Parent  0   9.11  84.05    1574   14524  J:sun/reflect/GeneratedMethodAccessor47.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;

                      Self    0   9.11  84.05    1574   14524  J:org/apacheextras/eebench/cdibench/CdiBenchBean.getState()Ljava/lang/String;

                      Child   0   4.33  26.08     748    4507  J:org/apacheextras/eebench/cdibench/beans/ClassInterceptedBean$Proxy$_$$_WeldClientProxy.getMeaningOfLife()Ljava/lang/Integer;
                      Child   0   3.25  25.72     562    4445  J:org/apacheextras/eebench/cdibench/beans/MethodInterceptedBean$Proxy$_$$_WeldClientProxy.getMeaningOfLife()Ljava/lang/Integer;
                      Child   0   0.00  13.72       0    2370  J:org/apacheextras/eebench/cdibench/beans/SimpleRequestScopedBeanWithoutInterceptor$Proxy$_$$_WeldClientProxy.theMeaningOfLife()I
                      Child   0   0.00   6.93       0    1197  J:org/apacheextras/eebench/cdibench/beans/MethodInterceptedBean$Proxy$_$$_WeldClientProxy.getMeaningOfHalfLife()Ljava/lang/Integer;
                      Child   0   0.00   2.49       0     431  J:org/apacheextras/eebench/cdibench/beans/SimpleApplicationScopedBeanWithoutInterceptor$Proxy$_$$_WeldClientProxy.theMeaningOfLife()I

                       

                      ------------

                       

                      This seems to be taking a big chunk of time in WELD:

                      Parent  0   0.03   0.03       6       6  J:org/apacheextras/eebench/cdibench/beans/MethodInterceptedBean$Proxy$_$$_WeldClientProxy.getMeaningOfLife()Ljava/lang/Integer;
                      Parent  0   0.05   0.05       9       9  J:org/apacheextras/eebench/cdibench/beans/ClassInterceptedBean$Proxy$_$$_WeldClientProxy.getMeaningOfLife()Ljava/lang/Integer;
                      Parent  0   2.48   2.48     428     428  J:org/apacheextras/eebench/cdibench/beans/SimpleApplicationScopedBeanWithoutInterceptor$Proxy$_$$_WeldClientProxy.theMeaningOfLife()I
                      Parent  0   5.69   5.69     984     984  J:org/apacheextras/eebench/cdibench/beans/MethodInterceptedBean$Proxy$_$$_WeldClientProxy.getMeaningOfHalfLife()Ljava/lang/Integer;
                      Parent  0   3.60  13.72     622    2370  J:org/apacheextras/eebench/cdibench/beans/SimpleRequestScopedBeanWithoutInterceptor$Proxy$_$$_WeldClientProxy.theMeaningOfLife()I

                      Self    0  11.86  21.97    2049    3797  J:org/jboss/weld/bean/proxy/ProxyMethodHandler.getInstance()Ljava/lang/Object;

                      Child   0   1.39  10.12     241    1748  J:org/jboss/weld/bean/proxy/ContextBeanInstance.getInstance()Ljava/lang/Object;

                       

                      Hope this information can help you identify the bottleneck.

                      • 8. Re: Weld 2.2.10 efficiency vs webbeans
                        mkouba

                        Hi Emily,

                         

                        Thanks for your tips. It would be nice if you added some description to the numbers you provided (better formatting would help as well). With regard to eebench: first of all, the test for CDI also involves other EE technologies (JSF, EL, etc.), so the results are more relevant to the performance of the JSF/CDI stack. It also targets one feature only, bean invocation; e.g. it does not test events, bean instance construction, producers, etc. And remember that you should warm up the JVM before you run the test.
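To illustrate the warm-up point, here is a minimal sketch in plain Java of a measurement that discards a warm-up pass before timing. It is not part of eebench; the class name, method names, and iteration counts are all hypothetical, and a dedicated harness such as JMH is generally a more reliable choice for real measurements.

```java
import java.util.function.IntSupplier;

// Hypothetical micro-benchmark sketch; not part of eebench or Weld.
final class WarmupBench {

    /** Runs the workload `iterations` times and returns the elapsed time in nanoseconds. */
    static long timeNanos(IntSupplier workload, int iterations) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += workload.getAsInt();
        }
        long elapsed = System.nanoTime() - start;
        // Use the accumulated value so the JIT cannot discard the loop body entirely.
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink);
        }
        return elapsed;
    }

    /** Runs a discarded warm-up pass first, then returns the timing of the measured pass only. */
    static long measureAfterWarmup(IntSupplier workload, int warmupIterations, int measuredIterations) {
        timeNanos(workload, warmupIterations); // result intentionally ignored
        return timeNanos(workload, measuredIterations);
    }

    public static void main(String[] args) {
        IntSupplier workload = () -> Integer.toString(42).length(); // placeholder workload
        long nanos = measureAfterWarmup(workload, 100_000, 100_000);
        System.out.println("measured pass took " + nanos + " ns");
    }
}
```

In a servlet benchmark the same idea applies at a coarser level: send a batch of requests whose timings are thrown away, then start recording.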

                        • 9. Re: Weld 2.2.10 efficiency vs webbeans
                          emilyj

                          Hi Martin,

                          Sorry for the slow response due to holidays! To answer your questions:

                          1. This test was done with a warmed-up JVM.

                          2. eebench does test other things, but the profiles clearly show that CDI is the issue here.

                          3. The Java profile numbers below show how much time (as a percentage) the following methods take during a run of the eebench benchmark.

                          The method org/jboss/weld/bean/proxy/ProxyMethodHandler.getInstance() (together with the methods it calls) takes 21.97% of the time. ProxyMethodHandler.getInstance() itself takes 11.86% of the time, and its child methods take ~10.11% of the time.

                           

                          Parent 0 0.03 0.03 6 6 J:org/apacheextras/eebench/cdibench/beans/MethodInterceptedBean$Proxy$_$$_WeldClientProxy.getMeaningOfLife()Ljava/lang/Integer;

                          Parent 0 0.05 0.05 9 9 J:org/apacheextras/eebench/cdibench/beans/ClassInterceptedBean$Proxy$_$$_WeldClientProxy.getMeaningOfLife()Ljava/lang/Integer;

                          Parent 0 2.48 2.48 428 428 J:org/apacheextras/eebench/cdibench/beans/SimpleApplicationScopedBeanWithoutInterceptor$Proxy$_$$_WeldClientProxy.theMeaningOfLife()I

                          Parent 0 5.69 5.69 984 984 J:org/apacheextras/eebench/cdibench/beans/MethodInterceptedBean$Proxy$_$$_WeldClientProxy.getMeaningOfHalfLife()Ljava/lang/Integer;

                          Parent 0 3.60 13.72 622 2370 J:org/apacheextras/eebench/cdibench/beans/SimpleRequestScopedBeanWithoutInterceptor$Proxy$_$$_WeldClientProxy.theMeaningOfLife()I

                          Self 0 11.86 21.97 2049 3797 J:org/jboss/weld/bean/proxy/ProxyMethodHandler.getInstance()Ljava/lang/Object;

                          Child 0 1.39 10.12 241 1748 J:org/jboss/weld/bean/proxy/ContextBeanInstance.getInstance()Ljava/lang/Object;

                          Thanks,

                          Emily

                          • 10. Re: Weld 2.2.10 efficiency vs webbeans
                            mkouba

                            OK Emily, thanks for the info. I tried to run the "benchmark" on WildFly master and I see slightly different numbers. First of all, the most time-consuming part is JSF, namely javax.faces.webapp.FacesServlet.service() (self time) and com.sun.faces.application.view.FaceletViewHandlingStrategy.getSession(). Note that a new HTTP session is created for every HTTP request made by jmeter. CdiBenchBean.getState() logically also takes a lot of time, namely the intercepted methods ClassInterceptedBean.getMeaningOfLife() and MethodInterceptedBean.getMeaningOfLife(), or rather the interceptor stack itself. We did a lot of optimizations in this area, but we have limited possibilities due to the way Weld implements client proxies, i.e. subclassing. Anyway, thanks for sharing your findings!

                            • 11. Re: Weld 2.2.10 efficiency vs webbeans
                              emilyj

                              Thank you, Martin, for your quick response! I understand that subclassing is to blame. If we cannot improve subclassing performance, how about exploring ASM?

                              • 12. Re: Weld 2.2.10 efficiency vs webbeans
                                mkouba

                                how about exploring ASM?

                                Emily, what exactly do you mean?

                                • 13. Re: Weld 2.2.10 efficiency vs webbeans
                                  emilyj

                                  I was commenting on your comments:

                                  "We did a lot of optimizations in this area but we have limited possibilities due to the way Weld implements client proxies - i.e. subclassing."

                                   

                                  As Weld uses its own subclassing approach to create client proxies, do you think performance would improve if the client proxies were generated using ASM?

                                  • 14. Re: Weld 2.2.10 efficiency vs webbeans
                                    mkouba

                                    Weld uses jboss-classfilewriter to generate the bytecode, a tool similar to ASM. However, subclassing is more a question of concept or implementation design. Changing it would require a massive refactoring and I'm not so sure it's worth it.
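As a rough illustration of why the bytecode generator matters less than the design, here is a simplified plain-Java sketch of a client proxy built by subclassing. These are not Weld classes: Greeter, GreeterClientProxy, and the Supplier used as a stand-in for the context lookup are all hypothetical. The point is only that every call through the proxy first resolves the current contextual instance, which is consistent with ProxyMethodHandler.getInstance() and ContextBeanInstance.getInstance() showing up under each proxied method in the profiles earlier in the thread.

```java
import java.util.function.Supplier;

// Simplified illustration only; these are NOT Weld classes.
class Greeter {
    String greet() {
        return "hello";
    }
}

// Stand-in for a generated client proxy: a subclass of the bean class that
// looks up the current contextual instance on every invocation and delegates to it.
final class GreeterClientProxy extends Greeter {
    private final Supplier<Greeter> contextLookup; // hypothetical stand-in for the context lookup
    int lookups = 0;                               // counts lookups, for demonstration only

    GreeterClientProxy(Supplier<Greeter> contextLookup) {
        this.contextLookup = contextLookup;
    }

    @Override
    String greet() {
        lookups++;                          // one lookup per call, regardless of how the class was generated
        return contextLookup.get().greet(); // delegate to the resolved instance
    }
}

final class ClientProxyDemo {
    public static void main(String[] args) {
        Greeter contextualInstance = new Greeter();
        GreeterClientProxy proxy = new GreeterClientProxy(() -> contextualInstance);
        proxy.greet();
        proxy.greet();
        System.out.println("lookups performed: " + proxy.lookups);
    }
}
```

Under this assumption, swapping the tool that emits the subclass would leave the per-call lookup in place, which matches the point that the cost is a matter of design.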
