11 Replies Latest reply on Dec 10, 2014 11:17 AM by pmensik

    Visual testing tool - request for feedback

    jhuska

      Hello all,


      I would like to kindly ask you for feedback on my recent development of a visual testing tool.


      I would like to achieve the following:


      Create a tool and enhance the process for visual testing of web applications. In other words, protect your web application from CSS regressions and other bugs which cannot be caught by unit/integration/functional or other tests and would otherwise have to be checked manually. By automating this, drastically reduce the amount of human effort that goes into testing.


      There are already some solutions which take screenshots and then compare them. However, they lack some things I need, so I started to develop another tool (taking care not to reinvent the wheel).


      My approach would involve:

      • reuse functional tests for visual testing (those tests click through the page and put the application into various states)
      • take screenshots during such testing
      • store the screenshots somewhere (these become the pattern screenshots)
      • compare new screenshots (the sample screenshots) with the pattern screenshots
      • get a reasonable, useful, concise output of such a comparison, and be able to take appropriate action based on it


      In this video I tried to summarize what I have done so far. Basically, it runs Arquillian Graphene functional tests, takes screenshots during that testing with the Graphene screenshooter, compares the screenshots with Arquillian RushEye, and produces output from that comparison.
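
      A minimal sketch of what such a reusable functional test might look like, assuming JUnit with Arquillian Drone (deployment and Graphene screenshooter configuration omitted; the page URL and element IDs are made up for illustration). In the real setup the screenshooter extension captures the screenshots automatically; the explicit TakesScreenshot call below only marks the point where a sample would be produced:

      import java.io.File;
      import org.jboss.arquillian.drone.api.annotation.Drone;
      import org.jboss.arquillian.junit.Arquillian;
      import org.junit.Test;
      import org.junit.runner.RunWith;
      import org.openqa.selenium.By;
      import org.openqa.selenium.OutputType;
      import org.openqa.selenium.TakesScreenshot;
      import org.openqa.selenium.WebDriver;

      @RunWith(Arquillian.class)
      public class LoginScreenTest {

          @Drone
          private WebDriver browser;

          @Test
          public void loginFormShouldRenderCorrectly() {
              // drive the application into the state we want to capture
              browser.get("http://localhost:8080/app/login");
              browser.findElement(By.id("username")).sendKeys("demo");

              // the Graphene screenshooter would capture this state automatically;
              // a plain WebDriver call shows where the sample screenshot originates
              File sample = ((TakesScreenshot) browser).getScreenshotAs(OutputType.FILE);
              // the sample is later compared against the stored pattern by RushEye
          }
      }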


      So much for the background. Now I would like to ask for your opinion on what a meaningful and useful report should look like, and what would be the most effective way for you to work with such a tool.


      Currently the process would start with the tasks presented in the video. The results would then be uploaded to a web application.
      I can see these functional/non-functional requirements for such an application:

      • the application will provide REST endpoints through which a user will be able to upload screenshots and the respective metadata (a rough sketch of such an endpoint follows this list)
      • logically, the results will be separated into, let's say, jobs (CI jobs), each containing multiple runs of the test suite
      • when reviewing results, one will be able to mark a new screenshot as correct, or not
      • authentication and authorization will be in place as well
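
      As a very rough illustration of the first point, an upload endpoint could look like the following JAX-RS sketch (the paths, payload, and resource names are hypothetical, not the actual API of the tool):

      import javax.ws.rs.Consumes;
      import javax.ws.rs.POST;
      import javax.ws.rs.Path;
      import javax.ws.rs.PathParam;
      import javax.ws.rs.core.MediaType;
      import javax.ws.rs.core.Response;

      // hypothetical resource: jobs contain runs, runs contain uploaded screenshots
      @Path("/jobs/{jobId}/runs/{runId}/screenshots")
      public class ScreenshotUploadResource {

          @POST
          @Consumes(MediaType.APPLICATION_OCTET_STREAM)
          public Response upload(@PathParam("jobId") String jobId,
                                 @PathParam("runId") String runId,
                                 byte[] screenshot) {
              // store the screenshot and its metadata under the given job and run
              // (e.g. in a Java Content Repository) and answer with 201 Created
              return Response.status(Response.Status.CREATED).build();
          }
      }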

       

      I have also prepared GUI mockups of such an application, which you can find attached. They show:

      • Front page - a list of recent screenshot uploads
      • Page for a particular job - the executions of the visual testing
      • A particular upload - always the pattern screenshot, the diff, and the new one

       

      Is anything missing? What are your requirements on such a web app for reviewing the results of visual testing? Any ideas are more than welcome.


      Thank you!


        • 1. Re: Visual testing tool - request for feedback
          lfryc

          Hey Juraj,

           

          I believe the mockup screens are a great first milestone.

           

          I have some ideas on where to develop it further:

           

          • UI
            • Milestone 1: indicate number of failures / total number of tests
            • Future: indicate concurrent activity of other users
              • (e.g. "this user is currently checking this part of the test suite" - this would enable parallel work on a single test suite)

           

           

          I guess particularJob.png follows the concept of a "List of Builds" (suite runs) inside a particular "Job" (suite).

           

          1) I would love the UI to also indicate how many functional tests failed in this suite, and to show whether the functional test passed or not for one particular visual test

          2) and this is connected with the idea of throwing out the results of one whole build (e.g. too many regressions caused by an unrelated failure)

           

           

          Another problem that will happen sooner rather than later: if you change the pattern images for build #15 (e.g. accept the samples as the new patterns), then you will need to re-verify the images for all the later builds up to the most recent one (say #26), because the samples (the newly taken screenshots) stay the same, but the pattern has changed.

           

          It would be nice to add some description/tag to tests (like revisions).

           

          And to get to more use cases: sometimes I want to compare two particular revisions of the visual test suite (e.g. comparing version 2.0.1 against 2.0.0, instead of comparing 2.0.1 against 2.3.0).

           

          I hope the feedback helps. :-)

          1 of 1 people found this helpful
          • 2. Re: Re: Visual testing tool - request for feedback
            lfryc

            I want to compare two different revisions of visual test suites (like comparing version 2.0.0 with 2.0.1, instead of comparing 2.0.0 with 2.3.0).

             

            Actually, knowing which project revision (e.g. Git commit ID) the project was built from, you can simply find which visual test results you should compare the samples against.

            1 of 1 people found this helpful
            • 3. Re: Re: Re: Visual testing tool - request for feedback
              jhuska

              Thanks Lukas for your valuable feedback.

               

              I will change the mockups so they also mirror your ideas.

               

              Regarding the suite versions:

              Do I understand correctly that it is enough to include in the suite build description a revision from the source code versioning system?

               

              E.g. this would be a list of suite runs for a particular job:

               

              Commit ID | Timestamp | Number of diffs / number of comparisons | Number of failed functional tests / Total number of tests
              f5933559008136e2e4343944c1e414d3d431a94f | 2014-08-24_15-50-04-567 | 15 / 650 | 3 / 210

               

              I cannot quite understand how, and why, someone would want to compare samples with patterns across different application versions. In other words, should I somehow support this? If somebody wants this, he/she can do it at the level of the source control versioning system. So maybe this also answers my question of whether a commit ID is enough?

              • 4. Re: Visual testing tool - request for feedback
                smikloso

                Hi Juraj,

                 

                this is an interesting project indeed!

                 

                I was thinking about the implementation a little bit, and I think I would make diffs only of the last two test runs. I would like to see it like this:

                 

                1) You have a Graphene-enabled test with Arquillian Recorder

                2) You provide credentials and other configuration options for your custom Diff extension in arq.xml, which will connect to a Diff server hosted elsewhere

                3) As you take screenshots along the way, you hook into well-defined events like Before/After ScreenshotTaken or TakeScreenshot and gather the screenshots for uploading to your Diff server (a rough sketch of such an observer follows this list).

                4) Once the test run is about to end, you upload these images to your Diff server.

                5) As a user, I can log in to your Diff server and see the difference between the last two test runs.

                   I can also delete tests and their resources from the Diff server when I am no longer interested in them
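
                A hedged sketch of what such an event observer could look like, assuming Arquillian's @Observes mechanism; AfterScreenshotTaken and its getScreenshotFile() method are hypothetical names taken from the suggestion above, not an existing Arquillian Recorder API:

                import java.io.File;
                import java.util.ArrayList;
                import java.util.List;
                import org.jboss.arquillian.core.api.annotation.Observes;
                import org.jboss.arquillian.test.spi.event.suite.AfterSuite;

                public class DiffUploadObserver {

                    private final List<File> collected = new ArrayList<File>();

                    // AfterScreenshotTaken is hypothetical; the real screenshooter
                    // extension may expose a different event type
                    public void onScreenshotTaken(@Observes AfterScreenshotTaken event) {
                        collected.add(event.getScreenshotFile());
                    }

                    public void onSuiteFinished(@Observes AfterSuite event) {
                        // upload the collected screenshots in one batch to the Diff
                        // server configured in arquillian.xml
                    }
                }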

                 

                I can imagine that uploading big files like screenshots would be particularly time-consuming, so I would also support an "offline mode", meaning your Diff server would be hosted on the same machine your tests run on, and the Diff server would grab the screenshots from some directory where the test project copies them.

                 

                That should be it.

                 

                I do not find the idea of being able to compare any two commits very valuable, because the test which exercises those commits can itself be very different from version to version, and you would have difficulties pairing the screenshots in order to actually compare them.

                 

                When you run these tests for the first time, you obviously have nothing to compare them with.

                 

                For better matching of the taken screenshots, I would provide an additional annotation to put on a test method, like @DiffAware; once put on a method, it would mean you indeed want diffs made for this method. When the annotation is not present, no comparison would be made.
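
                A minimal sketch of such a marker annotation (the name @DiffAware is just the proposal above, nothing that exists today):

                import java.lang.annotation.ElementType;
                import java.lang.annotation.Retention;
                import java.lang.annotation.RetentionPolicy;
                import java.lang.annotation.Target;

                // only test methods carrying this marker would have their
                // screenshots diffed against the stored patterns
                @Retention(RetentionPolicy.RUNTIME)
                @Target(ElementType.METHOD)
                public @interface DiffAware {
                }

                A test would then opt in simply by putting @DiffAware next to @Test on the method.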

                 

                I do not know how you pair screenshots. By their name + method name + class name? What if these names change?

                 

                My biggest concern is actually about making the diffs as such. I wonder how you are going to diff tests which are constantly changing over time (from the implementation point of view, I mean).

                • 5. Re: Visual testing tool - request for feedback
                  rsmeral

                  Hi Juraj,

                   

                  to be honest, I can't see the value in aggregating the screenshots and viewing the results through a server application. I think the tool you are proposing can be broken down into several components and integrations with existing tools, since from what I understand, your proposal combines:

                  - an approach to testing (visual testing), implemented as an Arquillian extension,

                  - an image comparison tool - the core of the application - utilizing RushEye,

                  - a test result repository (the web application part of your solution).

                   

                  Conceptually, I see "visual testing" only as an additional aspect of functional testing. Not only in the sense that you reuse the Selenium tests to "click through" to the state of the website you want to capture, but in that it is just another verification aspect. Standard tests verify mostly textual and structural correctness, and you propose verifying the visual aspect as well.

                  The visual aspect seems different from the others in that it can be automated - the screenshots are taken, retained, and compared automatically. In this basic use case, the output of visual testing should be just "pass/fail", reported through the standard API of the testing framework in use, with an appropriate message indicating that it is the visual aspect of the test method that failed.

                   

                  I fully agree with Stefan that there should be an offline mode. In fact, to me the design would be cleanest if the image comparison were done in the core of the tool, perhaps in the extension, diffing against a local cache of screenshots. This design could then be extended to also support remote diffing, where the remote app could do everything you propose, including storage and aggregation of the screenshots.

                  The web application part of the solution seems superfluous, since many of the aspects you mention (grouping by jobs, builds/runs, uploads/files/artifacts) are problems solved by CI tools, like Jenkins.

                   

                  To support the offline cache functionality, a reasonable identifier strategy would indeed be required (it could be pluggable, e.g. by an annotation on the test class, etc.). A screenshot of an application should be identified by a composition of identifiers of (a rough sketch of such a composite key follows this list):

                  - the state of the app code base (commit ID, version number, etc.),

                  - possibly also the state of the test code base (commit ID),

                  - the immediate state of the tested application (this could be a composite of test class + test method + test phase (before/after), or maybe the URL),

                  - any other aspects affecting the appearance (browser, OS, ...).
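
                  A minimal sketch of such a composite identifier, just to illustrate the idea; the field names are illustrative, not the tool's actual model:

                  public final class ScreenshotId {

                      private final String appCommitId;   // state of the application code base
                      private final String testCommitId;  // state of the test code base
                      private final String testClass;     // immediate state of the tested application
                      private final String testMethod;
                      private final String phase;         // e.g. "before" / "after"
                      private final String browser;       // other aspects affecting appearance
                      private final String os;

                      public ScreenshotId(String appCommitId, String testCommitId, String testClass,
                                          String testMethod, String phase, String browser, String os) {
                          this.appCommitId = appCommitId;
                          this.testCommitId = testCommitId;
                          this.testClass = testClass;
                          this.testMethod = testMethod;
                          this.phase = phase;
                          this.browser = browser;
                          this.os = os;
                      }

                      // key under which the pattern is looked up in the local cache
                      // or queried from the remote screenshot repository
                      public String asKey() {
                          return appCommitId + "/" + testCommitId + "/" + testClass + "/"
                                  + testMethod + "/" + phase + "/" + browser + "/" + os;
                      }
                  }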

                  Before testing, the tool could just scan the local cache (or possibly query the remote screenshot repo) and tell you, based on the configuration of your particular test execution (configuration of what you want to compare against - previous version, commit ID, etc.), which screenshots are missing in your local cache for the comparison to be made.

                  Also, it would make sense to embed any metadata (e.g. the identifiers) into the screenshot files directly (EXIF, XMP, ...), which would make the design a little more extensible and compact.

                   

                  Some more ideas for the tool:

                  - identification of the changed area at the level of HTML elements (I suppose you have the information about the changed regions in the diffed image). If the HTML were saved along with the screenshot, you could experiment with the JS call Document.elementFromPoint to obtain the names/IDs of the elements which fall into the changed region (a rough sketch follows this list). E.g. the test output could be: "FAIL: the element 'id=loginButton' is different".

                  - since RushEye supports the concept of masking, you could also use the "reverse" of the above idea, which would be Element.getBoundingClientRect. Then you could (maybe on test method level, using annotation, etc.) tell your tool to take into account only changes occurring in (or around) the area of a specific element. This way, the visual testing could be broken down by individual elements. E.g. "@ShouldStaySame("id=loginButton") @MayDiffer("id=message") someTest()".

                  - I don't think the phrase "pattern screenshot" expresses what it's supposed to. I guess you meant "reference screenshot"?
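
                  For the elementFromPoint idea, a hedged sketch using Selenium's JavascriptExecutor against the live page (it assumes the point (x, y) from the diff lies within the current viewport; this is not part of RushEye):

                  import org.openqa.selenium.JavascriptExecutor;
                  import org.openqa.selenium.WebDriver;
                  import org.openqa.selenium.WebElement;

                  public class ChangedElementLocator {

                      // maps a changed pixel (x, y) back to the id of the element under it
                      public String changedElementId(WebDriver browser, int x, int y) {
                          WebElement element = (WebElement) ((JavascriptExecutor) browser)
                                  .executeScript("return document.elementFromPoint(arguments[0], arguments[1]);", x, y);
                          return element == null ? null : element.getAttribute("id");
                      }
                  }

                  The returned id could then be used to produce output like "FAIL: the element 'id=loginButton' is different".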

                   

                  I guess I stated some quite obvious points, but hopefully some of this helps

                  • 6. Re: Visual testing tool - request for feedback
                    okiss

                    Hi, I'm sorry this is not a deep analysis like those posted before me, but I hope it's helpful.

                    I just have three short points:

                    - I think having three images side by side in patternsSamplesDiffsAltered.png would make them too small to spot differences. I suggest showing the reference/new sample in a mouse-controlled overlay (see this article for what I'm talking about) covering most of the page, with the ability to switch to the diff view and back.

                    - I think you should include test run numbers in the job view, as they are easier to reference than a date and time.

                    - Have you thought about what integration with Jenkins/other CI would look like? As Ron said, you are including some aspects of CI in your tool, and I'm wondering how the two would work together.

                    • 7. Re: Visual testing tool - request for feedback
                      jhuska

                      Thanks a lot, smikloso, for your feedback. Here are my answers to your questions, and hopefully some more clarification:

                       

                      1. The tool is not going to diff the last two commits; the commit info is purely for identifying what state the application was in when the visual testing took place.

                      2. It will work basically as you suggested, particularly:

                      2.1 During the first run of the functional test suite, it will create a set of screenshots, called patterns, and upload them to the storage (a Java Content Repository).

                      2.2 The second and subsequent executions of the functional tests will create another set of screenshots. The patterns will be downloaded, and the new screenshots will be compared with them. The result of the comparison and the generated diffs will be uploaded to the server (a rough idea of such a pixel-level comparison is sketched after this list).

                      3. The offline mode is there by default; all comparison is done on localhost, and the result is uploaded to a server to enable better cooperation on the test suite among testers and developers. The web application for reviewing results should provide an easier way of reviewing them than going over the generated pictures. I tested the process, and it was not that time-consuming. The user can optimize the solution by deploying the web manager "close" to the CI environment.

                      4. The DiffAware annotation, or something similar, is indeed a great idea. I would just do it the other way round: when a special annotation is present, the comparison would not be made. It can be used for tests which are too non-deterministic for visual testing. Thanks for the idea!

                      5. Screenshots are paired as you said: TestClass + TestName + when the screenshot was taken (e.g. before the test).

                      6. That is a good question. My expectation is that visual testing is deployed later in the development process, when features have stabilized a bit. Even then there can be some unstable tests; for those I would like to use either the RushEye feature of masking such constantly changing areas, or exclusion of such tests.
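
                      Just to give a rough idea of the pixel-level comparison mentioned in point 2.2 (the actual diffing is done by Arquillian RushEye, which also supports perception thresholds and masks; this naive sketch only marks differing pixels in red):

                      import java.awt.image.BufferedImage;
                      import java.io.File;
                      import java.io.IOException;
                      import javax.imageio.ImageIO;

                      public class NaivePixelDiff {

                          public BufferedImage diff(File patternFile, File sampleFile) throws IOException {
                              BufferedImage pattern = ImageIO.read(patternFile);
                              BufferedImage sample = ImageIO.read(sampleFile);

                              int width = Math.min(pattern.getWidth(), sample.getWidth());
                              int height = Math.min(pattern.getHeight(), sample.getHeight());
                              BufferedImage diff = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);

                              for (int y = 0; y < height; y++) {
                                  for (int x = 0; x < width; x++) {
                                      boolean same = pattern.getRGB(x, y) == sample.getRGB(x, y);
                                      // keep matching pixels from the pattern, paint differing ones red
                                      diff.setRGB(x, y, same ? pattern.getRGB(x, y) : 0xFF0000);
                                  }
                              }
                              return diff;
                          }
                      }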

                       

                      I hope that clarifies some of my thoughts.

                      • 8. Re: Visual testing tool - request for feedback
                        jhuska

                        Thank you, rsmeral, for taking the time to think about it, and thanks for your input. I would like to clarify some of your concerns:

                         

                        1. The solution is exactly as you said: an integration of those components. The server side (I call it the RushEye web manager) is only for reviewing the results. The tool is offline by default, as all results are also kept on the localhost where the functional tests took place. The web application is there to enable better cooperation among testers and developers, or other interested parties.

                        2. I think we see visual testing the same way. My thought with this project is that even though it is a completely different aspect than functional testing, the functional test scripts can still be greatly reused. But I put a big emphasis on not mixing the functional tests with the visual aspect.

                        3. The reason there is a server side which groups the runs somehow is not only for reviewing the results, but also for taking action on them. Meaning: if there is a false negative test, take an immediate action from the tool (apply a mask, etc.); if there is a bug, take an action; if the application's visual state changed and that is expected, take an action (delete the old pattern and put the newly created screenshot in its place). IMHO, without the server application it would not be possible to do this in a way that lets all interested parties cooperate.

                        4. The ideas about additional information for the screenshots are great! I will use some of them for sure.

                        5. There is already a feature request to identify the change at the level of elements and work with it. Thanks for your input on this problem; it will definitely help with implementing this feature.

                        6. I reused the naming from the RushEye project. Do you think that "pattern" is completely wrong? Or would "reference" just be better?

                         

                        I hope I was clear enough.

                        • 9. Re: Visual testing tool - request for feedback
                          jhuska

                          Hey okiss, your points are interesting as well, thanks for them. Here are my answers:

                           

                          1. This is a very nice idea; I will definitely play around with the effect you are proposing. Thanks.

                          2. Yes, that is true; the run number should definitely be there.

                          3. Please see my answer to Ron. I can put it in other words, so it may be clearer:

                          A CI server like Jenkins is great for running my functional tests and doing the visual comparisons, and also for reviewing the results. But I need a way to dynamically take action according to the results. Meaning: if there is a bug, or in another case a false negative test, I would like to change the configuration of the visual testing somehow. Without a server-side component, I think I would be left to manually move screenshots from one folder to another and manually change various XML files (RushEye configuration) in different places. With the server-side tool (the RushEye web manager), I can view the results in a comfortable way and take action more easily.

                          • 10. Re: Visual testing tool - request for feedback
                            rsmeral

                            At first the word "pattern" seemed completely out of place, but now I see what the author might have meant.

                            Still, it sounds rather odd to me. Perhaps you should ask some experienced English speakers for an opinion.

                            • 11. Re: Visual testing tool - request for feedback
                              pmensik

                              Hi Juraj,

                               

                              I am not going to be as exhaustive as our colleagues, because they have already mentioned the main issues. However, some additional ideas could be:

                               

                              1. keep the history of all builds and eventually provide a way to do a diff between them (possible use case - you might want to see what changed between some major versions)

                              2. sorting and filtering in the list of builds would be really nice (by date, pass/fail, number of diffs since the last build, etc.)

                              3. you could think about keeping just sectors of the images (with some context) instead of saving whole screenshots; this would reduce the upload/download time to/from your application (a small sketch follows this list)

                              4. not sure if I got this right, but you have to upload the images (or data from the build) to your app, right? Instead of doing that, you should also be able to run the app locally and just point it to some directory where the images are stored, to quickly see the results of the differ

                              5. maybe provide some integration between the differ and the source code of the page (like Ron mentioned), with the test results (like a stack trace from the test suite or the application server), and possibly with the source code of the test, but I guess you wouldn't use that feature much
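
                              Regarding point 3, a small sketch of cutting out just the changed region plus some surrounding context, assuming the diff provides the changed rectangle (standard java.awt only, not part of the tool):

                              import java.awt.Rectangle;
                              import java.awt.image.BufferedImage;

                              public class RegionExtractor {

                                  // keeps only the changed rectangle plus "context" pixels around it,
                                  // so a much smaller image can be uploaded instead of the full screenshot
                                  public BufferedImage cropWithContext(BufferedImage screenshot, Rectangle changed, int context) {
                                      int x = Math.max(0, changed.x - context);
                                      int y = Math.max(0, changed.y - context);
                                      int w = Math.min(screenshot.getWidth() - x, changed.width + 2 * context);
                                      int h = Math.min(screenshot.getHeight() - y, changed.height + 2 * context);
                                      return screenshot.getSubimage(x, y, w, h);
                                  }
                              }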