12 Replies Latest reply: Jul 19, 2012 4:22 AM by Gary Brown

BAM demo

Gary Brown Master

Some early work on a BAM infrastructure has been started within the Governance organisation at GitHub, including:

 

1) A gadget-server repo, to provide pluggable visual components that can present different capabilities to users

 

2) The bam repo, providing the ability to collect, analyse and organise activity information

 

The purpose of this post is to describe the functionality that an initial demo should deliver. The demo can then be used to help focus the current work and to illustrate the potential capabilities.

 

  • SwitchYard application - the first step is a simple SwitchYard application that can be monitored to produce performance metrics and analysed against an SLA
  • SwitchYard activity event collector - initially this will use an ExchangeHandler configured with the application, but eventually it may be supported by the SwitchYard infrastructure
  • Activity Server to receive collected activity events
  • Event Processor Network, configured with an example network to:
    • derive service invocation metrics - the information should include service type, operation name, optional fault name, duration (ms)
    • apply SLA rules that check the service metrics to determine whether they violate the contract
  • Result Processing
    • SLA violations should be reported via JMX notifications
    • SLA violations should be stored in an active collection
    • Service metric information should be stored in an active collection
  • Visual presentation
    • Gadget to display list of SLA violations
    • Gadget to display service metrics (more details below)
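
As a rough illustration of the flow in the list above (event, then derived metric, then SLA check, then results stored), here is a minimal sketch in Python. It is not the BAM code: the event field names, the `process` function, the 200 ms limit and the plain lists standing in for active collections are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServiceMetric:
    service_type: str
    operation: str
    fault: Optional[str]  # None for a normal response
    duration_ms: int


@dataclass
class SLAViolation:
    service_type: str
    operation: str
    description: str


def derive_metric(event: dict) -> ServiceMetric:
    """Derive an invocation metric from a (hypothetical) activity event."""
    return ServiceMetric(
        service_type=event["service_type"],
        operation=event["operation"],
        fault=event.get("fault"),
        duration_ms=event["response_ts_ms"] - event["request_ts_ms"],
    )


def check_sla(metric: ServiceMetric, max_duration_ms: int) -> Optional[SLAViolation]:
    """Return a violation if the duration exceeds the (assumed) SLA limit."""
    if metric.duration_ms > max_duration_ms:
        return SLAViolation(
            metric.service_type,
            metric.operation,
            f"duration {metric.duration_ms} ms exceeded limit of {max_duration_ms} ms",
        )
    return None


# Plain lists standing in for the two active collections.
service_metrics: List[ServiceMetric] = []
sla_violations: List[SLAViolation] = []


def process(event: dict, max_duration_ms: int = 200) -> ServiceMetric:
    """Run one event through the sketch pipeline and store the results."""
    metric = derive_metric(event)
    service_metrics.append(metric)
    violation = check_sla(metric, max_duration_ms)
    if violation is not None:
        sla_violations.append(violation)
    return metric
```

In the real demo the violation would also be reported via a JMX notification; that step is left out here.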

 

 

The gadget for displaying service metrics will need to allow a user to customise it to specify the service type, operation name and optionally a fault type. For the demo, this could be handled by simply providing text fields where the user directly enters the relevant values to be used for filtering results. In the final version it should be possible to use RESTful queries to retrieve a valid set of values that can be presented to the user in dropdown lists.

 

The metric information may represent individual durations, but may also represent aggregated data over a time range. If a range is defined, then the duration will represent the average value (being the main line on the graph) with the min and max values representing a region that should be highlighted on the graph (e.g. as a lighter filled background behind the main solid line representing the average).

 

If possible it would also be good to access the SLA violations active collection, and overlay any relevant violations for the selected service type/op/fault onto the graph as a marker associated with the relevant date/time.
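
To make the overlay idea concrete, here is a small sketch of selecting the violations that belong on the graph. The dictionary keys (`service_type`, `operation`, `fault`, `timestamp_ms`) are assumptions for illustration, since the violation object has not been defined yet.

```python
from typing import List, Optional


def violation_markers(
    violations: List[dict],
    service_type: str,
    operation: str,
    fault: Optional[str],
    start_ms: int,
    end_ms: int,
) -> List[int]:
    """Return sorted timestamps of violations that match the selected
    service type/op/fault and fall inside the graph's time range."""
    return sorted(
        v["timestamp_ms"]
        for v in violations
        if v["service_type"] == service_type
        and v["operation"] == operation
        and v.get("fault") == fault
        and start_ms <= v["timestamp_ms"] <= end_ms
    )
```

Each returned timestamp would then be drawn as a marker at the corresponding date/time on the graph.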

  • 1. Re: BAM demo
    Jeff Yu Master

    Gary Brown wrote:

     

    • Visual presentation
      • Gadget to display list of SLA violations
      • Gadget to display service metrics (more details below)

     

    The gadget for displaying service metrics will need to allow a user to customise it to specify the service type, operation name and optionally a fault type. For the demo, this could be handled by simply providing text fields where the user directly enters the relevant values to be used for filtering results. In the final version it should be possible to use RESTful queries to retrieve a valid set of values that can be presented to the user in dropdown lists.

     

    The metric information may represent individual durations, but may also represent aggregated data over a time range. If a range is defined, then the duration will represent the average value (being the main line on the graph) with the min and max values representing a region that should be highlighted on the graph (e.g. as a lighter filled background behind the main solid line representing the average).

     

    I'd like to ask for a bit more detail on these two gadgets, from the gadget designer's perspective. :-)

     

    1. Gadget to display list of SLA violations:

     

    Input: SLA violation RESTful service URL, list size (maximum). What other options do we want to offer?

    Output: JSON representation of the list of results. What would the JSON object graph look like? Do we have this model already?

     

    2. Gadget to display the service metrics:

     

    Input: service metric RESTful service URL, service type, operation name, fault type (optional)

    Output: JSON representation, i.e. the model. I will use a Google chart embedded in the gadget.

     

     

    Regards

    Jeff

  • 2. Re: BAM demo
    Gary Brown Master

    Jeff Yu wrote:

    1. Gadget to display list of SLA violations:

     

    Input: SLA violation RESTful service URL, list size (maximum). What other options do we want to offer?

    Output: JSON representation of the list of results. What would the JSON object graph look like? Do we have this model already?

     

    Input: Although the URL will be required, it would be good if the gadget could be pre-configured with a default when stored in the server, so that an organisation with a well-defined URL for its server does not need each user to specify it. List size would be good; it can be passed to the server as part of the query, to keep the network traffic to a minimum. That is probably it for now.
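
A minimal sketch of what I mean by a pre-configured default plus the list size in the query. The URL and the `maxResults` parameter name are made up for illustration; the actual REST interface is not defined yet.

```python
from urllib.parse import urlencode

# Hypothetical default, pre-configured when the gadget is stored in the server.
DEFAULT_VIOLATIONS_URL = "http://localhost:8080/bam/violations"


def violations_query_url(max_results: int, base_url: str = DEFAULT_VIOLATIONS_URL) -> str:
    """Build the query URL, passing the list size to the server so that
    only the requested number of violations is sent over the network."""
    if max_results < 1:
        raise ValueError("max_results must be at least 1")
    return f"{base_url}?{urlencode({'maxResults': max_results})}"
```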

     

    Output: I don't currently have an object for this, but would require one to output from the EPN nodes. I'll let you know when it is available. For now though it would probably just define the service type, optional operation and a text field to signify the violation.

     

     

    Jeff Yu wrote:

    2. Gadget to display the service metrics:

     

    Input: service metric RESTful service URL, service type, operation name, fault type (optional)

    Output: JSON representation, i.e. the model. I will use a Google chart embedded in the gadget.

     

    Inputs: the fault field would probably be required, but left blank if a normal response is wanted. We need to consider what values should be passed for (1) all metrics for an operation, regardless of whether the response is normal or a fault, (2) all metrics for an operation but only for normal responses, (3) all metrics for an operation with a specified fault. For now, the time range of the data to be displayed, and whether an aggregation period is used, will be controlled at the server, so they are not part of the customisation by the user.
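
One possible convention for the three cases, sketched in Python. The `*` wildcard and the use of an empty string for "normal response only" are assumptions to illustrate the idea, not a decided design.

```python
from typing import Optional

ANY_RESPONSE = "*"  # hypothetical wildcard: normal and fault responses alike


def matches_fault(metric_fault: Optional[str], fault_filter: str) -> bool:
    """Decide whether a metric passes the gadget's fault filter.

    metric_fault is None for a normal response, otherwise the fault name.
    """
    if fault_filter == ANY_RESPONSE:
        return True  # case (1): everything for the operation
    if fault_filter == "":
        return metric_fault is None  # case (2): normal responses only
    return metric_fault == fault_filter  # case (3): one specified fault
```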

     

    Output: Have the start of an object model for this information, but needs to be updated based on some of these requirements - so will let you know when it is available. But essentially it just needs the service type, operation, fault, duration (or average), min and max.

     

    We will need to think about refresh cycles for this graph, as we want it to appear reasonably active. Manual refresh may be ok initially, but ideally for the demo it should be automatic.

     

    Regards

    Gary

  • 3. Re: BAM demo
    Rob Cernich Master

    Hey Gary,

     

    Any plans to expose bits through the AS7 management layer?

     

    Best,

    Rob

  • 4. Re: BAM demo
    Gary Brown Master

    Anything is possible - did you have something specific in mind?

     

    The initial thoughts were providing notifications via JMX, and doing a console integration with the gadget server, to enable different types of information relevant to a particular user to be available. But if there is something AS7 specific that you think is worth considering, then we are open to suggestions.

     

    Once the whole governance project structure has been sorted, we will hopefully have a JIRA set up for the BAM stuff, which will make capturing requirements easier.

     

    Regards

    Gary

  • 5. Re: BAM demo
    Jeff Yu Master

    Rob Cernich wrote:

     

    Hey Gary,

     

    Any plans to expose bits through the AS7 management layer?

     

    Best,

    Rob

     

    I like this idea. The AS7 management layer uses the detyped object model, so it would be great if we could experiment with it early.

     

    Do you have any reports/gadgets in mind?

  • 6. Re: BAM demo
    Rob Cernich Master

    Hey Jeff,

     

    I didn't have anything specific in mind.  My main reason for asking was that the core console can only access information that is presented through the management API.  So, if we ever wanted anything to be hooked into the basic SwitchYard console, it would need to come through the AS7 management layer.

     

    Best,

    Rob

  • 7. Re: BAM demo
    Jeff DeLong Master

    There have been a few threads among SAs etc. about customers / partners looking for service metrics as well as drill-down into individual message flows. This is expressed both in the context of individual services and an ESB, and in the context of composite services and BPM. So think of a management console that allowed users, for example, to start with a composite service, view its process diagram, look at summary-level metrics about the business process (e.g. number of process instances completed yesterday, number of process instances completed yesterday whose completion time was greater than 3 days), drill down to an individual process instance, examine the average completion times of each individual task within the process, examine the service metrics for a particular ServiceTask within the process, then drill down to the individual service for the given process instance and compare its completion time with the average for the service, then look at the message associated with that service instance (all as a way of determining why it took 300 ms to process this message when the average is 45 ms), ...

     

    Not sure how this gets done in a single UI, but it is more or less what people are asking for when they talk about BAM / SAM.

  • 8. Re: BAM demo
    Gary Brown Master

    Hi Jeff

     

    Thanks for the great input.

     

    The aim of the BAM demo initially is just to provide an end to end demo of the current infrastructure being developed, but we also have an intern project recently started to build BPMN2 based activity analysis tools, so some of the requirements you mention will be good to feed into that project.

     

    Regards

    Gary

  • 9. Re: BAM demo
    Jeff Yu Master

    Hi Gary,

     

    Two questions:

     

    1) How are the min and max values computed? Over what range: a day, a week, etc.?

    2) By default, are we showing metrics for all available operations of a specific service?

     

    Gary Brown wrote:

     

    Output: Have the start of an object model for this information, but needs to be updated based on some of these requirements - so will let you know when it is available. But essentially it just needs the service type, operation, fault, duration (or average), min and max.

     

    Regards

    Jeff

  • 10. Re: BAM demo
    Gary Brown Master

    Hi Jeff

    Jeff Yu wrote:

     

    1) How are the min and max values computed? Over what range: a day, a week, etc.?

     

    The time interval is configurable, but basically every period (which will probably be something like 1 second), all the response times will be aggregated to work out the average, min and max. These values are then recorded in a single ResponseTime object associated with that time range.
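
Roughly, the per-period aggregation works like the sketch below. This is illustrative Python rather than the actual implementation; `ResponseTimeSummary` and its field names are stand-ins I have made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class ResponseTimeSummary:
    period_start_ms: int
    average_ms: float
    min_ms: int
    max_ms: int


def aggregate(
    samples: Iterable[Tuple[int, int]], period_ms: int = 1000
) -> List[ResponseTimeSummary]:
    """Group (timestamp_ms, duration_ms) samples into fixed periods and
    record one average/min/max summary per period."""
    buckets = defaultdict(list)
    for timestamp_ms, duration_ms in samples:
        # Align each sample to the start of the period it falls in.
        buckets[timestamp_ms - timestamp_ms % period_ms].append(duration_ms)
    return [
        ResponseTimeSummary(start, sum(durations) / len(durations),
                            min(durations), max(durations))
        for start, durations in sorted(buckets.items())
    ]
```

With a 1 second period, samples at 0 ms and 500 ms end up in one summary, and a sample at 1200 ms starts the next one.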

     

     

    2) By default, are we showing metrics for all available operations of a specific service?

     

    Gary Brown wrote:

     

    Output: Have the start of an object model for this information, but needs to be updated based on some of these requirements - so will let you know when it is available. But essentially it just needs the service type, operation, fault, duration (or average), min and max.

     

     

    Strictly speaking, the metrics should be on an operation basis - but also depends on whether it is a normal or specific fault response. However if the collection contains information for different response times for an operation, or even multiple operations, then we need to consider how this will be displayed in the chart. Either the distinct operations should be represented by separate lines, or somehow represented by a single aggregated line.

     

    I think for now just plot the info (avg, min, max) regardless of the service type/op/fault etc. If the value is in the collection associated with the chart, then just plot it.

     

    If the user wants to isolate a particular operation/fault, then they will need to change the gadget properties to filter out the other information.

     

    One issue with this approach is how a user would know which operation/fault is causing the extreme values. So it might be useful if they can hover over a particular point and get a tooltip style feedback on the service type, op, fault and values associated with that individual point?
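
Something along these lines for the tooltip text, as a sketch only. The point's keys are assumed names, since the object model is still being updated.

```python
def tooltip_text(point: dict) -> str:
    """Format the details of one plotted point for a hover tooltip."""
    fault = point.get("fault") or "none"  # normal responses have no fault
    return (
        f"{point['service_type']} / {point['operation']} / fault: {fault}\n"
        f"avg {point['average']} ms (min {point['min']} ms, max {point['max']} ms)"
    )
```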

     

    Regards

    Gary

  • 11. Re: BAM demo
    Jeff Yu Master

    Gary Brown wrote:

     

    One issue with this approach is how a user would know which operation/fault is causing the extreme values. So it might be useful if they can hover over a particular point and get a tooltip style feedback on the service type, op, fault and values associated with that individual point?

    Good point, I'll add the tooltip text to see how it looks.

     

    Regards

    Jeff

  • 12. Re: BAM demo
    Gary Brown Master

    This demo has been completed and is included in the 1.0.0.M1 release of the BAM component: http://www.jboss.org/overlord/downloads/bam

     

    Information about the BAM architecture, and running the demo, can be found here: http://jboss-overlord.blogspot.co.uk/2012/07/introducing-overlord-bam-and-gadget.html