
3.10.1 Sample applications for programmers

  • PosApp demonstrates how a credit card payment processor might use Striim to generate reports on current transaction activity by merchant and send alerts when transaction counts for a merchant are higher or lower than average for the time of day. This application is explained in greater detail than the other, so you should read about it first. The posApp.tql file is also the more extensively commented of the two.

  • MultiLogApp demonstrates how Striim could be used to monitor and correlate web server and application server logs from the same web application.

The source code and data for the sample applications are installed with the server in …/Striim/Samples.

The alternative PosAppThrottled.tql and MultiLogAppThrottled.tql versions introduce delays in the parsing of the source streams in order to simulate real-time data in the dashboards. See the comments in those TQL files for more information.

PosApp

The PosApp sample application demonstrates how a credit card payment processor might use Striim to generate reports on current transaction activity by merchant and send alerts when transaction counts for a merchant are higher or lower than average for the time of day.

Overview

Before following the instructions below, complete the steps in Configuring your system to install Striim.

In the web UI, from the top left menu, select Apps, click the icon next to PosApp, and select Manage Flow. The flow editor displays a graphical representation of the application flow:

Screen_Shot_2015-09-22_at_4.58.45_PM.png

The following is a simplified diagram of that flow:

simplified_PosApp_diagram.png
Step 1: acquire data

The flow starts with a source:

posapp_02.png

Double-clicking CsvDataSource displays its properties:

posapp_03.png

This is the primary data source for this application. In a real-world application, it would be real-time data. Here, the data comes from a comma-delimited file, posdata.csv. Here are the first two lines of that file:

BUSINESS NAME, MERCHANT ID, PRIMARY ACCOUNT NUMBER, POS DATA CODE, DATETIME, EXP DATE, CURRENCY CODE, AUTH AMOUNT, TERMINAL ID, ZIP, CITY
COMPANY 1,D6RJPwyuLXoLqQRQcOcouJ26KGxJSf6hgbu,6705362103919221351,0,20130312173210,0916,USD,2.20,5150279519809946,41363,Quicksand

In Striim terms, each line of the file is an event, which in many ways is comparable to a row in a SQL database table and can be used in similar ways. Click Advanced to see the DSVParser properties:

posapp_04.png

The True setting for the header property indicates that the first line contains field labels that are not to be treated as data.

The "Output to" stream CsvStream uses the WAEvent type associated with DSVParser:

240_posapp_1_03.png

The only field used by this application is "data", an array containing the delimited fields.

Step 2: filter the data stream
posapp_05.png

CsvDataSource outputs the data to CsvStream, which is the input for the query CsvToPosData:

csvtoposdata.png

This CQ converts the comma-delimited fields from the source into typed fields in a stream that can be consumed by other Striim components. Here, "data" refers to the array mentioned above, and the number in brackets specifies a field from the array, counting from zero. Thus data[1] is MERCHANT ID, data[4] is DATETIME, data[7] is AUTH AMOUNT, and data[9] is ZIP.

TO_STRING, TO_DATEF, and TO_DOUBLE functions cast the fields as the types to be used in the Output to stream. The DATETIME field from the source is converted to both a dateTime value, used as the event timestamp by the application, and (via the DHOURS function) an integer hourValue, which is used to look up historical hourly averages from the HourlyAveLookup cache, discussed below.

The other seven fields are discarded. Thus the first line of data from posdata.csv has at this point been reduced to five values:

  • D6RJPwyuLXoLqQRQcOcouJ26KGxJSf6hgbu (merchantId)

  • 20130312173210 (DateTime)

  • 17 (hourValue)

  • 2.20 (amount)

  • 41363 (zip)
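This reduction can be sketched in Python (an illustrative sketch, not Striim code; the helper name csv_to_pos_data and the dictionary keys are for illustration, mirroring the PosDataStream fields):

```python
from datetime import datetime

def csv_to_pos_data(line):
    # Split the comma-delimited record into a data array, as DSVParser would
    data = line.split(",")
    date_time = datetime.strptime(data[4], "%Y%m%d%H%M%S")
    return {
        "merchantId": data[1],        # MERCHANT ID
        "dateTime": date_time,        # DATETIME
        "hourValue": date_time.hour,  # what DHOURS() extracts
        "amount": float(data[7]),     # AUTH AMOUNT
        "zip": data[9],               # ZIP
    }

row = csv_to_pos_data(
    "COMPANY 1,D6RJPwyuLXoLqQRQcOcouJ26KGxJSf6hgbu,6705362103919221351,"
    "0,20130312173210,0916,USD,2.20,5150279519809946,41363,Quicksand")
```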

The CsvToPosData query outputs the processed data to PosDataStream:

posdatastream.png

PosDataStream assigns the five remaining fields the names and data types in the order listed above:

  • MERCHANT ID to merchantId

  • DATETIME to dateTime

  • the hour extracted from DATETIME to hourValue

  • AUTH AMOUNT to amount

  • ZIP to zip

Step 3: define the data set
posapp_08.png

PosDataStream passes the data to the window PosData5Minutes:

Screen_Shot_2015-09-03_at_12.10.57_PM.png

A window is in many ways comparable to a table in a SQL database, just as the events it contains are comparable to rows in a table. The Mode and Size settings determine how many events the window will contain and how it will be refreshed. With the Mode set to Jumping, this window is refreshed with a completely new set of events every five minutes. For example, if the first five-minute set of events runs from 1:00 pm through 1:05 pm, then the next set will run from 1:05 through 1:10, and so on. (If the Mode were set to Sliding, the window would continuously add new events and drop old ones so as to always contain the events of the most recent five minutes.)

Step 4: process and enhance the data
posapp_10.png

The PosData5Minutes window sends each five-minute set of data to the GenerateMerchantTxRateOnly query. As you can see from the following schema diagram, this query is fairly complex:

311GenerateMerchantTxRateOnly.png

The GenerateMerchantTxRateOnly query combines data from the PosData5Minutes event stream with data from the HourlyAveLookup cache. A cache is similar to a source, except that the data is static rather than real-time. In the real world, this data would come from a periodically updated table in the payment processor's system containing historical averages of the number of transactions processed for each merchant for each hour of each day of the week (168 values per merchant). In this sample application, the source is a file, hourlyData.txt, which to simplify the sample data set has only 24 values per merchant, one for each hour in the day.

For each five-minute set of events received from the PosData5Minutes window, the GenerateMerchantTxRateOnly query outputs one event for each merchantID found in the set to MerchantTxRateOnlyStream, which applies the MerchantTxRate type. The easiest way to summarize what is happening in the above diagram is to describe where each of the fields in MerchantTxRateOnlyStream comes from:

merchantId
  the merchantID field from PosData5Minutes
  TQL: SELECT p.merchantID

zip
  the zip field from PosData5Minutes
  TQL: SELECT ... p.zip

startTime
  the dateTime field of the first event for the merchantID in the five-minute set from PosData5Minutes
  TQL: SELECT ... FIRST(p.dateTime)

count
  the count of events for the merchantID in the five-minute set from PosData5Minutes
  TQL: SELECT ... COUNT(p.merchantID)

totalAmount
  the sum of the amount field values for the merchantID in the five-minute set from PosData5Minutes
  TQL: SELECT ... SUM(p.amount)

hourlyAve
  the hourlyAve value for the current hour from HourlyAveLookup, divided by 12 to give the five-minute average
  TQL: SELECT ... l.hourlyAve / 12 ...
       WHERE ... p.hourValue = l.hourValue

upperLimit
  the five-minute average (hourlyAve / 12), multiplied by 1.5 if that average is 200 or less, 1.25 if it is between 201 and 800, 1.2 if it is between 801 and 10,000, or 1.15 if it is over 10,000
  TQL: SELECT ... l.hourlyAve / 12 * CASE
         WHEN l.hourlyAve / 12 > 10000 THEN 1.15
         WHEN l.hourlyAve / 12 > 800 THEN 1.2
         WHEN l.hourlyAve / 12 > 200 THEN 1.25
         ELSE 1.5 END

lowerLimit
  the five-minute average (hourlyAve / 12), divided by the same factor: 1.5 if that average is 200 or less, 1.25 if it is between 201 and 800, 1.2 if it is between 801 and 10,000, or 1.15 if it is over 10,000
  TQL: SELECT ... l.hourlyAve / 12 / CASE
         WHEN l.hourlyAve / 12 > 10000 THEN 1.15
         WHEN l.hourlyAve / 12 > 800 THEN 1.2
         WHEN l.hourlyAve / 12 > 200 THEN 1.25
         ELSE 1.5 END

category, status
  placeholders for values to be added
  TQL: SELECT ... '<NOTSET>'
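The limit arithmetic can be condensed into a short Python sketch (illustrative only, not Striim code; the function name tx_limits is an assumption):

```python
def tx_limits(hourly_ave):
    """Five-minute average and upper/lower bounds, mirroring the CASE expressions."""
    five_min_ave = hourly_ave / 12
    if five_min_ave > 10000:
        factor = 1.15   # very high volume: tight bounds
    elif five_min_ave > 800:
        factor = 1.2
    elif five_min_ave > 200:
        factor = 1.25
    else:
        factor = 1.5    # low volume: loose bounds
    return five_min_ave, five_min_ave * factor, five_min_ave / factor
```

Note that higher-volume merchants get tighter bounds, so the same relative deviation is more likely to trigger an alert.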

The MerchantTxRateOnlyStream passes this output to the GenerateMerchantTxRateWithStatus query, which populates the category and status fields by evaluating the count, upperLimit, and lowerLimit fields:

SELECT merchantId,
  zip,
  startTime,
  count,
  totalAmount,
  hourlyAve,
  upperLimit,
  lowerLimit,
    CASE
      WHEN count > 10000 THEN 'HOT'
      WHEN count > 800 THEN 'WARM'
      WHEN count > 200 THEN 'COOL'
      ELSE 'COLD' END,
    CASE
      WHEN count > upperLimit THEN 'TOOHIGH'
      WHEN count < lowerLimit THEN 'TOOLOW'
      ELSE 'OK' END
FROM MerchantTxRateOnlyStream
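The two CASE expressions amount to the following classification logic (a Python paraphrase, not Striim code):

```python
def categorize(count, upper_limit, lower_limit):
    """Mirror of the two CASE expressions in GenerateMerchantTxRateWithStatus."""
    if count > 10000:
        category = "HOT"
    elif count > 800:
        category = "WARM"
    elif count > 200:
        category = "COOL"
    else:
        category = "COLD"
    if count > upper_limit:
        status = "TOOHIGH"
    elif count < lower_limit:
        status = "TOOLOW"
    else:
        status = "OK"
    return category, status
```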

The category values are used by the Dashboard to color-code the map points. The status values are used by the GenerateAlerts query.

The output from the GenerateMerchantTxRateWithStatus query goes to MerchantTxRateWithStatusStream.

Step 5: populate the dashboard
posapp_10.png

The GenerateWactionContent query enhances the data from MerchantTxRateWithStatusStream with the merchant's company, city, state, and zip code, and the latitude and longitude to position the merchant on the map, then populates the MerchantActivity WActionStore:

220_GenerateWactionContext.png

In a real-world application, the data for the NameLookup cache would come from a periodically updated table in the payment processor's system, but the data for the ZipLookup cache might come from a file such as the one used in this sample application.

When the application finishes processing all the test data, the WActionStore will contain 423 WActions, one for each merchant. Each WAction includes the merchant's context information (MerchantId, StartTime, CompanyName, Category, Status, Count, HourlyAve, UpperLimit, LowerLimit, Zip, City, State, LatVal, and LongVal) and all events for that merchant from the MerchantTxRateWithStatusStream (merchantId, zip, startTime, count, totalAmount, hourlyAve, upperLimit, lowerLimit, category, and status for each of 40 five-minute blocks). This data is used to populate the dashboard, as detailed in PosAppDash.

Step 6: trigger alerts
posapp_11.png

MerchantTxRateWithStatusStream sends the detailed event data to the GenerateAlerts query, which triggers alerts based on the Status value:

311GenerateAlerts.png

When a merchant's status changes to TOOLOW or TOOHIGH, Striim will send an alert such as, "WARNING - alert from Striim - POSUnusualActivity - 2013-12-20 13:55:14 - Merchant Urban Outfitters Inc. count of 12012 is below lower limit of 13304.347826086958." The "raise" value for the flag field instructs the subscription not to send another alert until the status returns to OK.

When the status returns to OK, Striim will send an alert such as, "INFO - alert from Striim - POSUnusualActivity - 2013-12-20 14:02:27 - Merchant Urban Outfitters Inc. count of 15853 is back between 13304.347826086958 and 17595.0." The "cancel" value for the flag field instructs the subscription to send an alert the next time the status changes to TOOLOW or TOOHIGH. See Sending alerts from applications for more information on info, warning, raise, and cancel.
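The raise/cancel behavior can be modeled as a small state machine; this Python sketch is an analogy, not Striim's implementation:

```python
def alert_filter(events):
    """Deliver a 'raise' alert once per key, then nothing until a 'cancel' resets it."""
    raised = set()      # keys with an outstanding raise
    delivered = []
    for key, flag, message in events:
        if flag == "raise" and key not in raised:
            raised.add(key)
            delivered.append(message)
        elif flag == "cancel" and key in raised:
            raised.discard(key)
            delivered.append(message)
    return delivered
```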

PosAppDash

Screen_Shot_2016-01-05_at_1.03.49_PM.png

See Viewing dashboards for instructions on getting to the PosAppDash Main page shown above.

The following discussion gives a brief overview of the underlying queries and settings for some of the visualizations in this dashboard. For more details, see Dashboard Guide. To explore the settings yourself, click the Edit button at the top right.

Main page vector map
Screen_Shot_2016-01-05_at_12.24.56_PM.png

To view a query for a visualization, click Edit Query. When done click x to exit.

image4.png

The basic query for this map is SELECT * FROM MerchantActivity ORDER BY StartTime, which gets all fields from the MerchantActivity WActionStore. The ORDER BY clause ensures that the map looks the same every time you run the application.

To view the properties for a visualization, click Configure. When done click x to exit.

image8.png

The fields are assigned to map properties as follows:

Screen_Shot_2016-01-05_at_12.27.22_PM.png

Data Retention Type = Current means the map shows the latest events for each map point. Group By CompanyName means each company gets a single dot on the map. Color By Category means the colors are based on the Category field value (more about that below). The Longitude and Latitude properties are set to the fields with the coordinate values. Value = Count means that the size of the dot on the map varies according to the number of transactions in the company's latest event.

The colors of the map points are set manually:

Screen_Shot_2016-01-05_at_12.31.21_PM.png

If the map had a legend, it would use the alias strings:

Screen_Shot_2016-01-05_at_12.37.21_PM.png

See Vector map for more information about these settings.

Screen_Shot_2016-01-05_at_12.23.13_PM.png

The map's title is a "value" visualization, which can add text, query results, and almost any valid HTML to a dashboard. Its basic query is select count(distinct(MerchantId)) as mcount from MerchantActivity, which returns the number of merchants displayed on the map (currently 423). The underlying code is:

<div style="padding: 5px; font-weight: 100; font-size: 1.5em">Latest event count for 
{{ mcount }} merchants (map and scatter plot)</div>

See Value (text label) for more information.

Main page scatter plot
scatterAll.png

The scatter plot uses the same basic query and Group By CompanyName setting as the map plus SAMPLE BY Count MAXLIMIT 500. The Data Retention Type is All, so it shows a range of Count values over time for each company. If you changed Data Retention Type to Current, it would show only the most recent Count value for each company (the same data as on the map):

scatterCurrent.png
Main page bar chart
Screen_Shot_2016-01-05_at_4.02.14_PM.png

The basic query for this bar chart is select sum(Count) as Count, State from MerchantActivity group by State order by sum(Count) desc limit 10, which returns the top ten total counts (for all merchants) by state. For more information, see Bar chart.

Main page heat map
Screen_Shot_2016-01-05_at_4.47.53_PM.png

This represents the total counts for 12 combinations of Status and Category, with blue representing the lowest counts and red the highest. See Heat map for a discussion of the query and settings.

Company details page line chart
Screen_Shot_2016-01-06_at_12.42.04_PM.png

If you are in edit mode, click Done to stop editing. Then, in the map, click on the big red dot for Recreational Equipment in California to drill down to the "Company details" page.

Screen_Shot_2016-01-06_at_11.03.15_AM.png

The green line represents the average count, red and orange represent the upper and lower bounds, and blue is the count itself. You can see that between 7:00 and 7:30 the count dropped below the lower bound, which means its current status is TOOLOW.

The basic query is select CompanyName, Count, UpperLimit, LowerLimit, HourlyAve, StartTime, City, State, Zip, MerchantId, Status, Category from MerchantActivity. The queries also include where MerchantID=:mid, which limits the results to the company you drilled down on (see Creating links between pages). Data Retention Type is All because we’re plotting the values over time. Group By is blank because we are not grouping the data. Color By is blank because colors are set manually as for the map.

Screen_Shot_2016-01-06_at_11.09.49_AM.png

The horizontal axis for all series is StartTime. The vertical axis uses a different field for each of the four series (Count, UpperLimit, HourlyAve, and LowerLimit). Above, series 1 (Count, the blue line) is selected. The Series Alias defines the label for the legend. To change settings for another series, click its number in the selector at upper right.

Screen_Shot_2016-01-06_at_11.17.41_AM.png

Here we see the settings for series 2 (UpperLimit, the red line).

For information on the other visualizations on this page, see Gauge, Icon, and Table.

Interactive HeatMap page donut charts
2016-01-06_13-20-11.png

To return to the main page, click PosAppDash.

2016-01-06_13-21-49.png

Then, in the heat map title, click "click here for additional visualizations."

Screen_Shot_2016-01-06_at_1.33.53_PM.png

The basic query for both donut charts is SELECT count(*) AS Count, Status, Category FROM MerchantActivity.

Screen_Shot_2016-01-06_at_1.28.37_PM.png

The left chart (settings shown above) shows the total Count for each Status (OK, TOOLOW, TOOHIGH). The right chart shows the total Count for each Category (COLD, COOL, WARM, HOT).

See Making visualizations interactive for discussion of how clicking segments in the donut charts filters the data for the page's vector map and heat map.

MultiLogApp

This sample application shows how Striim could be used to monitor and correlate web server and application server logs from the same web application. The following is a relatively high-level explanation. For a more in-depth examination of a sample application with more detail about the components and how they interact, see PosApp.

MultiLogApp contains 12 flows that analyze the data from one or both logs and take appropriate actions:

  • MonitorLogs parses the log files to create two event streams (AccessStream for access log events and Log4JStream for application log events) used by the other flows. See the detailed discussion below.

  • ErrorsAndWarnings selects application log error and warning messages for use by the ErrorHandling and WarningHandling flows, and creates a sliding window containing the 300 most recent errors and warnings for use by the LargeRTCheck and ZeroContentCheck flows, which join it with web server data.

The following flows send alerts regarding the web server logs and populate the dashboard's Overview page world map and the Detail - UnusualActivity page:

  • HackerCheck sends an alert when an access log srcIp value is on a blacklist.

  • LargeRTCheck sends an alert when an access log responseTime value exceeds 2000 microseconds.

  • ProxyCheck sends an alert when an access log srcIP value is on a list of suspicious proxies.

  • ZeroContentCheck sends an alert when an access log entry's code value is 200 (that is, the HTTP request succeeded) but the size value is 0 (the return had no content).

The following flows send alerts regarding the application server log and populate the dashboard's Overview page pie chart and API detail pages:

  • ErrorHandling sends an alert when an error message appears in the application server log.

  • WarningHandling sends an alert once an hour with the count of warnings for each API call for which there has been at least one warning.

  • InfoFlow joins application log events with user information from the MLogUserLookup cache to create the ApiEnrichedStream used by ApiFlow, CompanyApiFlow, and UserApiFlow.

  • ApiFlow populates the Detail - ApiActivity page.

  • CompanyApiFlow populates the Detail - CompanyApiActivity page and the bar chart on the Overview page. It also sends an alert when an API call is used by a company more than 1500 times during the flow's one-hour jumping window.

  • UserApiFlow populates the dashboard's Detail - UserApiActivity page and the US map on the Overview page. It also sends an alert when an API call is used by a user more than 125 times during the flow's one-hour window.

MonitorLogs: web server log data

The web server logs are in Apache NCSA extended/combined log format plus response time:

"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\" %D"

(See apache.org for more information.) Here are four sample log entries:

216.103.201.86 - EHernandez [10/Feb/2014:12:13:51.037 -0800]  "GET http://cloud.saas.me/login&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1" 200 21560 "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)" 1606
216.103.201.86 - EHernandez [10/Feb/2014:12:13:52.487 -0800]  "GET http://cloud.saas.me/create?type=Partner&id=01e3928f-e05a-9be1-bdc5-14109fcf2383&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1" 200 63523 "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)" 1113
216.103.201.86 - EHernandez [10/Feb/2014:12:13:52.543 -0800]  "GET http://cloud.saas.me/query?type=ChatterMessage&id=01e3928f-e05a-9be2-bdc5-14109fcf2383&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1" 200 46556 "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)" 1516
216.103.201.86 - EHernandez [10/Feb/2014:12:13:52.578 -0800]  "GET http://cloud.saas.me/retrieve?type=ContractHistory&id=01e3928f-e05a-9be3-bdc5-14109fcf2383&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1" 200 44556 "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)" 39
Screen_Shot_2016-01-07_at_10.42.42_AM.png

In MultiLogApp, these logs are read by AccessLogSource:

CREATE SOURCE AccessLogSource USING FileReader (
  directory:'Samples/MultiLogApp/appData',
  wildcard:'access_log',
  blocksize: 10240,
  positionByEOF:false
)
PARSE USING DSVParser (
  columndelimiter:' ',
  ignoreemptycolumn:'Yes',
  quoteset:'[]~"',
  separator:'~'
)
OUTPUT TO RawAccessStream;

The log format is space-delimited, so the columndelimiter value is one space. With these quoteset and separator values, both square brackets and double quotes are recognized as delimiting strings that may contain spaces. With these settings, the first log entry above is output as a WAEvent data array with the following values:

"216.103.201.86",
"-",
"EHernandez",
"10/Feb/2014:12:13:51.037 -0800",
"GET http://cloud.saas.me/login&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1",
"200",
"21560",
"-",
"Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)",
"1606"

This in turn is processed by the ParseAccessLog CQ:

CREATE CQ ParseAccessLog 
INSERT INTO AccessStream
SELECT data[0],
  data[2],
  MATCH(data[4], ".*jsessionId=(.*) "),
  TO_DATE(data[3], "dd/MMM/yyyy:HH:mm:ss.SSS Z"),
  data[4],
  TO_INT(data[5]),
  TO_INT(data[6]),
  data[7],
  data[8],
  TO_INT(data[9])
FROM RawAccessStream;

After the AccessLogEntry type is applied, the event looks like this:

srcIp: "216.103.201.86"
userId: "EHernandez"
sessionId: "01e3928f-e059-6361-bdc5-14109fcf2383"
accessTime: 1392063231037
request: "GET http://cloud.saas.me/login&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1"
code: 200
size: 21560
referrer: "-"
userAgent: "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"
responseTime: 1606

The web server log data is now in a format that Striim can analyze. AccessStream is used by the HackerCheck, LargeRTCheck, ProxyCheck, and ZeroContentCheck flows.

MonitorLogs: application server log data

The application server logs are in Apache's Log4J format. (Log4J is a standard Java logging framework used by many web-based applications.) In a real-world implementation, this application could be reading many log files on many hosts. Here is a sample message:

<log4j:event logger="SaasApplication" timestamp="1392067731765" level="ERROR" thread="main">
<log4j:message><![CDATA[Problem in API call [api=login] [session=01e3928f-e975-ffd4-bdc5-14109fcf2383] [user=HGonzalez] [sobject=User]]]></log4j:message>
<log4j:throwable><![CDATA[com.me.saas.SaasMultiApplication$SaasException: Problem in API call [api=login] [session=01e3928f-e975-ffd4-bdc5-14109fcf2383] [user=HGonzalez] [sobject=User]
    at com.me.saas.SaasMultiApplication.login(SaasMultiApplication.java:1253)
    at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.me.saas.SaasMultiApplication$UserApiCall.invoke(SaasMultiApplication.java:360)
    at com.me.saas.SaasMultiApplication$Session.login(SaasMultiApplication.java:1447)
    at com.me.saas.SaasMultiApplication.main(SaasMultiApplication.java:1587)
]]></log4j:throwable>
<log4j:locationInfo class="com.me.saas.SaasMultiApplication" method="login" file="SaasMultiApplication.java" line="1256"/>
</log4j:event>
Screen_Shot_2016-01-07_at_10.44.12_AM.png

Log4JSource retrieves data from …/Striim/Samples/MultiLogApp/appData/log4jLog.xml. This file contains around 1.45 million errors, warnings, and informational messages. The XMLParser portion of Log4JSource specifies the portions of the message that will be used by this application:

CREATE SOURCE Log4JSource USING FileReader (
  directory:'Samples/MultiLogApp/appData',
  wildcard:'log4jLog.xml',
  positionByEOF:false
) 
PARSE USING XMLParser(
  rootnode:'/log4j:event',
  columnlist:'log4j:event/@timestamp,
    log4j:event/@level,
    log4j:event/log4j:message,
    log4j:event/log4j:throwable,
    log4j:event/log4j:locationInfo/@class,
    log4j:event/log4j:locationInfo/@method,
    log4j:event/log4j:locationInfo/@file,
    log4j:event/log4j:locationInfo/@line'
)
OUTPUT TO RawXMLStream;

For example, for the sample log message above, log4j:event/@level returns ERROR and log4j:event/log4j:locationInfo/@line returns 1256. These elements are output as a WAEvent data array with the following values:

"1392067731765",
"ERROR",
"Problem in API call [api=login] [session=01e3928f-e975-ffd4-bdc5-14109fcf2383] [user=HGonzalez] [sobject=User]","com.me.saas.SaasMultiApplication$SaasException: Problem in API call [api=login] [session=01e3928f-e975-ffd4-bdc5-14109fcf2383] [user=HGonzalez] [sobject=User]\n\tat com.me.saas.SaasMultiApplication.login(SaasMultiApplication.java:1253)\n\tat sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:606)\n\tat com.me.saas.SaasMultiApplication$UserApiCall.invoke(SaasMultiApplication.java:360)\n\tat com.me.saas.SaasMultiApplication$Session.login(SaasMultiApplication.java:1447)\n\tat com.me.saas.SaasMultiApplication.main(SaasMultiApplication.java:1587)",
"com.me.saas.SaasMultiApplication",
"login",
"SaasMultiApplication.java",
"1256"

This array in turn is processed by the ParseLog4J CQ:

CREATE CQ ParseLog4J
INSERT INTO Log4JStream
SELECT TO_DATE(TO_LONG(data[0])),
  data[1],
  data[2], 
  MATCH(data[2], '\\\\[api=([a-zA-Z0-9]*)\\\\]'),
  MATCH(data[2], '\\\\[session=([a-zA-Z0-9\\-]*)\\\\]'),
  MATCH(data[2], '\\\\[user=([a-zA-Z0-9\\-]*)\\\\]'),
  MATCH(data[2], '\\\\[sobject=([a-zA-Z0-9]*)\\\\]'),
  data[3],
  data[4],
  data[5],
  data[6],
  data[7]
FROM RawXMLStream;

The elements in the array are numbered from zero, so data[0] returns the timestamp, data[1] returns the level, and so on. The MATCH functions use regular expressions to return the api, session, user, and sobject portions of the message string. (See Using regular expressions (regex) for discussion of the multiple escapes for [ and ] in the regular expressions.) After processing by the CQ, the event looks like this:

logTime: 1392067731765
level: "ERROR"
message: "Problem in API call [api=login] [session=01e3928f-e975-ffd4-bdc5-14109fcf2383] [user=HGonzalez] [sobject=User]"
api: "login"
sessionId: "01e3928f-e975-ffd4-bdc5-14109fcf2383"
userId: "HGonzalez"
sobject: "User"
xception: "com.me.saas.SaasMultiApplication$SaasException: Problem in API call [api=login] [session=01e3928f-e975-ffd4-bdc5-14109fcf2383] [user=HGonzalez] [sobject=User]\n\tat com.me.saas.SaasMultiApplication.login(SaasMultiApplication.java:1253)\n\tat sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:606)\n\tat com.me.saas.SaasMultiApplication$UserApiCall.invoke(SaasMultiApplication.java:360)\n\tat com.me.saas.SaasMultiApplication$Session.login(SaasMultiApplication.java:1447)\n\tat com.me.saas.SaasMultiApplication.main(SaasMultiApplication.java:1587)"
className: "com.me.saas.SaasMultiApplication"
method: "login"
fileName: "SaasMultiApplication.java"
lineNum: "1256"

The application server log data is now in a format that Striim can analyze. Log4JStream is used by the ErrorsAndWarnings and InfoFlow flows.
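TQL's MATCH can be paraphrased in Python, where the brackets need only single escapes (illustrative sketch; the helper function match here is not a Striim API):

```python
import re

def match(value, pattern):
    """Like TQL's MATCH: return the first capture group, or None if no match."""
    m = re.search(pattern, value)
    return m.group(1) if m else None

msg = ("Problem in API call [api=login] "
       "[session=01e3928f-e975-ffd4-bdc5-14109fcf2383] "
       "[user=HGonzalez] [sobject=User]")
api = match(msg, r"\[api=([a-zA-Z0-9]*)\]")
user = match(msg, r"\[user=([a-zA-Z0-9\-]*)\]")
```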

ErrorsAndWarnings
Screen_Shot_2016-01-07_at_10.46.50_AM.png

The GetLog4JErrorWarning CQ filters Log4JStream, selecting only WARN and ERROR messages and discarding all others.

CREATE CQ GetLog4JErrorWarning
INSERT INTO Log4ErrorWarningStream
SELECT l FROM Log4JStream l
WHERE l.level = 'ERROR' OR l.level = 'WARN';

Log4ErrorWarningStream is used by the ErrorHandling and WarningHandling flows and by the Log4JErrorWarningActivity sliding window, which contains the most recent 300 events:

CREATE WINDOW Log4JErrorWarningActivity 
OVER Log4ErrorWarningStream KEEP 300 ROWS;

This window is used by the LargeRTCheck and ZeroContentCheck flows.
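A KEEP 300 ROWS sliding window behaves like a fixed-size buffer, which can be illustrated in Python (an analogy, not Striim's implementation):

```python
from collections import deque

# A count-based sliding window: appending the 301st event silently
# evicts the oldest, so the buffer always holds the 300 most recent.
window = deque(maxlen=300)
for event in range(350):
    window.append(event)
```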

HackerCheck
Screen_Shot_2016-01-07_at_10.50.08_AM.png

This flow sends an alert when an access log srcIp value is on a blacklist. The BlackListLookup cache contains the blacklist:

CREATE CACHE BlackListLookup using FileReader (
  directory: 'Samples/MultiLogApp/appData',
  wildcard: 'multiLogBlackList.txt'
)
PARSE USING DSVParser ( )
QUERY (keytomap:'ip') OF IPEntry;

The FindHackers CQ selects access log events that match a blacklist entry:

CREATE CQ FindHackers
INSERT INTO HackerStream
SELECT ale 
FROM AccessStream ale, BlackListLookup bll
WHERE ale.srcIp = bll.ip;
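The stream-cache join performed by FindHackers is conceptually a set-membership filter; in this Python sketch the IP addresses are made-up examples (the real cache loads multiLogBlackList.txt):

```python
# Hypothetical blacklist entries for illustration only
blacklist = {"203.0.113.7", "198.51.100.23"}

def find_hackers(access_events):
    # Stream-cache join: keep only events whose srcIp is on the blacklist
    return [e for e in access_events if e["srcIp"] in blacklist]
```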

The SendHackingAlerts CQ sends an alert for each such event:

CREATE CQ SendHackingAlerts 
INSERT INTO HackingAlertStream 
SELECT 'HackingAlert', ''+accessTime, 'warning', 'raise',
  'Possible Hacking Attempt from ' + srcIp + ' in ' + IP_COUNTRY(srcIp)
FROM HackerStream;

CREATE SUBSCRIPTION HackingAlertSub 
USING WebAlertAdapter( ) 
INPUT FROM HackingAlertStream;

This flow also creates the UnusualActivity WActionStore that populates various charts and tables on the dashboard:

CREATE TYPE UnusualContext (
    typeOfActivity String,
    accessTime DateTime,
    accessSessionId String,
    srcIp String KEY,
    userId String,
    country String,
    city String,
    lat double,
    lon double
);
CREATE WACTIONSTORE UnusualActivity 
CONTEXT OF UnusualContext ...

The GenerateHackerContext CQ populates UnusualActivity:

CREATE CQ GenerateHackerContext
INSERT INTO UnusualActivity
SELECT 'HackAttempt', accessTime, sessionId, srcIp, userId,
  IP_COUNTRY(srcIp), IP_CITY(srcIP), IP_LAT(srcIP), IP_LON(srcIP)
FROM HackerStream
LINK SOURCE EVENT;

'HackAttempt' is a literal string that identifies the type of activity, distinguishing events from this flow from those generated by the three other flows that populate UnusualActivity.

LargeRTCheck
Screen_Shot_2016-01-07_at_10.53.00_AM.png

LargeRTCheck sends an alert whenever an access log responseTime value exceeds 2000 microseconds.

CREATE CQ FindLargeRT
INSERT INTO LargeRTStream
SELECT ale
FROM AccessStream ale
WHERE ale.responseTime > 2000;

The alert code is similar to HackerCheck's.

The typeOfActivity string for events written to the UnusualActivity WActionStore is LargeResponseTime.

ProxyCheck
Screen_Shot_2016-01-07_at_10.55.44_AM.png

ProxyCheck sends an alert when an access log srcIP value is on a list of suspicious proxies. This works exactly like HackerCheck but with a different list. The typeOfActivity string for events written to the UnusualActivity WActionStore is ProxyAccess.

ZeroContentCheck
Screen_Shot_2016-01-07_at_10.59.41_AM.png

ZeroContentCheck sends an alert when an access log entry's code value is 200 (that is, the HTTP request succeeded) but the size value is 0 (the return had no content).

CREATE CQ FindZeroContent
INSERT INTO ZeroContentStream
SELECT ale
FROM AccessStream ale
WHERE ale.code = 200 AND ale.size = 0;

The alert code is similar to HackerCheck's.

The typeOfActivity string for events written to the UnusualActivity WActionStore is ZeroContent.

ErrorHandling
Screen_Shot_2016-01-07_at_11.02.22_AM.png

This flow sends an alert immediately when an error appears in Log4ErrorWarningStream.

CREATE CQ GetErrors 
INSERT INTO ErrorStream 
SELECT log4j 
FROM Log4ErrorWarningStream log4j WHERE log4j.level = 'ERROR';

CREATE CQ SendErrorAlerts 
INSERT INTO ErrorAlertStream 
SELECT 'ErrorAlert', ''+logTime, 'error', 'raise', 'Error in log ' + message 
FROM ErrorStream;

GetErrors discards all WARN messages and passes only ERROR messages. In SendErrorAlerts, since the key value is logTime (which is different for every event) and the flag is raise (see Sending alerts from applications), an alert will be sent for every message in ErrorStream.

WarningHandling
Screen_Shot_2016-01-07_at_11.03.16_AM.png

This flow sends an alert once an hour with the count of warnings for each API call for which there has been at least one warning. The following code creates a one-hour jumping window of application log warning messages:

CREATE CQ GetWarnings 
INSERT INTO WarningStream 
SELECT log4j 
FROM Log4ErrorWarningStream log4j WHERE log4j.level = 'WARN';

CREATE JUMPING WINDOW WarningWindow 
OVER WarningStream KEEP WITHIN 60 MINUTE ON logTime;

The HAVING clause in the SendWarningAlerts CQ filters out API calls that have had no warnings.

CREATE CQ SendWarningAlerts 
INSERT INTO WarningAlertStream 
SELECT 'WarningAlert', ''+logTime, 'warning', 'raise', 
        COUNT(logTime) + ' Warnings in log for api ' + api 
FROM WarningWindow 
GROUP BY api 
HAVING count(logTime) > 1;
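The GROUP BY and HAVING logic can be paraphrased in Python (illustrative only; the field name api follows the Log4JStream events above):

```python
from collections import Counter

def warning_counts(warnings):
    """Per-api warning counts, keeping only apis where count(logTime) > 1."""
    counts = Counter(w["api"] for w in warnings)
    return {api: n for api, n in counts.items() if n > 1}
```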
InfoFlow, APIFlow, CompanyApiFlow, and UserApiFlow

InfoFlow joins application log INFO events with user information from the MLogUserLookup cache:

CREATE CQ GetInfo 
INSERT INTO InfoStream 
SELECT log4j 
FROM Log4JStream log4j WHERE log4j.level = 'INFO';

CREATE CQ GetUserDetails 
INSERT INTO ApiEnrichedStream 
SELECT a.userId, a.api, a.sobject, a.logTime,
  u.userName, u.company, u.userZip, u.companyZip 
FROM InfoStream a, MLogUserLookup u 
WHERE a.userId = u.userId;
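The lookup-cache enrichment is conceptually a keyed dictionary join; in this Python sketch the user record values are hypothetical:

```python
# Hypothetical MLogUserLookup rows, keyed by userId for O(1) lookup
users = {
    "HGonzalez": {"userName": "H. Gonzalez", "company": "Example Corp",
                  "userZip": "94104", "companyZip": "94105"},
}

def enrich(event):
    """Join an INFO event with its user record; drop events with no match."""
    user = users.get(event["userId"])
    return {**event, **user} if user else None
```

As in the TQL inner join, events whose userId has no cache entry produce no output.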

Otherwise this portion of the application is generally similar to PosApp. ApiFlow, CompanyApiFlow, and UserApiFlow populate dashboard charts and send alerts as described in the summary above.

MultiLogDash

This dashboard includes only visualization types previously discussed in PosAppDash. There are numerous examples of tables that may be instructive.
