Note: if you haven't already, read the Log Parsing, Analysis, Correlation, and Reporting Engine post first.
The access log is a great source of information (for troubleshooting, performance analysis, user trend reporting, etc.) because it records every request processed by the Apache web server. The CustomLog and LogFormat directives control what information is captured in the access log. Visit the Apache site (https://httpd.apache.org/docs/2.4/logs.html#accesslog) for more information about the access log.
The log parser discussed here is written to parse an access_log generated with the following log format:
LogFormat "%h %l %u %t \"%r\" %>s %b JSESSIONID=\"%{JSESSIONID}C\" UID=\"%{UID}C\" %D %I %O \"%{User-agent}i\" %v" common
Note: if your access_log is generated with a different LogFormat, you may need to tweak the script a little.
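To make the field layout concrete, here is a minimal sketch of how one line written with the above LogFormat could be split using awk. It is not taken from webAccessLogParser.sh: the helper name parse_access_line and the sample line are hypothetical. It also assumes there are no escaped double quotes inside the request line or the User-agent value; Apache writes such quotes as \", which would shift the quote-delimited fields.

```shell
#!/bin/sh
# Hypothetical helper, not part of webAccessLogParser.sh.
# Splits one access_log line (written with the LogFormat above) on double
# quotes, then splits the unquoted pieces on whitespace.
# Assumes no escaped quotes (\") inside the request line or User-agent.
parse_access_line() {
  printf '%s\n' "$1" | awk -F'"' '
    {
      split($1, head, " ")   # %h %l %u [date zone]
      split($3, mid, " ")    # %>s %b JSESSIONID=
      split($7, tail, " ")   # %D %I %O
      split($9, last, " ")   # %v

      # Rebuild the timestamp without the surrounding [ and ].
      ts = substr(head[4], 2) " " substr(head[5], 1, length(head[5]) - 1)

      # host|timestamp|request|status|bytes|uid|micros|user-agent|vhost
      printf "%s|%s|%s|%s|%s|%s|%s|%s|%s\n",
             head[1], ts, $2, mid[1], mid[2], $6, tail[1], $8, last[1]
    }'
}

# Made-up sample line, for illustration only.
sample='192.0.2.10 - - [13/Jun/2015:10:32:04 -0400] "GET /app/home HTTP/1.1" 200 5120 JSESSIONID="0000abc:-1" UID="jdoe" 48213 412 5380 "Mozilla/5.0" www.example.com'
parse_access_line "$sample"
```

Splitting on the double quote first keeps the request line and the User-agent intact even though both contain spaces. The seventh output field is %D, the time taken to serve the request in microseconds.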
Finding log files: currently the parser finds all files named access_log in the given path if
$recDate == $currDate
or all files named access_log.$rec0MM$rec0DD$recYY if
$recDate < $currDate.
Where:
recDate: Optional. The log entry date; only log entries with that date are processed. The format is 'YYYY-MM-DD'. It defaults to the current date. However, if 'daily' is given as the 2nd argument and no log entry date is provided, it defaults to the current date minus one day.
currDate: The current date, that is, the date on which the parser runs. It is compared against recDate to decide which log file name to look for.
rec0MM: The two-digit month, such as 01 (01 represents January).
rec0DD: The two-digit day, such as 01 (01 represents the first day of the month).
recYY: The two-digit year, such as 17 (17 represents 2017).
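The file name rule above can be sketched as a small shell function. This is an illustration only, not code from webAccessLogParser.sh: the function name log_file_name is hypothetical, and rejecting a record date in the future is an assumption, since the post only describes the "same day" and "earlier day" cases.

```shell
#!/bin/sh
# Hypothetical helper, not part of webAccessLogParser.sh.
# Prints the log file name for a record date, following the rule above.
# Both arguments are expected in YYYY-MM-DD format.
log_file_name() {
  rec_date=$1
  curr_date=$2

  if [ "$rec_date" = "$curr_date" ]; then
    echo "access_log"
    return 0
  fi

  # Drop the dashes so the dates compare as plain integers (YYYYMMDD).
  rec_num=$(printf '%s' "$rec_date" | tr -d '-')
  curr_num=$(printf '%s' "$curr_date" | tr -d '-')
  if [ "$rec_num" -gt "$curr_num" ]; then
    # Assumption: a future record date has no log file to look for.
    echo "record date is in the future: $rec_date" >&2
    return 1
  fi

  # Cut the zero-padded month, day, and two-digit year out of YYYY-MM-DD.
  rec0MM=$(printf '%s' "$rec_date" | cut -c6-7)
  rec0DD=$(printf '%s' "$rec_date" | cut -c9-10)
  recYY=$(printf '%s' "$rec_date" | cut -c3-4)
  echo "access_log.${rec0MM}${rec0DD}${recYY}"
}

log_file_name 2017-01-05 2017-01-05   # prints access_log
log_file_name 2017-01-05 2017-01-06   # prints access_log.010517
```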
Review the actual script, available on GitHub (https://github.com/pppoudel/log-parser/blob/master/webAccessLogParser.sh), for details.
Note: the script is written to parse dates in the format '13/Jun/2015:10:32:04 -0400' in access_log. If your access_log uses a different date format, you may need to tweak the section of the script that parses the date.
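As a rough illustration of what parsing this date format involves, the sketch below converts such a timestamp to the 'YYYY-MM-DD' form used for recDate. It is not the date-parsing section of webAccessLogParser.sh: the function name to_iso_date is hypothetical, and the time zone offset is simply ignored, so the result is the date as written in the log.

```shell
#!/bin/sh
# Hypothetical helper, not part of webAccessLogParser.sh.
# Converts an access_log timestamp such as "13/Jun/2015:10:32:04 -0400"
# (with or without the leading "[") to YYYY-MM-DD. The time zone offset is
# ignored, so the result is the date as written in the log.
to_iso_date() {
  printf '%s\n' "$1" | awk -F'[/:]' '
    BEGIN {
      n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
      for (i = 1; i <= n; i++) month[names[i]] = i
    }
    {
      day = $1
      sub(/^\[/, "", day)              # tolerate a leading "["
      if (!($2 in month)) exit 1       # unknown month abbreviation
      printf "%s-%02d-%02d\n", $3, month[$2], day
    }'
}

to_iso_date '13/Jun/2015:10:32:04 -0400'   # prints 2015-06-13
```

If your access_log uses a different date format, the field separator and the order of the fields in the printf call are the parts that would change.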
How to execute:
You can see all the available options by just launching the script:
$> ./webAccessLogParser.sh
A few examples:
# processing current day's logs
Output
Report/Output files:
- $rptDir/00_Alert.txt
- $rptDir/02_WebAccessLogSummaryRpt.txt
- $rptDir/WebAccessLogRpt_all.csv
- $rptDir/WebAccessLog_discardedRpt.csv
- $rptDir/WebAccessLogSummaryByDomainRpt.csv
- $rptDir/WebAccessLogSummaryByTransactionRpt.csv
- $rptDir/WebAccessLogSummaryByUIDRpt.csv
- $rptDir/WebAccessLogSummaryByRC400PlusURLRpt.csv
- $rptDir/WebAccessLogSummaryByUidSessionRpt.csv
- $rptDir/WebAccessLogSummaryUnknowUARpt.csv
- $rptDir/WebHourlyDomainUsageByUid.csv
- $rptDir/WebHourlyDomainUsageBySess.csv
- $rptDir/WebDlyDomainUsage.csv
Where $rptDir is the report directory. The default value is $TMP/$recDate.
History Report/Output files:
# These are historical reports. Each run appends records to the existing report file.
- $pDir/WebPerfHistoryRpt.csv
- $pDir/WebHourlyAvgRespTimeHistoryRpt.csv
- $pDir/WebUniqueUsersHourlyHistoryRpt_all.csv
- $pDir/WebRequestTypeHistoryRpt.csv
- $pDir/WebResponseCodeHistoryRpt.csv
- $pDir/WebStatsByIHSHistoryRpt.csv
- $pDir/WebStatsByWASHistoryRpt.csv
See a sample summary report on GitHub: https://github.com/pppoudel/log-parser/blob/master/sample_reports/02_WebAccessLogSummaryRpt.txt
See my other posts in this series:
- websphereLogParser.sh for parsing, analyzing and reporting WebSphere Application Server (WAS) SystemOut.log
- webErrorLogParser.sh for parsing, analyzing and reporting Apache/IBM HTTP Server (IHS) error_log
- javaGCStatsParser.sh for parsing, analyzing and reporting Java verbose Garbage Collection (GC) log