ted.neward@newardassociates.com | Blog: http://blogs.newardassociates.com | Github: tedneward | LinkedIn: tedneward
How do I let others identify problems?
How do I let others fix problems?
How do I identify problems before users do?
Debugging: the art of identifying and fixing a problem after it has been spotted
Monitoring: the art of keeping an eye on the system so as to prevent problems from being spotted or occurring
of the two, Monitoring is actually more important
The worst form of error report is the one that comes from users or customers
your company looks bad
your application looks bad
you look bad
remember, first impressions are everything!
Monitoring allows developers & admins to watch the running system and react to problems before users ever see them
not only deadlocks, crashes, outages, ...
but also slow response times
Execution-level monitoring
Database monitoring
App server monitoring
Application monitoring
Logging
Reports
Statistics
Threshold alerts
virtual machines will usually have some amount of monitoring
tracking automatic memory management
identifying thread creation
tracking code-loading
native (OS) monitoring is also often available
specific details depend on the OS
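As a rough illustration of the VM-level monitoring listed above, the sketch below reads the three areas (memory management, thread creation, code loading) through the standard java.lang.management MXBeans. The class name VmSnapshot and the output key names are made up for this example; only the MXBean calls come from the JDK.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: the class name and output keys are illustrative only.
class VmSnapshot {
    /** Builds a one-off, human-readable snapshot of VM-level counters. */
    static String snapshot() {
        StringBuilder out = new StringBuilder();

        // Automatic memory management: current heap usage plus per-collector totals.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        out.append("heap.usedBytes=").append(heap.getUsed()).append('\n');
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            out.append("gc[").append(gc.getName()).append("].collections=")
               .append(gc.getCollectionCount()).append('\n');
        }

        // Thread creation: live threads now, and threads started since VM launch.
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        out.append("threads.live=").append(threads.getThreadCount()).append('\n');
        out.append("threads.totalStarted=")
           .append(threads.getTotalStartedThreadCount()).append('\n');

        // Code loading: classes currently loaded, and total loaded since launch.
        ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();
        out.append("classes.loaded=").append(classes.getLoadedClassCount()).append('\n');
        out.append("classes.totalLoaded=")
           .append(classes.getTotalLoadedClassCount()).append('\n');
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(snapshot());
    }
}
```

Calling snapshot() on a timer and comparing successive values is one simple way to spot a steadily growing thread or class count.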
(Fact: DBAs will likely do most of the production database monitoring, but it helps to know their tools)
Query analyzers: learn to use your database's tool
Protocol interceptor: Wireshark
API interceptors
Most production-ready app servers have some form of monitoring capabilities
Consider separate logs for developers and administrators (and any other roles)
Ensure a clear definition of verbosity for each level
Consider providing "runtime-connectable" diagnostic sinks (socket) for "hot" viewing
When logging, always explicitly check the logging flags at the call site; that way, if the flags change during execution, the logging code picks up the change
Consider separate sinks for logs as well as a unified sink for a more holistic view of the system
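The logging advice above might look like the following with java.util.logging. This is only a sketch: the logger name "app.orders", the MemorySink class, and placeOrder are invented for the example, and an in-memory list stands in for a real file or socket sink. One sink takes WARNING and above (for administrators) and another takes everything (the developer, or unified, view).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical example: all names here are illustrative, not from the talk.
class RoleLogging {
    /** Minimal in-memory sink; a real one might write to a file or a socket. */
    static final class MemorySink extends Handler {
        final List<String> lines = new ArrayList<>();
        @Override public synchronized void publish(LogRecord record) {
            if (isLoggable(record)) {           // honors this sink's own level
                lines.add(record.getLevel() + " " + record.getMessage());
            }
        }
        @Override public void flush() { }
        @Override public void close() { }
    }

    static final Logger LOG = Logger.getLogger("app.orders");
    static final MemorySink ADMIN_SINK = new MemorySink();  // WARNING and above
    static final MemorySink DEV_SINK = new MemorySink();    // everything

    static {
        LOG.setUseParentHandlers(false);   // keep the demo off the console
        LOG.setLevel(Level.INFO);
        ADMIN_SINK.setLevel(Level.WARNING);
        DEV_SINK.setLevel(Level.ALL);
        LOG.addHandler(ADMIN_SINK);
        LOG.addHandler(DEV_SINK);
    }

    static void placeOrder(String id) {
        // Check the flag on every call, so a level change at runtime is
        // picked up and the message string is only built when needed.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("order detail: " + id);
        }
        LOG.info("order placed: " + id);
    }

    public static void main(String[] args) {
        placeOrder("A1");                  // FINE is off: only the INFO line
        LOG.setLevel(Level.FINE);          // verbosity raised while running
        placeOrder("A2");                  // now the FINE line appears too
        LOG.warning("db timeout");         // reaches both sinks
        System.out.println("dev:   " + DEV_SINK.lines);
        System.out.println("admin: " + ADMIN_SINK.lines);
    }
}
```

The level lives on the logger and on each sink separately, which is what lets one stream of log calls feed audiences with different verbosity needs.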
Statistics provide insight into application execution
Step 1: identify useful statistics (you, admins, users)
counts, averages (per sec, min, hour, ...)
logins (successful, failed), database (accesses, timeouts, SQL errors), application exceptions (handled, unhandled), transactions (shopping carts purchased), etc
look to PerfMon and/or business analyst's reports for suggestions
Step 2: if not somehow already provided, build instrumentation to track those statistics
Step 3: enable the instrumentation in your running app/server
varies by platform or appserver
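For a JVM-based application, one possible shape for Steps 2 and 3 is a standard JMX MBean holding the counters, registered with the platform MBean server so that tools such as jconsole can read it. The sketch below tracks successful and failed logins; LoginStatsDemo, LoginStats, and the ObjectName "com.example.app:type=LoginStats" are hypothetical names chosen for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical example: class, attribute, and ObjectName values are illustrative.
class LoginStatsDemo {
    /** Management interface; JMX exposes each getter as a read-only attribute. */
    public interface LoginStatsMBean {
        long getSuccessfulLogins();
        long getFailedLogins();
    }

    /** Step 2: the instrumentation itself, safe to call from many threads. */
    public static class LoginStats implements LoginStatsMBean {
        private final AtomicLong ok = new AtomicLong();
        private final AtomicLong failed = new AtomicLong();

        public void recordLogin(boolean success) {
            (success ? ok : failed).incrementAndGet();
        }
        @Override public long getSuccessfulLogins() { return ok.get(); }
        @Override public long getFailedLogins() { return failed.get(); }
    }

    /** Step 3: enable it by registering with the platform MBean server. */
    static LoginStats register(String name) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName objectName = new ObjectName(name);
        if (server.isRegistered(objectName)) {
            server.unregisterMBean(objectName);   // keep re-runs idempotent
        }
        LoginStats stats = new LoginStats();
        server.registerMBean(stats, objectName);
        return stats;
    }

    public static void main(String[] args) throws Exception {
        String name = "com.example.app:type=LoginStats";
        LoginStats stats = register(name);
        stats.recordLogin(true);
        stats.recordLogin(false);
        stats.recordLogin(true);

        // Read the counter back the way a management tool would.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        Object value = server.getAttribute(new ObjectName(name), "SuccessfulLogins");
        System.out.println("SuccessfulLogins=" + value);
    }
}
```

The application code only ever calls recordLogin; everything about how the numbers are viewed stays on the management side.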
-verbose:gc, -verbose:jni, -verbose:class
command-line options provide diagnostics on GC(!), JNI invocation, and classes loaded(!)
_JAVA_LAUNCHER_DEBUG
environment variable provides JRE info
jps provides list of Java processes & PIDs
jconsole/VisualVM provide a nice GUI overview of the JVM
(Java 5) must launch the monitored JVM with the property param: -Dcom.sun.management.jmxremote
(Java 6+) no param necessary; the agent can be hot-loaded
Custom tools: build your own JMX clients
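A custom JMX client can be quite small. The sketch below is written against MBeanServerConnection, so the same probe code works in-process and against a remote JVM; here it is pointed at the local platform MBean server to keep it self-contained. HeapProbe is a made-up name; "java.lang:type=Memory" and "java.lang:type=Threading" are the standard platform MXBean names.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

// Hypothetical client: the class and method names are illustrative.
class HeapProbe {
    /** Reads used heap bytes from the platform Memory MXBean. */
    static long heapUsed(MBeanServerConnection conn) throws Exception {
        ObjectName memory = new ObjectName("java.lang:type=Memory");
        // Over JMX, MemoryUsage arrives as an open-type CompositeData.
        CompositeData usage = (CompositeData) conn.getAttribute(memory, "HeapMemoryUsage");
        return (Long) usage.get("used");
    }

    /** Reads the live thread count from the platform Threading MXBean. */
    static int threadCount(MBeanServerConnection conn) throws Exception {
        ObjectName threading = new ObjectName("java.lang:type=Threading");
        return (Integer) conn.getAttribute(threading, "ThreadCount");
    }

    public static void main(String[] args) throws Exception {
        // In-process connection for the demo. For a remote JVM, the connection
        // would instead come from JMXConnectorFactory.connect(new JMXServiceURL(...))
        // followed by getMBeanServerConnection(); the URL depends on your setup.
        MBeanServerConnection conn = ManagementFactory.getPlatformMBeanServer();
        System.out.println("heap used (bytes): " + heapUsed(conn));
        System.out.println("live threads:      " + threadCount(conn));
    }
}
```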
HPROF: sample (!) profiler shipping with Sun JDK
LoggingMXBean
(HotSpot-specific) provides memory dump during execution
jhat: analyzes HPROF binary output, provides an HTTP connection point to browse the data
jstat: prints from built-in HotSpot instrumentation
JMX MBeans: purely managed code
JVMTI (JVMDI, JVMPI): C++
Instrumentation (java.lang.instrument)
Thread APIs (java.lang.Thread)
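Of the options above, the Thread APIs are the easiest to try from plain application code. As a minimal sketch, Thread.getAllStackTraces() can produce a thread dump on demand, for instance from an admin-only diagnostic endpoint. ThreadReport and its output format are invented for this example.

```java
import java.util.Map;

// Hypothetical helper: the class name and output layout are illustrative.
class ThreadReport {
    /** Returns a text dump of every live thread and its current stack. */
    static String dump() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            out.append('"').append(t.getName()).append('"')
               .append(" daemon=").append(t.isDaemon())
               .append(" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

The dump is a snapshot taken thread by thread, so it is best treated as a diagnostic aid rather than a consistent picture of the whole VM at one instant.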
Online resources
http://java.sun.com/javase/6/webnotes/trouble/TSG-VM/TSG-VM.pdf
Troubleshooting Guide
http://java.sun.com/javase/6/webnotes/trouble/TSG-Desktop/TSG-Desktop.pdf
Troubleshooting Guide for Desktop Applications
http://java.sun.com/javase/6/webnotes/trouble/other/matrix6-Windows.html
Quick Troubleshooting Guide
http://java.sun.com/products/hotspot/whitepaper.html
The Java HotSpot Performance Engine Architecture
http://java.sun.com/docs/performance/
Java Performance Documentation
http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp
Java HotSpot VM Options
use JVM and app server hooks to keep an eye on the infrastructure your system depends on
create application hooks to keep an eye on the core domain functionality your system offers
build/use monitoring tools as necessary to enable monitoring to be more than a developer activity
Architect, Engineering Manager/Leader, "force multiplier"
http://www.newardassociates.com
http://blogs.newardassociates.com
Sr Distinguished Engineer, Capital One
Educative (http://educative.io) Author
Performance Management for Engineering Managers
Books
Developer Relations Activity Patterns (w/Woodruff, et al; Apress, forthcoming)
Professional F# 2.0 (w/Erickson, et al; Wrox, 2010)
Effective Enterprise Java (Addison-Wesley, 2004)
SSCLI Essentials (w/Stutz, et al; O'Reilly, 2003)
Server-Based Java Programming (Manning, 2000)