Crash Dump Analysis on Integrity

This course provides the foundations for crash dump analysis on OpenVMS Integrity servers. It introduces the Itanium architectural characteristics needed to analyze crash dumps and system hangs. The calling standard is reviewed so that students can walk call frames and determine where improper arguments were passed to a function call. Interrupts and exceptions are discussed, along with their handling by the Software Interrupt Services (SWIS). Considerable time is spent exploring different types of crashes and reviewing specific examples.

Course Objectives

Upon completion of the course, students should be able to:

  • Identify the general Integrity architecture and procedure call types
  • Describe what causes a crash to occur
  • Describe the general characteristics of interruptions on Integrity servers
  • Identify the reasons for an "Unexpected system service exception" crash
  • Describe and analyze basic "Invalid exception" crashes
  • Identify and analyze basic "Page fault with IPL too high" crashes
  • Describe monitoring-oriented system parameters
  • Describe and perform basic analysis of the "Kernel stack not valid" crash
  • Identify general causes of SMP-related crashes
  • Describe and analyze basic AST-related crashes
  • Analyze system hangs

Course Outline

Overview of General Itanium Architecture
  • General IA64 architecture
  • Itanium application registers
  • Predication
  • Register conventions and usage considerations
  • Using map and listing files
Procedure Calls
  • OpenVMS calling standard on IA64
  • Calling procedures
  • Passing parameters
  • Register stack engine
Crash Dump Fundamentals
  • Overview of OpenVMS crashes
  • Crash dump analysis tools
  • General steps in analyzing crashes
Interrupts and Exceptions
  • Interrupts and exceptions
  • Software Interrupt Services (SWIS)
  • Exception frames
  • SWIS log
Exception-Related Crashes
  • Exceptions and condition handling review
  • Exception-related crashes
  • Examining unexpected system service exception crashes
Invalid Exception Crashes
  • Invalid exception crash concepts
  • Analyzing invalid exception crashes
Analysis of "Page Fault with IPL Too High" Crashes
  • "Page fault with IPL too high" crash concepts
  • Analyzing "Page fault with IPL too high" crashes
More on SDA
  • SDA techniques
  • Monitoring-oriented parameters
"Kernel Stack Not Valid" Crashes
  • "Kernel stack not valid" concepts
  • Sample analysis of "Kernel stack not valid" crashes
SMP-Related Crashes
  • Symmetric multiprocessing
  • SMP bugchecks
  • Spin wait crashes
  • CPU sanity timeout crashes
AST-Related Crashes
  • AST concepts
  • AST crash considerations
  • Sample crashes in AST routines
System Hangs
  • System hangs
  • Methods for forcing a system crash
  • Analyzing system hangs
  • Sample analysis of system hangs