Practices for Lesson 8
Practices Overview
Practice 8-1: Investigating Server Problems
Duration: 45 minutes
Skills Learned
At the end of this practice, you should be able to:
• Manage server overload conditions
• Use WLST and the console to monitor server threads
• Identify potential deadlocks in thread dumps
• Limit server resource consumption by using work managers
Overview
WebLogic Server (WLS) employs a self-tuning thread pool that optimizes the number of threads based on server load and on constraints defined by work managers. Idle threads wait for an incoming client request to process, while active threads are currently running application code. WLS periodically monitors its threads to collect statistics and to flag potential problems, such as a thread that appears to be stuck. If a server becomes overloaded, WLS can also shut itself down or shut down individual applications.
In this practice, you simulate an overloaded server by sending multiple requests to an unresponsive application. You then use various tools to investigate the cause of this issue (a deadlock, for example). Finally, you take advantage of the work manager framework to tune and constrain the threading behavior for the application.
This lab environment is depicted in the following diagram:
Instructions
1. Set up the practice.
a. Locate a Lab Framework prompt or start a new one. Change directories to <CURRENT_LAB>.
b. Execute the following:
ant setup_exercise
The Lab Framework:
− Deploys a new version of the MedRec application along with a deployment plan
− Updates the configuration for MedRecSvr1
c. Kill and restart the server MedRecSvr1.
2. Test the application under load.
a. From the Lab Framework prompt, navigate to <LAB_WORK>/client.
b. Execute the runclients.sh script.
Oracle University and Sentra inversiones y servicios LTDA use only
c. After a few minutes, MedRecSvr1 should fail and also generate a thread dump. The test client should also report that there were errors.
d. Use the server’s output or log file to locate the original error messages, which report several stuck threads. For example:
<Critical> <Health> ... <Critical Subsystem Thread Pool has failed. Setting server state to FAILED.
Reason: Server failed as the number of stuck threads has exceeded the max limit of 3>
Tip: You may also see the following types of errors:
<Error> <WebLogicServer> <BEA-000337> <[STUCK] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)' has been busy for "83" seconds working on the request ... which is more than the configured time (StuckThreadMaxTime) of "60" seconds.
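When the log is long, the stuck-thread messages can be pulled out with a short script. The following is a minimal Python sketch, assuming the message layout matches the BEA-000337 example quoted above; the pattern, the function name find_stuck_threads, and the sample string are illustrative and not part of the Lab Framework. The exact log layout may differ between WebLogic releases.

```python
import re

# Pattern for the BEA-000337 stuck-thread message quoted in this practice.
# Assumption: the log line follows the same layout as the example text.
STUCK_RE = re.compile(
    r"\[STUCK\] ExecuteThread:\s*'(?P<id>\d+)' for queue: '(?P<queue>[^']+)'"
    r".*?busy for \"(?P<busy>\d+)\" seconds"
    r".*?\(StuckThreadMaxTime\) of \"(?P<limit>\d+)\" seconds",
    re.DOTALL,
)


def find_stuck_threads(log_text):
    """Return one dict per stuck-thread message found in log_text."""
    return [
        {
            "thread": int(m.group("id")),
            "queue": m.group("queue"),
            "busy_seconds": int(m.group("busy")),
            "limit_seconds": int(m.group("limit")),
        }
        for m in STUCK_RE.finditer(log_text)
    ]


# Sample text copied from the example message in this practice.
sample = (
    "<Error> <WebLogicServer> <BEA-000337> <[STUCK] ExecuteThread: "
    "'0' for queue: 'weblogic.kernel.Default (self-tuning)' has been busy "
    "for \"83\" seconds working on the request ... which is more than the "
    "configured time (StuckThreadMaxTime) of \"60\" seconds."
)

stuck = find_stuck_threads(sample)
for entry in stuck:
    print(entry)
```

To use it against a real server log, read the file contents into a string and pass that string to find_stuck_threads.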
3. Modify server overload configuration.
a. Launch the administration console and Lock it.
b. Locate and edit MedRecSvr1.
c. Click the Configuration > Overload tab.
d. Notice the current values of Failure Action and Panic Action.
e. Update the following fields:
Field Value
Max Stuck Thread Time 300
Stuck Thread Count 5
f. Save and Activate your changes.
g. Start MedRecSvr1.
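The effect of raising Max Stuck Thread Time can be illustrated with a small model. This is a sketch only, not WebLogic code: the function name and the snapshot of busy times are hypothetical, and the only rule taken from the practice is that a thread is reported as stuck when it has been busy for more than the configured time.

```python
def stuck_threads(busy_seconds_by_thread, max_stuck_thread_time):
    """Return names of threads busy longer than max_stuck_thread_time seconds."""
    return sorted(
        name
        for name, busy in busy_seconds_by_thread.items()
        if busy > max_stuck_thread_time
    )


# Hypothetical snapshot: seconds each execute thread has spent on its request.
snapshot = {
    "ExecuteThread-0": 83,
    "ExecuteThread-1": 75,
    "ExecuteThread-2": 12,
    "ExecuteThread-3": 310,
}

before = stuck_threads(snapshot, 60)   # limit shown in the example log message
after = stuck_threads(snapshot, 300)   # limit configured in this step

print("stuck at 60 s:", before)
print("stuck at 300 s:", after)
```

With the higher limit, fewer threads in the same snapshot count as stuck, so the number that the server compares against Stuck Thread Count grows more slowly.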
4. Monitor thread usage by using WLST and the console.
a. Inspect the contents of the <CURRENT_LAB>/resources/wlst/monitorThreads.py file.
b. Launch another Lab Framework prompt.
c. Execute the monitorThreads.py WLST script.
d. Note the current number of threads in various states (idle, active, and stuck). Leave the script running.
e. Execute the runclients.sh script again and continue monitoring the WLST output.
Confirm that the number of active and stuck threads is increasing.
Notice that the server no longer automatically shuts down.
f. Kill the WLST script.
g. Return to the console and view MedRecSvr1 again.
h. Click the Monitoring > Threads tab.
i. Inspect the contents of the Self-Tuning Thread Pool Threads table. Make a note of the names of threads whose Hogger flag is set to true.
Hogger threads are simply suspicious threads that have not yet formally been flagged as stuck.
Tip: You can scroll to the right to view this column, if necessary. Alternatively, customize the table.
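The per-state counts observed in this step can be reproduced from a list of thread records. The sketch below is plain Python and makes no WLST calls; the record fields (name, idle, hogger, stuck) are hypothetical stand-ins for whatever monitorThreads.py or the console table reports, and the precedence (stuck over hogger over idle) is an assumption made for illustration.

```python
from collections import Counter


def classify(thread):
    """Map one thread record to a single state label (stuck wins over hogger)."""
    if thread["stuck"]:
        return "stuck"
    if thread["hogger"]:
        return "hogger"
    if thread["idle"]:
        return "idle"
    return "active"


def summarize(threads):
    """Count threads per state, always reporting all four states."""
    counts = Counter(classify(t) for t in threads)
    return {s: counts.get(s, 0) for s in ("idle", "active", "hogger", "stuck")}


# Hypothetical records, shaped like rows of the thread table.
threads = [
    {"name": "ExecuteThread-0", "idle": False, "hogger": True, "stuck": True},
    {"name": "ExecuteThread-1", "idle": False, "hogger": True, "stuck": False},
    {"name": "ExecuteThread-2", "idle": True, "hogger": False, "stuck": False},
    {"name": "ExecuteThread-3", "idle": False, "hogger": False, "stuck": False},
]

summary = summarize(threads)
print(summary)
```

Printing such a summary at a fixed interval while the test client runs shows the hogger and stuck counts rising over time.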
5. Analyze thread deadlocks by using the console.
a. Click the Dump Thread Stacks button.
Tip: If the test client has finished, run it again and then take a thread dump.
b. Locate the hogger threads that you identified earlier. Notice that several of these threads meet the following criteria:
− Standard execute threads (not internal)
− Active state
− Running MedRec application code
− Blocked (waiting for a lock)
Here is an example:
"[ACTIVE] ExecuteThread: '3' for queue: 'weblogic.kernel.Default (self-tuning)'" ...
Blocked trying to get lock: com/bea/medrec/web/controller/ViewingRecordSummaryController$SynchronizedRecord@564290
at ...
at com/bea/medrec/web/controller/ViewingRecordSummaryController.viewRecordSummary ...
Tip: Perform a search for the text "blocked".
c. How many threads meet these criteria? Notice that all of them appear to be waiting on the same lock.
This indicates a potential deadlock.
d. Try to find a single thread running the MedRec application but not waiting for a lock.
For example:
"[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'" ...
at ...
at com/bea/medrec/web/controller/ViewingRecordSummaryController.viewRecordSummary (ViewingRecordSummaryController.java:52)
...
This code is likely the cause of the deadlock.
e. Scroll down to the end of the thread dump and locate the Blocked lock chains section. Use this information to confirm that these threads are all waiting on the same lock.
Tip: If you have some Java knowledge, you can view the culprit code at <CURRENT_LAB>/resources.
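Counting blocked threads by hand becomes tedious in a large dump. The following Python sketch groups execute threads by the lock they are waiting on. It assumes the dump uses the "Blocked trying to get lock:" wording shown in the example above and that a blank line separates one thread's stack from the next; the function name and the sample dump (including thread '5') are hypothetical.

```python
import re
from collections import defaultdict

THREAD_RE = re.compile(r"ExecuteThread: '(\d+)'")
LOCK_RE = re.compile(r"Blocked trying to get lock: (\S+)")


def blocked_by_lock(dump_text):
    """Map each contended lock to the execute-thread ids blocked on it."""
    waiting = defaultdict(list)
    current = None
    for line in dump_text.splitlines():
        if not line.strip():
            current = None  # assumption: a blank line ends the previous stack
            continue
        thread = THREAD_RE.search(line)
        if thread:
            current = thread.group(1)
            continue
        lock = LOCK_RE.search(line)
        if lock and current is not None:
            waiting[lock.group(1)].append(current)
    return dict(waiting)


# Hypothetical dump fragment modeled on the examples in this practice.
LOCK = ("com/bea/medrec/web/controller/"
        "ViewingRecordSummaryController$SynchronizedRecord@564290")
HEADER = "\"[ACTIVE] ExecuteThread: '%s' for queue: 'weblogic.kernel.Default (self-tuning)'\""
FRAME = "    at com/bea/medrec/web/controller/ViewingRecordSummaryController.viewRecordSummary"
sample_dump = "\n".join([
    HEADER % "3", "    Blocked trying to get lock: " + LOCK, FRAME, "",
    HEADER % "5", "    Blocked trying to get lock: " + LOCK, FRAME, "",
    HEADER % "0", FRAME,
])

result = blocked_by_lock(sample_dump)
for lock, ids in result.items():
    print(lock, "<-", ids)
```

A lock with many waiters points to the contended resource; the MedRec thread that does not appear in any list (thread '0' in the sample) is the one to examine as the likely holder.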
6. Create a custom work manager.
While this deadlock is being investigated, limit the number of threads that this application can consume. This minimizes the impact on other applications running on the same server.
a. Lock the console.
b. In the Domain Structure panel, select Environment > Work Managers. Click New.
c. Select the Work Manager option and click Next.
d. Name the work manager LowPriority and click Next.
e. Target the work manager to MedRecSvr1 and click Finish.
f. Edit the new work manager.
g. Locate the Maximum Threads Constraint field and click New.
h. Enter the following values:
Field Value
Name Max3
Count 3
Click Next.
i. Target the constraint to MedRecSvr1 and click Finish.
j. Click Save and Activate your changes.
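The idea behind a maximum threads constraint can be illustrated with a bounded semaphore. This Python sketch is an analogy only and does not reflect how WebLogic implements work managers: ten simulated requests compete for three slots (mirroring the Count of Max3), and the script records the highest number that ran at the same time. All names in it are hypothetical.

```python
import threading
import time

MAX_THREADS = 3  # mirrors the Count value of the Max3 constraint

constraint = threading.BoundedSemaphore(MAX_THREADS)
state_lock = threading.Lock()
running = 0
peak = 0


def handle_request():
    """Simulate one request that needs a constraint slot before it can run."""
    global running, peak
    with constraint:
        with state_lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.05)  # stand-in for slow application code
        with state_lock:
            running -= 1


workers = [threading.Thread(target=handle_request) for _ in range(10)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print("peak concurrent requests:", peak)
```

However many requests arrive, the peak never exceeds the constraint, which is the behavior that the thread dump in the validation step is expected to confirm for MedRec.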
7. Assign the work manager to the application.
a. Inspect the contents of the <CURRENT_LAB>/resources/medrec-plan.xml file.
b. Copy this file to <LAB_WORK>/applications. Overwrite any existing version of the file.
c. Update the medrec application by using the console.
d. Restart MedRecSvr1 to destroy any stuck threads.
8. Validate the work manager constraints.
a. Start the test client once again.
b. Use the console or the kill command to capture a thread dump.
c. Analyze the thread dump and confirm that no more than three threads are processing MedRec requests at the same time.
d. When finished with the practice, return to the server’s Configuration > Overload tab.
Modify the following fields:
Field Value
Failure Action Ignore, take no action
Panic Action Ignore, take no action
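The check in step 8c can also be scripted. The sketch below counts the execute threads in a dump whose stack mentions the MedRec package. It assumes the package path com/bea/medrec seen in the earlier examples and a blank line between thread stacks; the function name and the sample dump, including the idle-thread frame, are hypothetical.

```python
import re

THREAD_RE = re.compile(r"ExecuteThread: '(\d+)'")


def threads_in_package(dump_text, package="com/bea/medrec"):
    """Return ids of execute threads whose stack mentions the given package."""
    found = []
    current = None
    for line in dump_text.splitlines():
        if not line.strip():
            current = None  # assumption: a blank line ends the previous stack
            continue
        header = THREAD_RE.search(line)
        if header:
            current = header.group(1)
        elif current is not None and package in line and current not in found:
            found.append(current)
    return found


# Hypothetical dump: three threads in MedRec code, one waiting for work.
HEADER = "\"[ACTIVE] ExecuteThread: '%s' for queue: 'weblogic.kernel.Default (self-tuning)'\""
MEDREC = "    at com/bea/medrec/web/controller/ViewingRecordSummaryController.viewRecordSummary"
OTHER = "    at some/other/package/Worker.waitForWork"
sample_dump = "\n".join([
    HEADER % "1", MEDREC, "",
    HEADER % "2", MEDREC, "",
    HEADER % "4", MEDREC, "",
    HEADER % "6", OTHER,
])

busy = threads_in_package(sample_dump)
print(len(busy), "thread(s) in MedRec code:", busy)
```

If the constraint is in effect, the count reported for a real dump should not exceed three.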
Solution Instructions
1. If the <LAB_WORK>/domains/MedRecDomain location does not yet exist, follow the Solution Instructions for the “Developing a Custom Monitoring Script” practice.
2. Launch the Lab Framework command shell by executing the <STUDENT>/bin/prompt.sh file.
3. Change the current directory to <CURRENT_LAB>.
4. Execute the following:
ant setup_solution
The Lab Framework:
− Makes a backup copy of your current work
− Disables overload protection for MedRecSvr1, if enabled
− Adds a new work manager to MedRecSvr1
− Deploys the MedRec application by using an updated deployment plan