Troubleshooting Mcrnc


WCDMA RAN, Rel. RU50 and RU50 EP1, Operating Documentation, Issue 05

Troubleshooting Multicontroller RNC

DN0976768

Issue 02B

Approval Date 2015-02-23


The information in this document applies solely to the hardware/software product (“Product”) specified herein, and only as specified herein.

This document is intended for use by Nokia Solutions and Networks' customers (“You”) only, and it may not be used except for the purposes defined in the agreement between You and Nokia Solutions and Networks (“Agreement”) under which this document is distributed. No part of this document may be used, copied, reproduced, modified or transmitted in any form or means without the prior written permission of Nokia Solutions and Networks. If you have not entered into an Agreement applicable to the Product, or if that Agreement has expired or has been terminated, You may not use this document in any manner and You are obliged to return it to Nokia Solutions and Networks and destroy or delete any copies thereof.

The document has been prepared to be used by professional and properly trained personnel, and You assume full responsibility when using it. Nokia Solutions and Networks welcome Your comments as part of the process of continuous development and improvement of the documentation.

This document and its contents are provided as a convenience to You. Any information or statements concerning the suitability, capacity, fitness for purpose or performance of the Product are given solely on an “as is” and “as available” basis in this document, and Nokia Solutions and Networks reserves the right to change any such information and statements without notice. Nokia Solutions and Networks has made all reasonable efforts to ensure that the content of this document is adequate and free of material errors and omissions, and Nokia Solutions and Networks will correct errors that You identify in this document. But, Nokia Solutions and Networks' total liability for any errors in the document is strictly limited to the correction of such error(s). Nokia Solutions and Networks does not warrant that the use of the software in the Product will be uninterrupted or error-free.

NO WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY OF AVAILABILITY, ACCURACY, RELIABILITY, TITLE, NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, IS MADE IN RELATION TO THE CONTENT OF THIS DOCUMENT. IN NO EVENT WILL NOKIA SOLUTIONS AND NETWORKS BE LIABLE FOR ANY DAMAGES, INCLUDING BUT NOT LIMITED TO SPECIAL, DIRECT, INDIRECT, INCIDENTAL OR CONSEQUENTIAL OR ANY LOSSES, SUCH AS BUT NOT LIMITED TO LOSS OF PROFIT, REVENUE, BUSINESS INTERRUPTION, BUSINESS OPPORTUNITY OR DATA THAT MAY ARISE FROM THE USE OF THIS DOCUMENT OR THE INFORMATION IN IT, EVEN IN THE CASE OF ERRORS IN OR OMISSIONS FROM THIS DOCUMENT OR ITS CONTENT.

This document is Nokia Solutions and Networks’ proprietary and confidential information, which may not be distributed or disclosed to any third parties without the prior written consent of Nokia Solutions and Networks.

Nokia is a registered trademark of Nokia Corporation. Other product names mentioned in this document may be trademarks of their respective owners, and they are mentioned for identification purposes only. Copyright © 2015 Nokia Solutions and Networks. All rights reserved.


Important Notice on Product Safety

This product may present safety risks due to laser, electricity, heat, and other sources of danger. Only trained and qualified personnel may install, operate, maintain or otherwise handle this product and only after having carefully read the safety information applicable to this product. The safety information is provided in the Safety Information section in the “Legal, Safety and Environmental Information” part of this document or documentation set.

Nokia Solutions and Networks is continually striving to reduce the adverse environmental effects of its products and services. We would like to encourage you as our customers and users to join us in working towards a cleaner, safer environment. Please recycle product packaging and follow the recommendations for power use and proper disposal of our products and their components.

If you should have questions regarding our Environmental Policy or any of the environmental services we offer, please contact us at Nokia Solutions and Networks for any additional information.


This document has 71 pages

Summary of changes
1 Overview of Multicontroller RNC Troubleshooting
1.1 Troubleshooting Multicontroller RNC recommendations
1.2 Information sources in fault situations
1.3 Problem types
1.4 Generic troubleshooting procedure
2 Reporting problem to NSN
3 Symptoms data collection
3.1 Standard symptoms report
3.2 Collecting standard symptom reports
3.3 Listing the standard symptom reports
3.4 Copying standard symptoms reports to remote machine
3.5 Deleting the standard symptom reports
4 Generic software troubleshooting instructions
4.1 Displaying all blackbox files
4.2 Viewing active alarms and alarms history
4.3 Collecting core dump file information
5 Configuration management troubleshooting
5.1 Saving configuration snapshot fails
5.1.1 Description
5.1.2 Symptoms
5.1.3 Recovery procedures
6 Software management troubleshooting
6.1 Installation of a new software delivery fails
6.1.1 Description
6.1.2 Symptoms
6.1.3 Recovery procedures
6.2 Activation of a new software delivery fails
6.2.1 Description
6.2.2 Symptoms
6.2.3 Recovery procedures
6.3 Troubleshooting software configuration management
6.3.1 Executing pre-download script fails
6.3.2 Parsing targetBD.xml fails
6.3.3 Executing pre-activation script fails
6.3.4 Upgrading eSW fails
6.3.5 Alarm 2518 - NO VALID FALLBACK COPY FOR DEFAULT PACKAGE is triggered
7 Networking troubleshooting
7.1 Packet loss in certain traffic flows unexpectedly high
7.1.1 Description
7.1.2 Symptoms
7.1.3 Recovery procedures
7.2 Multihop BFD sessions are not established
7.2.1 Description
7.2.2 Symptoms
7.2.3 Recovery procedures
7.3 OSPF is not working properly
7.3.1 Description
7.3.2 Symptoms
7.3.3 Recovery procedures
7.4 IP signaling link activation fails
7.5 The state of all subsystems in the remote network element is unavailable (UA)
8 Hardware troubleshooting
8.1 No traffic detected
8.1.1 Description
8.1.2 Symptoms
8.1.3 Recovery procedure
8.2 SFP ports added for USPUs and CSPUs do not show correctly their operational state nor alarms are raised
8.3 Troubleshooting with mcJANE tool
9 Performance management troubleshooting
9.1 Threshold monitoring alarm is not sent to NetAct
9.2 Transport and HW measurement management in mcRNC
9.2.1 McRNC measurement management concepts
9.2.2 McRNC measurement management commands


List of Figures

Figure 1 Forcing change of the SPF post state

Figure 2 Example usage of the cp command

Figure 3 Checking the port monitoring


List of Tables

Table 1 NSN Case Type

Table 2 NSN Problem Priority

Table 3 Supported plugins

Table 4 Parameters related to save symptom-report command

Table 5 Mapping between pip-mark and Queue ID


Summary of changes

Changes between document issues are cumulative. Therefore, the latest document issue contains all changes made to previous issues.

Changes between issues 02A (2014-05-23, RU40) and 02B (2015-02-23, RU40)

A new subchapter, Troubleshooting software configuration management, has been added.

Changes between issues 02 (2013-09-15, RU40) and 02A (2014-05-23, RU40)

OSPF is not working properly has been updated. The feature management name has been changed to OSPFForRedundancy.

Changes between issues 01D (2012-11-23, RU30) and 02 (2013-09-15, RU40)

Activation of a new software delivery fails
• Commands have been updated.

Installation of a new software delivery fails
• Commands have been updated.

Displaying all blackbox files
• Information on the permissions required for executing the command in this section has been added to the Before you start section.

Collecting core dump file information
• Commands have been updated.


1 Overview of Multicontroller RNC Troubleshooting

1.1 Troubleshooting Multicontroller RNC recommendations

If you have a contract with Nokia for the operation and maintenance of the network (or some other agreement), the actions you need to take in a fault situation may be different from the ones suggested in the troubleshooting instructions. If the general principles are in conflict with the operation and maintenance contract or any other contract, carry out actions as agreed in the contract. The operation and maintenance personnel that carry out troubleshooting should be familiar with the hardware and software of Nokia network elements.

Electrostatic precautions

When handling plug-in units, it is important to use Electrostatic Precautions (ESP). This means that you must be earthed to equipment racks using an approved wrist strap and connecting lead. Approved ESP equipment makes a resistive connection to ensure the safety of the personnel and to prevent a sudden static discharge during connection to the earthing point.

Security procedures

You are recommended to establish security procedures at your site to ensure that staff and terminal access is appropriately controlled.

Disaster recovery plan

Establish a disaster recovery plan to help the personnel to deal with emergency situations. Remember that emergency situations can be best avoided by detecting abnormal conditions early. A disaster recovery plan should cover various disaster scenarios and disaster recovery procedures for personnel. The operation and maintenance personnel should also be able to contact the persons who are capable of dealing with the problem in question. Therefore, each site should have an escalation plan available with appropriate contact information.

Escalation plan

An escalation plan offers contact lists of internal and external support personnel and services available to tackle problems. It should contain information on who to contact and in what kind of situations, for example, air conditioning, power back-up system and Nokia Emergency/Help Desk numbers.

Preventive maintenance

Perform preventive maintenance routines on a regular basis. For example, carry out regular alarm and unit state surveillance.

Safecopying

Taking regular safecopies of the software and databases of the network element ensures that you have a functional copy of the software which you can use if there are problems with the software or hardware of the network element. How often you need to take safecopies depends on the size and type of the network element.

Performance monitoring

The purpose of performance monitoring is to measure the overall quality of the system. Performance monitoring can help you to detect very low rate or intermittent problems and possible degradation of some part of the system. The performance monitoring parameters for network elements of different kinds and sizes vary.

Documentation

Establish a procedure for keeping the documentation up-to-date and make sure that the operation and maintenance personnel have access to all relevant external and internal documents.

Network element diary

It is recommended that you maintain a network element diary. The diary should be network element-specific, but you can store it in the Operation and Maintenance Centre if the network element is not usually manned. Start filling in the network element diary as soon as the network element is being set up and installed. You are recommended to record the following events in the network element diary:

• Hardware changes
• Software and hardware updates (for example, change notes and correction deliveries)
• Essential modifications to the configuration or routing in the network element
• Safecopying
• Operational failures
• Any other relevant information

A network element diary can provide useful information on the system's performance in the past and hints on what might cause the current problems.

1.2 Information sources in fault situations

Use at least the following information sources when carrying out troubleshooting.

Alarms

Alarms are the primary source of information in most situations where troubleshooting is needed. Alarms are printed out on the alarm printer and/or other devices that you have specified for the network element.

Diagnosis reports

Diagnosis reports contain data on the plug-in units that the system suspects to be faulty. Diagnosis reports are usually printed out on the report printer and/or other devices that you have specified for the network element.

Error messages

General error messages of the system tell why the system cannot carry out a task. They can appear in the supplementary information fields of alarms and diagnostic printouts, in the printouts of the starting phases monitored through a service terminal, and in SCLI and Element Manager command outputs.

Statistical reports

Different statistical reports contain useful data, for example, on traffic on speech and signalling circuits, use of services and load and availability measurements. Monitor and assess statistical data regularly as this data can indicate forthcoming problems before they affect the traffic. For more information, see performance management documentation.

Logs and other relevant statistical information

Different logs (for example, computer and operating system logs and SCLI session reports) contain useful data that can be attached to the fault report when you need Nokia's help to solve a problem. Take the logs from the unit sending the alarm and the unit that is the object of the alarm, and if the alarm is I/O related, from the OMU.

Unit states

Check also the unit states.

1.3 Problem types

Here are some problem types you may encounter.

Reproducible problems

You can reproduce the symptoms using a set of actions. Reproducible problems can be solved by narrowing down the possible causes of the problem to a single cause or to a number of causes and applying corrective actions. This requires knowledge of how the system works and tests to eliminate wrong conclusions.

Intermittent problems

You cannot reproduce the symptoms consistently using any set of actions. However, an intermittent problem can reproduce itself randomly. In such a case, some kind of tracing or monitoring of the system may lead you to the origin of the trouble. If an intermittent problem occurs very seldom and it has no serious consequences, it may be best to just ignore the problem. You can also perform general maintenance and see if the problem disappears. If the problem occurs occasionally, try to conclude which factors seem to affect or contribute to the appearance of the problem.

Several related or isolated troubles active at the same time

Study whether the symptoms relate to each other or not and try to isolate the problems if possible.

1.4 Generic troubleshooting procedure

Description

Depending on whether you have a contract with Nokia on the operation and maintenance of the network (or some other agreement), the actions that you need to take in a fault situation may be different from the ones presented here. When you suspect that a Nokia network element is not performing as it should, carry out at least the following checks.

Symptoms

A Nokia network element is not performing as it should.

Recovery procedures

Troubleshooting process before calling Nokia Solutions and Networks help desk

Steps

1 Evaluate how serious the consequences of the trouble are

If the problem has very serious consequences, you may have to call for expert help or apply an emergency plan immediately.

2 Analyse the situation where the problem or failure first appeared

Consider the following before you carry out any corrective actions.

• What is the problem?
• Where is the problem?
• When did the problem occur?
• What were the circumstances that led to the problem?
• What is the impact of the problem (for example, to what extent does the fault affect the end customers)?
• Who is responsible for taking care of the problem?

3 Try to eliminate the possibility of human error

• For example, recent changes in the software or hardware configuration of the system (for example, circuits added or equipping changed) are possible sources of problems. The changes may have been carried out incorrectly. Check the configuration parameters, physical connections, strappings, plug-in units and so on, very carefully.

show functional-unit unit-info

• Human error is a very common cause of problems – therefore, check and double-check every possible problem source. Type history.

• A failure can also occur spontaneously (for example, the remote end system may have problems or a service breakdown, or a plug-in unit may fail due to ageing). Check the alarms, clear codes, unit and link states and logs as described below.

4 Make an accurate description of the symptoms

You may not be able to solve the problem yourself. A detailed description of the situation where the symptoms occurred can help an expert solve the problem. Gather also data on the failure event. A symptom description should contain all the basic facts, such as:

• Date, name of the person who detected the trouble, phone number and e-mail address
• Details of the system; for example, what equipment and software is in use
• Description of the symptoms (alarms, error messages, clear codes, faulty states of the units and links and so on)
• Any other relevant information, for example, log and message monitoring files.

All data may be valuable even if they seem irrelevant at the time. Store this information preferably in electronic format.

5 Check and analyse the alarm situation

• You are recommended to study the alarm history. Display it so that it shows all alarm events from the period that starts one hour before the problem occurred and ends one hour after the problem situation was over. Also check the alarms that are currently active:

# show alarm active

The system may set an alarm and cancel it immediately. You can find these alarms in the alarm history. Alarms behaving in this way may indicate that some part of the system is about to break down or its functionality has been reduced.

• Check also the diagnostic reports.

6 If the fault can be located based on the alarm situation, carry out the appropriate maintenance actions

• If you have ended up with more than one probable and possible cause for the trouble, change only one thing at a time – otherwise you cannot be sure of which change corrected the failure or problem.
• Remember that random actions can make problems worse. Generally, you should not take any radical corrective actions if you are not sure what the problem is and what the consequences of the corrective actions are. Losing traffic because of incorrect actions is not what you want.

7 If you cannot locate the fault based solely on the alarm situation, try to narrow down the possible problem source

• Analyse and categorise symptoms and list possible causes for the symptoms. Sometimes there can be several related or isolated troubles active at the same time. Study whether the symptoms relate to each other or not. Prioritise symptoms and collect further facts if needed.
• Based on tests and your knowledge of the system, eliminate symptoms that are not relevant to the trouble you are trying to solve. This way you can focus on symptoms and causes that are more likely to produce a solution to the problem. Examine what works and what does not.
• However, even though you may not be able to analyse what the cause of the trouble is, you can carry out general maintenance to eliminate some trivial causes of troubles, such as loose cables and bad connections.

8 Use measurements to trace any abnormal trends

For more information, see performance management documentation.

9 Check the states of units, links and circuits

Check unit states.

show functional-unit unit-info

10 Fill in a problem report if needed

Describe the problem in detail in the problem report. Include all relevant information that you have available from the problem situation and describe also the corrective measures that you have carried out after the problem occurred.
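Step 5 of the procedure recommends reviewing the alarm history from one hour before the problem occurred to one hour after it was over, and points out that alarms which are set and cancelled immediately may indicate a part that is about to fail. The following Python sketch illustrates both ideas on alarm data that has already been exported from the network element. The record layout (alarm number, set time, cancel time), the sample alarm numbers and the 60-second limit used for "immediately" are assumptions made for this illustration; they are not formats or values defined by Multicontroller RNC.

```python
from datetime import datetime, timedelta

# Margin recommended in step 5: one hour before and one hour after the problem.
MARGIN = timedelta(hours=1)


def history_window(problem_start, problem_end, margin=MARGIN):
    """Return the (start, end) period to review in the alarm history."""
    return problem_start - margin, problem_end + margin


def transient_alarms(events, max_duration=timedelta(seconds=60)):
    """Return the events that were set and cancelled within max_duration.

    Each event is a hypothetical (alarm_number, set_time, cancel_time) tuple;
    cancel_time is None while the alarm is still active.
    """
    return [
        event for event in events
        if event[2] is not None and event[2] - event[1] <= max_duration
    ]


# Hypothetical problem period and sample alarm events.
start = datetime(2014, 10, 16, 13, 0)
end = datetime(2014, 10, 16, 13, 30)
window = history_window(start, end)

events = [
    (1001, datetime(2014, 10, 16, 12, 50), None),
    (1002, datetime(2014, 10, 16, 13, 5), datetime(2014, 10, 16, 13, 5, 20)),
]
short_lived = transient_alarms(events)
```

Here `window` covers 12:00 to 14:30, and only the second sample event is reported as short-lived because it was cancelled 20 seconds after it was set.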


2 Reporting problem to NSN

Problem reports are used to communicate problems and failures to service personnel. Report only one fault in one problem report. Include the attachments in compressed format. To make the investigation of a problem faster, include the following information in the problem report:

• a title that gives a brief description of the problem
• a clear and exact description of the problem itself
• problem background information:
– software and hardware releases
– the situation in the beginning, for example, the first symptoms of the problem
– the situation after the problem occurred
– operations you made which possibly caused the failure, for example: hardware, software, parameter, feature or configuration changes, including operations in the transport network and third-party equipment
– whether the problem can be reproduced and what actions are required to reproduce it
– troubleshooting and recovery actions that you made
• symptom data reports

Most of the Multicontroller RNC configuration data and logs essential for investigation of the problem can be collected with the standard symptom plugin report group-RNC and subreport-MessageMonitor. This data is always collected if a Problem Report is submitted to NSN.

To collect the Multicontroller RNC standard symptom data, execute the following SCLI commands:

– symptom data and message monitoring:

save symptom-report name <report_name> include group-RNC include subreport-MessageMonitor

– basic symptom data:

save symptom-report name <report_name> include group-RNC

For more instructions on the standard symptoms data collection framework, see Standard symptoms data collection.
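When symptom data is collected repeatedly, it can help to generate the two command lines above from a single helper so that the report name is typed only once. The following Python sketch only assembles the command text shown in this chapter; it does not connect to the network element. The function name and the rule that the report name must not contain whitespace are assumptions made for this illustration.

```python
def symptom_report_command(report_name, message_monitor=False):
    """Build the SCLI command line for the group-RNC standard symptom report."""
    # Assumption: a report name is a single token without whitespace.
    if not report_name or any(ch.isspace() for ch in report_name):
        raise ValueError("report_name must be non-empty and contain no whitespace")
    parts = ["save", "symptom-report", "name", report_name, "include", "group-RNC"]
    if message_monitor:
        # Adds message monitoring to the basic symptom data.
        parts += ["include", "subreport-MessageMonitor"]
    return " ".join(parts)


basic = symptom_report_command("walle20141016")
full = symptom_report_command("walle20141016", message_monitor=True)
```

The resulting strings can be pasted into an SCLI session as they are.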

The data must be collected as soon as possible after an abnormal situation has taken place and before any recovery action is performed, such as Multicontroller RNC restarts or replacing hardware. This is important because the information stored about the problem (for example, the blackbox of a certain unit) may get overwritten over time or be lost because of recovery actions.

The standard symptom report group-RNC is expected to be completed in around 15 minutes, depending on the Multicontroller RNC configuration. Recovery actions can be started, if needed, as soon as symptom report generation is completed. A copy of the symptom report can be transferred to a local machine later at a convenient time. When you send out a problem report, make sure that all the possible attachments are included in the problem report, to avoid unnecessary information requests. For OMS symptom data collection, see Troubleshooting OMS document.

• If possible, include additional problem-specific symptom data, for example, problem-specific message monitoring logs and network analyzer interface traces.
• NetAct KPI/counter report in case the problem concerns a certain KPI/counter
• In a multivendor environment, include detailed information on the other third-party products.
• NSN Case Type: Emergency, Trouble Resolution or Technical Query. For details, see Table 1 NSN Case Type.
• NSN Problem Priority: Major, Medium or Minor. For details, see Table 2 NSN Problem Priority.

Table 1 NSN Case Type

NSN Case Type: Emergency

NSN Definition: This case type is for Total and Partial outages. Available priority - Critical. This case type is for cases under Emergency Support service.
Total Outage: Total loss of voice and data traffic capability. An unscheduled event must be longer than 15 seconds.
Partial Outage: Loss of greater than 10% of the provisioned capacity for origination and/or termination of voice and/or data traffic. Total loss of one or more critical services. An unscheduled event must be longer than 15 seconds.

Examples:
Total Outage:
• All Iub links are down.
• All WCELs are in incorrect state.
• RNC spontaneous restart (unplanned).
• RNC system restart (planned), all links are down.
Partial Outage:
• Loss of 10% or more of voice or data traffic capacity for at least 15 seconds.
• Spontaneous restarts of active computer unit(s) when there is no redundant unit available to take over the services provided by the restarted one.
• One or more network interfaces are down for over 15 seconds.
• Total loss of subscriber related RNC functionality (ISHO, HSDPA, HSUPA, Internet browsing, and so on)
• RNC operation and maintenance from NetAct is totally lost.
• Emergency calls not possible (for example, 112 or 991).

NSN Case Type: Trouble Resolution

NSN Definition: This case type is for problems which are caused by a suspected or an identified defect. Available problem priorities - Major, Medium, and Minor.

Examples:
• SW defect is identified or suspected. A correction will be required if the defect is confirmed.
• HW design defect is suspected. A correction (HW retrofit or SW change) will be required if the defect is confirmed.
• An error or omission in the product technical literature is suspected. A documentation correction will be required if the defect is confirmed.

NSN Case Type: Technical Query

NSN Definition: This case type is for technical questions regarding daily network operations and maintenance issues. Available problem priorities – Major and Medium.

Examples:
• Technical questions on procedures or features that are covered in the documentation shipped with the product.
• Information requests on NSN product that will be used to help interface this product with a third party product.

Table 2 NSN Problem Priority

NSN Problem Priority: Critical

NSN Definition: Problems under Emergency Support service.

Examples: N/A

NSN Problem Priority: Major

NSN Definition: Only Total or Partial outages which are not avoidable with a workaround solution.

Examples: N/A

NSN Problem Priority: Medium

NSN Definition: Loss of less than 10% of the provisioned capacity for origination and/or termination of voice and/or data traffic. Total or Partial outages avoidable with a workaround solution. Partial loss of one or more critical services.

Examples:
• total or more than 10% loss of voice or data traffic capacity for at least 15 seconds, avoidable with workaround
• single restart of computer units
• configuration changes (RNW, HW and SW) are not working
• activation of a feature fails
• single performance measurement is not working completely
• partial loss of subscriber related RNC functionality (ISHO, HSDPA, HSUPA, Internet browsing, and so on)
• partial loss of alarm management of objects (BTS, functional units)
• problems with back-up
• major errors in documentation, for example, an alarm or parameter description is missing from documentation
• vital documents are missing from the documentation library

NSN Problem Priority: Minor

NSN Definition: Minor fault not affecting operation or service quality

Examples:
• failures not seriously affecting traffic
• errors in command line syntax or output
• minor errors in documentation
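The capacity and duration limits in Table 1 and Table 2 can be read as a simple decision rule. The following Python sketch is one possible reading, given only as an aid for interpreting the tables. It covers capacity loss only, so it does not detect a partial outage caused by the total loss of a critical service, and it does not decide whether a case falls under Emergency Support service (priority Critical). The function names are hypothetical, and the choice to treat a loss of exactly 10% as not yet a partial outage follows the definition column ("greater than 10%"), not the examples column ("10% or more").

```python
MIN_OUTAGE_SECONDS = 15    # an unscheduled event must be longer than this
PARTIAL_LOSS_PERCENT = 10  # partial outage: loss greater than this share


def outage_kind(capacity_loss_percent, duration_seconds):
    """Classify an unscheduled event as "total", "partial" or None."""
    if duration_seconds <= MIN_OUTAGE_SECONDS:
        return None  # too short to count as an outage
    if capacity_loss_percent >= 100:
        return "total"
    if capacity_loss_percent > PARTIAL_LOSS_PERCENT:
        return "partial"
    return None


def suggested_priority(kind, workaround_available):
    """Suggest a Table 2 priority for an outage outside Emergency Support."""
    if kind is None:
        return None  # not an outage; assess manually as Medium or Minor
    # Outages avoidable with a workaround are Medium, otherwise Major.
    return "Medium" if workaround_available else "Major"


kind = outage_kind(capacity_loss_percent=25, duration_seconds=40)
priority = suggested_priority(kind, workaround_available=False)
```

The final choice of case type and priority remains a judgement made together with the support organisation; the sketch only restates the numeric limits from the tables.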


Most of the Multicontroller RNC configuration data and logs essential for problem investigation can be collected with the standard symptom plugin report group-RNC and subreport-MessageMonitor. This data is always collected if a Problem Report is submitted to Nokia Solutions and Networks.

To collect the Multicontroller RNC standard symptom data, execute the following SCLI commands:

• symptom data and message monitoring:

save symptom-report name <report_name> include group-RNC include subreport-MessageMonitor

• basic symptom data:

save symptom-report name <report_name> include group-RNC

For more instructions on the standard symptoms data collection framework, see Standard symptoms data collection.

The data must be collected as soon as possible after an abnormal situation has taken place and before any recovery action is performed, such as Multicontroller RNC restarts or replacing hardware. This is important because the information stored about the problem may get overwritten over time or lost due to recovery actions.

To collect the symptom report group-RNC together with message monitoring, enter the following command:

save symptom-report name <report_name> include group-RNC include subreport-MessageMonitor

_nokadmin@CFPU-0 [RNC-37] > save symptom-report name walle20141016 include group-RNC include subreport-MessageMonitor

CFPU-0@RNC-37 [2014-10-16 13:48:57 +0200]
Mode : Normal mode
Max. execution time limit : 30 minutes
Max. size of each part : 100 MB
Max. size of full report : 350 MB

ReportName     FileName                      Chunk   Subreport       Store
-------------  ----------------------------  ------  --------------  ------
walle20141016  /stdsymp/walle20141016_0.tar  Part 1  ipmgmt          CFPU-0
walle20141016  /stdsymp/walle20141016_0.tar  Part 1  MessageMonitor  CFPU-0
walle20141016  /stdsymp/walle20141016_0.tar  Part 1  rncipconfig     CFPU-0
walle20141016  /stdsymp/walle20141016_0.tar  Part 1  rncrnw          CFPU-0
walle20141016  /stdsymp/walle20141016_0.tar  Part 1  rncsignaling    CFPU-0
walle20141016  /stdsymp/walle20141016_0.tar  Part 1  rnchw           CFPU-0
walle20141016  /stdsymp/walle20141016_0.tar  Part 1  rncinfo         CFPU-0
walle20141016  /stdsymp/walle20141016_0.tar  Part 1  rncmon          CFPU-0
walle20141016  /stdsymp/walle20141016_0.tar  Part 1  rncuplane       CFPU-0
walle20141016  /stdsymp/walle20141016_0.tar  Part 1  rnchas          CFPU-0
walle20141016  /stdsymp/walle20141016_0.tar  Part 1  rncalarm        CFPU-0

Successfully collected symptom report data for 11 out of 11 subreports

The standard symptom report group-RNC is expected to be completed in around 15 minutes depending on Multicontroller RNC configuration.
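Before recovery actions are started, it is worth confirming that every subreport was collected. The following Python sketch reads the summary line of the command output shown above and compares the two counts. It assumes that the output has been saved as text and that the summary line keeps the same wording when fewer subreports succeed; that behaviour is not described in this document.

```python
import re

# Pattern for the summary line printed at the end of the command output.
SUMMARY = re.compile(
    r"Successfully collected symptom report data for (\d+) out of (\d+) subreports"
)


def collection_summary(output):
    """Return (collected, total) from the command output, or None if absent."""
    match = SUMMARY.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = "Successfully collected symptom report data for 11 out of 11 subreports"
collected, total = collection_summary(sample)
complete = collected == total
```

If the function returns None or the counts differ, review the command output before relying on the report.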

Note: When using the "to" and "from" options with the standard symptom report group-RNC, the syslog and syslog.1 files are not filtered according to the time range provided. The time range is not applicable to the syslog and syslog.1 files.

Recovery actions can be started, if needed, as soon as symptom report generation is completed. A copy of the symptom report can be transferred to a local machine later at a convenient time. When you send out a problem report, make sure that all the possible attachments are included in the problem report, to avoid unnecessary information requests.

3 Symptoms data collection

3.1 Standard symptoms report

Standard symptom report is a framework used to collect symptom data from Multicontroller RNC to support troubleshooting of problems. The framework collects standard symptoms data by running individual or multiple plugins. Some of the plugins are put into groups, thus it is possible to collect the symptom data on the group level too. The framework allows easy enhancement with additional plugins, if the available standard plugins are not sufficient. The collected symptoms data is stored in the /mnt/backup/stdsymp directory.

The following table provides details about standard plugins and their functionality:

Table 3 Supported plugins

Plugin Name Description

group-RNC It collects Multicontroller RNC configuration data and logs essential for investigation of the problem.

NOTICE: This data must always be collected if Problem Report is submitted to Nokia.

This is the report group containing the following sub-reports:
• rnchw
• rncinfo
• rncipconfig
• rncsignaling
• subreport-rncrnw
• subreport-rnchas
• subreport-rncalarm
• subreport-rncuplane
• subreport-rncmon

group-Backupandrestore It collects the following backup and restore-related logs:
• subreport-clusterinfo
• subreport-syslog
• subreport-corefiles
• subreport-processinfo
• subreport-clusterstatus

group-Networking It collects the following general statistics of the network:
• subreport-clusterinfo
• subreport-osinfo
• subreport-processinfo
• subreport-clusterstatus
• subreport-syslog
• subreport-corefiles
• subreport-alarmsinfo
• subreport-debug
• subreport-routing
• subreport-ipmgmt

group-SignalingProtocol It collects the following reports related to the signaling protocols and the signaling network manager:
• subreport-signalingSS7
• subreport-signalingSCCP
• signalingNetManager
• signalingPacketCaptureSctp
• signalingPacketCaptureSccpUser
• syslog
• corefiles
• subreport-alarmsinfo
• subreport-debug

group-IpalPlatform This is the report group containing the following sub-reports:
• IpalSignalling
• subreport-IpalCallManagement
• subreport-IpalCommon
• subreport-IpalTransport
• subreport-IpalBasicServices

subreport-alarmsinfo It collects alarm data for a selected period of time.

subreport-clusterinfo It collects cluster and node information.

subreport-clusterstatus It collects information about the status of cluster, nodes, recovery groups or units.

subreport-corefiles It collects core data available within the /crash directory of the active node.

subreport-databaseinfo It collects the database information such as Database Deployed, Disk Space, Postgres Configuration files, Process lists, and Enterprise Database details.

subreport-debug It collects debug data collected from master debug log.

subreport-fastpath It collects information from nodes which have 6windip stack. For example, ngctl.dat, fpdebug, fpstat.

subreport-ipmgmt It collects the IP networking configuration data.

subreport-licinfo It collects license management related information.

subreport-licdebug It collects license management related information.

subreport-ipsec It collects information about IKE template, IPsec rules, VPNs, and configuration files.

subreport-MessageMonitor It collects two message logs for the following traffic:
• call scenarios
• operation and maintenance (O&M) scenarios

Message monitoring parameters are stored in the msgmon.ini file under the /opt/nsn/SS_SysReport/stdsymp/plugins/ path. By default, the buffer size is set to 8 MB (0x8FFFFF), and the duration of the monitoring is 30 seconds. The message monitoring is performed in two runs: the first is for call scenarios, and the second is for O&M scenarios.

To modify the settings that are used to capture message logs with subreport-MessageMonitor, proceed as follows:

1. Switch to root user and bash shell (type exit later on to return to fsclish).

set user username root

Provide the password for root (default is root).

2. Copy the msgmon.ini file to /mnt/backup/share and modify it.

cp /opt/nsn/SS_SysReport/stdsymp/plugins/msgmon.ini /mnt/backup/share/msgmon.ini

To restore default settings, delete the msgmon.ini file from the /mnt/backup/share directory.

subreport-osinfo It collects information about operating system version, cluster uptime, disk usage, shared memory, build label, image variants, current snapshot name, and complete RPM information.

subreport-processinfo It collects process-related information.

subreport-routing It collects the current routing information, IP address, and routing instances.

subreport-signalingNetManager It collects the ring-based buffer trace files created for net-manager in the following location: /trace

subreport-signalingPacketCaptureSccpUser It collects the tcp dump pcap files created for capturing the flow of packets between sccp and sccp-user.

subreport-signalingPacketCaptureSctp It collects the tcp dump pcap files created for capturing the flow of packets at the SCTP level.

subreport-signalingSCCP It collects the trace files for SCCP subsystem, ldap configuration, and output of SCLI show command at the starting and stopping time of signaling diagnostics.

subreport-signalingSS7 It collects the trace files for SS7 subsystem, ldap configuration, and output of SCLI show command at the starting and stopping time of signaling diagnostics.

subreport-syslog It collects relevant data from syslog.

subreport-techservice It collects the output of some software commands like fsswcli, uname and so on, and also some scripts required by tech service people.

subreport-tracemgmt It collects tracing-related information useful for debugging purposes such as:
• build version
• tracing-related rpm versions
• configuration files
• LDAP fragments under tracing
• features related to tracing LDAP fragment
• contents of fptl_admin shared library
• list of trace files
• list of plugin libraries
• process states of NodeTraceManager (NTM) and ClusterTraceManager (CTM)
• list of buffers/admin buffers
• disk usage of tracing-related filesystems
• consistency of tracing configuration

subreport-tracesnapshot It collects the snapshot for default_platform buffer in the CLA-0 node.

subreport-IpalBasicServices It collects data on distributed computing services, such as functional unit states, name services information, and feature management data.

subreport-IpalCallManagement It collects information about call management and user plane management, such as call details, connection information, and user plane service allocation.

subreport-IpalCommon It collects general information for troubleshooting. This includes software build version, hardware configuration information, recovery units/recovery groups/node configuration, and basic counters.

subreport-IpalSignalling It collects information about signaling connections, including all NBAP/SCTP link information and all SIGTRAN link information.

subreport-IpalTransport It collects information about traffic transport that is handled by EIPU, such as transport forwarding table, GTP tunnel ID information, IPBR configuration.

subreport-rnchw It collects information on Multicontroller RNC functional units.

subreport-rncinfo It collects information on the software builds, disk space usage, available snapshots, recovery groups states, PRFILE parameters, and so on. This includes also relevant information from the syslog.

subreport-rncipconfig It collects information on the IP-related configuration, that is, IP interfaces, IP addresses, routing, user plane resources (ipbr and ipro objects), and so on.

subreport-rncsignaling It collects information on SIGTRAN-related configuration, that is, status of local and remote subsystems, local application server(s), SCTP associations, SCTP profiles, M3UA limits, SSP filter timers, and so on.

subreport-rncrnw It collects information on the RNW status, that is, BTSOM connections status, NBAP connection status, WBTS and WCEL objects status, IPNB object status, IUCS and IUCSIP object statuses, and IUPS and IUPSIP object statuses.

subreport-rnchas It collects information on the node, recovery groups, and recovery units. This includes also HAS related information from the syslog.

subreport-rncalarm It collects information about active alarms, alarms history, and so on.

subreport-rncuplane It collects information on the user plane configuration and status.

subreport-rncmon It collects information on control plane UE-specific monitoring for abnormal calls as part of HPL logging functionality.

3.2 Collecting standard symptom reports

Purpose

Follow this procedure to save (collect) the standard symptom reports.

Steps

1 Save or collect the standard symptom reports.

To save (collect) the standard symptom report, enter the following command:

save symptom-report name <report-name>

The full syntax of the command is:

save symptom-report name <report-name> {[exclude <exclude_file>] [include <include_file>] [quick-mode <yes-no>] [timeout <timeout_value>] [report-max-size <maximum total report size>] [single-file-max-size <chunk_size>]} [from date-time <from_date_value>] [to date-time <to_date_value>]

NOTICE: It is recommended to use normal-mode for the symptoms data collection for group-RNC, because for group-RNC the data collection in normal-mode completes within the time allocated for quick-mode.

Multiple simultaneous sessions of symptom report data collection with the same report name are not allowed. If a report file already exists, collection of a standard symptom report cannot be started with the same report name.

When using the "from date-time" and "to date-time" options with the standard symptom report group-RNC, the syslog and syslog.1 files are not filtered according to the time range provided. The time range is not applicable to syslog and syslog.1 files.

Symptom reports that are older than 10 days are removed from the file system automatically.

The parameters in the curly brackets are not allowed in the command syntax after the "from date-time" and "to date-time" parameters.

When retrieving symptom reports from the previous year, the values of the from date-time and to date-time parameters must not go back further than 333 days from the current date. The following messages are displayed according to the triggering scenarios encountered during command execution:

• Use case 1: The date specified in the from date-time parameter is from the previous year.

Message displayed:

From/To dates cannot be older than 333 days from current date
Date has to be given in specified format, refer usage

• Use case 2: The date specified in the from date-time parameter is from the previous year while the date specified in the to date-time parameter is in the current year.

Message displayed:

From/To dates cannot be older than 333 days from current date
Date has to be given in specified format, refer usage

If only the from date-time parameter or only the to date-time parameter is specified, the parameter that is not specified is calculated using a default offset of 10 days.
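The date rules above can be illustrated with a short Python sketch. It is not the product's implementation: the function names are hypothetical, and the direction of the 10-day default offset (a missing start taken as 10 days before the end, a missing end as 10 days after the start) is an assumption, since the text only gives the size of the offset.

```python
from datetime import datetime, timedelta

# Date formats accepted by the from/to date-time parameters (see Table 4).
DATE_FORMATS = ("%Y.%m.%d-%H:%M:%S", "%d.%m.%Y-%H:%M:%S")
MAX_AGE = timedelta(days=333)        # oldest date accepted, per the text
DEFAULT_OFFSET = timedelta(days=10)  # default offset, per the text


def parse_date(value):
    """Parse a date in either accepted format; raise ValueError otherwise."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError("Date has to be given in specified format: " + value)


def resolve_range(now, from_value=None, to_value=None):
    """Return (start, end) for the collection time range.

    Assumption: a missing start is end - 10 days and a missing end is
    start + 10 days. If neither is given, no range is applied.
    """
    supplied = [parse_date(v) for v in (from_value, to_value) if v]
    for date in supplied:
        if now - date > MAX_AGE:
            raise ValueError(
                "From/To dates cannot be older than 333 days from current date")
    start = parse_date(from_value) if from_value else None
    end = parse_date(to_value) if to_value else None
    if start is None and end is None:
        return None, None
    if start is None:
        start = end - DEFAULT_OFFSET
    if end is None:
        end = start + DEFAULT_OFFSET
    return start, end
```

For example, with a current date of 2013-08-06, calling resolve_range with only from_value="2013.08.05-09:00:00" yields a range that ends on 2013-08-15 at 09:00:00 under the stated assumption.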

For the description of the parameters of the save symptom-report command, see Table 4: Parameters related to save symptom-report command.

Table 4 Parameters related to save symptom-report command

Parameter Description

name <report-name> This mandatory parameter creates a single or multiple (it depends on subsequent options) tar file(s) with the provided report-name. The report(s) are stored under the directory /mnt/backup/stdsymp of the Multicontroller RNC.

The report file(s) generated contain subreport(s) that are compressed archive file(s) (with .tar.gz extension) containing one or more file(s).

In the report file(s), there is also one text file that summarizes executed commands, their execution status and time. Its format is: <report name>_<chunk>_summary.txt

Special characters are not allowed in report-name. A number is not allowed as the first character. The report name length must not exceed 25 characters.

[exclude <exclude_file>] If this optional parameter is used, the standard symptom report of the selected plugin or group is not collected.

[include <include_file>] If this optional parameter is used, the standard symptom report can be collected from a specified plugin or group.

[quick-mode <yes-no>] This optional parameter is used in emergencies to gather important and relevant information rather than having all information available. When quick mode is enabled, the symptom subreports automatically include data that is critical and quickly retrieved. If "yes" is selected, each subreport includes only data that is quick to collect. If "no" is selected, each subreport includes all data that is needed in a support case. The default value for this option is "no".

[single-file-max-size <chunk_size>] This optional parameter creates the final report in chunks or pieces of tar file, each with a maximum size specified as the argument. Each chunk is a tar file which contains reports that are independently analyzable. A report produced by a single subreport is not split even if the report size exceeds the specified maximum size. This implies that the size of a single file may sometimes be bigger than the specified maximum single file size. The default value for this option is 10 MB. A size of 0 means no limit.

[report-max-size <maximum total report size>] This optional parameter accepts the maximum allowed size limit (in MB) for the data generated. When the size of the data generated reaches the specified limit, the framework stops the data collection after the currently executing subreport completes its execution. Since the data generation is not abruptly stopped upon reaching the size limit, the resulting size of the symptom data collected may still exceed the specified limit. The default value for report-max-size is 30 MB. A size of 0 means no limit.

[timeout <timeout_value>] This optional parameter specifies the timeout value, in minutes, for symptom data collection. It allows you to stop the symptom data collection after the specified timeout value expires. When the timeout occurs, the plugin that is currently being processed fails to be collected, and all the queued plugins (if any) are skipped. The timeout is strictly followed except when plugins go into kernel mode. In kernel mode, delays occur because the signals are not processed until the plugin leaves kernel mode. The permitted range is 0-60. A value of 0 means there is no timeout. The default timeout is 30 minutes.

from date-time <from_date_value> This optional parameter specifies the date from which the standard symptom report is collected. The accepted date formats are YYYY.MM.DD-HH:MM:SS or DD.MM.YYYY-HH:MM:SS. When trying to collect symptom reports from the previous year, make sure the parameter does not go back further than 333 days from the current date.

to date-time <to_date_value> This optional parameter specifies the date till which the standard symptom report must be collected. The accepted date formats are YYYY.MM.DD-HH:MM:SS or DD.MM.YYYY-HH:MM:SS. When trying to collect symptom reports from the previous year, make sure the parameter does not go back further than 333 days from the current date.

Examples

a) To collect the basic symptom report with a specific name (RNC311), and with a subreport group (group-RNC), execute the following command:

save symptom-report name RNC311 include group-RNC

b) To collect the standard symptom report with a specific name (RNC311), with a subreport group (group-RNC), and with message monitoring (subreport-MessageMonitor), execute the following command:

save symptom-report name RNC311 include group-RNC include subreport-MessageMonitor

c) To collect the standard symptom report with a specific name (RNC311), with a subreport group (group-RNC), and from the given date (2013.08.05-09:00:00), execute the following command:

save symptom-report name RNC311 include group-RNC from date-time 2013.08.05-09:00:00

d) To collect the standard symptom report with a specific name (RNC311), with a subreport group (group-RNC), from the given date (2013.08.05-09:00:00) to a particular date (2013.08.06-09:00:00), execute the following command:

save symptom-report name RNC311 include group-RNC from date-time 2013.08.05-09:00:00 to date-time 2013.08.06-09:00:00

e) To collect the standard symptom report with a specific name (RNC311), with a subreport group (group-RNC), from the time the Multicontroller RNC was commissioned to a particular date (2013.08.06-09:00:00), execute the following command:

save symptom-report name RNC311 include group-RNC to date-time 2013.08.06-09:00:00

f) To collect all the standard symptom reports with a specific name (RNC311), execute the following command:

save symptom-report name RNC311

g) To collect the standard symptom report with a specific name (RNC311ipconfig), and with the specific subreport (subreport-rncipconfig), execute the following command:

save symptom-report name RNC311ipconfig include subreport-rncipconfig

h) To collect the standard symptom report with a specific name (RNC311), with a subreport group (group-RNC), and with the limited size of a single report file (5 MB), execute the following command:

save symptom-report name RNC311 include group-RNC single-file-max-size 5

i) To collect the standard symptom report with a specific name (RNC311), with a subreport group (group-RNC), and with the maximum size of the report (10 MB), execute the following command:

save symptom-report name RNC311 include group-RNC report-max-size 10

j) To collect the standard symptom report with a specific name (RNC311), with a subreport group (group-RNC), and with timeout for data collection (10 minutes), execute the following command:

save symptom-report name RNC311 include group-RNC timeout 10

k) To collect the standard symptom report with a specific name (RNC311), with a subreport group (group-RNC), and with the single subreport excluded from the group (subreport-rnchw), execute the following command:

save symptom-report name RNC311 include group-RNC exclude subreport-rnchw
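The examples above follow a regular pattern, which the following Python sketch assembles from arguments. It is an illustration only: the function name is hypothetical, the interpretation of "special characters" as anything other than letters, digits, and hyphens is an assumption (the sample listings in this document show report names with hyphens), and the option order simply mirrors the examples, with the from and to parameters placed last as required.

```python
import re

# Report-name rules from Table 4: first character is not a number, no special
# characters, at most 25 characters. Assumption: letters, digits, and hyphens
# are the allowed characters.
NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9-]{0,24}")


def build_save_command(name, include=(), exclude=(), quick_mode=None,
                       timeout=None, report_max_size=None,
                       single_file_max_size=None,
                       from_date=None, to_date=None):
    """Return a save symptom-report command line as a string."""
    if not NAME_RE.fullmatch(name):
        raise ValueError("invalid report name: " + name)
    if timeout is not None and not 0 <= timeout <= 60:
        raise ValueError("timeout must be in the range 0-60")
    parts = ["save", "symptom-report", "name", name]
    for plugin in include:
        parts += ["include", plugin]
    for plugin in exclude:
        parts += ["exclude", plugin]
    if quick_mode is not None:
        parts += ["quick-mode", "yes" if quick_mode else "no"]
    if timeout is not None:
        parts += ["timeout", str(timeout)]
    if report_max_size is not None:
        parts += ["report-max-size", str(report_max_size)]
    if single_file_max_size is not None:
        parts += ["single-file-max-size", str(single_file_max_size)]
    # The from/to parameters must come after all other options.
    if from_date:
        parts += ["from", "date-time", from_date]
    if to_date:
        parts += ["to", "date-time", to_date]
    return " ".join(parts)
```

Calling build_save_command("RNC311", include=["group-RNC"], exclude=["subreport-rnchw"]) reproduces the command shown in example k.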

Expected outcome:

The files are stored in the /mnt/backup/stdsymp directory, with the name used in the save symptom-report command.
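The report-max-size behavior described in Table 4 (collection stops only after the currently executing subreport finishes, so the total can exceed the limit) can be modelled with a few lines of Python. This is a simplified model, not the framework's code: the function name is hypothetical and the subreport sizes used with it are invented for illustration.

```python
def collect_with_size_limit(subreports, report_max_size_mb=30):
    """Model which subreports are collected under a report-max-size limit.

    subreports is an ordered list of (name, size_in_mb) pairs.
    A limit of 0 means no limit, as stated in Table 4.
    Returns (collected_names, total_size_mb).
    """
    collected = []
    total = 0
    for name, size in subreports:
        # The limit is checked before a subreport starts; a subreport that
        # has started always runs to completion.
        if report_max_size_mb and total >= report_max_size_mb:
            break
        collected.append(name)
        total += size
    return collected, total
```

With hypothetical sizes of 4, 5, 3, and 2 MB and a limit of 8 MB, the model collects the first two subreports and ends at 9 MB, which is above the limit because the second subreport was allowed to finish.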

3.3 Listing the standard symptom reports

Purpose

Follow this procedure to list the standard symptom report files, plugins, or groups available on the Multicontroller RNC.

Before you start

Default system username and password are "_nokadmin" / "system".

Steps

1 List the symptom reports.

To list all the standard symptom reports, enter the following command:

show symptom-report all

Sample Output

CFPU-0@RNC-311 [2013-08-06 07:52:43 +0200]

ReportName      FileName                       Chunk/Total  Store
--------------  -----------------------------  -----------  ------
AllRNC-311      /stdsymp/AllRNC-311_0.tar      part 1/2     CFPU-0
AllRNC-311      /stdsymp/AllRNC-311_1.tar      part 2/2     CFPU-0
MessageMonitor  /stdsymp/MessageMonitor_0.tar  part 1/1     CFPU-0
RNC-311         /stdsymp/RNC-311_0.tar         part 1/1     CFPU-0
rncmon          /stdsymp/rncmon_0.tar          part 1/1     CFPU-0
RoutingFails    /stdsymp/RoutingFails_0.tar    part 1/1     CFPU-0
history         /stdsymp/history_0.tar         part 1/1     CFPU-0

NOTICE: The Chunk/Total field displays the number of chunks present for the particular report. This assists the user in learning whether the files of the report have been listed properly.

2 List the detailed contents of a particular symptom report.

show symptom-report name <report-name>

Example

To list the particular standard symptom report (RNC-311), enter the following command:

show symptom-report name RNC-311

Sample Output

CFPU-0@RNC-311 [2013-08-06 07:57:14 +0200]

ReportName  FileName                Chunk/Total  Subreport     Store
----------  ----------------------  -----------  ------------  ------
RNC-311     /stdsymp/RNC-311_0.tar  part 1/1     rnchw         CFPU-0
RNC-311     /stdsymp/RNC-311_0.tar  part 1/1     rncinfo       CFPU-0
RNC-311     /stdsymp/RNC-311_0.tar  part 1/1     rncipconfig   CFPU-0
RNC-311     /stdsymp/RNC-311_0.tar  part 1/1     rncsignaling  CFPU-0
RNC-311     /stdsymp/RNC-311_0.tar  part 1/1     rncrnw        CFPU-0
RNC-311     /stdsymp/RNC-311_0.tar  part 1/1     rnchas        CFPU-0
RNC-311     /stdsymp/RNC-311_0.tar  part 1/1     rncalarm      CFPU-0
RNC-311     /stdsymp/RNC-311_0.tar  part 1/1     rncuplane     CFPU-0

3 Display the list of available plugins.

Example

To display the list of available plugins, enter the following command:

show symptom-report plugin-list

4 List the group of plugins based on the group names.

Example

To list the group of plugins based on the group names (group-RNC), enter the following command:

show symptom-report group group-RNC
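The Chunk/Total field in the show symptom-report all output can also be checked automatically. The following Python sketch parses a captured listing and reports any chunk that is missing. It assumes the whitespace-separated column layout and the "part N/M" notation shown in the sample output; the function name is hypothetical.

```python
import re

# One listing row: ReportName, FileName, "part N/M", Store.
ROW_RE = re.compile(r"^(\S+)\s+(\S+)\s+part\s+(\d+)/(\d+)\s+(\S+)$")


def missing_chunks(listing):
    """Return {report_name: [missing chunk numbers]} for incomplete reports."""
    seen = {}    # report name -> chunk numbers that appear in the listing
    totals = {}  # report name -> total number of chunks announced
    for line in listing.splitlines():
        match = ROW_RE.match(line.strip())
        if match is None:
            continue  # prompt, header, separator, or blank line
        name, _filename, chunk, total, _store = match.groups()
        seen.setdefault(name, set()).add(int(chunk))
        totals[name] = int(total)
    result = {}
    for name, total in totals.items():
        missing = sorted(set(range(1, total + 1)) - seen[name])
        if missing:
            result[name] = missing
    return result
```

An empty result means that every listed report has all of its chunks. A non-empty result names the reports whose files should be checked before they are copied off the network element.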

3.4 Copying standard symptoms reports to remote machine

Purpose

Follow this procedure to copy standard symptoms reports from Multicontroller RNC local storage to remote machine.

Expected outcome

The standard symptoms reports are successfully copied to external storage.

Steps

1 Switch to bash shell (type exit later on to return to fsclish).

shell bash full

Confirm the command by pressing "y".

2 Use SCP to copy standard symptoms report to remote machine. Note that the report files are stored in /mnt/backup/stdsymp directory.

scp /mnt/backup/stdsymp/<report_name>.tar <username>@<external server IP>:/<external server folder path>

Note that you can use wildcard symbol "*" in the <report_name> to copy all the report files from /mnt/backup/stdsymp directory.

If the same remote machine is used to store the reports from different Multicontroller RNCs, it is recommended to include specific Multicontroller RNC name and identifier in the folder name to differentiate the reports.

NOTICE: The report files can also be copied directly from the Multicontroller RNC to an external machine using the SFTP or SCP protocol. There is a default account, _nokadmin, which provides access to the Multicontroller RNC file system. The default password for the _nokadmin user is "system". The reports are stored under the /stdsymp directory.
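The recommendation to include the Multicontroller RNC name and identifier in the remote folder name can be sketched as a small Python helper that builds the scp argument list. The helper name, the "<name>-<id>" folder pattern, and the example values are assumptions for illustration; the remote folder must already exist on the server.

```python
import posixpath


def scp_arguments(report_files, username, server, base_folder, rnc_name, rnc_id):
    """Build an scp argument list with a per-RNC destination folder.

    Assumption: the per-RNC folder is named "<rnc_name>-<rnc_id>".
    """
    folder = posixpath.join(base_folder, "{}-{}".format(rnc_name, rnc_id))
    target = "{}@{}:{}/".format(username, server, folder)
    return ["scp"] + list(report_files) + [target]
```

For a report file /mnt/backup/stdsymp/RNC311_0.tar, a hypothetical user "user" on server 10.0.0.1, and base folder /reports, the helper produces the target user@10.0.0.1:/reports/RNC-311/ for an RNC with name RNC and identifier 311.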

3.5 Deleting the standard symptom reports

Purpose

Follow this procedure to delete the standard symptom reports.

Steps

1 Delete the particular standard symptom report.

To delete a particular standard symptom report, enter the following command:

delete symptom-report name <report-name>

Example

To delete a particular standard symptom report with name RNC311, enter the following command:

delete symptom-report name RNC311

Sample Output:

CFPU-0@RNC-311 [2013-08-06 08:29:10 +0200]

RNC311  /stdsymp/RNC311_0.tar  CFPU-0

2 Delete all the standard symptoms reports.

To delete all the standard symptom reports, execute the following command:

delete symptom-report all

Sample Output:

CFPU-0@RNC-311 [2013-08-06 08:32:00 +0200]

ReportName      FileName                       Store
--------------  -----------------------------  ------
AllRNC311       /stdsymp/AllRNC311_0.tar       CFPU-0
AllRNC311       /stdsymp/AllRNC311_1.tar       CFPU-0
MessageMonitor  /stdsymp/MessageMonitor_0.tar  CFPU-0
RoutingFails    /stdsymp/RoutingFails_0.tar    CFPU-0
history         /stdsymp/history_0.tar         CFPU-0
rncmon          /stdsymp/rncmon_0.tar          CFPU-0

4 Generic software troubleshooting instructions

4.1 Displaying all blackbox files

Purpose

This section provides instructions on how to display all the blackbox files in the system. These blackbox files are normally located in /srv/Log/crash folder.

Make sure you have the authority to the secondary group _nokfsuicrashlog and the permission fsASView. To avoid adding the user account to a long list of groups, it is recommended to add the permission fsASView to the group _nokfsuicrashlog using the following command, and then to assign the group _nokfsuicrashlog to the target user account:

add user-management group-to-permission gid _nokfsuicrashlog permid fsASView

Steps

1 Show all blackbox file groups for all crashed processes. Enter the following command:

show troubleshooting blackbox list

Expected outcome

All the blackbox names are listed. Each name indicates there is a process crash. An example of the output is as follows.

Jun 23 14:34 CFPU-0-21089-4e02db77-sokeri-ABRT
Jun 23 15:30 CFPU-0-21441-4e02eb8e-ilalarm-ABRT
Jun 24 14:24 CFPU-0-7052-4e042aaa-slapd-ABRT
Jul 18 13:15 CFPU-0-7905-4e23c149-lastproc-ABRT
Jul 18 13:15 CFPU-0-7871-4e23c188-starter-ABRT

4.2 Viewing active alarms and alarms history

The alarm system indicates potential faults in the system as well as faults that require corrective actions. After an alarm is raised, the fault causing the alarm must be solved. The solution can be an automatic recovery or a manual corrective action. Alarms are typically used in situations where it is possible to give instructions for corrective actions in the alarm description, such as replacing a hardware unit. The alarms can also be raised through the alarm system to indicate that the system is not working normally, for example, when the hard disk is full and the system cannot write to it. Such alarms are cleared automatically when the system returns to its normal state. You cannot clear these alarms manually.

Multicontroller RNC alarm handling

• Multicontroller RNC OMS provides the Fault Management application, which enables the user to perform operations such as:

– managing Multicontroller RNC, Multicontroller RNC OMS, and Flexi BTS alarms
– viewing Multicontroller RNC, Multicontroller RNC OMS, and Flexi BTS alarms history

For information on how to check and manage the alarms, see Managing Faults with OMS.

To check the active alarms and alarms history directly from Multicontroller RNC command line interface, enter the following commands:

• For information on the Multicontroller RNC specific alarms, check the following reference documentation:

Multicontroller RNC Notices (0-999)

Multicontroller RNC Disturbances (1000-1999)

Multicontroller RNC Failure Printouts (2000-3999)

Multicontroller RNC, IPA-RNC and I-HSPA Adapter Base Station Alarms (7000-7900)
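The number ranges in the titles above indicate which reference document covers a given alarm number. As a convenience, the lookup can be sketched in Python as follows. The function name is hypothetical, and the sketch covers only the four ranges listed here.

```python
# (first number, last number, reference document), taken from the list above.
ALARM_REFERENCES = [
    (0, 999, "Multicontroller RNC Notices (0-999)"),
    (1000, 1999, "Multicontroller RNC Disturbances (1000-1999)"),
    (2000, 3999, "Multicontroller RNC Failure Printouts (2000-3999)"),
    (7000, 7900,
     "Multicontroller RNC, IPA-RNC and I-HSPA Adapter Base Station Alarms (7000-7900)"),
]


def reference_for_alarm(number):
    """Return the reference document title for an alarm number, or None."""
    for first, last, title in ALARM_REFERENCES:
        if first <= number <= last:
            return title
    return None  # number is outside the ranges listed in this section
```

For example, alarm number 1234 falls in the Disturbances range, while a number such as 5000 is not covered by any of the listed documents and returns None.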

Flexi BTS alarm handling

• Flexi BTS alarms are managed using the OMS Fault Management application. As an alternative, they can also be checked by using BTS Site Manager. For more information, see Checking BTS alarms in Troubleshooting Flexi Multiradio BTS WCDMA.

• For information on the Flexi BTS specific alarms, see Flexi Multiradio BTS WCDMA Faults.

Multicontroller RNC OMS alarm handling

• Multicontroller RNC OMS alarms are managed using OMS Fault Management application.

• For information on the specific Multicontroller RNC OMS alarms, see: OMS Alarms.

4.3 Collecting core dump file information

Purpose

When an application crashes, it writes the contents of its execution environment (stacks, variables and so on) into a file. This file is known as a core dump. A core dump is useful for problem solving, since it provides the following details:
• The application's task during failure
• The application's execution environment

In addition to the core dump, the system also gathers information such as recent syslog entries and information about the node. This information is also saved to files and labeled the same way as the core files.

Steps

1 Open an SSH connection to Multicontroller RNC IP address.

2 List all the blackbox files in the system.

show troubleshooting blackbox list

These blackbox files are normally located in /srv/Log/crash folder.

3 Switch to root user and the bash shell.

set user username root

Provide the password for root user (default is root).

4 Collect the log and the core files.

To check the files that can be accessed from the node, where the log recovery group is running, enter the following command:

ls /srv/Log/crash

The filenames will be in the following format:

<node>-<pid>-<timestamp>-<name>-<signal>.<type>.gz

Where:
• node is the name of node where the crash occurred.
• pid is the process id of the crashed process.
• timestamp is the time of crash in seconds since 1970-01-01 00:00:00 UTC in hexadecimal notation.
• name is the name of the crashed process.
• signal is the signal that caused the process to dump the core.

type can be one of the following:
• core - the core dump file
• syslog - a snapshot of syslog entries from the time of crash
• debug - a snapshot of debug entries from the time of crash
• blackbox.tar - information about the node

5 Copy the files from the Multicontroller RNC to external machine.

scp /srv/Log/crash/<core-file-name> <username>@<remote machine IP address>:/<external machine folder path>/

6 Send the files to your Nokia Solutions and Networks representative.
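The file name format described in step 4 can be decoded programmatically, which is useful for turning the hexadecimal timestamp into a readable crash time. The following Python sketch is an illustration under stated assumptions: the function name is hypothetical, and the pattern assumes that the pid is decimal and that the signal contains only letters and digits, as in the sample names in this chapter.

```python
import re
from datetime import datetime, timezone

# <node>-<pid>-<timestamp>-<name>-<signal>[.<type>.gz]
# The node name itself may contain hyphens (for example CFPU-0).
CRASH_FILE_RE = re.compile(
    r"(?P<node>.+)-(?P<pid>\d+)-(?P<timestamp>[0-9a-fA-F]+)"
    r"-(?P<name>.+)-(?P<signal>[A-Za-z0-9]+)"
    r"(?:\.(?P<type>core|syslog|debug|blackbox\.tar)\.gz)?"
)


def parse_crash_name(filename):
    """Split a blackbox or core file name into its fields.

    Returns a dict with node, pid, name, signal, type (None for a bare
    blackbox name), and time (crash time as an aware UTC datetime).
    """
    match = CRASH_FILE_RE.fullmatch(filename)
    if match is None:
        raise ValueError("unexpected file name: " + filename)
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    # The timestamp is seconds since 1970-01-01 00:00:00 UTC, in hexadecimal.
    seconds = int(info.pop("timestamp"), 16)
    info["time"] = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return info
```

Applied to CFPU-0-21089-4e02db77-sokeri-ABRT.core.gz, the sketch returns node CFPU-0, pid 21089, process name sokeri, signal ABRT, type core, and a crash time of 2011-06-23 06:21:43 UTC.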

5 Configuration management troubleshooting

5.1 Saving configuration snapshot fails

5.1.1 Description

If there is no free space on the disk, saving the configuration snapshot is not possible.

5.1.2 Symptoms

The save snapshot command fails with an error message.

5.1.3 Recovery procedures

If saving configuration snapshot fails, follow either of these steps:

1 Check the existing configuration snapshots and delete the old configuration snapshots that are not needed.

To display the existing configuration snapshots, enter the following command:

show snapshot list

To delete a configuration snapshot that is no longer needed, enter the following command:

delete snapshot config-name <config_name>

2 Check the existing software volumes and delete the old software deliveries that are not needed.

To display the existing software volumes, enter the following command:

show sw-manage list

To delete the old software deliveries that are no longer needed, enter the following command:

delete sw-manage delivery

