More and more, partly because of competition from the usual IP telephony challengers, customers want to be able to monitor the activity of their Microsoft Lync servers in real time.
By monitoring, I mean collecting and displaying all the data and conditions relating to the operation and behavior of equipment and applications, including all the information and procedures required to facilitate the maintenance and smooth running of those applications.
For an organization that has SolarWinds, ready-made templates already exist to perform these tasks for the following roles:
Description: http://thwack.solarwinds.com/docs/DOC-155345
Description: http://thwack.solarwinds.com/docs/DOC-155347
Description: http://thwack.solarwinds.com/docs/DOC-155346
Front-End:
Service: Lync Server Audio Test Service
This component monitor returns the CPU and memory usage of the Lync Server Audio Test Service. This service offers users the ability to subjectively test the quality of a call before placing the call. The user checks the call quality by making a test call.
Service: Lync Server File Transfer Agent
This component monitor returns the CPU and memory usage of the Lync Server File Transfer Agent. The File Transfer Agent is responsible for replicating configuration settings with the Replica Replicator Agent that runs on every Lync Server.
Service: Lync Server Front-End
This component monitor returns the CPU and memory usage of the Front-End Lync Server. The Front-End Servers maintain transient information, such as logged-on state and control information for an IM, Web, or audio/video (A/V) conference.
Service: Lync Server IM Conferencing
This component monitor returns the CPU and memory usage of the Lync Server IM Conferencing. The IM Conferencing service is responsible for multiplexing the instant messages data feed from the leader to all participants in the session.
Service: Lync Server Master Replicator Agent
This component monitor returns the CPU and memory usage of the Lync Server Master Replicator Agent. This service is used by the File Transfer Agent to replicate configuration settings.
Service: Lync Server Replica Replicator Agent
This component monitor returns the CPU and memory usage of the Lync Server Replica Replicator Agent. This service is used by the File Transfer Agent to replicate configuration settings.
SIP Peers: Connections Active
This component monitor returns the number of established connections that are currently active. A connection is considered established when peer credentials are verified (e.g. via MTLS), or the peer receives a 2xx response. You will need to baseline this counter by testing and monitoring the user load. This returned value should be less than 15,000 connections per Front-End.
SIP Peers: TLS Connections Active
This component monitor returns the number of established TLS connections that are currently active. A TLS connection is considered established when the peer certificate, and possibly the host name, are verified for a trust relationship. You will need to baseline this counter by testing and monitoring the user load.
SIP Peers: Sends Outstanding
This component monitor returns the number of messages that are currently present in the outgoing queues. If you receive error message 504, investigate the results from this counter to see which servers are having problems. To do so, change the instance from _Total to the server hostname. You can check this within perfmon.exe.
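Switching the instance from _Total to a specific server amounts to rewriting the instance part of the counter path, which perfmon writes as \Object(Instance)\Counter. The following is a minimal Python sketch of that rewrite. The helper `with_instance`, the object and counter names, and the hostname are all placeholders for illustration, not the exact names exposed by Lync.

```python
import re

# Matches a counter path of the form \Object(Instance)\Counter.
_COUNTER_PATH = re.compile(
    r"^\\(?P<obj>[^\\(]+)\((?P<inst>[^)]*)\)\\(?P<counter>.+)$"
)


def with_instance(counter_path: str, instance: str) -> str:
    """Return counter_path with its instance part replaced by `instance`."""
    match = _COUNTER_PATH.match(counter_path)
    if match is None:
        raise ValueError(f"not an instanced counter path: {counter_path!r}")
    return f"\\{match['obj']}({instance})\\{match['counter']}"


# Placeholder object, counter, and host names (assumptions, for illustration).
path = r"\Example SIP Peers(_Total)\Sends Outstanding"
print(with_instance(path, "fe01.example.local"))
```

The resulting path can then be pasted into perfmon.exe or into the component monitor's instance field to narrow the reading to one server.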
SIP Peers: Average Outgoing Queue Delay
This component monitor returns the average time, in seconds, that messages have been delayed in outgoing queues. Check the Outgoing Queue Delay for delays in sending messages to other servers or clients that could be causing messages to be accumulated in the server. The server will drop client connections if it is in a throttle state and messages stay in the outgoing queue for more than 32 seconds.
SIP Peers: Flow-controlled Connections Dropped
This component monitor returns the total number of connections dropped because of excessive flow-control. You will need to baseline this counter by testing and monitoring the server's health. The returned value should be as low as possible.
SIP Peers: Average Flow-Control Delay
This component monitor returns the average delay, in seconds, in message processing when the socket is flow-controlled. You will need to baseline this counter by testing and monitoring the server's health. The returned value should be as low as possible.
SIP Peers: Incoming Requests/sec
This component monitor returns the rate of received requests, per second. You will need to baseline this counter by testing and monitoring the user load.
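The descriptions ask for a baseline on many counters without saying how to build one. One common approach, offered here as an assumption rather than as the method the templates use, is to collect samples under normal load and alert above the mean plus a few standard deviations. The sample values and the function name `baseline_threshold` are hypothetical.

```python
from statistics import mean, stdev


def baseline_threshold(samples: list[float], sigmas: float = 3.0) -> float:
    """Return mean + sigmas * sample standard deviation of the samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to estimate spread")
    return mean(samples) + sigmas * stdev(samples)


# Hypothetical Incoming Requests/sec readings taken during normal load.
normal_load = [100.0, 110.0, 90.0, 105.0, 95.0]
threshold = baseline_threshold(normal_load)
print(f"alert above {threshold:.1f} requests/sec")
```

The number of standard deviations is a tuning choice; a lower value alerts earlier but produces more false alarms.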
SIP Protocol: Incoming Messages/sec
This component monitor returns the rate of received messages, per second. You will need to baseline this counter by testing and monitoring the user load.
SIP Protocol: Events In Processing
This component monitor returns the number of SIP transactions, or dialog state change events, that are currently being processed. You will need to baseline this counter by testing and monitoring the user load.
SIP Responses: Local 500 Responses/sec
This component monitor returns the rate of 500 responses generated by the server, per second. This can indicate that there is a server component that is not functioning correctly.
SIP Responses: Local 503 Responses/sec
This component monitor returns the rate of 503 responses generated by the server, per second. The 503 code corresponds to the server being unavailable. On a healthy server, you should not receive this code at a steady rate. However, during ramp up, after a server has been brought back online, there may be some 503 responses. Once all users get back in and the server returns to a stable state, there should no longer be any 503 responses returned.
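To tell a short ramp-up burst apart from a steady rate of 503 responses, one option is to look for a run of consecutive non-zero samples. The sketch below assumes that "steady" means a non-zero rate for `window` polls in a row; that definition and the default of five polls are illustrative choices, not values from the template.

```python
def has_steady_503(rates: list[float], window: int = 5) -> bool:
    """True if the 503 rate stayed above zero for `window` consecutive samples."""
    run = 0
    for rate in rates:
        # Extend the current run on a non-zero sample, otherwise reset it.
        run = run + 1 if rate > 0 else 0
        if run >= window:
            return True
    return False


# A brief burst after a restart, followed by a return to zero.
print(has_steady_503([0, 2, 3, 0, 0, 0, 0]))
```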
SIP Responses: Local 504 Responses/sec
This component monitor returns the rate of 504 responses generated by the server, per second. A few 504 responses to clients (for clients disconnecting abruptly) are to be expected, but this counter mainly indicates connectivity issues with other servers. It can indicate connection failures or delays connecting to remote servers.
SIP Load Management: Average Holding Time For Incoming Messages
This component monitor returns the average time that the server held the incoming messages currently being processed. This should usually be less than one second, on average, but it is normal to see short spikes of up to three seconds. The server will throttle new incoming messages after going above the high watermark and until the number of messages falls below the low watermark. The server starts rejecting new connections when the average holding time is greater than the overload time of 15 seconds.
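The figures quoted above (one second, three seconds, 15 seconds) can be turned into a simple classification. This is a sketch only: the state labels are invented for illustration, and the "elevated" band between three and 15 seconds is an interpretation, since the description does not name it.

```python
def classify_holding_time(seconds: float) -> str:
    """Map an average holding time in seconds to an illustrative state label."""
    if seconds > 15:
        return "overloaded"  # server starts rejecting new connections
    if seconds > 3:
        return "elevated"    # beyond the spikes described as normal
    if seconds >= 1:
        return "spike"       # acceptable only if short-lived
    return "normal"


for reading in (0.4, 2.5, 8.0, 16.0):
    print(reading, classify_holding_time(reading))
```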
SIP Load Management: Address space usage
This component monitor returns the percentage of available address space currently in use by the server process. The returned value should be as low as possible.
SIP Load Management: Page file usage
This component monitor returns the percentage of available page file space currently in use by the server process. The returned value should be as low as possible.
IM Conferences: Active Conferences
This component monitor returns the number of active conferences. You will need to baseline this counter by testing and monitoring the user load.
IM Conferences: Connected Users
This component monitor returns the number of connected users in all conferences. You will need to baseline this counter by testing and monitoring the user load.
IM Conferences: Throttled Sip Connections
This component monitor returns the number of throttled SIP connections. If the value is greater than ten, it could indicate that the peer is not processing requests in a timely fashion. This can happen if the peer machine is overloaded. A peer is defined as a connected server, an adjacent Front-End server, or an MCU in the same EE Pool. The same set of counters applies.
IM MCU Health And Performance: MCU Health State
This component monitor returns the current health of the MCU.
Possible values:
0 = Normal.
1 = Loaded.
2 = Full.
3 = Unavailable.
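When the raw value is consumed by a script, the four health states listed above map naturally onto an enumeration. The class and function names below are hypothetical; only the numeric values and their meanings come from the list.

```python
from enum import IntEnum


class McuHealthState(IntEnum):
    NORMAL = 0
    LOADED = 1
    FULL = 2
    UNAVAILABLE = 3


def describe_health(value: int) -> str:
    """Return a readable label, or flag values outside the documented range."""
    try:
        return McuHealthState(value).name.capitalize()
    except ValueError:
        return f"Unknown ({value})"


print(describe_health(2))
```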
IM MCU Health And Performance: MCU Draining State
This component monitor returns the current draining status of the MCU.
Possible values:
0 = Not requesting to drain.
1 = Requesting to drain.
2 = Draining.
When a server is drained, it stops taking new connections and calls. These new connections and calls are routed through other servers in the pool. A server being drained allows its sessions on existing connections to continue until they naturally end. When all existing sessions have ended, the server is ready to be taken offline.
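The draining rule described above (wait until existing sessions end before taking the server offline) can be sketched as a small check. The `active_sessions` input is an assumption: it stands for whatever session or connected-user count you choose to read alongside the draining state.

```python
from enum import IntEnum


class McuDrainingState(IntEnum):
    NOT_REQUESTING = 0
    REQUESTING = 1
    DRAINING = 2


def ready_to_take_offline(draining_state: int, active_sessions: int) -> bool:
    """True only when the MCU is draining and no sessions remain.

    Raises ValueError if draining_state is not one of the documented values.
    """
    state = McuDrainingState(draining_state)
    return state is McuDrainingState.DRAINING and active_sessions == 0


print(ready_to_take_offline(2, 0))
```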
User Services - DBStore: Queue Latency (msec)
This component monitor returns the average time, in milliseconds, that a request is held in the database queue. This counter represents the time that a request spends in the queue of the Back-End Database Server. If the topology is healthy, this counter averages less than 100 ms. Occasional spikes are acceptable. The value will be higher on Front-End Servers that are located at the site opposite the location of the Back-End Database Servers. This value can increase if the Back-End Database Server is having performance problems or if network latency is too high. If the returned value is high, check both network latency and the health of the Back-End Database Server. Server health decreases as latency increases to 12 seconds, when server throttling begins.
User Services - DBStore: Sproc Latency (msec)
This component monitor returns the average time, in milliseconds, it takes to execute a stored procedure call. A healthy state is considered to be less than 100 ms. Server health decreases as latency increases to 12 seconds, when server throttling begins.
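Both DBStore counters share the same two reference points: under 100 ms is healthy, and throttling begins at 12 seconds. A minimal sketch of that scale follows; the label names and the single "degraded" band in between are illustrative assumptions.

```python
HEALTHY_MS = 100        # healthy average quoted in the descriptions
THROTTLE_MS = 12_000    # 12 seconds, where server throttling begins


def classify_db_latency(latency_ms: float) -> str:
    """Classify a DBStore queue or sproc latency given in milliseconds."""
    if latency_ms >= THROTTLE_MS:
        return "throttling"
    if latency_ms >= HEALTHY_MS:
        return "degraded"
    return "healthy"


for reading in (40, 750, 12_500):
    print(reading, classify_db_latency(reading))
```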
User Services - Https Transport: Number of failed connection attempts / Sec
This component monitor returns the rate of connection attempt failures, per second. You will need to baseline this counter by testing and monitoring the server's health.
Mediation:
Service: Lync Server Replica Replicator Agent
This component monitor returns the CPU and memory usage of the Lync Server Replica Replicator Agent. This service is used by the File Transfer Agent to replicate configuration settings.
Outbound Calls: Current
This component monitor returns the total number of active calls going through the Mediation Server.
Outbound Calls: Active media bypass calls
This component monitor returns the total number of active calls going through Mediation Server that are in Media Bypass mode. Calls using Media Bypass use significantly fewer Mediation Server resources because the media is not flowing through the Mediation Server.
Inbound Calls: Current
This component monitor returns the number of inbound calls in progress.
Inbound Calls: Active media bypass calls
This component monitor returns the number of media bypass calls in progress.
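Because bypass calls use fewer Mediation Server resources, the share of current calls that are in Media Bypass mode is a useful derived figure. The sketch below assumes the bypass count is a subset of the current call count, as the outbound description suggests; the function name is hypothetical.

```python
def bypass_share(current_calls: int, bypass_calls: int) -> float:
    """Return the fraction of current calls that use media bypass (0.0 if idle)."""
    if current_calls <= 0:
        return 0.0
    return bypass_calls / current_calls


# Hypothetical readings: 40 active calls, 30 of them in bypass mode.
print(f"{bypass_share(40, 30):.0%} of calls bypass the Mediation Server")
```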
Media Relay: Media Connectivity Check Failure
This component monitor returns the number of calls where media connectivity between the Mediation Server and the remote endpoints could not be established. The returned value should be as low as possible.
Health Indices: Load Call Failure Index
This component monitor returns a scaled index, between zero and 100, related to call failures caused by heavy load as reflected in the Global Health Index.
Global Counters: Current audio channels with PSM quality reporting
This component monitor returns the total number of active channels that are having Phase Shift Modulation (PSM) quality reported. Calculating PSM quality has a processing overhead so this should be taken into account when measuring performance.
Total failed calls caused by unexpected interaction from the Proxy
This component monitor returns the number of calls that failed because of an unexpected response from the Front End Server. The returned value should be as low as possible.
Total failed calls caused by unexpected interaction from a gateway
This component monitor returns the number of calls that failed because of an unexpected response from a gateway peer. The returned value should be as low as possible.
Edge:
Service: Lync Server Replica Replicator Agent
This component monitor returns the CPU and memory usage of the Lync Server Replica Replicator Agent. This service is used by the File Transfer Agent to replicate configuration settings.
SIP Peers: Connections Active
This component monitor returns the number of established connections that are currently active. A connection is considered established when peer credentials are verified (e.g. via MTLS), or the peer receives a 2xx response. You will need to baseline this counter by testing and monitoring the user load. This returned value should be less than 15,000 connections per Front-End.
SIP Peers: TLS Connections Active
This component monitor returns the number of established TLS connections that are currently active. A TLS connection is considered established when the peer certificate, and possibly the host name, are verified for a trust relationship. You will need to baseline this counter by testing and monitoring the user load.
SIP Peers: Average Outgoing Queue Delay
This component monitor returns the average time, in seconds, that messages have been delayed in outgoing queues. Check the Outgoing Queue Delay for delays in sending messages to other servers or clients that could be causing messages to be accumulated in the server. The server will drop client connections if it is in a throttle state and messages stay in the outgoing queue for more than 32 seconds.
SIP Peers: Incoming Requests/sec
This component monitor returns the rate of received requests, per second. You will need to baseline this counter by testing and monitoring the user load.
SIP Protocol: Incoming Messages/sec
This component monitor returns the rate of received messages, per second. You will need to baseline this counter by testing and monitoring the user load.
SIP Load Management: Average Holding Time For Incoming Messages
This component monitor returns the average time that the server held the incoming messages currently being processed. This should usually be less than one second, on average, but it is normal to see short spikes of up to three seconds. The server will throttle new incoming messages after going above the high watermark and until the number of messages falls below the low watermark. The server starts rejecting new connections when the average holding time is greater than the overload time of 15 seconds.
SIP Access Edge Server: External Messages/sec With Internally Supported Domain
This component monitor returns the per-second rate of messages received at the external edge with an internally supported domain.
SIP Access Edge Server: External Messages/sec Received With Allowed Partner Server Domain
This component monitor returns the per-second rate of messages received at the external edge with an allowed partner server domain.
SIP Access Edge Server: External Messages/sec Received With a Configured Allowed Domain
This component monitor returns the per-second rate of messages received at the external edge with a configured allowed domain.
A/V Edge UDP: Active Relay Sessions - Authenticated
This component monitor returns the number of active relay sessions over UDP.
A/V Edge UDP: Active Relay Sessions - Allocated Port
This component monitor returns the number of active relay sessions with a UDP port allocation.
A/V Edge UDP: Active Relay Sessions - Data
This component monitor returns the number of active relay data sessions over UDP.
A/V Edge UDP: Allocated Port Pool Count
This component monitor returns the number of UDP ports available in the Allocated Port Pool. The returned value should be greater than zero. If it reaches zero, there is a resource issue.
A/V Edge UDP: Allocate Requests/sec
This component monitor returns the per-second rate of Allocate Requests over UDP. You will need to baseline this counter by testing and monitoring the user load.
A/V Edge UDP: Authentication Failures/sec
This component monitor returns the per-second rate of failed attempts to authenticate with the relay over UDP. The returned value should be as low as possible.
A/V Edge UDP: Allocate Requests Exceeding Port Limit
This component monitor returns the number of allocate requests over UDP that exceeded the port limit. If the value is greater than zero, this could indicate an attempt to misuse the port.
A/V Edge UDP: Packets Received/sec
This component monitor returns the number of packets received per second by the relay over UDP. You will need to baseline this counter by testing and monitoring the user load.
A/V Edge UDP: Packets Sent/sec
This component monitor returns the number of packets sent per second by the relay over UDP. You will need to baseline this counter by testing and monitoring the user load.
A/V Edge UDP: Average Data Packet Latency (milliseconds)
This component monitor returns the average latency for a valid data request over UDP in milliseconds. The returned value should be as low as possible.
A/V Edge UDP: Packets Dropped/sec
This component monitor returns the per-second rate of packets over UDP dropped by the relay. The returned value should be as low as possible.
This error occurs when an unexpectedly high rate of User Datagram Protocol (UDP) packets is received at the Media Relay (A/V Edge server) causing some packets to be discarded. This could be the result of system overload or an indication of an attempt to misuse the MR.
To resolve this, check that the profile of network traffic to the MR is in line with expected usage. If the traffic exceeds 250 Mbps per interface, increase the Receive and Transmit buffer sizes on the associated network adapters to three times the default values.
If the cause is a general system overload, increase the capacity of the deployed MR function. A network level trace can be used to determine if there is an unusual amount of traffic originating from a single source. If the situation persists, enable tracing to check the network source of sessions exceeding the bandwidth limits to allow further troubleshooting of the cause.
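The buffer guidance above reduces to a single rule: above 250 Mbps per interface, use three times the default buffer size. A minimal sketch follows; the default of 512 in the example is a made-up figure, since actual defaults depend on the network adapter and driver.

```python
TRAFFIC_LIMIT_MBPS = 250  # per-interface figure quoted in the guidance
BUFFER_MULTIPLIER = 3     # "three times the default values"


def recommended_buffer_size(default_size: int, traffic_mbps: float) -> int:
    """Return the tripled buffer size above the traffic limit, else the default."""
    if traffic_mbps > TRAFFIC_LIMIT_MBPS:
        return default_size * BUFFER_MULTIPLIER
    return default_size


# Hypothetical adapter default of 512 buffers with 300 Mbps of relay traffic.
print(recommended_buffer_size(512, 300))
```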
A/V Edge TCP: Active Relay Sessions - Authenticated
This component monitor returns the number of active relay sessions over TCP.
A/V Edge TCP: Active Relay Sessions - Allocated Port
This component monitor returns the number of active relay sessions with a TCP port allocation.
A/V Edge TCP: Active Relay Sessions - Data
This component monitor returns the number of active relay data sessions over TCP.
A/V Edge TCP: Allocated Port Pool Count
This component monitor returns the number of TCP ports available in the Allocated Port Pool. The returned value should be greater than zero. If it reaches zero, there is a resource issue.
A/V Edge TCP: Allocate Requests/sec
This component monitor returns the per-second rate of Allocate Requests over TCP. You will need to baseline this counter by testing and monitoring the user load.
A/V Edge TCP: Authentication Failures/sec
This component monitor returns the per-second rate of failed attempts to authenticate with the relay over TCP. The returned value should be as low as possible.
A/V Edge TCP: Allocate Requests Exceeding Port Limit
This component monitor returns the number of allocate requests over TCP that exceeded the port limit. If the value is greater than zero, this could indicate an attempt to misuse the port.
A/V Edge TCP: Packets Received/sec
This component monitor returns the number of packets received per second by the relay over TCP. You will need to baseline this counter by testing and monitoring the user load.
A/V Edge TCP: Packets Sent/sec
This component monitor returns the number of packets sent per second by the relay over TCP. You will need to baseline this counter by testing and monitoring the user load.
A/V Edge TCP: Average Data Packet Latency (milliseconds)
This component monitor returns the average latency for a valid data request over TCP in milliseconds. The returned value should be as low as possible.
A/V Edge TCP: Packets Dropped/sec
This component monitor returns the per-second rate of packets over TCP dropped by the relay. The returned value should be as low as possible.
This error occurs when an unexpectedly high rate of Transmission Control Protocol (TCP) packets is received at the Media Relay (A/V Edge server), causing some packets to be discarded. This could be the result of system overload or an indication of an attempt to misuse the MR.
To resolve this, check that the profile of network traffic to the MR is in line with expected usage. If the traffic exceeds 250 Mbps per interface, increase the Receive and Transmit buffer sizes on the associated network adapters to three times the default values.
If the cause is a general system overload, increase capacity of the deployed MR function. A network level trace can be used to determine if there is an unusual amount of traffic originating from a single source. If the situation persists, enable tracing to check the network source of sessions