Le Cloud de Christophe BOUCETTA

A blog on Microsoft unified communications and collaboration, by an MVP of 11 years

Yes, we still have plenty of customers to support who use Skype for Business as their telephony system, even though new projects of this kind are becoming rarer.

Context

We have a recurring problem in one organization: the services stop at random, with no obvious cause.

The following messages appear in the Event Viewer each time the problem occurs:

--------------------------------------------

Log Name:      Lync Server

Source:        LS Protocol Stack

Date:          7/27/2018 10:47:38 AM

Event ID:      14397

Task Category: (1001)

Level:         Warning

Keywords:      Classic

User:          N/A

Computer:      FE1.contoso.com

Description:

A configured certificate could not be loaded from store. The serial number is attached for reference.

Extended Error Code: 0x80092004.

 --------------------------------------------

Log Name:      Lync Server

Source:        LS Protocol Stack

Date:          7/27/2018 10:47:38 AM

Event ID:      14623

Task Category: (1001)

Level:         Error

Keywords:      Classic

User:          N/A

Computer:      FE1.contoso.com

Description:

A serious problem related to certificates is preventing Skype for Business Server from functioning.

 Unable to use a certificate as configured.

Transport:TLS, IP address:0.0.0.0, Port:5061, Error:0xC3E93C0D(SIP_E_STACK_TRANSPORT_CERT_NOT_FOUND).

Ensure that a valid certificate is present in the local computer certificate store. Also ensure that the server has sufficient privileges to access the store. The Skype for Business Server failed to initialize with the configured certificate.

 --------------------------------------------
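To spot each occurrence in a longer log, the pairing of the two events above can be sketched in Python. This is only a minimal sketch: it assumes the events were exported as plain text with the same field labels and the same US date format as the excerpts above, and the function names `parse_events` and `certificate_failures` are hypothetical.

```python
import re
from datetime import datetime

# Event IDs taken from the two excerpts above.
WATCHED_IDS = {14397, 14623}


def parse_events(text):
    """Return (timestamp, event_id) pairs from a plain-text event export.

    Assumes each record starts with a 'Log Name:' line and carries
    'Date:' and 'Event ID:' lines formatted as in the excerpts above.
    """
    events = []
    for record in re.split(r"(?m)^\s*Log Name:", text)[1:]:
        date = re.search(r"(?m)^\s*Date:\s*(.+?)\s*$", record)
        event_id = re.search(r"(?m)^\s*Event ID:\s*(\d+)\s*$", record)
        if date and event_id:
            when = datetime.strptime(date.group(1), "%m/%d/%Y %I:%M:%S %p")
            events.append((when, int(event_id.group(1))))
    return events


def certificate_failures(events):
    """Return the sorted timestamps at which both watched IDs were logged."""
    seen = {}
    for when, event_id in events:
        if event_id in WATCHED_IDS:
            seen.setdefault(when, set()).add(event_id)
    return sorted(when for when, ids in seen.items() if ids == WATCHED_IDS)
```

Calling `certificate_failures(parse_events(exported_text))` returns one timestamp per occurrence, which makes it easier to compare the outages against other scheduled activity on the server.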

With 4,000 users in this environment, this is a major problem with direct business impact. None of the certificate checks revealed an issue with the existing certificates assigned to the server, nor an intermediate or root certificate sitting in the wrong store.

Cause

In the end, the cause turned out to be the following:

When a server running the Lync/SfB service is joined to Azure AD, the cert store updates every 5 hours for a new cert; this causes a condition in SfB where it fails to find its cert after the re-sync.

Azure AD joined devices get a new certificate every 5 hours. This causes a cert store change notification to fire, which causes Skype to resync the cert store and validate that its cert still exists in the cert store. Occasionally, when this process occurs, Skype can no longer find its cert in the cert store handle it has, and this is a fatal failure which causes the Skype FE to shut down. We just haven't been able to determine why that last piece would fail, as the cert is indeed in the physical store and we can see the re-sync read it from the physical store. Understanding that last bit is not a trivial task, especially if, as we expect, some sort of race condition is hit.

You can confirm whether the machine was joined to Azure AD with the following command on one of the FE servers:

C:\Windows\system32>dsregcmd /status

Device State                                                         |

+----------------------------------------------------------------------+

        AzureAdJoined : YES
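With several Front End servers to check, the same test can be scripted. The sketch below is a minimal Python example, assuming `dsregcmd /status` prints `Name : Value` lines as in the output above; the helper names `parse_dsregcmd_status`, `is_azure_ad_joined` and `check_local_server` are hypothetical, and `check_local_server` only works when run on the Windows server itself.

```python
import re
import subprocess


def parse_dsregcmd_status(text):
    """Collect 'Name : Value' pairs from dsregcmd /status output."""
    fields = {}
    for line in text.splitlines():
        # Box-drawing lines ('+----+', '| Device State |') do not match.
        match = re.match(r"^\s*([A-Za-z0-9]+)\s*:\s*(.*?)\s*$", line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def is_azure_ad_joined(text):
    """Return True when the output reports AzureAdJoined as YES."""
    fields = {k.lower(): v for k, v in parse_dsregcmd_status(text).items()}
    return fields.get("azureadjoined", "").upper() == "YES"


def check_local_server():
    """Run dsregcmd /status on the local machine and test the result."""
    result = subprocess.run(
        ["dsregcmd", "/status"], capture_output=True, text=True, check=True
    )
    return is_azure_ad_joined(result.stdout)
```

The parsing is kept separate from the command call so that it can also be applied to output collected from each server by other means.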

 

Response from MS support:

If the output shows AzureAdJoined as YES, then it is confirmed that the server has been added to Azure AD. By default, Windows Server 2016 has a built-in feature that adds servers to AAD once its prerequisites are satisfied.

This causes SfB to hit a condition where the re-sync of the configured certificate fails randomly, causing the FE service to go down.

Resolution

Fairly obvious: the next step is to remove the Front End servers from Azure AD.
