[Linux][Alpine 3.23] SIGSEGV in gss_init_sec_context during concurrent SQL Server SSPI authentication #123782

@jeremy-doss

Description

A segmentation fault occurs in the MIT Kerberos native libraries when multiple SQL Server connections concurrently perform SSPI authentication (Integrated Security) on Alpine Linux 3.23. The crash occurs during the P/Invoke call from NegotiateStreamPal.GssInitSecurityContext into gss_init_sec_context() in libgssapi_krb5.so.

The regression appeared when Alpine updated its krb5 packages in version 3.23. The same workload runs without issue on Alpine 3.20, 3.21, and 3.22.

Our base image is mcr.microsoft.com/dotnet/aspnet:8.0-alpine, so our CI picked up the move to Alpine 3.23 automatically a few weeks ago, and the application began crashing under load.

Reproduction Steps

  1. Create a .NET 8 application that connects to SQL Server using Integrated Security
  2. Deploy to a container based on mcr.microsoft.com/dotnet/aspnet:8.0-alpine (which currently resolves to Alpine 3.23)
  3. Apply concurrent load that causes connection pool expansion (4+ simultaneous new connections)
  4. Application terminates with SIGSEGV
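
The steps above could be approximated with a small console program along the following lines. This is a minimal sketch, not the application from this report: the server name and database are hypothetical placeholders, the count of 16 parallel logins is simply borrowed from the thread count seen in Dump 1, and Pooling=false is an assumption made here so that every Open() performs a fresh SSPI login instead of reusing a pooled connection (the original workload used pooling through EF Core). It may or may not trigger the crash in a given environment.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Data.SqlClient;

// Hypothetical server and database; replace with a SQL Server instance that
// accepts Kerberos (Integrated Security) logins from this container.
const string ConnectionString =
    "Server=sqlserver.example.com;Database=master;Integrated Security=true;" +
    "Pooling=false;TrustServerCertificate=true";

// Assumption: 16 parallel logins, mirroring the thread count in Dump 1.
const int ParallelLogins = 16;

// Start one thread-pool work item per login so the SSPI handshakes overlap.
Task[] logins = Enumerable.Range(0, ParallelLogins)
    .Select(i => Task.Run(() =>
    {
        using var connection = new SqlConnection(ConnectionString);
        connection.Open(); // performs the SSPI login
        Console.WriteLine($"login {i} succeeded");
    }))
    .ToArray();

await Task.WhenAll(logins);
```

Running the same binary on the 8.0-alpine3.22 and 8.0-alpine images would give a direct comparison between the two Alpine releases.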

Expected behavior

Concurrent SSPI authentication completes successfully, as it does on Alpine 3.22.

Actual behavior

Process terminates with SIGSEGV (signal 11) during GSS-API security context initialization.

Exception Type: 0x20000000 (CLR signal-based exception)

Regression?

Yes. The previous release of mcr.microsoft.com/dotnet/aspnet:8.0-alpine, which was based on Alpine 3.22, does not show the issue. We confirmed this by changing only the base image to mcr.microsoft.com/dotnet/aspnet:8.0-alpine3.22, which resolved the crash.

Known Workarounds

Pin the base image to mcr.microsoft.com/dotnet/aspnet:8.0-alpine3.22 so that the container stays on Alpine 3.22.
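
In a Dockerfile, the pin amounts to changing the tag on the runtime stage. The fragment below is a sketch of that single line; the rest of the Dockerfile is assumed to stay unchanged.

```dockerfile
# Use the Alpine 3.22 variant explicitly instead of the floating 8.0-alpine
# tag, which now resolves to Alpine 3.23.
FROM mcr.microsoft.com/dotnet/aspnet:8.0-alpine3.22
```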

Configuration

Component                  Version
.NET Runtime               8.0.23 (8.0.2325.60607)
Microsoft.Data.SqlClient   5.x (via EF Core)
OS                         Alpine Linux 3.23 (x64)
C Library                  musl libc (ld-musl-x86_64.so.1)
libkrb5.so                 3.3
libgssapi_krb5.so          2.2
libkrb5support.so          0.1
Container Base             mcr.microsoft.com/dotnet/aspnet:8.0-alpine

Native Module Build IDs

From the crash dump:

/usr/lib/libkrb5.so.3.3          Build ID: 77009165be6662630a3749ba1e753130a5e2db4c
/usr/lib/libgssapi_krb5.so.2.2   Build ID: 3d4d5273b668d9a2b0eed2a57b4810429079d683
/usr/lib/libkrb5support.so.0.1   Build ID: e7c672f5159054fff41976fa2877d7d647dfb330

Other information

Crash Analysis

Call Stack (from core dump)

IL_STUB_PInvoke(Status ByRef, SafeGssCredHandle, SafeGssContextHandle ByRef, ...)
System.Net.Security.NegotiateStreamPal.GssInitSecurityContext(...)
System.Net.Security.NegotiateStreamPal.EstablishSecurityContext(...)
System.Net.Security.NegotiateStreamPal.InitializeSecurityContext(...)
Microsoft.Data.SqlClient.SNI.SNIProxy.GenSspiClientContext(...)
Microsoft.Data.SqlClient.TdsParser.SNISSPIData(...)
Microsoft.Data.SqlClient.TdsParser.ProcessSSPI(...)
Microsoft.Data.SqlClient.SqlInternalConnectionTds.AttemptOneLogin(...)
Microsoft.Data.SqlClient.SqlInternalConnectionTds.LoginNoFailover(...)
Microsoft.Data.ProviderBase.DbConnectionPool.CreateObject(...)
Microsoft.Data.ProviderBase.DbConnectionPool.WaitForPendingOpen()

SafeHandle Corruption Evidence

Multiple SafeGssContextHandle objects show inconsistent state at crash time:

SafeGssContextHandle:
  handle = 0x0000000000000000  (NULL - invalid)
  _state = 4                    (Closed/Disposed)
  _fullyInitialized = true      (Inconsistent with null handle)

This pattern is consistent with memory corruption or a race condition in the native GSS-API layer, although it does not prove either on its own.

Parallel Stacks

Analysis of two independent crash dumps shows:

  • Dump 1: 16 threads blocked in GssInitSecurityContext, 24 active SQL connections
  • Dump 2: 4 threads blocked in GssInitSecurityContext, 5 active SQL connections

The crash occurs at both high and low concurrency levels.

Root Cause Analysis

The issue appears to be a thread-safety bug in the MIT Kerberos libraries (libkrb5, libgssapi_krb5) shipped with Alpine 3.23. When multiple threads concurrently call gss_init_sec_context():

  1. The native credential cache or GSS context structures are accessed without proper synchronization
  2. One thread corrupts or frees memory that another thread is actively using
  3. This results in SIGSEGV when the corrupted pointer is dereferenced

The .NET libSystem.Net.Security.Native.so interop layer calls into these native libraries, and the crash occurs before control returns to managed code.
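
One way to test the thread-safety hypothesis would be to serialize the logins and check whether the crash disappears. The sketch below is a diagnostic experiment only, not a confirmed workaround and not something tried in this report: the connection string is a hypothetical placeholder, Pooling=false is assumed so that each Open() performs its own SSPI login, and the gate only covers logins started by this code.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Data.SqlClient;

// Hypothetical server and database; replace with a real Kerberos-enabled instance.
const string ConnectionString =
    "Server=sqlserver.example.com;Database=master;Integrated Security=true;" +
    "Pooling=false;TrustServerCertificate=true";

// Diagnostic gate: at most one login started by this program runs at a time,
// so calls into gss_init_sec_context() from these logins never overlap.
var loginGate = new SemaphoreSlim(1, 1);

Task[] logins = Enumerable.Range(0, 16)
    .Select(i => Task.Run(() =>
    {
        loginGate.Wait();
        try
        {
            using var connection = new SqlConnection(ConnectionString);
            connection.Open(); // SSPI login happens while the gate is held
            Console.WriteLine($"login {i} succeeded");
        }
        finally
        {
            loginGate.Release();
        }
    }))
    .ToArray();

await Task.WhenAll(logins);
```

If the gated run stays stable on Alpine 3.23 while the ungated run crashes, that would support the concurrency explanation. A crash with the gate in place would point toward a different cause.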
