@dirkbro
Contributor

@dirkbro dirkbro commented Dec 3, 2025

Summary
This PR fixes a bug in B2B entity replication (clusterer + b2b_logic) where the out_sdp field is pushed twice into the binary packet during packing, but consumed only once during unpacking. The mismatch corrupts the subsequent fields in the packet, including entity->no, and breaks B2B tuple synchronization across cluster nodes.

After this change:

  • pack_entity() and receive_entity_create() have matching field sequences
  • entity->no is correctly restored as 0 or 1
  • The Bad entity bridge no [...] errors no longer occur in normal operation
  • opensips-cli -x mi b2b_list reports a consistent view of tuples across all nodes.

Details
Environment:

  • Cluster of 3 SBC nodes
  • BIN clustering (proto=bin) with B2B state replication enabled
  • Modules:
    • b2b_logic
    • b2b_entities
    • clusterer

Problem description:

When packing B2B entities for cluster replication in pack_entity(), the out_sdp field is currently serialized twice:
// Around line 133 - entity_storage.c : pack_entity()

	if (event_type == B2B_EVENT_CREATE) {

		...

		bin_push_str(storage, &entity->hdrs);
		bin_push_str(storage, &entity->out_sdp);       // First time

		bin_push_str(storage, &entity->dlginfo->callid);
		bin_push_str(storage, &entity->dlginfo->fromtag);
		bin_push_str(storage, &entity->dlginfo->totag);
	}

	bin_push_str(storage, &entity->out_sdp);      // Second time - this is the bug

However, in receive_entity_create(), out_sdp is deserialized only once, with the expected sequence being:

// Around line 344 (unpacking) - entity_storage.c : receive_entity_create()
	bin_pop_str(storage, &hdrs);
	bin_pop_str(storage, &sdp);

	...

	bin_pop_str(storage, &dlginfo.callid);
	bin_pop_str(storage, &dlginfo.fromtag);
	bin_pop_str(storage, &dlginfo.totag);

Because the pack side writes out_sdp twice but the unpack side reads it only once, the binary stream becomes misaligned and all subsequent fields are read from the wrong offset. In clustered deployments this corrupts the reconstructed entity, including entity->no, which leads to errors such as:

  ERROR:b2b_logic:receive_entity_create: Bad entity bridge no [21349] for tuple [501.0]
  ERROR:b2b_logic:receive_entity_create: Failed to process received entity [B2B.XXX...]
  ERROR:b2b_logic:receive_entity_ack: Tuple [501.0] not found
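
To illustrate the misalignment in isolation, here is a minimal sketch using a hypothetical length-prefixed buffer. The demo_* names, the 16-bit length prefix and the sample field values are assumptions made for this example only; they are not the OpenSIPS bin_* API or its actual wire format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical buffer for illustration; NOT the OpenSIPS bin packet type. */
typedef struct {
    unsigned char data[256];
    size_t len;  /* bytes written so far */
    size_t pos;  /* read cursor */
} demo_buf;

/* Append a string as a 16-bit length followed by its bytes. */
static void demo_push_str(demo_buf *b, const char *s)
{
    uint16_t n = (uint16_t)strlen(s);
    assert(b->len + sizeof n + n <= sizeof b->data);
    memcpy(b->data + b->len, &n, sizeof n);
    b->len += sizeof n;
    memcpy(b->data + b->len, s, n);
    b->len += n;
}

/* Append a 32-bit integer in host byte order. */
static void demo_push_int(demo_buf *b, int32_t v)
{
    assert(b->len + sizeof v <= sizeof b->data);
    memcpy(b->data + b->len, &v, sizeof v);
    b->len += sizeof v;
}

/* Read one length-prefixed string into out (NUL-terminated). */
static void demo_pop_str(demo_buf *b, char *out, size_t outsz)
{
    uint16_t n;
    assert(b->pos + sizeof n <= b->len);
    memcpy(&n, b->data + b->pos, sizeof n);
    b->pos += sizeof n;
    assert(b->pos + n <= b->len && n < outsz);
    memcpy(out, b->data + b->pos, n);
    out[n] = '\0';
    b->pos += n;
}

/* Read one 32-bit integer from the current cursor position. */
static int32_t demo_pop_int(demo_buf *b)
{
    int32_t v;
    assert(b->pos + sizeof v <= b->len);
    memcpy(&v, b->data + b->pos, sizeof v);
    b->pos += sizeof v;
    return v;
}

/* Pack hdrs, sdp, optionally a second copy of sdp, then the entity number. */
static void demo_pack(demo_buf *b, int duplicate_sdp, int32_t no)
{
    demo_push_str(b, "X-Hdr: 1");
    demo_push_str(b, "v=0");
    if (duplicate_sdp)
        demo_push_str(b, "v=0");  /* the extra push that breaks symmetry */
    demo_push_int(b, no);
}

/* Unpack hdrs, sdp (once) and return the entity number. */
static int32_t demo_unpack_no(demo_buf *b)
{
    char tmp[64];
    demo_pop_str(b, tmp, sizeof tmp);
    demo_pop_str(b, tmp, sizeof tmp);
    return demo_pop_int(b);
}

/* Bytes left unread after unpacking; non-zero hints at a pack/unpack mismatch. */
static size_t demo_remaining(const demo_buf *b)
{
    return b->len - b->pos;
}
```

With duplicate_sdp set, demo_unpack_no() interprets the length prefix and first bytes of the second SDP copy as the integer, so it no longer returns the value that was packed, and demo_remaining() reports unread bytes. Without the duplicate, the value round-trips and nothing is left over.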

On the “good” node (the one where the tuple was originally created), the B2B state looks correct, for example:

# opensips-cli -x mi b2b_list
{
  "Tuples": [
    {
      "id": 0,
      "key": "499.0",
      "state": 4,
      "scenario": "sbc",
      "SERVERS": [
        { "index": 0, "no": 0, "type": 0, "key": "B2B.AAA...", "peer": "B2B.BBB..." }
      ],
      "CLIENTS": [
        { "index": 0, "no": 1, "type": 1, "key": "B2B.BBB...", "peer": "B2B.AAA..." }
      ],
      "BRIDGE_ENTITIES": [
        { "index": 0, "no": 0, "type": 0, "key": "B2B.AAA...", "peer": "B2B.BBB..." },
        { "index": 1, "no": 1, "type": 1, "key": "B2B.BBB...", "peer": "B2B.AAA..." }
      ]
    }
  ]
}

On the receiving nodes (after replication), the same tuple fails to deserialize: receive_entity_create() logs an invalid entity->no (e.g. 21349) and, because creation fails, the tuple is subsequently missing from b2b_list / b2be_list.

This behaviour is reproducible with:

  • A 3‑node cluster with proto=bin for clusterer
  • b2b_logic / b2b_entities configured with cluster_id
  • An inbound call that triggers a B2B scenario and tuple replication

Solution
The fix is to remove the duplicate serialization of out_sdp in pack_entity(), so that the packed fields exactly match the unpacked fields in receive_entity_create().

Testing

  • Recompiled b2b_logic with the fix and deployed to a 3‑node SBC cluster
  • Placed multiple test calls to create B2B tuples with replication enabled
  • Verified on all nodes:
    • No more Bad entity bridge no [...] errors in logs
    • opensips-cli -x mi b2b_list shows the same tuples on all nodes
    • opensips-cli -x mi b2be_list shows consistent dialog / entity state across the cluster

Compatibility

  • No configuration changes are required.
  • The intended wire format between pack_entity() and receive_entity_create() is restored to a consistent state.
  • No known SIP interoperability issues: the change only affects how internal B2B state is serialized across cluster nodes, not SIP messages on the wire.

Closing issues
Fixes issue: #3707

@razvancrainea
Member

I am afraid this is not entirely OK: you no longer push an SDP after the headers, but in create it is still popped. A complete fix would therefore also need to pop the SDP after the totag and before start_time, just as it is pushed in the replicated packet.
But I also think the in_sdp should have been replicated (after the headers), not the out one. I will push a fix for this myself; can you please give it a try on master?

Thanks!
Răzvan

@razvancrainea
Member

I've just pushed b88203c to address this. Please give it a try and let us know if you're still having the initial issue. If not, I will backport everything to the supported versions.

Best regards,
Răzvan

@razvancrainea razvancrainea self-assigned this Dec 9, 2025
@dirkbro
Contributor Author

dirkbro commented Dec 10, 2025

Hi Răzvan,

I’ve just tested with b88203c on a 3‑node BIN cluster (clusterer + b2b_entities + b2b_logic) and it looks good.

  • No more Bad entity bridge no [...] errors.
  • b2b_list / b2be_list show consistent tuples and entities on all nodes

So from my side the original issue is resolved. Please go ahead and backport to the supported branches.

Thanks, and best regards,
Dirk
