@osyoyu
Contributor

@osyoyu osyoyu commented Nov 4, 2025

Effectively reverts commits 788274b and 0abac72.

EMAIL_REGEXP was mostly drawn from WHATWG HTML LS. This spec states that it intentionally violates RFC 5322 to provide a practical regex for validation.

> This requirement is a willful violation of RFC 5322, which defines a syntax for email addresses that is simultaneously too strict (before the "@" character), too vague (after the "@" character), and too lax (allowing comments, whitespace characters, and quoted strings in manners unfamiliar to most users) to be of practical use here.

Allowing consecutive dots (a..a@) and leading/trailing dots (.a@, a.@) is not the only deviation from RFC 5322. If a truly RFC 5322-compliant regexp is needed, it should be provided under a different name, since it would have to depart too far from the original EMAIL_REGEXP.
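
For illustration, here is a minimal sketch of how the WHATWG pattern treats the addresses in question. The constant name `WHATWG_EMAIL` is hypothetical, and the pattern is transcribed from the WHATWG HTML LS definition of a valid email address with `\A`/`\z` anchors; the library's actual EMAIL_REGEXP may differ in detail.

```ruby
# Hypothetical constant for illustration only (not the library's own constant).
# Pattern transcribed from the WHATWG HTML LS "valid email address" definition,
# anchored with \A and \z instead of ^ and $.
WHATWG_EMAIL = /\A[a-zA-Z0-9.!\#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*\z/

samples = [
  "a..a@example.com",      # consecutive dots: true
  ".a@example.com",        # leading dot: true
  "a.@example.com",        # trailing dot: true
  "a b@example.com",       # whitespace in local part: false
  "\"a b\"@example.com",   # quoted string (valid per RFC 5322): false
  "a@-example.com"         # domain label starting with a hyphen: false
]

samples.each do |addr|
  puts "#{addr.inspect}: #{WHATWG_EMAIL.match?(addr)}"
end
```

The dotted local parts are accepted because `.` is simply one of the allowed local-part characters, while the quoted-string form that RFC 5322 permits is rejected. This is the sense in which the pattern deviates from the RFC in both directions.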

@osyoyu force-pushed the restore-whatwg-email-regexp branch from 4bc6c3d to a1770c3 on November 4, 2025 05:37
@osyoyu force-pushed the restore-whatwg-email-regexp branch from a1770c3 to 0c5c1f3 on November 4, 2025 05:40
@osyoyu changed the title from "Restore EMAIL_PREFIX from WHATWG HTML LS" to "Re-allow consecutive dots in EMAIL_REGEXP" on Nov 4, 2025
@moznion

moznion commented Nov 4, 2025

@osyoyu I think this should also be reverted because it was introduced in v1.1.0: #124

@osyoyu force-pushed the restore-whatwg-email-regexp branch from 9a089d2 to 53f8b74 on November 4, 2025 06:22
@osyoyu
Contributor Author

osyoyu commented Nov 4, 2025

Edited. EMAIL_REGEXP is now identical to that of v1.0.4.

Effectively reverts commits 788274b and
0abac72.

EMAIL_REGEXP was mostly drawn from WHATWG HTML LS. This spec states that
it intentionally violates RFC 5322 to provide a practical regex for
validation.

> This requirement is a willful violation of RFC 5322, which defines a
> syntax for email addresses that is simultaneously too strict (before the
> "@" character), too vague (after the "@" character), and too lax
> (allowing comments, whitespace characters, and quoted strings in manners
> unfamiliar to most users) to be of practical use here.

Allowing consecutive dots (`a..a@`) and leading/trailing dots
(`.a@`, `a.@`) is not the only deviation from RFC 5322. If a truly RFC
5322-compliant regexp is needed, it should be provided under a
different name, since it would have to depart too far from the
original EMAIL_REGEXP.
@osyoyu changed the title from "Re-allow consecutive dots in EMAIL_REGEXP" to "Re-allow consecutive, leading and trailing dots in EMAIL_REGEXP" on Nov 4, 2025
@osyoyu force-pushed the restore-whatwg-email-regexp branch from 53f8b74 to c551d70 on November 4, 2025 06:25
@sorah
Member

sorah commented Nov 4, 2025

A bit hesitant to cut into this issue without waiting for the people involved in the original pull request, but merging this as the issue appears to have a major impact.

@sorah merged commit 8557e8d into ruby:master on Nov 4, 2025
26 checks passed
@sorah
Member

sorah commented Nov 4, 2025

While a minor release (an increment of Y in X.Y.Z) may contain a breaking change, it must be well communicated. Since the comment says the pattern is based on the WHATWG practical pattern, disallowing the email addresses in question can be considered a bug/regression. The comment on the constant mentions WHATWG even in 1.1.0, so there's no doubt that this is a bug.

@osyoyu
Contributor Author

osyoyu commented Nov 5, 2025

Thank you!
