Skip to content

Implement "Followup improvements for ext/uri" RFC - RFC 3986 URI building#22173

Open
kocsismate wants to merge 8 commits into
php:masterfrom
kocsismate:uri-followup3
Open

Implement "Followup improvements for ext/uri" RFC - RFC 3986 URI building#22173
kocsismate wants to merge 8 commits into
php:masterfrom
kocsismate:uri-followup3

Conversation

@kocsismate

Copy link
Copy Markdown
Member

@TimWolla TimWolla left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not yet sure how I feel about the manual validation happening during recomposition. I'd like to take a second look later.

Comment thread ext/uri/php_uri.c Outdated
Comment thread ext/uri/php_uri.c Outdated
Comment thread ext/uri/uri_parser_rfc3986.c Outdated
Comment thread ext/uri/uri_parser_rfc3986.c Outdated
Comment thread ext/uri/uri_parser_rfc3986.c Outdated
Comment thread ext/uri/tests/rfc3986/modification/port_error_negative.phpt
Comment thread ext/uri/php_uri_common.h
return uriIsWellFormedFragmentA(p, p + len);
}

ZEND_ATTRIBUTE_NONNULL_ARGS(2,3,4,5,6,7,8) php_uri_parser_rfc3986_uris *php_uri_parser_rfc3986_build_from_zval(

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about this method? I think it's more efficient than the previous implementation, because the recomposition part was eliminated. It was also possible to get rid of two custom validation of the "global state" below. The remaining two are needed because uriparser doesn't fail in these cases (

UriBool success = URI_FUNC(FixPathNoScheme)(destUri, memory);
): https://github.com/php/php-src/blob/27bc7c0e1294eb0ce87f071acd42f43f01877d61/ext/uri/tests/rfc3986/modification/roundtrip_special_case2.phpt#L18-L17

Comment thread ext/uri/php_uri.c Outdated
Comment thread ext/uri/php_uri.c
Comment thread ext/uri/php_uri.c Outdated
Comment thread ext/uri/uri_parser_whatwg.c Outdated
@kocsismate

Copy link
Copy Markdown
Member Author

I fixed a few suggestions, thank you very much! I'll continue tomorrow with the rest

@TimWolla TimWolla self-requested a review June 1, 2026 20:34
Comment thread ext/uri/php_uri.c
static zend_result php_uri_parser_rfc3986_add_base_url(
UriUriA *tmp, const UriUriA *uri, const php_uri_parser_rfc3986_uris *uriparser_base_urls, const bool silent
) {
const int result = uriAddBaseUriExMmA(tmp, uri, &uriparser_base_urls->uri, URI_RESOLVE_STRICTLY, mm);

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLion started to nag me about adding the const modifier for scalar values like this. As far as I know, the compilers usually find out themselves without the modifier if a value is modified or not, so in practice, its only benefit is maybe some extra explicitness. WDYT?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The const modifier only helps you, the programmer, marking local as const won't influence the quality of the resulting machine code.
I don't think it's terribly important to mark them as const. If it's not a burden for you or goes automatically feel free to add it of course.

@ndossche ndossche left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remarks I had were fixed. One suggestion.
I only looked at the C code, I didn't look or think about the API interface or tests. I leave that part to Tim.

Comment thread ext/uri/uri_parser_rfc3986.c Outdated
char buf[MAX_LENGTH_OF_LONG + 1];
const char *res = zend_print_long_to_buf(buf + sizeof(buf) - 1, port);

const bool well_formed = uriIsWellFormedPortA(res, res + strlen(res));

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also work:

Suggested change
const bool well_formed = uriIsWellFormedPortA(res, res + strlen(res));
const bool well_formed = uriIsWellFormedPortA(res, buf + sizeof(buf) - 1 - res);

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, very clever, but the current approach seems much more straightforward for me

@TimWolla TimWolla left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if all success tests should reparse the resulting URL for extra safety. So basically add:

var_dump($uri->equals(new Uri\Rfc3986\Uri($uri->toRawString())));

at the end of every test (it should always be bool(true)).


I looked at all the tests for now. Did not yet take a full look at the implementation.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that the test description explicitly mentions “falls back to a registered name”, should the test also assert on getHostType()?

<?php

$builder = new Uri\Rfc3986\UriBuilder();
$builder->setUserInfo(null);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
$builder->setUserInfo(null);
$builder->setUserInfo('something');
$builder->setUserInfo(null);

I think all the null tests should first set the value to something else, otherwise we are just testing that we're successfully not doing anything - since null already effectively is the default value.

NULL
["fragment"]=>
NULL
} No newline at end of file

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing newline at eof

const char *p = Z_STRVAL_P(path);
while (*p != '\0' && *p != '/') {
if (*p == ':') {
zend_throw_exception(php_uri_ce_invalid_uri_exception, "The path must not begin with \":\" when the URI doesn't contain a scheme", 0);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
zend_throw_exception(php_uri_ce_invalid_uri_exception, "The path must not begin with \":\" when the URI doesn't contain a scheme", 0);
zend_throw_exception(php_uri_ce_invalid_uri_exception, "The path must not begin with \":\" when the URI does not contain a scheme", 0);

Using contractions is colloquial (same applies to the other error message.

/cc @Girgias for error message suggestions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants