Service responds with 500 status when URL without scheme is requested #718

@robertknight

Description

While visiting https://via.hypothes.is/https://en.wikipedia.org/wiki/Diplodocus, the service responded with a 500 status when attempting to proxy some of the page's images.

There are two separate issues:

  • Proxying fails in the first place
  • The response has a 500 status, yet its body reports a 404 error

This issue is about the latter problem.

Steps to reproduce:

 curl -H 'Referer: https://viahtml.hypothes.is' -i 'https://viahtml.hypothes.is/proxy///upload.wikimedia.org/wikipedia/commons/thumb/3/31/Diplodocus_teeth_Smithsonian.png/500px-Diplodocus_teeth_Smithsonian.png'

Note the URL being requested here (//upload.wikimedia.org/wikipedia/commons/thumb/3/31/Diplodocus_teeth_Smithsonian.png/500px-Diplodocus_teeth_Smithsonian.png) is missing a scheme. The correct URL should be https://viahtml.hypothes.is/proxy/https://upload.wikimedia.org/wikipedia/commons/thumb/3/31/Diplodocus_teeth_Smithsonian.png/500px-Diplodocus_teeth_Smithsonian.png.

Expected result:

A response with a 4xx status, since the URL is invalid. Alternatively the service might infer that the scheme should be HTTPS. I haven't tracked down why this particular URL is being generated and whether it "should" be accepted. In any case, it shouldn't produce a 5xx error.
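
Either outcome described above could be sketched roughly as follows. This is only an illustration using Python's standard library, not Via's actual code: the function name `normalize_proxied_url`, the `InvalidProxiedURL` exception, and the `default_scheme` parameter are all hypothetical. Passing `default_scheme="https"` corresponds to inferring HTTPS; leaving it unset corresponds to rejecting the URL so the caller can return a 4xx status.

```python
from urllib.parse import urlsplit


class InvalidProxiedURL(ValueError):
    """Hypothetical error that a request handler would map to a 400 response."""


def normalize_proxied_url(raw, default_scheme=None):
    """Return an absolute http(s) URL, or raise InvalidProxiedURL.

    `raw` is assumed to be the part of the request path after `/proxy/`.
    """
    parts = urlsplit(raw)

    # Already an absolute http(s) URL: accept it unchanged.
    if parts.scheme in ("http", "https") and parts.netloc:
        return raw

    # Scheme-relative URL such as "//upload.wikimedia.org/...":
    # add a scheme only if the caller opted in to inferring one.
    if not parts.scheme and parts.netloc and default_scheme is not None:
        return f"{default_scheme}:{raw}"

    raise InvalidProxiedURL(f"Not an absolute http(s) URL: {raw!r}")
```

With the URL from the steps to reproduce, `normalize_proxied_url("//upload.wikimedia.org/...", default_scheme="https")` would prepend `https:`, while calling it without `default_scheme` would raise `InvalidProxiedURL`.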

Actual result:

The response has a 500 status, but its body says "404 Not Found":

HTTP/2 500 
date: Thu, 19 Sep 2024 10:11:05 GMT
content-type: text/html
content-length: 1268
cache-control: no-store
referrer-policy: no-referrer-when-downgrade
x-robots-tag: noindex, nofollow
x-abuse-policy: https://web.hypothes.is/abuse-policy/
x-complaints-to: https://web.hypothes.is/report-abuse/
cf-cache-status: BYPASS
strict-transport-security: max-age=15552000; includeSubDomains; preload
x-content-type-options: nosniff
server: cloudflare
cf-ray: 8c58ca242e1863d6-LHR
alt-svc: h3=":443"; ma=86400

<!DOCTYPE html>
<html lang="en">
    <head>
        <meta http-equiv="content-type" content="text/html; charset=UTF-8;charset=utf-8"/>
        <meta name="viewport" content="width=device-width, initial-scale=1">

        <title>Via Error</title>

        <link rel="stylesheet" href="/static/css/bootstrap.min.css"/>
<link rel="stylesheet" href="/static/css/font-awesome.min.css">
<link rel="stylesheet" href="/static/css/base.css">

<script src="/static/js/jquery-latest.min.js"></script>
<script src="/static/js/bootstrap.min.js"></script>
                            </head>

    <body>
                <header>
   </header>
        
        <section>
        <div class="container text-danger">
    <div class="row justify-content-center">
        <h2 class="display-2">Via Error</h2>
    </div>
    <div class="row">
        <div class="col-12 text-center">
        
            <p class="lead">None</p>

                            <p class="lead">Error Details:</p>
                <pre>Internal Error: 404 Not Found: The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.</pre>
                            </div>
    </div>
</div>
        </section>

                            </body>
</html>
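
The mismatch between the 500 status line and the "404 Not Found" message in the body suggests that an exception carrying its own HTTP status is being rendered by a generic error handler. That is an inference from the output above and has not been confirmed against Via's source. Under that assumption, a handler that preserves client-error statuses might look like the sketch below, where `HTTPError` and `status_for_exception` are hypothetical names, not part of any real framework.

```python
class HTTPError(Exception):
    """Hypothetical stand-in for a framework exception that carries a status."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


def status_for_exception(exc):
    """Pick the response status for an exception raised while proxying."""
    code = getattr(exc, "code", None)
    # Preserve 4xx codes so client mistakes are not reported as server faults.
    if isinstance(code, int) and 400 <= code < 500:
        return code
    # Anything else (no code, or a non-4xx code) is treated as a server error.
    return 500
```

With this mapping, an exception carrying a 404 would produce a 404 response, and only unexpected failures would surface as 500.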
