How requests are handled
This document describes how your App Engine application receives requests and sends responses.
For more details, see the Request Headers and Responses reference.
If your application uses services, you can address requests to a specific service or a specific version of that service. For more information about service addressability, see How Requests are Routed.
Your application is responsible for starting a webserver and handling requests. You can use any web framework that is available for your development language.
When App Engine receives a web request for your application, it invokes the
servlet that corresponds to the URL, as described in the application's
web.xml file in the WEB-INF/ directory. It supports the Java Servlet 2.5 or
3.1 API specifications to provide the request data to the servlet and accept
the response data.
App Engine runs multiple instances of your application, and each instance has its own web server for handling requests. Any request can be routed to any instance, so consecutive requests from the same user are not necessarily sent to the same instance. The number of instances can be adjusted automatically as traffic changes.
By default, each web server processes only one request at a time. To dispatch
multiple requests to each web server in parallel, mark your application as
threadsafe by adding a
<threadsafe>true</threadsafe>
element to your appengine-web.xml file.
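Once requests are dispatched in parallel, any state shared between request handlers has to tolerate concurrent access. The following is a minimal sketch of that idea using only the JDK; the class name HitCounter and its methods are hypothetical, and the thread pool merely stands in for the request threads of one instance.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: state that several request threads may touch at once.
final class HitCounter {
    // AtomicLong makes each increment one atomic step. A plain long field
    // could lose updates when two requests increment it at the same time.
    private final AtomicLong hits = new AtomicLong();

    long recordHit() {
        return hits.incrementAndGet();
    }

    long total() {
        return hits.get();
    }

    // Simulates `requests` requests handled on `threads` worker threads
    // and returns the final count.
    static long simulate(int threads, int requests) {
        HitCounter counter = new HitCounter();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < requests; i++) {
            pool.execute(counter::recordHit);
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("simulated requests did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.total();
    }

    public static void main(String[] args) {
        System.out.println(simulate(8, 10_000));
    }
}
```

Because every increment is atomic, the final total equals the number of simulated requests regardless of how the threads interleave.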
The following example servlet class displays a simple message on the user's browser.
View RequestsServlet.java on GitHub (region:
gae_java8_request_example)
App Engine automatically allocates resources to your application as traffic increases. However, this is bound by the following restrictions:
- App Engine reserves automatic scaling capacity for applications with low latency, where the application responds to requests in less than one second.
- Applications that are heavily CPU-bound may also incur some additional latency in order to efficiently share resources with other applications on the same servers. Requests for static files are exempt from these latency limits.
Each incoming request to the application counts toward the Requests limit. Data sent in response to a request counts toward the Outgoing Bandwidth (billable) limit.
Both HTTP and HTTPS (secure) requests count toward the Requests, Incoming Bandwidth (billable), and Outgoing Bandwidth (billable) limits. The Google Cloud console Quota Details page also reports Secure Requests, Secure Incoming Bandwidth, and Secure Outgoing Bandwidth as separate values for informational purposes. Only HTTPS requests count toward these values. For more information, see the Quotas page.
The following limits apply specifically to the use of request handlers:
Limit | Amount
----- | ------
Request size | 32 megabytes
Response size | 32 megabytes
Request timeout | Depends on the type of scaling your app uses
Maximum total number of files (app files and static files) | {{ dep_file_quota }} total; {{ dep_files_per_dir_quota }} per directory
Maximum size of an application file | 32 megabytes
Maximum size of a static file | 32 megabytes
Maximum total size of all application and static files | First 1 gigabyte is free; {{ code_store_after_free }} per gigabyte per month after first 1 gigabyte
Pending request timeout | 10 seconds
Maximum size of a single request header field | 8 kilobytes for second-generation runtimes in the standard environment. Requests to these runtimes with header fields exceeding 8 kilobytes will return HTTP 400 errors.
All HTTP/2 requests will be translated into HTTP/1.1 requests when forwarded to the application server.
- Dynamic responses are limited to 32 MB. If a script handler generates a response larger than this limit, the server sends back an empty response with a 500 Internal Server Error status code. This limitation does not apply to responses that serve data from
the legacy Blobstore or
- The response header limit is 8 KB for second-generation runtimes. Response headers that exceed this limit will return HTTP 502 errors, with logs showing "upstream sent too big header while reading response header from upstream".
An incoming HTTP request includes the HTTP headers sent by the client. For security purposes, some headers are sanitized or amended by intermediate proxies before they reach the application.
For more information, see the Request headers reference.
App Engine is optimized for applications with short-lived requests, typically those that take a few hundred milliseconds. An efficient app responds quickly for the majority of requests. An app that doesn't will not scale well with App Engine's infrastructure. To ensure this level of performance, there is a system-imposed maximum request timeout within which every app must respond.
If your app exceeds this deadline, App Engine interrupts the request handler.
The Java runtime environment interrupts the servlet by throwing a
com.google.apphosting.api.DeadlineExceededException. If there is no request
handler to catch this exception, the runtime environment will return an HTTP 500
server error to the client.
If there is a request handler and the DeadlineExceededException is caught,
then the runtime environment gives the request handler time (less than a second)
to prepare a custom response. If the request handler takes more than a second
after raising the exception to prepare a custom response, a
HardDeadlineExceededError will be raised.
Both DeadlineExceededExceptions and HardDeadlineExceededErrors will force
termination of the request and stop the instance.
To find out how much time remains before the deadline, the application can
import
com.google.apphosting.api.ApiProxy
and call
ApiProxy.getCurrentEnvironment().getRemainingMillis().
This is useful if the application is planning to start on some work that might
take too long; if you know it takes five seconds to process a unit of work but
getRemainingMillis() returns less time, there's no point starting that unit of
work.
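The check described above can be sketched as a small loop that stops before starting a unit of work that is unlikely to fit. This is an illustration under assumptions, not the SDK's API: DeadlineBudget, runWhileTimeRemains, and the safety margin are hypothetical names, and the LongSupplier stands in for ApiProxy.getCurrentEnvironment().getRemainingMillis() so that the sketch runs without the App Engine SDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.LongSupplier;

// Sketch of deadline-aware batching. The LongSupplier is a stand-in for
// ApiProxy.getCurrentEnvironment().getRemainingMillis().
final class DeadlineBudget {
    // Runs units in order and stops before any unit that is not expected to
    // fit in the remaining time. Returns the number of units that were run.
    static int runWhileTimeRemains(List<Runnable> units,
                                   long estimatedMillisPerUnit,
                                   long safetyMarginMillis,
                                   LongSupplier remainingMillis) {
        int completed = 0;
        for (Runnable unit : units) {
            long remaining = remainingMillis.getAsLong();
            if (remaining < estimatedMillisPerUnit + safetyMarginMillis) {
                break; // not enough time left; leave the rest for a later request
            }
            unit.run();
            completed++;
        }
        return completed;
    }

    public static void main(String[] args) {
        long[] clock = {12_000};                 // simulated time left, in milliseconds
        Runnable unit = () -> clock[0] -= 5_000; // each unit "takes" five seconds
        int done = runWhileTimeRemains(
                Arrays.asList(unit, unit, unit), 5_000, 500, () -> clock[0]);
        System.out.println(done + " units completed");
    }
}
```

In a real handler, the supplier would be replaced by a call to getRemainingMillis(), and any units that were skipped would need to be picked up by a later request or a task queue.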
Warning: The DeadlineExceededException can potentially be raised from anywhere
in your program, including finally blocks, leaving your program in an invalid
state. This can cause deadlocks or unexpected errors in threaded code, because
locks may not be released. After the request completes, the runtime will stop
the server process, so future requests won't be affected; however, any
concurrent requests being handled on the same process will stop. This is
equivalent to calling
Thread.stop
on the main thread. For more information, see Why is Thread.stop
deprecated?.
To be safe, you should not rely on the DeadlineExceededException, and instead
ensure that your requests complete well before the time limit (using
getRemainingMillis()
if necessary).
App Engine calls the servlet with a request object and a response object, then waits for the servlet to populate the response object and return. When the servlet returns, the data on the response object is sent to the user.
There are size limits that apply to the response you generate, and the response may be modified before it is returned to the client.
For more information, see the Request responses reference.
App Engine does not support streaming responses where data is sent in incremental chunks to the client while a request is being processed. All data from your code is collected as described above and sent as a single HTTP response.
App Engine does its best to serve compressed (gzipped) content to clients that support it. To determine if content should be compressed, App Engine does the following when it receives a request:
- Confirms if the client can reliably receive compressed responses by viewing both the Accept-Encoding and User-Agent headers in the request. This approach avoids some well-known bugs with gzipped content in popular browsers.
- Confirms if compressing the content is appropriate by viewing the Content-Type header that you have configured for the response handler. In general, compression is appropriate for text-based content types, and not for binary content types.
Note the following:
- A client can force text-based content types to be compressed by setting both of the Accept-Encoding and User-Agent request headers to gzip.
- If a request doesn't specify gzip in the Accept-Encoding header, App Engine will not compress the response data.
- The Google Frontend caches responses from App Engine static file and directory handlers. Depending on a variety of factors, such as which type of response data is cached first, which Vary headers you have specified in the response, and which headers are included in the request, a client could request compressed data but receive uncompressed data, and the other way around. For more information, see Response caching.
The Google Frontend, and potentially the user's browser and other intermediate caching proxy servers, will cache your app's responses as instructed by standard caching headers that you specify in the response. You can specify these response headers either through your framework, directly in your code, or through App Engine static file and directory handlers.
In the Google Frontend, the cache key is the full URL of the request.
To ensure that clients always receive updated static content as soon as it is
published, we recommend that you serve static content from versioned
directories, such as css/v1/styles.css. The Google Frontend will not validate
the cache (check for updated content) until the cache expires. Even after the
cache expires, the cache will not be updated until the content at the request
URL changes.
The following response headers, which you can set in appengine-web.xml, influence how and when the Google Frontend caches content:
- Cache-Control should be set to public for the Google Frontend to cache content; content may also be cached by the Google Frontend unless you specify a Cache-Control private or no-store directive. If you don't set this header in appengine-web.xml, App Engine automatically adds it for all responses handled by a static file or directory handler. For more information, see Headers added or replaced.
- Vary: To enable the cache to return different responses for a URL based on headers that are sent in the request, set one or more of the following values in the Vary response header: Accept, Accept-Encoding, Origin, or X-Origin. Due to the potential for high cardinality, data will not be cached for other Vary values.
For example:
1. You specify the following response header:
   `Vary: Accept-Encoding`
2. Your app receives a request that contains the Accept-Encoding: gzip header. App Engine returns a compressed response and the Google Frontend caches the gzipped version of the response data. All subsequent requests for this URL that contain the Accept-Encoding: gzip header will receive the gzipped data from the cache until the cache becomes invalidated (due to the content changing after the cache expires).
3. Your app receives a request that does not contain the Accept-Encoding header. App Engine returns an uncompressed response and the Google Frontend caches the uncompressed version of the response data. All subsequent requests for this URL that do not contain the Accept-Encoding header will receive the uncompressed data from the cache until the cache becomes invalidated.
If you do not specify a Vary response header, the Google Frontend creates a
single cache entry for the URL and will use it for all requests regardless of
the headers in the request. For example:
1. You do not specify the Vary: Accept-Encoding response header.
2. A request contains the Accept-Encoding: gzip header, and the gzipped version of the response data will be cached.
3. A second request does not contain the Accept-Encoding: gzip header. However, because the cache contains a gzipped version of the response data, the response will be gzipped even though the client requested uncompressed data.
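The two scenarios above can be reproduced with a toy cache. This is an illustrative model only and makes no claim about how the Google Frontend is implemented: VaryCacheSketch and its methods are hypothetical, header names are matched case-sensitively for brevity, and expiration is ignored.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Toy model of a shared cache, for illustration only. Entries are keyed by
// the URL plus the value of the request header named in Vary, if any.
final class VaryCacheSketch {
    private final Map<String, String> entries = new HashMap<>();
    private final String varyHeader; // null models "no Vary header on the response"

    VaryCacheSketch(String varyHeader) {
        this.varyHeader = varyHeader;
    }

    private String key(String url, Map<String, String> requestHeaders) {
        if (varyHeader == null) {
            return url; // a single entry per URL, whatever the request headers are
        }
        return url + "|" + varyHeader + "=" + requestHeaders.getOrDefault(varyHeader, "");
    }

    // Returns the cached body, or stores and returns originBody on a miss.
    String fetch(String url, Map<String, String> requestHeaders, String originBody) {
        return entries.computeIfAbsent(key(url, requestHeaders), k -> originBody);
    }

    public static void main(String[] args) {
        Map<String, String> gzip = Collections.singletonMap("Accept-Encoding", "gzip");
        Map<String, String> plain = Collections.emptyMap();

        VaryCacheSketch withVary = new VaryCacheSketch("Accept-Encoding");
        withVary.fetch("/page", gzip, "gzipped body");
        System.out.println(withVary.fetch("/page", plain, "uncompressed body"));

        VaryCacheSketch withoutVary = new VaryCacheSketch(null);
        withoutVary.fetch("/page", gzip, "gzipped body");
        System.out.println(withoutVary.fetch("/page", plain, "uncompressed body"));
    }
}
```

With Vary set, the second client gets its own uncompressed entry; without it, the second client is served the gzipped entry that the first request happened to populate.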
The headers in the request also influence caching:
- If the request contains an Authorization header, the content will not be cached by the Google Frontend.
By default, the caching headers that App Engine static file and directory handlers add to responses instruct clients and web proxies such as the Google Frontend to expire the cache after 10 minutes.
After a file is transmitted with a given expiration time, there is generally no way to clear it out of web-proxy caches, even if the user clears their own browser cache. Re-deploying a new version of the app will not reset any caches. Therefore, if you ever plan to modify a static file, it should have a short (less than one hour) expiration time. In most cases, the default 10-minute expiration time is appropriate.
You can change the default expiration for all static file and directory handlers by specifying the static-files element in your appengine-web.xml file.
Your application can write information to the application logs using
java.util.logging.Logger.
Log data for your application can be viewed in the Google Cloud console using
Cloud Logging. Each request
logged is assigned a request
ID,
a globally unique identifier based on the request's start time. The Google Cloud
console can recognize the Logger class's log levels, and interactively display
messages at different levels.
Everything the servlet writes to the standard output stream (System.out) and
standard error stream (System.err) is captured by App Engine and recorded in
the application logs. Lines written to the standard output stream are logged at
the "INFO" level, and lines written to the standard error stream are logged at
the "WARNING" level. Any logging framework (such as log4j) that logs to the
output or error streams will work. However, for more fine-grained control of the
log level display in the Google Cloud console, the logging framework must use a
java.util.logging adapter.
View LoggingServlet.java on GitHub (region:
gae_java8_logging_example)
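The referenced LoggingServlet.java is not reproduced here. As a stand-in, the following minimal sketch shows java.util.logging calls at several levels and how the logger's level filters them. It uses only the JDK; LoggingSketch, handleRequest, and capture are hypothetical names, and the in-memory handler exists only to make the filtering visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class LoggingSketch {
    private static final Logger log = Logger.getLogger(LoggingSketch.class.getName());

    // Hypothetical request handler body that logs at three levels.
    static void handleRequest() {
        log.info("An informational message.");
        log.warning("A warning message.");
        log.severe("An error message.");
    }

    // Runs handleRequest() with the logger set to `level` and returns the
    // messages that passed the level filter.
    static List<String> capture(Level level) {
        List<String> seen = new ArrayList<>();
        Handler collector = new Handler() {
            @Override public void publish(LogRecord record) {
                seen.add(record.getLevel() + ": " + record.getMessage());
            }
            @Override public void flush() { }
            @Override public void close() { }
        };
        log.setUseParentHandlers(false); // keep demo records off the console handler
        log.setLevel(level);
        log.addHandler(collector);
        try {
            handleRequest();
        } finally {
            log.removeHandler(collector);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(capture(Level.WARNING)); // INFO is filtered out
        System.out.println(capture(Level.INFO));    // all three messages pass
    }
}
```

In a deployed app the level would normally come from logging.properties rather than from setLevel() calls in code.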
The App Engine Java SDK includes a template logging.properties file, in the
appengine-java-sdk/config/user/ directory. To use it, copy the file to your
WEB-INF/classes directory (or elsewhere in the WAR), then set the system property
java.util.logging.config.file to "WEB-INF/logging.properties" (or whichever
path you choose, relative to the application root). You can set system
properties in a system-properties element in the appengine-web.xml file.
The servlet logs messages using the INFO log level (using log.info()). The
default log level is WARNING, which suppresses INFO messages from the
output. To change the log level, edit the logging.properties file.
All system properties and environment variables are private to your application. Setting a system property only affects your application's view of that property, and not the JVM's view.
You can set system properties and environment variables for your app in the deployment descriptor.
App Engine sets several system properties that identify the runtime environment:
-
- com.google.appengine.runtime.environment is "Production" when running on App Engine, and "Development" when running in the development server. In addition to using System.getProperty(), you can access system properties using our type-safe API, as the following items show.
- com.google.appengine.runtime.version is the version ID of the runtime environment, such as "1.3.0". You can get the version by invoking the following: String version = SystemProperty.version.get();
- com.google.appengine.application.id is the application's ID. You can get the ID by invoking the following: String ID = SystemProperty.applicationId.get();
- com.google.appengine.application.version is the major and minor version of the currently running application service, as "X.Y". The major version number ("X") is specified in the service's appengine-web.xml file. The minor version number ("Y") is set automatically when each version of the app is uploaded to App Engine. You can get the version by invoking the following: String ID = SystemProperty.applicationVersion.get(); On the development web server, the major version returned is always the default service's version, and the minor version is always "1".
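As a small illustration of reading one of these properties, the sketch below uses plain System.getProperty() so that it runs without the App Engine SDK. The property name and the "Production" value come from the list above; the class and method names are hypothetical.

```java
// Sketch: branch on the runtime environment using only System.getProperty().
final class RuntimeEnvironmentSketch {
    static final String ENVIRONMENT_PROPERTY = "com.google.appengine.runtime.environment";

    // True only when the property is set to "Production". Outside App Engine
    // the property is normally unset, so this returns false.
    static boolean isProduction() {
        return "Production".equals(System.getProperty(ENVIRONMENT_PROPERTY));
    }

    public static void main(String[] args) {
        System.out.println(isProduction()
                ? "Running on App Engine"
                : "Not running in production");
    }
}
```

Putting the constant on the left of equals() keeps the check null-safe when the property is missing.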
App Engine also sets the following system properties when it initializes the JVM on an app server:
- file.separator
- path.separator
- line.separator
- java.version
- java.vendor
- java.vendor.url
- java.class.version
- java.specification.version
- java.specification.vendor
- java.specification.name
- java.vm.vendor
- java.vm.name
- java.vm.specification.version
- java.vm.specification.vendor
- java.vm.specification.name
- user.dir
You can retrieve the ID of the instance handling a request using this code:
com.google.apphosting.api.ApiProxy.getCurrentEnvironment().getAttributes().get("com.google.appengine.instance.id")
In the production environment, a logged-in admin can use the ID in a URL: {{instance_url}}. The request will be routed to that specific instance. If the instance cannot handle the request, it returns an immediate 503.
At the time of the request, you can save the request ID, which is unique to the request. The request ID can be used later to correlate a request with the logs for that request.
Note: Currently, App Engine doesn't support the use of the request ID to directly look up the related logs.
The following code shows how to get the request ID in the context of a request:
For security reasons, all applications should encourage clients to connect over
https. To instruct the browser to prefer https over http for a given page
or entire domain, set the Strict-Transport-Security header in your responses.
For example:
Strict-Transport-Security: max-age=31536000; includeSubDomains
To set this header for any static content that is served by your app, add the header to your app's static file and directory handlers.
Most app frameworks and web servers provide support for setting this header for
responses that are generated from your code. For information about the
Strict-Transport-Security header in Spring Boot, see HTTP Strict Transport
Security
(HSTS).
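If your framework does not set the header for you, the value can be assembled in code and passed to whatever header-setting call your response object provides. The helper below is a hypothetical sketch using only the JDK; it builds the same value as the example above from a duration.

```java
import java.time.Duration;

// Hypothetical helper that builds a Strict-Transport-Security header value.
final class HstsHeaderSketch {
    static final String HEADER_NAME = "Strict-Transport-Security";

    static String headerValue(Duration maxAge, boolean includeSubDomains) {
        if (maxAge.isNegative()) {
            throw new IllegalArgumentException("maxAge must not be negative");
        }
        // max-age is expressed in whole seconds.
        String value = "max-age=" + maxAge.getSeconds();
        return includeSubDomains ? value + "; includeSubDomains" : value;
    }

    public static void main(String[] args) {
        // 365 days is 31536000 seconds, matching the example header above.
        System.out.println(HEADER_NAME + ": " + headerValue(Duration.ofDays(365), true));
    }
}
```

Starting with a short max-age and raising it once HTTPS is known to be stable limits the impact described in the caution below.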
Caution: Clients that have received the header in the past will refuse to
connect if https becomes non-functional or is disabled for any reason. To
learn more, see this Cheat Sheet on HTTP Strict Transport
Security.
Background work is any work that your app performs for a request after you have delivered your HTTP response. Avoid performing background work in your app, and review your code to make sure all asynchronous operations finish before you deliver your response.
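One way to make sure asynchronous operations finish before the response is delivered is to wait on all of them explicitly. The following is a minimal sketch with the JDK's CompletableFuture; recordAudit, updateStats, and handleRequest are hypothetical stand-ins for real asynchronous calls and a real request handler.

```java
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch: finish asynchronous work before the response is delivered.
final class FinishBeforeRespondingSketch {
    // Hypothetical async operations; each records a side effect in `sink`.
    static CompletableFuture<Void> recordAudit(Queue<String> sink) {
        return CompletableFuture.runAsync(() -> sink.add("audit recorded"));
    }

    static CompletableFuture<Void> updateStats(Queue<String> sink) {
        return CompletableFuture.runAsync(() -> sink.add("stats updated"));
    }

    // Returns the response body only after both operations have completed.
    static String handleRequest(Queue<String> sink) {
        CompletableFuture<Void> audit = recordAudit(sink);
        CompletableFuture<Void> stats = updateStats(sink);
        // join() blocks until both futures are done, so nothing is still
        // running in the background when the handler returns.
        CompletableFuture.allOf(audit, stats).join();
        return "OK";
    }

    public static void main(String[] args) {
        Queue<String> sink = new ConcurrentLinkedQueue<>();
        String body = handleRequest(sink);
        System.out.println(body + ", side effects finished: " + sink.size());
    }
}
```

If any of the operations fails, join() rethrows the failure as a CompletionException, which gives the handler a chance to report an error instead of returning a response while work is incomplete.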
For long-running jobs, we recommend using Cloud Tasks. With Cloud Tasks, HTTP requests are long-lived and return a response only after any asynchronous work ends.
Warning: Performing asynchronous background work can result in higher billing. App Engine might scale up additional instances due to high CPU load, even if there are no active requests. Users may also experience increased latency because of requests waiting in the pending queue for available instances.