Skip to content

Latest commit

 

History

History
197 lines (159 loc) · 12.7 KB

File metadata and controls

197 lines (159 loc) · 12.7 KB

Logging and the Debug Toolbar

Configuring Logging

During project development, it is very necessary to show enough debugging information to help developers debug code. After the project goes online, it is also very necessary to record warnings, errors, and other information that appear while the system is running, so related people can understand the system's running condition and maintain the code. At the same time, collecting log data also lays a foundation for the digital operation of a website. By analyzing system logs, we can monitor the website's traffic and traffic distribution, and we can also discover users' usage habits and behavior patterns.

Next, let us first look at how to configure logs through Django's configuration file. Django's log configuration can basically follow the official documentation and then be combined with the actual needs of the project. Most of these contents can be copied from the official documentation and then adjusted a little. Some reference configuration is given below.

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'simple': {
            'format': '%(asctime)s %(module)s.%(funcName)s: %(message)s',
            'datefmt': '%Y-%m-%d %H:%M:%S',
        },
        'verbose': {
            'format': '%(asctime)s %(levelname)s [%(process)d-%(threadName)s] '
                      '%(module)s.%(funcName)s line %(lineno)d: %(message)s',
            'datefmt': '%Y-%m-%d %H:%M:%S',
        }
    },
    'filters': {
        'require_debug_true': {
            '()': 'django.utils.log.RequireDebugTrue',
        },
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'level': 'DEBUG',
            'filters': ['require_debug_true'],
            'formatter': 'simple',
        },
        'file1': {
            'class': 'logging.handlers.TimedRotatingFileHandler',
            'filename': 'access.log',
            'when': 'W0',
            'backupCount': 12,
            'formatter': 'simple',
            'level': 'INFO',
        },
        'file2': {
            'class': 'logging.handlers.TimedRotatingFileHandler',
            'filename': 'error.log',
            'when': 'D',
            'backupCount': 31,
            'formatter': 'verbose',
            'level': 'WARNING',
        },
    },
    'loggers': {
        'django': {
            'handlers': ['console', 'file1', 'file2'],
            'propagate': True,
            'level': 'DEBUG',
        },
    }
}

You may already have noticed that the formatters in the logging configuration above are log formatters. They mean how log output is formatted. The format placeholders mean:

  1. %(name)s - the name of the logger
  2. %(levelno)s - the numeric form of the log level
  3. %(levelname)s - the text name of the log level
  4. %(filename)s - the file name of the source file that made the logging call
  5. %(pathname)s - the path name of the source file that made the logging call
  6. %(funcName)s - the function name that made the logging call
  7. %(module)s - the module name that made the logging call
  8. %(lineno)s - the line number that made the logging call
  9. %(created)s - the time when the log record was created
  10. %(asctime)s - date and time
  11. %(msecs)s - millisecond part
  12. %(thread)d - thread ID (integer)
  13. %(threadName)s - thread name
  14. %(process)d - process ID (integer)

The handlers in the logging configuration are used to specify log handlers. Simply speaking, they specify whether logs are output to the console, a file, or a server on the network. Available handlers include:

  1. logging.StreamHandler(stream=None) - can output information to any file-like object such as sys.stdout or sys.stderr
  2. logging.FileHandler(filename, mode='a', encoding=None, delay=False) - writes log messages to a file
  3. logging.handlers.DatagramHandler(host, port) - sends log messages to a network host at the specified host and port by using UDP
  4. logging.handlers.HTTPHandler(host, url) - uploads log messages to an HTTP server by using GET or POST
  5. logging.handlers.RotatingFileHandler(filename, mode='a', maxBytes=0, backupCount=0, encoding=None, delay=False) - writes log messages to a file, and if the file size goes beyond maxBytes, it creates another file to record logs
  6. logging.handlers.SocketHandler(host, port) - sends log messages to a network host at the specified host and port by using TCP
  7. logging.handlers.SMTPHandler(mailhost, fromaddr, toaddrs, subject, credentials=None, secure=None, timeout=1.0) - outputs logs to a specified email address
  8. logging.MemoryHandler(capacity, flushLevel=ERROR, target=None, flushOnClose=True) - outputs logs to the specified buffer in memory

Every log handler above specifies a property named level. It means the log level. Different log levels show the seriousness of the information recorded in the log. Python defines six log levels, and from low to high they are: NOTSET, DEBUG, INFO, WARNING, ERROR, and CRITICAL.

The loggers configured at the end are used to really output logs. Django provides the built-in loggers shown below:

  1. django - all message loggers in the Django hierarchy
  2. django.request - log messages related to request handling; 5xx responses are considered error messages, and 4xx responses are considered warning messages
  3. django.server - log messages related to requests received by the server started through runserver; 5xx responses are considered errors, 4xx responses are recorded as warnings, and everything else is recorded as INFO
  4. django.template - log messages related to template rendering
  5. django.db.backends - log messages generated by database interaction; if you want to show the SQL statements executed by the ORM framework, you can use this logger

The log level configured on the logger may not be the final log level, because we also need to look at the log level configured on the handler. The higher level of the two is taken as the final log level.

Django Debug Toolbar

If you want to debug your Django project, you really should not miss the tool called Django-Debug-Toolbar. It is a must-have tool for helping debugging and optimization during project development. As long as it is configured, you can conveniently see the project running information shown in the table below. This information is very important for debugging the project and improving Web application performance.

Item Description
Versions Django version
Time Time spent by the view
Settings Values set in the configuration file
Headers Information in request and response headers
Request All kinds of variables related to the request
StaticFiles Static file loading situation
Templates Template-related information
Cache Cache usage
Signals Built-in Django signal information
Logging Logged information
SQL SQL sent to the database and execution time
  1. Install Django-Debug-Toolbar.
pip install django-debug-toolbar
  1. Configure it by changing settings.py.
INSTALLED_APPS = [
    'debug_toolbar',
]

MIDDLEWARE = [
    'debug_toolbar.middleware.DebugToolbarMiddleware',
]

DEBUG_TOOLBAR_CONFIG = {
    'JQUERY_URL': 'https://cdn.bootcss.com/jquery/3.3.1/jquery.min.js',
    'SHOW_COLLAPSED': True,
    'SHOW_TOOLBAR_CALLBACK': lambda x: True,
}
  1. Configure it by changing urls.py.
if settings.DEBUG:
    import debug_toolbar
    urlpatterns.insert(0, path('__debug__/', include(debug_toolbar.urls)))
  1. After Django-Debug-Toolbar is configured, you will see a debug toolbar on the right side of the page, as shown in the picture below. It includes the debugging information mentioned above, such as execution time, project settings, request headers, SQL, static resources, templates, cache, and signals, so it is very convenient to view.

ORM Optimization

After logging or Django-Debug-Toolbar is configured, we can check the execution of the view function that exported teacher data to an Excel report earlier. What we care about here is what the SQL query generated by the ORM framework really looks like. I believe the result here may surprise you a little. After executing Teacher.objects.all(), we can notice that the SQL seen in the console or shown by Django-Debug-Toolbar looks like this:

SELECT `tb_teacher`.`no`, `tb_teacher`.`name`, `tb_teacher`.`detail`, `tb_teacher`.`photo`, `tb_teacher`.`good_count`, `tb_teacher`.`bad_count`, `tb_teacher`.`sno` FROM `tb_teacher`; args=()
SELECT `tb_subject`.`no`, `tb_subject`.`name`, `tb_subject`.`intro`, `tb_subject`.`create_date`, `tb_subject`.`is_hot` FROM `tb_subject` WHERE `tb_subject`.`no` = 101; args=(101,)
SELECT `tb_subject`.`no`, `tb_subject`.`name`, `tb_subject`.`intro`, `tb_subject`.`create_date`, `tb_subject`.`is_hot` FROM `tb_subject` WHERE `tb_subject`.`no` = 101; args=(101,)
SELECT `tb_subject`.`no`, `tb_subject`.`name`, `tb_subject`.`intro`, `tb_subject`.`create_date`, `tb_subject`.`is_hot` FROM `tb_subject` WHERE `tb_subject`.`no` = 101; args=(101,)
SELECT `tb_subject`.`no`, `tb_subject`.`name`, `tb_subject`.`intro`, `tb_subject`.`create_date`, `tb_subject`.`is_hot` FROM `tb_subject` WHERE `tb_subject`.`no` = 101; args=(101,)
SELECT `tb_subject`.`no`, `tb_subject`.`name`, `tb_subject`.`intro`, `tb_subject`.`create_date`, `tb_subject`.`is_hot` FROM `tb_subject` WHERE `tb_subject`.`no` = 103; args=(103,)
SELECT `tb_subject`.`no`, `tb_subject`.`name`, `tb_subject`.`intro`, `tb_subject`.`create_date`, `tb_subject`.`is_hot` FROM `tb_subject` WHERE `tb_subject`.`no` = 103; args=(103,)

This problem is usually called the "1+N query" problem. In some places, it is also called the "N+1 query" problem. Originally, getting teacher data only needed one SQL statement, but because teachers are related to subjects, when we query N rows of teacher data, Django's ORM framework sends N more SQL statements to the database to query the subject information of those teachers. Every SQL execution has a relatively large cost and also puts pressure on the database server. If we can finish the teacher and subject query in one SQL statement, that is surely a better way. This is also easy to do, and you have probably already thought of how to do it. Yes, we can use a join query, but how do we do this when using Django's ORM framework? For many-to-one relationships, such as teachers and subjects in the voting application, we can use the select_related() method of QuerySet to load related objects. For many-to-many relationships, such as orders and products in an e-commerce website, we can use the prefetch_related() method to load related objects.

In the view function that exports the teacher Excel report, we can optimize the code like this.

queryset = Teacher.objects.all().select_related('subject')

In fact, the operation of querying teacher upvote and downvote data in the view function that uses ECharts to generate the frontend report can also be optimized. In this example, we only need three pieces of data: the teacher's name, positive vote count, and negative vote count. But by default, the generated SQL queries all fields of the teacher table. We can use the only() method of QuerySet to specify the attributes we need to query, and we can also use the defer() method of QuerySet to specify the attributes we do not need to query for now. In this way, the generated SQL uses projection to specify the needed columns, so query performance is improved. The code is shown below.

queryset = Teacher.objects.all().only('name', 'good_count', 'bad_count')

Of course, if we want to count the average positive vote count and negative vote count of teachers in each subject, we can also do it with Django's ORM framework. The code is shown below.

queryset = Teacher.objects.values('subject').annotate(good=Avg('good_count'), bad=Avg('bad_count'))

The elements in the QuerySet obtained here are dictionary objects. Each dictionary has three groups of key-value pairs, which are subject for the subject ID, good for the positive vote count, and bad for the negative vote count. If you want to get the subject name instead of the ID, you can change the code like this.

queryset = Teacher.objects.values('subject__name').annotate(good=Avg('good_count'), bad=Avg('bad_count'))

So we can see that Django's ORM framework allows us to complete grouping and aggregate queries in a relational database in an object-oriented way.