Scrapy Logging – How to log data to a File

This tutorial explains how to use the logging feature in Scrapy.

Scrapy is a complex library, with many simultaneous requests and events being executed every second. In a program like this it becomes hard to debug any error that occurs, since locating the error in such a complex system isn't easy.

Luckily, Python offers us the logging library, which we can use with Scrapy to log system events as they occur. Once everything is logged, it's easy to go through it and locate the error (in case of a problem).


Logging Levels

  • Debug : Used to give detailed information about the program while it's running. Useful for diagnosing errors or unexpected outcomes.
  • Info : Used to confirm that things are working as expected.
  • Warning : Used to indicate that something unexpected has happened, or that an error is likely to occur due to certain events.
  • Error : Used to indicate a problem in the code, usually when the program was unable to carry out some operation.
  • Critical : The highest severity level, used for errors serious enough that the program may be unable to continue running; see the sketch after this list for how these levels filter messages.
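
These levels form a hierarchy from DEBUG (lowest) to CRITICAL (highest), and a logger only emits messages at or above its configured level. Here is a minimal sketch using Python's standard logging module (which Scrapy builds on), showing how setting the level to WARNING suppresses the lower-severity messages:

import logging

# Only messages of severity WARNING and above will be emitted
logging.basicConfig(level=logging.WARNING)

logging.debug("This is suppressed")      # below WARNING, not shown
logging.info("This is suppressed")       # below WARNING, not shown
logging.warning("This is printed")       # WARNING and above are shown
logging.critical("This is printed")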

Logging Error Messages

The first thing we need to do is import the logging library. Next, we have to create an actual logger object, through which we'll call all the logging methods and functions.

import logging 
logger = logging.getLogger('mylogger')

In the above example we've done the first two steps. We can give the logger its own name by passing a suitable string into the getLogger() function. If you leave it blank, you get the root logger instead.

In the below example we'll be logging messages of varying severity.

import logging

# By default Python only displays WARNING and above, so lower the
# threshold to DEBUG so that every message below is visible
logging.basicConfig(level=logging.DEBUG)

logger = logging.getLogger('mylogger')

# Info
logger.info("This is some random information")

# Warning
logger.warning("This is a warning")

# Critical
logger.critical("This is a critical error")
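
With the basicConfig() call above, running this script prints each message in the default format (level, logger name, message), producing output like:

INFO:mylogger:This is some random information
WARNING:mylogger:This is a warning
CRITICAL:mylogger:This is a critical error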

Scrapy Logging Example

Below is a very simple example of how one would integrate the logging library with Scrapy.

Once you've created the logger, you can begin calling its logging methods anywhere within the Scrapy code.

import logging
import scrapy

# Module-level logger shared by the spider below
logger = logging.getLogger('my_logger')

class ScrapySpider(scrapy.Spider):
    name = 'spider'
    start_urls = ['http://quotes.toscrape.com/']

    def parse(self, response):
        # Log the URL of every response this spider parses
        logger.info('Parse Function called on %s', response.url)

The above code logs the URL of every page the spider parses from the quotes.toscrape.com website. Note that by default Scrapy writes these messages to the console; to send them to a file instead, use the LOG_FILE setting described in the next section.
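
As a quick illustration (the file name spider.log here is just a placeholder, pick any path you like), you can direct the log output to a file for a single spider via Scrapy's custom_settings class attribute:

import logging
import scrapy

logger = logging.getLogger('my_logger')

class ScrapySpider(scrapy.Spider):
    name = 'spider'
    start_urls = ['http://quotes.toscrape.com/']

    # Per-spider settings: write all log output to spider.log
    custom_settings = {
        'LOG_FILE': 'spider.log',
        'LOG_LEVEL': 'INFO',
    }

    def parse(self, response):
        logger.info('Parse Function called on %s', response.url)

Alternatively, the setting can be passed on the command line when running the spider, e.g. scrapy runspider my_spider.py -s LOG_FILE=spider.log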


Logging Settings

LOG_ENABLED takes a Boolean value of True or False to determine whether logging should be enabled or not.

LOG_FILE takes a file path as its value, and stores all the logging messages within that file.

LOG_FORMAT is a setting that helps you choose what data you want to be printed out in the log message. By default, it’s '%(asctime)s [%(name)s] %(levelname)s: %(message)s', which prints out the time, name, level and the content of the log message.

LOG_LEVEL determines the minimum severity level that a message must have in order to be logged. The default value is DEBUG, which means messages of any level will be logged. (Severity increases from DEBUG up to CRITICAL.)

LOG_DATEFORMAT determines the way the date is printed out in the log. The default format is '%Y-%m-%d %H:%M:%S'.
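
Putting these together, here is a minimal sketch of how these settings might look in a project's settings.py file (the file name scrapy.log is just a placeholder):

# settings.py

LOG_ENABLED = True        # turn logging on
LOG_FILE = 'scrapy.log'   # write messages to this file instead of the console
LOG_LEVEL = 'INFO'        # skip DEBUG-level messages
LOG_FORMAT = '%(asctime)s [%(name)s] %(levelname)s: %(message)s'
LOG_DATEFORMAT = '%Y-%m-%d %H:%M:%S'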


This marks the end of the Scrapy Logging tutorial. Any suggestions or contributions for CodersLegacy are more than welcome. Questions regarding the article content can be asked in the comments section.
