[MRG+1] Respect custom log level: fixes GH-1612 #2581

lopuhin · 2017-02-21T16:58:56Z

See #1612: ~~this uses self.settings instead of settings, as elsewhere in the crawler init.~~
Main change is that now a new root logger is installed when a crawler is created if one was already installed before. This allows to respect custom settings related to logging, such as LOG_LEVEL, LOG_FILE, etc.

As the result, in case of several spiders, logging settings from the last spider take precedence. Without this change, all logging settings in custom_settings are ignored. Both behaviours are not great, but it seems that in most cases there is just one spider running in the process, so the new behaviour seems less bad, but it's a tough call.

codecov-io · 2017-02-21T17:23:39Z

Codecov Report

Merging #2581 into master will increase coverage by 0.07%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #2581      +/-   ##
==========================================
+ Coverage   84.14%   84.21%   +0.07%     
==========================================
  Files         161      161              
  Lines        9063     9072       +9     
  Branches     1344     1346       +2     
==========================================
+ Hits         7626     7640      +14     
+ Misses       1177     1174       -3     
+ Partials      260      258       -2

Impacted Files	Coverage Δ
scrapy/utils/log.py	`88.5% <100%> (+7.25%)`	✅
scrapy/crawler.py	`90% <100%> (+0.12%)`	✅

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7e8453c...6f55ca4. Read the comment docs.

kmike · 2017-02-21T18:10:43Z

The fix makes sense, but I wonder what happens if there are two spiders in the same process (executed via CrawlerRunner), and one of them changes LOG_LEVEL - does it affect the other spider?

lopuhin · 2017-02-21T18:12:14Z

Yes, I don't think you can have two spiders with different log levels in one process, and I don't immediately see a backwards-compatible fix for that.

djunzu · 2017-02-23T00:50:19Z

I guess the last spider to start will override any previous log level.
In this case I think a warning note should be somewhere in the docs saying it.

kmike · 2017-02-23T14:26:19Z

@djunzu I'd expect it to still follow log level for this spider, at least if spider is in context. This is not always possible though because some messages are global (without spider in context). But if we document current behavior then maybe we'll documet a bug; I'm not sure. Anyways, it looks like a different issue.

djunzu · 2017-02-23T23:37:19Z

@kmike I decided to test it just to know what happens. I tried

# settings.py:
LOG_LEVEL = 'WARNING'

#testspider.py:
class TestSpider(scrapy.Spider):
    name = "Test"
    custom_settings = {
        'LOG_LEVEL': 'DEBUG',
    }

# output just with warning messages

and

# settings.py:
LOG_LEVEL = 'DEBUG'

#testspider.py:
class TestSpider(scrapy.Spider):
    name = "Test"
    custom_settings = {
        'LOG_LEVEL': 'WARNING',
    }

# output with a lot of debug messages

and

# settings.py:
#LOG_LEVEL = 'DEBUG'

#testspider.py:
class TestSpider(scrapy.Spider):
    name = "Test"
    custom_settings = {
        'LOG_LEVEL': 'WARNING',
    }

# output with a lot of debug messages (default LOG_LEVEL)

None of them works. It does not solve the problem and it is still not possible to set log level using custom_settings attribute. (This or I did something very wrong!)

tested with scrapy 1.3.2

@lopuhin Did you do any other test than the one you wrote?

lopuhin · 2017-03-01T17:54:03Z

@djunzu yes, I did check that this pull request solves the issue for me (specifically, I was using scrapy runspider command), and I checked that the test I wrote fails without this PR. Let me also check the scrapy crawl command.

lopuhin · 2017-03-01T18:55:23Z

@djunzu you are right, this is my mistake. Honestly, I'm not sure how did I test it and get what I expected. Thanks for checking! I'll see if I can fix it.
The test also needs updating: it just check log stats (which do respect log level in custom_settings), but not the actual log output.

lopuhin · 2017-03-02T11:53:07Z

The tests are green finally, yay! @kmike @djunzu @redapple do you mind having another look at this? I updated the PR description with summary of changes and also checked that scrapy crawl and scrapy runspider commands both respect custom log settings.

kmike · 2017-03-02T12:25:10Z

scrapy/crawler.py

        logging.root.addHandler(handler)
+        configure_logging(self.settings)


Hm, previously this function was only called by Scrapy commands, not by Crawler. I think we should check carefully that calling it in Crawler doesn't affect usage examples in https://doc.scrapy.org/en/latest/topics/logging.html#scrapy.utils.log.configure_logging.

Right, thanks for the catch! Probably it's wrong to call it unconditionally when starting any crawler - when you are using scrapy as a library you'd like to be in control of the logging. Initially I added this call to support other log settings as well, but now I see that this is problematic. Maybe it's better to change scrapy root handler only if it's already installed.

A new root logger is installed when a crawler is created if one was already installed before. This allows to respect custom settings related to logging, such as LOG_LEVEL, LOG_FILE, etc.

lopuhin · 2017-03-04T07:15:27Z

Now a new root logger is installed when a crawler is created only if one was already installed before.

kmike · 2017-03-07T11:11:07Z

tests/test_crawl.py

        with LogCapture('scrapy', level=logging.ERROR) as l:
-            crawler = self.runner.create_crawler(BrokenStartRequestsSpider)


what is the reason for this change?

I made that changes because a new root log handler was installed when crawler was created, and so LogCapture did not work (and many other tests also have crawler creation outside of LogCapture). But it looks like it is no longer required, at least when the tests are run in isolation. I revered this changes in 6f55ca4, let's see if the full test suite passes.

Yep, the tests passed, probably it's better to leave it as it was in master. Thanks for the catch!

kmike · 2017-03-07T13:09:59Z

Looks good to me, thanks @lopuhin for the fix and @djunzu for the review.

djunzu · 2017-03-07T20:15:59Z

👍

Back to my initial comment:
I think that docs should have this note: "in case of several spiders, logging settings from the last spider take precedence".

@kmike, regarding

But if we document current behavior then maybe we'll documet a bug

I think documented bugs are good: everyone knows what to expect. Undocumented bugs are bad: when you are not expecting it, something blows up in front of you!

Nonetheless, +1 for merge.

lopuhin force-pushed the respect-custom-log-level branch from a0b1872 to fc8b2a8 Compare February 21, 2017 17:03

kmike changed the title ~~Respect custom log level: fixes GH-1612~~ [MRG+1] Respect custom log level: fixes GH-1612 Feb 21, 2017

kmike added this to the v1.4 milestone Feb 22, 2017

lopuhin changed the title ~~[MRG+1] Respect custom log level: fixes GH-1612~~ [WIP] Respect custom log level: fixes GH-1612 Mar 1, 2017

lopuhin force-pushed the respect-custom-log-level branch from 97aea11 to 295d681 Compare March 2, 2017 10:09

lopuhin changed the title ~~[WIP] Respect custom log level: fixes GH-1612~~ Respect custom log level: fixes GH-1612 Mar 2, 2017

lopuhin changed the title ~~Respect custom log level: fixes GH-1612~~ [WIP] Respect custom log level: fixes GH-1612 Mar 2, 2017

lopuhin force-pushed the respect-custom-log-level branch from 295d681 to 0dd02f7 Compare March 2, 2017 10:48

lopuhin changed the title ~~[WIP] Respect custom log level: fixes GH-1612~~ Respect custom log level: fixes GH-1612 Mar 2, 2017

lopuhin force-pushed the respect-custom-log-level branch from 0dd02f7 to 6230d3d Compare March 2, 2017 10:57

lopuhin changed the title ~~Respect custom log level: fixes GH-1612~~ [WIP] Respect custom log level: fixes GH-1612 Mar 2, 2017

redapple mentioned this pull request Mar 2, 2017

[WIP] Add warning on duplicate spider name #2612

Merged

redapple self-assigned this Mar 2, 2017

lopuhin changed the title ~~[WIP] Respect custom log level: fixes GH-1612~~ Respect custom log level: fixes GH-1612 Mar 2, 2017

kmike reviewed Mar 2, 2017

View reviewed changes

lopuhin changed the title ~~Respect custom log level: fixes GH-1612~~ [WIP] Respect custom log level: fixes GH-1612 Mar 2, 2017

Respect log settings in custom_settings: fixes scrapyGH-1612

ef04cfd

A new root logger is installed when a crawler is created if one was already installed before. This allows to respect custom settings related to logging, such as LOG_LEVEL, LOG_FILE, etc.

lopuhin force-pushed the respect-custom-log-level branch from 34af1c5 to ef04cfd Compare March 3, 2017 15:03

lopuhin changed the title ~~[WIP] Respect custom log level: fixes GH-1612~~ Respect custom log level: fixes GH-1612 Mar 4, 2017

redapple removed their assignment Mar 5, 2017

kmike reviewed Mar 7, 2017

View reviewed changes

Revert unneeded test_crawl changes

6f55ca4

kmike changed the title ~~Respect custom log level: fixes GH-1612~~ [MRG+1] Respect custom log level: fixes GH-1612 Mar 7, 2017

redapple merged commit c0cbacc into scrapy:master Mar 9, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[MRG+1] Respect custom log level: fixes GH-1612 #2581

[MRG+1] Respect custom log level: fixes GH-1612 #2581

lopuhin commented Feb 21, 2017 •

edited

Loading

codecov-io commented Feb 21, 2017 •

edited

Loading

kmike commented Feb 21, 2017

lopuhin commented Feb 21, 2017

djunzu commented Feb 23, 2017

kmike commented Feb 23, 2017

djunzu commented Feb 23, 2017

lopuhin commented Mar 1, 2017

lopuhin commented Mar 1, 2017

lopuhin commented Mar 2, 2017 •

edited

Loading

kmike Mar 2, 2017

lopuhin Mar 2, 2017

lopuhin commented Mar 4, 2017

kmike Mar 7, 2017

lopuhin Mar 7, 2017

lopuhin Mar 7, 2017

kmike commented Mar 7, 2017

djunzu commented Mar 7, 2017

		logging.root.addHandler(handler)
		configure_logging(self.settings)

		with LogCapture('scrapy', level=logging.ERROR) as l:
		crawler = self.runner.create_crawler(BrokenStartRequestsSpider)

[MRG+1] Respect custom log level: fixes GH-1612 #2581

[MRG+1] Respect custom log level: fixes GH-1612 #2581

Conversation

lopuhin commented Feb 21, 2017 • edited Loading

codecov-io commented Feb 21, 2017 • edited Loading

Codecov Report

kmike commented Feb 21, 2017

lopuhin commented Feb 21, 2017

djunzu commented Feb 23, 2017

kmike commented Feb 23, 2017

djunzu commented Feb 23, 2017

lopuhin commented Mar 1, 2017

lopuhin commented Mar 1, 2017

lopuhin commented Mar 2, 2017 • edited Loading

kmike Mar 2, 2017

Choose a reason for hiding this comment

lopuhin Mar 2, 2017

Choose a reason for hiding this comment

lopuhin commented Mar 4, 2017

kmike Mar 7, 2017

Choose a reason for hiding this comment

lopuhin Mar 7, 2017

Choose a reason for hiding this comment

lopuhin Mar 7, 2017

Choose a reason for hiding this comment

kmike commented Mar 7, 2017

djunzu commented Mar 7, 2017

lopuhin commented Feb 21, 2017 •

edited

Loading

codecov-io commented Feb 21, 2017 •

edited

Loading

lopuhin commented Mar 2, 2017 •

edited

Loading