[MRG+1] Respect custom log level: fixes GH-1612 #2581
Conversation
Force-pushed from a0b1872 to fc8b2a8
Codecov Report

```
@@            Coverage Diff             @@
##           master    #2581      +/-   ##
==========================================
+ Coverage   84.14%   84.21%   +0.07%
==========================================
  Files         161      161
  Lines        9063     9072       +9
  Branches     1344     1346       +2
==========================================
+ Hits         7626     7640      +14
+ Misses       1177     1174       -3
+ Partials      260      258       -2
```

Continue to review full report at Codecov.
The fix makes sense, but I wonder what happens if there are two spiders in the same process (executed via CrawlerRunner), and one of them changes LOG_LEVEL - does it affect the other spider?

Yes, I don't think you can have two spiders with different log levels in one process, and I don't immediately see a backwards-compatible fix for that.

I guess the last spider to start will override any previous log level.
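For concreteness, a minimal sketch of the scenario being discussed (spider names are made up; with the fix applied, whichever crawler is created last sets the root handler's level):

```python
import scrapy
from twisted.internet import defer, reactor
from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging

class QuietSpider(scrapy.Spider):
    name = "quiet"
    custom_settings = {'LOG_LEVEL': 'WARNING'}

class ChattySpider(scrapy.Spider):
    name = "chatty"
    custom_settings = {'LOG_LEVEL': 'DEBUG'}

configure_logging()  # installs Scrapy's root log handler once, up front

@defer.inlineCallbacks
def crawl_both():
    runner = CrawlerRunner()
    yield runner.crawl(QuietSpider)   # root handler level now WARNING
    yield runner.crawl(ChattySpider)  # handler re-installed at DEBUG: last one wins
    reactor.stop()

crawl_both()
reactor.run()
```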
@djunzu I'd expect it to still follow log level for this spider, at least if
@kmike I decided to test it just to know what happens. I tried

```python
# settings.py:
LOG_LEVEL = 'WARNING'

# testspider.py:
class TestSpider(scrapy.Spider):
    name = "Test"
    custom_settings = {
        'LOG_LEVEL': 'DEBUG',
    }
# output just with warning messages
```

and

```python
# settings.py:
LOG_LEVEL = 'DEBUG'

# testspider.py:
class TestSpider(scrapy.Spider):
    name = "Test"
    custom_settings = {
        'LOG_LEVEL': 'WARNING',
    }
# output with a lot of debug messages
```

and

```python
# settings.py:
#LOG_LEVEL = 'DEBUG'

# testspider.py:
class TestSpider(scrapy.Spider):
    name = "Test"
    custom_settings = {
        'LOG_LEVEL': 'WARNING',
    }
# output with a lot of debug messages (default LOG_LEVEL)
```

None of them works. It does not solve the problem and it is still not possible to set the log level using `custom_settings`. Tested with scrapy 1.3.2. @lopuhin Did you do any other test than the one you wrote?
@djunzu yes, I did check that this pull request solves the issue for me (specifically, I was using
@djunzu you are right, this is my mistake. Honestly, I'm not sure how I tested it and got what I expected. Thanks for checking! I'll see if I can fix it.
Force-pushed from 97aea11 to 295d681
Force-pushed from 295d681 to 0dd02f7
Force-pushed from 0dd02f7 to 6230d3d
scrapy/crawler.py (Outdated)

```python
logging.root.addHandler(handler)
configure_logging(self.settings)
```
Hm, previously this function was only called by Scrapy commands, not by Crawler. I think we should check carefully that calling it in Crawler doesn't affect usage examples in https://doc.scrapy.org/en/latest/topics/logging.html#scrapy.utils.log.configure_logging.
Right, thanks for the catch! Probably it's wrong to call it unconditionally when starting any crawler - when you are using scrapy as a library you'd like to be in control of the logging. Initially I added this call to support other log settings as well, but now I see that this is problematic. Maybe it's better to change scrapy root handler only if it's already installed.
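For reference, this is the kind of library-style usage from the linked docs that has to keep working: the caller opts out of Scrapy's root handler and configures logging itself.

```python
import logging
from scrapy.utils.log import configure_logging

# Disable Scrapy's root handler; the application owns logging configuration.
configure_logging(install_root_handler=False)
logging.basicConfig(
    filename='log.txt',
    format='%(levelname)s: %(message)s',
    level=logging.INFO,
)
```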
A new root logger is installed when a crawler is created if one was already installed before. This makes it possible to respect custom logging settings such as LOG_LEVEL, LOG_FILE, etc.
Force-pushed from 34af1c5 to ef04cfd
Now a new root logger is installed when a crawler is created only if one was already installed before.
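Roughly, the rule is the following (a sketch with made-up helper names and a plain dict for settings, not the exact patch):

```python
import logging

_scrapy_root_handler = None  # the handler Scrapy installed, if any

def install_root_handler(settings):
    """(Re-)install Scrapy's root log handler from `settings`."""
    global _scrapy_root_handler
    if _scrapy_root_handler is not None:
        logging.root.removeHandler(_scrapy_root_handler)
    handler = logging.StreamHandler()
    handler.setLevel(settings.get('LOG_LEVEL', 'DEBUG'))
    logging.root.addHandler(handler)
    _scrapy_root_handler = handler

def update_root_handler_from_crawler(settings):
    # Called when a crawler is created: if Scrapy never installed a root
    # handler (library usage), leave logging alone; otherwise re-install
    # it so per-spider settings like LOG_LEVEL take effect.
    if _scrapy_root_handler is not None:
        install_root_handler(settings)
```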
tests/test_crawl.py (Outdated)

```python
with LogCapture('scrapy', level=logging.ERROR) as l:
    crawler = self.runner.create_crawler(BrokenStartRequestsSpider)
```
what is the reason for this change?
I made those changes because a new root log handler was installed when a crawler was created, and so LogCapture did not work (and many other tests also have crawler creation outside of LogCapture). But it looks like it is no longer required, at least when the tests are run in isolation. I reverted these changes in 6f55ca4, let's see if the full test suite passes.
Yep, the tests passed; it's probably better to leave it as it was in master. Thanks for the catch!
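For context, the restored pattern looks roughly like this (BrokenStartRequestsSpider and the `fail_before_yield` argument come from Scrapy's own test suite; the test name and assertion are illustrative):

```python
import logging
from testfixtures import LogCapture
from twisted.internet import defer
from twisted.trial import unittest
from scrapy.crawler import CrawlerRunner
from tests.spiders import BrokenStartRequestsSpider  # from Scrapy's test suite

class CrawlLoggingTest(unittest.TestCase):
    @defer.inlineCallbacks
    def test_broken_start_requests(self):
        # As in master: both crawler creation and the crawl itself run
        # inside LogCapture, so records emitted at either stage are captured.
        runner = CrawlerRunner()
        with LogCapture('scrapy', level=logging.ERROR) as log:
            crawler = runner.create_crawler(BrokenStartRequestsSpider)
            yield crawler.crawl(fail_before_yield=1)
        self.assertTrue(log.records)  # the start_requests error was captured
```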
👍 Back to my initial comment: @kmike, regarding
I think documented bugs are good: everyone knows what to expect. Undocumented bugs are bad: when you are not expecting it, something blows up in front of you! Nonetheless, +1 for merge.
See #1612: this uses `self.settings` instead of `settings`, as elsewhere in the crawler init.

Main change is that now a new root logger is installed when a crawler is created if one was already installed before. This makes it possible to respect custom logging settings such as LOG_LEVEL, LOG_FILE, etc.

As a result, in the case of several spiders, logging settings from the last spider take precedence. Without this change, all logging settings in `custom_settings` are ignored. Neither behaviour is great, but it seems that in most cases there is just one spider running in the process, so the new behaviour seems less bad. It's a tough call, though.