Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Respect custom log level: fixes GH-1612 #2581

Merged
merged 2 commits into from
Mar 9, 2017

Conversation

lopuhin
Copy link
Member

@lopuhin lopuhin commented Feb 21, 2017

See #1612: this uses self.settings instead of settings, as elsewhere in the crawler init.
Main change is that now a new root logger is installed when a crawler is created if one was already installed before. This allows to respect custom settings related to logging, such as LOG_LEVEL, LOG_FILE, etc.

As the result, in case of several spiders, logging settings from the last spider take precedence. Without this change, all logging settings in custom_settings are ignored. Both behaviours are not great, but it seems that in most cases there is just one spider running in the process, so the new behaviour seems less bad, but it's a tough call.

@lopuhin lopuhin force-pushed the respect-custom-log-level branch from a0b1872 to fc8b2a8 Compare February 21, 2017 17:03
@codecov-io
Copy link

codecov-io commented Feb 21, 2017

Codecov Report

Merging #2581 into master will increase coverage by 0.07%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #2581      +/-   ##
==========================================
+ Coverage   84.14%   84.21%   +0.07%     
==========================================
  Files         161      161              
  Lines        9063     9072       +9     
  Branches     1344     1346       +2     
==========================================
+ Hits         7626     7640      +14     
+ Misses       1177     1174       -3     
+ Partials      260      258       -2
Impacted Files Coverage Δ
scrapy/utils/log.py 88.5% <100%> (+7.25%)
scrapy/crawler.py 90% <100%> (+0.12%)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7e8453c...6f55ca4. Read the comment docs.

@kmike
Copy link
Member

kmike commented Feb 21, 2017

The fix makes sense, but I wonder what happens if there are two spiders in the same process (executed via CrawlerRunner), and one of them changes LOG_LEVEL - does it affect the other spider?

@lopuhin
Copy link
Member Author

lopuhin commented Feb 21, 2017

Yes, I don't think you can have two spiders with different log levels in one process, and I don't immediately see a backwards-compatible fix for that.

@kmike kmike changed the title Respect custom log level: fixes GH-1612 [MRG+1] Respect custom log level: fixes GH-1612 Feb 21, 2017
@kmike kmike added this to the v1.4 milestone Feb 22, 2017
@djunzu
Copy link
Contributor

djunzu commented Feb 23, 2017

I guess the last spider to start will override any previous log level.
In this case I think a warning note should be somewhere in the docs saying it.

@kmike
Copy link
Member

kmike commented Feb 23, 2017

@djunzu I'd expect it to still follow log level for this spider, at least if spider is in context. This is not always possible though because some messages are global (without spider in context). But if we document current behavior then maybe we'll documet a bug; I'm not sure. Anyways, it looks like a different issue.

@djunzu
Copy link
Contributor

djunzu commented Feb 23, 2017

@kmike I decided to test it just to know what happens. I tried

# settings.py:
LOG_LEVEL = 'WARNING'

#testspider.py:
class TestSpider(scrapy.Spider):
    name = "Test"
    custom_settings = {
        'LOG_LEVEL': 'DEBUG',
    }

# output just with warning messages

and

# settings.py:
LOG_LEVEL = 'DEBUG'

#testspider.py:
class TestSpider(scrapy.Spider):
    name = "Test"
    custom_settings = {
        'LOG_LEVEL': 'WARNING',
    }

# output with a lot of debug messages

and

# settings.py:
#LOG_LEVEL = 'DEBUG'

#testspider.py:
class TestSpider(scrapy.Spider):
    name = "Test"
    custom_settings = {
        'LOG_LEVEL': 'WARNING',
    }

# output with a lot of debug messages (default LOG_LEVEL)

None of them works. It does not solve the problem and it is still not possible to set log level using custom_settings attribute. (This or I did something very wrong!)

tested with scrapy 1.3.2

@lopuhin Did you do any other test than the one you wrote?

@lopuhin
Copy link
Member Author

lopuhin commented Mar 1, 2017

@djunzu yes, I did check that this pull request solves the issue for me (specifically, I was using scrapy runspider command), and I checked that the test I wrote fails without this PR. Let me also check the scrapy crawl command.

@lopuhin
Copy link
Member Author

lopuhin commented Mar 1, 2017

@djunzu you are right, this is my mistake. Honestly, I'm not sure how did I test it and get what I expected. Thanks for checking! I'll see if I can fix it.
The test also needs updating: it just check log stats (which do respect log level in custom_settings), but not the actual log output.

@lopuhin lopuhin changed the title [MRG+1] Respect custom log level: fixes GH-1612 [WIP] Respect custom log level: fixes GH-1612 Mar 1, 2017
@lopuhin lopuhin force-pushed the respect-custom-log-level branch from 97aea11 to 295d681 Compare March 2, 2017 10:09
@lopuhin lopuhin changed the title [WIP] Respect custom log level: fixes GH-1612 Respect custom log level: fixes GH-1612 Mar 2, 2017
@lopuhin lopuhin changed the title Respect custom log level: fixes GH-1612 [WIP] Respect custom log level: fixes GH-1612 Mar 2, 2017
@lopuhin lopuhin force-pushed the respect-custom-log-level branch from 295d681 to 0dd02f7 Compare March 2, 2017 10:48
@lopuhin lopuhin changed the title [WIP] Respect custom log level: fixes GH-1612 Respect custom log level: fixes GH-1612 Mar 2, 2017
@lopuhin lopuhin force-pushed the respect-custom-log-level branch from 0dd02f7 to 6230d3d Compare March 2, 2017 10:57
@lopuhin lopuhin changed the title Respect custom log level: fixes GH-1612 [WIP] Respect custom log level: fixes GH-1612 Mar 2, 2017
@redapple redapple self-assigned this Mar 2, 2017
@lopuhin
Copy link
Member Author

lopuhin commented Mar 2, 2017

The tests are green finally, yay! @kmike @djunzu @redapple do you mind having another look at this? I updated the PR description with summary of changes and also checked that scrapy crawl and scrapy runspider commands both respect custom log settings.

@lopuhin lopuhin changed the title [WIP] Respect custom log level: fixes GH-1612 Respect custom log level: fixes GH-1612 Mar 2, 2017
logging.root.addHandler(handler)
configure_logging(self.settings)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, previously this function was only called by Scrapy commands, not by Crawler. I think we should check carefully that calling it in Crawler doesn't affect usage examples in https://doc.scrapy.org/en/latest/topics/logging.html#scrapy.utils.log.configure_logging.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, thanks for the catch! Probably it's wrong to call it unconditionally when starting any crawler - when you are using scrapy as a library you'd like to be in control of the logging. Initially I added this call to support other log settings as well, but now I see that this is problematic. Maybe it's better to change scrapy root handler only if it's already installed.

@lopuhin lopuhin changed the title Respect custom log level: fixes GH-1612 [WIP] Respect custom log level: fixes GH-1612 Mar 2, 2017
A new root logger is installed when a crawler is created
if one was already installed before.
This allows to respect custom settings related to logging,
such as LOG_LEVEL, LOG_FILE, etc.
@lopuhin lopuhin force-pushed the respect-custom-log-level branch from 34af1c5 to ef04cfd Compare March 3, 2017 15:03
@lopuhin lopuhin changed the title [WIP] Respect custom log level: fixes GH-1612 Respect custom log level: fixes GH-1612 Mar 4, 2017
@lopuhin
Copy link
Member Author

lopuhin commented Mar 4, 2017

Now a new root logger is installed when a crawler is created only if one was already installed before.

@redapple redapple removed their assignment Mar 5, 2017
with LogCapture('scrapy', level=logging.ERROR) as l:
crawler = self.runner.create_crawler(BrokenStartRequestsSpider)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the reason for this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made that changes because a new root log handler was installed when crawler was created, and so LogCapture did not work (and many other tests also have crawler creation outside of LogCapture). But it looks like it is no longer required, at least when the tests are run in isolation. I revered this changes in 6f55ca4, let's see if the full test suite passes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, the tests passed, probably it's better to leave it as it was in master. Thanks for the catch!

@kmike kmike changed the title Respect custom log level: fixes GH-1612 [MRG+1] Respect custom log level: fixes GH-1612 Mar 7, 2017
@kmike
Copy link
Member

kmike commented Mar 7, 2017

Looks good to me, thanks @lopuhin for the fix and @djunzu for the review.

@djunzu
Copy link
Contributor

djunzu commented Mar 7, 2017

👍

Back to my initial comment:
I think that docs should have this note: "in case of several spiders, logging settings from the last spider take precedence".

@kmike, regarding

But if we document current behavior then maybe we'll documet a bug

I think documented bugs are good: everyone knows what to expect. Undocumented bugs are bad: when you are not expecting it, something blows up in front of you!

Nonetheless, +1 for merge.

@redapple redapple merged commit c0cbacc into scrapy:master Mar 9, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants