- Resolves PEP8 violations in modules/checker.py.
- Resolves PEP8 violations in torcrawler.py.
- Reverts use of 'with' statement in check_ip function in modules/checker.py.
- Refactors modules/crawler.py to implement Crawler function.
- Refactors previous modules/crawler.py crawler method into 'crawl' method.
- Resolves PEP8 violations in modules/extractor.py.
- Refactors use of string formatting to enforce use of new string format convention.
- Amends try/except statements to handle additional HTTP error cases within extractor methods.
- Refactors torcrawl.py to utilise Crawler class and crawl method.
resolve-pep8-violations
- Implements error handling for uncaught HTTP exceptions.
- Implements TypeError handling for uncaught exceptions from BeautifulSoup.
resolve-pep8-violations
…rCrawl.py into resolve-pep8-violations
MikeMeliz
left a comment
Awesome work! I loved what you did with Crawler, and the formatting, needless to say, was sorely needed!
Hey @the-siegfried, I can't thank you enough for the time you spent re-formatting this script! It was really needed to make it more accessible and usable for anyone using it. Now that I've revisited it through your PR, I can see there were a lot of mistakes that I made back when I started learning Python, and they were just carried through the years in this project. Amazing contribution!
Hi @MikeMeliz,
Description
This branch resolves a number of PEP8 violations, and includes some refactoring to use industry best practices and new coding conventions.
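As an illustration of the kind of convention change described here, the sketch below contrasts `%`-style formatting with `str.format`. It assumes that the "new string format convention" in the commit list means `str.format`; the function names and message text are hypothetical and do not come from the PR.

```python
def describe_old(url, depth):
    # Older %-style formatting.
    return "Crawling %s at depth %d" % (url, depth)


def describe_new(url, depth):
    # str.format, assumed here to be the newer convention.
    return "Crawling {0} at depth {1}".format(url, depth)
```

Both functions return the same string for the same arguments; only the formatting style differs.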
Motivation and Context
These changes leave the application in a well-documented, maintainable state so that future development can commence.
I intend to resolve some of the TODO comments within the solution, refactor the extractor module to provide a cinex/termex implementation which can be piped, and finally implement yara rule based keyword searching. However, I want to do so from a clean codebase.
How Has This Been Tested?
Functionally, there have been no breaking changes. That said, I have tested the application by exercising all of its workflows: verbose, with-tor, without-tor, crawl, pause, depth, each extract method, etc.
Please find attached the result sets of the following test cases:
Screenshots (if appropriate):
Types of changes