This is the Python implementation of referer-parser, the library for extracting search marketing data from referer (sic) URLs.
The implementation uses the shared 'database' of known referers found in referers.yml (converted to a referers.json file,
see below).
Currently the Python library only extracts search engine referers; it still needs updating with the additional functionality now found in the Java/Scala version.
The Python version of referer-parser is maintained by Don Spaulding.
$ pip install referer_parser
Create a new instance of a Referer object by passing in the url you want to parse:
from referer_parser import Referer
referer_url = 'http://www.google.com/search?q=gateway+oracle+cards+denise+linn&hl=en&client=safari'
r = Referer(referer_url)

The r variable now holds a Referer instance. The important attributes are:
print(r.known) # True
print(r.referer) # 'Google'
print(r.search_parameter) # 'q'
print(r.search_term) # 'gateway oracle cards denise linn'
print(r.uri) # ParseResult(scheme='http', netloc='www.google.com', path='/search', params='', query='q=gateway+oracle+cards+denise+linn&hl=en&client=safari', fragment='')

The uri attribute is an instance of ParseResult from the standard library's urlparse module (urllib.parse in Python 3).
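To make the relationship between uri, search_parameter and search_term more concrete, here is a minimal sketch that uses only the standard library. It is not the library's internal code; it assumes Python 3 (urllib.parse) and simply hard-codes 'q' as the search parameter for this one example URL.

```python
from urllib.parse import urlparse, parse_qs

referer_url = 'http://www.google.com/search?q=gateway+oracle+cards+denise+linn&hl=en&client=safari'

# urlparse returns a ParseResult, the same type the uri attribute holds
uri = urlparse(referer_url)

# parse_qs decodes the query string into a dict of lists ('+' becomes a space)
query = parse_qs(uri.query)

# Assumption for this sketch: 'q' is the parameter carrying the search term
search_term = query.get('q', [''])[0]

print(uri.netloc)
print(search_term)
```

The library adds the step this sketch skips: working out from the referers database which site the host belongs to and which query parameter holds the search term.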
Unlike the other ports, the Python version of referer-parser uses a referers.json file, generated from the main referers.yml file. This is for two reasons:
- Python's standard library includes a JSON parser but not a YAML parser
- Loading from JSON in Python is significantly faster than loading from YAML
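As an illustration of the first point, the sketch below loads referer data with nothing but the standard library's json module and indexes it by domain. The JSON layout, the SAMPLE_JSON excerpt and the build_lookup helper are all hypothetical, chosen for the example; the real referers.json may be structured differently.

```python
import json

# Hypothetical excerpt; the real referers.json layout may differ
SAMPLE_JSON = '''
{
  "search": {
    "Google": {
      "domains": ["www.google.com", "google.com"],
      "parameters": ["q"]
    }
  }
}
'''


def build_lookup(raw):
    """Index the parsed JSON by domain so each lookup is one dict access."""
    lookup = {}
    for medium, sources in json.loads(raw).items():
        for name, config in sources.items():
            for domain in config["domains"]:
                lookup[domain] = {
                    "medium": medium,
                    "name": name,
                    "parameters": config.get("parameters", []),
                }
    return lookup


lookup = build_lookup(SAMPLE_JSON)
print(lookup["www.google.com"]["name"])
```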
To support the referers.json file, the distribution process for Python looks like this:
$ ./sync_yaml.sh
$ python/build_json.py
$ python setup.py sdist bdist_wininst upload
- Fork it
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create new Pull Request
The referer-parser Python library is copyright 2012-2013 Don Spaulding.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this software except in compliance with the License.
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.