mirror of
https://github.com/adulau/HHHash.git
synced 2024-12-22 08:46:05 +00:00
Merge pull request #3 from Rafiot/http2_notes
chg: Update readme, add notes regarding HTTP2 and cURL
commit ff4e17e928
1 changed file with 6 additions and 2 deletions
@@ -20,8 +20,10 @@ The HHHash value is the SHA256 of the list.

 ### Calculating HHHash from a curl command

-~~~
-$ curl -s -D - https://www.circl.lu/ -o /dev/null | awk 'NR != 1' | cut -f1 -d: | sed '/^[[:space:]]*$/d' | sed -z 's/\n/:/g' | sed 's/.$//' | sha256sum | cut -f1 -d " " | awk {'print "hhh:1:"$1'}
+Curl will attempt to run the request using HTTP/2 by default. To get the same hash as the Python requests module (which does not support HTTP/2), specify the version with the `--http1.1` switch.
+
+~~~bash
+curl --http1.1 -s -D - https://www.circl.lu/ -o /dev/null | awk 'NR != 1' | cut -f1 -d: | sed '/^[[:space:]]*$/d' | sed -z 's/\n/:/g' | sed 's/.$//' | sha256sum | cut -f1 -d " " | awk {'print "hhh:1:"$1'}
 ~~~

 Output value
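
The curl pipeline in this hunk drops the status line, keeps the name of each header, joins the names with `:`, and hashes the result with SHA256. A minimal Python sketch of the same steps follows. The function name `hhhash_from_raw_headers` and the sample response are hypothetical and invented for illustration; this is not the code of the `hhhash` package.

```python
import hashlib


def hhhash_from_raw_headers(raw: str) -> str:
    """Hash the ordered header names of one raw HTTP response header dump."""
    lines = raw.splitlines()[1:]                        # awk 'NR != 1': drop the status line
    names = [line.split(":", 1)[0] for line in lines]   # cut -f1 -d: keeps the header name
    names = [n for n in names if n.strip()]             # sed '/^[[:space:]]*$/d': drop blank lines
    joined = ":".join(names)                            # join names with ':' and no trailing ':'
    digest = hashlib.sha256(joined.encode("utf-8")).hexdigest()
    return f"hhh:1:{digest}"


# Hypothetical response headers, not taken from a real server.
sample = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Mon, 01 Jan 2024 00:00:00 GMT\r\n"
    "Server: Apache\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)

print(hhhash_from_raw_headers(sample))
```

Only the header names and their order feed the hash, so the header values in `sample` do not affect the result.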
@@ -33,6 +35,8 @@ hhh:1:78f7ef0651bac1a5ea42ed9d22242ed8725f07815091032a34ab4e30d3c3cefc

 HHHash is an effective technique; however, its performance is heavily reliant on the characteristics of the HTTP client requests. Therefore, it is important to note that correlations between a set of hashes are typically established when using the same crawler or HTTP client parameters.

+HTTP/2 requires the [headers to be lowercase](https://www.rfc-editor.org/rfc/rfc7540#section-8.1.2). This changes the hash, so you need to be aware of the HTTP version you're using.
+
 ### hhhash - Python Library

 The [hhhash package](https://pypi.org/project/hhhash/) can be installed with `pip install hhhash` or built with Poetry from this repository using `poetry build` and `poetry install`.
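
The effect of the HTTP version on the hash can be illustrated with a short Python sketch. It assumes a server that returns the same headers in the same order over both versions, with only the letter case differing. The helper `hhhash` and the header names are hypothetical and used for illustration only.

```python
import hashlib


def hhhash(names):
    """Join header names with ':' and hash them, following the hhh:1 format."""
    joined = ":".join(names)
    return "hhh:1:" + hashlib.sha256(joined.encode("utf-8")).hexdigest()


# Hypothetical header names as an HTTP/1.1 client might receive them.
http11_names = ["Date", "Server", "Content-Type"]
# The same names as an HTTP/2 client would receive them (lowercase).
http2_names = [name.lower() for name in http11_names]

h11 = hhhash(http11_names)
h2 = hhhash(http2_names)

print(h11)
print(h2)
print(h11 == h2)  # False: the input strings differ in case
```

This is why hashes collected with different HTTP versions should not be compared directly.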