Napkin is a Python tool that produces a statistical analysis of a text.
Analysis features are:
- Verb frequency
- Noun frequency
- Digit frequency
- Named-entity label frequency (such as person, organisation, product, location), as defined in spaCy's [named entities](https://spacy.io/api/annotation#named-entities)
- URL frequency
- Email frequency
- Mention frequency (everything prefixed with an @ symbol)
- Out-of-vocabulary (OOV) word frequency, meaning any word that is not in the English dictionary
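
The categories above map fairly directly onto spaCy token attributes. The following is a minimal sketch of how such counts could be gathered, not napkin's actual code: the function names `count_features` and `analyse_text`, the category keys, and the model name `en_core_web_sm` are assumptions made for illustration. Treating any token that starts with `@` as a mention also depends on how the tokenizer splits the text.

```python
from collections import Counter

CATEGORIES = ("verb", "noun", "digit", "label", "url", "email", "mention", "oov")


def count_features(doc):
    """Tally the listed categories from a spaCy Doc (or any Doc-like object).

    Hypothetical helper: `doc` must be iterable over tokens and expose `.ents`.
    """
    counts = {name: Counter() for name in CATEGORIES}
    for token in doc:
        text = token.text
        if token.like_url:
            counts["url"][text] += 1
        elif token.like_email:
            counts["email"][text] += 1
        elif text.startswith("@") and len(text) > 1:
            # Assumption: a mention is any token prefixed with "@".
            counts["mention"][text] += 1
        elif token.is_digit:
            counts["digit"][text] += 1
        else:
            if token.pos_ == "VERB":
                counts["verb"][token.lemma_.lower()] += 1
            elif token.pos_ == "NOUN":
                counts["noun"][token.lemma_.lower()] += 1
            # is_oov depends on the vocabulary/vectors of the loaded model.
            if token.is_alpha and token.is_oov:
                counts["oov"][text.lower()] += 1
    # Count how often each named-entity label (PERSON, ORG, ...) occurs.
    for ent in doc.ents:
        counts["label"][ent.label_] += 1
    return counts


def analyse_text(text, model="en_core_web_sm"):
    """Run a spaCy pipeline over `text` and count features (model name assumed)."""
    import spacy  # imported lazily so count_features works without spaCy installed

    nlp = spacy.load(model)
    return count_features(nlp(text))
```

Each category ends up in its own `Counter`, so `counts["verb"].most_common(10)` would give the ten most frequent verb lemmas.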
# requirements
- Python >= 3.6
- spaCy (spacy.io)
- Redis (a Redis server running on port 6380)
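
Port 6380 is not Redis's default (6379), so it can be worth checking that something is listening there before running napkin. The snippet below is a small sketch using only the standard library. The helper name `port_is_open` and the host `localhost` are assumptions, and the check only confirms that a TCP port accepts connections, not that the service behind it is Redis.

```python
import socket


def port_is_open(host="localhost", port=6380, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if not port_is_open():
    print("Nothing is listening on localhost:6380; start the Redis server first.")
```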
# how to use napkin
~~~~
usage: napkin.py [-h] [-v V] [-f F] [-t T] [-o O]

Extract statistical analysis of text

optional arguments:
  -h, --help  show this help message and exit
  -v V        verbose output
  -f F        file to analyse
  -t T        maximum value for the top list (default is 100) -1 is no limit
  -o O        output format (default is csv)
~~~~
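
To make the options above concrete, here is a sketch of an `argparse` parser that would accept the same flags, plus one way the `-t` limit could be applied to a frequency table. It is reconstructed from the help text only and is not napkin's source: the function names `build_parser` and `top_items` are hypothetical, and the integer type for `-t` is an assumption.

```python
import argparse
from collections import Counter


def build_parser():
    """Build a parser mirroring the documented options (reconstruction, not the original)."""
    parser = argparse.ArgumentParser(
        prog="napkin.py", description="Extract statistical analysis of text")
    parser.add_argument("-v", help="verbose output")
    parser.add_argument("-f", help="file to analyse")
    parser.add_argument(
        "-t", type=int, default=100,
        help="maximum value for the top list (default is 100) -1 is no limit")
    parser.add_argument("-o", default="csv", help="output format (default is csv)")
    return parser


def top_items(counter, limit):
    """Return the `limit` most common items; a negative limit means no limit."""
    return counter.most_common(None if limit < 0 else limit)


args = build_parser().parse_args(["-f", "book.txt", "-t", "2"])
print(top_items(Counter("aaabbc"), args.t))  # the two most frequent characters
```

Passing `-t -1` makes `top_items` return every entry, which matches the "no limit" behaviour described in the help text.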
# example usage of napkin
A sample file "The Prince, by Nicoló Machiavelli" is included to test napkin.