API Reference¶
damage¶
Manifest generator for data files.
Produces a text file with user-specified checksums for all files from the top of a specified tree, and checks line length and ASCII character status for text files.
For statistics program files (SAS .sas7bdat, SPSS .sav, Stata .dta), Checker() will report the number of cases and variables as rows and columns respectively.
Checker Objects¶
class Checker()
A collection of various tools attached to a file
__init__¶
def __init__(fname: str) -> None
Initializes Checker instance
fname : str
Path to file
__del__¶
def __del__() -> None
Destructor closes file
produce_digest¶
def produce_digest(prot: str = 'md5', blocksize: int = 2 * 16) -> str
Returns hex digest for object
fname : str
Path to a file object
prot : str
Hash type. Supported hashes: 'sha1', 'sha224', 'sha256',
'sha384', 'sha512', 'blake2b', 'blake2s', 'md5'.
Default: 'md5'
blocksize : int
Read block size in bytes
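The block-wise hashing described above can be sketched with the standard library alone. This is a minimal illustration, not the package's own implementation: the function name `file_digest` and its `path` argument are hypothetical, and the default block size of 2 ** 16 bytes is an assumption made for this sketch.

```python
import hashlib


def file_digest(path: str, prot: str = "md5", blocksize: int = 2 ** 16) -> str:
    """Return the hex digest of the file at `path`, read in fixed-size blocks.

    Hypothetical helper for illustration; not part of the documented API.
    """
    # hashlib.new raises ValueError if `prot` is not a known algorithm name.
    hasher = hashlib.new(prot)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns b"" at end of file,
        # so the whole file is never held in memory at once.
        for block in iter(lambda: handle.read(blocksize), b""):
            hasher.update(block)
    return hasher.hexdigest()
```

Reading in blocks keeps memory use bounded for large data files; the digest is the same regardless of the block size chosen.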
flat_tester¶
def flat_tester(**kwargs) -> dict
Checks file for line length and number of records.
Returns a dictionary:
{'min_cols': int, 'max_cols': int, 'numrec': int, 'constant': bool}
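A dictionary of this shape could be computed roughly as follows. This is a sketch under stated assumptions rather than the package's code: the function name `line_length_summary` is hypothetical, line width is taken to be the character count without the trailing newline, and `constant` is assumed to mean that every record has the same width.

```python
def line_length_summary(path: str, encoding: str = "utf-8") -> dict:
    """Summarize line widths of a text file (hypothetical illustration)."""
    min_cols = None
    max_cols = 0
    numrec = 0
    with open(path, "r", encoding=encoding) as handle:
        for line in handle:
            # Text mode translates line endings to "\n"; strip it before measuring.
            width = len(line.rstrip("\n"))
            min_cols = width if min_cols is None else min(min_cols, width)
            max_cols = max(max_cols, width)
            numrec += 1
    if min_cols is None:
        # Empty file: report zero width (an assumption of this sketch).
        min_cols = 0
    return {
        "min_cols": min_cols,
        "max_cols": max_cols,
        "numrec": numrec,
        "constant": min_cols == max_cols,
    }
```

A fixed-width data file should yield `constant` equal to `True`; differing minimum and maximum widths usually indicate a delimited file or a damaged record.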
non_ascii_tester¶
def non_ascii_tester(**kwargs) -> list
Returns a list of dicts of positions of non-ASCII characters in a text file.
[{'row': int, 'col': int, 'char': str}, ...]
fname : str
Path/filename
Keyword arguments:
flatfile : bool
asctest : bool
Perform character check (assuming it is text)
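The list of positions described above might be built as in the sketch below. It is illustrative only: the function name `find_non_ascii` is hypothetical, and the choice of 1-based row and column numbers is an assumption, since the reference does not state which indexing the real method uses.

```python
def find_non_ascii(path: str, encoding: str = "utf-8") -> list:
    """List positions of non-ASCII characters in a text file.

    Hypothetical helper; rows and columns are 1-based by assumption.
    """
    hits = []
    # errors="replace" keeps the scan going if some bytes cannot be decoded.
    with open(path, "r", encoding=encoding, errors="replace") as handle:
        for row, line in enumerate(handle, start=1):
            for col, char in enumerate(line, start=1):
                # ASCII covers code points 0 through 127.
                if ord(char) > 127:
                    hits.append({"row": row, "col": col, "char": char})
    return hits
```

An empty list means every character in the file is plain ASCII.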
null_count¶
def null_count(**kwargs) -> dict
Returns an integer count of null characters in the file (‘