Commons
- class ds_capability.components.commons.Commons
- static filter_columns(data: pa.Table, headers=None, d_types: list = None, regex: [str, list] = None, drop: bool = None) -> pa.Table
Returns a table containing the subset of columns that match the filter criteria. Filters are applied in the order d_types, headers, then regex.
- Parameters:
data – the Canonical data to get the column headers from
d_types – (optional) a list of pyarrow DataTypes to select column headers by
headers – (optional) a list of header strings to select from the column headers
regex – (optional) a regular expression to search for in the column headers
drop – (optional) reverses the selection and drops the selected column headers
- Returns:
a pa.Table containing the selected columns
- static filter_headers(data: pa.Table, headers: [str, list] = None, d_types: list = None, regex: [str, list] = None, drop: bool = None) -> list
Returns a list of headers based on the filter criteria. Filters are applied in the order d_types, headers, then regex. Data types are specified by the names of the type-checking functions in pyarrow.types, given as a string or a list of strings, for example ['is_integer', 'is_floating'].
- Parameters:
data – the Canonical data to get the column headers from
d_types – (optional) a list of pyarrow.types method names to select column headers by
headers – (optional) a list of header strings to select from the column headers
regex – (optional) a regular expression to search for in the column headers
drop – (optional) reverses the selection and drops the selected column headers
- Returns:
a filtered list of headers
- Raise:
TypeError if any of the types are not as expected
- static list_diff(seq: list, other: list, symmetric: bool = True) -> list
Utility method to return the difference between two lists, with duplicates removed from the result. When symmetric is True, the elements found in only one of the two lists are returned; when False, only the elements of the first list that are not in the second are returned.
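The effect of the symmetric flag may be easier to see in a small stand-alone sketch. The function sketch_list_diff below is hypothetical and only mirrors the description above; preserving first-seen order in the result is an assumption.

```python
def sketch_list_diff(seq, other, symmetric=True):
    """Hypothetical illustration of the described behaviour (not library code)."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    only_seq = [x for x in dict.fromkeys(seq) if x not in other]
    if not symmetric:
        return only_seq
    only_other = [x for x in dict.fromkeys(other) if x not in seq]
    return only_seq + only_other


both_ways = sketch_list_diff([1, 2, 3, 3], [3, 4])
one_way = sketch_list_diff([1, 2, 3, 3], [3, 4], symmetric=False)
```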
- static list_equal(seq: list, other: list) -> bool
Checks whether two lists contain the same elements with the same frequency, ignoring order
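An order-insensitive comparison that still respects element frequency can be sketched with collections.Counter. The name sketch_list_equal is hypothetical, and the sketch assumes the list elements are hashable.

```python
from collections import Counter


def sketch_list_equal(seq, other):
    """Hypothetical illustration: equal if every element occurs equally often."""
    return Counter(seq) == Counter(other)


same = sketch_list_equal([1, 2, 2], [2, 1, 2])
different = sketch_list_equal([1, 2, 2], [1, 1, 2])
```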
- static list_formatter(value: Any) -> list
Utility method to convert a str, list, tuple or array into a list
- static list_intersect(seq: list, other: list) -> list
Utility method to return the intersection of two lists, with duplicates removed from the result.
- static list_match(seq: list, pattern: str) -> list
Utility method to run a regular expression against the items of a list
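One plausible reading of this method is a filter that keeps the items matching the pattern. The sketch below is hypothetical (sketch_list_match is not a library function) and assumes re.search semantics, so the pattern may match anywhere in the item unless anchored.

```python
import re


def sketch_list_match(seq, pattern):
    """Hypothetical illustration: keep the items in which the pattern is found."""
    compiled = re.compile(pattern)
    return [item for item in seq if compiled.search(str(item))]


cols = ["price_usd", "price_eur", "quantity"]
matched = sketch_list_match(cols, r"^price_")
```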
- static list_search(seq: list, value: [int, float, str], low: int = None, high: int = None)
A binary search for a value in a list, restricted to the range between two indices
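A bounded binary search can be sketched with the standard bisect module. The return convention is not documented above, so the hypothetical sketch_list_search below assumes it returns the index of the value or -1 when absent, and treats high as an exclusive bound. As with any binary search, the list must already be sorted.

```python
from bisect import bisect_left


def sketch_list_search(seq, value, low=None, high=None):
    """Hypothetical illustration: index of value in sorted seq[low:high], else -1."""
    lo = 0 if low is None else low
    hi = len(seq) if high is None else high
    idx = bisect_left(seq, value, lo, hi)
    return idx if idx < hi and seq[idx] == value else -1


ordered = [1, 3, 5, 7, 9]
found = sketch_list_search(ordered, 7)
missing = sketch_list_search(ordered, 4)
out_of_range = sketch_list_search(ordered, 7, high=3)
```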
- static list_union(seq: list, other: list) -> list
Utility method to return the union of two lists, with duplicates removed from the result.
- static list_unique(seq: list) -> list
Utility method to remove duplicates from a list while retaining the original order
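Order-preserving de-duplication is commonly written with dict.fromkeys, since dictionaries keep insertion order. The name sketch_list_unique is hypothetical, and the sketch assumes hashable elements; the library may implement this differently.

```python
def sketch_list_unique(seq):
    """Hypothetical illustration: drop repeats, keep first-seen order."""
    return list(dict.fromkeys(seq))


deduped = sketch_list_unique([3, 1, 3, 2, 1])
```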
- static table_append(t1: pa.Table, t2: pa.Table)
Appends all the columns in t2 to t1
- static table_report(t: pa.Table, head: int = None, index_header: [str, list] = None, bold: [str, list] = None, large_font: [str, list] = None)
Generates a stylised version of the pyarrow table