sbu.dataframe

A module which handles data parsing and DataFrame construction.

Index

get_sbu(df, project[, start, end])

Acquire the SBU usage for each account in the pandas.DataFrame.index.

parse_accuse(project[, start, end])

Gather SBU usage of a specific user account.

get_date_range([start, end])

Return a starting and ending date as two strings.

construct_filename(prefix[, suffix])

Construct a filename containing the current date.

_get_datetimeindex(start, end)

Create a Pandas DatetimeIndex from a start and end date.

_parse_date(input_date[, default_day, ...])

Parse any dates supplied to get_date_range().

_get_total_sbu_requested(df)

Return the total number of requested SBUs.

API

sbu.dataframe.get_sbu(df, project, start=None, end=None)[source]

Acquire the SBU usage for each account in the pandas.DataFrame.index.

The start and end of the reported interval can optionally be altered with start and end. Performs an in-place update of df, adding new columns that hold the SBU usage per month under the "Month" super-column. In addition, a single row and a single column ("sum") are added, with SBU usage summed over the entire interval and over all users, respectively.

Parameters
  • df (pandas.DataFrame) – A Pandas DataFrame with usernames and information, constructed by yaml_to_pandas(). pandas.DataFrame.columns and pandas.DataFrame.index should be instances of pandas.MultiIndex and pandas.Index, respectively. User accounts are expected to be stored in pandas.DataFrame.index. SBU usage (including the sum) is stored in the "Month" super-column.

  • start (int or str, optional) – The starting year of the interval. Defaults to the current year if None.

  • end (str or int, optional) – The final year of the interval. Defaults to the current year + 1 if None.

  • project (str, optional) – The project code of the project of interest. If not None, only SBUs expended under this project are considered.

Return type

None

sbu.dataframe.parse_accuse(project, start=None, end=None)[source]

Gather SBU usage of a specific user account.

The bash command accuse is used for gathering SBU usage along an interval defined by start and end. Results are collected and returned in a Pandas DataFrame.

Parameters
  • project (str) – The project code of the project of interest.

  • start (str) – The starting date of the interval. Accepts dates formatted as YYYY, MM-YYYY or DD-MM-YYYY.

  • end (str) – The final date of the interval. Accepts dates formatted as YYYY, MM-YYYY or DD-MM-YYYY.

Returns

The SBU usage of the user over the specified period.

Return type

pandas.DataFrame

sbu.dataframe.get_date_range(start=None, end=None)[source]

Return a starting and ending date as two strings.

Parameters
  • start (int or str, optional) – The starting year of the interval. Accepts dates formatted as YYYY, MM-YYYY or DD-MM-YYYY. Defaults to the current year if None.

  • end (str or int, optional) – The final year of the interval. Accepts dates formatted as YYYY, MM-YYYY or DD-MM-YYYY. Defaults to the current year + 1 if None.

Returns

A tuple with the start and end dates, both formatted as DD-MM-YYYY strings.

Return type

tuple [str, str]
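
The documented defaults can be illustrated with a minimal sketch. The function name date_range_sketch and its today argument are hypothetical and not part of sbu.dataframe; the sketch only handles None and year-only input, and it assumes that a missing day and month are both filled with "01".

```python
from datetime import date


def date_range_sketch(start=None, end=None, today=None):
    """Illustrative only: mimic the documented defaults for year-only input."""
    # `today` is a hypothetical argument that makes the result reproducible.
    today = today if today is not None else date.today()

    # Documented defaults: the current year and the current year + 1.
    start_year = today.year if start is None else int(start)
    end_year = today.year + 1 if end is None else int(end)

    # Assumption: day and month default to "01" when only a year is known.
    return f'01-01-{start_year}', f'01-01-{end_year}'


print(date_range_sketch(today=date(2019, 5, 31)))
print(date_range_sketch(2018, '2019'))
```

Both elements of the returned tuple follow the DD-MM-YYYY layout described above.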

sbu.dataframe.construct_filename(prefix, suffix='.csv')[source]

Construct a filename containing the current date.

Examples

>>> filename = construct_filename('my_file', '.txt')
>>> print(filename)
my_file_31_May_2019.txt

Parameters
  • prefix (str) – A prefix for the to-be returned filename. The current date will be appended to this prefix.

  • suffix (str, optional) – An optional suffix for the to-be returned filename. No suffix will be attached if None.

Returns

A filename consisting of prefix, the current date and suffix.

Return type

str
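
One way such a filename could be assembled is sketched below. The name construct_filename_sketch and the today argument are hypothetical, and the strftime pattern '%d_%b_%Y' is an assumption inferred from the example output; the actual implementation may use a different pattern (for instance a full month name).

```python
from datetime import date


def construct_filename_sketch(prefix, suffix='.csv', today=None):
    """Illustrative only: join prefix, a date stamp and an optional suffix."""
    # `today` is a hypothetical argument that makes the result reproducible.
    today = today if today is not None else date.today()

    # Assumed date layout: day, abbreviated month name, year.
    stamp = today.strftime('%d_%b_%Y')

    # No suffix is attached if suffix is None.
    return f'{prefix}_{stamp}{suffix or ""}'


print(construct_filename_sketch('my_file', '.txt', today=date(2019, 5, 31)))
```

Note that '%b' is locale dependent; the sketch assumes the default C locale.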

sbu.dataframe._get_datetimeindex(start, end)[source]

Create a Pandas DatetimeIndex from a start and end date.

Parameters
  • start (str) – The start of the interval. Accepts dates formatted as DD-MM-YYYY.

  • end (str) – The end of the interval. Accepts dates formatted as DD-MM-YYYY.

Returns

A DatetimeIndex starting from start and ending on end.

Return type

pandas.DatetimeIndex
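
A DatetimeIndex can be built from two DD-MM-YYYY strings roughly as follows. This is a sketch, not the module's implementation: the name datetimeindex_sketch is hypothetical, and the monthly frequency ('MS', month start) is an assumption based on SBU usage being reported per month.

```python
import pandas as pd


def datetimeindex_sketch(start, end):
    """Illustrative only: build a monthly DatetimeIndex from DD-MM-YYYY strings."""
    # Parse with an explicit format so that day and month are not swapped.
    first = pd.to_datetime(start, format='%d-%m-%Y')
    last = pd.to_datetime(end, format='%d-%m-%Y')

    # Assumed frequency: one entry per month, at the start of each month.
    return pd.date_range(first, last, freq='MS')


index = datetimeindex_sketch('01-01-2019', '31-12-2019')
print(len(index))
```

For a full calendar year this yields one entry per month.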

sbu.dataframe._parse_date(input_date, default_day='01', default_month='01', default_year=None)[source]

Parse any dates supplied to get_date_range().

Parameters
  • input_date (str, int or None) –

    The to-be parsed date. Allowed types and values are:

    • None: Defaults to the first day of the current year and month.

    • int: A year (e.g. 2019).

    • str: A date in YYYY, MM-YYYY or DD-MM-YYYY format (e.g. "22-10-2018").

  • default_month (str) – The default month if a month is not provided in input_date. Expects a month in MM format.

  • default_year (str, optional) – The default year if a year is not provided in input_date. Expects a year in YYYY format. Defaults to the current year if None.

Returns

A string, constructed from input_date, representing a date in DD-MM-YYYY format.

Return type

str

Raises
  • ValueError – Raised if input_date is provided as a string and contains more than 2 dashes.

  • TypeError – Raised if input_date is neither None, a string nor an integer.
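
The parsing rules above can be sketched in a few lines. The name parse_date_sketch is hypothetical, and treating None as "use all three defaults" is only one reading of the description; the real function may behave differently in that case.

```python
from datetime import date


def parse_date_sketch(input_date, default_day='01', default_month='01',
                      default_year=None):
    """Illustrative only: return a DD-MM-YYYY string built from input_date."""
    if default_year is None:
        default_year = str(date.today().year)
    defaults = [default_day, default_month, default_year]

    if input_date is None:
        # Assumption: None falls back on all three defaults.
        parts = []
    elif isinstance(input_date, int):
        # An integer is interpreted as a year.
        parts = [str(input_date)]
    elif isinstance(input_date, str):
        parts = input_date.split('-')
        if len(parts) > 3:
            raise ValueError(f'more than 2 dashes in {input_date!r}')
    else:
        raise TypeError(f'unsupported type: {type(input_date).__name__!r}')

    # Missing leading fields (day, then month) are taken from the defaults.
    return '-'.join(defaults[:3 - len(parts)] + parts)


print(parse_date_sketch(2019))
print(parse_date_sketch('10-2018'))
print(parse_date_sketch('22-10-2018'))
```

Right-aligning the supplied fields against the defaults covers the YYYY, MM-YYYY and DD-MM-YYYY cases with a single code path.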

sbu.dataframe._get_total_sbu_requested(df)[source]

Return the total number of requested SBUs.

Return type

float