Welcome to SBU-Reporter’s documentation!
SBU-Reporter
Tools for collecting, formatting and reporting SBU usage on the SURFsara HPC clusters.
More details are provided in the documentation.
Installation
SBU-Reporter can be installed as follows:
Using pip (directly from GitHub):
pip install git+https://github.com/BvB93/SBU-Reporter@v0.4.0 --upgrade
Examples
get_sbu user_file.yaml --start=16-02-2021 --end=25-03-2022
SBU-Reporter API
sbu.dataframe
A module which handles data parsing and DataFrame construction.
Index
get_sbu() | Acquire the SBU usage for each account in the pandas.DataFrame.index. |
parse_accuse() | Gather SBU usage of a specific user account. |
get_date_range() | Return a starting and ending date as two strings. |
construct_filename() | Construct a filename containing the current date. |
_get_datetimeindex() | Create a Pandas DatetimeIndex from a start and end date. |
_parse_date() | Parse any dates supplied to get_date_range(). |
Return the total number of requested SBUs. |
API
- sbu.dataframe.get_sbu(df, project, start=None, end=None)
Acquire the SBU usage for each account in the pandas.DataFrame.index.
The start and end of the reported interval can, optionally, be altered with start and end. Performs an in-place update of df, adding new columns to hold the SBU usage per month under the "Month" super-column. In addition, a "sum" row and a "sum" column are added, holding the SBU usage summed over all users and over the entire interval, respectively.
- Parameters
df (pandas.DataFrame) – A Pandas DataFrame with usernames and information, constructed by yaml_to_pandas(). pandas.DataFrame.columns and pandas.DataFrame.index should be instances of pandas.MultiIndex and pandas.Index, respectively. User accounts are expected to be stored in pandas.DataFrame.index. SBU usage (including the sum) is stored in the "Month" super-column.
start (int or str, optional) – The starting year of the interval. Defaults to the current year if None.
end (str or int, optional) – The final year of the interval. Defaults to the current year + 1 if None.
project (str, optional) – The project code of the project of interest. If not None, only SBUs expended under this project are considered.
- Return type
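The column layout described above can be illustrated with a small pandas sketch. This is not the get_sbu() implementation: the usernames and usage numbers are made up, nothing is queried from the cluster, and the frame is assembled with pandas.concat rather than updated in place.

```python
import pandas as pd

# Made-up usernames and monthly SBU usage; nothing here is queried from the cluster.
index = pd.Index(["user1", "user2"], name="username")
info = pd.DataFrame({"name": ["Donald Duck", "Scrooge McDuck"]}, index=index)
month = pd.DataFrame(
    {"2019-01": [1000.0, 1000.0], "2019-02": [1500.0, 500.0]},
    index=index,
)

# "sum" column: usage per user, summed over the entire interval.
month["sum"] = month.sum(axis=1)
# "sum" row: usage per month, summed over all users.
month.loc["sum"] = month.sum(axis=0)

# Place both blocks under their super-columns ("info" and "Month").
df = pd.concat({"info": info, "Month": month}, axis=1)
print(df)
```

The "sum" row has no entry under "info", so pandas fills those cells with NaN.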
- sbu.dataframe.parse_accuse(project, start=None, end=None)
Gather SBU usage of a specific user account.
The bash command accuse is used for gathering SBU usage along an interval defined by start and end. Results are collected and returned in a Pandas DataFrame.
- Parameters
- Returns
The SBU usage of user over a specified period.
- Return type
- sbu.dataframe.get_date_range(start=None, end=None)
Return a starting and ending date as two strings.
- Parameters
start (int or str, optional) – The start of the interval. Accepts dates formatted as YYYY, MM-YYYY or DD-MM-YYYY. Defaults to the current year if None.
end (str or int, optional) – The end of the interval. Accepts dates formatted as YYYY, MM-YYYY or DD-MM-YYYY. Defaults to the current year + 1 if None.
- Returns
A tuple with the start and end dates, formatted as strings. Dates are formatted as DD-MM-YYYY.
- Return type
- sbu.dataframe.construct_filename(prefix, suffix='.csv')
Construct a filename containing the current date.
Examples
>>> filename = construct_filename('my_file', '.txt')
>>> print(filename)
my_file_31_May_2019.txt
- Parameters
- Returns
A filename consisting of prefix, the current date and suffix.
- Return type
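A filename of the shape shown in the example above can be produced with the standard library alone. The sketch below is an assumption about how such a name could be built, not the library's code; the function name and the today parameter are hypothetical and exist only to make the result reproducible.

```python
from datetime import date


def construct_filename_sketch(prefix, suffix=".csv", today=None):
    """Join prefix, a DD_Mon_YYYY date stamp and suffix into one filename."""
    # `today` is a hypothetical parameter; it defaults to the current date.
    today = date.today() if today is None else today
    # %b yields the abbreviated month name of the active locale ("May" by default).
    return f"{prefix}_{today.strftime('%d_%b_%Y')}{suffix}"


print(construct_filename_sketch("my_file", ".txt", today=date(2019, 5, 31)))
```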
- sbu.dataframe._get_datetimeindex(start, end)
Create a Pandas DatetimeIndex from a start and end date.
- Parameters
- Returns
A DatetimeIndex starting from start and ending on end.
- Return type
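As a rough illustration, a DatetimeIndex can be built from two DD-MM-YYYY strings with pandas.date_range. The monthly frequency ("MS", month start) is an assumption made here because usage is reported per month; the frequency used by _get_datetimeindex() is not documented on this page, and the function name below is hypothetical.

```python
import pandas as pd


def get_datetimeindex_sketch(start, end):
    """Return a month-start DatetimeIndex spanning two DD-MM-YYYY dates."""
    first = pd.to_datetime(start, format="%d-%m-%Y")
    last = pd.to_datetime(end, format="%d-%m-%Y")
    # freq="MS" (one entry per month start) is an assumption of this sketch.
    return pd.date_range(first, last, freq="MS")


idx = get_datetimeindex_sketch("01-01-2019", "01-04-2019")
print(idx)
```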
- sbu.dataframe._parse_date(input_date, default_day='01', default_month='01', default_year=None)
Parse any dates supplied to get_date_range().
- Parameters
input_date (str, int or None) – The to-be parsed date. Allowed types and values are:
default_month (str) – The default month if a month is not provided in input_date. Expects a month in MM format.
default_year (str, optional) – The default year if a year is not provided in input_date. Expects a year in YYYY format. Defaults to the current year if None.
- Returns
A string, constructed from input_date, representing a date in DD-MM-YYYY format.
- Return type
- Raises
ValueError – Raised if input_date is provided as string and contains more than 2 dashes.
TypeError – Raised if input_date is neither None, a string nor an integer.
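The parsing rules described above (YYYY, MM-YYYY or DD-MM-YYYY in, DD-MM-YYYY out) can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: the function name is hypothetical, and the handling of None (all three defaults are used) is a guess.

```python
from datetime import date


def parse_date_sketch(input_date, default_day="01", default_month="01", default_year=None):
    """Normalise YYYY, MM-YYYY or DD-MM-YYYY input to a DD-MM-YYYY string."""
    if default_year is None:
        default_year = str(date.today().year)
    defaults = [default_day, default_month, default_year]

    if input_date is None:
        # Assumption: a missing date falls back on all three defaults.
        return "-".join(defaults)
    if isinstance(input_date, int):
        input_date = str(input_date)
    if not isinstance(input_date, str):
        raise TypeError("input_date should be None, a string or an integer")
    if input_date.count("-") > 2:
        raise ValueError("input_date contains more than 2 dashes")

    # Prepend defaults for the missing leading fields (day, then month).
    parts = input_date.split("-")
    return "-".join(defaults[:3 - len(parts)] + parts)


print(parse_date_sketch(2021))       # year only
print(parse_date_sketch("03-2021"))  # month and year
```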
sbu.dataframe_postprocess
A module for creating new dataframes from the SBU-containing dataframe.
Index
get_sbu_per_project() | Construct a new Pandas DataFrame with SBU usage per project. |
get_agregated_sbu() | Calculate the SBU accumulated over all months in the "Month" super-column. |
get_percentage_sbu() | Calculate the % accumulated SBU usage per project. |
Return a tuple with the names of all active users. |
API
- sbu.dataframe_postprocess.get_sbu_per_project(df)
Construct a new Pandas DataFrame with SBU usage per project.
- Parameters
df (pandas.DataFrame) – A Pandas DataFrame with SBU usage per username, constructed by get_sbu(). pandas.DataFrame.columns and pandas.DataFrame.index should be instances of pandas.MultiIndex and pandas.Index, respectively.
- Returns
A new Pandas DataFrame holding the SBU usage per project (i.e. df [project]).
- Return type
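One way to go from per-user to per-project usage is a pandas groupby on the project column. The sketch below assumes that the project code lives in an ("info", "project") column, as in the yaml_to_pandas() example output, and uses made-up numbers; it is not the get_sbu_per_project() implementation.

```python
import pandas as pd

# Made-up per-user data with the two super-columns used throughout these docs.
index = pd.Index(["user1", "user2", "user3"], name="username")
info = pd.DataFrame({"project": ["A", "A", "B"]}, index=index)
month = pd.DataFrame(
    {"2019-01": [1000.0, 1000.0, 1000.0], "2019-02": [1500.0, 500.0, 5000.0]},
    index=index,
)
df = pd.concat({"info": info, "Month": month}, axis=1)

# Group the monthly usage by project code and sum over the users of each project.
projects = df[("info", "project")].rename("project")
per_project = df["Month"].groupby(projects).sum()
print(per_project)
```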
- sbu.dataframe_postprocess.get_agregated_sbu(df)
Calculate the SBU accumulated over all months in the "Month" super-column.
Examples
Considering the following DataFrame as input:
>>> print(df['Month'])
                2019-01  2019-02  2019-03
username
Donald Duck      1000.0   1500.0    750.0
Scrooge McDuck   1000.0    500.0    250.0
Mickey Mouse     1000.0   5000.0   4000.0
The usage is accumulated from month to month along each row, in the following manner:
>>> df_new = get_agregated_sbu(df)
>>> print(df_new['Month'])
                2019-01  2019-02  2019-03
username
Donald Duck      1000.0   2500.0   3250.0
Scrooge McDuck   1000.0   1500.0   1750.0
Mickey Mouse     1000.0   6000.0  10000.0
- Parameters
df (pandas.DataFrame) – A Pandas DataFrame with SBU usage per project, constructed by get_sbu_per_project(). pandas.DataFrame.columns and pandas.DataFrame.index should be instances of pandas.MultiIndex and pandas.Index, respectively.
- Returns
A new Pandas DataFrame with SBU usage accumulated over all columns in the "Month" super-column.
- Return type
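The accumulation in the example above matches a cumulative sum across the month columns. A minimal pandas sketch that reproduces those numbers, assuming only the "Month" block is of interest (the actual implementation may differ):

```python
import pandas as pd

# The monthly usage from the example above.
month = pd.DataFrame(
    {
        "2019-01": [1000.0, 1000.0, 1000.0],
        "2019-02": [1500.0, 500.0, 5000.0],
        "2019-03": [750.0, 250.0, 4000.0],
    },
    index=pd.Index(["Donald Duck", "Scrooge McDuck", "Mickey Mouse"], name="username"),
)

# axis=1 accumulates from left to right, i.e. from month to month per user.
accumulated = month.cumsum(axis=1)
print(accumulated)
```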
- sbu.dataframe_postprocess.get_percentage_sbu(df)
Calculate the % accumulated SBU usage per project.
The column storing the requested amount of SBUs can be defined in the global variable _GLOBVAR["SBU_REQUESTED"] (default value: ("info", "SBU requested")).
Examples
Considering the following DataFrame with accumulated SBUs as input:
>>> print(df)
                         info    Month
                SBU requested  2019-01  2019-02  2019-03
username
Donald Duck            3250.0   1000.0   2500.0   3250.0
Scrooge McDuck         5000.0   1000.0   1500.0   1750.0
Mickey Mouse           5000.0   1000.0   6000.0  10000.0
Which will result in the following SBU usage, expressed as a fraction of the requested SBUs (1.00 corresponds to 100%):
>>> df_new = get_percentage_sbu(df)
>>> print(df_new['Month'])
                2019-01  2019-02  2019-03
username
Donald Duck        0.31     0.77     1.00
Scrooge McDuck     0.20     0.30     0.35
Mickey Mouse       0.20     1.20     2.00
- Parameters
df (pandas.DataFrame) – A Pandas DataFrame with the accumulated SBU usage per project, constructed by get_agregated_sbu(). pandas.DataFrame.columns and pandas.DataFrame.index should be instances of pandas.MultiIndex and pandas.Index, respectively.
- Returns
A new Pandas DataFrame with % SBU usage accumulated over all columns in the "Month" super-column.
- Return type
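The numbers in the example above are consistent with dividing each accumulated value by the requested SBUs of the same row. The sketch below reproduces them with pandas; rounding to two decimals is an assumption made to match the printed output, and the code is not taken from the library.

```python
import pandas as pd

index = pd.Index(["Donald Duck", "Scrooge McDuck", "Mickey Mouse"], name="username")
# Requested SBUs and accumulated usage from the example above.
requested = pd.Series([3250.0, 5000.0, 5000.0], index=index)
accumulated = pd.DataFrame(
    {
        "2019-01": [1000.0, 1000.0, 1000.0],
        "2019-02": [2500.0, 1500.0, 6000.0],
        "2019-03": [3250.0, 1750.0, 10000.0],
    },
    index=index,
)

# axis=0 aligns `requested` with the row labels, so every row is divided
# by the number of SBUs requested for that row.
fraction = accumulated.div(requested, axis=0).round(2)
print(fraction)
```

Values above 1.00, such as those for Mickey Mouse, indicate that more SBUs were used than requested.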
sbu.parse_yaml
A module for parsing and validating the .yaml input.
Index
yaml_to_pandas() | Create a Pandas DataFrame out of a .yaml file. |
validate_usernames() | Validate that all users belonging to an account are available in the .yaml input file. |
API
- sbu.parse_yaml.yaml_to_pandas(filename)
Create a Pandas DataFrame out of a .yaml file.
Examples
Example yaml input:
__project__: BlaBla
A:
    description: Example project
    PI: Walt Disney
    SBU requested: 1000
    users:
        user1: Donald Duck
        user2: Scrooge McDuck
        user3: Mickey Mouse
Example output:
>>> df, project = yaml_to_pandas(filename)
>>> print(df)
            info                  ...
         project            name  ... SBU requested           PI
username                          ...
user1          A     Donald Duck  ...        1000.0  Walt Disney
user2          A  Scrooge McDuck  ...        1000.0  Walt Disney
user3          A    Mickey Mouse  ...        1000.0  Walt Disney
>>> print(project)
BlaBla
- Parameters
filename (str) – The path+filename to the .yaml file.
- Returns
A Pandas DataFrame and project name constructed from filename. Columns and rows are instances of pandas.MultiIndex and pandas.Index, respectively. All retrieved .yaml data is stored under the "info" super-column. The project name will be None if the __project__ key is absent from the .yaml file.
- Return type
pandas.DataFrame & str, optional
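The mapping from the .yaml layout to one row per user can be sketched as below. To keep the sketch free of a YAML dependency it starts from an already-parsed dictionary (for instance the result of a YAML loader); the function name is hypothetical and the code is an illustration of the documented layout, not the yaml_to_pandas() implementation.

```python
import pandas as pd

# The example input above, as it would look after being parsed into a dict.
parsed = {
    "__project__": "BlaBla",
    "A": {
        "description": "Example project",
        "PI": "Walt Disney",
        "SBU requested": 1000,
        "users": {
            "user1": "Donald Duck",
            "user2": "Scrooge McDuck",
            "user3": "Mickey Mouse",
        },
    },
}


def dict_to_pandas_sketch(parsed):
    """Build one row per username, with all fields under the "info" super-column."""
    parsed = dict(parsed)  # shallow copy, so the caller's dict is left untouched
    project = parsed.pop("__project__", None)  # None if the key is absent

    rows = {}
    for key, fields in parsed.items():
        shared = {k: v for k, v in fields.items() if k != "users"}
        for username, name in fields["users"].items():
            rows[username] = {"project": key, "name": name, **shared}

    df = pd.DataFrame.from_dict(rows, orient="index")
    df.index.name = "username"
    df.columns = pd.MultiIndex.from_product([["info"], df.columns])
    return df, project


df, project = dict_to_pandas_sketch(parsed)
print(df)
print(project)
```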
- sbu.parse_yaml.validate_usernames(df)
Validate that all users belonging to an account are available in the .yaml input file.
Raises a KeyError if one or more usernames printed by the accinfo command are absent from df.
- Parameters
df (pandas.DataFrame) – A DataFrame, produced by yaml_to_pandas(), containing user accounts. pandas.DataFrame.columns and pandas.DataFrame.index should be instances of pandas.MultiIndex and pandas.Index, respectively. User accounts are expected to be stored in pandas.DataFrame.index.
- Raises
ValueError – Raised if one or more users reported by the accinfo command are absent from df or vice versa.
- Return type
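The check described above amounts to comparing two sets of usernames. The sketch below takes both sets as plain arguments instead of calling accinfo, since that command's output format is not described here; the function name is hypothetical, and raising ValueError follows the Raises entry above rather than the KeyError mentioned in the summary.

```python
def validate_usernames_sketch(df_users, reported_users):
    """Raise ValueError if the two collections of usernames differ."""
    df_users, reported_users = set(df_users), set(reported_users)
    missing = reported_users - df_users  # reported for the account, absent from df
    unknown = df_users - reported_users  # present in df, not reported
    if missing or unknown:
        raise ValueError(
            f"absent from df: {sorted(missing)}; not reported: {sorted(unknown)}"
        )


# Usernames would normally come from df.index and from the accinfo output.
validate_usernames_sketch(["user1", "user2"], ["user2", "user1"])
```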
sbu.plot
A module for handling data plotting.