Pandas
Responsible for standardizing the use of DataFrame and Series across the project.
- class PandasDataframe(path, df, **kwargs)
Bases: object
Class responsible for the standardization and manipulation of Pandas DataFrames.
- path
Absolute path to the CSV file.
- Type:
str
- df
Data stored in the dataframe.
- Type:
pandas.DataFrame
- list
List of dataframes used for concatenation.
- Type:
list, optional
- dict
Dictionary used to create a dataframe.
- Type:
dict, optional
- csv_to_df()
Reads a CSV file from the specified path and loads it into the dataframe.
- df_to_csv(path)
Exports the current dataframe to a CSV file.
- Parameters:
path (str) – Destination path for the CSV file.
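The CSV round trip described by csv_to_df and df_to_csv can be sketched in plain pandas. This is only an illustration under stated assumptions: the method bodies are not shown in this reference, the temporary file path and column names are made up, and writing without the index is a choice made for the sketch.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({"date": ["2024-01-02", "2024-01-03"], "value": [10.5, 11.0]})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "example.csv")  # assumed destination path
    # Export the dataframe; index=False is an assumption of this sketch.
    df.to_csv(csv_path, index=False)
    # Load the file back into a new dataframe.
    loaded = pd.read_csv(csv_path)
```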
- dict_to_df()
Converts the stored dictionary into a dataframe.
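A minimal sketch of the dictionary conversion, assuming the stored dictionary maps column names to lists of values (the keys and values below are invented for illustration):

```python
import pandas as pd

# Hypothetical dictionary: keys become column names, values become column data.
data = {"fund": ["A", "B"], "quota": [1.25, 2.5]}
df = pd.DataFrame(data)
```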
- drop_column(column, direction)
Drops rows or columns from the dataframe.
- Parameters:
column (str or list) – Column name(s) or row label(s) to drop.
direction (int) – Axis to drop from (0 for rows, 1 for columns).
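The two values of direction correspond to the axis argument of pandas' own DataFrame.drop. The sketch below assumes the method forwards to that call, which this reference does not confirm; the sample data is invented.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

# axis=1 drops columns by name.
without_b = df.drop(["b"], axis=1)
# axis=0 drops rows by index label.
without_first_row = df.drop([0], axis=0)
```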
- find_row_data(row)
Retrieves a row from the dataframe by index.
- Parameters:
row (int) – Row index.
- Returns:
Row data.
- Return type:
pandas.Series
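One plausible reading of the row lookup is positional access with iloc. Whether the method uses positions (iloc) or index labels (loc) is not stated here, so the choice below is an assumption, as is the sample data.

```python
import pandas as pd

df = pd.DataFrame({"fund": ["A", "B", "C"], "quota": [1.0, 2.0, 3.0]})

# Positional lookup: returns the second row as a Series indexed by column name.
row = df.iloc[1]
```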
- find_row_date_greater_or_equals_than_indicated(date_str) → tuple[bool, int]
Finds the first row where the date is greater than or equal to the given date.
- Parameters:
date_str (str or datetime) – Reference date.
- Returns:
(True, index) if found, otherwise (False, 0).
- Return type:
tuple
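The search and its (False, 0) fallback could be sketched as follows. The helper name first_row_on_or_after and its column argument are hypothetical: the documented method takes only date_str, and which column it inspects is not stated. The sketch returns the position of the first matching row in the dataframe's existing order.

```python
import pandas as pd


def first_row_on_or_after(df, column, date_str):
    """Return (True, position) of the first row with date >= date_str, else (False, 0)."""
    dates = pd.to_datetime(df[column])
    mask = (dates >= pd.to_datetime(date_str)).to_numpy()
    if not mask.any():
        return False, 0
    # argmax on a boolean array gives the position of the first True value.
    return True, int(mask.argmax())


df = pd.DataFrame({"date": ["2024-01-01", "2024-01-05", "2024-01-10"]})
found = first_row_on_or_after(df, "date", "2024-01-03")
missing = first_row_on_or_after(df, "date", "2024-02-01")
```

Because a miss also returns index 0, callers should check the boolean before using the index.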
- get_column_in_list(column)
Returns a dataframe column as a Python list.
- Parameters:
column (str) – Column name.
- Returns:
Column values as a list.
- Return type:
list
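In plain pandas, a column can be turned into a Python list with Series.tolist. A short sketch with invented data, assuming the method does something equivalent:

```python
import pandas as pd

df = pd.DataFrame({"fund": ["A", "B", "C"], "quota": [1.0, 2.0, 3.0]})

# Select one column and convert its values to a plain Python list.
quotas = df["quota"].tolist()
```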
- group_element(group_element)
Groups the dataframe by the specified column(s).
- Parameters:
group_element (str or list) – Column(s) to group by.
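Grouping in pandas yields a GroupBy object that is usually followed by an aggregation. What the method stores or returns is not documented here, so the sum below is only an illustrative follow-up step on invented data.

```python
import pandas as pd

df = pd.DataFrame({"fund": ["A", "B", "A"], "value": [1, 2, 2]})

# Group rows that share the same value in the "fund" column.
grouped = df.groupby("fund")
# Example aggregation (an assumption of this sketch): total value per fund.
totals = grouped["value"].sum()
```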
- list_to_df()
Concatenates a list of dataframes into a single dataframe.
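Concatenation of a list of dataframes maps naturally onto pd.concat. In this sketch, ignore_index=True is an assumption chosen so the result gets a fresh 0..n-1 index; the actual method may keep the original indices.

```python
import pandas as pd

# Hypothetical list of dataframes with the same column layout.
frames = [pd.DataFrame({"x": [1, 2]}), pd.DataFrame({"x": [3]})]

# Stack the frames vertically into one dataframe.
combined = pd.concat(frames, ignore_index=True)
```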
- order_columns(order_list)
Reorders the dataframe columns.
- Parameters:
order_list (list) – Desired column order.
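Reordering columns in pandas is commonly done by indexing with a list of column names. A sketch with invented column names, assuming order_list contains every column that should be kept:

```python
import pandas as pd

df = pd.DataFrame({"b": [1], "a": [2], "c": [3]})
order_list = ["a", "b", "c"]

# Indexing with a list returns the columns in the listed order.
reordered = df[order_list]
```

Columns left out of the list would be dropped by this approach, and a name that does not exist raises a KeyError.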
- query_date(start_date, end_date, column_name)
Filters rows between two dates based on a specified date column.
- Parameters:
start_date (str or datetime) – Start date for filtering.
end_date (str or datetime) – End date for filtering.
column_name (str) – Name of the date column.
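A date-range filter can be written as a boolean mask. The sketch below assumes both bounds are inclusive, which this reference does not specify, and uses invented data with a column named "date".

```python
import pandas as pd

df = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-01-01", "2024-01-15", "2024-02-01"]),
        "value": [1, 2, 3],
    }
)
start_date = pd.to_datetime("2024-01-10")
end_date = pd.to_datetime("2024-01-31")

# Keep rows whose date lies in [start_date, end_date] (inclusive bounds assumed).
mask = (df["date"] >= start_date) & (df["date"] <= end_date)
filtered = df[mask]
```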
- query_date_and_element(start_date, end_date, date_column_name, investment_cnpj, investment_column_cnpj)
Filters rows based on both a date range and a specific element.
- Parameters:
start_date (str or datetime) – Start date for filtering.
end_date (str or datetime) – End date for filtering.
date_column_name (str) – Name of the date column.
investment_cnpj (str) – Value to filter in the investment column.
investment_column_cnpj (str) – Column name containing the investment identifier.
- Returns:
Filtered dataframe.
- Return type:
pandas.DataFrame
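The combined filter can be sketched as two boolean masks joined with a logical AND. The CNPJ strings and column names below are made up, and inclusive date bounds are an assumption.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-01-05", "2024-01-20", "2024-01-20"]),
        "cnpj": ["00.000.000/0001-00", "00.000.000/0001-00", "11.111.111/0001-11"],
        "value": [1, 2, 3],
    }
)

# Mask 1: rows inside the date range (inclusive bounds assumed).
in_range = (df["date"] >= pd.to_datetime("2024-01-10")) & (
    df["date"] <= pd.to_datetime("2024-01-31")
)
# Mask 2: rows whose identifier column equals the requested value.
same_cnpj = df["cnpj"] == "00.000.000/0001-00"

filtered = df[in_range & same_cnpj]
```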
- query_element_in(column, collection)
Filters rows where column values are within a given collection.
- Parameters:
column (str) – Column name to filter.
collection (list or set) – Collection of values to match.
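Membership filtering corresponds to Series.isin in pandas. A sketch on invented data, assuming the method keeps the rows that match:

```python
import pandas as pd

df = pd.DataFrame({"fund": ["A", "B", "C"], "value": [1, 2, 3]})
collection = {"A", "C"}

# Keep only rows whose "fund" value appears in the collection.
filtered = df[df["fund"].isin(collection)]
```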
- reset_index()
Resets the dataframe index.
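Resetting the index is typically useful after filtering, when the surviving rows keep their old labels. In this sketch, drop=True (discarding the old index rather than keeping it as a column) is an assumption; the data is invented.

```python
import pandas as pd

df = pd.DataFrame({"value": [1, 2, 3]})

# Filtering leaves gaps in the index: the remaining labels are 1 and 2.
filtered = df[df["value"] > 1]
# Renumber the rows from 0 and discard the old labels.
renumbered = filtered.reset_index(drop=True)
```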
- sort_elements_list(sort_list)
Sorts the dataframe by the specified columns.
- Parameters:
sort_list (list) – List of column names to sort by.
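Sorting by several columns can be sketched with DataFrame.sort_values, where earlier names in the list take priority. Ascending order is assumed because no direction parameter is documented; the data is invented.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "fund": ["B", "A", "A"],
        "date": ["2024-01-02", "2024-01-03", "2024-01-01"],
    }
)
sort_list = ["fund", "date"]

# Sort by "fund" first, then by "date" within each fund (ascending).
sorted_df = df.sort_values(by=sort_list)
```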