# core


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

## column helpers

------------------------------------------------------------------------

<a
href="https://github.com/MIS-Analytics/mis_analytics/blob/main/mis_analytics/core.py#L15"
target="_blank" style="float:right; font-size:smaller">source</a>

### move_columns

``` python

def move_columns(
    df:DataFrame, # Input
    cols_to_move:str, # Single
    pos:int, # Target
)->DataFrame:

```

*Move one or more columns to a specified position in a DataFrame.*

``` python
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
df
```

<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }
&#10;    .dataframe tbody tr th {
        vertical-align: top;
    }
&#10;    .dataframe thead th {
        text-align: right;
    }
</style>

<table class="dataframe" data-quarto-postprocess="true" data-border="1">
<thead>
<tr style="text-align: right;">
<th data-quarto-table-cell-role="th"></th>
<th data-quarto-table-cell-role="th">A</th>
<th data-quarto-table-cell-role="th">B</th>
<th data-quarto-table-cell-role="th">C</th>
</tr>
</thead>
<tbody>
<tr>
<td data-quarto-table-cell-role="th">0</td>
<td>1</td>
<td>4</td>
<td>7</td>
</tr>
<tr>
<td data-quarto-table-cell-role="th">1</td>
<td>2</td>
<td>5</td>
<td>8</td>
</tr>
<tr>
<td data-quarto-table-cell-role="th">2</td>
<td>3</td>
<td>6</td>
<td>9</td>
</tr>
</tbody>
</table>

</div>

``` python
move_columns(df, 'C', 0)
```

<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }
&#10;    .dataframe tbody tr th {
        vertical-align: top;
    }
&#10;    .dataframe thead th {
        text-align: right;
    }
</style>

<table class="dataframe" data-quarto-postprocess="true" data-border="1">
<thead>
<tr style="text-align: right;">
<th data-quarto-table-cell-role="th"></th>
<th data-quarto-table-cell-role="th">C</th>
<th data-quarto-table-cell-role="th">A</th>
<th data-quarto-table-cell-role="th">B</th>
</tr>
</thead>
<tbody>
<tr>
<td data-quarto-table-cell-role="th">0</td>
<td>7</td>
<td>1</td>
<td>4</td>
</tr>
<tr>
<td data-quarto-table-cell-role="th">1</td>
<td>8</td>
<td>2</td>
<td>5</td>
</tr>
<tr>
<td data-quarto-table-cell-role="th">2</td>
<td>9</td>
<td>3</td>
<td>6</td>
</tr>
</tbody>
</table>

</div>
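The page does not show the function body, so the following is only a minimal sketch of how such a reordering could work. The name `move_columns_sketch` is hypothetical, and it assumes `pos` is counted among the columns that remain after the moved ones are taken out.

``` python
import pandas as pd

def move_columns_sketch(df, cols_to_move, pos):
    """Hypothetical re-implementation, not the library's actual code."""
    # Accept a single column name or a list of names
    cols = [cols_to_move] if isinstance(cols_to_move, str) else list(cols_to_move)
    # Columns that keep their relative order
    remaining = [c for c in df.columns if c not in cols]
    # Splice the moved columns in at the requested position
    new_order = remaining[:pos] + cols + remaining[pos:]
    return df[new_order]

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
moved = move_columns_sketch(df, ['C', 'B'], 0)
print(list(moved.columns))
```

Selecting with `df[new_order]` returns a new DataFrame and leaves the input untouched.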

``` python
sample_string = 'KnotenNr. der Stücklistenposition'
```

``` python
sample_string = sample_string.lower()
sample_string
```

    'knotennr. der stücklistenposition'

``` python
[c for c in sample_string][:10]
```

    ['k', 'n', 'o', 't', 'e', 'n', 'n', 'r', '.', ' ']

``` python
[c.isalnum() for c in sample_string][:10]
```

    [True, True, True, True, True, True, True, True, False, False]

``` python
[c if c.isalnum() else '_' for c in sample_string][:10]
```

    ['k', 'n', 'o', 't', 'e', 'n', 'n', 'r', '_', '_']

``` python
sample_string = "".join([c if c.isalnum() else '_' for c in sample_string])
sample_string
```

    'knotennr__der_stücklistenposition'

``` python
sample_string.split('_')
```

    ['knotennr', '', 'der', 'stücklistenposition']

``` python
[o for o in filter(None, sample_string.split('_'))]
```

    ['knotennr', 'der', 'stücklistenposition']

``` python
'_'.join(filter(None,sample_string.split('_')))
```

    'knotennr_der_stücklistenposition'

------------------------------------------------------------------------

<a
href="https://github.com/MIS-Analytics/mis_analytics/blob/main/mis_analytics/core.py#L37"
target="_blank" style="float:right; font-size:smaller">source</a>

### clean_string

``` python

def clean_string(
    input_string:str
):

```

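*Cleans `input_string`: lowercases it, replaces every non-alphanumeric
character with an underscore, and collapses repeated underscores, as in
the step-by-step walkthrough above.*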

``` python
clean_string(sample_string)
```

    'knotennr_der_stücklistenposition'
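The exploratory steps shown before the signature can be folded into one helper. This is a sketch under the assumption that the real function follows those same steps; `clean_string_sketch` is a hypothetical name.

``` python
def clean_string_sketch(input_string: str) -> str:
    """Hypothetical helper combining the walkthrough steps."""
    # Step 1: lowercase
    lowered = input_string.lower()
    # Step 2: replace every non-alphanumeric character with '_'
    replaced = "".join(c if c.isalnum() else "_" for c in lowered)
    # Step 3: drop empty fragments so runs of '_' collapse to one
    return "_".join(filter(None, replaced.split("_")))

print(clean_string_sketch('KnotenNr. der Stücklistenposition'))
```

Because `str.isalnum` is Unicode-aware, characters such as `ü` are kept rather than replaced.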

------------------------------------------------------------------------

<a
href="https://github.com/MIS-Analytics/mis_analytics/blob/main/mis_analytics/core.py#L43"
target="_blank" style="float:right; font-size:smaller">source</a>

### clean_col_names

``` python

def clean_col_names(
    df:DataFrame
)->DataFrame:

```

*Returns df with clean column names by using
[`clean_string`](https://MIS-Analytics.github.io/mis_analytics/core.html#clean_string)
on each column name.*

``` python
df.head(2)
```

<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }
&#10;    .dataframe tbody tr th {
        vertical-align: top;
    }
&#10;    .dataframe thead th {
        text-align: right;
    }
</style>

<table class="dataframe" data-quarto-postprocess="true" data-border="1">
<thead>
<tr style="text-align: right;">
<th data-quarto-table-cell-role="th"></th>
<th data-quarto-table-cell-role="th">Cust_ID.</th>
<th data-quarto-table-cell-role="th">Order--Date</th>
<th data-quarto-table-cell-role="th">Prdct.Name!</th>
<th data-quarto-table-cell-role="th">QTY___Ordered</th>
<th data-quarto-table-cell-role="th">Unit$Price</th>
</tr>
</thead>
<tbody>
<tr>
<td data-quarto-table-cell-role="th">0</td>
<td>101</td>
<td>2024-02-01</td>
<td>Widget A</td>
<td>10</td>
<td>99.99</td>
</tr>
<tr>
<td data-quarto-table-cell-role="th">1</td>
<td>102</td>
<td>2024-02-02</td>
<td>Widget B</td>
<td>20</td>
<td>149.99</td>
</tr>
</tbody>
</table>

</div>

``` python
df = clean_col_names(df)
df.head(2)
```

<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }
&#10;    .dataframe tbody tr th {
        vertical-align: top;
    }
&#10;    .dataframe thead th {
        text-align: right;
    }
</style>

<table class="dataframe" data-quarto-postprocess="true" data-border="1">
<thead>
<tr style="text-align: right;">
<th data-quarto-table-cell-role="th"></th>
<th data-quarto-table-cell-role="th">cust_id</th>
<th data-quarto-table-cell-role="th">order_date</th>
<th data-quarto-table-cell-role="th">prdct_name</th>
<th data-quarto-table-cell-role="th">qty_ordered</th>
<th data-quarto-table-cell-role="th">unit_price</th>
</tr>
</thead>
<tbody>
<tr>
<td data-quarto-table-cell-role="th">0</td>
<td>101</td>
<td>2024-02-01</td>
<td>Widget A</td>
<td>10</td>
<td>99.99</td>
</tr>
<tr>
<td data-quarto-table-cell-role="th">1</td>
<td>102</td>
<td>2024-02-02</td>
<td>Widget B</td>
<td>20</td>
<td>149.99</td>
</tr>
</tbody>
</table>

</div>
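The cell that builds this sample DataFrame is not shown. The sketch below recreates the two displayed rows and applies a cleaning function to every column name through `DataFrame.rename`. Both `_clean` and `clean_col_names_sketch` are hypothetical stand-ins, not the library's code.

``` python
import pandas as pd

def _clean(name) -> str:
    # Hypothetical stand-in for clean_string, following the walkthrough steps
    replaced = "".join(c if c.isalnum() else "_" for c in str(name).lower())
    return "_".join(filter(None, replaced.split("_")))

def clean_col_names_sketch(df: pd.DataFrame) -> pd.DataFrame:
    # rename accepts a callable and applies it to each column label
    return df.rename(columns=_clean)

raw = pd.DataFrame({
    'Cust_ID.': [101, 102],
    'Order--Date': ['2024-02-01', '2024-02-02'],
    'Prdct.Name!': ['Widget A', 'Widget B'],
    'QTY___Ordered': [10, 20],
    'Unit$Price': [99.99, 149.99],
})
cleaned = clean_col_names_sketch(raw)
print(list(cleaned.columns))
```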

------------------------------------------------------------------------

<a
href="https://github.com/MIS-Analytics/mis_analytics/blob/main/mis_analytics/core.py#L49"
target="_blank" style="float:right; font-size:smaller">source</a>

### show_identical_columns

``` python

def show_identical_columns(
    df:DataFrame, # The DataFrame to analyze
    columns:list, # The list of column names to compare
)->DataFrame: # A DataFrame matrix showing identity status between columns

```

*Checks if specified columns in `df` are identical.*
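No example is given for this function. As a rough illustration, a pairwise identity matrix could be built with `Series.equals`. The name `show_identical_columns_sketch` and the exact output layout are assumptions; the real function may format its result differently.

``` python
import pandas as pd

def show_identical_columns_sketch(df: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Hypothetical sketch: True where two columns hold identical values."""
    data = {
        right: [df[left].equals(df[right]) for left in columns]
        for right in columns
    }
    # Rows and columns of the matrix are both labelled by column name
    return pd.DataFrame(data, index=columns)

df = pd.DataFrame({'a': [1, 2, 3], 'b': [1, 2, 3], 'c': [3, 2, 1]})
matrix = show_identical_columns_sketch(df, ['a', 'b', 'c'])
print(matrix)
```

`Series.equals` also compares dtypes, so an integer column and a float column with the same numbers count as different in this sketch.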

## etl functions

------------------------------------------------------------------------

<a
href="https://github.com/MIS-Analytics/mis_analytics/blob/main/mis_analytics/core.py#L64"
target="_blank" style="float:right; font-size:smaller">source</a>

### reduce_mem_usage

``` python

def reduce_mem_usage(
    df:pandas.DataFrame | pandas.Series, # Input DataFrame or Series
    verbose:bool=True, # Whether to print memory usage reduction
)->pandas.DataFrame | pandas.Series: # Reduced DataFrame or Series

```

*Reduces memory usage of a DataFrame or Series by downcasting numerical
types.*
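One common way to downcast numeric types is `pd.to_numeric` with its `downcast` argument. The sketch below assumes that approach and handles only DataFrames; the library's implementation may differ, and `reduce_mem_usage_sketch` is a hypothetical name.

``` python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype

def reduce_mem_usage_sketch(df: pd.DataFrame, verbose: bool = True) -> pd.DataFrame:
    """Hypothetical sketch: downcast integer and float columns."""
    before = df.memory_usage(deep=True).sum()
    out = df.copy()
    for col in out.columns:
        if is_integer_dtype(out[col]):
            # Smallest integer type that can hold the values
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif is_float_dtype(out[col]):
            # float32 is the smallest float type pandas downcasts to
            out[col] = pd.to_numeric(out[col], downcast="float")
    after = out.memory_usage(deep=True).sum()
    if verbose:
        print(f"Memory usage: {before} -> {after} bytes")
    return out

df = pd.DataFrame({'qty': [1, 2, 3], 'price': [1.5, 2.5, 3.5], 'name': ['a', 'b', 'c']})
reduced = reduce_mem_usage_sketch(df)
print(reduced.dtypes)
```

Downcasting floats to `float32` can lose precision, so it is worth checking value ranges before applying this to monetary or scientific data.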

------------------------------------------------------------------------

<a
href="https://github.com/MIS-Analytics/mis_analytics/blob/main/mis_analytics/core.py#L109"
target="_blank" style="float:right; font-size:smaller">source</a>

### group_resample

``` python

def group_resample(
    df:DataFrame, # Input DataFrame
    id_col:str, # Column name to group/unstack by
    value_col:str, # Column name containing values to aggregate
    date_col:str, # Column name containing dates
    freq:str='W', # Resampling frequency (e.g., 'W', 'ME', 'D')
    aggfunc:str='sum', # Aggregation function (e.g., 'sum', 'mean')
)->DataFrame: # Resampled and stacked DataFrame

```

*Group, resample, and restack a DataFrame by ID and date.*
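To show the intended shape of the result, here is a sketch that produces a long-format table with one row per ID and period. Note that it swaps in `groupby` with `pd.Grouper` instead of an unstack, resample, stack sequence, so periods with no data for an ID are not filled in. `group_resample_sketch` is a hypothetical name.

``` python
import pandas as pd

def group_resample_sketch(df, id_col, value_col, date_col, freq='W', aggfunc='sum'):
    """Hypothetical sketch using groupby + Grouper, not the library's code."""
    grouped = df.groupby([id_col, pd.Grouper(key=date_col, freq=freq)])[value_col]
    # One row per (id, period) with the aggregated value
    return grouped.agg(aggfunc).reset_index()

sales = pd.DataFrame({
    'store': ['a', 'a', 'a', 'b'],
    'date': pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-08', '2024-01-03']),
    'units': [1, 2, 3, 5],
})
weekly = group_resample_sketch(sales, 'store', 'units', 'date', freq='W')
print(weekly)
```

With `freq='W'`, pandas labels each bin by the Sunday that ends the week, so the first two rows for store `a` fall into the week ending 2024-01-07.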

------------------------------------------------------------------------

<a
href="https://github.com/MIS-Analytics/mis_analytics/blob/main/mis_analytics/core.py#L129"
target="_blank" style="float:right; font-size:smaller">source</a>

### polars_resample

``` python

def polars_resample(
    df, date_col:str='ds', group_cols:str='unique_id', agg_col:str='y', frequency:str='1mo'
):

```

*Resample `df` to the given `frequency` along `date_col`, grouped by
`group_cols` and aggregating `agg_col`.*

## validate functions

Tests that check whether a function uses out-of-scope (global) variables.

``` python
a = 10
```

``` python
def my_sum(b):
    return a + b

my_sum(32)
```

    42

``` python
import inspect
```

``` python
# inspect.getclosurevars(eval("my_sum"))
inspect.getclosurevars(my_sum)
```

    ClosureVars(nonlocals={}, globals={'a': 10}, builtins={}, unbound=set())

``` python
def my_callback(result):
    print("Cell just finished running!")

get_ipython().events.register('post_run_cell', my_callback)
```

    Cell just finished running!

``` python
get_ipython().events.unregister('post_run_cell', my_callback)
```

``` python
import ast
```

``` python
def check_funcs(result):
    tree = ast.parse(result.info.raw_cell)
    func_names = [node.name for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)]
    if func_names:
        print(f"Functions defined: {func_names}")
```

``` python
get_ipython().events.register('post_run_cell', check_funcs)
```

``` python
get_ipython().events.unregister('post_run_cell', check_funcs)
```
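The AST step inside `check_funcs` can be tried without an IPython session by passing source text directly. In this sketch, `defined_function_names` and `cell_source` are hypothetical names, and `async def` functions are included as an extra assumption.

``` python
import ast

def defined_function_names(source: str) -> list:
    """Return the names of all functions defined anywhere in `source`."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]

# Stand-in for result.info.raw_cell
cell_source = "a = 10\n\ndef my_sum(b):\n    return a + b\n"
print(defined_function_names(cell_source))
```

Because `ast.walk` visits every node, nested functions and methods are reported too.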

``` python
import ast
import inspect
```

``` python
def check_global_deps(result):
    tree = ast.parse(result.info.raw_cell)
    func_names = [node.name for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)]
    
    ns = get_ipython().user_ns
    for name in func_names:
        if name in ns:
            func = ns[name]
            cv = inspect.getclosurevars(func)
            if cv.globals:
                print(f"⚠️  '{name}' depends on global variables: {list(cv.globals.keys())}")
```

``` python
get_ipython().events.register('post_run_cell', check_global_deps)
```

``` python
def my_sum(b):
    return a + b

my_sum(32)
```

    42

    ⚠️  'my_sum' depends on global variables: ['a']

``` python
get_ipython().events.unregister('post_run_cell', check_global_deps)
```
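The same check can be written as a plain function that runs outside a notebook hook, which makes it easier to test. This sketch assumes that imported modules and callables should not count as problematic globals; that filter is an addition, and `global_dependencies`, `threshold`, `uses_global` and `pure` are hypothetical names.

``` python
import inspect
import types

def global_dependencies(func) -> list:
    """Return sorted names of global variables that `func` reads."""
    cv = inspect.getclosurevars(func)
    return sorted(
        name
        for name, value in cv.globals.items()
        # Assumption: modules and other functions are acceptable globals
        if not isinstance(value, types.ModuleType) and not callable(value)
    )

threshold = 10  # module-level variable

def uses_global(x):
    return x + threshold

def pure(x, offset):
    return x + offset

print(global_dependencies(uses_global))
print(global_dependencies(pure))
```

`inspect.getclosurevars` only reports names that resolve in the function's global namespace at call time; names that do not resolve appear in its `unbound` set instead.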
