Data structures

class datatoolbox.Datatable(*args, **kwargs)[source]

Datatable is derived from the pandas DataFrame. Datatables carry an additional meta attribute and provide automated unit conversions.
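As an illustration of how a DataFrame subclass can carry extra metadata, the following minimal sketch uses the subclassing hooks documented by pandas (`_metadata` and `_constructor`). The class name `MetaFrame` and the meta keys are hypothetical, and this is not necessarily how Datatable itself is implemented.

```python
import pandas as pd


class MetaFrame(pd.DataFrame):
    """Hypothetical stand-in for a table that carries a meta dictionary."""

    # Attribute names listed in _metadata are passed on by pandas
    # to objects derived from this one (copies, selections, ...).
    _metadata = ["meta"]

    @property
    def _constructor(self):
        # Make pandas operations return MetaFrame instead of DataFrame.
        return MetaFrame


table = MetaFrame({2010: [1.0, 2.0], 2011: [1.5, 2.5]}, index=["DEU", "FRA"])
table.meta = {"entity": "Emissions|CO2", "unit": "Mt CO2"}  # assumed keys

subset = table[[2010]]       # column selection keeps class and meta
print(type(subset).__name__)
print(subset.meta["unit"])
```

Note that pandas passes the attribute on by reference, so derived objects share the same dictionary unless it is copied explicitly.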

add(other, **kwargs)[source]

Get Addition of dataframe and other, element-wise (binary operator add).

Equivalent to dataframe + other, but with support to substitute a fill_value for missing data in one of the inputs. With reverse version, radd.

Among flexible wrappers (add, sub, mul, div, floordiv, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.

Parameters

other : scalar, sequence, Series, dict or DataFrame

Any single or multiple element data structure, or list-like object.

axis : {0 or ‘index’, 1 or ‘columns’}

Whether to compare by the index (0 or ‘index’) or columns (1 or ‘columns’). For Series input, axis to match Series index on.

level : int or label

Broadcast across a level, matching Index values on the passed MultiIndex level.

fill_value : float or None, default None

Fill existing missing (NaN) values, and any new element needed for successful DataFrame alignment, with this value before computation. If data in both corresponding DataFrame locations is missing the result will be missing.

Returns

DataFrame

Result of the arithmetic operation.

See Also

DataFrame.add : Add DataFrames.
DataFrame.sub : Subtract DataFrames.
DataFrame.mul : Multiply DataFrames.
DataFrame.div : Divide DataFrames (float division).
DataFrame.truediv : Divide DataFrames (float division).
DataFrame.floordiv : Divide DataFrames (integer division).
DataFrame.mod : Calculate modulo (remainder after division).
DataFrame.pow : Calculate exponential power.

Notes

Mismatched indices will be unioned together.

Examples

>>> df = pd.DataFrame({'angles': [0, 3, 4],
...                    'degrees': [360, 180, 360]},
...                   index=['circle', 'triangle', 'rectangle'])
>>> df
           angles  degrees
circle          0      360
triangle        3      180
rectangle       4      360

Add a scalar with the operator version, which returns the same result.

>>> df + 1
           angles  degrees
circle          1      361
triangle        4      181
rectangle       5      361
>>> df.add(1)
           angles  degrees
circle          1      361
triangle        4      181
rectangle       5      361

Divide by constant with reverse version.

>>> df.div(10)
           angles  degrees
circle        0.0     36.0
triangle      0.3     18.0
rectangle     0.4     36.0
>>> df.rdiv(10)
             angles   degrees
circle          inf  0.027778
triangle   3.333333  0.055556
rectangle  2.500000  0.027778

Subtract a list and Series by axis with operator version.

>>> df - [1, 2]
           angles  degrees
circle         -1      358
triangle        2      178
rectangle       3      358
>>> df.sub([1, 2], axis='columns')
           angles  degrees
circle         -1      358
triangle        2      178
rectangle       3      358
>>> df.sub(pd.Series([1, 1, 1], index=['circle', 'triangle', 'rectangle']),
...        axis='index')
           angles  degrees
circle         -1      359
triangle        2      179
rectangle       3      359

Multiply a dictionary by axis.

>>> df.mul({'angles': 0, 'degrees': 2})
            angles  degrees
circle           0      720
triangle         0      360
rectangle        0      720
>>> df.mul({'circle': 0, 'triangle': 2, 'rectangle': 3}, axis='index')
            angles  degrees
circle           0        0
triangle         6      360
rectangle       12     1080

Multiply a DataFrame of different shape with operator version.

>>> other = pd.DataFrame({'angles': [0, 3, 4]},
...                      index=['circle', 'triangle', 'rectangle'])
>>> other
           angles
circle          0
triangle        3
rectangle       4
>>> df * other
           angles  degrees
circle          0      NaN
triangle        9      NaN
rectangle      16      NaN
>>> df.mul(other, fill_value=0)
           angles  degrees
circle          0      0.0
triangle        9      0.0
rectangle      16      0.0

Divide by a MultiIndex by level.

>>> df_multindex = pd.DataFrame({'angles': [0, 3, 4, 4, 5, 6],
...                              'degrees': [360, 180, 360, 360, 540, 720]},
...                             index=[['A', 'A', 'A', 'B', 'B', 'B'],
...                                    ['circle', 'triangle', 'rectangle',
...                                     'square', 'pentagon', 'hexagon']])
>>> df_multindex
             angles  degrees
A circle          0      360
  triangle        3      180
  rectangle       4      360
B square          4      360
  pentagon        5      540
  hexagon         6      720
>>> df.div(df_multindex, level=1, fill_value=0)
             angles  degrees
A circle        NaN      1.0
  triangle      1.0      1.0
  rectangle     1.0      1.0
B square        0.0      0.0
  pentagon      0.0      0.0
  hexagon       0.0      0.0

aggregate_region(mapping, skipna=False)[source]

This function adds the aggregates to the table according to the provided mapping (see datatools.mapp.regions).

Returns the result; the table itself is not modified in place.
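The idea of region aggregation can be sketched with plain pandas. The mapping format (aggregate name to list of member regions), the region codes and the helper name `aggregate_regions` are assumptions made for illustration; the actual mapping objects used by datatoolbox may look different.

```python
import pandas as pd

# Hypothetical mapping: aggregate name -> member region IDs.
mapping = {"EU2": ["DEU", "FRA"]}

table = pd.DataFrame(
    {2010: [1.0, 2.0, 5.0], 2011: [1.5, float("nan"), 6.0]},
    index=["DEU", "FRA", "USA"],
)


def aggregate_regions(df, mapping, skipna=False):
    """Return a copy of df with one extra row per aggregate in mapping."""
    result = df.copy()
    for aggregate, members in mapping.items():
        # Assumption: members missing from the table are simply skipped.
        present = [m for m in members if m in df.index]
        if not present:
            continue
        # With skipna=False a missing member value makes the sum NaN.
        result.loc[aggregate] = df.loc[present].sum(skipna=skipna)
    return result


aggregated = aggregate_regions(table, mapping)
print(aggregated)
```

The input table is left unchanged, which mirrors the statement that the result is returned rather than added in place.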

append(other, **kwargs)[source]

Append data to the datatable.

Parameters

other : datatable

New data that will be added to the datatable.

kwargs : keyword arguments

Default pandas append arguments.

Returns

datatable

clean()[source]

Clean up the dataframe so that it contains only recognized regions, years and numeric values. Columns and rows that contain only NaN values are removed.

Returns

datatable

Cleaned datatable.
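The cleaning steps described for clean() can be approximated with plain pandas as follows. This is only a sketch: the helper `clean_table`, the explicit `valid_regions` argument and the rule for recognising year columns are assumptions, not the library's implementation.

```python
import pandas as pd

raw = pd.DataFrame(
    {2010: ["1.0", "n/a", "3"], 2011: [None, None, None], "note": ["a", "b", "c"]},
    index=["DEU", "FRA", "XXX"],
)


def clean_table(df, valid_regions):
    # Keep only rows whose index label is a recognised region.
    out = df.loc[df.index.isin(valid_regions)]
    # Keep only columns whose label can be read as a year (assumed rule).
    year_columns = [c for c in out.columns if str(c).isdigit()]
    out = out[year_columns]
    # Coerce entries to numbers; anything non-numeric becomes NaN.
    out = out.apply(pd.to_numeric, errors="coerce")
    # Drop rows, then columns, that contain only NaN.
    return out.dropna(how="all", axis=0).dropna(how="all", axis=1)


cleaned = clean_table(raw, valid_regions={"DEU", "FRA"})
print(cleaned)
```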

convert(newUnit, context=None, suffix_dict={}, **new_meta)[source]

Convert the datatable to a different unit and return the converted datatable.

Parameters

newUnit : str

Unit string to which the datatable should be converted.

context : str, optional

Optional context (e.g. GWPAR4). The default is None.

Returns

datatable

Datatable converted to the new unit.

copy(deep=True)[source]

Make a copy of this Datatable object.

Returns

copy : Datatable

diff(periods=1, axis=0)[source]

Compute the difference between different years in the datatable. Equivalent to pandas diff, but returns a datatable.

Parameters

periods : int, optional

Number of periods to shift when computing the difference, as in pandas diff. The default is 1.

axis : int, optional

Axis along which the difference is taken, as in pandas diff. The default is 0.

Returns

out : datatable

filter(spaceIDs)[source]

Filter dataframe based on a list of spatial IDs.

Parameters

spaceIDs : iterable of str

Spatial IDs to keep.

Returns

datatable

Filtered datatable.

classmethod from_excel(filepath, sheetName=None)[source]

Create a datatable from a suitable Excel file that was saved by datatoolbox.

Parameters

cls : class

filepath : str

Path to the file.

sheetName : str, optional

Sheet name that is read in. The default is None.

Returns

datatable

classmethod from_multi_indexed_dataframe(df)[source]

Class function to create a datatable from a multi-indexed dataframe

Parameters

df : multi index dataframe

Returns

table : Datatable

classmethod from_pyam(idf, **kwargs)[source]

Create a datatable from a pyam IamDataFrame.

Parameters

cls : datatable class

idf : pyam dataframe

Dataframe that contains the data used to create the datatable.

kwargs : keyword arguments

Returns

datatable : datatoolbox datatable

Datatable with original unit and related meta data.

generateTableID()[source]

Generates the table ID based on the meta data of the table.

Returns

datatable


getTableFileName()[source]

For compatibility with Windows-based systems, the pipe symbol is replaced by a double underscore in the csv file name.
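The replacement can be pictured with a short helper. The function name `table_file_name`, the example ID and the `.csv` suffix handling are assumptions for illustration only; the pipe character is replaced because it is not allowed in Windows file names.

```python
def table_file_name(table_id: str) -> str:
    """Build a csv file name that avoids the pipe character."""
    return table_id.replace("|", "__") + ".csv"


# Hypothetical table ID containing pipe symbols.
file_name = table_file_name("Emissions|CO2|Total")
print(file_name)  # Emissions__CO2__Total.csv
```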

getTableFilePath()[source]

Return the path to the data on the hard disk.

info()[source]

Return information about the dataframe, such as its shape, index and column extent, and the number of non-NaN entries.

Returns

str

Information about datatable.

interpolate(method='linear', add_missing_years=False)[source]

Interpolate missing data between years, with the option to add missing years to the columns.

Parameters

method : string, optional

Interpolation method. The default is “linear”. Available methods:

- linear

add_missing_years : bool, optional

If true, missing years within the time value range are added to the dataframe. The default is False.

Returns

datatable

Interpolated dataframe.
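The effect of add_missing_years can be sketched with plain pandas, assuming integer year columns. The helper `interpolate_years` is hypothetical and only mimics the described behaviour. Note that pandas' "linear" method treats the columns as equally spaced, which matches the year spacing once all missing years have been added.

```python
import pandas as pd

table = pd.DataFrame(
    {2010: [1.0, 10.0], 2012: [3.0, None], 2015: [6.0, 40.0]},
    index=["DEU", "FRA"],
)


def interpolate_years(df, add_missing_years=False):
    out = df.copy()
    if add_missing_years:
        # Add every year between the first and last reported year.
        first, last = int(min(out.columns)), int(max(out.columns))
        out = out.reindex(columns=list(range(first, last + 1)))
    # Interpolate along the year axis, only between reported values.
    return out.interpolate(method="linear", axis=1, limit_area="inside")


filled = interpolate_years(table, add_missing_years=True)
print(filled)
```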

reduce(method='linear_piece_wise', eps=1e-06)[source]

Reduce data that is piecewise linear to the core data points (kinks).
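One way to picture this reduction: keep the end points and every point at which the slope changes by more than eps. The function below is a minimal sketch under that assumption; how reduce() applies eps internally is not stated here.

```python
def reduce_piecewise_linear(years, values, eps=1e-6):
    """Keep end points and points where the slope changes by more than eps."""
    if len(years) <= 2:
        return list(years), list(values)
    keep = [0]
    for i in range(1, len(years) - 1):
        slope_before = (values[i] - values[i - 1]) / (years[i] - years[i - 1])
        slope_after = (values[i + 1] - values[i]) / (years[i + 1] - years[i])
        if abs(slope_after - slope_before) > eps:
            keep.append(i)  # slope changes here: this point is a kink
    keep.append(len(years) - 1)
    return [years[i] for i in keep], [values[i] for i in keep]


years = [2010, 2011, 2012, 2013, 2014]
values = [0.0, 1.0, 2.0, 2.0, 2.0]
kink_years, kink_values = reduce_piecewise_linear(years, values)
print(kink_years, kink_values)  # [2010, 2012, 2014] [0.0, 2.0, 2.0]
```

Linear interpolation between the retained points reproduces the dropped values, which is why the reduction loses no information for piecewise linear data.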

source()[source]

Return the source of the table

squeeze_index_to_attrs()[source]

Move all index levels that hold a single unique value to attrs.

Returns

Datatable

Datatable with only index levels that are non-unique. All other levels are in attrs. This operation can be reversed with table.to_multi_index_dataframe().
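The idea can be sketched on a plain pandas DataFrame, using its `attrs` dictionary. The level names, values and the helper `squeeze_index` are hypothetical and only illustrate moving single-valued index levels into attrs.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("DEU", "Emissions|CO2", "Mt CO2"), ("FRA", "Emissions|CO2", "Mt CO2")],
    names=["region", "entity", "unit"],
)
df = pd.DataFrame({2010: [1.0, 2.0], 2011: [1.5, 2.5]}, index=index)


def squeeze_index(frame):
    out = frame.copy()
    moved = {}
    for name in list(out.index.names):
        level_values = out.index.get_level_values(name).unique()
        # Move a level only if it holds one value and is not the last level.
        if len(level_values) == 1 and out.index.nlevels > 1:
            moved[name] = level_values[0]
            out = out.droplevel(name)
    out.attrs.update(moved)
    return out


squeezed = squeeze_index(df)
print(squeezed.attrs)
print(squeezed)
```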

to_IamDataFrame(**kwargs)[source]

Kept for backwards compatibility. Deprecated.

to_csv(fileName=None)[source]

Save the datatable to an annotated csv file.

Parameters

fileName : str, optional

Path to file. The default is None.

Returns

None.

to_excel(fileName=None, sheetName='Sheet0', writer=None, append=False)[source]

Save datatable to excel.

Parameters

fileName : str, optional

Relative file path. If None is provided, a writer is expected. The default is None.

sheetName : str, optional

Name of the sheet that is written. The default is “Sheet0”.

writer : pandas excel writer, optional

Pandas writer that is used instead of opening a new one. The default is None.

append : bool, optional

If true, data is appended to the writer. The default is False.

Returns

None.

to_multi_index_dataframe(meta_keys=None, exclude_meta=['ID', 'creator', 'source_name', 'source_year', '_timeformat'])[source]

Return a new datatable with a multi-index to which all meta data is assigned.

Parameters

exclude_meta : list, optional

Meta keys that should be ignored.

Returns

Datatable

New Datatable with a multi-index and no meta data.

to_pyam(**kwargs)[source]

Conversion to pyam dataframe.

Parameters

kwargs : keyword arguments

Raises

AssertionError

Returns

idf : pyam dataframe

yearlyChange(forward=True)[source]

This method returns the yearly change for all years (t1) that are reported and for which the previous year (t0) is also reported.

Parameters

forward : bool

If true, the yearly change is computed in the forward direction, otherwise backwards. The default is True.
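The selection of years can be sketched with plain pandas, assuming integer year columns. The helper `yearly_change` is hypothetical and covers only the forward direction, computing t1 minus t0 for each year t1 whose previous year is also a column.

```python
import pandas as pd

table = pd.DataFrame(
    {2010: [1.0], 2011: [1.5], 2013: [3.0], 2014: [2.0]}, index=["DEU"]
)


def yearly_change(df):
    """Change t1 - t0 for every year t1 whose previous year t0 is a column."""
    years = [t1 for t1 in df.columns if (t1 - 1) in df.columns]
    previous = [t1 - 1 for t1 in years]
    delta = df[years].to_numpy() - df[previous].to_numpy()
    return pd.DataFrame(delta, index=df.index, columns=years)


change = yearly_change(table)
print(change)
```

In this example 2013 is dropped from the result because 2012 is not reported.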

class datatoolbox.DataSet(data_vars: DataVars | None = None, coords: Mapping[Any, Any] | None = None, attrs: Mapping[Any, Any] | None = None)[source]

Very simple class to allow initialization of xarray datasets from pyam, wide pandas dataframes and datatoolbox queries.