Data structures¶

class datatoolbox.Datatable(*args, **kwargs)[source]¶
Datatable is derived from the pandas DataFrame. Datatables carry an additional meta attribute and support automated unit conversions.
add(other, **kwargs)[source]¶
Get Addition of dataframe and other, element-wise (binary operator add).

Equivalent to dataframe + other, but with support to substitute a fill_value for missing data in one of the inputs. With reverse version, radd.

Among flexible wrappers (add, sub, mul, div, floordiv, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.

Parameters¶

other : scalar, sequence, Series, dict or DataFrame
Any single or multiple element data structure, or list-like object.

axis : {0 or ‘index’, 1 or ‘columns’}
Whether to compare by the index (0 or ‘index’) or columns (1 or ‘columns’). For Series input, axis to match Series index on.

level : int or label
Broadcast across a level, matching Index values on the passed MultiIndex level.

fill_value : float or None, default None
Fill existing missing (NaN) values, and any new element needed for successful DataFrame alignment, with this value before computation. If data in both corresponding DataFrame locations is missing the result will be missing.

Returns¶

DataFrame
Result of the arithmetic operation.

See Also¶

DataFrame.add : Add DataFrames.
DataFrame.sub : Subtract DataFrames.
DataFrame.mul : Multiply DataFrames.
DataFrame.div : Divide DataFrames (float division).
DataFrame.truediv : Divide DataFrames (float division).
DataFrame.floordiv : Divide DataFrames (integer division).
DataFrame.mod : Calculate modulo (remainder after division).
DataFrame.pow : Calculate exponential power.

Notes¶

Mismatched indices will be unioned together.
Examples¶
>>> df = pd.DataFrame({'angles': [0, 3, 4],
...                    'degrees': [360, 180, 360]},
...                   index=['circle', 'triangle', 'rectangle'])
>>> df
           angles  degrees
circle          0      360
triangle        3      180
rectangle       4      360
Add a scalar with the operator version, which returns the same results.
>>> df + 1
           angles  degrees
circle          1      361
triangle        4      181
rectangle       5      361
>>> df.add(1)
           angles  degrees
circle          1      361
triangle        4      181
rectangle       5      361
Divide by constant with reverse version.
>>> df.div(10)
           angles  degrees
circle        0.0     36.0
triangle      0.3     18.0
rectangle     0.4     36.0
>>> df.rdiv(10)
             angles   degrees
circle          inf  0.027778
triangle   3.333333  0.055556
rectangle  2.500000  0.027778
Subtract a list and Series by axis with operator version.
>>> df - [1, 2]
           angles  degrees
circle         -1      358
triangle        2      178
rectangle       3      358
>>> df.sub([1, 2], axis='columns')
           angles  degrees
circle         -1      358
triangle        2      178
rectangle       3      358
>>> df.sub(pd.Series([1, 1, 1], index=['circle', 'triangle', 'rectangle']),
...        axis='index')
           angles  degrees
circle         -1      359
triangle        2      179
rectangle       3      359
Multiply a dictionary by axis.
>>> df.mul({'angles': 0, 'degrees': 2})
            angles  degrees
circle           0      720
triangle         0      360
rectangle        0      720
>>> df.mul({'circle': 0, 'triangle': 2, 'rectangle': 3}, axis='index')
            angles  degrees
circle           0        0
triangle         6      360
rectangle       12     1080
Multiply a DataFrame of different shape with operator version.
>>> other = pd.DataFrame({'angles': [0, 3, 4]},
...                      index=['circle', 'triangle', 'rectangle'])
>>> other
           angles
circle          0
triangle        3
rectangle       4
>>> df * other
           angles  degrees
circle          0      NaN
triangle        9      NaN
rectangle      16      NaN
>>> df.mul(other, fill_value=0)
           angles  degrees
circle          0      0.0
triangle        9      0.0
rectangle      16      0.0
Divide by a MultiIndex by level.
>>> df_multindex = pd.DataFrame({'angles': [0, 3, 4, 4, 5, 6],
...                              'degrees': [360, 180, 360, 360, 540, 720]},
...                             index=[['A', 'A', 'A', 'B', 'B', 'B'],
...                                    ['circle', 'triangle', 'rectangle',
...                                     'square', 'pentagon', 'hexagon']])
>>> df_multindex
             angles  degrees
A circle          0      360
  triangle        3      180
  rectangle       4      360
B square          4      360
  pentagon        5      540
  hexagon         6      720
>>> df.div(df_multindex, level=1, fill_value=0)
             angles  degrees
A circle        NaN      1.0
  triangle      1.0      1.0
  rectangle     1.0      1.0
B square        0.0      0.0
  pentagon      0.0      0.0
  hexagon       0.0      0.0
aggregate_region(mapping, skipna=False)[source]¶

Adds the aggregates to the table according to the provided mapping (see datatools.mapp.regions).

Returns the result; the table itself is not modified in place.
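
The aggregation described above can be pictured with a plain-Python sketch. It is not the datatoolbox implementation: the nested-dict table layout, the region codes, the function body and the treatment of skipna (borrowed from the pandas convention for sums) are all assumptions made for illustration.

```python
def aggregate_region(table, mapping, skipna=False):
    """Return a copy of `table` with one extra row per aggregate in `mapping`.

    table:   {region: {year: value or None}}   (hypothetical layout)
    mapping: {aggregate_name: [member regions]}
    """
    # Copy the rows so that the input table is left untouched.
    result = {region: dict(row) for region, row in table.items()}
    years = sorted({year for row in table.values() for year in row})
    for aggregate, members in mapping.items():
        row = {}
        for year in years:
            values = [table[m].get(year) for m in members if m in table]
            present = [v for v in values if v is not None]
            missing = len(values) - len(present)
            # Assumed rule: without skipna, one missing member makes
            # the aggregate missing for that year.
            if not present or (missing and not skipna):
                row[year] = None
            else:
                row[year] = sum(present)
        result[aggregate] = row
    return result


table = {
    "DEU": {2020: 1.0, 2021: 2.0},
    "FRA": {2020: 3.0, 2021: None},
}
mapping = {"DEU+FRA": ["DEU", "FRA"]}  # hypothetical aggregate name

strict = aggregate_region(table, mapping)
lenient = aggregate_region(table, mapping, skipna=True)
```

The input table keeps its two rows in both calls; only the returned copies contain the aggregate row.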

append(other, **kwargs)[source]¶

Append data to the datatable.

Parameters¶

other : datatable
New data that will be added to the datatable.

kwargs : dict
Default pandas append arguments.

Returns¶

datatable

clean()[source]¶

Clean up the dataframe so that it contains only recognized regions, years and numeric values. Columns and rows with only NaN values are removed.

Returns¶

datatable
Cleaned datatable.

convert(newUnit, context=None, suffix_dict={}, **new_meta)[source]¶

Convert the datatable to a different unit and return the converted datatable.

Parameters¶

newUnit : str
New unit string into which the datatable should be converted.

context : str, optional
Optional context (e.g. GWPAR4). The default is None.

Returns¶

datatable
Datatable converted to the new unit.

copy(deep=True)[source]¶

Make a copy of this Datatable object.

Returns¶

copy : Datatable

diff(periods=1, axis=0)[source]¶

Compute the difference between years in the datatable. Equivalent to pandas diff, but returns a datatable.

Parameters¶

periods : int, optional
Periods to shift for calculating the difference. The default is 1.

axis : int, optional
Axis along which the difference is taken. The default is 0.

Returns¶

out : datatable
Datatable containing the differences.

filter(spaceIDs)[source]¶

Filter dataframe based on a list of spatial IDs.

Parameters¶

spaceIDs : iterable of str
Spatial IDs used for filtering.

Returns¶

datatable
Filtered datatable.

classmethod from_excel(filepath, sheetName=None)[source]¶

Create a datatable from a suitable Excel file that was saved by datatoolbox.

Parameters¶

cls : class

filepath : str
Path to the file.

sheetName : str, optional
Name of the sheet that is read in. The default is None.

Returns¶

datatable
Datatable read from the file.

classmethod from_multi_indexed_dataframe(df)[source]¶

Class function to create a datatable from a multi-indexed dataframe

Parameters¶

df : multi index dataframe

Returns¶

table : Datatable

classmethod from_pyam(idf, **kwargs)[source]¶

Create a datatable from an iam dataframe.

Parameters¶

cls : datatable class

idf : pyam dataframe
Dataframe that contains the data used to create the datatable.

kwargs : dict

Returns¶

datatable : datatoolbox datatable
Datatable with original unit and related meta data.

generateTableID()[source]¶

Generates the table ID based on the meta data of the table.

Returns¶

datatable

getTableFileName()[source]¶

For compatibility with Windows-based systems, the pipe symbol is replaced by a double underscore in the csv file name.
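
A minimal sketch of the replacement described above. The helper name, the example ID and the ".csv" suffix are assumptions made for illustration; only the pipe-to-double-underscore substitution comes from the description.

```python
def table_file_name(table_id):
    """Hypothetical helper: build a Windows-safe csv file name from a table ID."""
    # "|" is not allowed in Windows file names, so it is swapped for "__".
    return table_id.replace("|", "__") + ".csv"


example_id = "Emissions|CO2|Total"  # hypothetical table ID
file_name = table_file_name(example_id)
```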

getTableFilePath()[source]¶

Returns path to data on hard disk

Returns¶

Path to the data on hard disk.

info()[source]¶

Returns information about the dataframe, such as shape, index and column extent, and the number of non-NaN entries.

Returns¶

str
Information about datatable.

interpolate(method='linear', add_missing_years=False)[source]¶

Interpolate missing data between years, with the option to add missing years to the columns.

Parameters¶

method : str, optional
Interpolation method. The default is “linear”. Supported methods:
- linear

add_missing_years : bool, optional
If true, missing years within the time value range are added to the dataframe. The default is False.

Returns¶

datatable
Interpolated dataframe.
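
The effect of add_missing_years can be illustrated with a plain-Python sketch of linear interpolation over a single row. This is not the library's code: the dict-based row, the use of None for missing values and the function name are assumptions made for illustration.

```python
def interpolate_linear(series, add_missing_years=False):
    """Fill gaps in {year: value or None} by linear interpolation."""
    years = sorted(series)
    if add_missing_years:
        # Add every year between the first and last reported column.
        years = list(range(years[0], years[-1] + 1))
    known = [(y, series[y]) for y in sorted(series) if series[y] is not None]
    result = {}
    for year in years:
        value = series.get(year)
        if value is None:
            before = [(y, v) for y, v in known if y < year]
            after = [(y, v) for y, v in known if y > year]
            # Only interpolate between two known neighbours; no extrapolation.
            if before and after:
                (y0, v0), (y1, v1) = before[-1], after[0]
                value = v0 + (v1 - v0) * (year - y0) / (y1 - y0)
        result[year] = value
    return result


row = {2000: 10.0, 2001: None, 2002: 20.0, 2004: 40.0}
filled = interpolate_linear(row)
filled_all_years = interpolate_linear(row, add_missing_years=True)
```

In this sketch the first call fills only the existing 2001 column, while the second call also creates and fills a 2003 column.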

reduce(method='linear_piece_wise', eps=1e-06)[source]¶

Reduce data that is piecewise linear to the core data points (kinks).
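
One way to picture the reduction to kinks is the following plain-Python sketch. It is an illustration under assumptions, not the datatoolbox algorithm: in particular, treating eps as a tolerance on the change in slope between neighbouring segments is a guess, and the function name is hypothetical.

```python
def reduce_piecewise_linear(years, values, eps=1e-6):
    """Keep the end points and every point where the slope changes."""
    points = list(zip(years, values))
    if len(points) <= 2:
        return points
    kept = [points[0]]
    # Walk over consecutive triples and compare incoming and outgoing slope.
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        slope_in = (y1 - y0) / (x1 - x0)
        slope_out = (y2 - y1) / (x2 - x1)
        if abs(slope_out - slope_in) > eps:
            kept.append((x1, y1))  # the middle point is a kink
    kept.append(points[-1])
    return kept


years = [2000, 2001, 2002, 2003, 2004]
values = [0.0, 1.0, 2.0, 4.0, 6.0]
kinks = reduce_piecewise_linear(years, values)
```

Here the slope changes only at 2002, so the five points collapse to the two end points plus that kink, and linear interpolation between the kept points recovers the dropped values.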

source()[source]¶

Return the source of the table

squeeze_index_to_attrs()[source]¶

Move all unique index levels to attrs.

Returns¶

Datatable
Datatable with only the index levels that are non-unique. All other levels are in attrs. This operation can be reversed with table.to_multi_index_dataframe().

to_IamDataFrame(**kwargs)[source]¶

Kept for backwards compatibility. Deprecated.

to_csv(fileName=None)[source]¶

Save the datatable to an annotated csv file.

Parameters¶

fileName : str, optional
Path to file. The default is None.

Returns¶

None.

to_excel(fileName=None, sheetName='Sheet0', writer=None, append=False)[source]¶

Save datatable to excel.

Parameters¶

fileName : str, optional
Relative file path. If None is provided, a writer is expected. The default is None.

sheetName : str, optional
Name of the sheet that is written. The default is “Sheet0”.

writer : pandas excel writer, optional
Pandas writer that is used instead of opening a new one. The default is None.

append : bool, optional
If true, data is appended to the writer. The default is False.

Returns¶

None.

to_multi_index_dataframe(meta_keys=None, exclude_meta=['ID', 'creator', 'source_name', 'source_year', '_timeformat'])[source]¶

Return a new datatable with a multi-index that has all meta data assigned.

Parameters¶

exclude_meta : list, optional
Meta keys that should be ignored.

Returns¶

Datatable
New Datatable with multi-index and no meta data.

to_pyam(**kwargs)[source]¶

Conversion to pyam dataframe.

Parameters¶

kwargs : dict

Raises¶

AssertionError

Returns¶

idf : pyam dataframe

yearlyChange(forward=True)[source]¶

This method returns the yearly change for all years (t1) that are reported and for which the previous year (t0) is also reported.

Parameters¶

forward : bool
If true, the yearly change is computed in the forward direction, otherwise backwards. The default is True.
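
The t1/t0 rule can be sketched in plain Python as follows. This is an illustration only: the dict-based row and the function name are assumptions, and the forward switch is not modelled because the description does not say how the backward direction is labelled.

```python
def yearly_change(series):
    """Change v(t1) - v(t0) for every year t1 whose previous year t0 = t1 - 1 is reported."""
    return {
        year: series[year] - series[year - 1]
        for year in sorted(series)
        if year - 1 in series
    }


reported = {2000: 10.0, 2001: 12.0, 2003: 15.0, 2004: 18.0}
changes = yearly_change(reported)
```

In this sketch 2003 yields no entry because 2002 is not reported, and 2000 yields none because it has no predecessor.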
class datatoolbox.DataSet(data_vars: DataVars | None = None, coords: Mapping[Any, Any] | None = None, attrs: Mapping[Any, Any] | None = None)[source]¶

Very simple class to allow initialization of xarray datasets from pyam, wide pandas dataframes and datatoolbox queries.