baloo.core.indexes package
baloo.core.indexes.base module
class baloo.core.indexes.base.Index(data, dtype=None, name=None)
Bases: baloo.weld.lazy_result.LazyArrayResult, baloo.core.generic.BinaryOps, baloo.core.generic.BitOps, baloo.core.generic.IndexCommon, baloo.core.generic.BalooCommon
Weld-ed Pandas Index.
Examples
>>> import baloo as bl
>>> import numpy as np
>>> ind = bl.Index(np.array(['a', 'b', 'c'], dtype=np.dtype(np.bytes_)))
>>> ind  # repr
Index(name=None, dtype=|S1)
>>> print(ind)  # str
[b'a' b'b' b'c']
>>> ind.values
array([b'a', b'b', b'c'], dtype='|S1')
>>> len(ind)  # eager
3
Attributes: - dtype
- name
Name of the Index.
__getitem__(item)
Select from the Index. Currently used internally through DataFrame and Series.
The supported forms of selection are shown below.
Examples
>>> ind = bl.Index(np.arange(3))
>>> print(ind[ind < 2].evaluate())
[0 1]
>>> print(ind[1:2].evaluate())
[1]
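The two selection forms in the example above (a boolean mask and a slice) can be mimicked on a plain Python list. This is only a sketch of the semantics: `select` is a hypothetical helper that is not part of baloo, and it computes its result immediately, whereas the Index defers the work until `evaluate()` is called.

```python
def select(values, item):
    """Select from a list by slice or by boolean mask (hypothetical helper)."""
    if isinstance(item, slice):
        return values[item]
    # Anything else is treated as a boolean mask of the same length.
    if len(item) != len(values):
        raise ValueError("mask length must match data length")
    return [v for v, keep in zip(values, item) if keep]


data = [0, 1, 2]
mask = [v < 2 for v in data]        # plays the role of `ind < 2`
print(select(data, mask))           # [0, 1]
print(select(data, slice(1, 2)))    # [1]
```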
__init__(data, dtype=None, name=None)
Initialize an Index object.
Parameters: - data : np.ndarray or WeldObject or list
Raw data or Weld expression.
- dtype : np.dtype, optional
Numpy dtype of the elements. Inferred from data by default.
- name : str, optional
Name of the Index.
dropna()
Returns Index without null values according to Baloo’s convention.
Returns: - Index
Index with no null values.
evaluate(verbose=False, decode=True, passes=None, num_threads=1, apply_experimental=True)
Evaluates by creating an Index containing evaluated data.
See LazyResult
Returns: - Index
Index with evaluated data.
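To illustrate what "evaluating" a lazy result means in general, here is a minimal pure-Python sketch. `LazyList` is a hypothetical stand-in and not baloo's class; it only records element-wise operations until `evaluate()` is called, and it does not model the `verbose`, `decode`, `passes`, `num_threads` or `apply_experimental` arguments.

```python
class LazyList:
    """Hypothetical stand-in for a lazily evaluated array (not baloo's class)."""

    def __init__(self, data, pending=()):
        self.data = list(data)          # raw, already-computed values
        self.pending = tuple(pending)   # element-wise functions not yet applied

    def map(self, fn):
        # Only record the operation; no element is touched here.
        return LazyList(self.data, self.pending + (fn,))

    def evaluate(self):
        # Apply every recorded operation and return a new object with the results.
        out = self.data
        for fn in self.pending:
            out = [fn(v) for v in out]
        return LazyList(out)


lazy = LazyList([0, 1, 2]).map(lambda v: v * 2).map(lambda v: v + 1)
print(lazy.data)               # [0, 1, 2]  (nothing computed yet)
print(lazy.evaluate().data)    # [1, 3, 5]
```

As in the signature above, `evaluate()` in this sketch returns a new object of the same kind rather than a bare list.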
fillna(value)
Returns Index with missing values replaced with value.
Parameters: - value : {int, float, bytes, bool}
Scalar value to replace missing values with.
Returns: - Index
With missing values replaced.
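This section does not spell out how Baloo marks a value as missing, so the sketch below takes the test for "missing" as a parameter. Both the `fillna` helper and the markers used (`None`, and `-999` as an arbitrary sentinel) are assumptions made for illustration only.

```python
def fillna(values, value, is_missing=lambda v: v is None):
    """Return a new list with missing entries replaced by `value` (hypothetical helper)."""
    return [value if is_missing(v) else v for v in values]


print(fillna([1.0, None, 3.0], 0.0))    # [1.0, 0.0, 3.0]

# A sentinel-based convention can be plugged in through `is_missing`;
# -999 is an arbitrary choice for this example.
print(fillna([1, -999, 3], 0, is_missing=lambda v: v == -999))    # [1, 0, 3]
```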
classmethod from_pandas(index)
Create baloo Index from pandas Index.
Parameters: - index : pandas.Index
Returns: - Index
head(n=5)
Return Index with first n values.
Parameters: - n : int
Number of values.
Returns: - Index
Index containing the first n values.
Examples
>>> ind = bl.Index(np.arange(3, dtype=np.float64))
>>> print(ind.head(2).evaluate())
[0. 1.]
name
Name of the Index.
Returns: - str
name
baloo.core.indexes.range module
class baloo.core.indexes.range.RangeIndex(start=None, stop=None, step=None, name=None)
Bases: baloo.core.indexes.base.Index
Weld-ed Pandas RangeIndex.
Examples
>>> import baloo as bl
>>> import numpy as np
>>> ind = bl.RangeIndex(3)
>>> ind  # repr
RangeIndex(start=0, stop=3, step=1)
>>> weld_code = str(ind)  # weld_code
>>> ind.evaluate()
Index(name=None, dtype=int64)
>>> print(ind.evaluate())
[0 1 2]
>>> len(ind)  # eager
3
>>> (ind * 2).evaluate().values
array([0, 2, 4])
>>> (ind - bl.Series(np.arange(1, 4))).evaluate().values
array([-1, -1, -1])
Attributes: - start
- stop
- step
- dtype
__init__(start=None, stop=None, step=None, name=None)
Initialize a RangeIndex object.
If only one value is passed (as start), it is treated as the stop value. This single value may also be a WeldObject, for example when a Series is created without an index argument.
Parameters: - start : int or WeldObject
- stop : int or WeldObject, optional
- step : int, optional
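The single-argument rule described above mirrors Python's built-in `range`. A minimal sketch of that normalization follows; `normalize_range_args` is a hypothetical helper, it handles plain integers only (not a WeldObject), and the defaults of 0 for start and 1 for step are taken from the `RangeIndex(start=0, stop=3, step=1)` repr shown in the class example.

```python
def normalize_range_args(start=None, stop=None, step=None):
    """Resolve (start, stop, step) from the arguments given (hypothetical helper)."""
    if start is None:
        # Rejecting an empty call is a choice made for this sketch.
        raise ValueError("at least one value is required")
    if stop is None:
        # A single value is interpreted as the stop, counting from 0.
        start, stop = 0, start
    if step is None:
        step = 1
    return start, stop, step


print(normalize_range_args(3))                  # (0, 3, 1)
print(normalize_range_args(1, 4))               # (1, 4, 1)
print(list(range(*normalize_range_args(3))))    # [0, 1, 2]
```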
empty
Check whether the data structure is empty.
Returns: - bool
baloo.core.indexes.multi module
class baloo.core.indexes.multi.MultiIndex(data, names=None)
Bases: baloo.core.generic.IndexCommon, baloo.core.generic.BalooCommon
Weld-ed MultiIndex, though entirely different from the Pandas MultiIndex.
This version merely groups a few columns together to act as an index and hence does not follow the labels/levels approach of Pandas.
Examples
>>> import baloo as bl
>>> import numpy as np
>>> ind = bl.MultiIndex([[1, 2, 3], np.array([4, 5, 6], dtype=np.float64)], names=['i1', 'i2'])
>>> ind  # repr
MultiIndex(names=['i1', 'i2'], dtypes=[dtype('int64'), dtype('float64')])
>>> print(ind)  # str
  i1    i2
----  ----
   1     4
   2     5
   3     6
>>> ind.values
[Index(name=i1, dtype=int64), Index(name=i2, dtype=float64)]
>>> len(ind)  # eager
3
Attributes: - names
- dtypes
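The "group of columns" idea described for this class can be pictured with a small pure-Python container. `ColumnGroup` is a hypothetical stand-in, not baloo's implementation; the equal-length check and the `rows()` method are additions made for this sketch.

```python
class ColumnGroup:
    """Hypothetical stand-in: several equal-length columns acting as one index."""

    def __init__(self, data, names=None):
        lengths = {len(col) for col in data}
        if len(lengths) > 1:
            # Validation is a choice made for this sketch.
            raise ValueError("all columns must have the same length")
        self.values = [list(col) for col in data]
        self.names = list(names) if names is not None else [None] * len(data)

    def __len__(self):
        return len(self.values[0]) if self.values else 0

    def rows(self):
        # One tuple per row, holding that row's value from every column.
        return list(zip(*self.values))


group = ColumnGroup([[1, 2, 3], [4.0, 5.0, 6.0]], names=["i1", "i2"])
print(len(group))      # 3
print(group.rows())    # [(1, 4.0), (2, 5.0), (3, 6.0)]
```

Nothing here resembles the labels/levels encoding of Pandas: the columns are stored as given and only read side by side.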
__getitem__(item)
Select from the MultiIndex.
The supported forms of selection are shown below.
Examples
>>> mi = bl.MultiIndex([np.array([1, 2, 3]), np.array([4., 5., 6.])], names=['i1', 'i2'])
>>> print(mi.values[0])
[1 2 3]
>>> print(mi[:2].evaluate())
  i1    i2
----  ----
   1     4
   2     5
>>> print(mi[mi.values[0] != 2].evaluate())
  i1    i2
----  ----
   1     4
   3     6
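In the example above, one slice or one mask selects the same rows from every column. A rough sketch of that behaviour on plain lists, where `select_rows` is a hypothetical helper and the computation is eager rather than lazy:

```python
def select_rows(columns, item):
    """Apply one slice or boolean mask to every column (hypothetical helper)."""
    if isinstance(item, slice):
        return [col[item] for col in columns]
    return [[v for v, keep in zip(col, item) if keep] for col in columns]


columns = [[1, 2, 3], [4.0, 5.0, 6.0]]
print(select_rows(columns, slice(None, 2)))    # [[1, 2], [4.0, 5.0]]

mask = [v != 2 for v in columns[0]]            # built from the first column only
print(select_rows(columns, mask))              # [[1, 3], [4.0, 6.0]]
```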
__init__(data, names=None)
Initialize a MultiIndex object.
Parameters: - data : list of (numpy.ndarray or Index or list)
The internal data.
- names : list of str, optional
The names of the data.
__len__()
Eagerly get the length of the MultiIndex.
Note that if the length is unknown (such as for WeldObjects), it will be eagerly computed.
Returns: - int
Length of the MultiIndex.
dropna()
Returns MultiIndex without any rows containing null values according to Baloo’s convention.
Returns: - MultiIndex
MultiIndex with no null values.
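Dropping "rows containing null values" means a row disappears from every column as soon as any one of its entries is null. The sketch below shows that rule on plain lists; `drop_null_rows` is a hypothetical helper, and treating `None` as the null marker is an assumption, since Baloo's own convention is not described in this section.

```python
def drop_null_rows(columns, is_null=lambda v: v is None):
    """Remove every row in which at least one column holds a null (hypothetical helper)."""
    keep = [not any(is_null(v) for v in row) for row in zip(*columns)]
    return [[v for v, k in zip(col, keep) if k] for col in columns]


columns = [[1, None, 3], [4.0, 5.0, None]]
print(drop_null_rows(columns))    # [[1], [4.0]]
```

Only the first row survives here: the second row has a null in the first column and the third row has one in the second column.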
empty
Check whether the data structure is empty.
Returns: - bool
evaluate(verbose=False, decode=True, passes=None, num_threads=1, apply_experimental=True)
Evaluates by creating a MultiIndex containing evaluated data and index.
See LazyResult
Returns: - MultiIndex
MultiIndex with evaluated data.
classmethod from_pandas(index)
Create baloo MultiIndex from pandas MultiIndex.
Parameters: - index : pandas.MultiIndex
Returns: - MultiIndex
name
Name of the Index.
Returns: - str
name
tail(n=5)
Return MultiIndex with the last n values in each column.
Parameters: - n : int
Number of values.
Returns: - MultiIndex
MultiIndex containing the last n values in each column.
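Taking the last n values "in each column" amounts to the same tail slice applied per column. A minimal sketch on plain lists, where `tail` is a hypothetical helper and the handling of `n <= 0` is a choice made here, not something this page specifies:

```python
def tail(columns, n=5):
    """Return the last n values of every column (hypothetical helper)."""
    if n <= 0:
        # Guard needed because col[-0:] would return the whole column.
        return [[] for _ in columns]
    return [col[-n:] for col in columns]


columns = [[1, 2, 3], [4.0, 5.0, 6.0]]
print(tail(columns, 2))    # [[2, 3], [5.0, 6.0]]
print(tail(columns))       # [[1, 2, 3], [4.0, 5.0, 6.0]]  (fewer than 5 rows)
```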
values
Retrieve internal data.
Returns: - list
The internal list data representation.