Here are a few ideas for container-based features, borrowed from R and Python, that we might be able to support in Rev.
Data tables
Real-world data table files (e.g. CSV files) may contain fields of different types (int, float, str, etc.).
RevBayes does not have a data table type, but it can read data table files as two-dimensional vectors. RevBayes behaves well when all values in a CSV file can be converted into the same type.
However, when a file contains multiple distinct types, the resulting type is a generic RevObject[][]. For example:
> x2 = readDataDelimitedFile("example2.csv", delimiter=",", header=true)
> x2
RevObject[][] vector with 2 values
==================================
[1]
RevObject[] vector with 2 values
================================
[1]
0.1
[2]
cat
[2]
RevObject[] vector with 2 values
================================
[1]
0
[2]
NA
> type(x2)
RevObject[][]
Part of the problem comes from relying on a two-dimensional vector to represent the table. Each row vector must have elements of a single type, so any row whose columns have different types gets cast to the most generic type, RevObject.
A solution would be to add a DataTable object. This could then store vectors across columns (not rows) while also supporting more advanced ways of indexing (e.g. column names, slice-indexing, etc.).
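To make the column-oriented idea concrete, here is a minimal Python sketch, not Rev code and not a proposed implementation. The `DataTable` class, the `parse_cell` helper, and the column names `rate` and `animal` are all hypothetical, and the data is made up for illustration. It shows that storing one vector per column lets each column keep its own element type.

```python
import csv
import io


def parse_cell(text):
    """Convert one CSV field to int, float, None (for NA), or leave it as str."""
    text = text.strip()
    if text == "NA":
        return None
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text


class DataTable:
    """Hypothetical column-oriented table: one list per named column."""

    def __init__(self, names, columns):
        self.names = list(names)
        self.columns = {name: list(col) for name, col in zip(names, columns)}

    @classmethod
    def from_csv(cls, handle, delimiter=","):
        rows = list(csv.reader(handle, delimiter=delimiter))
        header, body = rows[0], rows[1:]
        # Transpose rows into columns so each column keeps its own type.
        columns = [[parse_cell(row[i]) for row in body] for i in range(len(header))]
        return cls(header, columns)

    def __getitem__(self, name):
        # Index by column name.
        return self.columns[name]


# Made-up data for illustration, not the contents of example2.csv.
text = "rate,animal\n0.1,cat\n0.5,dog\n"
table = DataTable.from_csv(io.StringIO(text))
print(table["rate"])    # [0.1, 0.5]
print(table["animal"])  # ['cat', 'dog']
```

Because the columns are stored separately, the float column and the string column never have to share a common element type.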
Slice-indexing
Currently, we can either access the entire vector or individual vector elements. It should be possible to add basic support for slice-indexing. Current behavior:
> y = [0, 1, 2, 3, 4]
> y[0:3]
Error: Argument or label mismatch for function call.
Provided call:
[] (Natural[]<constant> 'index' )
Correct usage is:
[] (Natural<any> index)
> y[ [0, 1, 2] ]
Error: Argument or label mismatch for function call.
Provided call:
[] (Natural[]<constant> 'index' )
Correct usage is:
[] (Natural<any> index)
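For comparison, the kind of behavior being asked for can be sketched in Python. This is only an analogy: the `Vector` class below is hypothetical, and it uses Python's 0-based, half-open slices, whereas the indexing convention Rev would adopt is a separate design decision. The point is that a single indexing operator can dispatch on a single index, a range, or a vector of indices.

```python
class Vector:
    """Hypothetical homogeneous vector that accepts three kinds of index."""

    def __init__(self, values):
        self.values = list(values)

    def __getitem__(self, index):
        if isinstance(index, int):
            return self.values[index]                   # single element
        if isinstance(index, slice):
            return Vector(self.values[index])           # contiguous range
        return Vector([self.values[i] for i in index])  # vector of indices

    def __repr__(self):
        return f"Vector({self.values})"


y = Vector([0, 1, 2, 3, 4])
print(y[0:3])        # Vector([0, 1, 2])
print(y[[0, 1, 2]])  # Vector([0, 1, 2])
print(y[4])          # 4
```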
Another thing that would be useful to have is tuples. Pairs are a special case: a 2-tuple.
Tuples differ from vectors because each element of a tuple can have a different type, whereas every element of a vector must have the same type.
In C++, we have the type std::pair<T1,T2> for pairs. It would be nice to have the same thing in RevBayes.
If we have a type like Vector<Pair<String,Int>>, then this is one way to implement a dictionary, although not the most efficient one.
dict = [("alice",1), ("bob",2)]
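The efficiency remark can be illustrated with a short Python sketch (an analogy, not Rev code; the `lookup` helper is hypothetical). Finding a key in a vector of pairs requires scanning the entries one by one, whereas a hash-based dictionary built from the same pairs finds the key in constant time on average.

```python
from typing import List, Optional, Tuple

pairs: List[Tuple[str, int]] = [("alice", 1), ("bob", 2)]


def lookup(entries: List[Tuple[str, int]], key: str) -> Optional[int]:
    """Linear scan: cost grows with the number of entries."""
    for k, v in entries:
        if k == key:
            return v
    return None


print(lookup(pairs, "bob"))    # 2
print(lookup(pairs, "carol"))  # None

# A hash-based dictionary built from the same pairs gives
# average constant-time lookup.
as_dict = dict(pairs)
print(as_dict["bob"])          # 2
```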
Implicitly but strongly typed
Sebastian noted that combining implicit and strong typing can be complicated. However, languages like Rust are also implicitly (through type inference) but strongly typed, so there is a lot of prior art here.
Dictionaries/maps
It'd be nice to be able to use dictionaries or maps as unordered containers. Ideally, keys and values could be of any type.
Dictionaries of containers (vectors or other dictionaries) could be useful, too.
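As a point of reference, this is roughly what such usage looks like in Python, one of the languages these ideas are borrowed from. The variable and key names are made up for illustration, and none of this is existing or proposed Rev syntax.

```python
# Values of mixed types, including nested containers.
settings = {
    "n_sites": 100,
    "rates": [0.1, 0.5, 1.0],             # a vector as a value
    "labels": {"A": "cat", "B": "dog"},   # a dictionary as a value
}

print(settings["rates"][1])     # 0.5
print(settings["labels"]["B"])  # dog

settings["model"] = "JC"        # insert a new key
# Sort the keys so the output does not depend on storage order.
print(sorted(settings))         # ['labels', 'model', 'n_sites', 'rates']
```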