Creating an IntervalCollection of ValueIntervals #674

Open
charlesxucheng opened this issue Sep 10, 2017 · 5 comments
@charlesxucheng

Hi,

Thanks for creating the library. It is what I need for date range manipulation.

I would like to use ValueInterval and IntervalCollection to model intervals that have an associated value (an attribute of an entity that carries that value during the interval, i.e. time series data such as interest rates).

However, I could not find a way to create an IntervalCollection from a list of ValueIntervals. Is there a way to do so? Thanks.

@MenoData
Owner

You can easily create an IntervalCollection from a list of ValueInterval objects, but the collection will not remember their concrete type. Example:

DateInterval i1 = DateInterval.between(PlainDate.of(2017, 1, 1), PlainDate.of(2017, 9, 11));
CalendarMonth i2 = CalendarMonth.of(2017, 8); // another interval type
ValueInterval<PlainDate, DateInterval, Integer> vi1 = i1.withValue(17);
ValueInterval<PlainDate, CalendarMonth, Integer> vi2 = i2.withValue(20);
IntervalCollection<PlainDate> icoll = IntervalCollection.onDateAxis().plus(vi1).plus(vi2);
List<ChronoInterval<PlainDate>> storedIntervals = icoll.getIntervals();
System.out.println(storedIntervals); // [[2017-01-01/2017-09-11]=>17, 2017-08=>20]

As you can see, the type of the stored intervals is now just a list of the super interface ChronoInterval<PlainDate>. There are two main reasons why an IntervalCollection cannot remember which interval type is really stored.

  • As in the example, different interval types can be stored (on the same timeline, here the date axis), so a common super interface is needed to describe the stored intervals. You could even store a mixture of normal date intervals and value intervals as long as they are on the same timeline (described by the generic type parameter T of IntervalCollection).
  • An IntervalCollection is mainly designed for the manipulation of interval boundaries. Several manipulation methods like withBlocks() or withSplits() require the creation of NEW intervals. Those new intervals are not value intervals but just normal intervals, because the API cannot know how to aggregate or collect the associated values (especially when stored intervals overlap or have gaps).

So this limitation is baked into the available type parameters of IntervalCollection. It uses the type T characterizing the timeline (here: PlainDate) but there is no second type parameter describing the concrete type of stored intervals (because it can change during manipulations).

Alternatives:

However, Time4J has another type of interval collection which does know what kind of intervals are stored, namely IntervalTree. An interval tree can be considered a read-only collection. You fill it once with all your value intervals, and then you can query or sample it in several ways, for example by using the findIntersections(...) methods. Search queries also return the same value interval type that you put in.

If you need to manipulate interval boundaries and know how to aggregate the associated values during the manipulation process, then you might combine IntervalCollection and IntervalTree. You would fill both with your value intervals. Then you manipulate the collection, possibly creating new interval boundaries. Finally, you use the new boundaries to submit search queries to the tree (so you know the times at which to look for suitable value intervals) and aggregate the values of the intersecting value intervals that are found.
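The last step of that combination (query by new boundaries, then aggregate) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses plain java.time instead of Time4J types, the classes ValueSpan and AggregationSketch are hypothetical, a linear scan stands in for the tree query, and summation is just one possible aggregation policy.

```java
import java.time.LocalDate;
import java.util.List;

// Hypothetical stand-in for a value interval on the date axis (closed start and end).
// This is NOT a Time4J type; it only keeps the sketch self-contained.
final class ValueSpan {
    final LocalDate start;
    final LocalDate end;
    final int value;

    ValueSpan(LocalDate start, LocalDate end, int value) {
        this.start = start;
        this.end = end;
        this.value = value;
    }

    // True if this span shares at least one day with the closed range [from, to].
    boolean intersects(LocalDate from, LocalDate to) {
        return !start.isAfter(to) && !end.isBefore(from);
    }
}

final class AggregationSketch {
    // Sums the values of all spans intersecting [from, to].
    // A real implementation would ask an interval tree instead of scanning a list,
    // and the aggregation function (here: sum) is an assumption of this sketch.
    static int sumOver(List<ValueSpan> spans, LocalDate from, LocalDate to) {
        int sum = 0;
        for (ValueSpan s : spans) {
            if (s.intersects(from, to)) {
                sum += s.value;
            }
        }
        return sum;
    }
}
```

With the two intervals from the example above (2017-01-01/2017-09-11 with value 17 and August 2017 with value 20), the boundaries 2017-01-01/2017-07-31, 2017-08-01/2017-08-31 and 2017-09-01/2017-09-11 would yield the sums 17, 37 and 17.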

So far, Time4J has no special API for aggregating the values of value intervals. This step is still up to you. You might also consider the various streaming methods of class DateInterval, which you could use in flatMap operations with the java.util.stream.Stream API of Java 8.

If you think there is a need for such an aggregator API, then let me know about your concrete proposals or suggestions.
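To illustrate the flatMap idea without relying on any particular Time4J method, here is a rough sketch based on java.time only. The class DailyAggregationSketch and its nested Span type are hypothetical, LocalDate.datesUntil(...) and Map.entry(...) require Java 9 or later, and summing values per day is an assumed aggregation policy.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class DailyAggregationSketch {
    // Hypothetical value interval with closed date bounds (not a Time4J type).
    static final class Span {
        final LocalDate start;
        final LocalDate end;
        final int value;

        Span(LocalDate start, LocalDate end, int value) {
            this.start = start;
            this.end = end;
            this.value = value;
        }
    }

    // Expands every span into one (day, value) pair per covered day and
    // sums the values of all spans covering the same day.
    static Map<LocalDate, Integer> sumPerDay(List<Span> spans) {
        return spans.stream()
            .flatMap(s -> s.start.datesUntil(s.end.plusDays(1)) // end is inclusive
                .map(day -> Map.entry(day, s.value)))
            .collect(Collectors.toMap(
                Map.Entry::getKey,
                Map.Entry::getValue,
                Integer::sum,      // overlap policy: add the values
                TreeMap::new));    // keep the days in temporal order
    }
}
```

Days not covered by any span are simply absent from the resulting map, so gaps remain visible to the caller.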

@charlesxucheng
Author

Thanks Meno for the explanation.

I think sometimes we will need to have a collection of value intervals, and I will need to modify the intervals (or at least produce a new collection out of the existing one). Furthermore, I would think that the value intervals in such a collection should not overlap, because the meaning of an overlap is not clear. If I want multiple values for a period, I would rather use a collection of values as the value attribute of the interval. The use case, as explained initially, is to model time series data: we want to tag a value (or some values) to a period to indicate that the value is effective during that period.

So I think one way is to have something like a ValueIntervalCollection that handles such requirements. I have not studied the API in detail, so I am not sure at this moment how ValueIntervalCollection should be related to IntervalCollection. We are working on a project, so we can work out something during our project to enhance IntervalCollection along this line of thought, and then we can submit a PR. What do you think?

@MenoData
Owner

If I understand you correctly, then we can define a time series as

a temporally ordered sequence of non-overlapping value intervals (with gaps only where we don't have defined values)

This would be a nice enhancement of Time4J, so I would welcome a PR as a start. Personally, I think that such a time series should not be directly related to IntervalCollection because it is simply another specialized type of collection.

  • The existing class IntervalCollection focuses on boundary manipulation of arbitrary time-related intervals which can overlap or have gaps. That is quite different from a time series.
  • The existing class IntervalTree is a read-only collection which is good for quick search queries due to its special structure and algorithm. But again, it allows overlapping intervals. It is even optimized for that purpose.

As for naming, we could consider the name "IntervalSeries" for the new class (with the same prefix as the other collection types, for easier API search and higher recognition value). Furthermore:

  • IntervalCollection<T> implements List<ChronoInterval<T>> (should be done?!)
  • IntervalTree<T, I> implements Collection<I> (already realized)
  • IntervalSeries<T, I, V> implements List<ValueInterval<T, I, V>> (my generic suggestion)

The IntervalSeries class can hold a private state member of type List<ValueInterval<T, I, V>>. Then we can go further and introduce special subtypes of the abstract IntervalSeries, like DateIntervalSeries etc., which help to strongly reduce the number of generic type parameters (from 3 to 1).

About the features of such series classes: search operations would probably be based on binary search through the internal interval list, and update operations might replace existing value intervals with new value intervals (that is, set a new value for a given time interval while avoiding overlaps). Most of this functionality can be implemented in the super class IntervalSeries, but factory methods have to be placed in the concrete classes like DateIntervalSeries. By the way, it may be wise not to use upper generic bounds in the type parameters of IntervalSeries but to use such bounds in the factory methods (this allows a more flexible design, prepared even for things like a DoubleInterval holding a double primitive).
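The features described above (binary search for lookups, updates that replace existing value intervals without creating overlaps, immutability) might look roughly like the following sketch. It is not the proposed IntervalSeries<T, I, V> itself: the class SeriesSketch and its members are hypothetical, it is fixed to closed java.time.LocalDate ranges (comparable to a DateIntervalSeries with a single type parameter), and trimming overlapped entries is only one possible update policy.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical immutable series of non-overlapping value intervals (closed date ranges).
final class SeriesSketch<V> {
    static final class Entry<V> {
        final LocalDate start;
        final LocalDate end;
        final V value;

        Entry(LocalDate start, LocalDate end, V value) {
            if (end.isBefore(start)) {
                throw new IllegalArgumentException("End before start.");
            }
            this.start = start;
            this.end = end;
            this.value = value;
        }
    }

    private final List<Entry<V>> entries; // sorted by start, pairwise disjoint

    private SeriesSketch(List<Entry<V>> entries) {
        this.entries = Collections.unmodifiableList(entries);
    }

    static <V> SeriesSketch<V> empty() {
        return new SeriesSketch<>(new ArrayList<>());
    }

    // Returns a NEW series in which [start, end] carries the given value.
    // Existing entries are trimmed or dropped so that nothing overlaps.
    SeriesSketch<V> with(LocalDate start, LocalDate end, V value) {
        Entry<V> added = new Entry<>(start, end, value);
        List<Entry<V>> result = new ArrayList<>();
        boolean inserted = false;
        for (Entry<V> e : entries) {
            if (e.end.isBefore(start)) {          // entirely before the new interval
                result.add(e);
            } else if (e.start.isAfter(end)) {    // entirely after the new interval
                if (!inserted) { result.add(added); inserted = true; }
                result.add(e);
            } else {                              // overlap: keep only the remainders
                if (e.start.isBefore(start)) {
                    result.add(new Entry<>(e.start, start.minusDays(1), e.value));
                }
                if (!inserted) { result.add(added); inserted = true; }
                if (e.end.isAfter(end)) {
                    result.add(new Entry<>(end.plusDays(1), e.end, e.value));
                }
            }
        }
        if (!inserted) { result.add(added); }
        return new SeriesSketch<>(result);
    }

    // Binary search for the entry containing the date; empty if the date falls into a gap.
    Optional<V> valueAt(LocalDate date) {
        int low = 0;
        int high = entries.size() - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            Entry<V> e = entries.get(mid);
            if (date.isBefore(e.start)) {
                high = mid - 1;
            } else if (date.isAfter(e.end)) {
                low = mid + 1;
            } else {
                return Optional.ofNullable(e.value);
            }
        }
        return Optional.empty();
    }

    int size() {
        return entries.size();
    }
}
```

For example, setting a value for the whole year 2017 and then a different value for June 2017 would leave three entries: January to May, June, and July to December.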

For comparison, we can also look at APIs like JFreeChart. That API does not do much more than described here and unfortunately uses mutable classes while I would strongly prefer immutable classes (as done throughout Time4J).

@charlesxucheng
Author

I agree with what you suggest in general.

I am looking at how to build IntervalSeries and its subclasses, and I have a question: why does ChronoInterval have subclasses like DateInterval, while ValueInterval has no subclasses like DateValueInterval? If it did, one could use DateValueInterval directly when implementing DateIntervalSeries.

@MenoData
Owner

MenoData commented Sep 17, 2017

About your question why ValueInterval has no subclasses:

I just wanted to limit the number of interval types, and the class ValueInterval can be applied in many contexts; see how to construct it via the withValue(value) methods in the classes IsoInterval and FixedCalendarInterval (hence in all subclasses too, like DateInterval, MomentInterval, CalendarMonth etc.). The API of a time library needs to be flexible.

But of course, you can create your own value intervals implementing the interface ChronoInterval and use them in IntervalCollection, IntervalTree (and in a future class like IntervalSeries), simply because a ValueInterval also implements the interface ChronoInterval. For your own implementations, you might look at the internal details of class ValueInterval. By the way, I am also thinking about a special value interval class combining an interval with a double primitive...

About interval series versus date series:

After doing some research about time series concepts in general, I would like to ask whether your value intervals might have gaps or not.

The background of the question is: if gaps can never occur in your model, then it is also possible to realize a DateSeries class which only stores the starting date of each interval (together with the associated value) and creates value intervals on the fly (with the next starting date as the exclusive end date).
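A minimal sketch of that gap-free idea, assuming plain java.time and a sorted map keyed by start date: the class DateSeriesSketch and its methods are hypothetical names, not an existing or planned Time4J API.

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical gap-free series: only start dates and values are stored.
// The end of each interval is implied by the next start date (exclusive).
final class DateSeriesSketch<V> {
    private final TreeMap<LocalDate, V> starts;

    private DateSeriesSketch(TreeMap<LocalDate, V> starts) {
        this.starts = starts;
    }

    static <V> DateSeriesSketch<V> empty() {
        return new DateSeriesSketch<>(new TreeMap<>());
    }

    // Returns a NEW series in which the value applies from the given start date
    // until the next stored start date (or indefinitely if there is none).
    DateSeriesSketch<V> with(LocalDate start, V value) {
        TreeMap<LocalDate, V> copy = new TreeMap<>(starts);
        copy.put(start, value);
        return new DateSeriesSketch<>(copy);
    }

    // Value effective on the given date; empty before the first start date.
    Optional<V> valueAt(LocalDate date) {
        Map.Entry<LocalDate, V> e = starts.floorEntry(date);
        return (e == null) ? Optional.empty() : Optional.ofNullable(e.getValue());
    }

    // Exclusive end of the interval containing the date, created on the fly;
    // empty if the date lies before the first start or in the open-ended last interval.
    Optional<LocalDate> exclusiveEndAt(LocalDate date) {
        if (starts.floorKey(date) == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(starts.higherKey(date));
    }
}
```

The trade-off is that such a structure cannot express "no value defined" for a period after the first start date, which is why the question about gaps matters.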
