Issue statement
Trigger is limited to a single core, despite being suitable for multi-processing. The current workaround is to split the time window of interest into batches in an external script. However, the files written by each trigger run then overwrite any existing files.
Proposition
Build the batching into Trigger itself. This also requires preventing triggered event files from being overwritten, which will be handled by adding the Julian day to the triggered events filename.
Primary issue
Bake batching process into Trigger.trigger()
Have TriggeredEvent.csv files write with the relevant Julian day
Ensure the process of reading these TriggeredEvent files for locate can handle the separate Julian days
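The batching and file naming described above could look roughly like the sketch below. It is not the QuakeMigrate implementation: the function names, the `run_name` argument and the filename pattern are all hypothetical, chosen only to show how a time window splits into Julian-day batches with partial days on either end.

```python
from datetime import datetime, timedelta


def julian_day_batches(starttime, endtime):
    """Split [starttime, endtime) into batches that end on midnight boundaries."""
    batches = []
    batch_start = starttime
    while batch_start < endtime:
        # Midnight at the start of the following calendar day.
        next_midnight = datetime(
            batch_start.year, batch_start.month, batch_start.day
        ) + timedelta(days=1)
        # The last batch may be a partial day.
        batch_end = min(next_midnight, endtime)
        batches.append((batch_start, batch_end))
        batch_start = batch_end
    return batches


def triggered_events_filename(run_name, batch_start):
    """Hypothetical naming scheme: <run>_<year>_<Julian day>_TriggeredEvents.csv."""
    return f"{run_name}_{batch_start:%Y}_{batch_start:%j}_TriggeredEvents.csv"


batches = julian_day_batches(
    datetime(2020, 2, 28, 18, 0), datetime(2020, 3, 1, 6, 0)
)
filenames = [triggered_events_filename("example_run", start) for start, _ in batches]
```

Because every batch gets its own year and Julian day in the filename, two batches from the same run can no longer write to the same file.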
Future
Multi-process the batched trigger
Allow for multi-processing arbitrary periods of time by breaking into batches with non-Julian day lengths.
Additional tasks
Break the _trigger() method into two stages
_trigger_candidates() - Identify all of the instances of the (normalised) coalescence exceeding the chosen detection threshold
_refine_candidates() - Merge events for which the marginal windows overlap with the minimum inter-event time.
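As a rough illustration of the two stages, the sketch below uses plain Python lists. The data layout (dicts with `start`, `end`, `peak_time`, `peak_value`) and the exact merge rule are assumptions made for the example; the real methods may pad and compare windows differently.

```python
def identify_candidates(times, coalescence, threshold):
    """Return one candidate per contiguous run of samples above the threshold."""
    candidates = []
    current = None
    for t, value in zip(times, coalescence):
        if value > threshold:
            if current is None:
                # Start of a new run above the threshold.
                current = {"start": t, "end": t, "peak_time": t, "peak_value": value}
            else:
                current["end"] = t
                if value > current["peak_value"]:
                    current["peak_time"], current["peak_value"] = t, value
        elif current is not None:
            # The run has ended; store it.
            candidates.append(current)
            current = None
    if current is not None:
        candidates.append(current)
    return candidates


def refine_candidates(candidates, marginal_window, min_event_interval):
    """Merge candidates whose padded windows are closer than the minimum interval.

    Assumed rule: each candidate is padded by marginal_window on both sides,
    and consecutive candidates are merged when the gap between their padded
    windows is smaller than min_event_interval. The higher peak is kept.
    """
    events = []
    for cand in candidates:
        padded = dict(
            cand,
            start=cand["start"] - marginal_window,
            end=cand["end"] + marginal_window,
        )
        if events and padded["start"] - events[-1]["end"] < min_event_interval:
            previous = events[-1]
            previous["end"] = max(previous["end"], padded["end"])
            if padded["peak_value"] > previous["peak_value"]:
                previous["peak_time"] = padded["peak_time"]
                previous["peak_value"] = padded["peak_value"]
        else:
            events.append(padded)
    return events


# Synthetic trace: two nearby peaks and one isolated peak.
times = list(range(20))
coalescence = [1.0] * 20
coalescence[2], coalescence[3], coalescence[6], coalescence[16] = 3.0, 4.0, 5.0, 6.0

candidates = identify_candidates(times, coalescence, threshold=2.5)
events = refine_candidates(candidates, marginal_window=1, min_event_interval=2)
```

In this synthetic trace the first two candidates are merged into a single event and the third remains separate.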
Result
Multi-processed trigger code that produces results faster, and a clearer codebase.
Reach
Trigger files generated using the development branch prior to this change will no longer be compatible with locate(starttime, endtime). However, it is still possible to locate the events in this file using locate(trigger_file).
Split the internal trigger method into a series of methods that each
capture a specific stage. In doing so, also update some of the
implementation.
_trigger_events() has been split into:
_get_threshold() - determine an array of values to use as a threshold
_identify_candidates() - find distinct periods of time for which the maximum (normalised) coalescence trace exceeds the chosen threshold
_refine_candidates() - merge candidate events for which the marginal windows overlap with the minimum inter-event time
_filter_events() - remove events within the padding time and/or within a specific geographical region
In both the _identify_candidates() and _refine_candidates() methods, the
pandas.DataFrame.groupby() method has been used to remove the confusing
index twiddling.
This partially addresses some of the tasks in Issue #67.
The Trigger.trigger() method now internally batches the specified time
period into Julian days (with possible partial days on either end).
Consequently, the names of files output by trigger also include the year and
Julian day.
Triggered event files for locate between two timestamps are now read
by looping over the Julian days.
This addresses the primary issue in Issue #67
Trigger files generated prior to this change will no longer be compatible
with locate between two timestamps. However, they can still be used in
locate using locate(trigger_file="path/to/old_trigger_file").
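Reading the per-day triggered event files for a locate run between two timestamps might be sketched as follows. The function name, the filename pattern and the `CoaTime` column (assumed to hold an ISO-format timestamp) are hypothetical stand-ins, not the actual QuakeMigrate I/O code.

```python
import csv
from datetime import datetime, timedelta
from pathlib import Path


def read_triggered_events(directory, run_name, starttime, endtime):
    """Collect rows from every per-day file overlapping [starttime, endtime]."""
    rows = []
    day = starttime.date()
    while day <= endtime.date():
        # Hypothetical naming scheme: <run>_<year>_<Julian day>_TriggeredEvents.csv
        path = Path(directory) / f"{run_name}_{day:%Y}_{day:%j}_TriggeredEvents.csv"
        # Days without a file are skipped rather than treated as an error.
        if path.is_file():
            with path.open(newline="") as f:
                for row in csv.DictReader(f):
                    # "CoaTime" is an assumed column name.
                    event_time = datetime.fromisoformat(row["CoaTime"])
                    # Trim events from the partial days at either end.
                    if starttime <= event_time <= endtime:
                        rows.append(row)
        day += timedelta(days=1)
    return rows
```

A file written before the change would not match the per-day pattern, which is why such files have to be passed explicitly via trigger_file instead.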