I was thinking a bit more about QuakeML and StationXML. While they should not be required, I think it would be helpful to allow them to be provided optionally. My idea is to allow optional fields in the CSV, "stationxml" and "QuakeML". Similarly to the waveform label, each would point either to a corresponding data structure within the HDF5 file or to a separate file (I don't know which is better; I consider this an implementation detail and am happy to delegate the decision). In this way, seismologically proficient users of the library can access the richer information contained in the XML files, e.g. for doing their own instrument removal, while people coming from, e.g., the machine learning side can simply ignore these fields and work with the time series as provided.

Of course, there could be some issues, such as inconsistencies between the CSV fields and the QuakeML (or StationXML) file, but I don't think they would matter fundamentally as long as the procedure is described clearly. It might even be a feature in some cases, as the QuakeML might, for example, refer to a different catalogue. These files would not exist for all benchmark data sets, of course, but StationXML is likely to be not too difficult to generate for many of them, and the responsibility would lie with the 'dataloader' routine that converts to the internal SeisBench format.
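To make the proposal a bit more concrete, here is a minimal sketch of how the optional fields could be read from the metadata CSV. Everything in it is an assumption for illustration only: the lowercase column names `stationxml` and `quakeml`, the `trace_name` column, the example paths and the helper `xml_references` are hypothetical and not part of any existing SeisBench format. The sketch deliberately does not decide whether a reference points into the HDF5 file or to a separate file; it only shows that missing or empty fields can be skipped.

```python
import csv
import io

# Hypothetical column names for the optional references (assumption,
# not an existing SeisBench convention).
OPTIONAL_XML_COLUMNS = ("stationxml", "quakeml")


def xml_references(row):
    """Return the optional XML references present in one metadata row.

    A missing column and an empty cell are both treated as "not provided",
    so users who do not need the XML files can ignore these fields.
    """
    refs = {}
    for column in OPTIONAL_XML_COLUMNS:
        value = (row.get(column) or "").strip()
        if value:
            refs[column] = value
    return refs


# Made-up metadata: the first trace carries both references, the second none.
EXAMPLE_CSV = """trace_name,stationxml,quakeml
trace_0,stationxml/XX.STA1.xml,quakeml/event_0.xml
trace_1,,
"""

rows = list(csv.DictReader(io.StringIO(EXAMPLE_CSV)))
references = [xml_references(row) for row in rows]
```

A seismologically proficient user could then pass a returned reference to a reader of their choice, while everyone else would never touch these columns.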
(migrated from a suggestion made in issue #12)