Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added publicID to station #3194

Open
wants to merge 18 commits into
base: master
Choose a base branch
from
Open

Added publicID to station #3194

wants to merge 18 commits into from

Conversation

s-schneider
Copy link
Member

@s-schneider s-schneider commented Oct 27, 2022

What does this PR do?

Adds public_id to station objects, if station is read from inventory files from seiscomp.

Why was it initiated? Any relevant Issues?

see #3193

PR Checklist

  • Correct base branch selected? master for new features, maintenance_... for bug fixes
  • This PR is not directly related to an existing issue (which has no PR yet).
  • [ x If the PR is making changes to documentation, docs pages can be built automatically.
    Just add the "build_docs" tag to this PR.
    Docs will be served at docs.obspy.org/pr/{branch_name} (do not use master branch).
    Please post a link to the relevant piece of documentation.
  • If all tests including network modules (e.g. clients.fdsn) should be tested for the PR,
    just add the "test_network" tag to this PR.
  • All tests still pass.
  • Any new features or fixed regressions are covered via new tests.
  • Any new or changed features are fully documented.
  • Significant changes have been added to CHANGELOG.txt .
  • First time contributors have added your name to CONTRIBUTORS.txt .
  • [ x If the changes affect any plotting functions you have checked that the plots
    from all the CI builds look correct. Add the "upload_plots" tag so that plotting
    outputs are attached as artifacts.
  • Add the "ready for review" tag when you are ready for the PR to be reviewed.

@s-schneider s-schneider added ready for review PRs that are ready to be reviewed to get marked ready to merge and removed ready for review PRs that are ready to be reviewed to get marked ready to merge labels Oct 27, 2022
@d-chambers
Copy link
Member

Alright @s-schneider, the CI is working again so the failures you see here are probably related to your changes. Take a look and see if you can figure out how to fix them.

@megies
Copy link
Member

megies commented Nov 3, 2022

Hmm.. our inventory type classes are kept really close to the StationXML schema. Anything that diverges from it could potentially be trouble in the future, when we might have to adjust to changes in StationXML, or just be confusing to people because there are things that are not in StationXML or are missing when writing to StationXML. So, I'm not sure how I feel about this one.. It's quite a harmless looking change obviously, but it does have the above implications..

@megies
Copy link
Member

megies commented Nov 3, 2022

Would it be an option to keep this SCML specific piece of information in the extra part of the station objects maybe?

@s-schneider
Copy link
Member Author

Hm if we can add that field to the select method of inventories and networks, that would be a possible solution

@megies
Copy link
Member

megies commented Nov 3, 2022

Hm if we can add that field to the select method of inventories and networks, that would be a possible solution

I think I would prefer that, and I don't see a big problem having it in select() if the docstring for it makes clear that it is strictly a SCML thing and wont do anything in most cases.

@s-schneider
Copy link
Member Author

Should be no problem, I’ll change that and push an update

@s-schneider s-schneider removed the ready for review PRs that are ready to be reviewed to get marked ready to merge label Nov 22, 2022
@s-schneider
Copy link
Member Author

EB_response_NEW_stationXML.zip
I actually just experienced something weird while trying to fix the tests. If I take that example file, read it and write it again, it it adds a description to the response stage input units

  <Stage number="1">
    <PolesZeros name="EBR.2002.091.HE" resourceId="ResponsePAZ#20121207153142.208772.15386">
      <InputUnits>
        <Name>M/S</Name>
        <Description>None</Description>
      </InputUnits>

vs

  <Stage number="1">
    <PolesZeros name="EBR.2002.091.HE" resourceId="ResponsePAZ#20121207153142.208772.15386">
      <InputUnits>
        <Name>M/S</Name>
      </InputUnits>

The code I used:

from obspy import read_inventory

file = "EB_response_NEW_stationXML"
newfile = "EB_response_NEW2_stationXML"
stationxml_inventory = read_inventory(file, format="STATIONXML")
nsmap = {
    'scml': 'seiscompml', 
    'resp': 'http://geofon.gfz-potsdam.de/ns/seiscomp3-schema/0.7'
}
stationxml_inventory.write(newfile, format="STATIONXML", nsmap=nsmap)

@megies
Copy link
Member

megies commented Dec 1, 2022

@s-schneider I've rebased this branch on current master and force pushed, to avoid merge conflicts. Can you continue work on this version? Or let me know if that's a problem for you.

@@ -639,6 +640,8 @@ def select(self, network=None, station=None, location=None, channel=None,
:param maxradius: Only include stations/channels within the specified
maximum number of degrees from the geographic point defined by the
latitude and longitude parameters.
:type public_id: str, optional
:param public_id: Only include stations matching with extra.public_id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add to the docstring that this mainly is for data read from sc3ml files, so it's clear what this is about here

@@ -457,6 +458,8 @@ def select(self, station=None, location=None, channel=None, time=None,
will be included in the result. This flag has no effect for
initially empty stations which will always be retained if they
are matched by the other parameters.
:type public_id: str, optional
:param public_id: Only include stations matching with extra.public_id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

else:
continue
else:
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This confused me at first, although this might be doing what it's supposed to do, I would rewrite this, to make it less nested and less confusing.

Maybe something like..

if public_id is not None:
    try:
        sta_public_id = sta.extra.public_id
    except AttributeError:
        # deselect stations that do not have this attribute
        continue
    else:
        if not fnmatch.fnmatch(sta_public_id.upper(), public_id.upper()):
            continue

@@ -408,6 +408,9 @@ def test_inventory_select(self):
assert sum_stations(inv.select(station="RJOB")) == 9
assert sum_stations(inv.select(station="R?O*")) == 9

# Only Stations with public_id
assert sum_stations(inv.select(public_id="TEST")) == 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There needs to be a test added that is actually filtering out some stations but keeping a desired station. Right now only deselecting everything on arbitrary public_id="..." input is checked.

@@ -179,6 +179,10 @@ def test_network_select(self):

# No matching station.
assert sum(len(i) for i in net.select(station="RR")) == 0

# Only stations with public_id
assert sum(len(i) for i in net.select(public_id="TEST")) == 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above, test needs improvement

@@ -37,7 +38,8 @@

SOFTWARE_MODULE = "ObsPy %s" % obspy.__version__
SOFTWARE_URI = "http://www.obspy.org"
SCHEMA_VERSION = ['0.6', '0.7', '0.8', '0.9', '0.10', '0.11', '0.12']
SCHEMA_VERSION = "0.11"
READABLE_VERSIONS = ['0.6', '0.7', '0.8', '0.9', '0.10', '0.11', '0.12']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change confused me at first, i didn't see any reason for it.. until i looked at individual commits. So this is a relic from some early commits adding WIP on writing sc3ml which later got partially reverted. Ideally these commits should be removed from history, they would just add confusion to the history, I think, considering they never were a working version but just unfinished WIP.

In any case, this change here needs to be reverted too. It's not needed for what this PR is intending.

@@ -92,7 +94,7 @@ def _read_sc3ml(path_or_file_object, **kwargs):
root = etree.parse(path_or_file_object).getroot()

# Code can be used for version 0.6 to 0.12 (Seiscomp 4.x)
for version in SCHEMA_VERSION:
for version in READABLE_VERSIONS:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above

'value': public_id,
'namespace': ns
}
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few things that should be changed here I think.

use namespaces correctly

Namespace should be set to the actual namespace value of the attribute publicID that is being parsed and not quite arbitrarily "seiscmopml". There is an easy way that might be doing the right thing..

namespace = sta_element.nsmap[None]

... but this is working on assumptions how namespaces are set in sc3ml and might break later on.
So to correctly determine the namespace of this attribute, you should check the attribute for a namespace first and if it doesn't have one it should be OK to use the default namespace of it's element

match = re.match(r'{([^}]*)}.*', sta_element.tag)
if match:
    namespace = match.group(1)
else:
    namespace = sta_element.nsmap[None]

I haven't messed around with details of lxml and xml namespaces in a while, but it seems that the default namespace is set on the station element, even though the whole document is operating on the implicit default namespace declared on the root element, so it should be OK to assume we can use that.

Actually, as a last thought.. when accessing public_id = sta_element.get("publicID") that is already assuming that publicID has the default namespace of the station item, I believe, so maybe just using station default namespace is fine..

naming of extra item

I would probably keep the name publicID as the name of the item in extra. Even though that means internally you would look at station.extra.publicID which isn't in line with our Python object naming scheme.. this a) makes it clear that this is not coming from Python land and b) if writing an inventory read from sc3ml as stationxml, your custom attribute would kind of properly show up in the xml file as something like..

<FDSNStationXML xmlns="http://www.fdsn.org/xml/station/1" schemaVersion="1.2">
  <Source>scxml import</Source>
  <Sender>ObsPy Inventory</Sender>
  <Module/>
  <ModuleURI/>
  <Created>2022-12-01T12:14:43.872676Z</Created>
  <Network code="EB" startDate="1980-01-01T00:00:00.000000Z" restrictedStatus="open">
    <Description>SINGLE STATION</Description>
    <Station xmlns:ns0="seiscompml" code="EBR" startDate=..... ns0:public_id="Station/EB/EBR/2002-04-01T00:00:00.0000Z">
      <Latitude unit="DEGREES">40.820599</Latitude>
      <Longitude unit="DEGREES">0.4933</Longitude>
...

should set type as attribute

Then it would come out as an attribute on StationXML output as opposed to an element.

self.sc3ml_inventory.write(sc3ml_bytes, "STATIONXML")
self.sc3ml_inventory.write(
sc3ml_bytes, "STATIONXML",
nsmap={'my_ns': 'seiscompml'}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should use a proper sc3ml namespace URI

if hasattr(sta.extra, 'publicID'):
if not fnmatch.fnmatch(
sta.extra['publicID'].value.upper(),
public_id.upper()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if publicID in sc3ml is case sensitive, but personally, I'd probably leave the upper() away. I know it's done like this in some other comparisons, and I'm not sure if I agree with that but it is what it is and thats more coming from the SEED perspective where everything was supposed to be all uppercase characters only anyway.

@megies
Copy link
Member

megies commented Dec 1, 2022

I pushed some of the adjustments, if you wanna work on the rest?

@megies megies added this to the 1.5.0 milestone Dec 1, 2022
@megies
Copy link
Member

megies commented Dec 1, 2022

CC @filefolder since you worked on sc3ml stuff recently

@filefolder
Copy link
Contributor

CC @filefolder since you worked on sc3ml stuff recently

sure, but still out digging holes for another week

@megies
Copy link
Member

megies commented May 22, 2024

trying to wrap this up for the release @s-schneider @filefolder, maybe you can have a look, I think there was some last things to do

There were some merge conflicts, so need to double-check, most notably 0.13 read support was added in the meantime, so that needs checking if that changes anything

@filefolder
Copy link
Contributor

I think the example SCML probably needs to be rinsed through to a newer version, plus you can't specify file storage type (eg Steim2) anymore I think.

While I'm pretty sure reading in event data works fine for 0.13, I never really tested reading inventory files. Currently seeing "0.13 not supported" errors in 1.4.1 but it's a but too late for me to investigate. Attached is an example in the meantime
channel_level.0.13.scml.txt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants