Allow deploying a Datahub instance (GeoNetwork-UI) from the administration interface #8021

Open
7 tasks
jahow opened this issue May 7, 2024 · 4 comments

@jahow
Contributor

jahow commented May 7, 2024

This proposal aims at integrating the Datahub application provided by GeoNetwork-UI into all standard deployments of GeoNetwork.

The integration will be done as follows:

Administration interface

Main catalog

A new option will be added in the Administration > Settings > System settings page. This option will be composed of:

  • a checkbox to enable the Datahub interface for the main catalog
  • a button opening the Theme Editor (see below) in an overlay

Image

Subportals

An "enable Datahub" option similar to the one above will be added for each source in the Administration > Settings > Sources page.

Theme Editor

Both options described above will offer a "theme editor" button. The Datahub Theme Editor will be a separate application provided by GeoNetwork-UI which will allow users to edit in real-time a configuration file for the datahub (for instance URLs, theme colors, fonts, map options etc.).

Image

The Theme Editor application will be packaged as a web component and, when clicking "Save", the resulting configuration will simply be transferred into a hidden text field in the settings form.

Using the Theme Editor will be optional; if left untouched, the default configuration will be used.

Where are configurations stored?

The main catalog Datahub configuration will be stored alongside other system settings as a large text field.

Configurations for subportals will be stored in the sources table as a large text field.
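
As a rough illustration of how a stored configuration could be read back, here is a minimal sketch assuming the text field may be null or empty when the Theme Editor was never used, in which case the default configuration applies. The class name, method name and the `DEFAULT_CONFIG` placeholder are hypothetical; nothing here reflects an existing GeoNetwork API.

```java
import java.util.Optional;

public final class DatahubConfigResolver {
    // Hypothetical placeholder standing in for the default configuration shipped with the Datahub.
    public static final String DEFAULT_CONFIG = "# default Datahub configuration (placeholder)";

    /**
     * Returns the configuration text stored in the settings or sources table,
     * or the default configuration when nothing was saved (null or whitespace only).
     */
    public static String effectiveConfig(String storedText) {
        return Optional.ofNullable(storedText)
                .filter(text -> !text.trim().isEmpty())
                .orElse(DEFAULT_CONFIG);
    }
}
```

The same fallback would apply to both the main catalog field and the per-source field, so a single helper of this kind could serve both cases.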

How will the Datahub be deployed alongside GeoNetwork?

The Datahub application, once compiled, is a collection of static files (HTML, JS, CSS) that can be served as is. These files (about 2 MB in total) will be bundled inside the GeoNetwork WAR package.

Bundling the application

A Maven task will be added to:

  1. clone the geonetwork-ui repository at a specific version
  2. build the Datahub application
  3. copy the resulting files into the static resources of the WAR package

The version of GeoNetwork-UI used will be set in the Maven properties. Most likely it will be bumped alongside GeoNetwork versions. We do not expect significant breaking changes to the configuration format, but if that happens, the Theme Editor can probably assist users in migrating their configurations.

Serving the application

Because the Datahub application can be accessed in several ways, a Java service will have to be developed and will handle incoming requests to the Datahub.

Main catalog Datahub

Once enabled, the main Datahub will be accessible on /geonetwork/srv/datahub. The configuration defined in the System settings will be used.

Subportals Datahub

Once enabled, a subportal Datahub can be accessed with /geonetwork/subportal-name/datahub. The configuration defined in the subportal settings will be used.
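
To make the routing idea more concrete, here is a minimal sketch of how such a service could map a request path to the configuration it should use. It assumes the context path is `geonetwork` and that `srv` identifies the main catalog, as in the URLs above. The class, the method names and the returned labels (`system-settings`, `source:<name>`) are hypothetical, and whether Datahub is enabled for a given portal is not checked here.

```java
import java.util.Optional;

public final class DatahubRouteResolver {
    // Assumptions taken from the URLs in this proposal.
    static final String CONTEXT = "geonetwork";
    static final String MAIN_PORTAL = "srv";
    static final String APP_SEGMENT = "datahub";

    /**
     * Returns the portal key ("srv" or a subportal name) for a Datahub request path,
     * or empty when the path does not point to a Datahub.
     */
    public static Optional<String> portalFor(String requestPath) {
        // "/geonetwork/srv/datahub" splits into ["", "geonetwork", "srv", "datahub"]
        String[] parts = requestPath.split("/");
        boolean matches = parts.length >= 4
                && parts[0].isEmpty()
                && CONTEXT.equals(parts[1])
                && !parts[2].isEmpty()
                && APP_SEGMENT.equals(parts[3]);
        return matches ? Optional.of(parts[2]) : Optional.empty();
    }

    /** Tells where the configuration would be read from (hypothetical labels). */
    public static String configSourceFor(String portal) {
        return MAIN_PORTAL.equals(portal) ? "system-settings" : "source:" + portal;
    }
}
```

Paths below the Datahub root (static assets, for instance) resolve to the same portal, so the service could serve the bundled files and inject the matching configuration in one place.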

Impacts

This proposal is expected to have many positive impacts:

  • Offer access to a simpler/alternative UI without any cost for the administrator (just a simple click)
  • Increase awareness of the GeoNetwork-UI project, hopefully making it more useful to the GeoNetwork community by revealing opportunities
  • Augment the subportal system with more customization possibilities for each subportal
  • Let administrators include their subportals as web components in third party websites (since this is part of GeoNetwork-UI capabilities)

The technical impacts on the GeoNetwork project are:

  • Some database fields are added, with no change to the data model, integrity constraints, etc.
  • The WAR size is expected to increase by 2-3 MB
  • The WAR build time is expected to increase by roughly 2.5 minutes (at the time of writing, the build time of the Datahub is approx. 130 s); this will not happen in dev or in the CI
  • A new Java service, mostly doing routing, will be implemented

Voting

PSC Support:

  • Jo Cook
  • Jose Garcia
  • Paul van Genuchten
  • Florent Gravin
  • Simon Pigot
  • Francois Prunayre
  • Jeroen Ticheler
@ticheler
Member

ticheler commented May 9, 2024

Hi @jahow,
Thanks for a great proposal and for your hard work on this!

I would very much appreciate organising a session with the PSC where you demonstrate a working prototype of such integration for everyone to have a better understanding. Could you plan for that?

Another thing I feel is important to talk about at this stage is building on and contributing to the GN-UI project. The mono repository could be a concern in that respect, but I am also not sure what the current status of that is. Maybe that requirement has already been mitigated?

Cheers! Jeroen

@jahow
Contributor Author

jahow commented May 9, 2024

Thanks for the feedback @ticheler 🙂 yes, showing a working prototype sounds like an excellent idea. We can probably organize that for the next PSC meeting if that's ok?

I'm not certain I see what you mean by concerns regarding the monorepo. I remember we had discussions about the project complexity being an obstacle to contributing more, is that it?

Looking forward to seeing that come to life!

@edevosc2c

edevosc2c commented May 13, 2024

Hello,
I'm writing this message about the technical side of these proposals.

First of all, I have actually deployed "subportal" datahubs for Geocat.ch in the past. In simple terms, I deployed multiple containers, each linked to a different datahub configuration.
Each datahub was accessible from a different "subpath": /datahub/thurgau, /datahub/viageo.
My main problem was that the docker image wasn't made for this kind of use case, I had to use workarounds for it to work.

But it wasn't too far from a proper implementation of this feature. It might need some tweaks to be properly production ready.

About "main catalog" feature

My main concern is that if we go in the direction of including datahub in the geonetwork "program", this becomes yet another component integrated into geonetwork.
This goes against today's standard practice, where we try to build microservices in order to improve the scalability of our programs and make them more resilient to failures.

Actually, in geOrchestra, we have a hard time integrating geonetwork in this microservice architecture. It's not possible to have multiple instances of geonetwork for redundancy, and it's hard to move geonetwork across multiple servers in case one server fails.

That's why we, at geOrchestra, liked the ability to have datahub as a separate program. It's a "stateless" application, so it is much easier to manage.

The perks are:

  • Having multiple instances of datahub for redundancy.
  • Being able to move datahub easily to other servers.
  • Its configuration is static (not managed by the user from a web interface), so we can rely on just an external Git repository for the deployment.

I'm fine with optionally being able to deploy datahub from geonetwork, but I would like to request to still have the ability to deploy it separately.

About "subportals" feature

Like I said in my first paragraph, technically deploying subportals of datahub is possible today. It's just that it's cumbersome and "hacky".

I would be interested in improving the current Docker image for datahub in order to properly deploy separate datahubs that act as multiple subportals. Obviously, only the system administrator would be able to deploy subportals in this case.

I'm not fond of letting users deploy their own subportals themselves because it creates additional architectural difficulties: unknown possible CORS issues (if usage external geonetwork), not being in control of what kind of datahub is created and having to persist in the filesystem the user configuration of all the different subportals configuration.
But this is closely related to my not liking the idea of having datahub inside geonetwork.

About "Theme Editor" feature

Same issue as having datahub packaged into geonetwork: this component will have difficulty scaling and will have to be tied to the same "filesystem" as geonetwork.

Final notes about all the proposals

For normal users, local installations or low-traffic instances of geonetwork, the points I raised above are non-issues. These users are most likely deploying geonetwork on one server, and it's perfectly ok to have these features in order to improve their workflow.

But I wanted to give my opinion for the people that deploy it on platforms that receive a lot of traffic and need to be "always available".

@jahow
Contributor Author

jahow commented May 16, 2024

Thanks @edevosc2c for the shared knowledge.

My main problem was that the docker image wasn't made for this kind of use case, I had to use workarounds for it to work.

Let's keep in mind that we're not specifically talking here about a docker context. Actually one of the motivations of this proposal is to let people using GeoNetwork as a standalone WAR also benefit from GeoNetwork-UI.

I'm fine with optionally being able to deploy datahub from geonetwork, but I would like to request to still have the ability to deploy it separately.

This proposal will probably change almost nothing on GeoNetwork-UI side. Maybe a few adaptations to make deployment more flexible if necessary, but that's it.

unknown possible CORS issues (if usage external geonetwork), not being in control of what kind of datahub is created and having to persist in the filesystem the user configuration of all the different subportals configuration.

There should be no CORS concerns here since the Datahub will be accessed on the same host as GeoNetwork (e.g. http://localhost:8080/geonetwork/srv/datahub). As for configurations, they will be modifiable by hand by the user, but the theme editor should offer some kind of fool-proofing to make sure that configurations are still valid.
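
To illustrate the same-host point, here is a small sketch of a same-origin check (scheme, host and port must all match), which is the condition under which a browser does not apply CORS restrictions. The class and method names are hypothetical, and the check assumes absolute http or https URLs.

```java
import java.net.URI;

public final class SameOriginCheck {
    /** Returns the explicit port, or the default one for the scheme when none is given. */
    static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    /** Two URLs share an origin when scheme, host and port are all identical. */
    public static boolean sameOrigin(String first, String second) {
        URI a = URI.create(first);
        URI b = URI.create(second);
        if (a.getScheme() == null || b.getScheme() == null
                || a.getHost() == null || b.getHost() == null) {
            return false; // relative or malformed URLs are out of scope for this sketch
        }
        return a.getScheme().equalsIgnoreCase(b.getScheme())
                && a.getHost().equalsIgnoreCase(b.getHost())
                && effectivePort(a) == effectivePort(b);
    }
}
```

With the example URL above, `sameOrigin("http://localhost:8080/geonetwork/srv/datahub", "http://localhost:8080/geonetwork")` holds, whereas a Datahub served from another host or port would not pass.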

Projects
Status: Proposal

3 participants