I can't set author for feeds from RSS-Bridge #982

Open
HolgerAusB opened this issue Jun 20, 2022 · 3 comments

@HolgerAusB
Collaborator

I host RSS-Bridge and Full-Text RSS (FTR) locally. I use FeedMergeBridge.php to merge 5 sub-feeds of the same site into a single feed, so FreshRSS can sort out duplicates. FTR sits between them and fetches the full articles. This works great!

However, FTR sets the author field dc:creator to the literal 'RSS-Bridge', and I can't figure out how to get rid of that. I tried to work out how to change it on the RSS-Bridge side, but that is beyond me.

So I created site configs in FTR both for the IP address of my RSS-Bridge instance and for the original site that the full-text content comes from.

Debug mode shows that FTR found a correct match for the author, but this match is not passed on to the outgoing feed. The match comes from the config for the original site. I added the following to the existing diepresse.com.txt:

author: //span[@class='author__name']
author: '-'

Some articles have no author, so the second line catches that case. I will change it to an empty string once I get around the problem. I uploaded both the feed from the bridge and the feed from FTR.

So how can I override the feed author field that comes from RSS-Bridge?

The examples:

Bridge and FTR are not exposed to the internet, so I can't provide a link, sorry.

@fivefilters
Owner

We should really document this aspect of Full-Text RSS. Currently the author name extracted from the article is only used when the input feed doesn’t contain author information. If the input feed has author info, it’s always prioritised. If you have control over the input feed and can remove the author info from it, then the extracted author should be used in the feed Full-Text RSS produces.
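To see which case applies to a given input feed, a quick check on a saved copy can help. The sketch below only greps for author-like elements; the helper name `feed_has_author` and the element names it looks for (`author`, `dc:creator`) are assumptions for illustration, not a statement of what Full-Text RSS inspects internally.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a saved feed file contains an author-like
# element. The element names checked here are an assumption.
feed_has_author() {
    grep -Eq '<(author|dc:creator)[ >]' "$1"
}

# Example use on a feed that was saved to input.xml beforehand:
# if feed_has_author input.xml; then
#     echo "input feed carries author info, so it takes priority"
# else
#     echo "no author info found, the extracted author should be used"
# fi
```

If the check succeeds, the behaviour described above means the author from the input feed wins over the extracted one.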

We’ll introduce a request parameter in a future version to override this behaviour, as we have with ‘use_extracted_title’, which you can use to tell Full-Text RSS that the extracted article title should replace any title in the source feed.

@HolgerAusB
Collaborator Author

@fivefilters Unfortunately, I can't control the author field in the input feed (the output from RSS-Bridge). See my question on their side.

Just to be clear: this is about the feed-level author set by RSS-Bridge, not the item authors.

@HolgerAusB
Collaborator Author

My workaround is a cron job that:

  • fetches the full-text feed via FTR with curl for each category feed and saves each one as a file in a directory that is reachable locally over HTTP via Apache2
  • merges those feeds via RSS-Bridge and writes the result to the same directory
  • renames the author element to something invalid
  • leaves the feed aggregator to regularly fetch that last, modified file

cd /var/www/secretpath/feeds

# get the feeds via FiveFilters' Full-Text RSS:
curl -s --connect-timeout 180 -m 180 -o derstandard-newsroom.rss   "https://local.example.com/ftr/makefulltextfeed.php?url=https%3A%2F%2Fwww.derstandard.at%2Frss&max=6&links=preserve&exc=1"
curl -s --connect-timeout 180 -m 180 -o derstandard-inland.rss     "https://local.example.com/ftr/makefulltextfeed.php?url=https%3A%2F%2Fwww.derstandard.at%2Frss%2Finland&max=10&links=preserve&exc=1"
curl -s --connect-timeout 180 -m 180 -o derstandard-web.rss        "https://local.example.com/ftr/makefulltextfeed.php?url=https%3A%2F%2Fwww.derstandard.at%2Frss%2Fweb&max=3&links=preserve&exc=1"
curl -s --connect-timeout 180 -m 180 -o derstandard-kultur.rss     "https://local.example.com/ftr/makefulltextfeed.php?url=https%3A%2F%2Fwww.derstandard.at%2Frss%2Fkultur&max=5&links=preserve&exc=1"
curl -s --connect-timeout 180 -m 180 -o derstandard-gesundheit.rss "https://local.example.com/ftr/makefulltextfeed.php?url=https%3A%2F%2Fwww.derstandard.at%2Frss%2Fgesundheit&max=2&links=preserve&exc=1"
curl -s --connect-timeout 180 -m 180 -o derstandard-lifestyle.rss  "https://local.example.com/ftr/makefulltextfeed.php?url=https%3A%2F%2Fwww.derstandard.at%2Frss%2Flifestyle&max=2&links=preserve&exc=1"

# combine the feeds via a preconfigured RSS-Bridge; the bridge configuration contains the addresses of the partial feeds from above.
curl -s --connect-timeout 180 -m 180 -o derstandard-ftr.rss "http://ip-address:3000/?action=display&bridge=StandardAT&format=Atom"
# rename every <author>...</author> element to <origin>...</origin> so the aggregator ignores it
sed -i 's/author>/origin>/g' derstandard-ftr.rss
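
Note that the sed line above renames every author element in the file, including any inside the entries. If only the feed-level author should go, a narrower variant might look like the sketch below. It assumes the Atom output has line breaks between elements and that the feed-level author appears before the first `<entry>`; the function name `rename_feed_author` is made up for this example. A single-line feed would need a real XML tool instead.

```shell
#!/bin/sh
# Hypothetical helper: rename <author> to <origin> only in the feed header,
# i.e. on lines before the first <entry> element.
rename_feed_author() {
    feed_file="$1"
    tmp_file="$(mktemp)" || return 1
    awk '
        /<entry/    { in_entries = 1 }               # first entry reached: stop rewriting
        !in_entries { gsub(/author>/, "origin>") }   # header lines only
                    { print }
    ' "$feed_file" > "$tmp_file" || { rm -f "$tmp_file"; return 1; }
    # Write back with cat so the published file keeps its permissions
    # (a file created by mktemp is readable by its owner only).
    cat "$tmp_file" > "$feed_file"
    rm -f "$tmp_file"
}

# In the cron script this would replace the sed line:
# rename_feed_author derstandard-ftr.rss
```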

Of course, having that override parameter would be more efficient.
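
Until then, one weakness of the script above is that curl writes straight into the published files, so a failed or empty download would overwrite a good copy. A possible hardening, sketched under the assumption that keeping the previous file is preferable to publishing a broken one, is to download to a temporary file first. The function name `fetch_feed` is hypothetical; `-f` makes curl return an error on HTTP failures.

```shell
#!/bin/sh
# Hypothetical helper: download a feed to a temp file and replace the
# published file only if curl succeeded and returned a non-empty body.
fetch_feed() {
    out_file="$1"
    url="$2"
    tmp_file="$(mktemp)" || return 1
    if curl -sf --connect-timeout 180 -m 180 -o "$tmp_file" "$url" && [ -s "$tmp_file" ]; then
        cat "$tmp_file" > "$out_file"   # keeps the permissions of out_file
        status=0
    else
        status=1                        # previous copy of out_file stays untouched
    fi
    rm -f "$tmp_file"
    return "$status"
}

# Example, using one of the URLs from the script above:
# fetch_feed derstandard-web.rss "https://local.example.com/ftr/makefulltextfeed.php?url=https%3A%2F%2Fwww.derstandard.at%2Frss%2Fweb&max=3&links=preserve&exc=1"
```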
