Can't subscribe to https://cn.wsj.com/zh-hans/rss #2643

Open
hackenfu opened this issue May 14, 2024 · 2 comments

Comments

@hackenfu

I get the following error message when subscribing:
Access to this website is not authorized. It could be a bad username or password.
But I can open the feed in Chrome or any other web browser.

@fguillot
Member

This website uses a bot protection mechanism and rejects Miniflux requests with a 403.

Same result with curl:

➜  ~ curl -I https://cn.wsj.com/zh-hans/rss
HTTP/2 403
content-type: text/html; charset=UTF-8
content-length: 390039
date: Thu, 16 May 2024 03:22:53 GMT
x-amz-replication-status: REPLICA
last-modified: Thu, 16 May 2024 03:21:51 GMT
etag: "fca0a3c12066d318225d0f3470f0415b"
x-amz-server-side-encryption: AES256
cache-control: public,max-age=60,s-maxage=60
x-amz-version-id: ZSsQPTFSDIxueorBceb7hXrrlaL2pD48
accept-ranges: bytes
server: AmazonS3
x-cache: Error from cloudfront
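
The response above can also be checked programmatically. The following is a minimal Python sketch and not part of Miniflux: the function name `looks_like_bot_block` and the heuristic itself are assumptions made for illustration. It flags a response as a probable CDN-level rejection when the status is 403 and the `x-cache` header reports a CloudFront error, as in the curl output above.

```python
def looks_like_bot_block(status: int, headers: dict[str, str]) -> bool:
    """Heuristic (an assumption, not Miniflux logic): a 403 whose x-cache
    header mentions a CloudFront error is probably a CDN-level rejection."""
    # Header names are case-insensitive, so normalize them before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    x_cache = normalized.get("x-cache", "").lower()
    return status == 403 and "error from cloudfront" in x_cache


# A subset of the headers copied from the curl -I output above.
observed = {
    "content-type": "text/html; charset=UTF-8",
    "server": "AmazonS3",
    "x-cache": "Error from cloudfront",
}
print(looks_like_bot_block(403, observed))
```

A check like this only describes the symptom; it does not say which request property the protection is keyed on.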

@hackenfu
Author

Thanks for the reply.
Of course, I know that some protection is in place to reject tools such as Miniflux, FreshRSS, etc.
But in fact many mobile apps, such as Reeder and NetNewsWire, can still access this feed without problems. So I tried changing the User-Agent to pretend to be a mobile phone.
I tried this tool: https://github.com/lwthiker/curl-impersonate#supported-browsers
docker pull lwthiker/curl-impersonate:0.6.1-chrome
docker run --rm lwthiker/curl-impersonate:0.6.1-chrome (Wrapper script) https://cn.wsj.com/zh-hans/rss
Of the wrapper scripts, only curl_chrome99_android could access the feed.
I opened https://github.com/lwthiker/curl-impersonate/blob/main/chrome/curl_chrome99_android, found UA=Mozilla/5.0 (Linux; Android 12; Pixel 6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.58 Mobile Safari/537.36, and copied it into Miniflux, but it didn't work :(
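
To test a User-Agent outside Miniflux, a plain Python request can be built as sketched below. This is an illustration only: `build_request` is a hypothetical helper, and the request is built but not sent. Note that curl-impersonate reproduces the browser's TLS and HTTP/2 handshake as well as its headers, so a header-only change like this one may still be rejected; whether that explains the failure here is not confirmed in this thread.

```python
import urllib.request

FEED_URL = "https://cn.wsj.com/zh-hans/rss"
# User-Agent string quoted in the comment above (from curl_chrome99_android).
ANDROID_CHROME_UA = (
    "Mozilla/5.0 (Linux; Android 12; Pixel 6) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/99.0.4844.58 Mobile Safari/537.36"
)


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that overrides only the User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


request = build_request(FEED_URL, ANDROID_CHROME_UA)
# urllib stores header names capitalized, hence "User-agent" in the lookup.
print(request.get_header("User-agent"))
```

Sending it with `urllib.request.urlopen(request)` raises `urllib.error.HTTPError` if the server answers with a 403.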
