
Include hash sum locations of AppImages - and popularize if good idea #2830

Open
misog opened this issue Feb 19, 2022 · 11 comments

Comments

@misog

misog commented Feb 19, 2022

Hi. AppImages are great. There are many community-maintained sites which host AppImages. However, users have to trust the distributor and the distribution infrastructure of such community sites. Serious security incidents can happen; see, for example, "How one man could have hacked every Mac developer (73% of them, anyway)".

An ideal state for the AppImage ecosystem would be for every open-source and closed-source software author to offer an AppImage on their own site behind a valid SSL certificate. However, maintaining high-bandwidth hosting or a CDN for software releases may not be feasible for smaller producers.

I propose an alternative system and convention which could serve as a base layer for integrity verification in the AppImage ecosystem. It is easy to implement, so the involved parties can adopt it without problems.

Software authors and community

  1. Software authors publish cryptographic hashsums of their AppImage files on their website, behind a valid SSL certificate
  2. Community AppImage scrapers, distributors, package maintainers and other community members and systems distribute AppImage files together with URLs (locations) of the cryptographic hashsums provided by the software authors

Software users
Then users can verify the integrity and authenticity of any AppImage file on their machines:

  1. Check that the hashsum of the AppImage file is the same as the hashsum found at the provided URL
  2. Check that the provided URL belongs to the software authors

Verification on the user side can be done manually or semi-automatically:

Manual verification

  1. User verifies that the URL belongs to the software authors, by SSL certificate or by general knowledge of the site (e.g. firefox.org/releases/appimage-hashes.txt)
  2. User visits the URL containing the hash
  3. User runs sha512sum someprogram.appimage and compares the output hash with the hash at the URL
  4. If the hashes match, file integrity and authenticity are verified
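The manual steps above can be sketched in a few lines of Python. This is only an illustration, not an official tool; the file name and the published hash are hypothetical placeholders that a real user would copy from the author's HTTPS page.

```python
import hashlib

# Python equivalent of step 3 (`sha512sum someprogram.appimage`).
# Hashes the file in chunks so large AppImages do not need to fit in memory.
def sha512_of_file(path):
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # read 1 MiB at a time
            h.update(chunk)
    return h.hexdigest()

# published = "<hash copied from e.g. firefox.org/releases/appimage-hashes.txt>"
# print("OK" if sha512_of_file("someprogram.appimage") == published else "MISMATCH")
```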

Semi-automatic verification (just one user action required)

  1. User uses a trusted AppImage package manager, or an audited version of one
  2. Package manager downloads untrusted AppImage files (e.g. from non-SSL HTTP websites, repos or GitHub) together with the URLs of their hashsums (e.g. firefox.org/releases/appimage-hashes.txt), and downloads the actual hashes over HTTPS only (with a valid SSL certificate)
  3. Package manager displays the URL to the user
  4. User confirms the authenticity of the URL by typing y
  5. Package manager tests sha512sum(someprogram.appimage) == downloaded_hashsum_of_someprogram
  6. If the hashes match, file integrity and authenticity are verified
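The semi-automatic flow could be sketched as below. This is a hypothetical illustration, not an existing package manager: the fetch and confirm steps are injected as callables so a real implementation could plug in an HTTPS download and a y/n prompt, and all URLs and file names are placeholders.

```python
import hashlib
from urllib.parse import urlsplit

# Sketch of the semi-automatic verification flow described above.
def verify_appimage(appimage_path, hash_url, fetch, confirm):
    if not hash_url.startswith("https://"):
        return False                                 # hashes must only travel over HTTPS
    if not confirm(urlsplit(hash_url).hostname):     # step 4: user confirms the domain
        return False
    published = fetch(hash_url).strip()              # step 2: download the published hash
    with open(appimage_path, "rb") as f:             # step 5: compute the local hash...
        local = hashlib.sha512(f.read()).hexdigest()
    return local == published                        # ...and compare (step 6)
```

A real implementation would stream large files instead of reading them whole, and would use something like urllib.request.urlopen (which validates certificates by default) as the fetch callable.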

Proposal for a standard AppImage hashsums file format (draft)

# Comments start with #. Each line contains one record, beginning with the record-format version (currently v1).
# v1 expects five words separated by single spaces, each wrapped in double quotes (") if the word contains a space:
# record_format_version file_name package_version hash_function hash_of_file
# The package_version format is not defined; the recommended format is semver (x.y.z) https://semver.org/
# The processing party is responsible for detecting and handling the package_version format (no list of formats in v1)
# A file must cover just one package (e.g. firefox and firefox-esr must have two separate files)
# The file should have the extension .txt
# examples:
v1 appimage_filename_whatever.appimage ver3.6 sha512 32cc3e9b2a03d4e4a4875c427f723b7873d991334b3ef92118fd8e75e4f22d8d01b885276e55f8f3e61c6db2f3644f21ed52a7f73485176c92c123ed5c9ccf07
v1 program_v4.5.6_no_extension v6.0.1-dev sha256 c032d47970e81adad5e0eb6dd6351465a7afa8e388df1524ae79aff3406a21e4
v1 "program with spaces" 4.2.8 sha256 9bda22014bdfb1bc8ccec988e77f281c6f5bdfa4a0b3fb90a77f353cfc67f241
v1 "program with spaces" "5.6.9 dev" sha256 2bd1a722fab45ae34f90485dafc0ea2c0f0326fc74a9180523efc9a6127482a5
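A parser for this draft format fits in a few lines. This is only a sketch of the v1 rules above; the dictionary field names are my own labels, not part of the draft.

```python
import shlex

# Minimal parser sketch for the draft v1 hashsums file format.
# shlex handles the double-quoted words ("program with spaces").
def parse_hash_file(text):
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                                  # skip comments and blank lines
        words = shlex.split(line)
        if len(words) != 5 or words[0] != "v1":
            raise ValueError("unsupported record: %r" % line)
        _, file_name, package_version, hash_function, file_hash = words
        records.append({"file_name": file_name,
                        "package_version": package_version,
                        "hash_function": hash_function,
                        "hash": file_hash})
    return records
```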

More precise specification of the format can be found here where I wrote about the idea in wider context srevinsaju/zap#66

To make this practical, it needs support from both software authors (producers) and the AppImage community.

@probonopd
Member

Why not use embedded digital signatures for this purpose?

@misog
Author

misog commented Feb 20, 2022

Why not use embedded digital signatures for this purpose?

Because embedded digital signatures are much more complicated to implement for all parties involved, and that would prevent fast and widespread adoption. Both can be used, of course, but hashing is very simple and easy to implement everywhere, and the signature part with certificates is solved by SSL.

Also, if I want to verify some random AppImage I found on some torrent site, I do not need to import any public keys; I just look up the hash on the official website of the software producer. Example: if the hashsum of firefox-70.0.0.appimage that I found on a torrent site matches the officially published hashsum on firefox.org, then it is verified. I can DuckDuckGo or Google a hash I computed locally; that cannot be done with signatures.

Hashing could also be offered in a right-click context menu or a Properties window, like some Windows apps do:
[screenshot: HashTab showing file hashes in a Properties tab]

Fedora offers SHA256 hashsums of releases, and those hashsums are also signed with PGP. However, obtaining that PGP key relies on an SSL certificate, so if the SSL connection is compromised, or I get phished or mistake the domain, PGP will not help me.

https://getfedora.org/static/checksums/35/iso/Fedora-Workstation-35-1.2-aarch64-CHECKSUM
https://getfedoraos.org/static/fedora.gpg (fake)
https://getfedora.org/static/fedora.gpg

Also, what if small software producers are hacked and their signing keys compromised? There would have to be some revoke-alert system, which adds process complexity for them and implementation complexity for the AppImage community.

However, signing is good; it is just too complex to start with on a mass scale, while hashes are simple and do the job of verification.

@probonopd
Member

probonopd commented Feb 20, 2022

Because embedded digital signatures are much more complicated

Indeed, that is a valid reason.
But: the hash cannot be embedded into the AppImage itself, or else it could be tampered with.
An external tool (outside of the AppImage) needs to calculate and display the hash.

The screenshot above shows the HashTab software doing this.

So this would need to be implemented by the file managers of desktop environments such as KDE, Xfce, helloDesktop etc.

Maybe @TheAssassin wants to share his view on this as well.

@misog
Author

misog commented Feb 20, 2022

Yes, wide support from two parties is needed:

  1. The company which produces the software as an AppImage file and publishes its hashsum on an HTTPS website.
  2. The distributor which distributes the AppImage file and a URL (link) to that hashsum on the HTTPS website.

Then the user can verify it manually or semi-automatically, as described in the original post.

The hard problem is convincing many companies to publish hashes of their AppImages.

Maybe such hashsum-and-SSL-based verification could become part of the official AppImage standard, as a description of how to verify the authenticity of any AppImage file. That could popularize it among software authors. Note also that there are currently many AppImage files all over the internet with unknown authenticity, possibly malicious, and without a clear trace to any Git repository of an AppImage distributor.

As for an embedded hashsum, it is possible; there are at least two methods:
a) The AppImage file contains another AppImage file plus the hashsum of the embedded AppImage file.
b) The AppImage file contains a file hashsum.txt holding the hashsum of the AppImage file without the hashsum.txt file. The verification process first reads the hash from hashsum.txt, then removes the hashsum.txt file, then hashes the AppImage file and compares the result with the read hash.

However, it is not necessary to include the hash in the AppImage file, because anybody can alter it anyway. The hash must come directly from the software author and be compared to the computed (real) hash of the file.

However, the URL of the hash could be embedded (e.g. in hash_url.txt) and could even become part of the AppImage specification. That way, AppImage distributors would not need to ship any additional hash with the package.

Maybe there should be two files, author_hash_url.txt and distributor_hash_url.txt, so that if the author does not create AppImages but the distributor builds them, the distributor can include its own URL.

@probonopd
Member

We could make appimagetool produce the hash files automatically.

@misog
Author

misog commented Feb 20, 2022

Yes, that would be nice, but it should create the hash that will be placed online. Anybody can alter hashes inside a package, so they should not be trusted. The real hash must be stored behind HTTPS; the verification process then hashes the file and compares it with that real hash.

appimagetool could produce a line to be appended to the hash file (the specification draft from the original post), like this:

v1 program_v4.5.6.appimage 6.0.1 sha256 c032d47970e81adad5e0eb6dd6351465a7afa8e388df1524ae79aff3406a21e4
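Generating such a line is straightforward. The sketch below is a hypothetical illustration of what appimagetool could emit, following the quoting rule and field order of the draft format; it is not part of appimagetool.

```python
import hashlib

# Re-add double quotes only when a word contains a space, per the v1 draft.
def quote(word):
    return '"%s"' % word if " " in word else word

# Produce one v1 record for a freshly built AppImage.
# A real tool would likely use the base file name rather than the full path.
def v1_record(path, package_version, algo="sha256"):
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        h.update(f.read())
    return "v1 %s %s %s %s" % (quote(path), quote(package_version), algo, h.hexdigest())
```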

@TheAssassin
Member

Please stop using the term "cryptographic hashsum"... you're talking about checksums, not hashsums, not cryptographic hash algorithms.


I did have a much longer reply in the works, but I don't have the time to dissect every statement and explain why someone is wrong there. The following is just a summary of my findings.

The solution for authenticity checks is clearly to use signatures. Do not try to create some "authentication" scheme using plain checksums. You are most likely not going to get it right, but you can easily create a false sense of security by adding such snake oil.

This proposal is not convincing at all. The author appears to have misconceptions because I see lots of contradictions and false assumptions. The proposed protocol is not supported anywhere at this point, whereas PGP is well supported. It is basically "home grown crypto" (always a bad sign), it is not "not complicated" as the author claims either. You might be able to get it working security wise by fixing the flaws, but the user ultimately has to validate the origin of the "authentication data", much like it is required with the existing support for PGP signatures. There is no apparent benefit, instead the attack surface is increased by changing the bit the user has to verify from a PGP key (fingerprint) to some URL.

So where's the gain of this protocol? appimagetool does all the signing and embedding already. All you have to do is generate a PGP key, which isn't hard (someone should document it on docs.appimage.org, though). Setting up a TLS-enabled webserver securely is complex, too, but serving over HTTPS is the standard nowadays, which is also why browsers don't highlight it as much as they used to. When using third-party webserver infrastructure, attributing that infra to some app authors isn't easy for an average user either (unless you're libreoffice.org, for instance, a URL which most people know; how would you tell that a URL belongs to some project? I could just register "appimage.fr", for instance, add an "official-looking" website and trick people into trusting me...).

I see lots of open questions. For instance "what's the substitute of CRLs in your protocol" (which you correctly recognized, too, see below). Also, you're assuming some "educated user" is going to do all the manual work, but I miss a solution for "average joe" (not talking about generating checksums but making sure that URL is trustworthy).

Anyway, continuing this list is a waste of time, honestly. Embedded PGP signatures are not complicated to use; what is missing is the right tooling to work with them, but you could say the same about this protocol proposal. In case you want to make it easy, feel free to also have non-embedded PGP signatures, which can be handled with plain gpg; the embedded ones are mainly useful for AppImageUpdate. But don't try to make checksums work for authentication, please.


I don't just want to claim someone has misconceptions without giving at least one example.

Fedora offers SHA256 hashsums of releases, and those hashsums are also signed with PGP. However, obtaining that PGP key relies on an SSL certificate, so if the SSL connection is compromised, or I get phished or mistake the domain, PGP will not help me.

How would your checksum help you there, though? It relies on that exact same TLS certificate.

The key difference between using PGP and "checksums downloaded over TLS" is that nobody pins the key for TLS, but they do for PGP. I just need to make sure the PGP key is right once, then can use it to authenticate all downloaded images in the future.

Of course, pinning the TLS key is used a lot nowadays, too, for instance in mobile apps (to protect against reverse engineering, mostly). But using an authentication scheme that does not rely on TLS has a few advantages. No matter where the files are served from, you can always authenticate them after downloading. That's why there can be dozens of mirrors which are not maintained by the same project.

Because embedded digital signatures are much more complicated to implement for all parties involved, and that would prevent fast and widespread adoption. Both can be used, of course, but hashing is very simple and easy to implement everywhere, and the signature part with certificates is solved by SSL.

No, it's not "solved by SSL". You use TLS as a cryptographic layer, sure, but what your scheme boils down to is "let's ask the user whether to trust a URL", but how to do so properly?

I'm sure most people can at most have a look at the domain name, perhaps open it in the browser to see what's going on. This is much weaker than using PGP keys, because "looking good in the browser" is basically "oh yeah no warning about broken TLS" and maybe "there's their logo, must be their homepage". Why bring the TLS CA infrastructure into the mix?

With PGP, one just authenticates the key. One way is to download it from a "trustworthy website". But it's not limited to this. Still, once that issue is solved, you have pinned the key, which is a great benefit over "pinning the URL".

Also, if I want to verify some random AppImage I found on some torrent site, I do not need to import any public keys; I just look up the hash on the official website of the software producer. Example: if the hashsum of firefox-70.0.0.appimage that I found on a torrent site matches the officially published hashsum on firefox.org, then it is verified. I can DuckDuckGo or Google a hash I computed locally; that cannot be done with signatures.

That's nonsense. You don't have to search for the signature online. You'd look for the key instead. Why is this any harder than searching for a checksum?


Also, what if small software producers are hacked and their signing keys compromised? There would have to be some revoke-alert system, which adds process complexity for them and implementation complexity for the AppImage community.

That is a good point that has not really been solved for embedded signatures. I think AppImageUpdate should always check the keyservers to see whether a key has been revoked.

Your protocol doesn't have a solution to this. Would you just distribute a list of "known bad hashes"? Why should appimage.org maintain such a list even? Why should anyone trust appimage.org, of whom they may never have heard before?

@TheAssassin
Member

Also, I want to emphasize that this is the completely wrong forum for such a discussion.

@misog
Author

misog commented Feb 21, 2022

@TheAssassin

Please stop using the term "cryptographic hashsum"... you're talking about checksums, not hashsums, not cryptographic hash algorithms.

Indeed, I am writing about cryptographic hash algorithms, which on Linux distributions are available via programs such as sha256sum or sha512sum; in general, hashsums. Deal with it.

I recommend reading posts with an understanding of the relevant points and responding with real arguments.

This proposal is not convincing at all. The author appears to have misconceptions because I see lots of contradictions and false assumptions.

You listed zero.

The proposed protocol is not supported anywhere at this point, whereas PGP is well supported.

Yes, that is why I propose it :) But it is not entirely true that it is not used anywhere: it is used every time software is downloaded from an HTTPS website, of course just in "manual mode". The proposal can be implemented semi-automatically; see the original post. Also, PGP is much more complicated to implement and use for all parties involved (including the average user coming from Windows), as was already said.

It is basically "home grown crypto" (always a bad sign), it is not "not complicated" as the author claims either.

It is not "home-grown crypto". It uses cryptographic hash functions https://en.wikipedia.org/wiki/Cryptographic_hash_function, which are widely available on Linux distributions (e.g. sha256sum, sha512sum, ...), and it uses SSL with the existing PKI. PGP is much more complicated for all parties involved (including the average user coming from Windows) than this proposal. The proposal contains end-to-end explanations of how this can work, including what each party in the system should provide. I recommend that you find drawbacks in the proposed system rather than just arguing that 'PGP is better'. It does not matter that something is better if it is not used. Also, the proposal is not cryptographically weak.

You might be able to get it working security wise by fixing the flaws, but the user ultimately has to validate the origin of the "authentication data", much like it is required with the existing support for PGP signatures. There is no apparent benefit, instead the attack surface is increased by changing the bit the user has to verify from a PGP key (fingerprint) to some URL.

See the post below; the discussion about validating the origin comes later in the text.

@misog
Author

misog commented Feb 21, 2022

There is no apparent benefit, instead the attack surface is increased by changing the bit the user has to verify from a PGP key (fingerprint) to some URL.

  1. The first benefit is that the average user does not need to know how to manage XYZ PGP keys of software authors. And that is the point: the average user can just know the correct HTTPS URL domain of the hash and can verify the file (or let the package manager verify it; the user just confirms that e.g. firefox.org is the correct domain).
  2. The second benefit is that no handling of asymmetric cryptography is required in the AppImage community ecosystem (package distributors, creators of package managers, ...).
  3. The third benefit is that no revoke-alert system needs to be implemented to handle leaked keys.
  4. The fourth benefit is that software producers do not need to deploy a signing process with asymmetric cryptography or look after signing keys.
  5. The fifth benefit is that integrating hash verification is much more feasible than integrating PGP verification outside the AppImage community ecosystem, e.g. in community ecosystems such as GNOME or KDE.

I do not argue that signing is better or worse than this scheme in principle. I argue that this scheme is much easier for all involved parties (producers, distributors, package-manager makers, users), that it is sufficient to verify the integrity of a file with strong cryptographic hashing (even signing uses strong cryptographic hashing), and that the solution is no worse than browsing the HTTPS web and downloading packages from an HTTPS website. Signing and PGP verification can be done too - you can go and convince all parties to use it.

How would your checksum help you there, though? It relies on that exact same TLS certificate.

Yes, that was my point: PGP relies on SSL and domain verification in the general case, just as the proposed scheme relies on SSL and domain verification. If the user downloads a fake PGP public key from a fake domain, it is bad. But the point is that hashes, SSL and domains are much simpler for the average user than PGP. And the proposed scheme is much more secure than the current state, where AppImages are scattered all over the internet without any PGP, hash or even a reference to the author/distributor.

I just need to make sure the PGP key is right once, then can use it to authenticate all downloaded images in the future.

Or you can just get phished once... and never find out that you are happily downloading new malware with every update. It is just about the risk/usability ratio.

No, it's not "solved by SSL". You use TLS as a cryptographic layer, sure, but what your scheme boils down to is "let's ask the user whether to trust a URL", but how to do so properly?

I'm sure most people can at most have a look at the domain name, perhaps open it in the browser to see what's going on. This is much weaker than using PGP keys, because "looking good in the browser" is basically "oh yeah no warning about broken TLS" and maybe "there's their logo, must be their homepage". Why bring the TLS CA infrastructure into the mix?

With PGP, one just authenticates the key. One way is to download it from a "trustworthy website". But it's not limited to this. Still, once that issue is solved, you have pinned the key, which is a great benefit over "pinning the URL".

It is solved by SSL the same way website authenticity is solved by SSL: the user just checks both the domain and the SSL certificate. Nowadays certificates are of little extra help with generic Let's Encrypt, but the domain check is good enough for an AppImage published on a website, and therefore good enough for an AppImage verified locally with a hashsum plus a domain check. Again, I do not argue that this is better security than signing. I argue that this is MUCH better than the current state of the AppImage world and MUCH more feasible for all parties involved (including the average user) than a PGP system. And it still offers a strong guarantee for verifying the authenticity of a file.

Also, pinning of the URL can be done by a trusted/audited package manager by adding the domain to a list of trusted domains. That is the equivalent of adding a PGP public key to the system in the PGP scheme.
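Such domain pinning could be sketched as below. This is a hypothetical illustration: the trusted-domain set and the function name are my own, standing in for whatever store a real package manager would keep after the user confirms a domain once.

```python
from urllib.parse import urlsplit

# Domains the user has already confirmed once, analogous to an imported PGP key.
TRUSTED_DOMAINS = {"firefox.org", "krita.org"}  # hypothetical pinned list

def is_pinned(hash_url):
    parts = urlsplit(hash_url)
    # Require HTTPS and an exact match against the pinned domain list.
    return parts.scheme == "https" and parts.hostname in TRUSTED_DOMAINS
```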

Another point is that AppImage distributors must somehow direct the user to download the PGP key. How can they do that? There is no way other than showing the domain! So again, PGP relies on the user's ability to verify a domain.

That's nonsense. You don't have to search for the signature online. You'd look for the key instead. Why is this any harder than searching for a checksum?

Not nonsense: when the average user does not have the key (the most probable case) from the authors of krita-5.0.0.appimage, the user needs to search for "krita" with a search engine, then open the right website (domain verification again...), click multiple times to find the public key, then import it to the machine and then verify the file. Average user = a Windows user who started to use Linux because AppImages are there.

Or the average user can just compute the hash of the file (with a GUI tool or sha256sum), search for the hash on the web with a search engine and verify the domain that the search engine found. Then compare the two hashes visually or in some tool. That is it. No keys, no nothing; a simple comparison of hashes verifies integrity and authenticity.

Your protocol doesn't have a solution to this. Would you just distribute a list of "known bad hashes"? Why should appimage.org maintain such a list even? Why should anyone trust appimage.org, of whom they may never have heard before?

A list of known bad hashes is not needed here, because the distributor distributes the AppImage plus the URL of the hash. The distributor would usually distribute the correct URL of the hash (the correct domain); if the distributor is malicious, the user will see that the URL is wrong. The difference is that in this case such a list is not required, because the real verification is done by the user verifying the domain, not by trusting that the relevant signing private key was not leaked. But yes, the SSL certificate must not be leaked for this scheme to work. However, the SSL certificate must also not be leaked for the user to obtain the correct PGP public key. So in principle, as a qualitative judgement, both PGP and this proposal rely on SSL (excluding postal delivery of PGP keys and such). To make any quantitative judgements, specific cases are needed.

As an additional feature, it could be beneficial to build a distributed, community-based blacklist of bad (malicious) hashes, optionally used by package managers to warn users (package managers can be trusted or audited), similarly to how antivirus companies ship malware databases to antivirus clients. However, it is not needed for security reasons, contrary to the PGP system.

@misog
Author

misog commented Feb 21, 2022

tl;dr: The proposed system is equivalent to downloading software with a web browser.

How downloading/installing software with a web browser works

  1. User wants to run software (e.g. Krita).
  2. User has a trusted/audited web browser.
  3. Another trusted or audited system (e.g. a web search engine, the GitHub search feature, Wikipedia, ...) shows the untrusted URL (domain) of the untrusted software package to the user.
  4. User verifies the untrusted URL (domain) with their own eyes and their own knowledge of the correct domain, or with the SSL certificate issued to the Krita authors for that domain.
  5. The URL (domain) has just been verified by the user, so the software package is verified by the user.
  6. User downloads the software package with the browser.
  7. User runs the software package.

How downloading/installing software with the proposed system works

  1. User wants to run software (e.g. Krita).
  2. User has a trusted/audited package manager.
  3. Another trusted or audited system (e.g. the appimage.github.io search engine, another distribution website, ...) shows the untrusted URL (domain) of the untrusted software package's HASH to the user.
  4. User verifies the untrusted URL (domain) with their own eyes and their own knowledge of the correct domain, or with the SSL certificate issued to the Krita authors for that domain.
  5. The URL (domain) has just been verified by the user, so the software package's HASH has just been verified by the user.
  6. Package manager downloads the untrusted software package from the untrusted location.
  7. Package manager verifies the authenticity of the software package by comparing the computed hash of the software package with the hash from the already-verified URL, so the software package becomes verified by the user.
  8. User runs the software package.

Clearly, this is equivalent to downloading software with a browser from sites with SSL certificates. The user is responsible for verification of the software package in both cases. The user is also responsible for verification of all future downloads, or the package manager can mark the domain as trusted for later use.

The user story behind AppImage is exactly that:

"As a user, I want to download an application from the original author, and run it on my Linux desktop system just like I would do with a Windows or Mac application."
https://appimage.org/
