Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix command line parsing for all platforms #2009

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

nothingTVatYT
Copy link
Contributor

For systems using a posix-compliant shell command line arguments are parsed by the shell and relayed to the program via an array. The previous concatenation and parsing meant that escaped spaces were lost for Linux and OSX.

@mafiesto4
Copy link
Member

I think this adds a significant readability problem to the code. Can we properly construct the command line string to be passed to the engine instead? The purpose of engine accepting a single const Char* is to make it simple across all platforms (including consoles and mobile). Also, command line class should not have 3 usages of each command arg switch due to complexity of maintenance.

@nothingTVatYT
Copy link
Contributor Author

First of all, I agree that the code is not pretty - especially with the preprocessor directives.
But regarding the concatenation of command line arguments into a single char array the short answer is no, you cannot do that. The longer answer is well, yes, but with serious limitations or much more overhead than it's useful in this rather simple case.
I'm sure you're familiar with most of what I have to say but just to make sure we're talking about the same details, let me quickly recap what the problem with the current solution is.

Let's say we have the not so unusual case of spaces somewhere in the project path, like "flax projects\test1" and a very simple and portable code like this:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char cmdline[1024];

    *cmdline = 0;

    for (int i=0; i < argc; i++)
    {
        printf("argv[%d]=%s\n", i, argv[i]);
        if (i > 0)
        {
            if (i > 1) strcat(cmdline, " ");
            strcat(cmdline, argv[i]);
        }
    }

    printf("cmdline=%s\n", cmdline);

    exit(0);
}

In Windows 11 cmd a test could look like that:

G:\cmdtest>dir "flax projects\test1*"
 Volume in drive G is VBOX_shared
 Volume Serial Number is 0000-0812

 Directory of G:\cmdtest\flax projects

12/07/2023  04:15 AM    <DIR>          test1
               0 File(s)              0 bytes
               1 Dir(s)  967,365,767,168 bytes free

G:\cmdtest>cmdtest -project "flax projects\test1" -play
argv[0]=cmdtest
argv[1]=-project
argv[2]=flax projects\test1
argv[3]=-play
cmdline=-project flax projects\test1 -play

That means in the array it's clear that the space between flax and projects is part of the path whereas in the concatenated command line you have to guess.
On a system running a POSIX-compliant shell like Linux and macOS it's similar although there are more ways to escape the space:

./cmdtest -project "flax projects/test1" -play
argv[0]=./cmdtest
argv[1]=-project
argv[2]=flax projects/test1
argv[3]=-play
cmdline=-project flax projects/test1 -play

or in the bash default way of escaping:

./cmdtest -project flax\ projects/test1 -play
argv[0]=./cmdtest
argv[1]=-project
argv[2]=flax projects/test1
argv[3]=-play
cmdline=-project flax projects/test1 -play

or in virtually every way you can imagine, including shell variables, command output and whatever.

Now one could argue that the single arguments could be surrounded by quotes or special characters could be escaped.
The problem with this approach is that it will never be safe unless you introduce some restrictions that applies to the flax editor but not the operating system. In Linux (and to my knowledge macOS) single and double quotes are perfectly legal characters in file names - maybe unusual but there is no guarantee against someone naming his volume "joe's files".
That leads to something like:

./cmdtest -project flax\ projects/test\ with\ \"quotes\" -play
argv[0]=./cmdtest
argv[1]=-project
argv[2]=flax projects/test with "quotes"
argv[3]=-play
cmdline=-project flax projects/test with "quotes" -play

Even in this exotic example the array preserves the information of what the path is.
Interestingly Windows cmd struggles with this example. The automatic file name completion suggests that but the quotes are lost:

G:\cmdtest>cmdtest -project "flax projects\test with "quotes"" -play
argv[0]=cmdtest
argv[1]=-project
argv[2]=flax projects\test with quotes
argv[3]=-play
cmdline=-project flax projects\test with quotes -play

G:\cmdtest>dir "flax projects\test with*"
 Volume in drive G is VBOX_shared
 Volume Serial Number is 0000-0812

 Directory of G:\cmdtest\flax projects

12/07/2023  05:41 AM    <DIR>          test with "quotes"
               0 File(s)              0 bytes
               1 Dir(s)  967,365,746,688 bytes free

So even Windows can list directories with special characters and it's not a Linux-specific problem.

Let me show you another theoretical case that breaks the current logic. Assume somebody copies or moves a project and appends "-new" to it. Let's further assume this user is a bit clumsy on the keyboard and ends up with "test -new" i.e. another space. We end up with something like:

G:\cmdtest>dir "flax projects\test -*"
 Volume in drive G is VBOX_shared
 Volume Serial Number is 0000-0812

 Directory of G:\cmdtest\flax projects

12/07/2023  04:15 AM    <DIR>          test -new
               0 File(s)              0 bytes
               1 Dir(s)  967,365,746,688 bytes free
G:\cmdtest>cmdtest -project "flax projects\test -new" -play
argv[0]=cmdtest
argv[1]=-project
argv[2]=flax projects\test -new
argv[3]=-play
cmdline=-project flax projects\test -new -play

Now the -new part looks like an argument in the concatenated array but the user clearly stated that it's part of the name by using correct escape (quote) characters.

What I try to make clear is that concatenating arguments will always bear the risk of losing information that is hardly recoverable in a reliable and portable way.
Even the xbox defines an entry point of Main(string[] args) and obviously the related languages like C, C++, C#, Java, you name it - all adapt this scheme.

My suggestion is that we change the entry point so it expects an array of strings and we can get rid of the confusing and maintenance-unfriendly preprocessor directives and implement just one command line handling for all.
From my experience with C, C++ and Java the usual way seems to be a kind of module/function/library that handles the command line arguments for easier maintenance (i.e. for example a single point to define a new option or argument). Although I would refrain from introducing another external dependency and instead make one which works for all targeted platforms.

Another question is whether we want helpful error messages in case the arguments are used wrong. Things like "-project requires an argument" or "-porject is an invalid option" is best done when the structure is known beforehand. If those kind of checks and error reporting are not important than the parsing could be simplified a lot by going through the argument array just once.

@nothingTVatYT
Copy link
Contributor Author

Please have a look at this proposal. It simplifies the command line parsing a lot while still preserving argument boundaries and thus still fixes the spaces-in-path problem.
The main difference beside better maintainability is the lack of detailed error reporting.

It can be made even shorter if the Windows version would supply an argument list as well.

@mafiesto4 mafiesto4 added the enhancement New feature or request label Dec 7, 2023
@mafiesto4 mafiesto4 added this to the 1.8 milestone Dec 7, 2023
@nothingTVatYT nothingTVatYT changed the title fix command line parsing for Linux and OSX fix command line parsing for all platforms Dec 24, 2023
@mafiesto4 mafiesto4 removed this from the 1.8 milestone Feb 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request platform
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants