Bug: .GCT files written by cmapPy on Windows have inconsistent line endings #77

Open
KarlClauser opened this issue Jul 13, 2022 · 4 comments

Comments

@KarlClauser

Hi Lev,

Bug: .GCT files written with cmapPy on Windows show alternating blank lines after the top 3 lines when opened in Excel, though they look fine in the Spyder code editor (v5.12.3).

Fix: The line below writes the first 2 lines of a .GCT file. Without an explicit line_terminator it defaults to the OS line terminator, \r\n, which conflicts with all the other lines, which are terminated by \n.

Inconsistent line endings probably trick Excel's automatic line-ending detection.
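
One way to confirm a mix of terminators independently of Excel is to read the file in binary mode and count each kind of ending. The following is a minimal sketch using only the standard library; the helper names `count_line_endings` and `has_mixed_endings` and the sample bytes are made up for illustration and are not part of cmapPy.

```python
import re


def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF, and bare CR terminators in raw bytes."""
    crlf = len(re.findall(rb"\r\n", data))
    # A bare LF is an LF not preceded by CR.
    lf = len(re.findall(rb"(?<!\r)\n", data))
    # A bare CR is a CR not followed by LF.
    cr = len(re.findall(rb"\r(?!\n)", data))
    return {"crlf": crlf, "lf": lf, "cr": cr}


def has_mixed_endings(data: bytes) -> bool:
    """True when more than one kind of terminator appears in the data."""
    counts = count_line_endings(data)
    return sum(1 for n in counts.values() if n > 0) > 1


# Illustrative bytes only: two CRLF-terminated lines followed by two LF-terminated lines.
sample = b"line1\r\nline2\r\nline3\nline4\n"
print(count_line_endings(sample))
print(has_mixed_endings(sample))
```

To inspect a real file, pass it the result of `open(path, "rb").read()`; binary mode matters because text mode would normalize the endings before they can be counted.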

C:\ProgramData\Anaconda3\Lib\site-packages\cmapPy\pandasGEXpress\write_gct.py #line 102

# Write top_half_df to file

#top_half_df.to_csv(f, header=False, index=False, sep="\t")
top_half_df.to_csv(f, header=False, index=False, sep="\t", line_terminator='\n')

Please incorporate into next version. Screenshots attached.

Thanks,

--Karl
cmapPybug_inconsistentLineEndings.docx

@levlitichev
Contributor

Good catch. I think the better change would be to replace \n with os.linesep in write_version_and_dims:

f.write(("#" + version + "\n"))
f.write((dims[0] + "\t" + dims[1] + "\t" + dims[2] + "\t" + dims[3] + "\n"))
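
One thing worth checking before switching to os.linesep: if the handle passed to write_version_and_dims is opened in text mode (an assumption here, since the thread does not show the open call), Python already translates every "\n" to the platform terminator on write, and the Python documentation advises against writing os.linesep to text-mode files. The sketch below mimics the Windows default on any platform by opening the file with newline="\r\n"; the helper `written_bytes` and the "#1.3" string are placeholders for illustration.

```python
import os
import tempfile


def written_bytes(text: str, newline) -> bytes:
    """Write text through a text-mode handle and return the raw bytes on disk."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", newline=newline) as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


# newline="\r\n" mimics the default text-mode translation on Windows.
windows_like = "\r\n"

# Writing "\n": translated once, giving a single CRLF.
lf_result = written_bytes("#1.3" + "\n", windows_like)

# Writing "\r\n" (the value of os.linesep on Windows): the "\n" is translated
# again, so the line ends with an extra carriage return.
linesep_result = written_bytes("#1.3" + "\r\n", windows_like)

print(lf_result)
print(linesep_result)
```

If the handle is opened in binary mode or with newline="", no translation happens and the written terminator reaches the file unchanged.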

@KarlClauser
Author

Won't that lead to \r\n line endings on Windows? We should be striving to get the entire file to use \n line endings. I want a file that is identical whether it is written on Linux or Windows. That is how I encountered this bug.

--Karl

@levlitichev
Contributor

I understand your point, but I feel that it would be wise to follow the convention chosen by pandas, which is to use system-specific line terminators. I confirmed (on my Mac) that the file looks the same when opened in Excel whether the terminators are all \n or all \r\n.

# A.txt: every line ends with \n
with open("A.txt", "w") as f:
    f.write("A" + "\n")
    f.write("B" + "\n")

# B.txt: every line ends with \r\n
with open("B.txt", "w") as g:
    g.write("A" + "\r\n")
    g.write("B" + "\r\n")

@KarlClauser
Author

Hi Lev,

I'm a Windows guy, and nowadays Windows programs can routinely handle '\n' line terminators. The people upstream and downstream of me who read and write .GCT files use Macs. Consequently, it is important to be able to read and write files and get the same result on either platform. If you force Windows-generated files to use '\r\n', then I'm going to have to fix every file I produce with cmapPy to get the desired '\n'.

Would you please at least provide an option to specify the line terminator to be written? So long as I have a means to get '\n' then I don't care which you choose as a default.
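
A minimal sketch of how such an option might look, using only the standard library. The function `write_lines` and its `line_terminator` parameter are hypothetical and are not cmapPy's API. The key assumption is that the file is opened with newline="", which turns off Python's newline translation so that the bytes on disk are the same on every OS.

```python
import os
import tempfile


def write_lines(path, lines, line_terminator="\n"):
    """Write each line followed by exactly line_terminator, on any OS."""
    # newline="" disables the "\n" -> os.linesep translation on write.
    with open(path, "w", newline="") as f:
        for line in lines:
            f.write(line + line_terminator)


def read_bytes(path):
    """Return the raw bytes of the file, with no newline handling."""
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")

    # Default: "\n" everywhere, regardless of platform.
    write_lines(path, ["A", "B"])
    unix_style = read_bytes(path)

    # Explicit opt-in to "\r\n" for anyone who wants it.
    write_lines(path, ["A", "B"], line_terminator="\r\n")
    windows_style = read_bytes(path)

print(unix_style)
print(windows_style)
```

With this shape, the choice of default becomes a separate question from whether '\n' output is obtainable at all.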

Thanks,

--Karl
