GSoC 2018 prateekiiest
- Name: Prateek Chanda
- Time Zone: +05:30 GMT
- IRC Handle: prateekiiest
- GitHub ID: prateekiiest
- Blog: Prateek Chanda @Medium
- LinkedIn: My LinkedIn Profile
- Website: CodeFolio
- Mail: prateekkol21@gmail.com or prateek.dd2015@cs.iiests.ac.in
- University: Indian Institute of Engineering Science and Technology, Shibpur
- Major: Computer Science and Engineering
- Current Academic Year: Third Year
- Graduation Year: 2019
- I have been involved in open source for the past two years, contributing to various open source projects, including algorithm-based projects and Python-based desktop applications.
- I completed two major machine learning projects in Python that rely mainly on statistical analysis (pandas, numpy) and Jupyter Notebook. These were projects for the Udacity Machine Learning Nanodegree: Titanic Survival Exploration and Boston Housing.
- I recently worked on a personal project, Code Sleep Python, which drew active participation from several communities. It is a Python project for building desktop applications and games.
- Contributing to organizations like FOSSASIA and other small organisations: I have been contributing to open source organizations and repositories for the past two years. So far I have made 163 pull requests and worked on 72 issues.
- Mentored students in open source events such as Hacktoberfest, Kharagpur Winter of Code and 24 Pull Requests, on some of my own personal projects primarily written in Python.
- Got the opportunity to speak at different open source summits. I was invited to developer conferences such as RISE Hong Kong 2017, FOSSASIA Open Tech and Google Developers Solve for India, among others.
- Currently I am the GitHub Campus Expert of my college and, with direct support from GitHub, I am helping to grow the community on campus by involving more people in open source and coding.
OpenAstronomy, consisting of 8 sub-organisations, is a collaboration between open source astronomy and astrophysics projects that are used by researchers around the world to study our universe. The analysis of data obtained from observatories like SDO and the Hubble Space Telescope supports many kinds of research, from forecasting solar storms to detecting planets around other stars.
Astronomy is at the frontier of science. There are new discoveries made all the time.
I have always loved the sky and have had a childhood fascination with astronomy. It feels amazing to see real-time data of the Sun, captured just a few hours earlier, on my computer screen and to be able to analyse it. The organisation inspires new developers like me to join the open source community and build the code base together with the lead developers, and that is where I find the joy of open source. Contributing to SunPy has been a great experience ever since I joined the SunPy community.
I did not qualify for my first GSoC with OpenAstronomy back in 2017, but I did not give up. I loved the project and wanted to contribute as much as I could. Having worked with the SunPy project for the past year, I have learnt many new things about open source, team management and writing larger pieces of code, and, last but not least, I have experienced the joy of having a contribution accepted.
OpenAstronomy, along with Google Summer of Code, will give me the opportunity to be a part of this large project on solar data analysis, and I consider it a privilege to contribute to such open source software in the field of astronomy.
I have been involved with the SunPy project for the past year, contributing since December 2016. I am delighted to have been a part of the latest SunPy releases and grateful to have been acknowledged for my work.
Pull Requests | Corresponding Issue | Status |
More Mapcubes Examples - Gallery Examples #2455 | More examples in the gallery on simple map and mapcube manipulation #2413 | Open |
Remove Gamma usage in Map #2424 | Gamma in map doesn't do anything #2333 | Merged |
Finding Local Peaks in Solar Data - Gallery Update #2339 | Suggested this example | Merged |
Masking Hot Pixels | Found Bug in Example | Merged |
Brightest pixel location may occur at multiple position | Found similar bug in Example | Merged |
Brightest pixel location redundancy removed | Removed Redundancy in Examples | Merged |
Added documentation for suds-py3 incompatibility | VSOClient.query returns no result in Python 3.5 | Merged |
Update README.md Matrix Org linked | Enhancement | Merged |
Update README.md | README badges broken | Merged |
Added documentation for database/tests | database tests depend on data/tests dir | Merged |
Update vso.py | Enhancement Documentation | Merged |
Update time.py Removed extract_time | Remove extract_time function | Merged |
Update rescale.py | reshape_image_to_4d_superpixel array seems broken | Merged |
Contribution to SunPy Website
Pull Requests | Corresponding Issue | Status |
help section Docs link updated | Sunpy Documentation Link Showing Privacy Error | Merged |
Sunpy Presentations and Talks Upload on the Site | Enhancement | Merged |
Registry of Sunpy Affiliated Packages | Enhancement -Introducing Affiliated Packages Registry | Merged |
Update bootstrap.css bootstrap version updated | Update bootstrap | Merged |
Update about.html Community Link updated to matrix | Enhancement | Merged |
In the analysis of solar data, the precision of the observation time is one of the fundamental factors in the accuracy of the data collected. Hence we need to be as precise as possible about the time of observation when collecting data from solar observatories.
The majority of SunPy uses the datetime.datetime object as its representation of time. However, datetime.datetime has drawbacks: it is limited to microsecond resolution and cannot represent leap seconds, whereas astropy.time supports higher-precision representations of time and handles leap seconds. astropy.time also supports time scales such as TAI, which cannot be represented using datetime.datetime. Hence switching to astropy.time is required in order to support such additional features.
- Transition of every part of the SunPy code base to astropy.time.Time instead of datetime.datetime for the representation of time.
- Redesign of sunpy.time.parse_time to return an astropy.time.Time object when parsing any time-like input. In addition, the function should have a better API design than it has at present.
- Updating the modules under sunpy.net to use the modified version of parse_time.
- Including tests for the newly designed parse_time along with proper documentation. Documentation also needs to be updated or created wherever a SunPy module will be using astropy.time.Time instead of datetime.
Why we need a transition
A quick comparison between datetime.datetime and astropy.time.Time shows the following shortcomings of datetime relative to astropy.time.Time.
Features | datetime.datetime | astropy.time.Time |
Time formats | It supports only a limited set of time formats (Time format). It does not support time scales like Barycentric Dynamical Time (TDB) or International Atomic Time (TAI), among others. | astropy.time.Time supports many time scales and formats such as TAI, TDB and TCB, among others (Time format, Issue #2155). |
Level of Precision | It ignores leap seconds, since datetime cannot handle them. | It can handle leap seconds (Issue 993). astropy.time also lets the user set the precision. |
Support for Location | The current module does not support passing a location (an astropy.coordinates.EarthLocation instance). | Supports location as user input to the astropy.time.Time class (Location Code). |
The current issue also discusses some of the demerits of datetime.
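The leap-second limitation in the comparison above can be seen with the standard library alone. The sketch below is illustrative and not part of SunPy; the helper name datetime_accepts_leap_second is hypothetical. It uses the leap second that was inserted at the end of 31 December 2016.

```python
from datetime import datetime


def datetime_accepts_leap_second():
    """Return True if datetime can represent 2016-12-31T23:59:60 UTC."""
    try:
        datetime(2016, 12, 31, 23, 59, 60)
    except ValueError:
        # datetime only allows seconds in the range 0..59
        return False
    return True


# datetime treats every day as exactly 86400 seconds long, so the leap
# second inserted at the end of 2016-12-31 is silently ignored.
elapsed = (datetime(2017, 1, 1) - datetime(2016, 12, 31)).total_seconds()

print(datetime_accepts_leap_second())
print(elapsed)
```

The day in question actually lasted 86401 SI seconds. astropy.time.Time in the UTC scale is designed to account for such leap seconds, which is one motivation for the transition.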
Which modules under sunpy are currently using datetime?
- Most of the program files under sunpy.net use datetime, for example sunpy/net/tests/strategies.py, sunpy/net/dataretriever/sources/goes.py and sunpy/net/hek2vso/hek2vso.py.
- Modules like sunpy.time and sunpy.spectra, for example sunpy/time/timerange.py and sunpy/spectra/tests/test_callisto.py.
- Other modules like sunpy.io, sunpy.instr, sunpy.coordinates and many more under SunPy.
- Affiliated packages under SunPy, like solarbextrapolation and Irispy, also use datetime in their operations.
What my work will include as part of the transition
This will involve updating the current functions that work on datetime objects and switching them to use astropy.time.Time objects. Functions which currently take a datetime object as a parameter should be made to take a Time object as input, so such functions need to be modified to support Time objects.
Secondly, tests for functions operating on datetime objects need to be updated, since they will now be using Time objects. Some examples of test updates are discussed under the proposed solution for the redesign of parse_time.
The documentation for all such modules in the sunpy codebase also needs to be updated, since they will now be handling Time objects.
Why we need a redesign?
The parse_time function in the sunpy.time module is used for parsing specific time inputs and returns a datetime object as output. Since datetime has the demerits discussed above, we want parse_time to return an astropy.time.Time object instead. Most functions under sunpy currently use parse_time for parsing such time strings. Thus a redesign of parse_time will let the functions that use it handle only astropy.time.Time objects instead of datetime.
Which modules under sunpy use parse_time?
- Currently most of the time-related operations under SunPy are based on the sunpy.time.parse_time function. There are currently 342 files that use parse_time, 231 from net and 10 from physics. Most modules under sunpy.net, like vso, jsoc and helio, also use sunpy.time.parse_time.
What changes are required in the function?
- The main change is to make sunpy.time.parse_time return an astropy.time.Time object instead of a datetime object, as it does now.
- Secondly, sunpy.time.parse_time needs a more robust API and should provide the extra features discussed below.
- Making the current function more modular.
There are many instances of datetime throughout the sunpy modules. Locating those datetime instances and replacing them with the corresponding astropy.time.Time instances will form a major part of this project.
The table below proposes replacements for common datetime operations with the corresponding astropy.time.Time operations.
Datetime operations | Under which modules in sunpy | Astropy.Time operations |
---|---|---|
datetime.timedelta | sunpy.net, sunpy.time, sunpy.util, sunpy.lightcurve, sunpy.timeseries | datetime.timedelta supports additional formats like hours, milliseconds, weeks as opposed to TimeDelta which only supports jd and sec. See discussion below |
For datetime(year, month, day) tx = (year, month, day) | sunpy.time.tests | t.Time('{}-{}-{}'.format(*tx)) |
datetime.isoformat | sunpy.roi, sunpy.lightcurve, sunpy.timeseries, sunpy.instr | t.Time(time_string).isot |
Making Parse_time return astropy.Time instead of datetime.
The parse_time function currently accepts the input types listed below and checks the type of each input accordingly.
The following table shows the input types that parse_time currently supports, what it does with them, and how I propose to modify that by returning the corresponding astropy.time.Time object.
This can be seen here
What extra time strings will parse_time support if it returns astropy.Time
The current issue discusses the inability of parse_time to handle FITS compliant time formats. For example, parse_time('2011-01-01T00:10:00.000(UTC)') currently returns an error, but t.Time('2011-01-01T00:10:00.000(UTC)') works fine.
If we make provision for such time formats by passing these strings to astropy.Time, parse_time will be able to handle such inputs.
Making parse_time more modular
A fundamental shortcoming of parse_time is that it does not implement separate sub-functions for converting each kind of time input. It currently does all of that using if-else statements that check the type of the input and act accordingly. To make the implementation more modular, we can write separate functions for converting such inputs, which the user can also call directly whenever required. Many of these changes have been proposed in this PR.
Separate functions that we can implement include the following:
Functions | Input | Output |
convert_time_pandasTimestamp | pandas.Timestamp | astropy.Time |
convert_time_pandasSeries | pandas.Series | astropy.Time |
convert_time_date | datetime.date | astropy.Time |
convert_time_datetime | datetime.datetime | astropy.Time |
Extra features in parse_time
- First of all, provision should be made for parse_time to handle FITS-compliant time formats.
- Secondly, I will discuss with the mentors which other time inputs they would like parse_time to accept apart from the current ones, and then work on those proposed inputs.
- parse_time should be made to use the single dispatch module (functools.singledispatch), since SunPy will be moving over to Python 3 in version 0.9. This issue discusses this.
- I plan to add some functions for setting the scale and format of Time objects, which will give the user some freedom to set the scales and formats of the time strings. This is subject to change as per reviews from the mentors.
Updating functions using parse_time
This would involve updating all the functions under sunpy which use parse_time. Since parse_time will now return an astropy.time.Time object, such functions need to be carefully refactored, since they have been using the datetime objects returned by the older version of parse_time. The documentation of such functions will need to be updated. New tests may also need to be written for the updated functions.
Documenting and Testing Parse_time Function
This would involve updating the documentation for parse_time. Since I plan to make the parse_time functionality more modular, the docs for all the separate time conversion functions need to be written. Along with that, new tests need to be written for the new conversion functions (as in this case, updating all the test functions).
Some examples of test updates which I implemented can be found here - Test Updates
Time Period | My Work Plan |
May 14 - Jun 11 | |
Jun 11 - Jun 15 | Phase 1 Evaluation: Have the modified version of parse_time ready and address any bugs introduced by the new functions. Once it is set up, I plan to start working on the tests for the newly designed function, including tests for each of the separate time conversion functions for parse_time. Apart from this, I will try to complete as much documentation as possible. This will include documentation of the input and output types, along with links to the astropy Time docs for particular conversion operations, if required. |
Jun 15 - Jul 9 | |
Jul 9 - Jul 13 | Phase 2 Evaluation: The remaining time will be dedicated to checking for any bugs or breaks in the code where the transitions are made, and to covering any remaining related issues. |
Jul 13 - August 6 | |
August 6 - 14 | Final Week: Check for any bugs that could not be addressed before. Any pending work or issues will be addressed during this time. |
I have not participated in GSoC before. This is the first time that I would be participating in GSoC.
Are you also applying to other projects?
No. This is the only project and SunPy under OpenAstronomy is the only organization that I have applied for.
- I don't have any other internships or work for the summer. I don't have any plans to go on vacation either.
- My classes for the new semester will begin around August 2nd, but I will still be able to give sufficient time to the project, as the academic load is very light during the first few weeks of the semester. Hence it will not be much of a problem during the final week. I will easily be able to spare 35-40 hours per week for the project.
- Also, because my summer vacation starts on May 7, I will start working on the project early so that I can try to complete it well before the deadline (around 1-2 weeks before).
- I have my semester exams from around 22nd April to 1st May, so I will not be able to contribute much time to the project during this period. Still, I will try to devote 2-3 hours on weekdays to my work.
- SunPy is the only organization and I am applying for another project in SunPy.
Yes, I am eligible to receive payments from Google. For any queries, clarifications or further explanations, feel free to contact me at prateekkol21@gmail.com .