
GSoC 2018 prateekiiest

Nabil Freij edited this page Feb 22, 2024 · 4 revisions

Organization: SunPy in OpenAstronomy

Project: Transition to Astropy Time

Mentors: @Cadair, @Nabobalis, @Punyaslok

Student Information


University Information

  • University: Indian Institute of Engineering Science and Technology, Shibpur

  • Major: Computer Science and Engineering

  • Current Academic Year : Third Year

  • Graduation Year: 2019


Experience In Programming

  • I have been involved in open source for the past 2 years and have contributed to various open source projects, including algorithm-based projects and Python-based desktop applications.

  • I completed two machine learning projects in Python that rely heavily on statistical analysis (pandas, numpy) and Jupyter Notebook. Both were projects for the Udacity Machine Learning Nanodegree: Titanic Survival Exploration and Boston Housing.

  • I recently worked on a personal project, Code Sleep Python, which attracted active participation from several communities. It is a Python project for building desktop applications and games.

Open Source experience

  • Contributing to organizations such as FOSSASIA and other small organisations: I have been contributing to open source organizations and repositories for the past two years. So far I have made 163 pull requests in total and worked on 72 issues.

  • Mentored students during open source events such as Hacktoberfest, Kharagpur Winter of Code and 24 Pull Requests, on some of my own personal projects written primarily in Python.

  • Had the opportunity to speak at several open source summits, and was invited to developer conferences including RISE Hong Kong 2017, FOSSASIA Open Tech and Google Developers Solve for India, among others.

  • I am currently the GitHub Campus Expert at my college and, with direct support from GitHub, I am helping to grow the community on campus by involving more people in open source and coding.


Interest In OpenAstronomy

OpenAstronomy, which consists of 8 sub-organisations, is a collaboration between open source astronomy and astrophysics projects that are used by researchers around the world to study our universe. The analysis of data obtained from observatories such as SDO and the Hubble Space Telescope supports many types of research, from forecasting solar storms to detecting planets around other stars.

Astronomy is at the frontier of science. There are new discoveries made all the time.

I have always loved the sky and have had a childhood fascination with astronomy. To me it feels amazing to see real-time data of the Sun, captured just a few hours earlier, on my computer screen and to be able to analyse it. The organisation inspires new developers like me to join the open source community and build the code base together with the lead developers, and that is where I find the joy of open source. Contributing to SunPy has been a great experience ever since I joined the SunPy community.

I did not qualify for my first GSoC with OpenAstronomy back in 2017, but I did not give up. I loved the project and wanted to contribute as much as I could. Having worked with the SunPy project for the past year, I have learnt many new things about open source, team management and writing larger pieces of code, and lastly the joy of having a contribution accepted.

OpenAstronomy, along with Google Summer of Code, will give me the opportunity to be a part of this large project on solar data analysis, and I consider it a privilege to contribute to such open-source software in the field of astronomy.

Contribution to Sunpy

I have been involved with the SunPy project for the past year, contributing since December 2016. I am delighted to have been a part of the latest SunPy releases and am grateful to have been acknowledged for my work.

| Pull Requests | Corresponding Issue | Status |
| --- | --- | --- |
| More Mapcubes Examples - Gallery Examples #2455 | More examples in the gallery on simple map and mapcube manipulation #2413 | Open |
| Remove Gamma usage in Map #2424 | Gamma in map doesn't do anything #2333 | Merged |
| Finding Local Peaks in Solar Data - Gallery Update #2339 | Suggested this example | Merged |
| Masking Hot Pixels | Found Bug in Example | Merged |
| Brightest pixel location may occur at multiple position | Found similar bug in Example | Merged |
| Brightest pixel location redundancy removed | Removed Redundancy in Examples | Merged |
| Added documentation for suds-py3 incompatibility | VSOClient.query returns no result in Python 3.5 | Merged |
| Update README.md Matrix Org linked | Enhancement | Merged |
| Update README.md | README badges broken | Merged |
| Added documentation for database/tests | database tests depend on data/tests dir | Merged |
| Update vso.py | Enhancement Documentation | Merged |
| Update time.py Removed extract_time | Remove extract_time function | Merged |
| Update rescale.py | reshape_image_to_4d_superpixel array seems broken | Merged |

Contribution to SunPy Website

| Pull Requests | Corresponding Issue | Status |
| --- | --- | --- |
| help section Docs link updated | Sunpy Documentation Link Showing Privacy Error | Merged |
| Sunpy Presentations and Talks Upload on the Site | Enhancement | Merged |
| Registry of Sunpy Affiliated Packages | Enhancement - Introducing Affiliated Packages Registry | Merged |
| Update bootstrap.css bootstrap version updated | Update bootstrap | Merged |
| Update about.html Community Link updated to matrix | Enhancement | Merged |

Abstract

Mentors : @Cadair, @Nabobalis, @Punyaslok

During analysis of solar data, the precision of the observation time is one of the fundamental factors determining the accuracy of the data collected. Hence we need to be as precise as possible about the time of observation when collecting data from solar observatories.

Most of SunPy uses the datetime.datetime object as its representation of time. However, datetime.datetime has drawbacks: it offers less precision than astropy.time, which supports much higher-precision representations of time and handles leap seconds. astropy.time also supports time scales such as TAI, which datetime.datetime cannot represent. Hence switching to astropy.time is required in order to support such additional features.

Milestones

The project will involve the following parts:

  • Transitioning every part of the SunPy code base to astropy.time.Time instead of datetime.datetime for the representation of time.

  • Redesigning sunpy.time.parse_time to return an astropy.time.Time object when parsing any time-like input. In addition, the function should have a better API design than it has at present.

  • Updating the modules under sunpy.net to use the modified version of parse_time.

  • Adding tests for the newly designed parse_time along with proper documentation. Documentation also needs to be updated or created wherever a SunPy module will use astropy.time.Time instead of datetime.

Detailed Description

Switching to Astropy.Time from Datetime

Why we need a transition

A quick comparison between datetime.datetime and astropy.time.Time shows the following drawbacks of datetime relative to astropy.time.Time.

| Features | datetime.datetime | astropy.time.Time |
| --- | --- | --- |
| Time formats | Supports only a limited set of time formats. It does not support time scales such as Barycentric Dynamical Time (TDB) or International Atomic Time (TAI), among others. | Supports many time scales, including TAI, TDB and TCB, among others. Time format Issue #2155 |
| Level of Precision | Ignores leap seconds, since datetime cannot handle them. | Handles leap seconds. Issue 993. A good thing about astropy.time is that the user can set the precision. |
| Support for Location | Does not support passing any location (an astropy.coordinates.EarthLocation instance). | Supports a location as user input to the astropy.time.Time class. Location Code |

The current issue also discusses some of the drawbacks of datetime.

Which modules under sunpy are currently using datetime?

  • Most of the program files under sunpy.net use datetime

    • sunpy/net/tests/strategies.py
    • sunpy/net/dataretriever/sources/goes.py
    • sunpy/net/hek2vso/hek2vso.py
  • Modules like sunpy.time and sunpy.spectra

    • sunpy/time/timerange.py
    • sunpy/spectra/tests/test_callisto.py
  • Other modules like sunpy.io, sunpy.instr, sunpy.coordinates and many more under SunPy

  • Affiliated packages of SunPy, such as solarbextrapolation and Irispy, also use datetime in their operations

What my work will include as part of the transition

This will involve updating the current functions that work on datetime objects and switching them to astropy.time.Time objects. Functions that currently take a datetime object as a parameter should be changed to take a Time object as input, so each such function needs to be modified to support Time objects.

Secondly, tests for functions operating on datetime objects need to be updated, since those functions will now use Time objects. Some examples of test updates are discussed under the proposed solution for the redesign of parse_time.

The documentation for all such modules in the sunpy codebase also needs to be updated, since they will now handle Time objects.

Redesign of **sunpy.time.parse_time** function

Why we need a redesign ?

The parse_time function in the sunpy.time module is used to parse specific time inputs and currently returns a datetime object. Since datetime has the drawbacks discussed above, we want parse_time to return an astropy.time.Time object instead. Most functions in sunpy currently use parse_time to parse time strings, so redesigning parse_time will let those functions handle only astropy.time.Time objects instead of datetime.

Which modules under sunpy use parse_time?

  • Currently most time-related operations in SunPy are based on the sunpy.time.parse_time function. There are currently 342 files that use parse_time, 231 from net and 10 from physics. Most modules under sunpy.net, such as vso, jsoc and helio, also use sunpy.time.parse_time.

What changes are required in the function ?

  • The main change is to make sunpy.time.parse_time return an astropy.time.Time object instead of the datetime objects it returns now.

  • Secondly, sunpy.time.parse_time needs a more robust API and should provide the extra features discussed below.

  • The current function should also be made more modular.


Proposed Solution

Proposed Solution for Transition from Datetime to Astropy.Time

There are many instances of datetime throughout the sunpy modules. Locating those instances and replacing them with the corresponding astropy.time.Time instances will form a major part of this project.

The table below proposes replacements for common datetime operations with the corresponding astropy.time.Time operations.

| Datetime operations | Under which modules in sunpy | Astropy.Time operations |
| --- | --- | --- |
| datetime.timedelta | sunpy.net, sunpy.time, sunpy.util, sunpy.lightcurve, sunpy.timeseries | datetime.timedelta supports additional formats like hours, milliseconds, weeks as opposed to TimeDelta which only supports jd and sec. See discussion below |
| For datetime(year, month, day): tx = (year, month, day) | sunpy.time.tests | t.Time('{}-{}-{}'.format(*tx)) |
| datetime.isoformat | sunpy.roi, sunpy.lightcurve, sunpy.timeseries, sunpy.instr | t.Time(time_string).isot |

Proposed Solution for Redesign of sunpy.time.parse_time function

Making Parse_time return astropy.Time instead of datetime.

The parse_time function currently accepts several kinds of input and checks the type of the input to decide how to handle it.

The following table shows the input types that parse_time currently supports, what it does with each of them, and how I propose to modify that behaviour so that the corresponding astropy.time.Time object is returned.

This can be seen here


What extra time strings will parse_time support if it returns astropy.Time

The current issue discusses the inability of parse_time to handle FITS-compliant time formats. For example, parse_time('2011-01-01T00:10:00.000(UTC)') currently returns an error, but t.Time('2011-01-01T00:10:00.000(UTC)') works fine.

If we make provision for such time formats and apply astropy.time.Time to strings of this type, parse_time will be able to handle these inputs.
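One possible way to make such a provision, shown here only as a sketch, is to split the trailing scale off the string before handing it to Time. The helper `split_fits_time` is hypothetical and uses only the standard library; the resulting pair could then be passed on as `Time(bare_string, scale=scale)`.

```python
import re

# Matches a trailing time scale in parentheses, e.g. "(UTC)" or "(TAI)".
_SCALE_SUFFIX = re.compile(r"\((?P<scale>[A-Za-z]+)\)\s*$")


def split_fits_time(time_string, default_scale="utc"):
    """Hypothetical helper: return (bare_time_string, scale).

    If the string carries no "(SCALE)" suffix, it is returned unchanged
    (apart from surrounding whitespace) together with default_scale.
    """
    match = _SCALE_SUFFIX.search(time_string)
    if match is None:
        return time_string.strip(), default_scale
    bare = time_string[:match.start()].strip()
    return bare, match.group("scale").lower()


bare_string, scale = split_fits_time("2011-01-01T00:10:00.000(UTC)")
```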

Making parse_time more modular

A fundamental limitation of parse_time is that it does not implement separate sub-functions for converting the different kinds of time input. It currently does everything with if-else statements that check the type of the input and act accordingly. To make the implementation more modular, we can write separate functions for each conversion, which the user can also call directly whenever required. Many of these changes have been proposed in this PR.

The separate functions that we can implement include the following:

| Functions | Input | Output |
| --- | --- | --- |
| convert_time_pandasTimestamp | pandas.Timestamp | astropy.Time |
| convert_time_pandasSeries | pandas.Series | astropy.Time |
| convert_time_date | datetime.date | astropy.Time |
| convert_time_datetime | datetime.datetime | astropy.Time |

Extra features in parse_time

  • First of all, parse_time should be able to handle FITS time formats.
  • Secondly, I will discuss with the mentors which other input types they would like parse_time to accept apart from the current ones, and then work on those input types accordingly.
  • parse_time should be made to use single dispatch (functools.singledispatch), since SunPy will be moving over to Python 3 in version 0.9. This issue discusses this.
  • I plan to add functions for setting the scale and format of the Time objects, which will give the user some freedom to choose the scale and format of the parsed times. This is subject to change following reviews from the mentors.

Updating functions using parse_time

This will involve updating all the functions in sunpy that use parse_time. Since parse_time will now return an astropy.time.Time object, these functions need to be carefully refactored, because they have been using the datetime objects returned by the older version of parse_time. Their documentation will need to be updated, and new tests may need to be written for the updated functions.

Documenting and Testing Parse_time Function

This will involve updating the documentation for parse_time. Since I plan to make parse_time more modular, documentation needs to be written for each of the separate time conversion functions. Along with that, new tests need to be written for the new conversion functions (which in this case means updating all the existing test functions).

Some examples of test updates which I implemented can be found here - Test Updates

Timeline

Time Period My Work Plan
May 14 - Jun 11
  • First Week Dedicate this time to learning more about the project, working with the mentors and discussing the desired changes to parse_time with them. I will share my proposed design for returning an astropy.time.Time object and, with their agreement, start implementing the basic functionality. This period will also be used to understand the API changes required in parse_time and the design for adding the extra astropy.time.Time functionality proposed above.
  • Second Week Start refactoring the parse_time function to return an astropy.time.Time object. Start breaking the functionality of `parse_time` into separate time conversion functions, as proposed above.
  • Third Week and Fourth Week These two weeks will be divided according to the workload. Implement the time conversion for each type of time input and return the corresponding `astropy.time.Time` objects. Next, work on the additional features suggested by the mentors. This may include writing separate functions for converting times to the different scales or formats offered by `astropy.time.Time`.
  • Jun 11 - Jun 15 Phase 1 Evaluation Have the modified version of parse_time ready and address any bugs introduced by the new functions. Once that is done, I plan to start working on the tests for the newly designed function, including tests for each of the separate time conversion functions used by parse_time. Apart from this, I will try to complete as much documentation as possible. This will include documentation of the input and output types, along with links to the astropy Time docs for particular conversion operations where required.
Jun 15 - Jul 9
  • First Week Start updating the modules under sunpy to use astropy.time.Time. This will involve modifying the functions that currently take datetime as input so that they use astropy.time.Time objects. The bodies of these functions also need to be changed to support astropy.time.Time objects.
  • Second Week During this time I plan to implement the tests for all modules where the transition is required.
  • Third Week Dedicate this time to the documentation for the modules where the transition is required.
  • Fourth Week Update the examples. Since the current examples work on datetime objects, they need to be updated to use astropy.time.Time.
  • Jul 9 - Jul 13 Phase 2 Evaluation The remaining time will be dedicated to checking for bugs or breakages in the code where the transitions were made, and to covering any remaining related issues.
Jul 13 - August 6
  • First Week I plan to replace all usage of the previous parse_time function with the newly designed one. This will require changes to most of the sunpy.net modules, such as helio and vso. Much of the code in each of these modules will need small modifications to work with the newly designed function.
  • Second Week Work on increasing test coverage wherever it is required.
  • Third Week Provide more detailed documentation for the modules using parse_time, along with supporting examples for users to relate to.
  • August 6 - 14 Final Week Check for any bugs that could not be addressed earlier. Any pending work or issues will be addressed during this time.

    GSoC

    Have you participated previously in GSoC? When? With which project?

    I have not participated in GSoC before. This is the first time that I would be participating in GSoC.

    Are you also applying to other projects?

    No. This is the only project and SunPy under OpenAstronomy is the only organization that I have applied to.


    Commitment

    • I don't have any other internships or work for the summer. I don't have any plans to go on vacation either.

    • My classes for the new semester will begin around August 2nd, but I will still be able to give sufficient time to the project, as the academic load is very light during the first few weeks of the semester. Hence it will not be much of a problem during the final week. I will easily be able to spare 35-40 hours per week for the project.

    • Also, because my summer vacation starts on May 7, I will start working on the project early so that I can try to complete it well before the deadline (around 1-2 weeks before).

    • I have my semester exams from around 22nd of April to 1st of May, so I will not be able to contribute much time to the project during this period. I will still try to devote 2-3 hours on weekdays to my work.

    • SunPy is the only organization and I am applying for another project in SunPy.


    Eligibility

    Yes, I am eligible to receive payments from Google. For any queries, clarifications or further explanations, feel free to contact me at prateekkol21@gmail.com .
