GSoC 2021 Adwait Bhope
Organization: OpenAstronomy Sub-organization: sunpy
Mentors:
- Dan Ryan (@DanRyanIrish)
- Stuart Mumford (@Cadair)
Adwait Bhope adwaitbhope@gmail.com
GitHub: @adwaitbhope Matrix: @adwaitbhope
Final-year CS undergraduate at the University of Pune (SPPU), enrolled at Pune Vidyarthi Griha's College of Engineering and Technology, India.
- Have you participated in GSoC previously? No, this is the first time I'm applying to an organization.
- Are you also applying to other projects? Yes, I plan to apply to another one of sunpy's projects: Update sunraster to ndcube 2.0 under OpenAstronomy.
- Are you eligible to receive payments from Google? Yes, I'm eligible.
- How much time do you plan to invest in the project before, during, and after the Summer of Code? I will be able to satisfy the official GSoC guideline of 18 hours per week throughout. I expect my college exams to be over before the coding period begins on 7th June. I'm currently interning at a company as a Software Engineer, but I believe it won't hamper my commitment to GSoC. After the Summer of Code, I'll keep contributing with code as well as other things like docs. The community is very active and new work is always available.
Being in my final year, I've accumulated a fair amount of programming experience so far. I have worked on multiple projects across different domains like mobile app development, backend web development, and machine learning.
I love participating in hackathons and developing solutions. Our team recently won the Smart India Hackathon (India's biggest hackathon), where I was the team leader. We worked on a machine-learning-based solution for calculating a safe driving speed for a vehicle, based on its surroundings. We're currently writing a paper on our approach and hope to get it published. We also went on to win the ASEAN India International Hackathon a few weeks ago.
My experience with Python started with learning backend web development using Django. I've gotten proficient with it now. I'm currently interning at a company where I'm the primary person tasked with writing REST APIs with Python. A few months ago, we also developed the initial stage of a mutual fund portfolio management portal, whose development was sponsored by Principal Global Services, a Fortune 500 FinTech company. The entire backend for this application was written using Python.
I have some more Python experience related to machine learning, much of it from my previous internship at Resolute AI in Bangalore, India. This mostly entailed work with libraries like TensorFlow, Keras, numpy, pandas, etc. I worked on developing a model that tracked PPE kit compliance for people working in the healthcare industry.
I'm active in the clubs on my campus, and I serve as the Technical Head of the Google Developer Student Club. We organize and conduct technical workshops for junior students. Through this, I've had the opportunity to reach hundreds of students and help them through their journey.
Some of my Python projects:
- A desktop GUI app for Object Recognition https://github.com/adwaitbhope/object_recognition_poc
- A client-side app for running Object Detection https://github.com/adwaitbhope/sih-client-side-object-detection
- REST backend for a mobile app https://github.com/adwaitbhope/denizen-backend
Here is a summary of some relevant subjects I studied at college:
- Engineering Mathematics (I, II and III), Engineering Physics
- Data Structures & Algorithms, Object-Oriented Programming
- Computer Graphics, Engineering Graphics
- Design & Analysis of Algorithms, Theory of Computation
- Software Modelling & Design, Software Testing & Quality Assurance
These courses have strengthened my grasp of core computer science concepts and programming logic. Also, I believe that some of these courses like Mathematics, Physics, and Graphics will specifically help me with the science side of things as I contribute to this organization.
Also, here is a link to my resume for reference.
- (Merged) Fixed a bug in DataManager https://github.com/sunpy/sunpy/pull/5089
- Added namespaces for downloaded files https://github.com/sunpy/sunpy/pull/5111
- (Misc.) Invalid link on the GSoC webpage of this project https://github.com/OpenAstronomy/openastronomy.github.io/issues/285
- (Misc.) Fixed a typo in a function doc reference https://github.com/sunpy/ndcube/pull/419
I'm currently working on my final year academic project with our mentor from Persistent Systems Ltd., India. We are collecting spectral data of galaxies in the optical region from the Sloan Digital Sky Survey, and working on setting up a Machine Learning pipeline that can predict the age of the galaxy. In fact, we are using astropy, one of the sub-organizations here, to handle the FITS files.
I have always been excited about stars, galaxies, and other astronomical objects, and I keep reading about them. Working on the project above deepened that interest, and I was thrilled to discover OpenAstronomy among the participating organizations.
In our project, we had to resample all the spectral data onto the same grid so that it could be fed to the ML algorithms. We used the Python package pysynphot to achieve this. When I saw a similar project under OpenAstronomy, choosing it was a no-brainer. I understand why this functionality is helpful because I needed it firsthand. I find it exciting that this will make it convenient for scientific researchers to tinker with their data.
- Why are you suited to work on this project? As stated above, I understand the motivation behind this project, I have good experience with Python and programming in general, and I also have some exposure to astronomy. I have made contributions to sunpy, and I feel comfortable with the math behind these WCS transformations. I believe this makes me a strong candidate for the project.
There are other packages (like pysynphot, which I mentioned above) that can resample data, but Astropy's reproject is probably the most useful one for ndcube. It accepts an NDCube along with a target WCS and projects the data onto it accordingly. This will make it much easier for ndcube to support resampling. The challenge of constructing a WCS object with the necessary modifications remains, as reproject does not help with that.
Currently, reproject supports multiple algorithms, such as interpolation, HEALPix projection, DeForest adaptive resampling, etc. For interpolation, it supports the orders "nearest-neighbor", "bilinear", "biquadratic", and "bicubic".
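To make the difference between the first two interpolation orders concrete, here is a minimal one-dimensional sketch in plain Python. It is not how reproject is implemented; the function `resample_1d` and its `order` labels are hypothetical and only mirror the names listed above. In one dimension, "bilinear" reduces to ordinary linear interpolation between the two neighbouring pixels.

```python
import math


def resample_1d(values, positions, order="bilinear"):
    """Sample `values` (given at integer pixel centres 0..n-1, n >= 2)
    at fractional pixel positions. Hypothetical helper for illustration."""
    last = len(values) - 1
    out = []
    for x in positions:
        x = min(max(x, 0.0), float(last))  # clamp to the sampled range
        if order == "nearest-neighbor":
            # Pick the value of the closest pixel centre (ties round up).
            out.append(float(values[math.floor(x + 0.5)]))
        elif order == "bilinear":
            lo = min(math.floor(x), last - 1)  # index of the left neighbour
            t = x - lo                         # fractional distance from it
            out.append((1 - t) * values[lo] + t * values[lo + 1])
        else:
            raise ValueError(f"unsupported order: {order!r}")
    return out


values = [0.0, 10.0, 20.0]
positions = [0.0, 0.25, 1.5, 2.0]
nearest = resample_1d(values, positions, order="nearest-neighbor")
linear = resample_1d(values, positions, order="bilinear")
```

Nearest-neighbour keeps only values that already exist in the input, while linear interpolation produces intermediate values, which usually matters when upscaling a spectral axis.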
I studied different coordinate systems and WCS from an article that @nabobalis shared with me. It gave me a clearer understanding of how these transformations work. I have put some code together that resamples an NDCube using bicubic interpolation. I upscaled the wavelength axis by a factor of 10, manually creating a new WCS that supports this transformation. For this, I modified the CDELT parameter, which gives the pixel width in world-coordinate units. I have also plotted a slice of the cube to compare the result.
The code and the graph can be found at this gist: https://gist.github.com/adwaitbhope/d056fd5a5a8eb5781a9ccc5615a644f0.
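The arithmetic behind that CDELT change can be sketched without any astronomy libraries. The example below models a single linear FITS-WCS axis, where world = CRVAL + CDELT * (pixel - CRPIX). The class `LinearAxis`, the function `upscale`, and the numbers are hypothetical and are not taken from the gist. Shifting CRPIX is an extra step assumed here so that the outer pixel edges stay at the same wavelength; the gist itself only changes CDELT.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LinearAxis:
    """One linear FITS-WCS axis: world = crval + cdelt * (pixel - crpix)."""
    crval: float  # world value at the reference pixel
    crpix: float  # reference pixel (FITS convention: 1-based pixel centres)
    cdelt: float  # world-coordinate step per pixel

    def pixel_to_world(self, pixel: float) -> float:
        return self.crval + self.cdelt * (pixel - self.crpix)


def upscale(axis: LinearAxis, factor: int) -> LinearAxis:
    """Return an axis with `factor` times more pixels over the same world range."""
    return replace(
        axis,
        cdelt=axis.cdelt / factor,                # each new pixel is narrower
        crpix=factor * (axis.crpix - 0.5) + 0.5,  # keep the pixel edges aligned
    )


# Hypothetical wavelength axis: 4 pixels, 1 nm wide, first pixel centred on 500 nm.
old = LinearAxis(crval=500.0, crpix=1.0, cdelt=1.0)
new = upscale(old, 10)

# Outer edges of the old 4-pixel grid and the new 40-pixel grid.
old_edges = (old.pixel_to_world(0.5), old.pixel_to_world(4.5))
new_edges = (new.pixel_to_world(0.5), new.pixel_to_world(40.5))
```

With these values the new axis has CDELT = 0.1 and CRPIX = 5.5, and both grids span the same wavelength range, so only the sampling density changes.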
- How do you plan to implement the project? First, I plan to closely study the different use cases for resampling. It is key to identify the types of WCS transformations they require. I will study this further to figure out how to create a new WCS that supports the resampled data. My first step will be to get an API up and running that solves a basic use case, something like a Minimum Viable Product. This can involve modifying NDCube's other parameters like extra_coords. The API can then be extended to more complicated use cases. There are some more things to plan before starting the implementation, such as deciding whether to support only integer values as the scaling factor. This will largely depend on how we are able to use reproject with ndcube.
- Does the project include API changes? Yes, this project will introduce a new API. In the original issue that referenced this topic, @DanRyanIrish (one of the mentors) has proposed a possible API. Under the hood, ndcube will construct a new WCS object and resample the data along with other parameters like extra_coords. This will make a powerful API that lets users manipulate all their data with a single function call. However, I don't expect any breaking changes to the existing API.
- Will you need additional software package requirements? Yes, the primary requirement will be Astropy's reproject, which is released as a separate package. This package implements the necessary mathematical algorithms we can use for resampling, thereby reducing effort and redundancy. Its API is easy to use with NDCube, as demonstrated in the gist that I have linked above.
Period | Plan | Date |
---|---|---|
Community Bonding Period | | 17th May - 7th June |
Coding Period | | 7th June - 16th August |
Week 1 | | 7th June - 14th June |
Week 2 | | 14th June - 21st June |
Week 3, 4 | | 21st June - 5th July |
Week 5 | | 5th July - 12th July |
First Evaluation | | 12th July - 16th July |
Week 6, 7 | | 12th July - 26th July |
Week 8 | | 26th July - 2nd August |
Week 9 | | 2nd August - 9th August |
Week 10 | | 9th August - 16th August |
Submission and Final Evaluation | | 16th August - 23rd August |
Post Google Summer of Code | | |
Primarily, I aim to gain more experience working with open-source communities. I have contributed a bit to other repositories before, and hope that Summer of Code serves as an entry point for me to give back more to the community.
Secondly, I'm excited about being mentored during this project. Even during my recent contributions to sunpy, I understood that developing software at this level is different from academic projects. I want to grow these skills and hope to learn the difference between good, bad, and great code.