GSoC 2016 Ideas Page
Suggested Mentor(s): Stuart Mumford, Dan Ryan, Andrew Inglis, Jack Ireland
Difficulty: Beginner
Astronomy knowledge needed: None
Programming skills: Python
The Lightcurve class is one of the three core datatypes in SunPy, along with Map and Spectra. Lightcurve is designed to read in, process and store metadata related to solar physics time series data. Currently, Lightcurve uses the pandas library as its underlying data structure; however, this is subject to change in the future.
Much like the map submodule, lightcurve needs to be able to read in various supported data formats (such as FITS, ASCII and others in the future), store their metadata and give users unified access to this metadata independently of the original source of the data.
As currently implemented (as of 0.6), the lightcurve module performs three core tasks:
- Download the raw data
- Read this data into a pandas dataframe
- Store the metadata obtained with the data.
As of the SunPy 0.7 release, the first stage will be moved out of lightcurve and into the net subpackage as part of the UnifiedDownloader Pull Request. This leaves lightcurve in a similar position to map, where the data acquisition is not part of the core data type and is managed separately.
The objective of this project is to re-implement the core of the lightcurve submodule, such that it no longer contains the code to download data from the internet. The lightcurve module should be able to open files from disk that have been downloaded using the new UnifiedDownloader submodule. The lightcurve factory must be able to read files from multiple sources, some of which can be auto-detected and some of which cannot. The lightcurve module must also be able to combine multiple files into a single timeseries.
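The dispatch described here can be pictured with a small sketch. It is only an illustration under stated assumptions: every class and method name below is hypothetical, it works on already-parsed rows and metadata rather than real files, and it is not the API defined in SEP 7. Each source class says whether it recognises a given header (auto-detection), and the caller can name a source explicitly when detection is not possible.

```python
class GenericLightCurveSketch:
    """Hypothetical stand-in for a generic lightcurve; not SunPy's class."""

    def __init__(self, rows, meta):
        self.rows = rows  # list of (time, value) tuples
        self.meta = meta  # header-like dictionary

    @classmethod
    def is_datasource_for(cls, meta):
        # The generic class never claims a file during auto-detection.
        return False


class GOESLikeSketch(GenericLightCurveSketch):
    """Hypothetical source whose files can be recognised from the header."""

    @classmethod
    def is_datasource_for(cls, meta):
        return meta.get("instrument") == "GOES"


class LightCurveFactorySketch:
    """Hypothetical factory: picks a subclass by name or by auto-detection."""

    def __init__(self):
        self._sources = {}

    def register(self, name, cls):
        self._sources[name] = cls

    def __call__(self, rows, meta, source=None):
        # An explicit source wins; this covers files that cannot be detected.
        if source is not None:
            return self._sources[source](rows, meta)
        # Otherwise ask each registered class whether it recognises the header.
        for cls in self._sources.values():
            if cls.is_datasource_for(meta):
                return cls(rows, meta)
        # Fall back to the generic container.
        return GenericLightCurveSketch(rows, meta)


factory = LightCurveFactorySketch()
factory.register("goes", GOESLikeSketch)
detected = factory([(0, 1.0)], {"instrument": "GOES"})
forced = factory([(0, 1.0)], {}, source="goes")
```

Both calls return a GOESLikeSketch instance: the first through the header check, the second because the caller named the source.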
Expected Outcomes
Someone undertaking this project will complete the following tasks:
- Become familiar with the UnifiedDownloader code. If it has not been accepted into the SunPy codebase, complete the remaining tasks for this to be achieved.
- Write a factory class for lightcurve similar to the sunpy.map.Map class. This class will be a generic constructor for lightcurve, allowing the user to instantiate any one of the many subclasses of GenericLightcurve present in sunpy.lightcurve.sources. The API design for the factory class is in SEP 7.
- Design and develop a robust method of dealing with lightcurve metadata, which can handle joining different parts of timeseries from different files, each with their own metadata. (See #1122)
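One possible shape for the metadata task is sketched below. It assumes, purely for illustration, that each file contributes one metadata dictionary valid over a time interval, and that joining two timeseries keeps every dictionary alongside its interval instead of merging them. The class names are hypothetical and are not taken from SunPy or from issue #1122.

```python
from dataclasses import dataclass, field


@dataclass
class MetaSegment:
    """Metadata from one file and the time interval it covers."""
    start: float
    end: float
    meta: dict


@dataclass
class TimeSeriesMetaSketch:
    """Hypothetical container holding per-file metadata for a joined series."""
    segments: list = field(default_factory=list)

    def concatenate(self, other):
        # Keep every segment from both inputs, ordered by start time.
        merged = sorted(self.segments + other.segments, key=lambda s: s.start)
        return TimeSeriesMetaSketch(merged)

    def find(self, time):
        # Return the metadata of every file whose interval contains `time`.
        return [s.meta for s in self.segments if s.start <= time <= s.end]


later = TimeSeriesMetaSketch([MetaSegment(10, 20, {"file": "b"})])
earlier = TimeSeriesMetaSketch([MetaSegment(0, 10, {"file": "a"})])
joined = later.concatenate(earlier)
```

With this layout no header is lost when files are combined, and a lookup at a time covered by two files returns both headers, which leaves the decision about conflicting keys to the caller.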
A successful proposal for this project will demonstrate that the applicant has understood the mechanism behind the Map factory as already implemented in SunPy, and will present a timeline of what needs to change in Lightcurve to mirror the design of Map and follow the design for Lightcurve in SEP 7.
Suggested Mentor(s): Stuart Mumford, Simon Liedtke, Steven Christe
Difficulty: Intermediate
Astronomy knowledge needed: None
Programming skills: Python, some database design knowledge would be helpful.
The database module provides functionality to users to manage collections of files on disk in a way not reliant upon folder structure and file name. The database allows users to find files on disk by either physical parameters, such as wavelength and time, or properties of the instrument, such as name and spacecraft. It also allows more complex queries by enabling searches of the raw metadata associated with the files.
The SunPy database will also act as a proxy for some web services supported by SunPy. When used like this, the database module takes a user query, downloads the data from the web service, stores it in the database, and then returns the results of the query to the user. SunPy contains clients to various web services. The first and primary web service SunPy supported was the Virtual Solar Observatory (VSO), which is the web service the database was originally designed to support. Since the original development of the database module, the database has also been extended to support the HEK client.
The SunPy web clients use a system named attrs (an abbreviation for attributes) to compose queries. This attrs system is also used by the database to perform queries on the database, with some of the attrs shared between the VSO client and the database.
Recently, a new downloader front end (originally named UnifiedDownloader, now affectionately known as Fido) has been developed. This provides a factory class with which various download clients (such as the VSO) can register, providing information about which attrs and attr values each client supports. Using this approach, the Fido downloader provides a single interface to the many different services SunPy supports.
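The registration idea can be illustrated with a short sketch. It is not Fido's implementation: the class name, the method names and the use of plain dictionaries in place of attrs objects are all assumptions made for the example. Each client declares which attributes it understands and, optionally, which values it accepts; the factory then selects the clients able to serve a query.

```python
class DownloaderFactorySketch:
    """Hypothetical registry mapping queries to the clients that can serve them."""

    def __init__(self):
        self._clients = []

    def register(self, client, supported):
        # `supported` maps an attribute name to a set of accepted values,
        # or to None when the client accepts any value for that attribute.
        self._clients.append((client, supported))

    def clients_for(self, query):
        # A client matches only if it accepts every attribute in the query.
        return [
            client
            for client, supported in self._clients
            if all(self._accepts(supported, k, v) for k, v in query.items())
        ]

    @staticmethod
    def _accepts(supported, name, value):
        if name not in supported:
            return False
        allowed = supported[name]
        return allowed is None or value in allowed


registry = DownloaderFactorySketch()
registry.register("vso", {"time": None, "instrument": None})
registry.register("goes", {"time": None, "instrument": {"goes"}})
goes_clients = registry.clients_for({"time": (0, 1), "instrument": "goes"})
aia_clients = registry.clients_for({"instrument": "aia"})
```

In this toy setup a GOES query matches both registered clients, while an AIA query matches only the general-purpose one.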
The first part of this project will be to update the database module to support the new Fido interface, specifically by using Fido inside the database to retrieve data.
The second part of the project will be to update the caching mechanism implemented in the database module. The current caching system serialises the user's VSO query and stores it as JSON. When the user makes another query, it is compared to the cache of serialised queries and, if a match is found, the results from the cached query are returned. This mechanism is limiting: if the user requests 100 records in query A and 100 records in query B, but 50 of the records are common to both queries (e.g. two overlapping time windows), then those 50 records will be re-downloaded, because the cache of query A will not match query B. The updated caching system will store the records a query returns (before the data is downloaded) and then link the results of a query to the records in the database (once the data has been downloaded). Then, when records are retrieved from a web service, any records that are already stored in the cache table can be skipped and returned from the database instead. This will allow the caching of partial queries rather than whole queries as is currently implemented.
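The difference between query-level and record-level caching can be shown with a minimal sketch. It assumes that each record returned by a web service carries a stable identifier, and it uses an in-memory dictionary in place of a database table; the class and the `download` callable are hypothetical.

```python
class RecordCacheSketch:
    """Hypothetical record-level cache keyed by record identifier."""

    def __init__(self):
        self._files = {}  # record id -> downloaded file (or path)

    def fetch(self, record_ids, download):
        # Only records that are not yet cached need to be downloaded.
        missing = [rid for rid in record_ids if rid not in self._files]
        for rid in missing:
            self._files[rid] = download(rid)
        # Every requested record is now served from the cache.
        return [self._files[rid] for rid in record_ids], missing


downloaded = []


def fake_download(rid):
    # Stand-in for a web service call; records which ids were fetched.
    downloaded.append(rid)
    return f"file-{rid}"


cache = RecordCacheSketch()
files_a, missing_a = cache.fetch(range(0, 100), fake_download)    # query A
files_b, missing_b = cache.fetch(range(50, 150), fake_download)   # query B
```

Query B overlaps query A in 50 records, so only the 50 new records are downloaded and 150 downloads are made in total, instead of 200 with whole-query caching.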
This project aims to achieve the following things:
- Update the current implementation of the database using the VSO attributes to use the slightly refactored Fido attributes and use Fido inside the database to download data from the VSO.
- Implement a new caching mechanism based on the results of queries with Fido rather than the current caching, which is based upon the VSO query.
A successful proposal will schedule updates to the database package in small sections, rather than in one large pull request. The work should be understood and broken down into individual sections.
There are various other maintenance tasks which need undertaking (https://github.com/sunpy/sunpy/labels/Database) which would be a good way for someone interested in this project to familiarise themselves with the codebase.
Suggested Mentor(s): David Perez-Suarez, Jack Ireland
Difficulty: Beginner-Intermediate
Astronomy knowledge needed: none
Programming skills: Python
Real time data is very useful for space weather operations. SunPy provides access to data through different virtual observatories or services (like sunpy.net.vso or sunpy.net.hek) or by accessing data archives directly. Fido (formerly called UnifiedDownloader) provides a single point of access to them all. However, this needs to be extended to other data archives, and logic needs to be implemented so that, depending on the time range requested, it downloads the data from the real-time archives or from the full archive.
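The time-range logic could look something like the sketch below. The 30-day window is an arbitrary assumption chosen for the example (real archives differ in how long data stays in the real-time store), and the function name and return values are hypothetical.

```python
from datetime import datetime, timedelta


def choose_archives(start, end, now, realtime_window=timedelta(days=30)):
    """Pick which archive(s) cover the requested interval [start, end]."""
    # Data older than the cutoff is assumed to live in the full archive only.
    cutoff = now - realtime_window
    if end <= cutoff:
        return ["full"]
    if start >= cutoff:
        return ["realtime"]
    # The request straddles the cutoff, so both archives are needed.
    return ["full", "realtime"]


now = datetime(2016, 3, 1)
old_request = choose_archives(datetime(2015, 1, 1), datetime(2015, 2, 1), now)
recent_request = choose_archives(datetime(2016, 2, 10), datetime(2016, 2, 20), now)
mixed_request = choose_archives(datetime(2016, 1, 1), datetime(2016, 2, 20), now)
```

Passing `now` explicitly keeps the function deterministic and easy to test; a real client would presumably default it to the current time.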
Additionally, this project should produce some visualisation tools to combine data from different sources. Some examples are overlays of active regions on top of solar images (like in SolarMonitor), GOES X-ray flux with active region numbers on the detected flares (like in Latest Events), and the latest features available from HEK on top of a map (e.g. iSolSearch).
In summary, this project has two objectives:
- Implementation of real time archives and logic on Fido.
- Creation of visualisation tools to represent real-time data.
Familiarisation with the unidown branch and the matplotlib library will help you to create a proper timeline of how much time it will take to implement, test and document each part of the project.
Suggested Mentor(s): Drew Leonard, Will Barnes
Difficulty: Beginner
Astronomy knowledge needed: Some knowledge of coronal emission processes would be beneficial
Programming skills: Python.
The CHIANTI atomic physics database is a valuable resource for solar physics. The CHIANTI database holds a large amount of information on the physical properties of different elements in different ionisation states and enables the calculation of various parameters from this information. Using CHIANTI it is possible to calculate the spectra of various types of solar plasma (e.g., flare, quiet sun, etc.) from the observed elemental abundances and ionisation states. These synthetic spectra are essential for calculating response functions of various instruments. An instrument's wavelength response function describes how much light emitted at a given wavelength is measured by the instrument. Similarly, the temperature response function describes the instrument's sensitivity to light emitted by plasma at a particular temperature. These response functions play a vital role in correctly interpreting observations, as does proper calculation of these functions.
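The link between the two response functions can be made concrete with a small numerical sketch. It assumes the commonly used relation K(T) = ∫ G(λ, T) R(λ) dλ, where R(λ) is the wavelength response and G(λ, T) is the plasma emissivity at temperature T; this relation is not stated in the text above. The arrays below are synthetic placeholders, not CHIANTI output, and the function names are hypothetical.

```python
import numpy as np


def trapezoid(y, x):
    """Trapezoidal integration of y over x along the last axis."""
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * np.diff(x), axis=-1)


def temperature_response(wavelength, wavelength_response, contribution):
    """Integrate emissivity times wavelength response over wavelength.

    `contribution` has shape (n_temperature, n_wavelength) and would come
    from a spectral code such as ChiantiPy in a real calculation.
    """
    return trapezoid(contribution * wavelength_response, wavelength)


# Synthetic inputs: a flat response and flat emissivity on [0, 2].
wavelength = np.linspace(0.0, 2.0, 5)
flat_response = np.ones_like(wavelength)
flat_emissivity = np.ones((3, wavelength.size))  # three temperatures
k_of_t = temperature_response(wavelength, flat_response, flat_emissivity)
```

With flat inputs on an interval of length 2, each of the three temperature bins integrates to 2, which is a convenient sanity check before substituting real spectra.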
Currently, SunPy has no implementation of instrument response functions. This project would develop the routines necessary to calculate response functions using the Python interface to the CHIANTI database, ChiantiPy. The primary implementation of this would be to produce default wavelength and temperature response functions for the Atmospheric Imaging Assembly instrument. A detailed discussion of the AIA response functions can be found in Boerner et al 2012 and Boerner et al 2013.
Other potential applications of ChiantiPy in SunPy include:
- Generalisation of the code to produce response functions using arbitrary values of physical parameters (elemental abundances, etc.).
- Calculation of response functions for other instruments.
- Conversion of ChiantiPy spectra objects to SunPy Spectra objects.
Expected Outcomes: This project would facilitate SunPy becoming independent from Solar SoftWare (SSW) for analysing AIA data, particularly with respect to inferring plasma properties such as temperature and density.
A successful proposal will outline a schedule for implementing at least a single set of temperature and wavelength response functions for AIA; response functions for arbitrary plasma conditions would be a bonus. Familiarity with CHIANTI, ChiantiPy and SSW's implementation of the response functions will help to properly assess how long it will take to recreate them in SunPy.
Suggested Mentor(s): Jose Iván Campos Rozo (National Astronomical Observatory, National University of Colombia), Santiago Vargas Domínguez (National Astronomical Observatory, National University of Colombia), David Pérez Suárez.
Difficulty: Intermediate
Astronomy knowledge needed: None
Programming skills: Python, basic knowledge of qt4, pyqt4, qt designer
The Local Correlation Tracking (LCT, November & Simon, 1988) technique is a robust method used to study the dynamics of structures in a time series of images. By tracking pixel displacements using a correlation window, LCT can determine proper motions and generate flow maps of horizontal velocities. This procedure is used to study the dynamics of plasma in the solar photosphere at different spatial scales, e.g. the analysis of granular and supergranular convective cells, meridional flows, etc. A widget implemented in Python has been developed. It provides a user-friendly graphical user interface (GUI) to control various parameters for the process of calculating flow maps of proper motions for a series of filtergrams (data cube). Our purpose is to implement this tool in SunPy using its structure and to improve it with some more options, i.e. masks, statistics, histograms, contours and multi-plots. Although an initial version is already developed, our proposal is to focus on the efficient integration of the code in the SunPy libraries. The code (without widget files yet) is at https://github.com/Hypnus1803/flow_maps
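The core step of correlation tracking, finding the displacement that maximises the cross-correlation between two frames, can be sketched in a few lines. This is a deliberately simplified illustration and not the code in the repository above: it uses a single window covering the whole frame, circular (FFT-based) correlation and integer-pixel shifts, whereas a full LCT implementation applies a localised window at each position and interpolates the correlation peak to sub-pixel accuracy. The function name is hypothetical.

```python
import numpy as np


def window_displacement(frame_a, frame_b):
    """Integer-pixel shift (dy, dx) that best maps frame_a onto frame_b."""
    # Remove the mean so the correlation peak reflects structure, not offset.
    a = frame_a - frame_a.mean()
    b = frame_b - frame_b.mean()
    # Circular cross-correlation computed through the Fourier transform.
    corr = np.fft.ifft2(np.conj(np.fft.fft2(a)) * np.fft.fft2(b)).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative shifts.
    return tuple(
        int(p) if p <= n // 2 else int(p) - n for p, n in zip(peak, corr.shape)
    )


rng = np.random.default_rng(0)
first = rng.random((32, 32))
second = np.roll(first, shift=(3, -2), axis=(0, 1))  # known displacement
shift = window_displacement(first, second)
```

Dividing such a displacement by the time between frames and multiplying by the pixel scale gives a velocity; repeating the estimate over many windows yields a flow map.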
Expected Outcomes: Efficient integration of the code into the SunPy libraries.
Suggested Mentor(s): Stuart Mumford, Steven Christe
Difficulty: Beginner
Astronomy knowledge needed: None
Programming skills: Python
The SunPy documentation needs some love. The purpose of the
Expected Outcomes
Someone undertaking this project will complete the following tasks:
- Research and implement a gallery of SunPy examples which support jupyter notebooks.
- Update the docs (insert tasks).
- The net documentation needs to be refactored.
- Update the website (insert tasks).
A successful proposal for this project will demonstrate that the applicant has understood the task and will present a timeline of what needs to change and be implemented.