light_curve_ml

Machine learning for light curves.

The project currently focuses on broad classification of astronomical objects and has been in hibernation since May 2018. Like a bear.

Requirements

  • Python 3
  • SQLite command line tool (optional)

Local install

  • Set the LCML environment variable to the path of the repo checkout (e.g., export LCML=/Users/*/code/light_curve_ml)
  • cd $LCML && pip install -e . --user
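
To confirm the editable install succeeded, importing the top-level package is a quick, optional check; it should exit silently if everything is in place:

python3 -c "import lcml"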

AWS Ubuntu install

See instructions in conf/dev/ubuntu_install.txt

Running ML pipeline

Supervised and unsupervised machine learning pipelines are run via the run_pipeline.py entry point. It expects the path to a job (config) file and a file name for logger output. For example:

python3 lcml/pipeline/run_pipeline.py --path conf/local/supervised/macho.json --logFileName super_macho.log

Job File

The pipeline expects a job file (macho.json in the above example) specifying the configuration of the pipeline and a detailed declaration of experiment parameters.

The specified job file supersedes and overrides the default job file (conf/common/pipeline.json) recursively on a per-field basis, so any, or none, of the default fields may be overridden.
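
As an illustration of that merge, here is a minimal Python sketch of a per-field recursive override, assuming the two files are loaded as plain nested dicts (the helper name mergeDicts is illustrative, not the pipeline's actual function):

import json

def mergeDicts(default, override):
    """Recursively overlay override onto default, one field at a time."""
    merged = dict(default)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = mergeDicts(merged[key], value)  # recurse into nested sections
        else:
            merged[key] = value  # values present in the job file win
    return merged

with open("conf/common/pipeline.json") as f:
    defaults = json.load(f)
with open("conf/local/supervised/macho.json") as f:
    job = json.load(f)

effectiveConfig = mergeDicts(defaults, job)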

Sections

Job files have the following structure (a skeleton sketch follows the list):

  • globalParams - Parameters used across multiple pipeline stages
  • database - All db config and table names
  • loadData - Stage converting raw data into coherent light curves
  • preprocessData - Stage cleaning and preprocessing light curves
  • extractFeatures - Stage extracting features from cleaned light curves
  • postprocessFeatures - Stage further processing extracted features
  • modelSearch - Stage testing several ML models with differing hyperparameters
    • function - search function name
    • model - ML model spec including non-searched parameters
    • params - parameters controlling the model search
  • serialization - Stage persisting ML model and metadata to disk
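
For orientation, a hedged skeleton of a job file's top-level shape, written as a Python dict mirroring the JSON (section names come from the list above; the empty values are placeholders, not the real defaults):

jobSkeleton = {
    "globalParams": {},
    "database": {},
    "loadData": {},
    "preprocessData": {},
    "extractFeatures": {},
    "postprocessFeatures": {},
    "modelSearch": {
        "function": "",  # search function name
        "model": {},     # ML model spec, including non-searched parameters
        "params": {},    # parameters controlling the model search
    },
    "serialization": {},
}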

Pipeline 'stages' are customizable processors. Each stage definition has the following components (see the sketch after this list):

  • skip - Boolean determining whether the stage should execute
  • params - stage-specific parameters
  • writeTable - name of db table to which output is written
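
An individual stage entry would then look roughly like this sketch (the table name and parameter are hypothetical placeholders):

preprocessStage = {
    "skip": False,                    # False means the stage executes
    "params": {"removeBogus": True},  # hypothetical stage-specific parameter
    "writeTable": "clean_lcs",        # hypothetical output table name
}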

Example Jobs

Some representative job files provided in this repo include:

  • local/supervised/fast_macho.json - Runs a tiny portion of the MACHO dataset through all supervised stages. Useful for pipeline debugging and integration testing.
  • local/supervised/macho.json - Full supervised learning pipeline for the MACHO dataset. Uses the feets library for feature extraction and random forests for classification.
  • local/supervised/ogle3.json - Ditto for OGLE3
  • local/unsupervised/macho.json - Unsupervised learning pipeline for MACHO focused on Mini-batch KMeans and Agglomerative clustering

Other Scripts

  • lcml.data.acquisistion - Scripts used to acquire and/or process various datasets including MACHO, OGLE3, Catalina, and Gaia
  • lcml.poc - One-off proof-of-concept scripts for various libraries

Logging Config

The LoggingManager class allows convenient customization of Python Logger objects. The default logging config is specified in conf/common/logging.json. This config should contain the following main keys:

  • basicConfig - values passed to logging.basicConfig
  • handlers - handler definitions with a type attribute, which may be either stream or file
  • modules - list of module specific logger level settings

Main modules should initialize the manager by invoking LoggingManager.initLogging at the start of execution before logger objects have been created.
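
A minimal sketch of that ordering, assuming LoggingManager can be imported from an lcml utility module and that initLogging takes no required arguments (both the import path and the signature are assumptions, not the confirmed API):

# Hypothetical import path; adjust to wherever LoggingManager actually lives.
from lcml.utils.logging_manager import LoggingManager

import logging

def main():
    # Configure logging first, before any module creates its logger objects.
    LoggingManager.initLogging()
    logger = logging.getLogger(__name__)
    logger.info("pipeline starting")

if __name__ == "__main__":
    main()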