Modeling, Mining, and Machine Learning in Astronomy
Tuesday-Thursday 1:25-2:40 pm
301 Space Sciences Building
520 Space Sciences Building
This course builds on a foundation of probability and statistics to explore, develop, and apply algorithms for discovering objects and events in astronomical data; for inferring sophisticated models of object populations using frequentist and Bayesian methods; and for visualizing and presenting results to address fundamental questions with persuasive, data-based arguments. Topics include time-series analysis; clustering and classification algorithms; genetic algorithms; Markov chain Monte Carlo methods; and neural networks. Analysis projects involve investigating simulated and real data using Python and Jupyter notebooks. The emphasis is on understanding the fundamentals of the algorithms and on developing the expertise to choose appropriate methods for top-level goals.
Lectures, reading material, and assignments will be posted here as we go.
Lecture 1: Introduction to the Course
Reading for the next two lectures:
Probability basics: here are several reference options.
After reading item 1, make a quick pass through item 2, 3, or 4.
Key features to be emphasized in class include random variables, probability density functions, examples such as the Gaussian distribution, and the Central Limit Theorem.
Relevance: we will treat data as 'processes' that are collections of random variables, so we need to be able to manipulate their probabilities.
1. Chapter 1 and portions of Chapter 5 from Gregory:
all sections except
5.5.4, 5.6, 5.7.3, 5.8.3, 5.8.4, 5.11.1, 5.13.2, 5.14, 5.15
2. Course Notes (slides 2 and 3 summarize the main elements we will use)
3. Another approach: Laws of Probability, Bayes' theorem, and the Central Limit Theorem
4. Another approach: Basics of Probability from CS229 Stanford
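As a quick illustration of the Central Limit Theorem emphasized above, here is a minimal sketch (not part of the course materials; sample sizes and seed are illustrative): averaging many draws from a decidedly non-Gaussian distribution yields an approximately Gaussian distribution of sample means.

```python
import numpy as np

rng = np.random.default_rng(42)

n_draws = 10_000   # number of sample means to accumulate
n_per = 100        # samples averaged per draw

# Uniform on [0, 1): mean 0.5, variance 1/12 (clearly not Gaussian).
means = rng.uniform(0.0, 1.0, size=(n_draws, n_per)).mean(axis=1)

# By the CLT, the sample means cluster near 0.5 with standard
# deviation close to sqrt((1/12) / n_per) ~ 0.0289.
print(means.mean(), means.std())
```

A histogram of `means` would look convincingly Gaussian even though each underlying draw is uniform.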
Lecture 2: Basics of Probability and Processes.
Lecture 3: Probability Applications.
Homework: Assignment 1, a simple demonstration of frequentist and Bayesian methods
Jupyter notebook: Frequentist_Bayesian_Example.ipynb
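In the spirit of the assignment notebook (this is a hypothetical mini-example, not the notebook's contents), here is one way the frequentist and Bayesian estimates of a Bernoulli success probability compare:

```python
import numpy as np

rng = np.random.default_rng(1)
p_true = 0.7
data = rng.random(200) < p_true   # 200 simulated Bernoulli trials

k, n = data.sum(), data.size

# Frequentist: maximum-likelihood point estimate of p.
p_mle = k / n

# Bayesian: with a flat Beta(1, 1) prior, the posterior is
# Beta(k + 1, n - k + 1), whose mean is the Laplace-smoothed estimate.
p_post_mean = (k + 1) / (n + 2)

print(p_mle, p_post_mean)  # both near p_true for this much data
```

With 200 trials the two estimates nearly coincide; the differences between the approaches show up with small samples or informative priors.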
Address the questions posed in the notebook by writing up what you did and what you conclude. You can add this to the notebook, but RENAME the notebook with your name prepended to the notebook name. Alternatively, you can submit (by email) a text document converted to PDF (also with a filename that includes your name). Due Thursday, Feb 7.
Lecture 4: Correlation functions, transformation of variables, Fourier transforms
1. Fourier transforms: Appendix B sections B.1 through B.4.2 of Gregory
2. Fourier transforms: Course notes
3. Linear Shift Invariant Systems
Notes: Correlation Functions as a Diagnostic Tool
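Correlation functions and Fourier transforms meet in the Wiener-Khinchin theorem: the autocorrelation function (ACF) and the power spectrum are a Fourier pair. A minimal sketch (function name and sizes are illustrative) of computing an ACF via the FFT:

```python
import numpy as np

def acf_fft(x):
    """Autocorrelation of a 1-D series via the FFT (Wiener-Khinchin)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    # Zero-pad to 2n so the circular correlation does not wrap around.
    f = np.fft.rfft(x, n=2 * n)
    acf = np.fft.irfft(f * np.conj(f))[:n]
    return acf / acf[0]   # normalize so ACF(0) = 1

rng = np.random.default_rng(0)
white = rng.standard_normal(4096)
r = acf_fft(white)
# For white noise the ACF is 1 at zero lag and ~0 at all other lags.
```

The FFT route costs O(n log n) versus O(n^2) for the direct lag sum, which matters for long time series.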
Lecture 5: Linear models, cost functions, solutions, covariance matrices
1. Linear least squares
2. Sections 10.1-10.6 in Gregory
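The linear least-squares machinery of this lecture (design matrix, normal-equations solution, parameter covariance) can be sketched in a few lines. This is an illustrative example with made-up data, not course code:

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 50)
sigma = 0.1
y = 2.0 + 3.0 * t + rng.normal(0.0, sigma, t.size)   # truth: a=2, b=3

A = np.column_stack([np.ones_like(t), t])   # design matrix for y = a + b t

# Normal equations: theta_hat = (A^T A)^{-1} A^T y
AtA_inv = np.linalg.inv(A.T @ A)
theta = AtA_inv @ A.T @ y

# Parameter covariance for uniform errors sigma: sigma^2 (A^T A)^{-1}
cov = sigma**2 * AtA_inv
print(theta)   # close to [2, 3]
```

In practice `np.linalg.lstsq` (or a QR/SVD solve) is preferred over forming the explicit inverse, but the normal-equations form makes the covariance matrix transparent.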
Homework: Assignment 2, due Thursday, February 21
Lecture 6: Iterative linear regression; overfitting; detection and classification
1. DFT, FFT Usage, Fourier-based Power Spectra
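As a concrete companion to the FFT-based power spectra reading, here is a sketch (signal parameters are illustrative) of a periodogram of a noisy sinusoid, where the spectral peak recovers the injected frequency:

```python
import numpy as np

rng = np.random.default_rng(7)
n, f0 = 1024, 50            # series length and injected frequency (cycles/series)
t = np.arange(n)
x = np.sin(2 * np.pi * f0 * t / n) + 0.5 * rng.standard_normal(n)

X = np.fft.rfft(x)
power = np.abs(X) ** 2 / n   # one-sided power spectrum (periodogram)

peak = int(np.argmax(power[1:]) + 1)   # skip the DC bin
print(peak)   # expect 50, the injected frequency
```

Even with substantial white noise the sinusoid's power is concentrated in a single bin, which is why the periodogram is such a sensitive detector of periodicity.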
Lecture 7: Detection/classification of aligned features, Matched Filtering
1. Matched Filtering (first 18 slides)
2. Chapter 39.4 The Single Neuron as a Classifier, in
Information Theory, Inference, and Learning Algorithms (MacKay)
3. Single layer neural networks (notebook from Sebastian Raschka)
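The core matched-filtering idea for a known, aligned pulse shape is a single dot product: correlating the data with the (unit-norm) template maximizes SNR against additive white noise. A minimal sketch with illustrative shapes and noise levels (not from the course slides):

```python
import numpy as np

rng = np.random.default_rng(11)
n = 256
template = np.exp(-0.5 * ((np.arange(n) - n / 2) / 5.0) ** 2)
template /= np.linalg.norm(template)     # unit-norm Gaussian pulse template

signal = 3.0 * template + 0.5 * rng.standard_normal(n)   # pulse + noise
noise_only = 0.5 * rng.standard_normal(n)                # noise alone

# Matched-filter statistic: project the data onto the template.
stat_signal = signal @ template
stat_noise = noise_only @ template
print(stat_signal, stat_noise)   # the signal case should be much larger
```

Thresholding this statistic turns detection into a classification decision; the misaligned case of Lecture 8 replaces the single dot product with a sliding correlation over all possible pulse offsets.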
Lecture 8: Detection/classification of misaligned features, Matched Filtering II
Jupyter notebooks: these are for the hackathon in Lecture 9
1. Classical least squares demonstration (matrix algebra, covariance matrix, cost function, etc.)
2. Least squares fit done iteratively in a simple network
3. Pulse detection (classification) with a simple network (aligned pulses)
4. Misaligned Pulse detection (first take)
5. A neural net in 11 lines of Python
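In the spirit of the hackathon notebooks (all names, hyperparameters, and data here are illustrative, not taken from the notebooks), a single sigmoid neuron trained iteratively by gradient descent can already classify two well-separated point clouds:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(5)
# Two Gaussian clouds: class 0 near (-2, -2), class 1 near (+2, +2).
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50, dtype=float)

w = np.zeros(2)
b = 0.0
lr = 0.1
for _ in range(500):
    p = sigmoid(X @ w + b)           # forward pass: predicted probabilities
    grad = p - y                     # cross-entropy gradient w.r.t. the logits
    w -= lr * X.T @ grad / len(y)    # gradient-descent weight update
    b -= lr * grad.mean()

acc = ((sigmoid(X @ w + b) > 0.5) == (y > 0.5)).mean()
print(acc)   # near 1.0 for this easy, linearly separable problem
```

The same loop with an identity activation and squared-error gradient recovers the iterative least-squares fit of notebook 2; adding a hidden layer gives the multi-layer networks used for the pulse-detection notebooks.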
Lecture 9 (Feb 19): Hackathon to investigate simple networks