Temporal Analysis with R - ASSIGNMENTS
Individual work
Although you are certainly welcome to confer with others - including me - during your research, the resultant report should be essentially unique and different from anyone else's. Collaborative projects may be conducted only with my explicit prior approval. Read the UMBC Undergraduate Student Academic Conduct Policy, especially Sec. III f.
Submission
Submissions (or a URL) should be emailed to LDECOLA@COMCAST.NET (not to UMBC) by the due time listed with each assignment below.
Use the following file naming rule as it makes it easy to keep track of your work. Submission files are to be named as follows:
flname_n.ext, where f = first letter of your first name, lname = your last name, n = assignment number, and .ext = file extension (.doc, .pdf, .ppt, etc.).
You are welcome to create an HTML document and post it on the web, sending me the URL.
Format
- All submissions must show a title, your name, class name, and date.
- Any item you copy or refer to must be cited correctly: see the examples in the syllabus, and consult a style guide for details. Be sure to include the correct URL and exact references for any data you use.
- I rarely print documents and generally view MSWord documents in 'web layout' mode, so don't spend much time trying to format them. The font should be readable at normal size.
- Graphics should fill about 2/3 of the width of the screen; anything smaller is too small, anything larger is likely to bleed off the screen in some circumstances.
- If you submit a spreadsheet, the worksheets and columns should be labeled meaningfully.
- You're welcome to submit the data as well if possible so I can experiment with them.
- If you want to include a map, submit it as a graphics file (.gif, .jpg, etc.) in order to share your work.
Useful information
- A report writing guide from my website; please read and refer to this when writing!
- A few places with time series data:
IPCC Climate data
United Nations
World Bank
UN World Health Organization
US Statistical Abstract
US Centers for Disease Control and Prevention (CDC)
- A list of some of the R base time series datasets.
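To see which time series ship with R itself, the built-in datasets package can be scanned for objects of class ts. This is only a sketch: it assumes the datasets package is attached (the default in a normal R session), and the variable names are illustrative.

```r
# All object names in the built-in 'datasets' package
nms <- ls("package:datasets")

# Keep only the objects that are time series (class "ts")
is_ts <- vapply(
  nms,
  function(nm) is.ts(get(nm, "package:datasets")),
  logical(1)
)
ts_names <- nms[is_ts]

# Print the candidate series; use ?name (e.g. ?Nile) to read about one
print(ts_names)
```

Each name can be typed at the prompt to print the series, and its help page gives the source to cite.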
As always: please post questions, etc. on the Blackboard discussion forum!
Assignment #1: Time series description and visualization (due 2017-09-25 23:59)
Find some time series data and perform your own descriptive analysis (forecasting not necessary). Correctly cite at least 3 references (these may come from the course recommendations).
The data may be from the R base (except the R LakeHuron series) or any library package, chosen from the list above, or from any other source.
The report should be about 3 "pages" in length. I suggest the following outline:
- Find a univariate time series either from within R or from an online source.
- Describe the data using standard statistical measures (mean(), summary()), as well as visualizations (stem(), hist(), qqnorm(), etc.).
- Visualize the series in several ways (e.g. points, lines) and format the visualization (e.g. with ylim, xlab, las) to highlight the 'behavior' of the data: trend, cyclicality, randomness.
- Even though we may not have covered these techniques, explore differences, diff(), and regression, lm(y ~ x, data).
- Discuss the series: what may drive changes in the pattern (other factors, seasons, etc.) and how the series might behave in the future. You don't need to use formal methods.
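The outline above might be sketched roughly as follows. This is one possible approach, not a template to copy: it assumes the built-in Nile series (annual flow, 1871 to 1970) as a stand-in for your own data, and the variable names (y, d, df, fit) are arbitrary.

```r
# Example data: built-in annual series; replace with your own
y <- Nile

# Standard statistical measures
mean(y)
summary(y)

# Distribution of the values
stem(y)
hist(y, main = "Nile annual flow", xlab = "Flow")
qqnorm(y)
qqline(y)

# The series as lines and as points, with formatted axes
plot(y, type = "l", ylim = c(0, max(y)), xlab = "Year", ylab = "Flow", las = 1)
plot(y, type = "p", xlab = "Year", ylab = "Flow", las = 1)

# First differences: change from one year to the next
d <- diff(y)
plot(d, xlab = "Year", ylab = "Change in flow", las = 1)

# Linear trend: regress the values on time
df <- data.frame(x = as.numeric(time(y)), y = as.numeric(y))
fit <- lm(y ~ x, data = df)
coef(fit)

# Series with the fitted trend line drawn over it
plot(df$x, df$y, type = "l", xlab = "Year", ylab = "Flow", las = 1)
abline(fit, lty = 2)
```

Setting ylim to start at zero is a deliberate choice here: it shows the size of the fluctuations relative to the level of the series, while the default axis range exaggerates them.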
Assignment #2: Forecasting (due 2017-10-22 23:59)
This assignment requires an extensive description, analysis, and forecast of a time series. I would prefer that you use a series with measurements made more than once per unit of time (e.g. monthly: freq = 12). The following steps are suggested; you need not perform all of them. Sometimes you may just present a visual result and speculate about what it means without formal knowledge of the theory. You must supply a few references to publications about the series or related data. Although you are invited to discuss the research with others - especially on Blackboard - your work must be independently carried out.
Description
- plot the time series at least 2 ways and discuss what each visualization shows,
- provide a histogram and discuss the shape of the data,
- examine first differences, diff(), and discuss their behavior,
- plot decompose() and discuss.
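A minimal sketch of the description steps, assuming the built-in monthly AirPassengers series as example data. The boxplot by month is just one choice for the second view, and the additive decomposition is the decompose() default rather than a recommendation for every series.

```r
# Example data: monthly series with frequency = 12; replace with your own
y <- AirPassengers
frequency(y)

# View 1: the series over time (trend and growing seasonal swings)
plot(y, xlab = "Year", ylab = "Passengers", las = 1)

# View 2: values grouped by month of the year (seasonal pattern)
boxplot(y ~ cycle(y), xlab = "Month", ylab = "Passengers", las = 1)

# Shape of the data
hist(y, main = "Monthly passengers", xlab = "Passengers")

# First differences: month-to-month change
d <- diff(y)
plot(d, xlab = "Year", ylab = "Monthly change", las = 1)

# Split the series into trend, seasonal, and random components
dec <- decompose(y)
plot(dec)
```

If the seasonal swings grow with the level of the series, decompose(y, type = "multiplicative") may be worth comparing.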
Analysis
- perform a linear regression: discuss coefficients and R², plot the prediction, and examine the residuals,
- plot a moving average using rollmean() from the zoo package,
- look at acf() and discuss temporal autocorrelation.
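The analysis steps could look something like the sketch below, again assuming AirPassengers as example data. The 12-month window is an assumption chosen to match the monthly frequency. The code uses rollmean() when the zoo package is installed and otherwise substitutes base R's stats::filter(), which is a different function giving a comparable centred average.

```r
y <- AirPassengers

# Linear regression of the values on time
df <- data.frame(t = as.numeric(time(y)), y = as.numeric(y))
fit <- lm(y ~ t, data = df)
coef(fit)
r2 <- summary(fit)$r.squared
r2

# Series with the fitted (predicted) line, then the residuals
plot(df$t, df$y, type = "l", xlab = "Year", ylab = "Passengers", las = 1)
lines(df$t, fitted(fit), lty = 2)
plot(df$t, resid(fit), type = "h", xlab = "Year", ylab = "Residual", las = 1)
abline(h = 0)

# 12-month moving average, padded with NA at both ends
if (requireNamespace("zoo", quietly = TRUE)) {
  ma <- zoo::rollmean(y, k = 12, fill = NA)
} else {
  # Fallback when zoo is not installed: centred filter from base R
  ma <- stats::filter(y, rep(1 / 12, 12), sides = 2)
}
plot(y, xlab = "Year", ylab = "Passengers", las = 1)
lines(ma, lwd = 2)

# Temporal autocorrelation
ac <- acf(y)
```

A pattern in the residual plot (here, a seasonal wave that widens over time) is a sign that a straight line does not capture everything in the series.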
Forecast
- use the forecast package and do some kind of forecast into the 'future,'
- discuss how the forecast relates to the history of the series and speculate about your confidence in the result,
- provide any general conclusions about what you have found.
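One way to produce a forecast is sketched below, assuming AirPassengers as example data and a 24-month horizon (both arbitrary choices). When the forecast package is installed it uses auto.arima() and forecast(); otherwise it falls back to base R's HoltWinters(), which is a different method and is included only so the sketch runs everywhere.

```r
y <- AirPassengers
h <- 24  # forecast horizon: 24 months beyond the end of the series

if (requireNamespace("forecast", quietly = TRUE)) {
  # Automatically selected ARIMA model from the forecast package
  fit <- forecast::auto.arima(y)
  fc <- forecast::forecast(fit, h = h)
  plot(fc)                 # history plus forecast with prediction intervals
  point_fc <- fc$mean      # point forecasts as a ts object
} else {
  # Fallback when forecast is not installed: Holt-Winters smoothing
  fit <- HoltWinters(y)
  pr <- predict(fit, n.ahead = h, prediction.interval = TRUE)
  plot(fit, pr)            # fitted values plus forecast with intervals
  point_fc <- pr[, "fit"]
}

point_fc
```

The width of the prediction intervals in the plot is a useful starting point for discussing how much confidence the forecast deserves.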
Assignment #3: Final report (due 2017-11-26 23:59)
This research will be presented in class at Shady Grove on Nov 28.
Develop an original project based on your professional or course work or some other interest. You're urged to share your ideas with me and others as you work on the project. Submit it by the usual deadline and be prepared to present it at our last meeting. This could be an elaboration of Assignment #2, but it will then have to be much more detailed.
Requirements
The project must be based on data that are in some way more complicated than a simple vector of measurements; therefore use at least one of the following types of data:
- multivariate: 2 or more variables,
- temporally multiscale, with a frequency > 1 (e.g. quarters, months, weeks),
- space/time: data containing spatial and temporal measurements,
- events, e.g. with a POSIX-type time stamp.
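As a rough illustration of three of these data types in R, the sketch below uses the built-in mdeaths and fdeaths monthly series. The event time stamps are simply the assignment due times from this page, with the UTC time zone assumed for the example. Space/time data are not shown.

```r
# Multivariate: two monthly series bound into one object
mv <- cbind(male = mdeaths, female = fdeaths)
class(mv)
dim(mv)
plot(mv, main = "Monthly deaths, two series")

# Temporally multiscale: the same monthly series summed to quarters and years
q <- aggregate(mdeaths, nfrequency = 4, FUN = sum)
a <- aggregate(mdeaths, nfrequency = 1, FUN = sum)
frequency(q)
length(a)

# Events: POSIX time stamps (time zone is an assumption)
stamps <- as.POSIXct(
  c("2017-09-25 23:59:00", "2017-10-22 23:59:00", "2017-11-26 23:59:00"),
  tz = "UTC"
)
gaps <- diff(stamps)            # time between consecutive events
as.numeric(gaps, units = "days")
```

Event data usually need this kind of step (computing gaps, or counting events per period) before the time series tools used in the earlier assignments can be applied.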
Submission and presentation
- The submission must be a digital report document (see my report writing guide).
- The presentation may be from the digital document, PowerPoint slide show, etc.
- For the presentation, be prepared to demo your analysis using R: i.e. the data should be on your computer or readable in class for further exploration.
Suggestions
- It's always interesting to look deeper into data you're already familiar with from prior work; browse your computer and old reports and see what you can come up with. Work from concurrent classes is also acceptable provided you check with me first.
- Do some searching using keywords of interest plus DATA, DATASET, DOWNLOAD, SPREADSHEET, etc.
- 2N eyes are better than 2, so please share ideas you find - especially on Blackboard!