
DSCI 525 Group III - Web and Cloud Computing: Development of rainfall predictor via AWS cloud services.

UBC-MDS/majacloud

majacloud

A DSCI 525 project by Group 3 to predict rainfall in NSW, Australia based on big datasets

Overview

Our goal is to develop and deploy a cloud-based ensemble machine learning model for future rainfall prediction in NSW, Australia. The datasets we used contain rainfall (mm/day) over time, either observed or computed by different models, and were retrieved from figshare. The datasets were loaded and combined using pandas and dask, and underwent exploratory data analysis in both Python and R. The data will then be used to build and deploy a big-data machine learning model that predicts future rainfall in Australia.
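
The combining step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the helper name `combine_model_outputs`, the column names `time` and `rain (mm/day)`, the model names, and the sample values are all assumptions made for the example. It uses pandas only and reads small in-memory CSVs so that it runs without the real files.

```python
import io

import pandas as pd


def combine_model_outputs(sources):
    """Stack per-model rainfall tables into one frame with a `model` column.

    `sources` maps a (hypothetical) model name to a CSV path or file-like object.
    """
    frames = []
    for model_name, csv_source in sources.items():
        df = pd.read_csv(csv_source, parse_dates=["time"])
        df["model"] = model_name  # record which model produced each row
        frames.append(df)
    return pd.concat(frames, ignore_index=True)


# Tiny in-memory stand-ins for the real CSV files (values are made up).
sample_sources = {
    "model_a": io.StringIO("time,rain (mm/day)\n2000-01-01,1.5\n2000-01-02,0.0\n"),
    "model_b": io.StringIO("time,rain (mm/day)\n2000-01-01,2.5\n2000-01-02,4.0\n"),
}

combined = combine_model_outputs(sample_sources)
mean_by_model = combined.groupby("model")["rain (mm/day)"].mean()
print(mean_by_model)
```

For files too large to fit in memory, dask's `dask.dataframe.read_csv` offers a similar interface and could stand in for the pandas call; that variant is not shown here.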

Milestones Timeline
Milestone 1 (Week 1) - Get the data from the web and become familiar with advanced file formats: 2021-04-03
Milestone 2 (Week 3) - Set up an S3 bucket, an EC2 instance, and TLJH: 2021-04-17
Milestone 3 (Week 4) - Set up an EMR Spark instance and rewrite the ML model from the previous milestone in Spark: 2021-04-24
Milestone 4 (Week 5) - Deploy the ML model using Flask: 2021-04-29

Dependencies

Python 3.8.3

re == 2.2.1
requests == 2.25.1
json == 2.0.9
pandas == 1.2.3
dask == 2021.3.1
rpy2 == 3.4.3
pyspark == 3.1.1
s3fs == 0.6.0
joblib == 1.0.1
matplotlib == 3.4.1
sklearn == 0.0
json5 == 0.9.5
urllib3 == 1.26.4
flask == 1.1.2

R 4.0.2

arrow == 3.0.0
dplyr == 1.0.3

Contributors

We welcome and recognize all contributions. You can see a list of current contributors in the contributors tab. UBC MDS DSCI 525 Group III:

Attributions

CMIP6 Experimental Design and Organization https://www.wcrp-climate.org/wgcm-cmip/wgcm-cmip6

Pangeo Coupled Model Intercomparison Project Phase 6 https://pangeo-data.github.io/pangeo-cmip6-cloud/

SILO - Australian Climate Data https://www.longpaddock.qld.gov.au/silo/
