This presentation was recorded at GOTO Aarhus 2014. http://gotocon.com

Cliff Click - CTO at 0xdata

ABSTRACT
We have built an open-source platform for dealing with in-memory distributed data. We've used it to build state-of-the-art predictive modeling and analytics (GLM & Logistic Regression, GBM, Random Forest, Neural Nets, and PCA, to name a few) that are 1000x faster than the disk-bound alternatives and 100x faster than R (we love R, but it's too slow on big data!). We can run R expressions on tera-scale datasets, or munge data from Scala & Python. We're building our newest algorithms in a few weeks, start to finish, because the platform makes Big Math easy. We routinely test on 100GB datasets, and have customers using 1TB datasets.

This talk is about the platform, coding style & API that let us seamlessly deal with datasets from 1K to 1TB without changing a line of code, and let us use clusters ranging from your laptop to 100-server clusters with many TB of RAM and hundreds of CPUs.