- Main presentation/demo by Masood Krohy (Title: Large-scale Experimentation with Spark & Productionizing Native Spark ML Models)
- Lightning talks:
- Jenny Midwinter (Title: CNN Based Auto-Pilot for a Wheelchair)
- Claude Coulombe (Title: Text Data Augmentation Made Simple By Leveraging NLP Cloud APIs)
- Shiv Kumar & Darius Vaillancourt (Title: Open Source Simulations with Python and Blended Learning Approaches to Data Science and Machine Learning Education)
Main presentation by Masood Krohy:
Title: Large-scale Experimentation with Spark & Productionizing Native Spark ML Models
Summary: Apache Spark is the state-of-the-art distributed processing, analytics and ML engine. We present and demo two ways to use Spark in ML projects: 1) We use Spark to distribute the grid-search optimization of a generic ML model (from a regular, single-machine ML library). We show how Spark can distribute processing tasks over the CPU cores of a cluster, which gives a near-linear speedup and lowers processing times; this makes it feasible to explore a much larger space when searching for the optimal hyperparameters of the ML model. This use case is suitable when a project does not involve Big Data and a Big Data technology, i.e., Spark, is used purely to speed up the processing of tasks. 2) We demonstrate how to train an example model using the ML library of Spark itself and how to serve the model with MLeap, a production-quality, low-latency serving engine. This second use case/workflow is suitable when projects do involve Big Data.
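The first use case can be illustrated with a minimal sketch. It is not the presenter's code: the function names (`build_grid`, `evaluate`, `grid_search`), the hyperparameter names and the toy scoring formula are all hypothetical stand-ins, and in a real project `evaluate` would train and score a model from a single-machine ML library. Each hyperparameter combination becomes one independent task, so Spark can run the tasks on separate CPU cores.

```python
from itertools import product


def build_grid(space):
    """Expand {name: [values, ...]} into a list of parameter dicts."""
    names = sorted(space)
    return [dict(zip(names, combo))
            for combo in product(*(space[n] for n in names))]


def evaluate(params):
    """Hypothetical stand-in for 'train a single-machine model and score it'.

    The toy score peaks at learning_rate=0.1 and max_depth=5; a real
    implementation would fit a model and return a validation metric.
    """
    score = (-(params["learning_rate"] - 0.1) ** 2
             - (params["max_depth"] - 5) ** 2 / 100.0)
    return params, score


def grid_search(grid, sc=None):
    """Evaluate every grid point and return the (params, score) pair with the best score.

    If a SparkContext `sc` is passed, each grid point becomes its own
    partition and is evaluated on the cluster; otherwise the grid is
    evaluated locally, which is handy for testing without Spark.
    """
    if sc is None:
        results = [evaluate(p) for p in grid]
    else:
        results = sc.parallelize(grid, len(grid)).map(evaluate).collect()
    return max(results, key=lambda r: r[1])


# Assumed search space, chosen only for illustration.
space = {"learning_rate": [0.01, 0.1, 0.5], "max_depth": [3, 5, 8]}
grid = build_grid(space)
best_params, best_score = grid_search(grid)
print(best_params, best_score)
```

To run the same search on a cluster, assuming PySpark is installed, one would create a session with `SparkSession.builder.getOrCreate()` and call `grid_search(grid, sc=spark.sparkContext)`. The training data is not distributed in this pattern; every task works on a full copy of it, which is why it fits projects without Big Data.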
Bio: Masood Krohy is a Data Science Platform Architect/Advisor and most recently acted as the Chief Architect of UniAnalytica, an advanced data science platform with wide, out-of-the-box support for time-series and geospatial use cases. He has worked with several corporations in different industries in the past few years to design, implement and productionize Deep Learning and Big Data products. He holds a Ph.D. in computer engineering.
Jenny Midwinter
Title: CNN Based Auto-Pilot for a Wheelchair: A brief story of the results of rapid development of a CNN-based self-driving wheelchair using open source software & low-cost hardware
Summary: This lightning talk describes the development work and results behind a proof-of-concept (POC) prototype of a CNN-based auto-pilot function that was able to autonomously drive a wheelchair based on a single input image stream from a forward-facing camera. A brief explanation of how the CNN is used to autonomously drive the wheelchair is provided, including a short video demo. A description of the low-cost hardware/software system and of the lessons learned, which now serve as a basis for the next stage of development, will also be presented.
Bio: Jenny is collaborating as Chief Scientist, Applied AI, at Eightfold Technologies, and is responsible for the development of vision-based self-driving technology for Eightfold’s SmartChair. She is the CEO and founder of Blue Horizon AI, a startup dedicated to applying AI in autonomous technologies to assist the disabled. Prior to this, she spent 25 years in R&D in the telecom industry.
Claude Coulombe
Title: Text Data Augmentation Made Simple By Leveraging NLP Cloud APIs
Summary: In natural language processing, it is common to find oneself with far too little data to train a deep model. This “Big Data wall” represents a challenge for minority language communities, organizations, laboratories and companies that compete with the GAFAM web giants. In this presentation we will discuss various simple, practical and robust text data augmentation techniques based on NLP and machine learning to overcome the lack of text data for training large statistical models, particularly for deep learning.
Shiv Kumar & Darius Vaillancourt
Title: Open Source Simulations with Python and Blended Learning Approaches to Data Science and Machine Learning Education
Summary: Enterprise software for engineering is currently expensive, with few open source options available for use by electrical engineers. Python Power Electronics aims to close this gap by providing tools to simulate power electronics and circuitry, with initial uses tailored to the renewable energy industry. The lightning talk will also cover affordable techniques for learning machine learning and data science by combining online learning with in-person workshops.
Bios: Shiv Kumar (PhD) is an electrical engineer by training with over ten years of experience across renewable energy R&D and software development. He is the author of the textbook Simulating Nonlinear Circuits with Python Power Electronics and is an instructor at Collabo Academy, where he teaches Data Science & Machine Learning with Python.
Darius is a founder and CEO of Collabo Academy, an education technology company that delivers blended learning at an affordable price.