Main presentation by Masood Krohy:

Title: Large-scale Experimentation with Spark & Productionizing Native Spark ML Models

Summary: Apache Spark is the state-of-the-art engine for distributed processing, analytics and ML. We present and demo two interesting ways to use Spark in ML projects: 1) We use Spark to distribute the grid-search optimization of a generic ML model (from a regular, single-machine ML library). We show how Spark can distribute processing tasks over the CPU cores of a cluster, which gives a near-linear speedup and lowers processing times; this makes it practical to explore a much larger search space for the model's optimal hyperparameters. This use case suits projects that do not involve Big Data, where a Big Data technology (Spark) is used purely to speed up processing. 2) We demonstrate how to train an example model using Spark's own ML library and how to serve the model with MLeap, a production-quality, low-latency serving engine. This second use case/workflow suits projects that do involve Big Data.

Bio: Masood Krohy is a Data Science Platform Architect/Advisor and most recently acted as the Chief Architect of UniAnalytica, an advanced data science platform with wide, out-of-the-box support for time-series and geospatial use cases. He has worked with several corporations in different industries in the past few years to design, implement and productionize Deep Learning and Big Data products. He holds a Ph.D. in computer engineering.

Lightning talks:

Jenny Midwinter

Title: CNN Based Auto-Pilot for a Wheelchair: A brief story of the results of rapid development of a CNN based self driving wheelchair using open source software & low cost hardware

Summary: This lightning talk describes the development and results behind a proof-of-concept (POC) prototype of a CNN-based auto-pilot function that autonomously drove a wheelchair using a single input image stream from a forward-facing camera. It gives a brief explanation of how the CNN is used to drive the wheelchair, including a short video demo. It also describes the low-cost hardware/software system and the lessons learned, which now form the basis for the next stage of development.

Bio: Jenny is collaborating as Chief Scientist, Applied AI, at Eightfold Technologies, where she is responsible for the development of vision-based self-driving technology for Eightfold’s SmartChair. She is the CEO and founder of Blue Horizon AI, a startup dedicated to applying AI in autonomous technologies to assist the disabled. Prior to this, she spent 25 years in R&D in the telecom industry.

Claude Coulombe

Title: Text Data Augmentation Made Simple By Leveraging NLP Cloud APIs

Summary: In natural language processing, it is common to find oneself with far too little data to train a deep model. This “Big Data wall” represents a challenge for minority language communities, organizations, laboratories and companies that compete with the GAFAM web giants. In this presentation we will discuss various simple, practical and robust text data augmentation techniques based on NLP and machine learning to overcome the lack of text data for training large statistical models, particularly for deep learning.

Bio: Claude Coulombe started out as a budding young scientist who participated in 15 science fairs and earned a B.Sc. in physics and a master’s degree in AI at Université de Montréal (Homo Scientificus). He then evolved into a passionate Québec high-tech entrepreneur and co-founder of an AI startup called Machina Sapiens, where he participated in the creation of a new generation of grammar-checking tools (Homo Québecensis). Following the bursting of the technology bubble, Claude took a new evolutionary path to start a family, launch Lingua Technologies, which combines machine translation and web technologies, and undertake a PhD in machine learning at MILA under the supervision of Yoshua Bengio (Homo FamilIA). In 2008, with resources becoming scarce, Claude transformed into a Java tech lead, specializing in the creation of rich web applications with Ajax, HTML5, JavaScript, GWT, REST architectures, cloud and mobile applications (Java Man). In 2013, Claude started a new PhD in cognitive science, participated in the development of two massive open online courses (MOOCs) at TÉLUQ, and learned Python and deep learning (Python Man, not to be confused with the Piltdown Man). In short, Claude is an old fossil that has evolved, reproduced, created tools and adapted to the rhythm of his passions.

Shiv Kumar & Darius Vaillancourt

Title: Open Source Simulations with Python and Blended Learning Approaches to Data Science and Machine Learning Education

Summary: Enterprise software for engineering is currently expensive, with few open source options available for use by electrical engineers. Python with Power Electronics aims to close this gap by providing tools to simulate power electronics and circuitry, with initial uses tailored to the renewable energy industry. The lightning talk will also cover affordable techniques for learning machine learning and data science by combining online learning with in-person workshops.

Bios: Shiv Kumar (PhD) is an electrical engineer by training with over ten years of experience across renewable energy R&D and software development. He is the author of the textbook Simulating Nonlinear Circuits with Python Power Electronics and an instructor at Collabo Academy, where he teaches Data Science & Machine Learning with Python.

Darius is a founder and CEO of Collabo Academy, an education technology company that delivers blended learning at an affordable price.