[ML UTD 2] Machine Learning Up-To-Date


Welcome to Machine Learning Up-To-Date (ML UTD) 2! The LifeWithData blog separates the signal from the noise at the hectic intersection of software engineering and machine learning.

LifeWithData aims to consistently deliver curated machine learning newsletters that point the reader to key developments without massive amounts of backstory for each. This enables frequent, concise updates across the industry without overloading readers with information.

ML UTD 2 brings innovations in the fields of dataset preparation, deep learning theory & software, IoT computer vision, and…real estate!

Dataset Preparation

Many fall for the fallacy that neural networks can solve any problem, given enough data. In reality, neural networks can solve a wide variety of tasks, given enough data that is appropriate for the task. Would you expect an algorithm to learn how to classify pictures of cats vs dogs, given an infinite supply of stock market prices?

MIT’s CSAIL laboratory understands the importance of what data a learner sees during training and has developed the TextFooler library to help natural language processing (NLP) models generalize better. In a recent arXiv paper, members of the lab show how even state-of-the-art NLP models such as Google’s BERT fall victim to adversarial examples: small, seemingly innocuous changes to a well-understood data sample that thoroughly confuse a learner.
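To make the idea of an adversarial example concrete, here is a toy sketch in Python. It is not TextFooler's code or API. The keyword-counting "model", the word lists, and the synonym table are all hypothetical and hand-written for illustration. The sketch swaps words for synonyms that a human would read the same way until the model's prediction flips.

```python
# Toy illustration only: a keyword-counting "sentiment model" and a
# hand-written synonym table. Neither comes from TextFooler.
POSITIVE = {"great", "wonderful", "enjoyable"}
NEGATIVE = {"boring", "awful", "dull"}

# Hypothetical synonyms: each swap keeps the meaning for a human reader,
# but the replacement word is unknown to the toy model.
SYNONYMS = {"great": "superb", "wonderful": "marvelous", "enjoyable": "pleasant"}


def predict(words):
    """Label a list of words by counting known positive and negative keywords."""
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative"


def attack(sentence):
    """Swap words for synonyms, left to right, until the prediction flips.

    Returns the altered sentence, or None if no flip was found.
    """
    words = sentence.lower().split()
    original_label = predict(words)
    for i, word in enumerate(words):
        if word in SYNONYMS:
            words[i] = SYNONYMS[word]
            if predict(words) != original_label:
                return " ".join(words)
    return None


original = "a great and enjoyable film"
adversarial = attack(original)
print(predict(original.split()), "->", adversarial)
```

A real attack targets a trained model rather than a keyword counter, but the failure is the same in spirit: the sentence still means the same thing to a person, yet the model changes its answer.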

Deep Learning Theory & Software

Geoffrey Hinton, often called the “father of deep learning,” unveiled the idea of capsule networks in 2017 as an improvement over traditional linear and convolutional architectures. A new publication on arXiv compares capsule networks with convolutional networks. Rather than summarize it, I encourage those interested to read the full paper to understand the benefits and tradeoffs. For an introduction to capsule networks, hear it from the man himself below.

Geoffrey Hinton talk: “What is wrong with convolutional neural networks?”

Amid all of the wonderful developments in deep learning over the past few decades, the data that networks handle has largely corresponded to one- or two-dimensional objects (no, the color dimension does not count). However, Facebook wants to change that with PyTorch3D. Check out a short demonstration video below. I expect work on modeling spatial objects to accelerate quickly with this new toolbox.

PyTorch3D’s “Learning Camera”


NVIDIA’s Jetson hardware platform has rapidly risen in popularity, allowing deep learning to be deployed effectively at the edge and opening up many more potential applications. To drive even more adoption, the company has lowered the barrier to entry with a series of “getting started” tutorials. The majority of the libraries are written in C++, but how does performing object detection in 10 lines of Python sound?

Computer Vision

Do you ever wonder how large data sets get created for data-hungry neural networks? Well, here’s a taste. To efficiently annotate their video data sets, self-driving vehicle startup Waymo has built an in-house content search tool to create ultra-granular labels.

Last year, Google released an augmented reality (AR) heads-up display (HUD) mode to some of its AR-capable devices. Now, the transit app Moovit has followed suit, introducing an AR display mode that overlays map information onto a phone’s video feed. Bad news: this only exists for phones running iOS. Stay patient, Android users!

Real Estate

Machine learning’s adoption in stock market investing is well underway. Interestingly, the same is now becoming true for both residential and commercial real estate. As the old saying goes, the most important aspect of real estate is “location, location, location.” However, that is now changing into “the most important aspect of real estate is information, information, information.”

Cherre, a New York real estate data integration startup, has raised another $16 million for its core product, which uses several diverse data sources to form a “knowledge graph” of potential investments. In simple terms, this is similar to how Wikipedia presents an organized body of knowledge for your searches. Cherre’s software is a little different, though: it performs this organization automatically!
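For a rough feel of what merging several sources into a knowledge graph means, here is a minimal Python sketch. It is not Cherre's software or method. The three data sources, the address, and every value in them are made up for illustration, and real systems also have to work out which records refer to the same property, which this sketch assumes away by using identical addresses.

```python
from collections import defaultdict

# Hypothetical records from three unrelated sources, all made up.
tax_records = [{"address": "12 Example St", "assessed_value": 950_000}]
listings = [{"address": "12 Example St", "asking_price": 1_200_000}]
permits = [{"address": "12 Example St", "permit": "roof replacement"}]


def build_graph(*sources):
    """Merge records into a graph: property -> set of (relation, value) edges."""
    graph = defaultdict(set)
    for source in sources:
        for record in source:
            subject = record["address"]
            for relation, value in record.items():
                if relation != "address":
                    graph[subject].add((relation, value))
    return graph


graph = build_graph(tax_records, listings, permits)
for relation, value in sorted(graph["12 Example St"], key=str):
    print("12 Example St", relation, value)
```

Once everything about a property hangs off a single node, a question like “what do we know about this address?” becomes one lookup instead of three separate searches.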

Stay Up To Date

That’s all for ML UTD 2. However, things are happening very quickly in academia and industry! Aside from ML UTD, keep yourself updated with the blog at LifeWithData.
