The Batch
August 7, 2019


Dear friends,

I spent last weekend coding and training a few neural networks on my laptop. I enjoy many different types of work, but playing around in a Jupyter Notebook with TensorFlow/Keras is still one of my favorites!

You can do a lot with just a laptop. The idea that you need thousands of dollars of GPUs to do anything is oversimplified. I’ve mentioned many times that one cutting edge of deep learning is scalability. We still need faster computers, and scaling up existing supervised and unsupervised learning methods will drive significant improvements. (I consider both GPT-2 and BERT recent examples of this.) Faster computers will also open up new research directions.

But people were training on MNIST (60,000 examples, albeit tiny 28x28 images) before there were GPUs. And there’s plenty of cutting edge work to be done with Small Data. Last weekend I was playing with a manufacturing problem where I had 3 labeled examples, and the challenge was getting the network architecture right, not scaling it up. My model trained in 20 seconds on my Mac, without any GPUs.

If you’re in a position to push scalability, please keep doing so. We need that! But if you don’t have a supercomputer at your disposal, you can still do plenty of cutting-edge research with some creativity and a laptop.

Keep learning! 




Bach to the Future  

Google Brain earlier this year brought the experience of composing music to a global audience with its Bach Doodle. The team explains how they did it in a new paper.
Key insight: Unfamiliar technology, such as music-generation models, can be intimidating. Wrapped in a user-friendly interface, though, AI can give users who may know nothing about music an intuitive understanding of musical concepts. You can try out the fun app here.
How it works: Researchers led by Anna Huang adapted Coconet, which fills in incomplete music scores, as a simple toy, lightweight enough to deliver in a browser.

  • The model is trained on Bach's compositions to predict the next note based on permutations of surrounding notes. To generate new music, it alternates between erasing notes and filling them in, improving its output iteratively based on the evolving context.
  • The user interface starts by orienting users with an animated demonstration of how notes combine to form harmony. Then it prompts them to input a brief melody by clicking in a conventional lines-and-spaces music display. It limits the input to simple rhythms and note choices, though a less limited mode is available for more advanced users.
  • The researchers added new operations to TensorFlow’s JavaScript library and modified the model architecture for smaller size and quicker execution. These strategies cut the model to 400 kB and generation time from 40 seconds to 2 seconds.
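
The erase-and-refill loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the Doodle's code: `toy_fill` is a hypothetical stand-in for the trained model (it simply picks a C-major pitch near the neighboring notes), the score is a single voice rather than Bach-style harmony, and `rounds` and `erase_fraction` are made-up parameters.

```python
import random

SCALE = [60, 62, 64, 65, 67, 69, 71, 72]  # C major as MIDI pitches (illustrative)

def toy_fill(score, i, rng):
    """Hypothetical stand-in for the trained model: choose a pitch near the known neighbors."""
    neighbors = [score[j] for j in (i - 1, i + 1)
                 if 0 <= j < len(score) and score[j] is not None]
    if not neighbors:
        return rng.choice(SCALE)
    target = sum(neighbors) / len(neighbors)
    closest = sorted(SCALE, key=lambda p: abs(p - target))[:3]
    return rng.choice(closest)

def generate(score, rounds=10, erase_fraction=0.5, seed=0):
    """Fill the empty slots (None), then repeatedly erase and refill some of them."""
    rng = random.Random(seed)
    score = list(score)
    free = [i for i, p in enumerate(score) if p is None]  # slots the user left empty
    if not free:
        return score
    for i in free:  # first pass: fill every gap once
        score[i] = toy_fill(score, i, rng)
    for _ in range(rounds):
        erased = rng.sample(free, max(1, int(len(free) * erase_fraction)))
        for i in erased:  # erase part of the current draft...
            score[i] = None
        for i in erased:  # ...and refill it given the evolving context
            score[i] = toy_fill(score, i, rng)
    return score

melody = [60, None, None, 67, None, None, 72, None]  # user notes plus gaps
print(generate(melody))
```

The user's notes are never erased, so each round only revises the generated part of the score.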

Results: In three days, users spent 350 years' worth of time playing with the Bach Doodle, and Coconet received more than 55 million queries. In addition, the team amassed a public data set containing 21.6 million generated compositions along with user ratings, country of origin, and length data.
Why it matters: Bach Doodle is a hint of fresh experiences that deep learning holds in store for both entertainment and artistry. A clever way to generate a huge amount of data, too!
We’re thinking: The 17th-century invention of tuning systems that sounded equally good in all musical keys liberated Bach to use any musical key within a single composition. Could innovations in AI also open doors for contemporary geniuses of music?


Deepfakes Dial For Dollars  

Never mind deepfake videos' potential to disrupt democracy. Deepfake audio has already been used to bilk businesses of millions of dollars.
What happened: At each of three companies, a generated voice imitating the CEO called a senior financial officer, according to the security company Symantec. In every case, the fake voice persuaded the officer to transfer funds into a thief’s account.
How it works: Security experts interviewed by Axios believe the culprits trained their model on audio from earnings calls, corporate keynotes, and other public presentations. The calls used background noise to cover up words and syllables the AI had difficulty replicating.
Behind the news: Deepfaked audio made headlines in May when a Canadian startup called Dessa generated a voice that sounded like the popular podcaster Joe Rogan extolling the virtues of chimpanzees as hockey players.
Why it matters: Audio forgeries are useful for more than stealing money outright. Black-hat investors could release faked earnings calls that cause a company's stock to plummet, then scoop up bargain-priced shares. Faked interview clips could wreak havoc in any field.
We’re thinking: The digital security industry is already booming, thanks to more than a half century of computer-aided grift. These companies could use more AI brainpower to counter the rise of deepcrime.


Robots With Sensitive Skin

Researchers at National University of Singapore developed an electronic skin that transmits sensations of contact, pressure, and temperature from different parts of the membrane at once.
How it works: Asynchronously Coded Electronic Skin (or ACES) comprises several types of sensors that are triggered by external events rather than internal queries. 

  • ACES sensors share a range of bandwidth on a single wire. Most other e-skin prototypes send information serially, which can cause latency in a large array. 
  • Each sensor processes events independently. This makes ACES more like the human nervous system, which can experience sitting in a chair, being pawed by a cat, and feeling oppressed by mid-summer humidity all at once. 
  • The sensor array can be paired with different substrates depending on the application.
  • ACES can take a licking. It continued to work even after the researchers cut it into a zagging ribbon.

Smarter skin: The researchers trained neural nets to infer complex sensations by combining the inputs from several sensors over time.

  • For instance, they instructed an e-skinned robot hand to grab an object, then tugged that object using a string.
  • The network determined the object was slipping from its grasp based on the sequence of sensors reporting loss of pressure and concurrent signals from others reporting new contact.
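
To make the slip signal concrete, here is a hand-written rule that looks for the same pattern in a stream of sensor events: a loss of pressure on one sensor followed shortly by new contact on another. This is only a sketch under stated assumptions. The researchers trained neural nets for this, and the `Event` fields, `window_ms`, and `min_pairs` are hypothetical names and values chosen for illustration.

```python
from typing import NamedTuple

class Event(NamedTuple):
    t_ms: float   # time the sensor fired, in milliseconds
    sensor: int   # hypothetical sensor ID
    kind: str     # "release" (pressure lost) or "contact" (new touch)

def detect_slip(events, window_ms=20.0, min_pairs=2):
    """Return True if enough release-then-contact pairs occur on different sensors."""
    events = sorted(events, key=lambda e: e.t_ms)
    pairs = 0
    for i, e in enumerate(events):
        if e.kind != "release":
            continue
        for later in events[i + 1:]:
            if later.t_ms - e.t_ms > window_ms:
                break  # too late to count as concurrent
            if later.kind == "contact" and later.sensor != e.sensor:
                pairs += 1
                break
    return pairs >= min_pairs

slipping = [Event(0, 1, "release"), Event(5, 2, "contact"),
            Event(10, 2, "release"), Event(14, 3, "contact")]
print(detect_slip(slipping))
```

A learned model can weigh many more sensors and timing patterns than a fixed rule like this one, which is presumably why the researchers used neural nets.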

Why it matters: Terminators and Jedi amputees aren’t the only ones who could benefit from resilient e-skin.

  • The sensor fabric could coat hazard suits, helping first responders monitor their surroundings as they pick through the rubble of a disaster. 
  • Tactile feedback could enable warehouse robots to operate more safely with human co-workers. 
  • Farm robots of the future will need to reach through branches while maintaining enough sensitivity to pick delicate fruit. 

We’re thinking: Machine learning thrives on data, and ACES, which packs around 70 sensors into a square inch, could generate plenty. That may help dexterous robots learn to interact with the world in more nuanced ways.



It’s hard to know in advance what approach will work best in an ML project. Learn how you can try many ideas through an iterative process, and how to build train, dev, and test sets to speed up that process in the Deep Learning Specialization. Enroll today!


Protein Predictor  

An AI tool capable of predicting which amino acid sequences produce novel proteins could give pharmaceutical companies a key to designing drugs from the bottom up.
What’s new: The 21 amino acids are like an alphabet that can be endlessly combined into new sequences. But, like letters sequenced to form words, relatively few amino acid sequences form viable proteins. UniRep, a deep learning model developed by a group at Harvard Medical School, predicts such sequences.
How it works: The model learns a sort of language for combining amino acids into short-length proteins durable enough to function in biological systems.

  • A recurrent neural network was trained on 24 million known protein sequences.
  • The model learned to predict which amino acids could go next in a sequence based on a dynamic summary of the existing sequence.
  • It organized predicted proteins into 53 groups of proteins called proteomes that are capable of interacting within a single organism. The authors fed the model proteins of reference organisms like flatworms and zebra finches.
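
The second bullet, predicting the next amino acid from a running summary of the sequence so far, can be sketched as a tiny recurrent network. This is an untrained toy with random weights, so its predictions mean nothing biologically. The plain tanh recurrence, the hidden size, and the 20-letter alphabet are assumptions for illustration rather than details of UniRep.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard one-letter codes (illustrative alphabet)

def make_params(hidden=8, seed=0):
    """Random, untrained weights; a real model would learn these from sequences."""
    rng = random.Random(seed)
    vocab = len(AMINO_ACIDS)
    rand = lambda rows, cols: [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                               for _ in range(rows)]
    return {"W_in": rand(hidden, vocab), "W_h": rand(hidden, hidden),
            "W_out": rand(vocab, hidden)}

def step(params, h, residue):
    """Update the running summary h after reading one amino acid."""
    idx = AMINO_ACIDS.index(residue)
    return [math.tanh(params["W_in"][i][idx] +
                      sum(w * hj for w, hj in zip(params["W_h"][i], h)))
            for i in range(len(h))]

def next_residue_probs(params, sequence):
    """Probability of each amino acid coming next, given the sequence so far."""
    h = [0.0] * len(params["W_h"])
    for residue in sequence:
        h = step(params, h, residue)
    logits = [sum(w * hj for w, hj in zip(row, h)) for row in params["W_out"]]
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]  # softmax, shifted for stability
    total = sum(exps)
    return {aa: e / total for aa, e in zip(AMINO_ACIDS, exps)}

probs = next_residue_probs(make_params(), "MKT")
print(max(probs, key=probs.get))
```

The hidden state `h` plays the role of the dynamic summary: it is rewritten after every residue and is the only thing the prediction depends on.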

Why it matters: UniRep predicts amino-acid sequences that form stable bonds. In industry, that’s vital for determining the production yields, reaction rates, and shelf life of protein-based products. 
We’re thinking: Drugs generally are developed by brute-force chemistry: Scientists search for promising proteins by mixing and testing, then pharma companies trial and error their way to a viable product. Intelligent design of therapeutic proteins could bring new medicines to market faster and more cheaply, with fewer side effects. Custom-designed proteins could target specific conditions, treating illnesses for which no treatment exists and potentially saving countless lives.



The New Normalization  

Researchers have devised various normalization methods such as batch normalization to accelerate neural network training, but it’s seldom clear which method is best for a given model. A new method automatically figures out what works best for individual layers.
What’s new: Switchable Normalization is a normalization layer that predicts the optimal combination of normalization methods with little cost in computation. Developed by Ping Luo and colleagues at the University of Hong Kong and SenseTime, SN adapts to various network architectures and tasks, and it works with a wide range of batch sizes.
Key insight: Most normalization methods adjust inputs using the same simple equation, which depends on a mean and a variance. The methods differ in how they compute those two statistics. SN computes them as weighted averages of the means and variances produced by traditional methods, making for more flexible normalization across different models or different layers within a model.
How it works: SN tracks mean and variance for batch, instance, and layer normalizations and computes weighted averages for each hidden unit. During training, the network learns the weight of each parameter, and thus the ratio of normalization techniques to apply in each layer.

  • Layer inputs are passed to traditional normalization techniques to compute the corresponding means and variances. 
  • SN calculates weighted averages of those means and variances. It uses three traditional normalizations and weights the means and variances separately, so it learns six additional weights, optimized via backpropagation.
  • Finally, SN normalizes the inputs with the standard normalization formula, using the weighted-average mean and variance.
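
The three steps above can be sketched as a forward pass in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The input is a nested list shaped [batch][channel][position], the six weights are produced by a softmax over two hypothetical triples of logits (`mean_logits` and `var_logits`), and the learned scale and shift that normalization layers usually include are left out.

```python
import math

def softmax(zs):
    top = max(zs)
    exps = [math.exp(z - top) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def mean_var(values):
    mu = sum(values) / len(values)
    return mu, sum((v - mu) ** 2 for v in values) / len(values)

def switchable_norm(x, mean_logits, var_logits, eps=1e-5):
    """x: nested list [N][C][L]. Logit order: instance, layer, batch."""
    N, C = len(x), len(x[0])
    w_mean, w_var = softmax(mean_logits), softmax(var_logits)  # six weights in total
    # Step 1: statistics from each traditional method.
    inst = [[mean_var(x[n][c]) for c in range(C)] for n in range(N)]
    layer = [mean_var([v for c in range(C) for v in x[n][c]]) for n in range(N)]
    batch = [mean_var([v for n in range(N) for v in x[n][c]]) for c in range(C)]
    out = []
    for n in range(N):
        sample = []
        for c in range(C):
            # Step 2: weighted averages of the means and of the variances.
            mu = (w_mean[0] * inst[n][c][0] + w_mean[1] * layer[n][0]
                  + w_mean[2] * batch[c][0])
            var = (w_var[0] * inst[n][c][1] + w_var[1] * layer[n][1]
                   + w_var[2] * batch[c][1])
            # Step 3: the usual normalization formula with the blended statistics.
            sample.append([(v - mu) / math.sqrt(var + eps) for v in x[n][c]])
        out.append(sample)
    return out

x = [[[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]],
     [[0.0, 1.0, 5.0], [3.0, 3.0, 9.0]]]
y = switchable_norm(x, [0.0, 0.0, 0.0], [0.0, 0.0, 0.0])  # equal blend of all three
print(y[0][0])
```

In training, the logits would be updated by backpropagation along with the rest of the network, so each layer settles on its own blend. Pushing one logit far above the others recovers the corresponding single method.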

Results: Switchable Normalization outperforms batch, layer, and instance normalization alone in a variety of benchmarks including image classification, face recognition, object detection, video recognition, scene parsing, and neural architecture search.
Why it matters: The choice of normalization has a significant impact on training speed. It can be very sensitive to hyperparameter values, though, so picking the right one can take lots of tedious testing, and whatever you choose applies to the whole network. SN takes the problem off your hands, resulting in models that converge and perform better.
Takeaway: Many networks are trained using one normalization method or another. Now machine learning engineers can have the best of all worlds.

Thoughts, suggestions, feedback? Please send to thebatch@deeplearning.ai.

Subscribe here and add our address to your contacts list so our mailings don't end up in the spam folder. You can unsubscribe from this newsletter or update your preferences​ here​.

Copyright 2019 deeplearning.ai, 195 Page Mill Road, Suite 115, Palo Alto, California 94306, United States. All rights reserved.