Once again, Jeff Dean has summarized the past year’s AI trends on behalf of Google AI.
This is Jeff’s routine annual report as the head of Google AI, and also a show of strength from the world’s largest AI company and arguably its largest developer of cutting-edge technology.
He said that 2019 was a very exciting year.
Academic research and applications both flourished, and open source and new technologies advanced side by side.
The report runs from basic research, through applications of the technology in emerging fields, to an outlook on 2020.
Although the reporting format has not changed, artificial intelligence technology has taken a big step forward.
Jeff Dean summarized AI results in 16 major areas and revealed that Google published 754 AI papers during the year, an average of about 2 papers per day.
The results cover AutoML, machine learning algorithms, quantum computing, perception technology, robotics, medical AI, AI for good, and more.
Taken together, these results not only advance the role of AI across society today but also offer a small preview of future trends.
It is no exaggeration to say that if you want to know how AI technology progressed in 2019, Jeff’s summary is the best place to look; if you want to know where AI will go in 2020, his summary offers plenty of clues as well.
For your convenience, here is a short table of contents:
Machine learning algorithms: understanding training dynamics in neural networks
AutoML: a continued focus on automating machine learning
Natural language understanding: combining multiple methods and tasks to improve the state of the art
Machine perception: deeper understanding and perception of images, videos, and environments
Robotics: self-supervised training and released benchmarks for robot testing
Quantum computing: achieving quantum supremacy for the first time
Applications of AI in other disciplines: from the fly brain to mathematics, as well as chemical molecule research and artistic creation
Mobile AI applications: on-device speech and image recognition models, as well as stronger translation, navigation, and photography
Health and medicine: clinical diagnosis of breast cancer and skin diseases
AI for people with disabilities: using image recognition and speech transcription to help people with disabilities
AI for social good: forecasting floods, protecting flora and fauna, teaching children literacy and mathematics, and funding 20 public welfare projects with US $25 million
Developer tools that build and benefit the research community: TensorFlow gets a comprehensive upgrade
11 open datasets: from reinforcement learning to natural language processing to image segmentation
Conference research and Google Research’s global expansion: a large number of published papers, and substantial resources invested in supporting the research of faculty, students, and researchers in various fields
Artificial intelligence ethics: advances in research on fairness, privacy protection, and interpretability
Looking ahead to 2020 and beyond: The deep learning revolution will continue to reshape how we think about computing and computers.
A major focus is understanding training dynamics in neural networks.
In one study, the researchers’ experimental results show that scaling up the amount of data parallelism can make a model converge faster.
Google has shown that statistics of the margin distribution can be used to predict the generalization gap, which helps identify which models generalize most effectively.
In addition, off-policy classification was studied in the context of reinforcement learning to better understand which models are likely to generalize best.
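To make the idea of a "margin distribution" concrete, here is a minimal sketch. It assumes the simplest definition of a margin (the score of the true class minus the best score of any other class, computed on the model’s outputs) and summarizes the distribution by its quartiles. The function names and the toy scores are hypothetical and are not taken from Google’s study, which may define and use margins differently.

```python
from statistics import quantiles


def margins(logits, labels):
    """For each example, return the true-class score minus the best other-class score."""
    out = []
    for scores, y in zip(logits, labels):
        best_other = max(s for k, s in enumerate(scores) if k != y)
        out.append(scores[y] - best_other)
    return out


def margin_summary(ms):
    """Summarize a margin distribution by its quartiles and the share of non-positive margins."""
    q1, median, q3 = quantiles(ms, n=4)
    error_rate = sum(m <= 0 for m in ms) / len(ms)
    return {"q1": q1, "median": median, "q3": q3, "error_rate": error_rate}


# Toy, made-up scores for four examples over three classes (assumption for illustration).
toy_logits = [[2.0, 0.5, 0.1], [0.2, 1.5, 1.0], [0.3, 0.4, 2.4], [1.0, 1.2, 0.1]]
toy_labels = [0, 1, 2, 0]

toy_margins = margins(toy_logits, toy_labels)
summary = margin_summary(toy_margins)
print(summary)
```

A negative margin means the example is misclassified, so the last toy example counts toward the error rate. Statistics like these quartiles are the kind of features one could feed into a predictor of the generalization gap.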
AutoML can automate many aspects of machine learning, and it often achieves better results on certain kinds of machine learning meta-decisions. For example:
Google showed how to use neural architecture search to get better results on computer vision problems, reaching 84.4% accuracy on ImageNet with 8 times fewer parameters than the previous best model.
Google showed a neural architecture search method that finds efficient models suited to a specific hardware accelerator, providing mobile devices with models that are highly accurate yet cheap to run.
Google showed how to extend AutoML to video models, finding architectures that achieve state-of-the-art results as well as lightweight architectures that match the performance of hand-crafted models.
Google developed AutoML technology for tabular data and released it jointly as a new Google Cloud product, AutoML Tables.
Google showed how to find interesting neural network architectures without any training steps to update the weights of the model being evaluated, making architecture search more computationally efficient.
Google explored architecture discovery for NLP tasks, finding architectures that significantly outperform the vanilla Transformer at a greatly reduced computational cost.
Research showed that automatically learned data augmentation methods can be extended to speech recognition models.
Compared with existing data augmentation methods driven by human ML experts, they achieve significantly higher accuracy with less data.
Google launched its first speech applications using AutoML, for keyword spotting and spoken language identification.
In experiments, AutoML found models that beat human designs: they are more efficient and perform better.
In the past few years, significant progress has been made in models for natural language understanding, translation, natural dialogue, speech recognition, and related tasks.
One theme of Google’s work in 2019 was improving the state of the art by combining various methods or tasks to train more powerful models.
For example, a single model was trained to translate between 100 languages (instead of using 100 different models), which significantly improved translation quality.
Google demonstrated that combining speech recognition with language models and training the system on multiple languages can significantly improve speech recognition accuracy.
Studies have shown that it is possible to train a joint model to perform speech recognition, translation, and text-to-speech generation.
Such a model also has certain advantages, such as retaining the speaker’s voice in the generated translated audio and forming a simpler overall learning system.
Research showed how many different objectives can be combined to produce models that are significantly better at semantic retrieval.
Google showed how adversarial training procedures can significantly improve the quality and robustness of language translation.
With the development of models based on seq2seq, Transformer, BERT, Transformer-XL, and ALBERT, Google’s language understanding capabilities have improved continuously and have been applied to many core products and features.
In 2019, the application of BERT in core search and ranking algorithms has brought the biggest improvement in search quality in the past five years (and one of the biggest improvements ever).
Next is Google’s main research in machine perception over the past year.
It includes a deeper understanding of images and videos, as well as perception of daily life and the environment. Specifically:
A study used temporal cycle consistency to learn better representations for fine-grained temporal understanding of video.
The application of machine learning in robot control is an important research area of Google. Google believes this is an important tool to enable robots to operate effectively in complex real-world environments (such as everyday homes, businesses).
Google’s work in robotics in 2019 includes:
1. In long-range robot navigation via automated reinforcement learning, Google showed how to combine reinforcement learning with long-range planning so that robots can navigate complex environments, such as Google office buildings, more effectively.
2. In PlaNet, Google demonstrated how to learn a world model efficiently from images alone, and how to use that model to accomplish tasks with fewer learning episodes.
3. With TossingBot, Google unified the laws of physics and deep learning, letting robots learn intuitive physics through experiments and then throw objects into boxes according to what they learned.
4. In its study of Soft Actor-Critic, Google showed that reinforcement learning algorithms can be trained by maximizing both the expected reward and the entropy of the policy.
5. Google also developed a self-supervised learning algorithm that lets robots learn to assemble objects by first disassembling them. This shows that robots, like children, can learn from taking things apart.
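The Soft Actor-Critic item above can be written compactly. The following is a sketch of the maximum-entropy objective commonly associated with Soft Actor-Critic, not a formula quoted from the report. The symbols are notation introduced here: $\pi$ is the policy, $r$ the reward, $\rho_\pi$ the state-action distribution induced by the policy, $\mathcal{H}$ the entropy, and $\alpha$ a temperature that trades reward against entropy.

```latex
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
  \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]
```

Setting $\alpha = 0$ recovers the usual expected-reward objective; a larger $\alpha$ rewards the policy for staying more random, which encourages exploration.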
In 2019, Google made major progress in quantum computing, demonstrating quantum supremacy to the world for the first time: on one computing task, a quantum computer was far faster than a classical computer.
A task that would have taken a classical computer 10,000 years took the quantum computer only 200 seconds. The study appeared on the cover of Nature on October 24, 2019.
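As a rough sense of scale, the two figures quoted above imply a speedup of about 1.6 billion times. The short calculation below is only back-of-the-envelope arithmetic on those two numbers; the year length used is an assumption.

```python
# Assumption: a year of 365.25 days; the report only gives "10,000 years" and "200 seconds".
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

classical_seconds = 10_000 * SECONDS_PER_YEAR  # estimated classical running time
quantum_seconds = 200                          # reported quantum running time

speedup = classical_seconds / quantum_seconds
print(f"Implied speedup: {speedup:.2e}x")
```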
“It’s like the first rocket to successfully break away from Earth’s gravity and fly to the edge of space,” said Google CEO Sundar Pichai. Quantum computers are expected to play an important role in areas such as materials science, quantum chemistry, and large-scale optimization.
Google is also working to make quantum algorithms easier to express and quantum hardware easier to control, and it has found ways to use classical machine learning techniques in quantum computing.
In the interactive, automated 3D reconstruction of the fly brain, machine learning models were used to carefully trace each neuron, which Jeff Dean calls a milestone in mapping the structure of the fly brain.
In learning better simulation methods for partial differential equations, Google used machine learning to speed up PDE computations, which are at the core of basic computational problems in climate science, fluid dynamics, electromagnetics, heat conduction, and general relativity.
Google also applied machine learning to smell, using graph neural networks (GNNs) on a molecule’s structure to predict what it smells like.
In artistic creation, Google AI did even more, such as artistic expression that combines AI and AR.
Google Translate’s photo translation feature was also upgraded, adding support for languages such as Arabic, Hindi, Malay, Thai, and Vietnamese. It now translates not only between English and other languages but also between non-English language pairs, and it automatically finds where the text is in the camera frame.
Night-scene photography also improved greatly at capturing stars, and a related paper was published at SIGGRAPH Asia.
2019 was the Google Health team’s first full year.
At the end of 2018, Google reorganized the Google Research health team, DeepMind Health, and health-related hardware divisions to create the new Google Health team.
1. In the diagnosis and early detection of diseases, Google achieved a number of results:
A deep learning model detected breast cancer more accurately than human experts, reducing both false positives and false negatives in diagnosis. The study was recently published in the journal Nature.
In addition, Google has also made some new achievements in the diagnosis of skin diseases, prediction of acute kidney injury, and detection of early lung cancer.
2. Google also combined machine learning with other medical technologies, such as adding augmented reality display technology to the microscope to help doctors quickly locate lesions.
Google has also built a human-centric similar image search tool for pathologists that lets them examine similar cases and make more effective diagnoses.
AI assists people with disabilities
AI is getting closer to our lives. Over the past year, Google has used AI to help in everyday life.
Most of us can easily see beautiful images, hear favorite songs, or talk to loved ones. However, more than one billion people worldwide cannot experience the world in these ways.
Machine learning can serve people with disabilities by converting these audiovisual signals into other signals. The AI assistive technologies provided by Google include:
Lookout helps blind or visually impaired people identify their surroundings.
Live Transcribe, a real-time transcription technology, helps deaf or hearing-impaired people quickly convert speech to text.
The Euphonia project enables personalized speech-to-text conversion. For people whose speech is slurred by diseases such as ALS (amyotrophic lateral sclerosis), this research improves the accuracy of automatic speech recognition.
The Parrotron project also uses end-to-end neural networks to help improve communication, but its research focus is speech-to-speech conversion.
For blind and low-vision people, Google uses AI to generate image descriptions. When a screen reader encounters an image or graphic without a description, Chrome can now create descriptive content automatically.
Lens for Google Go, a tool that reads text aloud, has greatly helped users who cannot read to get information from the written world.
AI promotes social welfare
Jeff Dean said that machine learning matters greatly for solving many major social problems. Google has been working on some of these problems directly and also working to enable others to apply their creativity and skills to them.
For example, floods affect hundreds of millions of people every year. Google uses machine learning, computation, and better data sources to make flood forecasts and send alerts to millions of people in affected areas.
Google also works with the US National Oceanic and Atmospheric Administration (NOAA) to locate whale populations using underwater sound data.
Google has released a set of tools for studying biodiversity with machine learning.
Google also hosted a Kaggle competition on using computer vision to classify diseases on cassava leaves. Cassava is the second-largest source of carbohydrates in Africa, so cassava diseases affect people’s food security.
For education, Google built Bolo, an app that uses speech recognition to guide children as they learn to read English. The app runs on-device and works offline. It has helped 800,000 Indian children learn to read, and children have read a billion words with it. In a pilot across 200 villages in India, 64% of children improved their reading ability.
Beyond literacy, there are more complex subjects such as mathematics and physics. Google built the Socratic app to help high school students learn math.
In addition, in order to make AI play a greater role in public welfare, Google held the AI Impact Challenge, which collected more than 2,600 proposals from 119 countries.
In the end, 20 proposals that could address major social and environmental issues stood out. Google invested US $25 million (more than 170 million yuan) in these proposals, and they have already produced some results, including:
Doctors Without Borders (MSF) created a free mobile app that uses image recognition to help clinicians in low-resource settings analyze antimicrobial images and advise patients on which medicine to use. The project has been piloted in Jordan.
Billions of people in the world live on small farms, and an outbreak of disease or insect pests can devastate their livelihoods.
A nonprofit named Wadhwani AI therefore used an image classification model to identify pests on farms and advise on which pesticides to spray and when to spray them, increasing crop yields.
Illegal logging of tropical rainforests is a major factor in climate change. An organization called “Rainforest Connection” uses deep learning for bioacoustic monitoring. With some old mobile phones, it can track the health of the rainforest and detect threats in it.
△ 20 public welfare projects funded by Google
Developer tools for the benefit of the researcher community
As the world’s largest AI company, Google is also an open-source pioneer and continues to contribute to the community, with TensorFlow as a major focus.
Jeff Dean said that because of the release of TensorFlow 2.0, the past year has been an exciting year for the open-source community.
This is the first major upgrade since TensorFlow was released, making building ML systems and applications easier than ever.
In TensorFlow Lite, Google added support for fast mobile GPU inference. It also released Teachable Machine 2.0, which requires no coding and can train a machine learning model at the click of a button.
There is also MLIR, an open-source compiler infrastructure for machine learning that addresses the growing complexity of software and hardware fragmentation, making it easier to build AI applications.
In addition, Google open-sourced MediaPipe, a framework for building ML pipelines for perception and multimodal applications.
According to Jeff Dean, by the end of 2019 Google had given more than 1,500 researchers around the world free access to Cloud TPUs through the TensorFlow Research Cloud, and its introductory course on Coursera had more than 100,000 students.
He also introduced some heartwarming cases. For example, with the help of TensorFlow, a college student discovered two new planets and established a method to help others discover more.
In another, college students used TensorFlow to identify potholes and dangerous road cracks in Los Angeles.
After releasing its dataset search engine in 2018, Google kept working in this area in 2019 and did its best to contribute datasets of its own.
In the past year, Google opened 11 datasets in various fields. They are listed below.
Open Images V5: adds segmentation masks to the annotation set, with 2.8 million samples spanning 350 categories.
Natural Questions: the first dataset that uses naturally occurring queries and requires finding answers by reading an entire page rather than extracting them from a short paragraph. It contains 300,000 question-answer pairs, and BERT scores below 70 on it.
Google Research Football: a football simulation environment in which agents can play freely in a FIFA-like world and learn more kicking skills.
Google-Landmarks-v2: a landmark dataset with 5 million pictures of 200,000 landmarks.
YouTube-8M Segments: a large-scale classification and temporal localization dataset with human-verified labels at the 5-second segment level of YouTube-8M videos.
AVA Spoken Activity: a multimodal audio and visual video dataset for perceiving conversation.
PAWS and PAWS-X: datasets for paraphrase identification. Both consist of well-formed sentence pairs with high lexical overlap, of which about half are paraphrases; PAWS-X extends PAWS to multiple languages.
Natural language conversation datasets: these pair two people in conversation to simulate a human conversation with a digital assistant.
Visual Task Adaptation Benchmark: a benchmark from Google for visual task adaptation, modeled on benchmarks such as GLUE and ImageNet.
It helps users better understand which visual representations generalize to more new tasks, thereby reducing the data requirements across visual tasks.
Schema-Guided Dialogue dataset: the largest public corpus of task-oriented dialogues, with more than 18,000 dialogues spanning 17 domains.
Google presented more than 40 papers at CVPR, more than 100 at ICML, more than 60 at ICLR, more than 40 at ACL, more than 40 at ICCV, and more than 120 at NeurIPS.
Google also hosted 15 independent workshops on topics ranging from improving global flood warnings, to using machine learning to build better systems for people with disabilities, to speeding up the development of algorithms, applications, and tools for NISQ quantum processors.
Through its annual PhD fellowship program, Google funded more than 50 doctoral students worldwide, and it also supported startups.
Google’s research locations kept expanding globally in 2019, with a research office opened in Bangalore. Jeff Dean also put out a recruiting call: if you are interested, come and join.
As in previous years, Jeff opened the report with Google’s work on artificial intelligence ethics.
This is also Google’s clear statement on AI practice, ethics, and technology for good.
In 2018, Google published its seven AI Principles and has applied them in its work since. In June 2019, Google published a report card showing how these principles are put into practice in research and product development.
Jeff Dean said that these principles cover most of the active areas of artificial intelligence and machine learning research, such as bias, safety, fairness, accountability, transparency, and privacy in machine learning systems.
Google’s goal is therefore to apply techniques from these fields to its work while continuing research to advance the related technology.
On the research side, Google published a number of papers at academic conferences such as KDD ’19 and AIES ’19 that explore the fairness and interpretability of machine learning models.
For example, one study looked at how Activation Atlases can help explore the behavior of neural networks and improve the interpretability of machine learning models.
On the product side, Google’s efforts have also borne fruit in tools that people can actually use.
For example, TensorFlow Privacy was released to help train machine learning models that guarantee privacy.
In addition, Google has released a new dataset to help researchers identify deep fakes.
Looking forward to 2020 and beyond
Finally, Jeff looked back on the past 10 years of development and looked ahead to research trends in 2020 and beyond.
He said that machine learning and computer science have made significant progress in the past decade, and computers are now more capable than ever of seeing, hearing, and understanding language.
With complex computing devices in our pockets, we can use these capabilities to better help us accomplish many tasks in our daily lives.
We redesigned our computing platform around these machine learning methods by developing specialized hardware that enabled us to handle larger problems.
These have changed our view of computing devices in data centers, and the deep learning revolution will continue to reshape our view of computing and computers.
At the same time, he also pointed out that there are still a large number of unresolved issues. This is also the research direction of Google in 2020 and beyond:
First, how can we build machine learning systems that handle millions of tasks and learn to accomplish new tasks automatically?
Second, how can we advance the state of the art in important areas of artificial intelligence research, such as avoiding bias, improving interpretability and understandability, improving privacy, and ensuring safety?
Third, how to apply computing and machine learning to make progress in important new areas of science? Examples include climate science, healthcare, bioinformatics, and many others.
Fourth, regarding the ideas and directions pursued by the machine learning and computer science research communities, how can we ensure that more and more researchers propose and explore them? How can we best support new researchers from different backgrounds entering this field?
Finally, what do you think of Google AI’s breakthroughs and progress over the past year?