Advances in artificial intelligence over the past few years have endowed machines with capabilities that exceed those of humans: computers can now see and recognize objects with an accuracy of 96.5%, compared to 94.9% for humans; they can understand human speech with an accuracy greater than 95%; and they can speak nearly indistinguishably from humans.
Andrew Ng, co-founder of Google Brain and Baidu’s AI program, offers this simple heuristic for determining what AI is capable of: “If a typical person can do a mental task with less than one second of thought, we can probably automate it using AI either now or in the near future.”
AI’s Achilles Heel: The AI systems of today, however, require massive amounts of data to train their algorithms. Not all problems are organized in a way that makes it possible to amass this data, and not all companies have access to large datasets.
Data science for data of all shapes and sizes
Data science combines statistics, analysis, and machine learning to understand patterns in existing datasets. This broad field is ideal for deriving meaning from data in collaboration with a domain expert, and it excels when faced with data that is sparse or in short supply.
Some of the techniques and approaches that apply regardless of the volume of data include:
Recommendations - Popularized by companies like Amazon and Netflix, recommendation systems rely on techniques such as nearest-neighbors, collaborative filtering, and entity labeling to surface the highest quality recommendations.
Machine Learning - This area includes techniques such as neural networks, Bayesian networks, and genetic algorithms. Unsupervised learning in particular is commonly used by data scientists when datasets are smaller or when exploring the structure of data without the need for labels.
Predictive Analytics - The field of predictive analytics includes statistical techniques such as trend analysis, outcomes prediction, and anomaly detection.
Language Analytics - There is a long history of research and tools available in this area, which can be applied to datasets of all sizes. Language analytics includes topics such as key phrase detection, entity detection, sentiment analysis, and text translation.
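To make the small-data point concrete, here is a minimal sketch of a nearest-neighbors recommender that works on just a handful of ratings. The users, items, and ratings are made up for illustration, and the function names (cosine, recommend) are hypothetical; a production recommender would likely be more involved than this.

```python
import math

# Hypothetical ratings (assumption): user -> {item: rating on a 1-5 scale}
ratings = {
    "ana":   {"drill": 5, "saw": 4, "hammer": 1},
    "ben":   {"drill": 4, "saw": 5},
    "chloe": {"hammer": 5, "nails": 4},
    "dev":   {"hammer": 4, "nails": 5, "drill": 1},
}

def cosine(a, b):
    """Cosine similarity between two sparse rating vectors stored as dicts."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def recommend(user, ratings, k=2):
    """Rank items the user has not rated, using the k most similar users."""
    others = [
        (cosine(ratings[user], theirs), name)
        for name, theirs in ratings.items()
        if name != user
    ]
    neighbors = sorted(others, reverse=True)[:k]
    scores = {}
    for similarity, name in neighbors:
        for item, rating in ratings[name].items():
            if item not in ratings[user]:
                # Weight each neighbor's rating by how similar they are.
                scores[item] = scores.get(item, 0.0) + similarity * rating
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("ben", ratings))
```

With only four users, the similarity scores rest on very few overlapping ratings, which is where a domain expert's judgment about whether the suggestions make sense becomes valuable.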
What you should know: Don’t be discouraged if you don’t have massive amounts of data available for a particular problem. Despite the possible sparsity or lack of depth of your dataset, you can derive deeper understanding by combining experts in data science with experts in your business domain.
Gaining insights may be more challenging, but the scope of problems that can be solved is broad.